Similar to the FontDatabase, this will be needed for Ladybird to find
emoji images. We now generate just the file name of emoji image in
LibUnicode, and look for that file in the specified path (defaulting to
/res/emoji) at runtime.
Keycap emoji, for example, begin with ASCII digits. Instead, check the
first code point for the Emoji Unicode property.
On a profile of scrolling around on the welcome page in the Browser,
this raises the runtime percentage of Font::glyph_or_emoji_width from
about 0.8% to 1.3%.
On a profile of scrolling around on the welcome page in the Browser,
this drops the runtime percentage of Font::glyph_or_emoji_width from
about 70% to 0.8%.
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
This allows us to treat unqualified, minimally-qualified, and
fully-qualified emojis the same as long as emoji filenames are in their
least qualified form (with respect to emoji presentation).
For example, the transgender flag emoji has 4 possible forms:
1F3F3 FE0F 200D 26A7 FE0F ; fully-qualified # 🏳️⚧️
1F3F3 200D 26A7 FE0F ; unqualified # 🏳⚧️
1F3F3 FE0F 200D 26A7 ; unqualified # 🏳️⚧
1F3F3 200D 26A7 ; unqualified # 🏳⚧
In order to treat them all as the same, we now drop all forms down
to 1F3F3 200D 26A7 (skipping any FE0F codepoints) and then do the
lookup for that form.
We can't rely on the caller to keep the code points alive, and this
was sometimes causing incorrect cache hits, leading to the wrong
emoji being displayed.
Fixes#14693
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).
No functional changes.