Keycap emoji, for example, begin with ASCII digits. Instead, check the
first code point for the Emoji Unicode property.
On a profile of scrolling around on the welcome page in the Browser,
this raises the runtime percentage of Font::glyph_or_emoji_width from
about 0.8% to 1.3%.
On a profile of scrolling around on the welcome page in the Browser,
this drops the runtime percentage of Font::glyph_or_emoji_width from
about 70% to 0.8%.
Currently, we compute the width of text one code point at a time. This
ignores grapheme clusters (emoji in particular). One effect of this is
when highlighting a multi-code point emoji. We will errantly increase
the highlight rect to the sum of all code point widths, rather than
just the width of the resolved emoji bitmap.
This patch adds a "GlyphPage" cache which stores the mapping between
code points and glyph IDs in a segmented table of "pages".
This makes Font::glyph_id_for_code_point() significantly faster by
not reparsing the font tables every time you call it.
In the future, we can add more information to GlyphPage (such as
horizontal metrics for each glyph) to further reduce time spent in
text layout and painting.
As the different Cmap encoding records are guaranteed to be sorted by
their platform ID, we would previously prefer the Macintosh platform
because of its lower ID value. However, this platform is split up into
a lot of encoding formats for different languages, and usually only
English is included. This meant that we could not handle most unicode
characters anymore.
The Windows platform now takes precedence again, as it can handle
arbitrary code points in its supported encodings.
This solution is still far from perfect, but it makes this regression
disappear for now.
This defines all the OpenType opcodes/instructions from the
specification:
https://learn.microsoft.com/en-us/typography/opentype/spec/tt_instructions
Each instructions has mnemonic and a range of possible opcodes (as some
of the bits are pretty much immediate value flags).
There's a little helper Instruction struct for accessing the flags and
any associated data (in the case of PUSH instructions).
Then the InstructionStream provides a way of iterating over all the
instructions in some bytes.
DeprecatedFlyString relies heavily on DeprecatedString's StringImpl, so
let's rename it to A) match the name of DeprecatedString, B) write a new
FlyString class that is tied to String.
While this subtable ID is supposed to be deprecated, it is used heavily
in PDF files.
It supports mapping one or two-byte values, with quite a large list of
encodings to tell you which one to expect.
For our use case, we ignore this encoding ID and just pick the first
subtable with this platform ID. Unsupported encodings will get caught
by Subtable::glyph_id_for_code_point() anyway.
This adds the option to pass a subpixel offset when fetching a glyph
from a font, this offset is currently snapped to thirds of a pixel
(i.e. 0, 0.33, 0.66). This is then used when rasterizing the glyph,
which is then cached like usual.
Note that when using subpixel offsets you're trading a bit of space
for accuracy. With the current third of a pixel offsets you can end
up with up to 9 bitmaps per glyph.
These instances were detected by searching for files that include
stdlib.h, but don't match the regex:
\\b(_abort|abort|abs|aligned_alloc|arc4random|arc4random_buf|arc4random_
uniform|atexit|atof|atoi|atol|atoll|bsearch|calloc|clearenv|div|div_t|ex
it|_Exit|EXIT_FAILURE|EXIT_SUCCESS|free|getenv|getprogname|grantpt|labs|
ldiv|ldiv_t|llabs|lldiv|lldiv_t|malloc|malloc_good_size|malloc_size|mble
n|mbstowcs|mbtowc|mkdtemp|mkstemp|mkstemps|mktemp|posix_memalign|posix_o
penpt|ptsname|ptsname_r|putenv|qsort|qsort_r|rand|RAND_MAX|random|reallo
c|realpath|secure_getenv|serenity_dump_malloc_stats|serenity_setenv|sete
nv|setprogname|srand|srandom|strtod|strtof|strtol|strtold|strtoll|strtou
l|strtoull|system|unlockpt|unsetenv|wcstombs|wctomb)\\b
(Without the linebreaks.)
This regex is pessimistic, so there might be more files that don't
actually use anything from the stdlib.
In theory, one might use LibCPP to detect things like this
automatically, but let's do this one step after another.