This patch adds a "GlyphPage" cache which stores the mapping between
code points and glyph IDs in a segmented table of "pages".
This makes Font::glyph_id_for_code_point() significantly faster by
not reparsing the font tables every time you call it.
In the future, we can add more information to GlyphPage (such as
horizontal metrics for each glyph) to further reduce time spent in
text layout and painting.
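
A minimal sketch of the page-cache idea, not the code from this patch: the class name GlyphPageCache, the 256-entry page size and the slow_lookup callback (standing in for the cmap table parse) are all assumptions made for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <unordered_map>
#include <utility>

// Hypothetical sketch: names and page size are assumptions, not the patch's code.
struct GlyphPage {
    static constexpr uint32_t glyphs_per_page = 256;
    std::array<uint32_t, glyphs_per_page> glyph_ids {};
};

class GlyphPageCache {
public:
    // Stand-in for the expensive path that parses the font tables.
    using SlowLookup = std::function<uint32_t(uint32_t)>;

    explicit GlyphPageCache(SlowLookup slow_lookup)
        : m_slow_lookup(std::move(slow_lookup))
    {
    }

    uint32_t glyph_id_for_code_point(uint32_t code_point)
    {
        uint32_t page_index = code_point / GlyphPage::glyphs_per_page;
        uint32_t index_in_page = code_point % GlyphPage::glyphs_per_page;

        auto it = m_pages.find(page_index);
        if (it == m_pages.end()) {
            // First access to this page: fill every slot once via the slow path.
            auto page = std::make_unique<GlyphPage>();
            uint32_t first_code_point = page_index * GlyphPage::glyphs_per_page;
            for (uint32_t i = 0; i < GlyphPage::glyphs_per_page; ++i)
                page->glyph_ids[i] = m_slow_lookup(first_code_point + i);
            it = m_pages.emplace(page_index, std::move(page)).first;
        }
        // Every later lookup in the same page is a plain array read.
        return it->second->glyph_ids[index_in_page];
    }

    size_t page_count() const { return m_pages.size(); }

private:
    SlowLookup m_slow_lookup;
    std::unordered_map<uint32_t, std::unique_ptr<GlyphPage>> m_pages;
};
```

In this sketch a page is filled eagerly on first use, so the slow path runs once per code point in the page and never again for that page. Extra per-glyph data such as metrics would become additional arrays inside GlyphPage.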
As the different Cmap encoding records are guaranteed to be sorted by
their platform ID, we would previously prefer the Macintosh platform
because of its lower ID value. However, this platform is split up into
a lot of encoding formats for different languages, and usually only
English is included. This meant that we could no longer handle most
Unicode characters.
The Windows platform now takes precedence again, as it can handle
arbitrary code points in its supported encodings.
This solution is still far from perfect, but it makes this regression
disappear for now.
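
A rough sketch of the selection order described above, assuming a simplified EncodingRecord type and a hypothetical pick_encoding_record() helper (neither is taken from the actual code). The platform ID values are the ones defined by the OpenType cmap table.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Platform IDs as defined by the OpenType "cmap" table.
enum class Platform : uint16_t {
    Unicode = 0,
    Macintosh = 1,
    Windows = 3,
};

// Simplified stand-in for an encoding record (assumption for illustration).
struct EncodingRecord {
    Platform platform;
    uint16_t encoding_id;
};

// Hypothetical helper: returns the index of the record to use.
// Records are sorted by platform ID, so simply taking the first one
// would favor Macintosh over Windows.
std::optional<size_t> pick_encoding_record(std::vector<EncodingRecord> const& records)
{
    std::optional<size_t> fallback;
    for (size_t i = 0; i < records.size(); ++i) {
        if (records[i].platform == Platform::Windows)
            return i; // Windows takes precedence.
        if (!fallback.has_value())
            fallback = i; // Otherwise remember the first record seen.
    }
    return fallback;
}
```

The fallback to the first record when no Windows record exists is an assumption of this sketch, not something the change above specifies.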
This defines all the OpenType opcodes/instructions from the
specification:
https://learn.microsoft.com/en-us/typography/opentype/spec/tt_instructions
Each instruction has a mnemonic and a range of possible opcodes (as some
of the bits are effectively immediate-value flags).
There's a little helper Instruction struct for accessing the flags and
any associated data (in the case of PUSH instructions).
Then the InstructionStream provides a way of iterating over all the
instructions in some bytes.
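
To illustrate why iteration needs to know about PUSH instructions, here is a small sketch that walks a byte buffer and skips over inline push data. The Instruction and InstructionStream types below are simplified stand-ins that only share their names with the ones in this patch; the opcode values for NPUSHB, NPUSHW, PUSHB[abc] and PUSHW[abc] come from the specification linked above.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Simplified stand-in: records where an instruction's inline data lives.
struct Instruction {
    uint8_t opcode { 0 };
    size_t data_offset { 0 };
    size_t data_length { 0 }; // Non-zero only for PUSH-style instructions.
};

class InstructionStream {
public:
    explicit InstructionStream(std::vector<uint8_t> bytes)
        : m_bytes(std::move(bytes))
    {
    }

    // Returns the next instruction, or nothing at the end of the stream
    // (or if the stream is truncated in the middle of push data).
    std::optional<Instruction> next()
    {
        if (m_offset >= m_bytes.size())
            return std::nullopt;

        Instruction instruction;
        instruction.opcode = m_bytes[m_offset++];
        uint8_t opcode = instruction.opcode;

        size_t length = 0;
        if (opcode == 0x40 || opcode == 0x41) {
            // NPUSHB / NPUSHW: the element count follows as one byte.
            if (m_offset >= m_bytes.size())
                return std::nullopt;
            size_t count = m_bytes[m_offset++];
            length = count * (opcode == 0x41 ? 2 : 1);
        } else if (opcode >= 0xB0 && opcode <= 0xB7) {
            // PUSHB[abc]: the low three bits encode (count - 1) bytes.
            length = static_cast<size_t>(opcode - 0xB0) + 1;
        } else if (opcode >= 0xB8 && opcode <= 0xBF) {
            // PUSHW[abc]: the low three bits encode (count - 1) words.
            length = (static_cast<size_t>(opcode - 0xB8) + 1) * 2;
        }

        if (length > m_bytes.size() - m_offset)
            return std::nullopt;

        instruction.data_offset = m_offset;
        instruction.data_length = length;
        m_offset += length;
        return instruction;
    }

private:
    std::vector<uint8_t> m_bytes;
    size_t m_offset { 0 };
};
```

Without skipping the inline data, the bytes that follow a PUSH would be misread as opcodes.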
DeprecatedFlyString relies heavily on DeprecatedString's StringImpl, so
let's rename it to A) match the name of DeprecatedString, and B) make
room for a new FlyString class that is tied to String.
While this subtable ID is supposed to be deprecated, it is used heavily
in PDF files.
It supports mapping one- or two-byte values, with quite a large list of
encodings to tell you which one to expect.
For our use case, we ignore this encoding ID and just pick the first
subtable with this platform ID. Unsupported encodings will get caught
by Subtable::glyph_id_for_code_point() anyway.
This adds the option to pass a subpixel offset when fetching a glyph
from a font. This offset is currently snapped to thirds of a pixel
(i.e. 0, 0.33, 0.66). This is then used when rasterizing the glyph,
which is then cached like usual.
Note that when using subpixel offsets you're trading a bit of space
for accuracy. With the current third-of-a-pixel offsets you can end
up with up to 9 bitmaps per glyph (3 horizontal x 3 vertical offsets).
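
A small sketch of how such snapping could work, assuming the fractional part of the position is bucketed by flooring (the actual rounding rule is not stated here). The names subpixel_index, GlyphCacheKey and make_cache_key are made up for this example.

```cpp
#include <cmath>
#include <cstdint>

// Offsets are snapped to this many steps per pixel in each direction.
constexpr int subpixel_divisions = 3;

// With 3 steps horizontally and 3 vertically, a glyph has at most 9 variants.
constexpr int max_bitmaps_per_glyph = subpixel_divisions * subpixel_divisions;

// Maps a position to the index of its subpixel bucket (0, 1 or 2),
// i.e. offsets of roughly 0, 0.33 and 0.66 of a pixel.
// Assumption: buckets are chosen by flooring the fractional part.
int subpixel_index(float position)
{
    float fraction = position - std::floor(position);
    int index = static_cast<int>(fraction * subpixel_divisions);
    return index < subpixel_divisions ? index : subpixel_divisions - 1;
}

// Hypothetical cache key: one rasterized bitmap per distinct key.
struct GlyphCacheKey {
    uint32_t glyph_id;
    int subpixel_x;
    int subpixel_y;
};

GlyphCacheKey make_cache_key(uint32_t glyph_id, float x, float y)
{
    return { glyph_id, subpixel_index(x), subpixel_index(y) };
}
```

Finer snapping would improve positioning accuracy, but the number of cached bitmaps grows with the square of the number of divisions.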
These instances were detected by searching for files that include
stdlib.h, but don't match the regex:
\\b(_abort|abort|abs|aligned_alloc|arc4random|arc4random_buf|arc4random_
uniform|atexit|atof|atoi|atol|atoll|bsearch|calloc|clearenv|div|div_t|ex
it|_Exit|EXIT_FAILURE|EXIT_SUCCESS|free|getenv|getprogname|grantpt|labs|
ldiv|ldiv_t|llabs|lldiv|lldiv_t|malloc|malloc_good_size|malloc_size|mble
n|mbstowcs|mbtowc|mkdtemp|mkstemp|mkstemps|mktemp|posix_memalign|posix_o
penpt|ptsname|ptsname_r|putenv|qsort|qsort_r|rand|RAND_MAX|random|reallo
c|realpath|secure_getenv|serenity_dump_malloc_stats|serenity_setenv|sete
nv|setprogname|srand|srandom|strtod|strtof|strtol|strtold|strtoll|strtou
l|strtoull|system|unlockpt|unsetenv|wcstombs|wctomb)\\b
(Without the linebreaks.)
This regex is pessimistic, so there might be more files that don't
actually use anything from the stdlib.
In theory, one might use LibCPP to detect things like this
automatically, but let's take this one step at a time.
Instead of fidgeting with offsets and manually reading out big-endian
values, we now declare the "head" table as a C++ struct and use the
BigEndian<T> template to deal with byte order.
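
As an illustration of the approach, here is a sketch with a stand-in BigEndian<T> (not the real AK template) and only the first 20 bytes of the "head" table. The field order follows the OpenType specification; the names HeadTablePrefix and read_head_prefix are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <type_traits>

// Stand-in for a big-endian wrapper: stores raw bytes and converts on read.
// Because it only holds bytes, it has alignment 1 and adds no padding.
template<typename T>
class BigEndian {
public:
    operator T() const
    {
        using U = std::make_unsigned_t<T>;
        U value = 0;
        // Most significant byte comes first.
        for (size_t i = 0; i < sizeof(T); ++i)
            value = static_cast<U>((value << 8) | m_bytes[i]);
        return static_cast<T>(value);
    }

private:
    uint8_t m_bytes[sizeof(T)];
};

// Only the first fields of "head"; the real table continues after these.
struct HeadTablePrefix {
    BigEndian<uint16_t> major_version;
    BigEndian<uint16_t> minor_version;
    BigEndian<int32_t> font_revision;
    BigEndian<uint32_t> checksum_adjustment;
    BigEndian<uint32_t> magic_number;
    BigEndian<uint16_t> flags;
    BigEndian<uint16_t> units_per_em;
};
static_assert(sizeof(HeadTablePrefix) == 20, "struct must match the on-disk layout");

// Hypothetical helper: copies the bytes into the struct after a size check.
std::optional<HeadTablePrefix> read_head_prefix(uint8_t const* data, size_t size)
{
    if (size < sizeof(HeadTablePrefix))
        return std::nullopt;
    HeadTablePrefix head;
    std::memcpy(static_cast<void*>(&head), data, sizeof(head));
    return head;
}
```

The struct definition doubles as documentation of the layout, and the static_assert catches accidental padding or a wrong field width at compile time.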
This will make it easier to support both string types at the same time
while we convert code, and to track down remaining uses.
One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
The custom TTF path rasterizer is actually generic enough for it to be
used for other fonts. To make this clearer, it now lives on its own
in the "Font" directory.
This table seems to only exist for OpenType compatibility. There are
some font files, including most embedded fonts in PDF documents, that
don't include one.
For those cases, we now just zero-initialize one to the largest
supported size.
This fixes an issue where, when looping over the components of a
composite glyph, we used to mutate the affine transformation of the
glyph itself when computing the transformations of its components.
(AffineTransform::multiply() is non-const.)
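
The pitfall can be sketched with a stand-in AffineTransform whose multiply() mutates the object it is called on. The type and the component_transforms() helper below are simplified assumptions for illustration, not the actual LibGfx code or the exact fix.

```cpp
#include <vector>

// Stand-in 2x3 affine matrix; multiply() modifies *this, which is what
// makes calling it directly on the glyph's own transform dangerous.
struct AffineTransform {
    float a { 1 }, b { 0 }, c { 0 }, d { 1 }, e { 0 }, f { 0 };

    AffineTransform& multiply(AffineTransform const& other)
    {
        AffineTransform result;
        result.a = other.a * a + other.b * c;
        result.b = other.a * b + other.b * d;
        result.c = other.c * a + other.d * c;
        result.d = other.c * b + other.d * d;
        result.e = other.e * a + other.f * c + e;
        result.f = other.e * b + other.f * d + f;
        *this = result;
        return *this;
    }
};

// Hypothetical helper: computes one transform per component.
// Taking the glyph transform by const reference forces a copy before
// multiply(), so one component's transform cannot leak into the next.
std::vector<AffineTransform> component_transforms(
    AffineTransform const& glyph_transform,
    std::vector<AffineTransform> const& components)
{
    std::vector<AffineTransform> result;
    result.reserve(components.size());
    for (auto const& component : components) {
        AffineTransform combined = glyph_transform; // Copy, then mutate the copy.
        combined.multiply(component);
        result.push_back(combined);
    }
    return result;
}
```

Had the loop called multiply() on the glyph's transform itself, the second component would have inherited the first component's translation.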
This remained undetected for a long time as HeaderCheck is disabled by
default. This commit makes the following file compile again:
// file: compile_me.cpp
#include <LibDNS/Question.h>
// That's it, this was enough to cause a compilation error.
Likewise for most other files touched by this commit.