It turns out that hmtx and OS/2 table values _are_ used when
rendering OpenType for PDFs: hmtx is used for the left-side bearing
value (which is read in `Painter::draw_glyph()`), and OS/2 is used
for the ascender, which Type0's CIDFontType2::draw_glyph()
and TrueTypeFont::draw_glyph() read.
So instead of not trying to read these tables, instead try to read
them but tolerate them failing to read and ignore them then.
Follow-up to #23276.
(I've seen weird glyph positioning from not reading the hmtx table.
I haven't seen any problems caused by not reading the OS/2 table yet,
but since the PDF code does use the ascender value, let's read that
too.)
Check if we have a cmap before dereferencing it again.
Fixes a crash on page 8 of 0000188.pdf now that the font no
longer fails to load to due to a missing name table.
Looks like this is a Type2 truetype font, where we don't provide
an external cmap. How this font is supposed to work without a cmap
I don't know -- but for now, we no longer crash on it, and draw
some of the text with the previous font (which happens to work
fine in this particular case).
Kind of reverts #21675, but #21744 made that better
4 of my 1000 test PDFs complained "Invalid table offset or length in
font" before.
For example, in 0000203.pdf, these tags had length 0: 'cvt ', 'fpgm',
'prep', 'name', 'OS/2'. (Generally it's tables that aren't needed
for rendering PDFs, and the PDF writer figured it's easier to zero
out these tables instead of omitting them altogether for some reason.)
Increases number of PDFs that render without diagnostics from
765 to 767.
It is sometimes truncated in fonts embedded in PDFs, and the data
is not needed to render PDFs. 2 of my 1000 test PDFs used to
complain "Could not load OS2 v1: Not enough data" and 1
"Could not load OS2 v2: Not enough data" before.
Increases number of PDFs that render without diagnostics from
764 to 765 (and decreases the number of distinct error messages
from 27 to 25).
It is sometimes truncated in fonts embedded in PDFs, and the data
is not needed to render PDFs. 26 of my 1000 test files complained
"Could not load Hmtx: Not enough data" before.
Increases number of PDFs that render without diagnostics from
743 to 764.
It is often missing in fonts embedded in PDFs. 75 of my 1000 test
files complained "Font is missing Name" when trying to read fonts
before.
Increases number of PDFs that render without diagnostics from
682 to 743.
If this is passed in, we don't use the cmap table from the font,
but use the external lookup object instead.
This conveniently happens to create a place where we can put all the
Cmap configuration logic that was a bit spread out before.
No behavior change yet.
We now reject fonts where the active cmap subtable is in a format
we can't read yet, instead of silently drawing squares for all glyphs.
This doesn't fire at all for my 1000-file PDF test set, but seems
like a good thing to check.
(Instead of duplicating the switch, I first tried making a
glyph_id_for_code_point_or_else() that returns ErrorOr<u32> and then
make both glyph_id_for_code_point() and validate_format_can_be_read()
call that, but I liked less how that worked out -- felt too clever.)
This would've saved me some debugging on #23103.
We now return an error instead of a font that draws squares for all
characters. That seems preferable since it makes these cases easy to
find. This fires for three files in my 1000-file PDF test set, so it's
not exceedingly common (...but I wasn't aware that three files were
rendering boxes for this reason, and now I am and can just make them
work in the future).
FontDatabase.h with its includes add up to quite a lot of code. In the
next commit, compiled GML files are going to need to access the
FontWeight enum, so let's allow them to do that without pulling in lots
of other things.
Also, change users to include FontWeight.h instead of FontDatabase.h
where appropriate.
We were trying to read exactly 5 pairs for some reason, instead of the
number specified by the format 0 header.
Fixing this makes "OpenSans Condensed" load on my machine.
Before this change, we would only cache and reuse Gfx::ScaledFont
instances for downloaded CSS fonts.
By moving it into Gfx::VectorFont, we get caching for all vector fonts,
including local system TTFs etc.
This avoids a *lot* of style invalidations in LibWeb, since we now vend
the same Gfx::Font pointer for the same font when used repeatedly.
In a bunch of cases, this actually ends up simplifying the code as
to_number will handle something such as:
```
Optional<I> opt;
if constexpr (IsSigned<I>)
opt = view.to_int<I>();
else
opt = view.to_uint<I>();
```
For us.
The main goal here however is to have a single generic number conversion
API between all of the String classes.
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).
This commit is auto-generated:
$ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
Meta Ports Ladybird Tests Kernel)
$ perl -pie 's/\bDeprecatedString\b/ByteString/g;
s/deprecated_string/byte_string/g' $xs
$ clang-format --style=file -i \
$(git diff --name-only | grep \.cpp\|\.h)
$ gn format $(git ls-files '*.gn' '*.gni')
These properties could be cached in the font object once they are
decoded from the table for the first time to make subsequent access
faster.
This change makes `Node::scaled_font()` in LibWeb go down from 6% to
1.5% in my profiles because building a font cache key is now a lot
cheaper.
Our previous check was not sufficient, since it merely checked the
first byte of the EncodingRecord offset is within range, while the
actual read is 4-byte wide.
Fixes ossfuzz-64165.
The one deviation from the spec here is to use this in the WOFF
TableDirectoryEntry's tag field. However, *not* making that a Tag made
a lot of things more complicated than they need to be.
Add two new members to Font::AllowInexactMatch: Larger and Smaller.
This allows callers of FontDatabase::get to specify if they want to
prefer a larger or smaller font in the event of a tie.
Read the basic lists as spans, and use those when looking for kerning.
Kerning lookup still does bit-casting for now. As for CBLC, the data is
a bit complicated.
A few closely-related changes:
- Move our definitions of the OpenType spec's "data types" into their
own header file.
- Add definitions for the integer types there too, for completeness.
(Plus Uint16 matches the spec term, and is less verbose than
BigEndian<u16>.)
- Include Traits for the non-BigEndian types so that we can read them
from Streams. (BigEndian<integer-type> already has this.)
- Use the integer types in our struct definitions.
As a bonus, this fixes a bug in Hmtx, which read the left-side bearings
as i16 instead of BigEndian<i16>.
Do more checks at load time, including categorizing the subtables and
producing our own directory of them.
The format for Kern is a little complicated, so use a Stream instead of
manual offsets.
Maxp had the shared fields duplicated, and OS2 embedded each version's
struct in the next. Instead, let's use inheritance to avoid duplicating
shared fields while still allowing them to be directly accessed.
While I'm at it, rename the Maxp and GPOS table structs to just be
VersionX_Y, because they're not ambiguous with anything else.
LibGfx: Rename GPOSHeader to HeaderVersion1_0
Because there's a version 1.1 as well, which we'll eventually want to
support.
At first glance this looks like it holds the memory that the various
slices point into... but it actually doesn't own that memory. Nobody
uses m_buffer, so it serves no purpose.
Some of these are odd sizes. We managed not to insert padding because
BigEndian is itself marked as packed, but let's be explicit instead of
relying on that. :^)