1
Fork 0
mirror of https://github.com/RGBCube/serenity synced 2025-05-18 14:45:08 +00:00
Commit graph

14 commits

Author SHA1 Message Date
Nico Weber
7ab4e53b99 LibPDF/CFF: Add code for fdselect parsing
This is one of the two top dict entries we need for CID-keyed fonts.
We don't send any CID-keyed font data into the CFF parser yet,
so no behavior change.
2024-02-12 14:05:16 +01:00
Nico Weber
6ebddab448 LibPDF/CFF: Add enum values for CID-keyed font top dict entries
No behavior change.
2024-02-12 14:05:16 +01:00
Nico Weber
8e7cb11856 LibPDF/CFF: Add enum values for remaining PrivDictOperators
No behavior change, except that we now dbgln() if we see a
PrivDictOperator we don't know about. (I haven't seen this in
practice, but I found this useful while debugging things.)
2024-02-11 14:52:54 +01:00
Nico Weber
9bccb8c8d7 LibPDF: Make CFF::parse_charset() return SIDs
...and do string expansion at the call site.

CID-keyed fonts treat the charset as CIDs instead of as SIDs,
so having access to the SIDs in numberic form will be useful
when we implement support for CID-keyed CFF fonts.

No behavior change.
2024-02-09 13:57:23 +01:00
Nico Weber
58ff7b5336 LibPDF: Support offset size 3 in CFF index reading
...and replace template instantiations with a loop, to make this
easily possible.

Vaguely nice for code size as well.

Needed for example in 0000054.pdf and 0000354.pdf in 0000.zip in the
pdfa dataset.
2023-10-23 09:31:11 -04:00
Nico Weber
04aec4a032 LibPDF: Don't log CFF Copyright tag as unknown 2023-10-21 21:04:02 +02:00
Nico Weber
3907374621 LibPDF: Implement support for callgsubr in CFF font programs
Font programs are bytecode programs defining glyphs. If several glyphs
share a piece of outline, that opcode sequence can be put in a
subroutine ("subr") table and the definition of those glyphs can then
call that subroutine by number, to reduce file size.

CFF fonts can in theory contain multiple fonts, and so there's a global
subr table shared by all the fonts in one CFF, and a local per-fornt
subr table.  We used to only implement the local subr table, now we
implement both.

(We only support one font per CFF, and at least in PDF files, that's
all that's ever used. So a global subr table isn't very useful.
But the spec explicitly allows it -- "Global subroutines may be used in
a FontSet even if it only contains one font." -- and it happens in
practice.)
2023-10-18 10:50:32 -04:00
Nico Weber
1cfe639b6c LibPDF: Implement CFF supplemental encoding
The main encoding data maps glyph ID ("GID") to its codepoint.
If a glyph has several codepoints, then a secondary table mapping
codepoint to string ID ("SID") of the glyph's name is present.

(A separate table associates each glyph with its name already.)

I haven't seen this used in the wild, but the structure of the
supplemental data is also going to be needed for built-in encodings.
2023-10-17 10:21:38 +02:00
Nico Weber
414a164850 LibPDF: Be louder about unimplemented CFF dict entries 2023-10-16 08:32:18 +02:00
Nico Weber
aba787a441 LibPDF: Implement reading of CFF String Index
Only really useful for reading SIDs in the Top DICT (copyright
text etc), which we currently don't do.

I haven't seen a difference from looking things up in the string
table. The only real effect from the commit that I need is that
it pulls a local resolve() labmda into a real function
resolve_sid(), which I want to call in a future commit.

But it makes things more spec-compliant, and if we ever want to
read SIDs in metadata in the future, now we can.
2023-10-15 09:33:34 +02:00
Nico Weber
d451197d3d LibPDF: Add spec comments to CFF 2023-10-13 07:53:27 +02:00
Nico Weber
349996f7f2 LibPDF: Don't crash on files with float CFF defaultWidthX
We'd unconditionally get the int from a Variant<int, float> here,
but PDFs often have a float for defaultWidthX and nominalWidthX.

Fixes crash opening Bakke2010a.pdf from pdffiles (but while the
file loads ok, it looks completely busted).
2023-10-12 19:43:57 +02:00
Rodrigo Tobar
c4507bb56e LibPDF: Add more built-in SIDs
The first iteration has enough SIDs to display simple documents, but
when trying more and more documents we started to need more of these
SIDs to be properly defined. This is a copy/paste exercise from the CFF
document, which is tedious, so it will continue in small drops.

This commit fills all the gaps until SID 228, which covers all the
ISOAdobe space, and should be enough for most use cases. Since this is a
continuous space starting at 0, we now use an Array instead of a Map to
store these names, which should be more performant. Also to simplify
things I've moved the Array out of the CFF class, making it a simpler
static variable, which allows us to use template type deduction.
2023-02-13 00:23:17 +00:00
Rodrigo Tobar
c4b45a82cd LibPDF: Add initial CFF parsing
The Compat Font Format specification (Adobe's Technical Note #5176) is
used by PDF's Type1C fonts to store their data. While being similar in
spirit to PS1 Type 1 Font Programs, it was designed for a more compact
representation and thus space reduction (but an increment on
complexity). It also shares most of the charstring encoding logic, which
is why the CFF class also inherits from Type1FontProgram.

This initial implementation is still lacking many details, e.g.:

 * It doesn't include all the built-in CFF SIDs
 * It doesn't support CFF-provided SIDs (defaults those glyphs to the
   space character)
 * More checks in general
2023-01-25 15:40:11 +01:00