serenity

mirror of https://github.com/RGBCube/serenity synced 2025-09-13 13:27:34 +00:00

Author	SHA1	Message	Date
Nico Weber	3be5719987	LibPDF: Rename `subroutines` to `local_subroutines` in CFF code	2023-10-18 11:02:10 +02:00
Nico Weber	9a0b559932	LibPDF: Tweak formatting of built-in CFF tables This makes the code look more like the pages in the spec. No behavior change, whitespace change only.	2023-10-18 11:00:17 +02:00
Nico Weber	f0e7fb7038	LibPDF: Make Subrs optional in PS1FontProgram https://adobe-type-tools.github.io/font-tech-notes/pdfs/T1_SPEC.pdf : "Using charstring subroutines is not a requirement of a Type 1 font program." And some versions of Computer Modern do in fact not contain a Subrs array. Together with #21473, makes Problemset.pdf from the pdffiles repo render ok instead of crashing.	2023-10-18 11:00:02 +02:00
Nico Weber	cb961101c7	LibPDF: Implement CFF built-in Standard and Expert encodings With this, all tables from the spec appendixes are in CFF.cpp. This fixes a crash reading page 2 (and onward) of 2ThestructureoftheCIE1997ColourAppearanceModelCIECAM97s.pdf in the pdffiles repo.	2023-10-17 10:21:38 +02:00
Nico Weber	eeada4678c	LibPDF: Postpone CFF encoding processing after Top DICT has been read The encoding offset defaults to 0, i.e. the Standard Encoding. That means reading the encoding only if the tag is present causes us to not read it if a font uses the Standard Encoding. Now, we always read an encoding, even if it's the (implicit) default one.	2023-10-17 10:21:38 +02:00
Nico Weber	1cfe639b6c	LibPDF: Implement CFF supplemental encoding The main encoding data maps glyph ID ("GID") to its codepoint. If a glyph has several codepoints, then a secondary table mapping codepoint to string ID ("SID") of the glyph's name is present. (A separate table associates each glyph with its name already.) I haven't seen this used in the wild, but the structure of the supplemental data is also going to be needed for built-in encodings.	2023-10-17 10:21:38 +02:00
Nico Weber	37daeae6fd	LibPDF: Add spec comments, dbgln_if()s to CFF's parse_encoding()	2023-10-17 10:21:38 +02:00
Nico Weber	007d7cdd53	LibPDF: Fix sign (and fixed point) in glyph decoding opcode 24 Two bugs: 1. We decoded a u32, not an i32 as the spec wants 2. (minor) Our fixed-point divisor was off by one Fixes text rendering in Bakke2010a.pdf in pdffiles, and rendering of other fonts with negative width adjustments from opcode 255. That PDF was produced by "Apple pstopdf" and uses font SFBX1200, which is apparently a variant of Computer Modern. So maybe this helps with lots of PDFs produced from TeX files, but I haven't checked that.	2023-10-16 08:33:35 +02:00
Nico Weber	96a4936567	LibPDF: Checking for built-in CFF encodings Only prints a warning for them for now. Also warn on the not-yet-implemented encoding supplement.	2023-10-16 08:32:18 +02:00
Nico Weber	414a164850	LibPDF: Be louder about unimplemented CFF dict entries	2023-10-16 08:32:18 +02:00
Nico Weber	c825194fb9	LibPDF: Reject CFFs with more than one font The code assumes that there's just one Top DICT, so let's be loud when that isn't the case.	2023-10-16 08:32:18 +02:00
Nico Weber	6f783929dd	LibPDF: Implement support for CFF charset format 2 I haven't seen this being used in the wild (yet), but it's easy to implement, and with this we support all charset formats. So we can now mention if we see a format we don't know about.	2023-10-15 15:27:15 +02:00
Nico Weber	5b915fb15c	LibPDF: Add more spec comments to parse_charset()	2023-10-15 15:27:15 +02:00
Nico Weber	49275c4b17	LibPDF: Don't overflow SIDs in type 1 charset parsing first_sid has type SID (aka u16), so don't store it in an u8. This fixes (among other things) page 24 on the PDF 1.7 spec.	2023-10-15 15:27:15 +02:00
Nico Weber	23d6e9f577	LibPDF: Implement CFF built-in charsets ISOAdobe, Expert, Expert Subset	2023-10-15 09:33:34 +02:00
Nico Weber	8060957d8d	LibPDF: Use Appendix A instead of Appendix C for standard names From "10 String INDEX": "Further space saving is obtained by allocating commonly occurring strings to predefined SIDs. These strings, known as the standard strings, describe all the names used in the ISOAdobe and Expert character sets along with a few other strings common to Type 1 fonts. A complete list of standard strings is given in Appendix A. The client program will contain an array of standard strings with nStdStrings elements. Thus, the standard strings take SIDs in the range 0 to (nStdStrings-1)." And "13 Charsets" says that charsets store SIDs. Fixes all "Couldn't find string for SID $n, going with space" messages when going through the encoding pages (page 1010 and thereabouts) in the PDF 1.7 spec.	2023-10-15 09:33:34 +02:00
Nico Weber	aba787a441	LibPDF: Implement reading of CFF String Index Only really useful for reading SIDs in the Top DICT (copyright text etc), which we currently don't do. I haven't seen a difference from looking things up in the string table. The only real effect from the commit that I need is that it pulls a local resolve() lambda into a real function resolve_sid(), which I want to call in a future commit. But it makes things more spec-compliant, and if we ever want to read SIDs in metadata in the future, now we can.	2023-10-15 09:33:34 +02:00
Nico Weber	3c49d0dad3	LibPDF: Add a CFF_DEBUG toggle I'd like to put some debug prints behind this soon. No behavior change.	2023-10-15 07:14:29 +02:00
Nico Weber	2249e79630	LibPDF: Add two FIXMEs	2023-10-13 07:53:27 +02:00
Nico Weber	d451197d3d	LibPDF: Add spec comments to CFF	2023-10-13 07:53:27 +02:00
Nico Weber	349996f7f2	LibPDF: Don't crash on files with float CFF defaultWidthX We'd unconditionally get the int from a Variant<int, float> here, but PDFs often have a float for defaultWidthX and nominalWidthX. Fixes crash opening Bakke2010a.pdf from pdffiles (but while the file loads ok, it looks completely busted).	2023-10-12 19:43:57 +02:00
Andreas Kling	13db3c5ce0	LibGfx: Convert FontDatabase APIs to use FlyString	2023-09-06 11:29:03 -04:00
Nico Weber	934340d845	LibPDF: Add FIXME for CIDFontType2 creation Move some code only needed for CIDFontType2 creation into a new function and add a FIXME describing what needs to happen there.	2023-08-14 16:26:09 +02:00
Nico Weber	1c263eee61	LibPDF: Add spec comments and FIXMEs to Type0Font::draw_string()	2023-08-14 16:26:09 +02:00
Nico Weber	715b6f868f	LibPDF: Sketch out Type0 font support some more Type0 fonts can be either CFF-based or TrueType-based. Create a subclass for each, put in some spec text, and give each case a dedicated error code, so that `--debugging-stats` can tell me which branch is more common.	2023-07-25 12:10:36 +02:00
Nico Weber	e3cc05b935	LibPDF: Don't ignore word_spacing	2023-07-22 12:24:29 -04:00
Nico Weber	9283c939bb	LibPDF: Include `width` in Type1Font glyph cache key LibGfx's ScaledFont doesn't do this, but in ScaledFont m_x_scale and m_y_scale are immutable once the class is created, so it can get away with not doing it. In Type1Font, `width` changes in different calls to Type1Font::draw_glyph(), so we need to make it part of the cache key. Fixes rendering of the word "Version" on the first page of pdf_reference_1-7.pdf.	2023-07-21 07:01:09 +02:00
Matthew Olsson	5f8fd47214	LibPDF: Resize fonts when the text and line matrices change	2023-07-20 06:56:41 +01:00
Nico Weber	117a5f1bd2	LibPDF: Remove an unused variable	2023-07-12 19:02:56 +02:00
MacDue	e1cf868e6e	LibGfx: Use AntiAliasingPainter::fill_path() for drawing font glyphs Using the general AA painter fill_path() is indistinguishable from the previous rasterizer, so this switch simply allows us to share more code.	2023-07-10 20:56:25 +02:00
Timothy Flynn	c911781c21	Everywhere: Remove needless trailing semi-colons after functions This is a new option in clang-format-16.	2023-07-08 10:32:56 +01:00
Nico Weber	f56b897622	Everywhere: Fix a few typos Some even user-visible!	2023-04-12 19:37:35 +02:00
Julian Offenhäuser	bdd5f36121	LibPDF: Load replacements for TrueTypeFonts without an embedded font This previously only happened for Type 1 fonts.	2023-03-25 16:27:30 -06:00
Julian Offenhäuser	5deac3a7f5	LibPDF: Actually return an error when failing to load replacement fonts	2023-03-25 16:27:30 -06:00
Julian Offenhäuser	fec7ccf020	LibPDF: Ask OpenType font programs for glyph widths if needed If the font dictionary didn't specify custom glyph widths, we would fall back to the specified "missing width" (or 0 in most cases!), which meant that we would draw glyphs on top of each other in a lot of cases, namely for TrueTypeFonts or standard Type1Fonts with an OpenType fallback. What we actually want to do in this case is ask the OpenType font for the correct width.	2023-03-25 16:27:30 -06:00
Julian Offenhäuser	2b3a41be74	LibPDF: Remove the subroutine length limit for PS1 font programs A limit of 1024 subroutines seemed like a sensible choice, but some fonts actually do exceed it. We will now only assert that the specified amount is positive.	2023-03-25 16:27:30 -06:00
Julian Offenhäuser	3400779047	LibPDF: Pass the right point width to the font loader in TrueTypeFont	2023-03-22 09:04:00 +01:00
Rodrigo Tobar	4a20751ff6	LibPDF: Detect CFF encodings with supplements These are not yet actually parsed, but detecting them means we at least don't fail to understand the actual format value, which was causing some CFF fonts to fail to load.	2023-03-02 12:18:53 +01:00
Rodrigo Tobar	9bca62c5fa	LibPDF: Increase argument stack for Type1FontPrograms Type1 imposes a stack limit of 24 elements, but Type2 has a limit of 48. We are better off relaxing the limit of the former in favour of properly supporting the latter.	2023-03-02 12:18:53 +01:00
Rodrigo Tobar	de5e7b487c	LibPDF: Improve Type2 hint counting There were two issues with how we counted hints with Type2 CharString commands: the first was that we assumed a single hint per command, even though there are commands that accept multiple hints thanks to taking a variable number of operands; and secondly, the hintmask/cntrmask commands can also take operands (i.e., hints) themselves in certain situations. This commit fixes these two issues by correctly counting hints in both cases. This in turn fixes cases when there were more than 8 hints in total, therefore a hintmask/cntrmask command needed to read more than one byte past the operator itself.	2023-03-02 12:18:53 +01:00
Rodrigo Tobar	cb04e4e9da	LibPDF: Refactor Font classes The PDFFont class hierarchy was very simple (a top-level PDFFont class, followed by all the children classes that derived directly from it). While this design was good enough for some things, it didn't correctly model the actual organization of font types: PDF fonts are first divided between "simple" and "composite" fonts. The latter is the Type0 font, while the rest are all simple. * PDF fonts yield a glyph per "character code". Simple fonts' char codes are always 1 byte long, while Type0 char codes are of variable size. To this effect, this commit changes the hierarchy of Font classes, introducing a new SimpleFont class, deriving from PDFFont, and acting as the parent of Type1Font and TrueTypeFont, while Type0 still derives from PDFFont directly. This distinction allows us now to: * Model string rendering differently from simple and composite fonts: PDFFont now offers a generic draw_string method that takes a whole string to be rendered instead of a single char code. SimpleFont implements this as a loop over individual bytes of the string, with T1 and TT implementing draw_glyph for drawing a single char code. * Some common fields between T1 and TT fonts now live under SimpleFont instead of under PDFFont, where they previously resided. * Some other interfaces specific to SimpleFont have been cleaned up, with u16/u32 not appearing on these classes (or in PDFFont) anymore. * Type0Font's rendering still remains unimplemented. As part of this exercise I also took the chance to perform the following cleanups and restructurings: * Refactored the creation and initialisation of fonts. They are all centrally created at PDFFont::create, with a virtual "initialize" method that allows them to initialise their inner members in the correct order (parent first, child later) after creation. * Removed duplicated code. * Cleaned up some public interfaces: receive const refs, removed unnecessary ctor/dtors, etc.
* Slightly changed how Type1 and TrueType fonts are implemented: if there's an embedded font that takes priority, otherwise we always look for a replacement. * This means we don't do anything special for the standard fonts. The only behavior previously associated to standard fonts was choosing an encoding, and even that was under questioning.	2023-02-24 20:16:50 +01:00
Rodrigo Tobar	c4507bb56e	LibPDF: Add more built-in SIDs The first iteration has enough SIDs to display simple documents, but when trying more and more documents we started to need more of these SIDs to be properly defined. This is a copy/paste exercise from the CFF document, which is tedious, so it will continue in small drops. This commit fills all the gaps until SID 228, which covers all the ISOAdobe space, and should be enough for most use cases. Since this is a continuous space starting at 0, we now use an Array instead of a Map to store these names, which should be more performant. Also to simplify things I've moved the Array out of the CFF class, making it a simpler static variable, which allows us to use template type deduction.	2023-02-13 00:23:17 +00:00
Julian Offenhäuser	a2b57dd188	LibPDF: Return an error if we fail to load a replacement font	2023-02-12 10:55:37 +00:00
Julian Offenhäuser	4f4bd3793f	LibPDF: Fix glyph sizing bug that caused incorrect spacing When loading OpenType fonts, either as a replacement for the standard 14 fonts or an embedded one, we previously passed the font size as the _point_ size to the loader class. The difference is quite subtle, being that Gfx::ScaledFont uses the optional dpi parameter to convert the input from points to pixels. This meant that our glyphs were a factor of about 1.333 too large, causing them to overlap in places.	2023-02-10 15:37:51 +01:00
Julian Offenhäuser	152a8c5c43	LibPDF: Use more appropriate standard 14 replacement fonts The mapping of standard font to replacement now looks like this: Times New Roman -> Liberation Serif Courier -> Liberation Mono Helvetica, Arial -> Liberation Sans	2023-02-10 15:37:51 +01:00
Rodrigo Tobar	e4a7606b81	LibPDF: Construct accented characters with Type1 seac command The seac command provides the base and accented character that are needed to create an accented character glyph. Storing these values is all that was left to properly support these composed glyphs.	2023-02-08 19:47:15 +01:00
Rodrigo Tobar	3eaa27f53a	LibPDF: Add infrastructure for accented character glyphs Type1 accented character glyphs are composed of two other glyphs in the same font: a base glyph and an accent glyph, given as char codes in the standard encoding. These two glyphs are then composed together to form the accented character. This commit adds the data structures to hold the information for accented characters, and also the routine that composes the final glyph path out of the two individual components. All glyphs must have been loaded by the time this composition takes place, and thus a new protected consolidate_glyphs() routine has been added to perform this calculation.	2023-02-08 19:47:15 +01:00
Rodrigo Tobar	11a9bfd4b6	LibPDF: Turn Glyph into a class Glyph was a simple structure, but even now it's become more complex than it was initially. Turning it into a class hides some of that complexity, and makes it easier to understand to external eyes. While doing this I also decided to remove the float + bool combo for keeping track of the glyph's width, and replaced it with an Optional instead.	2023-02-08 19:47:15 +01:00
Rodrigo Tobar	c084943457	LibPDF: Index Type1 glyphs by name, not char code Storing glyphs indexed by char code in a Type1 Font Program binds a Font Program instance to the particular Encoding that was used at Font Program construction time. This makes it difficult to reuse Font Program instances against different Encodings, which would be otherwise possible. This commit changes how we store the glyphs on Type1 Font Programs. Instead of storing them on a map indexed by char code, the map is now indexed by glyph name. In turn, when rendering a glyph we use the Encoding object to turn the char code into a glyph name, which in turn is used to index into the map of glyphs. This is the first step towards reusability of Type1 Font Programs. It also unlocks the ability to render glyphs that are described via the "seac" command (standard encoding accented character), which requires accessing the base and accent glyphs by name.	2023-02-08 19:47:15 +01:00
Rodrigo Tobar	596119cf3e	LibPDF: Add placeholders for *flex Type2 commands These should be implemented properly in the future, but for now we are adding them as placeholders to avoid crashes.	2023-02-08 19:47:15 +01:00
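
Commit 007d7cdd53 above describes two bugs in decoding the number that follows byte 255 in a Type 2 charstring: the four bytes were read as an unsigned value, and the fixed-point divisor was off by one. The following is a minimal sketch of the corrected decoding, not LibPDF's actual code. The function name `decode_fixed_16_16` and its signature are hypothetical, and the cast from `uint32_t` to `int32_t` assumes a two's complement target (guaranteed from C++20 on).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not a LibPDF function): interprets the four bytes that
// follow a 255 byte in a Type 2 charstring as a signed 16.16 fixed-point value.
double decode_fixed_16_16(std::uint8_t const* bytes, std::size_t size)
{
    assert(size >= 4);

    // Assemble the big-endian 32-bit pattern in an unsigned type first, so the
    // shifts are well defined.
    std::uint32_t raw = (std::uint32_t(bytes[0]) << 24)
        | (std::uint32_t(bytes[1]) << 16)
        | (std::uint32_t(bytes[2]) << 8)
        | std::uint32_t(bytes[3]);

    // Reinterpret as signed: skipping this step is the "u32, not i32" bug.
    auto value = static_cast<std::int32_t>(raw);

    // 16 fractional bits means dividing by 2^16 = 65536, not 65535.
    return value / 65536.0;
}
```

With this sketch, the byte sequence `FF FF 80 00` decodes to -0.5, whereas an unsigned read would produce a large positive number.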

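Commit de5e7b487c above explains that stem commands can declare several hints at once, and that a hintmask/cntrmask operator is followed by more than one mask byte once the font declares more than 8 hints. A rough sketch of that arithmetic follows, assuming each hint is an operand pair and that an odd leading operand is the optional glyph width. All three function names are hypothetical and do not come from LibPDF.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: a stem command takes operand pairs, so n operands
// declare n / 2 hints. Integer division drops an odd leading width operand.
std::size_t hints_from_stem_operands(std::size_t operand_count)
{
    return operand_count / 2;
}

// Hypothetical helper: the mask carries one bit per declared hint, rounded up
// to whole bytes.
std::size_t hint_mask_byte_count(std::size_t hint_count)
{
    return (hint_count + 7) / 8;
}

// Hypothetical helper: returns the offset just past the mask bytes, given the
// offset of the first byte after the mask operator and the charstring size.
std::size_t skip_hint_mask(std::size_t offset, std::size_t size, std::size_t hint_count)
{
    std::size_t mask_bytes = hint_mask_byte_count(hint_count);
    assert(offset <= size && mask_bytes <= size - offset);
    return offset + mask_bytes;
}
```

Under these assumptions, 8 hints need a single mask byte while 9 hints need two, which is the case the commit message says was previously mishandled.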

189 commits