serenity

mirror of https://github.com/RGBCube/serenity synced 2025-07-17 08:47:35 +00:00

Author	SHA1	Message	Date
Nico Weber	934340d845	LibPDF: Add FIXME for CIDFontType2 creation Move some code only needed for CIDFontType2 creation into a new function and add a FIXME describing what needs to happen there.	2023-08-14 16:26:09 +02:00
Nico Weber	1c263eee61	LibPDF: Add spec comments and FIXMEs to Type0Font::draw_string()	2023-08-14 16:26:09 +02:00
Nico Weber	715b6f868f	LibPDF: Sketch out Type0 font support some more Type0 fonts can be either CFF-based or TrueType-based. Create a subclass for each, put in some spec text, and give each case a dedicated error code, so that `--debugging-stats` can tell me which branch is more common.	2023-07-25 12:10:36 +02:00
Nico Weber	e3cc05b935	LibPDF: Don't ignore word_spacing	2023-07-22 12:24:29 -04:00
Nico Weber	9283c939bb	LibPDF: Include `width` in Type1Font glyph cache key LibGfx's ScaledFont doesn't do this, but in ScaledFont m_x_scale and m_y_scale are immutable once the class is created, so it can get away with not doing it. In Type1Font, `width` changes in different calls to Type1Font::draw_glyph(), so we need to make it part of the cache key. Fixes rendering of the word "Version" on the first page of pdf_reference_1-7.pdf.	2023-07-21 07:01:09 +02:00
Matthew Olsson	5f8fd47214	LibPDF: Resize fonts when the text and line matrices change	2023-07-20 06:56:41 +01:00
Nico Weber	117a5f1bd2	LibPDF: Remove an unused variable	2023-07-12 19:02:56 +02:00
MacDue	e1cf868e6e	LibGfx: Use AntiAliasingPainter::fill_path() for drawing font glyphs Using the general AA painter fill_path() is indistinguishable from the previous rasterizer, so this switch simply allows us to share more code.	2023-07-10 20:56:25 +02:00
Timothy Flynn	c911781c21	Everywhere: Remove needless trailing semi-colons after functions This is a new option in clang-format-16.	2023-07-08 10:32:56 +01:00
Nico Weber	f56b897622	Everywhere: Fix a few typos Some even user-visible!	2023-04-12 19:37:35 +02:00
Julian Offenhäuser	bdd5f36121	LibPDF: Load replacements for TrueTypeFonts without an embedded font This previously only happened for Type 1 fonts.	2023-03-25 16:27:30 -06:00
Julian Offenhäuser	5deac3a7f5	LibPDF: Actually return an error when failing to load replacement fonts	2023-03-25 16:27:30 -06:00
Julian Offenhäuser	fec7ccf020	LibPDF: Ask OpenType font programs for glyph widths if needed If the font dictionary didn't specify custom glyph widths, we would fall back to the specified "missing width" (or 0 in most cases!), which meant that we would draw glyphs on top of each other in a lot of cases, namely for TrueTypeFonts or standard Type1Fonts with an OpenType fallback. What we actually want to do in this case is ask the OpenType font for the correct width.	2023-03-25 16:27:30 -06:00
Julian Offenhäuser	2b3a41be74	LibPDF: Remove the subroutine length limit for PS1 font programs A limit of 1024 subroutines seemed like a sensible choice, but some fonts actually do exceed it. We will now only assert that the specified amount is positive.	2023-03-25 16:27:30 -06:00
Julian Offenhäuser	3400779047	LibPDF: Pass the right point width to the font loader in TrueTypeFont	2023-03-22 09:04:00 +01:00
Rodrigo Tobar	4a20751ff6	LibPDF: Detect CFF encodings with supplements These are not yet actually parsed, but detecting them means we at least don't fail to understand the actual format value, which was causing some CFF fonts to fail to load.	2023-03-02 12:18:53 +01:00
Rodrigo Tobar	9bca62c5fa	LibPDF: Increase argument stack for Type1FontPrograms Type1 imposes a stack limit of 24 elements, but Type2 has a limit of 48. We are better off relaxing the limit of the former in favour of properly supporting the latter.	2023-03-02 12:18:53 +01:00
Rodrigo Tobar	de5e7b487c	LibPDF: Improve Type2 hint counting There were two issues with how we counted hints with Type2 CharString commands: the first was that we assumed a single hint per command, even though there are commands that accept multiple hints thanks to taking a variable number of operands; and secondly, the hintmask/cntrmask commands can also take operands (i.e., hints) themselves in certain situations. This commit fixes these two issues by correctly counting hints in both cases. This in turn fixes cases when there were more than 8 hints in total, therefore a hintmask/cntrmask command needed to read more than one byte past the operator itself.	2023-03-02 12:18:53 +01:00
Rodrigo Tobar	cb04e4e9da	LibPDF: Refactor Font classes The PDFFont class hierarchy was very simple (a top-level PDFFont class, followed by all the children classes that derived directly from it). While this design was good enough for some things, it didn't correctly model the actual organization of font types: * PDF fonts are first divided between "simple" and "composite" fonts. The latter is the Type0 font, while the rest are all simple. * PDF fonts yield a glyph per "character code". Simple fonts' char codes are always 1 byte long, while Type0 char codes are of variable size. To this effect, this commit changes the hierarchy of Font classes, introducing a new SimpleFont class, deriving from PDFFont, and acting as the parent of Type1Font and TrueTypeFont, while Type0 still derives from PDFFont directly. This distinction allows us now to: * Model string rendering differently from simple and composite fonts: PDFFont now offers a generic draw_string method that takes a whole string to be rendered instead of a single char code. SimpleFont implements this as a loop over individual bytes of the string, with T1 and TT implementing draw_glyph for drawing a single char code. * Some common fields between T1 and TT fonts now live under SimpleFont instead of under PDFFont, where they previously resided. * Some other interfaces specific to SimpleFont have been cleaned up, with u16/u32 not appearing on these classes (or in PDFFont) anymore. * Type0Font's rendering still remains unimplemented. As part of this exercise I also took the chance to perform the following cleanups and restructurings: * Refactored the creation and initialisation of fonts. They are all centrally created at PDFFont::create, with a virtual "initialize" method that allows them to initialise their inner members in the correct order (parent first, child later) after creation. * Removed duplicated code. * Cleaned up some public interfaces: receive const refs, removed unnecessary ctors/dtors, etc. * Slightly changed how Type1 and TrueType fonts are implemented: if there's an embedded font, that takes priority; otherwise we always look for a replacement. * This means we don't do anything special for the standard fonts. The only behavior previously associated with standard fonts was choosing an encoding, and even that was in question.	2023-02-24 20:16:50 +01:00
Rodrigo Tobar	c4507bb56e	LibPDF: Add more built-in SIDs The first iteration has enough SIDs to display simple documents, but when trying more and more documents we started to need more of these SIDs to be properly defined. This is a copy/paste exercise from the CFF document, which is tedious, so it will continue in small drops. This commit fills all the gaps until SID 228, which covers all the ISOAdobe space, and should be enough for most use cases. Since this is a continuous space starting at 0, we now use an Array instead of a Map to store these names, which should be more performant. Also to simplify things I've moved the Array out of the CFF class, making it a simpler static variable, which allows us to use template type deduction.	2023-02-13 00:23:17 +00:00
Julian Offenhäuser	a2b57dd188	LibPDF: Return an error if we fail to load a replacement font	2023-02-12 10:55:37 +00:00
Julian Offenhäuser	4f4bd3793f	LibPDF: Fix glyph sizing bug that caused incorrect spacing When loading OpenType fonts, either as a replacement for the standard 14 fonts or an embedded one, we previously passed the font size as the _point_ size to the loader class. The difference is quite subtle, being that Gfx::ScaledFont uses the optional dpi parameter to convert the input from inches to pixels. This meant that our glyphs were exactly 1.333% too large, causing them to overlap in places.	2023-02-10 15:37:51 +01:00
Julian Offenhäuser	152a8c5c43	LibPDF: Use more appropriate standard 14 replacement fonts The mapping of standard font to replacement now looks like this: Times New Roman -> Liberation Serif; Courier -> Liberation Mono; Helvetica, Arial -> Liberation Sans	2023-02-10 15:37:51 +01:00
Rodrigo Tobar	e4a7606b81	LibPDF: Construct accented characters with Type1 seac command The seac command provides the base and accented character that are needed to create an accented character glyph. Storing these values is all that was left to properly support these composed glyphs.	2023-02-08 19:47:15 +01:00
Rodrigo Tobar	3eaa27f53a	LibPDF: Add infrastructure for accented character glyphs Type1 accented character glyphs are composed of two other glyphs in the same font: a base glyph and an accent glyph, given as char codes in the standard encoding. These two glyphs are then composed together to form the accented character. This commit adds the data structures to hold the information for accented characters, and also the routine that composes the final glyph path out of the two individual components. All glyphs must have been loaded by the time this composition takes place, and thus a new protected consolidate_glyphs() routine has been added to perform this calculation.	2023-02-08 19:47:15 +01:00
Rodrigo Tobar	11a9bfd4b6	LibPDF: Turn Glyph into a class Glyph was a simple structure, but even now it's become more complex than it was initially. Turning it into a class hides some of that complexity, and makes it easier to understand for external eyes. While doing this I also decided to remove the float + bool combo for keeping track of the glyph's width, and replaced it with an Optional instead.	2023-02-08 19:47:15 +01:00
Rodrigo Tobar	c084943457	LibPDF: Index Type1 glyphs by name, not char code Storing glyphs indexed by char code in a Type1 Font Program binds a Font Program instance to the particular Encoding that was used at Font Program construction time. This makes it difficult to reuse Font Program instances against different Encodings, which would be otherwise possible. This commit changes how we store the glyphs on Type1 Font Programs. Instead of storing them on a map indexed by char code, the map is now indexed by glyph name. In turn, when rendering a glyph we use the Encoding object to turn the char code into a glyph name, which in turn is used to index into the map of glyphs. This is the first step towards reusability of Type1 Font Programs. It also unlocks the ability to render glyphs that are described via the "seac" command (standard encoding accented character), which requires accessing the base and accent glyphs by name.	2023-02-08 19:47:15 +01:00
Rodrigo Tobar	596119cf3e	LibPDF: Add placeholders for *flex Type2 commands These should be implemented properly in the future, but for now we are adding them as placeholders to avoid crashes.	2023-02-08 19:47:15 +01:00
Rodrigo Tobar	82bd854d6f	LibPDF: Account for other endings of PS1 Encoding array	2023-02-08 19:47:15 +01:00
Rodrigo Tobar	286e3e6872	LibPDF: Simplify Encoding to align with simple font requirements All "Simple Fonts" in PDF (all but Type0 fonts) have the property that glyphs are selected with single byte character codes. This means that the Encoding objects should use u8 for representing these character codes. Moreover, and as mentioned in a previous commit, there is no need to store the unicode code point associated with a character (which was in turn wrongly associated to a glyph). This commit greatly simplifies the Encoding class. Namely it: * Removes the unnecessary CharDescriptor class. * Changes the internal maps to be u8 -> FlyString and vice-versa, effectively providing two-way lookups. * Adds a new method to set a two-way u8 -> FlyString mapping and uses it in all possible places. * Simplified the creation of Encoding objects. * Changes how the WinAnsi special treatment for bullet points is implemented.	2023-02-02 14:50:38 +01:00
Rodrigo Tobar	fb0c3a9e18	LibPDF: Stop calculating code points for glyphs When rendering text, a sequence of bytes corresponds to a glyph, but not necessarily to a character. This misunderstanding permeated through the Encoding through to the Font classes, which were all trying to calculate such values. Moreover, this was done only to identify "space" characters/glyphs, which were getting a special treatment (e.g., avoid rendering). Spaces are not special though -- there might be fonts that render something for them -- and thus should not be skipped	2023-02-02 14:50:38 +01:00
Tim Schumacher	ae64b68717	AK: Deprecate the old `AK::Stream` This also removes a few cases where the respective header wasn't actually required to be included.	2023-01-29 19:16:44 -07:00
Sam Atkins	bc1504c794	LibPDF: Remove declarations for non-existent methods	2023-01-27 20:33:18 +00:00
Rodrigo Tobar	3588709986	LibPDF: Load Type1C fonts when found Now that our CFF parser is working we can load Type1C fonts in PDF, which are backed by a CFF stream.	2023-01-25 15:40:11 +01:00
Rodrigo Tobar	c4b45a82cd	LibPDF: Add initial CFF parsing The Compact Font Format specification (Adobe's Technical Note #5176) is used by PDF's Type1C fonts to store their data. While being similar in spirit to PS1 Type 1 Font Programs, it was designed for a more compact representation and thus space reduction (but an increase in complexity). It also shares most of the charstring encoding logic, which is why the CFF class also inherits from Type1FontProgram. This initial implementation is still lacking many details, e.g.: * It doesn't include all the built-in CFF SIDs * It doesn't support CFF-provided SIDs (defaults those glyphs to the space character) * More checks in general	2023-01-25 15:40:11 +01:00
Rodrigo Tobar	1ec4ad5eb6	LibPDF: Add name -> char code conversion in Encoding This is an operation that was already being done (sub-optimally) in PS1FontProgram, so we are replacing that. We will use this during CFF parsing too.	2023-01-25 15:40:11 +01:00
Rodrigo Tobar	1b90ea7d3a	LibPDF: Augment Type1FontProgram with Type2 capabilities The Type1FontProgram logic was based on the Adobe Type 1 Font Format; in particular, it implemented the CharStrings Dictionary section (charstring decoding, and most commands). In the case of Type1, these charstrings are read from a PS1 dictionary, with one entry per character in the font's charset. This has served us well for Type1 font rendering. When implementing Type1C font rendering, this wasn't enough. Type1C PDF fonts are specified in embedded CFF (Compact Font Format) streams, which also contain a charstring dictionary with an entry for each character in the font's charset. These entries can be slightly different from those in a PS1 Font Program though: depending on a flag in the CFF, the entries will be encoded either in the original charstring format from the Adobe Type 1 Font Format, or in the "Type 2 Charstring Format" (Adobe's Technical Note #5177). This new format is for the most part a super-set of the original, with small differences, all in the name of making the representation as compact as possible: * The glyph's width is not specified via a separate command; instead it's an optional additional argument to the first command of the charstring stream (and even then, it's only the difference to a nominal character width specified in the CFF). * The interpretation of a 4-byte number is different from Type 1: in Type 1 this is a 4-byte unsigned integer, whereas in Type 2 it's a fixed decimal with 16 bits of fractional part. * Many commands accept a variable set of arguments, so they can draw more than one line/curve in a single go. These are all retro-compatible with Type 1's commands. All these changes are implemented in this patch in a backwards-compatible way. To ensure Type 1/2 behavior is accessed, a new parameter indicates which behavior is desired when decoding the charstring stream. I also took the chance to centralise some logic that was previously duplicated across the parse_glyph function. Common lambdas capture the logic for moving to, or drawing a line/curve to a given point and updating the glyph state. Similarly, some command logic, including reading parameters, is shared by several commands. Finally, I've re-organised the cases in the main switch to group together related commands.	2023-01-25 15:40:11 +01:00
Rodrigo Tobar	f06de0fa07	LibPDF: Remove unused member	2023-01-25 15:40:11 +01:00
Rodrigo Tobar	416585f75a	LibPDF: Add new Type1FontProgram base class We are planning to add support for CFF fonts to read Type1 fonts, and therefore much of the logic already found in PS1FontProgram will be useful for representing the Type1 fonts read from CFF. This commit moves the PS1-independent bits of PS1FontProgram into a new Type1FontProgram base class that can be used as the base for CFF-based Type1 fonts in the future. The Type1Font class uses this new type now instead of storing a PS1FontProgram pointer. While doing this refactoring I also took care of making some minor adjustments to the PS1FontProgram API, namely: * Its create() method is static and returns a NonnullRefPtr<Type1FontProgram>. * Many (all?) of the parse_* methods are now static. * Added const where possible. Notably, the Type1FontProgram also contains at the moment the code that parses the CharString data from the PS1 program. This logic is very similar in CFF files, so after some minor adjustments later on it should be possible to reuse most of it.	2023-01-25 15:40:11 +01:00
Rodrigo Tobar	e751ec2089	LibPDF: Avoid reading fields from moved-from Data object This might not be an issue at the moment, but moved-from objects are usually in an unspecified but valid state, meaning that we shouldn't read from them.	2023-01-25 15:40:11 +01:00
Rodrigo Tobar	bfeca4ebb3	LibPDF: Record base font name read from document This will be useful for debugging, or if we later on want to show all the fonts found in the document in an organised manner.	2023-01-25 15:40:11 +01:00
Timothy Flynn	f3db548a3d	AK+Everywhere: Rename FlyString to DeprecatedFlyString DeprecatedFlyString relies heavily on DeprecatedString's StringImpl, so let's rename it to A) match the name of DeprecatedString, B) write a new FlyString class that is tied to String.	2023-01-09 23:00:24 +00:00
Julian Offenhäuser	2a70ea3ee7	LibPDF: Propagate errors in PDFFont::create()	2023-01-09 22:54:36 +00:00
Julian Offenhäuser	ac31b1bda3	LibPDF: Make glyphs from standard 14 fonts show up in Type1Font Previously, we would assume that all standard 14 fonts use a TrueTypeFont dictionary. Now we render them in Type1Font as well, given that it doesn't contain a PostScript font program.	2023-01-09 22:54:36 +00:00
MacDue	eeb6072f15	LibGfx+LibPDF: Apply subpixel offset in affine transformation	2023-01-05 13:50:26 +01:00
MacDue	91db49f7b3	LibPDF: Use subpixel accurate text rendering This just enables the new tricks from LibGfx with the same nice improvements :^)	2023-01-05 12:09:35 +01:00
Andreas Kling	f982400063	LibGfx: Rename TTF/TrueType to OpenType OpenType is the backwards-compatible successor to TrueType, and the format we're actually parsing in LibGfx. So let's call it that.	2022-12-21 08:44:22 +01:00
Rodrigo Tobar	a1af79dca6	LibPDF: Follow a FontFile's Length values These can be references (at least from what I've found in some documents), so we want to resolve them before using them.	2022-12-16 01:24:43 -07:00
Rodrigo Tobar	41bd304a7f	LibPDF: Ignore seac PS1 commands for now This command is meant to print a Standard Encoding Accented Character. It's not critical to implement it yet, but if we want to render more documents we need to handle the instruction, even if we simply ignore it.	2022-12-16 01:24:43 -07:00
Andreas Kling	d6a3be1615	LibPDF: Add missing character quirk for WinAnsiEncoding fonts Fonts with the encoding name "WinAnsiEncoding" should render missing characters above character code 040 (octal) as a "bullet" character. This patch adds Encoding::should_map_to_bullet(char_code) which is then called by char_code_to_code_point() to check if the given char code should be displayed as a bullet instead. I didn't have a good way to test this, so I've only verified that it works by manually overriding inputs to the function during the rendering stage. This takes care of a FIXME in the Annex D part of the PDF specification.	2022-12-08 09:54:20 +01:00
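
The hint counting described in de5e7b487c can be illustrated with a small sketch. This is not LibPDF's code: the function names `hints_declared` and `hint_mask_byte_count` are hypothetical, and the sketch only assumes what the Type 2 charstring format states, namely that stem hint operators take their operands in pairs and that a hint mask carries one bit per declared hint, padded to a whole byte.

```cpp
#include <cstddef>

// Hypothetical helper (not part of LibPDF): a stem hint operator such as
// hstem or vstem takes its operands in pairs, so one operator can declare
// several hints. An odd operand count on the first operator means a leading
// glyph width is present; integer division drops it.
constexpr std::size_t hints_declared(std::size_t operand_count)
{
    return operand_count / 2;
}

// Hypothetical helper (not part of LibPDF): number of mask bytes that follow
// a hintmask/cntrmask operator. One bit per declared hint, rounded up to a
// whole byte.
constexpr std::size_t hint_mask_byte_count(std::size_t total_hint_count)
{
    return (total_hint_count + 7) / 8;
}
```

With eight or fewer hints the mask fits in one byte, which is why undercounting hints only shows up on glyphs that declare more than eight of them.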


167 commits