serenity

mirror of https://github.com/RGBCube/serenity synced 2025-10-15 04:32:23 +00:00

Author	SHA1	Message	Date
Rodrigo Tobar	e4a7606b81	LibPDF: Construct accented characters with Type1 seac command The seac command provides the base and accented character that are needed to create an accented character glyph. Storing these values is all that was left to properly support these composed glyphs.	2023-02-08 19:47:15 +01:00
Rodrigo Tobar	3eaa27f53a	LibPDF: Add infrastructure for accented character glyphs Type1 accented character glyphs are composed of two other glyphs in the same font: a base glyph and an accent glyph, given as char codes in the standard encoding. These two glyphs are then composed together to form the accented character. This commit adds the data structures to hold the information for accented characters, and also the routine that composes the final glyph path out of the two individual components. All glyphs must have been loaded by the time this composition takes place, and thus a new protected consolidate_glyphs() routine has been added to perform this calculation.	2023-02-08 19:47:15 +01:00
Rodrigo Tobar	11a9bfd4b6	LibPDF: Turn Glyph into a class Glyph was a simple structure, but it has become more complex than it was initially. Turning it into a class hides some of that complexity, and makes it easier to understand to external eyes. While doing this I also decided to remove the float + bool combo for keeping track of the glyph's width, and replaced it with an Optional instead.	2023-02-08 19:47:15 +01:00
Rodrigo Tobar	c084943457	LibPDF: Index Type1 glyphs by name, not char code Storing glyphs indexed by char code in a Type1 Font Program binds a Font Program instance to the particular Encoding that was used at Font Program construction time. This makes it difficult to reuse Font Program instances against different Encodings, which would be otherwise possible. This commit changes how we store the glyphs on Type1 Font Programs. Instead of storing them on a map indexed by char code, the map is now indexed by glyph name. In turn, when rendering a glyph we use the Encoding object to turn the char code into a glyph name, which in turn is used to index into the map of glyphs. This is the first step towards reusability of Type1 Font Programs. It also unlocks the ability to render glyphs that are described via the "seac" command (standard encoding accented character), which requires accessing the base and accent glyphs by name.	2023-02-08 19:47:15 +01:00
Rodrigo Tobar	596119cf3e	LibPDF: Add placeholders for *flex Type2 commands These should be implemented properly in the future, but for now we are adding them as placeholders to avoid crashes.	2023-02-08 19:47:15 +01:00
Rodrigo Tobar	64bbe431b5	LibPDF: Add char_code -> name mapping function We already keep both mappings internally, now it's time to actually use it.	2023-02-08 19:47:15 +01:00
Rodrigo Tobar	82bd854d6f	LibPDF: Account for other endings of PS1 Encoding array	2023-02-08 19:47:15 +01:00
Rodrigo Tobar	a533ea7ae6	LibPDF: Improve stream parsing When parsing streams we rely on a /Length item being defined in the stream's dictionary to know how much data comprises the stream. Its value is usually a direct value, but it can be indirect. There was however a contradiction in the code: the condition that allowed it to read and use the /Length value required it to be a direct value, but the actual code using the value would have worked with indirect ones. This meant that indirect /Length values triggered the fallback, "manual" stream parsing code. On the other hand, this latter code was also buggy, because it relied on the "endstream" keyword appearing on a separate line, which isn't always the case. This commit both fixes the bug in the manual stream parsing scenario and allows indirect /Length values to be used to parse streams more directly and avoid the manual approach. The main caveat to this second change is that for a brief period of time the Document is not able to resolve references (i.e., before the xref table itself is parsed). Any parsing happening before that (e.g., the linearization dictionary) must therefore use the manual stream parsing approach.	2023-02-08 19:47:15 +01:00
Tim Schumacher	220fbcaa7e	AK: Remove the fallible constructor from `FixedMemoryStream`	2023-02-08 17:44:32 +00:00
Tim Schumacher	261d62438f	AK: Remove the fallible constructor from `LittleEndianInputBitStream`	2023-02-08 17:44:32 +00:00
Rodrigo Tobar	82bac7e665	LibPDF: Fix clipping of painting operations While the clipping logic was correct (current vs. new clipping path), the clipping path contents weren't. This commit fixes that. We calculate the clipping path in two places: when we set it to be the whole page at graphics state creation time, and when we perform clipping path intersection to calculate a new clipping path. The clipping path is then used to limit painting by passing it to the painter (more precisely, by passing its bounding box to the painter, as the latter doesn't support arbitrary path clipping). For this last point the clipping path must be in device coordinates. There was however a mix of coordinate systems involved in the creation, update and usage of the clipping path: * The initial values of the path (i.e., the whole page) were in user coordinates. * Clipping path intersection was performed against m_current_path, which is in device coordinates. * To perform the clipping operation, the current clipping path was assumed to be in user coordinates. This mix resulted in the clipping not working correctly depending on the zoom level at which one visualised a page. This commit fixes the issue by always keeping track of the clipping path in device coordinates. This means that the initial full-page contents are now converted to device coordinates before putting them in the graphics state, and that no mapping is performed when applying the clipping to the painter.	2023-02-04 12:29:57 +01:00
Rodrigo Tobar	286e3e6872	LibPDF: Simplify Encoding to align with simple font requirements All "Simple Fonts" in PDF (all but Type0 fonts) have the property that glyphs are selected with single byte character codes. This means that the Encoding objects should use u8 for representing these character codes. Moreover, and as mentioned in a previous commit, there is no need to store the unicode code point associated with a character (which was in turn wrongly associated to a glyph). This commit greatly simplifies the Encoding class. Namely it: * Removes the unnecessary CharDescriptor class. * Changes the internal maps to be u8 -> FlyString and vice-versa, effectively providing two-way lookups. * Adds a new method to set a two-way u8 -> FlyString mapping and uses it in all possible places. * Simplifies the creation of Encoding objects. * Changes how the WinAnsi special treatment for bullet points is implemented.	2023-02-02 14:50:38 +01:00
Rodrigo Tobar	fb0c3a9e18	LibPDF: Stop calculating code points for glyphs When rendering text, a sequence of bytes corresponds to a glyph, but not necessarily to a character. This misunderstanding permeated through the Encoding through to the Font classes, which were all trying to calculate such values. Moreover, this was done only to identify "space" characters/glyphs, which were getting a special treatment (e.g., avoid rendering). Spaces are not special though -- there might be fonts that render something for them -- and thus should not be skipped.	2023-02-02 14:50:38 +01:00
Rodrigo Tobar	7c42d6c737	LibPDF: Fix ZapfDingbat's char codes The initial values were fine, but those starting at 100 were wrong: they are all octal values, but since they were missing an initial 0 they were interpreted as decimals.	2023-02-02 14:50:38 +01:00
Rodrigo Tobar	2f773b3c5c	LibPDF: Stop storing unicode code points in Encoding In PDF's fonts, encoding objects are used to translate bytes into fonts' glyphs. Fonts (the ones we currently support) organise their glyphs in such a way that they are accessed by name, and thus encodings translate between a byte sequence and a glyph name. Note that at no point does this translation include a Unicode character, and therefore assigning a character to a glyph in the Encoding object is the wrong thing to do. Moreover, using the code point for this character during the byte-sequence-to-glyph translation sequence is double-wrong. This commit removes the characters associated to each translation in the built-in Encoding objects. In order to keep commits short and sweet, I'm currently simply removing the character from the enumeration, leaving the old structure this information was held on intact. Instead, I'm filling the "code_point" member with a zero, and filling both mappings (which will be changed later on too) with the glyph name and the associated char code.	2023-02-02 14:50:38 +01:00
Tim Schumacher	093cf428a3	AK: Move memory streams from `LibCore`	2023-01-29 19:16:44 -07:00
Tim Schumacher	2470dd3bb5	AK: Move bit streams from `LibCore`	2023-01-29 19:16:44 -07:00
Tim Schumacher	ae64b68717	AK: Deprecate the old `AK::Stream` This also removes a few cases where the respective header wasn't actually required to be included.	2023-01-29 19:16:44 -07:00
Sam Atkins	bc1504c794	LibPDF: Remove declarations for non-existent methods	2023-01-27 20:33:18 +00:00
Tim Schumacher	82a152b696	LibGfx: Remove `try_` prefix from bitmap creation functions Those don't have any non-try counterpart, so we might as well just omit it.	2023-01-26 20:24:37 +00:00
Rodrigo Tobar	3588709986	LibPDF: Load Type1C fonts when found Now that our CFF parser is working we can load Type1C fonts in PDF, which are backed by a CFF stream.	2023-01-25 15:40:11 +01:00
Rodrigo Tobar	c4b45a82cd	LibPDF: Add initial CFF parsing The Compact Font Format specification (Adobe's Technical Note #5176) is used by PDF's Type1C fonts to store their data. While being similar in spirit to PS1 Type 1 Font Programs, it was designed for a more compact representation and thus space reduction (but an increase in complexity). It also shares most of the charstring encoding logic, which is why the CFF class also inherits from Type1FontProgram. This initial implementation is still lacking many details, e.g.: * It doesn't include all the built-in CFF SIDs * It doesn't support CFF-provided SIDs (defaults those glyphs to the space character) * More checks in general	2023-01-25 15:40:11 +01:00
Rodrigo Tobar	1ec4ad5eb6	LibPDF: Add name -> char code conversion in Encoding This is an operation that was already being done (sub-optimally) in PS1FontProgram, so we are replacing that. We will use this during CFF parsing too.	2023-01-25 15:40:11 +01:00
Rodrigo Tobar	c592b889bf	LibPDF: Add Reader::try_read for easier error propagation This will allow us to use TRY(reader.try_read) instead of having to verify the result of reader.remaining() before calling reader.read().	2023-01-25 15:40:11 +01:00
Rodrigo Tobar	1b90ea7d3a	LibPDF: Augment Type1FontProgram with Type2 capabilities The Type1FontProgram logic was based on the Adobe Type 1 Font Format; in particular, it implemented the CharStrings Dictionary section (charstring decoding, and most commands). In the case of Type1, these charstrings are read from a PS1 dictionary, with one entry per character in the font's charset. This has served us well for Type1 font rendering. When implementing Type1C font rendering, this wasn't enough. Type1C PDF fonts are specified in embedded CFF (Compact Font Format) streams, which also contain a charstring dictionary with an entry for each character in the font's charset. These entries can be slightly different from those in a PS1 Font Program though: depending on a flag in the CFF, the entries will be encoded either in the original charstring format from the Adobe Type 1 Font Format, or in the "Type 2 Charstring Format" (Adobe's Technical Note #5177). This new format is for the most part a super-set of the original, with small differences, all in the name of making the representation as compact as possible: * The glyph's width is not specified via a separate command; instead it's an optional additional argument to the first command of the charstring stream (and even then, it's only the difference to a nominal character width specified in the CFF). * The interpretation of a 4-byte number is different from Type 1: in Type 1 this is a 4-byte integer, whereas in Type 2 it's a fixed-point number with 16 bits of fractional part. * Many commands accept a variable set of arguments, so they can draw more than one line/curve in a single go. These are all retro-compatible with Type 1's commands. All these changes are implemented in this patch in a backwards-compatible way. To select between Type 1 and Type 2 behavior, a new parameter indicates which behavior is desired when decoding the charstring stream. I also took the chance to centralise some logic that was previously duplicated across the parse_glyph function. Common lambdas capture the logic for moving to, or drawing a line/curve to a given point and updating the glyph state. Similarly, some command logic, including reading parameters, is shared by several commands. Finally, I've re-organised the cases in the main switch to group together related commands.	2023-01-25 15:40:11 +01:00
Rodrigo Tobar	f06de0fa07	LibPDF: Remove unused member	2023-01-25 15:40:11 +01:00
Rodrigo Tobar	416585f75a	LibPDF: Add new Type1FontProgram base class We are planning to add support for CFF fonts to read Type1 fonts, and therefore much of the logic already found in PS1FontProgram will be useful for representing the Type1 fonts read from CFF. This commit moves the PS1-independent bits of PS1FontProgram into a new Type1FontProgram base class that can be used as the base for CFF-based Type1 fonts in the future. The Type1Font class uses this new type now instead of storing a PS1FontProgram pointer. While doing this refactoring I also took care of making some minor adjustments to the PS1FontProgram API, namely: * Its create() method is static and returns a NonnullRefPtr<Type1FontProgram>. * Many (all?) of the parse_* methods are now static. * Added const where possible. Notably, the Type1FontProgram also contains at the moment the code that parses the CharString data from the PS1 program. This logic is very similar in CFF files, so after some minor adjustments later on it should be possible to reuse most of it.	2023-01-25 15:40:11 +01:00
Rodrigo Tobar	e751ec2089	LibPDF: Avoid reading fields from moved-from Data object This might not be an issue at the moment, but moved-from objects are usually in an unspecified but valid state, meaning that we shouldn't read from them.	2023-01-25 15:40:11 +01:00
Rodrigo Tobar	bfeca4ebb3	LibPDF: Record base font name read from document This will be useful for debugging, or if we later on want to show all the fonts found in the document in an organised manner.	2023-01-25 15:40:11 +01:00
Tim Schumacher	b1bfeb391e	LibPDF: Use `Core::Stream` to parse the page offset hint table	2023-01-21 00:45:33 +00:00
Liav A	57e19a7e56	LibGfx: Re-structure the whole initialization pattern for image decoders When trying to figure out the correct implementation, we now have a very strong distinction on plugins that are well suited for sniffing, and plugins that need a MIME type to be chosen. Instead of having multiple calls to non-static virtual sniff methods for each Image decoding plugin, we have 2 static methods for each implementation: 1. The sniff method, which in contrast to the old method, gets a ReadonlyBytes parameter and ensures we can figure out the result with zero heap allocations for most implementations. 2. The create method, which just creates a new instance so we don't expose the constructor to everyone anymore. In addition to that, we have a new virtual method called initialize, which has a per-implementation initialization pattern to actually ensure each implementation can construct a decoder object, and then have a correct context being applied to it for the actual decoding.	2023-01-20 15:13:31 +00:00
Timothy Flynn	f3db548a3d	AK+Everywhere: Rename FlyString to DeprecatedFlyString DeprecatedFlyString relies heavily on DeprecatedString's StringImpl, so let's rename it to A) match the name of DeprecatedString, B) write a new FlyString class that is tied to String.	2023-01-09 23:00:24 +00:00
Julian Offenhäuser	2a70ea3ee7	LibPDF: Propagate errors in PDFFont::create()	2023-01-09 22:54:36 +00:00
Julian Offenhäuser	ac31b1bda3	LibPDF: Make glyphs from standard 14 fonts show up in Type1Font Previously, we would assume that all standard 14 fonts use a TrueTypeFont dictionary. Now we render them in Type1Font as well, given that it doesn't contain a PostScript font program.	2023-01-09 22:54:36 +00:00
Julian Offenhäuser	a37f3390dc	LibPDF: Allow numbers to start with whitespace	2023-01-09 22:54:36 +00:00
Rodrigo Tobar	a5620fd41f	LibPDF: Load destinations from Catalogue -> Names -> Dests name tree PDF allows for named destinations to be provided as strings. These can be either found in the Dests dictionary in the document catalogue (as already implemented), or in the Name Tree specified by the Dests key in the Names dictionary of the document catalogue (missing). This commit adds this missing case. Once the named destination is found in the name tree, its value is interpreted just like in the first case, so a new utility method encapsulates the common behavior.	2023-01-06 18:06:41 +01:00
Rodrigo Tobar	5420261347	LibPDF: Implement name tree lookups Name Trees are hierarchical, string-keyed, sorted-by-key dictionary structures in PDF where each node (except the root) specifies the bounds of the values it holds, and either its kids (more nodes) or the key/value pairs it contains. This commit implements a series of lookup calls for finding a key in such name trees. This implementation follows the tree as needed on each lookup, but if that becomes inefficient in the long run we can switch to creating a HashMap with all the contents, which as a drawback will require more memory.	2023-01-06 18:06:41 +01:00
Rodrigo Tobar	8c79f0e0cf	LibPDF: Add more utility methods to {Dict,Array}Object Being both of them containers, these classes already offered a set of methods to retrieve an inner element by key or index, respectively, with different methods for the different subtypes of the PDF::Object type returning the element cast to the correct type pointer. On top of that, DictObject offered an additional method to obtain an element as an Object pointer. While these methods were useful, they have some shortcomings: * They always take a Document pointer to first perform an object resolution, in case the element is a Reference. This is not always necessary though, as there are values that are always meant to be immediate, and hence the resolution lookup adds overhead. * There was no easy way to get an individual Object element from an ArrayObject like there is in DictObject. This makes it difficult to obtain such values, as one first needs to call dict.get() to get a Value, then cast it manually to a NonnullRefPtr<Object>. This commit fixes these two issues by: * Adding a new method that returns an Object for a given index. * Adding overloads for this new method, and all the existing methods described above, that do not take a Document, and therefore do not perform an object resolution lookup.	2023-01-06 18:06:41 +01:00
Rodrigo Tobar	0e1c858f90	LibPDF: Move casting code to its own cast_to function This functionality was previously part of the resolve_to() Document method, and thus available only when resolving objects through the Document class. There are many use cases where this casting can be used, but no resolution is needed. This commit moves this functionality into a new cast_to function, and makes the resolve_to function call it internally. With this new function in place we can now offer new versions of DictObject::get_* and ArrayObject::get_*_at that don't perform Document resolution unnecessarily when not required.	2023-01-06 18:06:41 +01:00
Rodrigo Tobar	f510b2b180	LibPDF: Support null destination parameters Destination arrays contain a page number, a mode name, and parameters specific to that mode. In many cases these parameters can be set to "null", which our code wasn't taking into consideration. This commit parses these parameters taking into account whether they are null or actual numbers, and stores them as Optional<float> instead of plain floats. The parameters are not yet used anywhere else other than when formatting a Destination object, so the change is fairly small.	2023-01-06 18:06:41 +01:00
Rodrigo Tobar	2485c500a3	LibPDF: Fix Destination formatting This was not correctly written, and thus printed confusing output.	2023-01-06 18:06:41 +01:00
MacDue	eeb6072f15	LibGfx+LibPDF: Apply subpixel offset in affine transformation	2023-01-05 13:50:26 +01:00
MacDue	91db49f7b3	LibPDF: Use subpixel accurate text rendering This just enables the new tricks from LibGfx with the same nice improvements :^)	2023-01-05 12:09:35 +01:00
Simon Danner	5fa8068580	LibPDF: Fix calculation of encryption key Before this patch, the generation of the encryption key was not working correctly because the lifetime of the underlying data was too short; the same inputs would give random encryption keys. Fixes #16668	2023-01-04 11:10:37 -05:00
Ben Wiederhake	c2a900b853	Everywhere: Remove unused includes of AK/StdLibExtras.h These instances were detected by searching for files that include AK/StdLibExtras.h, but don't match the regex: \\b(abs\|AK_REPLACED_STD_NAMESPACE\|array_size\|ceil_div\|clamp\|exchange\|for ward\|is_constant_evaluated\|is_power_of_two\|max\|min\|mix\|move\|_RawPtr\|RawP tr\|round_up_to_power_of_two\|swap\|to_underlying)\\b (Without the linebreaks.) This regex is pessimistic, so there might be more files that don't actually use any "extra stdlib" functions. In theory, one might use LibCPP to detect things like this automatically, but let's do this one step after another.	2023-01-02 20:27:20 -05:00
Ben Wiederhake	b83cb09db1	Everywhere: Fix badly-formatted includes In `7c5e30daaa`, the focus was "only" on Userland/Libraries/, whereas this commit cleans up the remaining headers in the repo, and any new badly-formatted include.	2023-01-02 11:06:15 -05:00
Andreas Kling	f982400063	LibGfx: Rename TTF/TrueType to OpenType OpenType is the backwards-compatible successor to TrueType, and the format we're actually parsing in LibGfx. So let's call it that.	2022-12-21 08:44:22 +01:00
Rodrigo Tobar	bb48a67f84	LibPDF: Reset encryption key on failed user password attempt When an attempt is made to provide the user password to a SecurityHandler a user gets back a boolean result indicating success or failure on the attempt. However, the SecurityHandler is left in a state where it thinks it has a user password, regardless of the outcome of the attempt. This confuses the rest of the system, which continues as if the provided password is correct, resulting in garbled content. This commit fixes the situation by resetting the internal fields holding the encryption key (which is used to determine whether a user password has been successfully provided) in case of a failed attempt.	2022-12-20 10:28:58 +01:00
Rodrigo Tobar	dc6a11cf6b	LibPDF: Treat Encryption's Length item as optional With the StandardSecurityHandler the Length item in the Encryption dictionary is optional, and needs to be given only if the encryption algorithm (V) is other than 1; otherwise we can assume a length of 40 bits for the encryption key.	2022-12-20 10:28:58 +01:00
Rodrigo Tobar	6df9aa8f2c	LibPDF: Store page number, not Value, in OutlineItem The Value previously stored corresponded to a Reference to a Page object in the PDF document. This isn't useful information, since what we want to display at the end of the day is the page an outline item refers to. This commit changes the page member on OutlineItem to be an Optional<u32> (some destinations don't necessarily refer to a Page), which we resolve while building OutlineItems.	2022-12-17 19:40:52 +01:00
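
Note on 1b90ea7d3a: that commit describes the four operand bytes of a charstring number being read differently in the two charstring formats, as a plain integer in Type 1 and as a fixed-point value with 16 fractional bits in Type 2. The following is a minimal sketch of that distinction only, not LibPDF's actual code. The names `CharstringKind`, `read_be_i32` and `decode_operand` are hypothetical, and the sketch assumes the four bytes are big-endian and interpreted as a signed two's-complement value.

```cpp
#include <array>
#include <cstdint>

// Hypothetical names for illustration; these are not taken from LibPDF.
enum class CharstringKind { Type1, Type2 };

// Assemble four big-endian bytes into a signed 32-bit value
// (assumption of this sketch: two's-complement interpretation).
inline std::int32_t read_be_i32(std::array<std::uint8_t, 4> const& bytes)
{
    std::uint32_t raw = (std::uint32_t(bytes[0]) << 24)
        | (std::uint32_t(bytes[1]) << 16)
        | (std::uint32_t(bytes[2]) << 8)
        | std::uint32_t(bytes[3]);
    return static_cast<std::int32_t>(raw);
}

// Decode a 4-byte charstring operand according to the charstring kind.
inline double decode_operand(std::array<std::uint8_t, 4> const& bytes, CharstringKind kind)
{
    auto value = read_be_i32(bytes);
    if (kind == CharstringKind::Type1)
        return static_cast<double>(value); // Type 1: a plain integer
    // Type 2: 16.16 fixed point, so the low 16 bits are the fractional part.
    return static_cast<double>(value) / 65536.0;
}
```

With the bytes 00 01 80 00, the Type 1 reading gives 98304, while the Type 2 reading gives 1.5. A single `kind` parameter mirrors the commit's description of selecting the behavior at decode time.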


227 commits