serenity

mirror of https://github.com/RGBCube/serenity synced 2025-07-05 17:37:35 +00:00

Author	SHA1	Message	Date
Julian Offenhäuser	3400779047	LibPDF: Pass the right point width to the font loader in TrueTypeFont	2023-03-22 09:04:00 +01:00
Julian Offenhäuser	fd78875662	LibPDF: Fix navigate_to_before_eof_marker() for PDFs not ending in EOL The way this was factored before, we would miss the %%EOF marker if it didn't have a valid end-of-line sequence after it.	2023-03-22 09:04:00 +01:00
Julian Offenhäuser	fca9da4191	LibPDF: Don't consume anything other than EOL in Reader::consume_eol() This was previously a slightly confusing API. Even when there was no EOL marker at the current location, we would still consume one byte. It will now consume either EOL or nothing at all.	2023-03-22 09:04:00 +01:00
Julian Offenhäuser	93062e2b78	LibPDF: Be more cautious of errors when looking for linearization dict We would previously assume that, following the header, there must be a valid PDF object that could be a linearization dict. However, if the file is not linearized, this is not necessarily true. We now try to detect if there even is an object, and don't treat parsing errors as fatal.	2023-03-22 09:04:00 +01:00
Julian Offenhäuser	6c0f7d83bb	LibPDF: Don't treat a broken document header as a fatal error As the current goal is to make our best effort loading documents, we might as well ignore a broken header and power through, giving the user a warning.	2023-03-22 09:04:00 +01:00
Lucas CHOLLET	496b7ffb2b	LibGfx: Move all image loaders and writers to a subdirectory	2023-03-21 22:39:25 +01:00
Andreas Kling	689ca370d4	Everywhere: Remove NonnullRefPtr.h includes	2023-03-06 23:46:35 +01:00
Andreas Kling	8a48246ed1	Everywhere: Stop using NonnullRefPtrVector This class had slightly confusing semantics and the added weirdness doesn't seem worth it just so we can say "." instead of "->" when iterating over a vector of NNRPs. This patch replaces NonnullRefPtrVector<T> with Vector<NNRP<T>>.	2023-03-06 23:46:35 +01:00
Rodrigo Tobar	4a20751ff6	LibPDF: Detect CFF encodings with supplements These are not yet actually parsed, but detecting them means we at least don't fail to understand the actual format value, which was causing some CFF fonts to fail to load.	2023-03-02 12:18:53 +01:00
Rodrigo Tobar	9bca62c5fa	LibPDF: Increase argument stack for Type1FontPrograms Type1 imposes a stack limit of 24 elements, but Type2 has a limit of 48. We are better off relaxing the limit of the former in favour of properly supporting the latter.	2023-03-02 12:18:53 +01:00
Rodrigo Tobar	de5e7b487c	LibPDF: Improve Type2 hint counting There were two issues with how we counted hints with Type2 CharString commands: the first was that we assumed a single hint per command, even though there are commands that accept multiple hints thanks to taking a variable number of operands; and secondly, the hintmask/ctrlmask commands can also take operands (i.e., hints) themselves in certain situations. This commit fixes these two issues by correctly counting hints in both cases. This in turn fixes cases when there were more than 8 hints in total, therefore a hintmask/ctrlmask command needed to read more than one byte past the operator itself.	2023-03-02 12:18:53 +01:00
Rodrigo Tobar	bf61f94413	LibPDF: Don't crash when a font hasn't been loaded yet This could happen because there was a problem while loading the first font in the document.	2023-03-02 12:18:53 +01:00
Rodrigo Tobar	79b4293687	LibPDF: Prevent crashes when loading XObject streams These streams might need a Filter that isn't implemented yet, and thus cannot be blindly MUST()-ed.	2023-03-02 12:18:53 +01:00
Rodrigo Tobar	2a8e0da71c	LibPDF: Improve error support for Filter class The Filter class had a few TODO()s that resulted in crashes at runtime. Since we now have a better way to report errors back to the user let's use that instead.	2023-03-02 12:18:53 +01:00
MacDue	6cf8eeb7a4	LibGfx: Return bool not ErrorOr<bool> from ImageDecoderPlugin::sniff() Nobody made use of the ErrorOr return value and it just added more chance of confusion, since it was not clear if failing to sniff an image should return an error or false. The answer was false, if you returned Error you'd crash the ImageDecoder.	2023-02-26 19:43:17 +01:00
Rodrigo Tobar	cb04e4e9da	LibPDF: Refactor Font classes The PDFFont class hierarchy was very simple (a top-level PDFFont class, followed by all the children classes that derived directly from it). While this design was good enough for some things, it didn't correctly model the actual organization of font types: PDF fonts are first divided between "simple" and "composite" fonts. The latter is the Type0 font, while the rest are all simple. * PDF fonts yield a glyph per "character code". Simple fonts char codes are always 1 byte long, while Type0 char codes are of variable size. To this effect, this commit changes the hierarchy of Font classes, introducing a new SimpleFont class, deriving from PDFFont, and acting as the parent of Type1Font and TrueTypeFont, while Type0 still derives from PDFFont directly. This distinction allows us now to: * Model string rendering differently from simple and composite fonts: PDFFont now offers a generic draw_string method that takes a whole string to be rendered instead of a single char code. SimpleFont implements this as a loop over individual bytes of the string, with T1 and TT implementing draw_glyph for drawing a single char code. * Some common fields between T1 and TT fonts now live under SimpleFont instead of under PDFfont, where they previously resided. * Some other interfaces specific to SimpleFont have been cleaned up, with u16/u32 not appearing on these classes (or in PDFFont) anymore. * Type0Font's rendering still remains unimplemented. As part of this exercise I also took the chance to perform the following cleanups and restructurings: * Refactored the creation and initialisation of fonts. They are all centrally created at PDFFont::create, with a virtual "initialize" method that allows them to initialise their inner members in the correct order (parent first, child later) after creation. * Removed duplicated code. * Cleaned up some public interfaces: receive const refs, removed unnecessary ctro/dtors, etc. * Slightly changed how Type1 and TrueType fonts are implemented: if there's an embedded font that takes priority, otherwise we always look for a replacement. * This means we don't do anything special for the standard fonts. The only behavior previously associated to standard fonts was choosing an encoding, and even that was under questioning.	2023-02-24 20:16:50 +01:00
Rodrigo Tobar	e8f1a2ef02	LibPDF: Add new error construction functions These should make it easier to create specific errors, specially when wanting to create a formatted message.	2023-02-24 20:16:50 +01:00
Rodrigo Tobar	db9fa7ff07	LibPDF: Allow show_text to return errors Errors can (and do) occur when trying to render text, and so far we've silently ignored them, making us think that all is well when it isn't. Letting show_text return errors will allow us to inform the user about these errors instead of having to hiding them.	2023-02-24 20:16:50 +01:00
Andreas Kling	39a1702c99	LibPDF: Make Object::cast<T>() non-const This was only ever used to cast non-const objects to other non-const object types.	2023-02-21 00:54:04 +01:00
Sam Atkins	2db168acc1	LibTextCodec+Everywhere: Port Decoders to new Strings	2023-02-19 17:15:47 +01:00
Lucas CHOLLET	856d0202f2	LibGfx: Rename `JPGLoader` to `JPEGLoader` The patch also contains modifications on several classes, functions or files that are related to the `JPGLoader`. Renaming include: - JPGLoader{.h, .cpp} - JPGImageDecoderPlugin - JPGLoadingContext - JPG_DEBUG - decode_jpg - FuzzJPGLoader.cpp - Few string literals or texts	2023-02-18 23:56:24 +01:00
Sam Atkins	d6075ef5b5	LibTextCodec+Everywhere: Make TextCodec::decoder_for() take a StringView We don't need a full String/DeprecatedString inside this function, so we might as well not force users to create one.	2023-02-15 12:48:26 -05:00
Rodrigo Tobar	c4507bb56e	LibPDF: Add more built-in SIDs The first iteration has enough SIDs to display simple documents, but when trying more and more documents we started to need more of these SIDs to be properly defined. This is a copy/paste exercise from the CFF document, which is tedious, so it will continue in small drops. This commit fills all the gaps until SID 228, which covers all the ISOAdobe space, and should be enough for most use cases. Since this is a continuous space starting at 0, we now use an Array instead of a Map to store these names, which should be more performant. Also to simplify things I've moved the Array out of the CFF class, making it a simpler static variable, which allows us to use template type deduction.	2023-02-13 00:23:17 +00:00
Julian Offenhäuser	1f27c47973	LibPDF: Check for end of stream in Reader::matches_regular_character() The way this was set up before, this function would return "true" if the underlying stream had ended, which would cause us to try to read past the end in some edge cases.	2023-02-12 10:55:37 +00:00
Julian Offenhäuser	a2b57dd188	LibPDF: Return an error if we fail to load a replacement font	2023-02-12 10:55:37 +00:00
Julian Offenhäuser	96064ec5af	LibPDF: Allow filter DecodeParms array entries to be null Filters will use the default values in this case.	2023-02-12 10:55:37 +00:00
Julian Offenhäuser	34350ee9e7	LibPDF: Allow reading documents with incremental updates The PDF spec allows incremental changes of a document by appending a new XRef table and file trailer to it. These will only contain the changed objects and will point back to the previous change, forming an arbitrarily long chain of XRef sections and file trailers. Every one of those XRef sections may be encoded as an XRef stream as well, in which case the trailer is part of the stream dictionary as usual. To make this easier, I made it so every XRef table may "own" a trailer. This means that the main file trailer is now part of the main XRef table.	2023-02-12 10:55:37 +00:00
Julian Offenhäuser	4f4bd3793f	LibPDF: Fix glyph sizing bug that caused incorrect spacing When loading OpenType fonts, either as a replacement for the standard 14 fonts or an embedded one, we previously passed the font size as the _point_ size to the loader class. The difference is quite subtle, being that Gfx::ScaledFont uses the optional dpi parameter to convert the input from inches to pixels. This meant that our glyphs were exactly 1.333% too large, causing them to overlap in places.	2023-02-10 15:37:51 +01:00
Julian Offenhäuser	152a8c5c43	LibPDF: Use more appropriate standard 14 replacement fonts The mapping of standard font to replacement now looks like this: Times New Roman -> Liberation Serif Courier -> Liberation Mono Helvetica, Arial -> Liberation Sans	2023-02-10 15:37:51 +01:00
MacDue	63b11030f0	Everywhere: Use ReadonlySpan<T> instead of Span<T const>	2023-02-08 19:15:45 +00:00
Rodrigo Tobar	e4a7606b81	LibPDF: Construct accented characters with Type1 seac command The seac command provides the base and accented character that are needed to create an accented character glyph. Storing these values is all that was left to properly support these composed glyphs.	2023-02-08 19:47:15 +01:00
Rodrigo Tobar	3eaa27f53a	LibPDF: Add infrastructure for accented character glyphs Type1 accented character glyphs are composed of two other glyphs in the same font: a base glyph and an accent glyph, given as char codes in the standard encoding. These two glyphs are then composed together to form the accented character. This commit adds the data structures to hold the information for accented characters, and also the routine that composes the final glyph path out of the two individual components. All glyphs must have been loaded by the time this composition takes place, and thus a new protected consolidate_glyphs() routine has been added to perform this calculation.	2023-02-08 19:47:15 +01:00
Rodrigo Tobar	11a9bfd4b6	LibPDF: Turn Glyph into a class Glyph was a simple structure, but even now it's become more complex that it was initially. Turning it into a class hides some of that complexity, and make sit easier to understand to external eyes. While doing this I also decided to remove the float + bool combo for keeping track of the glyph's width, and replaced it with an Optional instead.	2023-02-08 19:47:15 +01:00
Rodrigo Tobar	c084943457	LibPDF: Index Type1 glyphs by name, not char code Storing glyphs indexed by char code in a Type1 Font Program binds a Font Program instance to the particular Encoding that was used at Font Program construction time. This makes it difficult to reuse Font Program instances against different Encodings, which would be otherwise possible. This commit changes how we store the glyphs on Type1 Font Programs. Instead of storing them on a map indexed by char code, the map is now indexed by glyph name. In turn, when rendering a glyph we use the Encoding object to turn the char code into a glyph name, which in turn is used to index into the map of glyphs. This is the first step towards reusability of Type1 Font Programs. It also unlocks the ability to render glyphs that are described via the "seac" command (standard encoding accented character), which requires accessing the base and accent glyphs by name.	2023-02-08 19:47:15 +01:00
Rodrigo Tobar	596119cf3e	LibPDF: Add placeholders for *flex Type2 commands These should be implemented properly in the future, but for now we are adding the as placeholders to avoid crashes.	2023-02-08 19:47:15 +01:00
Rodrigo Tobar	64bbe431b5	LibPDF: Add char_code -> name mapping function We already keep both mappings internally, now it's time to actually use it.	2023-02-08 19:47:15 +01:00
Rodrigo Tobar	82bd854d6f	LibPDF: Account for other endings of PS1 Encoding array	2023-02-08 19:47:15 +01:00
Rodrigo Tobar	a533ea7ae6	LibPDF: Improve stream parsing When parsing streams we rely on a /Length item being defined in the stream's dictionary to know how much data comprises the stream. Its value is usually a direct value, but it can be indirect. There was however a contradiction in the code: the condition that allowed it to read and use the /Length value required it to be a direct value, but the actual code using the value would have worked with indirect ones. This meant that indirect /Length values triggered the fallback, "manual" stream parsing code. On the other hand, this latter code was also buggy, because it relied on the "endstream" keyword to appear on a separate line, which isn't always the case. This commit both fixes the bug in the manual stream parsing scenario, while also allowing for indirect /Length values to be used to parse streams more directly and avoid the manual approach. The main caveat to this second change is that for a brief period of time the Document is not able to resolve references (i.e., before the xref table itself is not parsed). Any parsing happening before that (e..g, the linearization dictionary) must therefore use the manual stream parsing approach.	2023-02-08 19:47:15 +01:00
Tim Schumacher	220fbcaa7e	AK: Remove the fallible constructor from `FixedMemoryStream`	2023-02-08 17:44:32 +00:00
Tim Schumacher	261d62438f	AK: Remove the fallible constructor from `LittleEndianInputBitStream`	2023-02-08 17:44:32 +00:00
Rodrigo Tobar	82bac7e665	LibPDF: Fix clipping of painting operations While the clipping logic was correct (current v/s new clipping path), the clipping path contents weren't. This commit fixed that. We calculate the clipping path in two places: when we set it to be the whole page at graphics state creation time, and when we perform clipping path intersection to calculate a new clipping path. The clipping path is then used to limit painting by passing it to the painter (more precisely, but passing its bounding box to the painter, as the latter doesn't support arbitrary path clipping). For this last point the clipping path must be in device coordinates. There was however a mix of coordinate systems involved in the creation, update and usage of the clipping path: * The initial values of the path (i.e., the whole page) were in user coordinates. * Clipping path intersection was performed against m_current_path, which is in device coordinates. * To perform the clipping operation, the current clipping path was assumed to be in user coordinates. This mix resulted in the clipping not working correctly depending on the zoom level at which one visualised a page. This commit fixes the issue by always keeping track of the clipping path in device coordinates. This means that the initial full-page contents are now converted to device coordinates before putting them in the graphics state, and that no mapping is performed when applied the clipping to the painter.	2023-02-04 12:29:57 +01:00
Rodrigo Tobar	286e3e6872	LibPDF: Simplify Encoding to align with simple font requirements All "Simple Fonts" in PDF (all but Type0 fonts) have the property that glyphs are selected with single byte character codes. This means that the Encoding objects should use u8 for representing these character codes. Moreover, and as mentioned in a previous commit, there is no need to store the unicode code point associated with a character (which was in turn wrongly associated to a glyph). This commit greatly simplifies the Encoding class. Namely it: * Removes the unnecessary CharDescriptor class. * Changes the internal maps to be u8 -> FlyString and vice-versa, effectively providing two-way lookups. * Adds a new method to set a two-way u8 -> FlyString mapping and uses it in all possible places. * Simplified the creation of Encoding objects. * Changes how the WinAnsi special treatment for bullet points is implemented.	2023-02-02 14:50:38 +01:00
Rodrigo Tobar	fb0c3a9e18	LibPDF: Stop calculating code points for glyphs When rendering text, a sequence of bytes corresponds to a glyph, but not necessarily to a character. This misunderstanding permeated through the Encoding through to the Font classes, which were all trying to calculate such values. Moreover, this was done only to identify "space" characters/glyphs, which were getting a special treatment (e.g., avoid rendering). Spaces are not special though -- there might be fonts that render something for them -- and thus should not be skipped	2023-02-02 14:50:38 +01:00
Rodrigo Tobar	7c42d6c737	LibPDF: Fix ZapfDingbat's char codes The initial values were fine, but those starting at 100 were wrong: they are all octal values, but since they were missing an initial 0 they were interpreted as decimals.	2023-02-02 14:50:38 +01:00
Rodrigo Tobar	2f773b3c5c	LibPDF: Stop storing unicode code points in Encoding In PDF's fonts, encoding objects are used to translate bytes into fonts' glyphs. Glyphs (in the fonts we currently support) organise their glyphs in such a way that they are accessed by name, and thus encoding translate between a byte sequence and a glyph name. Note that an no point this translation includes a Unicode character, and therefore assigning a character to a glyph in the Encoding object is the wrong thing to do. Moreover, using the code point for this character during the byte-sequence-to-glyph translation sequence is double-wrong. This commit removes the characters associated to each translation in the built-in Encoding objects. In order to keep commits short and sweet, I'm currently simply removing the character from the enumeration, leaving the old structure this information was held on intact. Instead, I'm filling the "code_point" member with a zero, and filling both mappings (which will be changed later on too) with the glyph name and the associated char code.	2023-02-02 14:50:38 +01:00
Tim Schumacher	093cf428a3	AK: Move memory streams from `LibCore`	2023-01-29 19:16:44 -07:00
Tim Schumacher	2470dd3bb5	AK: Move bit streams from `LibCore`	2023-01-29 19:16:44 -07:00
Tim Schumacher	ae64b68717	AK: Deprecate the old `AK::Stream` This also removes a few cases where the respective header wasn't actually required to be included.	2023-01-29 19:16:44 -07:00
Sam Atkins	bc1504c794	LibPDF: Remove declarations for non-existent methods	2023-01-27 20:33:18 +00:00
Tim Schumacher	82a152b696	LibGfx: Remove `try_` prefix from bitmap creation functions Those don't have any non-try counterpart, so we might as well just omit it.	2023-01-26 20:24:37 +00:00

1 2 3 4 5 ...

257 commits