serenity

mirror of https://github.com/RGBCube/serenity synced 2025-10-21 10:42:33 +00:00

Author	SHA1	Message	Date
Rodrigo Tobar	a1af79dca6	LibPDF: Follow a FontFile's Length values These can be references (at least from what I've found in some documents), so we want to resolve them before using them.	2022-12-16 01:24:43 -07:00
Rodrigo Tobar	cb1a7cc721	LibPDF: Simplify outline construction While the Outline Items making up the document's Outline have all sorts of cross-references (parent, first/last child, next/previous sibling, etc), not all documents out there have fully-consistent references. Our implementation already discarded some of that information too (e.g., /Parent and /Prev were never read), and trusted that /First and /Next were good enough to traverse the whole hierarchy. Where the current implementation failed was in assuming that /Last was also a good source of information. There are documents out there where /Last also points to dead ends, and were therefore causing a crash when we verified that the last child found on a chain was the /Last child declared by the parent. To fix this I'm simply removing the check, and simplifying the function call to remove any references to /Last. This way we affirm our commitment to /First and /Next as the main sources of information.	2022-12-16 01:24:43 -07:00
Rodrigo Tobar	41bd304a7f	LibPDF: Ignore seac PS1 commands for now This command is meant to print a Standard Encoding Accented Character. It's not critical to implement it yet, but if we want to render more documents we need to handle the instruction, even if we simply ignore it.	2022-12-16 01:24:43 -07:00
Ali Mohammad Pur	f96a3c002a	Everywhere: Stop shoving things into ::std and mentioning them as such Note that this still keeps the old behaviour of putting things in std by default on serenity so the tools can be happy, but if USING_AK_GLOBALLY is unset, AK behaves like a good citizen and doesn't try to put things in the ::std namespace. std::nothrow_t and its friends get to stay because I'm being told that compilers assume things about them and I can't yeet them into a different namespace...for now.	2022-12-14 11:44:32 +01:00
Rodrigo Tobar	adc45635e9	LibPDF: Add initial image display support After adding support for XObject Form rendering, the next step was to display XObject images. This commit adds this initial support. Images come in many shapes and forms: encodings, color spaces, bits per component, width, height, etc. This initial support is constrained to the color spaces we currently support, to images that use 8 bits per component, to images that do not use the JPXDecode filter, and that are not Masks. There are surely other constraints that aren't considered in this initial support, so expect breakage here and there. In addition to supporting images, we also support applying an alpha mask (SMask) on them. Additionally, a new rendering preference allows skipping image loading and rendering altogether, instead showing an empty rectangle as a placeholder (useful for when actual images are not supported). Since RenderingPreferences is becoming a bit more complex, we add a hash option that will allow us to keep track of different preferences (e.g., in a HashMap).	2022-12-10 10:49:03 +01:00
Rodrigo Tobar	2331fe5e68	LibPDF: Add first interpolation methods Interpolation is needed in more than one place, and I couldn't find a central place where I could borrow a readily available interpolation routine, so I've implemented the first simple interpolation object. More will follow for more complex scenarios.	2022-12-10 10:49:03 +01:00
Rodrigo Tobar	17676705a5	LibPDF: Add facility to obtain Vector<float> from ArrayObject Arrays of float numbers are common in many PDF objects, and thus to avoid code repetition I'm introducing a new method to ArrayObject that will return exactly that.	2022-12-10 10:49:03 +01:00
Rodrigo Tobar	a63b93f724	LibPDF: Add new Error::Type for unsupported rendering features	2022-12-10 10:49:03 +01:00
Rodrigo Tobar	26f8c0b76c	LibPDF: Add more knowledge to ColorSpaces classes ColorSpaces now can tell users how many components they expect, and the default decode array that should be used when converting unit bit sequences into color space component input values during image rendering.	2022-12-10 10:49:03 +01:00
Rodrigo Tobar	ba16310739	LibPDF: Refactor parsing of ColorSpaces ColorSpaces can be specified in two ways: with a stream as operands of the color space operations (CS/cs), or as a separate PDF object, which is then referred to by other means (e.g., from Image XObjects and other entities). These two modes of addressing a ColorSpace are slightly different and need to be addressed separately. However, the current implementation embedded the full logic of the first case in the routine that created ColorSpace objects. This commit refactors the creation of ColorSpace to support both cases. First, a new ColorSpaceFamily class encapsulates the static aspects of a family, like its name or whether color space construction never requires parameters. Then we define the supported ColorSpaceFamily objects. On top of this also sits a split in how ColorSpaces are created. Two methods are now offered: one only providing construction of no-argument color spaces (and thus taking a simple name), and another taking an ArrayObject, hence used to create ColorSpaces requiring arguments. Finally, on top of that, two ways to get a color space in the Renderer are made available: the first creates a ColorSpace with a name and a Resources dictionary, and another takes an Object. These model the two addressing modes described above.	2022-12-10 10:49:03 +01:00
Rodrigo Tobar	287bb0feac	LibPDF: Return results directly and avoid unpacking+packing	2022-12-10 10:49:03 +01:00
Andreas Kling	d6a3be1615	LibPDF: Add missing character quirk for WinAnsiEncoding fonts Fonts with the encoding name "WinAnsiEncoding" should render missing characters above character code 040 (octal) as a "bullet" character. This patch adds Encoding::should_map_to_bullet(char_code) which is then called by char_code_to_code_point() to check if the given char code should be displayed as a bullet instead. I didn't have a good way to test this, so I've only verified that it works by manually overriding inputs to the function during the rendering stage. This takes care of a FIXME in the Annex D part of the PDF specification.	2022-12-08 09:54:20 +01:00
MacDue	7be0b27dd3	Meta+Userland: Pass Gfx::IntPoint by value This is just two ints or 8 bytes or the size of the reference on x86_64 or AArch64.	2022-12-07 11:48:27 +01:00
Linus Groh	57dc179b1f	Everywhere: Rename to_{string => deprecated_string}() where applicable This will make it easier to support both string types at the same time while we convert code, and tracking down remaining uses. One big exception is Value::to_string() in LibJS, where the name is dictated by the ToString AO.	2022-12-06 08:54:33 +01:00
Linus Groh	6e19ab2bbc	AK+Everywhere: Rename String to DeprecatedString We have a new, improved string type coming up in AK (OOM aware, no null state), and while it's going to use UTF-8, the name UTF8String is a mouthful - so let's free up the String name by renaming the existing class. Making the old one have an annoying name will hopefully also help with quick adoption :^)	2022-12-06 08:54:33 +01:00
Linus Groh	d26aabff04	Everywhere: Run clang-format	2022-12-03 23:52:23 +00:00
Rodrigo Tobar	cb3e05f476	LibPDF: Add initial implementation of XObject rendering This implementation currently handles Form XObjects only, skipping image XObjects. When rendering an XObject, its resources are passed to the underlying operations so they use those instead of the Page's.	2022-11-30 14:51:14 +01:00
Rodrigo Tobar	b3007c17bd	LibPDF: Allow operators to receive optional resources Operators usually assume that the resources their operations will require will be the Page's. This assumption breaks however when XObjects with their own resources come into the picture (and maybe other cases too). In that case, the XObject's resources take precedence, but they should also contain the Page's resources. Because of this, one can safely use the XObject resources alone when given, and default to the Page's if not. This commit adds to all operator calls an extra argument with optional resources, which will be fed by XObjects as necessary.	2022-11-30 14:51:14 +01:00
Rodrigo Tobar	e58165ed7a	LibPDF: Render cubic bezier curves The implementation of bezier curves already exists on Gfx, so implementing the PDF rendering of this command is trivial.	2022-11-30 14:51:14 +01:00
Rodrigo Tobar	fe5c823989	LibPDF: Communicate resources to ColorSpace, not Page Resources can come from other sources (e.g., XObjects), and since the only attribute we are reading from Page are its resources it makes sense to receive resources instead. That way we'll be able to pass down arbitrary resources that are not necessarily declared at the page level.	2022-11-30 14:51:14 +01:00
Rodrigo Tobar	164422f8d8	LibPDF: Add further common names	2022-11-30 14:51:14 +01:00
Rodrigo Tobar	5277ad1d6d	LibPDF: Implement Run Length Decoding This is a simple decoding process that is needed by some streams.	2022-11-30 14:51:14 +01:00
Rodrigo Tobar	e776048309	LibPDF: Ignore whitespace on hex strings The spec says that whitespace should be ignored, but we weren't ignoring it. PDFs with whitespace in their hex strings were thus crashing the parser.	2022-11-30 14:51:14 +01:00
Rodrigo Tobar	d04613d252	LibPDF: Fix path coordinates calculation Paths rendering was buggy because the map() function that translates points from user space to bitmap space applied the vertical flip conversion that the current transformation matrix already considers; hence, all paths were upside down. The only exception was the "re" instruction, which manually adjusted the Y coordinate of its points to be flipped again (and had a FIXME saying that this should be unnecessary). This commit fixes the map() function that maps userspace points to bitmap coordinates. The "re" operator implementation has also been simplified by creating a rectangle first and mapping that instead of mapping each point individually.	2022-11-26 08:56:35 +01:00
Rodrigo Tobar	e92ec26771	LibPDF: Introduce rendering preferences and show clipping paths A new struct allows users to specify specific rendering preferences that the Renderer class might use to paint some Document elements onto the target bitmap. The first toggle allows rendering (or not) the clipping paths on a page, which is useful for debugging.	2022-11-25 23:03:24 +01:00
Rodrigo Tobar	a1e36e8f78	LibPDF: Improve path clipping support The existing path clipping support was broken, as it performed the clipping operation as soon as the path clipping commands (W/W*) were received. The correct behavior is to keep a clipping path in the graphic state, intersect that with the current path upon receiving W/W*, and apply the clipping when performing painting operations. On top of that, the intersection happening at W/W* time does not affect the painting operation happening on the current on-build path, but takes effect only after the current path is cleared; therefore a current and a next clipping path need to be kept track of. Path clipping is not yet supported on the Painter class, nor is path intersection. We thus continue using the same simplified bounding box approach to calculate clipping paths. Since now we are dealing with more rectangles-as-path code, I've made helper functions to build a rectangle path and reuse it as needed.	2022-11-25 23:03:24 +01:00
Julian Offenhäuser	d1bc89e30b	LibPDF: Try to repair XRef tables with broken indices An XRef table usually starts with an object number of zero. While it could technically start at any other number, this is a tell-tale sign of a broken table. For the "broken" documents I encountered, this always meant that some objects must have been removed from the start of the table, without updating the following indices. When this is the case, the document is not able to be read normally. However, most other PDF parsers seem to know of this quirk and fix the XRef table automatically. Likewise, we now check for this exact case, and if it matches up with what we expect, we update the XRef table such that all object numbers match the actual objects found in the file again.	2022-11-25 22:44:47 +01:00
Julian Offenhäuser	e06a065594	LibPDF: Override Type 1 character mappings by encoding in font dict If the font dictionary includes an "Encoding" entry, it will be used instead of the PS1FontProgram's built-in encoding.	2022-11-25 22:44:47 +01:00
Julian Offenhäuser	65ff80e8a5	LibPDF: Add alternative names to is_standard_latin_font() helper	2022-11-25 22:44:47 +01:00
Julian Offenhäuser	9cb3b23377	LibPDF: Move all font handling to Type1Font and TrueTypeFont classes It was previously the job of the renderer to create fonts, load replacements for the standard 14 fonts and to pass the font size back to the PDFFont when asking for glyph widths. Now, the renderer tells the font its size at creation, as it doesn't change throughout the life of the font. The PDFFont itself is now responsible for deciding whether or not it needs to use a replacement font, which still is Liberation Serif for now. This means that we can now render embedded TrueType fonts as well :^) It also makes the renderer's job much simpler and leads to a much cleaner API design.	2022-11-25 22:44:47 +01:00
Julian Offenhäuser	e748a94f80	LibPDF: Introduce loading of common font data in PDFFont base class This font data is shared between Type 1 and TrueType fonts, which is why we can now load it in the base class that they both use.	2022-11-25 22:44:47 +01:00
Julian Offenhäuser	dd82a026f8	LibPDF: Pass PDFFont::draw_glyph() a char code instead of a code point We would previously pass this function a unicode code point, which is not actually what we want here. Instead, we want the "raw" code point, with the font itself deciding whether or not it needs to be re-mapped. This same mistake in terminology applied to PS1FontProgram.	2022-11-25 22:44:47 +01:00
Julian Offenhäuser	8532ca1b57	LibPDF: Convert dash pattern array elements to integers if necessary They may be floats instead.	2022-11-25 22:44:47 +01:00
Julian Offenhäuser	0bc3333740	LibPDF: Parse integer numbers with atoi() instead of strtof() strtof() produces rounding errors for very large numbers, which we don't want for integers, as they may have to be precise.	2022-11-19 15:42:08 +01:00
Julian Offenhäuser	c2ad29c85f	LibPDF: Implement png predictor decoding for flate filter For flate and lzw filters, the data can be transformed by this predictor function to make it compress better. For us this means that we have to undo this step in order to get the right result. Although this feature is meant for images, I found at least a few documents that use it all over the place, making this step very important.	2022-11-19 15:42:08 +01:00
Julian Offenhäuser	4bd79338e8	LibPDF: Fix off-by-one error in Reader::remaining()	2022-11-19 15:42:08 +01:00
Julian Offenhäuser	4b1a72ff7a	LibPDF: Fix loop condition in parse_xref_stream() We previously compared two unrelated values to determine if we parsed the xref table to completion. We now check if we added every subsection instead, and double check to make sure we never read past the end.	2022-11-19 15:42:08 +01:00
Julian Offenhäuser	a17a23a3f0	LibPDF: Make some variable names in parse_xref_stream() more clear I found these to be a bit misleading.	2022-11-19 15:42:08 +01:00
Julian Offenhäuser	f926dfe36b	LibPDF: Implement the DCT filter This filter basically tells us that we are dealing with a JPEG. Note that by serializing the resulting image we assume that this filter is the last one in the chain; anything else would be highly unlikely.	2022-11-19 15:42:08 +01:00
Julian Offenhäuser	baaf42360e	LibPDF: Derive alternate ICC color space from the number of components We currently don't support ICC color spaces and fall back to a "simple" one instead. If no alternative is specified however, we are allowed to pick the closest match based on the number of color components.	2022-11-19 15:42:08 +01:00
Julian Offenhäuser	16ed407c01	LibPDF: Support cascading stream filters You can specify multiple filters as an array, where each one is fed the output of the one before it.	2022-11-19 15:42:08 +01:00
Julian Offenhäuser	becd648a78	LibPDF: Parse hexadecimal values in name objects correctly	2022-11-19 15:42:08 +01:00
Julian Offenhäuser	7c4f5b58be	LibPDF: Use Gfx::PathRasterizer for Adobe Type 1 font rendering This gives much better visual results than painting the path directly. It also has the nice side effect that Type 1 fonts will now look much more similar to TrueType fonts, which use the same class :^) In addition, we can now cache glyph bitmaps for repeated use.	2022-11-19 11:04:34 +01:00
Tim Schumacher	ce2f1b845f	Everywhere: Mark dependencies of most targets as PRIVATE Otherwise, we end up propagating those dependencies into targets that link against that library, which creates unnecessary link-time dependencies. Also included are changes to re-add now missing dependencies to tools that actually need them.	2022-11-01 14:49:09 +00:00
Tim Schumacher	7834e26ddb	Everywhere: Explicitly link all binaries against the LibC target Even though the toolchain implicitly links against -lc, it does not know where it should get LibC from except for the sysroot. In the case of Clang this causes it to pick up the LibC stub instead, which might be slightly outdated and feature missing symbols. This is currently not an issue that manifests because we pass through the dependency on LibC and other libraries by accident, which causes CMake to link against the LibC target (instead of just the library), and thus points the linker at the build output directory. Since we are looking to fix that in the upcoming commits, let's make sure that everything will still be able to find the proper LibC first.	2022-11-01 14:49:09 +00:00
Julian Offenhäuser	b14f0950a5	LibPDF: Add very basic support for Adobe Type 1 font rendering Previously we would draw all text, no matter what font type, as Liberation Serif, which results in things like ugly character spacing. We now have partial support for drawing Type 1 glyphs, which are part of a PostScript font program. We completely ignore hinting for now, which results in ugly looking characters at low resolutions, but gain support for a large number of typefaces, including most of the default fonts used in TeX.	2022-10-16 17:44:54 +02:00
Julian Offenhäuser	e6f29302a7	LibPDF: Add glyph drawing and type info methods to PDFFont A PDFFont can now be asked for its specific type and whether it is part of the standard 14 fonts. It now also contains a method to draw a glyph, which is stubbed-out for now. This will be useful for the renderer to take into consideration when drawing text, since we don't include replacements for the standard set of fonts yet, but still want to make use of embedded fonts when available.	2022-10-16 17:44:54 +02:00
Julian Offenhäuser	36f83cecab	LibPDF: Allow page objects to inherit the MediaBox and Resources entries	2022-10-16 17:44:54 +02:00
Julian Offenhäuser	2f71e0f09a	LibPDF: Allow text operator sequences to start with whitespace	2022-10-16 17:44:54 +02:00
Julian Offenhäuser	7ecd420b03	LibPDF: Parse floating point numbers that omit a leading zero correctly	2022-10-16 17:44:54 +02:00
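
The run-length decoding mentioned in commit 5277ad1d6d can be illustrated with a minimal sketch. This is not LibPDF's implementation: the function name `run_length_decode` and its use of `std::vector` are assumptions made for illustration only. It follows the RunLengthDecode rules as described in the PDF specification: a length byte of 0 to 127 means "copy the next length + 1 bytes literally", 129 to 255 means "repeat the next byte 257 - length times", and 128 marks the end of the data.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not LibPDF's API): decodes a RunLengthDecode stream.
std::vector<std::uint8_t> run_length_decode(const std::vector<std::uint8_t>& input)
{
    std::vector<std::uint8_t> output;
    std::size_t i = 0;
    while (i < input.size()) {
        std::uint8_t length = input[i++];
        if (length == 128)
            break; // End-of-data marker: stop decoding.
        if (length < 128) {
            // Literal run: copy the next (length + 1) bytes as-is,
            // stopping early if the input is truncated.
            std::size_t count = static_cast<std::size_t>(length) + 1;
            for (std::size_t j = 0; j < count && i < input.size(); ++j)
                output.push_back(input[i++]);
        } else {
            // Repeated run: emit the next byte (257 - length) times.
            if (i >= input.size())
                break; // Truncated input: no byte left to repeat.
            std::size_t count = 257 - static_cast<std::size_t>(length);
            output.insert(output.end(), count, input[i++]);
        }
    }
    return output;
}
```

For example, the bytes `2 'a' 'b' 'c' 254 'x' 128` decode to `abcxxx`: a literal run of three bytes followed by `'x'` repeated 257 - 254 = 3 times. The sketch tolerates truncated input by stopping early rather than reporting an error, which a real decoder may well want to handle differently.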

622 commits