serenity

mirror of https://github.com/RGBCube/serenity synced 2025-05-31 10:18:11 +00:00

Author	SHA1	Message	Date
Lucas CHOLLET	1e8004734f	LibPDF: Don't consider the End of Data code as normal ASCII85 input Data encoded with ASCII85 is terminated with the EOD code 0x7E3E. This should not be considered as normal input but rather discarded.	2023-11-14 10:15:15 +01:00
Lucas CHOLLET	59a6d4b7bc	LibPDF: Factorize duplicated code in `Filter::decode_ascii85()`	2023-11-14 10:15:15 +01:00
Lucas CHOLLET	2fe0647c68	LibPDF: Handle pdf-specific white spaces correctly in ASCII85 We were previously only looking for the space character, but PDF white space is a superset of ASCII spaces.	2023-11-14 10:15:15 +01:00
Lucas CHOLLET	db08fe12ec	LibPDF: Implement `Reader::is_[eol, whitespace](char)` These two static members are now used to implement respective `matches_` methods but will also be useful to provide a global implementation of the specified concept of whitespace.	2023-11-14 10:15:15 +01:00
Lucas CHOLLET	dac703a0b8	LibPDF: Avoid an unnecessary copy in `Filter::decode_ascii85()`	2023-11-14 10:15:15 +01:00
Nico Weber	9b022239c3	LibPDF: Apply all offsets of TJ operator TJ acts on a list of either strings or numbers. The strings are drawn, and the numbers are treated as offsets. Previously, we'd only apply the last-seen number as offset when we saw a string. That had the effect of us ignoring all but the last number in front of a string, and ignoring numbers at the end of the list. Now, we apply all numbers as offsets. Our rendering of Tests/LibPDF/text.pdf now matches other PDF viewers.	2023-11-14 10:11:09 +01:00
Nico Weber	1c2b0feb7b	LibPDF: Change how CFF optional width prefix is stored Per 5177.Type2.pdf 3.1 "Type 2 Charstring Organization", a glyph's charstring looks like: w? {hs* vs* cm* hm* mt subpath}? {mt subpath}* endchar The `w?` is the width of the glyph, but it's optional. So all possible commands after it (hstem* vstem* cntrmask hintmask moveto endchar) check if there's an extra number at the start and interpret it as a width, for the very first command we read. This was done by having an `is_first_command` local bool that got set to false after the first command. That didn't work with subrs: If the first command was a call to a subr that just pushed a bunch of numbers, then the second command after it is the actual first command. Instead, move that bool into the state. Set it to false the first time we try to read a width, since that means we just read a command that could've been prefixed by a width.	2023-11-14 10:10:34 +01:00
Lucas CHOLLET	9e4d697d23	LibPDF: Detect DCT images correctly Images can have multiple filters, each one of them is processed sequentially. Only the last one will be relevant for the image format (DCT or JPXDecode), so use the last filter instead of the first one to detect that property.	2023-11-13 10:30:34 -05:00
Nico Weber	f882a3ae37	LibPDF: In ColorSpace creation code, use resolve_to() more For valid PDFs, this makes no difference. For invalid PDFs, we now assert during the cast in resolve_to() instead of returning a PDFError. However, most PDFs are valid, and even for invalid PDFs, we'd previously keep the old color space around when getting the PDF error and then usually assert later when the old color space got passed a color with an unexpected number of components (since the components were for the new color space). Doesn't affect any of the > 2000 PDFs I use for testing locally, is less code, and should make for less surprising asserts when it does happen.	2023-11-13 10:29:26 -05:00
Lucas CHOLLET	9bc25db9a3	LibPDF: Add support for the LZW filter This allows us to decode the first page of ThinkingInPostScript.pdf :^)	2023-11-13 14:23:23 +01:00
Lucas CHOLLET	048ef11136	LibPDF: Factorize flate parameters handling to its own function This part will be shared with the LZW filter, so let's factorize it.	2023-11-13 14:23:23 +01:00
Nico Weber	bbde3cbc90	LibPDF: Tolerate an indirect object as dict for CIE-based color spaces Namely, for CalGrayColorSpace, CalRGBColorSpace, LabColorSpace. Fixes a crash rendering any page of Adobe's 5014.CIDFont_Spec.pdf (which uses CalRGBColorSpace with an indirect dict: The dict is object `92 0`, and many color spaces are inline objects referring to it).	2023-11-13 07:12:05 -05:00
Nico Weber	f4a847894f	LibPDF: Make SampledFunction::evaluate() work for n-dimensional input I didn't find example code for this and the AI assistant did very poorly on this as well. So I had to write it all by myself! It can be much more efficient I think, but I think the overall shape is maybe roughly fine.	2023-11-12 07:55:04 +01:00
Nico Weber	a9ef65e64a	LibPDF: For multi-output SampledFunctions, fix output colors For N outputs, the outputs aren't stored in N independent planes. Instead, N output values are stored right next to each other in the stream data.	2023-11-11 08:55:37 +01:00
Nico Weber	ec739460e0	LibPDF: Add test for SampledFunction and fix bugs found by it * SampledFunction now keeps the StreamObject it gets data from alive (doesn't matter too much in practice, but does matter in the test, where nothing else keeps the stream alive). * If a sample is an integer, we would previously sample that value twice and then divide by zero when interpolating. Make sure to sample 1 unit apart.	2023-11-11 08:55:37 +01:00
Nico Weber	323ba7404c	LibPDF: Implement SampledFunction::evaluate() for some sampled functions Things now work for functions that are all of: * linear * 1-D input * 8 bits per sample	2023-11-10 15:03:30 +00:00
Nico Weber	fd1876441a	LibPDF: Implement SampledFunction::create()	2023-11-10 15:03:30 +00:00
Nico Weber	cd9f4655ec	LibPDF: Tweak implementation of postscript `roll` op Since positive offsets roll to the right, it makes more sense to do the big reverse first. Gets rid of an awkward minus sign. No behavior change.	2023-11-10 14:45:38 +01:00
Nico Weber	b23ed86889	LibPDF: Implement StitchingFunction::evaluate()	2023-11-10 14:45:16 +01:00
Nico Weber	ba34ddeb21	LibPDF: Implement StitchingFunction creation	2023-11-10 14:45:16 +01:00
Nico Weber	5af6e1c042	LibPDF: Implement DeviceNColorSpace	2023-11-09 23:33:49 +01:00
Nico Weber	0f07049935	LibPDF: Add ColorSpaceFamily::operator== No behavior change.	2023-11-09 23:33:49 +01:00
Nico Weber	80eec1e16b	LibPDF: Implement PostScriptCalculatorFunction Includes a tokenizer and interpreter for the subset of PostScript supported in PDF type 4 functions.	2023-11-09 16:06:25 +01:00
Tim Schumacher	a2f60911fe	AK: Rename GenericTraits to DefaultTraits This feels like a more fitting name for something that provides the default values for Traits.	2023-11-09 10:05:51 -05:00
Nico Weber	bbd86ee4f3	LibPDF: Implement ExponentialInterpolationFunction	2023-11-06 10:01:05 +01:00
Nico Weber	1aed465efe	LibPDF: Implement Function::create()	2023-11-06 10:01:05 +01:00
Nico Weber	b78ea81de5	LibPDF: Implement SeparationColorSpace Requires PDF::Function, which isn't implemented yet, so this has no visual effect yet.	2023-11-06 10:01:05 +01:00
Nico Weber	9204252d02	LibPDF: Add scaffolding for function objects See PDF 1.7 Spec, "3.9 Functions".	2023-11-06 10:01:05 +01:00
Nico Weber	21894f1cde	LibPDF: Fix typos in DeviceN colorspace scaffolding * Compare array size to 3 and 4, not 4 and 5 * Fix literal typo in error message Fixes crash processing 0000906.pdf from 0000.zip from the pdf/a dataset.	2023-11-06 09:54:01 +01:00
Nico Weber	30ea218e35	LibPDF: Implement IndexedColorSpace	2023-11-05 14:27:22 -07:00
Nico Weber	0b087c02a3	LibPDF: Add spec link to default_decode()	2023-11-05 14:27:22 -07:00
Nico Weber	3dca11c4e2	LibPDF: Move color space creation from name or array into ColorSpace No behavior change.	2023-11-05 14:27:22 -07:00
Nico Weber	1dfd49ef99	LibPDF: Implement LabColorSpace	2023-11-05 14:27:22 -07:00
Nico Weber	4a5136fc8c	LibPDF: Implement CalGrayColorSpace I haven't seen this being used in the wild, but it's used in Tests/LibPDF/colorspaces.pdf.	2023-11-04 17:02:37 -04:00
Nico Weber	a207ab709a	LibPDF: In convert_to_srgb(), also apply sRGB curve (ish) We did convert from the input space to linear space and then to linear sRGB, but we forgot to re-apply gamma. This uses the x^2.2 curve instead of the real sRGB curve for now.	2023-11-04 17:02:37 -04:00
Nico Weber	641365b235	LibPDF: Move colorspace conversion functions up a bit No code change, no behavior change. Pure code move.	2023-11-04 17:02:37 -04:00
Nico Weber	f8799885de	LibPDF: Clamp sRGB channels before converting to u8 in CalRGB code Sometimes the numbers end up just slightly above 1.0f, which previously caused an overflow.	2023-11-01 11:45:13 -04:00
Nico Weber	bdd2404453	LibPDF: Ignore input whitepoint in convert_to_d65() CalRGBColorSpace::color() converts into a flat xyz space, which already takes input whitepoint into account. It shouldn't be taken into account again when converting from the flat color space to D65.	2023-11-01 11:45:13 -04:00
Nico Weber	e35a5da2fb	LibPDF: Update dead link in a comment	2023-11-01 11:45:13 -04:00
Nico Weber	1fcf0142d2	LibPDF: Fix unfortunate typo in CalRGBColorSpace::create() We always ignored the /Matrix key in /CalRGB dicts.	2023-11-01 11:45:13 -04:00
Nico Weber	d24289eef4	LibPDF: Always log unhandled type 1 and type 2 font program opcodes This would've made it easy to see that we were missing flex opcodes for https://developer.apple.com/library/archive/documentation/mac/pdf/Text.pdf	2023-11-01 11:40:16 -04:00
Nico Weber	e1a743f286	LibPDF: Implement type 2 flex, hflex, hflex1, flex1 operators This is the type 2 equivalent to type 1 othersubr, from what I can tell. See "4.1 Path Construction Operators" in 5177.Type2.pdf, "The Type 2 Charstring Format". Makes text show up alright on https://developer.apple.com/library/archive/documentation/mac/pdf/Text.pdf	2023-11-01 11:40:16 -04:00
Nico Weber	3e707efdfa	LibPDF: Move type1 subr 0 handling into othersubr handler https://adobe-type-tools.github.io/font-tech-notes/pdfs/T1_SPEC.pdf, 8.4 First Four Subrs Entries: """If Flex or hint replacement is used in a Type 1 font program, the first four entries in the Subrs array in the Private dictionary must be assigned charstrings that correspond to the following code sequences. If neither Flex nor hint replacement is used in the font program, then this requirement is removed, and the first Subrs entry may be a normal charstring subroutine sequence. The first four Subrs entries contain: Subrs entry number 0: 3 0 callothersubr pop pop setcurrentpoint return """ othersubr handler 0 gets three arguments: * The flex height (the distance after which the bezier splines are replaced with just straight lines) * The current position after the flex It pushes that position on the postscript stack, where predefined subr handler number 0 then pops it from. It then passes it to setcurrentpoint. In theory, we now correctly do that setcurrentpoint call, which we previously weren't. In practice, that setcurrentpoint call always receives the last point of the flex -- and our path api apparently gets confused when move_to() is called on it when the current point is already at that same location. So tweak the SetCurrentPoint handler to not set the current point on the path if it's already the path's current point, with a FIXME to figure out what exactly is happening in Gfx::Path. No big behavior change if flex is used, but this is more correct if it isn't. (This only works because our `return` handler is empty, else we would have to make the callothersubr handler start a call frame.)	2023-11-01 11:38:41 -04:00
Nico Weber	0bb8249780	LibPDF: Move type1 subr 1 and 2 handling into othersubr handler https://adobe-type-tools.github.io/font-tech-notes/pdfs/T1_SPEC.pdf, 8.4 First Four Subrs Entries: """If Flex or hint replacement is used in a Type 1 font program, the first four entries in the Subrs array in the Private dictionary must be assigned charstrings that correspond to the following code sequences. If neither Flex nor hint replacement is used in the font program, then this requirement is removed, and the first Subrs entry may be a normal charstring subroutine sequence. The first four Subrs entries contain: [...] Subrs entry number 1: 0 1 callothersubr return Subrs entry number 2: 0 2 callothersubr return """ So subr entry numbers 1 and 2 just call othersubr 1 and 2, which means we can just move the handling code over. No behavior change if flex is used, but more correct if it isn't. (This only works because our `return` handler is empty, else we would have to make the callothersubr handler start a call frame.)	2023-11-01 11:38:41 -04:00
Ali Mohammad Pur	78c04cb8b2	AK+LibPDF: Make Format print floats in a roundtrip-safe way by default Previously we assumed a default precision of 6, which made the printed values quite odd in some cases. This commit changes that default to print them with just enough precision to produce the exact same float when roundtripped. This commit adds some new tests that assert exact format outputs, which have to be modified if we decide to change the default behaviour.	2023-10-31 09:12:35 +03:30
Nico Weber	4cc24548f6	LibPDF: Call dbgln() for unimplemented flex opcodes	2023-10-28 13:28:05 -04:00
Nico Weber	e484fae8e1	LibPDF: Don't do special subr processing for type 2 CFFs This is a subset of #21484: Type 2 CFFs never use the special subrs, so stop doing them for type 2 at least for now. Fixes an assert in 0000064.pdf in 0000.zip in the pdfa dataset (a stack underflow because a subr is supposed to push a bunch of stuff, but instead it ran one of the built-in routines instead of the subr from the font file). As discussed in #21484, this isn't right for type 1 CFFs either, but just removing the code there regresses Tests/LibPDF/type1.pdf. A slightly more involved thing is needed there; I added a FIXME for that here.	2023-10-28 13:28:05 -04:00
Tim Ledbetter	5c0c55d2c0	LibPDF: Ensure xref stream field widths are within expected range Previously, an xref stream with a field width larger than 8 would result in an undefined shift occurring. We now ensure that each field width is a number and is less than or equal to 8.	2023-10-28 13:17:09 -04:00
Nico Weber	6d47fca3bf	LibPDF: Don't assert on outline destinations that use `null` as page Nothing in PDF 1.7 spec 8.2.1 Destinations mentions the page being `null`, but it happens in 0000372.pdf (for the root outline element) and in 0000776.pdf (for every outline element, which looks like a bug in the generator maybe) of 0000.zip from the pdfa dataset.	2023-10-27 06:38:25 -04:00
Tim Ledbetter	b4296e1c9b	LibPDF: Don't use unsanitized values in error messages Previously, constructing error messages with unsanitized input could fail because error message strings must be UTF-8.	2023-10-26 11:05:32 +02:00
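
Commit cd9f4655ec above describes the PostScript `roll` op in terms of reversals, doing "the big reverse first" because positive offsets roll to the right. The following is a minimal sketch of that idea, not LibPDF's actual code: the function name `roll`, the use of `std::vector<double>` as the operand stack (top of stack at the back), and the `bool` return for invalid arguments are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper, not taken from LibPDF.
// Applies PostScript `n j roll` to the top n elements of `stack`,
// where the back of the vector is the top of the stack.
// A positive j moves elements toward the top, wrapping the top
// elements around to the bottom of the n-element window.
// Returns false if n is negative or larger than the stack.
inline bool roll(std::vector<double>& stack, long n, long j)
{
    if (n < 0 || static_cast<std::size_t>(n) > stack.size())
        return false;
    if (n == 0)
        return true;

    // Normalize j into [0, n) so negative offsets become right rotations.
    long shift = ((j % n) + n) % n;

    auto first = stack.end() - n;
    // The big reverse first: flip the whole n-element window...
    std::reverse(first, stack.end());
    // ...then flip the two parts back individually.
    std::reverse(first, first + shift);
    std::reverse(first + shift, stack.end());
    return true;
}
```

With the stack `1 2 3` (3 on top), `3 1 roll` under this sketch yields `3 1 2`, and `3 -1 roll` yields `2 3 1`, which matches the behavior the PostScript `roll` operator is documented to have.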

662 commits