serenity

mirror of https://github.com/RGBCube/serenity synced 2025-09-18 21:26:16 +00:00

Author	SHA1	Message	Date
Matthew Olsson	be6e4b6f3c	LibPDF: Store indirect value refs in Value objects IndirectValueRef is so simple that it can be stored directly in the Value class instead of being heap allocated. As the comment in Value says, however, in theory the max bits needed to store is 48 (16 for the generation index and 32(?) for the object index), but 32 should be good enough for now. We can increase it to u64 later if necessary.	2021-05-25 00:24:09 +04:30
Matthew Olsson	477e3946e5	LibPDF: Add support for stream filters This commit also splits up StreamObject into PlainTextStreamObject and EncodedStreamObject, which is essentially just a stream object which does not own its bytes vs one which does.	2021-05-25 00:24:09 +04:30
Matthew Olsson	8c7ebc7a3f	LibPDF: Do not assume value is an object in parse_indirect_value	2021-05-25 00:24:09 +04:30
Matthew Olsson	101639e526	LibPDF: Parse graphics commands	2021-05-18 16:35:23 +02:00
Matthew Olsson	03649f85e2	LibPDF: Don't rely on a stream's /Length key existing Some PDFs omit this key apparently, but Firefox opens them fine. Let's emulate that behavior.	2021-05-18 16:35:23 +02:00
Matthew Olsson	3aeaceb727	LibPDF: Parse nested Page Tree structures We now follow nested page tree nodes to find all of the actual page dicts, whereas previously we just assumed the root level page tree node contained all of the page children directly.	2021-05-10 10:32:39 +02:00
Matthew Olsson	8c745ad0d9	LibPDF: Parse page structures This commit introduces the ability to parse the document catalog dict, as well as the page tree and individual pages. Pages obviously aren't fully parsed, as we won't care about most of the fields until we start actually rendering PDFs. One of the primary benefits of the PDF format is laziness. PDFs are not meant to be parsed all at once, and the same is true for pages. When a Document is constructed, it builds a map of page number to object index, but it does not fetch and parse any of the pages. A page is only parsed when a caller requests that particular page (and is cached going forwards). Additionally, this commit also adds an object_cast function which logs bad casts if DEBUG_PDF is set. Additionally, utility functions were added to ArrayObject and DictObject to get all types of objects from the collections to avoid having to manually cast.	2021-05-10 10:32:39 +02:00
Matthew Olsson	72f693e9ed	LibPDF: Add a basic parser and Document structure This commit adds a parser as well as the Reader class, which serves as a utility to aid in reading the PDF both forwards and in reverse. The parser currently is capable of reading xref tables, as well as all values. We don't really do anything with any of this information, however.	2021-05-10 10:32:39 +02:00
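
The lazy page handling described in 8c745ad0d9 and the nested page tree walk from 3aeaceb727 can be illustrated roughly as follows. This is a minimal sketch, not LibPDF's actual code: `Node`, `Page`, `Document`, `collect_pages`, and `parse_count` are hypothetical names invented for this example, and a plain `std::map` stands in for the xref-backed object lookup. It assumes the page tree contains no cycles.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parsed PDF dictionary (not LibPDF's DictObject).
struct Node {
    std::string type;           // "Pages" for a page tree node, "Page" for a leaf
    std::vector<uint32_t> kids; // object indices of the children of a "Pages" node
};

// Hypothetical parsed page; a real page would carry many more fields.
struct Page {
    uint32_t object_index;
};

class Document {
public:
    // Construction only records which object index each page number maps to.
    // No page is parsed here.
    Document(std::map<uint32_t, Node> objects, uint32_t root_index)
        : m_objects(std::move(objects))
    {
        collect_pages(root_index);
    }

    std::size_t page_count() const { return m_page_object_indices.size(); }

    // Number of pages actually parsed so far (exposed for illustration only).
    std::size_t parse_count() const { return m_parse_count; }

    // A page is parsed the first time it is requested and cached afterwards.
    const Page& page(std::size_t page_number)
    {
        assert(page_number < m_page_object_indices.size());
        auto it = m_cache.find(page_number);
        if (it != m_cache.end())
            return it->second;
        ++m_parse_count;
        Page parsed { m_page_object_indices[page_number] };
        return m_cache.emplace(page_number, parsed).first->second;
    }

private:
    // Depth-first walk that follows nested "Pages" nodes down to the leaves,
    // so pages are numbered in document order.
    void collect_pages(uint32_t index)
    {
        const Node& node = m_objects.at(index);
        if (node.type == "Page") {
            m_page_object_indices.push_back(index);
            return;
        }
        for (uint32_t kid : node.kids)
            collect_pages(kid);
    }

    std::map<uint32_t, Node> m_objects;
    std::vector<uint32_t> m_page_object_indices; // page number -> object index
    std::unordered_map<std::size_t, Page> m_cache;
    std::size_t m_parse_count { 0 };
};
```

With a root "Pages" node whose children are another "Pages" node and a "Page" leaf, constructing a `Document` fills in the page-number map without parsing anything. Only a call such as `doc.page(1)` triggers a parse, and a second request for the same page is served from the cache.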

58 commits