1
Fork 0
mirror of https://github.com/RGBCube/serenity synced 2025-10-25 19:22:33 +00:00
Commit graph

16 commits

Author SHA1 Message Date
Simon Woertz
c857b5d22f LibPDF: Convert PDF::Parser::m_document from RefPtr to WeakPtr
Otherwise both `PDF::Document` and `PDF::Parser` have a `RefPtr`
pointing to each other which leads to a memory leak due to a circular
dependency.
2022-01-08 18:57:55 +01:00
Andreas Kling
216e21a1fa AK: Convert AK::Format formatting helpers to returning ErrorOr<void>
This isn't a complete conversion to ErrorOr<void>, but a good chunk.
The end goal here is to propagate buffer allocation failures to the
caller, and allow the use of TRY() with formatting functions.
2021-11-17 00:21:13 +01:00
Andreas Kling
80d4e830a0 Everywhere: Pass AK::ReadonlyBytes by value 2021-11-11 01:27:46 +01:00
Ben Wiederhake
f84a7e2e22 LibPDF: Replace Value class by AK::Variant
This decreases the memory consumption by LibPDF by 4 bytes per Value,
compensating exactly for the increase in an earlier commit. :^)
2021-09-20 17:39:36 +04:30
Ben Wiederhake
edc0cd29f8 LibPDF: Break weird dependency cycle
Old situation:
Object.h defines Object
Object.h defines ArrayObject
ArrayObject requires the definition of Object
ArrayObject requires the definition of Value
Value.h defines Value
Value requires the definition of Object

Therefore, a file with the single line "#include <Value.h>" used to
raise compilation errors; certainly not something that one might expect
from a library.

This patch splits up the definitions in Object.h to break the cycle.
Now, Object.h only defines Object, Value.h still only defines Value (and
includes Object.h), and the new header ObjectDerivatives.h defines
ArrayObject (and includes both Object.h and Value.h).
2021-09-20 17:39:36 +04:30
Matthew Olsson
612b183703 LibPDF: Convert to east-const to comply with the recent style changes 2021-06-12 22:45:01 +04:30
Matthew Olsson
e23bfd7252 LibPDF: Parse linearized PDF files
This is a big step, as most PDFs which are downloaded online will be
linearized. Pretty much the only difference is that the xref structure
is slightly different.
2021-06-12 22:45:01 +04:30
Matthew Olsson
78bc9d1539 LibPDF: Refine the distinction between the Document and Parser
The Parser should hold information relevant for parsing, whereas the
Document should hold information relevant for displaying pages.
With this in mind, there is no reason for the Document to hold the
xref table and trailer. These objects have been moved to the Parser,
which allows the Parser to expose less public methods (which will be
even more evident once linearized PDFs are supported).
2021-06-12 22:45:01 +04:30
Matthew Olsson
1ef5071d1b LibPDF: Harden the document/parser against errors 2021-06-12 22:45:01 +04:30
Matthew Olsson
a08922d2f6 LibPDF: Parse outline structures 2021-05-25 00:24:09 +04:30
Matthew Olsson
d5f94aaa7b LibPDF/PDFViewer: Support rotated pages 2021-05-18 16:35:23 +02:00
Matthew Olsson
f7ea1eb610 Applications: Add a very simple PDFViewer 2021-05-18 16:35:23 +02:00
Matthew Olsson
d6a9b41bac LibPDF: Parse page crop box and user units 2021-05-18 16:35:23 +02:00
Matthew Olsson
3aeaceb727 LibPDF: Parse nested Page Tree structures
We now follow nested page tree nodes to find all of the actual
page dicts, whereas previously we just assumed the root level
page tree node contained all of the page children directly.
2021-05-10 10:32:39 +02:00
Matthew Olsson
8c745ad0d9 LibPDF: Parse page structures
This commit introduces the ability to parse the document catalog dict,
as well as the page tree and individual pages. Pages obviously aren't
fully parsed, as we won't care about most of the fields until we
start actually rendering PDFs.

One of the primary benefits of the PDF format is laziness. PDFs are
not meant to be parsed all at once, and the same is true for pages.
When a Document is constructed, it builds a map of page number to
object index, but it does not fetch and parse any of the pages. A page
is only parsed when a caller requests that particular page (and is
cached going forwards).

Additionally, this commit also adds an object_cast function which
logs bad casts if DEBUG_PDF is set. Additionally, utility functions
were added to ArrayObject and DictObject to get all types of objects
from the collections to avoid having to manually cast.
2021-05-10 10:32:39 +02:00
Matthew Olsson
72f693e9ed LibPDF: Add a basic parser and Document structure
This commit adds a parser as well as the Reader class, which serves
as a utility to aid in reading the PDF both forwards and in reverse.
The parser currently is capable of reading xref tables, as well as
all values. We don't really do anything with any of this information,
however.
2021-05-10 10:32:39 +02:00