mirror of
https://github.com/RGBCube/serenity
synced 2025-07-26 22:27:44 +00:00
LibPDF: Try to repair XRef tables with broken indices
An XRef table usually starts with an object number of zero. While it could technically start at any other number, this is a tell-tale sign of a broken table.

For the "broken" documents I encountered, this always meant that some objects had been removed from the start of the table without updating the following indices. When this is the case, the document cannot be read normally. However, most other PDF parsers seem to know of this quirk and fix the XRef table automatically.

Likewise, we now check for this exact case, and if it matches what we expect, we update the XRef table so that all object numbers match the actual objects found in the file again.
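The check described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code from this commit: `XRefEntry`, `object_number_at`, and `repaired_first_object_number` are hypothetical names, the file is modelled as a plain string, and every entry is assumed to be an in-use entry pointing at an `N G obj` header.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one in-use XRef entry; not LibPDF's actual type.
struct XRefEntry {
    std::size_t byte_offset;
};

// Parses a run of decimal digits at `pos`; returns false if there is none.
static bool read_number(std::string const& s, std::size_t& pos, unsigned long& out)
{
    std::size_t start = pos;
    out = 0;
    while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos])))
        out = out * 10 + static_cast<unsigned long>(s[pos++] - '0');
    return pos > start;
}

static void skip_spaces(std::string const& s, std::size_t& pos)
{
    while (pos < s.size() && s[pos] == ' ')
        ++pos;
}

// Returns the object number of the "N G obj" header at `offset`, if one is there.
std::optional<unsigned long> object_number_at(std::string const& file, std::size_t offset)
{
    unsigned long number = 0;
    unsigned long generation = 0;
    std::size_t pos = offset;
    if (!read_number(file, pos, number))
        return std::nullopt;
    skip_spaces(file, pos);
    if (!read_number(file, pos, generation))
        return std::nullopt;
    skip_spaces(file, pos);
    if (file.compare(pos, 3, "obj") != 0)
        return std::nullopt;
    return number;
}

// Returns the corrected first object number, or nullopt if the table should be
// left alone (it starts at zero, already matches, or does not fit the pattern).
std::optional<unsigned long> repaired_first_object_number(
    std::string const& file, unsigned long first_number, std::vector<XRefEntry> const& entries)
{
    if (first_number == 0 || entries.empty())
        return std::nullopt; // Table looks normal, nothing to repair.

    auto first_found = object_number_at(file, entries[0].byte_offset);
    if (!first_found)
        return std::nullopt;

    // Only repair if every entry points at consecutively numbered objects.
    for (std::size_t i = 0; i < entries.size(); ++i) {
        auto found = object_number_at(file, entries[i].byte_offset);
        if (!found || *found != *first_found + i)
            return std::nullopt;
    }

    if (*first_found == first_number)
        return std::nullopt; // Indices already match the objects in the file.
    return first_found;
}
```

A caller would pass the first object number declared by the table together with its entries and, if a value comes back, renumber the table starting from it. The sketch deliberately refuses to touch the table unless every entry fits the expected pattern, mirroring the "exact case" wording above.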
parent e06a065594
commit d1bc89e30b
3 changed files with 54 additions and 1 deletions
@@ -79,6 +79,7 @@ private:
     PDFErrorOr<LinearizationResult> initialize_linearization_dict();
     PDFErrorOr<void> initialize_linearized_xref_table();
     PDFErrorOr<void> initialize_non_linearized_xref_table();
+    PDFErrorOr<void> validate_xref_table_and_fix_if_necessary();
     PDFErrorOr<void> initialize_hint_tables();
     PDFErrorOr<PageOffsetHintTable> parse_page_offset_hint_table(ReadonlyBytes hint_stream_bytes);
     Vector<PageOffsetHintTableEntry> parse_all_page_offset_hint_table_entries(PageOffsetHintTable const&, ReadonlyBytes hint_stream_bytes);