mirror of https://github.com/RGBCube/serenity synced 2025-07-26 01:47:34 +00:00

LibPDF: Make parser skip whitespace after header

0000990.pdf from 0000.zip from
https://pdfa.org/new-large-scale-pdf-corpus-now-publicly-available/
starts like so:

```
%PDF-1.7

4 0 obj
```

parse_header() used to leave the cursor at the start of the 2nd,
empty, line. initialize_linearization_dict() would then check
`m_reader.matches_number()` to see if there could possibly
be a linearization dict, and that check fails on an empty line.

In this case, there isn't one, but we should detect linearization
dicts even if they're separated by whitespace from the first line.
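
A minimal sketch of the before/after behavior, for illustration only.
`SketchReader` and `number_follows_header()` are hypothetical stand-ins,
not LibPDF's actual Reader API, and `std::isspace` only approximates the
PDF whitespace set:

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical, simplified stand-in for LibPDF's Reader (assumption, not the real class).
class SketchReader {
public:
    explicit SketchReader(std::string_view bytes)
        : m_bytes(bytes)
    {
    }

    bool done() const { return m_offset >= m_bytes.size(); }

    // Consume exactly one line terminator: "\n", "\r", or "\r\n".
    void consume_eol()
    {
        if (done())
            return;
        if (m_bytes[m_offset] == '\r') {
            ++m_offset;
            if (!done() && m_bytes[m_offset] == '\n')
                ++m_offset;
        } else if (m_bytes[m_offset] == '\n') {
            ++m_offset;
        }
    }

    // Skip the rest of the current line, including its terminator.
    void consume_line()
    {
        while (!done() && m_bytes[m_offset] != '\n' && m_bytes[m_offset] != '\r')
            ++m_offset;
        consume_eol();
    }

    // Skip any run of whitespace, including further empty lines.
    void consume_whitespace()
    {
        while (!done() && std::isspace(static_cast<unsigned char>(m_bytes[m_offset])))
            ++m_offset;
    }

    // True if the cursor sits on a digit, i.e. an object number could start here.
    bool matches_number() const
    {
        return !done() && std::isdigit(static_cast<unsigned char>(m_bytes[m_offset]));
    }

private:
    std::string_view m_bytes;
    std::size_t m_offset { 0 };
};

// Returns whether a number is visible right after the header line.
// skip_whitespace = false models the old behavior, true the new one.
bool number_follows_header(std::string_view file, bool skip_whitespace)
{
    SketchReader reader(file);
    reader.consume_line(); // "%PDF-1.7" and its line terminator
    if (skip_whitespace)
        reader.consume_whitespace();
    return reader.matches_number();
}
```

With the input above (`"%PDF-1.7\n\n4 0 obj\n"`), the old behavior stops on
the empty line and sees no number, while skipping whitespace lands the
cursor on the `4`.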
Nico Weber 2023-10-20 22:36:10 -04:00 committed by Andreas Kling
parent 5b36355be8
commit cf26fc2393


@@ -92,6 +92,7 @@ PDFErrorOr<Version> DocumentParser::parse_header()
return error(DeprecatedString::formatted("Unknown minor version \"{}\"", minor_ver));
m_reader.consume_eol();
m_reader.consume_whitespace();
// Parse optional high-byte comment, which signifies a binary file
// FIXME: Do something with this?