From 2d12647e291c83b15000b54645a26ce926834c26 Mon Sep 17 00:00:00 2001
From: Nico Weber
Date: Wed, 3 Jan 2024 17:55:41 -0500
Subject: [PATCH] LibPDF: Add FIXME for "was linearized PDF incrementally updated" check

It's pretty tricky to do, and also tricky with respect to
skipping trailing bytes after %%EOF: The check requires knowing
the full size of the PDF (which means web servers not sending
content lengths are out), but that size has to be after stripping
trailing bytes, which normal static file servers won't do.
So PDF viewers would have to download the last couple bytes of
the PDF unconditionally, then strip trailing bytes and use the
count to figure out the final actual PDF size.

Luckily, we don't incrementally download PDFs from the net but
instead require all data to be available in one chunk, so it's
not currently a problem.
---
 Userland/Libraries/LibPDF/DocumentParser.cpp | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/Userland/Libraries/LibPDF/DocumentParser.cpp b/Userland/Libraries/LibPDF/DocumentParser.cpp
index a856903c03..0c07e563b1 100644
--- a/Userland/Libraries/LibPDF/DocumentParser.cpp
+++ b/Userland/Libraries/LibPDF/DocumentParser.cpp
@@ -46,6 +46,10 @@ PDFErrorOr DocumentParser::initialize()
         // If the length given in the linearization dictionary is not equal to the length
         // of the document, then this file has most likely been incrementally updated, and
         // should no longer be treated as linearized.
+        // FIXME: This check requires knowing the full size of the file, while linearization
+        // is all about being able to render some of it without having to download all of it.
+        // PDF 2.0 Annex G.7 "Accessing an updated file" talks about this some,
+        // but mostly just throws its hands in the air.
         is_linearized = m_linearization_dictionary.value().length_of_file == m_reader.bytes().size();
     }
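
For illustration only, the "strip trailing bytes, then compare sizes" step described in the commit message could look roughly like the sketch below. It is not LibPDF code: `size_without_trailing_bytes` and `still_linearized` are hypothetical names, it uses plain standard C++ rather than AK types, and it assumes that one end-of-line sequence directly after the last `%%EOF` still counts towards the file length.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not part of LibPDF): size of the PDF once everything
// after the last "%%EOF" marker is dropped. Assumption: a single EOL (CR, LF
// or CRLF) directly after the marker still belongs to the file.
inline std::size_t size_without_trailing_bytes(std::string_view data)
{
    constexpr std::string_view marker = "%%EOF";
    auto pos = data.rfind(marker);
    if (pos == std::string_view::npos)
        return data.size(); // No marker found: leave the size unchanged.

    std::size_t end = pos + marker.size();
    if (end < data.size() && data[end] == '\r')
        ++end;
    if (end < data.size() && data[end] == '\n')
        ++end;
    return end;
}

// Hypothetical check: the file still counts as linearized only if the length
// recorded in the linearization dictionary matches the stripped size.
inline bool still_linearized(std::size_t length_of_file, std::string_view data)
{
    return length_of_file == size_without_trailing_bytes(data);
}
```

Note that this only works with the whole file in memory, which is the situation the commit message describes; a viewer downloading incrementally would first have to fetch the tail of the file to learn the stripped size.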