"Incorrect" is in quotes because the spec (both 1.7 and 2.0) specifies this
multiplication as it was written originally! However, flipping the order of
operations here makes the text in all of my test cases render in the
correct position.
The CTM is a transformation matrix between the text coordinate system
and the device coordinate system. However, being on the right-hand side
of the multiplication means that the CTM's scale parameters have no
influence on the translation component of the left-hand matrix. This
oddity is what originally led me to simply try this change and see if
it worked.
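As a standalone illustration (this is not LibPDF's actual matrix code), here is a
minimal column-vector-style affine transform showing why the CTM's position in the
product matters: with the CTM on the right, the left matrix's translation passes
through unscaled, while with the CTM on the left, the page scale applies to the
translation as one would expect for device coordinates.

```cpp
#include <cstdio>

// Minimal 2D affine transform in column-vector form, purely for illustration:
// | a c e |
// | b d f |
// | 0 0 1 |
struct Affine {
    double a, b, c, d, e, f;

    // Returns this * other (i.e. 'other' is applied first, then 'this').
    Affine multiplied(Affine const& o) const
    {
        return {
            a * o.a + c * o.b,
            b * o.a + d * o.b,
            a * o.c + c * o.d,
            b * o.c + d * o.d,
            a * o.e + c * o.f + e,
            b * o.e + d * o.f + f,
        };
    }
};

int main()
{
    Affine text_matrix { 1, 0, 0, 1, 100, 200 }; // glyph placed at (100, 200)
    Affine ctm { 0.5, 0, 0, 0.5, 0, 0 };         // zoomed-out page: scale by 0.5

    Affine ctm_on_right = text_matrix.multiplied(ctm);
    Affine ctm_on_left = ctm.multiplied(text_matrix);

    // CTM on the right: translation stays (100, 200); the page scale never touches it.
    printf("ctm on right: (%g, %g)\n", ctm_on_right.e, ctm_on_right.f);
    // CTM on the left: translation becomes (50, 100), scaled into device space.
    printf("ctm on left:  (%g, %g)\n", ctm_on_left.e, ctm_on_left.f);
}
```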
Previously, text spacing only looked correct on very zoomed-in pages.
When the page was zoomed out, the spacing between characters was far too
large. The cause was incorrect initial values for the Tc (character
spacing) and Tw (word spacing) text parameters. The initial values were
too large, but only by about 3-5 pixels, which is why the error was only
noticeable on smaller (zoomed-out) pages.
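For reference, the PDF text state starts with zero character and word spacing. A
hypothetical sketch of the relevant defaults (field names are illustrative, not
LibPDF's exact struct):

```cpp
// Hypothetical text state struct, just to show the spec defaults: any nonzero
// starting value here adds a constant gap between every glyph, which is barely
// visible when zoomed in but dominates the layout when the page is rendered small.
struct TextState {
    float character_spacing { 0.0f };  // Tc: defaults to 0
    float word_spacing { 0.0f };       // Tw: defaults to 0
    float horizontal_scaling { 1.0f }; // Th: defaults to 100%
    float leading { 0.0f };            // Tl: defaults to 0
};
```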
The text placement still isn't perfect, but it is _much_ better!
Apologies for the enormous commit, but I don't see a way to split this
up nicely. In the vast majority of cases it's a simple change. A few
extra places can use TRY instead of manual error checking though. :^)
This isn't a complete conversion to ErrorOr<void>, but a good chunk.
The end goal here is to propagate buffer allocation failures to the
caller, and allow the use of TRY() with formatting functions.
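A hedged sketch of the intended end state (the helper function is made up;
`try_append` is AK's fallible StringBuilder API): once the formatting path returns
`ErrorOr<void>`, buffer allocation failures simply bubble up through TRY().

```cpp
#include <AK/Error.h>
#include <AK/StringBuilder.h>

// Illustrative only: each try_append() may fail with ENOMEM, and TRY()
// propagates that error to our caller instead of asserting.
ErrorOr<void> append_greeting(StringBuilder& builder, StringView name)
{
    TRY(builder.try_append("Hello, "sv));
    TRY(builder.try_append(name));
    TRY(builder.try_append('\n'));
    return {};
}
```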
Add a check to `Parser::consume_eol` to ensure that there is more data to
read before actually consuming any. Not checking whether any data is left
leads to a failed assertion for, e.g., a truncated PDF file.
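A minimal sketch of the idea (the reader helpers here are illustrative, not
necessarily LibPDF's exact API):

```cpp
// Return early when the reader has no bytes left, so a truncated PDF no longer
// trips an assertion inside consume(); otherwise consume a CRLF or a lone newline.
void Parser::consume_eol()
{
    if (m_reader.remaining() == 0)
        return;
    if (m_reader.matches("\r\n"))
        consume(2);
    else
        consume(1);
}
```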
Like Vector, ByteBuffer now signals allocation failure by returning an
ENOMEM Error instead of a bool, allowing us to use the TRY() and MUST()
patterns.
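A small example of the resulting call pattern (the helper function is hypothetical):

```cpp
#include <AK/ByteBuffer.h>
#include <AK/Error.h>

ErrorOr<ByteBuffer> make_scratch_buffer(size_t size)
{
    // create_uninitialized() now yields ErrorOr<ByteBuffer>; on allocation
    // failure, TRY() propagates the ENOMEM error to the caller.
    auto buffer = TRY(ByteBuffer::create_uninitialized(size));
    return buffer;
}
```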
Old situation:
- Object.h defines Object
- Object.h defines ArrayObject
- ArrayObject requires the definition of Object
- ArrayObject requires the definition of Value
- Value.h defines Value
- Value requires the definition of Object
Therefore, a file with the single line "#include <Value.h>" used to
raise compilation errors; certainly not something that one might expect
from a library.
This patch splits up the definitions in Object.h to break the cycle.
Now, Object.h only defines Object, Value.h still only defines Value (and
includes Object.h), and the new header ObjectDerivatives.h defines
ArrayObject (and includes both Object.h and Value.h).
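Roughly, the new layout looks like this (heavily simplified sketch; class bodies
and most members are omitted, and the names of members shown are placeholders):

```cpp
// Object.h -- now defines only the Object base class.
class Object : public RefCounted<Object> {
    // ...
};

// Value.h -- still defines only Value (and includes Object.h, since a
// Value can refer to an Object).
#include "Object.h"
class Value {
    // ... may hold a pointer to an Object ...
};

// ObjectDerivatives.h -- new header defining ArrayObject and friends;
// includes both of the headers above, breaking the cycle.
#include "Object.h"
#include "Value.h"
class ArrayObject final : public Object {
    Vector<Value> m_elements;
};
```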
At the very least, `Value::operator=` didn't properly unref the
`PDF::Object` it held. This class of problem goes away by just letting
`RefPtr` do its thing.
This patch increases LibPDF's memory consumption by 4 bytes (for the
other union members) per value.
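An illustrative before/after (simplified; not the exact LibPDF definition of
`Value`, and the union members shown are placeholders):

```cpp
// Sketch only. Before: a raw Object* lived in the union and every constructor,
// destructor and assignment operator had to ref()/unref() by hand, and
// operator= forgot to unref the previously held object.
class Value {
    // After: a RefPtr member outside the union; the compiler-generated special
    // member functions now manage the reference count correctly for free.
    RefPtr<Object> m_object;
    union {
        bool m_as_bool;
        int m_as_int;
        float m_as_float;
    }; // the non-object members stay in a union, hence the extra 4 bytes
};
```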
Our existing implementation did not check the element type of the other
pointer in the constructors and move assignment operators. This meant
that some operations that would require explicit casting on raw pointers
were done implicitly, such as:
- downcasting a base class to a derived class (e.g. `Kernel::Inode` =>
`Kernel::ProcFSDirectoryInode` in Kernel/ProcFS.cpp),
- casting to an unrelated type (e.g. `Promise<bool>` => `Promise<Empty>`
in LibIMAP/Client.cpp)
This, of course, allows gross violations of the type system, and obscures
the need to type-check before downcasting. Luckily, while adding the
`static_ptr_cast`s, only two truly incorrect usages were found; in the
other instances, our casts just needed to be made explicit.
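An illustrative example of the tightened rules (the classes are hypothetical;
`static_ptr_cast` is the AK helper used at the affected call sites):

```cpp
#include <AK/RefCounted.h>
#include <AK/RefPtr.h>

class Base : public RefCounted<Base> { };
class Derived : public Base { };

void example(RefPtr<Base> base)
{
    // Previously this implicit downcast compiled without complaint:
    // RefPtr<Derived> derived = base;
    // Now the conversion must be spelled out, so the (unchecked) downcast
    // is at least visible at the call site:
    RefPtr<Derived> derived = static_ptr_cast<Derived>(base);
    (void)derived;
}
```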
AK's version should see better inlining behavior than the LibM one. We
avoid mixed usage for now, though.
Also clean up some stale math includes and improper floating-point usage.
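For example (illustrative function, not from the patch), call sites switch from
the LibM symbols to the templated AK routines:

```cpp
#include <AK/Math.h>

// The AK templates are header-only and pick the overload from the argument type,
// so sqrtf()/sqrt() mix-ups and accidental double promotions go away.
float hypotenuse(float a, float b)
{
    return AK::sqrt(a * a + b * b);
}
```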
We now try to parse the first indirect value and see if it's the
`Linearization Parameter Dictionary`. If it's not, we fall back to
reading the xref table from the end of the document.
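A rough sketch of the control flow (the method names are illustrative, not
LibPDF's exact API):

```cpp
// Try the linearized path first; if the first indirect object is not a
// linearization parameter dictionary, read the xref table from the trailer
// at the end of the document, as before.
bool Parser::initialize_xref()
{
    if (parse_linearization_dictionary())
        return load_xref_from_linearized_layout();
    return load_xref_from_trailer();
}
```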
This isn't tested all that well, as the PDF I am testing with only uses
it for black (which is trivial). It can be tested further when LibPDF
is able to process more complex PDFs that actually use this color space
non-trivially.
This code isn't _actually_ used as of right now, but I wrote it at the
same time as all of the code in the previous commit. I realized after
I wrote it that these hint tables aren't super useful if the parser
already has access to the full file. However, this will be useful if
we ever want to stream PDFs from the web (and possibly view them in
the browser).
This is a big step, as most PDFs which are downloaded online will be
linearized. Pretty much the only difference is that the xref structure
is slightly different.
- A newline was assumed to follow the "stream" keyword, when it can also
  be a Windows-style line break (see the sketch below)
- Fix not consuming the "endobj" at the end of every indirect object
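A minimal sketch of the first fix (the reader helpers and the function name are
illustrative): after the "stream" keyword, accept either a lone line feed or a
Windows-style CRLF before the stream data begins.

```cpp
// Consume the end-of-line marker that follows the "stream" keyword. The PDF
// format allows either "\r\n" or "\n" here, but not a lone "\r".
void Parser::consume_stream_eol()
{
    if (m_reader.matches("\r\n"))
        m_reader.move_by(2);
    else if (m_reader.matches("\n"))
        m_reader.move_by(1);
}
```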