For flate and lzw filters, the data can be transformed by this
predictor function to make it compress better. For us this means that
we have to undo this step in order to get the right result.
Although this feature is meant for images, I found at least a few
documents that use it all over the place, making this step very
important.
Previously we would draw all text, no matter what font type, as
Liberation Serif, which results in things like ugly character spacing.
We now have partial support for drawing Type 1 glyphs, which are part of
a PostScript font program. We completely ignore hinting for now, which
results in ugly looking characters at low resolutions, but gain support
for a large number of typefaces, including most of the default fonts
used in TeX.
Since PDF version 1.5, a document may omit the xref table in favor of
a new kind of xref stream object. This is used to reference so-called
"compressed" objects that are part of an object stream.
With this patch we are able to parse this new kind of xref object, but
we'll have to implement object streams to use them correctly.
Security handlers manage encryption and decription of PDF files. The
standard security handler uses RC4/MD5 to perform its crypto (AES as
well, but that is not yet implemented).
This is a big step, as most PDFs which are downloaded online will be
linearized. Pretty much the only difference is that the xref structure
is slightly different.