Every TIFF container is composed of a main IFD. Some of its entries can
be pointers to sub-IFDs. We are now capable of exploring these
underlying structures. Note that we don't do anything with them yet.
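As a rough illustration of what exploring a sub-IFD involves, here is a
minimal Python sketch, not the decoder's actual code. It assumes a
little-endian file, reads only the four fields of each 12-byte entry,
and treats tag 34665 (the Exif IFD pointer) as the only pointer tag. The
`read_ifd` and `walk` helpers are hypothetical names.

```python
import struct

EXIF_IFD_TAG = 34665  # assumption: the only sub-IFD pointer this sketch follows

def read_ifd(data, offset):
    """Return (tag, type, count, value_or_offset) for each entry of one IFD."""
    (entry_count,) = struct.unpack_from("<H", data, offset)
    # Entries are 12 bytes each and start right after the 2-byte entry count.
    return [struct.unpack_from("<HHII", data, offset + 2 + 12 * i)
            for i in range(entry_count)]

def walk(data, offset, depth=0, seen=None):
    """Collect (depth, tag) pairs, following pointers into sub-IFDs."""
    seen = set() if seen is None else seen
    if offset in seen:  # guard against pointer loops in malformed files
        return []
    seen.add(offset)
    found = []
    for tag, _type, _count, value in read_ifd(data, offset):
        found.append((depth, tag))
        if tag == EXIF_IFD_TAG:
            found.extend(walk(data, value, depth + 1, seen))
    return found
```

Like the commit itself, the sketch only enumerates what it finds in the
sub-IFD and does nothing further with it.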
We previously considered Float and Double as unsupported types, but
this was done in a sneaky way, by letting them hit the default case in
the `read_type` method. So, when I ported this function to the
generator, these types started to flow into the system without proper
support there. Since 3124c161, we would crash on images containing tags
with a floating point value.
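To illustrate the difference between relying on a default case and
handling each type explicitly, here is a small Python sketch. It is not
the real `read_type`: the `FORMATS` table and `read_value` helper are
hypothetical, and only the type ids (11 for FLOAT, 12 for DOUBLE, and so
on) come from the TIFF 6.0 specification.

```python
import struct

# TIFF field type id -> struct format character (subset, for illustration)
FORMATS = {
    1: "B",   # BYTE
    3: "H",   # SHORT
    4: "I",   # LONG
    11: "f",  # FLOAT
    12: "d",  # DOUBLE
}

def read_value(data, offset, type_id, little_endian=True):
    """Decode one value, rejecting unknown types instead of falling through."""
    fmt = FORMATS.get(type_id)
    if fmt is None:
        raise ValueError(f"unsupported TIFF type {type_id}")
    byte_order = "<" if little_endian else ">"
    return struct.unpack_from(byte_order + fmt, data, offset)[0]
```

With an explicit table, a type that is not handled fails loudly at the
point where it is read rather than later in the pipeline.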
Support for JPEGs embedded in TIFF images was introduced with TIFF 6.0.
However, this implementation had major issues. It was so problematic
that they decided to reimplement it from scratch in 1995, three years
later. The two incarnations are obviously incompatible.
For more details see:
https://www.awaresystems.be/imaging/tiff/specification/TIFFTechNote2.txt
This tag is required by the specification, but some encoders (at least
Krita) don't write it for images with a single strip.
The test file was generated by opening deflate.tiff in Krita and saving
it with the DEFLATE compression.
Type 2 <=> One-dimensional Group3, customized for TIFF
Type 3 <=> Two-dimensional Group3, uses the original 1D internally
Type 4 <=> Two-dimensional Group4
To make it clear that this is not Group3 1D but the TIFF variant, let's
use the name libtiff gives it, `CCITTRLE`, and stick with it to avoid
confusion.
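The mapping above could be written down as an enum, sketched here in
Python. Only the numeric values and the `CCITTRLE` name come from the
text; the other member names are assumptions loosely modelled on
libtiff's constants.

```python
from enum import IntEnum

class Compression(IntEnum):
    NoCompression = 1  # assumed name
    CCITTRLE = 2       # TIFF-specific variant of one-dimensional Group3
    Group3Fax = 3      # assumed name; two-dimensional Group3
    Group4Fax = 4      # assumed name; two-dimensional Group4
```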
Some tags have a default value. We should return this value from
Metadata's getters when no value has been read from the input file.
Note that we don't support default values for tags with a count bigger
than one.
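A minimal sketch of this fallback, in Python rather than the generated
C++. The `Metadata` class below is a hypothetical stand-in; the defaults
listed are the ones given by the TIFF 6.0 specification for these
single-value tags.

```python
class Metadata:
    # Defaults for tags with a count of one, as listed in TIFF 6.0.
    DEFAULTS = {"Compression": 1, "SamplesPerPixel": 1, "ResolutionUnit": 2}

    def __init__(self):
        self._values = {}

    def set(self, name, value):
        self._values[name] = value

    def get(self, name):
        """Return the value read from the file, else the default, else None."""
        if name in self._values:
            return self._values[name]
        return self.DEFAULTS.get(name)
```

A tag without a default, such as ImageWidth, still yields `None` when it
is absent from the file.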
We were previously only checking the first value. This is wrong for tags
that accept multiple values (e.g. ExtraSamples) and can lead to crashes
on malformed images containing tags with a count of 0.
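The two approaches can be contrasted with a short Python sketch. Both
helpers are hypothetical; the allowed ExtraSamples values (0, 1 and 2)
are the ones defined by the TIFF 6.0 specification.

```python
EXTRA_SAMPLES_ALLOWED = {0, 1, 2}

def first_value_is_valid(values, allowed):
    # Old approach: ignores later values and raises IndexError on a count of 0.
    return values[0] in allowed

def all_values_are_valid(values, allowed):
    # New approach: checks every value and is safe on an empty list.
    return all(value in allowed for value in values)
```

Note that `all` returns `True` for an empty list, so whether a count of 0
should itself be rejected remains a separate decision.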
The TIFF spec is constructed in a way that many tags are defined in
multiple places but some of these definitions are partial. If we look
into "Section 8: Baseline Field Reference Guide", we can see that these
tags indeed have an enforced length of 1.
This allows us to reject invalid images before trying to decode them.
The spec requires more tags to be present[1], but as we don't use them
for decoding, I don't see the point.
[1] - XResolution, YResolution and ResolutionUnit
TIFF images with the PhotometricInterpretation tag set to RGBPalette are
based on indexed colors instead of explicitly describing the color for
each pixel. Let's add support for them.
The test case was generated with GIMP using the Indexed image mode after
adding an alpha layer. Not all decoders are able to open this image, but
GIMP can.
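For reference, resolving an indexed pixel through the ColorMap tag can
be sketched as follows. The helper name is hypothetical and the
conversion to 8 bits by dropping the low byte is only one possible
choice; the layout (all reds, then all greens, then all blues, as 16-bit
values) is the one described in the TIFF 6.0 specification.

```python
def palette_color(color_map, bits_per_sample, index):
    """Look up an indexed pixel and return it as an 8-bit (r, g, b) tuple."""
    entries = 1 << bits_per_sample
    if len(color_map) != 3 * entries:
        raise ValueError("ColorMap must hold 3 * 2**BitsPerSample values")
    if not 0 <= index < entries:
        raise ValueError("palette index out of range")
    red = color_map[index]
    green = color_map[entries + index]
    blue = color_map[2 * entries + index]
    # ColorMap values span 0..65535; keep the high byte for 8-bit output.
    return (red >> 8, green >> 8, blue >> 8)
```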
These tags are either part of the baseline specification or included by
default by GIMP when exporting TIFF files. Note that we don't add
support for them in the decoder yet. This commit only allows us to parse
the metadata and display it gracefully.
When the `TIFF_DEBUG` flag is set, the TIFF decoder logs every tag and
its value. This is already useful but requires the developer to have
the spec handy in order to decode the meaning of each value. None of
this information is available at runtime, but it is known by the Python
generator. So by generating these debug logs, we drastically increase
their value.
As a bonus, most of these functions should be useful when we display
image metadata in Serenity.
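The idea of generating value-to-name helpers can be sketched like this.
Everything about the emitted C++ (the `name_for` function, its
signature, the `"Unknown"` fallback) is made up for illustration and is
not the generator's real output.

```python
from enum import Enum

class Compression(Enum):
    NoCompression = 1
    CCITTRLE = 2

def generate_name_function(enum_type):
    """Emit a C++-like function mapping each enum value to its name."""
    lines = [
        f"StringView name_for({enum_type.__name__} value)",
        "{",
        "    switch (static_cast<int>(value)) {",
    ]
    for member in enum_type:
        lines.append(f"    case {member.value}:")
        lines.append(f'        return "{member.name}"sv;')
    lines += ["    default:", '        return "Unknown"sv;', "    }", "}"]
    return "\n".join(lines)
```

Because the names live in the generator's Python enums, the debug output
can show them without shipping a copy of the spec tables by hand.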
The `TIFFType` enum is exported with a different name to C++. This
change of behavior was handled by manually setting the parameter of a
function. However, we will soon need the exported name in more places,
so let's make it a property of the Enum itself.
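One way to attach the exported name to the enum itself is sketched
below. The class and method names, and the exported name `"Type"`, are
assumptions made for the example.

```python
from enum import Enum

class EnumWithExportName(Enum):
    @classmethod
    def export_name(cls):
        # By default, an enum is exported under its Python class name.
        return cls.__name__

class TIFFType(EnumWithExportName):
    @classmethod
    def export_name(cls):
        # Hypothetical override: exported to C++ under a different name.
        return "Type"

    Byte = 1
    ASCII = 2
    UnsignedShort = 3

class Compression(EnumWithExportName):
    NoCompression = 1
    CCITTRLE = 2
```

Callers can then ask any enum for `export_name()` instead of receiving
the name through a function parameter.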
Only specific numbers of values should be allowed, but let's accept
everything for now and add these checks once the generator is more
mature.
This tag type is a bit different, even though it fits the general
definition given in the TIFF specification, that is, the value is of
one specified type multiplied by a known count. Having a
`Vector<Variant<u8, ...>>` would be very painful to use, so let's
deviate a bit from the normal way and use a `ByteBuffer` directly
instead of this complicated type.
This will allow us to generate code that handles and provides easy
access to metadata stored in TIFF tags. The generator is a Python
script, and it outputs both the TIFFMetadata.h and TIFFTagHandler.cpp
files.
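To give an idea of the approach, here is a tiny Python sketch that turns
a tag description into a C++ getter declaration. The `Tag` tuple, the
helper names and the shape of the emitted declaration are all
hypothetical; only the tag ids (256 for ImageWidth, 257 for ImageLength)
come from the TIFF specification.

```python
import re
from collections import namedtuple

Tag = namedtuple("Tag", ["id", "cpp_type", "name"])

TAGS = [
    Tag(256, "u32", "ImageWidth"),
    Tag(257, "u32", "ImageLength"),
]

def to_snake_case(name):
    # Insert an underscore before every capital letter except the first.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()

def generate_getter(tag):
    """Emit a C++-like getter declaration for one tag."""
    return f"Optional<{tag.cpp_type}> {to_snake_case(tag.name)}() const;"
```

Adding a tag then amounts to adding one entry to the table instead of
writing the accessor and the parsing code by hand.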
The generator will definitely need some updates to support all TIFF and
EXIF tags, but that will still be easier than writing everything
ourselves.
Some small modifications are needed in TIFFLoader.cpp to make it
compatible with the new `Metadata` class.