Some apps seem to generate malformed images that are accepted
by most readers. We now only throw if malformed data would lead to
a write outside the chunky buffer.
ExifOrientedBitmap was implemented before the introduction of the TIFF
decoder. So we had to provide a definition of the Orientation enum. Now
that we have a TIFF implementation that comes with some enum
definitions, we should prefer this source.
When using the BMP encoding, ICO images are expected to contain a 1-bit
mask for transparency. Regardless an alpha channel is already included
in the image, the mask is always required. As stated here[1], the
mask is used to provide shadow around the image.
Unfortunately, it seems that some encoder do not include that second
transparency mask. So let's read that mask only if some data is still
remaining after decoding the image.
The test case has been generated by truncating the 64 last bytes
(originally dedicated to the mask) from the `serenity.ico` file and
changing the declared size of the image in the ICO header. The size
value is stored at the offset 0x0E in the file and I changed the value
from 0x0468 to 0x0428.
[1]: https://devblogs.microsoft.com/oldnewthing/20101021-00/?p=12483
This fixes an issue where GIF images without a global color table would
have the first segment incorrectly interpreted as color table data.
Makes many more screenshots appear on https://virtuallyfun.com/ :^)
TIFF files are made in a way that make them easily extendable and over
the years people have made sure to exploit that. In other words, it's
easy to find images with non-standard tags. Instead of returning an
error for that, let's skip them.
Note that we need to make sure to realign the reading head in the file.
The test case was originally a 10x10 checkerboard image with required
tags, and also the `DocumentName` tag. Then, I modified this tag in a
hexadecimal editor and replaced its id with 30 000 (0x3075 as a LE u16)
and the type with the same value as well. This is AFAIK, never used as
a custom TIFF tag, so this should remain an invalid tag id and type.
The decoder assumes that k's sampling factor matches y's at the moment.
Better to error out than to silently render something broken.
For ycck, covered by ycck-2111.jpg in the tests.
We currently assume that the K (black) channel uses the same sampling
as the Y channel already, so this already works as long as we don't
error out on it.
This allows us to reject invalid images before trying to decode them.
The spec requires more tag to be present[1] but as we don't use them for
decoding I don't see the point.
[1] - XResolution, YResolution and ResolutionUnit
TIFF images with the PhotometricInterpretation tag set to RGBPalette are
based on indexed colors instead of explicitly describing the color for
each pixel. Let's add support for them.
The test case was generated with GIMP using the Indexed image mode after
adding an alpha layer. Not all decoders are able to open this image, but
GIMP can.
UnassociatedAlpha is the one used by GIMP when generating TIFF images
with transparency. Support is added for Grayscale and RGB images as it's
the two that we support right now but managing transparency should be
really straightforward for other types as well.
As per the specification, TIFF readers should gracefully skip samples
that they are not able to interpret.
This patch allow us to read `strike.tif` from the libtiff test suite as
an RGB image.
The number of samples is not a good measure to deduce the type of image
we are decoding. As per the TIFF spec, the PhotometricInterpretation tag
is required and we should use that instead.
This compression (tag Compression=2) is not very popular on its own, but
a base to implement CCITT3 2D and CCITT4 compressions.
As the format has no real benefits, it is quite hard to find an app that
accepts tho encode that for you. So I used the following program that
calls `libtiff` directly:
```cpp
#include <vector>
#include <cstdlib>
#include <iostream>
#include <tiffio.h>
// An array containing 0 and 1 of length width * height.
extern std::vector<uint8_t> array;
int main() {
// From: https://stackoverflow.com/a/34257789
TIFF *image = TIFFOpen("input.tif", "w");
int const width = 400;
int const height = 300;
TIFFSetField(image, TIFFTAG_IMAGEWIDTH, width);
TIFFSetField(image, TIFFTAG_IMAGELENGTH, height);
TIFFSetField(image, TIFFTAG_PHOTOMETRIC, 0);
TIFFSetField(image, TIFFTAG_COMPRESSION, COMPRESSION_CCITTRLE);
TIFFSetField(image, TIFFTAG_BITSPERSAMPLE, 1);
TIFFSetField(image, TIFFTAG_SAMPLESPERPIXEL, 1);
TIFFSetField(image, TIFFTAG_ROWSPERSTRIP, 1);
std::vector<uint8_t> scan_line(width / 8 + 8, 0);
int count = 0;
for (int i = 0; i < height; i++) {
std::fill(scan_line.begin(), scan_line.end(), 0);
for (int x = 0; x < width; ++x) {
uint8_t eight_pixels = scan_line.at(x / 8);
eight_pixels = eight_pixels << 1;
eight_pixels |= !array.at(i * width + x);
scan_line.at(x / 8) = eight_pixels;
}
int bytes = int(width / 8.0 + 0.5);
if (TIFFWriteScanline(image, scan_line.data(), i, bytes) != 1)
std::cerr << "Something went wrong\n";
}
TIFFClose(image);
}
```
As pointed out by @nico, while doing a right-shift to downscale is fine,
a left-shift to upscale gives wrong results. As an example, imagine a 2-
bits value containing 3, left-shifting it would give 192 instead of 255.
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).
This commit is auto-generated:
$ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
Meta Ports Ladybird Tests Kernel)
$ perl -pie 's/\bDeprecatedString\b/ByteString/g;
s/deprecated_string/byte_string/g' $xs
$ clang-format --style=file -i \
$(git diff --name-only | grep \.cpp\|\.h)
$ gn format $(git ls-files '*.gn' '*.gni')
When the `TIFF_DEBUG` flag is set, the TIFF decoder logs every tag and
their values. This is already useful but require the developer to have
the spec handy in order to decrypt each value to its signification. None
of this information is available at runtime, but this is known by the
Python generator. So by generating these debug logs, we drastically
increase their value.
As a bonus point, most of these functions should be useful when we will
display image's metadata in Serenity.
Let's make the "read a sample" part independent of the decoder. That
will soon allow us to read samples based on the image's parameter
without duplicating the code for every decoder.
This change limits the amount of memory that is initially allocated for
the color table. This prevents an OOM condition if the file contains an
incorrect color table size.
This change doesn't change much on its own. The idea behind this
refactoring is to separate the sample reading from the decoding step.
The decoder returning data byte per byte was fine as we only support
8 bits images, but this assumption won't hold for a long time. So let's
decode everything beforehand and strictly partition the sample reading
code somewhere else.
This tag type is a bit different as even if it fits in the general
definition given in the TIFF specification. That is the value will be of
one specified type multiplied by a known count. Having a
`Vector<Variant<u8, ...>>` will be very painful to use. So let's deviate
a bit from the normal way and use a `ByteBuffer` directly instead this
complicated type.
This will allow us to generate code that handle and provide easy access
to metadata stored in TIFF's tags. The generator is a Python script, and
it output both TIFFMetadata.h and TIFFTagHandler.cpp files.
The generator will definitely need some update to support all TIFF and
EXIF tags, but that will still be easier than writing everything
ourselves.
Some small modifications are needed in TIFFLoader.cpp to make it
compatible with the new `Metadata` class.