TIFF images with the PhotometricInterpretation tag set to RGBPalette are
based on indexed colors instead of explicitly describing the color for
each pixel. Let's add support for them.
The test case was generated with GIMP using the Indexed image mode after
adding an alpha layer. Not all decoders are able to open this image, but
GIMP can.
In a bunch of cases, this actually ends up simplifying the code as
to_number will handle something such as:
```
Optional<I> opt;
if constexpr (IsSigned<I>)
opt = view.to_int<I>();
else
opt = view.to_uint<I>();
```
For us.
The main goal here however is to have a single generic number conversion
API between all of the String classes.
UnassociatedAlpha is the one used by GIMP when generating TIFF images
with transparency. Support is added for Grayscale and RGB images as it's
the two that we support right now but managing transparency should be
really straightforward for other types as well.
As per the specification, TIFF readers should gracefully skip samples
that they are not able to interpret.
This patch allow us to read `strike.tif` from the libtiff test suite as
an RGB image.
The number of samples is not a good measure to deduce the type of image
we are decoding. As per the TIFF spec, the PhotometricInterpretation tag
is required and we should use that instead.
This changes the splitting to use a stack, which ensures the resulting
line segments follow the path in order. This will be important for SVG
`<textPath>`s which place text along a path.
This compression (tag Compression=2) is not very popular on its own, but
a base to implement CCITT3 2D and CCITT4 compressions.
As the format has no real benefits, it is quite hard to find an app that
accepts tho encode that for you. So I used the following program that
calls `libtiff` directly:
```cpp
#include <vector>
#include <cstdlib>
#include <iostream>
#include <tiffio.h>
// An array containing 0 and 1 of length width * height.
extern std::vector<uint8_t> array;
int main() {
// From: https://stackoverflow.com/a/34257789
TIFF *image = TIFFOpen("input.tif", "w");
int const width = 400;
int const height = 300;
TIFFSetField(image, TIFFTAG_IMAGEWIDTH, width);
TIFFSetField(image, TIFFTAG_IMAGELENGTH, height);
TIFFSetField(image, TIFFTAG_PHOTOMETRIC, 0);
TIFFSetField(image, TIFFTAG_COMPRESSION, COMPRESSION_CCITTRLE);
TIFFSetField(image, TIFFTAG_BITSPERSAMPLE, 1);
TIFFSetField(image, TIFFTAG_SAMPLESPERPIXEL, 1);
TIFFSetField(image, TIFFTAG_ROWSPERSTRIP, 1);
std::vector<uint8_t> scan_line(width / 8 + 8, 0);
int count = 0;
for (int i = 0; i < height; i++) {
std::fill(scan_line.begin(), scan_line.end(), 0);
for (int x = 0; x < width; ++x) {
uint8_t eight_pixels = scan_line.at(x / 8);
eight_pixels = eight_pixels << 1;
eight_pixels |= !array.at(i * width + x);
scan_line.at(x / 8) = eight_pixels;
}
int bytes = int(width / 8.0 + 0.5);
if (TIFFWriteScanline(image, scan_line.data(), i, bytes) != 1)
std::cerr << "Something went wrong\n";
}
TIFFClose(image);
}
```
As pointed out by @nico, while doing a right-shift to downscale is fine,
a left-shift to upscale gives wrong results. As an example, imagine a 2-
bits value containing 3, left-shifting it would give 192 instead of 255.
These tags are either part of the baseline specification or included by
default by GIMP when exporting TIFF files. Note that we don't add
support for them in the decoder yet. This commit only allows us to parse
the metadata and display it gracefully.
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).
This commit is auto-generated:
$ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
Meta Ports Ladybird Tests Kernel)
$ perl -pie 's/\bDeprecatedString\b/ByteString/g;
s/deprecated_string/byte_string/g' $xs
$ clang-format --style=file -i \
$(git diff --name-only | grep \.cpp\|\.h)
$ gn format $(git ls-files '*.gn' '*.gni')
According to the CSS font matching algorithm specification, it is
supposed to be executed for each glyph instead of each text run, as is
currently done. This change partially implements this by having the
font matching algorithm produce a list of fonts against which each
glyph will be tested to find its suitable font.
Now, it becomes possible to have per-glyph fallback fonts: if the
needed glyph is not present in a font, we can check the subsequent
fonts in the list.
RepeatingBitmapPaintStyle tiles a bitmap passed on passed in parameters
and paints according to that.
OffsetPaintStyle offsets an existing paint style and paints with that.
These two will be useful in the PDF renderer.
These properties could be cached in the font object once they are
decoded from the table for the first time to make subsequent access
faster.
This change makes `Node::scaled_font()` in LibWeb go down from 6% to
1.5% in my profiles because building a font cache key is now a lot
cheaper.
When the `TIFF_DEBUG` flag is set, the TIFF decoder logs every tag and
their values. This is already useful but require the developer to have
the spec handy in order to decrypt each value to its signification. None
of this information is available at runtime, but this is known by the
Python generator. So by generating these debug logs, we drastically
increase their value.
As a bonus point, most of these functions should be useful when we will
display image's metadata in Serenity.
The `TIFFType` enum is exported with a different name to C++. This
change of behavior was handled by manually setting the parameter of a
function. However, we will soon need the exported name in more places,
so let's make it a property of the Enum itself.
Only some specific number of values should be allowed, but let's accept
everything for now and add these checks when the generator will be more
mature.
Let's make the "read a sample" part independent of the decoder. That
will soon allow us to read samples based on the image's parameter
without duplicating the code for every decoder.
mft1 and mft2 tags are very similar. The only difference is that
mft1 uses an u8 lookup table, while mft2 uses a u16 lookup table.
This means their PCS lookup encodings are different, and mft2 uses a
PCSLAB encoding that's different from other places in the v4 spec.
If one profile uses PCSXYZ and the other PCSLAB as connection space,
we now do the necessary XYZ/LAB conversion.
With this and the previous commits, we can now convert from profiles
that use PCSLAB with mAB, such as stress.jpeg from
https://littlecms.com/blog/2020/09/09/browser-check/ :
% Build/lagom/icc --name sRGB --reencode-to serenity-sRGB.icc
% Build/lagom/bin/image -o out.png \
--convert-to-color-profile serenity-sRGB.icc \
~/src/jpegfiles/stress.jpeg
ICC profiles work by transforming from the input color space
(one of many: RGB, CMYK, YUV, etc) to a "profile connection space" (PCS)
and then from there to the output color space.
However, there's not one but two possible profile connection spaces,
PCSXYZ and PCSLAB. The matrix/curve tags can only be used with PCSXYZ,
but the mAB, mBA, mft1, mft2 tags can be used with PCSLAB as well.
The PCSLAB encoding has L going from 0 to 100 and ab from -128 to 127,
instead of from 0 to 1. So they need to be scaled up at the end.
That's also the reason for the "mystery conversion factor": PCSXYZ
doesn't go from 0 to 1 either, but from 0 to 65535/32768, per ICC v4
6.3.4.2 General PCS encoding, Table 11 - PCSXYZ X, Y or Z encoding.
Between input and output are various curves (and the CLUT) that
have domain and range of 0..1. For these, the color has to be linearly
scaled to 0..1 before the curve and back to the actual range after
the curve. Doing that back-to-back is a no-op, so scaling back at
the very end is sufficient.
We will need to use ColorSpace in TagTypes.h, and it can't include
Profile.h.
Also makes Profile.cpp a bit smaller.
No behavior change, pure code move.
`lerp_nd()` is very similar to PDF::SampleFunction::evaluate(). But we
know that the result is a FloatVector3 in the ICC code (at least for
now), so we can save a bunch of redundant computation by returning
all three channels of the LUT at once.
This is enough for images using mAB with A curve / CLUT if the
profile connecting space is PCSXYZ, such as for Upper_Right.jpg
from https://www.color.org/version4html.xalter like so:
% Build/lagom/icc --name sRGB --reencode-to serenity-sRGB.icc
% Build/lagom/bin/image -o out.png \
--convert-to-color-profile serenity-sRGB.icc \
~/Downloads/Upper_Right.jpg
Previously, we determined the positions of glyphs for each text run at
the time of painting, which constituted a significant portion of the
painting process according to profiles. However, since we already go
through each glyph to figure out the width of each fragment during
layout, we can simultaneously gather data about the position of each
glyph in the layout phase and utilize this information in the painting
phase.
I had to update expectations for a couple of reference tests. These
updates are due to the fact that we now measure glyph positions during
layout using a 1x font, and then linearly scale each glyph's position
to device pixels during painting. This approach should be acceptable,
considering we measure a fragment's width and height with an unscaled
font during layout.