Only some specific values should be allowed, but let's accept
everything for now and add these checks once the generator is more
mature.
Let's make the "read a sample" part independent of the decoder. That
will soon allow us to read samples based on the image's parameters
without duplicating the code for every decoder.
mft1 and mft2 tags are very similar. The only difference is that
mft1 uses a u8 lookup table, while mft2 uses a u16 lookup table.
This means their PCS lookup encodings are different, and mft2 uses a
PCSLAB encoding that's different from other places in the v4 spec.
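For illustration, the two table widths normalize differently; here is
a rough sketch (helper names are made up, and the legacy PCSLAB
numbers are my reading of the v2-style 16-bit encoding, not a quote
of the implementation):

#include <AK/Types.h>

// mft1: 8-bit table entries, normalized over the full u8 range.
float lut8_entry_to_float(u8 value) { return value / 255.0f; }

// mft2: 16-bit table entries, normalized over the full u16 range.
float lut16_entry_to_float(u16 value) { return value / 65535.0f; }

// mft2's legacy 16-bit PCSLAB encoding treats 0xFF00 as full scale.
float legacy_lab_l(u16 value) { return value * 100.0f / 65280.0f; }
float legacy_lab_ab(u16 value) { return value / 256.0f - 128.0f; }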
If one profile uses PCSXYZ and the other PCSLAB as connection space,
we now do the necessary XYZ/LAB conversion.
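For reference, the XYZ -> LAB direction is the standard CIE formula
relative to the PCS white point (D50); a minimal sketch, with types
and names that are illustrative rather than the actual LibGfx code:

#include <math.h>

struct XYZ { float x, y, z; };
struct Lab { float L, a, b; };

// ICC PCS illuminant (D50).
static constexpr float pcs_whitepoint[3] = { 0.9642f, 1.0f, 0.8249f };

static float lab_f(float t)
{
    constexpr float delta = 6.0f / 29.0f;
    if (t > delta * delta * delta)
        return cbrtf(t);
    return t / (3 * delta * delta) + 4.0f / 29.0f;
}

Lab xyz_to_lab(XYZ const& xyz)
{
    float fx = lab_f(xyz.x / pcs_whitepoint[0]);
    float fy = lab_f(xyz.y / pcs_whitepoint[1]);
    float fz = lab_f(xyz.z / pcs_whitepoint[2]);
    return { 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz) };
}

The LAB -> XYZ direction is just the inverse of this.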
With this and the previous commits, we can now convert from profiles
that use PCSLAB with mAB, such as stress.jpeg from
https://littlecms.com/blog/2020/09/09/browser-check/ :
% Build/lagom/icc --name sRGB --reencode-to serenity-sRGB.icc
% Build/lagom/bin/image -o out.png \
--convert-to-color-profile serenity-sRGB.icc \
~/src/jpegfiles/stress.jpeg
ICC profiles work by transforming from the input color space
(one of many: RGB, CMYK, YUV, etc) to a "profile connection space" (PCS)
and then from there to the output color space.
However, there are not one but two possible profile connection spaces,
PCSXYZ and PCSLAB. The matrix/curve tags can only be used with PCSXYZ,
but the mAB, mBA, mft1, mft2 tags can be used with PCSLAB as well.
The PCSLAB encoding has L going from 0 to 100 and ab from -128 to 127,
instead of from 0 to 1. So they need to be scaled up at the end.
That's also the reason for the "mystery conversion factor": PCSXYZ
doesn't go from 0 to 1 either, but from 0 to 65535/32768, per ICC v4
6.3.4.2 General PCS encoding, Table 11 - PCSXYZ X, Y or Z encoding.
Between input and output are various curves (and the CLUT) that
have domain and range of 0..1. For these, the color has to be linearly
scaled to 0..1 before the curve and back to the actual range after
the curve. Doing that back-to-back is a no-op, so scaling back at
the very end is sufficient.
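A sketch of that final scaling step (illustrative only, not the
actual code):

// Scale a color from the curves' 0..1 range to the actual PCS range.
void scale_to_pcs_range(float color[3], bool pcs_is_lab)
{
    if (pcs_is_lab) {
        color[0] = color[0] * 100.0f;           // L*: 0..100
        color[1] = color[1] * 255.0f - 128.0f;  // a*: -128..127
        color[2] = color[2] * 255.0f - 128.0f;  // b*: -128..127
    } else {
        for (int i = 0; i < 3; ++i)
            color[i] = color[i] * 65535.0f / 32768.0f; // PCSXYZ: 0..65535/32768
    }
}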
We will need to use ColorSpace in TagTypes.h, and it can't include
Profile.h.
Also makes Profile.cpp a bit smaller.
No behavior change, pure code move.
`lerp_nd()` is very similar to PDF::SampleFunction::evaluate(). But we
know that the result is a FloatVector3 in the ICC code (at least for
now), so we can save a bunch of redundant computation by returning
all three channels of the LUT at once.
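The rough shape of the interpolation, as a sketch (names, types and
the sampling callback are made up here, and grid sizes of at least 2
per axis are assumed):

#include <AK/StdLibExtras.h>
#include <AK/Vector.h>

struct Float3 { float channels[3]; };

// n-linear interpolation over a CLUT that stores 3 output channels per
// grid point; `sample` returns the channels at integer grid coordinates.
template<typename SampleFunction>
Float3 lerp_nd_sketch(Vector<unsigned> const& grid_sizes,
    Vector<float> const& x, SampleFunction const& sample)
{
    size_t n = x.size();
    Vector<unsigned> base;
    Vector<float> fraction;
    for (size_t i = 0; i < n; ++i) {
        float scaled = x[i] * (grid_sizes[i] - 1);
        unsigned low = min(static_cast<unsigned>(scaled), grid_sizes[i] - 2);
        base.append(low);
        fraction.append(scaled - low);
    }

    Float3 result {};
    // Visit all 2^n surrounding grid points, weight each by the product
    // of the per-axis fractions, and accumulate all three channels at
    // once instead of running the interpolation once per channel.
    for (unsigned corner = 0; corner < (1u << n); ++corner) {
        float weight = 1;
        Vector<unsigned> indices;
        for (size_t i = 0; i < n; ++i) {
            bool use_upper = (corner >> i) & 1;
            weight *= use_upper ? fraction[i] : 1 - fraction[i];
            indices.append(base[i] + (use_upper ? 1 : 0));
        }
        Float3 corner_value = sample(indices);
        for (int channel = 0; channel < 3; ++channel)
            result.channels[channel] += weight * corner_value.channels[channel];
    }
    return result;
}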
This is enough for images using mAB with A curve / CLUT if the
profile connection space is PCSXYZ, such as for Upper_Right.jpg
from https://www.color.org/version4html.xalter like so:
% Build/lagom/icc --name sRGB --reencode-to serenity-sRGB.icc
% Build/lagom/bin/image -o out.png \
--convert-to-color-profile serenity-sRGB.icc \
~/Downloads/Upper_Right.jpg
Previously, we determined the positions of glyphs for each text run at
the time of painting, which constituted a significant portion of the
painting process according to profiles. However, since we already go
through each glyph to figure out the width of each fragment during
layout, we can simultaneously gather data about the position of each
glyph in the layout phase and utilize this information in the painting
phase.
I had to update expectations for a couple of reference tests. These
updates are needed because we now measure glyph positions during
layout using a 1x font, and then linearly scale each glyph's position
to device pixels during painting. This approach should be acceptable,
considering we measure a fragment's width and height with an unscaled
font during layout.
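Put differently, as a sketch (made-up names, not the actual LibWeb
types): positions are recorded against the unscaled font during
layout and only multiplied up to device pixels while painting.

#include <AK/Types.h>

struct GlyphPositionSketch {
    float x { 0 };
    float y { 0 };
    u32 glyph_id { 0 };
};

GlyphPositionSketch to_device_pixels(GlyphPositionSketch glyph,
    float device_pixels_per_css_pixel)
{
    glyph.x *= device_pixels_per_css_pixel;
    glyph.y *= device_pixels_per_css_pixel;
    return glyph;
}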
Our previous check was not sufficient: it merely checked that the
first byte of the EncodingRecord offset is within range, while the
actual read is 4 bytes wide.
Fixes ossfuzz-64165.
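The shape of the stricter check, as a sketch (names are made up; the
point is that the whole 4-byte read has to fit, written so that the
check itself can't overflow):

#include <AK/Types.h>

bool encoding_record_offset_is_valid(size_t offset, size_t data_size)
{
    // Require the entire 4-byte value to be inside the buffer, not just
    // its first byte.
    return offset <= data_size && data_size - offset >= 4;
}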
This change limits the amount of memory that is initially allocated for
the color table. This prevents an OOM condition if the file contains an
incorrect color table size.
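A sketch of the idea (the cap value and the names are illustrative,
not the actual numbers): never trust the declared size for the
up-front allocation, cap it, and let the table grow as entries are
actually decoded.

#include <AK/StdLibExtras.h>
#include <AK/Vector.h>
#include <LibGfx/Color.h>

static constexpr size_t max_initial_color_table_capacity = 1024;

Vector<Gfx::Color> make_color_table_sketch(size_t declared_color_count)
{
    Vector<Gfx::Color> color_table;
    color_table.ensure_capacity(
        min(declared_color_count, max_initial_color_table_capacity));
    // Entries are then appended one by one as they are actually read.
    return color_table;
}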
This change doesn't do much on its own. The idea behind this
refactoring is to separate the sample reading from the decoding step.
The decoder returning data byte by byte was fine while we only
supported 8-bit images, but this assumption won't hold for much
longer. So let's decode everything beforehand and strictly partition
the sample reading code somewhere else.
This tag type is a bit different, even though it fits the general
definition given in the TIFF specification, namely that a value
consists of a known count of elements of one specified type. Having a
`Vector<Variant<u8, ...>>` would be very painful to use, so let's
deviate a bit from the usual scheme and use a `ByteBuffer` directly
instead of this complicated type.
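For contrast, a sketch of the two representations (simplified types,
made-up names):

#include <AK/ByteBuffer.h>
#include <AK/Variant.h>
#include <AK/Vector.h>

// The generic representation: every element access goes through a
// Variant, even though this tag type is really just a blob of bytes.
using GenericTagValue = Vector<Variant<u8, u16, u32>>;

// What this commit uses for this tag type instead: plain bytes.
using UndefinedTagValue = ByteBuffer;

u8 first_byte_of(UndefinedTagValue const& value) { return value[0]; }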
This will allow us to generate code that handles and provides easy
access to metadata stored in TIFF's tags. The generator is a Python
script, and it outputs both the TIFFMetadata.h and TIFFTagHandler.cpp
files.
The generator will definitely need some updates to support all TIFF and
EXIF tags, but that will still be easier than writing everything
ourselves.
Some small modifications are needed in TIFFLoader.cpp to make it
compatible with the new `Metadata` class.
Before this change, we used Gfx::Bitmap to represent both decoded
images that are not going to be mutated and bitmaps corresponding
to canvases that could be mutated.
This change introduces a wrapper for bitmaps that are not going to be
mutated, so that the painter can do caching: texture caching in the
case of the GPU painter, and potentially scaled-bitmap caching in the
case of the CPU painter.
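A rough sketch of what such a wrapper can look like (names and
details here are illustrative, not the actual class):

#include <AK/NonnullRefPtr.h>
#include <AK/RefCounted.h>
#include <AK/StdLibExtras.h>
#include <LibGfx/Bitmap.h>

class ImmutableBitmapSketch : public RefCounted<ImmutableBitmapSketch> {
public:
    static NonnullRefPtr<ImmutableBitmapSketch> create(NonnullRefPtr<Gfx::Bitmap> bitmap)
    {
        return adopt_ref(*new ImmutableBitmapSketch(move(bitmap)));
    }

    // Only const access is handed out, so painters can key caches
    // (e.g. GPU textures) on the wrapper's identity.
    Gfx::Bitmap const& bitmap() const { return *m_bitmap; }

private:
    explicit ImmutableBitmapSketch(NonnullRefPtr<Gfx::Bitmap> bitmap)
        : m_bitmap(move(bitmap))
    {
    }

    NonnullRefPtr<Gfx::Bitmap> m_bitmap;
};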
The one deviation from the spec here is to use this in the WOFF
TableDirectoryEntry's tag field. However, *not* making that a Tag made
a lot of things more complicated than they need to be.
A few small changes that didn't seem to deserve separate commits (a
rough sketch of the resulting type follows the list):
- Mark it as packed to remove compiler complaints when it's a member of
a packed struct.
- Add a default constructor for places where we fill in a struct
gradually.
- Restrict the constructor to exactly 4-character string literals.
- Add a to_u32() method for the one place that needs that.
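A rough sketch of the resulting type under those constraints
(illustrative, not the actual implementation):

#include <AK/Types.h>

class [[gnu::packed]] TagSketch {
public:
    TagSketch() = default;

    // Only exactly 4-character string literals ("cmap", "glyf", ...)
    // are accepted, checked at compile time.
    consteval TagSketch(char const (&characters)[5])
        : m_bytes { static_cast<u8>(characters[0]), static_cast<u8>(characters[1]),
            static_cast<u8>(characters[2]), static_cast<u8>(characters[3]) }
    {
    }

    u32 to_u32() const
    {
        return static_cast<u32>(m_bytes[0]) << 24 | static_cast<u32>(m_bytes[1]) << 16
            | static_cast<u32>(m_bytes[2]) << 8 | m_bytes[3];
    }

private:
    u8 m_bytes[4] {};
};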
SMasks are greyscale images that get used as the alpha channel for a
different image.
JPEGs in PDFs are stored as streams with /DCTDecode filters, and
we have a separate code path for loading those in the PDF renderer.
That code path just calls our JPEG decoder, which creates bitmaps
with format BGRx8888.
So when we process an SMask for such a bitmap, we have to change
the bitmap's format to BGRA8888 in addition to setting alpha values
on all pixels.
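A sketch of the per-pixel part (made-up name; it assumes the bitmap's
format has already been switched to BGRA8888 as described above, and
that the SMask has been scaled to the image's size):

#include <LibGfx/Bitmap.h>

void apply_smask_sketch(Gfx::Bitmap& image, Gfx::Bitmap const& smask)
{
    for (int y = 0; y < image.height(); ++y) {
        for (int x = 0; x < image.width(); ++x) {
            // The SMask is greyscale, so its luminosity is the alpha value.
            u8 alpha = smask.get_pixel(x, y).luminosity();
            image.set_pixel(x, y, image.get_pixel(x, y).with_alpha(alpha));
        }
    }
}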
This is a hack: Ideally we'd have a CMYK Bitmap pixel format,
and we'd convert to RGB at blit time. Then we could also apply color
profiles (which for CMYK images are CMYK-based).
Also, the colors for our CMYK->RGB conversion are off for PDFs, and
we have distinct codepaths for this: one in Gfx::Color (used for
paths) and one for JPEGs. So when we fix that, we'll have to fix it
in two places.
But this doesn't require a lot of code and it's a huge visual
improvement, so let's go with it for now.
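For context, the usual naive CMYK -> RGB formula looks roughly like
this (a sketch only, not a claim about what either existing codepath
does exactly):

#include <AK/Types.h>
#include <LibGfx/Color.h>
#include <math.h>

Gfx::Color naive_cmyk_to_rgb(float c, float m, float y, float k)
{
    auto to_channel = [](float value) { return static_cast<u8>(roundf(255 * value)); };
    return Gfx::Color(
        to_channel((1 - c) * (1 - k)),
        to_channel((1 - m) * (1 - k)),
        to_channel((1 - y) * (1 - k)));
}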