JPEGs can store a `restart_interval`, which controls how many
minimum coded units (MCUs) apart the stream state resets.
This can be used for error correction, decoding parts of a jpeg
in parallel, etc.
We tried to use
u32 i = vcursor * context.mblock_meta.hpadded_count + hcursor;
i % (context.dc_restart_interval *
context.sampling_factors.vertical *
context.sampling_factors.horizontal) == 0
to check if we hit a multiple of an MCU.
`hcursor` is the horizontal offset into 8x8 blocks, vcursor the
vertical offset, and hpadded_count stores how many 8x8 blocks
we have per row, padded to a multiple of the sampling factor.
This isn't quite right if hcursor isn't divisible by both
the vertical and horizontal sampling factor. Tweak things so
that they work.
Also rename `i` to `number_of_mcus_decoded_so_far` since that
what it is, at least now.
For the test case, I converted an existing image to a ppm:
Build/lagom/bin/image -o out.ppm \
Tests/LibGfx/test-inputs/jpg/12-bit.jpg
Then I resized it to 102x77px in Photoshop and saved it again.
Then I turned it into a jpeg like so:
path/to/cjpeg \
-outfile Tests/LibGfx/test-inputs/jpg/odd-restart.jpg \
-sample 2x2,1x1,1x1 -quality 5 -restart 3B out.ppm
The trick here is to:
a) Pick a size that's not divisible by the data size width (8),
and that when rounded to a block size (13) still isn't divisible
by the subsample factor -- done by picking a width of 102.
b) Pick a huffman table that doesn't happen to contain the bit
pattern for a restart marker, so that reading a restart marker
from the bitstream as data causes a failure (-quality 5 happens
to do this)
c) Pick a restart interval where we fail to skip it if our calculation
is off (-restart 3B)
Together with #22987, fixes#22780.
Non-interleaved files always have an MCU of one data unit.
(A "data unit" is an 8x8 tile of pixels, and an "MCU" is a
"minium coded unit", e.g. 2x2 data units for luminance and
1 data unit each for Cr and Cb for a YCrCb image with
4:2:0 subsampling.)
For the test case, I converted an existing image to a ppm:
Build/lagom/bin/image -o out.ppm \
Tests/LibGfx/test-inputs/jpg/12-bit.jpg
Then I converted it to grayscale and saved it as a pgm in Photoshop.
Then I turned it into a weird jpeg like so:
path/to/cjpeg \
-outfile Tests/LibGfx/test-inputs/jpg/grayscale_mcu.jpg \
-sample 2x2 -restart 3 out.pgm
Makes 3 of the 5 jpegs failing to decode at #22780 go.
That's all this function reads from Component.
Also rename from validate_luma_and_modify_context() to
validate_sampling_factors_and_modify_context().
No behavior change.
These are written by `mutool extract` for CMYK images.
They don't contain color profiles so they're not super convenient.
But `image` can convert them (*) to sRGB as long as you use it with
`--assign-color-profile` pointing to some CMYK icc profile of your
choice.
*: Once #22922 is merged.
.pam is a "portrable arbitrarymap" as documented at
https://netpbm.sourceforge.net/doc/pam.html
It's very similar to .pbm, .pgm, and .ppm, so this uses the
PortableImageMapLoader framework. The header is slightly different,
so this has a custom header parsing function.
Also, .pam only exixts in binary form, so the ascii form support
becomes optional.
When decoding a CMYK image and asked for a normal `frame()`, the decoder
would convert the CMYK bitmap into an RGB bitmap. Calling `cmyk_frame()`
after that point will provoke a null-dereference.
We previously were considering Float and Doubles as non-supported types.
But this was done in a sneaky way, by letting them hit the default case
in the `read_type` method. So, when I ported this function to the
generator we started to make this types flow into the system without a
proper support there. Since 3124c161, we would have crashes on images
containing tags with a floating point value.
A lot of images format use Exif to store there metadata. As Exif is
based on the TIFF structure, the TIFF decoder can, without modification
be able to decode the metadata. We only need a new API to explicitly
mention that we only need the metadata.
Support for JPEGs embedded in TIFF images was introduced with TIFF 6.0.
However, this implementation had major issues. It was so problematic
that they decided to reimplement it from scratch in 1995, three years
later. The two incarnations are obviously incompatible.
For more details see:
https://www.awaresystems.be/imaging/tiff/specification/TIFFTechNote2.txt
When present, the alpha channel is also affected by the horizontal
differencing predictor.
The test case was generated with GIMP with the following steps:
- Open an RGB image
- Add a transparency layer
- Export as TIFF with the LZW compression scheme
This tag is required by the specification, but some encoders (at least
Krita) don't write it for images with a single strip.
The test file was generated by opening deflate.tiff in Krita and saving
it with the DEFLATE compression.
Type 2 <=> One-dimensional Group3, customized for TIFF
Type 3 <=> Two-dimensional Group3, uses the original 1D internally
Type 4 <=> Two-dimensional Group4
So let's clarify that this is not Group3 1D but the TIFF variant, which
is called `CCITTRLE` in libtiff. So let's stick with this name to avoid
confusion.
Images with a display mask ("stencil" as it's called in DPaint) add
an extra bitplane which acts as a mask. For now, at least skip it
properly. Later we should render masked pixels as transparent, but
this requires some refactoring.
SamplingFactors already has default initializers for its field,
so no need to have an explicit one for the first of the two fields.
No behavior change.
We now allow all subsampling factors where the subsampling factors
of follow-on components evenly decode the ones of the first component.
In practice, this allows YCCK 2111, CMYK 2112, and CMYK 2111.
Previously, we handled sampling factors as part of ycbcr_to_rgb().
That meant it worked ok for code paths that used YCbCr ("normal"
jpegs, and the YCC part of YCCK jpegs), but it didn't work for
example for the K channel in YCCK jpegs, nor for CMYK.
By making this a separate pass, it should now work for all cases.
It also makes it easier to support more subsampling arrangements
in the future, and to use something better than nearest neighbor
for upsampling subsampled blocks.