Otherwise, WebP files with a broken header just return "Decoding
failed" without any further details. This way, there's at least some
debug logging with specifics.
Maybe we'll want to remove this again, since it might lead to
duplicate error messages for files whose error isn't in the header.
We'll see how this feels. (But most files don't have any errors, so
it's probably fine.)
The spec doesn't talk about this happening in the text, but
`dequant_init()` in 20.4 processes segment adjustment and quantization
index adjustment in the same variable `q` before clamping.
Since we had to adjust the latter step in the previous commit, do
it for the former step too.
I haven't seen this happen in the wild yet (and now, I hopefully
never will notice it if it happens).
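As a rough sketch of what that flow looks like (names here are
hypothetical; the structure follows `dequant_init()` in RFC 6386
section 20.4):

    int q = quantization_header.y_ac_qi;
    // Segment adjustment happens in q...
    if (segmentation.enabled)
        q = segmentation.absolute_values
            ? segment_quantizer[segment_id]
            : q + segment_quantizer[segment_id];
    // ...and so does the quantization index adjustment, before the
    // clamp to [0, 127] that guards the table lookup:
    int y_dc_factor = dc_quantizer_table[clamp(q + y_dc_delta_q, 0, 127)];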
It's up to callers of the ImageDecoderPlugin to honor loop_count().
The ImageDecoderPlugin doesn't have to look at it when decoding frames.
No behavior change.
The spec says that the AC dequantization factor for Y2 data should
be at least 8, so do that.
This only has a very small effect (only the first two AC table
entries are < 8 after multiplying with 155 / 100, so this would
have only a small effect on brightness), and this case is hit
exactly 0 times in all my test images. But it's still good to match
the spec.
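A sketch of that (hypothetical names, following `dequant_init()` in
the spec):

    int y2_ac_factor =
        ac_quantizer_table[clamp(q + y2_ac_delta_q, 0, 127)] * 155 / 100;
    if (y2_ac_factor < 8)
        y2_ac_factor = 8;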
For a 1024x1024 image, saves about a quarter MB of memory use while
decoding (compared to the decompressed image data itself needing
4 MiB). Not a huge win, but also very easy to do, so might as well.
No behavior change, no measurable performance impact.
This is safe because:
* prediction only computes averages, or explicitly clamps for
TM_PRED / B_TM_PRED. Since the inputs are in [0, 255], so are the
outputs.
* Addition of IDCT and prediction buffer is immediately clamped back
to [0, 255]
No behavior change, and matches what both libwebp and the reference
implementation in rfc6386 do.
https://datatracker.ietf.org/doc/html/rfc6386#section-14.5 says:
"""
The summing procedure is fairly straightforward, having only a couple
of details. The prediction and residue buffers are both arrays of
16-bit signed integers. Each individual (Y, U, and V pixel) result
is calculated first as a 32-bit sum of the prediction and residue,
and is then saturated to 8-bit unsigned range (using, say, the
clamp255 function defined above) before being stored as an 8-bit
unsigned pixel value.
"""
It's IMHO not 100% clear if the clamping is supposed to happen
immediately (so that it affects prediction inputs for the next
macroblock) or later.
But vp8_dixie_idct_add() on page 173 in
https://datatracker.ietf.org/doc/html/rfc6386#section-20.8 does:
recon[0] = CLAMP_255(predict[0] + ((a1 + d1 + 4) >> 3));
So it does look like it should happen immediately.
(I'm a bit confused why the spec then says "The prediction and residue
buffers are both arrays of 16-bit signed integers", since the
prediction buffer can just be a u8 buffer now, without changing
behavior.)
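For reference, the clamp255 the RFC keeps referring to is just
saturation to the 8-bit range; a minimal version:

    static int clamp255(int v)
    {
        return v < 0 ? 0 : (v > 255 ? 255 : v);
    }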
The spec has that clamp at the end of
https://datatracker.ietf.org/doc/html/rfc6386#section-12.2:
The exact algorithm is as follows:
[...]
b[r][c] = clamp255(L[r]+ A[c] - P);
For the test images I'm looking at, it doesn't seem to make a
dramatic difference, but omitting it in `B_TM_PRED` did make
a dramatic difference, so add it. (Also, the spec demands it.)
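As a sketch, the B_TM_PRED fill with the clamp looks something like
this (L, A, and P are the left column, above row, and above-left
corner pixel from the spec's formula; clamp255 saturates to [0, 255]):

    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            b[r][c] = clamp255(L[r] + A[c] - P);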
Each secondary partition has an independent BooleanDecoder.
Their bitstreams interleave per macroblock row; that is, the first
macroblock row is read from the first decoder, the second from the
second, ..., until it wraps around again.
All partitions share a single prediction state though: The second
macroblock row (which reads coefficients off the second decoder) is
predicted using the result of decoding the first macroblock row (which
reads coefficients off the first decoder).
So if I understand things right, in theory the coefficient reading could
be parallelized, but prediction can't be. (IDCT can also be
parallelized, but that's true with just a single partition too.)
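A sketch of the interleaving (hypothetical names):

    // Coefficients for each macroblock row come from the partition
    // decoders in round-robin order; prediction state is shared
    // across all rows.
    for (int mb_y = 0; mb_y < macroblock_rows; ++mb_y) {
        auto& decoder = coefficient_decoders[mb_y % number_of_partitions];
        for (int mb_x = 0; mb_x < macroblock_columns; ++mb_x)
            decode_macroblock_coefficients(decoder, mb_x, mb_y);
    }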
I created the test image by running
    examples/cwebp -low_memory -partitions 3 -o foo.webp \
        ~/src/serenity/Tests/LibGfx/test-inputs/4.webp
using a cwebp hacked up as described in #19149. Since creating
multi-partition lossy webps requires hacking up `cwebp`, they're likely
very rare in practice. (But maybe other programs using the libwebp API
create them.)
Fixes #19149.
With this, webp lossy support is complete (*) :^)
And with that, webp support is complete: Lossless, lossy, lossy with
alpha, animated lossless, animated lossy, animated lossy with alpha all
work.
(*: Loop filtering isn't implemented yet, which has a minor visual
effect on the output. It's only visible when carefully comparing an
image decoded without loop filtering to the same image decoded with
it, but it's technically a part of the spec that's still missing.
The upsampling of UV in the YUV->RGB code is also low-quality. This
produces somewhat visible banding in practice in some images (e.g.
in the fire breather's face in 5.webp), so we should probably improve
that at some point. Our JPEG decoder has the same issue.)
This basically adds the line

    u8 token = TRY(
        tree_decode(decoder, COEFFICIENT_TREE,
            header.coefficient_probabilities[plane][band][tricky],
            last_decoded_value == DCT_0 ? 2 : 0));

and calls it in a loop for the 16 coefficients of a discrete cosine
transform that covers a 4x4 pixel subblock.
And then it does this 24 or 25 times per macroblock, for the 16
(4x4) Y subblocks and the 4 (2x2) each of U and V subblocks (plus an
optional Y2 block for some macroblocks).
It then adds a whole bunch of machinery to be able to compute `plane`,
`band`, and in particular `tricky` (which depends on if the
corresponding left or above subblock has non-zero coefficients).
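Roughly, the `tricky` computation amounts to (hypothetical names):

    // How many of the left and above neighboring subblocks have
    // non-zero coefficients: 0, 1, or 2.
    int tricky = (left_subblock_has_nonzero_coefficients ? 1 : 0)
               + (above_subblock_has_nonzero_coefficients ? 1 : 0);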
(It also does dequantization, and does an inverse Walsh-Hadamard
transform when needed, to end up with complete DCT coefficients
in all cases.)
Also add `vp8_short_inv_walsh4x4_c()` from the spec for the inverse
Walsh-Hadamard transform. The YUV output data must bitwise match the
behavior of the spec, so there isn't a ton of flexibility in how to
do this particular function.
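For reference, the spec's version (RFC 6386 section 14.3) looks
roughly like this; since the output has to match bit for bit, the
real implementation follows the spec's arithmetic exactly:

    void vp8_short_inv_walsh4x4_c(short* input, short* output)
    {
        short* ip = input;
        short* op = output;

        // First pass: transform columns.
        for (int i = 0; i < 4; i++) {
            int a1 = ip[0] + ip[12];
            int b1 = ip[4] + ip[8];
            int c1 = ip[4] - ip[8];
            int d1 = ip[0] - ip[12];

            op[0] = a1 + b1;
            op[4] = c1 + d1;
            op[8] = a1 - b1;
            op[12] = d1 - c1;
            ip++;
            op++;
        }

        // Second pass: transform rows, rounding with (x + 3) >> 3.
        ip = output;
        op = output;
        for (int i = 0; i < 4; i++) {
            int a1 = ip[0] + ip[3];
            int b1 = ip[1] + ip[2];
            int c1 = ip[1] - ip[2];
            int d1 = ip[0] - ip[3];

            op[0] = (a1 + b1 + 3) >> 3;
            op[1] = (c1 + d1 + 3) >> 3;
            op[2] = (a1 - b1 + 3) >> 3;
            op[3] = (d1 - c1 + 3) >> 3;

            ip += 4;
            op += 4;
        }
    }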
This will contain several of the fixed data tables from the VP8 spec.
For starters, it contains the tables needed to read the frame header
in the first partition. These tables are needed to read the
probabilities of the macroblock prediction modes, which in turn will
be needed to read the macroblock prediction modes themselves.
...and keep a forwarding header around in VP9, so we don't have to
update all references to the class there.
In time, we probably want to merge LibGfx/ImageDecoders and LibVideo
into LibMedia, but for now I need just this class for the lossy
webp decoder. So move just it over.
No behavior change.
Pixels will leave the lossy decoder with alpha set to 255.
The old code would be a no-op in that case.
No observable behavior change yet, since there still is no
decoder for lossy webp.
In practice, it looks like e.g. the animated webp file on
https://mathiasbynens.be/demo/animated-webp has the header flag set
because 2 of the frames have alpha, but they're composited on top of
the final bitmap and the final bitmap isn't transparent there. So
that image still gets a useless alpha channel. Oh well.
This is done by adding an intermediate buffer and flushing it at the
end of every row. This makes the `add_pixels` method drop from 50% to
7% in profiles.
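A sketch of the change (hypothetical names):

    // Instead of writing every pixel to the stream individually,
    // collect a row's worth of bytes and flush them in one write.
    Vector<u8> row_buffer;
    for (int x = 0; x < width; ++x)
        append_pixel(row_buffer, scanline[x]);
    TRY(stream.write_until_depleted(row_buffer));
    row_buffer.clear();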
As we directly write to the stream, we don't need to store a copy of the
entire image in memory. However, writing to a stream is heavier on
the CPU than writing to a ByteBuffer. This commit unfortunately makes
`add_pixels`
two times slower.
This is done by two distinct things:
- Allowing 12-bit AC and DC coefficients
- Adjusting coefficients in the IDCT
While this patch allows displaying them, we still don't do the color
transformation correctly, and we ultimately just truncate the
coefficients to 8 bits.
More precisely, it allows the decoder to try `SOF1` images. There are
still some sub-kinds of this kind of JPEG that we don't support. In a
nutshell, `SOF1` images allow more Huffman and quantization tables,
12-bit precision, and arithmetic encoding. This patch only brings
support for the "more tables" part.
Please note that `SOF2` images are also allowed to have more tables,
so we gave the decoder the ability to handle these at the same time.
Pure code move (except for removing `static` on the two public
functions in the new header), no behavior change.
There isn't much of a lossy decoder yet, but this will make
implementing it more convenient.
No behavior change.
Namely:
* Store compressed data in VP8Header
* Make the functions just take ReadonlyBytes instead of a Chunk
Having a function that takes a header and does decoding of the data
after the header isn't really necessary for VP8. For VP8L, it's needed
because the ALPH chunk stores VP8L data without the VP8L header.
But it's nice to make the functions consistent, and it's kind of a
nice structure.
No behavior change.
decode_webp_chunk_VP8() itself will only ever decode RGB data from a
lossy webp stream, but a separate ALPH chunk could add alpha data
later on. Let the function know if that will happen, so that it can
return a bitmap with an alpha channel if appropriate.
Since lossy decoding isn't implemented yet, no behavior change. But it
makes it a bit easier to implement lossy decoding in the future.
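One plausible shape of that (the signature here is hypothetical):

    // The new bool tells the decoder whether an ALPH chunk will
    // supply alpha data later, so it can return a bitmap with an
    // alpha channel.
    static ErrorOr<NonnullRefPtr<Bitmap>> decode_webp_chunk_VP8(
        Chunk const& vp8_chunk, bool include_alpha_channel);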
The one caller checked the chunk type, so the VERIFY() for that did
nothing.
The VERIFY() for vp8l data only being in files that start with
VP8L or VP8X chunks wasn't completely useless, but also not very
useful.
Remove the now-unused context parameter of decode_webp_image_data().
Most places used to call `context.error()` to report an error,
which would set the context's state to `Error` and then return an
`Error::from_string_literal()`.
This is somewhat elegant, but it doesn't work: Some functions this
code calls return ErrorOr<>s that aren't created by `context.error()`,
and for these we wouldn't enter the error state.
Instead, manually check error-ness at the leaf entry functions of the
class:
1. Add a set_error() helper for functions returning bool
2. In the two functions returning ErrorOr<>, awkwardly check the error
   manually. If this becomes a very common pattern, maybe we can add
   a `TRY_WITH_HANDLER(expr, error_lambda)` which would invoke a
   lambda on error (see the sketch below). We could use that here to
   set the error code.
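A hypothetical sketch of what such a macro could look like, modeled
on AK's TRY (using a statement expression):

    #define TRY_WITH_HANDLER(expression, error_lambda)       \
        ({                                                   \
            auto&& _result = (expression);                   \
            if (_result.is_error()) {                        \
                error_lambda(_result.error());               \
                return _result.release_error();              \
            }                                                \
            _result.release_value();                         \
        })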
No real behavior change (except we enter the error state more often
when something goes wrong).
Instead of ImageData having an Optional<Chunk> for the image data,
have it have a Chunk, and give the context an Optional<ImageData>.
This allows sharing a single code path for checking that either the
main image or an animation frame has a main image data chunk, and
that an alpha chunk is only present with a lossy main image data
chunk.
No behavior change.
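A hypothetical sketch of the new shape:

    struct ImageData {
        Chunk image_data_chunk; // Was Optional<Chunk> before.
        Optional<Chunk> alpha_chunk;
    };

    struct WebPLoadingContext {
        // ...
        Optional<ImageData> image_data; // The Optional<> moved up here.
    };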