Turns out that, if we don't use functions that ensure reading until the
very end of the buffer, we end up reading only the very beginning of
the samples and filling the rest with uninitialized data.
While at it, make sure that we read little-endian data as a
LittleEndian.
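A minimal sketch of the pattern using AK's stream API (`stream` and
`sample_count` assumed from context; this illustrates the fix, not the
exact decoder code):

```cpp
// read_some() may return after filling only part of the buffer;
// read_until_filled() loops until every byte is read or errors out.
auto sample_buffer = TRY(ByteBuffer::create_uninitialized(sample_count * sizeof(i16)));
TRY(stream.read_until_filled(sample_buffer.bytes()));

// Read multi-byte values explicitly as little-endian instead of
// relying on host byte order.
auto value = TRY(stream.read_value<LittleEndian<u16>>());
```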
We are currently setting the physical mouse position to the visual
cursor content location. On a line containing only a multi-code-point
emoji, this would set the cursor to column 1 rather than to however
many code points there are.
Scans with only one component are by definition not interleaved,
meaning that each value is linearly ordered in the stream. Grayscale
images used to be supported thanks to a hack, by forcing the
subsampling to 1.
Now we properly support grayscale images with other subsampling factors
(even if they don't make much sense) and, more generally, scans with
only one component and any sampling factors.
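A rough sketch of the new iteration, with assumed names
(`decode_data_unit`, `component_width`, `component_height` are
illustrative, not the decoder's actual API):

```cpp
// A single-component scan is never interleaved, so its data units can
// be walked linearly instead of MCU by MCU.
if (scan.components.size() == 1) {
    auto const& component = scan.components[0];
    // One 8x8 data unit after another, row by row, regardless of the
    // component's sampling factors.
    u32 const blocks_per_row = ceil_div(component_width, 8u);
    u32 const blocks_per_column = ceil_div(component_height, 8u);
    for (u32 y = 0; y < blocks_per_column; ++y) {
        for (u32 x = 0; x < blocks_per_row; ++x)
            TRY(decode_data_unit(component, x, y)); // hypothetical helper
    }
}
```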
While this solution is more general than the last one, it also feels a
bit hackish. We should probably refactor the way we iterate over
components and macroblocks. But that's work for later, especially when
we add support for subsamplings other than 4-2-2.
Huffman streams are encountered in the scan segment. They serve no
purpose outside this segment, hence they shouldn't outlive the scan.
Please note that this patch changes behavior: the stream is now reset
after each scan.
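In code, that roughly means moving the stream from the long-lived
context into the scan-decoding function (a sketch, not the exact diff):

```cpp
static ErrorOr<void> decode_scan(JPEGLoadingContext& context)
{
    // Owned by the scan: dies, and thus resets, when the scan ends.
    HuffmanStream huffman_stream;
    // ... decode all of this scan's MCUs using huffman_stream ...
    return {};
}
```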
A scan can contain fewer components than the full image. However, if
there are multiple components, they have to follow the ordering of the
frame header. This means that we can loop over the components of the
image and skip those that don't correspond.
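A sketch of that loop (`component_ids` and the surrounding structure
are assumed names for illustration):

```cpp
for (auto const& component : context.frame.components) {
    // The scan lists its components in frame order, so anything not in
    // the scan can simply be skipped.
    if (!scan.component_ids.contains_slow(component.id))
        continue;
    // ... decode this component's data units for the current MCU ...
}
```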
For now, we exit after the first scan without needing to parse `EOI`.
However, to read scans in a loop we will need to properly detect and
parse `EOI`.
This patch brings us closer to the spec's point of view. And while it
makes no functional changes, it reduces the number of places where
scan-specific data can be misused and improves support for multiple
scans.
Nobody made use of the ErrorOr return value, and it just added more
chance of confusion, since it was not clear whether failing to sniff an
image should return an error or false. The answer was false; if you
returned an Error, you'd crash the ImageDecoder.
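The gist of the change, sketched as signatures (abbreviated; the real
declarations carry more context):

```cpp
// Before (ambiguous: does "can't sniff" mean false or Error?):
//     virtual ErrorOr<bool> sniff() = 0;
// After (not-an-image is simply false; there is nothing to propagate):
//     virtual bool sniff() = 0;
```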
This object is responsible for handling async functions in bytecode,
and this commit fully rewrites it. It now:
* creates and keeps alive a top-level promise to which callers can
attach their `then` clauses
* creates and clears a handle to itself, to ensure that it does not get
garbage collected (see the sketch after this list)
* properly handles all possible ways an async function could halt and,
when possible, continues the execution immediately
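A minimal sketch of the self-handle part, with assumed member and
helper names (`m_self_handle`, `drive_generator_until_suspension`,
`ExecutionState` are illustrative):

```cpp
void AsyncFunctionDriverWrapper::continue_async_execution()
{
    // While the async function is suspended, we may only be reachable
    // through pending Promise reactions, so pin ourselves explicitly.
    if (m_self_handle.is_null())
        m_self_handle = JS::make_handle(*this);

    auto state = drive_generator_until_suspension(); // hypothetical helper

    // Once the function has settled, clearing the handle makes this
    // wrapper collectable again.
    if (state == ExecutionState::Settled)
        m_self_handle = {};
}
```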
We use generators in bytecode to approximate async functions, but the
code generated by AwaitExpressions did not have the value processing
paths that Yield requires, e.g. the `generator.throw()` path, which is
used by AsyncFunctionDriverWrapper to signal Promise rejections.
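Conceptually, the code generated at an await site now branches on how
it was resumed, mirroring what Yield already does (the names below are
assumed for illustration):

```cpp
switch (resumption_kind) {
case ResumptionKind::Normal:
    // The awaited promise was fulfilled; its value becomes the result
    // of the await expression.
    break;
case ResumptionKind::Throw:
    // The promise was rejected; AsyncFunctionDriverWrapper signals this
    // via generator.throw(), and we must rethrow at the await site.
    break;
}
```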
This uses a newly added instruction, `ScheduleJump`.
This instruction tells the finalizer following it that, instead of
jumping to its next block, it should jump to the designated block.
To achieve this, code generation now uses recursive descent, which
makes the state management easier.
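For example, where we previously could only fall through, we can now
emit something along these lines before entering the finalizer (a
sketch; block and label setup elided):

```cpp
// Record where the finalizer should continue once it completes,
// instead of jumping to its own next block.
generator.emit<Bytecode::Op::ScheduleJump>(Bytecode::Label { target_block });
```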
With this we now correctly handle try-catch-finally blocks, modeling
all possible control flows between these blocks, which allows analysis
and optimization passes to make more accurate decisions about the
reachability of the enclosed blocks.
Turns out extended-lossless-animated.webp did have a loop count of 0.
So I opened it in Hex Fiend and changed the byte at position 42
(which is the first byte of the little-endian u16 storing the loop
count) to 0x2A, so that the test can compare the loop count to
something other than 0.
Imported functions in Wasm may throw JS exceptions, and we need to
preserve these exceptions so we can pass them to the calling JS code.
This also adds an `assert_wasm_result()` API to Result for cases where
only Wasm traps or values are expected (e.g. internal uses), to avoid
making LibWasm (pointlessly) handle JS exceptions that will never show
up in reality.
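Illustrative usage, with the exact shape of the new API assumed here:
for a host-internal call, no JS import can run, so a JS exception is
impossible by construction:

```cpp
auto result = machine.invoke(function_address, move(arguments));
// assert_wasm_result() VERIFY()s the result is a Wasm trap or values,
// never a JS exception. (Shape assumed for illustration.)
auto wasm_result = result.assert_wasm_result();
```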
This reorganizes things so that:
* When initially decoding chunks, we only store pointers to
their data and don't look at the contents (see the sketch after
this list)
* We allow pausing decoding after decoding the first chunk, since
that's where the dimensions are stored, and we don't need to read
more than that if we only care about dimensions. (Currently
inconsequential, but maybe we want to get dimensions after
receiving the first few bytes off the network in the future.)
* We then have separate methods to interpret chunk data
(only for the first few bytes which store the size, so far.)
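A sketch of the first point (field names illustrative): a chunk at this
stage is just a typed view into the input buffer:

```cpp
struct Chunk {
    u32 type { 0 };     // the chunk's FourCC, e.g. 'VP8L' or 'VP8X'
    ReadonlyBytes data; // view into the original file; not parsed yet
};
```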
Emoji sequences in the grapheme segmentation spec are a bit tricky:
\p{Extended_Pictographic} Extend* ZWJ × \p{Extended_Pictographic}
Our current strategy of tracking a boolean to indicate if we are in an
emoji sequence was causing us to break up emoji made of multiple sub-
sequences. For example, in the "family: man, woman, girl, boy" sequence:
U+1F468 U+200D U+1F469 U+200D U+1F467 U+200D U+1F466
We would break at indices 0 (correctly) and 6 (incorrectly).
Instead of tracking a boolean, it's quite a bit simpler to reason about
emoji sequences by just skipping past them entirely. Note that in cases
like the above emoji, we skip one sub-sequence at a time.
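A sketch of the skip-ahead idea (rule GB11 in UAX #29); the property
helpers and the function itself are illustrative stand-ins, not the
actual implementation:

```cpp
// Starting at an Extended_Pictographic code point, repeatedly consume
// "Extend* ZWJ Extended_Pictographic" so that a multi-sub-sequence
// emoji (like the family emoji) is treated as one unbreakable run.
static size_t last_index_of_emoji_sequence(Vector<u32> const& code_points, size_t index)
{
    // Precondition: code_points[index] is Extended_Pictographic.
    while (true) {
        size_t next = index + 1;
        while (next < code_points.size() && is_extend(code_points[next]))
            ++next;
        bool const has_zwj_continuation = next < code_points.size()
            && code_points[next] == 0x200D // ZERO WIDTH JOINER
            && next + 1 < code_points.size()
            && is_extended_pictographic(code_points[next + 1]);
        if (!has_zwj_continuation)
            return index;
        index = next + 1; // skip past this sub-sequence and keep going
    }
}
```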
This is for lossy compression, in which case a WebP file is
a single VP8 key frame.
This only parses the 10-byte frame header, which contains image
dimensions (and some other things).
For now, just dbgln_if() all data. Eventually we'll want to use at
least width and height.
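For reference, a sketch of what parsing those 10 bytes involves (per
RFC 6386 §9.1; the struct and function here are illustrative, not the
decoder's actual code):

```cpp
struct VP8FrameHeader {
    bool is_key_frame { false };
    u8 version { 0 };
    bool show_frame { false };
    u32 first_partition_size { 0 };
    u16 width { 0 };
    u8 horizontal_scale { 0 };
    u16 height { 0 };
    u8 vertical_scale { 0 };
};

static ErrorOr<VP8FrameHeader> parse_vp8_frame_header(ReadonlyBytes data)
{
    if (data.size() < 10)
        return Error::from_string_literal("VP8 chunk too small for frame header");

    // 3-byte little-endian frame tag.
    u32 const frame_tag = data[0] | (data[1] << 8) | (data[2] << 16);
    VP8FrameHeader header;
    header.is_key_frame = !(frame_tag & 1); // 0 means key frame
    header.version = (frame_tag >> 1) & 0x7;
    header.show_frame = (frame_tag >> 4) & 1;
    header.first_partition_size = frame_tag >> 5;

    // Key frames must start with the magic bytes 0x9D 0x01 0x2A.
    if (data[3] != 0x9D || data[4] != 0x01 || data[5] != 0x2A)
        return Error::from_string_literal("Invalid VP8 start code");

    // 14 bits of dimension plus 2 bits of scale, little-endian, each.
    u16 const horizontal = data[6] | (data[7] << 8);
    header.width = horizontal & 0x3FFF;
    header.horizontal_scale = horizontal >> 14;

    u16 const vertical = data[8] | (data[9] << 8);
    header.height = vertical & 0x3FFF;
    header.vertical_scale = vertical >> 14;

    return header;
}
```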