At the end of HTML::EventLoop::process(), the loop reschedules itself if
there are more runnable tasks available.
However, the condition was flawed: we would reschedule if there were any
microtasks queued, but those tasks are not processed while we're within
the scope of a microtask checkpoint.
To fix this, we now only reschedule the HTML event loop for microtask
processing *if* we're not already in a microtask checkpoint.
This fixes the 100% CPU churn seen when looking at PRs on GitHub. :^)
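A minimal sketch of the corrected condition, with member and method
names assumed rather than copied from LibWeb:

```cpp
// Hypothetical sketch: only count queued microtasks when no checkpoint
// is currently on the stack, since that checkpoint will drain them
// anyway before returning.
void EventLoop::process()
{
    // ... run a task, update rendering, perform a checkpoint, etc. ...

    bool has_runnable_tasks = m_task_queue.has_runnable_tasks();
    bool has_queued_microtasks = !m_microtask_queue.is_empty()
        && !m_performing_a_microtask_checkpoint;

    if (has_runnable_tasks || has_queued_microtasks)
        schedule();
}
```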
This is copy-pasted from the gzip utility, along with its existing TODO.
It's currently only needed by that utility, but this gives us API
symmetry with GzipDecompressor, and helps ensure we won't end up in a
situation where only one utility receives optimizations that should
benefit all interested parties.
HackStudio's Editor has displayed indicators in its gutter for a long
time, but each one required manual code to paint it in the right place
and respond to click events. All indicators on a line would be painted
in the same location. If any other applications wanted to have gutter
indicators, they would also need to manually implement the same code.
This commit adds an API to GUI::TextEditor so that it deals with these
indicators itself. It makes sure that multiple indicators on the same line
each have their own area to paint in, and provides a callback for when
one is clicked.
- `register_gutter_indicator()` should be called early on. It returns a
`GutterIndicatorID` that is then used by the other methods.
Indicators on a line are painted from right to left, in the order
they were registered.
- `add_gutter_indicator()` and `remove_gutter_indicator()` add or remove
the indicator on the given line.
- `clear_gutter_indicators()` removes a given indicator from every line.
- The `on_gutter_click` callback is called whenever the user clicks on
the gutter, but *not* on an indicator.
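A rough usage sketch of the API; the callback signatures here are
assumptions, not copied from the header:

```cpp
// Register once, early on; the returned ID identifies this indicator.
auto breakpoint_indicator = editor.register_gutter_indicator(
    [](Gfx::Painter& painter, Gfx::IntRect rect) {
        // Paint inside the rect assigned to this indicator on this line.
        painter.fill_rect(rect, Gfx::Color::Red);
    });

// Toggle the indicator on individual lines, or clear it everywhere.
editor.add_gutter_indicator(breakpoint_indicator, line);
editor.remove_gutter_indicator(breakpoint_indicator, line);
editor.clear_gutter_indicators(breakpoint_indicator);

// Clicks on empty gutter space (not on an indicator) land here.
editor.on_gutter_click = [&](size_t line, GUI::MouseEvent&) {
    toggle_breakpoint(line); // Hypothetical application hook.
};
```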
Instead of reading bytes from the output stream into a buffer, just to
immediately write them back out, we can skip the middle-man and copy the
bytes directly into the output buffer.
This implements Intel's slicing-by-8 algorithm for CRC checksums (only
little endian CPUs for now, as I don't have a way to test big endian).
The original paper for this algorithm seems to have disappeared, but
Intel's source code is still available as a reference:
https://sourceforge.net/projects/slicing-by-8/
As well as other implementations for reference:
https://docs.rs/slice-by-8/latest/src/slice_by_8/algorithm.rs.html
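For illustration, a self-contained sketch of the little-endian
slicing-by-8 technique (a generic CRC-32, not the LibCrypto code
itself):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

static uint32_t tables[8][256];

// Build the classic byte-at-a-time table, then derive seven more, where
// tables[t][i] advances the CRC past byte i followed by t zero bytes.
static void initialize_tables()
{
    for (uint32_t i = 0; i < 256; ++i) {
        uint32_t crc = i;
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ ((crc & 1) ? 0xEDB88320u : 0);
        tables[0][i] = crc;
    }
    for (uint32_t i = 0; i < 256; ++i) {
        for (int t = 1; t < 8; ++t)
            tables[t][i] = (tables[t - 1][i] >> 8)
                ^ tables[0][tables[t - 1][i] & 0xFF];
    }
}

// Consume eight bytes per iteration with eight independent lookups.
// Assumes initialize_tables() has been called once, and a little-endian
// CPU (matching the commit's caveat).
static uint32_t crc32_slice_by_8(uint8_t const* data, size_t length, uint32_t crc)
{
    crc = ~crc;
    while (length >= 8) {
        uint32_t first, second;
        std::memcpy(&first, data, 4);
        std::memcpy(&second, data + 4, 4);
        first ^= crc;
        crc = tables[7][first & 0xFF] ^ tables[6][(first >> 8) & 0xFF]
            ^ tables[5][(first >> 16) & 0xFF] ^ tables[4][first >> 24]
            ^ tables[3][second & 0xFF] ^ tables[2][(second >> 8) & 0xFF]
            ^ tables[1][(second >> 16) & 0xFF] ^ tables[0][second >> 24];
        data += 8;
        length -= 8;
    }
    while (length--) // Plain byte-at-a-time loop for the tail.
        crc = (crc >> 8) ^ tables[0][(crc ^ *data++) & 0xFF];
    return ~crc;
}
```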
Using the "enwik8" file as a test (100MB uncompressed, commonly used in
benchmarks: https://www.mattmahoney.net/dc/enwik8.zip), decompression
time decreases from:
4.89s to 3.52s on Serenity (cold)
1.72s to 1.32s on Serenity (warm)
1.06s to 0.92s on Linux
This was fine before as the last entry was a null string (which could be
printed), but we no longer use C-style sentinel-terminated arrays for
arguments.
This solves an awkward dependency cycle, where CalculatedStyleValue
needs the definition of Percentage, but including that would also pull
in PercentageOr, which in turn needs CalculatedStyleValue.
Many places that previously included StyleValue.h no longer need to. :^)
There was a mix of users between those who want to know if the Length
changed, and those who just want an absolute Length. So, we now have
two methods: Length::absolutize() returns an empty Optional if nothing
changed, and Length::absolutized() always returns a value.
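A minimal self-contained sketch of the split, with simplified types
(the real methods resolve relative units against viewport and font
metrics, elided here):

```cpp
#include <optional>

struct Length {
    double value {};
    bool is_relative {};

    // Returns an empty Optional when the Length was already absolute,
    // so callers can tell whether anything actually changed.
    std::optional<Length> absolutize(double reference_value) const
    {
        if (!is_relative)
            return {};
        return Length { value * reference_value, false };
    }

    // Always returns a usable absolute Length.
    Length absolutized(double reference_value) const
    {
        return absolutize(reference_value).value_or(*this);
    }
};
```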
...and replace it with AngleOrCalculated.
This has the nice bonus effect of actually handling `calc()` for angles
in a transform function. :^) (Previously we just would have asserted.)
This is intended as a replacement for Length and friends each holding a
`RefPtr<CalculatedStyleValue>`. Instead, let's make the types explicit
about whether they are calculated or not. This then means a Length is
always a Length, and won't require including `StyleValue.h`.
As noted, it's probably nicer for LengthOrCalculated to live in
`Length.h`, but we can't do that until Length stops including
`StyleValue.h`.
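A sketch of the general shape this enables, using simplified std types
in place of AK's (the wrapper's name and details are assumptions):

```cpp
#include <memory>
#include <variant>

// Only a forward declaration is needed; no StyleValue.h include.
class CalculatedStyleValue;

template<typename T>
class CalculatedOr {
public:
    CalculatedOr(T value)
        : m_value(std::move(value))
    {
    }
    CalculatedOr(std::shared_ptr<CalculatedStyleValue> calculated)
        : m_value(std::move(calculated))
    {
    }

    bool is_calculated() const
    {
        return std::holds_alternative<std::shared_ptr<CalculatedStyleValue>>(m_value);
    }

private:
    std::variant<T, std::shared_ptr<CalculatedStyleValue>> m_value;
};

// A Length stays a plain Length; the "maybe calculated" case is its
// own type, e.g.:
// using LengthOrCalculated = CalculatedOr<Length>;
// using AngleOrCalculated = CalculatedOr<Angle>;
```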
Due to CSSImportRule::has_import_result() being backwards, we never
actually entered imported style sheets when traversing style rules or
media queries.
With this fixed, we no longer need the "collect style sheets" step in
StyleComputer, as the normal for_each_effective_style_rule() will now
actually find all the rules. :^)
The constructor is now only concerned with creating the required
streams, which means that it no longer fails for XZ streams with
invalid headers. Instead, everything is parsed and validated during the
first read, preparing us for files with multiple streams.
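A hypothetical sketch of the new shape (member and method names
assumed):

```cpp
// The constructor only stores the stream; it cannot fail on bad headers.
XzDecompressor::XzDecompressor(NonnullOwnPtr<Stream> stream)
    : m_stream(move(stream))
{
}

ErrorOr<Bytes> XzDecompressor::read_some(Bytes bytes)
{
    // All header parsing and validation happens on the first read,
    // which also leaves room for handling multi-stream files later.
    if (!m_validated_stream_header) {
        TRY(parse_and_validate_stream_header());
        m_validated_stream_header = true;
    }
    // ... decode block data into `bytes` ...
    return bytes;
}
```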
This was causing a huge slowdown when loading some pages with a weirdly
huge number of style sheets. For example, amazon.com has over 200 style
elements, which meant we had to re-sort the StyleSheetList 200 times.
(And sorting itself was slow because it has to compare DOM positions.)
Instead of sorting, we now look for the correct insertion point when
adding new style sheets, and we start the search from the end, which is
where style sheets are typically added in the vast majority of cases.
This removes a 600ms time sink when loading Amazon on my machine! :^)
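A hypothetical sketch of the insertion scan (method and helper names
assumed):

```cpp
void StyleSheetList::add_a_css_style_sheet(CSSStyleSheet& sheet)
{
    // Search backwards from the end: new sheets almost always belong
    // there, making the common case O(1) instead of a full re-sort.
    size_t insertion_index = m_sheets.size();
    while (insertion_index > 0
        && comes_after_in_tree_order(*m_sheets[insertion_index - 1], sheet))
        --insertion_index;
    m_sheets.insert(insertion_index, sheet);
}
```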
Setting the `data` of a text node already triggers `children changed`
per spec, so there's no need for an explicit call.
This avoids parsing every HTMLStyleElement sheet twice. :^)
We now select between nearest neighbor and bilinear filtering when
scaling images in CRC2D.drawImage().
This patch also adds CRC2D.imageSmoothingQuality, but it's ignored for
now as we don't have a bunch of different quality levels to map it to.
Work towards #17993 (Ruffle Flash Player)
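The selection itself is a simple branch; a rough sketch, with the
painter API details assumed:

```cpp
// Pick the painter's scaling mode from the context's smoothing flag.
auto scaling_mode = drawing_state().image_smoothing_enabled
    ? Gfx::Painter::ScalingMode::BilinearBlend
    : Gfx::Painter::ScalingMode::NearestNeighbor;
painter.draw_scaled_bitmap(destination_rect, bitmap, source_rect,
    /* opacity */ 1.0f, scaling_mode);
```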
We currently decode back-references one byte at a time, while writing
that byte back out to the output buffer. This is only necessary when the
back-reference refers to itself, i.e. when the back-reference distance
is less than its length. In other cases, we can read the entire back-
reference block in one shot.
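A minimal self-contained sketch of the distinction (not the LibCompress
code itself):

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Append `length` bytes that start `distance` bytes back from the end
// of `history` (the decompressed output so far).
void copy_back_reference(std::vector<unsigned char>& history,
    size_t distance, size_t length)
{
    size_t out = history.size();
    history.resize(out + length);
    unsigned char* data = history.data();
    if (distance >= length) {
        // Non-overlapping: the whole referenced block already exists,
        // so it can be read in one shot.
        std::memcpy(data + out, data + out - distance, length);
    } else {
        // Self-referencing (distance < length): bytes being written are
        // also part of the source, so copy one byte at a time.
        for (size_t i = 0; i < length; ++i)
            data[out + i] = data[out - distance + i];
    }
}
```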
Using the "enwik8" file as a test (100MB uncompressed, commonly used in
benchmarks: https://www.mattmahoney.net/dc/enwik8.zip), decompression
time decreases from:
5.8s to 4.89s on Serenity (cold)
2.3s to 1.72s on Serenity (warm)
1.6s to 1.06s on Linux
Huffman codes have a useful property in that they are prefix codes.
That is, a set of bits representing a Huffman-coded symbol is never a
prefix of another symbol. This allows us to create a lookup table where
each index is an integer whose prefix is that entry's corresponding
Huffman code.
With Deflate, codes can be up to 16 bits in length, which would mean a
prefix table with 2^16 entries. So instead of creating a table to fit
all possible codes, we use a cutoff of 8-bit codes; anything longer
falls back to the binary search method.
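A simplified sketch of the table construction (Deflate's bit-reversal
details are elided):

```cpp
#include <cstdint>

struct PrefixTableEntry {
    uint16_t symbol { 0 };
    uint8_t code_length { 0 }; // 0 means "longer than 8 bits".
};

static PrefixTableEntry prefix_table[1 << 8];

// Because Huffman codes are prefix-free, every 8-bit index whose low
// `code_length` bits equal `code` can share the same entry.
static void insert_code(uint16_t symbol, uint16_t code, uint8_t code_length)
{
    if (code_length > 8)
        return; // Left to the binary-search fallback.
    for (uint32_t pad = 0; pad < (1u << (8 - code_length)); ++pad)
        prefix_table[code | (pad << code_length)] = { symbol, code_length };
}

// The decoder peeks 8 bits, indexes the table, and on a hit consumes
// only `code_length` bits; a zero length signals the slow path.
```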
Using the "enwik8" file as a test (100MB uncompressed, commonly used in
benchmarks: https://www.mattmahoney.net/dc/enwik8.zip), decompression
time decreases from 3.527s to 2.585s on Linux.
We currently mix normal and bit streams during GZIP decompression, where
the latter is a wrapper around the former. This isn't causing issues now
as the underlying bit stream buffer is a byte, so the normal stream can
pick up where the bit stream left off.
In order to increase the size of that buffer though, the normal stream
will not be able to assume it can resume reading after the bit stream.
The buffer can easily contain more bits than it was meant to read, so
when the normal stream resumes, there may be N bits leftover in the bit
stream that the normal stream was meant to read.
To avoid weird behavior when mixing streams, this changes the GZIP
decompressor to always read from a bit stream.
Previously we used a parsing context with no access to the document, so
any URLs in url() functions would become invalid.
Fixes the images on Steam's store carousel, which sets
Element.style.backgroundImage to url() functions.