This is very similar to the LittleEndianInputBitStream bit buffer change
from 8e834d4bb2.
We currently buffer one byte of data for the underlying stream. And when
we put bits onto that buffer, we do so 1 bit at a time.
This replaces the u8 buffer with a u64. And instead of looping at all,
we perform bitwise operations to write the desired number of bits.
Using the "enwik8" file as a test (100MB uncompressed, commonly used in
benchmarks: https://www.mattmahoney.net/dc/enwik8.zip), compression time
decreases from:
13.62s to 10.9s on Serenity (cold)
13.62s to 9.22s on Serenity (warm)
2.93s to 2.32s on Linux
One caveat is that this requires explicitly flushing any leftover bits
when the caller is done with the stream. The byte buffer implementation
implicitly flushed its data every time the buffer was byte-aligned, as
doing so would always fill the byte. This is no longer the case. But for
now, this should be fine as the one user of this class, DEFLATE, already
has a "flush everything now that we're done" finalizer.
This is copy-pasted from the gzip utility, along with its existing TODO.
This is currently only needed by that utility, but this gives us API
symmetry with GzipDecompressor, and helps ensure we won't end up in a
situation where only one utility receives optimizations that should be
received by all interested parties.
Instead of reading bytes from the output stream into a buffer, just to
immediately write them back out, we can skip the middle-man and copy the
bytes directly into the output buffer.
The constructor is now only concerned with creating the required
streams, which means that it no longer fails for XZ streams with
invalid headers. Instead, everything is parsed and validated during the
first read, preparing us for files with multiple streams.
We currently decode back-references one byte at a time, while writing
that byte back out to the output buffer. This is only necessary when the
back-reference refers to itself, i.e. when the back-reference distance
is less than its length. In other cases, we can read the entire back-
reference block in one shot.
Using the "enwik8" file as a test (100MB uncompressed, commonly used in
benchmarks: https://www.mattmahoney.net/dc/enwik8.zip), decompression
time decreases from:
5.8s to 4.89s on Serenity (cold)
2.3s to 1.72s on Serenity (warm)
1.6s to 1.06s on Linux
Huffman codes have a useful property in that they are prefix codes. That
is, a set of bits representing a Huffman-coded symbol is never a prefix
of another symbol. This allows us to create a table, where each index in
the table are integers whose prefix is the entry's corresponding Huffman
code.
With Deflate, we can have codes up to 16 bits in length, thus creating a
prefix table with 2^16 entries. So instead of creating a table fit all
possible codes, we use a cutoff of 8-bit codes. Codes larger than 8 bits
fall back to the binary search method.
Using the "enwik8" file as a test (100MB uncompressed, commonly used in
benchmarks: https://www.mattmahoney.net/dc/enwik8.zip), decompression
time decreases from 3.527s to 2.585s on Linux.
We currently mix normal and bit streams during GZIP decompression, where
the latter is a wrapper around the former. This isn't causing issues now
as the underlying bit stream buffer is a byte, so the normal stream can
pick up where the bit stream left off.
In order to increase the size of that buffer though, the normal stream
will not be able to assume it can resume reading after the bit stream.
The buffer can easily contain more bits than it was meant to read, so
when the normal stream resumes, there may be N bits leftover in the bit
stream that the normal stream was meant to read.
To avoid weird behavior when mixing streams, this changes the GZIP
decompressor to always read from a bit stream.
...simply by using LittleEndianInputBitStream::read_bit() instead of
read_bits(1). This puts us on the fast path for single-bit reads.
There's still lots of money on the table for bigger optimizations to
claim here, just picking an embarrassingly low-hanging fruit. :^)
Similar to POSIX read, the basic read and write functions of AK::Stream
do not have a lower limit of how much data they read or write (apart
from "none at all").
Rename the functions to "read some [data]" and "write some [data]" (with
"data" being omitted, since everything here is reading and writing data)
to make them sufficiently distinct from the functions that ensure to
use the entire buffer (which should be the go-to function for most
usages).
No functional changes, just a lot of new FIXMEs.
`Stream` will be qualified as `AK::Stream` until we remove the
`Core::Stream` namespace. `IODevice` now reuses the `SeekMode` that is
defined by `SeekableStream`, since defining its own would require us to
qualify it with `AK::SeekMode` everywhere.
These are currently being implicitly including by FixedPoint.h by way of
Format.h. The former will soon be removed from the latter, which would
otherwise cause a compile error in these files.
We don't have anything fallible in there yet, but we will soon switch
the seekback buffer to the new `CircularBuffer`, which has a fallible
constructor.
We have to do the same for the internal `GzipDecompressor::Member`
class, as it needs to construct a `DeflateCompressor` from its received
stream.