This is very similar to the LittleEndianInputBitStream bit buffer change
from 8e834d4bb2.
We currently buffer one byte of data for the underlying stream. And when
we put bits onto that buffer, we do so 1 bit at a time.
This replaces the u8 buffer with a u64. And instead of looping at all,
we perform bitwise operations to write the desired number of bits.
Using the "enwik8" file as a test (100MB uncompressed, commonly used in
benchmarks: https://www.mattmahoney.net/dc/enwik8.zip), compression time
decreases from:
13.62s to 10.9s on Serenity (cold)
13.62s to 9.22s on Serenity (warm)
2.93s to 2.32s on Linux
One caveat is that this requires explicitly flushing any leftover bits
when the caller is done with the stream. The byte buffer implementation
implicitly flushed its data every time the buffer was byte-aligned, as
doing so would always fill the byte. This is no longer the case. But for
now, this should be fine as the one user of this class, DEFLATE, already
has a "flush everything now that we're done" finalizer.
Instead of reading bytes from the output stream into a buffer, just to
immediately write them back out, we can skip the middle-man and copy the
bytes directly into the output buffer.
We currently decode back-references one byte at a time, while writing
that byte back out to the output buffer. This is only necessary when the
back-reference refers to itself, i.e. when the back-reference distance
is less than its length. In other cases, we can read the entire back-
reference block in one shot.
Using the "enwik8" file as a test (100MB uncompressed, commonly used in
benchmarks: https://www.mattmahoney.net/dc/enwik8.zip), decompression
time decreases from:
5.8s to 4.89s on Serenity (cold)
2.3s to 1.72s on Serenity (warm)
1.6s to 1.06s on Linux
Huffman codes have a useful property in that they are prefix codes. That
is, a set of bits representing a Huffman-coded symbol is never a prefix
of another symbol. This allows us to create a table, where each index in
the table are integers whose prefix is the entry's corresponding Huffman
code.
With Deflate, we can have codes up to 16 bits in length, thus creating a
prefix table with 2^16 entries. So instead of creating a table fit all
possible codes, we use a cutoff of 8-bit codes. Codes larger than 8 bits
fall back to the binary search method.
Using the "enwik8" file as a test (100MB uncompressed, commonly used in
benchmarks: https://www.mattmahoney.net/dc/enwik8.zip), decompression
time decreases from 3.527s to 2.585s on Linux.
We currently mix normal and bit streams during GZIP decompression, where
the latter is a wrapper around the former. This isn't causing issues now
as the underlying bit stream buffer is a byte, so the normal stream can
pick up where the bit stream left off.
In order to increase the size of that buffer though, the normal stream
will not be able to assume it can resume reading after the bit stream.
The buffer can easily contain more bits than it was meant to read, so
when the normal stream resumes, there may be N bits leftover in the bit
stream that the normal stream was meant to read.
To avoid weird behavior when mixing streams, this changes the GZIP
decompressor to always read from a bit stream.
...simply by using LittleEndianInputBitStream::read_bit() instead of
read_bits(1). This puts us on the fast path for single-bit reads.
There's still lots of money on the table for bigger optimizations to
claim here, just picking an embarrassingly low-hanging fruit. :^)
Similar to POSIX read, the basic read and write functions of AK::Stream
do not have a lower limit of how much data they read or write (apart
from "none at all").
Rename the functions to "read some [data]" and "write some [data]" (with
"data" being omitted, since everything here is reading and writing data)
to make them sufficiently distinct from the functions that ensure to
use the entire buffer (which should be the go-to function for most
usages).
No functional changes, just a lot of new FIXMEs.
`Stream` will be qualified as `AK::Stream` until we remove the
`Core::Stream` namespace. `IODevice` now reuses the `SeekMode` that is
defined by `SeekableStream`, since defining its own would require us to
qualify it with `AK::SeekMode` everywhere.
We don't have anything fallible in there yet, but we will soon switch
the seekback buffer to the new `CircularBuffer`, which has a fallible
constructor.
We have to do the same for the internal `GzipDecompressor::Member`
class, as it needs to construct a `DeflateCompressor` from its received
stream.
This is to differentiate between the upcoming `AllocatingMemoryStream`,
which automatically allocates memory as needed instead of operating on a
static memory area.
We had some inconsistencies before:
- Sometimes "The", sometimes "the"
- Sometimes trailing ".", sometimes no trailing "."
I picked the most common one (lowecase "the", trailing ".") and applied
it to all copyright headers.
By using the exact same string everywhere we can ensure nothing gets
missed during a global search (and replace), and that these
inconsistencies are not spread any further (as copyright headers are
commonly copied to new files).
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
This commit makes sure that we fail if an encoded lz77 back reference
references bytes that are outside our sliding window, instead of just
silently failing, which triggers an assertion down the line.
Since we were not checking for error flags set by read_bits we would
just always read 0 as the bits' value, which in some edge cases could
lead to an infinite loop.
In the case that both the stream and the wrapped substream had errors
to be handled only one of the two would be resolved due to boolean
short circuiting. this commit ensures both are handled irregardless
of one another.
This ensures that when a DeflateCompressor stream is cleared of any
errors its underlying wrapped streams (InputBitStream/InputMemoryStream)
will be cleared as well and wont fail a VERIFY on destruction.
Very incompressible data could sometimes produce no backreferences
which would result in no distance huffman code being created (as it
was not needed), so VERIFY the code exists only if it is actually
needed for writing the stream.
This commit adds a fully functional DEFLATE compression
implementation that can be used to implement compression
for higher level formats like gzip, zlib or zip.
A large part of this commit is based on Hans Wennborg's
great article about the DEFLATE and zip specifications:
https://www.hanshq.net/zip.html