This now searches the memory in blocks, which should be slightly more
efficient. However, it doesn't make much difference (e.g. ~1% in LZMA
compression) in most real-world applications, as the non-hint function
is more expensive by orders of magnitude.
The "operation modes" of this function have very different focuses, and
trying to combine both in a way that shares as much code as possible
probably results in the worst performance for both.
Instead, split up the function into "existing distances" and "no
existing distances" so that we can optimize either case separately.
We will be adding extra logic to the CircularBuffer to optimize
searching, but this would negatively impact the performance of
CircularBuffer users that don't need that functionality.
This should keep the `read_some` function a bit flatter and shorter, and
make it easier to compare the match type decoding process against the
specification.
WebP lossless can have up to 2328 symbols. This code assumed the deflate
max of 288, leading to crashes for WebP lossless files using more than
288 symbols (such as Tests/LibGfx/test-inputs/simple-vp8l.webp).
Nothing writes WebP files at this point, so the m_bit_codes and
m_bit_code_lengths arrays aren't ever used in practice with more than
288 entries.
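A sketch of the resulting sizing (the read-side member names are
illustrative; m_bit_codes and m_bit_code_lengths are the write-side
arrays mentioned above):

    #include <AK/Array.h>
    #include <AK/Types.h>

    static constexpr size_t max_deflate_symbols = 288;
    static constexpr size_t max_webp_lossless_symbols = 2328;

    // The decoding tables must cover the full WebP lossless alphabet...
    Array<u16, max_webp_lossless_symbols> m_symbol_codes;
    // ...while the encoding-side tables keep the deflate bound for now,
    // since nothing writes WebP files yet.
    Array<u16, max_deflate_symbols> m_bit_codes;
    Array<u16, max_deflate_symbols> m_bit_code_lengths;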
Missing:
* Transform support (used by virtually all lossless WebP files)
* Meta prefix / entropy image support
Working:
* Decoding of regular image streams
* Color cache (sketched below)
This happens to be enough to be able to decode
Tests/LibGfx/test-inputs/extended-lossless.webp
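The color cache follows the scheme from the WebP lossless
specification: recently decoded ARGB colors go into a table of
2^cache_bits entries, indexed by a multiplicative hash. A minimal
sketch (not the decoder's exact code):

    #include <AK/Types.h>
    #include <AK/Vector.h>

    struct ColorCache {
        Vector<u32> entries; // 2^bits entries, zero-initialized
        u8 bits { 0 };       // the spec allows 1 to 11 bits

        // The hash multiplier is the one given in the specification.
        u32 index_of(u32 argb) const { return (0x1e35a7bdu * argb) >> (32 - bits); }

        void insert(u32 argb) { entries[index_of(argb)] = argb; }
        u32 lookup(u32 index) const { return entries[index]; }
    };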
The canonical prefix code is very similar to deflate's, enough so that
this can use Compress::CanonicalCode (and take advantage of all the
recent performance improvements there).
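For reference, the shared construction works roughly like this (a
sketch of the RFC 1951 scheme, not Compress::CanonicalCode itself):

    #include <AK/Array.h>
    #include <AK/Span.h>
    #include <AK/Types.h>

    // Count the codes of each length, derive the first code per length,
    // then hand out consecutive codes to symbols of equal length.
    void assign_canonical_codes(ReadonlySpan<u8> code_lengths, Span<u32> codes)
    {
        Array<u32, 16> length_counts {}; // both formats cap lengths at 15
        for (auto length : code_lengths)
            ++length_counts[length];
        length_counts[0] = 0;

        u32 code = 0;
        Array<u32, 16> next_code {};
        for (size_t length = 1; length < 16; ++length) {
            code = (code + length_counts[length - 1]) << 1;
            next_code[length] = code;
        }

        for (size_t i = 0; i < code_lengths.size(); ++i) {
            if (code_lengths[i] != 0)
                codes[i] = next_code[code_lengths[i]]++;
        }
    }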
deflate_special_code_length_copy has value 16, so it should be
before the two zero-filling branches for codes 17 and 18.
Also, the initial `if` refers to deflate_special_code_length_copy as
well, so if it appears again right in the next `else`, one has to keep
it on the mental stack for a shorter time when reading this code.
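In sketch form, the chain then reads straight down the symbol values
(the repeat counts are deflate's):

    static constexpr u32 deflate_special_code_length_copy = 16;

    void handle_code_length_symbol(u32 symbol)
    {
        if (symbol < deflate_special_code_length_copy) {
            // 0-15: a literal code length.
        } else if (symbol == deflate_special_code_length_copy) {
            // 16: repeat the previous code length 3-6 times.
        } else if (symbol == 17) {
            // 17: repeat a zero length 3-10 times.
        } else {
            // 18: repeat a zero length 11-138 times.
        }
    }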
No behavior change.
Alternatively, we could remove the `else` after the `continue`, but
all branches here should be equally prominent, so this seems a bit
nicer.
No behavior change.
This is very similar to the LittleEndianInputBitStream bit buffer change
from 8e834d4bb2.
We currently buffer one byte of data for the underlying stream. And when
we put bits onto that buffer, we do so 1 bit at a time.
This replaces the u8 buffer with a u64. And instead of looping at all,
we perform bitwise operations to write the desired number of bits.
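In sketch form (illustrative names; write_byte_to_stream stands in for
the underlying stream write):

    #include <AK/Error.h>
    #include <AK/Types.h>

    u64 m_bit_buffer { 0 };
    size_t m_bit_count { 0 };

    ErrorOr<void> write_bits(u64 bits, size_t count)
    {
        // Assumes the new bits still fit into the u64; the real code
        // must flush first once the buffer gets close to full.
        m_bit_buffer |= bits << m_bit_count;
        m_bit_count += count;

        // Flush every complete byte, instead of looping per bit.
        while (m_bit_count >= 8) {
            TRY(write_byte_to_stream(static_cast<u8>(m_bit_buffer)));
            m_bit_buffer >>= 8;
            m_bit_count -= 8;
        }
        return {};
    }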
Using the "enwik8" file as a test (100MB uncompressed, commonly used in
benchmarks: https://www.mattmahoney.net/dc/enwik8.zip), compression time
decreases from:
13.62s to 10.9s on Serenity (cold)
13.62s to 9.22s on Serenity (warm)
2.93s to 2.32s on Linux
One caveat is that this requires explicitly flushing any leftover bits
when the caller is done with the stream. The byte buffer implementation
implicitly flushed its data every time the buffer was byte-aligned, as
doing so would always fill the byte. This is no longer the case. But for
now, this should be fine as the one user of this class, DEFLATE, already
has a "flush everything now that we're done" finalizer.
This is copy-pasted from the gzip utility, along with its existing TODO.
This is currently only needed by that utility, but this gives us API
symmetry with GzipDecompressor, and helps ensure we won't end up in a
situation where only one utility receives optimizations that should be
received by all interested parties.
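Assumed usage, mirroring the decompression side (the exact signature
is an assumption):

    #include <LibCompress/Gzip.h>

    // Compress a buffer and round-trip it through the symmetric APIs.
    auto compressed = TRY(Compress::GzipCompressor::compress_all(input_bytes));
    auto round_tripped = TRY(Compress::GzipDecompressor::decompress_all(compressed.bytes()));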
Instead of reading bytes from the output stream into a buffer, just to
immediately write them back out, we can skip the middle-man and copy the
bytes directly into the output buffer.
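Illustrative before/after (names are hypothetical):

    // Before: bounce the data through a temporary buffer.
    Array<u8, 4096> temporary;
    auto chunk = TRY(dictionary.read_some(temporary.span()));
    chunk.copy_to(output_bytes);

    // After: let the dictionary fill the caller's buffer directly.
    auto written = TRY(dictionary.read_some(output_bytes));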
The constructor is now only concerned with creating the required
streams, which means that it no longer fails for XZ streams with
invalid headers. Instead, everything is parsed and validated during the
first read, preparing us for files with multiple streams.
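In sketch form (hypothetical member and function names):

    ErrorOr<Bytes> XzDecompressor::read_some(Bytes bytes)
    {
        // Header parsing moved out of the constructor, so an invalid
        // header now surfaces as an error on the first read.
        if (!m_parsed_stream_header) {
            TRY(parse_and_validate_stream_header());
            m_parsed_stream_header = true;
        }

        // ... block decoding into `bytes` continues from here ...
        return bytes;
    }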
We currently decode back-references one byte at a time, while writing
that byte back out to the output buffer. This is only necessary when the
back-reference refers to itself, i.e. when the back-reference distance
is less than its length. In other cases, we can read the entire back-
reference block in one shot.
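The distinction in sketch form (illustrative, using a flat output
buffer):

    #include <AK/Types.h>
    #include <string.h>

    void copy_back_reference(u8* output, size_t position, size_t distance, size_t length)
    {
        if (distance < length) {
            // The reference overlaps itself: a byte may depend on a byte
            // produced earlier in this same copy, so go one at a time.
            for (size_t i = 0; i < length; ++i)
                output[position + i] = output[position + i - distance];
        } else {
            // No overlap: read the whole block in one shot.
            memcpy(output + position, output + position - distance, length);
        }
    }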
Using the "enwik8" file as a test (100MB uncompressed, commonly used in
benchmarks: https://www.mattmahoney.net/dc/enwik8.zip), decompression
time decreases from:
5.8s to 4.89s on Serenity (cold)
2.3s to 1.72s on Serenity (warm)
1.6s to 1.06s on Linux