mirror of https://github.com/RGBCube/serenity synced 2025-08-06 19:27:35 +00:00
32230 commits

Author SHA1 Message Date
Nico Weber
9d69c5d434 LibPDF: Tolerate trailing whitespace after %%EOF marker
At first I tried implementing the quirk from PDF 1.7 Appendix H,
3.4.4, "File Trailer": """Acrobat viewers require only that the %%EOF
marker appear somewhere within the last 1024 bytes of the file."""
This would've been like #22548 but at end-of-file instead of at
start-of-file.

This helped a bunch of files, but also broke a bunch of files that
had more than 1024 bytes of stuff at the end, and it wouldn't have
helped 0000059.pdf, which has over 40k of \0 bytes after the %%EOF.
So just tolerate whitespace after the %%EOF line, and keep ignoring
an arbitrary amount of other stuff after that like before.

This helps:
* 0000599.pdf
  One trailing \0 byte after %%EOF. Due to that byte, the
  is_linearized() check fails and we go down the non-linearized
  codepath. But with this fix, that code path succeeds.
* 0000937.pdf
  Same.
* 0000055.pdf
  Has one space followed by a \n after %%EOF
* 0000059.pdf
  Has over 40kB of trailing \0 bytes

The following files keep working with it:
* 0000242.pdf
  5586 bytes of trailing HTML
* 0000336.pdf
  5586 bytes of trailing HTML fragment
* 0000136.pdf
  2054 bytes of trailing space characters
  This one kind of only worked by accident before since it found
  the %%EOF block before the final %%EOF block. Maybe this is
  even an intentional XRefStm compat hack? Anyways, now it
  finds the final block instead.
* 0000327.pdf
  11044 bytes of trailing HTML
2024-01-04 11:19:15 +01:00
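The %%EOF handling described in 9d69c5d434 can be sketched roughly as follows. This is a Python illustration of the idea only, not LibPDF's code: the function names are hypothetical, and treating \0 as whitespace follows the PDF white-space character set.

```python
# PDF white-space bytes: NUL, tab, line feed, form feed, carriage return, space.
PDF_WHITESPACE = b"\x00\t\n\x0c\r "


def ends_with_eof_marker(data: bytes) -> bool:
    """Return True if data ends with %%EOF once trailing whitespace is ignored."""
    return data.rstrip(PDF_WHITESPACE).endswith(b"%%EOF")


def find_last_eof_marker(data: bytes) -> int:
    """Return the offset of the last %%EOF marker, or -1 if there is none.

    Whatever follows the marker (HTML, padding, ...) is simply not looked at.
    """
    return data.rfind(b"%%EOF")
```

Under these assumptions, a file ending in `%%EOF` plus one \0 byte, a space and a newline, or 40k of \0 bytes all pass the first check, while a file with trailing HTML still needs the fallback search.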
Nico Weber
2d12647e29 LibPDF: Add FIXME for "was linearized PDF incrementally updated" check
It's pretty tricky to do, and also tricky with respect to skipping
trailing bytes after %%EOF: The check requires knowing the full size of
the PDF (which means web servers not sending content lengths are out),
but that size has to be after stripping trailing bytes, which normal
static file servers won't do. So PDF viewers would have to download the
last couple bytes of the PDF unconditionally, then strip trailing bytes
and use the count to figure out the final actual PDF size.

Luckily, we don't incrementally download PDFs from the net but
instead require all data to be available in one chunk, so it's
not currently a problem.
2024-01-04 11:19:15 +01:00
Nico Weber
1b45c3e127 LibPDF: Tolerate whitespace after xref and startxref
The spec isn't super clear on if this is allowed:

"""Each cross-reference section shall begin with a line containing the
keyword xref. Following this line..."""

"""The two preceding lines shall contain, one per line and in order, the
keyword startxref and..."""

It kind of sounds like anything goes on both lines as long as they
contain `xref` and `startxref`.

In practice, both seem to always occur at the start of their line,
but in 0000780.pdf (and nowhere else), there's one space after each
keyword before the following linebreak, and this makes that file load.
2024-01-04 10:14:30 +01:00
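A minimal sketch of the keyword matching described in 1b45c3e127, written in Python for illustration. `consume_non_eol_whitespace` borrows its name from the Reader helper in the log, but its behaviour here (skipping NUL, tab, form feed and space) is an assumption, and `consume_keyword_line` is a hypothetical helper.

```python
def consume_non_eol_whitespace(data: bytes, pos: int) -> int:
    """Skip whitespace other than CR and LF and return the new offset."""
    while pos < len(data) and data[pos] in b"\x00\t\x0c ":
        pos += 1
    return pos


def consume_keyword_line(data: bytes, pos: int, keyword: bytes) -> int:
    """Match keyword, optional non-EOL whitespace, then a line break.

    Returns the offset just past the line break, or -1 if there is no match.
    """
    if not data.startswith(keyword, pos):
        return -1
    pos = consume_non_eol_whitespace(data, pos + len(keyword))
    if data.startswith(b"\r\n", pos):
        return pos + 2
    if pos < len(data) and data[pos] in b"\r\n":
        return pos + 1
    return -1
```

With this shape, `xref \n` (one space before the line break, as in 0000780.pdf) is accepted, while `xrefs\n` is still rejected.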
Nico Weber
efb37f7252 LibPDF: Add Reader::consume_non_eol_whitespace() 2024-01-04 10:14:30 +01:00
Nico Weber
c59e08123b LibPDF: Add a FIXME and a spec comment to Encoding::from_object() 2024-01-04 10:12:11 +01:00
Nico Weber
ad5fc0eda1 LibPDF: An Encoding's /Differences entry is optional
Per "TABLE 5.11 Entries in an encoding dictionary", /Differences is
optional.

(Per "Encodings for TrueType Fonts" in 5.5.5 Character Encoding,
nonsymbolic truetype fonts are even recommended to have "no Differences
array." But in practice, most seem to have it.)

Fixes crashes on:
* 0000001.pdf
* 0000574.pdf
* 0000337.pdf

All three don't render super great, but at least they no longer crash.
2024-01-04 10:12:11 +01:00
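As a sketch of what treating /Differences as optional in ad5fc0eda1 amounts to, here is a small Python illustration. It is not Encoding::from_object() itself: the dictionary layout and the `build_encoding` name are hypothetical, and only the /Differences array format (an integer sets the current code, each following name takes the next code) is taken from the spec.

```python
def build_encoding(base: dict[int, str], encoding_dict: dict) -> dict[int, str]:
    """Return a code -> glyph name map, applying /Differences only if present."""
    encoding = dict(base)
    code = 0
    # A missing /Differences entry just means "use the base encoding as-is".
    for item in encoding_dict.get("Differences", []):
        if isinstance(item, int):
            code = item            # an integer resets the current character code
        else:
            encoding[code] = item  # a name is assigned to the current code
            code += 1
    return encoding
```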
Shannon Booth
e9dfa61588 LibWeb: Use UTF-16 code unit offsets in Range::to_string
Similar to another problem we had in CharacterData, we were assuming
that the offsets were raw utf8 byte offsets into the data, instead of
utf16 code units. Fix this by using the substring helpers in
CharacterData to get the text data from the Range.

There are more instances of this issue around the place that we will
need to track down and add tests for, but this fixes one of them :^)

For the test included in this commit, we were previously returning:

llo💨😮

Instead of the expected:

llo💨😮 Wo
2024-01-04 10:10:44 +01:00
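The offset mismatch fixed in e9dfa61588 can be illustrated with a short Python sketch. The helper name and the sample string are hypothetical (the log does not show the test's input); the point is only that each emoji occupies two UTF-16 code units, so Range offsets have to be applied to code units rather than to UTF-8 bytes.

```python
def utf16_substring(text: str, start: int, end: int) -> str:
    """Slice text using UTF-16 code unit offsets, as DOM Range offsets are defined."""
    units = text.encode("utf-16-le")  # exactly two bytes per code unit
    # errors="replace" turns a split surrogate pair into U+FFFD instead of raising.
    return units[2 * start : 2 * end].decode("utf-16-le", errors="replace")
```

For the assumed input `"Hello💨😮 World"`, `utf16_substring(text, 2, 12)` returns `"llo💨😮 Wo"`, because the two emoji account for four of the ten code units in the range.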
Shannon Booth
ee431e6911 LibWeb: Use WebIDL typedefs in Range/AbstractRange
In the public APIs which have their types exposed through IDL.
2024-01-04 10:10:44 +01:00
Aliaksandr Kalenik
b6123df492 LibWeb: Add support for start, center and end justify-content in GFC
Fixes https://github.com/SerenityOS/serenity/issues/22555
2024-01-04 09:47:20 +01:00
Aliaksandr Kalenik
56ff9bffae LibWeb: Support "normal" and "stretch" justify-content in CSS parser 2024-01-04 09:47:20 +01:00
Aliaksandr Kalenik
b395cfccb0 LibWeb: Add support for "align-content: normal" in CSS parser 2024-01-04 09:47:20 +01:00
Nico Weber
fa24fbf120 LibGfx/OpenType: Survive simple glyphs with 0 contours
These are valid per spec, and do sometimes occur in practice, e.g.
in embedded fonts in 0000550.pdf and 0000246.pdf in 0000.zip in the
PDFA test set.
2024-01-04 03:32:46 +01:00
Tim Schumacher
707a36dd79 LibCompress/Brotli: Update the lookback buffer with uncompressed data
We previously skipped updating the lookback buffer when copying
uncompressed data, which resulted in a wrong total byte count.
With a wrong total byte count, our decompressor implementation
ended up choosing a wrong offset into the dictionary.
2024-01-03 17:54:36 +01:00
Ali Mohammad Pur
c3167afa3a LibTLS: Notify the client for app data as soon as some data is available
Previously we were waiting until the socket was no longer immediately
readable to notify the client, resulting in large buffers and longer
latency.
2024-01-03 14:59:59 +01:00
Ali Mohammad Pur
b1297a267c LibCrypto: Avoid branching in galois_multiply()
This makes GHash a little more than twice as fast.
2024-01-03 14:59:59 +01:00
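To show what "avoid branching" in b1297a267c can look like, here is a mask-based GF(2^128) multiply in GCM bit order, written in Python. It is a sketch of the general technique and not LibCrypto's galois_multiply(): the integer representation is an assumption, and Python integers give no timing or speed guarantees, so the speedup quoted above applies only to the real implementation.

```python
MASK128 = (1 << 128) - 1
R = 0xE1 << 120  # GCM reduction constant for x^128 + x^7 + x^2 + x + 1


def galois_multiply(x: int, y: int) -> int:
    """Multiply two GF(2^128) elements in GCM bit order using masks, not ifs."""
    z, v = 0, y
    for i in range(128):
        bit = (x >> (127 - i)) & 1
        z ^= v & -bit                  # -bit is 0 or all ones
        v = (v >> 1) ^ (R & -(v & 1))  # reduce only when the low bit was set
    return z & MASK128
```

Each conditional XOR of the textbook algorithm is replaced by an AND with a mask that is either zero or all ones, so every iteration executes the same instructions.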
Andreas Kling
27a294547d LibTLS: Add segmentation to the application buffer to avoid memcpy churn
We were previously doing a *lot* of unnecessary memcpy work when
transferring large files.

This patch addresses the issue by introducing a simple segmented buffer
with no additional work when appending new data, or when transferring out
of the buffer.
2024-01-03 14:59:59 +01:00
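The segmented buffer idea from 27a294547d, sketched in Python. The class and method names are hypothetical and this is not the LibTLS data structure; it only shows why keeping chunks in a queue avoids re-copying data that is already buffered. Note that the final `join` here still copies once, which a real implementation might avoid.

```python
from collections import deque


class SegmentedBuffer:
    """Queue of byte chunks: appending and draining never re-copy older data."""

    def __init__(self) -> None:
        self._segments: deque[bytes] = deque()
        self._size = 0

    def append(self, chunk: bytes) -> None:
        if chunk:
            self._segments.append(chunk)  # stores a reference to the new chunk
            self._size += len(chunk)

    def transfer(self, max_bytes: int) -> bytes:
        """Remove and return up to max_bytes from the front of the buffer."""
        out = []
        while max_bytes > 0 and self._segments:
            head = self._segments.popleft()
            if len(head) > max_bytes:
                # Put the unread tail back; only this one segment is split.
                self._segments.appendleft(head[max_bytes:])
                head = head[:max_bytes]
            out.append(head)
            max_bytes -= len(head)
            self._size -= len(head)
        return b"".join(out)

    def __len__(self) -> int:
        return self._size
```

Compared with a single contiguous buffer, nothing already stored has to move when new data arrives or when the front is consumed.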
Andreas Kling
40f87f0954 LibWeb: Stop timers when finalizing a Window or WorkerGlobalScope
This avoids an assertion that timers are not active when destroyed.
2024-01-03 12:56:18 +01:00
MacDue
b4eb66d9fe LibGfx: Simplify condition
This is just an XOR. No behaviour change.
2024-01-03 12:56:01 +01:00
MacDue
db51e80d50 LibGfx: Fix typo 2024-01-03 12:56:01 +01:00
MacDue
a9502396ee LibGfx: Remove somewhat outdated comment
Most of these optimizations have been tried now, so this comment is a
bit misleading.
2024-01-03 12:56:01 +01:00
MacDue
096bdb142b LibGfx: Speed up filling solid colors in path rasterizer
For solid color fills (with alpha = 255), the rasterizer now tracks
spans of solid colors within a scanline and fills the entire span with
a single call to fast_u32_fill().

This gave up to a 1.5x speedup drawing the Ghostscript Tiger within
SerenityOS.
2024-01-03 12:56:01 +01:00
MacDue
2fa488cfa9 LibGfx: Skip horizontal edges in path rasterizer
Only the vertical parts of edges are plotted (then accumulated
horizontally). Fully horizontal edges won't be plotted (and just result
in NaNs).
2024-01-03 12:56:01 +01:00
Timothy Flynn
34160743dc LibIPC: Avoid redundant copy of every transferred IPC message
For every IPC message sent, we currently prepend the message size to the
IPC message buffer. This incurs the cost of copying the entire message
to its newly allocated position. Instead, reserve the bytes for the size
at the front of the buffer upon creation. Prevent dangerous access to
the buffer with specific public methods.
2024-01-03 10:17:00 +01:00
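A rough Python sketch of reserving the size bytes up front, as described in 34160743dc. The class shape, method names, and little-endian byte order are assumptions made for illustration and are not taken from IPC::MessageBuffer; the u32 limit check mirrors the one described in f2db700ae7.

```python
import struct

U32_MAX = 0xFFFF_FFFF


class MessageBuffer:
    """Byte buffer that reserves 4 leading bytes for a u32 length prefix."""

    def __init__(self) -> None:
        self._data = bytearray(4)  # placeholder for the size, filled in later

    def append(self, payload: bytes) -> None:
        self._data += payload

    def finalize(self) -> bytes:
        size = len(self._data) - 4
        if size > U32_MAX:
            raise OverflowError("message too large for a u32 size prefix")
        # Write the size into the reserved slot; the payload is not shifted.
        struct.pack_into("<I", self._data, 0, size)
        return bytes(self._data)
```

Because the slot exists from the start, the payload never has to be moved to make room for the prefix, which is the copy the commit removes.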
Timothy Flynn
f2db700ae7 LibIPC: Ensure message sizes do not exceed the limits of u32
We encode the size as a u32, so let's be sure the size does not exceed
that storage. This is unlikely to happen, but no reason not to check.
2024-01-03 10:17:00 +01:00
Timothy Flynn
91558fa381 LibIPC+LibWeb: Add an IPC helper to transfer an IPC message buffer
This large block of code is repeated nearly verbatim in LibWeb. Move it
to a helper function that both LibIPC and LibWeb can defer to. This will
let us make changes to this method in a singular location going forward.

Note this is a bit of a regression for the MessagePort. It now suffers
from the same performance issue that IPC messages face - we prepend the
message size to the message buffer. This degradation is very temporary
though, as a fix is imminent, and this change makes that fix easier.
2024-01-03 10:17:00 +01:00
Timothy Flynn
bf15b66117 LibIPC: Use a simpler encoding for arithmetic values
This is less code, but mostly serves to reduce the amount of methods to
be added to IPC::MessageBuffer in an upcoming patch.
2024-01-03 10:17:00 +01:00
Timothy Flynn
3adf01b816 LibIPC: Move MessageBuffer forward declaration from Stub.h to Forward.h
The type of MessageBuffer will be changing, and it was a bit awkward to
look around to find where the forward declaration was. This patch just
moves it to the obvious forwarding header.
2024-01-03 10:17:00 +01:00
Shannon Booth
fa1ef30985 LibWeb: Port Element::set_attribute_value from ByteString
Also making set_attribute_ns take a String instead of a FlyString as
this is only used as an Attr value and no FlyString properties are used.
2024-01-03 10:13:47 +01:00
Shannon Booth
285bca1633 LibWeb: Use Optional<FlyString> const& in Element and NamedNodeMap
This is enabled with the newly added IDL generator support for
FlyStrings.
2024-01-03 10:13:47 +01:00
Shannon Booth
f32185420d LibWeb: Use FlyString where possible in NamedNodeMap
We cannot port over Optional<FlyString> until the IDL generator supports
passing that through as an argument (as opposed to an Optional<String>).

Change to FlyString where possible, and resolve any fallout as a result.
2024-01-03 10:13:47 +01:00
Nico Weber
0bb0c7dac2 LibPDF: Scan for PDF file start in first 1024 bytes
Other readers do this too, and files depend on this.

Fixes opening these four files from the PDFA 0000.zip dataset:

* 0000015.pdf
  Starts with `C:\web\webeuncet\_cat\_docs\_publics\` before header
* 0000408.pdf
  Starts with UTF-8 BOM
* 0000524.pdf
  Starts with 867 bytes of HTML containing a PHP backtrace
* 0000680.pdf
  Starts with `C:\web\webeuncet\_cat\_docs\_publics\` too
2024-01-03 10:12:35 +01:00
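The header scan from 0bb0c7dac2 boils down to something like the following Python sketch. The function name is hypothetical, and whether the real code requires the whole header or only its start to fall within the first 1024 bytes is not stated in the log; this version requires the whole `%PDF-` prefix to fit.

```python
def find_pdf_header(data: bytes, window: int = 1024) -> int:
    """Return the offset of '%PDF-' within the first `window` bytes, or -1."""
    return data[:window].find(b"%PDF-")
```

A file starting with a UTF-8 BOM yields offset 3, and a file with 867 bytes of leading HTML yields 867; both would previously have been rejected for not starting with the header.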
Nico Weber
9495f64f91 LibPDF: Improve hex string parsing
A local (non-public) PDF I have lying around contains this in
a page's operator stream:

```
[<00b4003e> 3 <002600480051> 3 <005700550044004f0003> -29
<00330044> 3 <0055> -3 <004e0040> 4 <0003> -29 <004c00560003> -31
<0057004b> 4 <00480003> -37 <0050
>] TJ
```

That is, there's a newline in a hexstring after a character.

This led to `Parser error at offset 5184: Unexpected character`.

The spec says in 3.2.3 String Objects, Hexadecimal Strings:
"""Each pair of hexadecimal digits defines one byte of the string.
White-space characters (such as space, tab, carriage return, line feed,
and form feed) are ignored."""

But we didn't ignore whitespace before or after a character, only
in between the bytes.

The spec also says:
"""If the final digit of a hexadecimal string is missing—that is, if
there is an odd number of digits—the final digit is assumed to be 0."""

In that case, we were skipping the closing `>` twice -- or, more
accurately, we ignored the character after it too. This has been
wrong all the way back in #6974.

Add a test that fails if either of the two changes isn't present.
2024-01-02 22:13:21 +01:00
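The two hex string rules quoted in 9495f64f91 can be sketched in Python as below. This is an illustration of the spec text, not the LibPDF parser: the function name, the return shape, and the error handling are assumptions.

```python
PDF_WHITESPACE = b"\t\n\x0c\r "
HEX_DIGITS = b"0123456789abcdefABCDEF"


def parse_hex_string(data: bytes, pos: int = 0) -> tuple[bytes, int]:
    """Parse '<...>' starting at pos; return (decoded bytes, offset after '>')."""
    if data[pos:pos + 1] != b"<":
        raise ValueError("expected '<'")
    pos += 1
    digits = bytearray()
    while True:
        if pos >= len(data):
            raise ValueError("unterminated hex string")
        ch = data[pos]
        pos += 1
        if ch == ord(">"):
            break                # consume the closing '>' exactly once
        if ch in PDF_WHITESPACE:
            continue             # whitespace is ignored wherever it appears
        if ch not in HEX_DIGITS:
            raise ValueError(f"unexpected character at offset {pos - 1}")
        digits.append(ch)
    if len(digits) % 2:
        digits.append(ord("0"))  # a missing final digit is assumed to be 0
    return bytes.fromhex(digits.decode("ascii")), pos
```

With this shape, `<0050` followed by a newline and `>` parses to the two bytes 00 50, and the spec's odd-length example `<901FA>` parses to 90 1F A0 with the returned offset pointing just past the single `>`.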
Andreas Kling
0a05be69cf LibWeb: Update create_new_child_navigable() after spec fix
Now that https://github.com/whatwg/html/issues/9686 is fixed, let's
fix it the exact same way in our implementation. :^)
2024-01-02 21:47:36 +01:00
Aliaksandr Kalenik
49fcc5dcd8 LibWeb: Do not require box to be positioned to create stacking context
Instead of implementing stacking context painting order exactly as it
is defined in CSS2.2 "Appendix E. Elaborate description of Stacking
Contexts" we need to account for changes in the latest standards where
a box can establish a stacking context without being positioned, for
example, by having an opacity different from 1.

Fixes https://github.com/SerenityOS/serenity/issues/21137
2024-01-02 21:45:05 +01:00
Torstennator
82e85172e5 PixelPaint: Fix crash when started with path
This change fixes the initial tool selection when PixelPaint is started
with a path. Previously an already existing editor was expected when
the default tool was initially propagated - which was not the case if
PixelPaint was launched to directly load an existing image.
2024-01-02 17:14:38 +01:00
Lucas CHOLLET
4e09ee1f2f LibGfx/TIFF: Reject images that declare a sample with abnormal bit depth
Anything with a bit depth of zero or greater than 32 is outside our
working range, so let's reject them.
2024-01-02 06:52:50 -07:00
Lucas CHOLLET
ba84af7c22 LibGfx/TIFF: Move check on tag values in its own function
There is only one check for now, but the fuzzer has already found more
checks to add :^)
2024-01-02 06:52:50 -07:00
Shannon Booth
7067c5c972 LibWeb: Port TypeError in UnderlyingSource from ByteString 2024-01-02 10:01:26 +01:00
Shannon Booth
6b88fc2e05 LibWeb: Properly convert UnderlyingSource's autoAllocateChunkSize to u64
The JS::Value being passed through is not a bigint, and needs to be
converted using ConvertToInt, as per:

https://webidl.spec.whatwg.org/#es-unsigned-long-long

Furthermore, the IDL definition also specifies that this is associated
with the [EnforceRange] extended attribute.

This makes it actually possible to pass through an autoAllocateChunkSize
to the ReadableStream constructor without it throwing a TypeError.
2024-01-02 10:01:26 +01:00
Shannon Booth
99bf986889 LibWeb: Use unsigned long long for ReadableStreamBYOBRequest.respond
Now that the IDL generator supports this :^)
2024-01-02 10:01:26 +01:00
Shannon Booth
11371acfaf LibWeb/WebIDL: Implement ConvertToInt and IntegerPart AOs
These are used when converting JS::Values to integers in IDL, as opposed
to our current ad-hoc solution.
2024-01-02 10:01:26 +01:00
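For readers unfamiliar with the two abstract operations named in 11371acfaf, here is a simplified Python sketch for the `unsigned long long` case. It follows the Web IDL steps as I understand them, omits [Clamp], assumes the input has already been converted to a Number, and uses hypothetical function names; it is not the LibWeb implementation.

```python
import math


def integer_part(n: float) -> int:
    """Web IDL IntegerPart: truncate toward zero."""
    r = math.floor(abs(n))
    return -r if n < 0 else r


def convert_to_unsigned_long_long(x: float, enforce_range: bool = False) -> int:
    """Sketch of ConvertToInt for a 64-bit unsigned target."""
    if enforce_range:
        if math.isnan(x) or math.isinf(x):
            raise TypeError("value is not a finite number")
        value = integer_part(x)
        # For 64-bit types the enforced bounds are the safe-integer limits.
        if value < 0 or value > 2**53 - 1:
            raise TypeError("value is out of range")
        return value
    if math.isnan(x) or math.isinf(x) or x == 0:
        return 0
    return integer_part(x) % 2**64  # wrap modulo 2^64
```

The final modulo step is where a floating-point modulo on very large values comes into play in a C++ implementation; Python's arbitrary-precision integers sidestep that here.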
Shannon Booth
f1f369b6c6 LibWeb: Add IDL integer typedefs
To make it easier to work out what the correctly sized type should be,
instead of needing to consult the spec or IDL generator.
2024-01-02 10:01:26 +01:00
Shannon Booth
f589bedb0d LibJS: Improve JS::modulo precision for large floating values
JS::modulo was yielding a result of '0' for the input:
```
modulo(1., 18446744073709551616.)
```

Instead of the expected '1'.

As far as I can tell the reason for this is that the repeated calls to
fmod are losing precision in the calculation, leading to the wrong
result. Fix this by only calling fmod once, and preserving the negative
value behaviour by an 'if' check.

Without this, the LibWeb text test:
`/Streams/ReadableByteStream-enqueue-respond.html`

Would hang forever after using this function in the IDL conversion of a
u64 in ConvertToInt.

This should also be more efficient :^)
2024-01-02 10:01:26 +01:00
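The precision problem in f589bedb0d can be reproduced with a small Python sketch. The "before" version assumes the common double-fmod idiom, which is a guess at what "repeated calls to fmod" refers to. Both function names are hypothetical, and the sketch only handles a positive divisor.

```python
import math


def modulo_double_fmod(x: float, y: float) -> float:
    """Assumed 'before' shape: the intermediate sum fmod(x, y) + y can round."""
    return math.fmod(math.fmod(x, y) + y, y)


def modulo_single_fmod(x: float, y: float) -> float:
    """One fmod, then an 'if' that moves negative results into [0, y)."""
    result = math.fmod(x, y)
    if result < 0:
        result += y
    return result
```

For `x = 1.0` and `y = 18446744073709551616.0`, the first version computes `1 + 2**64`, which rounds to `2**64` as a double, so the outer fmod returns 0. The second version returns 1, the expected value from the commit message.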
Shannon Booth
986abe7047 LibJS: Rename IntlNumberIsNaNOrInfinity to NumberIsNaNOrInfinity
While only currently used in Intl in LibJS, this is a pretty generic
error and is useful elsewhere. Rename it to something more generic.
2024-01-02 10:01:26 +01:00
Shannon Booth
56ec36a9dc LibJS: Export MAX_ARRAY_LIKE_INDEX & NEGATIVE_ZERO_BITS in JS namespace 2024-01-02 10:01:26 +01:00
Kevin Meyer
f86ec46a6e Ladybird+LibWebView: Cleanup missing callbacks in InspectorClient
This was causing reproducible crashes when closing the inspector
window of Ladybird running on macOS.
2024-01-01 16:04:29 -05:00
Luke Wilde
6231aee761 LibWeb: Add missing DOMRectList::visit_edges 2024-01-01 18:41:14 +01:00
Luke Wilde
5af058d2b6 LibWeb: Only reload iframe on src/srcdoc attribute changes, not all
Fixes Cloudflare Turnstile suddenly going blank and stopping when it
changes the style attribute after doing some setup on the iframe.
2024-01-01 18:41:14 +01:00
Andreas Kling
6eeda29642 LibWeb: Paint 1x1 backgrounds as color fill instead of tiling bitmap
This yields a huge speedup on pages that use this weird but
not-entirely-uncommon technique.
2024-01-01 15:16:58 +01:00
Aliaksandr Kalenik
e8f04be3ae LibWeb/CSS: Fix crashing when calc() is used for border-radius
`BorderRadiusStyleValue::absolutized` should not try to extract a length
from a LengthPercentage that represents a calculated value.
2024-01-01 10:12:20 +01:00