1
Fork 0
mirror of https://github.com/RGBCube/serenity synced 2025-10-26 20:52:34 +00:00
Commit graph

67 commits

Author SHA1 Message Date
Timothy Flynn
f5f1a5228e LibWeb: Escape HTML text fragments with multi-byte code point awareness
The UTF-8 encoding of U+00A0 (NBSP) is the bytes 0xc2 0xa0. By looping
over the string to escape byte-by-byte, we replace the second byte with
" ", but leave the first byte in the resulting text. This creates
an invalid UTF-8 string, with a lone leading byte.
2023-03-13 07:29:40 +00:00
Andreas Kling
a504ac3e2a Everywhere: Rename equals_ignoring_case => equals_ignoring_ascii_case
Let's make it clear that these functions deal with ASCII case only.
2023-03-10 13:15:44 +01:00
Matthew Olsson
c0b2fa74ac LibWeb: Fix a few const-ness issues 2023-03-06 13:05:43 +00:00
Kenneth Myhra
ff92324fa5 LibWeb: Make factory method of DOM::ElementFactory fallible 2023-02-22 09:55:33 +01:00
Kenneth Myhra
c120c46acc LibWeb: Make factory methods of DOM::Event fallible
Because of interdependencies between DOM::Event and UIEvents::MouseEvent
to template function fire_an_event() in WebDriverConnection.cpp, the
commit: 'LibWeb: Make factory methods of UIEvents::MouseEvent fallible'
have been squashed into this commit.
2023-02-18 00:52:47 +01:00
Kenneth Myhra
0d9076c9f5 LibWeb: Make factory methods of DOM::Document fallible 2023-02-18 00:52:47 +01:00
Timothy Flynn
b75b7f0c0d LibJS+Everywhere: Propagate Cell::initialize errors from Heap::allocate
Callers that are already in a fallible context will now TRY to allocate
cells. Callers in infallible contexts get a FIXME.
2023-01-29 00:02:45 +00:00
Timothy Flynn
f3db548a3d AK+Everywhere: Rename FlyString to DeprecatedFlyString
DeprecatedFlyString relies heavily on DeprecatedString's StringImpl, so
let's rename it to A) match the name of DeprecatedString, B) write a new
FlyString class that is tied to String.
2023-01-09 23:00:24 +00:00
Andreas Kling
a915fee5f3 LibWeb: Only log HTML parser errors when HTML_PARSER_DEBUG is enabled
At this point, the parser is reliable enough that we don't need to spam
the debug log about minor parsing issues on every websites.
2023-01-09 14:00:26 +01:00
Linus Groh
22089436ed LibJS: Convert Heap::allocate{,_without_realm}() to NonnullGCPtr 2022-12-15 06:56:37 -05:00
Linus Groh
2a66fc6cae LibJS: Add make_handle({Nonnull,}GCPtr<T>) overloads 2022-12-15 06:56:37 -05:00
Linus Groh
57dc179b1f Everywhere: Rename to_{string => deprecated_string}() where applicable
This will make it easier to support both string types at the same time
while we convert code, and tracking down remaining uses.

One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
2022-12-06 08:54:33 +01:00
Linus Groh
6e19ab2bbc AK+Everywhere: Rename String to DeprecatedString
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
2022-12-06 08:54:33 +01:00
Timothy Flynn
99d8c115a0 LibWeb: Remove outdated FIXME regarding application cache selection
This algorithm, and window.applicationCache, was removed from the spec:
e4330d5

This also adds a spec link and comments to the affected parser method.
2022-11-29 19:04:31 +01:00
Baitinq
2f16198bd6 LibWeb: Remove unused should_invalidate_styles_on_attribute_changes()
This getter and setter were previously labelled as a "hack" and used to
disable style invalidation on attribute changes during the HTML parsing
phase (as it caused big sites's loading to be slow). These functions
are currently not used, so they can be removed:^)
2022-11-21 10:12:07 +01:00
Andreas Kling
b21b27fda3 LibWeb: Update the HTML parser part that deals with text in <script>
This commit adds inline spec comments to the part of the parser that
ends up calling HTMLScriptElement::prepare().

The code is tweaked to match the spec more closely.
2022-11-21 10:08:50 +01:00
Andreas Kling
7d45927d41 LibWeb: Rename HTMLScriptElement "non-blocking" to "force async"
This has been renamed in the spec, so let's do it here too.
2022-11-21 10:08:50 +01:00
MacDue
8a5d2be617 Everywhere: Remove unnecessary mutable attributes from lambdas
These lambdas were marked mutable as they captured a Ptr wrapper
class by value, which then only returned const-qualified references
to the value they point from the previous const pointer operators.

Nothing is actually mutating in the lambdas state here, and now
that the Ptr operators don't add extra const qualifiers these
can be removed.
2022-11-19 14:37:31 +00:00
Andreas Kling
e9eba66361 LibWeb: Adjust foreignobject to foreignObject in HTML parser
This conversion was missing from the table we copied from the spec
for whatever reason.
2022-11-16 13:01:21 +01:00
Linus Groh
acfb546048 LibWeb: Handle currently ignored WebIDL::ExceptionOr<T>s 2022-10-31 14:12:44 +00:00
Andreas Kling
5530040b3c LibWeb: Annotate and simplify the HTML fragment parsing algorithm
This patch adds inline spec comments, and then adjusts the code a bit
so it reads more like the spec.
2022-10-29 15:16:57 +02:00
Andreas Kling
6e0f80fbe0 LibWeb: Make the HTMLParser GC-allocated
This prevents a reference cycle between a HTMLParser opened via
document.open() and the document. It was one of many things keeping
some documents alive indefinitely.
2022-10-20 15:16:23 +02:00
Andreas Kling
9869405802 LibWeb: Put HTML parser encoding sniffing debug logging behind a flag 2022-10-10 20:22:50 +02:00
Linus Groh
32ad939e44 LibWeb: Rename HighResolutionTime/{CoarsenTime => TimeOrigin}.cpp/h
This is being used for more than just time coarsening now, so let's use
the spec's section title for the name.
2022-10-05 09:12:59 +01:00
Linus Groh
4ea6cc56be LibWeb: Move unsafe_shared_current_time() to HighResolutionTime
This doesn't belong on the EventLoop at all, as far as I can tell.
2022-10-05 09:12:59 +01:00
Linus Groh
fb21271334 LibWeb: Replace incorrect uses of AK::is_ascii_space() 2022-10-02 21:32:49 +02:00
Andrew Kaster
f0c5f77f99 LibWeb: Remove unecessary dependence on Window from HTML classes
These classes only needed Window to get at its realm. Pass a realm
directly to construct HTML classes.
2022-10-01 21:05:32 +01:00
Luke Wilde
7b8a6b8e7a LibWeb: Set HTMLParser::m_scripting_enabled as according to the spec
This allows <noscript> elements to display their content as proper HTML
instead of raw text when scripting is disabled.
2022-09-23 22:25:09 +01:00
Andreas Kling
797d28adca LibWeb: Save begin/end timestamps for load and DOMContentLoaded events 2022-09-21 11:51:18 +02:00
Andreas Kling
ab8432783e LibWeb: Implement aborting the HTML parser
This is roughly on-spec, although I had to invent a simple "aborted"
state for the tokenizer.
2022-09-20 23:44:59 +02:00
Andreas Kling
88f2f50c55 LibWeb: Don't use the internal window object when parsing HTML fragments
Instead, use the window object from the context element. This fixes an
issue where activating event handlers during fragment parsing would try
to set up callbacks using the internal window object's ESO.

This caused a verify_cast crash on Google Maps, since the internal realm
doesn't have an associated ESO. Perhaps it should, but in this specific
case, it makes more sense for fragment parsing to fully adopt the
context provided.
2022-09-06 01:12:44 +02:00
Andreas Kling
6f433c8656 LibWeb+LibJS: Make the EventTarget hierarchy (incl. DOM) GC-allocated
This is a monster patch that turns all EventTargets into GC-allocated
PlatformObjects. Their C++ wrapper classes are removed, and the LibJS
garbage collector is now responsible for their lifetimes.

There's a fair amount of hacks and band-aids in this patch, and we'll
have a lot of cleanup to do after this.
2022-09-06 00:27:09 +02:00
Andreas Kling
7c3db526b0 LibWeb: Make DOM::Event and all its subclasses GC-allocated 2022-09-06 00:27:09 +02:00
sin-ack
3f3f45580a Everywhere: Add sv suffix to strings relying on StringView(char const*)
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).

No functional changes.
2022-07-12 23:11:35 +02:00
Andreas Kling
1956c52c68 LibWeb: Remove unused HTML::parse_html_document() 2022-04-06 19:35:07 +02:00
Idan Horowitz
086969277e Everywhere: Run clang-format 2022-04-01 21:24:45 +01:00
Ali Mohammad Pur
5a0123fd2f LibWeb: Load X(HT)ML documents and transform them into HTML DOM 2022-03-28 23:11:48 +02:00
Andreas Kling
fda25f9505 LibWeb: Move HTML dimension value parsing from CSS to HTML namespace
These are part of HTML, not CSS, so let's not confuse things.
2022-03-26 17:31:01 +01:00
Idan Horowitz
5626e1b324 LibWeb: Rename PARSER_DEBUG => HTML_PARSER_DEBUG
Since this macro was created we gained a couple more parsers in the
system :^)
2022-03-24 21:37:49 +01:00
Timothy Flynn
5608bc4eaf LibWeb: Remove inheritance of FormAssociatedElement from HTMLElement
HTMLObjectElement will need to be both a FormAssociatedElement and a
BrowsingContextContainer. Currently, both of these classes inherit from
HTMLElement. This can work in C++, but is generally frowned upon, and
doesn't play particularly well with the rest of LibWeb.

Instead, we can essentially revert commit 3bb5c62 to remove HTMLElement
from FormAssociatedElement's hierarchy. This means that objects such as
HTMLObjectElement individually inherit from FormAssociatedElement and
HTMLElement now.

Some caveats are:

* FormAssociatedElement still needs to know when the HTMLElement is
  inserted into and removed from the DOM. This hook is automatically
  injected via a macro now, while still allowing classes like
  HTMLInputElement to also know when the element is inserted.

* Casting from a DOM::Element to a FormAssociatedElement is now a
  sideways cast, rather than directly following an inheritance chain.
  This means static_cast cannot be used here; but we can safely use
  dynamic_cast since the only 2 instances of this already use RTTI to
  verify the cast.
2022-03-24 03:35:11 +01:00
Simon Wanner
1d95745901 LibWeb: Implement the rest of the Adoption Agency Algorithm
This gets us 2 points on html5test.com :^)
- Before: https://html5te.st/4cf57659bc08272e (208)
- After: https://html5te.st/fb8a9259bda1c115 (210)
2022-03-20 02:52:37 +01:00
Andreas Kling
cbd343dced LibWeb: Only delay "load" event for script elements that load something
We shouldn't delay the load event for scripts that we're completely
refusing to run anyway. Also, for scripts that have inline text content,
we don't need to delay them either, as they will become ready before
returning from "prepare script".

This makes the "load" event finally fire on lots of websites, including
Wikipedia. :^)
2022-03-19 16:11:36 +01:00
Andreas Kling
2c9dfadb21 LibWeb: Don't delay document "load" event for unclosed script tags
We previously had a bug where markup with unclosed script tags caused
the document load event to be delayed indefinitely. Fix this by only
marking script elements as delaying the load event once we encounter
the script end tag.
2022-03-19 15:04:48 +01:00
Idan Horowitz
c575710e5e LibWeb: Use inline script tag source line as javascript line offset
This makes JS exception line numbers meaningful for inline script tags.
2022-03-14 00:25:33 +01:00
Linus Groh
1422bd45eb LibWeb: Move Window from DOM directory & namespace to HTML
The Window object is part of the HTML spec. :^)
https://html.spec.whatwg.org/multipage/window-object.html
2022-03-08 00:30:30 +01:00
Luke Wilde
46c0d0f7ae LibWeb: Associate form elements with a form in parsing and dynamically
This makes it available for all form associated elements and not just
select and input elements. It also makes it more spec compliant,
especially around the form attribute.

The main thing missing is re-associating form elements with a form
attribute when the form attribute changes or an element with an ID
is inserted/removed or has its ID changed.
2022-03-01 23:19:41 +01:00
Andreas Kling
8b2499b112 LibWeb: Make document.write() work while document is parsing
This necessitated making HTMLParser ref-counted, and having it register
itself with Document when created. That makes it possible for scripts to
add new input at the current parser insertion point.

There is now a reference cycle between Document and HTMLParser. This
cycle is explicitly broken by calling Document::detach_parser() at the
end of HTMLParser::run().

This is a huge progression on ACID3, from 31% to 49%! :^)
2022-02-21 22:00:28 +01:00
Lorenz Steinert
db789813c9 LibWeb: Add basic support for dynamic markup insertion
This implements basic support for dynamic markup insertion, adding
 * Document::open()
 * Document::write(Vector<String> const&)
 * Document::writeln(Vector<String> const&)
 * Document::close()

The HTMLParser is modified to make it possible to create a
script-created parser which initially only contains a HTMLTokenizer
without any data. Aditionally the HTMLParser::run method gains an
overload which does not modify the Document and does not run
HTMLParser::the_end() so that we can reenter the parser at a later time.
Furthermore all FIXMEs that consern the insertion point are implemented
wich is defined in the HTMLTokenizer. Additionally the following
member-variables of the HTMLParser are now exposed by getter funcions:
 * m_tokenizer
 * m_aborted
 * m_script_nesting_level

The HTMLTokenizer is modified so that it contains an insertion
point which keeps track of where the next input from the Document::write
functions will be inserted. The insertion point is implemented as the
charakter offset into m_decoded_input and a boolean describing if the
insertion point is defined. Functions to update, check and {re}store the
insertion point are also added.
The function HTMLTokenizer::insert_eof is added to tell a script-created
parser that document::close was called and HTMLParser::the_end() should
be called.
Lastly an explicit default constructor is added to HTMLTokenizer to
create a empty HTMLTokenizer into which data can be inserted.
2022-02-21 18:26:43 +01:00
Luke Wilde
9845164f6a LibWeb: Handle markers when reconstructing active formatting elements
The entry we get from the active formatting elements list during the
Rewind step of "reconstruct the active formatting elements" can be a
marker. Previously we assumed it was not a marker, which can trigger
an assertion failure with certain malformed HTML.

If the entry in this step is a marker, the spec simply ignores it.
This is step 6 of the algorithm.

This also makes the index unsigned, as this algorithm is a no-op if
the list is empty.

Additionally, this also adds spec comments to this algorithm.

Fixes #12668.
2022-02-20 10:59:42 +01:00
Linus Groh
06948df393 LibWeb: Fail gracefully when reaching the unimplemented part of the AAA
Pages such as https://html5test.com are testing all sorts of weird,
incomplete, and wrong HTML but can be useful or at least interesting for
development - let's try to avoid crashing the process.
2022-02-15 23:24:34 +01:00