1
Fork 0
mirror of https://github.com/RGBCube/serenity synced 2025-10-24 18:22:07 +00:00
Commit graph

31 commits

Author SHA1 Message Date
Andreas Kling
c34da16089 LibWeb: Make <script src> loads partially async (by following the spec)
Instead of firing up a network request and synchronously blocking for it
to finish via a nested event loop, we now start an asynchronous request
when encountering <script src>.

Once the script load finishes (or fails), it gets executed at one of the
synchronization points in the HTML parser.

This solves some long-standing issues with random unexpected events
getting dispatched in the middle of parsing.
2021-09-20 17:22:25 +02:00
Andreas Kling
e11ae33c66 LibWeb: Pop entire stack of open elements at the end of parsing 2021-09-20 17:22:25 +02:00
Andreas Kling
70398645f3 LibWeb: Improvements to error handling in HTML foreign content parsing
Follow the spec more closely when encountering an invalid start or end
tag during foreign content parsing.
2021-09-14 23:49:45 +02:00
Luke Wilde
f62477c093 LibWeb: Implement HTML fragment serialisation and use it in innerHTML
The previous implementation was about a half implementation and was
tied to Element::innerHTML. This separates it and puts it into
HTMLDocumentParser, as this is in the parsing section of the spec.

This provides a near finished HTML fragment serialisation algorithm,
bar namespaces in attributes and the `is` value.
2021-09-14 02:09:18 +02:00
Idan Horowitz
4629f2e4ad LibWeb: Add the Web::URL namespace and move URLEncoder to it
This namespace will be used for all interfaces defined in the URL
specification, like URL and URLSearchParams.

This has the unfortunate side-effect of requiring us to use the fully
qualified AK::URL name whenever we want to refer to the AK class, so
this commit also fixes all such references.
2021-09-13 01:43:10 +02:00
Andreas Kling
882c7b1295 LibWeb: Spin the event loop in HTML parser until scripts can run
Call HTML::EventLoop::spin_until() from the HTML parser when deciding
whether we can run a script yet.

Note that spin_until() actually doesn't do any work yet.
2021-09-09 02:30:54 +02:00
Max Wipfli
7eb294df0d LibWeb: Move HTMLToken in HTMLDocumentParser
This replaces a copy construction of an HTMLToken with a move(). This
allows HTMLToken to be made non-copyable in a further commit.
2021-07-17 16:24:57 +04:30
Max Wipfli
8b31e41692 LibWeb: Change HTMLToken::m_doctype into named DoctypeData struct
This is in preparation for an upcoming storage change of HTMLToken. In
contrast to the other token types, the accessor can hand out a mutable
reference to allow users to change parts of the DoctypeData easily.
2021-07-17 16:24:57 +04:30
Max Wipfli
918bde98b1 LibWeb: Hide implementation details of HTMLToken attribute list
Previously, HTMLToken would expose the Vector<Attribute> directly to
its users. In preparation for a future change, all users now use
implementation-agnostic APIs which do not expose the Vector directly.
2021-07-17 16:24:57 +04:30
Max Wipfli
15d8635afc LibWeb: User getter+setter for HTMLToken tag name and self-closing flag 2021-07-17 16:24:57 +04:30
Max Wipfli
e8e9426b4f LibWeb: User getter and setter for Comment type HTMLTokens 2021-07-17 16:24:57 +04:30
Gunnar Beutner
c3ad8e9a52 LibWeb: Remove StringBuilder from HTMLToken::m_comment_or_character 2021-07-14 23:03:36 +02:00
Gunnar Beutner
3aa202c432 LibWeb: Remove StringBuilder from HTMLToken::m_tag 2021-07-14 23:03:36 +02:00
Gunnar Beutner
901d71148b LibWeb: Remove StringBuilders from HTMLToken::AttributeBuilder 2021-07-14 23:03:36 +02:00
Gunnar Beutner
992964aa7d LibWeb: Remove StringBuilders from HTMLToken::m_doctype 2021-07-14 23:03:36 +02:00
Andreas Kling
ee3a73ddbb AK: Rename downcast<T> => verify_cast<T>
This makes it much clearer what this cast actually does: it will
VERIFY that the thing we're casting is a T (using is<T>()).
2021-06-24 19:57:01 +02:00
Max Wipfli
f808279769 LibWeb: Implement encoding sniffing algorithm
This patch implements the HTML specification's "encoding sniffing
algorithm", which is used when no encoding can be obtained from the
Content-Type header (either because it doesn't contain a charset=...)
value or the file has not been opened via HTTP (as with local files).

It also modifies the creator of the HTMLDocumentParser to use the new
HTMLDocumentParser::create_with_uncertain_encoding static method, which
runs the encoding sniffing algorithm before instantiating the parser.

This now allows us to load local HTML pages (or remote pages without a
charset specified in the 'Content-Type' header) with a non-UTF-8
encoding such as 'windows-1252'. This would previously crash the
browser. :^)
2021-05-18 21:02:07 +02:00
Max Wipfli
d325403cb5 LibTextCodec: Use Optional<String> for get_standardized_encoding
This patch changes get_standardized_encoding to use an Optional<String>
return type instead of just returning the null string when unable to
match the provided encoding to one of the canonical encoding names.

This is part of an effort to move away from using null strings towards
explicitly using Optional<String> to indicate that the String may not
have a value.
2021-05-18 21:02:07 +02:00
Brian Gianforcaro
f8fffe4613 LibWeb: Utilize SourceLocation for HTMLDocumentParser logging 2021-04-25 09:32:03 +02:00
Andreas Kling
b91c49364d AK: Rename adopt() to adopt_ref()
This makes it more symmetrical with adopt_own() (which is used to
create a NonnullOwnPtr from the result of a naked new.)
2021-04-23 16:46:57 +02:00
Brian Gianforcaro
1682f0b760 Everything: Move to SPDX license identifiers in all files.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.

See: https://spdx.dev/resources/use/#identifiers

This was done with the `ambr` search and replace tool.

 ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
2021-04-22 11:22:27 +02:00
Linus Groh
e4b3591ac4 LibWeb: Fix a TODO in the adoption agency algorithm
There's still a much bigger one at the end of the function though. :^)
2021-04-13 21:59:55 +02:00
Luke
d215578f55 LibWeb: Implement "select" portion of reset_the_insertion_mode_appropriately
Required by Dromaeo. With this, it no longer crashes.
2021-04-06 23:47:05 +02:00
Luke
b82a00d657 LibWeb: Use the new "ensure_pre_insertion_validity" in the HTML document parser
Previously we didn't check if we could insert the element in the
adjusted insertion location's parent.

Also makes the return type NonnullRefPtr, as that's what element is.
2021-04-06 21:42:00 +02:00
Andreas Kling
5d180d1f99 Everywhere: Rename ASSERT => VERIFY
(...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED)

Since all of these checks are done in release builds as well,
let's rename them to VERIFY to prevent confusion, as everyone is
used to assertions being compiled out in release.

We can introduce a new ASSERT macro that is specifically for debug
checks, but I'm doing this wholesale conversion first since we've
accumulated thousands of these already, and it's not immediately
obvious which ones are suitable for ASSERT.
2021-02-23 20:56:54 +01:00
AnotherTest
09a43969ba Everywhere: Replace dbgln<flag>(...) with dbgln_if(flag, ...)
Replacement made by `find Kernel Userland -name '*.h' -o -name '*.cpp' | sed -i -Ee 's/dbgln\b<(\w+)>\(/dbgln_if(\1, /g'`
2021-02-08 18:08:55 +01:00
Luke
3f5532d43e LibWeb: Flesh out prepare_script and execute_script
This fills in a bunch of the FIXMEs that was in prepare_script.

execute_script is almost finished, it's just missing the module side.

As an aside, let's not assert when inserting a script element with
innerHTML.
2021-01-29 08:49:50 +01:00
asynts
8465683dcf Everywhere: Debug macros instead of constexpr.
This was done with the following script:

    find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec sed -i -E 's/dbgln<debug_([a-z_]+)>/dbgln<\U\1_DEBUG>/' {} \;

    find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec sed -i -E 's/if constexpr \(debug_([a-z0-9_]+)/if constexpr \(\U\1_DEBUG/' {} \;
2021-01-25 09:47:36 +01:00
asynts
acdcf59a33 Everywhere: Remove unnecessary debug comments.
It would be tempting to uncomment these statements, but that won't work
with the new changes.

This was done with the following commands:

    find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec awk -i inplace '$0 !~ /\/\/#define/ { if (!toggle) { print; } else { toggle = !toggle } } ; $0 ~/\/\/#define/ { toggle = 1 }' {} \;

    find . \( -name '*.cpp' -o -name '*.h' -o -name '*.in' \) -not -path './Toolchain/*' -not -path './Build/*' -exec awk -i inplace '$0 !~ /\/\/ #define/ { if (!toggle) { print; } else { toggle = !toggle } } ; $0 ~/\/\/ #define/ { toggle = 1 }' {} \;
2021-01-25 09:47:36 +01:00
asynts
7d783d8b84 Everywhere: Replace a bundle of dbg with dbgln.
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.
2021-01-22 22:14:30 +01:00
Andreas Kling
13d7c09125 Libraries: Move to Userland/Libraries/ 2021-01-12 12:17:46 +01:00
Renamed from Libraries/LibWeb/HTML/Parser/HTMLDocumentParser.cpp (Browse further)