The specification says that parts labelled as a "fragment case" will
only occur when parsing a fragment. It says that if it occurs when
not parsing a fragment, then it is a specification error.
We should probably assume at this point that it's an implementation
error. This fixes a few little mistakes that were caught out by this.
Also moves the context element outside insertion mode reset,
as other (unimplemented) parts refer to it, such as
"adjusted current node".
Also cleans up insertion mode reset.
This allows us to determine which mode to render the page in.
Exposes "doctype" and "compatMode" on Document.
Exposes "name", "publicId" and "systemId" on DocumentType.
"width: 500" is not a valid CSS property in standards mode and should
be ignored.
To plumb the quirks-mode flag into CSS parsing, this patch adds a new
CSS::ParsingContext object that must be passed to the CSS parser.
Currently it only allows you to check the quirks-mode flag. In the
future it will be a good place to put additional information needed
for things like relative URL resolution, etc.
This narrows <div class=parser> on ACID2 to the correct width. :^)
This patch implements most of the HTML fragment parsing algorithm and
ports Element::set_inner_html() to it. This was the last remaining user
of the old HTML parser. :^)
Now that we're using the new HTML parser, we don't have to do the weird
"run the script when inserted into the document, uhh, or when the text
content of the script element changes" dance.
Instead, we just follow the spec, and scripts run the way they should.
<a href="/foo&=bar"> was being tokenized into <a href="/foo&=bar">.
The spec mentions this but I had overlooked it. The bug happens because
we interpreted the "&" as a named character reference.
The only remaining client of the old parser is the fragment parser used
by the Element.innerHTML setter. We'll need to implement a bit more
stuff in the new parser before we can switch that over.
The more generic virtual variant is renamed to node_name() and now only
Element has tag_name(). This removes a huge amount of String ctor/dtor
churn in selector matching.
Get rid of the weird old signature:
- int StringType::to_int(bool& ok) const
And replace it with sensible new signature:
- Optional<int> StringType::to_int() const
This makes us at least parse selectors like [foo=bar\ baz] correctly.
The current solution here is quite hackish but the real fix will come
when we implement a spec-compliant CSS parser.
There was a logic mistake in the entity parser that chose the shorter
matching entity instead of the longer. Fix this and make the entity
lists constexpr while we're here.
This patch introduces support for more than just "absolute px" units in
our Length class. It now also supports "em" and "rem", which are units
relative to the font-size of the current layout node and the <html>
element's layout node respectively.
When handling a "textarea" start tag, we have to ignore the next token
if it's an LF ('\n'). However, we were not switching the tokenizer
state before fetching the lookahead token, and this caused us to force
the tokenizer into the RCDATA state too late, effectively getting it
stuck in that state for way longer than it should be.
Fixes#2508.