serenity

mirror of https://github.com/RGBCube/serenity synced 2025-10-31 00:12:44 +00:00

Author	SHA1	Message	Date
Luke	201cc1bfcc	LibWeb: Assert we're parsing a fragment on fragment cases The specification says that parts labelled as a "fragment case" will only occur when parsing a fragment. It says that if it occurs when not parsing a fragment, then it is a specification error. We should probably assume at this point that it's an implementation error. This fixes a few little mistakes that were caught out by this. Also moves the context element outside insertion mode reset, as other (unimplemented) parts refer to it, such as "adjusted current node". Also cleans up insertion mode reset.	2020-07-22 00:02:40 +02:00
Andreas Kling	685e006e27	LibWeb: Use "namespace Web::Foo {" since C++20 allows it :^) Thanks @nico for teaching me about this!	2020-07-21 16:23:08 +02:00
Luke	19d6884529	LibWeb: Implement quirks mode detection This allows us to determine which mode to render the page in. Exposes "doctype" and "compatMode" on Document. Exposes "name", "publicId" and "systemId" on DocumentType.	2020-07-21 01:08:32 +02:00
Nico Weber	e9d18e35d6	LibWeb: Move "Stop parsing!" behind PARSER_DEBUG This makes SerenityOS's IRC client a lot less chatty.	2020-07-06 17:03:26 +02:00
Luke	2df69317f1	LibWeb: Implement almost all missing tokenizer cases	2020-06-28 16:56:26 +02:00
Andreas Kling	7d3c8d066f	LibWeb: Support "pt" length units :^)	2020-06-28 15:25:32 +02:00
Andreas Kling	38d6cc8598	LibWeb: Convert uppercase selector tag names to lowercase internally This is necessary for some older content to work correctly. There's probably a nicer (and correct-er) way to do this. Deferring to the new CSS parser.	2020-06-28 12:58:04 +02:00
Andreas Kling	9e642827fc	LibWeb: Don't tolerate unit-less lengths (except 0) in standards mode "width: 500" is not a valid CSS property in standards mode and should be ignored. To plumb the quirks-mode flag into CSS parsing, this patch adds a new CSS::ParsingContext object that must be passed to the CSS parser. Currently it only allows you to check the quirks-mode flag. In the future it will be a good place to put additional information needed for things like relative URL resolution, etc. This narrows <div class=parser> on ACID2 to the correct width. :^)	2020-06-28 12:46:40 +02:00
Kevin Meyer	22b20c381f	LibWeb: Implement remaining missing tokenizer EOF cases	2020-06-27 13:27:10 +02:00
Andreas Kling	8e6522d034	LibWeb: Implement some missing tokenizer cases for EOF handling	2020-06-26 22:47:07 +02:00
theazgra	6a401a9bde	LibWeb: Remove duplicate if branch in fragment parsing. I noticed in the video the duplicate `if` check. This commit removes the duplicated branch.	2020-06-26 11:58:53 +02:00
Andreas Kling	6293d1a13c	LibWeb+Browser: Remove old HTML parser :^) The new parser is now used everywhere and it's working pretty well!	2020-06-26 00:53:25 +02:00
Andreas Kling	92d831c25b	LibWeb: Implement fragment parsing and use it for Element.innerHTML This patch implements most of the HTML fragment parsing algorithm and ports Element::set_inner_html() to it. This was the last remaining user of the old HTML parser. :^)	2020-06-26 00:53:25 +02:00
Andreas Kling	3fefc7f3e9	LibWeb: Tweak CSS parser to swallow backslash-escaped characters This isn't the correct way of doing this, but at least it allows the parsing to progress a bit further in some cases.	2020-06-25 16:52:38 +02:00
Andreas Kling	4b2ac34725	LibWeb: Move the offset, margin and padding boxes into LayoutStyle	2020-06-24 18:06:21 +02:00
Andreas Kling	5744dd43c5	LibWeb: Remove default Length constructor and add make_auto()/make_px() To prepare for adding an undefined/empty state for Length, let's first move away from Length() creating an auto value.	2020-06-24 11:08:46 +02:00
Andreas Kling	d0312f6208	LibWeb: Handle empty inputs to the CSS parser Empty inputs -> empty outputs.	2020-06-23 20:06:45 +02:00
Andreas Kling	3a5af6ef61	LibWeb: Remove hacky old ways of running <script> element contents Now that we're using the new HTML parser, we don't have to do the weird "run the script when inserted into the document, uhh, or when the text content of the script element changes" dance. Instead, we just follow the spec, and scripts run the way they should.	2020-06-23 16:45:01 +02:00
Andreas Kling	c33d17d363	LibWeb: Fix tokenization of attributes with URL query strings in them <a href="/foo&amp=bar"> was being tokenized into <a href="/foo&=bar">. The spec mentions this but I had overlooked it. The bug happens because we interpreted the "&amp" as a named character reference.	2020-06-23 16:45:01 +02:00
Andreas Kling	07d976716f	LibWeb: Remove most uses of the old HTML parser The only remaining client of the old parser is the fragment parser used by the Element.innerHTML setter. We'll need to implement a bit more stuff in the new parser before we can switch that over.	2020-06-21 22:29:05 +02:00
Andreas Kling	dd7cd92de4	LibWeb: Fix two typo bugs in table parsing These were flushed out by the earlier fix to "table scope". Without the bad implementation of table scopes, ACID2 stopped parsing correctly.	2020-06-21 17:49:02 +02:00
Andreas Kling	15b5dfc794	LibWeb: A </table> inside <tbody> is not a parse error This condition was backwards. Fixes parsing of google.com.	2020-06-21 17:42:00 +02:00
Andreas Kling	1c2b6b074e	LibWeb: Fix misunderstood implementation of "table" and "select" scopes These "stack of open elements" scopes are not supposed to include the base list of element types.	2020-06-21 17:42:00 +02:00
Andreas Kling	966bc05fef	LibWeb: Implement more of the foster parenting algorithm in the parser	2020-06-21 17:42:00 +02:00
stelar7	5eb39a5f61	LibWeb: Update parser with more insertion modes :^) Implements handling of InHeadNoScript, InSelectInTable, InTemplate, InFrameset, AfterFrameset, and AfterAfterFrameset.	2020-06-21 10:13:31 +02:00
Andreas Kling	6242e029ed	LibWeb: Make Element::tag_name() return a const FlyString& The more generic virtual variant is renamed to node_name() and now only Element has tag_name(). This removes a huge amount of String ctor/dtor churn in selector matching.	2020-06-16 19:09:14 +02:00
Andreas Kling	49cd03be95	LibWeb: Fix broken parsing of </form> during "in body" insertion	2020-06-15 20:31:19 +02:00
Andreas Kling	2f26d4c6a1	LibWeb: Fix broken parsing of </select> during "in select" insertion	2020-06-15 19:57:20 +02:00
Andreas Kling	17d26b92f8	LibWeb: Just ignore <script> elements that failed to load the script We're never gonna be able to run them if we can't load them so just let it go.	2020-06-15 18:37:48 +02:00
Luke	a01478c858	LibWeb: Fully implement HTML parser "in table" insertion mode Also fixes some little mistakes in the "in body" insertion mode that I found whilst cross-referencing.	2020-06-14 14:07:07 +02:00
Luke	6532c1e2fa	LibWeb: Implement HTML parser "in column group" insertion mode	2020-06-14 14:07:07 +02:00
Luke	2241b09cd0	LibWeb: Implement HTML parser "in caption" insertion mode	2020-06-14 14:07:07 +02:00
Luke	a1838f676e	LibWeb: Implement all CDATA tokenizer states Even though we haven't implemented any switches to these states yet, we may as well have them ready for when we do implement the switches.	2020-06-14 13:47:19 +02:00
Luke	821312729a	LibWeb: Fully implement all DOCTYPE tokenizer states Also fixes TagOpen having a separate emit and reconsume in ANYTHING_ELSE.	2020-06-14 13:47:19 +02:00
Luke	ab1df177d8	LibWeb: Fully implement all comment tokenizer states	2020-06-14 13:47:19 +02:00
Andreas Kling	47df0cbbc8	LibWeb: Fix broken tokenization of hexadecimal character references We were interpreting 'A'-'F' as decimal digits which didn't work right.	2020-06-13 13:46:12 +02:00
Andreas Kling	483b371a7b	LibWeb: Parse and match the :visited pseudo-class (always fails) If we don't do this, something like "a:visited" is parsed as "a" which may then take precedence over a previous "a:link" etc.	2020-06-13 00:23:30 +02:00
Andreas Kling	fdfda6dec2	AK: Make string-to-number conversion helpers return Optional Get rid of the weird old signature: - int StringType::to_int(bool& ok) const And replace it with sensible new signature: - Optional<int> StringType::to_int() const	2020-06-12 21:28:55 +02:00
Andreas Kling	bd33bfd120	LibWeb: Whine about unrecognized CSS properties in debug log	2020-06-12 14:15:55 +02:00
Andreas Kling	03da686aa2	LibWeb: Ignore backslashes (\) in attribute selectors This makes us at least parse selectors like [foo=bar\ baz] correctly. The current solution here is quite hackish but the real fix will come when we implement a spec-compliant CSS parser.	2020-06-10 15:50:07 +02:00
Andreas Kling	65c4e5cacf	LibWeb: Parse and match basic "contains" attribute selectors (~=)	2020-06-10 15:43:41 +02:00
Andreas Kling	e836f09094	LibWeb: Fix parser interpreting "&quot;" as "&quot" There was a logic mistake in the entity parser that chose the shorter matching entity instead of the longer. Fix this and make the entity lists constexpr while we're here.	2020-06-10 10:34:28 +02:00
Andreas Kling	9b17bf3dcd	LibWeb: Use HTML::TagNames globals in the new HTML parser	2020-06-07 23:53:16 +02:00
Andreas Kling	1d94ca7cfc	LibWeb: Fix codepoint_from_entity() never returning an error If we don't find a matching entity, return an empty Optional.	2020-06-07 19:13:56 +02:00
Andreas Kling	ab4c03ce2d	LibWeb: Fix tokenizer swallowing an extra token after a named entity	2020-06-07 19:09:03 +02:00
Andreas Kling	731685468a	LibWeb: Start fleshing out support for relative CSS units This patch introduces support for more than just "absolute px" units in our Length class. It now also supports "em" and "rem", which are units relative to the font-size of the current layout node and the <html> element's layout node respectively.	2020-06-07 17:55:46 +02:00
Andreas Kling	be6abce44f	LibWeb: Handle EOF tokens during "text" insertion	2020-06-06 16:36:18 +02:00
Luke	61d5bec739	LibWeb: Fully implement all script tokenizer states Also fixes RAWTEXTLessThanSign having a separate emit and reconsume.	2020-06-06 09:55:15 +02:00
Andreas Kling	3337365000	LibWeb: Parse param/source/track start tags during "in body" insertion	2020-06-05 21:59:46 +02:00
Andreas Kling	b4591f0037	LibWeb: Fix parsing of "<textarea></textarea>" When handling a "textarea" start tag, we have to ignore the next token if it's an LF ('\n'). However, we were not switching the tokenizer state before fetching the lookahead token, and this caused us to force the tokenizer into the RCDATA state too late, effectively getting it stuck in that state for way longer than it should be. Fixes #2508.	2020-06-05 12:05:42 +02:00

185 commits