mirror of https://github.com/RGBCube/serenity synced 2025-05-31 14:48:14 +00:00
serenity/Libraries/LibWeb/Parser
Andreas Kling 272b35d2e1 LibWeb: Begin work on a spec-compliant HTML parser
In order to actually view the web as it is, we're gonna need a proper
HTML parser. So let's build one!

This patch introduces the Web::HTMLTokenizer class, which currently
operates on a StringView input stream where it fetches (ASCII only atm)
codepoints and tokenizes according to the HTML spec tokenization algo.

The tokenizer state machine looks a bit weird but is written in a way
that tries to mimic the spec as closely as possible, in order to make
development easier and bugs less likely.

This initial version is far from finished, but it can parse a trivial
document with a DOCTYPE and open/close tags. :^)
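
As a rough illustration of the approach described above, here is a minimal
sketch of a spec-style tokenizer state machine over a string view. It is not
the actual Web::HTMLTokenizer code: the names Token, State and tokenize() are
hypothetical, std::string_view stands in for AK's StringView, and the state
set is heavily reduced (no attributes, comments, or character references, and
the spec's separate DOCTYPE state is folded into the step before the name).
Like the commit, it handles ASCII input only.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical token type; the real HTMLToken.h may look quite different.
struct Token {
    enum class Type { Doctype, StartTag, EndTag, Character };
    Type type;
    std::string data; // doctype name, tag name, or a single character
};

// A reduced subset of the tokenizer states named in the HTML spec.
enum class State { Data, TagOpen, EndTagOpen, TagName,
                   MarkupDeclarationOpen, BeforeDoctypeName, DoctypeName };

static bool is_alpha(char c) { return std::isalpha(static_cast<unsigned char>(c)) != 0; }
static bool is_space(char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; }
static char to_lower(char c) { return static_cast<char>(std::tolower(static_cast<unsigned char>(c))); }

// Case-insensitive check that `input` continues with `word` at offset `at`.
static bool matches_at(std::string_view input, std::size_t at, std::string_view word)
{
    if (input.size() - at < word.size())
        return false;
    for (std::size_t k = 0; k < word.size(); ++k)
        if (to_lower(input[at + k]) != to_lower(word[k]))
            return false;
    return true;
}

std::vector<Token> tokenize(std::string_view input)
{
    std::vector<Token> tokens;
    Token current { Token::Type::Character, "" };
    State state = State::Data;
    std::size_t i = 0;

    while (i < input.size()) {
        char c = input[i];
        switch (state) {
        case State::Data:
            if (c == '<')
                state = State::TagOpen;
            else
                tokens.push_back({ Token::Type::Character, std::string(1, c) });
            ++i;
            break;
        case State::TagOpen:
            if (c == '!') {
                state = State::MarkupDeclarationOpen;
                ++i;
            } else if (c == '/') {
                state = State::EndTagOpen;
                ++i;
            } else if (is_alpha(c)) {
                current = { Token::Type::StartTag, "" };
                state = State::TagName; // reconsume c in the new state
            } else {
                tokens.push_back({ Token::Type::Character, "<" });
                state = State::Data; // reconsume c in the new state
            }
            break;
        case State::EndTagOpen:
            if (is_alpha(c)) {
                current = { Token::Type::EndTag, "" };
                state = State::TagName; // reconsume c in the new state
            } else {
                ++i; // simplification: the spec has more cases here
                state = State::Data;
            }
            break;
        case State::TagName:
            if (c == '>') {
                tokens.push_back(current);
                state = State::Data;
            } else {
                current.data += to_lower(c);
            }
            ++i;
            break;
        case State::MarkupDeclarationOpen:
            if (matches_at(input, i, "DOCTYPE")) {
                i += 7;
                current = { Token::Type::Doctype, "" };
                state = State::BeforeDoctypeName;
            } else {
                ++i; // simplification: comments and CDATA are not handled
                state = State::Data;
            }
            break;
        case State::BeforeDoctypeName:
            if (is_space(c))
                ++i;
            else
                state = State::DoctypeName; // reconsume c in the new state
            break;
        case State::DoctypeName:
            if (c == '>') {
                tokens.push_back(current);
                state = State::Data;
            } else {
                current.data += to_lower(c);
            }
            ++i;
            break;
        }
    }
    return tokens;
}
```

Calling tokenize("<!DOCTYPE html><html></html>") on this sketch yields a
Doctype token, a StartTag token and an EndTag token, each carrying the name
"html". Keeping one switch case per spec state is what makes the code look a
bit unusual, but it lets each branch be checked against the matching
paragraph of the spec.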
2020-05-22 21:46:13 +02:00
CSSParser.cpp LibWeb: Update the CSS prefix to -libweb 2020-05-21 14:15:49 +02:00
CSSParser.h AK: Stop allowing implicit downcast with RefPtr and NonnullRefPtr 2020-04-05 11:19:00 +02:00
HTMLParser.cpp LibWeb: Parse &quot; into '"' 2020-05-21 12:27:08 +02:00
HTMLParser.h LibWeb: Handle iso-8859-1 web content a little bit better 2020-05-03 23:01:58 +02:00
HTMLToken.h LibWeb: Begin work on a spec-compliant HTML parser 2020-05-22 21:46:13 +02:00
HTMLTokenizer.cpp LibWeb: Begin work on a spec-compliant HTML parser 2020-05-22 21:46:13 +02:00
HTMLTokenizer.h LibWeb: Begin work on a spec-compliant HTML parser 2020-05-22 21:46:13 +02:00