1
Fork 0
mirror of https://github.com/RGBCube/serenity synced 2025-10-26 07:32:07 +00:00
Commit graph

48 commits

Author SHA1 Message Date
Linus Groh
69845ae460 LibJS: "-->" preceded by token on same line isn't start of HTML-like comment
B.1.3 HTML-like Comments

The syntax and semantics of 11.4 is extended as follows except that this
extension is not allowed when parsing source code using the goal symbol
Module:

Syntax (only relevant part included)

    SingleLineHTMLCloseComment ::
        LineTerminatorSequence HTMLCloseComment

    HTMLCloseComment ::
        WhiteSpaceSequence[opt] SingleLineDelimitedCommentSequence[opt] --> SingleLineCommentChars[opt]

Fixes #3810.
2020-10-29 22:28:15 +01:00
Linus Groh
6a3389cec6 LibJS: Emit token message for invalid numeric literals 2020-10-26 21:38:34 +01:00
Linus Groh
19edcbd79c LibJS: Emit TokenType::Invalid for unterminated multi-line comments 2020-10-26 21:38:34 +01:00
Linus Groh
03c1d43f6e LibJS: Add message string to Token
This allows us to communicate details about invalid tokens to the parser
without having to invent a bunch of specific invalid tokens like
TokenType::InvalidNumericLiteral.
2020-10-26 21:38:34 +01:00
Linus Groh
15642874f3 LibJS: Support all line terminators (LF, CR, LS, PS)
https://tc39.es/ecma262/#sec-line-terminators
2020-10-22 10:06:30 +02:00
Linus Groh
46cc1f718e LibJS: Unprefixed octal numbers are a syntax error in strict mode 2020-10-19 20:08:22 +02:00
Stephan Unverwerth
2c888b3c6e LibJS: Fix parsing of invalid numeric literals
i.e. "1e" "0x" "0b" "0o" used to be parsed as valid literals.
They now produce invalid tokens. Fixes #3716
2020-10-18 15:38:57 +02:00
Andreas Kling
d8269c343c LibJS: Avoid creating temporary Strings to look up tokens while lexing
It would be cool to solve this in a general way so that looking up
a string literal or StringView in a HashMap with String keys avoids
creating a temp string.

For now, this patch simply addresses the issue in JS::Lexer.
This is a 2-3% speed-up on test-js.
2020-10-17 23:44:41 +02:00
Linus Groh
aa71dae03c LibJS: Implement logical assignment operators (&&=, ||=, ??=)
TC39 proposal, stage 4 as of 2020-07.
https://tc39.es/proposal-logical-assignment/
2020-10-05 17:57:26 +02:00
Ben Wiederhake
5d3c437cce LibJS: Fix start position of multi-line tokens
This broke in case of unterminated regular expressions, causing goofy location
numbers, and 'source_location_hint' to eat up all memory:

Unexpected token UnterminatedRegexLiteral. Expected statement (line: 2, column: 4294967292)
2020-09-12 00:13:29 +02:00
Matthew Olsson
bc1c556755 LibJS: Move regex logic to main Lexer if statement
This prevents a regex such as /=/ from lexing into TokenType::SlashEquals,
preventing the regex logic from working.
2020-06-08 09:18:27 +02:00
Matthew Olsson
cc7462daeb LibJS: Properly consume escaped backslash in regex literal 2020-06-08 09:18:27 +02:00
Matthew Olsson
1fadde2483 LibJS: Fix big int division lexing as UnterminatedRegexLiteral 2020-06-07 20:05:16 +02:00
Linus Groh
0ff9d7e189 LibJS: Add BigInt 2020-06-07 19:29:40 +02:00
Matthew Olsson
61ac1d3ffa LibJS: Lex and parse regex literals, add RegExp objects
This adds regex parsing/lexing, as well as a relatively empty
RegExpObject. The purpose of this patch is to allow the engine to not
get hung up on parsing regexes. This will aid in finding new syntax
errors (say, from google or twitter) without having to replace all of
their regexes first!
2020-06-07 19:06:55 +02:00
Paul Redmond
11405c5139
LibJS: Fix incorrect token column values (#2401)
- initializing m_line_column to 1 in the lexer results in incorrect
  column values in tokens on the first line of input.
- not incrementing m_line_column when EOF is reached results in
  an incorrect column value on the last token.
2020-05-26 19:00:30 +02:00
Linus Groh
00b61a212f LibJS: Remove syntax errors from lexer
Giving the lexer the ability to generate errors adds unnecessary
complexity - also it only calls its syntax_error() function in one place
anyway ("unterminated string literal"). But since the lexer *also* emits
tokens like Eof or UnterminatedStringLiteral, it should be up to the
consumer of these tokens to decide what to do.

Also remove the option to not print errors to stderr as that's not
relevant anymore.
2020-05-15 09:53:52 +02:00
Linus Groh
1383cd23bc LibJS: Add missing keywords/tokens
Some of these are required for syntax we have not implemented yet, some
are future reserved words in strict mode.
2020-05-12 18:47:38 +02:00
Linus Groh
a2e1f1a872 LibJS: Implement exponentiation assignment operator (**=) 2020-05-05 11:12:27 +02:00
Linus Groh
3e754a15d4 LibJS: Implement bitwise assignment operators (&=, |=, ^=) 2020-05-05 11:12:27 +02:00
mattco98
adb4accab3 LibJS: Add template literals
Adds fully functioning template literals. Because template literals
contain expressions, most of the work has to be done in the Lexer rather
than the Parser. And because of the complexity of template literals
(expressions, nesting, escapes, etc), the Lexer needs to have some
template-related state.

When entering a new template literal, a TemplateLiteralStart token is
emitted. When inside a literal, all text will be parsed up until a '${'
or '`' (or EOF, but that's a syntax error) is seen, and then a
TemplateLiteralExprStart token is emitted. At this point, the Lexer
proceeds as normal, however it keeps track of the number of opening
and closing curly braces it has seen in order to determine the close
of the expression. Once it finds a matching curly brace for the '${',
a TemplateLiteralExprEnd token is emitted and the state is updated
accordingly.

When the Lexer is inside of a template literal, but not an expression,
and sees a '`', this must be the closing grave: a TemplateLiteralEnd
token is emitted.

The state required to correctly parse template strings consists of a
vector (for nesting) of two pieces of information: whether or not we
are in a template expression (as opposed to a template string); and
the count of the number of unmatched open curly braces we have seen
(only applicable if the Lexer is currently in a template expression).

TODO: Add support for template literal newlines in the JS REPL (this will
cause a syntax error currently):

    > `foo
    > bar`
    'foo
    bar'
2020-05-04 16:46:31 +02:00
Linus Groh
43c1fa9965 LibJS: Implement (no-op) debugger statement 2020-05-01 22:07:13 +02:00
mattco98
80fecc615a LibJS: Add spreading in array literals
Implement the syntax and behavor necessary to support array literals
such as [...[1, 2, 3]]. A type error is thrown if the target of the
spread operator does not evaluate to an array (though it should
eventually just check for an iterable).

Note that the spread token's name is TripleDot, since the '...' token is
used for two features: spread and rest. Calling it anything involving
'spread' or 'rest' would be a bit confusing.
2020-04-27 11:32:18 +02:00
Linus Groh
95b51e857d LibJS: Add TokenType::TemplateLiteral
This is required for template literals - we're not quite there yet, but at
least the parser can now tell us when this token is encountered -
currently this yields "Unexpected token Invalid". Not really helpful.

The character is a "backtick", but as we already have
TokenType::{StringLiteral,RegexLiteral} this seemed like a fitting name.

This also enables syntax highlighting for template literals in the js
REPL and LibGUI's JSSyntaxHighlighter.
2020-04-24 11:18:57 +02:00
Stephan Unverwerth
9477efe970 LibJS: Handle HTML-style comments 2020-04-14 12:54:09 +02:00
Stephan Unverwerth
f8f65053bd LibJS: Parse "this" as ThisExpression 2020-04-13 00:45:25 +02:00
AnotherTest
7b54274ac5 LibJS: Report the start position of a token as its line column 2020-04-05 16:11:13 +02:00
AnotherTest
cdb627a516 LibJS: Allow lexer to run without logging errors 2020-04-05 16:11:13 +02:00
Stephan Unverwerth
500f6d9e3a LibJS: Add numeric literal parsing for different bases and exponents 2020-04-05 16:01:22 +02:00
Brian Gianforcaro
dd112421b4 LibJS: Plumb line and column information through Lexer / Parser
While debugging test failures, it's pretty frustrating to have to go do
printf debugging to figure out what test is failing right now. While
watching your JS Raytracer stream it seemed like this was pretty
furstrating as well. So I wanted to start working on improving the
diagnostics here.

In the future I hope we can eventually be able to plumb the info down
to the Error classes so any thrown exceptions will contain enough
metadata to know where they came from.
2020-04-05 12:43:39 +02:00
Andreas Kling
9ebd066ac8 LibJS: Add support for "continue" inside "for" statements :^) 2020-04-05 00:22:42 +02:00
Andreas Kling
a860a3f793 LibJS: Hack the lexer to allow numbers with decimals
This is very hackish and should definitely be improved. :^)
2020-04-04 23:13:48 +02:00
Linus Groh
2636cac6e4 LibJS: Remove UndefinedLiteral, add undefined to global object
There is no such thing as a "undefined literal" in JS - undefined is
just a property on the global object with a value of undefined.
This is pretty similar to NaN.

var undefined = "foo"; is a perfectly fine AssignmentExpression :^)
2020-04-03 00:10:52 +02:00
Jack Karamanian
098f1cd0ca LibJS: Add support for arrow functions 2020-03-30 15:41:36 +02:00
Andreas Kling
1923051c5b LibJS: Lexer and parser support for "switch" statements 2020-03-29 15:03:58 +02:00
Andreas Kling
faddf3a1db LibJS: Implement "throw"
You can now throw an expression to the nearest catcher! :^)

To support throwing arbitrary values, I added an Exception class that
sits as a wrapper around whatever is thrown. In the future it will be
a logical place to store a call stack.
2020-03-24 22:21:58 +01:00
Andreas Kling
6dc4b23e2f LibJS: Teach the lexer to recognize ">=" and "<=" :^) 2020-03-23 14:10:23 +01:00
0xtechnobabble
bc002f807a LibJS: Parse object expressions 2020-03-21 10:08:58 +01:00
0xtechnobabble
cfd710eb31 LibJS: Implement null and undefined literals 2020-03-16 13:42:13 +01:00
Stephan Unverwerth
c0e6234219 LibJS: Lex single quote strings, escaped chars and unterminated strings 2020-03-14 12:13:53 +01:00
Stephan Unverwerth
15d5b2d29e LibJS: Add operator precedence parsing
Obey precedence and associativity rules when parsing expressions
with chained operators.
2020-03-14 00:11:24 +01:00
Oriko
2d7f4bea90 LibJS: Fix endless loop in string lexing 2020-03-13 22:53:13 +01:00
Stephan Unverwerth
ac524b632f LibJS: Fix lexing of the last character in a file
Before this commit the last character in a file would be swallowed.
This also fixes parsing of empty files which would previously ASSERT.
2020-03-13 21:47:57 +01:00
Andreas Kling
4781e3fb72 LibJS: Fix some coding style mistakes in Lexer 2020-03-12 13:52:54 +01:00
Conrad Pankoff
097e1af4e8 LibJS: Implement for statement 2020-03-12 13:42:23 +01:00
Conrad Pankoff
e88f2f15ee LibJS: Parse === and !== binary operators 2020-03-12 13:42:23 +01:00
Andreas Kling
ed100bc6f4 LibJS: Implement basic lexing + parsing of StringLiteral
This still includes the double-quote characters (") but at least the
AST comes out right.
2020-03-12 13:05:06 +01:00
Stephan Unverwerth
f3a9eba987 LibJS: Add Javascript lexer and parser
This adds a basic Javascript lexer and parser. It can parse the
currently existing demo programs. More work needs to be done to
turn it into a complete parser than can parse arbitrary JS Code.

The lexer outputs tokens with preceeding whitespace and comments
in the trivia member. This should allow us to generate the exact
source code by concatenating the generated tokens.

The parser is written in a way that it always returns a complete
syntax tree. Error conditions are represented as nodes in the
tree. This simplifies the code and allows it to be used as an
early stage parser, e.g for parsing JS documents in an IDE while
editing the source code.:
2020-03-12 09:25:49 +01:00