1
Fork 0
mirror of https://github.com/RGBCube/serenity synced 2025-05-31 04:38:11 +00:00
Commit graph

931 commits

Author SHA1 Message Date
Linus Groh
3dbf4c62b0 LibJS: Use GenericLexer for Token::string_value()
This is, and I can't stress this enough, a lot better than all the
manual bounds checking and indexing that was going on before.

Also fixes a small bug where "\u{}" wouldn't get rejected as invalid
unicode escape sequence.
2020-10-29 11:52:31 +01:00
Linus Groh
b5bd05b717 LibJS: Don't parse numeric literal containing 8 or 9 as octal
If the value has a leading zero (allowed in non-strict mode) but
contains the digits 8 or 9 it can't be an octal number.
2020-10-28 21:11:32 +01:00
Linus Groh
b4e51249e9 LibJS: Always insert semicolon after do-while statement if missing
https://tc39.es/ecma262/#sec-additions-and-changes-that-introduce-incompatibilities-with-prior-editions

11.9.1: In ECMAScript 2015, Automatic Semicolon Insertion adds a
semicolon at the end of a do-while statement if the semicolon is
missing. This change aligns the specification with the actual behaviour
of most existing implementations.
2020-10-28 21:11:32 +01:00
Linus Groh
d278f61f4c LibJS: Restrict toEval() failures to SyntaxError
We only use expect(...).toEval() / not.toEval() for checking syntax
errors, where we obviously can't put the code in a regular function. For
runtime errors we do exactly that, so toEval() should not fail - this
allows us to use undefined identifiers in syntax tests.
2020-10-28 21:11:32 +01:00
Linus Groh
7112031bfb LibJS: Use message from invalid token in syntax error 2020-10-26 21:38:34 +01:00
Linus Groh
6a3389cec6 LibJS: Emit token message for invalid numeric literals 2020-10-26 21:38:34 +01:00
Linus Groh
19edcbd79c LibJS: Emit TokenType::Invalid for unterminated multi-line comments 2020-10-26 21:38:34 +01:00
Linus Groh
03c1d43f6e LibJS: Add message string to Token
This allows us to communicate details about invalid tokens to the parser
without having to invent a bunch of specific invalid tokens like
TokenType::InvalidNumericLiteral.
2020-10-26 21:38:34 +01:00
Linus Groh
66e315959d LibJS: Allow all line terminators to be used for line continuations 2020-10-25 19:45:47 +01:00
Marcin Gasperowicz
e5ddcadd3c LibJS: Parse line continuations in string literals properly
Newlines after line continuation were inserted into the string 
literals. This patch makes the parser ignore the newlines after \ and
also makes it so that "use strict" containing a line continuation is 
not a valid "use strict".
2020-10-25 15:16:47 +01:00
Linus Groh
dca9e4ec10 LibJS: Implement rules for duplicate function parameters
- A regular function can have duplicate parameters except in strict mode
  or if its parameter list is not "simple" (has a default or rest
  parameter)
- An arrow function can never have duplicate parameters

Compared to other engines I opted for more useful syntax error messages
than a generic "duplicate parameter name not allowed in this context":

    "use strict"; function test(foo, foo) {}
                                     ^
    Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in strict mode (line: 1, column: 34)

    function test(foo, foo = 1) {}
                       ^
    Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in function with default parameter (line: 1, column: 20)

    function test(foo, ...foo) {}
                          ^
    Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in function with rest parameter (line: 1, column: 23)

    (foo, foo) => {}
          ^
    Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in arrow function (line: 1, column: 7)
2020-10-25 12:56:02 +01:00
Linus Groh
2adcabb6b3 LibJS: Disallow escape sequence/line continuation in use strict directive
https://tc39.es/ecma262/#sec-directive-prologues-and-the-use-strict-directive

A Use Strict Directive is an ExpressionStatement in a Directive Prologue
whose StringLiteral is either of the exact code point sequences
"use strict" or 'use strict'. A Use Strict Directive may not contain an
EscapeSequence or LineContinuation.
2020-10-24 16:34:01 +02:00
Linus Groh
4fb96afafc LibJS: Support LegacyOctalEscapeSequence in string literals
https://tc39.es/ecma262/#sec-additional-syntax-string-literals

The syntax and semantics of 11.8.4 is extended as follows except that
this extension is not allowed for strict mode code:

Syntax

    EscapeSequence::
        CharacterEscapeSequence
        LegacyOctalEscapeSequence
        NonOctalDecimalEscapeSequence
        HexEscapeSequence
        UnicodeEscapeSequence

    LegacyOctalEscapeSequence::
        OctalDigit [lookahead ∉ OctalDigit]
        ZeroToThree OctalDigit [lookahead ∉ OctalDigit]
        FourToSeven OctalDigit
        ZeroToThree OctalDigit OctalDigit

    ZeroToThree :: one of
        0 1 2 3

    FourToSeven :: one of
        4 5 6 7

    NonOctalDecimalEscapeSequence :: one of
        8 9

This definition of EscapeSequence is not used in strict mode or when
parsing TemplateCharacter.

Note

It is possible for string literals to precede a Use Strict Directive
that places the enclosing code in strict mode, and implementations must
take care to not use this extended definition of EscapeSequence with
such literals. For example, attempting to parse the following source
text must fail:

function invalid() { "\7"; "use strict"; }
2020-10-24 16:34:01 +02:00
Linus Groh
9f036959e8 LibJS: Report correct line/column for string literal syntax errors
We're passing a token to this function, so m_current_token is actually
the next token - which leads to incorrect line/column numbers for string
literal syntax errors:

    "\u"
        ^
    Uncaught exception: [SyntaxError]: Malformed unicode escape sequence (line: 1, column: 5)

Rather than:

    "\u"
    ^
    Uncaught exception: [SyntaxError]: Malformed unicode escape sequence (line: 1, column: 1)
2020-10-24 16:34:01 +02:00
Linus Groh
d6f8c52245 LibJS: Allow try statement with only finally clause
This was a regression introduced by 9ffe45b - a TryStatement without
'catch' clause *is* allowed, if it has a 'finally' clause. It is now
checked properly that at least one of both is present.
2020-10-24 16:34:01 +02:00
Linus Groh
80bb62b9cc LibJS: Distinguish between statement and declaration
This separates matching/parsing of statements and declarations and
fixes a few edge cases where the parser would incorrectly accept a
declaration where only a statement is allowed - for example:

    if (foo) const a = 1;
    for (var bar;;) function b() {}
    while (baz) class c {}
2020-10-23 19:13:06 +02:00
Linus Groh
f8ae6fa713 LibJS: Disallow NumericLiteral immediately followed by Identifier
From the spec: https://tc39.es/ecma262/#sec-literals-numeric-literals

The SourceCharacter immediately following a NumericLiteral must not be
an IdentifierStart or DecimalDigit.

For example: 3in is an error and not the two input elements 3 and in.
2020-10-23 19:13:06 +02:00
Linus Groh
80bb22788f LibJS: Don't allow TryStatement without catch clause 2020-10-23 19:13:06 +02:00
Linus Groh
82ac936a9d LibJS: Check for exception after executing (do)while test expression
Otherwise we crash the interpreter when an exception is thrown during
evaluation of the while or do/while test expression - which is easily
caused by a ReferenceError - e.g.:

    while (someUndefinedVariable) {
        // ...
    }
2020-10-23 19:06:57 +02:00
Andreas Kling
619cd613d0 LibJS: Give VM a cache of single-ASCII-character PrimitiveString
A large number of JS strings are a single ASCII character. This patch
adds a 128-entry cache for those strings to the VM. The cost of the
cache is 1536 byte of GC heap (all in same block) + 2304 bytes malloc.

This avoids a lot of GC heap allocations, and packing all of these
in the same heap block is nice for fragmentation as well.
2020-10-22 17:48:12 +02:00
Andreas Kling
5c2520e6b2 LibJS: Simplify environment access a little bit in VM::construct() 2020-10-22 17:23:40 +02:00
Andreas Kling
07f76cd980 LibJS: Shrink sizeof(LexicalEnvironment) by reorganizing members 2020-10-22 17:03:40 +02:00
Linus Groh
15642874f3 LibJS: Support all line terminators (LF, CR, LS, PS)
https://tc39.es/ecma262/#sec-line-terminators
2020-10-22 10:06:30 +02:00
Linus Groh
1e86379327 LibJS: Rest parameter in setter functions is a syntax error 2020-10-20 20:27:58 +02:00
Linus Groh
6331d45a6f LibJS: Move checks for invalid getter/setter params to parse_function_node
This allows us to provide better error messages as we can point the
syntax error location to the exact first invalid parameter instead of
always the end of the function within a object literal or class
definition.

Before this change:

    const Foo = { set bar() {} }
                               ^
    Uncaught exception: [SyntaxError]: Object setter property must have one argument (line: 1, column: 28)

    class Foo { set bar() {} }
                             ^
    Uncaught exception: [SyntaxError]: Class setter method must have one argument (line: 1, column: 26)

After this change:

    const Foo = { set bar() {} }
                          ^
    Uncaught exception: [SyntaxError]: Setter function must have one argument (line: 1, column: 23)

    class Foo { set bar() {} }
                        ^
    Uncaught exception: [SyntaxError]: Setter function must have one argument (line: 1, column: 21)

The only possible downside of this change is that class getters/setters
and functions in objects are not distinguished in the message anymore -
I don't think that's important though, and classes are (mostly) just
syntactic sugar anyway.
2020-10-20 20:27:58 +02:00
Linus Groh
db75be1119 LibJS: Refactor parse_function_node() bool parameters into bit flags
I'm about to add even more options and a bunch of unnamed true/false
arguments is really not helpful. Let's make this a single parse options
parameter using bit flags.
2020-10-20 20:27:58 +02:00
Linus Groh
a82c56f9f7 LibJS: Speed up IndexedPropertyIterator by computing non-empty indices
This provides a huge speed-up for objects with large numbers as property
keys in some situation. Previously we would simply iterate from 0-<max>
and check if there's a non-empty value at each index - now we're being
smarter and compute a list of non-empty indices upfront, by checking
each value in the packed elements vector and appending the sparse
elements hashmap keys (for GenericIndexedPropertyStorage).

Consider this example, an object with a single own property, which is a
number increasing by a factor of 10 each iteration:

    for (let i = 0; i < 10; ++i) {
        const o = {[10 ** i]: "foo"};
        const start = Date.now();
        Object.getOwnPropertyNames(o);  // <-- IndexedPropertyIterator
        const end = Date.now();
        console.log(`${10 ** i} -> ${(end - start) / 1000}s`);
    }

Before this change:

    1 -> 0.0000s
    10 -> 0.0000s
    100 -> 0.0000s
    1000 -> 0.0000s
    10000 -> 0.0005s
    100000 -> 0.0039s
    1000000 -> 0.0295s
    10000000 -> 0.2489s
    100000000 -> 2.4758s
    1000000000 -> 25.5669s

After this change:

    1 -> 0.0000s
    10 -> 0.0000s
    100 -> 0.0000s
    1000 -> 0.0000s
    10000 -> 0.0000s
    100000 -> 0.0000s
    1000000 -> 0.0000s
    10000000 -> 0.0000s
    100000000 -> 0.0000s
    1000000000 -> 0.0000s

Fixes #3805.
2020-10-20 08:51:41 +02:00
Linus Groh
46cc1f718e LibJS: Unprefixed octal numbers are a syntax error in strict mode 2020-10-19 20:08:22 +02:00
Linus Groh
e898c98873 LibJS: Don't parse arrow function with newline between ) and =>
If there's a newline between the closing paren and arrow it's not a
valid arrow function, ASI should kick in instead (it'll then fail with
"Unexpected token Arrow")
2020-10-19 11:31:55 +02:00
Linus Groh
965d952ff3 LibJS: Share parameter parsing between regular and arrow functions
This simplifies try_parse_arrow_function_expression() and fixes a few
cases that should not produce an arrow function AST but did:

    (a,,) => {}
    (a b) => {}
    (a ...b) => {}
    (...b a) => {}

The new parsing logic checks whether parens are expected and uses
parse_function_parameters() if so, rolling back if a new syntax error
occurs during that. Otherwise it's just an identifier in which case we
parse the single parameter ourselves.
2020-10-19 11:31:55 +02:00
Linus Groh
aa68de3530 LibJS: Fix dump() indentation of UpdateExpression with suffix operator 2020-10-19 11:31:55 +02:00
Linus Groh
2dbea60fe2 LibJS: Multiple 'default' clauses in switch statement are a syntax error 2020-10-19 11:30:14 +02:00
Linus Groh
f8886ef5ba LibJS: Handle continue in switch statement unwinding 2020-10-18 19:08:52 +02:00
Linus Groh
8f54edb7a0 LibJS: Handle return value in switch statement unwinding
Fixes #3790.
2020-10-18 19:08:52 +02:00
Stephan Unverwerth
2c888b3c6e LibJS: Fix parsing of invalid numeric literals
i.e. "1e" "0x" "0b" "0o" used to be parsed as valid literals.
They now produce invalid tokens. Fixes #3716
2020-10-18 15:38:57 +02:00
Andreas Kling
77c1957961 LibJS: Use allocate_without_global_object for allocating Shapes 2020-10-17 23:47:07 +02:00
Andreas Kling
d8269c343c LibJS: Avoid creating temporary Strings to look up tokens while lexing
It would be cool to solve this in a general way so that looking up
a string literal or StringView in a HashMap with String keys avoids
creating a temp string.

For now, this patch simply addresses the issue in JS::Lexer.
This is a 2-3% speed-up on test-js.
2020-10-17 23:44:41 +02:00
Andreas Kling
d3dfd55472 LibJS: Prebake the empty object ({}) with a prototype
Instead of performing a prototype transition for every new object we
create via {}, prebake the object returned by Object::create_empty()
with a shape with ObjectPrototype as the prototype.

We also prebake the shape for the object assigned to the "prototype"
property of new ScriptFunction objects, since those are extremely
common and that code broke from this change anyway.

This avoid a large number of transitions and is a small speed-up on
test-js.
2020-10-17 23:23:53 +02:00
Linus Groh
b98b83712f LibJS: constexpr some Number object constant values 2020-10-16 17:06:57 +02:00
Andreas Kling
2c956ac132 LibJS: Reorganize Shape members to reduce sizeof(Shape) a bit 2020-10-16 16:46:27 +02:00
Andreas Kling
2c0e153396 LibJS: Don't bother deferring GC during ensure_property_table()
This is not actually necessary, since no GC allocations are made during
this process. If we ever make property tables into heap cells, we'd
have to rethink this.
2020-10-16 08:59:51 +02:00
Andreas Kling
4387590e65 LibJS: Support move semantics for StringOrSymbol
This allows us to rehash property tables without a bunch of ref count
churn happening.
2020-10-15 23:49:53 +02:00
Andreas Kling
1d96ecf148 Everywhere: Add missing <AK/TemporaryChange.h> includes
Don't rely on HashTable.h pulling this in.
2020-10-15 23:49:53 +02:00
Linus Groh
e07490ce13 LibJS: Don't assume value for index < size in IndexedPropertyIterator
This assumption only works for the m_packed_elements Vector where a
missing value at a certain index still returns an empty value, but not
for the m_sparse_elements HashMap, which is being used for indices
>= 200 - in that case the Optional<ValueAndAttributes> result will not
have a value.

This fixes a crash in the js REPL where printing an array with a hole at
any index >= 200 would crash.
2020-10-14 00:52:47 +02:00
Andreas Kling
a1029738fd LibJS: Add some more items to CommonPropertyNames that I missed 2020-10-14 00:10:49 +02:00
Andreas Kling
8f535435dc LibJS: Avoid property lookups during object initialization
When we're initializing objects, we're just adding a bunch of new
properties, without transition, and without overlap (we never add
the same property twice.)

Take advantage of this by skipping lookups entirely (no need to see
if we're overwriting an existing property) during initialization.

Another nice test-js speedup :^)
2020-10-13 23:57:45 +02:00
Andreas Kling
7b863330dc LibJS: Cache commonly used FlyStrings in the VM
Roughly 7% of test-js runtime was spent creating FlyStrings from string
literals. This patch frontloads that work and caches all the commonly
used names in LibJS on a CommonPropertyNames struct that hangs off VM.
2020-10-13 23:57:45 +02:00
Andreas Kling
9f6c5f68b6 LibJS: Tidy up CallExpression::execute() a little bit 2020-10-13 19:13:37 +02:00
Linus Groh
a5bf6cfff9 LibJS: Don't change offset when reconfiguring property in unique shape
When changing the attributes of an existing property of an object with
unique shape we must not change the PropertyMetadata offset.
Doing so without resizing the underlying storage vector caused an OOB
write crash.

Fixes #3735.
2020-10-10 23:25:00 +02:00
Matthew Olsson
e8da5f99b1 LibJS: break or continue with nonexistent label is a syntax error 2020-10-08 23:27:16 +02:00