This is, and I can't stress this enough, a lot better than all the
manual bounds checking and indexing that was going on before.
Also fixes a small bug where "\u{}" wouldn't get rejected as invalid
unicode escape sequence.
We only use expect(...).toEval() / not.toEval() for checking syntax
errors, where we obviously can't put the code in a regular function. For
runtime errors we do exactly that, so toEval() should not fail - this
allows us to use undefined identifiers in syntax tests.
Newlines after line continuation were inserted into the string
literals. This patch makes the parser ignore the newlines after \ and
also makes it so that "use strict" containing a line continuation is
not a valid "use strict".
- A regular function can have duplicate parameters except in strict mode
or if its parameter list is not "simple" (has a default or rest
parameter)
- An arrow function can never have duplicate parameters
Compared to other engines I opted for more useful syntax error messages
than a generic "duplicate parameter name not allowed in this context":
"use strict"; function test(foo, foo) {}
^
Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in strict mode (line: 1, column: 34)
function test(foo, foo = 1) {}
^
Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in function with default parameter (line: 1, column: 20)
function test(foo, ...foo) {}
^
Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in function with rest parameter (line: 1, column: 23)
(foo, foo) => {}
^
Uncaught exception: [SyntaxError]: Duplicate parameter 'foo' not allowed in arrow function (line: 1, column: 7)
https://tc39.es/ecma262/#sec-directive-prologues-and-the-use-strict-directive
A Use Strict Directive is an ExpressionStatement in a Directive Prologue
whose StringLiteral is either of the exact code point sequences
"use strict" or 'use strict'. A Use Strict Directive may not contain an
EscapeSequence or LineContinuation.
https://tc39.es/ecma262/#sec-additional-syntax-string-literals
The syntax and semantics of 11.8.4 is extended as follows except that
this extension is not allowed for strict mode code:
Syntax
EscapeSequence::
CharacterEscapeSequence
LegacyOctalEscapeSequence
NonOctalDecimalEscapeSequence
HexEscapeSequence
UnicodeEscapeSequence
LegacyOctalEscapeSequence::
OctalDigit [lookahead ∉ OctalDigit]
ZeroToThree OctalDigit [lookahead ∉ OctalDigit]
FourToSeven OctalDigit
ZeroToThree OctalDigit OctalDigit
ZeroToThree :: one of
0 1 2 3
FourToSeven :: one of
4 5 6 7
NonOctalDecimalEscapeSequence :: one of
8 9
This definition of EscapeSequence is not used in strict mode or when
parsing TemplateCharacter.
Note
It is possible for string literals to precede a Use Strict Directive
that places the enclosing code in strict mode, and implementations must
take care to not use this extended definition of EscapeSequence with
such literals. For example, attempting to parse the following source
text must fail:
function invalid() { "\7"; "use strict"; }
This was a regression introduced by 9ffe45b - a TryStatement without
'catch' clause *is* allowed, if it has a 'finally' clause. It is now
checked properly that at least one of both is present.
This separates matching/parsing of statements and declarations and
fixes a few edge cases where the parser would incorrectly accept a
declaration where only a statement is allowed - for example:
if (foo) const a = 1;
for (var bar;;) function b() {}
while (baz) class c {}
From the spec: https://tc39.es/ecma262/#sec-literals-numeric-literals
The SourceCharacter immediately following a NumericLiteral must not be
an IdentifierStart or DecimalDigit.
For example: 3in is an error and not the two input elements 3 and in.
Otherwise we crash the interpreter when an exception is thrown during
evaluation of the while or do/while test expression - which is easily
caused by a ReferenceError - e.g.:
while (someUndefinedVariable) {
// ...
}
If there's a newline between the closing paren and arrow it's not a
valid arrow function, ASI should kick in instead (it'll then fail with
"Unexpected token Arrow")
This simplifies try_parse_arrow_function_expression() and fixes a few
cases that should not produce an arrow function AST but did:
(a,,) => {}
(a b) => {}
(a ...b) => {}
(...b a) => {}
The new parsing logic checks whether parens are expected and uses
parse_function_parameters() if so, rolling back if a new syntax error
occurs during that. Otherwise it's just an identifier in which case we
parse the single parameter ourselves.
When changing the attributes of an existing property of an object with
unique shape we must not change the PropertyMetadata offset.
Doing so without resizing the underlying storage vector caused an OOB
write crash.
Fixes#3735.
Previously, when a loop detected an unwind of type ScopeType::Function
(which means a return statement was executed inside of the loop), it
would just return undefined. This set the VM's last_value to undefined,
when it should have been the returned value. This patch makes all loop
statements return the appropriate value in the above case.
'continue' is no longer allowed outside of a loop, and an unlabeled
'break' is not longer allowed outside of a loop or switch statement.
Labeled 'break' statements are still allowed everywhere, even if the
label does not exist.
The check for invalid lhs and assignment to eval/arguments in strict
mode should happen for all kinds of assignment expressions, not just
AssignmentOp::Assignment.
Since blocks can't be strict by themselves, it makes no sense for them
to store whether or not they are strict. Strict-ness is now stored in
the Program and FunctionNode ASTNodes. Fixes issue #3641
In the case of an exception in a property getter function we would not
return early, and a subsequent attempt to call the replacer function
would crash the interpreter due to call_internal() asserting.
Fixes#3548.
This fixes two cases obj[expr] and obj[expr]() (MemberExpression and
CallExpression respectively) when expr throws an exception and results
in an empty value, causing a crash by passing the invalid PropertyName
created by computed_property_name() to Object::get() without checking it
first.
Fixes#3459.
This fixes two issues with running a TryStatement finalizer:
- Temporarily store and clear the exception, if any, so we can run the
finalizer block statement without it getting in our way, which could
have unexpected side effects otherwise (and will likely return early
somewhere).
- Stop unwinding so more than one child node of the finalizer
BlockStatement is executed if an exception has been thrown previously
(which would have called unwind(ScopeType::Try)). Re-throwing as
described above ensures we still unwind after the finalizer, if
necessary.
Also add some tests specifically for try/catch/finally blocks, we
didn't have any!
Interpreter::run() was so far being used both as the "public API entry
point" for running a JS::Program as well as internally to execute
JS::Statement|s of all kinds - this is now more distinctly separated.
A program as returned by the parser is still going through run(), which
is responsible for creating the initial global call frame, but all other
statements are executed via execute_statement() directly.
Fixes#3437, a regression introduced by adding ASSERT(!exception()) to
run() without considering the effects that would have on internal usage.
Test files created with:
$ for f in Libraries/LibJS/Tests/builtins/Date/Date.prototype.get*js; do
cp $f $(echo $f | sed -e 's/get/getUTC/') ;
done
$ rm Libraries/LibJS/Tests/builtins/Date/Date.prototype.getUTCTime.js
$ git add Libraries/LibJS/Tests/builtins/Date/Date.prototype.getUTC*.js
$ ls Libraries/LibJS/Tests/builtins/Date/Date.prototype.getUTC*.js | \
xargs sed -i -e 's/get/getUTC/g'
Year computation has to be based on seconds, not days, in case
t is < 0 but t / __seconds_per_day is 0.
Year computation also has to consider negative timestamps.
With this, days is always positive and <= the number of days in the
year, so base the tm_wday computation directly on the timestamp,
and do it first, before t is modified in the year computation.
In C, % can return a negative number if the left operand is negative,
compensate for that.
Tested via test-js. (Except for tm_wday, since we don't implement
Date.prototype.getUTCDate() yet.)