This patch adds a new "Peephole" pass for performing small, local
optimizations to bytecode.
We also introduce the first such optimization: fusing a comparison
instruction (FooCompare) followed by a JumpIf into one of a new set of
fused JumpFooCompare instructions.
This gives a ~50% speed-up on the following microbenchmark:
for (let i = 0; i < 10_000_000; ++i) {
}
But more traditional benchmarks see a pretty sizable speed-up as well,
for example 15% on Kraken/ai-astar.js and 16% on Kraken/audio-dft.js :^)
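As an illustration (instruction names and operand layout here are a
sketch of the idea, not the exact encoding), a pair like

    LessThan dst, lhs, rhs
    JumpIf dst, @if_true, @if_false

is fused into

    JumpLessThan lhs, rhs, @if_true, @if_false

which avoids materializing the intermediate boolean and dispatching a
second instruction on the hot path.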
Similar to 'Calendar Methods Record', this is part of a refactor of the
Temporal spec, and it will take a fair amount of work to update all of
the corresponding AOs to use it.
Put in a new header file to prevent circular include problems when using
this new record.
This works very similarly to MarkedVector<T>, but instead of expecting
T to be Value or a GC-allocated pointer type, T can be anything.
Every pointer-sized value in the vector's storage will be checked during
conservative root scanning.
In other words, this allows you to put something like this in a
ConservativeVector<Foo> and it will be protected from GC:
struct Foo {
i64 number;
Value some_value;
GCPtr<Object> some_object;
};
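A minimal usage sketch, assuming ConservativeVector<T> is constructed
with the Heap and otherwise mirrors the usual Vector<T> API (both
assumptions for illustration):

    ConservativeVector<Foo> foos(vm.heap());
    foos.append(Foo {
        .number = 123,
        .some_value = Value(42),
        .some_object = object, // a GCPtr<Object> obtained elsewhere
    });
    // Both some_value and some_object stay alive across GC, because
    // every pointer-sized slot in the vector's storage is treated as a
    // potential root during conservative scanning.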
The JIT compiler was an interesting experiment, but ultimately the
security & complexity cost of doing arbitrary code generation at runtime
is far too high.
In subsequent commits, the bytecode format will change drastically, and
rather than rewriting the JIT to fit the new bytecode, this patch
removes the JIT entirely.
Other engines, JavaScriptCore in particular, have already proven that
it's possible to handle the vast majority of contemporary web content
with an interpreter. They are currently ~5x faster than us on benchmarks
when running without a JIT. We need to catch up to them before
considering performance techniques with a heavy security cost.
From test262 documentation, this flag means:
The test file should only be run when the [[CanBlock]] property of
the Agent Record executing the file is `false`.
This patch stubs out the accessor for that internal slot and skips
tests with the CanBlockIsFalse flag when that internal slot is true.
This patch adds two macros to declare per-type allocators:
- JS_DECLARE_ALLOCATOR(TypeName)
- JS_DEFINE_ALLOCATOR(TypeName)
When used, they add a type-specific CellAllocator that the Heap will
delegate allocation requests to.
The result of this is that GC objects of the same type always end up
within the same HeapBlock, drastically reducing the ability to perform
type confusion attacks.
It also improves HeapBlock utilization, since each block now has cells
sized exactly to the type used within that block. (Previously we only
had a handful of block sizes available, and most GC allocations ended
up with a large amount of slack in their tails.)
There is a small performance hit from this, but I'm sure we can make
up for it elsewhere.
Note that the old size-based allocators still exist, and we fall back
to them for any type that doesn't have its own CellAllocator.
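A rough usage sketch (the class shown is hypothetical; the macros go in
the type's declaration and its .cpp file respectively):

    // MyObject.h
    class MyObject final : public Object {
        JS_DECLARE_ALLOCATOR(MyObject);
        // ...
    };

    // MyObject.cpp
    JS_DEFINE_ALLOCATOR(MyObject);

With this in place, allocation requests for MyObject are routed to its
dedicated CellAllocator, so all MyObject cells end up in HeapBlocks
that contain nothing but MyObject instances.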
We currently cannot parse the output of `toString` and `toUTCString`.
While the spec does not require such support, test262 expects it, and
all major engines support it.
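For reference, the strings in question look along these lines (the
parenthesized time zone name in the toString output varies with the
host's time zone data):

    toString:    Tue Nov 07 2023 09:41:00 GMT+0000 (Coordinated Universal Time)
    toUTCString: Tue, 07 Nov 2023 09:41:00 GMT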
The previous implementation was calling `backtrace()` for every
function call, which is quite slow.
Instead, this implementation provides VM::stack_trace() which unwinds
the native stack, maps it through NativeExecutable::get_source_range
and combines it with source ranges from interpreted call frames.
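The native side of this can be sketched roughly as a frame-pointer walk
(a simplified illustration that assumes frame pointers are preserved;
mapping the collected return addresses to source ranges then goes
through NativeExecutable::get_source_range):

    // Follow the chain of saved frame pointers and collect the return
    // address stored in each frame.
    Vector<FlatPtr> return_addresses;
    auto* frame_pointer = static_cast<FlatPtr*>(__builtin_frame_address(0));
    while (frame_pointer) {
        return_addresses.append(frame_pointer[1]); // saved return address
        frame_pointer = reinterpret_cast<FlatPtr*>(frame_pointer[0]); // caller's frame
    }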
There are a lot of native C++ functions that will be used by both the
bytecode interpreter and jitted code. Let's put them in their own file
instead of having them in Interpreter.cpp.
These passes have not been shown to actually optimize any JS, and tests
have become very flaky with optimizations enabled. Until some measurable
benefit is shown, remove the optimization passes to reduce overhead of
maintaining bytecode operations and to reduce CI churn. The framework
for optimizations will live on in git history, and can be restored once
proven useful.
Rather than splitting the Iterator type and its AOs into two files,
let's combine them into one file to match every other JS runtime object
that we have.
The dollar sign is a special character in POSIX shells and in the Ninja
build file format. If the file name contains a `$`, something goes wrong
in the escaping/unescaping of this symbol, and CMake/GCC/Clang generate
invalid dependency files where not all instances of `$` are escaped
properly. Because of this, Ninja fails to rebuild `$262Object.cpp` if
the headers included by it have changed.
Stale `$262Object.cpp.o` files have been the cause of mysterious crashes
multiple times which only go away after doing a clean build. Let's
prevent these from happening again by removing the `$` from the
filename.
The RegExpLiteral AST node already has the parsed regex::Parser::Result
so let's plumb that over to the bytecode executable instead of reparsing
the regex every time NewRegExp is executed.
~12% speed-up on language/literals/regexp/S7.8.5_A2.1_T2.js in test262.
This uses a new Iterator type called IteratorHelper. This does not
implement IteratorHelper.prototype.return as that relies on generator
objects (i.e. the internal slots of JS::GeneratorObject), which are not
hooked up here.
Iterator.from creates an Iterator from either an existing iterator or
an iterator-like object. In the latter case, it sets the prototype of
the returned iterator to WrapForValidIteratorPrototype to wrap around
the iterator-like object's iteration methods.
ThrowableStringBuilder is a thin wrapper around StringBuilder to map
results from the try_* methods to a throw completion. This will let us
throw on OOM conditions rather than just blowing up.
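The wrapping pattern is roughly the following (a sketch; the exact
error type and how the OOM error is constructed are assumptions for
illustration):

    class ThrowableStringBuilder : private AK::StringBuilder {
    public:
        explicit ThrowableStringBuilder(VM& vm)
            : m_vm(vm)
        {
        }

        ThrowCompletionOr<void> append(StringView string)
        {
            if (try_append(string).is_error())
                return m_vm.throw_completion<InternalError>("Out of memory"sv); // illustrative OOM error
            return {};
        }

    private:
        VM& m_vm;
    };

Call sites can then use TRY() as with any other throwing operation,
e.g. TRY(builder.append(some_string)).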
This pass tries to eliminate repeated lookups of variables by name, by
remembering where these were last loaded to.
For now the lookup cache needs to be fully cleared with each call or
property access, because we do not have a way to check if these have any
side effects on the currently visible scopes.
Note that property accesses can cause getters/setters to be called, so
these are treated as calls in all cases.
And make it capable of printing to any Core::Stream.
This is useful on its own and can be used in a number of places, so move
it out and make it available as JS::print().
Before this change, each AST node had a 64-byte SourceRange member.
This SourceRange had the following layout:
filename: StringView (16 bytes)
start: Position (24 bytes)
end: Position (24 bytes)
The Position structs have { line, column, offset }, all members size_t.
To reduce memory consumption, AST nodes now only store the following:
source_code: NonnullRefPtr<SourceCode> (8 bytes)
start_offset: u32 (4 bytes)
end_offset: u32 (4 bytes)
SourceCode is a new ref-counted data structure that keeps the filename
and original parsed source code in a single location, and all AST nodes
have a pointer to it.
The start_offset and end_offset can be turned into (line, column) when
necessary by calling SourceCode::range_from_offsets(). This will walk
the source code string and compute line/column numbers on the fly, so
it's not necessarily fast, but it should be rare since this information
is primarily used for diagnostics and exception stack traces.
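The computation itself is conceptually just a scan over the stored
text, along these lines (simplified sketch; the function name and the
1-based line/column convention are illustrative):

    // Convert a byte offset into a (line, column) pair by walking the
    // source text and counting newlines up to that offset.
    static Position position_for_offset(StringView source, u32 target_offset)
    {
        Position position { .line = 1, .column = 1, .offset = 0 };
        for (u32 i = 0; i < target_offset && i < source.length(); ++i) {
            if (source[i] == '\n') {
                ++position.line;
                position.column = 1;
            } else {
                ++position.column;
            }
            ++position.offset;
        }
        return position;
    }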
With this, ASTNode shrinks from 80 bytes to 32 bytes. This gives us a
~23% reduction in memory usage when loading twitter.com/awesomekling
(330 MiB before, 253 MiB after!) :^)
Otherwise, we end up propagating those dependencies into targets that
link against that library, which creates unnecessary link-time
dependencies.
Also included are changes to re-add now-missing dependencies to tools
that actually need them.
Intrinsics, i.e. mostly constructor and prototype objects, but also
things like the empty and new object shapes, now live on a new
heap-allocated JS::Intrinsics object, thus completing the long journey
of taking all the magic away from the global object.
This represents the Realm's [[Intrinsics]] slot in the spec and matches
its existing [[GlobalObject]] / [[GlobalEnv]] slots in terms of
architecture.
In the majority of cases it should now be possible to fully allocate a
regular object without the global object existing, and in fact that's
what we do now - the realm is allocated before the global object, and
the intrinsics in between :^)
The Intl mathematical value is much like ECMA-262's mathematical value
in that it is meant to represent an arbitrarily precise number. The Intl
MV further allows positive/negative infinity, negative zero, and NaN.
This implementation is *not* arbitrarily precise. Rather, it is a
replacement for the use of Value within Intl.NumberFormat. The exact
syntax of the Intl MV is still being worked on, but abstracting this
away into its own class will allow hooking in the finalized Intl MV
more easily, and makes implementing Intl.NumberFormat.formatRange
easier.
Note the methods added here are essentially the same as the static
helpers in Intl/NumberFormat.cpp.