Instead of constructing a String and converting that to a PropertyName
on the fly, we can just leverage CommonPropertyNames, add a couple more
and directly pass ready-to-use PropertyNames with pre-allocated Strings.
Combining these into one list helps reduce the size of MatchState, and
as a result, reduces the amount of memory consumed during execution of
very large regex matches.
Doing this also allows us to remove a few regex byte code instructions:
ClearNamedCaptureGroup, SaveLeftNamedCaptureGroup, and NamedReference.
Named groups now behave the same as unnamed groups for these operations.
Note that SaveRightNamedCaptureGroup still exists to cache the matched
group name.
This also removes the recursion level from the MatchState, as it can
exist as a local variable in Matcher::execute instead.
The check for stack space in VM from push_execution_context has been
moved to a method on VM called did_reach_stack_space_limit. This
allows us to check the stack size in other places besides
push_execution_context.
We can now verify that we have enough space on the stack before calling
flatten_into_array to ensure that we don't cause a stack overflow error
when calling the function with a large depth.
When overriding visit_edges() in a JS::Object subclass, we must make
sure to call the base class visit_edges(), or the object's Shape (and
any properties) will not get marked.
When appending two strings together to form a new string, if both of the
strings are already UTF-16, create the new string as UTF-16 as well.
This shaves about 0.5 seconds off the following test262 test:
RegExp/property-escapes/generated/General_Category_-_Decimal_Number.js
The test262 tests under RegExp/property-escapes/generated will invoke
Reflect.apply with up to 10,000 arguments at a time. In LibJS, when the
call stack reached VM::call_internal, we transfer those arguments from
a MarkedValueList to the execution context's arguments Vector.
Because these types differ (MarkedValueList is a Vector<Value, 32>), the
arguments are copied rather than moved. By changing the arguments vector
to a MarkedValueList, we can properly move the passed arguments over.
This shaves about 2 seconds off the following test262 test (from 15sec):
RegExp/property-escapes/generated/General_Category_-_Decimal_Number.js
In addition to invoking js_string() with existing UTF-16 strings when
possible, RegExpExec now takes a Utf16String instead of a Utf16View. The
view was previously fully copied into the returned result object, so
this prevents potentially large copies of string data.
The primary themes here are invoking js_string() with existing instances
of Utf16String when possible, and not creating entire UTF-8 copies when
not needed.
This commit does not go out of its way to reduce copying of the string
data yet, but is a minimum set of changes to compile LibJS after making
PrimitiveString hold a Utf16String.
To help alleviate memory usage when creating and copying large strings,
create a simple wrapper around a Vector<u16> to reference count UTF-16
strings.
This is a tiny difference and only changes anything for primitives in
strict mode. However this is tested in test262 and can be noticed by
overriding toString of primitive values.
This does now require one to wrap an object in a Value to call invoke
but all code using invoke has been migrated.
Add a JS_ENUMERATE_INTL_OBJECTS macro and use it to generate:
- Forward declarations
- CommonPropertyNames class name members
- Constructor and prototype GlobalObject members, getters, visitors,
and initialize_constructor() calls
This is the start of implementing ECMA-402 in LibJS, better known as the
ECMAScript Internationalization API.
Much like Temporal this gets its own subdirectory (Runtime/Intl/) as
well as a new C++ namespace (JS::Intl) so we don't have to prefix all
the files and classes with "Intl".
https://tc39.es/ecma402/