mirror of https://github.com/RGBCube/serenity synced 2025-07-27 13:27:35 +00:00

AK+LibRegex: Add Utf16View::code_point_at and use it in RegexStringView

The current method of iterating through the string to access a code
point hurts performance quite badly for very large strings. The test262
test "RegExp/property-escapes/generated/Any.js" previously took 3 hours
to complete; this one change brings it down to under 10 seconds.
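
To illustrate the difference described above, here is a minimal sketch of a constant-time lookup over UTF-16 code units. It is not the AK implementation: the free function `code_point_at` over `std::u16string` and the two surrogate helpers are hypothetical stand-ins, and the sketch assumes that `index` counts code units rather than code points.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helpers for this sketch; not taken from AK.
static bool is_high_surrogate(char16_t unit) { return unit >= 0xD800 && unit <= 0xDBFF; }
static bool is_low_surrogate(char16_t unit) { return unit >= 0xDC00 && unit <= 0xDFFF; }

// Decode the code point that starts at code unit `index`.
// This reads at most two code units, so the cost does not depend on `index`,
// unlike walking an iterator from the start of the string.
std::uint32_t code_point_at(const std::u16string& string, std::size_t index)
{
    assert(index < string.size());

    char16_t first = string[index];

    // Not the start of a surrogate pair, or nothing follows it: return the unit itself.
    if (!is_high_surrogate(first) || index + 1 >= string.size())
        return first;

    char16_t second = string[index + 1];

    // Unpaired high surrogate: return it unchanged.
    if (!is_low_surrogate(second))
        return first;

    // Combine the pair into a supplementary-plane code point.
    return 0x10000
        + ((static_cast<std::uint32_t>(first) - 0xD800) << 10)
        + (static_cast<std::uint32_t>(second) - 0xDC00);
}
```

Under these assumptions, a caller that looks up every index of a string of n code units does O(n) work in total, whereas restarting an iterator from the beginning for each lookup does O(n^2) work, which is consistent with the slowdown reported for very large strings.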
Timothy Flynn 2021-08-01 18:56:52 -04:00 committed by Andreas Kling
parent bed51d856a
commit 510bbcd8e0
3 changed files with 21 additions and 0 deletions

@@ -240,7 +240,10 @@ public:
return ch;
},
[&](Utf32View& view) -> u32 { return view[index]; },
[&](Utf16View& view) -> u32 { return view.code_point_at(index); },
[&](auto& view) -> u32 {
// FIXME: Iterating to the code point is inefficient, particularly for very large
// strings. Implement something like code_point_at to Utf8View.
size_t i = index;
for (auto it = view.begin(); it != view.end(); ++it, --i) {
if (i == 0)