1
Fork 0
mirror of https://github.com/RGBCube/serenity synced 2025-05-14 08:34:57 +00:00

LibRegex+Everywhere: Make LibRegex more unicode-aware

This commit makes LibRegex (mostly) capable of operating on any of
the three main string views:
- StringView for raw strings
- Utf8View for utf-8 encoded strings
- Utf32View for raw unicode strings

As a result, regexps with unicode strings should be able to properly
handle utf-8 and not stop in the middle of a code point.
A future commit will update LibJS to use the correct type of string
depending on the flags.
This commit is contained in:
Ali Mohammad Pur 2021-07-18 05:07:01 +04:30 committed by Ali Mohammad Pur
parent e5af15a6e9
commit f364fcec5d
8 changed files with 310 additions and 207 deletions

View file

@ -417,7 +417,7 @@ private:
StringBuilder result;
for (auto& e : match.capture_group_matches[0])
result.append(e.view.u8view());
result.append(e.view.string_view());
return result.build();
}