serenity

mirror of https://github.com/RGBCube/serenity synced 2025-09-16 23:56:18 +00:00

Author	SHA1	Message	Date
Timothy Flynn	9509433e25	LibRegex: Implement and use a REPEAT operation for bytecode repetition Currently, when we need to repeat an instruction N times, we simply add that instruction N times in a for-loop. This doesn't scale well with extremely large values of N, and ECMA-262 allows up to N = 2^53 - 1. Instead, add a new REPEAT bytecode operation to defer this loop from the parser to the runtime executor. This allows the parser to complete sans any loops (for this instruction), and allows the executor to bail early if the repeated bytecode fails. Note: The templated ByteCode methods are to allow the Posix parsers to continue using u32 because they are limited to N = 2^20.	2021-08-15 11:43:45 +01:00
Timothy Flynn	f1ce998d73	LibRegex+LibJS: Combine named and unnamed capture groups in MatchState Combining these into one list helps reduce the size of MatchState, and as a result, reduces the amount of memory consumed during execution of very large regex matches. Doing this also allows us to remove a few regex byte code instructions: ClearNamedCaptureGroup, SaveLeftNamedCaptureGroup, and NamedReference. Named groups now behave the same as unnamed groups for these operations. Note that SaveRightNamedCaptureGroup still exists to cache the matched group name. This also removes the recursion level from the MatchState, as it can exist as a local variable in Matcher::execute instead.	2021-08-15 11:43:45 +01:00
Timothy Flynn	2e4b6fd1ac	LibRegex: Ensure escaped code points are exactly 4 digits in length	2021-08-15 11:43:45 +01:00
Ali Mohammad Pur	15f95220ae	AK+Everywhere: Delete Variant's default constructor This was exposed to the user by mistake, and even accumulated a bunch of users that didn't blow up out of sheer luck.	2021-08-13 17:31:39 +04:30
Timothy Flynn	df14d11a11	LibRegex: Disallow invalid interval qualifiers in Unicode mode Fixes all remaining 'built-ins/RegExp/property-escapes' test262 tests.	2021-08-11 13:11:01 +02:00
Timothy Flynn	484ccfadc3	LibRegex: Support property escapes of Unicode script extensions	2021-08-04 13:50:32 +01:00
Timothy Flynn	06088df729	LibRegex: Support property escapes of the Unicode script property Note that unlike binary properties and general categories, scripts must be specified in the non-binary (Script=Value) form.	2021-08-04 13:50:32 +01:00
Timothy Flynn	1e10d6d7ce	LibRegex: Support property escapes of Unicode General Categories This changes LibRegex to parse the property escape as a Variant of Unicode Property & General Category values. A byte code instruction is added to perform matching based on General Category values.	2021-08-02 21:02:09 +04:30
Timothy Flynn	d485cf29d7	LibRegex+LibUnicode: Begin implementing Unicode property escapes This supports some binary property matching. It does not support any properties not yet parsed by LibUnicode, nor does it support value matching (such as Script_Extensions=Latin).	2021-07-30 21:26:31 +01:00
Ali Mohammad Pur	36bfc912fc	LibRegex: Switch to east-const style	2021-07-23 21:19:21 +04:30
Ali Mohammad Pur	c8b2199251	LibRegex: Clear previous capture group contents in ECMA262 mode ECMA262 requires that the capture groups only contain the values from the last iteration, e.g. `((c)(a)?(b))` should _not_ contain 'a' in the second capture group when matching "cabcb".	2021-07-23 21:19:21 +04:30
Ali Mohammad Pur	11a8476cf4	LibRegex: Use the parser state capture group count in BRE Otherwise the users won't know how many capture groups are in the parsed regular expression.	2021-07-10 23:14:08 +04:30
Ali Mohammad Pur	54d89609de	LibRegex: Add support for the Basic POSIX regular expressions This implements the internal regex stuff for #8506.	2021-07-10 13:33:08 +02:00
Brian Gianforcaro	1682f0b760	Everything: Move to SPDX license identifiers in all files. SPDX License Identifiers are a more compact / standardized way of representing file license information. See: https://spdx.dev/resources/use/#identifiers This was done with the `ambr` search and replace tool. ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *	2021-04-22 11:22:27 +02:00
AnotherTest	c128b3fd91	LibRegex: Remove 'ReadDigitFollowPolicy' as it's no longer needed Thanks to @GMTA: `1b071455b1 (r49343474)`	2021-04-10 12:10:45 +02:00
Jelle Raaijmakers	db321db5f4	LibRegex: Parse `\0` as a zero-byte instead of 0x30 ("0") This was causing some regexes to trip up. Fixes #6202.	2021-04-09 21:53:14 +02:00
AnotherTest	6bbb26fdaf	LibRegex: Allow references to capture groups that aren't parsed yet This only applies to the ECMA262 parser. This behaviour is an ECMA262-specific quirk, such references always generate zero-length matches (even on subsequent passes). Also adds a test in LibJS's test suite. Fixes #6039.	2021-04-01 21:55:47 +02:00
AnotherTest	f05e518cbc	LibRegex: Implement section B.1.4. of the ECMA262 spec This allows the parser to deal with crazy patterns like the one in #5517.	2021-02-27 07:31:01 +01:00
Andreas Kling	13d7c09125	Libraries: Move to Userland/Libraries/	2021-01-12 12:17:46 +01:00

19 commits
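
Note on 9509433e25: the commit message contrasts emitting an instruction N times in the parser with a single REPEAT operation that the executor loops over. The following is a minimal sketch of that idea only, assuming a toy two-opcode bytecode. The names `Op`, `Insn`, `compile_unrolled`, `compile_with_repeat`, and `execute` are hypothetical and are not LibRegex's actual ByteCode API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical toy bytecode for illustration; not LibRegex's instruction set.
enum class Op { Char, Repeat };

struct Insn {
    Op op;
    char ch;             // Char: the character to match
    std::uint64_t count; // Repeat: how many times to run the next instruction
};

// Eager approach: the compiler emits the instruction n times,
// so the program size grows linearly with n.
std::vector<Insn> compile_unrolled(char c, std::uint64_t n)
{
    std::vector<Insn> program;
    for (std::uint64_t i = 0; i < n; ++i)
        program.push_back({ Op::Char, c, 0 });
    return program;
}

// Deferred approach: one Repeat instruction followed by its body,
// so the program size is constant regardless of n.
std::vector<Insn> compile_with_repeat(char c, std::uint64_t n)
{
    std::vector<Insn> program;
    program.push_back({ Op::Repeat, 0, n });
    program.push_back({ Op::Char, c, 0 });
    return program;
}

static bool match_one(Insn const& insn, std::string_view input, std::size_t& pos)
{
    if (pos >= input.size() || input[pos] != insn.ch)
        return false;
    ++pos;
    return true;
}

// Returns true if the program matches a prefix of the input.
bool execute(std::vector<Insn> const& program, std::string_view input)
{
    std::size_t pos = 0;
    for (std::size_t ip = 0; ip < program.size(); ++ip) {
        Insn const& insn = program[ip];
        if (insn.op == Op::Repeat) {
            if (ip + 1 >= program.size())
                return false; // malformed program: Repeat without a body
            Insn const& body = program[ip + 1];
            for (std::uint64_t i = 0; i < insn.count; ++i) {
                if (!match_one(body, input, pos))
                    return false; // bail early on the first failed iteration
            }
            ++ip; // skip the body instruction that was just repeated
        } else if (!match_one(insn, input, pos)) {
            return false;
        }
    }
    return true;
}

// Small usage example: both encodings agree on the same input.
void self_check()
{
    assert(execute(compile_unrolled('a', 3), "aaab"));
    assert(execute(compile_with_repeat('a', 3), "aaab"));
}
```

In this sketch, compiling a repetition count of 2^53 - 1 with `compile_with_repeat` still produces two instructions, and executing it against a short input returns as soon as one iteration fails, which mirrors the two benefits the commit message names.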