mirror of https://github.com/RGBCube/serenity
synced 2025-07-25 17:27:35 +00:00
LibRegex: Generate a search tree when patterns would benefit from it
This takes the previous alternation optimisation and applies it to all the alternation blocks instead of just the few instructions at the start.
By generating a trie of instructions, all logically equivalent instructions will be consolidated into a single node, allowing the engine to avoid checking the same thing multiple times.
For instance, given the pattern /abc|ac|ab/, this optimisation would generate the following tree:

    - a
    | - b
    | | - c
    | | | - <accept>
    | | - <accept>
    | - c
    | | - <accept>

which will attempt to match 'a' or 'b' only once, and will also limit the amount of backtracking performed when alternatives fail to match.

This optimisation is currently gated behind a simple cost model that estimates the number of instructions generated. The estimate is pessimistic for small patterns, though the change in performance for such patterns is not particularly large.
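The idea of merging shared prefixes can be illustrated with a minimal, standalone sketch. It is not the LibRegex implementation: the real optimiser works on bytecode instructions, whereas this sketch assumes the alternatives are plain literal strings with nothing following the alternation. The names `TrieNode`, `build_trie` and `match_prefix` are hypothetical. To keep the usual "leftmost alternative wins" behaviour, each accepting node records the lowest index of an alternative that ends there.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <optional>
#include <string_view>
#include <vector>

// Hypothetical node type for illustration only; LibRegex builds its tree
// out of bytecode instructions rather than single characters.
struct TrieNode {
    std::map<char, std::unique_ptr<TrieNode>> children;
    // Lowest index of an alternative that ends at this node, if any.
    std::optional<std::size_t> accept_index;
};

// Inserts every alternative into one trie, so that equal prefixes
// (e.g. the 'a' in "abc", "ac" and "ab") share a single node.
TrieNode build_trie(const std::vector<std::string_view>& alternatives)
{
    TrieNode root;
    for (std::size_t i = 0; i < alternatives.size(); ++i) {
        TrieNode* node = &root;
        for (char ch : alternatives[i]) {
            auto& child = node->children[ch];
            if (!child)
                child = std::make_unique<TrieNode>();
            node = child.get();
        }
        // Alternatives are inserted in order, so the first one to end here
        // already has the lowest index; duplicates are ignored.
        if (!node->accept_index)
            node->accept_index = i;
    }
    return root;
}

// Returns the length matched at the start of `subject` by the earliest
// alternative that matches, or nullopt if none does. Each subject
// character is examined at most once.
std::optional<std::size_t> match_prefix(const TrieNode& root, std::string_view subject)
{
    const TrieNode* node = &root;
    std::optional<std::size_t> best_index;
    std::size_t best_length = 0;
    std::size_t position = 0;
    while (true) {
        if (node->accept_index && (!best_index || *node->accept_index < *best_index)) {
            best_index = node->accept_index;
            best_length = position;
        }
        if (position == subject.size())
            break;
        auto it = node->children.find(subject[position]);
        if (it == node->children.end())
            break;
        node = it->second.get();
        ++position;
    }
    if (!best_index)
        return std::nullopt;
    return best_length;
}
```

With the alternatives of /abc|ac|ab/, `match_prefix(build_trie({ "abc", "ac", "ab" }), "ab")` yields 2 after comparing 'a' and 'b' once each, instead of retrying 'a' for every alternative.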
parent 18f4b6c670
commit 4e69eb89e8
4 changed files with 347 additions and 152 deletions
@@ -1047,6 +1047,8 @@ TEST_CASE(optimizer_alternation)
    Array tests {
        // Pattern, Subject, Expected length
        Tuple { "a|"sv, "a"sv, 1u },
        Tuple { "a|a|a|a|a|a|a|a|a|b"sv, "a"sv, 1u },
        Tuple { "ab|ac|ad|bc"sv, "bc"sv, 2u },
    };

    for (auto& test : tests) {