From f5c1bbc00bceed95f67d612735991fd90586a26e Mon Sep 17 00:00:00 2001
From: Timothy Flynn
Date: Tue, 3 Aug 2021 17:11:19 -0400
Subject: [PATCH] LibUnicode: Parse UCD Scripts.txt and generate as a Unicode property

There are a couple of minor nuances with parsing script values, compared
to other properties. In Scripts.txt, the UCD file lists the full name of
each script; other properties, like General Category, list the shorter
name in their primary files. This means that the aliases listed in
PropertyValueAliases.txt are reversed for script values.
---
 .../Libraries/LibUnicode/CharacterTypes.cpp   |  22 ++++
 .../Libraries/LibUnicode/CharacterTypes.h     |   3 +
 .../CodeGenerators/GenerateUnicodeData.cpp    | 104 +++++++++++++-----
 Userland/Libraries/LibUnicode/Forward.h       |   1 +
 .../Libraries/LibUnicode/unicode_data.cmake   |  11 +-
 5 files changed, 112 insertions(+), 29 deletions(-)

diff --git a/Userland/Libraries/LibUnicode/CharacterTypes.cpp b/Userland/Libraries/LibUnicode/CharacterTypes.cpp
index b4d3fa09d5..f780b31973 100644
--- a/Userland/Libraries/LibUnicode/CharacterTypes.cpp
+++ b/Userland/Libraries/LibUnicode/CharacterTypes.cpp
@@ -318,4 +318,26 @@ bool is_ecma262_property([[maybe_unused]] Property property)
 #endif
 }
+Optional