The amount of aliases in the likely-subtags dataset is quite large, so
this also needed to change the way the data is generated. Otherwise, the
compiler would complain about the size of the generated code.
Previously, a static method was generated that would effectively parse
the dataset into a HashMap of Unicode::LanguageID at runtime. We now
perform that parsing at generation-time, and instead generate an Array
of a structure similar to Unicode::LanguageID (we cannot use the same
structure because it contains String and Optional, which cannot be used
at compile-time).
The DOM specification says that the primary use case for these is to
give Promises abort semantics. It is also a prerequisite for Fetch,
as it is used to make Fetch abortable.
a
CLDR contains a set of likely subtag data where, given a locale, you can
resolve what is the most likely language, script, or territory of that
locale. This data is needed for resolving territory aliases. These
aliases might contain multiple territories, and we need to resolve which
of those territories is most likely correct for a locale.
Note that the likely subtag data is quite huge (a few thousand entries).
As an optimization encouraged by the spec, we only generate the smallest
subset of this data that we actually need (about 150 entries).
Most alias substitutions are "simple", meaning that alias matching is
done by examining a single locale subtag. However, there are a handful
of "complex" aliases where matching is done by examining multiple
subtags. For example, the variant subtag "lojban" causes the locale
"art-lojban" to be canonicalized to "jbo", but only when the language
subtag is "art" (i.e. this should not occur for the locale "en-lojban").
This generates a method to perform complex alias matching.
CLDR contains a set of aliases for languages, territories, etc. that no
longer are meant to be used (e.g. due to deprecation). For example, the
language "aam" is deprecated and should be canonicalized as "aas".
This is needed so all headers and files exist on disk, so that
the sonar cloud analyzer can find them when executing the compilation
commands contained in compile_commands.json, without actually building.
Co-authored-by: Andrew Kaster <akaster@serenityos.org>
This allows us to remove all the add_subdirectory calls from the top
level CMakeLists.txt that referred to targets linking LagomCore.
Segregating the host tools and Serenity targets helps us get to a place
where the main Serenity build can simply use a CMake toolchain file
rather than swapping all the compiler/sysroot variables after building
host libraries and tools.
Moving this helper CMake file to the centralized Meta/CMake folder helps
to get a better grasp on what extra files are required for the build,
and what files are generated.
While we're at it, don't use add_compile_definitions for
ENABLE_UNICODE_DATA, which only needs to be seen by LibUnicode sources.
The top-level CMakeLists.txt already automatically detects ccache, but
CI will invoke CMake with Lagom's CMakeLists.txt. Add an option to Lagom
to do the same detection.
This standard CMake option controls whether add_library() calls will
use STATIC or SHARED by default. The flag is set to on by default
since that's what we want for normal CI jobs and local builds and the
test262 runner, but disabled for oss-fuzz builds.
This should finally fix the oss-fuzz build after it was broken in #9017
oss-fuzz un-breakage was verified by running the following commands in
the oss-fuzz repo:
python infra/helper.py build_image serenity
python infra/helper.py build_fuzzers --sanitizer address --engine afl \
--architecture x86_64 serenity /path/to/local/checkout/Meta/Lagom
python infra/helper.py check_build --sanitizer address --engine afl \
--architecture x86_64 serenity
This supports some binary property matching. It does not support any
properties not yet parsed by LibUnicode, nor does it support value
matching (such as Script_Extensions=Latin).
Split the Lagom build into shared libraries to match the Serenity build.
This reduces the cognitive load when trying to edit the Lagom CMakeLists
significantly. It also reduces the amount of source files that must be
compiled to run each test or host program significantly.
Also re-organize all the build rules into sections. And reorganize the
CMakeLists file in general.
LibTTF has a concrete dependency on LibGfx for things like Gfx::Bitmap,
and LibGfx has a concrete dependency in the TTF::Font class in
Gfx::FontDatabase. This circular dependency works fine for Serenity and
Lagom Linux builds of the two libraries. It also works fine for static
library builds on Lagom macOS builds.
However, future changes will make Lagom use shared libraries, and
circular library dependencies are not tolerated in macOS.
This is primarily to allow using LibUnicode within LibJS and its REPL.
Note: this seems to be the first time that a Lagom dependency requires
generated source files. For this to work, some of Lagom's CMakeLists.txt
commands needed to be re-organized to include the CMake files that fetch
and parse UnicodeData.txt. The paths required to invoke the generator
also differ depending on what is currently building (SerenityOS vs.
Lagom as part of the Serenity build vs. a standalone Lagom build).
These are usually incorrect, and people sometimes forget to add the
correct values as a result of them being optional, so they should just
be specified explicitly.
This removes all usages of the non-standard define_property helper
method and replaces all it's usages with the specification required
alternative or with define_direct_property where appropriate.
All GUI applications currently load all TTF fonts on startup
(to populate the Gfx::FontDatabase. This could probably be smarter.)
Before this patch, everyone would open the files and read them into
heap-allocated storage. Now we simply mmap() them instead. :^)
This commit adds a bunch of passes, the most interesting of which is a
pass that merges blocks together, and a pass that places blocks that
flow into each other next to each other, and a very simply pass that
removes duplicate basic blocks.
Note that this does not remove the jump at the end of each block in that
pass to avoid scope creep in the passes.
Previously, AK::Function would accept _any_ callable type, and try to
call it when called, first with the given set of arguments, then with
zero arguments, and if all of those failed, it would simply not call the
function and **return a value-constructed Out type**.
This lead to many, many, many hard to debug situations when someone
forgot a `const` in their lambda argument types, and many cases of
people taking zero arguments in their lambdas to ignore them.
This commit reworks the Function interface to not include any such
surprising behaviour, if your function instance is not callable with
the declared argument set of the Function, it can simply not be
assigned to that Function instance, end of story.
Previously ByteBuffer::grow() behaved like Vector<T>::resize().
However the function name was somewhat ambiguous - and so this patch
updates ByteBuffer to behave more like Vector<T> by replacing grow()
with resize() and adding an ensure_capacity() method.
This also lets the user change the buffer's capacity without affecting
the size which was not previously possible.
Additionally this patch makes the capacity() method public (again).
This only tests "can it be parsed", but the goal of this commit is to
provide a test framework that can be built upon :)
The conformance tests are downloaded, compiled* and installed only if
the INCLUDE_WASM_SPEC_TESTS cmake option is enabled.
(*) Since we do not yet have a wast parser, the compilation is delegated
to an external tool from binaryen, `wasm-as`, which is required for the
test suite download/install to succeed.
This *does* run the tests in CI, but it currently does not include the
spec conformance tests.
Previously the CMake options for -fsanitize=address, thread and
undefined were gated behind clang, which was unecessary. Only
-fsanitize=fuzzer is clang-only.
Since the operations are already complicated and will become even more
so soon, let's split them into their own files. We can also integrate
the NumberTheory operations that would better fit there into this class
as well.
This commit doesn't change behaviors, but moves the allocation of some
variables into caller classes.