serenity

mirror of https://github.com/RGBCube/serenity synced 2025-10-26 12:02:34 +00:00

Author	SHA1	Message	Date
Timothy Flynn	2d2f713426	LibUnicode: Generate per-locale minimum grouping digit values Previously, we were breaking up digits into groups without regard for the locale's minimumGroupingDigits value in the CLDR. This value is 1 in most locales, but is 2 in locales such as pl-PL. What this means is that in those locales, the group separator should only be inserted if the thousands group has at least 2 digits. So 1000 is formatted as "1,000" in en-US, but "1000" in pl-PL. And 10000 is "10,000" in en-US and "10 000" in pl-PL.	2022-01-27 20:30:52 +00:00
Timothy Flynn	bced4e9324	LibJS+LibUnicode: Convert Intl.ListFormat to use Unicode::Style Remove ListFormat's own definition of the Style enum, which was further duplicated by a generated ListPatternStyle enum with the same values.	2022-01-25 19:02:59 +00:00
Timothy Flynn	4400150cd2	LibJS+LibUnicode: Return the appropriate time zone name depending on DST	2022-01-19 21:20:41 +00:00
Timothy Flynn	bf677eb485	LibUnicode: Generate both standard and daylight time zone names While LibTimeZone didn't support DST, we only generated one of them, preferring the standard name. Now that DST can be tested, generate both names.	2022-01-19 21:20:41 +00:00
Timothy Flynn	701b7810ba	LibUnicode: Generate code point abbreviations	2022-01-18 15:13:25 +00:00
Idan Horowitz	877ae85017	LibJS+LibUnicode: Make static const Utf8View variables constexpr	2022-01-17 14:46:07 +00:00
Timothy Flynn	c86f7a675d	LibUnicode: Do not limit language display names to known locales Currently, the UnicodeLocale generator collects a list of known locales from the CLDR before processing language display names. For each locale, the identifier is broken into language, script, and region subtags, and we create a list of seen languages. When processing display names, we skip languages we hadn't seen in that first step. This is insufficient for language display names like "en-GB", which do not have a locale entry in the CLDR, and thus are skipped. So instead, create the list of known languages by actually reading through the list of languages which have a display name.	2022-01-13 23:05:31 +01:00
Timothy Flynn	91acc2e9c5	LibUnicode: Parse and generate locale display patterns These patterns indicate how to display locale strings when that locale contains multiple subtags. For example, "en-US" would be displayed as "English (United States)".	2022-01-13 23:05:31 +01:00
Timothy Flynn	0d75949827	LibUnicode: Parse and generate locale display names for date fields	2022-01-13 13:43:57 +01:00
Timothy Flynn	7f162c471d	LibUnicode: Parse and generate locale display names for calendars Note there's a bit of an unfortunate duplication in the calendar enum generated by UnicodeLocale and the existing enum generated by UnicodeDateTimeFormat. The former contains every calendar known to the CLDR, whereas the latter contains the calendars we've actually parsed for DateTimeFormat (currently only Gregorian). The new enum generated here can be removed once DateTimeFormat knows about all calendars.	2022-01-13 13:43:57 +01:00
Timothy Flynn	bdf02c21e1	LibUnicode: Swap the preferred order of standard time zone display names Our generator is currently preferring the DST variant of the time zone display names over the non-DST variant. LibTimeZone currently does not have DST support, and operates in a mode that basically assumes DST does not exist. Swap the display names for now just to be consistent until we have DST support. Note we will need to generate both of these variants and select the appropriate one at runtime once we have DST support.	2022-01-12 15:43:12 +01:00
Timothy Flynn	0d8120eeb2	LibUnicode: Perform number system lookups by enumeration value Now that number systems are generated as an enum, we can generate the number system data in the order of that enum. This lets us perform lookups of that data by index instead of a loop of string comparisons.	2022-01-12 10:49:07 +01:00
Timothy Flynn	c5138f0f2b	LibUnicode: Parse number system digits from the CLDR We had a hard-coded table of number system digits copied from ECMA-402. Turns out these digits are in the CLDR, so let's parse the digits from there instead of hard-coding them.	2022-01-12 10:49:07 +01:00
Timothy Flynn	e2dfbe8f67	LibUnicode: Parse and generate long and short generic time zone names This implements the CalendarPatternStyle::{Long,Short}Generic styles of time zone name formatting.	2022-01-11 23:56:35 +01:00
Timothy Flynn	8d35563f28	LibUnicode: Implement TR-35's localized GMT offset formatting This adds an API to use LibTimeZone to convert a time zone such as "America/New_York" to a GMT offset string like "GMT-5" (short form) or "GMT-05:00" (long form).	2022-01-11 23:56:35 +01:00
Timothy Flynn	1c2c98ac5d	LibTimeZone: Add method to convert a time zone to a string	2022-01-11 00:36:45 +01:00
Timothy Flynn	b543c3e490	Meta: Don't assume how each generator wants to generate keyed map names The generate_mapping helper generates a series of structs like: Array<SomeType, 1> s_mapping_key_0 {}; Array<SomeType, 2> s_mapping_key_1 {}; Array<SomeType, 3> s_mapping_key_2 {}; Array<Span<SomeType const>> s_mapping { { s_mapping_key_0.span(), s_mapping_key_1.span(), s_mapping_key_2.span(), } }; Where the names of the struct were generated by the format_mapping_name lambda inside the helper. Rather than this lambda making assumptions on how each generator wants to name its structs, add a parameter for the caller to provide a naming formatter. This is because the TimeZoneData generator will want pretty specific identifier formatting rules.	2022-01-11 00:36:45 +01:00
Timothy Flynn	6da1bfeeea	Meta: Support generating case-insensitive value-from-string methods This also extracts the default parameters for generate_value_from_string to a structure. This is just to make it cleaner to add new options.	2022-01-11 00:36:45 +01:00
Timothy Flynn	498b741434	LibUnicode: Use LibTimeZone's list of time zone names LibUnicode no longer needs to generate a list of time zone names that it parsed from metaZones.json. We can defer to the TZDB for a golden list of time zones.	2022-01-08 12:45:34 +01:00
Timothy Flynn	ca9123f66f	LibUnicode: Rename DateTimeFormat's generator's TimeZone struct Before using LibTimeZone within LibUnicode, rename this structure to avoid naming conflicts with the TimeZone namespace.	2022-01-08 12:45:34 +01:00
mjz19910	10ec98dd38	Everywhere: Fix spelling mistakes	2022-01-07 15:44:42 +01:00
Timothy Flynn	6d7d9dd324	LibUnicode: Do not assume time zones & meta zones have a 1-to-1 mapping The generator parses metaZones.json to form a mapping of meta zones to time zones (AKA "golden zone" in TR-35). This parser errantly assumed this was a 1-to-1 mapping.	2022-01-06 22:28:01 +01:00
Timothy Flynn	62d8d1fdfd	LibUnicode: Move UTC verification to the scope that requires it In Unicode::get_time_zone_name(), we don't need to require that the time zone is UTC for long- and short-style name lookups. This is required for other styles, because they will depend on TZDB data - so move the VERIFY to that scope.	2022-01-06 22:28:01 +01:00
Timothy Flynn	ec7d5351ed	LibJS+LibUnicode: Handle flexible day periods that roll over midnight When searching for the locale-specific flexible day period for a given hour, we were neglecting to handle cases where the period crosses 00:00. For example, the en locale defines a day period range of [21:00, 06:00). When given the hour of 05:00, we were checking if (21 <= 5 && 5 < 6), thus not recognizing that the hour falls in that period.	2022-01-05 16:22:55 +01:00
Timothy Flynn	dd88ff70ac	LibUnicode: Remove now unused value-from-string generator overload	2022-01-04 22:49:43 +00:00
Timothy Flynn	437b9fe204	LibUnicode: Convert UnicodeData to link with weak symbols	2022-01-04 22:49:43 +00:00
Timothy Flynn	f576142fe8	LibJS+LibUnicode: Convert UnicodeLocale to link with weak symbols	2022-01-04 22:49:43 +00:00
Timothy Flynn	cf8e11a562	LibUnicode: Add temporary overload of value-from-string generator This is a temporary mechanism while LibUnicode is in an in-between state where some symbols are weakly linked and others are dynamically loaded. The latter require an asm() label to be loaded.	2022-01-04 22:49:43 +00:00
Timothy Flynn	ba4cdf34f8	LibUnicode: Convert UnicodeDateTimeFormat to link with weak symbols	2022-01-04 22:49:43 +00:00
Timothy Flynn	98709d9be1	LibUnicode: Convert UnicodeNumberFormat to link with weak symbols Currently, we load the generated Unicode symbols with dlopen at runtime. This is unnecessary as of `565a880ce5`. Applications that want Unicode data now link directly against the shared library holding that data. So the same functionality can be achieved with weak symbols.	2022-01-04 22:49:43 +00:00
Timothy Flynn	126a3fe180	LibUnicode: Add minimal support for generic & offset-based time zones ECMA-402 now supports short-offset, long-offset, short-generic, and long-generic time zone name formatting. For example, in the en-US locale the America/Eastern time zone would be formatted as: short-offset: GMT-5 long-offset: GMT-05:00 short-generic: ET long-generic: Eastern Time We currently only support the UTC time zone, however. Therefore, this very minimal implementation does not consider GMT offset or generic display names. Instead, the CLDR defines specific strings for UTC.	2022-01-03 15:11:59 +01:00
Timothy Flynn	52394deece	LibUnicode: Remove now unused value-from-string generator overload The generate_value_from_string_for_dynamic_loading() overload was just temporary until all generators were switched over to dynamic loading.	2021-12-21 13:09:49 -08:00
Timothy Flynn	15e1498419	LibUnicode: Dynamically load the generated UnicodeDateTimeFormat symbols	2021-12-21 13:09:49 -08:00
Timothy Flynn	a1f0ca59ae	LibUnicode: Dynamically load the generated UnicodeNumberFormat symbols	2021-12-21 13:09:49 -08:00
Timothy Flynn	09be26b5d2	LibUnicode: Dynamically load the generated UnicodeLocale symbols	2021-12-21 13:09:49 -08:00
Timothy Flynn	3fd53baa25	LibUnicode: Dynamically load the generated UnicodeData symbols The generated data for libunicodedata.so is quite large, and loading it is a price paid by nearly every application by way of depending on LibRegex. In order to defer this cost until an application actually uses one of the surrounding APIs, dynamically load the generated symbols. To be able to load the symbols dynamically, the generated methods must have demangled names. Typically, this is accomplished with `extern "C"` blocks. The clang toolchain complains about this here because the types returned from the generators are strictly C++ types. So to demangle the names, we use the asm() compiler directive to manually define a symbol name; the caveat is that we must be sure the symbols are unique. As an extra precaution, we prefix each symbol name with "unicode_". For more details, see: https://gcc.gnu.org/onlinedocs/gcc/Asm-Labels.html The symbol loader used in this implementation provides the additional benefit of removing many [[maybe_unused]] attributes from the LibUnicode methods. Internally, if ENABLE_UNICODE_DATABASE_DOWNLOAD is OFF, the loader is able to stub out the function pointers it returns. Note that as of this commit, LibUnicode is still directly linked against LibUnicodeData. This commit is just a first step towards removing that.	2021-12-21 13:09:49 -08:00
Michel Hermier	060e5ccbbc	Lagom: Bind `time_zone_list_index_type` in the generator The variable `s_time_zone_list_index_type` seems to be unused (detected when compiling with clang), and it seems logical to bind it even if it is not used for now.	2021-12-18 21:01:10 -08:00
Timothy Flynn	ce6c515873	LibUnicode: Generate unique list patterns and lists of list patterns	2021-12-13 21:28:56 -08:00
Timothy Flynn	0ad2decd04	LibUnicode: Generate unique list of keyword values	2021-12-13 21:28:56 -08:00
Timothy Flynn	0c6cc4ad96	LibUnicode: Generate unique lists of localized currencies	2021-12-13 21:28:56 -08:00
Timothy Flynn	a45f2ccc25	LibUnicode: Generate unique lists of languages, territories, and scripts	2021-12-13 21:28:56 -08:00
Timothy Flynn	6e5f0b139b	LibUnicode: Remove unused fields from generated structures A couple of structures held a string index that is unused. Removing them also removes the string values from the unique string list.	2021-12-13 21:28:56 -08:00
Timothy Flynn	77fc877c04	LibUnicode: Generate unique lists of hour cycles	2021-12-13 21:28:56 -08:00
Timothy Flynn	6f17696176	LibUnicode: Generate unique lists of time zone structures	2021-12-13 21:28:56 -08:00
Timothy Flynn	df33156462	LibUnicode: Generate unique lists of day period structures	2021-12-13 21:28:56 -08:00
Timothy Flynn	265785e847	LibUnicode: Generate unique day period structures	2021-12-13 21:28:56 -08:00
Timothy Flynn	7af1818e76	LibUnicode: Generate unique time zone structures Each of the 374 locales contain 156 time zone structures. Of these 58,344 structures, 13,578 are unique.	2021-12-13 21:28:56 -08:00
Timothy Flynn	b14b37f386	LibUnicode: Generate unique calendar structures Of the 374 generated calendars, 173 are unique.	2021-12-13 21:28:56 -08:00
Timothy Flynn	4b721597d7	LibUnicode: Generate unique lists of calendar range patterns Of the 374 range pattern lists and 374 range12 pattern lists, 230 are unique.	2021-12-13 21:28:56 -08:00
Timothy Flynn	9fc2442e7d	LibUnicode: Generate unique lists of calendar patterns Of the 374 generated lists, 152 are unique. These lists have upwards of 1000 entries as well, so the de-duplication is particularly nice.	2021-12-13 21:28:56 -08:00
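
The minimum grouping digits rule described in commit 2d2f713426 can be illustrated with a small standalone sketch. This is not LibUnicode's code: `format_grouped` and its parameters are hypothetical names, the group size is assumed to be 3, and the rule is read as "group only when the number has at least 3 + minimumGroupingDigits digits", which matches the examples in the commit message.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not LibUnicode's API). Inserts `separator` between
// groups of three digits, but only when the number is long enough that the
// leading group would hold at least `minimum_grouping_digits` digits at the
// point where grouping first becomes possible.
std::string format_grouped(unsigned long long value,
                           std::size_t minimum_grouping_digits,
                           std::string const& separator)
{
    std::string digits = std::to_string(value);

    // Too short to group under this locale's minimum: return the bare digits.
    if (digits.size() < 3 + minimum_grouping_digits)
        return digits;

    // The leading group holds 1 to 3 digits; every later group holds 3.
    std::size_t leading = digits.size() % 3;
    if (leading == 0)
        leading = 3;

    std::string result;
    result.append(digits, 0, leading);
    for (std::size_t i = leading; i < digits.size(); i += 3) {
        result += separator;
        result.append(digits, i, 3);
    }
    return result;
}
```

With a minimum of 1 and a "," separator this yields "1,000" for 1000; with a minimum of 2 and a " " separator it yields "1000" for 1000 and "10 000" for 10000. The separator is passed in by the caller because the actual character is locale data.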

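
The midnight rollover fix described in commit ec7d5351ed comes down to treating a range whose start is later than its end as wrapping past 00:00. A minimal sketch, assuming whole-hour bounds and a half-open range; `hour_in_day_period` is a hypothetical name, not a LibUnicode function:

```cpp
// Hypothetical helper (not LibUnicode's API). Returns true if `hour` lies in
// the half-open range [begin, end). When begin > end the range is taken to
// cross midnight, e.g. [21, 6) covers 21:00 through 05:59.
constexpr bool hour_in_day_period(int hour, int begin, int end)
{
    if (begin <= end)
        return begin <= hour && hour < end;

    // Wrapped range: match either the late-evening or the early-morning part.
    return hour >= begin || hour < end;
}
```

For the range [21, 6) from the commit message, hour 5 now matches, whereas the naive check `21 <= 5 && 5 < 6` rejects it.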

160 commits
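
The short and long GMT offset forms mentioned in commit 8d35563f28 ("GMT-5" and "GMT-05:00") can be sketched as below. This is an illustration only: `format_gmt_offset` is a hypothetical name, the "GMT" prefix, the ASCII sign characters, and the bare "GMT" for a zero offset are assumed English-style patterns, and the real implementation reads its patterns from the CLDR per locale.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not LibUnicode's API). Formats a UTC offset given in
// seconds using assumed English-style patterns.
std::string format_gmt_offset(int offset_seconds, bool long_form)
{
    // Assumption: a zero offset is shown as the bare prefix.
    if (offset_seconds == 0)
        return "GMT";

    char sign = offset_seconds < 0 ? '-' : '+';
    int total_minutes = std::abs(offset_seconds) / 60;
    int hours = total_minutes / 60;
    int minutes = total_minutes % 60;

    // Zero-pad a value to two digits.
    auto two_digits = [](int value) {
        std::string text = std::to_string(value);
        return text.size() < 2 ? "0" + text : text;
    };

    std::string result = "GMT";
    result += sign;

    if (long_form) {
        // Long form: always HH:mm.
        result += two_digits(hours);
        result += ':';
        result += two_digits(minutes);
    } else {
        // Short form: unpadded hours, minutes only when non-zero.
        result += std::to_string(hours);
        if (minutes != 0) {
            result += ':';
            result += two_digits(minutes);
        }
    }
    return result;
}
```

An offset of -18000 seconds (five hours behind UTC) gives "GMT-5" in the short form and "GMT-05:00" in the long form.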