The first step of to_temporal_calendar_with_iso_default() is checking
whether the given value is undefined, so we should actually pass that
instead of unconditionally dereferencing the Optional<String>.
It's not the `to_uint<u8>()` call that would fail, if we have a value
for these productions they will always be valid numbers. We do need to
provide a fallback for when that's not the case and any of them is
undefined, i.e. an empty Optional.
This is the start of a parser for the ISO 8601 grammar used in the
Temporal spec:
https://tc39.es/proposal-temporal/#sec-temporal-iso8601grammar
We will, on purpose, not use a generic ISO 8601 parser from AK or
similar for two reasons:
- Many AOs make specific assumptions about which productions exist and
access them directly, even when they're part of a larger production.
- The spec says "The grammar deviates from the standard given in ISO
8601 in the following ways:" and then lists 17 of such deviations.
Making that work with a general purpose parser is not worth it.
The public API is not being used anywhere yet, but will be in the next
couple of commits. Likewise, the Production enum will be populated with
all the productions accessed directly (e.g. TemporalDateString).
Many thanks to Ali for showing me how to improve my initial approach
full of macros with a nice RAII helper - it's much nicer :^)
Co-Authored-By: Ali Mohammad Pur <mpfard@serenityos.org>
This would return incorrect results for negative inputs. It still does
to some extent, remainder() in step 2 might need to be replaced with
modulo (I opened an issue in tc39/proposal-temporal about that).
When the bytecode interpreter was converted to ThrowCompletionOr<Value>
it then also cleared the vm.exception() making it seem like no exception
was thrown.
Also removed the TRY_OR_DISCARD as that would skip the error handling
parts.
Turns out the only difference between our existing implementation and
the ECMA-402 implementation is we weren't passing the locales and
options list to each element.toLocaleString invocation.
This also adds spec comments to the definition.
This isn't a complete conversion to ErrorOr<void>, but a good chunk.
The end goal here is to propagate buffer allocation failures to the
caller, and allow the use of TRY() with formatting functions.
Also add slightly richer parse errors now that we can include a string
literal with returned errors.
This will allow us to use TRY() when working with JSON data.
This wasn't the case for compact patterns, but unit patterns can contain
multiple (up to 2, really) identifiers that must each be recognized by
LibJS.
Each generated NumberFormat object now stores an array of identifiers
parsed. The format pattern itself is encoded with the index into this
array for that identifier, e.g. the compact format string "0K" will
become "{number}{compactIdentifier:0}".
This field is currently used to store the StringView into the compact
name/symbol in the format string. Units will need to store a similar
field, so rename the field to be more generic, and extract the parser
for it.
Instead of currency pattern lookups within select_currency_unit_pattern,
rename the method to select_pattern_with_plurality and accept any list
of patterns. This method will be needed for units.
Currently, we get the following results
-1 - -2 = -1
-2 - -1 = 1
Correct would be:
-1 - -2 = 1
-2 - -1 = -1
This was already attempted to be fixed in 7ed8970, but that change was
incorrect. This directly translates to LibJS BigInts having the same
incorrect behavior - it even was tested.
Finding the best number format to use for compact notation involves
creating a Vector of all compact formats for the locale and looking for
the one that best matches the number's magnitude. ECMA-402 wants this
number format to be found multiple times, so cache the result for future
use.
The compact scale of each formatting rule was precomputed in commit:
be69eae651
Using the formula: compact scale = magnitude - pattern scale
This computation was off-by-one.
For example, consider the format key "10000-count-one", which maps to
"00 thousand" in en-US. What we are really after is the exponent that
best represents the string "thousand" for values greater than 10000
and less than 100000 (the next format key). We were previously doing:
log10(10000) - "00 thousand".count("0") = 2
Which clearly isn't what we want. Instead, if we do:
log10(10000) + 1 - "00 thousand".count("0") = 3
We get the correct exponent for each format key for each locale.
This commit also renames the generated variable from "compact_scale" to
"exponent" to match the terminology used in ECMA-402.
481f7d6afa tried to use `modulo()` here, but missed that the
code used `<=` instead of `<`.
Keep using `modulo()` and add an explicit conditional, which is
arguably clearer.
When the resulting week is in the previous year, we need to check if the
previous year is a leap year and can potentially have 53 weeks, instead
of the given year.
Also added a comment to briefly explain what's going on, as it took me a
while to figure out.
...and change the parameter types from i64 to double, as predicted by
a FIXME. The issue here is that integer division with a negative
dividend doesn't yield the same result as floating point division
wrapped in floor().
Additionally, the `days` variable needs to be signed.