serenity

mirror of https://github.com/RGBCube/serenity synced 2025-09-18 13:56:18 +00:00

Author	SHA1	Message	Date
Nico Weber	24a469f521	Everywhere: Prefer {:#x} over 0x{:x} in format strings The former automatically adapts the prefix to binary and octal output, and is what we already use in the majority of cases. Patch generated by: rg -l '0x\{' \| xargs sed -i '' -e 's/0x{:/{:#/' I ran it 4 times (until it stopped changing things) since each invocation only converted one instance per line. No behavior change.	2024-02-21 17:54:38 +01:00
Ali Mohammad Pur	5e1499d104	Everywhere: Rename {Deprecated => Byte}String This commit un-deprecates DeprecatedString, and repurposes it as a byte string. As the null state has already been removed, there are no other particularly hairy blockers in repurposing this type as a byte string (what it _really_ is). This commit is auto-generated: $ xs=$(ack -l \bDeprecatedString\b\\|deprecated_string AK Userland \ Meta Ports Ladybird Tests Kernel) $ perl -pie 's/\bDeprecatedString\b/ByteString/g; s/deprecated_string/byte_string/g' $xs $ clang-format --style=file -i \ $(git diff --name-only \| grep \.cpp\\|\.h) $ gn format $(git ls-files '.gn' '.gni')	2023-12-17 18:25:10 +03:30
Timothy Flynn	43e9dc0500	LibUnicode: Use weak symbols to provide default IDNA definitions Rather than using #ifdef blocks, update the fallback IDNA definitions to use weak symbols to match the rest of LibUnicode / LibLocale.	2023-12-10 10:19:14 -05:00
Simon Wanner	7d9fe44039	LibUnicode: Download and parse IDNA data	2023-12-10 08:04:58 -05:00
Tim Schumacher	a2f60911fe	AK: Rename GenericTraits to DefaultTraits This feels like a more fitting name for something that provides the default values for Traits.	2023-11-09 10:05:51 -05:00
Timothy Flynn	139c575cc9	LibUnicode: Update to Unicode version 15.1.0 https://unicode.org/versions/Unicode15.1.0/ This update includes a new set of code point properties, Indic Conjunct Break. These may have the values Consonant, Linker, or Extend. These are used in text segmentation to prevent breaking on some extended grapheme cluster sequences.	2023-09-15 18:30:26 +02:00
Andreas Kling	8b936b5912	AK: Make SourceGenerator::set() infallible	2023-08-22 13:08:24 +02:00
Sam Atkins	0d021a63c7	LibUnicode: Generate data for bidirectional character types This will let us examine code points to determine the rtl/ltr direction of a piece of text.	2023-08-20 16:21:35 -04:00
Lucas CHOLLET	3f35ffb648	Userland: Prefer `_string` over `_short_string` As `_string` can't fail anymore (since `3434412`), there are no real benefits to use the short variant in most cases.	2023-08-08 07:37:21 +02:00
Timothy Flynn	b91af3c6a0	LibUnicode: Remove a few generator tracking fields that are now unused These were used to generate specialized tables. Now that those tables have been migrated to general 2-stage lookup tables, these fields are all unused.	2023-07-28 05:28:50 +02:00
Timothy Flynn	456211932f	LibUnicode: Perform code point case conversion lookups in constant time Similar to commit `0652cc4`, we now generate 2-stage lookup tables for case conversion information. Only about 1500 code points are actually cased. This means that case information is rather highly compressible, as the blocks we break the code points into will generally all have no casing information at all. In total, this change: * Does not change the size of libunicode.so (which is nice because, generally, the 2-stage lookup tables are expected to trade a bit of size for performance). * Reduces the runtime of the new benchmark test case added here from 1.383s to 1.127s (about an 18.5% improvement).	2023-07-28 05:28:50 +02:00
Timothy Flynn	0ee133af90	LibUnicode: Separate code point case information into its own structure There is no functional change here. This information will compose the upcoming multistage casing tables in an upcoming patch. Extract it to its own struct to prepare for that.	2023-07-28 05:28:50 +02:00
Timothy Flynn	a332a8ad19	LibUnicode: Prepare Unicode data generator for multistage casing tables There is no functional change here. This just adjusts the changes made in commit `0652cc4` to be a bit more generic for code point casing tables. We currently only generate property tables, which boil down to a vector of booleans. Casing tables will be a struct of varying types, so this generalizes some of the generator to prepare for that ahead of time, to make the upcoming casing patch smaller / easier to grok.	2023-07-28 05:28:50 +02:00
Timothy Flynn	3fae92eea2	LibUnicode: Search code point properties sequentially at compile time When generating code point property tables, we currently binary search the code point range lists for each property to decide if a code point has that property. However, we are both iterating over the code points and through the sorted properties in order. This means we do not need to search code point ranges that are below the current code point at all. We can even remove the code point ranges that fall below the current code point, as we will not see a code point in those ranges again. On my machine, this reduces the run time of GenerateUnicodeData from 3.4 seconds to 1.2 seconds.	2023-07-28 05:28:50 +02:00
Timothy Flynn	0652cc48c0	LibUnicode: Perform code point property lookups in constant time We currently produce a single table for all categories of code point properties (GeneralCategory, Script, etc.). Each row contains a field indicating the range of code points to which that property applies. At runtime, we then do a binary search through that table to decide if a code point has a property. This changes our approach to generate a 2-stage lookup table for each of those categories. There is an in-depth explanation of these tables above the new `create_code_point_tables` method. The end effect is that code point property lookup is reduced from a binary search to constant-time array lookups. In total, this change: * Increases the size of libunicode.so from 2.7 MB to 2.9 MB. * Reduces the runtime of the new benchmark test case added here from 3.576s to 1.020s (a 3.5x speedup). * In a profile of resizing a TextEditor window with a 3MB file open, the runtime of checking if a code point has a word break property reduces from ~81% to ~56%.	2023-07-26 08:36:20 +02:00
Timothy Flynn	8f1d73abde	LibUnicode: Use the public CodePointRange in the code generator The next commit will need a type from LibUnicode/CharacterTypes.h. To avoid conflicts between that header's CodePointRange and the one that is defined in the code generator, just use the public definition.	2023-07-26 08:36:20 +02:00
Timothy Flynn	cb128dcf75	LibUnicode: Move the CodePointRangeComparator struct to a public header Move it out of the generated code so that it may be used by the code generator itself.	2023-07-26 08:36:20 +02:00
Timothy Flynn	c950f88611	LibUnicode: Stop generating Block property data We started generating this data in commit `0505e03`, but it was unused. It's still not used, so let's remove it, rather than bloating the size of libunicode.so with unused data. If we need it in the future, it's trivial to add back. Note we have always used the block name data from that commit, and that is still present here.	2023-07-26 08:36:20 +02:00
Ben Wiederhake	5cfa883b9f	LibUnicode: Explicitly mark HashMap copy	2023-05-19 22:33:57 +02:00
Lucas CHOLLET	8c34959b53	AK: Add the `Input` word to input-only buffered streams This concerns both `BufferedSeekable` and `BufferedFile`.	2023-05-09 11:18:46 +02:00
Cameron Youell	1d24f394c6	Everywhere: Use `LibFileSystem` where trivial	2023-03-21 19:03:21 +00:00
Sam Atkins	b18c1c7291	LibUnicode: Remove now-unused dir-iterator helper functions	2023-03-15 12:49:33 -04:00
Sam Atkins	8a8ad81aa1	LibUnicode: Migrate GenerateEmojiData to Directory::for_each_entry()	2023-03-15 12:49:33 -04:00
Sam Atkins	8672b380f6	LibUnicode: Read emoji file title from LexicalPath directly ... rather than taking the whole file name, and then manually trimming the extension off.	2023-03-15 12:49:33 -04:00
gustrb	5141c86587	AK: Rename CaseInsensitiveStringViewTraits to reflect intent Now it is called `CaseInsensitiveASCIIStringViewTraits`, so we can be more specific about what data structure does it operate onto. ;)	2023-03-14 21:34:32 +00:00
Tim Schumacher	8032724574	CodeGenerators: Ensure that we always print the entire generated output	2023-03-13 15:16:20 +00:00
Tim Schumacher	d5871f5717	AK: Rename Stream::{read,write} to Stream::{read_some,write_some} Similar to POSIX read, the basic read and write functions of AK::Stream do not have a lower limit of how much data they read or write (apart from "none at all"). Rename the functions to "read some [data]" and "write some [data]" (with "data" being omitted, since everything here is reading and writing data) to make them sufficiently distinct from the functions that ensure to use the entire buffer (which should be the go-to function for most usages). No functional changes, just a lot of new FIXMEs.	2023-03-13 15:16:20 +00:00
Sam Atkins	774f328783	LibCore+Everywhere: Return an Error from DirIterator::error() This also removes DirIterator::error_string(), since the same strerror() string will be included when you print the Error itself. Except in `ls` which is still using fprintf() for now.	2023-03-05 20:23:42 +01:00
Timothy Flynn	ca2b030336	LibUnicode: Use binary search for lookups into the generated emoji data This sorts the array of generated emoji data by code point (first by code point length, then by code point value). This lets us use a binary search to find emoji data, rather than the current linear search. In a profile of scrolling around /home/anon/Documents/emoji.txt, this reduces the runtime of Gfx::Emoji::emoji_for_code_points from 69.03% to 28.42%. Within that, Unicode::find_emoji_for_code_points reduces from 28.42% to just 1.95%.	2023-03-05 16:44:20 +01:00
Timothy Flynn	03f32bdf86	LibUnicode: Validate that all emoji images in /res/emoji actually exist This will raise a compile error if an emoji image was neglected to be added to e.g. emoji-serenity.txt, or if the code points are not correct.	2023-03-03 17:09:58 +00:00
Timothy Flynn	fd1fbad1d2	LibGfx+LibUnicode: Support specifying the path to search for emoji Similar to the FontDatabase, this will be needed for Ladybird to find emoji images. We now generate just the file name of emoji image in LibUnicode, and look for that file in the specified path (defaulting to /res/emoji) at runtime.	2023-03-01 14:54:16 +00:00
MacDue	01fa3bb788	LibUnicode: Propagate try_append() errors when building emoji data	2023-02-24 22:18:25 +01:00
Timothy Flynn	8c38d46c1a	LibUnicode: Generate the path to emoji images alongside emoji data This will provide for quicker emoji lookups, rather than having to discover and allocate these paths at runtime before we find out if they even exist.	2023-02-24 19:48:47 +01:00
Tim Schumacher	874c7bba28	LibCore: Remove `Stream.h`	2023-02-13 00:50:07 +00:00
Tim Schumacher	606a3982f3	LibCore: Move Stream-based file into the `Core` namespace	2023-02-13 00:50:07 +00:00
Tim Schumacher	d43a7eae54	LibCore: Rename `File` to `DeprecatedFile` As usual, this removes many unused includes and moves used includes further down the chain.	2023-02-13 00:50:07 +00:00
MacDue	63b11030f0	Everywhere: Use ReadonlySpan<T> instead of Span<T const>	2023-02-08 19:15:45 +00:00
Tim Schumacher	8464da1439	AK: Move `Stream` and `SeekableStream` from `LibCore` `Stream` will be qualified as `AK::Stream` until we remove the `Core::Stream` namespace. `IODevice` now reuses the `SeekMode` that is defined by `SeekableStream`, since defining its own would require us to qualify it with `AK::SeekMode` everywhere.	2023-01-29 19:16:44 -07:00
Linus Groh	6e7459322d	AK: Remove StringBuilder::build() in favor of to_deprecated_string() Having an alias function that only wraps another one is silly, and keeping the more obvious name should flush out more uses of deprecated strings. No behavior change.	2023-01-27 20:38:49 +00:00
Timothy Flynn	8f2589b3b0	LibUnicode: Parse and generate case folding code point data Case folding rules have a similar mapping style as special casing rules, where one code point may map to zero or more case folding rules. These will be used for case-insensitive string comparisons. To see how case folding can differ from other casing rules, consider "ß" (U+00DF): >>> "ß".lower() 'ß' >>> "ß".upper() 'SS' >>> "ß".title() 'Ss' >>> "ß".casefold() 'ss'	2023-01-18 14:43:40 +00:00
Timothy Flynn	9226cf7272	LibUnicode: Rename a special casing variable name in the UCD generator This name will soon be a bit ambiguous with a similar case folding variable name.	2023-01-18 14:43:40 +00:00
Timothy Flynn	8d9fb898d7	LibUnicode: Update out-of-date spec links And remove links that aren't adding much value but will often get out of date (i.e. links to UCD files, which are already all listed in unicode_data.cmake).	2023-01-18 14:43:40 +00:00
Timothy Flynn	b562348d31	LibUnicode: Generate simple case folding mappings for titlecase Note we already generate the special case foldings for titlecase.	2023-01-16 18:33:44 -05:00
Timothy Flynn	12f6793223	LibUnicode: Move Unicode-aware case transformations to a helper file These will be needed by AK::String as well, so move them to a helper file where they can be re-used.	2023-01-09 19:23:46 -07:00
Ben Wiederhake	6fd478b6ce	Everywhere: Remove unused includes of AK/Format.h These instances were detected by searching for files that include AK/Format.h, but don't match the regex: \\b(CheckedFormatString\|critical_dmesgln\|dbgln\|dbgln_if\|dmesgln\|FormatBu ilder\|__FormatIfSupported\|FormatIfSupported\|FormatParser\|FormatString\|Fo rmattable\|Formatter\|__format_value\|HasFormatter\|max_format_arguments\|out \|outln\|set_debug_enabled\|StandardFormatter\|TypeErasedFormatParams\|TypeEr asedParameter\|VariadicFormatParams\|v_critical_dmesgln\|vdbgln\|vdmesgln\|vf ormat\|vout\|warn\|warnln\|warnln_if)\\b (Without the linebreaks.) This regex is pessimistic, so there might be more files that don't actually use any formatting functions. Observe that this revealed that Userland/Libraries/LibC/signal.cpp is missing an include. In theory, one might use LibCPP to detect things like this automatically, but let's do this one step after another.	2023-01-02 20:27:20 -05:00
Tim Schumacher	ed4c2f2f8e	LibCore: Rename `Stream::read_all` to `read_until_eof` This generally seems like a better name, especially if we somehow also need a better name for "read the entire buffer, but not the entire file" somewhere down the line.	2022-12-12 14:16:42 +01:00
Thomas Queiroz	6debd967ba	Lagom/CodeGenerators: Use HashMap::try_ensure_capacity	2022-12-10 14:29:46 +01:00
Tim Schumacher	2fc2025f49	LibCore: Move `Core::Stream::File::exists()` to `Core::File` `Core::Stream::File` shouldn't hold any utility methods that are unrelated to constructing a `Core::Stream`, so let's just replace the existing `Core::File::exists` with the nicer looking implementation.	2022-12-08 12:52:14 +00:00
Linus Groh	57dc179b1f	Everywhere: Rename to_{string => deprecated_string}() where applicable This will make it easier to support both string types at the same time while we convert code, and tracking down remaining uses. One big exception is Value::to_string() in LibJS, where the name is dictated by the ToString AO.	2022-12-06 08:54:33 +01:00
Linus Groh	6e19ab2bbc	AK+Everywhere: Rename String to DeprecatedString We have a new, improved string type coming up in AK (OOM aware, no null state), and while it's going to use UTF-8, the name UTF8String is a mouthful - so let's free up the String name by renaming the existing class. Making the old one have an annoying name will hopefully also help with quick adoption :^)	2022-12-06 08:54:33 +01:00
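
Commits `0652cc48c0` and `3fae92eea2` above describe replacing a runtime binary search with a 2-stage lookup table that is built by walking sorted code point ranges sequentially. The following is a minimal sketch of that general technique, not the actual LibUnicode generator code: the names `TwoStageTable`, `Range`, and `build_table`, the block size of 256, and the use of standard-library containers instead of AK types are all assumptions made for illustration. It handles a single boolean property and expects the input ranges to be sorted and non-overlapping.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical names throughout; this is not the LibUnicode generator code.
constexpr uint32_t BLOCK_SIZE = 256;            // assumed block size
constexpr uint32_t CODE_POINT_LIMIT = 0x110000; // one past the last Unicode code point

using Block = std::array<bool, BLOCK_SIZE>;

struct TwoStageTable {
    std::vector<uint16_t> stage1; // block number -> index into stage2
    std::vector<Block> stage2;    // de-duplicated blocks of property bits

    // Two array lookups, independent of how many ranges the property has.
    bool lookup(uint32_t code_point) const
    {
        if (code_point >= CODE_POINT_LIMIT)
            return false;
        auto block_index = stage1[code_point / BLOCK_SIZE];
        return stage2[block_index][code_point % BLOCK_SIZE];
    }
};

struct Range {
    uint32_t first;
    uint32_t last; // inclusive
};

// Assumes sorted_ranges is sorted by `first` and contains no overlaps.
TwoStageTable build_table(std::vector<Range> const& sorted_ranges)
{
    TwoStageTable table;
    std::map<Block, uint16_t> seen_blocks;
    size_t range_index = 0;

    for (uint32_t start = 0; start < CODE_POINT_LIMIT; start += BLOCK_SIZE) {
        Block block {};
        for (uint32_t i = 0; i < BLOCK_SIZE; ++i) {
            uint32_t code_point = start + i;
            // Code points are visited in order, so ranges that end below the
            // current code point never need to be looked at again.
            while (range_index < sorted_ranges.size() && sorted_ranges[range_index].last < code_point)
                ++range_index;
            block[i] = range_index < sorted_ranges.size() && sorted_ranges[range_index].first <= code_point;
        }

        // Identical blocks (most commonly "no code point has the property")
        // are stored once and shared through stage1.
        auto [it, inserted] = seen_blocks.try_emplace(block, static_cast<uint16_t>(table.stage2.size()));
        if (inserted)
            table.stage2.push_back(block);
        table.stage1.push_back(it->second);
    }
    return table;
}
```

In this sketch, the size cost comes from `stage1` plus one copy of each distinct block, which is why sparse properties compress well: most blocks are identical and collapse into a single shared entry.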


286 commits