Arguments larger than 32bit need to be passed as a pointer on a 32bit
architectures. sys$profiling_enable has u64 event_mask argument,
which means that it needs to be passed as an pointer. Previously upper
32bits were filled by garbage.
gethostbyname() and gethostbyaddr() now set h_errno (per spec) and try
to recover and return (with an error) instead of choking in VERIFY()
whenever an I/O or protocol error occurs in the communication with
LookupServer.
The DT_RELR relocation is a relatively new relocation encoding designed
to achieve space-efficient relative relocations in PIE programs.
The description of the format is available here:
https://groups.google.com/g/generic-abi/c/bX460iggiKg/m/Pi9aSwwABgAJ
It works by using a bitmap to store the offsets which need to be
relocated. Even entries are *address* entries: they contain an address
(relative to the base of the executable) which needs to be relocated.
Subsequent even entries are *bitmap* entries: "1" bits encode offsets
(in word size increments) relative to the last address entry which need
to be relocated.
This is in contrast to the REL/RELA format, where each entry takes up
2/3 machine words. Certain kinds of relocations store useful data in
that space (like the name of the referenced symbol), so not everything
can be encoded in this format. But as position-independent executables
and shared libraries tend to have a lot of relative relocations, a
specialized encoding for them absolutely makes sense.
The authors of the format suggest an overall 5-20% reduction in the file
size of various programs. Due to our extensive use of dynamic linking
and us not stripping debug info, relative relocations don't make up such
a large portion of the binary's size, so the measurements will tend to
skew to the lower side of the spectrum.
The following measurements were made with the x86-64 Clang toolchain:
- The kernel contains 290989 relocations. Enabling RELR decreased its
size from 30 MiB to 23 MiB.
- LibUnicodeData contains 190262 relocations, almost all of them
relative. Its file size changed from 17 MiB to 13 MiB.
- /bin/WebContent contains 1300 relocations, 66% of which are relative
relocations. With RELR, its size changed from 832 KiB to 812 KiB.
This change was inspired by the following blog post:
https://maskray.me/blog/2021-10-31-relative-relocations-and-relr
The global variable use in these functions is super thread-unsafe and
means that any concurrent calls to sprintf or fprintf in a process
could race with each other and end up writing unexpected results.
We can just replace the function + global variable with a lambda that
captures the relevant argument when calling printf_internal instead.
I tried the OpenSSH port but it failed to compile due to a missing
definition of this macro. It's simple enough to add, and it's addition
allowed OpenSSH to compile once again.
I also went ahead and added spec comments for these macros as well.
As ECMA262 regex allows `[^]` and literal newlines to match newlines in
the input string, we shouldn't split the input string into lines, rather
simply make boundaries and catchall patterns capable of checking for
these conditions specifically.
This renames the current implementation of current_time_zone to
system_time_zone to more clearly indicate what it is. Then reimplements
current_time_zone to return whatever was set up by tzset, falling back
to UTC if something went awry, for convenience.
From POSIX:
the ctime(), localtime(), mktime(), strftime(), and strftime_l()
functions are required to set timezone information as if by calling
tzset()
ctime is excluded here because it invokes localtime, so there's no need
to invoke tzset twice.
POSIX defines this as the "Maximum number of bytes supported for the
name of a timezone (not of the TZ variable)." It must have a minimum
value of _POSIX_TZNAME_MAX (6). The longest time zone name in the TZDB
is about 40 chars, so 64 is chosen here for a little wiggle room, and
to round up to a power of 2.
Before this commit all consume_until overloads aside from the Predicate
one would consume (and ignore) the stop char/string, while the
Predicate overload would not, in order to keep behaviour consistent,
the other overloads no longer consume the stop char/string as well.
It's a bad idea to have a global event loop in a client application as
that will cause an initialization-order fiasco in ASAN. Therefore, LibC
now has a flag "s_global_initializers_ran" which is false until _entry
in crt0 runs, which in turn only gets called after all the global
initializers were actually executed. The EventLoop constructor checks
the flag and crashes the program if it is being called as a global
constructor. A note next to the VERIFY_NOT_REACHED() informs the
developer of these things and how we usually instantiate event loops.
The upshot of this is that global event loops will cause a crash before
any undefined behavior is hit.
LibTimeZone will be needed directly within LibC for functions such as
localtime(). This change adds LibTimeZone directly within LibC, so that
LibTimeZone isn't its own .so library anymore.
LibTimeZone itself is compiled as an object library to make it easier to
give it generator-specific compilation flags.
The POSIX standard specifies the following:
> If the main() function returns to its original caller, or if the
> exit() function is called, all open files are closed (hence all output
> streams are flushed) before program termination.
This means that flushing `stdin` and `stdout` only is not enough, as the
program might have pending writes in other file buffers too.
Now that we support `fflush(nullptr)`, we call that in `exit()` to flush
all streams. This fixes one of bash's generated headers not being
written to disk.
We don't mutate the pointed-to memory, so let's be const correct.
Fixes building the `mimalloc` library that's optionally used by the mold
linker (note that it isn't enabled yet as I haven't tested it).
This function is an extended version of `chmod(2)` that lets one control
whether to dereference symlinks, and specify a file descriptor to a
directory that will be used as the base for relative paths.
This helper that originally appeared in 4.4BSD helps to daemonize
a process by forking, setting itself as session leader, chdir to "/" and
closing stdin/stdout.
NoAllocationGuard is an RAII stack guard that prevents allocations
while it exists. This is done through a thread-local global flag which
causes malloc to crash on a VERIFY if it is false. The guard allows for
recursion.
The intended use case for this class is in real-time audio code. In such
code, allocations are really bad, and this is an easy way of dynamically
enforcing the no-allocations rule while giving the user good feedback if
it is violated. Before real-time audio code is executed, e.g. in LibDSP,
a NoAllocationGuard is instantiated. This is not done with this commit,
as currently some code in LibDSP may still incorrectly allocate in real-
time situations.
Other use cases for the Kernel have also been added, so this commit
builds on the previous to add the support both in Userland and in the
Kernel.
Add them in `<Kernel/API/Device.h>` and use these to provides
`{makedev,major,minor}` in `<sys/sysmacros.h>`. It aims to be more in
line with other Unix implementations and avoid code duplication in user
land.
These checks were added because macOS doesn't have `shadow.h`, so we
would end up including our own LibC's `shadow.h` when we built Lagom.
All inclusions of this header in our code base are now guarded by
`#ifndef AK_OS_BSD_GENERIC`, so these checks are now pointless.
In C++, a function declaration with an empty parameter list means that
the function takes no arguments. In C, however, it means that the
function takes an unspecified number of parameters.
What we did previously was therefore non-conforming. This caused a
config check to fail in the curl port, as it was able to redeclare
`rand` as taking an int parameter.
There's only two places where we're using the C99 feature of array
designated initalizers. This feature seemingly wasn't included with
C++20 designated initalizers for classes and structs. The only two
places we were using this feature are suitably old and isolated that
it makes sense to just suppress the warning at the usage sites while
discouraging future array designated intializers in new code.
Certain C Libraries have (unfortunately) included strings.h as a
part of string.h, which violates the POSIX spec for that specific
header. Some applications rely on this being the case, so let's
include it in our string.h
This was currently crashing Half-Life because it was a considered an
"Unknown" specifier. We can use the same case statement as the regular
hex format conversion (lower case 'x'), as the backend
to convert the number already supports upper/lower case input, hence
we get it for free :^)
This modifies sys$chown to allow specifying whether or not to follow
symlinks and in which directory.
This was then used to implement lchown and fchownat in LibC and LibCore.