The obsolete ttyname and ptsname syscalls are removed.
LibC doesn't rely on these anymore, and it helps simplifying the Kernel
in many places, so it's an overall an improvement.
In addition to that, /proc/PID/tty node is removed too as it is not
needed anymore by userspace to get the attached TTY of a process, as
/dev/tty (which is already a character device) represents that as well.
Apologies for the enormous commit, but I don't see a way to split this
up nicely. In the vast majority of cases it's a simple change. A few
extra places can use TRY instead of manual error checking though. :^)
Previously, Emulator::virt$execve would not report ENOENT and EACCES
when the binary to be executed was nonexistent or not executable. This
broke the execp family of functions, which rely on ENOENT being reported
in order to know that they should continue searching $PATH.
Emulator::virt$execve would construct command lines such as
`/bin/UserspaceEmulator echo -- hello` instead of
`/bin/UserspaceEmulator -- echo hello`, which naturally caused problems.
This commit moves the "--" to the correct place.
This function is an extended version of `chmod(2)` that lets one control
whether to dereference symlinks, and specify a file descriptor to a
directory that will be used as the base for relative paths.
This commit fixes an issue where zero-alignment would lead to the
requested range being allocated outside of the total available range,
hitting an assert in RangeAllocator::allocate_anywhere().
If we do not mark these ranges as reserved, RangeAllocator might later
give us addresses that overlap these, which then causes an assertion
failure in the SoftMMU. This behavior led to recurring CI failures, and
sometimes made programs as simple as `/bin/true` fail.
Fixes "Crash 1" reported in #9104
Most other syscalls pass address arguments as `void*` instead of
`uintptr_t`, so let's do that here too. Besides improving consistency,
this commit makes `strace` correctly pretty-print these arguments in
hex.
This feature was introduced in version 4.17 of the Linux kernel, and
while it's not specified by POSIX, I think it will be a nice addition to
our system.
MAP_FIXED_NOREPLACE provides a less error-prone alternative to
MAP_FIXED: while regular fixed mappings would cause any intersecting
ranges to be unmapped, MAP_FIXED_NOREPLACE returns EEXIST instead. This
ensures that we don't corrupt our process's address space if something
is already at the requested address.
Note that the more portable way to do this is to use regular
MAP_ANONYMOUS, and check afterwards whether the returned address matches
what we wanted. This, however, has a large performance impact on
programs like Wine which try to reserve large portions of the address
space at once, as the non-matching addresses have to be unmapped
separately.
We create a base class called GenericFramebufferDevice, which defines
all the virtual functions that must be implemented by a
FramebufferDevice. Then, we make the VirtIO FramebufferDevice and other
FramebufferDevice implementations inherit from it.
The most important consequence of rearranging the classes is that we now
have one IOCTL method, so all drivers should be committed to not
override the IOCTL method or make their own IOCTLs of FramebufferDevice.
All graphical IOCTLs are known to all FramebufferDevices, and it's up to
the specific implementation whether to support them or discard them (so
we require extensive usage of KResult and KResultOr, together with
virtual characteristic functions).
As a result, the interface is much cleaner and understandable to read.
We only froward String setting and FlagPost creation for now, due to the
other performance events being nonsensical to forward.
We also record these signposts in the optionally generated profile.
It was fragile to use the address of the body of the memory management
functions to disable memory auditing within them. Functions called from
these did not get exempted from the audits, so in some cases
UserspaceEmulator reported bogus heap buffer overflows.
Memory auditing did not work at all on Clang because when querying the
addresses, their offset was taken relative to the base of `.text` which
is not the first segment in the `R/RX/RW(RELRO)/RW(non-RELRO)` layout
produced by LLD.
Similarly to when setting metadata about the allocations, we now use the
`emuctl` system call to selectively suppress auditing when we reach
these functions. This ensures that functions called from `malloc` are
affected too, and no issues occur because of the inconsistency between
Clang and GCC memory layouts.
`ue --profile --profile-file ~/some-file.profile id` can now generate a
full profile (instruction-by-instruction, if needed), at the cost of not
being able to see past the syscall boundary (a.la. callgrind).
This makes it significantly easier to profile seemingly fast userspace
things, like Loader.so :^)
Previously each malloc size class would keep around a limited number of
unused blocks which were marked with MADV_SET_VOLATILE which could then
be reinitialized when additional blocks were needed.
This changes malloc() so that it also keeps around a number of blocks
without marking them with MADV_SET_VOLATILE. I termed these "hot"
blocks whereas blocks which were marked as MADV_SET_VOLATILE are called
"cold" blocks because they're more expensive to reinitialize.
In the worst case this could increase memory usage per process by
1MB when a program requests a bunch of memory and frees all of it.
Also, in order to make more efficient use of these unused blocks
they're now shared between size classes.
Unlike accept() the new accept4() system call lets the caller specify
flags for the newly accepted socket file descriptor, such as
SOCK_CLOEXEC and SOCK_NONBLOCK.
Previously struct sockaddr was used which isn't guaranteed to be
large enough to hold the socket address get{sock,peer}name() returns.
Also, the addrlen argument was initialized incorrectly and should
instead use the address length specified by the caller.
With the new InodeWatcher API, the old style of creating a watcher per
inode will no longer work. Therefore the FileWatcher API has been
updated to support multiple watches, and its users have also been
refactored to the new style. At the moment, all operations done on a
(Blocking)FileWatcher return Result objects, however, this may be
changed in the future if it becomes too obnoxious. :^)
Co-authored-by: Gunnar Beutner <gunnar@beutner.name>