1
Fork 0
mirror of https://github.com/RGBCube/serenity synced 2025-05-31 10:58:12 +00:00
Commit graph

926 commits

Author SHA1 Message Date
Andreas Kling
8ebec2938c Kernel: Convert process file descriptor table to a SpinlockProtected
Instead of manually locking in the various member functions of
Process::OpenFileDescriptions, simply wrap it in a SpinlockProtected.
2022-01-29 02:17:06 +01:00
Andreas Kling
b27b22a68c Kernel: Allocate entire SelectBlocker::FDVector at once
Use try_ensure_capacity() + unchecked_append() instead of repeatedly
doing try_append().
2022-01-28 23:41:18 +01:00
Andreas Kling
31c1094577 Kernel: Don't mess with thread state in Process::do_exec()
We were marking the execing thread as Runnable near the end of
Process::do_exec().

This was necessary for exec in processes that had never been scheduled
yet, which is a specific edge case that only applies to the very first
userspace process (normally SystemServer). At this point, such threads
are in the Invalid state.

In the common case (normal userspace-initiated exec), making the current
thread Runnable meant that we switched away from its current state:
Running. As the thread is indeed running, that's a bogus change!
This created a short time window in which the thread state was bogus,
and any attempt to block the thread would panic the kernel (due to a
bogus thread state in Thread::block() leading to VERIFY_NOT_REACHED().)

Fix this by not touching the thread state in Process::do_exec()
and instead make the first userspace thread Runnable directly after
calling Process::exec() on it in try_create_userspace_process().

It's unfortunate that exec() can be called both on the current thread,
and on a new thread that has never been scheduled. It would be good to
not have the latter edge case, but fixing that will require larger
architectural changes outside the scope of this fix.
2022-01-27 11:18:25 +01:00
Brian Gianforcaro
e954b4bdd4 Kernel: Return error from sys$execve() when called with zero arguments
There are many assumptions in the stack that argc is not zero, and
argv[0] points to a valid string. The recent pwnkit exploit on Linux
was able to exploit this assumption in the `pkexec` utility
(a SUID-root binary) to escalate from any user to root.

By convention `execve(..)` should always be called with at least one
valid argument, so lets enforce that semantic to harden the system
against vulnerabilities like pwnkit.

Reference: https://www.qualys.com/2022/01/25/cve-2021-4034/pwnkit.txt
2022-01-26 13:05:59 +01:00
Idan Horowitz
d1433c35b0 Kernel: Handle OOM failures in find_shebang_interpreter_for_executable 2022-01-26 02:37:03 +02:00
Idan Horowitz
8cf0e4a5e4 Kernel: Eliminate allocations from generate_auxiliary_vector 2022-01-26 02:37:03 +02:00
Idan Horowitz
a6f0ab358a Kernel: Make AddressSpace::find_regions_intersecting OOM-fallible 2022-01-26 02:37:03 +02:00
Idan Horowitz
e23d320bb9 Kernel: Fail gracefully due to OOM on HashTable set in sys$setgroups 2022-01-26 02:37:03 +02:00
Idan Horowitz
8dfd124718 Kernel: Replace String with NonnullOwnPtr<KString> in sys$getkeymap 2022-01-25 08:06:02 +01:00
Liav A
69f054616d Kernel: Add CommandLine option to disable or enable the PC speaker
By default, we disable the PC speaker as it's quite annoying when using
the text mode console.
2022-01-23 00:40:54 +00:00
Jelle Raaijmakers
df73e8b46b Kernel: Allow program headers to align on multiples of PAGE_SIZE
These checks in `sys$execve` could trip up the system whenever you try
to execute an `.so` file. For example, double-clicking `libwasm.so` in
Terminal crashes the kernel.

This changes the program header alignment checks to reflect the same
checks in LibELF, and passes the requested alignment on to
`::try_allocate_range()`.
2022-01-23 00:11:56 +02:00
Idan Horowitz
0adee378fd Kernel: Stop using LibKeyboard's CharacterMap in HIDManagement
This was easily done, as the Kernel and Userland don't actually share
any of the APIs exposed by it, so instead the Kernel APIs were moved to
the Kernel, and the Userland APIs stayed in LibKeyboard.

This has multiple advantages:
 * The non OOM-fallible String is not longer used for storing the
   character map name in the Kernel
 * The kernel no longer has to link to the userland LibKeyboard code
 * A lot of #ifdef KERNEL cruft can be removed from LibKeyboard
2022-01-21 18:25:44 +01:00
Andreas Kling
0e08763483 Kernel: Wrap much of sys$execve() in a block scope
Since we don't return normally from this function, let's make it a
little extra difficult to accidentally leak something by leaving it on
the stack in this function.
2022-01-13 23:57:33 +01:00
Andreas Kling
0e72b04e7d Kernel: Perform exec-into-new-image directly in sys$execve()
This ensures that everything allocated on the stack in Process::exec()
gets cleaned up. We had a few leaks related to the parsing of shebang
(#!) executables that get fixed by this.
2022-01-13 23:57:33 +01:00
Idan Horowitz
cfb9f889ac LibELF: Accept Span instead of Pointer+Size in validate_program_headers 2022-01-13 22:40:25 +01:00
Idan Horowitz
3e959618c3 LibELF: Use StringBuilders instead of Strings for the interpreter path
This is required for the Kernel's usage of LibELF, since Strings do not
expose allocation failure.
2022-01-13 22:40:25 +01:00
Andreas Kling
8ad46fd8f5 Kernel: Stop leaking executable path in successful sys$execve()
Since we don't return from sys$execve() when it's successful, we have to
take special care to tear down anything we've allocated.

Turns out we were not doing this for the full executable path itself.
2022-01-13 16:15:37 +01:00
Idan Horowitz
40159186c1 Kernel: Remove String use-after-free in generate_auxiliary_vector
Instead we generate the random bytes directly in
make_userspace_context_for_main_thread if requested.
2022-01-13 00:20:08 -08:00
Idan Horowitz
e72bbca9eb Kernel: Fix OOB write in sys$uname
Since this was only out of bounds of the specific field, not of the
whole struct, and because setting the hostname requires root privileges
this was not actually a security vulnerability.
2022-01-13 00:20:08 -08:00
Idan Horowitz
50d6a6a186 Kernel: Convert hostname to KString 2022-01-13 00:20:08 -08:00
Idan Horowitz
bc85b64a38 Kernel: Replace usages of String::formatted with KString in sys$exec 2022-01-12 16:09:09 +02:00
Idan Horowitz
dba0840942 Kernel: Remove outdated FIXME comment in sys$sethostname 2022-01-12 16:09:09 +02:00
Daniel Bertalan
182016d7c0 Kernel+LibC+LibCore+UE: Implement fchmodat(2)
This function is an extended version of `chmod(2)` that lets one control
whether to dereference symlinks, and specify a file descriptor to a
directory that will be used as the base for relative paths.
2022-01-12 14:54:12 +01:00
Andreas Kling
a62bdb0761 Kernel: Delay Process data unprotection in sys$pledge()
Don't unprotect the protected data area until we've validated the pledge
syscall inputs.
2022-01-02 18:08:02 +01:00
circl
63760603f3 Kernel+LibC+LibCore: Add lchown and fchownat functions
This modifies sys$chown to allow specifying whether or not to follow
symlinks and in which directory.

This was then used to implement lchown and fchownat in LibC and LibCore.
2022-01-01 15:08:49 +01:00
Daniel Bertalan
1d2f78682b Kernel+AK: Eliminate a couple of temporary String allocations 2021-12-30 14:16:03 +01:00
Brian Gianforcaro
54b9a4ec1e Kernel: Handle promise violations in the syscall handler
Previously we would crash the process immediately when a promise
violation was found during a syscall. This is error prone, as we
don't unwind the stack. This means that in certain cases we can
leak resources, like an OwnPtr / RefPtr tracked on the stack. Or
even leak a lock acquired in a ScopeLockLocker.

To remedy this situation we move the promise violation handling to
the syscall handler, right before we return to user space. This
allows the code to follow the normal unwind path, and grantees
there is no longer any cleanup that needs to occur.

The Process::require_promise() and Process::require_no_promises()
functions were modified to return ErrorOr<void> so we enforce that
the errors are always propagated by the caller.
2021-12-29 18:08:15 +01:00
Brian Gianforcaro
0f7fe1eb08 Kernel: Use Process::require_no_promises instead of REQUIRE_NO_PROMISES
This change lays the foundation for making the require_promise return
an error hand handling the process abort outside of the syscall
implementations, to avoid cases where we would leak resources.

It also has the advantage that it makes removes a gs pointer read
to look up the current thread, then process for every syscall. We
can instead go through the Process this pointer in most cases.
2021-12-29 18:08:15 +01:00
Brian Gianforcaro
bad6d50b86 Kernel: Use Process::require_promise() instead of REQUIRE_PROMISE()
This change lays the foundation for making the require_promise return
an error hand handling the process abort outside of the syscall
implementations, to avoid cases where we would leak resources.

It also has the advantage that it makes removes a gs pointer read
to look up the current thread, then process for every syscall. We
can instead go through the Process this pointer in most cases.
2021-12-29 18:08:15 +01:00
Brian Gianforcaro
b5367bbf31 Kernel: Clarify why ftruncate() & pread() are passed off_t const*
I fell into this trap and tried to switch the syscalls to pass by
the `off_t` by register. I think it makes sense to add a clarifying
comment for future readers of the code, so they don't fall into the
same trap. :^)
2021-12-29 05:54:04 -08:00
Brian Gianforcaro
737a11389c Kernel: Fix info leak from sockaddr_un in socket syscalls
In `sys$accept4()` and `get_sock_or_peer_name()` we were not
initializing the padding of the `sockaddr_un` struct, leading to
an kernel information leak if the
caller looked back at it's contents.

Before Fix:

    37.766 Clipboard(11:11): accept4 Bytes:
    2f746d702f706f7274616c2f636c6970626f61726440eac130e7fbc1e8abbfc
    19c10ffc18440eac15485bcc130e7fbc1549feaca6c9deaca549feaca1bb0bc
    03efdf62c0e056eac1b402d7acd010ffc14602000001b0bc030100000050bf0
    5c24602000001e7fbc1b402d7ac6bdc

After Fix:

    0.603 Clipboard(11:11): accept4 Bytes:
    2f746d702f706f7274616c2f636c6970626f617264000000000000000000000
    000000000000000000000000000000000000000000000000000000000000000
    000000000000000000000000000000000000000000000000000000000000000
    0000000000000000000000000000000
2021-12-29 03:41:32 -08:00
Daniel Bertalan
fcdd202741 Kernel: Return the actual number of CPU cores that we have
... instead of returning the maximum number of Processor objects that we
can allocate.

Some ports (e.g. gdb) rely on this information to determine the number
of worker threads to spawn. When gdb spawned 64 threads, the kernel
could not cope with generating backtraces for it, which prevented us
from debugging it properly.

This commit also removes the confusingly named
`Processor::processor_count` function so that this mistake can't happen
again.
2021-12-29 03:17:41 -08:00
Idan Horowitz
d7ec5d042f Kernel: Port Process to ListedRefCounted 2021-12-29 12:04:15 +01:00
Guilherme Goncalves
33b78915d3 Kernel: Propagate overflow errors from Memory::page_round_up
Fixes #11402.
2021-12-28 23:08:50 +01:00
Daniel Bertalan
52beeebe70 Kernel: Remove the KString::try_create(String::formatted(...)) pattern
We can now directly create formatted KStrings with KString::formatted.

:^)
2021-12-28 01:55:22 -08:00
Guilherme Gonçalves
da6aef9fff Kernel: Make msync return EINVAL when regions are too large
As a small cleanup, this also makes `page_round_up` verify its
precondition with `page_round_up_would_wrap` (which callers are expected
to call), rather than having its own logic.

Fixes #11297.
2021-12-23 17:43:12 -08:00
Daniel Bertalan
8e3d1a42e3 Kernel+UE+LibC: Store address as void* in SC_m{re,}map_params
Most other syscalls pass address arguments as `void*` instead of
`uintptr_t`, so let's do that here too. Besides improving consistency,
this commit makes `strace` correctly pretty-print these arguments in
hex.
2021-12-23 23:08:10 +01:00
Daniel Bertalan
77f9272aaf Kernel+UE: Add MAP_FIXED_NOREPLACE mmap() flag
This feature was introduced in version 4.17 of the Linux kernel, and
while it's not specified by POSIX, I think it will be a nice addition to
our system.

MAP_FIXED_NOREPLACE provides a less error-prone alternative to
MAP_FIXED: while regular fixed mappings would cause any intersecting
ranges to be unmapped, MAP_FIXED_NOREPLACE returns EEXIST instead. This
ensures that we don't corrupt our process's address space if something
is already at the requested address.

Note that the more portable way to do this is to use regular
MAP_ANONYMOUS, and check afterwards whether the returned address matches
what we wanted. This, however, has a large performance impact on
programs like Wine which try to reserve large portions of the address
space at once, as the non-matching addresses have to be unmapped
separately.
2021-12-23 23:08:10 +01:00
Andreas Kling
1d08b671ea Kernel: Enter new address space before destroying old in sys$execve()
Previously we were assigning to Process::m_space before actually
entering the new address space (assigning it to CR3.)

If a thread was preempted by the scheduler while destroying the old
address space, we'd then attempt to resume the thread with CR3 pointing
at a partially destroyed address space.

We could then crash immediately in write_cr3(), right after assigning
the new value to CR3. I am hopeful that this may have been the bug
haunting our CI for months. :^)
2021-12-23 01:18:26 +01:00
Daniel Bertalan
ce1bf3724e Kernel: Replace intersecting ranges in mmap when MAP_FIXED is specified
This behavior is mandated by POSIX and is used by software like Wine
after reserving large chunks of the address range.
2021-12-22 00:02:36 -08:00
Martin Bříza
86b249f02f Kernel: Implement sysconf(_SC_SYMLOOP_MAX)
Not much to say here, this is an implementation of this call that
accesses the actual limit constant that's used by the VirtualFileSystem
class.

As a side note, this is required for my eventual Qt port.
2021-12-21 12:54:11 -08:00
Liav A
5a649d0fd5 Kernel: Return EINVAL when specifying -1 for setuid and similar syscalls
For setreuid and setresuid syscalls, -1 means to set the current
uid/euid/gid/egid value, to be more convenient for programming.
However, for other syscalls where we pass only one argument, there's no
justification to specify -1.

This behavior is identical to how Linux handles the value -1, and is
influenced by the fact that the manual pages for the group of one
argument syscalls that handle ID operations is ambiguous about this
topic.
2021-12-20 11:32:16 +01:00
Hendiadyoin1
18013f3c06 Kernel: Remove a redundant check in Process::remap_range_as_stack
We already VERIFY that we have carved something out, so we don't need to
check that again.
2021-12-18 10:31:18 -08:00
Hendiadyoin1
2d28b441bf Kernel: Collapse a redundant boolean conditional return statement in …
validate_mmap_prot
2021-12-18 10:31:18 -08:00
Hendiadyoin1
f38d32535c Kernel: Access OpenFileDescriptions::max_open() statically in Syscalls 2021-12-18 10:31:18 -08:00
Hendiadyoin1
c860e0ab95 Kernel: Add implicit auto qualifiers in Syscalls 2021-12-18 10:31:18 -08:00
Hendiadyoin1
f5b495d92c Kernel: Remove else after return in Process::do_write 2021-12-18 10:31:18 -08:00
Andreas Kling
32aa623eff Kernel: Fix 4-byte uninitialized memory leak in sys$sigaltstack()
It was possible to extract 4 bytes of uninitialized kernel stack memory
on x86_64 by looking in the padding of stack_t.
2021-12-18 11:30:10 +01:00
Andreas Kling
0ae8702692 Kernel: Make File::stat() & friends return Error<struct stat>
Instead of making the caller provide a stat buffer, let's just return
one as a value.
2021-12-18 11:30:10 +01:00
Andreas Kling
abf2204402 Kernel: Use copy_typed_from_user() in more places :^) 2021-12-18 11:30:10 +01:00