1
Fork 0
mirror of https://github.com/RGBCube/serenity synced 2025-07-10 06:37:36 +00:00
Commit graph

953 commits

Author SHA1 Message Date
Ali Mohammad Pur
6608812e4b Kernel: Over-align the FPUState on the stack in sigreturn
The stack is misaligned at this point for some reason, this is a hack
that makes the resulting object "correctly" aligned, thus avoiding a
KUBSAN error.
2022-03-04 20:07:05 +01:00
Ali Mohammad Pur
88d7bf7362 Kernel: Save and restore FPU state on signal dispatch on i386/x86_64 2022-03-04 20:07:05 +01:00
Ali Mohammad Pur
4bd01b7fe9 Kernel: Add support for SA_SIGINFO
We currently don't really populate most of the fields, but that can
wait :^)
2022-03-04 20:07:05 +01:00
Ali Mohammad Pur
585054d68b Kernel: Comment the living daylights out of signal trampoline/sigreturn
Mere mortals like myself cannot understand more than two lines of
assembly without a million comments explaining what's happening, so do
that and make sure no one has to go on a wild stack state chase when
hacking on these.
2022-03-04 20:07:05 +01:00
Ali Mohammad Pur
848eaf2220 Kernel: Reject sigaction() with SA_SIGINFO
We can't handle this, so let sigaction() fail instead of PANIC()'ing
later when we try to dispatch a signal with SA_SIGINFO set.
2022-03-04 20:07:05 +01:00
Ali Mohammad Pur
cf63447044 Kernel: Move signal handlers from being thread state to process state
POSIX requires that sigaction() and friends set a _process-wide_ signal
handler, so move signal handlers and flags inside Process.
This also fixes a "pid/tid confusion" FIXME, as we can now send the
signal to the process and let that decide which thread should get the
signal (which is the thread with tid==pid, but that's now the Process's
problem).
Note that each thread still retains its signal mask, as that is local to
each thread.
2022-03-04 20:07:05 +01:00
Lucas CHOLLET
839d3d9f74 Kernel: Add getrusage() syscall
Only the two timeval fields are maintained, as required by the POSIX
standard.
2022-02-28 20:09:37 +01:00
Idan Horowitz
011bd06053 Kernel: Set CS selector when initializing thread context on x86_64
These are not technically required, since the Thread constructor
already sets these, but they are set on i686, so let's try and keep
consistent behaviour between the different archs.
2022-02-27 00:38:00 +02:00
Brian Gianforcaro
d05fa14e52 Kernel: Use TRY() when validating clock_id in TimeManagement
Gets rid of a bit of code duplication, and makes the API more consistent
with the style we are moving towards.
2022-02-21 15:47:51 -08:00
Brian Gianforcaro
70f3fa2dd2 Kernel: Set new process name in do_exec before waiting for the tracer
While investigating why gdb is failing when it calls `PT_CONTINUE`
against Serenity I noticed that the names of the programs in the
System Monitor didn't make sense. They were seemingly stale.

After inspecting the kernel code, it became apparent that the sequence
occurs as follows:

    1. Debugger calls `fork()`
    2. The forked child calls `PT_TRACE_ME`
    3. The `PT_TRACE_ME` instructs the forked process to block in the
       kernel waiting for a signal from the tracer on the next call
       to `execve(..)`.
    4. Debugger waits for forked child to spawn and stop, and then it
       calls `PT_ATTACH` followed by `PT_CONTINUE` on the child.
    5. Currently the `PT_CONTINUE` fails because of some other yet to
       be found bug.
    6. The process name is set immediately AFTER we are woken up by
       the `PT_CONTINUE` which never happens in the case I'm debugging.

This chain of events leaves the process suspended, with the name  of
the original (forked) process instead of the name we inherit from
the `execve(..)` call.

To avoid such confusion in the future, we set the new name before we
block waiting for the tracer.
2022-02-19 18:04:32 -08:00
Jakub Berkop
895a050e04 Kernel: Fixed argument passing for profiling_enable syscall
Arguments larger than 32bit need to be passed as a pointer on a 32bit
architectures. sys$profiling_enable has u64 event_mask argument,
which means that it needs to be passed as an pointer. Previously upper
32bits were filled by garbage.
2022-02-19 11:37:02 +01:00
Ali Mohammad Pur
a1cb2c371a AK+Kernel: OOM-harden most parts of Trie
The only part of Unveil that can't handle OOM gracefully is the
String::formatted() use in the node metadata.
2022-02-15 18:03:02 +02:00
Jakub Berkop
4916c892b2 Kernel/Profiling: Add profiling to read syscall
Syscalls to read can now be profiled, allowing us to monitor
filesystem usage by different applications.
2022-02-14 11:38:13 +01:00
Idan Horowitz
bd821982e0 Kernel: Use StringView::for_each_split_view() in sys$pledge
This let's us avoid the fallible Vector allocation that split_view()
entails.
2022-02-14 11:35:20 +01:00
Idan Horowitz
e384f62ee2 Kernel: Make master TLS region WeakPtr construction OOM-fallible 2022-02-14 11:35:20 +01:00
Idan Horowitz
c8ab7bde3b Kernel: Use try_make_weak_ptr() instead of make_weak_ptr() 2022-02-13 23:02:57 +01:00
Idan Horowitz
d6ea6c39a7 AK+Kernel: Rename try_make_weak_ptr to make_weak_ptr_if_nonnull
This matches the likes of the adopt_{own, ref}_if_nonnull family and
also frees up the name to allow us to eventually add OOM-fallible
versions of these functions.
2022-02-13 23:02:57 +01:00
Andrew Kaster
b4a7d148b1 Kernel: Expose maximum argument limit in sysconf
Move the definitions for maximum argument and environment size to
Process.h from execve.cpp. This allows sysconf(_SC_ARG_MAX) to return
the actual argument maximum of 128 KiB to userspace.
2022-02-13 22:06:54 +02:00
Idan Horowitz
57bce8ab97 Kernel: Set up Regions before adding them to a Process's AddressSpace
This reduces the amount of time in which not fully-initialized Regions
are present inside an AddressSpace's region tree.
2022-02-11 17:49:46 +02:00
Lenny Maiorani
c6acf64558 Kernel: Change static constexpr variables to constexpr where possible
Function-local `static constexpr` variables can be `constexpr`. This
can reduce memory consumption, binary size, and offer additional
compiler optimizations.

These changes result in a stripped x86_64 kernel binary size reduction
of 592 bytes.
2022-02-09 21:04:51 +00:00
Andreas Kling
cda56f8049 Kernel: Robustify and rename Inode bound socket API
Rename the bound socket accessor from socket() to bound_socket().
Also return RefPtr<LocalSocket> instead of a raw pointer, to make it
harder for callers to mess up.
2022-02-07 13:02:34 +01:00
sin-ack
24fd8fb16f Kernel: Ensure socket is suitable for writing in sys$sendmsg
Previously we would return a bytes written value of 0 if the writing end
of the socket was full. Now we either exit with EAGAIN if the socket
description is non-blocking, or block until the description can be
written to.

This is mostly a copy of the conditions in sys$write but with the "total
nwritten" parts removed as sys$sendmsg does not have that.
2022-02-07 12:21:45 +01:00
Andreas Kling
04539d4930 Kernel: Propagate sys$profiling_enable() buffer allocation failure
Caught a kernel panic when enabling profiling of all threads when there
was very little memory available.
2022-02-06 01:25:32 +01:00
Andreas Kling
3845c90e08 Kernel: Remove unnecessary includes from Thread.h
...and deal with the fallout by adding missing includes everywhere.
2022-01-30 16:21:59 +01:00
Idan Horowitz
e28af4a2fc Kernel: Stop using HashMap in Mutex
This commit removes the usage of HashMap in Mutex, thereby making Mutex
be allocation-free.

In order to achieve this several simplifications were made to Mutex,
removing unused code-paths and extra VERIFYs:
 * We no longer support 'upgrading' a shared lock holder to an
   exclusive holder when it is the only shared holder and it did not
   unlock the lock before relocking it as exclusive. NOTE: Unlike the
   rest of these changes, this scenario is not VERIFY-able in an
   allocation-free way, as a result the new LOCK_SHARED_UPGRADE_DEBUG
   debug flag was added, this flag lets Mutex allocate in order to
   detect such cases when debugging a deadlock.
 * We no longer support checking if a Mutex is locked by the current
   thread when the Mutex was not locked exclusively, the shared version
   of this check was not used anywhere.
 * We no longer support force unlocking/relocking a Mutex if the Mutex
   was not locked exclusively, the shared version of these functions
   was not used anywhere.
2022-01-29 16:45:39 +01:00
Andreas Kling
d748a3c173 Kernel: Only lock process file descriptor table once in sys$poll()
Grab the OpenFileDescriptions mutex once and hold on to it while
populating the SelectBlocker::FDVector.
2022-01-29 02:17:12 +01:00
Andreas Kling
b56646e293 Kernel: Switch process file descriptor table from spinlock to mutex
There's no reason for this to use a spinlock. Instead, let's allow
threads to block if someone else is using the descriptor table.
2022-01-29 02:17:09 +01:00
Andreas Kling
8ebec2938c Kernel: Convert process file descriptor table to a SpinlockProtected
Instead of manually locking in the various member functions of
Process::OpenFileDescriptions, simply wrap it in a SpinlockProtected.
2022-01-29 02:17:06 +01:00
Andreas Kling
b27b22a68c Kernel: Allocate entire SelectBlocker::FDVector at once
Use try_ensure_capacity() + unchecked_append() instead of repeatedly
doing try_append().
2022-01-28 23:41:18 +01:00
Andreas Kling
31c1094577 Kernel: Don't mess with thread state in Process::do_exec()
We were marking the execing thread as Runnable near the end of
Process::do_exec().

This was necessary for exec in processes that had never been scheduled
yet, which is a specific edge case that only applies to the very first
userspace process (normally SystemServer). At this point, such threads
are in the Invalid state.

In the common case (normal userspace-initiated exec), making the current
thread Runnable meant that we switched away from its current state:
Running. As the thread is indeed running, that's a bogus change!
This created a short time window in which the thread state was bogus,
and any attempt to block the thread would panic the kernel (due to a
bogus thread state in Thread::block() leading to VERIFY_NOT_REACHED().)

Fix this by not touching the thread state in Process::do_exec()
and instead make the first userspace thread Runnable directly after
calling Process::exec() on it in try_create_userspace_process().

It's unfortunate that exec() can be called both on the current thread,
and on a new thread that has never been scheduled. It would be good to
not have the latter edge case, but fixing that will require larger
architectural changes outside the scope of this fix.
2022-01-27 11:18:25 +01:00
Brian Gianforcaro
e954b4bdd4 Kernel: Return error from sys$execve() when called with zero arguments
There are many assumptions in the stack that argc is not zero, and
argv[0] points to a valid string. The recent pwnkit exploit on Linux
was able to exploit this assumption in the `pkexec` utility
(a SUID-root binary) to escalate from any user to root.

By convention `execve(..)` should always be called with at least one
valid argument, so lets enforce that semantic to harden the system
against vulnerabilities like pwnkit.

Reference: https://www.qualys.com/2022/01/25/cve-2021-4034/pwnkit.txt
2022-01-26 13:05:59 +01:00
Idan Horowitz
d1433c35b0 Kernel: Handle OOM failures in find_shebang_interpreter_for_executable 2022-01-26 02:37:03 +02:00
Idan Horowitz
8cf0e4a5e4 Kernel: Eliminate allocations from generate_auxiliary_vector 2022-01-26 02:37:03 +02:00
Idan Horowitz
a6f0ab358a Kernel: Make AddressSpace::find_regions_intersecting OOM-fallible 2022-01-26 02:37:03 +02:00
Idan Horowitz
e23d320bb9 Kernel: Fail gracefully due to OOM on HashTable set in sys$setgroups 2022-01-26 02:37:03 +02:00
Idan Horowitz
8dfd124718 Kernel: Replace String with NonnullOwnPtr<KString> in sys$getkeymap 2022-01-25 08:06:02 +01:00
Liav A
69f054616d Kernel: Add CommandLine option to disable or enable the PC speaker
By default, we disable the PC speaker as it's quite annoying when using
the text mode console.
2022-01-23 00:40:54 +00:00
Jelle Raaijmakers
df73e8b46b Kernel: Allow program headers to align on multiples of PAGE_SIZE
These checks in `sys$execve` could trip up the system whenever you try
to execute an `.so` file. For example, double-clicking `libwasm.so` in
Terminal crashes the kernel.

This changes the program header alignment checks to reflect the same
checks in LibELF, and passes the requested alignment on to
`::try_allocate_range()`.
2022-01-23 00:11:56 +02:00
Idan Horowitz
0adee378fd Kernel: Stop using LibKeyboard's CharacterMap in HIDManagement
This was easily done, as the Kernel and Userland don't actually share
any of the APIs exposed by it, so instead the Kernel APIs were moved to
the Kernel, and the Userland APIs stayed in LibKeyboard.

This has multiple advantages:
 * The non OOM-fallible String is not longer used for storing the
   character map name in the Kernel
 * The kernel no longer has to link to the userland LibKeyboard code
 * A lot of #ifdef KERNEL cruft can be removed from LibKeyboard
2022-01-21 18:25:44 +01:00
Andreas Kling
0e08763483 Kernel: Wrap much of sys$execve() in a block scope
Since we don't return normally from this function, let's make it a
little extra difficult to accidentally leak something by leaving it on
the stack in this function.
2022-01-13 23:57:33 +01:00
Andreas Kling
0e72b04e7d Kernel: Perform exec-into-new-image directly in sys$execve()
This ensures that everything allocated on the stack in Process::exec()
gets cleaned up. We had a few leaks related to the parsing of shebang
(#!) executables that get fixed by this.
2022-01-13 23:57:33 +01:00
Idan Horowitz
cfb9f889ac LibELF: Accept Span instead of Pointer+Size in validate_program_headers 2022-01-13 22:40:25 +01:00
Idan Horowitz
3e959618c3 LibELF: Use StringBuilders instead of Strings for the interpreter path
This is required for the Kernel's usage of LibELF, since Strings do not
expose allocation failure.
2022-01-13 22:40:25 +01:00
Andreas Kling
8ad46fd8f5 Kernel: Stop leaking executable path in successful sys$execve()
Since we don't return from sys$execve() when it's successful, we have to
take special care to tear down anything we've allocated.

Turns out we were not doing this for the full executable path itself.
2022-01-13 16:15:37 +01:00
Idan Horowitz
40159186c1 Kernel: Remove String use-after-free in generate_auxiliary_vector
Instead we generate the random bytes directly in
make_userspace_context_for_main_thread if requested.
2022-01-13 00:20:08 -08:00
Idan Horowitz
e72bbca9eb Kernel: Fix OOB write in sys$uname
Since this was only out of bounds of the specific field, not of the
whole struct, and because setting the hostname requires root privileges
this was not actually a security vulnerability.
2022-01-13 00:20:08 -08:00
Idan Horowitz
50d6a6a186 Kernel: Convert hostname to KString 2022-01-13 00:20:08 -08:00
Idan Horowitz
bc85b64a38 Kernel: Replace usages of String::formatted with KString in sys$exec 2022-01-12 16:09:09 +02:00
Idan Horowitz
dba0840942 Kernel: Remove outdated FIXME comment in sys$sethostname 2022-01-12 16:09:09 +02:00
Daniel Bertalan
182016d7c0 Kernel+LibC+LibCore+UE: Implement fchmodat(2)
This function is an extended version of `chmod(2)` that lets one control
whether to dereference symlinks, and specify a file descriptor to a
directory that will be used as the base for relative paths.
2022-01-12 14:54:12 +01:00