This has several benefits:
1) We no longer just blindly derefence a null pointer in various places
2) We will get nicer runtime error messages if the current process does
turn out to be null in the call location
3) GCC no longer complains about possible nullptr dereferences when
compiling without KUBSAN
Before we start disabling acquisition of the big process lock for
specific syscalls, make sure to document and assert that all the
lock is held during all syscalls.
The entire process is not needed, just require the user to pass in the
Space. Also provide no_lock variant to use when you already have the
VM/Space lock acquired, to avoid unnecessary recursive spinlock
acquisitions.
The Process::Handler type has KResultOr<FlatPtr> as its return type.
Using a different return type with an equally-sized template parameter
sort of works but breaks once that condition is no longer true, e.g.
for KResultOr<int> on x86_64.
Ideally the syscall handlers would also take FlatPtrs as their args
so we can get rid of the reinterpret_cast for the function pointer
but I didn't quite feel like cleaning that up as well.
This adds just enough stubs to make the kernel compile on x86_64. Obviously
it won't do anything useful - in fact it won't even attempt to boot because
Multiboot doesn't support ELF64 binaries - but it gets those compiler errors
out of the way so more progress can be made getting all the missing
functionality in place.
There were a few cases where we could end up logging profiling events
before or after the associated process or thread exists in the profile:
After enabling profiling we might end up with CPU samples before we
had a chance to synthesize process/thread creation events.
After a thread exits we would still log associated kmalloc/kfree
events. Instead we now just ignore those events.
This adds two new arguments to the thread_exit system call which let
a thread unmap an arbitrary VM range on thread exit. LibPthread
uses this functionality to unmap the thread stack.
Fixes#7267.
The current method of emitting performance events requires a bit of
boiler plate at every invocation, as well as having to ignore the
return code which isn't used outside of the perf event syscall. This
change attempts to clean that up by exposing high level API's that
can be used around the code base.
This turns the perfcore format into more a log than it was before,
which lets us properly log process, thread and region
creation/destruction. This also makes it unnecessary to dump the
process' regions every time it is scheduled like we did before.
Incidentally this also fixes 'profile -c' because we previously ended
up incorrectly dumping the parent's region map into the profile data.
Log-based mmap support enables profiling shared libraries which
are loaded at runtime, e.g. via dlopen().
This enables profiling both the parent and child process for
programs which use execve(). Previously we'd discard the profiling
data for the old process.
The Profiler tool has been updated to not treat thread IDs as
process IDs anymore. This enables support for processes with more
than one thread. Also, there's a new widget to filter which
process should be displayed.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
The expression
(u8*)params.m_stack_location + stack_size
… causes UBSan to spit out the warning
KUBSAN: addition of unsigned offset to 0x00000002 overflowed to 0xb0000003
… even though there is no actual overflow happening here.
This can be reproduced by running:
$ syscall create_thread 0 [ 0 0 0 0 0xb0000001 2 ]
Technically, this is a true-positive: The C++-reference is incredibly strict
about pointer-arithmetic:
> A pointer to non-array object is treated as a pointer to the first element
> of an array with size 1. […] [A]ttempts to generate a pointer that isn't
> pointing at an element of the same array or one past the end invoke
> undefined behavior.
https://en.cppreference.com/w/cpp/language/operator_arithmetic
Frankly, this feels silly. So let's just use FlatPtr instead.
Found by fuzz-syscalls. Undocumented bug.
Note that FlatPtr is an unsigned type, so
user_esp.value() - 4
is defined even if we end up with a user_esp of 0 (this can happen for example
when params.m_stack_size = 0 and params.m_stack_location = 0). The result would
be a Kernelspace-pointer, which would then be immediately flagged by
'MM.validate_user_stack' as invalid, as intended.
This was another vestige from a long time ago, when exiting a thread
would mutate global data structures that were only protected by the
interrupt flag.
Make more of the kernel compile in 64-bit mode, and make some things
pointer-size-agnostic (by using FlatPtr.)
There's a lot of work to do here before the kernel will even compile.
(...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED)
Since all of these checks are done in release builds as well,
let's rename them to VERIFY to prevent confusion, as everyone is
used to assertions being compiled out in release.
We can introduce a new ASSERT macro that is specifically for debug
checks, but I'm doing this wholesale conversion first since we've
accumulated thousands of these already, and it's not immediately
obvious which ones are suitable for ASSERT.
Found by fuzz-syscalls. Can be reproduced by running this in the Shell:
$ syscall exit_thread
This leaves the process in the 'Dying' state but never actually removes it.
Therefore, avoid this scenario by pretending to exit the entire process.
This patch adds Space, a class representing a process's address space.
- Each Process has a Space.
- The Space owns the PageDirectory and all Regions in the Process.
This allows us to reorganize sys$execve() so that it constructs and
populates a new Space fully before committing to it.
Previously, we would construct the new address space while still
running in the old one, and encountering an error meant we had to do
tedious and error-prone rollback.
Those problems are now gone, replaced by what's hopefully a set of much
smaller problems and missing cleanups. :^)
Wrap thread creation in a Thread::try_create() helper that first
allocates a kernel stack region. If that allocation fails, we propagate
an ENOMEM error to the caller.
This avoids the situation where a thread is half-constructed, without a
valid kernel stack, and avoids having to do messy cleanup in that case.
It's useful for programs to change their thread names to say something
interesting about what they are working on. Let's not require "thread"
for this since single-threaded programs may want to do it without
pledging "thread".
Problem:
- Many constructors are defined as `{}` rather than using the ` =
default` compiler-provided constructor.
- Some types provide an implicit conversion operator from `nullptr_t`
instead of requiring the caller to default construct. This violates
the C++ Core Guidelines suggestion to declare single-argument
constructors explicit
(https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#c46-by-default-declare-single-argument-constructors-explicit).
Solution:
- Change default constructors to use the compiler-provided default
constructor.
- Remove implicit conversion operators from `nullptr_t` and change
usage to enforce type consistency without conversion.
These changes are arbitrarily divided into multiple commits to make it
easier to find potentially introduced bugs with git bisect.Everything:
The modifications in this commit were automatically made using the
following command:
find . -name '*.cpp' -exec sed -i -E 's/dbg\(\) << ("[^"{]*");/dbgln\(\1\);/' {} \;
This prevents zombies created by multi-threaded applications and brings
our model back to closer to what other OSs do.
This also means that SIGSTOP needs to halt all threads, and SIGCONT needs
to resume those threads.
New Thread objects should be adopted into a RefPtr upon creation.
If creating a thread failed (e.g. out of memory), releasing the RefPtr
will destruct the partially created object, but in the successful case
the thread will add an additional reference that it keeps until it
finishes execution. Adopting will drop it to 1 when returning from
create_thread, or 0 if the thread could not be fully constructed.
This adds the ability to pass a pointer to kernel thread/process.
Also add the ability to use a closure as thread function, which
allows passing information to a kernel thread more easily.
g_scheduler_lock cannot safely be acquired after Thread::m_lock
because another processor may already hold g_scheduler_lock and wait
for the same Thread::m_lock.
Similar to Process, we need to make Thread refcounted. This will solve
problems that will appear once we schedule threads on more than one
processor. This allows us to hold onto threads without necessarily
holding the scheduler lock for the entire duration.
The thread joining logic hadn't been updated to account for the subtle
differences introduced by software context switching. This fixes several
race conditions related to thread destruction and joining, as well as
finalization which did not properly account for detached state and the
fact that threads can be joined after termination as long as they're not
detached.
Fixes#3596
There are plenty of places in the kernel that aren't
checking if they actually got their allocation.
This fixes some of them, but definitely not all.
Fixes#3390Fixes#3391
Also, let's make find_one_free_page() return nullptr
if it doesn't get a free index. This stops the kernel
crashing when out of memory and allows memory purging
to take place again.
Fixes#3487
Since the CPU already does almost all necessary validation steps
for us, we don't really need to attempt to do this. Doing it
ourselves doesn't really work very reliably, because we'd have to
account for other processors modifying virtual memory, and we'd
have to account for e.g. pages not being able to be allocated
due to insufficient resources.
So change the copy_to/from_user (and associated helper functions)
to use the new safe_memcpy, which will return whether it succeeded
or not. The only manual validation step needed (which the CPU
can't perform for us) is making sure the pointers provided by user
mode aren't pointing to kernel mappings.
To make it easier to read/write from/to either kernel or user mode
data add the UserOrKernelBuffer helper class, which will internally
either use copy_from/to_user or directly memcpy, or pass the data
through directly using a temporary buffer on the stack.
Last but not least we need to keep syscall params trivial as we
need to copy them from/to user mode using copy_from/to_user.
Userspace<void*> is a bit strange here, as it would appear to the
user that we intend to de-refrence the pointer in kernel mode.
However I think it does a good join of illustrating that we are
treating the void* as a value type, instead of a pointer type.