When switching to the new address space, we also have to switch the
Process::m_master_tls_* variables as they may refer to a region in
the old address space.
This was causing `su` to not run correctly.
Regression from 65641187ff.
This replaces the previous owning address space pointer. This commit
should not change any of the existing functionality, but it lays down
the groundwork needed to let us properly access the region table under
the address space spinlock during page fault handling.
Instead of setting up the new address space on its own and only swapping
to it at the end, we now immediately swap to the new address space
(while still keeping the old one alive) and only revert back to the old
one if we fail at any point.
This is done to ensure that the process's active address space (aka the
contents of m_space) always matches the address space actually in use.
That should allow us to eventually make the page fault handler
process-aware, which will let us properly take the process address space
lock.
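The swap-then-revert flow described above can be sketched with a small
scope guard. This is a minimal illustration in standard C++, not the
kernel's code: AddressSpace, Process, AddressSpaceSwitcher and exec_like
are hypothetical stand-ins, and std::shared_ptr plays the role of the
kernel's ref-counted pointers.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins; the real kernel types are far richer.
struct AddressSpace {
    std::string name;
};

struct Process {
    std::shared_ptr<AddressSpace> space; // plays the role of m_space
};

// Swaps in the new address space immediately and restores the old one
// unless commit() is called before the guard goes out of scope.
class AddressSpaceSwitcher {
public:
    AddressSpaceSwitcher(Process& process, std::shared_ptr<AddressSpace> new_space)
        : m_process(process)
        , m_old_space(std::exchange(process.space, std::move(new_space)))
    {
    }

    ~AddressSpaceSwitcher()
    {
        if (!m_committed)
            m_process.space = std::move(m_old_space); // revert on failure
    }

    void commit() { m_committed = true; }

private:
    Process& m_process;
    std::shared_ptr<AddressSpace> m_old_space; // keeps the old space alive
    bool m_committed { false };
};

// Returns true on success. `fail_midway` simulates an error during setup.
bool exec_like(Process& process, bool fail_midway)
{
    AddressSpaceSwitcher switcher(process, std::make_shared<AddressSpace>(AddressSpace { "new" }));
    // From here on, process.space already names the space in use.
    if (fail_midway)
        return false; // the guard's destructor restores the old space
    switcher.commit();
    return true;
}
```

Because the guard owns the old space until commit(), every early return
leaves process.space pointing at a live address space.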
The SID was duplicated between the process credentials and protected
data. And to make matters worse, the credentials SID was not updated in
sys$setsid.
This patch fixes this by removing the SID from protected data and
updating the credentials SID everywhere.
This closes two race windows:

- ProcessGroup removed itself from the "all process groups" list in its
  destructor. It was possible to walk the list between the last unref()
  and the destructor invocation, and grab a pointer to a ProcessGroup
  that was about to get deleted.
- sys$setsid() could end up creating a process group that already
  existed, as there was a race window between checking if the PGID
  is used, and actually creating a ProcessGroup with that PGID.
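One way to close both kinds of window is to do every lookup, liveness
check and insertion under a single lock. The sketch below shows that idea
in standard C++. It is an assumption about the general technique, not the
kernel's actual fix: ProcessGroupRegistry and its methods are hypothetical,
and std::weak_ptr stands in for whatever mechanism the kernel uses to
avoid handing out dying objects.

```cpp
#include <map>
#include <memory>
#include <mutex>

struct ProcessGroup {
    int pgid;
};

// Hypothetical registry of all process groups.
class ProcessGroupRegistry {
public:
    // Returns nullptr if the PGID is already in use. The "is it used?"
    // check and the insertion happen under the same lock.
    std::shared_ptr<ProcessGroup> create_if_unused(int pgid)
    {
        std::lock_guard<std::mutex> locker(m_lock);
        auto it = m_groups.find(pgid);
        if (it != m_groups.end() && !it->second.expired())
            return nullptr;
        auto group = std::make_shared<ProcessGroup>(ProcessGroup { pgid });
        m_groups[pgid] = group;
        return group;
    }

    // Never returns a group whose last strong reference is already gone:
    // lock() yields nullptr for an expired entry.
    std::shared_ptr<ProcessGroup> find(int pgid)
    {
        std::lock_guard<std::mutex> locker(m_lock);
        auto it = m_groups.find(pgid);
        if (it == m_groups.end())
            return nullptr;
        return it->second.lock();
    }

private:
    std::mutex m_lock;
    std::map<int, std::weak_ptr<ProcessGroup>> m_groups;
};
```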
Now that it's no longer using LockRefPtr, we can actually move it into
protected data. (LockRefPtr couldn't be stored there because protected
data is immutable at times, and LockRefPtr uses some of its own bits
for locking.)
This syscall sends a signal to other threads or itself. Signal delivery
is already guarded by its own locking, and is widely used within the
kernel without help from the big lock.
...and also make the Process tick counters clock_t instead of u32.
It seems harmless to get interrupted in the middle of reading these
counters and reporting slightly fewer ticks in some category.
This syscall is only concerned with the current thread. The one
exception is a pledge violation, where it adds some details about the
violation to the process coredump metadata, and that path is already
serialized.
This syscall operates on the file descriptor table, and on individual
open file descriptions. Both of those are already protected by scoped
locking mechanisms.
This syscall had a TOCTOU where it checked the peer's PPID before
locking the protected data (where the PPID is stored).
After closing the race window, we can mark the syscall as not needing
the big lock.
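The shape of such a TOCTOU fix is to move the check inside the locked
region, so that the check and the action see the same state. A minimal
sketch in standard C++ follows. All names here (ProtectedData,
with_protected_data, set_pgid_if_parent) are hypothetical and only mirror
the idea of lock-protected process data.

```cpp
#include <mutex>

struct ProtectedData {
    int ppid { 0 };
    int pgid { 0 };
};

class Process {
public:
    explicit Process(int ppid) { m_data.ppid = ppid; }

    // Runs `callback` on the protected data while holding its lock.
    template<typename Callback>
    auto with_protected_data(Callback callback)
    {
        std::lock_guard<std::mutex> locker(m_lock);
        return callback(m_data);
    }

private:
    std::mutex m_lock;
    ProtectedData m_data;
};

// Hypothetical operation: only the peer's parent may change its PGID.
// Returns true on success.
bool set_pgid_if_parent(Process& peer, int caller_pid, int new_pgid)
{
    return peer.with_protected_data([&](ProtectedData& data) {
        if (data.ppid != caller_pid)
            return false;     // the check happens under the lock...
        data.pgid = new_pgid; // ...and so does the action
        return true;
    });
}
```

The racy version would read the PPID first and take the lock afterwards,
leaving a window in which the PPID could change.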
These were stored in a bunch of places. The main one that's a bit iffy
is the Mutex::m_holder one, which I'm going to simplify in a subsequent
commit.
In Plan9FS and WorkQueue, we can't make the NNRPs const due to
initialization order problems. That's probably doable with further
cleanup, but left as an exercise for our future selves.
Before starting this, I expected the thread blockers to be a problem,
but as it turns out they were super straightforward (for once!) as they
don't mutate the thread after initiating a block, so they can just use
simple const-ified NNRPs.
- Instead of taking the first new thread as an out-parameter, we now
  bundle the process and its first thread in a struct and use that
  as the return value.
- Make all Process factory functions return ErrorOr. Use this to convert
  some places to more TRY().
- Drop the "try_" prefix on Process factory functions.
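The factory shape described in the list above can be sketched in standard
C++17. This is an illustration under assumptions, not the kernel's API:
Result is a tiny stand-in for ErrorOr built on std::variant, and
ProcessAndFirstThread, create_process and kOutOfMemory are hypothetical
names.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <variant>

struct Thread {
    int tid;
};

struct Process {
    std::string name;
};

// Bundles a process with its first thread, replacing an out-parameter.
struct ProcessAndFirstThread {
    std::shared_ptr<Process> process;
    std::shared_ptr<Thread> first_thread;
};

// Stand-in for ErrorOr<T>: either a value or an errno-style code.
template<typename T>
using Result = std::variant<T, int>;

constexpr int kOutOfMemory = 12; // hypothetical error code

// Factory without a "try_" prefix; failure is reported via the return type.
Result<ProcessAndFirstThread> create_process(std::string name, bool simulate_failure = false)
{
    if (simulate_failure)
        return kOutOfMemory;
    auto process = std::make_shared<Process>(Process { std::move(name) });
    auto thread = std::make_shared<Thread>(Thread { 1 });
    return ProcessAndFirstThread { std::move(process), std::move(thread) };
}
```

A caller inspects the result with std::get_if or std::holds_alternative,
which plays roughly the role that TRY() plays for ErrorOr.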
The only persistent one of these was Thread::m_process and that never
changes after initialization. Make it const to enforce this and switch
everything over to RefPtr & NonnullRefPtr.
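As a rough analogy in standard C++, a const smart-pointer member that is
set once in the constructor gives the same guarantee: the pointer can
never be reseated afterwards. Thread and Process below are hypothetical
stand-ins, std::shared_ptr substitutes for NonnullRefPtr, and the null
check (with an exception) is an assumption added to mimic the non-null
property.

```cpp
#include <memory>
#include <stdexcept>
#include <type_traits>
#include <utility>

struct Process {
    int pid;
};

class Thread {
public:
    explicit Thread(std::shared_ptr<Process> process)
        : m_process(std::move(process))
    {
        // shared_ptr may be null, unlike a NonnullRefPtr, so check once here.
        if (!m_process)
            throw std::invalid_argument("Thread requires a process");
    }

    Process& process() const { return *m_process; }

private:
    // const: set during construction, never changed afterwards.
    std::shared_ptr<Process> const m_process;
};

// The const member makes the compiler reject reassignment of a Thread.
static_assert(!std::is_copy_assignable<Thread>::value,
    "m_process must not be replaceable after initialization");
```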
There was only one permanent storage location for these: as a member
in the Mount class.
That member is never modified after Mount initialization, so we don't
need to worry about races there.
The details of the specific interrupt bits that must be turned on are
irrelevant to the sys$execve implementation. Abstract it away to the
Processor implementations using the InterruptsState enum.
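A minimal sketch of this kind of abstraction, in standard C++: the caller
only says whether interrupts should be enabled, and each processor type
maps that to its own flag bits. The processor structs and
prepare_exec_flags are hypothetical. The bit positions follow the
architectures (x86 RFLAGS.IF is bit 9 and bit 1 is always set; the
AArch64 IRQ mask "I" is bit 7 of the saved program status), but the
overall layout is an assumption and not the kernel's code.

```cpp
#include <cstdint>

enum class InterruptsState {
    Enabled,
    Disabled,
};

constexpr std::uint64_t x86_reserved_bit = 1ull << 1;   // always set in RFLAGS
constexpr std::uint64_t x86_interrupt_flag = 1ull << 9; // RFLAGS.IF
constexpr std::uint64_t aarch64_irq_mask = 1ull << 7;   // "I" bit: set means masked

// Hypothetical x86-64 flavor: interrupts are enabled by setting a bit.
struct X86Processor {
    static std::uint64_t initial_flags(InterruptsState state)
    {
        std::uint64_t flags = x86_reserved_bit;
        if (state == InterruptsState::Enabled)
            flags |= x86_interrupt_flag;
        return flags;
    }
};

// Hypothetical AArch64 flavor: interrupts are enabled by clearing a mask bit.
struct Aarch64Processor {
    static std::uint64_t initial_flags(InterruptsState state)
    {
        std::uint64_t flags = 0;
        if (state == InterruptsState::Disabled)
            flags |= aarch64_irq_mask;
        return flags;
    }
};

// Architecture-independent caller: it never touches specific bits.
template<typename Processor>
std::uint64_t prepare_exec_flags()
{
    return Processor::initial_flags(InterruptsState::Enabled);
}
```

The two architectures use opposite polarity for the same intent, which is
the kind of detail the enum keeps out of the caller.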