Note that I'm developing some helper types in the Syscall namespace as
I go here. Once I settle on some nice types, I will convert all the
other syscalls to use them as well.
The join_thread() syscall is not supposed to be interruptible by
signals, but it was. And since the process death mechanism piggybacked
on signal interrupts, it was possible to interrupt a pthread_join() by
killing the process that was doing it, leading to confusing due to some
assumptions being made by Thread::finalize() for threads that have a
pending joiner.
This patch fixes the issue by making "interrupted by death" a distinct
block result separate from "interrupted by signal". Then we handle that
state in join_thread() and tidy things up so that thread finalization
doesn't get confused by the pending joiner being gone.
Test: Tests/Kernel/null-deref-crash-during-pthread_join.cpp
As Sergey pointed out, it's silly to have proper entries for . and ..
in TmpFS when we can just synthesize them on the fly.
Note that we have to tolerate removal of . and .. via remove_child()
to keep VFS::rmdir() happy.
The userspace execve() wrapper now measures all the strings and puts
them in a neat and tidy structure on the stack.
This way we know exactly how much to copy in the kernel, and we don't
have to use the SMAP-violating validate_read_str(). :^)
When loading a new executable, we now map the ELF image in kernel-only
memory and parse it there. Then we use copy_to_user() when initializing
writable regions with data from the executable.
Note that the exec() syscall still disables SMAP protection and will
require additional work. This patch only affects kernel-originated
process spawns.
This fixes a null RefPtr deref (which asserts) in the scheduler if a
file descriptor being select()'ed is closed by a second thread while
blocked in select().
Test: Kernel/null-deref-close-during-select.cpp
Make mmap return -ENOTSUP in this case to make sure users don't get
confused and think they're using a private mapping when it's actually
shared. It's currenlty not possible to open a file and mmap it
MAP_PRIVATE, and change the perms of the private mapping to ones that
don't match the permissions of the underlying file.
It would be nice to do this in the assembly code, but we have to check
if the feature is available before doing a CLAC, so I've put this in
the C++ code for now.
This patch fixes some issues with the mmap() and mprotect() syscalls,
neither of whom were checking the permission bits of the underlying
files when mapping an inode MAP_SHARED.
This made it possible to subvert execution of any running program
by simply memory-mapping its executable and replacing some of the code.
Test: Kernel/mmap-write-into-running-programs-executable-file.cpp
This encourages callers to strongly reference file descriptions while
working with them.
This fixes a use-after-free issue where one thread would close() an
open fd while another thread was blocked on it becoming readable.
Test: Kernel/uaf-close-while-blocked-in-read.cpp
Instead of using the FIFO's memory address as part of its absolute path
identity, just use an incrementing FIFO index instead.
Note that this is not used for anything other than debugging (it helps
you identify which file descriptors refer to the same FIFO by looking
at /proc/PID/fds
Before this, you could make the kernel copy memory from anywhere by
setting up an ELF executable with a program header specifying file
offsets outside the file.
Since ELFImage didn't even know how large it was, we had no clue that
we were copying things from outside the ELF.
Fix this by adding a size field to ELFImage and validating program
header ranges before memcpy()'ing to them.
The ELF code is definitely going to need more validation and checking.
This code had been misinterpreting the Multiboot ELF section headers
since the beginning. Furthermore QEMU wasn't even passing us any
headers at all, so this wasn't checking anything.