I often keep my terminal camped in the project root directory and run
`make && ./Kernel/sync.sh && ./Kernel/run`
This change allows me to not feel like a doofus when I do that :^)
This patch introduces a helpful copy_string_from_user() function
that takes a bounded null-terminated string from userspace memory
and copies it into a String object.
Supervisor Mode Access Prevention (SMAP) is an x86 CPU feature that
prevents the kernel from accessing userspace memory. With SMAP enabled,
trying to read/write a userspace memory address while in the kernel
will now generate a page fault.
Since it's sometimes necessary to read/write userspace memory, there
are two new instructions that quickly switch the protection on/off:
STAC (disables protection) and CLAC (enables protection.)
These are exposed in kernel code via the stac() and clac() helpers.
There's also a SmapDisabler RAII object that can be used to ensure
that you don't forget to re-enable protection before returning to
userspace code.
THis patch also adds copy_to_user(), copy_from_user() and memset_user()
which are the "correct" way of doing things. These functions allow us
to briefly disable protection for a specific purpose, and then turn it
back on immediately after it's done. Going forward all kernel code
should be moved to using these and all uses of SmapDisabler are to be
considered FIXME's.
Note that we're not realizing the full potential of this feature since
I've used SmapDisabler quite liberally in this initial bring-up patch.
Our syscall calling convention only allows passing up to 3 arguments in
registers. For syscalls that take more arguments, we bake them into a
struct and pass a pointer to that struct instead.
When doing pointer validation, this is what we would do:
1) Validate the "params" struct
2) Validate "params->some_pointer"
3) ... other stuff ...
4) Use "params->some_pointer"
Since the parameter struct is stored in userspace, it can be modified
by userspace after validation has completed.
This was a recurring pattern in many syscalls that was further hidden
by me using structured binding declarations to give convenient local
names to things in the parameter struct:
auto& [some_pointer, ...] = *params;
memcpy(some_pointer, ...);
This devilishly makes "some_pointer" look like a local variable but
it's actually more like an alias for "params->some_pointer" and will
expand to a dereference when accessed!
This patch fixes the issues by explicitly copying out each member from
the parameter structs before validating them, and then never using
the "param" pointers beyond that.
Thanks to braindead for finding this bug! :^)
This has been a FIXME for a long time. We now apply the provided
read/write permissions to the constructed FileDescription when opening
a File object via File::open().
We were running without the sticky bit and mode 777, which meant that
the /tmp directory was world-writable *without* protection.
With this fixed, it's no longer possible for everyone to steal root's
files in /tmp.
In order to ensure a specific owner and mode when the local socket
filesystem endpoint is instantiated, we need to be able to call
fchmod() and fchown() on a socket fd between socket() and bind().
This is because until we call bind(), there is no filesystem inode
for the socket yet.
If we're creating something that should have a different owner than the
current process's UID/GID, we need to plumb that all the way through
VFS down to the FS functions.
Inode::size() may try to take a lock, so we can't be calling it with
interrupts disabled.
This fixes a kernel hang when trying to execute a binary in a TmpFS.
We now have these API's in <Kernel/Random.h>:
- get_fast_random_bytes(u8* buffer, size_t buffer_size)
- get_good_random_bytes(u8* buffer, size_t buffer_size)
- get_fast_random<T>()
- get_good_random<T>()
Internally they both use x86 RDRAND if available, otherwise they fall
back to the same LCG we had in RandomDevice all along.
The main purpose of this patch is to give kernel code a way to better
express its needs for random data.
Randomness is something that will require a lot more work, but this is
hopefully a step in the right direction.
To accomodate file creation, path resolution optionally returns the
last valid parent directory seen while traversing the path.
Clients will then interpret "ENOENT, but I have a parent for you" as
meaning that the file doesn't exist, but its immediate parent directory
does. The client then goes ahead and creates a new file.
In the case of "/foo/bar/baz" where there is no "/foo", it would fail
with ENOENT and "/" as the last seen parent directory, causing e.g the
open() syscall to create "/baz".
Covered by test_io.
It was previously possible to write to read-only file descriptors,
and read from write-only file descriptors.
All FileDescription objects now start out non-readable + non-writable,
and whoever is creating them has to "manually" enable reading/writing
by calling set_readable() and/or set_writable() on them.