1
Fork 0
mirror of https://github.com/RGBCube/serenity synced 2025-05-24 01:05:08 +00:00
Commit graph

1052 commits

Author SHA1 Message Date
Itamar
8fce807b10 Kernel: Fix usermode verification in ptrace with PT_SETREGS
When doing PT_SETREGS, we want to verify that the debugged thread is
executing in usermode.

b2f7ccf refactored things and flipped the relevant check around, which
broke things that use PT_SETREGS (for example, stepping over
breakpoints with sdb).
2023-02-03 17:39:13 +01:00
Timon Kruiper
b941bd55d9 Kernel: Add Syscalls/execve.cpp to aarch64 build 2023-01-27 20:47:08 +00:00
Timon Kruiper
1fbf562e7e Kernel: Add ThreadRegisters::set_exec_state and use it in execve.cpp
Using this abstraction it is possible to compile this file for aarch64.
2023-01-27 20:47:08 +00:00
Timon Kruiper
12322670cb Kernel: Use InterruptsState abstraction in execve.cpp
This was using the x86_64 specific cpu_flags abstraction, which is not
compatible with aarch64.
2023-01-27 20:47:08 +00:00
Timon Kruiper
5ffd53e2f2 Kernel: Add Syscalls/fork.cpp to aarch64 build 2023-01-27 20:47:08 +00:00
Timon Kruiper
80b722081b Kernel: Add Syscalls/mmap.cpp to aarch64 build 2023-01-27 20:47:08 +00:00
Timon Kruiper
a146a19636 Kernel: Make Syscalls/ptrace.cpp buildable for aarch64 2023-01-27 20:47:08 +00:00
Timon Kruiper
697c5ca5e5 Kernel: Move Memory/PageDirectory.{cpp,h} to arch-specific directory
The handling of page tables is very architecture specific, so belongs
in the Arch directory. Some parts were already architecture-specific,
however this commit moves the rest of the PageDirectory class into the
Arch directory.

While we're here the aarch64/PageDirectory.{h,cpp} files are updated to
be aarch64 specific, by renaming some members and removing x86_64
specific code.
2023-01-27 11:41:43 +01:00
Timon Kruiper
fb10774862 Kernel: Factor our PreviousMode into RegisterState::previous_mode
Various places in the kernel were manually checking the cs register for
x86_64, however to share this with aarch64 a function in RegisterState
is added, and the call-sites are updated. While we're here the
PreviousMode enum is renamed to ExecutionMode.
2023-01-27 11:41:43 +01:00
Andrew Kaster
100fb38c3e Kernel+Userland: Move LibC/sys/ioctl_numbers to Kernel/API/Ioctl.h
This header has always been fundamentally a Kernel API file. Move it
where it belongs. Include it directly in Kernel files, and make
Userland applications include it via sys/ioctl.h rather than directly.
2023-01-21 10:43:59 -07:00
Andrew Kaster
ddea37b521 Kernel+LibC: Move name length constants to Kernel/API from limits.h
Reduce inclusion of limits.h as much as possible at the same time.

This does mean that kmalloc.h is now including Kernel/API/POSIX/limits.h
instead of LibC/limits.h, but the scope could be limited a lot more.
Basically every file in the kernel includes kmalloc.h, and needs the
limits.h include for PAGE_SIZE.
2023-01-21 10:43:59 -07:00
Liav A
7aebbe52b9 Meta: Fix copyright header in Kernel/Syscalls/jail.cpp file
I wrote that file in 2022, not Andreas in 2018.
2023-01-14 09:57:04 +01:00
Liav A
16b6e644d7 Kernel: Require "stdio" pledge promise when calling get_root_session_id 2023-01-13 13:41:30 +01:00
Andreas Kling
5dcc58d54a Kernel+LibCore: Make %sid path parsing not take ages
Before this patch, Core::SessionManagement::parse_path_with_sid() would
figure out the root session ID by sifting through /sys/kernel/processes.

That file can take quite a while to generate (sometimes up to 40ms on my
machine, which is a problem on its own!) and with no caching, many of
our programs were effectively doing this multiple times on startup when
unveiling something in /tmp/session/%sid/

While we should find ways to make generating /sys/kernel/processes fast
again, this patch addresses the specific problem by introducing a new
syscall: sys$get_root_session_id(). This extracts the root session ID
by looking directly at the process table and takes <1ms instead of 40ms.

This cuts WebContent process startup time by ~100ms on my machine. :^)
2023-01-10 19:32:31 +01:00
Liav A
04221a7533 Kernel: Mark Process::jail() method as const
We really don't want callers of this function to accidentally change
the jail, or even worse - remove the Process from an attached jail.
To ensure this never happens, we can just declare this method as const
so nobody can mutate it this way.
2023-01-07 03:44:59 +03:30
yyny
fb2be937ac Kernel: Allow sending SIGCONT to processes in the same group
Allow sending `SIGCONT` to processes that share the same `pgid`.
This is allowed in Linux aswell.

Also fixes a FIXME :^)
2023-01-03 18:13:11 +01:00
yyny
9ca979846c Kernel: Add sid and pgid to Credentials
There are places in the kernel that would like to have access
to `pgid` credentials in certain circumstances.

I haven't found any use cases for `sid` yet, but `sid` and `pgid` are
both changed with `sys$setpgid`, so it seemed sensical to add it.

In Linux, `man 7 credentials` also mentions both the session id and
process group id, so this isn't unprecedented.
2023-01-03 18:13:11 +01:00
kleines Filmröllchen
a6a439243f Kernel: Turn lock ranks into template parameters
This step would ideally not have been necessary (increases amount of
refactoring and templates necessary, which in turn increases build
times), but it gives us a couple of nice properties:
- SpinlockProtected inside Singleton (a very common combination) can now
  obtain any lock rank just via the template parameter. It was not
  previously possible to do this with SingletonInstanceCreator magic.
- SpinlockProtected's lock rank is now mandatory; this is the majority
  of cases and allows us to see where we're still missing proper ranks.
- The type already informs us what lock rank a lock has, which aids code
  readability and (possibly, if gdb cooperates) lock mismatch debugging.
- The rank of a lock can no longer be dynamic, which is not something we
  wanted in the first place (or made use of). Locks randomly changing
  their rank sounds like a disaster waiting to happen.
- In some places, we might be able to statically check that locks are
  taken in the right order (with the right lock rank checking
  implementation) as rank information is fully statically known.

This refactoring even more exposes the fact that Mutex has no lock rank
capabilites, which is not fixed here.
2023-01-02 18:15:27 -05:00
Liav A
e598f22768 Kernel: Disallow executing SUID binaries if process is jailed
Check if the process we are currently running is in a jail, and if that
is the case, fail early with the EPERM error code.

Also, as Brian noted, we should also disallow attaching to a jail in
case of already running within a setid executable, as this leaves the
user with false thinking of being secure (because you can't exec new
setid binaries), but the current program is still marked setid, which
means that at the very least we gained permissions while we didn't
expect it, so let's block it.
2022-12-30 15:49:37 -05:00
Timon Kruiper
a3cbaa3449 Kernel: Move ThreadRegisters into arch-specific directory
These are architecture-specific anyway, so they belong in the Arch
directory. This commit also adds ThreadRegisters::set_initial_state to
factor out the logic in Thread.cpp.
2022-12-29 19:32:20 -07:00
Liav A
91db482ad3 Kernel: Reorganize Arch/x86 directory to Arch/x86_64 after i686 removal
No functional change.
2022-12-28 11:53:41 +01:00
Liav A
5ff318cf3a Kernel: Remove i686 support 2022-12-28 11:53:41 +01:00
Liav A
8585b2dc23 Kernel/Memory: Add option to annotate region mapping as immutable
We add this basic functionality to the Kernel so Userspace can request a
particular virtual memory mapping to be immutable. This will be useful
later on in the DynamicLoader code.

The annotation of a particular Kernel Region as immutable implies that
the following restrictions apply, so these features are prohibited:
- Changing the region's protection bits
- Unmapping the region
- Annotating the region with other virtual memory flags
- Applying further memory advises on the region
- Changing the region name
- Re-mapping the region
2022-12-16 01:02:00 -07:00
Liav A
6c0486277e Kernel: Reintroduce the msyscall syscall as the annotate_mapping syscall
This syscall will be used later on to ensure we can declare virtual
memory mappings as immutable (which means that the underlying Region is
basically immutable for both future annotations or changing the
protection bits of it).
2022-12-16 01:02:00 -07:00
Agustin Gianni
ac40090583 Kernel: Add the auxiliary vector to the stack size validation
This patch validates that the size of the auxiliary vector does not
exceed `Process::max_auxiliary_size`. The auxiliary vector is a range
of memory in userspace stack where the kernel can pass information to
the process that will be created via `Process:do_exec`.

The reason the kernel needs to validate its size is that the about to
be created process needs to have remaining space on the stack.
Previously only `argv` and `envp` were taken into account for the
size validation, with this patch, the size of `auxv` is also
checked. All three elements contain values that a user (or an
attacker) can specify.

This patch adds the constant `Process::max_auxiliary_size` which is
defined to be one eight of the user-space stack size. This is the
approach taken by `Process:max_arguments_size` and
`Process::max_environment_size` which are used to check the sizes
of `argv` and `envp`.
2022-12-14 15:09:28 +00:00
sin-ack
ef6921d7c7 Kernel+LibC+LibELF: Set stack size based on PT_GNU_STACK during execve
Some programs explicitly ask for a different initial stack size than
what the OS provides. This is implemented in ELF by having a
PT_GNU_STACK header which has its p_memsz set to the amount that the
program requires. This commit implements this policy by reading the
p_memsz of the header and setting the main thread stack size to that.
ELF::Image::validate_program_headers ensures that the size attribute is
a reasonable value.
2022-12-11 19:55:37 -07:00
sin-ack
9b425b860c Kernel+LibC+Tests: Implement pwritev(2)
While this isn't really POSIX, it's needed by the Zig port and was
simple enough to implement.
2022-12-11 19:55:37 -07:00
sin-ack
70337f3a4b Kernel+LibC: Implement setregid(2)
This copies and adapts the setresgid syscall, following in the footsteps
of setreuid and setresuid.
2022-12-11 19:55:37 -07:00
sin-ack
2a502fe232 Kernel+LibC+LibCore+UserspaceEmulator: Implement faccessat(2)
Co-Authored-By: Daniel Bertalan <dani@danielbertalan.dev>
2022-12-11 19:55:37 -07:00
sin-ack
d5fbdf1866 Kernel+LibC+LibCore: Implement renameat(2)
Now with the ability to specify different bases for the old and new
paths.
2022-12-11 19:55:37 -07:00
sin-ack
eb5389e933 Kernel+LibC+LibCore: Implement mkdirat(2) 2022-12-11 19:55:37 -07:00
sin-ack
6445a706cf Kernel+LibC: Implement readlinkat(2)
Co-Authored-By: Daniel Bertalan <dani@danielbertalan.dev>
2022-12-11 19:55:37 -07:00
sin-ack
9850a69cd1 Kernel+LibC+LibCore: Implement symlinkat(2)
Co-Authored-By: Daniel Bertalan <dani@danielbertalan.dev>
2022-12-11 19:55:37 -07:00
Andreas Kling
4277e2d58f Kernel: Add some spec links and comments to sys$posix_fallocate() 2022-11-29 11:09:19 +01:00
Andreas Kling
961e1e590b Kernel: Make sys$posix_fallocate() fail with ENODEV on non-regular files
Previously we tried to determine if `fd` refers to a non-regular file by
doing a stat() operation on the file.

This didn't work out very well since many File subclasses don't
actually implement stat() but instead fall back to failing with EBADF.

This patch fixes the issue by checking for regular files with
File::is_regular_file() instead.
2022-11-29 11:09:19 +01:00
Andreas Kling
9249bcb5aa Kernel: Remove unnecessary FIXME in sys$posix_fallocate()
This syscall doesn't need to do anything for ENOSPC, as that is already
handled by its callees.
2022-11-29 11:09:19 +01:00
Liav A
718ae68621 Kernel+LibCore+LibC: Implement support for forcing unveil on exec
To accomplish this, we add another VeilState which is called
LockedInherited. The idea is to apply exec unveil data, similar to
execpromises of the pledge syscall, on the current exec'ed program
during the execve sequence. When applying the forced unveil data, the
veil state is set to be locked but the special state of LockedInherited
ensures that if the new program tries to unveil paths, the request will
silently be ignored, so the program will continue running without
receiving an error, but is still can only use the paths that were
unveiled before the exec syscall. This in turn, allows us to use the
unveil syscall with a special utility to sandbox other userland programs
in terms of what is visible to them on the filesystem, and is usable on
both programs that use or don't use the unveil syscall in their code.
2022-11-26 12:42:15 -07:00
Andreas Kling
5556b27e38 Kernel: Update tv_nsec field when using utimensat() with UTIME_NOW
We were only updating the tv_sec field and leaving UTIME_NOW in tv_nsec.
2022-11-24 16:56:27 +01:00
Liav A
9559682f5c Kernel: Disallow jail creation from a process within a jail
We now disallow jail creation from a process within a jail because there
is simply no valid use case to allow it, and we will probably not enable
this behavior (which is considered a bug) again.

Although there was no "real" security issue with this bug, as a process
would still be denied to join that jail, there's an information reveal
about the amount of jails that are or were present in the system.
2022-11-13 16:58:54 -07:00
Liav A
3cc0d60141 Kernel: Split the Ext2FileSystem.{cpp,h} files into smaller components 2022-11-08 02:54:48 -07:00
Liav A
1c91881a1d Kernel: Split the ISO9660FileSystem.{cpp,h} files to smaller components 2022-11-08 02:54:48 -07:00
Liav A
fca3b7f1f9 Kernel: Split the DevPtsFS files into smaller components 2022-11-08 02:54:48 -07:00
Liav A
3fc52a6d1c Kernel: Split the Plan9FileSystem.{cpp,h} file into smaller components 2022-11-08 02:54:48 -07:00
Liav A
3906dd3aa3 Kernel: Split the ProcFS core file into smaller components 2022-11-08 02:54:48 -07:00
Liav A
e882b2ed05 Kernel: Split the FATFileSystem.{cpp,h} files into smaller components 2022-11-08 02:54:48 -07:00
Liav A
5e6101dd3e Kernel: Split the TmpFS core files into smaller components 2022-11-08 02:54:48 -07:00
Liav A
f53149d5f6 Kernel: Split the SysFS core files into smaller components 2022-11-08 02:54:48 -07:00
Liav A
5e062414c1 Kernel: Add support for jails
Our implementation for Jails resembles much of how FreeBSD jails are
working - it's essentially only a matter of using a RefPtr in the
Process class to a Jail object. Then, when we iterate over all processes
in various cases, we could ensure if either the current process is in
jail and therefore should be restricted what is visible in terms of
PID isolation, and also to be able to expose metadata about Jails in
/sys/kernel/jails node (which does not reveal anything to a process
which is in jail).

A lifetime model for the Jail object is currently plain simple - there's
simpy no way to manually delete a Jail object once it was created. Such
feature should be carefully designed to allow safe destruction of a Jail
without the possibility of releasing a process which is in Jail from the
actual jail. Each process which is attached into a Jail cannot leave it
until the end of a Process (i.e. when finalizing a Process). All jails
are kept being referenced in the JailManagement. When a last attached
process is finalized, the Jail is automatically destroyed.
2022-11-05 18:00:58 -06:00
Andreas Kling
9c46fb7337 Kernel: Make sys$msyscall() not take the big lock
This function is already serialized by the address space lock.
2022-11-05 18:54:39 +01:00
kleines Filmröllchen
259bfe05b1 Kernel: Set priority of all threads within a process if requested
This is intended to reflect the POSIX sched_setparam API, which has some
cryptic language
(https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_08_04_01
) that as far as I can tell implies we should prioritize process
scheduling policies over thread scheduling policies. Technically this
means that a process must have its own sets of policies that are
considered first by the scheduler, but it seems unlikely anyone relies
on this behavior in practice. So we just override all thread's policies,
making them (at least before calls to pthread_setschedparam) behave
exactly like specified on the surface.
2022-10-27 11:30:19 +01:00