mirror of https://github.com/RGBCube/serenity synced 2025-05-16 13:55:00 +00:00
Commit graph

829 commits

Author SHA1 Message Date
Ben Wiederhake
00131d244e Kernel: Expose sysctl 'ubsan_is_deadly' to panic the Kernel on UB
This makes it easier to find UB, for example when fuzzing the Kernel.

This can be enabled by default, thanks to @boricj's work in
32e1354b9b.
2021-03-07 17:31:25 +01:00
Andreas Kling
2871df6f0d Kernel: Stop trying to keep InodeVMObject in sync with disk changes
As it turns out, Dr. POSIX doesn't require that post-mmap() changes
to a file are reflected in the memory mappings. So we don't actually
have to care about the file size changing (or the contents.)

IIUC, as long as all the MAP_SHARED mappings that refer to the same
inode are in sync, we're good.

This means that VMObjects don't need resizing capabilities. I'm sure
there are ways we can take advantage of this fact.
2021-03-04 15:42:51 +01:00
Andreas Kling
a1d1a3b50b Kernel: Use BitmapView instead of Bitmap::wrap() 2021-03-04 11:25:45 +01:00
Andreas Kling
5e7abea31e Kernel+Profiler: Capture metadata about all profiled processes
The perfcore file format was previously limited to a single process
since the pid/executable/regions data was top-level in the JSON.

This patch moves the process-specific data into a top-level array
named "processes" and we now add entries for each process that has
been sampled during the profile run.

This makes it possible to see samples from multiple threads when
viewing a perfcore file with Profiler. This is extremely cool! :^)
2021-03-02 22:38:06 +01:00
Andreas Kling
ea500dd3e3 Kernel: Start work on full system profiling :^)
The superuser can now call sys$profiling_enable() with PID -1 to enable
profiling of all running threads in the system. The perf events are
collected in a global PerformanceEventBuffer (currently 32 MiB in size.)

The events can be accessed via /proc/profile
2021-03-02 22:38:06 +01:00
Ben Wiederhake
336303bda4 Kernel: Make kgettimeofday use AK::Time 2021-03-02 08:36:08 +01:00
Ben Wiederhake
05d5e3fad9 Kernel: Remove duplicative kgettimeofday(timeval&) function 2021-03-02 08:36:08 +01:00
Ben Wiederhake
860a3bbce3 Kernel: Use default con/de-structors
This may seem like a no-op change, however it shrinks down the Kernel by a bit:
.text -432
.unmap_after_init -60
.data -480
.debug_info -673
.debug_aranges 8
.debug_ranges -232
.debug_line -558
.debug_str -308
.debug_frame -40

With '= default', the compiler can do more inlining, hence the savings.
I intentionally omitted some opportunities for '= default', because they
would increase the Kernel size.
2021-02-28 18:09:12 +01:00
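A minimal sketch of one difference '= default' makes, using two hypothetical types that are not taken from the Kernel sources. An empty user-provided constructor or destructor makes the type non-trivial, whereas the defaulted form keeps it trivial. This may be part of why the compiler can generate less code, though the commit itself only mentions inlining.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type: empty but user-provided special members.
struct UserProvided {
    UserProvided() { }
    ~UserProvided() { }
    int value;
};

// Hypothetical type: the same members, explicitly defaulted.
struct Defaulted {
    Defaulted() = default;
    ~Defaulted() = default;
    int value;
};

// The user-provided versions are never trivial, even with empty bodies.
static_assert(!std::is_trivially_default_constructible_v<UserProvided>);
static_assert(!std::is_trivially_destructible_v<UserProvided>);

// The defaulted versions stay trivial.
static_assert(std::is_trivially_default_constructible_v<Defaulted>);
static_assert(std::is_trivially_destructible_v<Defaulted>);
```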
Andreas Kling
69a30f95cc Ext2FS: Make block list flushing a bit less aggressive
We don't need to flush the on-disk inode struct multiple times while
writing out its block list. Just mark the in-memory Inode as having
dirty metadata and the SyncTask will flush it eventually.
2021-02-26 18:24:40 +01:00
Andreas Kling
c3a0fd4b7a Ext2FS: Move block list computation from Ext2FS to Ext2FSInode
Since the inode is the logical owner of its block list, let's move the
code that computes the block list there, and also stop hogging the FS
lock while we compute the block list, as there is no need for it.
2021-02-26 18:14:02 +01:00
Andreas Kling
c09921b9be Ext2FS: Don't hog FS lock while reading/writing inodes
There are two locks in the Ext2FS implementation:

* The FS lock (Ext2FS::m_lock)
  This governs access to the superblock, block group descriptors,
  and the block & inode bitmap blocks. It's held while allocating
  or freeing blocks/inodes.

* The inode lock (Ext2FSInode::m_lock)
  This governs access to the inode metadata, including the block
  list, and to the content data as well. It's held while doing
  basically anything with the inode.

Once an on-disk block/inode is allocated, it logically belongs
to the in-memory Inode object, so there's no need for the FS lock
to be taken while manipulating them; the inode lock is all you need.

This dramatically reduces the impact of disk I/O on path resolution
and various other things that look at individual inodes.
2021-02-26 17:57:38 +01:00
Andreas Kling
c7c63727bf Ext2FS: Remove unnecessary locking in find_block_containing_inode()
This is just a bunch of index math based on immutable values in the
super block and block group descriptor. No need to lock here!
2021-02-26 17:24:39 +01:00
Andreas Kling
81e3ea29c3 Ext2FS: Remove unnecessary lock in Ext2FS::write_ext2_node()
Now that writing to the underlying storage is serialized, we don't
need to take the FS lock when writing out an inode struct.
2021-02-26 17:23:46 +01:00
Andreas Kling
dcc5b7397f Kernel: Take FS lock in BlockBasedFS during seek/read/write operations
Since these filesystems operate on an underlying file descriptor
and rely on its offset for correctness, let's use the FS lock to
serialize these operations.

This also means that FS subclasses can rely on block-level read/write
operations being atomic.
2021-02-26 17:15:32 +01:00
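The idea in the commit above can be sketched as follows. This is an illustration under assumptions, not the Kernel's code: SketchFile and SketchBlockFS are hypothetical names, and std::mutex stands in for the FS lock. Because a block read is a seek followed by a read on a shared offset, both steps are done under one lock.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a file description: reads depend on a shared offset.
class SketchFile {
public:
    explicit SketchFile(std::vector<uint8_t> data)
        : m_data(std::move(data))
    {
    }

    void seek(size_t offset) { m_offset = offset; }

    size_t read(uint8_t* out, size_t count)
    {
        if (m_offset >= m_data.size())
            return 0;
        size_t n = std::min(count, m_data.size() - m_offset);
        std::memcpy(out, m_data.data() + m_offset, n);
        m_offset += n;
        return n;
    }

private:
    std::vector<uint8_t> m_data;
    size_t m_offset { 0 };
};

// Hypothetical block layer: seek and read happen under one lock, so no other
// thread can move the offset between the two steps.
class SketchBlockFS {
public:
    SketchBlockFS(SketchFile file, size_t block_size)
        : m_file(std::move(file))
        , m_block_size(block_size)
    {
    }

    // Returns true only if a whole block was read.
    bool read_block(size_t index, uint8_t* out)
    {
        assert(out);
        std::lock_guard<std::mutex> locker(m_lock);
        m_file.seek(index * m_block_size);
        return m_file.read(out, m_block_size) == m_block_size;
    }

private:
    std::mutex m_lock;
    SketchFile m_file;
    size_t m_block_size;
};
```

With this shape, callers see each block-level read as a single atomic step, which is the property the commit says FS subclasses can rely on.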
Andreas Kling
65e083ed36 Revert "Ext2FS: Don't reload already-cached block list when freeing inode"
This reverts commit 1e737a5c50.

The cached block list does not include meta-blocks, so we'd end up
leaking those. There's definitely a nice way to avoid work here, but it
turns out it wasn't quite this trivial. Reverting for now.
2021-02-26 14:57:00 +01:00
Andreas Kling
1e737a5c50 Ext2FS: Don't reload already-cached block list when freeing inode
If we already have a cached copy of the inode's block list, we can use
that to free the blocks. No need to reload the list.
2021-02-26 14:05:18 +01:00
Andreas Kling
1f9409a658 Ext2FS: Inode allocation improvements
This patch combines the scan for an available inode with the
updating of the bit in the inode bitmap into a single operation.

We also exit the scan immediately when we find an inode, instead of
continuing until we've scanned all the eligible groups(!)

Finally, we stop holding the filesystem lock throughout the entire
operation, and instead only take it while actually necessary
(during inode allocation, flush, and inode cache update.)
2021-02-26 14:05:18 +01:00
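A rough sketch of the "scan and mark in one operation" idea from the commit above. The function name and the plain byte-vector bitmap are assumptions for illustration; the real code works on on-disk bitmap blocks per block group.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Finds the first clear bit, sets it, and returns its index right away.
// Returns std::nullopt if all bit_count bits are already set.
std::optional<size_t> allocate_first_free(std::vector<uint8_t>& bitmap, size_t bit_count)
{
    assert(bitmap.size() * 8 >= bit_count);
    for (size_t i = 0; i < bit_count; ++i) {
        uint8_t mask = static_cast<uint8_t>(1u << (i % 8));
        if (!(bitmap[i / 8] & mask)) {
            bitmap[i / 8] |= mask; // mark as allocated in the same pass
            return i;              // stop scanning as soon as we have a hit
        }
    }
    return std::nullopt;
}
```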
Andreas Kling
19083fd760 Ext2FS: Propagate errors from more places
Improve a bunch of situations where we'd previously panic the kernel
on failure. We now propagate whatever error we had instead. Usually
that'll be EIO.
2021-02-26 14:05:18 +01:00
Andreas Kling
6352b4fd74 Ext2FS: Share some bitmap code between inode and block allocation
Both inode and block allocation operate on bitmap blocks and update
counters in the superblock and group descriptor.

Since we're here, also add some error propagation around this code.
2021-02-26 14:05:18 +01:00
Andreas Kling
5d180d1f99 Everywhere: Rename ASSERT => VERIFY
(...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED)

Since all of these checks are done in release builds as well,
let's rename them to VERIFY to prevent confusion, as everyone is
used to assertions being compiled out in release.

We can introduce a new ASSERT macro that is specifically for debug
checks, but I'm doing this wholesale conversion first since we've
accumulated thousands of these already, and it's not immediately
obvious which ones are suitable for ASSERT.
2021-02-23 20:56:54 +01:00
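To illustrate the distinction the commit above draws, here is a minimal sketch of a check that stays active in release builds. SKETCH_VERIFY and checked_divide are hypothetical names, not the macros used in the tree. The standard assert() from <cassert>, by contrast, expands to nothing when NDEBUG is defined.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical macro: evaluated and checked regardless of NDEBUG.
#define SKETCH_VERIFY(expr)                                                 \
    do {                                                                    \
        if (!(expr)) {                                                      \
            std::fprintf(stderr, "VERIFY failed: %s (%s:%d)\n", #expr,      \
                __FILE__, __LINE__);                                        \
            std::abort();                                                   \
        }                                                                   \
    } while (0)

// Example use: the check is part of the function in every build type.
int checked_divide(int a, int b)
{
    SKETCH_VERIFY(b != 0);
    return a / b;
}
```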
Brian Gianforcaro
2139e0a201 Kernel: Handle overflow in FileDescription::seek(, SEEK_CUR) 2021-02-21 17:12:01 +01:00
Brian Gianforcaro
cbd8f78cce Kernel: Use uniform initialization instead of memset for a few stack buffers.
Raw memset is relatively easy to mess up, avoid it when there are
better alternatives provided by the compiler in modern C++.
2021-02-21 11:52:47 +01:00
Brian Gianforcaro
7c950c2d01 Kernel: Use ByteBuffer::zero_fill() instead of raw memset in Ext2
There was a typo in one of the memsets, use the type safe wrapper instead.

Fix EXt
2021-02-21 11:52:47 +01:00
Andreas Kling
fdf03852c9 Kernel: Slap UNMAP_AFTER_INIT on a whole bunch of functions
There's no real system here, I just added it to various functions
that I don't believe we ever want to call after initialization
has finished.

With these changes, we're able to unmap 60 KiB of kernel text
after init. :^)
2021-02-19 20:23:05 +01:00
Andreas Kling
37d8faf1b4 ProcFS: Fix /proc/PID/* hardening bypass
This enabled trivial ASLR bypass for non-dumpable programs by simply
opening /proc/PID/vm before exec'ing.

We now hold the target process's ptrace lock across the refresh/write
operations, and deny access if the process is non-dumpable. The lock
is necessary to prevent a TOCTOU race on Process::is_dumpable() while
the target is exec'ing.

Fixes #5270.
2021-02-19 09:46:36 +01:00
Andreas Kling
6c2f0316d9 Kernel: Convert snprintf() => String::formatted()/number() 2021-02-17 16:37:11 +01:00
Brian Gianforcaro
ddd79fe2cf Kernel: Add WaitQueue::wait_forever and use it for all infinite waits.
In preparation for marking BlockingResult [[nodiscard]], there are a few
places that perform infinite waits, where we never observe the result of
the wait. Instead of suppressing them, add an alternate function which
returns void when performing an infinite wait.
2021-02-15 08:28:57 +01:00
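The pattern described above can be sketched like this. SketchWaitQueue and WaitResult are hypothetical names, and std::condition_variable stands in for the Kernel's blocking machinery. The timed wait returns a [[nodiscard]] result, while the infinite wait returns void because it can only end one way.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <optional>

enum class WaitResult {
    Woken,
    TimedOut,
};

// Hypothetical wait queue for illustration only.
class SketchWaitQueue {
public:
    void wake_one()
    {
        {
            std::lock_guard<std::mutex> locker(m_mutex);
            ++m_pending_wakes;
        }
        m_condition.notify_one();
    }

    // Callers must look at the result: a timed wait can end in two ways.
    [[nodiscard]] WaitResult wait_on(std::optional<std::chrono::milliseconds> timeout)
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        auto woken = [this] { return m_pending_wakes > 0; };
        if (timeout.has_value()) {
            if (!m_condition.wait_for(lock, *timeout, woken))
                return WaitResult::TimedOut;
        } else {
            m_condition.wait(lock, woken);
        }
        --m_pending_wakes;
        return WaitResult::Woken;
    }

    // An infinite wait cannot time out, so there is nothing to return.
    void wait_forever()
    {
        [[maybe_unused]] auto result = wait_on(std::nullopt);
        assert(result == WaitResult::Woken);
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_condition;
    int m_pending_wakes { 0 };
};
```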
Andreas Kling
8415866c03 Kernel: Remove user/kernel flags from Region
Now that we no longer need to support the signal trampolines being
user-accessible inside the kernel memory range, we can get rid of the
"kernel" and "user-accessible" flags on Region and simply use the
address of the region to determine whether it's kernel or user.

This also tightens the page table mapping code, since it can now set
user-accessibility based solely on the virtual address of a page.
2021-02-14 01:34:23 +01:00
Jean-Baptiste Boric
9ce0639383 Kernel: Use divide_rounded_up inside write_block_list_for_inode 2021-02-13 19:56:49 +01:00
Jean-Baptiste Boric
869b33d6dd Kernel: Support triply indirect blocks for BlockListShape computation 2021-02-13 19:56:49 +01:00
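As a rough illustration of what a block list shape computation involves, the sketch below splits a block count into direct, singly, doubly and triply indirect ranges and counts the pointer ("meta") blocks needed. It assumes the standard ext2 layout of 12 direct block pointers. The struct, its field names and the helper are hypothetical and may not match the Kernel's BlockListShape.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical shape description; field names are assumptions.
struct SketchBlockListShape {
    size_t direct_blocks { 0 };
    size_t indirect_blocks { 0 };
    size_t doubly_indirect_blocks { 0 };
    size_t triply_indirect_blocks { 0 };
    size_t meta_blocks { 0 };
};

constexpr size_t divide_rounded_up(size_t a, size_t b)
{
    return (a + b - 1) / b;
}

SketchBlockListShape compute_shape(size_t blocks, size_t entries_per_block)
{
    assert(entries_per_block > 0);
    constexpr size_t direct_count = 12; // standard ext2 layout
    SketchBlockListShape shape;

    shape.direct_blocks = std::min(blocks, direct_count);
    size_t remaining = blocks - shape.direct_blocks;

    shape.indirect_blocks = std::min(remaining, entries_per_block);
    remaining -= shape.indirect_blocks;
    if (shape.indirect_blocks > 0)
        shape.meta_blocks += 1; // the singly indirect block itself

    size_t per_doubly = entries_per_block * entries_per_block;
    shape.doubly_indirect_blocks = std::min(remaining, per_doubly);
    remaining -= shape.doubly_indirect_blocks;
    if (shape.doubly_indirect_blocks > 0) {
        shape.meta_blocks += 1; // the doubly indirect block
        shape.meta_blocks += divide_rounded_up(shape.doubly_indirect_blocks, entries_per_block);
    }

    size_t per_triply = per_doubly * entries_per_block;
    shape.triply_indirect_blocks = std::min(remaining, per_triply);
    if (shape.triply_indirect_blocks > 0) {
        shape.meta_blocks += 1; // the triply indirect block
        shape.meta_blocks += divide_rounded_up(shape.triply_indirect_blocks, per_doubly);
        shape.meta_blocks += divide_rounded_up(shape.triply_indirect_blocks, entries_per_block);
    }
    return shape;
}
```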
Ben Wiederhake
46e5890152 Kernel: Add forgotten 'const' flag 2021-02-13 00:40:31 +01:00
Andreas Kling
0a45cfee01 DevFS: Use strongly typed InodeIndex
Also add an assertion for the DevFS inode index allocator overflowing.
2021-02-12 16:24:40 +01:00
Andreas Kling
ffa39f98e8 Kernel: Fix build with BBFS_DEBUG 2021-02-12 13:51:34 +01:00
Andreas Kling
c62c00e7db Ext2FS: Make Ext2FS::GroupIndex a distinct integer type 2021-02-12 13:33:58 +01:00
Andreas Kling
489317e573 Kernel: Make BlockBasedFS::BlockIndex a distinct integer type 2021-02-12 11:59:27 +01:00
Andreas Kling
e44c1792a7 Kernel: Add distinct InodeIndex type
Use the DistinctNumeric mechanism to make InodeIndex a strongly typed
integer type.
2021-02-12 10:26:29 +01:00
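A minimal sketch of what a strongly typed integer buys, assuming a simplified wrapper. SketchDistinct is a hypothetical template and is much smaller than the real DistinctNumeric mechanism. The aliases reuse the names from the commits for readability only.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical wrapper: same representation as T, but a distinct type per Tag.
template<typename T, typename Tag>
class SketchDistinct {
public:
    constexpr explicit SketchDistinct(T value)
        : m_value(value)
    {
    }

    constexpr T value() const { return m_value; }

    constexpr bool operator==(SketchDistinct const& other) const
    {
        return m_value == other.m_value;
    }

private:
    T m_value;
};

struct InodeIndexTag;
struct BlockIndexTag;
using InodeIndex = SketchDistinct<uint64_t, InodeIndexTag>;
using BlockIndex = SketchDistinct<uint64_t, BlockIndexTag>;

// A raw integer does not silently become an index...
static_assert(!std::is_convertible_v<uint64_t, InodeIndex>);
// ...and one kind of index cannot be passed where another is expected.
static_assert(!std::is_convertible_v<InodeIndex, BlockIndex>);
```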
Andreas Kling
c8a90a31b6 Kernel: Remove default arguments from Inode::resolve_as_link()
Nobody was calling it without specifying all arguments anyway.
2021-02-12 09:06:03 +01:00
Andreas Kling
95064f8b58 Ext2FS: Convert #if EXT2_DEBUG => dbgln_if() and constexpr-if 2021-02-11 23:05:16 +01:00
Andreas Kling
a280cdf9ba Ext2FS: Shrink Ext2FSDirectoryEntry from 16 to 12 bytes
The way we read/write directories is very inefficient, and this doesn't
solve any of that. It does however reduce memory usage of directory
entry vectors by 25% which has nice immediate benefits.
2021-02-11 22:45:50 +01:00
Andreas Kling
1f277f0bd9 Kernel: Convert all *Builder::appendf() => appendff() 2021-02-09 19:18:13 +01:00
Andreas Kling
8bda30edd2 Kernel: Move memory statistics helpers from Process to Space 2021-02-08 22:23:29 +01:00
Andreas Kling
f1b5def8fd Kernel: Factor address space management out of the Process class
This patch adds Space, a class representing a process's address space.

- Each Process has a Space.
- The Space owns the PageDirectory and all Regions in the Process.

This allows us to reorganize sys$execve() so that it constructs and
populates a new Space fully before committing to it.

Previously, we would construct the new address space while still
running in the old one, and encountering an error meant we had to do
tedious and error-prone rollback.

Those problems are now gone, replaced by what's hopefully a set of much
smaller problems and missing cleanups. :^)
2021-02-08 18:27:28 +01:00
AnotherTest
09a43969ba Everywhere: Replace dbgln<flag>(...) with dbgln_if(flag, ...)
Replacement made by `find Kernel Userland -name '*.h' -o -name '*.cpp' | sed -i -Ee 's/dbgln\b<(\w+)>\(/dbgln_if(\1, /g'`
2021-02-08 18:08:55 +01:00
William Bowling
b97d23a71f Kernel: Use the resolved parent path when testing create veil (#5231) 2021-02-06 19:11:44 +01:00
Andreas Kling
4c0707e56c Kernel: Don't create a zero-length VLA in Ext2FS block list walk
Found by KUBSAN :^)
2021-02-05 21:23:11 +01:00
Andreas Kling
54d28df97d Kernel: Make /proc/PID/stacks/TID a JSON array
The contents of these files are now raw JSON arrays. We no longer
symbolicate the addresses. That's up to userspace from now on.
2021-02-04 22:55:39 +01:00
Andreas Kling
e1236dac3e Kernel: Check for off_t overflow in FileDescription::read/write
We were checking for size_t (unsigned) overflow but the current offset
is actually stored as off_t (signed). Fix this, and also fail with
EOVERFLOW correctly.
2021-02-03 10:54:35 +01:00
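The kind of check described above can be sketched as follows, assuming int64_t as a stand-in for off_t and a hypothetical function name. The point is that the limit comes from the signed offset type, not from the unsigned byte count.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <limits>

// Advances a signed file offset by an unsigned byte count.
// Returns 0 and writes the new offset, or EOVERFLOW if it would not fit.
int advance_offset(int64_t current, uint64_t byte_count, int64_t& new_offset)
{
    assert(current >= 0);
    constexpr int64_t max_offset = std::numeric_limits<int64_t>::max();
    // Compare against the remaining headroom so the addition itself never overflows.
    if (byte_count > static_cast<uint64_t>(max_offset - current))
        return EOVERFLOW;
    new_offset = current + static_cast<int64_t>(byte_count);
    return 0;
}
```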
Andreas Kling
9f05044c50 Kernel: Check for off_t overflow before reading/writing InodeFile
Let's double-check before calling the Inode. This way we don't have to
trust every Inode subclass to validate user-supplied inputs.
2021-02-03 10:51:37 +01:00
Andreas Kling
823186031d Kernel: Add a way to specify which memory regions can make syscalls
This patch adds sys$msyscall() which is loosely based on an OpenBSD
mechanism for preventing syscalls from non-blessed memory regions.

It works similarly to pledge and unveil: you can call it as many
times as you like, and when you're finished, you call it with a null
pointer and it will stop accepting new regions from then on.

If a syscall later happens and doesn't originate from one of the
previously blessed regions, the kernel will simply crash the process.
2021-02-02 20:13:44 +01:00
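The bookkeeping this implies might look roughly like the sketch below. The class and method names are hypothetical, and the sketch models only the allow-list, not the actual syscall path. In particular, how the Kernel treats processes that never use the mechanism is not covered here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical allow-list of memory ranges that may issue syscalls.
class SketchSyscallRegions {
public:
    // Blesses a range; refused once the list has been sealed.
    bool add_region(uintptr_t base, size_t size)
    {
        assert(size > 0);
        if (m_sealed)
            return false;
        m_regions.push_back({ base, size });
        return true;
    }

    // Models the "null pointer" call: no new regions from now on.
    void seal() { m_sealed = true; }

    // True if the instruction pointer lies inside a blessed range.
    bool may_syscall_from(uintptr_t ip) const
    {
        for (auto const& region : m_regions) {
            if (ip >= region.base && ip - region.base < region.size)
                return true;
        }
        return false;
    }

private:
    struct Range {
        uintptr_t base;
        size_t size;
    };
    std::vector<Range> m_regions;
    bool m_sealed { false };
};
```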
Andreas Kling
d4f40241f1 Ext2FS: Avoid unnecessary parent inode lookup during inode creation
Creation of new inodes is always driven by the parent inode, so we can
just refer directly to it instead of looking up the parent by ID.
2021-02-02 18:58:26 +01:00