Either we mount from a loop device or other source, the user might want
to obfuscate the given source for security reasons, so this option will
ensure this will happen.
If passed during a mount, the source will be hidden when reading from
the /sys/kernel/df node.
Similarly to DevPtsFS, this filesystem is about exposing loop device
nodes easily in /dev/loop, so userspace doesn't need to do anything in
order to use new devices immediately.
Instead of returning a raw pointer, which could be technically invalid
when using it in the caller function, we return a valid RefPtr of such
device.
This ensures that the code in DevPtsFS is now safe from a rare race
condition in which the SlavePTY device is gone but we still have a
pointer to it.
This device is a block device that allows a user to effectively treat an
Inode as a block device.
The static construction method is given an OpenFileDescription reference
but validates that:
- The description has a valid custody (so it's not some arbitrary file).
Failing this requirement will yield EINVAL.
- The description custody points to an Inode which is a regular file, as
we only support (seekable) regular files. Failing this requirement
will yield ENOTSUP.
LoopDevice can be used to mount a regular file on the filesystem like
other supported types of (physical) block devices.
Such operation is almost equivalent to writing on an Inode, so lock the
Inode m_inode_lock exclusively.
All FileSystem Inode implementations then override a new method called
truncate_locked which should implement the actual truncating.
SysFS, ProcFS and DevPtsFS were all sending filetype 0 when traversing
their directories, but it is actually very easy to send proper filetypes
in these filesystems.
This patch binds all RAM backed filesystems to use only one enum for
their internal filetype, to simplify the implementation and allow
sharing of code.
Please note that the Plan9FS case is currently not solved as I am not
familiar with this filesystem and its constructs.
The ProcFS mostly keeps track of the filetype, and a fix was needed for
the /proc root directory - all processes exhibit a directory inside it
which makes it very easy to hardcode the directory filetype for them.
There's also the `self` symlink inode which is now exposed as DT_LNK.
As for SysFS, we could leverage the fact everything inherits from the
SysFSComponent class, so we could have a virtual const method to return
the proper filetype.
Most of the files in SysFS are "regular" files though, so the base class
has a non-pure virtual method.
Lastly, the DevPtsFS simply hardcodes '.' and '..' as directory file
type, and everything else is hardcoded to send the character device file
type, as this filesystem is only exposing character pts device files.
This commit adds minimal support for compiler-instrumentation based
memory access sanitization.
Currently we only support detection of kmalloc redzone accesses, and
kmalloc use-after-free accesses.
Support for inline checks (for improved performance), and for stack
use-after-return and use-after-return detection is left for future PRs.
In a bunch of cases, this actually ends up simplifying the code as
to_number will handle something such as:
```
Optional<I> opt;
if constexpr (IsSigned<I>)
opt = view.to_int<I>();
else
opt = view.to_uint<I>();
```
For us.
The main goal here however is to have a single generic number conversion
API between all of the String classes.
RAMFS was passing 0, which lead to the userspace seeing all entries as
DT_UNKNOWN when iterating over the directory contents.
To repro prior to this commit, simply check `echo /tmp/*/`.
When writing to /sys/kernel/request_panic it will do a kernel panic.
Trying to truncate the node will result in kernel panic with a slightly
different message.
When the FileSystem does a sync, it gathers up all the inodes with
dirty metadata into a vector. The inode mutex is not held while
checking the inode dirty bit, which can lead to a kernel panic
due to concurrent inode modifications.
Fixes: #21796
Previously, attempting to update an ext2 inode with a UID or GID
larger than 65535 would overflow. We now write the high bits of UIDs
and GIDs to the same place that Linux does within the `osd2` struct.
We should consider whether the selected Thread is within the same jail
or not.
Therefore let's make it clear to callers with jail semantics if a called
method checks if the desired Thread object is within the same jail.
As for Thread::for_each_* methods, currently nothing in the kernel
codebase needs iteration with consideration for jails, so the old
Thread::for_each* were simply renamed to include "ignoring_jails" suffix
in their names.
The name for this directory is a bit awkward. Also, the distinction of
constant information is not really valuable as I thought it would be, so
let's bring that information back into the /sys/kernel directory.
The name "variables" is a bit awkward and what the directory entries are
really about is kernel configuration so let's make it clear with the new
name.
We first must flush the superblock through the BlockBasedFileSystem
methods properly and only then clear the DiskCache pointer, to prevent a
possible kernel panic due to nullptr dereference.
Using the kernel stack is preferable, especially when the examined
strings should be limited to a reasonable length.
This is a small improvement, because if we don't actually move these
strings then we don't need to own heap allocations for them during the
syscall handler function scope.
In addition to that, some kernel strings are known to be limited, like
the hostname string, for these strings we also can use FixedStringBuffer
to store and copy to and from these buffers, without using any heap
allocations at all.
Instead, use the FixedCharBuffer class to ensure we always use a static
buffer storage for these names. This ensures that if a Process or a
Thread were created, there's a guarantee that setting a new name will
never fail, as only copying of strings should be done to that static
storage.
The limits which are set are 32 characters for processes' names and 64
characters for thread names - this is because threads' names could be
more verbose than processes' names.
Previously we could get a raw pointer to a Mount object which might be
invalid when actually dereferencing it.
To ensure this could not happen, we should just use a callback that will
be used immediately after finding the appropriate Mount entry, while
holding the mount table lock.
We don't really need this method anymore, because we could just try to
find the mount entry based on the given mount point host custody.
This also allows us to remove the is_vfs_root and root_inode_id methods
from the VirtualFileSystem class.
We could easily encounter a case where we do the following:
```
mkdir -p /tmp2
mount /dev/hda /tmp2
```
would produce a bug that doing `ls /tmp2/tmp2` will give the contents
on `/dev/hda` ext2 root directory and also on `/tmp2/tmp2/tmp2` and so
on.
To prevent this, we must compare the current custody against each mount
entry's custody to ensure their paths match.
This is not useful, as we have literally zero knowledge about where this
inode is actually located at with respect to the entire global path tree
so we could easily encounter a case where we do the following:
```
mkdir -p /tmp2
mount /dev/hda /tmp2
```
and when traversing the /tmp2 directory entries, we will see the root
inode of /dev/hda on "/tmp2/tmp2", even if it was not mounted.
Therefore, we should just plainly give the raw directory entries as they
are written "on the disk". Anything else that needs to exactly know if
there's an underlying mounted filesystem, can just use the stat syscall
instead.
This ensures that the host mount point custody path is not the same like
the new to-be-mounted custody.
A scenario that could happen before adding this check is:
```
mkdir -p /tmp2
mount /dev/hda /tmp2/
mount /dev/hda /tmp2/
mount /dev/hda /tmp2/ # this will fail here
```
and after adding this check, the following scenario is now this:
```
mkdir -p /tmp2
mount /dev/hda /tmp2/
mount /dev/hda /tmp2/ # this will fail here
mount /dev/hda /tmp2/ # this will fail here too
```
Since this is the block size that file system drivers *should* set,
let's name it the logical block size, just like most file systems such
as ext2 already do anyways.
This never was a logical block size, it always was a device specific
block size. Ideally the block size would change in accordance to
whatever the driver wants to use, but that is a change for the future.
For now, let's get rid of this confusing naming.