This updates the profiling subsystem to use a separate timer to
trigger CPU sampling. This timer has a higher resolution (1000Hz)
and is independent from the scheduler. At a later time the
resolution could even be made configurable with an argument for
sys$profiling_enable() - but not today.
Modify the API so it's possible to propagate error on OOM failure.
NonnullOwnPtr<T> is not appropriate for the ThreadTracer::create() API,
so switch to OwnPtr<T>, use adopt_own_if_nonnull() to handle creation.
This patch modifies InodeWatcher to switch to a one watcher, multiple
watches architecture. The following changes have been made:
- The watch_file syscall is removed, and in its place the
create_iwatcher, iwatcher_add_watch and iwatcher_remove_watch calls
have been added.
- InodeWatcher now holds multiple WatchDescriptions for each file that
is being watched.
- The InodeWatcher file descriptor can be read from to receive events on
all watched files.
Co-authored-by: Gunnar Beutner <gunnar@beutner.name>
Previously we'd try to load ELF images which did not have
an interpreter set with an incorrect load offset of 0, i.e. way
outside of the part of the address space where we'd expect either
the dynamic loader or the user's executable to reside.
This fixes the problem by using get_load_offset for both executables
which have an interpreter set and those which don't. Notably this
allows us to actually successfully execute the Loader.so binary:
courage:~ $ /usr/lib/Loader.so
You have invoked `Loader.so'. This is the helper program for programs
that use shared libraries. Special directives embedded in executables
tell the kernel to load this program.
This helper program loads the shared libraries needed by the program,
prepares the program to run, and runs it. You do not need to invoke
this helper program directly.
courage:~ $
The current method of emitting performance events requires a bit of
boiler plate at every invocation, as well as having to ignore the
return code which isn't used outside of the perf event syscall. This
change attempts to clean that up by exposing high level API's that
can be used around the code base.
The fact that current_time can "fail" makes its use a bit awkward.
All callers in the Kernel are trusted besides syscalls, so assert
that they never get there, and make sure all current callers perform
validation of the clock_id with TimeManagement::is_valid_clock_id().
I have fuzzed this change locally for a bit to make sure I didn't
miss any obvious regression.
The error handling in all these cases was still using the old style
negative values to indicate errors. We have a nicer solution for this
now with KResultOr<T>. This change switches the interface and then all
implementers to use the new style.
Previously, TLS data was always zero-initialized.
To support initializing the values of TLS data, sys$allocate_tls now
receives a buffer with the desired initial data, and copies it to the
master TLS region of the process.
The DynamicLinker gathers the initial TLS image and passes it to
sys$allocate_tls.
We also now require the size passed to sys$allocate_tls to be
page-aligned, to make things easier. Note that this doesn't waste memory
as the TLS data has to be allocated in separate pages anyway.
For example Linux accepts an additional argument for flags in accept4()
that let the user specify what flags they want. However, by default
accept() should not inherit those flags from the listener socket.
Theoretically the append should never fail as we have in-line storage
of FD_SETSIZE, which should always be enough. However I'm planning on
removing the non-try variants of AK::Vector when compiling in kernel
mode in the future, so this will need to go eventually. I suppose it
also protects against some unforeseen bug where we we can append more
than FD_SETSIZE items.
Theoretically the append should never fail as we have in-line storage
of 2, which should be enough. However I'm planning on removing the
non-try variants of AK::Vector when compiling in kernel mode in the
future, so this will need to go eventually. I suppose it also protects
against some unforeseen bug where we we can append more than 2 items.
sys$purge() is a bit unique, in that it is probably in the systems
advantage to attempt to limp along if we hit OOM while processing
the vmobjects to purge. This change modifies the algorithm to observe
OOM and continue trying to purge any previously visited VMObjects.
This turns the perfcore format into more a log than it was before,
which lets us properly log process, thread and region
creation/destruction. This also makes it unnecessary to dump the
process' regions every time it is scheduled like we did before.
Incidentally this also fixes 'profile -c' because we previously ended
up incorrectly dumping the parent's region map into the profile data.
Log-based mmap support enables profiling shared libraries which
are loaded at runtime, e.g. via dlopen().
This enables profiling both the parent and child process for
programs which use execve(). Previously we'd discard the profiling
data for the old process.
The Profiler tool has been updated to not treat thread IDs as
process IDs anymore. This enables support for processes with more
than one thread. Also, there's a new widget to filter which
process should be displayed.
Userspace can provide a null argument for the `act` argument to the
`sigaction` syscall to not set any new behavior. This is described
here:
https://pubs.opengroup.org/onlinepubs/007904875/functions/sigaction.html
Without this fix, the `copy_from_user(...)` invocation on `user_act`
fails and makes the syscall return early.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
While profiling all processes the profile buffer lives forever.
Once you have copied the profile to disk, there's no need to keep it
in memory. This syscall surfaces the ability to clear that buffer.
This adds PT_PEEKDEBUG and PT_POKEDEBUG to allow for reading/writing
the debug registers, and updates the Kernel's debug handler to read the
new information from the debug status register.
Previously the dynamic loader would become unused after it had invoked
the program's entry function. However, in order to support exceptions
and - at a later point - dlfcn functionality we need to call back
into the dynamic loader at runtime.
Because the dynamic loader has a static copy of LibC it'll attempt to
invoke syscalls directly from its text segment. For this to work the
executable region for the dynamic loader needs to have syscalls enabled.
Reading from the mapping doesn't work when the text segment has a non-zero
offset because in that case the first mapped page doesn't contain the ELF
header.
This should provide some speed up, as currently searches for regions
containing a given address were performed in O(n) complexity, while
this container allows us to do those in O(logn).
This does not affect functionality right now, but it means that the
regions vector will now never have any overlapping regions, which will
allow the use of balance binary search trees instead of a vector in the
future. (since they require keys to be exclusive)
The end goal of this commit is to allow to boot on bare metal with no
PS/2 device connected to the system. It turned out that the original
code relied on the existence of the PS/2 keyboard, so VirtualConsole
called it even though ACPI indicated the there's no i8042 controller on
my real machine because I didn't plug any PS/2 device.
The code is much more flexible, so adding HID support for other type of
hardware (e.g. USB HID) could be much simpler.
Briefly describing the change, we have a new singleton called
HIDManagement, which is responsible to initialize the i8042 controller
if exists, and to enumerate its devices. I also abstracted a bit
things, so now every Human interface device is represented with the
HIDDevice class. Then, there are 2 types of it - the MouseDevice and
KeyboardDevice classes; both are responsible to handle the interface in
the DevFS.
PS2KeyboardDevice, PS2MouseDevice and VMWareMouseDevice classes are
responsible for handling the hardware-specific interface they are
assigned to. Therefore, they are inheriting from the IRQHandler class.
Previously, Process::do_write would error if the O_APPEND flag was set
on a non-seekable file.
Other systems (such as Linux) seem to be OK with doing this, so we now
do not attempt to seek to the end the file if it's not seekable.