Fix gracefully failing these calls if used within IRQ handlers. If we're
handling IRQs, we need to handle these failures first, because we can't
really resolve page faults in a meaningful way. But if we know that it
was one of these functions that failed, then we can gracefully handle
the situation.
This solves a crash where the Scheduler attempts to produce backtraces
in the timer irq, some of which cause faults.
Fixes#3492
Since the CPU already does almost all necessary validation steps
for us, we don't really need to attempt to do this. Doing it
ourselves doesn't really work very reliably, because we'd have to
account for other processors modifying virtual memory, and we'd
have to account for e.g. pages not being able to be allocated
due to insufficient resources.
So change the copy_to/from_user (and associated helper functions)
to use the new safe_memcpy, which will return whether it succeeded
or not. The only manual validation step needed (which the CPU
can't perform for us) is making sure the pointers provided by user
mode aren't pointing to kernel mappings.
To make it easier to read/write from/to either kernel or user mode
data add the UserOrKernelBuffer helper class, which will internally
either use copy_from/to_user or directly memcpy, or pass the data
through directly using a temporary buffer on the stack.
Last but not least we need to keep syscall params trivial as we
need to copy them from/to user mode using copy_from/to_user.
These special functions can be used to safely copy/set memory or
determine the length of a string, e.g. provided by user mode.
In the event of a page fault, safe_memcpy/safe_memset will return
false and safe_strnlen will return -1.
I decided to modify MappedROM.h because all other entried in Forward.h
are also classes, and this is visually more pleasing.
Other than that, it just doesn't make any difference which way we resolve
the conflicts.
Since "rings" typically refer to code execution and user processes
can also execute in ring 0, rename these functions to more accurately
describe what they mean: kernel processes and user processes.
The ring is determined based on the CS register. This fixes crashes
being handled as ring 3 crashes even though EIP/CS clearly showed
that the crash happened in the kernel.
In addition to being the proper POSIX etiquette, it seems like a bad idea
for issues like the one seen in #3428 to result in a kernel crash. This patch
replaces the current behavior of failing on insufficient buffer size to truncating
SOCK_RAW messages to the buffer size. This will have to change if/when MSG_PEEK
is implemented, but for now this behavior is more compliant and logical than
just bailing.
By being a bit too greedy and only allocating how much we need for
the failing allocation, we can end up in an infinite loop trying
to expand the heap further. That's because there are other allocations
(e.g. logging, vmobjects, regions, ...) that happen before we finally
retry the failed allocation request.
Also fix allocating in page size increments, which lead to an assertion
when the heap had to grow more than the 1 MiB backup.
Rather than trying to find a contiguous set of bits of size 1, just
find one single available bit using a hint.
Also, try to randomize returned physical pages a bit by placing them
into a 256 entry queue rather than making them available immediately.
Then, once the queue is filled, pick a random one, make it available
again and use that slot for the latest page to be returned.
In c3d231616c we added the atomic variable
m_have_any_unmasked_pending_signals tracking the state of pending signals.
Add helper functions that automatically update this variable as needed.
We need to wait until a thread is fully set up and ready for running
before attempting to deliver a signal. Otherwise we may not have a
user stack yet.
Also, remove the Skip0SchedulerPasses and Skip1SchedulerPass thread
states that we don't really need anymore with software context switching.
Fixes the kernel crash reported in #3419
Previously, it was kept as just a time_t and the sub-second
offset was inferred from the monotonic clock. This means that
sub-second time adjustments were ignored.
Now that `ntpquery -s` can pass in a time with sub-second
precision, it makes sense to keep time at that granularity
in the kernel.
After this, `ntpquery -s` immediately followed by `ntpquery` shows
an offset of 0.02s (that is, on the order of network roundtrip time)
instead of up to 0.75s previously.
Instead of FileDescriptor branching on the type of File it's wrapping,
add a File::stat() function that can be overridden to provide custom
behavior for the stat syscalls.
Sometimes a physical underlying page may be there, but we may be
unable to allocate a page table that may be needed to map it. Bubble
up such mapping errors so that they can be handled more appropriately.
It may be impossible to allocate more backup memory after expanding
the heap if memory is running low. In that case we wouldn't allocate
backup memory until trying to expand the heap again. But we also
wouldn't take advantage of using removed memory as backup, which means
that no backup memory would be available when the heap needs to grow
again, causing subsequent expansion to fail because there is no
backup memory.
The process of expanding memory requires allocations and deallocations
on the heap itself. So, while we're trying to expand the heap, don't
remove memory just because we might briefly not need it. Also prevent
recursive expansion attempts.
If allocating a page table triggers purging memory, we need to call
quickmap_pd again to make sure the underlying physical page is
remapped to the correct one. This is needed because purging itself
may trigger calls to ensure_pte as well.
Fixes#3370
When cloning a purgeable memory region (which happens on fork),
we need to preserve the "was purged" and "volatile" state of the
original region, or they will always appear as non-volatile and
unpurged regions in the child process.
Fixes#3374.
The exit condition for the loop was sizeof(m_features) * 8,
which was 32. Presumably this was supposed to mean 32 bits, but it
actually made it stop as soon as it reached the 6th bit.
Also add detection for more SIMD CPU features.
Add an ExpandableHeap and switch kmalloc to use it, which allows
for the kmalloc heap to grow as needed.
In order to make heap expansion to work, we keep around a 1 MiB backup
memory region, because creating a region would require space in the
same heap. This means, the heap will grow as soon as the reported
utilization is less than 1 MiB. It will also return memory if an entire
subheap is no longer needed, although that is rarely possible.