Enter a critical section in Processor::exit_trap so that processing
SMP messages doesn't enable interrupts upon leaving. We need to delay
this until the end where we call into the Scheduler if exiting the
trap results in being outside of a critical section and irq handler.
Co-authored-by: Tom <tomut@yahoo.com>
First off: unregister the region from MemoryManager before unmapping it.
The order of operations here was a bit strange, presumably to avoid a
situation where a fault would happen while unmapping, and the fault
handler would find the MemoryManager region list in an invalid state.
Unregistering it before unmapping sidesteps the whole problem, and
allows us to easily fix another problem: a deadlock could occur due
to inconsistent acquisition order (PageDirectory must come before MM.)
We don't want to be holding the MM lock if it's a user region and we
have to consult the page directory, since that can lead to a deadlock
if we don't already have the page directory lock.
- Use the receiver's per-CPU entry in the message, instead of the
sender's. (Using the sender's entry wasn't safe for broadcast
messages since the same entry ended up on multiple message queues.)
- Retry the CAS until it *succeeds* instead of *fails*. This closes a
race window, and also ensures a correct return value. The return value
is used by the caller to decide whether to broadcast an IPI.
This was the main reason smp=on was so slow. We had CPUs busy-waiting
until someone else triggered an IPI and moved things along.
- Add a CPU pause hint to the spin loop. :^)
It may happen that CPU A manages to page in from the same inode
while we're just entering the same page fault handler on CPU B.
Handle it gracefully by checking if the data has already been paged in
(instead of VERIFY'ing that it hasn't) and then remap the page if that's
the case.
Due to a boolean mistake in smp_return_to_pool(), we didn't retry
pushing the message onto the freelist after a failed attempt.
This caused the message pool to eventually become completely empty
after enough contentious access attempts.
This patch also adds a pause hint to the CPU in the failed attempt
code path.
The kernel panic handler now parses the kernels boot_mode to decide how
to handle the panic. So the previous logic could end up in an panic loop
until we blew out the kernel stack.
Instead only validate the kernel's boot mode once per boot, after
initializing the kernel command line.
Taking a reference or a pointer to a value that's not aligned properly
is undefined behavior. While `[[gnu::packed]]` ensures that reads from
and writes to fields of packed structs is a safe operation, the
information about the reduced alignment is lost when creating pointers
to these values.
Weirdly enough, GCC's undefined behavior sanitizer doesn't flag these,
even though the doc of `-Waddress-of-packed-member` says that it usually
leads to UB. In contrast, x86_64 Clang does flag these, which renders
the 64-bit kernel unable to boot.
For now, the `address-of-packed-member` warning will only be enabled in
the kernel, as it is absolutely crucial there because of KUBSAN, but
might get excessively noisy for the userland in the future.
Also note that we can't append to `CMAKE_CXX_FLAGS` like we do for other
flags in the kernel, because flags added via `add_compile_options` come
after these, so the `-Wno-address-of-packed-member` in the root would
cancel it out.
Kernels built with Clang seem to be quite allocation-heavy compared to
their GCC counterparts. We would sometimes end up crashing during boot
because the eternal ranges had no free capacity.
The Clang error message reads like this (`-Wdeprecated-array-compare`):
> error: comparison between two arrays is deprecated; to compare
> array addresses, use unary '+' to decay operands to pointers.
The maximum number of virtual consoles is determined at compile time,
so we can pre-allocate that many slots, dodging some heap allocations.
Furthermore, virtual consoles are never destroyed, so it's fine to
simply store a raw pointer to the currently active one.
This commit implements the ISO 9660 filesystem as specified in ECMA 119.
Currently, it only supports the base specification and Joliet or Rock
Ridge support is not present. The filesystem will normalize all
filenames to be lowercase (same as Linux).
The filesystem can be mounted directly from a file. Loop devices are
currently not supported by SerenityOS.
Special thanks to Lubrsi for testing on real hardware and providing
profiling help.
Co-Authored-By: Luke <luke.wilde@live.co.uk>
I had to move the thread list out of the protected base area of Process
so that it could live with its lock (which needs to be mutable).
Ideally it would live in the protected area, so maybe we can figure out
a way to do that later.
When I laid down the foundation for the start of the big process lock
separation, I added asserts to all system call implementations to
validate we hold the big process lock in the locations we think we
should be.
Adding that assert to sys$profiling_enable broke boot time profiling as
we were never holding the lock on boot. Even though it's not technically
required, lets make sure to hold the lock while enabling to appease the
assert.
VirtualConsole::m_lock was only used in a single place: on_tty_write()
That function is already protected by a global lock, so this second
lock served no purpose whatsoever.
Note: TCPSocket::create_client() has a dubious locking process where
the sockets by tuple table is first shared lock to check if the socket
exists and bail out if it does, then unlocks, then exclusively locks to
add the tuple. There could be a race condition where two client
creation requests for the same tuple happen at the same time and both
cleared the shared lock check. When in doubt, lock exclusively the
whole time.