When we get a COW fault and discover that whoever we were COW'ing
together with has either COW'ed that page on their end, or has
unmapped/exited, we simplify life for ourselves by clearing the COW
bit and keeping the page we already have. (No need to COW if the page
is not shared!)
The act of doing this does not return a committed page to the pool.
In fact, that committed page we had reserved for this purpose was used
up (allocated) by our COW buddy when they COW'ed the page.
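In sketch form, the fast path looks something like this (the helper
names here are assumptions, not the exact kernel API):

    // If the physical page is no longer shared, skip the copy entirely.
    PageFaultResponse Region::handle_cow_fault(size_t page_index)
    {
        auto& page = physical_page_slot(page_index);
        if (page->ref_count() == 1) {
            // Our COW buddy already copied or unmapped their side;
            // clear the COW bit and keep writing to the page we have.
            set_should_cow(page_index, false);
            remap_vmobject_page(page_index);
            return PageFaultResponse::Continue;
        }
        // Otherwise, take a page from our committed reservation and copy.
        return copy_page_and_remap(page_index);
    }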
This fixes a kernel panic when running TestLibCMkTemp. :^)
This makes a kernel panic immediately fail the on-target CI job.
Otherwise the failed job looks like a test timeout unless one digs into
the details of the job.
We don't need an entirely separate VMObject subclass to influence the
location of the physical pages.
Instead, we simply allocate enough physically contiguous memory first,
and then pass it to the AnonymousVMObject constructor that takes a span
of physical pages.
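Roughly, the call site becomes something like this (the allocator and
factory names are assumptions):

    // Allocate the contiguous physical pages first...
    auto physical_pages = MM.allocate_contiguous_physical_pages(size / PAGE_SIZE);
    // ...then hand them to the general-purpose constructor instead of
    // routing through a dedicated VMObject subclass.
    auto vmobject = AnonymousVMObject::create_with_physical_pages(physical_pages.span());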
VMObject already has an IntrusiveList of all the Regions that map it.
We were keeping a counter in addition to this, and only using it in
a single place to avoid iterating over the list when it had only
one entry.
Simplify VMObject by removing this counter and always iterating the
list even if there's only 1 entry. :^)
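The one call site that used the counter now just walks the (usually
one-entry) list, roughly:

    // Iterating a one-entry IntrusiveList is as cheap as the special
    // case the counter was guarding.
    for (auto& region : m_regions)
        region.remap();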
This was previously used for a single debug logging statement during
memory purging. There are no remaining users of this weak pointer,
so let's get rid of it.
If a purgeable VM object is in the "volatile" state when we're asked
to make a COW clone of it, make life simpler by just "purging"
the cloned object right away.
This effectively means that a fork()'ed child process will discover
its purgeable+volatile regions to be empty if/when it tries making
them non-volatile.
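Inside the clone path this is only a couple of lines, sketched here
with assumed member names:

    if (is_purgeable() && is_volatile()) {
        // Hand out an already-"purged" clone; the child will observe
        // empty pages if/when it makes the region non-volatile.
        clone->m_volatile = true;
        clone->m_was_purged = true;
        return clone;
    }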
This patch changes the semantics of purgeable memory.
- AnonymousVMObject now has a "purgeable" flag. It can only be set when
constructing the object. (Previously, all anonymous memory was
effectively purgeable.)
- AnonymousVMObject now has a "volatile" flag. It covers the entire
range of physical pages. (Previously, we tracked ranges of volatile
pages, effectively making it a page-level concept.)
- Non-volatile objects maintain a physical page reservation via the
committed pages mechanism, to ensure full coverage for page faults.
- When an object is made volatile, it relinquishes any unused committed
pages immediately. If later made non-volatile again, we then attempt
to make a new committed pages reservation. If this fails, we return
ENOMEM to userspace.
mmap() now creates purgeable objects if passed the MAP_PURGEABLE option
together with MAP_ANONYMOUS. anon_create() memory is always purgeable.
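From userspace, usage looks roughly like this (the madvise() flag
names are assumptions based on the semantics described above):

    #include <sys/mman.h>

    void* make_purgeable_buffer(size_t len)
    {
        void* p = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                       MAP_ANONYMOUS | MAP_PRIVATE | MAP_PURGEABLE, -1, 0);
        if (p == MAP_FAILED)
            return nullptr;
        // Volatile: the kernel may discard these pages (and release
        // their committed reservation) under memory pressure.
        madvise(p, len, MADV_SET_VOLATILE);
        // Non-volatile again: requires a fresh committed reservation,
        // which can now fail with ENOMEM.
        if (madvise(p, len, MADV_SET_NONVOLATILE) < 0)
            return nullptr;
        return p;
    }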
Right now, NE2000 NICs don't work because the link is down by default
and this never changes. None of the NE2000 documentation I looked at
mentions a link status indicator, so just assume the link is up.
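In code this is a one-liner (method name assumed):

    // No documented link-status register, so unconditionally report up.
    virtual bool link_up() override { return true; }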
next_packet_page points to a page, but it was being compared against
a byte offset rather than a page offset when adjusting the BOUNDARY
register as the ring buffer wraps around.
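The fix is to do the wrap comparison in page units, roughly (the
constant and register names here are assumed):

    // next_packet_page is a page number, so compare it against the
    // first page of the receive ring, not its byte offset.
    u8 boundary = (next_packet_page == NE2K_RAM_RECV_BEGIN_PAGE)
        ? NE2K_RAM_RECV_END_PAGE - 1
        : next_packet_page - 1;
    out8(REG_BOUNDARY, boundary);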
Fixes #8327.
This bug manifests itself when the caller to sys$pledge() passes valid
promises, but invalid execpromises. The code would apply the promises
and then return an error for the execpromises. This leaves the user in
a confusing state, as the promises were silently applied, yet we return
an error suggesting the operation failed.
Avoid this situation by tweaking the implementation to only apply the
promises / execpromises after all validation has occurred.
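In sketch form (helper names and error-propagation style assumed):

    // Parse/validate both strings up front...
    auto new_promises = TRY(parse_pledge_promises(promises));
    auto new_execpromises = TRY(parse_pledge_promises(execpromises));
    // ...and only then commit both, so a failure leaves state untouched.
    m_promises = new_promises;
    m_execpromises = new_execpromises;
    return 0;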
GCC 11 added a new option, `-fzero-call-used-regs`, which causes the
compiler to zero call-used registers before a function returns. The
goal is to reduce the possible attack surface by disarming ROP
gadgets that might be useful to attackers, and to reduce the risk of
information leaks via stale register data. You can find the GCC
commit below[0].
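Enabling it amounts to adding one flag to the kernel's compile
options; `used-gpr` is one of several arguments the option accepts
(which variant is used, and the file name, are just for illustration):

$ g++ -fzero-call-used-regs=used-gpr -c Foo.cpp -o Foo.o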
This is a mitigation I noticed on the Linux KSPP issue tracker[1] and
thought it would be a useful mitigation for the SerenityOS Kernel.
The reduction in ROP gadgets is observable using the ropgadget utility:
$ ROPgadget --nosys --nojop --binary Kernel | tail -n1
Unique gadgets found: 42754
$ ROPgadget --nosys --nojop --binary Kernel.RegZeroing | tail -n1
Unique gadgets found: 41238
The size difference for the i686 Kernel binary is negligible:
$ size Kernel Kernel.RegZeroing
    text    data     bss      dec     hex filename
13253648 7729637 6302360 27285645 1a0588d Kernel
13277504 7729637 6302360 27309501 1a0b5bd Kernel.RegZeroing
We don't have any great workloads to measure regressions in Kernel
performance, but Kees Cook mentioned he measured only around a 1%
performance regression with this enabled on his Linux kernel build.[2]
References:
[0] d10f3e900b
[1] https://github.com/KSPP/linux/issues/84
[2] https://lore.kernel.org/lkml/20210714220129.844345-1-keescook@chromium.org/
This patch greatly simplifies VMObject locking by doing two things:
1. Giving VMObject an IntrusiveList of all its mapping Region objects.
2. Removing VMObject::m_paging_lock in favor of VMObject::m_lock
Before (1), VMObject::for_each_region() was forced to acquire the
global MM lock (since it worked by walking MemoryManager's list of
all regions and checking which ones pointed back to the VMObject.)
With each VMObject having its own list of Regions, VMObject's own
m_lock is all we need.
Before (2), page fault handlers used a separate mutex for preventing
overlapping work. This design required multiple temporary unlocks
and was generally extremely hard to reason about.
Instead, page fault handlers now use VMObject's own m_lock as well.
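After the change, for_each_region() can be as simple as this (sketch;
the locker type is assumed):

    template<typename Callback>
    void VMObject::for_each_region(Callback callback)
    {
        // Our own lock over our own list; no global MM lock involved.
        SpinlockLocker locker(m_lock);
        for (auto& region : m_regions)
            callback(region);
    }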
Despite what the declaration would have us believe, these are not "u8*".
If they were, we wouldn't have to use the & operator to get their
address and then cast them to "u8*"/FlatPtr afterwards.
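Sketched with one of the kernel's linker-script symbols (the exact
declarations are assumed):

    // Before: extern "C" u8 start_of_kernel_text; forced callers to
    // write reinterpret_cast<u8*>(&start_of_kernel_text).
    // After: an array declaration makes the name itself the address.
    extern "C" u8 start_of_kernel_text[];
    FlatPtr base = reinterpret_cast<FlatPtr>(start_of_kernel_text);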
Since we're taking from the committed set of pages, there should never
be a reason for this call to fail.
Also add a Badge to disallow taking committed pages from anywhere but
the Region class.
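The Badge pattern in sketch form (signature assumed):

    // Only Region can construct a Badge<Region>, so only Region can
    // take pages from the committed set.
    NonnullRefPtr<PhysicalPage> allocate_committed_physical_page(Badge<Region>, ShouldZeroFill);

    // In Region, the only possible caller:
    auto page = MM.allocate_committed_physical_page({}, ShouldZeroFill::Yes);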
We don't need a dedicated API for creating a VMObject with a single
page; the multi-page API works in all cases.
Also make the API take a Span<NonnullRefPtr<PhysicalPage>> instead of
a NonnullRefPtrVector<PhysicalPage>.
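A single page is just a one-element span at the call site, e.g.
(factory name assumed):

    // The multi-page factory covers the old single-page use case.
    auto vmobject = AnonymousVMObject::create_with_physical_pages({ &page, 1 });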