We do that to increase clarity of the major and secondary components in
the subsystem. To ensure it's even more understandable, we rename the
files to better represent the class within them and to remove redundancy
in the name.
Also, some includes are removed from the general components of the ATA
components' classes.
It is starting to get a little messy with how each device can try to add
or remove itself to either /sys/dev/block or /sys/dev/char directories.
To better do this, we introduce 4 virtual methods to take care of that,
so until we ensure all nodes in /sys/dev/block and /sys/dev/char are
actual symlinks, we allow the Device base class to call virtual methods
upon insertion or before being destroying, so it add itself elegantly to
either of these directories or remove itself when needed.
For special cases where we need to create symlinks, we have two virtual
methods to be called otherwise to do almost the same thing mentioned
before, but to use symlinks instead.
This change in fact does the following:
1. Use support for symlinks between /sys/dev/block/ storage device
identifier nodes and devices in /sys/devices/storage/{LUN}.
2. Add basic nodes in a /sys/devices/storage/{LUN} directory, to let
userspace to know about the device and its details.
LUN address is essentially how people used to address SCSI devices back
in the day we had these devices more in use. However, SCSI was taken as
an abstraction layer for many Unix and Unix-like systems, so it still
common to see LUN addresses in use. In Serenity, we don't really provide
such abstraction layer, and therefore until now, we didn't use LUNs too.
However (again), this changes, as we want to let users to address their
devices under SysFS easily. LUNs make sense in that regard, because they
can be easily adapted to different interfaces besides SCSI.
For example, for legacy ATA hard drive being connected to the first IDE
controller which was enumerated on the PCI bus, and then to the primary
channel as slave device, the LUN address would be 0:0:1.
To make this happen, we add unique ID number to each StorageController,
which increments by 1 for each new instance of StorageController. Then,
we adapt the ATA and NVMe devices to use these numbers and generate LUN
in the construction time.
This bug was probably around for a very long time, but it is noticeable
only under VirtualBox as it generated an non fatal error which caused a
kernel panic because we VERIFYed the wrong lock to be locked.
There's no real value in separating physical pages to supervisor and
user types, so let's remove the concept and just let everyone to use
"user" physical pages which can be allocated from any PhysicalRegion
we want to use. Later on, we will remove the "user" prefix as this
prefix is not needed anymore.
Each of these strings would previously rely on StringView's char const*
constructor overload, which would call __builtin_strlen on the string.
Since we now have operator ""sv, we can replace these with much simpler
versions. This opens the door to being able to remove
StringView(char const*).
No functional changes.
The initialize_hba method now calls the reset method to reset the HBA
and initialize each AHCIPort. Also, after full HBA reset we need to turn
on the AHCI functionality of the HBA and global interrupts since they
are cleared to 0 according to the specification in the GHC register.
Instead of doing this in a parent class like the AHCIController, let's
do that directly in the AHCIPort class as that class is the only user of
these sort of physical pages. While it seems like we waste an entire 4KB
of physical RAM for each allocation, this could serve us later on if we
want to fetch other types of logs from the ATA device.
The way AHCIPortHandler held AHCIPorts and even provided them with
physical pages for the ATA identify buffer just felt wrong.
To fix this, AHCIPortHandler is not a ref-counted object anymore. This
solves the big part of the problem, because AHCIPorts can't hold a
reference to this object anymore, only the AHCIController can do that.
Then, most of the responsibilities are shifted to the AHCIController,
making the AHCIPortHandler a handler of port interrupts only.
The AHCI code is not very good at OOM conditions, so this is a first
step towards OOM correctness. We should not allocate things inside C++
constructors because we can't catch OOM failures, so most allocation
code inside constructors is exported to a different function.
Also, don't use a HashMap for holding RefPtr of AHCIPort objects in
AHCIPortHandler because this structure is not very OOM-friendly. Instead
use a fixed Array of 32 RefPtrs, as at most we can have 32 AHCI ports
per AHCI controller.
That code used the old AK::Result container, which leads to overly
complicated initialization flow when trying to figure out the correct
partition table type. Instead, when using the ErrorOr container the code
is much simpler and more understandable.
In most cases it's safe to abort the requested operation and go forward,
however, in some places it's not clear yet how to handle these failures,
therefore, we use the MUST() wrapper to force a kernel panic for now.
The current implementation of read/write will fail in StorageDevice
when the request length is less than the block size of the underlying
device. Fix it by calculating the offset within a block for such cases
and using it for copying data from the bounce buffer.
The underlying driver does not need to recalculate the buffer size as
it is passed in the AsyncBlockDevice struct anyway. This also helps in
removing any assumptions of the underlying block size of the device.
This class already has variables named m_lock, and it's also strange
that locals are named with the `m_` prefix. So lets fix that to make
the code more readable.
Found by PVS-Studio.
This is mainly useful when adding an HostController but due to OOM
condition, we abort temporary Vector insertion of a DeviceIdentifier
and then exit the iteration loop to report back the error if occured.
Instead, hold the lock while we copy the contents to a stack-based
Vector then iterate on it without any locking.
Because we rely on heap allocations, we need to propagate errors back
in case of OOM condition, therefore, both PCI::enumerate API function
and PCI::Access::add_host_controller_and_enumerate_attached_devices use
now a ErrorOr<void> return value to propagate errors. OOM Error can only
occur when enumerating the m_device_identifiers vector under a spinlock
and trying to expand the temporary Vector which will be used locklessly
to actually iterate over the PCI::DeviceIdentifiers objects.
If there's no PCI bus, then it's safe to assume that we run on a x86
machine that has an ISA IDE controller in the system. In such case, we
just instantiate a ISAIDEController object that assumes fixed locations
of IDE IO ports.
Function-local `static constexpr` variables can be `constexpr`. This
can reduce memory consumption, binary size, and offer additional
compiler optimizations.
These changes result in a stripped x86_64 kernel binary size reduction
of 592 bytes.
As make<T> is infallible, it really should not be used anywhere in the
Kernel. Instead replace with fallible `new (nothrow)` calls, that will
eventually be error-propagated.
Add polling support to NVMe so that it does not use interrupt to
complete a IO but instead actively polls for completion. This probably
is not very efficient in terms of CPU usage but it does not use
interrupts to complete a IO which is beneficial at the moment as there
is no MSI(X) support and it can reduce the latency of an IO in a very
fast NVMe device.
The NVMeQueue class has been made the base class for NVMeInterruptQueue
and NVMePollQueue. The factory function `NVMeQueue::try_create` will
return the appropriate queue to the controller based on the polling
boot parameter.
The polling mode can be enabled by adding an extra boot parameter:
`nvme_poll`.
This is being used by GUID partitions so the first three dash-delimited
fields of the GUID are stored in little endian order but the last two
fields are stored in big endian order, hence it's a representation which
is mixed.
If we panic the kernel for a storage-related reason, we might as well be
helpful and print out a list of detected storage devices and their
partitions to help with debugging.
Reasons for such a panic include:
- No boot device with the given name found
- No boot device with the given UUID found
- Failing to open the root filesystem after determining a boot device
Before attempting to remove the device while handling an AHCI port
interrupt, check if m_connected_device is even non-null.
This happened during my bare metal run and caused a kernel panic.