This is in preparation for adding MSI(x) support to the NVMe device.
NVMeInterruptQueue needs access to the PCI device to deal with MSI(x)
interrupts. It is OK to pass the NVMeController as a reference to the
NVMeQueue, since the NVMeController is the one that owns the NVMeQueue.
This is very similar to how AHCIController passes its reference to its
interrupt handler.
Add an explicit QueueType enum that can be used to create either a poll
queue or an interrupt queue. This is better than passing an
Optional<irq> and inferring the queue type from it.
This refactoring is in preparation for adding MSIx support to NVMe.
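A minimal sketch of the new shape (the signatures are assumptions, not
the exact ones in the tree):

    enum class QueueType {
        Polled,
        IRQ,
    };

    // Callers state the queue kind explicitly instead of encoding it in
    // the presence or absence of an irq value.
    static ErrorOr<NonnullRefPtr<NVMeQueue>> try_create(
        NVMeController&, u16 qid, u8 irq, QueueType);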
These were easy to pick up, as these pointers are assigned at
construction and never changed afterwards.
This small change ensures that our code cannot accidentally assign
these pointers to a new object, a kind of bug we always want to
prevent.
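As a sketch, assuming the change is marking such members const (the
member name is illustrative):

    // Set once in the constructor's initializer list; the compiler now
    // rejects any later reassignment.
    NonnullRefPtr<NVMeController> const m_controller;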
The current way we handle sync commands is very ugly and depends on a
lot of preconditions. Now that we have an end_io handler for a request,
we can use a WaitQueue to do sync commands more elegantly.
This does depend on the block layer sending one request at a time, but
this change is a step forward towards better IO handling.
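A minimal sketch of the idea; submit_sqe_with_endio and the member
names are assumptions, and real code must also handle a completion that
arrives before the wait starts:

    u16 NVMeQueue::submit_sync_sqe(NVMeSubmission& submission)
    {
        // The end_io callback runs on the completion path and wakes us.
        auto cid = submit_sqe_with_endio(submission, [this]() {
            m_sync_wait_queue.wake_all();
        });
        m_sync_wait_queue.wait_forever("NVMe sync command"sv);
        return cid;
    }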
There was a private variable named m_current_request which was used to
track a single request at a time. That guarantee is provided by the
block layer, which waits on each IO. This design will break down in the
driver once the block layer removes that constraint.
Redesign the IO handling in a completely asynchronous way by
maintaining requests up to the queue depth. An NVMeIO struct is
introduced to track a submitted IO, along with other information such
as whether the IO is still being processed, and an end_io callback that
is called at the end of a request.
A private HashMap is added, keyed on the command id of a request, with
NVMeIO as the value. The end_io handler comes in handy when we are
doing a sync request and want to wake up the wait queue at the end.
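A rough sketch of the bookkeeping (field and type names are
assumptions):

    struct NVMeIO {
        RefPtr<AsyncBlockDeviceRequest> request;
        bool used { false };                // slot is in flight
        Function<void(u16 status)> end_io_handler;
    };

    // Keyed on the command id of the submission queue entry; holds at
    // most queue-depth entries at a time.
    HashMap<u16, NVMeIO> m_requests;

On completion, the handler looks up the finished command id, invokes
its end_io_handler if one is set, and marks the slot unused.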
This change also simplifies the code by removing some special
conditions in the submit_sqe function, etc., that were marked as FIXME
for a long time.
Using sq_tail as cid makes an inherent assumption that we send only
one IO at a time. Use an atomic variable instead for the command id of
a submission queue entry.
As sq_tail is not used as cid anymore, remove m_prev_sq_tail which used
to hold the last used sq_tail value.
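A sketch of the replacement, assuming the id only needs to be unique
among in-flight commands (of which there are at most queue-depth at a
time):

    Atomic<u16> m_cid { 0 };

    u16 NVMeQueue::alloc_cid()
    {
        // Wrapping within the queue depth is fine: no more than
        // m_qdepth commands are ever outstanding at once.
        return m_cid.fetch_add(1) % m_qdepth;
    }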
This class had slightly confusing semantics and the added weirdness
doesn't seem worth it just so we can say "." instead of "->" when
iterating over a vector of NNRPs.
This patch replaces NonnullRefPtrVector<T> with Vector<NNRP<T>>.
This step would ideally not have been necessary (it increases the
amount of refactoring and templates required, which in turn increases
build times), but it gives us a couple of nice properties:
- SpinlockProtected inside Singleton (a very common combination) can now
obtain any lock rank just via the template parameter. It was not
previously possible to do this with SingletonInstanceCreator magic.
- SpinlockProtected's lock rank is now mandatory; this is the majority
of cases and allows us to see where we're still missing proper ranks.
- The type already informs us what lock rank a lock has, which aids code
readability and (possibly, if gdb cooperates) lock mismatch debugging.
- The rank of a lock can no longer be dynamic, which is not something we
wanted in the first place (or made use of). Locks randomly changing
their rank sounds like a disaster waiting to happen.
- In some places, we might be able to statically check that locks are
taken in the right order (with the right lock rank checking
implementation) as rank information is fully statically known.
This refactoring further exposes the fact that Mutex has no lock rank
capabilities, which is not fixed here.
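A sketch of the resulting shape (the protected type and the rank are
illustrative, and the access pattern assumes SpinlockProtected's with()
helper):

    // The rank is part of the type: statically known, and mandatory.
    static Singleton<SpinlockProtected<SomeState, LockRank::Process>> s_state;

    s_state->with([](SomeState& state) {
        // The spinlock, with rank LockRank::Process, is held here.
    });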
I believe this to be safe, as the main thing that LockRefPtr provides
over RefPtr is safe copying from a shared LockRefPtr instance. I've
inspected the uses of RefPtr<PhysicalPage> and it seems they're all
guarded by external locking. Some of it is less obvious, but this is
an area where we're making continuous headway.
Until now, our kernel has reimplemented a number of AK classes to
provide automatic internal locking:
- RefPtr
- NonnullRefPtr
- WeakPtr
- Weakable
This patch renames the Kernel classes so that they can coexist with
the original AK classes:
- RefPtr => LockRefPtr
- NonnullRefPtr => NonnullLockRefPtr
- WeakPtr => LockWeakPtr
- Weakable => LockWeakable
The goal here is to eventually get rid of the Lock* classes in favor of
using external locking.
Instead of having two separate implementations of AK::RefCounted, one
for userspace and one for kernelspace, there is now RefCounted and
AtomicRefCounted.
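A small sketch of the split from a user's point of view:

    // Userspace: plain ref-counting is enough.
    class Document : public RefCounted<Document> { };

    // Kernel (or anything shared across CPUs): atomic ref-counting.
    class Device : public AtomicRefCounted<Device> { };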
All users that relied on the default constructor use a None lock rank
for now. This will make it easier to remove LockRank in the future and
to actually annotate the ranks, by searching for None.
Add polling support to NVMe so that it does not use an interrupt to
complete an IO but instead actively polls for completion. This is
probably not very efficient in terms of CPU usage, but avoiding
interrupts is beneficial at the moment, as there is no MSI(X) support,
and polling can reduce the latency of an IO on a very fast NVMe
device.
The NVMeQueue class has been made the base class for NVMeInterruptQueue
and NVMePollQueue. The factory function `NVMeQueue::try_create` will
return the appropriate queue to the controller based on the polling
boot parameter.
The polling mode can be enabled by adding an extra boot parameter:
`nvme_poll`.
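A minimal sketch of the dispatch, assuming the boot parameter has
already been turned into a flag (signatures are illustrative):

    ErrorOr<NonnullRefPtr<NVMeQueue>> NVMeQueue::try_create(
        u16 qid, u8 irq, bool nvme_polling_enabled)
    {
        if (nvme_polling_enabled)
            return NVMePollQueue::try_create(qid);
        return NVMeInterruptQueue::try_create(qid, irq);
    }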
Instead, try to allocate the DMA buffer before constructing the
NVMeQueue. This lets us fail early, before allocating and constructing
the heavier NVMeQueue object, if the DMA buffer allocation fails.
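A sketch of the fail-early ordering (the helper names are
hypothetical):

    static ErrorOr<NonnullRefPtr<NVMeQueue>> try_create(u16 qid)
    {
        // If this fails, we return before the heavier object exists.
        auto rings = TRY(allocate_dma_buffer(PAGE_SIZE, "NVMe queue"sv));
        return adopt_nonnull_ref_or_enomem(
            new (nothrow) NVMeQueue(move(rings), qid));
    }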
Only a generic struct definition was present for NVMeSubmission. To
improve type safety and clarity, add a union of command-specific
structs that match the command being submitted.
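A sketch of the shape (field and type names are assumptions; the full
entry is the spec's 64-byte submission queue entry):

    struct NVMeSubmission {
        u8 op;
        u8 flags;
        u16 cmdid;
        union {
            NVMeGenericCmd generic;
            NVMeRWCmd rw;              // IO reads/writes
            NVMeCreateCQCmd create_cq; // admin: create completion queue
            NVMeCreateSQCmd create_sq; // admin: create submission queue
            NVMeIdentifyCmd identify;  // admin: identify
        };
    };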
We need to use the volatile keyword when mapping the device registers,
or the compiler may optimize accesses, which leads to this QEMU error:
pci_nvme_ub_mmiord_toosmall in nvme_mmio_read: MMIO read smaller than
32-bits, offset=0x0
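A sketch of the fix (the register layout follows the NVMe spec; the
names are assumptions):

    struct ControllerRegister {
        u64 cap;   // 0x00: controller capabilities
        u32 vs;    // 0x08: version
        u32 intms; // 0x0c: interrupt mask set
        u32 intmc; // 0x10: interrupt mask clear
        u32 cc;    // 0x14: controller configuration
    };

    // Each access through a volatile pointer is a real, full-width MMIO
    // access; without volatile the compiler may merge or narrow loads.
    auto* regs = reinterpret_cast<ControllerRegister volatile*>(mapped_base);
    u32 version = regs->vs;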
Add basic NVMe driver support to Serenity, based on NVMe spec 1.4.
The driver can support multiple NVMe drives (subsystems), but within
an NVMe drive it supports one controller with multiple namespaces.
Each core gets a separate NVMe queue.
As the system lacks MSI support, pin-based interrupts are used for IO.
Tested the NVMe support by replacing the IDE driver with the NVMe
driver :^)