About half of the Processor code is common across architectures, so
let's share it with a templated base class. Also, other code that can be
shared in some ways, like FPUState and TrapFrame functions, is adjusted
here. Functions which cannot be shared trivially (without internal
refactoring) are left alone for now.
Otherwise, the message's contents might be in the cache only, so
VideoCore will read stale/garbage data from main memory.
This fixes framebuffer setup on bare metal with the data cache enabled.
This commit adds Processor::set_thread_specific_data, and this function
is used to factor out architecture specific implementation of setting
the thread specific data. This function is implemented for
aarch64 and x86_64, and the callsites are changed to use this function
instead.
Until now the kernel was always executing with SP_EL0, as this made the
initial dropping to EL1 a bit easier. This commit changes this behaviour
to use the corresponding SP_ELx for each exception level.
To make sure that the execution of the C++ code can continue, the
current stack pointer is copied into the corresponding SP_ELx just
before dropping an exception level.
There’s similar RDRAND register (encoded as ‘s3_3_c2_c4_0ʼ) to be
added if needed. RNG CPU feature on Aarch64 guarantees existence of both
RDSEED and RDRAND registers simultaneously—in contrast to x86-64, where
respective instructions are independent of each other.
The hand-written assembly does not compile under Clang due to register
size mismatches. Using a loop is slower (~6 instructions on O2 as
opposed to 2 with hand-written assembly), but using the pause
instruction makes this more efficient even under TCG.