Instead of caching a raw pointer to the next instruction, cache the
region we're fetching instructions from, and a pointer to its base.
This way we don't need to keep invalidating and reloading the cache
whenever the CPU jumps.
Not motivated by anything in particular, they just looked easy to fill
in. With this, all arithmetic FI* FPU instructions are implemented.
Switch to the mXXint style in a few more functions, this part is no-op.
This is used by memset() so we get a lot of mileage out of optimizing
this instruction.
Note that we currently audit every individual byte accessed separately.
This could be greatly improved by adding a range auditing mechanism to
MallocTracer.
m32int is a 32-bit integer stored in memory, and should not be mistaken
for a floating point number. :^)
Also add missing handling of 64-bit FPU register operands to some of
the RM64 instructions.
These instructions now operate on the specified FPU stack entry instead
of always using ST(0) and ST(1).
FUCOMI and FUCOMIP also handle NaN values slightly better.
Start fleshing out basic support for floating-point instructions in the
UserspaceEmulator CPU.
This is all work done by @nico for #3576. I'm just merging it all in
this patch since it's a decent foundation to continue working on. :^)
When a mallocation is shrunk/grown without moving, UE needs to update
its precise metadata about the mallocation, since it tracks *exactly*
how many bytes were allocated, not just the malloc chunk size.
Problem:
- `constexpr` functions are decorated with the `inline` specifier
keyword. This is redundant because `constexpr` functions are
implicitly `inline`.
- [dcl.constexpr], §7.1.5/2 in the C++11 standard): "constexpr
functions and constexpr constructors are implicitly inline (7.1.2)".
Solution:
- Remove the redundant `inline` keyword.
* AK: Add formatter for JsonValue.
* Inspector: Use new format functions.
* Profiler: Use new format functions.
* UserspaceEmulator: Use new format functions.
In the future all (normal) output should be written by any of the
following functions:
out (currently called new_out)
outln
dbg (currently called new_dbg)
dbgln
warn (currently called new_warn)
warnln
However, there are still a ton of uses of the old out/warn/dbg in the
code base so the new functions are called new_out/new_warn/new_dbg. I am
going to rename them as soon as all the other usages are gone (this
might take a while.)
I also added raw_out/raw_dbg/raw_warn which don't do any escaping,
this should be useful if no formatting is required and if the input
contains tons of curly braces. (I am not entirely sure if this function
will stay, but I am adding it for now.)
Don't require clients to templatize modrm().read{8,16,32,64}() with
the ValueWithShadow type when we can figure it out automatically.
The main complication here is that ValueWithShadow is a UE concept
while the MemoryOrRegisterReference inlines exist at the lower LibX86
layer and so doesn't have direct access to those types. But that's
nothing we can't solve with some simple template trickery. :^)
This is useful for reading and writing doubles for #3329.
It is also useful for emulating 64-bit binaries.
MemoryOrRegisterReference assumes that 64-bit values are always
memory references since that's enough for fpu support. If we
ever want to emulate 64-bit binaries, that part will need minor
updating.
When compiling with "-Os", GCC produces the following pattern for
atomic decrement (which is used by our RefCounted template):
or eax, -1
lock xadd [destination], eax
Since or-ing with -1 will always produce the same output (-1), we can
mark the result of these operations as initialized. This stops us from
complaining about false positives when running the shell in UE. :^)
Some of the remaining instructions have different behavior for
register and non-register ops. Since we already have the
two-level flags tables, model this by setting all handlers in
the two-level table to the register op handler, while the
first-level flags table stores the action for the non-reg handler.
Some of these don't just use the REG bits of the mod/rm byte
as slashes, but also the R/M bits to have up to 9 different
instructions per opcode/slash combination (1 opcode requires
that MOD is != 11, the other 8 have MODE == 11).
This is done by making the slashes table two levels deep for
these cases.
Some of this is cosmetic (e.g "FST st0" has no effect already,
but its bit pattern gets disassembled as "FNOP"), but for
most uses it isn't.
FSTENV and FSTCW have an extraordinary 0x9b prefix. This is
not yet handled in this patch.
"xor reg,reg" or "sub reg,reg" both zero out the register, which means
we know for sure the result is 0. So mark the value as initialized,
and make sure we don't taint the CPU flags.
This removes some false positives from the uninitialized memory use
detection mechanism.
Fixes#2850.
We now track whether the flags register is tainted by the use of one or
more uninitialized values in a computation.
For now, the state is binary; the flags are either tainted or not.
We could be more precise about this and only taint the specific flags
that get updated by each instruction, but I think this will already get
us 99% of the results we want. :^)
This patch introduces the concept of shadow bits. For every byte of
memory there is a corresponding shadow byte that contains metadata
about that memory.
Initially, the only metadata is whether the byte has been initialized
or not. That's represented by the least significant shadow bit.
Shadow bits travel together with regular values throughout the entire
CPU and MMU emulation. There are two main helper classes to facilitate
this: ValueWithShadow and ValueAndShadowReference.
ValueWithShadow<T> is basically a struct { T value; T shadow; } whereas
ValueAndShadowReference<T> is struct { T& value; T& shadow; }.
The latter is used as a wrapper around general-purpose registers, since
they can't use the plain ValueWithShadow memory as we need to be able
to address individual 8-bit and 16-bit subregisters (EAX, AX, AL, AH.)
Whenever a computation is made using uninitialized inputs, the result
is tainted and becomes uninitialized as well. This allows us to track
this state as it propagates throughout memory and registers.
This patch doesn't yet keep track of tainted flags, that will be an
important upcoming improvement to this.
I'm sure I've messed up some things here and there, but it seems to
basically work, so we have a place to start! :^)
These were not recording the higher part of the result correctly.
Since the flags are much less complicated than the inline assembly
here, just implement IMUL in C++ instead.