/boot/Kernel.debug only contains the symbol table and DWARF debug
information, and has its `.text` and other PT_LOAD segments stripped
out. When we try to parse its data as instructions, we get a crash from
within LibX86.
We now load the actual /boot/Kernel binary when we want to disassemble
kernel functions.
There is no point in keeping around a separate MappedFile object for
/boot/Kernel.debug for each DisassemblyModel we create and re-parsing
the kernel image multiple times. This will significantly speed up
browsing through profile entries from the kernel in disassembly view.
Previously, we assumed that the `.text` segment was loaded at vaddr 0 in
shared objects, which is not the case with `-z separate-code` enabled.
Because we didn't do the right calculations to translate an address from
a performance event into its value within the ELF file, Profiler would
try to disassemble out-of-bounds memory locations, leading to a crash.
This commit also changes `LibraryMetadata` to apply to a loaded library
as a whole, not just to one of its segments (like .text or .data). This
lets us simplify the interface, as we no longer have to worry about
`text_base`.
Fixes #10628
Since our executables are position-independent, the address values
extracted from processes don't correspond to their values within the ELF
file. We have to offset the absolute addresses by the load base address
to get the relative symbol that we need for disassembly.
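A minimal sketch of that translation, assuming a hypothetical
`LoadedLibrary` record that holds the load base and mapping size (these
names are illustrative and not Profiler's actual types):

```cpp
#include <cstdint>
#include <optional>

// Hypothetical description of one ELF object loaded into the profiled
// process. The field names are assumptions made for this sketch.
struct LoadedLibrary {
    uint64_t base { 0 }; // address the object was loaded at
    uint64_t size { 0 }; // size of the mapping in bytes
};

// Translate an absolute address taken from a performance event into an
// address relative to the load base. Returns an empty optional if the
// address does not fall inside this library, so callers never try to
// disassemble out-of-bounds memory.
std::optional<uint64_t> to_elf_relative(LoadedLibrary const& library, uint64_t absolute_address)
{
    if (absolute_address < library.base)
        return std::nullopt;
    uint64_t offset = absolute_address - library.base;
    if (offset >= library.size)
        return std::nullopt;
    return offset;
}
```

The bounds check is the part that matters for the crash: an address
outside the mapping is rejected instead of being used as an index.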
Now that the kernel is compiled as a PIE, all addresses are relative to
the loaded base address, so Symbolication::kernel_base has to be
subtracted off from the absolute addresses if we want to symbolicate
them.
Previously we assumed there were fewer kernel samples than user samples,
by implicitly using the kernel histogram size for indices into the user
histogram. A profile with more kernel samples than user samples can be
reproduced by profiling a very short-lived program like true:
`profile -c true`
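One way to avoid the assumption, sketched here with plain vectors
standing in for the two histograms (the function name and the vector
representation are assumptions, not the actual Profiler code):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Combine two per-bucket sample counts whose lengths may differ.
// Each histogram is only indexed within its own bounds; buckets that
// are missing from the shorter one count as zero.
std::vector<size_t> combine_histograms(std::vector<size_t> const& kernel, std::vector<size_t> const& user)
{
    std::vector<size_t> total(std::max(kernel.size(), user.size()), 0);
    for (size_t i = 0; i < total.size(); ++i) {
        if (i < kernel.size())
            total[i] += kernel[i];
        if (i < user.size())
            total[i] += user[i];
    }
    return total;
}
```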
Beforehand we were dividing the frame width by the profile length in ms
and then dividing the frame width by the result once more, which is
equivalent to (but slower than) just using the length in ms directly,
aside from the case in which the profile is less than 1 ms long, in
which case this would trigger undefined behaviour due to the division
by zero.
This allows for typing [8] instead of [8, 8, 8, 8] to specify the same
margin on all edges, for example. The constructors follow CSS' style of
specifying margins. The added constructors are:
- Margins(int all): Sets the same margin on all edges.
- Margins(int vertical, int horizontal): Sets the first argument to top
and bottom margins, and the second argument to left and right margins.
- Margins(int top, int horizontal, int bottom): Sets the first argument
to the top margin, the second argument to the left and right margins,
and the third argument to the bottom margin.
Previously the argument order for Margins was (left, top, right,
bottom). To make it more familiar and closer to how CSS does it, the
argument order is now (top, right, bottom, left).
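The constructors and the new argument order can be sketched as follows.
This is a simplified stand-in class written for illustration, not the
real Margins implementation; the accessor and member names are
assumptions:

```cpp
// Simplified stand-in for Margins, using the CSS-style argument order
// (top, right, bottom, left). Every shorthand constructor delegates to
// the four-argument one.
class Margins {
public:
    Margins() = default;

    // Same margin on all edges.
    explicit Margins(int all)
        : Margins(all, all, all, all)
    {
    }

    // First argument: top and bottom. Second argument: left and right.
    Margins(int vertical, int horizontal)
        : Margins(vertical, horizontal, vertical, horizontal)
    {
    }

    // Top, then left and right, then bottom.
    Margins(int top, int horizontal, int bottom)
        : Margins(top, horizontal, bottom, horizontal)
    {
    }

    Margins(int top, int right, int bottom, int left)
        : m_top(top)
        , m_right(right)
        , m_bottom(bottom)
        , m_left(left)
    {
    }

    int top() const { return m_top; }
    int right() const { return m_right; }
    int bottom() const { return m_bottom; }
    int left() const { return m_left; }

private:
    int m_top { 0 };
    int m_right { 0 };
    int m_bottom { 0 };
    int m_left { 0 };
};
```

With this shape, `Margins(8)` and `Margins(8, 8, 8, 8)` describe the
same margins.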
Instead of keeping a separate Vector<Event> for signposts, let them live
in the main event stream. For fast iteration, we instead keep a cache of
the signpost event indices.
Also check for the most common event type (sample) first instead of
leaving it as the fallback. This avoids a lot of string comparisons
while parsing profiles.
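A rough sketch of both ideas, using standard containers. The `Event`
layout, the `Profile` struct, and the "signpost" type string are
assumptions made for this example:

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

enum class EventType { Sample, Signpost, Other };

// Hypothetical, heavily simplified event record.
struct Event {
    EventType type;
    uint64_t timestamp;
};

// Check the most common type first, so the typical event needs only a
// single string comparison.
EventType classify(std::string_view type_string)
{
    if (type_string == "sample")
        return EventType::Sample;
    if (type_string == "signpost")
        return EventType::Signpost;
    return EventType::Other;
}

struct Profile {
    std::vector<Event> events;            // the main event stream
    std::vector<size_t> signpost_indices; // cache of indices into `events`

    void append(Event event)
    {
        // Record the index the event is about to occupy.
        if (event.type == EventType::Signpost)
            signpost_indices.push_back(events.size());
        events.push_back(event);
    }
};
```

Iterating over `signpost_indices` visits only the signposts without
scanning the whole stream.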
Making userspace provide a global string ID was silly, and made the API
extremely difficult to use correctly in a global profiling context.
Instead, simply make the kernel do the string ID allocation for us.
This also allows us to convert the string storage to a Vector in the
kernel (and an array in the JSON profile data.)
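The allocation scheme can be sketched like this. `StringRegistry` is a
hypothetical name and this is userspace C++ with std::vector, not the
kernel code; it only shows why a vector index works as the ID:

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical registry: the caller hands over a string and gets back
// an ID chosen by the registry, instead of supplying its own ID.
class StringRegistry {
public:
    // Stores the string and returns its newly allocated ID, which is
    // simply its index in the backing vector.
    size_t register_string(std::string string)
    {
        m_strings.push_back(std::move(string));
        return m_strings.size() - 1;
    }

    // Throws std::out_of_range for an unknown ID.
    std::string const& string_for_id(size_t id) const { return m_strings.at(id); }

private:
    std::vector<std::string> m_strings;
};
```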
The first perf_event argument to a PERF_EVENT_SIGNPOST is now
interpreted as a string ID (in the profile strings set.)
This allows us to generate signposts with custom strings. :^)
Signposts generated by perf_event(PERF_EVENT_SIGNPOST) now show up in
profile timelines, and if you hover them you get a tooltip with the two
arguments passed with the event.
This can happen if the symbol is part of a switch-case, and not
a function, which would previously have made the disassembly view
appear empty.
Now we disassemble the containing function, starting at the given label
and continuing up until the last captured instruction.
Most of the models were just calling did_update anyway, which is
pointless since it can be unified to the base Model class. Instead, code
calling update() will now call invalidate(), which functions identically
and is more obvious in what it does.
Additionally, a default implementation is provided, which removes the
need to add empty implementations of update() for each model subclass.
Co-Authored-By: Ali Mohammad Pur <ali.mpfard@gmail.com>
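The default invalidate() described above can be sketched with a toy
base class. The update counter and both subclasses are hypothetical and
only exist to make the behaviour observable:

```cpp
// Toy stand-in for the base Model class.
class Model {
public:
    virtual ~Model() = default;

    // Default implementation: just notify that the model changed.
    // Subclasses without extra state do not need to override this.
    virtual void invalidate() { did_update(); }

    int update_count() const { return m_update_count; }

protected:
    void did_update() { ++m_update_count; }

private:
    int m_update_count { 0 };
};

// Needs no empty override; it inherits the default.
class SimpleModel : public Model {
};

// Overrides only because it has a cache to drop first.
class CachingModel : public Model {
public:
    void invalidate() override
    {
        m_cache_valid = false;
        Model::invalidate();
    }

    bool cache_valid() const { return m_cache_valid; }

private:
    bool m_cache_valid { true };
};
```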
This enables further work on implementing KASLR by adding relocation
support to the pre-kernel and updating the kernel to be less dependent
on specific virtual memory layouts.
This removes all the hard-coded kernel base addresses from userspace
tools.
One downside of this is that e.g. Profiler no longer uses a different
color for kernel symbols when run as a non-root user.
Applications previously had to create a GUI::Menubar object, add menus
to it, and then call GUI::Window::set_menubar().
This patch introduces GUI::Window::add_menu() which creates the menubar
automatically and adds items to it. Application code becomes slightly
simpler as a result. :^)
This reverts commit cfef3040fb.
It looks like although this does improve things, it also degrades the
experience and messes with the usability, especially for large numbers
of processes.
Need to come back to this with a more holistic fix.
As threads come and go, we can't simply account for how many time
slices the threads at any given point may have been using. We need to
also account for threads that have since disappeared. This means we
also need to track how many time slices we have expired globally.
However, because this doesn't account for context switches outside of
the system timer tick, values may still be under-reported. To solve this
we will need to track more accurate time information on each context
switch.
This also fixes top's CPU usage calculation which was still based on
the number of context switches.
Fixes #6473
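A sketch of the resulting calculation, assuming a tool samples two
counters at each refresh. The struct and field names are assumptions
for this example, not the actual kernel or top interface:

```cpp
#include <cstdint>

// Hypothetical snapshot taken at one refresh.
struct TimeSnapshot {
    uint64_t thread_time_scheduled { 0 }; // time slices this thread has used
    uint64_t total_time_scheduled { 0 };  // time slices expired globally
};

// Usage over the interval between two snapshots, in percent. Because
// the divisor is the global count, slices consumed by threads that have
// since disappeared are still accounted for.
double cpu_usage_percent(TimeSnapshot const& previous, TimeSnapshot const& current)
{
    uint64_t total_delta = current.total_time_scheduled - previous.total_time_scheduled;
    if (total_delta == 0)
        return 0.0;
    uint64_t thread_delta = current.thread_time_scheduled - previous.thread_time_scheduled;
    return 100.0 * static_cast<double>(thread_delta) / static_cast<double>(total_delta);
}
```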
Today the profile viewer timeline view has a static size, which is
computed as half the height of the window given it has two root widgets.
Instead the timeline view should shrink to only consume the size that
each process timeline consumes.
We already do this in most places, so the style should be consistent.
Also, Clang does not like it, as this could cause an unexpected compile
error if some statements are added to the default label or a new label
is added above it.
While structs being forward declared as classes is not strictly an
issue, Clang complains as this is not portable code, since some ABIs
treat classes declared as `class` and `struct` differently.
It's easier to fix these than to reason about explicitly disabling
another warning.
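For illustration, the consistent form looks like this (`Sample` and
`read_value` are made-up names). Declaring `class Sample;` ahead of the
`struct Sample` definition would still compile, but is the pattern that
gets flagged:

```cpp
// Forward declaration uses the same class-key ("struct") as the
// definition further down.
struct Sample;

int read_value(Sample const&);

struct Sample {
    int value { 0 };
};

int read_value(Sample const& sample)
{
    return sample.value;
}
```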
This implements StringUtils::find_any_of() and uses it in
String::find_any_of() and StringView::find_any_of(). All uses of
find_{first,last}_of have been replaced with find_any_of(), find() or
find_last(). find_{first,last}_of have subsequently been removed.
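As a rough sketch of what such a helper does, here is a version built
on std::string_view. The `SearchDirection` parameter and the return
type are assumptions for this example and may not match the real
StringUtils signature:

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

enum class SearchDirection { Forward, Backward };

// Returns the index of the first (or, searching backward, the last)
// character in `haystack` that matches any character in `needles`.
std::optional<size_t> find_any_of(std::string_view haystack, std::string_view needles,
    SearchDirection direction = SearchDirection::Forward)
{
    size_t index = direction == SearchDirection::Forward
        ? haystack.find_first_of(needles)
        : haystack.find_last_of(needles);
    if (index == std::string_view::npos)
        return std::nullopt;
    return index;
}
```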
The LexicalPath instance methods dirname(), basename(), title() and
extension() will be changed to return StringView const& in a further
commit. Due to this, users creating temporary LexicalPath objects just
to call one of those getters will receive a StringView const& pointing
to a possibly freed buffer.
To avoid this, static methods for those APIs have been added, which will
return a String by value to avoid those problems. All cases where
temporary LexicalPath objects have been used as described above have
been changed to use the static APIs.
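The pattern can be sketched with a toy `Path` class built on
std::string and std::string_view. This is a stand-in for illustration,
not LexicalPath itself:

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>

// Toy path class. The instance getter returns a view into the object's
// own buffer; the static overload returns an owning string.
class Path {
public:
    explicit Path(std::string path)
        : m_string(std::move(path))
    {
        auto slash = m_string.rfind('/');
        m_basename_offset = slash == std::string::npos ? 0 : slash + 1;
    }

    // Only valid for as long as this Path object is alive.
    std::string_view basename() const
    {
        return std::string_view(m_string).substr(m_basename_offset);
    }

    // Safe for one-off use: the temporary Path lives until the end of
    // the full expression, and the result is copied out before then.
    static std::string basename(std::string path)
    {
        return std::string(Path(std::move(path)).basename());
    }

private:
    std::string m_string;
    size_t m_basename_offset { 0 };
};
```

Writing `std::string_view name = Path("/usr/bin/true").basename();`
would leave `name` dangling once the temporary is destroyed, whereas
`Path::basename("/usr/bin/true")` returns a string that owns its data.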
When constructing values of the InstructionData type we assume that
the event_count field is a size_t while it actually is a u32. On x86_64
this fails because those are different types.
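One way to make the conversion explicit is sketched below. The struct
is reduced to the single field named above, `make_instruction_data` is
a hypothetical helper, and the actual fix may look different:

```cpp
#include <cstddef>
#include <cstdint>

// Reduced stand-in for InstructionData with only the field in question.
struct InstructionData {
    uint32_t event_count { 0 };
};

// Brace-initializing a uint32_t field from a size_t is a narrowing
// conversion wherever size_t is wider than 32 bits, so the cast is
// spelled out instead of relying on the two types being the same.
InstructionData make_instruction_data(size_t count)
{
    return InstructionData { static_cast<uint32_t>(count) };
}
```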