This commit adds "Filesystem Events" View to the Profiler. This tab
will present combined information for recorded Filesystem events.
Currently only accumulated count and duration is presented.
Duration amount currently only shows events that took over 1ms,
which means that in most cases 0 is show.
This adds a new view mode to profiler which displays source lines and
samples that occured at those lines. This view can be opened via the
menu or by pressing CTRL-S.
It does this by mapping file names from DWARF to "/usr/src/serenity/..."
i.e. source code should be copied to /usr/src/serenity/Userland and
/usr/src/serenity/Kernel to be visible in this mode.
Currently *all* files contributing to the selected function are loaded
completely which could be a lot of data when dealing with lots of
inlined code.
No need to copy and move them around, just to pass them as a
`String const&` to the constructor.
We still end up copying it to a normal String in the end though...
Also add slightly richer parse errors now that we can include a string
literal with returned errors.
This will allow us to use TRY() when working with JSON data.
This changes browsing through disassembled functions in Profiler from a
painfully sluggish experience into quite a swift one. It's especially
true for profiling the kernel, as it has more than 10 megabytes of DWARF
data to churn through.
There is no point in keeping around a separate MappedFile object for
/boot/Kernel.debug for each DisassemblyModel we create and re-parsing
the kernel image multiple times. This will significantly speed up
browsing through profile entries from the kernel in disassembly view.
Instead of keeping a separate Vector<Event> for signposts, let them live
in the main event stream. For fast iteration, we instead keep a cache of
the signpost event indices.
Also check for the most common event type (sample) first instead of
leaving it as the fallback. This avoids a lot of string comparisons
while parsing profiles.
Making userspace provide a global string ID was silly, and made the API
extremely difficult to use correctly in a global profiling context.
Instead, simply make the kernel do the string ID allocation for us.
This also allows us to convert the string storage to a Vector in the
kernel (and an array in the JSON profile data.)
The first perf_event argument to a PERF_EVENT_SIGNPOST is now
interpreted as a string ID (in the profile strings set.)
This allows us to generate signposts with custom strings. :^)
Signposts generated by perf_event(PERF_EVENT_SIGNPOST) now show up in
profile timelines, and if you hover them you get a tooltip with the two
arguments passed with the event.
Most of the models were just calling did_update anyway, which is
pointless since it can be unified to the base Model class. Instead, code
calling update() will now call invalidate(), which functions identically
and is more obvious in what it does.
Additionally, a default implementation is provided, which removes the
need to add empty implementations of update() for each model subclass.
Co-Authored-By: Ali Mohammad Pur <ali.mpfard@gmail.com>
This enables further work on implementing KASLR by adding relocation
support to the pre-kernel and updating the kernel to be less dependent
on specific virtual memory layouts.
This removes all the hard-coded kernel base addresses from userspace
tools.
One downside for this is that e.g. Profiler no longer uses a different
color for kernel symbols when run as a non-root user.
The LexicalPath instance methods dirname(), basename(), title() and
extension() will be changed to return StringView const& in a further
commit. Due to this, users creating temporary LexicalPath objects just
to call one of those getters will recieve a StringView const& pointing
to a possible freed buffer.
To avoid this, static methods for those APIs have been added, which will
return a String by value to avoid those problems. All cases where
temporary LexicalPath objects have been used as described above haven
been changed to use the static APIs.
Previously Profiler was using timestamps to distinguish processes.
However it is possible that separate processes with the same PID exist
at the exact same timestamp (e.g. for execve). This changes Profiler
to use unique serial numbers for each event instead.
The profiler tried to be clever when handling process_exit events by
subtracting one from the timestamp. This was supposed to ensure that
events after a process' death would be attributed to the new process
in case the old process used execve(). However, if there was another
event (e.g. a CPU sample) at the exact same time the process_exit
event was recorded the profile would fail to load because we
didn't find the process anymore.
This changes introduces a new problem where samples would be attributed
to the incorrect process if a CPU sample for the old process, a
process_exit as well as a process_create event plus another CPU sample
event for the new process happened at the exact same time. I think
it's a reasonable compromise though.
I've had a couple of instances where a profile was missing process
creation events for a PID. I don't know how to reproduce it yet,
so this patch merely adds a helpful debug message so you know why
Profiler is failing to load the file.
This patch adds an additional level of hierarchy to the call tree:
Every process gets its own top-level node. :^)
Before this, selecting multiple processes would get quite confusing
as all the call stacks from different processes were combined together
into one big tree.
Problem:
- Default destructors (and constructors) are in `.cpp` files. This
prevents the compiler's optimizer from inlining them when it thinks
inlining is appropriate (unless LTO is used).
- Forward declarations can prevent some optimizations, such as
inlining of constructors and destructors.
Solution:
- Remove them or set them to `= default` and let the compiler handle
the generation of them.
- Remove unneeded forward declarations.
Hiding those frames doesn't really make sense. They're a major
contributor to a process' spent CPU time and show up in a lot of
profiles. That however is because those processes really do spend
quite a bit of time in the scheduler by doing lots of context
switches, like WindowServer when responding to IPC calls.
Instead of hiding these for aesthetic reasons we should instead
improve the scheduler.
We can lose profiling timer events for a few reasons, for example
disabled interrupts or system slowness. This accounts for lost
time between CPU samples by adding a field lost_samples to each
profiling event which tracks how many samples were lost immediately
preceding the event.
Now that the profiling timer is independent from the scheduler the
user will get quite a few CPU samples from "within" the scheduler.
These events are less useful when just profiling a user-mode process
rather than the whole system. This patch adds an option to Profiler to
hide these events.
This turns the perfcore format into more a log than it was before,
which lets us properly log process, thread and region
creation/destruction. This also makes it unnecessary to dump the
process' regions every time it is scheduled like we did before.
Incidentally this also fixes 'profile -c' because we previously ended
up incorrectly dumping the parent's region map into the profile data.
Log-based mmap support enables profiling shared libraries which
are loaded at runtime, e.g. via dlopen().
This enables profiling both the parent and child process for
programs which use execve(). Previously we'd discard the profiling
data for the old process.
The Profiler tool has been updated to not treat thread IDs as
process IDs anymore. This enables support for processes with more
than one thread. Also, there's a new widget to filter which
process should be displayed.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
This is a little bit messy, but basically if an ELF object is non-PIE,
we have to account for the executable mapping being at a hard-coded
offset and subtract that when doing symbolication.
There's probably a nicer way to solve this, I just hacked this together
so we can see "cc1plus" and friends in profiles. :^)
In multi-process profiles, the same ELF objects tend to occur many
times (everyone has libc.so for example) so we will quickly run out
of VM if we map each object once per process that uses it.
Fix this by adding a "mapped object cache" that maps the path of
an ELF object to a cached memory mapping and wrapping ELF::Image.