mirror of https://github.com/RGBCube/serenity synced 2025-05-31 15:38:10 +00:00
serenity/Userland/Libraries/LibC/arch/x86_64
Daniel Bertalan bcf124c07d LibC: Implement a faster memset routine for x86-64 in assembly
This commit addresses the following shortcomings of our current, simple
and elegant memset function:
- REP STOSB/STOSQ has considerable startup overhead, which makes it
  impractical for smaller sizes.
- Until very recently, AMD CPUs did not support "Enhanced REP
  MOVSB/STOSB" (ERMS), so REP STOSB performed poorly on them.

With this commit applied, I measured a ~5% decrease in `test-js`'s
runtime under qemu's TCG backend. The implementation is based on the
following article from Microsoft:

https://msrc-blog.microsoft.com/2021/01/11/building-faster-amd64-memset-routines

Two versions of the routine are implemented: one that uses the ERMS
extension mentioned above, and one that performs plain SSE stores. The
version appropriate for the CPU is selected at load time using an IFUNC.
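
The selection step described above can be approximated in plain C. What
follows is only a minimal sketch, not the code from memset.cpp or memset.S:
every name in it (memset_plain_sketch, memset_erms_sketch, cpu_has_erms,
sketch_memset) is hypothetical, the "plain" variant is a byte loop standing
in for the SSE stores, and a lazily initialized function pointer stands in
for a real IFUNC, which the dynamic loader would resolve at load time. The
ERMS check assumes GCC or Clang on x86-64 and reads CPUID leaf 7, subleaf 0,
EBX bit 9.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__x86_64__) && defined(__GNUC__)
#include <cpuid.h>
#endif

typedef void *(*memset_fn)(void *, int, size_t);

/* Hypothetical stand-in for the plain-store variant: a simple byte loop. */
static void *memset_plain_sketch(void *dest, int value, size_t count)
{
    unsigned char *p = dest;
    for (size_t i = 0; i < count; i++)
        p[i] = (unsigned char)value;
    return dest;
}

/* Hypothetical stand-in for the ERMS variant: a bare REP STOSB. */
static void *memset_erms_sketch(void *dest, int value, size_t count)
{
#if defined(__x86_64__) && defined(__GNUC__)
    void *d = dest;
    /* RDI = destination, RCX = byte count, AL = fill byte. */
    __asm__ volatile("rep stosb"
                     : "+D"(d), "+c"(count)
                     : "a"(value)
                     : "memory");
    return dest;
#else
    /* Assumption: on other targets, fall back to the byte loop. */
    return memset_plain_sketch(dest, value, count);
#endif
}

/* Returns 1 if the CPU reports ERMS, 0 otherwise or if unknown. */
static int cpu_has_erms(void)
{
#if defined(__x86_64__) && defined(__GNUC__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return 0;
    return (int)((ebx >> 9) & 1u);
#else
    return 0;
#endif
}

/* Plays the role of the IFUNC resolver: picks one variant for this CPU. */
static memset_fn resolve_memset_sketch(void)
{
    return cpu_has_erms() ? memset_erms_sketch : memset_plain_sketch;
}

/* Resolves on first use; a real IFUNC is resolved by the loader instead. */
void *sketch_memset(void *dest, int value, size_t count)
{
    static memset_fn impl;
    if (!impl)
        impl = resolve_memset_sketch();
    assert(impl != NULL);
    return impl(dest, value, count);
}
```

A caller uses sketch_memset(buf, 0, sizeof buf) exactly like memset. The
difference from a real IFUNC is that this sketch pays for a pointer check on
every call, whereas an IFUNC binds the chosen variant once during symbol
resolution.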
2022-05-01 12:42:01 +02:00
crti.S LibC: Use our implementation of crti.o and crtn.o 2021-07-14 13:12:25 +02:00
crtn.S LibC: Use our implementation of crti.o and crtn.o 2021-07-14 13:12:25 +02:00
memset.cpp LibC: Implement a faster memset routine for x86-64 in assembly 2022-05-01 12:42:01 +02:00
memset.S LibC: Implement a faster memset routine for x86-64 in assembly 2022-05-01 12:42:01 +02:00
setjmp.S LibC: Fix sigsetjmp on x86_64 2021-08-26 00:54:23 +02:00