1
Fork 0
mirror of https://github.com/RGBCube/uutils-coreutils synced 2026-01-16 02:01:05 +00:00
uutils-coreutils/src
Chirag Jadwani 2c1459cbfc cut: optimizations
* Use buffered stdout to reduce write sys calls.

This simple change yielded the biggest performace gain.

* Use `for_byte_record_with_terminator` from the `bstr` crate.

This is to minimize the per line copying needed by
`BufReader::read_until`. The `cut_fields` and `cut_fields_delimiter`
functions used `read_until` to iterate over lines. That required copying
each input line to the line buffer. With
`for_byte_record_with_terminator` copying is minimized as it calls our
closure with a reference to BufReader's buffer most of the time.  It
needs to copy (internally) only to process any incomplete lines at the
end of the buffer.

* Re-write `Searcher` to use `memchr`.

Switch from the naive implementation to one that uses `memchr`.

* Rewrite `cut_bytes` almost entirely.

This was already well optimized. The performance gain in this case is
not from avoiding copying. In fact, it needed zero copying whereas new
implementation introduces some copying similar to `cut_fields` described
above. But the occassional copying cost is more than offset by the use
of the very fast `memchr` inside `for_byte_record_with_terminator`.
This change also simplifies the code significantly. Removed the `buffer`
module.
2021-04-24 22:29:48 +05:30
..
bin Update the binary usage to match busybox 2021-03-10 23:52:33 +01:00
uu cut: optimizations 2021-04-24 22:29:48 +05:30
uucore Merge branch 'master' into split-wsl-detection 2021-04-24 10:24:13 +02:00
uucore_procs Fix "panic message is not a string literal" warnings (#1915) 2021-03-26 11:09:16 +01:00