mirror of
https://github.com/RGBCube/uutils-coreutils
synced 2025-08-01 05:27:45 +00:00
cut: optimizations
* Use buffered stdout to reduce write syscalls. This simple change yielded the biggest performance gain.

* Use `for_byte_record_with_terminator` from the `bstr` crate. This minimizes the per-line copying needed by `BufReader::read_until`. The `cut_fields` and `cut_fields_delimiter` functions used `read_until` to iterate over lines, which required copying each input line into the line buffer. With `for_byte_record_with_terminator`, copying is minimized because it calls our closure with a reference to `BufReader`'s buffer most of the time. It needs to copy (internally) only to process an incomplete line at the end of the buffer.

* Rewrite `Searcher` to use `memchr`. Switch from the naive implementation to one that uses `memchr`.

* Rewrite `cut_bytes` almost entirely. This was already well optimized, so the performance gain in this case does not come from avoiding copying. In fact, the old implementation needed zero copying, whereas the new implementation introduces some copying, similar to `cut_fields` as described above. But the occasional copying cost is more than offset by the use of the very fast `memchr` inside `for_byte_record_with_terminator`. This change also simplifies the code significantly.

* Remove the `buffer` module.
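The buffered-stdout point above can be illustrated with a minimal, standard-library-only sketch. It is not the code from this commit: `write_first_fields` is a hypothetical helper, and the field splitting uses plain `str::split` rather than `bstr` or `memchr`. It only shows how wrapping a locked stdout in `BufWriter` turns many small per-line writes into a few large ones.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical helper (not part of uu_cut): writes the first
/// `delim`-separated field of every input line to `out`.
fn write_first_fields<W: Write>(input: &str, delim: char, out: W) -> io::Result<()> {
    // BufWriter collects the small per-line writes in memory and forwards
    // them to the underlying writer in larger chunks.
    let mut out = BufWriter::new(out);
    for line in input.lines() {
        // `split` always yields at least one item, so the fallback is never hit.
        let field = line.split(delim).next().unwrap_or("");
        out.write_all(field.as_bytes())?;
        out.write_all(b"\n")?;
    }
    // Flush explicitly so that write errors are reported instead of being
    // silently dropped when the BufWriter goes out of scope.
    out.flush()
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    // Lock stdout once for the whole run instead of once per write.
    write_first_fields("a:b:c\nd:e:f\n", ':', stdout.lock())
}
```

Because the helper is generic over `Write`, the same function can write into a `Vec<u8>` in tests and into a locked stdout in the binary.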
parent 2f17bfc14c
commit 2c1459cbfc
5 changed files with 157 additions and 302 deletions
2
Cargo.lock
generated
@@ -1777,7 +1777,9 @@ dependencies = [
 name = "uu_cut"
 version = "0.0.6"
 dependencies = [
+ "bstr",
  "clap",
+ "memchr 2.3.4",
  "uucore",
  "uucore_procs",
 ]