mirror of
https://github.com/RGBCube/uutils-coreutils
synced 2025-09-15 11:36:16 +00:00
sort: automatically fall back to extsort
To make this work we make default sort a special case of external sort. External sorting uses auxiliary files for intermediate chunks. However, when we can keep our intermediate chunks in memory, we don't write them to the file system at all. Only when we notice that we can't keep them in memory they are written to the disk. Additionally, we don't allocate buffers with the capacity of their maximum size anymore. Instead, they start with a capacity of 8kb and are grown only when needed. This makes sorting smaller files about as fast as it was before (I'm seeing a regression of ~3%), and allows us to seamlessly continue with auxiliary files when needed.
This commit is contained in:
parent
df45b20dc1
commit
e7da8058dc
7 changed files with 138 additions and 112 deletions
|
@ -15,29 +15,18 @@ fn test_helper(file_name: &str, args: &str) {
|
|||
.stdout_is_fixture(format!("{}.expected.debug", file_name));
|
||||
}
|
||||
|
||||
// FYI, the initialization size of our Line struct is 96 bytes.
|
||||
//
|
||||
// At very small buffer sizes, with that overhead we are certainly going
|
||||
// to overrun our buffer way, way, way too quickly because of these excess
|
||||
// bytes for the struct.
|
||||
//
|
||||
// For instance, seq 0..20000 > ...text = 108894 bytes
|
||||
// But overhead is 1920000 + 108894 = 2028894 bytes
|
||||
//
|
||||
// Or kjvbible-random.txt = 4332506 bytes, but minimum size of its
|
||||
// 99817 lines in memory * 96 bytes = 9582432 bytes
|
||||
//
|
||||
// Here, we test 108894 bytes with a 50K buffer
|
||||
//
|
||||
#[test]
|
||||
fn test_larger_than_specified_segment() {
|
||||
new_ucmd!()
|
||||
.arg("-n")
|
||||
.arg("-S")
|
||||
.arg("50K")
|
||||
.arg("ext_sort.txt")
|
||||
.succeeds()
|
||||
.stdout_is_fixture("ext_sort.expected");
|
||||
fn test_buffer_sizes() {
|
||||
let buffer_sizes = ["0", "50K", "1M", "1000G"];
|
||||
for buffer_size in &buffer_sizes {
|
||||
new_ucmd!()
|
||||
.arg("-n")
|
||||
.arg("-S")
|
||||
.arg(buffer_size)
|
||||
.arg("ext_sort.txt")
|
||||
.succeeds()
|
||||
.stdout_is_fixture("ext_sort.expected");
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue