# Performance Profiling Tutorial

## Effective Benchmarking with Hyperfine
Hyperfine is a powerful command-line benchmarking tool that allows you to measure and compare execution times of commands with statistical rigor.
### Benchmarking Best Practices
When evaluating performance improvements, always set up your benchmarks to compare:
- The GNU implementation as reference
- The implementation without the change
- The implementation with your change
This three-way comparison provides clear insights into:
- How your implementation compares to the standard (GNU)
- The actual performance impact of your specific change
### Example Benchmark
First, you will need to build the binary in release mode. Debug builds are significantly slower:

```shell
cargo build --features unix --release
```
```shell
# Three-way comparison benchmark
hyperfine \
  --warmup 3 \
  "/usr/bin/ls -R ." \
  "./target/release/coreutils.prev ls -R ." \
  "./target/release/coreutils ls -R ."

# can be simplified with:
hyperfine \
  --warmup 3 \
  -L ls /usr/bin/ls,"./target/release/coreutils.prev ls","./target/release/coreutils ls" \
  "{ls} -R ."

# to improve the reproducibility of the results, pin the benchmark to a single CPU core:
taskset -c 0 hyperfine [...]
```
### Interpreting Results
Hyperfine provides summary statistics including:
- Mean execution time
- Standard deviation
- Min/max times
- Relative performance comparison
Look for consistent patterns rather than focusing on individual runs, and be aware of system noise that might affect results.
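Hyperfine can also export its statistics for post-processing (for example with `--export-json results.json`). A minimal sketch of how such an export could be analyzed; the JSON below is illustrative made-up data, not a real measurement, though the field names (`results`, `command`, `mean`, `stddev`) match hyperfine's JSON schema:

```python
import json

# Illustrative hyperfine --export-json output; the numbers are made up.
raw = """
{
  "results": [
    {"command": "/usr/bin/ls -R .", "mean": 0.0421, "stddev": 0.0013},
    {"command": "./target/release/coreutils ls -R .", "mean": 0.0468, "stddev": 0.0021}
  ]
}
"""

results = json.loads(raw)["results"]
# Compare every command against the fastest one, as hyperfine's summary does.
fastest = min(results, key=lambda r: r["mean"])
for r in results:
    ratio = r["mean"] / fastest["mean"]
    print(f"{r['command']}: {r['mean'] * 1000:.1f} ms ({ratio:.2f}x)")
```

This kind of script is handy when you want to track relative performance across many benchmark runs instead of eyeballing individual hyperfine summaries.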
## Using Samply for Profiling
Samply is a sampling profiler that helps you identify performance bottlenecks in your code.
### Basic Profiling
```shell
# Generate a flame graph for your application
samply record ./target/debug/coreutils ls -R
# Profile with higher sampling frequency
samply record --rate 1000 ./target/debug/coreutils seq 1 1000
```
Output from a debug build can be easier to read, but its performance characteristics may differ from those of the release build that we actually care about. Consider using the `profiling` profile, which compiles in release mode but keeps debug symbols. For example:
```shell
cargo build --profile profiling -p uu_ls
samply record -r 10000 target/profiling/ls -lR /var .git .git .git > /dev/null
```
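If your `Cargo.toml` does not already define a `profiling` profile, a custom profile along these lines (a sketch; adjust to your project's conventions) gives release-level optimizations with debug symbols:

```toml
[profile.profiling]
inherits = "release"
debug = true
```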
## Workflow: Measuring Performance Improvements
1. Establish baselines:

   ```shell
   hyperfine --warmup 3 \
     "/usr/bin/sort large_file.txt" \
     "our-sort-v1 large_file.txt"
   ```

2. Identify bottlenecks:

   ```shell
   samply record ./our-sort-v1 large_file.txt
   ```

3. Make targeted improvements based on profiling data.

4. Verify improvements:

   ```shell
   hyperfine --warmup 3 \
     "/usr/bin/sort large_file.txt" \
     "our-sort-v1 large_file.txt" \
     "our-sort-v2 large_file.txt"
   ```

5. Document performance changes with concrete numbers:

   ```shell
   hyperfine --export-markdown file.md [...]
   ```