- https://blog.anp.lol/rust/2016/07/24/profiling-rust-perf-flamegraph/
- https://gist.github.com/jFransham/369a86eff00e5f280ed25121454acec1#number-one-optimization-tip-dont
- http://likebike.com/posts/How_To_Write_Fast_Rust_Code.html
- https://llogiq.github.io/2017/06/01/perf-pitfalls.html
- https://rust-embedded.github.io/book/unsorted/speed-vs-size.html
On Linux, perf may refuse to record without elevated privileges. If it does, relax the kernel's perf_event_paranoid setting:
echo -1 | sudo tee /proc/sys/kernel/perf_event_paranoid
Also add these settings to your Cargo.toml so that release and bench builds keep debug symbols:
[profile.release]
debug = true
[profile.bench]
debug = true
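Record samples system-wide with perf, dump them as text, then fold the stacks and render the SVG with the FlameGraph scripts:
perf record -F 999 -a -g -- sleep 60
perf script > out.perf
./stackcollapse-perf.pl out.perf > out.folded
./flamegraph.pl out.folded > kernel.svg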
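Or let cargo flamegraph handle recording and rendering in one step. The variable must be set on the same command (or exported) for cargo to see it:
CARGO_PROFILE_BENCH_DEBUG=true cargo flamegraph --bench full_benchmarking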
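The `full_benchmarking` bench target itself is not shown in these notes. As a rough sketch only, a bench target that cargo flamegraph could profile might look like the following. The file name `benches/full_benchmarking.rs`, the `harness = false` setting for that `[[bench]]` entry, and the `workload` function are all assumptions made for illustration, not the actual benchmark.

```rust
// Hypothetical benches/full_benchmarking.rs. Assumes `harness = false` is set
// for this [[bench]] entry in Cargo.toml, so that `main` is the entry point.
use std::hint::black_box;
use std::time::Instant;

/// Placeholder workload: sums the squares of 0..n with wrapping arithmetic.
fn workload(n: u64) -> u64 {
    (0..n).fold(0u64, |acc, x| acc.wrapping_add(x.wrapping_mul(x)))
}

fn main() {
    let start = Instant::now();
    let mut total = 0u64;
    // Repeat the workload so a sampling profiler collects enough stack samples.
    for _ in 0..1_000 {
        // black_box keeps the optimizer from folding the work away.
        total = total.wrapping_add(black_box(workload(black_box(100_000))));
    }
    println!("total = {total}, elapsed = {:?}", start.elapsed());
}
```

With debug symbols enabled, `workload` should appear as a named frame in the resulting flame graph rather than as a raw address.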
Running the visualization on macOS requires Qt.
valgrind --tool=massif --stacks=yes --verbose --peak-inaccuracy=0 --time-unit=ms --detailed-freq=1 --max-snapshots=1000 --massif-out-file=valgrind-massif.txt target/release/deps/full_benchmarking-a6de29ad1dbb7338
valgrind --tool=memcheck --leak-check=full --show-leak-kinds=all --track-origins=yes --verbose --log-file=valgrind-memcheck.txt target/release/deps/full_benchmarking-a6de29ad1dbb7338
valgrind --tool=dhat --mode=copy --verbose --dhat-out-file=valgrind-dhat-copy.txt target/release/deps/full_benchmarking-a6de29ad1dbb7338
valgrind --tool=dhat --mode=heap --verbose --dhat-out-file=valgrind-dhat-heap.txt target/release/deps/full_benchmarking-a6de29ad1dbb7338
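Massif and DHAT report on heap allocations, so it can help to try them first on a small program whose allocation behaviour is known in advance. The sketch below is not part of the benchmark above, and its function names are hypothetical. It builds the same vector twice, once growing it push by push and once pre-sized with `Vec::with_capacity`, which should show up as different allocation counts in DHAT's heap mode.

```rust
/// Grows a vector one push at a time; the buffer is reallocated as it fills.
fn build_incrementally(n: usize) -> Vec<u64> {
    let mut v = Vec::new();
    for i in 0..n {
        v.push(i as u64);
    }
    v
}

/// Reserves the full capacity up front, so the pushes never reallocate.
fn build_preallocated(n: usize) -> Vec<u64> {
    let mut v = Vec::with_capacity(n);
    for i in 0..n {
        v.push(i as u64);
    }
    v
}

fn main() {
    let a = build_incrementally(1_000_000);
    let b = build_preallocated(1_000_000);
    // Use both results so the allocations are not optimized out.
    println!("{} {}", a.len(), b.iter().sum::<u64>());
}
```

Build it with debug symbols enabled and pass the resulting binary to the valgrind commands in place of the benchmark executable.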
Works on Linux.