@MoSal
Last active April 10, 2025 18:54
Curious rustc behavior

Test case

rand-ascii 100k 30000-31000 > random_100k_30000-31000.txt
hyperfine --warmup=1 -r4 './target/release/bat -pf random_100k_30000-31000.txt'

Where:

  • rand-ascii comes from here.
  • bat comes from here.
  • Beware that both are built from their tmp branch.
  • GNU/Linux x86_64 target.

First Run

Building rustc

# bootstrap.dist.toml
[build]
build-stage = 1
dist-stage = 1
install-stage = 1
test-stage = 1
doc-stage = 1
extended = false
tools = []

[llvm]
download-ci-llvm = true

[rust]
channel = "nightly"
download-rustc = false
lld = true
llvm-bitcode-linker = true

# bootstrap.toml
profile = "dist"
change-id = 138986

gix clone --depth=100 https://github.com/rust-lang/rust
cd rust
cp -f /tmp/bootstrap.dist.toml src/bootstrap/defaults/bootstrap.dist.toml
cp -f /tmp/bootstrap.toml .
git show --oneline HEAD
e5fefc359be (HEAD -> master, origin/master, origin/HEAD) Auto merge of #139474 - jieyouxu:bump-rustc-perf, r=Kobzol

./x.py build

Using built rustc to build bat

# bat tmp branch
cargo +stage1 build --release

Benchmarking test-case

% hyperfine --warmup=1 -r4 './target/release/bat -pf random_100k_30000-31000.txt'
Benchmark 1: ./target/release/bat -pf random_100k_30000-31000.txt
  Time (mean ± σ):      7.892 s ±  0.018 s    [User: 7.419 s, System: 0.462 s]
  Range (min … max):    7.873 s …  7.907 s    4 runs

Second Run

Build bat again, but turn off stripping

# bat tmp branch again
sed -i 's|^strip = true|strip = false|' Cargo.toml
rm -rf target
cargo +stage1 build --release
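The `sed` substitution above does nothing, silently, if the manifest has no line that matches exactly, and the benchmark would then measure the wrong configuration. A minimal sketch of a toggle that also reports the resulting setting follows. The function name `set_strip` is hypothetical, and the sketch assumes, as the `sed` call above does, that the setting sits on its own line starting with `strip = `.

```shell
# Hypothetical helper (not part of the original steps): set `strip` in a
# Cargo.toml and print the resulting line so the change can be confirmed.
# Usage: set_strip Cargo.toml false
set_strip() {
    manifest="$1"   # path to Cargo.toml
    value="$2"      # "true" or "false"
    tmp="$(mktemp)"
    # Rewrite through a temporary file, then copy back into the manifest.
    sed "s|^strip = .*|strip = $value|" "$manifest" > "$tmp" && cat "$tmp" > "$manifest"
    rm -f "$tmp"
    # grep exits non-zero when no `strip = ...` line exists, which flags a
    # manifest that the substitution could not have touched.
    grep '^strip = ' "$manifest"
}
```

Checking the printed line (or the exit status) before running `rm -rf target` and rebuilding avoids a long rebuild with an unchanged profile.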

Benchmarking test-case again

% hyperfine --warmup=1 -r4 './target/release/bat -pf random_100k_30000-31000.txt'
Benchmark 1: ./target/release/bat -pf random_100k_30000-31000.txt
  Time (mean ± σ):      4.689 s ±  0.021 s    [User: 4.259 s, System: 0.422 s]
  Range (min … max):    4.666 s …  4.717 s    4 runs
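To put a number on the gap between the first two runs, the ratio of the two reported means can be computed directly. This is only arithmetic on the figures reported in this gist; the function name `slowdown` is illustrative.

```shell
# Mean wall-clock times (seconds) copied from the hyperfine reports of the
# first run (strip = true) and the second run (strip = false).
stripped=7.892
unstripped=4.689

# Print how many times slower the first value is than the second.
slowdown() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

slowdown "$stripped" "$unstripped"   # prints 1.68
```

So, with this particular rustc build, the stripped binary takes roughly 1.68 times as long as the unstripped one.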

Third Run (if you think a bisect would be enough, think again!)

Add a non-functional change to the rustc repo and rebuild

# rustc
echo t > t.txt
git add t.txt
git commit -m 't.txt'
./x.py build

Re-enable stripping in bat and build again

# bat tmp branch
sed -i 's|^strip = false|strip = true|' Cargo.toml
rm -rf target
cargo +stage1 build --release
hyperfine --warmup=1 -r4 './target/release/bat -pf random_100k_30000-31000.txt'
Benchmark 1: ./target/release/bat -pf random_100k_30000-31000.txt
  Time (mean ± σ):      4.713 s ±  0.053 s    [User: 4.245 s, System: 0.460 s]
  Range (min … max):    4.670 s …  4.790 s    4 runs

If this doesn't work, that is, if the good performance does not come back, try other changes, such as creating a new branch and rebasing without keeping merge commits. You will get there ;)

Fourth Run (with stable rustc 1.86.0)

strip = true

Benchmark 1: ./target/release/bat -pf random_100k_30000-31000.txt
  Time (mean ± σ):      4.724 s ±  0.105 s    [User: 4.281 s, System: 0.435 s]
  Range (min … max):    4.643 s …  4.877 s    4 runs

strip = false

Benchmark 1: ./target/release/bat -pf random_100k_30000-31000.txt
  Time (mean ± σ):      6.493 s ±  0.078 s    [User: 6.008 s, System: 0.470 s]
  Range (min … max):    6.433 s …  6.603 s    4 runs

So performance is (or was) good with stable. But stripping still has an effect for some reason, even if it's a beneficial one here: the stripped build is the fast one!

Fifth Run (with beta rustc 1.87.0-beta.3)

Here, we get the same performance with stripping on and off. Unfortunately, it's not good performance.

Benchmark 1: ./target/release/bat -pf random_100k_30000-31000.txt
  Time (mean ± σ):      8.458 s ±  0.010 s    [User: 7.977 s, System: 0.464 s]
  Range (min … max):    8.445 s …  8.467 s    4 runs

The performance regression actually started with nightly 2025-03-21, and you can bisect it a little further. But the fun starts when you begin testing your theories about what the culprit is, as observed in the third run.
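Any automated comparison of toolchains needs a pass/fail rule for a single measurement. The results in this gist fall into a fast group (about 4.7 s) and a slow group (6.5 s and above), so a simple cutoff can separate them. The sketch below is an assumption-laden illustration: the helper name `classify_mean` and the 5.5 s default cutoff are not from the original runs and would need adjusting for other machines or inputs.

```shell
# Hypothetical helper: classify a mean runtime (seconds) against a cutoff.
# The 5.5 s default is an assumption chosen to sit between the ~4.7 s and
# >= 6.5 s means reported in this gist.
classify_mean() {
    awk -v mean="$1" -v cutoff="${2:-5.5}" 'BEGIN {
        if (mean + 0 > cutoff + 0) { print "slow"; exit 1 }
        print "fast"
    }'
}

classify_mean 4.713                       # prints "fast", exit status 0
classify_mean 8.458 || echo "regressed"   # prints "slow", then "regressed"
```

The mean itself could come from hyperfine's `--export-json` output (the `results[0].mean` field). As the third run shows, a "slow" verdict for a given commit does not by itself identify that commit's changes as the cause.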
