Turned the research win into a regression guard. A deterministic integration test reproduces the exact regime (N=5000, D=128, noise σ=0.40, seed=42) through the public reranker API and asserts GnnDiffusion beats the no-rerank baseline.
recall@10: noisy=0.280 gnn=0.384 delta=+0.104 (matches #479 exactly)
test result: ok. 1 passed
Why it matters: the headline "+10.4pp" now lives in cargo test (CI), with a
+0.03 floor on the recall@10 delta so it can't silently regress. It also runs
without the standalone benchmark binary (which endpoint security blocks on the
dev box). Branch: feat/productionize-gnn-rerank.
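
For reference, a minimal sketch of the shape of the guard, not the actual test. The real test goes through the public reranker API, which isn't reproduced here, so the helper names (recall_at_k, gain_clears_floor, MIN_DELTA) are hypothetical and the two recall values are copied from the log output above rather than computed.

```rust
/// Floor on the recall@10 gain, taken from the "+0.03 floor" in the write-up.
/// (The constant name is hypothetical.)
const MIN_DELTA: f64 = 0.03;

/// Fraction of the true top-k ids that also appear in the retrieved top-k.
/// Hypothetical helper; the real test may compute recall differently.
fn recall_at_k(retrieved: &[usize], truth: &[usize], k: usize) -> f64 {
    let k = k.min(truth.len());
    if k == 0 {
        return 0.0;
    }
    let hits = truth[..k]
        .iter()
        .filter(|id| retrieved.iter().take(k).any(|r| r == *id))
        .count();
    hits as f64 / k as f64
}

/// True when the reranked recall beats the baseline by at least `floor`.
fn gain_clears_floor(baseline: f64, reranked: f64, floor: f64) -> bool {
    reranked - baseline >= floor
}

#[test]
fn gnn_rerank_beats_noisy_baseline() {
    // Values from the logged run (N=5000, D=128, sigma=0.40, seed=42).
    // In the real test these would come from the reranker, not literals.
    let noisy = 0.280;
    let gnn = 0.384;
    assert!(
        gain_clears_floor(noisy, gnn, MIN_DELTA),
        "recall@10 gain {:+.3} fell below the {:+.2} floor",
        gnn - noisy,
        MIN_DELTA
    );
}
```

Asserting against a floor rather than the exact +0.104 leaves room for small numeric drift while still failing CI if the gain collapses.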