A recent paper, "Safety and Fairness in Agentic AI Depend on Interaction Topology, Not on Model Scale or Alignment" (arxiv:2605.01147), provides empirical confirmation of something the multi-agent safety field has been circling: aligning individual agents does not produce aligned systems. The paper shows ordering instability (outcomes shifting by 59 percentage points from reordering agents alone), information cascades (99.9% agreement with zero error correction in larger models), and functional collapse (systems satisfying fairness metrics while abandoning their actual function).
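To make the cascade and ordering findings concrete, here is a toy sketch of my own (not the paper's experiment; every name, rule, and number below is invented for illustration): identical agents with the same individual accuracy, wired into two different topologies. In the sequential topology each agent sees earlier public answers and defers to a clear majority, so early mistakes can lock in; in the independent topology the same agents answer alone and a final vote corrects them.

```python
# Toy illustration (not the paper's experiment): identical agents, identical
# per-agent accuracy, two interaction topologies with very different
# system-level error rates.
import random

def private_guess(truth: int, accuracy: float) -> int:
    """One agent's noisy private read of a binary ground truth."""
    return truth if random.random() < accuracy else 1 - truth

def run_sequential(n_agents: int, truth: int, accuracy: float) -> list[int]:
    """Agents answer in order, each seeing all earlier public answers."""
    answers: list[int] = []
    for _ in range(n_agents):
        signal = private_guess(truth, accuracy)
        ones = sum(answers)
        zeros = len(answers) - ones
        # Defer to the visible majority once it clearly outweighs one private
        # signal; after that point every agent just repeats the consensus.
        if abs(ones - zeros) >= 2:
            answers.append(1 if ones > zeros else 0)
        else:
            answers.append(signal)
    return answers

def run_independent(n_agents: int, truth: int, accuracy: float) -> list[int]:
    """Same agents, same accuracy, but no one sees anyone else's answer."""
    return [private_guess(truth, accuracy) for _ in range(n_agents)]

def system_error_rate(runner, trials: int = 2000, n_agents: int = 15,
                      accuracy: float = 0.7) -> float:
    """Fraction of trials in which the agents' majority answer is wrong."""
    wrong = 0
    for _ in range(trials):
        truth = random.randint(0, 1)
        answers = runner(n_agents, truth, accuracy)
        majority = 1 if sum(answers) * 2 > len(answers) else 0
        wrong += (majority != truth)
    return wrong / trials

if __name__ == "__main__":
    random.seed(0)
    print("sequential (cascade-prone):", system_error_rate(run_sequential))
    print("independent (vote corrects):", system_error_rate(run_independent))
```

With these toy settings the sequential pipeline converges on a wrong consensus noticeably more often than the independent vote (roughly three times as often), and once a cascade forms every later agent agrees regardless of its private signal: high agreement, zero error correction, a system-level failure that no amount of per-agent accuracy fixes.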
Their conclusion: "Current safety frameworks targeting component-level alignment are targeting the wrong object."
They're right. But the paper's own proposals (topological sweeps, architecture disclosure, stress-testing) still attack the problem from the outside. They ask: which topologies are safe? The more productive question is: why does topology determine safety at all?