An argument stitched from Solve Everything: Achieving Abundance by 2035 (Alexander D. Wissner-Gross & Peter H. Diamandis).
Three moves: bet on the open rail; standardize the target; run the target over the rail.
A2A (the Agent2Agent protocol) is a rail, not a train.
"Own the rails. The 'application layer' (the specific app or model) is a trap; it will be commoditized and value will drop to zero. The durable value is in the infrastructure of the solved world..."
"Models are destined to be commoditized; they are the 'trains' that will eventually all look the same... The trains will come and go, but the rails will determine where they can travel."
Walled gardens and tech islands are the risk:
"Mandate 'Open Rails': Just as different email providers can talk to each other, AI assistants must be interoperable. We cannot allow a 'walled garden' where a medical AI cannot speak to an insurance AI because they are owned by rivals."
Earlier platforms scaled on a shared substrate, with open protocols such as TCP/IP at its core:
"The harness was a stack of abstractions: transistors, protocols (like TCP/IP), and operating systems turned computation into a programmable substrate."
And the window is short:
"...once technical standards are set, they are nearly impossible to change. Consider the QWERTY keyboard... the first credible [standards] that achieve widespread adoption today will define the 'physics' of the new economy."
An interoperable agent is only useful if you can trust it — and trust doesn't come from a vendor's word. It comes from certification, reputation, and results that anyone can check. Benchmarks are the path to that trust: the standardized, public proof that an agent does what it claims.
"Artisans rely on personal taste. Industries rely on Targeting Systems (formerly known as 'benchmarks'). A field allows for industrial-scale progress only when we can state, with mathematical precision, 'This number is what success looks like.'"
The proof has to be public, adversarial, and verifiable by anyone — not self-reported:
"...make truth cheap to verify. In our era, that means we must build public, adversarial benchmark authorities. We need 'scoreboards' for AI that are stress-tested by red teams to ensure that a claim of intelligence is actually true."
"If a company claims to be 'solving education' but cannot show you a verified, public scoreboard of learning gains, they are still in the 'pamphlet phase.' They are marketing, not engineering."
"It is auditable, requiring public Decision Records for AI Systems (DR-AIS) and 'replication packs' so that any claim of success can be verified by a third party."
But the benchmarks of today are insufficient — narrow, static, and quickly gamed:
"The first era of AI 'leaderboards' was a necessary prelude, but that era is over. Those early benchmarks were narrow, academic, and easily 'saturated,' with AIs eventually achieving a perfect score."
Trust has to be portable and live. An agent should carry its certifications and benchmark verifications on its A2A Agent Card, as credentials any counterparty can read before it transacts. And a certification can't be a one-time stamp: feedback and new real-world scenarios flow back over the protocol to keep the benchmarks honest and current. Reputation becomes a living credential, not a press release.
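A minimal sketch of what "credentials any counterparty can read before it transacts" could look like. Everything here is an assumption for illustration: the Certification and AgentCard classes, their field names, and the may_transact check are hypothetical and are not fields or calls defined by the A2A specification. The expiry date stands in for the "living credential" idea, and the sketch does not verify signatures, which a real scheme would need.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Certification:
    """Hypothetical credential entry; not defined by the A2A spec."""
    benchmark: str      # name of the public benchmark
    issuer: str         # independent authority that ran the benchmark
    score: float        # verified result, from 0.0 to 1.0
    valid_until: date   # expiry date, which forces periodic re-testing


@dataclass(frozen=True)
class AgentCard:
    """Simplified stand-in for an agent's published card."""
    name: str
    certifications: tuple[Certification, ...] = ()


def may_transact(card: AgentCard, benchmark: str, min_score: float,
                 trusted_issuers: set[str], today: date) -> bool:
    """Return True if the card carries a current, trusted, passing credential."""
    return any(
        c.benchmark == benchmark
        and c.issuer in trusted_issuers
        and c.score >= min_score
        and c.valid_until >= today
        for c in card.certifications
    )
```

A counterparty would call may_transact with its own threshold and its own list of trusted issuers, so an expired or self-issued credential fails the check without any call to the vendor.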
A cross-vendor benchmark needs a cross-vendor protocol to reach the agent under test:
"Action Surfaces: Intelligence is useless if it cannot act. We must build the APIs (software connections)... and contract protocols that allow AI agents to safely affect the real world. This is the 'handshake' between the digital brain and the physical hand."
Carry the benchmark over that handshake, and its hardest requirements fall out for free:
"The 'Two-Source' Rule: For critical domains (medicine, energy, justice), any high-stakes decision must be confirmed by at least two independent AI models trained on different datasets."
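The Two-Source Rule quoted above can be sketched as a simple gate. This is one possible reading, not the book's implementation: the Source tuple, the vendor and dataset labels, and the two_source_decision function are hypothetical names, and independence is approximated here as "different vendor and different training dataset".

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical shape: (vendor, training_dataset_id, model callable).
Source = Tuple[str, str, Callable[[str], str]]


def two_source_decision(case: str, sources: Sequence[Source]) -> Optional[str]:
    """Return a decision only if two independent sources agree, else None."""
    # Query every source once and keep its labels next to its answer.
    results = [(vendor, data_id, model(case)) for vendor, data_id, model in sources]
    for i, (vendor_a, data_a, answer_a) in enumerate(results):
        for vendor_b, data_b, answer_b in results[i + 1:]:
            # Assumption: independent means different vendor AND different dataset.
            independent = vendor_a != vendor_b and data_a != data_b
            if independent and answer_a == answer_b:
                return answer_a
    return None  # no independent confirmation, so the caller should escalate
```

The gate only works if the second model is reachable at all, which is where a cross-vendor protocol comes in: two models behind the same proprietary API would fail the independence test by construction.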
Replication packs that re-run anywhere, certificates that ride on the agent's card, second opinions from genuinely different vendors — none of it works through a proprietary API. So the rail and the target become one standard.
And whoever sets that standard writes the rulebook everyone else builds on:
"Standards Diplomacy: The most powerful empire is the one that writes the rulebook. We must export trusted 'rails,' including the safety protocols, data standards, and API definitions, that the rest of the world builds upon."
"We are in a race: The Rails... vs. The Muddle. If we build [them] fast enough, we win."
Bet on the open rail. Make trust portable on it. Run the target over it. The window is open now, and it closes hard.