An agent harness is the runtime that sits between a language model and the world — it manages the conversation loop, dispatches tool calls, enforces context and trust boundaries, persists session state, and shepherds work across failures. "Agent harness" is not a single pattern; it is a design space with several largely independent axes. Choices on one axis constrain others, and the sharpest failures in production come from the interactions rather than from any single dimension in isolation.
A harness also does not exist in isolation. It depends on — and is co-designed with — a broader ecosystem: identity and delegation systems, storage substrates, proxies and gateways, network policy enforcement, observability pipelines, evaluation infrastructure, model lifecycle tooling, supply-chain controls, policy engines and approval flows, cost and quota systems, long-term memory stores, and compliance mechanisms. A harness that wins on its internal axes but has no eval su