When designing APIs intended for intelligent agents (conversational LLMs, multi-agent orchestration, etc.), it’s helpful to treat them as “machine interfaces”—clear and unambiguous not only to developers, but to algorithms as well. A good starting point is to produce a full OpenAPI specification for your service (for example using FastAPI, which automatically generates Swagger/OpenAPI docs). The OpenAPI standard lets agents read the entire API definition—what resources and parameters are available, how to authenticate, what inputs to send, and what responses to expect ([Building an AI agent with OpenAPI: LangChain vs. Haystack]). Crafting complete, AI-ready documentation is critical ([Is Your API AI-ready? Our Guidelines and Best Practices]).
- Rich descriptions and metadata. Every endpoint and parameter should have an exhaustive description—not just a repeat of its name, but an explanation of “what this endpoint does,” “what data it expects,” and “what it returns.” Blobr, for instance, recommends describing endpoints in the context of use cases, and giving parameter schemas with format details, allowed ranges, and examples ([Is Your API AI-ready? Our Guidelines and Best Practices]). An `operationId` on each operation is also hugely valuable, as it gives both agents and developers a unique handle for invoking that operation (see the FastAPI sketch after this list).
- Clear data models. Well-structured, consistent schemas (e.g. in OpenAPI components) with reusable `$ref` definitions help agents carry data smoothly from one call to the next. Consistent naming and typing across the entire API ensure that once an agent learns one response structure, it can reuse it elsewhere. It’s best practice to centralize your data model definitions so all endpoints share the same object schemas.
- Limit response size. To prevent LLMs from “blacking out” on massive payloads, enforce pagination (e.g. 10–15 items per page) and filters. OpenAPI lets you define default page sizes or limits, so agents will naturally break large queries into smaller chunks. Standardizing pagination parameters (e.g. `page` and `limit` vs. `offset`) makes multiple endpoints uniform. Document your error responses clearly—agents rely on error codes and messages to diagnose problems (e.g. distinguishing “ID not found” from “invalid format”).
- Natural-language–friendly interfaces. Traditional APIs often require numeric IDs or domain-specific queries, which is cumbersome for LLMs. Consider adding search endpoints or text filters. For example, a CRM API might expose `find_contact?query=Jane+Doe` to let the agent locate a record by name, then issue a second call to fetch full details by ID. Likewise, allow clients to specify which fields to return and to apply category filters, sorting, or range filters on values—this granularity helps agents avoid hallucinations by fetching only the data they actually need.
- Enhanced OpenAPI (Arazzo, etc.). Enrich your OpenAPI spec with agent-focused metadata. Emerging standards like Arazzo extend OpenAPI to describe multi-step workflows and business-context flows. Even if you don’t adopt the full Arazzo spec, aim for AI-ready docs: clearly define endpoint limits, data ranges, examples, and domain semantics (i.e. “what does this parameter mean in business terms?”).
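To make the guidelines above concrete, here is a minimal FastAPI sketch. The `/contacts` resource, field names, and limits are hypothetical, and it assumes Pydantic v2; it shows rich descriptions, an explicit `operationId`, a reusable response schema, and enforced pagination:

```python
from typing import List, Optional

from fastapi import FastAPI, Query
from pydantic import BaseModel, Field

app = FastAPI(title="CRM API", version="1.0.0")

class Contact(BaseModel):
    """Reusable schema: published under components/schemas and shared via $ref."""
    id: int = Field(..., description="Internal contact ID")
    name: str = Field(..., description="Full name of the contact")
    email: Optional[str] = Field(None, description="Primary e-mail address")

class ContactPage(BaseModel):
    """Paginated envelope so agents never receive unbounded payloads."""
    items: List[Contact]
    page: int
    total: int

@app.get(
    "/contacts",
    operation_id="list_contacts",  # stable, unique handle agents can use to invoke the call
    response_model=ContactPage,
    summary="Search contacts",
    description="Returns a paginated list of CRM contacts. "
                "Use `query` for free-text search by name or e-mail.",
)
def list_contacts(
    query: Optional[str] = Query(None, description="Free-text search, e.g. 'Jane Doe'"),
    page: int = Query(1, ge=1, description="1-based page number"),
    limit: int = Query(10, ge=1, le=15, description="Items per page (max 15)"),
) -> ContactPage:
    # Placeholder body; a real implementation would query the data store here.
    return ContactPage(items=[], page=page, total=0)
```

FastAPI folds the decorator metadata and `Field` descriptions into the generated `openapi.json`, so everything an agent needs to know about the operation lives in one place.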
In summary, APIs designed for agents should be at least as well organized as human-facing APIs—but even more explicit and self-documenting. Following AI-ready guidelines ensures that agents (and orchestration libraries) can consume your system’s functions smoothly.
In conversational-agent environments, you can either embed tools/functions directly in the agent, or use a unified intermediary layer like the Model Context Protocol (MCP). Each approach has trade-offs:
- Embedded Tools/Functions. Most agent frameworks (LangChain, AutoGen, CrewAI) let you register “tools”—Python functions or classes the agent can call. In OpenAI’s function-calling paradigm, you define JSON schemas and the model invokes them automatically. This approach is quick to implement (often a few lines of code) and frameworks provide “chain-of-thought” debugging. But every LLM has its own function-calling spec (the M×N problem), so you may need separate integration code for GPT vs. Claude. And orchestration across multiple tools must be handled manually in your agent logic (a minimal tool-registration sketch follows this list).
- External Layer (MCP). Alternatively, run a separate tool-server speaking MCP. Tools register themselves with rich metadata describing their inputs, outputs, and constraints. Any MCP-compatible agent (LangChain, Claude Desktop, etc.) can discover and invoke them without extra glue code. This normalizes cross-model and cross-agent integration—adding or modifying tools happens on the server side, and agents adapt automatically. The trade-off is extra infrastructure: you need an MCP server, session affinity/scale considerations, and you must maintain your tool definitions there. MCP shines in large, multi-model, multi-agent systems where standardized integration pays off (an MCP server sketch follows the bottom-line note below).
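As a rough sketch of the embedded route, the function below is wrapped with LangChain’s `@tool` decorator and calls the (hypothetical) CRM endpoint from earlier; the base URL and field names are assumptions:

```python
import requests
from langchain_core.tools import tool

API_BASE = "http://localhost:8000"  # assumed location of the FastAPI service

@tool
def find_contact(query: str) -> dict:
    """Search CRM contacts by a free-text query such as a person's name.

    The docstring doubles as the tool description the LLM sees, so it should
    state when to call the tool and what it returns.
    """
    resp = requests.get(
        f"{API_BASE}/contacts",
        params={"query": query, "limit": 5},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

# tools=[find_contact] can then be handed to whichever agent framework you use.
```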
Bottom line: for simple prototypes (one LLM, few tools), embedding functions is easiest; for large-scale multi-agent production systems, an external MCP layer brings standardization and flexibility. Often a hybrid approach works best: embed core tools directly and expose the rest via MCP.
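For the MCP side, a minimal sketch using the official Python MCP SDK’s `FastMCP` helper might look like the following; package and transport details may differ in your setup, and the CRM endpoint is again hypothetical:

```python
import requests
from mcp.server.fastmcp import FastMCP

API_BASE = "http://localhost:8000"  # assumed location of the FastAPI service

# Stand-alone tool server: MCP-compatible agents discover the tools registered here.
mcp = FastMCP("crm-tools")

@mcp.tool()
def find_contact(query: str) -> dict:
    """Search CRM contacts by free-text query and return the first page of matches."""
    resp = requests.get(
        f"{API_BASE}/contacts",
        params={"query": query, "limit": 5},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    mcp.run()  # serves the registered tools over MCP (stdio transport by default)
```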
FastAPI is a natural choice for building AI-ready APIs. A typical integration pattern:
- Build your FastAPI server with well-documented endpoints and Pydantic models. FastAPI auto-generates `/docs` along with a complete `openapi.json`.
- Load the spec into your agent framework. LangChain’s OpenAPI toolkit (`OpenAPIToolkit`) can ingest the spec URL and expose its operations as tools. Speakeasy’s examples show agents using Haystack or LangChain to read the spec and choose endpoints based on user requests (see the spec-loading sketch after this list).
- LangChain: Use the `OpenAPIToolkit` or the community’s OpenAPI agent. You can also implement hierarchical planning: one model decides which API calls to make, another executes them—saving tokens and improving consistency.
- Haystack: Its modular pipeline (readers/writers) makes integrating FastAPI endpoints straightforward—often faster to set up than LangChain, with similar OpenAPI support.
- AutoGen (Microsoft): Deploy an AutoGen workflow from FastAPI in a few lines: FastAPI accepts the user request and kicks off a multi-agent AutoGen process. You then return or store the final result.
- CrewAI: Define tools by subclassing `BaseTool` or using built-in integrations. CrewAI teams often wrap LangChain tools under the hood. You can manually build a tool that fetches your FastAPI spec and issues requests.
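The spec-loading step is framework-agnostic. A bare-bones sketch (the service URL is an assumption) that fetches `openapi.json` and indexes its operations by `operationId`, giving a planner model a compact menu of callable endpoints, might look like this:

```python
import requests

SPEC_URL = "http://localhost:8000/openapi.json"  # assumed FastAPI service

def load_operations(spec_url: str = SPEC_URL) -> dict:
    """Fetch an OpenAPI spec and index its operations by operationId.

    The resulting mapping (operationId -> method, path, description) is the kind
    of compact summary a planner model can scan when deciding which endpoint
    to call for a given user request.
    """
    spec = requests.get(spec_url, timeout=10).json()
    operations = {}
    for path, methods in spec.get("paths", {}).items():
        for method, op in methods.items():
            if not isinstance(op, dict):  # skip path-level keys like "parameters"
                continue
            op_id = op.get("operationId", f"{method.upper()} {path}")
            operations[op_id] = {
                "method": method.upper(),
                "path": path,
                "description": op.get("description") or op.get("summary", ""),
            }
    return operations

# Example: feed the operation summaries into your planning prompt or tool registry.
# ops = load_operations()
```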
Summary: The sweet spot is pairing well-designed APIs (FastAPI + full OpenAPI) with intelligent agents—LangChain, Haystack, AutoGen, or CrewAI can all consume that spec directly. Advanced setups may use hierarchical planning or multi-step pipelines, but the core is: FastAPI gives you the spec, the agent reads it, and your system scales as features grow.