Agent-First Thinking: Design tools as if you're helping a smart colleague, not programming a machine. If it's confusing for a human, it's confusing for an agent.
Consolidate Operations: One `schedule_meeting` tool beats separate `list_users`, `check_availability`, and `create_event` tools. Agents prefer fewer, more capable tools.
Context Efficiency: Agents have limited context. Default to concise outputs and offer a "detailed" mode when needed; implement pagination for large result sets (see the sketch below).
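A minimal sketch of what concise-by-default output with pagination can look like, in TypeScript; the `Ticket` shape, field names, and limits here are hypothetical, not from any MCP SDK:

```typescript
// Hypothetical ticket shape for illustration.
interface Ticket {
  id: string;
  title: string;
  status: string;
  body: string;
}

interface RenderOptions {
  format?: "summary" | "detailed"; // concise by default
  limit?: number;                  // cap results to protect context
  offset?: number;                 // resume point for pagination
}

function renderTickets(tickets: Ticket[], opts: RenderOptions = {}): string {
  const { format = "summary", limit = 50, offset = 0 } = opts;
  const page = tickets.slice(offset, offset + limit);
  const lines = page.map((t) =>
    format === "detailed"
      ? `[${t.id}] ${t.title} (${t.status})\n${t.body}`
      : `[${t.id}] ${t.title} (${t.status})`
  );
  // Tell the agent more results exist instead of silently truncating.
  const remaining = tickets.length - offset - page.length;
  if (remaining > 0) {
    lines.push(`...plus ${remaining} more; call again with offset=${offset + limit}.`);
  }
  return lines.join("\n");
}
```

The key move is the trailing hint: rather than silently dropping results, the tool tells the agent how much is left and how to fetch it.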
Design guidelines:
- Tool names: Clear and semantic (`create_task`, not `ct`)
- Descriptions: 3-4 sentences minimum, covering what the tool does, when to use it, and its limitations
- Parameters: Unambiguous names (`user_id`, not `user`; `start_date`, not `date`)
- Write as if explaining the tool to a new team member
```json
{
  "name": "search_tickets",
  "description": "Search support tickets by status, priority, or customer. Returns up to 50 results by default. Use 'detailed' format for full ticket content or 'summary' for quick overview.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "query": {"type": "string", "description": "Search terms"},
      "status": {"type": "string", "enum": ["open", "closed", "pending"]},
      "format": {"type": "string", "enum": ["summary", "detailed"], "default": "summary"}
    },
    "required": ["query"]
  }
}
```

Avoid:
- Vague descriptions ("manages files")
- Overlapping functionality between tools
- Too many granular operations
- Cryptic parameter names
- No output format control
Prefer:
- Clear, specific descriptions with limitations
- Distinct tool purposes
- Consolidated operations (see the sketch after this list)
- Self-explanatory parameter names
- Flexible output formats
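To illustrate the consolidated, self-explanatory style, here is a hedged sketch of a `schedule_meeting` definition as a TypeScript object; the behavior it describes (attendee resolution, slot search) and all parameter names are invented for illustration, not a real API:

```typescript
// Hypothetical consolidated tool: one schedule_meeting call replaces
// list_users + check_availability + create_event.
const scheduleMeetingTool = {
  name: "schedule_meeting",
  description:
    "Schedule a meeting with one or more attendees. Resolves attendee names " +
    "to calendar accounts, finds the first free slot in the requested window, " +
    "and creates the event. Returns a specific error if an attendee cannot " +
    "be resolved or no slot is free.",
  inputSchema: {
    type: "object",
    properties: {
      attendees: {
        type: "array",
        items: { type: "string" },
        description: "Attendee names or email addresses",
      },
      duration_minutes: { type: "number", description: "Meeting length in minutes" },
      earliest_start: {
        type: "string",
        description: "ISO 8601 timestamp; search for slots from this point onward",
      },
    },
    required: ["attendees", "duration_minutes"],
  },
};
```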
Protocol requirements:
- Tool names must match `^[a-zA-Z0-9_-]{1,64}$`
- `inputSchema` must be valid JSON Schema Draft 2020-12
- Tool failures are reported in the result with `isError: true`
- Version header: `MCP-Protocol-Version: 2025-06-18`
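A short TypeScript sketch of enforcing two of these requirements; the helper names are mine, and the result shape is the minimal text-content form with the `isError` flag:

```typescript
// Tool names must match the MCP pattern.
const TOOL_NAME_PATTERN = /^[a-zA-Z0-9_-]{1,64}$/;

// Minimal result shape following the isError convention.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Return tool-level errors in-band so the conversation can continue.
function errorResult(message: string): ToolResult {
  return { content: [{ type: "text", text: message }], isError: true };
}

// Reject invalid names at registration time, not at call time.
function validateToolName(name: string): void {
  if (!TOOL_NAME_PATTERN.test(name)) {
    throw new Error(`Invalid tool name "${name}": must match ${TOOL_NAME_PATTERN}`);
  }
}
```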
Error handling:
- Validate inputs before execution
- Return specific, actionable error messages
- Don't break conversation flow on errors
- Help agents understand what went wrong
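For instance, a sketch of a hypothetical handler for the `search_tickets` tool above, contrasting a vague error with an actionable one:

```typescript
type TextContent = { type: "text"; text: string };
type ToolResult = { content: TextContent[]; isError?: boolean };

function handleSearchTickets(args: { query: string; status?: string }): ToolResult {
  const allowed = ["open", "closed", "pending"];
  if (args.status !== undefined && !allowed.includes(args.status)) {
    // Vague: "Error: bad request" leaves the agent guessing.
    // Actionable: name the parameter, the bad value, and the valid options.
    return {
      isError: true,
      content: [{
        type: "text",
        text: `Unknown status "${args.status}". Valid values: ${allowed.join(", ")}.`,
      }],
    };
  }
  // ...run the search and format the results...
  return { content: [{ type: "text", text: "No matching tickets." }] };
}
```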
Security:
- Validate and sanitize all inputs
- Implement proper authentication
- Use read-only defaults
- Rate-limit requests on the server side
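As one way to rate-limit on the server side, a minimal fixed-window limiter; the window, cap, and in-process `Map` are illustrative assumptions (a production server would usually use a shared store):

```typescript
const WINDOW_MS = 60_000; // one-minute window (arbitrary)
const MAX_CALLS = 30;     // per-client cap (arbitrary)

const calls = new Map<string, { windowStart: number; count: number }>();

// Returns true if the client is still within its budget for this window.
function allowCall(clientId: string, now = Date.now()): boolean {
  const entry = calls.get(clientId);
  if (!entry || now - entry.windowStart >= WINDOW_MS) {
    calls.set(clientId, { windowStart: now, count: 1 });
    return true;
  }
  entry.count += 1;
  return entry.count <= MAX_CALLS;
}
```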
For each tool, ask:
- Would this be intuitive for a human colleague?
- Is the description complete enough to use without examples?
- Can parameters be understood without domain knowledge?
- Does it handle common error cases gracefully?
- Is the output format appropriate for the use case?
Red flags:
- Description under 2 sentences
- Generic parameter names (`data`, `options`, `config`)
- No error handling
- Overlaps with other tools
- Always returns full datasets
Testing:
- Test with real scenarios: Complex, multi-step tasks that mirror actual usage
- Monitor tool selection errors: When agents pick wrong tools or wrong parameters
- Check context usage: Are tools wasting tokens on irrelevant data? (see the sketch after this list)
- Validate error recovery: Can agents adapt when tools fail?
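One cheap context-usage check for tests: enforce a token budget on summary output. The four-characters-per-token heuristic below is a rough assumption, not a real tokenizer:

```typescript
// Rough heuristic: ~4 characters per token for English text.
function approxTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Fail fast in tests when a "summary" response blows the budget.
function assertSummaryBudget(response: string, budgetTokens = 500): void {
  const tokens = approxTokens(response);
  if (tokens > budgetTokens) {
    throw new Error(
      `Summary is ~${tokens} tokens (budget ${budgetTokens}); ` +
      `tighten defaults or paginate.`
    );
  }
}
```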
Prioritize fixes:
- Fix descriptions first: Biggest impact for least effort
- Consolidate overlapping tools: Reduces agent confusion
- Add output format controls: Improves context efficiency
- Improve error messages: Enables better agent recovery
- Optimize for common workflows: Based on actual usage patterns
Use this checklist to evaluate and improve existing MCP tools. Focus on the biggest problems first: usually descriptions and parameter clarity.