Deep Review: plugins/claude/agents-sdk-creator

Date: 2026-01-27
Reviewer: Claude Opus 4.5 (with Codex and Claude partner reviews)
Scope: Full skill directory (SKILL.md, 5 examples, 8 references, 2 scripts)

Executive Summary

The agents-sdk-creator skill is a comprehensive, well-structured guide for building Python applications with the Claude Agent SDK (claude-agent-sdk). It covers the full SDK surface area across 465 lines of main workflow, 5 graduated examples, 8 detailed references, and 2 validation scripts. Cross-referencing against the official Anthropic documentation at platform.claude.com confirms the skill's core API claims are accurate: the query() vs ClaudeSDKClient distinction, message types, hook events, permission modes, custom tools, and anti-patterns all match.

The skill's strengths are its progressive disclosure (minimal to complete examples), security-first design (hooks, permissions, sandbox anti-patterns), and the automated validation script. The main areas for improvement are: verifying the UserMessage.uuid checkpointing claim against runtime behavior, hardening the bash-based validator, covering a few missing SDK features (receive_messages(), interrupt(), McpHttpServerConfig), and adding mechanisms to stay current as the SDK evolves.

Overall quality: High. The skill is production-ready with targeted improvements.

Detailed Findings

1. API Accuracy (verified against official docs)

Correct:

query() vs ClaudeSDKClient feature matrix matches the official Python reference exactly
All 5 message types and 4 content block types match official docs
6 Python hook events and 3 TypeScript-only events correctly identified
4 permission modes accurately described
@tool decorator, create_sdk_mcp_server(), and mcp__<server>__<tool> naming convention all correct
Anti-patterns (break in receive_response(), deprecated claude_code_sdk, bypassPermissions + allowUnsandboxedCommands) all confirmed by official docs

Discrepancies found:

The official SDK overview page shows hooks passed to query() in its "Hooks" tab example, contradicting the Python reference which says hooks are ClaudeSDKClient-only. The skill follows the Python reference (correct behavior), but users may encounter this inconsistency in Anthropic's own docs.
UserMessage.uuid is used for checkpointing but is not explicitly documented in the official UserMessage type definition. The SDK is alpha (0.1.x) and docs may lag runtime. Needs runtime verification.

2. Workflow Design

The 7-phase workflow (Requirements -> Setup -> Research -> Build -> Hooks/Permissions -> Subagents -> Validation) is thorough and well-sequenced. Each phase builds on the previous, and the skill correctly makes Phase 3 (Research) optional for simple agents.

Codex's feedback: The phases are "verbose but purposeful." A "fast track" path that jumps to Phase 4 with reminders to revisit hooks/subagents later would serve experienced users. This is a valid observation — the skill optimizes for correctness over speed.

3. Examples Quality

The 5 examples form a clear progression:

Example	Lines	API	Complexity
minimal-agent	~20	`query()`	Simplest possible
standard-agent	~55	`query()`	Error handling, budget, message processing
multi-turn-agent	~60	`ClaudeSDKClient`	REPL, interrupt, streaming
complete-agent	~214	`ClaudeSDKClient`	Custom tools, hooks, subagents, structured output
checkpoint-agent	~80	`ClaudeSDKClient`	File checkpointing, try/rollback

Each example is self-contained and runnable. The checkpoint-agent uses the uv shebang pattern from CLAUDE.md. The complete-agent demonstrates every major feature in a single coherent script.

Gap: No example shows receive_messages() (all examples use receive_response()). No example demonstrates interrupt() programmatically (the multi-turn example shows it via user input but not as an automated pattern).

4. References Quality

The 8 reference files total ~2,260 lines of documentation covering:

api-reference.md (434 lines): Complete type definitions, method signatures, options fields. High quality. Missing permission_prompt_tool_name field and McpHttpServerConfig type.
patterns-guide.md (529 lines): 12 patterns from one-shot to checkpointing, plus anti-patterns. Most comprehensive file. Missing an event-driven/background-watcher pattern.
hooks-and-permissions.md (330 lines): Permission evaluation order, hook events, callback signatures, 4 common patterns (security, audit, redirect, rate limiting). Excellent clarity.
tools-reference.md (195 lines): Built-in tools table, custom tool creation, external MCP servers. Solid coverage.
subagents-reference.md (241 lines): Constraints, invocation patterns, multi-agent architectures. Good coverage of edge cases.
sessions-reference.md (170 lines): Session capture, resume, fork, pipeline, ClaudeSDKClient continuity. Clear and practical.
structured-outputs.md (183 lines): Pydantic integration, schema design tips, complex examples. Well done.
sandbox-reference.md (174 lines): Security considerations, excludedCommands vs allowUnsandboxedCommands comparison table. Strong security focus.

5. Validation Scripts

detect-sdk-context.sh (118 lines): Checks for existing agent scripts, project dependencies, and MCP config. Includes a safety check preventing accidental creation in ~/.claude/plugins/cache/. Simple but effective.

validate-agent-script.sh (372 lines): Comprehensive bash/grep-based validator checking 20+ conditions. Catches real mistakes (deprecated package, hooks with query(), missing enable_file_checkpointing, Task in subagent tools).

Weakness: The grep-based approach is brittle. It can miss multiline arguments, aliased imports, and conditionally defined code. It can also false-positive on break statements in unrelated loops. Both Codex and the Claude reviewer flagged this.

6. Comparison to Official Quickstart

The official Anthropic quickstart (platform.claude.com/docs/en/agent-sdk/quickstart) walks through building a single bug-fixing agent. The skill goes far beyond this with 5 graduated examples, 12 patterns, and comprehensive reference docs. The skill is a superset of the official getting-started material.

7. Staleness Risk

The SDK is alpha (0.1.x) and evolving. The skill has no mechanism to detect or flag when it falls behind. New hook events, permission modes, options fields, or API changes could silently make parts of the skill incorrect.

Actionable Items

Item 1: Verify `UserMessage.uuid` for checkpointing at runtime

Description: The checkpoint-agent example and patterns-guide Pattern 12 both rely on UserMessage.uuid to capture checkpoint restore points. However, the official Python SDK docs do not list uuid as a field on UserMessage. The rewind_files(user_message_uuid) method exists, implying the UUID comes from somewhere, but the docs don't say where.

Why it matters: If UserMessage.uuid doesn't exist at runtime, the entire file checkpointing workflow is broken. This is the skill's highest-risk claim.

Suggested approach:

Create a minimal test script that creates a ClaudeSDKClient with enable_file_checkpointing=True
Send a query and iterate receive_response(), printing type(msg), dir(msg), and msg for each UserMessage
If uuid exists: document it as an undocumented but functional field with a note
If uuid doesn't exist: investigate where the checkpoint UUID comes from (possibly SystemMessage.data, AssistantMessage metadata, or a different API) and update all checkpoint examples
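The first two steps could be sketched roughly as follows. This is a minimal sketch, not a verified script: describe_message and probe are hypothetical names, and the SDK calls inside probe assume that ClaudeSDKClient, ClaudeAgentOptions(enable_file_checkpointing=True), query(), and receive_response() behave the way the skill's own examples describe. The SDK import is deferred so the inspection helper can be exercised without the package installed.

```python
import asyncio


def describe_message(msg):
    """Summarize a message object without assuming any particular fields."""
    return {
        "type": type(msg).__name__,
        "has_uuid": hasattr(msg, "uuid"),
        "attrs": sorted(a for a in dir(msg) if not a.startswith("_")),
    }


async def probe(prompt="List the files in the current directory."):
    """Print what each UserMessage actually carries at runtime."""
    # Assumption: these names are exported by claude_agent_sdk as the skill states.
    from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient, UserMessage

    options = ClaudeAgentOptions(enable_file_checkpointing=True)
    async with ClaudeSDKClient(options=options) as client:
        await client.query(prompt)
        async for msg in client.receive_response():
            if isinstance(msg, UserMessage):
                print(describe_message(msg))
                print(msg)
```

Running asyncio.run(probe()) in an environment with the SDK and CLI installed should show whether has_uuid is ever True. If no UserMessage appears at all, that is itself a finding worth recording.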

Files affected: examples/checkpoint-agent.md, references/patterns-guide.md (Pattern 12), SKILL.md (Phase 4 checkpointing note)

Notes and status: ready

Item 2: Document `receive_messages()` vs `receive_response()` and `interrupt()`

Description: The skill's build phase and all 5 examples exclusively use receive_response(). The API reference documents both receive_messages() and receive_response() as ClaudeSDKClient methods, but the workflow never explains when to choose one over the other. Similarly, interrupt() is mentioned in the multi-turn example's user input handling but never explained as a programmatic pattern.

Why it matters: Developers building streaming UIs, progress monitors, or cancellation-aware agents need to understand these methods. Codex specifically flagged this gap.

Suggested approach:

Add a brief subsection to SKILL.md Phase 4 (or a callout box) contrasting the two methods:
- receive_response(): yields messages until ResultMessage — use for standard workflows
- receive_messages(): yields ALL messages including from subagents — use when you need full visibility into subagent activity
Add a note on interrupt(): "Call await client.interrupt() to stop the current task mid-execution. The client remains usable — send a new query() to continue."
Consider adding a short example pattern (Pattern 13) showing programmatic interrupt with timeout
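A programmatic interrupt with a timeout might look something like the sketch below. It is an illustration under stated assumptions rather than a tested SDK pattern: query_with_timeout and collect_response are hypothetical helper names, and FakeClient is a stand-in that only mimics the query(), receive_response(), and interrupt() methods named above. Whether cancelling a real receive_response() iteration mid-stream is safe, and whether remaining messages must be drained after interrupt(), would need to be checked against the SDK.

```python
import asyncio


async def collect_response(client):
    """Drain receive_response() to its end instead of breaking out early."""
    messages = []
    async for msg in client.receive_response():
        messages.append(msg)
    return messages


async def query_with_timeout(client, prompt, timeout_s):
    """Send a prompt; interrupt the client if no full response arrives in time.

    Returns (messages, timed_out).
    """
    await client.query(prompt)
    try:
        messages = await asyncio.wait_for(collect_response(client), timeout_s)
        return messages, False
    except asyncio.TimeoutError:
        await client.interrupt()
        return [], True


class FakeClient:
    """Hypothetical stand-in for ClaudeSDKClient, used only for illustration."""

    def __init__(self, delay_s):
        self.delay_s = delay_s
        self.interrupted = False

    async def query(self, prompt):
        self.prompt = prompt

    async def receive_response(self):
        await asyncio.sleep(self.delay_s)  # simulates the agent working
        yield "result"

    async def interrupt(self):
        self.interrupted = True
```

With a real client, the same helper would be called inside the async with ClaudeSDKClient(...) block, and a follow-up query() could be sent after a timeout since the client is described as remaining usable.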

Files affected: SKILL.md (Phase 4), optionally references/patterns-guide.md

Notes and status: ready

Item 3: Add missing API types to reference

Description: Three items present in the official docs (two types and one options field) are missing from the skill's api-reference.md:

McpHttpServerConfig: HTTP-based MCP server connection type
permission_prompt_tool_name field on ClaudeAgentOptions: names the MCP tool that handles permission prompts
CLIConnectionError: intermediate error class between ClaudeSDKError and CLINotFoundError

Why it matters: Users consulting the api-reference as their primary SDK docs will have an incomplete picture of available configuration options and error handling.

Suggested approach:

Add McpHttpServerConfig to the MCP Server Types section in api-reference.md
Add permission_prompt_tool_name: str | None = None to the ClaudeAgentOptions field listing
Add CLIConnectionError to the error types section (already partially documented since CLINotFoundError inherits from it)

Files affected: references/api-reference.md, references/tools-reference.md (MCP server section)

Notes and status: ready

Item 4: Clarify CLAUDE.md loading requirements

Description: The skill says setting_sources=["project"] loads CLAUDE.md, but the relationship between setting_sources and system_prompt preset is underspecified. The official docs indicate that CLAUDE.md content is injected as part of the Claude Code system prompt preset, meaning both settings must work together.

Why it matters: A user who sets setting_sources=["project"] without the claude_code preset (or vice versa) will not get the expected behavior and won't understand why.

Suggested approach:

In SKILL.md Phase 4 "With project context" section, add an explicit note: "Both setting_sources=["project"] AND system_prompt={"type": "preset", "preset": "claude_code"} are needed to load CLAUDE.md instructions into the agent."
In references/sessions-reference.md or references/api-reference.md, clarify the dependency
Update the validation script to check for setting_sources without the preset (and vice versa) as a warning
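The proposed validator warning could be prototyped along these lines. This is a sketch only: check_claude_md_settings is a hypothetical function name, it inspects literal keyword arguments to ClaudeAgentOptions(...) and nothing else, and it silently skips any call whose values are computed at runtime. It is written in Python for illustration even though the existing validator is a bash script.

```python
import ast

_UNKNOWN = object()  # marks a keyword value that is not a plain literal


def _literal(node):
    """Evaluate a literal AST node, or return _UNKNOWN for anything dynamic."""
    try:
        return ast.literal_eval(node)
    except (ValueError, TypeError, SyntaxError):
        return _UNKNOWN


def check_claude_md_settings(source):
    """Warn when only one of setting_sources / claude_code preset is set."""
    warnings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        # Handles both ClaudeAgentOptions(...) and module.ClaudeAgentOptions(...)
        name = getattr(node.func, "id", None) or getattr(node.func, "attr", None)
        if name != "ClaudeAgentOptions":
            continue
        kwargs = {kw.arg: _literal(kw.value) for kw in node.keywords if kw.arg}
        sources = kwargs.get("setting_sources")
        prompt = kwargs.get("system_prompt")
        if sources is _UNKNOWN or prompt is _UNKNOWN:
            continue  # computed at runtime; a static check cannot judge it
        has_project = isinstance(sources, (list, tuple)) and "project" in sources
        has_preset = isinstance(prompt, dict) and prompt.get("preset") == "claude_code"
        if has_project and not has_preset:
            warnings.append(
                f"line {node.lineno}: setting_sources includes 'project' "
                "but system_prompt is not the claude_code preset"
            )
        elif has_preset and not has_project:
            warnings.append(
                f"line {node.lineno}: claude_code preset is set "
                "but setting_sources does not include 'project'"
            )
    return warnings
```

Reporting these as warnings rather than errors matches the suggestion above, since a developer may deliberately use one setting without the other.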

Files affected: SKILL.md, references/api-reference.md, scripts/validate-agent-script.sh

Notes and status: ready

Item 5: Harden validation script with Python AST analysis

Description: The validate-agent-script.sh (372 lines) uses bash grep and echo | grep patterns to detect imports, decorator usage, dict literals, and anti-patterns. This is brittle against multiline code, aliased imports, conditional definitions, and string content that happens to match patterns.

Why it matters: False negatives give a false sense of correctness. False positives erode trust in the validator. Both Codex and the Claude reviewer flagged this. Specific known issues:

break detection can false-positive on break in unrelated loops (e.g., a for loop processing items)
Multiline allowed_tools lists may not be detected
Aliased imports (from claude_agent_sdk import query as q) evade detection

Suggested approach:

Keep the bash script as a fast "lint" pass for quick sanity checks
Create a companion Python script (validate_agent_ast.py) that uses the ast module to:
- Parse the file into an AST
- Walk imports to verify claude_agent_sdk (not claude_code_sdk)
- Find async for loops and check for break within loops that iterate over receive_response()
- Check decorator usage (@tool) and verify create_sdk_mcp_server presence
- Detect ClaudeAgentOptions keyword arguments for hooks/tools/permissions analysis
Update the validation phase (Phase 7) to recommend the Python validator as primary, bash as fallback

Files affected: New file scripts/validate_agent_ast.py, scripts/validate-agent-script.sh (keep as-is), SKILL.md (Phase 7)

Notes and status: ready

Item 6: Add "fast track" workflow path for experienced users

Description: The 7-phase workflow is thorough but can feel heavy for experienced developers building simple agents. Codex noted it's "verbose but purposeful" and suggested a fast-track option.

Why it matters: Users who already know what they want (e.g., "build a one-shot code reviewer with structured output") shouldn't need to walk through requirements gathering and research phases.

Suggested approach:

Add a "Fast Track" section near the top of SKILL.md, after the Overview, with a decision flowchart:
- "Need one-shot task? -> See minimal-agent example, skip to Phase 4"
- "Need multi-turn? -> See multi-turn-agent example, skip to Phase 4"
- "Need hooks/custom tools? -> See complete-agent example, start at Phase 4"
- "Complex/unfamiliar? -> Follow all 7 phases"
Each fast-track path links directly to the relevant example and notes which phases to revisit (e.g., "After building, review Phase 5 for security hooks and Phase 7 for validation")

Files affected: SKILL.md

Notes and status: ready

Item 7: Add event-driven / background automation pattern

Description: The 12 patterns cover one-shot, interactive, multi-agent, session, sandbox, and checkpoint workflows. Missing is an event-driven pattern where an agent watches for external triggers (file changes, queue messages, webhooks) and reacts autonomously.

Why it matters: CI/CD listeners, deployment watchers, PR review bots, and monitoring agents are common real-world use cases. Codex specifically identified this gap.

Suggested approach:

Add Pattern 13 to references/patterns-guide.md: "Event-Driven Automation Agent"
The pattern should show:
- An outer event loop (e.g., watching a directory, polling an API, reading from a queue)
- Creating a new query() call or ClaudeSDKClient session for each event
- Cost tracking across events with cumulative budget enforcement
- Graceful shutdown handling
Example use case: Watch a directory for new .py files and run lint + fix on each one
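The shape of such a pattern might resemble the sketch below. It is an assumption-laden outline rather than a proposed Pattern 13: run_event_loop and handle_event are hypothetical names, the event source is a plain asyncio.Queue, and handle_event stands in for whatever per-event query() call or ClaudeSDKClient session the real pattern would use, returning that event's cost in USD.

```python
import asyncio


async def run_event_loop(events, handle_event, budget_usd, stop):
    """Process queued events until stop is set or the budget is used up.

    events:       asyncio.Queue of pending events (e.g. new file paths)
    handle_event: async callable(event) -> cost in USD for that event
    budget_usd:   cumulative spending cap across all events
    stop:         asyncio.Event set by a signal handler for graceful shutdown
    """
    spent = 0.0
    handled = []
    while not stop.is_set() and spent < budget_usd:
        try:
            # Short timeout so the stop flag is re-checked regularly.
            event = await asyncio.wait_for(events.get(), timeout=0.1)
        except asyncio.TimeoutError:
            continue
        spent += await handle_event(event)
        handled.append(event)
    return handled, spent
```

The budget is checked before each event rather than during it, so a single expensive event can still overshoot the cap. A real pattern would likely pair this with a per-query budget option on the agent itself.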

Files affected: references/patterns-guide.md

Notes and status: ready

Item 8: Add SDK version tracking and staleness prevention

Description: The skill has no mechanism to detect or flag when it falls behind the SDK. The SDK is alpha and evolving — new hook events, options fields, or API changes could silently make parts of the skill incorrect.

Why it matters: Stale documentation that looks authoritative is worse than no documentation. Users will follow outdated patterns and get confused when they don't work.

Suggested approach:

Add a comment at the top of SKILL.md and api-reference.md indicating the SDK version the docs were last verified against
Add a "Last verified" line in the SKILL.md header: "Last verified against: claude-agent-sdk 0.1.x (2026-01-27)"
Create a simple script (scripts/check-sdk-version.sh) that:
- Runs pip show claude-agent-sdk to get the installed version
- Compares against the documented version
- Warns if they differ
Document a maintenance cadence: "Review this skill against SDK changelog when the major or minor version changes"
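The comparison logic for such a check could be as small as the sketch below. It is shown in Python for illustration even though the item proposes a shell script, and every name in it (DOCUMENTED_VERSION, major_minor, drift_warning, installed_sdk_version) is hypothetical. It assumes version strings with at least a major and minor component, and it compares only those two, in line with the suggested maintenance cadence.

```python
from importlib import metadata

# Assumption: mirrors whatever "Last verified against" value the skill records.
DOCUMENTED_VERSION = "0.1.x"


def major_minor(version):
    """Extract (major, minor) from strings like '0.1.x' or '0.2.3'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def drift_warning(installed, documented=DOCUMENTED_VERSION):
    """Return a warning string if major.minor differs, else None."""
    if major_minor(installed) != major_minor(documented):
        return (
            f"Installed claude-agent-sdk {installed} differs from the "
            f"documented {documented}; review the skill against the changelog."
        )
    return None


def installed_sdk_version(package="claude-agent-sdk"):
    """Return the installed package version, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

A caller would pass installed_sdk_version() to drift_warning() and print the result when it is not None. Patch-level differences are deliberately ignored.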

Files affected: SKILL.md, references/api-reference.md, new file scripts/check-sdk-version.sh

Notes and status: ready

Item 9: Add branding guidelines note

Description: The official SDK overview mentions branding guidelines: users can say "Claude Agent" or "{YourAgentName} Powered by Claude" but should NOT say "Claude Code" or "Claude Code Agent" when naming their agents. The skill doesn't mention this.

Why it matters: Developers building production agents need to know the branding constraints to avoid naming issues. This is a simple addition.

Suggested approach:

Add a brief note at the end of SKILL.md Phase 1 (Requirements Gathering) or Phase 4 (Build):
- "When naming your agent, follow Anthropic's branding guidelines: use 'Claude Agent', 'Claude', or '{YourAgentName} Powered by Claude'. Do not use 'Claude Code' or 'Claude Code Agent' in agent names."
Link to the official overview for full guidelines

Files affected: SKILL.md

Notes and status: ready

Item 10: Improve `break` detection in validate script

Description: The validation script's break detection is a known false-positive risk. It checks for any break statement in a file that also contains receive_response, which will flag break in completely unrelated loops.

Why it matters: False positives reduce trust in the validator and train users to ignore its warnings. This is a quick targeted fix independent of the larger AST validator effort (Item 5).

Suggested approach:

In validate-agent-script.sh, replace the current heuristic:

# Current (line ~350): checks for ANY break + ANY receive_response in same file
if echo "$CONTENT" | grep -qE '^\s+break\s*$' && echo "$CONTENT" | grep -qE 'async\s+for.*receive_response'; then

With a more targeted check that looks for break within 30 lines after an async for.*receive_response line:

# Better: check for break within the body of a receive_response loop.
# grep needs -E for \s+ to work; the flag variable avoids a pipeline that
# reports success when the file has no receive_response loop at all.
found_break=0
while IFS=: read -r line_num _; do
    end_line=$((line_num + 30))
    if echo "$CONTENT" | sed -n "${line_num},${end_line}p" | grep -qE '^\s+break\s*$'; then
        found_break=1
        break
    fi
done < <(echo "$CONTENT" | grep -nE 'async\s+for.*receive_response')
if [ "$found_break" -eq 1 ]; then

This is still imperfect (the AST validator in Item 5 is the proper fix) but reduces false positives significantly

Files affected: scripts/validate-agent-script.sh

Notes and status: ready

Summary

#	Item	Priority	Effort
1	Verify `UserMessage.uuid` for checkpointing	Critical	Small
2	Document `receive_messages()` vs `receive_response()` and `interrupt()`	High	Small
3	Add missing API types (`McpHttpServerConfig`, `permission_prompt_tool_name`, `CLIConnectionError`)	Medium	Small
4	Clarify CLAUDE.md loading requirements	Medium	Small
5	Harden validation with Python AST script	Medium	Medium
6	Add fast-track workflow path	Medium	Small
7	Add event-driven automation pattern	Low	Medium
8	Add SDK version tracking / staleness prevention	Low	Small
9	Add branding guidelines note	Low	Trivial
10	Improve `break` detection in validate script	Low	Small

External Review Sources

Codex (via pairctl, chat 12fd57ee): Flagged receive_messages()/interrupt() gap, validator brittleness, suggested fast-track path and event-driven pattern
Claude (via pairctl): Verified all 9 API claim categories, flagged UserMessage.uuid risk, McpHttpServerConfig gap, and CLAUDE.md loading clarification
Official docs: platform.claude.com/docs/en/agent-sdk/overview, /python, /quickstart — fetched and cross-referenced 2026-01-27
