Deep Review: plugins/claude/skills/agents-sdk-creator/

Date: 2026-01-26
Reviewer: Claude Opus 4.5 + Codex (via pairctl)
Sources: Official Python Agent SDK reference (platform.claude.com), knowctl claude-code topic, web research, cross-skill analysis


Executive Summary

The agents-sdk-creator skill is a comprehensive guide for building Python applications with the Claude Agent SDK. It includes a 7-phase workflow, 4 graded examples, 7 reference documents, and 2 validation scripts. The documentation depth and structure are strong.

However, the skill contains a critical factual error about query() vs ClaudeSDKClient capabilities that propagates through multiple files and would cause agents to produce non-functional code. The official Python SDK reference states that hooks, custom tools, and interrupts are only supported by ClaudeSDKClient, but the skill claims all three work with query(). This error affects the decision guide, the flagship "complete agent" example, the tools reference, and the hooks reference.

Beyond this central issue, the skill is missing coverage of several newer SDK features (sandbox settings, plugin loading, file checkpointing), uses asyncio.run instead of the officially recommended anyio.run, and follows a different workflow structure than sibling skills in the same plugin.


Detailed Findings

Architecture & Structure

Strengths:

  • Three-tier progressive disclosure (SKILL.md -> references/ -> examples/) follows the pattern established by sibling skills
  • Validation script (validate-agent-script.sh) catches common errors like deprecated imports, missing async patterns, and incorrect API usage
  • Context detection script (detect-sdk-context.sh) smartly identifies existing SDK projects
  • Examples are graded by complexity (minimal -> standard -> complete -> multi-turn)
  • Quick reference tables in SKILL.md provide useful at-a-glance information

Structural differences from sibling skills:

  • Uses 7 phases instead of the standard 6 used by command-creator, hook-creator, skill-creator, subagent-creator, plugin-creator, and marketplace-creator
  • Sibling pattern: Requirements -> Context Detection -> Research -> Structure -> Content -> Validation
  • This skill: Requirements -> Project Setup -> Research -> Build -> Hooks/Permissions -> Subagents -> Validation
  • The 7-phase approach is reasonable given the different nature of this skill (building Python apps vs creating Claude Code config files), but the naming deviation may cause friction for agents familiar with the shared pattern

Writing Quality

  • Clear imperative style consistent with sibling skills
  • Description follows the standard trigger-phrase pattern: "This skill should be used when the user asks to..."
  • Good use of decision tables for API selection and permission modes
  • Code examples are well-commented and practical

Actionable Items

1. Fix query() vs ClaudeSDKClient capability table (CRITICAL)

Description: The comparison table in SKILL.md (and repeated in references/api-reference.md) incorrectly states that query() supports hooks ("Yes (via options)") and custom tools ("Yes (via options)"). The official Python SDK reference at platform.claude.com/docs/en/agent-sdk/python explicitly states these are NOT supported by query():

| Feature | query() | ClaudeSDKClient |
|---|---|---|
| Hooks | Not supported | Supported |
| Custom Tools | Not supported | Supported |
| Interrupts | Not supported | Supported |
| Continue Chat | New session each time | Maintains conversation |

This error propagates to:

  • SKILL.md Phase 1 decision guide (directs users to choose query() even when they need hooks/tools)
  • SKILL.md Phase 4 custom tools code snippet (shows options with MCP tools in query() context)
  • SKILL.md Phase 5 hook example (builds hooks config but pairs it with query())
  • examples/complete-agent.md — the flagship example uses query() with hooks, custom MCP tools, AND subagents simultaneously. This code would fail at runtime.
  • references/tools-reference.md — explicitly states "Both query() and ClaudeSDKClient support custom MCP tools via options" and shows a query() example with custom tools
  • references/hooks-and-permissions.md — hook examples define options but don't show ClaudeSDKClient usage
  • references/patterns-guide.md — no pattern demonstrates hooks + ClaudeSDKClient

Why it matters: This is the single most impactful error. An agent following this skill would produce code that silently fails or throws errors. The "complete agent" example — which agents would naturally copy as a best-practice template — is non-functional according to the official SDK.

Suggested approach:

  1. Update the comparison table in SKILL.md and api-reference.md to match official docs
  2. Revise the Phase 1 decision guide: "If you need hooks, custom tools, or interrupts, you MUST use ClaudeSDKClient regardless of whether the task is one-shot"
  3. Rewrite examples/complete-agent.md to use ClaudeSDKClient (it can remain a one-shot pattern: a single query/response cycle inside an async with ClaudeSDKClient() as client: block)
  4. Update tools-reference.md "Running Custom Tools" section to use ClaudeSDKClient
  5. Update hooks-and-permissions.md examples to use ClaudeSDKClient
  6. Add a new pattern to patterns-guide.md: "One-shot with hooks and custom tools" using ClaudeSDKClient
  7. Update validate-agent-script.sh to warn when hooks or custom tools are detected alongside query() usage
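
The check proposed in step 7 could look roughly like the following. This is a minimal Python sketch of the detection logic only, not the contents of validate-agent-script.sh (which is a shell script); the function name find_query_misuse and the regular expressions are assumptions made for illustration, and a text-based check like this can produce false positives.

```python
import re

# Illustrative patterns (assumptions): how each feature tends to appear in source text.
# The lookbehind skips method calls such as client.query(...), which are fine.
QUERY_CALL = re.compile(r"(?<![\w.])query\(")
CLIENT_ONLY = {
    "hooks": re.compile(r"\bhooks\s*="),
    "custom tools": re.compile(r"\bcreate_sdk_mcp_server\b"),
}


def find_query_misuse(source: str) -> list[str]:
    """Return warnings when client-only features appear alongside a bare query() call."""
    if not QUERY_CALL.search(source):
        return []
    return [
        f"{feature} detected alongside query(); use ClaudeSDKClient instead"
        for feature, pattern in CLIENT_ONLY.items()
        if pattern.search(source)
    ]
```

Excluding client.query( matters here: a script that correctly uses ClaudeSDKClient still calls a method named query, and should not be flagged.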

Notes and status: ready


2. Add missing anyio guidance (HIGH)

Description: The official Agent SDK quickstart (platform.claude.com/docs/en/agent-sdk/quickstart) uses anyio.run(main) instead of asyncio.run(main()). The skill exclusively uses asyncio.run() in all examples and the validation checklist.

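anyio is an async compatibility layer that runs the same coroutine code on either asyncio or trio, and the SDK quickstart uses it as its entry point. It does not by itself make nested event loops possible: like asyncio.run, anyio.run expects to start the loop itself, so code that already runs inside an event loop (e.g., Jupyter notebooks, some web frameworks) should await main() directly instead.

Why it matters: Scripts that diverge from the official entry point are harder to check against the docs, and the skill currently gives no guidance for environments with an existing event loop, where asyncio.run fails with RuntimeError: asyncio.run() cannot be called from a running event loop. Using anyio.run as the default matches official guidance.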

Suggested approach:

  1. Update SKILL.md Phase 2 (Project Setup) to recommend anyio as the default runner, noting asyncio.run() works in standalone scripts
  2. Update the uv script template to include anyio as a dependency:
    # /// script
    # requires-python = ">=3.10"
    # dependencies = ["claude-agent-sdk", "anyio"]
    # ///
  3. Update examples to use anyio.run(main) (note: no parentheses on main)
  4. Add a note in SKILL.md explaining when asyncio.run() is acceptable (standalone scripts with no existing event loop)
  5. Update validate-agent-script.sh to suggest anyio.run as the preferred pattern
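
To make the existing-event-loop failure concrete, here is a small stdlib-only sketch that reproduces it with asyncio.run. It deliberately avoids both anyio and the SDK so it runs anywhere; main is a placeholder coroutine, not agent code.

```python
import asyncio


async def main() -> str:
    # Placeholder for an agent entry point.
    return "done"


async def call_run_inside_loop() -> str:
    """Try to start a second event loop while one is already running."""
    coro = main()
    try:
        asyncio.run(coro)  # raises because a loop is already running in this thread
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return str(exc)
    return "no error"


# Standalone script: no loop is running yet, so asyncio.run works.
standalone_result = asyncio.run(main())

# Inside a running loop (similar to a notebook cell): the nested call fails.
nested_error = asyncio.run(call_run_inside_loop())
```

In the nested case the usual remedy is to await main() directly rather than calling a runner function.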

Notes and status: ready


3. Document sandbox settings (MEDIUM)

Description: The official SDK reference documents SandboxSettings, SandboxNetworkConfig, and SandboxIgnoreViolations as full configuration types for controlling command execution sandboxing, network isolation, and violation handling. The skill mentions sandbox in the ClaudeAgentOptions listing with a one-line example but provides no reference documentation, no patterns, and no explanation of the security implications.

Key features missing from the skill:

  • SandboxSettings: enabled, autoAllowBashIfSandboxed, excludedCommands, allowUnsandboxedCommands, network, ignoreViolations, enableWeakerNestedSandbox
  • SandboxNetworkConfig: allowLocalBinding, allowUnixSockets, allowAllUnixSockets, httpProxyPort, socksProxyPort
  • SandboxIgnoreViolations: file, network patterns
  • can_use_tool integration for unsandboxed command approval
  • Security warning about bypassPermissions + allowUnsandboxedCommands combination

Why it matters: Sandboxing is the primary safety mechanism for agents running shell commands autonomously. Users building CI/CD agents or deployment tools need this guidance to avoid security issues.

Suggested approach:

  1. Add a new reference document: references/sandbox-reference.md
  2. Cover all three types with examples
  3. Add a sandbox pattern to patterns-guide.md (e.g., "Sandboxed CI Agent")
  4. Add sandbox considerations to the SKILL.md Phase 5 (Hooks and Permissions) section
  5. Include security warnings about dangerous combinations
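
As a rough illustration of what the reference document and its security warnings might cover, the sketch below builds a sandbox configuration as a plain dict and flags the dangerous combination named above. The field names come from the list in this item; the value types, the example values, and the sandbox_warnings helper are assumptions, not SDK API.

```python
# Field names are taken from the review's list; values are illustrative only.
sandbox_settings = {
    "enabled": True,
    "autoAllowBashIfSandboxed": True,
    "excludedCommands": ["docker"],
    "allowUnsandboxedCommands": False,
    "network": {"allowLocalBinding": True},
}


def sandbox_warnings(permission_mode: str, sandbox: dict) -> list[str]:
    """Hypothetical helper: flag risky option combinations before building the agent."""
    warnings = []
    if not sandbox.get("enabled", False):
        warnings.append("sandbox is not enabled")
    if permission_mode == "bypassPermissions" and sandbox.get("allowUnsandboxedCommands", False):
        warnings.append("bypassPermissions combined with allowUnsandboxedCommands")
    return warnings
```

A check like this could run at agent start-up, or inside the validation script, so the warning is raised before any shell command is executed.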

Notes and status: ready


4. Document setting_sources behavior and defaults (MEDIUM)

Description: The official docs explain that when setting_sources is omitted or None, the SDK does NOT load any filesystem settings — no ~/.claude/settings.json, no .claude/settings.json, no .claude/settings.local.json, and critically no CLAUDE.md files. This is a deliberate isolation default. To load CLAUDE.md project instructions, users must set setting_sources=["project"].

The skill's api-reference.md mentions setting_sources and shows examples, but SKILL.md's workflow doesn't guide users through this decision. A user creating a "project-aware" agent would miss this and wonder why CLAUDE.md isn't being loaded.

Why it matters: Many agents need project context from CLAUDE.md. Without explicit guidance, users will create agents that silently ignore project instructions.

Suggested approach:

  1. Add a note in SKILL.md Phase 4 (Build the Agent) explaining the setting_sources default behavior
  2. In the project-aware pattern (patterns-guide.md Pattern 7), ensure setting_sources=["project"] is prominently shown
  3. Add to the validation checklist: "If the agent should follow CLAUDE.md, verify setting_sources includes 'project'"

Notes and status: ready


5. Add file checkpointing example and documentation (MEDIUM)

Description: The ClaudeSDKClient has a rewind_files(user_message_uuid) method that restores files to their state at a given point, and ClaudeAgentOptions has enable_file_checkpointing: bool. The api-reference.md mentions both but provides no usage example anywhere in the skill.

Why it matters: File checkpointing enables powerful "try and rollback" workflows — attempt a refactor, check if tests pass, rewind if they don't. This is a differentiating feature of the SDK vs plain CLI usage.

Suggested approach:

  1. Add a new example: examples/checkpoint-agent.md showing a try/rewind/retry pattern
  2. Or add a new pattern to patterns-guide.md: "Iterative Agent with File Checkpointing"
  3. Ensure the example shows: enable_file_checkpointing=True in options, capturing user_message_uuid from messages, calling client.rewind_files(uuid) on failure
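
The try/rewind control flow could be sketched as follows. The function is written against any object that behaves like a connected ClaudeSDKClient created with enable_file_checkpointing=True, so it does not import the SDK. Several details are assumptions: that rewind_files is awaitable, that the message to rewind to exposes its identifier as a uuid attribute, and the attempt_with_rollback and checks_pass names, which are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable


async def attempt_with_rollback(
    client,
    prompt: str,
    checks_pass: Callable[[], Awaitable[bool]],
) -> bool:
    """Send one prompt, then rewind file changes if the follow-up check fails."""
    await client.query(prompt)

    checkpoint_uuid = None
    async for message in client.receive_response():
        # Assumption: the message to rewind to carries its id as `.uuid`.
        uuid = getattr(message, "uuid", None)
        if checkpoint_uuid is None and uuid is not None:
            checkpoint_uuid = uuid  # keep the first id; let the stream finish

    if await checks_pass():
        return True
    if checkpoint_uuid is not None:
        await client.rewind_files(checkpoint_uuid)  # restore files to the checkpoint
    return False
```

A retry loop can wrap this function: call it with a revised prompt while it returns False, up to some attempt limit.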

Notes and status: ready


6. Document SdkPluginConfig for plugin loading (LOW-MEDIUM)

Description: The SDK supports loading local plugins programmatically via plugins=[{"type": "local", "path": "./my-plugin"}] in ClaudeAgentOptions. This allows SDK applications to leverage the same plugin ecosystem as the interactive CLI. The skill mentions this in a one-liner but doesn't explain what it does, when to use it, or what "local" means.

Why it matters: Plugin loading enables SDK apps to reuse skills, hooks, agents, and commands from plugins — a key integration point between the interactive and programmatic worlds.

Suggested approach:

  1. Add a brief section to api-reference.md explaining SdkPluginConfig
  2. Add a note in patterns-guide.md about when plugin loading is useful (e.g., reusing existing hooks/skills in an SDK app)
  3. Consider adding to the SKILL.md quick reference if space permits

Notes and status: ready


7. Add break-in-loops warning to patterns and examples (LOW-MEDIUM)

Description: The official SDK docs explicitly warn: "When iterating over messages, avoid using break to exit early as this can cause asyncio cleanup issues. Instead, let the iteration complete naturally or use flags to track when you've found what you need."

The skill's anti-patterns section in patterns-guide.md lists "Using break to exit message loops" but doesn't explain the asyncio cleanup consequence. Several examples implicitly encourage full iteration but don't call out this footgun.

Why it matters: An agent writing message-processing code might naturally insert break to exit early once it finds what it needs. Per the official warning, this can cause asyncio cleanup issues that are hard to trace back to the loop exit.

Suggested approach:

  1. Expand the anti-pattern entry in patterns-guide.md with the asyncio cleanup explanation
  2. Add a brief note in SKILL.md Phase 4 message handling section
  3. Show the flag-based alternative pattern explicitly
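
The flag-based alternative can be shown without the SDK. In this minimal sketch, fake_messages is a stand-in async generator and the dict message shape is invented for illustration; real SDK messages are typed objects.

```python
import asyncio
from typing import AsyncIterator, Optional


async def fake_messages() -> AsyncIterator[dict]:
    """Stand-in for an SDK message stream (shape invented for this sketch)."""
    for msg in (
        {"type": "assistant", "text": "working"},
        {"type": "result", "text": "42"},
        {"type": "assistant", "text": "trailing"},
    ):
        yield msg


async def first_result(messages: AsyncIterator[dict]) -> Optional[str]:
    """Return the first result text while still consuming the whole stream."""
    found: Optional[str] = None
    async for message in messages:
        # Record the first match, then keep iterating instead of using break.
        if found is None and message["type"] == "result":
            found = message["text"]
    return found
```

The loop does slightly more work than an early exit would, but the stream is always drained, which is the behaviour the official warning asks for.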

Notes and status: ready


8. Align workflow phases with sibling skills (LOW)

Description: All other skills in the claude plugin follow a consistent 6-phase workflow: Requirements -> Context Detection -> Research -> Structure -> Content -> Validation. The agents-sdk-creator uses 7 phases with different names: Requirements -> Project Setup -> Research -> Build -> Hooks/Permissions -> Subagents -> Validation.

The divergence is partially justified — this skill creates Python applications rather than Claude Code config files, so "Context Detection" and "Structure Creation" don't map cleanly. However, the inconsistency may confuse agents that have learned the shared workflow pattern from other skills.

Why it matters: Consistency across skills reduces cognitive load for agents and users. When all skills follow the same structure, an agent familiar with one skill can navigate others predictably.

Suggested approach: This is a judgment call. Two options:

  • Option A (Adapt): Rename phases to align closer to siblings: Phase 1 (Requirements), Phase 2 (Context Detection — existing project? dependencies?), Phase 3 (Research), Phase 4 (Structure — create project files/dirs), Phase 5 (Content — write the agent code with hooks, permissions, subagents, tools), Phase 6 (Validation). This collapses Phases 4-6 into a single "Content" phase with subsections.
  • Option B (Keep): Keep the 7-phase structure but add a note explaining why this skill diverges. The current breakdown is logical for the task at hand.

Recommend Option A for consistency, but either approach is valid.

Notes and status: ready


9. Update validation script for new findings (LOW)

Description: The validate-agent-script.sh script currently validates imports, async patterns, API usage, and message handling. It should be updated to catch the issues identified in this review:

  1. Warn when hooks or custom tools are used with query() (they require ClaudeSDKClient)
  2. Suggest anyio.run over asyncio.run
  3. Check for break statements inside message iteration loops
  4. Verify setting_sources is set when system_prompt uses the claude_code preset

Why it matters: The validation script is the last line of defense before an agent delivers code to the user. Catching these issues automatically prevents silent failures.

Suggested approach:

  1. Add a check: if file contains query( AND contains hooks= or create_sdk_mcp_server, emit a warning
  2. Add a check: if file uses asyncio.run, suggest anyio.run as the preferred alternative
  3. Add a check: if file contains break inside an async for message block, warn about asyncio cleanup
  4. Add a check: if file contains system_prompt.*preset.*claude_code but no setting_sources, suggest adding it
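
Checks 3 and 4 are the hardest to get right with plain text matching, so here is a hedged Python sketch of their logic. The real script is shell and may well use grep instead; the function names are hypothetical, and the regular expression is the one quoted in check 4.

```python
import ast
import re


def breaks_in_async_for(source: str) -> list[int]:
    """Return line numbers of `break` statements that exit an `async for` loop."""
    lines: list[int] = []

    def visit(node: ast.AST, in_async_for: bool) -> None:
        if isinstance(node, ast.Break):
            if in_async_for:
                lines.append(node.lineno)
            return
        if isinstance(node, (ast.AsyncFor, ast.For, ast.While)):
            owns_async = isinstance(node, ast.AsyncFor)
            for stmt in node.body:
                visit(stmt, owns_async)  # a break here belongs to this loop
            for stmt in node.orelse:
                visit(stmt, in_async_for)  # a break in else belongs to the outer loop
            return
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            in_async_for = False  # a new scope resets loop context
        for child in ast.iter_child_nodes(node):
            visit(child, in_async_for)

    visit(ast.parse(source), False)
    return lines


def missing_setting_sources(source: str) -> bool:
    """True when the claude_code preset is used but setting_sources never appears."""
    uses_preset = re.search(r"system_prompt.*preset.*claude_code", source, re.DOTALL)
    return uses_preset is not None and "setting_sources" not in source
```

Parsing with ast avoids flagging a break that belongs to an ordinary for or while loop nested inside the message loop, which a line-based grep would struggle to distinguish.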

Notes and status: ready


10. Add stderr callback documentation (LOW)

Description: ClaudeAgentOptions has a stderr: Callable[[str], None] | None field that provides a callback for capturing stderr output from the CLI process. The deprecated debug_stderr field is also present. Neither is explained or demonstrated in the skill.

Why it matters: Debugging SDK applications often requires inspecting stderr output. The callback pattern is the modern replacement for debug_stderr.

Suggested approach:

  1. Add a brief entry in api-reference.md explaining stderr callback
  2. Show a simple example: stderr=lambda line: print(f"[DEBUG] {line}")
  3. Note that debug_stderr is deprecated
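
Beyond the one-line lambda, a slightly more useful callback keeps the most recent lines for inspection after a failure. The sketch below only relies on the Callable[[str], None] signature quoted above; make_stderr_buffer is a hypothetical helper, and the loop at the end simulates the calls the SDK would make.

```python
from collections import deque
from typing import Callable, Deque, Tuple


def make_stderr_buffer(max_lines: int = 200) -> Tuple[Callable[[str], None], Deque[str]]:
    """Hypothetical helper: build a callback that keeps the most recent stderr lines."""
    lines: Deque[str] = deque(maxlen=max_lines)

    def on_stderr(line: str) -> None:
        lines.append(line.rstrip("\n"))

    return on_stderr, lines


on_stderr, recent = make_stderr_buffer(max_lines=2)
for text in ("first\n", "second\n", "third\n"):
    on_stderr(text)  # simulated here; in real use this would be passed as stderr=on_stderr
```

Bounding the buffer keeps memory use predictable for long-running agents while still preserving the lines closest to a crash.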

Notes and status: ready


Additional Observations (Non-Actionable)

Positive Patterns Worth Preserving

  1. Graded examples (minimal -> standard -> complete -> multi-turn) are excellent for progressive learning
  2. Context detection script prevents creation in read-only cache directories
  3. Validation script catches the most common errors (wrong package, missing async, deprecated API)
  4. Quick reference tables in SKILL.md provide fast lookup during coding
  5. Anti-patterns section in patterns-guide.md proactively prevents common mistakes
  6. uv shebang pattern follows project conventions from CLAUDE.md

Cross-Skill Lessons Applied

The skill already incorporates several patterns from sibling skills:

  • Progressive disclosure with SKILL.md -> references/ -> examples/
  • Frontmatter with name, description, argument-hint
  • Validation scripts in scripts/ directory
  • Context detection for existing projects
  • Imperative writing style

Comparison with Official Documentation Coverage

| Topic | Official Docs | Skill Coverage | Gap |
|---|---|---|---|
| query() basic usage | Yes | Yes | None |
| ClaudeSDKClient | Yes | Yes | Capability table wrong |
| Custom tools (@tool) | Yes | Yes | Uses wrong API (query()) |
| Hooks | Yes | Yes | Uses wrong API (query()) |
| Permissions | Yes | Yes | Good coverage |
| Subagents | Yes | Yes | Good coverage |
| Structured output | Yes | Yes | Good coverage |
| Sessions | Yes | Yes | Good coverage |
| Sandbox | Yes | Minimal | Needs full reference |
| Plugin loading | Yes | One-liner | Needs explanation |
| File checkpointing | Yes | API listing only | Needs example |
| setting_sources | Yes | Reference only | Needs workflow guidance |
| Tool I/O schemas | Yes | Not covered | Low priority |
| StreamEvent type | Yes | Minimal | Low priority |

Priority Summary

| # | Item | Severity | Effort |
|---|---|---|---|
| 1 | Fix query() vs ClaudeSDKClient capabilities | CRITICAL | High (7+ files) |
| 2 | Add anyio guidance | HIGH | Medium (examples + SKILL.md) |
| 3 | Document sandbox settings | MEDIUM | Medium (new reference) |
| 4 | Document setting_sources defaults | MEDIUM | Low (notes in 3 files) |
| 5 | Add file checkpointing example | MEDIUM | Low (1 new file) |
| 6 | Document SdkPluginConfig | LOW-MEDIUM | Low (brief additions) |
| 7 | Add break-in-loops warning | LOW-MEDIUM | Low (expand existing) |
| 8 | Align workflow phases | LOW | Medium (restructure) |
| 9 | Update validation script | LOW | Medium (4 new checks) |
| 10 | Add stderr callback docs | LOW | Low (brief addition) |