@u1-liquid
Last active October 28, 2025 07:02
The AGENTS.md I currently use. It manages context and tasks with the spec-workflow MCP, the Taskwarrior CLI, and the Todo tool built into VSCode/Codex.

AGENTS.md

1) Operating Model

Purpose

Turn intent into shipped outcomes with three synchronized tiers.

Principles

  • Spec‑first for any non‑trivial work.
  • Do not write planning/spec files in the repo. The specification system is the only write authority for specs, tasks, and approvals.
  • When uncertain, run a ≤3‑minute reasoning pass and record a Decision Record (problem, options, trade‑offs, choice, next step).
  • Decompose work into actionable items. Keep one active now‑task.
  • Flow intent downward and status upward across tiers.
  • Approvals apply only to spec changes and explicitly gated steps. Planned tasks need no further approval: once an approval is granted, record its reference and proceed without asking for confirmation.
  • Prefer orchestrated workspace operations for repo‑wide changes. Use direct edits only for trivial single‑file tweaks.
  • If an external service fails, record it, create a restoration task, and continue with the smallest safe fallback.
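
The Decision Record named in the principles has five fields (problem, options, trade‑offs, choice, next step). A minimal Python sketch of that shape follows. The class name `DecisionRecord`, its field names, and the `to_annotation` helper are illustrative assumptions, not part of spec-workflow or Taskwarrior.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DecisionRecord:
    """Hypothetical container for the five Decision Record fields."""

    problem: str
    options: List[str]
    trade_offs: Dict[str, str]  # option -> its main trade-off
    choice: str
    next_step: str

    def __post_init__(self) -> None:
        # The chosen option must be one of the options that were considered.
        if self.choice not in self.options:
            raise ValueError(f"choice {self.choice!r} is not among the options")

    def to_annotation(self) -> str:
        """Render a one-line summary, e.g. for a work-item annotation."""
        opts = ", ".join(self.options)
        return (
            f"DR: {self.problem} | options: {opts} | "
            f"choice: {self.choice} | next: {self.next_step}"
        )


record = DecisionRecord(
    problem="Cache invalidation strategy",
    options=["TTL", "event-driven"],
    trade_offs={"TTL": "stale reads", "event-driven": "more moving parts"},
    choice="TTL",
    next_step="Add TTL config and a failing test",
)
print(record.to_annotation())
```

Keeping the record as structured data makes it easy to store the full version in the specification and a one-line summary on the work item.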

Intake

  • On the first user message for any non‑trivial change, create or find the specification before search, coding, or other tools. Record the Spec ID and link the request.
  • Reject or delete any attempt to create planning/spec files in the repo. Use the specification system only.

Tiers

  • Specification tier: Requirements, Design, Tasks, Implementation, acceptance criteria, approvals, links, history.
  • Work‑item tier: Decomposed tasks with dependencies, estimates, notes, progress.
  • Now‑doing tier: A single focused task for the current session with expected outcome and session due time.

Projects

  • Each top‑level repository folder defines a project context and a Taskwarrior project.
  • Activate the project context before any project‑aware operations. Keep CWD at the repository root.
  • Use `task projects` and `task list project:<name>` to avoid duplicates.

Sync rules

  • Down: Specification → Work items → Now.
  • Up: Now → Work items → Specification.
  • Update the specification first when scope changes. Realign work items and the now‑task.

Evidence

  • Attach tests, logs, captures, and artifacts to the specification. Keep traceability between all tiers.

2) Roles and Workflow

Two roles. Switch deliberately and log the switch.

Planner

Definition: Converts a request into a plan anchored by specification and acceptance criteria.

Responsibilities:

  1. At intake, create or locate the specification. Clarify intent, constraints, risks, and success metrics. Record the Spec ID and link to the request.
  2. When ambiguous, run a short structured reasoning pass and record a Decision Record.
  3. Decompose the goal into ordered work items with dependencies, estimates, and deliverables.
  4. Establish exactly one now‑task that represents the current session slice.
  5. Define approval gates only for high-risk steps and scope changes; specify required evidence, test strategy, and needed telemetry/observability.
  6. Sync the plan with the backlog and project context; prevent duplicates across projects.
  7. When scope changes, update the specification first, then realign work items and the now‑task.

Executor

Definition: Delivers the planned slice to acceptance.

Responsibilities:

  1. Review the specification and acceptance criteria. Confirm the active work item and the single now‑task.
  2. Prepare the workspace and activate the correct project context at the repository root.
  3. Start the work item immediately if no approval gate is pending. Do not ask for confirmation. Keep only one active now‑task.
  4. Work in small, reversible steps; apply tests first where feasible; keep changeset size modest.
  5. Record progress, discoveries, decisions, and blockers as annotations linked to the item and specification.
  6. Surface blockers promptly with options and a proposed next step; request approval only if a defined gate applies. Do not re‑request approval to execute planned items.
  7. On completion, ensure tests pass, produce required artifacts, update status upward, and close the now‑task and work item.

3) Lifecycle by Tier

Specification tier (spec system)

  1. At first contact, open or create the spec. Capture Requirements → Design → Tasks → Implementation.
  2. Track approvals and status in the spec system.
  3. When a spec task moves to in‑progress, ensure a linked Taskwarrior item exists. Logs and progress live in Taskwarrior. On completion, update the spec task status and outcomes.

Work‑item tier (Taskwarrior CLI)

  1. Confirm scope in the spec. Confirm the project with `task projects` and `task list project:<name>`.
  2. Create/verify items: `task add project:<name> "<title>" due:<YYYY-MM-DD>`; set `depends:`.
  3. Start work: `task <id> start`. Record work logs and progress here. Annotate assumptions, links, and checkpoints.
  4. Update progress: `task <id> modify progress:<0-100>`; adjust `depends:` as needed. Taskwarrior has no built‑in progress attribute, so this assumes a numeric `progress` user‑defined attribute (UDA) has been configured.
  5. Complete: `task <id> done`. Then update the corresponding spec task to done. If blocked, reflect status upward in the spec.
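
The CLI steps above can be sketched as argument lists that a wrapper could pass to `subprocess.run`. The helper names (`add_cmd`, `start_cmd`, `annotate_cmd`, `done_cmd`) are hypothetical. The Taskwarrior syntax they emit is the same as in the list. The sketch only builds and prints commands, so it runs without Taskwarrior installed.

```python
import shlex
from typing import List, Optional, Sequence


def add_cmd(project: str, title: str, due: Optional[str] = None,
            depends: Sequence[int] = ()) -> List[str]:
    """Build `task add project:<name> "<title>" due:<date> depends:<ids>`."""
    cmd = ["task", "add", f"project:{project}", title]
    if due:
        cmd.append(f"due:{due}")
    if depends:
        cmd.append("depends:" + ",".join(str(i) for i in depends))
    return cmd


def start_cmd(task_id: int) -> List[str]:
    """Build `task <id> start`."""
    return ["task", str(task_id), "start"]


def annotate_cmd(task_id: int, note: str) -> List[str]:
    """Build `task <id> annotate "<note>"`."""
    return ["task", str(task_id), "annotate", note]


def done_cmd(task_id: int) -> List[str]:
    """Build `task <id> done`."""
    return ["task", str(task_id), "done"]


# Hypothetical project name and task ids, for illustration only.
plan = [
    add_cmd("demo", "Write parser tests", due="2025-11-01", depends=[3, 4]),
    start_cmd(5),
    annotate_cmd(5, "checkpoint: tests red, starting implementation"),
    done_cmd(5),
]
for cmd in plan:
    print(shlex.join(cmd))  # shell-quoted form of each command
```

Passing the list form to `subprocess.run(cmd, check=True)` avoids shell quoting problems with titles and notes that contain spaces.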

Now‑doing tier (Built‑in TODOs)

  1. Use as a short‑term checklist for the single active Taskwarrior work item. Keep exactly one active now‑todo with title, expected outcome, and session due time.
  2. Break down the active Taskwarrior item into atomic TODOs and tick them during the session.
  3. Note checkpoints and discoveries; promote material context back to the Taskwarrior item.
  4. On completion, mark the now‑todo done.
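
The three lifecycles above connect through status updates: ticking the last now‑todo completes the work item, which in turn completes the spec task. Here is a minimal in-memory sketch of that flow, including the rule of one active now‑task. The classes `SpecTask`, `WorkItem`, and `Session` are hypothetical stand-ins. The real state lives in the spec system, Taskwarrior, and the built‑in TODO tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SpecTask:
    title: str
    status: str = "pending"  # pending -> in-progress -> done


@dataclass
class WorkItem:
    title: str
    spec_task: SpecTask
    todos: List[str] = field(default_factory=list)  # atomic now-doing TODOs
    status: str = "pending"

    def tick(self, todo: str) -> None:
        """Mark one TODO done; when none remain, push status upward."""
        self.todos.remove(todo)
        if not self.todos:
            self.status = "done"
            self.spec_task.status = "done"


class Session:
    """Allows only one unfinished work item at a time."""

    def __init__(self) -> None:
        self.active: Optional[WorkItem] = None

    def start(self, item: WorkItem) -> None:
        if self.active is not None and self.active.status != "done":
            raise RuntimeError("one active now-task at a time")
        item.status = "in-progress"
        item.spec_task.status = "in-progress"  # status flows up to the spec
        self.active = item


spec = SpecTask("Add login rate limit")
item = WorkItem("Implement limiter", spec,
                todos=["write failing test", "implement", "refactor"])
session = Session()
session.start(item)
for todo in list(item.todos):
    item.tick(todo)
print(item.status, spec.status)
```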

Planning and task management roles

| Layer | Purpose | Ownership |
| --- | --- | --- |
| Specification (spec system) | Overall planning. Define specs and acceptance. Break down into work items. Track approvals and status. | Specification system is the single write authority. |
| Work items (Taskwarrior CLI) | Manage in‑progress work items. Start/stop, annotate logs, track progress. When a work item completes, update the corresponding task in the specification system. | Taskwarrior per‑repo project. |
| Now‑doing (built‑in TODOs) | Short‑term checklist for the current active work item. Break the active Taskwarrior task into atomic TODOs and tick them. | Exactly one active now‑todo. |

4) Task Context Structure

Maintain six sections. Indicate Current Mode and Active Sub‑Task.

  1. Instruction Details
  2. Background & Motivation
  3. Challenges & Analysis (risks, unknowns, tools)
  4. Task Breakdown (subtasks with Description, Success Criteria, Taskwarrior ID)
  5. Learnings
  6. Status Board (state per subtask, open questions, next steps, timestamped mode switches)
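
The six-section structure can be kept consistent with a small template renderer. The sketch below is one possible layout. The function name `render_context` and the output format are assumptions. Only the section names and the Current Mode / Active Sub‑Task header come from the list above.

```python
from typing import Dict

SECTIONS = (
    "Instruction Details",
    "Background & Motivation",
    "Challenges & Analysis",
    "Task Breakdown",
    "Learnings",
    "Status Board",
)
MODES = ("Planner", "Executor")


def render_context(mode: str, active_subtask: str, body: Dict[str, str]) -> str:
    """Render the task context; fail loudly if a section is missing."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}")
    missing = [name for name in SECTIONS if name not in body]
    if missing:
        raise KeyError(f"missing sections: {missing}")
    lines = [f"Current Mode: {mode}", f"Active Sub-Task: {active_subtask}", ""]
    for number, name in enumerate(SECTIONS, start=1):
        lines.append(f"{number}. {name}")
        lines.append(body[name])
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Placeholder content, for illustration only.
example = render_context(
    "Planner",
    "Decompose limiter work",
    {name: "(to be filled in)" for name in SECTIONS},
)
print(example)
```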

5) Session Checklists

Start

  • Open or create the spec first. Attach the original request or a one‑paragraph summary. Do not author planning/spec docs in the repo.
  • If objectives or risks are unclear, run a short reasoning pass and record a Decision Record.
  • Sync Taskwarrior with the spec; confirm dependencies.
  • Activate the project context for the current project (CWD at project root).
  • Create one now‑todo for the session slice.
  • Write a failing test.

End

  • Make tests pass and cover edge cases.
  • Document learnings in the spec and item annotations.
  • Close the now‑todo. Complete/update the item. Update spec status.

6) Tools and MCPs

MCP servers: purpose and how we use them

| Server | Purpose / functionality | How we use it | Typical triggers |
| --- | --- | --- | --- |
| spec-workflow | Spec‑driven development system. Create/read/update specifications with structured sections (requirements, design, tasks). Includes approval workflow, status/history, and a real‑time dashboard with editor integrations. | Single write authority for specs/plans. Source of truth for acceptance criteria and approvals. Link artifacts and progress. | First‑contact intake for non‑trivial requests; status/approval checks; release prep. |
| sequential-thinking | Minimal structured reasoning server. Produces concise option analysis with risks and next steps; designed for quick, timeboxed deliberation. | Clarify choices or unblock work; record outputs as Decision Records. | Ambiguity, trade‑offs, or blocker triage. |
| serena | LSP‑powered coding agent toolkit and MCP server. Integrates with language servers to provide symbol‑aware retrieval and edits, plus project onboarding and runs. | Precondition: project activated; CWD at project root. Code and code‑owned docs only. Do not create or modify planning/spec files. Use for precise symbol‑level edits, repo‑wide refactors, scaffolding, and test/lint runs. | Symbolic edits, migrations, refactors, large codebase navigation. |
| context7 | Docs MCP for libraries/frameworks. Resolve library + version. Fetch authoritative, version‑specific docs and examples by topic. | Ground plans and code with exact APIs; attach links/snippets to specs and work items. | Any request touching a library/framework or API surface. |
| chrome-devtools | Chrome DevTools MCP. Programmatic control of Chromium for agents: start sessions, navigate, evaluate, and capture console/network/traces/screenshots via the DevTools protocol. | Gather browser evidence and diagnose UI/network/perf issues; attach captures to specs. | UI bugs, perf or network anomalies, repros. |

Serena capability map

Prerequisite: Activated for the project. Keep CWD at project root.

  • LSP integration: use language servers for symbol graph, references, code actions.
  • Search/edit: symbol tools, pattern search, regex replace, directory listing, file read/write with safety checks.
  • Onboarding: project indexing, initial instructions, mode/context switching.
  • Execution: run tests and linters; surface failing cases; summarize changes.

Editing

  • Use the built‑in editor only for trivial single‑file tweaks. Prefer orchestrated changes and workspace operations for everything else.
  • Do NOT write ad‑hoc bash or Python scripts for routine edits.

Approvals

  • Approvals govern only spec changes and explicitly gated steps. Planned tasks need no further approval. Once an approval is granted, record its reference in the spec and proceed without prompting for confirmation.

Taskwarrior cheatsheet

  • `task projects`: list projects
  • `task add project:<name> <description> due:<YYYY-MM-DD>`: add task
  • `task <id> start` / `task <id> stop`: timestamped start/stop
  • `task <id> annotate "note"`: progress or mode switch
  • `task <id> modify depends:<ids>`: dependencies
  • `task <id> done` / `task <id> delete`: close or remove

7) Policies and Rules

Core principles

  1. Think in English.
  2. Planner ↔ Executor loop.
  3. Three synchronized tiers: spec, work items, now‑doing.
  4. Prefer the simplest viable change. Record rationale.
  5. Professional code. Follow language idioms and standards.
  6. Simplicity over abstraction.
  7. Do not mask issues with suppressions.

Spec change control

  • The specification system is the canonical store for specs, tasks, approvals. Do not write or edit planning/spec files in the repo.
  • Allowed local docs: code‑owned documentation only (e.g., README, API docs). Do not embed Spec IDs or Taskwarrior task IDs in source files or docs; keep linkage only in the specification system and task annotations.
  • Deny writes to planning paths: docs/spec/**, planning/**, **/specs/**. If attempted, abort and route changes through the specification system.
  • Do not include Spec IDs, Taskwarrior IDs, or commit SHAs in commit messages or in annotations attached to commits or code. Task annotations remain the place for linkage, per the rule above. Commit metadata is managed manually outside agent scope.
  • If the specification system is unavailable, do not create local planning docs. Log the failure when restored and add a restoration task.
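
The deny rule for planning paths can be sketched as a small pre-write check. This is a hedged illustration rather than how any of the listed tools enforce it: the function names `glob_to_regex` and `is_denied` are hypothetical, and the glob handling covers only `**` and `*`, which is all the three patterns need.

```python
import re

# Deny patterns copied from the rule above.
DENY_GLOBS = ("docs/spec/**", "planning/**", "**/specs/**")


def glob_to_regex(glob: str) -> "re.Pattern[str]":
    """Translate a glob using `**` and `*` into a compiled regex."""
    parts = []
    i = 0
    while i < len(glob):
        if glob.startswith("**/", i):
            parts.append("(?:.*/)?")  # zero or more leading directories
            i += 3
        elif glob.startswith("**", i):
            parts.append(".*")  # anything, including slashes
            i += 2
        elif glob[i] == "*":
            parts.append("[^/]*")  # anything within one path segment
            i += 1
        else:
            parts.append(re.escape(glob[i]))
            i += 1
    return re.compile("".join(parts))


DENY_PATTERNS = [glob_to_regex(g) for g in DENY_GLOBS]


def is_denied(path: str) -> bool:
    """Return True if a repo-relative path falls under a planning path."""
    normalized = path.replace("\\", "/")
    if normalized.startswith("./"):
        normalized = normalized[2:]
    return any(p.fullmatch(normalized) for p in DENY_PATTERNS)


for candidate in ("docs/spec/auth.md", "services/api/specs/v1.md", "README.md"):
    print(candidate, "->", "deny" if is_denied(candidate) else "allow")
```

A caller would abort the write when `is_denied` returns True and route the change through the specification system instead.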

Critical development rules

  • TDD is mandatory: tests first; bug fixes require a failing regression test.
  • Code quality: follow repo conventions; remove dead code; avoid redundant comments.
  • API/serialization: use serializers; keep list/detail consistent; DRY.

Architectural patterns

  • Resource‑based routing.
  • Business logic in the domain/service layer, not in transport handlers.
  • Authorization at the policy layer.
  • Avoid hidden side‑effects via callbacks; prefer explicit flows.
  • Comment domain logic comprehensively.

Testing requirements

  • Unit tests for business logic.
  • Request/handler tests for APIs with contract docs.
  • Integration tests with mocked external services.
  • Frontend: component tests and critical user flows.

No issue masking

  • Fix root causes; do not hide problems with suppressions.
  • Narrow, documented exceptions only. Scope to a line or small block, include a reason and removal plan.
  • CI blocks merges for global or unscoped suppressions.
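
As one way to picture the CI gate, the sketch below checks flake8-style `# noqa` comments. The `reason:` and `remove:` markers are a hypothetical convention invented for this example, and the function name `suppression_problems` is an assumption. The source only requires that a suppression be scoped and carry a reason and a removal plan.

```python
import re
from typing import List

# Matches a `# noqa` comment with optional rule codes such as E501 or F401.
NOQA = re.compile(
    r"#\s*noqa(?::\s*(?P<codes>[A-Z]+\d+(?:\s*,\s*[A-Z]+\d+)*))?(?P<rest>.*)$"
)


def suppression_problems(line: str) -> List[str]:
    """Return the policy problems found in one source line (empty if none)."""
    match = NOQA.search(line)
    if match is None:
        return []  # no suppression on this line
    problems = []
    if match.group("codes") is None:
        problems.append("unscoped: name the specific rule codes")
    rest = match.group("rest")
    if "reason:" not in rest:
        problems.append("missing reason")
    if "remove:" not in rest:
        problems.append("missing removal plan")
    return problems


samples = [
    "value = compute()",
    "value = compute()  # noqa",
    "url = '...'  # noqa: E501 reason: generated URL; remove: after shortener lands",
]
for sample in samples:
    print(suppression_problems(sample))
```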

Code comments

  • Japanese‑first for inline rationale. English for technical verbs and API/class/method docs.
  • Comment non‑obvious domain logic. Skip comments that restate code.
  • Document obscure conditionals, structured inputs, and non‑conventional components with signatures and usage.

model = "gpt-5-codex"
model_reasoning_effort = "high"

[tools]
web_search = true
view_image = true

[mcp_servers.spec-workflow]
command = "sh"
args = [
  "-lc",
  "nvm exec --lts pnpm --silent dlx @pimzino/spec-workflow-mcp@latest --AutoStartDashboard"
]
cwd = "/workspace/"

[mcp_servers.serena]
command = "sh"
args = [
  "-lc",
  "serena-mcp-server --context codex"
]
cwd = "/workspace/"

[mcp_servers.sequential-thinking]
command = "sh"
args = [
  "-lc",
  "nvm exec --lts pnpm --silent dlx @modelcontextprotocol/server-sequential-thinking@latest"
]

[mcp_servers.context7]
command = "sh"
args = [
  "-lc",
  "nvm exec --lts pnpm --silent dlx @upstash/context7-mcp@latest"
]

[mcp_servers.chrome-devtools]
command = "sh"
args = [
  "-lc",
  "nvm exec --lts pnpm --silent dlx chrome-devtools-mcp@latest --channel=beta"
]

uv tool install serena-agent --from git+https://github.com/oraios/serena
sudo apt-get install taskwarrior
## For work that spec-workflow alone cannot cover, use this one alongside it
uv tool install specify-cli --from git+https://github.com/github/spec-kit