@StuMason
Created March 24, 2026 19:01
# Conversation Decision Record

Stick this in your CLAUDE.md and smoke it.

You are creating a commit that includes a Conversation Decision Record (CDR). Follow these steps exactly.

This CDR is likely the ONLY record of what happened and why. There may be no PR, no code review, no human reading the diff before or after this lands. Write accordingly — capture everything, because nothing else will.

Step 1: Gather git context

Run these in parallel:

  • `git status` (never use `-uall`)
  • `git diff` and `git diff --staged` to see all changes
  • `git log --oneline -10` for recent commit style

If there are no changes to commit, tell the user and stop.
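Step 1 can be sketched as a short script. This is a minimal illustration only — the throwaway repo and `notes.txt` exist purely so the sketch is self-contained and runnable; in practice you are already inside the working repository with real changes:

```shell
#!/bin/sh
set -e
# Throwaway repo so the sketch runs anywhere; in practice you're already in one.
cd "$(mktemp -d)" && git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "chore: initial commit"
echo "change" > notes.txt

# Step 1: gather context. The three reads are independent, so they can run in parallel.
status="$(git status --porcelain)"          # never -uall
diffs="$(git diff; git diff --staged)"      # all unstaged and staged changes
recent="$(git log --oneline -10)"           # recent commit style

# Bail out early when there is nothing to commit.
if [ -z "$status" ]; then
  echo "no changes to commit"
else
  echo "changes detected:"
  echo "$status"
fi
```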

Step 2: Write the CDR

Reflect on the conversation that led to these changes. Write a CDR file at `docs/context/YYYY-MM-DD-short-slug.md`, using today's date and a short descriptive slug derived from the changes.

Use this template as a guide — adapt it to what's relevant. Skip sections that don't apply. Keep it concise but complete — err on the side of capturing too much rather than too little.

# [Short Title Derived From Changes]

**Date:** YYYY-MM-DD

## Provenance

| Field | Value |
|---|---|
| **Model** | Full model identifier, e.g. `claude-opus-4-20250514`, `gpt-4o-2024-11-20` |
| **Model version/snapshot** | If available — the specific checkpoint or API version string |
| **Harness** | Tool that orchestrated the generation, e.g. `cursor 0.43`, `claude-code 1.2`, `aider 0.50`, `custom-agent` |
| **Harness config** | Relevant settings — temperature, system prompt version, tool use enabled, MCP servers connected |
| **Session ID** | A unique identifier for the conversation/session that produced this code. If unavailable, generate a UUID |
| **Generation timestamp** | When the code was generated (not when it was committed) |
| **Review status** | `none` · `human-approved` · `agent-reviewed` · `auto-merged` — be honest, this is the first thing checked during an incident |
| **Confidence** | `high` · `medium` · `low` — the model's self-assessed certainty. Low confidence = priority review target |
| **Triggered by** | What initiated this work — e.g. `human-request`, `agent-alpha-6233`, `ticket JIRA-123`, `cron schedule`, `alert-monitor/incident-4521` |
| **Touched areas** | e.g. `auth`, `crypto`, `input-validation`, `file-io`, `database`, `api-surface`, `config`, `infra` |
| **Test status** | `tests-added` · `tests-modified` · `tests-none` — absence of tests is itself a risk signal |
| **Dependencies introduced** | Any new packages, APIs, or services this code now relies on |
| **Dependencies reimplemented** | Any functionality this code implements that could have been a library — see note below on limitations |

> **⚠ Deps-Reimplemented limitation:** This field is self-reported by the same model that wrote the code. The model may not recognise when it has reimplemented library functionality — it often doesn't know what it doesn't know. Treat this field as best-effort. Supplement with static analysis tooling (e.g. Semgrep, custom rules) that can independently detect hand-rolled implementations of common library functions. A `none` value here does NOT mean no reimplementation occurred.

## What was asked
Capture the human's request as faithfully as possible. Not a sanitised summary — the actual intent, phrasing, and context. If the request was "this feature x is doing this, but we also want to add y" then write that. This is the only record of what was requested. Include:
- The original ask in the human's words (or the upstream agent's instruction)
- Any constraints, preferences, or context they gave
- What they explicitly did NOT ask for (if relevant — "was asked to add caching, was not asked about cache invalidation")

## What changed
Describe the actual code changes in enough detail that someone can understand the scope without reading the diff. This is NOT a repeat of the commit message — it's a narrative for someone who may never look at the code:
- Which files were created, modified, or deleted
- New endpoints, functions, classes, or modules introduced
- Behavioral changes — what does the system do differently now?
- Data model changes — new fields, tables, schemas
- Config or infrastructure changes
- Dependency changes — what was added, removed, or upgraded

## Why
The reasoning behind the approach taken. What problem were we solving? Why this solution and not another? What triggered this change — a ticket, an alert, a conversation, another agent?

## Decisions made
- **Decision 1**: Chose X because Y
- Flag any decisions where you were uncertain or chose between near-equal options. Future reviewers need to know which choices were deliberate and which were coin-flips.

## Rejected alternatives
Things considered but not done, and why.

## Prompt context
If this was an agent chain, what was the high-level goal and what did each step do? What system prompt or instructions were active? What tools or MCP servers were used during generation?

## Agent lineage
If this work was part of a delegation chain (e.g. a monitoring agent detected an issue and dispatched a coding agent), document the chain here:
- **Originator**: What kicked this off? (human, another agent, a scheduled job, an alert)
- **Chain**: e.g. `alert-agent → triage-agent → coding-agent (this session)`
- **Upstream context**: Link to or summarise the upstream trigger (incident ID, parent task, originating conversation)

Skip this section if the work was a direct human-to-agent interaction with no delegation.

## Risk surface

Flag anything that a future scanner or remediation tool should pay attention to:
- **Crypto/auth patterns used**: e.g. "HMAC-SHA256 for token validation", "bcrypt for password hashing", "hand-rolled JWT parsing"
- **Input boundaries**: Where does external input enter and how is it handled?
- **Data flow**: Does this code touch PII, secrets, credentials, or anything sensitive?
- **Assumptions**: What does this code assume about its environment that could be wrong? (e.g. "assumes TLS termination at load balancer", "assumes sanitized input from upstream service")
- **Known gaps**: Things the model wasn't asked about or explicitly punted on. Be honest — this is the most valuable field in a lights-out world.
- **Uncertain choices**: Specific implementation decisions where the model was unsure. Not the same as overall confidence — this is "I chose AES-256-GCM but AES-256-CBC might be what the existing codebase expects" or "I assumed UTF-8 input but didn't validate."

## Context
Any relevant conversation notes — links discussed, constraints mentioned, things to revisit later.

Rules for CDR content:

  • This CDR may be the ONLY artefact that explains what happened. Write it like a black box recording, not a casual note.
  • Write for a machine that needs to match this commit against a future vulnerability advisory
  • Also write for a human who's investigating an incident at 2am with no prior context
  • Capture the human's original request faithfully — paraphrasing loses signal
  • Describe what changed in enough detail that someone can understand scope without reading the diff
  • Be specific about crypto, auth, and input handling — vague descriptions are useless for remediation
  • Flag reimplemented library functionality explicitly — but acknowledge you might miss cases (see limitation note)
  • Flag uncertain decisions individually, not just overall confidence
  • `Review: none` and `Tests: tests-none` are not shameful — they're honest. Lying about review status defeats the entire purpose of this record
  • `Confidence: low` is not a failure — it's a gift to whoever triages this later
  • If the change is trivial, write a minimal CDR (just Provenance, "What changed" and "Why")
  • The commit SHA gets added after the commit is created — leave it out initially

If multiple CDRs already exist for today, append a number to the slug (e.g. `2024-01-15-auth-flow-2.md`).
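The naming and collision rules can be sketched in a few lines of portable shell — `auth-flow` here is a hypothetical slug standing in for one actually derived from the changes:

```shell
#!/bin/sh
# Build today's CDR path; if it's taken, append -2, -3, ... until it isn't.
slug="auth-flow"                 # hypothetical slug derived from the changes
dir="docs/context"
today="$(date +%F)"              # YYYY-MM-DD
path="$dir/$today-$slug.md"
n=2
while [ -e "$path" ]; do
  path="$dir/$today-$slug-$n.md"
  n=$((n + 1))
done
echo "$path"
```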

Step 3: Stage and commit

  1. Stage the code changes (specific files, not `git add -A`)
  2. Stage the CDR file
  3. Analyse all staged changes and draft a commit message following the repo's commit style
  4. Add commit trailers for machine-parseable provenance:
```
feat: add JWT validation endpoint

Descriptive commit message body as normal.

Model: claude-opus-4-20250514
Harness: cursor 0.43
Generated: 2025-11-14T10:30:00Z
Session-ID: a1b2c3d4e5f6
Review: none
Confidence: medium
Triggered-By: human-request
Tests: tests-added
Touched-Areas: auth, crypto
Deps-Reimplemented: jwt-validation
Risk-Flags: hand-rolled-crypto, external-input-parsing
CDR: docs/context/2025-11-14-jwt-validation.md
```

The trailers are structured data for automated tooling. They exist so that scanners can query across thousands of repos with git log without parsing markdown. The CDR is the full narrative. Both matter.

Key trailers explained:

  • Model: Exact model. When a model-specific advisory drops, you grep for this.
  • Harness: The orchestration tool. If a harness has a bug that produces bad code (e.g. a broken system prompt), you need to find everything it touched.
  • Generated: Timestamp of generation. Combined with Model, this lets you scope "everything produced by model X during time window Y."
  • Session-ID: Links back to the conversation or session. If a session is later found to have been corrupted (bad context, prompt injection), you can find every commit it produced.
  • Review: The single highest-value triage signal. During an incident the first question is "did anyone look at this?" none is the honest default in a lights-out world.
  • Confidence: The model's self-assessed certainty. low confidence commits go to the top of any review queue. Cheap metadata, high value.
  • Triggered-By: Causality chain. Who or what started this work? Lets you trace delegation across autonomous actors.
  • Tests: Whether tests were generated alongside the code. tests-none is a risk flag on its own in an autonomous codebase.
  • Deps-Reimplemented: The canary field. Library functionality reimplemented inline is a primary risk vector in LLM-generated code. Best-effort and self-reported — supplement with static analysis.
  • Risk-Flags: Freeform tags for anything a scanner should care about. Be generous — false positives are fine, false negatives are not.
  5. Create the commit
  6. Run `git status` to verify
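The trailer mechanics can be exercised end to end. The sketch below creates a scratch repo (only so the example is self-contained), writes one trailer-carrying commit, then reads the trailers back with `git log`'s `%(trailers)` placeholder — the same query a fleet scanner would run across real repositories:

```shell
#!/bin/sh
set -e
# Scratch repo purely for demonstration; a scanner would run the queries on real repos.
cd "$(mktemp -d)" && git init -q
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty \
  -m "feat: add JWT validation endpoint" \
  -m "Model: claude-opus-4-20250514
Review: none
Confidence: medium"

# Pull a single trailer out of every commit -- no markdown parsing needed.
models="$(git log --format='%(trailers:key=Model,valueonly)')"

# Triage query: every commit nobody looked at.
unreviewed="$(git log --format='%h %(trailers:key=Review,valueonly)' | grep ' none$' || true)"

echo "$models"
echo "$unreviewed"
```

Combined with `--since`/`--until`, the same pattern scopes "everything produced by model X during time window Y".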

Step 4: Update the CDR with commit SHA

After the commit succeeds:

  1. Get the commit SHA with `git rev-parse --short HEAD`
  2. Add a **Commit:** [sha] line to the CDR, below the Date line
  3. Amend the commit to include this update: `git commit --amend --no-edit`

Push to remote after the amend completes. If the push fails (e.g. branch protection, network error), log the failure in the CDR under Context and retry once. If it fails again, leave the commit local and surface the error.
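Step 4 can be sketched like so — again in a scratch repo so it runs anywhere; the CDR path and message are the illustrative values from earlier. One caveat worth knowing: amending rewrites the commit, so the SHA recorded in the CDR identifies the pre-amend object, not the final one:

```shell
#!/bin/sh
set -e
# Scratch repo so the sketch is self-contained; in practice you're in the working repo.
cd "$(mktemp -d)" && git init -q
mkdir -p docs/context
cdr="docs/context/2025-11-14-jwt-validation.md"   # illustrative CDR path
printf '# JWT Validation\n\n**Date:** 2025-11-14\n' > "$cdr"
git add "$cdr"
git -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "feat: add JWT validation endpoint"

# Step 4: fold the SHA back into the CDR, then amend so the record ships with it.
sha="$(git rev-parse --short HEAD)"
# Insert the Commit line directly below the Date line.
# Note: the amend below rewrites the commit, so this SHA names the pre-amend object.
awk -v sha="$sha" '{print} /^\*\*Date:\*\*/{print ""; print "**Commit:** " sha}' \
    "$cdr" > "$cdr.tmp" && mv "$cdr.tmp" "$cdr"
git add "$cdr"
git -c user.name=demo -c user.email=demo@example.com commit -q --amend --no-edit
grep "Commit:" "$cdr"
```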
