Byline: Mine / Machine / Ours

Spec — 2026-02-10

Problem

Git blame tells you who last edited a line, but in human+AI collaborative coding, we need finer-grained attribution: what the human wrote, what the agent generated untouched, and what they shaped together.

Name

Byline — every piece of work gets a byline. The byline tells you whose hands shaped it.

Categories

Label	Definition
Mine	Content the human typed into chat verbatim, or manually edited in their editor between agent turns
Machine	Content generated by agent tool calls (Write/Edit) that was never modified before commit
Ours	Content that started as agent output but was reshaped by the human, OR went through multiple human↔agent revision cycles

Architecture

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│  Claude Code     │    │  Byline           │    │  Sidecar Files  │
│  Session         │───▶│  Analyzer         │───▶│  (.byline/)     │
│  Transcripts     │    │  (post-commit)    │    │                 │
│  (.jsonl)        │    │                   │    │                 │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                              │
                              ▼
                       ┌──────────────┐
                       │  git diff    │
                       │  (committed) │
                       └──────────────┘

Decisions Made

Granularity: Character/word-level (not line-level). Each line can have mixed provenance — e.g., agent wrote the function but human renamed a parameter.
Storage: Sidecar .byline/ directory with JSON files per commit. No git notes.
File types: Track everything. Use .byline-ignore for noise (lock files, build output, etc.).
Retroactive analysis: Build it. Lower fidelity for historical commits (no snapshot data), but transcript-based Mine/Machine/Ours classification still works.
Distribution: Claude Code plugin. Standalone repo at ~/repos/byline/. Install/uninstall via plugin system. Not embedded in any project — works with any Claude Code repo.
Repository: https://github.com/alexknowshtml/byline (standalone)

Data Sources (Already Available)

Session Transcripts (~/.claude/projects/{project}/{session-id}.jsonl):

Write tool calls: input.content = exact file content written
Edit tool calls: input.old_string + input.new_string = before/after
Read tool calls: tool_result.content = file state at time of read
User messages: message.content[].text = what the human typed
Timestamps: ISO 8601 on every entry
Sequencing: uuid + parentUuid for ordering
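
A minimal sketch of how these fields might be pulled out of a transcript. The entry layout assumed here (tool calls as `tool_use` blocks inside `message.content`, with `input.file_path` naming the target file) is an assumption to check against real transcripts, and `ToolEvent` and `parseTranscript` are hypothetical names. Read results (`tool_result.content`) are left out to keep the example short.

```typescript
// Hypothetical normalized event; only the input.* fields come from the list above.
type ToolEvent = {
  uuid: string;
  timestamp: string;
  tool: "Write" | "Edit" | "Read";
  filePath: string;      // assumed to live at input.file_path
  content?: string;      // Write: input.content
  oldString?: string;    // Edit: input.old_string
  newString?: string;    // Edit: input.new_string
};

const TRACKED_TOOLS = new Set(["Write", "Edit", "Read"]);

function parseTranscript(jsonl: string): ToolEvent[] {
  const events: ToolEvent[] = [];
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    let entry: any;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // skip malformed lines instead of aborting the whole session
    }
    const blocks = entry?.message?.content;
    if (!Array.isArray(blocks)) continue;
    for (const block of blocks) {
      if (block?.type !== "tool_use" || !TRACKED_TOOLS.has(block.name)) continue;
      const input = block.input ?? {};
      events.push({
        uuid: entry.uuid,
        timestamp: entry.timestamp,
        tool: block.name,
        filePath: input.file_path,
        content: input.content,
        oldString: input.old_string,
        newString: input.new_string,
      });
    }
  }
  // Order by timestamp; a fuller version would also follow uuid/parentUuid links.
  return events.sort((a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp));
}
```

Sorting by timestamp alone is the simplest ordering; entries that share a timestamp would need the `uuid`/`parentUuid` chain to break ties.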

Classification Algorithm

For each file in a commit diff:

Step 1: Find Relevant Sessions

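-- Bind :changed_file, :window_start and :window_end as parameters rather than interpolating strings.
SELECT session_id, file_path, timestamp
FROM session_tool_calls
WHERE file_path = :changed_file
  AND timestamp BETWEEN :window_start AND :window_end
ORDER BY timestamp;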

If multiple sessions touched the same file → flag for merge logic (see Multi-Session Handling).
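
For tool calls already loaded into memory, the same lookup can be sketched in TypeScript. `ToolCallRow` and `sessionsTouching` are hypothetical names that mirror the query's columns, not a confirmed schema. The window bounds are inclusive, matching SQL `BETWEEN`.

```typescript
// Hypothetical in-memory row mirroring the session_tool_calls columns.
type ToolCallRow = { sessionId: string; filePath: string; timestamp: string };

function sessionsTouching(
  rows: ToolCallRow[],
  changedFile: string,
  windowStart: string,
  windowEnd: string,
): string[] {
  const start = Date.parse(windowStart);
  const end = Date.parse(windowEnd);
  const ids = new Set<string>();
  for (const row of rows) {
    const t = Date.parse(row.timestamp);
    if (row.filePath === changedFile && t >= start && t <= end) {
      ids.add(row.sessionId);
    }
  }
  // More than one id means the file needs the multi-session merge logic.
  return Array.from(ids);
}
```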

Step 2: Build File Timeline

Walk the session transcript and extract ordered events for the file:

T1: Write(file, content_v1)         → Agent wrote content_v1
T2: Read(file) → sees content_v1    → No change (still Machine)
T3: Read(file) → sees content_v2    → Human edited between T2-T3 (content_v2 - content_v1 = Mine)
T4: Edit(file, old, new)            → Agent modified (new = Machine, unless old was Mine → Ours)
T5: [commit]                        → Final state
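
The walk above can be sketched as a small state machine that tracks the last file state the transcript can account for and records a human step whenever a Read observes something different. `TimelineEvent`, `TimelineStep` and `buildTimeline` are hypothetical names. The sketch assumes each Edit replaces the first occurrence of `oldString`, and it skips an Edit it cannot replay.

```typescript
type TimelineEvent =
  | { kind: "write"; content: string }
  | { kind: "edit"; oldString: string; newString: string }
  | { kind: "read"; observed: string };

type TimelineStep = { at: number; actor: "agent" | "human"; before: string; after: string };

function buildTimeline(events: TimelineEvent[]): TimelineStep[] {
  const steps: TimelineStep[] = [];
  let known: string | null = null; // last file state we can account for
  for (let at = 0; at < events.length; at++) {
    const ev = events[at];
    if (ev.kind === "write") {
      steps.push({ at, actor: "agent", before: known ?? "", after: ev.content });
      known = ev.content;
    } else if (ev.kind === "read") {
      // A Read that disagrees with the known state implies a manual edit in between.
      if (known !== null && known !== ev.observed) {
        steps.push({ at, actor: "human", before: known, after: ev.observed });
      }
      known = ev.observed;
    } else if (known !== null && known.includes(ev.oldString)) {
      const replacement = ev.newString;
      // Function replacer avoids "$&"-style patterns in the replacement text.
      const after = known.replace(ev.oldString, () => replacement);
      steps.push({ at, actor: "agent", before: known, after });
      known = after;
    }
  }
  return steps;
}
```

Each step carries `before` and `after` text, so the classifier can diff exactly the change that step introduced.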

Step 3: Classify Each Segment (Character-Level)

For each changed region in the commit diff, diff at word level (e.g., with Google's diff-match-patch) and classify each token; the resulting segments are stored as character ranges (line, col_start, col_end):

Machine: Tokens that match a Write/Edit tool output AND were never subsequently modified
Mine: Tokens that appear in a Read result but don't match any prior Write/Edit output — inferred as manual edits
Ours: Tokens that were written by a tool call but show differences at the next Read or at commit time
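
One possible reading of these rules, sketched for a single agent output compared against the committed text. To stay self-contained the example uses a plain longest-common-subsequence diff over whitespace-delimited tokens instead of diff-match-patch. It treats a token that replaces deleted agent text as Ours and a pure insertion as Mine; that split is an interpretation, not something the rules above spell out. `classifyTokens` is a hypothetical name.

```typescript
type Category = "mine" | "machine" | "ours";
type Labeled = { token: string; category: Category };

// Split into runs of whitespace and runs of non-whitespace so offsets are preserved.
const tokenize = (text: string): string[] => text.match(/\s+|\S+/g) ?? [];

function classifyTokens(agentText: string, committedText: string): Labeled[] {
  const a = tokenize(agentText);
  const b = tokenize(committedText);
  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
  const lcs: number[][] = Array.from({ length: a.length + 1 }, () =>
    new Array<number>(b.length + 1).fill(0),
  );
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] =
        a[i] === b[j] ? lcs[i + 1][j + 1] + 1 : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }
  const out: Labeled[] = [];
  let i = 0;
  let j = 0;
  let hunkStart = 0;       // index in `out` where the current change hunk began
  let hunkDeleted = false; // did this hunk remove any agent tokens?
  const closeHunk = (): void => {
    if (hunkDeleted) {
      // Inserted tokens that replaced agent text count as reshaped output.
      for (let k = hunkStart; k < out.length; k++) out[k].category = "ours";
    }
    hunkStart = out.length;
    hunkDeleted = false;
  };
  while (i < a.length || j < b.length) {
    if (i < a.length && j < b.length && a[i] === b[j]) {
      closeHunk();
      out.push({ token: b[j], category: "machine" });
      i++;
      j++;
      hunkStart = out.length;
    } else if (j < b.length && (i === a.length || lcs[i][j + 1] >= lcs[i + 1][j])) {
      out.push({ token: b[j], category: "mine" }); // provisional until the hunk closes
      j++;
    } else {
      hunkDeleted = true; // agent token missing from the committed text
      i++;
    }
  }
  closeHunk();
  return out;
}
```

The hunk-based rule is coarse: every insertion in a hunk that also deletes agent text becomes Ours, even when only one token was really a replacement. The quadratic table is fine for changed regions but would need a faster diff for whole large files.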

Step 4: Handle Gaps

If the human edits a file and Claude never reads it again before commit → diff last known tool-call state against committed version → differences are "Mine"
If content was typed verbatim in a user message and placed via Write → check if the Write content matches the user message text → if yes, "Mine" (human dictated it)
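
The verbatim check in the second rule could look like the sketch below. `isDictated` is a hypothetical name. The whitespace normalization and the `minLength` threshold (so that short strings do not match by accident) are assumptions, not part of the rule as stated.

```typescript
// Collapse whitespace so reflowed or re-indented text still matches.
const normalize = (s: string): string => s.replace(/\s+/g, " ").trim();

function isDictated(writeContent: string, userMessages: string[], minLength = 20): boolean {
  const needle = normalize(writeContent);
  if (needle.length < minLength) return false; // too short to attribute reliably
  return userMessages.some((message) => normalize(message).includes(needle));
}
```

Containment rather than equality lets a user message wrap the dictated text in instructions such as "put this in the README:".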

Multi-Session Handling

When two sessions touch the same file in the same commit window:

Non-overlapping ranges: Each session owns its character ranges independently
Overlapping ranges: Default to "Ours" (multiple agents + human = collaborative)
One wrote, one only read: Writing session owns attribution
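
A sketch of how the three rules might be applied to character ranges. `Claim`, `OwnedRange` and `resolveClaims` are hypothetical names, ranges are assumed to be half-open `[start, end)`, and read-only sessions are assumed to submit no claims at all, which covers the third rule.

```typescript
type Claim = { sessionId: string; start: number; end: number }; // half-open [start, end)
type OwnedRange = { start: number; end: number; sessions: string[]; forceOurs: boolean };

function resolveClaims(claims: Claim[]): OwnedRange[] {
  // Cut the file at every claim boundary so each piece has a fixed set of owners.
  const points: number[] = [];
  for (const c of claims) points.push(c.start, c.end);
  const cuts = Array.from(new Set(points)).sort((x, y) => x - y);

  const out: OwnedRange[] = [];
  for (let k = 0; k + 1 < cuts.length; k++) {
    const start = cuts[k];
    const end = cuts[k + 1];
    const sessions = Array.from(
      new Set(
        claims.filter((c) => c.start <= start && c.end >= end).map((c) => c.sessionId),
      ),
    );
    if (sessions.length === 0) continue; // nobody wrote this piece
    // One writer keeps its own attribution; several writers default to "Ours".
    out.push({ start, end, sessions, forceOurs: sessions.length > 1 });
  }
  return out;
}
```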

Detection query:

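-- Files touched by more than one session within the commit window.
-- :files_in_diff stands for one bound placeholder per file in the diff.
SELECT file_path, COUNT(DISTINCT session_id) AS session_count
FROM session_tool_calls
WHERE file_path IN (:files_in_diff)
  AND timestamp BETWEEN :window_start AND :window_end
GROUP BY file_path
HAVING COUNT(DISTINCT session_id) > 1;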

Plugin Architecture (Claude Code)

Distributed as a Claude Code plugin. Install/uninstall via standard plugin commands.

~/repos/byline/
  .claude-plugin/
    plugin.json              # Plugin manifest
  hooks/
    hooks.json               # Hook definitions using ${CLAUDE_PLUGIN_ROOT}
    post-tool-use.ts         # PostToolUse hook script
    user-prompt-submit.ts    # File snapshot hook
    session-start.sh         # Buffer initialization
    post-commit.sh           # Git post-commit trigger
    install-git-hook.sh      # Auto-install git hook on SessionStart
  skills/
    byline/SKILL.md          # /byline slash command
    byline-status/SKILL.md   # /byline-status diagnostic
  src/
    types.ts                 # Core types
    transcript-parser.ts     # Extract Write/Edit/Read from JSONL
    file-timeline.ts         # Build chronological edit history per file
    word-differ.ts           # Word-level diff + classification
    classifier.ts            # Orchestrator: commit SHA → attribution JSON
    buffer.ts                # Real-time operation buffer
    storage.ts               # .byline/ directory management
    retroactive.ts           # Historical commit analysis
    query.ts                 # Query engine over .byline/ files
    report.ts                # Report generation
  bin/
    byline.ts                # CLI entry point
  package.json
  tsconfig.json
  README.md
  .byline-ignore.default    # Default ignore patterns

Hook lifecycle:

SessionStart → initializes the operation buffer and installs the git post-commit hook
PostToolUse → logs every Write/Edit/Read with timestamps and content to .byline/session-log.jsonl
UserPromptSubmit → snapshots tracked files to detect manual edits between agent turns
Stop → finalizes session data, runs classification engine, writes sidecar JSON
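
As a sketch of the PostToolUse step, the function below turns a hook payload into one line for .byline/session-log.jsonl. It assumes the hook receives a JSON payload with `session_id`, `tool_name` and `tool_input` fields; those field names, the `HookPayload` type and `toLogLine` are assumptions to verify against the Claude Code hooks reference. Reading stdin and appending to the file are left to the caller.

```typescript
// Assumed shape of the hook payload; every field is treated as optional.
type HookPayload = {
  session_id?: string;
  tool_name?: string;
  tool_input?: {
    file_path?: string;
    content?: string;
    old_string?: string;
    new_string?: string;
  };
};

function toLogLine(rawPayload: string, now: Date): string | null {
  let payload: HookPayload | null;
  try {
    payload = JSON.parse(rawPayload) as HookPayload | null;
  } catch {
    return null; // never let a malformed payload break the agent turn
  }
  if (payload === null || typeof payload !== "object") return null;

  const tool = payload.tool_name;
  if (tool !== "Write" && tool !== "Edit" && tool !== "Read") return null;

  const input = payload.tool_input ?? {};
  if (!input.file_path) return null;

  // JSON.stringify drops undefined fields, so Read entries stay compact.
  return JSON.stringify({
    ts: now.toISOString(),
    session: payload.session_id ?? null,
    tool,
    file: input.file_path,
    content: input.content,
    old: input.old_string,
    new: input.new_string,
  });
}
```

Returning `null` for anything unexpected keeps the hook silent on tools it does not track.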

Slash commands:

/byline blame <file> — character-level attribution for a file
/byline stats [range] — Mine/Machine/Ours breakdown across commits
/byline show <sha> — byline data for a specific commit
/byline heatmap — file-level overview of who shaped what
/byline retro [range] — run retroactive analysis on historical commits

Output Format

.byline/commits/{short-sha}.json:

{
  "commit": "abc12345",
  "timestamp": "2026-02-10T14:30:00-05:00",
  "session_ids": ["uuid-1"],
  "files": {
    "src/app.tsx": {
      "summary": { "mine": 12, "machine": 45, "ours": 8 },
      "segments": [
        {
          "line": 10,
          "col_start": 0,
          "col_end": 45,
          "category": "machine",
          "source": "Write tool call at T1",
          "tool_call_id": "toolu_abc123"
        },
        {
          "line": 10,
          "col_start": 45,
          "col_end": 52,
          "category": "mine",
          "source": "Manual edit detected between T2 and T3"
        },
        {
          "line": 15,
          "col_start": 0,
          "col_end": 80,
          "category": "ours",
          "source": "Edit tool at T4, modified by human before commit"
        }
      ]
    }
  },
  "totals": {
    "mine": 12,
    "machine": 45,
    "ours": 8,
    "percent": { "mine": 18, "machine": 69, "ours": 13 }
  }
}
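
The same structure expressed as TypeScript types, with a hypothetical `computeTotals` helper that derives `totals` from the per-file summaries. The spec does not state the unit of the summary counts, so the helper only sums whatever the summaries contain. Percentages are rounded independently here and therefore need not add up to exactly 100.

```typescript
type Category = "mine" | "machine" | "ours";
type Counts = Record<Category, number>;

interface Segment {
  line: number;
  col_start: number;
  col_end: number;
  category: Category;
  source: string;
  tool_call_id?: string; // present only when a tool call produced the segment
}

interface FileByline {
  summary: Counts;
  segments: Segment[];
}

interface CommitByline {
  commit: string;
  timestamp: string;
  session_ids: string[];
  files: Record<string, FileByline>;
  totals: Counts & { percent: Counts };
}

function computeTotals(files: Record<string, FileByline>): CommitByline["totals"] {
  const sum: Counts = { mine: 0, machine: 0, ours: 0 };
  for (const path of Object.keys(files)) {
    const summary = files[path].summary;
    sum.mine += summary.mine;
    sum.machine += summary.machine;
    sum.ours += summary.ours;
  }
  const all = sum.mine + sum.machine + sum.ours;
  // Guard against empty commits so we never divide by zero.
  const pct = (n: number): number => (all === 0 ? 0 : Math.round((n / all) * 100));
  return {
    ...sum,
    percent: { mine: pct(sum.mine), machine: pct(sum.machine), ours: pct(sum.ours) },
  };
}
```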

Retroactive Analysis

src/retroactive.ts (run via /byline retro) walks git history and matches commits to session transcripts:

For each historical commit, find sessions active during the commit window
Walk transcript tool calls for files in the diff
Classify using the same Mine/Machine/Ours algorithm

Fidelity notes:

No UserPromptSubmit snapshot data → manual edits between tool calls detected only when Claude re-reads the file
Gap between last tool call and commit filled by diffing last known state vs. committed version
Lower confidence flag on segments where detection relied on inference rather than direct observation
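
A sketch of the session-to-commit matching step. The spec does not define the commit window, so this example assumes it runs from the previous commit's time (exclusive) to the commit's own time (inclusive) on a linear history. `Commit`, `SessionSpan` and `matchSessions` are hypothetical names.

```typescript
type Commit = { sha: string; committedAt: string };                        // ISO 8601
type SessionSpan = { sessionId: string; firstEvent: string; lastEvent: string };
type CommitMatch = { sha: string; sessionIds: string[] };

function matchSessions(commits: Commit[], sessions: SessionSpan[]): CommitMatch[] {
  // Oldest first, so each commit's window starts where the previous one ended.
  const ordered = [...commits].sort(
    (a, b) => Date.parse(a.committedAt) - Date.parse(b.committedAt),
  );
  let windowStart = -Infinity;
  return ordered.map((commit) => {
    const windowEnd = Date.parse(commit.committedAt);
    const sessionIds = sessions
      .filter(
        (s) => Date.parse(s.firstEvent) <= windowEnd && Date.parse(s.lastEvent) > windowStart,
      )
      .map((s) => s.sessionId);
    windowStart = windowEnd;
    return { sha: commit.sha, sessionIds };
  });
}
```

A session that spans two windows is matched to both commits; the per-file timeline then decides which of its tool calls actually contributed to each diff.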

`.byline-ignore` Defaults

package-lock.json
yarn.lock
bun.lockb
*.min.js
*.min.css
dist/
build/
node_modules/
.git/
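
A minimal matcher that is enough for the default patterns above. It handles only three forms: a directory name ending in `/`, a `*` prefix followed by a suffix, and an exact file name. It is a deliberately small subset, not full gitignore semantics, and `isIgnored` is a hypothetical name.

```typescript
function isIgnored(path: string, patterns: string[]): boolean {
  const parts = path.split("/");
  const base = path.slice(path.lastIndexOf("/") + 1);
  return patterns.some((raw) => {
    const pattern = raw.trim();
    if (pattern === "" || pattern.startsWith("#")) return false; // blank line or comment
    if (pattern.endsWith("/")) {
      // Directory pattern: matches when any parent directory has that name.
      return parts.slice(0, -1).includes(pattern.slice(0, -1));
    }
    if (pattern.startsWith("*")) {
      // Suffix glob such as *.min.js
      return base.endsWith(pattern.slice(1));
    }
    return base === pattern; // exact file name such as yarn.lock
  });
}
```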

Visualization Targets

Commit-level:

Byline summary per commit ("18% Mine, 69% Machine, 13% Ours")
Character-level blame with mixed attribution per line

Project-level:

Authorship over time (stacked area chart)
File-level heatmap (Mine/Machine/Ours concentration)
Session authorship profiles

Per-file:

VS Code gutter with byline attribution
PR review with per-hunk attribution tags

Meta/narrative:

Collaboration story for feature branches
Aggregate portfolio stats across repos

Implementation Phases

Phase 1: Transcript Parser

Parse JSONL transcripts
Extract Write/Edit/Read events per file
Build file timelines

Phase 2: Classification Engine

Word-level diff using diff-match-patch
Mine/Machine/Ours algorithm at character granularity
Handle "user message verbatim" detection
Handle "diff between tool calls" gap filling

Phase 3: Plugin & Hooks

PostToolUse hook for real-time event logging
UserPromptSubmit hook for file snapshots
Stop hook for session finalization
Slash commands for querying byline data

Phase 4: Retroactive Analysis

Historical commit walker
Session-to-commit matching
Confidence scoring for inferred classifications

Phase 5: Visualization

CLI query tools (blame, stats, show, heatmap)
File heatmap generation
Cross-repo aggregation
