Git blame tells you who last edited a line, but in human+AI collaborative coding, we need finer-grained attribution: what the human wrote, what the agent generated untouched, and what they shaped together.
Byline — every piece of work gets a byline. The byline tells you whose hands shaped it.
| Label | Definition |
|---|---|
| Mine | Content the human typed into chat verbatim, or manually edited in their editor between agent turns |
| Machine | Content generated by agent tool calls (Write/Edit) that was never modified before commit |
| Ours | Content that started as agent output but was reshaped by the human, OR went through multiple human↔agent revision cycles |
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
│ Claude Code │ │ Byline │ │ Sidecar Files │
│ Session │───▶│ Analyzer │───▶│ (.byline/) │
│ Transcripts │ │ (post-commit) │ │ │
│ (.jsonl) │ │ │ │ │
└─────────────────┘ └──────────────────┘ └─────────────────┘
│
▼
┌──────────────┐
│ git diff │
│ (committed) │
└──────────────┘
- Granularity: Character/word-level (not line-level). Each line can have mixed provenance — e.g., agent wrote the function but human renamed a parameter.
- Storage: Sidecar `.byline/` directory with JSON files per commit. No git notes.
- File types: Track everything. Use `.byline-ignore` for noise (lock files, build output, etc.).
- Retroactive analysis: Build it. Lower fidelity for historical commits (no snapshot data), but transcript-based Mine/Machine/Ours classification still works.
- Distribution: Claude Code plugin. Standalone repo at `~/repos/byline/`. Install/uninstall via plugin system. Not embedded in any project; works with any Claude Code repo.
- Repository: https://github.com/alexknowshtml/byline (standalone)
Session Transcripts (~/.claude/projects/{project}/{session-id}.jsonl):
- Write tool calls: `input.content` = exact file content written
- Edit tool calls: `input.old_string` + `input.new_string` = before/after
- Read tool calls: `tool_result.content` = file state at time of read
- User messages: `message.content[].text` = what the human typed
- Timestamps: ISO 8601 on every entry
- Sequencing: `uuid` + `parentUuid` for ordering
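A minimal sketch of pulling these fields out of a transcript. The entry layout is simplified and partly assumed: each JSONL line is taken to have a `timestamp` and a `message.content[]` array whose `tool_use` items carry `name` and `input`, and the `input.file_path` key is an assumption. `TranscriptEvent`, `safeParse`, and `parseTranscript` are hypothetical names. Read results are left out for brevity, and events are kept in file order rather than re-linked through `uuid`/`parentUuid`.

```typescript
// Hypothetical event shape for this sketch; not the plugin's actual types.
type TranscriptEvent =
  | { kind: "write"; timestamp: string; file: string; content: string }
  | { kind: "edit"; timestamp: string; file: string; oldString: string; newString: string }
  | { kind: "user_text"; timestamp: string; text: string };

// Assumed (simplified) layout of one JSONL entry.
interface RawItem {
  type?: string;
  name?: string;
  text?: string;
  input?: { file_path?: string; content?: string; old_string?: string; new_string?: string };
}
interface RawEntry {
  timestamp?: string;
  message?: { role?: string; content?: RawItem[] | string };
}

// Returns null for lines that are not valid JSON objects.
function safeParse(line: string): RawEntry | null {
  try {
    const value: unknown = JSON.parse(line);
    return typeof value === "object" && value !== null ? (value as RawEntry) : null;
  } catch (_err) {
    return null;
  }
}

function parseTranscript(jsonl: string): TranscriptEvent[] {
  const events: TranscriptEvent[] = [];
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    const entry = safeParse(line);
    if (entry === null) continue;
    const message = entry.message;
    if (!message || !Array.isArray(message.content)) continue;
    const timestamp = entry.timestamp || "";
    for (const item of message.content) {
      // Human-typed text: candidate "Mine" content.
      if (item.type === "text" && message.role === "user" && typeof item.text === "string") {
        events.push({ kind: "user_text", timestamp, text: item.text });
        continue;
      }
      const input = item.input;
      if (item.type !== "tool_use" || !input || typeof input.file_path !== "string") continue;
      const file = input.file_path;
      if (item.name === "Write" && typeof input.content === "string") {
        events.push({ kind: "write", timestamp, file, content: input.content });
      } else if (
        item.name === "Edit" &&
        typeof input.old_string === "string" &&
        typeof input.new_string === "string"
      ) {
        events.push({ kind: "edit", timestamp, file, oldString: input.old_string, newString: input.new_string });
      }
    }
  }
  return events;
}
```

Malformed lines are skipped rather than treated as fatal, on the assumption that a partially written transcript should still yield whatever events it contains.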
For each file in a commit diff:
SELECT session_id, file_path, timestamp
FROM session_tool_calls
WHERE file_path = '{changed_file}'
AND timestamp BETWEEN {commit_window_start} AND {commit_window_end}

If multiple sessions touched the same file → flag for merge logic (see Multi-Session Handling).
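The same lookup can be sketched over in-memory rows, assuming the tool calls have already been loaded from the transcripts. `ToolCallRow` and `sessionsTouchingFile` are hypothetical names; the window bounds are inclusive to mirror SQL `BETWEEN`.

```typescript
// Hypothetical in-memory row mirroring the session_tool_calls columns.
interface ToolCallRow {
  sessionId: string;
  filePath: string;
  timestamp: string; // ISO 8601
}

// Returns the distinct session ids that touched `changedFile` inside the window.
function sessionsTouchingFile(
  rows: ToolCallRow[],
  changedFile: string,
  windowStart: string,
  windowEnd: string
): string[] {
  const start = Date.parse(windowStart);
  const end = Date.parse(windowEnd);
  const seen: Record<string, boolean> = {};
  const result: string[] = [];
  for (const row of rows) {
    const t = Date.parse(row.timestamp);
    if (row.filePath !== changedFile || t < start || t > end) continue;
    if (!seen[row.sessionId]) {
      seen[row.sessionId] = true;
      result.push(row.sessionId);
    }
  }
  return result;
}
```

A result with more than one id is the signal to hand the file to the multi-session merge logic.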
Walk the session transcript and extract ordered events for the file:
T1: Write(file, content_v1) → Agent wrote content_v1
T2: Read(file) → sees content_v1 → No change (still Machine)
T3: Read(file) → sees content_v2 → Human edited between T2-T3 (content_v2 - content_v1 = Mine)
T4: Edit(file, old, new) → Agent modified (new = Machine, unless old was Mine → Ours)
T5: [commit] → Final state
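The walk above can be sketched as a small state machine that tracks the last file state seen through a tool call and flags a Read whose content differs from it. `TimelineEvent`, `TimelineStep`, and `walkTimeline` are hypothetical names, and applying an Edit as a first-occurrence string replacement is an assumption of this sketch.

```typescript
type TimelineEvent =
  | { kind: "write"; content: string }
  | { kind: "read"; content: string }
  | { kind: "edit"; oldString: string; newString: string };

type Actor = "agent" | "human" | "none";

interface TimelineStep {
  index: number;
  actor: Actor; // who changed the file at this step, if anyone
  contentAfter: string;
}

function walkTimeline(events: TimelineEvent[]): TimelineStep[] {
  let known: string | null = null; // last file state observed via a tool call
  const steps: TimelineStep[] = [];
  for (let index = 0; index < events.length; index++) {
    const ev = events[index];
    let actor: Actor = "agent";
    if (ev.kind === "write") {
      known = ev.content;
    } else if (ev.kind === "read") {
      // A Read that differs from the last known state implies a manual edit.
      // The very first Read only establishes a baseline.
      actor = known !== null && ev.content !== known ? "human" : "none";
      known = ev.content;
    } else {
      const base: string = known === null ? "" : known;
      const at = base.indexOf(ev.oldString);
      known =
        at >= 0
          ? base.slice(0, at) + ev.newString + base.slice(at + ev.oldString.length)
          : base;
    }
    steps.push({ index, actor, contentAfter: known });
  }
  return steps;
}
```

Diffing `contentAfter` between a `human` step and the step before it gives the span to attribute as Mine.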
For each changed region in the commit diff, use word-level diffing (e.g., Google's diff-match-patch) to classify individual tokens:
- Machine: Tokens that match a Write/Edit tool output AND were never subsequently modified
- Mine: Tokens that appear in a Read result but don't match any prior Write/Edit output — inferred as manual edits
- Ours: Tokens that were written by a tool call but show differences at the next Read or at commit time
- If the human edits a file and Claude never reads it again before commit → diff last known tool-call state against committed version → differences are "Mine"
- If content was typed verbatim in a user message and placed via Write → check if the Write content matches the user message text → if yes, "Mine" (human dictated it)
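A minimal sketch of the token classification, under stated assumptions. It uses a plain longest-common-subsequence diff over whitespace-separated tokens instead of diff-match-patch, to stay dependency-free. It also adopts one possible reading of the rules: a committed token that replaces an agent token is Ours, while a pure insertion is Mine. `classifyTokens` and `isDictated` are hypothetical names.

```typescript
type Category = "mine" | "machine" | "ours";
interface ClassifiedToken { text: string; category: Category; }

function tokenize(text: string): string[] {
  return text.split(/\s+/).filter((t) => t.length > 0);
}

// table[i][j] = LCS length of a[i..] and b[j..].
function lcsTable(a: string[], b: string[]): number[][] {
  const table: number[][] = [];
  for (let i = 0; i <= a.length; i++) {
    const row: number[] = [];
    for (let j = 0; j <= b.length; j++) row.push(0);
    table.push(row);
  }
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      table[i][j] =
        a[i] === b[j] ? table[i + 1][j + 1] + 1 : Math.max(table[i + 1][j], table[i][j + 1]);
    }
  }
  return table;
}

// Classifies each token of the committed text against the last agent output.
function classifyTokens(agentText: string, committedText: string): ClassifiedToken[] {
  const a = tokenize(agentText);
  const b = tokenize(committedText);
  const table = lcsTable(a, b);
  const out: ClassifiedToken[] = [];
  let i = 0;
  let j = 0;
  let deletedInRun = false; // has the current changed run dropped an agent token?
  let runStart = 0; // index in `out` where the current changed run began
  while (i < a.length || j < b.length) {
    if (i < a.length && j < b.length && a[i] === b[j]) {
      out.push({ text: b[j], category: "machine" }); // untouched agent token
      i++;
      j++;
      deletedInRun = false;
      runStart = out.length;
    } else if (j < b.length && (i === a.length || table[i][j + 1] >= table[i + 1][j])) {
      out.push({ text: b[j], category: deletedInRun ? "ours" : "mine" });
      j++;
    } else {
      // An agent token is missing from the commit: inserted tokens in this
      // run are replacements, so mark them as reshaped agent output.
      i++;
      if (!deletedInRun) {
        deletedInRun = true;
        for (let k = runStart; k < out.length; k++) out[k].category = "ours";
      }
    }
  }
  return out;
}

// Verbatim check: did the human dictate the content the agent wrote?
function isDictated(writeContent: string, userMessages: string[]): boolean {
  const needle = writeContent.trim();
  return needle.length > 0 && userMessages.some((m) => m.indexOf(needle) !== -1);
}
```

When `isDictated` returns true for a Write, the tokens it produced would be relabeled from Machine to Mine before the diff against the commit.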
When two sessions touch the same file in the same commit window:
- Non-overlapping ranges: Each session owns its character ranges independently
- Overlapping ranges: Default to "Ours" (multiple agents + human = collaborative)
- One wrote, one only read: Writing session owns attribution
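These three rules can be sketched for a pair of sessions as follows. The half-open character ranges, the `wrote` flag, and the names `SessionRange`, `AttributedRange`, and `mergeRanges` are all assumptions made for illustration; the literal owner value `"ours"` stands for the collaborative category.

```typescript
interface SessionRange {
  sessionId: string;
  start: number; // inclusive character offset
  end: number; // exclusive character offset
  wrote: boolean; // false when the session only read the file
}

interface AttributedRange {
  start: number;
  end: number;
  owner: string; // a session id, or "ours" for overlapping regions
}

function mergeRanges(a: SessionRange, b: SessionRange): AttributedRange[] {
  const own = (r: SessionRange): AttributedRange => ({
    start: r.start,
    end: r.end,
    owner: r.sessionId,
  });

  // One wrote, one only read: the writing session owns attribution.
  const writers = [a, b].filter((r) => r.wrote);
  if (writers.length < 2) return writers.map(own);

  const lo = Math.max(a.start, b.start);
  const hi = Math.min(a.end, b.end);

  // Non-overlapping ranges: each session owns its range independently.
  if (lo >= hi) return [a, b].sort((x, y) => x.start - y.start).map(own);

  // Overlapping ranges: the shared part defaults to "ours".
  const out: AttributedRange[] = [];
  const first = a.start <= b.start ? a : b;
  if (first.start < lo) out.push({ start: first.start, end: lo, owner: first.sessionId });
  out.push({ start: lo, end: hi, owner: "ours" });
  const last = a.end > b.end ? a : b;
  if (hi < last.end) out.push({ start: hi, end: last.end, owner: last.sessionId });
  return out;
}
```

Only the overlapping slice is marked collaborative here; the non-shared head and tail stay with the session that wrote them, which is one way to keep overlap from inflating the Ours count.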
Detection query:
SELECT file_path, COUNT(DISTINCT session_id) as session_count
FROM session_tool_calls
WHERE file_path IN (files_in_diff)
AND timestamp BETWEEN window_start AND window_end
GROUP BY file_path
HAVING session_count > 1

Distributed as a Claude Code plugin. Install/uninstall via standard plugin commands.
~/repos/byline/
.claude-plugin/
plugin.json # Plugin manifest
hooks/
hooks.json # Hook definitions using ${CLAUDE_PLUGIN_ROOT}
skills/
byline/SKILL.md # /byline slash command
byline-status/SKILL.md # /byline-status diagnostic
src/
types.ts # Core types
transcript-parser.ts # Extract Write/Edit/Read from JSONL
file-timeline.ts # Build chronological edit history per file
word-differ.ts # Word-level diff + classification
classifier.ts # Orchestrator: commit SHA → attribution JSON
buffer.ts # Real-time operation buffer
storage.ts # .byline/ directory management
retroactive.ts # Historical commit analysis
query.ts # Query engine over .byline/ files
report.ts # Report generation
bin/
byline.ts # CLI entry point
hooks/
post-tool-use.ts # PostToolUse hook script
user-prompt-submit.ts # File snapshot hook
session-start.sh # Buffer initialization
post-commit.sh # Git post-commit trigger
install-git-hook.sh # Auto-install git hook on SessionStart
package.json
tsconfig.json
README.md
.byline-ignore.default # Default ignore patterns
Hook lifecycle:
- `PostToolUse` → logs every Write/Edit/Read with timestamps and content to `.byline/session-log.jsonl`
- `UserPromptSubmit` → snapshots tracked files to detect manual edits between agent turns
- `Stop` → finalizes session data, runs classification engine, writes sidecar JSON
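A sketch of the core of a `PostToolUse` logger, kept as a pure function so it can be exercised without a running session. The payload field names (`session_id`, `tool_name`, `tool_input`) and the log entry layout are assumptions to verify against the Claude Code hooks reference; `HookPayload` and `toLogLine` are hypothetical names.

```typescript
// Assumed subset of the JSON payload a PostToolUse hook receives.
interface HookPayload {
  session_id: string;
  tool_name: string;
  tool_input: {
    file_path?: string;
    content?: string;
    old_string?: string;
    new_string?: string;
  };
}

const TRACKED_TOOLS = ["Write", "Edit", "Read"];

// Builds one JSONL line for the session log, or null if the call is not tracked.
function toLogLine(payload: HookPayload, now: Date): string | null {
  if (TRACKED_TOOLS.indexOf(payload.tool_name) === -1) return null;
  const filePath = payload.tool_input.file_path;
  if (typeof filePath !== "string") return null;
  return JSON.stringify({
    timestamp: now.toISOString(),
    session_id: payload.session_id,
    tool: payload.tool_name,
    file_path: filePath,
    input: payload.tool_input,
  });
}
```

A real hook script would read the payload from stdin, call this function, and append any non-null result to `.byline/session-log.jsonl`.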
Slash commands:
- `/byline blame <file>`: character-level attribution for a file
- `/byline stats [range]`: Mine/Machine/Ours breakdown across commits
- `/byline show <sha>`: byline data for a specific commit
- `/byline heatmap`: file-level overview of who shaped what
- `/byline retro [range]`: run retroactive analysis on historical commits
.byline/commits/{short-sha}.json:
{
"commit": "abc12345",
"timestamp": "2026-02-10T14:30:00-05:00",
"session_ids": ["uuid-1"],
"files": {
"src/app.tsx": {
"summary": { "mine": 12, "machine": 45, "ours": 8 },
"segments": [
{
"line": 10,
"col_start": 0,
"col_end": 45,
"category": "machine",
"source": "Write tool call at T1",
"tool_call_id": "toolu_abc123"
},
{
"line": 10,
"col_start": 45,
"col_end": 52,
"category": "mine",
"source": "Manual edit detected between T2 and T3"
},
{
"line": 15,
"col_start": 0,
"col_end": 80,
"category": "ours",
"source": "Edit tool at T4, modified by human before commit"
}
]
}
},
"totals": {
"mine": 12,
"machine": 45,
"ours": 8,
"percent": { "mine": 18, "machine": 69, "ours": 13 }
}
}

`retro-analyze.js` walks git history and matches commits to session transcripts:
- For each historical commit, find sessions active during the commit window
- Walk transcript tool calls for files in the diff
- Classify using the same Mine/Machine/Ours algorithm
- Fidelity notes:
  - No `UserPromptSubmit` snapshot data → manual edits between tool calls detected only when Claude re-reads the file
  - Gap between last tool call and commit filled by diffing last known state vs. committed version
  - Lower confidence flag on segments where detection relied on inference rather than direct observation
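The session-to-commit matching step might look like the sketch below. The document does not define the commit window, so this assumes it runs from the previous commit's timestamp (exclusive) to the current commit's timestamp (inclusive) and that a session is "active" when its first-to-last event span overlaps that window. `CommitInfo`, `SessionSpan`, and `matchSessionsToCommits` are hypothetical names.

```typescript
interface CommitInfo {
  sha: string;
  committedAt: string; // ISO 8601
}

interface SessionSpan {
  sessionId: string;
  firstEvent: string; // ISO 8601 timestamp of the first transcript entry
  lastEvent: string; // ISO 8601 timestamp of the last transcript entry
}

// Maps each commit sha to the ids of sessions active during its window.
function matchSessionsToCommits(
  commits: CommitInfo[],
  sessions: SessionSpan[]
): Record<string, string[]> {
  const ordered = commits
    .slice()
    .sort((x, y) => Date.parse(x.committedAt) - Date.parse(y.committedAt));
  const result: Record<string, string[]> = {};
  let previousCommit = -Infinity; // the first commit has an open-ended window
  for (const commit of ordered) {
    const windowStart = previousCommit;
    const windowEnd = Date.parse(commit.committedAt);
    result[commit.sha] = sessions
      .filter(
        (s) => Date.parse(s.firstEvent) <= windowEnd && Date.parse(s.lastEvent) > windowStart
      )
      .map((s) => s.sessionId);
    previousCommit = windowEnd;
  }
  return result;
}
```

A session that straddles a commit boundary is matched to both commits; which tool calls count toward each commit would then be decided per file by timestamp.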
package-lock.json
yarn.lock
bun.lockb
*.min.js
*.min.css
dist/
build/
node_modules/
.git/
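A sketch of how patterns like these could be matched against a repo-relative path. It handles only the three shapes that appear in the list (exact file names, `*` suffix globs, and directory names ending in `/`) and is not a full gitignore implementation; `isIgnored` is a hypothetical name.

```typescript
// Returns true when `path` (repo-relative, "/"-separated) matches any pattern.
function isIgnored(path: string, patterns: string[]): boolean {
  const segments = path.split("/");
  const base = segments[segments.length - 1];
  const dirs = segments.slice(0, -1);
  for (const raw of patterns) {
    const pattern = raw.trim();
    if (pattern === "" || pattern.charAt(0) === "#") continue; // blank line or comment
    if (pattern.charAt(pattern.length - 1) === "/") {
      // Directory pattern: matches when any parent directory has this name.
      if (dirs.indexOf(pattern.slice(0, -1)) !== -1) return true;
    } else if (pattern.charAt(0) === "*") {
      // Suffix glob such as "*.min.js": compare against the end of the file name.
      const suffix = pattern.slice(1);
      if (base.length >= suffix.length && base.slice(base.length - suffix.length) === suffix) {
        return true;
      }
    } else if (base === pattern) {
      return true; // exact file name
    }
  }
  return false;
}
```

Files that match would be dropped from the diff before classification, so lock files and build output never contribute to the totals.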
Commit-level:
- Byline summary per commit ("18% Mine, 69% Machine, 13% Ours")
- Character-level blame with mixed attribution per line
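The commit summary line could be derived from sidecar segments roughly as follows. This sketch assumes the counts are characters (`col_end - col_start`) and rounds each percentage independently, so the three values may not always sum to exactly 100. `summarize`, `toPercent`, and `bylineSummary` are hypothetical names.

```typescript
type Category = "mine" | "machine" | "ours";

interface Segment {
  line: number;
  col_start: number;
  col_end: number;
  category: Category;
}

interface Counts {
  mine: number;
  machine: number;
  ours: number;
}

// Sums character counts per category across all segments.
function summarize(segments: Segment[]): Counts {
  const counts: Counts = { mine: 0, machine: 0, ours: 0 };
  for (const seg of segments) {
    counts[seg.category] += seg.col_end - seg.col_start;
  }
  return counts;
}

// Converts raw counts to whole-number percentages (each rounded independently).
function toPercent(counts: Counts): Counts {
  const total = counts.mine + counts.machine + counts.ours;
  if (total === 0) return { mine: 0, machine: 0, ours: 0 };
  return {
    mine: Math.round((100 * counts.mine) / total),
    machine: Math.round((100 * counts.machine) / total),
    ours: Math.round((100 * counts.ours) / total),
  };
}

function bylineSummary(segments: Segment[]): string {
  const p = toPercent(summarize(segments));
  return p.mine + "% Mine, " + p.machine + "% Machine, " + p.ours + "% Ours";
}
```

If the percentages must always add up to 100, a largest-remainder allocation could replace the independent rounding.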
Project-level:
- Authorship over time (stacked area chart)
- File-level heatmap (Mine/Machine/Ours concentration)
- Session authorship profiles
Per-file:
- VS Code gutter with byline attribution
- PR review with per-hunk attribution tags
Meta/narrative:
- Collaboration story for feature branches
- Aggregate portfolio stats across repos
Phase 1: Transcript Parser
- Parse JSONL transcripts
- Extract Write/Edit/Read events per file
- Build file timelines
Phase 2: Classification Engine
- Word-level diff using diff-match-patch
- Mine/Machine/Ours algorithm at character granularity
- Handle "user message verbatim" detection
- Handle "diff between tool calls" gap filling
Phase 3: Plugin & Hooks
- PostToolUse hook for real-time event logging
- UserPromptSubmit hook for file snapshots
- Stop hook for session finalization
- Slash commands for querying byline data
Phase 4: Retroactive Analysis
- Historical commit walker
- Session-to-commit matching
- Confidence scoring for inferred classifications
Phase 5: Visualization
- CLI query tools (blame, stats, show, heatmap)
- File heatmap generation
- Cross-repo aggregation