
@tamakiii
Last active June 18, 2025 05:37
DuckDB-based tool for analyzing Claude Code conversations with MCP server integration.

Claude Code Conversation Analysis

A DuckDB-based tool with MCP (Model Context Protocol) server integration for analyzing Claude Code conversations directly within Claude, enabling powerful SQL queries on your conversation history.

Overview

This tool serves two primary purposes:

  1. MCP Server Integration: Use the MotherDuck MCP server to query conversation data directly within Claude
  2. Standalone Analysis: Load and analyze Claude Code conversation data from ~/.claude/projects/ using DuckDB

The MCP integration allows Claude to directly query your conversation database, enabling meta-analysis and self-improvement based on past interactions.

Quick Start

Using with Claude (MCP Server)

# 1. Set up the database
make setup

# 2. Load conversation data
make @load

# 3. Launch Claude with MCP server
make claude

Standalone Analysis

# 1. Verify dependencies are installed
make check

# 2. Initialize the DuckDB database
make setup

# 3. Interactively select and load conversation files
make @load

# 4. Explore loaded tables
make show-tables

# 5. Run SQL analyses directly
make duckdb

# 6. Clean up when done
make teardown

Installation

Prerequisites

# Install DuckDB
brew install duckdb  # macOS
# or visit https://duckdb.org/docs/installation/

# Install fzf for interactive file selection
brew install fzf     # macOS
# or visit https://github.com/junegunn/fzf#installation

# For MCP server integration (optional)
# The MotherDuck MCP server is launched with uvx, which ships with uv
pip install uv   # or: brew install uv

Usage Guide

MCP Server Integration

The primary feature of this tool is enabling Claude to query your conversation history:

  1. Initialize and Load Data:

    make setup    # Create database
    make @load    # Select and load conversation files
  2. Launch Claude with MCP:

    make claude   # Opens Claude with the MotherDuck MCP server
  3. Query from Within Claude: Once connected, Claude can execute SQL queries on your conversation data directly, enabling:

    • Self-analysis of conversation patterns
    • Tool usage optimization recommendations
    • Identification of repetitive tasks
    • Performance metrics analysis

Standalone Usage

For direct database interaction outside of Claude:

  1. Check Dependencies: Ensure DuckDB and fzf are available

    make check
  2. Database Setup: Create a new DuckDB database

    make setup
  3. Load Conversations: Select conversation files interactively

    make @load
    # Use fzf to select projects from ~/.claude/projects/
  4. Analyze Data: Connect to DuckDB for SQL queries

    make duckdb
    # Run SQL queries in the DuckDB CLI
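
Under the hood, the load step boils down to a single DuckDB statement (sketched here with a placeholder project directory; make @load derives the table name from the path you select):

```sql
-- Load every .jsonl conversation file in a project directory into one table.
-- 'my_project' and the path are placeholders; the Makefile sanitizes the
-- selected directory name into a valid table name.
CREATE TABLE my_project AS
SELECT *
FROM read_json_auto('/Users/you/.claude/projects/my_project/*.jsonl');
```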

Available Commands

Command            Description
make check         Verify required dependencies
make setup         Initialize DuckDB database
make @load         Interactively load conversation files
make show-tables   List all loaded conversation tables
make duckdb        Open DuckDB CLI for direct analysis
make claude        Launch Claude with MCP server integration
make teardown      Remove database files

Data Schema

Message Structure

Each conversation message contains:

Field       Type     Description
id          STRING   Unique message identifier
type        STRING   Message type (user/assistant/system)
message     JSON     Full message content and metadata
timestamp   STRING   ISO 8601 timestamp
parent_id   STRING   Parent message ID (for branching)

Extracted Fields

The tool automatically extracts commonly used fields:

-- View available columns
DESCRIBE table_name;

-- Common fields include:
-- - type: 'user', 'assistant', 'system'
-- - timestamp: When the message was created
-- - message: JSON containing full message data
-- - parent_id: For conversation threading
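
For example, to sample a few rows and pull a value out of the message JSON (names follow the schema above):

```sql
-- Peek at the first few messages and extract the content field
SELECT
  type,
  timestamp,
  JSON_EXTRACT_STRING(message, '$.content') AS content
FROM conversation_table
ORDER BY timestamp
LIMIT 5;
```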

Analysis Examples

Basic Statistics

-- Total messages by type
SELECT type, COUNT(*) as count
FROM conversation_table
GROUP BY type
ORDER BY count DESC;

-- Messages per day
SELECT timestamp::timestamp::date as day,
       COUNT(*) as messages
FROM conversation_table
GROUP BY day
ORDER BY day;

Tool Usage Analysis

-- Most frequently used tools
SELECT 
  JSON_EXTRACT_STRING(tool, '$.name') as tool_name,
  COUNT(*) as usage_count
FROM (
  SELECT UNNEST(JSON_EXTRACT(message, '$.tool_uses')) as tool
  FROM conversation_table
  WHERE type = 'assistant'
) tool_uses
GROUP BY tool_name
ORDER BY usage_count DESC;

-- Tool execution success rate
SELECT 
  CASE 
    WHEN JSON_EXTRACT_STRING(message, '$.content') LIKE '%error%' THEN 'error'
    ELSE 'success'
  END as result,
  COUNT(*) as count
FROM conversation_table
WHERE type = 'user' 
  AND JSON_EXTRACT_STRING(message, '$.content') LIKE '%tool_result%'
GROUP BY result;

Conversation Flow

-- Message chains and depth
WITH RECURSIVE message_chain AS (
  SELECT id, parent_id, type, 0 as depth
  FROM conversation_table
  WHERE parent_id IS NULL
  
  UNION ALL
  
  SELECT c.id, c.parent_id, c.type, mc.depth + 1
  FROM conversation_table c
  JOIN message_chain mc ON c.parent_id = mc.id
)
SELECT depth, COUNT(*) as messages_at_depth
FROM message_chain
GROUP BY depth
ORDER BY depth;

-- Conversation branches
SELECT parent_id, COUNT(*) as branches
FROM conversation_table
WHERE parent_id IS NOT NULL
GROUP BY parent_id
HAVING COUNT(*) > 1
ORDER BY branches DESC;

Performance Metrics

-- Response time analysis (approximate)
WITH response_times AS (
  SELECT 
    LAG(timestamp::timestamp) OVER (ORDER BY timestamp) as prev_time,
    timestamp::timestamp as curr_time,
    type
  FROM conversation_table
)
SELECT 
  AVG(EXTRACT(EPOCH FROM (curr_time - prev_time))) as avg_response_seconds
FROM response_times
WHERE type = 'assistant' 
  AND prev_time IS NOT NULL;

-- Session duration
SELECT 
  MIN(timestamp::timestamp) as session_start,
  MAX(timestamp::timestamp) as session_end,
  EXTRACT(EPOCH FROM (MAX(timestamp::timestamp) - MIN(timestamp::timestamp)))/60 as duration_minutes,
  COUNT(*) as total_messages
FROM conversation_table;

Common Patterns & Insights

1. Tool Result Dominance

Analysis often reveals that 80-90% of "user" messages are actually tool results, not direct user input:

-- Identify actual user input vs tool results
SELECT 
  CASE 
    WHEN JSON_EXTRACT_STRING(message, '$.content') LIKE '%tool_result%' THEN 'tool_result'
    ELSE 'user_input'
  END as message_type,
  COUNT(*) as count,
  ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER(), 2) as percentage
FROM conversation_table
WHERE type = 'user'
GROUP BY message_type;

2. Repetitive Operations

Identify commonly repeated sequences:

-- Find repeated tool usage patterns
WITH tool_sequences AS (
  SELECT 
    JSON_EXTRACT_STRING(tool, '$.name') as tool_name,
    LAG(JSON_EXTRACT_STRING(tool, '$.name')) OVER (ORDER BY timestamp) as prev_tool
  FROM (
    SELECT timestamp, UNNEST(JSON_EXTRACT(message, '$.tool_uses')) as tool
    FROM conversation_table
    WHERE type = 'assistant'
  ) t
)
SELECT 
  prev_tool || ' -> ' || tool_name as sequence,
  COUNT(*) as occurrences
FROM tool_sequences
WHERE prev_tool IS NOT NULL
GROUP BY sequence
ORDER BY occurrences DESC
LIMIT 10;

3. Error Patterns

Track common errors and failures:

-- Error frequency by tool
SELECT 
  JSON_EXTRACT_STRING(message, '$.tool_name') as tool,
  COUNT(*) as error_count
FROM conversation_table
WHERE type = 'user' 
  AND JSON_EXTRACT_STRING(message, '$.is_error') = 'true'
GROUP BY tool
ORDER BY error_count DESC;

Advanced Usage

Export Analysis Results

-- Export to CSV
COPY (
  SELECT type, COUNT(*) as count
  FROM conversation_table
  GROUP BY type
) TO 'message_types.csv' (HEADER, DELIMITER ',');

-- Export to Parquet for further analysis
COPY conversation_table TO 'conversations.parquet' (FORMAT PARQUET);

Custom Views

-- Create a simplified view
CREATE VIEW conversation_summary AS
SELECT 
  id,
  type,
  timestamp::timestamp as ts,
  CASE 
    WHEN type = 'user' AND JSON_EXTRACT_STRING(message, '$.content') LIKE '%tool_result%' 
    THEN 'tool_result'
    WHEN type = 'user' THEN 'user_input'
    ELSE type
  END as message_category,
  LENGTH(JSON_EXTRACT_STRING(message, '$.content')) as content_length
FROM conversation_table;

-- Use the view
SELECT message_category, AVG(content_length) as avg_length
FROM conversation_summary
GROUP BY message_category;

Optimization Strategies

Based on typical analysis findings:

  1. Reduce Tool Output Verbosity

    • Many tool results contain redundant information
    • Consider filtering or summarizing large outputs
  2. Batch Similar Operations

    • Identify sequences that could be combined
    • Group related file operations
  3. Improve Context Retention

    • Analyze where context is lost and operations repeated
    • Track which information needs persistent storage
  4. Optimize Response Length

    • Measure assistant response lengths vs utility
    • Identify opportunities for more concise responses

MCP Server Benefits

The Model Context Protocol (MCP) integration provides unique capabilities:

  1. Self-Reflection: Claude can analyze its own conversation patterns and suggest improvements
  2. Context-Aware Responses: Query past conversations to maintain consistency
  3. Performance Tracking: Real-time analysis of tool usage and efficiency
  4. Pattern Recognition: Identify recurring issues or requests across projects
  5. Workflow Optimization: Data-driven suggestions for better development practices

Example queries Claude can run via MCP:

-- Find most common error patterns in my responses
SELECT tool_name, error_pattern, COUNT(*) as occurrences
FROM conversation_data
WHERE is_error = true
GROUP BY tool_name, error_pattern
ORDER BY occurrences DESC;

-- Analyze my response efficiency
SELECT 
  timestamp::date as day,
  AVG(response_length) as avg_length,
  COUNT(DISTINCT session_id) as sessions
FROM conversation_data
WHERE type = 'assistant'
GROUP BY day;

Files & Structure

makefiles/claude/code/conversation/
├── Makefile          # Core automation commands
├── README.md         # This documentation
├── mcp.json          # MotherDuck MCP server configuration
└── database.duckdb   # DuckDB database (created after setup)

Key Files

  • mcp.json: Pre-configured MotherDuck MCP server settings that connect to the local DuckDB database
  • Makefile: Provides all automation commands for setup, loading, and analysis
  • database.duckdb: The DuckDB database containing all loaded conversation data (created by make setup)

Troubleshooting

Common Issues

  1. No conversation files found

    • Ensure Claude Code has been used and conversations exist
    • Check ~/.claude/projects/ directory permissions
  2. DuckDB connection errors

    • Verify DuckDB is installed: which duckdb
    • Check database file permissions
  3. JSON parsing errors

    • Some conversation files may have malformed JSON
    • Use make @load to select specific files
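
When a project contains malformed lines, DuckDB's read_json_auto can be told to skip them rather than fail the whole load (ignore_errors is a standard read_json option; path and table name are placeholders):

```sql
-- Skip individual malformed JSON lines instead of aborting the load
CREATE TABLE my_project AS
SELECT *
FROM read_json_auto('/path/to/project/*.jsonl', ignore_errors = true);
```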

Debug Commands

# List available conversation files
find ~/.claude/projects -name "*.jsonl" -type f

# Check DuckDB version
duckdb --version

# Verify database exists
ls -la database.duckdb

Contributing

Improvements welcome! Consider adding:

  • Additional analysis queries
  • Visualization scripts
  • Export formats
  • Performance optimizations

Makefile

The Makefile that implements the commands above:
.PHONY: help check setup teardown load @load duckdb show-tables desc count claude

PROJECTS := $(HOME)/.claude/projects

select-project = $(shell find "$(PROJECTS)" -mindepth 1 -maxdepth 1 -type d | fzf --select-1 --exit-0)
select-table = $(shell echo "$1" | sed -e 's/ /\n/g' | fzf --select-1 --exit-0)
get-table = $(shell echo "$(notdir $1)" | sed 's/[^a-zA-Z0-9]/_/g')
list-tables = $(shell duckdb database.duckdb -json "SHOW TABLES;" | jq -r '.[].name')

help:
	@cat $(firstword $(MAKEFILE_LIST))

check:
	type duckdb
	type fzf

setup: database.duckdb

teardown:
	@rm database.duckdb

load:
	echo "CREATE TABLE $(call get-table,$(PROJECT)) AS SELECT * FROM read_json_auto('$(PROJECT)/*.jsonl');" | duckdb database.duckdb

@load:
	$(MAKE) load PROJECT="$(call select-project)"

duckdb:
	duckdb database.duckdb

show-tables:
	duckdb database.duckdb -json "SHOW TABLES;" | jq

desc:
	duckdb database.duckdb "DESC $(call get-table,$(call select-project))"

count:
	duckdb database.duckdb "SELECT COUNT(1) FROM $(call select-table,$(call list-tables))"

claude: mcp.json
	claude --mcp-config $<

database.duckdb:
	duckdb "$@" "SELECT 1;"