Subagent Token Tracking - Investigation Findings

Executive Summary

After investigating GitHub issue #313 on subagent token tracking, we found that the reported problem is more nuanced than initially described. Users are hitting real token limits, but the root cause differs from what was initially documented. This report details our findings, corrects the initial assumptions, and describes the improvements we've implemented.

Initial Problem Report (Issue #313)

Users reported hitting 5-hour token limits while ccusage showed only 10-20% usage when using subagents (Task tool). The initial assessment suggested that:

- Subagent tokens were being missed entirely
- Tokens from isSidechain: true entries were being miscounted as the wrong model
- The toolUseResult field containing aggregated totals was being ignored

Our Investigation Findings

Finding 1: Sidechain Entries ARE Being Counted Correctly

Initial Assumption: Entries with isSidechain: true were being parsed but attributed to the wrong model (opus-4 instead of sonnet-4).

Actual Finding:

- The existing code (v15.9.3+) already respects the message.model field for each entry
- Sidechain entries from assistant messages contain the correct model: "model": "claude-sonnet-4-20250514"
- These tokens ARE being correctly attributed to sonnet-4 in the current version

Evidence:

// From main branch src/data-loader.ts line 895:
allEntries.push({ data, date, cost, model: data.message.model, project });
//                                        ^^^^^^^^^^^^^^^^^^^^^^
// The code already uses the individual entry's model, not a session default

Real Data Analysis:

# Checking actual sidechain entries in the test file:
$ grep '"isSidechain":true' d094b187-*.jsonl | grep '"role":"assistant"' | head -1 | jq '.message.model'
"claude-sonnet-4-20250514"
# ✅ Sidechain entries have the correct model set
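
The same check can be approximated in code. The following is a minimal sketch, not the ccusage implementation: the `Entry` shape is a simplified assumption, and `tallyByModel` and `sampleLines` are hypothetical names with made-up token counts. It sums tokens per model using each entry's own message.model, which is the behavior described above.

```typescript
// Simplified, assumed entry shape; the real JSONL entries carry more fields.
interface Usage {
  input_tokens: number;
  output_tokens: number;
}

interface Entry {
  isSidechain?: boolean;
  message?: { model?: string; usage?: Usage };
}

// Sum input + output tokens per model, keyed by each entry's own message.model.
function tallyByModel(jsonlLines: string[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const line of jsonlLines) {
    if (line.trim() === '') continue; // skip blank lines
    const entry = JSON.parse(line) as Entry;
    const model = entry.message?.model;
    const usage = entry.message?.usage;
    if (model == null || usage == null) continue; // nothing to attribute
    const tokens = usage.input_tokens + usage.output_tokens;
    totals.set(model, (totals.get(model) ?? 0) + tokens);
  }
  return totals;
}

// Illustrative data only: one main-agent entry and two sidechain entries.
const sampleLines: string[] = [
  JSON.stringify({ isSidechain: false, message: { model: 'claude-opus-4-1-20250805', usage: { input_tokens: 100, output_tokens: 20 } } }),
  JSON.stringify({ isSidechain: true, message: { model: 'claude-sonnet-4-20250514', usage: { input_tokens: 30, output_tokens: 10 } } }),
  JSON.stringify({ isSidechain: true, message: { model: 'claude-sonnet-4-20250514', usage: { input_tokens: 5, output_tokens: 5 } } }),
];

const totals = tallyByModel(sampleLines);
console.log(Object.fromEntries(totals));
```

Because attribution is keyed on the entry's model rather than a session default, the sidechain entries land under sonnet-4 even though the session's main agent is opus-4.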

Finding 2: The Example in SUBAGENT_FIX.md Was Incorrect

The original documentation showed a sidechain entry example with:

{
  "isSidechain": true,
  "message": {
    "model": "claude-opus-4-1-20250805",  // ← This is WRONG
    "usage": { ... }
  }
}

But in actual JSONL files, sidechain assistant entries have:

{
  "isSidechain": true,
  "message": {
    "model": "claude-sonnet-4-20250514",  // ← Correct model!
    "usage": { ... }
  }
}

This incorrect example led to the mistaken belief that all sidechain tokens were being attributed to opus-4.

Finding 3: toolUseResult Structure Varies

Initial Assumption: The toolUseResult field contains token usage data that's being ignored.

Actual Finding:

- The toolUseResult field exists but its structure varies significantly
- In our test data, most toolUseResult entries are either:
  - Strings: "toolUseResult": "Error: No such tool available"
  - Objects without usage data: {"filenames": [...], "mode": "...", "numFiles": ...}
- The example in the documentation showing toolUseResult.usage with token counts appears to be from a different version or specific use case

Evidence:

$ grep '"toolUseResult":{' test.jsonl | jq '.toolUseResult | has("usage")'
false  # No usage field in the actual data
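
To survey these variations across a larger set of files, a small classifier can help. This is a hedged sketch: `classifyToolUseResult` and the shape labels are hypothetical names introduced for illustration, and the categories only mirror the cases observed above.

```typescript
type ToolUseResultShape =
  | 'absent'
  | 'string'
  | 'object-without-usage'
  | 'object-with-usage'
  | 'other';

// Classify a raw toolUseResult value by the shapes seen in the test data.
function classifyToolUseResult(value: unknown): ToolUseResultShape {
  if (value === undefined || value === null) return 'absent';
  if (typeof value === 'string') return 'string';
  if (typeof value === 'object' && value !== null) {
    // Arrays also land here and are reported as objects without usage.
    const usage = (value as { usage?: unknown }).usage;
    return usage != null ? 'object-with-usage' : 'object-without-usage';
  }
  return 'other'; // numbers, booleans, etc.
}

// Count how often each shape occurs in a list of parsed entries.
function countShapes(entries: Array<{ toolUseResult?: unknown }>): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const entry of entries) {
    const shape = classifyToolUseResult(entry.toolUseResult);
    counts[shape] = (counts[shape] ?? 0) + 1;
  }
  return counts;
}

const shapeCounts = countShapes([
  { toolUseResult: 'Error: No such tool available' },
  { toolUseResult: { filenames: [], numFiles: 0 } },
  {},
]);
console.log(shapeCounts);
```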

Finding 4: Both Versions Show Identical Output

When comparing the global version (15.9.3) with our fixed version (15.9.4):

# Both versions show the same token attribution for Aug 11, 2025:
- opus-4: 26,648,xxx tokens
- sonnet-4: 1,248,xxx tokens (includes the subagent tokens!)

This confirms that the current version is already handling model attribution correctly.

Why Users Still Experience Issues

Despite sidechain tokens being counted correctly, users may still experience discrepancies due to:

- Cache Tokens: The toolUseResult.totalTokens might exclude cache tokens while the actual billing includes them
- Timing Issues: Live monitoring might not update immediately during subagent execution
- Different Token Types: Some token types might be counted differently in billing vs. reporting
- Version Differences: Users might be running older versions that don't respect individual entry models
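
The cache-token hypothesis can be illustrated with a small calculation. This sketch assumes usage objects carry the `cache_creation_input_tokens` and `cache_read_input_tokens` fields used by the Anthropic API; `totalsWithAndWithoutCache` is a hypothetical helper, and the numbers are invented for illustration rather than taken from the issue.

```typescript
// Assumed usage shape, including the optional cache token fields.
interface UsageWithCache {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens?: number;
  cache_read_input_tokens?: number;
}

// Compute totals both ways so the gap between them is visible.
function totalsWithAndWithoutCache(usages: UsageWithCache[]): { withoutCache: number; withCache: number } {
  let withoutCache = 0;
  let withCache = 0;
  for (const u of usages) {
    const base = u.input_tokens + u.output_tokens;
    const cache = (u.cache_creation_input_tokens ?? 0) + (u.cache_read_input_tokens ?? 0);
    withoutCache += base;
    withCache += base + cache;
  }
  return { withoutCache, withCache };
}

// Invented numbers: cache reads dominate the second entry.
const comparison = totalsWithAndWithoutCache([
  { input_tokens: 100, output_tokens: 50 },
  { input_tokens: 10, output_tokens: 40, cache_creation_input_tokens: 200, cache_read_input_tokens: 1000 },
]);
console.log(comparison);
```

If a report counts only the first figure while limits are enforced on something closer to the second, the displayed percentage could look far lower than the real consumption. Whether that is what happens in practice remains unverified.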

Improvements We Implemented

While the core issue was already addressed, our changes add value through:

1. Enhanced Token Extraction

export function extractUsageFromEntry(data: UsageData): UsageData['message']['usage'] | undefined {
  // Prioritize toolUseResult.usage if present (future-proofing)
  if (data.toolUseResult != null && typeof data.toolUseResult === 'object' 
      && 'usage' in data.toolUseResult && data.toolUseResult.usage != null) {
    return data.toolUseResult.usage;
  }
  // Fall back to message.usage
  return data.message.usage;
}
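
A usage sketch for the function above, under stated assumptions: `TokenUsage` and the trimmed-down `UsageData` here are stand-ins for the project's real types, and the three sample entries are invented. The function body is repeated only so the example is self-contained.

```typescript
// Stand-in types (assumptions); the project's UsageData has more fields.
interface TokenUsage {
  input_tokens: number;
  output_tokens: number;
}

interface UsageData {
  message: { model?: string; usage: TokenUsage };
  toolUseResult?: string | { usage?: TokenUsage; [key: string]: unknown };
}

function extractUsageFromEntry(data: UsageData): TokenUsage | undefined {
  // Prefer toolUseResult.usage when it is an object that carries usage data.
  if (data.toolUseResult != null && typeof data.toolUseResult === 'object'
      && 'usage' in data.toolUseResult && data.toolUseResult.usage != null) {
    return data.toolUseResult.usage;
  }
  // Otherwise fall back to message.usage.
  return data.message.usage;
}

const messageUsage: TokenUsage = { input_tokens: 10, output_tokens: 5 };

// String result (as seen in the test data): falls back to message.usage.
const errorEntry: UsageData = { message: { usage: messageUsage }, toolUseResult: 'Error: No such tool available' };

// Object without usage (as seen in the test data): falls back to message.usage.
const fileListEntry: UsageData = { message: { usage: messageUsage }, toolUseResult: { filenames: [], numFiles: 0 } };

// Object with usage (hypothetical future shape): toolUseResult.usage wins.
const aggregatedEntry: UsageData = {
  message: { usage: messageUsage },
  toolUseResult: { usage: { input_tokens: 100, output_tokens: 50 } },
};

console.log(extractUsageFromEntry(errorEntry));
console.log(extractUsageFromEntry(fileListEntry));
console.log(extractUsageFromEntry(aggregatedEntry));
```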

2. Schema Updates for Future Compatibility

- Added isSidechain field to the schema
- Added toolUseResult field with proper typing
- These additions ensure forward compatibility as Claude Code evolves
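
The project's actual schema definitions are not reproduced here. As a rough illustration of what validating the two new optional fields involves, here is a dependency-free sketch in plain TypeScript; `SubagentFields`, `isRecord`, and `parseSubagentFields` are hypothetical names, and the accepted shapes are assumptions based on the variations described in Finding 3.

```typescript
// Assumed shape of the two subagent-related fields; both are optional.
interface SubagentFields {
  isSidechain?: boolean;
  toolUseResult?: string | Record<string, unknown>;
}

// True for plain (non-null, non-array) objects.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Returns the validated fields, or undefined if either has an unexpected type.
function parseSubagentFields(raw: unknown): SubagentFields | undefined {
  if (!isRecord(raw)) return undefined;
  const result: SubagentFields = {};

  const sidechain = raw['isSidechain'];
  if (sidechain !== undefined) {
    if (typeof sidechain !== 'boolean') return undefined;
    result.isSidechain = sidechain;
  }

  const toolUseResult = raw['toolUseResult'];
  if (toolUseResult !== undefined) {
    if (typeof toolUseResult === 'string' || isRecord(toolUseResult)) {
      result.toolUseResult = toolUseResult;
    } else {
      return undefined;
    }
  }
  return result;
}

console.log(parseSubagentFields({ isSidechain: true, toolUseResult: 'Error: No such tool available' }));
console.log(parseSubagentFields({ isSidechain: 'yes' }));
```

Keeping both fields optional means entries written before these fields existed still validate.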

3. Fallback Model Attribution

// Provides a fallback for edge cases where model might be missing
const model = data.message.model ?? (data.isSidechain === true ? 'claude-sonnet-4-20250514' : undefined);
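
The fallback can be exercised in isolation. In this sketch, `EntryLike`, `resolveModel`, and `ASSUMED_SUBAGENT_MODEL` are hypothetical names wrapped around the expression above; the default model is the one observed in our test data, which is an assumption about subagents in general.

```typescript
// Model observed for subagents in the test data; treated here as an assumed default.
const ASSUMED_SUBAGENT_MODEL = 'claude-sonnet-4-20250514';

interface EntryLike {
  isSidechain?: boolean;
  message: { model?: string };
}

// Prefer the entry's own model; only guess when it is missing on a sidechain entry.
function resolveModel(entry: EntryLike): string | undefined {
  return entry.message.model ?? (entry.isSidechain === true ? ASSUMED_SUBAGENT_MODEL : undefined);
}

console.log(resolveModel({ isSidechain: true, message: { model: 'claude-opus-4-1-20250805' } }));
console.log(resolveModel({ isSidechain: true, message: {} }));
console.log(resolveModel({ message: {} }));
```

An explicit model always wins, so the fallback cannot override correct data. The trade-off is that a hard-coded default would misattribute tokens if a subagent ever runs on a different model without recording it.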

4. Improved Code Organization

- Centralized usage extraction logic
- Consistent handling across all load functions
- Better maintainability for future updates

Recommendations

For the GitHub Issue (#313)

Update the issue to clarify that:
- Recent versions (15.9.3+) already handle model attribution correctly
- The issue might be version-specific or related to other factors
- Users should ensure they're running the latest version

Request more information from users experiencing issues:
- Exact version of ccusage being used
- Sample JSONL files showing the discrepancy
- Screenshots of both ccusage output and Claude Code's actual limit message

For the Pull Request

The improvements we made are still valuable for:

- Future-proofing: Handles toolUseResult.usage when it contains token data
- Edge cases: Provides fallbacks for entries without model fields
- Code quality: Cleaner, more maintainable code structure
- Schema validation: Proper typing for subagent-related fields

Conclusion

The investigation revealed that the core issue described in the original report was based on incorrect assumptions about the data structure. The current version of ccusage already handles subagent token attribution correctly when the model field is present in each entry.

However, our improvements add robustness and future-proofing to handle:

- Variations in toolUseResult structure
- Edge cases where model information might be missing
- Future changes to Claude Code's JSONL format

The real value of this work is not in fixing a critical bug (which doesn't exist in current versions) but in:

- Clarifying the actual behavior of the system
- Adding defensive coding for edge cases
- Preparing for future Claude Code updates
- Improving code maintainability

Test Data Summary

File: d094b187-1785-445c-8fbc-a353149cbabe.jsonl

Date: 2025-08-11
Total entries: 193
Sidechain entries: 63 (with isSidechain: true)
Main agent model: claude-opus-4-1-20250805
Subagent model: claude-sonnet-4-20250514
Subagent tokens: 1,086,368 (correctly attributed to sonnet-4)

Next Steps

- Update GitHub Issue #313 with these findings
- Submit PR with the improvements for future-proofing
- Document the actual JSONL structure variations in the README
- Consider adding debug logging to help users troubleshoot discrepancies
- Monitor for new reports with specific version and data examples
