This report compares the context management strategies of the Opencode agent and the Pi agent, based on trace analysis from the TODO app experiment on 2026-04-22 using the Kimi 2.5 model.
Opencode employs a highly structured, verbose, and comprehensive system prompt. It explicitly defines:
- Tool Use Protocol: Detailed instructions on how to use tools, the importance of parallel tool calls, and the expected behaviors after tool results (e.g., "If you anticipate making multiple non-interfering tool calls, you are HIGHLY RECOMMENDED to make them in parallel to significantly improve efficiency.").
- Coding Guidelines: Rigid rules for coding from scratch, bug fixes, features, and refactoring (e.g., "Make MINIMAL changes to achieve the goal. This is very important to your performance.").
- Research Protocols: Structured guidance for research, including planning, internet searching, and multimedia processing (e.g., "Make plans before doing deep or wide research, to ensure you are always on track.").
- Safety/Environment: Explicit constraints about the working directory, operating system, and safety protocols (e.g., "The operating environment is not in a sandbox. Any actions you do will immediately affect the user's system. So you MUST be extremely cautious.").
- Project Context: Directs the agent to check
AGENTS.mdfor project-specific conventions (e.g., "Markdown files namedAGENTS.mdusually contain the background, structure, coding styles, user preferences and other relevant information.").
This approach creates a high-overhead, highly-constrained context, which ensures consistent, safe actions but increases token usage significantly.
Pi employs a concise, harness-oriented system prompt. It emphasizes:
- Role Definition: Focuses on being an "expert coding assistant" operating within a specific harness (e.g., "You are an expert coding assistant operating inside pi, a coding agent harness. You help users by reading files, executing commands, editing code, and writing new files.").
- Concise Toolset: Lists available tools briefly without the extensive procedural rules found in Opencode (e.g., "Available tools: - read: Read file contents - bash: Execute bash commands - edit: Make precise file edits - write: Create or overwrite files").
- Documentation Focus: Provides clear, direct links to its own documentation for extensions, themes, skills, and SDKs (e.g., Main documentation: /Users/borislau/.nvm/versions/node/v20.20.0/lib/node_modules/@mariozechner/pi-coding-agent/README.md).
This approach creates a low-overhead, flexible context, designed for rapid interaction and task execution, relying more on the agent's internalized knowledge than strict procedural guidelines.
| Phase | Agent | Activity (Runs) | Avg Latency (ms) | Avg Tokens |
|---|---|---|---|---|
| Research | Opencode | 26 | 46,025.96 | 19,953.12 |
| Pi | 27 | 32,339.11 | 12,574.93 | |
| Plan | Opencode | 0 | N/A | N/A |
| Pi | 3 | 5,486.00 | 2,233.67 | |
| Implementation | Opencode | 3 | 6,088.00 | 12,640.67 |
| Pi | 7 | 7,213.29 | 2,442.57 | |
| Unknown | Opencode | 193 | 32,750.53 | 81,813.90 |
| Pi | 5 | 8,291.00 | 2,473.00 |
| Prompt Scenario | Opencode Context | Pi Context |
|---|---|---|
| "Let's go with..." | Massive procedural prompt + context rules. | Concise, assistant-focused prompt + Pi docs links. |
| "For remaining..." | Massive procedural prompt + context rules. | Concise, assistant-focused prompt + Pi docs links. |
- Token Efficiency: The significant token usage discrepancy (Opencode using ~37% more tokens in Research phase) is directly attributable to Opencode's expansive system prompt being sent with every single trace.
- Context Rigidity vs. Flexibility: Opencode's prompt is designed to enforce specific behaviors and safety constraints, while Pi's prompt is designed to enable assistance within a specific harness.
- Workflow Integration: Opencode's strategy is better suited for complex, multi-step engineering tasks requiring strict adherence to conventions (as defined in
AGENTS.md), whereas Pi is better suited for faster, more iterative interaction.
Average context window usage per trace against the Kimi K2.5 128k token window.
Each block = 1% ≈ 1,280 tokens · █ system prompt · ▓ messages + tool calls · ░ free space
Opencode · 365 traces · avg 61,866 input tokens · 48.4% of context window
─────────────────────────────────────────────────────────────────────────────────
█ █ █ █ ▓ ▓ ▓ ▓ ▓ ▓ System prompt 4,677 est. tokens 3.6%
▓ ▓ ▓ ▓ ▓ ▓ ▓ ▓ ▓ ▓ Messages+tools 57,189 tokens 44.7%
▓ ▓ ▓ ▓ ▓ ▓ ▓ ▓ ▓ ▓ Free 66,134 tokens 51.7%
▓ ▓ ▓ ▓ ▓ ▓ ▓ ▓ ▓ ▓
▓ ▓ ▓ ▓ ▓ ▓ ▓ ▓ ░ ░
░ ░ ░ ░ ░ ░ ░ ░ ░ ░
░ ░ ░ ░ ░ ░ ░ ░ ░ ░
░ ░ ░ ░ ░ ░ ░ ░ ░ ░
░ ░ ░ ░ ░ ░ ░ ░ ░ ░
░ ░ ░ ░ ░ ░ ░ ░ ░ ░
Pi · 49 traces · avg 7,603 input tokens · 5.9% of context window
─────────────────────────────────────────────────────────────────────────
█ ▓ ▓ ▓ ▓ ▓ ░ ░ ░ ░ System prompt 1,367 est. tokens 1.1%
░ ░ ░ ░ ░ ░ ░ ░ ░ ░ Messages+tools 6,236 tokens 4.9%
░ ░ ░ ░ ░ ░ ░ ░ ░ ░ Free 120,397 tokens 94.1%
░ ░ ░ ░ ░ ░ ░ ░ ░ ░
░ ░ ░ ░ ░ ░ ░ ░ ░ ░
░ ░ ░ ░ ░ ░ ░ ░ ░ ░
░ ░ ░ ░ ░ ░ ░ ░ ░ ░
░ ░ ░ ░ ░ ░ ░ ░ ░ ░
░ ░ ░ ░ ░ ░ ░ ░ ░ ░
░ ░ ░ ░ ░ ░ ░ ░ ░ ░
| Metric | Opencode | Pi | Ratio |
|---|---|---|---|
| Traces analyzed | 365 | 49 | — |
| Avg input tokens | 61,866 | 7,603 | 8.1× more |
| Avg system prompt tokens (est.) | 4,677 | 1,367 | 3.4× more |
| Avg messages in context | 101.6 | 18.6 | 5.5× more |
| Avg tool calls per trace | 51.7 | 8.3 | 6.2× more |
| Context window used | 48.4% | 5.9% | 8.2× more |
System prompt tokens estimated as
len(system_message_content) / 4. Agent identity derived fromcustom_metadata["trace.openrouter.api_key_name"]:opencode-202604vspi-20260422.
Each agent's system prompt consists of a static harness (shipped with the agent) and injected project context (read from AGENTS.md / project config at runtime).
System Prompt Composition · each character = 1% of that agent's total sys prompt · left→right = prompt order
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────
OPENCODE 4,638 est. tok ╠IITTTTTTTTTTTTTCCCCCCCCCCCRRRRRRWWWWPPPPPPPUUUUMMMMMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA╣
Pi 1,367 est. tok ╠OOOLLLLLLGGGGGGGGGGGGGGDDDDDDDDDDDDDDDDDDDDJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ╣
Opencode sections · in prompt order (51.6% static harness → then injected AGENTS.md)
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────
I Intro / role 2% 94 tok ██
T Tool Use Protocol 13% 598 tok █████████████
C Coding Guidelines 11% 499 tok ███████████
R Research Protocols 6% 281 tok ██████
W Working Environment 4% 173 tok ████
P Project Information 7% 334 tok ███████
U Ultimate Reminders 4% 192 tok ████
M Model/env metadata 5% 221 tok █████
A Injected AGENTS.md 48% 2,247 tok ████████████████████████████████████████████████
Pi sections · in prompt order (42.9% static harness → then injected project context)
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────
O Role definition 3% 43 tok ███
L Tool listing 6% 85 tok ██████
G Guidelines 14% 188 tok ██████████████
D Pi documentation pointers 20% 271 tok ████████████████████
J Injected project context 57% 781 tok █████████████████████████████████████████████████████████
Measured on a median-length trace per agent.
| Section | Est. Tokens | % of Sys Prompt |
|---|---|---|
| Intro / role | 94 | 2.0% |
| Tool Use Protocol | 598 | 12.9% |
| Coding Guidelines | 499 | 10.7% |
| Research Protocols | 281 | 6.1% |
| Working Environment | 173 | 3.7% |
| Project Information (static) | 334 | 7.2% |
| Ultimate Reminders | 192 | 4.1% |
| Model / env metadata | 221 | 4.8% |
| Static harness subtotal | 2,391 | 51.6% |
| Injected AGENTS.md content | 2,247 | 48.4% |
| Total | 4,638 | 100% |
Across all 365 Opencode traces: injected AGENTS.md averages 2,096 est. tokens; static harness averages 2,627 est. tokens (varies because env metadata — skills, working dir — differs per project). 97.8% of traces had injected content; 90% fall in the 6–10k char band.
Ultimate Reminders (verbatim) — the closing section of Opencode's static harness, before the injected AGENTS.md:
At any time, you should be HELPFUL, CONCISE, and ACCURATE. Be thorough in your actions — test what you build, verify what you change — not in your explanations.
- Never diverge from the requirements and the goals of the task you work on. Stay on track.
- Never give the user more than what they want.
- Try your best to avoid any hallucination. Do fact checking before providing any factual information.
- Think about the best approach, then take action decisively.
- Do not give up too early.
- ALWAYS, keep it stupidly simple. Do not overcomplicate things.
- When the task requires creating or modifying files, always use tools to do so. Never treat displaying code in your response as a substitute for actually writing it to the file system.
Pi has no equivalent section — it ends its static harness at the documentation pointers and trusts the model to supply these behavioral defaults.
| Section | Est. Tokens | % of Sys Prompt |
|---|---|---|
| Role definition | 43 | 3.1% |
| Tool listing | 85 | 6.2% |
| Guidelines | 188 | 13.8% |
| Pi documentation pointers | 271 | 19.8% |
| Static harness subtotal | 587 | 42.9% |
| Injected project context | 781 | 57.1% |
| Total | 1,367 | 100% |
Pi's static harness is nearly identical across all 49 traces (5,467–5,480 chars), confirming a fully static prompt. Opencode's varies considerably (2,096–26,698 chars) due to dynamic env metadata and variable AGENTS.md size.
| Opencode | Pi | Ratio | |
|---|---|---|---|
| Static harness (est. tokens) | 2,391 | 587 | 4.1× more |
| Injected project context (est. tokens) | 2,247 | 781 | 2.9× more |
| Total system prompt | 4,638 | 1,367 | 3.4× more |
Opencode's static harness is 4× larger than Pi's even before project context — driven by its detailed procedural sections (Tool Use Protocol, Coding Guidelines, Research Protocols, Safety/Environment). Pi relies on the model's internalized knowledge instead.
Both agents use the same mechanism: they read AGENTS.md files from the project directory at startup and inject the content into the system prompt. The difference is purely structural:
| Opencode | Pi | |
|---|---|---|
| Boundary marker | Instructions from: /path/to/AGENTS.md (flat text) |
## /path/to/AGENTS.md heading under # Project Context |
| Dynamic fields injected | Working directory, date, skills list | Working directory, date |
| Content source | Same AGENTS.md files |
Same AGENTS.md files |
Pi's presentation is slightly more readable (path as a Markdown ## heading so content flows naturally below it), while Opencode uses a flat text boundary marker. The injected content itself is identical in principle — whatever is in the project's AGENTS.md.
Pi's entire tool section (85 est. tokens) is four bullet points:
Available tools:
- read: Read file contents
- bash: Execute bash commands (ls, grep, find, etc.)
- edit: Make precise file edits with exact text replacement
- write: Create or overwrite files
Opencode's Tool Use Protocol (598 est. tokens) is a full procedural manual: when to call tools, how to interpret results, instructions to parallelize calls, what to do after tool results return, and how to use the task subtask delegation tool.
The same pattern applies across every section: Opencode encodes expected behavior explicitly in the prompt; Pi trusts the model's training to supply it. This is the root cause of the 4.1× static harness size difference — not a difference in capability, but a difference in where the behavior contract lives: in the prompt vs. in the model weights.
Token counts are estimated (chars/4). Tool outputs are returned as user messages in both agents' trace formats, so user message tokens are high — they accumulate tool results across the session.
| Role | Opencode avg/trace | Pi avg/trace | Ratio |
|---|---|---|---|
| system | 1.0 | 1.0 | 1.0× |
| user | 58.0 | 10.2 | 5.7× more |
| assistant | 42.6 | 7.4 | 5.8× more |
| Total | 101.6 | 18.6 | 5.5× more |
Median messages/trace: Opencode 54, Pi 14 (P90: Opencode 282, Pi 39 — Opencode has a heavy right tail from long agentic runs).
| Role | Opencode tok/msg | Pi tok/msg | Opencode % of total | Pi % of total |
|---|---|---|---|---|
| system | 4,677 | 1,367 | 10.1% | 22.8% |
| user | 523 | 301 | 65.7% | 51.1% |
| assistant | 262 | 212 | 24.1% | 26.1% |
User messages dominate total token usage in both agents because tool outputs accumulate there across the full session. Despite Opencode's larger system prompt, the bulk of its token cost comes from a longer conversation history.
| Type | Opencode avg/trace | Pi avg/trace |
|---|---|---|
| Tool calls | 33.1 (77.6%) | 5.6 (75.3%) |
| Plain text responses | 9.6 (22.4%) | 1.8 (24.7%) |
~77% of assistant messages are tool invocations in both agents — the ratio is nearly identical, suggesting similar interaction patterns. The difference is session depth, not interaction style.
| Tool | Total calls | Avg/trace |
|---|---|---|
| bash | 5,600 | 15.3 |
| read | 3,074 | 8.4 |
| edit | 1,471 | 4.0 |
| write | 1,092 | 3.0 |
| glob | 453 | 1.2 |