Skip to content

Instantly share code, notes, and snippets.

@dat-boris
Last active April 23, 2026 01:26
Show Gist options
  • Select an option

  • Save dat-boris/8b62ea16745c2bb666b55d7d071835e1 to your computer and use it in GitHub Desktop.

Select an option

Save dat-boris/8b62ea16745c2bb666b55d7d071835e1 to your computer and use it in GitHub Desktop.
Comparison between OpenCode vs Pi

Context Management Comparison: Opencode vs. Pi

This report compares the context management strategies of the Opencode agent and the Pi agent, based on trace analysis from the TODO app experiment on 2026-04-22 using the Kimi 2.5 model.

1. Overview of Context Management Strategies

Opencode

Opencode employs a highly structured, verbose, and comprehensive system prompt. It explicitly defines:

  • Tool Use Protocol: Detailed instructions on how to use tools, the importance of parallel tool calls, and the expected behaviors after tool results (e.g., "If you anticipate making multiple non-interfering tool calls, you are HIGHLY RECOMMENDED to make them in parallel to significantly improve efficiency.").
  • Coding Guidelines: Rigid rules for coding from scratch, bug fixes, features, and refactoring (e.g., "Make MINIMAL changes to achieve the goal. This is very important to your performance.").
  • Research Protocols: Structured guidance for research, including planning, internet searching, and multimedia processing (e.g., "Make plans before doing deep or wide research, to ensure you are always on track.").
  • Safety/Environment: Explicit constraints about the working directory, operating system, and safety protocols (e.g., "The operating environment is not in a sandbox. Any actions you do will immediately affect the user's system. So you MUST be extremely cautious.").
  • Project Context: Directs the agent to check AGENTS.md for project-specific conventions (e.g., "Markdown files named AGENTS.md usually contain the background, structure, coding styles, user preferences and other relevant information.").

This approach creates a high-overhead, highly-constrained context, which ensures consistent, safe actions but increases token usage significantly.

Pi

Pi employs a concise, harness-oriented system prompt. It emphasizes:

  • Role Definition: Focuses on being an "expert coding assistant" operating within a specific harness (e.g., "You are an expert coding assistant operating inside pi, a coding agent harness. You help users by reading files, executing commands, editing code, and writing new files.").
  • Concise Toolset: Lists available tools briefly without the extensive procedural rules found in Opencode (e.g., "Available tools: - read: Read file contents - bash: Execute bash commands - edit: Make precise file edits - write: Create or overwrite files").
  • Documentation Focus: Provides clear, direct links to its own documentation for extensions, themes, skills, and SDKs (e.g., Main documentation: /Users/borislau/.nvm/versions/node/v20.20.0/lib/node_modules/@mariozechner/pi-coding-agent/README.md).

This approach creates a low-overhead, flexible context, designed for rapid interaction and task execution, relying more on the agent's internalized knowledge than strict procedural guidelines.

2. Quantitative Performance Analysis (Kimi 2.5)

Phase Agent Activity (Runs) Avg Latency (ms) Avg Tokens
Research Opencode 26 46,025.96 19,953.12
Pi 27 32,339.11 12,574.93
Plan Opencode 0 N/A N/A
Pi 3 5,486.00 2,233.67
Implementation Opencode 3 6,088.00 12,640.67
Pi 7 7,213.29 2,442.57
Unknown Opencode 193 32,750.53 81,813.90
Pi 5 8,291.00 2,473.00

3. Pairwise Analysis (Examples)

Prompt Scenario Opencode Context Pi Context
"Let's go with..." Massive procedural prompt + context rules. Concise, assistant-focused prompt + Pi docs links.
"For remaining..." Massive procedural prompt + context rules. Concise, assistant-focused prompt + Pi docs links.

3. Findings

  • Token Efficiency: The significant token usage discrepancy (Opencode using ~37% more tokens in Research phase) is directly attributable to Opencode's expansive system prompt being sent with every single trace.
  • Context Rigidity vs. Flexibility: Opencode's prompt is designed to enforce specific behaviors and safety constraints, while Pi's prompt is designed to enable assistance within a specific harness.
  • Workflow Integration: Opencode's strategy is better suited for complex, multi-step engineering tasks requiring strict adherence to conventions (as defined in AGENTS.md), whereas Pi is better suited for faster, more iterative interaction.

4. Context Composition Visualization

Average context window usage per trace against the Kimi K2.5 128k token window.
Each block = 1% ≈ 1,280 tokens · system prompt · messages + tool calls · free space

Opencode  ·  365 traces  ·  avg 61,866 input tokens  ·  48.4% of context window
─────────────────────────────────────────────────────────────────────────────────
█ █ █ █ ▓ ▓ ▓ ▓ ▓ ▓    System prompt   4,677 est. tokens    3.6%
▓ ▓ ▓ ▓ ▓ ▓ ▓ ▓ ▓ ▓    Messages+tools 57,189 tokens        44.7%
▓ ▓ ▓ ▓ ▓ ▓ ▓ ▓ ▓ ▓    Free           66,134 tokens        51.7%
▓ ▓ ▓ ▓ ▓ ▓ ▓ ▓ ▓ ▓
▓ ▓ ▓ ▓ ▓ ▓ ▓ ▓ ░ ░
░ ░ ░ ░ ░ ░ ░ ░ ░ ░
░ ░ ░ ░ ░ ░ ░ ░ ░ ░
░ ░ ░ ░ ░ ░ ░ ░ ░ ░
░ ░ ░ ░ ░ ░ ░ ░ ░ ░
░ ░ ░ ░ ░ ░ ░ ░ ░ ░
Pi  ·  49 traces  ·  avg 7,603 input tokens  ·  5.9% of context window
─────────────────────────────────────────────────────────────────────────
█ ▓ ▓ ▓ ▓ ▓ ░ ░ ░ ░    System prompt   1,367 est. tokens    1.1%
░ ░ ░ ░ ░ ░ ░ ░ ░ ░    Messages+tools  6,236 tokens          4.9%
░ ░ ░ ░ ░ ░ ░ ░ ░ ░    Free          120,397 tokens         94.1%
░ ░ ░ ░ ░ ░ ░ ░ ░ ░
░ ░ ░ ░ ░ ░ ░ ░ ░ ░
░ ░ ░ ░ ░ ░ ░ ░ ░ ░
░ ░ ░ ░ ░ ░ ░ ░ ░ ░
░ ░ ░ ░ ░ ░ ░ ░ ░ ░
░ ░ ░ ░ ░ ░ ░ ░ ░ ░
░ ░ ░ ░ ░ ░ ░ ░ ░ ░

Side-by-side summary

Metric Opencode Pi Ratio
Traces analyzed 365 49
Avg input tokens 61,866 7,603 8.1× more
Avg system prompt tokens (est.) 4,677 1,367 3.4× more
Avg messages in context 101.6 18.6 5.5× more
Avg tool calls per trace 51.7 8.3 6.2× more
Context window used 48.4% 5.9% 8.2× more

System prompt tokens estimated as len(system_message_content) / 4. Agent identity derived from custom_metadata["trace.openrouter.api_key_name"]: opencode-202604 vs pi-20260422.

5. System Prompt Section Breakdown

Each agent's system prompt consists of a static harness (shipped with the agent) and injected project context (read from AGENTS.md / project config at runtime).

System Prompt Section Breakdown

System Prompt Composition  ·  each character = 1% of that agent's total sys prompt  ·  left→right = prompt order
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────
OPENCODE  4,638 est. tok  ╠IITTTTTTTTTTTTTCCCCCCCCCCCRRRRRRWWWWPPPPPPPUUUUMMMMMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA╣
Pi        1,367 est. tok  ╠OOOLLLLLLGGGGGGGGGGGGGGDDDDDDDDDDDDDDDDDDDDJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ╣

Opencode sections  ·  in prompt order  (51.6% static harness → then injected AGENTS.md)
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 I  Intro / role                2%     94 tok  ██
 T  Tool Use Protocol          13%    598 tok  █████████████
 C  Coding Guidelines          11%    499 tok  ███████████
 R  Research Protocols          6%    281 tok  ██████
 W  Working Environment         4%    173 tok  ████
 P  Project Information         7%    334 tok  ███████
 U  Ultimate Reminders          4%    192 tok  ████
 M  Model/env metadata          5%    221 tok  █████
 A  Injected AGENTS.md         48%  2,247 tok  ████████████████████████████████████████████████

Pi sections  ·  in prompt order  (42.9% static harness → then injected project context)
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 O  Role definition             3%     43 tok  ███
 L  Tool listing                6%     85 tok  ██████
 G  Guidelines                 14%    188 tok  ██████████████
 D  Pi documentation pointers  20%    271 tok  ████████████████████
 J  Injected project context   57%    781 tok  █████████████████████████████████████████████████████████

Measured on a median-length trace per agent.

Opencode (median trace · 4,638 est. tokens)

Section Est. Tokens % of Sys Prompt
Intro / role 94 2.0%
Tool Use Protocol 598 12.9%
Coding Guidelines 499 10.7%
Research Protocols 281 6.1%
Working Environment 173 3.7%
Project Information (static) 334 7.2%
Ultimate Reminders 192 4.1%
Model / env metadata 221 4.8%
Static harness subtotal 2,391 51.6%
Injected AGENTS.md content 2,247 48.4%
Total 4,638 100%

Across all 365 Opencode traces: injected AGENTS.md averages 2,096 est. tokens; static harness averages 2,627 est. tokens (varies because env metadata — skills, working dir — differs per project). 97.8% of traces had injected content; 90% fall in the 6–10k char band.

Ultimate Reminders (verbatim) — the closing section of Opencode's static harness, before the injected AGENTS.md:

At any time, you should be HELPFUL, CONCISE, and ACCURATE. Be thorough in your actions — test what you build, verify what you change — not in your explanations.

  • Never diverge from the requirements and the goals of the task you work on. Stay on track.
  • Never give the user more than what they want.
  • Try your best to avoid any hallucination. Do fact checking before providing any factual information.
  • Think about the best approach, then take action decisively.
  • Do not give up too early.
  • ALWAYS, keep it stupidly simple. Do not overcomplicate things.
  • When the task requires creating or modifying files, always use tools to do so. Never treat displaying code in your response as a substitute for actually writing it to the file system.

Pi has no equivalent section — it ends its static harness at the documentation pointers and trusts the model to supply these behavioral defaults.

Pi (median trace · 1,367 est. tokens)

Section Est. Tokens % of Sys Prompt
Role definition 43 3.1%
Tool listing 85 6.2%
Guidelines 188 13.8%
Pi documentation pointers 271 19.8%
Static harness subtotal 587 42.9%
Injected project context 781 57.1%
Total 1,367 100%

Pi's static harness is nearly identical across all 49 traces (5,467–5,480 chars), confirming a fully static prompt. Opencode's varies considerably (2,096–26,698 chars) due to dynamic env metadata and variable AGENTS.md size.

Harness comparison

Opencode Pi Ratio
Static harness (est. tokens) 2,391 587 4.1× more
Injected project context (est. tokens) 2,247 781 2.9× more
Total system prompt 4,638 1,367 3.4× more

Opencode's static harness is 4× larger than Pi's even before project context — driven by its detailed procedural sections (Tool Use Protocol, Coding Guidelines, Research Protocols, Safety/Environment). Pi relies on the model's internalized knowledge instead.

How project context works in both agents

Both agents use the same mechanism: they read AGENTS.md files from the project directory at startup and inject the content into the system prompt. The difference is purely structural:

Opencode Pi
Boundary marker Instructions from: /path/to/AGENTS.md (flat text) ## /path/to/AGENTS.md heading under # Project Context
Dynamic fields injected Working directory, date, skills list Working directory, date
Content source Same AGENTS.md files Same AGENTS.md files

Pi's presentation is slightly more readable (path as a Markdown ## heading so content flows naturally below it), while Opencode uses a flat text boundary marker. The injected content itself is identical in principle — whatever is in the project's AGENTS.md.

Why Pi's tool listing costs so much less

Pi's entire tool section (85 est. tokens) is four bullet points:

Available tools:
- read: Read file contents
- bash: Execute bash commands (ls, grep, find, etc.)
- edit: Make precise file edits with exact text replacement
- write: Create or overwrite files

Opencode's Tool Use Protocol (598 est. tokens) is a full procedural manual: when to call tools, how to interpret results, instructions to parallelize calls, what to do after tool results return, and how to use the task subtask delegation tool.

The same pattern applies across every section: Opencode encodes expected behavior explicitly in the prompt; Pi trusts the model's training to supply it. This is the root cause of the 4.1× static harness size difference — not a difference in capability, but a difference in where the behavior contract lives: in the prompt vs. in the model weights.

6. Message Breakdown

Token counts are estimated (chars/4). Tool outputs are returned as user messages in both agents' trace formats, so user message tokens are high — they accumulate tool results across the session.

Average messages per trace

Role Opencode avg/trace Pi avg/trace Ratio
system 1.0 1.0 1.0×
user 58.0 10.2 5.7× more
assistant 42.6 7.4 5.8× more
Total 101.6 18.6 5.5× more

Median messages/trace: Opencode 54, Pi 14 (P90: Opencode 282, Pi 39 — Opencode has a heavy right tail from long agentic runs).

Average tokens per message

Role Opencode tok/msg Pi tok/msg Opencode % of total Pi % of total
system 4,677 1,367 10.1% 22.8%
user 523 301 65.7% 51.1%
assistant 262 212 24.1% 26.1%

User messages dominate total token usage in both agents because tool outputs accumulate there across the full session. Despite Opencode's larger system prompt, the bulk of its token cost comes from a longer conversation history.

Assistant message types

Type Opencode avg/trace Pi avg/trace
Tool calls 33.1 (77.6%) 5.6 (75.3%)
Plain text responses 9.6 (22.4%) 1.8 (24.7%)

~77% of assistant messages are tool invocations in both agents — the ratio is nearly identical, suggesting similar interaction patterns. The difference is session depth, not interaction style.

Opencode top tool calls (365 traces)

Tool Total calls Avg/trace
bash 5,600 15.3
read 3,074 8.4
edit 1,471 4.0
write 1,092 3.0
glob 453 1.2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment