@artemgetmann
Last active May 7, 2026 02:28
Practical workflow for reducing token usage in Claude Code while preserving session continuity. Includes compacting strategies, CLAUDE.md structure, modular context management, and prompt engineering tips.

🧠 How to Save Context Tokens When Using Claude Code

This is a personal reference workflow for minimizing token usage while maintaining project continuity across Claude Code (Sonnet 4 with file access).


✅ Setup: Populate CLAUDE.md

Claude loads CLAUDE.md automatically at session start.

Keep it under 5k tokens. Claude uses this as project memory.

Example prompt to Claude:

Populate and maintain CLAUDE.md with all relevant project-wide context so you can resume work efficiently without me repeating context each session. Include:
- Project summary & active features
- Tech stack
- Code style & naming conventions
- Known bugs and next TODOs
- Test scenarios we haven’t completed yet (if any)
Keep it under 5k tokens total.

If it becomes impossible to keep all relevant info under 5k tokens, split less critical sections into separate files under the docs/ directory. For example, if there are details about a future system version (e.g., MK2), create a separate markdown file like docs/mk2_notes.md instead of bloating CLAUDE.md.
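
One way to check the 5k budget is to estimate the size of CLAUDE.md before a session. The sketch below is illustrative only: it assumes roughly 4 characters per token, a common rule of thumb for English text and not Claude's actual tokenizer, and the function names are made up for this example. It also treats every line starting with # as a heading, so # comments inside code blocks will be counted as section breaks.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4   # assumption: rough average for English text, not a real tokenizer
TOKEN_BUDGET = 5_000  # the 5k limit suggested above


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate (ceiling of characters / CHARS_PER_TOKEN)."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def section_sizes(markdown: str) -> list[tuple[str, int]]:
    """Estimate tokens per section, where a section starts at any '#' heading line."""
    sections: list[tuple[str, int]] = []
    title, body = "(preamble)", []
    for line in markdown.splitlines():
        if line.startswith("#"):
            if body:
                sections.append((title, estimate_tokens("\n".join(body))))
            title, body = line.lstrip("#").strip(), [line]
        else:
            body.append(line)
    if body:
        sections.append((title, estimate_tokens("\n".join(body))))
    return sections


def report(path: Path = Path("CLAUDE.md")) -> None:
    """Print the estimated total and, if over budget, the largest sections first."""
    text = path.read_text(encoding="utf-8")
    total = estimate_tokens(text)
    print(f"{path}: ~{total} tokens (budget {TOKEN_BUDGET})")
    if total > TOKEN_BUDGET:
        ranked = sorted(section_sizes(text), key=lambda s: s[1], reverse=True)
        for title, size in ranked:
            print(f"  ~{size:>5}  {title}")
```

Calling report() from the project root prints the estimate; when the file is over budget, the largest sections listed first are the natural candidates to move into docs/.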

🛠 Session Workflow

End of Session

  1. Run:
/compact Focus on code samples and API usage

(Change the focus as needed, e.g. "design decisions" or "test coverage". A bare /compact also works, but the summary keeps more fluff and saves fewer tokens.)

  2. Prompt Claude:
Append that summary to docs/progress.md
Also save a standalone summary to session_summary.md
  3. Do NOT run /clear unless you want to reset everything.

    • It's safer to just exit or start a new chat.
    • The session stays available if you want to revisit it later.

🆑 Start of Session

  • Start new chat
  • Load:
@CLAUDE.md
@docs/progress.md
@session_summary.md (optional)

Every ~40 Messages

  • Run:
/compact Focus on code samples and API usage
  • Prompt Claude:
Save that summary to session_summary.md

This keeps the working context tight and avoids token bloat.


FYI: Structured Project Logs

Maintain progress.md, log.md, etc. for running session history.

Log Workflow:

  • Update at end of each session
  • Claude can read it at session start
  • Load it manually with @docs/progress.md if needed

Pro tip: Keep logs in docs/ and only load when needed.
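
If you would rather append the summary yourself than ask Claude to do it, the log update can be scripted. This is a minimal sketch, not part of Claude Code: the function name and the dated "## Session" heading format are assumptions chosen for illustration.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def append_summary(summary: str, log_path: Path, day: Optional[date] = None) -> None:
    """Append a dated entry to the running log, creating the file and folder if needed."""
    day = day or date.today()
    log_path.parent.mkdir(parents=True, exist_ok=True)
    # Hypothetical entry format: one level-2 heading per session, newest at the bottom.
    entry = f"\n## Session {day.isoformat()}\n\n{summary.strip()}\n"
    with log_path.open("a", encoding="utf-8") as log:
        log.write(entry)
```

For example, append_summary(text, Path("docs/progress.md")) adds the pasted /compact summary under today's date, so the log stays chronological and older entries are never rewritten.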


📆 FYI: Modular File Context with @file References

Instead of pasting long blocks:

  • Split reusable info into standalone .md or .yaml files
  • Load on demand via @filename.md

Good for:

  • Command syntax
  • API docs
  • Configs / routes / edge cases
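
Splitting an oversized notes file into standalone files can be automated. The sketch below is one possible approach under stated assumptions: it cuts on level-2 ("## ") headings, derives file names from the heading text, and returns the matching @file references. The helper names are hypothetical, text before the first "## " heading is ignored, and two sections with the same title would overwrite each other.

```python
import re
from pathlib import Path


def slugify(title: str) -> str:
    """Turn a heading into a safe file stem, e.g. 'API docs' -> 'api_docs'."""
    slug = re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
    return slug or "section"


def split_sections(markdown: str) -> dict[str, str]:
    """Map file name -> content for each '## ' section of the input."""
    files: dict[str, str] = {}
    name, lines = None, []
    for line in markdown.splitlines():
        if line.startswith("## "):
            if name is not None:
                files[name] = "\n".join(lines).strip() + "\n"
            name, lines = slugify(line[3:]) + ".md", [line]
        elif name is not None:
            lines.append(line)
    if name is not None:
        files[name] = "\n".join(lines).strip() + "\n"
    return files


def write_sections(markdown: str, out_dir: Path) -> list[str]:
    """Write each section to out_dir and return the @file references to load on demand."""
    out_dir.mkdir(parents=True, exist_ok=True)
    refs = []
    for filename, content in split_sections(markdown).items():
        target = out_dir / filename
        target.write_text(content, encoding="utf-8")
        refs.append(f"@{target.as_posix()}")
    return refs
```

Running write_sections(text, Path("docs")) on a notes file with "## API docs" and "## Edge cases" sections would produce docs/api_docs.md and docs/edge_cases.md, which can then be loaded individually instead of pasting the whole file.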

🎯 FYI: Prompt Engineering for Token Efficiency

Write like a spec. Be precise.

Bad:

Hey can you fix this bug?

Good:

Fix the off-by-one bug in utils.py line 12 that causes the last item to be skipped. Input: list of ints. Output: adjusted list. Keep type hints.
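
To make the good prompt concrete, here is a hypothetical version of the function it describes. The real utils.py is not shown in this gist, so the function names, the offset parameter, and the "adjustment" itself are invented for illustration; the point is that the prompt names the file, the symptom, the input and output types, and the constraint, so Claude does not need extra back-and-forth to find this.

```python
def adjust_buggy(values: list[int], offset: int = 1) -> list[int]:
    """Hypothetical buggy version: the loop stops one index early."""
    result = []
    for i in range(len(values) - 1):  # off-by-one: the last item is never visited
        result.append(values[i] + offset)
    return result


def adjust(values: list[int], offset: int = 1) -> list[int]:
    """Fixed version: iterate over every item and keep the type hints."""
    return [value + offset for value in values]


print(adjust_buggy([1, 2, 3]))  # [2, 3]  (last item skipped)
print(adjust([1, 2, 3]))        # [2, 3, 4]
```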

Other tips:

  • Ask grouped questions instead of one-by-one
  • Avoid redundant background unless necessary

🧹 Summary

  • Use CLAUDE.md for persistent context (≤ 5k tokens)
  • End sessions with /compact, then log to progress.md and optionally session_summary.md
  • Start sessions by loading only needed files
  • Compact every ~40 messages to keep things clean
  • Use modular files and focused prompts to minimize token load

Keep it lean. Let Claude handle memory. You just keep shipping.

@Njoxpy

Njoxpy commented Apr 11, 2026

Thank you for this. I also received another tip: always try to disable extended thinking for simple tasks.

@dupontbrice-Fr

Thanks for the smart tips! :D

@artemgetmann
Author

> Thanks for the smart tips! :D

This is a bit outdated at this point but thank you! 😎

@artemgetmann
Author

> Thank you for this. I also received another tip: always try to disable extended thinking for simple tasks.

The thing to keep in mind is the mental overhead. You may save tokens and pay the AI provider less, but you pay with your time and brain power, which are also limited. The trade-off: push most of that work to the AI and pay more, which I think is the smarter option in the long run, or do more optimization on your end and pay with your time instead. Productivity gains compound over time if the AI does more of the work and saves you more time.

@sanket-pilane

Thanks for these tips, they're really helpful for me! 🧠
