Eeshan Srivastava eeshansrivastava89

eeshansrivastava89 / CLAUDE.md
Created March 8, 2026 04:20
My Claude Code guardrails: global CLAUDE.md

CLAUDE.md

Global instructions for Claude Code.

Guardrails

  • NEVER modify CLAUDE.md: read-only for Claude
  • NEVER delete original content: only reorganize or add
  • NEVER make irreversible changes without approval
  • NEVER commit unless explicitly asked
eeshansrivastava89 / SKILL.md
Created March 8, 2026 04:23
Claude Code skill: context-save (saves session state for seamless resumption)
name: context-save
description: Save session context for resuming later
disable-model-invocation: false

Context Save

Capture session state so the next session can resume seamlessly.

eeshansrivastava89 / SKILL.md
Created March 8, 2026 04:23
Claude Code skill: context-restore (resumes from previous session)
name: context-restore
description: Restore session context from previous session
disable-model-invocation: false

Context Restore

Resume work from where the last session ended. Read the actual codebase along with this file.

eeshansrivastava89 / qwen36-mtp-llamacpp.md
Created May 19, 2026 02:25
Running Qwen3.6 with MTP in llama.cpp

Running Qwen3.6 with Multi-Token Prediction in llama.cpp

Accurate as of May 18, 2026.

Multi-Token Prediction (MTP) uses the model's built-in prediction heads to draft multiple tokens in parallel, then verifies them against the main model. For Qwen3.6, this yields ~1.5–2× faster generation with no accuracy loss.
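
To make the draft-then-verify idea concrete, here is a minimal toy sketch in Python. It assumes greedy decoding and uses made-up stand-ins for the models: `toy_target`, `toy_draft`, `speculative_step`, and `generate` are hypothetical names for illustration only and are not part of llama.cpp or Qwen3.6. A real implementation verifies all drafted tokens in one batched forward pass; the sketch checks them one at a time for readability.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]


def speculative_step(context: List[Token], draft_next: NextToken,
                     target_next: NextToken, k: int) -> List[Token]:
    """Draft k tokens cheaply, then keep only what the main model agrees with."""
    # 1. Draft: the cheap predictor proposes k tokens in a row.
    drafted: List[Token] = []
    ctx = list(context)
    for _ in range(k):
        tok = draft_next(ctx)
        drafted.append(tok)
        ctx.append(tok)

    # 2. Verify: compare each drafted token with the main model's own choice.
    accepted: List[Token] = []
    ctx = list(context)
    for tok in drafted:
        expected = target_next(ctx)
        if expected != tok:
            # First mismatch: take the main model's token and discard the rest.
            accepted.append(expected)
            return accepted
        accepted.append(tok)
        ctx.append(tok)

    # Every draft was accepted, so the main model contributes one extra token.
    accepted.append(target_next(ctx))
    return accepted


def generate(prompt: List[Token], draft_next: NextToken,
             target_next: NextToken, n_tokens: int, k: int = 3) -> List[Token]:
    """Generate n_tokens after the prompt using repeated draft-and-verify steps."""
    out = list(prompt)
    while len(out) - len(prompt) < n_tokens:
        out.extend(speculative_step(out, draft_next, target_next, k))
    return out[:len(prompt) + n_tokens]


# Hypothetical toy "models": the main model counts upward modulo 10.
def toy_target(ctx: List[Token]) -> Token:
    return (ctx[-1] + 1) % 10


# The draft predictor agrees with the main model except right after a 4.
def toy_draft(ctx: List[Token]) -> Token:
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10


if __name__ == "__main__":
    print(generate([0], toy_draft, toy_target, 8))
```

Because every emitted token is either confirmed or supplied by the main model, the result in this greedy sketch is identical to decoding with the main model alone. The speedup comes from accepting several drafted tokens per verification step when the draft is usually right.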

This guide covers the Qwen3.6 27B and Qwen3.6 35B-A3B (MoE) models. As of May 2026, MTP support is merged into llama.cpp, so no fork is required.