A laptop-local hybrid search stack that indexes any private repo and lets you query by intent, not just keywords. No SaaS, just Vertex AI for embeddings. You get semantic similarity + BM25 ranking + exact location results (file:line-line) without sending your code anywhere except Google Cloud's embedding API.
claude-context is a Claude Code MCP skill that combines:
- Milvus (vector database) for storage and hybrid search
- LiteLLM (proxy) to route embedding requests
- Vertex AI gemini-embedding-001 (3072-dim) for semantic vectors
- @zilliz/claude-context-mcp (MCP server) to wire it all into Claude Code or any MCP-aware agent
Ask "where is webhook signature verification?" and get back the actual code chunks with line numbers, even if the word "webhook" never appears in the source.
agent/CLI ── MCP/stdio ──► @zilliz/claude-context-mcp
                                │
                                ├─► OPENAI_BASE_URL=http://127.0.0.1:4001/v1
                                │        │
                                │        ▼
                                │   LiteLLM proxy ──► Vertex AI gemini-embedding-001
                                │                     (<your-gcp-project>, gcloud ADC)
                                │                     3072-dim vectors
                                ▼
                       Milvus standalone (Docker)
                         gRPC: 127.0.0.1:19530
                         REST: 127.0.0.1:9091
                         one collection per repo
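Two design choices are worth noting. The stack uses Vertex AI rather than AI Studio because Vertex AI needs no API key and authenticates with your gcloud application-default credentials. It uses Milvus standalone rather than Milvus Lite because the MCP server needs gRPC.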
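The embedding hop in the diagram can be sketched as a plain OpenAI-style request. This is a minimal sketch, assuming the LiteLLM proxy serves the usual embeddings route under the base URL shown above and registers the model under the alias gemini-embedding-001. The alias, the placeholder API key, and both helper names are assumptions for illustration; check litellm.config.yaml for the real values.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:4001/v1"  # OPENAI_BASE_URL from the diagram
MODEL_ALIAS = "gemini-embedding-001"   # assumed alias in litellm.config.yaml


def build_embedding_request(texts, base_url=BASE_URL, model=MODEL_ALIAS,
                            api_key="sk-placeholder"):
    """Build (but do not send) an OpenAI-style embeddings request."""
    body = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/embeddings",
        data=body,
        headers={
            "Content-Type": "application/json",
            # Placeholder key: an assumption, the local proxy may not need one.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def vector_dims(response_json):
    """Return the length of each embedding in an OpenAI-style response."""
    return [len(item["embedding"]) for item in response_json["data"]]
```

To probe a running stack, pass the request to urllib.request.urlopen and feed the decoded JSON body to vector_dims. If the proxy is routing to gemini-embedding-001 as described, each entry should be 3072.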
- Docker (for Milvus + etcd + minio)
- Node 20+ (for the MCP server)
- Python 3.11+ (for LiteLLM)
- gcloud authenticated with ADC against a project that has Vertex AI enabled:
gcloud auth application-default login
gcloud config set project <your-gcp-project>
Clone the portable bundle (public repo):
git clone https://github.com/Timtech4u/Mercurie.git ~/code-search-bundle
cd ~/code-search-bundle/tools/local-code-search
Install the skill and local-stack scripts:
# 1. Install skill into global Claude Code skills dir
mkdir -p ~/.claude/skills
cp -R skill ~/.claude/skills/claude-context
# 2. Copy local-stack (Milvus compose, LiteLLM config, wrappers)
mkdir -p ~/local-code-search
cp local-stack/* ~/local-code-search/
# 3. Symlink wrapper commands onto $PATH
mkdir -p ~/bin
ln -sf ~/.claude/skills/claude-context/bin/cc-* ~/bin/
Bring up the stack (Milvus + LiteLLM):
cc-start
Healthy output ends with == ready == and a successful 3072-dim embedding probe. This is idempotent, so run it any time something looks off.
The first run creates:
- ~/.context/.env (MCP environment: provider, base URL, Milvus address)
- ~/.context/.contextignore (global ignore: vendor, *.pb.go, lockfiles, node_modules)
- Docker volumes for Milvus data (at ~/local-code-search/volumes/)
Point cc-index at any repo:
cc-index ~/code/your-repo
This walks the repo, chunks each file (respecting .contextignore), embeds with Vertex AI, and writes to Milvus. Expect about 5 minutes for roughly 1,500 chunks. Progress appears live.
Check status:
cc-status ~/code/your-repo
When it says ✅ ... fully indexed, you're ready to search.
Subsequent cc-index runs are incremental (only changed files re-embed) thanks to a merkle DAG at ~/.context/merkle/<hash>.json.
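The idea behind incremental re-indexing can be illustrated with a flat content-hash snapshot. This is a simplified sketch, not the tool's actual Merkle DAG or its on-disk format: the function names and the JSON layout are hypothetical, and a real implementation would also apply the ignore rules before hashing.

```python
import hashlib
import json
from pathlib import Path


def snapshot(root):
    """Map each file's repo-relative path to the SHA-256 of its contents."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff(old, new):
    """Return (changed_or_added, removed) paths between two snapshots."""
    changed = sorted(path for path, digest in new.items() if old.get(path) != digest)
    removed = sorted(path for path in old if path not in new)
    return changed, removed


def load_snapshot(path):
    """Read a saved snapshot; a missing file means nothing is indexed yet."""
    path = Path(path)
    return json.loads(path.read_text()) if path.exists() else {}


def save_snapshot(path, snap):
    """Persist the snapshot so the next run can compare against it."""
    Path(path).write_text(json.dumps(snap, indent=2))
```

Only the paths in the first list would need re-embedding, and chunks for the paths in the second list could be deleted from the collection. Deleting the saved snapshot forces every file to be treated as new, which is the same effect the troubleshooting steps rely on when they remove the merkle file.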
Force a full re-index:
cc-index ~/code/your-repo --force
Query by intent:
cc-search ~/code/your-repo "where is webhook signature verified" --limit 3
Output is JSON with Location: <file>:<startLine>-<endLine>, language, rank, and the literal code chunk. For plain file:line results:
cc-search ~/code/your-repo "<query>" --limit 5 \
| python3 -c "import json,sys,re; print('\n'.join(re.findall(r'Location: (\S+)', json.load(sys.stdin)['content'][0]['text'])))"
Examples that work:
- "rate limiting middleware" → finds the middleware even if it's not named RateLimiter
- "Spanner transaction retry logic" → surfaces the retry loop by concept
- "error handling for customer import" → chunks from the error package + handler
The MCP server is auto-registered when you install the skill. In a Claude Code session, just ask:
"Use claude-context to find where Stripe webhooks are verified."
The agent calls search_code over MCP, reads the returned chunks, and answers from real code with file:line citations.
First time on a new repo: ask the agent to index it first, or run cc-index <repo> yourself before the session.
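Outside an agent session, the same file:line citations can be pulled from cc-search output with a short script. This sketch assumes the payload shape implied by the cc-search one-liner (the result text sits under content[0].text and contains Location: lines). The function names and the sample payload in the usage note are hypothetical.

```python
import json
import re
import sys

# Matches "Location: <file>:<startLine>-<endLine>" as described for cc-search output.
LOCATION_RE = re.compile(r"Location: (\S+):(\d+)-(\d+)")


def extract_locations(payload):
    """Return (file, start_line, end_line) tuples from a cc-search JSON payload."""
    text = payload["content"][0]["text"]
    return [
        (path, int(start), int(end))
        for path, start, end in LOCATION_RE.findall(text)
    ]


def main():
    """Read cc-search JSON from stdin and print one file:start-end per line."""
    for path, start, end in extract_locations(json.load(sys.stdin)):
        print(f"{path}:{start}-{end}")
```

Saved under a name of your choosing with a final line that calls main(), it can take the place of the inline python3 -c filter after the pipe. Returning the line numbers as integers makes it easier to sort results or open files at the right line.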
The vector store gets stale as code lands. Re-index regularly:
cc-index ~/code/your-repo
Only changed files re-embed (incremental). A daily cron or launchd job is recommended.
- Vendor and lockfiles must be ignored. Without .contextignore, a Go repo with vendor/ indexes 10× larger and drowns real code in noise. The skill's cc-start writes sensible defaults if the file is missing.
- Embedding model is gemini-embedding-001, not -2. The -001 model is GA on Vertex AI; -2 may not be available in all projects. Both are 3072-dim.
- Read-only triage when reviewing issues. If you use this for backlog review, the output is a report the human acts on. Never run gh issue close or post comments automatically.
- Conservative bias to KEEP. When search results are ambiguous, assume the issue is still valid. Closing real work by mistake costs more than keeping stale issues open.
- Index before review. Searching an empty collection produces false negatives. Wait for ✅ ... fully indexed.
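The effect of the ignore rules in the first point above can be sketched as a path filter. This is an illustration only: the real .contextignore matching semantics may differ (for example, full gitignore-style rules), and the pattern list below, including the specific lockfile names, is an assumption rather than the defaults cc-start writes.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Illustrative patterns; the defaults cc-start writes may differ.
PATTERNS = ["vendor/", "node_modules/", "*.pb.go", "package-lock.json", "yarn.lock"]


def is_ignored(path, patterns=PATTERNS):
    """True if any directory component or the file name matches a pattern."""
    parts = PurePosixPath(path).parts
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: ignore anything nested under that directory.
            if pattern.rstrip("/") in parts[:-1]:
                return True
        elif fnmatchcase(parts[-1], pattern):
            # File pattern: match against the file name only.
            return True
    return False


def filter_indexable(paths, patterns=PATTERNS):
    """Keep only the paths that should be chunked and embedded."""
    return [p for p in paths if not is_ignored(p, patterns)]
```

Running a filter like this over a file listing before indexing gives a quick estimate of how much of a repo is vendored or generated code.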
If something breaks:
cc-start # diagnose + restart Milvus + LiteLLM
To see what's indexed:
curl -sS http://127.0.0.1:9091/api/v1/collections | jq
To drop a broken collection and re-index:
# Find the collection hash from the list above
curl -sS -X DELETE http://127.0.0.1:9091/api/v1/collection \
-H 'Content-Type: application/json' \
-d '{"collection_name":"hybrid_code_chunks_<hash>"}'
# Clear the merkle so the next index rebuilds from scratch
rm ~/.context/merkle/<hash>.json
cc-index ~/code/your-repo --force
| Path | Purpose |
|---|---|
| ~/.claude/skills/claude-context/bin/ | The cc-* wrapper scripts |
| ~/bin/cc-* | Symlinks so they're on $PATH |
| ~/.context/.env | MCP environment (provider, base URL, Milvus address) |
| ~/.context/.contextignore | Global ignore (vendor, *.pb.go, lockfiles) |
| ~/.context/merkle/<hash>.json | Per-repo content snapshot (incremental index) |
| ~/local-code-search/litellm.config.yaml | LiteLLM → Vertex routing |
| ~/local-code-search/milvus-standalone-docker-compose.yml | Milvus + etcd + minio |
| ~/local-code-search/volumes/ | Milvus data (DO NOT delete) |
A production backlog triage (28 open issues, ~1500 code chunks indexed in 5 min) ran end-to-end in 7 min with 4 parallel agents. Every verdict cited file:line evidence. Zero false closes.
Public bundle: Apache-2.0. See the upstream repo for details.
Questions? File an issue at https://github.com/Timtech4u/Mercurie/issues