@mdp
Created May 18, 2026 19:11
SigMap Codebase Report v6.10.10

Generated: 2026-05-18
Version: 6.10.10
Branch: main (up-to-date with origin)

Repository Status

| Item | Value |
|---|---|
| Version | 6.10.10 |
| Latest Commit | 270e21c (Merge PR #200 from develop) |
| Git Status | Clean - up to date |
| Benchmark ID | sigmap-v6.10-main |
| Benchmark Date | 2026-05-12 |
| Test Count | 722 |

Architecture

SigMap is an AI context engine that extracts function and class signatures and ranks them by relevance to a query, without using any LLM.

Ask → Rank → Context → Validate → Judge → Learn

Directory Structure

sigmap/
├── gen-context.js       # Bundled CLI (single file, zero deps)
├── src/
│   ├── extractors/      # 41 language extractors
│   ├── retrieval/
│   │   ├── ranker.js     # TF-IDF ranking engine
│   │   └── tokenizer.js  # Code-aware tokenization
│   ├── config/
│   │   ├── defaults.js   # Default config values
│   │   └── loader.js     # Config file parser
│   ├── learning/         # weights.js - learned rankings
│   ├── graph/           # builder.js, impact.js
│   ├── mcp/             # server.js, handlers.js
│   └── health/          # scorer.js
└── packages/
    ├── core/             # Programmatic API
    └── adapters/         # 10 output format adapters

Signature Extraction

Each language has its own extractor in src/extractors/:

| Language | Approach |
|---|---|
| TypeScript/JS | Regex parsing (classes, interfaces, functions) |
| Python | Regex + Python AST for complex cases |
| Go, Rust, Java | Regex-based extractors |
| R | Roxygen comment parsing |

Example transformation:

// Full code (50+ lines)
class UserService {
  async getUser(id: string): Promise<User> { ... }
}

// Signature (1 line)
class UserService
  getUser(id)
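
The transformation above can be approximated with a couple of regular expressions. This is a minimal sketch, not SigMap's actual extractor: the function name `extractSignatures` and both patterns are illustrative assumptions, and they only cover simple classes and methods.

```javascript
// Hypothetical sketch: regex-based signature extraction for TypeScript/JS.
// Not SigMap's real extractor; the patterns cover only simple classes and methods.
function extractSignatures(source) {
  const classRe = /^\s*(?:export\s+)?class\s+(\w+)/;
  // Method: optional modifiers, a name, a parameter list, an optional return type, then "{".
  const methodRe =
    /^\s*(?:(?:public|private|protected|static|async)\s+)*(\w+)\s*\(([^)]*)\)\s*(?::\s*[^{]+)?\{/;
  // Control-flow keywords look like method calls to the pattern, so skip them.
  const reserved = new Set(['if', 'for', 'while', 'switch', 'catch']);
  const sigs = [];
  let inClass = false;

  for (const line of source.split('\n')) {
    const cls = classRe.exec(line);
    if (cls) {
      sigs.push(`class ${cls[1]}`);
      inClass = true;
      continue;
    }
    const m = inClass && methodRe.exec(line);
    if (m && !reserved.has(m[1])) {
      // Keep parameter names only; drop type annotations and default values.
      const params = m[2]
        .split(',')
        .map((p) => p.trim().split(/[:=?]/)[0].trim())
        .filter(Boolean);
      sigs.push(`  ${m[1]}(${params.join(', ')})`);
    }
  }
  return sigs.join('\n');
}

const code = [
  'class UserService {',
  '  async getUser(id: string): Promise<User> { return db.find(id); }',
  '}',
].join('\n');

console.log(extractSignatures(code));
```

Line-by-line regexes are fast and dependency-free, but they miss multi-line declarations; that trade-off is presumably why the Python extractor falls back to an AST for complex cases.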

TF-IDF Ranking

src/retrieval/ranker.js scores each file using weighted signals:

| Signal | Weight | Description |
|---|---|---|
| exactToken | 1.0 | Query token in signature |
| symbolMatch | 0.5 | Token in function/class name |
| prefixMatch | 0.3 | Token is prefix of sig token |
| pathMatch | 0.8 | Token in file path |
| recencyBoost | 1.5× | Recently changed files |
| graphBoost | 0.4 | Imports a scored file |

Penalty Signals:

| Signal | Multiplier | Pattern |
|---|---|---|
| testFile | 0.4 | test/, spec/ |
| generatedCode | 0.3 | dist/, build/ |
| docsFile | 0.2 | docs/, readme/ |
| node_modules | 0.0 | Always zero |
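
One plausible way to combine these signals is to add the per-token weights and then apply the recency boost and the penalties as multipliers. The sketch below does that. The weights and multipliers come from the tables above; the combination rule, the `scoreFile` name, and the file record shape are assumptions, not a description of ranker.js.

```javascript
// Hypothetical sketch of weighted-signal scoring. Weights and multipliers are taken
// from the tables above; how SigMap actually combines them is an assumption here.
const WEIGHTS = { exactToken: 1.0, symbolMatch: 0.5, prefixMatch: 0.3, pathMatch: 0.8 };
const RECENCY_BOOST = 1.5;
const PENALTIES = [
  { pattern: /(^|\/)node_modules\//, multiplier: 0.0 },
  { pattern: /(^|\/)(test|spec)\//, multiplier: 0.4 },
  { pattern: /(^|\/)(dist|build)\//, multiplier: 0.3 },
  { pattern: /(^|\/)(docs|readme)\//i, multiplier: 0.2 },
];

// file: { path, sigTokens: string[], symbols: string[], recentlyChanged: boolean }
function scoreFile(queryTokens, file) {
  let score = 0;
  const path = file.path.toLowerCase();
  for (const q of queryTokens) {
    // An exact hit supersedes a prefix hit for the same query token.
    if (file.sigTokens.includes(q)) score += WEIGHTS.exactToken;
    else if (file.sigTokens.some((t) => t.startsWith(q))) score += WEIGHTS.prefixMatch;
    if (file.symbols.some((s) => s.toLowerCase().includes(q))) score += WEIGHTS.symbolMatch;
    if (path.includes(q)) score += WEIGHTS.pathMatch;
  }
  if (file.recentlyChanged) score *= RECENCY_BOOST;
  for (const { pattern, multiplier } of PENALTIES) {
    if (pattern.test(file.path)) score *= multiplier;
  }
  return score;
}

const source = {
  path: 'src/auth/login.js',
  sigTokens: ['auth', 'login', 'user'],
  symbols: ['AuthService'],
  recentlyChanged: false,
};
const testFile = { path: 'test/auth.test.js', sigTokens: ['auth'], symbols: [], recentlyChanged: false };

console.log(scoreFile(['auth'], source).toFixed(2));   // exact + symbol + path
console.log(scoreFile(['auth'], testFile).toFixed(2)); // (exact + path) × testFile penalty
```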

Graph Boost (2-hop with decay):

  • Hop 1: +0.40 for direct imports
  • Hop 2: +0.15 for transitive imports
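
The two-hop decay can be sketched as a breadth-first walk over the import graph. This is a minimal illustration under stated assumptions: a file is boosted when it imports a file that already has a score, files that matched the query directly are not boosted again, and each other file receives only its nearest-hop boost. The name `applyGraphBoost` and the `Map` shapes are hypothetical.

```javascript
// Hypothetical sketch of a 2-hop graph boost with decay (+0.40, then +0.15).
// Assumption: a file is boosted when it imports a file that already has a score,
// and each file receives only its nearest-hop boost, once.
const HOP_BOOST = [0.40, 0.15];

// baseScores: Map<file, number>; imports: Map<file, string[]> of direct imports
function applyGraphBoost(baseScores, imports) {
  const boosted = new Map(baseScores);
  // The frontier starts with files that matched the query directly.
  let frontier = new Set([...baseScores].filter(([, s]) => s > 0).map(([f]) => f));
  const seen = new Set(frontier);

  for (const boost of HOP_BOOST) {
    const next = new Set();
    for (const [file, deps] of imports) {
      if (seen.has(file)) continue;
      if (deps.some((d) => frontier.has(d))) next.add(file);
    }
    for (const file of next) {
      boosted.set(file, (boosted.get(file) || 0) + boost);
      seen.add(file);
    }
    frontier = next;
  }
  return boosted;
}

const imports = new Map([
  ['routes/login.js', ['services/auth.js']],
  ['app.js', ['routes/login.js']],
  ['services/auth.js', []],
]);
const scores = applyGraphBoost(new Map([['services/auth.js', 2.3]]), imports);
console.log(scores.get('routes/login.js'), scores.get('app.js')); // hop 1, hop 2
```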

Intent Detection (7 types)

SigMap detects the query type and adjusts the signal weights:

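| Intent | exactToken | symbolMatch | pathMatch | graphBoost |
|---|---|---|---|---|
| debug | 1.2 | - | 0.6 | - |
| explain | - | 0.8 | 0.9 | - |
| refactor | 0.8 | 0.9 | - | - |
| review | 0.9 | - | 1.0 | - |
| test | 0.7 | 0.4 | - | - |
| integrate | - | - | 1.1 | 0.7 |
| navigate | 0.9 | - | 1.2 | - |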
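
A simple way to picture this is a keyword lookup that selects an intent, followed by an overlay of that intent's weights on the defaults. In the sketch below the override values are copied from the table, but the keyword lists, the function names, and the reading of "-" as "keep the default weight" are assumptions.

```javascript
// Hypothetical sketch: pick an intent from query keywords, then overlay its weights.
// Override values come from the table above; the keyword lists are illustrative
// assumptions, and "-" is treated as "keep the default weight".
const DEFAULT_WEIGHTS = { exactToken: 1.0, symbolMatch: 0.5, pathMatch: 0.8, graphBoost: 0.4 };

const INTENT_OVERRIDES = {
  debug: { exactToken: 1.2, pathMatch: 0.6 },
  explain: { symbolMatch: 0.8, pathMatch: 0.9 },
  refactor: { exactToken: 0.8, symbolMatch: 0.9 },
  review: { exactToken: 0.9, pathMatch: 1.0 },
  test: { exactToken: 0.7, symbolMatch: 0.4 },
  integrate: { pathMatch: 1.1, graphBoost: 0.7 },
  navigate: { exactToken: 0.9, pathMatch: 1.2 },
};

// Assumed trigger words; the real detector may use entirely different cues.
const INTENT_KEYWORDS = {
  debug: ['bug', 'error', 'fix', 'crash'],
  explain: ['explain', 'how', 'why'],
  refactor: ['refactor', 'rename', 'cleanup'],
  review: ['review', 'audit'],
  test: ['test', 'coverage'],
  integrate: ['integrate', 'connect', 'wire'],
  navigate: ['where', 'find', 'locate'],
};

function detectIntent(query) {
  const words = query.toLowerCase().split(/\W+/);
  for (const [intent, keywords] of Object.entries(INTENT_KEYWORDS)) {
    if (keywords.some((k) => words.includes(k))) return intent;
  }
  return null; // no intent detected: defaults apply unchanged
}

function weightsFor(query) {
  const intent = detectIntent(query);
  const overrides = intent ? INTENT_OVERRIDES[intent] : {};
  return { intent, weights: { ...DEFAULT_WEIGHTS, ...overrides } };
}

console.log(weightsFor('fix login crash'));
```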

Output Adapters (10 total)

| Adapter | Output File | Used By |
|---|---|---|
| copilot | .github/copilot-instructions.md | GitHub Copilot |
| claude | CLAUDE.md | Claude Code |
| cursor | .cursorrules | Cursor, Cline |
| windsurf | .windsurfrules | Windsurf |
| openai | .github/openai-context.md | Ollama, Aider |
| gemini | .github/gemini-context.md | Google Gemini |
| codex | AGENTS.md | OpenAI Codex |
| willow | Willow MCP | Willow |
| llm-full | llm-full.txt | Full context |
| llm | llm.txt | Compact context |

MCP Tools (9 total)

| Tool | Purpose |
|---|---|
| read_context | Read context file (full or per-module) |
| search_signatures | Search signatures by keyword |
| get_map | Import graph, class hierarchy, routes |
| create_checkpoint | Session snapshot with branch/commit |
| get_routing | Model tier hints for files |
| explain_file | Deep-dive with imports/callers |
| list_modules | Module directory listing |
| query_context | TF-IDF ranked retrieval |
| get_impact | Dependency blast radius |
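
MCP clients invoke tools like these with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for query_context. The envelope follows the MCP specification, but the argument name `query` and the helper `buildToolCall` are assumptions; the actual input schema for each tool is whatever the server advertises via `tools/list`.

```javascript
// Hypothetical sketch: build an MCP "tools/call" request for the query_context tool.
// The JSON-RPC envelope follows the MCP specification; the argument name `query`
// is an assumption and may not match SigMap's actual tool schema.
function buildToolCall(id, name, args) {
  return JSON.stringify({
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name, arguments: args },
  });
}

const request = buildToolCall(1, 'query_context', { query: 'auth token refresh' });
console.log(request);
```

Over the stdio transport, each message is written as a single line of JSON, and a client sends it only after the usual `initialize` handshake.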

Benchmark Results (v6.10-main)

| Metric | Value | Baseline / Notes |
|---|---|---|
| Hit@5 | 78.9% | 13.6% (5.8× lift) |
| Token reduction | 97.9% | 278K vs 13.5M tokens |
| Task success | 52.2% | 10% |
| Prompts per task | 1.66 | 2.84 (40.6% fewer) |
| Repositories tested | 21 | JS, Python, Go, Rust, Java, R, etc. |
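
For readers unfamiliar with the metric, Hit@5 is conventionally the fraction of queries for which at least one expected file appears in the top five results. The sketch below shows that conventional definition and reproduces the lift and token-reduction arithmetic from the table. Whether SigMap's benchmark defines a hit exactly this way is an assumption, and `hitAtK` and its input shape are hypothetical.

```javascript
// Hypothetical sketch of the conventional Hit@K metric plus the table's arithmetic.
// runs: [{ ranked: string[], expected: string[] }], one entry per benchmark query.
function hitAtK(runs, k = 5) {
  const hits = runs.filter((r) =>
    r.ranked.slice(0, k).some((file) => r.expected.includes(file))
  ).length;
  return hits / runs.length;
}

const runs = [
  { ranked: ['a.js', 'b.js', 'c.js', 'd.js', 'e.js', 'f.js'], expected: ['f.js'] }, // miss: rank 6
  { ranked: ['x.js', 'y.js'], expected: ['y.js'] },                                 // hit: rank 2
];
console.log(hitAtK(runs)); // one hit out of two queries

// Figures from the table above.
const lift = 78.9 / 13.6;                  // Hit@5 with SigMap vs baseline
const tokenReduction = 1 - 278e3 / 13.5e6; // 278K tokens vs 13.5M tokens
console.log(lift.toFixed(1) + 'x', (tokenReduction * 100).toFixed(1) + '%');
```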

Tokenizer

src/retrieval/tokenizer.js provides code-aware tokenization:

  • camelCase → camel case
  • snake_case → snake case
  • kebab-case → kebab case
  • File paths → individual components
  • Stop words removed (the, a, an, in, of, etc.)
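
The rules above fit in a few lines. This is a minimal sketch, not the contents of tokenizer.js: the stop-word list is abbreviated to the words named above, and also splitting on "." (so file extensions become tokens) is an assumption.

```javascript
// Hypothetical sketch of code-aware tokenization; the stop-word list is abbreviated.
const STOP_WORDS = new Set(['the', 'a', 'an', 'in', 'of']);

function tokenize(text) {
  return text
    .replace(/([a-z0-9])([A-Z])/g, '$1 $2') // camelCase → camel Case
    .split(/[\s_\-\/\\.]+/)                 // snake_case, kebab-case, path separators, dots
    .map((t) => t.toLowerCase())
    .filter((t) => t && !STOP_WORDS.has(t));
}

console.log(tokenize('getUserById in src/auth/token_store.js'));
// → [ 'get', 'user', 'by', 'id', 'src', 'auth', 'token', 'store', 'js' ]
```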

Configuration Defaults

{
  output: '.github/copilot-instructions.md',
  adapters: null,
  srcDirs: ['src', 'app', 'lib', 'packages', ...],
  exclude: ['node_modules', '.git', 'dist', ...],
  maxDepth: 6,
  maxSigsPerFile: 25,
  maxTokens: 6000,
  autoMaxTokens: true,
  coverageTarget: 0.80,
  secretScan: true,
  strategy: 'full',  // 'full' | 'per-module' | 'hot-cold'
  sigCache: false,
  retrieval: { topK: 10, recencyBoost: 1.5 },
  impact: { depth: 3, includeSigs: true }
}
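
User settings presumably overlay these defaults. The sketch below shows one way to do that, merging the nested `retrieval` and `impact` sections one level deep so that a partial override keeps the remaining default keys. The `resolveConfig` name and the merge strategy are assumptions rather than a description of loader.js, and only a subset of the defaults is repeated.

```javascript
// Hypothetical sketch: overlay user settings on defaults, merging nested sections
// one level deep so a partial `retrieval` block keeps the other default keys.
const DEFAULTS = {
  maxDepth: 6,
  maxSigsPerFile: 25,
  maxTokens: 6000,
  strategy: 'full',
  retrieval: { topK: 10, recencyBoost: 1.5 },
  impact: { depth: 3, includeSigs: true },
};
const STRATEGIES = ['full', 'per-module', 'hot-cold'];

function resolveConfig(user = {}) {
  const merged = { ...DEFAULTS, ...user };
  for (const key of ['retrieval', 'impact']) {
    merged[key] = { ...DEFAULTS[key], ...(user[key] || {}) };
  }
  if (!STRATEGIES.includes(merged.strategy)) {
    throw new Error(`Unknown strategy: ${merged.strategy}`);
  }
  return merged;
}

const config = resolveConfig({ strategy: 'per-module', retrieval: { topK: 20 } });
console.log(config.retrieval); // topK overridden, recencyBoost kept from defaults
```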

Supported Languages (41 total)

TypeScript, JavaScript, Python, Java, Kotlin, Go, Rust, C#, C/C++, Ruby, PHP, Swift, Dart, Scala, Vue, Svelte, HTML, CSS/SCSS, YAML, Shell, SQL, GraphQL, Terraform, Protobuf, Dockerfile, TOML, XML, Properties, Markdown, R, GDScript

Key Files

| File | Purpose |
|---|---|
| gen-context.js | Bundled CLI (single file, zero deps) |
| packages/core/index.js | Programmatic API |
| src/retrieval/ranker.js | TF-IDF ranking |
| src/retrieval/tokenizer.js | Code tokenization |
| src/config/loader.js | Config parsing + auto-detection |
| src/graph/builder.js | Dependency graph builder |
| src/graph/impact.js | Impact analysis |
| src/learning/weights.js | Learned ranking multipliers |
| src/mcp/server.js | MCP stdio server |
| src/health/scorer.js | Health scoring |
| src/judge/judge-engine.js | Groundedness scoring |

Recent Changes (v6.10.x)

v6.10.10 (2026-05-12)

  • First-class R support (R6/S7 classes, Roxygen2 hints)
  • DESCRIPTION/NAMESPACE parsing for R packages
  • Windows path normalization fix for get_impact

v6.10.0 - v6.10.9

  • Python absolute imports in builder.js
  • Comprehensive import graph diagnostics
  • Workspace-scoped retrieval for monorepos

Installation

# Try without installing
npx sigmap

# Install globally
npm install -g sigmap

# Standalone binary (no Node.js required)
# Download from GitHub Releases

Quick Commands

sigmap              # Generate context
sigmap ask "auth"   # Ask a question
sigmap validate     # Check coverage
sigmap --health     # Health score
sigmap --mcp        # Start MCP server
sigmap --setup      # IDE integration

Report generated by analyzing the SigMap codebase directly. No LLM used.
