Anish athreesh

  • NVIDIA

Two-Layer Knowledge Management for Claude Code

A system for persistent, cross-session memory using a git-backed work vault (raw findings) and an Obsidian vault (polished synthesis). Claude Code operates on both layers directly.

Architecture

┌─────────────────────────────────────────────────────┐
│                   Claude Code                        │
│ │
athreesh / clu-one-pager.md
Last active May 6, 2026 04:10
llm-d Performance Benchmarks: Qwen3-32B on Nebius H200 (Round-Robin vs KV-Events Routing)

CLU: Dynamo vs llm-d — Inference Orchestration Platform

Customer: AI-natives and CSP/NCPs deploying LLM inference at scale on Kubernetes
Decision: Which orchestration layer to build on for production serving?
Date: February 2026 | Dynamo 0.8.1 vs llm-d 0.5.0


PRODUCT CONFIGURATION

athreesh / grove-kind-demo-guide.md
Last active January 16, 2026 19:06
Grove on Kind/MicroK8s: kubectl-grove CLI Demo Guide

This guide walks you through deploying Grove on a local Kubernetes cluster (Kind or MicroK8s) and testing all kubectl-grove CLI commands.

Prerequisites

  • Docker installed and running
  • Go 1.22+ (for building kubectl-grove)
  • kubectl configured
  • Kind or MicroK8s installed
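
The prerequisites above can be checked before starting. The following is a minimal sketch, not part of the original guide: it assumes the tools are on `PATH` under their usual binary names (`docker`, `go`, `kubectl`, `kind`, `microk8s`), and the function name `missing_prerequisites` is hypothetical. It only checks that the binaries exist; it does not verify that Docker is running or that Go is 1.22 or newer.

```python
import shutil

# Assumed binary names for the listed prerequisites.
REQUIRED_TOOLS = ["docker", "go", "kubectl"]
CLUSTER_TOOLS = ["kind", "microk8s"]  # either one is enough


def missing_prerequisites(which=shutil.which):
    """Return the names of prerequisite tools not found on PATH.

    `which` is injectable so the check can be exercised without
    touching the real environment.
    """
    missing = [tool for tool in REQUIRED_TOOLS if which(tool) is None]
    if not any(which(tool) is not None for tool in CLUSTER_TOOLS):
        missing.append("kind or microk8s")
    return missing


if __name__ == "__main__":
    problems = missing_prerequisites()
    if problems:
        print("Missing:", ", ".join(problems))
    else:
        print("All prerequisite binaries found on PATH.")
```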
athreesh / documentation-audit-report.md
Created January 13, 2026 06:36
Dynamo Documentation Audit Report - January 2026

Dynamo Documentation Audit Report

Date: January 12, 2026
PR: #5380
Branch: docs/documentation-audit-updates


Transcript

athreesh / README.md
Last active January 4, 2026 03:23
# Claude Code PM Skills that I use

Claude Code PM Skills for AI Inference

Setup

gh gist clone 74fdca4ce6cf6e48711fc9b06e95dc74 ~/.claude/commands

Skills

  • /explain - Teaches inference concepts in PM-friendly terms
  • /architecture - Generates codebase architecture diagrams
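
After cloning, the available skills can be listed from the commands directory. This is a minimal sketch under one assumption: that each Markdown file in `~/.claude/commands` becomes a slash command named after the file (so `explain.md` would provide `/explain`). The helper name `list_slash_commands` is hypothetical, and skipping `README.md` is a choice made here for illustration.

```python
from pathlib import Path


def list_slash_commands(commands_dir):
    """Return slash-command names inferred from *.md files in commands_dir.

    Assumes one Markdown file per command; README.md is skipped because
    it documents the collection rather than defining a skill.
    """
    commands_dir = Path(commands_dir)
    if not commands_dir.is_dir():
        return []
    return sorted(
        f"/{path.stem}"
        for path in commands_dir.glob("*.md")
        if path.is_file() and path.stem.lower() != "readme"
    )


if __name__ == "__main__":
    for name in list_slash_commands(Path.home() / ".claude" / "commands"):
        print(name)
```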
athreesh / dynamo-disagg-kv-routing-study-guide.md
Created December 29, 2025 22:17
Dynamo: Disaggregated Serving & KV-Aware Routing Study Guide - Qwen3-32B Benchmark on H200s

Dynamo: Disaggregated Serving & KV-Aware Routing Study Guide

A comprehensive guide covering disaggregated LLM inference, KV-aware routing, and benchmarking insights from the Qwen3-32B benchmark on H200 GPUs.


Table of Contents

  1. Overview
  2. Benchmark Setup
  3. Results Comparison