@apnea
apnea / pascal.md
Last active August 1, 2025 09:17
Pascal's Demise - notes on waning support for Nvidia Pascal architecture

driver / toolkit in prod

Driver 565.77, CUDA 12.7

(as of Aug 2025, running on 22.04, Linux pop-os 6.12.10-76061203-generic)

CUDA

https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#deprecated-architectures

CUDA Toolkit 12.9 Update 1 - Release Notes

Maxwell, Pascal, and Volta architectures are now feature-complete with no further enhancements planned. While CUDA Toolkit 12.x series will continue to support building applications for these architectures, offline compilation and library support will be removed in the next major CUDA Toolkit version release. Users should plan migration to newer architectures, as future toolkits will be unable to target Maxwell, Pascal, and Volta GPUs.
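
The release note can be read as a simple check on a GPU's compute capability. The following is a minimal sketch, assuming the commonly documented mapping (Maxwell is 5.x, Pascal is 6.x, Volta is 7.0 and 7.2, and 7.5 is already Turing). The function name `deprecated_arch` is hypothetical and not part of any NVIDIA tool.

```python
from typing import Optional

# Assumed mapping from compute-capability major version to architecture name.
DEPRECATED_ARCHS = {
    5: "Maxwell",
    6: "Pascal",
    7: "Volta",  # only 7.0 and 7.2; 7.5 is Turing and handled below
}


def deprecated_arch(major: int, minor: int) -> Optional[str]:
    """Return the architecture name if the capability belongs to one of the
    three architectures named in the release note, otherwise None."""
    if major == 7 and minor >= 5:
        return None  # Turing is not covered by the deprecation notice
    return DEPRECATED_ARCHS.get(major)


if __name__ == "__main__":
    for cc in [(5, 2), (6, 1), (7, 0), (7, 5), (8, 6)]:
        name = deprecated_arch(*cc)
        status = f"deprecated ({name})" if name else "not named in the notice"
        print(f"sm_{cc[0]}{cc[1]}: {status}")
```

A GTX 1080 Ti, for example, reports compute capability 6.1 and therefore falls under Pascal in this check.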

@apnea
apnea / gist:cffd9863b1f3b9d6672394088e149879
Created February 5, 2026 10:23
serve a directory as a sorted list of links
#!/bin/bash
python3 -c "
import os, urllib.parse
files = [f for f in os.listdir('.') if os.path.isfile(f)]
files.sort(key=lambda x: os.path.getmtime(x), reverse=True)
with open('index.html', 'w') as out:
    out.write('<html><body><h2>Files (newest first)</h2>')
    for f in files:
        url = urllib.parse.quote(f)
        out.write(f'<a href=\"{url}\">{f}</a><br>\n')
@apnea
apnea / caveman-token-analysis.md
Last active May 2, 2026 18:24
Caveman skill token cost analysis

Caveman Token Cost Analysis

Date: 2026-04-18

Analysis of caveman — a skill/plugin instructing AI coding agents to respond in compressed prose, dropping articles, filler, and hedging while preserving technical accuracy.

Summary

Caveman claims: "~65-75% fewer tokens," measured via an eval harness that counts visible output tokens.
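
For reference, a figure like "65-75% fewer tokens" is a relative reduction between two output token counts. A minimal sketch of that arithmetic follows; the counts are hypothetical and are not taken from the caveman eval harness.

```python
def token_reduction(baseline_tokens: int, compressed_tokens: int) -> float:
    """Fraction of output tokens saved relative to the baseline response."""
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    return 1.0 - compressed_tokens / baseline_tokens


if __name__ == "__main__":
    # Hypothetical counts: a 1000-token baseline answer vs. a 300-token compressed one.
    print(f"{token_reduction(1000, 300):.0%} fewer visible output tokens")
```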

@apnea
apnea / spontaneous-language-switching.md
Created May 2, 2026 06:58
Spontaneous Language Switching in LLMs — Empirical Observations

Spontaneous Language Switching in LLMs

LLMs may spontaneously switch to Chinese mid-reasoning regardless of the prompt language. This has been observed in both OpenAI's o1 and Chinese models (DeepSeek, Qwen, GLM).

The papers listed below propose three possible explanations: internal circuit competition, strategic reasoning advantages gained during training, and the influence of distributed training data.

1. Competition Between Internal Circuits

Mechanistic interpretability research suggests that multilingual LLMs possess two distinct internal subsystems that govern generation:

@apnea
apnea / zai-opencode-mapping.md
Created May 10, 2026 10:58
Z.AI Coding Plan — OpenCode agent-to-model mapping

Z.AI Coding Plan — OpenCode Agent Mapping

Quota Cost per Model

Peak hours: 14:00–18:00 UTC+8. Off-peak: all other times. Monthly quota is equivalent to ~15–30× the subscription fee, converted at API pricing rates.

Model Quota (Peak: 14:00-18:00 UTC+8) Quota (Off-Peak) Temporary (thru June)
GLM-5.1 1× off-peak
GLM-5-Turbo 1× off-peak
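
The peak window above can be checked programmatically. This is a minimal sketch, assuming the window is half-open (14:00 inclusive, 18:00 exclusive), which the plan description does not specify; `is_peak` is a hypothetical helper, not part of any Z.AI or OpenCode API.

```python
from datetime import datetime, timedelta, timezone

UTC8 = timezone(timedelta(hours=8))


def is_peak(moment: datetime) -> bool:
    """True when `moment` falls within 14:00-18:00 UTC+8.

    Assumption: the end of the window (18:00) is treated as off-peak.
    """
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    local = moment.astimezone(UTC8)
    return 14 <= local.hour < 18


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print("peak" if is_peak(now) else "off-peak")
```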
@apnea
apnea / glm5.1-prompt-research.md
Last active May 13, 2026 10:26
GLM-5/5.1 System Prompt Research & Design for opencode

GLM-5/5.1 System Prompt Research & Design

Date: 2026-05-12

Purpose: Design an optimal system prompt for GLM-5.1 in opencode (a coding agent CLI), informed by the GLM-5 paper, Z.AI docs, and community findings.


Sources

| Source | Key Takeaway |

@apnea
apnea / gist:038e8cd813aa089c1e39c10f1fa73189
Created May 17, 2026 15:31
Lattice Plugin Release Workflow
# Lattice Plugin Release Workflow
## Overview
Publishing a new version of `@apnea/opencode-lattice` involves three steps:
bump the version in package.json, push a git tag, and let CI handle the rest.
The GitHub Action (`.github/workflows/publish-plugin.yml`) automatically:
1. Publishes to npm when a `v*` tag is pushed
2. Creates a GitHub release with auto-generated notes
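
The first two steps can be sketched as follows. This is an illustration only, not the workflow's actual tooling: the starting version `0.1.0` and the helper names are hypothetical, and `fnmatchcase` only approximates how the CI tag filter interprets `v*`.

```python
import json
from fnmatch import fnmatchcase


def bump_patch(version: str) -> str:
    """Increment the patch component of a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    return f"{major}.{minor}.{patch + 1}"


def prepare_release(package_json_text: str) -> tuple[str, str]:
    """Return the updated package.json text and the git tag to push."""
    pkg = json.loads(package_json_text)
    pkg["version"] = bump_patch(pkg["version"])
    tag = f"v{pkg['version']}"
    if not fnmatchcase(tag, "v*"):
        raise ValueError(f"tag {tag!r} would not match the v* trigger")
    return json.dumps(pkg, indent=2) + "\n", tag


if __name__ == "__main__":
    # Hypothetical starting version.
    text = '{"name": "@apnea/opencode-lattice", "version": "0.1.0"}'
    updated, tag = prepare_release(text)
    print(tag)  # v0.1.1
```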
@apnea
apnea / vera-vs-opencode-codebase-index-vs-aft.md
Last active May 20, 2026 11:19
Vera vs opencode-codebase-index vs AFT: Feature & Metrics Comparison

Vera vs opencode-codebase-index vs AFT: Feature & Metrics Comparison

Vera (v0.7.0) — Local-first semantic code search CLI in pure Rust. 65 languages, ONNX embeddings + cross-encoder reranker, fully offline.

opencode-codebase-index (v0.8.0) — Semantic codebase indexing plugin for OpenCode, also runs as standalone MCP server. Hybrid TypeScript + Rust, API-first embeddings, 17+ languages.

AFT (v0.26.4) — Agent File Toolkit. Tree-sitter powered code manipulation and analysis for AI agents. Rust binary + thin TS plugins. 17 languages. Semantic search + trigram grep + structural editing + call-graph navigation + LSP diagnostics.


@apnea
apnea / beyond-text-code-representation-spectrum.md
Created May 20, 2026 12:12
Beyond Text: The Spectrum of Code Representation for LLM Coding Agents

Beyond Text: The Spectrum of Code Representation for LLM Coding Agents

An analysis of how code can be represented to LLMs — from raw text to architectural patterns — and where the research frontier currently sits.


The Spectrum

Code can be represented to LLMs at progressively richer levels of abstraction. Each level captures more structural and semantic information, but also requires more sophisticated tooling and domain knowledge to construct.

@apnea
apnea / GSS.md
Last active May 20, 2026 14:26
The Governance Spectrum Scaffold v2.0 — A Framework for Comparing AI Governance Across Model Risk and Agent Risk

The Governance Spectrum Scaffold

A Framework for Comparing AI Governance Across Model Risk and Agent Risk

Version: 2.0
Date: 20 May 2026
Critique cycles: 2 (3 independent reviewers per cycle)
Source documents: 19 (5 regulations, 6 practitioner/academic papers, 5 macro-prudential and cross-sector frameworks, 2 US fair lending guidance, 1 government agentic AI framework, 1 EU implementing guidance)