Ted Blackman belisarius222

  • Massachusetts
  • X @rovnys
@belisarius222
belisarius222 / void-cron-prompt.html
Last active March 26, 2026 18:19
VOID Maintenance Cron Prompt #pagedrop
<!DOCTYPE html>
<html><head><meta charset="utf-8"><title>VOID Maintenance Cron Prompt</title>
<style>
body { font-family: system-ui, sans-serif; max-width: 800px; margin: 40px auto; padding: 0 20px; background: #0d1117; color: #c9d1d9; }
h1 { color: #58a6ff; border-bottom: 1px solid #30363d; padding-bottom: 8px; }
h2 { color: #79c0ff; margin-top: 24px; }
pre { background: #161b22; border: 1px solid #30363d; border-radius: 6px; padding: 16px; overflow-x: auto; font-size: 13px; line-height: 1.5; }
.key { background: #1f6feb22; border-left: 3px solid #1f6feb; padding: 8px 12px; margin: 8px 0; border-radius: 0 6px 6px 0; }
.warn { background: #da363422; border-left: 3px solid #da3634; padding: 8px 12px; margin: 8px 0; border-radius: 0 6px 6px 0; }
.new { background: #23863622; border-left: 3px solid #238636; padding: 8px 12px; margin: 8px 0; border-radius: 0 6px 6px 0; }
<!DOCTYPE html>
<html><head><meta charset="utf-8"><title>VOID Recursive Forecaster — Results</title>
<style>
body { font-family: system-ui, -apple-system, sans-serif; max-width: 900px; margin: 40px auto; padding: 0 20px; background: #0d1117; color: #c9d1d9; line-height: 1.6; }
h1 { color: #58a6ff; border-bottom: 1px solid #30363d; padding-bottom: 12px; }
h2 { color: #79c0ff; margin-top: 32px; }
h3 { color: #d2a8ff; margin-top: 24px; }
table { border-collapse: collapse; width: 100%; margin: 16px 0; }
th, td { border: 1px solid #30363d; padding: 8px 12px; text-align: left; }
th { background: #161b22; color: #79c0ff; }
belisarius222 / attnres-results.md
Created March 21, 2026 03:45
AttnRes: Attention Over the Residual Stream — Experimental Results (2026-03-20)

AttnRes: Attention Over the Residual Stream

Overview

AttnRes replaces the standard residual connection in transformers with a depth-attention mechanism: instead of simply adding each layer's output to a running sum, the model attends over previous layers' outputs to decide what information to carry forward.

Standard transformers use x = x + layer(x) at every layer. AttnRes variants replace this with a learned attention operation across the depth axis: "which previous layers' outputs should I attend to when constructing the input to this layer?"
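The depth-attention residual can be sketched as follows. This is a hypothetical illustration of the idea, not the experiment's code; the module name, projections, and shapes are my own assumptions:

```python
import torch
import torch.nn as nn

class DepthAttnResidual(nn.Module):
    """Attention over the depth axis: each layer's input is an
    attention-weighted mix of all previous layer outputs, replacing
    the plain running sum x = x + layer(x). (Sketch; shapes and
    projections are assumptions, not the paper's implementation.)"""
    def __init__(self, d_model: int):
        super().__init__()
        self.q = nn.Linear(d_model, d_model, bias=False)
        self.k = nn.Linear(d_model, d_model, bias=False)

    def forward(self, history: torch.Tensor) -> torch.Tensor:
        # history: (L, B, S, d) -- outputs of layers 0..L-1
        cur = history[-1]                      # query comes from the newest layer
        q = self.q(cur)                        # (B, S, d)
        k = self.k(history)                    # (L, B, S, d)
        scores = torch.einsum('bsd,lbsd->lbs', q, k) / cur.shape[-1] ** 0.5
        weights = scores.softmax(dim=0)        # attention weights over depth L
        return torch.einsum('lbs,lbsd->bsd', weights, history)
```

In this sketch the attention is per-position (each token position mixes its own per-layer states), which keeps the extra cost linear in depth rather than in sequence length.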

All experiments use a GPT-2-style decoder-only transformer trained on FineWeb-Edu (10B tokens), with RoPE, SwiGLU, and RMSNorm.

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Record Break — modded-nanogpt 57.38s on 8×B200</title>
<style>
@import url('https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700;800&family=JetBrains+Mono:wght@400;500&display=swap');
:root {
belisarius222 / static-set-shortlist.md
Created March 9, 2026 01:02
Static frequency-ranked shortlist for speculative decoding -- 99.65% parity with zero parameters

Static Frequency-Ranked Shortlist for Speculative Decoding

Date: 2026-03-08
Model: google/gemma-3-1b-it (262,144-token vocab)
Eval dataset: wikitext-2-raw-v1 validation (254,828 positions)
Code: voltropy/shortlist@8168cac

Summary

We discovered that a static, frequency-ranked token set with a simple margin-based fallback to full-vocab scoring achieves better parity than a trained neural router, with zero parameters, zero training, and zero inference-time routing.
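A minimal sketch of the scheme described above: score only a fixed, frequency-ranked shortlist of tokens, and fall back to full-vocabulary scoring when the top-2 margin within the shortlist is too small. Function names and the exact margin rule are my own assumptions, not the voltropy/shortlist code:

```python
import torch

def shortlist_predict(hidden, lm_head_w, shortlist_ids, margin_tau):
    """Greedy next-token prediction over a static shortlist.

    hidden:        (B, d) final hidden states
    lm_head_w:     (V, d) output embedding matrix
    shortlist_ids: (K,) token ids of the K most frequent tokens
    margin_tau:    fall back to full vocab when the shortlist's
                   top-2 logit margin is below this threshold
    (Hypothetical sketch; the real fallback rule may differ.)"""
    short_logits = hidden @ lm_head_w[shortlist_ids].T      # (B, K)
    top2 = short_logits.topk(2, dim=-1).values
    margin = top2[..., 0] - top2[..., 1]                    # (B,)
    out = shortlist_ids[short_logits.argmax(dim=-1)]        # shortlist argmax
    need_full = margin < margin_tau
    if need_full.any():
        # zero-parameter fallback: full-vocab scoring for uncertain positions
        full_logits = hidden[need_full] @ lm_head_w.T       # (n, V)
        out[need_full] = full_logits.argmax(dim=-1)
    return out
```

The appeal is that the shortlist is a constant (token-frequency ranking), so there is nothing to train and no per-token routing network at inference time.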

belisarius222 / swap-ffn-bench-a1-20260302.md
Created March 2, 2026 22:29
Swap-FFN benchmark: torch.compile + fused w13 on A100 (2026-03-02)

Swap-FFN Benchmark Results: torch.compile + Fused w13 on A100

Date: 2026-03-02
Machine: A1 (216.81.248.152), NVIDIA A100-SXM4-80GB, PyTorch 2.10.0+cu126, CUDA 12.6
Code: monarch repo, commit aa3bb6f (token-local swap-FFN state)
Checkpoints: trained baseline and hybrid1 checkpoints from s3://voltcode-artifacts-17f9c348/runs/monarch-swap-ffn/20260302/
Results: s3://voltcode-artifacts-17f9c348/runs/swap-ffn-bench/a1-compiled-20260302/
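"Fused w13" most plausibly refers to the common SwiGLU convention of concatenating the gate (w1) and up (w3) projections into a single matmul; the sketch below shows that convention, and is an assumption rather than the monarch repo's actual module:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusedW13FFN(nn.Module):
    """SwiGLU FFN with gate (w1) and up (w3) projections fused into
    one weight ("w13"), so decode does one GEMM instead of two.
    (Illustrative sketch, not the benchmarked implementation.)"""
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.w13 = nn.Linear(d_model, 2 * d_ff, bias=False)
        self.w2 = nn.Linear(d_ff, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gate, up = self.w13(x).chunk(2, dim=-1)
        return self.w2(F.silu(gate) * up)

ffn = FusedW13FFN(d_model=1024, d_ff=4096)
# torch.compile can then fuse the chunk/silu/mul elementwise chain
# around the two remaining GEMMs:
# ffn_compiled = torch.compile(ffn)
```

Fusing w1 and w3 halves the number of kernel launches on the up-projection side, which matters most at batch-1 decode where each GEMM is launch-latency-bound.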

Background

belisarius222 / swap-ffn-fused-vs-unfused-trace.md
Created March 2, 2026 20:17
Swap-FFN Fused vs Unfused Decode: Detailed Execution Trace (A100)

Swap-FFN Fused vs Unfused Decode: Execution Trace

Detailed execution trace comparing the fused Triton kernel path vs the unfused PyTorch/cuBLAS path for Swap-FFN decode inference.

Config: d_model=1024, d_ff=4096, core_count=512, n_events=8, k_ffn=2, d_router=64, collapse_k=8, collapse_r=224

Hardware: A100-SXM4-80GB, 108 SMs, 40 MB L2 cache


belisarius222 / test.js
Created December 19, 2024 16:04
Test gist from MCP server verification
console.log("Verification test");
belisarius222 / hello.js
Created December 19, 2024 15:53
Test gist from MCP server
console.log("Hello from MCP server!");
|pass  &noun
:-  %k
:+  %lard  %base
!<  shed:khan
%+  slap  (slop .^(vase %ca %/lib/strandio/hoon) !>(..zuse))
!,  *hoon
=*  strandio  -
=,  strand=strand:rand
^-  shed:khan
=/  m  (strand ,vase)
|-  ^-  form:m
;<  =bowl:rand  bind:m  get-bowl:strandio