Anthony Campolo ajcwebdev

@ajcwebdev
ajcwebdev / index.md
Last active October 13, 2023 04:00
Transformer decoding in fifty lines of pseudocode

From the paper "Transformer decoding in fifty lines of pseudocode" by Bob Carpenter at the Flatiron Institute. I copied it by hand, so it almost certainly has a mistake or two in it.

DECODE(tok: int<lower=1, upper=T>[N],
    alpha: matrix(T, V),
    betas: { query: matrix(V, K),
        key: matrix(V, K),
        value: matrix(V, V) }[A],
    gammas: { 1: vector(L), 2: matrix(L, V),
              3: vector(V), 4: matrix(V, L) }[A],