@jboner
jboner / latency.txt
Last active April 29, 2025 15:40
Latency Numbers Every Programmer Should Know
Latency Comparison Numbers (~2012)
----------------------------------
L1 cache reference                           0.5 ns
Branch mispredict                            5   ns
L2 cache reference                           7   ns                      14x L1 cache
Mutex lock/unlock                           25   ns
Main memory reference                      100   ns                      20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy             3,000   ns        3 us
Send 1K bytes over 1 Gbps network       10,000   ns       10 us
Read 4K randomly from SSD*             150,000   ns      150 us          ~1GB/sec SSD
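
These numbers are easier to internalize when rescaled. A minimal Python sketch (my addition, not part of the gist) that prints each latency relative to an L1 cache reference, using the ~2012 figures above:

# Rescale the latency table above relative to an L1 cache reference.
latencies_ns = {
    "L1 cache reference": 0.5,
    "Branch mispredict": 5,
    "L2 cache reference": 7,
    "Mutex lock/unlock": 25,
    "Main memory reference": 100,
    "Compress 1K bytes with Zippy": 3_000,
    "Send 1K bytes over 1 Gbps network": 10_000,
    "Read 4K randomly from SSD*": 150_000,
}

l1 = latencies_ns["L1 cache reference"]
for op, ns in latencies_ns.items():
    print(f"{op:<35} {ns:>10,.1f} ns  ({ns / l1:>9,.0f}x L1)")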
@kevin-smets
kevin-smets / iterm2-solarized.md
Last active April 29, 2025 18:03
iTerm2 + Oh My Zsh + Solarized color scheme + Source Code Pro Powerline + Font Awesome + [Powerlevel10k] - (macOS)

[Screenshots: the Default theme and the Powerlevel10k theme]
@zcourts
zcourts / tmux cheat sheet
Last active March 6, 2025 08:08
tmux cheat sheet image from DuckDuckGo UI
https://duckduckgo.com/?q=tmux+cheat+sheet&atb=v47-1_x&ia=cheatsheet&iax=1
@mattiasarro
mattiasarro / rwkv.py
Last active March 30, 2025 15:06
RWKV MVP
# Taken from https://johanwind.github.io/2023/03/23/rwkv_details.html.
# I've added additional comments and restructured it a tiny bit, which makes it clearer for me.
import numpy as np
from torch import load as torch_load # Only for loading the model weights
from tokenizers import Tokenizer
exp = np.exp
# Layer norm over a single vector: normalize to zero mean / unit std, then scale by w and shift by b.
layer_norm = lambda x, w, b : (x - np.mean(x)) / np.std(x) * w + b
# Logistic sigmoid, used for gating in RWKV.
sigmoid = lambda x : 1/(1 + exp(-x))
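
A quick sanity check of these helpers (my addition, not part of the gist):

x = np.array([1.0, 2.0, 3.0])
print(layer_norm(x, 1.0, 0.0))              # ~[-1.2247, 0., 1.2247]
print(sigmoid(np.array([-1.0, 0.0, 1.0])))  # ~[0.2689, 0.5, 0.7311]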
@Artefact2
Artefact2 / README.md
Last active April 29, 2025 20:04
GGUF quantizations overview
@VictorTaelin
VictorTaelin / dps_sup_nodes.md
Last active April 20, 2025 14:33
Accelerating Discrete Program Search with SUP Nodes

Fast Discrete Program Search 2

I am investigating how to use Bend (a parallel language) to accelerate Symbolic AI; in particular, Discrete Program Search. Think of it as an alternative to LLMs, GPTs, and NNs that is also capable of generating code, but by entirely different means. This kind of approach has never been scaled with mass compute before - it wasn't possible! - but Bend changes that. So, my idea was to do it and see where it goes.

Now, while I was implementing some candidate algorithms in Bend, I realized that, rather than mass parallelism, I could use an entirely different mechanism to speed things up: SUP Nodes. It is a feature that Bend inherited from its underlying model ("Interaction Combinators") which, in simple terms, allows us to combine multiple functions into a single superposed one and apply them all to an argument "at the same time". In short, it allows us to call N functions at a fraction of the expected cost. Or, put differently: why parallelize when we can share? A toy sketch of the idea follows below.
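
To make the idea concrete, here is a toy Python sketch (my addition; `Sup` is a hypothetical class, and plain Python cannot express the sharing that Bend's interaction-combinator runtime provides - each branch is simply evaluated in turn):

# A toy "superposition": one handle holding many alternative functions,
# all applied to the same argument. This only illustrates the interface;
# it does NOT capture the sharing that makes SUP Nodes cheap in Bend.
class Sup:
    def __init__(self, branches):
        self.branches = list(branches)

    def apply(self, x):
        # Apply every superposed branch to the same argument "at once"
        # (sequentially here; shared work in Bend's runtime).
        return Sup(f(x) for f in self.branches)

candidates = Sup([lambda x: x + 1, lambda x: x * 2, lambda x: x ** 2])
print(candidates.apply(10).branches)  # [11, 20, 100]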


import asyncio
import base64
import json
import os
import pyaudio
from websockets.asyncio.client import connect
class SimpleGeminiVoice:
    def __init__(self):
@willccbb
willccbb / grpo_demo.py
Last active April 28, 2025 01:48
GRPO Llama-1B
# train_grpo.py
#
# See https://github.com/willccbb/verifiers for ongoing developments
#
import re
import torch
from datasets import load_dataset, Dataset
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import LoraConfig
from trl import GRPOConfig, GRPOTrainer