@nelhage
nelhage / pyperf.md
Created February 11, 2025 19:38
LTO+PGO pyperformance run

Benchmarks with tag 'apps':

| Benchmark | 2025-02-11_15-23-computed-goto-247b50dec8af | 2025-02-11_16-19-tailcall-10c138c179ba | 2025-02-11_16-15-computed-goto-nomerge-3178d094669b |
|---|:---:|:---:|:---:|
| 2to3 | 295 ms | 288 ms: 1.02x faster | 293 ms: 1.01x faster |
| docutils | 2.86 sec | 2.83 sec: 1.01x faster | not significant |
| html5lib | 64.3 ms | 63.9 ms: 1.01x faster | 64.9 ms: 1.01x slower |
| Geometric mean | (ref) | 1.01x faster | 1.00x slower |

Benchmarks with tag 'apps':

| Benchmark | goto | tail-call | goto-mine |
|---|:---:|:---:|:---:|
| 2to3 | 345 ms | 331 ms: 1.04x faster | 340 ms: 1.02x faster |
| docutils | 3.27 sec | 3.22 sec: 1.02x faster | 3.26 sec: 1.00x faster |
| html5lib | 72.6 ms | 69.6 ms: 1.04x faster | 73.6 ms: 1.01x slower |
| Geometric mean | (ref) | 1.03x faster | 1.00x faster |
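
The "Geometric mean" rows can be roughly cross-checked from the per-benchmark times. The sketch below assumes the summary is the geometric mean of the reference/candidate time ratios, and it uses the rounded values printed in the tables, so it may differ from the tool's own figure in later digits. The helper name `speedup_geomean` is hypothetical and only used for illustration.

```python
from statistics import geometric_mean

# (reference time, tail-call time) pairs copied from the two tables above,
# in the units printed there (ms for 2to3 and html5lib, sec for docutils).
first_run = {"2to3": (295, 288), "docutils": (2.86, 2.83), "html5lib": (64.3, 63.9)}
second_run = {"2to3": (345, 331), "docutils": (3.27, 3.22), "html5lib": (72.6, 69.6)}


def speedup_geomean(times):
    """Geometric mean of reference/candidate ratios; a value above 1 means faster."""
    return geometric_mean(ref / new for ref, new in times.values())


first = speedup_geomean(first_run)
second = speedup_geomean(second_run)
print(f"first table, tail-call:  {first:.2f}x")
print(f"second table, tail-call: {second:.2f}x")
```

With the rounded inputs this gives about 1.01x and 1.03x for the tail-call columns, in line with the summary rows.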
nelhage / gist:0ad24619e4a0bc4e407d05da0db00125
Created February 10, 2025 03:50
Exploring benchmarks for CPython's tail-call interpreter
❯ PYTHONHASHSEED=11 taskset -c 0-11 perf stat $tailcall -u venv/lib/python3.12/site-packages/pyperformance/data-files/benchmarks/bm_unpack_sequence/run_benchmark.py tuple -p1 -n1 -l 5000
.
unpack_sequence_tuple: 22.0 ns
Performance counter stats for 'venv/cpython3.14-fbced5d02230-compat-af96b9431081/bin/python -u venv/lib/python3.12/site-packages/pyperformance/data-files/benchmarks/bm_unpack_sequence/run_benchmark.py tuple -p1 -n1 -l 5000':
244.38 msec task-clock:u # 0.998 CPUs utilized
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
6,340 page-faults:u # 25.943 K/sec
#!/bin/bash
# Create a temporary file to store the input
tempfile=$(mktemp)
trap 'rm -f "$tempfile"' EXIT
# Read from stdin into the temporary file
cat > "$tempfile"
# Get the preferred editor, defaulting to vim if none is set
# before
def my_decorator(f):
    def wrapper(*args, **kwds):
        print('Calling decorated function')
        return f(*args, **kwds)
    return wrapper
# but why not
def my_decorator(f)(*args, **kwds):
    print('Calling decorated function')
import pty
import time
import os
import termios
import signal
import sys
import traceback
import threading
if len(sys.argv) > 1:
package main
var x interface{ f() }
type A struct {
	fp func()
}
func (a A) f() { a.fp() }
#!/usr/bin/env python
import os
import time
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
INTERVAL = 1
COMM_SIZE = (10,)
nelhage / llama.json
Last active January 3, 2021 18:34
Llama CF template
{
  "Parameters": {
    "ObjectStoreBucket": {
      "Type": "String",
      "Description": "A pre-existing S3 bucket to use for llama's object store"
    },
    "ObjectStorePrefix": {
      "Type": "String",
      "Description": "A prefix in $ObjectStoreBucket under which to store objects",
      "Default": "/",
# BoringSSL build times, Pixelbook
# `make -j5`, local compiler
real 2m40.180s
user 8m37.189s
sys 1m21.565s
# `ninja -j6`, local
real 2m35.987s
user 7m53.337s
sys 1m14.439s