Skip to content

Instantly share code, notes, and snippets.

View lwf's full-sized avatar

Torbjörn Norinder lwf

View GitHub Profile
@pervognsen
pervognsen / shift_dfa.md
Last active April 21, 2025 19:59
Shift-based DFAs

A traditional table-based DFA implementation looks like this:

uint8_t table[NUM_STATES][256]

uint8_t run(const uint8_t *start, const uint8_t *end, uint8_t state) {
    for (const uint8_t *s = start; s != end; s++)
        state = table[state][*s];
    return state;
}

(Originally written as a reply to an HN submission of this article: https://www.cs.virginia.edu/~lat7h/blog/posts/434.html)

There's a simple recipe for arithmetically encoding recursive algebraic data types (in the functional programming sense) which is related to this.

What you might have seen is Goedel numbering where a finite sequence of natural numbers a_0, a_1, ..., a_n (where n isn't fixed but can vary per sequence) is mapped bijectively onto p_0^a_0 a_1^p_1 ... a_n^p_n where p_0, p_1, ... is an enumeration of the primes.

However, if you want to represent trees instead of sequences, you have a better, simpler option. The key is the existence of a bijective pairing function between N^2 and N, which you can write as <m, n> for m, n in N.

You have a lot of choices for how to construct the pairing function. But a curious fact is that there is essentially one polynomial pairing function and it's the one you saw in class when you learned that the rationals are countable: https://en.wikipedia.org/wiki/Fuet

// Linear-scan mark-compact collector for causal data structures, i.e. the reference graph is acyclic and the objects
// are ordered by age in the heap (which happens if you use a linear allocator) so they are automatically topologically sorted.
// This algorithm has very high memory-level parallelism which can be exploited by modern out-of-order processors, and should
// be memory throughput limited.
void collect(void) {
// Initialize marks from roots.
memset(marks, 0, num_nodes * sizeof(uint32_t));
int newest = 0, oldest = num_nodes;
for (int i = 0; i < num_roots; i++) {
marks[roots[i]] = 1;
@pervognsen
pervognsen / dbg-test.el
Last active April 23, 2023 05:27
dbg.el
(require 'dbg)
(global-set-key (kbd "<C-S-f5>") 'dbg-restart)
(global-set-key (kbd "<f5>") 'dbg-continue)
(global-set-key (kbd "<f9>") 'dbg-toggle-breakpoint)
(global-set-key (kbd "<f8>") 'dbg-watch)
(global-set-key (kbd "<f10>") 'dbg-next)
(global-set-key (kbd "<C-f10>") 'dbg-continue-to-here)
(global-set-key (kbd "<f11>") 'dbg-step)
(global-set-key (kbd "<C-S-f10>") 'dbg-jump)

My basic point is that currently, configuration management code manifests as a giant, unverifiable pile of mud. The languages we use lack types and are weak at making non-runtime assertions. With the modicum of sanity that a proper module system and types can bring to the table, we would be considerably better off.

@aphyr
aphyr / results
Created September 7, 2013 20:23
; Probability of a read seeing a collision for various poisson rates
{1 1.317905310489196E-7,
10 5.749661865748004E-6,
100 5.303513799265161E-5,
1000 5.090228791739002E-4,
10000 0.005040357033134383,
100000 0.04925138356052143,
1000000 0.4179179486964119}
@JonnieCache
JonnieCache / spacewar.lst
Created December 11, 2012 12:11
Spacewar PDP1 Source Code
-/macro fio-dec system, june 1963
007652 640500 szm=sza sma-szf
007652 650500 spq=szm i
007652 761200 clc=cma+cla-opr
- define senseswitch A
- repeat 3, A=A+A
- szs A
- term
- define init A,B
- law B
@erikfrey
erikfrey / merge_ldb.cpp
Created September 3, 2012 17:17
merge leveldbs
/*
* given a list of input leveldb's, merge them into a single output leveldb
*
*/
#include <iostream>
#include <algorithm>
#include <boost/program_options.hpp>
#include <boost/cstdint.hpp>