This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
// The two sweetspots are 8-bit and 4-bit tags. In the latter case you can fit 14 32-bit keys in | |
// a cacheline-sized bucket and still have one leftover byte for metadata. | |
// As for how to choose tags for particular problems, Google's Swiss table and Facebook's F14 table | |
// both use hash bits that weren't used for the bucket indexing. This is ideal from an entropy perspective | |
// but it can be better to use some particular feature of the key that you'd otherwise check against anyway. | |
// For example, for 32-bit keys (with a usable sentinel value) you could use the 8 low bits as the tag | |
// while storing the remaining 24 bits in the rest of the bucket; that fits 16 keys per bucket. Or if the keys | |
// are strings you could store the length as the discriminator: with an 8-bit tag, 0 means an empty slot, | |
// 1..254 means a string of that length, and 255 means a string of length 255 or longer. With a 4-bit tag |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
// This is another take on the mark-compact collector from https://gist.github.com/pervognsen/7fe51bef8977cb249ac4c6f830f818a5 | |
// To avoid having to do global compaction, our object indices will have two parts: a block index and a block offset. | |
// Within a block we have the same linear older-to-newer ordering by offset. But now blocks are allowed to have different ages. | |
// The block ages are defined by their position in a linked list: There's oldest_block and newest_block indices and then | |
// previous_block[block_index] for each block. This enables newest-to-oldest block iteration and the linked-list structure | |
// means that we can free an empty block by unlinking it. When a block is reused, it becomes the newest_block. Now, instead | |
// of only compacting within a block we will actually be coalescing across an age range of blocks. By doing so, we will usually | |
// be able to empty out entire blocks from the newer part of that age range, so they can be reused. This should have very similar | |
// performance characteri |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
// Linear-scan mark-compact collector for causal data structures, i.e. the reference graph is acyclic and the objects | |
// are ordered by age in the heap (which happens if you use a linear allocator) so they are automatically topologically sorted. | |
// This algorithm has very high memory-level parallelism which can be exploited by modern out-of-order processors, and should | |
// be memory throughput limited. | |
void collect(void) { | |
// Initialize marks from roots. | |
memset(marks, 0, num_nodes * sizeof(uint32_t)); | |
int newest = 0, oldest = num_nodes; | |
for (int i = 0; i < num_roots; i++) { | |
marks[roots[i]] = 1; |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
struct Object { | |
Key key; // The key is any piece of data that uniquely identifies the object. | |
// ... | |
}; | |
struct Handle { | |
Key key; | |
Index index; // This caches a speculative table index for an object with the corresponding key. | |
}; |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
A grid graph is a directed graph where the vertices are arranged in k rows each | |
containing k vertices, where k is a positive integer. Let v[i,j] denote the jth | |
vertex in the ith row. Let s = v[1,1]. A grid graph has the following vertices and edges: | |
V = {v[i,j] | 1 <= i <= k and 1 <= j <= k} | |
E = {(v[i,j], v[i,j+1]) | 1 <= i <= k and 1 <= j < k} union | |
{(v[i,j], v[i,j-1]) | 1 <= i <= k and 1 < j <= k} union | |
{(v[i,j], v[i+1,j]) | 1 <= i < k and 1 <= j <= k} |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
ValueRef add(ValueRef left, ValueRef right) { | |
// Canonical commutation: x + y => y + x (constants have negative pos and move to the left) | |
if (left.pos > right.pos) SWAP(left, right); | |
switch (left.op) { | |
case NEG: | |
// Strength reduction: (-x) + y => y - x | |
return sub(right, getleft(left)); | |
case NUM: | |
// Strength reduction: 0 + x => x | |
if (getnum(left) == 0) return right; |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
// Heavily based on ideas from https://github.com/LuaJIT/LuaJIT/blob/v2.1/src/lj_opt_fold.c | |
// The most fundamental deviation is that I eschew the big hash table and the lj_opt_fold() | |
// trampoline for direct tail calls. The biggest problem with a trampoline is that you lose | |
// the control flow context. Another problem is that there's too much short-term round-tripping | |
// of data through memory. It's also easier to do ad-hoc sharing between rules with my approach. | |
// From what I can tell, it also isn't possible to do general reassociation with LJ's fold engine | |
// since that requires non-tail recursion, so LJ does cases like (x + n1) + n2 => x + (n1 + n2) | |
// but not (x + n1) + (y + n2) => x + (y + (n1 + n2)) which is common in address generation. The | |
// code below has some not-so-obvious micro-optimizations for register passing and calling conventions, | |
// e.g. the unary_cse/binary_cse parameter order, the use of long fields in ValueRef. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# Hash, displace, and compress: http://cmph.sourceforge.net/papers/esa09.pdf | |
# This is expected linear time for any seeded hash function that acts like a random hash function (universality isn't enough). | |
# (Actually, the code as written is O(n log n) when targeting 100% load. It's O(n) when targeting any smaller load factor.) | |
# You can make keys_per_bucket higher than the default of 4 but construction time will start to increase dramatically. | |
# The paper this is based on compresses the seeds (so the fact that the algorithm tries seeds in increasing order is important) | |
# which brings the representation size close to the information-theoretical minimum. I don't do any of that here, but it could | |
# be done as a postprocess. | |
def make_perfect_hash(keys, load_factor=1.0, keys_per_bucket=4, rhash=murmurhash, max_seed=1000000): | |
m = int(len(keys) / load_factor) | |
r = int(len(keys) / keys_per_bucket) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
// The buffer is terminated with "\n\0" | |
int count(const char *p) { | |
int n = 0; | |
while (*p) { | |
while (*p != '\n') p++; | |
n++; | |
while (*p == '\n') p++; | |
} | |
return n; | |
} |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#include <stdlib.h> | |
#include <stdio.h> | |
#include <stdint.h> | |
#include <string.h> | |
const char *keywords[] = { | |
"completed_in", | |
"contributors", | |
"contributors_enabled", | |
"coordinates", |