Jaap ter Woerds jaapterwoerds

General Background and Overview

Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementation.
Models and Issues in Data Stream Systems
Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that visits some of the hostorical perspectives and how it all began with Flajolet
Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
[Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&t

Akka Cluster Implementation Notes

Slightly disorganized but reasonably complete notes on the algorithms, strategies and optimizations of the Akka Cluster implementation. Could use a lot more links and context etc., but was just written for my own understanding. Might be expanded later.

Links to papers and talks that have inspired the implementation can be found on the 10 last pages of this presentation.

Akka Gossip

Gossip state

This is the Gossip state representation:

Overview

There are lots of representations for strings. In most languages they pick one set of tradeoffs and run with it. In haskell the "default" implementation (at least the one in the prelude) is a pretty bad choice, but unlike most other languages (really) good implementations exist for pretty much every way you can twist these things. This can be a good thing, but it also leads to confusion, and frustration to find the right types and how to convert them.

	{-# LANGUAGE TypeFamilies #-}

	import Data.Function (on)
	import Control.Applicative

	data EventData e = EventData {
	eventId :: Int,
	body :: Event e
	}

	#!/usr/bin/env python
	from optparse import OptionParser
	from brod.zk import *
	import pickle
	import struct
	import socket
	import sys
	import time