A collection of links for streaming algorithms and data structures
  1. General Background and Overview
  1. HyperLogLog and MinHash: An implementation of a form of HyperLogLog extended with the MinHash algorithm, which makes it possible to perform set intersections. "While it does require extra processing power to deal with collecting all the minima, it’s possible to get satisfactory performance out of the structure for a relatively low storage or memory footprint" (http://tech.adroll.com/blog/data/2013/07/10/hll-minhash.html)

  2. Streaming/Sketching Conference from AK Tech: Links to videos and slides from speakers such as Muthukrishnan, who spoke about the Count-Min Sketch (http://blog.aggregateknowledge.com/2013/05/23/foundation-capital-and-aggregate-knowledge-sponsor-streamingsketching-conference/)

  3. Q-digest

  1. t-digest: A new data structure for accurate on-line accumulation of rank-based statistics such as quantiles and trimmed means. Ted Dunning's variant of Q-digest, with some improvements (https://github.com/tdunning/t-digest)

  2. Implementations

  1. Count-Min Sketch
  1. Surveys
  1. Distributed Streams Algorithms for Sliding Windows by Phillip B. Gibbons and Srikanta Tirthapura (http://home.engineering.iastate.edu/~snt/pubs/tocs04.pdf)
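
To make the idea behind the HyperLogLog and MinHash entry above more concrete, here is a minimal Python sketch of plain MinHash. It is not the implementation from the linked post. The helper names (`minhash_signature`, `estimate_jaccard`, `estimate_intersection`) are hypothetical, the hash family is simulated with salted `blake2b`, and the union size is computed exactly here, whereas the linked approach would presumably take it from HyperLogLog. Input sets are assumed to be non-empty.

```python
import hashlib


def _hash(seed: int, item: str) -> int:
    """Hash `item` with a per-seed salt to simulate independent hash functions."""
    digest = hashlib.blake2b(
        item.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(items, num_hashes=256):
    """For each hash function, keep only the minimum hash value over the set."""
    items = list(items)
    return [min(_hash(seed, x) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of positions where the minima agree estimates |A ∩ B| / |A ∪ B|."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def estimate_intersection(sig_a, sig_b, union_size):
    """Scale the Jaccard estimate by the union size (assumed known or estimated)."""
    return estimate_jaccard(sig_a, sig_b) * union_size


# Two sets of 1000 items that share 500 items; the true intersection size is 500.
a = {f"user{i}" for i in range(1000)}
b = {f"user{i}" for i in range(500, 1500)}
sig_a, sig_b = minhash_signature(a), minhash_signature(b)
print(estimate_intersection(sig_a, sig_b, len(a | b)))
```

The signature size is fixed by `num_hashes` regardless of how many items are seen, which is where the low memory footprint comes from. The cost is the extra work of tracking one minimum per hash function.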