cstorey

@debasishg
debasishg / gist:8172796
Last active April 20, 2025 12:45
A collection of links for streaming algorithms and data structures

General Background and Overview

  1. Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used to implement approximation algorithms.
  2. Models and Issues in Data Stream Systems
  3. Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that revisits some of the historical perspectives and how it all began with Flajolet.
  4. Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
  5. [Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&t
@neel-krishnaswami
neel-krishnaswami / re.ml
Created November 7, 2013 12:20
Implementation of DFA-based regexp matching using Antimirov derivatives
type re = C of char | Nil | Seq of re * re | Bot | Alt of re * re | Star of re
let rec null = function
| C _ | Bot -> false
| Nil | Star _ -> true
| Alt(r1, r2) -> null r1 || null r2
| Seq(r1, r2) -> null r1 && null r2
module R = Set.Make(struct type t = re let compare = compare end)
let rmap f rs = R.fold (fun r -> R.add (f r)) rs R.empty
@oehme
oehme / JodaToJdbc4.java
Last active December 27, 2015 09:29
Joda Time to JDBC - working and fast
private static final Calendar UTC = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
private Calendar utc() {
return (Calendar) UTC.clone();
}
//unchanged
public LocalDateTime getValue(ResultSet rs, int index) throws SQLException {
Timestamp ts = rs.getTimestamp(index, utc());
@al3xandru
al3xandru / gist:7283595
Last active December 27, 2015 06:39
Ricon West 2013 : Full day streams jump list
## Day 1 - Track 1 ##
00:10:00 Pat Helland: Keystone - Between a ROC and a SOFT place
01:12:18 Lindsey Kuper: LVars: Lattice-based Data Structures for Deterministic Parallelism
02:11:54 Eric Redmond: Yokozuna!
04:10:50 Justin Shoffstall & Charlie Voiselle: The Seven-Layer Burrito; Troubleshooting a Distributed Database in Production
05:11:00 Peter Bailis: Bad as I wanna be - Coordination and Consistency in Distributed Databases
06:13:24 Joseph Blomstedt: Bringing Consistency to Riak (Part 2)
07:16:22 Lightning talks (_nb_: you **must** see @tsantero!)
@syntagmatic
syntagmatic / index.html
Last active May 31, 2016 23:27 — forked from mbecica/index.html
Hyperbolic Grid
<!DOCTYPE HTML>
<head>
<script type="text/javascript" src="http://d3js.org/d3.v3.min.js"></script>
</head>
<body>
<canvas width=1000 height=600></canvas>
<script type="text/javascript">
var canvas = d3.select("canvas").node();
var xgrid = 10,
ygrid = 10,
@tef
tef / ersatz.bibimbap.rst
Last active December 21, 2015 02:19
try not to poison yourself with this weird old tip

ersatz bibimbap

bibimbap as i make it is basically

  • rice
  • stir-fried vegetables
  • some meat in tasty spices
  • gochujang (this is fermented soy bean paste with paprika)
  • a fried egg
@mmoulton
mmoulton / README.md
Last active November 7, 2020 18:19
Docker Container Stats Collection Using Collectd

Docker stats collection for collectd

This script can be used to feed collectd with cpu and memory usage statistics for running docker containers using the collectd exec plugin.

This script will report the used and cached memory as well as the user and system cpu usage by inspecting the appropriate cgroup stat file for each running container.

Usage

This script is intended to be executed by collectd on a host with running docker containers. To use it, configure the exec plugin in collectd to execute the collectd-docker.sh script. You may need to adjust the script to match your setup, such as the mount location for cgroup.
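
As a rough illustration of the approach described above (this is not the gist's collectd-docker.sh, which is not shown here), the sketch below parses cgroup stat text and formats the values as PUTVAL lines, the plain-text protocol that the collectd exec plugin reads from a script's standard output. The function names, the `docker-<container>` plugin instance, and the mapping of "used" memory to the `rss` key are assumptions made for the example; the `cache`/`rss` and `user`/`system` keys are those found in cgroup v1 `memory.stat` and `cpuacct.stat` files.

```python
def parse_stat(text):
    """Parse cgroup stat text ("key value" per line) into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def putval_lines(host, container, mem_stat, cpu_stat, interval=10):
    """Format one PUTVAL line per metric for the collectd exec plugin.

    The identifier layout host/docker-<container>/<type>-<instance> is an
    assumption for this sketch; adjust it to your own naming scheme.
    """
    values = [
        ("memory-used", mem_stat.get("rss", 0)),      # assumption: "used" = rss
        ("memory-cached", mem_stat.get("cache", 0)),
        ("cpu-user", cpu_stat.get("user", 0)),        # cumulative ticks
        ("cpu-system", cpu_stat.get("system", 0)),
    ]
    return [
        'PUTVAL "{}/docker-{}/{}" interval={} N:{}'.format(
            host, container, name, interval, value)
        for name, value in values
    ]


if __name__ == "__main__":
    # Hypothetical sample contents standing in for a container's stat files.
    mem = parse_stat("cache 4096\nrss 8192\n")
    cpu = parse_stat("user 120\nsystem 45\n")
    for line in putval_lines("localhost", "web1", mem, cpu):
        print(line)
```

In a real deployment the stat text would be read from each running container's cgroup directory, whose location depends on where cgroup is mounted on the host, and the host name and interval would typically come from the `COLLECTD_HOSTNAME` and `COLLECTD_INTERVAL` environment variables that the exec plugin sets for the script.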

@rjhall
rjhall / SVD.scala
Last active December 20, 2015 22:08
import org.apache.commons.math3.linear._
import com.twitter.algebird.Operators._
import com.twitter.scalding._
import cascading.pipe.Pipe
import cascading.pipe.joiner.InnerJoin
import cascading.tuple.Fields
object SVD extends Serializable {