Skip to content

Instantly share code, notes, and snippets.

@debasishg
debasishg / gist:8172796
Last active October 19, 2025 00:47
A collection of links for streaming algorithms and data structures

General Background and Overview

  1. Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementation.
  2. Models and Issues in Data Stream Systems
  3. Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that visits some of the hostorical perspectives and how it all began with Flajolet
  4. Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
  5. [Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&t

Moved

Now located at https://github.com/JeffPaine/beautiful_idiomatic_python.

Why it was moved

Github gists don't support Pull Requests or any notifications, which made it impossible for me to maintain this (surprisingly popular) gist with fixes, respond to comments and so on. In the interest of maintaining the quality of this resource for others, I've moved it to a proper repo. Cheers!

@pmeenan
pmeenan / user-timing-rum.js
Last active January 18, 2024 23:46
Support routine for adding W3C user timing events to a site. Includes some basic polyfill support for browsers that don't support user timing or navigation timing (though the start time for non-navigation timing support could be improved with IE < 9 to use IE's custom start event).
// Support routines for automatically reporting user timing for common analytics platforms
// Currently supports Google Analytics, Boomerang and SOASTA mPulse
// In the case of boomerang, you will need to map the event names you want reported
// to timer names (for mPulse these need to be custom0, custom1, etc) using a global variable:
// rumMapping = {'aft': 'custom0'};
(function() {
var wtt = function(n, t, b) {
t = Math.round(t);
if (t >= 0 && t < 3600000) {
// Google Analytics
{
"IAB1": "Arts & Entertainment",
"IAB1-1": "Books & Literature",
"IAB1-2": "Celebrity Fan/Gossip",
"IAB1-3": "Fine Art",
"IAB1-4": "Humor",
"IAB1-5": "Movies",
"IAB1-6": "Music",
"IAB1-7": "Television",
"IAB2": "Automotive",
@koto
koto / xssdetect.js
Created December 1, 2012 22:05
reflected xss detection using xssauditor on phantomjs
var page = require('webpage').create(),
system = require('system'),
address;
page.onInitialized = function () {
page.evaluate(function () {
// additional detection code here perhaps
// f.e. detecting STORED/DOM XSS
});
@anacrolix
anacrolix / striter.py
Created September 26, 2012 14:35
A Python IO class wrapping an iterable of strings.
import io
class StringIteratorIO(io.TextIOBase):
def __init__(self, iter):
self._iter = iter
self._left = ''
def readable(self):
return True
@robertpitt
robertpitt / socks5.js
Created July 30, 2012 01:30
SOCKS5 Server as per rfc1928 (nodejs)
/**
* SOCKS5 Proxy as per RFC1928
* @see : http://www.ietf.org/rfc/rfc1928.txt
*/
/**
* Load Dependancies
*/
var net = require('net'),
@hellerbarde
hellerbarde / latency.markdown
Created May 31, 2012 13:16 — forked from jboner/latency.txt
Latency numbers every programmer should know

Latency numbers every programmer should know

L1 cache reference ......................... 0.5 ns
Branch mispredict ............................ 5 ns
L2 cache reference ........................... 7 ns
Mutex lock/unlock ........................... 25 ns
Main memory reference ...................... 100 ns             
Compress 1K bytes with Zippy ............. 3,000 ns  =   3 µs
Send 2K bytes over 1 Gbps network ....... 20,000 ns  =  20 µs
SSD random read ........................ 150,000 ns  = 150 µs

Read 1 MB sequentially from memory ..... 250,000 ns = 250 µs

@TooTallNate
TooTallNate / repl-client.js
Created March 26, 2012 20:09
Running a "full-featured" REPL using a net.Server and net.Socket
var net = require('net')
var sock = net.connect(1337)
process.stdin.pipe(sock)
sock.pipe(process.stdout)
sock.on('connect', function () {
process.stdin.resume();
process.stdin.setRawMode(true)
@rafl
rafl / elasticsearch_cache
Created March 22, 2012 16:32
Munin ElasticSearch plugins
#!/usr/bin/env perl
# Parameters supported:
#
# config
# autoconf
#
# Magic markers:
#%# family=auto
#%# capabilities=autoconf