- Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementation.
- Models and Issues in Data Stream Systems
- Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that visits some of the hostorical perspectives and how it all began with Flajolet
- Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
- [Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&t
// Process github event data exports. | |
// | |
// Go here for more info: http://www.githubarchive.org/ | |
package main | |
import ( | |
"compress/gzip" | |
"encoding/csv" | |
"encoding/json" | |
"io" |
L1 cache reference ......................... 0.5 ns
Branch mispredict ............................ 5 ns on recent CPU
L2 cache reference ........................... 7 ns 14x L1 cache
Mutex lock/unlock ........................... 25 ns
Main memory reference ...................... 100 ns 20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy ............. 3,000 ns = 3 µs
Send 2K bytes over 1 Gbps network ....... 20,000 ns = 20 µs
SSD random read ........................ 150,000 ns = 150 µs
Read 1 MB sequentially from memory ..... 250,000 ns = 250 µs 4X memory
This simple script will take a picture of a whiteboard and use parts of the ImageMagick library with sane defaults to clean it up tremendously.
The script is here:
#!/bin/bash
convert "$1" -morphology Convolve DoG:15,100,0 -negate -normalize -blur 0x1 -channel RBG -level 60%,91%,0.1 "$2"
A checklist for designing and developing internet scale services, inspired by James Hamilton's 2007 paper "On Desgining and Deploying Internet-Scale Services."
- Does the design expect failures to happen regularly and handle them gracefully?
- Have we kept things as simple as possible?
# add this code to your .bashrc file | |
# gitpending() transverses from the current directory to | |
# inspect 1 directory level deep for any git repos that have | |
# pending changes to commit. | |
function gitpending() | |
{ | |
for d in */ ; do | |
pushd $d > /dev/null | |
DIRNAME=$(basename "$d") |
* { | |
font-size: 12pt; | |
font-family: monospace; | |
font-weight: normal; | |
font-style: normal; | |
text-decoration: none; | |
color: black; | |
cursor: default; | |
} |
This project has moved to https://github.com/jonhoo/drwmutex so it can be imported into Go applications.
This list has moved to the following page:
http://anonymoushash.vmbrasseur.com/resources/negotiation/
It's joining a number of other lists of resources (management, telecommuting, etc.):
http://anonymoushash.vmbrasseur.com/resources/
Keeping them all together like this makes it much more likely I'll maintain them (as a group they'll have a larger mindshare in my cluttered & busy brain) rather than fire & forget a list which is never updated again.
/* | |
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS | |
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT | |
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR | |
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT | |
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, | |
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT | |
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, | |
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY | |
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT |