- Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementation.
- Models and Issues in Data Stream Systems
- Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that revisits some of the historical perspectives and how it all began with Flajolet.
- Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
- [Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&t
| #!/bin/bash | |
| # The script does automatic checking on a Go package and its sub-packages, including: | |
| # 1. gofmt (http://golang.org/cmd/gofmt/) | |
| # 2. goimports (https://github.com/bradfitz/goimports) | |
| # 3. golint (https://github.com/golang/lint) | |
| # 4. go vet (http://golang.org/cmd/vet) | |
| # 5. race detector (http://blog.golang.org/race-detector) | |
| # 6. test coverage (http://blog.golang.org/cover) | |
| set -e |
| # Install Arch Linux with an encrypted file system and UEFI | |
| # The official installation guide (https://wiki.archlinux.org/index.php/Installation_Guide) contains a more verbose description. | |
| # Download the archiso image from https://www.archlinux.org/ | |
| # Copy to a usb-drive | |
| dd if=archlinux.img of=/dev/sdX bs=16M && sync # on linux | |
| # Boot from the usb. If the usb fails to boot, make sure that secure boot is disabled in the BIOS configuration. | |
| # Set Swedish keymap |
| package main | |
| import ( | |
| "fmt" | |
| "log" | |
| "sort" | |
| "strconv" | |
| "strings" | |
| "unicode" | |
| ) |
Each of these commands will run an ad hoc http static server in your current (or specified) directory, available at http://localhost:8000. Use this power wisely.
$ python -m SimpleHTTPServer 8000
| #!/usr/bin/ruby | |
| # Convert a Markdown README to HTML with Github Flavored Markdown | |
| # Github and Pygments styles are included in the output | |
| # | |
| # Requirements: json gem (`gem install json`) | |
| # | |
| # Input: STDIN or filename | |
| # Output: STDOUT | |
| # Arguments: "-c" to copy to clipboard (or "| pbcopy"), or "> filename.html" to output to a file | |
| # cat README.md | flavor > README.html |
These steps show two less common interactions with git that extract a single file located inside a subfolder of a git repository. They essentially reduce the repository to just the desired files, so they should be performed on a copy of the original repository (1.).
First, the repository is reduced to just the subfolder containing the files in question using git filter-branch --subdirectory-filter (2.), which is a useful step by itself if only a subfolder needs to be extracted. This step moves the desired files to the top level of the repository.
Finally, all remaining files are listed using git ls-files, the files to keep are removed from that list using grep -v, and the resulting list is passed to git rm, which is invoked by git filter-branch --index-filter (3.). A bit convoluted, but it does the trick.
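
The three steps above can be sketched as a small shell function. This is a minimal sketch, not the original commands: the function name `extract_file`, its argument order, and the `KEEP_FILE` variable are made up for illustration, and it rewrites only the currently checked-out branch. It also assumes file names without newlines or other characters that git would quote in its listing.

```shell
#!/bin/sh
# extract_file SRC_REPO SUBDIR FILE DEST  (hypothetical helper, for illustration only)
# The body runs in a subshell so the caller's working directory is unchanged.
extract_file() (
    src=$1; subdir=$2; dest=$4
    # Exported so the index filter, which git evaluates in its own shell, can read it.
    KEEP_FILE=$3; export KEEP_FILE
    # Newer git versions print a warning and pause before filter-branch runs;
    # this environment variable skips that.
    FILTER_BRANCH_SQUELCH_WARNING=1; export FILTER_BRANCH_SQUELCH_WARNING

    # (1.) Work on a copy, since history is rewritten destructively.
    git clone -q "$src" "$dest" || exit 1
    cd "$dest" || exit 1

    # (2.) Make SUBDIR the new repository root.
    git filter-branch --subdirectory-filter "$subdir" HEAD || exit 1

    # (3.) In every commit, list all files, drop the one to keep from the list,
    # and remove the rest from the index. -f overwrites the backup refs that
    # step (2.) left in refs/original.
    git filter-branch -f --index-filter '
        git ls-files | grep -vxF -- "$KEEP_FILE" |
        while IFS= read -r f; do
            git rm --cached -q --ignore-unmatch -- "$f"
        done
    ' HEAD
)
```

Called as `extract_file /path/to/repo some/subfolder wanted.txt /tmp/extracted`, this should leave a repository in `/tmp/extracted` whose history contains only `wanted.txt` at the top level. The loop calls git rm once per file, which is slower than passing the whole list to a single git rm but avoids failing when the list is empty.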