- Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementation.
- Models and Issues in Data Stream Systems
- Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that revisits the historical background and how it all began with Flajolet.
- Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
- [Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&t
using DataFrames
using Dates # I am on 0.3
# Note the quoting style and the custom time-style
# sed is used to remove softlinks "dir" -> "../dir"
cmd = `ls -1 -l --quoting-style=c --time-style='+%Y-%m-%d_%H:%M'` |> `sed 's/ -> ".*"$//g'`
df = open(cmd, "r", STDOUT) do io
readtable(io, header=false,
separator=' ',
Data on DNA variation, such as single nucleotide polymorphisms (SNPs), can be very large, involving thousands or millions of SNPs measured on potentially thousands of individuals. Typical genotyping platforms examine from 50K (K = thousand) to 2.5M (M = million) SNPs, and some platforms are even denser. Each position carries two nucleotides (each one of A, C, G, or T), one on each chromosome. If the genotyping read is not sufficiently good, a missing value may be recorded on one or both chromosomes for that position/SNP. A frequently used recoding of the nucleotide data replaces the characters (i.e., alleles) with the count of the allele that has the lower frequency in the sample, or with the count of a pre-specified allele as determined by the genotyping platform and software. Thus, instead of storing a pair of nucleotides (e.g., AA, AG, GG), researchers store the individual’s genotype as 0, 1, 2, or NA. In thi
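The recoding described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function names `minor_allele` and `recode` are hypothetical, a genotype is assumed to be a two-character string, a missing genotype is assumed to be represented by `None` (standing in for NA), and ties in allele frequency are broken alphabetically, which is an arbitrary choice made here.

```python
from collections import Counter

MISSING = None  # assumption: NA is represented as None in this sketch


def minor_allele(genotypes):
    """Return the allele with the lowest count among observed genotypes."""
    counts = Counter(
        allele
        for g in genotypes
        if g is not MISSING
        for allele in g
    )
    if not counts:
        return None  # every genotype is missing
    # Sort key (count, allele) makes ties deterministic (alphabetical).
    return min(counts, key=lambda a: (counts[a], a))


def recode(genotypes):
    """Recode one SNP's genotypes as minor-allele counts: 0, 1, 2, or NA."""
    minor = minor_allele(genotypes)
    return [MISSING if g is MISSING else g.count(minor) for g in genotypes]


# One SNP measured on five individuals; the fourth read is missing.
sample = ["AA", "AG", "GG", None, "AA"]
coded = recode(sample)
print(coded)
```

Here allele G appears three times and allele A five times, so G is counted and each genotype becomes the number of G copies it carries. Recoding against a platform-specified allele instead would only require passing that allele in rather than computing it from the sample.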
import jpl.Atom;
import jpl.Compound;
import jpl.JPL;
import jpl.Query;
import jpl.Term;
/**
* This class shows how to configure Logtalk in SWI or YAP using the Jpl library.
* You need to have the jpl jar in your classpath to compile and execute this file.
function mycd()
{
#if this directory is writable then write to directory-based history file
#otherwise write history in the usual home-based history file
tmpDir=$PWD
# record a timestamp line, then a note about the directory change
echo "#$(date '+%s')" >> "$HISTFILE"
echo "$USER has exited $PWD for $*" >> "$HISTFILE"
builtin cd "$@" # do actual cd
if [ -w "$PWD" ]; then export HISTFILE="$PWD/.dir_bash_history"; touch "$HISTFILE"; chmod --silent 777 "$HISTFILE";
else export HISTFILE="$HOME/.bash_history";