Elie A. eliasah

General Background and Overview

Probabilistic Data Structures for Web Analytics and Data Mining: An overview of probabilistic data structures and how they are used to implement approximation algorithms.
Models and Issues in Data Stream Systems
Philippe Flajolet’s contribution to streaming algorithms: A presentation by Jérémie Lumbroso that revisits some of the historical background and how it all began with Flajolet.
Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani: One of the early papers on the subject.
[Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&t

Elasticsearch was created in 2010 by Shay Banon after forgoing work on another search solution, Compass, also built on Lucene and created in 2004.

Link to the Elasticsearch Blog.

	import java.io.*;
	import java.util.*;

	/** This class encapsulates an implementation of the Apriori algorithm
	* to compute frequent itemsets.
	*
	* Datasets contain integers (>= 0) separated by spaces, one transaction per line, e.g.
	* 1 2 3
	* 0 9
	* 1 9
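
The dataset format described in the comment above (one transaction per line, non-negative integers separated by spaces) can be read with a few lines of Java. The following is a minimal sketch, not the original class: the names `TransactionSupport`, `parse`, and `countSupport` are hypothetical, and the code only counts how many transactions contain each single item, which corresponds to the first pass of Apriori rather than the full algorithm.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper, not part of the original gist.
final class TransactionSupport {

    /** Parses one transaction per line; items are non-negative integers separated by whitespace. */
    static List<Set<Integer>> parse(String data) {
        List<Set<Integer>> transactions = new ArrayList<>();
        for (String line : data.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            Set<Integer> items = new LinkedHashSet<>();
            for (String token : trimmed.split("\\s+")) {
                int item = Integer.parseInt(token);
                if (item < 0) {
                    throw new IllegalArgumentException("negative item: " + item);
                }
                items.add(item); // a set ignores repeated items within a transaction
            }
            transactions.add(items);
        }
        return transactions;
    }

    /** Counts, for each item, the number of transactions that contain it. */
    static Map<Integer, Integer> countSupport(List<Set<Integer>> transactions) {
        Map<Integer, Integer> support = new TreeMap<>();
        for (Set<Integer> transaction : transactions) {
            for (int item : transaction) {
                support.merge(item, 1, Integer::sum);
            }
        }
        return support;
    }

    public static void main(String[] args) {
        // The three sample transactions from the comment above.
        String sample = "1 2 3\n0 9\n1 9\n";
        System.out.println(countSupport(parse(sample))); // prints {0=1, 1=2, 2=1, 3=1, 9=2}
    }
}
```

In a full Apriori run, items whose count falls below a chosen minimum support would be discarded here, and the surviving items would be combined into candidate pairs for the next pass.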

	import mesosphere.mesos.util.FrameworkInfo
	import org.apache.mesos.MesosSchedulerDriver


	/**
	* @author Tobi Knaup
	*/

	object Main extends App {

	-module(elasticsearch).
	-export([autocomplete/1, autocomplete/0,
	search/1, search/0,
	start/0,
	stop/0,
	loop/1]).
	-on_load(start/0).

	%% Return an autocomplete request
	autocomplete() ->

	package demo;

	import java.io.IOException;
	import java.util.StringTokenizer;

	import org.apache.hadoop.conf.Configuration;
	import org.apache.hadoop.conf.Configured;
	import org.apache.hadoop.fs.Path;
	import org.apache.hadoop.io.LongWritable;
	import org.apache.hadoop.io.MapWritable;

	curl -XPUT 'http://localhost:9200/us/user/1?pretty=1' -d '
	{
	"email" : "[email protected]",
	"name" : "John Smith",
	"username" : "@john"
	}
	'

	curl -XPUT 'http://localhost:9200/gb/user/2?pretty=1' -d '
	{

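	library(ggplot2)
	library(caret)  # assumption: GermanCredit is the dataset shipped with caret
	data(GermanCredit)

	# Figure 1: proportion of observations in each class
	ggplot(GermanCredit, aes(x = Class)) +
	  geom_bar(aes(y = after_stat(count / sum(count)))) +
	  labs(y = "prob.")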