Vaquar Khan vaquarkhan

General Background and Overview

Probabilistic Data Structures for Web Analytics and Data Mining : An overview of probabilistic data structures and how they are used to implement approximation algorithms.
Models and Issues in Data Stream Systems
Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that revisits the history of the field and how it all began with Flajolet.
Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
[Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep

Sessions

Code	Title	Duration	Link
Keynote	Andy Jassy Keynote Announcement Recap	0:01	https://www.youtube.com/watch?v=TZCxKAM2GtQ
Keynote	AWS re:Invent 2016 Keynote: Andy Jassy	2:22	https://www.youtube.com/watch?v=8RrbUyw9uSg
Keynote	AWS re:Invent 2016 Keynote: Werner Vogels	2:16	https://www.youtube.com/watch?v=ZDScBNahsL4
Keynote	[Tuesday Night Live with Jame

Resources

The introduction to Reactive Programming you've been missing

(by @andrestaltz)

This tutorial as a series of videos

If you prefer to watch video tutorials with live-coding, then check out this series I recorded with the same contents as in this article: Egghead.io - Introduction to Reactive Programming.

import boto3

client = boto3.client('glue')

response = client.create_crawler(
    Name='SalesCSVCrawler',
    Role='AWSGlueServiceRoleDefault',
    DatabaseName='sales-cvs',
    Description='Crawler for generated Sales schema',

	import org.apache.spark.graphx._
	import org.apache.spark.rdd.RDD

	case class Peep(name: String, age: Int)

	val vertexArray = Array(
	(1L, Peep("Kim", 23)),
	(2L, Peep("Pat", 31)),
	(3L, Peep("Chris", 52)),
	(4L, Peep("Kelly", 39)),

	// load error messages from a log into memory
	// then interactively search for various patterns

	// base RDD
	val lines = sc.textFile("log.txt")

	// transformed RDDs
	val errors = lines.filter(_.startsWith("ERROR"))
	val messages = errors.map(_.split("\t")).map(r => r(1))
	messages.cache()
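
As a rough illustration of what the two transformations above compute, here is a minimal plain-Python sketch of the same filter-then-extract steps, followed by a simple pattern search. It does not use Spark. The sample log lines, the function name `error_messages`, and the search term are made up for illustration, and it assumes each log line is tab-separated with the message in the second field.

```python
# Plain-Python sketch of the log-mining steps (no Spark involved).
# Assumption: each line is tab-separated and field 1 holds the message.

def error_messages(lines):
    """Keep lines starting with "ERROR" and return their second tab-separated field."""
    errors = [line for line in lines if line.startswith("ERROR")]
    return [line.split("\t")[1] for line in errors]


# Hypothetical sample data standing in for log.txt.
sample_lines = [
    "ERROR\tmysql connection refused\t2016-01-01",
    "INFO\tjob started\t2016-01-01",
    "ERROR\tphp timeout\t2016-01-02",
]

messages = error_messages(sample_lines)

# Search the extracted messages for a pattern and count the matches.
mysql_count = sum(1 for m in messages if "mysql" in m)
print(messages)
print(mysql_count)
```

In Spark the corresponding `filter` and `map` calls are lazy transformations: nothing is computed until an action such as `count()` runs, and `cache()` keeps the resulting RDD in memory so repeated searches do not re-read the file.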

	import boto3

	client = boto3.client(
	    'emr',
	    region_name='eu-west-1'
	)

	cmd = "hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar wordcount file:///etc/services /output"

	emrcluster = client.run_job_flow(

	<dependency>
	<groupId>io.springfox</groupId>
	<artifactId>springfox-swagger2</artifactId>
	<version>2.9.2</version>
	<scope>compile</scope>
	</dependency>
	<dependency>
	<groupId>io.springfox</groupId>
	<artifactId>springfox-swagger-ui</artifactId>
	<version>2.9.2</version>