Vaquar Khan vaquarkhan

vaquarkhan / gist:ff4d29b3ccb1e03e19fe91eb7e7117c7
Created February 20, 2017 03:03 — forked from debasishg/gist:8172796
A collection of links for streaming algorithms and data structures
  1. General Background and Overview
vaquarkhan / 00.graphx.scala
Created May 22, 2017 03:42 — forked from ceteri/00.graphx.scala
Spark GraphX demo
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD
case class Peep(name: String, age: Int)
val vertexArray = Array(
  (1L, Peep("Kim", 23)),
  (2L, Peep("Pat", 31)),
  (3L, Peep("Chris", 52)),
  (4L, Peep("Kelly", 39)),
vaquarkhan / log.scala
Created May 22, 2017 03:43 — forked from ceteri/log.scala
Intro to Apache Spark: code example for RDD animation
// load error messages from a log into memory
// then interactively search for various patterns
// base RDD
val lines = sc.textFile("log.txt")
// transformed RDDs
val errors = lines.filter(_.startsWith("ERROR"))
val messages = errors.map(_.split("\t")).map(r => r(1))
messages.cache()
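To make the three transformations above concrete, here is a minimal plain-Python sketch of the same filter, split, and select steps. It does not use Spark; the sample log lines and the `error_messages` helper are hypothetical, invented only for illustration, and the tab-separated layout is assumed from the `split("\t")` call in the preview.

```python
# Plain-Python sketch of the filter / split / select pipeline; no Spark needed.
# The sample lines below are hypothetical, not taken from any real log.txt.
sample_lines = [
    "ERROR\tdisk full\tnode-1",
    "WARN\tslow response\tnode-2",
    "ERROR\ttimeout\tnode-3",
    "INFO\tstarted\tnode-1",
]


def error_messages(lines):
    """Keep lines that start with ERROR, split on tabs, return the second field."""
    errors = [line for line in lines if line.startswith("ERROR")]
    return [line.split("\t")[1] for line in errors]


messages = error_messages(sample_lines)
print(messages)
```

Unlike the RDD version, this runs eagerly on an in-memory list. In Spark the transformations are lazy, and `cache()` only marks `messages` to be kept in memory once an action first computes it.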
vaquarkhan / boto3_emr_create_cluster_with_wordcount_step.py
Created June 9, 2019 15:41 — forked from ruanbekker/boto3_emr_create_cluster_with_wordcount_step.py
Create EMR Cluster with a Wordcount Job as a Step in Boto3
import boto3
client = boto3.client(
    'emr',
    region_name='eu-west-1'
)
cmd = "hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar wordcount file:///etc/services /output"
emrcluster = client.run_job_flow(
vaquarkhan / introrx.md
Created July 3, 2019 13:23 — forked from staltz/introrx.md
The introduction to Reactive Programming you've been missing
vaquarkhan / aws_glue_boto3_example.md
Created August 26, 2019 05:50 — forked from ejlp12/aws_glue_boto3_example.md
AWS Glue Create Crawler, Run Crawler and update Table to use "org.apache.hadoop.hive.serde2.OpenCSVSerde"
import boto3

client = boto3.client('glue')

response = client.create_crawler(
    Name='SalesCSVCrawler',
    Role='AWSGlueServiceRoleDefault',
    DatabaseName='sales-cvs',
    Description='Crawler for generated Sales schema',
<dependency>
    <groupId>io.springfox</groupId>
    <artifactId>springfox-swagger2</artifactId>
    <version>2.9.2</version>
    <scope>compile</scope>
</dependency>
<dependency>
    <groupId>io.springfox</groupId>
    <artifactId>springfox-swagger-ui</artifactId>
    <version>2.9.2</version>