
soumyasd / RedisSortedSetQueryByTimeDemo.java
Created June 23, 2013 18:22
Simple Redis ZSET Demo with Dates
package redis;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;
import redis.clients.jedis.JedisPoolConfig;
import java.util.Calendar;
import java.util.Date;
import java.util.Set;
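
The preview stops at the imports. A minimal Scala sketch of the pattern they point at, scoring ZSET members by epoch milliseconds and then querying a time window (the key name, members, and localhost pool are assumptions, not from the gist):

// Score each member with its timestamp so the ZSET orders entries by time.
val pool = new JedisPool(new JedisPoolConfig(), "localhost")
val jedis = pool.getResource
val cal = Calendar.getInstance
jedis.zadd("events", cal.getTimeInMillis.toDouble, "login")
cal.add(Calendar.DATE, -1)
jedis.zadd("events", cal.getTimeInMillis.toDouble, "signup")
// Everything from yesterday onward: ZRANGEBYSCORE over [yesterday, +inf).
val recent: java.util.Set[String] =
  jedis.zrangeByScore("events", cal.getTimeInMillis.toDouble, Double.MaxValue)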
{
  "coordinates": null,
  "created_at": "Thu Oct 21 16:02:46 +0000 2010",
  "favorited": false,
  "truncated": false,
  "id_str": "28039652140",
  "entities": {
    "urls": [
      {
        "expanded_url": null,
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD
import org.apache.spark.graphx.lib._
/*
 * Alice, Dave, and Bob are share traders.
 * Each maintains at least one trading account at their bank.
 * Trading accounts hold shares, which are bought/sold in transactions.
*
* E.g.
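
The example is cut off in the preview. A hedged GraphX sketch of the model the comment describes, with made-up ids, names, and edge labels, assuming an existing SparkContext sc:

// Traders and their trading accounts as vertices; ownership and
// share transactions as labeled edges.
val vertices: RDD[(VertexId, String)] = sc.parallelize(Seq(
  (1L, "Alice"), (2L, "Bob"), (3L, "Dave"),
  (11L, "AliceAcct"), (21L, "BobAcct"), (31L, "DaveAcct")))
val edges: RDD[Edge[String]] = sc.parallelize(Seq(
  Edge(1L, 11L, "owns"), Edge(2L, 21L, "owns"), Edge(3L, 31L, "owns"),
  Edge(11L, 21L, "sold 100 shares to"), Edge(31L, 11L, "sold 50 shares to")))
val graph: Graph[String, String] = Graph(vertices, edges)
graph.triplets.collect().foreach(t => println(s"${t.srcAttr} -[${t.attr}]-> ${t.dstAttr}"))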
package uk.ac.ucl.cs.GI15.timNancyKawal {
// A trie node keyed by an optional Char; the root carries no key.
class Trie[V](key: Option[Char]) {
  def this() = this(None)
import scala.collection.Seq
import scala.collection.immutable.TreeMap
import scala.collection.immutable.WrappedString
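
The preview breaks off after the auxiliary constructor. The TreeMap and WrappedString imports hint that children live in a sorted map and that plain Strings get passed where a Seq[Char] is expected; a hypothetical completion along those lines:

// Hypothetical completion: each node stores an optional value and its
// children in a TreeMap keyed by the next character.
class Trie[V](key: Option[Char]) {
  def this() = this(None)

  private var value: Option[V] = None
  private var children = TreeMap.empty[Char, Trie[V]]

  def put(word: Seq[Char], v: V): Unit = word match {
    case Seq() => value = Some(v)
    case Seq(c, rest @ _*) =>
      val child = children.getOrElse(c, new Trie[V](Some(c)))
      children += (c -> child)
      child.put(rest, v)
  }

  def get(word: Seq[Char]): Option[V] = word match {
    case Seq() => value
    case Seq(c, rest @ _*) => children.get(c).flatMap(_.get(rest))
  }
}

With this shape, new Trie[Int]().put("cat", 1) compiles as-is, since Predef wraps the String into a WrappedString (a Seq[Char]).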
soumyasd / 0.setup.sh
Last active August 29, 2015 14:07 — forked from ceteri/0.setup.sh
# concatenate three part files into "minitweets"
cat rawtweets/part-0000[1-3] > minitweets
# change log4j properties to WARN to reduce noise during demo
mv conf/log4j.properties.template conf/log4j.properties
vim conf/log4j.properties # set log4j.rootCategory to WARN
# launch Spark shell REPL
./bin/spark-shell
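
Once the REPL is up, the concatenated file can be loaded (path assumed relative to the Spark directory):

// In spark-shell: sc is provided by the REPL.
val minitweets = sc.textFile("minitweets")
minitweets.count()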
soumyasd / clk.tsv
Last active August 29, 2015 14:07 — forked from ceteri/clk.tsv
2014-03-04 15dfb8e6cc4111e3a5bb600308919594 11
2014-03-06 81da510acc4111e387f3600308919594 61
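
A short spark-shell sketch for reading these rows, assuming the columns are date, cookie uuid, and click count:

// Key by uuid so the click log can be joined against other logs later.
val clicks = sc.textFile("clk.tsv").map(_.split("\t")).map {
  case Array(date, uuid, count) => (uuid, (date, count.toInt))
}
clicks.take(2).foreach(println)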
package topic

// Note: pre-Apache package names (spark._ rather than org.apache.spark._),
// so this targets an old, pre-0.9 Spark release.
import spark.broadcast._
import spark.SparkContext
import spark.SparkContext._
import spark.RDD
import spark.storage.StorageLevel
import scala.util.Random
import scala.math.{ sqrt, log, pow, abs, exp, min, max }
import scala.collection.mutable.HashMap

Tuning Storm+Trident

Tuning a dataflow system is easy:

The First Rule of Dataflow Tuning:
* Ensure each stage is always ready to accept records, and
* Deliver each processed record promptly to its destination
// data files can be downloaded at https://s3.amazonaws.com/hw-sandbox/tutorial1/infochimps_dataset_4778_download_16677-csv.zip
import java.io.Serializable
import java.util
import org.apache.spark.SparkContext
import org.apache.spark.sql._
val sc = new SparkContext("spark://master:7077", "Spark SQL Intro")
val sqlContext = new SQLContext(sc)
import sqlContext.createSchemaRDD
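
A hypothetical continuation showing what the createSchemaRDD implicit enables (the file name and columns are invented, and this targets the Spark 1.1-era API):

// An RDD of case classes becomes a SchemaRDD via the implicit above,
// which lets it be registered and queried with SQL.
case class Order(orderId: String, customerId: String, amount: Double)
val orders = sc.textFile("orders.csv").map(_.split(","))
  .map(r => Order(r(0), r(1), r(2).toDouble))
orders.registerTempTable("orders")
sqlContext.sql("SELECT customerId, SUM(amount) FROM orders GROUP BY customerId")
  .collect().foreach(println)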
soumyasd / notes.md
Last active August 29, 2015 14:11 — forked from gangstead/notes.md

Typesafe webinar notes: Spray & Akka HTTP

Presenter - Mathias Doenitz

spray.io

  • Embeddable HTTP stack built on Akka actors
  • Just an HTTP integration layer, not for building full web apps
  • Server & client side
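
For a taste of the embeddable point above, a minimal server sketch with spray-routing's SimpleRoutingApp (names, route, and port are illustrative, not from the webinar):

import akka.actor.ActorSystem
import spray.routing.SimpleRoutingApp

// Spins up spray-can and serves a single route, showing how the stack
// embeds in a plain application rather than a web framework.
object PingServer extends App with SimpleRoutingApp {
  implicit val system = ActorSystem("demo")
  startServer(interface = "localhost", port = 8080) {
    path("ping") {
      get { complete("pong") }
    }
  }
}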