Cal calvinlfer

Experimenting with AWS Lambda for ETL

A lot of us are interested in doing more analysis with our service logs, so I thought I'd share an experiment I'm doing with Sync. The main idea is to transform the raw logs into something that is easy to query and build reports from in Redshift.

The Pipeline

Logs make their way into an S3 bucket (let's call it the 'raw' bucket), where we've got a Lambda function listening for new data. This Lambda reads the raw, gzipped Heka protobuf data, applies some transformations and writes a new file to a different S3 bucket (the 'processed' bucket) in a Redshift-friendly format (like JSON or CSV). Another Lambda listens on the processed bucket and loads this data into Redshift.
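As a rough illustration of the transform step, here is a minimal sketch in Scala. It assumes the Heka protobuf payload has already been decoded into a hypothetical `LogRecord` case class (the real schema and the decoding step are not shown here), and it leaves out the S3 reads and writes entirely. `TransformSketch`, `gunzip`, `csvField` and `toCsvLine` are illustrative names, not part of the original experiment.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.util.zip.GZIPInputStream

// Hypothetical, already-decoded log record; the real Heka message schema is an assumption.
final case class LogRecord(timestamp: Long, service: String, message: String)

object TransformSketch {
  // Decompress a gzipped payload, as the first Lambda would after fetching an object from the raw bucket.
  def gunzip(bytes: Array[Byte]): Array[Byte] = {
    val in  = new GZIPInputStream(new ByteArrayInputStream(bytes))
    val out = new ByteArrayOutputStream()
    val buf = new Array[Byte](8192)
    try {
      var n = in.read(buf)
      while (n != -1) {
        out.write(buf, 0, n)
        n = in.read(buf)
      }
      out.toByteArray
    } finally in.close()
  }

  // Quote a field only when it contains a comma, a double quote or a line break.
  def csvField(s: String): String =
    if (s.exists(c => c == ',' || c == '"' || c == '\n' || c == '\r'))
      "\"" + s.replace("\"", "\"\"") + "\""
    else s

  // Render one record as a single CSV line for the processed bucket.
  def toCsvLine(r: LogRecord): String =
    Seq(r.timestamp.toString, csvField(r.service), csvField(r.message)).mkString(",")
}
```

In a real Lambda, the handler would call `gunzip` on the object body, decode the records, map them through `toCsvLine` and write the joined lines to the processed bucket.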

Haskell, Stack and IntelliJ IDEA IDE setup tutorial: how to get started

Upon completion you will have a sane, productive Haskell environment adhering to best practices.

Basics

Haskell is a programming language.
Stack is a build tool for Haskell projects (similar tools for other languages include Maven, Gradle, npm and RubyGems).
IntelliJ IDEA is a popular IDE.

Install required libraries

sudo apt-get install libtinfo-dev libghc-zlib-dev libghc-zlib-bindings-dev

Recursion and Trampolines in Scala

Recursion is beautiful. As an example, let's consider this perfectly acceptable example of defining the functions even and odd in Scala, whose semantics you can guess:

def even(i: Int): Boolean = i match {
  case 0 => true
  case _ => odd(i - 1)
}

def odd(i: Int): Boolean = i match {
  case 0 => false
  case _ => even(i - 1)
}
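
The mutually recursive even/odd pair is not tail-recursive within a single method, so a large enough input overflows the stack. One way to avoid that is a trampoline. The sketch below uses the standard library's `scala.util.control.TailCalls`; the original write-up may well build its own trampoline instead, and the object name `TrampolinedParity` is made up for illustration. It assumes non-negative input.

```scala
import scala.util.control.TailCalls.{TailRec, done, tailcall}

object TrampolinedParity {
  // Each step returns a description of the next call instead of making it directly,
  // so the stack does not grow with the size of i.
  def even(i: Int): TailRec[Boolean] = i match {
    case 0 => done(true)
    case _ => tailcall(odd(i - 1))
  }

  def odd(i: Int): TailRec[Boolean] = i match {
    case 0 => done(false)
    case _ => tailcall(even(i - 1))
  }
}
```

Calling `.result` runs the trampoline in a loop, for example `TrampolinedParity.even(1000000).result`.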

JWT Kong Example

Get and Start Kong and Co

git clone git@github.com:Mashape/docker-kong.git
cd docker-kong/compose
docker-compose up

Create Kong API Route

Libraries

Web Server: Play (framework) or http4s (library)
Actors: akka
Asynchronous Programming: monix (for tasks, reactors, observables, scheduler etc)
Authentication: Silhouette
Authorization: Deadbolt
Command-line option parsing: case-app
CSV Parsing: kantan.csv
DB: doobie (for PostgreSQL)

	trait OrderedAtLeastOnceDelivery extends AtLeastOnceDelivery {
	  type DeliveryId = Long

	  private case class Delivery(destination: ActorPath, deliveryIdToMessage: (DeliveryId) => Any)

	  private val deliveryQueue = scala.collection.mutable.Queue.empty[Delivery]

	  override def deliver(destination: ActorPath)(deliveryIdToMessage: (DeliveryId) => Any): Unit = {
	    if (super.numberOfUnconfirmed == 0) {
	      super.deliver(destination)(deliveryIdToMessage)

	object Main extends App {
	  AvoidLosingGenericType.run()
	  AvoidMatchingOnGenericTypeParams.run()
	  TypeableExample.run()
	  TypeTagExample.run()
	}

	class Funky[A, B](val foo: A, val bar: B) {
	  override def toString: String = s"Funky($foo, $bar)"
	}

	import akka.actor.ActorSystem
	import akka.stream._
	import akka.stream.scaladsl._

	import scala.io.StdIn
	import scala.util.Random

	object SimplePartitionSample extends App {

	  implicit val system = ActorSystem()

	postgres:
	  image: postgres:9.4
	  volumes:
	    - ./init.sql:/docker-entrypoint-initdb.d/init.sql

	import akka.{Done, NotUsed}
	import akka.actor.ActorSystem
	import akka.http.scaladsl.Http
	import akka.http.scaladsl.common.EntityStreamingSupport
	import akka.http.scaladsl.model.ws.TextMessage.{Streamed, Strict}
	import akka.http.scaladsl.model.ws.{Message, TextMessage, WebSocketRequest}
	import akka.stream.ActorMaterializer
	import akka.stream.scaladsl.{Flow, Keep, Sink, Source}
	import akka.util.ByteString
	import io.circe.Json