


Davide Anastasia davideanastasia

@davideanastasia
davideanastasia / circles.cpp
Last active December 13, 2015 17:59
circles.cpp
#include <vector>

using namespace std;

struct UpDown
{
    UpDown(int in, int out)
        : in_(in)
        , out_(out)
    {}

    int in_;
    int out_;
};
# - Find ODB library
# Find the native ODB includes and library
# This module defines
# ODB_INCLUDE_DIR, where to find Url.h, etc.
# ODB_LIBRARIES, libraries to link against to use the ODB C++ client.
# ODB_FOUND, If false, do not try to use the ODB C++ client.
# also defined, but not for general use are
# ODB_CORE_LIBRARY, where to find the core ODB library.
# ODB_MYSQL_LIBRARY, where to find the mysql ODB library.
# ODB_PGSQL_LIBRARY, where to find the pgsql ODB library.
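The header above lists the variables this find module defines. As a hedged usage sketch (the target name `my_app` is a placeholder, and it assumes the module file is on `CMAKE_MODULE_PATH`), a consuming `CMakeLists.txt` might use it like this:

```cmake
# Hypothetical usage of the FindODB module described above.
list(APPEND CMAKE_MODULE_PATH "${CMAKE_SOURCE_DIR}/cmake")
find_package(ODB REQUIRED)

if(ODB_FOUND)
  include_directories(${ODB_INCLUDE_DIR})
  target_link_libraries(my_app ${ODB_LIBRARIES})
endif()
```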
@davideanastasia
davideanastasia / average.cpp
Created February 19, 2013 15:40
Cumulative Average
#include <iostream>
#include <vector>
#include <algorithm>
#include <cstdlib>
#include <cstdio>
#include <iterator>
using namespace std;
template<int begin, int end>
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.UUID;
import org.apache.cassandra.db.marshal.AsciiType;
import org.apache.cassandra.dht.IPartitioner;
import org.apache.cassandra.dht.Murmur3Partitioner;

Setting up Flume NG, listening to syslog over UDP, with an S3 Sink

My goal was to set up Flume on my web instances and write all events into S3, so I could easily use other tools like Amazon Elastic MapReduce and Amazon Redshift.

I didn't want to deal with log rotation myself, so I set up Flume to read from a syslog UDP source. In this setup, Flume NG acts as a syslog server: as long as Flume is running, my web application can simply write to it in syslog format on the specified port. Most languages have plugins for this.

At the time of this writing, I have Flume NG up and running on three EC2 instances, all writing to the same bucket.

Install Flume NG on instances
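The setup described above can be sketched as a single-agent flume.conf. This is a hedged illustration, not the author's actual file: the agent name, port, capacities, and bucket path are placeholders, and it assumes the Hadoop jars (for the s3n:// filesystem) are on Flume's classpath.

```properties
# Hypothetical flume.conf: syslog-over-UDP source -> memory channel -> S3 via HDFS sink
agent.sources = syslog-udp
agent.channels = mem
agent.sinks = s3

agent.sources.syslog-udp.type = syslogudp
agent.sources.syslog-udp.host = 0.0.0.0
agent.sources.syslog-udp.port = 5140
agent.sources.syslog-udp.channels = mem

agent.channels.mem.type = memory
agent.channels.mem.capacity = 10000

agent.sinks.s3.type = hdfs
agent.sinks.s3.channel = mem
agent.sinks.s3.hdfs.path = s3n://my-bucket/logs/%Y/%m/%d
agent.sinks.s3.hdfs.fileType = DataStream
agent.sinks.s3.hdfs.rollInterval = 300
```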

import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.HttpMethods._
import akka.http.scaladsl.model.{Uri, StatusCodes, HttpResponse, HttpRequest}
import akka.stream.ActorMaterializer
import scala.concurrent.duration._
import scala.concurrent.Await
import scala.util.{Success, Try, Failure}
@davideanastasia
davideanastasia / CassandraEnum.scala
Last active December 17, 2015 07:38
Spark Cassandra Connector Enum fiasco
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._
import com.datastax.spark.connector.types._
import scala.reflect.runtime.universe._
object Days {
sealed abstract class WeekDay(val id: Int)
object WeekDay {
@davideanastasia
davideanastasia / main.1.java
Last active June 5, 2018 21:59
Apache Beam Getting Started - #1
public class Indexer {
    public static void main(String[] args) throws Exception {
        Options options = PipelineOptionsFactory.fromArgs(args)
                .withValidation()
                .as(Options.class);

        Pipeline p = Pipeline.create(options);
        // ...
        p.run().waitUntilFinish();
    }
}
@davideanastasia
davideanastasia / main.2.java
Created June 5, 2018 22:02
Apache Beam Getting Started - #2
public class Indexer {
    public static void main(String[] args) throws Exception {
        // ...
        p.apply(FileIO.match().filepattern(options.getInputFile()))
                .apply(FileIO.readMatches())
                .apply("ReadFile", ParDo.of(new ReadFileFn()));
        // ...
    }
}
@davideanastasia
davideanastasia / readfilefn.java
Created June 5, 2018 22:03
Apache Beam Getting Started - #3
public class ReadFileFn extends DoFn<FileIO.ReadableFile, KV<Metadata, String>> {
    @ProcessElement
    public void processElement(ProcessContext c) {
        try {
            FileIO.ReadableFile f = c.element();
            String filename = f.getMetadata().resourceId().getFilename();
            ReadableByteChannel rbc = f.open();
            try (InputStream stream = Channels.newInputStream(rbc)) {
                // ... read the stream and emit KV.of(f.getMetadata(), line) per record
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}