Saswata Dutta saswata-dutta
@saswata-dutta
saswata-dutta / FilterBadGzipFiles.scala
Created September 5, 2019 12:28 — forked from jfrazee/FilterBadGzipFiles.scala
Spark job to read gzip files, ignoring corrupted files
import java.io._
import scala.io._
import java.util.zip._
// Spark
import org.slf4j.Logger
import org.apache.spark.{ SparkConf, SparkContext, Logging }
// Hadoop
import org.apache.hadoop.io.compress.GzipCodec
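The preview above stops at the imports, so the Spark logic itself is not visible here. As a rough illustration of the idea in the description (read gzip files and skip the corrupted ones), here is a minimal sketch in plain Python rather than Spark. The function names `is_readable_gzip` and `filter_good_gzip` are hypothetical and do not come from the gist; the check simply decompresses each file to the end and treats any decoding error as "corrupted".

```python
import gzip
import zlib


def is_readable_gzip(path, chunk_size=1 << 20):
    """Return True if the file at `path` decompresses cleanly to the end.

    Hypothetical helper for illustration: reads the stream in chunks so
    large files are never held in memory all at once.
    """
    try:
        with gzip.open(path, "rb") as fh:
            while fh.read(chunk_size):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        # OSError covers a bad header/CRC (gzip.BadGzipFile) and missing files,
        # EOFError covers truncated files, zlib.error covers a damaged stream.
        return False


def filter_good_gzip(paths):
    """Keep only the paths whose gzip content can be fully read."""
    return [p for p in paths if is_readable_gzip(p)]
```

Decompressing every file twice (once to validate, once to process) is wasteful on large inputs; a real job would more likely catch the error while reading and drop that file's records.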
/**
* A tiny class that extends a list with four combinatorial operations:
* ''combinations'', ''subsets'', ''permutations'', ''variations''.
*
* You can find all the ideas behind this code at blog-post:
*
* http://vkostyukov.ru/posts/combinatorial-algorithms-in-scala/
*
* How to use this class.
*
@saswata-dutta
saswata-dutta / bson_to_python_file.py
Created August 31, 2019 08:24 — forked from pomack/bson_to_python_file.py
Convert a MongoDB BSON file to a python file that can be imported.
#!/usr/bin/env python
import argparse
import bson
import datetime
import struct
import sys
INDENT_SPACES = ' '
def read_bson_file(file, as_class=dict, tz_aware=True, uuid_subtype=bson.OLD_UUID_SUBTYPE):
@saswata-dutta
saswata-dutta / regexp_match.cpp
Created August 21, 2019 09:33 — forked from bluesunxu/regexp_match.cpp
Match a regular expression (including '.' and '*') using dynamic programming
bool canMatch(char a, char b)
{
return (a == b || b == '.');
}
bool isMatch(const char *s, const char *p) {
int lS = strlen(s);
int lP = strlen(p);
vector<vector<bool> > F(lS + 1, vector<bool>(lP + 1));
F[0][0] = true;
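The C++ preview is cut off right after the table is initialised. The following is a minimal Python sketch of how such a dynamic-programming table is commonly filled, assuming the usual semantics: '.' matches any single character, '*' means zero or more of the preceding element, and the pattern must match the whole string. It mirrors the names `canMatch` and `F` from the snippet, but the loop body is an assumption, not the gist's actual code.

```python
def can_match(a: str, b: str) -> bool:
    """A pattern character b matches a text character a if equal or b is '.'."""
    return a == b or b == "."


def is_match(s: str, p: str) -> bool:
    """Return True if pattern p matches the entire string s."""
    n, m = len(s), len(p)
    # f[i][j] is True when the prefix s[:i] is matched by the prefix p[:j].
    f = [[False] * (m + 1) for _ in range(n + 1)]
    f[0][0] = True

    # The empty string can still match patterns such as "a*" or "a*b*".
    for j in range(2, m + 1):
        if p[j - 1] == "*":
            f[0][j] = f[0][j - 2]

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if p[j - 1] == "*":
                if j < 2:
                    continue  # a leading '*' has nothing to repeat; no match
                # Zero occurrences of the preceding element.
                f[i][j] = f[i][j - 2]
                # One more occurrence, if it matches the current character.
                if can_match(s[i - 1], p[j - 2]):
                    f[i][j] = f[i][j] or f[i - 1][j]
            else:
                f[i][j] = f[i - 1][j - 1] and can_match(s[i - 1], p[j - 1])
    return f[n][m]
```

The table has (len(s) + 1) × (len(p) + 1) entries and each is filled in constant time, so the sketch runs in O(len(s) · len(p)) time and space.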
@saswata-dutta
saswata-dutta / System Design.md
Created August 19, 2019 14:54 — forked from vasanthk/System Design.md
System Design Cheatsheet

System Design Cheatsheet

Picking the right architecture = Picking the right battles + Managing trade-offs

Basic Steps

  1. Clarify and agree on the scope of the system
  • Use cases (description of sequences of events that, taken together, lead to a system doing something useful)
    • Who is going to use it?
    • How are they going to use it?
@saswata-dutta
saswata-dutta / Handler.java
Created July 26, 2019 20:15 — forked from lucastex/Handler.java
Reading files from Amazon S3 directly in a java.net.URL object.
package sun.net.www.protocol.s3;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import org.jets3t.service.ServiceException;
import org.jets3t.service.impl.rest.httpclient.RestS3Service;
@saswata-dutta
saswata-dutta / GOLTest.scala
Created July 15, 2019 14:13
Scala Game of Life - Craftsmanship Guild Mini Coderetreat
package guild.gameoflife
import org.scalatest.matchers.ShouldMatchers
import org.scalatest.prop.Checkers
import org.scalatest.Spec
import org.scalatest.junit.JUnitRunner
import org.junit.runner.RunWith
@RunWith(classOf[JUnitRunner])
class GOLTest extends Spec with ShouldMatchers with Checkers {
@saswata-dutta
saswata-dutta / pom.xml
Last active June 16, 2019 11:08 — forked from wrschneider/App.java
Spring Boot listener for Amazon SQS
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>bill</groupId>
<artifactId>boottest</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>boottest</name>
@saswata-dutta
saswata-dutta / spark-rest-submit.sh
Created May 31, 2019 07:16 — forked from yaravind/spark-rest-submit.sh
Submit apps (e.g. SparkPi) to a Spark cluster using the REST API
curl -X POST http://master-host:6066/v1/submissions/create --header "Content-Type:application/json" --data '{
"action": "CreateSubmissionRequest",
"appResource": "hdfs://localhost:9000/user/spark-examples_2.11-2.0.0.jar",
"clientSparkVersion": "2.0.0",
"appArgs": [ "10" ],
"environmentVariables" : {
"SPARK_ENV_LOADED" : "1"
},
"mainClass": "org.apache.spark.examples.SparkPi",
"sparkProperties": {
@saswata-dutta
saswata-dutta / Makefile
Created May 26, 2019 13:44 — forked from kwk/Makefile
Compiling with Address Sanitizer (ASAN) with CLANG and with GCC-4.8
.PHONY: using-gcc using-gcc-static using-clang
using-gcc:
	g++-4.8 -o main-gcc -lasan -O -g -fsanitize=address -fno-omit-frame-pointer main.cpp && \
	ASAN_OPTIONS=symbolize=1 ASAN_SYMBOLIZER_PATH=$(shell which llvm-symbolizer) ./main-gcc
using-gcc-static:
	g++-4.8 -o main-gcc-static -static-libstdc++ -static-libasan -O -g -fsanitize=address -fno-omit-frame-pointer main.cpp && \
	ASAN_OPTIONS=symbolize=1 ASAN_SYMBOLIZER_PATH=$(shell which llvm-symbolizer) ./main-gcc-static