Gavin Whyte gavinwhyte

# See blog post at http://vitobotta.com/sinatra-contact-form-jekyll/
%w(rubygems sinatra liquid active_support/secure_random resolv open-uri pony haml).each{ |g| require g }
APP_ROOT = File.join(File.dirname(__FILE__), '..')
set :root, APP_ROOT
set :views, File.join(APP_ROOT, "_layouts")
not_found do
gavinwhyte / matrixandvectors.py
Last active August 29, 2015 14:24
Datascience
import math
# coding=utf-8
def vector_add(v,w):
    """adds corresponding elements"""
    return [v_i + w_i
            for v_i, w_i in zip(v,w)]
def vector_subtract(v, w):
    """subtracts corresponding elements"""
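The preview above is cut off before the body of vector_subtract. As a minimal, self-contained sketch of what the two docstrings describe (the subtraction body here is an assumption inferred from the docstring, not the author's code), element-wise addition and subtraction of equal-length lists could look like this:

```python
def vector_add(v, w):
    """Add corresponding elements of two equal-length vectors."""
    return [v_i + w_i for v_i, w_i in zip(v, w)]


def vector_subtract(v, w):
    """Subtract corresponding elements (assumed body, mirrors vector_add)."""
    return [v_i - w_i for v_i, w_i in zip(v, w)]


if __name__ == "__main__":
    print(vector_add([1, 2], [3, 4]))
    print(vector_subtract([5, 7], [1, 2]))
```

Note that zip stops at the shorter input, so vectors of different lengths are silently truncated rather than rejected.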
gavinwhyte / elastic.py
Created July 13, 2015 05:00
Luigi elastic.co example
# -*- coding: utf-8 -*-
#
# Copyright 2012-2015 Spotify AB
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
gavinwhyte / pyspark.py
Created July 13, 2015 05:54
Luigi PySpark
# -*- coding: utf-8 -*-
#
# Copyright 2012-2015 Spotify AB
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#!/usr/bin/ruby
require 'optparse'
require 'ostruct'
require 'json'
require 'open-uri'
require 'fileutils'
require 'net/http'
require 'time'
require 'date'
gavinwhyte / Dockerfile
Created July 23, 2015 06:39
pythonbuild
FROM buildpack-deps:jessie
# remove several traces of debian python
RUN apt-get purge -y python.*
# http://bugs.python.org/issue19846
# > At the moment, setting "LANG=C" on a Linux system *fundamentally breaks Python 3*, and that's not OK.
ENV LANG C.UTF-8
RUN gpg --keyserver ha.pool.sks-keyservers.net --recv-keys C01E1CAD5EA2C4F0B8E3571504C367C218ADD4FF
# Propensity Score Matching in R
# Copyright 2013 by Ani Katchova
# install.packages("Matching")
library(Matching)
# install.packages("rbounds")
library("rbounds")
mydata<- read.csv("C:/Econometrics/Data/matching_earnings.csv")
attach(mydata)
import psycopg2
import pprint
configuration = { 'dbname': 'database_name',
                  'user':'user_name',
                  'pwd':'user_password',
                  'host':'redshift_endpoint',
                  'port':'redshift_port'
gavinwhyte / sparksetup.py
Created August 22, 2015 00:11
Spark Setup
Setting up Apache Spark and IPython
Some notes on how to get started with Apache Spark on Ubuntu. They include configuring IPython to run with a standalone PySpark.
To ensure that all the setup work can easily be replicated in a cloud environment, I have chosen to set everything up inside a virtual machine, using Vagrant and VirtualBox. In this post I will not go into much detail on how these tools work or how they can be configured. I recommend reading the Vagrant documentation to learn how to, for example, assign more RAM or share folders with your host machine.
Setting up a Virtual Machine using Vagrant
Create a new folder on your machine that will be the home of your Vagrantfile. Once the folder has been created, cd into it and initialize a Vagrant virtual machine. In this case I have chosen a standard Ubuntu 14.04 distribution.
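The folder-and-init step can be sketched in Python as follows. This is only an illustration under stated assumptions: the box name ubuntu/trusty64 (a public Ubuntu 14.04 box), the folder name, and the helper functions are not taken from the original notes. By default the sketch only creates the folder and returns the command it would run.

```python
import shutil
import subprocess
from pathlib import Path

# Assumption: "ubuntu/trusty64" is used here as the Ubuntu 14.04 box;
# the original notes do not name a specific box.
BOX_NAME = "ubuntu/trusty64"


def vagrant_init_command(box=BOX_NAME):
    """Build the command that writes a Vagrantfile for the given box."""
    return ["vagrant", "init", box]


def init_vagrant_vm(folder, box=BOX_NAME, dry_run=True):
    """Create the project folder and (optionally) run `vagrant init` in it.

    With dry_run=True, or when vagrant is not installed, nothing is executed
    and the command is only returned (hypothetical helper for illustration).
    """
    project_dir = Path(folder)
    project_dir.mkdir(parents=True, exist_ok=True)
    command = vagrant_init_command(box)
    if dry_run or shutil.which("vagrant") is None:
        return command
    subprocess.run(command, cwd=project_dir, check=True)
    return command


if __name__ == "__main__":
    print(init_vagrant_vm("spark-vm"))
```

Calling init_vagrant_vm("spark-vm", dry_run=False) on a machine with Vagrant installed would run the command inside the new folder.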
gavinwhyte / rocksvsmines.py
Last active October 20, 2015 21:10
Rocks vs mines Machine Learning Classification
__author__ = 'gavinwhyte'
import urllib2
import numpy
import random
from sklearn import datasets, linear_model
from sklearn.metrics import roc_curve, auc
import pylab as pl
def confusionMatrix(predicted, actual, threshold):
    if len(predicted) != len(actual): return -1
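The preview stops after the length check. As a rough sketch of what a thresholded confusion matrix typically computes (not the author's implementation; the function name, the 0.5 label cutoff, and the return order are assumptions), predicted scores above the threshold count as positive and are tallied against the true labels:

```python
def confusion_counts(predicted, actual, threshold):
    """Return (tp, fn, fp, tn) for scores compared against a threshold.

    Hypothetical helper: raises ValueError on a length mismatch instead of
    returning -1 as the original preview does.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")

    tp = fn = fp = tn = 0
    for score, label in zip(predicted, actual):
        predicted_positive = score > threshold
        actual_positive = label > 0.5  # assumption: labels are 0.0 or 1.0
        if actual_positive and predicted_positive:
            tp += 1
        elif actual_positive:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


if __name__ == "__main__":
    print(confusion_counts([0.9, 0.2, 0.7, 0.1], [1, 1, 0, 0], 0.5))
```

Sweeping the threshold and recomputing these counts gives the points of the ROC curve that the roc_curve import suggests the gist goes on to plot.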