gavinwhyte’s gists

gavinwhyte / upgradegit.txt

Created August 31, 2015 07:03

Upgrading Git on Ubuntu

	Add the git-core PPA to the system's software sources:
	>sudo add-apt-repository ppa:git-core/ppa

	Update the local package index:
	>sudo apt-get update

	Finally, install the git package:
	>sudo apt-get install git

gavinwhyte / requirements.txt

Created August 31, 2015 10:19

python requirements

gavinwhyte / rubyinstall.sh

Created August 31, 2015 10:21

	Installing Homebrew

	First, we need to install Homebrew. Homebrew allows us to install and compile software packages easily from source.

	Homebrew comes with a very simple install script. When it asks you to install the Xcode Command Line Tools, say yes.

	Open Terminal and run the following command:

	ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
	Installing Ruby

gavinwhyte / rubyinstall.sh

Created August 31, 2015 10:25

ruby install

	Installing Homebrew

	First, we need to install Homebrew. Homebrew allows us to install and compile software packages easily from source.

	Homebrew comes with a very simple install script.

	When it asks you to install the Xcode Command Line Tools, say yes.

	Open Terminal and run the following command:

gavinwhyte / knn.py

Created September 13, 2015 10:17

Knn

	__author__ = 'gavinwhyte'
	from numpy import *
	import operator

	import matplotlib
	import matplotlib.pyplot as plt

	def createDataSet():
	    # Four 2-D sample points and their class labels
	    group = array([[1.0,1.1],[1.0,1.0],[0,0],[0,0.1]])
	    labels = ['A', 'A', 'B', 'B']
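The preview ends before any classifier appears. As a minimal sketch of how the four sample points above could feed a k-nearest-neighbours vote, assuming Euclidean distance and a simple majority among the k closest points (the name `knn_classify` and its parameters are hypothetical, not taken from the gist):

```python
from collections import Counter

import numpy as np


def knn_classify(query, points, labels, k):
    """Return the majority label among the k points closest to `query`."""
    # Euclidean distance from the query to every training point.
    distances = np.sqrt(((points - np.asarray(query)) ** 2).sum(axis=1))
    # Indices of the k smallest distances.
    nearest = distances.argsort()[:k]
    # Majority vote over the labels of those k neighbours.
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Same sample data as in the preview above.
group = np.array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
labels = ['A', 'A', 'B', 'B']

prediction_near_origin = knn_classify([0.1, 0.2], group, labels, k=3)
prediction_near_ones = knn_classify([0.9, 1.0], group, labels, k=3)
```

With k=3, a query near the origin has two 'B' neighbours and one 'A', so the vote goes to 'B'; a query near (1, 1) goes to 'A'.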

gavinwhyte / categorical.py

Created July 26, 2016 10:19

python

	# After determining which attributes are categorical and which
	# are numeric, you'll want descriptive statistics for the numeric
	# variables and a count of the unique categories in each
	# categorical attribute


	import urllib2
	import sys
	import numpy as np
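The comment above describes the summary but the preview stops at the imports. A minimal sketch of that summary, assuming the data is already loaded as a list of string rows (the sample rows, the helper names `is_number` and `summarize_columns`, and the rule that a column is numeric when every value parses as a float are all illustrative assumptions, not the gist's own code):

```python
from collections import Counter
import statistics


def is_number(value):
    """Return True if the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def summarize_columns(rows):
    """Describe each column: statistics if numeric, category counts otherwise."""
    summary = []
    for column in zip(*rows):  # transpose rows into columns
        if all(is_number(v) for v in column):
            values = [float(v) for v in column]
            summary.append({
                'type': 'numeric',
                'mean': statistics.mean(values),
                'stdev': statistics.pstdev(values),  # population std. deviation
                'min': min(values),
                'max': max(values),
            })
        else:
            summary.append({
                'type': 'categorical',
                'counts': dict(Counter(column)),
            })
    return summary


# Hypothetical sample: two numeric attributes and one categorical label.
rows = [
    ['1.0', '4.0', 'R'],
    ['2.0', '6.0', 'M'],
    ['3.0', '8.0', 'R'],
]
summary = summarize_columns(rows)
```

Here `summary[0]` and `summary[1]` hold the numeric statistics, and `summary[2]['counts']` maps each category to the number of times it occurs.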

gavinwhyte / trips.py

Last active August 23, 2016 09:26

TripCount

	from scipy.stats import poisson
	import matplotlib.pyplot as plt


	import numpy as np


	fig, ax = plt.subplots(1, 1)

	x = np.fromfile('tripcountssample.txt',

gavinwhyte / multipart.py

Created January 18, 2017 03:43

To run: python multipart.py bucketname extremely_large_file.txt

	#!/usr/bin/env python
	import os, sys
	import math
	import boto

	AWS_ACCESS_KEY_ID = ''
	AWS_SECRET_ACCESS_KEY = ''

	def upload_file(s3, bucketname, file_path):
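The body of `upload_file` is not shown in the preview. As a rough sketch of the arithmetic a multipart upload typically needs, the file can be divided into fixed-size parts, each described by a part number, a byte offset, and a length. The 50 MiB part size and the helper name `plan_parts` are assumptions made for illustration; no boto calls are made here.

```python
import math

CHUNK_SIZE = 50 * 1024 * 1024  # assumed part size (50 MiB); not from the gist


def plan_parts(file_size, chunk_size=CHUNK_SIZE):
    """Return (part_number, offset, length) for each part, numbered from 1."""
    part_count = int(math.ceil(file_size / float(chunk_size)))
    parts = []
    for index in range(part_count):
        offset = index * chunk_size
        # The final part may be shorter than chunk_size.
        length = min(chunk_size, file_size - offset)
        parts.append((index + 1, offset, length))
    return parts


# A hypothetical 120 MiB file splits into two full parts and one 20 MiB part.
parts = plan_parts(120 * 1024 * 1024)
```

Each tuple could then drive one seek-and-read of the source file followed by one part upload.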

gavinwhyte / pcainr.txt

Last active June 13, 2017 04:49

	Principal component analysis (PCA) is a dimensionality reduction technique that is widely used in data analysis.

	Reducing the dimensionality of a dataset is useful in several ways. For example, we can only visualize data in 2 or 3 dimensions, so projecting onto the first few components makes high-dimensional data plottable.

	A lower dimension can also significantly reduce the running time of some numerical algorithms.

	In addition, many statistical models suffer from high correlation between covariates; PCA can be used to produce linear combinations of the covariates that are mutually uncorrelated.


	Computing PCA
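The preview ends at this heading. A minimal sketch of one common way to compute PCA, assuming NumPy and the singular value decomposition of the centred data matrix, is given here. The filename suggests the gist itself uses R, so this is a Python analogue rather than the author's code, and the synthetic data and the names `pca`, `scores`, and `explained_variance` are illustrative.

```python
import numpy as np


def pca(data):
    """Return (components, scores, explained_variance) for a 2-D data array."""
    # Centre each column so the components describe variance, not the mean.
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    # Project the centred data onto the principal directions.
    scores = centered @ vt.T
    # Sample variance captured by each component.
    explained_variance = singular_values ** 2 / (data.shape[0] - 1)
    return vt, scores, explained_variance


rng = np.random.default_rng(0)
x = rng.normal(size=200)
# Two strongly correlated covariates plus an independent one.
data = np.column_stack([
    x,
    2 * x + rng.normal(scale=0.1, size=200),
    rng.normal(size=200),
])

components, scores, explained_variance = pca(data)
# Covariance of the projected data: diagonal up to floating-point error.
score_covariance = np.cov(scores, rowvar=False)
```

The off-diagonal entries of `score_covariance` vanish up to rounding, which is the "uncorrelated linear combinations" property described above, and keeping only the first one or two columns of `scores` gives the reduced-dimension view.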

gavinwhyte / consumer.scala

Created October 6, 2017 10:38

	import java.util

	import org.apache.kafka.clients.consumer.KafkaConsumer

	import scala.collection.JavaConverters._

	object ConsumerExample extends App {

	import java.util.Properties

Gavin Whyte gavinwhyte