devmacrile’s gists

devmacrile / sentiment140.r

Last active August 29, 2015 14:14

Simple R wrapper function for the Sentiment140 API

	# Wrapper function for the Sentiment140 API
	# An API for a maximum entropy model trained on ~1.5M tweets
	# The server will time out if the job takes > 60 seconds,
	# so if the tweet count is relatively high, the function
	# will split the data into chunks of 2500 (fairly arbitrary choice)
	# http://help.sentiment140.com/api
	Sentiment140 <- function(sentences){

	# Load required packages
	library(plyr)
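
The chunking described in the header comment can be sketched in a few lines of Python. This is only an illustration of the splitting step, not the gist's R code: the helper name `chunked` and the sample `tweets` list are hypothetical, and no request is made to the Sentiment140 API. The chunk size of 2500 is taken from the comment above.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int = 2500) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each holding at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Hypothetical input: 6000 placeholder tweets.
tweets = [f"tweet {i}" for i in range(6000)]

# Each chunk would be sent as its own request to stay under the timeout.
chunk_sizes = [len(chunk) for chunk in chunked(tweets)]
print(chunk_sizes)
```

The last chunk is simply shorter than the others, so no padding or special casing is needed.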

devmacrile / wrong-cv-example.r

Last active August 29, 2015 14:14

Sloppy implementation of a simulation exemplifying a common error in performing cross-validation (from The Elements of Statistical Learning, 7.10.2)

	# Example of a common cross-validation mistake
	# Described in The Elements of Statistical Learning, 7.10.2
	# http://statweb.stanford.edu/~tibs/ElemStatLearn/
	#
	# Consider a scenario with
	# N = 50 samples in two equal-sized classes, and p = 5000 quantitative
	# predictors (standard Gaussian) that are independent of the class labels.
	# The true (test) error rate of any classifier is 50%. We carried out the above
	# recipe, choosing in step (1) the 100 predictors having highest correlation
	# with the class labels, and then using a 1-nearest neighbor classifier, based
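
The scenario quoted above can be sketched in Python with NumPy alone. This is an independent illustration, not the gist's R code. The helper names (`top_correlated`, `one_nn_predict`, `cv_error`), the use of 5 folds, the fixed seed, and ranking predictors by absolute correlation are all assumptions made for the sketch. The "wrong" variant selects predictors using every label before cross-validating; the "right" variant repeats the selection inside each training fold.

```python
import numpy as np


def top_correlated(X, y, k):
    """Indices of the k columns of X with the largest |correlation| with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = (Xc * yc[:, None]).sum(axis=0) / (
        np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum())
    )
    return np.argsort(-np.abs(corr))[:k]


def one_nn_predict(X_train, y_train, X_test):
    """1-nearest-neighbor prediction using squared Euclidean distance."""
    d = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    return y_train[d.argmin(axis=1)]


def cv_error(X, y, k_features, n_folds, rng, select_inside_folds):
    """Cross-validated error rate, with screening done inside or outside the folds."""
    n = len(y)
    folds = np.array_split(rng.permutation(n), n_folds)
    if not select_inside_folds:
        # The mistake: screening sees the labels of the held-out samples too.
        keep = top_correlated(X, y, k_features)
    errors = 0
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        if select_inside_folds:
            keep = top_correlated(X[train_idx], y[train_idx], k_features)
        pred = one_nn_predict(
            X[train_idx][:, keep], y[train_idx], X[test_idx][:, keep]
        )
        errors += int((pred != y[test_idx]).sum())
    return errors / n


rng = np.random.default_rng(0)
N, p = 50, 5000
X = rng.standard_normal((N, p))   # predictors independent of the labels
y = np.repeat([0, 1], N // 2)     # two equal-sized classes

wrong = cv_error(X, y, 100, 5, rng, select_inside_folds=False)
right = cv_error(X, y, 100, 5, rng, select_inside_folds=True)
print(f"selection outside CV: {wrong:.2f}, selection inside CV: {right:.2f}")
```

Because the predictors carry no signal, an honest estimate should sit near 50%. The variant that screens on all of the data typically reports a much lower error, which is the bias the example is meant to expose.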

devmacrile / map1.py

Created February 5, 2015 16:29

Modifiable map-reduce code for running TF-IDF via Hadoop Streaming jobs.

	#!/usr/bin/python
	import sys
	import re
	import nltk
	from nltk.corpus import stopwords

	stop_words = stopwords.words('english')
	#input comes from standard input
	for line in sys.stdin:
	#separate incident id from text
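
To show what such a mapper feeds into, here is a small in-memory sketch of TF-IDF written as a map step followed by reduce steps. It is not the gist's Hadoop Streaming code: the function names `map_terms` and `tf_idf`, the tiny `STOP_WORDS` set standing in for the NLTK list, the sample documents, and the particular weighting (relative term frequency times `log(N / df)`) are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Tiny stand-in for nltk's English stop-word list (assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "and", "of"}


def map_terms(doc_id, text):
    """Map step: emit ((term, doc_id), 1) for every non-stop-word token."""
    for token in re.findall(r"[a-z]+", text.lower()):
        if token not in STOP_WORDS:
            yield (token, doc_id), 1


def tf_idf(docs):
    """Reduce steps: sum the counts, then combine term and document frequencies."""
    counts = Counter()
    for doc_id, text in docs.items():
        for key, one in map_terms(doc_id, text):
            counts[key] += one

    doc_len = defaultdict(int)    # number of kept tokens per document
    doc_freq = defaultdict(int)   # number of documents containing each term
    for (term, doc_id), n in counts.items():
        doc_len[doc_id] += n
        doc_freq[term] += 1

    n_docs = len(docs)
    return {
        (term, doc_id): (n / doc_len[doc_id]) * math.log(n_docs / doc_freq[term])
        for (term, doc_id), n in counts.items()
    }


# Hypothetical incident texts keyed by incident id.
docs = {"inc1": "The cat sat", "inc2": "The dog sat down"}
scores = tf_idf(docs)
for key in sorted(scores):
    print(key, round(scores[key], 4))
```

A term that appears in every document, such as "sat" here, gets a score of zero, while terms unique to one document are weighted up.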

devmacrile / writeTDE.r

Created February 26, 2016 03:43

Write R data.frame to a Tableau data extract file (.tde)


	# Write R data.frame to a Tableau data extract file (.tde) by building and executing
	# a python script which utilizes the Tableau data extract API (a hack, yes).
	#
	# This, naturally, has a hard dependency on the TDE API, so is only available for
	# Windows and Linux systems (unfortunately)
	#
	# Devin Riley
	# October, 2014

devmacrile / keybase.md

Created October 25, 2016 18:43

I hereby claim:

I am devmacrile on github.
I am devmacrile (https://keybase.io/devmacrile) on keybase.
I have a public key whose fingerprint is 08D8 CCCC 5D01 1F96 285C 2606 7709 364D 9CCA 6F14

To claim this, I am signing this object:

devmacrile / twenty_sided.py

Created March 18, 2017 18:11

Twenty-sided die sum/difference simulation

	import numpy as np
	import matplotlib.pyplot as plt
	import seaborn as sns

	# randint's upper bound is exclusive, so 21 yields faces 1..20
	# (np.random.random_integers is deprecated)
	di1 = np.random.randint(1, 21, 100000)
	di2 = np.random.randint(1, 21, 100000)

	values = (di1 + di2) - (np.absolute(di1 - di2))
	sample_mean = np.nanmean(values)
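
As a cross-check on the simulation, the quantity `(di1 + di2) - |di1 - di2|` is just twice the smaller of the two rolls, so its exact mean can be computed by enumerating all 400 outcomes. The sketch below is an addition for illustration, not part of the gist; the names `faces`, `d1`, `d2`, and `exact_mean` are chosen here.

```python
import numpy as np

# All 20 x 20 equally likely outcomes for two twenty-sided dice.
faces = np.arange(1, 21)
d1, d2 = np.meshgrid(faces, faces)

# Sum minus absolute difference, as in the simulation.
values = (d1 + d2) - np.abs(d1 - d2)

# The same quantity written as twice the minimum of the two rolls.
twice_min = 2 * np.minimum(d1, d2)
identity_holds = bool(np.array_equal(values, twice_min))

# Exact expectation over the uniform outcome grid.
exact_mean = values.mean()
print(identity_holds, exact_mean)
```

The exact mean works out to 14.35, so a sample mean from 100,000 simulated rolls should land close to that value.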

devmacrile / definitions.py

Created October 11, 2017 20:03

Curious about how Python handles simultaneous local definitions compared to a Lisp example
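
The gist's code is not shown in this listing, so the following is an independent sketch of the topic rather than the author's example. It assumes the Lisp example in question resembles a Scheme internal definition in which a local name is read before the line that defines it. The names `f`, `parity`, `is_even`, and `is_odd` are hypothetical.

```python
a = 1


def f(x):
    # `a` is assigned later in this body, so Python treats it as a local
    # name for the whole function. Reading it here fails before `a = 5` runs.
    b = a + x
    a = 5
    return a + b


try:
    result = f(10)
except UnboundLocalError as exc:
    result = type(exc).__name__

print(result)


def parity(n):
    # Local functions can refer to each other regardless of definition order,
    # because the names are looked up when the functions are called.
    def is_even(k):
        return k == 0 or is_odd(k - 1)

    def is_odd(k):
        return k != 0 and is_even(k - 1)

    return is_even(n)


print(parity(4), parity(3))
```

In this sketch Python neither falls back to the outer `a` nor looks ahead to the later assignment; it raises an error at the first read. Mutually recursive local functions work because their bodies are evaluated only at call time, after both names are bound.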

Devin Riley devmacrile