Michael Chirico MichaelChirico

@MichaelChirico
MichaelChirico / rpb_test.R
Last active April 17, 2017 15:06
setup & test RPushbullet
install.packages('RPushbullet')
# get key here:
# https://www.pushbullet.com/#settings/account
# note: this will prompt for a default device
RPushbullet::pbSetup('keep_this_private')
library(RPushbullet)
#that's it!
pbPost('note', "IT'S", 'ALIVE')
MichaelChirico / bug.csv
Created February 23, 2017 16:56
segfault bug
ID V
16227 0
16228 0
16229 0
16230 0
16232 0
16234 0
16235 0
16236 0
16237 0
MichaelChirico / portland_output
Last active March 1, 2017 02:45
for sandboxing outputs among machines
feb28 = structure(list(x_coordina = c(45.590501, 45.567097, 45.518155,
45.509379, 45.521205, 45.568948, 45.525444, 45.527497, 45.537308,
45.522968, 45.537873, 45.506088, 45.534096, 45.481038, 45.521164,
45.510095, 45.518234, 45.518234, 45.5263, 45.511535, 45.535158,
45.515665, 45.516012, 45.479503, 45.515318, 45.520765, 45.515158,
45.446186, 45.514426, 45.466002, 45.553682, 45.532582, 45.51153,
45.497917, 45.515003, 45.503943, 45.530414, 45.510336, 45.533919,
45.522286, 45.534656, 45.5322, 45.481887, 45.574603, 45.509471,
45.543617, 45.584307, 45.537896, 45.518114, 45.532731, 45.512403,
45.497119, 45.441898, 45.533295, 45.469783, 45.563008, 45.515077,
MichaelChirico / read_whatsapp.R
Last active May 29, 2018 08:40
extract from Whatsapp history to data.table
library(data.table)
whatsapp_raw = readLines('~/Downloads/WhatsApp Chat with PhDelphia.txt')
#have to deal with multi-line messages :\
idx = grepl('^1?[0-9]/', whatsapp_raw)
idxrle = rle(idx)
bdpts = cumsum(idxrle$lengths)
for (ii in seq_along(idxrle$values)) {
if (!idxrle$values[ii]) {
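The preview cuts off inside the loop, so the rest of the author's approach is not visible here. As a minimal sketch of the same multi-line problem, and not necessarily what the full script does, continuation lines can be folded into the message they belong to by grouping on `cumsum(idx)`. The sample lines and the names `msg_id` and `messages` are hypothetical, introduced only for illustration.

```r
# Hypothetical sample lines standing in for the exported chat file
whatsapp_raw = c(
  '1/2/17, 9:00 AM - A: first line',
  'continued on a second line',
  '12/3/17, 10:15 AM - B: single-line message'
)

# TRUE where a line starts a new message (same pattern as in the gist)
idx = grepl('^1?[0-9]/', whatsapp_raw)

# cumsum(idx) increases by one at each message start, so every
# continuation line shares the id of the message it follows
msg_id = cumsum(idx)

# paste the lines of each message back together, one element per message
messages = unname(vapply(
  split(whatsapp_raw, msg_id), paste, character(1L), collapse = '\n'
))
```

This swaps the explicit `rle` loop for a grouping step; it assumes the first line of the file starts a message, otherwise any leading lines end up in their own group with id 0.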
MichaelChirico / open_issues.R
Last active December 20, 2023 15:37
List all open issues of a repository. Great for exploring random open issues.
library(gh)
library(data.table)
user = 'Rdatatable'
repo = 'data.table'
# GET URL for 100 open issues at a time
issue_query_fmt = "/repos/%s/%s/issues?state=open&per_page=100&page=%d"
issues = list()
page = 1L
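The preview stops before the paging loop. One way the setup above could be continued is sketched below, under assumptions: `build_query` and `fetch_all` are hypothetical helper names, and the `fetch` argument defaults to `gh::gh` but can be swapped for a stub so the loop can be exercised without network access. The stopping rule (an empty page means no more issues) is an assumption about how the remaining code behaves, not something shown in the gist.

```r
issue_query_fmt = "/repos/%s/%s/issues?state=open&per_page=100&page=%d"

# Hypothetical helper: fill in the endpoint for one page of results
build_query = function(user, repo, page) {
  sprintf(issue_query_fmt, user, repo, page)
}

# Hypothetical helper: request pages until one comes back empty.
# `fetch` is any function taking an endpoint string and returning a list;
# the default, gh::gh, is only evaluated if no replacement is supplied.
fetch_all = function(user, repo, fetch = gh::gh) {
  issues = list()
  page = 1L
  repeat {
    batch = fetch(build_query(user, repo, page))
    if (!length(batch)) break
    issues = c(issues, batch)
    page = page + 1L
  }
  issues
}
```

With the gist's values this would be called as `fetch_all('Rdatatable', 'data.table')`, which issues one request per 100 open issues.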
MichaelChirico / bad_fill.csv
Created July 11, 2017 23:41
file won't read due to improper columns
train_set,delx,dely,alpha,eta,lt,theta,k,l1,l2,kde.bw,kde.lags,kde.win,pei,pai
train_13,181.510032358347,620.980401238194,0,0.956507977447473,7,2.06174959123186,100,0,0,615.388604282634,1,11,0.0952380952380952,24.4244783950786
train_13,181.510032358347,620.980401238194,0,0.956507977447473,7,2.06174959123186,100,0,1e-05,615.388604282634,1,11,0.0238095238095238,6.10611959876964
train_13,181.510032358347,620.980401238194,0,0.956507977447473,7,2.06174959123186,100,0,5e-05,615.388604282634,1,11,0,0
train_13,181.510032358347,620.980401238194,0,0.956507977447473,7,2.06174959123186,100,0,1e-04,615.388604282634,1,11,0.0238095238095238,6.10611959876964
train_13,181.510032358347,620.980401238194,0,0.956507977447473,7,2.06174959123186,100,0,5e-04,615.388604282634,1,11,0.0238095238095238,6.10611959876964
train_13,181.510032358347,620.980401238194,0,0.956507977447473,7,2.06174959123186,100,0,0.001,615.388604282634,1,11,0.0119047619047619,3.05305979938482
train_13,181.510032358347,620.980401238194,0,0.956507977447473,7,2.06
MichaelChirico / competition_performance.R
Last active July 21, 2017 00:21
code to calculate NIJ performance
setwd('~/Downloads')
library(rgdal)
library(rgeos)
library(data.table)
#ground truth crimes file
obs_crimes = readOGR('PPB Data', 'NIJ2017_MAR01_May31')
obs_crimes$occ_date = as.IDate(obs_crimes$occ_date, format = '%Y/%m/%d')
#get indices corresponding to each crime category
\begin{itemized}
\item $L(x) = f(x) g(h(q(r(x))))$, where $f(x) = \frac{1}{\sigma \sqrt{2 \pi}} \frac{1}{x}$,
$g(x) = e^x$, $h(x) = -\frac12 x^2$, $q(x) = \frac{x - \mu}{\sigma}$, and $r(x) = \log(x)$.
\item \texttt{f = function(x) exp(-.5*log(x)^2)/x/sqrt(2*pi)}
\item \texttt{curve(f, 0, 5)}
\item No. We can see this on the graph. Analytically, both $L(.136)$ and $L(.999)$ are roughly equal to .4.
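The claim that $L(.136)$ and $L(.999)$ nearly coincide can be checked numerically. The sketch below reuses the definition of `f` from the item above (the case $\mu = 0$, $\sigma = 1$) and compares it against base R's `dlnorm`; the name `vals` is introduced only for illustration.

```r
# Same definition as in the item above: the density with mu = 0, sigma = 1
f = function(x) exp(-.5*log(x)^2)/x/sqrt(2*pi)

# Evaluate at the two points named in the text; both come out close to 0.4
vals = f(c(.136, .999))

# Cross-check against R's built-in log-normal density (meanlog = 0, sdlog = 1)
same_as_builtin = isTRUE(all.equal(vals, dlnorm(c(.136, .999))))
```

Two distinct inputs mapping to nearly the same output is what the graph shows as well: the curve rises to a peak and then falls, so horizontal lines below the peak cross it twice.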
MichaelChirico / nhl_pugilism.R
Last active November 10, 2017 00:54
code for producing pugnacious nhl teams plot
library(rvest)
library(data.table)
# downloaded individually... website throttles scraping >:(
# URL stub for year YYYY is
# http://www.hockeyfights.com/leaders/teams/1/regYYYY
pgs = list.files('~/Desktop/nhl_fights',
full.names = TRUE, pattern = 'html')
names(pgs) = gsub('.*([0-9]{4}).*', '\\1', pgs)
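As a small illustration of the naming step, the sketch below applies the same `gsub` pattern to made-up file paths. The paths are hypothetical; the real file names in `~/Desktop/nhl_fights` are not shown in the gist.

```r
# Hypothetical paths standing in for the saved HTML pages
pgs = c('/tmp/nhl_fights/reg2016.html', '/tmp/nhl_fights/reg2017.html')

# The leading '.*' is greedy, so the capture group ends up holding the
# last run of four digits in each path, here the season year
names(pgs) = gsub('.*([0-9]{4}).*', '\\1', pgs)
```

Because the match takes the last four-digit run, a directory name that itself contains four digits would not interfere as long as the year appears later in the path.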
MichaelChirico / read_and_download.R
Last active February 4, 2018 16:18
Read OPA data, find properties in 19125, download their latest data
# load tools for manipulating data
library(data.table)
# read data (press TAB to enable path completion)
prop = fread('~/Downloads/opa_properties_public.csv')
# first, subset to 19125
## exploring: is zip code consistently formatted? (no)
prop[ , table(nchar(zip_code))]
## mainly looks like there are 6 formats:
## 1. 1912 [typo; 2 cases]