Skip to content

Instantly share code, notes, and snippets.

View Protonk's full-sized avatar
🗯️
Après moi, la brisure

Adam Hyland Protonk

🗯️
Après moi, la brisure
View GitHub Profile
@Protonk
Protonk / Theil.R
Created December 17, 2010 23:48
Theil index code from the lorenz.R gist. Works on those numbers
#Computes a very basic Theil index
i<-1
theil<-matrix(0,4,3)
for (i in i:4) {
theil[i,1]<-0.25*(gini[i,1]/0.25)*log(gini[i,1]/0.25)
theil[i,2]<-0.25*(gini[i,2]/0.25)*log(gini[i,2]/0.25)
theil[i,3]<-0.25*(gini[i,3]/0.25)*log(gini[i,3]/0.25)
}
theil<-rbind(theil,apply(theil,2,sum))
colnames(theil)<-c("Region A","Region B","Region C");
@Protonk
Protonk / Faux PCA.R
Created December 21, 2010 19:28
Univariate regression with some visuals to get the idea of what a transformation looks like
x<-c(1:50)
y<-x+rnorm(50,sd=10)
plot(x,y,pch=20,main="Distance from Fitted Line")
abline(coef(lm(y~x)[1],coef(lm(y~x)[2])),col="blue",lwd=2)
segments(x,predict(lm(y~x)),y1=lm(y~x)$model[,1])
#This bit took longer to do than I care to admit. Basically you want distance from the data point to the line
#as an adjacent side to the triangle that would have the vertical distance as the hypotenuse.
#Because I don't know how to tell R to draw a segment with a given slope and starting point, I have to give
@Protonk
Protonk / PCA Animation.R
Created December 21, 2010 19:37
transform some bivariate normals about their axes of variation
library(MASS)
sigma.test<-matrix(c(8,3,5,8),2,2)
biv.norm<-mvrnorm(200,mu=rep(0, 2),Sigma=sigma.test)
biv.pca<-prcomp(biv.norm,center=FALSE)
#An array is just a high dimensional matrix. So this is 2x2x50. There is nothing special about the number 50, I just picked that to smooth the
#animation.
#The numbers in the sequence make transition.ar[,,1] the 2x2 identity matrix and transition.ar[,,50] the final transformed rotation matrix from
@Protonk
Protonk / Genetic hello world.R
Created December 27, 2010 22:12
A bad hash at genetic optimization
target<-utf8ToInt("Hello World")
#####Random Mating with Mutation. No selective breeding at all.
#The generation array stores each generation (duh)
generation<- array(0,c(100,11,100))
generation[,,1]<- t(replicate(100, round(runif(11,0,255))))
#Two for loops is generally an R no-no, but since the computation is explicitly serial, I wasn't sure how to get around it
#However I was able to put most of the random number generation in the outer loop and use indices to pick and choose which
#ones I wanted.
bisect <- function(f, u, v, eps){
if ( f(u) * f(v) > 0)
stop (" error, select another value for u or v...")
if ( f(u) < 0) {
u1 <- u
v1 <- v
} else {
u1 <- v
v1 <- u
}
@Protonk
Protonk / ordinal ranking.R
Created January 1, 2011 20:02
Differences in ordinal ranking between absolute deviation and squared deviation
#Very simple fitness function. Works sensibly only on 1 dimensional numeric vectors, but that is
#ok for this example
fitness.hw<- function(tar.hw,meas.hw,type=c("abs","sqr")) {
diff.hw<-tar.hw-meas.hw
if (type=="abs") {
rowSums(abs(diff.hw))
}
else if (type=="sqr") {
rowSums(diff.hw^2)
}
@Protonk
Protonk / launchlog.R
Created January 14, 2011 21:14
Preprocessing and simple display of launches
####Pre-processing of launch log before sending it to Google Refine####
#The vector of widths can be grabbed a number of ways: by hand, with regular expressions, or from a file explaining the structure
launch.width<-c(12,27,15,31,26,9,24,16,9,12,5,19)
#Your file path will vary, of course
launch.path<-c("/Users/protonk/Dropbox/Past Econ Work/Launch vehicle market/Complete text launch log.txt")
launch.names<- c("Launch","Date","COSPAR","PL.Name","Orig.PL.Name","SATCAT","LV.Type","LV.SN","Site","Subsite","Success","Ref")
#Read in FWF and export a .csv
launch.test<-read.fwf(launch.path,launch.width,header=FALSE,skip=2,as.is=TRUE,col.names=launch.names)
write.csv(launch.test,file="~/R/launch.csv")
##########
@Protonk
Protonk / Stationarity.R
Last active April 6, 2020 15:00
Simple, non-technical stationary/non-stationary process examples
#ADF tests for both come from the function afd.test() in tseries. The means are just simple means over the ranges I chose.
####Stationary Process####
set.seed(42)
y<- w<- rnorm(1000)
for (t in 2:1000) {
y[t]<- 0.9*y[t-1]+w[t]
}
#Vector for line segments.
xy.mat<- cbind(c(0,450,225,750,600,1000),c(6.5,6.5,7.5,7.5,8.5,8.5))
##Plot the series, with annotations for means and the Dicky-Fuller Test##
@Protonk
Protonk / Bootstrap truncated.R
Created March 2, 2011 04:59
Truncated Normal with boostrapping
####Playing with Truncated Normals####
set.seed(42)
x<-rnorm(200,0,4)
y<-x[x>-2]
plot(density(y),xlim=c(-16,16),main="Truncated Normal Distribution",xlab="x")
lines(density(x),col=2)
segments(x0=-2,y0=0,y1=0.15,lty=2,lwd=2)
legend(6,0.125,legend=c("Original","Truncated","Truncation\nPoint"),col=c(2,1,1),lty=c(1,1,2));
@Protonk
Protonk / earnings.R
Created March 7, 2011 22:47
Age-earnings profile
#loads the packae which allows us to import state files
library(foreign)
#The file can also be a url to the data on your website, obviously if you load it
#on your own PC the path will be different. :)
adams<-read.dta(file="~/Downloads/assignment1.dta")
#Removes columns which aren't used and moves the eph column
adams<-adams[,c(7,2:6,8,9,14,15)]
#This next section converts the binary factors in the state file to factors as
#R is used to seeing them, one column with a specific level for each (hschool, college)