@Protonk
Created December 21, 2010 19:28
Univariate regression with some visuals to get the idea of what a transformation looks like
x<-c(1:50)
y<-x+rnorm(50,sd=10)
plot(x,y,pch=20,main="Distance from Fitted Line")
abline(lm(y~x),col="blue",lwd=2)
segments(x,predict(lm(y~x)),y1=lm(y~x)$model[,1])
#This bit took longer to do than I care to admit. Basically you want distance from the data point to the line
#as an adjacent side to the triangle that would have the vertical distance as the hypotenuse.
#Because I don't know how to tell R to draw a segment with a given slope and starting point, I have to give
#the start and end points. In order to get that I compute the rise and run via similar triangles.
#x.shift and y.shift do most of the work in a vectorized computation.
#y.fit holds the fitted values, so abs(y-y.fit) is the vertical distance from each point to the line.
y.fit<-predict(lm(y~x))
x.shift<- sin(atan(-1/coef(lm(y~x))[2]))*cos(atan(-1/coef(lm(y~x))[2]))*abs(y-y.fit)
y.shift<- sin(atan(-1/coef(lm(y~x))[2]))*sin(atan(-1/coef(lm(y~x))[2]))*abs(y-y.fit)
#The for loop is here because the sign of the rise/run has to flip depending on whether
#the data point is above or below the regression line.
for (i in seq_along(x)) {
  if (y[i]>y.fit[i]) {
    x.shift[i]<- -x.shift[i]
    y.shift[i]<- -y.shift[i]
  }
}
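#A sketch of the same rise/run computed in closed form, offered as a cross-check rather than
#as the method used above. With slope b and signed residual r = y - fitted, similar triangles
#put the foot of the perpendicular at an offset of (b*r/(1+b^2), -r/(1+b^2)) from the data point,
#so the sign takes care of itself and no loop is needed.
#perp.shift is a made-up helper name for this illustration.
perp.shift <- function(x, y) {
  fit <- lm(y~x)
  b <- unname(coef(fit)[2])   #slope of the fitted line
  r <- unname(resid(fit))     #signed vertical distance, positive above the line
  list(x=b*r/(1+b^2), y=-r/(1+b^2))
}
#Possible usage: alt<-perp.shift(x,y), then compare alt$x with x.shift and alt$y with y.shift.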
plot(x,y,pch=20,main="Transformed Distance from Fitted Line")
abline(lm(y~x),col="blue",lwd=2)
segments(x+x.shift,y+y.shift,x,y)
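#The segments are perpendicular to the line in data units, but they only look perpendicular
#on screen when both axes use the same scale. A minimal sketch that redraws the plot with asp=1,
#recomputing the offsets from the slope and signed residuals so it depends only on x and y.
#draw.perp is a made-up helper name for this illustration.
draw.perp <- function(x, y) {
  fit <- lm(y~x)
  b <- unname(coef(fit)[2])
  r <- unname(resid(fit))
  plot(x,y,pch=20,asp=1,main="Transformed Distance from Fitted Line (equal axes)")
  abline(fit,col="blue",lwd=2)
  #Each segment runs from the data point to the foot of the perpendicular on the line.
  segments(x,y,x+b*r/(1+b^2),y-r/(1+b^2))
  invisible(fit)
}
#Possible usage: draw.perp(x,y)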