Skip to content

Instantly share code, notes, and snippets.

@Protonk
Created April 27, 2011 01:36
Show Gist options
  • Select an option

  • Save Protonk/943570 to your computer and use it in GitHub Desktop.

Select an option

Save Protonk/943570 to your computer and use it in GitHub Desktop.
another reddit regression
set.seed(20900)
#Same stuff as in the reddit thread.
x1 <- rt(500,df=5)
x2 <- 3 + 5*x1 + rnorm(500,0,3)
x3 <- 6 + rnorm(500)
x4 <- 4 + 4*rt(500,df=15)
w <- rnorm(500,0,10)
#Create the "dependent" variable
y <- 10 + 2*x1 + 7*x2 + 17*x3 + w
#These can be combined into one line by enclosing the object created by
#cbind in as.data.frame() but it doesn't matter
y.mat <- cbind(y,x1,x2,x3,x4)
y.df <- as.data.frame(y.mat)
#Saves the regression with just x1
lm.omit <- lm(y ~ x1, data=y.df)
#Adds x2. The syntax for update() can be a little confusing, but basically the
#period means anything in the old formula is used on either side of the ~ operator
#So we tell update we want to keep y and x1 and add x2
lm.x2 <- update(lm.omit, . ~ . + x2)
#We can plot some of these relationships but because this is a multiple regression
#we have to be careful--plotting y against only 1 explanatory variable can
#get us in trouble
with(y.df, plot(x1, y))
abline(a=coef(lm.omit)[[1]], b=coef(lm.omit)[[2]], col=2, lty=2)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment