Skip to content

Instantly share code, notes, and snippets.

@Laurae2
Created February 24, 2017 19:03
Show Gist options
  • Save Laurae2/4cb8c569a6e1782dba49e8621b6ef80b to your computer and use it in GitHub Desktop.
Save Laurae2/4cb8c569a6e1782dba49e8621b6ef80b to your computer and use it in GitHub Desktop.
Polynomial Regression demo using R
# Setting up random matrix
set.seed(11111)
data <- data.frame(a = rnorm(n = 15) * 5,
b = rnorm(n = 15) * 3 + 1,
c = rnorm(n = 15) * 2 + 2)
# Setting up the (perfect) linear relationship
preds <- 2 + (data[, 1] * 2) + (data[, 2] * 3) + (data[, 3] * 4) + (data[, 3] ^ 2) + (data[, 1] * data[, 2])
# Setting up polynomial features
columns <- ncol(data)
for (i in 1:columns) {
data[, paste0(colnames(data)[i], "X", colnames(data)[i])] <- data[, i] * data[, i]
for (j in i:columns) {
data[, paste0(colnames(data)[i], "X", colnames(data)[j])] <- data[, i] * data[, j]
}
}
data <- as.matrix(cbind(Intercept = 1, data))
# Plotting data to understand what we have
PerformanceAnalytics::chart.Correlation(data.frame(data, preds = preds))
# If you do not have PerformanceAnalytics package, use this instead
# plot(data.frame(data, preds = preds))
# Getting linear regression coefficients
coefficients <- solve(t(data) %*% data, tol = 1e-30) %*% t(data) %*% preds
# Predicting on data
data %*% coefficients
# Checking data vs real values
plot(x = 1:nrow(data), y = (data %*% coefficients) - preds, xlab = "Obsevation number", ylab = "Residuals", main = "Residuals of predictions")
summary(lm(preds ~ ., data = data.frame(data[, 2:10], preds)))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment