Last active
November 6, 2022 00:46
-
-
Save tomhopper/8c204d978c4a0cbcb8c0 to your computer and use it in GitHub Desktop.
Functions that return the PRESS statistic (predictive residual sum of squares) and predictive r-squared for a linear model (class lm) in R
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# Example usage for model_fit_stats.R | |
# Set up some data | |
x <- seq(from=0, to=10, by=0.5) | |
y1 <- 2*x + rnorm(length(x)) | |
# We want to compare two different linear models: | |
my.lm1 <- lm(y1 ~ sin(x)) | |
my.lm2 <- lm(y1 ~ x) | |
# We will use plyr for this. | |
library(plyr) | |
# Now call model_fit_stats() for each lm model that | |
# we have, using ldply. This returns the results in | |
# a data frame that is easily used for further | |
# calculations. | |
ldply(list(my.lm1, my.lm2), model_fit_stats) | |
# adply() also works, though it should be less robust | |
# than ldply(). | |
#adply(list(my.lm1, my.lm2), 1, model_fit_stats) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#' @title Model Fit Statistics | |
#' @description Returns lm model fit statistics R-squared, adjucted R-squared, | |
#' predicted R-squared and PRESS. | |
#' Thanks to John Mount for his 6-June-2014 blog post, R style tip: prefer functions that return data frames" for | |
#' the idea \link{http://www.win-vector.com/blog/2014/06/r-style-tip-prefer-functions-that-return-data-frames} | |
#' @return Returns a data frame with one row and a column for each statistic | |
#' @param linear.model A \code{lm()} model. | |
model_fit_stats <- function(linear.model) { | |
r.sqr <- summary(linear.model)$r.squared | |
adj.r.sqr <- summary(linear.model)$adj.r.squared | |
pre.r.sqr <- pred_r_squared(linear.model) | |
PRESS <- PRESS(linear.model) | |
return.df <- data.frame(r.squared = r.sqr, adj.r.squared = adj.r.sqr, pred.r.squared = pre.r.sqr, press = PRESS) | |
return(return.df) | |
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#' @title Predictive R-squared | |
#' @author Thomas Hopper | |
#' @description returns the predictive r-squared. Requires the function PRESS(), which returns | |
#' the PRESS statistic. | |
#' @param linear.model A linear regression model (class 'lm'). Required. | |
#' | |
pred_r_squared <- function(linear.model) { | |
#' Use anova() to get the sum of squares for the linear model | |
lm.anova <- anova(linear.model) | |
#' Calculate the total sum of squares | |
tss <- sum(lm.anova$'Sum Sq') | |
# Calculate the predictive R^2 | |
pred.r.squared <- 1-PRESS(linear.model)/(tss) | |
return(pred.r.squared) | |
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#' @title PRESS | |
#' @author Thomas Hopper | |
#' @description Returns the PRESS statistic (predictive residual sum of squares). | |
#' Useful for evaluating predictive power of regression models. | |
#' @param linear.model A linear regression model (class 'lm'). Required. | |
#' | |
PRESS <- function(linear.model) { | |
#' calculate the predictive residuals | |
pr <- residuals(linear.model)/(1-lm.influence(linear.model)$hat) | |
#' calculate the PRESS | |
PRESS <- sum(pr^2) | |
return(PRESS) | |
} |
You're welcome! I'm glad it's useful.
Love this! You totally save my group project!!!
Hello
in one result, the press r-square showed NA. what does that mean? does it have any relevance? it is a regression with interaction. the result of the regression was as follows.
R-squared: 0.141 Adjusted R-squared: 0.133 PRESS R-squared: NA
BASIC ANALYSIS
-- Estimated Model for MDESND
Estimate Std Err t-value p-value Lower 95% Upper 95%
(Intercept) 3.858 0.050 77.449 0.000 3.760 3.956
c_MGESTDIV 0.210 0.059 3.546 0.000 0.094 0.327
CORPELENão Branco -0.046 0.062 -0.742 0.459 -0.168 0.076
c_MGESTDIV:CORPELENão Branco 0.053 0.073 0.722 0.471 -0.091 0.196
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Thanks for posting this, very helpful :)