@alexhallam
Created June 18, 2018 18:38
Prediction workflow with train:test split
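library(dplyr)   # mutate(), filter(), group_by(), summarise()
library(tidyr)   # nest(), unnest()
library(purrr)   # map(), map2()
library(broom)   # tidy(), glance()
library(modelr)  # add_predictions()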
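# Data setup: randomly label roughly 85% of rows "train" and the rest "test"
set.seed(42)  # arbitrary seed (not in the original) so the split is reproducible
my_iris <- iris %>%
  mutate(train_test = ifelse(rbinom(n = n(), size = 1, prob = 0.85) == 1,
                             "train", "test"))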
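# Optional sanity check (not part of the original gist): because the split is
# random, a species can end up with very few test rows, so it may be worth
# counting rows per species and split before fitting. `split_counts` is a
# name introduced here for illustration only.
split_counts <- my_iris %>%
  dplyr::count(Species, train_test)
split_counts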
# Model function: fit a linear model on the training rows only
model_by_group <- function(df) {
  lm(Sepal.Length ~ Sepal.Width, data = df %>% filter(train_test == "train"))
}
# Store the model, predictions, coefficients, and model metrics in one data frame
model_df <- my_iris %>%
  group_by(Species) %>%
  nest() %>%
  mutate(model  = map(data, model_by_group),
         pred   = map2(data, model, modelr::add_predictions),
         coeffs = map(model, broom::tidy),
         glance = map(model, broom::glance))
# View the model coefficients
model_df %>%
  unnest(coeffs)
# View model metrics such as r.squared
model_df %>%
  unnest(glance)
# View predicted values on the test set
model_df %>%
  unnest(pred) %>%
  filter(train_test == "test")
# Mean absolute error (MAE) on the test set, per species
model_df %>%
  unnest(pred) %>%
  filter(train_test == "test") %>%
  mutate(ae = abs(pred - Sepal.Length)) %>%
  group_by(Species) %>%
  summarise(mae = mean(ae))