@stephlocke
Created March 15, 2018 21:11
a recipes 📦 workflow
library(recipes)
library(rsample) # provides initial_split(), training() and testing()
library(tidyverse)
library(AppliedPredictiveModeling)
data(AlzheimerDisease)
predictors %>%
  cbind(diagnosis) ->
  alzheimers
alzheimers %>%
  mutate(male = factor(male),
         Genotype = fct_infreq(fct_lump(Genotype, n = 3))) ->
  alzheimers
# split data
alzheimers %>%
  initial_split(prop = .9) ->
  alz_split
alz_split %>%
  training() ->
  alz_train
alz_split %>%
  testing() ->
  alz_test
# scaling / basic preprocessing
# (left unprepped here so that every step is estimated in the single prep() call below)
alz_train %>%
  recipe(diagnosis ~ ., data = .) %>%
  step_center(all_numeric()) %>%
  step_scale(all_numeric()) ->
  alz_preprocess
# feature reduction
# (in current releases step_upsample() is provided by the themis package, not recipes)
alz_preprocess %>%
  step_corr(all_numeric()) %>%
  step_nzv(all_predictors()) %>%
  step_zv(all_predictors()) %>%
  step_pca(all_numeric()) %>%
  step_upsample(diagnosis) %>%
  prep(training = alz_train, retain = TRUE) ->
  alz_preprocess
# prep training
alz_preprocess %>%
  juice(all_outcomes(), all_predictors()) ->
  alz_train_p
# prep test
alz_preprocess %>%
  bake(alz_test) ->
  alz_test_p
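# quick sanity check (a sketch added for illustration, not part of the original workflow)
# after upsampling, the diagnosis classes in the training set should be roughly balanced
alz_train_p %>%
  count(diagnosis)
# the baked test set should carry the same columns as the processed training set,
# so this is expected to come back empty
setdiff(names(alz_train_p), names(alz_test_p))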
@rquintino

Thanks Steph, awesome tip! I also needed to add library(rsample) for initial_split().