When asked to build a machine learning model, use the tidymodels framework to do so. Rather than generating code to preprocess, fit, and evaluate models all at once, carry out the analysis step-by-step:
- Data splitting with rsample. Split into training and testing, and then split the training data into resamples. Do not touch the testing data until the very end of the analysis.
- Feature engineering with recipes. After preprocessing, stop generating and ask for user input. At this point, you can also ask for thoughts on what type of model the user would like to try.
- Resampling models with parsnip and tune. Based on the user's suggestions, decide on a parsnip model and tune important parameters across resamples using `tune_grid()`.
  - Let tidymodels use its default performance metrics and parameter grids and check out its results with `collect_metrics()` and `autoplot()`; do not generate a custom grid in the first `tune_grid()` call.
  - Evaluate against resamples sequentially unless the user
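
The steps above could look roughly like the following sketch. It is illustrative only and rests on several assumptions that are not in the guidance itself: the `ames` housing data from the modeldata package stands in for the user's data, the four predictors are an arbitrary choice, a decision tree with the rpart engine stands in for whatever model the user asks for, and the seed and the 5-fold setup are placeholders. In an actual session each stage would be generated separately, with a pause for user input after the recipe.

```r
library(tidymodels)

set.seed(123)  # arbitrary seed, only so the sketch is reproducible

# Assumption: `ames` (from modeldata) stands in for the user's data.
data(ames, package = "modeldata")

# 1. Data splitting with rsample: train/test split, then resamples
#    drawn from the training set only. The test set is never used here.
ames_split <- initial_split(ames, prop = 0.8)
ames_train <- training(ames_split)
ames_folds <- vfold_cv(ames_train, v = 5)

# 2. Feature engineering with recipes (predictors chosen arbitrarily).
ames_rec <- recipe(
  Sale_Price ~ Gr_Liv_Area + Year_Built + Bldg_Type + Neighborhood,
  data = ames_train
) %>%
  step_other(Neighborhood, threshold = 0.01) %>%
  step_dummy(all_nominal_predictors())

# 3. A parsnip model with parameters marked for tuning.
#    Assumption: the user asked for a decision tree.
tree_spec <- decision_tree(cost_complexity = tune(), tree_depth = tune()) %>%
  set_engine("rpart") %>%
  set_mode("regression")

ames_wflow <- workflow() %>%
  add_recipe(ames_rec) %>%
  add_model(tree_spec)

# 4. First tuning pass: no custom grid and no custom metric set, so
#    tune_grid() falls back to its defaults. No parallel backend is
#    registered, so resamples are evaluated sequentially.
ames_res <- tune_grid(ames_wflow, resamples = ames_folds)

# 5. Inspect the resampled results.
collect_metrics(ames_res)
autoplot(ames_res)
```

Note that `testing(ames_split)` is never called: in line with the first bullet, the held-out data would only be used once a final model has been chosen.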