Simon P. Couch simonpcouch
simonpcouch / chiburbs.R
Created June 22, 2023 15:58
Chicago Suburbs Housing Data
library(tidymodels)
library(tidyverse)
library(stringr)
library(janitor)
library(doMC)
registerDoMC(cores = max(1, parallelly::availableCores() - 1))
# data cleaning --------
# we'd likely just do all this cleaning under the hood and supply
# the `chiburbs` result as the "initial" dataset
# benchmarking the new parsnip release
library(tidymodels)
# with v1.0.2 ------------------------------------------------------------
pak::pkg_install("tidymodels/parsnip@v1.0.2")
num_samples <- 10^(3:7)
num_resamples <- c(5, 10, 20)
nrow <- length(num_samples) * length(num_resamples)
library(tidymodels)
library(cli)

The tune package has machinery to catch and log errors and warnings that occur while evaluating proposed models against resamples.

At the moment, we print those warnings/errors out one-by-one as they appear during evaluation.
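
To make the catch-and-log idea concrete, here is a minimal base R sketch. It is not tune's actual implementation: `catch_and_log()` and `fit_resample()` are hypothetical names invented for illustration, and the simulated warnings and errors stand in for whatever a real model fit might signal. The sketch collects conditions per resample and then tallies them, as one alternative to printing each as it appears.

```r
# Hypothetical helper (not part of tune): evaluate `expr`, recording any
# warnings and errors in a log instead of letting them print immediately.
catch_and_log <- function(expr) {
  log <- character(0)
  result <- withCallingHandlers(
    tryCatch(
      expr,
      error = function(e) {
        # An error ends evaluation; record it and return NULL as the result.
        log <<- c(log, paste0("error: ", conditionMessage(e)))
        NULL
      }
    ),
    warning = function(w) {
      # A warning is recorded and muffled so evaluation can continue.
      log <<- c(log, paste0("warning: ", conditionMessage(w)))
      invokeRestart("muffleWarning")
    }
  )
  list(result = result, log = log)
}

# Hypothetical stand-in for fitting a model on resample `i`.
fit_resample <- function(i) {
  if (i %% 2 == 0) warning("prediction from a rank-deficient fit")
  if (i == 5) stop("model failed to converge")
  i^2
}

logs <- lapply(1:6, function(i) catch_and_log(fit_resample(i)))

# Tally identical messages across resamples rather than printing one-by-one.
all_messages <- unlist(lapply(logs, `[[`, "log"))
summary_tbl <- table(all_messages)
print(summary_tbl)
```

Nesting `tryCatch()` inside `withCallingHandlers()` lets warnings be logged without aborting the fit, while an error still stops that one resample only.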

Proposed modifications to the internals of tune:::check_grid.


  tune_tbl <- tune_args(workflow)
  tune_params <- tune_tbl$id

  if (nrow(pset) == 0L) {
    msg <- c("!" = "No tuning parameters have been detected; performance will be
 evaluated using the resamples with no tuning.")
library(tidymodels)
library(stacks)
library(bonsai)

tidymodels_prefer()

# regression ------------------------------------------------------------------
reg_bt <-
  boost_tree(mtry = tune()) %>%

auditing one-to-many join warnings

With dev dplyr, we now see:

# pak::pak("tidyverse/dplyr")
library(parsnip)

mod <- 
 linear_reg(engine = 'glmnet', penalty = tune(), mixture = 1) %>%
> devtools::test()
ℹ Loading agua
ℹ Testing agua
✔ | F W S  OK | Context
⠏ |         0 | misc
openjdk version "11.0.15" 2022-04-19
OpenJDK Runtime Environment Homebrew (build 11.0.15+0)
OpenJDK 64-Bit Server VM Homebrew (build 11.0.15+0, mixed mode)
✔ |         6 | misc [3.3s]
  |======================================================================| 100%

This issue came up in a conversation with @mine-cetinkaya-rundel about teaching introductory stats / modeling courses using tidymodels. I feel that, in some ways, parsnip's guardrails around augment() make teaching broom's principles fussier than it ought to be. Fitting a model and passing it to each tidier:

library(tidyverse)
library(tidymodels)
library(palmerpenguins)

penguins <- drop_na(penguins)

penguins_tr <- penguins[1:200,]
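
As a rough illustration of the contrast described above, the following sketch runs the three broom verbs on a plain `lm()` fit and then on a parsnip fit. It uses the built-in `mtcars` data rather than the penguins data, which is an assumption made only to keep the example self-contained. The parsnip call reflects my understanding that `augment()` on a parsnip model fit expects a `new_data` argument.

```r
library(broom)
library(parsnip)

# Plain lm(): all three tidiers work on the fitted object alone.
lm_fit <- lm(mpg ~ wt, data = mtcars)
coefs <- tidy(lm_fit)    # one row per model term
stats <- glance(lm_fit)  # one row of model-level summaries
obs   <- augment(lm_fit) # training data plus .fitted, .resid, and so on

# parsnip fit of the same model: augment() is given the data explicitly.
parsnip_fit <- fit(linear_reg(), mpg ~ wt, data = mtcars)
preds <- augment(parsnip_fit, new_data = mtcars)
```

The extra `new_data` argument is the part that can feel fussy when the goal is to teach `tidy()`, `glance()`, and `augment()` as a uniform trio.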

An issue was recently filed in stacks about the object size of a stack increasing on save and reload.

Starting out with a quick reprex:

library(tidymodels)
library(modeldata)
library(readr)
#>
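
The kind of before-and-after size check described above can be sketched in base R. This is not the reprex from the stacks issue: it uses an `lm()` fit on `mtcars` as a stand-in object (an assumption for illustration) and makes no claim about which direction the size moves.

```r
# Stand-in object; the original issue concerns a stacks model stack.
fit <- lm(mpg ~ wt, data = mtcars)

size_before <- object.size(fit)

# Round-trip the object through an .rds file.
path <- tempfile(fileext = ".rds")
saveRDS(fit, path)
reloaded <- readRDS(path)
unlink(path)

size_after <- object.size(reloaded)

# Compare the two sizes in bytes.
sizes <- c(before = as.numeric(size_before), after = as.numeric(size_after))
print(sizes)
```

Note that `object.size()` does not count the environments attached to an object, so a tool such as `lobstr::obj_size()`, which does include them, may be a better fit when captured environments are the suspected cause.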