Skip to content

Instantly share code, notes, and snippets.

library(tidyverse)

data <- tibble(
  a = c(0, 0, 1),
  b = c(0, 1, 0),
  c = c(1, 1, 1)
)

data_sums &lt;- data %&gt;% 
@brshallo
brshallo / bootstrap-mean-rsample.R
Created April 27, 2021 14:39
Initial approach to bootstrapping a mean value, but is much slower than just using base::sample() so used that...
library(tidymodels)
### MUCH faster computationally to use base R `sample()` for this step... so did not use this approach
resamples <- rsample::bootstraps(preds, 5000)
avg_diff_sample <- function(split){
analysis(split) %>%
summarise(diff = mean(diff_abs_resids)) %>%
pull(diff)
}
library(tidyverse)
library(lubridate)

date <- ymd(20200101) + months(1:7)
company <- c("a", "b")

sim_rw <- function(start = 0, n = 7, mean = 1){
  arima.sim(model = list(order = c(0, 1, 0)), n = n - 1, mean = mean) %>% 
    as.numeric() %>% 
@brshallo
brshallo / undo-yeo-johnson.md
Created April 13, 2021 15:03
Useful for undoing transformation applied to predictions. See tidymodels/recipes#264 (https://github.com/tidymodels/recipes/issues/264) for discussion, though was closed without solution.
library(tidymodels)

rec_prep <- recipe(cty ~ ., data = mpg) %>% 
  step_YeoJohnson(cty) %>% 
  prep(data = mpg)

yj_estimate <- rec_prep %>% 
  tidy(1) %>% 
  pluck("value", 1)
@brshallo
brshallo / predict-interval-boot-only.R
Last active July 26, 2021 17:33
Prep interval and then produce prediction interval on a new data set. See thread: https://community.rstudio.com/t/prediction-intervals-with-tidymodels-best-practices/82594/15 also see prior set-up: https://gist.github.com/brshallo/3db2cd25172899f91b196a90d5980690 . The approach at this gist is similar but uses the bootstrapped residuals to produ…
library(tidyverse)
library(tidymodels)
# Control function used as part of `prep_interval()`
ctrl_fit_recipe <- function(x){
output <- list(fit = workflows::pull_workflow_fit(x),
recipe = workflows::pull_workflow_prepped_recipe(x))
c(output, list(resids =
bind_cols(
@brshallo
brshallo / source_rmd.R
Last active April 1, 2021 07:10 — forked from noamross/source_rmd.R
Source an RMD file
#' Source the R code from an knitr file, optionally skipping plots
#'
#' @param file the knitr file to source
#' @param skip_plots whether to make plots. If TRUE (default) sets a null graphics device
#'
#' @return This function is called for its side effects
#' @export
source_rmd = function(file, skip_plots = TRUE) {
temp = tempfile(fileext=".R")
knitr::purl(file, output=temp)
@brshallo
brshallo / source-rmd-chunks.r
Last active December 7, 2022 13:15
Function for sourcing individual or multiple chunks from an RMD document
library(magrittr)
library(stringr)
library(readr)
library(purrr)
library(glue)
library(knitr)
source_rmd_chunks <- function(file, chunk_labels, skip_plots = TRUE, output_temp = FALSE){
temp <- tempfile(fileext=".R")
library(tidyverse)

sample_n_of <- function(data, size, ...) {
  dots <- quos(...)
  
  group_ids <- data %>% 
    group_by(!!! dots) %>% 
    group_indices()
  
@brshallo
brshallo / rmse-interval.md
Last active February 13, 2024 02:12
The RMSE intervall method is based on the solution suggested on cross validated: https://stats.stackexchange.com/a/78318/193123
library(palmerpenguins)
library(dplyr)

#' @param rmse Root mean squared error on your sample
#' @param df Degrees of Freedom in your model. In this case it should be the
#'   same as the number of observations in your sample.
rmse_interval <- function(rmse, deg_free, p_lower = 0.025, p_upper = 0.975){
  tibble(.pred_lower = sqrt(deg_free / qchisq(p_upper, df = deg_free)) * rmse,
         .pred_upper = sqrt(deg_free / qchisq(p_lower, df = deg_free)) * rmse)
library(palmerpenguins)
library(tidyverse)

penguins <- palmerpenguins::penguins %>% 
  na.omit()

mod <- lm(body_mass_g ~ bill_length_mm + bill_depth_mm + sex, data = penguins)

bind_cols(