Skip to content

Instantly share code, notes, and snippets.

@brshallo
brshallo / effects_coding_check.md
Created February 27, 2019 01:30
Check that effects coding works as I expect it to -- coefficient represents difference from mean
# traditional `contr.sum` does not name levels, so use function from `car` package
library(car)
library(tidyverse)

param <- getOption("contrasts")
go_deviance <- param
go_deviance["unordered"] <- "contr.Sum"
options(contrasts = go_deviance)
@brshallo
brshallo / prep_training.md
Last active February 27, 2019 04:27
Change data used in `prep` in recipes package. If you specify the training argument -- this becomes the dataset used to "prep" the data -- this data also is automatically retained. (Otherwise, row 10 would not have been overwritten to "other".
library(tidyverse)
library(recipes)

tib_a <- tibble(x = c(rep("a", 9), "b")) %>% 
  mutate(rand = rnorm(n()))

tib_b <- tibble(x = c(rep("a", 90), "b")) %>% 
  mutate(rand = rnorm(n()))
@brshallo
brshallo / recipes_effects_coding_check.md
Created February 27, 2019 15:59
Check if recipes are working with deviance coding the way I would expect. Seems that they are (thought I'd seen some funny behavior within a project, but perhaps an illusion or other error).
library(dplyr)
library(tibble)
library(recipes)
library(car)

param <- getOption("contrasts")
go_deviance <- param
# traditional `contr.sum` does not name levels, so use function from `car` package
go_deviance["unordered"] <- "contr.Sum"
@brshallo
brshallo / extract_coefs_effects_coding.md
Last active August 28, 2020 13:01
Given lm object, extract the high level variable, the levels, and the reference level, as well as the estimate of the impact for this variable (assuming using effects coding).
library(tidyverse)
library(broom)
library(car)

param <- getOption("contrasts")
go_deviance <- param
# traditional `contr.sum` does not name levels, so use function from `car` package
go_deviance["unordered"] <- "contr.Sum"
@brshallo
brshallo / step_other_n_mult_example.md
Created February 28, 2019 19:58
Hack for having consistent sample size for step_other.
library(recipes)
library(tidyverse)

diamonds_nested <- diamonds %>% 
  group_by(cut) %>% 
  nest() %>% 
  mutate(recipes = map(data, ~recipe(price ~ clarity + color + carat, data = .x)))

diamonds_nested %&gt;% 
@brshallo
brshallo / assign_in_index.R
Created April 25, 2019 03:45
Two versions of making pipeable assign in index functions
library(tidyverse)
# this is difficult to read, but I was proud of myself using `lobstr::ast` to figure it out so wanted to save it.
assign_in_index <- function(start_vec, new_vals, num_index = 1:3) {
`<-`(`[`(start_vec[num_index]), new_vals)
start_vec
}
# this is the more sensible/readable version
assign_in_index <- function(start_vec, new_vals, num_index = 1:3) {
@brshallo
brshallo / ch25_solutions.Rmd
Created May 24, 2019 16:46
Solutions to chapter 25 of R for Data Science. Shared as example for Rstudio-git-Avecto problem
---
title: "Many models"
date: "`r paste('Last updated: ', format(Sys.time(), '%Y-%m-%d'))`"
author: "Bryan Shalloway"
output:
github_document:
toc: true
toc_depth: 3
html_document:
toc: true
@brshallo
brshallo / simulate_stlm_example.md
Last active July 21, 2019 19:27
Modified version of forecast.stlm that uses `simulate` rather than `forecast` on the `ets` component of the model. Is insufficient as does not capture variability in seasonality.
library(forecast)
library(tidyverse)

# hyjack `forecast.stlm` and simulate rather than forecast `ets` component
# (Insufficient -- likely only partially capture variability)
devtools::source_gist("https://gist.github.com/brshallo/4b93d0cf48937da6de06ffcffaed2b57")

set.seed(1234)
ts_test <- ts(rnorm(9, 0, 3),
@brshallo
brshallo / simulate_stlm_trend.R
Created July 21, 2019 19:25
Simulate trend component of `stlm` forecast object.
simulate_stlm <- function (object, h = 2 * object$m,
lambda = object$lambda, biasadj = NULL, newxreg = NULL,
allow.multiplicative.trend = FALSE, ...)
{
if (!is.null(newxreg)) {
if (nrow(as.matrix(newxreg)) != h) {
stop("newxreg should have the same number of rows as the forecast horizon h")
}
}
seasonal.periods <- attributes(object$stl)$seasonal.periods
@brshallo
brshallo / boot_sim_stlm.md
Last active July 21, 2019 23:25
Inadequate method for producing simulations off of `stlm` objects

General approach to pseudo simulation:
Take bootstrap sample --> build model --> refit model to original data --> forecast refitted model though use simulate() on ets/trend component of forecast.

library(tidyverse)
library(forecast)

# Function takes bootstrapped sample --> fits model --> refits model to original `y`
boot_mod_refit <- function(ts, fun = stlm, ...){
  bld.mbb.bootstrap(ts, 2)[[2]] %>%