Last active
February 7, 2022 21:38
-
-
Save vincentarelbundock/eb5284a4afd9e543acbf72bd8a34e8ca to your computer and use it in GitHub Desktop.
Bayesian bootstrap in `R` with `update()`
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
library(fixest) | |
library(data.table) | |
library(insight) | |
bboot_vab <- function(model, reps = 1e3, conf.level = .95, cluster = NULL) { | |
data <- insight::get_data(model) | |
setDT(data) | |
if (!is.null(cluster)) { | |
if (anyNA(data[[cluster]])) { | |
stop("The cluster variable cannot include missing values.") | |
} | |
cluster_idx <- unique(data[[cluster]]) | |
wts <- rexp(reps * length(cluster_idx)) | |
wts <- matrix(wts, ncol = reps) | |
wts <- data.table(cluster_idx, wts) | |
colnames(wts)[1] <- cluster | |
wts <- merge(data, wts, all.x = TRUE, sort = FALSE) | |
wts <- as.matrix(wts[, (ncol(data)+1):ncol(wts)]) } else { | |
wts <- rexp(reps * nrow(data)) | |
wts <- matrix(wts, ncol = reps) | |
} | |
fit <- function(i) { | |
coef(update(model, weights = wts[, i])) | |
} | |
boot_mat <- t(sapply(seq_len(reps), fit)) | |
alpha <- 1 - conf.level | |
ci <- c(alpha / 2, 1 - alpha / 2) | |
out <- t(apply(boot_mat, 2, \(x) c(mean(x), quantile(x, ci)))) | |
return(out) | |
} | |
model <- fepois(Sepal.Length ~ Sepal.Width + Petal.Length | Species, | |
data = iris) | |
bboot_vab(model, cluster = "Species") |
I don't think so, as in line 17 you merge wts
to data. So in effect with (ncol(data)+1):ncol(wts)
you only retain the wts part.
Ah, got it! Read too quickly.
To be fair, this is something I wrote in 10 minutes as a way to understand this thing. I didn't expect anyone to look at it it ;)
Made the change.
:) It does the job!
Follow up question that I have just now been looking into: Any intuition on how to allow for multiway clustering?
Maybe line 12 you would create a unique ID for each combination of your various cluster variables? A row-wise apply(data[[cluster_variables]], 1, paste)
would work, maybe?
Not sure whether that does the trick. I'll have a look sometime, if I find something, I'll post it here!
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Won't
ncol(data)+1
be out of range?