Thom Volker (thomvolker)

set.seed(123)

# Number of predictors
P <- 10
# Sample size
N <- 10000
# Generate the variance-covariance matrix as a Toeplitz matrix
V <- toeplitz(P:1/P)
# Relative size of coefficients
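
A plausible continuation of this setup (a sketch only; the snippet is truncated here, and the names b, X, and y are illustrative) draws the predictors from a multivariate normal with covariance V and builds a linear outcome:

b <- P:1 / P                         # coefficients decay linearly in size
X <- MASS::mvrnorm(N, rep(0, P), V)  # predictors drawn from N(0, V)
y <- X %*% b + rnorm(N)              # linear outcome with unit error variance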

A slightly modified estimice() and .norm.draw()

Thom Benjamin Volker

04-03-2025

An attempt at a faster estimice() function for the R package mice.

devtools::load_all(path = "C:/Users/5868777/surfdrive/Documents/mice")
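
The gist body is not shown here, but the general idea of speeding up estimice(), which repeatedly computes least-squares estimates during imputation, can be sketched with base R's .lm.fit(), which avoids the formula and model.frame overhead of lm(). A hypothetical illustration, not the gist's actual code:

fast_ols <- function(x, y) {
  # QR-based least squares on the raw design matrix; no formula parsing
  fit <- .lm.fit(cbind(1, x), y)
  list(coef = fit$coefficients, resid = fit$residuals)
}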

TabPFN in R with reticulate

Generate some example data.

X1 <- runif(200, 0, 10)
X2 <- sin(X1) + rnorm(200, 0, 0.5)
Y <- 3 + 0.5 * X1 + X2 + rnorm(200, 0, 1)
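
A minimal sketch of fitting TabPFN on these data via reticulate, assuming the Python package tabpfn is installed in the active environment and exposes a scikit-learn-style TabPFNRegressor (names and arguments are assumptions, not verified against a specific tabpfn version):

library(reticulate)
tabpfn <- import("tabpfn")           # assumes tabpfn is importable

X <- cbind(X1, X2)
fit <- tabpfn$TabPFNRegressor()
fit$fit(X, Y)                        # scikit-learn-style fit on the R matrix
preds <- fit$predict(X)              # in-sample predictions, for illustration
head(preds)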

Prediction intervals with missing data

Authors: Florian van Leeuwen, Thom Benjamin Volker, Gerko Vink, and Stef van Buuren

The development and application of (clinical) prediction models is complicated by missing data, as most analysis techniques cannot readily incorporate missing values. Consequently, model parameters cannot be estimated and predictions cannot be calculated. Ad-hoc fixes for missing data, such as listwise deletion or mean imputation, work only under limited circumstances, such as data missing completely at random (MCAR), which are unlikely to hold in practice. A more principled approach to dealing with missing data is multiple imputation (MI). Many studies have confirmed that MI yields unbiased and efficient estimates of model parameters under fairly general conditions. Practitioners in (clinical) prediction nevertheless commonly consider single imputation to be sufficient. The present study compares single versus multiple imputation for making point predictions and constructing prediction intervals.
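
A minimal sketch of the multiple-imputation workflow with mice, using its built-in nhanes data and a simple linear model (illustrative only; not the study's actual models):

library(mice)
imp  <- mice(nhanes, m = 5, printFlag = FALSE)  # five imputed data sets
fits <- with(imp, lm(chl ~ age + bmi))          # fit the model per imputation
pool(fits)                                      # pool estimates via Rubin's rules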

@thomvolker
thomvolker / using-google-colab-with-r-and-tensorflow.ipynb
Created December 17, 2024 11:00
Using Google Colab with R and tensorflow.ipynb

Can we get away with single imputation when the goal is to obtain well-calibrated prediction intervals?

Of course not! 😁

library(foreach)

set.seed(123)

nsim <- 200
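
A hypothetical skeleton of such a simulation: generate data with MCAR missingness, impute once versus many times, and record 95% prediction-interval coverage. The data-generating mechanism and the naive pooling rule (averaging interval bounds across imputations) are illustrative choices, not the gist's actual code:

library(mice)

coverage <- foreach(i = seq_len(nsim), .combine = rbind) %do% {
  x <- rnorm(100)
  y <- x + rnorm(100)
  x[rbinom(100, 1, 0.3) == 1] <- NA                   # MCAR missingness in x
  ynew <- rnorm(1)                                    # new case at x = 0
  sapply(c(single = 1, multiple = 20), function(m) {
    imp <- mice(data.frame(x, y), m = m, printFlag = FALSE)
    pis <- sapply(seq_len(m), function(j) {
      fit <- lm(y ~ x, data = complete(imp, j))
      predict(fit, newdata = data.frame(x = 0), interval = "prediction")
    })
    mean(pis[2, ]) <= ynew && ynew <= mean(pis[3, ])  # interval covers ynew?
  })
}
colMeans(coverage)   # empirical coverage per strategy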
@thomvolker
thomvolker / dr-transformations.md
Last active September 27, 2024 10:50

Density ratio estimation is invariant to one-to-one and onto transformations

Thom Benjamin Volker

This holds at least in the univariate case; the multivariate case remains to be considered in a future gist.

N <- 10000000
x1 <- rexp(N, 1)
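
The claim can be checked numerically in the exponential case: under a one-to-one transformation such as the logarithm, the Jacobian term enters both transformed densities and cancels in the ratio. A small check with hypothetical rates 1 and 2:

# True ratio of Exp(1) to Exp(2) densities, evaluated at the samples
r_orig <- dexp(x1, 1) / dexp(x1, 2)

# Density of Z = log(X) when X ~ Exp(rate): f(exp(z)) * exp(z) (Jacobian)
d_log <- function(z, rate) dexp(exp(z), rate) * exp(z)

# At z = log(x1) the Jacobian exp(z) cancels between numerator and
# denominator, so the density ratio is unchanged
z <- log(x1)
r_trans <- d_log(z, 1) / d_log(z, 2)
all.equal(r_orig, r_trans)   # TRUE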

The following code shows how to obtain density ratio estimates that are regularized to $1$ instead of $0$ (or an estimated intercept).
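
To see why this works (a sketch, assuming the first basis function is a constant whose coefficient is fixed at one): write the model as $\hat{r}(x) = 1 + \sum_{j \geq 2} \alpha_j K(x, c_j)$ and plug it into the usual unconstrained least-squares criterion with $\hat{H} = K_{de}^\top K_{de} / n_{de}$, $\hat{h} = K_{nu}^\top \mathbf{1} / n_{nu}$, and ridge penalty $\frac{\lambda}{2} \lVert \alpha_{-1} \rVert^2$. Fixing $\alpha_1 = 1$ and setting the gradient with respect to the free coefficients to zero gives

$$(\hat{H}_{-1,-1} + \lambda I)\,\alpha_{-1} = \hat{h}_{-1} - \hat{H}_{-1,1},$$

which is the linear system solved in the code below; the penalty now shrinks $\hat{r}(x)$ towards $1$ rather than towards $0$.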

pred_adapt <- function(nu, de, ce, sigma, lambda) {
  Knu <- densityratio::distance(as.matrix(nu), as.matrix(ce), TRUE) |> kernel_gaussian(sigma)
  Kde <- densityratio::distance(as.matrix(de), as.matrix(ce), TRUE) |> kernel_gaussian(sigma)

  Kdede <- crossprod(Kde) / nrow(Kde)
  Knunu <- colMeans(Knu)
  alpha <- solve(Kdede[-1, -1] + lambda * diag(ncol(Kde) - 1), Knunu[-1] - Kdede[1, -1])

Divergence-based testing using density ratio estimation techniques

Thom Benjamin Volker

In this notebook, we explore properties of divergence-based two-sample testing, using the methods employed in the densityratio package (Volker et al., 2023). We consider three settings: (1) two groups drawn from the same distribution; (2) two groups drawn from distributions with different means but the same variance-covariance matrix; and (3) two groups drawn from distributions with the same means but different covariances. Subsequently, we compare the performance of the
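
The notebook's code is not reproduced above; its testing logic can be sketched as a permutation test in which the two-sample statistic is an estimated divergence (in the notebook, from the densityratio package). The skeleton below takes the statistic as an argument and is a generic sketch, not the notebook's implementation:

perm_test <- function(x, y, statistic, nperm = 1000) {
  obs <- statistic(x, y)                # statistic on the observed grouping
  pooled <- c(x, y)
  n <- length(x)
  perms <- replicate(nperm, {
    idx <- sample(length(pooled), n)    # re-randomize the group labels
    statistic(pooled[idx], pooled[-idx])
  })
  mean(c(perms, obs) >= obs)            # permutation p-value
}

Any two-sample divergence estimator can be plugged in as the statistic; under the null hypothesis of equal distributions the group labels are exchangeable, which justifies the permutation p-value.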

mice.impute.pmm() can give counterintuitive results when using type 1 matching; see, for example, the following scenario.

The following code was provided by Stef van Buuren, inspired by Templ (2023), Visualization and Imputation of Missing Values.

library(mice)
#> 
#> Attaching package: 'mice'
#> The following object is masked from 'package:stats':
#> 
#>     filter
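
The scenario itself is not included above; for reference, the low-level routine can also be called directly, with matchtype = 1L selecting type 1 matching (donors matched on predictions from the estimated coefficients, the missing case on predictions from a posterior draw). A minimal call on made-up data:

set.seed(1)
y  <- c(rnorm(99), NA)      # one missing outcome
ry <- !is.na(y)             # response indicator
x  <- cbind(rnorm(100))     # a single predictor, as a matrix
mice.impute.pmm(y, ry, x, matchtype = 1L)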