set.seed(123)
# Number of predictors
P <- 10
# Sample size
N <- 10000
# Generate the variance-covariance matrix as a Toeplitz matrix
# (first row is P/P, (P-1)/P, ..., 1/P)
V <- toeplitz((P:1) / P)
# Relative size of coefficients

authors: Florian van Leeuwen, Thom Benjamin Volker, Gerko Vink and Stef van Buuren

The development and application of (clinical) prediction models is complicated by missing data, as most analysis techniques do not readily allow for incorporating missing values. Consequently, model parameters cannot be estimated, and predictions cannot be calculated. Ad-hoc fixes to deal with missing data, such as listwise deletion or mean imputation, work only under restrictive conditions, such as data missing completely at random (MCAR), which are unlikely to hold in practice. A more principled approach to dealing with missing data is multiple imputation (MI). Many studies have confirmed that MI allows one to obtain unbiased and efficient estimates of model parameters under fairly general conditions. Practitioners in (clinical) prediction commonly consider single imputation to be sufficient. The present study compares single versus multiple imputation for making point estimates (predictions) and prediction interval
The following code shows how to obtain density ratio estimates that are regularized to
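pred_adapt <- function(nu, de, ce, sigma, lambda) {
  # Gaussian kernel matrices between the numerator/denominator samples and the centers
  Knu <- densityratio::distance(as.matrix(nu), as.matrix(ce), TRUE) |> kernel_gaussian(sigma)
  Kde <- densityratio::distance(as.matrix(de), as.matrix(ce), TRUE) |> kernel_gaussian(sigma)
  # Averaged cross-product of the denominator kernel and column means of the numerator kernel
  Kdede <- crossprod(Kde) / nrow(Kde)
  Knunu <- colMeans(Knu)
  # Solve the ridge-regularized linear system for all coefficients except the first
  alpha <- solve(Kdede[-1, -1] + lambda * diag(ncol(Kde) - 1), Knunu[-1] - Kdede[1, -1])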
Thom Benjamin Volker
In this notebook, we explore properties of divergence-based two-sample
testing, using the methods employed in the densityratio package
(Volker et al., 2023). We consider three settings: two groups are drawn
from the same distribution, two groups are drawn from distributions with
different means, but with the same variance-covariance matrix, and two
groups are drawn from distributions with the same means but different
covariances. Subsequently, we compare the performance of the
mice.impute.pmm() can give counterintuitive results when using type 1 matching; see, for example, the following scenario.
The following code is provided by Stef van Buuren, inspired by Templ (2023), Visualization and Imputation of Missing Values.
library(mice)
#>
#> Attaching package: 'mice'
#> The following object is masked from 'package:stats':
#>
#> filter