Finding ellipsoidal probabilities under the multivariate normal model

Thom Benjamin Volker

Introduction

Many statistical models impose multivariate normality, for example, for example on the regression parameters in a linear regression model. A question that may arise is how to find the probability that a random vector falls into some convex region $\mathcal{A}$ under this model. That is, we seek to find $$P(\mathbf{x}\in \mathcal{A}) = \int_{\mathcal{A}} f(\mathbf{x}) d\mathbf{x},$$ where $f(\mathbf{x})$ is the density of the multivariate normal distribution. That is, $$f(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^p |\mathbf{\Sigma}|}}\exp \Bigg\{-\frac{1}{2} (\mathbf{x}- \mathbf{\mu})^\top \mathbf{\Sigma}^{-1} (\mathbf{x}- \mathbf{\mu}) \Bigg\},$$ where $p$ denotes the number of dimensions, $\mathbf{\mu}$ is the mean vector, and $\mathbf{\Sigma}$ is the covariance matrix, and $|\cdot|$ denotes the determinant.

Solving $P(x \in \mathcal{A})$ when $\mathcal{A}$ is a hyperrectangle

If the region $\mathcal{A}$ is rectangular, the problem is easily solved using of the shelf tools, such as the pmvnorm function in the mvtnorm package in R. In such circumstances, we seek to find the solution of the integral $$P(\mathbf{x}\in \mathcal{A}) = \frac{1}{\sqrt{(2\pi)^p |\mathbf{\Sigma}|}} \int_{a_1}^{b_1} \int_{a_2}^{b_2} \cdots \int_{a_p}^{b_p} \exp \Bigg\{ -\frac{1}{2} (\mathbf{x}- \mathbf{\mu})^\top \mathbf{\Sigma}^{-1} (\mathbf{x}- \mathbf{\mu}) \Bigg\} d\mathbf{x},$$ where $a_i$ and $b_i$ are the lower and upper bounds of the $i$th dimension of the region $\mathcal{A}$. Genz (1992) shows that this integral can be transformed into the iterated integral $$P(\mathbf{x}\in \mathcal{A}) = (e_1 - d_1) \int_0^1 (e_2 - d_2) \dots \int_0^1 (e_m - d_m) \int_0^1 d \mathbf{w},$$ with $$d_i = \mathbf{\Phi}((a_i - \sum^{i-1}_j c_{i,j} \mathbf{\Phi}^{-1}(d_j + w_j(e_j-d_j)))/c_{i,i})$$ and $$e_i = \mathbf{\Phi}((b_i - \sum^{i-1}_j c_{i,j} \mathbf{\Phi}^{-1}(d_j + w_j(e_j-d_j)))/c_{i,i}).$$ That is, we transform the problem into a series of univariate integrals that can be easily solved using standard univariate integration tools as, for example, pnorm() in R.

The following code implements the approach in R. Note, however, that the current implementations use some additional tricks to increase the accuracy of the program (or reduce the computation time), by changing the order of the variables.

set.seed(123)

# Parameters
P <- 5
rho <- 0.5
S <- rho + (1-rho) * diag(P)
mu <- 1:P

# Integration limits
a <- sample(-(1:P)) - mu
b <- 3 * sample(1:P) - mu

# Test results
mvtnorm::pmvnorm(lower = a, upper = b, mean = 0, sigma = S)

[1] 0.8409671
attr(,"error")
[1] 1.364544e-05
attr(,"msg")
[1] "Normal Completion"

# Convergence limits
maxerr <- 1e-4
Nmax <- 10000

# Cholesky decomposition
C <- t(chol(S))

# Initialize variables
intsum <- 0
varsum <- 0
N      <- 1
d1 <- pnorm(a[1]/C[1,1])
e1 <- pnorm(b[1]/C[1,1])
f1 <- e1 - d1

# Loop until convergence
conv <- FALSE
out <- numeric(Nmax)
while(!conv) {
  
  w <- runif(P-1)
  di <- d1
  ei <- e1
  fi <- ei - di
  for (i in 2:P) {
    if (i == 2) yi <- 0
    yi[i-1] <- qnorm(di + w[i-1] * (ei - di))
    di <- pnorm((a[i] - sum(C[i, 1:(i-1)]*yi))/C[i,i])
    ei <- pnorm((b[i] - sum(C[i, 1:(i-1)]*yi))/C[i,i])
    fi <- (ei - di) * fi
  }
  intsum <- intsum + fi
  varsum <- varsum + fi^2
  N <- N+1
  err <- 1 * ((varsum/N - (intsum/N)^2)/(N))^.5
  out[N-1] <- intsum/N
  if (N == Nmax | err < maxerr) conv <- TRUE
}
N

[1] 10000

err

[1] 0.001287477

intsum/N

[1] 0.8424054

plot(out[out != 0], type = "l", lwd = 2, col = "blue")
abline(h = mvtnorm::pmvnorm(lower = a, upper = b, mean = 0, sigma = S), lty=2)

However, if the integration is not over a hyperrectangle, the current approach cannot be used, and we need to resolve to other techniques.

Solving $P(x \in \mathcal{A})$ when $\mathcal{A}$ is a hyperellipse

TO DO

thomvolker/mvn-probs.md

Finding ellipsoidal probabilities under the multivariate normal model

Introduction

Solving $P(x \in \mathcal{A})$ when $\mathcal{A}$ is a hyperrectangle

Solving $P(x \in \mathcal{A})$ when $\mathcal{A}$ is a hyperellipse