Last active
September 25, 2022 19:11
-
-
Save AustinRochford/3e6fde7eaf9d5b93d27ba898bdac7f0f to your computer and use it in GitHub Desktop.
A Closed-form Solution for the Cholesky Decomposition of the Covariance Matrix of Exchangeable Normal Variables
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"id": "c3dd6161-4d55-4760-b316-ba9509c31d7c", | |
"metadata": {}, | |
"source": [ | |
"In a [previous post](https://austinrochford.com/posts/exch-param-pymc.html) I compared several parameterizations for estimating the covariance parameter of latent [exchangeable normal random variables](https://en.wikipedia.org/wiki/Exchangeable_random_variables) using [PyMC](http://pymc.io/). These parameterizations were necesssary because for quite some time before the [official release of PyMC version 4.0](https://www.pymc.io/blog/v4_announcement.html), the beta lacked support for multivariate random variables. Support for these distributions has since [been added](https://www.pymc.io/projects/docs/en/stable/api/distributions/multivariate.html), rendering the workaround parameterizations in that post superfluous. Even so, after writing that post I remained curious about the closed-form solution for the [Cholesky decomposition](https://en.wikipedia.org/wiki/Cholesky_decomposition) of the covariance matrix for exchangeable normal random variables, and eventually was able to derive it. This post presents that closed-form solution along with a numerical verification of its correctness.\n", | |
"\n", | |
"Recall that a vector of random variables is exchangeable if every permutation of its entries has the same joint distribution. A set of exchangeable normal random variables, $X_1, \\ldots, X_T \\sim N(\\mu, \\sigma^2)$ is parameterized by their marginal mean, $\\mu$, marginal scale, $\\sigma$, and the [Pearson correlation coefficient](https://en.wikipedia.org/wiki/Pearson_correlation_coefficient)\n", | |
"$$\\rho = \\frac{\\mathbb{Cov}(X_i, X_j)}{\\sigma^2}.$$\n", | |
"\n", | |
"In this post we will assume $\\mu = 0$ and $\\sigma = 1$ for simplicity; the results are straightforward to generalize. With this assumption, the covariance matrix is\n", | |
"\n", | |
"$$\n", | |
"\\Sigma_{\\rho} = \\begin{pmatrix}\n", | |
" 1 & \\rho & \\cdots & \\rho & \\rho \\\\\n", | |
" \\rho & 1 & \\cdots & \\rho & \\rho \\\\\n", | |
" \\rho & \\rho & \\cdots & 1 & \\rho \\\\\n", | |
" \\rho & \\rho & \\cdots & \\rho & 1\n", | |
"\\end{pmatrix} \\in \\mathbb{R}^{T \\times T}.\n", | |
"$$\n", | |
"\n", | |
"The previous post used [SymPy](https://www.sympy.org/en/index.html) to calculate the Cholesky decomposition of this matrix for small values of $T$. It was obvious that there was a pattern, but the we did not pursue it at the time, as the SymPy factorization could be [lambdified](https://docs.sympy.org/latest/modules/utilities/lambdify.html) and the result applied to [Aesara](https://aesara.readthedocs.io/en/latest/) tensors to facilitate inference in PyMC. This post will prove and numerically verify the following closed-form solution for the Cholesky decomposition of $\\Sigma_{\\rho}.$\n", | |
"\n", | |
"Let $-\\frac{1}{T - 1} \\leq \\rho < 1$. Define $d_1 = 1$ and $\\ell_1 = \\rho$. For $1 < j \\leq T$, let\n", | |
"$$d_j = \\sqrt{d_{j - 1}^2 - \\ell_{j - 1}^2}$$\n", | |
"and\n", | |
"$$\\ell_j = \\frac{\\rho - 1}{d_j} + d_j.$$\n", | |
"\n", | |
"Finally, let\n", | |
"$$\n", | |
"L_T = \\begin{pmatrix}\n", | |
"d_1 & 0 & 0 & \\cdots & 0 \\\\\n", | |
"\\ell_1 & d_2 & 0 & \\cdots & 0 \\\\\n", | |
"\\ell_1 & \\ell_2 & d_3 & \\cdots & 0 \\\\\n", | |
" & & & \\ddots & \\\\\n", | |
"\\ell_1 & \\ell_2 & \\ell_3 & \\cdots & d_T\n", | |
"\\end{pmatrix}\n", | |
"\\in \\mathbb{R}^{T \\times T}.\n", | |
"$$\n", | |
"\n", | |
"**Claim** $\\Sigma_T = L_T L_T^{\\top}$.\n", | |
"\n", | |
"**Proof (by induction):**\n", | |
"\n", | |
"When $T = 1$ or $T = 2$, the claim is easily verified manually.\n", | |
"\n", | |
"Assume, for all $1 \\leq n \\leq T$, that $\\Sigma_n = L_n L_n^{\\top}$.\n", | |
"\n", | |
"It will be useful to block partition $L_n$ as\n", | |
"$$L_n = \\begin{pmatrix}\n", | |
"L_{n - 1} & 0 \\\\\n", | |
"v_n & d_n\n", | |
"\\end{pmatrix},$$\n", | |
"where $v_n = \\begin{pmatrix}\\ell_1 & \\ell_2 & \\cdots & \\ell_{n - 1}\\end{pmatrix}$ are the vector of lower triangular elements of $L_n$. The inductive hypothesis then becomes\n", | |
"$$\\begin{align*}\n", | |
"\\Sigma_n\n", | |
" & = L_n L_n^{\\top} \\\\\n", | |
" & = \\begin{pmatrix}\n", | |
" L_{n - 1} & 0 \\\\\n", | |
" v_n & d_n\n", | |
" \\end{pmatrix} \\begin{pmatrix}\n", | |
" L_{n - 1}^{\\top} & v_n^{\\top} \\\\\n", | |
" 0 & d_n\n", | |
" \\end{pmatrix} \\\\\n", | |
" & = \\begin{pmatrix}\n", | |
" \\Sigma_{n - 1} & L_{n - 1} v_n^{\\top} \\\\\n", | |
" v_n L_{n - 1}^{\\top} & v_n v_n^{\\top} + d_n^2\n", | |
" \\end{pmatrix}.\n", | |
"\\end{align*}$$\n", | |
"Equating the bottom right element of these two matrices gives\n", | |
"$$1 = (\\Sigma_n)_{n, n} = v_n v_n^{\\top} + d_n^2.$$\n", | |
"Restated, we have that $v_n v_n^{\\top} = 1 - d_n^2$.\n", | |
"Equating the off-diagonal elements in the final column gives, for $1 \\leq i < n$,\n", | |
"$$\\rho = (\\Sigma_n)_{i, n} = (L_{n - 1} v_n^{\\top})_i,$$\n", | |
"so $L_{n - 1} v_n^{\\top} = \\rho \\vec{1}$.\n", | |
"\n", | |
"Now,\n", | |
"$$L_{T + 1} L_{T + 1}^{\\top} = \\begin{pmatrix}\n", | |
" \\Sigma_T & L_T v_{T + 1}^{\\top} \\\\\n", | |
" v_{T + 1} L_T^{\\top} & v_{T + 1} v_{T + 1}^{\\top} + d_{T + 1}^2\n", | |
"\\end{pmatrix},\n", | |
"$$\n", | |
"so it suffices to show that\n", | |
"$$L_T v_{T + 1}^{\\top} = \\rho \\vec{1}$$\n", | |
"and\n", | |
"$$v_{T + 1} v_{T + 1}^{\\top} + d_{T + 1}^2 = 1.$$\n", | |
"\n", | |
"First, we have that\n", | |
"$$\\begin{align*}\n", | |
"L_T v_{T + 1}^{\\top}\n", | |
" & = \\begin{pmatrix}\n", | |
" L_{T - 1} & 0\\\\\n", | |
" v_T & d_T\n", | |
" \\end{pmatrix} \\begin{pmatrix}\n", | |
" v_T^{\\top} \\\\\n", | |
" \\ell_T\n", | |
" \\end{pmatrix} \\\\\n", | |
" & = \\begin{pmatrix}\n", | |
" L_{T - 1} v_T^{\\top} \\\\\n", | |
" v_T v_T^{\\top} + d_T \\cdot \\ell_T\n", | |
" \\end{pmatrix}.\n", | |
"\\end{align*}$$\n", | |
"From the inductive hypothesis, $L_{T - 1} v_T^{\\top} = \\rho \\vec{1}$. For the final entry,\n", | |
"$$\\begin{align*}\n", | |
"v_T v_T^{\\top} + d_T \\cdot \\ell_T\n", | |
" & = 1 - d_T^2 + d_T \\left(\\frac{\\rho - 1}{d_T} + d_T\\right) \\\\\n", | |
" & = \\rho.\n", | |
"\\end{align*}$$\n", | |
"\n", | |
"Second, we have that\n", | |
"$$\\begin{align*}\n", | |
"v_{T + 1} v_{T + 1}^{\\top} + d_{T + 1}^2\n", | |
" & = v_T v_T^{\\top} + \\ell_T^2 + d_{T + 1}^2 \\\\\n", | |
" & = 1 - d_T^2 + \\ell_T^2 + \\left(\\sqrt{d_T^2 - \\ell_T^2}\\right)^2 \\\\\n", | |
" & = 1.\n", | |
"\\end{align*}$$\n", | |
"\n", | |
"**QED**" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "e451e475-a037-41ff-ac8a-f029bb1354c2", | |
"metadata": {}, | |
"source": [ | |
"To validate these calculations numerically we will use [Hypothesis](https://hypothesis.readthedocs.io/en/latest/), a Python library for generative testing. First we make the necessary Python imports." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 1, | |
"id": "276824ae-cea5-4b80-b6b0-e764215cf068", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"from hypothesis import assume, given\n", | |
"from hypothesis.strategies import integers, floats" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 2, | |
"id": "6aec495e-0b02-4960-b54a-78386428a3f4", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"import numpy as np" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "84d90ce3-f704-4b90-951e-1a684046b2f4", | |
"metadata": {}, | |
"source": [ | |
"The following function generates the covariance matrix, $\\Sigma_{\\rho}$, for given values of $\\rho$ and $T$." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 3, | |
"id": "a9b02f66-2f38-4d75-881b-994923576f17", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"def get_cov_mat(ρ, T):\n", | |
" return np.eye(T) + ρ * (1 - np.eye(T))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "9fd548f0-28be-4509-bff6-6eba27f919c4", | |
"metadata": {}, | |
"source": [ | |
"The next function implements the closed-form solution for the Cholesky decomposition of $\\Sigma_{\\rho}$ presented above." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 4, | |
"id": "0be83b6f-7f39-4315-8087-efef83436c40", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"def get_cov_mat_chol(ρ, T):\n", | |
" diag = np.empty(T)\n", | |
" tril = np.empty(T)\n", | |
" \n", | |
" diag[0] = 1\n", | |
" tril[0] = ρ\n", | |
"\n", | |
" for j in range(1, T):\n", | |
" diag[j] = np.sqrt(diag[j - 1]**2 - tril[j - 1]**2)\n", | |
" tril[j] = (ρ - 1) / diag[j] + diag[j] if j < T - 1 else 0\n", | |
" \n", | |
" return np.diag(diag) + np.tril(tril, k=-1)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "2bd57e26-61fc-49b4-b1dd-8524ecdf3e27", | |
"metadata": {}, | |
"source": [ | |
"This final function is a Hypothesis test that verifies that we have, in fact, computed the Cholesky decomposition of $\\Sigma_{\\rho}$." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 5, | |
"id": "5c884cca-5097-4369-bb3d-12025aec35cc", | |
"metadata": { | |
"jupyter": { | |
"source_hidden": true | |
}, | |
"tags": [] | |
}, | |
"outputs": [], | |
"source": [ | |
"@given(floats(-0.5, 1), integers(2, 1000))\n", | |
"def test_cov_mat_chol(ρ, T):\n", | |
" assume(-1 / (T - 1) <= ρ < 1)\n", | |
" \n", | |
" Σ = get_cov_mat(ρ, T)\n", | |
" chol = get_cov_mat_chol(ρ, T)\n", | |
"\n", | |
" # check that chol is lower diagonal\n", | |
" assert (chol == np.tril(chol)).all()\n", | |
" \n", | |
" # check that \"chol is a factorization of Σ\n", | |
" assert np.allclose(Σ, chol @ chol.T)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "ff81a694-7764-4031-8187-7596ba366226", | |
"metadata": {}, | |
"source": [ | |
"The [`given`](https://hypothesis.readthedocs.io/en/latest/details.html#hypothesis.given) decorator tells Hypothesis to generate floating point values in the interval $\\left[-\\frac{1}{2}, 1\\right]$ for $\\rho$ and integers between 2 and 1,000 for $T$. The upper bound on values generated for $T$ is not necessary but ensures the test runs in a reasonable amount of time. The [`assume`](https://hypothesis.readthedocs.io/en/latest/details.html#hypothesis.assume) call ensures that $\\rho$ is in the appropriate interval based on the size of the random vector. Given this guidance, Hypothesis will generate random values of $\\rho$ and $T$ to run the test with." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 6, | |
"id": "b2b97060-f7ce-405a-9c7e-319ec1b61295", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"test_cov_mat_chol()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "c9cf9cc2-aba7-4701-b128-45db1948d95e", | |
"metadata": {}, | |
"source": [ | |
"The test passes, numerically validating our closed-form solution for the Cholesky decomposition.\n", | |
"\n", | |
"This post is available as a Jupyter notebook [here](https://nbviewer.org/gist/AustinRochford/3e6fde7eaf9d5b93d27ba898bdac7f0f)." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 7, | |
"id": "e0df5c57-9e4b-4017-88fd-9b269685693c", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Last updated: Sun Sep 25 2022\n", | |
"\n", | |
"Python implementation: CPython\n", | |
"Python version : 3.10.6\n", | |
"IPython version : 8.4.0\n", | |
"\n", | |
"hypothesis: 6.54.6\n", | |
"\n", | |
"numpy: 1.23.2\n", | |
"\n" | |
] | |
} | |
], | |
"source": [ | |
"%load_ext watermark\n", | |
"%watermark -n -u -v -iv -p hypothesis" | |
] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3 (ipykernel)", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.10.6" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 5 | |
} |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment