Finding ellipsoidal probabilities under the multivariate normal model
Thom Benjamin Volker
Introduction
Many statistical models impose multivariate normality, for example on the
regression parameters in a linear regression model. A question that may
arise is how to find the probability that a random draw from such a
distribution falls within a given ellipsoidal region.
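As a minimal sketch of the idea, consider the special case where the ellipsoid is the Mahalanobis ball $\{x : (x-\mu)^\top \Sigma^{-1}(x-\mu) \leq c\}$; then the probability follows directly from the $\chi^2_p$ distribution (the data and settings below are illustrative):

```python
import numpy as np
from scipy import stats

# For x ~ N(mu, Sigma) in p dimensions, the squared Mahalanobis distance
# (x - mu)^T Sigma^{-1} (x - mu) follows a chi-squared distribution with
# p degrees of freedom, so the probability of the ellipsoid
# {x : (x - mu)^T Sigma^{-1} (x - mu) <= c} equals the chi2 CDF at c.
p, c = 3, 5.0
prob = stats.chi2.cdf(c, df=p)

# Monte Carlo check with an arbitrary mean and covariance matrix
rng = np.random.default_rng(0)
mu = np.array([1.0, -1.0, 0.5])
A = rng.normal(size=(p, p))
Sigma = A @ A.T + p * np.eye(p)          # a valid covariance matrix
x = rng.multivariate_normal(mu, Sigma, size=200_000)
d2 = np.einsum("ij,jk,ik->i", x - mu, np.linalg.inv(Sigma), x - mu)
print(prob, (d2 <= c).mean())            # the two should agree closely
```

For ellipsoids that are not aligned with the covariance in this way, the chi-squared shortcut no longer applies and one has to resort to numerical or Monte Carlo integration.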
Density ratio estimation through Bregman divergence optimization
Thom Benjamin Volker
The density ratio estimation problem is to estimate the ratio of two probability density
functions $p_\text{nu}(x)$ and $p_\text{de}(x)$ from samples $\{x_i^\text{nu}\}_{i=1}^n$ and $\{x_j^\text{de}\}_{j=1}^m$ drawn
from $p_\text{nu}(x)$ and $p_\text{de}(x)$, respectively. The density ratio estimation problem is
important in many machine learning applications, such as domain adaptation and covariate shift correction.
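Under the squared error, one particular Bregman divergence, the optimization has a closed-form solution. The sketch below (a uLSIF-style estimator with Gaussian kernels; function names, data, and settings are illustrative) fits the kernel weights by regularized least squares:

```python
import numpy as np

def fit_ratio(x_nu, x_de, centers, sigma=1.0, lam=0.1):
    """Squared-loss (Bregman) density ratio fit, uLSIF-style.

    Models r(x) = sum_l theta_l K(x, c_l) with Gaussian kernels and
    minimizes the regularized squared error in closed form.
    """
    def phi(x):
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma**2))

    H = phi(x_de).T @ phi(x_de) / len(x_de)   # denominator cross-moments
    h = phi(x_nu).mean(axis=0)                # numerator kernel means
    theta = np.linalg.solve(H + lam * np.eye(len(centers)), h)
    return lambda x: phi(x) @ theta

rng = np.random.default_rng(0)
x_nu = rng.normal(0.0, 1.0, size=(500, 1))    # samples from p_nu
x_de = rng.normal(0.5, 1.2, size=(500, 1))    # samples from p_de
r_hat = fit_ratio(x_nu, x_de, centers=x_nu[:100])
```

Because the numerator has more mass at negative values here, the fitted ratio should be larger at, say, $x = -1$ than at $x = 2$.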
Linear regression is known to estimate each regression coefficient conditional on the other coefficients in the model.
One way to show this is with the following procedure, which estimates the coefficients in an iterative manner.
Specifically, we repeatedly perform regression on the residuals that remain after accounting for all previously
estimated variables. To do this, we initialize a residual vector $r_\text{old}$ that equals the observed outcome variable, and
we initialize a vector of regression coefficients $b^{(0)}$ to a zero vector of length $P$.
Then, starting with the first variable $X_1$ (scaled to unit variance), we update its regression coefficient by regressing $r_\text{old}$ on
$X_1$, setting $b_1^{(t)} = b_1^{(t-1)} + \sigma_{X_1, r_\text{old}}$, and we
update the residual as $r_\text{new} = r_\text{old} - X_1 (b_1^{(t)} - b_1^{(t-1)})$. For the next variable, we apply the same
steps, and we keep cycling over all variables until the coefficients no longer change.
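The procedure above can be sketched as follows (the data and iteration count are illustrative; predictors are standardized so that the covariance with the residual equals the least-squares update):

```python
import numpy as np

# Coordinate-wise estimation of regression coefficients by repeatedly
# regressing the current residual on one variable at a time.
rng = np.random.default_rng(0)
n, P = 200, 3
X = rng.normal(size=(n, P))
X = (X - X.mean(0)) / X.std(0)           # centered, unit-variance columns
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=n)

b = np.zeros(P)                          # b^(0): zero vector of length P
r = y - y.mean()                         # initial (centered) residual
for _ in range(200):                     # cycle until convergence
    for j in range(P):
        delta = X[:, j] @ r / n          # cov(X_j, r) for unit-variance X_j
        b[j] += delta                    # b_j^(t) = b_j^(t-1) + delta
        r -= X[:, j] * delta             # subtract only the *change*

# The iterates converge to the ordinary least-squares solution.
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
print(np.allclose(b, b_ols, atol=1e-6))
```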
Spectral density ratio estimation with singular value decomposition
Izbicki, Lee & Schafer (2014) show that the density ratio can be
computed through an eigendecomposition of the kernel Gram matrix. However, for a data set of size $n$, the kernel
Gram matrix has dimensions $n \times n$, and an eigendecomposition has complexity $\mathcal{O}(n^3)$. Fortunately,
it is possible to approximate the solution with a subset of the kernel Gram matrix. That is, we can perform an eigen-
decomposition of a subset of $k \leq n$ rows and columns of the kernel Gram matrix to approximate the original solution.
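A minimal sketch of this idea is the Nyström-style reconstruction below (the kernel, sizes, and names are illustrative, not the exact estimator of Izbicki, Lee & Schafer): eigendecompose a $k \times k$ block and use it to approximate the full $n \times n$ matrix at $\mathcal{O}(k^3 + nk^2)$ cost.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 2))
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-d2 / 2)                      # Gaussian kernel Gram matrix

k = 80                                   # k <= n rows/columns to keep
idx = rng.choice(len(X), size=k, replace=False)
K_kk = K[np.ix_(idx, idx)]               # k x k subset
K_nk = K[:, idx]                         # n x k cross block

evals, evecs = np.linalg.eigh(K_kk)      # cheap k x k eigendecomposition
keep = evals > 1e-10                     # drop numerically zero directions
inv = (evecs[:, keep] / evals[keep]) @ evecs[:, keep].T
K_approx = K_nk @ inv @ K_nk.T           # rank-k reconstruction of K

err = np.linalg.norm(K - K_approx) / np.linalg.norm(K)
print(err)                               # small relative error
```

Because the Gaussian kernel spectrum decays quickly, even a modest $k$ recovers the Gram matrix to high accuracy here.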
Where should the Gaussian centers come from in density ratio estimation?
Sugiyama, Suzuki & Kanamori (2012) recommend using (a subset of) the numerator samples as centers in
density ratio estimation. However, this advice may very much depend on the use case.
Namely, the density ratio can only be estimated accurately in some region if there are
centers located in that region. In the figures below, we have two datasets sampled from
two distributions with different support. If the centers are chosen from the population with
the smaller support, the regularization seems to have a stronger effect.
Density ratio estimation with and without an intercept
When performing density ratio estimation, the regularization parameter
$\lambda$ causes the estimated kernel weights to be shrunken towards
zero. However, a ratio is least complex when it is one, rather than
zero. Hence, the regularization yields a bias that implies more mass in
the denominator samples compared to the numerator samples. By adding
an intercept, we can mitigate this bias, but not completely remove it.
This is the case because the intercept is also regularized, and hence
shrunken towards zero.
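The effect can be illustrated with a small regularized least-squares ratio fit in which both samples come from the same distribution, so the true ratio is one everywhere (all settings below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x_nu = rng.normal(size=(500, 1))         # numerator samples
x_de = rng.normal(size=(500, 1))         # denominator samples, same p
centers = x_nu[:10]
sigma, lam = 0.2, 0.5                    # narrow kernels, clear shrinkage

def phi(x, intercept):
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    Phi = np.exp(-d2 / (2 * sigma**2))
    if intercept:                        # add a constant basis function
        Phi = np.hstack([np.ones((len(x), 1)), Phi])
    return Phi

means = {}
for intercept in (False, True):
    Phi_de, Phi_nu = phi(x_de, intercept), phi(x_nu, intercept)
    H = Phi_de.T @ Phi_de / len(x_de)
    h = Phi_nu.mean(axis=0)
    theta = np.linalg.solve(H + lam * np.eye(len(h)), h)
    means[intercept] = (Phi_de @ theta).mean()

# Without an intercept the mean fitted ratio is shrunken well below one;
# with a (regularized) intercept the bias is smaller but still present.
print(means[False], means[True])
```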
Simple implementation of singular value decomposition
Thom Benjamin Volker
The singular value decomposition of an $n \times p$ matrix
$\boldsymbol{X}$ is a factorization of the form
$$\boldsymbol{X} = \boldsymbol{U} \boldsymbol{\Sigma} \boldsymbol{V^\top},$$
where $\boldsymbol{U}$ is an $n \times p$ semi-orthogonal matrix containing the
left singular vectors, $\boldsymbol{\Sigma}$ is a $p \times p$ diagonal
matrix with non-negative real numbers (the singular values) on the diagonal, ordered such that
$\sigma_{1,1} \geq \sigma_{2,2} \geq \dots \geq \sigma_{p,p} \geq 0$,
and $\boldsymbol{V}$ is a $p \times p$ orthogonal matrix containing the right singular vectors.
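A simple implementation along these lines (assuming $\boldsymbol{X}$ has full column rank, so every singular value is positive) obtains $\boldsymbol{V}$ and $\boldsymbol{\Sigma}$ from the eigendecomposition of $\boldsymbol{X}^\top\boldsymbol{X}$ and recovers $\boldsymbol{U}$ from them:

```python
import numpy as np

def simple_svd(X):
    # X^T X = V Sigma^2 V^T, so its eigendecomposition yields the right
    # singular vectors and the squared singular values.
    evals, V = np.linalg.eigh(X.T @ X)        # ascending eigenvalues
    order = np.argsort(evals)[::-1]           # reorder descending
    evals, V = evals[order], V[:, order]
    svals = np.sqrt(np.clip(evals, 0.0, None))
    U = (X @ V) / svals                       # from X V = U Sigma
    return U, svals, V

rng = np.random.default_rng(1)
X = rng.normal(size=(10, 4))
U, s, V = simple_svd(X)
print(np.allclose(U @ np.diag(s) @ V.T, X))   # reconstructs X
```

Note that this route squares the condition number of $\boldsymbol{X}$, so for ill-conditioned matrices a dedicated SVD routine is numerically preferable.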