Singular value decomposition

Thom Benjamin Volker

The singular value decomposition of a $n \times p$ matrix $\boldsymbol{X}$ is a factorization of the form $$\boldsymbol{X} = \boldsymbol{U} \boldsymbol{\Sigma} \boldsymbol{V^\top},$$ where $\boldsymbol{U}$ is a $n \times p$ semi-orthogonal matrix with the left singular vectors, $\boldsymbol{\Sigma}$ is a $p \times p$ diagonal matrix with non-negative real numbers on the diagonal such that $\sigma_{1,1} \geq \sigma_{2,2} \geq \dots \geq \sigma_{p,p} \geq 0$ known as the singular values, and $\boldsymbol{V}$ is a $p \times p$ orthogonal matrix with the right singular vectors.

Note that $\boldsymbol{U}$ is a semi-orthogonal matrix, such that $\boldsymbol{U^\top} \boldsymbol{U} = \boldsymbol{I_p}$ and $\boldsymbol{U} \boldsymbol{U^\top} \neq \boldsymbol{I_n}$, with $\boldsymbol{I_p}$ a $p$-dimensional identity matrix and $\boldsymbol{V}$ is an orthogonal matrix, such that $\boldsymbol{V^\top} \boldsymbol{V} = \boldsymbol{V} \boldsymbol{V^\top} = \boldsymbol{I_p}$.

Singular value decomposition in `R`

The singular value decomposition of a matrix can be obtained on a column by column basis, using the following program.

svd_R <- function(X, n.dim, n.iter, tol = 1e-08, verbose = FALSE) {
  N <- nrow(X)
  P <- ncol(X)
  
  u <- matrix(rnorm(N*n.dim), ncol = n.dim)
  v <- matrix(rnorm(P*n.dim), nrow = n.dim)
  s <- rep(0, n.dim)
  
  X_res <- X
  warn <- NULL;
  
  for (d in 1:n.dim) {
    converged <- FALSE; 
    iter <- 0
    if (d == 2) X_res <- X - u[,1:(d-1), drop = FALSE] %*% s[1] %*% t(v[,1:(d-1), drop = FALSE])
    if (d > 2)  X_res <- X - u[,1:(d-1), drop = FALSE] %*% diag(s[1:(d-1)]) %*% t(v[,1:(d-1), drop = FALSE])
    u_new <- u_old <- u[,d]
    v_new <- v_old <- v[,d]
    while (!converged) {
      iter <- iter + 1
      if (verbose) cat("\r", "Dimension: ", d, " | Iteration: ", iter)
      u_new <- X_res %*% v_new
      u_new <- u_new / sqrt(sum(u_new^2))
      v_new <- t(X_res) %*% u_new
      v_new <- v_new / sqrt(sum(v_new^2))
      if (sum(sqrt((v_new - v_old)^2)) < tol) {
        converged <- TRUE
      } else if (iter == n.iter) {
        warn <- "The algorithm has not converged, but the maximum number of iterations is reached."
      } else {
        u_old <- u_new
        v_old <- v_new
      }
    }
    u[,d] <- u_new
    v[,d] <- v_new
    s[d]  <- t(u[,d]) %*% X_res %*% v[,d]
  }
  cat("\n")
  if (!is.null(warn)) warning(warn)
  return(list(u = u, v = v, s = s))
}

That is, for every dimension of the composition, we proceed iteratively. We first calculate the first left singular vector $\boldsymbol{u_1}$, keeping our values for the right singular value fixed to their current guess, and scale the result accordingly. Note that we assume that the data matrix $\boldsymbol{X}$ is standardized, such that the column means are zero, and the column standard deviations are one, such that $\sqrt{\sum_{i} u_{i,1}^2} = \boldsymbol{u_1^\top u_1} = 1$. Subsequently, we can calculate the first right singular vector $\boldsymbol{v_1}$ by keeping the values for the left singular vector fixed to their current guess, and scale the result accordingly. We continue doing this until the algorithm has converged (i.e., the values for $\boldsymbol{u_1}$ and $\boldsymbol{v_1}$ do not change anymore). If this is the case, we store the values of the first left and right singular value and calculate the first singular value $\sigma_{1,1}$. Subsequently, we calculate the residual matrix $\boldsymbol{X_\text{res}}$ by subtracting our current approximation of the data matrix from the original data matrix, using only a single dimension to approximate $\boldsymbol{X}$. We then repeat the procedure for the second dimension, the third dimension, and so on, until we have reached the desired number of dimensions.

set.seed(24)
N <- 100
P <- 50
S <- diag(P)
S[S==0] <- 0.5

X <- rnorm(N*P) |> matrix(N) %*% chol(S)

t <- Sys.time()
SVD_base <- svd(X)
Sys.time() - t

Time difference of 0.00176692 secs

t <- Sys.time()
SVD_R    <- svd_R(X, P, 2000, 1e-05, FALSE)

Sys.time() - t

Time difference of 0.281373 secs

diag(cor(SVD_base$u, SVD_R$u))

 [1] -1  1  1 -1 -1 -1  1  1 -1  1 -1 -1 -1  1  1  1  1 -1  1  1 -1  1 -1  1 -1
[26]  1 -1 -1  1 -1  1 -1 -1  1  1  1 -1  1  1 -1  1  1 -1 -1 -1 -1  1  1 -1  1

So, we now have an algorithm that can obtain the singular value decomposition of a matrix that actually works, but that is substantially slower than the native R implementation. This is not surprising, as the native R implementation is based on the LAPACK library, which is written in Fortran. However, we can still improve our algorithm by using the Rcpp package to write the algorithm in C++, which is much faster than R. We can then use the Rcpp package to compile the C++ code and use it in R.

Singular value decomposition in `C++`

The C++ code for the singular value decomposition is shown below. Note that we use the Rcpp package to write the code, which allows us to use R objects in C++ and vice versa. We also use the RcppArmadillo package, which allows us to use the Armadillo library in C++. Despite some practical differences, the code is pretty similar to the R code.

library(Rcpp)
svd_script <- '
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::depends(RcppProgress)]]
// [[Rcpp::depends(RcppArmadillo)]]
#include <progress.hpp>
#include <progress_bar.hpp>

#define RCPP_ARMADILLO_RETURN_ANYVEC_AS_VECTOR

// [[Rcpp::export]]
List svd_cpp(arma::mat X, arma::mat u, arma::mat v, arma::vec s, int n_dim, int n_iter, double tol, bool verbose) {
  arma::mat X_res = X;
  Progress p(n_dim, verbose);
  bool warn      = false;

  for (int d = 0; d < n_dim; d++) {
    if (d == 1) X_res = X - u.cols(0, d-1) * s(0) * v.cols(0, d-1).t();
    if (d > 1)  X_res = X - u.cols(0, d-1) * diagmat(s.subvec(0, d-1)) * v.cols(0, d-1).t();

    p.increment();
    
    arma::vec u_old = u.col(d);
    arma::vec u_new = u.col(d);
    arma::vec v_old = v.col(d);
    arma::vec v_new = v.col(d);

    bool converged = false;
    int iter = 0;

    while (!converged) {
      iter++;
      
      u_new = X_res * v_new;
      u_new = u_new / sqrt(sum(u_new % u_new));
      v_new = X_res.t() * u_new;
      v_new = v_new / sqrt(sum(v_new % v_new));
      if (sum(sqrt(pow(v_new - v_old, 2))) < tol) {
        converged = true;
      } else if (iter == n_iter) {
        warn = true;
      } else {
        u_old = u_new;
        v_old = v_new;
      }
    }
    u.col(d) = u_new;
    v.col(d) = v_new;
    s(d)     = dot(u.col(d), X_res * v.col(d));
  }
  if (verbose) Rcpp::Rcout << "\\n";
  if (warn) {
    Rcpp::warning("The algorithm has not converged, but the maximum number of iterations is reached.");
  }
  return Rcpp::List::create(Rcpp::Named("u") = u, Rcpp::Named("v") = v, Rcpp::Named("s") = s);
}
'

cppFunction(svd_script, depends = c("RcppArmadillo", "RcppProgress"))

Warning: No function found for Rcpp::export attribute at file42444e852cdf.cpp:8

svdcpp <- function(X, n.dim = ncol(X), n.iter = 2000, tol = 1e-08, verbose = FALSE) {
  N <- nrow(X)
  P <- ncol(X)
  
  u <- matrix(rnorm(N*n.dim), ncol = n.dim)
  v <- matrix(rnorm(P*n.dim), ncol = n.dim)
  s <- rep(0, n.dim)
  
  svd_cpp(X, u, v, s, n.dim, n.iter, tol, verbose)
}

The function again yields correct results.

SVD_Cpp    <- svdcpp(X, P, 2000, 1e-05, FALSE)
diag(cor(SVD_base$u, SVD_Cpp$u))

 [1] -1 -1  1  1 -1  1  1  1  1  1  1  1  1  1  1  1  1 -1  1 -1  1  1 -1 -1  1
[26] -1  1  1 -1 -1  1 -1  1  1  1 -1  1 -1 -1  1 -1  1  1  1 -1  1 -1 -1 -1 -1

Also all singular values are as they’re supposed to be.

cor(cbind(SVD_Cpp$s, SVD_R$s, SVD_base$d))

     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    1    1    1
[3,]    1    1    1

However, we’re still much slower than the native R implementation, which probably means that our algorithm is simply quite slow.

bench::mark(
  base = svd(X),
  R    = svd_R(X, P, 2000, 1e-05, FALSE),
  Cpp  = svdcpp(X, P, 2000, 1e-05, FALSE),
  check = FALSE
)

Warning: Some expressions had a GC in every iteration; so filtering is
disabled.

# A tibble: 3 × 6
  expression      min   median `itr/sec` mem_alloc `gc/sec`
  <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
1 base        958.2µs   1.03ms    922.       279KB     9.98
2 R           400.1ms 438.16ms      2.28     465MB    45.6 
3 Cpp          84.9ms  91.75ms     11.1      229KB     0

Hence, we are able to program a singular value decomposition manually, but in terms of computational efficiency, I’m unsure whether we can do much better than the base R implementation.

thomvolker/svd.md

Singular value decomposition

Singular value decomposition in R

Singular value decomposition in C++

Singular value decomposition in `R`

Singular value decomposition in `C++`