# ssvd: Sparse regularized low-rank matrix approximation. In irlba: Fast Truncated Singular Value Decomposition and Principal Components Analysis for Large Dense and Sparse Matrices

 ssvd R Documentation

## Sparse regularized low-rank matrix approximation.

### Description

Estimate an l1-penalized singular value or principal components decomposition (SVD or PCA) that introduces sparsity in the right singular vectors based on the fast and memory-efficient sPCA-rSVD algorithm of Haipeng Shen and Jianhua Huang.

### Usage

```ssvd(x, k = 1, n = 2, maxit = 500, tol = 0.001, center = FALSE,
scale. = FALSE, alpha = 0, tsvd = NULL, ...)
```

### Arguments

 `x` A numeric real- or complex-valued matrix or real-valued sparse matrix. `k` Matrix rank of the computed decomposition (see the Details section below). `n` Number of nonzero components in the right singular vectors. If `k > 1`, then a single value of `n` specifies the number of nonzero components in each regularized right singular vector. Or, specify a vector of length `k` indicating the number of desired nonzero components in each returned vector. See the examples. `maxit` Maximum number of soft-thresholding iterations. `tol` Convergence is determined when ||U_j - U_{j-1}||_F < tol, where U_j is the matrix of estimated left regularized singular vectors at iteration j. `center` a logical value indicating whether the variables should be shifted to be zero centered. Alternately, a centering vector of length equal the number of columns of `x` can be supplied. Use `center=TRUE` to perform a regularized sparse PCA. `scale.` a logical value indicating whether the variables should be scaled to have unit variance before the analysis takes place. Alternatively, a vector of length equal the number of columns of `x` can be supplied. The value of `scale` determines how column scaling is performed (after centering). If `scale` is a numeric vector with length equal to the number of columns of `x`, then each column of `x` is divided by the corresponding value from `scale`. If `scale` is `TRUE` then scaling is done by dividing the (centered) columns of `x` by their standard deviations if `center=TRUE`, and the root mean square otherwise. If `scale` is `FALSE`, no scaling is done. See `scale` for more details. `alpha` Optional scalar regularization parameter between zero and one (see Details below). `tsvd` Optional initial rank-k truncated SVD or PCA (skips computation if supplied). `...` Additional arguments passed to `irlba`.

### Details

The `ssvd` function implements a version of an algorithm by Shen and Huang that computes a penalized SVD or PCA that introduces sparsity in the right singular vectors by solving a penalized least squares problem. The algorithm in the rank 1 case finds vectors u, w that minimize

||x - u w^T||_F^2 + lambda||w||_1

such that ||u|| = 1, and then sets v = w / ||w|| and d = u^T x v; see the referenced paper for details. The penalty lambda is implicitly determined from the specified desired number of nonzero values `n`. Higher rank output is determined similarly but using a sequence of lambda values determined to maintain the desired number of nonzero elements in each column of `v` specified by `n`. Unlike standard SVD or PCA, the columns of the returned `v` when `k > 1` may not be orthogonal.

### Value

A list containing the following components:

• u regularized left singular vectors with orthonormal columns

• d regularized upper-triangluar projection matrix so that `x %*% v == u %*% d`

• v regularized, sparse right singular vectors with columns of unit norm

• center, scale the centering and scaling used, if any

• lambda the per-column regularization parameter found to obtain the desired sparsity

• iter number of soft thresholding iterations

• n value of input parameter `n`

• alpha value of input parameter `alpha`

### Note

Our `ssvd` implementation of the Shen-Huang method makes the following choices:

1. The l1 penalty is the only available penalty function. Other penalties may appear in the future.

2. Given a desired number of nonzero elements in `v`, value(s) for the lambda penalty are determined to achieve the sparsity goal subject to the parameter `alpha`.

3. An experimental block implementation is used for results with rank greater than 1 (when `k > 1`) instead of the deflation method described in the reference.

4. The choice of a penalty lambda associated with a given number of desired nonzero components is not unique. The `alpha` parameter, a scalar between zero and one, selects any possible value of lambda that produces the desired number of nonzero entries. The default `alpha = 0` selects a penalized solution with largest corresponding value of `d` in the 1-d case. Think of `alpha` as fine-tuning of the penalty.

5. Our method returns an upper-triangular matrix `d` when `k > 1` so that `x %*% v == u %*% d`. Non-zero elements above the diagonal result from non-orthogonality of the `v` matrix, providing a simple interpretation of cumulative information, or explained variance in the PCA case, via the singular value decomposition of `d %*% t(v)`.

What if you have no idea for values of the argument `n` (the desired sparsity)? The reference describes a cross-validation and an ad-hoc approach; neither of which are in the package yet. Both are prohibitively computationally expensive for matrices with a huge number of columns. A future version of this package will include a revised approach to automatically selecting a reasonable sparsity constraint.

Compare with the similar but more general functions `SPC` and `PMD` in the `PMA` package by Daniela M. Witten, Robert Tibshirani, Sam Gross, and Balasubramanian Narasimhan. The `PMD` function can compute low-rank regularized matrix decompositions with sparsity penalties on both the `u` and `v` vectors. The `ssvd` function is similar to the PMD(*, L1) method invocation of `PMD` or alternatively the `SPC` function. Although less general than `PMD`(*), the `ssvd` function can be faster and more memory efficient for the basic sparse PCA problem. See https://bwlewis.github.io/irlba/ssvd.html for more information.

(* Note that the s4vd package by Martin Sill and Sebastian Kaiser, https://cran.r-project.org/package=s4vd, includes a fast optimized version of a closely related algorithm by Shen, Huang, and Marron, that penalizes both `u` and `v`.)

### References

• Shen, Haipeng, and Jianhua Z. Huang. "Sparse principal component analysis via regularized low rank matrix approximation." Journal of multivariate analysis 99.6 (2008): 1015-1034.

• Witten, Tibshirani and Hastie (2009) A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. _Biostatistics_ 10(3): 515-534.

### Examples

```
set.seed(1)
u <- matrix(rnorm(200), ncol=1)
v <- matrix(c(runif(50, min=0.1), rep(0,250)), ncol=1)
u <- u / drop(sqrt(crossprod(u)))
v <- v / drop(sqrt(crossprod(v)))
x <- u %*% t(v) + 0.001 * matrix(rnorm(200*300), ncol=300)
s <- ssvd(x, n=50)
table(actual=v[, 1] != 0, estimated=s\$v[, 1] != 0)
oldpar <- par(mfrow=c(2, 1))
plot(u, cex=2, main="u (black circles), Estimated u (blue discs)")
points(s\$u, pch=19, col=4)
plot(v, cex=2, main="v (black circles), Estimated v (blue discs)")
points(s\$v, pch=19, col=4)

# Let's consider a trivial rank-2 example (k=2) with noise. Like the
# last example, we know the exact number of nonzero elements in each
# solution vector of the noise-free matrix. Note the application of
# different sparsity constraints on each column of the estimated v.
# Also, the decomposition is unique only up to sign, which we adjust
# for below.
set.seed(1)
u <- qr.Q(qr(matrix(rnorm(400), ncol=2)))
v <- matrix(0, ncol=2, nrow=300)
v[sample(300, 15), 1] <- runif(15, min=0.1)
v[sample(300, 50), 2] <- runif(50, min=0.1)
v <- qr.Q(qr(v))
x <- u %*% (c(2, 1) * t(v)) + .001 * matrix(rnorm(200 * 300), 200)
s <- ssvd(x, k=2, n=colSums(v != 0))

# Compare actual and estimated vectors (adjusting for sign):
s\$u <- sign(u) * abs(s\$u)
s\$v <- sign(v) * abs(s\$v)
table(actual=v[, 1] != 0, estimated=s\$v[, 1] != 0)
table(actual=v[, 2] != 0, estimated=s\$v[, 2] != 0)
plot(v[, 1], cex=2, main="True v1 (black circles), Estimated v1 (blue discs)")
points(s\$v[, 1], pch=19, col=4)
plot(v[, 2], cex=2, main="True v2 (black circles), Estimated v2 (blue discs)")
points(s\$v[, 2], pch=19, col=4)
par(oldpar)

```

irlba documentation built on Oct. 4, 2022, 1:05 a.m.