rpca: Randomized principal component analysis (rpca). In Benli11/rPCA: Randomized Singular Value Decomposition

Description

Fast computation of the principal components analysis using the randomized singular value decomposition.

Usage

```r
rpca(A, k = NULL, center = TRUE, scale = TRUE, retx = TRUE,
     p = 10, q = 2, rand = TRUE)
```

Arguments

`A`  array_like; a numeric (m, n) input matrix (or data frame) to be analyzed. If the data contain NAs, `na.omit` is applied.

`k`  integer; number of dominant principal components to be computed. k must be less than or equal to min(m,n), and k << min(m,n) is recommended.

`center`  bool, optional; logical value indicating whether the variables should be shifted to be zero centered (TRUE by default).

`scale`  bool, optional; logical value indicating whether the variables should be scaled to have unit variance (TRUE by default).

`retx`  bool, optional; logical value indicating whether the rotated variables / scores should be returned (TRUE by default).

`p`  integer, optional; oversampling parameter for rsvd (default p=10), see `rsvd`.

`q`  integer, optional; number of additional power iterations for rsvd (default q=2, as in the Usage signature), see `rsvd`.

`rand`  bool, optional; if TRUE, the rsvd routine is used, otherwise svd is used.

Details

Principal component analysis is an important linear dimension reduction technique.

Randomized PCA is computed via the randomized SVD algorithm (`rsvd`). The computational gain is substantial when the desired number of principal components is relatively small, i.e. k << min(m,n).
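To make the roles of `k`, `p` and `q` concrete, here is a minimal base-R sketch of the general randomized SVD idea applied to PCA. It is not the package's implementation: the function name `rpca_sketch` and its internals are assumptions made for illustration, and details such as the test-matrix distribution may differ in `rsvd`.

```r
# Hypothetical illustration only; NOT the code behind rpca() or rsvd().
rpca_sketch <- function(A, k, p = 10, q = 2) {
  A <- scale(as.matrix(A), center = TRUE, scale = TRUE)  # center and scale columns
  m <- nrow(A); n <- ncol(A)
  l <- min(k + p, m, n)                     # oversampled sketch size, capped at min(m, n)
  Omega <- matrix(rnorm(n * l), n, l)       # Gaussian random test matrix
  Y <- A %*% Omega                          # sample the column space of A (m x l)
  for (i in seq_len(q)) {                   # q power iterations, re-orthonormalized
    Y <- qr.Q(qr(Y))
    Z <- qr.Q(qr(crossprod(A, Y)))          # t(A) %*% Y, orthonormalized (n x l)
    Y <- A %*% Z
  }
  Q <- qr.Q(qr(Y))                          # orthonormal basis for the sampled range
  B <- crossprod(Q, A)                      # small (l x n) projected matrix
  s <- svd(B, nu = 0, nv = k)               # deterministic SVD of the small matrix
  sdev <- s$d[seq_len(k)] / sqrt(m - 1)     # divisor N - 1, as in prcomp
  list(rotation = s$v, sdev = sdev, eigvals = sdev^2, x = A %*% s$v)
}

set.seed(1)
fit <- rpca_sketch(log(iris[, 1:4]), k = 2)
fit$sdev
```

The expensive decomposition is applied only to the small matrix `B`, which is why the gain grows as k + p becomes small relative to min(m,n). For the four-column iris data the sketch size is capped at 4, so the result coincides with the exact PCA.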

The `print` and `summary` methods present the results in a readable format. A scree plot can be produced with `ggscreeplot`. The individuals factor map can be produced with `ggindplot`, and a correlation plot with `ggcorplot`.

The predict function can be used to compute the scores of new observations. The data will automatically be centered (and scaled if requested). This is not fully supported for complex input matrices.
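Conceptually, scoring new observations amounts to applying the stored centering and scaling and then multiplying by the rotation matrix. The sketch below shows that arithmetic in base R, using `prcomp` as a stand-in for the fitted model. The helper `score_new` is hypothetical and is not part of the package; the package's own `predict` method may handle additional cases.

```r
# Hypothetical helper: project new rows with a stored center, scale and rotation.
score_new <- function(newdata, center, scale, rotation) {
  X <- as.matrix(newdata)
  X <- sweep(X, 2, center, "-")   # subtract the training column means
  X <- sweep(X, 2, scale, "/")    # divide by the training column standard deviations
  X %*% rotation                  # project onto the principal directions
}

X     <- log(iris[, 1:4])
train <- X[1:100, ]
new   <- X[101:150, ]

# prcomp is used here only as a stand-in for a fitted PCA object
ref    <- prcomp(train, center = TRUE, scale. = TRUE)
scores <- score_new(new, ref$center, ref$scale, ref$rotation[, 1:2])
head(scores)
```

Note that the centering and scaling values come from the training data, not from the new observations.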

Value

`rpca` returns a list with class rpca containing the following components:

`rotation`  array_like; the rotation (eigenvectors); (n, k) dimensional array.

`eigvals`  array_like; eigenvalues; k dimensional vector.

`sdev`  array_like; standard deviations of the principal components; k dimensional vector.

`x`  array_like; the scores / rotated data; (m, k) dimensional array.

`center, scale`  array_like; the centering and scaling used.

Note

The principal components are not unique and only defined up to sign (a constant of modulus one in the complex case) and so may differ between different PCA implementations.
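When comparing rotations from two implementations, it can help to align the signs first. The following base-R sketch handles the real-valued case only; the helper `align_signs` is a hypothetical name introduced for illustration, and complex input would need a phase factor instead of a sign.

```r
# Hypothetical helper: flip each column of V so it points the same way as ref.
align_signs <- function(V, ref) {
  s <- sign(colSums(V * ref))   # +1 or -1 per component
  s[s == 0] <- 1                # guard against an exactly orthogonal column
  sweep(V, 2, s, "*")
}

X  <- scale(log(iris[, 1:4]), center = TRUE, scale = TRUE)
V1 <- svd(X)$v                  # principal directions via SVD
V2 <- eigen(cov(X))$vectors     # same directions via eigendecomposition

V2_aligned <- align_signs(V2, V1)
max(abs(V2_aligned - V1))       # close to zero once signs agree
```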

Similar to `prcomp` the variances are computed with the usual divisor N - 1.
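As a small base-R illustration of the N - 1 divisor: the standard deviations of the components are the singular values of the centered (and scaled) data divided by sqrt(N - 1), and the eigenvalues are their squares. This sketch uses `svd` and `prcomp` only and does not call the package.

```r
X <- scale(log(iris[, 1:4]), center = TRUE, scale = TRUE)
N <- nrow(X)

d       <- svd(X, nu = 0, nv = 0)$d   # singular values of the preprocessed data
sdev    <- d / sqrt(N - 1)            # standard deviations with divisor N - 1
eigvals <- sdev^2                     # eigenvalues of the sample covariance matrix

ref <- prcomp(log(iris[, 1:4]), center = TRUE, scale. = TRUE)
all.equal(sdev, ref$sdev)
```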

Author(s)

N. Benjamin Erichson

See Also

`ggscreeplot`, `ggindplot`, `ggcorplot`, `plot.rpca`, `predict`, `rsvd`
Examples

```r
library('rsvd')
#
# Load Edgar Anderson's Iris Data
#
data('iris')

#
# log transform
#
log.iris <- log( iris[ , 1:4] )
iris.species <- iris[ , 5]

#
# Perform rPCA and compute only the first two PCs
#
iris.rpca <- rpca(log.iris, k=2)
summary(iris.rpca) # Summary
print(iris.rpca) # Prints the rotations

#
# Use rPCA to compute all PCs, similar to prcomp
#
iris.rpca <- rpca(log.iris)
summary(iris.rpca) # Summary
print(iris.rpca) # Prints the rotations
plot(iris.rpca) # Produce scree plot, variable and individual factor maps.
```