
R Package LSSPCA

Project Status: WIP. Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.

This R package is a companion to my tutorial paper published on xxx. It computes sparse principal components by Least Squares Sparse Principal Components Analysis (LS SPCA).

LSSPCA provides only 3 functions

It also provides 2 utilities (not methods, for various reasons)

This is a lightweight package that is not designed to handle large matrices. A dated but working package (made obsolete on CRAN) with methods for visualizing the PCs is available here (or via devtools::install_github("merolagio/spca")). A new version with fast C++ code and PSPCA is in the making and will be released on CRAN one day. By the way, if you find this package useful, please acknowledge my work. It will make my manager happy :wink:

The number of nonzero loadings is controlled by the parameter alpha, the minimal proportion of the variance explained by the principal components that the sparse components must reproduce. Larger values of alpha generally give components with more nonzero loadings.

Orthogonal sparse components (USPCA) are computed by choosing spcaMethod = "u". Correlated components (CSPCA) are obtained by choosing spcaMethod = "c". The higher-order CSPCA components may explain a bit more variance, at the price of being correlated.

The variables can be selected by exhaustive, stepwise, forward or backward selection, by setting the argument variableSelection to "e", "s", "f" or "b", respectively.
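To give a feel for how alpha and forward selection interact, here is a minimal sketch in base R. It is not the LSSPCA implementation and does not call any LSSPCA function: the helpers `vexp`, `ratio_for` and `forward_select` are hypothetical names, the `mtcars` data set is an arbitrary choice, and the sparse component is obtained by simple least-squares projection of the first PC. The stopping rule (add variables until the sparse component reproduces at least a proportion alpha of the variance explained by the PC) is my reading of the description above, so take it as an illustration only.

```r
# Sketch only: base R, not the LSSPCA implementation.
X <- scale(as.matrix(mtcars))   # centred, unit-variance columns
pc1 <- prcomp(X)$x[, 1]         # scores of the first PC

# Variance of X explained by a component t: ||X't||^2 / (t't).
vexp <- function(t, X) sum(crossprod(X, t)^2) / sum(t^2)

# Explained-variance ratio of the least-squares approximation of `pc`
# that uses only the columns named in `vars`.
ratio_for <- function(vars, X, pc) {
  Xs <- X[, vars, drop = FALSE]
  t_sparse <- drop(Xs %*% lm.fit(Xs, pc)$coefficients)
  vexp(t_sparse, X) / vexp(pc, X)
}

# Greedy forward selection: add the variable that gives the highest ratio,
# and stop as soon as the ratio reaches alpha.
forward_select <- function(X, pc, alpha) {
  selected <- character(0)
  candidates <- colnames(X)
  ratio <- 0
  while (ratio < alpha && length(candidates) > 0) {
    gains <- vapply(candidates,
                    function(v) ratio_for(c(selected, v), X, pc),
                    numeric(1))
    best <- names(which.max(gains))
    selected <- c(selected, best)
    candidates <- setdiff(candidates, best)
    ratio <- max(gains)
  }
  list(variables = selected, ratio = ratio)
}

res_90 <- forward_select(X, pc1, alpha = 0.90)
res_99 <- forward_select(X, pc1, alpha = 0.99)
print(res_90)
print(res_99)
```

Because both runs follow the same greedy path, the run with the larger alpha never selects fewer variables than the run with the smaller one.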

An example of sparse loadings is in this image

The sparse PCs are combinations of only 2, 3 and 4 variables out of 16 but are a (very) close approximation of the original PCs. See for yourself:

Well, the idea is to approximate the data as well as possible with sparse components. Since the PCs give the best approximation of the data, approximating the PCs with sparse components is pretty much the same thing. So, decent sparse components can be obtained by simply projecting the PCs onto a subset of variables (yes, by linear regression; this is PSPCA), which is option spcaMethod = "p".
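The projection idea can be sketched in a few lines of base R. This is not the package's code and uses no LSSPCA function: the data set (`mtcars`), the subset of variables and the helper `vexp` are assumptions made purely for illustration.

```r
# Sketch only: base R, not the LSSPCA API.
X <- scale(as.matrix(mtcars))          # centred, unit-variance columns
pc1 <- prcomp(X)$x[, 1]                # scores of the first PC

subset_vars <- c("cyl", "disp", "wt")  # hypothetical choice of variables

# Project the PC onto the chosen variables by least squares.
# No intercept is needed because all columns are centred.
fit <- lm.fit(X[, subset_vars, drop = FALSE], pc1)

# Sparse loadings: regression coefficients on the subset, zero elsewhere.
sparse_loadings <- setNames(numeric(ncol(X)), colnames(X))
sparse_loadings[subset_vars] <- fit$coefficients
sparse_scores <- drop(X %*% sparse_loadings)

# Variance of X explained by a component t: ||X't||^2 / (t't).
vexp <- function(t, X) sum(crossprod(X, t)^2) / sum(t^2)

# Share of the PC's explained variance reproduced by the sparse component,
# and its correlation with the PC.
ratio <- vexp(sparse_scores, X) / vexp(pc1, X)
cor_with_pc <- cor(sparse_scores, pc1)
print(round(c(ratio = ratio, correlation = cor_with_pc), 3))
```

The ratio cannot exceed 1, because the first PC is the single component that explains the most variance. How close it gets depends on which variables are chosen.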

Explanations, details and examples about LS SPCA can be found in the tutorial paper xxx.

More can be found in the mathematically oriented papers:

A discussion about sparsifying rotated principal components can be found in

Ahhh, I almost forgot: LS SPCA is much better than other SPCA methods because it maximizes the variance explained and produces orthogonal components which are combinations of key variables. Other methods do not, and they have serious problems with correlated variables. All of this is explained and proved in my papers.



merolagio/LSSPCA documentation built on April 29, 2021, 4:17 p.m.