For each component, the variables are selected so as to explain a percentage alpha of the variance explained by the corresponding principal component.
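As an illustration of the selection criterion, the sketch below greedily adds variables until they explain at least a proportion alpha of the first principal component's variance (measured as the R-squared of regressing the PC scores on the selected variables). It is a minimal base R sketch on the built-in `mtcars` data, assuming forward selection; it is not the code used by lsspca(), and the names `r2_subset` and `selected` are hypothetical.

```r
# Illustration only: greedy forward selection of the variables that explain
# at least a proportion alpha of the first PC's variance.
X <- scale(as.matrix(mtcars), center = TRUE, scale = TRUE)
alpha <- 0.95

pc1 <- prcomp(X)$x[, 1]   # scores of the first principal component

# R-squared of the least squares regression of pc1 on a subset of columns
r2_subset <- function(cols) {
  fit <- lm.fit(cbind(1, X[, cols, drop = FALSE]), pc1)
  1 - sum(fit$residuals^2) / sum((pc1 - mean(pc1))^2)
}

selected <- integer(0)
r2 <- 0
while (r2 < alpha && length(selected) < ncol(X)) {
  candidates <- setdiff(seq_len(ncol(X)), selected)
  # try adding each remaining variable and keep the best one
  r2_try <- sapply(candidates, function(j) r2_subset(c(selected, j)))
  best <- which.max(r2_try)
  selected <- c(selected, candidates[best])
  r2 <- r2_try[best]
}

colnames(X)[selected]   # variables retained for the first component
r2                      # proportion of the PC's variance they explain
```

Raising alpha generally increases the number of variables retained, so alpha trades sparsity against closeness to the principal component.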
X |
The data matrix. |
alpha |
Real in [0, 1]. Proportion of the variance of each PC that the sparse component must explain. |
maxcard |
A vector or an integer giving the maximum cardinality of the components. Missing values are filled with the last value. |
ncomps |
number of components to compute |
spcaMethod |
Character vector specifying how the LS SPCA components are computed: "u" for uncorrelated, "c" for correlated and "p" for projection. If only one value is given, the same method is used for all components. |
scalex |
Logical, whether to scale the variables to unit variance. Default is FALSE. Variables are centred to zero mean (if needed) even if scalex = FALSE. |
variableSelection |
How the variables for each component are selected: 'seqrep' (stepwise), 'exhaustive' (all subsets), 'backward', 'forward' or 'lasso'. |
really.big |
Logical. Set to TRUE if the matrix is large, for faster variable selection; exhaustive search is then not available. |
force.in |
NULL or list of indices that must be in the component. Not used for lasso. Default is NULL. |
force.out |
NULL or list of indices that cannot be in the component. Default is NULL. |
selectfromthese |
NULL or list of indices from which the model is chosen. Default is NULL. |
lsspca_forLasso |
Whether to compute the LS SPCA components from the indices selected with the lasso, or to use just the lasso regression. |
lasso_penalty |
Real between 0 and 1: 0 gives ridge regression, 1 gives the lasso. |
For USPCA, maxcard cannot be smaller than the order of the components computed, so maxcard = c(1, 1, 1) will be automatically changed to maxcard = c(1, 2, 3). Exhaustive search can be slow for matrices with 30 or more variables. See the documentation for leaps::regsubsets and glmnet::glmnet for the options.
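The handling of maxcard described above can be sketched in base R as follows. The helper `expand_maxcard` is hypothetical and written only to illustrate the documented behaviour (padding with the last value, then raising each entry to at least the order of its component for USPCA); it is not a function of the package.

```r
# Hypothetical helper, not part of the package: mimics the documented
# handling of maxcard.
expand_maxcard <- function(maxcard, ncomps, uncorrelated = TRUE) {
  # pad a short vector by repeating its last value
  if (length(maxcard) < ncomps) {
    last <- maxcard[length(maxcard)]
    maxcard <- c(maxcard, rep(last, ncomps - length(maxcard)))
  }
  maxcard <- maxcard[seq_len(ncomps)]
  # for USPCA, the j-th component needs at least j variables
  if (uncorrelated) {
    maxcard <- pmax(maxcard, seq_len(ncomps))
  }
  maxcard
}

expand_maxcard(c(1, 1, 1), ncomps = 3)                # 1 2 3
expand_maxcard(2, ncomps = 4, uncorrelated = FALSE)   # 2 2 2 2
```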
a list
Matrix with the loadings scaled to unit L_2 norm.
Matrix of loadings scaled to unit L_1 norm.
integer number of components computed. Default is 4.
Vector with the cardinality of each component's loadings.
List with the indices of the non-zero loadings for each component.
A list with only the nonzero loadings for each component.
Vector with the % variance explained by each component.
Vector with the % variance explained by each principal component.
Vector with the % cumulative variance explained by each component.
Vector with the ratio (%) of the cumulative variance explained by the sparse components to that explained by the PCs.
the SPCs scores.
Matrix with the PCs loadings scaled to unit L_2 norm.
the PCs scores.
method used to compute the sparse loadings
Matrix of correlations among the sparse components. Only if spcaMethod != "u" and ncomps > 1.
The call with its arguments.
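The two scalings of the loadings listed above differ only in the norm used. The following base R sketch uses a made-up loadings vector `a` (an assumption for illustration, not output of lsspca()) to show the unit L_2 scaling, the unit L_1 scaling (the contributions), the indices of the non-zero loadings and the cardinality.

```r
# Made-up sparse loadings for one component (illustration only)
a <- c(0.8, 0, -0.4, 0, 0.2)

load_l2 <- a / sqrt(sum(a^2))   # loadings scaled to unit L_2 norm
contrib <- a / sum(abs(a))      # loadings scaled to unit L_1 norm

nonzero <- which(a != 0)        # indices of the non-zero loadings
card <- length(nonzero)         # cardinality of the component

round(load_l2, 2)
round(contrib, 2)
```

The contributions keep the signs of the loadings and their absolute values sum to one, so they can be read as the relative weight of each variable in the component.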
Giovanni Merola
Giovanni M. Merola. 2014. Least Squares Sparse Principal Component Analysis: a Backward Elimination approach to attain large loadings. Austr.&NZ Jou. Stats. 57, pp 391-429
Giovanni M. Merola and Gemai Chen. 2019. Sparse Principal Component Analysis: an efficient Least Squares approach. Jou. Multiv. Analysis 173, pp 366-382
http://arxiv.org/abs/1406.1381
## Not run:
library(LSSPCA)
data(hitters)
dim(hitters)
## USPCA 95
hit_uspca95 = lsspca(X = hitters, alpha = 0.95, ncomps = 4,
                     spcaMethod = "u", subsectSelection = "e")
#> Warning message:
#> In log(vr) : NaNs produced
## the warnings come from the variable selection, don't worry
## print contributions (only.nonzero)
print_spca(hit_uspca95)
## summaries
summary_spca(hit_uspca95, contributions = TRUE, digits = 1)
## print loadings individually
lapply(hit_uspca95$loadingslist, function(x) round(x, 2))
## print contributions individually
lapply(hit_uspca95$loadingslist, function(x) round(x/sum(abs(x)), 2))
## plot PC and USPC loadings
par(mfrow = c(1, 2))
barplot(-hit_uspca95$PCloadings[, 1], main = "PCA")
barplot(-hit_uspca95$loadings[, 1], main = "USPCA")
par(mfrow = c(1,1))
## Holzinger data
data(holzinger)
dim(holzinger)
## CSPCA
hol_cspca95 = lsspca(X = holzinger, alpha = 0.95, ncomps = 4,
                     spcaMethod = "c", subsectSelection = "e")
## summaries
t(data.frame(card = hol_cspca95$cardinality,
             cvexp = round(hol_cspca95$cvexp, 2),
             rcvexp = round(hol_cspca95$rcvexp, 2)))
## print loadings
lapply(hol_cspca95$loadingslist, function(x) round(x, 2))
## print contributions
lapply(hol_cspca95$loadingslist, function(x) round(x/sum(abs(x)), 2))
## correlation between SPCs
round(hol_cspca95$corComp, 2)
## plot contributions
barplot(-hol_cspca95$contributions[, 1])
## SPCs scores against PC scores
plot(hol_cspca95$scores[, 1], hol_cspca95$PCscores[, 1], pch = 16)
regline = lm(hol_cspca95$PCscores[, 1] ~ hol_cspca95$scores[, 1] - 1)$coef
abline(a = 0, b = regline, col = 2)
## SPCA on each ability separately
h_groups = lapply(seq(1, 10, 3), function(x) x:(x + 2))
## projection SPCA
hol_block_spca95 = lsspca(X = holzinger, alpha = 0.95, ncomps = 4,
                          spcaMethod = "p", subsectSelection = "e",
                          selectfromthese = h_groups)
## summaries
t(data.frame(card = hol_block_spca95$cardinality,
             cvexp = round(hol_block_spca95$cvexp, 2),
             rcvexp = round(hol_block_spca95$rcvexp, 2)))
## print loadings
lapply(hol_block_spca95$loadingslist, function(x) round(x, 2))
## print contributions
lapply(hol_block_spca95$loadingslist, function(x) round(x/sum(abs(x)), 2))
## correlation between SPCs
round(hol_block_spca95$corComp, 2)
## plot the contributions for each SPC
par(mfrow = c(2, 2))
for (k in 1:4) {
  barplot(-hol_block_spca95$contributions[, k])
}
par(mfrow = c(1, 1))
## End(Not run)