mmsca: Sparse SCA/PCA with and/or: ridge, lasso, group lasso,...

View source: R/RcppExports.R

Description

This function performs PCA/SCA with any combination of ridge, lasso, group lasso, and elitist lasso regularization. It also allows constraining selected component weights to zero.

Usage

mmsca(
  X,
  ncomp,
  ridge,
  lasso,
  grouplasso,
  elitistlasso,
  groups,
  constraints,
  itr,
  Wstart,
  tol = 1e-07,
  nStarts = 1L,
  printLoss = TRUE,
  coorDes = FALSE,
  coorDesItr = 1L
)

Arguments

X

A data matrix of class matrix

ncomp

The number of components to estimate (an integer)

ridge

A vector containing a ridge parameter for each column of W separately. To apply the same ridge penalty to all columns of the component weights W, specify ridge = rep(value, ncomp), where value is a non-negative double.

lasso

A vector containing a lasso parameter for each column of W separately. To apply the same lasso penalty to all columns of the component weights W, specify lasso = rep(value, ncomp), where value is a non-negative double.

grouplasso

A vector containing a grouplasso parameter for each column of W separately. To apply the same grouplasso penalty to all columns of the component weights W, specify grouplasso = rep(value, ncomp), where value is a non-negative double.

elitistlasso

A vector containing an elitistlasso parameter for each column of W separately. To apply the same elitistlasso penalty to all columns of the component weights W, specify elitistlasso = rep(value, ncomp), where value is a non-negative double.
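The four penalty arguments share the same shape: one non-negative value per component. The sketch below builds such vectors in base R only. The specific values are illustrative, and treating a zero as "penalty not used" is an assumption based on the examples on this page, which pass rep(0, ncomp) for penalties they do not apply.

```r
ncomp <- 3

ridge        <- rep(1e-07, ncomp)  # same small ridge penalty for every component
lasso        <- c(0.5, 1, 2)       # a different lasso penalty per component
grouplasso   <- rep(0, ncomp)      # assumption: zero means this penalty is unused
elitistlasso <- rep(0, ncomp)

# Basic sanity checks before passing the vectors on: one value per
# component, all non-negative
penalties <- list(ridge = ridge, lasso = lasso,
                  grouplasso = grouplasso, elitistlasso = elitistlasso)
stopifnot(all(lengths(penalties) == ncomp),
          all(unlist(penalties) >= 0))
```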

groups

A vector of block sizes specifying which columns of X belong to which block. Example: with c(10, 100, 1000), the first 10 variables belong to the first block, the next 100 variables belong to the second block, and so on.
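To make the block encoding concrete, the following base R sketch expands a vector of block sizes into a block label per column. The names block_id and block_cols are illustrative and not part of the package. That the sizes should add up to ncol(X) is an assumption implied by the description, not something this page states.

```r
# Illustrative block sizes: three blocks with 2, 3 and 4 variables
groups <- c(2, 3, 4)

# Block label for every column: block k is repeated groups[k] times
block_id <- rep(seq_along(groups), times = groups)

# Column indices belonging to each block, as a named list
block_cols <- split(seq_len(sum(groups)), block_id)

# Assumption: the block sizes together cover every column of X
n_columns_covered <- length(block_id)
```

Here block_cols[["2"]] holds the column indices of the second block, which is a convenient way to check a groups vector before fitting.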

constraints

A matrix of the same dimensions as the component weights matrix W (ncol(X) x ncomp). A zero entry in constraints corresponds to an element in the same location in W that needs to be constrained to zero. A non-zero entry corresponds to an element in the same location in W that needs to be estimated.
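As a small illustration of this encoding, the sketch below builds a constraints matrix in which two components are each restricted to one half of the variables and a third is left free. The dimensions and the chosen pattern are illustrative only.

```r
J <- 6       # number of variables, i.e. ncol(X) (illustrative)
ncomp <- 3

# Start with every weight free to be estimated
constraints <- matrix(1, nrow = J, ncol = ncomp)

constraints[1:3, 1] <- 0   # component 1: weights of variables 1-3 fixed at zero
constraints[4:6, 2] <- 0   # component 2: weights of variables 4-6 fixed at zero
# component 3 stays fully unconstrained

n_fixed <- sum(constraints == 0)   # weights constrained to zero
n_free  <- sum(constraints != 0)   # weights that will be estimated
```

Passing matrix(1, J, ncomp), as the examples on this page do, corresponds to the case with no constraints at all.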

itr

The maximum number of iterations (a positive integer)

Wstart

A matrix of ncomp columns and ncol(X) rows with starting values for the component weight matrix W. If Wstart contains only zeros, a warm start is used: the first ncomp right singular vectors of X.
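The warm start described here can be reproduced outside the function with base R's svd(). This is only a sketch of what "the first ncomp right singular vectors of X" looks like; the package's internal computation may differ in sign or scaling, and the data below are simulated for illustration.

```r
set.seed(1)
X <- matrix(rnorm(20 * 5), 20, 5)   # illustrative data: 20 rows, 5 variables
ncomp <- 2

# Right singular vectors are the columns of svd(X)$v;
# keep the first ncomp of them as starting weights
W_warm <- svd(X)$v[, seq_len(ncomp), drop = FALSE]

# One row per variable, one column per component
dim(W_warm)
```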

tol

Convergence is determined by comparing the loss function value after each iteration; if the difference is smaller than tol, the analysis has converged. The default value is 1e-07.
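The stopping rule can be pictured with a toy iteration. The sketch below is not the package's algorithm: the update step is a made-up halving rule, and using the absolute decrease in loss (rather than a relative one) is an assumption. It only shows how tol and the iteration cap itr interact.

```r
tol <- 1e-07
itr <- 1000          # maximum number of iterations

x <- 10              # toy parameter
loss_prev <- Inf
converged <- FALSE
n_iter <- 0

for (i in seq_len(itr)) {
  x <- x / 2         # toy update that lowers the loss each iteration
  loss <- x^2
  n_iter <- i
  # Stop once the decrease in loss drops below tol
  if (loss_prev - loss < tol) {
    converged <- TRUE
    break
  }
  loss_prev <- loss
}
```

If the loop reaches itr without meeting the tolerance, converged stays FALSE, which mirrors the converged element described under Value.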

nStarts

The number of random starts the analysis should perform. The first start uses the values given by Wstart. Each subsequent start uses Wstart plus a matrix of random uniform values multiplied by the current start index (the first start has index zero).
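The perturbation scheme can be sketched as follows. The range of the uniform draws is not given on this page, so runif() on [0, 1] is an assumption, and the starting values are illustrative.

```r
set.seed(42)
J <- 4
ncomp <- 2
nStarts <- 3
Wstart <- matrix(0.5, J, ncomp)   # illustrative starting values

# Start index k runs from 0 to nStarts - 1; index 0 leaves Wstart untouched
starts <- lapply(seq_len(nStarts) - 1, function(k) {
  # Assumption: uniform draws on [0, 1]; the package may use another range
  Wstart + k * matrix(runif(J * ncomp), J, ncomp)
})
```

Because the random matrix is multiplied by the start index, later starts move further away from Wstart.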

printLoss

A boolean: TRUE will print the loss function value every 1000 iterations.

coorDes

A boolean, FALSE by default. If coorDes is FALSE, the majorizing function used to estimate the component weights W conditional on the loadings P is optimized using matrix inverses, which can be slow. If set to TRUE, the majorizing function is optimized (or partially optimized) using coordinate descent; in many cases coordinate descent will be faster.

coorDesItr

An integer specifying the maximum number of iterations for the coordinate descent algorithm, the default is set to 1. You do not have to run this algorithm until convergence before alternating back to the estimation of the loadings. The tolerance for this algorithm is hardcoded and set to 10^-8.

Value

A list containing:
W A matrix containing the component weights
P A matrix containing the loadings
loss A numeric variable containing the minimum loss function value across all nStarts starts
converged A boolean: TRUE if the analysis converged, FALSE otherwise.
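A common next step is to turn W and P into component scores and a low-rank reconstruction of X. The sketch below assumes the usual SCA/PCA convention that scores are X W and the reconstruction is X W P'; this page does not state that convention explicitly. To stay self-contained it uses plain PCA via svd() as a stand-in for the returned list, so res here is a hypothetical object, not actual mmsca() output.

```r
set.seed(1)
X <- scale(matrix(rnorm(50 * 6), 50, 6), scale = FALSE)  # centered toy data
ncomp <- 2

# Stand-in for the returned list: unpenalized PCA, where W equals P
V <- svd(X)$v[, seq_len(ncomp), drop = FALSE]
res <- list(W = V, P = V)

scores   <- X %*% res$W           # component scores, nrow(X) x ncomp
Xhat     <- scores %*% t(res$P)   # rank-ncomp reconstruction of X
resid_ss <- sum((X - Xhat)^2)     # residual sum of squares
```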

Examples

J <- 30
X <- matrix(rnorm(100*J), 100, J)
ncomp <- 3

#An example of sparse SCA with ridge, lasso, and grouplasso regularization, with 2 groups, no constraints, and a "warm" start
mmsca(X = X, 
       ncomp = ncomp, 
       ridge = rep(10e-8, ncomp),
       lasso = rep(1, ncomp),
       grouplasso = rep(1, ncomp),
       elitistlasso = rep(0, ncomp),
       groups = c(J/2, J/2), 
       constraints = matrix(1, J, ncomp), 
       itr = 1000000, 
       Wstart = matrix(0, J, ncomp))

# Extended example: perform SCA with group lasso regularization; try out all common/distinctive structures
# create sample data, with common and distinctive structure
ncomp <- 3 
J <- 30
comdis <- matrix(1, J, ncomp)
comdis[1:15, 1] <- 0 
comdis[16:30, 2] <- 0 # second block is rows 16:30, matching groups = c(J/2, J/2)

comdis <- sparsify(comdis, 0.1) #set 10 percent of the 1's to zero
variances <- makeVariance(varianceOfComps = c(100, 80, 90), J = J, error = 0.05) #create realistic eigenvalues
dat <- makeDat(n = 100, comdis = comdis, variances = variances)
X <- dat$X

results <- mmsca(X = X, 
    ncomp = ncomp, 
    ridge = rep(10e-8, ncomp),
    lasso = rep(0, ncomp),
    grouplasso = rep(5, ncomp),
    elitistlasso = rep(0, ncomp),
    groups = c(J/2, J/2), 
    constraints = matrix(1, J, ncomp), 
    itr = 1000000, 
    Wstart = matrix(0, J, ncomp))

#inspect results
results$W
dat$P[, 1:ncomp]

#for model selection functions see mmscaModelSelection() and mmscaHyperCubeSelection()

trbKnl/sparseWeightBasedPCA documentation built on July 22, 2020, 10:29 p.m.