sparseWeightBasedPCA: Sparse weight based PCA


This function performs PCA/SCA with any combination of ridge, lasso, group lasso, and elitist lasso regularization. It also allows selected component weights to be constrained to zero.

mmsca(
  X,
  ncomp,
  ridge,
  lasso,
  grouplasso,
  elitistlasso,
  groups,
  constraints,
  itr,
  Wstart,
  tol = 1e-07,
  nStarts = 1L,
  printLoss = TRUE,
  coorDes = FALSE,
  coorDesItr = 1L
)

`X`	A data matrix of class `matrix`
`ncomp`	The number of components to estimate (an integer)
`ridge`	A vector containing a ridge parameter for each column of W separately. To set the same ridge penalty for all columns of the component weights W, specify `ridge = rep(value, ncomp)`, where `value` is a non-negative double
`lasso`	A vector containing a lasso parameter for each column of W separately. To set the same lasso penalty for all columns of the component weights W, specify `lasso = rep(value, ncomp)`, where `value` is a non-negative double
`grouplasso`	A vector containing a grouplasso parameter for each column of W separately. To set the same grouplasso penalty for all columns of the component weights W, specify `grouplasso = rep(value, ncomp)`, where `value` is a non-negative double
`elitistlasso`	A vector containing an elitistlasso parameter for each column of W separately. To set the same elitistlasso penalty for all columns of the component weights W, specify `elitistlasso = rep(value, ncomp)`, where `value` is a non-negative double
`groups`	A vector specifying which columns of `X` belong to which block. Example: `c(10, 100, 1000)`. The first 10 variables belong to the first block, the next 100 variables belong to the second block, and so on.
`constraints`	A matrix of the same dimensions as the component weights matrix W (`ncol(X)` x `ncomp`). A zero entry in `constraints` marks the element at the same location in W as constrained to zero. A non-zero entry marks the element at the same location in W as one that needs to be estimated.
`itr`	The maximum number of iterations (a positive integer)
`Wstart`	A matrix with `ncol(X)` rows and `ncomp` columns containing starting values for the component weight matrix W. If `Wstart` contains only zeros, a warm start is used: the first `ncomp` right singular vectors of `X`
`tol`	Convergence is determined by comparing the loss function value after each iteration. If the difference is smaller than `tol`, the analysis has converged. Default value is `1e-07`
`nStarts`	The number of random starts the analysis should perform. The first start is performed with the values given by `Wstart`. Each subsequent start uses `Wstart` plus a matrix of random uniform values multiplied by the current start number (the first start has index zero).
`printLoss`	A boolean: `TRUE` prints the loss function value every 1000 iterations.
`coorDes`	A boolean with default `FALSE`. If `coorDes` is `FALSE`, the majorizing function used to estimate the component weights W conditional on the loadings P is optimized using matrix inverses, which can be slow. If set to `TRUE`, the majorizing function is optimized (or partially optimized) using coordinate descent, which is faster in many cases
`coorDesItr`	An integer specifying the maximum number of iterations for the coordinate descent algorithm. The default is 1. You do not have to run this algorithm until convergence before alternating back to the estimation of the loadings. The tolerance for this algorithm is hardcoded and set to `10^-8`.
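
The argument shapes described above can be sketched in base R without calling `mmsca()` itself. The block sizes, penalty values, and the helper variable `block` are hypothetical and chosen only for illustration; the zero pattern shown is one possible way to make each component use a single block.

```r
# Hypothetical sizes: two blocks of 5 and 3 variables, 2 components
groups <- c(5, 3)
J <- sum(groups)                 # must equal ncol(X)
ncomp <- 2

# One penalty value per component (column of W)
ridge <- rep(1e-7, ncomp)
lasso <- rep(0.5, ncomp)

# Start from "estimate every weight", then fix chosen weights to zero
constraints <- matrix(1, J, ncomp)
block <- rep(seq_along(groups), times = groups)  # block index of each variable
constraints[block == 2, 1] <- 0  # component 1 uses block 1 only
constraints[block == 1, 2] <- 0  # component 2 uses block 2 only

# An all-zero Wstart requests the warm start described under `Wstart`
Wstart <- matrix(0, J, ncomp)
```

Deriving `J` from `sum(groups)` keeps `groups`, `constraints`, and `Wstart` consistent with each other when the block sizes change.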

A list containing:
`W`	A matrix containing the component weights
`P`	A matrix containing the loadings
`loss`	A numeric variable containing the minimum loss function value over all `nStarts` starts
`converged`	A boolean: `TRUE` if the analysis converged, `FALSE` otherwise
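
As a rough illustration of how the returned `W` and `P` might be used, the sketch below computes component scores and a low-rank reconstruction. It assumes the usual weight-based reading in which scores are `X %*% W` and the data are approximated by `X %*% W %*% t(P)`; this page does not state that model explicitly. To stay runnable without the package, a plain SVD stands in for the `mmsca()` output, so `res` is a hypothetical substitute and not an actual result object.

```r
set.seed(1)
X <- scale(matrix(rnorm(50 * 6), 50, 6), scale = FALSE)  # column-centered toy data
ncomp <- 2

# Stand-in for mmsca() output (assumption): in unpenalized PCA, W and P
# both equal the first ncomp right singular vectors of X
V <- svd(X)$v[, 1:ncomp, drop = FALSE]
res <- list(W = V, P = V)

scores <- X %*% res$W          # component scores, nrow(X) x ncomp
Xhat <- scores %*% t(res$P)    # rank-ncomp reconstruction of X
rss <- sum((X - Xhat)^2)       # residual sum of squares of the reconstruction
nZero <- colSums(res$W == 0)   # number of exactly-zero weights per component
```

With an actual penalized fit, `nZero` is a quick way to see how sparse each column of `W` turned out.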

J <- 30
X <- matrix(rnorm(100*J), 100, J)
ncomp <- 3

#An example of sparse SCA with ridge, lasso, and grouplasso regularization, with 2 groups, no constraints, and a "warm" start
mmsca(X = X, 
       ncomp = ncomp, 
       ridge = rep(10e-8, ncomp),
       lasso = rep(1, ncomp),
       grouplasso = rep(1, ncomp),
       elitistlasso = rep(0, ncomp),
       groups = c(J/2, J/2), 
       constraints = matrix(1, J, ncomp), 
       itr = 1000000, 
       Wstart = matrix(0, J, ncomp))

# Extended example: perform SCA with group lasso regularization, try out all common and distinctive structures
# Create sample data with a common and distinctive structure
ncomp <- 3 
J <- 30
comdis <- matrix(1, J, ncomp)
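comdis[1:15, 1] <- 0   # component 1 has no weights in the first block (variables 1 to 15)
comdis[16:30, 2] <- 0  # component 2 has no weights in the second block (variables 16 to 30)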

comdis <- sparsify(comdis, 0.1) #set 10 percent of the 1's to zero
variances <- makeVariance(varianceOfComps = c(100, 80, 90), J = J, error = 0.05) #create realistic eigenvalues
dat <- makeDat(n = 100, comdis = comdis, variances = variances)
X <- dat$X

results <- mmsca(X = X, 
    ncomp = ncomp, 
    ridge = rep(10e-8, ncomp),
    lasso = rep(0, ncomp),
    grouplasso = rep(5, ncomp),
    elitistlasso = rep(0, ncomp),
    groups = c(J/2, J/2), 
    constraints = matrix(1, J, ncomp), 
    itr = 1000000, 
    Wstart = matrix(0, J, ncomp))

#inspect results
results$W
dat$P[, 1:ncomp]

#for model selection functions see mmscaModelSelection() and mmscaHyperCubeSelection()
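
One way to go beyond eyeballing `results$W` against the generating structure is to compare zero patterns numerically. The helper `zeroPatternAgreement()` below is hypothetical and not part of the package, and the two small matrices are made-up stand-ins for an estimate and a true structure.

```r
# Hypothetical helper: share of entries where estimate and truth agree on
# zero versus non-zero (eps guards against tiny numerical values)
zeroPatternAgreement <- function(West, Wtrue, eps = 1e-8) {
  stopifnot(all(dim(West) == dim(Wtrue)))
  mean((abs(West) > eps) == (abs(Wtrue) > eps))
}

# Made-up 4 x 2 example: one weight in column 1 is wrongly set to zero
Wtrue <- cbind(c(1, 1, 0, 0), c(0, 0, 1, 1))
West  <- cbind(c(0.9, 0, 0, 0), c(0, 0, 1.1, 0.8))
agreement <- zeroPatternAgreement(West, Wtrue)  # 7 of 8 entries agree
```

The comparison is entry by entry, so it only makes sense after checking that the estimated components are in the same column order as the reference structure.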