mmscaModelSelection: Model selection for MMSCA
In trbKnl/sparseWeightBasedPCA: Sparse weight based PCA


Performs model selection for the regularization parameters and the number of components of mmsca().

mmscaModelSelection(
  X,
  ridgeSeq,
  lassoSeq,
  grouplassoSeq,
  elitistlassoSeq,
  ncompSeq,
  tuningMethod = "BIC",
  groups,
  nrFolds = NULL,
  itr = 1e+06,
  nStart = 1,
  tol = 1e-07,
  coorDes = TRUE,
  coorDesItr = 100,
  printProgress = TRUE
)

`X`	A data matrix of class `matrix`
`ridgeSeq`	A range of values for the ridge penalty that need to be examined. Specify a zero if the tuning parameter is not wanted.
`lassoSeq`	A range of values for the lasso penalty that need to be examined. Specify a zero if the tuning parameter is not wanted.
`grouplassoSeq`	A range of values for the group lasso penalty that need to be examined. Specify a zero if the tuning parameter is not wanted.
`elitistlassoSeq`	A range of values for the elitist lasso penalty that need to be examined. Specify a zero if the tuning parameter is not wanted.
`ncompSeq`	A range of integers for the number of components that need to be examined.
`tuningMethod`	A string indicating which model selection method should be used. "BIC" enables the Bayesian information criterion, "IS" enables the index of sparseness, and "CV" enables cross-validation (CV) with the EigenVector method. If CV is used, the number of folds `nrFolds` needs to be chosen. The number of folds should be an integer less than `nrow(X)`. The data are then split into equal-sized chunks in order of appearance.
`groups`	A vector specifying which columns of X belong to which block, given as the number of columns in each block. Example: `c(10, 100, 1000)`. The first 10 variables belong to the first block, the 100 variables after that belong to the second block, etc.
`itr`	The maximum number of iterations (a positive integer)
`tol`	Convergence is determined by comparing the loss function value after each iteration. If the difference is smaller than `tol`, the analysis is considered converged. Default value is `1e-07`
`coorDes`	A boolean with the default `TRUE`. If coorDes is `FALSE`, the majorizing function used to estimate the component weights W conditional on the loadings P is optimized using matrix inverses, which can be slow. If set to `TRUE`, the majorizing function is optimized (or partially optimized) using coordinate descent. In some cases coordinate descent will be faster
`coorDesItr`	An integer specifying the maximum number of iterations for the coordinate descent algorithm; the default is set to 100. You do not have to run this algorithm until convergence before alternating back to the estimation of the loadings. The tolerance for this algorithm is hardcoded and set to `10^-8`.
`printProgress`	A boolean: `TRUE` will print the progress of the model selection
`nrFolds`	An integer that specifies the number of folds that cross-validation should use if tuningMethod == "CV". The number of folds needs to be lower than `nrow(X)`.
`nStart`	The number of random starts the analysis should perform. The first start will be a warm start. You cannot give custom starting values.
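
The `groups` and `nrFolds` arguments may be easier to picture with a small base R sketch. It does not call the package. The names `blockId` and `foldId` are made up for illustration, and the fold assignment is only one plausible reading of "equal sized chunks in order of appearance", not necessarily how mmscaModelSelection() splits the rows internally.

```r
# Toy data: 100 rows, 10 columns
X <- matrix(rnorm(100 * 10), 100, 10)

# groups holds block sizes, so presumably it should sum to ncol(X)
groups <- c(2, 3, 5)
stopifnot(sum(groups) == ncol(X))

# Block label per column (hypothetical helper variable):
# columns 1-2 -> block 1, columns 3-5 -> block 2, columns 6-10 -> block 3
blockId <- rep(seq_along(groups), times = groups)
split(seq_len(ncol(X)), blockId)

# Assumed fold assignment: consecutive rows end up in the same fold
nrFolds <- 10
foldId <- ceiling(seq_len(nrow(X)) * nrFolds / nrow(X))
table(foldId)
```

Because consecutive rows share a fold under this reading, it may be worth shuffling the rows of X beforehand if they are sorted in a meaningful way.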

A list containing:
results A list with ncomp elements each containing the following items in a list

"BIC, IS or MSPE" The index chosen in tuningMethod for all combinations of the ridge, lasso, group lasso and elitist lasso penalties
"bestBIC, bestIS, bestMSPE or bestMSPE1stdErrorRule" The best index according to the chosen tuning method
"nNonZeroCoef" The number of non zero weights in the best model
"ridge" The value of the ridge penalty corresponding to the best model
"lasso" The value of the lasso penalty corresponding to the best model
"grouplasso" The value of the group lasso penalty corresponding to the best model
"elististlasso" The value of the elitist lasso penalty corresponding to the best model
"ncomp" The number of components that was used for these items
"ridge1stdErrorRule" In case tuningMethod == "CV", the value of the ridge penalty according to the 1 standard error rule: the most sparse model within one standard error of the model with the lowest MSPE
"lasso1stdErrorRule" In case tuningMethod == "CV", the value of the lasso penalty according to the 1 standard error rule: the most sparse model within one standard error of the model with the lowest MSPE
"grouplasso1stdErrorRule" In case tuningMethod == "CV", the value of the group lasso penalty according to the 1 standard error rule: the most sparse model within one standard error of the model with the lowest MSPE
"elitistlasso1stdErrorRule" In case tuningMethod == "CV", the value of the elitist lasso penalty according to the 1 standard error rule: the most sparse model within one standard error of the model with the lowest MSPE

bestNcomp The number of components with the best value for the chosen tuning index
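
The 1 standard error rule mentioned in the items above can be sketched in base R as follows. The numbers are invented purely for illustration and are not output of mmscaModelSelection(). The variable names (`mspe`, `stdError`, `nNonZero`) are hypothetical, and the sketch assumes sparsity is measured by the number of nonzero weights.

```r
# Toy cross-validation summary for five lasso values (made-up numbers)
lasso    <- c(0, 1, 2, 3, 4)
mspe     <- c(1.00, 0.90, 0.92, 0.97, 1.20)  # mean squared prediction error
stdError <- c(0.05, 0.05, 0.05, 0.05, 0.05)  # standard error of each MSPE
nNonZero <- c(30, 22, 15, 9, 4)              # nonzero weights per model

# Model with the lowest MSPE
best <- which.min(mspe)

# All models whose MSPE is within one standard error of the best model
threshold <- mspe[best] + stdError[best]
eligible  <- which(mspe <= threshold)

# Among those, take the most sparse model
chosen <- eligible[which.min(nNonZero[eligible])]

lasso[best]    # lasso value of the lowest-MSPE model
lasso[chosen]  # lasso value picked by the 1 standard error rule
```

In this toy case the rule trades a slightly higher MSPE for a model with fewer nonzero weights.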

 
X <- matrix(rnorm(100 * 10), 100, 10)

out <- mmscaModelSelection(X, 
            ridgeSeq = seq(0, 1, by = 0.1), 
            lassoSeq = 0:100, 
            grouplassoSeq = 0,
            elitistlassoSeq = 0, 
            ncompSeq = 1:3, 
            tuningMethod = "CV", 
            groups = ncol(X), 
            nrFolds = 10, 
            itr = 100000, 
            nStart = 1, 
            coorDes = FALSE, 
            coorDesItr = 100, 
            printProgress = TRUE)

#Inspect the results of the model selection for the optimal number of components according to the tuning method
out$results[[out$bestNcomp]]
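
Before running an example like the one above, it can help to estimate how many models will be fitted. The base R sketch below counts the tuning combinations for the same sequences. It assumes one fit per combination, per number of components and per fold, which is a guess about the internals rather than documented behavior; `grid` and `nFits` are names made up for this illustration.

```r
# Same tuning sequences as in the example call
ridgeSeq        <- seq(0, 1, by = 0.1)
lassoSeq        <- 0:100
grouplassoSeq   <- 0
elitistlassoSeq <- 0
ncompSeq        <- 1:3
nrFolds         <- 10

# Every combination of the four penalties
grid <- expand.grid(ridge = ridgeSeq,
                    lasso = lassoSeq,
                    grouplasso = grouplassoSeq,
                    elitistlasso = elitistlassoSeq)
nrow(grid)  # 11 * 101 = 1111 combinations

# Assumed total number of fits under cross-validation
nFits <- nrow(grid) * length(ncompSeq) * nrFolds
nFits       # 1111 * 3 * 10 = 33330
```

If that count looks too large, a coarser `lassoSeq` (for example `seq(0, 100, by = 10)`) shrinks the grid considerably.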