mmscaHyperCubeSelection: Hyper Cube Model selection for MMSCA
In trbKnl/sparseWeightBasedPCA: Sparse weight based PCA


View source: R/mmscaHyperCubeSelection.R

A function that performs model selection for the regularizers of mmsca(). It evaluates a grid of tuning parameters determined by the min and max of their corresponding sequences and the step size provided by the stepsize argument. It picks the best combination and zooms in on it by building a new, smaller grid around that combination. This process continues until the average range of the sequences is less than stopWhenRange. Each new sequence runs from a minimum of best value - range to a maximum of best value + range, with the pre-specified step size given in stepsize.
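The zoom-in search described above can be sketched in a few lines of base R. This is an illustration only, not the package's implementation: `toyCriterion()` and `zoomSearch()` are hypothetical names, the criterion is a made-up quadratic standing in for BIC, IS, or a CV error, and the rule used to shrink the window (best value plus or minus the current step, then halving the step) is an assumption, since the description does not spell out how the range and step change between rounds.

```r
# Hypothetical stand-in for the selection criterion; the real function
# would fit mmsca() and compute BIC, IS, or a cross-validation error.
toyCriterion <- function(ridge, lasso) (ridge - 1.3)^2 + (lasso - 4.2)^2

# Hypothetical helper: evaluate a grid, keep the best point, and zoom in
# until the average range of the sequences is below stopWhenRange.
zoomSearch <- function(ridgeSeq, lassoSeq, stepsize, stopWhenRange = 0.05) {
  lower <- c(min(ridgeSeq), min(lassoSeq))
  upper <- c(max(ridgeSeq), max(lassoSeq))
  step <- stepsize
  repeat {
    # Full grid ("hyper cube") over the current window
    grid <- expand.grid(
      ridge = seq(lower[1], upper[1], by = step),
      lasso = seq(lower[2], upper[2], by = step)
    )
    value <- mapply(toyCriterion, grid$ridge, grid$lasso)
    best <- unlist(grid[which.min(value), ])
    # Stop once the window has become small enough
    if (mean(upper - lower) < stopWhenRange) break
    # Assumed zoom rule: new window is best +/- current step (penalties
    # kept non-negative), and the step is halved for the next round
    lower <- pmax(best - step, 0)
    upper <- best + step
    step <- step / 2
  }
  best
}

best <- zoomSearch(ridgeSeq = 0:3, lassoSeq = 0:10, stepsize = 1)
print(best)
```

With a real criterion every grid point requires a model fit, so the number of penalties with a non-zero sequence and the size of `stepsize` largely determine the running time.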

mmscaHyperCubeSelection(
  X,
  ncomp,
  ridgeSeq,
  lassoSeq,
  grouplassoSeq,
  elitistlassoSeq,
  stepsize,
  logscale = FALSE,
  stopWhenRange = 0.05,
  groups,
  nrFolds = NULL,
  nStart,
  itr = 1e+06,
  printProgress = TRUE,
  coorDes = TRUE,
  coorDesItr = 1,
  tol = 1e-07,
  method = "BIC"
)

`X`	A data matrix of class `matrix`
`ridgeSeq`	A range of values for the ridge penalty that need to be examined. Specify a zero if the tuning parameter is not wanted.
`lassoSeq`	A range of values for the lasso penalty that need to be examined. Specify a zero if the tuning parameter is not wanted.
`grouplassoSeq`	A range of values for the group lasso penalty that need to be examined. Specify a zero if the tuning parameter is not wanted.
`elitistlassoSeq`	A range of values for the elitist lasso penalty that need to be examined. Specify a zero if the tuning parameter is not wanted.
`stepsize`	The sequences of ridgeSeq, lassoSeq, grouplassoSeq, and elitistlassoSeq are constructed by `seq(min(seq), max(seq), by = stepsize)`. So `stepsize` determines how fine the grid is.
`logscale`	A boolean that determines whether the sequences are on the log-scale or not. By default it is set to `FALSE`.
`stopWhenRange`	A numeric value with default 0.05. If the average range of the tuning sequences is less than this value the algorithm stops.
`groups`	A vector specifying which columns of X belong to what block. Example: `c(10, 100, 1000)`. The first 10 variables belong to the first block, the 100 variables after that belong to the second block etc.
`nStart`	The number of random starts the analysis should perform. The first start will be a warm start: W will be started with the first Q right singular vectors of X. You cannot give custom starting values.
`itr`	The maximum number of iterations of `mmsca()` (a positive integer). Default is set to `1e+06`.
`printProgress`	A boolean with default `TRUE`. If set to `TRUE`, the progress of the procedure will be printed to the screen.
`coorDes`	A boolean with default `TRUE`. If `coorDes` is `FALSE`, the minimum of the majorizing function used to estimate the component weights W conditional on the loadings P is found using matrix inverses, which can be slow. If set to `TRUE`, the majorizing function is optimized (or partially optimized) using coordinate descent, which is faster in some cases.
`coorDesItr`	An integer specifying the maximum number of iterations for the coordinate descent algorithm, the default is set to 1. You do not have to run this algorithm until convergence before alternating back to the estimation of the loadings. The tolerance for this algorithm is hardcoded and set to `10^-8`.
`tol`	A numeric value specifying the tolerance of mmsca(). It determines when the algorithm has converged (|current loss - previous loss| < tol). By default it is set to `1e-07`, which might be too small or too large depending on the scaling of the data.
`method`	A string indicating which model selection method should be used. "BIC" enables the Bayesian information criterion and "IS" enables the index of sparseness. "CV" enables cross-validation (CV) with the EigenVector method. "CV1stdError" enables CV with the one standard error rule, which picks the combination of tuning parameters that leads to the most sparse model still within one standard error of the best model. If "CV" or "CV1stdError" is used, the number of folds `nrFolds` needs to be chosen. The number of folds should be an integer less than `nrow(X)`. The data are then split into equal sized chunks in order of appearance. Note that if you choose "CV1stdError", the number of folds influences the standard error: choose it too small and the standard error will be large, so that all models fall within one standard error of the best model.
`nrFolds`	An integer that specifies the number of folds that cross-validation should use if `method` is "CV" or "CV1stdError". The number of folds needs to be lower than `nrow(X)`. Default is `NULL`.
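Several of the arguments above are easier to read with concrete values. The base R sketch below shows how a sequence is turned into a grid by `stepsize`, how a `groups` vector maps columns to blocks, how rows might be assigned to contiguous folds, and how the one standard error rule picks a value. All numbers are made up for illustration. The fold assignment and the assumption that a larger lasso value means a sparser model are illustrative choices, not taken from the package source.

```r
# Grid for one penalty: only the min and max of the sequence matter
lassoSeq <- c(0, 2, 10)
stepsize <- 2.5
lassoGrid <- seq(min(lassoSeq), max(lassoSeq), by = stepsize)
print(lassoGrid)       # 0.0 2.5 5.0 7.5 10.0

# groups = c(2, 3): the first 2 columns form block 1, the next 3 block 2
groups <- c(2, 3)
blockOfColumn <- rep(seq_along(groups), times = groups)
print(blockOfColumn)   # 1 1 2 2 2

# Contiguous folds in order of appearance (assumed assignment rule)
n <- 10
nrFolds <- 5
foldOfRow <- ceiling(seq_len(n) * nrFolds / n)
print(foldOfRow)       # 1 1 2 2 3 3 4 4 5 5

# One standard error rule on made-up CV results, one entry per grid value
cvMean <- c(1.00, 0.80, 0.85, 0.95, 1.40)  # hypothetical mean CV errors
cvSE   <- c(0.10, 0.12, 0.10, 0.11, 0.15)  # hypothetical standard errors
bestIdx <- which.min(cvMean)
threshold <- cvMean[bestIdx] + cvSE[bestIdx]
# Assumption: a larger lasso penalty gives a sparser model, so take the
# largest grid value whose mean error is still within the threshold
chosenLasso <- max(lassoGrid[cvMean <= threshold])
print(chosenLasso)     # 5
```

With fewer folds the standard errors in `cvSE` would typically be larger, which raises `threshold` and lets more (and sparser) models qualify. This is the effect the `method` entry warns about.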

A list containing:
ridge A vector with ncomp elements all equal to the chosen ridge value
lasso A vector with ncomp elements all equal to the chosen lasso value
grouplasso A vector with ncomp elements all equal to the chosen group lasso value
elitistlasso A vector with ncomp elements all equal to the chosen elitist lasso value

 
# Example select the lasso and ridge parameter for mmsca()
# create sample data
ncomp <- 3 
J <- 30
comdis <- matrix(1, J, ncomp)

comdis <- sparsify(comdis, 0.5) #set 50 percent of the 1's to zero
variances <- makeVariance(varianceOfComps = c(100, 80, 90), J = J, error = 0.05) #create realistic eigenvalues
dat <- makeDat(n = 100, comdis = comdis, variances = variances)
X <- dat$X

#Note: can take some time
results <- mmscaHyperCubeSelection(X,
              ncomp = 3,
              ridgeSeq = 0:3,
              lassoSeq = 0:10,
              grouplassoSeq = 0,
              elitistlassoSeq = 0,
              stepsize = 5,
              logscale = FALSE,
              groups = ncol(X),
              nStart = 1,
              itr = 100000,
              printProgress = TRUE,
              coorDes = FALSE,
              coorDesItr = 1,
              method = "CV1stdError",
              tol = 10e-5,
              nrFolds = 10)

#fit the model with the selected hyper parameters
fit <- mmsca(X = X, 
    ncomp = ncomp, 
    ridge = results$ridge,
    lasso = results$lasso,
    grouplasso = results$grouplasso,
    elitistlasso = results$elitistlasso,
    groups = ncol(X), 
    constraints = matrix(1, J, ncomp), 
    itr = 1000000, 
    Wstart = matrix(0, J, ncomp))

#inspect the results
fit$W
dat$P[, 1:ncomp]