ncompsearch: Search for Number of Components

Description

Determines the number of components using either cross-validation (CV) or the Bayesian information criterion (BIC).

Usage

ncompsearch(
  X,
  Y = NULL,
  Z = NULL,
  comps = 1:3,
  lambdaX = NULL,
  lambdaY = NULL,
  lambdaXsup = NULL,
  lambdaYsup = NULL,
  eta = 1,
  type = "lasso",
  inX = NULL,
  inY = NULL,
  inXsup = NULL,
  inYsup = NULL,
  muX = 0,
  muY = 0,
  nfold = 5,
  regpara = FALSE,
  maxrep = 3,
  minpct = 0,
  maxpct = 1,
  criterion = c("CV", "BIC")[1],
  whichselect = NULL,
  intseed = 1
)

## S3 method for class 'ncompsearch'
print(x, ...)

## S3 method for class 'ncompsearch'
plot(x, cidx = 1, ...)

Arguments

X

a matrix or list of matrices indicating the explanatory variable(s). This parameter is required.

Y

a matrix or list of matrices indicating the objective variable(s). This is optional. If Y is not supplied, PCA is performed.

Z

a vector of response variable(s) used for the supervised version of (multiblock) PCA or PLS. This is optional. The length of Z is the number of subjects. If Z is not supplied, unsupervised PLS/PCA is performed.

comps

numeric vector of the maximum numbers of components to be considered.

lambdaX

numeric vector of regularization parameters for X, with a length equal to the number of blocks. If lambdaX is omitted, no regularization is applied.

lambdaY

numeric vector of regularization parameters for Y, with a length equal to the number of blocks. If lambdaY is omitted, no regularization is applied.

lambdaXsup

numeric vector of regularization parameters for the super weight of X, with a length equal to the number of blocks. If omitted, no regularization is applied.

lambdaYsup

numeric vector of regularization parameters for the super weight of Y, with a length equal to the number of blocks. If omitted, no regularization is applied.

eta

numeric scalar indicating the parameter indexing the penalty family. Only the value 1 is available in this version.

type

a character string indicating the penalty family. In this version, only one choice is available: "lasso".

inX

a numeric vector (or list of numeric vectors) specifying the variables of X that are always in the model.

inY

a numeric vector (or list of numeric vectors) specifying the variables of Y that are always in the model.

inXsup

a numeric vector (or list of numeric vectors) specifying the blocks of X that are always in the model.

inYsup

a numeric vector (or list of numeric vectors) specifying the blocks of Y that are always in the model.

muX

a numeric scalar for the weight of X for the supervised case. 0 <= muX <= 1.

muY

a numeric scalar for the weight of Y for the supervised case. 0 <= muY <= 1.

nfold

number of folds used for cross-validation. The default is 5.

regpara

logical. If TRUE, the search for the regularization parameters is conducted simultaneously.

maxrep

numeric scalar for the number of iterations.

minpct

minimum of the candidate parameters, defined as a percentile of the automatically determined (possible) candidates.

maxpct

maximum of the candidate parameters, defined as a percentile of the automatically determined (possible) candidates.

criterion

a character string giving the evaluation criterion: "CV" for cross-validation, based on a matrix element-wise error, or "BIC" for the Bayesian information criterion. "CV" is the default, as given in the usage (c("CV", "BIC")[1]).

whichselect

which blocks are selected.

intseed

seed for the random number generator used in the parameter estimation algorithm.

x

an object of class "ncompsearch", usually the result of a call to ncompsearch.

...

further arguments passed to or from other methods.

cidx

integer used by the plot method to specify whether the block or the super results are plotted. 1 = block (default), 2 = super.
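
The default for criterion is written in the usage as c("CV", "BIC")[1]. As a minimal base R sketch (it does not call ncompsearch, and the variable names are only illustrative), that expression indexes the first element of the vector, so the default evaluates to "CV":

```r
# Default value of `criterion`, exactly as written in the usage section
criterion <- c("CV", "BIC")[1]
print(criterion)

# The second element is what one would pass explicitly to request BIC,
# i.e. criterion = "BIC"
criterion_bic <- c("CV", "BIC")[2]
print(criterion_bic)
```

To use the BIC, pass criterion = "BIC" explicitly in the call.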

Details

This function searches for the optimal number of components.
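
To give a rough picture of what an nfold-fold split of the subjects looks like when criterion = "CV", here is a small base R sketch. It is not the package's internal code: the helper make_folds is hypothetical, and the way msma actually partitions the subjects may differ.

```r
# Hypothetical helper (not part of msma): assign n subjects to nfold folds.
make_folds <- function(n, nfold = 5, seed = 1) {
  set.seed(seed)                                # reproducible shuffling
  sample(rep(seq_len(nfold), length.out = n))   # balanced fold labels in random order
}

folds <- make_folds(n = 50, nfold = 5)
print(table(folds))               # number of subjects per fold

test_idx  <- which(folds == 1)    # subjects held out in the first round
train_idx <- which(folds != 1)    # subjects used for fitting in the first round
```

In each round one fold is held out for evaluation and the remaining folds are used for fitting, and the errors are combined over the rounds.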

Value

comps

numbers of components

mincriterion

minimum criterion values

criterions

criterion values

optncomp

optimal number of components, that is, the candidate with the minimum criterion value (the cross-validation error or the BIC, depending on criterion)
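
As the descriptions above read, the returned values relate to each other by a minimum search over the candidates. The base R sketch below mimics that relationship under that assumption. The numbers are invented purely for illustration and are not output from ncompsearch, and the real criterions element may have a different shape (for example, separate block and super values).

```r
# Illustrative values only: NOT output from ncompsearch
comps      <- c(1, 5, 10, 20)             # candidate numbers of components
criterions <- c(0.92, 0.61, 0.66, 0.74)   # hypothetical criterion value per candidate

best         <- which.min(criterions)     # position of the smallest criterion value
mincriterion <- criterions[best]          # minimum criterion value
optncomp     <- comps[best]               # candidate attaining that minimum

print(optncomp)
```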

Examples

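##### data #####
library(msma)
# Simulated data for 50 subjects; Yps and Xps give the numbers of columns per block
tmpdata <- simdata(n = 50, rho = 0.8, Yps = c(10, 12, 15), Xps = 20, seed = 1)
X <- tmpdata$X; Y <- tmpdata$Y

##### number of components search #####
# Candidates are 1, 5, 10, and 20 components, compared by 5-fold cross-validation
ncomp1 <- ncompsearch(X, Y, comps = c(1, 5, 10 * (1:2)), nfold = 5)
# plot(ncomp1)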


msma documentation built on May 29, 2024, 2:52 a.m.