Regularized Parameters Search

Description

Regularized parameters search method for "msma".

Usage

regparasearch(X, Y = NULL, Z = NULL, eta = 1, type = "lasso",
  inX = NULL, inY = NULL, muX = 0, muY = 0, comp = 1, nfold = 5,
  maxrep = 3, minpct = 0, maxpct = 1, method = c("BIC", "CV")[1])

## S3 method for class 'regparasearch'
print(x, ...)

Arguments

X

a (list of) matrix, explanatory variable(s) which is required.

Y

a (list of) matrix, objective variable(s). This is optional. If Y is not supplied, the PCA method is used.

Z

a vector, response variable(s). This is optional. Its length is the number of subjects. If Z is not supplied, unsupervised PLS/PCA is used.

eta

numeric scalar, the parameter indexing the penalty family. The current version supports only the value 1.

type

a character, the penalty family. The current version supports only "lasso".

inX

a (list of) numeric vector to specify the variables of X which are always in the model.

inY

a (list of) numeric vector to specify the variables of Y which are always in the model.

muX

a numeric scalar, the weight of X in the supervised analysis, with 0 <= muX <= 1.

muY

a numeric scalar, the weight of Y in the supervised analysis, with 0 <= muY <= 1.

comp

numeric scalar for the number of components to be considered.

nfold

number of folds. The default is 5.

maxrep

numeric scalar for the number of iterations.

minpct

proportion of the candidate parameter range used as the lower limit of the search. The default is 0.

maxpct

proportion of the candidate parameter range used as the upper limit of the search. The default is 1.

method

a character, the evaluation method: "CV" for cross-validation based on the matrix element-wise error, or "BIC" for the Bayesian information criterion. The default is "BIC".

x

an object of class "regparasearch", usually the result of a call to regparasearch.

...

further arguments passed to or from other methods.

Details

This function searches for the sparsity regularization parameters lambdaX and lambdaY for msma. The initial range of candidates is computed from a fit with the regularization parameters set to 0. A binary search is then conducted: the parameter range is divided into two regions, each region is represented by its median, and the region whose representative value gives the minimum criterion in the corresponding fit is selected. Either the CV error or the BIC can be used as the criterion. The selected region is again divided into two regions, and the process is repeated maxrep times. The final median value in the selected region is taken as the optimal regularization parameter. The search is conducted over combinations of the parameters for X and Y. The range of candidates can be restricted by setting a percentile of the range as its lower or upper limit (minpct and maxpct).
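
The halving search described above can be illustrated with a minimal sketch for a single parameter. This is not the code used inside regparasearch: the helper halving_search and the toy criterion are hypothetical names introduced only for illustration, the toy criterion stands in for the BIC or CV error of a fit, and the real search runs over combinations of the parameters for X and Y.

```r
# Hypothetical helper (not part of msma): halving search over one parameter.
# `criterion` stands in for the BIC or CV error of a fit at a given lambda.
halving_search <- function(criterion, lower, upper, maxrep = 3) {
  for (i in seq_len(maxrep)) {
    mid <- (lower + upper) / 2
    # Representative value of each half: the median (midpoint) of the region.
    rep_low  <- (lower + mid) / 2
    rep_high <- (mid + upper) / 2
    if (criterion(rep_low) <= criterion(rep_high)) {
      upper <- mid  # keep the lower region
    } else {
      lower <- mid  # keep the upper region
    }
  }
  # The median of the finally selected region is the reported optimum.
  list(optlambda = (lower + upper) / 2, range = c(lower, upper))
}

# Toy criterion with its minimum at lambda = 0.3 (an assumption, not a real BIC).
toy_criterion <- function(lambda) (lambda - 0.3)^2

res <- halving_search(toy_criterion, lower = 0, upper = 1, maxrep = 3)
res$optlambda  # 0.3125
```

With maxrep = 3 the selected region shrinks from [0, 1] to [0.25, 0.375], so a larger maxrep gives a finer result at the cost of more fits.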

Value

optlambdaX

Optimal parameters for X

optlambdaY

Optimal parameters for Y

mincriterion

Minimum criterion value

criterions

All criterion values obtained during the search

pararange

Range of candidate parameters

Examples

##### data #####
tmpdata = simdata(n = 50, rho = 0.8, Yps = c(10, 12, 15), Xps = 20, seed=1)
X = tmpdata$X; Y = tmpdata$Y 

##### Regularized parameters search #####
opt1 = regparasearch(X, Y, comp=1, nfold=5, maxrep=2)
opt1
fit4 = msma(X, Y, comp=1, lambdaX=opt1$optlambdaX, lambdaY=opt1$optlambdaY)
fit4
summary(fit4)

##### Restrict search range #####
opt2 = regparasearch(X, Y, comp=1, nfold=5, maxrep=2, minpct=0.5)
opt2