cv_structuredSCA: A K-fold cross-validation procedure when common/distinctive...


Description

cv_structuredSCA helps to find a range of Lasso tuning parameters for the common component, so that a sparse common component can be obtained.

Usage

cv_structuredSCA(DATA, Jk, R, Target, Position, MaxIter, NRSTARTS,
  LassoSequence, nfolds)

Arguments

DATA

The concatenated data block, with rows representing subjects.

Jk

A vector. Each element of this vector is the number of columns of a data block.

R

The number of components (R>=2).

Target

A matrix containing 0's and 1's. Its number of columns equals R, and its number of rows equals the number of blocks to be integrated. Thus, if the element in

Position

Indicates on which component(s) the Lasso penalty is imposed. If unspecified, the algorithm assumes that the Lasso penalty is imposed on the common component(s) only. If there is no common component, the Lasso penalty is applied to all components.

MaxIter

Maximum number of iterations for this algorithm. The default value is 400.

NRSTARTS

The number of multistarts for this algorithm. The default value is 5.

LassoSequence

The range of Lasso tuning parameters. The default value is a sequence of 50 numbers from 0.00000001 to the smallest Lasso tuning parameter that shrinks the entire common component(s) to zero. By default, the 50 numbers are equally spaced on the log scale.

nfolds

Number of folds. If missing, 10-fold cross-validation is performed.
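
The default LassoSequence described above (50 values, equally spaced on the log scale) can be sketched in base R as follows. This is only an illustration: `lasso_max` is a hypothetical placeholder for the upper bound, which the function itself derives from the data, and the way the package builds the sequence internally may differ.

```r
# Hypothetical upper bound (assumption): stands in for the smallest tuning
# parameter that shrinks the whole common component to zero.
lasso_max <- 0.5
lasso_min <- 0.00000001

# 50 values equally spaced on the log scale between lasso_min and lasso_max.
LassoSequence <- exp(seq(log(lasso_min), log(lasso_max), length.out = 50))
```

Because the spacing is uniform in log(lambda), consecutive values differ by a constant ratio rather than a constant step.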

Details

This function searches through a range of Lasso tuning parameters for the common component while keeping the distinctive components fixed (that is, the zeros in the distinctive components stay fixed). It may be helpful if a user wants to obtain some sparseness in the common component.
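
To make the distinction between common and distinctive components concrete, the sketch below separates the two from a Target matrix. It assumes that a column of Target consisting only of 1's marks a common component and that any column containing a 0 marks a distinctive one; this reading is an assumption, and `default_position` is a hypothetical helper, not a function of the package.

```r
# Hypothetical helper (not part of the package): components that would be
# penalized by default.
# Assumption: a column of Target with all 1's is a common component.
default_position <- function(Target) {
  common <- which(colSums(Target) == nrow(Target))
  if (length(common) == 0) {
    # No common component: the penalty applies to all components.
    seq_len(ncol(Target))
  } else {
    common
  }
}

# Two blocks (rows), four components (columns).
Target <- matrix(c(1, 1, 1, 0, 1, 0, 0, 1), 2, 4)
position <- default_position(Target)
```

Under this assumption, only the first column of the Target above is common, so only the first component would be searched over the Lasso sequence.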

Value

MSPE

A vector of mean squared prediction errors (MSPE), one for each value in the sequence of Lasso tuning parameters.

MSPE1SE

The lowest MSPE + 1SE.

Standard_Error

Standard errors.

LassoSequence

The sequence of Lasso tuning parameters used in cross-validation.

plot

A plot of mean squared errors +/- 1 standard error against the Lasso tuning parameters. If LassoSequence is not defined by the user, lambda is shown on a log scale.

LassoRegion

A region where the suitable lambda can be found, according to the "1 SE rule".

RecommendedLasso

A Lasso tuning parameter that leads to a model with PRESS closest to the lowest PRESS + 1SE.

P_hat

Estimated component loading matrix, given the recommended tuning parameter.

T_hat

Estimated component score matrix, given the recommended tuning parameter.

plotlog

An index number for function plot(), which is not useful for users.
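
The "1 SE rule" behind MSPE1SE and RecommendedLasso can be sketched as follows. The numbers are made up for illustration, and restricting the search to tuning parameters at least as large as the minimizer is an assumption about how such a rule is commonly applied, not a statement of the package's internal logic.

```r
# Made-up cross-validation results, for illustration only.
LassoSequence  <- c(0.002, 0.01, 0.02, 0.05, 0.1)
MSPE           <- c(1.20, 1.00, 1.05, 1.12, 1.40)
Standard_Error <- c(0.10, 0.10, 0.09, 0.11, 0.12)

# Lowest MSPE plus one standard error (standard error taken at the minimum).
best    <- which.min(MSPE)
MSPE1SE <- MSPE[best] + Standard_Error[best]

# Assumption: only consider tuning parameters at least as large as the
# minimizer (sparser models), then pick the one whose MSPE is closest
# to the threshold.
candidates <- which(LassoSequence >= LassoSequence[best])
closest    <- candidates[which.min(abs(MSPE[candidates] - MSPE1SE))]
RecommendedLasso <- LassoSequence[closest]
```

The idea is to trade a small, statistically negligible increase in prediction error for a sparser common component.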

References

Witten, D. M., Tibshirani, R., & Hastie, T. (2009). A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics, 10(3), 515-534.

Gu, Z., & Van Deun, K. (2016). A variable selection method for simultaneous component based data integration. Chemometrics and Intelligent Laboratory Systems, 158, 187-199.

Examples

## Not run: 
# Simulate two data blocks measured on the same 5 subjects.
DATA1 <- matrix(rnorm(50), nrow = 5)   # 5 x 10
DATA2 <- matrix(rnorm(100), nrow = 5)  # 5 x 20
DATA <- cbind(DATA1, DATA2)
Jk <- c(10, 20)  # DATA1 has 10 columns, DATA2 has 20.
R <- 4
# Target has one row per block and one column per component.
Target <- matrix(c(1, 1, 1, 0, 1, 0, 0, 1), 2, 4)
cv_structuredSCA(DATA, Jk, R, Target, MaxIter = 100, NRSTARTS = 40,
                 LassoSequence = seq(from = 0.002, to = 0.1,
                                     length.out = 10))

## End(Not run)

ZhengguoGu/RegularizedSCA documentation built on July 4, 2019, 2:46 p.m.