Performs nfolds-fold cross-validation to select optimal tuning parameters for SELPCCA based on training data. To apply the optimal tuning parameters to testing data, you may also use multiplescca.
cvselpscca(Xdata1=Xdata1,Xdata2=Xdata2,ncancorr=ncancorr,CovStructure="Iden",
           isParallel=TRUE,ncores=NULL,nfolds=5,ngrid=10,
           standardize=TRUE,thresh=0.0001,maxiteration=20)
Xdata1
A matrix of size n × p for the first dataset. Rows are samples and columns are variables.
Xdata2
A matrix of size n × q for the second dataset. Rows are samples and columns are variables.
ncancorr
Number of canonical correlation vectors. Default is 1.
CovStructure
Covariance structure to use in estimating sparse canonical correlation vectors. Either "Iden" or "Ridge". "Iden" assumes the covariance matrix for each dataset is the identity. "Ridge" uses the sample covariance for each dataset. See the reference article for more details.
isParallel
TRUE or FALSE for parallel computing. Default is TRUE.
ncores
Number of cores to be used for parallel computing. Only used if isParallel=TRUE. If isParallel=TRUE and ncores=NULL, defaults to half the number of system cores.
nfolds
Number of cross-validation folds. Default is 5.
ngrid
Number of grid points for tuning parameters. Default is 10 for each dataset.
standardize
TRUE or FALSE. If TRUE, data will be normalized to have mean zero and variance one for each variable. Default is TRUE.
maxiteration
Maximum number of iterations for the algorithm if it has not converged. Default is 20.
thresh
Threshold for convergence. Default is 0.0001.
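As a rough illustration of what the standardize and nfolds arguments control, the sketch below standardizes the columns of a matrix and assigns samples to folds using base R only. It is an assumption about the general technique, not code taken from the SELPCCA source; the helper make_folds and the simulated matrix X are hypothetical names introduced for this example.

```r
# Hypothetical sketch (not the package's internal code):
# column standardization and random fold assignment in base R.
set.seed(1)
n <- 20
p <- 4
X <- matrix(rnorm(n * p, mean = 5, sd = 2), nrow = n, ncol = p)  # simulated data

# standardize = TRUE: center each column to mean 0 and scale to variance 1
Xstd <- scale(X, center = TRUE, scale = TRUE)

# nfolds = 5: randomly assign each sample (row) to one of 5 folds
make_folds <- function(n, nfolds) {
  sample(rep(seq_len(nfolds), length.out = n))
}
foldid <- make_folds(n, nfolds = 5)

# Rows held out in the first fold, and the rows used for training
test_rows  <- which(foldid == 1)
train_rows <- which(foldid != 1)
```

With n divisible by nfolds, every fold holds the same number of samples; otherwise fold sizes differ by at most one.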
The function returns several R objects, which can be assigned to a variable. To access the results, use the "$" operator.
hatalpha
Estimated sparse canonical correlation vectors for the first dataset.
hatbeta
Estimated sparse canonical correlation vectors for the second dataset.
CovStructure
Covariance structure used in estimating sparse canonical correlation vectors. Either "Iden" or "Ridge".
optTau
Optimal tuning parameters for each dataset.
maxcorr
Estimated canonical correlation coefficient.
tunerange
Grid values for each dataset used for searching optimal tuning parameters.
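Because the estimated vectors are sparse, a common follow-up is to list which variables received nonzero loadings. The sketch below shows one way to do that with the "$" operator. The object mycv_mock is a hypothetical stand-in for the returned value, with made-up numbers, and it assumes hatalpha and hatbeta are stored with one column per canonical vector.

```r
# Hypothetical stand-in for the value returned by cvselpscca();
# the numbers are made up for illustration only.
mycv_mock <- list(
  hatalpha     = matrix(c(0, 0.8, 0, -0.6), ncol = 1),
  hatbeta      = matrix(c(0.6, 0, 0.8), ncol = 1),
  CovStructure = "Iden"
)

# Indices of variables with nonzero loadings in the first canonical vector
selected1 <- which(mycv_mock$hatalpha[, 1] != 0)
selected2 <- which(mycv_mock$hatbeta[, 1] != 0)
```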
Sandra E. Safo, Jeongyoun Ahn, Yongho Jeon, and Sungkyu Jung (2018), Sparse Generalized Eigenvalue Problem with Application to Canonical Correlation Analysis for Integrative Analysis of Methylation and Gene Expression Data. Biometrics.
library(SELPCCA)
##---- read in data
data(DataExample)
Xdata1=DataExample[[1]]
Xdata2=DataExample[[2]]
##---- call cross validation to estimate first canonical correlation vectors
ncancorr=1
mycv=cvselpscca(Xdata1=Xdata1,Xdata2=Xdata2,ncancorr=ncancorr,CovStructure="Iden",
isParallel=FALSE,ncores=NULL,nfolds=5,ngrid=10,
standardize=TRUE,thresh=0.0001,maxiteration=20)
#check output
train.correlation=mycv$maxcorr
optTau=mycv$optTau
hatalpha=mycv$hatalpha
hatbeta=mycv$hatbeta
#obtain correlation plot using training data
scoresX1=Xdata1%*% hatalpha
scoresX2=Xdata2%*% hatbeta
plot(scoresX1, scoresX2,lwd=3,
xlab=paste(
"First Canonical correlation variate for dataset", 1),
ylab=paste("First Canonical correlation variate for dataset", 2),
main=paste("Correlation plot for datasets",1, "and" ,2, ",", "\u03C1 =", mycv$maxcorr))
#obtain correlation plot using testing data
Xtestdata1=DataExample[[3]]
Xtestdata2=DataExample[[4]]
scoresX1=Xtestdata1%*%hatalpha
scoresX2=Xtestdata2%*%hatbeta
mytestcorr=round(abs(cor(Xtestdata1%*%hatalpha,Xtestdata2%*%hatbeta)),3)
plot(scoresX1, scoresX2,lwd=3,xlab=paste(
"First Canonical correlation variate for dataset", 1),
ylab=paste("First Canonical correlation variate for dataset", 2),
main=paste("Correlation plot for datasets",1, "and" ,2, ",", "\u03C1 =", mytestcorr))
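# A minimal, self-contained sketch of the score and correlation computation used
# above, for readers without the package data. Everything here is an assumption
# for illustration: X1sim, X2sim, alpha_demo and beta_demo are simulated or
# hand-picked and are not output of cvselpscca().

```r
set.seed(42)
n <- 50
z <- rnorm(n)                                   # shared latent signal

# Simulated datasets: one variable in each carries the shared signal
X1sim <- cbind(z + rnorm(n, sd = 0.1), rnorm(n), rnorm(n))
X2sim <- cbind(rnorm(n), z + rnorm(n, sd = 0.1))

# Hypothetical sparse weight vectors (one column per canonical vector)
alpha_demo <- matrix(c(1, 0, 0), ncol = 1)
beta_demo  <- matrix(c(0, 1), ncol = 1)

# Canonical variates are linear combinations of the columns
scores1 <- X1sim %*% alpha_demo
scores2 <- X2sim %*% beta_demo

# Absolute correlation between the two sets of scores, rounded as above
democorr <- round(abs(cor(scores1, scores2))[1, 1], 3)
```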