seedCCA | R Documentation |
The function seedCCA mainly implements the seeded canonical correlation analysis proposed by Im et al. (2015). It conducts one of four methods, depending on the value of type. The option type must be one of c("cca", "seed1", "seed2", "pls").
seedCCA(X,Y,type="seed2",ux=NULL,uy=NULL,u=10,eps=0.01,cut=0.9,d=NULL,AS=TRUE,scale=FALSE)
X |
numeric vector or matrix (n * p), the first set of variables |
Y |
numeric vector or matrix (n * r), the second set of variables |
type |
character, a choice of methods among c("cca", "seed1", "seed2", "pls"). The default is "seed2". |
ux |
numeric, maximum number of projections for X. The default is NULL. If this is not NULL, it surpasses the option |
uy |
numeric, maximum number of projections for Y. The default is NULL. If this is not NULL, it surpasses the option |
u |
numeric, maximum number of projections. The default is 10. This is used for |
eps |
numeric, the criteria to terminate iterative projections. The default is 0.01. If increment of projections is less than |
cut |
numeric, between 0 and 1. The default is 0.9.
If |
d |
numeric, the user-selected number of largest eigenvectors of cov(X, Y) and cov(Y, X). The default is NULL. This only works for |
AS |
logical, status of automatic stop of projections. The default is TRUE. |
scale |
logical, whether to scale predictors to have zero mean and unit standard deviation. The default is FALSE. |
Let p and r be the numbers of variables in the two sets and n be the sample size. The option type="cca" works only when max(p,r) < n, in which case seedCCA conducts standard canonical correlation analysis (Johnson and Wichern, 2007). If type="cca" is given and either p or r is equal to one, ordinary least squares (OLS) is done instead of canonical correlation analysis.
If max(p,r) >= n, either type="seed1" or type="seed2" has to be chosen. This is the main purpose of seedCCA. If type="seed1", only the set with more variables, say X with p variables for convenience (the other being Y with r variables), is initially reduced by the iterative projection approach (Cook et al. 2007). The canonical correlation analysis of the initially-reduced X and the original Y is then finalized. If type="seed2", both X and Y are initially reduced, and the canonical correlation analysis of the two initially-reduced sets is then finalized.
If type="pls", partial least squares (PLS) is done. In this case the first set of variables in seedCCA is the predictors and the second set is the response, so the order of the two sets matters. The response can be multivariate.
Depending on the value of type, the subclass of the object returned by seedCCA differs:
type="cca": subclass "finalCCA" (p > 2; r > 2; p, r < n)
type="cca": subclass "seedols" (either p or r is equal to 1)
type="seed1" and type="seed2": subclass "finalCCA" (max(p,r) > n)
type="pls": subclass "seedpls" (p > n and r < n)
So, plot(object) produces a different figure depending on the subclass of the object.
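The mapping between type, the dimensions, and the returned subclass can be summarized in a small helper. This is only a sketch of the rules stated above: the function name expected_subclass is hypothetical and is not part of seedCCA.

```r
# Hypothetical helper (not part of seedCCA): returns the subclass that the
# rules above associate with a given type and dimensions n, p, r.
expected_subclass <- function(type = c("seed2", "cca", "seed1", "pls"), n, p, r) {
  type <- match.arg(type)
  if (type == "cca") {
    # Standard CCA is only available when both sets have fewer variables than n.
    if (max(p, r) >= n) {
      stop('type="cca" requires max(p, r) < n; use "seed1" or "seed2" instead')
    }
    # A single variable in either set turns the problem into OLS.
    if (min(p, r) == 1) "seedols" else "finalCCA"
  } else if (type == "pls") {
    "seedpls"
  } else {
    "finalCCA"  # "seed1" and "seed2"
  }
}

expected_subclass("cca", n = 70, p = 4, r = 4)
expected_subclass("cca", n = 70, p = 4, r = 1)
expected_subclass("seed1", n = 70, p = 256, r = 4)
```

Checking the subclass first is a simple way to know which of the value lists applies to a fitted object and which kind of figure plot(object) will draw.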
The values returned for each type are listed below in the following order:
type="cca": standard CCA (max(p,r) < n, min(p,r) > 1) / "finalCCA" subclass
type="cca": ordinary least squares (max(p,r) < n, min(p,r) = 1) / "seedols" subclass
type="seed1": seeded CCA with case 1 (max(p,r) > n and p > r) / "finalCCA" subclass
type="seed1": seeded CCA with case 1 (max(p,r) > n and p <= r) / "finalCCA" subclass
type="seed2": seeded CCA with case 2 (max(p,r) > n) / "finalCCA" subclass
type="pls": partial least squares (p > n and r < n) / "seedpls" subclass
type="cca" |
Values with selecting type="cca" when min(p,r) > 1 (standard CCA, "finalCCA" subclass) |
cor |
canonical correlations |
xcoef |
the estimated canonical coefficients for X |
ycoef |
the estimated canonical coefficients for Y |
Xscores |
the estimated canonical variates for X |
Yscores |
the estimated canonical variates for Y |
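To illustrate how these components relate to each other, the following sketch uses base R's stats::cancor on simulated data. It is not seedCCA's implementation, and the scaling of the coefficients may differ from what seedCCA returns; the data and object names are made up for the example.

```r
# Illustration with base R's stats::cancor, not seedCCA itself.
set.seed(1)
n <- 50
X <- matrix(rnorm(n * 4), n, 4)                        # first set, p = 4
Y <- X[, 1:2] %*% matrix(c(1, 0.5, -0.5, 1), 2, 2) +   # second set, r = 2
  matrix(rnorm(n * 2), n, 2)

cc <- cancor(X, Y)

# Canonical variates: centered data multiplied by the coefficient matrices.
Xscores <- scale(X, center = cc$xcenter, scale = FALSE) %*% cc$xcoef
Yscores <- scale(Y, center = cc$ycenter, scale = FALSE) %*% cc$ycoef

cc$cor                             # canonical correlations, largest first
cor(Xscores[, 1], Yscores[, 1])    # agrees with cc$cor[1]
```

The correlation between the first pair of canonical variates reproduces the first canonical correlation, which is the relationship between cor, the coefficients, and the scores listed above.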
type="cca" |
Values with selecting type="cca" when min(p,r) = 1 (ordinary least squares, "seedols" subclass) |
coef |
the estimated ordinary least squares coefficients |
X |
X, the first set |
Y |
Y, the second set |
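When one set has a single variable, the canonical direction for the other set is proportional to the OLS slope vector, which is why OLS can stand in for CCA in this case. The sketch below checks this on simulated data with base R only; whether seedCCA centers the data or reports an intercept in coef is not stated here, so treat the centering as an assumption.

```r
# Base R illustration (not seedCCA itself): OLS versus CCA when r = 1.
set.seed(1)
n <- 30
X <- matrix(rnorm(n * 4), n, 4)
y <- drop(X %*% c(1, -1, 0.5, 0)) + rnorm(n, sd = 0.1)

# OLS slopes from centered data (assumption: an intercept is handled by centering).
Xc <- scale(X, center = TRUE, scale = FALSE)
yc <- y - mean(y)
coef_ols <- solve(crossprod(Xc), crossprod(Xc, yc))

# Same slopes from lm(), dropping the intercept.
coef_lm <- coef(lm(y ~ X))[-1]

# The first canonical direction for X spans the same line as the OLS slopes.
xcoef1 <- cancor(X, y)$xcoef[, 1]
abs(cor(Xc %*% coef_ols, Xc %*% xcoef1))   # equal to 1 up to rounding
```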
type="seed1" |
Values with selecting type="seed1" when p > r (X is initially reduced, "finalCCA" subclass) |
cor |
canonical correlations |
xcoef |
the estimated canonical coefficients for X |
ycoef |
the estimated canonical coefficients for Y |
proper.u |
a suggested proper number of projections for X |
initialMX0 |
the initialized canonical coefficient matrices of X |
newX |
initially-reduced X |
Y |
the original Y |
Xscores |
the estimated canonical variates for X |
Yscores |
the estimated canonical variates for Y |
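The initial reduction of X can be pictured as a seeded, Krylov-type projection in the spirit of Cook et al. (2007): cov(X, Y) is the seed, it is repeatedly multiplied by cov(X), and the seed is projected onto the span of the accumulated columns. The sketch below is a rough illustration under stated assumptions. The function name seeded_reduce, the form of the stopping rule (n times the trace of the weighted increment compared with eps), and the way newX is formed are assumptions made for this example and are not taken from seedCCA's source.

```r
# Hypothetical sketch of a seeded projection; not seedCCA's actual code.
seeded_reduce <- function(X, Y, u = 10, eps = 0.01) {
  X <- as.matrix(X); Y <- as.matrix(Y)
  n <- nrow(X)
  Sx <- cov(X)           # p x p sample covariance of X
  seed <- cov(X, Y)      # p x r seed matrix
  r <- ncol(seed)
  R <- seed              # accumulated columns: seed, Sx seed, Sx^2 seed, ...
  M_prev <- NULL
  for (k in seq_len(u)) {
    # Orthonormal basis of span(R); this avoids inverting an ill-conditioned matrix.
    Q <- qr.Q(qr(R))
    # Projection of the seed onto span(R) in the Sx inner product.
    M <- Q %*% solve(t(Q) %*% Sx %*% Q, t(Q) %*% seed)
    if (!is.null(M_prev)) {
      D <- M - M_prev
      # Assumed stopping rule: stop once the increment is below eps.
      if (n * sum(diag(t(D) %*% Sx %*% D)) < eps) break
    }
    M_prev <- M
    last <- R[, (ncol(R) - r + 1):ncol(R), drop = FALSE]
    R <- cbind(R, Sx %*% last)   # add the next block of columns
  }
  list(M = M, u = k, newX = X %*% M)   # newX is n x r
}

set.seed(1)
n <- 40; p <- 60
X <- matrix(rnorm(n * p), n, p)                 # p > n
Y <- X[, 1:2] + matrix(rnorm(n * 2), n, 2)      # r = 2
fit <- seeded_reduce(X, Y, u = 3)
dim(fit$newX)   # n rows, r columns
```

Under these assumptions the reduced set has only r columns, so a standard CCA of the reduced X and the original Y becomes feasible even though p exceeds n.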
type="seed1" |
Values with selecting type="seed1" when p <= r (Y is initially reduced, "finalCCA" subclass) |
cor |
canonical correlations |
xcoef |
the estimated canonical coefficients for X |
ycoef |
the estimated canonical coefficients for Y |
proper.u |
a suggested proper number of projections for Y |
X |
the original X |
initialMY0 |
the initialized canonical coefficient matrices of Y |
newY |
initially-reduced Y |
Xscores |
the estimated canonical variates for X |
Yscores |
the estimated canonical variates for Y |
type="seed2" |
Values with selecting type="seed2" (both X and Y are initially reduced, "finalCCA" subclass) |
cor |
canonical correlations |
xcoef |
the estimated canonical coefficients for X |
ycoef |
the estimated canonical coefficients for Y |
proper.ux |
a suggested proper number of projections for X |
proper.uy |
a suggested proper number of projections for Y |
d |
suggested number of eigenvectors of cov(X,Y) |
initialMX0 |
the initialized canonical coefficient matrices of X |
initialMY0 |
the initialized canonical coefficient matrices of Y |
newX |
initially-reduced X |
newY |
initially-reduced Y |
Xscores |
the estimated canonical variates for X |
Yscores |
the estimated canonical variates for Y |
type="pls" |
Values with selecting type="pls" (partial least squares, "seedpls" subclass) |
coef |
the estimated coefficients for each iterative projection up to u |
u |
the maximum number of projections |
X |
predictors |
Y |
response |
scale |
status of scaling predictors |
cases |
the number of observations |
R. D. Cook, B. Li and F. Chiaromonte. Dimension reduction in regression without matrix inversion. Biometrika 2007; 94: 569-584.
Y. Im, H. Gang and JK. Yoo. High-throughput data dimension reduction via seeded canonical correlation analysis. J. Chemometrics 2015; 29: 193-199.
R. A. Johnson and D. W. Wichern. Applied Multivariate Statistical Analysis, 6th edition. Pearson Prentice Hall: New Jersey, USA; 2007; 539-574.
K. Lee and JK. Yoo. Canonical correlation analysis through linear modeling. Aust. N. Z. J. Stat. 2014; 56: 59-72.
library(seedCCA)
###### data(cookie) ######
data(cookie)
myseq <- seq(141, 651, by=2)
X <- as.matrix(cookie[-c(23,61), myseq])
Y <- as.matrix(cookie[-c(23,61), 701:704])
dim(X); dim(Y)
## standard CCA
fit.cca <- seedCCA(X[,1:4], Y, type="cca") ## standard canonical correlation analysis is done.
plot(fit.cca)
## ordinary least squares
fit.ols1 <- seedCCA(X[,1:4], Y[,1], type="cca") ## ordinary least squares is done, because r=1.
fit.ols2 <- seedCCA(Y[,1], X[,1:4], type="cca") ## ordinary least squares is done, because p=1.
## seeded CCA with case 1
fit.seed1 <- seedCCA(X, Y, type="seed1") ## suggested proper value of u is equal to 3.
fit.seed1.ux <- seedCCA(X, Y, ux=6, type="seed1") ## iterative projections done 6 times.
fit.seed1.uy <- seedCCA(Y, X, uy=6, type="seed1", AS=FALSE) ## projections not done until uy=6.
plot(fit.seed1)
## partial least squares
fit.pls1 <- seedCCA(X, Y[,1], type="pls")
fit.pls.m <- seedCCA(X, Y, type="pls") ## multi-dimensional response
par(mfrow=c(1,2))
plot(fit.pls1); plot(fit.pls.m)
######## data(nutrimouse) ########
data(nutrimouse)
X <- as.matrix(nutrimouse$gene)
Y <- as.matrix(nutrimouse$lipid)
dim(X); dim(Y)
## seeded CCA with case 2
fit.seed2 <- seedCCA(X, Y, type="seed2") ## d not specified, so cut=0.9 (default) used.
fit.seed2.99 <- seedCCA(X, Y, type="seed2", cut=0.99) ## cut=0.99 used.
fit.seed2.d3 <- seedCCA(X, Y, type="seed2", d=3) ## d is specified with 3.
## ux and uy specified, so proper values not suggested.
fit.seed2.uxuy <- seedCCA(X, Y, type="seed2", ux=10, uy=10)
plot(fit.seed2)