# MultiCCA: Perform sparse multiple canonical correlation analysis. In PMA: Penalized Multivariate Analysis

## Description

Given matrices \$X1,...,XK\$, which represent K sets of features measured on the same set of samples, find sparse weight vectors \$w1,...,wK\$ such that \$sum_(i<j) (wi' Xi' Xj wj)\$ is large. If the columns of Xk are ordered (and type="ordered"), then wk will also be smooth. In each of \$X1,...,XK\$, the samples are on the rows and the features are on the columns. \$X1,...,XK\$ must have the same number of rows, but may (and usually will) have different numbers of columns.
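To make the criterion concrete, the following base R sketch evaluates the sum over pairs for a given set of weight vectors. The helper `multicca_objective` and the simulated data are illustrative assumptions, not part of PMA, and the weights are arbitrary rather than fitted.

```r
# Objective from the description: sum over pairs i < j of w_i' X_i' X_j w_j.
# `multicca_objective` is a hypothetical helper, not part of PMA.
multicca_objective <- function(xlist, wlist) {
  K <- length(xlist)
  stopifnot(length(wlist) == K, K >= 2)
  # Project each data set onto its weight vector: one score per sample.
  scores <- lapply(seq_len(K), function(k) xlist[[k]] %*% wlist[[k]])
  total <- 0
  for (i in seq_len(K - 1)) {
    for (j in (i + 1):K) {
      # w_i' X_i' X_j w_j equals the inner product of the two score vectors.
      total <- total + sum(scores[[i]] * scores[[j]])
    }
  }
  total
}

set.seed(1)
n <- 20
xlist <- list(
  matrix(rnorm(n * 5), n, 5),
  matrix(rnorm(n * 8), n, 8),
  matrix(rnorm(n * 3), n, 3)
)
# Arbitrary sparse weights, for illustration only.
wlist <- list(c(1, 0, 0, 0, 0), c(0, 0.6, 0.8, 0, 0, 0, 0, 0), c(0, 1, 0))
obj <- multicca_objective(xlist, wlist)
```

Working with the per-sample scores avoids forming any \$p_i x p_j\$ cross-product matrix, which matters when the data sets have many features.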

## Usage

    MultiCCA(
      xlist,
      penalty = NULL,
      ws = NULL,
      niter = 25,
      type = "standard",
      ncomponents = 1,
      trace = TRUE,
      standardize = TRUE
    )
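A minimal call might look like the sketch below. The simulated data, the helper `make_block`, and the choice `penalty = 2` are assumptions made for illustration; the call is skipped when the PMA package is not installed.

```r
# Simulated data: three feature sets measured on the same 30 samples,
# sharing one latent signal `u` (all names here are illustrative).
set.seed(42)
n <- 30
u <- rnorm(n)
make_block <- function(p, n_signal) {
  loadings <- c(rep(1, n_signal), rep(0, p - n_signal))
  outer(u, loadings) + matrix(rnorm(n * p), n, p)
}
xlist <- list(make_block(20, 5), make_block(40, 5), make_block(15, 5))

out <- NULL
if (requireNamespace("PMA", quietly = TRUE)) {
  out <- PMA::MultiCCA(
    xlist,
    penalty = 2,        # same L1 bound for every data set; within [1, sqrt(p_k)]
    type = "standard",
    ncomponents = 1,
    trace = FALSE
  )
  # Dimensions of the weight matrix returned for each data set.
  print(sapply(out$ws, dim))
}
```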

## Arguments

| Argument | Description |
|---|---|
| xlist | A list of length K, where K is the number of data sets on which to perform sparse multiple CCA. Data set k should be a matrix of dimension \$n x p_k\$, where \$p_k\$ is the number of features in data set k. |
| penalty | The penalty terms to be used. Can be a single value (if the same penalty term is to be applied to each data set) or a K-vector, indicating a different penalty term for each data set. There are 2 possible interpretations for the penalty terms. If type="standard", then this is an L1 bound on wk, and it must be between 1 and \$sqrt(p_k)\$ (\$p_k\$ is the number of features in matrix Xk). If type="ordered", then this is the parameter for the fused lasso penalty on wk. |
| ws | A list of length K. The kth element contains the first ncomponents columns of the v matrix of the SVD of Xk. If NULL, then the SVD of \$X1,...,XK\$ will be computed inside the MultiCCA function. However, if you plan to run this function multiple times, then save a copy of this argument so that it does not need to be re-computed. |
| niter | How many iterations should be performed? Default is 25. |
| type | Are the columns of \$X1,...,XK\$ unordered (type="standard") or ordered (type="ordered")? If "standard", then a lasso penalty is applied to wk, to enforce sparsity. If "ordered" (generally used for CGH data), then a fused lasso penalty is applied, to enforce both sparsity and smoothness. This argument can be a vector of length K (if different data sets are of different types) or a single value "ordered"/"standard" (if all data sets are of the same type). |
| ncomponents | How many factors do you want? Default is 1. |
| trace | Print out progress? Default is TRUE. |
| standardize | Should the columns of \$X1,...,XK\$ be centered (to have mean zero) and scaled (to have standard deviation 1)? Default is TRUE. |
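The penalty bound for type="standard" and the SVD-based initial weights can be sketched in base R as follows. Both `check_l1_penalty` and `init_ws` are hypothetical helpers, not PMA functions, and standardizing the columns before taking the SVD is an assumption about the order of operations that the argument descriptions do not spell out.

```r
# Hypothetical helpers (not part of PMA), for illustration.

# Check an L1 bound for type = "standard": 1 <= penalty <= sqrt(p_k).
check_l1_penalty <- function(xlist, penalty) {
  K <- length(xlist)
  if (length(penalty) == 1) penalty <- rep(penalty, K)  # recycle a single value
  stopifnot(length(penalty) == K)
  upper <- sqrt(vapply(xlist, ncol, integer(1)))
  penalty >= 1 & penalty <= upper
}

# Initial weights: first `ncomponents` right singular vectors of each X_k.
# Assumption: columns are standardized first, mirroring standardize = TRUE.
init_ws <- function(xlist, ncomponents = 1, standardize = TRUE) {
  lapply(xlist, function(x) {
    if (standardize) x <- scale(x, center = TRUE, scale = TRUE)
    v <- svd(x)$v
    matrix(v[, seq_len(ncomponents)], ncol = ncomponents)
  })
}

set.seed(7)
xlist <- list(matrix(rnorm(25 * 10), 25, 10), matrix(rnorm(25 * 16), 25, 16))
# Upper bounds are sqrt(10) (about 3.16) and sqrt(16) = 4.
ok <- check_l1_penalty(xlist, penalty = c(2, 5))
ws0 <- init_ws(xlist, ncomponents = 2)
```

A list built this way has the shape described for the ws argument, so it could be computed once and reused across repeated runs, for example when trying several penalty values.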

## Value

| Value | Description |
|---|---|
| ws | A list of length K, containing the sparse canonical variates found (element k is a \$p_k x ncomponents\$ matrix). |
| ws.init | A list of length K containing the initial values of ws used. By default, these are the first ncomponents columns of the v matrix of the SVD of each matrix Xk. |
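One common way to inspect the returned weights is to project each data set onto them and correlate the resulting per-sample scores. The sketch below uses a made-up `ws` list with the shape described above in place of a real MultiCCA result; `canonical_scores` is a hypothetical helper, not part of PMA.

```r
# Hypothetical helper: project each data set onto its weights and
# return the per-sample scores, one column per data set.
canonical_scores <- function(xlist, ws, component = 1, standardize = TRUE) {
  mapply(function(x, w) {
    if (standardize) x <- scale(x)
    drop(x %*% w[, component])
  }, xlist, ws)
}

set.seed(3)
n <- 40
u <- rnorm(n)  # latent signal shared by both data sets
xlist <- list(
  outer(u, c(1, 1, 0, 0)) + matrix(rnorm(n * 4, sd = 0.3), n, 4),
  outer(u, c(0, 1, 1, 0, 0, 0)) + matrix(rnorm(n * 6, sd = 0.3), n, 6)
)
# Stand-in for the `ws` element of a MultiCCA result (made up here).
ws <- list(
  matrix(c(0.7, 0.7, 0, 0), ncol = 1),
  matrix(c(0, 0.7, 0.7, 0, 0, 0), ncol = 1)
)
scores <- canonical_scores(xlist, ws)  # n x K matrix
pairwise_cor <- cor(scores)            # K x K correlation matrix
```

Large off-diagonal entries in `pairwise_cor` indicate that the selected features in different data sets vary together across samples.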

## References

Witten, D. M., Tibshirani, R., and Hastie, T. (2009) A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics, Vol 10 (3), 515-534.