mcancor | R Documentation |
Performs a canonical correlation analysis (CCA) on multiple data domains,
where constraints such as non-negativity or sparsity are enforced on the
canonical vectors. The result of the analysis is returned as a list of
class mcancor.
mcancor(
x,
center = TRUE,
scale_ = FALSE,
nvar = min(sapply(x, dim)),
predict,
cor_tol = NULL,
nrestart = 10,
iter_tol = 0,
iter_max = 50,
partial_model = NULL,
verbosity = 0
)
x |
a list of numeric matrices which contain the data from the different domains |
center |
a list of logical values indicating whether the empirical mean
of (each column of) the corresponding data matrix should be subtracted.
Alternatively, a list of vectors can be supplied, where each vector
specifies the mean to be subtracted from the corresponding data matrix.
Each list element is passed to |
scale_ |
a list of logical values indicating whether the columns of the
corresponding data matrix should be scaled to have unit variance before the
analysis takes place. The default is |
nvar |
the number of canonical variables to be computed for each domain. With the default setting, canonical variables are computed until at least one data matrix is fully deflated. |
predict |
a list of regression functions to predict the sum of the
canonical variables of all other domains. The formal arguments for each
regression function are the design matrix |
cor_tol |
a threshold indicating the magnitude below which canonical
variables should be omitted. Variables are omitted if the sum of all their
correlations is less than or equal to |
nrestart |
the number of random restarts for computing the canonical variables via iterated regression steps. The solution achieving maximum explained correlation over all random restarts is kept. A value greater than one can help to avoid poor local maxima. |
iter_tol |
If the relative change of the objective is less than
|
iter_max |
the maximum number of iterations to be performed. The
procedure is terminated if either the |
partial_model |
|
verbosity |
an integer specifying the verbosity level. Greater values result in more output, the default is to be quiet. |
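As a rough illustration of what an element of the predict list can look like, the sketch below defines an unconstrained least-squares regression function in base R. It assumes the three-argument form function(x, sc, cc) used in the example on this page, with x as the design matrix and sc as the regression target. The name ols_predict and the demo data are hypothetical, and the third argument is simply ignored here.

```r
# Hypothetical helper, not part of the package: ordinary least squares
# without an intercept, returning one coefficient per column of x.
ols_predict <- function(x, sc, cc) {
  # x  : design matrix of the current domain
  # sc : regression target (sum of the canonical variables of the other domains)
  # cc : third argument, unused in this sketch
  # qr.solve() gives the least-squares solution for a full-column-rank x
  as.vector(qr.solve(x, sc))
}

# Noise-free demo data, so the true weights can be recovered
set.seed(1)
x_demo <- matrix(rnorm(50 * 3), nrow = 50)
w_true <- c(1, -2, 0.5)
sc_demo <- as.vector(x_demo %*% w_true)
w_hat <- ols_predict(x_demo, sc_demo, cc = 1)
```

A function like this imposes no sparsity or non-negativity. Constraints of that kind would come from the regression method chosen inside the function, as with the elastic net used in the example on this page.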
mcancor generalizes nscancor to the case where more than two data domains
are available for an analysis. Its objective is to maximize the sum of all
pairwise correlations of the canonical variables.
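The objective described above can be made concrete with a small base R sketch. It evaluates the sum of pairwise correlations for three simulated domains and fixed, hand-picked weight vectors. The data, the weight vectors, and the choice to count each pair of domains once are all assumptions made for illustration; nothing here is optimized and no package function is called.

```r
set.seed(42)
n <- 30
z <- rnorm(n)  # latent signal shared by all hypothetical domains

# Three hypothetical data domains with 2, 3 and 2 columns
domains <- list(
  cbind(z + rnorm(n), rnorm(n)),
  cbind(z + rnorm(n), rnorm(n), rnorm(n)),
  cbind(z + rnorm(n), rnorm(n))
)

# Hand-picked weight vectors selecting the first column of each domain
w <- list(c(1, 0), c(1, 0, 0), c(1, 0))

# One canonical variable per domain, stored as the columns of cv
cv <- sapply(seq_along(domains), function(i) as.vector(domains[[i]] %*% w[[i]]))

# Pairwise correlations between domains, summed over unique pairs
pairwise <- cor(cv)
objective <- sum(pairwise[upper.tri(pairwise)])
```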
mcancor returns a list of class mcancor with the following elements:
cor |
a multi-dimensional array containing the additional correlations
explained by each pair of canonical variables. The first two dimensions
correspond to the domains, and the third dimension corresponds to the
different canonical variables per domain (see also |
coef |
a list of matrices containing the canonical vectors related to each data domain. The canonical vectors are stored as the columns of each matrix. |
center |
the list of empirical means used to center the data matrices |
scale |
the list of empirical standard deviations used to scale the data matrices |
xp |
the list of deflated
data matrices corresponding to |
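To show how a three-dimensional array laid out like cor can be indexed, the sketch below builds a stand-in array in base R, with domains on the first two dimensions and canonical variables on the third. The array cor_demo is filled with correlations of random data purely for illustration. It is not output of mcancor, and its values (including the diagonal) need not match what the function stores.

```r
set.seed(7)
n_domains <- 3
n_cv <- 2

# Stand-in array: one symmetric domain-by-domain matrix per canonical variable
cor_demo <- array(0, dim = c(n_domains, n_domains, n_cv))
for (k in seq_len(n_cv)) {
  cor_demo[, , k] <- cor(matrix(rnorm(20 * n_domains), ncol = n_domains))
}

# Entry for domains 1 and 3 and the second canonical variable
cor_13_2 <- cor_demo[1, 3, 2]

# Sum over unique domain pairs, one value per canonical variable
pair_sums <- apply(cor_demo, 3, function(m) sum(m[upper.tri(m)]))
```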
macor, nscancor, scale
# As of version 1.2.1 of the PMA package, breastdata.rda is no longer
# contained in the package and needs to be downloaded separately
breastdata_url <- "https://statweb.stanford.edu/~tibs/PMA/breastdata.rda"
breastdata_file <- tempfile("breastdata_", fileext = ".rda")
status <- download.file(breastdata_url, breastdata_file, mode = "wb")
# download.file() returns a non-zero status on failure
if (status > 0)
  stop("Unable to download from ", breastdata_url)
load(breastdata_file)

# Three data domains: a subset of genes, and CGH spots for the first and
# second chromosome
x <- with(
  breastdata,
  list(t(rna)[ , 1:100], t(dna)[ , chrom == 1], t(dna)[ , chrom == 2])
)

# Sparse regression functions with different cardinalities for different domains
generate_predict <- function(dfmax) {
  force(dfmax)
  return(
    function(x, sc, cc) {
      # Elastic net fit without intercept; dfmax limits the number of
      # variables in the model
      en <- glmnet::glmnet(x, sc, alpha = 0.05, intercept = FALSE, dfmax = dfmax)
      W <- coef(en)
      # Drop the intercept row and keep the coefficients of the last fit
      # on the regularization path
      return(W[2:nrow(W), ncol(W)])
    }
  )
}
predict <- lapply(c(20, 10, 10), generate_predict)

# Compute two canonical variables per domain
mcc <- mcancor(x, predict = predict, nvar = 2)

# Compute another canonical variable for each domain
mcc <- mcancor(x, predict = predict, nvar = 3, partial_model = mcc)
mcc$cor