The RCCA function performs Canonical Correlation Analysis with L2 regularization, which makes it possible to conduct Canonical Correlation Analysis in high dimensions. For a pair of random vectors
x = (x_1, ..., x_p) and y = (y_1, ..., y_q)
it seeks vectors
α = (α_1, ..., α_p) and β = (β_1, ..., β_q)
that satisfy L2 constraints
||α|| <= t_1 and ||β|| <= t_2
and that maximize the correlation cor(u, v) between the linear combinations
u = <x , α> and v = <y , β>.
Here <a , b> refers to the inner product between two vectors. The optimal values for α and β are called canonical coefficients and the resulting linear combinations u and v are called canonical variates. It is actually possible to continue the process and find a sequence of canonical coefficients
α[1], ..., α[k] and β[1], ..., β[k]
that satisfy the L2 constraints and such that the linear combinations
u[i] = <x , α[i]> and v[i] = <y , β[i]>
form two sets
{u[1], ..., u[k]} and {v[1], ..., v[k]}
of independent random variables. The maximum possible number of such canonical variates is k = min(p, q). Note that the above optimization problem is equivalent to maximizing the modified correlation coefficient
cov(<x , α>, <y , β>) / [ ( var(<x , α>) + λ_1 ||α||^2 )^(1/2) ( var(<y , β>) + λ_2 ||β||^2 )^(1/2) ],
where
λ_1 and λ_2
control the amount of shrinkage applied to the canonical coefficients.
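The modified correlation coefficient suggests one common way to compute a solution: add λ_1 and λ_2 to the diagonals of the sample covariance matrices of x and y, then take a singular value decomposition of the whitened cross-covariance. The following is a minimal base-R sketch under that assumption. The name rcca_sketch and everything inside it are hypothetical, and this is not necessarily how RCCA itself is implemented.

```r
# Hypothetical helper (not part of any package): ridge-regularized CCA via SVD.
rcca_sketch <- function(X, Y, lambda1 = 0, lambda2 = 0) {
  # Center the columns so that cross-products correspond to covariances
  X <- scale(X, center = TRUE, scale = FALSE)
  Y <- scale(Y, center = TRUE, scale = FALSE)

  # Regularized covariance matrices: var + lambda * identity
  Sxx <- cov(X) + lambda1 * diag(ncol(X))
  Syy <- cov(Y) + lambda2 * diag(ncol(Y))
  Sxy <- cov(X, Y)

  # Inverse square root of a symmetric positive definite matrix
  inv_sqrt <- function(S) {
    e <- eigen(S, symmetric = TRUE)
    e$vectors %*% diag(1 / sqrt(e$values), nrow = length(e$values)) %*% t(e$vectors)
  }
  Sxx_is <- inv_sqrt(Sxx)
  Syy_is <- inv_sqrt(Syy)

  # Singular values of the whitened cross-covariance are the modified correlations
  s <- svd(Sxx_is %*% Sxy %*% Syy_is)
  k <- min(ncol(X), ncol(Y))
  alpha <- Sxx_is %*% s$u[, 1:k, drop = FALSE]
  beta  <- Syy_is %*% s$v[, 1:k, drop = FALSE]

  list(mod.cors = s$d[1:k],
       cors = diag(cor(X %*% alpha, Y %*% beta)),
       alpha = alpha,
       beta = beta)
}

# Simulated data, for illustration only
set.seed(1)
X_sim <- matrix(rnorm(50 * 4), 50, 4)
Y_sim <- matrix(rnorm(50 * 3), 50, 3)
Y_sim[, 1] <- Y_sim[, 1] + X_sim[, 1]

fit0 <- rcca_sketch(X_sim, Y_sim)                            # no penalty
fit1 <- rcca_sketch(X_sim, Y_sim, lambda1 = 10, lambda2 = 0) # penalty on the X side
```

With both penalties set to zero, the sketch reduces to classical CCA, so fit0$mod.cors should agree with stats::cancor(X_sim, Y_sim)$cor. With a positive penalty, each modified correlation is no larger than the corresponding plain correlation, because the penalty only increases the denominator.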
X
– a rectangular n x p matrix containing n observations of random vector x.
Y
– a rectangular n x q matrix containing n observations of random vector y.
lambda1
– a non-negative penalty factor used for regularizing X side coefficients. By default
lambda2
– a non-negative penalty factor used for regularizing Y side coefficients. By default
A list containing the RCCA problem solution:
n.comp
– the number of computed canonical components, i.e. k = min(p, q).
cors
– the resulting k canonical correlations.
mod.cors
– the resulting k values of modified canonical correlation.
x.coefs
– p x k matrix representing k canonical coefficient vectors α[1], ..., α[k].
x.vars
– n x k matrix representing k canonical variates u[1], ..., u[k].
y.coefs
– q x k matrix representing k canonical coefficient vectors β[1], ..., β[k].
y.vars
– n x k matrix representing k canonical variates v[1], ..., v[k].
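The dimensions listed above fit together as follows, assuming the variates are formed as the product of the data matrix and the coefficient matrix, as in the definitions u[i] = <x , α[i]> and v[i] = <y , β[i]>. The sketch below uses simulated data and random placeholder coefficients rather than a fitted solution. All object names are hypothetical. The can.comp column labels follow the naming used in the examples.

```r
set.seed(42)
n <- 20; p <- 5; q <- 3
k <- min(p, q)  # number of canonical components

# Simulated observations of x and y (illustration only)
X_sim <- matrix(rnorm(n * p), n, p)
Y_sim <- matrix(rnorm(n * q), n, q)

# Placeholder coefficient matrices standing in for x.coefs (p x k) and y.coefs (q x k)
comp_names <- paste0("can.comp", 1:k)
x_coefs <- matrix(rnorm(p * k), p, k, dimnames = list(NULL, comp_names))
y_coefs <- matrix(rnorm(q * k), q, k, dimnames = list(NULL, comp_names))

# Variates: one column per component, one row per observation (n x k)
x_vars <- X_sim %*% x_coefs
y_vars <- Y_sim %*% y_coefs

dim(x_vars)
dim(y_vars)
```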
library(RCCA)  # assumption: the package providing RCCA() and the X, Y data sets is installed under this name
data(X)
data(Y)
# run RCCA with a penalty on the X side only
rcca <- RCCA(X, Y, lambda1 = 10, lambda2 = 0)
# check the modified canonical correlations
plot(1:rcca$n.comp, rcca$mod.cors, pch = 16, xlab = 'component', ylab = 'correlation', ylim = c(0, 1))
# check the canonical correlations
points(1:rcca$n.comp, rcca$cors, pch = 16, col = 'purple')
# compare them to cor(X %*% alpha, Y %*% beta)
points(1:rcca$n.comp, diag(cor(X %*% rcca$x.coefs, Y %*% rcca$y.coefs)), col = 'cyan', pch = 16, cex = 0.7)
# check the canonical coefficients for the first canonical variates
barplot(rcca$x.coefs[, 'can.comp1'], col = 'orange', xlab = 'X feature', ylab = 'value')
barplot(rcca$y.coefs[, 'can.comp1'], col = 'darkgreen', xlab = 'Y feature', ylab = 'value')