
Regularized Generalized Canonical Correlation Analysis (RGCCA) is a generalization
of regularized canonical correlation analysis to three or more sets of variables.
Consider *J* matrices *X_1, X_2, ..., X_J* that represent
*J* sets of variables observed on the same set of *n* individuals. The matrices
*X_1, X_2, ..., X_J* must have the same number of rows,
but may (and usually will) have different numbers of columns. The aim of RGCCA is to study
the relationships between these *J* blocks of variables. It constitutes a general
framework for many multi-block data analysis methods. It combines the power of
multi-block data analysis methods (maximization of well-identified criteria)
and the flexibility of PLS path modeling (the researcher decides which blocks
are connected and which are not). Hence, the use of RGCCA requires the construction
of a user-specified design matrix *C* that characterizes
the connections between blocks. The element *c_{jk}* of the symmetric design matrix *C = (c_{jk})*
is equal to 1 if block *j* and block *k* are connected, and 0 otherwise.
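
As a minimal sketch of how such a design matrix might be assembled in base R, the snippet below builds *C* for three blocks in which the first two blocks are each connected to the third. The block names and the `links` table are hypothetical and serve only as illustration; the resulting matrix matches the one typed out by hand in the first example of this page.

```
# Hypothetical block names, used only for illustration.
blocks <- c("agriculture", "industry", "politics")
J <- length(blocks)

# Start from "no connections" and switch on the wanted links.
C <- matrix(0, nrow = J, ncol = J, dimnames = list(blocks, blocks))
links <- rbind(c("agriculture", "politics"),
               c("industry", "politics"))
for (r in seq_len(nrow(links))) {
  C[links[r, 1], links[r, 2]] <- 1
  C[links[r, 2], links[r, 1]] <- 1  # mirror the entry so that C stays symmetric
}

# A valid design matrix here is symmetric and contains only 0s and 1s.
stopifnot(isSymmetric(C), all(C %in% c(0, 1)))
print(C)
```

Filling both `C[j, k]` and `C[k, j]` in the same step avoids accidentally passing an asymmetric matrix.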
The function rgcca() implements a monotonically convergent algorithm (i.e. the bounded
criterion to be maximized increases at each step of the iterative procedure) that is very
similar to the PLS algorithm proposed by Herman Wold and finds at convergence a stationary point
of the RGCCA optimization problem. Depending on the
dimensionality of each block *X_j*, *j = 1, ..., J*, with *p_j* the number of columns of *X_j*, the primal (when *n > p_j*) algorithm or
the dual (when *n < p_j*) algorithm is used (see Tenenhaus et al. 2015).
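
The primal/dual rule above can be illustrated with a small base R sketch. The helper `choose_primal_dual()` is hypothetical and is not part of the package; it only labels each block from its shape. The text does not say what happens when *n = p_j*, so treating that case as primal is an assumption made here.

```
# Hypothetical helper: label each block "primal" or "dual" from its shape.
# Assumption: the n == p_j case, which the text does not cover, is treated as primal.
choose_primal_dual <- function(blocks) {
  vapply(blocks,
         function(X) if (nrow(X) >= ncol(X)) "primal" else "dual",
         character(1))
}

set.seed(1)
n <- 20
blocks <- list(small = matrix(rnorm(n * 3), n, 3),    # n > p_j
               wide  = matrix(rnorm(n * 50), n, 50))  # n < p_j
choose_primal_dual(blocks)
```

The choice is made block by block, so a single analysis can mix both formulations.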
Moreover, through a deflation strategy, rgcca() can compute several RGCCA block
components (specified by ncomp) for each block. Within each block, the deflation procedure
guarantees that block components are orthogonal. This implementation uses the so-called
symmetric deflation, i.e. each block is deflated with respect to its own component(s).
Note that the number of components can differ from one block to another.
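
To see why deflating a block with respect to its own component yields orthogonal components, consider the following base R sketch. It is an illustration under stated assumptions, not the package's code: the weight vector is taken from an SVD as a stand-in (rgcca() would obtain it from the RGCCA criterion), and `first_component()` is a hypothetical helper.

```
set.seed(42)
X <- scale(matrix(rnorm(30 * 5), 30, 5), center = TRUE, scale = FALSE)

# Stand-in for a block weight vector: the first right singular vector.
# (rgcca() would obtain its weights from the RGCCA criterion instead.)
first_component <- function(X) drop(X %*% svd(X)$v[, 1])

y1 <- first_component(X)

# Deflate the block with respect to its own component:
# remove from every column of X its projection onto y1.
X_deflated <- X - tcrossprod(y1) %*% X / sum(y1^2)

# Any component built from the deflated block lies in its column space,
# which is orthogonal to y1.
y2 <- first_component(X_deflated)

stopifnot(abs(sum(y1 * y2)) < 1e-8)
```

Because every column of the deflated block is orthogonal to the first component, the orthogonality holds whatever weights are used for the second component.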


`A` |
A list that contains the |

`C` |
A design matrix that describes the relationships between blocks (default: complete design). |

`tau` |
tau is either a |

`ncomp` |
A |

`scheme` |
The value is "horst", "factorial", "centroid" or any differentiable convex scheme function g designed by the user (default: "centroid"). |

`scale` |
If scale = TRUE, each block is standardized to zero means and unit variances and then divided by the square root of its number of variables (default: TRUE). |

`init` |
The mode of initialization to use in the RGCCA algorithm. The alternatives are either Singular Value Decomposition ("svd") or random ("random") (default: "svd"). |

`bias` |
A logical value for a biased or unbiased estimator of the variance/covariance (default: bias = TRUE). |

`tol` |
The stopping value for convergence. |

`verbose` |
If verbose = TRUE, progress is reported during computation (default: TRUE). |
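
The preprocessing described for `scale = TRUE` above can be sketched in base R as follows. The helper `scale_block()` is hypothetical and only mirrors the wording of the argument description. It also assumes that `bias = TRUE` means the variance is computed with divisor *n* rather than *n - 1*.

```
# Hypothetical helper mirroring the description of scale = TRUE.
# Assumption: bias = TRUE means the variance uses divisor n rather than n - 1.
scale_block <- function(X, bias = TRUE) {
  n <- nrow(X)
  centered <- sweep(X, 2, colMeans(X))         # zero column means
  divisor <- if (bias) n else n - 1
  sds <- sqrt(colSums(centered^2) / divisor)
  standardized <- sweep(centered, 2, sds, "/")  # unit column variances
  standardized / sqrt(ncol(X))                  # divide by sqrt(number of variables)
}

set.seed(7)
X <- matrix(rnorm(40 * 4, mean = 10, sd = 3), 40, 4)
Xs <- scale_block(X)

# Each column now has mean 0, and the block's total (biased) variance is 1.
stopifnot(all(abs(colMeans(Xs)) < 1e-12))
stopifnot(abs(sum(colSums(Xs^2) / nrow(Xs)) - 1) < 1e-12)
```

Dividing by the square root of the number of variables gives every block the same total variance, so that blocks with many columns do not dominate the analysis.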

`Y` |
A list of |

`a` |
A list of |

`astar` |
A list of |

`C` |
A design matrix that describes the relation between blocks (user specified). |

`tau` |
A vector or matrix that contains the values of the shrinkage parameters applied to each block and each dimension (user specified). |

`scheme` |
The scheme chosen by the user (user specified). |

`ncomp` |
A |

`crit` |
A vector that contains the values of the criteria across iterations. |

`primal_dual` |
A |

`AVE` |
Indicators of model quality based on the Average Variance Explained (AVE): AVE (for one block), AVE (outer model), AVE (inner model). |
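
Since `crit` records the criterion across iterations, it offers a quick check of the monotone convergence mentioned in the description. The sketch below uses a made-up criterion trace, not output from a real fit, and the helper `is_monotone()` is hypothetical. It also assumes that `crit` is a plain numeric vector, as described above.

```
# Hypothetical criterion trace, standing in for the `crit` element of a fit.
crit <- c(0.812, 0.934, 0.951, 0.9532, 0.95321)

# TRUE when the criterion never decreases by more than a small tolerance.
is_monotone <- function(crit, tol = 1e-12) all(diff(crit) >= -tol)

# Iteration-to-iteration gains; they shrink as the algorithm converges.
gains <- diff(crit)

stopifnot(is_monotone(crit))
```

With a real fit, the same check could be applied to the `crit` element of the object returned by rgcca().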

Tenenhaus M., Tenenhaus A. and Groenen PJF (2017), Regularized generalized canonical correlation analysis: A framework for sequential multiblock component methods, Psychometrika, in press

Tenenhaus A., Philippe C., & Frouin V. (2015). Kernel Generalized Canonical Correlation Analysis. Computational Statistics and Data Analysis, 90, 114-131.

Tenenhaus A. and Tenenhaus M., (2011), Regularized Generalized Canonical Correlation Analysis, Psychometrika, Vol. 76, Nr 2, pp 257-284.

Schafer J. and Strimmer K., (2005), A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Statist. Appl. Genet. Mol. Biol. 4:32.

```
#############
# Example 1 #
#############
library(RGCCA)
data(Russett)
X_agric = as.matrix(Russett[, c("gini", "farm", "rent")])
X_ind = as.matrix(Russett[, c("gnpr", "labo")])
X_polit = as.matrix(Russett[, c("demostab", "dictator")])
A = list(X_agric, X_ind, X_polit)
# Define the design matrix (output = C)
C = matrix(c(0, 0, 1, 0, 0, 1, 1, 1, 0), 3, 3)
result.rgcca = rgcca(A, C, tau = c(1, 1, 1), scheme = "factorial", scale = TRUE)
# Label colors: index of the largest value among columns 9 to 11 of Russett
lab = as.vector(apply(Russett[, 9:11], 1, which.max))
plot(result.rgcca$Y[[1]], result.rgcca$Y[[2]], col = "white",
     xlab = "Y1 (Agric. inequality)", ylab = "Y2 (Industrial Development)")
text(result.rgcca$Y[[1]], result.rgcca$Y[[2]], rownames(Russett), col = lab, cex = .7)
#############
# Example 2 #
#############
data(Russett)
X_agric = as.matrix(Russett[, c("gini", "farm", "rent")])
X_ind = as.matrix(Russett[, c("gnpr", "labo")])
X_polit = as.matrix(Russett[, c("inst", "ecks", "death",
                                "demostab", "dictator")])
A = list(X_agric, X_ind, X_polit, cbind(X_agric, X_ind, X_polit))
# Define the design matrix (output = C)
C = matrix(c(0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0), 4, 4)
result.rgcca = rgcca(A, C, tau = c(1, 1, 1, 0), ncomp = rep(2, 4),
                     scheme = function(x) x^4, scale = TRUE) # HPCA
lab = as.vector(apply(Russett[, 9:11], 1, which.max))
plot(result.rgcca$Y[[4]][, 1], result.rgcca$Y[[4]][, 2], col = "white",
     xlab = "Global Component 1", ylab = "Global Component 2")
text(result.rgcca$Y[[4]][, 1], result.rgcca$Y[[4]][, 2], rownames(Russett),
     col = lab, cex = .7)
## Not run:
######################################
# example 3: RGCCA and leave one out #
######################################
Ytest = matrix(0, 47, 3)
X_agric = as.matrix(Russett[, c("gini", "farm", "rent")])
X_ind = as.matrix(Russett[, c("gnpr", "labo")])
X_polit = as.matrix(Russett[, c("demostab", "dictator")])
A = list(X_agric, X_ind, X_polit)
# Define the design matrix (output = C)
C = matrix(c(0, 0, 1, 0, 0, 1, 1, 1, 0), 3, 3)
result.rgcca = rgcca(A, C, tau = rep(1, 3), ncomp = rep(1, 3),
                     scheme = "factorial", verbose = TRUE)
for (i in 1:nrow(Russett)) {
  # Fit on all individuals except i, scaling the training blocks by hand.
  B = lapply(A, function(x) x[-i, ])
  B = lapply(B, scale2)
  resB = rgcca(B, C, tau = rep(1, 3), scheme = "factorial",
               scale = FALSE, verbose = FALSE)
  Btest = lapply(A, function(x) x[i, ])
  for (k in 1:length(B)) {
    # Resolve potential sign conflicts among components within the loo loop.
    if (cor(result.rgcca$a[[k]], resB$a[[k]]) < 0)
      resB$a[[k]] = -resB$a[[k]]
    # Apply the training-set centering and scaling to the left-out row,
    # then project it onto the block weight vector.
    Btest[[k]] = (Btest[[k]] - attr(B[[k]], "scaled:center")) /
      attr(B[[k]], "scaled:scale") / sqrt(NCOL(B[[k]]))
    Ytest[i, k] = Btest[[k]] %*% resB$a[[k]]
  }
}
lab = apply(Russett[, 9:11], 1, which.max)
plot(result.rgcca$Y[[1]], result.rgcca$Y[[2]], col = "white",
     xlab = "Y1 (Agric. inequality)", ylab = "Y2 (Ind. Development)")
text(result.rgcca$Y[[1]], result.rgcca$Y[[2]], rownames(Russett),
     col = lab, cex = .7)
# Overlay the leave-one-out predictions, labeled by the first letter of each name.
text(Ytest[, 1], Ytest[, 2], substr(rownames(Russett), 1, 1),
     col = lab, cex = .7)
## End(Not run)
```
