In isglobal-brge/GCCA: Generalized Canonical Correlation Analysis having missing individuals

knitr::opts_chunk$set(message = FALSE, warning = FALSE,
                      cache=TRUE, fig.width = 5.5, fig.height = 5.5)

Introduction

This manual describes how to perform a Regularized Generalized Canonical Correlation Analysis and then how to carry out the same analysis with the Canonical Correlation Analysis method for missing individuals implemented in the mgcca function of this package.

Getting started

The mgcca package depends on the BigDataStatMeth package, so BigDataStatMeth must be installed first:

devtools::install_github("isglobal-brge/BigDataStatMeth")

The development version of the mgcca package can then be installed from the BRGE GitHub repository:

devtools::install_github("isglobal-brge/mgcca")

The package depends on other R/Bioconductor packages that can be installed using standard functions (install.packages and/or BiocManager::install). The package can then be loaded into R as usual:

library(mgcca)

This vignette uses data and functions from other packages, such as RGCCA from CRAN and omicade4 from Bioconductor, which can be installed with the following commands:

# Install RGCCA and CCA (CCA provides the nutrimouse data used later on)
install.packages(c("RGCCA", "CCA"))

# Install omicade4
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("omicade4")

Canonical correlation

library(RGCCA)
data(Russett, package="RGCCA")
Xagric <- as.matrix(Russett[,c("gini","farm","rent")])
Xind <- as.matrix(Russett[,c("gnpr","labo")])
Xpolit <- as.matrix(Russett[ , c("inst", "ecks", "death")])
X <- list(Xagric=Xagric, Xind=Xind, Xpolit=Xpolit)

labs <- rownames(Russett)
group <- as.vector(apply(Russett[, c("demostab", "demoinst", "dictator")], 
                       1, which.max))
names(group) <- labs
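
The group vector above is built by taking, for each country, the position of the largest value among the three regime columns. The toy example below is a minimal sketch of that step using base R only. It assumes the three columns are 0/1 indicators, and the object names toy and toy.group are hypothetical, introduced only for illustration.

```r
# Toy data with the same three column names as in Russett.
# The values and row names are made up for illustration.
toy <- data.frame(demostab = c(1, 0, 0),
                  demoinst = c(0, 1, 0),
                  dictator = c(0, 0, 1),
                  row.names = c("A", "B", "C"))

# For each row, which.max() returns the index of the column holding
# the largest value; as.vector() drops the names added by apply().
toy.group <- as.vector(apply(toy, 1, which.max))
names(toy.group) <- rownames(toy)
toy.group
```

Each individual ends up with an integer code between 1 and 3, which is what the plotting calls later use as a colour index.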

Russett.sel <- cbind(Xagric, Xind, Xpolit)
X.rgcca <- X
X.rgcca[[4]] <- Russett.sel
C <- matrix(c(0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0), 4, 4)
res.rgcca <- rgcca(X.rgcca, C, tau = c(0, 0, 0, 0), ncomp = rep(2, 4),
                   scheme = "factorial", scale = TRUE, verbose=FALSE)
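
The design matrix C passed to rgcca is easier to read with block names attached. The sketch below rebuilds the same 4 x 4 matrix with base R only; the labels (in particular "Superblock" for the fourth block, which holds all variables) and the name C.named are illustrative assumptions, not objects defined by the package.

```r
# Same entries as the matrix C used in the rgcca() call.
# The block labels are illustrative names chosen for readability.
blocks <- c("Xagric", "Xind", "Xpolit", "Superblock")
C.named <- matrix(c(0, 0, 0, 1,
                    0, 0, 0, 1,
                    0, 0, 0, 1,
                    1, 1, 1, 0),
                  nrow = 4, ncol = 4,
                  dimnames = list(blocks, blocks))
print(C.named)

# The matrix is symmetric with a zero diagonal: no block is linked to itself.
isSymmetric(C.named)

# Number of connections per block: each of the three original blocks is
# linked only to the fourth block, which is linked to all three.
rowSums(C.named)
```

Read this way, the three thematic blocks are not connected to each other directly, only through the concatenated block.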

plot(res.rgcca$Y[[4]][, 1], res.rgcca$Y[[4]][, 2], type = "n",
     xlab = "Global Component 1", ylab = "Global Component 2")
text(res.rgcca$Y[[4]][, 1], res.rgcca$Y[[4]][, 2], labs, col = group)
grid()
abline(h=0)
abline(v=0)

corr <- cor(Russett.sel, res.rgcca$Y[[4]])

plot(corr[,1], corr[, 2], type = "n",
     xlab = "Global Component 1", ylab = "Global Component 2")
text(corr[, 1], corr[, 2], rownames(corr))
grid()
abline(h=0)
abline(v=0)

res.mgcca <- mgcca(X)

plotInds(res.mgcca, group=as.factor(group), print.labels = TRUE, 
        pos.leg="topleft")
plotVars(res.mgcca, nlab=5)

Regularized canonical correlation: the `omic` case

The nutrimouse dataset comes from a nutrition study in mice. It was provided by Pascal Martin from the Toxicology and Pharmacology Laboratory (French National Institute for Agronomic Research) and is available in the R package CCA. It contains two tables measured on 40 mice, together with a grouping variable that distinguishes wild-type and PPARalpha -/- genotypes. The data can be loaded into R with:

library(CCA)
data(nutrimouse, package = "CCA")
X <- as.matrix(nutrimouse$gene)
Y <- as.matrix(nutrimouse$lipid)
group <- nutrimouse$genotype

Table X contains the expression of 120 genes potentially involved in nutritional problems, while table Y contains the concentrations of 21 hepatic fatty acids.

dim(X)
dim(Y)
table(group)

X.s <- scale(X)
Y.s <- scale(Y)
lambda.est <- estim.regul(X.s, Y.s, plt=FALSE)
# estim.regul() returns a list, so extract the two estimates by name
res.rcca <- rcc(X.s, Y.s, lambda.est$lambda1, lambda.est$lambda2)
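
Both tables are standardized with scale() before the regularization parameters are estimated. As a reminder of what that step does, the toy example below (base R only, simulated data, hypothetical object names toy and toy.s) checks that each column ends up with mean 0 and standard deviation 1.

```r
# Simulated two-column matrix with a non-zero mean and non-unit variance.
set.seed(1)
toy <- matrix(rnorm(20, mean = 5, sd = 3), nrow = 10, ncol = 2)

# scale() centers each column and divides it by its standard deviation.
toy.s <- scale(toy)

# Column means are zero up to floating-point error ...
round(colMeans(toy.s), 10)

# ... and column standard deviations are one.
apply(toy.s, 2, sd)
```

Putting all variables on a common scale keeps variables measured in large units from dominating the analysis.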

plt.indiv(res.rcca, 1, 2, ind.names = group)

XX <- list(genes=X.s, lipids=Y.s)
ll <- c(lambda.est$lambda1, lambda.est$lambda2)
res.mgcca <- mgcca(XX, method = "penalized", lambda=c(0.001, 0.1))

plotInds(res.mgcca, group=group)

X.rgcca <- list(X, Y, cbind(X,Y))
C <- matrix(c(0, 0,  1, 0, 0, 1, 1, 1, 0), 3, 3)
res.rgcca <- rgcca(X.rgcca, C, tau = c(0, 0, 0), ncomp = rep(2, 3),
                   scheme = "factorial", scale = TRUE, verbose=FALSE)

plot(res.rgcca$Y[[3]][, 1], res.rgcca$Y[[3]][, 2], type = "n",
     xlab = "Global Component 1", ylab = "Global Component 2")
text(res.rgcca$Y[[3]][, 1], res.rgcca$Y[[3]][, 2], group, col = as.numeric(group))
grid()
abline(h=0)
abline(v=0)

plotVars(res.mgcca, nlab=5)

getSignif(res.mgcca, pval=1e-3, df="genes")

var.signif <- getSignif(res.mgcca, pval=1e-3, df="genes")$variable
plotVars(res.mgcca, var=var.signif, var.col = rep("red", length(var.signif)), 
         var.lab=TRUE, df="genes")

library(omicade4)
res.mcia <- mcia(list(genes=t(X.s), lipid=t(Y.s)))
plotVar(res.mcia, nlab=5)

plot(res.mcia, phenovec=group, sample.lab=FALSE, df.color=c("blue", "darkgreen"))

Generalized Canonical Correlation with missing individuals

sessionInfo()

isglobal-brge/GCCA documentation built on Feb. 19, 2022, 9:21 a.m.
