runCompareIcaSets: runCompareIcaSets
In bitona/MineICA: Analysis of an ICA decomposition obtained on genomics data

runCompareIcaSets

R Documentation

runCompareIcaSets

Description

This function encompasses the comparison of several IcaSet objects using correlations and the plot of the corresponding correlation graph. The IcaSet objects are compared by calculating the correlation between either projection values of common features or genes, or contributions of common samples.

Usage

  runCompareIcaSets(icaSets, labAn,
    type.corr = c("pearson", "spearman"), cutoff_zval = 0,
    level = c("genes", "features", "samples"),
    fileNodeDescr = NULL, fileDataGraph = NULL,
    plot = TRUE, title = "", col, cutoff_graph = NULL,
    useMax = TRUE, tkplot = FALSE)

Arguments

`icaSets`	List of `IcaSet` objects, e.g results of ICA decompositions obtained on several datasets.
`labAn`	Vector of names for each icaSet, e.g the the names of the datasets on which were calculated the decompositions.
`type.corr`	Type of correlation to compute, either `'pearson'` or `'spearman'`.
`cutoff_zval`	Either NULL or 0 (default) if all genes are used to compute the correlation between the components, or a threshold to compute the correlation using the genes that have at least a scaled projection higher than cutoff_zval. Will be used only when `level` is one of `c("features","genes")`.
`level`	Data level of the `IcaSet` objects on which is applied the correlation. It must correspond to a data level shared by the IcaSet objects: `'samples'` if they were applied to common samples (correlations are computed between matrix `A`), `'features'` if they were applied to common features (correlations are computed between matrix `S`), `'genes'` if they share gene IDs after annotation into genes (correlations are computed between matrix `SByGene`).
`fileNodeDescr`	File where node descriptions are saved (useful when the user wants to visualize the graph using Cytoscape).
`fileDataGraph`	File where graph description is saved (useful when the user wants to visualize the graph using Cytoscape).
`plot`	if `TRUE` (default) plot the correlation graph
`title`	title of the graph
`col`	vector of colors indexed by elements of labAn; if missing, colors will be automatically attributed
`cutoff_graph`	the cutoff used to select pairs that will be included in the graph
`useMax`	if `TRUE`, the graph is restricted to edges that correspond to maximum correlation between components, see details
`tkplot`	If TRUE, performs interactive plot with function `tkplot`, else uses `plot.igraph`

Details

This function calls four functions: compareAn which computes the correlations, compareAn2graphfile which builds the graph, nodeAttrs which builds the node description data, and plotCorGraph which uses tkplot to plot the graph in an interactive device.

If the user wants to see the correlation graph in Cytoscape, he must fill the arguments fileDataGraph and fileNodeDescr, in order to import the graph and its node descriptions as a .txt file in Cytoscape.

When labAn is missing, each element i of icaSets is labeled as 'Ani'.

The user must carefully choose the data level used in the comparison: If level='samples', the correlations are based on the mixing matrices of the ICA decompositions (of dimension samples x components). 'A' will be typically chosen when the ICA decompositions were computed on the same dataset, or on datasets that include the same samples. If level='features' is chosen, the correlation is calculated between the source matrices (of dimension features x components) of the ICA decompositions. 'S' will be typically used when the ICA decompositions share common features (e.g same microarrays). If level='genes', the correlations are calculated on the attributes 'SByGene' which store the projections of the annotated features. 'SByGene' will be typically chosen when ICA were computed on datasets from different technologies, for which comparison is possible only after annotation into a common ID, like genes.

cutoff_zval is only used when level is one of c('features','genes'), in order to restrict the correlation to the contributing features or genes.

When cutoff_zval is specified, for each pair of components, genes or features that are included in the circle of center 0 and radius cutoff_zval are excluded from the computation of the correlation.

It must be taken into account by the user that if cutoff_zval is different from NULL or zero, the computation will be much slowler since each pair of component is treated individually.

Edges of the graph are built based on the correlation values between the components. Absolute values of correlations are used since components have no direction.

If useMax is TRUE each component will be linked to only one component of each other IcaSet that corresponds to the most correlated component among all components of the same IcaSet. If cutoff_graph is specified, only correlations exceeding this value are taken into account to build the graph. For example, if cutoff is 1, only relationships between components that correspond to a correlation value higher than 1 will be included. Absolute correlation values are used since the components have no direction.

The contents of the returned list are

dataGraph:: dataGraph data.frame that describes the correlation graph,
nodeAttrs:: nodeAttrs data.frame that describes the node of the graph
graph: graph the graph as an igraph-object,
graphid:: graphid the id of the graph plotted using tkplot.

Value

A list consisting of

dataGraph:: a data.frame defining the correlation graph
nodeAttrs:: a data.frame describing the node of the graph,
graph:: the graph as an object of class igraph,
graphid: the id of the graph plotted with tkplot

Author(s)

Anne Biton

Examples

dat1 <- data.frame(matrix(rnorm(10000),ncol=10,nrow=1000))
rownames(dat1) <- paste("g", 1:1000, sep="")
colnames(dat1) <- paste("s", 1:10, sep="")
dat2 <- data.frame(matrix(rnorm(10000),ncol=10,nrow=1000))
rownames(dat2) <- paste("g", 1:1000, sep="")
colnames(dat2) <- paste("s", 1:10, sep="")

## run ICA
resJade1 <- runICA(X=dat1, nbComp=3, method = "JADE")
resJade2 <- runICA(X=dat2, nbComp=3, method = "JADE")

## build params
params <- buildMineICAParams(resPath="toy/")

## build IcaSet objects
icaSettoy1 <- buildIcaSet(params=params, A=data.frame(resJade1$A), S=data.frame(resJade1$S),
                          dat=dat1, alreadyAnnot=TRUE)$icaSet
icaSettoy2 <- buildIcaSet(params=params, A=data.frame(resJade2$A), S=data.frame(resJade2$S),
                          dat=dat2, alreadyAnnot=TRUE)$icaSet

## compare IcaSet objects
## use tkplot=TRUE to get an interactive graph
rescomp <- runCompareIcaSets(icaSets=list(icaSettoy1, icaSettoy2), labAn=c("toy1","toy2"),
                             type.corr="pearson", level="genes", tkplot=FALSE)


## Not run: 
## load the microarray-based gene expression datasets
## of breast tumors
library(breastCancerMAINZ)
library(breastCancerVDX)
data(mainz)
data(vdx)

## Define a function used to build two examples of IcaSet objects
## and annotate the probe sets into gene Symbols
treat <- function(es, annot="hgu133a.db") {
   es <- selectFeatures_IQR(es,10000)
   exprs(es) <- t(apply(exprs(es),1,scale,scale=FALSE))
   colnames(exprs(es)) <- sampleNames(es)
   resJade <- runICA(X=exprs(es), nbComp=10, method = "JADE", maxit=10000)
   resBuild <- buildIcaSet(params=buildMineICAParams(), A=data.frame(resJade$A), S=data.frame(resJade$S),
                        dat=exprs(es), pData=pData(es), refSamples=character(0),
                        annotation=annot, typeID= typeIDmainz,
                        chipManu = "affymetrix", mart=mart)
   icaSet <- resBuild$icaSet
}
## Build the two IcaSet objects
icaSetMainz <- treat(mainz)
icaSetVdx <- treat(vdx)

## compare the IcaSets
runCompareIcaSets(icaSets=list(icaSetMainz, icaSetVdx), labAn=c("Mainz","Vdx"), type.corr="pearson", level="genes")

## End(Not run)

bitona/MineICA documentation built on April 23, 2023, 1:41 p.m.