generateTreeViewFiles: Wrapper function to perform functional Clustering Enrichment...

funcClustAnnot    R Documentation

Wrapper function to perform functional Clustering Enrichment Analysis (CLEAN)

Description

This function takes a gene expression data set, optional hierarchical clusterings of genes and samples, and a list of gene sets representing functional categories. If a clustering is not provided, it is computed by hierarchical clustering. The function then performs the Clustering Enrichment Analysis (CLEAN) and generates files for displaying the data, clusterings, and functional annotation in tools such as the Java-based fTreeView.

Usage

funcClustAnnot(data = NULL, rclust = NULL, cclust = NULL, funcCategories = NULL,
    fClustAnnotations = NULL, bkgList = NULL, file = "cluster", maxSize = 1000,
    minSize = 10, minGenesInCategory = 10, maxGenesInCategory = 1000,
    minSigsInCategory = 2, inBkg = TRUE, sigFDR = 0.1, addCLEANscore2cdt = TRUE,
    species = NULL, estimateNullDistribution = FALSE, atr.distance = "euc",
    gtr.distance = "euc", dec = ".", digits = 5, verbose = TRUE,
    saveDataObjects = FALSE, sampleDesc = NULL, callTreeView = FALSE)

generateTreeViewFiles(data = NULL, rclust = NULL, cclust = NULL,
    functionalCategories = NULL, fClustAnnotations = NULL, bkgList = NULL,
    file = "cluster", maxSize = 1000, minSize = 10, maxNumOfClusters = NULL,
    minGenesInCategory = 10, maxGenesInCategory = 1000, inBkg = TRUE,
    sigFDR = 0.1, addCLEANscore2cdt = TRUE, species = NULL,
    estimateNullDistribution = FALSE, dec = ".", verbose = TRUE,
    saveDataObjects = FALSE, sampleDesc = NULL, callTreeView = FALSE)

geneListEnrichment(geneList, allGenes, functionalCategories = NULL, species = NULL,
    minGenesInCategory = 10, maxGenesInCategory = 1000, inBkg = TRUE,
    sigFDR = 0.1, verbose = TRUE)

runCLEAN(data = NULL, rclust = NULL, functionalCategories = NULL, bkgList = NULL,
    file = "cluster", maxSize = 1000, minSize = 10, maxNumOfClusters = NULL,
    minGenesInCategory = 10, maxGenesInCategory = 1000, inBkg = TRUE,
    sigFDR = 0.1, species = NULL, estimateNullDistribution = FALSE,
    verbose = TRUE, saveDataObjects = FALSE)

Arguments

data

Gene expression data set.

geneList

A gene list to be tested for enrichment of functional categories.

rclust

Gene clustering. Can be a file name, a distance matrix, an hclust object, etc. If an hclust object is given, rclust$labels must be a numeric vector indexing the rows of the data parameter.

cclust

Sample clustering.

functionalCategories

A collection of gene sets, representing functional categories.

funcCategories

A collection of gene sets, representing functional categories.

fClustAnnotations

A list of Clustering Enrichment Analysis results. If provided, CLEAN is not performed which considerably reduces computing time.

allGenes

A list of genes to be used as background for gene list enrichment analysis.

bkgList

A list of genes to be used as background for CLEAN.

file

Name of the file(s) (without extensions) to be generated.

maxSize

Maximum cluster size to be considered by CLEAN.

minSize

Minimum cluster size to be considered by CLEAN.

maxNumOfClusters

Maximum number of clusters to be considered by CLEAN. If specified, the top 'maxNumOfClusters' clusters returned by the tree cutting function are chosen.

minGenesInCategory

Minimum number of all genes in the expression data set overlapping with a given functional category.

maxGenesInCategory

Maximum number of all genes in the expression data set overlapping with a given functional category.

minSigsInCategory

Minimum number of genes in a cluster overlapping with a functional category.

inBkg

– to be completed –

sigFDR

FDR cutoff for a functional category to be overrepresented in a cluster.

addCLEANscore2cdt

If TRUE, one or more columns are added to the cdt file indicating the highest significance level observed for a cluster by CLEAN.

species

Two-letter code for the species used to generate gene ontology categories (e.g. "Hs" for human, "Mm" for mouse). This parameter is used when funcCategories = "GO".

estimateNullDistribution

If TRUE, an empirical estimate of the null distribution of the CLEAN score is computed.

atr.distance

The distance metric used by the sample clustering (deprecated).

gtr.distance

The distance metric used by the gene clustering (deprecated).

dec

The decimal character to be used when writing the treeview files.

digits

Number of significant digits to be used when writing the treeview files (deprecated).

verbose

If TRUE, detailed progress output is generated.

saveDataObjects

If TRUE, data, clusterings, and CLEAN results are saved to disk in a file named <file>.RData.

sampleDesc

A vector or factor describing the samples. Samples with the same description are grouped together if cclust = NA.

callTreeView

If TRUE, the Java application fTreeView is called to display the CLEAN results.
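
The rclust argument requires that, for an hclust object, rclust$labels be a numeric vector indexing the rows of data. A minimal sketch of preparing such an object with base R follows. The matrix expr and its gene and sample names are made up for illustration, only dist and hclust from the stats package are used, and nothing here calls CLEAN itself.

```r
# Hypothetical expression matrix: 6 genes (rows) x 4 samples (columns).
set.seed(1)
expr <- matrix(rnorm(24), nrow = 6,
               dimnames = list(paste0("gene", 1:6), paste0("s", 1:4)))

# Cluster the genes; hclust() copies the row names into hc$labels by default.
hc <- hclust(dist(expr), method = "average")

# Replace the character labels with row indices, as the rclust
# description asks for when an hclust object is supplied.
hc$labels <- seq_len(nrow(expr))

# The labels now index the rows of the data directly.
ordered_genes <- rownames(expr)[hc$labels[hc$order]]
```

An object prepared this way could presumably be passed as rclust together with the matching data; whether other label types are also accepted is not stated in this page.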

Details

coming soon.
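
Until the details are documented, the roles of minGenesInCategory, maxGenesInCategory, and sigFDR can be illustrated with a generic over-representation test. The sketch below is an assumption-laden illustration, not CLEAN's internal algorithm: it uses a one-sided hypergeometric test and Benjamini-Hochberg adjustment from base R, and all gene names, categories, and variable names are hypothetical.

```r
# Hypothetical inputs: a background of 200 genes, one cluster, three categories.
background <- paste0("g", 1:200)
cluster    <- paste0("g", 1:20)
categories <- list(
  catA = paste0("g", 1:15),                  # concentrated in the cluster
  catB = paste0("g", seq(5, 200, by = 10)),  # spread over the background
  catC = paste0("g", 1:3)                    # too small, removed by the size filter
)
min_genes  <- 10    # plays the role of minGenesInCategory
max_genes  <- 1000  # plays the role of maxGenesInCategory
fdr_cutoff <- 0.1   # plays the role of sigFDR

# Keep categories whose overlap with the background is within the size limits.
sizes <- sapply(categories, function(g) length(intersect(g, background)))
kept  <- categories[sizes >= min_genes & sizes <= max_genes]

# One-sided hypergeometric test per category: P(overlap >= observed).
pvals <- sapply(kept, function(g) {
  g <- intersect(g, background)
  k <- length(intersect(g, cluster))
  phyper(k - 1, length(g), length(background) - length(g),
         length(cluster), lower.tail = FALSE)
})

# Benjamini-Hochberg adjustment, then apply the FDR cutoff.
fdr <- p.adjust(pvals, method = "BH")
significant <- names(fdr)[fdr <= fdr_cutoff]
```

In this toy setup catC is removed by the size filter and only catA passes the cutoff. The test statistic and score that CLEAN actually computes may differ.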

Value

A list:

data

A data frame containing the gene expression data.

rclust, cclust

Hierarchical clusterings (hclust objects) of genes and samples, respectively.

fClustAnnotations

CLEAN results.

Author(s)

Johannes Freudenberg, Xiangdong Liu, Mario Medvedovic.

References

Coming soon.

See Also

hclust, gimmR, GO, KEGG, r2cdt, call.treeview

Examples

data(gimmOut)
require(CLEAN.Rn)
res <- runCLEAN(gimmOut, species = "Rn")

generateTreeViewFiles(gimmOut, functionalCategories=getFunctionalCategories("geneRIFs", species = "Rn"))
#same as
generateTreeViewFiles(gimmOut, functionalCategories="geneRIFs", species = "Rn")

#multiple category types
generateTreeViewFiles(gimmOut, functionalCategories=c("geneRIFs", "CpGislands", "GO", "KEGG"), species = "Rn")

trt <- sapply(colnames(gimmOut$clustData)[-(1:2)], function(str) strsplit(str, split = "_")[[1]][1])
#not run
#generateTreeViewFiles(gimmOut, cclust = NA, verbose = FALSE, functionalCategories=c("geneRIFs",
#    "CpGislands", "GO", "KEGG"), species = "Rn", callTreeView = TRUE, sampleDesc = trt)
generateTreeViewFiles(gimmOut, cclust = NA, verbose = FALSE, functionalCategories=c("geneRIFs",
    "CpGislands", "GO", "KEGG"), species = "Rn", callTreeView = FALSE, sampleDesc = trt)

#geneList enrichment
geneList <- gimmOut$clustData[,1]
require(org.Rn.eg.db)
#ideally, use the list of genes represented on the microarray as background instead
allGenes <- unique(keys(org.Rn.egSYMBOL))
res <- geneListEnrichment(geneList, allGenes, functionalCategories = "GO",
    species = "Rn", sigFDR = 0.01, maxGenesInCategory = 10000)

#using primary geneset
data(cMap)
data(gimmOut)
#download.file("ftp://ftp.ncbi.nih.gov/pub/HomoloGene/current/homologene.data",destfile="homologene.data",mode="wb")
#pValues.rat <- convertGeneTable(cMap$cMapPvalues, fromSpecies="h", toSpecies="r")
#generateTreeViewFiles(gimmOut, functionalCategories=list(cMap=list(pValues.rat, cMap$cMapDescr)), species="Rn", bkgList=NULL)
#call.treeview()
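
The trt vector in the examples above is built by taking the part of each column name before the first underscore and is then passed as sampleDesc. A self-contained sketch of that step, using hypothetical column names of the form <treatment>_<replicate> in place of the gimmOut data:

```r
# Hypothetical sample names; the real ones come from colnames(gimmOut$clustData).
sample_names <- c("ctrl_1", "ctrl_2", "drugA_1", "drugA_2")

# Keep the text before the first underscore as the sample description.
trt <- vapply(strsplit(sample_names, "_", fixed = TRUE),
              function(parts) parts[1], character(1))
names(trt) <- sample_names
```

Samples sharing a description (here the two ctrl and the two drugA samples) are the ones that would be grouped together when cclust = NA.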

uc-bd2k/CLEAN documentation built on Sept. 22, 2022, 4:12 a.m.