Hypergeometric tests and Gene Set Enrichment Analyses over a list of gene set collections

Description

This function takes a list of gene set collections, a named phenotype vector (whose names define the gene universe), and a vector of hits (gene names only). It returns the results of hypergeometric tests and gene set enrichment analyses for every gene set collection, including p-values corrected for multiple hypothesis testing.

Usage

analyzeGeneSetCollections(listOfGeneSetCollections, geneList, hits, 
pAdjustMethod="BH", pValueCutoff=0.05, nPermutations=1000, 
minGeneSetSize=15, exponent=1, verbose=TRUE, doGSOA=TRUE, doGSEA=TRUE)

Arguments

listOfGeneSetCollections

a list of gene set collections (a 'gene set collection' is a list of gene sets). Even if only one collection is being tested, it must be passed as a one-element list, e.g. listOfGeneSetCollections = list(YourOneGeneSetCollection). If the elements of listOfGeneSetCollections are named, those names are attached to the corresponding data frames in the output, so meaningful names are advised.
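
As a rough illustration of the expected shape, the sketch below builds toy collections in base R. It assumes each gene set is stored as a character vector of EntrezIds; all object names and IDs are invented placeholders, not real annotation data.

```r
# Hypothetical toy collections; the gene IDs are made-up placeholders
collection_a <- list(setA1 = c("101", "102", "103"), setA2 = c("102", "104"))
collection_b <- list(setB1 = c("101", "105"))

# Several collections: a named list with one element per collection
list_of_gscs <- list(CollectionA = collection_a, CollectionB = collection_b)

# A single collection still has to be wrapped in a one-element list
single <- list(OnlyOne = collection_a)
```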

geneList

a numeric or integer vector of phenotypes, sorted in descending or ascending order, with elements named by their EntrezIds (no duplicates or NA values)

hits

a character vector of the EntrezIds of hits, as determined by the user
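
A minimal sketch of preparing these two inputs with base R follows. The scores and IDs are invented, and the threshold of 2 on the absolute phenotype simply mirrors the choice made in the Examples section; how hits are defined is up to the user.

```r
# Hypothetical phenotype scores named by made-up EntrezId-like strings
scores <- c("101" = 0.4, "102" = -2.7, "103" = 3.1, "104" = 1.2, "105" = -0.3)

# Check the stated constraints: no NA values and no duplicated names
stopifnot(!anyNA(scores), anyDuplicated(names(scores)) == 0)

# Order the phenotype vector (descending here; names are kept by sort)
gene_list <- sort(scores, decreasing = TRUE)

# Hits as a character vector of IDs, chosen by an illustrative threshold
hits <- names(gene_list)[abs(gene_list) > 2]
```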

pAdjustMethod

a single character value specifying the p-value adjustment method to be used (see 'p.adjust' for details)
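
To show what the adjustment does, here is a small example using base R's p.adjust on made-up p-values. It illustrates the correction methods only and is not taken from the package's internals.

```r
# Made-up raw p-values for four gene sets
raw_p <- c(0.001, 0.01, 0.03, 0.5)

# Benjamini-Hochberg (the default "BH") and Bonferroni adjustments
adj_bh <- p.adjust(raw_p, method = "BH")
adj_bonf <- p.adjust(raw_p, method = "bonferroni")

# Gene sets that would pass a cutoff of 0.05 after BH adjustment
significant <- adj_bh < 0.05
```

The available method names are listed in p.adjust.methods.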

pValueCutoff

a single numeric value specifying the cutoff for p-values considered significant

nPermutations

a single integer or numeric value specifying the number of permutations for deriving p-values in GSEA
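
The sketch below gives a feel for how a permutation p-value depends on the number of permutations. It uses a generic GSEA-style running score on simulated data with a one-sided comparison; the helper es and all data are hypothetical, and the package's own permutation scheme and p-value formula may differ.

```r
set.seed(1)

# Simulated ranked phenotype vector with hypothetical gene names
gene_list <- sort(setNames(rnorm(50), paste0("g", 1:50)), decreasing = TRUE)
gene_set <- names(gene_list)[1:5]

# Generic running enrichment score (weight exponent fixed at 1 here)
es <- function(stats, in_set) {
  hit <- abs(stats) * in_set
  running <- cumsum(hit / sum(hit) - (!in_set) / sum(!in_set))
  unname(running[which.max(abs(running))])
}

n_perm <- 1000
in_set <- names(gene_list) %in% gene_set
observed <- es(gene_list, in_set)

# Null distribution: shuffle set membership across the ranked genes
permuted <- replicate(n_perm, es(gene_list, sample(in_set)))

# One-sided empirical p-value with a +1 correction (an assumed convention)
p_value <- (sum(permuted >= observed) + 1) / (n_perm + 1)
```

In this sketch the smallest attainable p-value is 1 / (n_perm + 1), so more permutations give finer resolution at the cost of run time.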

minGeneSetSize

a single integer or numeric value specifying the minimum number of elements in a gene set that must map to elements of the gene universe. Gene sets with fewer than this number are removed from both hypergeometric analysis and GSEA.
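
The filter described above can be sketched in base R as follows. The universe, gene sets, and threshold are toy values, and this is an illustration of the rule rather than the package's actual code.

```r
# Toy gene universe (in practice, the names of geneList)
universe <- paste0("g", 1:10)

# Toy collection; "x1".."x3" are deliberately outside the universe
collection <- list(
  big = paste0("g", 1:6),
  small = c("g1", "g2"),
  off_universe = c("g9", "g10", "x1", "x2", "x3")
)

min_size <- 3

# Count only the members of each gene set that map to the universe
mapped_size <- vapply(collection, function(gs) sum(gs %in% universe), integer(1))

# Keep gene sets whose mapped size reaches the minimum
kept <- collection[mapped_size >= min_size]
```

Note that off_universe has five members but only two map to the universe, so it is dropped along with small.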

exponent

a single integer or numeric value used in weighting phenotypes in GSEA (see the function gseaScores)
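
To illustrate the role of the exponent, the sketch below implements the commonly described weighted running-sum statistic of GSEA, in which each hit contributes in proportion to the absolute phenotype raised to the exponent. The function running_es and the toy data are hypothetical; gseaScores may differ in its details.

```r
# Toy ranked phenotype vector with hypothetical gene names
gene_list <- c(g1 = 3, g2 = 2, g3 = 1, g4 = -1, g5 = -2)
gene_set <- c("g1", "g4")

running_es <- function(gene_list, gene_set, exponent = 1) {
  in_set <- names(gene_list) %in% gene_set
  # Hits step up by |phenotype|^exponent, normalised to sum to 1
  hit_w <- abs(gene_list)^exponent * in_set
  step_hit <- hit_w / sum(hit_w)
  # Misses step down by an equal amount each, also summing to 1
  step_miss <- (!in_set) / sum(!in_set)
  running <- cumsum(step_hit - step_miss)
  # Enrichment score: the running sum's largest deviation from zero
  unname(running[which.max(abs(running))])
}

es_weighted <- running_es(gene_list, gene_set, exponent = 1)
es_unweighted <- running_es(gene_list, gene_set, exponent = 0)
```

With exponent = 0 every hit counts equally, whereas with exponent = 1 the strongly scored gene g1 dominates the score.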

verbose

a single logical value specifying whether to display detailed messages (verbose=TRUE) or not (verbose=FALSE)

doGSOA

a single logical value specifying whether to perform gene set overrepresentation analysis (doGSOA=TRUE) or not (doGSOA=FALSE)
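
For orientation, an overrepresentation p-value of the hypergeometric kind can be computed in base R with phyper. The sketch below uses invented gene names and counts, and it shows the general test rather than the package's exact implementation.

```r
# Toy universe, gene set, and hits (all names are hypothetical)
universe <- paste0("g", 1:100)
gene_set <- paste0("g", 1:20)
hits <- paste0("g", c(1:8, 51:52))

N <- length(universe)              # size of the gene universe
K <- sum(gene_set %in% universe)   # gene set members in the universe
n <- sum(hits %in% universe)       # number of hits
k <- sum(hits %in% gene_set)       # hits that fall in the gene set

# P(X >= k): probability of an overlap at least as large as observed
p_over <- phyper(k - 1, K, N - K, n, lower.tail = FALSE)
```

The k - 1 is needed because phyper with lower.tail = FALSE returns P(X > q), not P(X >= q).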

doGSEA

a single logical value specifying whether to perform gene set enrichment analysis (doGSEA=TRUE) or not (doGSEA=FALSE)

Details

All gene names must be EntrezIds in 'listOfGeneSetCollections', 'geneList', and 'hits'.

Value

HyperGeo.results

a list of data frames containing the hypergeometric test results, one data frame per gene set collection in the input.

GSEA.results

a similar list of data frames containing the results from GSEA. As an example, to access the GSEA results for a gene set collection named "MyGeneSetCollection", one would enter: output$GSEA.results$MyGeneSetCollection

Sig.pvals.in.both

a list of data frames containing the gene sets with p-values considered significant in both the hypergeometric test and GSEA, before p-value correction. Each element of the list contains the results for one gene set collection.

Sig.adj.pvals.in.both

a list of data frames containing the gene sets with p-values considered significant in both the hypergeometric test and GSEA, after p-value correction. Each element of the list contains the results for one gene set collection.
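
The idea behind the two "in both" elements can be sketched with toy result tables. The column names Pvalue and Adjusted.Pvalue and all numbers below are assumptions made for illustration; check the actual output for the real column layout.

```r
# Hypothetical result tables for one collection; values are invented
hypergeo <- data.frame(
  Pvalue = c(0.001, 0.20, 0.03),
  Adjusted.Pvalue = c(0.003, 0.20, 0.045),
  row.names = c("setA", "setB", "setC")
)
gsea <- data.frame(
  Pvalue = c(0.004, 0.01, 0.04),
  Adjusted.Pvalue = c(0.012, 0.015, 0.06),
  row.names = c("setA", "setB", "setC")
)

cutoff <- 0.05

# Gene sets significant in both analyses before correction
sig_raw <- intersect(rownames(hypergeo)[hypergeo$Pvalue < cutoff],
                     rownames(gsea)[gsea$Pvalue < cutoff])

# Gene sets significant in both analyses after correction
sig_adj <- intersect(rownames(hypergeo)[hypergeo$Adjusted.Pvalue < cutoff],
                     rownames(gsea)[gsea$Adjusted.Pvalue < cutoff])
```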

Author(s)

John C. Rose, Xin Wang

See Also

analyze

Examples

## Not run: 
library(org.Dm.eg.db)
library(GO.db)
library(KEGG.db)
## Load the phenotype vector (see the vignette for details about the
## preprocessing of this data set)
data("KcViab_Data4Enrich")
## Create a list of gene set collections for Drosophila melanogaster (Dm)
GO_MF <- GOGeneSets(species="Dm", ontologies="MF")
PW_KEGG <- KeggGeneSets(species="Dm")
ListGSC <- list(GO_MF=GO_MF, PW_KEGG=PW_KEGG)
## Conduct enrichment analyses
GSCAResults <- analyzeGeneSetCollections(
    listOfGeneSetCollections=ListGSC,
    geneList=KcViab_Data4Enrich,
    hits=names(KcViab_Data4Enrich)[which(abs(KcViab_Data4Enrich)>2)],
    pAdjustMethod="BH",
    nPermutations=1000,
    minGeneSetSize=200,
    exponent=1,
    verbose=TRUE
)

## End(Not run)
