Hypergeometric probability of gene set enrichment

Description

A method for computing enrichment probabilities based on the hypergeometric distribution. This method performs an over-representation analysis by generating 2x2 incidence matrices for the gene sets provided as 'query' and 'sets', which can be GeneSet, SignedGeneSet, GeneSetCollection or CMAPCollection objects. If 'sets' is an NChannelSet object with quantitative data, gene sets are induced on the fly from the channel specified by the 'element' parameter.
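The calculation described above can be illustrated for a single query/target pair with base R alone. This is a minimal sketch, not the package's implementation: the gene identifiers and set sizes are invented, and a one-sided test for over-representation is assumed.

```r
# Hypothetical toy data: identifiers and set sizes are made up for illustration.
universe   <- paste0("gene", 1:100)
query_set  <- paste0("gene", 1:20)
target_set <- paste0("gene", 11:40)

in_query  <- universe %in% query_set
in_target <- universe %in% target_set

# 2x2 incidence table: membership in the query set vs. the target set
incidence <- table(
  query  = factor(in_query,  levels = c(TRUE, FALSE)),
  target = factor(in_target, levels = c(TRUE, FALSE))
)

# Number of genes shared by both sets
overlap <- sum(in_query & in_target)

# Upper tail of the hypergeometric distribution: P(X >= overlap)
p_hyper <- phyper(overlap - 1, sum(in_target), sum(!in_target),
                  sum(in_query), lower.tail = FALSE)

# One-sided Fisher's exact test on the same table (assumed alternative)
p_fisher <- fisher.test(incidence, alternative = "greater")$p.value
```

Under these assumptions both calls return the same probability, which is why fisher.test is a useful cross-check for a single pair of sets.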

Arguments

query

A CMAPCollection, GeneSet, SignedGeneSet or GeneSetCollection object containing the 'query' gene sets to compare against the 'sets'.

sets

A CMAPCollection, GeneSetCollection, GeneSet or NChannelSet object containing the gene sets the 'query' is compared against.

universe

A character vector of gene ids for all genes that could potentially be of interest, e.g. all genes represented on a microarray, all annotated genes, etc.

keep.scores

Logical: store the identifiers of the genes detected in 'query' and 'sets'? (Default: FALSE) The size of the generated CMAPResults object increases with the number of contained gene sets. For very large collections, setting this parameter to 'TRUE' may require large amounts of memory.

element

A character string corresponding to the assayDataElementName of the NChannelSet object to be thresholded on the fly with induceCMAPCollection.

lower

The lower threshold passed to induceCMAPCollection.

higher

The higher threshold passed to induceCMAPCollection.

min.set.size

Minimum number of genes a gene set induced by induceCMAPCollection must contain to be included in the analysis (Default: 5).

...

Additional arguments passed to downstream methods.
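How 'lower', 'higher' and 'min.set.size' interact when gene sets are induced from quantitative data can be sketched in base R. This is an illustration only and not the induceCMAPCollection code: the score matrix is invented, the strict comparisons are an assumption, and the variable names are hypothetical.

```r
# Hypothetical z-score matrix: 6 genes x 2 experiments (values invented)
z <- matrix(
  c( 2.5, -3.1,  0.2, 1.0, -2.2,  4.0,
     0.1,  2.3, -0.4, 0.0,  1.9, -1.0),
  nrow = 6,
  dimnames = list(paste0("gene", 1:6), c("expA", "expB"))
)

lower        <- -2
higher       <-  2
min_set_size <-  2  # the documented default is 5; lowered to suit the toy data

# For each experiment, keep genes whose score lies beyond either threshold
# (strict inequalities are an assumption of this sketch)
induced <- lapply(colnames(z), function(experiment) {
  scores <- z[, experiment]
  rownames(z)[scores < lower | scores > higher]
})
names(induced) <- colnames(z)

# Drop induced sets with fewer than min_set_size members
induced <- induced[lengths(induced) >= min_set_size]
```

In this toy example only the set induced from "expA" survives the size filter, because "expB" contains a single gene beyond the thresholds.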

Value

A CMAPResults object

Methods

signature(query = "CMAPCollection", sets = "CMAPCollection", universe = "character")
signature(query = "CMAPCollection", sets = "NChannelSet", universe = "character")
signature(query = "SignedGeneSet", sets = "CMAPCollection", universe = "character")
signature(query = "SignedGeneSet", sets = "NChannelSet", universe = "character")
signature(query = "GeneSet", sets = "CMAPCollection", universe = "character")
signature(query = "GeneSet", sets = "NChannelSet", universe = "character")
signature(query = "GeneSetCollection", sets = "CMAPCollection", universe = "character")
signature(query = "GeneSetCollection", sets = "NChannelSet", universe = "character")
signature(query = "GeneSet", sets = "GeneSetCollection", universe = "character")
signature(query = "CMAPCollection", sets = "GeneSetCollection", universe = "character")
signature(query = "GeneSetCollection", sets = "GeneSetCollection", universe = "character")
signature(query = "GeneSet", sets = "GeneSet", universe = "character")

Note

p-values are corrected for multiple testing separately for each query set, but not across multiple queries.
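The difference between correcting within each query and correcting across all queries can be seen in a small base R sketch. The raw p-values are invented, and the Benjamini-Hochberg method is an assumption; this page does not state which correction is applied.

```r
# Hypothetical raw p-values: rows are target sets, columns are queries
raw_p <- matrix(
  c(0.001, 0.020, 0.040, 0.300,
    0.010, 0.015, 0.500, 0.800),
  nrow = 4,
  dimnames = list(paste0("set", 1:4), c("query1", "query2"))
)

# Correct within each query (column) separately; "BH" is an assumption
adj_per_query <- apply(raw_p, 2, p.adjust, method = "BH")

# For comparison: one correction across all queries at once
adj_global <- matrix(
  p.adjust(raw_p, method = "BH"),
  nrow = nrow(raw_p),
  dimnames = dimnames(raw_p)
)
```

When many queries are submitted in one call, a further correction across queries along the lines of adj_global may be worth applying to the reported p-values.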

See Also

fisher.test

Examples

library(gCMAP)
data(gCMAPData)

gene.set.collection <- induceCMAPCollection(gCMAPData, "z", higher=2, lower=-2)

## compare all gene sets in the gene.set.collection to each other
universe <- featureNames(gCMAPData)
fisher_score(gene.set.collection, gene.set.collection, universe = universe)
