enrichment: Enrichment Compared To Chance

Description

Calculate the enrichment and P-value using the hypergeometric distribution.

Usage

enrichment(nMatch, nYourSet, nTotal, nTargetSubset)

enrichment.Nway(nSets = 2, nMatch, nDrawn, nTotal, 
		nSimulations = 1000000)

simulate.enrichment.Nway(nSets = 2, nMatch, nDrawn, nTotal, 
		nSimulations = 1000000)

Arguments

nMatch

the number of genes in common between your selected subset and the target subset

nYourSet

the number of genes in your selected subset

nTotal

the total number of genes

nTargetSubset

the number of genes in the target subset

nSets

the number of sets being compared; values above 2 extend the comparison to more than two sets

nDrawn

the number of genes drawn from each set (for example, 200 when comparing the top 200 DE genes)

nSimulations

number of simulations to run

Details

Based on the hypergeometric distribution; see dhyper. Consider a set of 'nTotal' genes that contains a particular subset of 'nTargetSubset' genes of interest. This is the theoretical 'urn of balls', with the target subset representing the white balls. By some selection criteria, you draw a subset of 'nYourSet' genes from the urn and observe that 'nMatch' of them are from the target subset (i.e. are white). This function returns the expected number of white balls drawn, and the probabilities of drawing at least, and at most, 'nMatch' white balls by chance alone.
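The urn calculation described above can be sketched with base R's phyper and dhyper. This is only an illustration of the underlying arithmetic, not the package's own code; the gene counts are made-up example values, and the variable names pAtLeast and pAtMost are assumptions chosen to mirror the returned terms.

```r
# Hypothetical example counts (not taken from the package documentation)
nTotal        <- 5000   # all genes in the urn
nTargetSubset <- 300    # "white balls": genes in the target subset
nYourSet      <- 200    # genes you drew from the urn
nMatch        <- 25     # drawn genes that are in the target subset

# Expected number of matches under pure chance
nExpected <- nYourSet * nTargetSubset / nTotal

# phyper(q, m, n, k): m = white balls, n = black balls, k = balls drawn.
# With lower.tail = FALSE it gives P(X > q), so use q = nMatch - 1
# to obtain P(X >= nMatch).
pAtLeast <- phyper(nMatch - 1, nTargetSubset, nTotal - nTargetSubset,
                   nYourSet, lower.tail = FALSE)

# P(X <= nMatch)
pAtMost <- phyper(nMatch, nTargetSubset, nTotal - nTargetSubset, nYourSet)

# Probability of exactly nMatch, counted in both tails above
pExactly <- dhyper(nMatch, nTargetSubset, nTotal - nTargetSubset, nYourSet)
```

Because the outcome 'exactly nMatch' is included in both tails, pAtLeast + pAtMost equals 1 + pExactly rather than 1.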

For more than two sets, there is a simulation-based method (that is quite slow). For example, given 4 independent differential expression results, to estimate how likely it is to see 17 genes in common among the top 200 DE genes, in a genome with 5000 genes, use: enrichment.Nway( nSets=4, nMatch=17, nDrawn=200, nTotal=5000)
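To make the idea of a simulation-based estimate concrete, here is a minimal sketch. It assumes each set is an independent uniform draw of nDrawn genes without replacement, and that a 'match' is a gene present in every set; the package's actual implementation may differ. The helper name simulateOverlap is hypothetical, and a small number of simulations with nSets = 2 is used so the example runs quickly.

```r
set.seed(1)

# Hypothetical helper: for each simulation, draw nDrawn gene indices per set
# and count how many genes were drawn in every one of the nSets sets.
simulateOverlap <- function(nSets, nDrawn, nTotal, nSimulations = 1000) {
  replicate(nSimulations, {
    draws  <- unlist(lapply(seq_len(nSets),
                            function(i) sample.int(nTotal, nDrawn)))
    counts <- tabulate(draws, nbins = nTotal)  # times each gene was drawn
    sum(counts == nSets)                       # genes common to all sets
  })
}

overlaps <- simulateOverlap(nSets = 2, nDrawn = 200, nTotal = 5000,
                            nSimulations = 2000)

# How often each overlap size was observed
distribution <- table(overlaps)

# Empirical estimate of P(overlap >= nMatch)
nMatch   <- 17
pAtLeast <- mean(overlaps >= nMatch)

# Theoretical expectation for comparison: nTotal * (nDrawn / nTotal)^nSets
nExpected <- 5000 * (200 / 5000)^2
```

Under the same assumptions, the chance expectation shrinks rapidly as sets are added: for 4 sets it is 5000 * (200/5000)^4 = 0.0128 genes, which is why a very large nSimulations is needed to resolve small tail probabilities.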

simulate.enrichment.Nway is a convenience wrapper function that runs the simulation and graphically fits the results to a normal distribution, to help estimate the probability of having a matching overlap of that size.
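The normal-fit step can be sketched as follows, leaving out the plotting. This is an assumption about the general approach, not the wrapper's actual code: overlap counts for two sets are generated directly with rhyper as a stand-in for the simulation output, a normal distribution is fitted by its mean and standard deviation, and its upper tail is read off with pnorm. The half-unit continuity correction is an added choice, not something the source describes.

```r
set.seed(42)

# Stand-in for simulation output: the overlap of two random sets of 200 genes
# from 5000 follows a hypergeometric distribution.
overlaps <- rhyper(10000, m = 200, n = 4800, k = 200)

# Fit a normal distribution by matching mean and standard deviation
mu    <- mean(overlaps)
sigma <- sd(overlaps)

# Normal approximation to P(overlap >= 17), with a continuity correction
nMatch  <- 17
pNormal <- pnorm(nMatch - 0.5, mean = mu, sd = sigma, lower.tail = FALSE)

# Exact hypergeometric tail, available here only because nSets is 2
pExact <- phyper(nMatch - 1, 200, 4800, 200, lower.tail = FALSE)
```

A normal fit lets you extrapolate to overlaps that never appear in the simulated distribution, but it can misstate far-tail probabilities for a skewed count distribution, so comparing against an exact value where one exists is a useful sanity check.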

Value

A list that restates the inputs, with additional terms:

nExpected

The expected number of matches, if pure chance were the only influence

P_atLeast_N

Probability of pulling at least 'nMatch' of the particular subset by chance

P_atMost_N

Probability of pulling no more than 'nMatch' of the particular subset by chance

Additionally for enrichment.Nway:

distribution

The results of the simulation, showing how often each possible outcome (number of matches) was observed

Additionally for simulate.enrichment.Nway: a plot of the distribution, with a best fit normal curve that highlights the number of matches.

See Also

See dhyper


robertdouglasmorrison/DuffyTools documentation built on Dec. 7, 2018, 8:02 a.m.