empPvals: Calculate p-values from a set of observed test statistics and...
In jdstorey/qvalue: Q-value estimation for false discovery rate control

empPvals

R Documentation

Calculate p-values from a set of observed test statistics and simulated null test statistics

Description

Calculates p-values from a set of observed test statistics and simulated null test statistics.

Usage

empPvals(stat, stat0, pool = TRUE)

Arguments

`stat`	A vector of calculated test statistics.
`stat0`	A vector or matrix of simulated or data-resampled null test statistics.
`pool`	If FALSE, `stat0` must be a matrix whose number of rows equals the length of `stat`. Default is TRUE.

Details

The argument `stat` must be defined so that larger values indicate a stronger deviation from the null hypothesis (i.e., are "more extreme"). Examples include an F-statistic or the absolute value of a t-statistic. The argument `stat0` should be calculated in the same way on data that represent observations from the null distribution. Each p-value is calculated as the proportion of values in `stat0` that are greater than or equal to the corresponding value of `stat`. If `pool=TRUE`, all of `stat0` is used to calculate the p-value for each entry of `stat`. If `pool=FALSE`, `stat0` is assumed to be a matrix, and `stat0[i,]` is used to calculate the p-value for `stat[i]`. The function `empPvals` calculates "pooled" p-values faster than a for-loop would.
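
The proportion described above can be sketched in base R without the package. This is only an illustration of the stated definition: the toy values and the names `p_pooled` and `p_specific` are hypothetical, and the actual `empPvals` implementation may differ in details such as how it handles ties or p-values that would otherwise be zero.

```r
# Hypothetical toy data (not taken from the package)
stat <- c(2.5, 0.3, 1.1)                      # observed statistics, one per test
stat0 <- matrix(c(0.2, 1.0, 3.0, 0.5,         # null statistics for test 1
                  0.1, 0.4, 0.2, 0.9,         # null statistics for test 2
                  1.5, 1.1, 0.7, 2.0),        # null statistics for test 3
                nrow = 3, byrow = TRUE)

# Pooled: compare each observed statistic against ALL null statistics
p_pooled <- sapply(stat, function(s) mean(stat0 >= s))

# Test-specific: compare stat[i] only against row i of stat0
p_specific <- sapply(seq_along(stat), function(i) mean(stat0[i, ] >= stat[i]))

print(p_pooled)
print(p_specific)
```

Taking `mean()` of a logical vector gives the proportion of `TRUE` entries, which is exactly the "proportion greater than or equal to" in the description. The pooled version uses twelve null values per test here, while the test-specific version uses only four, so its p-values are coarser.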

See page 18 of the Supporting Information in Storey et al. (2005) PNAS (http://www.pnas.org/content/suppl/2005/08/26/0504609102.DC1/04609SuppAppendix.pdf) for an explanation of why calculating p-values from pooled empirical null statistics and then estimating FDR on these p-values is equivalent to directly thresholding the test statistics themselves and using an analogous FDR estimator.

Value

A vector of p-values calculated as described above.

Author(s)

John D. Storey

References

Storey JD, Tibshirani R. (2003) Statistical significance for genome-wide experiments. Proceedings of the National Academy of Sciences, 100 (16), 9440-9445.
http://www.pnas.org/content/100/16/9440.full

Storey JD, Xiao W, Leek JT, Tompkins RG, Davis RW. (2005) Significance analysis of time course microarray experiments. Proceedings of the National Academy of Sciences, 102 (36), 12837-12842.
http://www.pnas.org/content/102/36/12837.full.pdf?with-ds=yes

Examples

# load the package and import data
library(qvalue)
data(hedenfalk)
stat <- hedenfalk$stat
stat0 <- hedenfalk$stat0 # simulated null statistics (a matrix, as pool=FALSE requires)

# calculate p-values
p.pooled <- empPvals(stat=stat, stat0=stat0)
p.testspecific <- empPvals(stat=stat, stat0=stat0, pool=FALSE)

# compare pooled to test-specific p-values
qqplot(p.pooled, p.testspecific); abline(0,1)

jdstorey/qvalue documentation built on Sept. 9, 2023, 1:50 p.m.