# empPvals: Calculate p-values from a set of observed test statistics and... In qvalue: Q-value estimation for false discovery rate control

## Description

Calculates p-values from a set of observed test statistics and simulated null test statistics.

## Usage

```r
empPvals(stat, stat0, pool = TRUE)
```

## Arguments

| Argument | Description |
|---|---|
| `stat` | A vector of calculated test statistics. |
| `stat0` | A vector or matrix of simulated or data-resampled null test statistics. |
| `pool` | If `FALSE`, `stat0` must be a matrix with the number of rows equal to the length of `stat`. Default is `TRUE`. |

## Details

Larger values of `stat` must indicate greater deviation from the null hypothesis (i.e., be "more extreme"). Examples include an F-statistic or the absolute value of a t-statistic. The argument `stat0` should be calculated analogously on data that represent observations from the null hypothesis distribution. Each p-value is calculated as the proportion of values in `stat0` that are greater than or equal to the corresponding value of `stat`.

If `pool=TRUE`, then all of `stat0` is used in calculating the p-value for each entry of `stat`. If `pool=FALSE`, then `stat0` is assumed to be a matrix, where `stat0[i,]` is used to calculate the p-value for `stat[i]`. The function `empPvals` calculates "pooled" p-values faster than using a for-loop.
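To make the definition concrete, here is a minimal base-R sketch of the proportion described above, in both its pooled and test-specific forms. The helper names `naive_pooled_pvals` and `naive_rowwise_pvals` are hypothetical and are not part of the `qvalue` package. This is not the package's implementation, and `empPvals` itself may handle edge cases (for example, p-values that would otherwise be exactly zero) differently.

```r
# Hypothetical helpers for illustration only; not part of the qvalue package.

# Pooled: compare each observed statistic against ALL null statistics.
naive_pooled_pvals <- function(stat, stat0) {
  null_all <- as.vector(stat0)  # flatten a matrix of null statistics
  vapply(stat, function(s) mean(null_all >= s), numeric(1))
}

# Test-specific: compare stat[i] only against row i of stat0.
naive_rowwise_pvals <- function(stat, stat0) {
  stopifnot(is.matrix(stat0), nrow(stat0) == length(stat))
  # stat is recycled down the columns, so entry [i, j] is compared with stat[i]
  rowMeans(stat0 >= stat)
}

# Toy data (made up for this sketch): 3 tests, 4 null statistics per test.
stat  <- c(0.5, 2.0, 3.5)
stat0 <- matrix(c(1, 2, 3, 4,
                  0, 1, 2, 3,
                  3, 4, 5, 6), nrow = 3, byrow = TRUE)

p_pooled  <- naive_pooled_pvals(stat, stat0)
p_rowwise <- naive_rowwise_pvals(stat, stat0)
```

The pooled version uses every entry of `stat0` as the reference distribution for every test, whereas the row-wise version assumes each test has its own null distribution, so the two can differ noticeably when the rows of `stat0` are not identically distributed.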

See page 18 of the Supporting Information in Storey et al. (2005) PNAS (http://www.pnas.org/content/suppl/2005/08/26/0504609102.DC1/04609SuppAppendix.pdf) for an explanation as to why calculating p-values from pooled empirical null statistics and then estimating FDR on these p-values is equivalent to directly thresholding the test statistics themselves and utilizing an analogous FDR estimator.

## Value

A vector of p-values calculated as described above.

## Author(s)

John D. Storey

## References

Storey JD and Tibshirani R. (2003) Statistical significance for genome-wide experiments. Proceedings of the National Academy of Sciences, 100: 9440-9445.
http://www.pnas.org/content/100/16/9440.full

Storey JD, Xiao W, Leek JT, Tompkins RG, Davis RW. (2005) Significance analysis of time course microarray experiments. Proceedings of the National Academy of Sciences, 102 (36), 12837-12842.
http://www.pnas.org/content/102/36/12837.full.pdf?with-ds=yes

## See Also

`qvalue`
## Examples

```r
library(qvalue)

# import data
data(hedenfalk)
stat <- hedenfalk$stat
stat0 <- hedenfalk$stat0  # simulated null statistics

# calculate p-values
p.pooled <- empPvals(stat=stat, stat0=stat0)
p.testspecific <- empPvals(stat=stat, stat0=stat0, pool=FALSE)

# compare pooled to test-specific p-values
qqplot(p.pooled, p.testspecific); abline(0,1)
```