empPvals: Calculate p-values from a set of observed test statistics and...
In qvalue: Q-value estimation for false discovery rate control



Calculates p-values from a set of observed test statistics and simulated null test statistics.

empPvals(stat, stat0, pool = TRUE)

`stat`	A vector of calculated test statistics.
`stat0`	A vector or matrix of simulated or data-resampled null test statistics.
`pool`	If FALSE, stat0 must be a matrix with the number of rows equal to the length of `stat`. Default is TRUE.

The argument stat must be defined so that larger values indicate greater deviation from the null hypothesis (i.e., are "more extreme"). Examples include an F-statistic or the absolute value of a t-statistic. The argument stat0 should be calculated in the same way on data that represent observations from the null hypothesis distribution. Each p-value is calculated as the proportion of values in stat0 that are greater than or equal to the corresponding value of stat. If pool=TRUE, all of stat0 is used to calculate the p-value for each entry of stat. If pool=FALSE, stat0 is assumed to be a matrix, and stat0[i,] is used to calculate the p-value for stat[i]. The function empPvals calculates "pooled" p-values faster than a for-loop would.
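The calculation described above can be illustrated with a minimal base-R sketch. The helper naive_emp_pvals and the toy data are hypothetical and exist only for illustration. This is not the package's implementation, and empPvals itself may treat ties or zero proportions differently.

```r
# Hypothetical helper: a naive illustration of the definition in the text,
# NOT the code used by qvalue::empPvals.
naive_emp_pvals <- function(stat, stat0, pool = TRUE) {
  if (pool) {
    # Pooled: compare each stat[i] against every null statistic.
    null_all <- as.vector(stat0)
    vapply(stat, function(s) mean(null_all >= s), numeric(1))
  } else {
    # Test-specific: compare stat[i] only against row i of stat0.
    stat0 <- as.matrix(stat0)
    stopifnot(nrow(stat0) == length(stat))
    # stat is recycled down each column, so stat[i] meets stat0[i, j].
    rowMeans(stat0 >= stat)
  }
}

# Toy data (made up): 3 tests, 4 null statistics per test.
stat <- c(0.5, 2.2, 3.5)
stat0 <- matrix(c(0.1, 1.0, 4.0,
                  0.6, 2.5, 3.0,
                  0.4, 1.5, 3.6,
                  0.9, 2.0, 1.0), nrow = 3)

p_pooled   <- naive_emp_pvals(stat, stat0, pool = TRUE)
p_unpooled <- naive_emp_pvals(stat, stat0, pool = FALSE)
print(p_pooled)
print(p_unpooled)
```

With pooling, each p-value is a proportion out of all 12 null values; without pooling, each is a proportion out of the 4 values in the matching row.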

See page 18 of the Supporting Information in Storey et al. (2005) PNAS (http://www.pnas.org/content/suppl/2005/08/26/0504609102.DC1/04609SuppAppendix.pdf) for an explanation as to why calculating p-values from pooled empirical null statistics and then estimating FDR on these p-values is equivalent to directly thresholding the test statistics themselves and utilizing an analogous FDR estimator.
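One ingredient of that equivalence is easy to check numerically: a pooled p-value is a non-increasing function of the test statistic, so ranking tests by pooled p-value agrees with ranking them by statistic. The sketch below uses simulated data and a hand-written proportion (both are assumptions for illustration, not the derivation in the Supporting Information) to check this monotonicity only; it does not reproduce the FDR estimator.

```r
# Simulated toy data (an assumption for illustration only).
set.seed(1)
stat  <- abs(rnorm(50))   # observed statistics, larger = more extreme
stat0 <- abs(rnorm(500))  # pooled null statistics

# Pooled p-value: proportion of null statistics >= the observed one.
pooled_p <- vapply(stat, function(s) mean(stat0 >= s), numeric(1))

# Sort tests from most to least extreme statistic; the pooled
# p-values should then never decrease along that ordering.
ord <- order(stat, decreasing = TRUE)
monotone <- all(diff(pooled_p[ord]) >= 0)
print(monotone)
```

Test-specific p-values (pool=FALSE) need not have this property, because each statistic is compared against a different set of null values.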

A vector of p-values calculated as described above.

John D. Storey

Storey JD, Tibshirani R. (2003) Statistical significance for genome-wide experiments. Proceedings of the National Academy of Sciences, 100 (16), 9440-9445.
http://www.pnas.org/content/100/16/9440.full

Storey JD, Xiao W, Leek JT, Tompkins RG, Davis RW. (2005) Significance analysis of time course microarray experiments. Proceedings of the National Academy of Sciences, 102 (36), 12837-12842.
http://www.pnas.org/content/102/36/12837.full.pdf?with-ds=yes

qvalue

# load the package and import data
library(qvalue)
data(hedenfalk)
stat <- hedenfalk$stat
stat0 <- hedenfalk$stat0 # simulated null statistics (must be a matrix for pool=FALSE below)

# calculate p-values
p.pooled <- empPvals(stat=stat, stat0=stat0)
p.testspecific <- empPvals(stat=stat, stat0=stat0, pool=FALSE)

# compare pooled to test-specific p-values
qqplot(p.pooled, p.testspecific); abline(0,1)