combinePValues: Combine p-values
In scran: Methods for Single-Cell RNA-Seq Data Analysis

Combine p-values from independent or dependent hypothesis tests using a variety of meta-analysis methods.

combinePValues(
  ...,
  method = c("fisher", "z", "simes", "berger", "holm-middle"),
  weights = NULL,
  log.p = FALSE,
  min.prop = 0.5
)

`...`	Two or more numeric vectors of p-values of the same length.
`method`	A string specifying the combining strategy to use.
`weights`	A numeric vector of positive weights, with one value per vector in `...`. Alternatively, a list of numeric vectors of weights, with one vector per element in `...`. This is only used when `method="z"`.
`log.p`	Logical scalar indicating whether the p-values in `...` are log-transformed.
`min.prop`	Numeric scalar in [0, 1] specifying the minimum proportion of tests to reject for each set of p-values when `method="holm-middle"`.

This function operates across the elements of `...` in parallel to combine p-values. That is, the first p-values from all vectors are combined, followed by the second p-values, and so on. This is useful for combining p-values for each gene across different hypothesis tests.
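The element-wise behaviour can be sketched in base R without the package. Here `fisher_one` is a hypothetical helper (not part of scran) that applies the textbook form of Fisher's method to one set of p-values, and `mapply` walks across positions in the same way the paragraph above describes. This is an illustration of the layout only, not the package's implementation.

```r
# Hypothetical helper, not part of scran: textbook Fisher's method for one set.
fisher_one <- function(...) {
    p <- c(...)
    stat <- -2 * sum(log(p))
    pchisq(stat, df = 2 * length(p), lower.tail = FALSE)
}

# Three tests (vectors) and four genes (positions within each vector).
p1 <- c(0.01, 0.50, 0.20, 0.90)
p2 <- c(0.02, 0.40, 0.80, 0.95)
p3 <- c(0.03, 0.60, 0.05, 0.99)

# Position i of the output combines p1[i], p2[i] and p3[i].
combined <- mapply(fisher_one, p1, p2, p3)
length(combined)  # one combined p-value per gene
```

Each input vector corresponds to one hypothesis test and each position to one gene, so the output has the same length as each input.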

Fisher's method, Stouffer's Z method and Simes' method test the global null hypothesis that all of the individual null hypotheses in the set are true. The global null is rejected if any of the individual nulls are rejected. However, each test has different characteristics:

Fisher's method requires independence of the test statistics. It is useful in asymmetric scenarios, i.e., when the null is only rejected in one of the tests in the set. Thus, a low p-value in any test is sufficient to obtain a low combined p-value.
Stouffer's Z method requires independence of the test statistics. It favours symmetric rejection and is less sensitive to a single low p-value, requiring more consistently low p-values to yield a low combined p-value. It can also accommodate weighting of the different p-values.
Simes' method technically requires independence but tends to be quite robust to dependencies between tests. See Sarkar and Chung (1997) for details, as well as work on the related Benjamini-Hochberg method. It favours asymmetric rejection and is less powerful than the other two methods under independence.
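The contrast between asymmetric and symmetric rejection can be seen with textbook forms of the three methods written in base R. The helpers `fisher`, `stouffer` and `simes` below are hypothetical names for this sketch and are not the package's internals; the weighted Z statistic follows the usual form, the weighted sum of z-scores divided by the square root of the sum of squared weights.

```r
# Hypothetical helpers, not part of scran; each takes one set of p-values.
fisher <- function(p) {
    pchisq(-2 * sum(log(p)), df = 2 * length(p), lower.tail = FALSE)
}

stouffer <- function(p, w = rep(1, length(p))) {
    z <- qnorm(p)                       # low p-values give negative z-scores
    pnorm(sum(w * z) / sqrt(sum(w^2)))  # lower tail of the weighted sum
}

simes <- function(p) {
    min(length(p) * sort(p) / seq_along(p))
}

asymmetric <- c(1e-6, 0.5, 0.5)    # one strong rejection, two null-like values
symmetric  <- c(0.05, 0.05, 0.05)  # consistent moderate evidence

# Rows are the two scenarios, columns are the three methods.
sapply(list(fisher = fisher, stouffer = stouffer, simes = simes),
       function(f) c(asymmetric = f(asymmetric), symmetric = f(symmetric)))
```

In the asymmetric set, Fisher's and Simes' methods give a much lower combined p-value than Stouffer's Z method. In the symmetric set the ordering reverses, and Stouffer's Z method gives the lowest value.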

Berger's intersection-union test (IUT) examines a different global null hypothesis: that at least one of the individual null hypotheses is true. Rejection in the IUT indicates that all of the individual nulls have been rejected. This is the statistically rigorous equivalent of a naive intersection operation.
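In its standard form, the IUT p-value for a set is simply the largest individual p-value, which makes the link to a naive intersection easy to see. The sketch below uses a hypothetical helper `berger` (not part of scran) and made-up p-values.

```r
# Hypothetical helper, not part of scran: the IUT p-value is the largest one.
berger <- function(p) max(p)

all_low  <- c(0.001, 0.004, 0.010)  # every test rejects at 5%
one_high <- c(0.001, 0.004, 0.600)  # one test fails to reject

berger(all_low)   # 0.01
berger(one_high)  # 0.6

# Thresholding the combined p-value gives the same decision as requiring
# every individual p-value to pass the threshold.
alpha <- 0.05
same_low  <- all(all_low <= alpha)  == (berger(all_low)  <= alpha)
same_high <- all(one_high <= alpha) == (berger(one_high) <= alpha)
```

Unlike a hard intersection of significant gene lists, the result is a single p-value per set, so it can be passed to a multiple testing correction across genes.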

In the Holm-middle approach, the global null hypothesis is that a proportion greater than `1 - min.prop` of the individual nulls in the set are true. We apply the Holm-Bonferroni correction to all p-values in the set and take the `ceiling(min.prop * N)`-th smallest value, where `N` is the size of the set (excluding `NA` values). This method remains valid in the presence of correlations between p-values.
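The procedure described above can be sketched with base R's `p.adjust`. The helper `holm_middle` is a hypothetical name, and the lower bound of 1 on the index (to cover `min.prop = 0`) is an assumption of this sketch rather than documented behaviour of the package.

```r
# Hypothetical helper, not part of scran.
holm_middle <- function(p, min.prop = 0.5) {
    p <- p[!is.na(p)]                    # N excludes NA values
    n <- length(p)
    adj <- p.adjust(p, method = "holm")  # Holm-Bonferroni adjusted p-values
    # Assumption: clamp the index to at least 1 so min.prop = 0 is usable.
    idx <- max(1, ceiling(min.prop * n))
    sort(adj)[idx]
}

p <- c(0.001, 0.01, 0.2, 0.8)
# Holm-adjusted values, sorted: 0.004, 0.03, 0.4, 0.8

holm_middle(p, min.prop = 0.25)  # 0.004, the smallest adjusted value
holm_middle(p, min.prop = 0.5)   # 0.03, the second smallest
holm_middle(p, min.prop = 1)     # 0.8, the largest
```

Raising `min.prop` asks for more of the individual nulls to be rejected, so the combined p-value can only stay the same or increase.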

A numeric vector containing the combined p-values.

Aaron Lun

Fisher RA (1925). Statistical Methods for Research Workers. Oliver and Boyd (Edinburgh).

Whitlock MC (2005). Combining probability from independent tests: the weighted Z-method is superior to Fisher's approach. J. Evol. Biol. 18, 5:1368-73.

Simes RJ (1986). An improved Bonferroni procedure for multiple tests of significance. Biometrika 73:751-754.

Berger RL and Hsu JC (1996). Bioequivalence trials, intersection-union tests and equivalence confidence sets. Statist. Sci. 11, 283-319.

Sarkar SK and Chung CK (1997). The Simes method for multiple hypothesis testing with positively dependent test statistics. J. Am. Stat. Assoc. 92, 1601-1608.

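library(scran)

# Simulate three sets of p-values under the null hypothesis.
p1 <- runif(10000)
p2 <- runif(10000)
p3 <- runif(10000)

# Fisher's method (the default).
fish <- combinePValues(p1, p2, p3)
hist(fish)

# Stouffer's Z method, with weights increasing from the first to the third test.
z <- combinePValues(p1, p2, p3, method="z", weights=1:3)
hist(z)

# Simes' method.
simes <- combinePValues(p1, p2, p3, method="simes")
hist(simes)

# Berger's intersection-union test.
berger <- combinePValues(p1, p2, p3, method="berger")
hist(berger)

# Holm-middle approach, requiring rejection in at least half of the tests.
holm <- combinePValues(p1, p2, p3, method="holm-middle", min.prop=0.5)
hist(holm)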