radjust_pf: Adjust p-values for Replicability across Two Independent,...


Description

Given two vectors of p-values from the primary and follow-up studies, returns the adjusted p-values for false discovery rate control on replicability claims. The p-value vectors are only for features selected for follow-up.

Usage

radjust_pf(pv1, pv2, m, c2 = 0.5, l00 = 0, variant = c("none",
  "general_dependency", "use_threshold"), threshold = NULL,
  alpha = 0.05)

Arguments

pv1

numeric vector of p-values from the primary study, ordered so that each entry corresponds to the same feature as the matching entry of the follow-up p-values (pv2).

pv2

numeric vector of p-values from the follow-up study.

m

the number of features examined in the primary study (> length(pv1)).

c2

the relative boost to the p-values from the follow-up study. c2 = 0.5 (the default) is recommended: in simulations it was observed to yield power similar to that of the procedure with the optimal value (which is unknown for real data).

l00

a lower bound on the fraction of features (out of m) with true null hypotheses in both studies. For example, for GWAS on the whole genome, the choice of 0.8 is conservative in typical applications.

variant

the variant of the procedure to use; one of:

none

the default.

general_dependency

use m* = m * sum(1/i) instead of m.

use_threshold

c1 is computed given the threshold value.

Both variants guarantee that the procedure which declares as replicated all features with r-values below alpha controls the FDR at level alpha, under any type of dependency among the p-values in the primary study.

threshold

the selection threshold for p-values from the primary study; must be supplied when variant 'use_threshold' is selected, otherwise ignored.

alpha

The FDR level to control.
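To make the relationship between m, pv1 and threshold concrete, the following base-R sketch simulates a primary study and selects features for follow-up. The data are simulated (all null) purely for illustration, and the names m_primary, pv_primary and selected are hypothetical, not part of the package. The last line computes the inflated count m* described under general_dependency, assuming the sum runs over i = 1, ..., m; that range is an assumption, since the text above does not state it.

```r
set.seed(1)

# Hypothetical primary study: m_primary features examined in total
m_primary  <- 10000
pv_primary <- runif(m_primary)   # simulated p-values, illustration only

# Only features whose primary p-value passes the selection threshold
# are forwarded to the follow-up study
threshold <- 1e-3
selected  <- which(pv_primary <= threshold)

# pv1 holds the primary p-values of the selected features only;
# pv2 would hold the follow-up p-values for the same features, in the same order
pv1 <- pv_primary[selected]

# The 'm' argument is the number of features examined, not length(pv1)
m <- m_primary

# Inflated count for general dependency (assumed range: i = 1, ..., m)
m_star <- m * sum(1 / seq_len(m))
```

Here pv1 is much shorter than m, which is why m has to be passed separately.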

Details

When many hypotheses are simultaneously examined in a primary study, and a subset of them is then forwarded for follow-up in an independent study, it is of interest to know which findings are replicated across studies. As a measure of replicability of significance, we compute the r-value, i.e. the FDR-adjusted replicability p-value, for each hypothesis that was followed up. This measure differs from the FDR-adjusted p-value in a typical meta-analysis, where a p-value close to zero in one of the studies is enough to declare the finding highly significant. The FDR r-value for a feature is the smallest FDR level at which we can say that the finding is among the replicated ones.

Value

numeric vector of the same length as pv1 and pv2, containing the r-values.
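As a minimal sketch of how the returned vector might be used, replicability claims can be made by comparing each r-value with the chosen FDR level. The r-values below are made up for illustration and are not output of radjust_pf; using <= rather than a strict < at the boundary is an assumption.

```r
alpha <- 0.05

# Hypothetical r-values, one per followed-up feature (illustration only)
rv <- c(0.001, 0.2, 0.04, 0.75, 0.03)

# Indices of the features declared replicated at FDR level alpha
replicated <- which(rv <= alpha)
```

The indices in replicated refer to positions in pv1 and pv2, so they can be used to look up the corresponding features.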

Note

The function is also available as a web applet: http://www.math.tau.ac.il/~ruheller/App.html

References

Bogomolov, M. and Heller, R. (2013). Discovering findings that replicate from a primary study of high dimension to a follow-up study. Journal of the American Statistical Association, Vol. 108, No. 504, Pp. 1480-1492.

Heller, R., Bogomolov, M. and Benjamini, Y. (2014). Deciding whether follow-up studies have replicated findings in a preliminary large-scale omics study. Proceedings of the National Academy of Sciences of the United States of America, Vol. 111, No. 46, Pp. 16262–16267.

See Also

radjust_sym for replicability analysis in two symmetric studies.

Examples

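 library(radjust)
 data(crohn)
 # r-values for the followed-up features, default variant
 rv  <- radjust_pf(pv1 = crohn$pv1, pv2 = crohn$pv2, m = 635547, l00 = 0.8)
 # same analysis, with c1 computed from the selection threshold
 rv2 <- radjust_pf(pv1 = crohn$pv1, pv2 = crohn$pv2, m = 635547, l00 = 0.8,
                   variant = "use_threshold", threshold = 1e-5)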

radjust documentation built on May 2, 2019, 3:40 p.m.