radjust_sym: Adjust p-values for Replicability across Two Independent...


View source: R/radjust_sym.R

Description

Given two vectors of p-values from two independent studies, returns the adjusted p-values for false discovery rate control on replicability claims.

Usage

radjust_sym(pv1, pv2, w1 = 0.5, input_type = c("selected_features",
  "all_features"), general_dependency = FALSE,
  directional_rep_claim = FALSE,
  variant = c("non-adaptive-with-alpha-selection", "adaptive",
  "non-adaptive"), alpha = if (variant == "non-adaptive") NULL else 0.05)

Arguments

pv1, pv2

numeric vectors of p-values. If directional_rep_claim is TRUE, they must be left-sided p-values. The vectors can hold either the p-values for the selected features from each study (the default input type) or the p-values for all the features from each study. They must either have the same length (so that the same position in each vector corresponds to the same feature) or be named (so that the same name in each vector corresponds to the same feature).

w1

a fraction between zero and one giving the relative weight of the p-values from study 1. The default value is 0.5 (see Details for other values).

input_type

whether pv1 and pv2 contain all the p-values from each study or only the selected ones (the default).

general_dependency

TRUE or FALSE, indicating whether to correct for general dependency. The recommended default value is FALSE (see Details).

directional_rep_claim

TRUE or FALSE, indicating whether to perform directional replicability analysis. The default value is FALSE. If TRUE, pv1 and pv2 should be left-sided p-values (see Details).

variant

A character string specifying the chosen variant for a potential increase in the number of discoveries. Must be one of "non-adaptive-with-alpha-selection" (default), "adaptive", or "non-adaptive" (see Details).

alpha

The threshold on p-values for selecting the features in each study and the significance level for replicability analysis (see Details).
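As a minimal sketch of how named input vectors identify features for pv1 and pv2, the base-R snippet below aligns two named vectors on their shared names. The feature names and p-values are made up for illustration, and the code is not taken from the package; it only shows what "the same name in each vector corresponds to the same feature" means in practice.

```r
# Made-up p-values for selected features, named by feature (illustrative only).
pv1 <- c(geneA = 0.001, geneB = 0.020, geneC = 0.004)
pv2 <- c(geneC = 0.010, geneA = 0.003, geneD = 0.030)

# Features that appear in both named vectors, regardless of their order.
common <- intersect(names(pv1), names(pv2))

# Align the two studies on the shared names.
aligned <- data.frame(name = common,
                      p.value.1 = unname(pv1[common]),
                      p.value.2 = unname(pv2[common]))
print(aligned)
```

With unnamed vectors of equal length, position plays the role that the name plays here.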

Details

For FDR control at level α on replicability claims, all features with r-value at most α are declared as replicated. In addition, the discoveries from study 1 among the replicability claims have an FDR control guarantee at level w1 * α. Similarly, the discoveries from study 2 among the replicability claims have an FDR control guarantee at level (1-w1) * α.

Setting w1 to a value other than 0.5 is appropriate when stricter FDR control is wanted in one of the studies. For example, if study two has a much larger sample size than study one (and both studies examine the same problem), then setting w1 > 0.5 provides stricter FDR control for the larger study and greater power for the replicability analysis; see Bogomolov and Heller (2018) for details.
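The arithmetic behind the weight can be checked with a few lines of base R. The value w1 = 0.8 is a hypothetical choice used only for illustration.

```r
alpha <- 0.05
w1 <- 0.8   # hypothetical weight; a value above 0.5 is stricter for study 2

level_study1 <- w1 * alpha        # FDR level for study 1 discoveries
level_study2 <- (1 - w1) * alpha  # FDR level for study 2 discoveries

cat("study 1:", level_study1, " study 2:", level_study2, "\n")
```

The two levels always add up to alpha, so tightening one study's level loosens the other's.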

The theoretical FDR control guarantees assume independence within each vector of p-values. However, empirical investigations suggest that the method is robust to deviations from independence. In practice, we recommend using it whenever the Benjamini-Hochberg procedure is appropriate for use with single studies, as this procedure can be viewed as a two-dimensional Benjamini-Hochberg procedure which enjoys similar robustness properties. For general dependence, we provide the option to apply a more conservative procedure with theoretical FDR control guarantee for any type of dependence, by setting general_dependency to TRUE.
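For intuition only, the single-study setting has an analogous pair of procedures in base R: Benjamini-Hochberg and the more conservative Benjamini-Yekutieli adjustment, which is valid under arbitrary dependence. The sketch below compares them on made-up p-values. It is an analogy and does not show what general_dependency = TRUE computes inside radjust_sym.

```r
# Made-up single-study p-values (illustrative only).
p <- c(0.001, 0.008, 0.012, 0.030, 0.200, 0.650)

bh <- p.adjust(p, method = "BH")  # Benjamini-Hochberg
by <- p.adjust(p, method = "BY")  # Benjamini-Yekutieli, arbitrary dependence

# The BY-adjusted values are never smaller than the BH-adjusted ones.
print(data.frame(p, bh, by))
```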

If variant is "non-adaptive", then the non-adaptive replicability analysis procedure of Bogomolov and Heller (2018) is applied to the input p-values pv1 and pv2. If variant is "non-adaptive-with-alpha-selection", then for a user-specified alpha (default 0.05) only p-values from study one below w1 * α and from study two below (1-w1) * α are considered for replicability analysis. This additional step keeps out of the selected sets any features that cannot be discovered as replicability claims at the nominal FDR level α, thus reducing the multiplicity adjustment necessary for replicability analysis. If variant is "adaptive", then for a user-specified alpha the adaptive replicability analysis procedure is applied to the dataset; see Bogomolov and Heller (2018) for details.
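The selection step described for "non-adaptive-with-alpha-selection" can be sketched in base R as follows. The feature names and p-values are made up, and the snippet is an illustration of the thresholds rather than the package's own code; in particular, the strict inequality at the boundary is an assumption.

```r
alpha <- 0.05
w1 <- 0.5

# Made-up p-values for five features in each study (illustrative only).
pv1 <- c(f1 = 0.001, f2 = 0.030, f3 = 0.020, f4 = 0.400, f5 = 0.010)
pv2 <- c(f1 = 0.004, f2 = 0.002, f3 = 0.300, f4 = 0.001, f5 = 0.020)

# Keep only p-values below the study-specific thresholds.
selected1 <- names(pv1)[pv1 < w1 * alpha]        # threshold w1 * alpha
selected2 <- names(pv2)[pv2 < (1 - w1) * alpha]  # threshold (1 - w1) * alpha

# Features selected in both studies.
candidates <- intersect(selected1, selected2)
print(candidates)
```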

If directional_rep_claim is FALSE, a replicability claim for a feature means that both null hypotheses are false (equivalently, both alternatives are true). Setting directional_rep_claim to TRUE is useful if the discoveries of interest are directional but the direction is unknown. For example, a directional replicability claim for a feature is the claim that both associations examined for it are positive, or that both are negative, but not that one association is positive and the other negative. For directional replicability analysis, the input p-values pv1 and pv2 should be left-sided p-values (the choice of left-sided is without loss of generality, since the left-sided and right-sided p-values are assumed to sum to one for each null hypothesis).
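The relation between left-sided p-values and the p-value in the direction of the effect can be sketched in base R. The p-values are made up, and the "left"/"right" labels are hypothetical names chosen for this illustration; they are not claimed to match the values of the Direction column.

```r
# Made-up left-sided p-values for one study (illustrative only).
pv_left <- c(0.002, 0.990, 0.400, 0.970)

# The right-sided p-value is the complement, under the sum-to-one assumption.
pv_right <- 1 - pv_left

# One-sided p-value in the direction of the observed effect.
pv_dir <- pmin(pv_left, pv_right)

# Hypothetical labels recording which side gives the smaller p-value.
side <- ifelse(pv_left <= pv_right, "left", "right")
print(data.frame(pv_left, pv_right, pv_dir, side))
```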

Value

The function returns a list with the following elements:

call: the function call.
inputs: a list with the function's input parameters (except pv1 and pv2).
results_table: a data frame with the features selected in both studies and their r-values (see description below).
selected1: the features selected in study 1 (when variant is either "adaptive" or "non-adaptive-with-alpha-selection").
selected2: the features selected in study 2, under the same condition as selected1.
n_selected1: the number of selected features in study 1.
n_selected2: the number of selected features in study 2.
pi1: the estimated fraction of true nulls in study 1 among the features selected in study 2, when variant = "adaptive".
pi2: the estimated fraction of true nulls in study 2 among the features selected in study 1, when variant = "adaptive".

The third element in the list, results_table, includes the following columns:

name (char.): the name of the feature as extracted from the named vectors, or its position if the input vectors are not named.
p.value.1 (numeric): the one-sided p-value from study 1 as input (pv1). When directional_rep_claim = TRUE, the one-sided p-value in the direction of the effect is shown, i.e., min(pv1, 1-pv1).
p.value.2 (numeric): the one-sided p-value from study 2 as input (pv2). When directional_rep_claim = TRUE, the one-sided p-value in the direction of the effect is shown, i.e., min(pv2, 1-pv2).
r.value (numeric): the replicability-adjusted p-value (the r-value).
Direction (char.): the direction of the replicability claim, when directional_rep_claim = TRUE.
Significant (char.): indicates for which features replicability claims can be made at level α, when variant is "adaptive" or "non-adaptive-with-alpha-selection".
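Once a result is available, the features declared replicated at level α are those whose r.value is at most α. The sketch below uses a hand-built data frame as a stand-in for results_table, with made-up values and only a subset of the documented columns, so that it runs without the package.

```r
# Hand-built stand-in for the results_table element (made-up values).
results_table <- data.frame(
  name      = c("f1", "f2", "f3"),
  p.value.1 = c(0.001, 0.010, 0.020),
  p.value.2 = c(0.004, 0.015, 0.024),
  r.value   = c(0.012, 0.045, 0.080),
  stringsAsFactors = FALSE
)

alpha <- 0.05

# Features with r-value at most alpha are declared replicated.
replicated <- results_table[results_table$r.value <= alpha, ]
print(replicated$name)
```

With a real result object, the same subsetting would presumably apply to its results_table element.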

References

Bogomolov, M. and Heller, R. (2018). Assessing replicability of findings across two studies of multiple features. Biometrika.

See Also

radjust_pf for replicability analysis in primary and follow-up studies.

Examples

library(radjust)  # the package that provides radjust_sym and the mice data
data(mice)
## transform the two-sided p-values to one-sided in the same direction (left):
## (we use the direction of the test statistic to do so and assume that it is continuous)

pv1 <- ifelse(mice$dir_is_left1, mice$twosided_pv1/2, 1-mice$twosided_pv1/2)
pv2 <- ifelse(mice$dir_is_left2, mice$twosided_pv2/2, 1-mice$twosided_pv2/2)

## run the examples as in the article:

mice_rv_adaptive <- radjust_sym(pv1, pv2, input_type = "all", directional_rep_claim = TRUE,
                                variant = "adaptive", alpha=0.05)
print(mice_rv_adaptive)

mice_rv_non_adpt_sel <- radjust_sym(pv1, pv2, input_type = "all", directional_rep_claim = TRUE,
                                    variant = "non-adaptive-with-alpha-selection", alpha=0.05)
print(mice_rv_non_adpt_sel)

mice_rv_non_adpt <- radjust_sym(pv1, pv2, input_type = "selected", directional_rep_claim = TRUE,
                                variant = "non-adaptive")
print(mice_rv_non_adpt)

radjust documentation built on May 2, 2019, 3:40 p.m.