empOPW: Perform Empirical Optimal P-value Weighting
In mshasan/empOPW: Empirical Method of Optimal Pvalue Weighting

View source: R/empOPW.R

A function to perform a weighted p-value multiple hypothesis test. If neither the weights nor the rank probabilities are supplied, the function computes the rank probabilities of the test statistics, ranked by the filter statistics and given the effect sizes, and from them the weights. It then returns the number of rejected null hypotheses and the list of rejected p-values together with the corresponding filter statistics.

empOPW(pvalue, filter, weight = NULL, ranksProb = NULL,
  mean_testEffect = NULL, alpha = 0.05, tail = 1L, delInterval = 0.001,
  group = NULL, max.group = 5L, h_breaks = 71L,
  effectType = c("continuous", "binary"), method = c("BH", "BON"), ...)

`pvalue`	a vector of pvalues of the test statistics
`filter`	a vector of filter statistics
`weight`	optional vector of weights; computed internally if NULL
`ranksProb`	rank probabilities of the test groups, ordered by the filter statistics, given the mean effect. Note that the rank probability is the same for all tests within a group.
`mean_testEffect`	mean test effect of the true alternatives
`alpha`	significance level of the hypothesis test
`tail`	1 for a right-tailed test or 2 for a two-tailed test. Default is 1 (right-tailed).
`delInterval`	interval between successive `delta` values in a sequence. Note that `delta` is a Lagrange multiplier, needed to normalize the weights.
`group`	Integer, number of groups. Default is NULL. To use a fixed number of groups, also set max.group = NULL.
`max.group`	maximum number of groups used to split the p-values. Default is 5. Note that it is better to keep approximately 1000 p-values per group.
`h_breaks`	number of breaks to be used for the histogram, default is 71
`effectType`	type of effect sizes; c("continuous", "binary")
`method`	method used to obtain the results: "BH" (Benjamini-Hochberg) or "BON" (Bonferroni)
`...`	Arguments passed to internal functions

If one wants to test

H_0: epsilon_i = 0 vs. H_a: epsilon_i > 0,

then the mean_testEffect and mean_filterEffect should be mean of the test and filter effect sizes, respectively. This is called hypothesis testing for the continuous effect sizes.

If one wants to test

H_0: epsilon_i = 0 vs. H_a: epsilon_i = epsilon,

then mean_testEffect and mean_filterEffect should be the median or any discrete value of the test and filter effect sizes. This is called hypothesis testing for binary effect sizes, where epsilon refers to a fixed value.
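To make the two settings concrete, the following base-R sketch simulates right-tailed z-tests under each one. The variable names, the uniform range for the continuous effects, and the fixed value 2 for the binary effect are illustrative assumptions, not package defaults.

```r
set.seed(1)
m <- 1000                                  # number of tests (illustrative)
H <- rbinom(m, size = 1, prob = 0.1)       # 1 = true alternative, 0 = true null

# Continuous effects: each true alternative has its own epsilon_i > 0
eps_cont <- H * runif(m, min = 0, max = 2.5)
# Binary effects: every true alternative shares one fixed epsilon (assumed 2)
eps_bin <- H * 2

# Z-scores centred at the effect sizes
z_cont <- rnorm(m, mean = eps_cont)
z_bin  <- rnorm(m, mean = eps_bin)

# Right-tailed p-values for H_a: epsilon_i > 0
p_cont <- pnorm(z_cont, lower.tail = FALSE)
p_bin  <- pnorm(z_bin,  lower.tail = FALSE)

# Summaries of the kind described above for mean_testEffect
et_cont <- mean(eps_cont[H == 1])          # mean of the continuous effects
et_bin  <- median(eps_bin[H == 1])         # the single fixed value
```

In real data the effect sizes are unknown, so these summaries would have to be estimated rather than read off the simulation.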

The main goal of the function is to compute the probabilities of the ranks from the p-values ranked by the filter statistics, and from them the weights. Although weight and ranksProb are optional, empOPW accepts both so that one can compute the probabilities and the weights externally if necessary (see the examples).

Internally, empOPW computes ranksProb and from it the weights, then uses the p-values to draw conclusions about the hypotheses. Although ranksProb is not required, one can compute it empirically with the function prob_rank_givenEffect_emp.
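As a rough illustration of the final step, the sketch below applies a generic weighted multiple-testing rule: each p-value is divided by its weight and the BH or Bonferroni adjustment is applied to the result. This is the textbook weighted procedure, not necessarily what empOPW does internally; `weighted_reject` is a hypothetical helper, and using the filter statistics directly as weights is purely for demonstration.

```r
# Hypothetical helper: generic weighted BH / Bonferroni rejection rule
weighted_reject <- function(pvalue, weight, alpha = 0.05,
                            method = c("BH", "BON")) {
  method <- match.arg(method)
  stopifnot(length(pvalue) == length(weight), all(weight >= 0))
  w <- weight / mean(weight)                  # normalise: weights average to 1
  # Weighted p-values; tests with zero weight can never be rejected
  pw <- ifelse(w > 0, pmin(pvalue / w, 1), 1)
  adj <- p.adjust(pw, method = if (method == "BH") "BH" else "bonferroni")
  which(adj <= alpha)                         # indices of rejected hypotheses
}

set.seed(123)
m <- 10000
filters <- runif(m, min = 0, max = 2.5)
H <- rbinom(m, size = 1, prob = 0.1)
tests <- rnorm(m, mean = H * filters)
pvals <- pnorm(tests, lower.tail = FALSE)

rej_unweighted <- weighted_reject(pvals, weight = rep(1, m))
rej_weighted   <- weighted_reject(pvals, weight = filters)  # illustrative weights
```

With unit weights the helper reduces to the ordinary unweighted procedure, which gives a baseline for judging whether a given set of weights increases the number of rejections.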

If mean_testEffect is not supplied, the function computes it internally from the test statistics, which are obtained from the p-values.
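One way such an estimate could be formed is sketched below in base R: right-tailed p-values are mapped back to z-scores, and the largest m1 of them are averaged, mirroring the calculation in the Examples. The null proportion `null_prop` is an assumed value here (the Examples estimate it with qvalue), and the package's internal estimator may differ.

```r
set.seed(123)
m <- 10000
filters <- runif(m, min = 0, max = 2.5)
H <- rbinom(m, size = 1, prob = 0.1)
tests <- rnorm(m, mean = H * filters)
pvals <- pnorm(tests, lower.tail = FALSE)    # right-tailed p-values

# Recover the test statistics from the right-tailed p-values
z <- qnorm(pvals, lower.tail = FALSE)

null_prop <- 0.9                             # assumed proportion of true nulls
m0 <- ceiling(null_prop * m)                 # estimated number of true nulls
m1 <- m - m0                                 # estimated number of alternatives

# Mean of the m1 largest test statistics as a proxy for the mean test effect
et <- mean(sort(z, decreasing = TRUE)[seq_len(m1)])
```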

It is worth trying different combinations of group and h_breaks to maximize the number of rejections. A reasonable target is approximately 1000 p-values per group.
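The rule of thumb of about 1000 p-values per group can be turned into a starting value for the number of groups, as in the sketch below. The block assignment (tests ordered by decreasing filter statistic, then cut into consecutive equal-sized blocks) is an assumption made for illustration; the package may form its groups differently.

```r
set.seed(123)
m <- 10000
filters <- runif(m, min = 0, max = 2.5)

# About 1000 p-values per group, and at least one group
n_groups <- max(1L, as.integer(round(m / 1000)))

# Order tests by filter statistic (largest first) and assign consecutive blocks
ord <- order(filters, decreasing = TRUE)
group_id <- integer(m)
group_id[ord] <- ceiling(seq_len(m) * n_groups / m)

group_sizes <- table(group_id)               # number of tests in each group
```

A value obtained this way is only a starting point; trying neighbouring group counts and several h_breaks values is still advisable.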

One must provide either the number of groups (group) or the maximum number of groups (max.group). If both are given, or only max.group is given, max.group is used to obtain the optimal number of groups; otherwise, the number of groups is set by group.
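The precedence rule can be summarised by a small sketch. `resolve_grouping` is a hypothetical helper written only to restate the rule; it is not part of the package and does not perform the search for the optimal group.

```r
# Hypothetical helper: reports which argument decides the grouping
resolve_grouping <- function(group = NULL, max.group = 5L) {
  if (is.null(group) && is.null(max.group)) {
    stop("provide either 'group' or 'max.group'")
  }
  if (!is.null(max.group)) {
    # max.group takes precedence whenever it is given
    list(source = "max.group", value = as.integer(max.group))
  } else {
    # fixed number of groups: only used when max.group = NULL
    list(source = "group", value = as.integer(group))
  }
}

both     <- resolve_grouping(group = 3L, max.group = 5L)    # max.group wins
only_max <- resolve_grouping()                              # default max.group
fixed    <- resolve_grouping(group = 3L, max.group = NULL)  # fixed at 3 groups
```

Because max.group defaults to 5, a fixed group count takes effect only when max.group = NULL is passed explicitly.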

totalTests total number of hypothesis tests evaluated

nullProp estimated proportion of true null hypotheses

opGroup Integer, optimal number of groups

ranksProb probability of the ranks given the mean filter effect, p(rank | ey = mean_filterEffect)

group_wgt Numeric vector of group weights (normalized)

weight Numeric vector of normalized weight for all tests

rejections total number of rejections

rejections_list list of rejected pvalues and the corresponding filter statistics

Mohamad S. Hasan

prob_rank_givenEffect_emp, weight_binary, weight_continuous

library(empOPW)
library(OPWeight)
# generate pvalues and filter statistics
m = 10000
set.seed(123)
filters = runif(m, min = 0, max = 2.5)          # filter statistics
H = rbinom(m, size = 1, prob = 0.1)             # hypothesis true or false
tests = rnorm(m, mean = H * filters)            # Z-score
pvals = 1 - pnorm(tests)                        # pvalue

# general use
results <- empOPW(pvalue = pvals, filter = filters, effectType = "continuous",
                                              method = "BH")

# supply the mean test effect externally
library(qvalue)
nullProp = qvalue(p = pvals, pi0.method = "bootstrap")$pi0
m0 = ceiling(nullProp*m)
m1 = m - m0

et = mean(sort(tests, decreasing = TRUE)[seq_len(m1)])  # seq_len() is safe if m1 is 0
results2 <- empOPW(pvalue = pvals, filter = filters, mean_testEffect = et,
               tail = 2, effectType = "continuous", method = "BH")

# supply the ranks probability externally
grp = 5
probs = prob_rank_givenEffect_emp(pvalue = pvals, filter = filters, group = grp,
                               h_breaks = 101, effectType = "continuous")
results3 <- empOPW(pvalue = pvals, filter = filters, ranksProb = probs,
                 effectType = "continuous", tail = 2, method = "BH")

# supply weight externally
wgt <- weight_continuous(alpha = .05, et = et, m = grp, ranksProb = probs)
results4 <- empOPW(pvalue = pvals, filter = filters, weight = wgt,
                        effectType = "continuous", alpha = .05, method = "BH")