ihw.default: ihw: Main function for Independent Hypothesis Weighting

View source: R/ihw_convex.R

ihw.default R Documentation

ihw: Main function for Independent Hypothesis Weighting

Description

Given a vector of p-values, a vector of covariates that are independent of the p-values under the null hypothesis, and a nominal significance level alpha, IHW learns multiple testing weights and then applies the weighted Benjamini-Hochberg (or Bonferroni) procedure.
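To make the last step concrete, here is a minimal base-R sketch of the weighted Benjamini-Hochberg procedure for weights that are already given. It is not IHW's internal code and does not learn the weights; the helper name `weighted_bh` and the toy inputs are assumptions made for illustration.

```r
# Weighted Benjamini-Hochberg with fixed, user-supplied weights.
# `weighted_bh` is a hypothetical helper, not a function exported by IHW.
weighted_bh <- function(pvalues, weights, alpha) {
  stopifnot(length(pvalues) == length(weights),
            all(weights >= 0), sum(weights) > 0)
  # Rescale so the weights average to 1, which keeps the error budget fixed.
  weights <- weights / mean(weights)
  # A hypothesis with weight 0 can never be rejected: its weighted p-value is 1.
  weighted_p <- ifelse(weights > 0, pmin(pvalues / weights, 1), 1)
  # Ordinary BH applied to the weighted p-values.
  adj <- p.adjust(weighted_p, method = "BH")
  list(adj_pvalues = adj, rejected = adj <= alpha)
}

p <- c(0.001, 0.02, 0.03, 0.5)
w <- c(2, 1, 1, 0)
res <- weighted_bh(p, w, alpha = 0.1)
res$rejected
```

A larger weight makes a hypothesis easier to reject, at the cost of the hypotheses whose weights shrink to compensate.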

Usage

## Default S3 method:
ihw(
  pvalues,
  covariates,
  alpha,
  covariate_type = "ordinal",
  nbins = "auto",
  m_groups = NULL,
  folds = NULL,
  quiet = TRUE,
  nfolds = 5L,
  nfolds_internal = 5L,
  nsplits_internal = 1L,
  lambdas = "auto",
  seed = 1L,
  distrib_estimator = "grenander",
  lp_solver = "lpsymphony",
  adjustment_type = "BH",
  null_proportion = FALSE,
  null_proportion_level = 0.5,
  return_internal = FALSE,
  ...
)

## S3 method for class 'formula'
ihw(formula, data = parent.frame(), ...)

Arguments

pvalues

Numeric vector of unadjusted p-values.

covariates

Vector containing the one-dimensional covariate for each test. The covariate must be independent of the p-value under the null hypothesis (H0). Can be numeric or a factor. (If numeric, it will be converted into a factor by binning.)
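As an illustration of what binning a numeric covariate can look like, the following base-R sketch splits hypotheses into groups of (almost) equal size by covariate rank. The helper `bin_covariate` is hypothetical, and the exact binning rule used inside ihw may differ.

```r
# Equal-sized binning of a numeric covariate by rank.
# `bin_covariate` is a hypothetical helper, not part of IHW.
bin_covariate <- function(covariates, nbins) {
  # Break ties by position so every bin receives (almost) the same count.
  r <- rank(covariates, ties.method = "first")
  factor(ceiling(r * nbins / length(covariates)), levels = seq_len(nbins))
}

set.seed(1)
x <- runif(100)
groups <- bin_covariate(x, nbins = 5)
table(groups)
```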

alpha

Numeric, sets the nominal level for FDR control (or for FWER control when adjustment_type = "bonferroni").

covariate_type

"ordinal" or "nominal" (i.e. whether or not the covariates can be sorted in increasing order).

nbins

Integer, number of groups into which p-values will be split based on covariate. Use "auto" for automatic selection of the number of bins. Only applicable when covariates is not a factor.

m_groups

Integer vector of length equal to the number of levels of the covariates (only to be specified when the latter is a factor/categorical). Each entry corresponds to the number of hypotheses to be tested in each group (stratum). This argument needs to be given when the complete vector of p-values is not available, but only p-values below a given threshold, for example because of memory reasons. See the vignette for additional details and an example of how this principle can be applied with numerical covariates.
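A sketch of the thresholding workflow described above, using simulated data. The variable names and the 0.1 cut-off are assumptions for illustration. The call to ihw is guarded so that the base-R part runs without the package installed, and passing the table of counts directly as m_groups is itself an assumption about what the argument accepts.

```r
set.seed(1)
n <- 10000
group  <- factor(sample(c("A", "B", "C"), n, replace = TRUE))
pvalue <- runif(n)                 # illustrative null p-values only

# Count the hypotheses per stratum BEFORE discarding anything.
m_groups <- table(group)

# Keep only the small p-values, e.g. to save memory.
keep <- pvalue <= 0.1
pvalue_small <- pvalue[keep]
group_small  <- group[keep]

# m_groups tells ihw how many tests each stratum originally contained.
if (requireNamespace("IHW", quietly = TRUE)) {
  res <- IHW::ihw(pvalue_small, group_small, alpha = 0.1,
                  covariate_type = "nominal", m_groups = m_groups)
}
```

The counts must come from the full set of hypotheses; counting after the filter would understate the multiplicity.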

folds

Integer vector or NULL. Pre-specify assignment of hypotheses into folds.
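One way to build such a vector is a balanced random assignment, sketched below in base R. The sizes are arbitrary, and the result would be passed as ihw(pvalues, covariates, alpha, folds = folds).

```r
set.seed(42)
n <- 1000          # number of hypotheses (illustrative)
nfolds <- 5L

# Each hypothesis gets a fold label in 1..nfolds; fold sizes are balanced.
folds <- sample(rep_len(seq_len(nfolds), n))
table(folds)
```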

quiet

Logical; if FALSE, many messages are printed during the fitting stages.

nfolds

Integer, number of folds into which the p-values will be split for the pre-validation procedure.

nfolds_internal

Within each fold, a second (nested) layer of cross-validation can be conducted to choose a good regularization parameter. This parameter controls the number of nested folds.

nsplits_internal

Integer, how many times to repeat the nfolds_internal splitting. Can lead to better regularization parameter selection but makes ihw a lot slower.

lambdas

Numeric vector which defines the grid of possible regularization parameters. Use "auto" for automatic selection.

seed

Integer or NULL. Split of hypotheses into folds is done randomly. To have the output of the function be reproducible, the seed of the random number generator is set to this value at the start of the function. Use NULL if you don't want to set the seed.

distrib_estimator

Character ("grenander" or "ECDF"). Only use this if you know what you are doing. ECDF with nfolds > 1 or lp_solver == "lpsymphony" will in general be excessively slow, except for very small problems.

lp_solver

Character ("lpsymphony" or "gurobi"). Internally, IHW solves a sequence of linear programs, which can be solved with either of these solvers.

adjustment_type

Character ("BH" or "bonferroni") depending on whether you want to control FDR or FWER.

null_proportion

Logical; if TRUE (default is FALSE), a modified version of Storey's estimator is used within each bin to estimate the proportion of null hypotheses.

null_proportion_level

Numeric, threshold for Storey's pi0 estimation procedure; defaults to 0.5.
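For orientation, the standard (unmodified) Storey estimator at a given threshold can be sketched in base R as follows. This is not the modified, per-bin version that ihw applies, and `storey_pi0` is a hypothetical helper name.

```r
# Standard Storey estimator of the null proportion pi0.
# `storey_pi0` is a hypothetical helper, not part of IHW.
storey_pi0 <- function(pvalues, level = 0.5) {
  stopifnot(level > 0, level < 1)
  # Null p-values are uniform, so roughly pi0 * m * (1 - level) exceed `level`.
  pi0 <- sum(pvalues > level) / (length(pvalues) * (1 - level))
  min(1, pi0)  # a proportion cannot exceed 1
}

p <- c(0.01, 0.02, 0.03, 0.1, 0.2, 0.3, 0.6, 0.8)
pi0_hat <- storey_pi0(p, level = 0.5)  # 2 / (8 * 0.5) = 0.5
pi0_hat
```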

return_internal

Logical; if TRUE, returns a lower-level representation of the output (only useful for debugging purposes).

...

Arguments passed to internal functions.

formula

Formula of the form pvalue ~ covariate (only a one-dimensional covariate is supported).

data

data.frame from which the variables in formula should be taken
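A short sketch of the formula interface on simulated data. The data frame and its column names are made up for illustration. The ihw call is guarded so that the data preparation runs even when the package is not installed.

```r
set.seed(1)
m <- 5000
df <- data.frame(covariate = runif(m, min = 0, max = 2.5))
H  <- rbinom(m, 1, 0.1)                        # 1 = alternative is true
df$pvalue <- pnorm(rnorm(m, H * df$covariate), lower.tail = FALSE)

if (requireNamespace("IHW", quietly = TRUE)) {
  # alpha is forwarded through ... to the default method.
  res <- IHW::ihw(pvalue ~ covariate, data = df, alpha = 0.1)
  print(IHW::rejections(res))                  # number of rejected hypotheses
}
```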

Value

An ihwResult object.

See Also

ihwResult, plot,ihwResult-method, ihw.DESeqResults

Examples


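library(IHW)

# Remember the current RNG state so the example does not alter it
# (assumes the RNG has already been used in this session).
save.seed <- .Random.seed; set.seed(1)
X   <- runif(20000, min = 0, max = 2.5)  # covariate
H   <- rbinom(20000, 1, 0.1)             # 1 = alternative is true, 0 = null is true
Z   <- rnorm(20000, H * X)               # z-score
.Random.seed <- save.seed                # restore the RNG state
pvalue <- 1 - pnorm(Z)                   # one-sided p-value

ihw_fdr  <- ihw(pvalue, X, .1)                                   # standard IHW for FDR control
ihw_fwer <- ihw(pvalue, X, .1, adjustment_type = "bonferroni")   # FWER control

# Among the rejected hypotheses, TRUE counts the false rejections (true nulls).
table(H[adj_pvalues(ihw_fdr) <= 0.1] == 0)
table(H[adj_pvalues(ihw_fwer) <= 0.1] == 0)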



nignatiadis/IHW documentation built on Aug. 22, 2023, 2:11 p.m.