ihw.default: ihw: Main function for Independent Hypothesis Weighting
In nignatiadis/IHW: Independent Hypothesis Weighting

ihw.default

R Documentation

ihw: Main function for Independent Hypothesis Weighting

Description

Given a vector of p-values, a vector of covariates that are independent of the p-values under the null hypothesis, and a nominal significance level alpha, IHW learns multiple-testing weights and then applies the weighted Benjamini-Hochberg (or Bonferroni) procedure.
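
To make the second step concrete, here is a minimal base R sketch of a weighted Benjamini-Hochberg procedure for weights that are already known. It is not the package's implementation: `weighted_bh` is a hypothetical helper, and the weights are made up rather than learned from covariates as IHW does. It assumes the usual convention that the weights are non-negative and average to 1.

```r
# Hypothetical helper (not part of IHW): weighted BH with fixed, known weights.
weighted_bh <- function(pvalues, weights, alpha) {
  stopifnot(length(pvalues) == length(weights), all(weights >= 0))
  # Assumed convention: weights average to 1, so the error budget is unchanged.
  stopifnot(isTRUE(all.equal(mean(weights), 1)))
  # Divide each p-value by its weight; a weight of 0 rules out rejection.
  weighted_p <- ifelse(weights > 0, pmin(pvalues / weights, 1), 1)
  # Apply the ordinary BH adjustment to the weighted p-values.
  adj <- p.adjust(weighted_p, method = "BH")
  list(adj_pvalues = adj, rejected = adj <= alpha)
}

pvalues <- c(0.001, 0.02, 0.03, 0.6)
weights <- c(2, 1, 0.5, 0.5)          # made-up weights that average to 1
res <- weighted_bh(pvalues, weights, alpha = 0.1)
print(res$adj_pvalues)
print(res$rejected)
```

Hypotheses with a weight above 1 get a more lenient threshold and those with a weight below 1 a stricter one, which is how informative covariates can increase power.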

Usage

## Default S3 method:
ihw(
  pvalues,
  covariates,
  alpha,
  covariate_type = "ordinal",
  nbins = "auto",
  m_groups = NULL,
  folds = NULL,
  quiet = TRUE,
  nfolds = 5L,
  nfolds_internal = 5L,
  nsplits_internal = 1L,
  lambdas = "auto",
  seed = 1L,
  distrib_estimator = "grenander",
  lp_solver = "lpsymphony",
  adjustment_type = "BH",
  null_proportion = FALSE,
  null_proportion_level = 0.5,
  return_internal = FALSE,
  ...
)

## S3 method for class 'formula'
ihw(formula, data = parent.frame(), ...)
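
As an illustration of the two call styles above, the following sketch runs the formula method and the default method on the same simulated data. The data frame `sim` and its column names are made up for this example, and the comparison assumes that both calls use the default `seed`, so the fold assignment should be identical.

```r
library(IHW)

# Simulated data; `sim` and its column names are hypothetical.
set.seed(1)
sim <- data.frame(covariate = runif(20000, min = 0, max = 2.5))
sim$is_alt <- rbinom(20000, 1, 0.1)                    # 1 = alternative is true
sim$pvalue <- 1 - pnorm(rnorm(20000, sim$is_alt * sim$covariate))

# Formula method: p-values on the left-hand side, covariate on the right.
ihw_formula <- ihw(pvalue ~ covariate, data = sim, alpha = 0.1)

# Default method with the same inputs passed as vectors.
ihw_default <- ihw(sim$pvalue, sim$covariate, alpha = 0.1)

# Both calls are expected to agree on the adjusted p-values.
print(all.equal(adj_pvalues(ihw_formula), adj_pvalues(ihw_default)))
```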

Arguments

`pvalues`	Numeric vector of unadjusted p-values.
`covariates`	Vector containing the one-dimensional covariate for each test; the covariate must be independent of the p-value under the null hypothesis. Can be numeric or a factor. (If numeric, it will be converted into a factor by binning.)
`alpha`	Numeric, sets the nominal level for FDR control.
`covariate_type`	"ordinal" or "nominal" (i.e. whether or not the covariates can be sorted in increasing order).
`nbins`	Integer, number of groups into which p-values will be split based on covariate. Use "auto" for automatic selection of the number of bins. Only applicable when covariates is not a factor.
`m_groups`	Integer vector of length equal to the number of levels of the covariates (only to be specified when the latter is a factor/categorical). Each entry corresponds to the number of hypotheses to be tested in each group (stratum). This argument needs to be given when the complete vector of p-values is not available, but only p-values below a given threshold, for example because of memory reasons. See the vignette for additional details and an example of how this principle can be applied with numerical covariates.
`folds`	Integer vector or NULL. Pre-specify assignment of hypotheses into folds.
`quiet`	Logical; if FALSE, many progress messages are printed during the fitting stages.
`nfolds`	Number of folds into which the p-values will be split for the pre-validation procedure.
`nfolds_internal`	Within each fold, a second (nested) layer of cross-validation can be conducted to choose a good regularization parameter. This parameter controls the number of nested folds.
`nsplits_internal`	Integer, how many times to repeat the nfolds_internal splitting. Can lead to better regularization parameter selection but makes ihw a lot slower.
`lambdas`	Numeric vector which defines the grid of possible regularization parameters. Use "auto" for automatic selection.
`seed`	Integer or NULL. Split of hypotheses into folds is done randomly. To have the output of the function be reproducible, the seed of the random number generator is set to this value at the start of the function. Use NULL if you don't want to set the seed.
`distrib_estimator`	Character ("grenander" or "ECDF"). Only use this if you know what you are doing. ECDF with nfolds > 1 or lp_solver == "lpsymphony" will in general be excessively slow, except for very small problems.
`lp_solver`	Character ("lpsymphony" or "gurobi"). Internally, IHW solves a sequence of linear programs, which can be solved with either of these solvers.
`adjustment_type`	Character ("BH" or "bonferroni") depending on whether you want to control FDR or FWER.
`null_proportion`	Logical; if TRUE (default is FALSE), a modified version of Storey's estimator is used within each bin to estimate the proportion of null hypotheses.
`null_proportion_level`	Numeric, threshold for Storey's pi0 estimation procedure; defaults to 0.5.
`return_internal`	Logical; if TRUE, returns a lower-level representation of the output (only useful for debugging purposes).
`...`	Arguments passed to internal functions.
`formula`	`formula`, specified in the form pvalue~covariate (only 1D covariate supported)
`data`	data.frame from which the variables in formula should be taken
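
The `nbins` and `null_proportion` arguments can be pictured with a short base R sketch. It is only an approximation of what the package does internally: it assumes bins of nearly equal size formed by covariate rank, and it uses the classical Storey estimator rather than the modified version mentioned above. `storey_pi0` and all variable names are hypothetical.

```r
set.seed(1)
m <- 10000
covariate <- runif(m, min = 0, max = 2.5)
is_alt    <- rbinom(m, 1, 0.1)                 # 1 = alternative is true
pvalues   <- 1 - pnorm(rnorm(m, is_alt * covariate))

# Assumption: split hypotheses into nbins groups of equal size by covariate rank.
nbins  <- 5
groups <- cut(rank(covariate, ties.method = "first"),
              breaks = seq(0, m, length.out = nbins + 1),
              labels = seq_len(nbins))

# Classical Storey estimator of the null proportion at threshold tau:
# pi0 = #{p > tau} / (n * (1 - tau)), capped at 1.
storey_pi0 <- function(p, tau = 0.5) {
  min(1, sum(p > tau) / (length(p) * (1 - tau)))
}

# One estimate per bin; tau plays the role of null_proportion_level.
pi0_by_bin <- tapply(pvalues, groups, storey_pi0, tau = 0.5)
print(table(groups))
print(pi0_by_bin)
```

In this simulation the signal strength grows with the covariate, so the bins with larger covariate values would be expected to show smaller estimated null proportions.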

Value

An ihwResult object.
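
The returned object is usually inspected through accessor functions rather than directly. The sketch below uses `adj_pvalues()`, which appears in the examples on this page, together with `rejections()`, `weights()` and `as.data.frame()`; the latter three are assumed here from the package's wider documentation and are not described on this page.

```r
library(IHW)

set.seed(1)
covariate <- runif(20000, min = 0, max = 2.5)
is_alt    <- rbinom(20000, 1, 0.1)             # 1 = alternative is true
pvalues   <- 1 - pnorm(rnorm(20000, is_alt * covariate))

res <- ihw(pvalues, covariate, alpha = 0.1)

n_rej  <- rejections(res)       # number of hypotheses rejected at level alpha
adj    <- adj_pvalues(res)      # adjusted p-values, one per hypothesis
w      <- weights(res)          # learned weights, one per hypothesis
res_df <- as.data.frame(res)    # per-hypothesis summary table

print(n_rej)
print(head(res_df))
```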

Examples


library(IHW)

set.seed(1)                           # make the simulation reproducible
X   <- runif(20000, min=0, max=2.5)   # covariate
H   <- rbinom(20000,1,0.1)            # 1 = alternative is true, 0 = null is true
Z   <- rnorm(20000, H*X)              # Z-score
pvalue <- 1-pnorm(Z)                  # one-sided p-value

ihw_fdr <- ihw(pvalue, X, .1)        # Standard IHW for FDR control
ihw_fwer <- ihw(pvalue, X, .1, adjustment_type = "bonferroni")    # FWER control
# Among the rejected hypotheses, TRUE counts true nulls (false rejections)
table(H[adj_pvalues(ihw_fdr) <= 0.1] == 0)
table(H[adj_pvalues(ihw_fwer) <= 0.1] == 0)

nignatiadis/IHW documentation built on Aug. 22, 2023, 2:11 p.m.