closedBY: Closed Benjamini-Yekutieli procedure for simultaneous FDR...

View source: R/cBY_wrapper.R

closedBY    R Documentation

Closed Benjamini-Yekutieli procedure for simultaneous FDR control

Description

Applies the closed testing version of the Benjamini-Yekutieli (BY) procedure. The standard BY procedure controls the false discovery rate (FDR) at level \alpha under arbitrary dependence but only provides a single set of rejections. The closed BY procedure provides simultaneous FDR control: for every set of hypotheses, it determines whether that set can be reported as discoveries while maintaining FDR control at level \alpha, regardless of which other sets are inspected.
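For comparison, the single rejection set of the standard BY procedure can be computed with base R alone. The sketch below uses made-up p-values, applies the BY step-up rule by hand, and checks the result against p.adjust(). It does not call closedBY, and all variable names are illustrative.

```r
# Made-up p-values for illustration only
p <- c(0.0001, 0.0004, 0.0019, 0.0095, 0.0201,
       0.0278, 0.0298, 0.0344, 0.0459, 0.3240)
alpha <- 0.05
m <- length(p)

# BY step-up rule: find the largest k with
#   p_(k) <= k * alpha / (m * c_m),  where c_m = 1 + 1/2 + ... + 1/m
c_m <- sum(1 / seq_len(m))
p_sorted <- sort(p)
passing <- which(p_sorted <= seq_len(m) * alpha / (m * c_m))
k <- if (length(passing) > 0) max(passing) else 0L

# Reject the k hypotheses with the smallest p-values
by_manual <- rank(p, ties.method = "first") <= k

# The same rejection set via BY-adjusted p-values
by_padjust <- p.adjust(p, method = "BY") <= alpha

identical(by_manual, by_padjust)
```

This yields exactly one rejection set. It says nothing about whether other, data-dependent sets could also be reported, which is the question the closed procedure addresses.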

Usage

closedBY(p, set = NULL, alpha = 0.05, approximate = FALSE)

Arguments

p

Numeric vector of p-values, one per hypothesis. Values must lie in [0, 1]; exact zeros are replaced internally by a small positive constant to avoid numerical issues.

set

Optional subsetting vector for p (logical, index, or negative index), indicating which hypotheses belong to the set to be checked for closedBY significance. If NULL (the default), the function instead returns the size of the largest closedBY-significant set.
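As an illustration of the three accepted forms, the sketch below builds the same set in three ways using base R subsetting. It assumes, as the description suggests, that set is interpreted exactly like an index into p; the p-values are made up.

```r
# Made-up p-values for illustration only
p <- c(0.001, 0.20, 0.004, 0.75, 0.03)

set_logical  <- p < 0.01      # logical vector, one entry per hypothesis
set_index    <- c(1, 3)       # positive indices of the selected hypotheses
set_negative <- -c(2, 4, 5)   # negative indices: everything except these

# All three forms pick out hypotheses 1 and 3
idx <- seq_along(p)
same_set <- identical(idx[set_logical], idx[set_index]) &&
  identical(idx[set_index], idx[set_negative])
same_set
```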

alpha

Numeric scalar in [0, 1]. The target FDR level. Defaults to 0.05.

approximate

Logical. If FALSE (the default), uses an exact algorithm that is guaranteed to find the largest closedBY-significant set. If TRUE, uses a faster approximate algorithm based on bisection that may occasionally return a smaller set. The approximate method is recommended for exploratory analyses or large inputs where computation time is a concern.

Details

The closed BY procedure is based on a local e-value e_S for every intersection hypothesis H_S, S \subseteq [m], where m is the number of hypotheses. A non-empty set R of hypotheses is closedBY-significant, and therefore a valid simultaneous rejection, if and only if, for every subset S \subseteq [m], the local e-value e_S exceeds |S \cap R| / (\alpha |R|). This guarantees post-hoc FDR control: you may report any closedBY-significant set as your discovery set without inflating the FDR above \alpha, even if the choice of set was data-driven.
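To make the criterion concrete, here is a brute-force sketch that checks it over all non-empty subsets S for a small number of hypotheses. The function is_significant_bruteforce and its local_e argument are hypothetical names, not part of the package. The example e-value (the average of calibrated p-values) is only a stand-in so the code runs; it is not the local e-value that closedBY uses internally, so results will generally differ from closedBY.

```r
# Brute-force check: R is significant if, for every non-empty S in [m],
# the local e-value of S exceeds |S n R| / (alpha * |R|).
# `local_e` is a hypothetical user-supplied function: index set S -> e-value.
# Cost is 2^m - 1 evaluations, so this is only feasible for small m.
is_significant_bruteforce <- function(R, m, local_e, alpha = 0.05) {
  stopifnot(length(R) > 0, all(R %in% seq_len(m)))
  for (code in seq_len(2^m - 1)) {
    # Decode the bits of `code` into the subset S
    S <- which((code %/% 2^(seq_len(m) - 1)) %% 2 == 1)
    threshold <- length(intersect(S, R)) / (alpha * length(R))
    if (!(local_e(S) > threshold)) return(FALSE)
  }
  TRUE
}

# Stand-in local e-value (an assumption, NOT the closedBY e-value):
# calibrate each p-value with e = 0.5 / sqrt(p), then average over S.
p <- c(1e-6, 1e-6, 0.9)
placeholder_e <- function(S) mean(0.5 / sqrt(p[S]))

sig_12 <- is_significant_bruteforce(c(1, 2), m = 3, local_e = placeholder_e)
sig_3  <- is_significant_bruteforce(3, m = 3, local_e = placeholder_e)
```

With these made-up inputs, the set {1, 2} passes every subset check, while {3} already fails at S = {3}.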

The function has two modes:

  • Set-checking mode (when set is supplied): Returns TRUE if the specified set is closedBY-significant (i.e., can be reported as a valid simultaneous rejection at level \alpha), and FALSE otherwise.

  • Discovery mode (when set = NULL): Returns the size r of the largest closedBY-significant set, i.e., the maximum number of hypotheses that can be reported while maintaining simultaneous FDR control. The set R consisting of the r hypotheses with the smallest p-values is always one such set.

Note that closedBY significance is not a monotone property: a set of size r being closedBY-significant does not imply that all smaller sets are as well. The exact algorithm therefore checks all set sizes, while the approximate algorithm (approximate = TRUE) uses a faster bisection strategy that may occasionally underestimate the largest significant set.
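The following toy sketch shows why a bisection search can underestimate when the property being searched is not monotone. The predicate ok() and both search functions are hypothetical stand-ins written for this illustration; they are not the algorithms implemented in the package.

```r
# Hypothetical predicate: TRUE if "some set of size r is significant".
# Deliberately non-monotone: sizes 1, 2 and 5 pass, sizes 3 and 4 fail.
ok <- function(r) r %in% c(1, 2, 5)

# Exhaustive search: test every size and keep the largest that passes
largest_exhaustive <- function(ok, m) {
  hits <- Filter(ok, seq_len(m))
  if (length(hits) == 0) 0L else max(hits)
}

# Bisection: implicitly assumes that once a size fails, all larger sizes fail
largest_bisection <- function(ok, m) {
  lo <- 0L       # largest size known to pass (0 = none yet)
  hi <- m + 1L   # smallest size treated as failing
  while (hi - lo > 1L) {
    mid <- (lo + hi) %/% 2L
    if (ok(mid)) lo <- mid else hi <- mid
  }
  lo
}

r_exhaustive <- largest_exhaustive(ok, 6)  # 5
r_bisection  <- largest_bisection(ok, 6)   # 2
```

Bisection probes size 3 first, sees a failure, and never looks at size 5. It needs far fewer evaluations, which is the trade-off behind approximate = TRUE.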

Value

  • If set is supplied: a single logical value. TRUE indicates that the specified set is closedBY-significant and can be reported as a simultaneous rejection at FDR level \alpha. FALSE indicates it cannot.

  • If set = NULL: a single non-negative integer r. The r hypotheses with the smallest p-values form a valid simultaneous rejection set. A return value of 0 means no non-empty set can be rejected.
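A small base-R sketch for turning the returned size r into the indices of the rejected hypotheses. The helper top_r_indices is a hypothetical name, and the value of r below is made up rather than computed by closedBY.

```r
# Indices of the r hypotheses with the smallest p-values.
# order() breaks ties by position, so exactly r indices are returned;
# r = 0 gives an empty index vector.
top_r_indices <- function(p, r) order(p)[seq_len(r)]

p <- c(0.20, 0.01, 0.50, 0.03)   # made-up p-values
r <- 2L                          # hypothetical result of discovery mode

rejected_idx <- top_r_indices(p, r)   # hypotheses 2 and 4
none_idx     <- top_r_indices(p, 0L)  # integer(0)
```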

References

Benjamini, Y., & Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependency. The Annals of Statistics, 29(4), 1165–1188.

Goeman, J. J., & Solari, A. (2011). Multiple testing for exploratory research. Statistical Science, 26(4), 584–597.

Xu, Z., Solari, A., Fischer, L., de Heide, R., Ramdas, A., & Goeman, J. (2025). Bringing closure to false discovery rate control: A general principle for multiple testing. arXiv preprint arXiv:2509.02517.

See Also

p.adjust() for standard p-value-based non-simultaneous multiple testing corrections, including the BY procedure (method = "BY"). closedeBH() for the analogous procedure based on e-values.

Examples

set.seed(42)
# 20 null hypotheses (p ~ Uniform(0,1)) and 10 non-nulls (p ~ Beta(0.1, 1), smaller on average)
p <- c(runif(20), rbeta(10, 0.1, 1))

# --- Discovery mode ---
# Find the maximum number of simultaneous rejections at FDR level 5%
r <- closedBY(p, alpha = 0.05)
cat("Largest simultaneous rejection set:", r, "\n")

# The r hypotheses with the smallest p-values form a valid discovery set
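# rank() with ties.method = "first" selects exactly r hypotheses, even with ties or r = 0
discovery_set <- rank(p, ties.method = "first") <= r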
cat("P-values in discovery set:", round(sort(p[discovery_set]), 4), "\n")

# --- Set-checking mode ---
# Check whether a researcher-defined set is a valid simultaneous rejection
candidate_set <- p < 0.01
closedBY(p, set = candidate_set, alpha = 0.05)

# --- Exact vs. approximate ---
r_exact  <- closedBY(p, alpha = 0.05, approximate = FALSE)
r_approx <- closedBY(p, alpha = 0.05, approximate = TRUE)
cat("Exact:", r_exact, "  Approximate:", r_approx, "\n")


eClosure documentation built on April 15, 2026, 5:08 p.m.
