knitr::opts_chunk$set(echo = TRUE)

The Simes inequality

Let $\mathbf{q}=(q_i)_{1 \leq i \leq m}$ be a vector of random variables such that:

- each $q_i$ is marginally uniformly distributed on $[0, 1]$;
- the $q_i$ are positively dependent.

We refer to @sarkar98probability for a formal definition of this form of positive dependence. Here, we simply note that it holds in particular for equi-correlated Gaussian random variables with nonnegative correlation.
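
For intuition, here is one standard way to generate such equi-correlated Gaussian variables, using a shared latent factor (a minimal base-R sketch; the package function `gaussianTestStatistics` used below does not necessarily rely on this exact construction):

rho <- 0.4                                  # equi-correlation coefficient (>= 0)
m <- 5                                      # number of variables
z <- rnorm(1)                               # shared latent factor
eps <- rnorm(m)                             # independent noise terms
x <- sqrt(rho) * z + sqrt(1 - rho) * eps    # cor(x[i], x[j]) = rho for i != j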

Then, letting $(q_{(1)}, \ldots, q_{(m)})$ be the corresponding order statistics, we have:

$$ \mathbb{P} \left( \exists i \in \{1, \ldots, m\} \::\: q_{(i)} \leq \frac{\alpha i}{m}\right) \leq \alpha $$ This inequality is due to @simes86improved. It is sharp under independence of the $q_i$ (that is, the above inequality is an equality). How sharp is it under positive dependence?
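
To make the event in this inequality concrete, the following minimal base-R sketch (independent of the package; the helper name `simes_rejects` is purely illustrative) checks, for a given vector of p-values, whether at least one ordered p-value falls below its Simes threshold $\alpha i / m$:

simes_rejects <- function(p, alpha = 0.05) {
    m <- length(p)
    # TRUE if at least one ordered p-value is below its Simes threshold alpha * i / m
    any(sort(p) <= alpha * seq_len(m) / m)
}
set.seed(42)
simes_rejects(runif(1000))  # for independent uniform p-values, TRUE with probability <= alpha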

Estimating the size of the Simes test

library("sanssouci")
library("matrixStats")
library("knitr")

We write a small function to estimate the size of the Simes test, that is, the probability appearing on the left-hand side of the above inequality.

size <- function(rho, m, B, alpha = 0.05) {
    # B simulated vectors of m equi-correlated Gaussian test statistics under the null
    X0 <- gaussianTestStatistics(m, B, dep = "equi", param = rho)$X0
    # corresponding one-sided p-values (marginally uniform under the null)
    p0 <- pnorm(X0, lower.tail = FALSE)
    # Simes thresholds alpha * i / m, for i = 1, ..., m
    thr <- t_linear(alpha, 1:m, m)
    # sort the p-values within each replication (column)
    p0s <- sanssouci:::colSort(p0)
    # compare each ordered p-value to its Simes threshold
    isBelow <- sweep(p0s, 1, thr, "<")
    nBelow <- colSums(isBelow)
    # proportion of replications with at least one p-value below its threshold
    mean(nBelow > 0)
}
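
For a quick sanity check under independence (small values of $m$ and $B$ chosen here only for speed; the exact value will vary with the random seed):

set.seed(20240212)
size(rho = 0, m = 100, B = 100, alpha = 0.05)  # should be close to the target level 0.05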

Our simulation parameters are set as follows:

m <- 1e3                          # number of hypotheses
rhos <- c(0, 0.1, 0.2, 0.4, 0.8)  # equi-correlation levels
alpha <- 0.2                      # target level
B <- 1000                         # number of replications

We estimate the size of the test, for each value of $\rho$, as the proportion $\hat{\alpha}$ of replications in which the Simes test makes at least one rejection, and the corresponding standard error as $\sqrt{\hat{\alpha}(1-\hat{\alpha})/B}$:

ahat <- sapply(rhos, size, m = m, B = B, alpha = alpha)  # estimated size for each rho
ses <- sqrt(ahat * (1 - ahat) / B)                       # corresponding standard errors
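
As an order of magnitude (using a hypothetical value of $\hat{\alpha}$ purely for illustration), with $B = 1000$ replications an estimated size of $0.05$ comes with a standard error of about $0.007$:

sqrt(0.05 * (1 - 0.05) / 1000)  # ~ 0.0069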

The table below reports these estimates, normalized by the target level $\alpha$:

tb <- rbind(ahat/alpha, ses/alpha)
colnames(tb) <- rhos
rownames(tb) <- c("Achieved level/target level", "Standard error")
knitr::kable(tb, digits = 2)

This table illustrates the sharpness of the Simes test under independence ($\rho=0$), and the conservativeness of this test under positive dependence ($\rho>0$). For example, when $\rho=0.4$, the size of the Simes test is less than half the target level.

This conservativeness is one of the motivations for the development of post hoc inference methods that use randomization to adapt to the dependence; see the corresponding package vignettes.

Session information

sessionInfo()

References


