LOND | R Documentation |
Implements the LOND algorithm for online FDR control, where LOND stands for (significance) Levels based On Number of Discoveries, as presented by Javanmard and Montanari (2015).
LOND(
  d,
  alpha = 0.05,
  betai,
  dep = FALSE,
  random = TRUE,
  display_progress = FALSE,
  date.format = "%Y-%m-%d",
  original = TRUE
)
d |
Either a vector of p-values, or a dataframe with three columns: an identifier (‘id’), date (‘date’) and p-value (‘pval’). If no column of dates is provided, then the p-values are treated as being ordered in sequence, arriving one at a time. |
alpha |
Overall significance level of the FDR procedure, the default is 0.05. |
betai |
Optional vector of |
dep |
Logical. If |
random |
Logical. If |
display_progress |
Logical. If |
date.format |
Optional string giving the format that is used for dates. |
original |
Logical. If |
The function takes as its input either a vector of p-values, or a dataframe with three columns: an identifier (‘id’), date (‘date’) and p-value (‘pval’). The case where p-values arrive in batches corresponds to multiple instances of the same date. If no column of dates is provided, then the p-values are treated as being ordered in sequence, arriving one at a time.
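As a rough illustration of the input layout described above, the following base R sketch builds a small dataframe with the three columns, sorts it chronologically, and identifies which dates form batches. The data and variable names (d_sorted, batch_sizes, batch_dates) are made up for illustration, and this is not how LOND processes its input internally.

```r
# Illustrative data frame in the three-column layout described above.
d <- data.frame(
  id   = c("h1", "h2", "h3", "h4"),
  date = as.Date(c("2020-01-02", "2020-01-01", "2020-01-02", "2020-01-03")),
  pval = c(0.01, 0.20, 0.03, 0.50)
)

# Sort chronologically; order() leaves tied dates in their original order.
d_sorted <- d[order(d$date), ]

# Dates that occur more than once correspond to batches of p-values.
batch_sizes <- table(d_sorted$date)
batch_dates <- names(batch_sizes)[batch_sizes > 1]
```

Here "2020-01-02" appears twice, so the two p-values with that date would be regarded as one batch.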
The LOND algorithm controls the FDR for independent p-values (see below for the modification for dependent p-values). Given an overall significance level \alpha, we choose a sequence of non-negative numbers \beta_i such that they sum to \alpha. The values of the adjusted significance thresholds \alpha_i are chosen as follows:
\alpha_i = (D(i-1) + 1) \beta_i
where D(n) denotes the number of discoveries in the first n hypotheses.
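The threshold rule for independent p-values can be sketched in a few lines of base R. This is a minimal illustration, not the onlineFDR implementation: the function name lond_sketch is hypothetical, and the sequence \beta_i = 6\alpha / (\pi^2 i^2) is an assumed choice that is non-negative and sums to \alpha, not necessarily the package default. The sketch rejects hypothesis i when its p-value is at most \alpha_i.

```r
# Minimal sketch of the LOND rule; lond_sketch is a hypothetical helper,
# not part of onlineFDR.
lond_sketch <- function(pval, alpha = 0.05) {
  n <- length(pval)
  # Assumed beta sequence: non-negative, sums to alpha over i = 1, 2, ...
  beta <- alpha * 6 / (pi^2 * seq_len(n)^2)
  alphai <- numeric(n)
  R <- logical(n)
  D <- 0  # D(i-1): number of discoveries among the first i-1 hypotheses
  for (i in seq_len(n)) {
    alphai[i] <- (D + 1) * beta[i]    # alpha_i = (D(i-1) + 1) * beta_i
    R[i] <- pval[i] <= alphai[i]      # reject if the p-value is small enough
    D <- D + R[i]
  }
  data.frame(pval = pval, alphai = alphai, R = R)
}

res <- lond_sketch(c(1e-4, 0.5, 0.01, 0.8))
```

In this toy run only the first hypothesis is rejected. The first discovery doubles the multiplier on \beta_2 and \beta_3, but the third p-value (0.01) still exceeds its threshold.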
A slightly modified version of LOND with thresholds \alpha_i = max(D(i-1), 1) \beta_i provably controls the FDR under positive dependence (PRDS condition), see Zrnic et al. (2021).
For arbitrarily dependent p-values, LOND controls the FDR if it is modified with \beta_i / H(i) in place of \beta_i, where H(i) is the i-th harmonic number.
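To compare the three threshold rules side by side, the sketch below computes \alpha_i for a single hypothesis given D(i-1) and \beta_i. The function lond_threshold and its variant argument are hypothetical names introduced for illustration. They do not correspond to arguments of LOND.

```r
# Threshold for hypothesis i under the three variants described above.
# lond_threshold and 'variant' are hypothetical, for illustration only.
lond_threshold <- function(i, D_prev, beta_i,
                           variant = c("independent", "prds", "arbitrary")) {
  variant <- match.arg(variant)
  H_i <- sum(1 / seq_len(i))  # i-th harmonic number: 1 + 1/2 + ... + 1/i
  switch(variant,
    independent = (D_prev + 1) * beta_i,        # original LOND
    prds        = max(D_prev, 1) * beta_i,      # positive dependence (PRDS)
    arbitrary   = (D_prev + 1) * beta_i / H_i   # beta_i / H(i) replaces beta_i
  )
}

a_ind  <- lond_threshold(3, D_prev = 2, beta_i = 0.01, variant = "independent")
a_prds <- lond_threshold(3, D_prev = 2, beta_i = 0.01, variant = "prds")
a_arb  <- lond_threshold(2, D_prev = 0, beta_i = 0.03, variant = "arbitrary")
```

With no previous discoveries the independent and PRDS rules agree, since D(i-1) + 1 and max(D(i-1), 1) both equal 1. After the first discovery the PRDS rule is the more conservative of the two. The harmonic correction shrinks every threshold after the first, and the shrinkage grows with i.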
Further details of the LOND algorithm can be found in Javanmard and Montanari (2015).
out |
A dataframe with the original data |
Javanmard, A. and Montanari, A. (2015) On Online Control of False Discovery Rate. arXiv preprint, https://arxiv.org/abs/1502.06197.
Javanmard, A. and Montanari, A. (2018) Online Rules for Control of False Discovery Rate and False Discovery Exceedance. Annals of Statistics, 46(2):526-554.
Zrnic, T., Ramdas, A. and Jordan, M.I. (2021). Asynchronous Online Testing of Multiple Hypotheses. Journal of Machine Learning Research (to appear), https://arxiv.org/abs/1812.05068.
LONDstar presents versions of LOND for asynchronous testing, in contrast to the synchronous setting assumed here, where each test can only start when the previous test has finished.
sample.df <- data.frame(
  id = c('A15432', 'B90969', 'C18705', 'B49731', 'E99902',
         'C38292', 'A30619', 'D46627', 'E29198', 'A41418',
         'D51456', 'C88669', 'E03673', 'A63155', 'B66033'),
  date = as.Date(c(rep('2014-12-01', 3),
                   rep('2015-09-21', 5),
                   rep('2016-05-19', 2),
                   '2016-11-12',
                   rep('2017-03-27', 4))),
  pval = c(2.90e-08, 0.06743, 0.01514, 0.08174, 0.00171,
           3.60e-05, 0.79149, 0.27201, 0.28295, 7.59e-08,
           0.69274, 0.30443, 0.00136, 0.72342, 0.54757))

set.seed(1); LOND(sample.df)
LOND(sample.df, random = FALSE)
set.seed(1); LOND(sample.df, alpha = 0.1)