This package provides a step-down procedure for controlling the False Discovery Proportion (FDP) in a competition-based setup (see Dong et al. (2022)). Such setups include target-decoy competition (TDC) in computational mass spectrometry and the knockoff construction in regression. FDP control (also referred to as FDX) is probabilistic in nature: given a prespecified FDP tolerance $\alpha \in (0, 1)$ and a desired confidence level $1-\gamma$, the procedure reports a list of discoveries for which the FDP is $\leq \alpha$ with probability $\geq 1 - \gamma$.
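The guarantee can be written out explicitly. The notation here is introduced for illustration only and is not taken from the package documentation: $R$ denotes the reported list of discoveries and $V$ the number of false discoveries in that list.

$$
\mathrm{FDP} = \frac{V}{\max(|R|, 1)}, \qquad \Pr\left(\mathrm{FDP} \leq \alpha\right) \geq 1 - \gamma .
$$

By contrast, FDR control bounds only the expected proportion, $\mathbb{E}[\mathrm{FDP}] \leq \alpha$, and says nothing about how often the realised FDP exceeds $\alpha$.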
You can install the development version of stepdownfdp from GitHub with:
```r
# install.packages("devtools")
devtools::install_github("uni-Arya/stepdownfdp")
```
For use in a single-decoy experiment, the user must input a collection of target/decoy scores for each hypothesis, an FDP threshold $\alpha \in (0, 1)$, and a confidence parameter $\gamma \in (0, 1)$, which corresponds to a confidence level of $1 - \gamma$.
First, use `mirandom()` to convert target/decoy scores into winning scores and labels. The argument `scores` must be an $m \times 2$ matrix, where $m$ is the number of hypotheses. Make sure that the first column of `scores` corresponds to target scores.
```r
library(stepdownfdp)
set.seed(123)
target_scores <- rnorm(200, mean = 1.75)
decoy_scores <- rnorm(200, mean = 0)
scores <- cbind(target_scores, decoy_scores)
scores_and_labels <- mirandom(scores)
head(scores_and_labels)
```
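To give a feel for what a single-decoy competition produces, here is a minimal base-R sketch. It is not the implementation of `mirandom()`: the helper name `tdc_sketch`, the `1`/`-1` label encoding, and the coin-flip tie-breaking are all assumptions made for illustration.

```r
# Hypothetical helper, NOT part of stepdownfdp: a plain single-decoy
# competition on an m x 2 matrix whose first column holds target scores.
tdc_sketch <- function(scores) {
  stopifnot(is.matrix(scores), ncol(scores) == 2)
  target <- scores[, 1]
  decoy <- scores[, 2]

  # The winning score is the larger of the target and decoy scores.
  winning <- pmax(target, decoy)

  # Assumed encoding: 1 for a target win, -1 for a decoy win.
  label <- ifelse(target > decoy, 1, -1)

  # Assumed tie-breaking: exact ties are resolved by a fair coin flip.
  ties <- target == decoy
  label[ties] <- sample(c(1, -1), sum(ties), replace = TRUE)

  cbind(winning_score = winning, label = label)
}

toy <- cbind(target = c(2.1, 0.3, 1.5), decoy = c(0.4, 1.2, 1.5))
tdc_sketch(toy)
```

The first toy hypothesis is a target win, the second a decoy win, and the third is a tie whose label depends on the random draw.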
Pass the output of `mirandom()` into `fdp_sd()` to return the winning scores and indices of the hypotheses that were rejected. The output of `fdp_sd()` is in decreasing order of winning scores.
```r
results <- fdp_sd(scores_and_labels, alpha = 0.1, conf = 0.1)
W_scores <- results$discoveries
indices <- results$discoveries_ind
W_scores
indices
```
For use in a multiple-decoy experiment, the user must also input parameters $c \leq \lambda$ of the form $k/(d + 1)$ where $d$ is the number of decoy scores used for each hypothesis and $1 \leq k \leq d$ is an integer. As an example, if we compute $3$ decoy scores for each hypothesis, we may take $c$ and $\lambda$ to be $1/4$, $1/2$, or $3/4$, subject to $c \leq \lambda$. The value of $c$ determines the ranks in which the target score is considered "winning." E.g., if $c = 1/4$, $H_i$ is labelled as a target win whenever its corresponding target score is the highest ranked score among all targets/decoys for that hypothesis. Similarly, $\lambda$ determines when the target score is "losing."
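The constraints on $c$ and $\lambda$ can be sketched in base R. Both helpers below are hypothetical and not part of stepdownfdp. The target-win rule (target rank at most $c(d + 1)$, counting from the highest score) is an extrapolation from the $c = 1/4$ example above and may differ from what `mirandom()` actually does, particularly in how ties are handled.

```r
# Hypothetical helper, NOT part of stepdownfdp: list every admissible
# (c, lambda) pair of the form k / (d + 1) with c <= lambda.
valid_c_lambda <- function(d) {
  stopifnot(d >= 1, d == round(d))
  grid <- (1:d) / (d + 1)
  pairs <- expand.grid(c = grid, lambda = grid)
  pairs[pairs$c <= pairs$lambda, ]
}

# Hypothetical helper: flag target wins under the ASSUMED rule that the
# target (first column) wins when its rank, with 1 = highest score,
# is at most c * (d + 1). Ties are broken at random (an assumption).
target_wins <- function(scores, c) {
  d <- ncol(scores) - 1
  target_rank <- apply(scores, 1, function(row) {
    rank(-row, ties.method = "random")[1]
  })
  target_rank <= round(c * (d + 1))
}

valid_c_lambda(3)

toy <- rbind(c(5, 1, 2, 3), c(2, 1, 5, 3))
target_wins(toy, c = 0.25)
```

With $d = 3$ there are six admissible pairs. In the toy matrix the first target is the top score in its row, while the second ranks third, so it counts as a win under this rule only when $c = 3/4$.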
The argument `scores` of `mirandom()` must now be an $m \times (d + 1)$ matrix, where $m$ is the number of hypotheses and $d$ is the number of decoy scores for each. As in the single-decoy case, the first column must consist of target scores.
```r
library(stepdownfdp)
set.seed(123)
target_scores <- rnorm(200, mean = 1.75)
decoy_scores <- matrix(rnorm(600, mean = 0), ncol = 3)
scores <- cbind(target_scores, decoy_scores)
scores_and_labels <- mirandom(scores, c = 0.25, lambda = 0.75)
head(scores_and_labels)
```
Pass the output into `fdp_sd()`, making sure to specify the same $c$ and $\lambda$ values used in `mirandom()`.
```r
results <- fdp_sd(scores_and_labels, alpha = 0.1, conf = 0.1, c = 0.25, lambda = 0.75)
W_scores <- results$discoveries
indices <- results$discoveries_ind
W_scores
indices
```
The user can also invoke the randomised version of both the single-decoy and multiple-decoy procedures by passing `procedure = "coinflip"` into `fdp_sd()`.
```r
results <- fdp_sd(scores_and_labels, alpha = 0.1, conf = 0.1, c = 0.25, lambda = 0.75,
                  procedure = "coinflip")
W_scores <- results$discoveries
indices <- results$discoveries_ind
W_scores
indices
```