knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)

stepdownfdp

This package provides a step-down procedure for controlling the False Discovery Proportion (FDP) in a competition-based setup (see Dong et al. (2022)). This includes target-decoy competition (TDC) in computational mass spectrometry and the knockoff construction in regression. FDP control (also referred to as FDX) is probabilistic in nature: given a prespecified FDP tolerance $\alpha \in (0, 1)$ and a desired confidence level $1-\gamma$ the procedure reports a list of discoveries for which the FDP is $\leq \alpha$ with probability $\geq 1 - \gamma$.

Installation

You can install the development version of stepdownfdp from GitHub with:

# install.packages("devtools")
devtools::install_github("uni-Arya/stepdownfdp")

Usage

With a single decoy

For use in a single-decoy experiment, the user must input a collection of target/decoy scores for each hypothesis, an FDP threshold $\alpha \in (0, 1)$, and a confidence level $\gamma \in (0, 1)$.

First use mirandom() to convert target/decoy scores into winning scores and labels. The argument scores must be an $m \times 2$ matrix, where $m$ is the number of hypotheses. Make sure that the first column of scores corresponds to target scores.

library(stepdownfdp)
set.seed(123)
target_scores     <- rnorm(200, mean = 1.75)
decoy_scores      <- rnorm(200, mean = 0)
scores            <- cbind(target_scores, decoy_scores)
scores_and_labels <- mirandom(scores)
head(scores_and_labels)

Pass the output of mirandom() into fdp_sd() to return the winning scores and indices of hypotheses that were rejected. The output of fdp_sd() is in decreasing order of winning scores.

results  <- fdp_sd(scores_and_labels, alpha = 0.1, conf = 0.1)
W_scores <- results$discoveries
indices  <- results$discoveries_ind
W_scores
indices

With multiple decoys

For use in a multiple-decoy experiment, the user must also input parameters $c \leq \lambda$ of the form $k/(d + 1)$ where $d$ is the number of decoy scores used for each hypothesis and $1 \leq k \leq d$ is an integer. As an example, if we compute $3$ decoy scores for each hypothesis, we may take $c$ and $\lambda$ to be $1/4$, $1/2$, or $3/4$, subject to $c \leq \lambda$. The value of $c$ determines the ranks in which the target score is considered "winning." E.g., if $c = 1/4$, $H_i$ is labelled as a target win whenever its corresponding target score is the highest ranked score among all targets/decoys for that hypothesis. Similarly, $\lambda$ determines when the target score is "losing."

The argument scores of mirandom() must now be an $m \times (d + 1)$ matrix, where $m$ is the number of hypotheses and $d$ is the number of decoy scores for each. As in the single-decoy case, the first column must consist of target scores.

library(stepdownfdp)
set.seed(123)
target_scores     <- rnorm(200, mean = 1.75)
decoy_scores      <- matrix(rnorm(600, mean = 0), ncol = 3)
scores            <- cbind(target_scores, decoy_scores)
scores_and_labels <- mirandom(scores, c = 0.25, lambda = 0.75)
head(scores_and_labels)

Pass the output into fdp_sd(), making sure to specify the same $c$ and $\lambda$ values used in mirandom().

results  <- fdp_sd(scores_and_labels, alpha = 0.1, conf = 0.1, c = 0.25, lambda = 0.75)
W_scores <- results$discoveries
indices  <- results$discoveries_ind
W_scores
indices

Randomised versions

The user can also invoke the randomised version of both single-decoy and multiple-decoy procedures by passing procedure = "coinflip" into fdp_sd().

results  <- fdp_sd(scores_and_labels, alpha = 0.1, conf = 0.1, c = 0.25, lambda = 0.75, 
                   procedure = "coinflip")
W_scores <- results$discoveries
indices  <- results$discoveries_ind
W_scores
indices


uni-Arya/stepdownfdp documentation built on March 19, 2022, 8:19 a.m.