This package provides a step-down procedure for controlling the False Discovery Proportion (FDP) in a competition-based setup (see Dong et al. (2022)). Such setups include target-decoy competition (TDC) in computational mass spectrometry and the knockoff construction in regression. FDP control (also referred to as FDX) is probabilistic in nature: given a prespecified FDP tolerance α ∈ (0,1) and a desired confidence level 1 − γ, the procedure reports a list of discoveries for which the FDP is ≤ α with probability ≥ 1 − γ.
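The guarantee above can be written compactly. In the sketch below, V (the number of false discoveries) and R (the total number of discoveries) are notation introduced here for illustration; they do not appear elsewhere in this README.

```latex
\mathrm{FDP} = \frac{V}{\max(R, 1)}, \qquad
\Pr\left(\mathrm{FDP} \le \alpha\right) \ge 1 - \gamma
```

For instance, with α = 0.1 and γ = 0.1, at most 10% of the reported discoveries are false with probability at least 90%.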
You can install the development version of stepdownfdp from GitHub with:
# install.packages("devtools")
devtools::install_github("uni-Arya/stepdownfdp")
For use in a single-decoy experiment, the user must input a collection of target/decoy scores for each hypothesis, an FDP threshold α ∈ (0,1), and a confidence parameter γ ∈ (0,1), so that the confidence level is 1 − γ.
First use mirandom() to convert target/decoy scores into winning scores and labels. The argument scores must be an m × 2 matrix, where m is the number of hypotheses. Make sure that the first column of scores corresponds to target scores.
library(stepdownfdp)
set.seed(123)
target_scores <- rnorm(200, mean = 1.75)
decoy_scores <- rnorm(200, mean = 0)
scores <- cbind(target_scores, decoy_scores)
scores_and_labels <- mirandom(scores)
head(scores_and_labels)
#> [,1] [,2]
#> [1,] 2.198810 -1
#> [2,] 1.519823 1
#> [3,] 3.308708 1
#> [4,] 1.820508 1
#> [5,] 1.879288 1
#> [6,] 3.465065 1
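To make the winning scores and labels concrete, here is a minimal base-R sketch of a plain single-decoy competition. It assumes the winning score is the larger of the two scores and that the label is 1 for a target win and -1 for a decoy win, which is consistent with the output shown above. The helper tdc_sketch() is hypothetical and is not part of stepdownfdp; it is not a description of how mirandom() is implemented.

```r
# Hypothetical helper, not part of stepdownfdp: a plain single-decoy competition.
# Assumption: winning score = larger of the two scores; label = 1 for a target
# win and -1 for a decoy win. Ties are not treated specially here and are
# counted as decoy wins.
tdc_sketch <- function(scores) {
  stopifnot(is.matrix(scores), ncol(scores) == 2)
  target <- scores[, 1]  # first column: target scores
  decoy <- scores[, 2]   # second column: decoy scores
  winning <- pmax(target, decoy)
  label <- ifelse(target > decoy, 1, -1)
  unname(cbind(winning, label))
}

# Toy input: the second hypothesis is the only one where the decoy wins.
toy <- cbind(target = c(2.0, 0.5, 3.1), decoy = c(1.0, 1.5, 0.2))
sketch <- tdc_sketch(toy)
sketch
```

In the toy input, only the second row receives the label -1, because its decoy score (1.5) exceeds its target score (0.5).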
Pass the output of mirandom() into fdp_sd() to return the winning scores and indices of hypotheses that were rejected. The output of fdp_sd() is in decreasing order of winning scores.
results <- fdp_sd(scores_and_labels, alpha = 0.1, conf = 0.1)
W_scores <- results$discoveries
indices <- results$discoveries_ind
W_scores
#> [1] 4.991040 3.937333 3.918956 3.878452 3.850109 3.800085 3.747213 3.659104
#> [9] 3.593862 3.536913 3.465065 3.308708 3.282611 3.266471 3.194551 3.118602
#> [17] 3.110652 3.013185 3.003815 2.974082 2.957962 2.898808 2.881337 2.859920
#> [25] 2.846839 2.802711 2.775571 2.755739 2.743504 2.726973 2.672267 2.668997
#> [33] 2.645126 2.628133 2.587787 2.571581
indices
#> [1] 164 97 44 174 149 70 196 139 125 16 6 3 98 56 131 54 95 182 30
#> [20] 11 45 90 136 187 87 161 76 73 91 159 69 110 33 34 27 35
For use in a multiple-decoy experiment, the user must also input parameters c ≤ λ, each of the form k/(d+1), where d is the number of decoy scores used for each hypothesis and 1 ≤ k ≤ d is an integer. For example, if we compute 3 decoy scores for each hypothesis, c and λ may each be 1/4, 1/2, or 3/4, subject to c ≤ λ. The value of c determines the ranks in which the target score is considered “winning”: if c = 1/4, hypothesis H_i is labelled as a target win whenever its target score is the highest-ranked score among all target and decoy scores for that hypothesis. Similarly, λ determines when the target score is “losing.”
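The admissible parameter choices can be listed directly. The base-R sketch below enumerates every pair (c, λ) of the form k/(d+1) with c ≤ λ for d = 3. The variable names grid and pairs are illustrative only, and the snippet does not call the package.

```r
# Enumerate admissible (c, lambda) pairs for d decoys per hypothesis.
d <- 3
grid <- seq_len(d) / (d + 1)   # candidate values k / (d + 1), k = 1, ..., d

# All combinations, then keep only those satisfying c <= lambda.
pairs <- expand.grid(c = grid, lambda = grid)
pairs <- pairs[pairs$c <= pairs$lambda, ]
pairs
```

For d = 3 this leaves six pairs, including c = 0.25 with λ = 0.75.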
The argument scores of mirandom() must now be an m × (d+1) matrix, where m is the number of hypotheses and d is the number of decoy scores for each. As in the single-decoy case, the first column must consist of target scores.
library(stepdownfdp)
set.seed(123)
target_scores <- rnorm(200, mean = 1.75)
decoy_scores <- matrix(rnorm(600, mean = 0), ncol = 3)
scores <- cbind(target_scores, decoy_scores)
scores_and_labels <- mirandom(scores, c = 0.25, lambda = 0.75)
head(scores_and_labels)
#> [,1] [,2]
#> [1,] 2.198810 0
#> [2,] 1.519823 1
#> [3,] 3.308708 1
#> [4,] 1.820508 1
#> [5,] 1.879288 1
#> [6,] 3.465065 1
Pass the output into fdp_sd(), making sure to specify the same c and λ values used in mirandom().
results <- fdp_sd(scores_and_labels, alpha = 0.1, conf = 0.1, c = 0.25, lambda = 0.75)
W_scores <- results$discoveries
indices <- results$discoveries_ind
W_scores
#> [1] 4.991040 3.937333 3.918956 3.878452 3.850109 3.800085 3.747213 3.659104
#> [9] 3.593862 3.536913 3.465065 3.308708 3.282611 3.266471 3.194551 3.118602
#> [17] 3.110652 3.013185 3.003815 2.974082 2.957962 2.898808 2.881337 2.859920
#> [25] 2.846839 2.802711 2.775571 2.755739 2.743504 2.726973 2.672267 2.668997
#> [33] 2.645126 2.628133 2.587787 2.571581 2.537739 2.529965 2.519042 2.504054
#> [41] 2.489948 2.451784 2.451356 2.438640 2.437917 2.394377 2.386570
indices
#> [1] 164 97 44 174 149 70 196 139 125 16 6 3 98 56 131 54 95 182 30
#> [20] 11 45 90 136 187 87 161 76 73 91 159 69 110 33 34 27 35 151 49
#> [39] 152 189 138 141 19 36 148 84 167
The user can also invoke the randomised version of both the single-decoy and multiple-decoy procedures by passing procedure = "coinflip" into fdp_sd().
results <- fdp_sd(scores_and_labels, alpha = 0.1, conf = 0.1, c = 0.25, lambda = 0.75,
                  procedure = "coinflip")
W_scores <- results$discoveries
indices <- results$discoveries_ind
W_scores
#> [1] 4.9910399 3.9373330 3.9189560 3.8784519 3.8501089 3.8000847 3.7472134
#> [8] 3.6591036 3.5938620 3.5369131 3.4650650 3.3087083 3.2826106 3.2664706
#> [15] 3.1945509 3.1186023 3.1106524 3.0131852 3.0038149 2.9740818 2.9579620
#> [22] 2.8988076 2.8813372 2.8599203 2.8468390 2.8027115 2.7755714 2.7557385
#> [29] 2.7435039 2.7269734 2.6722675 2.6689966 2.6451257 2.6281335 2.5877870
#> [36] 2.5715811 2.5377388 2.5299651 2.5190422 2.5040538 2.4899475 2.4517843
#> [43] 2.4513559 2.4386403 2.4379168 2.3943765 2.3865697 2.3579643 2.3507088
#> [50] 2.3346137 2.3129895 2.3039177 2.2983970 2.2694072 2.2668620 2.2478505
#> [57] 2.2109162 2.2015041 2.1982098 2.1865235 2.1851815 2.1764642 2.1507715
#> [64] 2.1352804 2.1296395 2.1189645 2.1098138 2.0822026 2.0817820 2.0604807
#> [71] 2.0535286 2.0511534 2.0482276 2.0068837 2.0033185 1.9887317 1.9853866
#> [78] 1.9659416 1.9644453 1.9313035 1.9033731 1.8792877 1.8738542 1.8676466
#> [85] 1.8606827 1.8556762 1.8445835 1.8347373 1.8279608 1.8205084 1.8152930
#> [92] 1.8030042 1.7912329 1.7877884 1.7557642 1.7214532 1.7159327 1.7071295
#> [99] 1.7049723 1.6944380 1.6880883 1.6786919 1.6666309 1.6111086 1.5528241
#> [106] 1.5420827 1.5320251 1.5295134 1.5198225 1.5142996 1.5137204 1.5033081
#> [113] 1.4939078 1.4878025 1.4696047 1.4549285 1.4440373 1.4253141 1.4024574
#> [120] 1.4003496 1.3793400 1.3775612 1.3697735 1.3695290 1.3471152 1.3331424
#> [127] 1.3043380 1.2916347 1.2833446 1.2662194 1.2594426 1.2507080 1.2190935
#> [134] 1.1941589 1.1249607 1.1220939 1.1092940 1.0980501 1.0619914 1.0552930
#> [141] 1.0407992 1.0395934 1.0086639 0.9650955 0.8844871 0.8546366 0.7983814
#> [148] 0.7416234 0.7008230 0.5292823 0.4629695
indices
#> [1] 164 97 44 174 149 70 196 139 125 16 6 3 98 56 131 54 95 182
#> [19] 30 11 45 90 136 187 87 161 76 73 91 159 69 110 33 34 27 35
#> [37] 151 49 152 189 138 141 19 36 148 84 167 112 197 58 157 37 92 115
#> [55] 169 17 7 132 67 179 88 31 13 82 61 170 12 153 86 178 66 116
#> [73] 166 102 51 93 127 60 191 79 28 5 59 121 14 117 193 188 128 4
#> [91] 172 68 133 177 81 52 173 53 106 114 38 130 50 80 186 42 22 85
#> [109] 2 99 185 103 124 142 156 32 39 192 104 183 83 158 109 40 47 165
#> [127] 10 180 48 168 123 190 146 15 25 94 118 126 75 41 74 101 175 107
#> [145] 184 194 105 154 162 78 150