R/mscore4assayfdr.R

#' Find m_score cutoff to reach a desired FDR on assay level (over the entire
#' OpenSWATH/pyProphet output table)
#'
#' This function estimates the m_score cutoff required in a dataset to reach a
#' given overall assay level FDR.
#' It counts target and decoy assays at high resolution across the m_score
#' cutoffs and reports a useful m_score cutoff - assay FDR pair close to the
#' supplied fdr_target level over the entire dataset. The m_score cutoff is
#' returned by the function and can be used in the context of the filtering
#' functions, e.g.:
#' data.assayFDR1pc <- filter_mscore(data, mscore4assayfdr(data, fdr_target = 0.01))
#' To convert decoy counts into an estimate of the false discovery rate (the
#' fraction of false positives among the targets remaining at a given m_score
#' cutoff), the ratio of false positives to true negatives (decoys), FFT, must
#' be supplied. pyProphet estimates it for each run individually and reports it
#' in the statistics file [Injection_name]_full_stat.csv. As an approximation,
#' the FFTs of multiple runs are averaged and supplied as argument FFT. For
#' further details, see Sections 1.3 and 4.1 of the vignette.
#' For FDR evaluation at the peptide and protein level, see the functions
#' mscore4pepfdr and mscore4protfdr.
#'
#' @param data Annotated OpenSWATH/pyProphet data table. See function
#'   sample_annotation from this package.
#' @param FFT Ratio of false positives to true negatives, q-values from
#'   [Injection_name]_full_stat.csv in pyProphet stats output. As an
#'   approximation, the q-values of multiple runs are averaged and supplied as
#'   argument FFT. Numeric from 0 to 1. Defaults to 1, the most conservative
#'   value (1 Decoy indicates 1 False target).
#' @param fdr_target Assay FDR target, numeric, defaults to 0.01. The largest
#'   m_score cutoff on the tested grid achieving an FDR <= fdr_target is
#'   selected. The FDR is calculated as FDR = (TN*FFT/T), where TN = decoys,
#'   T = targets and FFT is described above.
#' @param mscore.col Column name containing the computed m scores.
#' @return Returns the m_score cutoff selected to arrive at the desired FDR,
#'   or NA if no tested cutoff reaches fdr_target.
#' @author Moritz Heusel
#' @examples
#'  data("OpenSWATH_data", package="SWATH2stats")
#'  data("Study_design", package="SWATH2stats")
#'  data <- sample_annotation(OpenSWATH_data, Study_design)
#'  chosen <- mscore4assayfdr(data, FFT=0.7, fdr_target=0.01)
#' @export
mscore4assayfdr <- function(data, 
                            FFT = 1, 
                            fdr_target = 0.01, 
                            mscore.col = "m_score") {
    mscore.col <- JPP_update(data, mscore.col)

    # generate high resolution mscore levels to assess mscore cutoff for a given
    # fdr_target
    mscore_levels_highres <- 10^-seq(2, 20, 0.05)
    target.assays.highres <- integer(length(mscore_levels_highres))
    decoy.assays.highres <- integer(length(mscore_levels_highres))
    for (i in seq_along(mscore_levels_highres)) {
        # count unique assays (transition groups) passing the current cutoff
        passing <- data[, mscore.col] <= mscore_levels_highres[i]
        target.assays.highres[i] <- length(unique(data[data$decoy == FALSE & passing,
            "transition_group_id"]))
        decoy.assays.highres[i] <- length(unique(data[data$decoy == TRUE & passing,
            "transition_group_id"]))
    }
    assay.fdr.highres <- (decoy.assays.highres/target.assays.highres) * FFT

    # pick the first (largest) mscore cutoff with assay FDR <= fdr_target & report
    chosen_idx <- which(assay.fdr.highres <= fdr_target)[1]
    mscore_chosen <- mscore_levels_highres[chosen_idx]
    assay_fdr_chosen <- assay.fdr.highres[chosen_idx]
    message("Target assay FDR: ", fdr_target, "\n")
    message("Required overall m-score cutoff: ", signif(mscore_chosen, digits = 5),
        "\n", "achieving assay FDR = ", signif(assay_fdr_chosen, digits = 3), "\n")
    return(mscore_chosen)
}
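
The cutoff search above can be illustrated on a small, made-up table. This is a minimal sketch, not part of SWATH2stats: the data frame `toy`, the helper `assay_fdr_at`, the coarse cutoff grid, and the values FFT = 0.5 and fdr_target = 0.15 are all assumptions chosen for illustration. It applies the same formula, FDR = decoys * FFT / targets, at each cutoff and picks the first cutoff at or below the target.

```r
# Toy table, one row per assay. All values are invented for illustration.
toy <- data.frame(
    transition_group_id = c("t1", "t2", "t3", "t4", "t5", "d1", "d2"),
    decoy = c(FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE),
    m_score = c(2e-7, 3e-5, 4e-4, 5e-4, 5e-3, 6e-4, 8e-3),
    stringsAsFactors = FALSE
)

# Hypothetical helper (not a SWATH2stats function): assay FDR at one cutoff,
# computed as (unique decoy assays * FFT) / unique target assays.
assay_fdr_at <- function(data, cutoff, FFT = 1) {
    keep <- data$m_score <= cutoff
    n_target <- length(unique(data$transition_group_id[keep & !data$decoy]))
    n_decoy <- length(unique(data$transition_group_id[keep & data$decoy]))
    FFT * n_decoy / n_target
}

# Coarse grid for readability; the function above uses steps of 0.05 in
# the exponent, from 1e-2 down to 1e-20.
cutoffs <- 10^-seq(2, 6, by = 1)
fdr <- vapply(cutoffs, function(x) assay_fdr_at(toy, x, FFT = 0.5), numeric(1))

# First (largest) cutoff whose estimated assay FDR is <= the target.
fdr_target <- 0.15
chosen <- cutoffs[which(fdr <= fdr_target)[1]]

print(data.frame(cutoff = cutoffs, assay_fdr = fdr))
print(chosen)
```

In this toy case the cutoff 1e-2 keeps 5 targets and 2 decoys (FDR 0.2, above the target), while 1e-3 keeps 4 targets and 1 decoy (FDR 0.125), so 1e-3 is selected.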
peterblattmann/SWATH2stats documentation built on July 2, 2023, 9:42 p.m.