FDRcutoff: Determine optimal cutoff thresholds based on Screen Strength...

FDRcutoff    R Documentation

Determine optimal cutoff thresholds based on Screen Strength analysis.

Description

This function calculates optimal cutoff thresholds for identifying significant hits in high-throughput screening data using Screen Strength (SS) analysis. It evaluates the trade-off between sensitivity and specificity by calculating the ratio of apparent FDR to baseline FDR across different zeta score thresholds.

Usage

FDRcutoff(zetaData, negGene, posGene, nonExpGene, combine = FALSE)

Arguments

zetaData

A data frame containing zeta scores calculated by the Zeta() function. Should have columns 'Zeta_D' and 'Zeta_I' representing decrease and increase direction scores, respectively.

negGene

A data frame or matrix containing negative control gene/siRNA identifiers. The first column should contain gene/siRNA names that match the row names in zetaData.

posGene

A data frame or matrix containing positive control gene/siRNA identifiers. The first column should contain gene/siRNA names that match the row names in zetaData.

nonExpGene

A data frame or matrix containing non-expressed gene/siRNA identifiers. These genes are used to estimate the baseline false discovery rate. The first column should contain gene/siRNA names that match the row names in zetaData.

combine

Logical. Whether to combine decrease and increase direction zeta scores. Default is FALSE. When TRUE, uses the sum of Zeta_D and Zeta_I; when FALSE, analyzes each direction separately.
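The input shapes described above can be illustrated with a small base-R sketch. The gene names, scores, and the `gene` column name are invented for illustration, and the matching and summing shown here are a reading of the argument descriptions, not the package's internal code.

```r
# Toy zeta table: row names are gene/siRNA identifiers (all values invented).
zetaData <- data.frame(
  Zeta_D = c(0.20, 0.80, 0.10),
  Zeta_I = c(0.05, 0.10, 0.60),
  row.names = c("geneA", "geneB", "geneC")
)

# Control tables carry identifiers in their first column.
nonExpGene <- data.frame(gene = "geneC")

# Identifiers are matched against the row names of zetaData.
is_non_exp <- rownames(zetaData) %in% nonExpGene[, 1]

# With combine = TRUE, the two directions are described as being summed.
combined <- zetaData$Zeta_D + zetaData$Zeta_I
print(data.frame(gene = rownames(zetaData), combined, is_non_exp))
```

Identifiers that do not appear among the row names of the zeta table cannot be matched, so it is worth checking the overlap before calling the function.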

Details

The function performs the following analysis:

  1. Categorizes genes into types: "Gene" (test genes), "Positive" (positive controls), "NS_mix" (negative controls), and "non_exp" (non-expressed genes)

  2. Calculates baseline FDR (bFDR) as the proportion of non-expressed genes in the entire dataset

  3. For each zeta score threshold, calculates apparent FDR (aFDR) as the proportion of non-expressed genes among hits

  4. Computes Screen Strength: SS = 1 - (aFDR / bFDR)

  5. Generates plots showing zeta score distributions and SS curves

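Higher Screen Strength values indicate better separation between true hits and false positives: SS approaches 1 as aFDR approaches 0, and SS equals 0 when non-expressed genes are as frequent among hits as in the whole dataset (aFDR = bFDR). Users can select thresholds based on the desired sensitivity/specificity trade-off.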
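The calculation in steps 2 to 4 can be sketched in base R on toy data. The scores and labels below are invented, and the rule that a hit is a score at or above the cutoff is an assumption made for illustration; the package may define hits differently.

```r
# Toy data: zeta scores and a flag marking non-expressed genes (values invented).
zeta    <- c(0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05)
non_exp <- c(FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE)

# Baseline FDR: share of non-expressed genes in the whole dataset.
bFDR <- mean(non_exp)

screen_strength <- function(cutoff) {
  hits <- zeta >= cutoff                    # assumption: hit = score at or above cutoff
  aFDR <- sum(non_exp & hits) / sum(hits)   # share of non-expressed genes among hits
  c(Cut_Off = cutoff, aFDR = aFDR, SS = 1 - aFDR / bFDR,
    TotalHits = sum(hits), Num_nonExp = sum(non_exp & hits))
}

# One row per threshold, mirroring the numeric columns of FDR_cutOff.
ss_table <- as.data.frame(t(sapply(c(0.6, 0.3, 0.05), screen_strength)))
print(ss_table)
```

In this toy case the strictest cutoff contains no non-expressed genes (SS = 1), while the loosest cutoff admits every gene, so aFDR equals bFDR and SS falls to 0.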

Value

A list containing:

FDR_cutOff

A data frame with 6 columns:

  • Cut_Off: Zeta score threshold

  • aFDR: Apparent false discovery rate at this threshold

  • SS: Screen Strength = 1 - (aFDR / bFDR)

  • TotalHits: Total number of hits at this threshold

  • Num_nonExp: Number of non-expressed genes among hits

  • Type: Direction ("Decrease", "Increase", or "Combine")

plotList

A list with two ggplot objects:

  • Zeta_type: Jitter plots showing zeta score distributions by gene type

  • SS_cutOff: Screen Strength curves showing SS vs. zeta score threshold
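One possible way to use the FDR_cutOff table is to pick, per direction, the most permissive threshold that still reaches a target Screen Strength. The sketch below uses a hand-made table with only the Cut_Off, SS, and Type columns; the values and the 0.8 target are arbitrary assumptions, not recommendations from the package.

```r
# Mock table with a subset of the FDR_cutOff columns (values invented).
fdr_tab <- data.frame(
  Cut_Off = c(0.1, 0.2, 0.3, 0.1, 0.2, 0.3),
  SS      = c(0.40, 0.75, 0.92, 0.30, 0.85, 0.95),
  Type    = rep(c("Decrease", "Increase"), each = 3)
)

target_ss <- 0.8   # arbitrary target chosen for illustration

# Keep thresholds that reach the target, then take the smallest one per direction.
passing <- fdr_tab[fdr_tab$SS >= target_ss, ]
chosen  <- aggregate(Cut_Off ~ Type, data = passing, FUN = min)
print(chosen)
```

With a real result, the same filtering would be applied to the FDR_cutOff element of the returned list, and the SS_cutOff plot can be used to check the choice visually.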

Author(s)

Yajing Hao, Shuyang Zhang, Junhui Li, Guofeng Zhao, Xiang-Dong Fu

Examples

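library(ZetaSuite)

# Load the example datasets
data(nonExpGene)
data(negGene)
data(posGene)
data(ZseqList)
data(countMat)

# Compute Z-scores, then zeta scores
ZscoreVal <- Zscore(countMat, negGene)
zetaData <- Zeta(ZscoreVal, ZseqList, SVM = FALSE)

# Screen Strength cutoffs using the combined (Zeta_D + Zeta_I) score
cutoffval <- FDRcutoff(zetaData, negGene, posGene, nonExpGene, combine = TRUE)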


ZetaSuite documentation built on Nov. 5, 2025, 6:37 p.m.