GetFdrBasedThreshLimit: GetFdrBasedThreshLimit Function (FDREst)
In AlexisHardy/DNAModAnnot: Toolbox for DNA Modifications filtering and annotation

GetFdrBasedThreshLimit

R Documentation

Description

For each dataframe in the list provided, return the threshold on the parameter that corresponds to the requested false discovery rate (fdr), based on the fdr estimations by threshold.

Usage

GetFdrBasedThreshLimit(
  listFdrEstByThr,
  nFdrPropForFilt = 0.05,
  lUseBestThrIfNoFdrThr = TRUE
)

Arguments

listFdrEstByThr

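A list composed of one or more dataframes (for example one per motif, as returned by GetFdrEstListByThresh). Each dataframe contains the following columns: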

fdr: The false discovery rate estimated for this threshold.
threshold: The threshold on the parameter.
fdr_cummin: The minimum false discovery rate estimated for this threshold and less stringent thresholds (adjusted false discovery rate).
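
The relation between the fdr and fdr_cummin columns can be illustrated with a small sketch. The toy values below are invented for illustration and are not produced by DNAModAnnot; the sketch also assumes that larger threshold values are more stringent, so that "less stringent thresholds" are the rows with smaller values.

```r
# Hypothetical toy dataframe (values invented for illustration only)
dfFdrToy <- data.frame(
  threshold = c(20, 21, 22, 23, 24),
  fdr       = c(0.40, 0.10, 0.12, 0.04, 0.06)
)

# Assumption: larger thresholds are more stringent, so sort by increasing threshold
dfFdrToy <- dfFdrToy[order(dfFdrToy$threshold), ]

# Running minimum of the fdr over this threshold and all less stringent ones
dfFdrToy$fdr_cummin <- cummin(dfFdrToy$fdr)

print(dfFdrToy)
```

With this construction, fdr_cummin never increases as the threshold becomes more stringent, even when the raw fdr estimate fluctuates.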

nFdrPropForFilt

A number indicating the false discovery rate to be used for filtering: the threshold retained is the one whose estimated fdr is the closest below this number. Defaults to 0.05 (an fdr of 5%).

lUseBestThrIfNoFdrThr

For fdr calculation by motif: defines what is returned if no fdr-associated threshold can be retrieved for one motif. If TRUE, return the strongest threshold identified for any other motif. If FALSE, return the maximum value of the threshold, so that every modification in that motif is filtered out automatically. Defaults to TRUE.
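
The selection and fallback behaviour described by these arguments can be sketched as below. This is only an illustration of the documented logic, not the package's implementation: the function SketchThreshLimit, its argument names, the toy data, the use of fdr_cummin with "<=", and the named numeric vector it returns are all assumptions. It also assumes that larger threshold values are more stringent.

```r
# Hypothetical helper illustrating the documented behaviour (not part of DNAModAnnot)
SketchThreshLimit <- function(listFdr, nFdrProp = 0.05, lUseBest = TRUE) {
  # Per dataframe: least stringent threshold whose adjusted fdr is at or below the target
  vThr <- vapply(listFdr, function(df) {
    vOk <- df$threshold[df$fdr_cummin <= nFdrProp]
    if (length(vOk) == 0) NA_real_ else min(vOk)
  }, numeric(1))

  lMissing <- is.na(vThr)
  if (any(lMissing)) {
    if (lUseBest && !all(lMissing)) {
      # Borrow the most stringent threshold found for the other dataframes
      vThr[lMissing] <- max(vThr[!lMissing])
    } else {
      # Use each dataframe's own maximum threshold instead
      vMaxThr <- vapply(listFdr, function(df) max(df$threshold), numeric(1))
      vThr[lMissing] <- vMaxThr[lMissing]
    }
  }
  vThr
}

# Toy input (invented values): "AC" never reaches an fdr of 5%
listToy <- list(
  AG = data.frame(threshold = c(20, 30, 40), fdr_cummin = c(0.30, 0.04, 0.01)),
  AT = data.frame(threshold = c(20, 30, 40), fdr_cummin = c(0.50, 0.20, 0.02)),
  AC = data.frame(threshold = c(20, 30, 40, 50), fdr_cummin = c(0.90, 0.60, 0.40, 0.30))
)

SketchThreshLimit(listToy)                    # AG = 30, AT = 40, AC = 40 (borrowed from AT)
SketchThreshLimit(listToy, lUseBest = FALSE)  # AG = 30, AT = 40, AC = 50 (own maximum)
```

In this sketch, borrowing a threshold from another motif keeps the motif in the analysis with a strict cut-off, whereas falling back to the motif's own maximum effectively discards its modifications.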

Examples

library(Biostrings)
library(DNAModAnnot)
myGenome <- readDNAStringSet(system.file(
  package = "DNAModAnnot", "extdata",
  "ptetraurelia_mac_51_sca171819.fa"
))
myGrangesGenome <- GetGenomeGRanges(myGenome)

# Preparing a gposPacBioCSV dataset with sequences
myGposPacBioCSV <-
  ImportPacBioCSV(
    cPacBioCSVPath = system.file(
      package = "DNAModAnnot", "extdata",
      "ptetraurelia.bases.sca171819.csv"
    ),
    cSelectColumnsToExtract = c(
      "refName", "tpl", "strand", "base",
      "score", "ipdRatio", "coverage"
    ),
    lKeepExtraColumnsInGPos = TRUE, lSortGPos = TRUE,
    cContigToBeAnalyzed = names(myGenome)
  )
myGrangesBaseCSV <- as(myGposPacBioCSV[myGposPacBioCSV$base == "A"], "GRanges")
myGrangesBaseCSVWithSeq <- GetGRangesWindowSeqandParam(
  grangesData = myGrangesBaseCSV,
  grangesGenome = myGrangesGenome,
  dnastringsetGenome = myGenome,
  nUpstreamBpToAdd = 0,
  nDownstreamBpToAdd = 1
)

# FDR estimation by motif associated to modifications
myFdr_score_per_motif_list <-
  GetFdrEstListByThresh(
    grangesDataWithSeq = myGrangesBaseCSVWithSeq,
    cNameParamToTest = "score",
    nRoundDigits = 1,
    cModMotifsAsForeground = c("AG", "AT")
  )

# Retrieving, for each motif, the threshold on the score for an fdr of 5%
GetFdrBasedThreshLimit(
  listFdrEstByThr = myFdr_score_per_motif_list,
  nFdrPropForFilt = 0.05,
  lUseBestThrIfNoFdrThr = TRUE
)

AlexisHardy/DNAModAnnot documentation built on Feb. 27, 2023, 12:03 a.m.