FiltParam: FiltParam Function (Filter)

View source: R/Filter.R

FiltParamR Documentation

FiltParam Function (Filter)

Description

Filter out modifications which have a chosen parameter that do not reach criterias of selection.

Usage

FiltParam(
  grangesModPos,
  cParamNameForFilter,
  nFiltParamLoBoundaries = NULL,
  nFiltParamUpBoundaries = NULL,
  cFiltParamBoundariesToInclude = NULL,
  listMeanParamByContig = NULL,
  nContigFiltParamLoBound = NULL,
  nContigFiltParamUpBound = NULL
)

Arguments

grangesModPos

A GRanges object containing Modifications positions data to be filtered.

cParamNameForFilter

A character vector giving the name of the parameter to be filtered. Must correspond to the name of one column in the object provided with grangesModPos.

nFiltParamLoBoundaries

A numeric vector returning the lower boundaries of intervals. Must have the same length as "nFiltParamUpBoundaries". Defaults to NULL.

If this parameter is provided, the function will remove modifications which have values of the given parameter that are not included in the intervals provided with "nFiltParamLoBoundaries" and "nFiltParamUpBoundaries".

nFiltParamUpBoundaries

A numeric vector returning the upper boundaries of intervals. Must have the same length as "nFiltParamLoBoundaries". Defaults to NULL.

If this parameter is provided, the function will remove modifications which have values of the given parameter that are not included in the intervals provided with "nFiltParamLoBoundaries" and "nFiltParamUpBoundaries".

cFiltParamBoundariesToInclude

A character vector describing which interval boundaries must be included in the intervals provided. Can be "upperOnly" (only upper boundaries), "lowerOnly" (only lower boundaries), "both" (both upper and lower boundaries) or "none" (do not include upper and lower boundaries). If NULL, both upper and lower boundaries will be included (= "both"). Defaults to NULL. cFiltParamBoundariesToInclude = NULL #can be "upperOnly","lowerOnly","both", "none' (NULL = "both" for all)

listMeanParamByContig

List containing, for each strand, the mean of a given parameter for each contig. This list must be composed of 2 dataframes (one by strand) called f_strand and r_strand. In each dataframe, "refName" column returning names of contigs and "mean_"[parameter name] column returning the mean of the given parameter. If not NULL, remove contigs that are too far away from the mean of the Parameter of all contigs (which are not included in the interval centered on the mean) in the list provided. Defaults to NULL.

nContigFiltParamLoBound

A numeric value to be removed from the mean of the given parameter of all contigs (calculates the lower bound of the interval centered on the mean). Defaults to NULL.

nContigFiltParamUpBound

A numeric value to be added to the mean of the given parameter of all contigs (calculates the upper bound of the interval centered on the mean). Defaults to NULL.

Value

A filtered grangesModPos object.

Examples

# loading genome
myGenome <- Biostrings::readDNAStringSet(system.file(
  package = "DNAModAnnot", "extdata",
  "ptetraurelia_mac_51_sca171819.fa"
))

# Preparing a grangesPacBioGFF dataset
myGrangesPacBioGFF <-
  ImportPacBioGFF(
    cPacBioGFFPath = system.file(
      package = "DNAModAnnot", "extdata",
      "ptetraurelia.modifications.sca171819.gff"
    ),
    cNameModToExtract = "m6A",
    cModNameInOutput = "6mA",
    cContigToBeAnalyzed = c("scaffold51_17", "scaffold51_18", "scaffold51_19")
  )

# Preparing ParamByStrand List
myMean_fra_list <- GetMeanParamByContig(
  grangesData = myGrangesPacBioGFF,
  dnastringsetGenome = myGenome,
  cParamName = "frac"
)

# Filtering Out Modif with Frac 20% above/below the mean Frac
myGRangesPacBioGFF_filt <- FiltParam(
  grangesModPos = myGrangesPacBioGFF,
  cParamNameForFilter = "frac",
  listMeanParamByContig = myMean_fra_list,
  nContigFiltParamLoBound = 0.2,
  nContigFiltParamUpBound = 0.2
)
myGRangesPacBioGFF_filt

# Filtering Out Modif with Frac below 5%
myGRangesPacBioGFF_filt <- FiltParam(
  grangesModPos = myGrangesPacBioGFF,
  cParamNameForFilter = "frac",
  listMeanParamByContig = myMean_fra_list,
  nFiltParamLoBoundaries = 0.05,
  nFiltParamUpBoundaries = 1.00
)
myGRangesPacBioGFF_filt

# Keeping Modif with Frac between 40% and 60%;
# or between 90% and 100%  (adapted to genome of diploid organisms)
myGRangesPacBioGFF_filt <- FiltParam(
  grangesModPos = myGrangesPacBioGFF,
  cParamNameForFilter = "frac",
  nFiltParamLoBoundaries = c(0.40, 0.90),
  nFiltParamUpBoundaries = c(0.60, 1.00)
)

AlexisHardy/DNAModAnnot documentation built on Feb. 27, 2023, 12:03 a.m.