DrawModBasePropDistFromFeature: DrawModBasePropDistFromFeature Function (ModAnnot)
In AlexisHardy/DNAModAnnot: Toolbox for DNA Modifications filtering and annotation

Description

Return, in a list of dataframes, the counts (or proportions) of "Mod" (or "Base") positions by distance from feature positions. If the input list contains 4 GRanges, the output contains 4 dataframes ("Mod" vs featureStart; "Mod" vs featureEnd; "Base" vs featureStart; "Base" vs featureEnd) instead of 2 dataframes ("Mod" vs featureStart; "Base" vs featureStart). "Mod" is the modified base; "Base" is the base letter of the modified base. For example, Mod="6mA" corresponds to Base="A", and Mod="5mC" corresponds to Base="C".

Usage

DrawModBasePropDistFromFeature(
  listModCountsDistDataframe,
  listBaseCountsDistDataframe,
  cFeaturePosNames = c("Start", "End"),
  cBaseMotif,
  cModMotif,
  nDensityBaseMotif = 50
)

Arguments

`listModCountsDistDataframe`	A list of 1 or 2 dataframes containing "Mod" counts by distance to feature positions. If 1 dataframe is provided, 1 position is plotted (featureStart). If 2 dataframes are provided, 2 positions are plotted (featureStart; featureEnd). Must have the same length as the list passed to "listBaseCountsDistDataframe".
`listBaseCountsDistDataframe`	A list of 1 or 2 dataframes containing "Base" counts by distance to feature positions. If 1 dataframe is provided, 1 position is plotted (featureStart). If 2 dataframes are provided, 2 positions are plotted (featureStart; featureEnd). Must have the same length as the list passed to "listModCountsDistDataframe".
`cFeaturePosNames`	A character vector giving the names of the feature positions provided. Defaults to c("Start","End"). If each input list contains 1 dataframe (2 dataframes in total), the first element of the vector names the feature position. If each input list contains 2 dataframes (4 dataframes in total), the first and second elements name the two feature borders, in that order.
`cBaseMotif`	The name of the motif with the base letter of the modified base.
`cModMotif`	The name of the motif with the modification in the output.
`nDensityBaseMotif`	Numeric vector giving the density of the polygon drawn from the "Base" counts by distance to feature positions. Defaults to 50.
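
The two list arguments are described above as holding 1 or 2 dataframes each and as having the same length. A minimal sketch of checking those constraints before plotting is shown below. `CheckDistInputLists` is a hypothetical helper, not part of DNAModAnnot, and the toy column names are placeholders rather than the format actually returned by GetDistFromFeaturePos.

```r
# Hypothetical helper (not part of DNAModAnnot): checks the documented
# constraints on the two input lists before they are passed to the plot call.
CheckDistInputLists <- function(listMod, listBase) {
  stopifnot(
    is.list(listMod), is.list(listBase),
    length(listMod) %in% c(1L, 2L),        # 1 or 2 dataframes per list
    length(listMod) == length(listBase),   # both lists must match in length
    all(vapply(listMod, is.data.frame, logical(1))),
    all(vapply(listBase, is.data.frame, logical(1)))
  )
  invisible(TRUE)
}

# Toy inputs; "dist" and "n" are placeholder column names (assumption)
toyMod <- list(data.frame(dist = -1:1, n = c(2, 5, 3)))
toyBase <- list(data.frame(dist = -1:1, n = c(10, 12, 11)))
CheckDistInputLists(toyMod, toyBase)
```

Failing early this way gives a clearer error than a mismatch surfacing partway through plotting, but the check covers only the list shapes, not the dataframe contents.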

Examples

library(DNAModAnnot)

# loading genome
myGenome <- Biostrings::readDNAStringSet(system.file(
  package = "DNAModAnnot", "extdata",
  "ptetraurelia_mac_51_sca171819.fa"
))

# loading annotation
library(rtracklayer)
myAnnotations <- readGFFAsGRanges(system.file(
  package = "DNAModAnnot", "extdata",
  "ptetraurelia_mac_51_annotation_v2.0_sca171819.gff3"
))

# Preparing a grangesPacBioGFF and a grangesPacBioCSV datasets
myGrangesPacBioGFF <-
  ImportPacBioGFF(
    cPacBioGFFPath = system.file(
      package = "DNAModAnnot", "extdata",
      "ptetraurelia.modifications.sca171819.gff"
    ),
    cNameModToExtract = "m6A",
    cModNameInOutput = "6mA",
    cContigToBeAnalyzed = names(myGenome)
  )
myGposPacBioCSV <-
  ImportPacBioCSV(
    cPacBioCSVPath = system.file(
      package = "DNAModAnnot", "extdata",
      "ptetraurelia.bases.sca171819.csv"
    ),
    cSelectColumnsToExtract = c(
      "refName", "tpl", "strand", "base",
      "score", "ipdRatio", "coverage"
    ),
    lKeepExtraColumnsInGPos = TRUE, lSortGPos = TRUE,
    cContigToBeAnalyzed = names(myGenome)
  )
myGposPacBioCSV <- myGposPacBioCSV[myGposPacBioCSV$base == "A"]

# Retrieve, in a list, dataframes of ModBase counts per Distance values from feature positions
myModDistCountsList <- GetDistFromFeaturePos(
  grangesAnnotations = myAnnotations,
  cSelectFeature = "gene",
  grangesData = myGrangesPacBioGFF,
  lGetGRangesInsteadOfListCounts = FALSE,
  lGetPropInsteadOfCounts = TRUE,
  cWhichStrandVsFeaturePos = "both", nWindowSizeAroundFeaturePos = 600,
  lAddCorrectedDistFrom5pTo3p = TRUE,
  cFeaturePosNames = c("TSS", "TTS")
)
myBaseDistCountsList <- GetDistFromFeaturePos(
  grangesAnnotations = myAnnotations,
  cSelectFeature = "gene",
  grangesData = myGposPacBioCSV,
  lGetGRangesInsteadOfListCounts = FALSE,
  lGetPropInsteadOfCounts = TRUE,
  cWhichStrandVsFeaturePos = "both", nWindowSizeAroundFeaturePos = 600,
  lAddCorrectedDistFrom5pTo3p = TRUE,
  cFeaturePosNames = c("TSS", "TTS")
)
# Plot the 6mA and A distributions by distance from the TSS and TTS
DrawModBasePropDistFromFeature(
  listModCountsDistDataframe = myModDistCountsList,
  listBaseCountsDistDataframe = myBaseDistCountsList,
  cFeaturePosNames = c("TSS", "TTS"),
  cBaseMotif = "A",
  cModMotif = "6mA"
)

AlexisHardy/DNAModAnnot documentation built on Feb. 27, 2023, 12:03 a.m.