DrawModBasePropDistFromFeature: DrawModBasePropDistFromFeature Function (ModAnnot)

View source: R/ModAnnot.R

DrawModBasePropDistFromFeature    R Documentation

Description

Returns, as a list of dataframes, the counts (or proportions) of "Mod" (or "Base") positions by distance from feature positions. If the input list contains 4 GRanges, the output contains 4 dataframes ("Mod" vs featureStart; "Mod" vs featureEnd; "Base" vs featureStart; "Base" vs featureEnd) instead of 2 dataframes ("Mod" vs featureStart; "Base" vs featureStart). "Mod" is the modified base; "Base" is the base letter of the modified base. For example, for Mod="6mA", Base="A"; for Mod="5mC", Base="C".

Usage

DrawModBasePropDistFromFeature(
  listModCountsDistDataframe,
  listBaseCountsDistDataframe,
  cFeaturePosNames = c("Start", "End"),
  cBaseMotif,
  cModMotif,
  nDensityBaseMotif = 50
)

Arguments

listModCountsDistDataframe

A list with 1 or 2 dataframe(s) containing "Mod" counts by distance to feature positions:

  • If 1 dataframe is provided in listModCountsDistDataframe, 1 position will be plotted (featureStart).

  • If 2 dataframes are provided in listModCountsDistDataframe, 2 positions will be plotted (featureStart; featureEnd).

Must have the same length as the list provided with "listBaseCountsDistDataframe".

listBaseCountsDistDataframe

A list with 1 or 2 dataframe(s) containing "Base" counts by distance to feature positions:

  • If 1 dataframe is provided in listBaseCountsDistDataframe, 1 position will be plotted (featureStart).

  • If 2 dataframes are provided in listBaseCountsDistDataframe, 2 positions will be plotted (featureStart; featureEnd).

Must have the same length as the list provided with "listModCountsDistDataframe".
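
The constraints on the two list arguments (1 or 2 dataframes each, same length in both) can be checked before plotting. The following is a minimal sketch in base R, not the package's own validation: the helper name checkCountsLists and the toy column names are assumptions made for illustration only.

```r
# Hypothetical helper (not part of DNAModAnnot): check the two input lists
# against the constraints described above.
checkCountsLists <- function(listMod, listBase) {
  stopifnot(is.list(listMod), is.list(listBase))
  if (!length(listMod) %in% c(1, 2)) {
    stop("Expected 1 or 2 dataframes in the 'Mod' list.")
  }
  if (length(listMod) != length(listBase)) {
    stop("The 'Mod' and 'Base' lists must have the same length.")
  }
  # Number of feature positions that would be plotted
  invisible(length(listMod))
}

# Toy inputs; the column names are illustrative assumptions only.
toyMod  <- list(data.frame(dist = -1:1, count = c(2, 5, 3)))
toyBase <- list(data.frame(dist = -1:1, count = c(10, 12, 9)))
nPositions <- checkCountsLists(toyMod, toyBase)
```

Here nPositions is the number of feature positions to plot: 1 for featureStart only, 2 for featureStart and featureEnd.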

cFeaturePosNames

A character vector giving the names of the feature positions provided. Defaults to c("Start", "End").

  • If 2 dataframes are provided in total (1 in listModCountsDistDataframe and 1 in listBaseCountsDistDataframe), the name of the feature position will be the first element of the vector.

  • If 4 dataframes are provided in total (2 in listModCountsDistDataframe and 2 in listBaseCountsDistDataframe), the names of the feature borders will be the first element, then the second element.
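
The rule above amounts to taking as many names from cFeaturePosNames as there are positions to plot. A small base R sketch of that selection follows; pickFeaturePosNames is a hypothetical helper introduced here for illustration and is not a DNAModAnnot function.

```r
# Hypothetical helper: keep as many names as there are positions to plot.
pickFeaturePosNames <- function(cFeaturePosNames, nPositions) {
  stopifnot(length(cFeaturePosNames) >= nPositions)
  cFeaturePosNames[seq_len(nPositions)]
}

# One position (featureStart only): only the first name is used.
oneName <- pickFeaturePosNames(c("TSS", "TTS"), 1)   # "TSS"

# Two positions (featureStart and featureEnd): both names are used, in order.
twoNames <- pickFeaturePosNames(c("TSS", "TTS"), 2)  # "TSS" "TTS"
```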

cBaseMotif

The name of the motif with the base letter of the modified base.

cModMotif

The name of the motif with the modification in the output.

nDensityBaseMotif

Numeric vector giving the density of the polygon made with the "Base" counts by distance to feature positions. Defaults to 50.
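
To give an idea of what a polygon density means, here is a sketch using graphics::polygon from base R, whose density argument sets the number of shading lines per inch. It assumes that nDensityBaseMotif plays a similar role, which this page does not confirm; the data values are made up for illustration.

```r
# Toy "Base" proportions by distance; values are made up for illustration.
dist <- seq(-600, 600, by = 100)
baseProp <- c(0.2, 0.25, 0.3, 0.4, 0.55, 0.7, 0.8,
              0.7, 0.55, 0.4, 0.3, 0.25, 0.2)

# Draw to a temporary PDF so the sketch also runs without a screen device.
plotFile <- tempfile(fileext = ".pdf")
grDevices::pdf(plotFile)
plot(dist, baseProp,
  type = "n", ylim = c(0, 1),
  xlab = "Distance from feature position (bp)", ylab = "Proportion"
)
# Close the polygon down to y = 0 at both ends of the distance range;
# 'density' is the number of shading lines per inch (assumed analogous
# to nDensityBaseMotif).
graphics::polygon(
  x = c(dist[1], dist, dist[length(dist)]),
  y = c(0, baseProp, 0),
  density = 50, col = "grey40", border = NA
)
grDevices::dev.off()
```

Lower density values give sparser hatching, which can help keep the "Mod" curve readable when both are drawn on the same axes.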

Examples

# loading genome
myGenome <- Biostrings::readDNAStringSet(system.file(
  package = "DNAModAnnot", "extdata",
  "ptetraurelia_mac_51_sca171819.fa"
))

# loading annotation
library(rtracklayer)
myAnnotations <- readGFFAsGRanges(system.file(
  package = "DNAModAnnot", "extdata",
  "ptetraurelia_mac_51_annotation_v2.0_sca171819.gff3"
))

# Preparing the grangesPacBioGFF and grangesPacBioCSV datasets
myGrangesPacBioGFF <-
  ImportPacBioGFF(
    cPacBioGFFPath = system.file(
      package = "DNAModAnnot", "extdata",
      "ptetraurelia.modifications.sca171819.gff"
    ),
    cNameModToExtract = "m6A",
    cModNameInOutput = "6mA",
    cContigToBeAnalyzed = names(myGenome)
  )
myGposPacBioCSV <-
  ImportPacBioCSV(
    cPacBioCSVPath = system.file(
      package = "DNAModAnnot", "extdata",
      "ptetraurelia.bases.sca171819.csv"
    ),
    cSelectColumnsToExtract = c(
      "refName", "tpl", "strand", "base",
      "score", "ipdRatio", "coverage"
    ),
    lKeepExtraColumnsInGPos = TRUE, lSortGPos = TRUE,
    cContigToBeAnalyzed = names(myGenome)
  )
# Keep only adenine positions: "A" is the "Base" for the "6mA" modification
myGposPacBioCSV <- myGposPacBioCSV[myGposPacBioCSV$base == "A"]

# Retrieve, in a list, dataframes of ModBase counts per Distance values from feature positions
myModDistCountsList <- GetDistFromFeaturePos(
  grangesAnnotations = myAnnotations,
  cSelectFeature = "gene",
  grangesData = myGrangesPacBioGFF,
  lGetGRangesInsteadOfListCounts = FALSE,
  lGetPropInsteadOfCounts = TRUE,
  cWhichStrandVsFeaturePos = "both", nWindowSizeAroundFeaturePos = 600,
  lAddCorrectedDistFrom5pTo3p = TRUE,
  cFeaturePosNames = c("TSS", "TTS")
)
myBaseDistCountsList <- GetDistFromFeaturePos(
  grangesAnnotations = myAnnotations,
  cSelectFeature = "gene",
  grangesData = myGposPacBioCSV,
  lGetGRangesInsteadOfListCounts = FALSE,
  lGetPropInsteadOfCounts = TRUE,
  cWhichStrandVsFeaturePos = "both", nWindowSizeAroundFeaturePos = 600,
  lAddCorrectedDistFrom5pTo3p = TRUE,
  cFeaturePosNames = c("TSS", "TTS")
)
# Draw the "Mod" and "Base" distributions by distance from the TSS and TTS
DrawModBasePropDistFromFeature(
  listModCountsDistDataframe = myModDistCountsList,
  listBaseCountsDistDataframe = myBaseDistCountsList,
  cFeaturePosNames = c("TSS", "TTS"),
  cBaseMotif = "A",
  cModMotif = "6mA"
)

AlexisHardy/DNAModAnnot documentation built on Feb. 27, 2023, 12:03 a.m.