DrawParamPerModBaseCategories: DrawParamPerModBaseCategories Function (ModAnnot)

View source: R/ModAnnot.R

DrawParamPerModBaseCategoriesR Documentation

DrawParamPerModBaseCategories Function (ModAnnot)

Description

Return boxplots describing, for each feature provided, the distribution of a parameter provided by categories of counts (or counts per KiloBase pairs (kbp)) of the base modified (Mod) and the base letter of the modified base (Base). These categories should have a comparable amount of Mod (or Base) between them. Example: for Mod="6mA", Base="A"; for Mod="5mC", Base="C".

Usage

DrawParamPerModBaseCategories(
  grangesAnnotationsWithCounts,
  cParamColname,
  cParamFullName = cParamColname,
  cParamYLabel = cParamColname,
  cSelectFeature = NULL,
  lUseCountsPerkbp = TRUE,
  lKeepOutliers = FALSE,
  lUseSameYAxis = FALSE,
  cBaseMotif,
  cModMotif,
  lBoxPropToCount = TRUE
)

Arguments

grangesAnnotationsWithCounts

A GRanges object based on grangesAnnotations with the counts:

  • Modcount: The number of "Mod" within this feature.

  • Modcount_perkbp: (The number of "Mod" within this feature divided by the size of the feature)*1000.

  • Basecount: The number of "Base" within this feature.

  • Basecount_perkbp: (The number of "Base" within this feature divided by the size of the feature)*1000.

The Genomic features categories must be in a column named "type".

cParamColname

Name of the column, in the grangesAnnotationsWithCounts provided, containing the parameter to be compared with Mod|Base categories.

cParamFullName

Name of the parameter to be displayed in the title of the plot. Defaults to cParamColname.

cParamYLabel

Name of the parameter to be displayed on the Y-axis of the plot. Defaults to cParamColname.

cSelectFeature

The name of the feature from the annotation to be analysed along counts and parameter provided. Defaults to NULL (No subsetting: counts and parameter provided from all features in the annotation provided will be used).

lUseCountsPerkbp

If TRUE, use counts per kbp (Modcount_perkbp and Basecount_perkbp) to calculate the Mod|Base categories to be displayed. If FALSE, use counts (Modcount and Basecount) to calculate the Mod|Base categories to be displayed. Defaults to TRUE.

lKeepOutliers

If FALSE, remove outliers before plotting. Defaults to FALSE.

lUseSameYAxis

If TRUE, the 2 plots will use the same range for the y-axis. If FALSE, y-axis of the 2 plots will be independant. Defaults to FALSE.

cBaseMotif

The name of the motif with the base letter of the modified base.

cModMotif

The name of the motif with the modification in the output.

lBoxPropToCount

If TRUE, the width of each boxplot depends on the number of Mod (or Base) in the categories. If FALSE, all boxplots will have the same size. Defaults to TRUE.

Examples

# loading genome
myGenome <- Biostrings::readDNAStringSet(system.file(
  package = "DNAModAnnot", "extdata",
  "ptetraurelia_mac_51_sca171819.fa"
))

# loading annotation
library(rtracklayer)
myAnnotations <- readGFFAsGRanges(system.file(
  package = "DNAModAnnot", "extdata",
  "ptetraurelia_mac_51_annotation_v2.0_sca171819.gff3"
))

# Preparing a grangesPacBioGFF and a grangesPacBioCSV datasets
myGrangesPacBioGFF <-
  ImportPacBioGFF(
    cPacBioGFFPath = system.file(
      package = "DNAModAnnot", "extdata",
      "ptetraurelia.modifications.sca171819.gff"
    ),
    cNameModToExtract = "m6A",
    cModNameInOutput = "6mA",
    cContigToBeAnalyzed = names(myGenome)
  )
myGposPacBioCSV <-
  ImportPacBioCSV(
    cPacBioCSVPath = system.file(
      package = "DNAModAnnot", "extdata",
      "ptetraurelia.bases.sca171819.csv"
    ),
    cSelectColumnsToExtract = c(
      "refName", "tpl", "strand", "base",
      "score", "ipdRatio", "coverage"
    ),
    lKeepExtraColumnsInGPos = TRUE, lSortGPos = TRUE,
    cContigToBeAnalyzed = names(myGenome)
  )
myGposPacBioCSV <- myGposPacBioCSV[myGposPacBioCSV$base == "A"]

# Retrieve annotations with "Mod" and "Base" counts (and counts per kbp)
myAnn_ModBase_counts <- GetModBaseCountsByFeature(
  grangesAnnotations = myAnnotations,
  grangesModPos = myGrangesPacBioGFF,
  gposModTargetBasePos = myGposPacBioCSV
)

# Add Parameter by feature to annotation file
myAnn_ModBase_counts$ParamToPlot <- width(myAnn_ModBase_counts)

DrawParamPerModBaseCategories(myAnn_ModBase_counts,
  cParamColname = "ParamToPlot",
  cParamFullName = "Gene Width",
  cParamYLabel = "Gene Width (bp)",
  cSelectFeature = c("gene"),
  lUseCountsPerkbp = TRUE,
  lKeepOutliers = FALSE,
  lUseSameYAxis = FALSE,
  cBaseMotif = "A",
  cModMotif = "6mA",
  lBoxPropToCount = FALSE
)

AlexisHardy/DNAModAnnot documentation built on Feb. 27, 2023, 12:03 a.m.