GetListCountsByDist: GetListCountsByDist Function (ModAnnot)

View source: R/ModAnnot.R

GetListCountsByDist    R Documentation

GetListCountsByDist Function (ModAnnot)

Description

Returns a list of dataframes containing the counts (or proportions) of the provided "Positions" by distance from feature positions. If the input list contains 2 GRanges, the output contains 2 dataframes ("Position" vs featureStart; "Position" vs featureEnd) instead of 1 dataframe ("Position" vs featureStart).

Usage

GetListCountsByDist(
  listGRangesDist,
  lAddCorrectedDistFrom5pTo3p = TRUE,
  lGetPropInsteadOfCounts = TRUE
)

Arguments

listGRangesDist

A GRangesList with 1 or 2 GRanges objects containing ranges of given "Positions" with their distance to feature positions.

lAddCorrectedDistFrom5pTo3p

If TRUE, the distance will be corrected to reflect 5' to 3' direction and will be stored in a new column (dist_5to3). Defaults to TRUE.
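To make the idea of a 5' to 3' corrected distance concrete, here is a minimal base R sketch. It assumes the correction amounts to flipping the sign of the distance for features on the minus strand; that rule, the `distTable` data frame and its `featureStrand` column are illustrative assumptions, not taken from the package source.

```r
# Hypothetical distances in genomic (left-to-right) coordinates.
distTable <- data.frame(
  dist = c(-50, 10, 30),
  featureStrand = c("+", "-", "-")
)

# Assumed correction: flip the sign when the feature lies on the minus strand,
# so that negative values always mean upstream (5') of the feature position.
distTable$dist_5to3 <- ifelse(distTable$featureStrand == "-",
  -distTable$dist, distTable$dist
)
distTable
```

The uncorrected `dist` column is kept alongside the new `dist_5to3` column, mirroring the description above of the corrected distance being stored in a new column.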

lGetPropInsteadOfCounts

If TRUE, return the proportion of given "Positions" near the feature position: counts / sum of counts. If listGRangesDist contains 2 GRanges, the proportion of given "Positions" is calculated near both feature positions: counts / (sum of counts near feature1 + sum of counts near feature2). Defaults to TRUE.
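The proportion arithmetic described above can be sketched in base R. The count tables `countsTSS` and `countsTTS` below are made-up illustration data, not objects produced by DNAModAnnot, and the sketch only reproduces the two formulas rather than the function's actual implementation.

```r
# Hypothetical counts of "Positions" per distance, for illustration only.
countsTSS <- data.frame(dist = c(-1, 0, 1), counts = c(2, 5, 3))
countsTTS <- data.frame(dist = c(-1, 0, 1), counts = c(4, 4, 2))

# One GRanges: proportion = counts / sum of counts.
propSingle <- countsTSS$counts / sum(countsTSS$counts)

# Two GRanges: the denominator pools the counts near both feature positions.
totalBoth <- sum(countsTSS$counts) + sum(countsTTS$counts)
propTSS <- countsTSS$counts / totalBoth
propTTS <- countsTTS$counts / totalBoth
```

With a pooled denominator, the proportions in the two tables sum to 1 together rather than separately, which keeps the two profiles on a common scale.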

Value

A list with 1 or 2 dataframe(s) containing "Positions" counts by distance to feature positions:

  • If 1 GRanges is provided in listGRangesDist, 1 dataframe is provided ("Position" vs featureStart).

  • If 2 GRanges are provided in listGRangesDist, 2 dataframes are provided ("Position" vs featureStart; "Position" vs featureEnd).

If a given "Position" is within nWindowSizeAroundFeaturePos base pairs of x different feature positions, it will be reported x times, once with the distance to each feature position.
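The repeated reporting described above can be illustrated with a small base R sketch. The coordinates, the feature names and the sign convention (position minus feature position) are assumptions made for the example; `windowSize` stands in for nWindowSizeAroundFeaturePos.

```r
# Hypothetical coordinates, for illustration only.
featurePos <- c(geneA = 1000, geneB = 1400)
position <- 1200
windowSize <- 600 # stands in for nWindowSizeAroundFeaturePos

# Distance from the single "Position" to every feature position.
dists <- position - featurePos
reported <- data.frame(feature = names(featurePos), dist = unname(dists))

# Keep only the features whose position lies within the window.
reported <- reported[abs(reported$dist) <= windowSize, ]
reported
```

Here the single "Position" lies within the window of both features, so it appears in two rows and contributes to the counts at two different distances.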

Examples

# loading genome
myGenome <- Biostrings::readDNAStringSet(system.file(
  package = "DNAModAnnot", "extdata",
  "ptetraurelia_mac_51_sca171819.fa"
))

# loading annotation
library(rtracklayer)
myAnnotations <- readGFFAsGRanges(system.file(
  package = "DNAModAnnot", "extdata",
  "ptetraurelia_mac_51_annotation_v2.0_sca171819.gff3"
))

# Preparing the grangesPacBioGFF and grangesPacBioCSV datasets
myGrangesPacBioGFF <-
  ImportPacBioGFF(
    cPacBioGFFPath = system.file(
      package = "DNAModAnnot", "extdata",
      "ptetraurelia.modifications.sca171819.gff"
    ),
    cNameModToExtract = "m6A",
    cModNameInOutput = "6mA",
    cContigToBeAnalyzed = names(myGenome)
  )
myGposPacBioCSV <-
  ImportPacBioCSV(
    cPacBioCSVPath = system.file(
      package = "DNAModAnnot", "extdata",
      "ptetraurelia.bases.sca171819.csv"
    ),
    cSelectColumnsToExtract = c(
      "refName", "tpl", "strand", "base",
      "score", "ipdRatio", "coverage"
    ),
    lKeepExtraColumnsInGPos = TRUE, lSortGPos = TRUE,
    cContigToBeAnalyzed = names(myGenome)
  )
myGposPacBioCSV <- myGposPacBioCSV[myGposPacBioCSV$base == "A"]

# Retrieve, in a list, dataframes of ModBase counts per Distance values from feature positions
myModDistGRangesList <- GetDistFromFeaturePos(
  grangesAnnotations = myAnnotations,
  cSelectFeature = "gene",
  grangesData = myGrangesPacBioGFF,
  lGetGRangesInsteadOfListCounts = TRUE,
  cWhichStrandVsFeaturePos = "both", nWindowSizeAroundFeaturePos = 600,
  lAddCorrectedDistFrom5pTo3p = TRUE,
  cFeaturePosNames = c("TSS", "TTS")
)
myModDistCountsList <- GetListCountsByDist(
  listGRangesDist = myModDistGRangesList,
  lAddCorrectedDistFrom5pTo3p = TRUE,
  lGetPropInsteadOfCounts = TRUE
)
myModDistCountsList

AlexisHardy/DNAModAnnot documentation built on Feb. 27, 2023, 12:03 a.m.