GetListCountsByDist: GetListCountsByDist Function (ModAnnot)
In AlexisHardy/DNAModAnnot: Toolbox for DNA Modifications filtering and annotation

GetListCountsByDist

R Documentation

Description

Returns a list of dataframes containing the counts (or proportions) of the provided "Positions" by distance from feature positions. If the input list contains 2 GRanges, the output contains 2 dataframes ("Position" vs featureStart; "Position" vs featureEnd) instead of 1 dataframe ("Position" vs featureStart).

Usage

GetListCountsByDist(
  listGRangesDist,
  lAddCorrectedDistFrom5pTo3p = TRUE,
  lGetPropInsteadOfCounts = TRUE
)

Arguments

`listGRangesDist`	A GRangesList with 1 or 2 GRanges objects containing ranges of given "Positions" with their distance to feature positions.
`lAddCorrectedDistFrom5pTo3p`	If TRUE, the distance will be corrected to reflect 5' to 3' direction and will be stored in a new column (dist_5to3). Defaults to TRUE.
`lGetPropInsteadOfCounts`	If TRUE, return the proportion of given "Positions" near the feature position: counts / sum of counts. If listGRangesDist contains 2 GRanges, the proportion of given "Positions" is calculated relative to both feature positions: counts / (sum of counts near feature1 + sum of counts near feature2). Defaults to TRUE.
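
The proportion described for lGetPropInsteadOfCounts can be illustrated with plain dataframes. This is a minimal base R sketch of the arithmetic only, not the package's implementation: the toy counts and the column names `dist` and `count` are assumptions made for illustration.

```r
# Toy counts of "Positions" per distance around two feature positions.
# Values and column names are hypothetical, not the package's actual output.
countsStart <- data.frame(dist = c(-1, 0, 1), count = c(2, 5, 3))
countsEnd   <- data.frame(dist = c(-1, 0, 1), count = c(4, 4, 2))

# One GRanges in the list: proportion = counts / sum of counts
propSingle <- countsStart$count / sum(countsStart$count)

# Two GRanges in the list: both tables share the same denominator,
# i.e. sum of counts near feature1 + sum of counts near feature2
totalCount <- sum(countsStart$count) + sum(countsEnd$count)
propStart  <- countsStart$count / totalCount
propEnd    <- countsEnd$count / totalCount
```

With a shared denominator, the proportions of the two tables sum to 1 together rather than separately, which keeps the two profiles comparable on the same scale.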

Value

A list with 1 or 2 dataframe(s) containing "Positions" counts by distance to feature positions:

If 1 GRanges is provided in listGRangesDist, 1 dataframe is provided ("Position" vs featureStart).
If 2 GRanges are provided in listGRangesDist, 2 dataframes are provided ("Position" vs featureStart; "Position" vs featureEnd).

If a given "Position" is within nWindowSizeAroundFeaturePos base pairs of x different feature positions, this "Position" will be reported x times, once with the distance to each feature position.
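
The repeated reporting of a "Position" that lies near several feature positions can be sketched in base R. This is an illustration under stated assumptions, not the package's code: the coordinates are made up, `windowSize` stands in for nWindowSizeAroundFeaturePos, and both the sign convention of `dist` and the use of an inclusive absolute-distance cutoff are assumptions.

```r
# Hypothetical coordinates on a single contig
positions  <- c(100, 450)      # given "Positions"
featurePos <- c(90, 400, 500)  # feature positions (e.g. gene starts)
windowSize <- 60               # stands in for nWindowSizeAroundFeaturePos

# Pair every position with every feature, then keep pairs inside the window
pairs <- expand.grid(position = positions, feature = featurePos)
pairs$dist <- pairs$position - pairs$feature  # sign convention is an assumption
pairs <- pairs[abs(pairs$dist) <= windowSize, ]

# Position 450 is within the window of two features (400 and 500),
# so it appears twice; position 100 is near one feature only.
table(pairs$position)
```

Because one "Position" can contribute several rows, the sum of counts across distances may exceed the number of distinct "Positions" supplied.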

Examples

# loading genome
myGenome <- Biostrings::readDNAStringSet(system.file(
  package = "DNAModAnnot", "extdata",
  "ptetraurelia_mac_51_sca171819.fa"
))

# loading annotation
library(rtracklayer)
myAnnotations <- readGFFAsGRanges(system.file(
  package = "DNAModAnnot", "extdata",
  "ptetraurelia_mac_51_annotation_v2.0_sca171819.gff3"
))

# Preparing the grangesPacBioGFF and gposPacBioCSV datasets
myGrangesPacBioGFF <-
  ImportPacBioGFF(
    cPacBioGFFPath = system.file(
      package = "DNAModAnnot", "extdata",
      "ptetraurelia.modifications.sca171819.gff"
    ),
    cNameModToExtract = "m6A",
    cModNameInOutput = "6mA",
    cContigToBeAnalyzed = names(myGenome)
  )
myGposPacBioCSV <-
  ImportPacBioCSV(
    cPacBioCSVPath = system.file(
      package = "DNAModAnnot", "extdata",
      "ptetraurelia.bases.sca171819.csv"
    ),
    cSelectColumnsToExtract = c(
      "refName", "tpl", "strand", "base",
      "score", "ipdRatio", "coverage"
    ),
    lKeepExtraColumnsInGPos = TRUE, lSortGPos = TRUE,
    cContigToBeAnalyzed = names(myGenome)
  )
myGposPacBioCSV <- myGposPacBioCSV[myGposPacBioCSV$base == "A"]

# Retrieve the distances of ModBase positions from feature positions as GRanges,
# then convert them into a list of dataframes of counts (or proportions) per distance
myModDistGRangesList <- GetDistFromFeaturePos(
  grangesAnnotations = myAnnotations,
  cSelectFeature = "gene",
  grangesData = myGrangesPacBioGFF,
  lGetGRangesInsteadOfListCounts = TRUE,
  cWhichStrandVsFeaturePos = "both", nWindowSizeAroundFeaturePos = 600,
  lAddCorrectedDistFrom5pTo3p = TRUE,
  cFeaturePosNames = c("TSS", "TTS")
)
myModDistCountsList <- GetListCountsByDist(
  listGRangesDist = myModDistGRangesList,
  lAddCorrectedDistFrom5pTo3p = TRUE,
  lGetPropInsteadOfCounts = TRUE
)
myModDistCountsList

AlexisHardy/DNAModAnnot documentation built on Feb. 27, 2023, 12:03 a.m.