ExtractListModPosByModMotif: ExtractListModPosByModMotif Function (GloModAn)

View source: R/GloModAn.R

ExtractListModPosByModMotif    R Documentation

Description

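Retrieve the sequence associated with each modification position in the GRanges object provided (optionally including the sequence around each position), then return the motifs associated with the modification, the percentage of modifications in each motif, and one GRanges object per motif.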

Usage

ExtractListModPosByModMotif(
  grangesModPos,
  grangesGenome,
  dnastringsetGenome,
  nUpstreamBpToAdd = 0,
  nDownstreamBpToAdd = 1,
  nModMotifMinProp,
  nModPositionInMotif = 1 + nUpstreamBpToAdd,
  cBaseLetterForMod,
  cModNameInOutput
)

Arguments

grangesModPos

A GRanges object containing the modification positions to be extracted along with their sequence.

grangesGenome

A GRanges object containing the width of each contig.

dnastringsetGenome

A DNAStringSet object containing the sequence for each contig.

nUpstreamBpToAdd

Number of base pairs to add upstream of each range in the GRanges object provided, in order to retrieve sequence upstream of the range. Resized ranges that do not fit within the contigs (as given by grangesGenome) are removed, as are windows containing gaps. Defaults to 0.

nDownstreamBpToAdd

Number of base pairs to add downstream of each range in the GRanges object provided, in order to retrieve sequence downstream of the range. Resized ranges that do not fit within the contigs (as given by grangesGenome) are removed, as are windows containing gaps. Defaults to 1.

nModMotifMinProp

A number giving the threshold used for filtering, described here as a false discovery rate: the closest threshold below this number is chosen. The usage above defines no default, so a value must be supplied; the example uses 0.05 (5%).

nModPositionInMotif

The position of the modification within the window obtained after resizing with nUpstreamBpToAdd and nDownstreamBpToAdd. If the GRanges are 1-bp positions, 1 + nUpstreamBpToAdd gives the position of the modification. Defaults to 1 + nUpstreamBpToAdd.

cBaseLetterForMod

The letter of the modified base (for example "A").

cModNameInOutput

Name for the modification in the output.
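
The window arithmetic described for nUpstreamBpToAdd, nDownstreamBpToAdd and nModPositionInMotif can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy contig, the positions and the helper extract_windows are hypothetical, and only the plus strand is handled.

```r
# Hypothetical toy contig and 1-bp modification positions (plus strand only).
contig <- "GATCAATTGA"
mod_pos <- c(2, 5, 6, 10)
n_up <- 1    # plays the role of nUpstreamBpToAdd
n_down <- 1  # plays the role of nDownstreamBpToAdd

# Hypothetical helper: resize each position into a window and read its sequence.
extract_windows <- function(contig, pos, n_up, n_down) {
  win_start <- pos - n_up
  win_end <- pos + n_down
  # Drop windows that do not fit inside the contig.
  keep <- win_start >= 1 & win_end <= nchar(contig)
  data.frame(
    pos = pos[keep],
    motif = substring(contig, win_start[keep], win_end[keep]),
    stringsAsFactors = FALSE
  )
}

windows <- extract_windows(contig, mod_pos, n_up, n_down)

# With 1-bp positions, the modified base sits at index 1 + n_up in each window.
mod_index <- 1 + n_up
windows$mod_base <- substr(windows$motif, mod_index, mod_index)
print(windows)
```

In this sketch the window around position 10 extends past the end of the contig and is dropped, which mirrors the removal of ranges that do not fit within the contigs.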

Value

A list of 4 objects:

motifs_to_analyse

A character vector containing the sequences of the motifs associated with the modification.

mod_motif

A character vector containing the sequences of the motifs associated with the modification, with the modification represented inside those motifs.

motif_pct

A table containing the percentage of modifications in each motif tested.

GRangesbyMotif

A list of GRanges objects with the sequence: one GRanges object per motif associated with the modification.
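
To make the shape of motif_pct and GRangesbyMotif more concrete, here is a minimal base R sketch of a per-motif percentage table and a per-motif split. It is an assumption about how such outputs can be derived, not the package's code: the motifs and positions vectors are made-up data, and plain numeric vectors stand in for GRanges objects.

```r
# Hypothetical motif found at each modification position, and the positions.
motifs <- c("AT", "AT", "AA", "AT", "AC", "AA", "AT", "AT")
positions <- c(12, 40, 57, 103, 150, 188, 230, 301)

# Percentage of modifications falling in each motif (analogous to motif_pct).
motif_pct <- round(100 * prop.table(table(motifs)), 1)
print(motif_pct)

# One group of positions per motif (analogous to GRangesbyMotif).
pos_by_motif <- split(positions, motifs)
print(pos_by_motif)
```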

Examples

library(DNAModAnnot)

# Loading the genome
myGenome <- Biostrings::readDNAStringSet(system.file(
  package = "DNAModAnnot", "extdata",
  "ptetraurelia_mac_51_sca171819.fa"
))
myGrangesGenome <- GetGenomeGRanges(myGenome)

# Preparing a grangesPacBioGFF dataset
myGrangesPacBioGFF <-
  ImportPacBioGFF(
    cPacBioGFFPath = system.file(
      package = "DNAModAnnot", "extdata",
      "ptetraurelia.modifications.sca171819.gff"
    ),
    cNameModToExtract = "m6A",
    cModNameInOutput = "6mA",
    cContigToBeAnalyzed = names(myGenome)
  )

# Retrieve GRanges with sequence
myMotif_pct_and_GRangesList <- ExtractListModPosByModMotif(
  grangesModPos = myGrangesPacBioGFF,
  grangesGenome = myGrangesGenome,
  dnastringsetGenome = myGenome,
  nUpstreamBpToAdd = 0,
  nDownstreamBpToAdd = 1,
  nModMotifMinProp = 0.05,
  cBaseLetterForMod = "A",
  cModNameInOutput = "6mA"
)

myMotif_pct_and_GRangesList$motifs_to_analyse
myMotif_pct_and_GRangesList$mod_motif
myMotif_pct_and_GRangesList$motif_pct
myMotif_pct_and_GRangesList$GRangesbyMotif

AlexisHardy/DNAModAnnot documentation built on Feb. 27, 2023, 12:03 a.m.