GetSeqPctByContig: GetSeqPctByContig Function (SeQual)
In AlexisHardy/DNAModAnnot: Toolbox for DNA Modifications filtering and annotation

GetSeqPctByContig

R Documentation

GetSeqPctByContig Function (SeQual)

Description

Returns a list giving the percentage of bases sequenced, by strand, for every contig (scaffold) of the provided genome assembly. This function is not suited to data from DeepSignal.

Usage

GetSeqPctByContig(gposPacBioCSV, grangesGenome)

Arguments

`gposPacBioCSV`	An UnstitchedGPos object containing the PacBio CSV data to be analysed.
`grangesGenome`	A GRanges object containing the width of each contig.

Value

A list of 3 data frames: one per strand and one combining both strands. Each data frame contains the following columns:

refName: The name of each contig.
strand: The strand of each contig.
width: The width of each contig.
nb_sequenced: The number of bases sequenced by strand for each contig.
seqPct: The percentage of bases sequenced for each strand for each contig (percentage of sequencing).
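
The relationship between these columns can be sketched in base R. This is a minimal illustration and not the package's own implementation: it assumes that seqPct equals nb_sequenced divided by width, multiplied by 100. The contig names, widths, and counts below are invented for the example.

```r
# Hypothetical per-strand table with the columns described above.
# All values are made up for illustration only.
pct_plus <- data.frame(
  refName = c("contig_1", "contig_2"),
  strand = c("+", "+"),
  width = c(1000, 500),
  nb_sequenced = c(900, 125)
)

# Assumed definition: share of each contig's bases that were sequenced,
# expressed as a percentage.
pct_plus$seqPct <- 100 * pct_plus$nb_sequenced / pct_plus$width

pct_plus
```

Under this assumption, a contig whose bases are all covered on a given strand would have a seqPct of 100 for that strand.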

Examples

library(DNAModAnnot)

# Loading the genome assembly
myGenome <- Biostrings::readDNAStringSet(system.file(
  package = "DNAModAnnot", "extdata",
  "ptetraurelia_mac_51_sca171819.fa"
))
myGrangesGenome <- GetGenomeGRanges(myGenome)

# Preparing a gposPacBioCSV dataset
myGposPacBioCSV <-
  ImportPacBioCSV(
    cPacBioCSVPath = system.file(
      package = "DNAModAnnot", "extdata",
      "ptetraurelia.bases.sca171819.csv"
    ),
    cSelectColumnsToExtract = c(
      "refName", "tpl", "strand", "base",
      "score", "ipdRatio", "coverage"
    ),
    lKeepExtraColumnsInGPos = TRUE, lSortGPos = TRUE,
    cContigToBeAnalyzed = names(myGenome)
  )

# Computing the percentage of sequencing by strand for each contig
myPct_seq_csv <- GetSeqPctByContig(myGposPacBioCSV, grangesGenome = myGrangesGenome)
myPct_seq_csv

AlexisHardy/DNAModAnnot documentation built on Feb. 27, 2023, 12:03 a.m.