ImportPacBioCSV: ImportPacBioCSV Function (DaLoad)

View source: R/DaLoad.R

ImportPacBioCSV R Documentation

ImportPacBioCSV Function (DaLoad)

Description

Import a PacBio CSV file and convert it to an unstitched GPos object.

Usage

ImportPacBioCSV(
  cPacBioCSVPath,
  cSelectColumnsToExtract = c("refName", "tpl", "strand", "base", "score", "ipdRatio",
    "coverage"),
  lKeepExtraColumnsInGPos = TRUE,
  lSortGPos = TRUE,
  cContigToBeAnalyzed = NULL,
  lKeepSequence = TRUE
)

Arguments

cPacBioCSVPath

Path to a PacBio CSV file containing data from all bases sequenced.

cSelectColumnsToExtract

Names of the columns to extract from the PacBio CSV file. The fewer columns are extracted, the faster the file will be loaded. The columns "refName", "tpl" and "strand" are mandatory for the conversion to a GPos object. Defaults to c("refName", "tpl", "strand", "base", "score", "ipdRatio", "coverage").

lKeepExtraColumnsInGPos

If FALSE, the resulting GPos object will contain only the contig names, start/end positions and strand. Defaults to TRUE.

lSortGPos

If TRUE, the GPos object will be sorted before being returned: the function will take longer to run, but the GPos object will require less memory. Defaults to TRUE.

cContigToBeAnalyzed

Names of contigs for which the data will be kept. If NULL, data from all contigs available will be imported. Defaults to NULL.

lKeepSequence

If TRUE, the sequence of the base will be retained in one column. Otherwise, it will be discarded to reduce object size. Defaults to TRUE.
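
The effect of cSelectColumnsToExtract and cContigToBeAnalyzed can be illustrated with a small base R sketch. This is not the implementation used by ImportPacBioCSV: the helper read_selected_columns and the toy CSV content are hypothetical, and base read.csv stands in for whatever reader the package relies on. It only shows the idea of checking the mandatory columns, skipping unselected columns while reading, and keeping the rows of the requested contigs.

```r
# Hypothetical helper, not part of DNAModAnnot: read only the selected
# columns of a CSV file and optionally keep only some contigs.
read_selected_columns <- function(path, columns, contigs = NULL) {
  # These three columns are the ones the documentation lists as mandatory.
  mandatory <- c("refName", "tpl", "strand")
  missing_cols <- setdiff(mandatory, columns)
  if (length(missing_cols) > 0) {
    stop("Missing mandatory columns: ", paste(missing_cols, collapse = ", "))
  }

  # Read one row to learn which columns the file contains.
  header <- names(read.csv(path, nrows = 1, check.names = FALSE))

  # "NULL" makes read.csv skip a column; NA lets it guess the column type.
  col_classes <- rep(NA_character_, length(header))
  col_classes[!(header %in% columns)] <- "NULL"
  df <- read.csv(path, colClasses = col_classes, check.names = FALSE)

  # NULL means "keep all contigs", mirroring the documented default.
  if (!is.null(contigs)) {
    df <- df[df$refName %in% contigs, , drop = FALSE]
  }
  df
}

# Toy file with made-up values (assumption: same column names as above).
toy_csv <- tempfile(fileext = ".csv")
writeLines(c(
  "refName,tpl,strand,base,score,ipdRatio,coverage",
  "scaffold51_17,1,0,C,15,1.02,31",
  "scaffold51_18,1,0,A,12,1.10,30",
  "scaffold51_18,1,1,T,8,0.95,28",
  "scaffold51_19,5,0,G,20,2.30,35"
), toy_csv)

toy <- read_selected_columns(
  toy_csv,
  columns = c("refName", "tpl", "strand", "ipdRatio"),
  contigs = c("scaffold51_18", "scaffold51_19")
)
toy
```

In this sketch, unselected columns are never parsed, which is why a shorter selection would be expected to load faster. The returned columns follow the order in the file rather than the order of the selection.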

Examples

# Attaching the package
library(DNAModAnnot)

# Loading genome data
myGenome <- Biostrings::readDNAStringSet(system.file(
  package = "DNAModAnnot", "extdata",
  "ptetraurelia_mac_51_sca171819.fa"
))
names(myGenome)

# Loading PacBio data
myGrangesPacBioCSV <-
  ImportPacBioCSV(
    cPacBioCSVPath = system.file(
      package = "DNAModAnnot", "extdata",
      "ptetraurelia.bases.sca171819.csv"
    ),
    cSelectColumnsToExtract = c("refName", "tpl", "strand", "base",
                                "score", "ipdRatio", "coverage"),
    lKeepExtraColumnsInGPos = TRUE,
    lSortGPos = TRUE,
    cContigToBeAnalyzed = names(myGenome)
  )
myGrangesPacBioCSV

# Loading PacBio data for 2 scaffolds only
myGrangesPacBioCSV <-
  ImportPacBioCSV(
    cPacBioCSVPath = system.file(
      package = "DNAModAnnot", "extdata",
      "ptetraurelia.bases.sca171819.csv"
    ),
    cSelectColumnsToExtract = c(
      "refName", "tpl", "strand", "base",
      "score", "ipdRatio", "coverage"
    ),
    lKeepExtraColumnsInGPos = TRUE,
    lSortGPos = TRUE,
    cContigToBeAnalyzed = c("scaffold51_18", "scaffold51_19")
  )
myGrangesPacBioCSV

AlexisHardy/DNAModAnnot documentation built on Feb. 27, 2023, 12:03 a.m.