getDVDnames: Identify DVD reads based on alignment of the vector on the...


View source: R/getDVDnames.R

Description

DVD reads contain DNA-vector-DNA. A DVD read has a single vector alignment that covers at least PercentVecLength (default 95%) of the vector length, with at least MinDNASides bp (default 10000) of non-vector DNA on each side of that alignment.

Usage

getDVDnames(
  alignGR,
  vectorLength,
  PercentVecLength = 0.95,
  MinDNASides = 10000L
)

Arguments

alignGR

a GRanges object containing the alignment of the vector on the reads

vectorLength

integer Length of the vector (in bp)

PercentVecLength

numeric in [0.5,1] giving the fraction of the vector length that must be aligned on the read for it to be considered a DVD read (e.g. 0.95 for 95%)

MinDNASides

integer Minimum length (in bp) of non-vector DNA on each side of the central vector alignment

Value

a character vector with the names of DVD reads

Examples

## Create a GRanges. Only Read1 and Read2 are DVD reads
## vector is 10kb
## alignment of vector on Read2 covers 9501 bp (>95% of vector length)
  rgr <- GenomicRanges::GRanges(c("Read1:12e3-22e3",
                                  "Read2:20e3-29.6e3",
                                  "Read3:1-2000", "Read3:98001-1e5"),
                                seqlengths = c("Read1"=5e4, "Read2"=6e4,
                                               "Read3"=1e5, "Read4"=5e4),
                                QueryRange.width = c(1e4, 9501, 2000, 2000))
## Names of DVD reads (using 95% of vector length as threshold):
  getDVDnames(rgr, 10000, 0.95)
## With 98% of vector length only Read1 will be selected as a DVD read
  getDVDnames(rgr, 10000, 0.98)
## If requiring at least 15kb on each side of the vector then only Read2 is DVD
  getDVDnames(rgr, 10000, 0.95, 15e3)
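
The selection rule described on this page can be approximated in base R, which may help when checking by hand why a read is kept or dropped. This is only a sketch and not the NanoBAC implementation: `sketch_dvd_names` and the data.frame columns (`read`, `start`, `end`, `read_length`, `vec_width`) are hypothetical names. It also assumes that `QueryRange.width` holds the number of aligned vector bases and that both thresholds are inclusive (`>=`).

```r
# Hypothetical helper, NOT part of NanoBAC: a base-R sketch of the DVD rule
# applied to a plain data.frame with one row per vector alignment.
sketch_dvd_names <- function(aln, vectorLength,
                             PercentVecLength = 0.95,
                             MinDNASides = 10000L) {
  # Keep only reads that carry exactly one vector alignment
  n_aln <- table(aln$read)
  single <- aln[aln$read %in% names(n_aln)[n_aln == 1L], , drop = FALSE]

  # Assumption: the aligned vector bases must reach the threshold (inclusive)
  covered <- single$vec_width >= PercentVecLength * vectorLength

  # Non-vector DNA to the left and to the right of the alignment on the read
  left  <- single$start - 1
  right <- single$read_length - single$end
  flanked <- left >= MinDNASides & right >= MinDNASides

  as.character(single$read[covered & flanked])
}

# Same toy data as the example: coordinates, read lengths, aligned vector bases
aln <- data.frame(
  read        = c("Read1", "Read2", "Read3", "Read3"),
  start       = c(12000, 20000, 1, 98001),
  end         = c(22000, 29600, 2000, 100000),
  read_length = c(5e4, 6e4, 1e5, 1e5),
  vec_width   = c(10000, 9501, 2000, 2000),
  stringsAsFactors = FALSE
)

sketch_dvd_names(aln, 10000, 0.95)        # Read1 and Read2
sketch_dvd_names(aln, 10000, 0.98)        # Read1 only
sketch_dvd_names(aln, 10000, 0.95, 15e3)  # Read2 only
```

In this sketch, Read1 drops out at 15 kb because only 11999 bp of non-vector DNA precede its alignment, and Read3 is never selected because it carries two vector alignments.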

pgpmartin/NanoBAC documentation built on Dec. 11, 2020, 9:51 a.m.