sublong: Align long sequence reads to a reference genome via...

Description

This function aligns DNA-seq reads, generated by long-read sequencing technologies such as Nanopore and PacBio sequencers, to a reference genome.

Usage

sublong(
  index,
  readFiles,
  outputFiles,
  outputFormat="BAM",
  nthreads=1
)

Arguments

index

a character vector giving the basename of the index files. The index files should be located in the current directory. The index must be a full index consisting of a single block. See buildindex for index building options.

readFiles

a character vector giving the names of input files that contain long sequence reads. FASTQ and gzipped FASTQ formats are both accepted.

outputFiles

a character vector specifying the names of output files that contain read mapping results.

outputFormat

a character string specifying the format of output files. BAM by default. Acceptable formats include SAM and BAM.

nthreads

an integer giving the number of threads used for mapping. 1 by default. Note that when more than one thread is used, the order of reads in the output may differ from their order in the input.
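
The arguments above can be prepared ahead of the call. The following is a minimal sketch, assuming that one output file is wanted per input read file (the page does not state this explicitly). The helper make_output_names and the sample file names are hypothetical and are not part of Rsubread.

```r
# Hypothetical helper (not part of Rsubread): derive one output name per input file.
make_output_names <- function(readFiles, outputFormat = c("BAM", "SAM"), outDir = ".") {
  # Only the two formats listed under outputFormat are accepted.
  outputFormat <- match.arg(outputFormat)
  # Strip a trailing .gz, then the remaining extension,
  # e.g. "reads.fastq.gz" -> "reads".
  stem <- sub("\\.gz$", "", basename(readFiles))
  stem <- tools::file_path_sans_ext(stem)
  file.path(outDir, paste0(stem, ".", tolower(outputFormat)))
}

# Illustrative file names only; these files are assumed, not shipped with Rsubread.
readFiles <- c("sampleA.fastq.gz", "sampleB.fq")
outputFiles <- make_output_names(readFiles, "SAM")

# The resulting vectors could then be passed on, for example:
# sublong(index = "./full_index", readFiles = readFiles,
#         outputFiles = outputFiles, outputFormat = "SAM", nthreads = 1)
```

Passing the same format string to both the helper and sublong keeps the file extension consistent with the actual output format.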

Details

sublong is designed for mapping long reads. It performs full alignment of each read using seed-and-vote mapping followed by a bounded dynamic programming procedure. sublong can map reads that are millions of bases long.

sublong is extremely fast: it takes less than 10 minutes to map more than 100,000 long reads generated by the Nanopore MinION ultra-long sequencing protocol.

The number of CIGAR operations (e.g., insertions and deletions) reported for a long read may exceed the limit on the total number of operations allowed in a CIGAR string (up to 65,535 operations in BAM output and up to 99,900 operations in SAM output). If this limit is exceeded, the read will be soft-clipped.
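
To make the CIGAR limits concrete, the sketch below counts the operations in a CIGAR string with base R and compares the count against the limits quoted above. The helpers count_cigar_ops and exceeds_limit are hypothetical illustrations and are not part of Rsubread; sublong applies its own soft clipping internally.

```r
# Hypothetical helper (not part of Rsubread): count the operations in CIGAR strings.
count_cigar_ops <- function(cigar) {
  # Each operation is a run length followed by a single operation code.
  matches <- gregexpr("[0-9]+[MIDNSHP=X]", cigar)
  # gregexpr returns -1 when nothing matches (e.g. "*" for an unmapped read).
  vapply(matches, function(m) if (m[1] == -1L) 0L else length(m), integer(1))
}

# Limits quoted in the Details section of this page.
cigar_limit <- c(BAM = 65535L, SAM = 99900L)

# Hypothetical helper: TRUE if a CIGAR string has more operations than the format allows.
exceeds_limit <- function(cigar, format = "BAM") {
  count_cigar_ops(cigar) > cigar_limit[[format]]
}

n_ops <- count_cigar_ops("5S100M2I50M1D30M")
too_long <- exceeds_limit("5S100M2I50M1D30M", "BAM")
```

A check of this kind can be useful when inspecting SAM output to see how close long reads come to the limit.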

Author(s)

Yang Liao and Wei Shi

Examples

library(Rsubread)
ref <- system.file("extdata","reference.fa",package="Rsubread")
buildindex(basename="./full_index",reference=ref,gappedIndex=FALSE, indexSplit=FALSE)
reads <- system.file("extdata","longreads.txt.gz",package="Rsubread")
sublong("./full_index",reads,"./Long_alignment.BAM",nthreads=4)

Rsubread documentation built on Nov. 9, 2018, 6 p.m.