extractReads: Extract reads from a BAM file

View source: R/extractReads.R

extractReadsR Documentation

Extract reads from a BAM file

Description

Extract reads from a BAM file with the specified parameter settings.

Usage

extractReads(bam.file, region, ext=NA, param=readParam(), as.reads=FALSE)

Arguments

bam.file

A character string containing the path to a sorted and indexed BAM file. Alternatively, a BamFile object describing a BAM file and its index.

region

A GRanges object of length 1 describing the region of interest.

ext

An integer scalar or list specifying the fragment length for directional read extension.

param

A readParam object specifying how reads should be extracted.

as.reads

A logical scalar indicating whether reads should be returned instead of fragments for paired-end data.

Details

This function extracts the reads from a BAM file overlapping a given genomic interval. The interpretation of the values in param is the same as that throughout the package. The aim is to supply the raw data for visualization, in a manner that maintains consistency with the rest of the analysis.

If pe!="both" in param, stranded intervals corresponding to the reads will be reported. If ext is not NA, directional read extension will also be performed – see windowCounts for more details. If pe="both", intervals are unstranded and correspond to fragments from proper pairs.

If as.reads=TRUE and pe="both", the reads in each proper pair are returned directly as a GRangesList of length 2. The two internal GRanges are of the same length and contain the forward and reverse reads for each proper pair in parallel. In other words, the nth elements of the first and second GRanges represent the nth proper pair.

Any strandedness of region is ignored. If strand-specific extraction is desired, this can be done by setting param$forward via reform. Alternatively, the returned GRanges can be filtered to retain only the desired strand.

Value

If pe!="both" or as.reads=FALSE, a GRanges object is returned containing the read (for single-end data) or fragment intervals (for paired-end data).

If pe="both" and as.reads=TRUE, a GRangesList is returned containing the paired reads – see Details.

Author(s)

Aaron Lun

See Also

readParam, windowCounts

Examples

bamFile <- system.file("exdata", "rep1.bam", package="csaw")
extractReads(bamFile, GRanges("chrA", IRanges(100, 500)))
extractReads(bamFile, GRanges("chrA", IRanges(100, 500)),
    param=readParam(dedup=TRUE))
extractReads(bamFile, GRanges("chrB", IRanges(100, 500)))

bamFile <- system.file("exdata", "pet.bam", package="csaw")
extractReads(bamFile, GRanges("chrB", IRanges(100, 500)), 
    param=readParam(pe="both"))
extractReads(bamFile, GRanges("chrB", IRanges(100, 500)), 
    param=readParam(pe="first"))

# Extracting as reads.
extractReads(bamFile, GRanges("chrB", IRanges(100, 500)), 
    param=readParam(pe="both"), as.reads=TRUE)

# Dealing with the extension length.
bamFile <- system.file("exdata", "rep1.bam", package="csaw")
my.reg <- GRanges("chrA", IRanges(10, 200))
extractReads(bamFile, my.reg)
extractReads(bamFile, my.reg, ext=100)

LTLA/csaw documentation built on Dec. 11, 2023, 5:11 a.m.