unmapped_read | R Documentation |
This function extracts all unmapped reads with a mate that overlaps
a set of query genomic intervals. Internally, this function uses
scanBam to scan the bam files. In our application, we typically
create a bam file containing only mapped-unmapped read pairs, as this
greatly reduces the size of the bam file to query. In particular, we
create this bam file with the samtools flags -f 4 -F 8, which keep
reads that are unmapped and whose mate is mapped; the full command is
given below.
unmapped_read(
bam.file,
query,
yield_size = 1e+06,
maxgap = 500,
what = scanBamWhat()
)
bam.file | character string providing the complete path to the BAM file |
query | a GRanges object of query genomic intervals |
yield_size | the number of reads to extract from the bam file at once |
maxgap | the maximum gap allowed between the query interval and the mapped read for the two intervals to be considered overlapping |
what | a character vector of fields to keep from the bam file. Defaults to scanBamWhat() |
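The maxgap argument can be illustrated with a small base R sketch. The helper within_gap() is hypothetical and is not part of this package. It assumes the usual convention that the gap between two intervals is the number of positions separating them; whether unmapped_read applies exactly this rule internally is an assumption.

```r
# Hypothetical helper, not part of the package: TRUE when two closed
# intervals overlap or are separated by at most `maxgap` positions.
within_gap <- function(start1, end1, start2, end2, maxgap = 500L) {
  # Positions strictly between the two intervals (negative if they overlap)
  gap <- pmax(start1, start2) - pmin(end1, end2) - 1L
  gap <= maxgap
}

# Query interval taken from the example on this page
q_start <- 128691748L
q_end   <- 128692097L

# Made-up read positions for illustration only
within_gap(q_start, q_end, 128692300L, 128692400L)  # 202 positions away
within_gap(q_start, q_end, 128693000L, 128693100L)  # 902 positions away
```

With the default maxgap of 500, the first made-up read would count as overlapping the query and the second would not.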
samtools view -b -f 4 -F 8 $input > "unmapped-mapped/${input}"
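To make the flag arithmetic in this command concrete, here is a minimal base R sketch of the same test. The helper passes_filter() is hypothetical and is not part of this package or of samtools. It relies only on the SAM flag bits 0x4 (read unmapped) and 0x8 (mate unmapped).

```r
# Hypothetical helper, not part of the package: does a SAM flag pass
# `samtools view -f 4 -F 8`?
passes_filter <- function(flag) {
  read_unmapped <- bitwAnd(flag, 4L) == 4L  # -f 4: bit 0x4 must be set
  mate_unmapped <- bitwAnd(flag, 8L) == 8L  # -F 8: bit 0x8 must be clear
  read_unmapped & !mate_unmapped
}

# 69 = paired + read unmapped + first in pair
# 73 = paired + mate unmapped + first in pair
# 77 = paired + read unmapped + mate unmapped + first in pair
passes_filter(c(69L, 73L, 77L))
```

Only the first flag passes: the read itself is unmapped while its mate is mapped.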
The GRanges object returned by this function includes the sequence of
the reads, so that the sequences can subsequently be written to disk
in fasta format and realigned with a local alignment algorithm such
as BLAT that allows for split read alignments.
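As a rough illustration of the fasta step, the base R sketch below writes read names and sequences as FASTA records. The helper write_fasta() and the two reads are made up for illustration. This is not the package's fasta_unmapped function, whose interface is not shown on this page.

```r
# Hypothetical helper, not the package's fasta_unmapped(): write read
# names and sequences to `path` as FASTA records.
write_fasta <- function(read_names, sequences, path) {
  stopifnot(length(read_names) == length(sequences))
  # Interleave ">name" header lines with the sequence lines
  records <- as.vector(rbind(paste0(">", read_names), sequences))
  writeLines(records, path)
  invisible(path)
}

# Made-up reads for illustration only
out <- write_fasta(c("read1", "read2"),
                   c("ACGTACGT", "TTGACCA"),
                   tempfile(fileext = ".fa"))
readLines(out)
```

A file in this format could then be passed to an aligner such as BLAT.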
a GRanges object of mapped reads with unmapped mates.
See scanBam for scanning a bam file for reads matching a set of
flags. See fasta_unmapped for writing these sequences to disk in
fasta format.
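# GRanges() and IRanges() come from GenomicRanges and its dependency IRanges
library(GenomicRanges)
# Example BAM file shipped with the svbams package
extdata <- system.file("extdata", package="svbams")
bam <- file.path(extdata, "cgov44t_revised.bam")
# Query interval on chr8
region <- GRanges(seqnames = "chr8",
                  ranges = IRanges(start = 128691748, end = 128692097))
unmapped_read(bam.file = bam, query = region, yield_size = 1e6)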