Read a "SAM file", discarding some reads that cannot be interpreted for our purposes.
ReadSamfile(filename)
filename
The name of the SAM file to read.
SAM stands for "Sequence Alignment Map", a text file format that represents aligned next-generation sequencing reads (https://en.wikipedia.org/wiki/SAM_(file_format)). The function keeps only reads that satisfy all of the following conditions:
Mate pair maps to same chromosome
Mapping quality >= 30
"FLAG" < 256 (see the description of the return value element reads.with.bad.FLAG, below).
The CIGAR string is only \d+M (one or more digits followed by M, and nothing else before or after). This means there are no insertions or deletions in the read versus the reference and there is no soft clipping. Insertions, deletions, or clipping shift the read within the SAM file, after which this function cannot keep track of the DBS location.
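The four conditions above can be sketched as a single logical filter. This is only an illustration, not the package's implementation: keep.read is a hypothetical helper, the column names follow the standard SAM field names (QNAME, FLAG, RNAME, MAPQ, CIGAR, RNEXT), and treating RNEXT equal to "=" or to RNAME as "same chromosome" is an assumption based on the SAM convention.

```r
# Hypothetical helper, not part of the package: returns TRUE for each row
# of a SAM-like data.frame that passes all four filters.
keep.read <- function(df, min.mapq = 30) {
  # In SAM, RNEXT == "=" means the mate maps to the same reference as the read.
  same.chrom <- df$RNEXT == "=" | df$RNEXT == df$RNAME
  good.mapq  <- df$MAPQ >= min.mapq
  good.flag  <- df$FLAG < 256
  # Anchored regex: digits followed by M and nothing else (no indels, no clipping).
  good.cigar <- grepl("^[0-9]+M$", df$CIGAR)
  same.chrom & good.mapq & good.flag & good.cigar
}

# Toy data: r1 passes; r2..r5 each fail exactly one condition.
toy <- data.frame(
  QNAME = c("r1", "r2", "r3", "r4", "r5"),
  FLAG  = c(99, 355, 99, 99, 99),
  RNAME = c("chr1", "chr1", "chr1", "chr1", "chr1"),
  MAPQ  = c(60, 60, 12, 60, 60),
  CIGAR = c("100M", "100M", "100M", "95M5S", "100M"),
  RNEXT = c("=", "=", "=", "=", "chr2"),
  stringsAsFactors = FALSE
)

toy$QNAME[keep.read(toy)]  # "r1"
```

In the toy data, r2 fails on FLAG (355 = 256 + 99), r3 on MAPQ, r4 on CIGAR (soft clipping), and r5 on the mate's chromosome.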
A list with the elements:
good.reads, a data.frame with one row per read and with column names for the first 11 columns as specified in https://en.wikipedia.org/wiki/SAM_(file_format). It contains only the reads that meet the filtering conditions.
total.depth, the initial depth including "bad" reads.
reads.with.bad.FLAG. These are reads with FLAG >= 256, which marks reads that
(i) are "not primary alignment"
(ii) failed vendor QC
(iii) are PCR or optical duplicates
(iv) are supplementary alignments (e.g. split or split/inverted reads).
See https://broadinstitute.github.io/picard/explain-flags.html
reads.with.bad.CIGAR, reads with a CIGAR string that indicates an indel in the read or clipping.
reads.with.bad.MAPQ, reads with MAPQ < 30.
reads.with.bad.Mate_CHROM, reads with a mate on a different chromosome.
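To see why a single FLAG >= 256 test covers all four categories listed for reads.with.bad.FLAG, it may help to decode the flag bits. The sketch below is illustrative only: explain.flag and flag.bits are hypothetical names, and the bit values (256, 512, 1024, 2048) are taken from the SAM flag definitions.

```r
# SAM flag bits that make a read "bad" for this purpose (hypothetical names).
flag.bits <- c(not.primary   = 256L,
               failed.qc     = 512L,
               duplicate     = 1024L,
               supplementary = 2048L)

# Hypothetical helper: names of the "bad" bits that are set in one FLAG value.
explain.flag <- function(flag) {
  set <- bitwAnd(rep(as.integer(flag), length(flag.bits)), flag.bits) != 0
  names(flag.bits)[set]
}

explain.flag(99)    # character(0): a properly paired primary alignment
explain.flag(355)   # "not.primary": 355 = 256 + 99
explain.flag(1123)  # "duplicate":   1123 = 1024 + 99
```

Because all flag bits below 256 sum to 255, a FLAG value is at least 256 exactly when one or more of these four bits is set.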