Reads a pileup formatted file (pileupCallFile) or all pileup files in a folder (pileupCallRun) created by samtools mpileup and calls bases for each chromosome listed. Base calling is controlled by coverage and frequency parameters as described in Notes.
pileupCallRun(
  min.cov.call,
  min.cov.freq,
  min.base.freq,
  min.ins.freq,
  min.prob.freq,
  min.binom.prob,
  folder = ".",
  pattern = "\\.pileup$",
  label = NULL,
  num.cores = NULL
)

pileupCallFile(
  fname,
  min.cov.call,
  min.cov.freq,
  min.base.freq,
  min.ins.freq,
  num.cores = NULL
)
min.cov.call | minimum coverage for base calling. Sites with coverage below this are assigned N.
min.cov.freq | minimum coverage above which min.base.freq is applied.
min.base.freq | minimum frequency of either the reference or alternate base for calling. If both bases are below this frequency, an N is assigned.
min.ins.freq | minimum frequency of insertion.
min.prob.freq | minimum frequency for binomial probability.
min.binom.prob | minimum probability from binomial distribution.
folder | folder containing pileup files from a run.
pattern | text pattern for pileup files. The default is that the file ends in ".pileup".
label | label for run output files.
num.cores | number of cores to use during processing. The default in the usage above is NULL.
fname | filename of pileup file.
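The default pattern is a regular expression, so the doubled backslash and the trailing $ matter. The sketch below shows which names it matches, assuming pileupCallRun() selects files the way base R's list.files() does (an assumption; the source does not say how files are discovered). The folder and file names are hypothetical.

```r
# Build a throwaway folder with hypothetical file names.
folder <- file.path(tempdir(), "pileup_demo")
dir.create(folder, showWarnings = FALSE)
file.create(file.path(
  folder,
  c("sample1.pileup", "sample2.pileup", "sample1.pileup.bak", "notes.txt")
))

# "\\.pileup$" matches a literal ".pileup" at the very end of the name,
# so "sample1.pileup.bak" and "notes.txt" are skipped.
matched <- list.files(folder, pattern = "\\.pileup$")
print(matched)
```

A pattern without the $ anchor would also pick up backup copies such as sample1.pileup.bak.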
A list with the following elements:
cons.seq | a DNAbin format list of sequences.
plp | data frame of reference, consensus base, and base frequencies at each reference position.
The input pileup file should be the result of a call to samtools mpileup on a single BAM file.
For each position, bases are called according to the following logic within a single pileup file using pileupCallFile():

1. If coverage is < min.cov.call, assign N.
2. If min.cov.call <= coverage < min.cov.freq, assign N unless all reads contain the same base.
3. If coverage >= min.cov.freq, assign N unless a base occurs at frequency > min.base.freq.
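The three conditions can be read as a small decision function. The following is a minimal sketch, not the package's implementation: call_site() is a hypothetical helper, it considers only the most frequent base (rather than distinguishing reference and alternate), it ignores insertions, and returning NA as n.code for a called base is an assumption.

```r
# Hypothetical helper illustrating the three conditions for one site.
# base.counts: named vector of read counts per base at the site.
# Assumes min.cov.call >= 1 so that zero-coverage sites stop at condition 1.
call_site <- function(base.counts, min.cov.call, min.cov.freq, min.base.freq) {
  coverage <- sum(base.counts)

  # Condition 1: coverage too low to call anything.
  if (coverage < min.cov.call) {
    return(list(base = "N", n.code = 1L))
  }

  top <- names(base.counts)[which.max(base.counts)]
  top.freq <- max(base.counts) / coverage

  # Condition 2: intermediate coverage, call only if all reads agree.
  if (coverage < min.cov.freq) {
    if (top.freq == 1) {
      return(list(base = top, n.code = NA_integer_))
    }
    return(list(base = "N", n.code = 2L))
  }

  # Condition 3: enough coverage, call only above the frequency threshold.
  if (top.freq > min.base.freq) {
    return(list(base = top, n.code = NA_integer_))
  }
  list(base = "N", n.code = 3L)
}

# Example thresholds (illustrative values only).
call_site(c(A = 18, C = 2, G = 0, T = 0),
          min.cov.call = 3, min.cov.freq = 10, min.base.freq = 0.8)
```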
When a set of pileup files is processed together using pileupCallRun(), an additional step is considered. For positions that were designated N based on condition 3 above, a base may be called if 1) the pooled frequency (pool.prop) for that base is > 0.5 and the frequency for the individual (read.prop) is > pool.prop, or 2) pool.prop <= 0.5, the binomial probability of that base (given the coverage at that site) is > min.binom.prob, and read.prop is above a line defined by ((1 - (min.prob.freq / 0.5)) * pool.prop) + min.prob.freq.
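The line in case 2 runs from min.prob.freq at pool.prop = 0 to 0.5 at pool.prop = 0.5, so it meets the case 1 boundary where the two cases join. A minimal sketch of the decision follows; rescue_call() is a hypothetical helper, and the binomial probability is taken as a precomputed input because the source does not specify how it is derived.

```r
# Hypothetical helper: decide whether a condition-3 N may still be called.
# binom.prob is assumed to be computed elsewhere for the site and base.
rescue_call <- function(read.prop, pool.prop, binom.prob,
                        min.prob.freq, min.binom.prob) {
  # Case 1: base is in the majority across the pool.
  if (pool.prop > 0.5) {
    return(read.prop > pool.prop)
  }
  # Case 2: minority base in the pool; the individual's frequency must lie
  # above the line given in the text and pass the binomial threshold.
  threshold <- ((1 - (min.prob.freq / 0.5)) * pool.prop) + min.prob.freq
  binom.prob > min.binom.prob && read.prop > threshold
}

# Illustrative values only: with min.prob.freq = 0.7 and pool.prop = 0.25,
# the line sits at about 0.6.
rescue_call(read.prop = 0.65, pool.prop = 0.25, binom.prob = 0.9,
            min.prob.freq = 0.7, min.binom.prob = 0.5)
```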
The above numbers are used as the value of the n.code column in the output plp data frame to identify the reason an N was called at a given position.
Eric Archer eric.archer@noaa.gov