pipe.BAMpileup | R Documentation |
High level and low level wrapper functions that invoke SAMTOOLS MPILEUP to get alignment details for a region in a BAM file.
pipe.BAMpileup(sampleID, geneID = NULL, seqID = NULL, start = NULL, stop = NULL,
annotationFile = "Annotation.txt", optionsFile = "Options.txt",
results.path = NULL, summarize.calls = TRUE, verbose = FALSE)
BAM.mpileup(files, seqID, fastaFile, start = NULL, stop = NULL,
min.depth = 3, max.depth = 10000, min.gap.fraction = 0.25,
mpileupArgs = "", summarize.calls = FALSE, verbose = TRUE)
sampleID |
the SampleID for this sample. This SampleID is the key for a row of annotation details in the annotation file, used to get sample-specific details. The SampleID is also used as a sample-specific prefix for all files created during the processing of this sample. |
geneID |
Optional character string of one GeneID. |
seqID |
Optional character string of one SeqID. Required explicitly for the low level function. |
start |
|
stop |
Optional integers giving the starting and stopping nucleotide locations. Note that any valid combination of the above that is sufficient to define a region of a chromosome is allowed. For example, a GeneID alone also implies its SeqID, start, and stop. |
annotationFile |
the file of sample annotation details, which specifies all needed
sample-specific information about the samples under study.
See |
optionsFile |
the file of processing options, which specifies all processing
parameters that are not sample specific. See |
results.path |
The top level folder where all results have been written to. Default is taken from the options file. |
summarize.calls |
Logical. Controls whether the very low level details of the base calls at each nucleotide location are summarized into tabular form. FALSE leaves the raw large cryptic strings as made by SAMTOOLS MPILEUP. When TRUE, they are summarized into a consensus base call and a short text string count summary of A/C/G/T/Indel calls. |
fastaFile |
Full pathname to the genomic DNA FASTA file that made the Bowtie index that the BAM file was aligned against. |
min.depth |
|
max.depth |
|
min.gap.fraction |
Bounds passed to SAMTOOLS MPILEUP, to control how it performs and reports its assessment of the BAM file. |
mpileupArgs |
Other optional arguments passed to SAMTOOLS MPILEUP. |
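The effect of summarize.calls can be illustrated with a minimal sketch. The function below is a hypothetical helper, not the package's own code: it assumes the standard SAMTOOLS MPILEUP base string encoding ('.' and ',' for reference matches, letters for mismatches, '^' plus one mapping quality character for a read start, '$' for a read end, '+N'/'-N' followed by N bases for indels, '*' for a deleted base) and reduces one raw string to a consensus call and a short count summary. The exact text format of the package's own summary may differ.

```r
# Hypothetical helper (not part of the package): summarize one raw
# MPILEUP base string into A/C/G/T/Indel counts and a consensus call.
summarizePileupString <- function(calls, refBase) {
  counts <- c(A = 0L, C = 0L, G = 0L, T = 0L, Indel = 0L)
  chars <- strsplit(calls, "")[[1]]
  refBase <- toupper(refBase)
  n <- length(chars)
  i <- 1L
  while (i <= n) {
    ch <- chars[i]
    if (ch == "^") {
      # read start marker is followed by one mapping quality character
      i <- i + 2L
    } else if (ch %in% c("+", "-")) {
      # indel: sign, then a length in digits, then that many bases
      j <- i + 1L
      while (j <= n && grepl("[0-9]", chars[j])) j <- j + 1L
      len <- if (j > i + 1L) {
        as.integer(paste(chars[(i + 1L):(j - 1L)], collapse = ""))
      } else 0L
      counts["Indel"] <- counts["Indel"] + 1L
      i <- j + len
    } else {
      base <- if (ch %in% c(".", ",")) refBase
              else if (ch == "*") "Indel"
              else toupper(ch)
      # anything else ('$', 'N', '<', '>') is ignored in this sketch
      if (base %in% names(counts)) counts[base] <- counts[base] + 1L
      i <- i + 1L
    }
  }
  sorted <- sort(counts[counts > 0L], decreasing = TRUE)
  list(consensus = if (length(sorted)) names(sorted)[1] else NA_character_,
       table = paste(names(sorted), sorted, sep = ":", collapse = " "))
}

ans <- summarizePileupString("..,,A.a^F.$+2AG", refBase = "G")
```

In this sketch a read that matches the reference and also carries an insertion contributes to both the base count and the Indel count, which is one of several reasonable conventions.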
These functions give two different levels of control for extracting aligned read pileup information from BAM files, via the SAMTOOLS MPILEUP utility. The high level function gives an easy way to extract details for single genes or regions, while the low level function gives finer control.
A data frame of details about read coverage and base depth in the specified region of a chromosome. Only base locations with 1+ read of coverage are returned, so there can be gaps with no information. Columns include:
SEQ_ID |
chromosome name as a character string |
POSITION |
chromosome location as an integer |
REF_BASE |
the expected base call from the reference genome |
DEPTH |
the depth of coverage at this position, as an integer |
CALL_BASE |
when summarize.calls is TRUE, the one consensus base that was most frequently observed. If FALSE, the full cryptic string of base calls as made by MPILEUP |
CALL_SCORE |
only when summarize.calls is FALSE, the full cryptic string of Phred quality scores |
BASE_TABLE |
only when summarize.calls is TRUE, the short sorted table of observed base counts |
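Because only positions with at least one read are returned, downstream code may need to locate the uncovered stretches. The following is a minimal sketch under stated assumptions: the data frame is a mock with invented values that only mimics the documented column names, it covers a single SeqID, and findCoverageGaps is a hypothetical helper rather than a package function. It reports gaps that lie between observed positions only.

```r
# Mock result shaped like the documented return value (values are invented)
pileup <- data.frame(
  SEQ_ID   = "chr1",
  POSITION = c(100L, 101L, 102L, 106L, 107L, 110L),
  REF_BASE = c("A", "C", "G", "T", "A", "C"),
  DEPTH    = c(5L, 6L, 4L, 3L, 3L, 7L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: find runs of positions missing from the result
findCoverageGaps <- function(df) {
  pos <- sort(df$POSITION)
  # a jump larger than 1 between neighbors marks an uncovered stretch
  isGap <- diff(pos) > 1L
  data.frame(GAP_START = pos[-length(pos)][isGap] + 1L,
             GAP_STOP  = pos[-1][isGap] - 1L)
}

gaps <- findCoverageGaps(pileup)
```

Uncovered bases before the first or after the last returned position are not detected here, since the requested start and stop are not part of the returned data frame.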
Note that the high level function operates on a single sample only, while the low level function can be given a vector of BAM files and then returns depth and call details for multiple files at one time.
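A usage sketch of the two calling styles, based only on the signatures shown in the usage section. The sample ID, gene ID, SeqID, coordinates, and file names are hypothetical placeholders, and each call is guarded so that nothing runs unless the function and the input files are actually available.

```r
# Hypothetical placeholders: replace with real IDs and paths
sampleID  <- "Sample1"
bamFiles  <- c("Sample1.bam", "Sample2.bam")
fastaFile <- "genome.fasta"

# High level: one sample, region implied by a GeneID
if (exists("pipe.BAMpileup") && file.exists("Annotation.txt")) {
  geneCalls <- pipe.BAMpileup(sampleID, geneID = "GENE_0001",
                              summarize.calls = TRUE)
}

# Low level: several BAM files at once, with an explicit region
if (exists("BAM.mpileup") && all(file.exists(c(bamFiles, fastaFile)))) {
  regionCalls <- BAM.mpileup(bamFiles, seqID = "chr1", fastaFile = fastaFile,
                             start = 1000, stop = 2000,
                             summarize.calls = TRUE)
}
```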
Bob Morrison
SAMTOOLS www.htslib.org/doc/samtools.html
For various low level functions that manipulate the MPILEUP calls, see MPU.callBases.