processBAM | R Documentation |
These functions call the SpliceWiz C++ routine on one or more BAM files.
The routine is an improved version of the original IRFinder, with
OpenMP-based multi-threading and the production of compact "COV" files to
record alignment coverage. A SpliceWiz reference built using
Build-Reference-methods is required.
After processBAM() is run, users should call collateData to collate
individual outputs into an experiment / dataset.
BAM2COV creates COV files from BAM files without running processBAM().
See details for performance info.
BAM2COV(
bamfiles = "./Unsorted.bam",
sample_names = "sample1",
output_path = "./cov_folder",
n_threads = 1,
useOpenMP = TRUE,
overwrite = FALSE,
verbose = FALSE,
multiRead = FALSE
)
processBAM(
bamfiles = "./Unsorted.bam",
sample_names = "sample1",
reference_path = "./Reference",
output_path = "./SpliceWiz_Output",
n_threads = 1,
useOpenMP = TRUE,
overwrite = FALSE,
run_featureCounts = FALSE,
verbose = FALSE,
skipCOVfiles = FALSE,
multiRead = FALSE
)
bamfiles |
A vector containing file paths of 1 or more BAM files |
sample_names |
The sample names of the given BAM files. Must
be a vector of the same length as |
output_path |
The output directory of this function |
n_threads |
(default |
useOpenMP |
(default |
overwrite |
(default |
verbose |
(default |
multiRead |
(default |
reference_path |
The directory containing the SpliceWiz reference |
run_featureCounts |
(default |
skipCOVfiles |
(default |
A typical run on a BAM file with 100 million paired-end alignments takes 10
minutes using a single core. Using 8 threads, the runtime is approximately
2-5 minutes, depending on your system's file input / output speeds.
Approximately 10 GB of RAM is used when OpenMP is used. If OpenMP
is not used (see below), this memory usage is multiplied by the number
of processor threads (i.e. 40 GB if n_threads = 4).
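The memory rule above can be turned into a quick back-of-the-envelope check before choosing n_threads. The helper below is a hypothetical sketch, not part of SpliceWiz; it assumes the approximate 10 GB figure quoted above applies to your data, which may not hold for every BAM file.

```r
# Hypothetical helper (not a SpliceWiz function): rough peak RAM estimate in GB.
# Assumes ~10 GB per process, as quoted in the text above.
estimate_peak_ram_gb <- function(n_threads, use_openmp = TRUE,
                                 per_process_gb = 10) {
  stopifnot(is.numeric(n_threads), n_threads >= 1)
  if (use_openmp) {
    # OpenMP: threads share one process, so memory stays roughly constant
    per_process_gb
  } else {
    # No OpenMP: one BAM file per thread, so memory scales with n_threads
    per_process_gb * n_threads
  }
}

estimate_peak_ram_gb(4, use_openmp = TRUE)
estimate_peak_ram_gb(4, use_openmp = FALSE)
```

Comparing the second result against the RAM available on the machine gives a rough upper bound on n_threads when OpenMP is unavailable.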
OpenMP is natively available to Linux / Windows compilers, and OpenMP will
be used if useOpenMP is set to TRUE, using multiple threads to process
each BAM file. On Macs, if OpenMP is not available at compilation,
BiocParallel will be used, processing BAM files simultaneously,
with one BAM file per thread.
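As a mental model only, the two modes described above can be pictured as follows. This is an illustrative sketch in base R; plan_batches() is a hypothetical name, and grouping files into fixed batches is an assumption made for illustration, not how SpliceWiz or BiocParallel actually schedule work.

```r
# Illustrative only: how n_threads is spent in each mode (hypothetical helper).
plan_batches <- function(bamfiles, n_threads, openmp_available) {
  if (openmp_available) {
    # OpenMP: files are handled one at a time, each using all threads
    lapply(bamfiles, function(f) list(files = f, threads_per_file = n_threads))
  } else {
    # No OpenMP: up to n_threads files at once, one thread per file
    groups <- split(bamfiles, ceiling(seq_along(bamfiles) / n_threads))
    lapply(unname(groups), function(g) list(files = g, threads_per_file = 1L))
  }
}

bams <- c("a.bam", "b.bam", "c.bam")
str(plan_batches(bams, n_threads = 2, openmp_available = TRUE))
str(plan_batches(bams, n_threads = 2, openmp_available = FALSE))
```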
Output will be saved to output_path. Output files will be named using the
given sample_names.
For processBAM():
sample.txt.gz: The main output file containing the quantitation
of IR and splice junctions, as well as QC information
sample.cov: Contains coverage information in compressed binary. See getCoverage
main.FC.Rds: A single file containing gene counts for the whole dataset
(only if run_featureCounts == TRUE)
For BAM2COV():
sample.cov: Contains coverage information in compressed binary. See getCoverage
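To confirm that a run produced the files listed above, the expected paths can be built from output_path and sample_names and checked with file.exists(). The function below is a hypothetical convenience sketch in base R, not a SpliceWiz function; it assumes "sample" in the file names above is replaced by each entry of sample_names.

```r
# Hypothetical helper: expected output paths for processBAM() or BAM2COV().
# Assumes files are named <sample_name>.txt.gz and <sample_name>.cov.
expected_outputs <- function(output_path, sample_names,
                             cov_only = FALSE, run_featureCounts = FALSE) {
  cov <- file.path(output_path, paste0(sample_names, ".cov"))
  if (cov_only) return(cov)  # BAM2COV() writes COV files only
  out <- c(file.path(output_path, paste0(sample_names, ".txt.gz")), cov)
  if (run_featureCounts) {
    # One gene-count file for the whole dataset
    out <- c(out, file.path(output_path, "main.FC.Rds"))
  }
  out
}

paths <- expected_outputs(file.path(tempdir(), "SpliceWiz_Output"),
                          c("sample1", "sample2"))
not_found <- paths[!file.exists(paths)]
if (length(not_found) > 0) {
  message("Missing outputs: ", paste(basename(not_found), collapse = ", "))
}
```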
BAM2COV(): Converts BAM files to COV files without running processBAM()
processBAM(): Processes BAM files. Requires a SpliceWiz reference
generated by buildRef()
Build-Reference-methods collateData isCOV
# Run BAM2COV, which only produces COV files but does not run `processBAM()`:
bams <- SpliceWiz_example_bams()
BAM2COV(bams$path, bams$sample,
output_path = file.path(tempdir(), "SpliceWiz_Output"),
n_threads = 2, overwrite = TRUE
)
# Run processBAM(), which produces:
# - text output of intron coverage and spliced read counts
# - COV files which record read coverages
example_ref <- file.path(tempdir(), "Reference")
buildRef(
reference_path = example_ref,
fasta = chrZ_genome(),
gtf = chrZ_gtf()
)
bams <- SpliceWiz_example_bams()
processBAM(bams$path, bams$sample,
reference_path = file.path(tempdir(), "Reference"),
output_path = file.path(tempdir(), "SpliceWiz_Output"),
n_threads = 2
)