Computations are distributed in parallel by range. Data subsets are extracted and manipulated (MAP) and optionally combined (REDUCE) across all files.
## S4 method for signature 'GRanges,ANY'
reduceByRange(ranges, files, MAP,
    REDUCE, ..., summarize=FALSE, iterate=TRUE, init)
## S4 method for signature 'GRangesList,ANY'
reduceByRange(ranges, files, MAP,
    REDUCE, ..., summarize=FALSE, iterate=TRUE, init)
## S4 method for signature 'GenomicFiles,missing'
reduceByRange(ranges, files, MAP,
    REDUCE, ..., summarize=FALSE, iterate=TRUE, init)
reduceRanges(ranges, files, MAP, REDUCE, ..., init)
ranges | A A When
files | A
MAP | A function executed on each worker. The signature must contain a minimum of two arguments representing the ranges and files. There is no restriction on argument names and additional arguments can be provided.
REDUCE | An optional function that combines output from the Reduction combines data from a single worker and is always performed as part of the distributed step. When When
iterate | A logical indicating if the Collapsing results iteratively is useful when the number of records to be processed is large (maybe complete files) but the end result is a much reduced representation of all records. Iteratively applying
summarize | A logical indicating if results should be returned as a When
init | An optional initial value for
... | Arguments passed to other methods. Currently not used.
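The difference between iterative and non-iterative reduction can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the "files" are plain numeric vectors, `MAP` here takes a single hypothetical `file` argument, and the loop stands in for what happens on one worker.

```r
## Hypothetical stand-ins: three "files" represented as numeric vectors.
files <- list(c(1, 2, 3), c(10, 20, 30), c(100, 200, 300))

## Toy MAP: transform the content of one file.
MAP <- function(file, ...) file * 2

## REDUCE receives a list of MAP results and adds them element-wise.
REDUCE <- function(mapped, ...) Reduce(`+`, mapped)

## Non-iterative: every MAP result is kept until all are available,
## then REDUCE is called once on the full list.
non_iterative <- REDUCE(lapply(files, MAP))

## Iterative: after each MAP call the running result is collapsed with
## the current one, so at most two results are held at any time.
iterative <- NULL
for (file in files) {
    current <- MAP(file)
    iterative <- if (is.null(iterative)) current
                 else REDUCE(list(iterative, current))
}

## Both strategies give the same answer for an associative REDUCE.
stopifnot(identical(iterative, non_iterative))
```

The iterative form trades one `REDUCE` call per file for a smaller memory footprint, which matters when each MAP result is large.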
reduceByRange extracts, manipulates and combines ranges across different files. Each element of ranges is sent to a worker; this is a single range when ranges is a GRanges and may be multiple ranges when ranges is a GRangesList. MAP is invoked on each range / file combination. This approach allows ranges extracted from multiple files to be kept separate or combined with REDUCE.
In contrast, reduceRanges treats the output of all MAP calls as a group and reduces them together. REDUCE usually plays a minor role by concatenating or unlisting results.
Both MAP and REDUCE are applied in the distributed step ("on the worker"). Results are not combined across workers in the distributed step.
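The two dispatch patterns described above can be mimicked with plain lists in base R. The sketch below is a simplified model, not the GenomicFiles code: the "ranges" are integer index windows, the "files" are named numeric vectors, and `MAP_ALL`, `by_range` and `by_ranges` are hypothetical names introduced for illustration.

```r
## Hypothetical stand-ins for ranges and files.
ranges <- list(r1 = 1:3, r2 = 4:6)
files  <- list(f1 = c(1, 2, 3, 4, 5, 6),
               f2 = c(10, 20, 30, 40, 50, 60))

## MAP takes one range and one file and returns the extracted subset.
MAP <- function(range, file, ...) file[range]

## REDUCE combines the list of MAP results obtained for one range.
REDUCE <- function(mapped, ...) Reduce(`+`, mapped)

## reduceByRange-like pattern: one task per range; MAP runs on each
## range / file combination, then REDUCE combines the per-file results.
by_range <- lapply(ranges, function(range) {
    mapped <- lapply(files, function(file) MAP(range, file))
    REDUCE(mapped)
})

## reduceRanges-like pattern: MAP receives all files at once and must
## handle each one itself; the reduction step only unlists the results.
MAP_ALL <- function(range, files, ...)
    lapply(files, function(file) sum(file[range]))
by_ranges <- lapply(ranges, function(range) unlist(MAP_ALL(range, files)))
```

In the first pattern the per-file results stay separate until `REDUCE` is called; in the second, `MAP_ALL` already sees every file, so the reduction has little left to do.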
reduceByRange: When summarize=FALSE the return value is a list or the value from the final invocation of REDUCE. When summarize=TRUE output is a SummarizedExperiment. When ranges is a GenomicFiles object data from rowRanges, colData and metadata are transferred to the SummarizedExperiment.
reduceRanges: A list or the value returned by the final invocation of REDUCE.
Martin Morgan and Valerie Obenchain
reduceFiles
reduceByFile
GenomicFiles-class
if (all(requireNamespace("RNAseqData.HNRNPC.bam.chr14", quietly=TRUE) &&
        require(GenomicAlignments))) {
    ## -----------------------------------------------------------------------
    ## Compute coverage across BAM files.
    ## -----------------------------------------------------------------------
    fls <-                                ## 8 bam files
        RNAseqData.HNRNPC.bam.chr14::RNAseqData.HNRNPC.bam.chr14_BAMFILES

    ## Regions of interest.
    gr <- GRanges("chr14", IRanges(c(62262735, 63121531, 63980327),
                  width=214700))

    ## The MAP computes the coverage ...
    MAP <- function(range, file, ...) {
        requireNamespace("GenomicFiles", quietly=TRUE)
        ## for coverage(), Rsamtools::ScanBamParam()
        param <- Rsamtools::ScanBamParam(which=range)
        GenomicFiles::coverage(file, param=param)[range]
    }

    ## and REDUCE adds the last and current results.
    REDUCE <- function(mapped, ...)
        Reduce("+", mapped)

    ## -----------------------------------------------------------------------
    ## reduceByRange:
    ## With no REDUCE, coverage is computed for each range / file combination.
    cov1 <- reduceByRange(gr, fls, MAP)
    cov1[[1]]

    ## Each call to coverage() produces an RleList; these accumulate on the
    ## workers. We can use a reducer to combine these lists either iteratively
    ## or non-iteratively. When iterate = TRUE the current result is collapsed
    ## with the last, resulting in a maximum of 2 RleLists on a worker at any
    ## given time.
    cov2 <- reduceByRange(gr, fls, MAP, REDUCE, iterate=TRUE)
    cov2[[1]]

    ## If memory use is not a concern (or if MAP output is not large) the
    ## REDUCE function can be applied non-iteratively.
    cov3 <- reduceByRange(gr, fls, MAP, REDUCE, iterate=FALSE)
    ## Results match those obtained with the iterative REDUCE.
    cov3[[1]]

    ## When 'ranges' is a GRangesList, the list elements are sent to the
    ## workers instead of a single range as in the case of a GRanges.
    grl <- GRangesList(gr[1], gr[2:3])
    grl
    cov4 <- reduceByRange(grl, fls, MAP)
    length(cov4)          ## length of GRangesList
    elementNROWS(cov4)    ## number of files

    ## -----------------------------------------------------------------------
    ## reduceRanges:
    ## This function passes the character vector of all file names to MAP.
    ## MAP must handle each file separately or invoke a method that operates
    ## on a list of files.
    ## TODO: example
}