bpvec: Parallel, vectorized evaluation
In Bioconductor/BiocParallel: Bioconductor facilities for parallel evaluation

Description

bpvec applies FUN to subsets of X. Any type of object X is allowed, provided length and [ are defined on X. FUN is a function such that length(FUN(X)) == length(X). The objects returned by FUN are concatenated by AGGREGATE (c() by default). The return value is equivalent to FUN(X).

Usage

bpvec(X, FUN, ..., AGGREGATE=c, BPREDO=list(), BPPARAM=bpparam(), BPOPTIONS = bpoptions())

Arguments

`X`	Any object for which methods `length` and `[` are implemented.
`FUN`	A function to be applied to subsets of `X`. The relationship between `X` and `FUN(X)` is 1:1, so that `length(FUN(X, ...)) == length(X)`. The return values of separate calls to `FUN` are concatenated with `AGGREGATE`.
`...`	Additional arguments for `FUN`.
`AGGREGATE`	A function taking any number of arguments `...`, called to reduce results (elements of the `...` argument of `AGGREGATE`) from parallel jobs. The default, `c`, concatenates objects and is appropriate for vectors; `rbind` might be appropriate for data frames.
`BPPARAM`	An optional `BiocParallelParam` instance determining the parallel back-end to be used during evaluation, or a `list` of `BiocParallelParam` instances, to be applied in sequence for nested calls to BiocParallel functions.
`BPREDO`	A `list` of output from `bpvec` with one or more failed elements. When a list is given in `BPREDO`, `bpok` is used to identify errors; the failed tasks are rerun and their results are inserted into the original results.
`BPOPTIONS`	Additional options to control the behavior of the parallel evaluation, see `bpoptions`.

Details

This method creates a vector of indices for X that divide the elements as evenly as possible given the number of bpworkers() and bptasks() of BPPARAM. Indices and data are passed to bplapply for parallel evaluation.
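The chunking idea can be sketched in base R. This is only an illustration under stated assumptions: `chunk_indices` is a hypothetical helper, not a BiocParallel function, and the package's own splitting rule may place chunk boundaries differently. Plain `lapply` stands in for `bplapply`.

```r
# Hypothetical helper (not part of BiocParallel): split the positions 1..n
# into at most `n_chunks` contiguous index vectors of nearly equal size.
chunk_indices <- function(n, n_chunks) {
    n_chunks <- max(1L, min(n_chunks, n))
    # Assign each position a chunk id in 1..n_chunks, as evenly as possible
    chunk_id <- ceiling(seq_len(n) * n_chunks / n)
    unname(split(seq_len(n), chunk_id))
}

X <- 1:10
idx <- chunk_indices(length(X), 3)   # assume 3 workers for illustration

# Call the vectorized function once per chunk (lapply stands in for
# bplapply here), then combine the pieces with the default AGGREGATE, c()
pieces <- lapply(idx, function(i) sqrt(X[i]))
result <- do.call(c, pieces)
```

Because `sqrt` is vectorized and the chunks are contiguous and in order, `result` matches `sqrt(X)` however the positions are divided.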

The distinction between bpvec and bplapply is that bplapply applies FUN to each element of X separately, whereas bpvec assumes the function is vectorized, e.g., c(FUN(x[1]), FUN(x[2])) is equivalent to FUN(x[1:2]). This approach can be more efficient than bplapply, but it requires that FUN take a vector input and create a vector output of the same length as the input, and that the output not depend on how the vector is partitioned. This behavior is consistent with parallel::pvec; consult the ?pvec man page for further details.
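A minimal sketch of the contrast, assuming BiocParallel is installed. `SerialParam()` is used here only so the example behaves the same on any machine; the variable names are illustrative.

```r
library(BiocParallel)

x <- c(1, 4, 9, 16)
param <- SerialParam()   # serial back-end, chosen only for reproducibility

# bplapply: FUN is called once per element of x; the result is a list
per_element <- bplapply(x, sqrt, BPPARAM = param)

# bpvec: FUN is called once per chunk of x; chunk results are combined
# with AGGREGATE (c by default), giving a vector
per_chunk <- bpvec(x, sqrt, BPPARAM = param)

# For a vectorized FUN such as sqrt, both routes should agree with sqrt(x)
all.equal(unlist(per_element), sqrt(x))
all.equal(per_chunk, sqrt(x))
```

With a parallel back-end, bpvec would typically make far fewer calls to FUN than bplapply for a long `x`, which is where the efficiency gain comes from.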

Value

The result should be identical to FUN(X, ...) (assuming that AGGREGATE is set appropriately).

When evaluation of individual elements of X results in an error, the result is a list with the same geometry (i.e., lengths()) as the split applied to X to create chunks for parallel evaluation; one or more elements of the list contain a bperror element, indicating that the vectorized calculation failed for at least one of the index values in that chunk.

An error is also signaled when FUN(X) does not return an object of the same length as X; this condition is only detected when the number of elements in X is greater than the number of workers.

Author(s)

Martin Morgan <mtmorgan@fhcrc.org>

Examples

methods("bpvec")

## ten tasks (1:10), called with as many back-end elements as are specified
## by BPPARAM.  Compare with bplapply
fun <- function(v) {
    message("working")
    sqrt(v)
}
system.time(result <- bpvec(1:10, fun))
result

## invalid FUN -- length(class(X)) is not equal to length(X)
bptry(bpvec(1:2, class, BPPARAM=SerialParam()))

Bioconductor/BiocParallel documentation built on June 2, 2025, 7:17 a.m.