DeveloperInterface: Developer interface

DeveloperInterface    R Documentation

Developer interface

Description

Functions documented on this page are meant for developers wishing to implement BPPARAM objects that extend the BiocParallelParam virtual class to support additional parallel back-ends.

Usage

## class extension

.prototype_update(prototype, ...)

## manager interface

.send_to(backend, node, value)
.recv_any(backend)
.send_all(backend, value)
.recv_all(backend)

## worker interface

.send(worker, value)
.recv(worker)
.close(worker)

## task manager interface (optional)
.manager(BPPARAM)
.manager_send(manager, value, ...)
.manager_recv(manager)
.manager_send_all(manager, value)
.manager_recv_all(manager)
.manager_capacity(manager)
.manager_flush(manager)
.manager_cleanup(manager)

## supporting implementations

.bpstart_impl(x)
.bpworker_impl(worker)
.bplapply_impl(
    X, FUN, ..., BPREDO = list(),
    BPPARAM = bpparam(), BPOPTIONS = bpoptions()
)
.bpiterate_impl(
    ITER, FUN, ..., REDUCE, init, reduce.in.order = FALSE, BPREDO = list(),
    BPPARAM = bpparam(), BPOPTIONS = bpoptions()
)
.bpstop_impl(x)


## extract the static or dynamic part from a task
.task_const(value)
.task_dynamic(value)
.task_remake(value, static_data = NULL)

## Register an option for BPPARAM
.registerOption(optionName, genericName)

Arguments

prototype

A named list of default values for reference class fields.

x

A BPPARAM instance.

backend

An object containing information about the cluster, returned by bpbackend(<BPPARAM>).

manager

An object returned by .manager().

worker

The object with which the worker communicates via .send() and .recv(). .close() terminates the worker.

node

An integer value indicating the node in the backend to which values are sent, or from which they are received.

value

Any R object, to be sent to or from workers.

X, ITER, FUN, REDUCE, init, reduce.in.order, BPREDO, BPPARAM

See bplapply and bpiterate.

...

For .prototype_update(), name-value pairs to initialize derived and base class fields.

For .bplapply_impl(), .bpiterate_impl(), additional arguments to FUN(); see bplapply and bpiterate.

For .manager_send(), this is a placeholder for future development.

static_data

An object extracted by .task_const(value).

BPOPTIONS

Additional options to control the behavior of parallel evaluation; see bpoptions.

optionName

character(1), an option name for BPPARAM. The named option will be created by bpoptions.

genericName

character(1), the name of the S4 generic function. This will be used to get or set the field in BPPARAM. The generic must also have a replacement method, defined with setReplaceMethod().

Details

Start a BPPARAM implementation by creating a reference class, e.g., extending the virtual class BiocParallelParam. Because of idiosyncrasies in reference class field initialization, an instance of the class should be created by calling the generator returned by setRefClass() with a list of key-value pairs providing default parameter arguments. The default values for the BiocParallelParam base class are provided in the list .BiocParallelParam_prototype, and the function .prototype_update() updates a prototype with new values, typically provided by the user. See the Examples section.
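The list-level behavior of .prototype_update() can be seen in isolation. The following is a minimal sketch, assuming BiocParallel is installed; it overrides threshold, the same base-class field used in the Examples section, and leaves every other default in place.

```r
library(BiocParallel)

## Start from the exported base-class defaults and override one value.
## `threshold` is a base-class field, as used in the Examples section.
updated <- .prototype_update(.BiocParallelParam_prototype, threshold = "WARN")

## The result is a named list with the same field names as the prototype
is.list(updated)
setequal(names(updated), names(.BiocParallelParam_prototype))
updated$threshold
```

The updated list is what a constructor would hand to the generator with do.call().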

BPPARAM implementations need to implement bpstart() and bpstop() methods; they may also need to implement bplapply() and bpiterate() methods. Each method usually performs implementation-specific functionality before calling the next (BiocParallelParam) method. To avoid the intricacies of multiple dispatch, the bodies of BiocParallelParam methods are available for direct use as exported symbols.

  • bpstart,BiocParallelParam-method (.bpstart_impl()) initiates logging, random number generation, and registration of finalizers to ensure that started clusters are stopped.

  • bpstop,BiocParallelParam-method (.bpstop_impl()) ensures appropriate clean-up of stopped clusters, including sending the DONE semaphore. bpstart() will usually arrange for workers to enter .bpworker_impl() to listen for and evaluate tasks.

  • bplapply,ANY,BiocParallelParam-method and bpiterate,ANY,BiocParallelParam-method (.bplapply_impl(), .bpiterate_impl()) implement serial evaluation when there is a single core or task available, BPREDO functionality, and parallel lapply-like or iterative calculation.

Invoke .bpstart_impl(), .bpstop_impl(), .bplapply_impl(), and .bpiterate_impl() after any BPPARAM-specific implementation details.

New implementations will also implement bpisup() and bpbackend() / bpbackend<-(); there are no default methods.
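The method layout described above might be sketched as follows. This is an outline only, assuming BiocParallel is installed: the class SketchParam and its nodes field are hypothetical, the backend-specific steps are left as comments, and the methods are defined but never invoked, so no workers are started.

```r
library(BiocParallel)

## Hypothetical parameter class used only for this sketch
.SketchParam <- setRefClass(
    "SketchParam",
    contains = "BiocParallelParam",
    fields = list(nodes = "list")   # hypothetical storage for the backend
)

## There are no default methods for these, so the new class supplies them
setMethod("bpisup", "SketchParam", function(x) length(x$nodes) > 0L)
setMethod("bpbackend", "SketchParam", function(x) x$nodes)

## Backend-specific work first, then the exported base implementation
setMethod("bpstart", "SketchParam", function(x, ...) {
    ## (assumption) launch workers and store them in x$nodes here
    .bpstart_impl(x)
})

setMethod("bpstop", "SketchParam", function(x) {
    ## (assumption) backend-specific shutdown steps go here
    .bpstop_impl(x)
})
```

A complete implementation would also need a bpbackend<-() method and a backend that supports the communication functions described in this section.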

The backends (object returned by bpbackend()) of new BPPARAM implementations must support length() (number of nodes). In addition, the backends must support .send_to() and .recv_any() manager and .send(), .recv(), and .close() worker methods. Default .send_all() and .recv_all() methods are implemented as simple iterations along the length(cluster), invoking .send_to() or .recv_any() on each iteration.
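The "simple iteration" behind the default .send_all() can be illustrated without BiocParallel. The sketch below uses base R only; every name in it (toy_backend, toy_send_to, toy_send_all) is hypothetical and merely mimics the shape of the interface, with an in-memory list standing in for real worker connections.

```r
## Toy in-memory backend: one "mailbox" (a list) per node.
## All names here are hypothetical and not part of BiocParallel.
toy_backend <- function(n) {
    structure(replicate(n, list(), simplify = FALSE), class = "toy_backend")
}

## Deliver `value` to one node and return the (updated) backend
toy_send_to <- function(backend, node, value) {
    backend[[node]] <- c(backend[[node]], list(value))
    backend
}

## Iterate along length(backend), calling toy_send_to() once per node
toy_send_all <- function(backend, value) {
    for (node in seq_len(length(backend)))
        backend <- toy_send_to(backend, node, value)
    backend
}

backend <- toy_send_all(toy_backend(3), "hello")

## Each of the three mailboxes now holds one message
vapply(unclass(backend), length, integer(1))
```

Returning the backend from each call mirrors the endomorphic convention noted in the Value section.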

The task manager is an optional interface for a backend that wants to control the task dispatching process. .manager_send() sends the task value to a worker; .manager_recv() returns a list in which each element is a result received from a worker. .manager_capacity() reports how many task values can be processed simultaneously by the cluster. .manager_flush() immediately flushes any cached tasks. .manager_cleanup() performs cleanup after the job is finished. The default methods for .manager_flush() and .manager_cleanup() are no-ops.

In some cases it can be worthwhile to cache objects from one task and reuse them in another. This reduces the bandwidth required to send tasks to the workers. .task_const() extracts the objects in a task that do not change across tasks. .task_dynamic() preserves only the dynamic components of a task. Given the static and dynamic task objects, .task_remake() recovers the complete task. When no static data can be extracted (e.g., a task with no static component, or a task that has already been processed by .task_dynamic()), .task_const() simply returns NULL. Calling .task_remake() is a no-op if the task has not been processed by .task_dynamic() or if the static data is NULL.
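The split-and-rebuild contract can be mimicked with plain lists. The sketch below is conceptual and uses base R only: the toy_* functions and the const / dynamic layout are hypothetical stand-ins, and BiocParallel's real task objects are structured differently.

```r
## Hypothetical stand-ins; BiocParallel's real task objects differ.
toy_task <- function(fun, args, index) {
    list(const = list(fun = fun, args = args),
         dynamic = list(index = index))
}

## Static part of a task; NULL once the task has been stripped
toy_task_const <- function(task) task$const

## Keep only the dynamic part
toy_task_dynamic <- function(task) {
    task$const <- NULL
    task
}

## Rebuild the full task; a no-op when there is nothing to restore
toy_task_remake <- function(task, static_data = NULL) {
    if (is.null(static_data) || !is.null(task$const))
        return(task)
    c(list(const = static_data), task)
}

task   <- toy_task(sqrt, list(), 4L)
static <- toy_task_const(task)     # cache once per worker
slim   <- toy_task_dynamic(task)   # send this for each task
remade <- toy_task_remake(slim, static)
identical(remade, task)
```

Only the slim object needs to travel for each task; the static part can be sent once and cached on the worker.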

The function .registerOption() allows the developer to register a generic function that can change the fields in BPPARAM. The developer does not need to register fields that are already defined in BiocParallel; .registerOption() should only be used to support new fields. For example, if the developer defines a BPPARAM with a field worker.password, the developer may also define the getter / setter bpworkerPassword and bpworkerPassword<-. Then, after calling .registerOption("worker.password", "bpworkerPassword"), the user can change the field in BPPARAM by passing BPOPTIONS = bpoptions(worker.password = "1234") to any apply function.
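The worker.password scenario might be wired up roughly as follows. This is a sketch, assuming BiocParallel is installed; the class PwParam is hypothetical, and the code only defines and registers the accessor pair without running a parallel job.

```r
library(BiocParallel)

## Hypothetical BPPARAM subclass with one new field
.PwParam <- setRefClass(
    "PwParam",
    contains = "BiocParallelParam",
    fields = list(worker.password = "character")
)

## Getter / setter generics and methods for the new field
setGeneric("bpworkerPassword",
    function(x) standardGeneric("bpworkerPassword"))
setGeneric("bpworkerPassword<-",
    function(x, value) standardGeneric("bpworkerPassword<-"))
setMethod("bpworkerPassword", "PwParam", function(x) x$worker.password)
setReplaceMethod("bpworkerPassword", "PwParam", function(x, value) {
    x$worker.password <- value
    x
})

## Tie the option name to the generic so bpoptions() can carry it
.registerOption("worker.password", "bpworkerPassword")

## Construct from a prototype, then exercise the accessor pair directly
p <- do.call(.PwParam, c(
    list(worker.password = "unset"),
    .BiocParallelParam_prototype
))
bpworkerPassword(p) <- "1234"
bpworkerPassword(p)
```

The replacement method is what lets a registered option set the field during an apply call.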

Value

The return value of .prototype_update() is a list with elements in prototype substituted with key-value pairs provided in ....

All send* and recv* functions are endomorphic, returning a cluster object.

The return value of .manager_recv() is a list in which each element is a result received from a worker; the return value of .manager_capacity() is an integer. The return values of the other .manager_*() functions are not restricted.

Examples


##
## Extend BiocParallelParam; `.A()` is not meant for the end user
##

.A <- setRefClass(
    "A",
    contains = "BiocParallelParam",
    fields = list(id = "character")
)

## Use a prototype for default values, including the prototype for
## inherited fields

.A_prototype <- c(
    list(id = "default_id"),
    .BiocParallelParam_prototype
)

## Provide a constructor for the user

A <- function(...) {
    prototype <- .prototype_update(.A_prototype, ...)
    do.call(.A, prototype)
}

## Provide an R function for field access

bpid <- function(x)
    x$id

## Create and use an instance, overwriting default values

bpid(A())

a <- A(id = "my_id", threshold = "WARN")
bpid(a)
bpthreshold(a)


Bioconductor/BiocParallel documentation built on Oct. 31, 2024, 6:58 a.m.