SerialParam-class: Enable serial evaluation

SerialParam-class    R Documentation

Enable serial evaluation

Description

This class is used to parameterize serial evaluation, primarily to facilitate easy transition from parallel to serial code.

Usage

SerialParam(
    stop.on.error = TRUE,
    progressbar = FALSE,
    RNGseed = NULL,
    timeout = WORKER_TIMEOUT,
    log = FALSE,
    threshold = "INFO",
    logdir = NA_character_,
    resultdir = NA_character_,
    jobname = "BPJOB",
    force.GC = FALSE
)

Arguments

stop.on.error

logical(1) Enable stop on error.

progressbar

logical(1) Enable progress bar (based on plyr:::progress_text).

RNGseed

integer(1) Seed for random number generation. The seed is used to set a new, independent random number stream for each element of X. The ith element receives the same stream seed, regardless of use of SerialParam(), SnowParam(), or MulticoreParam(), and regardless of worker or task number. When RNGseed = NULL, a random seed is used.

timeout

numeric(1) Time (in seconds) allowed for a worker to complete a task. This value is passed to base::setTimeLimit() as both the cpu and elapsed arguments. If the computation exceeds timeout, an error is thrown with message 'reached elapsed time limit'.

log

logical(1) Enable logging.

threshold

character(1) Logging threshold as defined in futile.logger.

logdir

character(1) Log files directory. When not provided, log messages are returned to stdout.

resultdir

character(1) Job results directory. When not provided, results are returned as an R object (list) to the workspace.

jobname

character(1) Job name that is prepended to log and result files. Default is "BPJOB".

force.GC

logical(1) Whether to invoke the garbage collector after each call to FUN. The default (FALSE, do not explicitly call the garbage collection) rarely needs to be changed.

Details

SerialParam is used for serial computation on a single node. Using SerialParam in conjunction with bplapply differs from use of lapply because it provides features such as error handling, logging, and random number use consistent with SnowParam and MulticoreParam.

error handling:

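With stop.on.error = TRUE (the default shown in Usage), evaluation stops as soon as one job fails. With stop.on.error = FALSE, all computations are attempted and partial results are returned with any error messages.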

  • stop.on.error A logical. When TRUE, stop all jobs as soon as one job fails. When FALSE, wait for all jobs to terminate; the return value is then a list of successful results along with error messages as 'conditions'.

  • The bpok(x) function returns a logical() vector that is FALSE for any jobs that threw an error. The input x is a list output from a bp*apply function such as bplapply or bpmapply.
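A minimal sketch of the error-handling behaviour described above. The task function fun is hypothetical, and wrapping the call in bptry() rests on the assumption that bplapply() signals an error whenever any task fails, so the partial results have to be recovered from that error.

```r
library(BiocParallel)

# Attempt every task even if some of them fail.
p <- SerialParam(stop.on.error = FALSE)

# Hypothetical task function: fails on the second element only.
fun <- function(i) {
    if (i == 2) stop("task 2 failed")
    sqrt(i)
}

# bptry() is assumed here to turn the error signalled by bplapply()
# into the list of partial results (values and error conditions).
res <- bptry(bplapply(1:3, fun, BPPARAM = p))

ok <- bpok(res)   # FALSE for tasks that threw an error
res[ok]           # successful results only
res[!ok]          # error conditions for the failed tasks
```

Subsetting by bpok() keeps the successful values and the failures separate, which is usually the first step before re-running only the failed tasks.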

logging:

When log = TRUE the futile.logger package is loaded on the workers. All log messages written in the futile.logger format are captured by the logging mechanism and returned in real time (i.e., as each task completes) instead of after all jobs have finished.

Messages sent to stdout and stderr are returned to the workspace by default. When log = TRUE these are diverted to the log output. Those familiar with the outfile argument to makeCluster can think of log = FALSE as equivalent to outfile = NULL; providing a logdir is the same as providing a name for outfile except that BiocParallel writes a log file for each task.

The log output includes additional statistics such as memory use and task runtime. Memory use is computed by calling gc(reset=TRUE) before code evaluation and gc() (no reset) after. The output of the second gc() call is sent to the log file.
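As an illustration of the logging options, the following sketch enables logging with a "DEBUG" threshold and emits one futile.logger message per task. The task function and the message text are made up for the example; flog.info() is the standard futile.logger call for INFO-level messages.

```r
library(BiocParallel)

# Enable logging and lower the threshold so INFO messages are kept.
p <- SerialParam(log = TRUE, threshold = "DEBUG")

# Hypothetical task function that writes a log message before computing.
fun <- function(i) {
    futile.logger::flog.info("processing element %d", i)
    sqrt(i)
}

res <- bplapply(1:2, fun, BPPARAM = p)

bplog(p)        # logging flag stored on the param
bpthreshold(p)  # logging threshold stored on the param
```

Because no logdir is given, the log messages are printed rather than written to files.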

log and result files:

Results and logs can be written to a file instead of returned to the workspace. Writing to files is done from the master as each task completes. Options can be set with the logdir and resultdir fields in the constructor or with the accessors, bplogdir and bpresultdir.
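The constructor fields and accessors mentioned above can be sketched as follows. The directory names and the job name "demo" are placeholders, and the example assumes that the directories must already exist and that log = TRUE is required whenever a logdir is supplied.

```r
library(BiocParallel)

# Placeholder directories under the session's temporary directory.
logs    <- file.path(tempdir(), "bp-logs")
results <- file.path(tempdir(), "bp-results")
dir.create(logs, showWarnings = FALSE)
dir.create(results, showWarnings = FALSE)

# Assumption: log = TRUE is needed for logdir to be accepted.
p <- SerialParam(log = TRUE, logdir = logs,
                 resultdir = results, jobname = "demo")

bplogdir(p)     # directory that receives log files
bpresultdir(p)  # directory that receives result files
bpjobname(p)    # prefix prepended to log and result file names
```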

random number generation:

For MulticoreParam, SnowParam, and SerialParam, random number generation is controlled through the RNGseed = argument. BiocParallel uses the L'Ecuyer-CMRG random number generator described in the parallel package to generate independent random number streams. One stream is associated with each element of X, and used to seed the random number stream for the application of FUN() to X[[i]]. Thus setting RNGseed = ensures reproducibility across MulticoreParam(), SnowParam(), and SerialParam(), regardless of worker or task number. The default value RNGseed = NULL means that each evaluation of bplapply proceeds independently.

For details of the L'Ecuyer generator, see ?clusterSetRNGStream.
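A short sketch of the reproducibility claim: two separate bplapply() calls, each with a freshly constructed SerialParam using the same RNGseed, are expected to return the same random draws. The seed value 123 and the use of runif() are arbitrary choices for the example.

```r
library(BiocParallel)

# Draw one random number per element of X.
draw <- function(i) runif(1)

# Fresh param objects with the same seed for each call.
r1 <- bplapply(1:3, draw, BPPARAM = SerialParam(RNGseed = 123))
r2 <- bplapply(1:3, draw, BPPARAM = SerialParam(RNGseed = 123))

identical(r1, r2)  # the two runs should agree element by element
```

According to the description above, the same seed should give the same per-element streams under SnowParam() or MulticoreParam() as well.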

Constructor

SerialParam():

Return an object to be used for serial evaluation of otherwise parallel functions such as bplapply and bpvec.

Methods

The following generics are implemented and perform as documented on the corresponding help page (e.g., ?bpworkers): bpworkers. bpisup, bpstart, and bpstop are implemented but do not have any side effects.

Author(s)

Martin Morgan <mtmorgan@fhcrc.org>

See Also

getClass("BiocParallelParam") for additional parameter classes.

register for registering parameter classes for use in parallel evaluation.

Examples

p <- SerialParam()
simplify2array(bplapply(1:10, sqrt, BPPARAM=p))
bpvec(1:10, sqrt, BPPARAM=p)

Bioconductor/BiocParallel documentation built on Oct. 31, 2024, 6:58 a.m.