furrr_options: Options to fine tune furrr

View source: R/furrr-options.R

Description

These options fine tune furrr functions, such as future_map(). They are either used by furrr directly, or are passed on to future::future().

Usage

furrr_options(
  ...,
  stdout = TRUE,
  conditions = NULL,
  globals = TRUE,
  packages = NULL,
  lazy = FALSE,
  seed = FALSE,
  scheduling = 1,
  chunk_size = NULL,
  prefix = NULL
)

Arguments

...

These dots are reserved for future extensibility and must be empty.

stdout

A logical.

  • If TRUE, standard output of the underlying futures is captured and relayed as soon as possible.

  • If FALSE, output is silenced by sinking it to the null device.

  • If NA, output is not intercepted. This is not recommended.

conditions

A character vector of condition classes to be captured and relayed. The default is the same as the conditions argument of future::Future(). To not intercept conditions, use conditions = character(0L). Errors are always relayed.

globals

A logical, a character vector, a named list, or NULL for controlling how globals are handled. For details, see the Global variables section below.

packages

A character vector, or NULL. If supplied, this specifies packages that are guaranteed to be attached in the R environment where the future is evaluated.

lazy

A logical. Specifies whether futures should be resolved lazily or eagerly.

seed

A logical, an integer of length 1 or 7, a list of length(.x) with pre-generated random seeds, or NULL. For details, see the Reproducible random number generation (RNG) section below.

scheduling

A single integer, logical, or Inf. This argument controls the average number of futures ("chunks") per worker.

  • If 0, then a single future is used to process all elements of .x.

  • If 1 or TRUE, then one future per worker is used.

  • If 2, then each worker will process two futures (provided there are enough elements in .x).

  • If Inf or FALSE, then one future per element of .x is used.

This argument is only used if chunk_size is NULL.

chunk_size

A single integer, Inf, or NULL. This argument controls the average number of elements per future ("chunk"). If Inf, then all elements are processed in a single future. If NULL, then scheduling is used instead to determine how .x is chunked.
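As a rough illustration of how scheduling and chunk_size interact, the sketch below runs the same computation under three chunking strategies. The multisession plan with two workers and the toy input x are assumptions made for the example; the results should not depend on how .x is chunked.

```r
library(furrr)  # also attaches future, which provides plan()

# Two background R sessions act as workers (an arbitrary choice for illustration)
plan(multisession, workers = 2)

x <- 1:8

# Default: scheduling = 1, so one future ("chunk") per worker
opts_default <- furrr_options()

# One future per element of .x
opts_per_element <- furrr_options(scheduling = Inf)

# About four elements per future; scheduling is not used because chunk_size is set
opts_chunked <- furrr_options(chunk_size = 4L)

res_default     <- future_map_dbl(x, ~ .x^2, .options = opts_default)
res_per_element <- future_map_dbl(x, ~ .x^2, .options = opts_per_element)
res_chunked     <- future_map_dbl(x, ~ .x^2, .options = opts_chunked)

# Return to sequential processing and shut down the workers
plan(sequential)
```

Fewer, larger chunks reduce per-future overhead, while more, smaller chunks can balance load better when iterations take uneven time.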

prefix

A single character string, or NULL. If a character string, then each future is assigned a label as {prefix}-{chunk-id}. If NULL, no labels are used.

Global variables

globals controls how globals are identified, similar to the globals argument of future::future(). Since all function calls use the same set of globals, furrr gathers globals upfront (once), which is more efficient than gathering them for each future independently.
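A minimal sketch of naming a global explicitly with a character vector, rather than relying on automatic detection. The variable offset and the two-worker multisession plan are hypothetical choices for the example, and the sketch assumes the character-vector form is looked up by name in the calling environment, as with future::future().

```r
library(furrr)  # also attaches future, which provides plan()

plan(multisession, workers = 2)

# A variable defined in the calling environment that .f refers to
offset <- 10

# Name the global explicitly instead of using the default globals = TRUE
opts <- furrr_options(globals = "offset")

shifted <- future_map_dbl(1:3, ~ .x + offset, .options = opts)

plan(sequential)
```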

Reproducible random number generation (RNG)

Unless seed = FALSE, furrr functions are guaranteed to generate the exact same sequence of random numbers given the same initial seed / RNG state regardless of the type of futures and scheduling ("chunking") strategy.

Setting seed = NULL is equivalent to seed = FALSE, except that the future.rng.onMisuse option is not consulted to potentially monitor the future for faulty random number usage. See the seed argument of future::future() for more details.

RNG reproducibility is achieved by pre-generating the random seeds for all iterations (over .x) by using L'Ecuyer-CMRG RNG streams. In each iteration, these seeds are set before calling .f(.x[[i]], ...). Note that for large length(.x), this may introduce a large overhead.

A fixed seed may be given as an integer vector, either as a full L'Ecuyer-CMRG RNG seed of length 7, or as a seed of length 1 that will be used to generate a full L'Ecuyer-CMRG seed.
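The reproducibility guarantee can be checked with a short sketch like the following, which draws random numbers under two different chunking strategies using the same length-1 seed. The helper draw(), the seed value 123, and the two-worker multisession plan are assumptions made for illustration.

```r
library(furrr)  # also attaches future, which provides plan()

plan(multisession, workers = 2)

# Hypothetical helper: draw one normal deviate per element with a fixed seed
draw <- function(scheduling) {
  opts <- furrr_options(seed = 123, scheduling = scheduling)
  future_map_dbl(1:4, ~ rnorm(1), .options = opts)
}

a <- draw(1)    # one future per worker
b <- draw(Inf)  # one future per element

# Per the guarantee above, both runs should yield the same numbers
same <- identical(a, b)

plan(sequential)
```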

If seed = TRUE, then the current .Random.seed is used if it holds a L'Ecuyer-CMRG RNG seed; otherwise, one is created randomly.

If seed = NA, a L'Ecuyer-CMRG RNG seed is randomly created.

If none of the function calls .f(.x[[i]], ...) use random number generation, then seed = FALSE may be used.

In addition to the above, it is possible to specify a pre-generated sequence of RNG seeds as a list such that length(seed) == length(.x) and where each element is an integer seed that can be assigned to .Random.seed. Use this alternative with caution. Note that as.list(seq_along(.x)) is not a valid set of such .Random.seed values.
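One way to build such a list, sketched with base R only, is to advance a L'Ecuyer-CMRG seed one stream at a time with parallel::nextRNGStream(). This is an illustration of what a valid list looks like, not necessarily how furrr derives its seeds internally; the input x and the starting seed 42 are arbitrary.

```r
# Switch to L'Ecuyer-CMRG, remembering the previous RNG kind for later
old_kind <- RNGkind("L'Ecuyer-CMRG")
set.seed(42)

x <- 1:3

# Each element is a full length-7 integer seed, one independent stream per element of x
seeds <- vector("list", length(x))
s <- .Random.seed
for (i in seq_along(x)) {
  seeds[[i]] <- s
  s <- parallel::nextRNGStream(s)
}

# Restore the previous RNG kind
RNGkind(old_kind[1])

# By contrast, as.list(seq_along(x)) holds length-1 integers,
# which are not valid .Random.seed values
not_seeds <- as.list(seq_along(x))
```

A list like seeds could then be passed as furrr_options(seed = seeds) when mapping over x.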

In all cases but seed = FALSE, after a furrr function returns, the RNG state of the calling R process is guaranteed to be "forwarded one step" from the RNG state before the call. This is true regardless of the future strategy / scheduling used. This guarantees that an R script calling future_map() multiple times is numerically reproducible given the same initial seed.

Examples

Example output

Loading required package: future
<furrr_options>

furrr documentation built on Jan. 29, 2021, 5:08 p.m.