View source: R/furrr-options.R
| furrr_options | R Documentation |
furrr_options() returns an object that can be supplied as the .options
argument for furrr functions, such as future_map(). The options are either
used by furrr directly, or are passed on to future::future().
furrr_options(
  ...,
  stdout = TRUE,
  conditions = "condition",
  globals = TRUE,
  packages = NULL,
  seed = FALSE,
  scheduling = 1,
  chunk_size = NULL,
  prefix = NULL
)
... |
These dots are reserved for future extensibility and must be empty. |
stdout |
A logical.
|
conditions |
A character string of conditions classes to be relayed.
The default is to relay all conditions, including messages and warnings.
Errors are always relayed. To not relay any conditions (besides errors),
use |
globals |
A logical, a character vector, a named list, or |
packages |
A character vector, or |
seed |
A logical, an integer of length |
scheduling |
A single integer, logical, or
This argument is only used if |
chunk_size |
A single integer, |
prefix |
A single character string, or |
globals controls how globals are identified, similar to the globals
argument of future::future(). Since all function calls use the same set of
globals, furrr gathers globals upfront (once), which is more efficient than
gathering them for each future independently.
If TRUE or NULL, then globals are automatically identified and
gathered.
If a character vector of names is specified, then those globals are gathered.
If a named list, then those globals are used as is.
In all cases, .f and any ... arguments are automatically passed as
globals to each future created, as they are always needed.
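As a minimal sketch of the character-vector form, the example below names a single global explicitly instead of relying on automatic detection. The variable `offset` and the use of `plan(sequential)` are assumptions made for illustration only.

```r
library(furrr)  # attaches future as well, which provides plan()

plan(sequential)

# `offset` is a hypothetical global used only for this illustration.
offset <- 10

# Gather only the global named "offset"; .f and any ... arguments are
# passed along automatically.
opts <- furrr_options(globals = "offset")

res <- future_map_dbl(1:3, ~ .x + offset, .options = opts)
res
```

Naming globals explicitly can be useful when automatic detection picks up more objects than a worker actually needs.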
Unless seed = FALSE, furrr functions are guaranteed to generate
the exact same sequence of random numbers given the same initial
seed / RNG state regardless of the type of futures and scheduling
("chunking") strategy.
Setting seed = NULL is equivalent to seed = FALSE, except that the
future.rng.onMisuse option is not consulted to potentially monitor the
future for faulty random number usage. See the seed argument of
future::future() for more details.
RNG reproducibility is achieved by pre-generating the random seeds for all
iterations (over .x) by using L'Ecuyer-CMRG RNG streams. In each
iteration, these seeds are set before calling .f(.x[[i]], ...).
Note, for large length(.x) this may introduce a large overhead.
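The idea of pre-generating one seed per iteration can be sketched with base R alone. This is not furrr's actual implementation, only an illustration of the general technique using `parallel::nextRNGStream()`; the helper names `make_seeds()` and `run_with_seed()` are hypothetical.

```r
# Switch to L'Ecuyer-CMRG, remembering the previous kind so it can be restored.
old_kind <- RNGkind("L'Ecuyer-CMRG")
set.seed(42)

# Hypothetical helper: derive one independent RNG stream seed per iteration.
make_seeds <- function(n) {
  seeds <- vector("list", n)
  seed <- get(".Random.seed", envir = globalenv())
  for (i in seq_len(n)) {
    seeds[[i]] <- seed
    seed <- parallel::nextRNGStream(seed)
  }
  seeds
}

# Hypothetical helper: set the iteration's seed, then draw a random number.
run_with_seed <- function(seed) {
  assign(".Random.seed", seed, envir = globalenv())
  rnorm(1)
}

seeds <- make_seeds(4)

# Each iteration owns its seed, so evaluation order does not matter.
forward  <- vapply(seeds, run_with_seed, numeric(1))
backward <- rev(vapply(rev(seeds), run_with_seed, numeric(1)))
same_in_any_order <- identical(forward, backward)

# Restore the original RNG kind.
RNGkind(old_kind[1])
```

Because every iteration carries its own seed, the result does not depend on which worker runs which element, which is what makes the numbers independent of the chunking strategy.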
A fixed seed may be given as an integer vector, either as a full
L'Ecuyer-CMRG RNG seed of length 7, or as a seed of length 1 that
will be used to generate a full L'Ecuyer-CMRG seed.
If seed = TRUE, then .Random.seed is used if it holds a
L'Ecuyer-CMRG RNG seed, otherwise one is created randomly.
If seed = NA, a L'Ecuyer-CMRG RNG seed is randomly created.
If none of the function calls .f(.x[[i]], ...) use random number
generation, then seed = FALSE may be used.
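A fixed integer seed is probably the simplest of these options to try. The sketch below assumes `plan(sequential)` and an arbitrary seed value of 123L, and checks that two calls with the same options produce the same numbers.

```r
library(furrr)  # attaches future as well, which provides plan()

plan(sequential)

# A length-1 integer seed is expanded into a full L'Ecuyer-CMRG seed.
opts <- furrr_options(seed = 123L)

first  <- future_map_dbl(1:5, ~ rnorm(1), .options = opts)
second <- future_map_dbl(1:5, ~ rnorm(1), .options = opts)

# Both calls start from the same fixed seed.
reproducible <- identical(first, second)
reproducible
```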
In addition to the above, it is possible to specify a pre-generated
sequence of RNG seeds as a list such that length(seed) == length(.x) and
where each element is an integer seed that can be assigned to .Random.seed.
Use this alternative with caution. Note that as.list(seq_along(.x)) is
not a valid set of such .Random.seed values.
In all cases but seed = FALSE, after a furrr function returns, the RNG
state of the calling R process is guaranteed to be "forwarded one step" from
the RNG state before the call. This is true regardless of the future
strategy / scheduling used. This ensures that an R script calling
future_map() multiple times is numerically reproducible given the same
initial seed.
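To illustrate the script-level guarantee, the sketch below runs the same two-call sequence twice from the same `set.seed()` value and compares the runs. The seed value 1 and the use of `plan(sequential)` are arbitrary choices for this example.

```r
library(furrr)  # attaches future as well, which provides plan()

plan(sequential)

opts <- furrr_options(seed = TRUE)

# First run of the "script": two consecutive calls.
set.seed(1)
a1 <- future_map_dbl(1:3, ~ rnorm(1), .options = opts)
a2 <- future_map_dbl(1:3, ~ rnorm(1), .options = opts)

# Second run of the same "script" from the same initial seed.
set.seed(1)
b1 <- future_map_dbl(1:3, ~ rnorm(1), .options = opts)
b2 <- future_map_dbl(1:3, ~ rnorm(1), .options = opts)

# Each call matches its counterpart in the other run.
first_calls_match  <- identical(a1, b1)
second_calls_match <- identical(a2, b2)
```

Because the calling process's RNG state moves forward after each call, the second call within a run starts from a different state than the first, while still being reproducible across runs.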
Note that you cannot expect identical results between map() and
future_map() when using a .f that calls functions that generate random
numbers, even when calling set.seed() ahead of time. For one thing, the
default random number generation algorithm used by R during sequential
processing is Mersenne-Twister, different from the L'Ecuyer-CMRG seeds used
by furrr. But even aligning the RNGkind() would not be enough. map()
itself would have to change to use the same parallel compatible RNG strategy
as future_map() (pre-generating the seeds, and setting them before each
.f invocation). At the end of the day, you have to accept that the
following will produce different sequences of random numbers, but both are
statistically sound:
set.seed(42)
purrr::map(1:10, ~ rnorm(1))

set.seed(42)
furrr::future_map(1:10, ~ rnorm(1), .options = furrr_options(seed = TRUE))
But importantly, the furrr::future_map() example will always produce the
same sequence of random numbers, regardless of the plan() you choose:
plan(sequential)
set.seed(42)
furrr::future_map(1:10, ~ rnorm(1), .options = furrr_options(seed = TRUE))

plan(multisession, workers = 2)
set.seed(42)
furrr::future_map(1:10, ~ rnorm(1), .options = furrr_options(seed = TRUE))

# `workers` is not defined in this example; it must be supplied by the caller.
plan(cluster, workers = workers)
set.seed(42)
furrr::future_map(1:10, ~ rnorm(1), .options = furrr_options(seed = TRUE))
furrr_options()