pa_map: Parallel version of purrr map family

View source: R/map_family.R

pa_map    R Documentation

Description

The syntax and logic of the pa_map* functions are identical to those of purrr's map functions. Refer to the documentation of purrr's map() if you are not familiar with the purrr mapping style. All arguments other than .x and .f are optional and control the parallelization process.
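As a minimal sketch of that correspondence, the snippet below runs a purrr call and then the same call through pa_map(). It assumes purrr is installed; parapurrr is distributed from the moosa-r/parapurrr repository and may not be available everywhere, so the parallel call is guarded. The value cores = 2 is an arbitrary choice for illustration.

```r
library(purrr)

x <- 1:8

# Sequential purrr call
sequential <- map(x, ~ .x^2)

# The same call with pa_map(); only the optional `cores` argument is added.
# Guarded so that the sketch still runs where parapurrr is not installed.
if (requireNamespace("parapurrr", quietly = TRUE)) {
  parallel_result <- parapurrr::pa_map(x, ~ .x^2, cores = 2)
  str(parallel_result)
}
```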

Usage

pa_map(
  .x,
  .f,
  ...,
  cores = NULL,
  adaptor = "doParallel",
  cluster_type = NULL,
  splitter = NULL,
  auto_export = TRUE,
  .export = NULL,
  .packages = NULL,
  .noexport = NULL,
  .errorhandling = "stop",
  .inorder = TRUE,
  .verbose = FALSE
)

pa_map_lgl(
  .x,
  .f,
  ...,
  cores = NULL,
  adaptor = "doParallel",
  cluster_type = NULL,
  splitter = NULL,
  auto_export = TRUE,
  .export = NULL,
  .packages = NULL,
  .noexport = NULL,
  .errorhandling = "stop",
  .inorder = TRUE,
  .verbose = FALSE
)

pa_map_int(
  .x,
  .f,
  ...,
  cores = NULL,
  adaptor = "doParallel",
  cluster_type = NULL,
  splitter = NULL,
  auto_export = TRUE,
  .export = NULL,
  .packages = NULL,
  .noexport = NULL,
  .errorhandling = "stop",
  .inorder = TRUE,
  .verbose = FALSE
)

pa_map_dbl(
  .x,
  .f,
  ...,
  cores = NULL,
  adaptor = "doParallel",
  cluster_type = NULL,
  splitter = NULL,
  auto_export = TRUE,
  .export = NULL,
  .packages = NULL,
  .noexport = NULL,
  .errorhandling = "stop",
  .inorder = TRUE,
  .verbose = FALSE
)

pa_map_chr(
  .x,
  .f,
  ...,
  cores = NULL,
  adaptor = "doParallel",
  cluster_type = NULL,
  splitter = NULL,
  auto_export = TRUE,
  .export = NULL,
  .packages = NULL,
  .noexport = NULL,
  .errorhandling = "stop",
  .inorder = TRUE,
  .verbose = FALSE
)

pa_map_df(
  .x,
  .f,
  ...,
  cores = NULL,
  adaptor = "doParallel",
  cluster_type = NULL,
  splitter = NULL,
  auto_export = TRUE,
  .export = NULL,
  .packages = NULL,
  .noexport = NULL,
  .errorhandling = "stop",
  .inorder = TRUE,
  .verbose = FALSE
)

pa_map_dfr(
  .x,
  .f,
  ...,
  cores = NULL,
  adaptor = "doParallel",
  cluster_type = NULL,
  splitter = NULL,
  auto_export = TRUE,
  .export = NULL,
  .packages = NULL,
  .noexport = NULL,
  .errorhandling = "stop",
  .inorder = TRUE,
  .verbose = FALSE
)

pa_map_dfc(
  .x,
  .f,
  ...,
  cores = NULL,
  adaptor = "doParallel",
  cluster_type = NULL,
  splitter = NULL,
  auto_export = TRUE,
  .export = NULL,
  .packages = NULL,
  .noexport = NULL,
  .errorhandling = "stop",
  .inorder = TRUE,
  .verbose = FALSE
)

Arguments

.x

A list or atomic vector.

.f

A function, formula, or vector (not necessarily atomic).

If a function, it is used as is.

If a formula, e.g. ~ .x + 2, it is converted to a function. There are three ways to refer to the arguments:

  • For a single argument function, use .

  • For a two argument function, use .x and .y

  • For more arguments, use ..1, ..2, ..3 etc

This syntax allows you to create very compact anonymous functions.

If character vector, numeric vector, or list, it is converted to an extractor function. Character vectors index by name and numeric vectors index by position; use a list to index by position and name at different levels. If a component is not present, the value of .default will be returned.
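The forms of .f described above can be illustrated with plain purrr calls; since the pa_map* functions are documented as sharing purrr's syntax, the same .f values should carry over. The data (records, nested) is made up for this example.

```r
library(purrr)

# Formula shorthand: `.` or `.x` refers to the current element.
added     <- map_dbl(c(1, 2, 3), ~ .x + 2)
added_dot <- map_dbl(c(1, 2, 3), ~ . + 2)

# Two-argument formula with .x and .y, shown with purrr::map2_dbl().
products <- map2_dbl(c(1, 2), c(10, 20), ~ .x * .y)

# Extractors: a character vector indexes by name, a number by position.
records <- list(list(id = 1, tag = "a"), list(id = 2, tag = "b"))
tags <- map_chr(records, "tag")
ids  <- map_dbl(records, 1)

# A list mixes position and name at different levels.
nested <- list(list(list(id = 7)), list(list(id = 9)))
nested_ids <- map_dbl(nested, list(1, "id"))

# A missing component yields the value of .default.
missing_field <- map(records, "missing", .default = NA)
```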

...

Additional arguments passed on to the mapped function.

cores

(Optional) Number of cores (i.e. workers) to be used. The default is the number of available CPU cores minus one.

adaptor

The foreach adaptor to be used. Available options are:

  • "doParallel" (default)

  • "doFuture"

  • "doMC"

  • "doMPI"

  • "doSNOW"

cluster_type

The cluster architecture to be used with the selected adaptor. Note that the allowed values for this argument depend on the adaptor argument:

  1. If adaptor is "doParallel":

    • On Windows: "PSOCK" (default)

    • On Unix-based systems: "FORK" (default), "PSOCK"

  2. If adaptor is "doFuture":

    • On Windows: "multisession" (default), "cluster_PSOCK"

    • On Unix-based systems: "multicore" (default), "multisession", "cluster_FORK", "cluster_PSOCK"

  3. If adaptor is "doMC":

    • No cluster_type options are available; leave cluster_type as NULL

  4. If adaptor is "doMPI":

    • No cluster_type options are available; leave cluster_type as NULL

  5. If adaptor is "doSNOW":

    • On Windows: "SOCK" (default)

    • On Unix-based systems: "MPI" (default), "SOCK"
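The defaults listed above can be summarised in a small lookup. The helper default_cluster_type() below is hypothetical and not part of parapurrr; it only restates the list so the adaptor and platform combinations are easy to check. The guarded call at the end passes "PSOCK", which the list gives as valid for "doParallel" on both Windows and Unix-based systems.

```r
# Hypothetical helper (not part of parapurrr): returns the default
# cluster_type documented above for an adaptor on the given platform.
default_cluster_type <- function(adaptor,
                                 windows = .Platform$OS.type == "windows") {
  switch(
    adaptor,
    doParallel = if (windows) "PSOCK" else "FORK",
    doFuture   = if (windows) "multisession" else "multicore",
    doSNOW     = if (windows) "SOCK" else "MPI",
    doMC       = NULL,  # no cluster_type options
    doMPI      = NULL,  # no cluster_type options
    stop("Unknown adaptor: ", adaptor)
  )
}

default_cluster_type("doFuture", windows = FALSE)

# Explicit adaptor and cluster_type; guarded because parapurrr may not be
# installed. cores = 2 is an arbitrary choice.
if (requireNamespace("parapurrr", quietly = TRUE)) {
  roots <- parapurrr::pa_map_dbl(
    1:4, sqrt,
    cores = 2,
    adaptor = "doParallel",
    cluster_type = "PSOCK"
  )
  print(roots)
}
```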

splitter

(Optional) Explicitly instruct parapurrr how to pass your input elements to the workers. The splitter should be a list in which each element is a vector of integers or integer-like numbers (i.e. no decimal points) giving the indexes of your input elements. Collectively, they should have a one-to-one correspondence with the indexes of .x. See the vignettes for further explanation and examples.
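A sketch of what such a list might look like for a six-element input follows. The helper is_valid_splitter() is hypothetical and not part of parapurrr; it only checks the one-to-one requirement stated above. Setting cores to the number of chunks is an assumption made here to keep the example conservative; the vignettes remain the authoritative description.

```r
x <- letters[1:6]

# Two chunks: odd positions go to one worker, even positions to the other.
splitter <- list(c(1, 3, 5), c(2, 4, 6))

# Hypothetical check (not part of parapurrr): every index of the input
# appears exactly once across the chunks, and all values are integer-like.
is_valid_splitter <- function(splitter, n) {
  idx <- unlist(splitter, use.names = FALSE)
  is.list(splitter) &&
    is.numeric(idx) &&
    !anyNA(idx) &&
    all(idx == round(idx)) &&
    length(idx) == n &&
    setequal(idx, seq_len(n))
}

is_valid_splitter(splitter, length(x))

# Guarded because parapurrr may not be installed.
if (requireNamespace("parapurrr", quietly = TRUE)) {
  upper <- parapurrr::pa_map_chr(
    x, toupper,
    cores = length(splitter),
    splitter = splitter
  )
  print(upper)
}
```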

auto_export

(TRUE (default), FALSE, or "all") Should parapurrr export the detected objects used in .f from the function's calling frame to the workers? The default is TRUE for convenience, but to improve performance, consider turning auto_export off and supplying the variables to export manually through the .export argument. "all" is the most conservative, yet potentially the most resource-demanding, option: it clones the function's calling environment and exports every variable to the workers, whether used or not.

.export

character vector of variables to export. This can be useful when accessing a variable that isn't defined in the current environment. The default value is NULL.

.packages

character vector of packages that the tasks depend on. If the evaluated expression requires an R package to be loaded, this option can be used to load that package on each of the workers. Ignored when used with %do%.

.noexport

character vector of variables to exclude from exporting. This can be useful to prevent variables from being exported that aren't actually needed, perhaps because the symbol is used in a model formula. The default value is NULL.

.errorhandling

specifies how a task evaluation error should be handled. If the value is "stop", then execution will be stopped via the stop function if an error occurs. If the value is "remove", the result for that task will not be returned, or passed to the .combine function. If it is "pass", then the error object generated by task evaluation will be included with the rest of the results. It is assumed that the combine function (if specified) will be able to deal with the error object. The default value is "stop".

.inorder

logical flag indicating whether the .combine function requires the task results to be combined in the same order that they were submitted. If the order is not important, then setting .inorder to FALSE can give improved performance. The default value is TRUE.

.verbose

logical flag enabling verbose messages. This can be very useful for troubleshooting.

Details

Note that, except for cores, cluster_type, adaptor, auto_export, and splitter, the documentation of the arguments, the return section, and the examples section are automatically imported from the purrr and foreach packages.

Value

  • map() returns a list the same length as .x.

  • map_lgl() returns a logical vector, map_int() an integer vector, map_dbl() a double vector, and map_chr() a character vector.

  • map_df(), map_dfc(), map_dfr() all return a data frame.

  • If .x has names(), the return value preserves those names.

  • The output of .f will be automatically typed upwards, e.g. logical -> integer -> double -> character.

  • walk() returns the input .x (invisibly). This makes it easy to use in a pipe.
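The return types above come from purrr's documentation, so a short purrr-only sketch can show them. The pa_map* variants are described as following the same logic, though that is not verified here. The input vector x is made up for illustration.

```r
library(purrr)

x <- c(a = 1, b = 2, c = 3)

as_list <- map(x, ~ .x * 2)                 # list, same length as x
flags   <- map_lgl(x, ~ .x > 1)             # logical vector
counts  <- map_int(x, ~ length(.x))         # integer vector
doubled <- map_dbl(x, ~ .x * 2)             # double vector
labels  <- map_chr(x, ~ paste0("item", .x)) # character vector

# Names of the input are preserved in the output.
names(flags)
```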


moosa-r/parapurrr documentation built on July 14, 2022, 11:20 a.m.