async_work: Execute parallel job in another session

View source: R/utils.R

Description

Similar to lapply, but runs in parallel.

Usage

async_work(
  X,
  FUN,
  ...,
  .globals = NULL,
  .name = "Untitled",
  .rs = FALSE,
  .wait = TRUE,
  .chunk_size = Inf
)

Arguments

X

vector

FUN

R function

...

further arguments to FUN

.globals

named list of global variables to be used by FUN

.name

job or progress name

.rs

whether to use the 'RStudio' job scheduler

.wait

whether to wait for the results

.chunk_size

maximum chunk size per job; must be Inf if .wait is FALSE

Details

Unlike functions in the future package, which determine global variables automatically, async_work requires you to specify the variables used by FUN explicitly. In addition, you may assume only that base packages are loaded when FUN executes. It is therefore recommended to call functions with an explicit package prefix, such as utils::read.csv instead of read.csv. See the examples for details.
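Because the variables used by FUN have to be listed by hand, it can help to check which names a function actually looks up outside its own body. The following is a minimal sketch of one way to do that. It assumes the codetools package (a recommended package shipped with most R installations, not mentioned in this documentation), and it is not part of async_work itself.

```r
# Same shape as `f` in the Examples section: it reads the free variable `a`.
a <- 1
f <- function(x, b) {
  x + a + b
}

# merge = FALSE separates global functions from global variables.
found <- codetools::findGlobals(f, merge = FALSE)

# Collect the current values of those variables from the environment where
# `f` was defined, giving a named list in the shape `.globals` expects.
globals <- mget(found$variables, envir = environment(f))

# Assumed usage, left commented out because it needs the package installed:
# res <- async_work(1:10, f, b = 3, .globals = globals)
```

The functions component of the result is also worth a look: any function listed there that does not come from a base package would need an explicit package prefix inside FUN.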

The main feature of async_work is that there is no backward communication between the main and worker processes, so its setup time is shorter than that of future's multiprocess strategy. It also avoids the memory leak issue caused by forked processes. It is therefore designed for jobs that write their output to disk and require little feedback. The drawback is that .globals must be specified, which is inconvenient for beginners.
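As an illustration of the kind of job described above, here is a sketch of a FUN that writes its output to disk and returns only a file path. The names out_dir and write_square are hypothetical and exist only for this example. The function is run with lapply as a sequential stand-in, and the async_work call is shown as an assumed equivalent in a comment.

```r
# Hypothetical output directory for this sketch.
out_dir <- file.path(tempdir(), "async_work_demo")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# Writes one small CSV per input and returns the path, so very little
# has to travel back to the calling session.
# Non-base functions carry an explicit package prefix (utils::).
write_square <- function(i) {
  path <- file.path(out_dir, sprintf("result_%02d.csv", i))
  utils::write.csv(data.frame(i = i, sq = i^2), path, row.names = FALSE)
  path
}

# Sequential stand-in using lapply.
paths <- lapply(1:3, write_square)

# Assumed parallel equivalent; `out_dir` is a global of `write_square`,
# so it would have to be passed explicitly:
# paths <- async_work(1:3, write_square, .globals = list(out_dir = out_dir))
```

Returning a path instead of the data keeps each result small, which suits a design with no backward communication.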

Value

If .wait is TRUE, returns a list of the results of applying FUN to each element of X; otherwise, returns a function that can be used to track and obtain the results.

Examples

if(interactive()){
  a <- 1
  f <- function(x, b){
    Sys.sleep(1)
    list(
      result = x + a + b,
      loaded = names(utils::sessionInfo()$loaded),
      attached = search()
    )
  }

  # `a` is a "global" variable: `f` looks it up in its enclosing
  # environment, so it must be passed via `.globals`
  res <- async_work(1:10, f, b = 3, .globals = list(a = a))

  # Only base packages are attached
  res[[1]]
}

dipterix/ravebase documentation built on Sept. 1, 2020, 6:34 p.m.