Execute a Function in Parallel on all Workers of a Sleigh

Share:

Description

eachWorker executes function fun once on each worker in the specified sleigh, and returns the results in a list. This can be useful for initializing each of the workers before starting to execute tasks using eachElem. Loading packages using the library function, loading data sets using the data function, and assigning variables in the global environment are common tasks for eachWorker.

Usage

1
2
  ## S4 method for signature 'sleigh'
eachWorker(.Object, fun, ..., eo=NULL, DEBUG=FALSE)

Arguments

.Object

sleigh class object.

fun

the function to be evaluated by each of the sleigh workers. In the case of functions like +, %*%, etc., the function name must be quoted.

...

optional arguments to fun.

eo

additional options, see details

DEBUG

logical; should browser function be called upon entry to eachWorker?

Details

The eo argument is a list that can be used to specify various options. The current options are: eo$blocking, and eo$accumulator.

The eo$blocking option is used to indicate whether to wait for the results, or to return as soon as the tasks have been submitted. If set to FALSE, eachWorker will return a sleighPending object that is used to monitor the status of the tasks, and to eventually retrieve the results. You must wait for the results to be complete before executing any further tasks on the sleigh, or an exception will be raised. The default value is TRUE.

The eo$accumulator option can be used to specify a function that will receive the results of the task execution. Note that while this can be a very useful feature with eachElem, it's not commonly used with eachWorker, but is included for consistency. The first argument to eo$accumulator function is a list of results, where the length of the list is always equal to 1 (because there is no eo$chunkSize option in eachWorker). The second argument is a vector of task numbers, starting from 1, where the length of the vector is also always equal to 1. The task numbers are not very important when used with eachWorker, because the order of tasks isn't specified, as it is with eachElem. Note that when eo$accumulator is specified, eachWorker returns NULL, not the list of results, since eachWorker doesn't save any of the results after passing them to the eo$accumulator function.

The DEBUG argument is used call the browser function upon entering eachWorker. The default value is FALSE.

Note

The eo$blocking option can be very useful for starting a function on each of the workers, and then allowing the master process to interact with the workers via NetWorkSpace operations in order to implement sophisticated parallel applications.

See Also

eachElem, sleighPending

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
  ## Not run: 
# create a sleigh
s <- sleigh()

# assign to global variable x on each worker
eachWorker(s, function() x <<- 1)

# get a listing of each worker's global environment
eachWorker(s, function() ls(globalenv()))

# get system info from each worker
eachWorker(s, Sys.info)

# load MASS package on each worker
eachWorker(s, function() library(MASS))

# non-blocking example using simple NWS operations
sp <- eachWorker(s, function() nwsFind(SleighNws, 'hello'), eo=list(blocking=FALSE))
nwsStore(s@nws, 'hello', 'world')
waitSleigh(sp)	
  
## End(Not run)

Want to suggest features or report bugs for rdrr.io? Use the GitHub issue tracker.