loopingRule: parallel looping functions


Description

Parallel looping functions, usually called via modifyFunction to pass parameters, for example cl in parLapplyLoop.

Usage

parLapplyBatchLoop(loopList, executeFunction, ..., cl, batchSize = 3,
  mc.cores = getOption("mc.cores", 1L), shuffle = TRUE)

parLapplyLoop(loopList, executeFunction, ..., cl, shuffle = TRUE)

parLapplyLBLoop(loopList, executeFunction, ..., cl)

foreachBatchLoop(loopList, executeFunction, ..., batchSize = 3,
  mc.cores = getOption("mc.cores", 1L), shuffle = TRUE)

foreachLoop(loopList, executeFunction, ...)

Arguments

loopList

An internal model list to execute over

executeFunction

An internal task execution function

...

Other parameters passed on to executeFunction. The ... argument must be kept in a looping rule's signature.

cl

A SNOW cluster, as described in parallel::parLapply

batchSize

Number of models to send to a cluster node or core at a time

mc.cores

Number of cores or CPUs to use in each compute node

shuffle

Should loopList be shuffled before being dispatched to the compute nodes? Shuffling loopList gives a more balanced batch load

Details

A looping rule is an lapply-like function: its first two arguments are a list and a function, respectively, with ... available for further arguments passed to the function. The order of the first two arguments matters, while their names do not. Its return value must be a list. Therefore base::lapply, parallel::mclapply, parallel::parLapply and parallel::parLapplyLB are all valid looping rules.

parLapplyBatchLoop and foreachBatchLoop are usually used with openMPI and send their jobs in batches, with batchSize = 3 as the default. One can also execute the jobs of a batch (at most batchSize of them) on a node with multiple cores by setting mc.cores > 1. Since at most batchSize jobs run on a node at a time, mc.cores must be less than or equal to batchSize. Using many cores on a single node is usually discouraged in openMPI jobs, as the heavy load may crash the node. To use 2 cores on one node, one can use modifyFunction, e.g. modifyFunction(foreachBatchLoop, mc.cores = 2L).
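The looping-rule contract and the batching idea can be sketched in plain R. This is a simplified illustration, not the package's implementation: batchLoop is a hypothetical name, and the real batch loops dispatch each batch to a cluster node rather than running it locally.

```r
# A minimal looping rule with batching: first argument a list, second a
# function, '...' forwarded to the function, return value a list.
batchLoop <- function(loopList, executeFunction, ..., batchSize = 3, shuffle = TRUE) {
  idx <- seq_along(loopList)
  if (shuffle) idx <- sample(idx)                      # balance batch load
  batches <- split(idx, ceiling(seq_along(idx) / batchSize))
  out <- vector("list", length(loopList))
  for (b in batches) {
    # The real functions would send loopList[b] to a node; here we
    # just run the batch locally and put results back in original order.
    out[b] <- lapply(loopList[b], executeFunction, ...)
  }
  out
}

res <- batchLoop(as.list(1:7), function(x) x^2)
```

Because results are written back at their original indices, the output order is independent of shuffling.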

The default looping rule is foreachLoop, which sends one job to one node or core at a time and starts the next job as soon as a registered node or core becomes available. If no backend is registered, foreachBatchLoop and foreachLoop execute jobs sequentially.
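foreachLoop itself is internal to lazyML, but the sequential fallback it inherits can be seen directly with the foreach package (assuming foreach is installed): calling %dopar% with no registered backend issues a warning and runs the jobs one after another, still returning a list.

```r
library(foreach)

# With no parallel backend registered, %dopar% warns and falls back
# to sequential execution; the result is still a list.
res <- foreach(x = 1:3) %dopar% (x^2)
```

Registering a backend (e.g. with doParallel::registerDoParallel) makes the same call run in parallel without any code change.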


linxihui/lazyML documentation built on May 21, 2019, 6:39 a.m.