The easypar R package can help if you have to run several independent computations (for instance, bootstrap estimates or multiple local optimisations) and you want to run them in parallel on multi-core architectures. easypar interfaces to doParallel in order to make this task easier to code and to debug.
The idea is to exploit a code template and switch easily between parallel and sequential runs of a function. The code skeleton looks like this:

    if (parallel) {
      R = foreach(i = 1:N) %dopar% { f(i) }
    } else {
      R = vector("list", N)
      for (i in 1:N) { R[[i]] = f(i) }
    }

where f is the actual computation.
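To make the skeleton concrete, here is a minimal runnable sketch of the same template. It assumes the foreach and doParallel packages are installed; the helper name `run_template`, its `cores` argument, and the toy function `square` are hypothetical and are not part of easypar.

```r
library(foreach)
library(doParallel)

# Hypothetical helper: apply f to 1..N, in parallel or sequentially.
run_template = function(f, N, parallel = TRUE, cores = 2) {
  if (parallel) {
    # Start a local cluster and register it as the %dopar% backend.
    cl = parallel::makeCluster(cores)
    doParallel::registerDoParallel(cl)
    on.exit(parallel::stopCluster(cl), add = TRUE)
    R = foreach(i = seq_len(N)) %dopar% { f(i) }
  } else {
    # Plain for-loop: easy to debug, output appears in order.
    R = vector("list", N)
    for (i in seq_len(N)) { R[[i]] = f(i) }
  }
  R
}

square = function(x) x^2
seq_res = run_template(square, N = 4, parallel = FALSE)
par_res = run_template(square, N = 4, parallel = TRUE)
print(identical(seq_res, par_res))
```

Both branches return a list with one element per task, so the caller does not need to know which branch was taken.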
I want to use parallel = FALSE when I have to debug f and, eventually, parallel = TRUE to speed up computations. Parallel executions are hard to debug: inside %dopar%, tasks run in different memory spaces, and thus their outputs (e.g., from print) are asynchronous.
This piece of code is at the base of easypar, whose functioning is shown with some examples.
Consider a dummy function f that sleeps for some random time and then prints the output.

    f = function(x) {
      # Sleep for a random time between 0 and 2 seconds
      clock = 2 * runif(1)
      print(paste("Before sleep", x, " - siesta for ", clock))
      Sys.sleep(clock)
      print(paste("After sleep", x))
      return(x)
    }

f runs as

    f(3)
Input(s). We want to run f on 4 inputs (random univariate numbers). We store them in a list where each position is a full set of parameters that we want to pass to one call to f (a list of lists), named according to the actual parameter names.

    inputs = lapply(runif(4), list)
    print(inputs)
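The list-of-lists layout can be illustrated with base R alone. The sketch below assumes each inner list is handed to the function in the style of `do.call`; whether easypar dispatches exactly this way is an assumption, and the two-argument function `g` is hypothetical.

```r
# Hypothetical two-argument function, used only for illustration.
g = function(x, scale) x * scale

# One inner list per call; the names match g's formal arguments.
params = list(
  list(x = 1, scale = 10),
  list(x = 2, scale = 100)
)

# Sequential equivalent of dispatching each parameter set to g.
results = lapply(params, function(p) do.call(g, p))
print(results)
```

Naming the entries matters once a function takes more than one argument, because unnamed entries are matched by position.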
easypar provides a single function that takes as input f, its list of inputs, and some parameters for the type of execution requested. The simplest call runs f in parallel without showing any output, and just returns the results in a list, as follows:

    library(easypar)
    easypar::run(FUN = f, PARAMS = inputs, parallel = TRUE, outfile = NULL)
We can control the maximum fraction (0 to 1) of cores to use (the available cores are checked via doPar). Other combinations are also possible.
Run f in parallel and save its results to an rds file, implementing a cache, which is useful if one wants to analyse output results in real time (with another process):

    easypar::run(FUN = f, PARAMS = inputs, parallel = TRUE, outfile = NULL, cache = "My_task.rds")
    # Check
    cache = readRDS("My_task.rds")
    print(cache)

Run f in parallel and show its output, by setting outfile:

    easypar::run(FUN = f, PARAMS = inputs, parallel = TRUE, outfile = '')

Run f sequentially, in a for-loop fashion:

    easypar::run(FUN = f, PARAMS = inputs, parallel = FALSE, outfile = '')
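The caching idea can be sketched in base R as follows. This is not easypar's implementation: the helper `append_to_cache` is hypothetical, and the sketch ignores concurrent writes, which a real parallel run would have to handle.

```r
# Hypothetical helper: append one result to an rds cache file.
append_to_cache = function(result, cache_file) {
  cached = if (file.exists(cache_file)) readRDS(cache_file) else list()
  cached[[length(cached) + 1]] = result
  saveRDS(cached, cache_file)
  invisible(cached)
}

cache_file = tempfile(fileext = ".rds")
for (x in 1:3) append_to_cache(x^2, cache_file)

# Another process could read the file while tasks are still running.
partial = readRDS(cache_file)
print(length(partial))
```

Because the file is rewritten after every task, a second R session can call `readRDS` on it at any time and see the results produced so far.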
We can disable parallel execution easily. A global option forces the execution to go serial, whatever the default behaviour in the source code is (parallel = TRUE will have no effect).

When f is plugged into a tool and called as

    easypar::run(FUN = f, PARAMS = inputs)

which has default parallel = TRUE, and you set the global option easypar.parallel to FALSE, easypar will run f sequentially.
    options(easypar.parallel = FALSE)
    easypar::run(FUN = f, PARAMS = inputs, parallel = TRUE)
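One way such an override could work is sketched below with base R's `getOption`. The helper `resolve_parallel` is hypothetical and only illustrates the precedence described above (a global FALSE wins over the argument); it is not taken from easypar's source.

```r
# Hypothetical helper: a global option set to FALSE wins over the argument.
resolve_parallel = function(parallel = TRUE) {
  isTRUE(parallel) && isTRUE(getOption("easypar.parallel", default = TRUE))
}

# Force serial execution, query the helper, then restore the old setting.
old = options(easypar.parallel = FALSE)
forced_serial = resolve_parallel(parallel = TRUE)
options(old)

print(forced_serial)
print(resolve_parallel(parallel = TRUE))
```

Keeping the switch in a global option lets you debug code buried inside another tool without editing that tool's calls.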
    # Hopefully f will crash at least once, but not on all calls
    f = function(x) {
      if (runif(1) > .5) stop("Boom!!")
      "Ok"
    }

    # Restore parallel and run
    options(easypar.parallel = TRUE)
    runs = easypar::run(FUN = f, PARAMS = inputs, parallel = TRUE, outfile = NULL)

    # Inspect and filter errors
    numErrors(runs)
    runs
    filterErrors(runs)
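The same inspect-and-filter pattern can be reproduced in base R with `tryCatch`. This is only a sketch of the idea, not how numErrors or filterErrors are implemented; the function `h` and the variable names are hypothetical, and the failure is made deterministic so the example is reproducible.

```r
# Hypothetical task that fails for even inputs.
h = function(x) {
  if (x %% 2 == 0) stop("Boom!!")
  "Ok"
}

# Capture each error as a condition object instead of aborting the run.
runs = lapply(1:4, function(x) tryCatch(h(x), error = function(e) e))

# Count the failed tasks and keep only the successful results.
is_error = vapply(runs, inherits, logical(1), what = "error")
n_errors = sum(is_error)
ok_runs = runs[!is_error]

print(n_errors)
print(ok_runs)
```

Returning the condition object keeps the error message available for later inspection, while the remaining tasks still complete.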