This package adds parallelisation to the core functions of the purrr package. It is essentially just a wrapper around the parallel package, but with an API that aims to be identical to that of purrr. The advantage is that you can convert your code to run in parallel with minimal syntax changes. So far the following functions have been implemented: map_*(), map2_*(), pmap_*() and rerun().
The contrived example below bootstraps the mean of iris$Sepal.Length for each iris$Species, as the basis for a bootstrap confidence interval.
library(purrrallel)
library(parallel)
library(dplyr)
library(tidyr)

# Draw one bootstrap resample of iris and return the mean Sepal.Length
# per species as a one-row data frame (one column per species)
get_sample <- function(...) {
  iris %>%
    sample_frac(1, replace = TRUE) %>%
    group_by(Species) %>%
    summarise(m = mean(Sepal.Length)) %>%
    spread(Species, m)
}

cl <- makeCluster(2)
# Load the packages get_sample() needs in every subprocess
clusterEvalQ(cl, {library(dplyr); library(tidyr)})
registerCluster(cl, nblock = 2)

system.time({
  dat <- tibble(n = 1:1200) %>%
    mutate(est = map(n, get_sample)) %>%
    unnest(est)
})

stopCluster(cl)
registerCluster()
Alternatively, we can do the same using the rerun() function.
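cl <- makeCluster(2)
clusterEvalQ(cl, {library(dplyr); library(tidyr)})
# get_sample() is called by name here, so export it to the subprocesses first
clusterExport(cl, "get_sample")
registerCluster(cl, nblock = 2)

system.time({
  dat <- rerun(1200, get_sample())
})

# Combine the list of one-row results into a single data frame
purrr::reduce(dat, bind_rows)

stopCluster(cl)
registerCluster()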
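Both examples above stop at the bootstrap estimates. As a rough sketch of the remaining step, the block below computes a 95% percentile interval per species. It uses base R only and runs serially so that it is self-contained; `one_resample()` and `boot_means` are hypothetical stand-ins for `get_sample()` and the combined `dat`, not part of purrrallel.

```r
# Percentile bootstrap interval for the mean Sepal.Length per species.
# Assumption: boot_means plays the role of the combined bootstrap estimates
# (one row per resample, one column per species).
set.seed(101)

# Hypothetical serial stand-in for get_sample()
one_resample <- function() {
  idx <- sample(nrow(iris), replace = TRUE)
  tapply(iris$Sepal.Length[idx], iris$Species[idx], mean)
}

# replicate() returns species in rows, so transpose to resamples in rows
boot_means <- t(replicate(1200, one_resample()))

# The 2.5% and 97.5% quantiles of each column bound a 95% percentile interval
ci <- apply(boot_means, 2, quantile, probs = c(0.025, 0.975))
print(ci)
```

The percentile method is only one of several ways to form a bootstrap interval; it is used here because it needs nothing beyond the resampled means.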
registerCluster(cl, nblock = NULL) - Register a cluster for use in purrrallel
setClusterSeed(cl, 101) - Set seeds across all subprocesses to ensure reproducibility
map_dbl(list(list(a=1), list(a=2)), "a")
The nblock argument of registerCluster() forces purrrallel to break your job up into blocks, with each block being run in its own subprocess (number of blocks <= number of processes).
cl <- makeCluster(2) - Create a cluster with n processes
stopCluster(cl) - Stop the cluster
clusterEvalQ(cl, {library(dplyr); library(tidyr)}) - Run an expression in every subprocess
clusterCall(cl, function() print("hello, world!")) - Run a function in every subprocess
clusterExport(cl, c("var1", "var2")) - Export specific variables to every subprocess
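The cluster helpers listed above come from the parallel package, so they can be tried without purrrallel. The sketch below walks through them in order. Note that `setClusterSeed()` is purrrallel's own function; `clusterSetRNGStream()` is used here as the closest base equivalent, on the assumption that the two serve a similar purpose. The variable `offset` is made up for illustration.

```r
library(parallel)

cl <- makeCluster(2)  # start two worker processes

# Run an expression on every worker, e.g. to load a package there
clusterEvalQ(cl, library(stats))

# Copy a variable from the current environment to every worker
offset <- 10
clusterExport(cl, "offset", envir = environment())

# Call a function on every worker; one result per worker comes back
res <- clusterCall(cl, function() offset + 1)

# Seed every worker (parallel's analogue of setting a cluster seed):
# re-seeding with the same value reproduces the same draws
clusterSetRNGStream(cl, iseed = 101)
draw1 <- clusterCall(cl, function() runif(1))
clusterSetRNGStream(cl, iseed = 101)
draw2 <- clusterCall(cl, function() runif(1))

stopCluster(cl)  # release the workers when done

print(unlist(res))
print(identical(draw1, draw2))
```

Workers start as fresh R sessions, which is why packages and variables have to be loaded or exported explicitly before they are used there.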
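To illustrate the idea behind the nblock argument described above, here is a minimal sketch of block-wise execution written directly against the parallel package. It is an assumption about the general technique, not purrrallel's actual implementation; the names `values`, `nblock`, `blocks` and `squares` are illustrative only.

```r
library(parallel)

cl <- makeCluster(2)

values <- 1:10
nblock <- 2  # illustrative: one block per worker (blocks <= processes)

# Split the indices of `values` into nblock contiguous blocks
blocks <- splitIndices(length(values), nblock)

# Each worker receives one whole block and loops over it locally,
# so there is one round trip per block instead of one per element
block_results <- clusterApply(cl, blocks, function(idx, values) {
  lapply(values[idx], function(v) v^2)
}, values = values)

stopCluster(cl)

# Flatten the per-block lists back into one vector in the original order
squares <- unlist(block_results)
print(squares)
```

Sending work in blocks tends to reduce communication overhead when each individual call is cheap, which is presumably the motivation for exposing nblock.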