
Purrrallel

This package aims to add parallelisation to the core functions of the purrr package. Essentially, it is just a wrapper around the parallel package, but with an API that aims to be identical to that of purrr. The advantage is that you can parallelise existing code with as few syntax changes as possible. So far the following functions have been implemented: map_*(), map2_*(), pmap_*() and rerun().
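To make the "wrapper around parallel" idea concrete, here is a minimal sketch of how a purrr-style map could delegate to parallel::parLapply(). It is an illustration only, not purrrallel's actual implementation: par_map() and par_map_dbl() are hypothetical names, and the explicit cl argument is an assumption (the package itself registers the cluster with registerCluster()).

```r
library(parallel)

# Hypothetical helper (not part of purrrallel): a purrr-style map that
# dispatches to parallel::parLapply() when a cluster is supplied.
par_map <- function(.x, .f, ..., cl = NULL) {
  if (is.null(cl)) {
    lapply(.x, .f, ...)          # sequential fallback
  } else {
    parLapply(cl, .x, .f, ...)   # elements are split into chunks across workers
  }
}

# Typed variant in the spirit of map_dbl(): every result must be a single double.
par_map_dbl <- function(.x, .f, ..., cl = NULL) {
  vapply(par_map(.x, .f, ..., cl = cl), identity, numeric(1))
}

cl <- makeCluster(2)
squares     <- par_map(1:10, function(x) x^2, cl = cl)
squares_dbl <- par_map_dbl(1:10, function(x) x^2, cl = cl)
stopCluster(cl)
```

Because the calling convention matches lapply() and purrr::map(), switching between the sequential and parallel versions only requires supplying or omitting the cluster.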

Example of use

The contrived example below builds the bootstrap estimates needed for a confidence interval for the mean of iris$Sepal.Length within each iris$Species.

library(purrrallel)
library(parallel)
library(dplyr)
library(tidyr)

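# Draw one bootstrap resample of iris and return the mean Sepal.Length
# per species as a one-row data frame (one column per species).
get_sample <- function(...) {
    iris %>%
        sample_frac(1, replace = TRUE) %>%
        group_by(Species) %>%
        summarise(m = mean(Sepal.Length)) %>%
        spread(Species, m)
}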

# Start two workers and load the packages that get_sample() needs on each.
cl <- makeCluster(2)
clusterEvalQ(cl, {library(dplyr); library(tidyr)})
registerCluster(cl, nblock = 2)


system.time({
    dat <- tibble(n = 1:1200) %>%
        mutate(est = map(n, get_sample)) %>%
        unnest(est)
})

stopCluster(cl)
registerCluster()

Alternatively we can do the same using the rerun() function.

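cl <- makeCluster(2)
clusterEvalQ(cl, {library(dplyr); library(tidyr)})
# get_sample() is called by name inside the rerun() expression,
# so it is exported to the workers first.
clusterExport(cl, "get_sample")
registerCluster(cl, nblock = 2)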


system.time({
    dat <- rerun(1200, get_sample())
})

# Combine the list of one-row data frames into a single data frame.
purrr::reduce(dat, bind_rows)

stopCluster(cl)
registerCluster()
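
The examples above stop at the bootstrap estimates and do not compute the interval itself. As a rough sketch, assuming a 95% percentile bootstrap interval is what is wanted, the whole calculation could look like this using only base R and the parallel package. boot_means() is a hypothetical helper introduced for illustration, and the seed value is arbitrary.

```r
library(parallel)

# Hypothetical helper: one bootstrap replicate of the per-species mean of
# Sepal.Length, resampling the rows of iris with replacement.
boot_means <- function(i) {
  resampled <- iris[sample(nrow(iris), replace = TRUE), ]
  vapply(split(resampled$Sepal.Length, resampled$Species), mean, numeric(1))
}

cl <- makeCluster(2)
clusterSetRNGStream(cl, 123)   # reproducible random-number streams on the workers
reps <- parLapply(cl, 1:1200, boot_means)
stopCluster(cl)

# One row per replicate, one column per species.
boot <- do.call(rbind, reps)

# 2.5% and 97.5% quantiles of the replicates for each species.
ci <- apply(boot, 2, quantile, probs = c(0.025, 0.975))
ci
```

The same quantile step would apply to the data frame built in the first example, for instance quantile(dat$setosa, c(0.025, 0.975)).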

Additional Functions

Misc Notes

Parallel Notes



gowerc/purrrallel documentation built on May 21, 2019, 2:29 a.m.