View source: R/apply-parallelize.R
big_parallelize | R Documentation |
A Split-Apply-Combine strategy to parallelize the evaluation of a function.
big_parallelize( X, p.FUN, p.combine = NULL, ind = cols_along(X), ncores = nb_cores(), ... )
X |
An object of class FBM. |
p.FUN |
The function to be applied to each subset matrix.
It must take a Filebacked Big Matrix as first argument and a vector of indices (named ind in the examples) as second argument. |
p.combine |
Function to combine the results with do.call. The default, NULL, leaves the results uncombined, as a list. |
ind |
Initial vector of subsetting indices. Default is the vector of all column indices. |
ncores |
Number of cores used. In the usage shown above, the default is nb_cores(). |
... |
Extra arguments to be passed to p.FUN. |
This function splits the indices into parts, applies a given function to each part, and finally combines the results.
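The three steps can be illustrated with a small sequential sketch in base R. This is not the package's implementation: split_apply is a hypothetical helper, an ordinary matrix stands in for the FBM, and lapply replaces the parallel backend. It only shows how indices are cut into contiguous blocks and how extra arguments reach the applied function.

```r
# Hypothetical helper, NOT part of bigstatsr: a sequential, base-R sketch
# of splitting column indices into blocks and applying a function to each.
split_apply <- function(X, fun, ind = seq_len(ncol(X)), nblocks = 2, ...) {
  # Assign each position in `ind` to one of `nblocks` contiguous blocks
  block_id <- sort(rep_len(seq_len(nblocks), length(ind)))
  parts <- split(ind, block_id)
  # Apply `fun` to each block of indices, forwarding extra arguments
  unname(lapply(parts, function(part) fun(X, part, ...)))
}

X <- matrix(1:20, nrow = 4)   # ordinary matrix standing in for an FBM
col_sums_sub <- function(X, ind) colSums(X[, ind, drop = FALSE])

res <- split_apply(X, col_sums_sub, nblocks = 2)
length(res)        # one result per block
do.call(c, res)    # combined result, equal to colSums(X)
```

Each element of res corresponds to one block of columns, which mirrors the per-core results described for big_parallelize.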
Returns a list of ncores elements, each element being the result of one of the cores, computed on a block. The elements of this list are then combined with do.call(p.combine, .) if p.combine is given.
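The combining step relies only on base R's do.call, so its effect can be previewed without any parallel code. The per-block data frames below are made-up values used purely for illustration.

```r
# Two per-block results, as they might come back from two workers
# (made-up values for illustration only)
parts <- list(
  data.frame(sum = c(1, 2), var = c(0.5, 0.1)),
  data.frame(sum = c(3, 4), var = c(0.2, 0.3))
)

# Without a combiner, the list is what the caller gets back
str(parts, max.level = 1)

# With 'rbind' as combiner: equivalent to rbind(parts[[1]], parts[[2]])
combined <- do.call("rbind", parts)
nrow(combined)

# A combiner must accept all blocks as separate arguments, as `c` does
do.call(c, list(1:2, 3:4))
```

Any function passed as p.combine therefore needs to accept as many arguments as there are blocks.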
big_apply bigparallelr::split_parapply
## Not run: 
# CRAN is super slow when using parallelism.
X <- big_attachExtdata()

### Computation on all the matrix
true <- big_colstats(X)

big_colstats_sub <- function(X, ind) {
  big_colstats(X, ind.col = ind)
}
# 1. the computation is split along all the columns
# 2. for each part the computation is done, using `big_colstats`
# 3. the results (data.frames) are combined via `rbind`.
test <- big_parallelize(X, p.FUN = big_colstats_sub,
                        p.combine = 'rbind', ncores = 2)
all.equal(test, true)

### Computation on a part of the matrix
n <- nrow(X)
m <- ncol(X)
rows <- sort(sample(n, n/2)) # sort to provide some locality in accesses
cols <- sort(sample(m, m/2)) # idem

true2 <- big_colstats(X, ind.row = rows, ind.col = cols)

big_colstats_sub2 <- function(X, ind, rows, cols) {
  big_colstats(X, ind.row = rows, ind.col = cols[ind])
}
# This doesn't work because, by default, the computation is spread
# along all columns. We must explicitly specify the `ind` parameter.
tryCatch(big_parallelize(X, p.FUN = big_colstats_sub2,
                         p.combine = 'rbind', ncores = 2,
                         rows = rows, cols = cols),
         error = function(e) message(e))

# This now works, using `ind = seq_along(cols)`.
test2 <- big_parallelize(X, p.FUN = big_colstats_sub2,
                         p.combine = 'rbind', ncores = 2,
                         ind = seq_along(cols),
                         rows = rows, cols = cols)
all.equal(test2, true2)

## End(Not run)
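The indexing problem behind the failing call can be reproduced in base R alone. The sketch below uses made-up sizes: when ind runs over every column of the matrix but is used to index a shorter vector such as cols, the out-of-range positions come back as NA.

```r
m <- 10               # total number of columns (made-up)
cols <- c(2, 5, 7)    # the subset of columns of interest

# Default-style indices run over all columns, so they overshoot `cols`
ind_default <- seq_len(m)
anyNA(cols[ind_default])   # positions 4 to 10 do not exist in `cols`

# Indexing positions within `cols` stays in range
ind_ok <- seq_along(cols)
cols[ind_ok]
```

Passing ind = seq_along(cols) makes the split happen over positions within cols, which is what cols[ind] inside the applied function expects.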