parallel.apply: MoreParallelR::parallel.apply


View source: R/parallel.apply.R

Description

MoreParallelR::parallel.apply provides a convenient way to parallelize the apply function on an array. It first splits the array along the dimensions specified by MARGIN into a list of smaller arrays and then calls the mclapply function to process that list in parallel.
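
The split-then-mclapply idea described above can be sketched roughly as follows. This is a minimal illustration, not the package's source: the helper name sketch.parallel.apply is hypothetical, it uses only base R and the parallel package, and it assumes FUN returns a single value per slice. Because mclapply relies on forking, cores greater than 1 is not supported on Windows.

```r
library(parallel)

# Hypothetical helper, NOT the package's implementation: apply FUN over
# MARGIN by splitting X into slices and handing the list to mclapply.
sketch.parallel.apply <- function(X, MARGIN, FUN, ..., cores = 1) {
  dims <- dim(X)

  # One row per combination of indices along the MARGIN dimensions
  grid <- as.matrix(expand.grid(lapply(dims[MARGIN], seq_len)))

  # Extract the sub-array that belongs to one index combination
  get.slice <- function(i) {
    idx <- rep(list(TRUE), length(dims))
    idx[MARGIN] <- as.list(unname(grid[i, ]))
    do.call(`[`, c(list(X), idx, list(drop = TRUE)))
  }

  slices <- lapply(seq_len(nrow(grid)), get.slice)
  res <- mclapply(slices, FUN, ..., mc.cores = cores)

  # Assumes each call to FUN returned exactly one value
  array(unlist(res), dim = dims[MARGIN])
}

X <- array(runif(24), dim = c(2, 3, 4))
out <- sketch.parallel.apply(X, c(1, 3), sum)
dim(out)
```

Each slice is passed to FUN with its remaining dimensions intact, so a function such as sum sees a matrix here rather than a flattened vector.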

Usage

parallel.apply(X, MARGIN, FUN, ..., verbose = F, cores = 1,
  progress.bar = F)

Arguments

X

An array, including a matrix.

MARGIN

A vector giving the subscripts which the function will be applied over.

FUN

The function to be applied.

...

Optional arguments to FUN.

verbose

Whether to print progress information.

cores

The number of cores for parallelization.

progress.bar

Whether to show a progress bar. This requires the package pbmcapply.

Details

The speedup from parallelization is more noticeable when FUN takes longer to run. In other words, this solution works best when FUN carries a heavy workload.
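
One way to see this effect is to time a deliberately slow function in serial and in parallel. The sketch below is only an illustration under stated assumptions: slow.fun is a hypothetical workload that sleeps to imitate an expensive FUN, and it calls parallel::mclapply directly rather than parallel.apply. Actual timings depend on the machine.

```r
library(parallel)

# Hypothetical workload: each call sleeps to imitate an expensive FUN
slow.fun <- function(i) {
  Sys.sleep(0.05)
  i^2
}

# Forking is unavailable on Windows, so fall back to a single core there
cores <- if (.Platform$OS.type == "windows") 1 else 2

t.serial <- system.time(res.serial <- lapply(1:8, slow.fun))
t.parallel <- system.time(
  res.parallel <- mclapply(1:8, slow.fun, mc.cores = cores))

# Both runs return the same values; only the elapsed time should differ
identical(res.serial, res.parallel)
c(serial = t.serial[["elapsed"]], parallel = t.parallel[["elapsed"]])
```

If the sleep is removed so that each call is nearly free, the cost of starting workers and collecting results can outweigh any gain.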

This idea was originally inspired by my advisor, Prof. Guido Cervone, during a casual conversation.

This function differs from plyr::laply in that it returns an array with the specified MARGIN as dimensions.

Value

An array.

Note

Please be aware of whether your FUN behaves differently for a vector, a matrix, or an array. If you are applying the function to a matrix or an array, lapply and plyr::laply will coerce the high-dimensional object to a vector, but parallel.apply will feed the data to FUN AS IT IS. As a result, this function and apply might return different results.
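
A small base-R illustration of why this matters, using var, the function called in the Examples section. The matrix m is made up for this sketch.

```r
m <- matrix(c(1, 2, 3, 4, 5, 7), nrow = 2)

# Given a matrix, var() treats each column as a variable and returns a
# covariance matrix
dim(var(m))

# Given a plain vector, var() returns a single number
length(var(as.vector(m)))
```

Calling as.vector inside FUN, as the example in the Examples section does, makes the result independent of the shape of the slice that FUN receives.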

Author(s)

Weiming Hu

Examples

# This example shows how to run parallel.apply on a synthetic
# array and how its performance compares to a serial run.
#

library(profvis)

profvis({
  library(MoreParallelR)
  library(magrittr)

  # Generate synthesized data
  dims <- c(80 , 90, 100, 15)
  X <- dims %>%
    prod() %>%
    runif(min = 1, max = 10) %>%
    array(dim = dims)

  MARGIN <- c(2, 4)
  cores <- 4
  FUN <- function(v) {
    library(magrittr)

    # A costly function
    ret <- v %>%
      as.vector() %>%
      sin() %>%
      cos() %>%
      var()

    return(ret)
  }

  # Run the parallel code
  X.new.par <- parallel.apply(
    X, MARGIN, FUN, cores = cores)

  # Run the serial code
  X.new.sq <- apply(X, MARGIN, FUN)

  # Compare results
  identical(X.new.par, X.new.sq)
})

Weiming-Hu/MoreParallelR documentation built on Dec. 8, 2019, 9:49 a.m.