mapSchedule: Data Parallel Scheduler


Description

This function detects parallelism in top-level calls to R's apply family of functions and by analyzing for loops. The apply-style functions currently supported are lapply and mapply. It does not parallelize every for loop that could be parallelized, but it handles the common forms shown in the Examples.
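As a rough illustration of the kind of rewrite that detecting these calls makes possible, the sketch below pairs top-level lapply and mapply calls with hand-written counterparts from base R's parallel package. It is an assumption about what data-parallel code can look like, not necessarily the code this package generates; the variable n_cores is hypothetical.

```r
library(parallel)

# Forking is unavailable on Windows, so fall back to a single core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Serial versions: top-level apply-style calls.
serial_l <- lapply(1:10, exp)
serial_m <- mapply(function(a, b) a + b, 1:5, 6:10)

# Hand-written parallel counterparts (illustrative only).
par_l <- mclapply(1:10, exp, mc.cores = n_cores)
par_m <- mcmapply(function(a, b) a + b, 1:5, 6:10, mc.cores = n_cores)

# Both versions should produce the same values.
stopifnot(isTRUE(all.equal(serial_l, par_l)))
stopifnot(isTRUE(all.equal(serial_m, par_m)))
```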

Usage

mapSchedule(graph)

Arguments

graph

DependGraph

Details

Consider using this if:

Don't use this if:

Currently this function supports for loops that update 0 or 1 global variables. For loops that update a single variable, the update must be on the last line of the loop body, so the for loop should have the following form:

for(i in ...){
    ...
    x[i] <- ...
}

If the last line doesn't update the variable then it's not clear that the loop can be parallelized.
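To make the required form concrete, here is a minimal base R sketch. It shows a loop whose only global update is x[i] on the last line, and the equivalent map-style call that such a loop can in principle be turned into. The names x, x_map, and total are illustrative, and the vapply rewrite is only an assumption about the idea, not this function's actual output. A loop with dependent iterations is included for contrast.

```r
n <- 5

# Required form: the only global update is x[i], on the last line of the body.
x <- rep(NA_real_, n)
for (i in seq_len(n)) {
    x[i] <- i^2 + 1
}

# Each iteration is independent, so the same work can be written as a map.
x_map <- vapply(seq_len(n), function(i) i^2 + 1, numeric(1))
stopifnot(isTRUE(all.equal(x, x_map)))

# For contrast: every iteration reads the value written by the previous one,
# so the iterations cannot simply run independently of each other.
total <- 0
for (i in seq_len(n)) {
    total <- total + i
}
```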

Road map of features to implement:

Examples

# Each iteration of the for loop writes to a different file, so the
# iterations are independent.
# If they all wrote to the same file, this would break.
pfile <- makeParallel(parse(text = "
     fnames <- paste0(1:10, '.txt')
     for(f in fnames){
         writeLines('testing...', f)
     }"))

# A couple of examples in one script: a top-level lapply and a for loop
serial_code <- parse(text = "
     x1 <- lapply(1:10, exp)
     n <- 10
     x2 <- rep(NA, n)
     for(i in seq(n)) x2[[i]] <- exp(i + 1)
")

p <- makeParallel(serial_code)

eval(serial_code)
x1
x2
rm(x1, x2)

# x1 and x2 should now be defined again, with the same values as in the serial run
eval(writeCode(p))
x1
x2

makeParallel documentation built on May 2, 2019, 9:40 a.m.