distribute_load: Distribute load


Description

Utility function for distributing computations among a pool of workers for parallel processing.

Usage

distribute_load(x, n)

Arguments

x

integer number of items to process.

n

integer number of threads.

Details

This function returns a list containing one element for each worker. Each element is an integer vector specifying the indices of the items that the worker should process.
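
As a rough illustration of the structure described above, the following base R sketch builds a comparable list. The helper name sketch_distribute_load and its chunking rule (contiguous blocks of near-equal size) are assumptions made for illustration only; the actual distribute_load function may assign indices to workers differently.

```r
# Hypothetical stand-in for distribute_load (not the package's implementation):
# split the indices 1..x into at most n contiguous chunks.
sketch_distribute_load <- function(x, n) {
  stopifnot(x >= 1, n >= 1)
  n <- min(n, x)                        # never create more chunks than items
  idx <- seq_len(x)
  groups <- ceiling(idx * n / x)        # chunk number (1..n) for each index
  unname(split(idx, groups))            # list of integer vectors, one per chunk
}

# 10 items shared among 3 workers
chunks <- sketch_distribute_load(10, 3)
print(chunks)                           # chunks hold 1:3, 4:6 and 7:10
```

Every index from 1 to x appears in exactly one element, so applying a function to each element and combining the results covers all of the items once.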

Value

list object with one element per worker, each an integer vector of item indices.

See Also

get_number_of_threads, set_number_of_threads, is.parallel.

Examples

# imagine that we have 10 jobs that need processing. For simplicity,
# our jobs will involve adding 1 to each element in 1:10.
values <- 1:10

# we could complete this processing using the following vectorized code
result <- values + 1
print(result)

# however, if our jobs were complex then we would be better off using
# functionals
result <- lapply(values, function(x) x + 1)
print(result)

# we could do one better, and use the "plyr" package to handle the
# processing
result <- plyr::llply(values, function(x) x + 1)
print(result)

# we could also use the parallel processing options available through "plyr"
# to use more computational resources to complete the jobs (note that since
# these jobs are very quick to process, this is actually slower).
cl <- parallel::makeCluster(2, "PSOCK")
doParallel::registerDoParallel(cl)
result <- plyr::llply(1:10, function(x) x + 1, .parallel = TRUE)
parallel::stopCluster(cl)
print(result)

# however, this approach iterates over each element individually. Instead, we
# could use the distribute_load function to split the N jobs up into K super
# jobs, and evaluate each super job using vectorized code.
x <- 1:10
cl <- parallel::makeCluster(2, "PSOCK")
parallel::clusterExport(cl, "x", envir = environment())
doParallel::registerDoParallel(cl)
l <- distribute_load(length(x), n = 2)
result <- plyr::llply(l, function(i) x[i] + 1, .parallel = TRUE)
parallel::stopCluster(cl)
print(result)
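
The final result above is a list with one vector per super job, so it may need to be flattened back into a single vector. The sketch below shows the same split, process, recombine pattern without a cluster so that it runs anywhere. It uses parallel::splitIndices from base R as a stand-in for distribute_load; this is an assumption for illustration, and the two functions are not guaranteed to produce identical chunks.

```r
# Split, process and recombine using only packages that ship with R.
values <- 1:10

# stand-in for distribute_load(length(values), n = 2)
chunks <- parallel::splitIndices(length(values), 2)

# evaluate each super job with vectorized code (serially here; a parallel
# backend could be substituted for lapply)
partial <- lapply(chunks, function(i) values[i] + 1)

# flatten the per-worker results back into one vector, in the original order
result <- unlist(partial)
print(result)
```

Because the chunks are contiguous and returned in order, unlist restores the original item order without any extra bookkeeping.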

prioritizr/prioritizrutils documentation built on May 25, 2019, 12:20 p.m.