cluster_call: Call a function on each of the worker nodes and pass it the...

cluster_call    R Documentation

Call a function on each of the worker nodes and pass it the pairs

Description

Call a function on each of the worker nodes and pass it the pairs

Usage

cluster_call(pairs, fun, ...)

Arguments

pairs

an object of type cluster_pairs, as created for example by cluster_pair.

fun

a function to call on each of the worker nodes. See Details for the arguments this function must accept.

...

additional arguments are passed on to fun.

Details

The function fun must accept the following as its first three arguments:

pairs

a data.table with the pairs stored on the worker node.

x

a data.table with the portion of x present on the worker node.

y

a data.table with y.
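
As an illustration of the signature described above, the following is a minimal sketch of a function that uses all three arguments and one extra argument forwarded through `...`. The helper `node_summary` and its `column` argument are hypothetical names introduced for this example only; the sketch assumes the reclin2 example data and a local two-worker cluster.

```r
library(reclin2)
library(parallel)

data("linkexample1", "linkexample2")
cl <- makeCluster(2)
pairs <- cluster_pair(cl, linkexample1, linkexample2)

# Hypothetical helper: summarise what each worker node holds.
# The first three arguments follow the order described in Details;
# `column` is an illustrative extra argument supplied via `...`.
node_summary <- function(pairs, x, y, column) {
  c(n_pairs = nrow(pairs),
    n_x = nrow(x),
    n_y = nrow(y),
    n_nonmissing = sum(!is.na(x[[column]])))
}

# Only a small named vector is returned per node, not the pairs themselves.
info <- cluster_call(pairs, node_summary, column = "lastname")
info

stopCluster(cl)
```

Returning a short summary vector keeps the amount of data sent back to the main node small.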

Value

The function returns a list containing, for each worker node, the result of the function call. When all function calls return NULL, the result is returned invisibly. Because the result is sent back to the main node, make sure you don't accidentally return all pairs. If you don't want to return anything, end your function with NULL.

Examples

# Generate some pairs
library(reclin2)
library(parallel)
data("linkexample1", "linkexample2")
cl <- makeCluster(2)

pairs <- cluster_pair(cl, linkexample1, linkexample2)
compare_pairs(pairs, c("lastname", "firstname", "address", "sex"))

# Add a new column to pairs
cluster_call(pairs, function(pairs, ...) {
  pairs[, name := firstname & lastname]
  # The assignment above returns the pairs, and we don't want to send
  # those back to the main node, so return something else
  NULL
})

# Get the number of pairs on each node
lengths <- cluster_call(pairs, function(pairs, ...) {
  nrow(pairs)
})
lengths <- unlist(lengths)
lengths

# Cleanup
stopCluster(cl)


reclin2 documentation built on May 29, 2024, 4:21 a.m.