Applies a function to each row of a data frame in a parallelized fashion
(by submitting multiple batch R jobs). It is a convenient wrapper for plapply,
modified especially for parallel, single-row processing of data frames.
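To make "single-row processing" concrete, here is a minimal sequential sketch in base R. It launches no batch jobs and is not the package's implementation; `row_apply` is a hypothetical helper introduced only for illustration. It shows each row being handed to the function as a one-row data frame, with extra named arguments forwarded through `...`.

```r
# Sequential, base-R illustration only: no parallel jobs are launched.
# `row_apply` is a hypothetical helper, not part of any package.
row_apply <- function(X, FUN, ...) {
  stopifnot(is.data.frame(X))
  lapply(seq_len(nrow(X)), function(i) {
    # drop = FALSE keeps the row as a one-row data frame
    FUN(X[i, , drop = FALSE], ...)
  })
}

X <- data.frame(a = 1:3, b = letters[1:3])

# Each call receives a single-row data frame; k is passed through `...`
res <- row_apply(X, function(x, k) x$a * k, k = 10)
str(res)
```

The result is a list with one element per row of X, in row order.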
dfplapply(X, FUN, ..., output.df = FALSE,
          njobs = parallel::detectCores() - 1, packages = NULL,
          header.file = NULL, needed.objects = NULL,
          needed.objects.env = parent.frame(), workDir = "plapply",
          clobber = TRUE, max.hours = 24, check.interval.sec = 1,
          collate = FALSE, random.seed = NULL, rout = NULL, clean.up = TRUE,
          verbose = FALSE)
X |
The data frame, each row of which will be processed using FUN |
FUN |
A function whose first argument is a single-row data frame, i.e.
a single row of X |
... |
Additional named arguments to FUN |
output.df |
Logical indicating whether the value returned by dfplapply
should be a data frame rather than a list |
njobs |
The number of jobs (subsets). Defaults to one less than the number of cores on the machine. |
packages |
Character vector giving the names of packages that will be
loaded in each new instance of R, using |
header.file |
Text string indicating a file that will be initially
sourced prior to calling |
needed.objects |
Character vector giving the names of objects which
reside in the environment specified by needed.objects.env |
needed.objects.env |
Environment where the objects named in needed.objects reside |
workDir |
Character string giving the name of the working directory that will be used for the files needed to launch the separate instances of R. |
clobber |
Logical indicating whether the directory designated by workDir will be overwritten if it already exists |
max.hours |
The maximum number of hours to wait for the jobs to complete |
check.interval.sec |
The number of seconds to wait between checking to
see whether all jobs have completed |
collate |
|
random.seed |
An integer setting the random seed, which will result in
randomizing the elements of the list assigned to each job. This is useful
when the computing time for each element varies significantly because it
helps to even out the run times of the parallel jobs. If |
rout |
A character string giving the name of the file to which all of the |
clean.up |
|
verbose |
|
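The load-balancing idea behind random.seed can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: `assign_jobs` is a hypothetical helper, and splitting the elements into contiguous chunks is an assumed strategy chosen to show why shuffling helps when slow elements sit next to each other.

```r
# `assign_jobs` is a hypothetical helper for illustration only.
# It assigns element indices 1..n to njobs contiguous chunks,
# optionally shuffling the indices first.
assign_jobs <- function(n, njobs, random.seed = NULL) {
  idx <- seq_len(n)
  if (!is.null(random.seed)) {
    set.seed(random.seed)  # note: this resets the global RNG state
    idx <- sample(idx)     # shuffle so slow neighbours are spread out
  }
  # Chunk labels 1..njobs, as equal in size as possible
  chunk <- ceiling(seq_along(idx) * njobs / n)
  split(idx, chunk)
}

jobs_ordered  <- assign_jobs(10, njobs = 3)                    # 1:3, 4:6, 7:10
jobs_shuffled <- assign_jobs(10, njobs = 3, random.seed = 42)  # same sizes, mixed members
```

Without a seed, neighbouring elements land in the same job; with a seed, the same shuffled assignment is reproduced on every call.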
A list or data frame containing the results of processing each row
of X with FUN.
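The difference between the list and data frame forms of the result can be sketched sequentially in base R. Row-binding the per-row results with `rbind` is an assumption made here for illustration; how dfplapply itself combines results is not stated above.

```r
X <- data.frame(a = 1:3, b = letters[1:3])

# Per-row function returning a simple named list
FUN <- function(x) list(ab = paste(x$a, x$b, sep = "-"), a2 = x$a^2)

# List form: one element per row of X
res_list <- lapply(seq_len(nrow(X)), function(i) FUN(X[i, , drop = FALSE]))

# Data frame form: convert each element to a one-row data frame and stack them
# (assumed combining strategy, for illustration only)
res_df <- do.call(rbind, lapply(res_list, as.data.frame, stringsAsFactors = FALSE))
res_df
```

A function that returns several rows per input row would stack the same way, giving more output rows than X has.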
Author: Landon Sego
# dfplapply is provided by the Smisc package (assumed to be installed)
library(Smisc)

X <- data.frame(a = 1:3, b = letters[1:3])

# Function that will operate on each row of X, producing a simple list
test.1 <- function(x) {
  list(ab = paste(x$a, x$b, sep = "-"), a2 = x$a^2, bnew = paste(x$b, "new", sep = "."))
}

# Data frame output
dfplapply(X, test.1, output.df = TRUE, njobs = 2)

# List output
dfplapply(X, test.1, njobs = 2)

# Function with 2 rows of output
test.2 <- function(x) {
  data.frame(ab = rep(paste(x$a, x$b, sep = "-"), 2), a2 = rep(x$a^2, 2))
}

dfplapply(X, test.2, output.df = TRUE, njobs = 2, verbose = TRUE)

# Passing in other objects needed by FUN
a.out <- 10

test.3 <- function(x) {
  data.frame(a = x$a + a.out, b = paste(x$b, a.out, sep = "-"))
}

dfplapply(X, test.3, output.df = TRUE, needed.objects = "a.out", njobs = 2)