flply | R Documentation |
With flply() you can apply a function to each block of the file separately.
The result of each function call is saved into a list and returned. flply()
is similar to lapply(), except that it applies the function to each
block of the file rather than to each element of a list. It is also similar
to by(), except that it does not read the whole file into memory; instead,
each block is processed as soon as it is read from the disk.
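To make the comparison with lapply() and by() concrete, the sketch below shows an in-memory counterpart of this kind of per-block processing, using base R only. It assumes, as the key.sep argument suggests, that a block is the set of lines sharing the same first field. The toy file, its columns (key, x), and the variable names are invented for the illustration. Unlike flply(), this version loads the whole file before splitting it.

```r
# Build a small tab-separated file whose first field identifies the block.
# The file and its contents are made up for this illustration.
path <- tempfile(fileext = ".tsv")
toy <- data.frame(
  key = c("a", "a", "b", "b", "b"),
  x   = c(1, 2, 3, 4, 5)
)
write.table(toy, path, sep = "\t", quote = FALSE, row.names = FALSE)

# In-memory approach: read everything, split on the first column,
# then apply the function to each piece with lapply().
whole <- read.delim(path, stringsAsFactors = FALSE)
blocks <- split(whole, whole[[1]])
block_means <- lapply(blocks, function(d) mean(d$x))

# block_means is a list with one element per distinct key.
str(block_means)
```

The result has the same shape as the one described here (a list with one entry per block), but memory use grows with the size of the file, which is the cost flply() is meant to avoid.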
flply(
  input,
  FUN,
  ...,
  key.sep = "\t",
  sep = "\t",
  skip = 0,
  header = TRUE,
  nblocks = Inf,
  stringsAsFactors = FALSE,
  colClasses = NULL,
  select = NULL,
  drop = NULL,
  col.names = NULL,
  parallel = 1
)
input |
Path of the input file. |
FUN |
A function to be applied to each block. The first argument to the
function must be a |
... |
Additional arguments to be passed to FUN. |
key.sep |
The character that delimits the first field from the rest. |
sep |
The field delimiter (often equal to |
skip |
Number of lines to skip at the beginning of the file. |
header |
Whether the file has a header. |
nblocks |
The number of blocks to read. |
stringsAsFactors |
Whether to convert strings into factors. |
colClasses |
Vector or list specifying the class of each field. |
select |
The columns (names or numbers) to be read. |
drop |
The columns (names or numbers) not to be read. |
col.names |
Names of the columns. |
parallel |
Number of cores to use. |
Returns a list containing, for each block, the result of applying FUN to it.
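As a hedged illustration of how the arguments and the return value fit together, the sketch below limits the number of blocks with nblocks and forwards an extra argument to FUN through `...`. It assumes the fplyr package is installed and uses the example file shipped with it. The argument name `digits` and the variable names are chosen for this sketch only, and no particular output is claimed.

```r
library(fplyr)

# Example file shipped with the package.
f <- system.file("extdata", "dt_iris.csv", package = "fplyr")

# Stop after the first two blocks and count the rows in each one.
first_two <- flply(f, nrow, nblocks = 2)

# Forward an extra argument to FUN via `...`.
# `digits` is a hypothetical name picked for this example.
rounded_cor <- flply(
  f,
  function(d, digits) round(cor(d$Sepal.Length, d$Petal.Length), digits),
  digits = 2
)

# Both results are lists with one element per processed block.
str(first_two)
str(rounded_cor)
```

Because the result is an ordinary list, it can be passed on to lapply(), sapply(), or do.call() for further processing.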
flply: from file to list
f <- system.file("extdata", "dt_iris.csv", package = "fplyr")
# Compute, within each block, the correlation between Sepal.Length and Petal.Length
flply(f, function(d) cor(d$Sepal.Length, d$Petal.Length))
# Summarise each block
flply(f, summary)
# Make a different linear model for each block
block.lm <- function(d) {
  lm(Sepal.Length ~ ., data = d[, !"Species"])
}
lm.list <- flply(f, block.lm)