ftply | R Documentation |
ftply takes as input the path to a file and a function, and returns a data.table. It is a faster equivalent to using l <- flply(...) followed by do.call(rbind, l).
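The two-step pattern mentioned above (build a list with one result per block, then bind the list into one table) can be sketched in base R on an in-memory data frame. This is only an analogy under stated assumptions: the toy data, the helper `block_mean`, and the use of `split()` are all hypothetical stand-ins, and nothing here reads a file or reflects how fplyr is implemented.

```r
# Toy in-memory table whose first column is the key; rows sharing a key
# form one "block" (hypothetical data, for illustration only).
df <- data.frame(
  key   = c("a", "a", "b", "b", "b"),
  value = c(1, 3, 10, 20, 30),
  stringsAsFactors = FALSE
)

# Hypothetical per-block function: returns one summary row per block.
block_mean <- function(d) {
  data.frame(key = d$key[1], mean_value = mean(d$value),
             stringsAsFactors = FALSE)
}

# Step 1: one result per block, collected in a list.
l <- lapply(split(df, df$key), block_mean)

# Step 2: bind the per-block results into a single table.
result <- do.call(rbind, l)
rownames(result) <- NULL
print(result)
```

The intermediate list `l` is what the single-call form lets you skip: the per-block results are combined for you and returned as one table.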
ftply(
input,
FUN = function(d, by) d,
...,
key.sep = "\t",
sep = "\t",
skip = 0,
header = TRUE,
nblocks = Inf,
stringsAsFactors = FALSE,
colClasses = NULL,
select = NULL,
drop = NULL,
col.names = NULL,
parallel = 1
)
input |
Path of the input file. |
FUN |
Function to be applied to each block. It must take at least two arguments,
the first of which is a |
... |
Additional arguments to be passed to FUN. |
key.sep |
The character that delimits the first field from the rest. |
sep |
The field delimiter (often equal to |
skip |
Number of lines to skip at the beginning of the file. |
header |
Whether the file has a header. |
nblocks |
The number of blocks to read. |
stringsAsFactors |
Whether to convert strings into factors. |
colClasses |
Vector or list specifying the class of each field. |
select |
The columns (names or numbers) to be read. |
drop |
The columns (names or numbers) not to be read. |
col.names |
Names of the columns. |
parallel |
Number of cores to use. |
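The calling convention for FUN (two leading arguments, with anything extra forwarded through `...`) can be illustrated with a small base R stand-in. Everything below is hypothetical: `apply_blocks` is not part of fplyr, and the sketch assumes that `by` carries the key value of the current block, which the argument description above does not spell out.

```r
# Hypothetical stand-in for a block loop; NOT fplyr's implementation.
# It calls FUN(d, by, ...) once per distinct value of the first column.
apply_blocks <- function(df, FUN = function(d, by) d, ...) {
  keys <- unique(df[[1]])
  out <- lapply(keys, function(k) {
    d <- df[df[[1]] == k, , drop = FALSE]  # rows of the current block
    FUN(d, k, ...)                         # assumption: `by` is the block key
  })
  do.call(rbind, out)
}

# Toy data (hypothetical).
df <- data.frame(
  key = c("a", "a", "b"),
  x   = c(1.234, 2.345, 3.456),
  stringsAsFactors = FALSE
)

# A FUN with the two required arguments plus one extra argument, `digits`,
# which the caller supplies through `...`.
rounded_mean <- function(d, by, digits) {
  data.frame(key = by, mean_x = round(mean(d$x), digits),
             stringsAsFactors = FALSE)
}

res <- apply_blocks(df, rounded_mean, digits = 1)
print(res)
```

With the default `FUN = function(d, by) d`, each block is returned unchanged, so the stand-in reproduces its input rows.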
ftply is similar to ffply, but while the latter writes the result of the processing to disk after each block, the former keeps the result in memory until the whole file has been processed, and then returns the complete data.table.
Returns a data.table with the results of the processing.
ftply: from file to data.table
f1 <- system.file("extdata", "dt_iris.csv", package = "fplyr")
# Compute the mean of the columns for each species
ftply(f1, function(d, by) d[, lapply(.SD, mean)])
# Read only the first two blocks
ftply(f1, nblocks = 2)