flply: Read, process each block and return a list
In fplyr: Apply Functions to Blocks of Files

flply

R Documentation

Read, process each block and return a list

Description

With flply() you can apply a function to each block of a file separately. A block is a set of contiguous lines that share the same value in the first field. The result of each call is stored in a list, which is returned. flply() is similar to lapply(), except that it applies the function to each block of a file rather than to each element of a list. It is also similar to by(), except that it does not read the whole file into memory: each block is processed as soon as it is read from disk.
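To make the comparison with lapply() and by() concrete, here is a minimal base-R sketch of the in-memory analogue. It assumes, as the `key.sep` argument suggests, that blocks are runs of consecutive lines sharing the same first field. The toy file, the column names `key` and `value`, and the variable names are made up for illustration.

```r
# Toy tab-separated file; the first field ("key") defines the blocks.
tmp <- tempfile(fileext = ".tsv")
writeLines(c("key\tvalue", "a\t1", "a\t3", "b\t10", "b\t20", "b\t30"), tmp)

# In-memory analogue: read everything, split by the first field, then lapply().
# This loads the whole file at once, which flply() is designed to avoid.
df <- read.delim(tmp, sep = "\t", header = TRUE, stringsAsFactors = FALSE)
block_means <- lapply(split(df, df$key), function(d) mean(d$value))

# Streaming counterpart, run only if fplyr is installed.
if (requireNamespace("fplyr", quietly = TRUE)) {
  streamed <- fplyr::flply(tmp, function(d) mean(d$value))
}

unlink(tmp)
```

The split()/lapply() version is fine for small files; the point of flply() is to get the same kind of per-block list without holding the full table in memory.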

Usage

flply(
  input,
  FUN,
  ...,
  key.sep = "\t",
  sep = "\t",
  skip = 0,
  header = TRUE,
  nblocks = Inf,
  stringsAsFactors = FALSE,
  colClasses = NULL,
  select = NULL,
  drop = NULL,
  col.names = NULL,
  parallel = 1
)

Arguments

`input`	Path of the input file.
`FUN`	A function to be applied to each block. The first argument to the function must be a `data.table` containing the current block. Additional arguments can be passed with `...`.
`...`	Additional arguments to be passed to FUN.
`key.sep`	The character that separates the first field (the key) from the rest.
`sep`	The field delimiter (often equal to `key.sep`).
`skip`	Number of lines to skip at the beginning of the file.
`header`	Whether the file has a header.
`nblocks`	The number of blocks to read.
`stringsAsFactors`	Whether to convert strings into factors.
`colClasses`	Vector or list specifying the class of each field.
`select`	The columns (names or numbers) to be read.
`drop`	The columns (names or numbers) not to be read.
`col.names`	Names of the columns.
`parallel`	Number of cores to use.
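As an illustration of how `...` and `nblocks` fit together, the sketch below forwards an extra argument to FUN and stops after two blocks. The helper `round_mean` and its `digits` parameter are hypothetical names chosen for this example; the code only calls flply() when fplyr is installed.

```r
res <- NULL
if (requireNamespace("fplyr", quietly = TRUE)) {
  f <- system.file("extdata", "dt_iris.csv", package = "fplyr")

  # Hypothetical helper: the first argument is the block, the rest comes via `...`.
  round_mean <- function(d, digits) round(mean(d$Sepal.Length), digits)

  # `digits = 1` is passed through `...` to round_mean();
  # `nblocks = 2` limits reading to the first two blocks of the file.
  res <- fplyr::flply(f, round_mean, digits = 1, nblocks = 2)
}
```

Naming the extra argument (here `digits = 1`) keeps it from being confused with flply()'s own arguments.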

Value

Returns a list containing, for each block, the result of applying FUN to that block.
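Because the return value is an ordinary list, it can be post-processed with the usual base-R tools. The sketch below uses a hand-made stand-in list (toy values, not real flply() output) and assumes that FUN returned a named numeric vector of the same length for every block.

```r
# Stand-in for an flply() result: one named numeric vector per block (toy values).
per_block <- list(
  c(n = 2, mean = 2),
  c(n = 3, mean = 20)
)

# Stack the per-block vectors into a matrix with one row per block.
combined <- do.call(rbind, per_block)

# For scalar results, vapply() gives a plain numeric vector instead.
block_sizes <- vapply(per_block, function(x) x[["n"]], numeric(1))
```

If FUN returns objects of varying shape, such as fitted models, keeping the list as is and using lapply() on it is usually the simpler route.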

Slogan

flply: from file to list

Examples

f <- system.file("extdata", "dt_iris.csv", package = "fplyr")

# Compute, within each block, the correlation between Sepal.Length and Petal.Length
flply(f, function(d) cor(d$Sepal.Length, d$Petal.Length))

# Summarise each block
flply(f, summary)

# Fit a separate linear model for each block
block.lm <- function(d) {
  # d is a data.table; !"Species" drops the Species column before fitting
  lm(Sepal.Length ~ ., data = d[, !"Species"])
}
lm.list <- flply(f, block.lm)

fplyr documentation built on Aug. 24, 2023, 1:08 a.m.