dt_ddply: Split 'data.table', apply function, and return results in a...

View source: R/main_dt_ddply.R

dt_ddply    R Documentation

Split data.table, apply function, and return results in a data.table.

Description

For each subset of a data.table, apply a function, then combine the results into a data.table.
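The split-apply-combine pattern described here can be sketched with data.table alone. This is only an illustration of the idea, not the package's implementation; the helper name split_apply_combine is hypothetical and does not exist in the package.

```r
library(data.table)

# Hypothetical helper: split `dt` by the columns named in `by`, apply `f`
# to each piece, and stack the per-group results into one data.table.
split_apply_combine <- function(dt, by, f) {
  pieces  <- split(dt, by = by)   # named list of data.tables, one per group
  results <- lapply(pieces, f)    # apply f to each subset
  rbindlist(results)              # combine the rows into a single data.table
}

dt  <- data.table(x = 1:10, y = 1:5)
# Keep the row with the largest x within each value of y
res <- split_apply_combine(dt, "y", function(d) d[which.max(x)])
```

Here res has one row per value of y, each holding the largest x seen in that group.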

Usage

dt_ddply(
  .data,
  .variables,
  .f = NULL,
  ...,
  .progress = "none",
  .drop = TRUE,
  .parallel = FALSE
)

dt_ldply(
  .data,
  .f = NULL,
  ...,
  .progress = "none",
  .parallel = FALSE,
  .id = NA
)

dt_dlply(
  .data,
  .variables,
  .f = NULL,
  ...,
  .progress = "none",
  .drop = TRUE,
  .parallel = FALSE
)

Arguments

.data

data frame to be processed

.variables

variables to split data frame by, given as as.quoted variables, a formula, or a character vector
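As a rough illustration, the three accepted forms can be compared with plyr's as.quoted(). This assumes .variables is interpreted the way plyr::as.quoted() interprets its input, which the wording above suggests but this page does not state.

```r
library(plyr)

# Three ways to name the same grouping variable; each yields an object
# of class "quoted".
q_dot     <- .(y)             # quoted directly with plyr's .()
q_formula <- as.quoted(~ y)   # one-sided formula
q_char    <- as.quoted("y")   # character vector
```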

.f

A function, specified in one of the following ways:

  • A named function, e.g. mean.

  • An anonymous function, e.g. \(x) x + 1 or function(x) x + 1.

  • A formula, e.g. ~ .x + 1. You must use .x to refer to the first argument. Only recommended if you require backward compatibility with older versions of R.

  • A string, integer, or list, e.g. "idx", 1, or list("idx", 1) which are shorthand for \(x) pluck(x, "idx"), \(x) pluck(x, 1), and \(x) pluck(x, "idx", 1) respectively. Optionally supply .default to set a default value if the indexed element is NULL or does not exist.
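The forms listed above follow purrr's mapper conventions, so purrr::as_mapper() can be used to see what each shorthand expands to. Whether dt_ddply converts .f with as_mapper() internally is an assumption; the sketch below only demonstrates the shorthand itself. The object rec is made up for illustration.

```r
library(purrr)

add_one_lambda  <- \(x) x + 1                 # anonymous function (R >= 4.1)
add_one_formula <- as_mapper(~ .x + 1)        # formula shorthand
get_idx         <- as_mapper("idx")           # string: extract element "idx"
get_first       <- as_mapper(1)               # integer: extract first element
get_nested      <- as_mapper(list("idx", 1))  # list: extract idx, then its first element

# .default supplies a fallback when the element is NULL or missing
get_missing <- as_mapper("absent", .default = NA)

rec <- list(idx = c(10, 20))
add_one_lambda(1)
add_one_formula(1)
get_idx(rec)
get_nested(rec)
get_missing(rec)
```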

...

other arguments passed on to .f

.progress

name of the progress bar to use, see create_progress_bar

.drop

should combinations of variables that do not appear in the input data be preserved (FALSE) or dropped (TRUE, default)
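To make "combinations that do not appear in the input data" concrete, the sketch below uses plain data.table to contrast observed group combinations with the full cross of grouping values. It illustrates the concept only and is not how .drop is implemented; the column names g1, g2, and total are made up.

```r
library(data.table)

dt <- data.table(g1 = c("a", "a", "b"), g2 = c("u", "v", "u"), x = 1:3)

# Grouping yields only combinations present in the data: (a,u), (a,v), (b,u)
observed <- dt[, .(total = sum(x)), by = .(g1, g2)]

# Cross join of all grouping values, including the unobserved pair (b,v)
all_combos <- CJ(g1 = unique(dt$g1), g2 = unique(dt$g2))

# Joining onto the full cross keeps (b,v), with total set to NA
with_empty <- observed[all_combos, on = .(g1, g2)]
```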

.parallel

if TRUE, apply the function in parallel, using the parallel backend provided by foreach
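A foreach backend generally has to be registered before parallel execution can take effect. The sketch below uses doParallel as one possible backend; this page does not say which backend is expected, so that choice and the worker count are assumptions.

```r
library(foreach)
library(doParallel)

registerDoParallel(cores = 2)   # register a backend with two workers

# A small foreach loop to confirm the backend works
squares <- foreach(i = 1:4, .combine = c) %dopar% i^2

stopImplicitCluster()           # release the workers when done
```

With a backend registered this way, a call using .parallel = TRUE should be able to pick it up, assuming the function delegates to foreach as described.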

Examples

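library(data.table)  # data.table()
library(dplyr)       # top_n()
library(Ipaper)      # dt_ddply(), dt_dlply()
dt <- data.table(x = 1:10, y = 1:5)
dt_dlply(dt, .(y), ~.[which.max(x)])  # list: row with the largest x per y
dt_ddply(dt, .(y), ~ top_n(., 1, x))  # data.table: top row by x per y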

rpkgs/Ipaper documentation built on March 24, 2024, 3:09 p.m.