adply: Split array, apply function, and return results in a data...

View source: R/adply.r

adply    R Documentation

Split array, apply function, and return results in a data frame.

Description

For each slice of an array, apply a function, then combine the results into a data frame.

Usage

adply(
  .data,
  .margins,
  .fun = NULL,
  ...,
  .expand = TRUE,
  .progress = "none",
  .inform = FALSE,
  .parallel = FALSE,
  .paropts = NULL,
  .id = NA
)

Arguments

.data

matrix, array or data frame to be processed

.margins

a vector giving the subscripts to split up data by. 1 splits up by rows, 2 by columns and c(1,2) by rows and columns, and so on for higher dimensions

.fun

function to apply to each piece

...

other arguments passed on to .fun

.expand

if .data is a data frame, should output be 1d (.expand = FALSE), with an element for each row; or nd (.expand = TRUE), with a dimension for each variable.

.progress

name of the progress bar to use, see create_progress_bar

.inform

produce informative error messages? This is turned off by default because it substantially slows processing speed, but is very useful for debugging

.parallel

if TRUE, apply function in parallel, using parallel backend provided by foreach

.paropts

a list of additional options passed into the foreach function when parallel computation is enabled. This is important if (for example) your code relies on external data or packages: use the .export and .packages arguments to supply them so that all cluster nodes have the correct environment set up for computing.

.id

name(s) of the index column(s). Pass NULL to avoid creation of the index column(s). Omit or pass NA to use the default names "X1", "X2", .... Otherwise, this argument must have the same length as .margins.
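As an illustration of how .margins and .id interact, here is a minimal sketch on a small made-up matrix. The object m, the helper row_total and the column name total are hypothetical names chosen for this example; they are not part of plyr.

```r
library(plyr)

# Hypothetical 2 x 3 input matrix, filled column-wise:
# row 1 holds 1, 3, 5 and row 2 holds 2, 4, 6.
m <- matrix(1:6, nrow = 2)

# Illustrative helper: summarise one slice as a one-row data frame.
row_total <- function(x) data.frame(total = sum(x))

# .margins = 1 splits by rows; the index column gets the default name "X1".
by_row <- adply(m, 1, row_total)

# .id supplies a custom name for the index column (same length as .margins).
by_row_named <- adply(m, 1, row_total, .id = "row")

# .id = NULL suppresses the index column altogether.
by_row_bare <- adply(m, 1, row_total, .id = NULL)

# .margins = c(1, 2) splits by rows and columns, giving one piece per cell.
by_cell <- adply(m, c(1, 2), row_total)
```

Here by_row has one row per matrix row, with totals 9 and 12, while by_cell has one row for each of the six cells.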

Value

A data frame, as described in the output section.

Input

This function splits matrices, arrays and data frames by dimensions.

Output

The least ambiguous behaviour is achieved when .fun returns a data frame; in that case the pieces will be combined with rbind.fill. If .fun returns an atomic vector of fixed length, the vectors will be bound together by row and converted to a data frame. Any other values will result in an error.

If there are no results, then this function will return a data frame with zero rows and columns (data.frame()).
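The two supported return types described above can be compared side by side. The following is a small sketch on a made-up matrix; m, col_range_df and col_range_vec are hypothetical names used only for this illustration.

```r
library(plyr)

# Hypothetical 2 x 3 input matrix; its columns are (1, 2), (3, 4), (5, 6).
m <- matrix(1:6, nrow = 2)

# Case 1: .fun returns a data frame, so the pieces are combined
# with rbind.fill.
col_range_df <- adply(m, 2, function(x) {
  data.frame(min = min(x), max = max(x))
})

# Case 2: .fun returns a named atomic vector of fixed length, so the
# vectors are bound together by row and converted to a data frame.
col_range_vec <- adply(m, 2, function(x) {
  c(min = min(x), max = max(x))
})
```

Both calls give one row per column of m. Returning a data frame makes the column names and types explicit, which is why it is described as the least ambiguous option.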

References

Hadley Wickham (2011). The Split-Apply-Combine Strategy for Data Analysis. Journal of Statistical Software, 40(1), 1-29. https://www.jstatsoft.org/v40/i01/.

See Also

Other array input: a_ply(), aaply(), alply()

Other data frame output: ddply(), ldply(), mdply()


plyr documentation built on Oct. 2, 2023, 9:07 a.m.
