groupingsets: Grouping Set aggregation for data tables

View source: R/groupingsets.R

groupingsetsR Documentation

Grouping Set aggregation for data tables

Description

Calculate aggregates at various levels of groupings producing multiple (sub-)totals. Reflects SQLs GROUPING SETS operations.

Usage

rollup(x, ...)
## S3 method for class 'data.table'
rollup(x, j, by, .SDcols, id = FALSE, ...)
cube(x, ...)
## S3 method for class 'data.table'
cube(x, j, by, .SDcols, id = FALSE, ...)
groupingsets(x, ...)
## S3 method for class 'data.table'
groupingsets(x, j, by, sets, .SDcols, id = FALSE, jj, ...)

Arguments

x

data.table.

...

argument passed to custom user methods. Ignored for data.table methods.

j

expression passed to data.table j.

by

character column names by which we are grouping.

sets

list of character vector reflecting grouping sets, used in groupingsets for flexibility.

.SDcols

columns to be used in j expression in .SD object.

id

logical default FALSE. If TRUE it will add leading column with bit mask of grouping sets.

jj

quoted version of j argument, for convenience. When provided function will ignore j argument.

Details

All three functions rollup, cube, groupingsets are generic methods, data.table methods are provided.

Value

A data.table with various aggregates.

References

https://www.postgresql.org/docs/9.5/static/queries-table-expressions.html#QUERIES-GROUPING-SETS https://www.postgresql.org/docs/9.5/static/functions-aggregate.html#FUNCTIONS-GROUPING-TABLE

See Also

data.table, rbindlist

Examples

n = 24L
set.seed(25)
DT <- data.table(
    color = sample(c("green","yellow","red"), n, TRUE),
    year = as.Date(sample(paste0(2011:2015,"-01-01"), n, TRUE)),
    status = as.factor(sample(c("removed","active","inactive","archived"), n, TRUE)),
    amount = sample(1:5, n, TRUE),
    value = sample(c(3, 3.5, 2.5, 2), n, TRUE)
)

# rollup
rollup(DT, j = sum(value), by = c("color","year","status")) # default id=FALSE
rollup(DT, j = sum(value), by = c("color","year","status"), id=TRUE)
rollup(DT, j = lapply(.SD, sum), by = c("color","year","status"), id=TRUE, .SDcols="value")
rollup(DT, j = c(list(count=.N), lapply(.SD, sum)), by = c("color","year","status"), id=TRUE)

# cube
cube(DT, j = sum(value), by = c("color","year","status"), id=TRUE)
cube(DT, j = lapply(.SD, sum), by = c("color","year","status"), id=TRUE, .SDcols="value")
cube(DT, j = c(list(count=.N), lapply(.SD, sum)), by = c("color","year","status"), id=TRUE)

# groupingsets
groupingsets(DT, j = c(list(count=.N), lapply(.SD, sum)), by = c("color","year","status"),
             sets = list("color", c("year","status"), character()), id=TRUE)

Rdatatable/data.table documentation built on March 18, 2024, 2:11 a.m.