groupingsets | R Documentation |
Calculate aggregates at various levels of groupings producing multiple (sub-)totals. Reflects SQLs GROUPING SETS operations.
rollup(x, ...)
## S3 method for class 'data.table'
rollup(x, j, by, .SDcols, id = FALSE, label = NULL, ...)
cube(x, ...)
## S3 method for class 'data.table'
cube(x, j, by, .SDcols, id = FALSE, label = NULL, ...)
groupingsets(x, ...)
## S3 method for class 'data.table'
groupingsets(x, j, by, sets, .SDcols, id = FALSE, jj, label = NULL, ...)
x |
|
... |
argument passed to custom user methods. Ignored for |
j |
expression passed to data.table |
by |
character column names by which we are grouping. |
sets |
list of character vector reflecting grouping sets, used in |
.SDcols |
columns to be used in |
id |
logical default |
jj |
quoted version of |
label |
label(s) to be used in the 'total' rows in the grouping variable columns of the output, that is, in rows where the grouping variable has been aggregated. Can be a named list of scalars, or a scalar, or |
All three functions rollup, cube, groupingsets
are generic methods, data.table
methods are provided.
The label
argument can be a named list of scalars, or a scalar, or NULL
. When label
is a list, each element name must be (1) a variable name in by
, or (2) the first element of the class in the data.table x
of a variable in by
, or (3) one of 'character', 'integer', 'numeric', 'factor', 'Date', 'IDate'. The order of the list elements is not important. A label specified by variable name will apply only to that variable, while a label specified by first element of a class will apply to all variables in by
for which the first element of the class of the variable in x
matches the label
element name, except for variables that have a label specified by variable name (that is, specification by variable name takes precedence over specification by class). For label
elements with name in by
, the class of the label value must be the same as the class of the variable in x
. For label
elements with name not in by
, the first element of the class of the label value must be the same as the label
element name. For example, label = list(integer = 999, IDate = as.Date("3000-01-01"))
would produce an error because class(999)[1]
is not "integer"
and class(as.Date("3000-01-01"))[1]
is not "IDate"
. A corrected specification would be label = list(integer = 999L, IDate = as.IDate("3000-01-01"))
.
The label = <scalar>
option provides a shorter alternative in the case where only one class of grouping variable requires a label. For example, label = list(character = "Total")
can be shortened to label = "Total"
. When this option is used, the label will be applied to all variables in by
for which the first element of the class of the variable in x
matches the first element of the class of the scalar.
A data.table with various aggregates.
https://www.postgresql.org/docs/9.5/static/queries-table-expressions.html#QUERIES-GROUPING-SETS https://www.postgresql.org/docs/9.5/static/functions-aggregate.html#FUNCTIONS-GROUPING-TABLE
data.table
, rbindlist
n = 24L
set.seed(25)
DT <- data.table(
color = sample(c("green","yellow","red"), n, TRUE),
year = as.Date(sample(paste0(2011:2015,"-01-01"), n, TRUE)),
status = as.factor(sample(c("removed","active","inactive","archived"), n, TRUE)),
amount = sample(1:5, n, TRUE),
value = sample(c(3, 3.5, 2.5, 2), n, TRUE)
)
# rollup
by_vars = c("color", "year", "status")
rollup(DT, j=sum(value), by=by_vars) # default id=FALSE
rollup(DT, j=sum(value), by=by_vars, id=TRUE)
rollup(DT, j=lapply(.SD, sum), by=by_vars, id=TRUE, .SDcols="value")
rollup(DT, j=c(list(count=.N), lapply(.SD, sum)), by=by_vars, id=TRUE)
rollup(DT, j=sum(value), by=by_vars,
# specify label by variable name
label=list(color="total", year=as.Date("3000-01-01"), status=factor("total")))
rollup(DT, j=sum(value), by=by_vars,
# specify label by variable name and first element of class
label=list(color="total", Date=as.Date("3000-01-01"), factor=factor("total")))
# label is character scalar so applies to color only
rollup(DT, j=sum(value), by=by_vars, label="total")
rollup(DT, j=.N, by=c("color", "year", "status", "value"),
# label can be explicitly specified as NA or NaN
label = list(color=NA_character_, year=as.Date(NA), status=factor(NA), value=NaN))
# cube
cube(DT, j = sum(value), by = c("color","year","status"), id=TRUE)
cube(DT, j = lapply(.SD, sum), by = c("color","year","status"), id=TRUE, .SDcols="value")
cube(DT, j = c(list(count=.N), lapply(.SD, sum)), by = c("color","year","status"), id=TRUE)
# groupingsets
groupingsets(DT, j = c(list(count=.N), lapply(.SD, sum)), by = c("color","year","status"),
sets = list("color", c("year","status"), character()), id=TRUE)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.