GroupBy: Compute Aggregates of Data Subsets

Description

Splits the data into subsets based on grouping expressions, performs the given aggregates per subset, and returns the results in a convenient form.
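As a rough mental model only, the split, aggregate, and recombine steps can be mimicked in base R on an ordinary data frame. The data frame `orders` and its columns are made up for illustration, and nothing here uses gtBase or reflects how GroupBy is implemented.

```r
# Toy data frame standing in for the waypoint; not a gtBase object.
orders <- data.frame(
  region = c("east", "west", "east", "west", "east"),
  amount = c(10, 20, 30, 40, 50)
)

# Split the rows by the grouping expression (here, the region column).
pieces <- split(orders, orders$region)

# Compute the aggregates per subset, then bind one row per group.
result <- do.call(rbind, lapply(names(pieces), function(key) {
  rows <- pieces[[key]]
  data.frame(
    region  = key,
    total   = sum(rows$amount),
    average = mean(rows$amount)
  )
}))

print(result)
```

Each group contributes one row holding the grouping value followed by the aggregate outputs, which is the shape the description refers to.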

Usage

GroupBy(data, group, ..., fragment.size = 2e+06, init.size = 1024,
  use.mct = TRUE, debug = 0, states = list())

Arguments

data

A waypoint.

group

A named list of expressions, with the names being used as the names of the corresponding outputs. These expressions are output in addition to the results of the inner GLAs.

If no name is given and the corresponding expression is simply an attribute, then said attribute is used as the name. Otherwise, the column for that expression is hidden from the user.

fragment.size

The number of tuples returned per fragment. This should only be changed from its default value by experienced users.

init.size

The number of groups that space is initially allocated for.

use.mct

Whether the MCT hash function should be used.

debug

Whether debugging information should be printed to standard output.

states

Additional states to pass through.

...

Specification of the inner GLAs. See ‘Details’ for more information.
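The naming rule for the group argument can be sketched in base R. The helper `group_output_names` is hypothetical and not part of gtBase; it takes the expressions through `...` purely for convenience, and it marks a hidden column with `NA`, which is an assumption made for this illustration.

```r
# Hypothetical helper illustrating the naming rule; not part of gtBase.
group_output_names <- function(...) {
  # Capture the expressions without evaluating them.
  exprs <- as.list(substitute(list(...)))[-1]
  given <- names(exprs)
  if (is.null(given)) given <- rep("", length(exprs))

  vapply(seq_along(exprs), function(i) {
    if (nzchar(given[i])) {
      given[i]                  # an explicit name is used as the output name
    } else if (is.name(exprs[[i]])) {
      as.character(exprs[[i]])  # a bare attribute supplies its own name
    } else {
      NA_character_             # unnamed compound expression: column hidden
    }
  }, character(1))
}

# order_year, region, price, and qty are made-up attribute names.
group_names <- group_output_names(year = order_year, region, price * qty)
print(group_names)
```

Here the first expression keeps its explicit name, the second is named after the attribute itself, and the third has no usable name.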

Details

The inner GLAs should be specified as a list of calls to other aggregate functions, such as Sum or Mean. In each of these calls, the data argument should be omitted, as it is inferred to be the data passed to GroupBy. Additionally, each argument specifying an inner GLA may be named. If so, that name is used as the output of the corresponding GLA. This is purely a stylistic shortcut; the normal method of specifying the outputs can still be used instead.
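A call following these conventions might look like the sketch below. It is kept unevaluated with `quote()` because running it requires gtBase and a real waypoint. The waypoint `orders`, the attributes `region` and `amount`, and the use of `c()` to write the grouping expressions are all assumptions made for illustration, not confirmed details of the gtBase API.

```r
# Unevaluated sketch; evaluating it would need gtBase and a waypoint `orders`.
# The attribute names region and amount are hypothetical.
sketch <- quote(
  GroupBy(orders, group = c(region),
          total   = Sum(amount),   # named argument: requested output `total`
          average = Mean(amount))  # data argument omitted in the inner call
)

# Drop the function name, the data argument, and the group argument;
# what remains are the inner GLA specifications passed through `...`.
inner <- as.list(sketch)[-(1:3)]
inner_names <- names(inner)
print(inner_names)
```

The names attached to the inner calls are the ones that would be taken as their outputs under the shortcut described above.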

The output names of these inner GLAs must not clash with each other or with the names of the grouping expressions.

If one inner GLA produces multiple rows and the rest produce a single row, each of the single-row outputs is repeated accordingly.

If more than one inner GLA produces multiple rows, an error is thrown.
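The row-repetition rule and the error case can be mimicked in base R. The helper `combine_group_outputs` is hypothetical and works on plain data frames for a single group; it illustrates the described behaviour and is not how gtBase implements it.

```r
# Hypothetical helper mirroring the rule described above; not gtBase code.
combine_group_outputs <- function(outputs) {
  rows <- vapply(outputs, nrow, integer(1))

  # At most one inner aggregate may produce multiple rows.
  if (sum(rows > 1) > 1) {
    stop("more than one inner aggregate produced multiple rows")
  }

  target <- max(rows)
  expanded <- lapply(outputs, function(df) {
    # Repeat a single-row output so it lines up with the multi-row one.
    if (nrow(df) == 1) df <- df[rep(1, target), , drop = FALSE]
    rownames(df) <- NULL
    df
  })

  do.call(cbind, unname(expanded))
}

single <- data.frame(total = 90)               # one row for the group
multi  <- data.frame(top_amount = c(50, 30))   # two rows for the group
combined <- combine_group_outputs(list(single, multi))
print(combined)
```

Passing two multi-row outputs, for example `combine_group_outputs(list(multi, multi))`, stops with an error, matching the second rule.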

The output for each group is then concatenated and returned.

Value

A waypoint.

Author(s)

Jon Claus, <jonterainsights@gmail.com>, Tera Insights, LLC.


tera-insights/gtBase documentation built on May 31, 2019, 8:35 a.m.