ExtremeTuples: Extreme Tuples


Description

The data is filtered to include only the tuples that attain the extremities of the given expressions.

Usage

1

Arguments

data

A waypoint.

inputs

A named list of expressions, with the names being used as the corresponding outputs. These expressions are output in addition to those used to specify the extremities.

If no name is given and the corresponding expression is simply an attribute, then that attribute is used as the name. Otherwise an error is thrown, as there is no reason to include an extra input if the corresponding output column cannot be referenced later.

If this is not given, then each attribute of data that is not used exactly as an expression for comparison is included.

outputs

The usual way to specify the outputs. If both this and names for the inputs are given, a warning is issued and outputs takes precedence.

...

Specification of extremities. See Details for more information.

Details

The extremities to use for the filtering should be provided as a list of arguments, each of which is in the form fun(expr) where fun is either max or min and expr is an expression.

Precedence is based on the order in which the arguments are specified. The result of this GLA contains only the tuples whose values match the given extremities, applied in that order. For example, if the extremities provided were min(att1), max(att2), then the GLA would first filter the data to include only those tuples whose value of att1 is the minimum. Of those, only the tuples whose value of att2 is the maximum on that subset would be returned.
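The precedence rule can be illustrated outside of gtBase with a small base R sketch on an ordinary data frame. The helper extreme_rows, the list-based way of passing extremities, and the toy columns below are all hypothetical; they only mimic the filtering semantics described above and are not how the GLA is implemented.

```r
# Hypothetical helper: apply each extremity in order, keeping only the
# rows of the current subset that attain the requested extreme.
extreme_rows <- function(df, ...) {
  specs <- list(...)  # each spec is list(fun = "min" or "max", col = column name)
  for (spec in specs) {
    values <- df[[spec$col]]
    target <- if (spec$fun == "min") min(values) else max(values)
    df <- df[values == target, , drop = FALSE]
  }
  df
}

# Toy data standing in for a waypoint (assumed values, for illustration only).
toy <- data.frame(
  att1 = c(1, 1, 2, 1),
  att2 = c(5, 9, 10, 9),
  id   = c("a", "b", "c", "d"),
  stringsAsFactors = FALSE
)

# min(att1) keeps rows a, b and d; max(att2) on that subset keeps b and d.
# Row c holds the global maximum of att2 but was removed by the first filter.
result <- extreme_rows(toy,
                       list(fun = "min", col = "att1"),
                       list(fun = "max", col = "att2"))
print(result$id)
```

Note that a later extremity is evaluated only on the rows that survive the earlier ones, so it need not reach its global extreme.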

Each extremity expression is included in the result. If a name is provided in the argument list for an expression, the column is given that name. Otherwise, if the expression is a single attribute, the column is given that attribute's name. If neither applies, the column for that expression is given a constructed name that is hidden from the user and guaranteed not to conflict with other column names.
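The naming rule above can be sketched in base R by inspecting the unevaluated arguments. The helper extremity_names is hypothetical, and the ".hidden_" prefix is only a stand-in: the format of the name gtBase actually constructs is not documented here.

```r
# Hypothetical helper illustrating how a column name could be chosen
# for each extremity argument of the form fun(expr).
extremity_names <- function(...) {
  calls <- as.list(substitute(list(...)))[-1]  # unevaluated arguments
  given <- names(calls)
  if (is.null(given)) given <- rep("", length(calls))
  out <- character(length(calls))
  for (i in seq_along(calls)) {
    expr <- calls[[i]][[2]]            # the expression inside min(...) or max(...)
    if (nzchar(given[i])) {
      out[i] <- given[i]               # an explicit name takes priority
    } else if (is.symbol(expr)) {
      out[i] <- as.character(expr)     # a bare attribute reuses its own name
    } else {
      out[i] <- paste0(".hidden_", i)  # stand-in for the constructed hidden name
    }
  }
  out
}

col_names <- extremity_names(price = min(l_extendedprice * 2),
                             max(l_partkey),
                             min(l_tax + 1))
print(col_names)
```

Naming an extremity explicitly, as with price here, is the only way in this sketch to refer to a computed expression's column afterwards.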

Value

A waypoint with the designated columns and rows.

Author(s)

Jon Claus, <jonterainsights@gmail.com>, Tera Insights, LLC.

See Also

OrderBy for a similarly functioning GLA.

Examples

## One attribute test
data <- Read(lineitem100g)
agg <- ExtremeTuples(data, min(l_extendedprice))
result <- as.data.frame(agg)

## Three attribute test
## Despite being secondary, l_extendedprice still achieves its global
## minimum on the tuples where l_partkey was maximized. However, l_tax
## does not, as the value in the result is 0.03 and in the overall data
## the maximum is 0.08.
data <- Read(lineitem100g)
agg <- ExtremeTuples(data, max(l_partkey), min(l_extendedprice), max(l_tax))
result <- as.data.frame(agg)

tera-insights/gtBase documentation built on May 31, 2019, 8:35 a.m.