extreme: Extreme Tuples


Description

The data is filtered to include only the tuples that attain the extremities (minimum or maximum values) of the given attributes.

Usage

ExtremeTuples(data, ..., inputs = AUTO, outputs = AUTO)

ExtremeTuplesMake(...)

Arguments

data

an object of class "data".

inputs

which attributes of the data to include in the result in addition to those used in the extremities.

outputs

possible re-namings of the inputs.

...

For ExtremeTuples: additional arguments passed to ExtremeTuplesMake.

For ExtremeTuplesMake: a list of extremities to use for the filtering, with precedence going to those seen earlier in the list. In the case that names are provided, these names are used as column names in the result. See ‘details’ for more information.

Details

The extremities to use for the filtering are provided as a list of arguments, each of which has the form fun(expr), where fun is either max or min and expr is an expression.

Precedence follows the order in which the arguments are specified. The result of this GLA contains only the tuples that attain the given extremities. For example, if the extremities provided were min(att1), max(att2), then the GLA would first filter the data to those tuples whose value of att1 is the minimum. Of those, only the tuples whose value of att2 is the maximum on that subset would be returned.
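The precedence rule can be illustrated on an ordinary data frame. The following is a minimal sketch in base R only; it does not use gtBase or reflect how the GLA is implemented, and the data frame df and its columns are made up for illustration.

```r
# Toy data; att1, att2 and id are hypothetical attribute names.
df <- data.frame(
  att1 = c(1, 1, 2, 1),
  att2 = c(5, 9, 12, 9),
  id   = c("a", "b", "c", "d"),
  stringsAsFactors = FALSE
)

# Step 1: keep only the rows where att1 attains its minimum.
step1 <- df[df$att1 == min(df$att1), ]

# Step 2: of those, keep only the rows where att2 attains its maximum
# on that subset (not on the full data).
step2 <- step1[step1$att2 == max(step1$att2), ]

print(step2$id)  # "b" "d"
```

Note that the overall maximum of att2 (12, in row "c") does not appear in the result, because that row was already removed by the higher-precedence min(att1) filter.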

Each extremity expression is included in the result. If a name is provided in the argument list for a given expression, the column is given that corresponding name. Otherwise, if the expression is a single attribute then the column is given that attribute name. If not, then the column for that expression is given a constructed name that is hidden from the user and guaranteed to not conflict with other column names.
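The naming rule can be sketched as a small base R function that inspects unevaluated arguments. This is an illustration only, not the gtBase implementation: extremity_names is a hypothetical helper, and the ".gen" prefix merely stands in for the hidden generated name, whose actual form is not documented here.

```r
# Hypothetical helper: derive result column names from min()/max() arguments.
extremity_names <- function(...) {
  # Capture the arguments without evaluating them.
  args <- as.list(substitute(list(...)))[-1]
  given <- names(args)
  if (is.null(given)) given <- rep("", length(args))

  out <- character(length(args))
  for (i in seq_along(args)) {
    inner <- args[[i]][[2]]  # the expression inside min(...) or max(...)
    if (nzchar(given[i])) {
      out[i] <- given[i]             # an explicit name takes priority
    } else if (is.name(inner)) {
      out[i] <- as.character(inner)  # a single attribute keeps its own name
    } else {
      out[i] <- paste0(".gen", i)    # placeholder for a generated name
    }
  }
  out
}

print(extremity_names(min(att1), total = max(att2 + att3), max(att2 * 2)))
# "att1"  "total" ".gen3"
```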

Value

An object of class "data", with the attribute names and rows as discussed above.

AUTO

If inputs = AUTO, each attribute of the data that was not used directly as an extremity is included in the result. For example, if data contains the attributes att1, att2, att3 and the extremities are min(att1), max(att2 + att3), then the result will contain four columns named att1, gen, att2, att3, where gen is a placeholder for a generated name and the values in that column are att2 + att3.
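The column layout described for inputs = AUTO can be mimicked in base R. This is a sketch under assumptions, not gtBase code: the data frame is invented, and the column name gen is used literally in place of the generated name.

```r
# Toy data with the three attributes from the example above.
df <- data.frame(
  att1 = c(3, 1, 1),
  att2 = c(10, 20, 5),
  att3 = c(1, 2, 30)
)

# Value of the second extremity expression, att2 + att3.
gen <- df$att2 + df$att3

# min(att1) has precedence; max(att2 + att3) is applied on the remaining rows.
keep <- df$att1 == min(df$att1)
keep[keep] <- gen[keep] == max(gen[keep])

# Extremity columns first, then the attributes not used directly as an extremity.
result <- data.frame(
  att1 = df$att1[keep],
  gen  = gen[keep],
  att2 = df$att2[keep],
  att3 = df$att3[keep]
)

print(names(result))  # "att1" "gen"  "att2" "att3"
```

Here the single surviving row has att1 = 1 and gen = 35, with att2 = 5 and att3 = 30 carried along unchanged.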

If outputs = AUTO, then the names of the inputs in the result are left unchanged from data. If some of the inputs were not attributes of data, an error is thrown.

Author(s)

Jon Claus, <jonterainsights@gmail.com>, Tera Insights LLC

See Also

OrderBy for a similarly functioning GLA.

Examples

## One attribute test
## Returns one tuple with l_extendedprice = 900.05.
data <- Read(lineitem100g)
agg <- ExtremeTuples(data, min(l_extendedprice))
result <- as.data.frame(agg)

## Three attribute test
## Despite being secondary, l_extendedprice still achieves its global
## minimum on the tuples where l_partkey was maximized. However, l_tax
## does not, as the value in the result is 0.03 and in the overall data
## the maximum is 0.08.
data <- Read(lineitem100g)
agg <- ExtremeTuples(data, max(l_partkey), min(l_extendedprice), max(l_tax))
result <- as.data.frame(agg)

tera-insights/gtBase documentation built on May 31, 2019, 8:35 a.m.