orderby: Order By


Description

The data is sorted according to a given ordering.

Usage

OrderBy(data, ..., inputs = AUTO, outputs = AUTO)

OrderByMake(..., limit = 0, rank = NULL)

Arguments

data

an object of class "data".

inputs

which attributes of the data to include in the result in addition to those used in the ordering.

outputs

possible re-namings of the .

limit

the number of rows to include in the result. A value of 0 specifies that all rows are kept. Setting limit = k effectively functions as a top-k selection.

rank

if non-NULL, an additional column is included that functions as the row number. This is typically not needed as the row number is otherwise provided. The name of this column is the value provided as rank.

...

For OrderBy: additional arguments passed to OrderByMake.

For OrderByMake: a list of natural orderings to use, with precedence going to those seen earlier in the list. In the case that names are provided, these names are used as column names in the result. See ‘details’ for more information.
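As a rough illustration of what the rank argument adds, the following base R sketch sorts a small data frame and appends a row-number column whose name is taken from a variable. It does not use gtBase; the data frame, the variable rank_name, and the choice of 1-based numbering are all assumptions made for the example.

```r
# Base R analogue only; not gtBase code.
df <- data.frame(rf = c("N", "A", "R"), qty = c(3, 1, 2))

# Sort ascending by rf.
sorted <- df[order(df$rf), ]

# Append a row-number column named by rank_name (1-based here by assumption).
rank_name <- "rank"
sorted[[rank_name]] <- seq_len(nrow(sorted))
print(sorted)
```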

Details

The ordering schema is provided as a list of arguments, each of which is in the form fun(expr) where fun is either asc or dsc and expr is an expression.

Precedence is based on the order in which the arguments are specified. For example, in the case of asc(att1), dsc(att2), the ordering is primarily ascending with respect to att1, and rows with equal att1 are ordered by att2 in a descending manner. Any ties that remain after all ordering expressions have been applied are broken arbitrarily. This includes the case in which limit cuts through a group of tied tuples, so that only some of them appear in the result.
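The precedence rule can be pictured with a base R analogue. The sketch below is not gtBase code: it uses order() on an ordinary data frame to mimic asc(att1), dsc(att2), and head() to mimic limit = k. The data values and the variable names df, idx, k and top_k are made up for illustration.

```r
# Base R analogue of the ordering asc(att1), dsc(att2); not gtBase code.
df <- data.frame(
  att1 = c(2, 1, 2, 1),
  att2 = c(10, 20, 30, 40)
)

# order() sorts by its first key and uses later keys to break ties.
# Negating a numeric key reverses its direction.
idx <- order(df$att1, -df$att2)
sorted <- df[idx, ]

# Keeping only the first k rows mirrors limit = k (top-k selection).
k <- 3
top_k <- head(sorted, k)
print(top_k)
```

In this toy data no ties remain after both keys are applied, so the result is fully determined. When ties do remain, which tied rows survive the cut is not something to rely on.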

Each ordering expression is included in the result. If a name is provided in the argument list for a given expression, the column is given that corresponding name. Otherwise, if the expression is a single attribute then the column is given that attribute name. If not, then the column for that expression is given a constructed name that is hidden from the user and guaranteed to not conflict with other column names.
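The naming rule above can be sketched as a small helper in plain R. This is only an illustration of the three cases, not the package's implementation: column_names is a hypothetical function, the asc/dsc wrappers are omitted for brevity, and the ".gen" prefix is a stand-in for whatever hidden name the system actually constructs.

```r
# Hypothetical helper, not part of gtBase: derive result column names
# from a list of unevaluated ordering expressions.
column_names <- function(exprs) {
  given <- names(exprs)
  if (is.null(given)) given <- rep("", length(exprs))
  out <- character(length(exprs))
  for (i in seq_along(exprs)) {
    if (nzchar(given[i])) {
      out[i] <- given[i]                  # an explicit name wins
    } else if (is.symbol(exprs[[i]])) {
      out[i] <- as.character(exprs[[i]])  # a single attribute keeps its name
    } else {
      out[i] <- paste0(".gen", i)         # placeholder for a generated name
    }
  }
  out
}

# alist() keeps the expressions unevaluated.
exprs <- alist(total = att2 + att3, att1, att2 * 2)
print(column_names(exprs))
```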

Value

An object of class "data", with the attribute names as discussed above. Upon conversion to a data frame, the result has n rows if limit is 0 and min(l, n) rows otherwise, where l is limit and n is the number of rows in data.
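Read together with the description of limit (0 keeps every row, k acts as a top-k selection), the expected row count can be sketched as a one-line function. The name result_rows is hypothetical, and the sketch assumes that limit only ever caps the number of rows.

```r
# Hypothetical helper, not part of gtBase: expected number of result rows
# for a given limit l and input size n.
result_rows <- function(l, n) {
  if (l == 0) n else min(l, n)
}

print(result_rows(0, 100))   # limit = 0 keeps all rows
print(result_rows(10, 100))  # top-10 of 100 rows
print(result_rows(10, 4))    # fewer rows than the limit
```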

AUTO

In the case that inputs = AUTO, each attribute of the data that was not used expressly as an ordering attribute is included in the result. For example, if data contains attributes att1, att2, att3 and the ordering is asc(att1), dsc(att2 + att3), then the result will contain 4 columns with names att1, gen, att2, att3, where gen is a placeholder for a generated name and whose values are att2 + att3.
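The column layout in this example can be reproduced with a base R analogue. The sketch below is not gtBase code: it builds the four columns by hand for the ordering asc(att1), dsc(att2 + att3), with gen standing in for the generated name. The data values are invented.

```r
# Base R analogue of inputs = AUTO for asc(att1), dsc(att2 + att3).
df <- data.frame(
  att1 = c(3, 1, 2),
  att2 = c(1, 5, 2),
  att3 = c(4, 0, 9)
)

# The second ordering expression becomes its own column.
gen <- df$att2 + df$att3

# Ascending on att1, descending on the computed expression.
idx <- order(df$att1, -gen)

# Ordering columns first, then the attributes not used directly as an ordering.
result <- data.frame(
  att1 = df$att1,
  gen  = gen,
  att2 = df$att2,
  att3 = df$att3
)[idx, ]
print(result)
```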

Author(s)

Jon Claus, <jonterainsights@gmail.com>, Tera Insights LLC

Examples

## TPCH Query 1
data <- Read(lineitem10g)
filter <- data[l_shipdate <= .(as.Date("1998-12-01")) - 90]
agg <- GroupBy(
  filter,
  groupAtts = c(rf = l_returnflag, ls = l_linestatus),
  sum_disc_price = Sum(l_extendedprice * (1 - l_discount)),
  sum_charge = Sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)),
  avg_qty = Average(l_quantity),
  count_order = Count(1),
  sum_qty = Sum(l_quantity),
  avg_price = Average(l_extendedprice),
  sum_base_price = Sum(l_extendedprice),
  avg_disc = Average(l_discount)
)
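# Sort ascending by return flag, then descending by line status.
# The row number is added as a column named rank.
agg <- OrderBy(
  agg,
  asc(rf),
  dsc(ls),
  rank = rank
)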
result <- as.data.frame(agg)

tera-insights/gtBase documentation built on May 31, 2019, 8:35 a.m.