waypoint: Abstract Data Objects

Description Details Note Author(s) Examples

Description

Large scale data is represented abstractly in the GrokIt system. Rather than manipulating the data directly, the R interface sets up a description of the query using a waypoint system.

Details

Due to the scale of the data that the GrokIt system is intended to use, it is impossible to fully import the data into R because of memory limitations. Instead, the data is represented abstractly as an S3 object of class "data" built upon a list, with only the most relevant information kept.

The only elements that every "data" object is guaranteed to contain are labeled "schema" and "origin". The latter is used only for book-keeping and optimization and is not of much interest. The former is of much more import; it is used to ensure that the user only performs operations on attributes the data actually contains. As such, to find which attributes an object x contains, simply call x$schema to produce a character vector giving the names of the attributes.

Note

Although the above are the only two elements that every "data" object is guaranteed to have, there will always be additional elements. These are used to structure the overall query and relate it to the system; as such, the list structure grows rapidly and should be ignored by the average user.

Author(s)

Jon Claus, <jonterainsights@gmail.com>, Tera Insights, LLC

Examples

1
2
3
4
## Proper specification of inputs and outputs
data <- Read(lineitem100g)
data$schema ## produces a character vector
str(data) ## fully details the structure for illustration's sake

tera-insights/gtBase documentation built on May 31, 2019, 8:35 a.m.