library(collection)
library(knitr)
knitr::opts_chunk$set(collapse = TRUE, comment = "#>", eval = FALSE)
Collections can be used in both interactive and programmatic mode. In interactive mode, a number of heuristics and (hopefully) smart guesses ease the use of the package. In programmatic mode, no such guesses are made, which keeps the API simple and predictable.
Following the convention introduced in the lazyeval package, functions intended for interactive use have names that end with a letter, while their programmatic counterparts have an underscore appended to the original name. For example:
restore()   # is intended for interactive use
restore_()  # is intended for programmatic use
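The naming convention can be illustrated without the collection package. The sketch below uses only base R. `keep_rows()` and `keep_rows_()` are hypothetical helpers invented for this illustration and are not part of the package API. The interactive variant captures its unevaluated argument, while the underscore variant expects an explicit expression object.

```r
# Hypothetical helpers for illustration only; not part of the collection package.

# Programmatic variant: the condition arrives as an explicit expression object.
keep_rows_ <- function(data, condition) {
  rows <- eval(condition, data, parent.frame())
  data[rows, , drop = FALSE]
}

# Interactive variant: captures the unevaluated condition and delegates.
keep_rows <- function(data, condition) {
  keep_rows_(data, substitute(condition))
}

interactive_result  <- keep_rows(iris, Species == "setosa")
programmatic_result <- keep_rows_(iris, quote(Species == "setosa"))

# Both variants select the same rows.
identical(interactive_result, programmatic_result)
```

The underscore variant is easier to call from other functions, because the condition can be built and passed around as an ordinary value.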
First we create a collection and store an object in it.
C <- collection('first collection')
store(C, iris)
Let's see the summary of C and then list the objects stored there:
summary(C)
print(C)
show(C)
Finally, let's read that object back:
restore(C, name == 'iris')
restore(C, 'id')
What are the use cases for working with multiple objects? To illustrate them, we will use a handy collection generator that comes with the collection package.
We create n = 12 time series objects, each len = 96 observations long. We also provide the random seed.
ts_col <- sample_time_series(n = 12, len = 96, seed = 1)
Now we can list the collection:
TODO: should show the no tag
print(ts_col)
Let's now apply a function to each object in this collection:
do(ts_col, function (obj) { }) # TODO should return result wrapped in a pretty-printer; a clist?
What if we only want to apply this function to a subset of objects?
filter(ts_col, no < 7) %>% do(function (obj) { })
Let's pick a single time series data set and see what kind of model we can fit.
ts <- restore(ts_col, no == 1)
summary(ts)
lm(x ~ a + b + c, data = ts)
ts_col %>%
  do(function (obj) {
    lm(x ~ a + b + c, data = obj)
  }) %>%
  store(C, name = 'models') # TODO store method for clist; differs from store.default
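For comparison, the same fit-a-model-per-object pattern can be sketched with a plain list in base R. This is only an approximation under stated assumptions: `make_series()` is a hypothetical stand-in for `sample_time_series()`, the columns `x`, `a`, `b` and `c` are assumed from the model formula above, and the list names play the role of the `no` tag.

```r
set.seed(1)

# Hypothetical stand-in for one generated object: a data frame with the
# columns used in the model formula (x, a, b, c), filled with random noise.
make_series <- function(len) {
  data.frame(a = rnorm(len), b = rnorm(len), c = rnorm(len), x = rnorm(len))
}

# Twelve objects of 96 observations each; list names act as the 'no' tag.
series <- lapply(seq_len(12), function(no) make_series(96))
names(series) <- seq_len(12)

# Apply a function to every object.
models <- lapply(series, function(obj) lm(x ~ a + b + c, data = obj))

# Apply it only to the subset where no < 7.
first_six <- series[as.integer(names(series)) < 7]
subset_models <- lapply(first_six, function(obj) lm(x ~ a + b + c, data = obj))
```

A plain list keeps only one index (its names), whereas a collection is described above as allowing several tags per object.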
We have now seen a way to process multiple objects using a collection. Since there are no limits on what objects can be stored in a collection, one can look at them as parts of a table, where tags are just indices used to group that table into sub-tables.
Let's store the iris data set as three separate groups/subsets, with Species being the grouping attribute/column/tag.
tables <- collection('tables')
iris %>%
  group_by(Species) %>%
  do({
    obj <- select(., -Species)
    tag <- .$Species[1]
    store(tables, obj, name = 'iris', species = tag)
  })
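The idea of a table split into tagged sub-tables can also be seen in base R alone. The sketch below does not use the collection package. It splits iris by Species with `split()`, so the list names stand in for the `species` tag.

```r
# Drop the grouping column, then split the remaining columns by species.
measurements <- iris[, names(iris) != "Species"]
species_tables <- split(measurements, iris$Species)

# Each element is one sub-table; the list names act as the tag values.
names(species_tables)
sapply(species_tables, nrow)
```

Each sub-table here carries exactly one tag, the species name.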