PipelineData: Data with history

Description

PipelineData is a virtual class representing a dataset with an attached pipeline that describes the series of steps that produced the object. The storage of the data is up to the implementation. The methods described here apply equally to PipelineData and to any other object that carries a pipeline as a slot or attribute.

Methods

pipeline(object, ancestry = TRUE, local = TRUE): Gets the pipeline that produced the object. If ancestry is TRUE, the returned pipeline includes the protocols that produced predecessors of a different type. If local is TRUE, the returned pipeline includes the local protocols: those applied after the last protocol that output an object of a different type, so that every local protocol has this type as both its input and its output.

explore(object): Produces an interactive, exploratory visualization of this data, in the context of the last applied protocol.
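
The effect of the ancestry and local arguments of pipeline() can be pictured with a small toy model in base R. This is only a sketch, not the package's implementation: the steps table, its column names, and the select_steps() helper are hypothetical, and the sketch assumes that a protocol counts as local only when both its input and output types match the type of the final object.

```r
# Toy model only: each row is a protocol with the type it consumes and emits.
# The table and the helper below are hypothetical, not part of the package.
steps <- data.frame(
  name    = c("load", "parse", "trim", "average"),
  intype  = c("character", "character", "numeric", "numeric"),
  outtype = c("character", "numeric", "numeric", "numeric"),
  stringsAsFactors = FALSE
)

select_steps <- function(steps, type, ancestry = TRUE, local = TRUE) {
  # A step is a candidate local step if it both consumes and emits `type`.
  same_type <- steps$intype == type & steps$outtype == type
  # Local steps are the trailing run after the last step that is not same-type.
  other <- which(!same_type)
  cut <- if (length(other) > 0) max(other) else 0L
  idx <- seq_len(nrow(steps))
  keep <- (ancestry & (idx <= cut)) | (local & (idx > cut))
  steps$name[keep]
}

select_steps(steps, "numeric")                    # ancestry and local steps
select_steps(steps, "numeric", ancestry = FALSE)  # local steps only
select_steps(steps, "numeric", local = FALSE)     # ancestry steps only
```

In this model, "trim" and "average" are the local steps for a numeric result, while "load" and "parse" form its ancestry.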

Author(s)

Michael Lawrence

Examples

## Example with data that is not a PipelineData object (plain numeric vector)
setStage("average", intype = "numeric")
setProtocol("mean", fun = mean, parent = "average")
setProtocol("quantile", representation = list(probs = "numeric"),
            fun = quantile, parent = "average")
setProtocol("range", representation = list(low = "numeric", high = "numeric"), 
            fun = function(x, low = 0, high = Inf) x[x >= low & x <= high],
            parent = setStage("trim", intype = "numeric"))

d <- c(1, 2, 4)
p <- Pipeline("trim", "average")
d2 <- perform(p, d)
attr(d2, 'pipeline')
pipeline(d2)
## Not run: 
## this will give an error: d2 is a plain numeric vector and has no slot called pipeline
d2@pipeline

## End(Not run)

## Example with a class that extends PipelineData, so the pipeline is stored in a slot
setClass("ProcessNumeric", contains = c("numeric", "PipelineData"))
d <- new("ProcessNumeric", c(1, 2, 4))
d@pipeline
setStage("average", intype = "ProcessNumeric")
setProtocol("mean", fun = function(x) new("ProcessNumeric", mean(x)), parent = "average")
setProtocol("quantile", representation = list(probs = "numeric"),
            fun = function(x) new("ProcessNumeric", quantile(x)), parent = "average")
setProtocol("range", representation = list(low = "numeric", high = "numeric"), 
            fun = function(x, low = 0, high = Inf) new("ProcessNumeric",
                                          x[x >= low & x <= high]),
            parent = setStage("trim", intype = "ProcessNumeric"))

p <- Pipeline("trim", "average")
d2 <- perform(p, d)
attr(d2, 'pipeline')
pipeline(d2)
class(d2)
d2@pipeline