Retire (i.e. remove) data from the a dynaTree model

Share:

Description

Allows the removal (or “retireing” of X-y pairs from a "dynaTree"-class object to facilitate online learning; “retireed” pairs ar absorbed into the leaf prior(s)

Usage

1
2
## S3 method for class 'dynaTree'
retire(object, indices, lambda = 1, verb = 0)

Arguments

object

a "dynaTree"-class object built by dynaTree

indices

a vector of positive integers in 1:nrow(object$X) indicating which X-y pairs to “retire”; must have length(indices) <= nrow(object$X)

lambda

a scalar proportion (forgetting factor) used to downweight the previous prior summary statistics

verb

a nonzero scalar causes info about the “retireed” indices, i.e., their X-y values, to be printed to the screen as they are “retireed”

Details

Primarily for use in online learning contexts. After “retireing” the predictive distribution remains unchanged, because the sufficient statistics of the removed pairs enters the prior in the leaves of the tree of each particle. Further update.dynaTree calls (adding data) may cause changes to the posterior predictive as grow moves cannot keep the “retires”; see a forthcoming paper for more details. In many ways, retire.dynaTree is the opposite of update.dynaTree except that the loss of information upon “retireing” is not complete.

Drifting regression or classification relationships may be modeled with a forgetting factor lambda < 1

The alcX.dynaTree provides a good, and computationally efficient, heuristic for choosing which points to “retire” for regression models, and likewise link{entropyX.dynaTree} for classification models.

Note that classification models (model = "class") are not supported, and implicit intercepts (icept = "implicit") with linear models (model = "linear") are not supported at this time

Value

returns a "dynaTree"-class object with updated attributes

Note

In order to use model = "linear" with dynaTree and retirement one must also specify icept = "augmented" which automatically augments an extra column of ones onto the input X design matrix/matrices. The retire function only supports this icept case

Author(s)

Robert B. Gramacy rbgramacy@chicagobooth.edu,
Matt Taddy taddy@chicagobooth.edu, and
Christoforos Anagnostopoulos christoforos.anagnostopoulos06@imperial.ac.uk

References

Anagnostopoulos, C., Gramacy. R.B. (2013) “Information-Theoretic Data Discarding for Dynamic Trees on Data Streams.” Entropy, 15(12), 5510-5535; arXiv:1201.5568

http://bobby.gramacy.com/r_packages/dynaTree/

See Also

dynaTree, alcX.dynaTree, entropyX.dynaTree, update.dynaTree, rejuvenate.dynaTree

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
n <- 100
Xp <- runif(n,-3,3)
XX <- seq(-3,3, length=200)
Yp <- Xp + Xp^2 + rnorm(n, 0, .2)
rect <- c(-3,3)
out <- dynaTree(Xp, Yp, model="linear", icept="augmented")

## predict and plot
out <- predict(out, XX)
plot(out, main="parabola data", lwd=2)

## randomly remove half of the data points
out <- retire(out, sample(1:n, n/2, replace=FALSE))

## predict and add to plot -- shouldn't change anything
out <- predict(out, XX)
plot(out, add=TRUE, col=3)
points(out$X[,-1], out$y, col=3)

## now illustrating rejuvenation, which should result
## in a change to the predictive surface
out <- rejuvenate(out)
out <- predict(out, XX)
plot(out, add=TRUE, col=4)
legend("top", c("original", "retired", "rejuvenated"),
       col=2:4, lty=1)

## clean up
deletecloud(out)

## see demo("online") for an online learning example
## where ALC is used for retirement