dynaTrees  R Documentation 
A function to initialize and fit dynamic tree models to regression and classification data by the sequential Monte Carlo (SMC) method of particle learning (PL).
dynaTree(X, y, N = 1000, model = c("constant", "linear", "class", "prior"),
nu0s20 = c(0,0), ab = c(0.95, 2), minp = NULL, sb = NULL,
nstart = minp, icept = c("implicit", "augmented", "none"),
rprop = c("luvar", "luall", "reject"), verb = round(length(y)/10))
dynaTrees(X, y, N = 1000, R = 10, sub = length(y),
model = c("constant", "linear", "class", "prior"), nu0s20 = c(0,0),
ab=c(0.95, 2), minp = NULL, sb = NULL, nstart = minp,
icept = c("implicit", "augmented", "none"),
rprop = c("luvar", "luall", "reject"), XX = NULL, yy = NULL,
varstats = FALSE, lhs = NULL, plotit = FALSE, proj = 1,
rorder = TRUE, verb = round(sub/10), pverb=round(N/10), ...)
X 
A design matrix of real-valued predictors 
y 
A vector of length nrow(X) containing real-valued responses (for regression) or positive integer class labels (for classification) 
N 
a positive scalar integer indicating the number of particles to be used 
R 
a scalar integer indicating the number of repeated fits (each on a re-ordering of the data) performed by dynaTrees 
sub 
Optional argument allowing only a subset of the length(y) X-y pairs to be used in each repeat 
model 
indicates the type of model to be used at the leaves of the tree: "constant" and "linear" are for regression, "class" is for classification, and "prior" ignores the data and samples from the tree prior only 
nu0s20 
a two-vector indicating Inverse Gamma prior parameters for the variance in the regression leaf models 
ab 
tree prior parameters c(alpha, beta); see Details below 
minp 
a positive scalar integer describing the smallest allowable region in the treed partition; if NULL, a suitable default is chosen based on the leaf model 
sb 
an optional two-vector of positive integers restricting which columns of X are eligible for treed splits and for use as a basis in the linear leaf model 
nstart 
a positive scalar integer (>= minp) indicating the number of initial observations to process before tree-modifying proposals begin 
icept 
indicates the type of intercept term used (only applies to model = "linear"): implicit, explicit via a column of ones augmented to the design, or none at all 
XX 
a design matrix of predictive locations (where ncol(XX) == ncol(X)) 
yy 
an optional vector of “true” responses at the XX predictive locations 
varstats 
if TRUE, variable-usage summaries (varpropuse, varproptotal) are collected for each of the R repeats 
lhs 
an optional lhs argument passed to sens.dynaTree when a sensitivity analysis is desired at each of the R repeats 
plotit 
a scalar logical indicating whether the fit should be plotted after each of the R repeats; only applies to 1-d regression data 
proj 
when plotit = TRUE, the column (projection) of X to plot against 
rorder 
a scalar logical indicating whether the rows of X (and corresponding entries of y) should be randomly re-ordered for each repeat; alternatively, a matrix of pre-specified orderings may be supplied 
rprop 
indicates the scheme used to construct a grow proposal: "luvar" (the default) and "luall" use lower/upper rectangle bounds (on one randomly chosen variable, or on all variables), while "reject" simply rejects invalid proposals 
verb 
a positive scalar integer indicating how many time steps (iterations) should pass before a progress statement is printed to the console; a value of 0 suppresses printing 
pverb 
a positive scalar integer indicating how many particles should be processed for prediction before a progress statement is printed to the console; a value of 0 suppresses printing 
... 
extra arguments passed to predict.dynaTree 
The dynaTree function processes the X and y pairs serially via PL. It builds up a particle cloud which is stored as an object in C. A “pointer” to that object is the primary return value. The dynaTrees function fits several (R) different dynamic tree models on different time-orderings of the data indices and also obtains samples from the posterior predictive distribution at new XX locations. These predictions can be averaged over each repeat, or used to assess the Monte Carlo predictive error.
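For instance, because each of the R repeats contributes a column of predictive draws, a repeat-averaged prediction and a Monte Carlo error estimate reduce to simple row-wise summaries. The snippet below is a self-contained sketch using a simulated stand-in for such a matrix (a real one would come from a dynaTrees fit):

```r
## stand-in for the (num. predictive locations) x R matrix of posterior
## mean predictions that dynaTrees collects, one column per repeat
R <- 5
pmean <- matrix(rnorm(100 * R, mean = 1), ncol = R)
pred.avg <- rowMeans(pmean)       ## average over the R repeats
pred.mce <- apply(pmean, 1, sd)   ## spread across repeats estimates
                                  ## the Monte Carlo predictive error
```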
Three different leaf models are supported: two for regression and one for classification. If model == "class" then the y values must contain representatives from every class (1:max(y)). For details of these models and a complete description of their use at the leaves of the dynamic trees, see the Taddy, et al. (2011) reference, below.
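Because model == "class" requires labels coded as consecutive positive integers 1:max(y), arbitrary labels can be recoded first; the following base-R idiom (illustrative, not part of the package) accomplishes that:

```r
## recode arbitrary class labels into 1:max(y), as model = "class" expects
labels <- c("b", "a", "c", "a", "b")   ## hypothetical raw labels
y <- as.numeric(as.factor(labels))     ## integers covering every class 1:3
```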
The tree prior is specified by ab = c(alpha, beta) and minp. It was originally described by Chipman et al. (1998, 2002):

p_{\mbox{\tiny split}}(\eta, \mathcal{T}) = \alpha(1+\eta)^{-\beta}

and subsequently augmented to enforce a minimum number of points (minp) in each region.
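To see how ab = c(alpha, beta) discourages deep trees, the split probability above can be evaluated directly; a minimal sketch with the default ab = c(0.95, 2):

```r
## probability that a node at depth eta splits, under the tree prior
psplit <- function(eta, alpha = 0.95, beta = 2) alpha * (1 + eta)^(-beta)
psplit(0:3)  ## decreases with depth: 0.95, 0.2375, 0.1056, 0.0594
```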
Once a "dynaTree"-class object has been built (by dynaTree), predictions and estimates of sequential design and optimization criteria can be obtained via predict.dynaTree, a generic prediction method. These values can be used to augment the design, and the update.dynaTree function can be used to quickly update the fit with the augmenting data.
Both functions return an object of class "dynaTree", which is a list containing the following fields:
m 
the number of input (predictor) variables, ncol(X) 
T 
the number of observations processed, length(y) 
N 
the number of particles used 
X 
a copy of the design matrix 
y 
a copy of the responses 
model 
a copy of the specified leaf model 
params 
a vector containing a copy of the prior and proposal parameters specified above 
verb 
a copy of the verbosity argument 
lpred 
a vector of log posterior predictive probabilities, one for each observation, calculated sequentially as the data are processed 
icept 
a copy of the intercept argument 
time 
the total computing time used to build the particle cloud 
num 
a “pointer” to the C-side particle cloud; see the Note below 

The dynaTrees function can obtain predictive samples (via predict.dynaTree) at each of the R repeats. Therefore, the "dynaTree" object returned contains extra fields collecting these predictive samples, primarily comprising R columns of information for each of the fields returned by predict.dynaTree; see that function for more details. Likewise, when varstats = TRUE the returned object also contains vpu, vpt and parde fields whose columns contain the varpropuse and varproptotal outputs.
Likewise, dynaTrees can provide variable usage summaries if varstats = TRUE, in which case the output includes vpu and vpt fields; see varpropuse and varproptotal for more details.
The dynaTrees function does not return num since it does not leave any allocated particle clouds on the C side.
As mentioned in the Details section, above, the dynaTree function returns a pointer to a particle cloud allocated in C. This pointer is used for prediction, via predict.dynaTree, and for later updating/augmentation of data, via update.dynaTree. This information will not be “freed” unless the user specifically calls deletecloud(num) or deleteclouds(). Failing to call one of these functions (when done with the corresponding object(s)) could result in a memory leak; see their documentation for more details.
The C-side memory cannot be saved in the workspace, so it cannot persist across R sessions.
To copy a "dynaTree"-class object, use copy.dynaTree, which will also copy the C-side memory allocated to the object.
Robert B. Gramacy rbg@vt.edu,
Matt Taddy and Christoforos Anagnostopoulos
Taddy, M.A., Gramacy, R.B., and Polson, N. (2011). “Dynamic trees for learning and design”. Journal of the American Statistical Association, 106(493), pp. 109–123; arXiv:0912.1586.
Gramacy, R.B., Taddy, M.A., and S. Wild (2011). “Variable Selection and Sensitivity Analysis via Dynamic Trees with an Application to Computer Code Performance Tuning”. arXiv:1108.4739.
Carvalho, C., Johannes, M., Lopes, H., and Polson, N. (2008). “Particle Learning and Smoothing”. Discussion Paper 2008-32, Duke University Dept. of Statistical Science.
Chipman, H., George, E., & McCulloch, R. (1998). Bayesian CART model search (with discussion). Journal of the American Statistical Association, 93, 935–960.
Chipman, H., George, E., & McCulloch, R. (2002). Bayesian treed models. Machine Learning, 48, 303–324.
https://bobby.gramacy.com/r_packages/dynaTree/
predict.dynaTree
, update.dynaTree
,
plot.dynaTree
, deletecloud
,
copy.dynaTree
, getBF
,
varpropuse
, varproptotal
,
sens.dynaTree
, relevance.dynaTree
## simple parabolic data
n <- 100
Xp <- sort(runif(n, -3, 3))
Yp <- Xp + Xp^2 + rnorm(n, 0, .2)
## fit a piecewise linear model
parab.fit <- dynaTree(Xp, Yp, model="linear")
## obtain predictions at a new set of locations
## and plot
parab.fit <- predict(parab.fit, XX=seq(-3, 3, length=100))
plot(parab.fit)
## try duplicating the object
parab.fit.copy <- copy(parab.fit)
## must delete the cloud or memory may leak
deletecloud(parab.fit); parab.fit$num <- NULL
## to delete all clouds, do:
deleteclouds()
## for more examples of dynaTree see update.dynaTree
## Motorcycle accident data
if(require("MASS")) {
data(mcycle)
Xm <- mcycle[,1]
Ym <- mcycle[,2]
XXm <- seq(min(mcycle[,1]), max(mcycle[,1]), length=100)
R <- 2 ## use R >= 10 for better results
## small R is for faster CRAN checks
## fit constant model with R=2 repeats and predictions
moto.fit <- dynaTrees(Xm, Ym, XX=XXm, R=R, plotit=TRUE)
## plot the averages
plot(moto.fit, ptype="mean")
## clouds automatically deleted by dynaTrees
}
## Not run:
## 2d/3class classification data
library(plgp)
library(tgp)
xx <- seq(-2, 2, length=20)
XX <- expand.grid(xx, xx)
X <- dopt.gp(125, Xcand=XX)$XX
C <- exp2d.C(X)
## fit a classification model with the default R=10 repeats
class.fit <- dynaTrees(X, C, XX=XX, model="class")
## plot the output (no generic plotting available)
cols <- c(gray(0.85), gray(0.625), gray(0.4))
par(mfrow=c(1,2))
library(interp)
## plot R-averaged predicted class
mclass <- apply(class.fit$p, 1, which.max)
image(interp(XX[,1], XX[,2], mclass), col=cols,
xlab="x1", ylab="x2", main="repeated class mean")
points(X)
## plot R-averaged entropy
ment <- apply(class.fit$entropy, 1, mean)
image(interp(XX[,1], XX[,2], ment),
xlab="x1", ylab="x2", main="repeated entropy mean")
## End(Not run)