brt | R Documentation |
This function is intended to be used on the right-hand side of the formula argument to create_sampler or generate_data. It creates a BART term in the model's linear predictor. Using this model component requires the R package dbarts to be installed.
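Because dbarts is an optional dependency, it can help to check for it before building a model that uses brt. The following is a minimal sketch using only base R; the variable name has_dbarts is illustrative and not part of the package.

```r
# Check whether the optional dependency 'dbarts' is available
# before specifying a model component with brt().
has_dbarts <- requireNamespace("dbarts", quietly = TRUE)

if (!has_dbarts) {
  # Only inform the user; installing is left to them.
  message("Package 'dbarts' is not installed; ",
          "try install.packages(\"dbarts\")")
}
```

If has_dbarts is FALSE, any model specification that includes a brt term should be expected to fail until the package is installed.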
brt(
  formula,
  X = NULL,
  n.trees = 75L,
  name = "",
  debug = FALSE,
  keepTrees = FALSE,
  ...
)
formula |
a formula specifying the predictors to be used in the BART
model component. Variable names are looked up in the data frame
passed as |
X |
a design matrix can be specified directly, as an alternative
to the creation of one based on |
n.trees |
number of trees used in the BART ensemble. |
name |
the name of the model component. This name is used in the output of the
MCMC simulation function |
debug |
if |
keepTrees |
whether to store the tree ensemble for each Monte Carlo draw. This
is required for prediction based on new data. The default is |
... |
parameters passed to |
An object with precomputed quantities and functions for sampling from prior or conditional posterior distributions for this model component. Intended for internal use by other package functions.
H.A. Chipman, E.I. George and R.E. McCulloch (2010). BART: Bayesian additive regression trees. The Annals of Applied Statistics 4(1), 266-298.
J.H. Friedman (1991). Multivariate adaptive regression splines. The Annals of Statistics 19, 1-67.
# generate data, based on an example in Friedman (1991)
gendat <- function(n=200L, p=10L, sigma=1) {
  x <- matrix(runif(n * p), n, p)
  mu <- 10*sin(pi*x[, 1] * x[, 2]) + 20*(x[, 3] - 0.5)^2 + 10*x[, 4] + 5*x[, 5]
  y <- mu + sigma * rnorm(n)
  data.frame(x=x, mu=mu, y=y)
}
train <- gendat()
test <- gendat(n=25)
# keep trees for later prediction based on new data
sampler <- create_sampler(
  y ~ brt(~ . - y, name="bart", keepTrees=TRUE),
  sigma.mod=pr_invchisq(df=3, scale=var(train$y)),
  data = train
)
sim <- MCMCsim(sampler, n.chain=2, n.iter=700, thin=2,
  store.all=TRUE, verbose=FALSE)
(summ <- summary(sim))
plot(train$mu, summ$bart[, "Mean"]); abline(0, 1)
# NB prediction is currently slow
pred <- predict(sim, newdata=test,
  iters=sample(seq_len(n_draws(sim)), 100),
  show.progress=FALSE
)
(summpred <- summary(pred))
plot(test$mu, summpred[, "Mean"]); abline(0, 1)
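The formula ~ . - y in the example selects predictors by exclusion, so it may be worth checking which columns it actually picks up. The sketch below uses base R's terms() on a small toy data frame with the same column layout as the output of gendat; it assumes that brt resolves the dot in the same way as base R's formula machinery, which this page does not state. The names toy and predictors are illustrative.

```r
# Toy data frame with the same column layout as gendat(): x.1, x.2, x.3, mu, y
set.seed(1)
x <- matrix(runif(5 * 3), 5, 3)
toy <- data.frame(x = x, mu = rowSums(x), y = rnorm(5))

# Expand the dot against the data and drop y, as in brt(~ . - y, ...)
predictors <- attr(terms(~ . - y, data = toy), "term.labels")
print(predictors)
```

Under that assumption, the true mean mu is among the selected predictors, because only y is excluded. A formula such as ~ . - y - mu would leave only the x columns.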