This is an implementation of Bayesian Additive Regression Trees (Chipman et al. 2010) using Bayesian Model Averaging (Hernandez et al. 2018).
bartBMA(x.train, ...)
## Default S3 method:
bartBMA(
x.train,
y.train,
a = 3,
nu = 3,
sigquant = 0.9,
c = 1000,
pen = 12,
num_cp = 20,
x.test = matrix(0, 0, 0),
num_rounds = 5,
alpha = 0.95,
beta = 2,
split_rule_node = 0,
gridpoint = 0,
maxOWsize = 100,
num_splits = 5,
gridsize = 10,
zero_split = 1,
only_max_num_trees = 1,
min_num_obs_for_split = 2,
min_num_obs_after_split = 2,
exact_residuals = 1,
spike_tree = 0,
s_t_hyperprior = 1,
p_s_t = 0.5,
a_s_t = 1,
b_s_t = 3,
lambda_poisson = 10,
less_greedy = 0,
...
)
|
x.train |
Training data covariate matrix |
... |
Further arguments. |
y.train |
Training data outcome vector. |
a |
This is a parameter that influences the variance of terminal node parameter values. Default value a=3. |
nu |
This is a hyperparameter in the distribution of the variance of the error term. The inverse of the variance is distributed as Gamma(nu/2, nu*lambda/2). Default value nu=3. |
sigquant |
Calibration quantile for the inverse chi-squared prior on the variance of the error term. Default value sigquant=0.9. |
c |
This determines the size of Occam's Window. Default value c=1000. |
pen |
This is a parameter used by the Pruned Exact Linear Time Algorithm when finding changepoints. Default value pen=12. |
num_cp |
This is a number between 0 and 100 that determines the proportion of changepoints proposed by the changepoint detection algorithm to keep when growing trees. Default num_cp=20. |
x.test |
Test data covariate matrix. Default x.test=matrix(0.0,0,0). |
num_rounds |
Number of trees. (Maximum number of trees in a sum-of-tree model). Default num_rounds=5. |
alpha |
Parameter in prior probability of tree node splitting. Default alpha=0.95. |
beta |
Parameter in prior probability of tree node splitting. Default beta=2. |
split_rule_node |
Binary variable. If equal to 1, a new set of potential splitting points is found via a changepoint algorithm after each split is added to a tree. If equal to 0, the same set of potential split points is used for all splits in a tree. Default split_rule_node=0. |
gridpoint |
Binary variable. If equal to 1, a grid search changepoint detection algorithm is used. If equal to 0, the Pruned Exact Linear Time (PELT) changepoint detection algorithm is used (Killick et al. 2012). Default gridpoint=0. |
maxOWsize |
Maximum number of models to keep in Occam's window. Default maxOWsize=100. |
num_splits |
Maximum number of splits in a tree. Default num_splits=5. |
gridsize |
This integer determines the size of the grid across which to search when finding changepoints for constructing trees. It is only used if gridpoint=1. Default gridsize=10. |
zero_split |
Binary variable. If equal to 1, zero-split trees can be included in a sum-of-trees model. If equal to 0, only trees with at least one split can be included in a sum-of-trees model. Default zero_split=1. |
only_max_num_trees |
Binary variable. If equal to 1, only sum-of-trees models containing the maximum number of trees, num_rounds, are selected. If equal to 0, sum-of-trees models containing fewer than num_rounds trees can also be selected. Default only_max_num_trees=1. |
min_num_obs_for_split |
This integer determines the minimum number of observations in a (parent) tree node for the algorithm to consider potential splits of the node. Default min_num_obs_for_split=2. |
min_num_obs_after_split |
This integer determines the minimum number of observations that each child node resulting from a split must contain. If the left or right child node has fewer than this number of observations, the split cannot occur. Default min_num_obs_after_split=2. |
exact_residuals |
Binary variable. If equal to 1, trees are added to sum-of-tree models within each round of the algorithm by detecting changepoints in the exact residuals. If equal to 0, changepoints are detected in residuals that are constructed from approximate predictions. Default exact_residuals=1. |
spike_tree |
If equal to 1, the Spike-and-Tree prior is used; otherwise the standard BART prior is used. Under the Spike-and-Tree prior, the number of splitting variables has a beta-binomial prior, the number of terminal nodes has a truncated Poisson prior, and a uniform prior is placed on the set of valid constructions of trees given the splitting variables and number of terminal nodes. Default spike_tree=0. |
s_t_hyperprior |
If equal to 1 and spike_tree is equal to 1, a beta distribution hyperprior is placed on the variable inclusion probabilities for the Spike-and-Tree prior. The hyperprior parameters are a_s_t and b_s_t. |
p_s_t |
If spike_tree=1 and s_t_hyperprior=0, then p_s_t is the prior variable inclusion probability. |
a_s_t |
If spike_tree=1 and s_t_hyperprior=1, then a_s_t is a parameter of a beta distribution hyperprior. |
b_s_t |
If spike_tree=1 and s_t_hyperprior=1, then b_s_t is a parameter of a beta distribution hyperprior. |
lambda_poisson |
This is a parameter for the Spike-and-Tree prior. It is the parameter for the (truncated and conditional on the number of splitting variables) Poisson prior on the number of terminal nodes. |
less_greedy |
If equal to one, then a less greedy model search algorithm is used. |
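The role of the alpha and beta arguments can be illustrated with a short sketch. It assumes the standard BART tree prior of Chipman et al. (2010), under which a node at depth d (root at depth 0) is split with prior probability alpha * (1 + d)^(-beta). The text above only states that both are parameters of the splitting prior, so the exact form used internally by bartBMA is an assumption here, and split_prob is a hypothetical helper name.

```r
# Hypothetical helper (not part of bartBMA): prior probability that a node
# at depth d is split, ASSUMING the standard BART form alpha * (1 + d)^(-beta).
split_prob <- function(d, alpha = 0.95, beta = 2) {
  alpha * (1 + d)^(-beta)
}

# Prior split probabilities at the first few depths under the defaults
depths <- 0:3
probs <- split_prob(depths)
print(data.frame(depth = depths, prior_split_prob = round(probs, 4)))
```

Under this assumed form, a larger beta makes the split probability fall off faster with depth, so the prior favours shallower trees.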
The following objects are returned by bartBMA:
fitted.values |
The vector of predictions of the outcome for all training observations. |
sumoftrees |
This is a list of lists of matrices. The outer list corresponds to a list of sum-of-tree models, and each element of the outer list is a list of matrices describing the structure of the trees within a sum-of-tree model. See details. |
obs_to_termNodesMatrix |
This is a list of lists of matrices. The outer list corresponds to a list of sum-of-tree models, and each element of the outer list is a list of matrices describing the node to which each observation is allocated at every depth of each tree within a sum-of-tree model. See details. |
bic |
This is a vector of BICs for each sum-of-tree model. |
test.preds |
A vector of test data predictions. This output is given only if test data is supplied in the input. |
sum_residuals |
CURRENTLY INCORRECT OUTPUT. A list (over sum-of-tree models) of lists (over single trees in a model) of vectors of partial residuals. If the maximum number of trees in a model is one, the output is instead a list (over single tree models) of vectors of partial residuals, which are all equal to the outcome vector. |
numvars |
This is the total number of variables in the input training data matrix. |
call |
The matched function call, as returned by match.call, in which all specified arguments are given by their full names. |
y_minmax |
Range of the input training data outcome vector. |
response |
Input training data outcome vector. |
nrowTrain |
Number of observations in the input training data. |
sigma |
sd(y.train)/(max(y.train)-min(y.train)) |
a |
input parameter |
nu |
input parameter |
lambda |
parameter determined by the inputs sigma, sigquant, and nu |
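The sigma and lambda outputs can be approximated by hand, as sketched below. The sigma formula is the one stated above. The lambda formula is an assumption based on the usual BART calibration, in which lambda is chosen so that the prior Gamma(nu/2, nu*lambda/2) on the inverse variance places probability sigquant on the error standard deviation being below sigma. The data and variable names are illustrative only.

```r
set.seed(1)
y.train <- rnorm(50, mean = 2, sd = 3)  # illustrative outcome vector
nu <- 3
sigquant <- 0.9

# sigma as documented: sample sd divided by the range of the outcome
sigma <- sd(y.train) / (max(y.train) - min(y.train))

# ASSUMED calibration: nu * lambda / variance ~ chi-squared(nu), with lambda
# chosen so that P(variance < sigma^2) = sigquant under the prior
lambda <- sigma^2 * qchisq(1 - sigquant, df = nu) / nu

# Check: prior probability that the error variance is below sigma^2
prior_prob <- pchisq(nu * lambda / sigma^2, df = nu, lower.tail = FALSE)
print(c(sigma = sigma, lambda = lambda, prior_prob = prior_prob))
```

If the assumption holds, prior_prob equals sigquant, which is how the quantile argument ties the prior on the error variance to the scale of the data.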
N <- 100
p <- 100
set.seed(100)
library(bartBMA)
# Simulated training data: only the first five covariates affect the outcome
epsilon <- rnorm(N)
xcov <- matrix(runif(N * p), nrow = N)
y <- sin(pi * xcov[, 1] * xcov[, 2]) + 20 * (xcov[, 3] - 0.5)^2 + 10 * xcov[, 4] +
  5 * xcov[, 5] + epsilon
# Simulated test data from the same model
epsilontest <- rnorm(N)
xcovtest <- matrix(runif(N * p), nrow = N)
ytest <- sin(pi * xcovtest[, 1] * xcovtest[, 2]) + 20 * (xcovtest[, 3] - 0.5)^2 +
  10 * xcovtest[, 4] + 5 * xcovtest[, 5] + epsilontest
# Fit the model on the training data and predict for the test covariates
bart_bma_example <- bartBMA(x.train = xcov, y.train = y, x.test = xcovtest, zero_split = 1,
                            only_max_num_trees = 1, split_rule_node = 0)
|
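A natural follow-up to the example is to measure out-of-sample accuracy. The sketch below defines a hypothetical rmse helper (not part of bartBMA) and checks it on made-up numbers. With the objects created in the example, the analogous call would compare the test.preds element of the fitted object with ytest.

```r
# Hypothetical helper, not part of bartBMA: root mean squared error
rmse <- function(pred, obs) {
  stopifnot(length(pred) == length(obs))
  sqrt(mean((pred - obs)^2))
}

# Toy check on made-up numbers: a single error of 2 over four points
obs <- c(1, 2, 3, 4)
pred <- c(1, 2, 3, 6)
print(rmse(pred, obs))

# With the example objects this would be (requires bartBMA and the fit above):
# rmse(bart_bma_example$test.preds, ytest)
```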