tbart1

R Documentation

Type I Tobit Bayesian Additive Regression Trees implemented using MCMC

Description

Type I Tobit Bayesian Additive Regression Trees implemented using MCMC

Usage

tbart1(
  x.train,
  x.test,
  y,
  n.iter = 1000,
  n.burnin = 100,
  below_cens = 0,
  above_cens = Inf,
  n.trees = 50L,
  n.burn = 0L,
  n.samples = 1L,
  n.thin = 1L,
  n.chains = 1,
  n.threads = 1L,
  printEvery = 100L,
  printCutoffs = 0L,
  rngKind = "default",
  rngNormalKind = "default",
  rngSeed = NA_integer_,
  updateState = TRUE,
  tree_power = 2,
  tree_base = 0.95,
  node.prior = dbarts:::normal,
  resid.prior = dbarts:::chisq,
  proposal.probs = c(birth_death = 0.5, swap = 0.1, change = 0.4, birth = 0.5),
  sigmadbarts = NA_real_,
  print.opt = 100,
  fast = TRUE,
  censsigprior = FALSE,
  lambda0 = NA,
  sigest = NA,
  nu0 = 3,
  sigquant = 0.9,
  nolinregforlambda = FALSE,
  sparse = FALSE,
  alpha_a_y = 0.5,
  alpha_b_y = 1,
  alpha_split_prior = TRUE
)

Arguments

`x.train`	The training covariate data for all training observations. Number of rows equal to the number of observations. Number of columns equal to the number of covariates.
`x.test`	The test covariate data for all test observations. Number of rows equal to the number of test observations. Number of columns equal to the number of covariates.
`y`	The training data vector of outcomes. A continuous, censored outcome variable.
`n.iter`	Number of iterations excluding burnin.
`n.burnin`	Number of burnin iterations.
`below_cens`	Number at or below which observations are censored.
`above_cens`	Number at or above which observations are censored.
`n.trees`	(dbarts control option) A positive integer giving the number of trees used in the sum-of-trees formulation.
`n.chains`	(dbarts control option) A positive integer detailing the number of independent chains for the dbarts sampler to use. More than one chain is unlikely to improve speed because only one sample is drawn in each call to dbarts.
`n.threads`	(dbarts control option) A positive integer controlling how many threads will be used for various internal calculations, as well as the number of chains. Internal calculations are highly optimized so that single-threaded performance tends to be superior unless the number of observations is very large (>10k), so that it is often not necessary to have the number of threads exceed the number of chains.
`printEvery`	(dbarts control option) If verbose is TRUE, every printEvery potential samples (after thinning) will issue a verbal statement. Must be a positive integer.
`printCutoffs`	(dbarts control option) A non-negative integer specifying how many of the decision rules for a variable are printed in verbose mode.
`rngKind`	(dbarts control option) Random number generator kind, as used in set.seed. For type "default", the built-in generator will be used if possible. Otherwise, will attempt to match the built-in generator’s type. Success depends on the number of threads.
`rngNormalKind`	(dbarts control option) Random number generator normal kind, as used in set.seed. For type "default", the built-in generator will be used if possible. Otherwise, will attempt to match the built-in generator’s type. Success depends on the number of threads and the rngKind.
`rngSeed`	(dbarts control option) Random number generator seed, as used in set.seed. If the sampler is running single-threaded or has one chain, the behavior will be as any other sequential algorithm. If the sampler is multithreaded, the seed will be used to create an additional pRNG object, which in turn will be used to sequentially seed the thread-specific pRNGs. If equal to NA, the clock will be used to seed pRNGs when applicable.
`updateState`	(dbarts control option) Logical setting the default behavior for many sampler methods with regards to the immediate updating of the cached state of the object. A current, cached state is only useful when saving/loading the sampler.
`tree_power`	Tree prior parameter for outcome model.
`tree_base`	Tree prior parameter for outcome model.
`node.prior`	(dbarts option) An expression of the form dbarts:::normal or dbarts:::normal(k) that sets the prior used on the averages within nodes.
`resid.prior`	(dbarts option) An expression of the form dbarts:::chisq or dbarts:::chisq(df,quant) that sets the prior used on the residual/error variance.
`proposal.probs`	(dbarts option) Named numeric vector or NULL, optionally specifying the proposal rules and their probabilities. Elements should be "birth_death", "change", and "swap" to control tree change proposals, and "birth" to give the relative frequency of birth/death in the "birth_death" step.
`sigmadbarts`	(dbarts option) A positive numeric estimate of the residual standard deviation. If NA, a linear model is used with all of the predictors to obtain one.
`print.opt`	Print progress once every print.opt Gibbs samples.
`fast`	If equal to TRUE, then implements faster truncated normal draws and approximates normal pdf.
`sparse`	If equal to TRUE, use the Linero Dirichlet prior on splitting probabilities.
`alpha_a_y`	Linero alpha prior parameter for outcome equation splitting probabilities.
`alpha_b_y`	Linero alpha prior parameter for outcome equation splitting probabilities.
`alpha_split_prior`	If TRUE, set a hyperprior for the Linero alpha parameter.
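
To make the meaning of `below_cens` and `above_cens` concrete, the following base R sketch simulates a latent outcome and censors it at the two limits. It does not call `tbart1`; the covariate `x`, the latent outcome `ystar`, and the coefficients are made-up values used only for illustration.

```r
# Hypothetical illustration in base R; this does not call tbart1.
set.seed(1)

n <- 200
x <- runif(n)                    # a single made-up covariate
ystar <- 2 + 3 * x + rnorm(n)    # latent (uncensored) outcome

below_cens <- 3                  # values at or below this limit are censored
above_cens <- Inf                # Inf means no censoring from above

# Observed outcome: latent values are clamped to the censoring limits.
y <- pmin(pmax(ystar, below_cens), above_cens)

# Indicators matching the "at or below" / "at or above" wording above.
cens_below <- y <= below_cens
cens_above <- y >= above_cens

table(cens_below)
```

A vector like `y` (with its covariates) is the kind of input the `y` argument describes: continuous, but piled up at a censoring limit.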

Value

The following objects are returned:

`Z.matcens`	Matrix of draws of latent (censored) outcomes for censored observations. Number of rows equals number of censored training observations. Number of columns equals n.iter. Rows are ordered in order of censored observations in the training data.
`Z.matcensbelow`	Matrix of draws of latent (censored) outcomes for observations censored from below. Number of rows equals number of training observations censored from below. Number of columns equals n.iter. Rows are ordered in order of censored observations in the training data.
`Z.matcensabove`	Matrix of draws of latent (censored) outcomes for observations censored from above. Number of rows equals number of training observations censored from above. Number of columns equals n.iter. Rows are ordered in order of censored observations in the training data.
`mu`	Matrix of draws of the sum of terminal nodes, i.e. f(x_i), for all training observations. Number of rows equals number of training observations. Number of columns equals n.iter.
`mucens`	Matrix of draws of the sum of terminal nodes, i.e. f(x_i), for all censored training observations. Number of rows equals number of censored training observations. Number of columns equals n.iter.
`muuncens`	Matrix of draws of the sum of terminal nodes, i.e. f(x_i), for all uncensored training observations. Number of rows equals number of uncensored training observations. Number of columns equals n.iter.
`mucensbelow`	Matrix of draws of the sum of terminal nodes, i.e. f(x_i), for all training observations censored from below. Number of rows equals number of training observations censored from below. Number of columns equals n.iter.
`mucensabove`	Matrix of draws of the sum of terminal nodes, i.e. f(x_i), for all training observations censored from above. Number of rows equals number of training observations censored from above. Number of columns equals n.iter.
`ystar`	Matrix of training sample draws of the outcome assuming uncensored (can take values below below_cens and above above_cens). Number of rows equals number of training observations. Number of columns equals n.iter.
`ystarcens`	Matrix of censored training sample draws of the outcome assuming uncensored (can take values below below_cens and above above_cens). Number of rows equals number of censored training observations. Number of columns equals n.iter.
`ystaruncens`	Matrix of uncensored training sample draws of the outcome assuming uncensored (can take values below below_cens and above above_cens). Number of rows equals number of uncensored training observations. Number of columns equals n.iter.
`ystarcensbelow`	Matrix of censored from below training sample draws of the outcome assuming uncensored (can take values below below_cens and above above_cens). Number of rows equals number of training observations censored from below. Number of columns equals n.iter.
`ystarcensabove`	Matrix of censored from above training sample draws of the outcome assuming uncensored (can take values below below_cens and above above_cens). Number of rows equals number of training observations censored from above. Number of columns equals n.iter.
`test.mu`	Matrix of draws of the sum of terminal nodes, i.e. f(x_i), for all test observations. Number of rows equals number of test observations. Number of columns equals n.iter.
`test.y_nocensoring`	Matrix of test sample draws of the outcome assuming uncensored. Can take values below below_cens and above above_cens. Number of rows equals number of test observations. Number of columns equals n.iter.
`test.y_withcensoring`	Matrix of test sample draws of the outcome assuming censored. Cannot take values below below_cens or above above_cens. Number of rows equals number of test observations. Number of columns equals n.iter.
`test.probcensbelow`	Matrix of draws of probabilities of test sample observations being censored from below. Number of rows equals number of test observations. Number of columns equals n.iter.
`test.probcensabove`	Matrix of draws of probabilities of test sample observations being censored from above. Number of rows equals number of test observations. Number of columns equals n.iter.
`sigma`	Vector of draws of the standard deviation of the error term. Number of elements equals n.iter.
`alpha_s_y_store`	For Dirichlet prior on splitting probabilities in outcome equation, vector of alpha hyperparameter draws for each iteration.
`var_count_y_store`	Matrix of counts of splits on each variable in the outcome equation. The number of rows is the number of potential splitting variables. The number of columns is the number of post-burn-in iterations.
`s_prob_y_store`	Splitting probabilities for the outcome equation. The number of rows is the number of potential splitting variables. The number of columns is the number of post-burn-in iterations.
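
The relationship between draws of f(x_i), draws of the error standard deviation, and the censoring probabilities can be sketched in base R under the standard Type I Tobit assumption of normal errors. This is an illustration only: `mu_draws` and `sigma_draws` are simulated stand-ins shaped like `test.mu` and `sigma`, and whether `tbart1` computes `test.probcensbelow`, `test.probcensabove` and `test.y_withcensoring` in exactly this way is an assumption, not something stated on this page.

```r
# Simulated stand-ins; this does not call tbart1.
set.seed(2)

n_test <- 5
n_iter <- 400
below_cens <- -Inf
above_cens <- 800

# Rows are test observations, columns are MCMC iterations (like test.mu).
mu_draws <- matrix(rnorm(n_test * n_iter, mean = 650, sd = 20),
                   nrow = n_test, ncol = n_iter)
# One error standard deviation per iteration (like sigma).
sigma_draws <- runif(n_iter, min = 60, max = 70)

# Standardise the limits: column j is divided by sigma_draws[j].
z_above <- sweep(above_cens - mu_draws, 2, sigma_draws, "/")
z_below <- sweep(below_cens - mu_draws, 2, sigma_draws, "/")

# Under normal errors: P(y* >= above_cens) and P(y* <= below_cens).
prob_above <- pnorm(z_above, lower.tail = FALSE)
prob_below <- pnorm(z_below)

# Uncensored predictive draws, then the same draws clamped to the limits.
eps <- sweep(matrix(rnorm(n_test * n_iter), n_test, n_iter),
             2, sigma_draws, "*")
y_nocens <- mu_draws + eps
y_withcens <- pmin(pmax(y_nocens, below_cens), above_cens)

# Posterior mean probability of censoring from above, per test observation.
rowMeans(prob_above)
```

Averaging across columns, as in the last line, turns a matrix of per-iteration draws into one posterior summary per observation.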

Examples


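# Example taken from https://stats.idre.ucla.edu/r/dae/tobit-models/

library(TobitBART)

dat <- read.csv("https://stats.idre.ucla.edu/stat/data/tobit.csv")

# Random split of the 200 rows: 190 for training, 10 for testing
train_inds <- sample(1:200, 190)
test_inds <- (1:200)[-train_inds]

# Outcome: apt, treated below as censored from above at 800
ytrain <- dat$apt[train_inds]
ytest <- dat$apt[test_inds]

# Covariates: read and math
xtrain <- cbind(dat$read, dat$math)[train_inds, ]
xtest <- cbind(dat$read, dat$math)[test_inds, ]

# No censoring from below (-Inf); 400 draws kept after 100 burn-in iterations
tobart_res <- tbart1(xtrain, xtest, ytrain,
                     below_cens = -Inf,
                     above_cens = 800,
                     n.iter = 400,
                     n.burnin = 100)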
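
Since the returned matrices have one row per observation and one column per iteration, a small helper can reduce them to posterior means and intervals. The sketch below is an assumption-laden illustration: `summarise_draws` is a hypothetical helper, not part of the package, and it is demonstrated on a simulated matrix `fake_draws` rather than on real output.

```r
# Hypothetical helper (not part of TobitBART): posterior mean and an
# equal-tailed interval for each row of a draws matrix.
summarise_draws <- function(draws, probs = c(0.025, 0.975)) {
  stopifnot(is.matrix(draws), length(probs) == 2)
  ci <- apply(draws, 1, quantile, probs = probs)  # 2 x nrow(draws) matrix
  data.frame(mean = rowMeans(draws),
             lower = ci[1, ],
             upper = ci[2, ])
}

# Simulated stand-in with 10 observations and 400 iterations.
set.seed(3)
fake_draws <- matrix(rnorm(10 * 400, mean = 600, sd = 30), nrow = 10)

res <- summarise_draws(fake_draws)
head(res)

# With a fitted object from the example, the same call would be, e.g.:
# summarise_draws(tobart_res$test.mu)
```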

EoghanONeill/TobitBART documentation built on Feb. 6, 2025, 6:52 a.m.