tbart1: Type I Tobit Bayesian Additive Regression Trees implemented...

tbart1 R Documentation

Type I Tobit Bayesian Additive Regression Trees implemented using MCMC

Description

Type I Tobit Bayesian Additive Regression Trees implemented using MCMC
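
For orientation, the model can be written as a standard Type I Tobit specification with a sum-of-trees mean. This is a sketch inferred from the argument and return-value descriptions on this page, not a statement taken from the package source. Gaussian errors are assumed, and the symbols y_i^*, c_l, and c_u are notation introduced here for the latent outcome, below_cens, and above_cens:

```latex
y_i^{*} = f(x_i) + \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2),
\qquad
y_i =
\begin{cases}
c_l     & \text{if } y_i^{*} \le c_l, \\
y_i^{*} & \text{if } c_l < y_i^{*} < c_u, \\
c_u     & \text{if } y_i^{*} \ge c_u.
\end{cases}
```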

Usage

tbart1(
  x.train,
  x.test,
  y,
  n.iter = 1000,
  n.burnin = 100,
  below_cens = 0,
  above_cens = Inf,
  n.trees = 50L,
  n.burn = 0L,
  n.samples = 1L,
  n.thin = 1L,
  n.chains = 1,
  n.threads = 1L,
  printEvery = 100L,
  printCutoffs = 0L,
  rngKind = "default",
  rngNormalKind = "default",
  rngSeed = NA_integer_,
  updateState = TRUE,
  tree_power = 2,
  tree_base = 0.95,
  node.prior = dbarts:::normal,
  resid.prior = dbarts:::chisq,
  proposal.probs = c(birth_death = 0.5, swap = 0.1, change = 0.4, birth = 0.5),
  sigmadbarts = NA_real_,
  print.opt = 100,
  fast = TRUE,
  censsigprior = FALSE,
  lambda0 = NA,
  sigest = NA,
  nu0 = 3,
  sigquant = 0.9,
  nolinregforlambda = FALSE,
  sparse = FALSE,
  alpha_a_y = 0.5,
  alpha_b_y = 1,
  alpha_split_prior = TRUE
)

Arguments

x.train

The covariate data for the training observations. The number of rows equals the number of training observations and the number of columns equals the number of covariates.

x.test

The covariate data for the test observations. The number of rows equals the number of test observations and the number of columns equals the number of covariates.

y

The vector of training outcomes: a continuous, censored outcome variable.

n.iter

Number of iterations, excluding burn-in.

n.burnin

Number of burn-in iterations.

below_cens

Number at or below which observations are censored.

above_cens

Number at or above which observations are censored.
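
The two thresholds partition the outcome vector into observations censored from below, observations censored from above, and uncensored observations. A minimal base R sketch of that partition follows, using a made-up outcome vector and the "at or below" and "at or above" conventions stated above. The names cens_below, cens_above, and uncens are illustrative and are not package internals.

```r
# Made-up outcome vector with values piled up at both thresholds
y <- c(0, 1.3, 0, 2.7, 5, 4.1, 5)
below_cens <- 0
above_cens <- 5

# "At or below" and "at or above" the thresholds count as censored
cens_below <- y <= below_cens
cens_above <- y >= above_cens
uncens <- !(cens_below | cens_above)

# Number of observations in each group
counts <- c(below = sum(cens_below),
            above = sum(cens_above),
            uncensored = sum(uncens))
print(counts)
```

With the defaults below_cens = 0 and above_cens = Inf, only the lower threshold is active, because no finite outcome is at or above Inf.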

n.trees

(dbarts control option) A positive integer giving the number of trees used in the sum-of-trees formulation.

n.chains

(dbarts control option) A positive integer giving the number of independent chains for the dbarts sampler to use. More than one chain is unlikely to improve speed because only one sample is drawn for each call to dbarts.

n.threads

(dbarts control option) A positive integer controlling how many threads will be used for various internal calculations, as well as the number of chains. Internal calculations are highly optimized so that single-threaded performance tends to be superior unless the number of observations is very large (>10k), so that it is often not necessary to have the number of threads exceed the number of chains.

printEvery

(dbarts control option) If verbose is TRUE, a progress message is printed every printEvery potential samples (after thinning). Must be a positive integer.

printCutoffs

(dbarts control option) A non-negative integer specifying how many of the decision rules for a variable are printed in verbose mode.

rngKind

(dbarts control option) Random number generator kind, as used in set.seed. For type "default", the built-in generator will be used if possible. Otherwise, will attempt to match the built-in generator’s type. Success depends on the number of threads.

rngNormalKind

(dbarts control option) Random number generator normal kind, as used in set.seed. For type "default", the built-in generator will be used if possible. Otherwise, will attempt to match the built-in generator’s type. Success depends on the number of threads and on rngKind.

rngSeed

(dbarts control option) Random number generator seed, as used in set.seed. If the sampler is running single-threaded or has one chain, the behavior will be the same as for any other sequential algorithm. If the sampler is multithreaded, the seed will be used to create an additional pRNG object, which in turn will be used to sequentially seed the thread-specific pRNGs. If equal to NA, the clock will be used to seed pRNGs when applicable.

updateState

(dbarts control option) Logical setting the default behavior for many sampler methods with regards to the immediate updating of the cached state of the object. A current, cached state is only useful when saving/loading the sampler.

tree_power

Power parameter of the tree prior for the outcome model.

tree_base

Base parameter of the tree prior for the outcome model.
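
Assuming tree_base and tree_power play the same role as the base and power parameters in the standard BART tree prior of Chipman, George, and McCulloch, in which a node at depth d splits with probability base * (1 + d)^(-power), the defaults imply the split probabilities computed below. This is a sketch under that assumption; the helper split_prob is hypothetical and not part of the package.

```r
# Hypothetical helper: prior probability that a node at a given depth splits,
# under the standard BART prior base * (1 + depth)^(-power)
split_prob <- function(depth, tree_base = 0.95, tree_power = 2) {
  tree_base * (1 + depth)^(-tree_power)
}

# Split probabilities for the root (depth 0) and the next three levels
probs <- split_prob(0:3)
print(round(probs, 4))
```

Under this prior, larger values of tree_power make deep trees less likely, while tree_base sets the probability that the root node splits.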

node.prior

(dbarts option) An expression of the form dbarts:::normal or dbarts:::normal(k) that sets the prior used on the averages within nodes.

resid.prior

(dbarts option) An expression of the form dbarts:::chisq or dbarts:::chisq(df,quant) that sets the prior used on the residual/error variance.

proposal.probs

(dbarts option) Named numeric vector or NULL, optionally specifying the proposal rules and their probabilities. Elements should be "birth_death", "change", and "swap" to control tree change proposals, and "birth" to give the relative frequency of birth/death in the "birth_death" step.

sigmadbarts

(dbarts option) A positive numeric estimate of the residual standard deviation. If NA, a linear model is used with all of the predictors to obtain one.

print.opt

Print progress every print.opt Gibbs samples.

fast

If equal to TRUE, uses faster truncated normal draws and an approximation to the normal pdf.

sparse

If equal to TRUE, use the Linero Dirichlet prior on splitting probabilities.

alpha_a_y

Linero alpha prior parameter for outcome equation splitting probabilities.

alpha_b_y

Linero alpha prior parameter for outcome equation splitting probabilities.

alpha_split_prior

If TRUE, set a hyperprior for the Linero alpha parameter.

Value

The following objects are returned:

Z.matcens

Matrix of draws of latent (censored) outcomes for censored observations. Number of rows equals number of censored training observations. Number of columns equals n.iter. Rows follow the order of the censored observations in the training data.

Z.matcensbelow

Matrix of draws of latent (censored) outcomes for observations censored from below. Number of rows equals number of training observations censored from below. Number of columns equals n.iter. Rows follow the order of the censored observations in the training data.

Z.matcensabove

Matrix of draws of latent (censored) outcomes for observations censored from above. Number of rows equals number of training observations censored from above. Number of columns equals n.iter. Rows follow the order of the censored observations in the training data.
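
In a Tobit data-augmentation sampler, latent outcomes for censored observations are typically drawn from a normal distribution truncated to the censored region. The sketch below shows one simple way to make such a draw in base R by inverting the normal CDF. It is illustrative only: the function rtruncnorm_sketch is hypothetical, it is not the sampler used by the package (see the fast argument), and it is not numerically robust far in the tails.

```r
# Hypothetical helper: draw n values from N(mean, sd^2) truncated to
# [lower, upper] by sampling a uniform on the CDF scale and inverting.
rtruncnorm_sketch <- function(n, mean, sd, lower = -Inf, upper = Inf) {
  p_lo <- pnorm(lower, mean, sd)
  p_hi <- pnorm(upper, mean, sd)
  u <- runif(n, min = p_lo, max = p_hi)
  qnorm(u, mean, sd)
}

# Example: an observation censored from below at 0, with current fit
# f(x_i) = 1 and sigma = 2, so the latent outcome must lie at or below 0.
set.seed(1)
z <- rtruncnorm_sketch(1000, mean = 1, sd = 2, upper = 0)
print(range(z))
```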

mu

Matrix of draws of the sum of terminal nodes, i.e. f(x_i), for all training observations. Number of rows equals number of training observations. Number of columns equals n.iter.

mucens

Matrix of draws of the sum of terminal nodes, i.e. f(x_i), for all censored training observations. Number of rows equals number of censored training observations. Number of columns equals n.iter.

muuncens

Matrix of draws of the sum of terminal nodes, i.e. f(x_i), for all uncensored training observations. Number of rows equals number of uncensored training observations. Number of columns equals n.iter.

mucensbelow

Matrix of draws of the sum of terminal nodes, i.e. f(x_i), for all training observations censored from below. Number of rows equals number of training observations censored from below. Number of columns equals n.iter.

mucensabove

Matrix of draws of the sum of terminal nodes, i.e. f(x_i), for all training observations censored from above. Number of rows equals number of training observations censored from above. Number of columns equals n.iter.

ystar

Matrix of draws of the outcome for the training sample, assuming no censoring (values can fall below below_cens or above above_cens). Number of rows equals number of training observations. Number of columns equals n.iter.

ystarcens

Matrix of draws of the outcome for the censored training observations, assuming no censoring (values can fall below below_cens or above above_cens). Number of rows equals number of censored training observations. Number of columns equals n.iter.

ystaruncens

Matrix of draws of the outcome for the uncensored training observations, assuming no censoring (values can fall below below_cens or above above_cens). Number of rows equals number of uncensored training observations. Number of columns equals n.iter.

ystarcensbelow

Matrix of draws of the outcome for the training observations censored from below, assuming no censoring (values can fall below below_cens or above above_cens). Number of rows equals number of training observations censored from below. Number of columns equals n.iter.

ystarcensabove

Matrix of draws of the outcome for the training observations censored from above, assuming no censoring (values can fall below below_cens or above above_cens). Number of rows equals number of training observations censored from above. Number of columns equals n.iter.

test.mu

Matrix of draws of the sum of terminal nodes, i.e. f(x_i), for all test observations. Number of rows equals number of test observations. Number of columns equals n.iter.
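
Each row of test.mu holds the posterior draws of f(x) for one test observation, so row-wise summaries give point predictions and credible intervals. A minimal sketch follows, using a simulated stand-in matrix rather than actual tbart1 output. The names test_mu, post_mean, and post_ci are illustrative.

```r
# Stand-in for a test.mu matrix: 5 test observations (rows), 400 draws (columns)
set.seed(1)
test_mu <- matrix(rnorm(5 * 400, mean = 600, sd = 20), nrow = 5, ncol = 400)

# Posterior mean of f(x) for each test observation
post_mean <- rowMeans(test_mu)

# Equal-tailed 95% credible interval for each test observation
post_ci <- t(apply(test_mu, 1, quantile, probs = c(0.025, 0.975)))

print(cbind(mean = post_mean, post_ci))
```

The same row-wise summaries apply to the other draw matrices returned by the function, since they all store one observation per row and one iteration per column.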

test.y_nocensoring

Matrix of draws of the outcome for the test sample, assuming no censoring. Values can fall below below_cens or above above_cens. Number of rows equals number of test observations. Number of columns equals n.iter.

test.y_withcensoring

Matrix of draws of the outcome for the test sample, with censoring applied. Values cannot fall below below_cens or above above_cens. Number of rows equals number of test observations. Number of columns equals n.iter.
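
One way to read the difference between test.y_nocensoring and test.y_withcensoring is that the second is the first with the censoring thresholds applied. Assuming that relationship (it is inferred from the descriptions above and has not been checked against the package source), it can be sketched in base R with a made-up matrix of uncensored draws:

```r
below_cens <- 0
above_cens <- 5

# Made-up uncensored draws: 2 test observations (rows), 3 iterations (columns)
y_nocens <- matrix(c(-1.2, 0.4, 3.1, 6.8, 5.0, -0.1), nrow = 2)

# Clamp each draw to the interval [below_cens, above_cens]
y_withcens <- pmin(pmax(y_nocens, below_cens), above_cens)
print(y_withcens)
```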

test.probcensbelow

Matrix of draws of probabilities of test sample observations being censored from below. Number of rows equals number of test observations. Number of columns equals n.iter.

test.probcensabove

Matrix of draws of probabilities of test sample observations being censored from above. Number of rows equals number of test observations. Number of columns equals n.iter.
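
Under a Tobit model with Gaussian errors, the probability of censoring from below given f(x) and sigma is pnorm((below_cens - f(x)) / sigma), and the probability of censoring from above is the corresponding upper tail at above_cens. The sketch below computes these per draw from made-up stand-ins for test.mu and sigma. It assumes Gaussian errors and is not taken from the package source; the variable names are illustrative.

```r
below_cens <- 0
above_cens <- Inf

# Stand-ins: 2 test observations (rows) by 3 draws (columns), and 3 sigma draws
test_mu <- matrix(c(1, -0.5, 2, 0.3, 0, 1.5), nrow = 2)
sigma <- c(1, 1.2, 0.8)

# Standardize each column by the sigma draw from the same iteration
z_below <- sweep(below_cens - test_mu, 2, sigma, "/")
z_above <- sweep(above_cens - test_mu, 2, sigma, "/")

prob_below <- pnorm(z_below)                      # P(y* <= below_cens)
prob_above <- pnorm(z_above, lower.tail = FALSE)  # P(y* >= above_cens)

print(round(prob_below, 3))
```

With above_cens = Inf the upper-censoring probability is zero for every draw, which matches the default setting in which only censoring from below is active.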

sigma

Vector of draws of the standard deviation of the error term. Number of elements equals n.iter.

alpha_s_y_store

For Dirichlet prior on splitting probabilities in outcome equation, vector of alpha hyperparameter draws for each iteration.

var_count_y_store

Matrix of counts of splits on each variable in the outcome equation. The number of rows is the number of potential splitting variables. The number of columns is the number of post-burn-in iterations.

s_prob_y_store

Splitting probabilities for the outcome equation. The number of rows is the number of potential splitting variables. The number of columns is the number of post-burn-in iterations.
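
Because var_count_y_store and s_prob_y_store both have one row per variable and one column per post-burn-in iteration, row-wise averages give simple variable-relevance summaries. A minimal sketch follows, using small made-up matrices in place of actual output. The names var_count, s_prob, incl_prop, and mean_s_prob are illustrative.

```r
# Stand-ins: 3 potential splitting variables (rows), 4 iterations (columns)
var_count <- matrix(c(3, 0, 1,
                      2, 0, 0,
                      4, 1, 0,
                      3, 0, 2), nrow = 3)
s_prob <- matrix(c(0.6, 0.1, 0.3,
                   0.7, 0.1, 0.2,
                   0.8, 0.1, 0.1,
                   0.5, 0.2, 0.3), nrow = 3)

# Share of iterations in which each variable is used in at least one split
incl_prop <- rowMeans(var_count > 0)

# Posterior mean splitting probability for each variable
mean_s_prob <- rowMeans(s_prob)

print(rbind(incl_prop, mean_s_prob))
```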

Examples


# Example taken from https://stats.idre.ucla.edu/r/dae/tobit-models/
library(TobitBART)

dat <- read.csv("https://stats.idre.ucla.edu/stat/data/tobit.csv")

# Hold out 10 of the 200 observations as a test set.
# The seed is arbitrary and only makes the split reproducible.
set.seed(1)
train_inds <- sample(1:200, 190)
test_inds <- (1:200)[-train_inds]

ytrain <- dat$apt[train_inds]
ytest <- dat$apt[test_inds]

xtrain <- cbind(dat$read, dat$math)[train_inds, ]
xtest <- cbind(dat$read, dat$math)[test_inds, ]

# Censoring from above at 800, no censoring from below
tobart_res <- tbart1(xtrain, xtest, ytrain,
                     below_cens = -Inf,
                     above_cens = 800,
                     n.iter = 400,
                     n.burnin = 100)


EoghanONeill/TobitBART documentation built on Feb. 6, 2025, 6:52 a.m.