abart: AFT BART for time-to-event outcomes
In BART: Bayesian Additive Regression Trees

abart

R Documentation

AFT BART for time-to-event outcomes

Description

BART is a Bayesian “sum-of-trees” model.
For a numeric response y, we have y = f(x) + \epsilon, where \epsilon \sim N(0,\sigma^2).

f is the sum of many tree models. The goal is to have very flexible inference for the unknown function f.

In the spirit of “ensemble models”, each tree is constrained by a prior to be a weak learner so that it contributes a small amount to the overall fit.
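As a rough sketch, the description above can be written in the notation commonly used in the BART literature. The symbols m (number of trees), T_j (the j-th tree structure), M_j (its leaf values) and g are not defined on this page and are assumptions introduced here for illustration. Taking y to be the log of the event time is the usual accelerated failure time (AFT) form and is likewise assumed rather than stated above.

```latex
\begin{aligned}
y_i &= \log t_i, \\
y_i &= f(x_i) + \epsilon_i, \qquad \epsilon_i \sim N(0, \sigma^2), \\
f(x) &= \sum_{j=1}^{m} g(x; T_j, M_j).
\end{aligned}
```

For a right-censored observation (delta = 0), the event time t_i is not observed; it is only known to exceed the recorded value in times.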

Usage

abart(
      x.train, times, delta,
      x.test=matrix(0,0,0), K=100,
      type='abart', ntype=1,
      sparse=FALSE, theta=0, omega=1,
      a=0.5, b=1, augment=FALSE, rho=NULL,
      xinfo=matrix(0,0,0), usequants=FALSE,
      rm.const=TRUE,
      sigest=NA, sigdf=3, sigquant=0.90,
      k=2, power=2, base=0.95,
      
      lambda=NA, tau.num=c(NA, 3, 6)[ntype], 
      offset=NULL, w=rep(1, length(times)),
      ntree=c(200L, 50L, 50L)[ntype], numcut=100L,
      
      ndpost=1000L, nskip=100L, 
      keepevery=c(1L, 10L, 10L)[ntype],
      printevery=100L, transposed=FALSE,
      mc.cores = 1L, ## mc.abart only
      nice = 19L,    ## mc.abart only
      seed = 99L     ## mc.abart only
)

mc.abart(
         x.train, times, delta,
         x.test=matrix(0,0,0), K=100,
         type='abart', ntype=1,
         sparse=FALSE, theta=0, omega=1,
         a=0.5, b=1, augment=FALSE, rho=NULL,
         xinfo=matrix(0,0,0), usequants=FALSE,
         rm.const=TRUE,
         sigest=NA, sigdf=3, sigquant=0.90,
         k=2, power=2, base=0.95,
         
         lambda=NA, tau.num=c(NA, 3, 6)[ntype], 
         offset=NULL, w=rep(1, length(times)),
         
         ntree=c(200L, 50L, 50L)[ntype], numcut=100L,
         ndpost=1000L, nskip=100L, 
         keepevery=c(1L, 10L, 10L)[ntype],
         printevery=100L, transposed=FALSE,
         mc.cores = 2L, nice = 19L, seed = 99L
)

Arguments

`x.train`	Explanatory variables for training (in sample) data. May be a matrix or a data frame, with (as usual) rows corresponding to observations and columns to variables. If a variable is a factor in a data frame, it is replaced with dummies. Note that `q` dummies are created if `q>2` and one dummy is created if `q=2`, where `q` is the number of levels of the factor. `abart` will generate draws of `f(x)` for each `x` which is a row of `x.train`.
`times`	The time of event or right-censoring. If `y.train` is `NULL`, then `times` (and `delta`) must be provided.
`delta`	The event indicator: 1 is an event while 0 is censored. If `y.train` is `NULL`, then `delta` (and `times`) must be provided.
`x.test`	Explanatory variables for test (out of sample) data. Should have same structure as `x.train`. `abart` will generate draws of `f(x)` for each `x` which is a row of `x.test`.
`K`	If provided, then coarsen `times` per the quantiles `1/K, 2/K, ..., K/K`.
`type`	You can use this argument to specify the type of fit. `'abart'` for AFT BART.
`ntype`	The integer equivalent of `type` where `'abart'` is 1.
`sparse`	Whether to perform variable selection based on a sparse Dirichlet prior rather than simply uniform; see Linero 2016.
`theta`	Set `theta` parameter; zero means random.
`omega`	Set `omega` parameter; zero means random.
`a`	Sparse parameter for `Beta(a, b)` prior: `0.5<=a<=1`, where lower values induce more sparsity.
`b`	Sparse parameter for `Beta(a, b)` prior; typically, `b=1`.
`rho`	Sparse parameter: typically `rho=p` where `p` is the number of covariates under consideration.
`augment`	Whether data augmentation is to be performed in sparse variable selection.
`xinfo`	You can provide the cutpoints to BART or let BART choose them for you. To provide them, use the `xinfo` argument to specify a list (matrix) where the items (rows) are the covariates and the contents of the items (columns) are the cutpoints.
`usequants`	If `usequants=FALSE`, then the cutpoints in `xinfo` are generated uniformly; otherwise, if `TRUE`, uniform quantiles are used for the cutpoints.
`rm.const`	Whether or not to remove constant variables.
`sigest`	The prior for the error variance (`sigma^2`) is inverted chi-squared (the standard conditionally conjugate prior). The prior is specified by choosing the degrees of freedom, a rough estimate of the corresponding standard deviation and a quantile to put this rough estimate at. If `sigest=NA` then the rough estimate will be the usual least squares estimator. Otherwise the supplied value will be used. Not used if `y` is binary.
`sigdf`	Degrees of freedom for error variance prior. Not used if `y` is binary.
`sigquant`	The quantile of the prior that the rough estimate (see `sigest`) is placed at. The closer the quantile is to 1, the more aggressive the fit will be as you are putting more prior weight on error standard deviations (`sigma`) less than the rough estimate. Not used if `y` is binary.
`k`	For numeric `y`, `k` is the number of prior standard deviations `E(Y|x) = f(x)` is away from +/-0.5. For binary `y`, `k` is the number of prior standard deviations `f(x)` is away from +/-3. The bigger `k` is, the more conservative the fitting will be.
`power`	Power parameter for tree prior.
`base`	Base parameter for tree prior.
`lambda`	The scale of the prior for the variance. Not used if `y` is binary.
`tau.num`	The numerator in the `tau` definition, i.e., `tau=tau.num/(k*sqrt(ntree))`.
`offset`	Continuous BART operates on `y.train` centered by `offset`, which defaults to `mean(y.train)`. With binary BART, the centering is `P(Y=1 | x) = F(f(x) + offset)` where `offset` defaults to `F^{-1}(mean(y.train))`. You can use the `offset` parameter to override these defaults.
`w`	Vector of weights which multiply the standard deviation. Not used if `y` is binary.
`ntree`	The number of trees in the sum.
`numcut`	The number of possible values of `c` (see `usequants`). If a single number is given, this is used for all variables. Otherwise a vector with length equal to `ncol(x.train)` is required, where the `i^{th}` element gives the number of `c` used for the `i^{th}` variable in `x.train`. If `usequants` is false, `numcut` equally spaced cutoffs are used covering the range of values in the corresponding column of `x.train`. If `usequants` is true, then `min(numcut, the number of unique values in the corresponding columns of x.train - 1)` values are used.
`ndpost`	The number of posterior draws returned.
`nskip`	Number of MCMC iterations to be treated as burn in.
`printevery`	As the MCMC runs, a message is printed every `printevery` draws.
`keepevery`	Every `keepevery`-th draw is kept and returned to the user.
`transposed`	When running `abart` in parallel, it is more memory-efficient to transpose `x.train` and `x.test`, if any, prior to calling `mc.abart`.
`seed`	Setting the seed required for reproducible MCMC.
`mc.cores`	Number of cores to employ in parallel.
`nice`	Set the job niceness. The default niceness is 19: niceness ranges from 0 (highest priority) to 19 (lowest priority).
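
To make the `usequants` and `numcut` descriptions above more concrete, here is a minimal base-R sketch of the two cutpoint styles. The helper `make_cutpoints` is hypothetical and written only for illustration; it is not the package's internal code, and the exact cutpoints chosen by `abart` may differ.

```r
## Hypothetical helper for illustration only; not part of the BART package.
make_cutpoints <- function(x, numcut = 100L, usequants = FALSE) {
  if (!usequants) {
    ## numcut equally spaced interior points covering the range of x
    grid <- seq(min(x), max(x), length.out = numcut + 2L)
    return(grid[-c(1L, numcut + 2L)])
  }
  ## Quantile style (an assumption): midpoints between sorted unique values,
  ## thinned to at most min(numcut, number of unique values - 1) points
  u <- sort(unique(x))
  mids <- (u[-1L] + u[-length(u)]) / 2
  n <- min(numcut, length(u) - 1L)
  mids[unique(round(seq(1, length(mids), length.out = n)))]
}

x <- c(0, 1, 1, 2, 5, 10)
cut.unif  <- make_cutpoints(x, numcut = 4L)                     ## evenly spaced over [0, 10]
cut.quant <- make_cutpoints(x, numcut = 100L, usequants = TRUE) ## follows the data values
```

With skewed covariates, the equally spaced grid places most cutpoints where there are few observations, while the quantile-style grid follows the data.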

Details

BART is a Bayesian MCMC method. At each MCMC iteration, we produce a draw from the joint posterior (f,\sigma) | (x,y) in the numeric y case and just f in the binary y case.

Thus, unlike a lot of other modelling methods in R, we do not produce a single model object from which fits and summaries may be extracted. The output consists of values f^*(x) (and \sigma^* in the numeric case) where * denotes a particular draw. The x is either a row from the training data, x.train or the test data, x.test.
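
Because the output is a set of draws rather than a single fitted model, summaries are computed from the draws directly. The sketch below uses a simulated matrix `draws` as a stand-in for a matrix of kept draws (one row per draw, one column per observation); the names `draws`, `f.true`, `post.mean` and `post.ci` are assumptions made for illustration and the numbers are not output from `abart`.

```r
## Simulated stand-in for a draws matrix: ndpost rows (draws) by n columns (observations).
## These values are made up for illustration; they do not come from abart.
set.seed(1)
ndpost <- 1000L
n <- 5L
f.true <- c(-2, -1, 0, 1, 2)
draws <- matrix(rnorm(ndpost * n, mean = rep(f.true, each = ndpost), sd = 0.3),
                nrow = ndpost, ncol = n)

## Posterior mean of f(x) for each observation: the mean of each column
post.mean <- colMeans(draws)

## Pointwise 95% posterior intervals: one column per observation
post.ci <- apply(draws, 2, quantile, probs = c(0.025, 0.975))

round(rbind(mean = post.mean, post.ci), 2)
```

With a real fit, the same calls would be applied to the draws matrix returned by `abart` in place of the simulated `draws`.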

Value

abart returns an object of type abart which is essentially a list. In the numeric y case, the list has components:

`yhat.train`	A matrix with `ndpost` rows and `nrow(x.train)` columns. Each row corresponds to a draw `f^*` from the posterior of `f` and each column corresponds to a row of `x.train`. The `(i,j)` value is `f^*(x)` for the `i^{th}` kept draw of `f` and the `j^{th}` row of `x.train`. Burn-in is dropped.
`yhat.test`	Same as yhat.train but now the x's are the rows of the test data.
`yhat.train.mean`	Training-data fits: the column means of `yhat.train`.
`yhat.test.mean`	Test-data fits: the column means of `yhat.test`.
`sigma`	Post-burn-in draws of `sigma`; length is `ndpost`.
`first.sigma`	Burn-in draws of `sigma`.
`varcount`	A matrix with `ndpost` rows and `ncol(x.train)` columns. Each row is for a draw. For each variable (corresponding to the columns), the total count of the number of times that variable is used in a tree decision rule (over all trees) is given.
`sigest`	The rough error standard deviation (`\sigma`) used in the prior.
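
One common use of `varcount` is to gauge how often each covariate is used in splitting rules. The sketch below works on a simulated stand-in matrix with the same layout (draws in rows, variables in columns); the counts and the names `varprob` and `usage` are assumptions made for illustration, not output from `abart`.

```r
## Simulated stand-in for a varcount matrix: one row per draw, one column per variable.
## The counts are made up for illustration.
set.seed(2)
ndpost <- 200L
P <- 5L
varcount <- matrix(rpois(ndpost * P, lambda = rep(c(40, 5, 5, 5, 5), each = ndpost)),
                   nrow = ndpost, ncol = P,
                   dimnames = list(NULL, paste0("x", 1:P)))

## Share of splitting rules that use each variable, computed within each draw
varprob <- varcount / rowSums(varcount)

## Average share over draws; larger values suggest heavier use of that variable
usage <- colMeans(varprob)
sort(usage, decreasing = TRUE)
```

Usage shares of this kind are a descriptive summary of the fitted trees; they are not a formal test of variable importance.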

Examples


N = 1000
P = 5       #number of covariates
M = 8

set.seed(12)
x.train=matrix(runif(N*P, -2, 2), N, P)
mu = x.train[ , 1]^3
y=rnorm(N, mu)
offset=mean(y)
T=exp(y)
C=rexp(N, 0.05)
delta=(T<C)*1
table(delta)/N
times=(T*delta+C*(1-delta))

##test BART with token run to ensure installation works
set.seed(99)
post1 = abart(x.train, times, delta, nskip=5, ndpost=10)

## Not run: 

post1 = mc.abart(x.train, times, delta,
                 mc.cores=M, seed=99)
post2 = mc.abart(x.train, times, delta, offset=offset,
                 mc.cores=M, seed=99)

Z=8

plot(mu, post1$yhat.train.mean, asp=1,
     xlim=c(-Z, Z), ylim=c(-Z, Z))
abline(a=0, b=1)

plot(mu, post2$yhat.train.mean, asp=1,
     xlim=c(-Z, Z), ylim=c(-Z, Z))
abline(a=0, b=1)

plot(post1$yhat.train.mean, post2$yhat.train.mean, asp=1,
     xlim=c(-Z, Z), ylim=c(-Z, Z))
abline(a=0, b=1)


## End(Not run)
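
The construction of `times` and `delta` in the example above follows the usual right-censoring setup: the observed time is the smaller of the event time and the censoring time, and `delta` records which one was observed. A minimal sketch of that equivalence follows; it uses the names `event.time` and `censor.time` (assumptions chosen here to avoid masking R's `T` shorthand) and a simpler log-normal event time than the example.

```r
set.seed(12)
N <- 1000
event.time  <- exp(rnorm(N))    ## latent event times (log-normal)
censor.time <- rexp(N, 0.05)    ## independent censoring times

## 1 = event observed, 0 = right-censored
delta <- as.integer(event.time < censor.time)

## Same construction as in the example: pick the event or the censoring time
times <- event.time * delta + censor.time * (1 - delta)

## The observed time is simply the smaller of the two
stopifnot(all.equal(times, pmin(event.time, censor.time)))

mean(delta)   ## proportion of observed events
```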

BART documentation built on June 22, 2024, 11:33 a.m.