bartModelMatrix: Create a matrix out of a vector or data frame
In BartMixVs: Variable Selection Using Bayesian Additive Regression Trees

bartModelMatrix

R Documentation

Create a matrix out of a vector or data frame

Description

The external BART functions (e.g. wbart()) operate on matrices in memory, so this function converts a vector or data frame supplied by the user into a matrix. When asked to do so, it also determines the number of cut points needed for each column. This function is inherited from the CRAN package 'BART'.

Usage

bartModelMatrix(
  X,
  numcut = 0L,
  usequants = FALSE,
  type = 7,
  rm.const = FALSE,
  cont = FALSE,
  xinfo = NULL
)

Arguments

`X`	A vector or data frame from which the matrix is created.
`numcut`	The maximum number of cut points to consider. If `numcut=0`, then return a matrix; otherwise, return a list containing a matrix `X`, a vector `numcut` and a list `xinfo`.
`usequants`	A Boolean argument indicating how the cut points are generated. If `usequants=FALSE`, then the cut points in `xinfo` are spaced uniformly over the range of each predictor; otherwise, quantiles of the predictor are used as the cut points.
`type`	An integer between 1 and 9 determining which algorithm is employed in the function `quantile()`.
`rm.const`	A Boolean argument indicating whether to remove constant variables.
`cont`	A Boolean argument indicating whether to assume all variables are continuous.
`xinfo`	A list (or matrix) with one item (row) per predictor, whose contents (columns) are the cut points for that predictor. If `xinfo=NULL`, BART will choose `xinfo` for the user.
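
The two cut-point schemes selected by `usequants` can be illustrated with a small base R sketch. This is not the code used inside bartModelMatrix(); the helper name `cut_points()` and the exact choice of quantile probabilities are assumptions made for illustration only.

```r
# Illustrative sketch only: cut_points() is a hypothetical helper,
# not part of BartMixVs or BART.
cut_points <- function(x, numcut, usequants = FALSE, type = 7) {
  if (usequants) {
    # Interior quantiles at probabilities 1/(numcut+1), ..., numcut/(numcut+1)
    # (an assumed choice of probabilities)
    probs <- seq_len(numcut) / (numcut + 1)
    unname(quantile(x, probs = probs, type = type))
  } else {
    # Evenly spaced grid over the observed range, with both end points dropped
    grid <- seq(min(x), max(x), length.out = numcut + 2)
    grid[-c(1, numcut + 2)]
  }
}

set.seed(1)
x <- rexp(200)  # a right-skewed predictor

uniform_cuts  <- cut_points(x, numcut = 5)
quantile_cuts <- cut_points(x, numcut = 5, usequants = TRUE)
```

For a skewed predictor such as this one, the uniform grid spends several cut points in the sparse right tail, whereas the quantile-based cut points follow the data density. This is the usual reason for setting `usequants=TRUE`.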

Value

If `numcut=0`, the function bartModelMatrix() returns only the matrix `X`; otherwise, it returns a list with the following components.

`X`	A matrix with rows corresponding to observations and columns corresponding to predictors (after dummification).
`numcut`	A vector of `ncol(X)` integers with each indicating the number of cut points for the corresponding predictor.
`rm.const`	A vector of indicators for the predictors (after dummification) used in BART; a negative indicator means that the corresponding predictor is removed.
`xinfo`	A list (or matrix) with one item (row) per predictor, whose contents (columns) are the cut points for that predictor.
`grp`	A vector of group indices for predictors. For example, if 2 appears 3 times in `grp`, the second predictor of `X` is a categorical predictor with 3 levels.
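
As a rough illustration of what dummification and the `grp` vector look like, the following base R sketch expands a small data frame by hand. It is a sketch under stated assumptions, not the implementation in bartModelMatrix(): the helper `expand_predictors()`, the column naming, and the constant-column check are all hypothetical.

```r
# Illustrative sketch only: expand_predictors() is a hypothetical helper,
# not part of BartMixVs or BART.
expand_predictors <- function(df) {
  cols <- list()
  grp <- integer(0)
  for (j in seq_along(df)) {
    v <- df[[j]]
    if (is.factor(v)) {
      # One 0/1 indicator column per factor level
      ind <- outer(as.character(v), levels(v), "==") + 0
      colnames(ind) <- paste0(names(df)[j], levels(v))
    } else {
      # Numeric predictors are kept as a single column
      ind <- matrix(as.numeric(v), ncol = 1,
                    dimnames = list(NULL, names(df)[j]))
    }
    cols[[j]] <- ind
    # Record the index of the original predictor once per generated column
    grp <- c(grp, rep(j, ncol(ind)))
  }
  list(X = do.call(cbind, cols), grp = grp)
}

df <- data.frame(
  age    = c(31, 45, 27, 52),
  region = factor(c("north", "south", "west", "south")),
  flag   = 1  # constant predictor
)

out <- expand_predictors(df)

# Columns with a single distinct value are candidates for removal
is_const <- apply(out$X, 2, function(col) length(unique(col)) == 1)
```

Here `region` is the second predictor and has 3 levels, so the value 2 appears 3 times in `out$grp`, matching the description of `grp` above. The `flag` column is the only one marked in `is_const`.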

Author(s)

Chuji Luo: cjluo@ufl.edu and Michael J. Daniels: daniels@ufl.edu.

References

Chipman, H. A., George, E. I. and McCulloch, R. E. (2010). "BART: Bayesian additive regression trees." Ann. Appl. Stat. 4 266–298.

Linero, A. R. (2018). "Bayesian regression trees for high-dimensional prediction and variable selection." J. Amer. Statist. Assoc. 113 626–636.

Luo, C. and Daniels, M. J. (2021). "Variable Selection Using Bayesian Additive Regression Trees." arXiv preprint arXiv:2112.13998.

Rockova, V. and Saha, E. (2019). "On theory for BART." In The 22nd International Conference on Artificial Intelligence and Statistics 2839–2848. PMLR.

Sparapani, R., Spanbauer, C. and McCulloch, R. (2021). "Nonparametric machine learning and efficient computation with bayesian additive regression trees: the BART R package." J. Stat. Softw. 97 1–66.

Examples

## Simulate data (Scenario C.M.1 in Luo and Daniels (2021))
library(BartMixVs)
set.seed(123)
data <- mixone(100, 10, 1, FALSE)
## Convert the predictors to a matrix, with up to 100 uniformly spaced cut points per column
res <- bartModelMatrix(data$X, numcut = 100, usequants = FALSE, cont = FALSE, rm.const = TRUE)

BartMixVs documentation built on May 5, 2022, 9:05 a.m.