metb: Boosted decision trees with random effects


View source: R/metb.R

Description

At each iteration, a single decision tree is fit using gbm.fit, and the terminal node means are allowed to vary by group using lmer.
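
To make this concrete, here is a minimal conceptual sketch of one boosting iteration, using rpart and lme4 in place of the package's internal gbm.fit machinery; the data, the formula, and the random-effects structure are illustrative assumptions, not the actual implementation:

library(rpart)   # stands in for gbm.fit in this sketch
library(lme4)

set.seed(1)
n    <- 500
id   <- factor(sample(1:20, n, replace = TRUE))      # grouping variable
d    <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
y    <- d$x1 + rnorm(20)[id] + rnorm(n)              # outcome with group effects
r    <- y - mean(y)                                  # residuals from the initial fit

tree <- rpart(r ~ x1 + x2, data = d,
              control = rpart.control(maxdepth = 2)) # one shallow tree
node <- factor(tree$where)                           # terminal node membership

# Terminal node means plus group-level deviations; a single random
# intercept per group is a simplification of the package's approach.
mod  <- lmer(r ~ 0 + node + (1 | id))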

Usage

metb(y, X, id, n.trees = 5, interaction.depth = 3, n.minobsinnode = 20,
  shrinkage = 0.01, bag.fraction = 0.5, train.fraction = NULL,
  cv.folds = 1, subset = NULL, indep = TRUE, save.mods = FALSE,
  mc.cores = 1, num_threads = 1, verbose = TRUE, ...)

metb.fit(y, X, id, n.trees = 5, interaction.depth = 3,
  n.minobsinnode = 20, shrinkage = 0.01, bag.fraction = 0.5,
  train.fraction = NULL, subset = NULL, indep = TRUE, num_threads = 1,
  save.mods = FALSE, verbose = TRUE, ...)

Arguments

y

outcome vector (continuous)

X

matrix or data frame of predictors

id

name or index of grouping variable

n.trees

the total number of trees to fit (iterations).

interaction.depth

The maximum depth of trees. 1 implies a single split (stump), 2 implies a tree with 2 splits, etc.

n.minobsinnode

minimum number of observations in the terminal nodes of each tree

shrinkage

a shrinkage parameter applied to each tree. Also known as the learning rate or step-size reduction.

bag.fraction

the fraction of the training set observations randomly selected to propose the next tree. This introduces randomness into the model fit. If bag.fraction < 1, running the same model twice will result in similar but not identical fits. Using set.seed ensures reproducibility (see the sketch after this list).

train.fraction

the fraction of the sample used for training

cv.folds

number of cross-validation folds. If greater than 1, cross-validation over a grid of meta-parameters is performed in addition to the usual fit (see Details).

subset

index of observations to use for training

indep

whether random effects are independent or allowed to covary (default is TRUE, for speed)

save.mods

whether the lmer models fit at each iteration are saved (required to use predict)

mc.cores

number of parallel cores used for cross-validation (forking via mclapply)

num_threads

number of threads

verbose

If TRUE, progress is printed every 10 trees/iterations during the final model fit.

...

arguments passed to gbm.fit
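
As noted under bag.fraction, subsampling makes the fit stochastic and a seed pins it down. A minimal sketch, where y, X, and the grouping column "school" are hypothetical placeholder data:

set.seed(123)
fit1 <- metb(y = y, X = X, id = "school", bag.fraction = 0.5)
set.seed(123)
fit2 <- metb(y = y, X = X, id = "school", bag.fraction = 0.5)
identical(fit1$yhat, fit2$yhat)   # TRUE: same seed, same subsamples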

Details

Meta-parameter tuning is handled by passing vectors of possible values for n.trees, shrinkage, indep, interaction.depth, and n.minobsinnode and setting cv.folds > 1. Setting mc.cores > 1 will carry out the tuning in parallel by forking via mclapply. Tuning is only done within the training set.
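
A tuning run might look like the following sketch; y, X, and the grouping column "school" are placeholders for your own data:

fit <- metb(y = y, X = X, id = "school",
            n.trees = 1000,
            shrinkage = c(0.005, 0.01, 0.1),
            interaction.depth = c(1, 3),
            n.minobsinnode = c(10, 20),
            cv.folds = 5,
            mc.cores = 4)   # grid evaluated in parallel via mclapply

fit$best.params   # meta-parameter values chosen by CV
fit$params        # full grid with the CV error of each combination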

Prediction is most easily carried out by passing the entire X matrix to metb, and specifying the training set using subset. Otherwise, set save.mods=TRUE and use predict.
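
A sketch of both approaches, again with placeholder data; the newdata argument to predict is an assumption, since its exact signature is not documented here:

# Approach 1: pass all rows of X, train on a subset; predictions for
# every row come back in the fitted object.
train <- sample(nrow(X), floor(0.8 * nrow(X)))
fit   <- metb(y = y, X = X, id = "school", subset = train)
pred  <- fit$yhat   # predictions at the best iteration

# Approach 2: keep the per-iteration lmer fits and call predict().
fit2  <- metb(y = y, X = X, id = "school", save.mods = TRUE)
pred2 <- predict(fit2, newdata = Xnew)   # 'Xnew' is hypothetical new data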

Value

An metb object consisting of the following list elements:

yhat

Vector of predictions at the best iteration (fixed + ranef)

ranef

Vector of random effects at the best iteration

fixed

Vector of fixed effect predictions at the best iteration

shrinkage

Amount of shrinkage

subset

Vector of observations used for training

best.trees

The best number of trees according to training, test, out-of-bag, and cross-validation error

best.params

The best set of meta-parameter values given by CV

params

A data frame of all meta-parameter combinations and the corresponding CV error

sigma

The variance due to the grouping variable at each iteration

xnames

Column names of X

mods

List of lmer models (if save.mods=TRUE)

id

name or index of the grouping variable

trees

List of trees fit at each iteration

init

The initial prediction

var.type

Type of variables (gbm.fit)

c.split

List of categorical splits (gbm.fit)

train.err

Training error at each iteration

oob.err

Out-of-bag error at each iteration

test.err

Test error at each iteration

cv.err

Cross-validation error at each iteration

Functions

metb: fits the model, with optional meta-parameter tuning by cross-validation (cv.folds > 1), in parallel if mc.cores > 1.

metb.fit: the underlying routine that fits a single model for one set of meta-parameter values, without cross-validation or parallelization.
