pcb: boost principal components of outcomes

View source: R/pcb.R

Description

boost principal components of outcomes
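As a rough illustration of what "principal components of outcomes" refers to, the sketch below computes principal component scores for a simulated outcome matrix with base R's prcomp. The data are hypothetical. It is only an assumption that scores like these serve as the boosting targets; this page does not describe how pcb computes or uses the components.

```r
# Hypothetical outcome matrix: 50 observations on 4 outcomes,
# with the second outcome correlated with the first
set.seed(3)
Y <- matrix(rnorm(200), ncol = 4)
Y[, 2] <- Y[, 1] + rnorm(50, sd = 0.3)

# Principal components of the centered, unit-variance outcomes
pc <- prcomp(Y, center = TRUE, scale. = TRUE)

# One column of scores per component; columns are mutually uncorrelated
scores <- pc$x

# Proportion of outcome variance carried by each component
prop_var <- pc$sdev^2 / sum(pc$sdev^2)
print(round(prop_var, 3))
```

Each column of scores is a single numeric response, so in principle it could be passed to any univariate boosting routine.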

Usage

pcb(Y, X, n.trees = 100, shrinkage = 0.01, interaction.depth = 1,
  distribution = "gaussian", train.fraction = 1, bag.fraction = 1,
  cv.folds = 1, keep.data = FALSE, s = NULL, compress = FALSE,
  save.cv = FALSE, iter.details = TRUE, verbose = FALSE, mc.cores = 1,
  ...)

Arguments

Y

vector, matrix, or data.frame for outcome variables with no missing values. To easily compare influences across outcomes and for numerical stability, outcome variables should be scaled to have unit variance.

X

vector, matrix, or data.frame of predictors. For best performance, continuous predictors should be scaled to have unit variance. Categorical variables should be converted to factors.
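The preprocessing recommended for Y and X can be sketched with base R as follows. The data and the column names (y1, age, group, and so on) are hypothetical. Note that scale() also centers each column by default, which goes slightly beyond the unit-variance recommendation above.

```r
set.seed(1)
n <- 50

# Hypothetical outcomes on very different scales
Y <- cbind(y1 = rnorm(n, sd = 1),
           y2 = rnorm(n, sd = 10),
           y3 = rnorm(n, sd = 100))

# Hypothetical predictors: one continuous, one categorical stored as character
X <- data.frame(age = rnorm(n, mean = 40, sd = 12),
                group = sample(c("a", "b", "c"), n, replace = TRUE),
                stringsAsFactors = FALSE)

# Outcomes must have no missing values
stopifnot(!anyNA(Y))

# Center and scale each outcome to unit variance
Y_scaled <- scale(Y)

# Scale the continuous predictors to unit variance
is_num <- vapply(X, is.numeric, logical(1))
X[is_num] <- lapply(X[is_num], function(v) as.numeric(scale(v)))

# Convert character predictors to factors
is_chr <- vapply(X, is.character, logical(1))
X[is_chr] <- lapply(X[is_chr], factor)

str(X)
```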

n.trees

maximum number of trees to be included in the model. Each individual tree is grown until a minimum number of observations in each node is reached.

shrinkage

a constant multiplier applied to the predictions from each tree to ensure a slow learning rate. Default is 0.01. Small shrinkage values may require a large number of trees to provide an adequate fit.
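The trade-off between shrinkage and n.trees can be seen in a toy example. The sketch below is a minimal base R boosting loop with stumps and squared-error loss on simulated data. The helpers fit_stump and boost_mse are hypothetical and written for illustration only; they are not the routines used by pcb or gbm.

```r
# Fit a single-split tree (stump) to residuals r by least squares
fit_stump <- function(x, r) {
  cuts <- sort(unique(x))
  cuts <- cuts[-length(cuts)]          # splitting at the maximum leaves one side empty
  best <- NULL
  best_sse <- Inf
  for (cut_pt in cuts) {
    left <- x <= cut_pt
    m_left <- mean(r[left])
    m_right <- mean(r[!left])
    sse <- sum((r[left] - m_left)^2) + sum((r[!left] - m_right)^2)
    if (sse < best_sse) {
      best_sse <- sse
      best <- list(cut = cut_pt, left = m_left, right = m_right)
    }
  }
  best
}

# Boost stumps on the residuals and return the training MSE
boost_mse <- function(x, y, n.trees, shrinkage) {
  pred <- rep(mean(y), length(y))
  for (m in seq_len(n.trees)) {
    st <- fit_stump(x, y - pred)
    # Each tree's prediction is multiplied by the shrinkage constant
    pred <- pred + shrinkage * ifelse(x <= st$cut, st$left, st$right)
  }
  mean((y - pred)^2)
}

# Simulated data (hypothetical)
set.seed(11)
x <- runif(100)
y <- sin(2 * pi * x) + rnorm(100, sd = 0.1)

mse_small <- boost_mse(x, y, n.trees = 100, shrinkage = 0.01)
mse_large <- boost_mse(x, y, n.trees = 100, shrinkage = 0.1)
print(c(shrinkage_0.01 = mse_small, shrinkage_0.1 = mse_large))
```

With the same 100 trees, the smaller shrinkage value leaves a larger training error, which is why small values typically need to be paired with a larger n.trees.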

interaction.depth

fixed depth of trees to be included in the model. A tree depth of 1 corresponds to fitting stumps (main effects only); higher tree depths capture higher-order interactions (e.g. 2 implies a model with up to 2-way interactions).

distribution

Character vector specifying the distribution of all outcomes. Default is "gaussian"; see ?gbm for further details.

train.fraction

proportion of the sample used for training the multivariate additive model. If both cv.folds and train.fraction are specified, the CV is carried out within the training set.

bag.fraction

proportion of the training sample used to fit univariate trees for each response at each iteration. Default: 1

cv.folds

number of cross-validation folds. Default: 1. Runs k + 1 models, where the k models are run in parallel and the final model is run on the entire sample. If larger than 1, the number of trees that minimizes the multivariate MSE averaged over the k folds is reported in object$best.trees.
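The selection rule described for object$best.trees can be sketched as follows. The error curves are simulated, and the object names (cv_err, best_trees) are hypothetical; the sketch only shows averaging the error over folds at each iteration and taking the minimizing number of trees, not how pcb stores these quantities internally.

```r
set.seed(7)
n_trees <- 50   # hypothetical number of boosting iterations
k <- 5          # hypothetical number of folds
iters <- seq_len(n_trees)

# Hypothetical held-out MSE per iteration (rows) and fold (columns):
# a U-shaped curve with its minimum near 30 trees, plus a little noise
cv_err <- sapply(seq_len(k), function(fold) {
  (iters - 30)^2 / 900 + 1 + rnorm(n_trees, sd = 0.01)
})

# Average the error over the k folds at each iteration
avg_err <- rowMeans(cv_err)

# Number of trees that minimizes the averaged error
best_trees <- which.min(avg_err)
print(best_trees)
```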

keep.data

a logical variable indicating whether to keep the data stored with the object.

s

vector of indices denoting observations to be used for the training sample. If s is given, train.fraction is ignored.
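A vector suitable for s can be built with base R, for example by drawing a random training sample. The sample size and the 80% split below are hypothetical. This page does not state how train.fraction itself selects observations, so the sketch makes no claim about that.

```r
set.seed(42)
n <- 100                 # hypothetical number of observations
train_share <- 0.8       # hypothetical share of observations used for training

# Randomly chosen row indices for the training sample
s <- sort(sample(seq_len(n), size = floor(train_share * n)))

# The remaining observations form the test sample
test_idx <- setdiff(seq_len(n), s)

print(length(s))
print(length(test_idx))
```

Building s explicitly makes the split reproducible and allows the same training sample to be reused across models.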

compress

TRUE/FALSE. Compress output results list using bzip2 (approx 10% of original size). Default is FALSE.

save.cv

TRUE/FALSE. Save all k-fold cross-validation models. Default is FALSE.

iter.details

TRUE/FALSE. Return training, test, and cross-validation error at each iteration. Default is TRUE.

verbose

If TRUE, will print out progress and performance indicators for each model. Default is FALSE.

mc.cores

Number of cores for cross validation.

...

additional arguments passed to gbm. These include distribution, weights, var.monotone, n.minobsinnode, keep.data, verbose, class.stratify.cv. Note that other distribution arguments have not been tested.


patr1ckm/mvtboost documentation built on May 24, 2019, 8:21 p.m.