View source: R/train.R

train.gbm                                R Documentation

train.gbm

Description

Provides a wrapper function for gbm.

Usage

train.gbm(
  formula,
  data,
  distribution = "bernoulli",
  weights,
  var.monotone = NULL,
  n.trees = 100,
  interaction.depth = 1,
  n.minobsinnode = 10,
  shrinkage = 0.001,
  bag.fraction = 0.5,
  train.fraction = 1,
  cv.folds = 0,
  keep.data = TRUE,
  verbose = FALSE,
  class.stratify.cv = NULL,
  n.cores = NULL
)

Arguments

formula

a symbolic description of the model to be fit.

data

an optional data frame containing the variables in the model.

distribution

Either a character string specifying the name of the distribution to use or a list with a component name specifying the distribution and any additional parameters needed.
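For example, a character string selects a built-in loss directly, while the list form passes extra parameters along with the distribution name. The quantile example below follows the convention of the underlying gbm function; that train.gbm forwards the list unchanged is an assumption.

```r
# Character-string form: bernoulli loss for a binary outcome
# distribution = "bernoulli"

# List form with an additional parameter (gbm convention; assumed to be
# forwarded unchanged by train.gbm):
# distribution = list(name = "quantile", alpha = 0.25)
```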

weights

an optional vector of weights to be used in the fitting process. Must be positive but do not need to be normalized.

var.monotone

an optional vector, the same length as the number of predictors, indicating which variables have a monotone increasing (+1), decreasing (-1), or arbitrary (0) relationship with the outcome.
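As a sketch, with three predictors the constraint vector might look like this; the entries are matched to the predictors in the order they appear in the model formula.

```r
# +1 = monotone increasing, -1 = monotone decreasing, 0 = unconstrained,
# one entry per predictor, in formula order
var.monotone <- c(1, 0, -1)
```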

n.trees

Integer specifying the total number of trees to fit. This is equivalent to the number of iterations and the number of basis functions in the additive expansion. Default is 100.

interaction.depth

Integer specifying the maximum depth of each tree (i.e., the highest level of variable interactions allowed). A value of 1 implies an additive model, a value of 2 implies a model with up to 2-way interactions, etc. Default is 1.

n.minobsinnode

Integer specifying the minimum number of observations in the terminal nodes of the trees. Note that this is the actual number of observations, not the total weight.

shrinkage

a shrinkage parameter applied to each tree in the expansion. Also known as the learning rate or step-size reduction; values from 0.001 to 0.1 usually work, but a smaller learning rate typically requires more trees. Default is 0.001.
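Because a smaller learning rate needs more boosting iterations to reach the same fit, shrinkage and n.trees are usually tuned together. A minimal sketch (the specific values are illustrative, not tuned):

```r
# Lower learning rate compensated by a larger number of trees
model <- train.gbm(Species ~ ., data = iris,
                   shrinkage = 0.01, n.trees = 1000)
```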

bag.fraction

the fraction of the training set observations randomly selected to propose the next tree in the expansion. This introduces randomness into the model fit.

train.fraction

The first train.fraction * nrow(data) observations are used to fit the gbm and the remainder are used for computing out-of-sample estimates of the loss function.

cv.folds

Number of cross-validation folds to perform. If cv.folds > 1 then gbm, in addition to the usual fit, will perform cross-validation and calculate an estimate of generalization error, returned in cv.error.
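For example, requesting five folds yields a per-tree estimate of generalization error. That the fitted object exposes a cv.error component directly (as gbm objects do) is an assumption about the gbm.prmdt wrapper:

```r
# 5-fold cross-validation alongside the usual fit
model <- train.gbm(Species ~ ., data = iris, cv.folds = 5)

# Following the gbm convention, cv.error holds one value per tree,
# so its minimum suggests a tree count (assumed accessor):
# which.min(model$cv.error)
```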

keep.data

a logical variable indicating whether to keep the data and an index of the data stored with the object. Keeping the data and index makes subsequent calls to gbm.more faster at the cost of storing an extra copy of the dataset.

verbose

Logical indicating whether or not to print out progress and performance indicators (TRUE). If this option is left unspecified for gbm.more, then it uses verbose from object. Default is FALSE.

class.stratify.cv

Logical indicating whether or not the cross-validation should be stratified by class.

n.cores

The number of CPU cores to use. The cross-validation loop will attempt to send different CV folds off to different cores. If n.cores is not specified by the user, it is guessed using the detectCores function in the parallel package.

Value

An object of class gbm.prmdt containing the model together with additional information that homogenizes the results with the other models in the package.

Note

The parameter information was taken from the original function gbm.

See Also

The internal function is from package gbm.

Examples


# Classification
data <- iris
n <- nrow(data)

sam <- sample(1:n, floor(n * 0.75))
training <- data[sam, ]
testing <- data[-sam, ]

model <- train.gbm(formula = Species ~ ., data = training)
model
prediction <- predict(object = model, testing)
prediction

# Regression
len <- nrow(swiss)
sampl <- sample(x = 1:len, size = floor(len * 0.10), replace = FALSE)
ttesting <- swiss[sampl, ]
ttraining <- swiss[-sampl, ]
model.gbm <- train.gbm(Infant.Mortality ~ ., ttraining, distribution = "gaussian")
prediction <- predict(model.gbm, ttesting)
prediction


traineR documentation built on Nov. 10, 2023, 1:15 a.m.
