ladboost: Gradient tree boosting with least absolute deviation (LAD)...


ladboost    R Documentation

Gradient tree boosting with least absolute deviation (LAD) loss

Description

A poor man's implementation of stochastic gradient tree boosting with LAD loss.

Print basic information about a fitted "ladboost" object.

Compute predictions from an "ladboost" object using new data.
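The boosting procedure named above can be sketched in a few lines of R. This is only an illustrative sketch of the textbook LAD boosting recipe, not the package's source: the helper name lad_boost_sketch, the minbucket and cp settings, the fitted component, and the toy data are all assumptions made for the example. Each iteration fits an rpart tree to the sign of the current residuals on a random subsample, then replaces each leaf value with the median residual in that leaf.

```r
library(rpart)  # recommended package that ships with R

# Hypothetical helper (not part of treemisc): bare-bones LAD boosting loop
lad_boost_sketch <- function(X, y, ntree = 50, shrinkage = 0.1, depth = 2,
                             subsample = 0.5, init = median(y)) {
  n <- nrow(X)
  yhat <- rep(init, times = n)  # current fit starts at the constant
  trees <- vector("list", length = ntree)
  # cp and minbucket are assumed values chosen for this sketch
  ctrl <- rpart.control(cp = 0, maxdepth = depth, minbucket = 10)
  for (m in seq_len(ntree)) {
    id <- sample.int(n, size = floor(subsample * n))  # row subsample
    res <- y[id] - yhat[id]  # current residuals on the subsample
    samp <- X[id, , drop = FALSE]
    samp$pseudo_res <- sign(res)  # negative gradient of |y - f|
    fit <- rpart(pseudo_res ~ ., data = samp, control = ctrl)
    # Overwrite each leaf value with the median residual in that leaf
    leaf_med <- tapply(res, INDEX = fit$where, FUN = median)
    fit$frame$yval[as.integer(names(leaf_med))] <- leaf_med
    yhat <- yhat + shrinkage * predict(fit, newdata = X)
    trees[[m]] <- fit
  }
  list(trees = trees, shrinkage = shrinkage, init = init, fitted = yhat)
}

# Toy data (made up for illustration)
set.seed(101)
X <- data.frame(x1 = runif(300), x2 = runif(300))
y <- 3 * X$x1 + sin(2 * pi * X$x2) + rt(300, df = 3) / 5
fit <- lad_boost_sketch(X, y)
mae_init <- mean(abs(y - median(y)))    # error of the constant fit
mae_boost <- mean(abs(y - fit$fitted))  # training error after boosting
```

Comparing mae_boost with mae_init gives a quick check that the loop is reducing absolute error on the training data.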

Usage

ladboost(
  X,
  y,
  ntree = 100,
  shrinkage = 0.1,
  depth = 6,
  subsample = 0.5,
  init = median(y)
)

## S3 method for class 'ladboost'
print(x, ...)

## S3 method for class 'ladboost'
predict(object, newdata, ntree = NULL, individual = FALSE, ...)

Arguments

X

A data frame of only predictors.

y

A vector of response values.

ntree

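Integer specifying the number of trees. In ladboost(), this is the number of trees to fit (default is 100). In predict(), this is the number of trees in the ensemble to use; the default (NULL) uses all the trees in the ensemble.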

shrinkage

Numeric specifying the shrinkage factor (i.e., learning rate) applied to each tree's predictions. Default is 0.1.

depth

Integer specifying the maximum depth of each tree. Default is 6.

subsample

Numeric specifying the proportion of the training data to randomly sample before building each tree. Default is 0.5.

init

Numeric specifying the initial value to boost from. Defaults to the median response (i.e., median(y)).

x

An object of class "ladboost".

...

Additional optional arguments. (Currently ignored.)

object

An object of class "ladboost".

newdata

Data frame of new observations for making predictions.

individual

Logical indicating whether or not to return the (shrunken) predictions from each tree individually (TRUE) or the overall ensemble prediction (FALSE). Default is FALSE.
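The default init = median(y) is a natural starting point because the median is the constant that minimizes the sum of absolute deviations. A small numerical check, using a made-up skewed response (the data and the name lad_loss are assumptions for illustration only):

```r
set.seed(42)
y <- rexp(101)  # made-up skewed response; odd length gives a unique median

# LAD loss of a constant fit
lad_loss <- function(const) sum(abs(y - const))

# Numerically search for the best constant over the range of y
best <- optimize(lad_loss, interval = range(y))$minimum

# The numerical optimum should sit next to the median, not the mean
c(optimum = best, median = median(y), mean = mean(y))
```

For a skewed response like this one, the mean is pulled toward the tail, so starting from it would give a larger absolute error than starting from the median.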

Value

For ladboost(): an object of class "ladboost", which is a list with the following components:

  • trees A list of length ntree containing the individual rpart tree fits.

  • shrinkage The corresponding shrinkage parameter.

  • depth The maximum depth of each tree.

  • subsample The (row) subsampling rate.

  • init The initial constant fit.

For predict(): a vector (individual = FALSE) or matrix (individual = TRUE) of predictions.

Note

By design, the fitted trees do not include the predictions from the initial (constant) fit. The constant is instead stored in the init component of the returned object so that predict.ladboost() can add it back later.
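The role of init can be illustrated with a toy calculation. The sketch below assumes that individual = TRUE yields a matrix of shrunken per-tree predictions with one column per tree and without the constant; the numbers and the names pmat, ensemble, and staged are made up for illustration and do not come from a fitted model.

```r
# Toy stand-ins (assumptions): 3 observations, 2 trees
init <- 2.5
pmat <- cbind(tree1 = c(0.10, -0.20, 0.05),
              tree2 = c(0.02,  0.04, -0.01))

# Ensemble prediction: constant plus the sum of shrunken tree predictions
ensemble <- init + rowSums(pmat)

# Staged predictions: column k uses only the first k trees
staged <- init + t(apply(pmat, 1, cumsum))
```

The last column of staged matches ensemble, which is the same quantity one would expect from restricting the ensemble with the ntree argument of predict().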

Examples

# Simulate data from the Friedman 1 benchmark problem
set.seed(1025)  # for reproducibility
trn <- gen_friedman1(500)  # training data
tst <- gen_friedman1(500)  # test data

# Gradient boosted decision trees
set.seed(1027)  # for reproducibility
bst <- ladboost(subset(trn, select = -y), y = trn$y, depth = 2)
pred <- predict(bst, newdata = tst)
mean((pred - tst$y) ^ 2)  # test mean squared error
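Because LAD boosting targets absolute error, the test mean absolute error is a natural companion to the squared-error summary above. The following sketch refits the same model and uses the documented ntree argument of predict() to compare ensembles of different sizes. The tree counts 25, 50, and 100 are arbitrary choices for illustration, and the block runs only if treemisc is installed.

```r
if (requireNamespace("treemisc", quietly = TRUE)) {
  library(treemisc)

  # Same simulated data and model as in the example above
  set.seed(1025)
  trn <- gen_friedman1(500)
  tst <- gen_friedman1(500)
  set.seed(1027)
  bst <- ladboost(subset(trn, select = -y), y = trn$y, depth = 2)

  # Test MAE using the first 25, the first 50, and all 100 trees
  mae <- sapply(c(25, 50, 100), function(k) {
    mean(abs(predict(bst, newdata = tst, ntree = k) - tst$y))
  })
  print(mae)
}
```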

bgreenwell/treemisc documentation built on Oct. 26, 2022, 12:56 a.m.