summary.GBMFit: Summary of a GBMFit object

View source: R/gbm-summary.r

Description

Computes the relative influence of each variable in the GBMFit object.

Usage

## S3 method for class 'GBMFit'
summary(
  object,
  cBars = length(object$variables$var_names),
  num_trees = length(trees(object)),
  plot_it = TRUE,
  order_it = TRUE,
  method = relative_influence,
  normalize = TRUE,
  ...
)

Arguments

object

a GBMFit object created from an initial call to gbmt.

cBars

the number of bars to plot. If order_it=TRUE, only the cBars variables with the largest relative influence appear in the barplot. If order_it=FALSE, the first cBars variables appear. In either case, the function returns the relative influence of all of the variables.

num_trees

the number of trees used to generate the plot. Only the first num_trees trees will be used.

plot_it

a logical indicating whether the plot is generated.

order_it

a logical indicating whether the plotted and/or returned relative influences are sorted in decreasing order.

method

the function used to compute the relative influence. relative_influence is the default and is the same as the measure described in Friedman (2001). The other current (and experimental) choice is permutation_relative_influence. This method randomly permutes each predictor variable one at a time and computes the associated reduction in predictive performance. It is similar to the variable importance measure Breiman uses for random forests, but gbm3 currently computes it on the entire training dataset (not the out-of-bag observations).

normalize

if FALSE, summary.GBMFit returns the unnormalized influence.

...

other arguments passed to the plot function.

Details

For GBMGaussianDist this returns exactly the reduction in squared error attributable to each variable. For other loss functions it returns the reduction, attributable to each variable, in the sum of squared error incurred in predicting the gradient on each iteration. It describes the relative influence of each variable in reducing the loss function. See the references below for exact details on the computation.

Value

Returns a data frame whose first column is the variable name and whose second column is the computed relative influence, normalized to sum to 100 (when normalize=TRUE).

Author(s)

James Hickey, Greg Ridgeway gregridgeway@gmail.com

References

J.H. Friedman (2001). "Greedy Function Approximation: A Gradient Boosting Machine," Annals of Statistics 29(5):1189-1232.

L. Breiman (2001). "Random Forests," Machine Learning 45(1):5-32.

See Also

gbmt
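Examples

A minimal sketch of a typical call. The simulated data, and the gbmt() and gbm_dist() argument names, are illustrative assumptions based on common gbm3 usage rather than details stated on this page:

## Simulate a small regression problem
set.seed(1)
N <- 1000
X1 <- runif(N)
X2 <- runif(N)
Y <- X1 + rnorm(N, sd = 0.1)
dat <- data.frame(Y = Y, X1 = X1, X2 = X2)

## Fit a model with gbmt
fit <- gbmt(Y ~ X1 + X2, data = dat, distribution = gbm_dist("Gaussian"))

## Relative influence, sorted and plotted (the defaults)
summary(fit)

## Unnormalized influence, without a plot
summary(fit, plot_it = FALSE, normalize = FALSE)

## Experimental permutation-based measure
summary(fit, method = permutation_relative_influence)

Since X2 is unrelated to Y in this simulation, nearly all of the relative influence should be assigned to X1.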


gbm-developers/gbm3 documentation built on March 8, 2024, 4:48 p.m.