xgb.plot.deepness: Plot model trees deepness

View source: R/xgb.plot.deepness.R

xgb.ggplot.deepness    R Documentation

Plot model trees deepness

Description

Visualizes distributions related to the depth of tree leaves. xgb.plot.deepness uses base R graphics, while xgb.ggplot.deepness uses the ggplot2 backend.

Usage

xgb.ggplot.deepness(
  model = NULL,
  which = c("2x1", "max.depth", "med.depth", "med.weight")
)

xgb.plot.deepness(
  model = NULL,
  which = c("2x1", "max.depth", "med.depth", "med.weight"),
  plot = TRUE,
  ...
)

Arguments

model

either an xgb.Booster model generated by the xgb.train function or a data.table result of the xgb.model.dt.tree function.

which

which distribution to plot (see details).

plot

(base R barplot) whether a barplot should be produced. If FALSE, only a data.table is returned.

...

other parameters passed to barplot or plot.

Details

When which="2x1", two distributions with respect to the leaf depth are plotted on top of each other:

  • the distribution of the number of leaves in a tree model at a certain depth;

  • the distribution of the average weighted number of observations ("cover") ending up in leaves at a certain depth.

These can help in determining sensible ranges for the max_depth and min_child_weight parameters.
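
As a rough illustration of the two per-depth summaries described above, the following base R sketch computes them from a small hand-made leaf table. The data frame and its column names (Tree, Depth, Cover) are hypothetical and chosen for illustration only; they are not taken from a fitted model, and the package's internal computation may differ in detail.

```r
# Hypothetical toy leaf table: one row per terminal leaf.
# Column names and values are illustrative assumptions, not real model output.
leaves <- data.frame(
  Tree  = c(0, 0, 0, 1, 1, 1, 1),
  Depth = c(1, 2, 2, 2, 2, 3, 3),
  Cover = c(50, 30, 20, 40, 10, 25, 25)
)

# First distribution: number of leaves found at each depth.
leaf_counts <- table(leaves$Depth)

# Second distribution: average cover of the leaves at each depth.
mean_cover <- tapply(leaves$Cover, leaves$Depth, mean)

print(leaf_counts)
print(mean_cover)
```

If most leaves sit well below the configured maximum depth, or deep leaves carry very little cover, that may suggest a smaller max_depth or a larger min_child_weight is worth trying.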

When which="max.depth" or which="med.depth", the maximum or median leaf depth per tree is plotted against the tree number. Setting which="med.weight" shows how a tree's median absolute leaf weight changes over the iterations.
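
The per-tree summaries behind these three options can be sketched in base R as follows. The toy table and its column names (Tree, Depth, Weight) are again hypothetical; this is only meant to show what "maximum depth", "median depth", and "median absolute leaf weight" per tree mean, not how the package computes them internally.

```r
# Hypothetical toy leaf table: one row per terminal leaf.
# Column names and values are illustrative assumptions, not real model output.
leaves <- data.frame(
  Tree   = c(0, 0, 0, 1, 1, 1, 1),
  Depth  = c(1, 2, 2, 2, 2, 3, 3),
  Weight = c(0.4, -0.2, 0.1, 0.3, -0.3, 0.05, -0.15)
)

# Deepest leaf in each tree (the quantity plotted for "max.depth").
max_depth <- tapply(leaves$Depth, leaves$Tree, max)

# Median leaf depth in each tree (the quantity plotted for "med.depth").
med_depth <- tapply(leaves$Depth, leaves$Tree, median)

# Median absolute leaf weight in each tree (the quantity plotted for "med.weight").
med_weight <- tapply(abs(leaves$Weight), leaves$Tree, median)

print(max_depth)
print(med_depth)
print(med_weight)
```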

This function was inspired by the blog post https://github.com/aysent/random-forest-leaf-visualization.

Value

Other than producing plots (when plot=TRUE), the xgb.plot.deepness function silently returns a processed data.table in which each row corresponds to a terminal leaf in a tree model and contains information about the leaf's depth, cover, and weight (which is used in calculating predictions).

The xgb.ggplot.deepness function silently returns either a list of two ggplot graphs when which="2x1" or a single ggplot graph for the other which options.

See Also

xgb.train, xgb.model.dt.tree.

Examples


library(xgboost)

data(agaricus.train, package='xgboost')
## Keep the number of threads to 2 for examples
nthread <- 2
data.table::setDTthreads(nthread)

## Change max_depth to a higher number to get a more significant result
bst <- xgboost(data = agaricus.train$data, label = agaricus.train$label, max_depth = 6,
               eta = 0.1, nthread = nthread, nrounds = 50, objective = "binary:logistic",
               subsample = 0.5, min_child_weight = 2)

xgb.plot.deepness(bst)
xgb.ggplot.deepness(bst)

xgb.plot.deepness(bst, which='max.depth', pch=16, col=rgb(0,0,1,0.3), cex=2)

xgb.plot.deepness(bst, which='med.weight', pch=16, col=rgb(0,0,1,0.3), cex=2)


xgboost documentation built on Sept. 11, 2024, 8:26 p.m.