View source: R/xgb.plot.importance.R
| xgb.ggplot.importance | R Documentation |
Represents previously calculated feature importance as a bar graph.
xgb.plot.importance() uses base R graphics, while
xgb.ggplot.importance() uses "ggplot".
xgb.ggplot.importance(
  importance_matrix = NULL,
  top_n = NULL,
  measure = NULL,
  rel_to_first = FALSE,
  n_clusters = seq_len(10),
  ...
)

xgb.plot.importance(
  importance_matrix = NULL,
  top_n = NULL,
  measure = NULL,
  rel_to_first = FALSE,
  left_margin = 10,
  cex = NULL,
  plot = TRUE,
  ...
)
| importance_matrix | A "data.table" as returned by xgb.importance(). |
| top_n | Maximal number of top features to include in the plot. |
| measure | The name of the importance measure to plot. When NULL, 'Gain' is used for trees and 'Weight' is used for gblinear. |
| rel_to_first | Whether importance values should be represented as relative to the highest ranked feature, see Details. |
| n_clusters | A numeric vector containing the min and the max range of the possible number of clusters of bars. |
| ... | Other parameters passed to graphics::barplot(). |
| left_margin | Adjust the left margin size to fit feature names. When NULL, the existing par("mar") is used. |
| cex | Passed as the cex.names parameter to graphics::barplot(). |
| plot | Should the barplot be shown? Default is TRUE. |
The graph represents each feature as a horizontal bar of length proportional to the importance of a feature. Features are sorted by decreasing importance. It works for both "gblinear" and "gbtree" models.
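The layout described above can be approximated in base R. The following is only a minimal sketch with made-up feature names and values (feat_a to feat_d are hypothetical), not the package's internal code:

```r
# Hypothetical importance values, deliberately unsorted (illustration only)
imp <- data.frame(
  Feature = c("feat_c", "feat_a", "feat_d", "feat_b"),
  Gain = c(0.12, 0.61, 0.06, 0.21)
)

# Sort features by decreasing importance
imp <- imp[order(imp$Gain, decreasing = TRUE), ]

# Widen the left margin so that feature names fit, keeping the old settings
old_par <- par(mar = c(5, 8, 2, 1))

# barplot() draws horizontal bars from the bottom up, so the order is
# reversed to put the most important feature at the top
barplot(
  rev(imp$Gain),
  names.arg = rev(imp$Feature),
  horiz = TRUE,
  las = 1,
  xlab = "Gain"
)

# Restore the previous graphical parameters
par(old_par)
```

Reversing the vectors before plotting is a design choice of this sketch: it keeps the table itself sorted from most to least important while still showing the top feature first.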
When rel_to_first = FALSE, the values are plotted as they appear in importance_matrix.
For a "gbtree" model, this means they are normalized to sum to 1
("what is a feature's importance contribution relative to the whole model?").
For linear models, rel_to_first = FALSE shows the actual values of the coefficients.
Setting rel_to_first = TRUE instead answers the question
"what is a feature's importance contribution relative to the most important feature?"
The "ggplot" backend performs 1-D clustering of the importance values, with bar colors corresponding to different clusters having similar importance values.
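The relative scaling and the 1-D clustering described above can be sketched in base R. The values below are made up, the scaling is assumed to be a plain division by the largest value, and stats::kmeans() with a fixed number of clusters is used purely as a stand-in; the clustering method and the way the package picks the number of clusters from n_clusters may differ:

```r
# Hypothetical normalized importance values for a tree model (they sum to 1)
gain <- c(feat_a = 0.48, feat_b = 0.40, feat_c = 0.06, feat_d = 0.03,
          feat_e = 0.02, feat_f = 0.01)

# rel_to_first = TRUE: express each value relative to the top-ranked feature
# (assumed here to be a plain division by the largest value)
rel_gain <- gain / max(gain)

# 1-D clustering of the values; kmeans() is only a stand-in, and the number
# of clusters is fixed at 2 rather than chosen from a range
set.seed(1)
fit <- kmeans(rel_gain, centers = 2, nstart = 10)

# One color per cluster; vectors are reversed so the top feature is drawn first
cols <- c("steelblue", "tomato")[fit$cluster]
barplot(
  rev(rel_gain),
  horiz = TRUE,
  las = 1,
  col = rev(cols),
  xlab = "Importance relative to the top feature"
)
```

With these values, the two dominant features end up in one cluster and the four minor ones in the other, so bars of similar length share a color.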
The return value depends on the function:
xgb.plot.importance(): Invisibly, a "data.table" with the top_n features sorted
by importance. If plot = TRUE, the values are also plotted as a bar plot.
xgb.ggplot.importance(): A customizable "ggplot" object.
For example, to change the title, add + ggplot2::ggtitle("A GRAPH NAME").
graphics::barplot()
data(agaricus.train)
## Keep the number of threads to 2 for examples
nthread <- 2
data.table::setDTthreads(nthread)
model <- xgboost(
  agaricus.train$data, factor(agaricus.train$label),
  nrounds = 2,
  max_depth = 3,
  nthreads = nthread
)

importance_matrix <- xgb.importance(model)

xgb.plot.importance(
  importance_matrix, rel_to_first = TRUE, xlab = "Relative importance"
)

gg <- xgb.ggplot.importance(
  importance_matrix, measure = "Frequency", rel_to_first = TRUE
)
gg
gg + ggplot2::ylab("Frequency")