varimp | R Documentation |

In-bag risk reduction per base-learner as variable importance for boosting.

    ## S3 method for class 'mboost'
    varimp(object, ...)

    ## S3 method for class 'varimp'
    plot(x, percent = TRUE, type = c("variable", "blearner"),
         blorder = c("importance", "alphabetical", "rev_alphabetical", "formula"),
         nbars = 10L, maxchar = 20L, xlab = NULL, ylab = NULL, xlim, auto.key, ...)

    ## S3 method for class 'varimp'
    as.data.frame(x, row.names = NULL, optional = FALSE, ...)

`object` |
an object of class |

`x` |
an object of class |

`percent` |
logical, indicating whether variable importance should be specified in percent. |

`type` |
a character string specifying whether to draw bars for variables
( |

`blorder` |
a character string specifying the order of the base-learners
in the plot. The default |

`nbars` |
integer, maximum number of bars to be plotted. If |

`maxchar` |
integer, maximum number of characters in bar labels. |

`xlab` |
text for the x-axis label. If not set (default is |

`ylab` |
text for the y-axis label. If not set (default is |

`xlim` |
the x limits of the plot. Defaults are from |

`auto.key` |
logical, or a list passed to |

`...` |
additional arguments passed to |

`row.names` |
NULL or a character vector giving the row names for the data frame. Missing values are not allowed. |

`optional` |
logical. If TRUE, setting row names and converting column names (to syntactic names: see make.names) is optional. |

This function extracts the in-bag risk reductions per boosting step of a
fitted `mboost` model and accumulates them individually for each base-learner
contained in the model. This quantifies the individual contribution of each
base-learner to the risk reduction and can thus be used to compare the
importance of different base-learners or variables in the model. Starting from
the offset-only model, the risk reduction in each boosting step is computed as
the difference between the in-bag risk of the previous and the current model
and is attributed to the base-learner selected in that step.
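The accumulation described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the risk values and base-learner names are made up, and `risk_path` and `selected_bl` are hypothetical stand-ins for the in-bag risk trajectory and the sequence of selected base-learners of a fitted model.

```r
# Toy in-bag risk trajectory (made-up values): the first entry belongs to the
# offset-only model, the following entries to the model after each of five
# boosting steps.
risk_path <- c(10.0, 8.0, 7.0, 6.5, 6.0, 5.8)

# Hypothetical base-learner selected in each of the five steps.
selected_bl <- c("bols(x1)", "bbs(x2)", "bols(x1)", "bbs(x2)", "bols(x1)")

# Risk reduction per step: risk of the previous model minus risk of the
# current model.
risk_reduction <- -diff(risk_path)

# Accumulate the per-step reductions separately for each base-learner.
importance <- vapply(split(risk_reduction, selected_bl), sum, numeric(1))

# Share of the total risk reduction in percent.
importance_pct <- 100 * importance / sum(importance)
```

Each step's reduction is credited only to the base-learner selected in that step, so the accumulated values sum to the total in-bag risk reduction from the offset-only model to the final model.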

The results can be plotted in a bar plot either for the base-learners or for
the variables contained in the model. The bars are ordered according to
variable importance. If their number exceeds `nbars`, the least important are
summarized as "other". If bars are plotted per variable, all base-learners
containing the same variable are accumulated in a stacked bar. This is useful
for models that include, for example, separate base-learners for the linear and
non-linear part of a covariate effect (see `?bbs`, option `center=TRUE`).
However, variable interactions are treated as individual variables, as their
desired handling might depend on context.
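The per-variable stacking and the pooling into "other" can be illustrated with a small base R sketch. All values and names below are made up, and whether the "other" bar counts toward `nbars` is an assumption of this sketch; the plot method may handle that detail differently.

```r
# Made-up accumulated risk reductions per base-learner (illustrative only).
bl_importance <- data.frame(
  blearner  = c("bols(x1)", "bbs(x1)", "bols(x2)", "bbs(x3)", "bols(x4)"),
  variable  = c("x1", "x1", "x2", "x3", "x4"),
  reduction = c(2.0, 1.5, 0.8, 0.4, 0.1)
)

# Stack all base-learners that share a variable into one total per variable,
# then order the totals by importance.
var_importance <- sort(
  vapply(split(bl_importance$reduction, bl_importance$variable), sum, numeric(1)),
  decreasing = TRUE
)

# Keep the most important bars and pool the remainder as "other".
# Assumption: "other" occupies one of the nbars bars.
nbars <- 3
if (length(var_importance) > nbars) {
  keep <- var_importance[seq_len(nbars - 1)]
  pooled <- sum(var_importance[-seq_len(nbars - 1)])
  var_importance <- c(keep, other = pooled)
}
```

Here the linear and smooth base-learners of `x1` end up in the same bar, while the two least important variables are pooled.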

For comparison, the selection frequencies are added to the respective base-learner labels in the plot (rounded to three digits). For stacked bars they are ordered accordingly.

An object of class `varimp` with available `plot` and `as.data.frame` methods.

Converting a `varimp` object results in a `data.frame` containing the
risk reductions, selection frequencies and the corresponding base-learner and
variable names as ordered `factors` (ordered according to their particular
importance).

Tobias Kuehn (tobi.kuehn@gmx.de), Almond Stoecker (almond.stoecker@gmail.com)

    library(mboost)  # provides glmboost(), gamboost() and varimp()

    data(iris)

    ### glmboost with multiple variables and intercept
    iris$setosa <- factor(iris$Species == "setosa")
    iris_glm <- glmboost(setosa ~ 1 + Sepal.Width + Sepal.Length + Petal.Width +
                             Petal.Length,
                         data = iris, control = boost_control(mstop = 50),
                         family = Binomial(link = c("logit")))
    varimp(iris_glm)

    ### importance plot with four bars only
    plot(varimp(iris_glm), nbars = 4)

    ### gamboost with multiple variables
    iris_gam <- gamboost(Sepal.Width ~
                             bols(Sepal.Length, by = setosa) +
                             bbs(Sepal.Length, by = setosa, center = TRUE) +
                             bols(Petal.Width) +
                             bbs(Petal.Width, center = TRUE) +
                             bols(Petal.Length) +
                             bbs(Petal.Length, center = TRUE),
                         data = iris)
    varimp(iris_gam)

    ### stacked importance plot with base-learners in rev. alphabetical order
    plot(varimp(iris_gam), blorder = "rev_alphabetical")

    ### similar ggplot
    ## Not run: 
    library(ggplot2)
    ggplot(data.frame(varimp(iris_gam)), aes(variable, reduction, fill = blearner)) +
        geom_bar(stat = "identity") + coord_flip()
    ## End(Not run)

