| vi_firm | R Documentation | 
Compute variance-based variable importance (VI) scores using a simple feature importance ranking measure (FIRM) approach; for details, see Greenwell et al. (2018) and Scholbeck et al. (2019).
vi_firm(object, ...)
## Default S3 method:
vi_firm(
  object,
  feature_names = NULL,
  train = NULL,
  var_fun = NULL,
  var_continuous = stats::sd,
  var_categorical = function(x) diff(range(x))/4,
  ...
)
| object | A fitted model object (e.g., a randomForest object). | 
| ... | Additional optional arguments to be passed on to the pdp::partial() function. | 
| feature_names | Character string giving the names of the predictor
variables (i.e., features) of interest. If NULL (the default), they will be
inferred from the training data. | 
| train | A matrix-like R object (e.g., a data frame or matrix)
containing the training data. If NULL (the default), vi_firm() will try to
extract the training data from the fitted model object. | 
| var_fun | Deprecated; use var_continuous and var_categorical instead. | 
| var_continuous | Function used to quantify the variability of effects
for continuous features. Defaults to using the sample standard deviation
(i.e., stats::sd()). | 
| var_categorical | Function used to quantify the variability of effects
for categorical features. Defaults to using the range divided by four; that
is, function(x) diff(range(x)) / 4. | 
This approach is based on quantifying the relative "flatness" of the
effect of each feature and assumes the user has some familiarity with the
pdp::partial() function. Feature effects can be assessed
using partial dependence (PD) plots (Friedman, 2001) or
individual conditional expectation (ICE) plots (Goldstein et al., 2015).
These methods are model-agnostic and can be applied to any supervised
learning algorithm. By default, relative "flatness" is defined by computing
the standard deviation of the y-axis values of each feature effect plot for
numeric features; for categorical features, the default is to use the range
divided by four. These defaults can be changed via the var_continuous and
var_categorical arguments. See
Greenwell et al. (2018) for details and
additional examples.
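As a rough sketch of the idea (assuming the pdp package is installed), the FIRM score for a single continuous feature is just the standard deviation of its partial dependence values, mirroring the var_continuous = stats::sd default:

```r
# Sketch of the FIRM idea for one continuous feature; assumes the pdp
# package is installed. The model and feature (wt) mirror the example below.
library(pdp)

mtcars.ppr <- ppr(mpg ~ ., data = mtcars, nterms = 1)

# Partial dependence of mpg on wt, evaluated over a grid of wt values
pd <- partial(mtcars.ppr, pred.var = "wt", train = mtcars)

# FIRM score for wt: variability of the y-axis (yhat) values; this is what
# the var_continuous = stats::sd default computes for each feature
sd(pd$yhat)
```

A flat PD curve yields a score near zero (the feature has little effect); a steep or wiggly curve yields a large score.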
A tidy data frame (i.e., a tibble object) with two columns:
Variable - the corresponding feature name;
Importance - the associated importance, computed as described in
Greenwell et al. (2018).
This approach can provide misleading results in the presence of interaction effects (akin to interpreting main effect coefficients in a linear model that includes higher-order interaction terms).
J. H. Friedman. Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29: 1189-1232, 2001.
Goldstein, A., Kapelner, A., Bleich, J., and Pitkin, E. Peeking Inside the Black Box: Visualizing Statistical Learning With Plots of Individual Conditional Expectation. Journal of Computational and Graphical Statistics, 24(1): 44-65, 2015.
Greenwell, B. M., Boehmke, B. C., and McCarthy, A. J. A Simple and Effective Model-Based Variable Importance Measure. arXiv preprint arXiv:1805.04755 (2018).
Scholbeck, C. A., Molnar, C., Heumann, C., Bischl, B., and Casalicchio, G. Sampling, Intervention, Prediction, Aggregation: A Generalized Framework for Model-Agnostic Interpretations. arXiv preprint arXiv:1904.03959 (2019).
## Not run: 
#
# A projection pursuit regression example
#
# Load the sample data
data(mtcars)
# Fit a projection pursuit regression model
mtcars.ppr <- ppr(mpg ~ ., data = mtcars, nterms = 1)
# Compute variable importance scores using the FIRM method; note that the pdp
# package knows how to work with a "ppr" object, so there's no need to pass
# the training data or a prediction wrapper, but it's good practice.
vi_firm(mtcars.ppr, train = mtcars)
# For unsupported models, you need to define a prediction wrapper; this
# approach will work for ANY model (supported or unsupported), so it's good
# practice to always define and pass one
pfun <- function(object, newdata) {
  # To use partial dependence, this function needs to return the AVERAGE
  # prediction (for ICE, simply omit the averaging step)
  mean(predict(object, newdata = newdata))
}
# Equivalent to the previous results (but would work if this type of model
# was not explicitly supported)
vi_firm(mtcars.ppr, pred.fun = pfun, train = mtcars)
# Equivalent VI scores, but the output is sorted by default
vi(mtcars.ppr, method = "firm")
# Use MAD to estimate variability of the partial dependence values
vi_firm(mtcars.ppr, var_continuous = stats::mad)
# Plot VI scores
vip(mtcars.ppr, method = "firm", train = mtcars, pred.fun = pfun)
## End(Not run)