
importance.pre

R Documentation

Calculate importances of baselearners and input variables in a prediction rule ensemble (pre)

Description

importance.pre calculates importances for rules, linear terms and input variables in the prediction rule ensemble (pre), and creates a bar plot of variable importances.

Usage

## S3 method for class 'pre'
importance(
  x,
  standardize = FALSE,
  global = TRUE,
  penalty.par.val = "lambda.1se",
  gamma = NULL,
  quantprobs = c(0.75, 1),
  round = NA,
  plot = TRUE,
  ylab = "Importance",
  main = "Variable importances",
  abbreviate = 10L,
  diag.xlab = TRUE,
  diag.xlab.hor = 0,
  diag.xlab.vert = 2,
  cex.axis = 1,
  legend = "topright",
  ...
)

Arguments

`x`	an object of class `pre`
`standardize`	logical. Should baselearner importances be standardized with respect to the outcome variable? If `TRUE`, baselearner importances have a minimum of 0 and a maximum of 1. Only used for ensembles with numeric (non-count) response variables.
`global`	logical. Should global importances be calculated? If `FALSE`, local importances will be calculated, given the quantiles of the predictions F(x) in `quantprobs`.
`penalty.par.val`	character or numeric. Value of the penalty parameter `\lambda` to be employed for selecting the final ensemble. The default `"lambda.1se"` employs the `\lambda` value within 1 standard error of the minimum cross-validated error. Alternatively, `"lambda.min"` may be specified, to employ the `\lambda` value with minimum cross-validated error, or a numeric value `> 0` may be specified, with higher values yielding a sparser ensemble. To evaluate the trade-off between accuracy and sparsity of the final ensemble, inspect `pre_object$glmnet.fit` and `plot(pre_object$glmnet.fit)`.
`gamma`	Mixing parameter for relaxed fits. See `coef.cv.glmnet`.
`quantprobs`	optional numeric vector of length two. Only used when `global = FALSE`. Probabilities for calculating sample quantiles of the range of F(X), over which local importances are calculated. The default provides variable importances calculated over the 25% highest values of F(X).
`round`	integer. Number of decimal places to round numeric results to. If `NA` (default), no rounding is performed.
`plot`	logical. Should variable importances be plotted?
`ylab`	character string. Plotting label for y-axis. Only used when `plot = TRUE`.
`main`	character string. Main title of the plot. Only used when `plot = TRUE`.
`abbreviate`	integer or logical. Number of characters to abbreviate x axis names to. If `FALSE`, no abbreviation is performed.
`diag.xlab`	logical. Should variable names be printed diagonally (that is, at a 45-degree angle)? Alternatively, variable names may be printed vertically by specifying `diag.xlab = FALSE` and `las = 2`.
`diag.xlab.hor`	numeric. Horizontal adjustment for lining up variable names with bars in the plot if variable names are printed diagonally.
`diag.xlab.vert`	positive integer. Vertical adjustment for position of variable names, if printed diagonally. Corresponds to the number of character spaces added after variable names.
`cex.axis`	numeric. The magnification to be used for axis annotation relative to the current setting of `cex`.
`legend`	logical or character. Should legend be plotted for multinomial or multivariate responses and if so, where? Defaults to `"topright"`, which puts the legend in the top-right corner of the plot. Alternatively, `"bottomright"`, `"bottom"`, `"bottomleft"`, `"left"`, `"topleft"`, `"top"`, `"topright"`, `"right"`, `"center"` and `FALSE` (which omits the legend) can be specified.
`...`	further arguments to be passed to `barplot` (only used when `plot = TRUE`).
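
The sketch below illustrates how `penalty.par.val` and the plotting arguments might be combined. It assumes the `pre` package is installed and reuses the `airquality` data from the Examples section; the object names `imps_1se` and `imps_min` are made up for illustration.

```r
library(pre)

set.seed(42)
airq <- airquality[complete.cases(airquality), ]
airq.ens <- pre(Ozone ~ ., data = airq)

# Compute importances under both penalty choices; plot = FALSE skips the bar plot
imps_1se <- importance(airq.ens, penalty.par.val = "lambda.1se", plot = FALSE)
imps_min <- importance(airq.ens, penalty.par.val = "lambda.min", plot = FALSE)

# "lambda.min" applies a weaker penalty, so it will typically (not always)
# retain at least as many baselearners as "lambda.1se"
nrow(imps_1se$baseimps)
nrow(imps_min$baseimps)

# Plot with unabbreviated, vertically printed variable names;
# las = 2 is passed on to barplot() through `...`
importance(airq.ens, abbreviate = FALSE, diag.xlab = FALSE, las = 2)
```

Comparing the two tables gives a quick impression of how much the sparser default ensemble leaves out.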

Details

See also sections 6 and 7 of Friedman & Popescu (2008).
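
As a rough illustration of the global importance definitions given in those sections, the following base R sketch computes them by hand for a toy ensemble. The data, the two rules, the linear term and the coefficients `a1`, `a2` and `b` are all hypothetical, and the code is not meant to reproduce how `pre` computes importances internally.

```r
# Toy data: two binary rules and one linear term (all values are made up)
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)

rule1  <- as.numeric(x1 > 0)           # rule involving x1 only
rule2  <- as.numeric(x1 > 0 & x2 < 1)  # rule involving x1 and x2
linear <- x2                           # linear term in x2

# Hypothetical coefficients of the final ensemble
a1 <- 1.5
a2 <- -0.8
b  <- 0.4

# Rule importance: |a_k| * sqrt(s_k * (1 - s_k)), with s_k the rule's support
# (the proportion of observations to which the rule applies)
rule_imp <- function(coef, rule) {
  s <- mean(rule)
  abs(coef) * sqrt(s * (1 - s))
}
imp_rule1 <- rule_imp(a1, rule1)
imp_rule2 <- rule_imp(a2, rule2)

# Linear-term importance: |b_j| * standard deviation of the term
# (sd() uses the n - 1 denominator, a negligible difference here)
imp_lin <- abs(b) * sd(linear)

# Variable importance: the variable's linear-term importance plus each rule's
# importance divided equally among the variables appearing in that rule
imp_x1 <- imp_rule1 + imp_rule2 / 2
imp_x2 <- imp_lin + imp_rule2 / 2

c(x1 = imp_x1, x2 = imp_x2)
```

Because each rule's importance is split equally over its variables, the variable importances sum to the total of the baselearner importances.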

Value

A list with two data frames: `$baseimps`, giving the importances for baselearners in the ensemble, and `$varimps`, giving the importances for all predictor variables.
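
A minimal sketch of inspecting the returned list, assuming the `pre` package is installed; the object name `imps` is chosen only for illustration.

```r
library(pre)

set.seed(42)
airq.ens <- pre(Ozone ~ ., data = airquality[complete.cases(airquality), ])

# Skip the plot and round numeric results to three decimal places
imps <- importance(airq.ens, plot = FALSE, round = 3L)

# The two data frames described above
names(imps)
head(imps$varimps)   # importances of the predictor variables
head(imps$baseimps)  # importances of rules and linear terms in the ensemble
```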

References

Fokkema, M. (2020). Fitting prediction rule ensembles with R package pre. Journal of Statistical Software, 92(12), 1-30. doi:10.18637/jss.v092.i12

Fokkema, M. & Strobl, C. (2020). Fitting prediction rule ensembles to psychological research data: An introduction and tutorial. Psychological Methods, 25(5), 636-652. doi:10.1037/met0000256, https://arxiv.org/abs/1907.05302

Friedman, J. H., & Popescu, B. E. (2008). Predictive learning via rule ensembles. The Annals of Applied Statistics, 2(3), 916-954. doi:10.1214/07-AOAS148

Examples

set.seed(42)
airq.ens <- pre(Ozone ~ ., data = airquality[complete.cases(airquality),])
# calculate global importances:
importance(airq.ens)
# calculate local importances (default: over 25% highest predicted values):
importance(airq.ens, global = FALSE)
# calculate local importances (custom: over 25% lowest predicted values):
importance(airq.ens, global = FALSE, quantprobs = c(0, .25))

pre documentation built on May 29, 2024, 5:10 a.m.