light_interaction: Interaction Strength
In mayer79/flashlight: Shed Light on Black Box Machine Learning Models


Description

This function provides Friedman's H statistic for overall interaction strength per covariable, as well as its version for pairwise interactions; see the reference below.
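For orientation, the two statistics of Friedman and Popescu (2008) can be written as below. The notation is an assumption of this note, not taken from the package: $\hat F$ is the prediction function, $\hat F_j$ and $\hat F_{jk}$ are partial dependence functions of one variable and of a variable pair, $\hat F_{\setminus j}$ is the partial dependence on all variables except $x_j$, and every function is centered to mean zero over the $n$ observations.

```latex
H_j^2 = \frac{\sum_{i=1}^{n} \left[ \hat F(x_i) - \hat F_j(x_{ij}) - \hat F_{\setminus j}(x_{i \setminus j}) \right]^2}
             {\sum_{i=1}^{n} \hat F^2(x_i)},
\qquad
H_{jk}^2 = \frac{\sum_{i=1}^{n} \left[ \hat F_{jk}(x_{ij}, x_{ik}) - \hat F_j(x_{ij}) - \hat F_k(x_{ik}) \right]^2}
                {\sum_{i=1}^{n} \hat F_{jk}^2(x_{ij}, x_{ik})}
```

The first statistic measures overall interaction strength per variable, the second the strength of a pairwise interaction.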

Usage

light_interaction(x, ...)

## Default S3 method:
light_interaction(x, ...)

## S3 method for class 'flashlight'
light_interaction(
  x,
  data = x$data,
  by = x$by,
  v = NULL,
  pairwise = FALSE,
  type = c("H", "ice"),
  normalize = TRUE,
  take_sqrt = TRUE,
  grid_size = 200L,
  n_max = 1000L,
  seed = NULL,
  use_linkinv = FALSE,
  ...
)

## S3 method for class 'multiflashlight'
light_interaction(x, ...)

Arguments

`x`	An object of class "flashlight" or "multiflashlight".
`...`	Further arguments passed to or from other methods.
`data`	An optional `data.frame`.
`by`	An optional vector of column names used to additionally group the results.
`v`	Vector of variable names to be assessed.
`pairwise`	Should overall interaction strength per variable be shown or pairwise interactions? Defaults to `FALSE`.
`type`	Are measures based on Friedman's H statistic ("H") or on "ice" curves? Option "ice" is available only if `pairwise = FALSE`.
`normalize`	Should the variances explained be normalized? Default is `TRUE` in order to reproduce Friedman's H statistic.
`take_sqrt`	In order to reproduce Friedman's H statistic, resulting values are root transformed. Set to `FALSE` if squared values should be returned.
`grid_size`	Grid size used to form the outer product. Will be randomly picked from data (after limiting to `n_max`).
`n_max`	Maximum number of data rows to consider. Will be randomly picked from `data` if necessary.
`seed`	An integer random seed used for subsampling.
`use_linkinv`	Should the retransformation function be applied? Default is `FALSE`.
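The following calls sketch how the main arguments combine. The model, the object names (`fit`, `fl`, `h1`, ...) and the small values for `grid_size` and `n_max` are chosen for illustration only and are not recommendations.

```r
library(flashlight)

# Illustrative model with one interaction term
fit <- lm(Sepal.Length ~ . + Sepal.Width:Species, data = iris)
fl <- flashlight(model = fit, label = "lm", data = iris, y = "Sepal.Length")

# Overall H for two selected variables on a small subsample.
# Per variable, predictions are made on up to
# grid_size * n_max = 30 * 100 = 3000 rows.
h1 <- light_interaction(
  fl, v = c("Sepal.Width", "Species"), grid_size = 30, n_max = 100, seed = 1
)
h2 <- light_interaction(
  fl, v = c("Sepal.Width", "Species"), grid_size = 30, n_max = 100, seed = 1
)
all.equal(h1$data, h2$data)  # same seed, so the same subsample is drawn

# Pairwise H for all pairs formed from three variables
h_pair <- light_interaction(
  fl,
  v = c("Sepal.Width", "Petal.Width", "Species"),
  pairwise = TRUE,
  grid_size = 30,
  n_max = 100,
  seed = 1
)

# ICE-based alternative (only available with pairwise = FALSE)
h_ice <- light_interaction(fl, type = "ice", seed = 1)
```

Fixing `seed` makes the subsampling repeatable, which helps when comparing results across models or settings.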

Details

As a fast alternative for assessing overall interaction strength, type = "ice" offers a method based on centered ICE (c-ICE) curves: the corresponding H* statistic measures how much of the variability of a c-ICE curve is left unexplained by the main effect. As with Friedman's H statistic, it can be useful to consider unnormalized or squared values (see the next paragraph).

Friedman's H statistic relates the interaction strength of a variable (or variable pair) to its total effect strength, based on partial dependence curves. Because of this normalization step, even variables with low importance can have high values of H. The function light_interaction() therefore offers the option to skip normalization (normalize = FALSE), which allows a more direct comparison of interaction effects across variables (or pairs). The values of such unnormalized H statistics are on the scale of the response variable. Use take_sqrt = FALSE to return squared values of H.

Note that, in general, predictions for each variable (or pair) are made on a data set with grid_size * n_max rows, so be cautious about increasing the defaults too much. Even with larger grid_size and n_max, results can vary considerably across runs, so setting a seed is recommended.
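To make the role of normalize and take_sqrt concrete, here is a minimal base R sketch of the overall H statistic following Friedman and Popescu's definition. It is not the package's implementation: it uses all rows instead of a random grid, and the helper names pd_at_rows() and friedman_h_overall() are hypothetical.

```r
# Partial dependence on the columns `v`, evaluated at every row of X:
# for row i, the `v` columns of all rows are set to row i's values and
# the resulting predictions are averaged.
pd_at_rows <- function(fit, X, v) {
  vapply(seq_len(nrow(X)), function(i) {
    X_mod <- X
    X_mod[v] <- X[rep(i, nrow(X)), v, drop = FALSE]
    mean(predict(fit, newdata = X_mod))
  }, numeric(1))
}

# Overall interaction strength of `v` (hypothetical helper).
# The arguments mirror normalize and take_sqrt of light_interaction().
friedman_h_overall <- function(fit, X, v, normalize = TRUE, take_sqrt = TRUE) {
  center <- function(z) z - mean(z)
  f      <- center(predict(fit, newdata = X))                  # full prediction
  f_v    <- center(pd_at_rows(fit, X, v))                      # effect of v
  f_rest <- center(pd_at_rows(fit, X, setdiff(names(X), v)))   # all other features
  h2 <- mean((f - f_v - f_rest)^2)       # part not explained additively
  if (normalize) h2 <- h2 / mean(f^2)    # relate to total variability
  if (take_sqrt) sqrt(h2) else h2
}

X <- iris[, -1]  # feature columns only (Sepal.Length is the response)
fit_add <- lm(Sepal.Length ~ ., data = iris)
fit_nonadd <- lm(
  Sepal.Length ~ . + Sepal.Width:Species + Petal.Width:Species, data = iris
)

h_add    <- friedman_h_overall(fit_add, X, "Species")     # zero up to rounding
h_nonadd <- friedman_h_overall(fit_nonadd, X, "Species")  # positive
```

For the additive model the prediction decomposes exactly into the two partial dependence functions, so the statistic vanishes up to floating-point error. With normalize = FALSE and take_sqrt = TRUE, the helper returns a value on the scale of the response.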

The minimum required elements in the (multi-)flashlight are "predict_function", "model", and "data".

Value

An object of class "light_importance" with the following elements:

`data`	A tibble containing the results. Can be used to build fully customized visualizations. Column names can be controlled by `options(flashlight.column_name)`.
`by`	Same as input `by`.
`type`	Same as input `type`. For information only.
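As a brief illustration, the returned elements can be inspected directly. The model and the object names (`fit`, `fl`, `res`) are illustrative assumptions; column names of `res$data` are deliberately not hard-coded here because they may depend on the package options.

```r
library(flashlight)

fit <- lm(Sepal.Length ~ . + Petal.Width:Species, data = iris)
fl <- flashlight(model = fit, label = "lm", data = iris, y = "Sepal.Length")

res <- light_interaction(fl, seed = 1)

res$type         # the interaction measure that was used
res$by           # grouping variables, if any
str(res$data)    # tibble with the statistics, ready for custom plots
```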

Methods (by class)

light_interaction(default): Default method not implemented yet.
light_interaction(flashlight): Interaction strengths for a flashlight object.
light_interaction(multiflashlight): Interaction strengths for a multiflashlight object.

References

Friedman, J. H. and Popescu, B. E. (2008). "Predictive learning via rule ensembles." The Annals of Applied Statistics, 2(3), 916–954.

Examples

library(flashlight)

# First model with interactions
fit_nonadd <- lm(
  Sepal.Length ~ . + Sepal.Width:Species + Petal.Width:Species, data = iris
)
fl_nonadd <- flashlight(
  model = fit_nonadd, label = "nonadditive", data = iris, y = "Sepal.Length"
)

# Friedman's H per feature
plot(light_interaction(fl_nonadd), fill = "chartreuse4")

# Unnormalized, squared H (on the squared scale of the response)
plot(
  light_interaction(fl_nonadd, normalize = FALSE, take_sqrt = FALSE),
  fill = "chartreuse4"
)

# Pairwise H
plot(light_interaction(fl_nonadd, pairwise = TRUE), fill = "chartreuse4")

# Second model without interactions
fit_add <- lm(Sepal.Length ~ ., data = iris)
fl_add <- flashlight(
  model = fit_add, label = "additive", data = iris, y = "Sepal.Length"
)
fls <- multiflashlight(list(fl_add, fl_nonadd))

plot(light_interaction(fls), fill = "chartreuse4")

mayer79/flashlight documentation built on April 12, 2025, 3:49 p.m.