light_interaction: Interaction Strength

View source: R/light_interaction.R

light_interaction    R Documentation

Interaction Strength

Description

Computes Friedman's H statistic for the overall interaction strength per covariable, as well as its version for pairwise interactions (see the reference below).

Usage

light_interaction(x, ...)

## Default S3 method:
light_interaction(x, ...)

## S3 method for class 'flashlight'
light_interaction(
  x,
  data = x$data,
  by = x$by,
  v = NULL,
  pairwise = FALSE,
  type = c("H", "ice"),
  normalize = TRUE,
  take_sqrt = TRUE,
  grid_size = 200L,
  n_max = 1000L,
  seed = NULL,
  use_linkinv = FALSE,
  ...
)

## S3 method for class 'multiflashlight'
light_interaction(x, ...)

Arguments

x

An object of class "flashlight" or "multiflashlight".

...

Further arguments passed to or from other methods.

data

An optional data.frame.

by

An optional vector of column names used to additionally group the results.

v

Vector of variable names to be assessed.

pairwise

Should overall interaction strength per variable be shown or pairwise interactions? Defaults to FALSE.

type

Are measures based on Friedman's H statistic ("H") or on "ice" curves? Option "ice" is available only if pairwise = FALSE.

normalize

Should the variances explained be normalized? Default is TRUE in order to reproduce Friedman's H statistic.

take_sqrt

In order to reproduce Friedman's H statistic, resulting values are root transformed. Set to FALSE if squared values should be returned.

grid_size

Number of grid values used to form the outer product. The grid values are randomly picked from data (after limiting it to n_max rows).

n_max

Maximum number of data rows to consider. If data has more rows, n_max rows are randomly picked from it.

seed

An integer random seed used for subsampling.

use_linkinv

Should the retransformation function be applied? Default is FALSE.
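To make the roles of n_max, grid_size and seed more concrete, the following base R sketch builds the kind of "outer product" data set that the arguments above describe. It is only an illustration under stated assumptions, not the package's internal code: the object names (X, grid, idx, big) and the reduced values of n_max and grid_size are hypothetical.

```r
# Illustrative sketch only; not the internal code of light_interaction().
set.seed(1)            # plays the role of the `seed` argument
n_max <- 100L          # hypothetical values, smaller than the defaults
grid_size <- 20L
v <- "Petal.Width"

# Limit the data to at most n_max randomly chosen rows
X <- iris[sample(nrow(iris), min(n_max, nrow(iris))), ]

# Pick at most grid_size grid values of v from the reduced data
grid <- X[[v]][sample(nrow(X), min(grid_size, nrow(X)))]

# Outer product: every reduced row is combined with every grid value
idx <- expand.grid(row = seq_len(nrow(X)), g = seq_along(grid))
big <- X[idx$row, ]
big[[v]] <- grid[idx$g]

# Number of rows on which predictions would be needed
nrow(big) == grid_size * n_max
```

The row count grows as the product of the two settings, which is why raising both at once quickly becomes expensive.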

Details

As a fast alternative for assessing overall interaction strength, type = "ice" offers a method based on centered ICE (c-ICE) curves: the corresponding H* statistic measures how much of the variability of a c-ICE curve is unexplained by the main effect. As with Friedman's H statistic, it can be useful to consider unnormalized or squared values (see below).
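The idea behind the c-ICE based measure can be sketched in base R. The helper ice_h2() below is hypothetical and reflects one plausible reading of the description above (squared deviations of the centered curves from their average, divided by the squared centered curves); the package's actual computation may differ in details such as subsampling and weighting.

```r
# Hypothetical helper; a sketch of the idea, not the package's implementation.
ice_h2 <- function(fit, v, data, grid = unique(data[[v]])) {
  # ICE matrix: one row per observation, one column per grid value
  ice <- vapply(seq_along(grid), function(g) {
    newdata <- data
    newdata[[v]] <- grid[g]
    as.numeric(predict(fit, newdata))
  }, numeric(nrow(data)))
  c_ice <- ice - rowMeans(ice)     # center each curve at zero
  main <- colMeans(c_ice)          # main effect: average centered curve
  resid <- sweep(c_ice, 2, main)   # part of each curve not explained by it
  sum(resid^2) / sum(c_ice^2)      # normalized, squared statistic
}

fit_add <- lm(Sepal.Length ~ ., data = iris)
fit_nonadd <- lm(Sepal.Length ~ . + Petal.Width:Species, data = iris)

h2_add <- ice_h2(fit_add, "Petal.Width", iris)        # numerically zero
h2_nonadd <- ice_h2(fit_nonadd, "Petal.Width", iris)  # positive
```

For the additive model all centered curves coincide with their average, so the statistic vanishes up to floating point error.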

Friedman's H statistic relates the interaction strength of a variable (pair) to the total effect strength of that variable (pair), based on partial dependence curves. Because of this normalization step, even variables with low importance can have high values of H. The function light_interaction() offers the option to skip normalization (normalize = FALSE) in order to allow a more direct comparison of the interaction effects across variables (or variable pairs). The values of such unnormalized H statistics are on the scale of the response variable. Use take_sqrt = FALSE to return squared values of H. Note that, for each variable (pair), predictions are made on a data set with grid_size * n_max rows, so be cautious about increasing the defaults too much. Even with larger grid_size and n_max, results can vary considerably across runs, so setting a seed is recommended.
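The normalization described above can be illustrated with a from-scratch computation of the pairwise statistic, following the definition in Friedman and Popescu (2008). This is a minimal sketch, assuming no subsampling (all rows of iris serve as both grid and background data); the helpers pd(), center() and pairwise_h2() are hypothetical and not part of the package.

```r
# Hypothetical helpers; a sketch of the definition, not the package's code.

# Partial dependence on the columns v, evaluated at every row of data
pd <- function(fit, v, data) {
  vapply(seq_len(nrow(data)), function(i) {
    newdata <- data
    for (col in v) newdata[[col]] <- data[[col]][i]  # fix v at row i's values
    mean(predict(fit, newdata))
  }, numeric(1))
}

center <- function(z) z - mean(z)

pairwise_h2 <- function(fit, j, k, data) {
  pd_jk <- center(pd(fit, c(j, k), data))
  pd_j <- center(pd(fit, j, data))
  pd_k <- center(pd(fit, k, data))
  # Joint effect not explained by the two main effects,
  # relative to the total joint effect
  sum((pd_jk - pd_j - pd_k)^2) / sum(pd_jk^2)
}

fit_add <- lm(Sepal.Length ~ ., data = iris)
fit_nonadd <- lm(Sepal.Length ~ . + Petal.Width:Species, data = iris)

h2_add <- pairwise_h2(fit_add, "Petal.Width", "Species", iris)
h2_nonadd <- pairwise_h2(fit_nonadd, "Petal.Width", "Species", iris)
sqrt(h2_nonadd)  # corresponds to take_sqrt = TRUE
```

Returning only the numerator would correspond to the unnormalized variant. The additive model yields a value that is zero up to floating point error.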

The minimum required elements in the (multi-) flashlight are a "predict_function", "model", and "data".

Value

An object of class "light_importance" with the following elements:

  • data A tibble containing the results. Can be used to build fully customized visualizations. Column names can be controlled by options(flashlight.column_name).

  • by Same as input by.

  • type Same as input type. For information only.

Methods (by class)

  • light_interaction(default): Default method not implemented yet.

  • light_interaction(flashlight): Interaction strengths for a flashlight object.

  • light_interaction(multiflashlight): Interaction strengths for a multiflashlight object.

References

Friedman, J. H. and Popescu, B. E. (2008). "Predictive learning via rule ensembles." The Annals of Applied Statistics, 2(3), 916–954.

See Also

light_ice()

Examples

library(flashlight)

# First model with interactions
fit_nonadd <- lm(
  Sepal.Length ~ . + Sepal.Width:Species + Petal.Width:Species, data = iris
)
fl_nonadd <- flashlight(
  model = fit_nonadd, label = "nonadditive", data = iris, y = "Sepal.Length"
)

# Friedman's H per feature
plot(light_interaction(fl_nonadd), fill = "chartreuse4")

# Unnormalized, squared H: skip both the normalization and the square root
plot(
  light_interaction(fl_nonadd, normalize = FALSE, take_sqrt = FALSE),
  fill = "chartreuse4"
)

# Pairwise H
plot(light_interaction(fl_nonadd, pairwise = TRUE), fill = "chartreuse4")

# Second model without interactions
fit_add <- lm(Sepal.Length ~ ., data = iris)
fl_add <- flashlight(
  model = fit_add, label = "additive", data = iris, y = "Sepal.Length"
)
fls <- multiflashlight(list(fl_add, fl_nonadd))

plot(light_interaction(fls), fill = "chartreuse4")

mayer79/flashlight documentation built on Feb. 13, 2024, 1:09 p.m.