identify_extreme_leverages: Identify Extreme Leverage Points
In RobbyLankford/tidytest: Tidy Statistical Modeling Tests

identify_extreme_leverages

R Documentation

Identify Extreme Leverage Points

Description

A data point has extreme leverage when one or more of its predictor (x) values is extreme, or when its combination of predictor values is unusual. Such a point is potentially influential, meaning that it can have an outsized effect on the fitted regression.
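As a small illustration of the idea, the sketch below uses base R's `hatvalues()` on made-up data in which one predictor value lies far from the rest. The data and the names `x`, `y`, `fit`, and `h` are hypothetical and only serve the example; this is not code from tidytest.

```r
# Leverage depends only on the predictor values, not on the response.
x <- c(1:9, 30)                            # the last x value lies far from the others
y <- 2 + 0.5 * x + rep(c(-0.1, 0.1), 5)    # made-up response, for illustration only

fit <- lm(y ~ x)
h <- hatvalues(fit)                        # one leverage value per observation

which.max(h)                               # index of the observation with the largest leverage
sum(h)                                     # leverages sum to the number of coefficients
```

The observation with the distant x value receives the largest leverage, regardless of what its response value happens to be.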

Usage

identify_extreme_leverages(object, id = NULL, .multiplier = 3)

## S3 method for class 'lm'
identify_extreme_leverages(object, id = NULL, .multiplier = 3)

Arguments

`object`	A model object (such as a fitted `lm` object).
`id`	(Optional) A vector of values, the same length as the number of observations, used as an identifier for each data point. If left as NULL, the row number will be added as the ID column.
`.multiplier`	(Optional) The multiplier used to determine which leverages are considered "extreme". The default is 3, a common rule of thumb (see Details).

Details

An observation is flagged as an extreme leverage point when its leverage value exceeds the ratio of the number of model coefficients to the number of observations, multiplied by `.multiplier`. A traditional rule of thumb sets the multiplier to three.
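The rule can be sketched in base R as follows. This is a minimal illustration, not the package's own implementation: it assumes leverage is measured with `stats::hatvalues()` and that the coefficient count is `length(coef(fit))`, which includes the intercept. The names `threshold` and `extreme` are hypothetical.

```r
# Minimal sketch of the leverage cutoff in base R (not taken from tidytest).
fit <- lm(mpg ~ disp + wt + hp, data = mtcars)

leverage <- hatvalues(fit)         # diagonal of the hat matrix, one value per row
p <- length(coef(fit))             # assumption: coefficients counted with the intercept
n <- nobs(fit)                     # number of observations used in the fit
multiplier <- 3                    # mirrors the default `.multiplier`

threshold <- multiplier * p / n    # cutoff: multiplier * (p / n)
extreme <- leverage[leverage > threshold]
extreme                            # named vector of leverages above the cutoff
```

Lowering `multiplier` (for example to 2) makes the cutoff stricter and will generally flag more observations.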

Value

A tibble.

References

Kutner, M., Nachtsheim, C., Neter, J. and Li, W. (2005). Applied Linear Statistical Models. ISBN: 0-07-238688-6. McGraw-Hill/Irwin.

Examples

library(tidytest)

# `lm` method
mod_lm_fit <- lm(mpg ~ disp + wt + hp, data = mtcars)

identify_extreme_leverages(mod_lm_fit)
identify_extreme_leverages(mod_lm_fit, id = rownames(mtcars))

RobbyLankford/tidytest documentation built on Jan. 27, 2024, 7:36 a.m.