plot_cooksdistance: Influence of Observations Plot

View source: R/plot_cooksdistance.R

plot_cooksdistance    R Documentation

Description

Plot of Cook’s distances, used to estimate the influence of a single observation.

Usage

plot_cooksdistance(object, ..., nlabel = 3)

plotCooksDistance(object, ..., nlabel = 3)

Arguments

object

An object of class auditor_model_cooksdistance created with the model_cooksdistance function.

...

Other objects of class auditor_model_cooksdistance.

nlabel

Number of observations with the biggest Cook's distances to be labeled.

Details

Cook’s distance is a tool for identifying observations that may negatively affect the model. It may also be used to indicate regions of the design space where more observations would be worth collecting. Data points flagged by large Cook’s distances are worth checking for validity.

The Cook’s distance of the i-th observation is calculated by removing that observation from the data and refitting the model. It shows how much all the fitted values of the model change when the i-th observation is removed.

For model classes other than lm and glm the distances are computed directly from the definition.
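
The leave-one-out definition above can be illustrated with a short sketch in base R. This is not auditor's implementation; it assumes a linear model fitted to the built-in mtcars data, and the names fit, cooks_manual and cooks_builtin are hypothetical. Each distance is the sum of squared changes in all fitted values after dropping observation i, divided by the number of coefficients times the residual variance of the full model.

```r
# Sketch only: Cook's distances computed directly from the leave-one-out
# definition, using base R and the built-in mtcars data (an assumption).
fit  <- lm(mpg ~ wt + hp, data = mtcars)
n    <- nrow(mtcars)
p    <- length(coef(fit))        # number of estimated coefficients
s2   <- summary(fit)$sigma^2     # residual variance of the full model
yhat <- fitted(fit)              # fitted values of the full model

cooks_manual <- vapply(seq_len(n), function(i) {
  # refit the model without the i-th observation
  fit_i  <- lm(mpg ~ wt + hp, data = mtcars[-i, ])
  # predict for all observations, including the removed one
  yhat_i <- predict(fit_i, newdata = mtcars)
  # scaled squared change in all fitted values
  sum((yhat - yhat_i)^2) / (p * s2)
}, numeric(1))

# closed-form values from stats::cooks.distance, for comparison
cooks_builtin <- unname(cooks.distance(fit))
all.equal(cooks_manual, cooks_builtin)
```

For lm objects the refitting loop is unnecessary, because stats::cooks.distance obtains the same values from the residuals and leverages of a single fit. The loop is shown only because it carries over to models that have no such shortcut.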

Value

A ggplot object.

References

Cook, R. Dennis (1977). "Detection of Influential Observations in Linear Regression". doi:10.2307/1268249.

Examples

# load auditor first so that audit() is available
library(auditor)

dragons <- DALEX::dragons[1:100, ]

# fit a model
model_lm <- lm(life_length ~ ., data = dragons)

# wrap the model in an auditor explainer
lm_audit <- audit(model_lm, data = dragons, y = dragons$life_length)

# validate a model with auditor
cd_lm <- model_cooksdistance(lm_audit)

# plot results
plot_cooksdistance(cd_lm)
plot(cd_lm, type = "cooksdistance")


auditor documentation built on Nov. 2, 2023, 6:13 p.m.