Extreme case"

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(MMRcaseselection)

The extreme case is selected with regard to the cases' values on an independent variable or the outcome. It is defined by the absolute difference between the case value on the chosen variable and the variable's mean. For example, for the outcome the extremeness of a case is defined as $|Y_i-\hat{Y}|$, with $i$ being the case index. The extreme case is the case with the maximum absolute difference. (see Seawright (2016)) Depending on the research question or substantive interest, one might be interested in the extreme case in the lower or upper range of a variable. Extremeness is then calculated with $Y_i-\hat{Y}$.

The extreme_on_x() and extreme_on_y() functions take an lm object as input and calculate the extremeness of all cases. For extremeness on an independent variable, one additionally needs to specify the variable of interest as a character. The output is a dataframe and cases are ordered by absolute extremeness in decreasing order. The dataframe also presents the extremeness values that show whether the case is extreme in the lower range of the variable (negative values) or the positive range (positive values).

df <- lm(mpg ~ disp + wt, data = mtcars)
extreme_on_x(df, "wt")

The calculation of extremeness on the outcome only requires an lm object as input.

df <- lm(mpg ~ disp + wt, data = mtcars)
Y_extreme <- extreme_on_y(df)
head(Y_extreme)


Try the MMRcaseselection package in your browser

Any scripts or data that you put into this service are public.

MMRcaseselection documentation built on July 1, 2020, 9:55 p.m.