outliers_mad: Identify outliers using robust median absolute deviation...

outliers_mad    R Documentation

Identify outliers using robust median absolute deviation approach

Description

outliers_mad identifies outliers in numeric vectors using the median absolute deviation approach of Leys et al. (2013).

Usage

outliers_mad(x, threshold = 3.0, replace_outlier_value = NA,
show_mad_values = FALSE, show_outlier_indices = FALSE,
b_constant = 1.4826, digits = 2, debug = FALSE)

Arguments

x

a vector of numbers

threshold

value to use as cutoff (Leys et al. recommend 2.5 or 3.0 as default)

replace_outlier_value

if value is an outlier, what to replace it with? NA by default

show_mad_values

if TRUE, will show deviation score of each value

show_outlier_indices

if TRUE, return index/position of outliers

b_constant

scaling constant applied to the MAD (1.4826 by default); it is linked to the assumption of normality of the data, disregarding the abnormality induced by outliers

digits

how many digits to round output to

debug

if TRUE, print messages (FALSE by default)

Details

We can identify and remove outliers by flagging data points that are too extreme: either too many standard deviations (SD) away from the mean, or too many median absolute deviations (MAD) away from the median. The SD approach can fail with extreme outliers because the outliers themselves inflate the mean and the SD, whereas the median and the MAD are barely affected by them, so the MAD approach is much more robust (for a comparison of both approaches, see Leys et al., 2013, Journal of Experimental Social Psychology).
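
To make the contrast concrete, the following is a minimal base R sketch that scores the same vector both ways. It does not call outliers_mad and is not the package's implementation; the variable names (z_scores, mad_scores, scaled_mad) are illustrative assumptions, and the cutoff of 3 is the default threshold listed under Usage.

```r
x <- c(1, 3, 3, 6, 8, 10, 10, 1000, -1000)

# SD approach: distance from the mean in units of SD.
z_scores <- (x - mean(x)) / sd(x)

# MAD approach: distance from the median in units of scaled MAD.
b_constant <- 1.4826
scaled_mad <- b_constant * median(abs(x - median(x)))
mad_scores <- (x - median(x)) / scaled_mad

threshold <- 3.0
which(abs(z_scores) > threshold)    # positions flagged by the SD rule
which(abs(mad_scores) > threshold)  # positions flagged by the MAD rule
```

In this vector the two extreme values inflate the SD to roughly 500, so neither of them clears the cutoff under the SD rule, while the MAD rule flags positions 8 and 9.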

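b_constant is usually 1.4826, a scaling constant linked to the assumption of normality of the data, disregarding the abnormality induced by outliers (Rousseeuw & Croux, 1993). Multiplying the raw MAD by this constant puts it on the same scale as the SD when the data are normally distributed.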
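
As a rough check on where 1.4826 comes from, the sketch below recovers it as the reciprocal of the 75th percentile of the standard normal distribution, and compares a hand-computed scaled MAD with base R's stats::mad. The variable names are illustrative assumptions; only qnorm, median and mad are existing base R functions.

```r
# Reciprocal of the 75th percentile of the standard normal distribution.
b_from_normal <- 1 / qnorm(0.75)
round(b_from_normal, 4)

# The scaled MAD computed by hand agrees with stats::mad().
x <- c(1, 3, 3, 6, 8, 10, 10, 1000, -1000)
manual_mad <- 1.4826 * median(abs(x - median(x)))
builtin_mad <- mad(x, constant = 1.4826)
all.equal(manual_mad, builtin_mad)
```

If the data are assumed to follow a different distribution, a different constant would be appropriate, which is presumably why the argument is exposed.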

Value

A vector with outliers identified (default converts outliers to NA)
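
For readers who want to see the kind of output described here without installing the package, the following is a hedged sketch of a helper that flags and replaces outliers. mad_replace_sketch and its argument replace_value are hypothetical names, not part of hausekeep, and the actual outliers_mad may differ in rounding, messages, and handling of edge cases.

```r
# Hypothetical helper, not part of hausekeep: flag values whose scaled
# distance from the median exceeds `threshold`, then replace them.
mad_replace_sketch <- function(x, threshold = 3.0, replace_value = NA,
                               b_constant = 1.4826) {
  center <- median(x, na.rm = TRUE)
  scaled_mad <- b_constant * median(abs(x - center), na.rm = TRUE)
  scores <- (x - center) / scaled_mad
  # Missing scores are never treated as outliers.
  is_outlier <- !is.na(scores) & abs(scores) > threshold
  x[is_outlier] <- replace_value
  x
}

x <- c(1, 3, 3, 6, 8, 10, 10, 1000, -1000)
cleaned <- mad_replace_sketch(x)
cleaned
```

Note that if more than half of the values are identical, the MAD is zero and the scores are no longer informative; this sketch does not guard against that case.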

Author(s)

Hause Lin

References

See Also

outliersZ

Examples

x <- c(1, 3, 3, 6, 8, 10, 10, 1000, -1000) # 1000 and -1000 are outliers
outliers_mad(x)
outliers_mad(x, threshold = 3.0)
outliers_mad(x, threshold = 2.5, replace_outlier_value = -999)
outliers_mad(x, threshold = 1.5, show_outlier_indices = TRUE)
outliers_mad(x, threshold = 1.5, show_mad_values = TRUE)
outliers_mad(x, threshold = 1.5, show_mad_values = TRUE, replace_outlier_value = -88)

hauselin/hausekeep documentation built on Feb. 3, 2023, 3:09 p.m.