outlier: Definition and detection of outliers

View source: R/deepOutlier.r

outlier    R Documentation

Definition and detection of outliers

Description

Definition and detection of outliers

Usage

outlier(x, type = c("iqr", "mean", "median"), fill = NULL, ...)

Arguments

x

A numeric vector.

type

The type of outlier definition and detection.

fill

A value that is used to replace outliers; NULL (default) indicates no replacement.

...

Further arguments.

Details

The following types of outlier detection are implemented:

  • iqr: refers to the method of Tukey (1977). Outliers are defined as elements more than 1.5 interquartile ranges above the upper quartile (75 percent) or below the lower quartile (25 percent). This method is useful when x is not normally distributed. The multiplier k can be specified as a further argument, default 1.5.

  • mean: denotes maximum likelihood estimation. Outliers are defined as elements more than k standard deviations from the mean (the classical rule uses three). This method is faster but less robust than median. The multiplier k can be specified as a further argument, default 2.

  • median: denotes scaled median absolute deviation (MAD). Outliers are defined as elements more than three scaled MADs from the median. The scaled MAD is defined as c * median(abs(x - median(x))), where c = -1/(sqrt(2) * erfcinv(3/2)), approximately 1.4826. The multiplier k can be specified as a further argument, default 3.
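
The three rules above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the helper name outlier_bounds is hypothetical, the quartiles use the default method of quantile(), the mean rule uses sd() (the n - 1 denominator, which may differ from what the package computes), and the scaled MAD is taken from stats::mad(), whose default constant is 1.4826.

```r
# Hypothetical helper (not part of deepANN): lower and upper boundaries
# for the three outlier rules described above.
outlier_bounds <- function(x, type = c("iqr", "mean", "median"), k = NULL) {
  type <- match.arg(type)
  if (type == "iqr") {
    if (is.null(k)) k <- 1.5
    # Quartiles with the default quantile() method (an assumption).
    q <- stats::quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
    spread <- q[2] - q[1]
    c(lower = q[1] - k * spread, upper = q[2] + k * spread)
  } else if (type == "mean") {
    if (is.null(k)) k <- 2
    m <- mean(x, na.rm = TRUE)
    s <- stats::sd(x, na.rm = TRUE)  # sample standard deviation (assumption)
    c(lower = m - k * s, upper = m + k * s)
  } else {
    if (is.null(k)) k <- 3
    med <- stats::median(x, na.rm = TRUE)
    # stats::mad() returns constant * median(abs(x - center)), constant = 1.4826.
    smad <- stats::mad(x, center = med, na.rm = TRUE)
    c(lower = med - k * smad, upper = med + k * smad)
  }
}

x <- c(57, 59, 60, 100, 59, 58, 57, 58, 300, 61, 62, 60, 62, 58, 57, -12)
b <- outlier_bounds(x, type = "median")
x[x < b["lower"] | x > b["upper"]]  # elements outside the boundaries
```

Passing k explicitly, for example outlier_bounds(x, type = "mean", k = 3), widens or narrows the boundaries in the same way the further argument k is described to do.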

Value

Depends on fill: if fill is NULL (default), a named list of the lower and upper boundaries and the values; otherwise, the vector x with outliers replaced by fill.
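
The two return shapes can be illustrated with a small base R sketch. The helper apply_fill and the list element names lower, upper and values are hypothetical and only mirror the description above; the element names actually returned by outlier() may differ.

```r
# Hypothetical helper (not part of deepANN): given boundaries, either
# report the flagged values or replace them, depending on fill.
apply_fill <- function(x, lower, upper, fill = NULL) {
  is_out <- !is.na(x) & (x < lower | x > upper)
  if (is.null(fill)) {
    # No replacement requested: return boundaries and flagged values.
    list(lower = lower, upper = upper, values = x[is_out])
  } else {
    # Replacement requested: return x with flagged elements set to fill.
    x[is_out] <- fill
    x
  }
}

x <- c(57, 59, 300, 61, -12)
apply_fill(x, lower = 50, upper = 70)             # named list
apply_fill(x, lower = 50, upper = 70, fill = NA)  # vector with replacements
```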

References

Tukey, John W. (1977): Exploratory Data Analysis. Reading: Addison-Wesley.

See Also

quantile, IQR, outlier_dataset, winsorize.

Other Outlier: outlier_dataset(), winsorize()

Examples

  x <- c(57L, 59L, 60L, 100L, 59L, 58L, 57L, 58L, 300L, 61L, 62L, 60L, 62L, 58L, 57L, -12L)
  outlier(x, type = "median")

stschn/deepANN documentation built on June 25, 2024, 7:27 a.m.