which.outlier: Return vector indexes of statistical univariate outliers


which.outlier R Documentation

Return vector indexes of statistical univariate outliers

Description

Performs simple outlier detection and returns the indexes of the outliers. Acts like the which() function, returning the indexes of the elements of a vector that satisfy a condition; by default, these are values more than 2 SD above or below the mean. The threshold can be changed, detection can be restricted to only high or only low values, and percentile or interquartile range thresholds can be used instead of standard deviations.
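
As a rough illustration of the default rule described above, the following base R sketch flags values more than thr standard deviations from the mean. The function name sd.outlier.sketch is hypothetical and this is not the NCmisc source; in particular, ignoring missing values via na.rm = TRUE is an assumption.

```r
# Hypothetical helper, not part of NCmisc: a minimal sketch of an SD-based rule.
sd.outlier.sketch <- function(x, thr = 2, high = TRUE, low = TRUE) {
  x <- as.numeric(x)                      # coerce, as the 'x' argument allows
  mn <- mean(x, na.rm = TRUE)             # assumption: missing values are ignored
  s <- sd(x, na.rm = TRUE)
  is.high <- high & (x > mn + thr * s)    # values above the upper cutoff
  is.low <- low & (x < mn - thr * s)      # values below the lower cutoff
  which(is.high | is.low)                 # indexes, in the style of which()
}

x <- c(rep(0, 9), 10)                     # nine zeros and one large value
sd.outlier.sketch(x)                      # the large value sits at index 10
sd.outlier.sketch(x, high = FALSE)        # only low outliers: none here
```

Setting high or low to FALSE simply switches off one side of the test, which mirrors the behaviour the arguments of which.outlier() describe.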

Usage

which.outlier(
  x,
  thr = 2,
  method = c("sd", "iq", "pc"),
  high = TRUE,
  low = TRUE
)

Arguments

x

numeric, or coercible, the vector to test for outliers

thr

numeric, threshold for the cutoff, e.g., when method="sd", the number of standard deviations; when "iq", the number of interquartile ranges (thr=1.5 is most typical here); or when "pc", a percentage, so you might select the extreme 1%, 5%, etc.

method

character, one of "sd", "iq", or "pc", selecting whether to test for outliers by standard deviation, interquartile range, or percentile.

high

logical, whether to test for outliers greater than the mean

low

logical, whether to test for outliers less than the mean

Value

Indexes of the elements of the vector x that are outliers according to an SD cutoff, interquartile range, or percentile threshold, above (high) and/or below (low) the mean/median.
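
To make the interquartile range and percentile thresholds more concrete, here is a hedged base R sketch of how such cutoffs could be computed. The helper outlier.bounds.sketch is hypothetical and is not the NCmisc implementation. It assumes the common box-plot convention for "iq" (fences at the quartiles plus or minus thr interquartile ranges) and assumes that "pc" treats thr as a percentage cut from each tail; the package may define either differently.

```r
# Hypothetical helper, not part of NCmisc: cutoffs for IQR and percentile rules.
outlier.bounds.sketch <- function(x, thr, method = c("iq", "pc")) {
  method <- match.arg(method)
  if (method == "iq") {
    # Assumption: box-plot style fences, thr IQRs beyond the quartiles
    q <- quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
    iqr <- q[2] - q[1]
    c(lower = q[1] - thr * iqr, upper = q[2] + thr * iqr)
  } else {
    # Assumption: thr is a percentage taken from each tail
    q <- quantile(x, c(thr / 100, 1 - thr / 100), na.rm = TRUE, names = FALSE)
    c(lower = q[1], upper = q[2])
  }
}

x <- c(1:9, 100)
b <- outlier.bounds.sketch(x, 1.5, "iq")
which(x < b[["lower"]] | x > b[["upper"]])   # indexes outside the IQR fences

b <- outlier.bounds.sketch(x, 10, "pc")
which(x > b[["upper"]])                      # high side only, like low=FALSE
```

Returning the bounds separately from the indexes keeps the cutoff logic easy to inspect; which.outlier() itself returns only the indexes.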

Examples

library(NCmisc) # provides which.outlier() and prv()
test.vec <- rnorm(200) # 200 draws from a standard normal
summary(test.vec)
ii <- which.outlier(test.vec) # 2 SD outliers
prv(ii); vals <- test.vec[ii]; prv(vals) # preview the indexes, then the values
ii <- which.outlier(test.vec, 1.5, "iq") # e.g., 'stars' on a box-plot
prv(ii)
ii <- which.outlier(test.vec, 5, "pc", low = FALSE) # only outliers > mean
prv(ii)

NCmisc documentation built on Oct. 17, 2022, 5:09 p.m.