within_tolerance: Tolerance Interval

View source: R/within_tolerance.R

within_tolerance    R Documentation

Tolerance Interval

Description

The function flags observations that fall within the tolerance interval. Observations that fall outside the interval are regarded as (potential) outliers.

Usage

within_tolerance(x, w, method = c("quartile", "modified", "boxplot"),
                 constants, lambda = 0.05, info = FALSE,
                 boxplot_coef = 1.5)

Arguments

x

[numeric vector] data vector.

w

[numeric vector] design weights (same length as x).

method

[character] the method used to compute the tolerance interval, one of "quartile", "modified" (modified quartile method), or "boxplot"; see Details.

constants

[numeric vector] a vector of length two with the nonnegative tuning constants (c_l, c_u); it is used only by the methods "quartile" and "modified".

lambda

[numeric] a tuning constant that takes values in the closed unit interval [0, 1]; it is used only by the method "modified" (default: lambda = 0.05).

info

[logical] if TRUE, the tolerance interval is printed out.

boxplot_coef

[numeric] determines how far the whiskers of the boxplot extend out from the box; the default is 1.5.

Details

Three methods are available.

Quartile method ("quartile")

For the quartile method, the tolerance interval is given by

[m - c_l \cdot L_l, \; m + c_u \cdot L_u]

with

L_l = m - q_1 \quad \text{and} \quad L_u = q_3 - m,

where m denotes the (weighted) median; q_1 and q_3 are, respectively, the first and third (weighted) quartiles. The tuning constants c_l and c_u are combined into the vector (c_l, c_u), which is available as argument constants; both constants must be nonnegative numbers.

The quartiles are calculated using design weights.
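The quartile interval can be traced by hand with a small base R sketch. The helper wq() is hypothetical and not part of robsurvey; it uses one simple definition of a weighted quantile (the smallest value whose cumulative weight share reaches p), which may differ from the definition the package uses. The data and the constants c(3, 3) are illustrative only.

```r
# Hypothetical helper (NOT the robsurvey implementation): smallest value
# whose cumulative share of the weights reaches probability p.
wq <- function(x, w, p) {
  ord <- order(x)
  cw <- cumsum(w[ord]) / sum(w)
  x[ord][which(cw >= p)[1]]
}

x <- c(2, 4, 5, 7, 9, 12, 40)   # toy data, assumed for illustration
w <- rep(1, length(x))          # equal design weights
constants <- c(3, 3)            # (c_l, c_u), illustrative values

q1 <- wq(x, w, 0.25)            # first weighted quartile
m  <- wq(x, w, 0.50)            # weighted median
q3 <- wq(x, w, 0.75)            # third weighted quartile

L_l <- m - q1
L_u <- q3 - m
lower <- m - constants[1] * L_l
upper <- m + constants[2] * L_u

# TRUE = within the tolerance interval, FALSE = (potential) outlier
inside <- x >= lower & x <= upper
print(c(lower = lower, upper = upper))
print(inside)
```

With these toy values, only the largest observation falls outside the interval.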

Modified quartile method ("modified")

For the modified quartile method (Lee, 1995), the tolerance interval is given by replacing L_l and L_u with, respectively,

L_l = \max\big(m - q_1, \vert \lambda \cdot m\vert\big),

and

L_u = \max\big(q_3 - m, \vert \lambda \cdot m \vert\big).

The tuning constant \lambda can only take values in the closed unit interval and is available as argument lambda.

The quartiles are calculated using design weights.
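The effect of the \lambda term is easiest to see when the quartiles lie very close to the median. The following base R sketch compares the plain and the modified quartile interval on such data. The helper wq() is hypothetical (a simple weighted quantile, not the robsurvey implementation), and the data and constants are assumed for illustration.

```r
# Hypothetical helper (NOT the robsurvey implementation): smallest value
# whose cumulative share of the weights reaches probability p.
wq <- function(x, w, p) {
  ord <- order(x)
  cw <- cumsum(w[ord]) / sum(w)
  x[ord][which(cw >= p)[1]]
}

x <- c(98, 100, 100, 100, 101, 101, 104, 130)  # tightly clustered toy data
w <- rep(1, length(x))                         # equal design weights
constants <- c(3, 3)                           # (c_l, c_u), illustrative
lambda <- 0.05

q1 <- wq(x, w, 0.25)
m  <- wq(x, w, 0.50)
q3 <- wq(x, w, 0.75)

# Plain quartile method: L_l is zero here, so the interval collapses
# onto the median on the lower side
inside_plain <- x >= m - constants[1] * (m - q1) &
  x <= m + constants[2] * (q3 - m)

# Modified method: |lambda * m| acts as a floor for both lengths
L_l <- max(m - q1, abs(lambda * m))
L_u <- max(q3 - m, abs(lambda * m))
lower <- m - constants[1] * L_l
upper <- m + constants[2] * L_u
inside <- x >= lower & x <= upper

print(c(lower = lower, upper = upper))
print(rbind(plain = inside_plain, modified = inside))
```

In this sketch the plain method also flags 98 and 104, whereas the modified method flags only 130.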

Boxplot (box-and-whisker plot) method ("boxplot")

The tolerance interval for the boxplot method extends from the lower whisker to the upper whisker. By default, the whiskers extend at most 1.5 times the interquartile range from the box; this factor is set by the argument boxplot_coef. For more details, see boxplot.

The quartiles, and therefore the interquartile range, are calculated using design weights.
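A rough base R sketch of the boxplot rule follows. It assumes the usual boxplot convention, in which each whisker ends at the most extreme data point that lies no more than boxplot_coef times the interquartile range from the box; whether robsurvey computes the whiskers in exactly this way is an assumption. The helper wq() is hypothetical (a simple weighted quantile), and the data are illustrative.

```r
# Hypothetical helper (NOT the robsurvey implementation): smallest value
# whose cumulative share of the weights reaches probability p.
wq <- function(x, w, p) {
  ord <- order(x)
  cw <- cumsum(w[ord]) / sum(w)
  x[ord][which(cw >= p)[1]]
}

x <- c(2, 4, 5, 7, 9, 12, 40)   # toy data, assumed for illustration
w <- rep(1, length(x))          # equal design weights
boxplot_coef <- 1.5

q1 <- wq(x, w, 0.25)
q3 <- wq(x, w, 0.75)
iqr <- q3 - q1

# Fences: the farthest the whiskers are allowed to reach
lower_fence <- q1 - boxplot_coef * iqr
upper_fence <- q3 + boxplot_coef * iqr

# Whiskers: most extreme data points that are still inside the fences
lower_whisker <- min(x[x >= lower_fence])
upper_whisker <- max(x[x <= upper_fence])

# TRUE = within the tolerance interval, FALSE = (potential) outlier
inside <- x >= lower_whisker & x <= upper_whisker
print(c(lower = lower_whisker, upper = upper_whisker))
print(inside)
```

Here the lower whisker coincides with the minimum of the data. Testing against the fences instead of the whiskers flags the same observations, because no data point lies between a whisker and its fence.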

Value

A vector of logicals, where TRUE indicates that an observation is within the tolerance limits and FALSE indicates a (potential) outlier.

If info = TRUE, the function prints the tolerance interval. The endpoints of the interval can be numbers or the symbols ‘min.’ and ‘max.’, which denote the minimum and maximum values in the data, respectively.

References

Lee, H. (1995). Outliers in Business Surveys, in: Cox, B. G. et al. (eds.), Business Survey Methods, pp. 503–526. New York: John Wiley and Sons.

See Also

Overview (of all implemented functions)

Examples

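# Load the package that provides within_tolerance() and the workplace data
library(robsurvey)

head(workplace)
# Make the columns (e.g., payroll and weight) available by name
attach(workplace)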

# Show the tolerance limits
within_tolerance(payroll, weight, method = "boxplot", info = TRUE)

# Observations that fall outside the tolerance limits are (potential) outliers
outlier <- !within_tolerance(payroll, weight, method = "boxplot")
outlier[1:10]

robsurvey documentation built on Jan. 29, 2026, 1:07 a.m.