| within_tolerance | R Documentation |
The function flags observations that fall within the tolerance interval. Observations that fall outside the interval are regarded as (potential) outliers.
within_tolerance(x, w, method = c("quartile", "modified", "boxplot"),
constants, lambda = 0.05, info = FALSE,
boxplot_coef = 1.5)
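| x | [numeric vector] observations to be checked against the tolerance interval. |
| w | [numeric vector] design weights, used to compute the (weighted) median and quartiles. |
| method | [character] one of the methods: "quartile", "modified", or "boxplot". |
| constants | [numeric vector] tuning constants (c_l, c_u) of the quartile and modified quartile methods; both must be nonnegative. |
| lambda | [numeric] tuning constant of the modified quartile method, a value in the closed unit interval (default: 0.05). |
| info | [logical] if TRUE, the tolerance interval is printed (default: FALSE). |
| boxplot_coef | [numeric] length of the whiskers as a multiple of the interquartile range (default: 1.5). |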
Three methods are available.
"quartile": For the quartile method, the tolerance interval is given by
[m - c_l \cdot L_l, \; m + c_u \cdot L_u]
with
L_l = m - q_1 \quad \text{and} \quad L_u = q_3 - m,
where m denotes the (weighted) median; q_1 and
q_3 are, respectively, the first and third (weighted)
quartiles. The tuning constants c_l and c_u
are combined into the vector (c_l, c_u), which is
available as argument constants; both constants must be
nonnegative numbers.
The quartiles are calculated using design weights.
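The quartile-method interval can be sketched in a few lines of R. This is an illustration under stated assumptions, not the package's implementation: the helper `wq()` and the function `quartile_interval()` are hypothetical names, and `wq()` uses one simple definition of a weighted quantile (the smallest value whose cumulative weight share reaches p), which may differ from the definition used by the package.

```r
# Hypothetical helper (assumption): weighted quantile defined as the smallest
# value whose cumulative share of the total weight reaches p
wq <- function(x, w, p) {
  ord <- order(x)
  cum_share <- cumsum(w[ord]) / sum(w)
  x[ord][which(cum_share >= p)[1]]
}

# Sketch of the interval [m - c_l * L_l, m + c_u * L_u]
quartile_interval <- function(x, w, constants) {
  stopifnot(length(constants) == 2, all(constants >= 0))
  q1 <- wq(x, w, 0.25)
  m  <- wq(x, w, 0.50)
  q3 <- wq(x, w, 0.75)
  c(lower = m - constants[1] * (m - q1),   # L_l = m - q_1
    upper = m + constants[2] * (q3 - m))   # L_u = q_3 - m
}

# Toy data with equal weights; the last value is far from the bulk
x <- c(1, 2, 3, 4, 5, 6, 7, 100)
w <- rep(1, length(x))
lim <- quartile_interval(x, w, constants = c(3, 3))
lim

# TRUE = within the interval, FALSE = (potential) outlier
x >= lim["lower"] & x <= lim["upper"]
```

With these toy data, only the value 100 falls outside the interval.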
"modified": For the modified quartile method (Lee, 1995), the tolerance
interval is given by replacing L_l and L_u
with, respectively,
L_l = \max\big(m - q_1, \vert \lambda \cdot m\vert\big),
and
L_u = \max\big(q_3 - m, \vert \lambda \cdot m \vert\big).
The tuning constant \lambda must lie in the closed unit interval [0, 1]
and is set through the argument lambda (default: 0.05).
The quartiles are calculated using design weights.
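The effect of the modification is easiest to see on data whose quartiles coincide with the median, where the plain quartile distances collapse to zero. The following is a minimal sketch under assumptions: `wq()` and `modified_interval()` are hypothetical names, and the weighted quantile definition in `wq()` is a simple choice that may differ from the one used by the package.

```r
# Hypothetical helper (assumption): weighted quantile defined as the smallest
# value whose cumulative share of the total weight reaches p
wq <- function(x, w, p) {
  ord <- order(x)
  cum_share <- cumsum(w[ord]) / sum(w)
  x[ord][which(cum_share >= p)[1]]
}

# Sketch of the modified quartile interval: the distances L_l and L_u are
# bounded from below by |lambda * m|
modified_interval <- function(x, w, constants, lambda = 0.05) {
  stopifnot(lambda >= 0, lambda <= 1, all(constants >= 0))
  q1 <- wq(x, w, 0.25)
  m  <- wq(x, w, 0.50)
  q3 <- wq(x, w, 0.75)
  L_l <- max(m - q1, abs(lambda * m))
  L_u <- max(q3 - m, abs(lambda * m))
  c(lower = m - constants[1] * L_l,
    upper = m + constants[2] * L_u)
}

# Tightly clustered toy data: q_1, m, and q_3 are all equal to 100
x <- c(100, 100, 100, 100, 100, 100, 101, 130)
w <- rep(1, length(x))
lim <- modified_interval(x, w, constants = c(3, 3), lambda = 0.05)
lim

# TRUE = within the interval, FALSE = (potential) outlier
x >= lim["lower"] & x <= lim["upper"]
```

Here the unmodified distances m - q_1 and q_3 - m are both zero, so the plain quartile interval would degenerate to the single point 100. The lower bound |lambda * m| = 5 keeps the interval open enough that 101 is not flagged, while 130 still is.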
"boxplot": The tolerance interval for the boxplot method extends from the
lower whisker to the upper whisker. By default, the length of the
whiskers is set to 1.5 times the interquartile range; see argument
boxplot_coef. For more details, see
boxplot.
The quartiles, and therefore the interquartile range, are calculated using design weights.
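A rough sketch of the boxplot rule follows. It is built on assumptions: `wq()` and `boxplot_interval()` are hypothetical names, the weighted quantile definition is a simple choice that may differ from the package's, and the sketch returns the fences q_1 - coef * IQR and q_3 + coef * IQR rather than the whisker ends. Since a whisker ends at the most extreme observation inside the corresponding fence, both variants flag the same observations.

```r
# Hypothetical helper (assumption): weighted quantile defined as the smallest
# value whose cumulative share of the total weight reaches p
wq <- function(x, w, p) {
  ord <- order(x)
  cum_share <- cumsum(w[ord]) / sum(w)
  x[ord][which(cum_share >= p)[1]]
}

# Sketch of the boxplot fences based on weighted quartiles
boxplot_interval <- function(x, w, boxplot_coef = 1.5) {
  stopifnot(boxplot_coef >= 0)
  q1 <- wq(x, w, 0.25)
  q3 <- wq(x, w, 0.75)
  iqr <- q3 - q1                      # (weighted) interquartile range
  c(lower = q1 - boxplot_coef * iqr,
    upper = q3 + boxplot_coef * iqr)
}

# Toy data with equal weights; the last value is far from the bulk
x <- c(1, 2, 3, 4, 5, 6, 7, 100)
w <- rep(1, length(x))
lim <- boxplot_interval(x, w)
lim

# TRUE = within the interval, FALSE = (potential) outlier
x >= lim["lower"] & x <= lim["upper"]
```

Increasing boxplot_coef widens the fences and therefore flags fewer observations.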
A vector of logicals, where TRUE indicates that an observation is within
the tolerance limits and FALSE indicates a (potential) outlier.
If info = TRUE, the function prints the tolerance interval. The
endpoints of the interval can be numbers or the symbols ‘min.’ and
‘max.’, which denote the minimum and maximum values in the data,
respectively.
Lee, H. (1995). Outliers in Business Surveys, in: Cox, B. G. et al. (eds.), Business Survey Methods, pp. 503–526. New York: John Wiley and Sons.
Overview (of all implemented functions)
head(workplace)
attach(workplace)
# Show the tolerance limits
within_tolerance(payroll, weight, method = "boxplot", info = TRUE)
# Observations that fall outside the tolerance limits are (potential) outliers
outlier <- !within_tolerance(payroll, weight, method = "boxplot")
outlier[1:10]