dixonTest | R Documentation |
Performs Dixon's single outlier test.
dixonTest(x, alternative = c("two.sided", "greater", "less"), refined = FALSE)
x |
a numeric vector of data |
alternative |
the alternative hypothesis.
Defaults to "two.sided". |
refined |
logical indicator, whether the refined version (TRUE)
or the Q-test (FALSE) is performed. Defaults to FALSE. |
Let X denote an independently and identically distributed normal variate. Further, let the increasingly ordered realizations be denoted by x_1 ≤ x_2 ≤ … ≤ x_n. Dixon (1950) proposed the following ratio statistic to detect an outlier (two-sided):
r_{j,i-1} = max{(x_n - x_{n-j}) / (x_n - x_i), (x_{1+j} - x_1) / (x_{n-i+1} - x_1)}
The null hypothesis of no outlier is tested against the alternative that at least one observation is an outlier (two-sided). The subscript j on the r symbol indicates the number of outliers suspected at the upper end of the data set, and the subscript i indicates the number of outliers suspected at the lower end. The statistic r_{10} is also commonly denoted Q.
The statistic for a single maximum outlier is:
r_{j,i-1} = (x_n - x_{n-j}) / (x_n - x_i)
The null hypothesis is tested against the alternative, the maximum observation is an outlier.
For testing a single minimum outlier, the test statistic is:
r_{j,i-1} = (x_{1+j} - x_1) / (x_{n-i+1} - x_1)
The null hypothesis is tested against the alternative, the minimum observation is an outlier.
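The three ratio statistics can be reproduced by hand, which may help when checking the value reported by the test. The following is a minimal base R sketch and not the package's implementation; the helper name dixon_ratio is hypothetical. It assumes the lower-end denominator is x_{n-i+1} - x_1, so that r_{10} reduces to the familiar Q (gap divided by range) at both ends.

```r
# Hypothetical helper (not part of any package): Dixon ratio r_{j,i-1}
# computed directly from the order statistics, base R only.
dixon_ratio <- function(x, j = 1, i = 1,
                        alternative = c("two.sided", "greater", "less")) {
  alternative <- match.arg(alternative)
  x <- sort(x[!is.na(x)])   # increasingly ordered realizations x_1 <= ... <= x_n
  n <- length(x)
  stopifnot(n >= 3, j >= 1, i >= 1, n > i + j)

  # ratio for a suspected maximum outlier
  upper <- (x[n] - x[n - j]) / (x[n] - x[i])
  # ratio for a suspected minimum outlier
  # (assumption: denominator x_{n-i+1} - x_1)
  lower <- (x[1 + j] - x[1]) / (x[n - i + 1] - x[1])

  switch(alternative,
         two.sided = max(upper, lower),
         greater   = upper,
         less      = lower)
}

# Dean and Dixon (1951) data, r_{10} (j = 1, i = 1), two-sided
x <- c(40.02, 40.12, 40.16, 40.18, 40.18, 40.20)
dixon_ratio(x)                                        # 0.10 / 0.18, about 0.5556

# NIST data, r_{11} (j = 1, i = 2), maximum outlier
y <- c(568, 570, 570, 570, 572, 578, 584, 596)
dixon_ratio(y, j = 1, i = 2, alternative = "greater") # 12 / 26, about 0.4615
```

For the first data set the suspicious value is the minimum (40.02), so the two-sided statistic equals the lower-end ratio.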
Apart from the earlier Dixon's Q-test (i.e. r_{10}), a refined version that was later proposed by Dixon can be performed with this function, where the statistic r_{j,i-1} depends on the sample size as follows:
r_{10}: | 3 ≤ n ≤ 7 |
r_{11}: | 8 ≤ n ≤ 10 |
r_{21}: | 11 ≤ n ≤ 13 |
r_{22}: | 14 ≤ n ≤ 30 |
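The sample-size rule in the table above can be written as a small lookup. This is only a sketch: the function name dixon_variant is hypothetical, and the check 3 ≤ n ≤ 30 is an assumption taken from the range covered by the table, not a documented constraint.

```r
# Hypothetical helper: returns the pair (j, i) that defines r_{j,i-1}
# for a given sample size n, following the table above.
dixon_variant <- function(n, refined = FALSE) {
  stopifnot(n >= 3, n <= 30)                        # assumed range, from the table
  if (!refined || n <= 7) return(c(j = 1, i = 1))   # r_{10}, the Q-test
  if (n <= 10) return(c(j = 1, i = 2))              # r_{11}
  if (n <= 13) return(c(j = 2, i = 2))              # r_{21}
  c(j = 2, i = 3)                                   # r_{22}
}

dixon_variant(6)                   # j = 1, i = 1, i.e. r_{10}
dixon_variant(8, refined = TRUE)   # j = 1, i = 2, i.e. r_{11}
dixon_variant(20, refined = TRUE)  # j = 2, i = 3, i.e. r_{22}
```

Note that the second subscript of r is i - 1, so i = 2 corresponds to r_{11} and r_{21}, and i = 3 to r_{22}.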
The p-value is computed with the function pdixon.
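As a rough plausibility check on a reported p-value, the null distribution of the two-sided Q statistic can be approximated by simulation. The sketch below is not how pdixon works and makes no claim to reproduce its result exactly; it only uses the fact that the ratio is invariant to location and scale, so standard normal samples suffice. The names q_stat, sim and p_mc are hypothetical.

```r
# Monte Carlo approximation of a two-sided p-value for r_{10} (Q),
# base R only; a simulation check, not a replacement for pdixon.
set.seed(1)

# two-sided Q: larger of the upper and lower gap, divided by the range
q_stat <- function(v) {
  v <- sort(v)
  n <- length(v)
  max(v[n] - v[n - 1], v[2] - v[1]) / (v[n] - v[1])
}

x   <- c(40.02, 40.12, 40.16, 40.18, 40.18, 40.20)
obs <- q_stat(x)

# simulate Q under the null hypothesis (i.i.d. normal, no outlier)
sim  <- replicate(20000, q_stat(rnorm(length(x))))

# share of simulated statistics at least as extreme as the observed one
p_mc <- mean(sim >= obs)
p_mc
```

The estimate carries simulation error, so small differences from the value returned by the test are expected.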
Dixon, W. J. (1950) Analysis of extreme values. Ann. Math. Stat. 21, 488–506. doi: 10.1214/aoms/1177729747.
Dean, R. B., Dixon, W. J. (1951) Simplified statistics for small numbers of observations. Anal. Chem. 23, 636–638. doi: 10.1021/ac60052a025.
McBane, G. C. (2006) Programs to compute distribution functions and critical values for extreme value ratios for outlier detection. J. Stat. Soft. 16. doi: 10.18637/jss.v016.i03.
## example from Dean and Dixon 1951, Anal. Chem., 23, 636-638.
x <- c(40.02, 40.12, 40.16, 40.18, 40.18, 40.20)
dixonTest(x, alternative = "two.sided")

## example from the dataplot manual of NIST
x <- c(568, 570, 570, 570, 572, 578, 584, 596)
dixonTest(x, alternative = "greater", refined = TRUE)