outCoDa | R Documentation |
Outlier detection for compositional data using standard and robust statistical methods.
outCoDa(x, quantile = 0.975, method = "robust", alpha = 0.5, coda = TRUE)
## S3 method for class 'outCoDa'
print(x, ...)
## S3 method for class 'outCoDa'
plot(x, y, ..., which = 1)
x |
compositional data |
quantile |
quantile, corresponding to a significance level, is used as a cut-off value for outlier identification: observations with larger (squared) robust Mahalanobis distance are considered as potential outliers. |
method |
either “robust” (default) or “standard”; the examples also use “robustHD” for high-dimensional data |
alpha |
the size of the subsets for the robust covariance estimation
according to the MCD-estimator for which the determinant is minimized, see |
coda |
if TRUE, the data are transformed to a coordinate representation before outlier detection. |
... |
additional parameters for print and plot method passed through |
y |
unused second plot argument for the plot method |
which |
1: Mahalanobis distance against index; 2: distance-distance plot |
The outlier detection procedure is based on (robust) Mahalanobis distances in isometric logratio coordinates. Observations with a squared Mahalanobis distance greater than or equal to a certain quantile of the chi-squared distribution are marked as outliers.
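The procedure described above can be sketched in base R. This is an illustration on simulated data, not the package's implementation: it uses the classical mean and covariance rather than a robust estimator, builds one possible orthonormal logratio basis from Helmert contrasts, and assumes D - 1 degrees of freedom for the chi-squared cut-off. All object names (X, V, Z, md2, cutoff) are hypothetical.

```r
# Sketch only: classical (non-robust) distances on simulated compositions.
set.seed(1)
n <- 50
D <- 4
X <- matrix(rexp(n * D), nrow = n)  # strictly positive parts (simulated)
X <- X / rowSums(X)                 # close each row to sum 1

# Orthonormal basis whose columns sum to zero (one possible ilr basis).
V <- contr.helmert(D)
V <- sweep(V, 2, sqrt(colSums(V^2)), "/")
Z <- log(X) %*% V                   # n x (D - 1) logratio coordinates

# Squared Mahalanobis distances with classical location and scatter.
md2 <- mahalanobis(Z, center = colMeans(Z), cov = cov(Z))

# Cut-off: chi-squared quantile (assumed df = D - 1); flag distances >= cut-off.
cutoff <- qchisq(0.975, df = D - 1)
outlier <- md2 >= cutoff
which(outlier)
```

Because the Mahalanobis distance is invariant under a change of orthonormal basis, any other isometric logratio basis would give the same distances in this sketch.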
If method “robust” is chosen, outlier detection is based on the homogeneous majority of the compositional data set. If method “standard” is used, standard measures of location and scatter are applied during the outlier detection procedure. Method “robustHD” can be used if the number of variables is greater than the number of observations; in this case the OGK estimator is used.
plot method: the Mahalanobis distances are plotted against the index. The dashed line indicates the (1 - alpha) quantile of the chi-squared distribution, where alpha is the significance level corresponding to the quantile argument. Observations with a Mahalanobis distance greater than this quantile can be considered compositional outliers.
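A short usage sketch of the plot method, using only the arguments listed in the usage section above. It assumes the robCompositions package is installed and uses its expenditures data set.

```r
library(robCompositions)
data(expenditures)

oD <- outCoDa(expenditures)

# which = 1 (default): Mahalanobis distances against the observation index
plot(oD, which = 1)

# which = 2: distance-distance plot
plot(oD, which = 2)
```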
mahalDist |
resulting Mahalanobis distance |
limit |
quantile of the Chi-squared distribution |
outlierIndex |
logical vector indicating outliers and non-outliers |
method |
method used |
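The returned components can be inspected directly. The sketch below assumes the robCompositions package is installed and that the components are accessible with `$` under the names listed above; it also cross-tabulates the flags from the robust and standard methods as one way to see where they disagree.

```r
library(robCompositions)
data(expenditures)

oD_rob <- outCoDa(expenditures, method = "robust")
oD_std <- outCoDa(expenditures, method = "standard")

oD_rob$method               # method used
oD_rob$limit                # cut-off derived from the chi-squared quantile
head(oD_rob$mahalDist)      # first few Mahalanobis distances
which(oD_rob$outlierIndex)  # positions of the flagged observations

# Agreement between the two methods
table(robust = oD_rob$outlierIndex, standard = oD_std$outlierIndex)
```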
It is highly recommended to use the robust version of the procedure.
Matthias Templ, Karel Hron
Egozcue, J.J., Pawlowsky-Glahn, V., Mateu-Figueras, G., Barcelo-Vidal, C. (2003) Isometric logratio transformations for compositional data analysis. Mathematical Geology, 35(3), 279-300.
Filzmoser, P., Hron, K. (2008) Outlier detection for compositional data using robust methods. Mathematical Geosciences, 40, 233-248.
Rousseeuw, P.J., Van Driessen, K. (1999) A fast algorithm for the minimum covariance determinant estimator. Technometrics, 41, 212-223.
pivotCoord
library(robCompositions)
data(expenditures)
oD <- outCoDa(expenditures)
oD
## providing a function:
oD <- outCoDa(expenditures, coda = log)
## for high-dimensional data:
oD <- outCoDa(expenditures, method = "robustHD")