OutlierMahdist | R Documentation |
This function uses the Mahalanobis distance as a basis for multivariate outlier detection. The standard approach is to estimate the location and scatter parameters of the Mahalanobis distance robustly and to compare the resulting distances with a critical value of the chi-squared distribution (Rousseeuw and Van Zomeren, 1990).
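The idea described above can be sketched in base R. This is only an illustration, not the implementation used by OutlierMahdist: the classical mean and covariance stand in for the robust estimates, the data are simulated, and the 0.975 quantile is an assumed cutoff level.

```r
# Illustrative sketch: flag observations whose squared Mahalanobis distance
# exceeds a chi-squared quantile. Classical estimates are a NON-robust
# stand-in; a robust procedure would supply robust location/scatter instead.
set.seed(1)
x <- matrix(rnorm(200), ncol = 2)         # 100 simulated bivariate observations
x[1, ] <- c(10, 10)                       # one planted gross outlier

center  <- colMeans(x)                    # location estimate (stand-in)
scatter <- cov(x)                         # scatter estimate (stand-in)

d2     <- mahalanobis(x, center, scatter) # squared Mahalanobis distances
cutoff <- qchisq(0.975, df = ncol(x))     # assumed cutoff: 97.5% quantile

is_outlier <- d2 > cutoff                 # TRUE = flagged as outlier
which(is_outlier)                         # indices of flagged observations
```

With several outliers the classical estimates can themselves be distorted (masking), which is why robust estimates are preferred in practice.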
OutlierMahdist(x, ...)
## Default S3 method:
OutlierMahdist(x, grouping, control, trace=FALSE, ...)
## S3 method for class 'formula'
OutlierMahdist(formula, data, ..., subset, na.action)
formula |
a formula with no response variable, referring only to numeric variables. |
data |
an optional data frame (or similar: see
|
subset |
an optional vector used to select rows (observations) of the
data matrix |
na.action |
a function which indicates what should happen
when the data contain missing values (NA). |
... |
arguments passed to or from other methods. |
x |
a matrix or data frame. |
grouping |
grouping variable: a factor specifying the class for each observation. |
control |
a control object (S4) for one of the available control classes,
e.g. |
trace |
whether to print intermediate results. Default is FALSE. |
If the data set consists of two or more classes (specified by the grouping variable grouping), the proposed method iterates through the classes present in the data, separates each class from the rest and identifies the outliers relative to this class, thus treating both types of outliers, the mislabeled and the abnormal samples, in a homogeneous way.
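The class-wise iteration can be sketched as follows in base R. This is a hedged illustration under stated assumptions, not the package code: the iris data and the classical estimates are stand-ins chosen so the example runs without extra packages, and the 0.975 quantile is an assumed cutoff level.

```r
# Illustrative sketch: compute Mahalanobis distances separately within each
# class, so every observation is judged relative to its own labeled class.
data(iris)
x        <- as.matrix(iris[, 1:4])        # numeric variables
grouping <- iris$Species                  # factor giving the class labels

cutoff <- sqrt(qchisq(0.975, df = ncol(x)))  # assumed cutoff on distance scale
dist   <- numeric(nrow(x))

for (cl in levels(grouping)) {
  idx <- which(grouping == cl)            # rows belonging to this class
  xi  <- x[idx, , drop = FALSE]
  # classical mean/covariance as a NON-robust stand-in for robust estimates
  dist[idx] <- sqrt(mahalanobis(xi, colMeans(xi), cov(xi)))
}

is_outlier <- dist > cutoff               # TRUE = far from its own class
table(grouping, is_outlier)               # flagged counts per class
```

A mislabeled sample and a genuinely abnormal one are handled identically here: both end up far from the center of the class they are labeled with.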
The estimation method is selected by the control object control. If a character string naming an estimator is specified, a new control object will be created and used (with default estimation options). If this argument is missing or the character string 'auto' is specified, the function will select the robust estimator according to the size of the data set; for details see CovRobust.
An S4 object of class OutlierMahdist, which is a subclass of the virtual class Outlier.
Valentin Todorov valentin.todorov@chello.at
P. J. Rousseeuw and B. C. Van Zomeren (1990). Unmasking multivariate outliers and leverage points. Journal of the American Statistical Association. Vol. 85(411), pp. 633-651.
P. J. Rousseeuw and A. M. Leroy (1987). Robust Regression and Outlier Detection. Wiley.
P. J. Rousseeuw and K. van Driessen (1999). A fast algorithm for the minimum covariance determinant estimator. Technometrics 41, 212–223.
Todorov V & Filzmoser P (2009). An Object Oriented Framework for Robust Multivariate Analysis. Journal of Statistical Software, 32(3), 1–47. doi:10.18637/jss.v032.i03.
Filzmoser P & Todorov V (2013). Robust tools for the imperfect world. Information Sciences 245, 4–20. doi:10.1016/j.ins.2012.10.017.
library(rrcovHD)                 # package providing OutlierMahdist()
data(hemophilia)                 # example data (from the rrcov package)
obj <- OutlierMahdist(gr ~ ., data = hemophilia)
obj
getDistance(obj)                 # returns an array of distances
getClassLabels(obj, 1)           # returns an array of indices for a given class
getCutoff(obj)                   # returns an array of cutoff values (for each class, usually equal)
getFlag(obj)                     # returns a 0/1 array of flags
plot(obj, class = 2)             # standard plot function