outlier | R Documentation |
The Mahalanobis distance is D^2 = (x-\mu)' \Sigma^{-1} (x-\mu), where \Sigma is the covariance of the x matrix. D2 may be used as a way of detecting outliers in a distribution. Large D2 values, compared to the expected Chi Square values, indicate an unusual response pattern. The mahalanobis function in stats does not handle missing data.
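The formula above can be sketched in base R. This is an illustration on complete data only, not the code used by outlier: the built-in iris data set, the 0.999 chi-square cutoff, and the variable names are all assumptions chosen for the example.

```r
# Illustrative only: D^2 computed by hand and with stats::mahalanobis.
x <- as.matrix(iris[, 1:4])      # built-in data set with no missing values
mu <- colMeans(x)                # sample estimate of \mu
Sigma <- cov(x)                  # sample estimate of \Sigma
centered <- sweep(x, 2, mu)      # subtract the column means from every row

# D^2 for each row: (x - mu)' Sigma^{-1} (x - mu)
d2_manual <- rowSums((centered %*% solve(Sigma)) * centered)

# The same quantity from the stats package
d2_stats <- mahalanobis(x, center = mu, cov = Sigma)

# Under multivariate normality D^2 is approximately chi-square distributed
# with df equal to the number of variables; 0.999 is an arbitrary choice here.
cutoff <- qchisq(0.999, df = ncol(x))
flagged <- which(d2_stats > cutoff)   # row indices of unusually large D^2
```

The hand computation and mahalanobis should agree up to floating point error. Because missing values are not handled here, data with NA entries would first need some treatment such as listwise deletion.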
outlier(x, plot = TRUE, bad = 5, na.rm = TRUE, xlab, ylab, ...)
x | A data matrix or data.frame |
plot | Plot the resulting QQ graph |
bad | Number of worst cases (largest D2 values) to label |
na.rm | Should missing data be deleted? |
xlab | Label for the x axis |
ylab | Label for the y axis |
... | More graphic parameters, e.g., cex=.8 |
Adapted from the mahalanobis function and help page from stats.
The D2 values for each case
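As a rough sketch of how D2 values can be compared with expected Chi Square values in a QQ graph, the following uses base R only. It is not the plotting code of outlier, which may differ; the iris data, the axis labels, and the choice of degrees of freedom equal to the number of variables are assumptions made for the illustration.

```r
# Illustrative QQ comparison of D^2 against chi-square quantiles.
x <- as.matrix(iris[, 1:4])                    # complete built-in data
d2 <- mahalanobis(x, colMeans(x), cov(x))      # D^2 for each case
n <- length(d2)

expected <- qchisq(ppoints(n), df = ncol(x))   # expected chi-square quantiles
observed <- sort(d2)                           # ordered D^2 values

plot(expected, observed,
     xlab = "Quantiles of Chi Square", ylab = "D2", cex = 0.8)
abline(0, 1)                                   # line of perfect agreement

# Label the 5 cases with the largest D^2 (compare bad = 5 in the usage)
worst <- order(d2, decreasing = TRUE)[1:5]
text(expected[n:(n - 4)], d2[worst], labels = worst, pos = 2)
```

Points that rise well above the reference line at the upper end are the cases whose D2 is larger than the chi-square distribution would predict.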
William Revelle
Yuan, Ke-Hai and Zhong, Xiaoling (2008). Outliers, Leverage Observations, and Influential Cases in Factor Analysis: Using Robust Procedures to Minimize Their Effect. Sociological Methodology, 38, 329-368.
mahalanobis
#first, just find and graph the outliers
d2 <- outlier(sat.act)
#combine with the data frame and plot it with the outliers highlighted in blue
sat.d2 <- data.frame(sat.act,d2)
pairs.panels(sat.d2,bg=c("yellow","blue")[(d2 > 25)+1],pch=21)