outl.det: Detection and Display of Outliers

Description

Wrapper for the detection of sample outliers by computation of Mahalanobis distances using cov.rob from package MASS. The (robust) square root distance from the center is displayed alongside a 2D mapping of the data and its confidence ellipse.

Usage

outl.det(x, method = "classical", conf.level = 0.975, 
            dimen=c(1,2), tol = 1e-7, plotting = TRUE)

Arguments

x

A data.frame or matrix.

method

The method to be used:

  • mve. Minimum volume ellipsoid.

  • mcd. Minimum covariance determinant.

  • classical. Classical product-moment.

For details, see cov.rob in package MASS.

conf.level

The confidence level for controlling the cutoff of Mahalanobis distances.

dimen

The two dimensions along which the tolerance ellipse and the data points are plotted.

tol

The tolerance to be used for computing Mahalanobis distances (see cov.rob in package MASS).

plotting

A logical value. If TRUE, the Mahalanobis distances are plotted against the sample index, together with the tolerance ellipse of the data samples.
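
The way method, conf.level and tol interact can be illustrated with a minimal sketch built directly on cov.rob (package MASS) and mahalanobis (package stats). This is not the package source: the helper name sketch_outl is hypothetical, and the cutoff sqrt(qchisq(conf.level, p)) is an assumption about how the confidence level might be turned into a distance threshold.

```r
library(MASS)  # provides cov.rob

## Hypothetical helper for illustration only; not part of the package.
sketch_outl <- function(x, method = "classical", conf.level = 0.975,
                        tol = 1e-7) {
  x   <- as.matrix(x)
  ## Location and scatter estimate ("classical", "mve" or "mcd")
  est <- cov.rob(x, method = method)
  ## mahalanobis() returns squared distances, so take the square root;
  ## tol is passed on to solve() when the covariance matrix is inverted
  d   <- sqrt(mahalanobis(x, center = est$center, cov = est$cov, tol = tol))
  ## Assumed cutoff: square root of a chi-square quantile with p degrees
  ## of freedom, where p is the number of columns of x
  cutoff <- sqrt(qchisq(conf.level, df = ncol(x)))
  list(outlier = which(d > cutoff), conf.level = conf.level,
       mah.dist = d, cutoff = cutoff)
}

## Simulated data: 50 regular points plus one gross outlier in row 51
set.seed(1)
x   <- rbind(matrix(rnorm(100), ncol = 2), c(10, 10))
res <- sketch_outl(x)
res$outlier   # indices of samples whose distance exceeds the cutoff
```

Switching method to "mve" or "mcd" in this sketch replaces the classical estimate with a robust one, which is generally less influenced by the outliers it is meant to reveal.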

Details

If n is the number of samples and p the number of variables per sample, the data set must satisfy n > p + 1. When it does not, PCA can be used to produce a smaller number of uncorrelated dimensions that capture the main sources of variation in the data. Because outliers are inherently difficult to define, including only the first few dimensions is almost always sufficient for computing Mahalanobis distances. However, in more complex designs involving several factors and/or multiple levels, different contributions to the overall variation modelled by PCA may be confounded in such a reduced space. In that situation, the initial data set should be decomposed into smaller problems in order to identify potential outlying behaviour.
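
The n > p + 1 requirement and the PCA workaround can be sketched with simulated data. The matrix below is made up for illustration, and keeping two principal components is an arbitrary choice, not a recommendation from the package.

```r
## Simulated data: 20 samples and 50 variables, so n > p + 1 does not hold
set.seed(42)
x <- matrix(rnorm(20 * 50), nrow = 20)
n <- nrow(x)
p <- ncol(x)
n > p + 1                      # FALSE: too few samples for 50 variables

## Project the samples onto the first two principal components
scores <- prcomp(x, scale. = FALSE)$x[, 1:2]
nrow(scores) > ncol(scores) + 1  # TRUE: the reduced data can be used

## Classical Mahalanobis distances in the reduced space
d <- sqrt(mahalanobis(scores, colMeans(scores), cov(scores)))
```

The reduced score matrix is the kind of input that would then be passed to outl.det in place of the original data.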

Value

A list with components:

outlier

List of outliers detected.

conf.level

Confidence level used.

mah.dist

Mahalanobis distances of each data sample.

cutoff

Cutoff of Mahalanobis distances used for outlier detection.

Author(s)

Wanchang Lin and David Enot

Examples

## load the package and the abr1 data set
library(FIEmspro)
data(abr1)
y   <- factor(abr1$fact$class)
x <- preproc(abr1$pos , y=y, method=c("log10","TICnorm"),add=1)[,110:1000]  
## Select classes 1 and 2
tmp <- dat.sel(x, y, choices=c("1","2"))
dat <- tmp$dat[[1]]
ind <- tmp$cl[[1]]


## dimension reduction by PCA
x   <- prcomp(dat, scale. = FALSE)$x

## perform and plot outlier detection using classical Mahalanobis distance
## on the first 2 PCA dimensions
res <- outl.det(x[,c(1,2)], method="classical",dimen=c(1,2),
                    conf.level = 0.975)

wilsontom/FIEmspro documentation built on Feb. 19, 2018, 9:03 a.m.