Function to Find Outliers for an epplab Object

Description

Function to decide whether observations are considered outliers in specific projection directions of an epplab object.

Usage

EPPlabOutlier(x, which = 1:ncol(x$PPdir), k = 3, location = mean,
  scale = sd)

Arguments

x

An object of class epplab.

which

The projection directions in which outliers should be searched. The default is to search all directions.

k

Numeric factor that determines how far from the location, in units of scale, an observation must lie to be considered an outlier. Default is 3. See details.

location

A function which gives the univariate location as an output. The default is mean.

scale

A function which gives the univariate scale as an output. The default is sd.

Details

Denote by location_j the location of the data projected onto the jth projection direction and, analogously, by scale_j its scale. An observation with projected value x is then an outlier in the jth projection direction if |x - location_j| >= k * scale_j.
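
As an illustration only, the rule can be sketched in base R. The helper name flag_outliers and the matrix proj of projected values (one column per direction) are assumptions made for this sketch; it is not taken from the package source.

```r
# Hypothetical helper, not part of the package: applies the rule
# |x - location_j| >= k * scale_j column by column.
flag_outliers <- function(proj, k = 3, location = mean, scale = sd) {
  proj <- as.matrix(proj)
  loc <- apply(proj, 2, location)   # location_j for each direction
  scl <- apply(proj, 2, scale)      # scale_j for each direction
  centered <- sweep(proj, 2, loc, "-")
  limits <- matrix(k * scl, nrow = nrow(proj), ncol = ncol(proj), byrow = TRUE)
  (abs(centered) >= limits) * 1L    # 0/1 matrix with the dimensions of proj
}

# Simulated projections in two directions, with one planted outlier.
set.seed(1)
proj <- matrix(rnorm(200), ncol = 2)
proj[1, 1] <- 9
res <- flag_outliers(proj, k = 3, location = median, scale = mad)
res[1, ]
```

Passing location and scale as functions mirrors the arguments described above, so a robust pair such as median and mad can be swapped in without changing the helper.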

Robust location and scale measures, such as median and mad, are best suited to this purpose, because non-robust measures such as mean and sd are themselves distorted by the outliers they are meant to detect.
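
A small base R illustration of this point, using a made-up vector x rather than projections from an epplab object: with mean and sd the single gross outlier inflates the scale so much that it is not flagged, whereas median and mad flag it.

```r
# Made-up data: four ordinary values and one gross outlier.
x <- c(1, 2, 3, 4, 100)
k <- 3

# Non-robust rule: the outlier inflates sd(x), so nothing reaches the threshold.
classical <- abs(x - mean(x)) >= k * sd(x)

# Robust rule: median(x) and mad(x) are barely affected by the outlier.
robust <- abs(x - median(x)) >= k * mad(x)

which(classical)  # integer(0): the outlier is masked
which(robust)     # 5: only the value 100 is flagged
```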

Value

A list with class 'epplabOutlier' containing the following components:

outlier

A matrix with only zeros and ones. A value of 1 classifies the observation as an outlier in this projection direction.
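
A 0/1 matrix of this kind can be summarised with base R. The sketch below uses a hand-made stand-in matrix and assumes that rows correspond to observations and columns to projection directions; that layout, and the names out, n_flagged and flagged, are assumptions for illustration.

```r
# Hand-made stand-in for the 'outlier' component (assumed layout:
# rows = observations, columns = projection directions).
out <- matrix(c(1, 1, 0,
                0, 0, 0,
                0, 1, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste("obs", 1:3, sep = "."),
                              paste0("dir", 1:3)))

# Number of directions in which each observation is flagged.
n_flagged <- rowSums(out)

# Observations flagged in at least one direction.
flagged <- names(n_flagged)[n_flagged > 0]
flagged
```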

k

The factor k used.

location

The name of the location estimator used.

scale

The name of the scale estimator used.

PPindex

The name of the projection index (PPindex) used.

PPalg

The name of the optimization algorithm (PPalg) used.

Author(s)

Klaus Nordhausen

References

Ruiz-Gazen, A., Larabi Marie-Sainte, S. and Berro, A. (2010), Detecting multivariate outliers using projection pursuit with particle swarm optimization, COMPSTAT2010, pp. 89-98.

See Also

EPPlab

Examples

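# load the package that provides EPPlab and EPPlabOutlier
library(REPPlab)

# creating data with 3 outliers
set.seed(1)  # makes the simulated data reproducible
n <- 300
p <- 10
X <- matrix(rnorm(n * p), ncol = p)
X[1, 1] <- 9
X[2, 4] <- 7
X[3, 6] <- 8
# giving the data rownames; obs.1, obs.2 and obs.3 are the outliers
rownames(X) <- paste("obs", 1:n, sep = ".")

# run projection pursuit, then flag outliers using robust location and scale
PP <- EPPlab(X, PPalg = "PSO", PPindex = "KurtosisMax", n.simu = 20, maxiter = 20)
OUT <- EPPlabOutlier(PP, k = 3, location = median, scale = mad)
OUT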