Func.ABOD: Angle-based outlier detection (ABOD) algorithm


View source: R/HighDimOut.R

Description

This function performs the basic and approximated versions of the angle-based outlier detection (ABOD) algorithm. ABOD is especially useful for high-dimensional data, because angles are a more robust measure than distances in high-dimensional space. The basic version calculates the angle variance from the whole data set; its results are more reliable, but it can be very slow. The approximated version calculates the angle variance from a subset of the data, which increases calculation speed. This function is based on Kriegel, H.-P., Schubert, M., Zimek, A., Angle-based outlier detection in high-dimensional data, 2008.
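To make the idea of angle variance concrete, here is a minimal base-R sketch. It is not the code used by Func.ABOD: the helper name angle_variance is hypothetical, the data are simulated, and the measure is a simplified, unweighted variant of the quantity described by Kriegel et al. It also assumes a numeric matrix with no duplicated rows.

```r
# Hypothetical helper (not part of HighDimOut): simplified angle variance of point i.
# X is a numeric matrix; idx holds the row indices of the reference points.
angle_variance <- function(X, i, idx = setdiff(seq_len(nrow(X)), i)) {
  D  <- sweep(X[idx, , drop = FALSE], 2, X[i, ])  # difference vectors from point i
  sq <- rowSums(D^2)                              # squared lengths of those vectors
  G  <- D %*% t(D)                                # dot products for every pair (B, C)
  W  <- G / outer(sq, sq)                         # <AB, AC> / (|AB|^2 * |AC|^2)
  var(W[upper.tri(W)])                            # variance over distinct pairs
}

set.seed(1)
X <- rbind(matrix(rnorm(60), ncol = 2), c(10, 10))  # 30 clustered points + 1 far point
scores <- sapply(seq_len(nrow(X)), function(i) angle_variance(X, i))
which.min(scores)  # index of the point with the smallest angle variance
```

A point far from the cluster sees all other points within a narrow range of directions, so its angle variance should be small compared with points inside the cluster.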

Usage

Func.ABOD(data, basic = FALSE, perc)

Arguments

data

is the data frame containing the observations. Each row represents an observation and each variable is stored in one column.

basic

is a logical value indicating whether the basic method is used. The basic version can be very slow on large data sets.

perc

defines the proportion of the data to use when calculating the angle variance (for example, perc = 0.2 uses 20% of the data). It is only needed when basic = FALSE.
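The effect of perc can be illustrated with a small base-R sketch. This is an assumption-laden illustration, not the package's implementation: the helper angle_variance is hypothetical, the measure is a simplified unweighted angle variance, and drawing the reference subset at random is an assumption, since this page does not say how Func.ABOD chooses the subset.

```r
# Hypothetical helper (not part of HighDimOut): simplified angle variance of
# point i, computed only against the reference rows listed in idx.
angle_variance <- function(X, i, idx) {
  D  <- sweep(X[idx, , drop = FALSE], 2, X[i, ])  # difference vectors from point i
  sq <- rowSums(D^2)
  W  <- (D %*% t(D)) / outer(sq, sq)
  var(W[upper.tri(W)])
}

set.seed(42)
X    <- rbind(matrix(rnorm(200), ncol = 2), c(8, 8))  # 100 clustered points + 1 far point
perc <- 0.2                                           # use 20% of the other points
n    <- nrow(X)

approx_scores <- sapply(seq_len(n), function(i) {
  others <- setdiff(seq_len(n), i)
  # Assumption: the subset is a simple random sample of the remaining points
  idx <- sample(others, size = ceiling(perc * length(others)))
  angle_variance(X, i, idx)
})

order(approx_scores)[1:5]  # indices of the five smallest approximate scores
```

Each score is computed from about 20 reference points instead of 100, so the number of pairs per point drops from 4950 to 190, at the cost of noisier estimates.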

Value

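The function returns a vector containing the angle variance for each observation. As in the example below, observations with the smallest angle variance are the most likely outliers.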

Examples

library(HighDimOut)
library(ggplot2)
# Approximated ABOD on the first two columns of TestData
res.ABOD <- Func.ABOD(data = TestData[, 1:2], basic = FALSE, perc = 0.2)
data.temp <- TestData[, 1:2]
# Label the 10 observations with the smallest angle variance as outliers
data.temp$Ind <- "Inlier"
data.temp[order(res.ABOD, decreasing = FALSE)[1:10], "Ind"] <- "Outlier"
data.temp$Ind <- factor(data.temp$Ind)
# Plot the observations, colored and shaped by label
ggplot(data = data.temp) + geom_point(aes(x = x, y = y, color = Ind, shape = Ind))

Example output

Warning message:
executing %dopar% sequentially: no parallel backend registered 
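
The warning above comes from the foreach package and indicates that a %dopar% loop ran sequentially because no parallel backend was registered. The sketch below shows one way to register a backend with the doParallel package. It assumes that foreach and doParallel are installed, and it uses a toy loop in place of Func.ABOD; the worker count of 2 is an arbitrary choice.

```r
library(foreach)
library(doParallel)

cl <- parallel::makeCluster(2)  # two worker processes; adjust to your machine
registerDoParallel(cl)          # %dopar% loops now run on these workers

# Toy loop standing in for a function that uses %dopar% internally
squares <- foreach(i = 1:3, .combine = c) %dopar% i^2

parallel::stopCluster(cl)       # release the workers
registerDoSEQ()                 # register an explicit sequential backend again
```

Calling registerDoSEQ() on its own, without a cluster, is another way to avoid the warning when parallel execution is not needed.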

HighDimOut documentation built on May 2, 2019, 12:16 p.m.