This function performs the basic and approximated versions of the angle-based outlier detection (ABOD) algorithm. The ABOD method is especially useful for high-dimensional data, because angles are a more robust measure than distances in high-dimensional space. The basic version calculates the angle variance from the whole data set. Its results are more reliable, but it can be very slow. The approximated version calculates the angle variance from a subset of the data, which speeds up the calculation. This function is based on the work of Kriegel, H.-P., Schubert, M., Zimek, A., Angle-based outlier detection in high-dimensional data, 2008.
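The difference between the two versions can be illustrated with a minimal sketch. It is not the package's implementation: `angle_variance` and `abod_sketch` are hypothetical names, the variance is a plain (unweighted) variance of the distance-scaled dot products rather than the weighted variance of the 2008 paper, and the subset for the approximated version is assumed to be a simple random sample, which may differ from how `Func.ABOD` chooses it.

```r
# Hypothetical helper (not part of the package): angle variance of one point
# `a` with respect to the rows of the matrix `others`.
angle_variance <- function(a, others) {
  # Difference vectors from a to every other point, one per row.
  diffs <- sweep(others, 2, a)
  sq_norms <- rowSums(diffs^2)
  # Points identical to a define no direction, so drop them.
  keep <- sq_norms > 0
  diffs <- diffs[keep, , drop = FALSE]
  sq_norms <- sq_norms[keep]
  # At least three points are needed to get more than one pair.
  if (nrow(diffs) < 3) return(NA_real_)
  # Dot product of every pair of difference vectors, divided by the
  # product of their squared lengths.
  scaled <- (diffs %*% t(diffs)) / outer(sq_norms, sq_norms)
  # Use each unordered pair of distinct points once.
  var(scaled[upper.tri(scaled)])
}

# Hypothetical wrapper: basic = TRUE uses all other points,
# basic = FALSE uses a random fraction `perc` of them (an assumption).
abod_sketch <- function(data, basic = TRUE, perc = 0.2) {
  X <- as.matrix(data)
  n <- nrow(X)
  vapply(seq_len(n), function(i) {
    candidates <- setdiff(seq_len(n), i)
    if (!basic) {
      m <- min(length(candidates), max(3, ceiling(perc * length(candidates))))
      candidates <- candidates[sample.int(length(candidates), m)]
    }
    angle_variance(X[i, ], X[candidates, , drop = FALSE])
  }, numeric(1))
}

# Toy data: a 2-D Gaussian cluster of 50 points plus one isolated point.
set.seed(1)
pts <- rbind(matrix(rnorm(100), ncol = 2), c(8, 8))
scores <- abod_sketch(pts, basic = TRUE)
approx_scores <- abod_sketch(pts, basic = FALSE, perc = 0.2)
# The isolated point sees all other points in nearly the same direction,
# so its angle variance is the smallest.
which.min(scores)
```

The basic version examines every pair of other points for each observation, so its cost grows roughly cubically with the number of rows. Sampling a subset reduces the number of pairs, at the price of noisier scores.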
data | A data frame containing the observations. Each row represents an observation and each column holds one variable.
basic | A logical value indicating whether the basic method is used. The basic version can be very slow when the data set is large.
perc | The percentage of the data to use when calculating the angle variance, given as a fraction (the example below passes perc = 0.2). It is only needed when basic = FALSE.
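The function returns a vector containing the angle variance for each observation. Observations with the smallest angle variance are the most likely outliers, which is why the example below flags the 10 lowest values.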
library(ggplot2)
# Score the first two columns with the approximated version (perc = 0.2).
res.ABOD <- Func.ABOD(data=TestData[,1:2], basic=FALSE, perc=0.2)
data.temp <- TestData[,1:2]
data.temp$Ind <- NA
# Label the 10 observations with the smallest angle variance as outliers.
data.temp[order(res.ABOD, decreasing = FALSE)[1:10],"Ind"] <- "Outlier"
data.temp[is.na(data.temp$Ind),"Ind"] <- "Inlier"
data.temp$Ind <- factor(data.temp$Ind)
# Plot the points, colored and shaped by their label.
ggplot(data = data.temp) + geom_point(aes(x = x, y = y, color=Ind, shape=Ind))