View source: R/greedy_ensemble.R
greedy_ensemble | R Documentation |
This function computes an ensemble score using the greedy algorithm in the paper titled Evaluation of Outlier Rankings and Outlier Scores by Schubert et al (2012) <doi:10.1137/1.9781611972825.90>. The greedy ensemble is detailed in Section 4.3.
greedy_ensemble(X, kk = 5)
X |
The input data containing the outlier scores in a dataframe, matrix or tibble format. Rows contain observations and columns contain outlier detection methods. |
kk |
The number of estimated outliers. |
A list with the components:
scores |
The ensemble scores. |
methods |
The methods that are chosen for the ensemble. |
chosen |
The chosen subset of original anomaly scores. |
set.seed(123)
if (requireNamespace("dbscan", quietly = TRUE)){
X <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
X[199, ] <- c(4, 4)
X[200, ] <- c(-3, 5)
# Using different parameters of lof for anomaly detection
y1 <- dbscan::lof(X, minPts = 10)
y2 <- dbscan::lof(X, minPts = 20)
knnobj <- dbscan::kNN(X, k = 20)
# Using different KNN distances as anomaly scores
y3 <- knnobj$dist[ ,10]
y4 <- knnobj$dist[ ,20]
# Dense points are less anomalous. Hence 1 - pointdensity is used.
y5 <- 1 - dbscan::pointdensity(X, eps = 0.8, type = "gaussian")
y6 <- 1 - dbscan::pointdensity(X, eps = 0.5, type = "gaussian")
Y <- cbind.data.frame(y1, y2, y3, y4, y5, y6)
ens <- greedy_ensemble(Y, kk=5)
ens$scores
}
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.