Description Usage Arguments Details Value Author(s) References Examples
Calculates the aggregated distance to the k nearest neighbors over a range of k values, for use as an outlier score. Suggested by Angiulli, F., & Pizzuti, C. (2002).
KNN_AGG(dataset, k_min = 5, k_max = 10)
dataset: The dataset for which each observation's aggregated k-nearest-neighbors distance is returned.
k_min: The smallest k in the k-range.
k_max: The largest k in the k-range. Must be smaller than the number of observations in dataset and greater than or equal to k_min.
KNN_AGG computes an aggregated distance to neighboring observations by combining the results from k_min-NN through k_max-NN. For example, with k_min=1 and k_max=3, the results from 1NN, 2NN and 3NN are aggregated. The nearest-neighbor search uses a kd-tree, via the kNN() function from the 'dbscan' package. KNN_AGG is useful for outlier detection in clustering and other multidimensional domains.
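The aggregation described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the helper name knn_agg_sketch is hypothetical, the per-k result is assumed to be the summed distance to the k nearest neighbors (the weight used by Angiulli & Pizzuti), the aggregate is assumed to be the sum of those per-k results, and a brute-force distance matrix stands in for the kd-tree search. Scores from this sketch may therefore differ from what KNN_AGG returns.

```r
# Hypothetical helper for illustration only; not part of any package.
knn_agg_sketch <- function(dataset, k_min = 5, k_max = 10) {
  x <- as.matrix(dataset)
  n <- nrow(x)
  # Same constraints as documented for k_min and k_max.
  stopifnot(k_min >= 1, k_max >= k_min, k_max < n)

  # Brute-force pairwise Euclidean distances (assumption: stands in for
  # the kd-tree search; needs O(n^2) memory).
  d <- as.matrix(dist(x))

  vapply(seq_len(n), function(i) {
    # Distances from observation i to its k_max nearest neighbors,
    # excluding the observation itself.
    nn <- sort(d[i, -i])[seq_len(k_max)]
    # Element k of the cumulative sum is the summed distance to the
    # k nearest neighbors (assumed per-k result).
    per_k <- cumsum(nn)
    # Aggregate the per-k results over k_min..k_max (assumed: plain sum).
    sum(per_k[k_min:k_max])
  }, numeric(1))
}

scores <- knn_agg_sketch(iris[, 1:4], k_min = 10, k_max = 15)
head(order(scores, decreasing = TRUE))
```

The cumulative sum avoids recomputing the neighbor search for each k: one search for k_max neighbors is enough to derive every smaller k in the range.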
A vector of aggregated distances, one per observation. The greater the distance, the greater the outlierness.
Jacob H. Madsen
Angiulli, F., & Pizzuti, C. (2002). Fast Outlier Detection in High Dimensional Spaces. In Int. Conf. on Knowledge Discovery and Data Mining (SIGKDD). Helsinki, Finland. pp. 15-26. DOI: 10.1007/3-540-45681-3_2
# Create dataset
X <- iris[,1:4]
# Find outliers by setting a range of k's
outlier_score <- KNN_AGG(dataset=X, k_min=10, k_max=15)
# Sort and find index for most outlying observations
names(outlier_score) <- 1:nrow(X)
sort(outlier_score, decreasing = TRUE)
# Inspect the distribution of outlier scores
hist(outlier_score)
119 118 132 107 123 99 42 110
84.14264 81.98029 81.18188 67.68582 66.93006 65.87829 62.35515 61.82369
61 136 58 106 94 16 135 108
61.44617 59.08369 58.91190 57.98081 55.76996 55.11513 52.39217 52.36970
130 109 115 131 69 101 15 120
51.90887 51.18809 50.56004 50.37625 49.44191 48.97403 48.26370 47.99385
126 88 63 51 149 23 80 142
47.43669 46.64301 46.08609 43.78419 43.35963 43.35131 42.87854 42.08205
60 114 65 86 137 14 85 19
41.94065 41.20712 41.13602 40.77370 40.07189 39.89360 39.88596 39.70539
34 122 103 53 147 77 45 33
39.57289 39.35464 39.34954 39.29469 37.06554 36.72283 36.65728 36.35472
82 78 111 145 104 57 74 73
36.33810 36.27574 35.73859 35.68395 35.66022 35.64443 35.60047 35.52424
71 25 144 134 54 146 116 72
35.46824 35.16641 35.14174 34.93118 34.85844 34.69579 34.61560 34.28195
91 133 138 112 66 140 150 125
34.18726 33.99974 33.82144 33.59369 33.50269 33.38674 33.15171 33.09626
121 17 67 105 129 84 102 143
32.99146 32.84282 32.80132 32.72743 32.32921 32.21023 31.86692 31.86692
81 6 75 117 52 37 98 62
31.83292 31.83211 31.79915 31.78273 31.29275 30.85185 30.84422 30.79344
148 55 76 124 59 113 9 87
30.71950 30.69285 30.64318 30.62567 30.53929 30.46482 30.36945 30.35838
68 141 127 44 21 64 79 56
30.32139 30.15249 30.07458 29.94084 29.82010 29.79692 29.62591 29.60918
139 24 128 92 89 39 43 90
29.52196 29.39869 29.20391 28.96927 28.80427 27.98737 27.97012 27.96153
96 7 70 32 83 93 36 95
27.65555 27.25100 27.13429 26.99949 26.91009 26.25722 25.53690 24.78269
26 11 47 97 100 38 20 12
24.46208 24.45217 24.08901 23.88099 23.22331 23.02984 22.84867 22.64447
22 49 27 3 46 30 13 48
21.98652 21.78302 21.49228 21.17599 20.84591 20.53955 20.30874 20.24473
4 2 41 31 10 29 50 5
20.14937 19.70533 19.63242 18.84399 18.62495 18.52104 18.27333 17.66272
28 35 40 8 18 1
17.59896 17.20233 16.83494 16.54371 16.11287 15.53908
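When only the most outlying observations are of interest, the full sorted listing can be reduced to a short selection. The sketch below uses a small stand-in vector whose names and values are copied from the first row of the sorted output above. The variable names top_n and flagged and the 0.9 quantile cutoff are illustrative assumptions, not recommendations from the package.

```r
# Stand-in for the vector returned by KNN_AGG(); values copied from the
# first row of the sorted output, deliberately shuffled.
outlier_score <- c("42" = 62.35515, "119" = 84.14264, "107" = 67.68582,
                   "132" = 81.18188, "118" = 81.98029, "99" = 65.87829)

# Names (observation indices) of the top_n highest-scoring observations.
top_n <- 3
top_idx <- names(sort(outlier_score, decreasing = TRUE))[seq_len(top_n)]
top_idx

# Alternative: flag every observation above a chosen quantile of the scores.
# The 0.9 cutoff is arbitrary and only for illustration.
flagged <- outlier_score > quantile(outlier_score, 0.9)
names(outlier_score)[flagged]
```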