Function to calculate a Local Density Estimate (LDE) and a Local Density Factor (LDF), as an outlier score, with a Gaussian kernel. Suggested by Latecki, L., Lazarevic, A. & Pokrajac, D. (2007).
dataset | The dataset for which each observation receives an LDE and LDF score
k | The number of nearest neighbors used in the density comparison. k must be smaller than the number of observations in dataset
h | User-given bandwidth for the kernel functions. The greater the bandwidth, the smoother the kernels and the less weight is put on outliers. Default is 1
c | Scaling constant for comparing an observation's LDE with that of its neighbors. LDF compares the average LDE of an observation's neighbors with the observation's own LDE. Thus c=1 yields an LDF between 0 and 1, while c=0 can produce very large or infinite LDF values. Default is 1
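The effect of c is easiest to see from the ratio itself. The following is a sketch of the LDF definition as it is usually stated for Latecki et al. (2007); the shorthand \overline{\mathrm{LDE}}(x) for the mean LDE over the k nearest neighbors of x is introduced here for illustration and is not notation from this page.

```latex
\mathrm{LDF}(x) \;=\; \frac{\overline{\mathrm{LDE}}(x)}{\mathrm{LDE}(x) + c \cdot \overline{\mathrm{LDE}}(x)},
\qquad
\overline{\mathrm{LDE}}(x) \;=\; \frac{1}{k} \sum_{y \in \mathrm{kNN}(x)} \mathrm{LDE}(y)
```

With c=1 the denominator is never smaller than the numerator, so the score stays between 0 and 1. With c=0 the score grows without bound as LDE(x) approaches 0.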
LDF computes a kernel density estimate, called LDE, over a user-given number of k-nearest neighbors. The LDF score compares the Local Density Estimate (LDE) of an observation with that of its neighboring observations. An observation with a greater LDE than its neighbors shows little outlierness, whereas an observation with a smaller LDE than its neighbors shows great outlierness. The kNN computation uses a kd-tree via the kNN() function from the 'dbscan' package. The LDF function is useful for outlier detection in clustering and other multidimensional domains.
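To make the computation concrete, here is a minimal base-R sketch of the LDE and LDF steps. It is not the package's implementation: the function name lde_ldf_sketch is hypothetical, the neighbor search is brute force with dist() rather than a kd-tree, and the kernel with reachability distance and local bandwidth h times the k-distance follows one reading of Latecki et al. (2007), which is an assumption. Scores from this sketch need not match those returned by LDF().

```r
# Hypothetical helper for illustration only; not the package's LDF().
lde_ldf_sketch <- function(X, k = 5, h = 1, c = 1) {
  X <- as.matrix(X)
  n <- nrow(X)
  d <- ncol(X)
  stopifnot(k < n)

  # Brute-force pairwise Euclidean distances (the package uses a kd-tree instead)
  D <- as.matrix(dist(X))
  diag(D) <- Inf  # an observation is not its own neighbor

  # Indices of the k nearest neighbors, one row per observation
  nn <- matrix(0L, nrow = n, ncol = k)
  for (i in seq_len(n)) nn[i, ] <- order(D[i, ])[seq_len(k)]

  # Distance from each observation to its k-th nearest neighbor
  kdist <- D[cbind(seq_len(n), nn[, k])]

  # LDE: mean Gaussian kernel value over the neighbors (assumed form)
  lde <- numeric(n)
  for (i in seq_len(n)) {
    y <- nn[i, ]
    reach <- pmax(D[i, y], kdist[y])  # reachability distance to each neighbor
    bw <- h * kdist[y]                # local bandwidth of each neighbor
    lde[i] <- mean(exp(-reach^2 / (2 * bw^2)) / ((2 * pi)^(d / 2) * bw^d))
  }

  # LDF: mean neighbor LDE relative to own LDE, scaled by c
  nb_mean <- vapply(seq_len(n), function(i) mean(lde[nn[i, ]]), numeric(1))
  list(LDE = lde, LDF = nb_mean / (lde + c * nb_mean))
}

# Usage on the iris measurements
res <- lde_ldf_sketch(iris[, 1:4], k = 10, h = 2, c = 1)
head(order(res$LDF, decreasing = TRUE))  # row indices with the highest scores
```

The brute-force distance matrix keeps the sketch dependency-free but needs memory quadratic in the number of observations, which is the cost the kd-tree search avoids.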
LDE | A vector of Local Density Estimates for the observations. The greater the LDE, the more central the observation
LDF | A vector of Local Density Factors for the observations. The greater the LDF, the greater the outlierness
Jacob H. Madsen
Latecki, L., Lazarevic, A. & Pokrajac, D. (2007). Outlier Detection with Kernel Density Functions. International Workshop on Machine Learning and Data Mining in Pattern Recognition. pp. 61-75. DOI: 10.1007/978-3-540-73499-4_6
# Load the package that provides LDF()
library(DDoutlier)
# Create dataset
X <- iris[,1:4]
# Compute LDF outlier scores with k=10 neighbors, bandwidth h=2 and c=1
outlier_score <- LDF(dataset=X, k=10, h=2, c=1)$LDF
# Name scores by row index, then sort to find the most outlying observations
names(outlier_score) <- 1:nrow(X)
sort(outlier_score, decreasing = TRUE)
# Inspect the distribution of outlier scores
hist(outlier_score)
58 16 42 99 15 132 118 119
0.7276217 0.7254735 0.7228736 0.7171022 0.7121239 0.7083509 0.7060269 0.7043789
94 61 123 106 136 34 131 108
0.6895826 0.6888589 0.6827138 0.6804717 0.6691665 0.6676980 0.6647350 0.6639056
19 45 107 17 23 6 11 33
0.6600106 0.6548252 0.6538390 0.6310541 0.6256664 0.6251220 0.6100369 0.6056570
14 88 69 63 82 120 110 47
0.6010961 0.5934501 0.5809000 0.5788718 0.5774531 0.5683943 0.5676850 0.5646755
130 126 32 135 87 39 43 37
0.5646600 0.5637191 0.5635693 0.5609338 0.5587530 0.5582367 0.5581898 0.5571707
80 115 114 65 21 24 101 81
0.5553393 0.5542288 0.5531827 0.5523997 0.5512975 0.5491356 0.5394798 0.5385490
44 137 25 77 9 7 109 53
0.5374027 0.5353577 0.5353295 0.5347614 0.5341326 0.5324061 0.5312280 0.5307166
22 51 3 73 103 122 148 102
0.5293764 0.5293138 0.5262312 0.5260957 0.5255984 0.5255297 0.5236574 0.5225730
143 71 49 90 141 4 70 98
0.5225730 0.5224128 0.5204007 0.5189761 0.5185947 0.5177519 0.5176837 0.5166334
86 20 145 12 62 52 66 112
0.5161136 0.5156641 0.5152552 0.5138377 0.5131981 0.5131650 0.5131457 0.5128695
8 128 75 31 60 85 149 54
0.5119967 0.5115152 0.5114274 0.5111306 0.5107272 0.5104499 0.5103985 0.5099480
105 150 56 84 111 121 67 2
0.5080799 0.5075490 0.5072416 0.5071443 0.5066495 0.5063696 0.5056844 0.5054519
48 147 95 144 5 92 97 72
0.5054358 0.5048213 0.5047479 0.5046361 0.5033983 0.5032228 0.5028540 0.5010669
146 104 125 27 35 74 78 116
0.5006804 0.5004960 0.5004622 0.5003949 0.4997283 0.4993634 0.4986958 0.4978644
139 134 91 142 18 13 46 10
0.4972717 0.4969446 0.4959750 0.4950688 0.4945860 0.4938775 0.4938775 0.4937073
59 76 127 64 140 117 138 79
0.4924629 0.4921091 0.4916968 0.4913452 0.4910859 0.4906247 0.4897488 0.4897391
50 30 57 36 113 1 83 40
0.4888637 0.4876497 0.4866397 0.4863480 0.4858641 0.4851776 0.4848402 0.4840054
93 28 96 100 41 89 55 38
0.4832600 0.4828004 0.4820199 0.4808723 0.4802692 0.4800804 0.4791818 0.4782170
26 29 133 129 124 68
0.4777825 0.4772532 0.4769265 0.4719941 0.4718194 0.4690840