LOF: Local Outlier Factor (LOF) algorithm

Description Usage Arguments Details Value Author(s) References Examples

Description

Function to calculate the Local Outlier Factor (LOF) as an outlier score for observations. Suggested by Breunig, M. M., Kriegel, H.-P., Ng, R. T., & Sander, J. (2000)

Usage

1
LOF(dataset, k = 5)

Arguments

dataset

The dataset for which observations have an LOF score returned

k

The number of k-nearest neighbors to compare density with. k has to be smaller than number of observations in dataset

Details

LOF computes a local density for observations with a user-given k-nearest neighbors. The density is compared to the density of the respective nearest neighbors, resulting in the local outlier factor. A kd-tree is used for kNN computation, using the kNN() function from the 'dbscan' package. The LOF function is useful for outlier detection in clustering and other multidimensional domains

Value

A vector of LOF scores for observations. The greater the LOF, the greater outlierness

Author(s)

Jacob H. Madsen

References

Breunig, M. M., Kriegel, H.-P., Ng, R. T., & Sander, J. (2000). LOF: Identifying Density-Based Local Outliers. In Int. Conf. On Management of Data. Dallas, TX. pp. 93-104. DOI: 10.1145/342009.335388

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
# Create dataset
X <- iris[,1:4]

# Find outliers by setting an optional k
outlier_score <- LOF(dataset=X, k=10)

# Sort and find index for most outlying observations
names(outlier_score) <- 1:nrow(X)
sort(outlier_score, decreasing = TRUE)

# Inspect the distribution of outlier scores
hist(outlier_score)

jhmadsen/DDoutlier documentation built on June 6, 2019, 11:53 p.m.