LOF: Local Outlier Factor (LOF)

Description Usage Arguments References See Also Examples

View source: R/outlier-dist.R

Description

LOF: Identifying Density-Based Local Outliers.

Usage

1
2
3
4
5
6
7
8
LOF(
  U,
  seq_k = c(4, 10, 30),
  combine = max,
  robMaha = FALSE,
  log = TRUE,
  ncores = 1
)

Arguments

U

A matrix, from which to detect outliers (rows). E.g. PC scores.

seq_k

Sequence of numbers of nearest neighbors to use. If multiple k are provided, this returns the combination of statistics. Default is c(4, 10, 30) and use max to combine (see combine).

combine

How to combine results for multiple k? Default uses max.

robMaha

Whether to use a robust Mahalanobis distance instead of the normal euclidean distance? Default is FALSE, meaning using euclidean.

log

Whether to return the logarithm of LOFs? Default is TRUE.

ncores

Number of cores to use. Default is 1.

References

Breunig, Markus M., et al. "LOF: identifying density-based local outliers." ACM sigmod record. Vol. 29. No. 2. ACM, 2000.

See Also

prob_dist()

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
X <- readRDS(system.file("testdata", "three-pops.rds", package = "bigutilsr"))
svd <- svds(scale(X), k = 10)

llof <- LOF(svd$u)
hist(llof, breaks = nclass.scottRob)
tukey_mc_up(llof)

llof_maha <- LOF(svd$u, robMaha = TRUE)
hist(llof_maha, breaks = nclass.scottRob)
tukey_mc_up(llof_maha)

lof <- LOF(svd$u, log = FALSE)
hist(lof, breaks = nclass.scottRob)
str(hist_out(lof))
str(hist_out(lof, nboot = 100))
str(hist_out(lof, nboot = 100, breaks = "FD"))

bigutilsr documentation built on April 14, 2021, 1:06 a.m.