CpKnnCad: Classic processing KNN based Conformal Anomaly Detector...

CpKnnCad R Documentation

Classic processing KNN based Conformal Anomaly Detector (KNN-CAD)

Description

CpKnnCad detects anomalies in a dataset using classic processing based on the KNN-CAD algorithm. KNN-CAD is a model-free anomaly detection method for univariate time series that adapts to non-stationarity in the data stream and provides probabilistic abnormality scores based on the conformal prediction paradigm.

Usage

CpKnnCad(
  data,
  n.train,
  threshold = 1,
  l = 19,
  k = 27,
  ncm.type = "ICAD",
  reducefp = TRUE
)

Arguments

data

Numerical vector containing both the training and the test data.

n.train

Number of points of the dataset that correspond to the training set.

threshold

Anomaly threshold.

l

Window length.

k

Number of neighbours to take into account.

ncm.type

Non-conformity measure to use: "ICAD" or "LDCD".

reducefp

If TRUE, reduces false positives.

Details

data must be a numerical vector without NA values. threshold must be a numeric value between 0 and 1. If the anomaly score obtained for an observation is greater than the threshold, the observation is considered abnormal. l must be a numerical value between 1 and n, where n is the length of the training data. The value of l has a direct impact on the computational cost, so very high values lengthen the execution time. k must be a numerical value smaller than n.train. ncm.type determines the non-conformity measure to be used: ICAD calculates dissimilarity as the sum of the distances to the k nearest neighbours, and LDCD as their average.
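
The difference between the two non-conformity measures can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper name knn_ncm, the use of stats::embed() to turn the series into windows of length l, and the Euclidean distance are all assumptions made for illustration.

```r
# Hypothetical helper (not part of otsad): non-conformity of a new window
# with respect to a set of reference windows, using its k nearest neighbours.
knn_ncm <- function(reference, window, k, ncm.type = c("ICAD", "LDCD")) {
  ncm.type <- match.arg(ncm.type)
  stopifnot(k >= 1, k <= nrow(reference))
  # Euclidean distance from the new window to every reference window
  # (the metric is an assumption of this sketch)
  d <- sqrt(rowSums(sweep(reference, 2, window)^2))
  nearest <- sort(d)[seq_len(k)]
  # ICAD: sum of the k nearest distances; LDCD: their average
  if (ncm.type == "ICAD") sum(nearest) else mean(nearest)
}

# Toy series turned into overlapping windows of length l (one window per row)
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 50)
l <- 3
windows <- embed(x, l)                      # length(x) - l + 1 rows, l columns
reference <- windows[-nrow(windows), , drop = FALSE]
newest <- windows[nrow(windows), ]          # window containing the spike

icad <- knn_ncm(reference, newest, k = 2, ncm.type = "ICAD")
ldcd <- knn_ncm(reference, newest, k = 2, ncm.type = "LDCD")
```

For a fixed set of neighbours the two values differ only by the factor k. The sketch also shows why l affects run time: every distance is computed over l coordinates.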

Value

Dataset consisting of the following columns:

is.anomaly

1 if the value is anomalous, 0 otherwise.

anomaly.score

Probability of anomaly.
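
The returned columns can be inspected with ordinary data frame operations. The sketch below uses a small hand-made stand-in for the result, assuming it behaves like a data frame with the two columns above. The values are invented purely for illustration and are not output of CpKnnCad.

```r
# Stand-in for a CpKnnCad result; values are made up for illustration only
result <- data.frame(
  is.anomaly    = c(0, 0, 1, 0, 1),
  anomaly.score = c(0.10, 0.35, 1.00, 0.20, 1.00)
)

# Positions of the observations flagged as anomalous
anomalous.idx <- which(result$is.anomaly == 1)

# Scores of the flagged observations
flagged.scores <- result$anomaly.score[anomalous.idx]

# Share of observations flagged as anomalous
anomaly.rate <- mean(result$is.anomaly == 1)
```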

References

V. Ishimtsev, I. Nazarov, A. Bernstein and E. Burnaev. Conformal k-NN Anomaly Detector for Univariate Data Streams. ArXiv e-prints, June 2017.

Examples

## Generate data
set.seed(100)
n <- 350
x <- sample(1:100, n, replace = TRUE)
x[70:90] <- sample(110:115, 21, replace = TRUE)
x[25] <- 200
x[320] <- 170
df <- data.frame(timestamp = 1:n, value = x)

## Set parameters
params.KNN <- list(threshold = 1, n.train = 50, l = 19, k = 17)

## Calculate anomalies
result <- CpKnnCad(
  data = df$value,
  n.train = params.KNN$n.train,
  threshold = params.KNN$threshold,
  l = params.KNN$l,
  k = params.KNN$k,
  ncm.type = "ICAD",
  reducefp = TRUE
)

## Plot results
res <- cbind(df, result)
PlotDetections(res, title = "KNN-CAD ANOMALY DETECTOR")
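
The constraints listed under Details can be checked before calling CpKnnCad. The helper below, check_knn_params, is hypothetical and not part of otsad. It only encodes the documented requirements on data, threshold, k and ncm.type, and additionally assumes that n.train must not exceed the length of data. Only the lower bound of l is checked.

```r
# Hypothetical pre-flight check (not part of otsad)
check_knn_params <- function(data, n.train, threshold, l, k, ncm.type) {
  stopifnot(
    is.numeric(data), !anyNA(data),          # numerical vector without NA values
    n.train >= 1, n.train <= length(data),   # assumption: training set fits in data
    threshold >= 0, threshold <= 1,          # threshold between 0 and 1
    l >= 1,                                  # only the lower bound is checked here
    k < n.train,                             # k must be less than n.train
    ncm.type %in% c("ICAD", "LDCD")
  )
  invisible(TRUE)
}

set.seed(100)
x <- sample(1:100, 350, replace = TRUE)

# Same parameter values as in the example above: passes
ok <- check_knn_params(x, n.train = 50, threshold = 1, l = 19, k = 17,
                       ncm.type = "ICAD")

# k larger than n.train: the check fails and the error is caught
bad <- tryCatch(
  check_knn_params(x, n.train = 50, threshold = 1, l = 19, k = 60,
                   ncm.type = "ICAD"),
  error = function(e) FALSE
)
```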


alaineiturria/otsad documentation built on Jan. 12, 2023, 12:26 p.m.