get_anomalies: Retrieve anomalies

View source: R/get_anomalies.R

get_anomalies    R Documentation

Retrieve anomalies

Description

Based on a summary normalized/stacked metric, retrieve top anomalies.

Usage

get_anomalies(
  x,
  rank.prop = 0.05,
  nmin = 10,
  nmax = 300,
  stack.use = "avg",
  method.use = "norm",
  verbose = TRUE,
  ...
)

Arguments

x

stranger object (before or after singularize)

stack.use

One of c("max","avg","min","damavg","pruavg") - must have been requested when invoking 'singularize' (done by default).

method.use

One of c("norm","rank") - must have been requested when invoking 'singularize' (done by default).

verbose

logical: provide some information.

...

additional parameters to pass to singularize (if called on a non-singularized object)

Anomaly selection is performed using a single summary metric. This summary metric is assumed to stack some base metrics (possibly only one). Stacking is performed after standardisation, which can be done in one of two ways: normalisation (method.use = "norm") or ranking (method.use = "rank"). See the singularize function.
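To make the idea of standardising and then stacking concrete, here is a minimal base R sketch. It does not use the stranger package and is not its implementation: the toy metrics, the min-max rescaling used for "norm", the rescaled ranks used for "rank", and all object names are assumptions made for illustration only.

```r
# Hypothetical base metrics for five records (higher = more anomalous).
metrics <- data.frame(
  m1 = c(0.1, 0.4, 0.2, 3.0, 0.3),
  m2 = c(10, 12, 11, 40, 9)
)

# Assumed "norm" standardisation: min-max rescaling of each metric to [0, 1].
minmax <- function(v) (v - min(v)) / (max(v) - min(v))
norm_std <- as.data.frame(lapply(metrics, minmax))

# Assumed "rank" standardisation: ranks rescaled to (0, 1].
rank_std <- as.data.frame(lapply(metrics, function(v) rank(v) / length(v)))

# Stack the standardised metrics row by row into one summary metric.
stack_avg_norm <- rowMeans(norm_std)                  # "avg" stacking
stack_max_norm <- pmax(norm_std$m1, norm_std$m2)      # "max" stacking
stack_avg_rank <- rowMeans(rank_std)

# Record 4 is the most extreme on both base metrics under either approach.
which.max(stack_avg_norm)
which.max(stack_avg_rank)
```

In this sketch, standardisation puts the base metrics on a common scale before they are combined, so that a metric with a large numeric range (m2) does not dominate the stacked score.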

Three parameters are used together to define anomalies: rank.prop is first used to keep the top proportion of records as candidate anomalies; constraints on the minimum (nmin) and maximum (nmax) number of anomalies returned are then applied on top of this criterion.

rank.prop:

proportion of records to be considered as anomalies

nmin:

constraint - minimum number of anomalies

nmax:

constraint - maximum number of anomalies
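The way rank.prop, nmin and nmax interact can be sketched as a simple clamp on the number of records kept. The helper below is hypothetical and not part of the stranger package; the rounding rule (ceiling) and the final cap at the number of available records are assumptions, and the package's actual logic may differ.

```r
# Hypothetical helper (not a stranger function): how many records would be
# flagged, given the total number of records and the three parameters.
n_anomalies <- function(n_records, rank.prop = 0.05, nmin = 10, nmax = 300) {
  n <- ceiling(rank.prop * n_records)  # step 1: top proportion of records
  n <- max(n, nmin)                    # step 2: keep at least nmin
  n <- min(n, nmax)                    # step 3: keep at most nmax
  min(n, n_records)                    # cannot exceed the records available
}

n_anomalies(150)     # 5% of 150 is below nmin, so nmin = 10 applies
n_anomalies(1000)    # 5% of 1000 = 50, within [nmin, nmax]
n_anomalies(10000)   # 5% of 10000 = 500, capped at nmax = 300
```

With the default values, small data sets are therefore driven by nmin and large ones by nmax; rank.prop only decides the count in between.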

Examples

data <- crazyfy(iris[,1:4])
(anom <- get_anomalies(strange(data)))
## Not run: 
library(dplyr)
ss <- iris %>% select(-Species) %>%
  crazyfy() %>%
  strange(weird="autoencode") %>%
  singularize(methods="norm",stacks="avg")
anom2 <- ss %>% get_anomalies(nmin=2, nmax=4)
ss %>% plot(type="n",score="N_anom_norm_avg",anomaly_id=anom2[1])

## End(Not run)

welovedatascience/stranger documentation built on Oct. 12, 2022, 10:52 p.m.