OipPewma | R Documentation |
OipPewma is the optimized implementation of the IpPewma function using environmental variables. It has been shown that on long datasets it can reduce runtime by up to 50%. This function calculates anomalies using PEWMA in an incremental processing mode. PEWMA is a probabilistic variant of EWMA that dynamically adjusts its parameterization based on the probability of the given observation. The method produces dynamic, data-driven anomaly thresholds that are robust to abrupt transient changes, yet adjust quickly to long-term distributional shifts.
OipPewma(data, alpha0 = 0.2, beta = 0, n.train = 5, l = 3, last.res = NULL)
data | Numerical vector with training and test dataset. |
alpha0 | Maximal weighting parameter. |
beta | Weight placed on the probability of the given observation. |
n.train | Number of points of the dataset that correspond to the training set. |
l | Control limit multiplier. |
last.res | Last result returned by the algorithm. |
data must be a numerical vector without NA values. alpha0 must be a numeric value where 0 < alpha0 < 1. If a faster adjustment to the initial shift is desirable, simply lowering alpha0 will suffice. beta is the weight placed on the probability of the given observation. It must be a numeric value where 0 <= beta <= 1. Note that when beta equals 0, PEWMA reduces to a standard EWMA. Finally, l is the parameter that determines the control limits. By default, 3 is used. last.res is the last result returned by a previous execution of this algorithm. The first time the algorithm is executed, its value is NULL. To process a new batch of data without appending it to the old dataset and restarting the process, only the parameters returned by the last run are needed.
This algorithm can be used for both classical and incremental processing. Note that for a finite dataset, the CpPewma or OcpPewma algorithms are faster.
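To make the roles of alpha0, beta, n.train and l more concrete, the following is a minimal sketch of a PEWMA pass written from the update rules in the cited paper. It is not the package's implementation: the function name pewma_sketch is hypothetical, and two choices are assumptions of this sketch, namely that the weight during training is 1 - 1/t (a running mean) and that no point is flagged during the training period.

```r
# Hypothetical helper, NOT part of the package: one PEWMA pass over a vector.
pewma_sketch <- function(x, alpha0 = 0.2, beta = 0, n.train = 5, l = 3) {
  n <- length(x)
  s1 <- x[1]      # running estimate of E[X]
  s2 <- x[1]^2    # running estimate of E[X^2]
  mu <- s1
  sigma <- 0
  out <- data.frame(is.anomaly = integer(n), ucl = numeric(n), lcl = numeric(n))
  for (t in seq_len(n)) {
    # Control limits use only the statistics available before seeing x[t]
    ucl <- mu + l * sigma
    lcl <- mu - l * sigma
    out$ucl[t] <- ucl
    out$lcl[t] <- lcl
    # Assumption of this sketch: training points are never flagged
    out$is.anomaly[t] <- as.integer(t > n.train && (x[t] > ucl || x[t] < lcl))
    # Probability of the observation under a standard normal on the z-score
    z <- if (sigma > 0) (x[t] - mu) / sigma else 0
    p <- dnorm(z)
    # Likely observations (large p) lower alpha, so they update the estimates
    # more strongly than unlikely ones; with beta = 0 this is a plain EWMA
    alpha <- if (t <= n.train) 1 - 1 / t else (1 - beta * p) * alpha0
    s1 <- alpha * s1 + (1 - alpha) * x[t]
    s2 <- alpha * s2 + (1 - alpha) * x[t]^2
    mu <- s1
    sigma <- sqrt(max(s2 - s1^2, 0))  # guard against tiny negative variance
  }
  out
}

# Small usage example: a stable series followed by one large spike
x <- c(10, 12, 11, 9, 10, 11, 10, 12, 9, 10, 50)
res <- pewma_sketch(x, alpha0 = 0.8, beta = 0, n.train = 5, l = 3)
which(res$is.anomaly == 1)
```

In this sketch alpha is the weight placed on the history, so a lower alpha0 lets new observations move the mean and the limits faster, which matches the note above about lowering alpha0.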
Incremental processing can be used in two ways: 1) processing all available data and saving last.res for future runs in which there is new data; 2) using the stream library when there is too much data to fit into memory. An example is provided for this use case.
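The first way, saving last.res between runs, could be sketched as follows. The wrapper run_batch and the file name used in the usage comments are hypothetical and only illustrate the pattern. The wrapper assumes a detector that follows the OipPewma interface described on this page: it accepts data and last.res, and returns a list with result and last.res.

```r
# Hypothetical wrapper, NOT part of the package: run one batch and persist
# the state on disk so that a later session can continue from it.
run_batch <- function(values, state_file, detector, ...) {
  # Restore the state of the previous run, or start from NULL on the first run
  state <- if (file.exists(state_file)) readRDS(state_file) else NULL
  out <- detector(data = values, last.res = state, ...)
  # Only last.res is needed for the next run, so only that is stored
  saveRDS(out$last.res, state_file)
  out$result
}

# Usage, assuming OipPewma is loaded and x holds at least 200 values:
# state_file <- "pewma_state.rds"
# first  <- run_batch(x[1:100],   state_file, OipPewma, alpha0 = 0.8, beta = 0.1)
# second <- run_batch(x[101:200], state_file, OipPewma, alpha0 = 0.8, beta = 0.1)
```

Passing the detector as an argument keeps the wrapper independent of any particular function, so the same pattern should also work with other detectors that follow the same calling convention.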
A list of the following items.
result | Dataset composed of the following columns. |
is.anomaly
1 if the value is anomalous, 0 otherwise.
ucl
Upper control limit.
lcl
Lower control limit.
last.res | Last result returned by the algorithm. It is a dataset containing the parameters calculated in the last iteration that are needed for the next one. |
Kevin M. Carter and William W. Streilein. Probabilistic reasoning for streaming anomaly detection. 2012 IEEE Statistical Signal Processing Workshop (SSP), pp. 377-380, Aug 2012.
## EXAMPLE 1: ----------------------
## It can be used in the same way as with OcpPewma passing the whole dataset as
## an argument.

## Generate data
set.seed(100)
n <- 180
x <- sample(1:100, n, replace = TRUE)
x[70:90] <- sample(110:115, 21, replace = TRUE)
x[25] <- 200
x[150] <- 170
df <- data.frame(timestamp = 1:n, value = x)

## Calculate anomalies
result <- OipPewma(
  data = df$value,
  alpha0 = 0.8,
  beta = 0.1,
  n.train = 5,
  l = 3,
  last.res = NULL
)
res <- cbind(df, result$result)

## Plot results
PlotDetections(res, title = "PEWMA ANOMALY DETECTOR")

## EXAMPLE 2: ----------------------
## You can use it in an incremental way. This is an example using the stream
## library. This library allows the simulation of streaming operation.

# install.packages("stream")
library("stream")

## Generate data
set.seed(100)
n <- 500
x <- sample(1:100, n, replace = TRUE)
x[70:90] <- sample(110:115, 21, replace = TRUE)
x[25] <- 200
x[320] <- 170
df <- data.frame(timestamp = 1:n, value = x)
dsd_df <- DSD_Memory(df)

## Initialize parameters for the loop
last.res <- NULL
res <- NULL
nread <- 100
numIter <- n %/% nread

## Calculate anomalies
for (i in 1:numIter) {
  # read new data
  newRow <- get_points(dsd_df, n = nread, outofpoints = "ignore")
  # calculate if it's an anomaly
  last.res <- OipPewma(
    data = newRow$value,
    n.train = 5,
    alpha0 = 0.8,
    beta = 0.1,
    l = 3,
    last.res = last.res$last.res
  )
  # prepare the result
  if (!is.null(last.res$result)) {
    res <- rbind(res, cbind(newRow, last.res$result))
  }
}

## Plot results
PlotDetections(res, print.time.window = FALSE, title = "PEWMA ANOMALY DETECTOR")