OipSdEwma | R Documentation |
OipSdEwma is the optimized implementation of the IpSdEwma function using
environmental variables. It calculates anomalies with the SD-EWMA algorithm in
an incremental processing mode. On long datasets it has been shown to reduce
runtime by up to 50%. SD-EWMA is a novel method for covariate shift detection
based on a two-stage structure for univariate time series. It works in an
online mode and uses an exponentially weighted moving average (EWMA) model
based control chart to detect the covariate shift point in non-stationary
time series.
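The control-chart idea described above can be sketched in a few lines of base R. This is a simplified illustration, not the package's implementation: the function name `sd_ewma_sketch`, the fixed smoothing constant `lambda`, and the way the chart is initialised from the training mean and variance are all assumptions made for the example.

```r
# Illustrative EWMA control chart (NOT the otsad implementation).
# Assumptions: fixed `lambda`; chart initialised from the training mean/variance.
sd_ewma_sketch <- function(x, n.train, lambda = 0.2, theta = 0.01, l = 3) {
  stopifnot(is.numeric(x), !anyNA(x), n.train >= 2, n.train < length(x))
  train <- x[seq_len(n.train)]
  z <- mean(train)        # current EWMA estimate
  err.var <- var(train)   # smoothed one-step-ahead error variance
  n <- length(x)
  out <- data.frame(is.anomaly = integer(n), ucl = NA_real_, lcl = NA_real_)
  for (i in (n.train + 1):n) {
    # Control limits are built from the state *before* seeing x[i]
    sigma <- sqrt(err.var)
    ucl <- z + l * sigma
    lcl <- z - l * sigma
    out$ucl[i] <- ucl
    out$lcl[i] <- lcl
    out$is.anomaly[i] <- as.integer(x[i] > ucl || x[i] < lcl)
    # Update the error variance (smoothed by theta) and the EWMA itself
    err <- x[i] - z
    err.var <- theta * err^2 + (1 - theta) * err.var
    z <- lambda * x[i] + (1 - lambda) * z
  }
  out
}

set.seed(1)
x <- c(rnorm(100, mean = 50, sd = 2), 90, rnorm(20, mean = 50, sd = 2))
res <- sd_ewma_sketch(x, n.train = 50)
which(res$is.anomaly == 1)
```

Here `theta` plays the role of the error smoothing constant and `l` the control limit multiplier; the spike injected at position 101 lies far outside the limits and is flagged.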
OipSdEwma(data, n.train, threshold, l = 3, last.res = NULL)
data | Numerical vector with training and test datasets. |
n.train | Number of points of the dataset that correspond to the training set. |
threshold | Error smoothing constant. |
l | Control limit multiplier. |
last.res | Last result returned by the algorithm. |
data must be a numerical vector without NA values. threshold must be a
numeric value between 0 and 1. Low values such as 0.01 or 0.05 are
recommended. By default, 0.01 is used. l is the parameter that determines the
control limits. By default, 3 is used. Finally, last.res is the last result
returned by a previous execution of this algorithm. The first time the
algorithm is executed its value is NULL. To process a new batch of data
without appending it to the old dataset and restarting the process, only the
two parameters returned by the last run are needed.
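The role of last.res can be illustrated with a small base R sketch. The helper `ewma_batch` below is hypothetical and only mimics the result/last.res shape of the return value with a plain EWMA; it shows that carrying the state from one batch to the next gives the same output as processing everything at once.

```r
# Hypothetical helper (not part of the package): a plain EWMA that returns
# its output plus the state needed to continue with the next batch.
ewma_batch <- function(x, lambda = 0.2, last.res = NULL) {
  # First call: no previous state, start from the first observation
  z <- if (is.null(last.res)) x[1] else last.res$z
  ewma <- numeric(length(x))
  for (i in seq_along(x)) {
    z <- lambda * x[i] + (1 - lambda) * z
    ewma[i] <- z
  }
  list(result = data.frame(ewma = ewma), last.res = data.frame(z = z))
}

x <- c(10, 12, 11, 13, 40, 12, 11)

# One pass over the whole series
full <- ewma_batch(x)

# Two batches, passing the state of the first run into the second
first <- ewma_batch(x[1:4])
second <- ewma_batch(x[5:7], last.res = first$last.res)
incremental <- rbind(first$result, second$result)

all.equal(full$result$ewma, incremental$ewma)
```

The second call never sees the first four points; everything it needs from them is contained in the state returned by the first call.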
This algorithm can be used for both classical and incremental processing.
Note that for a finite dataset the CpSdEwma or OcpSdEwma algorithms are
faster. Incremental processing can be used in two ways: 1) processing all
available data and saving last.res for future runs in which there is new
data; 2) using the stream library when there is too much data to fit into
memory. Example 2 illustrates this second use case.
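For the first way of working, the state has to survive between R sessions. One possible approach, sketched here with base R's saveRDS and readRDS, is to write last.res to disk after each run. The object below is a hand-made stand-in with invented column names, not the actual structure returned by the function.

```r
# Stand-in for the last.res element of a previous run
# (column names and values are invented for illustration).
last.res <- data.frame(z = 51.3, err.var = 4.2)

# Persist the state at the end of a run; a temporary file is used here,
# a real workflow would use a stable path.
state.file <- tempfile(fileext = ".rds")
saveRDS(last.res, state.file)

# In a later run, restore the state before processing the new batch
restored <- readRDS(state.file)
all.equal(restored, last.res)
```

The restored object can then be passed as the last.res argument of the next call.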
A list of the following items.
result | Dataset made up of the following columns. |
is.anomaly
1 if the value is anomalous, 0 otherwise.
ucl
Upper control limit.
lcl
Lower control limit.
last.res | Last result returned by the algorithm. It is a dataset containing the parameters calculated in the last iteration that are needed for the next one. |
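A short sketch of how the result element might be consumed: bind it to the input data and keep the rows flagged as anomalous. The `result` data frame here is written by hand to have the three documented columns; it is not real detector output.

```r
# Input data in the same shape as used in the examples
df <- data.frame(timestamp = 1:6, value = c(48, 52, 95, 50, 5, 51))

# Hand-written stand-in for the `result` element (not real detector output)
result <- data.frame(
  is.anomaly = c(0, 0, 1, 0, 1, 0),
  ucl = rep(70, 6),
  lcl = rep(30, 6)
)

# Combine data and detections, then keep only the anomalous rows
res <- cbind(df, result)
anomalies <- res[res$is.anomaly == 1, ]
anomalies$timestamp
```

Because result has one row per input point, cbind lines up each detection with its timestamp and value.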
Raza, H., Prasad, G., & Li, Y. (March 2015). EWMA model based shift-detection methods for detecting covariate shifts in non-stationary environments. Pattern Recognition, 48(3), 659-669.
## EXAMPLE 1: ----------------------
## It can be used in the same way as with OcpSdEwma passing the whole dataset as
## an argument.

## Generate data
set.seed(100)
n <- 180
x <- sample(1:100, n, replace = TRUE)
x[70:90] <- sample(110:115, 21, replace = TRUE)
x[25] <- 200
x[150] <- 170
df <- data.frame(timestamp = 1:n, value = x)

## Calculate anomalies
result <- OipSdEwma(
  data = df$value,
  n.train = 5,
  threshold = 0.01,
  l = 3
)
res <- cbind(df, result$result)

## Plot results
PlotDetections(res, print.time.window = FALSE, title = "SD-EWMA ANOMALY DETECTOR")

## EXAMPLE 2: ----------------------
## You can use it in an incremental way. This is an example using the stream
## library. This library allows the simulation of streaming operation.

# install.packages("stream")
library("stream")

## Generate data
set.seed(100)
n <- 500
x <- sample(1:100, n, replace = TRUE)
x[70:90] <- sample(110:115, 21, replace = TRUE)
x[25] <- 200
x[320] <- 170
df <- data.frame(timestamp = 1:n, value = x)
dsd_df <- DSD_Memory(df)

## Initialize parameters for the loop
last.res <- NULL
res <- NULL
nread <- 100
numIter <- n %/% nread

## Calculate anomalies
for (i in 1:numIter) {
  # read new data
  newRow <- get_points(dsd_df, n = nread, outofpoints = "ignore")
  # calculate if it's an anomaly
  last.res <- OipSdEwma(
    data = newRow$value,
    n.train = 5,
    threshold = 0.01,
    l = 3,
    last.res = last.res$last.res
  )
  # prepare the result
  if (!is.null(last.res$result)) {
    res <- rbind(res, cbind(newRow, last.res$result))
  }
}

# plot
PlotDetections(res, title = "SD-EWMA ANOMALY DETECTOR")