View source: R/GenTSAnomVars.R
GenTSAnomVars | R Documentation |
GenTSAnomVars performs automated z-score anomaly detection via a GLM-like procedure. The data is z-scaled and grouped by factors and time periods to determine which points fall above or below the control limits, cumulatively over time. A cumulative anomaly rate is then created as the final variable. Set KeepAllCols to TRUE to keep the intermediate features, for example to build rolling stats from them. The anomalies are separated into those that are extreme on the positive end and those that are extreme on the negative end.
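The idea described above can be illustrated with a minimal sketch. This is not the function's actual implementation: the toy data, the column names (Grp, Day, z, HighFlag, LowFlag, HighRateSketch, LowRateSketch) and the choice to z-scale once per group are all assumptions made for illustration.

```r
library(data.table)

set.seed(42)

# Hypothetical toy data: two groups with 200 daily observations each
dt <- data.table(
  Grp   = rep(c("a", "b"), each = 200),
  Day   = rep(seq(as.Date("2020-01-01"), by = "day", length.out = 200), times = 2),
  Value = rnorm(400, mean = 50, sd = 20)
)
setorder(dt, Grp, Day)

# z-scale the value within each group (assumed scaling scheme)
dt[, z := (Value - mean(Value)) / sd(Value), by = Grp]

# Flag points beyond the control limits, separately for each tail
dt[, HighFlag := as.integer(z > 1.96)]
dt[, LowFlag  := as.integer(z < -1.96)]

# Cumulative anomaly rate: anomalies seen so far / observations seen so far
dt[, HighRateSketch := cumsum(HighFlag) / seq_len(.N), by = Grp]
dt[, LowRateSketch  := cumsum(LowFlag)  / seq_len(.N), by = Grp]
```

Because each rate is a running count divided by a running number of observations, it always lies between 0 and 1, and its last value per group equals that group's overall share of flagged points.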
GenTSAnomVars(
data,
ValueCol = "Value",
GroupVars = NULL,
DateVar = "DATE",
HighThreshold = 1.96,
LowThreshold = -1.96,
KeepAllCols = TRUE,
IsDataScaled = FALSE
)
data |
the source residuals data.table |
ValueCol |
the numeric column to run anomaly detection over |
GroupVars |
the group-by variable(s), given as a character vector of column names |
DateVar |
this is a time variable for grouping |
HighThreshold |
the z-score threshold on the high end; values above it are treated as high anomalies |
LowThreshold |
the z-score threshold on the low end; values below it are treated as low anomalies |
KeepAllCols |
set to FALSE to remove the intermediate features |
IsDataScaled |
set to TRUE if you already scaled your data |
The original data.table with the added columns merged in. When KeepAllCols is set to FALSE, you will get back two columns: AnomHighRate and AnomLowRate. These are the cumulative anomaly rates over time for values above the high threshold (e.g. 1.96) and below the low threshold (e.g. -1.96).
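As a rough guide for reading these rates, the sketch below computes the share of points expected beyond each threshold if the z-scores were approximately standard normal. That normality assumption, and the variable names, are illustrative and not part of the function.

```r
# Default thresholds from the usage section
high_threshold <- 1.96
low_threshold  <- -1.96

# Tail probabilities under a standard normal distribution (assumption)
expected_high <- pnorm(high_threshold, lower.tail = FALSE)
expected_low  <- pnorm(low_threshold)

# Both are about 0.025, i.e. roughly 2.5% of points in each tail
print(round(c(high = expected_high, low = expected_low), 4))
```

A cumulative rate that stays well above these values for a group may suggest that the group produces more extreme values than a normal baseline would.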
Adrian Antico
Other Unsupervised Learning:
ResidualOutliers()
## Not run:
# Simulate a smoothed random series as the target
data <- data.table::data.table(
  DateTime = as.Date(Sys.time()),
  Target = stats::filter(
    rnorm(10000, mean = 50, sd = 20),
    filter = rep(1, 10),
    circular = TRUE))

# Give each row its own date, counting back from today, then sort by date
data[, temp := seq_len(10000)][, DateTime := DateTime - temp][
  , temp := NULL]
data <- data[order(DateTime)]

# Add a simulated "predicted" column (requires the sde package);
# the first row is dropped to align with the rows of data
x <- data.table::as.data.table(sde::GBM(N = 10000) * 1000)
data[, predicted := x[-1, ]]

# Add three random grouping factors
data[, Fact1 := sample(letters, size = 10000, replace = TRUE)]
data[, Fact2 := sample(letters, size = 10000, replace = TRUE)]
data[, Fact3 := sample(letters, size = 10000, replace = TRUE)]

# Run the anomaly detection, keeping the intermediate features
stuff <- GenTSAnomVars(
  data,
  ValueCol = "Target",
  GroupVars = c("Fact1", "Fact2", "Fact3"),
  DateVar = "DateTime",
  HighThreshold = 1.96,
  LowThreshold = -1.96,
  KeepAllCols = TRUE,
  IsDataScaled = FALSE)
## End(Not run)