Anomaly Detection


Anomaly detection is an important part of time series analysis:

  1. Detected anomalies can signal special events
  2. Cleaning anomalies can reduce forecast error

In this short tutorial, we will cover the plot_anomaly_diagnostics() and tk_anomaly_diagnostics() functions for visualizing and automatically detecting anomalies at scale.



This tutorial uses the walmart_sales_weekly dataset, which ships with timetk and holds weekly sales by Store and Dept.
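As a minimal setup sketch, we can load the packages the examples rely on and take a quick look at the columns used later. The object name sales_tbl is our own placeholder, not something defined by timetk, and we assume dplyr is installed alongside timetk.

```r
library(dplyr)
library(timetk)

# walmart_sales_weekly is bundled with timetk.
# sales_tbl is a hypothetical name; we keep only the columns used in this tutorial.
sales_tbl <- walmart_sales_weekly %>%
  select(Store, Dept, Date, Weekly_Sales)

# Print column names and types along with the first few values
glimpse(sales_tbl)
```

Each Store and Dept combination forms one time series, which is why the examples that follow group by those two columns.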


Anomaly Visualization

Using the plot_anomaly_diagnostics() function, we can visually inspect anomalies across many time series at once. The plot is interactive by default.

walmart_sales_weekly %>%
  group_by(Store, Dept) %>%
  plot_anomaly_diagnostics(Date, Weekly_Sales, .facet_ncol = 2)
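If a static plot is preferable (for a report, say), the same call can be adjusted through its arguments. The sketch below uses the .interactive, .alpha, and .max_anomalies arguments of plot_anomaly_diagnostics(); the specific values are illustrative assumptions rather than recommendations, and p is a placeholder name.

```r
library(dplyr)
library(timetk)

# p is a hypothetical name for the returned plot object
p <- walmart_sales_weekly %>%
  group_by(Store, Dept) %>%
  plot_anomaly_diagnostics(
    Date, Weekly_Sales,
    .facet_ncol    = 2,
    .alpha         = 0.05,  # controls the width of the "normal" range (illustrative value)
    .max_anomalies = 0.10,  # cap on the share of points that may be flagged (illustrative value)
    .interactive   = FALSE  # return a static ggplot instead of an interactive plot
  )

p
```

Because the static version is a ggplot object, it can be saved or extended with the usual ggplot2 tools.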

Automatic Anomaly Detection

To get the anomaly data itself rather than a plot, we use tk_anomaly_diagnostics(), the preprocessing function. It returns one row per observation for each group.

walmart_sales_weekly %>%
  group_by(Store, Dept) %>%
  tk_anomaly_diagnostics(Date, Weekly_Sales)
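A common next step is to keep only the flagged rows and count them per series. This sketch assumes the returned table carries an anomaly column coded "Yes"/"No", as in current timetk releases; anomalies_tbl and n_anomalies are placeholder names of our own.

```r
library(dplyr)
library(timetk)

# anomalies_tbl is a hypothetical name for the flagged observations
anomalies_tbl <- walmart_sales_weekly %>%
  group_by(Store, Dept) %>%
  tk_anomaly_diagnostics(Date, Weekly_Sales) %>%
  ungroup() %>%
  filter(anomaly == "Yes")  # assumes anomalies are coded "Yes"/"No"

# Number of flagged weeks per Store/Dept series
anomalies_tbl %>%
  count(Store, Dept, name = "n_anomalies")
```

The flagged dates can then be compared against known events such as holidays, or passed to a cleaning step before forecasting.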

Learning More

My Talk on High-Performance Time Series Forecasting

Time series is changing. Businesses now need 10,000+ time series forecasts every day. This is what I call a High-Performance Time Series Forecasting System (HPTSF) - Accurate, Robust, and Scalable Forecasting.

High-Performance Forecasting Systems will save companies MILLIONS of dollars. Imagine what will happen to your career if you can provide your organization a "High-Performance Time Series Forecasting System" (HPTSF System).

I teach how to build an HPTSF System in my High-Performance Time Series Forecasting Course. If you are interested in learning scalable high-performance forecasting strategies, take my course.

Unlock the High-Performance Time Series Forecasting Course


timetk documentation built on Nov. 2, 2023, 6:18 p.m.