knitr::opts_chunk$set(
    # message = FALSE,
    # warning = FALSE,
    fig.width = 8, 
    fig.height = 4.5,
    fig.align = 'center',
    out.width='95%', 
    dpi = 100
)

# devtools::load_all() # Travis CI fails on load_all()

Frequency and trend cycles are used in many time series applications, including Seasonal ARIMA (SARIMA) forecasting and STL Decomposition. timetk includes functionality for Automatic Frequency and Trend Selection. These tools use only the timestamp information to make logical guesses about the frequency and trend.

Prerequisites

Before we get started, load the following packages.

library(dplyr)
library(timetk)

Data

Daily Irregular Data

This data set contains the daily stock prices of Facebook from 2013 to 2016. Note that trading only occurs on "business days", so weekends and market holidays are missing from the series.

data(FANG)

FB_tbl <- FANG %>% dplyr::filter(symbol == "FB")
FB_tbl

Sub-Daily Data

Taylor's Energy Demand data at a 30-minute timestamp interval.

taylor_30_min

Applications

An example of where automatic frequency detection occurs is in the plot_stl_diagnostics() function.

taylor_30_min %>%
    plot_stl_diagnostics(date, value, 
                         .frequency = "auto", .trend = "auto",
                         .interactive = FALSE)
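The "auto" settings can also be replaced with explicit values. The sketch below assumes that .frequency and .trend accept time-based durations in the same way as the period argument of tk_get_frequency() and tk_get_trend(). The durations "1 day" and "14 days" are chosen here because they match the automatic selections reported for this data set later in this vignette; they are illustrative, not a recommendation.

```r
library(dplyr)
library(timetk)

# Explicit durations in place of "auto" (assumed to mirror the automatic
# choices for 30-minute data: a 1-day frequency and a 14-day trend).
p <- taylor_30_min %>%
    plot_stl_diagnostics(date, value,
                         .frequency = "1 day", .trend = "14 days",
                         .interactive = FALSE)
p
```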

Automatic Frequency & Trend Selection

Specifying a Frequency or Trend

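The period argument has three basic options for returning a frequency: "auto", which looks up a target frequency in the predefined Time Scale Template (see below); a time-based duration (e.g. "7 days" or "2 quarters" per cycle); and a numeric number of observations (e.g. 5 for 5 observations per cycle).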
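A minimal sketch of each kind of period value, using the Facebook index from the Data section. The duration "1 month" and the number 5 are arbitrary illustrations, and no output is shown because the results depend on the data and on the current template.

```r
library(dplyr)
library(timetk)

data(FANG)
idx <- FANG %>% filter(symbol == "FB") %>% tk_index()

# 1. "auto": the frequency is looked up from the time scale template
freq_auto <- tk_get_frequency(idx, period = "auto")

# 2. Time-based duration: converted to a number of observations
freq_duration <- tk_get_frequency(idx, period = "1 month")

# 3. Numeric: a number of observations per cycle, supplied directly
freq_numeric <- tk_get_frequency(idx, period = 5)

c(auto = freq_auto, duration = freq_duration, numeric = freq_numeric)
```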

Frequency

A frequency is loosely defined as the number of observations that comprise a cycle in a data set.

Using tk_get_frequency(), we can pick a number of observations that will roughly define a frequency for the series.

Daily Irregular Data

Because FB_tbl is irregular (weekends and holidays are not present), the weekly frequency that is selected typically spans only 5 observations per week, so a frequency of 5 is returned.

FB_tbl %>% tk_index() %>% tk_get_frequency(period = "auto")
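One way to see where the 5 comes from is to count the trading days that fall in each calendar week. This is only a rough check written for this vignette, not how timetk computes the frequency internally; the week label built with format() is an assumption of the sketch, and weeks that straddle a year boundary are split in two.

```r
library(dplyr)
library(timetk)

data(FANG)
FB_tbl <- FANG %>% filter(symbol == "FB")

# Count observations per calendar week (year + week-of-year label)
obs_per_week <- FB_tbl %>%
    count(week = format(date, "%Y-%U")) %>%
    pull(n)

# Distribution of weekly counts; full trading weeks have 5 observations
table(obs_per_week)
```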

Sub-Daily Data

This works as well for a sub-daily time series. Here we'll use taylor_30_min, a series with a 30-minute timestamp interval. Requesting a "1 day" period returns a frequency of 48 because a 30-minute series has 48 timestamps (observations) in 1 day.

taylor_30_min %>% tk_index() %>% tk_get_frequency("1 day")
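The value 48 follows from simple arithmetic, and it can be checked against the data. The sketch below is an independent sanity check, not part of timetk's own logic.

```r
library(timetk)

# Observations per day for a regular 30-minute series
minutes_per_day <- 24 * 60
obs_per_day     <- minutes_per_day / 30
obs_per_day

# Empirical check: count the timestamps that fall on each calendar day
idx        <- tk_index(taylor_30_min)
obs_by_day <- as.integer(table(as.Date(idx)))
median(obs_by_day)
```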

Trend

The trend is loosely defined as a time span that can be aggregated across to visualize the central tendency of the data.

Using tk_get_trend(), we can pick a number of observations that will help describe a trend for the data.

Daily Irregular Data

Because FB_tbl is irregular (weekends and holidays are not present), the 3-month trend that is selected typically spans only 5 observations per week, so 64 observations are selected.

FB_tbl %>% tk_index() %>% tk_get_trend(period = "auto")

Sub-Daily Data

A 14-day (2 week) interval is selected for the "30-minute" interval data.

taylor_30_min %>% tk_index() %>% tk_get_trend("auto")
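To relate the 14-day span to a number of observations, the automatic result can be compared with an explicit "14 days" period and with the arithmetic. This sketch assumes that tk_get_trend() accepts a time-based duration in the same way as tk_get_frequency(); the two function calls should agree only if the template default for this time scale is 14 days, as stated above.

```r
library(timetk)

idx <- tk_index(taylor_30_min)

# Trend chosen from the time scale template
trend_auto <- tk_get_trend(idx, period = "auto")

# Trend requested explicitly as a duration
trend_explicit <- tk_get_trend(idx, period = "14 days")

# Back-of-the-envelope: 14 days at 48 observations per day
trend_manual <- 14 * 48

c(auto = trend_auto, explicit = trend_explicit, manual = trend_manual)
```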

Time Scale Template

The Time Scale Template assigns a default frequency and trend to each time scale of a series. tk_get_frequency() and tk_get_trend() look up their values in this template when period = "auto".

The predefined template is returned by the function tk_time_scale_template(). This is the default used by timetk.

Accessing the Default Template

You can access the current template with get_tk_time_scale_template().

get_tk_time_scale_template()

Changing the Default Template

You can modify the current template with set_tk_time_scale_template().
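A minimal sketch of changing and then restoring the template. It assumes the template is a tibble with character columns named time_scale, frequency, and trend; inspect the output of get_tk_time_scale_template() to confirm this before adapting it. The "1 month" trend for daily data is a hypothetical choice made only for illustration.

```r
library(dplyr)
library(timetk)

# Start from the current template
old_template <- get_tk_time_scale_template()

# Hypothetical change: use a 1-month trend for daily data
# (column names time_scale and trend are assumed)
new_template <- old_template %>%
    mutate(trend = ifelse(time_scale == "day", "1 month", trend))

set_tk_time_scale_template(new_template)
get_tk_time_scale_template()

# Restore the package default when finished
set_tk_time_scale_template(tk_time_scale_template())
```

Because the template is a global setting, restoring the default at the end keeps later "auto" selections consistent with the rest of this vignette.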

Learning More

My Talk on High-Performance Time Series Forecasting

Time series is changing. Businesses now need 10,000+ time series forecasts every day.

High-Performance Forecasting Systems will save companies MILLIONS of dollars. Imagine what will happen to your career if you can provide your organization a "High-Performance Time Series Forecasting System" (HPTSF System).

I teach how to build an HPTSF System in my High-Performance Time Series Forecasting Course. If you are interested in learning scalable high-performance forecasting strategies, take my course.

Unlock the High-Performance Time Series Forecasting Course



business-science/timekit documentation built on Feb. 2, 2024, 2:51 a.m.