knitr::opts_chunk$set(
    # message = FALSE,
    # warning = FALSE,
    fig.width = 8,
    fig.height = 4.5,
    fig.align = 'center',
    out.width = '95%',
    dpi = 100
)
# devtools::load_all() # Travis CI fails on load_all()
Frequency and trend cycles are used in many time series applications, including Seasonal ARIMA (SARIMA) forecasting and STL decomposition. timetk includes functionality for Automatic Frequency and Trend Selection. These tools use only the timestamp information to make logical guesses about the frequency and trend.
Before we get started, load the following packages.
library(dplyr)
library(timetk)
Daily Irregular Data
This dataset contains the daily stock prices of Facebook from 2013 to 2016. Note that trading only occurs on business days, so weekends and market holidays are absent from the series.
data(FANG)

FB_tbl <- FANG %>% dplyr::filter(symbol == "FB")
FB_tbl
Sub-Daily Data
This dataset contains Taylor's Energy Demand data recorded at a 30-minute timestamp interval.
taylor_30_min
An example of where automatic frequency detection occurs is in the plot_stl_diagnostics() function.
taylor_30_min %>%
    plot_stl_diagnostics(date, value,
                         .frequency = "auto", .trend = "auto",
                         .interactive = FALSE)
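The period argument has three basic options for returning a frequency. Options include:
- "auto": A target frequency is determined using a pre-defined Time Scale Template (see below).
- A time-based duration: e.g. "7 days" or "2 quarters" per cycle.
- A numeric number of observations: e.g. 5 for 5 observations per cycle.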
A frequency is loosely defined as the number of observations that comprise a cycle in a data set.
Using tk_get_frequency(), we can pick a number of observations that will roughly define a frequency for the series.
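As a rough sketch, the period argument of tk_get_frequency() can be given as "auto", as a duration string, or as a plain number of observations. The index below is a hypothetical, perfectly regular 30-minute series built with base R; it is not one of the vignette's datasets.

```r
library(timetk)

# Hypothetical regular index: four weeks of 30-minute timestamps
idx <- seq(as.POSIXct("2021-01-04 00:00:00", tz = "UTC"),
           by = "30 min", length.out = 48 * 28)

# 1. "auto": the frequency is looked up in the time scale template
freq_auto <- tk_get_frequency(idx, period = "auto", message = FALSE)

# 2. Time-based duration: observations per "1 day" cycle
freq_day <- tk_get_frequency(idx, period = "1 day", message = FALSE)

# 3. Numeric: the number of observations per cycle is supplied directly
freq_num <- tk_get_frequency(idx, period = 48)

c(auto = freq_auto, duration = freq_day, numeric = freq_num)
```

The numeric form skips the lookup, which can be useful when the cycle length is already known.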
Daily Irregular Data
Because FB_tbl is irregular (weekends and holidays are not present), the frequency selected is weekly, but each week typically contains only 5 trading days. So 5 is selected.
FB_tbl %>% tk_index() %>% tk_get_frequency(period = "auto")
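The reasoning behind a weekly frequency of 5 can be illustrated with base R alone. This is a minimal sketch, not timetk's internal method: it uses a hypothetical weekday-only calendar (holidays ignored) and takes the median number of observations per 7-day block.

```r
# Hypothetical calendar: 12 full weeks starting on a Monday
all_days <- seq(as.Date("2013-01-07"), as.Date("2013-03-31"), by = "day")

# POSIXlt weekday codes: 0 = Sunday, ..., 6 = Saturday
wday <- as.POSIXlt(all_days)$wday
trading_days <- all_days[wday >= 1 & wday <= 5]

# Assign each observation to a 7-day block counted from the first Monday
week_id <- as.integer(trading_days - min(all_days)) %/% 7

# Typical number of observations per weekly cycle
obs_per_week <- as.integer(table(week_id))
weekly_frequency <- median(obs_per_week)
weekly_frequency
```

With real trading data, holiday weeks contain fewer observations, which is why a typical (median-style) count is a more robust summary than any single week.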
Sub-Daily Data
This works as well for a sub-daily time series. Here we'll use taylor_30_min, a series with 30-minute timestamps. The frequency selected is 48 because there are 48 timestamps (observations) in 1 day at a 30-minute interval.
taylor_30_min %>% tk_index() %>% tk_get_frequency("1 day")
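The arithmetic behind the value 48 can be checked by hand: divide the length of the cycle by the spacing between timestamps. The sketch below does this in base R on a hypothetical regular 30-minute index; it illustrates the calculation rather than timetk's implementation.

```r
# Hypothetical regular index: one week of 30-minute timestamps
idx <- seq(as.POSIXct("2000-06-05 00:00:00", tz = "UTC"),
           by = "30 min", length.out = 48 * 7)

# Typical spacing between consecutive timestamps, in seconds
step_secs <- median(as.numeric(diff(idx), units = "secs"))

# Length of the requested cycle ("1 day"), in seconds
cycle_secs <- as.numeric(as.difftime(1, units = "days"), units = "secs")

# Observations per cycle
obs_per_day <- cycle_secs / step_secs
obs_per_day
```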
The trend is loosely defined as a time span that can be aggregated across to visualize the central tendency of the data.
Using tk_get_trend(), we can pick a number of observations that will help describe a trend for the data.
Daily Irregular Data
Because FB_tbl is irregular (weekends and holidays are not present), the trend selected is 3 months, but each week typically contains only 5 trading days. So 64 observations are selected.
FB_tbl %>% tk_index() %>% tk_get_trend(period = "auto")
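To see why a 3-month trend on business-day data comes out in the low-to-mid 60s rather than around 90, one can count the weekdays in a 3-month window. This is a minimal base R sketch with a hypothetical window and no holiday calendar, so it only approximates a real trading calendar.

```r
# Hypothetical 3-month window
start <- as.Date("2013-01-02")
end   <- seq(start, by = "3 months", length.out = 2)[2] - 1

days <- seq(start, end, by = "day")

# Keep Monday to Friday only (POSIXlt: 0 = Sunday, 6 = Saturday)
wday <- as.POSIXlt(days)$wday
n_weekdays <- sum(wday >= 1 & wday <= 5)

c(calendar_days = length(days), weekdays = n_weekdays)
```

Market holidays would lower the weekday count slightly in any particular window.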
Sub-Daily Data
A 14-day (2 week) interval is selected for the "30-minute" interval data.
taylor_30_min %>% tk_index() %>% tk_get_trend("auto")
A Time-Scale Template maps the time scale of a series to a target frequency and trend. It is what tk_get_frequency() and tk_get_trend() consult when period = "auto".
The predefined template is stored in a function, tk_time_scale_template(). This is the default used by timetk.
Accessing the Default Template
You can access the current template with get_tk_time_scale_template().
get_tk_time_scale_template()
Changing the Default Template
You can modify the current template with set_tk_time_scale_template().
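A minimal sketch of changing and then restoring the template is shown below. It assumes the template is a data frame with time_scale, frequency, and trend columns holding duration strings; the "1 month" trend for daily data is a hypothetical choice for illustration, not a recommendation.

```r
library(timetk)

# Start from the package default so the sketch does not depend on earlier changes
custom_template <- tk_time_scale_template()

# Assumption: columns are `time_scale`, `frequency`, and `trend`
is_day <- custom_template$time_scale == "day"
custom_template$trend[is_day] <- "1 month"  # hypothetical choice

# Make the modified template the one used when period = "auto"
set_tk_time_scale_template(custom_template)
get_tk_time_scale_template()

# Restore the default so later calls behave as documented
set_tk_time_scale_template(tk_time_scale_template())
```

Restoring the default at the end keeps the change from leaking into the rest of the session.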