About the example data

Each forecasting model is explained using a publicly available example dataset: the classic Box & Jenkins airline data, which contains monthly totals of international airline passengers from 1949 to 1960. It shows a clear trend and yearly seasonality, which enables some of the models to achieve a good fit and a successful forecast.

Please note that even though some of the other models fit this particular dataset poorly (for this particular split date), they can still be very successful in forecasting other datasets!

For a more extensive and interactive demonstration of the different forecasting models implemented in this package, you can run one of the following examples after loading the tsforecast package in R:

# Helper that builds the forecast-versus-actuals plot for a single fc_model,
# to prevent repetition of the same code for each model below.
# Assumes the tsforecast package (and the %>% pipe) has been loaded.
create_fc_model_plot <- function(fc_model) {
  # Keep only the original AirPassengers series at the 195512 split date
  example_data <- univariate_example_data_1 %>%
    dplyr::filter(grouping == "dataset = AirPassengers   &   type = original") %>%
    dplyr::mutate(grouping = "dataset = AirPassengers") %>%
    dplyr::filter(ts_split_date == 195512)
  # Plot the forecast of the requested model against the actuals
  compare_forecasts_with_actuals(
    main_forecasting_table = example_data,
    fc_models = fc_model
  ) %>%
    plotly::config(displayModeBar = FALSE) %>%
    plotly::layout(
      #yaxis = list(range = c(-10, 810)),
      legend = list(orientation = 'h', x = 0.35, y = -0.1)
    )
}



Basic models {.tabset .tabset-fade .tabset-pills}

Mean forecast models {.tabset .tabset-fade .tabset-pills}

In mean forecast models, the forecasts of all future values are equal to the mean of the latest available historical data. This forecasting method is very simple but can also be surprisingly effective, especially in case of a volatile time series without a clear trend. One parameter to play around with for these forecast models is the number of previous data points that are used to calculate the mean value that is used as a forecast for future data points. The mean forecast models are implemented using the forecast::meanf() function. More information on mean forecast models can be found in this online book.
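As a minimal sketch of the idea, the mean forecast can be reproduced in base R without the forecast package. The helper name `mean_forecast` and its arguments are hypothetical, and the split at December 1955 is an assumption based on the 195512 split date used for the plots in this document.

```r
# Training data: AirPassengers up to the assumed split date (December 1955)
train <- window(AirPassengers, end = c(1955, 12))

# Hypothetical helper (not part of tsforecast): take the mean of the last
# `n_months` observations and repeat it for `h` future months
mean_forecast <- function(y, n_months, h = 12) {
  last_values <- tail(as.numeric(y), n_months)
  rep(mean(last_values), h)
}

fc_l12m <- mean_forecast(train, n_months = 12)  # mean over the last 12 months
fc_l3m  <- mean_forecast(train, n_months = 3)   # mean over the last 3 months
```

A shorter window reacts faster to recent changes, while a longer window averages out more of the seasonal pattern.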

The following versions of this model are available in the tsforecast package:

fc_mean_l12m

fc_mean_l12m = Mean value over the last 12 months

In this model, the mean is calculated over the last 12 months prior to the forecast date and then extrapolated as the forecast for the required future months. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_mean_l12m")

fc_mean_l6m

fc_mean_l6m = Mean value over the last 6 months

In this model, the mean is calculated over the last 6 months prior to the forecast date and then extrapolated as the forecast for the required future months. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_mean_l6m")

fc_mean_l3m

fc_mean_l3m = Mean value over the last 3 months

In this model, the mean is calculated over the last 3 months prior to the forecast date and then extrapolated as the forecast for the required future months. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_mean_l3m")

Drift forecast models {.tabset .tabset-fade .tabset-pills}

Drift forecast models are a variation on the naive method that allows the forecasts to increase or decrease over time, where the amount of change over time (called the drift) is set to the mean change seen in the historical data. This forecasting method is quite simple but can also be surprisingly effective, especially for a volatile time series with a clear trend. One parameter to play around with for these forecast models is the number of previous data points used to calculate the drift. The drift forecast models are implemented using the forecast::rwf(drift = TRUE) function. More information on drift forecast models can be found in this online book.
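The drift calculation can be sketched in base R as follows. The helper `drift_forecast` is hypothetical, and it assumes that the drift over a window of n months is the mean of the n - 1 month-on-month changes inside that window; the package may define the window slightly differently.

```r
# Training data: AirPassengers up to the assumed split date (December 1955)
train <- window(AirPassengers, end = c(1955, 12))

# Hypothetical helper (not part of tsforecast): random walk with drift,
# where the drift is estimated from the last `n_months` observations only
drift_forecast <- function(y, n_months, h = 12) {
  recent <- tail(as.numeric(y), n_months)
  # The mean of the successive changes equals (last - first) / (n_months - 1)
  drift <- mean(diff(recent))
  # Start from the last observation and add the drift once per future month
  tail(recent, 1) + drift * seq_len(h)
}

fc_drift_l12m <- drift_forecast(train, n_months = 12)
```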

The following versions of this model are available in the tsforecast package:

fc_drift_l12m

fc_drift_l12m = Random walk with drift over the last 12 months

In this model, the drift is calculated over the last 12 months prior to the forecast date and then extrapolated in the forecast for the required future months. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_drift_l12m")

fc_drift_l6m

fc_drift_l6m = Random walk with drift over the last 6 months

In this model, the drift is calculated over the last 6 months prior to the forecast date and then extrapolated in the forecast for the required future months. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_drift_l6m")

fc_drift_l3m

fc_drift_l3m = Random walk with drift over the last 3 months

In this model, the drift is calculated over the last 3 months prior to the forecast date and then extrapolated in the forecast for the required future months. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_drift_l3m")

Naive forecast models {.tabset .tabset-fade .tabset-pills}

In naive forecast models, the forecasts of all future values are simply set to the value of the last observation. This forecast method is also sometimes referred to as 'yesterday's weather'. The naive forecast method can work remarkably well for many economic and financial time series. A similar method, called the seasonal naive method, is useful for highly seasonal data. In that case, each forecast is set equal to the last observed value from the same season of the year (e.g., the same month of the previous year). More information on naive forecast models can be found in this online book.
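Both variants are simple enough to write out in base R. The sketch below is illustrative only and assumes a split at December 1955 and a 12-month horizon; the variable names are not taken from the package.

```r
# Training data: AirPassengers up to the assumed split date (December 1955)
train <- window(AirPassengers, end = c(1955, 12))
h <- 12  # forecast horizon in months

# Naive: repeat the last observed value for every future month
fc_naive <- rep(as.numeric(tail(train, 1)), h)

# Seasonal naive: repeat the last full seasonal cycle (here 12 months),
# so each future month gets the value of the same month one year earlier
m <- frequency(train)
last_cycle <- as.numeric(tail(train, m))
fc_snaive <- rep(last_cycle, length.out = h)
```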

The following versions of this model are available in the tsforecast package:

fc_naive

fc_naive = Naive forecast model which extrapolates the last known value

The naive forecast model is implemented using the forecast::naive() function. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_naive")

fc_naive_seasonal

fc_naive_seasonal = Seasonal Naive forecast model which extrapolates the last known value from the last same season

The seasonal naive forecast model is implemented using the forecast::snaive() function. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_naive_seasonal")

Linear models {.tabset .tabset-fade .tabset-pills}

Linear models can be fit to time series, including trend and seasonality components. A time series linear model is basically a regression in which the variables "trend" and "season" are derived from the time series characteristics of the data. The variable "trend" is a simple time trend and "season" is a factor indicating the season (e.g., the month or the quarter, depending on the frequency of the data). More information on applying linear models to time series data can be found in this online book.
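To make the construction of the "trend" and "season" variables concrete, here is a sketch that builds them by hand and fits the regression with stats::lm(). It is meant as an illustration of the idea under the assumption of a December 1955 split, not as a copy of what forecast::tslm() does internally.

```r
# Training data: AirPassengers up to the assumed split date (December 1955)
train <- window(AirPassengers, end = c(1955, 12))
months <- as.integer(cycle(train))  # position within the year: 1, ..., 12

# "trend" is a simple time index, "season" a factor with one level per month
train_df <- data.frame(
  y      = as.numeric(train),
  trend  = seq_along(train),
  season = factor(months, levels = 1:12)
)
fit <- lm(y ~ trend + season, data = train_df)

# Future values of the regressors: the trend keeps counting and
# the season continues from the month after the last observation
h <- 12
last_month <- months[length(months)]
future_df <- data.frame(
  trend  = length(train) + seq_len(h),
  season = factor((last_month + seq_len(h) - 1) %% 12 + 1, levels = 1:12)
)
fc_lm <- as.numeric(predict(fit, newdata = future_df))
```

Dropping `season` from the formula gives the trend-only variant.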

The following versions of this model are available in the tsforecast package:

fc_linear_trend

fc_linear_trend = Time series linear model with only a trend component

The linear model with only a trend component is implemented using the forecast::tslm(formula = x ~ trend) function, which is largely a wrapper for the generic stats::lm() function used to fit linear models. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_linear_trend")

fc_linear_trend_seasonal

fc_linear_trend_seasonal = Time series linear model with trend and seasonality components

The linear model with both a trend and a seasonal component is implemented using the forecast::tslm(formula = x ~ trend + season) function, which is largely a wrapper for the generic stats::lm() function used to fit linear models. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_linear_trend_seasonal")

Holt-Winters models {.tabset .tabset-fade .tabset-pills}

Holt-Winters models extend simple exponential smoothing to allow forecasting of data with a trend and seasonality. The method comprises a forecast equation and three smoothing equations: one for the level, one for the trend and one for the seasonal component. There are two variations of this method that differ in the nature of the seasonal component. The additive method is preferred when the seasonal variations are roughly constant through the series, while the multiplicative method is preferred when the seasonal variations change in proportion to the level of the series. A detailed description of Holt-Winters models can be found in this online book.
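The difference between the two seasonal variants can be tried out directly with stats::HoltWinters(). In this sketch the smoothing parameters alpha, beta and gamma are fixed to assumed values purely for illustration; leaving them out lets the function estimate them from the data instead.

```r
# Training data: AirPassengers up to the assumed split date (December 1955)
train <- window(AirPassengers, end = c(1955, 12))

# Assumed smoothing parameters for level (alpha), trend (beta) and season (gamma)
fit_add <- HoltWinters(train, alpha = 0.3, beta = 0.1, gamma = 0.1,
                       seasonal = "additive")
fit_mult <- HoltWinters(train, alpha = 0.3, beta = 0.1, gamma = 0.1,
                        seasonal = "multiplicative")

# Forecast the next 12 months with each variant
fc_add  <- as.numeric(predict(fit_add,  n.ahead = 12))
fc_mult <- as.numeric(predict(fit_mult, n.ahead = 12))
```

For a series like this one, where the seasonal swings grow with the level, the multiplicative variant would be expected to track the seasonal peaks more closely.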

The following versions of this model are available in the tsforecast package:

fc_holt_winters_addiv

fc_holt_winters_addiv = Holt-Winters filtering (with additive seasonality)

The Holt-Winters model with an additive seasonal component is implemented using the stats::HoltWinters(seasonal = 'additive') function, which computes Holt-Winters filtering of a given time series and estimates the smoothing parameters by minimizing the squared prediction error. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_holt_winters_addiv")

fc_holt_winters_multip

fc_holt_winters_multip = Holt-Winters filtering (with multiplicative seasonality)

The Holt-Winters model with a multiplicative seasonal component is implemented using the stats::HoltWinters(seasonal = 'multiplicative') function, which computes Holt-Winters filtering of a given time series and estimates the smoothing parameters by minimizing the squared prediction error. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_holt_winters_multip")

(T)BATS models {.tabset .tabset-fade .tabset-pills}

(T)BATS models are based on an innovations state space modeling framework for forecasting complex seasonal time series, such as those with multiple seasonal periods, high-frequency seasonality, non-integer seasonality and dual-calendar effects. The models incorporate Box-Cox transformations, Fourier representations with time-varying coefficients, and ARMA error correction. The framework for the (T)BATS models is described in detail in this online paper by De Livera, Hyndman & Snyder (2011).

The following versions of this model are available in the tsforecast package:

fc_bats

fc_bats = Box-Cox transform, ARMA errors, Trend, and Seasonal components

The BATS model is implemented using the forecast::bats(stepwise = TRUE, approximation = TRUE) function, which fits an exponential smoothing state space model with Box-Cox transformation, ARMA errors, Trend and Seasonal components, as described in the paper by De Livera, Hyndman & Snyder (2011). The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_bats")

fc_tbats

fc_tbats = Box-Cox transform, ARMA errors, Trend, and Seasonal components (with Trigonometric seasonal formulation)

The TBATS model is implemented using the forecast::tbats(stepwise = TRUE, approximation = TRUE) function, which is basically a BATS model with a trigonometric seasonal formulation, as described in the paper by De Livera, Hyndman & Snyder (2011). The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_tbats")

ETS models {.tabset .tabset-fade .tabset-pills}

ETS models (with Error, Trend and Seasonality components) are based on the concept of exponential smoothing, in which previous observations are weighted according to some function to calculate forecasts. A framework methodology introduced by Hyndman et al. (2002 & 2008) considers 15 possible exponential smoothing methods, combined with 2 different models: one with additive errors and one with multiplicative errors. This creates a taxonomy of 30 different forecast methods that are assessed within the ETS framework to create a forecast model. Model selection is based on a specified information criterion, which is one of AICc, AIC or BIC. One of the advantages of ETS is that it can handle any combination of trend, seasonality and damping. An extensive description of ETS models can be found in this online book.
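The weighting of previous observations is easiest to see in simple exponential smoothing, the most basic member of this family (no trend, no seasonality). The sketch below is not what forecast::ets() runs; the helper `ses_level`, the smoothing parameter `alpha = 0.3` and the initial level are assumptions chosen for illustration.

```r
# Training data: AirPassengers up to the assumed split date (December 1955)
train <- as.numeric(window(AirPassengers, end = c(1955, 12)))
alpha <- 0.3  # assumed smoothing parameter, not estimated from the data

# Hypothetical helper: recursive form of simple exponential smoothing,
# level_t = alpha * y_t + (1 - alpha) * level_{t-1}
ses_level <- function(y, alpha, l0 = y[1]) {
  level <- l0
  for (obs in y) {
    level <- alpha * obs + (1 - alpha) * level
  }
  level
}
level_T <- ses_level(train, alpha)

# Equivalent weighted-average form: the weights decay geometrically
# with the age of the observation (most recent observation first)
n <- length(train)
obs_weights <- alpha * (1 - alpha)^(0:(n - 1))
level_T_weighted <- sum(obs_weights * rev(train)) + (1 - alpha)^n * train[1]

# The forecast for every future month equals the last smoothed level
fc_ses <- rep(level_T, 12)
```

The full ETS framework adds trend and seasonal states to this recursion and selects among the resulting models with an information criterion.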

The following versions of this model are available in the tsforecast package:

fc_ets_addiv

fc_ets_addiv = Exponential smoothing model (with only additive trend allowed)

The ETS model with only additive trend allowed is implemented using the forecast::ets(additive.only = TRUE, allow.multiplicative.trend = FALSE) function, which conducts a search over the different possible methods from the ETS framework methodology by Hyndman et al. (2002 & 2008). The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_ets_addiv")

fc_ets_multip

fc_ets_multip = Exponential smoothing model (with multiplicative trend allowed)

The ETS model with multiplicative trend allowed is implemented using the forecast::ets(additive.only = FALSE, allow.multiplicative.trend = TRUE) function, which conducts a search over the different possible methods from the ETS framework methodology by Hyndman et al. (2002 & 2008). The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_ets_multip")

fc_ets_addiv_damped

fc_ets_addiv_damped = Exponential smoothing model (with only additive damped trend allowed)

The ETS model with only additive damped trend allowed is implemented using the forecast::ets(additive.only = TRUE, allow.multiplicative.trend = FALSE, damped = TRUE) function, which conducts a search over the different possible methods from the ETS framework methodology by Hyndman et al. (2002 & 2008). The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_ets_addiv_damped")

fc_ets_multip_damped

fc_ets_multip_damped = Exponential smoothing model (with multiplicative damped trend allowed)

The ETS model with multiplicative damped trend allowed is implemented using the forecast::ets(additive.only = FALSE, allow.multiplicative.trend = TRUE, damped = TRUE) function, which conducts a search over the different possible methods from the ETS framework methodology by Hyndman et al. (2002 & 2008). The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_ets_multip_damped")

fc_ets_stl

fc_ets_stl = Exponential smoothing model after applying Seasonal decomposition of the Time series by Loess

The STL decomposition model is implemented using the forecast::stlm(method = 'ets') function, which applies the ETS forecasting method to the seasonally adjusted data and then re-seasonalizes the forecast using the last year of the seasonal component. More information on forecasting with decomposition can be found in this online book. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_ets_stl")

ARIMA models {.tabset .tabset-fade .tabset-pills}

In statistics and econometrics an autoregressive integrated moving average (ARIMA) model is a generalization of an autoregressive moving average (ARMA) model. ARIMA models are applied in some cases where data show evidence of non-stationarity, where an initial differencing step (corresponding to the "integrated" part of the model) can be applied one or more times to eliminate the non-stationarity. An extensive description of ARIMA models can be found in this online book, which explains the concept of stationarity and differencing, autoregression and moving average models.
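The differencing step can be illustrated with base R. The sketch below first differences the log-transformed series by hand, then fits one fixed seasonal ARIMA specification with stats::arima(). The order (0,1,1)(0,1,1) with period 12 is an assumption chosen for illustration (it is the specification commonly associated with this airline dataset), whereas an automatic search would select the order from the data.

```r
# Training data: AirPassengers up to the assumed split date (December 1955)
train <- window(AirPassengers, end = c(1955, 12))
log_train <- log(train)  # stabilise the growing seasonal variation

# Differencing by hand: a first difference removes the trend,
# an additional lag-12 difference removes the yearly seasonality
diff_trend    <- diff(log_train)
diff_seasonal <- diff(diff_trend, lag = 12)

# Assumed fixed model order; the "integrated" part (d = 1, D = 1)
# performs the same two differencing steps internally
fit <- arima(log_train,
             order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12))

# Forecast 12 months ahead and transform back to the original scale
fc_arima <- exp(predict(fit, n.ahead = 12)$pred)
```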

The following versions of this model are available in the tsforecast package:

fc_arima

fc_arima = AutoRegressive Integrated Moving Average

The ARIMA model is implemented using the forecast::auto.arima(stepwise = TRUE, approximation = TRUE) function, which conducts a search over possible models within the order constraints provided. Model selection is based on a specified information criterion, which is one of AIC, AICc or BIC. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_arima")

fc_arima_stl

fc_arima_stl = AutoRegressive Integrated Moving Average after applying Seasonal decomposition of the Time series by Loess

The STL decomposition model is implemented using the forecast::stlm(method = 'arima') function, which applies the ARIMA forecasting method to the seasonally adjusted data and then re-seasonalizes the forecast using the last year of the seasonal component. More information on forecasting with decomposition can be found in this online book. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_arima_stl")

Kalman Filter models {.tabset .tabset-fade .tabset-pills}

Kalman filters are used with state space models, in which the observations are assumed to be noisy, so that the true value behind each observation is latent. This latent variable is known as the underlying state, and the Kalman filter aims to filter out the observation noise to estimate it. Using the dlm package in R, one can run Kalman filters on many different functional forms of the underlying state. To see what these underlying states can look like, see this excellent presentation by Rob J. Hyndman, Professor of Statistics and Head of the Department of Econometrics & Business Statistics at Monash University, Australia.
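To show the filtering step itself, here is a hand-written Kalman filter for the simplest possible underlying state, a local level (random walk) observed with noise. This is a sketch only: it does not use the dlm package, the helper `local_level_filter` is hypothetical, and the two variances are assumed values rather than estimates.

```r
# Training data: AirPassengers up to the assumed split date (December 1955)
y <- as.numeric(window(AirPassengers, end = c(1955, 12)))

obs_var   <- 400  # assumed variance of the observation noise
state_var <- 100  # assumed variance of the state (level) innovations

# Hypothetical helper: Kalman filter for the local level model
#   y_t     = level_t + observation noise
#   level_t = level_{t-1} + state noise
local_level_filter <- function(y, obs_var, state_var, m0 = y[1], c0 = 1e4) {
  m <- m0        # current estimate of the state
  c_var <- c0    # current uncertainty (variance) of that estimate
  filtered <- numeric(length(y))
  for (i in seq_along(y)) {
    r <- c_var + state_var     # predict: uncertainty grows by the state noise
    k <- r / (r + obs_var)     # Kalman gain: how much to trust the new observation
    m <- m + k * (y[i] - m)    # update the state with the prediction error
    c_var <- (1 - k) * r       # uncertainty shrinks after seeing the observation
    filtered[i] <- m
  }
  filtered
}

state_estimate <- local_level_filter(y, obs_var, state_var)

# For a local level model, the forecast is the last filtered state
fc_kalman_level <- rep(tail(state_estimate, 1), 12)
```

Richer underlying states, such as a polynomial trend or a seasonal component, follow the same predict and update steps with a vector-valued state.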

The following versions of this model are available in the tsforecast package:

fc_kalman_poly

fc_kalman_poly = Kalman Filter with a Polynomial Underlying State

The underlying state is assumed to be a polynomial of order two (a level and a slope), i.e. a linear trend.

create_fc_model_plot("fc_kalman_poly")

fc_kalman_seas_12

fc_kalman_seas_12 = Kalman Filter with a Seasonality of 12 Months in the Underlying State

The underlying state is assumed to have a seasonality of 12 months.

create_fc_model_plot("fc_kalman_seas_12")

Neural Network models {.tabset .tabset-fade .tabset-pills}

Artificial neural networks are forecasting methods that are based on simple mathematical models of the brain. They allow complex nonlinear relationships between the response variable and its predictors. The very simplest networks contain no hidden layers and are equivalent to linear regression. Once we add an intermediate layer with hidden neurons, the neural network becomes non-linear. With time series data, lagged values of the time series can be used as inputs to a neural network. More information on applying neural networks for time series forecasting can be found in this online book.

The following versions of this model are available in the tsforecast package:

fc_nn_5n_0decay

fc_nn_5n_0decay = Feed-forward Neural Network with a single hidden layer of 5 nodes without weight decay

The Neural Network model is implemented using the forecast::nnetar(size = 5) function, which creates a feed-forward neural network with a single hidden layer consisting of 5 nodes without weight decay and with lagged inputs for forecasting the time series. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_nn_5n_0decay")

fc_nn_25n_0decay

fc_nn_25n_0decay = Feed-forward Neural Network with a single hidden layer of 25 nodes without weight decay

The Neural Network model is implemented using the forecast::nnetar(size = 25) function, which creates a feed-forward neural network with a single hidden layer consisting of 25 nodes without weight decay and with lagged inputs for forecasting the time series. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_nn_25n_0decay")

fc_nn_5n_50decay

fc_nn_5n_50decay = Feed-forward Neural Network with a single hidden layer of 5 nodes with a weight decay of 0.50

The Neural Network model is implemented using the forecast::nnetar(size = 5, decay = 0.50) function, which creates a feed-forward neural network with a single hidden layer consisting of 5 nodes with a weight decay of 0.50 and with lagged inputs for forecasting the time series. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_nn_5n_50decay")

fc_nn_25n_50decay

fc_nn_25n_50decay = Feed-forward Neural Network with a single hidden layer of 25 nodes with a weight decay of 0.50

The Neural Network model is implemented using the forecast::nnetar(size = 25, decay = 0.50) function, which creates a feed-forward neural network with a single hidden layer consisting of 25 nodes with a weight decay of 0.50 and with lagged inputs for forecasting the time series. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_nn_25n_50decay")

fc_nn_5n_mlp

fc_nn_5n_mlp = MultiLayer Perceptron Neural Network with a single hidden layer of 5 nodes

The MLP Neural Network model is implemented using the nnfor::mlp(x, hd = 5, reps = 10) function, which creates a multilayer perceptron neural network with a single hidden layer consisting of 5 nodes for forecasting the time series. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_nn_5n_mlp")

fc_nn_25n_elm

fc_nn_25n_elm = Extreme Learning Machine Neural Network with a single hidden layer of 25 nodes

The ELM Neural Network model is implemented using the nnfor::elm(x, hd = 25, reps = 10) function, which creates an extreme learning machine neural network with a single hidden layer consisting of 25 nodes for forecasting the time series. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_nn_25n_elm")

Prophet models {.tabset .tabset-fade .tabset-pills}

Prophet is a procedure for forecasting time series data developed and released as open source software by Facebook's Core Data Science team. Prophet is used in many applications across Facebook for producing reliable forecasts for planning and goal setting. It is robust to outliers, missing data, and dramatic changes in time series, for which it can automatically select changepoints. The flexibility of the automatic changepoint selection can be tuned through one of the parameters, to make the model more or less sensitive to changes in the time series pattern over time. More information on Prophet can be found on the Facebook GitHub or in this paper.

The following versions of this model are available in the tsforecast package:

fc_prophet_005cps

fc_prophet_005cps = Prophet model with low flexibility for the automatic ChangePoint Selection

The Prophet model with low flexibility for automatic changepoint selection is implemented using the prophet::prophet(changepoint.prior.scale = 0.005) function. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_prophet_005cps")

fc_prophet_050cps

fc_prophet_050cps = Prophet model with medium (default) flexibility for the automatic ChangePoint Selection

The Prophet model with medium (default) flexibility for automatic changepoint selection is implemented using the prophet::prophet(changepoint.prior.scale = 0.050) function. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_prophet_050cps")

fc_prophet_500cps

fc_prophet_500cps = Prophet model with high flexibility for the automatic ChangePoint Selection

The Prophet model with high flexibility for automatic changepoint selection is implemented using the prophet::prophet(changepoint.prior.scale = 0.500) function. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_prophet_500cps")

Regression Trees and Random Forests {.tabset .tabset-fade .tabset-pills}

Regression trees and ensemble models are some of the most commonly used machine learning methods. A regression tree is the outcome of a non-parametric regression: a decision tree that shows what value the variable of interest takes for different levels of the explanatory variables. Trees can work with large numbers of explanatory variables, select the ones that have the most impact and also visualize their relative importance.

Their downside is that they are prone to over-fitting. This can be overcome with fine-tuning the regression tree parameters and/or using ensemble methods, making use of many trees (known as random forests). Another downside is that they are not able to account for trends when forecasting. Thus the trees take the first difference of the variable of interest as their dependent variable.

The following versions of this model are available in the tsforecast package:

fc_rpart

fc_rpart = Recursive PARTioning Trees

The RPART model is implemented using the rpart::rpart() function. An intuitive description of its inner workings can be found here. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_rpart")

fc_ctree

fc_ctree = Conditional Inference TREEs

The CTREE model is implemented using the party::ctree() function. An intuitive description of its inner workings can be found here. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_ctree")

fc_randomforest

fc_randomforest = Random Forest

The Random Forest is implemented using the randomForest::randomForest() function. One of the first ensemble methods developed by statisticians at UC Berkeley, the random forest is an ensemble of multiple tree models as shown previously in this documentation. The original article describing the random forest can be found here. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_randomforest")

Ensemble of Forecast models {.tabset .tabset-fade .tabset-pills}

It has been well known since at least 1969, when Bates and Granger wrote their famous paper on 'The Combination of Forecasts', that combining forecasts often leads to better forecast accuracy. Combining predictions or forecasts from multiple models is usually referred to as ensemble learning and the resulting model is called an ensemble model. The ensemble models used in this package are implemented using the forecastHybrid::hybridModel() function, which fits multiple models from the forecast package and then combines them using either equal weights or weights based on in-sample errors. More information on the implementation can be found here.
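The equal-weights combination is simple to sketch in base R. The three component forecasts below (seasonal naive, drift and 12-month mean) are simple stand-ins chosen so that the example needs no extra packages; they are not the component models of the ensemble in this package.

```r
# Training data: AirPassengers up to the assumed split date (December 1955)
y <- as.numeric(window(AirPassengers, end = c(1955, 12)))
h <- 12  # forecast horizon in months
n <- length(y)

# Three simple stand-in forecasts for the next 12 months
fc_snaive <- tail(y, 12)                                  # same month last year
fc_drift  <- y[n] + (y[n] - y[1]) / (n - 1) * seq_len(h)  # random walk with drift
fc_mean   <- rep(mean(tail(y, 12)), h)                    # mean of the last 12 months

# One column per model, one row per forecast month
forecasts <- cbind(snaive = fc_snaive, drift = fc_drift, mean = fc_mean)

# Equal weights: every model contributes 1 / (number of models)
model_weights <- rep(1 / ncol(forecasts), ncol(forecasts))
fc_ensemble <- as.numeric(forecasts %*% model_weights)
```

Weights based on in-sample errors would replace `model_weights` with values that favour the models with the smaller errors, still summing to one.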

The following version of this model is available in the tsforecast package:

fc_ensemble_aefnst

fc_ensemble_aefnst = Ensemble of ARIMA, ETS, Neural Network, Linear (s), TBATS and Seasonal Naive (z) models

This ensemble model is implemented using all six available models, as specified above. The model forecasts are combined using equal weights across each of the models. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_ensemble_aefnst")

Recursive ML models {.tabset .tabset-fade .tabset-pills}

Whereas the previously described Machine Learning (ML) models (Regression Trees and Random Forests) use multi-step forecasting to predict all required future values in one go, the recursive ML models use one-step forecasting. In one-step forecasting, only the first point is predicted and is then added to the dataset and used to do another one-step forecast to predict the next point. The recursive ML models are implemented using the caret package, which is short for Classification And REgression Training. It contains a set of functions that attempt to streamline the process for creating predictive models, which can be selected from a list of available model tags.
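The recursive one-step procedure can be sketched as a loop in base R. To keep the example free of extra packages, a plain linear regression on two lagged values stands in for the ML model; the choice of lags (1 and 12) and the use of stats::lm() instead of a caret model are assumptions made for illustration.

```r
# Training data: AirPassengers up to the assumed split date (December 1955)
y <- as.numeric(window(AirPassengers, end = c(1955, 12)))

# Training frame with assumed lag features: the previous month (lag 1)
# and the same month one year earlier (lag 12)
idx <- 13:length(y)
train_df <- data.frame(
  y     = y[idx],
  lag1  = y[idx - 1],
  lag12 = y[idx - 12]
)
fit <- lm(y ~ lag1 + lag12, data = train_df)  # stand-in for the ML model

# Recursive forecasting: predict one step, append the prediction to the
# history, then use it as an input for the next one-step prediction
h <- 12
history <- y
for (step in seq_len(h)) {
  n <- length(history)
  next_inputs <- data.frame(lag1 = history[n], lag12 = history[n - 11])
  next_value <- as.numeric(predict(fit, newdata = next_inputs))
  history <- c(history, next_value)
}
fc_recursive <- tail(history, h)
```

From the second step onwards the `lag1` input is itself a prediction, which is why errors can accumulate over the horizon in recursive forecasting.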

The following versions of this methodology are available in the tsforecast package:

fc_rec_svmradsig

fc_rec_svmradsig = Recursive Support Vector Machines with Radial Basis Function Kernel, tuning parameter sigma

This model is a recursive implementation of a Support Vector Machine (SVM) which tunes over the cost parameter and the Radial Basis Function (RBF) kernel parameter sigma. The model is implemented using the method = 'svmRadialSigma' tag from the caret package. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_rec_svmradsig")

fc_rec_rpart

fc_rec_rpart = Recursive, Recursive PARTioning Trees

This model is a recursive implementation of a recursive partitioning and regression tree. An intuitive description of its inner workings can be found here. The model is implemented using the method = 'rpart' tag from the caret package. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_rec_rpart")

fc_rec_ctree

fc_rec_ctree = Recursive Conditional Inference Tree

This model is a recursive implementation of a conditional inference tree. An intuitive description of its inner workings can be found here. The model is implemented using the method = 'ctree' tag from the caret package. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_rec_ctree")

fc_rec_cforest

fc_rec_cforest = Recursive Conditional Inference Random Forest

This model is a recursive implementation of a conditional inference random forest. One of the first ensemble methods developed by statisticians at UC Berkeley, the random forest is an ensemble of multiple tree models as shown previously in this documentation. The original article describing the random forest can be found here. The model is implemented using the method = 'cforest' tag from the caret package. The visualization below demonstrates the model when applied to the example dataset.

create_fc_model_plot("fc_rec_cforest")



ing-bank/tsforecast documentation built on Sept. 18, 2020, 9:40 a.m.