Feature Name: `Forecasting Reduction Techniques`
Start Date: 2019-11-16
Target Date:
```
## Summary
[summary]: #summary
- Lagged values of a timeseries can be used as features for a machine learning algorithm.
- Two major apporaches: rolling window regression and expanding window regression
## Motivation
[motivation]: #motivation
Most machine learning algorithms are not designed for temporal data. To still use these methods, we have to put the data in the right shape for our algorithm.
## Guide-level explanation
[guide-level-explanation]: #guide-level-explanation
Basically there are two major approaches to feed timeseries data into a regression method: the "rolling window"-approach and the "expanding window"-approach. Both methods use the last k values as features to predict the next value(s) in the timeseries.
Example for k = 2: Predict X(t) given X(t-1) and X(t-2)
X(t) X(t-1) X(t-2) 1 1 NA NA 2 2 1 NA 3 3 2 1 4 4 3 2 5 5 4 3 6 6 5 4 7 7 6 5 8 8 7 6 9 9 8 7 10 10 9 8
``` The difference between these two approaches is the amount of data that is used for making the prediction. While the rolling window (with window size n) approach uses the last n values for training and then predicts the next values, the expanding window approach trains the model before every prediction on all the available data.
Example: Suppose we are at timestep 50 of a timeseries
Rolling window approach:
Expanding window approach:
For multi-step-forecasts there are several options to tackle the problem.
For the direct forecasting method you build a seperate model for each timestep you want to predict.
An alternative to this approach is building a multi-output regression model that can handle the forecasts directly without the need for additional models.
For the recursive forecasting method you build one model and predict timestep t+1. To forecast the next value (t+2) you feed this prediction back into the model.
It would also be possible to combine the two strategies: build a seperate model for each timestep, but feed the prediction of the previous timestep as input.
The main drawback of this two approaches are the hurt iid assumptions that are usually required for machine learning algorithms. Another point is how we want to handle multi-step ahead forecasts, as there are several options to do that.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.