The goal of
modeltime_forecast() is to simplify the process of
forecasting future data.
A Modeltime Table
The forecast horizon (can be used instead of
Reference data that is combined with the output tibble and given a
An estimated confidence interval based on the calibration data. This is designed to estimate future confidence from out-of-sample prediction error.
Whether or not to produce confidence interval estimates by an ID feature.
Whether or not to keep the
Whether or not to sort the index in rowwise chronological order (oldest to newest) or to
keep the original order of the data.
Not currently used
The modeltime_forecast() function prepares a forecast for visualization with
plot_modeltime_forecast(). The forecast is controlled by
which can be combined with existing data (controlled by
Confidence intervals are included if the incoming Modeltime Table has been
Otherwise confidence intervals are not estimated.
When forecasting you can specify future data using
This is a future tibble with a date column that extends the trained dates,
plus columns for exogenous regressors (xregs) if used.
Forecasting Evaluation Data: By default, the
new_data will use the
new_data is not provided.
This is the equivalent of using
rsample::testing() for getting test data sets.
Forecasting Future Data: See
timetk::future_frame() for creating future tibbles.
Xregs: Can be used with this method
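The new_data workflow described above can be sketched as follows. This is a minimal illustration, assuming the m4_monthly data set that ships with timetk and a model without xregs; the object names (m750, future_tbl, forecast_tbl) are illustrative only. If the model used xregs, the future tibble would also need those columns filled in before forecasting.

```r
library(dplyr)
library(timetk)
library(parsnip)
library(modeltime)

# Monthly series from the timetk package
m750 <- m4_monthly %>% filter(id == "M750")

# Fit on the full history (no xregs in this sketch)
model_fit_arima <- arima_reg() %>%
  set_engine(engine = "auto_arima") %>%
  fit(value ~ date, data = m750)

# Build a future tibble: 12 monthly time stamps after the last observed date
future_tbl <- m750 %>%
  future_frame(.date_var = date, .length_out = 12)

# Forecast the future time stamps; actual_data is attached for reference
forecast_tbl <- modeltime_table(model_fit_arima) %>%
  modeltime_forecast(new_data = future_tbl, actual_data = m750)
```

Because the table is not calibrated in this sketch, the result carries predictions without confidence intervals.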
When forecasting, you can specify
h. This is a phrase like "1 year",
which extends the
.calibration_data (1st priority) or the
actual_data (2nd priority)
into the future.
Forecasting Future Data: All forecasts using
extended after the calibration data or actual_data.
.calibration_data - Calibration data is given 1st priority, which is
desirable after refitting with
Internally, a call is made to
expedite creating new data using the date feature.
actual_data - If
h is provided, and the modeltime table has not been
calibrated, the "actual_data" will be extended into the future. This is useful
in situations where you want to go directly from
without calibrating or refitting.
Xregs: Cannot be used because future data must include new xregs.
If xregs are desired, build a future data frame and use
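The two priorities for h can be sketched as below. This is an illustration under stated assumptions rather than a canonical workflow: it uses the m4_monthly data from timetk, models without xregs, and illustrative object names. The first path calibrates and refits before extending the calibration data; the second goes straight from a Modeltime Table to a forecast, so the actual_data is what gets extended.

```r
library(dplyr)
library(timetk)
library(parsnip)
library(rsample)
library(modeltime)

m750   <- m4_monthly %>% filter(id == "M750")
splits <- initial_time_split(m750, prop = 0.9)

# Path 1: calibrated table. The calibration data is extended by h.
fit_train <- arima_reg() %>%
  set_engine(engine = "auto_arima") %>%
  fit(value ~ date, data = training(splits))

forecast_calibrated <- modeltime_table(fit_train) %>%
  modeltime_calibrate(new_data = testing(splits)) %>%
  modeltime_refit(data = m750) %>%
  modeltime_forecast(h = "1 year", actual_data = m750)

# Path 2: uncalibrated table. The actual_data is extended by h.
fit_full <- arima_reg() %>%
  set_engine(engine = "auto_arima") %>%
  fit(value ~ date, data = m750)

forecast_uncalibrated <- modeltime_table(fit_full) %>%
  modeltime_forecast(h = "1 year", actual_data = m750)
```

Only the calibrated path is expected to return the .conf_lo and .conf_hi columns.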
This is reference data that contains the true values of the time-stamp data. It helps in visualizing the performance of the forecast vs the actual data.
h is used and the Modeltime Table has not been calibrated, then the
actual data is extended into the future periods that are defined by
Confidence Interval Estimation
Confidence intervals (.conf_lo,
.conf_hi) are estimated based on the normal estimation of
the testing errors (out of sample) from
The out-of-sample error estimates are then carried through and
applied to any future forecasts.
The confidence interval can be adjusted with the
conf_interval parameter. An
80% confidence interval assumes normally distributed (Gaussian) errors, such that
80% of the future data is expected to fall within the upper and lower confidence limits.
The confidence interval is mean-adjusted: if the mean of the residuals is non-zero, the interval is widened to capture the difference in means.
Refitting has no effect on the confidence interval, since the interval is calculated independently of the refitted model (on data with a smaller sample size). New observations typically improve future accuracy, which in most cases makes the out-of-sample confidence intervals conservative.
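The idea of a normal, mean-adjusted interval can be sketched in base R. This is one plausible reading of the description above, not the package's internal code: normal_interval() is a hypothetical helper, the residual and prediction values are made up for illustration, and adding the absolute residual mean to the half-width is an assumption about what "mean-adjusted" means.

```r
# Hypothetical helper (not part of modeltime): builds a normal-based interval
# around point predictions from out-of-sample residuals.
normal_interval <- function(prediction, residuals, conf_interval = 0.80) {
  mu    <- mean(residuals)  # non-zero when the model is biased on the test set
  sigma <- sd(residuals)    # spread of the out-of-sample errors

  # Two-sided normal quantile, e.g. about 1.28 for an 80% interval
  z <- qnorm(1 - (1 - conf_interval) / 2)

  # Assumption: widen the interval by the absolute bias
  half_width <- z * sigma + abs(mu)

  data.frame(
    .value   = prediction,
    .conf_lo = prediction - half_width,
    .conf_hi = prediction + half_width
  )
}

# Illustrative numbers only
test_residuals     <- c(-12, 5, 8, -3, 10, 4)
future_predictions <- c(100, 105, 110)

interval_80 <- normal_interval(future_predictions, test_residuals, conf_interval = 0.80)
interval_95 <- normal_interval(future_predictions, test_residuals, conf_interval = 0.95)
```

The same residual-based width is applied to every future prediction, which mirrors the statement that the out-of-sample error estimates are carried through to future forecasts.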
Include the new data (and actual data) as extra columns with the results of the model forecasts. This can be helpful when the new data includes information useful to the forecasts. An example is when forecasting Panel Data and the new data contains ID features related to the time series group that the forecast belongs to.
modeltime_forecast() keeps the original order of the data.
If desired, the user can sort the output by
A tibble with predictions and time-stamp data. For ease of plotting and calculations, the column names are transformed to:
.key: Values labeled either "prediction" or "actual"
.index: The timestamp index.
.value: The value being forecasted.
Additionally, if the Modeltime Table has been previously calibrated using
you will gain confidence intervals.
.conf_lo: The lower limit of the confidence interval.
.conf_hi: The upper limit of the confidence interval.
Additional descriptive columns are included:
.model_id: Model ID from the Modeltime Table
.model_desc: Model Description from the Modeltime Table
Unnecessary columns are dropped to save space:
library(tidyverse)
library(lubridate)
library(timetk)
library(parsnip)
library(rsample)
library(modeltime)  # provides arima_reg(), modeltime_table(), modeltime_forecast()

# Data
m750 <- m4_monthly %>% filter(id == "M750")

# Split Data 90/10
splits <- initial_time_split(m750, prop = 0.9)

# --- MODELS ---

# Model 1: auto_arima ----
model_fit_arima <- arima_reg() %>%
    set_engine(engine = "auto_arima") %>%
    fit(value ~ date, data = training(splits))

# ---- MODELTIME TABLE ----

models_tbl <- modeltime_table(
    model_fit_arima
)

# ---- CALIBRATE ----

calibration_tbl <- models_tbl %>%
    modeltime_calibrate(new_data = testing(splits))

# ---- ACCURACY ----

calibration_tbl %>%
    modeltime_accuracy()

# ---- FUTURE FORECAST ----

calibration_tbl %>%
    modeltime_forecast(
        new_data    = testing(splits),
        actual_data = m750
    )

# ---- ALTERNATIVE: FORECAST WITHOUT CONFIDENCE INTERVALS ----
# Skips Calibration Step, No Confidence Intervals

models_tbl %>%
    modeltime_forecast(
        new_data    = testing(splits),
        actual_data = m750
    )

# ---- KEEP NEW DATA WITH FORECAST ----
# Keeps the new data. Useful if new data has information
# like ID features that should be kept with the forecast data

calibration_tbl %>%
    modeltime_forecast(
        new_data  = testing(splits),
        keep_data = TRUE
    )