knitr::opts_chunk$set(
  message = FALSE,
  warning = FALSE,
  fig.width = 8,
  fig.height = 4.5,
  fig.align = 'center',
  out.width = '95%',
  dpi = 100
)
Modeltime Resample provides a convenient toolkit for efficiently evaluating multiple models across time, increasing our confidence in model selections.
modeltime_fit_resamples() automates the iterative model fitting and prediction procedure.
plot_modeltime_resamples() provides a quick way to review model resample accuracy visually.
modeltime_resample_accuracy() provides a flexible way to create custom accuracy tables using customizable summary functions (e.g. mean, median, sd, min, max).
Resampling gives us a way to compare multiple models across time.
In this tutorial, we'll get you up to speed by evaluating multiple models using resampling of a single time series.
Load the following R packages.
library(tidymodels)
library(modeltime)
library(modeltime.resample)
library(tidyverse)
library(timetk)
We'll work with the m750 data set.
m750 %>% plot_time_series(date, value, .interactive = FALSE)
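We'll use timetk::time_series_cv() to generate 4 time-series resamples.
assess = "2 years" sets the assessment window.
initial = "5 years" sets the training window.
skip = "2 years" sets the shift between resample sets.
slice_limit = 4 sets how many resamples to generate.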
resamples_tscv <- time_series_cv(
  data        = m750,
  assess      = "2 years",
  initial     = "5 years",
  skip        = "2 years",
  slice_limit = 4
)

resamples_tscv
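To build intuition for how assess, initial, skip, and slice_limit interact, here is a minimal base R sketch of backward-stepping rolling windows. It is a conceptual illustration only, not timetk's implementation: the function name backward_windows() is hypothetical, the windows are expressed as row indices rather than dates, and the series length of 306 monthly rows is an assumed value chosen for the example.

```r
# Conceptual sketch only: NOT timetk's implementation.
# Builds index-based windows that start at the most recent observation
# and step backward, mirroring assess / initial / skip / slice_limit.
backward_windows <- function(n, initial, assess, skip, slice_limit) {
  slices <- list()
  end <- n  # last row of the current assessment window
  while (length(slices) < slice_limit) {
    train_start <- end - assess - initial + 1
    if (train_start < 1) break  # not enough history left for a full slice
    slices[[length(slices) + 1]] <- list(
      train  = train_start:(end - assess),
      assess = (end - assess + 1):end
    )
    end <- end - skip  # shift the whole window back by `skip` rows
  }
  slices
}

# Monthly data: 5 years = 60 rows, 2 years = 24 rows.
# n = 306 is a hypothetical series length used for illustration.
windows <- backward_windows(
  n = 306, initial = 60, assess = 24, skip = 24, slice_limit = 4
)

length(windows)              # number of slices generated
range(windows[[1]]$train)    # training rows of the most recent slice
range(windows[[1]]$assess)   # assessment rows of the most recent slice
```

Because skip equals assess in this setup, the assessment windows of consecutive slices sit back to back without overlapping, while the 5-year training windows do overlap.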
Next, visualize the resample strategy to make sure we're happy with our choices.
# Begin with a Cross Validation Strategy
resamples_tscv %>%
  tk_time_series_cv_plan() %>%
  plot_time_series_cv_plan(date, value, .facet_ncol = 2, .interactive = FALSE)
Create models and add them to a Modeltime Table with Modeltime. I've already created 3 models (ARIMA, Prophet, and GLMNET) and saved the results as m750_models, which is included in the modeltime package.
m750_models
Generate resample predictions using modeltime_fit_resamples():
Use m750_models (the models) and resamples_tscv (the resamples).
The .resample_results column contains the resample predictions.
resamples_fitted <- m750_models %>%
  modeltime_fit_resamples(
    resamples = resamples_tscv,
    control   = control_resamples(verbose = FALSE)
  )

resamples_fitted
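If you want to look at the individual resample predictions rather than the nested column, one option is to flatten them into a single table. The sketch below assumes that modeltime.resample exports unnest_modeltime_resamples() and bundles an example object named m750_training_resamples_fitted; both names are recalled from the package reference and should be checked against your installed version. The fitted table from this tutorial can be passed in its place.

```r
library(modeltime)
library(modeltime.resample)

# Assumption: `m750_training_resamples_fitted` is an example object bundled with
# modeltime.resample. Swap in `resamples_fitted` from this tutorial if available.
# Assumption: `unnest_modeltime_resamples()` flattens the nested
# `.resample_results` column into one long table of predictions.
predictions_tbl <- unnest_modeltime_resamples(m750_training_resamples_fitted)

# Inspect the column names and types of the flattened predictions
dplyr::glimpse(predictions_tbl)
```

A flat table like this is convenient for custom diagnostics, such as plotting predictions against actual values per resample.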
Visualize the model resample accuracy using plot_modeltime_resamples().
resamples_fitted %>%
  plot_modeltime_resamples(
    .point_size  = 3,
    .point_alpha = 0.8,
    .interactive = FALSE
  )
We can compare the overall modeling approaches by evaluating the results with modeltime_resample_accuracy(). The default is to report the average (summary_fns = mean), but this can be changed to any summary function or a list containing multiple summary functions (e.g. summary_fns = list(mean = mean, sd = sd)). From the table below, ARIMA has a 6% lower RMSE, indicating it's the best choice for consistent performance on this dataset.
resamples_fitted %>%
  modeltime_resample_accuracy(summary_fns = mean) %>%
  table_modeltime_accuracy(.interactive = FALSE)
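Reporting a spread alongside the average can help judge how stable each model is across resamples. Here is a short sketch that passes a list of summary functions. To keep it self-contained, it assumes the example object m750_training_resamples_fitted ships with modeltime.resample; the fitted table from this tutorial can be used in its place.

```r
library(dplyr)
library(modeltime)
library(modeltime.resample)

# Assumption: `m750_training_resamples_fitted` is a bundled example object that
# stands in for the `resamples_fitted` table built in this tutorial.
accuracy_tbl <- m750_training_resamples_fitted %>%
  modeltime_resample_accuracy(
    summary_fns = list(mean = mean, sd = sd)  # average and spread per metric
  )

accuracy_tbl %>%
  table_modeltime_accuracy(.interactive = FALSE)
```

A standard deviation that is large relative to the mean suggests that a model's accuracy varies considerably from one resample to the next.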
Resampling gives us a way to compare multiple models across time. In this example, we can see that the ARIMA model performs better than the Prophet and GLMNET models with a lower RMSE. This won't always be the case (every time series is different).
This is a quick overview of Getting Started with Modeltime Resample. To learn how to tune, ensemble, and work with multiple groups of Time Series, take my High-Performance Time Series Course.