knitr::opts_chunk$set( message = FALSE, warning = FALSE, fig.width = 8, fig.height = 4.5, fig.align = 'center', out.width='95%', dpi = 100 )
Modeltime Resample provides a convenient toolkit for efficiently evaluating multiple models across time, increasing our confidence in model selections.
- modeltime_fit_resamples() automates the iterative model fitting and prediction procedure.
- plot_modeltime_resamples() provides a quick way to review model resample accuracy visually.
- modeltime_resample_accuracy() provides a flexible way to create custom accuracy tables using customizable summary functions (e.g. mean, median, sd, min, max).
Resampling gives us a way to compare multiple models across time.
In this tutorial, we'll get you up to speed by evaluating multiple models using resampling of a single time series.
Load the following R packages.
library(tidymodels)
library(modeltime)
library(modeltime.resample)
library(tidyverse)
library(timetk)
We'll work with the m750 data set.
m750 %>% plot_time_series(date, value, .interactive = FALSE)
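We'll use timetk::time_series_cv() to generate 4 time-series resamples.
- assess = "2 years": the assessment (testing) window
- initial = "5 years": the training window
- skip = "2 years": the shift between resample sets
- slice_limit = 4: the number of resamples to generate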
resamples_tscv <- time_series_cv(
    data        = m750,
    assess      = "2 years",
    initial     = "5 years",
    skip        = "2 years",
    slice_limit = 4
)

resamples_tscv
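To make the four arguments concrete, here is a minimal base R sketch of how backward-rolling windows can be laid out by row index. It is only an illustration of the idea, not timetk's implementation: the helper `rolling_slices()` is hypothetical, the series length `n = 306` is an assumed value, and the periods are converted by hand to row counts for monthly data (5 years = 60 rows, 2 years = 24 rows).

```r
# Sketch only: lay out backward-rolling train/test index windows.
# rolling_slices() is a hypothetical helper, not a timetk function.
rolling_slices <- function(n, initial, assess, skip, slice_limit) {
  slices <- list()
  test_end <- n  # start from the most recent observation
  while (length(slices) < slice_limit) {
    test_start  <- test_end - assess + 1
    train_end   <- test_start - 1
    train_start <- train_end - initial + 1
    if (train_start < 1) break  # not enough history for another slice
    slices[[length(slices) + 1]] <- data.frame(
      slice       = length(slices) + 1,
      train_start = train_start,
      train_end   = train_end,
      test_start  = test_start,
      test_end    = test_end
    )
    test_end <- test_end - skip  # shift the whole window back in time
  }
  do.call(rbind, slices)
}

# Monthly data: initial = 60 rows, assess = 24 rows, skip = 24 rows.
# n = 306 is an assumed series length used for illustration.
plan <- rolling_slices(n = 306, initial = 60, assess = 24, skip = 24, slice_limit = 4)
print(plan)
```

Each row of `plan` describes one resample: a fixed-length training window followed immediately by its assessment window, with every later slice shifted further back by the skip amount.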
Next, visualize the resample strategy to make sure we're happy with our choices.
# Begin with a Cross Validation Strategy
resamples_tscv %>%
    tk_time_series_cv_plan() %>%
    plot_time_series_cv_plan(date, value, .facet_ncol = 2, .interactive = FALSE)
Create models and add them to a Modeltime Table with Modeltime. I've already created 3 models (ARIMA, Prophet, and GLMNET) and saved the results in the modeltime package as m750_models.
m750_models
Generate resample predictions using modeltime_fit_resamples():
- Use m750_models (the models) and resamples_tscv (the resamples created above).
- A new column, .resample_results, contains the resample predictions.

resamples_fitted <- m750_models %>%
    modeltime_fit_resamples(
        resamples = resamples_tscv,
        control   = control_resamples(verbose = FALSE)
    )

resamples_fitted
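As a rough picture of the iterative fit-and-predict procedure that this step automates, the base R sketch below refits two stand-in models on every training window and predicts the matching assessment window. Everything in it is an assumption made for illustration: the series is synthetic, the two "models" (a linear trend and a last-value forecast) are placeholders for ARIMA, Prophet, and GLMNET, and the slice layout is hand-built rather than produced by time_series_cv().

```r
# Synthetic monthly series: linear trend plus a fixed seasonal pattern.
n <- 120
seasonal <- c(5, 3, 0, -2, -4, -5, -4, -2, 0, 2, 4, 5)
dat <- data.frame(t = seq_len(n))
dat$y <- 100 + 0.5 * dat$t + rep(seasonal, length.out = n)

# Stand-in models: each takes (train, test) and returns predictions for test.
models <- list(
  linear_trend = function(train, test) predict(lm(y ~ t, data = train), newdata = test),
  last_value   = function(train, test) rep(tail(train$y, 1), nrow(test))
)

# Three backward-rolling slices: 48 training rows, 12 assessment rows, shift of 12.
slices <- lapply(0:2, function(k) {
  test_end <- n - 12 * k
  list(train = (test_end - 59):(test_end - 12), test = (test_end - 11):test_end)
})

# Refit every model on every slice, then predict that slice's assessment rows.
results <- do.call(rbind, lapply(names(models), function(m) {
  do.call(rbind, lapply(seq_along(slices), function(i) {
    train <- dat[slices[[i]]$train, ]
    test  <- dat[slices[[i]]$test, ]
    data.frame(model = m, slice = i, t = test$t, actual = test$y,
               predicted = as.numeric(models[[m]](train, test)))
  }))
}))

# Per-model, per-slice RMSE computed from the stacked predictions.
results$sq_err <- (results$actual - results$predicted)^2
rmse <- aggregate(sq_err ~ model + slice, data = results,
                  FUN = function(x) sqrt(mean(x)))
names(rmse)[3] <- "rmse"
print(rmse)
```

The stacked `results` table plays a role loosely similar to the resample predictions stored in .resample_results: one row per model, slice, and assessment observation.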
Visualize the model resample accuracy using plot_modeltime_resamples().
resamples_fitted %>% plot_modeltime_resamples( .point_size = 3, .point_alpha = 0.8, .interactive = FALSE )
We can compare the overall modeling approaches by evaluating the results with modeltime_resample_accuracy(). The default is to report the average (summary_fns = mean), but this can be changed to any summary function or a list containing multiple summary functions (e.g. summary_fns = list(mean = mean, sd = sd)). From the table below, ARIMA has a 6% lower RMSE, indicating it's the best choice for consistent performance on this dataset.
resamples_fitted %>% modeltime_resample_accuracy(summary_fns = mean) %>% table_modeltime_accuracy(.interactive = FALSE)
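To show what a named list of summary functions does conceptually, here is a small base R sketch that applies each function to every model's per-slice metric. The numbers are made up purely for illustration (they are not results from m750), and the model names `model_a` and `model_b` are hypothetical.

```r
# Per-slice RMSE for two hypothetical models.
# Made-up values for illustration only, not results from the m750 data.
slice_rmse <- data.frame(
  model = rep(c("model_a", "model_b"), each = 4),
  slice = rep(1:4, times = 2),
  rmse  = c(10, 12, 11, 13,   9, 10, 20, 9)
)

# A named list of summary functions, the same shape as summary_fns.
summary_fns <- list(mean = mean, sd = sd)

# Apply every summary function to each model's per-slice values.
summary_tbl <- do.call(rbind, lapply(split(slice_rmse, slice_rmse$model), function(d) {
  stats <- lapply(summary_fns, function(f) f(d$rmse))
  data.frame(model = d$model[1], as.data.frame(stats))
}))
rownames(summary_tbl) <- NULL
print(summary_tbl)
```

In this toy table the two models have similar average error, but `model_b` has a much larger standard deviation because of one bad slice, which is the kind of instability a mean alone would hide. With the fitted object from this tutorial, the corresponding call would pass summary_fns = list(mean = mean, sd = sd) to modeltime_resample_accuracy().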
Resampling gives us a way to compare multiple models across time. In this example, we can see that the ARIMA model performs better than the Prophet and GLMNET models with a lower RMSE. This won't always be the case (every time series is different).
This is a quick overview of Getting Started with Modeltime Resample. To learn how to tune, ensemble, and work with multiple groups of Time Series, take my High-Performance Time Series Course.