knitr::opts_chunk$set(
  collapse  = TRUE,
  comment   = "#>",
  out.width = '100%',
  fig.align = "center",
  fig.width = 7,
  fig.height = 5,
  message   = FALSE,
  warning   = FALSE
)
knitr::include_graphics("modeltime_ecosystem.jpg")
In this tutorial you will learn how to use the Bayesmodels package and how to integrate it with the usual Modeltime workflow. The main purposes are:
Fit a Bayesian ARIMA model with the Bayesmodels package.
Compare it with the classic ARIMA implementation from the Modeltime package through the usual workflow of the ecosystem.
Bayesmodels unlocks the following models in one package. Its greatest advantage is the ability to integrate these models with the Modeltime and Tidymodels ecosystems.
- Arima: bayesmodels connects to the bayesforecast package.
- Garch: bayesmodels connects to the bayesforecast package.
- Random Walk (Naive): bayesmodels connects to the bayesforecast package.
- State Space Model: bayesmodels connects to the bayesforecast and bsts packages.
- Stochastic Volatility Model: bayesmodels connects to the bayesforecast package.
- Generalized Additive Models (GAMS): bayesmodels connects to the brms package.
- Adaptive Splines Surface: bayesmodels connects to the BASS package.
- Exponential Smoothing: bayesmodels connects to the Rlgt package.
Here's the general process and where the functions fit.
knitr::include_graphics("modeltime_workflow.jpg")
Just follow the modeltime workflow, which is detailed in 6 convenient steps.

Let's go through a guided tour to kick the tires on modeltime.
Load libraries to complete this short tutorial.
library(tidymodels)
library(bayesmodels)
library(modeltime)
library(tidyverse)
library(timetk)
library(lubridate)

# This toggles plots from plotly (interactive) to ggplot (static)
interactive <- FALSE
# Data
m750 <- m4_monthly %>% filter(id == "M750")
We can visualize the dataset.
m750 %>% plot_time_series(date, value, .interactive = interactive)
Let's split the data into training and test sets using initial_time_split().
# Split Data 90/10
splits <- initial_time_split(m750, prop = 0.9)
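Under the hood, initial_time_split() takes the first portion of the rows as training data and the remainder as testing data, without shuffling, so the most recent observations are held out. A base-R sketch of the same idea (the row count 306 is an assumption matching the length of the M750 series):

```r
# Base-R sketch of a time-ordered 90/10 split (no shuffling).
# n = 306 assumes the M750 series length; adjust for your data.
n       <- 306
n_train <- floor(0.9 * n)           # first 90% of rows -> training
train_idx <- seq_len(n_train)       # 1 .. 275
test_idx  <- seq(n_train + 1, n)    # 276 .. 306, the most recent rows

c(train = length(train_idx), test = length(test_idx))
```

Because the split is ordered, every test index comes after every training index, which is what makes the hold-out set a genuine out-of-sample forecast period.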
We can easily create dozens of forecasting models by combining bayesmodels, modeltime and parsnip. We can also use the workflows interface for adding preprocessing! Your forecasting possibilities are endless. Let's fit a few models:
Important note: Handling Date Features
Bayesmodels and Modeltime models (e.g. sarima_reg() and arima_reg()) are created with a date or date-time feature in the model. You will see that most models include a formula like fit(value ~ date, data).

Parsnip models (e.g. linear_reg()) typically should not have date features, but may contain derivatives of dates (e.g. month, year, etc.). You will often see formulas like fit(value ~ as.numeric(date) + month(date), data).
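The date derivatives mentioned above can be computed with base R alone. A small sketch (illustrative values, not tied to any particular model):

```r
# Deriving numeric features from a date, as used in formulas like
# fit(value ~ as.numeric(date) + month(date), data)
d <- as.Date("2015-06-01")

as.numeric(d)                  # days since 1970-01-01: a numeric trend term
as.integer(format(d, "%m"))    # calendar month (6): a seasonal term
```

A plain `date` column carries both pieces of information at once, which is why the Modeltime-style formulas can pass it directly while parsnip-style formulas expand it into derived features.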
First, we create a basic univariate ARIMA model using arima_reg() with the "arima" engine.
# Model 1: arima ----
model_fit_arima <- arima_reg(
        non_seasonal_ar          = 0,
        non_seasonal_differences = 1,
        non_seasonal_ma          = 1,
        seasonal_period          = 12,
        seasonal_ar              = 0,
        seasonal_differences     = 1,
        seasonal_ma              = 1
    ) %>%
    set_engine(engine = "arima") %>%
    fit(value ~ date, data = training(splits))
Now, we create the same model but from a Bayesian perspective with the bayesmodels package:
# Model 2: Bayesian arima ----
model_fit_arima_bayes <- sarima_reg(
        non_seasonal_ar          = 0,
        non_seasonal_differences = 1,
        non_seasonal_ma          = 1,
        seasonal_period          = 12,
        seasonal_ar              = 0,
        seasonal_differences     = 1,
        seasonal_ma              = 1,
        pred_seed                = 100
    ) %>%
    set_engine(engine = "stan") %>%
    fit(value ~ date, data = training(splits))
plot(model_fit_arima_bayes$fit$models$model_1)
Finally, we fit a Bayesian seasonal random walk (naive) model:

# Model 3: random walk (naive) ----
model_fit_naive <- random_walk_reg(
        seasonal_random_walk = TRUE,
        seasonal_period      = 12
    ) %>%
    set_engine("stan") %>%
    fit(value ~ date + month(date), data = training(splits))
plot(model_fit_naive$fit$models$model_1)
The next step is to add each of the models to a Modeltime Table using modeltime_table(). This step does some basic checking to make sure each of the models is fitted, and organizes them into a scalable structure called a "Modeltime Table" that is used as part of our forecasting workflow.
We have 3 models to add.
models_tbl <- modeltime_table(
    model_fit_arima,
    model_fit_arima_bayes,
    model_fit_naive
)

models_tbl
Calibrating adds a new column, .calibration_data, with the test predictions and residuals inside. Calibration is how confidence intervals and accuracy metrics are determined.
calibration_tbl <- models_tbl %>%
    modeltime_calibrate(new_data = testing(splits))

calibration_tbl
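Conceptually, calibration pairs each model's test-set predictions with the observed values so that residuals can be computed. A minimal base-R sketch with toy numbers (the values are illustrative only, not taken from m750):

```r
# Toy illustration of what calibration stores per model: the prediction,
# the actual value, and the residual (actual - prediction) on the hold-out set.
actual     <- c(100, 110, 120)
prediction <- c( 98, 115, 118)
residuals  <- actual - prediction

data.frame(actual, prediction, residuals)
```

These residuals are the raw material for both the accuracy metrics and the forecast confidence intervals in the steps that follow.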
There are 2 critical parts to an evaluation: visualizing the forecast versus the test data set, and evaluating the out-of-sample accuracy.

Visualizing the test forecast is easy to do using the interactive plotly visualization (just toggle the visibility of the models using the legend).
calibration_tbl %>%
    modeltime_forecast(
        new_data    = testing(splits),
        actual_data = m750
    ) %>%
    plot_modeltime_forecast(
        .legend_max_width = 25, # For mobile screens
        .interactive      = interactive
    )
We can use modeltime_accuracy() to collect common accuracy metrics. The default reports the following metrics using yardstick functions:
- mae() - Mean absolute error
- mape() - Mean absolute percentage error
- mase() - Mean absolute scaled error
- smape() - Symmetric mean absolute percentage error
- rmse() - Root mean squared error
- rsq() - R-squared
These of course can be customized following the rules for creating new yardstick metrics, but the defaults are very useful. Refer to default_forecast_accuracy_metrics() to learn more.
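The defaults above are all simple functions of the residuals. A base-R sketch of three of them on toy numbers (illustrative values, not results from m750):

```r
# Hand-computing three of the default metrics on toy data.
actual <- c(100, 110, 120)
pred   <- c( 98, 115, 118)
err    <- actual - pred

mae  <- mean(abs(err))                 # mean absolute error
rmse <- sqrt(mean(err^2))              # root mean squared error
mape <- mean(abs(err / actual)) * 100  # mean absolute percentage error

round(c(mae = mae, rmse = rmse, mape = mape), 2)
```

Note that RMSE penalizes large errors more heavily than MAE (here the single error of 5 pushes RMSE above MAE), which is why the two metrics can rank models differently.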
To make table-creation a bit easier, I've included table_modeltime_accuracy() for outputting results in either interactive (reactable) or static (gt) tables.
calibration_tbl %>%
    modeltime_accuracy() %>%
    table_modeltime_accuracy(
        .interactive = interactive
    )
The final step is to refit the models to the full dataset using modeltime_refit() and forecast them forward.
refit_tbl <- calibration_tbl %>%
    modeltime_refit(data = m750)

refit_tbl %>%
    modeltime_forecast(h = "3 years", actual_data = m750) %>%
    plot_modeltime_forecast(
        .legend_max_width = 25, # For mobile screens
        .interactive      = interactive
    )
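On monthly data, an h = "3 years" horizon corresponds to 36 future monthly timestamps. A base-R sketch of how such a future date index can be built (the last date is an assumption matching the apparent end of the M750 series):

```r
# Build 36 future month-start dates after an assumed last observation.
last_date <- as.Date("2015-06-01")   # assumed end of the series
future    <- seq(last_date, by = "month", length.out = 37)[-1]

length(future)   # number of forecast periods in "3 years" of monthly data
range(future)    # first and last forecast dates
```

Refitting on the full dataset before forecasting matters: the models get to learn from the most recent observations (the former test set) before being projected into this future window.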