The goal of EasyFORcasting, or efor for short, is to make it easier to forecast multiple time series. The package supports you in creating forecasts with different methods implemented in the forecast, forecastHybrid, smooth and prophet packages. Furthermore, it provides some functions for evaluating these forecasts.
The following shows a possible workflow.
You can install the released version of efor from GitHub with:
devtools::install_github("flostracke/efor")
First we load some packages for this example. The efor package contains some fictional time series data for demonstrating the approaches:
library(dplyr)
library(tsibble)   # for a nicer representation of monthly data
library(efor)
library(furrr)     # for running the forecasting in parallel
library(forecast)  # provides the forecasting methods
library(ggplot2)

sales_data <- sales_monthly %>%
  mutate(date = yearmonth(date))

sales_data
We have sales data for four articles and want to create forecasts for all of them. The efor package makes this quite easy, because it provides functionality to create forecasts for multiple articles with just one function call.
The idea is that all the data has to be organised in one dataframe with the following columns: date (the time index), iterate (an identifier for the article or series) and y (the value to forecast).
The dataframe sales_data already meets these requirements. You can verify the structure with the following function call; otherwise it would raise an error:
check_input_data(sales_data)
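To make the expected structure concrete, here is a minimal sketch of a hand-built input dataframe. The column names (date, iterate, y) are inferred from the sales_monthly example above; check_input_data() may enforce further requirements.

# Minimal sketch of the expected input structure; column names are
# inferred from the sales_monthly example used above.
toy_data <- tibble::tibble(
  date    = yearmonth(c("2015 Jan", "2015 Feb", "2015 Jan", "2015 Feb")),
  iterate = c("article_a", "article_a", "article_b", "article_b"),
  y       = c(10, 12, 20, 18)
)

check_input_data(toy_data)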
Before we start forecasting, let's quickly plot the four articles we want to forecast:
ggplot(sales_data, aes(x = date, y = y)) +
  geom_line() +
  geom_point() +
  facet_wrap(~iterate) +
  ggtitle("The original series") +
  theme_minimal()
We split the dataset into a train and a test set. All observations from the year 2016 go into the test set. We want to create forecasts for the next six months of the test set and evaluate the performance of different methods.
train_data <- sales_data %>%
  filter(date < "2016-01-01")

test_data <- sales_data %>%
  filter(date >= "2016-01-01")
Now we can apply the auto.arima function to the dataset and create the forecasts; afterwards we do the same with ets. All the methods from the forecast package can be run in parallel.
forecasts_ar <- tf_grouped_forecasts(
  train_data,         # training dataset
  n_pred = 6,         # number of predictions
  func = auto.arima,  # forecasting method
  parallel = TRUE     # run in parallel
)

forecasts_ar
forecasts_ets <- tf_grouped_forecasts(
  train_data,        # training dataset
  n_pred = 6,        # number of predictions
  func = ets,        # forecasting method
  parallel = FALSE   # run sequentially this time
)

forecasts_ets
In order to create some plots and evaluate the performance we combine the forecasts into one dataset.
forecasts <- bind_rows(forecasts_ar, forecasts_ets) %>%
  mutate(date = yearmonth(date))  # reformat the date because of a bug in bind_rows
The package also provides a function that makes it easy to assess the performance (currently the MAE, RMSE and R squared are calculated) of all the forecasting methods in the passed prediction dataframe:
tf_calc_metrics(forecasts, test_data)
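As a plausibility check you can also compute the MAE per method by hand. The sketch below joins the forecasts to the actual values by date and article; it assumes the forecast dataframe stores the method name in its key column, as used in the plotting code further down.

# Rough manual cross-check of the MAE per method (sketch only).
forecasts %>%
  inner_join(test_data, by = c("date", "iterate"), suffix = c("_fcst", "_actual")) %>%
  group_by(key) %>%
  summarise(mae = mean(abs(y_fcst - y_actual)), .groups = "drop")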
It is also possible to assess the performance for each article individually:
tf_calc_metrics(forecasts, test_data, detailed = TRUE)
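If you want to pick the best method per article, you could filter the detailed output along the following lines. This is only a sketch: it assumes the detailed result is a tidy dataframe with columns named iterate, key, metric and value, so adjust the names to the actual output of tf_calc_metrics().

# Sketch: keep, for every article, the method with the lowest MAE.
# The column names iterate, key, metric and value are assumptions.
tf_calc_metrics(forecasts, test_data, detailed = TRUE) %>%
  filter(metric == "mae") %>%
  group_by(iterate) %>%
  slice_min(value, n = 1) %>%
  ungroup()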
Finally we create a quick graph visualising the results of the forecasts.
train_data_plot <- train_data %>%
  mutate(key = "train")

test_data_plot <- test_data %>%
  mutate(key = "test")

bind_rows(train_data_plot, test_data_plot) %>%
  bind_rows(forecasts) %>%
  filter(key %in% c("auto.arima", "train", "test")) %>%
  mutate(date = yearmonth(date)) %>%
  ggplot(aes(x = date, y = y, color = key)) +
  geom_point() +
  geom_line() +
  facet_wrap(~iterate) +
  ggtitle("Forecasted values for each article") +
  ylab("Sales amount") +
  theme_minimal()