knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
This is a walk through of how the package is intended to be used with a practical example.
The first thing that a forecast needs a data to be forecasted. The SynthCast provides a example of how it expected a dataset to look like, the code bellow loads the package and the example dataset:
library(knitr) library(SynthCast) data('df_example') kable(head(df_example))
The dataset is expected to have 3 types of columns:
The table bellow shows the max time for each unit:
library(dplyr) df_example %>% group_by(unit) %>% summarise(max_time_period=max(time_period)) %>% filter(unit %in% c(1, 2, 3, 4, 5, 45, 46, 47, 48, 49, 50)) %>% kable()
As one can see the older unit (the smaller the number the older the unit is) the longer is the time series that are available (larger values in the time_period
column). This means that the data from older units can be used to forecast the younger units. For example, the data from units r 30-12
to 1
could be used to predict the next 12
periods of the unit 30
. This is excatly what the function run_synthetic_forecast()
does (To better understand how it is working under the hood it is recommend to check the Synthetic Control Synth Package paper.).
The function call bellow runs a synthetic forecast of 12
time periods of the series x1
of the unit 30.
synthetic_forecast <- run_synthetic_forecast( df = df_example, col_unit_name = 'unit', col_time='time_period', periods_to_forecast=12, unit_of_interest = '30', serie_of_interest = 'x1' )
The output of the function is a list with 4 tables.
These are the 4 tables that are returned by the function call.
synthetic_control_composition
This table summarizes the results related to the unit selection from the Synthetic Control method. The columns are the following:
kable(synthetic_forecast$synthetic_control_composition)
execution_date
: The date that the forecast was executed in the YYYY-MM-DD format;projected_unit
: The forcasted unit; projected_serie
: The forecasted serie;synthetic_units
/w.weights
: the units (from r 30-12
to 1
) selected and their recpective weights.variable_importance_and_comparison
This table summarizes the results related to the features/variables selection from the Synthetic Control method. The columns are the following:
kable(head(synthetic_forecast$variable_importance_and_comparison,8))
execution_date
: The date that the forecast was executed in the YYYY-MM-DD format;projected_unit
: The forcasted unit; projected_serie
: The forecasted serie;variable
: The variable selected;unit_of_interest
: The mean value over time of the variable in column variable
from the unit in the projected_unit
;synthetic
: The mean value over time of the variable in column variable
of the syntehtic unit;sample
: The mean value over time of the variable in column variable
of the whole dataset;v.weights
: The weight of the variable in the column variable
.mape_backtest
This table depicts the results of a simple mape back test on the period it was used to forecast. It is worth noting that the intention is not to provide a robust method for validation the model. The Synthetic Control Method is a mathematical approach, not an machine learning, that minimizes the distance without worrying about overfitting the curves. The columns are the following:
kable(synthetic_forecast$mape_backtest)
execution_date
: The date that the forecast was executed in the YYYY-MM-DD format;projected_unit
: The forcasted unit; projected_serie
: The forecasted serie;max_time_unit_of_interest
: The age of the unit of interest;periods_to_forecast
: Periods that were forecasted;elegible_control_units
: Number of elegible units to be used to forecast;mape
: The mean absolute percentage error in the from 1 to max_time_unit_of_interest
.output_projecao
This tables contains the projection itself. The columns are the following:
kable(synthetic_forecast$output_projecao)
execution_date
: The date that the forecast was executed in the YYYY-MM-DD format;projected_unit
: The forcasted unit; time_period
: The time period;projected_serie
: The forecasted serie;projected_serie_value
: The value of the seria/variable that was projected, from colun projected_serie
;is_projected
: 1 indicates that the value is projected, 0 indicates that the value is observed.proj<- synthetic_forecast$output_projecao proj %>% glimpse()
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.