library(deepvars) # Dependencies library(data.table) library(ggplot2) library(expm) knitr::opts_chunk$set( collapse = TRUE, comment = "#>", echo=FALSE, warning = FALSE, message = FALSE )
The reduced-form VAR($p$) with $K$ variables and $p$ lags (with $m \in [1,p]$) can be written as
$$ \begin{aligned} && y_t&=A_1 y_{t-1} + A_2 y_{t-2} + ... + A_p y_{t-p} + u_t \ \end{aligned} $$
where $y_{t-m}$ are $(K \times 1)$ vectors , $A_m$ are $(K \times K)$ matrices of coefficients and $u_t$ is a $(K \times 1)$ vector of residuals.
The reduced-from VAR can be estimated by least-squares.
We can rewrite the reduced-form model as
$$ \begin{aligned} && Y&= Z A + U \ \end{aligned} $$
such that the OLS estimator is simply
$$ \begin{aligned} && \hat{A}&= {(Z'Z)}^{-1}Z'Y \ \end{aligned} $$
where $Y$, $Z$ and $\hat{A}$ are of dimensions $(T \times K)$, $(T \times (Kp+1))$ and $((Kp+1) \times K)$, respectively.
Equivalently, we can estimate the VAR through Maximum-Likelihood-Estimation (MLE). Under the assumption that $y_t$ are jointly normal (since by assumption $u_t \sim N(0, \Sigma_u)$), we have for the distribution function:
$$ \begin{aligned} && f(y_t|y_{t-1},...,y_{t-p+1})&=\left(\frac{1}{2 \pi} \right)^{\frac{k}{2}} det \left( \Sigma_u\right)^{-\frac{1}{2}} exp \left(-\frac{1}{2} u_t'{\Sigma_u}^{-1}u_t \right)\ \end{aligned} $$
The VAR
function can be used to estimate the VAR by least-squares or MLE. All estimations are run on an example data set.
dt = data.table::data.table(deepvars::canada) var_cols = colnames(dt)[2:ncol(dt)] print(head(dt))
The underlying time series in levels look non-stationary
chart_data = dt chart_data = data.table::melt(chart_data, id.vars = "date") ggplot2::ggplot(chart_data) + ggplot2::geom_line(ggplot2::aes(y=value, x=date)) + ggplot2::facet_wrap(variable ~ ., scales = "free_y") + ggplot2::theme_bw() + ggplot2::labs( x="Date", y="Value" )
lags = 3
Below the model is estimated in levels using r lags
lags firstly through least-squares and secondly through MLE.
var_ls = vareg(dt, lags = lags, method = "ols") var_mle = vareg(dt, lags = lags, method = "mle")
Above all series entered the estimation in levels. This may cause the VAR to be non-stable if one or more series are non-stationary. We can test for the VAR's stability by checking if the eigenvalues of its companion-form matrix lie within the unit circle. If furthermore $u_t$ has time-invariant variance and its first two moments are bounded, then the VAR is stationary.
The VAR_stable
function can be used to do this. Applying this function to the previously estimated VAR reveals that it is in fact not stable, so at least one of the underlying variables is non-stationary.
invisible(VAR_stable(var_ls, verbose = F)$test_result)
Standard univariate tests (e.g. ADF) can be used to test individual series for non-stationarity. For simplicity, let us take first differences of all series, which upon visual inspection look stationary.
dt[,(var_cols) := lapply(.SD, function(i) c(0,diff(i))), .SDcols=var_cols] chart_data = dt chart_data = data.table::melt(chart_data, id.vars = "date") ggplot2::ggplot(chart_data) + ggplot2::geom_line(ggplot2::aes(y=value, x=date)) + ggplot2::facet_wrap(variable ~ ., scales = "free_y") + ggplot2::theme_bw() + ggplot2::labs( x="Date", y="Value" )
Re-running the stability test with the transformed data shows that the var is now stable when using r lags
lags:
invisible(VAR_stable(vareg(dt, lag=3))$test_result)
Finally, let's turn to selecting the right lag length for our estimation. The most common way to select the lag-length is through information criteria, which can be down using the VAR_lag_select
function.By default the maximum considered lag length is set to 10 and the Akaike Information Criterium is used. Applying the function to the data in differences yields
lag_selection = lag_order(dt[,.SD,.SDcols=var_cols]) knitr::kable(lag_selection$criteria_overview) lag = lag_selection$p
suggesting that we should us $p=r lag
$ as the AIC is minimized for that choice. The table below shows the resulting coefficients when re-running the VAR estimation through least-squares with differenced data and the proposed, optimal lag-length.
var_model <- vareg(dt[,.SD,.SDcols=var_cols], lags = lags) print(var_model)
We can also ...
fitted_values <- predict(var_model) plot(fitted_values, y_true=var_model$y_train)
n = 10 fig_cap = sprintf("%i-step ahead predictions from estimated VAR.",n)
Once the VAR has been fitted, it can be used for prediction.
... Maths to follow
In R, we can predict from a model that has been estimated through VAR
using the VAR_predict
function. The figure below shows the r n
-step ahead predictions for our model.
fcst <- forecast(var_model, n) plot(fcst) # my_pred = VAR_predict(var_model, n.ahead, plot = T, theme = theme_bw())
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.