VARMA (commodity prices)
In ldt: Automated Uncertainty Analysis

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(ldt)

Introduction

The search.varma() function is one of the three main functions in the ldt package. This vignette demonstrates its basic usage with the commodity prices dataset (@datasetPcp). Commodity prices are the prices at which raw materials and primary foodstuffs are bought and sold. The dataset contains monthly prices for 68 primary commodities; some series start in January 1990 and others start later.

Data

For this example, we use only the first five columns of the data:

data <- data.pcp$data[,1:5]

Here are the last few observations from this subset of the data:

tail(data)

And here are some summary statistics for each variable:

sapply(data, summary)

The columns of the data represent the following variables:

for (c in colnames(data)){
  cat(paste0("- ", c, ": ", data.pcp$descriptions[c]), "\n\n")
}

Modelling

We use the first variable (i.e., the column named by colnames(data)[[1]]) as the target variable and the MAPE (mean absolute percentage error) metric to find the best predicting model. Out-of-sample evaluation limits how complex the models can be, because it requires reestimating each model by maximum likelihood several times. Although the simUsePreviousEstim argument helps initialize the maximum likelihood estimation, estimating a VARMA model remains time-consuming because of its large number of parameters. We therefore restrict the model set in two ways: we cap the number of equations allowed in a model (the sizes argument of get.combinations()), and we cap the parameters of the VARMA model (the maxParams argument).
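To make the cost of out-of-sample evaluation concrete, here is a minimal sketch in base R. It does not use ldt: a univariate AR(2) fitted with stats::arima() stands in for a VARMA model, the series is simulated, and mape() is a hypothetical helper written for this illustration. The values 5 and 6 are borrowed from the search.varma() call only for illustration; how ldt uses maxHorizon and simFixSize internally is not shown here.

```r
# Illustrative only: a univariate AR(2) from base R stands in for a VARMA model.
set.seed(1)
x <- as.numeric(100 + arima.sim(model = list(ar = c(0.5, 0.2)), n = 120))

# Hypothetical helper (not an ldt function): MAPE in percent.
mape <- function(actual, forecast) {
  100 * mean(abs((actual - forecast) / actual))
}

horizon <- 5          # forecast up to 5 steps ahead
n_sim   <- 6          # number of out-of-sample evaluations
n       <- length(x)

errors <- sapply(seq_len(n_sim), function(i) {
  end_train <- n - horizon - n_sim + i   # last in-sample observation
  # The model is reestimated by maximum likelihood at every evaluation.
  fit <- arima(x[1:end_train], order = c(2, 0, 0), method = "ML")
  fc  <- as.numeric(predict(fit, n.ahead = horizon)$pred)
  mape(x[(end_train + 1):(end_train + horizon)], fc)
})

mean(errors)  # average out-of-sample MAPE across the evaluations
```

Each evaluation triggers a full reestimation, so the total cost grows with both the number of evaluations and the number of candidate models.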

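search_res <- search.varma(data = get.data(data, endogenous = 5),  # all 5 columns are endogenous
                           # models with 1, 2 or 3 equations; one target variable
                           combinations = get.combinations(sizes = c(1,2,3),
                                                           numTargets = 1),
                           # upper bounds on the VARMA parameters
                           maxParams = c(2,0,0),
                           # evaluate models by out-of-sample MAPE only
                           metrics = get.search.metrics(typesIn = c(), 
                                                        typesOut = c("mape"),
                                                        simFixSize = 6),
                           maxHorizon = 5)
print(search_res)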
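The restriction on model size keeps the set of candidate variable subsets small. The sketch below counts them in base R under a stated assumption: with one target among five endogenous variables, every candidate subset is assumed to contain the target. The names V1 to V5 are placeholders, not the actual column names, and the enumeration is not how ldt builds its model set internally.

```r
# Placeholder names for the five endogenous columns (not the real column names).
vars   <- paste0("V", 1:5)
target <- vars[1]
others <- setdiff(vars, target)
sizes  <- c(1, 2, 3)

# Assumption: every candidate subset must include the target variable.
subsets <- list()
for (k in sizes) {
  if (k == 1) {
    subsets <- c(subsets, list(target))
  } else {
    extra   <- combn(others, k - 1, simplify = FALSE)
    subsets <- c(subsets, lapply(extra, function(s) c(target, s)))
  }
}

length(subsets)  # 1 + 4 + 6 = 11 subsets
```

Each subset is then combined with the parameter values permitted by maxParams, so loosening either restriction multiplies the number of models to estimate.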

The output of the search.varma() function does not contain any estimation results, but only the information required to replicate them. The summary() function returns a similar structure but with the estimation results included.

search_sum <- summary(search_res)

We can plot the predicted values along with the out-of-sample evaluations:

best_model <- search_sum$results[[1]]$value
pred <- predict(best_model, 
                actualCount = 10, 
                startFrequency = tdata::f.monthly(data.pcp$start,1))
plot(pred, simMetric = "mape")

Conclusion

The ldt package is well suited to empirical studies that aim to reduce modelling assumptions and summarize the results of an uncertainty analysis. This vignette is only a demonstration; the search.varma() function offers other options to explore. For instance, you can use different evaluation metrics or restrict the model set according to your specific needs. You can also combine modelling with Principal Component Analysis (PCA) (see the estim.varma() function). Experimenting with these options is a good way to see how they fit your own analysis.