``` {r setup}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = TRUE  # set to FALSE to render only the text
)
set.seed(42)
```
This vignette shows how to perform probabilistic reconciliation with the `bayesRecon` package. We provide three examples:

* *Temporal hierarchy for a count time series*: we build a temporal hierarchy over a count time series, produce the base forecasts using `glarma`, and reconcile them via Bottom-Up Importance Sampling (BUIS).
* *Temporal hierarchy for a smooth time series*: we build a temporal hierarchy over a smooth time series, compute the base forecasts using `ets`, and reconcile them in closed form via Gaussian reconciliation; the covariance matrix is diagonal.
* *Hierarchy of smooth time series*: an example of a cross-sectional hierarchy. We generate the base forecasts using `ets` and reconcile them via Gaussian reconciliation; the covariance matrix is full and estimated via shrinkage.
The package, available on CRAN, can be installed with the usual command:

```r
install.packages('bayesRecon', dependencies = TRUE)
```

Load the package:

```r
library(bayesRecon)
```
We select a monthly count time series from the *carparts* dataset, available in the expsmooth package [@expsmooth_pkg]. The dataset contains monthly sales of car parts from Jan. 1998 to Mar. 2002. For this example we select time series #2655, which we make available as `bayesRecon::carparts_example`. This time series has a skewed distribution of values.
```r
layout(mat = matrix(c(1, 2), nrow = 1, ncol = 2), widths = c(2, 1))
plot(carparts_example, xlab = "Time", ylab = "Car part sales", main = NULL)
hist(carparts_example, xlab = "Car part sales", main = NULL)
```
We divide the time series into train and test; the test set contains the last 12 months.
```r
train <- window(carparts_example, end = c(2001, 3))
test <- window(carparts_example, start = c(2001, 4))
```
We build the temporal hierarchy using the `temporal_aggregation` function, specifying the aggregation levels via the `agg_levels` argument; in this case they are 2-Monthly, Quarterly, 4-Monthly, Biannual, and Annual.
``` {r temp-agg}
train.agg <- bayesRecon::temporal_aggregation(train, agg_levels = c(2, 3, 4, 6, 12))
levels <- c("Annual", "Biannual", "4-Monthly", "Quarterly", "2-Monthly", "Monthly")
names(train.agg) <- levels
```
The function returns a list of aggregated time series, ordered from the most aggregated (top of the hierarchy) to the most disaggregated (bottom of the hierarchy). We plot them below.

``` {r temp-agg-plot, dpi=300, fig.show="hold", out.width="100%", out.height="100%", fig.align='center', fig.cap="**Figure 2**: The aggregated time series of the temporal hierarchy.", fig.dim=c(6,3.5)}
par(mfrow = c(2, 3), mai = c(0.6, 0.6, 0.5, 0.5))
for (l in levels) {
  plot(train.agg[[l]], xlab = "Time", ylab = "Car part sales", main = l)
}
```
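As a quick sanity check (an illustrative snippet, not part of the original code), each element of the list is a `ts` object whose frequency gives the number of periods per year:

```r
# frequencies of the aggregated series (periods per year)
sapply(train.agg, frequency)  # Annual: 1, Biannual: 2, ..., Monthly: 12
```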
We compute the base forecasts using the glarma package, which is specifically designed for forecasting count time series. Iterating over the levels of the hierarchy, we forecast 12 steps ahead at the monthly level, 4 steps ahead at the quarterly level, and so on. At each level, we fit a `glarma` model with Poisson predictive distribution and generate forecasts; each forecast consists of 20000 integer samples. Finally, we collect the samples of the 28 predictive distributions (one at the Annual level, two at the Biannual level, etc.) in a list. The code below takes about 30 seconds on a standard computer.
``` {r hier-fore, cache=TRUE}
fc.samples <- list()
D <- 20000
fc.count <- 1

for (l in seq_along(train.agg)) {
  f.level <- frequency(train.agg[[l]])
  print(paste("Forecasting at ", levels[l], "...", sep = ""))
  # fit an independent model for each aggregation level
  model <- glarma::glarma(
    train.agg[[l]],
    phiLags = if (f.level == 1) 1 else 1:(min(6, f.level - 1)),
    thetaLags = if (f.level == 1) NULL else f.level,
    X = cbind(intercept = rep(1, length(train.agg[[l]]))),
    offset = cbind(intercept = rep(0, length(train.agg[[l]]))),
    type = "Poi"
  )
  # forecast 1 year ahead
  h <- f.level
  tmp <- matrix(data = NA, nrow = h, ncol = D)
  for (s in (1:D)) {
    # each call to 'forecast.glarma' returns a simulation path
    tmp[, s] <- glarma::forecast(
      model,
      n.ahead = h,
      newdata = cbind(intercept = rep(1, h)),
      newoffset = rep(0, h)
    )$Y
  }
  # collect the forecasted samples
  for (i in 1:h) {
    fc.samples[[fc.count]] <- tmp[i, ]
    fc.count <- fc.count + 1
  }
}
```
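As a check, the list should now contain the 28 predictive distributions mentioned above; the snippet below is illustrative:

```r
# 1 Annual + 2 Biannual + 3 4-Monthly + 4 Quarterly + 6 2-Monthly + 12 Monthly
length(fc.samples)  # 28
```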
Reconciliation requires the aggregation matrix $\mathbf{A}$, which we obtain using the function `get_reconc_matrices`. It requires:

* the aggregation factors of the hierarchy, which in this example are $\{2, 3, 4, 6, 12\}$;
* the length of the forecasting horizon at the bottom level, which is 12 in this example.

``` {r aggregationMatrix}
recon.matrices <- bayesRecon::get_reconc_matrices(agg_levels = c(2, 3, 4, 6, 12), h = 12)
# Aggregation matrix
A <- recon.matrices$A
```
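The number of rows of $\mathbf{A}$ equals the number of upper time series ($6 + 4 + 3 + 2 + 1 = 16$), and the number of columns equals the number of bottom (monthly) forecasts (12). A quick illustrative check:

```r
# one row per upper series, one column per bottom (monthly) step
dim(A)  # 16 12
```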
To reconcile via Bottom-Up Importance Sampling (BUIS), we use the function `reconc_BUIS`, passing to it the $\mathbf{A}$ matrix, the base forecasts, the type of the base forecasts (`in_type = "samples"`), and whether the samples are discrete or continuous (`distr = "discrete"`).
``` {r reconc}
recon.res <- bayesRecon::reconc_BUIS(
  A,
  base_forecasts = fc.samples,
  in_type = "samples",
  distr = "discrete",
  seed = 42
)
```
Here we obtain samples from the reconciled forecast distribution:

```r
reconciled_samples <- recon.res$reconciled_samples
dim(reconciled_samples)
```
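The reconciled samples are coherent by construction: every upper sample path equals the aggregation of the corresponding bottom sample paths. A minimal sketch of this check, assuming the rows of `reconciled_samples` follow the same upper-then-bottom ordering as the base forecast list:

```r
# coherence check (illustrative): upper samples = A %*% bottom samples
upper.samples <- recon.res$reconciled_samples[1:nrow(A), ]
bottom.samples <- recon.res$bottom_reconciled_samples
max(abs(upper.samples - A %*% bottom.samples))  # numerically zero
```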
We now compute the Mean Absolute Error (MAE) and the Continuous Ranked Probability Score (CRPS) for the bottom (i.e., monthly) time series. To compute the CRPS, we use the package `scoringRules`.
```r
# install.packages("scoringRules", dependencies = TRUE)
library(scoringRules)

ae.fc <- list()
ae.reconc <- list()
crps.fc <- list()
crps.reconc <- list()
for (h in 1:length(test)) {
  y.hat_ <- median(fc.samples[[nrow(A) + h]])
  y.reconc_ <- median(recon.res$bottom_reconciled_samples[h, ])
  # Compute Absolute Errors
  ae.fc[[h]] <- abs(test[h] - y.hat_)
  ae.reconc[[h]] <- abs(test[h] - y.reconc_)
  # Compute Continuous Ranked Probability Score (CRPS)
  crps.fc[[h]] <- scoringRules::crps_sample(y = test[h], dat = fc.samples[[nrow(A) + h]])
  crps.reconc[[h]] <- scoringRules::crps_sample(y = test[h], dat = recon.res$bottom_reconciled_samples[h, ])
}
mae.fc <- mean(unlist(ae.fc))
mae.reconc <- mean(unlist(ae.reconc))
crps.fc <- mean(unlist(crps.fc))
crps.reconc <- mean(unlist(crps.reconc))
metrics <- data.frame(
  row.names = c("MAE", "CRPS"),
  base.forecasts = c(mae.fc, crps.fc),
  reconciled.forecasts = c(mae.reconc, crps.reconc)
)
metrics
```
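An optional summary, not part of the original analysis: the relative improvement of the reconciled forecasts over the base forecasts for each metric.

```r
# percentage improvement of reconciled over base forecasts (illustrative)
round(100 * (metrics$base.forecasts - metrics$reconciled.forecasts) /
        metrics$base.forecasts, 1)
```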
In this second example, we select a smooth monthly time series (N1485) from the M3 forecasting competition [@makridakis2000m3]. The dataset is available in the Mcomp package [@mcomp_pkg]; we make this time series available as `bayesRecon::M3_example`.
```r
plot(M3_example$train, xlab = "Time", ylab = "y", main = "N1485")
```
We build the temporal hierarchy using the `temporal_aggregation` function. Without specifying `agg_levels`, the function generates by default all the feasible aggregations: 2-Monthly, Quarterly, 4-Monthly, Biannual, and Annual.
```r
train.agg <- bayesRecon::temporal_aggregation(M3_example$train)
levels <- c("Annual", "Biannual", "4-Monthly", "Quarterly", "2-Monthly", "Monthly")
names(train.agg) <- levels
```
We generate the base forecasts using `ets` from the forecast package [@pkg_forecast]. At every level we predict 18 months ahead. All the predictive distributions are Gaussian; we recover the standard deviation of each forecast from the 95% upper prediction bound, since $q_{0.975} = \mu + z_{0.975}\,\sigma$ implies $\sigma = (q_{0.975} - \mu) / z_{0.975}$.
```r
# install.packages("forecast", dependencies = TRUE)
library(forecast)

H <- length(M3_example$test)
H

fc <- list()
level.idx <- 1
fc.idx <- 1
for (level in train.agg) {
  level.name <- names(train.agg)[level.idx]
  # fit an ETS model for each temporal level
  model <- ets(level)
  # generate forecasts for each level within 18 months
  h <- floor(H / (12 / frequency(level)))
  print(paste("Forecasting at ", level.name, ", h=", h, "...", sep = ""))
  level.fc <- forecast(model, h = h)
  # save mean and sd of the Gaussian predictive distribution
  for (i in 1:h) {
    fc[[fc.idx]] <- list(
      mean = level.fc$mean[[i]],
      sd = (level.fc$upper[, "95%"][[i]] - level.fc$mean[[i]]) / qnorm(0.975)
    )
    fc.idx <- fc.idx + 1
  }
  level.idx <- level.idx + 1
}
```
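Given these horizons, we expect $1 + 3 + 4 + 6 + 9 + 18 = 41$ base forecasts in total; the check below is illustrative:

```r
# 1 Annual + 3 Biannual + 4 4-Monthly + 6 Quarterly + 9 2-Monthly + 18 Monthly
length(fc)  # 41
```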
Using the function `get_reconc_matrices`, we get the matrix $\mathbf{A}$.
```r
rmat <- get_reconc_matrices(agg_levels = c(2, 3, 4, 6, 12), h = 18)

par(mai = c(1, 1, 0.5, 0.5))
image(1:ncol(rmat$A), 1:nrow(rmat$A), t(apply(t(rmat$A), 1, rev)),
      xaxt = 'n', yaxt = 'n', ylab = "", xlab = levels[6])
axis(1, at = 1:ncol(rmat$A), label = 1:ncol(rmat$A), las = 1)
axis(2, at = c(23, 22, 19, 15, 9), label = levels[1:5], las = 2)
```
The function `reconc_gaussian` implements exact Gaussian reconciliation. We also run `reconc_BUIS` to check the consistency between the two approaches.
```r
recon.gauss <- bayesRecon::reconc_gaussian(
  A = rmat$A,
  base_forecasts.mu = sapply(fc, "[[", 1),
  base_forecasts.Sigma = diag(sapply(fc, "[[", 2)) ^ 2
)

reconc.buis <- bayesRecon::reconc_BUIS(
  A = rmat$A,
  base_forecasts = fc,
  in_type = "params",
  distr = "gaussian",
  num_samples = 20000,
  seed = 42
)

# check that the algorithms return consistent results
round(rbind(
  c(rmat$S %*% recon.gauss$bottom_reconciled_mean),
  rowMeans(reconc.buis$reconciled_samples)
))
```
We now compare base forecasts and reconciled forecasts:
```r
yhat.mu <- tail(sapply(fc, "[[", 1), 18)
yhat.sigma <- tail(sapply(fc, "[[", 2), 18)
yhat.hi95 <- qnorm(0.975, mean = yhat.mu, sd = yhat.sigma)
yhat.lo95 <- qnorm(0.025, mean = yhat.mu, sd = yhat.sigma)

yreconc.mu <- rowMeans(reconc.buis$bottom_reconciled_samples)
yreconc.hi95 <- apply(reconc.buis$bottom_reconciled_samples, 1,
                      function(x) quantile(x, 0.975))
yreconc.lo95 <- apply(reconc.buis$bottom_reconciled_samples, 1,
                      function(x) quantile(x, 0.025))

ylim_ <- c(min(M3_example$train, M3_example$test, yhat.lo95, yreconc.lo95) - 1,
           max(M3_example$train, M3_example$test, yhat.hi95, yreconc.hi95) + 1)

plot(M3_example$train, xlim = c(1990, 1995.6), ylim = ylim_,
     ylab = "y", main = "N1485 Forecasts")
lines(M3_example$test, lty = "dashed")
lines(ts(yhat.mu, start = start(M3_example$test), frequency = 12),
      col = "coral", lwd = 2)
lines(ts(yreconc.mu, start = start(M3_example$test), frequency = 12),
      col = "blue2", lwd = 2)
xtest <- time(M3_example$test)
polygon(c(xtest, rev(xtest)), c(yhat.mu, rev(yhat.hi95)),
        col = "#FF7F5066", border = "#FF7F5066")
polygon(c(xtest, rev(xtest)), c(yhat.mu, rev(yhat.lo95)),
        col = "#FF7F5066", border = "#FF7F5066")
polygon(c(xtest, rev(xtest)), c(yreconc.mu, rev(yreconc.hi95)),
        col = "#0000EE4D", border = "#0000EE4D")
polygon(c(xtest, rev(xtest)), c(yreconc.mu, rev(yreconc.lo95)),
        col = "#0000EE4D", border = "#0000EE4D")
```
In this example, we consider the hierarchical time series *infantgts*, available from the hts package [@hts_pkg]. We also make it available in our package as `bayesRecon::infantMortality`. It contains counts of infant mortality (deaths) in Australia, disaggregated by state and sex (male and female).
We forecast one year ahead using `auto.arima` from the forecast package. We collect the residuals, which we will later use to estimate the covariance matrix.
```r
# install.packages("forecast", dependencies = TRUE)
library(forecast)

fc <- list()
residuals <- matrix(NA,
                    nrow = length(infantMortality$total),
                    ncol = length(infantMortality))
fc.idx <- 1
for (s in infantMortality) {
  s.name <- names(infantMortality)[fc.idx]
  print(paste("Forecasting at ", s.name, "...", sep = ""))
  # fit an auto.arima model and forecast with h=1
  model <- auto.arima(s)
  s.fc <- forecast(model, h = 1)
  # save mean and sd of the Gaussian predictive distribution
  fc[[s.name]] <- c(s.fc$mean,
                    (s.fc$upper[, "95%"][[1]] - s.fc$mean) / qnorm(0.975))
  residuals[, fc.idx] <- s.fc$residuals
  fc.idx <- fc.idx + 1
}
```
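At this point we should have one Gaussian base forecast per series; the check below is illustrative:

```r
# 11 upper series (total, 8 states, 2 sexes) + 16 bottom series (8 states x 2 sexes)
length(fc)  # 27
```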
Now we build the $\mathbf{A}$ matrix.
```r
# we have 16 bottom time series and 11 upper time series
A <- matrix(data = c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
                     1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                     0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,
                     0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,
                     0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,
                     0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,
                     0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,
                     0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,
                     0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,
                     1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,
                     0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1),
            byrow = TRUE, ncol = 16)

# plot of A
par(mai = c(1.5, 1, 0.5, 0.5))
image(1:ncol(A), 1:nrow(A), t(apply(t(A), 1, rev)),
      xaxt = 'n', yaxt = 'n', ann = FALSE)
axis(1, at = 1:ncol(A), label = names(infantMortality)[12:27], las = 2)
axis(2, at = c(1:11), label = rev(names(infantMortality)[1:11]), las = 2)
```
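An illustrative sanity check on the structure of $\mathbf{A}$: the first row (the total) aggregates all 16 bottom series, each state row aggregates 2 bottom series (male and female), and each sex row aggregates 8 states.

```r
rowSums(A)  # expected: 16, followed by eight 2s, then 8, 8
```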
We use `bayesRecon::schaferStrimmer_cov` to estimate the covariance matrix of the residuals with shrinkage [@schafer2005shrinkage].

```r
# means
mu <- sapply(fc, "[[", 1)
# shrinkage covariance
shrink.res <- bayesRecon::schaferStrimmer_cov(residuals)
print(paste("The estimated shrinkage intensity is", round(shrink.res$lambda_star, 3)))
Sigma <- shrink.res$shrink_cov
```
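For intuition, the Schäfer-Strimmer estimator is a convex combination of the sample covariance and a diagonal target. The sketch below illustrates the idea only; the package's exact computation may differ in details such as the normalization of the sample covariance.

```r
# illustrative reconstruction (assumes a diagonal target; details may differ)
S.sample <- cov(residuals)
Sigma.manual <- shrink.res$lambda_star * diag(diag(S.sample)) +
  (1 - shrink.res$lambda_star) * S.sample
```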
We now perform Gaussian reconciliation:
```r
recon.gauss <- bayesRecon::reconc_gaussian(A,
                                           base_forecasts.mu = mu,
                                           base_forecasts.Sigma = Sigma)

bottom_mu_reconc <- recon.gauss$bottom_reconciled_mean
bottom_Sigma_reconc <- recon.gauss$bottom_reconciled_covariance

# obtain the reconciled mu and Sigma for the upper variables
upper_mu_reconc <- A %*% bottom_mu_reconc
upper_Sigma_reconc <- A %*% bottom_Sigma_reconc %*% t(A)
upper_mu_reconc
```
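A possible follow-up, shown here as an illustrative snippet: 95% prediction intervals for the reconciled upper forecasts, obtained from the diagonal of the reconciled upper covariance matrix.

```r
upper.mu <- drop(upper_mu_reconc)
upper.sd <- sqrt(diag(upper_Sigma_reconc))
cbind(lo95 = upper.mu - qnorm(0.975) * upper.sd,
      mean = upper.mu,
      hi95 = upper.mu + qnorm(0.975) * upper.sd)
```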