bootCast | R Documentation |
This function calculates point forecasts and the respective forecasting intervals for autoregressive-moving-average (ARMA) models; the forecasting intervals are obtained via bootstrap.
bootCast(
  X,
  p = NULL,
  q = NULL,
  include.mean = FALSE,
  n.start = 1000,
  h = 1,
  it = 10000,
  pb = TRUE,
  cores = future::availableCores(),
  alpha = 0.95,
  export.error = FALSE,
  plot = FALSE,
  ...
)
X |
a numeric vector that contains the time series that is assumed to follow an ARMA model ordered from past to present. |
p |
an integer value |
q |
an integer value |
include.mean |
a logical value; if set to |
n.start |
an integer that defines the 'burn-in' number
of observations for the simulated ARMA series via bootstrap; is set to
|
h |
an integer that represents the forecasting horizon; if |
it |
an integer that represents the total number of iterations, i.e.,
the number of simulated series; is set to |
pb |
a logical value; for |
cores |
an integer value >0 that states the number of (logical) cores to
use in the bootstrap (or |
alpha |
a numeric vector of length 1 with |
export.error |
a single logical value; if the argument is set to
|
plot |
a logical value that controls the graphical output; for
|
... |
additional arguments for the standard plot function, e.g.,
|
This function is part of the smoots package and was implemented under version 1.1.0. For a given time series X_t, t = 1, 2, ..., n, the point forecasts and the respective forecasting intervals will be calculated. It is assumed that the series follows an ARMA(p,q) model
X_t - \mu = \epsilon_t + \beta_1 (X_{t-1} - \mu) + ... + \beta_p (X_{t-p} - \mu) + \alpha_1 \epsilon_{t-1} + ... + \alpha_{q} \epsilon_{t-q},
where \alpha_j and \beta_i are real numbers (for i = 1, 2, ..., p and j = 1, 2, ..., q) and \epsilon_t are i.i.d. (independently and identically distributed) random variables with zero mean and constant variance. \mu is equal to E(X_t).
The point forecasts and forecasting intervals for the future periods n + 1, n + 2, ..., n + h will be obtained. With respect to the point forecasts \hat{X}_{n + k}, where k = 1, 2, ..., h,
\hat{X}_{n + k} = \hat{\mu} + \sum_{i = 1}^{p} \hat{\beta}_{i} (X_{n + k - i} - \hat{\mu}) + \sum_{j = 1}^{q} \hat{\alpha}_{j} \hat{\epsilon}_{n + k - j}
with X_{n+k-i} = \hat{X}_{n+k-i} for n+k-i > n and \hat{\epsilon}_{n+k-j} = E(\epsilon_t) = 0 for n+k-j > n will be applied.
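The recursion above can be sketched in base R. The helper arma_point_forecast() is hypothetical and not part of smoots; it assumes that coefficient estimates and in-sample residuals are already available (here taken from stats::arima) and replaces unknown future innovations by zero.

```r
# Hypothetical helper (not part of smoots): recursive ARMA point forecasts.
# x: observed series, res: in-sample residuals, ar/ma: coefficient estimates,
# mu: estimated mean, h: forecasting horizon.
arma_point_forecast <- function(x, res, ar = numeric(0), ma = numeric(0),
                                mu = 0, h = 1) {
  p <- length(ar)
  q <- length(ma)
  n <- length(x)
  xc <- c(x - mu, numeric(h))  # centered series plus h future slots
  e <- c(res, numeric(h))      # future innovations are set to E(eps) = 0
  for (k in seq_len(h)) {
    ar_part <- sum(ar * xc[n + k - seq_len(p)])  # uses earlier forecasts if n+k-i > n
    ma_part <- sum(ma * e[n + k - seq_len(q)])
    xc[n + k] <- ar_part + ma_part
  }
  mu + xc[n + seq_len(h)]
}

# Illustration on a simulated ARMA(2,1) series (assumed example data)
set.seed(1)
x <- as.numeric(arima.sim(model = list(ar = c(1.2, -0.7), ma = 0.63), n = 500)) + 13.1
fit <- arima(x, order = c(2, 0, 1), include.mean = TRUE, method = "CSS-ML")
co <- coef(fit)
fc_manual <- arma_point_forecast(
  x, as.numeric(residuals(fit)),
  ar = unname(co[c("ar1", "ar2")]), ma = unname(co["ma1"]),
  mu = unname(co["intercept"]), h = 5
)
fc_stats <- as.numeric(predict(fit, n.ahead = 5)$pred)
cbind(fc_manual, fc_stats)  # the two columns should be close for long series
```

The comparison with predict() is only a plausibility check; small differences can arise because stats::arima works with a state-space representation.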
The forecasting intervals on the other hand are obtained by a forward
bootstrap method that was introduced by Pan and Politis (2016) for
autoregressive models and extended by Lu and Wang (2020) for applications to
autoregressive-moving-average models.
For this purpose, let l be the number of the current bootstrap iteration. Based on the demeaned residuals of the initial ARMA estimation, different innovation series \epsilon_{l,t}^{s} will be sampled. The initial coefficient estimates and the sampled innovation series are then used to simulate a variety of series X_{l,t}^{s}, from which again coefficient estimates will be obtained. With these newly obtained estimates, proxy residual series \hat{\epsilon}_{l,t}^{s} are calculated for the original series X_t. Subsequently, point forecasts for the time points n + 1 to n + h are obtained for each iteration l based on the original series X_t, the newly obtained coefficient estimates and the proxy residual series \hat{\epsilon}_{l,t}^{s}.
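The steps of a single bootstrap iteration can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's implementation (which uses C++): the helpers arma_resid() and arma_forecast() are hypothetical, the proxy residuals are computed conditionally with pre-sample values set to zero, and the example series is simulated.

```r
# Hypothetical helpers for illustration only; they are not part of smoots.

# Conditional ARMA residuals (pre-sample values are treated as zero)
arma_resid <- function(x, ar, ma, mu) {
  p <- length(ar); q <- length(ma); n <- length(x)
  xc <- x - mu
  e <- numeric(n)
  for (t in seq_len(n)) {
    i <- seq_len(min(p, t - 1))
    j <- seq_len(min(q, t - 1))
    e[t] <- xc[t] - sum(ar[i] * xc[t - i]) - sum(ma[j] * e[t - j])
  }
  e
}

# Recursive point forecasts with future innovations set to zero
arma_forecast <- function(x, e, ar, ma, mu, h) {
  p <- length(ar); q <- length(ma); n <- length(x)
  xc <- c(x - mu, numeric(h)); e <- c(e, numeric(h))
  for (k in seq_len(h)) {
    xc[n + k] <- sum(ar * xc[n + k - seq_len(p)]) + sum(ma * e[n + k - seq_len(q)])
  }
  mu + xc[n + seq_len(h)]
}

set.seed(42)
X <- as.numeric(arima.sim(model = list(ar = 0.6, ma = 0.3), n = 300)) + 5
p <- 1; q <- 1; h <- 3; n <- length(X); n.start <- 100

# Initial estimation and demeaned residuals
fit0 <- arima(X, order = c(p, 0, q), include.mean = TRUE, method = "CSS-ML")
co0 <- unname(coef(fit0))  # order of coefficients: ar, ma, intercept
ar0 <- co0[seq_len(p)]; ma0 <- co0[p + seq_len(q)]; mu0 <- co0[p + q + 1]
res0 <- as.numeric(residuals(fit0))
res0 <- res0 - mean(res0)

# One iteration l: sample innovations and simulate a series with burn-in
eps_s <- sample(res0, n + n.start, replace = TRUE)
X_s <- mu0 + as.numeric(arima.sim(
  model = list(ar = ar0, ma = ma0), n = n,
  innov = eps_s[n.start + seq_len(n)],
  n.start = n.start, start.innov = eps_s[seq_len(n.start)]
))

# Re-estimate the coefficients from the simulated series
fit_s <- arima(X_s, order = c(p, 0, q), include.mean = TRUE, method = "CSS-ML")
co_s <- unname(coef(fit_s))
ar_s <- co_s[seq_len(p)]; ma_s <- co_s[p + seq_len(q)]; mu_s <- co_s[p + q + 1]

# Proxy residuals for the ORIGINAL series, then bootstrapped point forecasts
e_proxy <- arma_resid(X, ar_s, ma_s, mu_s)
fc_boot <- arma_forecast(X, e_proxy, ar_s, ma_s, mu_s, h)
fc_boot
```

In a full bootstrap this iteration would be repeated it times, each time with a new sample of innovations.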
Simultaneously, "true" forecasts, i.e., true future observations, are simulated. Within each iteration, the difference between the simulated true forecast and the bootstrapped point forecast is calculated and saved for each future time point n + 1 to n + h. The results for these time points are simulated empirical values of the forecasting error. Denote by q_k(.) the quantile of the empirical distribution for the future time point n + k. Given a predefined confidence level alpha, define \alpha_s = (1 - alpha)/2. The bootstrapped forecasting interval is then
[\hat{X}_{n + k} + q_k(\alpha_s), \hat{X}_{n + k} + q_k(1 - \alpha_s)],
i.e., the forecasting intervals are given by the sum of the respective point forecasts and quantiles of the respective bootstrapped forecasting error distributions.
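As a minimal sketch of this last step, the following base R code builds the intervals from a matrix of simulated forecasting errors. The point forecasts and the error matrix are made-up placeholder values, and the row names of the result are chosen for illustration only.

```r
set.seed(7)
h <- 3
it <- 2000
alpha <- 0.95

# Hypothetical inputs: point forecasts and an it-by-h matrix of simulated
# forecasting errors (one column per future time point n + k)
point_fc <- c(13.4, 13.2, 13.1)
err <- sapply(seq_len(h), function(k) rnorm(it, sd = sqrt(k)))

# alpha_s = (1 - alpha) / 2 and the empirical quantiles q_k(.) per column
alpha_s <- (1 - alpha) / 2
q_low <- apply(err, 2, quantile, probs = alpha_s, names = FALSE)
q_up <- apply(err, 2, quantile, probs = 1 - alpha_s, names = FALSE)

# Interval bounds: point forecast plus error quantiles
fcast <- rbind(point = point_fc,
               lower = point_fc + q_low,
               upper = point_fc + q_up)
fcast
```

Because the bounds are based on empirical quantiles, the intervals need not be symmetric around the point forecasts when the error distribution is skewed.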
The function bootCast allows for different adjustments to the forecasting process. First, a vector with the values of the observed time series, ordered from past to present, has to be passed to the argument X. The orders p and q of the underlying ARMA process can be defined via the arguments p and q. If only one of these orders is specified by the user, the other order is automatically set to 0. If neither argument is defined, the function will choose orders based on the Bayesian Information Criterion (BIC) for 0 \leq p,q \leq 5. Via the logical argument include.mean, the user can decide whether to consider the mean of the series within the estimation process. By means of n.start, the number of "burn-in" observations for the simulated ARMA processes can be regulated. These observations are usually used for the processes to build up and are then omitted. Furthermore, the argument h allows for the definition of the maximum future time point n + h. Point forecasts and forecasting intervals will be returned for the time points n + 1 to n + h. The argument it corresponds to the number of bootstrap iterations. We recommend a sufficiently high number of repetitions for maximum accuracy of the results. Another argument is alpha, which is the confidence level considered within the calculation of the forecasting intervals, i.e., the quantiles (1 - alpha)/2 and 1 - (1 - alpha)/2 of the bootstrapped forecasting error distribution will be obtained.
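The BIC-based order selection mentioned above can be illustrated with a simple grid search in base R. The helper select_arma_bic() is hypothetical and only approximates the idea; how bootCast computes the criterion internally is not shown here. The example restricts the grid to orders up to 2 to keep the run time short, whereas the documented range is 0 to 5.

```r
# Hypothetical helper (not part of smoots): choose (p, q) by minimal BIC
select_arma_bic <- function(x, p_max = 5, q_max = 5, include.mean = TRUE) {
  grid <- expand.grid(p = 0:p_max, q = 0:q_max)
  grid$bic <- vapply(seq_len(nrow(grid)), function(i) {
    fit <- tryCatch(
      suppressWarnings(
        arima(x, order = c(grid$p[i], 0, grid$q[i]),
              include.mean = include.mean, method = "CSS-ML")
      ),
      error = function(e) NULL  # skip orders for which the fit fails
    )
    if (is.null(fit)) NA_real_ else BIC(fit)
  }, numeric(1))
  grid[which.min(grid$bic), ]
}

# Assumed example data: a simulated AR(1) series
set.seed(3)
x <- as.numeric(arima.sim(model = list(ar = 0.7), n = 300))
best <- select_arma_bic(x, p_max = 2, q_max = 2)
best
```

The returned row contains the selected orders p and q together with the corresponding BIC value.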
Since this bootstrap approach needs a lot of computation time, especially for series with high numbers of observations and when fitting models with many parameters, parallel computation of the bootstrap iterations is enabled. With cores, the number of cores can be defined with an integer. Nonetheless, for cores = NULL, no cluster is created and therefore the parallel computation is disabled. Note that the bootstrapped results are fully reproducible for all cluster sizes. The progress of the bootstrap can be observed in the R console, where a progress bar and the estimated remaining time are displayed for pb = TRUE.
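One common way to obtain results that do not depend on the cluster size is to assign a fixed seed to each iteration. The sketch below illustrates this idea with the base parallel package; it is an assumption for illustration and not necessarily how bootCast achieves reproducibility (the function itself relies on future and future.apply). The function one_iteration() is a hypothetical stand-in for a bootstrap iteration.

```r
library(parallel)

# Hypothetical stand-in for one bootstrap iteration: draws h values
# from the residuals using an iteration-specific seed
one_iteration <- function(seed, res, h) {
  set.seed(seed)
  sample(res, h, replace = TRUE)
}

set.seed(123)
res <- rnorm(200)  # placeholder residuals
it <- 50
h <- 3
seeds <- sample.int(.Machine$integer.max, it)  # one seed per iteration

run_boot <- function(cores) {
  if (is.null(cores)) {
    # no cluster: plain sequential loop
    out <- lapply(seeds, one_iteration, res = res, h = h)
  } else {
    cl <- makeCluster(cores)
    on.exit(stopCluster(cl))
    out <- parLapply(cl, seeds, one_iteration, res = res, h = h)
  }
  do.call(rbind, out)  # it-by-h matrix
}

err_seq <- run_boot(NULL)  # sequential
err_par <- run_boot(2)     # two workers
identical(err_seq, err_par)
```

Since every iteration sets its own seed, the assignment of iterations to workers does not influence the simulated values.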
If the argument export.error is set to TRUE, the output of the function is a list instead of a matrix with additional information on the simulated forecasting errors. For more information see the section Value.
For simplicity, the function also incorporates the possibility to directly create a plot of the output, if the argument plot is set to TRUE. Via the additional and optional arguments ..., further arguments of the standard plot function can be passed on to shape the returned plot.
NOTE:
Within this function, the arima function of the stats package with its method "CSS-ML" is used throughout for the estimation of ARMA models. Furthermore, to increase the performance, C++ code via the Rcpp and RcppArmadillo packages was implemented. Also, the future and future.apply packages are considered for parallel computation of bootstrap iterations. The progress of the bootstrap is shown via the progressr package.
The function returns a 3 by h matrix with its columns representing the future time points and the point forecasts, the lower bounds of the forecasting intervals and the upper bounds of the forecasting intervals as the rows. If the argument plot is set to TRUE, a plot of the forecasting results is created.
If export.error = TRUE is selected, a list with the following elements is returned instead.
the 3 by h forecasting matrix with point forecasts and bounds of the forecasting intervals.
an it by h matrix, where each column represents a future time point n + 1, n + 2, ..., n + h; in each column the respective it simulated forecasting errors are saved.
Dominik Schulz (Research Assistant) (Department of Economics, Paderborn University),
Package Creator and Maintainer
Feng, Y., Gries, T. and Fritz, M. (2020). Data-driven local polynomial for the trend and its derivatives in economic time series. Journal of Nonparametric Statistics, 32:2, 510-533.
Feng, Y., Gries, T., Letmathe, S. and Schulz, D. (2019). The smoots package in R for semiparametric modeling of trend stationary time series. Discussion Paper. Paderborn University. Unpublished.
Feng, Y., Gries, T., Fritz, M., Letmathe, S. and Schulz, D. (2020). Diagnosing the trend and bootstrapping the forecasting intervals using a semiparametric ARMA. Discussion Paper. Paderborn University. Unpublished.
Lu, X. and Wang, L. (2020). Bootstrap prediction interval for ARMA models with unknown orders. REVSTAT–Statistical Journal, 18:3, 375-396.
Pan, L. and Politis, D. N. (2016). Bootstrap prediction intervals for linear, nonlinear and nonparametric autoregressions. Journal of Statistical Planning and Inference, 177, 1-27.
### Example 1: Simulated ARMA process ###
# Function for drawing from a demeaned chi-squared distribution
rchisq0 <- function(n, df, ncp = 0) {
  rchisq(n, df, ncp) - df
}
# Simulation of the underlying process
n <- 2000
n.start <- 1000
set.seed(23)
X <- arima.sim(model = list(ar = c(1.2, -0.7), ma = 0.63), n = n,
  rand.gen = rchisq0, n.start = n.start, df = 3) + 13.1
# Quick application with low number of iterations
# (not recommended in practice)
result <- bootCast(X = X, p = 2, q = 1, include.mean = TRUE,
  n.start = n.start, h = 5, it = 10, cores = 2, plot = TRUE,
  lty = 3, col = "forestgreen", xlim = c(1950, 2005), type = "b",
  main = "Exemplary title", pch = "*")
result
### Example 2: Application with more iterations ###
## Not run:
result2 <- bootCast(X = X, p = 2, q = 1, include.mean = TRUE,
  n.start = n.start, h = 5, it = 10000, cores = 2, plot = TRUE,
  lty = 3, col = "forestgreen", xlim = c(1950, 2005),
  main = "Exemplary title")
result2
## End(Not run)