rollCast  R Documentation 
A simple backtest of SemiARMA models via rolling forecasts can be implemented.
rollCast(
y,
p = NULL,
q = NULL,
K = 5,
method = c("norm", "boot"),
alpha = 0.95,
np.fcast = c("lin", "const"),
it = 10000,
n.start = 1000,
pb = TRUE,
cores = future::availableCores(),
argsSmoots = list(),
plot = TRUE,
argsPlot = list()
)
y 
a numeric vector that represents the equidistant time series assumed to follow a SemiARMA model; must be ordered from past to present. 
p 
an integer value 
q 
an integer value 
K 
a single, positive integer value that defines the number of
out-of-sample observations; the last 
method 
a character object; defines the method used for the calculation
of the forecasting intervals; with 
alpha 
a numeric vector of length 1 with 
np.fcast 
a character object; defines the forecasting method used
for the nonparametric trend; for 
it 
an integer that represents the total number of iterations, i.e.,
the number of simulated series; is set to 
n.start 
an integer that defines the 'burn-in' number
of observations for the simulated ARMA series via bootstrap; is set to

pb 
a logical value; for 
cores 
an integer value >0 that states the number of (logical) cores to
use in the bootstrap (or 
argsSmoots 
a list that contains arguments that will be passed to

plot 
a logical value that controls the graphical output; for the
default ( 
argsPlot 
a list; additional arguments for the standard plot function,
e.g., 
Assume that an observed, equidistant time series y_t, with t = 1, 2, ..., n, follows
y_t = m(x_t) + \epsilon_t,
where x_t = t/n is the rescaled time on the closed interval [0,1] and m(x_t) is a nonparametric and deterministic trend function (see Beran and Feng, 2002, and Feng, Gries and Fritz, 2020). \epsilon_t, on the other hand, is a stationary process with E(\epsilon_t) = 0 and short-range dependence.
For the purpose of this function, \epsilon_t is assumed to follow an autoregressive moving-average (ARMA) model with
\epsilon_t = \zeta_t + \beta_1 \epsilon_{t-1} + ... + \beta_p \epsilon_{t-p} + \alpha_1 \zeta_{t-1} + ... + \alpha_q \zeta_{t-q}.
Here, the random variables \zeta_t are independently and identically distributed (i.i.d.) with zero mean and a constant variance, and the coefficients \beta_i, i = 1, 2, ..., p, and \alpha_j, j = 1, 2, ..., q, are real numbers.
The combination of both previous formulas will be called a semiparametric
ARMA (SemiARMA) model.
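To make the model definition concrete, the following minimal sketch simulates a series of this form in base R. The quadratic trend, the ARMA(1,1) coefficients and the sample size are arbitrary assumptions chosen for illustration only; they are not taken from the package.

```r
# Simulate y_t = m(x_t) + eps_t with an ARMA(1,1) error process.
# All numeric choices below are illustrative assumptions.
set.seed(1)
n <- 200
x <- (1:n) / n                      # rescaled time x_t = t / n on (0, 1]
m <- 1 + 2 * x^2                    # hypothetical smooth, deterministic trend
eps <- as.numeric(
  stats::arima.sim(model = list(ar = 0.5, ma = 0.3), n = n)
)                                   # stationary, zero-mean ARMA(1,1) errors
y <- m + eps                        # the simulated SemiARMA series
```

A vector such as y, ordered from past to present, has the form expected by the argument y.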
An explicit forecasting method for SemiARMA models is described in modelCast. To backtest a selected model, a slightly adjusted procedure is used. The data are divided into in-sample and out-of-sample values (usually the last K = 5 observations in the data are reserved as the out-of-sample observations). A model is fitted to the in-sample data, whereas one-step rolling point forecasts and forecasting intervals are obtained for the out-of-sample time points. The proposed forecasts of the trend are either a linear or a constant extrapolation of the trend with negligible forecasting intervals, whereas the point forecasts of the stationary rest term are obtained via the selected ARMA(p,q) model (see Fritz et al., 2020). The corresponding forecasting intervals are calculated either under the assumption that the innovations \zeta_t are normally distributed (see e.g. pp. 93-94 in Brockwell and Davis, 2016) or via a forward bootstrap (see Lu and Wang, 2020). For a one-step forecast for time point t, all observations until time point t-1 are assumed to be known.
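The rolling one-step idea can be sketched in base R for a deliberately simplified case. The sketch assumes a trend-free series and an AR(1) model, so that the one-step forecast is simply the estimated coefficient times the previous true observation, and it uses normal-theory intervals. It illustrates the principle only and is not the package's implementation.

```r
# Rolling one-step forecasts for a zero-mean AR(1) series (simplified sketch).
set.seed(42)
K <- 5                                        # number of out-of-sample points
eps <- as.numeric(stats::arima.sim(model = list(ar = 0.6), n = 205))
n <- length(eps)
eps_in  <- eps[1:(n - K)]                     # in-sample observations
eps_out <- eps[(n - K + 1):n]                 # out-of-sample observations

# The model is fitted to the in-sample data only
fit <- stats::arima(eps_in, order = c(1, 0, 0),
                    include.mean = FALSE, method = "CSS-ML")
phi   <- unname(coef(fit)["ar1"])
sigma <- sqrt(fit$sigma2)                     # innovation standard deviation

# The forecast for time t uses the true observation at time t - 1
prev  <- eps[(n - K):(n - 1)]
fc    <- phi * prev
z     <- stats::qnorm(1 - (1 - 0.95) / 2)     # alpha = 0.95
lower <- fc - z * sigma
upper <- fc + z * sigma

# A breach occurs when the true value lies outside the interval
breach <- eps_out < lower | eps_out > upper
```

Note that the coefficients are estimated once; only the information set moves forward by one observation per forecast.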
The function calculates three important values for backtesting: the number of breaches, i.e. the number of true observations that lie outside of the forecasting intervals, the mean absolute scaled error (MASE, see Hyndman and Koehler, 2006) and the root mean squared scaled error (RMSSE, see Hyndman and Koehler, 2006). For the MASE, a value < 1 indicates a better average forecasting potential than a naive forecasting approach. Furthermore, it is independent of the scale of the data and can thus be used to compare forecasts of different datasets. Closely related is the RMSSE; here, however, the mean of the squared forecasting errors is computed and scaled by the mean of the squared errors of the naive forecasting approach. The root of that value is the RMSSE. Due to the close relation, the interpretation of the RMSSE is similar but not identical to the interpretation of the MASE. Of course, a value close to zero is preferred in both cases.
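As a rough illustration of the two error measures, the sketch below follows the common Hyndman and Koehler (2006) convention of scaling by the in-sample errors of the naive forecast (the previous observation). The helper names mase_sketch and rmsse_sketch are hypothetical, and the exact scaling used inside the package is an assumption here.

```r
# Hypothetical helpers, not part of the package.
# Scaling: in-sample one-step naive forecast errors y_t - y_{t-1} (assumption).
mase_sketch <- function(y_out, fcast, y_in) {
  scale <- mean(abs(diff(y_in)))
  mean(abs(y_out - fcast)) / scale
}

rmsse_sketch <- function(y_out, fcast, y_in) {
  scale <- mean(diff(y_in)^2)
  sqrt(mean((y_out - fcast)^2) / scale)
}

y_in  <- c(1, 2, 4, 7)     # toy in-sample data
y_out <- c(8, 10)          # toy out-of-sample data
fcast <- c(9, 9)           # toy point forecasts

mase_val  <- mase_sketch(y_out, fcast, y_in)
rmsse_val <- rmsse_sketch(y_out, fcast, y_in)
```

In this toy example both values are below 1, which would indicate that the forecasts outperform the naive approach on average.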
To make use of the function, a numeric vector with the values of a time series that is assumed to follow a SemiARMA model needs to be passed to the argument y. Moreover, the arguments p and q represent the AR and MA orders, respectively, of the underlying ARMA process in the parametric part of the model. If both values are set to NULL, an optimal order in accordance with the Bayesian Information Criterion (BIC) will be selected. If only one of the values is NULL, it will be changed to zero instead. K defines the number of out-of-sample observations; these will be cut off the end of y, while the remaining observations are treated as the in-sample observations. For the K out-of-sample time points, rolling forecasts will be obtained. method describes the method to use for the computation of the prediction intervals. Under the normality assumption for the innovations \zeta_t, intervals can be obtained via method = "norm". However, if the assumption does not hold, a bootstrap can be implemented as well (method = "boot"). Both approaches are explained in more detail in normCast and bootCast, respectively. With alpha, the confidence level of the forecasting intervals can be adjusted, as the (100 * alpha)-percent forecasting intervals will be computed. By means of the argument np.fcast, the forecasting method for the nonparametric trend function can be defined. Selectable are a linear (np.fcast = "lin") and a constant (np.fcast = "const") extrapolation. For more information on these methods, we refer the reader to trendCast.
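The difference between the two trend extrapolation options can be illustrated with a small sketch. It assumes that the constant option repeats the last estimated trend value and that the linear option continues the line through the last two estimated trend values; the actual method is documented in trendCast, and the trend values below are made up.

```r
# Toy trend estimates at the last in-sample time points (made-up values)
trend_in <- c(1.0, 1.5, 2.0, 2.4)
K <- 3                                   # number of forecasting steps

# Constant extrapolation: repeat the last trend value (assumption)
const_fc <- rep(tail(trend_in, 1), K)

# Linear extrapolation: continue the slope of the last two values (assumption)
slope  <- diff(tail(trend_in, 2))
lin_fc <- tail(trend_in, 1) + slope * seq_len(K)
```

Here const_fc stays at 2.4, while lin_fc increases by 0.4 per step.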
it, n.start, pb and cores are only relevant for method = "boot". With it the total number of bootstrap iterations is defined, whereas n.start regulates how many 'burn-in' observations are generated for each simulated ARMA process in the bootstrap. Since a bootstrap may require a longer computation time, the number of cores for parallel computation of the bootstrap iterations can be defined with the argument cores. For cores = NULL, however, no cluster is created and the parallel computation is therefore disabled. Note that the bootstrapped results are fully reproducible for all cluster sizes. Moreover, for pb = TRUE, the progress of the bootstrap approach can be observed in the R console via a progress bar. Additional information on these four function arguments can be found in bootCast.
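The role of the burn-in can be illustrated with a single bootstrap replication in base R. The sketch below resamples from a vector of centered residuals (here a random stand-in, since no fitted model is available) and discards the first n.start simulated values via stats::arima.sim. The coefficients and sizes are assumptions, and the package's own bootstrap, which is implemented in C++, may differ in its details.

```r
set.seed(123)
# Stand-in for the centered residuals of a fitted ARMA model (assumption)
res <- stats::rnorm(300)
res <- res - mean(res)

# Innovation generator that resamples the residuals with replacement
boot_gen <- function(n, ...) sample(res, size = n, replace = TRUE)

# One simulated ARMA(1,1) series; the first 1000 values are burn-in
# observations that arima.sim() generates and then discards
sim <- stats::arima.sim(model = list(ar = 0.5, ma = 0.2), n = 100,
                        rand.gen = boot_gen, n.start = 1000)
```

Repeating such a replication many times, as controlled by the argument it, yields the simulated distribution from which the bootstrap intervals are derived.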
The argument argsSmoots is a list. In this list, different arguments of the function msmooth can be supplied to adjust the estimation of the nonparametric part of the complete model. The arguments of the smoothing function are described in msmooth.
rollCast allows for a quick plot of the results. If the logical argument plot is set to TRUE, a graphic with default settings is created. Nevertheless, users may pass further arguments of the standard plot function in the list argsPlot. For example, the limits of the plot can be adjusted by xlim and ylim. Furthermore, an argument x can be included in argsPlot with the actual equidistant time points of the whole series (in-sample & out-of-sample observations). Otherwise, simply 1:n is used as the time points by default.
NOTE:
Within this function, the arima function of the stats package with its method "CSS-ML" is used throughout for the estimation of ARMA models. Furthermore, to increase the performance, C++ code via the Rcpp and RcppArmadillo packages was implemented. Also, the future and future.apply packages are used for parallel computation of bootstrap iterations. The progress of the bootstrap is shown via the progressr package.
A list with different elements is returned. The elements are as follows.
a single numeric value; it describes what confidence level (100 * alpha)-percent has been considered for the forecasting intervals.
a logical vector that states whether the K true out-of-sample observations lie outside of the forecasting intervals, respectively; a breach is denoted by TRUE.
a numeric vector that contains the margin of the breaches (in absolute terms) for the K out-of-sample time points; if a breach did not occur, the respective element is set to zero.
a numeric vector that contains the simulated empirical values of the forecasting error for method = "boot"; otherwise, it is set to NULL.
a numeric vector that contains the K point forecasts of the parametric part of the model.
a numeric matrix that contains the K rolling point forecasts as well as the values of the bounds of the respective forecasting intervals for the complete model; the first row contains the point forecasts, the lower bounds of the forecasting intervals are in the second row and the upper bounds can be found in the third row.
a numeric vector that contains the K obtained trend forecasts.
a positive integer; states the number of out-of-sample observations as well as the number of forecasts for the out-of-sample time points.
the obtained value of the mean absolute scaled error for the selected model.
a character object that states whether the forecasting intervals were obtained via a bootstrap (method = "boot") or under the normality assumption for the innovations (method = "norm").
the output (usually a list) of the nonparametric trend estimation via msmooth.
the output (usually a list) of the parametric ARMA estimation of the detrended series via arima.
the number of observations (in-sample & out-of-sample observations).
the number of in-sample observations (n - n.out).
the number of out-of-sample observations (equals K).
a character object that states the applied forecasting method for the nonparametric trend function; either a linear (np.fcast = "lin") or a constant (np.fcast = "const") extrapolation is possible.
a numeric vector of length 2 with the [100(1 - alpha)/2]-percent and {100[1 - (1 - alpha)/2]}-percent quantiles of the forecasting error distribution.
the obtained value of the root mean squared scaled error for the selected model.
a numeric vector that contains all true observations (in-sample & out-of-sample observations).
a numeric vector that contains all in-sample observations.
a numeric vector that contains the K out-of-sample observations.
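The quantile levels and the breach margins described in the list above can be reproduced with a few lines of base R. The forecasting errors, point forecasts and observations below are made-up toy values, and building the interval bounds by adding the error quantiles to the point forecasts is an assumption made for this sketch.

```r
alpha <- 0.95
# Levels of the two quantiles: (1 - alpha) / 2 and 1 - (1 - alpha) / 2
probs <- c((1 - alpha) / 2, 1 - (1 - alpha) / 2)

err <- c(-2, -1, 0, 1, 2)                  # toy forecasting errors
q   <- stats::quantile(err, probs = probs, names = FALSE)

fc  <- c(10, 10, 10)                       # toy point forecasts
obs <- c(9, 12.5, 7)                       # toy true observations
lower <- fc + q[1]                         # assumed interval construction
upper <- fc + q[2]

# TRUE marks a breach; the margin is zero when no breach occurred
breach <- obs < lower | obs > upper
margin <- pmax(lower - obs, obs - upper, 0)
```

For alpha = 0.95, the levels are 0.025 and 0.975, i.e. the bounds of a 95-percent forecasting interval.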
Yuanhua Feng (Department of Economics, Paderborn University),
Author of the Algorithms
Website: https://wiwi.uni-paderborn.de/en/dep4/feng/
Dominik Schulz (Research Assistant) (Department of Economics, Paderborn
University),
Package Creator and Maintainer
Beran, J., and Feng, Y. (2002). Local polynomial fitting with long-memory, short-memory and antipersistent errors. Annals of the Institute of Statistical Mathematics, 54, 291-311.
Brockwell, P. J., and Davis, R. A. (2016). Introduction to time series and forecasting, 3rd edition. Springer.
Fritz, M., Forstinger, S., Feng, Y., and Gries, T. (2020). Forecasting economic growth processes for developing economies. Unpublished.
Feng, Y., Gries, T., and Fritz, M. (2020). Data-driven local polynomial for the trend and its derivatives in economic time series. Journal of Nonparametric Statistics, 32:2, 510-533.
Hyndman, R. J., and Koehler, A. B. (2006). Another look at measures of forecast accuracy. International Journal of Forecasting, 22:4, 679-688.
Lu, X., and Wang, L. (2020). Bootstrap prediction interval for ARMA models with unknown orders. REVSTAT-Statistical Journal, 18:3, 375-396.
lgdp <- log(smoots::gdpUS$GDP)
time <- seq(from = 1947.25, to = 2019.5, by = 0.25)
backtest <- rollCast(lgdp, K = 5,
  argsPlot = list(x = time, xlim = c(2012, 2019.5), col = "forestgreen",
  type = "b", pch = 20, lty = 2, main = "Example"))
backtest