Perform cross validation on a time series.
x | the input time series.
FUN | the model function used. Custom functions are allowed. See details and examples.
FCFUN | a function that processes point forecasts for the model function. This defaults to
rolling | should a rolling procedure be used? If TRUE, non-overlapping windows of size
windowSize | length of the window to build each model. When
maxHorizon | maximum length of the forecast horizon to use for computing errors.
horizonAverage | should the final errors be an average over all forecast horizons up to
xreg | external regressors to be used to fit the model. Only used if FUN accepts xreg as an argument. FCFUN is also expected to accept it (see details).
saveModels | should the individual models be saved? Set this to
saveForecasts | should the individual forecast from each model be saved? Set this to
verbose | should the current progress be printed to the console?
num.cores | the number of cores to use for parallel fitting. If the underlying model that is being fit also utilizes parallelization, the number of cores it is using multiplied by 'num.cores' should not exceed the number of cores available on your machine.
extraPackages | on Windows, if a custom 'FUN' or 'FCFUN' is being used that requires loaded packages, these can be passed here so that they are passed on to the parallel socket workers.
... | other arguments to be passed to the model function FUN.
Cross validation of time series data is more complicated than regular k-fold or leave-one-out cross validation of datasets without serial correlation, since observations x[t] and x[t+n] are not independent. The cvts() function overcomes this obstacle using two methods: 1) rolling cross validation, where an initial training window is used along with a forecast horizon and the training window grows by one observation each round until the training window and the forecast horizon capture the entire series, or 2) a non-rolling approach, where a fixed training length is used that is shifted forward by the forecast horizon after each iteration.
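The two schemes described above can be sketched in base R as index windows. This is only an illustration: `foldWindows` is a hypothetical helper, not a function from the package, and the exact fold boundaries used inside cvts() may differ from what is assumed here.

```r
# Hypothetical helper (not part of the package): lists the training and test
# index ranges of each fold under the two schemes described in the text.
foldWindows <- function(n, windowSize, maxHorizon, rolling = TRUE) {
  stopifnot(windowSize + maxHorizon <= n)
  if (rolling) {
    # Training window always starts at 1 and grows by one observation per round.
    trainEnd <- seq(windowSize, n - maxHorizon, by = 1)
    trainStart <- rep(1, length(trainEnd))
  } else {
    # Fixed-length training window shifted forward by the forecast horizon.
    trainEnd <- seq(windowSize, n - maxHorizon, by = maxHorizon)
    trainStart <- trainEnd - windowSize + 1
  }
  data.frame(trainStart = trainStart, trainEnd = trainEnd,
             testStart = trainEnd + 1, testEnd = trainEnd + maxHorizon)
}

rollingFolds <- foldWindows(n = 50, windowSize = 25, maxHorizon = 12, rolling = TRUE)
fixedFolds <- foldWindows(n = 50, windowSize = 25, maxHorizon = 12, rolling = FALSE)
nrow(rollingFolds)  # number of models fit under the growing-window scheme
nrow(fixedFolds)    # number of models fit under the shifted-window scheme
```

With these settings the growing-window scheme produces many more folds than the shifted-window scheme, which is the source of the difference in computation time.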
For the rolling approach, training points are heavily recycled, both for fitting models and for generating forecast errors at each of the forecast horizons from 1:maxHorizon. In contrast, the models fit with the non-rolling approach share less overlap, and the predicted forecast values are compared to the actual values only once. The former approach is similar to leave-one-out cross validation, while the latter resembles k-fold cross validation. As a result, rolling cross validation requires far more iterations and takes longer to compute, but a disadvantage of the non-rolling approach is the greater variance and general instability of its cross-validated errors.
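To make the recycling concrete, the following base R sketch counts how often each observation is scored as a forecast target under each scheme. The fold boundaries are an assumption based on the description above rather than the package's internal logic, and `testUse` is a hypothetical helper.

```r
n <- 50
windowSize <- 25
maxHorizon <- 12

# Last training index of each fold (assumed fold layout, see the text).
rollingEnds <- seq(windowSize, n - maxHorizon, by = 1)
fixedEnds <- seq(windowSize, n - maxHorizon, by = maxHorizon)

# Hypothetical helper: number of folds in which each observation is a
# forecast target at some horizon in 1:maxHorizon.
testUse <- function(trainEnds) {
  targets <- unlist(lapply(trainEnds, function(e) e + seq_len(maxHorizon)))
  tabulate(targets, nbins = n)
}

rollingUse <- testUse(rollingEnds)
fixedUse <- testUse(fixedEnds)
max(rollingUse)  # an observation can be scored at every horizon
max(fixedUse)    # each observation is scored at most once
```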
The FUN and FCFUN arguments specify which function to use for generating a model and forecasting, respectively. While the functions from the "forecast" package can be used, user-defined functions can also be tested, but FCFUN must accept the argument h, and the object it returns must contain the point forecasts out to this horizon h in slot $mean. An example is given with a custom model and forecast.
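As a minimal sketch of that contract, the pair below implements a trivial mean forecaster in base R. `meanModel` and `meanForecast` are hypothetical names introduced for illustration; the only requirements taken from the text are that the forecast function accepts h and returns an object with the point forecasts in $mean.

```r
# Hypothetical model function: "fits" a model that only stores the sample mean.
meanModel <- function(x) {
  structure(list(mu = mean(x)), class = "meanModel")
}

# Hypothetical forecast function: accepts the fitted model and the horizon h,
# and returns a list whose $mean element holds the h point forecasts.
meanForecast <- function(mod, h) {
  list(mean = rep(mod$mu, h))
}

train <- window(AirPassengers, end = c(1950, 12))  # first 24 observations
fit <- meanModel(train)
fc <- meanForecast(fit, h = 3)
length(fc$mean)  # one point forecast per step of the horizon
```

A pair like this would be supplied as FUN = meanModel and FCFUN = meanForecast. Whether cvts() needs anything beyond $mean from the returned object is not stated here, so treat this as a starting point.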
For small time series (default length <= 500), all of the individual fit models are included in the final cvts object that is returned. This object can grow quite large since functions such as auto.arima will save fitted values, residual values, summary statistics, coefficient matrices, etc. Setting saveModels = FALSE can be safely done if there is no need to examine the individual models fit at every stage of cross validation, since the forecasts from each fold and the associated residuals are always saved.
External regressors are allowed via the xreg argument. It is assumed that both FUN and FCFUN accept the xreg parameter if xreg is not NULL. If FUN does not accept the xreg parameter, a warning will be given. No warning is provided if FCFUN does not use the xreg parameter.
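A custom FUN and FCFUN pair that both accept xreg might look like the base R sketch below. The names `lmModel` and `lmForecast` are hypothetical, and the assumption that the forecast function receives the regressor values for the h future periods is ours; how cvts() actually slices xreg for each fold is not described here.

```r
# Hypothetical model function: linear regression of the series on one regressor.
lmModel <- function(x, xreg) {
  trainData <- data.frame(y = as.numeric(x), z = as.numeric(xreg))
  lm(y ~ z, data = trainData)
}

# Hypothetical forecast function: xreg is assumed to hold the regressor values
# for the h future periods; the point forecasts are returned in $mean.
lmForecast <- function(mod, h, xreg) {
  futureData <- data.frame(z = as.numeric(xreg)[seq_len(h)])
  list(mean = unname(predict(mod, newdata = futureData)))
}

# Toy data: the series is an exact linear function of the regressor.
z <- 1:23
y <- 3 + 2 * z
fit <- lmModel(y[1:20], xreg = z[1:20])
fc <- lmForecast(fit, h = 3, xreg = z[21:23])
fc$mean
```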
Author(s): David Shaub
library(forecast)        # snaive(), stlm(), tsclean(), rwf(), accuracy()
library(forecastHybrid)  # cvts(), hybridModel()

series <- subset(AirPassengers, end = 50)
cvmod1 <- cvts(series, FUN = snaive,
               windowSize = 25, maxHorizon = 12)
accuracy(cvmod1)

# We can also use custom model functions for modeling/forecasting
stlmClean <- function(x) stlm(tsclean(x))
series <- subset(austres, end = 38)
cvmodCustom <- cvts(series, FUN = stlmClean, windowSize = 26, maxHorizon = 6)
accuracy(cvmodCustom)

# Use the rwf() function from the "forecast" package.
# This function does not have a modeling function and
# instead calculates a forecast on the time series directly
series <- subset(AirPassengers, end = 26)
rwcv <- cvts(series, FCFUN = rwf, windowSize = 24, maxHorizon = 1)

# Don't return the model or forecast objects
cvmod2 <- cvts(USAccDeaths, FUN = stlm,
               saveModels = FALSE, saveForecasts = FALSE,
               windowSize = 36, maxHorizon = 12)

# If we don't need prediction intervals and are using the nnetar model, turning off PI
# will make the forecasting much faster
series <- subset(AirPassengers, end = 40)
cvmod3 <- cvts(series, FUN = hybridModel,
               FCFUN = function(mod, h) forecast(mod, h = h, PI = FALSE),
               rolling = FALSE, windowSize = 36,
               maxHorizon = 2)