cv.odpc: Automatic Choice of Tuning Parameters for One-Sided Dynamic...
In esmucler/odpc: One-Sided Dynamic Principal Components


cv.odpc

R Documentation

Automatic Choice of Tuning Parameters for One-Sided Dynamic Principal Components via Cross-Validation

Description

Computes One-Sided Dynamic Principal Components, choosing the number of components and lags automatically, to minimize an estimate of the forecasting mean squared error.

Usage

cv.odpc(
  Z,
  h,
  k_list = 1:5,
  max_num_comp = 5,
  window_size,
  ncores_k = 1,
  ncores_w = 1,
  method,
  tol = 1e-04,
  niter_max = 500,
  train_tol = 0.01,
  train_niter_max = 100
)

Arguments

`Z`	Data matrix. Each column is a different time series.
`h`	Forecast horizon.
`k_list`	Vector of candidate values of k (the number of lags) to choose from.
`max_num_comp`	Maximum possible number of components to compute.
`window_size`	The size of the rolling window used to estimate the forecasting error.
`ncores_k`	Number of cores to parallelise over `k_list`.
`ncores_w`	Number of cores to parallelise over the rolling window (nested in `k_list`).
`method`	A string specifying the algorithm used. Options are 'ALS', 'mix' or 'gradient'. See details in `odpc`.
`tol`	Relative precision. Default is 1e-4.
`niter_max`	Integer. Maximum number of iterations. Default is 500.
`train_tol`	Relative precision used in cross-validation. Default is 1e-2.
`train_niter_max`	Integer. Maximum number of iterations used in cross-validation. Default is 100.

Details

We assume that k_{1}^{i}=k_{2}^{i} for each component; that is, the number of lags of \mathbf{z}_{t} used to define the dynamic principal component equals the number of lags of \widehat{f}^{i}_{t} used to reconstruct the original series. The number of components and lags is chosen stepwise to minimize the cross-validated forecasting error. Suppose we want h-step-ahead forecasts, and let w = window_size. Given k \in k_list, we compute the first ODPC defined using k lags on periods 1,…,T-h-t+1 for each t=1,…,w. For each of these fits we compute an h-step-ahead forecast and the corresponding mean squared error E_{t,h}. The cross-validation estimate of the forecasting error is then

\widehat{MSE}_{1,k}=\frac{1}{w}\sum\limits_{t=1}^{w}E_{t,h}.
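The rolling-window estimate above can be sketched in a few lines of R. This is only an illustration of the averaging scheme, not the package's implementation: `mean_forecast` is a hypothetical stand-in forecaster (it is not ODPC), `rolling_mse` is a hypothetical helper name, and E_{t,h} is assumed here to be the squared forecast error averaged over the series.

```r
# Stand-in forecaster (NOT ODPC): predicts each series by its training mean.
# Hypothetical helper, used only to make the sketch runnable.
mean_forecast <- function(Z_train, h) colMeans(Z_train)

# Rolling-window estimate of the h-step-ahead forecasting MSE.
rolling_mse <- function(Z, h, window_size, forecaster) {
  n_periods <- nrow(Z)                            # T in the text
  errors <- vapply(seq_len(window_size), function(t) {
    last_train <- n_periods - h - t + 1           # fit on periods 1, ..., T-h-t+1
    Z_train <- Z[seq_len(last_train), , drop = FALSE]
    target <- Z[last_train + h, ]                 # observed values h steps ahead
    mean((target - forecaster(Z_train, h))^2)     # E_{t,h}, averaged over series
  }, numeric(1))
  mean(errors)                                    # (1/w) * sum over t of E_{t,h}
}

set.seed(1)
Z <- matrix(rnorm(40 * 3), 40, 3)
cv_estimate <- rolling_mse(Z, h = 1, window_size = 5, forecaster = mean_forecast)
```

In cv.odpc the role of the stand-in forecaster is played by an ODPC fit with k lags, so one such estimate is obtained for every k in k_list.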

We choose for the first component the value k^{\ast,1} that minimizes \widehat{MSE}_{1,k}. Then, we fix the first component computed with k^{\ast,1} lags and repeat the procedure with the second component. If the optimal cross-validated forecasting error using the two components, \widehat{MSE}_{2,k^{\ast,2}}, is larger than the one using only one component, \widehat{MSE}_{1,k^{\ast,1}}, we stop and output as a final model the ODPC computed using one component defined with k^{\ast,1} lags; otherwise, if max_num_comp ≥ 2 we add the second component defined using k^{\ast,2} lags and proceed as before.
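The stepwise rule described above might be sketched as follows. This is a minimal illustration under stated assumptions, not the package's code: `stepwise_select` is a hypothetical helper, and `cv_error` is assumed to be a user-supplied function that returns the cross-validated forecasting error for a given vector of lags (one entry per component). The toy error function uses made-up numbers.

```r
# Greedy, component-by-component choice of the number of lags.
# cv_error(ks) is assumed to return the cross-validated error of a model whose
# i-th component uses ks[i] lags (hypothetical interface).
stepwise_select <- function(cv_error, k_list, max_num_comp) {
  chosen <- integer(0)
  best_so_far <- Inf
  for (i in seq_len(max_num_comp)) {
    # Earlier components stay fixed; only the lags of component i vary.
    errs <- vapply(k_list, function(k) cv_error(c(chosen, k)), numeric(1))
    if (min(errs) > best_so_far) break   # the extra component worsens the error: stop
    chosen <- c(chosen, k_list[which.min(errs)])
    best_so_far <- min(errs)
  }
  list(k = chosen, mse = best_so_far)
}

# Toy error surface (made-up values): two components are best, each with 2 lags.
toy_cv_error <- function(ks) {
  base <- c(4, 3, 3.5)                   # error by number of components
  base[length(ks)] + 0.1 * abs(ks[length(ks)] - 2)
}

selection <- stepwise_select(toy_cv_error, k_list = 1:3, max_num_comp = 3)
```

With the toy error function, the third component raises the error from 3 to 3.5, so the search stops with two components.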

This method can be computationally costly, especially for large values of window_size. Ideally, the user should set ncores_k equal to the length of k_list and ncores_w equal to window_size; this would entail using ncores_k times ncores_w cores in total.
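When fewer cores are available than the ideal product, the two settings have to be scaled down. The sketch below shows one possible heuristic, which is an assumption of this example and not a recommendation from the package: fill ncores_k first, then give ncores_w whatever is left. `plan_cores` is a hypothetical helper; `parallel::detectCores()` is part of base R and may return NA on some platforms.

```r
# One possible way to split the available cores between the two loops.
# The heuristic (prioritise the loop over k_list) is an assumption of this sketch.
plan_cores <- function(k_list, window_size, available = parallel::detectCores()) {
  if (is.na(available) || available < 1) available <- 1
  ncores_k <- min(length(k_list), available)                  # ideal: length(k_list)
  ncores_w <- max(1, min(window_size, available %/% ncores_k)) # ideal: window_size
  list(ncores_k = ncores_k, ncores_w = ncores_w, total = ncores_k * ncores_w)
}

ideal_plan <- plan_cores(k_list = 1:5, window_size = 10, available = 50)
small_plan <- plan_cores(k_list = 1:5, window_size = 10, available = 8)
```

Here `ideal_plan` uses 5 times 10 cores, matching the ideal setting described above, while `small_plan` falls back to 5 cores for the loop over k_list and a single core for the rolling window.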

Value

An object of class odpcs, that is, a list of length equal to the number of computed components, each computed using the optimal value of k. The i-th entry of this list is an object of class odpc, that is, a list with entries

`f`	Coordinates of the i-th dynamic principal component corresponding to the periods k_1 + 1,…,T.
`mse`	Mean squared error of the reconstruction using the first i components.
`k1`	Number of lags used to define the i-th dynamic principal component f.
`k2`	Number of lags of f used to reconstruct.
`alpha`	Vector of intercepts corresponding to f.
`a`	Vector that defines the i-th dynamic principal component.
`B`	Matrix of loadings corresponding to f. Row number k is the vector of k-1 lag loadings.
`call`	The matched call.
`conv`	Logical. Did the iterations converge?

components, fitted, plot and print methods are available for this class.
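Because the returned object is described as a list with one entry per component, the selected lags and errors can be collected with ordinary list operations. The sketch below uses a hand-built mock (plain lists with made-up values that mimic only some of the documented fields) so that it runs without fitting a model; the same extraction should apply to a real fit, assuming it behaves as the list described above.

```r
# Mock of the documented structure: one list per component, with made-up values.
# This is NOT output from cv.odpc; only the field names follow the documentation.
mock_fit <- list(
  list(mse = 1.8, k1 = 2, k2 = 2, conv = TRUE),
  list(mse = 1.2, k1 = 1, k2 = 1, conv = TRUE)
)

# One row per component: lags chosen, reconstruction MSE, convergence flag.
summary_table <- data.frame(
  component = seq_along(mock_fit),
  k1        = vapply(mock_fit, function(comp) comp$k1, numeric(1)),
  mse       = vapply(mock_fit, function(comp) comp$mse, numeric(1)),
  converged = vapply(mock_fit, function(comp) comp$conv, logical(1))
)
```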

References

Peña D., Smucler E. and Yohai V.J. (2017). “Forecasting Multiple Time Series with One-Sided Dynamic Principal Components.” Available at https://arxiv.org/abs/1708.04705.

Examples

T <- 50 #length of series
m <- 10 #number of series
set.seed(1234)
f <- rnorm(T + 1)
x <- matrix(0, T, m)
u <- matrix(rnorm(T * m), T, m)
for (i in 1:m) {
  x[, i] <- 10 * sin(2 * pi * (i/m)) * f[1:T] + 10 * cos(2 * pi * (i/m)) * f[2:(T + 1)] + u[, i]
}
# Choose parameters to perform a one step ahead forecast. Use 1 or 2 lags, only one component 
# and a window size of 2 (artificially small to keep computation time low). Use two cores for the
# loop over k, two cores for the loop over the window
fit <- cv.odpc(x, h=1, k_list = 1:2, max_num_comp = 1, window_size = 2, ncores_k = 2, ncores_w = 2)

esmucler/odpc documentation built on March 28, 2022, 5:39 a.m.