WAR: Fit Wasserstein autoregressive models to distributional time...
In yqgchen/WR: Wasserstein regression for distributional data

WAR	R Documentation

Fit Wasserstein autoregressive models to distributional time series

Description

Fit Wasserstein autoregressive models to distributional time series

Usage

WAR(
  Y,
  qSup,
  optns = list(),
  FPCAoptns = list(),
  aic = FALSE,
  order.max = 1,
  method = "yule-walker",
  ...
)

Arguments

`Y`	A `q`-by-`n` matrix holding the quantile functions of `n` distributions evaluated on a common grid on `[0,1]` consisting of `q` points.
`qSup`	A vector of length `q` holding the support grid of quantile functions.
`optns`	A list of control parameters specified by `list(name=value)`. See 'Details'.
`FPCAoptns`	A list of optional control parameters for functional principal component analysis (FPCA) of log mapped distributional predictors and responses, respectively. See 'Details' of `FPCA`. For example, one can specify the methods of choosing numbers of functional principal components (FPCs) through `methodSelectK`; set up Fraction-of-Variance-Explained (FVE) thresholds through `FVEthreshold` when using FVE to choose numbers of FPCs. Default of `methodSelectK`: `'FVE'`; Default of `FVEthreshold`: 0.95 (different from `FPCA`). Default of `useBinnedData`: `'OFF'` (different from `FPCA`). In addition, if `optns$CVfold` is specified (i.e., not `NULL`), `methodSelectK = 'FVE'` and `FVEthreshold = 0.9999`.
`aic`, `order.max`, `method`, `...`	arguments for `ar`. Default: `aic = FALSE`, `order.max = 1`, `method = "yule-walker"`. Note that `demean` should not be input and is set to be `FALSE`.

Details

Available control options are

methodProj: The projection method if the predictions of log mapped distributional responses lie out of the log image space: 'log' (default, the method as per Chen, Lin, and Müller, 2021); 'qt' (the method as per Petersen & Müller, 2019).
CVfold: Either the number of folds if using cross validation to choose the number of FPCs, which can be any positive integer up to n, with suggested values: n (leave-one-out) if n\le 30 and 10 (10-fold) if n > 30; Or NULL if using other methods to choose the number of FPCs.
lower: Lower bound of the support of distributions, only relevant if methodProj is 'qt'. Default: NULL, i.e., no finite lower bound.
upper: Upper bound of the support of distributions, only relevant if methodProj is 'qt'. Default: NULL, i.e., no finite upper bound.

Value

A list of class 'WAR' with the following elements:

`Yfit`	A `q`-by-`n` matrix holding the fitted quantile functions of the `n` distributions.
`qSup`	The support of quantile functions, same as the input.
`beta`	A matrix holding the fitted coefficient function, where the (`j`,`k`)-th entry holds the value evaluated at (`workGrid[j]`,`workGrid[k]`).
`workGrid`	A vector holding a working grid for `beta`.
`arFPCs`	An ar object returned by `ar`, holding the fitted autoregressive model to the FPCs.
`order`	The order of the fitted Wasserstein autoregressive model.
`fpcaLogY`	An FPCA object holding the FPCA output for log mapped distributions.
`Y`	The input `Y`.
`outOfLogSpace`	A logical vector holding whether initial fitted log mapped distributions lie out of the log image space (`TRUE`) or not (`FALSE`).
`optns`	A list of options actually used.
`FPCAoptns`	A list of FPCA options actually used.

References

Chen, Y., Lin, Z., & Müller, H.-G. (2021). "Wasserstein regression." Journal of the American Statistical Association, in press. Petersen, A., & Müller, H.-G. (2019). "Fréchet regression for random objects with Euclidean predictors." The Annals of Statistics, 47(2), 691–719.

Examples


# X_{t} = N( mu_{t}, sigma_{t}^2 )
# mu_{t+1} = Emu + b11 * ( mu_{t} - Emu ) + b12 * ( sigma_{t} - Esigma ) + err_{mu,t+1}
# sigma_{t+1} = Esigma + b21 * ( mu_{t} - Emu ) + b22 * ( sigma_{t} - Esigma ) + err_{sigma,t+1}
bMat <- matrix( c( 1, 1, 1, -1 ) * 0.4, ncol = 2, byrow = TRUE )
# mu_{1} ~ Beta(2,2), sigma_{1} ~ Uniform(0.5,1.5) + Esigma - 1
# err ~ Uniform(-1,1) * M_err; M_err = 0.05
# | mu_{t+1} - Emu | < | mu_{t} - Emu |
# | sigma_{t+1} - Esigma | < | sigma_{t} - Esigma |
set.seed(1)
Emu <- 2
Esigma <- 2
M_err <- 0.05
mu_c <- rbeta( 1, 2, 2 ) - Emu
sigma_c <- runif( 1, 0.5, 1.5 ) - 1
dat_c <- matrix( c(mu_c,sigma_c), ncol = 1 )
n <- 100
for ( i in 2:n ) {
  dat_c <- cbind( dat_c, bMat %*% dat_c[,i-1] + runif(2,-1,1) * M_err )
}
mu_c <- dat_c[1,]
mu <- mu_c + Emu
sigma_c <- dat_c[2,]
sigma <- sigma_c + Esigma

nqSup <- 1000
qSup <- seq( 0, 1, length.out = nqSup+1 )
qSup <- ( qSup[-1] + qSup[-(nqSup+1)] ) / 2

Y <- sapply( seq_len(n), function (i) {
  qnorm( qSup, mean = mu[i], sd = sigma[i] )
})
res <- WAR( Y = Y, qSup = qSup )

yqgchen/WR documentation built on June 10, 2025, 6:04 p.m.