\newcommand{\bm}[1]{\boldsymbol{#1}} \newcommand{\tx}[1]{\textrm{#1}} \newcommand{\iid}{\overset{\;\tx{iid}\;}{\sim}} \newcommand{\ind}{\overset{\;\tx{ind}\;}{\sim}} \newcommand{\var}{\operatorname{var}} \newcommand{\cov}{\operatorname{cov}} \newcommand{\cor}{\operatorname{cor}} \newcommand{\logit}{\operatorname{logit}} \newcommand{\ilogit}{\operatorname{ilogit}} \newcommand{\N}{\mathcal{N}} \newcommand{\elL}{\mathcal{L}} \newcommand{\elap}{\ell_{\text{Lap}}} \newcommand{\ud}{\mathop{}!\mathrm{d}} \newcommand{\tth}{{\bm{\theta}}} \newcommand{\VV}{{\bm{V}}} \newcommand{\XX}{{\bm{X}}}

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

The Common-Factor Multivariate Stochastic Volatility Model

Let $X_{it}$ denote the log price of asset $i = 1,\ldots N_a$ at time $t$ and $V_{it}$ denote the corresponding volatility on the standard deviation scale. The common-factor multivariate stochastic volatility (mSV) model proposed by @fang.etal20 is marginally written as \begin{equation} \begin{aligned} \ud \log V_{it} & = -\gamma_i (\log V_{it} - \mu_i) \ud t + \sigma_i \ud B_{it}^V \ \ud X_{it} & = (\alpha_i - \tfrac 1 2 V_{it}^2) \ud t + V_{it} \left(\rho_i \ud B_{it}^V + \sqrt{1 - \rho_i^2} \ud B_{it}^Z\right). \end{aligned} (#eq:svuni) \end{equation} The correlation between assets and volatilities comes from factor models for both the assets and the volatilities, namely \begin{equation} \begin{aligned} \ud B_{it}^V & = \tau_i \ud B_{\star t}^V + \sqrt{1 - \tau_i^2} \ud B_{it}^\epsilon \ \ud B_{it}^Z & = \omega_i \ud B_{\star t}^Z + \sqrt{1 - \omega_i^2} \ud B_{it}^\eta. \end{aligned} (#eq:cor) \end{equation} In @fang.etal20 it is suggested to estimate the volatility common innovation factor $B_{\star t}^{V}$ via an observable proxy, for example the VIX. That is, if $V_{\star t}$ is an observable proxy for the common volatility factor, then we use it to estimate $B_{\star t}^V$ via \begin{equation} \ud \log V_{\star t} = -\gamma_{\star} (\log V_{\star t} - \mu_{\star}) \ud t + \sigma_{\star} \ud B_{\star t}^V. (#eq:volproxy) \end{equation} Here we propose to do something similar for the asset common innovation factor $B_{\star t}^Z$. That is, let $(X_{0t}, V_{0t})$ be a univariate stochastic volatility model for an observable proxy for the common asset factor (for example, SPX). Then we give it a similar marginal model to the above: \begin{equation} \begin{aligned} \ud \log V_{0t} & = -\gamma_0 (\log V_{0t} - \mu_0) \ud t + \sigma_0 \ud B_{0t}^V \ \ud X_{0t} & = (\alpha_0 - \tfrac 1 2 V_{0t}^2) \ud t + V_{0t} \left(\rho_0 \ud B_{0 t}^V + \sqrt{1 - \rho_0^2} \ud B_{0t}^Z \right), \end{aligned} (#eq:assetproxy) \end{equation} where $\ud B_{0t}^V = \tau_0 \ud B_{\star t}^V + \sqrt{1 - \tau_0^2} \ud B_{0t}^\epsilon$. The asset common factor is estimated from the proxy by equating $B_{\star t}^Z = B_{0t}^Z$.

Parameter Estimation

# required packages
library(svcommon)
library(mvtnorm) # multivariate normal draws
library(tidyr); library(dplyr) # data frame manipulation
library(ggplot2) # plotting

dim(snp500) # automatically loaded with svcommon
snp500[1:6, ncol(snp500)-8 + 1:8]

The common-factor stochastic volatility model defined by \@ref(eq:svuni) through \@ref(eq:assetproxy) has likelihood function $$ \elL(\tth \mid \XX, \VV_0) \propto \int p(\XX, \VV, \VV_0 \mid \tth) \ud \VV. $$ While the integral over the latent volatilities $\VV$ is typically intractable, svcommon uses Laplace's method to approximate the integral by solving a tractable optimization problem. This is implemented using the R package TMB, which efficiently computes the approximate marginal loglikelihood \begin{equation} \elap(\tth \mid \XX, \VV_0) \approx \log \elL(\tth \mid \XX, \VV_0) (#eq:elap) \end{equation} and its gradient $\frac{\partial}{\partial \tth} \elap(\tth \mid \XX, \VV_0)$ using automatic differentiation. svcommon uses a block coordinate descent algorithm with good initial values to very quickly converge to the approximate MLE $\hat \tth = \operatorname{arg\,max}_{\tth} \elap(\XX, \VV_0 \mid \tth)$, about two orders of magnitude faster than exact inference methods. The procedure is illustrated with a dataset consisting of r nrow(snp500) daily closing prices of r ncol(snp500)-2 constituents of the S&P500, the index value itself (GSPC), and its volatility index (VIX).


Initialization

Initial guesses for the parameters of the common-factor stochastic volatility (SVC) model are obtained from marginal fits of the exponential Ornstein-Uhlenbeck (EOU) model to each asset and to GSPC. To perform the optimization, the R built-in stats package provides the optimiziers stats::optim(), stats::nlm(), and stats::nlminb(). Various experiments with SVC modeling indicate that statss:nlminb() is by far the fastest, with little difference in numerical accuracy. Indeed, the marginal fits are extremely fast and stable, even with the (log or logit) parameters arbitrarily initialized to zero.

# problem dimensions
nobs <- 1000 # number of days
nasset <- 5 # number of assets, excluding GSPC

# format the data
dt <- 1/252
Xt <- as.matrix(cbind(GSPC = snp500[1:nobs, "GSPC"],
                      snp500[1:nobs, 1+1:nasset]))
Xt <- log(Xt)
log_VPt <- log(as.numeric(snp500[1:nobs, "VIX"]))

# initialize latent variables with rolling window standard deviations
log_Vt <- log(apply(Xt, 2, sv_init, dt = dt, block_size = 10))
# initialize parameters with default values
alpha <- rep(0, nasset+1)
log_gamma <- rep(0, nasset+2)
mu <- rep(0, nasset+2)
log_sigma <- rep(0, nasset+2)
logit_rho <- rep(0, nasset+1)
logit_tau <- rep(0, nasset+1)
logit_omega <- rep(0, nasset)

# parameter and latent variable list for convenient updating
curr_par <- list(log_Vt = log_Vt,
                 alpha = alpha,
                 log_gamma = log_gamma,
                 mu = mu,
                 log_sigma = log_sigma,
                 logit_rho = logit_rho,
                 logit_tau = logit_tau,
                 logit_omega = logit_omega)

# optimization control parameters
opt_control <- list(eval.max = 1000, iter.max = 1000, trace = 10)
<<init_setup>>
# initialize parameters of each individual asset
tm_tot <- system.time({
  for(iasset in 0:nasset) {
    message("asset = ", iasset)
    # construct model object
    eou_ad <- eou_MakeADFun(Xt = Xt[,iasset+1], dt = dt,
                            alpha = curr_par$alpha[iasset+1],
                            log_Vt = curr_par$log_Vt[,iasset+1],
                            log_gamma = curr_par$log_gamma[iasset+2],
                            mu = curr_par$mu[iasset+2],
                            log_sigma = curr_par$log_sigma[iasset+2],
                            logit_rho = curr_par$logit_rho[iasset+1])
    # optimize with quasi-newton method
    tm <- system.time({
      opt <- nlminb(start = eou_ad$par,
                    objective = eou_ad$fn,
                    gradient = eou_ad$gr,
                    control = opt_control)
    })
    message("Time: ", round(tm[3], 2), " seconds")
    # update parameters
    curr_par <- svc_update(eou_ad, old_par = curr_par, iasset = iasset)
  }
})
message("Total Time: ", round(tm_tot[3], 2), " seconds.")
curr_par <- readRDS("init_opt.RDS")
saveRDS(curr_par, file = "init_opt.RDS")

Blockwise Coordinate Descent

The following algorithm updates the parameters for each asset conditioned on the values of all the others. Note that the Laplace approximation in this case depends only on the latent volatilities of the given asset and that of the asset common factor. Thus, svcommon removes the other volatilities from the gradient calculations, which considerably speeds up the calculations.

Here, nepoch denotes the number of cycles through all assets. For this dataset it appears that after nepoch = 3 the parameter values are changing very little.

nepoch <- 3

tm_tot <- system.time({
  for(iepoch in 1:nepoch) {
    message("epoch = ", iepoch)
    for(iasset in -1:nasset) {
      message("asset = ", iasset)
      svc_ad <- svc_MakeADFun(Xt = Xt, log_VPt = log_VPt, dt = dt,
                              par_list = curr_par,
                              iasset = iasset)
      tm <- system.time({
        opt <- nlminb(start = svc_ad$par,
                      objective = svc_ad$fn,
                      gradient = svc_ad$gr,
                      control = opt_control)
      })
      message("Time: ", round(tm[3], 2), " seconds")
      curr_par <- svc_update(svc_ad, old_par = curr_par, iasset = iasset)
    }
  }
})
message("Total Time: ", round(tm_tot[3], 2), " seconds.")
curr_par <- readRDS("block_opt.RDS")
saveRDS(curr_par, file = "block_opt.RDS")

For good measure, let's finish with a joint optimization over all parameters. This should be comparatively fast now that the optimizer has good starting values^[For this particular dataset, it is possible to skip the initialization and blockwise coordinate descent steps, provided the iter.max parameter to stats::nlminb() is large enough.].

# joint parameter optimization
iasset <- "all"
svc_ad <- svc_MakeADFun(Xt = Xt, log_VPt = log_VPt, dt = dt,
                        par_list = curr_par,
                        iasset = iasset)
tm <- system.time({
  opt <- nlminb(start = svc_ad$par,
                objective = svc_ad$fn,
                gradient = svc_ad$gr,
                control = opt_control)
})
message("Time: ", round(tm[3], 2), " seconds")
curr_par <- svc_update(svc_ad, old_par = curr_par, iasset = iasset)
curr_par <- readRDS("joint_opt.RDS")
saveRDS(curr_par, file = "joint_opt.RDS")

Bayesian Estimation and Forecasting

Let's take a look at the parameter estimates.

system.time({
  svc_est <- TMB::sdreport(svc_ad)
})
svc_est <- readRDS("summary.RDS")
saveRDS(svc_est, file = "summary.RDS")
knitr::kable(summary(svc_est, select = "report"), digits = 2)

All parameter estimates have reasonable precision, except notably $\alpha_i$, the long-term trend in asset $i$, which is notoriously difficult to estimate. Also included are stimates of $\log V_{iT}$, the latent volatility on the last day $t = T$ in the observed data. This can be useful for Bayesian forecasting, which can be conducted using the following method:

  1. Let $p(\tth, \log V_{0T}, \ldots, \log V_{qT} \mid \XX, \VV_0)$ denote the posterior distribution of the parameters and terminal volatilities for $q$ assets and the common asset factor with the Lebesgue prior^[Since the prior is flat on the original scale but optimization is done on the transformed scale ($\log \gamma_i$, $\logit \rho_i$, etc.) the optimizer provides the correct Bayesian mode but a penalized maximum likelihood. SVC experiments indicate that penalization considerably improves the estimation of correlation parameters, which otherwise tend to drift off to boundary values.] $\pi(\tth) \propto 1$. By the mode-quadrature approximation, this posterior is taken to be normal with mean and variance obtained from TMB::sdreport().

  2. Let $p(X_{0,T+d}, \ldots, X_{q,T+d}, \log V_{0,T+d}, \ldots, \log V_{q,T+d} \mid \XX, \VV_0)$ denote the Bayesian forecast distribution for assets and volatilities on day $t = T+d$. We can simulate from this distribution by combining draws from the multivariate normal approximation to $p(\tth, \log V_{0T}, \ldots, \log V_{qT} \mid \XX, \VV_0)$ with forward simulation from the SVC model with svc_sim().

An illustration of this procedure produces the posterior forecast distribution in Figure \@ref(fig:forecast).

```r \mid \XX, \VV_0)$ and $p(\XX_{T+d} \mid \XX, \VV_0)$."} nfwd <- 10 # number of days to forecast nsim <- 1e4 # number of simulated forecasts

sample from the mode-quadrature approximation

curr_post <- mvtnorm::rmvnorm(nsim, mean = svc_est$value, sigma = svc_est$cov)

convert to appropriate inputs to svc_sim

sim_args <- sapply(c("alpha", "log_gamma", "mu", "log_sigma", "logit_rho", "logit_tau", "logit_omega", "log_VT"), function(nm) { t(curr_post[,colnames(curr_post) == nm]) }, simplify = FALSE) names(sim_args)[8] <- "log_V0" # rename log_VT

remaining arguments

sim_args <- c(sim_args, list(X0 = Xt[nobs,], log_VP0 = log_VPt[nobs], nobs = nfwd, dt = dt))

forward simulation

fwd_post <- do.call(svc_sim, args = sim_args)

plot posteriors on day t = nobs + nfwd

format data

X_fwd <- t(fwd_post$Xt[nfwd,,]) colnames(X_fwd) <- colnames(Xt) X_fwd <- data.frame(type = "X", X_fwd) V_fwd <- exp(t(fwd_post$log_Vt[nfwd,,])) colnames(V_fwd) <- colnames(log_Vt) V_fwd <- data.frame(type = "V", V_fwd) XV_fwd <- rbind(X_fwd, V_fwd) %>% as_tibble()

plot

XV_fwd %>% pivot_longer(GSPC:USB, names_to = "Asset", values_to = "Value") %>% mutate(Asset = factor(Asset, levels = colnames(Xt))) %>% ggplot(aes(x = Value, group = Asset)) + geom_density(aes(color = Asset)) + facet_wrap(~ type, scales = "free", ncol = 2) + xlab("") + ylab("Posterior Forecast Distribution") ```

References {-}



mlysy/svcommon documentation built on Sept. 15, 2024, 1:15 a.m.