Calculate the trailing covariances of two streaming time series of returns using an online recursive formula.


run_covar(tseries, lambda)



A time series or a matrix with two columns of returns data.


A decay factor which multiplies past estimates.


The function run_covar() calculates the trailing covariances of two streaming time series of returns by recursively weighting the past covariance estimate {cov}_{t-1} with the product of the current returns minus their trailing means, using the decay factor \lambda:

\bar{x}_t = \lambda \bar{x}_{t-1} + (1-\lambda) x_t

\bar{y}_t = \lambda \bar{y}_{t-1} + (1-\lambda) y_t

\sigma^2_{x t} = \lambda \sigma^2_{x t-1} + (1-\lambda) (x_t - \bar{x}_t)^2

\sigma^2_{y t} = \lambda \sigma^2_{y t-1} + (1-\lambda) (y_t - \bar{y}_t)^2

{cov}_t = \lambda {cov}_{t-1} + (1-\lambda) (x_t - \bar{x}_t) (y_t - \bar{y}_t)

Where {cov}_t is the trailing covariance estimate at time t, \sigma^2_{x t}, \sigma^2_{y t}, \bar{x}_t and \bar{y}_t are the trailing variances and means of the returns, and x_t and y_t are the two streaming returns data.

The above online recursive formulas are convenient for processing live streaming data because they don't require maintaining a buffer of past data. The formulas are equivalent to a convolution with exponentially decaying weights, but they're much faster to calculate. Using exponentially decaying weights is more natural than using a sliding look-back interval, because it gradually "forgets" about the past data.
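
The equivalence with exponentially decaying weights can be checked numerically for the trailing mean. The following is a minimal base-R sketch, assuming the recursion is initialized with the first observation (as the R loop in the example for this function does). The simulated data and the variable names x, xbar and xconv are illustrative only.

```r
# Simulated returns (illustrative data, not from the package)
set.seed(1121)
x <- rnorm(50)
lambda <- 0.9
n <- length(x)

# Recursive trailing mean, initialized with the first observation (assumption)
xbar <- numeric(n)
xbar[1] <- x[1]
for (t in 2:n) {
  xbar[t] <- lambda * xbar[t - 1] + (1 - lambda) * x[t]
}

# The same quantity as an explicit weighted sum:
# weight (1-lambda)*lambda^k on x[t-k], plus lambda^(t-1) on the initial value
xconv <- numeric(n)
xconv[1] <- x[1]
for (t in 2:n) {
  k <- 0:(t - 2)
  xconv[t] <- sum((1 - lambda) * lambda^k * x[t - k]) + lambda^(t - 1) * x[1]
}

# Compare the two calculations
all.equal(xbar, xconv)
```

The explicit sum revisits the whole history at every step, while the recursion only needs the previous estimate, which is why the recursive form suits streaming data.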

The value of the decay factor \lambda must be in the range between 0 and 1. If \lambda is close to 1 then the decay is weak and past values have a greater weight, and the trailing covariance values have a stronger dependence on past data. This is equivalent to a long look-back interval. If \lambda is much less than 1 then the decay is strong and past values have a smaller weight, and the trailing covariance values have a weaker dependence on past data. This is equivalent to a short look-back interval.
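
To get a feel for how \lambda relates to a look-back interval, the weight placed on the observation k steps in the past is (1-\lambda) \lambda^k. The sketch below computes two rough summaries of that weight profile for a few illustrative values of \lambda: the half-life of the weights and their mean lag. These are general properties of exponential weights, not outputs of run_covar(), and the variable names are hypothetical.

```r
# Illustrative decay factors
lambdas <- c(0.5, 0.9, 0.99)

# Number of steps after which the weight (1-lambda)*lambda^k
# falls to half of its initial value
halflife <- log(0.5) / log(lambdas)

# Mean lag of the weights: sum over k of k*(1-lambda)*lambda^k
# A rough measure of the effective look-back interval
meanlag <- lambdas / (1 - lambdas)

data.frame(lambda = lambdas, halflife = halflife, meanlag = meanlag)
```

Both measures grow quickly as \lambda approaches 1, which matches the description of a long look-back interval above.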

The function run_covar() returns five columns of data: the trailing covariances, the variances, and the mean values of the two columns of the argument tseries. This allows calculating the trailing correlations, betas, and alphas.
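
As a rough illustration of how the five columns can be combined, the sketch below builds a stand-in for the output in base R and then derives the trailing correlations, betas, and alphas. It makes several assumptions that should be checked against the actual output of run_covar(): the column order (covariance, variance of x, variance of y, mean of x, mean of y), the start values in the first row, and the regression of y on x. The simulated data and variable names are illustrative only.

```r
# Simulated correlated returns (illustrative data only)
set.seed(1121)
nrows <- 200
x <- rnorm(nrows, sd = 0.01)
y <- 0.5 * x + rnorm(nrows, sd = 0.01)
lambda <- 0.9

# Base-R stand-in for the five-column output (assumed column order:
# covariance, variance of x, variance of y, mean of x, mean of y)
covars <- matrix(0, nrow = nrows, ncol = 5)
covars[1, ] <- c(x[1] * y[1], x[1]^2, y[1]^2, x[1], y[1])  # assumed start values
for (t in 2:nrows) {
  meanx <- lambda * covars[t - 1, 4] + (1 - lambda) * x[t]
  meany <- lambda * covars[t - 1, 5] + (1 - lambda) * y[t]
  dx <- x[t] - meanx
  dy <- y[t] - meany
  covars[t, ] <- c(
    lambda * covars[t - 1, 1] + (1 - lambda) * dx * dy,
    lambda * covars[t - 1, 2] + (1 - lambda) * dx^2,
    lambda * covars[t - 1, 3] + (1 - lambda) * dy^2,
    meanx, meany)
}

# Trailing correlation, beta of y on x, and alpha
correl <- covars[, 1] / sqrt(covars[, 2] * covars[, 3])
betas <- covars[, 1] / covars[, 2]
alphas <- covars[, 5] - betas * covars[, 4]
```

Here the beta is the trailing covariance divided by the trailing variance of x, and the alpha is the trailing mean of y minus the beta times the trailing mean of x.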


A matrix with five columns of data: the trailing covariances, the variances, and the mean values of the two columns of the argument tseries.


## Not run: 
# Calculate historical returns
retp <- zoo::coredata(na.omit(rutils::etfenv$returns[, c("IEF", "VTI")]))
# Calculate the trailing covariance
lambdaf <- 0.9 # Decay factor
covars <- HighFreq::run_covar(retp, lambda=lambdaf)
# Calculate the trailing correlation
correl <- covars[, 1]/sqrt(covars[, 2]*covars[, 3])
# Calculate the trailing covariance using R code
nrows <- NROW(retp)
retm <- matrix(numeric(2*nrows), ncol=2)
retm[1, ] <- retp[1, ]
retd <- matrix(numeric(2*nrows), ncol=2)
covarr <- numeric(nrows)
covarr[1] <- retp[1, 1]*retp[1, 2]
for (it in 2:nrows) {
  retm[it, ] <- lambdaf*retm[it-1, ] + (1-lambdaf)*(retp[it, ])
  retd[it, ] <- retp[it, ] - retm[it, ]
  covarr[it] <- lambdaf*covarr[it-1] + (1-lambdaf)*retd[it, 1]*retd[it, 2]
} # end for
all.equal(covars[, 1], covarr, check.attributes=FALSE)

## End(Not run)

