run_var: Calculate the trailing variance of streaming time series of...


run_var    R Documentation

Calculate the trailing variance of streaming time series of data using an online recursive formula.

Description

Calculate the trailing variance of streaming time series of data using an online recursive formula.

Usage

run_var(tseries, lambda)

Arguments

tseries

A time series or a matrix of data.

lambda

A decay factor which multiplies past estimates.

Details

The function run_var() calculates the trailing variance of a streaming time series of data r_t by recursively combining the past variance estimate \sigma^2_{t-1} with the squared difference between the current data point and its trailing mean, (r_t - \bar{r}_t)^2, using the decay factor \lambda:

\bar{r}_t = \lambda \bar{r}_{t-1} + (1-\lambda) r_t

\sigma^2_t = \lambda \sigma^2_{t-1} + (1-\lambda) (r_t - \bar{r}_t)^2

Where r_t are the streaming data, \bar{r}_t are the trailing means, and \sigma^2_t are the trailing variance estimates.
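
The two recursions above can be sketched in plain R, as a rough illustration rather than the package's actual C++ code. The helper name run_var_sketch is hypothetical and is not part of HighFreq. The starting values are assumptions, since the formulas don't specify them: the trailing mean starts at the first observation and the variance starts at zero.

```r
# Hypothetical helper (not part of HighFreq): plain-R sketch of the recursion
run_var_sketch <- function(tseries, lambda) {
  stopifnot(lambda >= 0, lambda <= 1)
  tseries <- as.matrix(tseries)
  nrows <- nrow(tseries)
  # Assumption: the trailing mean starts at the first observation
  means <- tseries
  # Assumption: the trailing variance starts at zero
  vars <- matrix(0, nrow=nrows, ncol=ncol(tseries))
  if (nrows > 1) {
    for (it in 2:nrows) {
      # Update the trailing mean, then the trailing variance, column by column
      means[it, ] <- lambda*means[it-1, ] + (1-lambda)*tseries[it, ]
      vars[it, ] <- lambda*vars[it-1, ] + (1-lambda)*(tseries[it, ] - means[it, ])^2
    }
  }
  vars
}

# Simulated data with two columns
set.seed(1)
datav <- matrix(rnorm(200), ncol=2)
varsk <- run_var_sketch(datav, lambda=0.9)
# The output has the same dimensions as the input
dim(varsk)
```

Each column is processed independently, so the sketch accepts either a single time series or a matrix of data.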

The above online recursive formulas are convenient for processing live streaming data because they don't require maintaining a buffer of past data. The formulas are equivalent to a convolution with exponentially decaying weights, but they're much faster to calculate because each update uses only the previous estimate and the new data point. Using exponentially decaying weights is more natural than using a sliding look-back interval, because it gradually "forgets" the past data instead of dropping it abruptly at the edge of the interval.
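
The equivalence with exponentially decaying weights can be checked numerically for the trailing mean. The sketch below uses simulated data and assumes that the recursion starts at the first observation (an assumption, not something stated above). Under that assumption the recursion unrolls to \bar{r}_t = \lambda^{t-1} r_1 + (1-\lambda) \sum_{j=0}^{t-2} \lambda^j r_{t-j}.

```r
set.seed(42)
x <- rnorm(50)
lambda <- 0.9
n <- length(x)

# Recursive form: only the previous estimate is needed
y_rec <- numeric(n)
y_rec[1] <- x[1]  # assumption: start at the first observation
for (t in 2:n) {
  y_rec[t] <- lambda*y_rec[t-1] + (1-lambda)*x[t]
}

# Explicit form: weighted sum over all past data with weights lambda^j
y_conv <- sapply(1:n, function(t) {
  if (t == 1) return(x[1])
  j <- 0:(t-2)
  lambda^(t-1)*x[1] + (1-lambda)*sum(lambda^j * x[t-j])
})

# Both forms agree up to numerical precision
all.equal(y_rec, y_conv)
```

The explicit form re-sums the whole history at every step, while the recursive form does a fixed amount of work per new data point.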

The value of the decay factor \lambda must be between 0 and 1. If \lambda is close to 1, the decay is weak: past values carry more weight and the trailing variance depends more strongly on past data, which is equivalent to a long look-back interval. If \lambda is much less than 1, the decay is strong: past values carry less weight and the trailing variance depends less on past data, which is equivalent to a short look-back interval.
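
One rough way to translate \lambda into a look-back length is through the weights \lambda^j themselves. The sketch below computes two common rules of thumb that are not taken from the HighFreq documentation: the half-life (the lag at which the weight \lambda^j falls to one half) and the mean lag of the normalized weights (1-\lambda) \lambda^j, which equals \lambda/(1-\lambda). The helper names half_life and mean_lag are hypothetical.

```r
# Hypothetical helpers (not part of HighFreq)
# Lag at which the weight lambda^j drops to 0.5
half_life <- function(lambda) log(0.5)/log(lambda)
# Mean lag of the normalized weights (1-lambda)*lambda^j
mean_lag <- function(lambda) lambda/(1-lambda)

lambdas <- c(0.5, 0.9, 0.99)
data.frame(lambda=lambdas,
  half_life=half_life(lambdas),
  mean_lag=mean_lag(lambdas))
```

Both measures grow quickly as \lambda approaches 1, which matches the description of a longer effective look-back interval.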

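The function run_var() performs the same calculation as the standard R function
stats::filter(x=series, filter=weightv, method="recursive"), where series stands for the squared differences of the data from its trailing means and weightv for the decay factor \lambda (as in the example below), but it's several times faster.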
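
The correspondence with the recursive filter can be illustrated with base R alone, without calling HighFreq. The sketch below uses simulated data and assumes the same starting values for both calculations (trailing mean equal to the first observation, variance equal to zero). These starting values are assumptions made for the comparison, not a statement about how run_var() initializes.

```r
set.seed(7)
x <- rnorm(100)
lambda <- 0.9

# Trailing mean with the recursive filter: y[t] = (1-lambda)*x[t] + lambda*y[t-1]
# init=x[1] makes the first filtered value equal to x[1]
meanf <- as.numeric(stats::filter((1-lambda)*x, filter=lambda,
  method="recursive", init=x[1]))
# Trailing variance with the recursive filter applied to the squared deviations
varf <- as.numeric(stats::filter((1-lambda)*(x - meanf)^2, filter=lambda,
  method="recursive"))

# The same recursions written as an explicit loop
meanl <- x
varl <- numeric(length(x))
for (it in 2:length(x)) {
  meanl[it] <- lambda*meanl[it-1] + (1-lambda)*x[it]
  varl[it] <- lambda*varl[it-1] + (1-lambda)*(x[it] - meanl[it])^2
}

# The filter and the loop give the same values
all.equal(varf, varl)
```

stats::filter() returns a time series object, so as.numeric() is used to strip the attributes before comparing.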

The function run_var() returns a matrix with the same dimensions as the input argument tseries.

Value

A matrix with the same dimensions as the input argument tseries.

Examples

## Not run: 
# Calculate historical returns
retp <- zoo::coredata(na.omit(rutils::etfenv$returns$VTI))
# Calculate the trailing variance
lambdaf <- 0.9
vars <- HighFreq::run_var(retp, lambda=lambdaf)
# Calculate centered returns
retc <- (retp - HighFreq::run_mean(retp, lambda=lambdaf))
# Calculate the trailing variance using R code
retc2 <- (1-lambdaf)*stats::filter(retc^2, filter=lambdaf, 
  init=as.numeric(retc[1, 1])^2/(1-lambdaf), 
  method="recursive")
all.equal(vars, unclass(retc2), check.attributes=FALSE)
# Compare the speed of RcppArmadillo with R code
library(microbenchmark)
summary(microbenchmark(
  Rcpp=HighFreq::run_var(retp, lambda=lambdaf),
  Rcode=stats::filter(retc^2, filter=lambdaf, init=as.numeric(retc[1, 1])^2/(1-lambdaf), method="recursive"),
  times=10))[, c(1, 4, 5)]  # end microbenchmark summary

## End(Not run)


algoquant/HighFreq documentation built on Feb. 9, 2024, 8:15 p.m.