run_var: Calculate the trailing mean and variance of streaming time...

View source: R/RcppExports.R

run_var R Documentation

Calculate the trailing mean and variance of streaming time series of data using an online recursive formula.




run_var(tseries, lambda)



tseries: A time series or a matrix of data.


lambda: A decay factor which multiplies past estimates.


The function run_var() calculates the trailing means and variances of a streaming time series of data r_t. It recursively updates the past mean estimate \bar{r}_{t-1} with the new data point r_t, and the past variance estimate \sigma^2_{t-1} with the squared deviation of the data from its trailing mean (r_t - \bar{r}_t)^2, using the decay factor \lambda:

\bar{r}_t = \lambda \bar{r}_{t-1} + (1-\lambda) r_t

\sigma^2_t = \lambda \sigma^2_{t-1} + (1-\lambda) (r_t - \bar{r}_t)^2

Where r_t are the streaming data, \bar{r}_t are the trailing means, and \sigma^2_t are the trailing variance estimates.
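
The recursion above can be sketched in plain R for a single column of data. This is only an illustration, not the package's compiled code: the helper name run_var_sketch() is hypothetical, and the starting values (the mean started at the first data point, the variance started at zero) are assumptions that may differ from what run_var() does.

```r
# Pure-R sketch of the recursive formulas (hypothetical helper, not part of HighFreq).
# Assumption: the mean starts at the first data point and the variance starts at zero.
run_var_sketch <- function(tseries, lambda) {
  stopifnot(lambda > 0, lambda < 1)
  tseries <- as.matrix(tseries)
  stopifnot(ncol(tseries) == 1)  # The sketch handles a single column only
  nrows <- NROW(tseries)
  means <- numeric(nrows)
  vars <- numeric(nrows)
  means[1] <- tseries[1]
  vars[1] <- 0
  for (it in seq_len(nrows)[-1]) {
    # Update the trailing mean first, then use it in the variance update
    means[it] <- lambda*means[it-1] + (1-lambda)*tseries[it]
    vars[it] <- lambda*vars[it-1] + (1-lambda)*(tseries[it] - means[it])^2
  }
  cbind(mean=means, variance=vars)
}

# Apply the sketch to simulated data
set.seed(1121)
retp <- rnorm(100)
outv <- run_var_sketch(retp, lambda=0.9)
dim(outv)
```

Each update uses only the previous mean, the previous variance, and the new data point, so no buffer of past data is kept.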

The above online recursive formulas are convenient for processing live streaming data because they don't require maintaining a buffer of past data. The formulas are equivalent to a convolution with exponentially decaying weights, but they're much faster to calculate. Using exponentially decaying weights is more natural than using a sliding look-back interval, because it gradually "forgets" about the past data.
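
The equivalence with exponentially decaying weights can be checked numerically for the trailing mean. The sketch below assumes that the recursion is started at the first data point, so the explicit weighted sum carries an extra term \lambda^{t-1} r_1 for the starting value. The variable names are illustrative only.

```r
set.seed(1121)
retp <- rnorm(50)
lambda <- 0.9
nrows <- NROW(retp)

# Recursive trailing mean, started at the first data point (an assumption)
meanr <- numeric(nrows)
meanr[1] <- retp[1]
for (it in 2:nrows)
  meanr[it] <- lambda*meanr[it-1] + (1-lambda)*retp[it]

# The same mean as an explicit weighted sum over the whole history,
# with weights (1-lambda)*lambda^k at lag k, plus the starting-value term
meanc <- sapply(1:nrows, function(it) {
  lagv <- seq_len(it-1) - 1  # Lags 0, 1, ..., it-2
  sum((1-lambda)*lambda^lagv*retp[it - lagv]) + lambda^(it-1)*retp[1]
})

all.equal(meanr, meanc)
```

The two vectors should agree up to floating-point error. The explicit sum revisits the whole history at every step, while the recursion needs a single update per data point.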

The value of the decay factor \lambda must be in the range between 0 and 1. If \lambda is close to 1 then the decay is weak and past values have a greater weight, and the trailing variance values have a stronger dependence on past data. This is equivalent to a long look-back interval. If \lambda is much less than 1 then the decay is strong and past values have a smaller weight, and the trailing variance values have a weaker dependence on past data. This is equivalent to a short look-back interval.
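
One rough way to relate \lambda to a look-back interval is the half-life of the weights, that is the lag k at which \lambda^k falls to one half. This measure is not part of the package and the helper half_life() is hypothetical. It is only meant to give a feel for the scale:

```r
# Lag at which the weight lambda^k falls to one half (illustrative helper)
half_life <- function(lambda) log(0.5)/log(lambda)

lambdav <- c(0.5, 0.9, 0.99)
round(half_life(lambdav), 1)
# Returns: 1.0 6.6 69.0
```

So \lambda = 0.99 gives past data roughly ten times the reach of \lambda = 0.9.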

The function run_var() performs the same calculation as the standard R function stats::filter(x=series, filter=weightv, method="recursive"), but it's several times faster.
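
The correspondence with stats::filter() can be sketched as follows. The recursive method calculates y[t] = x[t] + filter*y[t-1], so this sketch scales the input by (1-\lambda) and passes \lambda as the filter coefficient. That mapping, and the starting values (the mean started at the first data point, the variance started at zero), are assumptions made for illustration, not details taken from the package.

```r
set.seed(1121)
retp <- rnorm(50)
lambda <- 0.9
nrows <- NROW(retp)

# Explicit loop, with the mean started at the first data point
# and the variance started at zero (both are assumptions)
meanr <- numeric(nrows)
varr <- numeric(nrows)
meanr[1] <- retp[1]
for (it in 2:nrows) {
  meanr[it] <- lambda*meanr[it-1] + (1-lambda)*retp[it]
  varr[it] <- lambda*varr[it-1] + (1-lambda)*(retp[it] - meanr[it])^2
}

# stats::filter() with method="recursive" computes y[t] = x[t] + lambda*y[t-1],
# so the input is scaled by (1-lambda) and init supplies the value before the first point
meanf <- stats::filter(x=(1-lambda)*retp, filter=lambda, method="recursive", init=retp[1])
meanf <- as.numeric(meanf)
# The variance applies the same filter to the squared deviations from the trailing mean
varf <- stats::filter(x=(1-lambda)*(retp - meanf)^2, filter=lambda, method="recursive")
varf <- as.numeric(varf)

all.equal(meanr, meanf)
all.equal(varr, varf)
```

Under these assumptions the loop and the two filter calls should agree up to floating-point error.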

The function run_var() returns a matrix with two columns and the same number of rows as the input argument tseries. The first column contains the trailing means and the second column contains the trailing variances.



algoquant/HighFreq documentation built on Oct. 26, 2024, 9:20 p.m.