run_var | R Documentation
Calculate the trailing variance of a streaming time series of data using an online recursive formula.
run_var(tseries, lambda)
tseries | A time series or a matrix of data.
lambda | A decay factor which multiplies past estimates.
The function run_var() calculates the trailing variance of streaming time series of data r_t by recursively weighting the past variance estimates \sigma^2_{t-1} with the squared differences of the data minus its trailing means, (r_t - \bar{r}_t)^2, using the decay factor \lambda:
\bar{r}_t = \lambda \bar{r}_{t-1} + (1-\lambda) r_t
\sigma^2_t = \lambda \sigma^2_{t-1} + (1-\lambda) (r_t - \bar{r}_t)^2
Where r_t are the streaming data, \bar{r}_t are the trailing means, and \sigma^2_t are the trailing variance estimates.
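The two recursive formulas above can be sketched in plain R as a simple loop. This is only an illustration, not the package's compiled implementation: the helper name run_var_sketch is hypothetical, and the initialization (the mean starts at the first observation and the variance starts at zero) is an assumption that may differ from what run_var() does.

```r
# Hypothetical helper: a plain-R sketch of the recursive formulas above.
# Assumption: the mean starts at the first observation, the variance at zero.
run_var_sketch <- function(x, lambda) {
  x <- as.numeric(x)
  nrows <- length(x)
  meanv <- numeric(nrows)
  varv <- numeric(nrows)
  meanv[1] <- x[1]
  varv[1] <- 0
  # Loop over the remaining observations (empty if there is only one)
  for (it in seq_len(nrows)[-1]) {
    # Update the trailing mean, then the trailing variance around it
    meanv[it] <- lambda*meanv[it-1] + (1-lambda)*x[it]
    varv[it] <- lambda*varv[it-1] + (1-lambda)*(x[it] - meanv[it])^2
  }
  cbind(mean=meanv, var=varv)
}

# Usage on simulated data
set.seed(42)
datav <- rnorm(100)
outv <- run_var_sketch(datav, lambda=0.9)
tail(outv, 3)
```

Each update uses only the previous mean, the previous variance, and the new observation, so no buffer of past data is needed.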
The above online recursive formulas are convenient for processing live streaming data because they don't require maintaining a buffer of past data. The formulas are equivalent to a convolution with exponentially decaying weights, but they're much faster to calculate. Using exponentially decaying weights is more natural than using a sliding look-back interval, because it gradually "forgets" about the past data.
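The equivalence between the recursion and exponentially decaying weights can be checked numerically. The sketch below does this for the trailing mean at the last data point, assuming the recursion is initialized at the first observation (an assumption made for this illustration). The variable names are illustrative only.

```r
set.seed(42)
datav <- rnorm(50)
lambdaf <- 0.9
nrows <- length(datav)

# Recursive trailing mean; assumption: initialized at the first observation
meanr <- numeric(nrows)
meanr[1] <- datav[1]
for (it in 2:nrows)
  meanr[it] <- lambdaf*meanr[it-1] + (1-lambdaf)*datav[it]

# Explicit exponentially decaying weights for x_n, x_{n-1}, ..., x_2
weightv <- (1-lambdaf)*lambdaf^(0:(nrows-2))
# Weighted sum plus the decayed contribution of the initial value
meanw <- sum(weightv*datav[nrows:2]) + lambdaf^(nrows-1)*datav[1]

# Both approaches give the same value at the last point
all.equal(meanr[nrows], meanw)
```

The explicit weighted sum has to revisit all the past data for every new point, while the recursion needs only one update per point.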
The value of the decay factor \lambda must be in the range between 0 and 1.
If \lambda is close to 1 then the decay is weak and past values have a greater weight, and the trailing variance values have a stronger dependence on past data. This is equivalent to a long look-back interval.
If \lambda is much less than 1 then the decay is strong and past values have a smaller weight, and the trailing variance values have a weaker dependence on past data. This is equivalent to a short look-back interval.
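One way to get a feel for the look-back interval implied by a given \lambda is the half-life of the weights, that is, the number of periods after which the weight of an observation has decayed by half. This measure is not part of run_var(); it is used here only as an illustration, and the variable names are hypothetical.

```r
# Decay factors to compare
lambdav <- c(0.5, 0.9, 0.99)

# The weight of an observation k periods in the past is proportional
# to lambda^k, so it halves after log(0.5)/log(lambda) periods.
halflife <- log(0.5)/log(lambdav)

# Number of most recent periods that carry 95% of the total weight,
# since the cumulative weight of the first k lags is 1 - lambda^k.
lags95 <- log(0.05)/log(lambdav)

data.frame(lambda=lambdav, halflife=halflife, lags95=lags95)
```

Both measures grow quickly as \lambda approaches 1, which matches the description of a long look-back interval above.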
The function run_var() performs the same calculation as the standard R function stats::filter(x=series, filter=weightv, method="recursive"), but it's several times faster.
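The correspondence with stats::filter() can be illustrated without the package, using simulated data and an explicit loop in place of run_var(). This is a sketch under assumptions: the loop initializes the mean at the first observation and the variance at zero, which may not match the initialization used by run_var().

```r
set.seed(42)
datav <- rnorm(100)
lambdaf <- 0.9
nrows <- length(datav)

# Trailing mean and variance by explicit recursion
# Assumption: mean starts at the first observation, variance at zero
meanv <- numeric(nrows)
varv <- numeric(nrows)
meanv[1] <- datav[1]
for (it in 2:nrows) {
  meanv[it] <- lambdaf*meanv[it-1] + (1-lambdaf)*datav[it]
  varv[it] <- lambdaf*varv[it-1] + (1-lambdaf)*(datav[it] - meanv[it])^2
}

# The same recursions using the recursive filter y_t = x_t + lambda*y_{t-1}
meanf <- stats::filter((1-lambdaf)*datav, filter=lambdaf,
  method="recursive", init=datav[1])
varf <- stats::filter((1-lambdaf)*(datav - meanf)^2, filter=lambdaf,
  method="recursive", init=0)

all.equal(as.numeric(meanf), meanv)
all.equal(as.numeric(varf), varv)
```

The input to stats::filter() is scaled by (1-lambdaf) because the recursive filter adds the new input with a weight of 1, while the formulas above weight it by 1-\lambda.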
The function run_var() returns a matrix with the same dimensions as the input argument tseries.
A matrix with the same dimensions as the input argument tseries.
## Not run:
# Calculate historical returns
retp <- zoo::coredata(na.omit(rutils::etfenv$returns$VTI))
# Calculate the trailing variance
lambdaf <- 0.9
vars <- HighFreq::run_var(retp, lambda=lambdaf)
# Calculate centered returns
retc <- (retp - HighFreq::run_mean(retp, lambda=lambdaf))
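# Calculate the trailing variance using R code
# Call stats::filter() explicitly, in case filter() is masked by another package
retc2 <- (1-lambdaf)*stats::filter(retc^2, filter=lambdaf,
  init=as.numeric(retc[1, 1])^2/(1-lambdaf),
  method="recursive")
# Compare the two results
all.equal(vars, unclass(retc2), check.attributes=FALSE)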
# Compare the speed of RcppArmadillo with R code
library(microbenchmark)
summary(microbenchmark(
  Rcpp=HighFreq::run_var(retp, lambda=lambdaf),
  Rcode=stats::filter(retc^2, filter=lambdaf, init=as.numeric(retc[1, 1])^2/(1-lambdaf), method="recursive"),
  times=10))[, c(1, 4, 5)] # end microbenchmark summary
## End(Not run)