run_zscores: Calculate the trailing means, volatilities, and z-scores of a...


run_zscores    R Documentation

Calculate the trailing means, volatilities, and z-scores of a streaming time series of data using an online recursive formula.

Description

Calculate the trailing means, volatilities, and z-scores of a streaming time series of data using an online recursive formula.

Usage

run_zscores(tseries, lambda)

Arguments

tseries

A single time series or a single column matrix of data.

lambda

A decay factor which multiplies past estimates.

Details

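The function run_zscores() calculates the trailing means, volatilities, and z-scores of a single streaming time series of data r_t. It updates the trailing means \bar{r}_t by recursively weighting the past mean estimates \bar{r}_{t-1} with the new data r_t. It updates the trailing variances \sigma^2_t by recursively weighting the past variance estimates \sigma^2_{t-1} with the squared differences of the data minus its trailing means (r_t - \bar{r}_t)^2. Both updates use the decay factor \lambda: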

\bar{r}_t = \lambda \bar{r}_{t-1} + (1-\lambda) r_t

\sigma^2_t = \lambda \sigma^2_{t-1} + (1-\lambda) (r_t - \bar{r}_t)^2

z_t = \frac{r_t - \bar{r}_t}{\sigma_t}

Where r_t are the streaming data, \bar{r}_t are the trailing means, \sigma^2_t are the trailing variance estimates (the trailing volatilities \sigma_t are their square roots), and z_t are the z-scores.
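The recursion above can be sketched in plain R as follows. This is only an illustration, not the compiled code behind run_zscores(). The helper name run_zscores_sketch() is hypothetical, and the initialization (first mean equal to the first data point, first variance equal to zero) and the zero z-score when the volatility is zero are assumptions that may differ from the package.

```r
# Hypothetical helper, not part of HighFreq: plain-R sketch of the recursion
run_zscores_sketch <- function(tseries, lambda) {
  datav <- as.numeric(tseries)
  nrows <- length(datav)
  stopifnot(nrows >= 1, lambda > 0, lambda < 1)
  meanv <- numeric(nrows)
  varv <- numeric(nrows)
  # Assumed initialization: first mean is the first value, first variance is zero
  meanv[1] <- datav[1]
  varv[1] <- 0
  for (it in seq_len(nrows)[-1]) {
    # Update the trailing mean with the new data point
    meanv[it] <- lambda*meanv[it-1] + (1-lambda)*datav[it]
    # Update the trailing variance using the updated mean
    varv[it] <- lambda*varv[it-1] + (1-lambda)*(datav[it] - meanv[it])^2
  }
  volv <- sqrt(varv)
  # Assumed convention: the z-score is zero where the volatility is zero
  zscores <- ifelse(volv > 0, (datav - meanv)/volv, 0)
  cbind(means=meanv, volatilities=volv, zscores=zscores)
}

# Usage on simulated data (a random walk)
set.seed(1121)
datav <- cumsum(rnorm(100))
outv <- run_zscores_sketch(datav, lambda=0.9)
tail(outv, 3)
```

The loop keeps only the previous mean and variance at each step, which is why no buffer of past data is needed.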

The above online recursive formulas are convenient for processing live streaming data because they don't require maintaining a buffer of past data. The formulas are equivalent to a convolution with exponentially decaying weights, but they're much faster to calculate. Using exponentially decaying weights is more natural than using a sliding look-back interval, because it gradually "forgets" about the past data.
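The equivalence with exponentially decaying weights can be seen by unrolling the recursion for the trailing mean. Assuming an initial estimate \bar{r}_0 (the initialization used by the package is not stated here), repeated substitution gives:

\bar{r}_t = (1-\lambda) \sum_{j=0}^{t-1} \lambda^j r_{t-j} + \lambda^t \bar{r}_0

The weight of the data point r_{t-j} is (1-\lambda) \lambda^j, which decays exponentially with the lag j. The weights of the data sum to 1 - \lambda^t, and the weight \lambda^t of the initial estimate fades away as t grows.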

The value of the decay factor \lambda must be in the range between 0 and 1. If \lambda is close to 1 then the decay is weak and past values have a greater weight, and the trailing variance values have a stronger dependence on past data. This is equivalent to a long look-back interval. If \lambda is much less than 1 then the decay is strong and past values have a smaller weight, and the trailing variance values have a weaker dependence on past data. This is equivalent to a short look-back interval.
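As a rough guide to how \lambda maps to a look-back interval, the short sketch below computes two standard summaries of the exponential weights \lambda^j: the half-life (the lag at which the weight drops to one half) and the average lag of the normalized weights (1-\lambda) \lambda^j. The variable names are illustrative and these quantities are not returned by run_zscores().

```r
# Decay factors to compare
lambdav <- c(0.5, 0.9, 0.99)
# Half-life: the lag j at which the weight lambda^j drops to 1/2
halflife <- log(0.5)/log(lambdav)
# Average lag of the normalized weights (1-lambda)*lambda^j
avglag <- lambdav/(1-lambdav)
round(cbind(lambda=lambdav, halflife=halflife, avglag=avglag), 2)
```

Both summaries grow quickly as \lambda approaches 1, which matches the interpretation of a long look-back interval.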

The function run_zscores() returns a matrix with three columns (means, volatilities, and z-scores) and the same number of rows as the input argument tseries.

Value

A matrix with three columns (means, volatilities, and z-scores) and the same number of rows as the input argument tseries.

Examples

## Not run: 
# Calculate historical VTI log prices
pricev <- log(na.omit(rutils::etfenv$prices$VTI))
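# Calculate the trailing means, volatilities, and z-scores of prices
lambdaf <- 0.9 # Decay factor
zscores <- HighFreq::run_zscores(pricev, lambda=lambdaf)
# Combine the prices with the z-scores (third column of the output)
datav <- cbind(pricev, zscores[, 3])
colnames(datav) <- c("VTI", "Zscores")
colnamev <- colnames(datav)
# Attach dygraphs so that %>%, dyAxis(), dySeries(), and dyLegend() are found
library(dygraphs)
# Plot the prices and z-scores with two y-axes
dygraphs::dygraph(datav, main="VTI Prices and Z-scores") %>%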
   dyAxis("y", label=colnamev[1], independentTicks=TRUE) %>%
   dyAxis("y2", label=colnamev[2], independentTicks=TRUE) %>%
   dySeries(axis="y", label=colnamev[1], strokeWidth=2, col="blue") %>%
   dySeries(axis="y2", label=colnamev[2], strokeWidth=2, col="red") %>%
   dyLegend(show="always", width=300)

## End(Not run)


algoquant/HighFreq documentation built on Feb. 9, 2024, 8:15 p.m.