run_zscores | R Documentation |
Calculate the trailing means, volatilities, and z-scores of a streaming time series of data using an online recursive formula.
run_zscores(tseries, lambda)
tseries | A single time series or a single-column matrix of data. |
lambda | A decay factor which multiplies past estimates. |
The function run_zscores() calculates the trailing means, volatilities, and z-scores of a single streaming time series of data r_t, by recursively weighting the past variance estimates \sigma^2_{t-1}, with the squared differences of the data minus its trailing means (r_t - \bar{r}_t)^2, using the decay factor \lambda:
\bar{r}_t = \lambda \bar{r}_{t-1} + (1-\lambda) r_t
\sigma^2_t = \lambda \sigma^2_{t-1} + (1-\lambda) (r_t - \bar{r}_t)^2
z_t = \frac{r_t - \bar{r}_t}{\sigma_t}
Where r_t are the streaming data, \bar{r}_t are the trailing means, \sigma^2_t are the trailing variance estimates, and z_t are the z-scores.
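The three formulas above can be sketched in plain R as follows. This is only an illustration, not the HighFreq implementation: the helper name run_zscores_sketch() is hypothetical, and the initial state (mean equal to the first value, variance equal to zero) and the zero z-score where the volatility is zero are assumptions, since the formulas don't specify them.

```r
# Hypothetical helper, not part of HighFreq: a plain-R sketch of the recursion
run_zscores_sketch <- function(tseries, lambda) {
  datav <- as.numeric(tseries)
  nrows <- length(datav)
  stopifnot(nrows >= 1, lambda > 0, lambda < 1)
  meanv <- numeric(nrows)
  varv <- numeric(nrows)
  # Assumed initial state: mean equals the first value, variance equals zero
  meanv[1] <- datav[1]
  for (it in seq_len(nrows)[-1]) {
    # Update the trailing mean first
    meanv[it] <- lambda*meanv[it-1] + (1-lambda)*datav[it]
    # Update the trailing variance using the current trailing mean
    varv[it] <- lambda*varv[it-1] + (1-lambda)*(datav[it] - meanv[it])^2
  }
  volv <- sqrt(varv)
  # Assumed convention: the z-score is zero where the volatility is zero
  zscorev <- ifelse(volv > 0, (datav - meanv)/volv, 0)
  cbind(means=meanv, volatilities=volv, zscores=zscorev)
}

# Usage on a short toy series
run_zscores_sketch(c(1, 2, 3), lambda=0.5)
```

Each update uses only the previous mean and variance and the newest data point, so no history of past data is stored.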
The above online recursive formulas are convenient for processing live streaming data because they don't require maintaining a buffer of past data. The formulas are equivalent to a convolution with exponentially decaying weights, but they're much faster to calculate. Using exponentially decaying weights is more natural than using a sliding look-back interval, because it gradually "forgets" about the past data.
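The equivalence with exponentially decaying weights can be checked numerically for the trailing mean. The sketch below uses simulated data and assumes the recursion starts from the first data point. The variable names are illustrative only.

```r
lambda <- 0.9
set.seed(1121)
datav <- rnorm(100)
nrows <- length(datav)
# Recursive trailing mean, with the assumed initial value datav[1]
meanr <- numeric(nrows)
meanr[1] <- datav[1]
for (it in 2:nrows)
  meanr[it] <- lambda*meanr[it-1] + (1-lambda)*datav[it]
# The same mean as an explicit sum with exponentially decaying weights
meanc <- sapply(1:nrows, function(it) {
  if (it == 1) return(datav[1])
  lagv <- 0:(it-2)
  # Weight (1-lambda)*lambda^k on the data k periods back,
  # plus the decayed contribution of the initial value
  sum((1-lambda)*lambda^lagv*datav[it-lagv]) + lambda^(it-1)*datav[1]
})
# Compare the two calculations
all.equal(meanr, meanc)
```

The explicit sum has to revisit all the past data at every step, while the recursion needs only a single multiply-and-add per new data point.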
The value of the decay factor \lambda must be in the range between 0 and 1.
If \lambda is close to 1 then the decay is weak and past values have a greater weight, and the trailing variance values have a stronger dependence on past data. This is equivalent to a long look-back interval.
If \lambda is much less than 1 then the decay is strong and past values have a smaller weight, and the trailing variance values have a weaker dependence on past data. This is equivalent to a short look-back interval.
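One way to put a number on the look-back interval is the half-life of the weights: the number of periods k after which the weight \lambda^k has decayed to one half. The helper below is hypothetical (it is not a HighFreq function) and only illustrates the relationship.

```r
# Hypothetical helper: periods until the weight lambda^k decays to one half
half_life <- function(lambda) log(0.5)/log(lambda)
# Weak decay: long effective look-back
half_life(0.99)
# Stronger decay: shorter effective look-back
half_life(0.9)
```

For \lambda = 0.99 the half-life is about 69 periods, while for \lambda = 0.9 it is about 6.6 periods.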
The function run_zscores() returns a matrix with three columns (means, volatilities, and z-scores) and the same number of rows as the input argument tseries.
A matrix with three columns (means, volatilities, and z-scores) and the same number of rows as the input argument tseries.
## Not run:
# Attach dygraphs for the pipe operator %>% and the dy*() functions
library(dygraphs)
# Calculate historical VTI log prices
pricev <- log(na.omit(rutils::etfenv$prices$VTI))
# Calculate the trailing means, volatilities, and z-scores of prices
lambdaf <- 0.9 # Decay factor
zscores <- HighFreq::run_zscores(pricev, lambda=lambdaf)
# Combine the prices with the z-scores (third column)
datav <- cbind(pricev, zscores[, 3])
colnames(datav) <- c("VTI", "Z-Scores")
colnamev <- colnames(datav)
# Plot the prices and z-scores on separate y-axes
dygraphs::dygraph(datav, main="VTI Prices and Z-scores") %>%
  dyAxis("y", label=colnamev[1], independentTicks=TRUE) %>%
  dyAxis("y2", label=colnamev[2], independentTicks=TRUE) %>%
  dySeries(axis="y", label=colnamev[1], strokeWidth=2, color="blue") %>%
  dySeries(axis="y2", label=colnamev[2], strokeWidth=2, color="red") %>%
  dyLegend(show="always", width=300)
## End(Not run)