run_mean: Calculate the exponential moving average (EMA) of streaming...

run_mean    R Documentation

Description

Calculate the exponential moving average (EMA) of streaming time series data using an online recursive formula.

Usage

run_mean(tseries, lambda, weightv = 0L)

Arguments

tseries

A time series or a matrix.

lambda

A decay factor which multiplies past estimates.

weightv

A single-column matrix of weights (the default is 0L, in which case no weights are applied).

Details

The function run_mean() calculates the exponential moving average (EMA) of the streaming time series data p_t by recursively weighting present and past values using the decay factor \lambda. If the weightv argument is equal to zero, then the function run_mean() simply calculates the exponentially weighted moving average value of the streaming time series data p_t:

\bar{p}_t = \lambda \bar{p}_{t-1} + (1-\lambda) p_t = (1-\lambda) \sum_{j=0}^{n} \lambda^j p_{t-j}
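
As an illustration, the recursion above can be sketched in plain R. The helper name ema_loop is hypothetical and is not part of HighFreq, and starting the recursion at the first observation is an assumption made here for simplicity, not a statement about how run_mean() is implemented.

```r
# Hypothetical helper (not part of HighFreq): EMA via the online recursion.
ema_loop <- function(pricev, lambda) {
  stopifnot(length(pricev) >= 1, lambda >= 0, lambda <= 1)
  emav <- numeric(length(pricev))
  # Assumption: start the recursion at the first observation.
  emav[1] <- pricev[1]
  for (it in seq_along(pricev)[-1]) {
    # Past estimate decays by lambda, new value gets weight (1 - lambda).
    emav[it] <- lambda * emav[it - 1] + (1 - lambda) * pricev[it]
  }
  emav
}

pricev <- c(10, 11, 12, 11, 13)
ema_loop(pricev, lambda = 0.9)
# 10.0000 10.1000 10.2900 10.3610 10.6249
```

Only the previous estimate is carried from one step to the next, which is what makes the formula suitable for streaming data.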

Some applications require additional weight factors. For example, the volume-weighted average price (VWAP) indicator weights the streaming prices by the streaming trading volumes.

If the argument weightv has the same number of rows as the argument tseries, then the function run_mean() calculates the exponential moving average (EMA) in two steps.

First it calculates the trailing mean weights \bar{w}_t:

\bar{w}_t = \lambda \bar{w}_{t-1} + (1-\lambda) w_t

Second it calculates the trailing mean products \bar{w p}_t of the weights w_t and the data p_t:

\bar{w p}_t = \lambda \bar{w p}_{t-1} + (1-\lambda) w_t p_t

Where p_t is the streaming data, w_t are the streaming weights, \bar{w}_t are the trailing mean weights, and \bar{w p}_t are the trailing mean products of the data and the weights.

The trailing mean weighted value \bar{p}_t is equal to the ratio of the trailing mean products to the trailing mean weights:

\bar{p}_t = \frac{\bar{w p}_t}{\bar{w}_t}
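
The two-step weighted calculation can be sketched in plain R as follows. The helper name ema_weighted is hypothetical, and initializing both recursions with the first observation is an assumption of this sketch rather than a description of the package internals.

```r
# Hypothetical helper (not part of HighFreq): weighted EMA in two steps.
ema_weighted <- function(pricev, weightv, lambda) {
  stopifnot(length(pricev) == length(weightv), length(pricev) >= 1)
  nrows <- length(pricev)
  meanw <- numeric(nrows)   # trailing mean weights
  meanwp <- numeric(nrows)  # trailing mean products of weights and data
  # Assumption: start both recursions at the first observation.
  meanw[1] <- weightv[1]
  meanwp[1] <- weightv[1] * pricev[1]
  for (it in seq_len(nrows)[-1]) {
    meanw[it] <- lambda * meanw[it - 1] + (1 - lambda) * weightv[it]
    meanwp[it] <- lambda * meanwp[it - 1] + (1 - lambda) * weightv[it] * pricev[it]
  }
  # Ratio of the trailing mean products to the trailing mean weights.
  meanwp / meanw
}

# Made-up prices and volumes, for illustration only
pricev <- c(10, 11, 12, 11, 13)
volumv <- c(100, 300, 200, 500, 100)
ema_weighted(pricev, volumv, lambda = 0.9)
```

With constant weights the ratio reduces to the unweighted recursion, and with constant data it returns that constant regardless of the weights.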

The above online recursive formulas are convenient for processing live streaming data because they don't require maintaining a buffer of past data. The formulas are equivalent to a convolution with exponentially decaying weights, but they're much faster to calculate. Using exponentially decaying weights is more natural than using a sliding look-back interval, because it gradually "forgets" about the past data.
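
The equivalence between the recursion and the convolution can be checked numerically. The sketch below uses synthetic data and assumes that the recursion starts from zero (no history before the first observation), so that both calculations use exactly the same finite history. The variable names are illustrative only.

```r
lambda <- 0.9
set.seed(42)
pricev <- cumsum(rnorm(50))  # synthetic random-walk data

# Online recursion: only the previous estimate is kept in memory.
# Assumption: the estimate before the first observation is zero.
emar <- numeric(length(pricev))
prevv <- 0
for (it in seq_along(pricev)) {
  prevv <- lambda * prevv + (1 - lambda) * pricev[it]
  emar[it] <- prevv
}

# Explicit convolution: re-sums the whole available history at each step.
emac <- sapply(seq_along(pricev), function(it) {
  lagv <- 0:(it - 1)
  (1 - lambda) * sum(lambda^lagv * pricev[it - lagv])
})

all.equal(emar, emac)
```

The recursion does a fixed amount of work per observation, while the explicit sum grows with the length of the history, which is where the speed difference comes from.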

The value of the decay factor \lambda must be in the range between 0 and 1. If \lambda is close to 1 then the decay is weak and past values have a greater weight, and the trailing mean values have a stronger dependence on past data. This is equivalent to a long look-back interval. If \lambda is much less than 1 then the decay is strong and past values have a smaller weight, and the trailing mean values have a weaker dependence on past data. This is equivalent to a short look-back interval.
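
One way to make the link between \lambda and the look-back interval concrete is the half-life: the number of periods after which the weight \lambda^j of a past value drops to one half. The helper name halflife below is hypothetical and is not part of HighFreq.

```r
# Hypothetical helper: number of periods h for which lambda^h = 0.5.
halflife <- function(lambda) {
  stopifnot(lambda > 0, lambda < 1)
  log(0.5) / log(lambda)
}

sapply(c(0.5, 0.9, 0.99), halflife)
# Roughly 1, 6.6, and 69 periods
```

A value of \lambda close to 1 gives a half-life of many periods (a long look-back interval), while a smaller \lambda gives a half-life of only a few periods.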

The function run_mean() performs the same calculation as the standard R function stats::filter(x=tseries, filter=lambda, method="recursive") with its output scaled by (1-\lambda), but it's several times faster.

The function run_mean() returns a matrix with the same dimensions as the input argument tseries.

Value

A matrix with the same dimensions as the input argument tseries.

Examples

## Not run: 
# Calculate historical prices
ohlc <- rutils::etfenv$VTI
closep <- quantmod::Cl(ohlc)
# Calculate the trailing means
lambdaf <- 0.9
meanv <- HighFreq::run_mean(closep, lambda=lambdaf)
# Calculate the trailing means using R code
# Call stats::filter() explicitly, in case another package masks filter()
pricef <- (1-lambdaf)*stats::filter(closep, 
  filter=lambdaf, init=as.numeric(closep[1, 1])/(1-lambdaf), 
  method="recursive")
all.equal(drop(meanv), unclass(pricef), check.attributes=FALSE)

# Compare the speed of RcppArmadillo with R code
library(microbenchmark)
summary(microbenchmark(
  Rcpp=HighFreq::run_mean(closep, lambda=lambdaf),
  Rcode=stats::filter(closep, filter=lambdaf, init=as.numeric(closep[1, 1])/(1-lambdaf), method="recursive"),
  times=10))[, c(1, 4, 5)]  # end microbenchmark summary

# Calculate weights equal to the trading volumes
weightv <- quantmod::Vo(ohlc)
# Calculate the exponential moving average (EMA)
meanw <- HighFreq::run_mean(closep, lambda=lambdaf, weightv=weightv)
# Plot dygraph of the EMA
library(dygraphs)  # Provides dyOptions(), dyLegend(), and the %>% pipe
datav <- xts::xts(cbind(meanv, meanw), zoo::index(ohlc))
colnames(datav) <- c("means trailing", "means weighted")
dygraphs::dygraph(datav, main="Trailing Means") %>%
  dyOptions(colors=c("blue", "red"), strokeWidth=2) %>%
  dyLegend(show="always", width=300)

## End(Not run)


algoquant/HighFreq documentation built on Feb. 9, 2024, 8:15 p.m.