run_mean | R Documentation |
Calculate the exponential moving average (EMA) of streaming time series data using an online recursive formula.
run_mean(tseries, lambda, weightv = 0L)
tseries | A time series or a matrix. |
lambda | A decay factor which multiplies past estimates. |
weightv | A single-column matrix of weights. |
The function run_mean() calculates the exponential moving average (EMA) of the streaming time series data p_t by recursively weighting present and past values using the decay factor \lambda. If the weightv argument is equal to zero, then the function run_mean() simply calculates the exponentially weighted moving average value of the streaming time series data p_t:
\bar{p}_t = \lambda \bar{p}_{t-1} + (1-\lambda) p_t = (1-\lambda) \sum_{j=0}^{n} \lambda^j p_{t-j}
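The recursion above can be sketched in plain R. This is only an illustration, not the package's compiled implementation: the helper name ema_recursive() is hypothetical, and starting the EMA at the first data point is an assumption about the initialization.

```r
# Hypothetical helper (not part of HighFreq): plain-R sketch of the EMA recursion
ema_recursive <- function(pricev, lambda) {
  stopifnot(is.numeric(pricev), length(pricev) > 0, lambda >= 0, lambda < 1)
  nrows <- length(pricev)
  emav <- numeric(nrows)
  # Assumption: initialize the EMA with the first data point
  emav[1] <- pricev[1]
  # Update the EMA one data point at a time, without a buffer of past data
  for (it in seq_len(nrows)[-1]) {
    emav[it] <- lambda*emav[it-1] + (1-lambda)*pricev[it]
  }
  emav
}

# Toy prices (made up for illustration)
pricev <- c(10, 11, 12, 11, 13)
emav <- ema_recursive(pricev, lambda=0.9)
```

Each update uses only the previous EMA value and the newest data point, which is why the recursion suits streaming data.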
Some applications require additional weight factors, for example the volume-weighted average price (VWAP) indicator, in which the streaming prices are multiplied by the streaming trading volumes.
If the argument weightv has the same number of rows as the argument tseries, then the function run_mean() calculates the exponential moving average (EMA) in two steps.
First, it calculates the trailing mean weights \bar{w}_t:
\bar{w}_t = \lambda \bar{w}_{t-1} + (1-\lambda) w_t
Second, it calculates the trailing mean products \bar{w p}_t of the weights w_t and the data p_t:
\bar{w p}_t = \lambda \bar{w p}_{t-1} + (1-\lambda) w_t p_t
Where p_t is the streaming data, w_t are the streaming weights, \bar{w}_t are the trailing mean weights, and \bar{w p}_t are the trailing mean products of the data and the weights.
The trailing mean weighted value \bar{p}_t is equal to the ratio of the trailing mean products \bar{w p}_t to the trailing mean weights \bar{w}_t:
\bar{p}_t = \frac{\bar{w p}_t}{\bar{w}_t}
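The two-step weighted calculation can be sketched in plain R as follows. The helper ema_weighted() is hypothetical (it is not a HighFreq function), the initialization with the first weight and product is an assumption, and the prices and volumes are made-up toy values.

```r
# Hypothetical helper (not part of HighFreq): plain-R sketch of the weighted EMA
ema_weighted <- function(pricev, weightv, lambda) {
  stopifnot(length(pricev) == length(weightv), lambda >= 0, lambda < 1)
  nrows <- length(pricev)
  meanw <- numeric(nrows)   # Trailing mean weights
  meanwp <- numeric(nrows)  # Trailing mean products of weights and data
  # Assumption: initialize both recursions with their first values
  meanw[1] <- weightv[1]
  meanwp[1] <- weightv[1]*pricev[1]
  for (it in seq_len(nrows)[-1]) {
    meanw[it] <- lambda*meanw[it-1] + (1-lambda)*weightv[it]
    meanwp[it] <- lambda*meanwp[it-1] + (1-lambda)*weightv[it]*pricev[it]
  }
  # The weighted EMA is the ratio of the mean products to the mean weights
  meanwp/meanw
}

# Toy prices and trading volumes (made up for illustration)
pricev <- c(10, 11, 12, 11, 13)
volumv <- c(100, 200, 150, 300, 250)
vwapv <- ema_weighted(pricev, volumv, lambda=0.9)
```

With constant weights the ratio reduces to the unweighted EMA, since the trailing mean weights stay constant.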
The above online recursive formulas are convenient for processing live streaming data because they don't require maintaining a buffer of past data. The formulas are equivalent to a convolution with exponentially decaying weights, but they're much faster to calculate. Using exponentially decaying weights is more natural than using a sliding look-back interval, because it gradually "forgets" about the past data.
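As a rough check of the equivalence with a convolution, the sketch below compares the recursion with an explicit sum over exponentially decaying weights. It uses simulated data and assumes the recursion starts from zero (the default of stats::filter()), which may differ from the initialization used by run_mean().

```r
lambda <- 0.9
set.seed(1121)
# Simulated prices (random walk), for illustration only
pricev <- 100 + cumsum(rnorm(50))
nrows <- length(pricev)

# Recursive calculation: y_t = p_t + lambda*y_{t-1}, scaled by (1-lambda)
recurv <- (1-lambda)*as.numeric(stats::filter(pricev, filter=lambda, method="recursive"))

# Explicit convolution with exponentially decaying weights lambda^j
convv <- sapply(seq_len(nrows), function(it) {
  decayv <- lambda^(0:(it-1))
  # pricev[it:1] lists the data from the most recent point backwards
  (1-lambda)*sum(decayv*pricev[it:1])
})

# Compare the two calculations
all.equal(recurv, convv)
```

The explicit sum revisits all the past data at every step, while the recursion needs only a single update per data point.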
The value of the decay factor \lambda must be in the range between 0 and 1.
If \lambda is close to 1 then the decay is weak and past values have a greater weight, and the trailing mean values have a stronger dependence on past data. This is equivalent to a long look-back interval.
If \lambda is much less than 1 then the decay is strong and past values have a smaller weight, and the trailing mean values have a weaker dependence on past data. This is equivalent to a short look-back interval.
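One way to make the link between \lambda and the look-back interval concrete is to calculate the half-life of the weights, that is, the number of periods after which the weight \lambda^j drops to one half. The half-life is an illustrative measure introduced here, not something that run_mean() calculates, and the helper weight_recent() is hypothetical.

```r
# Decay factors to compare
lambdav <- c(0.5, 0.9, 0.99)

# Half-life: the number of periods j for which lambda^j = 0.5
halflife <- log(0.5)/log(lambdav)

# Hypothetical helper: the fraction of the total weight carried by the
# most recent k data points, equal to (1-lambda)*sum(lambda^(0:(k-1)))
weight_recent <- function(lambda, k) 1 - lambda^k

# Fraction of the weight carried by the 10 most recent data points
weight_recent(lambdav, k=10)
```

The half-life grows as \lambda approaches 1, and the most recent data points carry a smaller share of the total weight, which corresponds to a longer look-back interval.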
The function run_mean() performs the same calculation as the standard R function stats::filter(x=series, filter=lambda, method="recursive") (with its output scaled by 1-\lambda), but it's several times faster.
The function run_mean() returns a matrix with the same dimensions as the input argument tseries.
A matrix with the same dimensions as the input argument tseries.
## Not run:
# Calculate historical prices
ohlc <- rutils::etfenv$VTI
closep <- quantmod::Cl(ohlc)
# Calculate the trailing means
lambdaf <- 0.9 # Decay factor
meanv <- HighFreq::run_mean(closep, lambda=lambdaf)
# Calculate the trailing means using R code
pricef <- (1-lambdaf)*stats::filter(closep,
  filter=lambdaf, init=as.numeric(closep[1, 1])/(1-lambdaf),
  method="recursive")
all.equal(drop(meanv), unclass(pricef), check.attributes=FALSE)
# Compare the speed of RcppArmadillo with R code
library(microbenchmark)
summary(microbenchmark(
  Rcpp=HighFreq::run_mean(closep, lambda=lambdaf),
  Rcode=stats::filter(closep, filter=lambdaf, init=as.numeric(closep[1, 1])/(1-lambdaf), method="recursive"),
  times=10))[, c(1, 4, 5)] # end microbenchmark summary
# Calculate weights equal to the trading volumes
weightv <- quantmod::Vo(ohlc)
# Calculate the exponential moving average (EMA)
meanw <- HighFreq::run_mean(closep, lambda=lambdaf, weightv=weightv)
# Plot dygraph of the EMA
library(dygraphs) # Also provides the %>% pipe operator
datav <- xts::xts(cbind(meanv, meanw), zoo::index(ohlc))
colnames(datav) <- c("means trailing", "means weighted")
dygraph(datav, main="Trailing Means") %>%
  dyOptions(colors=c("blue", "red"), strokeWidth=2) %>%
  dyLegend(show="always", width=300)
## End(Not run)