which_extreme | R Documentation |
Calculate a Boolean vector that identifies extreme tail values in a single-column xts time series or vector, over a rolling look-back interval.
which_extreme(xtsv, look_back = 51, vol_mult = 2)
xtsv | A single-column xts time series, or a numeric or Boolean vector. |
look_back | The number of data points in the rolling look-back interval used to estimate the rolling quantile. |
vol_mult | The quantile multiplier. |
The function which_extreme() calculates a Boolean vector, with TRUE for values that belong to the extreme tails of the distribution of values.
The function which_extreme() applies a version of the Hampel median filter to identify extreme values, but instead of using the median absolute deviation (MAD), it uses the 0.9 quantile values calculated over a rolling look-back interval.
Extreme values are defined as those that exceed the product of the quantile multiplier vol_mult and the rolling quantile. Extreme values belong to the fat tails of the recent (trailing) distribution of values, so they are present only when the trailing distribution of values has fat tails. If the trailing distribution of values is closer to normal (without fat tails), then there are no extreme values.
The quantile multiplier vol_mult controls the threshold at which values are identified as extreme. Smaller quantile multiplier values will cause more values to be identified as extreme.
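The thresholding rule described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper name flag_extreme_sketch() and its prob argument are hypothetical, and the sketch assumes a trailing window that excludes the current value, a comparison on absolute values, and a shorter window at the start of the series.

```r
# Hypothetical helper illustrating a rolling-quantile outlier rule.
# Assumptions (not taken from the package source): trailing window that
# excludes the current point, absolute values, partial windows at the start.
flag_extreme_sketch <- function(x, look_back = 51, vol_mult = 2, prob = 0.9) {
  x <- as.numeric(x)
  n <- length(x)
  is_extreme <- logical(n)  # all FALSE by default
  if (n < 2) return(is_extreme)
  for (i in 2:n) {
    # Up to look_back points ending just before point i
    start <- max(1, i - look_back)
    window <- abs(x[start:(i - 1)])
    # Threshold = multiplier times the rolling quantile
    thresh <- vol_mult * stats::quantile(window, probs = prob,
                                         names = FALSE, na.rm = TRUE)
    is_extreme[i] <- abs(x[i]) > thresh
  }
  is_extreme
}

# Usage: a bounded series with one injected spike
x <- sin(1:500)
x[250] <- 25
flags <- flag_extreme_sketch(x, look_back = 51, vol_mult = 3)
which(flags)
```

In this sketch, lowering vol_mult lowers the threshold, so more points would be flagged, which mirrors the behavior described for the quantile multiplier.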
A Boolean vector with the same number of rows as the input time series or vector.
# Create local copy of SPY TAQ data
taq <- HighFreq::SPY_TAQ
# Scrub quotes with suspect bid-ask spreads
bidask <- taq[, "Ask.Price"] - taq[, "Bid.Price"]
sus_pect <- which_extreme(bidask, look_back=51, vol_mult=3)
# Remove suspect values
taq <- taq[!sus_pect]