rmTradeOutliersUsingQuotes: Delete transactions with unlikely transaction prices


rmTradeOutliersUsingQuotes    R Documentation

Delete transactions with unlikely transaction prices

Description

Deletes trades whose price lies above the ask plus the bid-ask spread, or below the bid minus the bid-ask spread. The number of spreads allowed on either side is set by nSpreads (default 1).
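The rule can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the helper keepTrade and the toy prices are hypothetical, and each trade is assumed to be already paired with its prevailing bid and offer.

```r
# Hypothetical helper illustrating the documented rule.
# A trade is kept when its price is at most nSpreads * spread above the offer
# and at most nSpreads * spread below the bid.
keepTrade <- function(price, bid, ofr, nSpreads = 1) {
  spread <- ofr - bid
  price <= ofr + nSpreads * spread & price >= bid - nSpreads * spread
}

# Toy data (assumed): four trades, all matched to a 100.00 / 100.10 quote.
trades <- data.frame(
  PRICE = c(100.02, 100.30, 99.95, 99.70),
  BID   = c(100.00, 100.00, 100.00, 100.00),
  OFR   = c(100.10, 100.10, 100.10, 100.10)
)

# With nSpreads = 1 the accepted band is roughly [99.90, 100.20],
# so the trades at 100.30 and 99.70 are dropped.
kept <- trades[keepTrade(trades$PRICE, trades$BID, trades$OFR), ]
```

Raising nSpreads widens the accepted band and therefore removes fewer trades.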

Usage

rmTradeOutliersUsingQuotes(
  tData,
  qData,
  lagQuotes = 0,
  nSpreads = 1,
  BFM = FALSE,
  backwardsWindow = 3600,
  forwardsWindow = 0.5,
  plot = FALSE,
  ...
)

Arguments

tData

a data.table or xts object containing the time series data, with at least the column "PRICE", containing the transaction price.

qData

a data.table or xts object containing the time series data with at least the columns "BID" and "OFR", containing the bid and ask prices.

lagQuotes

numeric, number of seconds by which the quotes are registered ahead of the trades (should be a non-negative whole number). Default is 0. For older datasets, i.e. before 2010, it may be a good idea to set this to e.g. 2. See Vergote (2005).

nSpreads

numeric of length 1 denoting how far above the offer and below the bid a trade price may lie before it is treated as an outlier. Trades are filtered out if they are more than nSpreads * spread above the offer or below the bid.

BFM

a logical determining whether to conduct 'Backwards-Forwards matching' (BFM) of trades and quotes. The algorithm takes trades that fall outside the bid-ask spread and first tries to match them within a small window forwards; if this fails, it tries to match them backwards within a bigger window. The small forwards window is a tolerance for inaccuracies in the timestamps of bids and asks. The backwards window allows for matching of late-reported trades, e.g. block trades.

backwardsWindow

a numeric denoting the length of the backwards window in seconds. Default is 3600, corresponding to one hour.

forwardsWindow

a numeric denoting the length of the forwards window in seconds. Default is 0.5, corresponding to half a second.

plot

a logical denoting whether to visualize the forwards, backwards, and unmatched trades in a plot.

...

used internally
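As a rough illustration of how the BFM, backwardsWindow and forwardsWindow arguments interact, the base R sketch below classifies a trade that already lies outside its matched quote. It is one reading of the description above, not the package's code: the helper classifyTrade, the TIME column (seconds since an arbitrary origin) and the toy quotes are all assumptions.

```r
# Hypothetical helper: look for a quote whose [BID, OFR] range contains the
# trade price, first in the short forwards window, then in the longer
# backwards window.
classifyTrade <- function(tradeTime, price, quotes,
                          backwardsWindow = 3600, forwardsWindow = 0.5) {
  inside <- price >= quotes$BID & price <= quotes$OFR
  dt <- quotes$TIME - tradeTime   # seconds; positive means quote after trade
  if (any(inside & dt > 0 & dt <= forwardsWindow)) return("forward")
  if (any(inside & dt < 0 & dt >= -backwardsWindow)) return("backward")
  "unmatched"
}

# Toy quotes (assumed), ordered by time.
quotes <- data.frame(
  TIME = c(0, 10, 20.2, 30),
  BID  = c(99.9, 100.0, 100.4, 100.5),
  OFR  = c(100.0, 100.1, 100.5, 100.6)
)

labels <- c(
  classifyTrade(20.0, 100.45, quotes),  # a quote 0.2 s later contains the price
  classifyTrade(25.0, 99.95, quotes),   # only a quote 25 s earlier contains it
  classifyTrade(25.0, 101.00, quotes)   # no quote in either window contains it
)
```

The three labels correspond to the forwards, backwards, and unmatched groups that plot = TRUE is described as visualizing.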

Details

Note: to work correctly, this function requires cleaned trade data (tData) and cleaned quote data (qData). In older high-frequency datasets the trades frequently lag the quotes. In newer datasets this tends to happen only during extreme market activity, when exchange networks are at maximum capacity.
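The effect of lagQuotes on which quote a trade is compared against can be sketched in base R. This is an illustration under stated assumptions rather than the package's matching routine: the helper matchQuotes, the TIME column (seconds since an arbitrary origin) and the toy data are hypothetical, and quotes are assumed to be sorted by time.

```r
# Hypothetical helper: shift quote timestamps forward by lagQuotes seconds,
# then pair each trade with the last shifted quote at or before the trade time.
matchQuotes <- function(tradeTimes, quotes, lagQuotes = 0) {
  idx <- findInterval(tradeTimes, quotes$TIME + lagQuotes)
  idx[idx == 0] <- NA   # trades that precede every usable quote
  quotes[idx, c("BID", "OFR")]
}

# Toy quotes (assumed), sorted by time.
quotes <- data.frame(
  TIME = c(0, 10, 20),
  BID  = c(100.0, 100.1, 100.2),
  OFR  = c(100.1, 100.2, 100.3)
)
tradeTimes <- c(11, 21)

noLag   <- matchQuotes(tradeTimes, quotes)                 # quotes from t = 10 and t = 20
withLag <- matchQuotes(tradeTimes, quotes, lagQuotes = 2)  # quotes from t = 0 and t = 10
```

With a lag of 2 seconds, a quote posted 1 second before a trade is treated as not yet in effect, so the trade is compared against the previous quote instead.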

Value

xts or data.table object depending on input.

Author(s)

Jonathan Cornelissen, Kris Boudt, Onno Kleen, and Emil Sjoerup.

References

Vergote, O. (2005). How to match trades and quotes for NYSE stocks? K.U.Leuven working paper.

Christensen, K., Oomen, R. C. A., and Podolskij, M. (2014). Fact or friction: Jumps at ultra high frequency. Journal of Financial Economics, 114, 576-599.


highfrequency documentation built on Oct. 4, 2023, 5:08 p.m.