tradesCleanupUsingQuotes: Perform a final cleaning procedure on trade data

View source: R/dataHandling.R

tradesCleanupUsingQuotes R Documentation

Perform a final cleaning procedure on trade data

Description

This function applies the cleaning procedure rmTradeOutliersUsingQuotes to the trade data of all stocks in "dataDestination". Preferably, the input is trade and quote data that has already been cleaned, e.g. by tradesCleanup and quotesCleanup, respectively.

Usage

tradesCleanupUsingQuotes(
  tradeDataSource = NULL,
  quoteDataSource = NULL,
  dataDestination = NULL,
  tData = NULL,
  qData = NULL,
  lagQuotes = 0,
  nSpreads = 1,
  BFM = FALSE,
  backwardsWindow = 3600,
  forwardsWindow = 0.5,
  plot = FALSE
)

Arguments

tradeDataSource

character indicating the folder in which the original trade data is stored.

quoteDataSource

character indicating the folder in which the original quote data is stored.

dataDestination

character indicating the folder in which the cleaned data is stored, folder of dataSource by default.

tData

data.table or xts object containing trade data cleaned by tradesCleanup. This argument is NULL by default. If it is supplied, the on-disk arguments tradeDataSource, quoteDataSource and dataDestination are ignored (only advisable for small chunks of data).

qData

data.table or xts object containing cleaned quote data. This argument is NULL by default. If it is supplied, the on-disk arguments tradeDataSource, quoteDataSource and dataDestination are ignored (only advisable for small chunks of data).

lagQuotes

numeric, number of seconds by which the quotes are registered ahead of the trades (should be a whole, non-negative number). Default is 0. For older datasets, i.e. before 2010, it may be a good idea to set this to, e.g., 2 (see Vergote, 2005).
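The effect of a quote lag can be illustrated with a small base R sketch. It assumes that a trade at time t is paired with the most recent quote stamped at or before t - lagQuotes; the helper prevailingQuoteIndex and the toy timestamps are hypothetical and are not part of highfrequency, whose internal matching may differ in detail.

```r
# Hypothetical helper (not part of highfrequency): index of the quote that
# prevails for each trade once quotes are assumed to lead by lagQuotes seconds.
prevailingQuoteIndex <- function(tradeTime, quoteTime, lagQuotes = 0) {
  # findInterval() counts how many sorted quote times are <= the shifted trade time
  idx <- findInterval(tradeTime - lagQuotes, quoteTime)
  idx[idx == 0] <- NA_integer_  # no quote available early enough
  idx
}

quoteTime <- c(0, 1, 2, 3, 4)  # seconds, sorted
tradeTime <- c(2.5, 4.2)

idxNoLag <- prevailingQuoteIndex(tradeTime, quoteTime)                 # 3 5
idxLag2  <- prevailingQuoteIndex(tradeTime, quoteTime, lagQuotes = 2)  # 1 3
```

With lagQuotes = 2, each trade is compared against an older quote than it would be with the default of 0.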

nSpreads

numeric of length 1 denoting how far above the offer and below the bid a trade price is allowed to be. Trades are filtered out if they are MORE THAN nSpreads * spread above the offer or below the bid.
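The filter rule can be written out as a minimal base R sketch. The function flagOutliers and the toy prices are hypothetical and only restate the rule described for nSpreads; the spread is assumed to be offer minus bid, and this is not the package's own implementation.

```r
# Hypothetical helper (not part of highfrequency): TRUE marks a trade that
# lies MORE THAN nSpreads * spread above the offer or below the bid.
flagOutliers <- function(price, bid, offer, nSpreads = 1) {
  spread <- offer - bid  # assumption: spread is offer minus bid
  price > offer + nSpreads * spread | price < bid - nSpreads * spread
}

price <- c(100.5, 102.5, 98.5, 101.5)
bid   <- rep(100, 4)
offer <- rep(101, 4)

flagsDefault <- flagOutliers(price, bid, offer)                # FALSE TRUE TRUE FALSE
flagsWide    <- flagOutliers(price, bid, offer, nSpreads = 2)  # all FALSE
```

A larger nSpreads widens the tolerated band around the quotes, so fewer trades are removed.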

BFM

a logical determining whether to conduct "Backwards - Forwards matching" of trades and quotes. The algorithm tries to match trades that fall outside the bid - ask spread. It first tries to match them to quotes in a small window forwards and, if this fails, to quotes in a bigger window backwards. The small forwards window is a tolerance for inaccuracies in the timestamps of bids and asks. The backwards window allows for matching of late-reported trades, e.g. block trades.

backwardsWindow

a numeric denoting the length of the backwards window used when BFM = TRUE. Default is 3600, corresponding to one hour.

forwardsWindow

a numeric denoting the length of the forwards window used when BFM = TRUE. Default is 0.5, corresponding to one half second.
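The order of the two searches can be sketched in base R. This is a simplified illustration under stated assumptions, not the algorithm used by rmTradeOutliersUsingQuotes: a trade counts as matched when its price lies within the bid and ask of some quote in the window, and the function classifyTrade and all data below are hypothetical.

```r
# Hypothetical helper (not part of highfrequency). For a trade at time t with
# price p that falls outside the prevailing bid - ask, look forwards first,
# then backwards.
classifyTrade <- function(t, p, qTime, bid, ask,
                          backwardsWindow = 3600, forwardsWindow = 0.5) {
  inside <- p >= bid & p <= ask  # assumption: a match means bid <= p <= ask
  if (any(inside & qTime > t & qTime <= t + forwardsWindow)) return("forward")
  if (any(inside & qTime <= t & qTime >= t - backwardsWindow)) return("backward")
  "unmatched"
}

# Toy quotes: time in seconds, bid and ask per quote update
qTime <- c(0, 10, 20, 20.3)
bid   <- c(100, 105, 101, 103)
ask   <- c(101, 106, 102, 104)

tradeTime  <- c(20.1, 15, 15)
tradePrice <- c(103.5, 100.5, 90)

result <- mapply(classifyTrade, tradeTime, tradePrice,
                 MoreArgs = list(qTime = qTime, bid = bid, ask = ask))
```

Here the first trade is matched to the quote arriving 0.2 seconds later, the second to the quote from 15 seconds earlier, and the third remains unmatched.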

plot

a logical denoting whether to visualize the forwards, backwards, and unmatched trades in a plot. Passed on to rmTradeOutliersUsingQuotes.

Details

In case you supply the arguments tData and qData, the on-disk functionality is ignored and the function returns cleaned trades as a data.table or xts object (see examples).

When using the on-disk functionality with tradeDataSource and quoteDataSource pointing to the same folder, all files in that folder whose names contain 'quote' are treated as quote files, and the remaining files are treated as trade data.
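The naming convention for a shared folder can be sketched as follows. The file names are hypothetical, and the split is shown on a plain character vector rather than a real directory listing; it illustrates the rule only and is not the package's code.

```r
# Hypothetical file names in a folder used as both trade and quote source
files <- c("2018-01-02trades.rds", "2018-01-02quotes.rds",
           "2018-01-03trades.rds", "2018-01-03quotes.rds")

# Files whose names contain "quote" are taken as quote data, the rest as trades
isQuote    <- grepl("quote", files, fixed = TRUE)
quoteFiles <- files[isQuote]
tradeFiles <- files[!isQuote]
```

Under this rule, a trade file whose name happens to contain 'quote' would be picked up as quote data, so file names in a shared folder should be chosen accordingly.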

Value

For each day an xts object is saved into the folder of that date, containing the cleaned data.

Author(s)

Jonathan Cornelissen, Kris Boudt, Onno Kleen, and Emil Sjoerup.

References

Barndorff-Nielsen, O. E., Hansen, P. R., Lunde, A., and Shephard, N. (2009). Realized kernels in practice: Trades and quotes. Econometrics Journal, 12, C1-C32.

Brownlees, C.T., and Gallo, G.M. (2006). Financial econometric analysis at ultra-high frequency: Data handling concerns. Computational Statistics & Data Analysis, 51, 2232-2245.

Christensen, K., Oomen, R. C. A., and Podolskij, M. (2014). Fact or friction: Jumps at ultra high frequency. Journal of Financial Economics, 114, 576-599.

Examples

# Consider you have raw trade data for 1 stock for 2 days
## Not run: 
library(highfrequency) # provides the sample data and the cleanup functions
tDataAfterFirstCleaning <- tradesCleanup(tDataRaw = sampleTDataRaw, 
                                          exchanges = "N", report = FALSE)
qData <- quotesCleanup(qDataRaw = sampleQDataRaw, 
                       exchanges = "N", report = FALSE)
dim(tDataAfterFirstCleaning)
tDataAfterFinalCleaning <- 
  tradesCleanupUsingQuotes(qData = qData[as.Date(DT) == "2018-01-02"],
                           tData = tDataAfterFirstCleaning[as.Date(DT) == "2018-01-02"])
dim(tDataAfterFinalCleaning)

## End(Not run)
# In case you have more data it is advised to use the on-disk functionality
# via the "tradeDataSource", "quoteDataSource", and "dataDestination" arguments

highfrequency documentation built on Oct. 4, 2023, 5:08 p.m.