Dataset_HFT300: The high frequency trading data.


Description

This dataset is a random subset of a high-frequency trading dataset used to assess the performance of recurrent neural networks (RNNs) for prediction (Dixon, 2017).

Usage

Dataset_HFT300

Format

A dataset with 300 observations, each a sequence of length 10, with a single sequence per row.

The y data holds the class labels: -1, 0, or 1.

The x data holds the numeric time series sequences.
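The layout described above can be sketched with a synthetic stand-in, so the example runs without the package installed. The list structure with elements `x` and `y` is an assumption drawn from the wording of this section, and the values are random rather than real trading data.

```r
# Synthetic stand-in with the shape given in Format.
# ASSUMPTION: the data is a list with elements `x` and `y`; values are random.
set.seed(1)
hft_like <- list(
  x = matrix(runif(300 * 10), nrow = 300, ncol = 10),  # one length-10 sequence per row
  y = sample(c(-1, 0, 1), size = 300, replace = TRUE)  # one label per row
)

dim(hft_like$x)     # number of rows (observations) and columns (sequence length)
table(hft_like$y)   # how many observations fall in each class
```

The same two calls, `dim()` and `table()`, are a quick way to check the shape and class balance of the packaged data once it is loaded.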

Details

The feature represents the instantaneous liquidity imbalance, measured as the ratio of the best bid to the best ask. The labels represent the next-event mid-price movement: Y=1 is an up-tick, Y=-1 is a down-tick, and Y=0 is no movement. The sequence length is set to 10. In this package, the class 1 and class -1 observations are randomly selected to yield 12 non-zero observations, while class 0 has 288 observations. Observations are ordered chronologically.
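As a rough illustration of how such sequences and labels could be built, the sketch below works on a made-up order book of 12 events. It is not the procedure used by Dixon (2017) or by this package: the variable names, the toy numbers, and the reading of "best bid to ask ratio" as best-bid size divided by best-ask size are all assumptions.

```r
# Toy order-book snapshots (made-up numbers, not taken from the dataset).
bid_size <- c(5, 8, 6, 9, 4, 7, 3, 6, 8, 5, 7, 9)
ask_size <- c(6, 4, 6, 3, 8, 7, 9, 5, 4, 6, 5, 3)
mid      <- c(100, 100, 101, 101, 100, 100, 100, 101, 101, 101, 100, 101)

# ASSUMPTION: liquidity imbalance taken as best-bid size over best-ask size.
imbalance <- bid_size / ask_size

len   <- 10                        # sequence length, as stated in Details
n_seq <- length(imbalance) - len   # each window needs one following event for its label

# One row per window of `len` consecutive imbalance values.
x <- t(sapply(seq_len(n_seq), function(i) imbalance[i:(i + len - 1)]))

# Label: sign of the mid-price change at the event right after the window
# (1 = up-tick, -1 = down-tick, 0 = no movement).
y <- sapply(seq_len(n_seq), function(i) sign(mid[i + len] - mid[i + len - 1]))

dim(x)
y
```

With 12 events and a window of 10, only two sequences can be formed; a real limit order book feed would yield one sequence per event after the first ten.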

References

Matthew Dixon (2017). Sequence Classification of the Limit Order Book using Recurrent Neural Networks. arXiv:1707.05642.


lweicdsor/OSTSC documentation built on May 8, 2019, 1:13 p.m.