leadLag: Lead-Lag estimation
In jonathancornelissen/highfrequency: Tools for Highfrequency Data Analysis


Description

Function that estimates whether one series leads (or lags) another.

Let X_{t} and Y_{t} be two observed price series over the time interval [0,1].

For every integer k \in \mathbb{Z}, we form the shifted time series

Y_{\left(k+i\right)/n}, \quad i = 1, 2, \ldots

For an interval H=\left(\underline{H},\overline{H}\right] and \vartheta\in\Theta, define the shifted interval H_{\vartheta}=H+\vartheta=\left(\underline{H}+\vartheta,\overline{H}+\vartheta\right], and let

X\left(H\right)_{t}=\int_{0}^{t}1_{H}\left(s\right)\textrm{d}X_{s}

This is abbreviated as:

X\left(H\right)=X\left(H\right)_{T+\delta}=\int_{0}^{T+\delta}1_{H}\left(s\right)\textrm{d}X_{s}

Then the shifted Hayashi-Yoshida (HY) contrast function is:

\tilde{\vartheta}\rightarrow U^{n}\left(\tilde{\vartheta}\right)= \\
1_{\tilde{\vartheta}\geq 0}\sum_{I\in{\cal{I}},J\in{\cal{J}},\overline{I}\leq T}X\left(I\right)Y\left(J\right)1_{\left\{ I\cap J_{-\tilde{\vartheta}}\neq\emptyset\right\}} \\
+1_{\tilde{\vartheta}<0}\sum_{I\in{\cal{I}},J\in{\cal{J}},\overline{J}\leq T}X\left(I\right)Y\left(J\right)1_{\left\{ J\cap I_{\tilde{\vartheta}}\neq\emptyset\right\}}

This contrast function is then calculated for each lag passed in the `lags` argument.
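
As a rough illustration of the contrast function above, the following base-R sketch evaluates it naively for one lag at a time. The function name `hyContrast` and the simulated data are hypothetical and not part of the package. The sketch ignores the truncation at T and any normalization, and it makes no claim about how the package's own C++ code is organized. It relies on the fact that two half-open intervals (a, b] and (c, d] intersect exactly when a < d and c < b.

```r
# Naive O(|I| * |J|) evaluation of the shifted HY contrast for a single lag.
# tx, x: observation times and prices of X; ty, y: the same for Y.
# theta: candidate lag, in the same units as the time stamps.
# Assumption: all intervals lie inside the horizon, so the truncation
# conditions on T in the formula are not applied here.
hyContrast <- function(tx, x, ty, y, theta) {
  dx <- diff(x)                      # increments X(I)
  dy <- diff(y)                      # increments Y(J)
  ILow <- head(tx, -1)               # I = (ILow, IUp]
  IUp  <- tail(tx, -1)
  JLow <- head(ty, -1) - theta       # J shifted by -theta
  JUp  <- tail(ty, -1) - theta
  # Rows index I, columns index the shifted J; TRUE where they intersect
  overlap <- outer(ILow, JUp, "<") & outer(IUp, JLow, ">")
  sum(outer(dx, dy) * overlap)
}

# Simulated data (illustrative only): Y is X observed two time units later
set.seed(1)
tx <- 0:50
x  <- cumsum(rnorm(51))
ty <- tx + 2
y  <- x

lags <- -5:5
contrasts <- sapply(lags, function(th) hyContrast(tx, x, ty, y, th))
bestLag <- lags[which.max(abs(contrasts))]
bestLag
```

Because Y is simply X observed two time units later, the intervals line up one-to-one at a lag of 2. The contrast there equals the sum of squared increments of X, which by the Cauchy-Schwarz inequality bounds the contrast at every other lag in the grid, so `bestLag` is 2.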

Usage

leadLag(
  price1 = NULL,
  price2 = NULL,
  lags = NULL,
  resolution = "seconds",
  normalize = TRUE,
  parallelize = FALSE,
  nCores = NA
)

Arguments

`price1`	`xts` or `data.table` containing prices in levels. For a `data.table`, use a column `DT` for the date-time in POSIXct format and a column `PRICE` for the price.
`price2`	`xts` or `data.table` containing prices in levels. For a `data.table`, use a column `DT` for the date-time in POSIXct format and a column `PRICE` for the price.
`lags`	a numeric vector of the lags (in units of `resolution`) to be tested as leading or lagging.
`resolution`	the resolution at which the lags are measured. The default is "seconds". Set it to "ms", "milliseconds", or "milli-seconds" to measure lags in milliseconds, a 1000 times finer resolution.
`normalize`	logical denoting whether the contrasts should be normalized by the product of the L2 norms of both price series. Default = TRUE. This does not change the value of the lead-lag-ratio.
`parallelize`	logical denoting whether to use a parallelized version of the C++ code (parallelized using OpenMP). Default = FALSE.
`nCores`	integer-valued numeric denoting how many cores to use for the lead-lag estimation procedure when `parallelize` is TRUE. Default is NA, which does not parallelize the code.

Details

The lead-lag-ratio (LLR) can be used to see if one asset leads the other. If LLR < 1, price1 may be leading price2; if LLR > 1, price2 may be leading price1.

Value

A list with class leadLag which contains contrasts, lead-lag-ratio, and lags, denoting the estimated values for each lag calculated, the lead-lag-ratio, and the tested lags respectively.
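
The sketch below shows one way to read such a result. It assumes the list elements carry exactly the names given above (`contrasts`, `lead-lag-ratio`, `lags`). The object `res` is a hand-built stand-in with made-up numbers, not output from `leadLag`. Since `lead-lag-ratio` is not a syntactic R name, it has to be accessed with `[[ ]]` or backticks.

```r
# Hypothetical stand-in for a leadLag result; the values are invented
# purely to demonstrate element access.
res <- list(
  contrasts = c(0.10, 0.40, 0.20),
  "lead-lag-ratio" = 0.5,
  lags = c(-1, 0, 1)
)

# Non-syntactic names need [[ ]] or backticks
llr <- res[["lead-lag-ratio"]]
sameLlr <- res$`lead-lag-ratio`

# Lag with the largest contrast in absolute value
bestLag <- res$lags[which.max(abs(res$contrasts))]
```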

References

Hoffmann, M., Rosenbaum, M., and Yoshida, N. (2013). Estimation of the lead-lag parameter from non-synchronous data. Bernoulli, 19(2), 426-461.

Examples

## Not run: 
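# Toy example to show the usage
library(highfrequency)
# Spread the trade prices into one column per symbol
spread <- spreadPrices(sampleMultiTradeData[SYMBOL %in% c("ETF", "AAA")])
# Estimate the contrasts for lags from -15 to 15 seconds
llEmpirical <- leadLag(price1 = spread[!is.na(AAA), list(DT, PRICE = AAA)],
                       price2 = spread[!is.na(ETF), list(DT, PRICE = ETF)],
                       lags = seq(-15, 15))
# Plot the estimated contrasts against the tested lags
plot(llEmpirical)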

## End(Not run)

jonathancornelissen/highfrequency documentation built on Jan. 10, 2023, 7:29 p.m.