EHO Model

This section describes the static model for estimating the probability of informed trading developed by @EHO. Arrival rates of buys and sells as well as probability parameters are assumed to be constant over the whole range spanned by the data. Therefore the usage of this models do not enable to estimate the probability of informed trading on a daily basis.

The sequence of trading days is assumed to be discrete and independent, whereas the time during a trading day is continuous. Conditions of trading days are not observable and determined by nature before the market opening. Information events, days on which private, price-relevant information enter the market, occur with probability $\probinfevent$. These are good information with probability $1 - \probbadnews$ and news with negative direction with probability $\probbadnews$. Hence, the probabilities of no-news, good-news and bad-news days are given by: $$ \begin{align} \prsym(\nonews) &= 1 - \probinfevent \ \prsym(\goodnews) &= \probinfevent ( 1 - \probbadnews) \ \prsym(\badnews) &= \probinfevent \probbadnews \end{align} $$ Furthermore, buys and sells are supposed to follow latent independent Poisson processes with constant intensities. A Poisson process is a point process which is often defined on the positive line. According to @daley2003,

we shall understand by a point process some method of randomly allocating points to the real line.

However, in terms of arrivals of buys and sells, we assume the Poisson processes in the static $\pintext$ models to be defined on the positive half-line.^[ Poisson processes are directly related to the corresponding discrete Poisson distribution which is defined by the following cumulative distribution function (cdf) and probability mass fun $$ \begin{align} \text{cdf}: \:\: F\left(\lambda;n\right) = \exp\left(-\lambda\right) \sum\limits_{k=0}^n \dfrac{\lambda^k}{k!} \quad \text{and} \quad \text{pmf}: \:\: f\left(\lambda;n\right) = \exp\left(-\lambda\right) \dfrac{\lambda^n}{n!} \notag ; \:\: \text{with}\:\: x \in \mathbb{R}^+ \end{align} $$ with the number of arrivals $n$, the intensity parameter $\lambda \in \mathbb{R}^+$ and $n \in \mathbb{N}0$. The mean and variance of the Poisson distribution are equal and given by the parameter $\lambda$.] Waiting times or interarrival times between two consecutive buys or sells are exponentially distributed.^[ The exponential distribution is a continuous distribution which has cdf and probability density function (pdf) given below, $$ \begin{align} \text{cdf}: \:\: F\left(\lambda;x\right) = 1 - \exp\left(-\lambda x\right) \quad \text{and} \quad \text{pdf}: \:\: f\left(\lambda;x\right) = \lambda \exp\left(-\lambda x\right) \notag; \:\: \text{with}\:\: x \in \mathbb{R}^+ \end{align} $$ with rate parameter $\lambda$.] On information events, the intensity of the Poisson process either for buys or sells is increased by a positive constant parameter, depending on the direction of information. The arrivals of transactions, which are indeed observable, can be interpreted as a merging of both latent point processes.^[ The merging of two homogeneous independent Poisson processes again yields a Poisson process, e.g. see @firstcourse.] The observable arrivals of transactions $N{O}$ are the outcome of a competition between the latent Poisson processes for buys and sells, $N_{B}$ and $N_{S}$, which is the first to arrive. The waiting times of the latent Poisson processes determine the direction of the next trade. Assuming that the current waiting time of the buys' point process is less than the sells' interarrival time, the observed transaction will be buyer-initiated and $N_{B}$ increases by 1 as well as $N_{O}$, whereas $N_{S}$ remains unchanged.^[For most exchanges no data is available about the direction of transactions. Due to this reason algorithms like Lee & Ready are used to try to detect if a transaction is buyer- or seller-initiated [see @LeeReady].] After observing a transaction the waiting times of both latent point processes are reset and the race of buys and sells process begins anew.

For each of the Poisson processes of uninformed buys and uninformed sells a unique intensity is assumed. The expected number of uninformed buys equals $\intensuninfbuys$, whereas the expected amount of uninformed sells is $\intensuninfsells$. Informed buys and sells appear with rate $\intensinf$.^[ All model parameters representing trading intensities are assumed to be positive real numbers.] According to equation \ref{eq:pingeneral} the probability of informed trading in the EHO model can be calculated as $$ \begin{align}\label{eq:pineho} \pintext &= \dfrac{\probinfevent \probbadnews \intensinf + \probinfevent (1 - \probbadnews) \intensinf} {(1 - \probinfevent)(\intensuninfbuys + \intensuninfsells) + \probinfevent \probbadnews (\intensuninfbuys + \intensuninfsells) + \probinfevent (1 - \probbadnews) (\intensuninfbuys + \intensuninfsells) + \probinfevent \probbadnews \intensinf + \probinfevent (1 - \probbadnews) \intensinf} \notag \ &= \dfrac{\probinfevent \intensinf}{\intensuninfbuys + \intensuninfsells + \probinfevent \intensinf}, \end{align} $$ where $\pintext$ and the model parameters are constant over the whole range of the underlying data.

On information events the probability for an arrival of buys or sells increases depending on the direction of the private information. Informed buyers (sellers) enter the market if they receive a positive (negative) signal. The intensity of the point process for buys (sells) increases by the positive parameter $\intensinf$ whereas the rate for sells (buys) remains on the level for no-news days.

The following scenario tree illustrates the probabilities for the potential states a trading day in the $\pintext$ framework can reside in. In addition, the mapping of the sets of arrival rates for buys and sells to the different trading days' conditions can be read directly from the graph.

EHO Scenario Tree

For deriving the (log) likelihood function in the EHO setting we can utilize standard theory for homogeneous Poisson processes. A Poisson process on the (positive) real line is completely defined by the following equation [see @daley2003, p. 19], $$ \begin{align} \label{eq:poissonprocess} \prsym\left{\intevent{a_i}{b_i} = n_i, i = 1, \dots, k\right} = \prod\limits_{i=1}^k \dfrac{\left[\lambda\left(b_i - a_i\right)\right]^{n_i}}{n_i!} \exp\left(-\lambda\left(b_i - a_i\right)\right), \end{align} $$ where $\intevent{a_i}{b_i}$ represents the number of arrivals lying in the right-bounded, left-open interval $\left(a_i, b_i\right]$ with $a_i < b_i \leq a_{i+1}$. The term $\lambda\left(b_i - a_i\right)$ can be interpreted as the expected number of arrivals happening in $\left(a_i, b_i\right]$.

It is common practice in the $\pintext$ literature to specify the intensities of Poisson processes for buys and sells, $\intensuninfbuys$, $\intensuninfsells$ and $\intensinf$, as expected arrivals per trading day. Hence, the half-bounded interval $\left(a_i, b_i\right]$ in equation \ref{eq:poissonprocess} simplifies to an interval of length 1 and we can adopt equation \ref{eq:poissonprocess} for usage in the EHO model for a trading day $\daysym$, $$ \begin{align} \prsym\left{N_{\daysym} = n\right} = \dfrac{\lambda^{n}}{n!} \exp\left(-\lambda\right), \label{eq:poissonprocessekop} \end{align} $$ with the buys' or sells' Poisson process $N_{\daysym}$, the number of buys or sells $n \in \mathbb{N}_0$ on trading day $\daysym$ and $\lambda$ displays the arrival rate of buys or sells, respectively.

The different types of trading trades must be taken into account. If on trading day $\daysym$ no private information hit the market, it is free from information-based traders. According to equation \ref{eq:poissonprocessekop} the notation of the separated probabilities of observing $B_{\daysym}$ buys and $S_{\daysym}$ sells on a no-news day $\daysym$ is straightforward. Hence, due to the independence of the latent Poisson processes, the probability of observing a tuple of $B_{\daysym}$ buys and $S_{\daysym}$ sells can be written as product of the single probabilities, $$ \begin{align} \label{eq:probseqno} \underbrace{\exp\left(-\intensuninfbuys \right)\dfrac{\intensuninfbuys^{B_{\daysym}}}{B_{\daysym}!}}{\text{Buys}} \underbrace{\exp\left(-\intensuninfsells \right)\dfrac{\intensuninfsells^{S{\daysym}}}{S_{\daysym}!}}{\text{Sells}}. \end{align} $$ On a good-news day $\daysym$ informed traders are active and buy equities which yields to an increase in the arrivals of buyer-initiated transactions captured by parameter $\intensinf$, $$ \begin{align} \label{eq:probseqgo} \underbrace{\exp\left(-(\intensuninfbuys + \intensinf) \right)\dfrac{(\intensuninfbuys + \intensinf)^{B{\daysym}}}{B_{\daysym}!}}{\text{Buys}} \underbrace{\exp\left(-\intensuninfsells \right)\dfrac{\intensuninfsells^{S{\daysym}}}{S_{\daysym}!}}{\text{Sells}}. \end{align} $$ Likewise to good-news days, the intensity of seller-initiated trades increase on a bad-news day $\daysym$, $$ \begin{align} \label{eq:probseqbad} \underbrace{\exp\left(-\intensuninfbuys \right)\dfrac{\intensuninfbuys^{B{\daysym}}}{B_{\daysym}!}}{\text{Buys}} \underbrace{\exp\left(-(\intensuninfsells + \intensinf) \right)\dfrac{(\intensuninfsells + \intensinf)^{S{\daysym}}}{S_{\daysym}!}}{\text{Sells}}. \end{align} $$ The likelihood function of observing a sequence of $B{\daysym}$ buys and $S_{\daysym}$ sells on a trading day $\daysym$ can now be formulated, using equations \ref{eq:probseqno} - \ref{eq:probseqbad}, as weighted sum of these condition-specific probabilities. Thus, the joint density of observing $B_{\daysym}$ buys and $S_{\daysym}$ sells on a trading day $\daysym$ can be formulated as $$ \begin{align} \label{eq:dailylikelihoodEKOP} \likelihood\left( \thetaeho \mid (B_{\daysym},S_{\daysym}) \right) &= \left(1-\probinfevent\right)\left(\exp\left(-\intensuninfbuys \right)\dfrac{\intensuninfbuys^{B_{\daysym}}}{B_{\daysym}!} \exp\left(-\intensuninfsells \right)\dfrac{\intensuninfsells^{S_{\daysym}}}{S_{\daysym}!}\right) \notag \ & + \probinfevent \left(1-\probbadnews\right) \left(\exp\left(-(\intensuninfbuys + \intensinf) \right)\dfrac{(\intensuninfbuys + \intensinf)^{B_{\daysym}}}{B_{\daysym}!} \exp\left(-\intensuninfsells \right)\dfrac{\intensuninfsells^{S_{\daysym}}}{S_{\daysym}!}\right) \notag \ & + \probinfevent \probbadnews \left(\exp\left(-\intensuninfbuys \right)\dfrac{\intensuninfbuys^{B_{\daysym}}}{B_{\daysym}!} \exp\left(-(\intensuninfsells + \intensinf) \right)\dfrac{(\intensuninfsells + \intensinf)^{S_{\daysym}}}{S_{\daysym}!}\right). \end{align} $$ Utilizing the independence of trading days, the probability of observing $\datasymbs = \left(B_d, S_d\right){d = 1}^{\totaldays}$ for $\daysym = 1, \dots, \totaldays$ trading days can be written as product of daily likelihoods, $$ \begin{align} \label{eq:likelihoodEKOP} \likelihood\left(\thetaehoshort \mid \datasymbs \right) = \prod\limits{d=1}^{\totaldays} \likelihood\left(\thetaehoshort \mid (B_d,S_d)\right). \end{align} $$ Hence, the log likelihood function for a total of $\totaldays$ trading days can be formulated as $$ \begin{align} \label{eq:loglikelihoodEKOP} \log \likelihood\left(\thetaehoshort \mid \datasymbs \right) = \sum\limits_{d=1}^{\totaldays} \log \likelihood\left(\thetaehoshort \mid (B_{\daysym},S_{\daysym})\right), \end{align} $$ which yields the concrete notation, $$ \begin{align} \log \likelihood\left(\thetaehoshort \mid \datasymbs \right) = &\sum\limits_{d=1}^{\totaldays} \log \Biggl( \left(1-\probinfevent\right) \left(\exp\left(-\intensuninfbuys \right)\dfrac{\intensuninfbuys^{B_{\daysym}}}{B_{\daysym}!} \exp\left(-\intensuninfsells \right)\dfrac{\intensuninfsells^{S_{\daysym}}}{S_{\daysym}!}\right) \notag \ & + \probinfevent \left(1-\probbadnews\right) \left(\exp\left(-(\intensuninfbuys + \intensinf) \right)\dfrac{(\intensuninfbuys + \intensinf)^{B_{\daysym}}}{B_{\daysym}!} \exp\left(-\intensuninfsells \right)\dfrac{\intensuninfsells^{S_{\daysym}}}{S_{\daysym}!}\right) \notag \ & + \probinfevent\probbadnews \left(\exp\left(-\intensuninfbuys \right)\dfrac{\intensuninfbuys^{B_{\daysym}}}{B_{\daysym}!} \exp\left(-(\intensuninfsells + \intensinf) \right)\dfrac{(\intensuninfsells + \intensinf)^{S_{\daysym}}}{S_{\daysym}!}\right) \Biggr). \label{eq:loglikelihoodEHOconcrete} \end{align} $$

The formulation shown in equation \ref{eq:loglikelihoodEHOconcrete} is very inefficient in terms of computation time and often raises overflow errors.^[ Overflow errors occur if the calculated number is too big in magnitude so that it can not be longer represented by the machine/software.] In computations of equation \ref{eq:loglikelihoodEHOconcrete} factorials of daily buys and sells need to be evaluated. Since the number of daily buys and sells can easily exceed values of several hundreds or thousands, calculations can easily get infeasible. Even if the number of daily buys or sells is small enough leading to computable factorial terms, finite values of the likelihood are not ensured. Additionally, the terms $\intensuninfbuys^{B_{\daysym}}$, $\intensuninfsells^{S_{\daysym}}$, $(\intensuninfbuys + \intensinf)^{B_{\daysym}}$ and $(\intensuninfsells + \intensinf)^{S_{\daysym}}$ are potential sources of overflow errors. Furthermore, single terms may be finite but products of those may not. In contrast to overflow errors induced by large values for daily buys and sells, the exponential terms in the likelihood function my introduce underflow errors,^[ Underflow errors occur if the number to represent is too small and vanishes to zero.] i.e. $\exp\left(-\intensuninfbuys \right)$, $\exp\left(-\intensuninfsells \right)$, $\exp\left(-(\intensuninfbuys + \intensinf) \right)$ and $\exp\left(-(\intensuninfsells + \intensinf) \right)$.

Since the EKOP model is nested in the EHO setting for equal intensities of uninformed buys and sells ($\intensuninfbuys = \intensuninfsells$), we forego to explain the simpler model structure.



Try the pinbasic package in your browser

Any scripts or data that you put into this service are public.

pinbasic documentation built on May 2, 2019, 2:07 a.m.