Lin and Ke Factorization

An even more stable and accurate formulation of the likelihood function is presented in the work of @LinKe. The factorization is applicable even for heavily traded stocks. The effectiveness and stability of the likelihood is caused by two principles [see @LinKe, p. 629]:

To fortify the usefulness of these two principles the authors give the following example. Say, one intends to compute $\log\left(\exp(x)\exp(y) + \exp(z)\right)$ with $x = 800$, $y = -400$ and $z = 900$. A threshold for the inputs for exponential function lies at 710, meaning that any larger value gets the exponential function to overflow and return infinite results.

At first, $\exp(x)\exp(y)$ would lead to an overflow error due to the fact that one input for the exponential is bigger than the threshold (800 > 710). Taking the first principle into account, we can compute this expression with $\exp(x + y)$ which gives r exp(800 - 400). However, the expression $\exp(z)$ would still produce an infinite value and therefore the expression $\log\left(\exp(x + y) + \exp(z)\right)$ is still not computable. The second principle states that one should avoid large input values for the exponential and small positive input values for the logarithmic function. Hence, $m + \log\left(\exp(x + y - m) + \exp(z - m)\right)$ with $m = \max(x + y, z) = 900$ is a more stable and accurate expression. Irrelevant of the specified values for $x,y$ and $z$, one of the terms $x + y - m$ and $z - m$ is always zero, whereas the remaining term is always less than 0. This yields small input values for the exponential functions, which in turn always sum up to a value greater than 1. Thus, we have no small positive input values for the natural logarithm. Using $x,y$ and $z$ as specified before we can compute the expression $\log\left(\exp(x)\exp(y) + \exp(z)\right)$, incorporating the R function log1p, as $$ \begin{align} & m + \log\left(\exp(x + y - \max(x + y, z)) + \exp(z - m)\right) = \notag \ & 900 + \log\left(\exp(800 - 400 - \max(400, 900)) + \exp(900 - 900)\right) = \notag \ & 900 + \log\left(\exp(-500) + 1 \right) \notag \ & \text{with} \log\left(\exp(-500) + 1 \right) = r log1p(exp(400 - 900)) \notag \end{align} $$ Following the two principles mentioned by @LinKe, a stable and efficient formulation of the likelihood is given by $$ \begin{align} \label{eq:linkefactr5par} \log \likelihood\left(\thetaehoshort \mid \datasymbs \right) = &\sum\limits_{\daysym=1}^{\totaldays} \Biggl( -\intensuninfbuys - \intensuninfsells + B_{\daysym} \log \left(\intensinf + \intensuninfbuys\right) + S_{\daysym} \log \left(\intensinf + \intensuninfsells\right) + e_{\max, \daysym}\Biggr) \notag \ & + \sum\limits_{\daysym=1}^{\totaldays} \log \Biggl( \left(1-\probinfevent\right) \exp\left(e_{1,\daysym} - e_{\max, \daysym} \right) + \probinfevent \left(1-\probbadnews\right) \exp\left(e_{2,\daysym} - e_{\max, \daysym} \right) \notag \ & + \probinfevent\probbadnews \exp\left(e_{3,\daysym} - e_{\max, \daysym} \right) \Biggr), \end{align} $$ where $e_{1,\daysym} = -B_{\daysym}\log\left(1+\dfrac{\intensinf}{\intensuninfbuys}\right)-S_{\daysym}\log\left(1+\dfrac{\intensinf}{\intensuninfsells}\right)$, $e_{2,\daysym} = -\intensinf - S_{\daysym}\log\left(1 + \dfrac{\intensinf}{\intensuninfsells} \right)$, $e_{3,\daysym} = -\intensinf - B_{\daysym}\log\left(1 + \dfrac{\intensinf}{\intensuninfbuys} \right)$ and $e_{\max, \daysym} = \max\left(e_{1,\daysym}, e_{2,\daysym}, e_{3,\daysym} \right)$. Again, the constant term $-\log(B!S!)$ is dropped.

With the Lin-Ke formulation we have a stable methodology to estimate $\pintext$ even for heavily traded stocks. Besides its the stability, the Lin-Ke factorization also speeds up the likelihood computation. Since the previously presented factorization by @EHO2010 returns infeasible function values for very frequently traded stocks and EKOP is nested in EHO model, we must strongly recommend the usage of the likelihood formulation by @LinKe in combination with the extended EHO setup for optimization routines.



anre005/pinbasic documentation built on May 6, 2022, 4:40 a.m.