Another methodology, which uses hierarchical agglomerative clustering (HAC) to generate starting values, is proposed by @Cluster. The daily order imbalance $\orderimb, \daysym = 1, \dots, \totaldays$, serves as the criterion for assigning the trading days to three clusters representing no-news, good-news and bad-news trading days. HAC is a bottom-up clustering technique: when the algorithm starts, each order imbalance $\orderimb$ forms a cluster of its own. For example, if trading data for one quarter of a year is used to estimate the probability of informed trading, roughly 60 clusters exist when the algorithm is initialized.
@Cluster use complete-linkage clustering to sequentially merge the small clusters into bigger ones. In each step, the two clusters with the shortest distance are combined. The definition of this distance is what distinguishes the available agglomerative clustering methods.^[@Cluster mention that they tested different agglomerative clustering methods but that the complete-linkage method performed marginally better than the others. For a description of the other available methods, e.g. single-linkage or centroid-linkage, see @ClusterAna.] In complete-linkage clustering, also called farthest-neighbour clustering, the distance between two clusters is the distance between their two most distant elements, one taken from each cluster. In each step, the pair of clusters with the smallest such distance is merged.
To be precise, in complete-linkage clustering the distance $\operatorname{D}(X, Y)$ between two clusters $X$ and $Y$ can be written as $$ \begin{align} \operatorname{D}(X,Y) = \underset{x \in X, y \in Y}{\max} d(x, y), \end{align} $$ where $d(x, y)$ is the distance between the cluster elements $x \in X$ and $y \in Y$. @Cluster use the Euclidean norm as the measure for $d(x,y)$.
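The distance above can be illustrated with a minimal R sketch. The helper `complete_linkage` and the numbers in `cluster_x` and `cluster_y` are hypothetical and only serve as an example; for scalar observations such as daily order imbalances, the Euclidean norm reduces to the absolute difference.

```r
# Complete-linkage distance D(X, Y) between two clusters of scalar observations.
# Hypothetical helper: for scalars, d(x, y) = |x - y|.
complete_linkage <- function(x, y) {
  # outer() builds the matrix of all pairwise differences x_i - y_j;
  # the largest absolute difference is the complete-linkage distance.
  max(abs(outer(x, y, "-")))
}

# Hypothetical daily order imbalances assigned to two clusters
cluster_x <- c(-12, -5, 3)
cluster_y <- c(40, 55)

d_xy <- complete_linkage(cluster_x, cluster_y)
print(d_xy)
```

The value is driven by the two elements that lie farthest apart, here the smallest element of `cluster_x` and the largest element of `cluster_y`.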
The following steps describe how to use the clustering algorithm to generate initial values for the parameters in the EHO model [see @Cluster, p. 1809].
The `R` function `hclust` can be used to perform this task [see @fastcluster].
Stop the algorithm when there are three clusters left.
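The clustering itself might be sketched in R as follows. This is an illustration under stated assumptions, not the implementation of @Cluster: the order imbalances are simulated, base R's `stats::hclust` is used (the `fastcluster` package offers a function of the same name), and labelling the three clusters by their mean order imbalance is an assumption made here for demonstration.

```r
set.seed(1)

# Hypothetical daily order imbalances for 60 trading days (simulated, not real data)
order_imb <- c(rnorm(30, mean = 0,    sd = 10),
               rnorm(15, mean = 150,  sd = 10),
               rnorm(15, mean = -150, sd = 10))

# Pairwise Euclidean distances; every trading day starts as its own cluster
d <- dist(order_imb, method = "euclidean")

# Agglomerative clustering with complete linkage
tree <- hclust(d, method = "complete")

# Cut the tree at the point where three clusters are left
cluster_id <- cutree(tree, k = 3)

# Assumption: order the clusters by their mean order imbalance to label them
cluster_means <- tapply(order_imb, cluster_id, mean)
labels <- c("bad-news", "no-news", "good-news")[rank(cluster_means)]
names(labels) <- names(cluster_means)

# Map every trading day to the label of its cluster
day_type <- unname(labels[as.character(cluster_id)])
print(table(day_type))
```

Cutting the tree with `cutree(tree, k = 3)` corresponds to stopping the bottom-up merging once three clusters remain.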