Computes the initial cluster assignment by combining nearest-neighbor-based noise detection with agglomerative hierarchical clustering based on maximum likelihood criteria for Gaussian mixture models.
A numeric vector, matrix, or data frame of observations. Rows correspond
to observations and columns correspond to variables. Categorical
An integer specifying the number of clusters.
An integer specifying the number of considered nearest neighbors per point used for the denoising step (see Details).
A number in (0,1) which defines the proportion of points
initialized as noise. Typically
A character string indicating the covariance model to be used. Possible models are:
The initialization is discussed in detail in Coretto and Hennig (2016). Two
steps are performed:
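Denoising step: for each data point compute its k-th nearest neighbor distance (k-NND). All points whose k-NND is larger than the (1 - proportion)-quantile of the k-NND values, where the proportion is the noise proportion argument described above, are initialized as noise. The interpretation of k is that (k-1), but not k, points close together may still be interpreted as noise or outliers.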
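The denoising rule can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper name knn_noise and its argument trim are hypothetical, Euclidean distance is assumed, and the full distance matrix is computed, which is only practical for small data sets.

```r
# Hypothetical helper (not part of any package): flags points whose
# k-th nearest neighbor distance exceeds the (1 - trim)-quantile.
knn_noise <- function(x, k = 3, trim = 0.5) {
  x <- as.matrix(x)
  d <- as.matrix(dist(x))  # pairwise Euclidean distances (assumption)
  # After sorting a row, position 1 is the zero distance to the point
  # itself, so position k + 1 is the distance to the k-th neighbor.
  knnd <- apply(d, 1, function(row) sort(row)[k + 1])
  cutoff <- quantile(knnd, probs = 1 - trim)
  knnd > cutoff
}

set.seed(42)
x <- rbind(matrix(rnorm(200), ncol = 2),           # one compact group
           matrix(runif(20, -10, 10), ncol = 2))   # scattered points
noise <- knn_noise(x, k = 3, trim = 0.1)
table(noise)
```

With k = 3, a tight group of three stray points can still be flagged, because each of them has only two close neighbors and its third-nearest neighbor is far away.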
Clustering step: perform the model-based hierarchical clustering (MBHC)
proposed in Fraley (1998). This step is performed using
hc, and the input argument
modelName is passed
to hc. See the Details section of
hc for more information.
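As a rough sketch of the clustering step, the following uses hc and hclass from the mclust package, which is assumed to be installed. The noise flags here are invented for illustration, and clustering only the points not flagged as noise is an assumption about how the two steps are combined, not something confirmed by the text above.

```r
# Sketch only; assumes the mclust package is installed.
library(mclust)

set.seed(1)
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 6), ncol = 2))

# Hypothetical outcome of a denoising step: two points flagged as noise.
is_noise <- rep(FALSE, nrow(x))
is_noise[c(1, 51)] <- TRUE

# Model-based hierarchical clustering of the remaining points,
# with the covariance model selected through modelName.
tree <- hc(x[!is_noise, , drop = FALSE], modelName = "VVV")

# Cut the tree at G = 2 clusters; label 0 is reserved for noise.
labels <- integer(nrow(x))
labels[!is_noise] <- as.integer(hclass(tree, G = 2)[, 1])
table(labels)
```

The resulting vector follows the convention of the returned value: positive integers for clusters and 0 for noise/outliers.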
An integer vector specifying the initial cluster assignment, with
0 denoting noise/outliers.
Fraley, C. (1998). Algorithms for model-based Gaussian hierarchical clustering. SIAM Journal on Scientific Computing 20:270-281.
Coretto, P. and C. Hennig (2017). Consistency, breakdown robustness, and algorithms for robust improper maximum likelihood clustering. arXiv preprint available at arXiv:1309.6895.