
Computes the initial cluster assignment by combining nearest-neighbor-based noise detection with agglomerative hierarchical clustering based on maximum likelihood criteria for Gaussian mixture models.


`data` |
A numeric vector, matrix, or data frame of observations. Rows correspond
to observations and columns correspond to variables. Categorical
variables and |

`G` |
An integer specifying the number of clusters. |

`k` |
An integer specifying the number of considered nearest neighbors per point
used for the denoising step (see |

`knnd.trim` |
A number in [0,1) which defines the proportion of points
initialized as noise. Typically |

`modelName` |
A character string indicating the covariance model to be used. Possible models are: |

The initialization is based on Coretto and Hennig (2017). First, two
steps are performed:

*Step 1 (denoising step):* for each data point, compute the distance to its
`k`-th nearest neighbor (`k`-NND). All points with a `k`-NND larger
than the (1-`knnd.trim`)-quantile of the `k`-NND
are initialized as noise. The interpretation of `k` is that `(k-1)`, but not `k`, points close
together may still be interpreted as noise or outliers.
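
The denoising step can be sketched in base R as follows. This is an illustration only, not the package's internal code: the toy data, the use of Euclidean distance via `dist`, the default `quantile` type, and the convention that a point does not count as its own neighbor are all assumptions.

```r
# Illustrative sketch of the denoising step; not the package's internal code.
set.seed(1)
x <- rbind(matrix(rnorm(200), ncol = 2),          # 100 clustered points
           matrix(runif(20, -10, 10), ncol = 2))  # 10 scattered points
k <- 3
knnd.trim <- 0.1

# Pairwise Euclidean distances (assumed metric)
d <- as.matrix(dist(x))

# For each point, sort its distances. Position 1 is the point itself
# (distance 0), so the k-th nearest neighbor distance is at position k + 1.
knnd <- apply(d, 1, function(row) sort(row)[k + 1])

# Points whose k-NND exceeds the (1 - knnd.trim)-quantile are flagged as noise
cutoff <- quantile(knnd, probs = 1 - knnd.trim)
is_noise <- knnd > cutoff
```

Here `sum(is_noise)` gives the number of points initialized as noise, which is roughly `knnd.trim` times the sample size.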

*Step 2 (clustering step):* perform the model-based hierarchical
clustering (MBHC) proposed in Fraley (1998). This step is performed using
`hc`. The input argument `modelName` is passed
to `hc`. See *Details* of `hc` for more details.
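
A minimal sketch of the clustering step, assuming `hc` refers to the function of that name in the mclust package and that mclust is installed. The toy data and the use of `hclass` to cut the tree into `G` groups are illustrative choices, not a description of the package internals.

```r
# Sketch only: assumes the 'mclust' package is installed.
library(mclust)

set.seed(1)
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),   # group around (0, 0)
           matrix(rnorm(100, mean = 6), ncol = 2))   # group around (6, 6)
G <- 2

# Model-based agglomerative hierarchical clustering with the chosen covariance model
tree <- hc(x, modelName = "VVV")

# Cut the merge tree into G groups; hclass() returns a one-column matrix here
cl <- as.vector(hclass(tree, G))
table(cl)
```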

If *Step 2* fails to provide `G` clusters each
containing at least 2 distinct data points, it is replaced with
classical hierarchical clustering implemented in
`hclust`. Finally, if `hclust` fails to provide a valid partition, up
to ten random partitions are tried.
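
The validity check and the fallback chain can be sketched as below. The helper `valid_partition` is hypothetical, and the `hclust` defaults (complete linkage on Euclidean distances) and the way random partitions are drawn are assumptions made for illustration.

```r
# Hypothetical helper: TRUE if 'cl' has exactly G clusters, each containing
# at least 2 distinct data points (distinct rows of x).
valid_partition <- function(x, cl, G) {
  x <- as.matrix(x)
  labs <- unique(cl)
  if (length(labs) != G) return(FALSE)
  all(vapply(labs, function(g) {
    nrow(unique(x[cl == g, , drop = FALSE])) >= 2
  }, logical(1)))
}

set.seed(1)
x <- matrix(rnorm(60), ncol = 2)
G <- 3

# Fallback 1: classical hierarchical clustering (default linkage assumed)
cl <- cutree(hclust(dist(x)), k = G)
ok <- valid_partition(x, cl, G)

# Fallback 2: try up to ten random partitions if hclust did not give a valid one
tries <- 0
while (!ok && tries < 10) {
  cl <- sample(rep_len(seq_len(G), nrow(x)))
  ok <- valid_partition(x, cl, G)
  tries <- tries + 1
}
```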

An integer vector specifying the initial cluster
assignment, with `0` denoting noise/outliers.
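
As a small illustration of how such a vector can be read, the snippet below uses a made-up assignment named `init` (hypothetical values, not output from the function).

```r
# Hypothetical initial assignment for 8 observations (0 = noise/outlier)
init <- c(1L, 1L, 2L, 0L, 2L, 2L, 0L, 1L)

n_noise <- sum(init == 0L)                # number of points flagged as noise
noise_idx <- which(init == 0L)            # row indices of the noise points
cluster_sizes <- table(init[init != 0L])  # sizes of the regular clusters
```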

Fraley, C. (1998).
Algorithms for model-based Gaussian hierarchical clustering.
*SIAM Journal on Scientific Computing* 20:270-281.

Coretto, P. and Hennig, C. (2017).
Consistency, breakdown robustness, and algorithms for robust improper
maximum likelihood clustering.
*Journal of Machine Learning Research* 18(142):1-39.
https://jmlr.org/papers/v18/16-382.html

Pietro Coretto pcoretto@unisa.it https://pietro-coretto.github.io

hc

