
A k-means variant that uses a class-wise Mahalanobis metric. The implementation roughly follows Lloyd's algorithm, with a class-wise covariance computation step following the centre update.
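The loop described above can be sketched as follows. This is a minimal illustration of a Lloyd-style k-means with a per-cluster Mahalanobis metric, not the package's implementation; the function name and all internals are our own.

```r
mahalanobis_kmeans_sketch <- function(dat, k, maxiter = 100) {
  n <- nrow(dat)
  d <- ncol(dat)
  centres <- dat[sample.int(n, k), , drop = FALSE]  # uniform sampling of seeds
  covs <- replicate(k, diag(d), simplify = FALSE)   # start from the identity metric
  labels <- integer(n)
  for (iter in seq_len(maxiter)) {
    # assignment step: squared Mahalanobis distance to each centre
    d2 <- sapply(seq_len(k), function(j)
      stats::mahalanobis(dat, centres[j, ], covs[[j]]))
    new_labels <- max.col(-d2, ties.method = "first")
    if (identical(new_labels, labels)) break        # labels stable: converged
    labels <- new_labels
    # update step: centres first, then class-wise covariances
    for (j in seq_len(k)) {
      members <- dat[labels == j, , drop = FALSE]
      if (nrow(members) == 0) next                  # empty cluster: keep as is
      centres[j, ] <- colMeans(members)
      if (nrow(members) >= 2) {
        S <- cov(members) + 1e-6 * diag(d)          # small ridge for stability
        covs[[j]] <- S * d / sum(diag(S))           # normalize trace to d
      } else {
        covs[[j]] <- diag(d)                        # too few points: identity
      }
    }
  }
  list(labels = labels, centres = centres, covs = covs)
}
```

Note that the covariances enter only through the assignment step; normalizing each one to trace d keeps the per-cluster metrics comparable in scale, as discussed in the Details section.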


`dat`: Matrix with n rows and d columns, holding n d-dimensional data points to cluster.

`k`: Number of clusters in the output.

`maxiter`: Maximum number of iterations.

`seeds`: Optional indices into the input data used as initial centres. If NULL, centres are drawn by uniform sampling.

`prior`: Prior population size used to regularize the components.

K-means is characterized by the use of the identity matrix as its metric. To stay close to this in spirit, each class-wise covariance matrix is normalized after computation so that its trace equals d. This avoids excessively unbalanced classes, and also handles the case where the support of a given cluster is less than 2, for which a covariance matrix cannot be computed; the covariance then defaults to the identity. In addition, to prevent degeneracies when 2 < cluster size < d, a regularization term proportional to the sample feature variances is added to the covariance diagonal.
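The two safeguards above (identity fallback plus trace normalization, and diagonal regularization) can be illustrated as below. The weighting of the regularization term by `prior` is an assumption for demonstration, not the package's exact rule.

```r
regularized_cov <- function(members, d, prior = 2) {
  if (nrow(members) < 2) return(diag(d))   # covariance not computable: identity
  S <- cov(members)
  # regularization: diagonal term proportional to the per-feature variances,
  # weighted by an assumed prior population size (hypothetical scheme)
  S <- S + (prior / nrow(members)) * diag(diag(S), d)
  tr <- sum(diag(S))
  if (tr == 0) return(diag(d))             # degenerate data (identical points)
  S * d / tr                               # normalize so that the trace equals d
}
```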

The returned value follows the GMM data structure (i.e., the structure returned by, e.g., varbayes() and newGmm()).

`labels`: Cluster labels taking values in 1..k.

`w`: Numeric vector of cluster weights.

`mean`: List of mean vectors.

`cov`: List of covariance matrices.

P. Bruneau

