
A parallel and scalable implementation of the algorithm described in Ostrovsky, Rafail, et al. "The effectiveness of Lloyd-type methods for the k-means problem." Journal of the ACM (JACM) 59.6 (2012): 28.


| Argument | Description |
|---|---|
| `data` | Name of a data file on disk (NUMA optimized), or an in-memory data matrix |
| `centers` | The number of centers (i.e., k) |
| `nrow` | The number of samples in the dataset |
| `ncol` | The number of features in the dataset |
| `nstart` | The number of iterations of kmeans++ to run |
| `nthread` | The number of parallel threads to run |
| `dist.type` | The dissimilarity metric to use; one of `"taxi"`, `"eucl"`, or `"cos"` |
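To make the `dist.type` choices concrete, the base R sketch below computes each dissimilarity for a pair of vectors. It assumes that `"taxi"` means Manhattan (L1) distance, `"eucl"` means Euclidean distance, and `"cos"` means cosine distance (one minus cosine similarity). The helper name `dissimilarity` is hypothetical and is not part of the package.

```r
# Hypothetical helper (not part of the package): dissimilarity between two
# numeric vectors under the assumed meaning of each dist.type value.
dissimilarity <- function(x, y, dist.type = c("taxi", "eucl", "cos")) {
  dist.type <- match.arg(dist.type)
  switch(dist.type,
    taxi = sum(abs(x - y)),                      # Manhattan (L1) distance
    eucl = sqrt(sum((x - y)^2)),                 # Euclidean (L2) distance
    cos  = 1 - sum(x * y) /
      (sqrt(sum(x^2)) * sqrt(sum(y^2)))          # 1 - cosine similarity
  )
}

x <- c(3, 0)
y <- c(0, 4)
dissimilarity(x, y, "taxi")  # 7
dissimilarity(x, y, "eucl")  # 5
dissimilarity(x, y, "cos")   # 1 (the vectors are orthogonal)
```

The cosine variant is undefined for an all-zero vector, so a real implementation would need to decide how to handle that case.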

A list containing the attributes of the output.
cluster: A vector of integers (from 1:**k**) indicating the cluster to which each point is allocated.
centers: A matrix of cluster centers.
size: The number of points in each cluster.
energy: The sum of distances for each sample from its closest cluster.
best.start: The sum of distances for each sample from its closest cluster.
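The shape of the returned list can be illustrated with a small serial sketch of kmeans++ seeding in base R. This is not the package's parallel implementation: the function name `kmeanspp_sketch` is hypothetical, squared Euclidean distance is assumed for the sampling weights, the data are assumed to contain at least `centers` distinct rows, and `best.start` is left out because its exact meaning is not specified above.

```r
# Hypothetical serial sketch of kmeans++ seeding (not the package's code).
kmeanspp_sketch <- function(data, centers, nstart = 1) {
  stopifnot(is.matrix(data), centers >= 1, centers <= nrow(data))
  n <- nrow(data)
  # Squared Euclidean distance from every row of `data` to one point.
  sqdist <- function(point) rowSums(sweep(data, 2, point)^2)
  best <- NULL
  for (s in seq_len(nstart)) {
    # First center: one sample chosen uniformly at random.
    idx <- sample.int(n, 1)
    d2 <- sqdist(data[idx, ])
    while (length(idx) < centers) {
      # Next center: sampled with probability proportional to the squared
      # distance to the closest center chosen so far.
      nxt <- sample.int(n, 1, prob = d2)
      idx <- c(idx, nxt)
      d2 <- pmin(d2, sqdist(data[nxt, ]))
    }
    ctr <- data[idx, , drop = FALSE]
    # n x k matrix of squared distances from samples to centers.
    dists <- matrix(vapply(seq_len(centers),
                           function(j) sqdist(ctr[j, ]),
                           numeric(n)), nrow = n)
    cluster <- max.col(-dists, ties.method = "first")
    run <- list(cluster = cluster,
                centers = ctr,
                size = tabulate(cluster, nbins = centers),
                energy = sum(sqrt(d2)))
    # Keep the run with the lowest energy across the nstart restarts.
    if (is.null(best) || run$energy < best$energy) best <- run
  }
  best
}

set.seed(1)
data <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
              matrix(rnorm(40, mean = 10), ncol = 2))
res <- kmeanspp_sketch(data, centers = 2, nstart = 3)
str(res)
```

In this sketch the centers are the sampled seed points themselves, with no Lloyd-style refinement afterwards, and the run with the lowest energy over `nstart` restarts is the one returned.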

Disa Mhembere <[email protected]>

