Hierarchical clustering that partitions the data into two disjoint sets at every level. The procedure is recursive in structure but is not actually implemented as recursion. See https://en.wikipedia.org/wiki/Hierarchical_clustering

data
Data file name on disk (NUMA optimized) or in-memory data matrix

kmax
The maximum number of centers

nrow
The number of samples in the dataset

ncol
The number of features in the dataset

iter.max
The maximum number of k-means iterations to perform

nthread
The number of parallel threads to run

init
The type of initialization to use, c("forgy"), or the initial centers themselves

tolerance
The convergence tolerance for k-means at each hierarchical split

dist.type
The dissimilarity metric to use

min.clust.size
The minimum cluster size; a cluster that has reached this size is not split further

A list of lists containing the attributes of the output.
cluster: A vector of integers (from 1:**k**) indicating the cluster to
which each point is allocated.
centers: A matrix of cluster centers.
size: The number of points in each cluster.
iter: The number of (outer) iterations.
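
The level-by-level two-way partitioning described above can be sketched with a work queue in place of recursion. This is a minimal illustration in Python, not this package's implementation: the names `hierarchical_split` and `two_means` are hypothetical, only squared Euclidean distance is shown, and both the stopping rule for `min_clust_size` and the meaning given to `iter` are assumptions.

```python
import random
from collections import deque


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Column-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def two_means(points, iter_max, tolerance, rng):
    """Lloyd's k-means with k = 2 and Forgy initialization
    (two samples drawn at random as the starting centers)."""
    centers = [list(p) for p in rng.sample(points, 2)]
    labels = [0] * len(points)
    for _ in range(iter_max):
        labels = [0 if sq_dist(p, centers[0]) <= sq_dist(p, centers[1]) else 1
                  for p in points]
        new_centers = []
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            new_centers.append(mean(members) if members else centers[c])
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tolerance:  # centers stopped moving
            break
    return labels


def hierarchical_split(data, kmax, iter_max=20, tolerance=1e-6,
                       min_clust_size=1, seed=0):
    rng = random.Random(seed)
    pending = deque([list(range(len(data)))])  # clusters that may still be split
    final = []                                 # clusters that will not be split
    passes = 0
    # Each successful split adds exactly one cluster; stop at kmax clusters.
    while pending and len(pending) + len(final) < kmax:
        passes += 1
        idx = pending.popleft()
        # Assumption: a cluster at or below min_clust_size is left as-is.
        if len(idx) <= max(min_clust_size, 1):
            final.append(idx)
            continue
        labels = two_means([data[i] for i in idx], iter_max, tolerance, rng)
        left = [i for i, lab in zip(idx, labels) if lab == 0]
        right = [i for i, lab in zip(idx, labels) if lab == 1]
        if not left or not right:  # degenerate split, keep the cluster whole
            final.append(idx)
            continue
        pending.extend([left, right])

    clusters = final + list(pending)
    cluster = [0] * len(data)
    for k, idx in enumerate(clusters, start=1):  # labels run from 1 to k
        for i in idx:
            cluster[i] = k
    return {
        "cluster": cluster,
        "centers": [mean([data[i] for i in idx]) for idx in clusters],
        "size": [len(idx) for idx in clusters],
        "iter": passes,  # assumption: counts passes of the outer loop
    }


# Usage: two well-separated groups of three points, split once.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
result = hierarchical_split(points, kmax=2)
```

Because the queue is processed first-in, first-out, all clusters at one level are considered before any cluster at the next level, which mirrors the level-wise description without growing the call stack.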

Disa Mhembere <[email protected]>

