K-means partitions a dataset into k disjoint sets using a fast, parallel, NUMA-optimized version of Lloyd's algorithm. Details are given in the paper at https://arxiv.org/pdf/1606.08905.pdf.
data |
Data file name on disk (NUMA optimized) or an in-memory data matrix |
centers |
Either (i) the number of centers (i.e., k), or (ii) an in-memory data matrix, or (iii) a 2-element list with element 1 being a filename for precomputed centers, and element 2 the number of centroids |
nrow |
The number of samples in the dataset |
ncol |
The number of features in the dataset |
iter.max |
The maximum number of iterations of k-means to perform |
nthread |
The number of parallel threads to run |
init |
The type of initialization to use, one of c("kmeanspp", "random", "forgy", "none") |
tolerance |
The convergence tolerance |
dist.type |
The dissimilarity metric to use |
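The "kmeanspp" value of init presumably refers to k-means++ seeding. As a rough illustration of that general technique, here is a minimal base R sketch. It is not this package's implementation; the function name kmeanspp_sketch and the sample data are hypothetical, and the code assumes the data contains at least k distinct rows.

```r
# k-means++ seeding in base R (illustrative sketch, not the package's code).
kmeanspp_sketch <- function(data, k) {
  data <- as.matrix(data)
  n <- nrow(data)
  chosen <- integer(k)
  # First centre: one row picked uniformly at random.
  chosen[1] <- sample.int(n, 1)
  # d2 holds each point's squared distance to its nearest chosen centre.
  d2 <- rowSums(sweep(data, 2, data[chosen[1], ])^2)
  for (j in seq_len(k)[-1]) {
    # Next centre: sampled with probability proportional to d2,
    # so rows already chosen (d2 == 0) cannot be picked again.
    chosen[j] <- sample.int(n, 1, prob = d2)
    d2 <- pmin(d2, rowSums(sweep(data, 2, data[chosen[j], ])^2))
  }
  data[chosen, , drop = FALSE]
}

# Hypothetical toy data: six distinct points in two dimensions.
set.seed(1)
x <- rbind(c(0, 0), c(0, 1), c(1, 0), c(10, 10), c(10, 11), c(11, 10))
seeds <- kmeanspp_sketch(x, 3)
```

Because far-away points are more likely to be sampled, the seeds tend to be spread across the data, which usually reduces the number of Lloyd iterations needed afterwards.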
A list containing the attributes of the output:
cluster: A vector of integers (from 1:k) indicating the cluster to which each point is allocated.
centers: A matrix of cluster centres.
size: The number of points in each cluster.
iter: The number of (outer) iterations.
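To make the returned fields concrete, the following is a minimal single-threaded sketch of Lloyd's algorithm in base R that returns a list with the same four names. It is not the package's parallel, NUMA-optimized code. The function name lloyd_sketch is hypothetical, it uses squared Euclidean distance only, and it treats tolerance as the fraction of points that changed cluster in an iteration, which is an assumption rather than something this page specifies.

```r
# Plain Lloyd's algorithm (illustrative sketch, not the package's code).
# `centers` is a matrix of initial centres, one per row.
lloyd_sketch <- function(data, centers, iter.max = 100L, tolerance = 0) {
  data <- as.matrix(data)
  centers <- as.matrix(centers)
  n <- nrow(data)
  k <- nrow(centers)
  cluster <- integer(n)
  d2 <- matrix(0, n, k)
  iter <- 0L
  while (iter < iter.max) {
    iter <- iter + 1L
    # Assignment step: squared Euclidean distance to every centre.
    for (j in seq_len(k)) {
      d2[, j] <- rowSums(sweep(data, 2, centers[j, ])^2)
    }
    new_cluster <- apply(d2, 1, which.min)
    # Assumed convergence measure: fraction of points that moved.
    changed <- mean(new_cluster != cluster)
    cluster <- new_cluster
    # Update step: each centre becomes the mean of its members;
    # an empty cluster keeps its previous centre.
    for (j in seq_len(k)) {
      members <- cluster == j
      if (any(members)) {
        centers[j, ] <- colMeans(data[members, , drop = FALSE])
      }
    }
    if (changed <= tolerance) break
  }
  list(cluster = cluster,
       centers = centers,
       size = tabulate(cluster, nbins = k),
       iter = iter)
}

# Hypothetical toy data: two well-separated groups of three points.
x <- rbind(c(0, 0), c(0, 1), c(1, 0), c(10, 10), c(10, 11), c(11, 10))
fit <- lloyd_sketch(x, centers = x[c(1, 4), , drop = FALSE])
```

Here fit$cluster is the per-point assignment, fit$centers holds one centre per row, fit$size counts the members of each cluster, and fit$iter is the number of outer iterations run.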
Disa Mhembere <disa@cs.jhu.edu>