View source: R/weightedKmeans.R
This function computes the standard MacQueen version of the k-means algorithm.
kmeans(dat, k = 2, nbRep = 100)
dat | Numeric matrix of data, or an object that can be coerced to such a matrix (such as a numeric vector or a data frame with all numeric columns).
k | The number of partitions (clusters) to produce.
nbRep | The number of random starts.
The MacQueen k-means algorithm (MacQueen, 1967) aims to separate n objects into k non-overlapping groups so as to minimize the sum of squared errors (i.e. the sum of squared Euclidean distances between the points and the center of their group). This variant of k-means starts with an initialization step: k data points are chosen as centroids (centers of the partitions), each point is assigned to the nearest centroid according to the Euclidean distance, and the centroids are updated to the mean of the points in their group. The algorithm then iterates until convergence: at each assignment step, a point is assigned to the nearest centroid according to the Euclidean distance, and the affected centroid is updated accordingly using the mean of the points in its group. Convergence is reached either when the centroids stop moving or when the maximum number of internal iterations is reached. The quality of the clustering produced by the MacQueen k-means algorithm is evaluated by the Calinski-Harabasz cluster validity index (Caliński and Harabasz, 1974).
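The steps described above can be sketched in a few lines of R. This is only an illustrative sketch under stated assumptions, not the package's implementation: the function name `macqueen_sketch`, the `max_iter` argument, and the rule that a point is never moved out of a cluster of size one are all hypothetical choices made for the example.

```r
# Illustrative MacQueen-style k-means (hypothetical helper, not the package code).
macqueen_sketch <- function(dat, k = 2, max_iter = 100) {
  x <- as.matrix(dat)
  n <- nrow(x)

  # Index of the centroid nearest to point p (squared Euclidean distance).
  nearest <- function(p, centers) which.min(rowSums(sweep(centers, 2, p)^2))

  # Initialization: k distinct data points as centroids, assign every point,
  # then replace each centroid by the mean of its group.
  centers <- x[sample.int(n, k), , drop = FALSE]
  cluster <- apply(x, 1, nearest, centers = centers)
  size <- tabulate(cluster, nbins = k)
  for (j in which(size > 0)) {
    centers[j, ] <- colMeans(x[cluster == j, , drop = FALSE])
  }

  # Iteration: reassign points one at a time and update the two affected
  # centroids right away with an incremental mean.
  for (iter in seq_len(max_iter)) {
    moved <- FALSE
    for (i in seq_len(n)) {
      from <- cluster[i]
      to <- nearest(x[i, ], centers)
      # Assumption: never empty a cluster by removing its last point.
      if (to != from && size[from] > 1) {
        centers[from, ] <- (size[from] * centers[from, ] - x[i, ]) / (size[from] - 1)
        centers[to, ] <- (size[to] * centers[to, ] + x[i, ]) / (size[to] + 1)
        size[from] <- size[from] - 1
        size[to] <- size[to] + 1
        cluster[i] <- to
        moved <- TRUE
      }
    }
    # Converged: a full pass moved no point, so the centroids did not change.
    if (!moved) break
  }
  list(cluster = cluster, centers = centers)
}

# Example on two well-separated groups of 20 points each.
set.seed(1)
x <- rbind(matrix(rnorm(40, 0, 0.1), ncol = 2),
           matrix(rnorm(40, 5, 0.1), ncol = 2))
res <- macqueen_sketch(x, k = 2)
table(res$cluster)
```

Because the result depends on the randomly chosen starting centroids, a run like this would typically be repeated from several random starts, which is presumably the role of the nbRep argument.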
k | The number of partitions used for the clustering.
bestCH | The best value of the Calinski-Harabasz cluster validity index produced by the k-means algorithm.
clusteringCH | The clustering produced by the k-means algorithm for the best Calinski-Harabasz cluster validity index.
bestSil | The best value of the Silhouette cluster validity index produced by the k-means algorithm.
clusteringSil | The clustering produced by the k-means algorithm for the best Silhouette cluster validity index.
Algorithm | The algorithm used to produce the clustering.
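To make the bestCH value more concrete, the following is a minimal sketch of how a Calinski-Harabasz index can be computed for a given clustering: the between-group sum of squares divided by (k - 1), over the within-group sum of squares divided by (n - k). The function name `ch_index` is hypothetical and the code is not taken from the package; higher values indicate more compact, better separated groups.

```r
# Calinski-Harabasz index for a clustering (hypothetical helper, not the package code).
ch_index <- function(dat, cluster) {
  x <- as.matrix(dat)
  n <- nrow(x)
  groups <- unique(cluster)
  k <- length(groups)
  grand <- colMeans(x)  # overall centroid of the data

  between <- 0
  within <- 0
  for (g in groups) {
    xg <- x[cluster == g, , drop = FALSE]
    cg <- colMeans(xg)  # centroid of group g
    # Between-group dispersion: group size times squared distance to the overall centroid.
    between <- between + nrow(xg) * sum((cg - grand)^2)
    # Within-group dispersion: squared distances of the points to their own centroid.
    within <- within + sum(sweep(xg, 2, cg)^2)
  }
  (between / (k - 1)) / (within / (n - k))
}

# Four points on a line, split into two groups of two.
x <- matrix(c(0, 2, 10, 12), ncol = 1)
ch_index(x, c(1, 1, 2, 2))  # 50
```

In the small example, the between-group sum of squares is 100 and the within-group sum of squares is 4, so the index is (100 / 1) / (4 / 2) = 50.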
Alexandre Gondeau
Caliński, T., and Harabasz, J. (1974). A dendrite method for cluster analysis. Communications in Statistics - Theory and Methods, 3, 1-27.
MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, pp. 281-297.