BatchKMeans | R Documentation |
Uses the batch k-means method to form clusters.
BatchKMeans(x, centers, weights, iter.max, n.starts, seed = 1223)
x |
A |
centers |
Either the number of clusters (e.g., 2), or a set of initial cluster centers. |
weights |
An optional vector of sampling weights, or the name of a variable in |
iter.max |
The maximum number of iterations of the algorithm. |
n.starts |
The number of times to run the whole algorithm. |
seed |
The seed for the random number generator. |
The batch method works by selecting initial cluster centers, allocating each observation
to the closest cluster, recomputing the cluster centers, and repeating these steps until either the
residual sum of squares stops decreasing or iter.max
is exceeded.
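The loop described above can be sketched in a few lines of base R. This is a minimal illustration for unweighted, complete data only, not the package's implementation; batch_kmeans_sketch is a hypothetical helper name, and the choice of squared Euclidean distance is an assumption.

```r
# Hypothetical helper: a bare-bones batch k-means loop (unweighted, no missing data).
batch_kmeans_sketch <- function(x, centers, iter.max = 10) {
    x <- as.matrix(x)
    centers <- as.matrix(centers)
    k <- nrow(centers)
    previous.rss <- Inf
    rss <- Inf
    cluster <- integer(nrow(x))
    for (iteration in seq_len(iter.max)) {
        # Squared Euclidean distance from every observation to every center
        d <- sapply(seq_len(k), function(j) rowSums(sweep(x, 2, centers[j, ])^2))
        d <- matrix(d, nrow = nrow(x))
        # Allocate each observation to its closest center
        cluster <- max.col(-d, ties.method = "first")
        # Residual sum of squares for this allocation
        rss <- sum(d[cbind(seq_len(nrow(x)), cluster)])
        # Stop as soon as the residual sum of squares no longer decreases
        if (rss >= previous.rss) break
        previous.rss <- rss
        # Recompute each center as the mean of its members (empty clusters keep their center)
        for (j in seq_len(k)) {
            members <- cluster == j
            if (any(members)) centers[j, ] <- colMeans(x[members, , drop = FALSE])
        }
    }
    # rss refers to the most recent allocation step
    list(cluster = cluster, centers = centers, rss = rss)
}
```

Both stopping rules from the description appear here: the break handles the residual sum of squares, and the for loop bounds the work by iter.max.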
The two novel features of this algorithm, relative to traditional k-means algorithms such as Hartigan and Wong (1979), are: (1) The algorithm accommodates weights. (2) The algorithm classifies cases that have incomplete data.
The algorithm starts by assigning cases to clusters as follows: (1) Cases with missing values are removed. (2) If the data is weighted, a new 'bootstrapped' sample is created via resampling. (3) The Hartigan-Wong algorithm is applied to the bootstrapped sample. (4) Each of the cases in the data set (including those with partially missing data) is assigned to the closest cluster center.
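The four initialization steps might look roughly like the following in base R. This is a sketch under stated assumptions, not the package's code: initial_assignment_sketch is a hypothetical name, resampling is assumed to draw rows with probability proportional to the weights, and the distance for a partially missing case is assumed to be the mean squared difference over its observed variables. Cases with no observed values are not handled.

```r
# Hypothetical helper illustrating the initial assignment of cases to clusters.
initial_assignment_sketch <- function(x, k, weights = NULL, seed = 1223) {
    set.seed(seed)
    x <- as.matrix(x)
    # (1) Remove cases with missing values
    is.complete <- stats::complete.cases(x)
    training <- x[is.complete, , drop = FALSE]
    # (2) If weighted, resample rows with probability proportional to weight (assumption)
    if (!is.null(weights)) {
        rows <- sample.int(nrow(training), replace = TRUE, prob = weights[is.complete])
        training <- training[rows, , drop = FALSE]
    }
    # (3) Apply the Hartigan-Wong algorithm to the (possibly resampled) complete cases
    fit <- stats::kmeans(training, centers = k, algorithm = "Hartigan-Wong")
    # (4) Assign every case, including partially missing ones, to the closest center,
    #     using only the variables observed for that case (assumption)
    d <- sapply(seq_len(k), function(j) {
        rowMeans(sweep(x, 2, fit$centers[j, ])^2, na.rm = TRUE)
    })
    d <- matrix(d, nrow = nrow(x))
    list(cluster = max.col(-d, ties.method = "first"), centers = fit$centers)
}
```

Averaging rather than summing the squared differences keeps distances comparable between cases that have different numbers of observed variables.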
The algorithm then repeatedly: (1) Recomputes each cluster center as the weighted mean of the data for the cases assigned to the cluster. (2) Assigns cases to the closest cluster. This proceeds until the cluster membership stabilizes or the maximum number of iterations (iter.max) is exceeded.
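The weighted-mean update in step (1) can be illustrated as follows. weighted_centers_sketch is a hypothetical helper, and skipping missing values variable by variable is an assumption about how incomplete cases contribute to the centers.

```r
# Hypothetical helper: recompute cluster centers as weighted means.
# Missing values are dropped per variable (assumption), so a partially
# missing case still contributes to the variables it has observed.
weighted_centers_sketch <- function(x, cluster, weights, k) {
    x <- as.matrix(x)
    centers <- matrix(NA_real_, nrow = k, ncol = ncol(x))
    for (j in seq_len(k)) {
        members <- cluster == j
        for (v in seq_len(ncol(x))) {
            centers[j, v] <- stats::weighted.mean(x[members, v],
                                                  weights[members],
                                                  na.rm = TRUE)
        }
    }
    centers
}

# Example: the second case is missing its second variable
x <- rbind(c(1, 2), c(3, NA), c(10, 10))
weighted_centers_sketch(x, cluster = c(1, 1, 2), weights = c(1, 3, 1), k = 2)
```

In the example, the first center's first coordinate is (1 * 1 + 3 * 3) / 4 = 2.5, while its second coordinate comes from the only observed value, 2.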
Where n.starts
is greater than 1, or there are fewer than 100 cases left after removing cases with
incomplete data, the remaining start points are selected by: (1) identifying unique cases, and (2) sampling
without replacement from amongst the unique cases.
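Selecting start points from the unique cases could be sketched like this. select_start_points_sketch is a hypothetical helper, and restricting the candidates to complete cases is an assumption.

```r
# Hypothetical helper: pick k distinct start points from the unique complete cases.
select_start_points_sketch <- function(x, k, seed = 1223) {
    set.seed(seed)
    x <- as.matrix(x)
    complete <- x[stats::complete.cases(x), , drop = FALSE]
    # (1) Identify unique cases, so that no two start points coincide
    unique.cases <- unique(complete)
    stopifnot(nrow(unique.cases) >= k)
    # (2) Sample k of them without replacement
    unique.cases[sample.int(nrow(unique.cases), k), , drop = FALSE]
}
```

Sampling from unique cases rather than from all cases avoids duplicate start points, which would otherwise leave one cluster empty after the first allocation.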