View source: R/suggest_number_of_clusters.R
| suggest_number_of_clusters | R Documentation | 
The algorithm sets the maximum number of clusters to the lesser of k_limit and the number of unique values in x. A set of kmeans models is created, starting with a single cluster and progressing to the maximum number of clusters. For each model, the total within-cluster sum of squares is calculated. Note that kmeans produces a within sum of squares for k (number of clusters) = 1. If the method is changed from kmeans, it may be necessary to compute the sum of squares for k = 1 manually as degrees of freedom * sample variance, i.e. (n - 1) * var(x).
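The steps above can be sketched roughly as follows. This is a minimal illustration, not the package's actual source: the helper name wss_by_k and the choice of nstart = 10 are assumptions made for the example. It uses stats::kmeans and its tot.withinss component, and checks the k = 1 identity mentioned above.

```r
# Hypothetical helper (name and nstart are assumptions, not from the package).
# Returns the total within-cluster sum of squares for k = 1 .. k_max.
wss_by_k <- function(x, k_limit = 10) {
  # Cap the number of clusters at the number of distinct values in x
  k_max <- min(k_limit, length(unique(x)))
  vapply(seq_len(k_max), function(k) {
    stats::kmeans(x, centers = k, nstart = 10)$tot.withinss
  }, numeric(1))
}

set.seed(42)
# Three well-separated groups of 30 values each
x <- c(rnorm(30, mean = 0), rnorm(30, mean = 10), rnorm(30, mean = 20))
wss <- wss_by_k(x, k_limit = 6)

# For k = 1 the within sum of squares equals (n - 1) * var(x)
k1_matches <- isTRUE(all.equal(wss[1], (length(x) - 1) * var(x)))
```

If a different clustering method were substituted for kmeans, the first element of wss could be filled in with (length(x) - 1) * var(x) directly.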
suggest_number_of_clusters(x, k_limit = 10, diagnostic_file_prefix = NULL)
x | 
 vector of numeric values  | 
k_limit | 
 numeric maximum number of clusters to consider  | 
diagnostic_file_prefix | 
 character; if supplied, a file is written containing the diagnostic plot of within sum of squares against cluster number, showing the scaled number:wss curve and the line y = x.  | 
Both sets of values (cluster number and within sum of squares) are scaled to the range 0 to 1 so that the intersection of the curve with the line y = x can be found. That intersection is taken as the knee of the curve, which is commonly used to determine the optimal number of clusters. The distance of each point from the line y = x is calculated, and the point closest to the line is chosen as the suggested number of clusters.
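One way to picture the knee selection is the sketch below. It is an illustration under stated assumptions rather than the package's implementation: the helper names rescale01 and closest_to_diagonal are hypothetical, the wss values are made up, and perpendicular distance to y = x is assumed as the distance measure (the documentation does not say which distance is used).

```r
# Hypothetical helpers; names are assumptions, not part of the package.
# Min-max scaling to [0, 1]; assumes v has at least two distinct values.
rescale01 <- function(v) (v - min(v)) / (max(v) - min(v))

closest_to_diagonal <- function(wss) {
  k <- seq_along(wss)
  k_scaled <- rescale01(k)
  wss_scaled <- rescale01(wss)
  # Perpendicular distance from the point (a, b) to the line y = x
  # is |a - b| / sqrt(2)
  d <- abs(k_scaled - wss_scaled) / sqrt(2)
  k[which.min(d)]
}

# Made-up within sum of squares for k = 1 .. 6
wss_example <- c(100, 40, 10, 8, 7, 6)
suggested_k <- closest_to_diagonal(wss_example)  # 2 for these values
```

Because the constant factor 1 / sqrt(2) does not change which point is closest, using the vertical distance |a - b| instead would select the same cluster number.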
A diagnostic plot may be produced showing the within sum of squares and cluster number.
numeric, the suggested number of clusters
# suggest_number_of_clusters()