Description Usage Arguments Details Value Author(s) References Examples
Estimate number of clusters by bootstrapping stability
1 | k.select_ref(df, k_range = 2:7, n_ref = 5, B = 100, B_ref = 50, r = 5)
|
df |
|
k_range |
|
n_ref |
number of reference distribution to be generated |
B |
number of bootstrap re-samples |
B_ref |
number of bootstrap resamples for the reference distributions |
r |
number of runs of k-means |
This function uses the out-of-bag scheme to estimate the number of clusters in a dataset. The function calculate the Smin of the dataset and at the same time, generate a reference dataset with the same range as the original dataset in each dimension and calculate the Smin_ref. The differences between Smin and Smin_ref at each k,Smin_diff(k), is taken into consideration as well as the standard deviation of the differences. We choose the k to be the argmax of ( Smin_diff(k) - ( Smin_diff(k+1) + (Smin_diff(k+1)) ) ). If Smin_diff(k) less than 0.1 for all k in k_range, we say k = 1
profile
vector
of ( Smin_diff(k) - ( Smin_diff(k+1) + se(Smin_diff(k+1)) ) ) measures for researchers's inspection
k
estimated number of clusters
Tianmou Liu
Bootstrapping estimates of stability for clusters, observations and model selection. Han Yu, Brian Chapman, Arianna DiFlorio, Ellen Eischen, David Gotz, Matthews Jacob and Rachael Hageman Blair.
1 2 3 4 5 |
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.