ChooseK | R Documentation |
Function to choose the number of clusters k. Examines cluster numbers between
k0 and k1. For each cluster number, generates boot bootstrap data sets, fits
the Gaussian Mixture Model (FitGMM), and calculates quality metrics
(ClustQual). For each metric, determines the optimal cluster number k_opt,
and the k_1SE, the smallest cluster number whose quality is within 1 SE of
the optimum.
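The selection rule described above can be illustrated with a minimal base R sketch. This is not the package's internal code: the quality values are made up, the metric is assumed to be one where larger is better, and the SE is assumed to be the standard deviation across bootstrap replicates. The names k_grid, quality, and k_1se are hypothetical.

```r
# Hypothetical bootstrap results: one row per bootstrap replicate, one column
# per candidate cluster number. Values are invented for illustration and
# assume a quality metric where larger is better.
k_grid <- 2:6
quality <- rbind(
  c(0.40, 0.55, 0.70, 0.69, 0.69),
  c(0.44, 0.59, 0.74, 0.75, 0.70),
  c(0.42, 0.57, 0.69, 0.72, 0.68),
  c(0.38, 0.53, 0.71, 0.72, 0.73)
)
colnames(quality) <- k_grid

# Mean and standard error of the metric at each cluster number.
# Assumption: the SE is the standard deviation over bootstrap replicates.
mean_q <- colMeans(quality)
se_q <- apply(quality, 2, sd)

# k_opt: the cluster number with the best mean quality.
i_opt <- which.max(mean_q)
k_opt <- k_grid[i_opt]

# k_1SE: the smallest cluster number whose mean quality is within one SE
# (taken at the optimum) of the best mean quality.
threshold <- mean_q[i_opt] - se_q[i_opt]
k_1se <- min(k_grid[mean_q >= threshold])

k_opt  # 5
k_1se  # 4
```

With these numbers, k = 5 has the best mean quality, but k = 4 is within one SE of it, so the 1 SE rule prefers the smaller model. For a metric where smaller is better, the comparison would be reversed.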
ChooseK(
data,
k0 = 2,
k1 = NULL,
boot = 100,
init_means = NULL,
fix_means = FALSE,
init_covs = NULL,
lambda = 0,
init_props = NULL,
maxit = 10,
eps = 1e-04,
report = TRUE
)
data | Numeric data matrix. |
k0 | Minimum number of clusters. |
k1 | Maximum number of clusters. |
boot | Number of bootstrap replicates. |
init_means | Optional list of initial mean vectors. |
fix_means | Fix the means at their starting values? If TRUE, initial means must be provided. |
init_covs | Optional list of initial covariance matrices. |
lambda | Optional ridge term added to the covariance matrix to ensure positive definiteness. |
init_props | Optional vector of initial cluster proportions. |
maxit | Maximum number of EM iterations. |
eps | Minimum acceptable increment in the EM objective. |
report | Report bootstrap progress? |
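To see why a ridge term such as lambda helps, the following base R sketch adds a multiple of the identity to a singular covariance matrix. It assumes the ridge takes the form lambda * I, which this page does not state explicitly; the matrix and the value 0.1 are invented for illustration.

```r
# A singular 2 x 2 covariance matrix: the second coordinate is exactly twice
# the first, so one eigenvalue is zero.
sigma <- matrix(c(1, 2, 2, 4), nrow = 2)

# Hypothetical ridge value. Assumption: the ridge is added to the diagonal.
lambda <- 0.1
sigma_ridge <- sigma + lambda * diag(nrow(sigma))

# Smallest eigenvalue before and after adding the ridge.
min_eig_raw <- min(eigen(sigma, symmetric = TRUE)$values)
min_eig_ridge <- min(eigen(sigma_ridge, symmetric = TRUE)$values)

# The Cholesky factorization requires positive definiteness, so it serves as
# a practical check that the ridged matrix is usable.
chol_ok <- !inherits(try(chol(sigma_ridge), silent = TRUE), "try-error")
```

Adding lambda * I shifts every eigenvalue up by lambda, so the smallest eigenvalue moves from about 0 to about 0.1 and the factorization succeeds.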
List containing Choices, the recommended number of clusters according to each
quality metric, and Results, the mean and standard error of the quality
metrics at each cluster number evaluated.
See ClustQual for evaluating cluster quality, and FitGMM for estimating the
GMM with a specified cluster number.
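library(MGMM)  # rGMM and ChooseK are provided by the MGMM package
set.seed(100)
# Simulate 500 bivariate observations from a four-component mixture.
mean_list <- list(c(2, 2), c(2, -2), c(-2, 2), c(-2, -2))
data <- rGMM(n = 500, d = 2, k = 4, means = mean_list)
# Evaluate k = 2, ..., 6 using 10 bootstrap replicates per cluster number.
choose_k <- ChooseK(data, k0 = 2, k1 = 6, boot = 10)
# Recommended cluster number under each quality metric.
choose_k$Choices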