cluutils: Clustering utilities

View source: R/clu_utils.R

cluutils    R Documentation

Clustering utilities

Description

Utility object that groups clustering metrics and model-selection helpers.

Usage

cluutils()

Details

The object organizes helpers into two semantic groups:

Metrics

  • metric_wcss() computes the total within-cluster sum of squares.

  • metric_silhouette() computes the mean silhouette score from pairwise distances.

  • metric_entropy() computes external clustering entropy against a vector of reference labels.

  • metric_purity() computes cluster purity against a vector of reference labels.

  • metric_davies_bouldin() computes the Davies-Bouldin index.

  • metric_calinski_harabasz() computes the Calinski-Harabasz score.

  • metric_adjusted_rand_index() computes the adjusted Rand index.

  • metric_noise_points() summarizes the number of noise points in density-based clustering.

  • metric_loglik() and metric_modularity() expose model-specific quality summaries.
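
To make a few of the metrics above concrete, the following base-R sketch shows the quantities that WCSS, purity, entropy, and the mean silhouette usually refer to. It is an illustration under common textbook definitions, not the daltoolbox implementation: the function names wcss, purity, and cluster_entropy are hypothetical, and the package may normalize or weight its results differently (for example, the log base used for entropy is an assumption here).

```r
# Illustrative sketches only; these are NOT the functions exposed by cluutils().
x <- iris[, 1:4]
set.seed(1)                              # reproducible k-means initialization
fit <- stats::kmeans(x, centers = 3)
clu <- fit$cluster

# WCSS: squared distance of every point to its own cluster centroid, summed.
wcss <- function(data, cluster) {
  data <- as.matrix(data)
  sum(sapply(split(seq_len(nrow(data)), cluster), function(idx) {
    block <- data[idx, , drop = FALSE]
    sum(sweep(block, 2, colMeans(block))^2)   # center each column, then square
  }))
}

# Purity: share of points that fall in the majority reference class of their cluster.
purity <- function(cluster, reference) {
  tab <- table(cluster, reference)
  sum(apply(tab, 1, max)) / sum(tab)
}

# Entropy: size-weighted mean of the per-cluster Shannon entropy (in bits, an assumption)
# of the reference classes. Lower values mean purer clusters.
cluster_entropy <- function(cluster, reference) {
  tab <- table(cluster, reference)
  per_cluster <- apply(tab, 1, function(row) {
    p <- row[row > 0] / sum(row)
    -sum(p * log2(p))
  })
  sum(per_cluster * rowSums(tab)) / sum(tab)
}

wcss(x, clu)                             # matches fit$tot.withinss for a converged fit
purity(clu, iris$Species)
cluster_entropy(clu, iris$Species)

# Mean silhouette from pairwise distances, using the recommended 'cluster' package if present.
if (requireNamespace("cluster", quietly = TRUE)) {
  sil <- cluster::silhouette(clu, stats::dist(x))
  mean(sil[, "sil_width"])
}
```

WCSS and the silhouette need only the data and the cluster assignment, while purity and entropy also need reference labels, which is why the latter two take iris$Species here.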

Selectors

  • selector_best() selects the best hyperparameter value by direct optimization.

  • selector_elbow() selects the elbow of a metric curve via maximum curvature.
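
The two selection strategies can be sketched in base R as follows. This is a minimal illustration, assuming that "direct optimization" means taking the arg-max or arg-min of the metric and that curvature is approximated by the magnitude of the second difference of the curve. The names pick_best and pick_elbow are hypothetical, and the actual selectors may return a different structure or use another curvature estimate.

```r
# Hypothetical helpers for illustration; not the selectors exposed by cluutils().

# Direct optimization: position of the largest (or smallest) metric value.
pick_best <- function(values, goal = c("maximize", "minimize")) {
  goal <- match.arg(goal)
  if (goal == "maximize") which.max(values) else which.min(values)
}

# Elbow: position where the discrete curvature (absolute second difference) peaks.
pick_elbow <- function(values) {
  if (length(values) < 3) stop("need at least three points to estimate curvature")
  curvature <- abs(diff(values, differences = 2))
  which.max(curvature) + 1   # second differences are centered on interior points
}

pick_best(c(0.31, 0.42, 0.39), goal = "maximize")   # position 2 holds the maximum

# Elbow of a WCSS curve over candidate numbers of clusters.
set.seed(1)
x <- iris[, 1:4]
ks <- 2:8
wcss_curve <- sapply(ks, function(k) {
  stats::kmeans(x, centers = k, nstart = 10)$tot.withinss
})
ks[pick_elbow(wcss_curve)]
```

Direct optimization suits metrics with a clear optimum, such as the silhouette. The elbow rule suits metrics like WCSS that keep improving as the hyperparameter grows, where the arg-min would always pick the largest value tried.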

Each metric helper returns a standardized list with the fields metric, value, goal, and type, so the return contract stays uniform even though the metrics themselves differ.
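
One benefit of a uniform return shape is that results from different metrics can be collected and compared with the same code. The sketch below builds lists with the four documented fields by hand. The constructor make_metric is hypothetical, and the strings used for metric, goal, and type ("minimize", "internal", and so on) are assumptions about plausible values, not values confirmed by the package.

```r
# Hypothetical constructor that mirrors the documented fields; field values are assumptions.
make_metric <- function(metric, value, goal, type) {
  list(metric = metric, value = value, goal = goal, type = type)
}

set.seed(1)
fit <- stats::kmeans(iris[, 1:4], centers = 3)

results <- list(
  make_metric("wcss", fit$tot.withinss, goal = "minimize", type = "internal"),
  make_metric("between_ss_ratio", fit$betweenss / fit$totss,
              goal = "maximize", type = "internal")
)

# Because every result has the same fields, the list binds cleanly into one table.
summary_table <- do.call(rbind, lapply(results, as.data.frame))
summary_table
```

Carrying goal alongside value lets downstream code decide whether larger or smaller is better without hard-coding knowledge of each metric.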

Value

Returns a cluutils object that exposes the metric and selector helpers.

Examples

utils <- cluutils()

data(iris)
x <- iris[, 1:4]
set.seed(1)  # fix the random k-means initialization so the example is reproducible
clu <- stats::kmeans(x, centers = 3)$cluster

utils$metric_wcss(x, clu)
utils$metric_silhouette(x, clu)
utils$metric_entropy(clu, iris$Species)
utils$selector_best(c(0.31, 0.42, 0.39), goal = "maximize")

daltoolbox documentation built on May 14, 2026, 9:06 a.m.