calcGD53: Generalized Dunn’s index (53)

View source: R/clustering_evaluation.R

calcGD53 R Documentation

Generalized Dunn’s index (53)

Description

Calculate the Generalized Dunn’s index (v53) of clustering quality.

Usage

calcGD53(data, belongmatrix, centers)

Arguments

data

The original dataframe used for the clustering (n*p)

belongmatrix

A membership matrix (n*k)

centers

The centers of the clusters

Details

The Generalized Dunn’s index \insertCite{da2020incremental}{geocmeans} is the ratio of the worst (smallest) pairwise separation between clusters to the worst (largest) dispersion within a cluster. A higher value indicates a better clustering. The formula is:

GD_{rs}=\frac{\min_{i \neq j}\left[\delta_{r}\left(\omega_{i}, \omega_{j}\right)\right]}{\max_{k}\left[\Delta_{s}\left(\omega_{k}\right)\right]}

The numerator is a measure of the minimal separation between all the clusters i and j given by the formula:

\delta_{r}\left(\omega_{i}, \omega_{j}\right)=\frac{\sum_{l=1}^{n}\left\|\boldsymbol{x_{l}}-\boldsymbol{c_{i}}\right\|^{\frac{1}{2}} \cdot u_{il}+\sum_{l=1}^{n}\left\|\boldsymbol{x_{l}}-\boldsymbol{c_{j}}\right\|^{\frac{1}{2}} \cdot u_{jl}}{\sum{u_{i}} + \sum{u_{j}}}

where x_{l} is the l-th observation, u is the membership matrix, u_{il} is the membership of observation l in cluster i, and u_{i} is the column of u describing the membership of the n observations in cluster i. c_{i} is the center of cluster i.

The denominator is a measure of the maximal dispersion of all clusters, given by the formula:

\Delta_{s}\left(\omega_{i}\right)=\frac{2 \sum_{l=1}^{n}\left\|\boldsymbol{x_{l}}-\boldsymbol{c_{i}}\right\|^{\frac{1}{2}}}{\sum{u_{i}}}
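The separation and dispersion formulas above can be transcribed almost literally into base R. The following is only a sketch under stated assumptions, not the code of calcGD53: gd53_sketch is a hypothetical helper name, the norm is assumed to be Euclidean, the membership of observation l in cluster i is assumed to be stored in belongmatrix[l, i], and centers is assumed to hold one row per cluster. The package function may differ in details, so the two are not guaranteed to return the same value.

```r
# Hypothetical helper (not part of geocmeans): literal transcription of the
# documented formulas, assuming a Euclidean norm.
gd53_sketch <- function(data, belongmatrix, centers) {
  x <- as.matrix(data)           # n * p observations
  u <- as.matrix(belongmatrix)   # n * k memberships
  ctr <- as.matrix(centers)      # k * p cluster centers (assumed layout)
  k <- ncol(u)

  # root_dist[l, i] = ||x_l - c_i||^(1/2)
  root_dist <- sapply(seq_len(k), function(i) {
    sqrt(sqrt(rowSums(sweep(x, 2, ctr[i, ])^2)))
  })

  weighted <- colSums(root_dist * u)  # sum over l of ||x_l - c_i||^(1/2) * u_il
  mass <- colSums(u)                  # sum of u_i for each cluster

  # Numerator: separation for every pair of clusters (i, j), i != j
  pairs <- utils::combn(k, 2)
  separation <- apply(pairs, 2, function(p) sum(weighted[p]) / sum(mass[p]))

  # Denominator: dispersion of each cluster, as written in the formula
  dispersion <- 2 * colSums(root_dist) / mass

  min(separation) / max(dispersion)
}

# Toy input: two well-separated groups with hard memberships
toy_x <- matrix(c(0, 0, 0, 1, 10, 10, 10, 11), ncol = 2, byrow = TRUE)
toy_u <- matrix(c(1, 0, 1, 0, 0, 1, 0, 1), ncol = 2, byrow = TRUE)
toy_c <- matrix(c(0, 0.5, 10, 10.5), ncol = 2, byrow = TRUE)
gd53_sketch(toy_x, toy_u, toy_c)
```

Storing all the root distances in one n * k matrix keeps both the numerator and the denominator as simple column sums, at the cost of holding that matrix in memory.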

Value

A float: the Generalized Dunn’s index (53)

References

\insertAllCited

Examples

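library(geocmeans)

data(LyonIris)
# Variables used for the clustering
AnalysisFields <- c("Lden", "NO2", "PM25", "VegHautPrt", "Pct0_14", "Pct_65", "Pct_Img",
                    "TxChom1564", "Pct_brevet", "NivVieMed")
dataset <- sf::st_drop_geometry(LyonIris[AnalysisFields])
# Queen contiguity neighbours and row-standardized spatial weights
queen <- spdep::poly2nb(LyonIris, queen = TRUE)
Wqueen <- spdep::nb2listw(queen, style = "W")
# Spatial fuzzy c-means with 5 clusters
result <- SFCMeans(dataset, Wqueen, k = 5, m = 1.5, alpha = 1.5, standardize = TRUE)
# Generalized Dunn's index (53) of the resulting partition
calcGD53(result$Data, result$Belongings, result$Centers)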

geocmeans documentation built on Oct. 16, 2022, 1:07 a.m.