View source: R/clustering_evaluation.R
calcDaviesBouldin | R Documentation |
Calculate the Davies-Bouldin index of clustering quality.
calcDaviesBouldin(data, belongmatrix, centers)
data | The original dataframe used for the clustering (n*p) |
belongmatrix | A membership matrix (n*k) |
centers | The centres of the clusters |
The Davies-Bouldin index \insertCite{da2020incremental}{geocmeans} can be seen as the ratio of the within-cluster dispersion to the between-cluster separation. A lower value indicates more compact clusters, better separated clusters, or both. The formula is:
DB = \frac{1}{k}\sum_{i=1}^k{R_{i}}
with:
R_{i} =\max_{j \neq i}\left(\frac{S_{i}+S_{j}}{M_{i, j}}\right)
S_{i} =\left[\frac{1}{n_{i}} \sum_{l=1}^{n}\left\|\boldsymbol{x}_{l}-\boldsymbol{c}_{i}\right\| u_{l, i}\right]^{\frac{1}{2}}
M_{i, j} =\sum\left\|\boldsymbol{c}_{i}-\boldsymbol{c}_{j}\right\|
So the value of the index is the average of the R_{i} values. For each cluster i, R_{i} is its worst comparison with any other cluster j, calculated as the ratio between the combined dispersion of the two clusters (S_{i} + S_{j}) and the separation between their centres (M_{i, j}).
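The formula above can be transcribed into a few lines of base R. The following is only a minimal sketch, not the package's implementation: the function name `davies_bouldin_sketch` is hypothetical, the norm is assumed to be the Euclidean distance, n_i is assumed to be the sum of the memberships in cluster i, and M_{i, j} is assumed to be the Euclidean distance between the two centres. Values returned by calcDaviesBouldin may differ if the package makes other choices.

```r
# Hypothetical helper (not part of geocmeans): direct transcription of the formula.
davies_bouldin_sketch <- function(data, belongmatrix, centers) {
  data <- as.matrix(data)
  belongmatrix <- as.matrix(belongmatrix)
  centers <- as.matrix(centers)
  k <- nrow(centers)

  # S_i: membership-weighted mean distance to centre i, then square root.
  # Assumption: n_i is the sum of the memberships in cluster i.
  S <- sapply(seq_len(k), function(i) {
    d <- sqrt(rowSums(sweep(data, 2, centers[i, ])^2))
    sqrt(sum(d * belongmatrix[, i]) / sum(belongmatrix[, i]))
  })

  # M_ij: assumed to be the Euclidean distance between centres i and j.
  M <- as.matrix(dist(centers))

  # R_i: largest (worst) ratio of cluster i against every other cluster.
  R <- sapply(seq_len(k), function(i) {
    max((S[i] + S[-i]) / M[i, -i])
  })

  # DB: average of the R_i values.
  mean(R)
}

# Toy example: two tight, well separated clusters with hard memberships.
toy <- rbind(c(0, 0), c(0, 2), c(10, 0), c(10, 2))
u <- rbind(c(1, 0), c(1, 0), c(0, 1), c(0, 1))
ctr <- rbind(c(0, 1), c(10, 1))
davies_bouldin_sketch(toy, u, ctr)
```

In this toy case every point lies at distance 1 from its centre, so S_1 = S_2 = 1, the centres are 10 apart, and the index is (1 + 1) / 10 = 0.2. Moving the two centres closer together or spreading the points out would raise the value.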
A float: the Davies-Bouldin index
library(geocmeans)
data(LyonIris)
# Variables used for the classification
AnalysisFields <- c("Lden", "NO2", "PM25", "VegHautPrt", "Pct0_14", "Pct_65", "Pct_Img",
                    "TxChom1564", "Pct_brevet", "NivVieMed")
dataset <- sf::st_drop_geometry(LyonIris[AnalysisFields])
# Queen-contiguity neighbours with row-standardized spatial weights
queen <- spdep::poly2nb(LyonIris, queen = TRUE)
Wqueen <- spdep::nb2listw(queen, style = "W")
# Spatial fuzzy c-means, then the index computed from the result
result <- SFCMeans(dataset, Wqueen, k = 5, m = 1.5, alpha = 1.5, standardize = TRUE)
calcDaviesBouldin(result$Data, result$Belongings, result$Centers)