Cluster quality statistics
Compute several quality statistics of a given clustering solution.
A dissimilarity matrix or a dist object (see
Factor. A vector of cluster memberships.
Optional numerical vector containing weights.
Compute several quality statistics of a given clustering solution. See the description of the returned value for details.
A list with two elements. The first contains the following statistics:
Point Biserial Correlation. Correlation between the given distance matrix and a distance that equals zero for individuals in the same cluster and one otherwise.
Hubert's Gamma. Same as the previous statistic but using Kendall's Gamma coefficient.
Hubert's Gamma (Somers' D). Same as the previous statistic but using Somers' D coefficient.
Average Silhouette width (observation).
Average Silhouette width (weighted).
Calinski-Harabasz index (pseudo F statistic computed from distances).
Share of the discrepancy explained by the clustering solution.
Calinski-Harabasz index (pseudo F statistic computed from squared distances).
Share of the discrepancy explained by the clustering solution (computed using squared distances).
Hubert's C coefficient.
The second element gives the Average Silhouette Width of each cluster, with one column for each ASW measure.
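To make two of the statistics above more concrete, here is a minimal base R sketch of an unweighted point biserial correlation and of a pseudo R2 and pseudo F computed from pairwise distances. It is an illustration under stated assumptions, not the package's implementation: the toy data, the helper names discrepancy and pseudo_stats, and the discrepancy formula (the sum of pairwise distances divided by the group size) are assumptions, and weights are ignored.

```r
# Toy data (an assumption for illustration): six points on a line, two groups
x <- c(1, 2, 3, 10, 11, 12)
diss <- as.matrix(dist(x))
clustering <- factor(c(1, 1, 1, 2, 2, 2))
lower <- lower.tri(diss)  # use each pair of observations only once

# Point biserial correlation: correlate the distances with a 0/1 "distance"
# that is 0 for pairs in the same cluster and 1 otherwise
cl <- as.integer(clustering)
different <- outer(cl, cl, "!=") * 1
pbc <- cor(diss[lower], different[lower])

# Hypothetical helper: discrepancy of a group, assumed here to be the sum of
# its pairwise distances divided by the number of observations in the group
discrepancy <- function(d) sum(d[lower.tri(d)]) / nrow(d)

# Hypothetical helper: share of discrepancy explained and pseudo F statistic
pseudo_stats <- function(d, cl) {
  n <- nrow(d)
  k <- nlevels(cl)
  ss_total <- discrepancy(d)
  ss_within <- sum(sapply(levels(cl), function(g) {
    idx <- cl == g
    discrepancy(d[idx, idx, drop = FALSE])
  }))
  ss_between <- ss_total - ss_within
  c(R2 = ss_between / ss_total,
    pseudoF = (ss_between / (k - 1)) / (ss_within / (n - k)))
}

res <- pseudo_stats(diss, clustering)       # computed from distances
res_sq <- pseudo_stats(diss^2, clustering)  # computed from squared distances
print(pbc)
print(res)
print(res_sq)
```

With squared Euclidean distances, the discrepancy defined this way equals the usual sum of squared deviations from the mean, which is why the squared-distance variants are reported alongside the plain ones.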
library(WeightedCluster)
data(mvad)
## Aggregate identical state sequences
aggMvad <- wcAggregateCases(mvad[, 17:86], weights=mvad$weight)
## Create the state sequence object
mvad.seq <- seqdef(mvad[aggMvad$aggIndex, 17:86], weights=aggMvad$aggWeights)
## Compute the Hamming distance between sequences
diss <- seqdist(mvad.seq, method="HAM")
## KMedoids using the PAMonce method (clustering only)
clust5 <- wcKMedoids(diss, k=5, weights=aggMvad$aggWeights, cluster.only=TRUE)
## Compute the quality statistics of the clustering solution
qual <- wcClusterQuality(diss, clust5, weights=aggMvad$aggWeights)
print(qual)