valstat.object: Cluster validation statistics - object
In fpc: Flexible Procedures for Clustering

valstat.object

R Documentation

Cluster validation statistics - object

Description

The objects of class "valstat" store cluster validation statistics from various clustering methods run with various numbers of clusters.

Value

A legitimate valstat object is a list. Its format depends on the number of clustering methods involved, say nmethods, i.e., the length of the method component explained below. The first nmethods elements of the valstat list are unnamed and accessed by position. Each of them is itself a list whose elements are numbered from 1 to the maxG component defined below. Element [[i]][[j]] refers to the clustering from clustering method number i with j clusters. Every such element is a list with components avewithin, mnnd, cvnnd, maxdiameter, widestgap, sindex, minsep, asw, dindex, denscut, highdgap, pearsongamma, withinss, entropy. Further optional components are pamc, kdnorm, kdunif, dmode, aggregated. All of these are cluster validation indexes, as follows.

`avewithin`	average distance within clusters (reweighted so that every observation, rather than every distance, has the same weight).
`mnnd`	average distance to `nnk`th nearest neighbour within cluster. (`nnk` is a parameter of `cqcluster.stats`, default 2.)
`cvnnd`	coefficient of variation of dissimilarities to `nnk`th nearest within-cluster neighbour, measuring uniformity of within-cluster densities, weighted over all clusters, see Sec. 3.7 of Hennig (2019). (`nnk` is a parameter of `cqcluster.stats`, default 2.)
`maxdiameter`	maximum cluster diameter.
`widestgap`	widest within-cluster gap or average of cluster-wise widest within-cluster gap, depending on parameter `averagegap` of `cqcluster.stats`, default `FALSE`.
`sindex`	separation index. Defined based on the distances for every point to the closest point not in the same cluster. The separation index is then the mean of the smallest proportion `sepprob` (parameter of `cqcluster.stats`, default 0.1) of these. See Hennig (2019).
`minsep`	minimum cluster separation.
`asw`	average silhouette width. See `silhouette`.
`dindex`	this index measures to what extent the density decreases from the cluster mode to the outskirts; I-densdec in Sec. 3.6 of Hennig (2019); low values are good.
`denscut`	this index measures whether cluster boundaries run through density valleys; I-densbound in Sec. 3.6 of Hennig (2019); low values are good.
`highdgap`	this measures whether there is a large within-cluster gap with high density on both sides; I-highdgap in Sec. 3.6 of Hennig (2019); low values are good.
`pearsongamma`	correlation between distances and a 0-1-vector where 0 means same cluster, 1 means different clusters. "Normalized gamma" in Halkidi et al. (2001).
`withinss`	a generalisation of the within-cluster sum of squares (k-means objective function), which is obtained if the dissimilarity matrix `d` is a Euclidean distance matrix. For general distance measures, this is half the sum of the within-cluster squared dissimilarities divided by the cluster size.
`entropy`	entropy of the distribution of cluster memberships, see Meila (2007).
`pamc`	average distance to cluster centroid, which is the observation that minimises this average distance.
`kdnorm`	Kolmogorov distance between distribution of within-cluster Mahalanobis distances and appropriate chi-squared distribution, aggregated over clusters (I am grateful to Agustin Mayo-Iscar for the idea).
`kdunif`	Kolmogorov distance between distribution of distances to `dnnk`th nearest within-cluster neighbor and appropriate Gamma-distribution, see Byers and Raftery (1998), aggregated over clusters. `dnnk` is parameter `nnk` of `distrsimilarity`, corresponding to `dnnk` of `clusterbenchstats`.
`dmode`	aggregated density mode index equal to `0.75*dindex+0.25*highdgap` after standardisation by `cgrestandard`.
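
Three of the indexes above have definitions simple enough to compute by hand. The following base R sketch illustrates `withinss`, `pearsongamma` and `entropy` as described in this list; it is not the fpc implementation. The toy data and all function names are made up for illustration, and the use of the natural logarithm for the entropy is an assumption.

```r
# Toy data: two well-separated groups in the plane (illustrative values only).
x <- rbind(c(0, 0), c(0, 1), c(1, 0),
           c(5, 5), c(5, 6), c(6, 5), c(6, 6))
clustering <- c(1, 1, 1, 2, 2, 2, 2)
d <- as.matrix(dist(x))  # Euclidean distance matrix

# withinss: per cluster, half the sum of squared within-cluster
# dissimilarities (full symmetric matrix) divided by the cluster size.
withinss_from_dist <- function(d, clustering) {
  sum(sapply(unique(clustering), function(k) {
    idx <- clustering == k
    sum(d[idx, idx]^2) / (2 * sum(idx))
  }))
}

# Classical k-means objective for comparison:
# squared Euclidean distances to the cluster means.
withinss_from_means <- function(x, clustering) {
  sum(sapply(unique(clustering), function(k) {
    xk <- x[clustering == k, , drop = FALSE]
    sum(sweep(xk, 2, colMeans(xk))^2)
  }))
}

# pearsongamma: correlation between pairwise distances and a 0-1 vector
# (0 = same cluster, 1 = different clusters), using each pair once.
pearsongamma_from_dist <- function(d, clustering) {
  lt <- lower.tri(d)
  different <- outer(clustering, clustering, "!=")
  cor(d[lt], as.numeric(different[lt]))
}

# entropy of the cluster membership distribution
# (natural logarithm assumed here).
entropy_from_clustering <- function(clustering) {
  p <- as.numeric(table(clustering)) / length(clustering)
  -sum(p * log(p))
}

wss_d <- withinss_from_dist(d, clustering)
wss_m <- withinss_from_means(x, clustering)
pg <- pearsongamma_from_dist(d, clustering)
ent <- entropy_from_clustering(clustering)
```

With Euclidean distances, `wss_d` and `wss_m` agree up to floating-point error, which is the sense in which `withinss` generalises the k-means objective to arbitrary dissimilarities.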

Furthermore, a valstat object has the following list components:

`maxG`	maximum number of clusters.
`minG`	minimum number of clusters (list entries below that number are empty lists).
`method`	vector of names (character strings) of clustering CBI-functions, see `kmeansCBI`.
`name`	vector of names (character strings) of clustering methods. These can be user-chosen names (see argument `methodnames` in `clusterbenchstats`) and may distinguish different methods run by the same CBI-function but with different parameter values such as complete and average linkage for `hclustCBI`.
`statistics`	vector of names (character strings) of cluster validation indexes.
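
The nested layout described above can be easier to follow on a small hand-built list. The sketch below constructs a hypothetical stand-in with the same shape as a valstat object (two methods, 2 to 3 clusters, only two statistics filled in) and tabulates one index across methods and numbers of clusters. All values and the helper names `make_entry` and `stat_table` are invented for illustration; a real object comes from `clusterbenchstats`, carries class "valstat" and contains the full set of indexes.

```r
# Hypothetical stand-in for a valstat object; all numbers are made up.
make_entry <- function(asw, minsep) list(asw = asw, minsep = minsep)

mock <- list(
  # Method 1: entry 1 lies below minG and is therefore an empty list.
  list(list(), make_entry(0.61, 1.9), make_entry(0.48, 1.2)),
  # Method 2, same layout.
  list(list(), make_entry(0.58, 2.1), make_entry(0.52, 0.9)),
  maxG = 3,
  minG = 2,
  method = c("kmeansCBI", "hclustCBI"),
  name = c("kmeans", "average"),
  statistics = c("asw", "minsep")
)

# Element [[i]][[j]]: method number i with j clusters.
asw_kmeans_2 <- mock[[1]][[2]]$asw

# Collect one statistic into a (number of clusters) x (method) matrix.
stat_table <- function(vs, statname) {
  gs <- vs$minG:vs$maxG
  vals <- vapply(seq_along(vs$method), function(i) {
    vapply(gs, function(j) vs[[i]][[j]][[statname]], numeric(1))
  }, numeric(length(gs)))
  matrix(vals, nrow = length(gs),
         dimnames = list(paste0("G=", gs), vs$name))
}

asw_tab <- stat_table(mock, "asw")
```

Looping from `minG` to `maxG` rather than from 1 avoids the empty list entries below `minG`.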

GENERATION

These objects are generated as part of the clusterbenchstats-output.

METHODS

The valstat class has methods for the following generic functions: print and plot; see plot.valstat.

Author(s)

Christian Hennig christian.hennig@unibo.it https://www.unibo.it/sitoweb/christian.hennig/en/

References

Hennig, C. (2019) Cluster validation by measurement of clustering characteristics relevant to the user. In C. H. Skiadas (ed.) Data Analysis and Applications 1: Clustering and Regression, Modeling-estimating, Forecasting and Data Mining, Volume 2, Wiley, New York 1-24, https://arxiv.org/abs/1703.09282

Akhanli, S. and Hennig, C. (2020) Calibrating and aggregating cluster validity indexes for context-adapted comparison of clusterings. Statistics and Computing, 30, 1523-1544, https://link.springer.com/article/10.1007/s11222-020-09958-2, https://arxiv.org/abs/2002.01822
