Calculates the purity measure (described, for example, by Wu, Xiong and Chen (2009)) to compare two cluster assignment vectors (external cluster validation). The result is a value in (0,1], with higher values indicating more similarity. For each cluster, it finds the most common ground truth class and sums the relative frequencies of these classes over all clusters.
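The description above can be written as a formula. This is a sketch in notation that is not part of the original page: n is assumed to be the total number of objects, and n_{kj} the number of objects in cluster k that belong to ground truth class j.

```latex
\mathrm{purity} \;=\; \sum_{k} \frac{\max_{j} n_{kj}}{n}
\;=\; \frac{1}{n} \sum_{k} \max_{j} n_{kj}
```

Each summand is the relative frequency of the most common class in cluster k, so the sum equals 1 exactly when every cluster contains objects of a single class.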
purity_fast(assignments, groundTruth)
assignments | Integer vector of cluster assignments containing only values from 1 to k with k = number of clusters (code depends on this!).
groundTruth | Integer vector of class (ground truth) assignments containing only values from 1 to k with k = number of clusters.
Be aware that this measure is asymmetric and can still be high if the classes
of the ground truth are split up into multiple (but pure) clusters. Wu, Xiong
and Chen (2009) propose to use the symmetric van Dongen measure
(vanDongen_fast) instead.
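The asymmetry can be illustrated with a small base R sketch. The function purity_sketch below is a hypothetical reference implementation written for this example; it is not the package's purity_fast and makes no claim about how that function is implemented. The example vectors are made up.

```r
# Hypothetical reference implementation of purity (not the package's code).
purity_sketch <- function(assignments, groundTruth) {
  stopifnot(length(assignments) == length(groundTruth))
  # Contingency table: rows = clusters, columns = ground truth classes
  counts <- table(assignments, groundTruth)
  # Take the count of the most common class per cluster, then normalise
  sum(apply(counts, 1, max)) / length(assignments)
}

# Made-up data: two ground truth classes with four objects each
truth <- c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L)
# Each class is split into two pure clusters
splitClusters <- c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L)

purity_sketch(splitClusters, truth)  # 1: every cluster is pure
purity_sketch(truth, splitClusters)  # 0.5: roles swapped, result differs
```

Although the clustering uses twice as many clusters as there are classes, purity is still 1 because every cluster is pure. Swapping the two arguments gives 0.5, which shows that the measure is not symmetric.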
The purity measure as double in (0,1].
Van Dongen, S. (2000). Performance criteria for graph clustering and Markov cluster experiments. National Research Institute for Mathematics and Computer Science. Amsterdam.
Wu, J., Xiong, H. & Chen, J. (2009). Adapting the right measures for k-means clustering. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 877-886). ACM.
Other External Cluster Validity Indices: conditionalEntropy_fast, fowlkesMallows_fast, pairCVIParameters_fast, phi_fast, randIndex_fast, vanDongen_fast