View source: R/measures_clusterings.R
homogeneity | R Documentation |
Computes the homogeneity between two clusterings, such as a predicted and ground truth clustering.
homogeneity(true, pred)
true |
ground truth clustering represented as a membership vector. Each entry corresponds to an element and the value identifies the assigned cluster. The specific values of the cluster identifiers are arbitrary. |
pred |
predicted clustering represented as a membership vector. |
Homogeneity is an entropy-based measure of the similarity between two clusterings, say t and p. The homogeneity is high if each cluster in p contains only members of a single cluster in t. The homogeneity ranges between 0 and 1, where 1 indicates perfect homogeneity.
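To make the entropy-based definition concrete, here is a minimal base R sketch following Rosenberg and Hirschberg (2007), where homogeneity is 1 - H(t | p) / H(t), with H denoting entropy. The function name homogeneity_sketch is hypothetical and is not part of the package. The package's own implementation may differ, for instance in how it validates input or handles edge cases.

```r
# Hypothetical helper (not part of the package): homogeneity computed
# directly from the definition h = 1 - H(true | pred) / H(true).
homogeneity_sketch <- function(true, pred) {
  stopifnot(length(true) == length(pred), length(true) > 0)
  n <- length(true)

  # Contingency table: rows are true clusters, columns are predicted clusters
  joint <- table(true, pred) / n      # joint distribution P(true, pred)
  p_true <- rowSums(joint)            # marginal P(true)
  p_pred <- colSums(joint)            # marginal P(pred)

  # Entropy of the ground truth clustering, H(true)
  h_true <- -sum(p_true * log(p_true))

  # Assumed convention: a single ground truth cluster is trivially homogeneous
  if (h_true == 0) return(1)

  # Conditional entropy H(true | pred), skipping empty cells to avoid log(0)
  cond <- sweep(joint, 2, p_pred, "/")  # P(true | pred)
  nz <- joint > 0
  h_cond <- -sum(joint[nz] * log(cond[nz]))

  1 - h_cond / h_true
}

true <- c(1, 1, 1, 2, 2)
pred <- c(1, 1, 2, 2, 2)
homogeneity_sketch(true, pred)  # approximately 0.43
```

The base of the logarithm cancels in the ratio, so natural logarithms are used here for convenience. A prediction that places every element in its own cluster scores 1 under this definition, which is why homogeneity is usually read alongside completeness.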
Rosenberg, A. and Hirschberg, J. "V-measure: A conditional entropy-based external cluster evaluation measure." Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), (2007).
completeness evaluates the completeness, which is a dual measure to homogeneity.
v_measure evaluates the harmonic mean of completeness and homogeneity.
true <- c(1,1,1,2,2) # ground truth clustering
pred <- c(1,1,2,2,2) # predicted clustering
homogeneity(true, pred)