homogeneity: Homogeneity Between Clusterings

homogeneity    R Documentation

Description

Computes the homogeneity between two clusterings, such as a predicted clustering and a ground truth clustering.

Usage

homogeneity(true, pred)

Arguments

true

ground truth clustering represented as a membership vector. Each entry corresponds to an element and the value identifies the assigned cluster. The specific values of the cluster identifiers are arbitrary.

pred

predicted clustering represented as a membership vector.
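
As a small illustration of the membership-vector representation, the base R sketch below checks that two vectors with different label values can describe the same clustering. The helper clusters_of is a hypothetical name introduced here for illustration; it is not part of clevr.

```r
# Two membership vectors over the same five elements, using different labels
m1 <- c(1, 1, 1, 2, 2)
m2 <- c("b", "b", "b", "a", "a")

# Hypothetical helper: describe each cluster by the indices of its elements,
# then sort the descriptions so that label values and order do not matter
clusters_of <- function(membership) {
  groups <- split(seq_along(membership), membership)
  sort(unname(vapply(groups, paste, character(1), collapse = ",")))
}

# TRUE when both vectors induce the same partition of the elements
same_partition <- identical(clusters_of(m1), clusters_of(m2))
print(same_partition)
```

Because only the induced partition matters, relabeling the clusters in either argument should not affect the score.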

Details

Homogeneity is an entropy-based measure of the similarity between two clusterings, say t (the ground truth) and p (the prediction). Following the definition of Rosenberg and Hirschberg (2007), the homogeneity is high if each cluster in p contains only elements from a single cluster in t. The homogeneity ranges between 0 and 1, where 1 indicates perfect homogeneity.
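
To make the entropy-based definition concrete, here is a minimal base R sketch. It assumes the formulation from the cited Rosenberg and Hirschberg paper, h = 1 - H(t | p) / H(t), with h = 1 by convention when H(t) = 0. The name homogeneity_sketch is hypothetical, and the code is not taken from the clevr source, so the package's results may differ in edge cases.

```r
# Sketch only: assumes h = 1 - H(t | p) / H(t) (Rosenberg and Hirschberg, 2007).
# homogeneity_sketch is a hypothetical name, not a clevr function.
homogeneity_sketch <- function(true, pred) {
  stopifnot(length(true) == length(pred), length(true) > 0)

  joint  <- table(true, pred) / length(true)  # joint distribution of (t, p)
  p_true <- rowSums(joint)                    # marginal distribution of t
  p_true <- p_true[p_true > 0]                # drop unused factor levels
  p_pred <- colSums(joint)                    # marginal distribution of p

  # Entropy of the ground truth clustering, H(t)
  h_true <- -sum(p_true * log(p_true))
  if (h_true == 0) return(1)                  # assumed convention: single true cluster

  # Conditional entropy H(t | p); empty cells contribute nothing
  cond <- sweep(joint, 2, p_pred, "/")        # P(t | p), normalised per column
  nz <- joint > 0
  h_true_given_pred <- -sum(joint[nz] * log(cond[nz]))

  1 - h_true_given_pred / h_true
}

true <- c(1, 1, 1, 2, 2)  # ground truth clustering
pred <- c(1, 1, 2, 2, 2)  # predicted clustering
h <- homogeneity_sketch(true, pred)
print(h)  # approximately 0.43 under the assumed definition
```

In this example the second predicted cluster mixes elements from both true clusters, which is what pulls the score below 1.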

References

Rosenberg, A. and Hirschberg, J. "V-measure: A conditional entropy-based external cluster evaluation measure." Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), 2007.

See Also

completeness evaluates the completeness, which is a dual measure to homogeneity. v_measure evaluates the harmonic mean of completeness and homogeneity.
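
The relationship between the three measures can be sketched in base R. This assumes the definitions from the cited paper: completeness is homogeneity with the roles of the two clusterings swapped, and the V-measure (with equal weighting) is their harmonic mean. All function names ending in _sketch, as well as entropy and cond_entropy, are hypothetical helpers for illustration and are not the clevr implementations.

```r
# Hypothetical helpers for illustration; not taken from the clevr source.
entropy <- function(a) {
  p <- table(a) / length(a)
  p <- p[p > 0]
  -sum(p * log(p))
}

cond_entropy <- function(a, b) {
  joint <- table(a, b) / length(a)
  cond <- sweep(joint, 2, colSums(joint), "/")  # P(a | b)
  nz <- joint > 0
  -sum(joint[nz] * log(cond[nz]))
}

homogeneity_sketch <- function(true, pred) {
  if (entropy(true) == 0) return(1)             # assumed convention
  1 - cond_entropy(true, pred) / entropy(true)
}

# Duality: completeness swaps the roles of the two clusterings
completeness_sketch <- function(true, pred) homogeneity_sketch(pred, true)

# V-measure: harmonic mean of homogeneity and completeness (equal weights)
v_measure_sketch <- function(true, pred) {
  hm <- homogeneity_sketch(true, pred)
  cm <- completeness_sketch(true, pred)
  if (hm + cm == 0) return(0)
  2 * hm * cm / (hm + cm)
}

true <- c(1, 1, 2, 2, 3, 3)
pred <- c(1, 1, 1, 1, 2, 2)  # merges the first two true clusters
hm <- homogeneity_sketch(true, pred)
cm <- completeness_sketch(true, pred)
vm <- v_measure_sketch(true, pred)
print(c(homogeneity = hm, completeness = cm, v_measure = vm))
```

Here the prediction merges two true clusters, so every true cluster still lands in a single predicted cluster (completeness of 1) while homogeneity drops below 1. The V-measure falls between the two.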

Examples

true <- c(1,1,1,2,2)  # ground truth clustering
pred <- c(1,1,2,2,2)  # predicted clustering
homogeneity(true, pred)


clevr documentation built on Sept. 16, 2023, 5:06 p.m.