View source: R/measures_clusterings.R
v_measure | R Documentation |
Computes the V-measure between two clusterings, such as a predicted and ground truth clustering.
v_measure(true, pred, beta = 1)
true | ground truth clustering represented as a membership vector. Each entry corresponds to an element and the value identifies the assigned cluster. The specific values of the cluster identifiers are arbitrary.
pred | predicted clustering represented as a membership vector.
beta | non-negative weight. A value of 0 assigns no weight to completeness (i.e. the measure reduces to homogeneity), while larger values assign increasing weight to completeness. A value of 1 weights completeness and homogeneity equally.
V-measure is defined as the \beta-weighted harmonic mean of homogeneity h and completeness c:
(1 + \beta)\frac{h \cdot c}{\beta \cdot h + c}.
V-measure ranges from 0 to 1, where 1 corresponds to a perfect match between the clusterings. With beta = 1, it is equivalent to the normalised mutual information when the aggregation function is the arithmetic mean.
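The definition above can be sketched in base R to make the roles of h, c and beta concrete. This is an illustrative sketch only, not the package's implementation: the names entropy and v_measure_sketch are hypothetical, and it assumes the standard conditional-entropy definitions of homogeneity and completeness from Rosenberg and Hirschberg (2007), with the convention that a component equals 1 when its normalising entropy is zero.

```r
# Hypothetical helper: Shannon entropy of a probability vector (natural log).
entropy <- function(p) {
  p <- p[p > 0]          # treat 0 * log(0) as 0
  -sum(p * log(p))
}

# Hypothetical sketch of the beta-weighted V-measure; not the package's code.
v_measure_sketch <- function(true, pred, beta = 1) {
  # Joint distribution over (true cluster, predicted cluster) pairs
  joint <- table(true, pred) / length(true)
  h_true  <- entropy(rowSums(joint))
  h_pred  <- entropy(colSums(joint))
  h_joint <- entropy(as.vector(joint))

  # Conditional entropies via the chain rule
  h_true_given_pred <- h_joint - h_pred
  h_pred_given_true <- h_joint - h_true

  # Homogeneity h and completeness c (assumed convention: 1 if entropy is 0)
  h <- if (h_true == 0) 1 else 1 - h_true_given_pred / h_true
  cc <- if (h_pred == 0) 1 else 1 - h_pred_given_true / h_pred

  denom <- beta * h + cc
  if (denom == 0) return(0)
  (1 + beta) * h * cc / denom
}

true <- c(1, 1, 1, 2, 2)
pred <- c(1, 1, 2, 2, 2)
v_measure_sketch(true, pred)            # equal weighting of h and c
v_measure_sketch(true, pred, beta = 0)  # reduces to homogeneity
```

Working from the joint table keeps the sketch independent of the actual cluster labels, which matches the note that identifier values are arbitrary.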
Rosenberg, A. and Hirschberg, J. "V-measure: A conditional entropy-based external cluster evaluation measure." Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), (2007).
Becker, H. "Identification and characterization of events in social media." PhD dissertation, Columbia University, (2011).
homogeneity and completeness evaluate the component measures upon which this measure is based.
true <- c(1,1,1,2,2) # ground truth clustering
pred <- c(1,1,2,2,2) # predicted clustering
v_measure(true, pred)