Similarity.measures: Similarity measures between pairs of clusterings

Similarity.measures R Documentation

Similarity measures between pairs of clusterings

Description

Classical similarity measures between pairs of clusterings are implemented. Each measure takes the pairwise boolean membership matrices of the two clusterings (see Do.boolean.membership.matrix), treats each matrix as a vector, and computes the similarity from inner products of these vectors. The same results can be obtained from contingency matrices using the classical definitions of the Fowlkes and Mallows coefficient (function sFM), the Jaccard coefficient (function sJaccard) and the Matching coefficient (Rand index, function sM). All three values range from 0 (no similarity) to 1 (identical clusterings).
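The inner-product formulation described above can be illustrated with a minimal base R sketch. This is not the package's source code: the helpers membership_matrix, inner, fm_sketch, jaccard_sketch and matching_sketch are hypothetical names introduced here, and setting the diagonal of the membership matrix to 0 (so that only pairs of distinct examples are counted) is an assumption. The last lines recompute the Fowlkes and Mallows value from the contingency table of the two labelings as a check on the stated equivalence.

```r
# Hypothetical helper: pairwise boolean membership matrix from a label vector.
# Entry [i, j] is 1 when examples i and j share a cluster. The diagonal is
# set to 0 here (an assumption) so that only pairs with i != j are counted.
membership_matrix <- function(labels) {
  B <- outer(labels, labels, "==") * 1
  diag(B) <- 0
  B
}

# Inner product of two matrices treated as vectors.
inner <- function(A, B) sum(A * B)

# Fowlkes and Mallows: cosine of the angle between the two vectors.
fm_sketch <- function(M1, M2) {
  inner(M1, M2) / sqrt(inner(M1, M1) * inner(M2, M2))
}

# Jaccard: co-clustered pairs shared by both, over pairs found in either.
jaccard_sketch <- function(M1, M2) {
  n11 <- inner(M1, M2)
  n11 / (inner(M1, M1) + inner(M2, M2) - n11)
}

# Matching (Rand index): fraction of off-diagonal entries that agree.
matching_sketch <- function(M1, M2) {
  off <- row(M1) != col(M1)
  mean(M1[off] == M2[off])
}

# Two toy labelings of six examples.
a <- c(1, 1, 2, 2, 3, 3)
b <- c(1, 1, 2, 2, 2, 3)
Ba <- membership_matrix(a)
Bb <- membership_matrix(b)

fm_sketch(Ba, Bb)        # 0.5773503
jaccard_sketch(Ba, Bb)   # 0.4
matching_sketch(Ba, Bb)  # 0.8
fm_sketch(Ba, Ba)        # 1 for identical clusterings

# The same Fowlkes and Mallows value from the contingency table.
tab <- table(a, b)
pairs_both <- sum(choose(tab, 2))        # pairs together in both labelings
pairs_a <- sum(choose(rowSums(tab), 2))  # pairs together in a
pairs_b <- sum(choose(colSums(tab), 2))  # pairs together in b
fm_contingency <- pairs_both / sqrt(pairs_a * pairs_b)
fm_contingency           # agrees with fm_sketch(Ba, Bb)
```

In this sketch a clustering made only of singletons has an all-zero membership matrix, so the Fowlkes and Mallows and Jaccard ratios are undefined (NaN) for it.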

Usage

sFM(M1, M2)
sJaccard(M1, M2)
sM(M1, M2)

Arguments

M1

boolean membership matrix representing the first clustering

M2

boolean membership matrix representing the second clustering

Value

A similarity value between the two clusterings, computed according to the Fowlkes and Mallows (sFM), Jaccard (sJaccard) or Matching (sM) coefficient.

Author(s)

Giorgio Valentini valentini@di.unimi.it

References

Ben-Hur, A., Elisseeff, A. and Guyon, I., A stability based method for discovering structure in clustered data, In: "Pacific Symposium on Biocomputing", Altman, R.B. et al. (eds.), pp. 6-17, 2002.

See Also

Do.boolean.membership.matrix

Examples

# sFM, sJaccard and sM are documented in the mosclust package
library("mosclust")
library("clusterv")
library("stats")
library("cluster")
# Synthetic data set generation (3 clusters with 20 examples for each cluster)
M <- generate.sample3(n=20, m=2)
# k-means clustering with 3 clusters
r1 <- kmeans(t(M), centers = 3, iter.max = 1000)
# this function is implemented in the clusterv package:
cl1 <- Transform.vector.to.list(r1$cluster)
# generation of a boolean membership square matrix:
Bkmeans <- Do.boolean.membership.matrix(cl1, 60, 1:60)
# the same as above, using PAM clustering with 3 clusters
d <- dist(t(M))
r2 <- pam(d, 3, cluster.only = TRUE)
cl2 <- Transform.vector.to.list(r2)
BPAM <- Do.boolean.membership.matrix(cl2, 60, 1:60)
# computation of the Fowlkes and Mallows index between the k-means and the PAM clustering:
sFM(Bkmeans, BPAM)
# computation of the Jaccard index between the k-means and the PAM clustering:
sJaccard(Bkmeans, BPAM)
# computation of the Matching coefficient between the k-means and the PAM clustering:
sM(Bkmeans, BPAM)

mosclust documentation built on June 8, 2025, 11:23 a.m.