# fowlkes_mallows: Computes the Fowlkes-Mallows similarity index of two... In ramhiser/clusteval: Evaluation of Clustering Algorithms

## Description

For two clusterings of the same data set, this function calculates the Fowlkes-Mallows similarity coefficient from the comemberships of the observations. Two observations are comembers in a clustering if they are assigned to the same cluster.

## Usage

    fowlkes_mallows(labels1, labels2)

## Arguments

| Argument | Description |
|----------|-------------|
| labels1 | a vector of n clustering labels |
| labels2 | a vector of n clustering labels |

## Details

To calculate the Fowlkes-Mallows index, we compute the 2x2 contingency table, consisting of the following four cells:

n_11:

the number of observation pairs where both observations are comembers in both clusterings

n_10:

the number of observation pairs where the observations are comembers in the first clustering but not the second

n_01:

the number of observation pairs where the observations are comembers in the second clustering but not the first

n_00:

the number of observation pairs where the observations are comembers in neither clustering

The Fowlkes-Mallows similarity index is defined as:

$$\frac{n_{11}}{\sqrt{(n_{11} + n_{10})(n_{11} + n_{01})}}.$$
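As a small worked illustration, with labels chosen only for this example, take n = 4 observations with labels1 = (1, 1, 2, 2) and labels2 = (1, 1, 1, 2). Of the 6 observation pairs, only (1, 2) is a comember pair in both clusterings, (3, 4) is a comember pair in the first clustering only, (1, 3) and (2, 3) are comember pairs in the second clustering only, and (1, 4) and (2, 4) are comember pairs in neither:

$$n_{11} = 1, \quad n_{10} = 1, \quad n_{01} = 2, \quad n_{00} = 2,$$

$$\frac{n_{11}}{\sqrt{(n_{11} + n_{10})(n_{11} + n_{01})}} = \frac{1}{\sqrt{(1 + 1)(1 + 2)}} = \frac{1}{\sqrt{6}} \approx 0.408.$$

Note that n_00 does not enter the index.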

To compute the contingency table, we use the comembership_table function.
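The pair counting described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: `pair_counts` and `fm_sketch` are hypothetical helper names, and the sketch compares every pair of observations directly rather than calling comembership_table. It builds two n x n matrices, so it is only suitable for small n.

```r
# Hypothetical helper (not part of clusteval): counts the four cells of the
# 2x2 comembership table by comparing every pair of observations directly.
pair_counts <- function(labels1, labels2) {
  stopifnot(length(labels1) == length(labels2))
  # same1[i, j] is TRUE when observations i and j share a cluster in labels1
  same1 <- outer(labels1, labels1, "==")
  same2 <- outer(labels2, labels2, "==")
  # Keep each unordered pair once (i < j)
  upper <- upper.tri(same1)
  s1 <- same1[upper]
  s2 <- same2[upper]
  c(n_11 = sum(s1 & s2),
    n_10 = sum(s1 & !s2),
    n_01 = sum(!s1 & s2),
    n_00 = sum(!s1 & !s2))
}

# Hypothetical helper: applies the formula above to the counted cells.
# If either clustering has no comember pairs, the denominator is zero
# and the result is NaN.
fm_sketch <- function(labels1, labels2) {
  n <- pair_counts(labels1, labels2)
  n11 <- as.numeric(n[["n_11"]])
  n11 / sqrt((n11 + n[["n_10"]]) * (n11 + n[["n_01"]]))
}

# Made-up labels for five observations
counts <- pair_counts(c(1, 1, 1, 2, 2), c(1, 1, 2, 2, 2))
fm_value <- fm_sketch(c(1, 1, 1, 2, 2), c(1, 1, 2, 2, 2))
```

For these made-up labels, the ten pairs split into n_11 = 2, n_10 = 2, n_01 = 2, and n_00 = 4, giving an index of 2 / sqrt(4 * 4) = 0.5. Because the computation only asks whether two labels are equal, relabeling the clusters leaves the result unchanged.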

## Value

the Fowlkes-Mallows index for the two sets of cluster labels

## Examples

    ## Not run:
    library(clusteval)

    # We generate K = 3 labels for each of n = 10 observations and compute the
    # Fowlkes-Mallows similarity index between the two clusterings.
    set.seed(42)
    K <- 3
    n <- 10
    labels1 <- sample.int(K, n, replace = TRUE)
    labels2 <- sample.int(K, n, replace = TRUE)
    fowlkes_mallows(labels1, labels2)

    # Here, we cluster the iris data set with the K-means and
    # hierarchical algorithms using the true number of clusters, K = 3.
    # Then, we compute the Fowlkes-Mallows similarity index between the two
    # clusterings.
    iris_kmeans <- kmeans(iris[, -5], centers = 3)$cluster
    iris_hclust <- cutree(hclust(dist(iris[, -5])), k = 3)
    fowlkes_mallows(iris_kmeans, iris_hclust)
    ## End(Not run)

ramhiser/clusteval documentation built on Oct. 17, 2017, 12:26 p.m.