Dissimilarity between a pair of clusters

Description

Calculate the dissimilarity between a pair of cell populations (clusters) from the distributions of the clusters.

Usage

dist.cluster(cluster1, cluster2, dist.type = 'Mahalanobis')

Arguments

cluster1

an object of class Cluster representing the distribution parameters of the first cluster.

cluster2

an object of class Cluster representing the distribution parameters of the second cluster.

dist.type

character, indicating the method with which the dissimilarity between a pair of clusters is computed. Supported dissimilarity measures are: 'Mahalanobis', 'KL' and 'Euclidean'.

Details

Consider two p-dimensional, normally distributed clusters with centers μ1, μ2 and covariance matrices Σ1, Σ2. Assume the sizes of the clusters are n1 and n2, respectively. We compute the dissimilarity d12 between the clusters as follows:

  1. If dist.type='Mahalanobis': we compute the dissimilarity d12 as the Mahalanobis distance between the centers of the clusters, using the pooled covariance matrix Σ of the two clusters.

    Σ = ( (n1-1) * Σ1 + (n2-1) * Σ2) / (n1+n2-2)

    d12 = sqrt( t(μ1-μ2) * Σ^(-1) * (μ1-μ2))

  2. If dist.type='KL': we compute the dissimilarity d12 with the symmetrized Kullback-Leibler (KL) divergence between the distributions of the clusters. The KL divergence is not symmetric in its original form, so we symmetrize it by averaging the divergences in both directions. The symmetrized KL divergence is not a metric because it does not satisfy the triangle inequality.

    d12 = 1/4 * ( t(μ2 - μ1) * ( Σ1^(-1) + Σ2^(-1) ) * (μ2 - μ1) + trace( Σ1 * Σ2^(-1) + Σ2 * Σ1^(-1) ) - 2p )

  3. If dist.type='Euclidean': we compute the dissimilarity d12 with the Euclidean distance between the centers of the clusters.

    d12 = sqrt( ∑ (μ1-μ2)^2 )

Both clusters must have the same dimension p.
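
The three measures can be sketched in base R as follows. This is an illustration of the formulas only, not the flowMatch implementation: the function names (mahalanobis_sketch, symmetric_kl_sketch, euclidean_sketch) and the toy parameters are hypothetical. The KL sketch uses the standard closed form for the average of the two directed KL divergences between Gaussians, in which the constant term is -2p, so that two identical clusters have a divergence of zero.

```r
# Base-R sketch of the three dissimilarities (hypothetical helper names).
# mu1, mu2: center vectors; S1, S2: covariance matrices; n1, n2: cluster sizes.

mahalanobis_sketch <- function(mu1, mu2, S1, S2, n1, n2) {
  # Pooled covariance, weighting each cluster by its degrees of freedom
  S <- ((n1 - 1) * S1 + (n2 - 1) * S2) / (n1 + n2 - 2)
  d <- mu1 - mu2
  sqrt(drop(t(d) %*% solve(S, d)))
}

symmetric_kl_sketch <- function(mu1, mu2, S1, S2) {
  p <- length(mu1)
  d <- mu2 - mu1
  S1inv <- solve(S1)
  S2inv <- solve(S2)
  quad <- drop(t(d) %*% (S1inv + S2inv) %*% d)   # quadratic term in the centers
  tr <- sum(diag(S1 %*% S2inv + S2 %*% S1inv))   # trace term
  (quad + tr - 2 * p) / 4
}

euclidean_sketch <- function(mu1, mu2) {
  sqrt(sum((mu1 - mu2)^2))
}

# Toy two-dimensional clusters with made-up parameters
mu1 <- c(0, 0)
mu2 <- c(3, 4)
S1 <- diag(2)
S2 <- diag(2)

mahalanobis_sketch(mu1, mu2, S1, S2, n1 = 100, n2 = 150)  # 5
symmetric_kl_sketch(mu1, mu2, S1, S2)                     # 12.5
euclidean_sketch(mu1, mu2)                                # 5
```

With identity covariance matrices the pooled covariance is also the identity, so the Mahalanobis and Euclidean values coincide.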

Value

dist.cluster returns a numeric value denoting the dissimilarity between a pair of cell populations (clusters).

Author(s)

Ariful Azad

References

McLachlan, GJ (1999) Mahalanobis distance; Resonance 4(6), 20–26.

Abou-Moustafa, KT, De La Torre, F and Ferrie, FP (2010) Designing a Metric for the Difference between Gaussian Densities; Brain, Body and Machine, 57–70.

See Also

mahalanobis.dist, symmetric.KL, dist.matrix

Examples

## ------------------------------------------------
## load data and retrieve a sample
## ------------------------------------------------

library(flowMatch)
library(healthyFlowData)
data(hd)
sample = exprs(hd.flowSet[[1]])

## ------------------------------------------------
## cluster sample using kmeans algorithm
## ------------------------------------------------

km = kmeans(sample, centers=4, nstart=20)
cluster.labels = km$cluster

## ------------------------------------------------
## Create ClusteredSample object
## and compute the dissimilarity between two clusters
## with each supported measure
## ------------------------------------------------

clustSample = ClusteredSample(labels=cluster.labels, sample=sample)
clust1 = get.clusters(clustSample)[[1]]
clust2 = get.clusters(clustSample)[[2]]
dist.cluster(clust1, clust2, dist.type='Mahalanobis')
dist.cluster(clust1, clust2, dist.type='KL')
dist.cluster(clust1, clust2, dist.type='Euclidean')