# dist.cluster: Dissimilarity between a pair of clusters

In flowMatch: Matching and meta-clustering in flow cytometry

## Description

Calculate the dissimilarity between a pair of cell populations (clusters) from the distributions of the clusters.

## Usage

```
dist.cluster(cluster1, cluster2, dist.type = 'Mahalanobis')
```

## Arguments

| Argument | Description |
| --- | --- |
| `cluster1` | an object of class `Cluster` representing the distribution parameters of the first cluster. |
| `cluster2` | an object of class `Cluster` representing the distribution parameters of the second cluster. |
| `dist.type` | character, indicating the method used to compute the dissimilarity between a pair of clusters. Supported dissimilarity measures are 'Mahalanobis', 'KL' and 'Euclidean'. |

## Details

Consider two `p`-dimensional, normally distributed clusters with centers μ1, μ2 and covariance matrices Σ1, Σ2. Assume the sizes of the clusters are `n1` and `n2`, respectively. We compute the dissimilarity `d12` between the clusters as follows:

1. If dist.type='Mahalanobis': `d12` is the Mahalanobis distance between the distributions of the clusters, computed with the pooled covariance matrix Σ.

Σ = ( (n1-1) * Σ1 + (n2-1) * Σ2) / (n1+n2-2)

d12 = sqrt( t(μ1-μ2) * Σ^(-1) * (μ1-μ2))

2. If dist.type='KL': `d12` is the symmetrized Kullback-Leibler (KL) divergence between the distributions of the clusters. The KL divergence is not symmetric in its original form, so we symmetrize it by averaging the divergences in both directions. The symmetrized KL divergence is not a metric because it does not satisfy the triangle inequality.

d12 = 1/4 * ( t(μ2 - μ1) * ( Σ1^(-1) + Σ2^(-1) ) * (μ2 - μ1) + trace( Σ1 * Σ2^(-1) + Σ2 * Σ1^(-1) ) - 2p )

3. If dist.type='Euclidean': we compute the dissimilarity `d12` with the Euclidean distance between the centers of the clusters.

d12 = sqrt( ∑ (μ1-μ2)^2 )

Both clusters must have the same dimension `p`.
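The three measures can be sketched in base R as follows. This is a minimal illustration that works directly on centers, covariance matrices and cluster sizes, not the package's implementation; the helper names `mahalanobis_sketch`, `symmetric_kl_sketch` and `euclidean_sketch` are hypothetical and not part of flowMatch. The KL helper uses the standard closed form of the symmetrized KL divergence between two Gaussians, whose constant term is -2p, so that two identical clusters have dissimilarity 0.

```r
# Hypothetical helpers (not part of flowMatch). Inputs are the cluster
# centers (numeric vectors), covariance matrices and cluster sizes.

# Mahalanobis distance using the pooled covariance matrix.
mahalanobis_sketch <- function(mu1, mu2, S1, S2, n1, n2) {
  S <- ((n1 - 1) * S1 + (n2 - 1) * S2) / (n1 + n2 - 2)
  d <- mu1 - mu2
  sqrt(drop(t(d) %*% solve(S, d)))
}

# Symmetrized KL divergence between two Gaussians
# (average of the divergences in both directions).
symmetric_kl_sketch <- function(mu1, mu2, S1, S2) {
  p <- length(mu1)
  d <- mu2 - mu1
  S1inv <- solve(S1)
  S2inv <- solve(S2)
  quad <- drop(t(d) %*% (S1inv + S2inv) %*% d)
  tr <- sum(diag(S1 %*% S2inv + S2 %*% S1inv))
  (quad + tr - 2 * p) / 4
}

# Euclidean distance between the cluster centers.
euclidean_sketch <- function(mu1, mu2) {
  sqrt(sum((mu1 - mu2)^2))
}

# Toy example: two 2-dimensional clusters with identity covariance.
mu1 <- c(0, 0)
mu2 <- c(3, 4)
S1 <- diag(2)
S2 <- diag(2)

mahalanobis_sketch(mu1, mu2, S1, S2, n1 = 50, n2 = 80)  # 5
symmetric_kl_sketch(mu1, mu2, S1, S2)                   # 12.5
euclidean_sketch(mu1, mu2)                              # 5
```

With identity covariance matrices the Mahalanobis and Euclidean distances coincide, which makes this toy case easy to check by hand.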

## Value

`dist.cluster` returns a numeric value denoting the dissimilarity between a pair of cell populations (clusters).

## References

McLachlan, GJ (1999) Mahalanobis distance; Resonance 4(6), 20–26.

Abou-Moustafa, KT, De La Torre, F and Ferrie, FP (2010) Designing a Metric for the Difference between Gaussian Densities; Brain, Body and Machine, 57–70.

## See Also

`mahalanobis.dist`, `symmetric.KL`, `dist.matrix`

## Examples

```
## ------------------------------------------------
## load data and retrieve a sample
## ------------------------------------------------

library(flowMatch)
library(healthyFlowData)
data(hd)
sample = exprs(hd.flowSet[[1]])

## ------------------------------------------------
## cluster sample using kmeans algorithm
## ------------------------------------------------

km = kmeans(sample, centers=4, nstart=20)
cluster.labels = km$cluster

## ------------------------------------------------
## Create ClusteredSample object
## and compute the dissimilarity between two clusters
## with each supported measure
## ------------------------------------------------

clustSample = ClusteredSample(labels=cluster.labels, sample=sample)
clust1 = get.clusters(clustSample)[[1]]
clust2 = get.clusters(clustSample)[[2]]
dist.cluster(clust1, clust2, dist.type='Mahalanobis')
dist.cluster(clust1, clust2, dist.type='KL')
dist.cluster(clust1, clust2, dist.type='Euclidean')
```