symmetric.KL: Symmetrized Kullback-Leibler divergence


View source: R/clusterDistances.R

Description

Compute the Symmetrized Kullback-Leibler divergence between a pair of normally distributed clusters.

Usage

symmetric.KL(mean1, mean2, cov1, cov2)

Arguments

mean1

mean vector of length p for cluster 1, where p is the dimension of the clusters.

mean2

mean vector of length p for cluster 2.

cov1

p x p covariance matrix for cluster 1.

cov2

p x p covariance matrix for cluster 2.
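
As an illustration of the expected argument shapes, the following sketch estimates a mean vector and a covariance matrix for two clusters from synthetic data using base R only. The data and the variable names (x1, x2, p) are assumptions made for this example and do not come from the package.

```r
# Synthetic clusters: rows are observations, columns are the p dimensions.
set.seed(1)
p <- 3
x1 <- matrix(rnorm(50 * p), ncol = p)            # cluster 1, 50 points
x2 <- matrix(rnorm(60 * p, mean = 2), ncol = p)  # cluster 2, 60 points

# Mean vectors of length p
mean1 <- colMeans(x1)
mean2 <- colMeans(x2)

# p x p sample covariance matrices
cov1 <- cov(x1)
cov2 <- cov(x2)
```

Values built this way have the shapes described above: two vectors of length p and two p x p matrices.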

Details

Consider two p-dimensional, normally distributed clusters with centers μ1, μ2 and covariance matrices Σ1, Σ2. We compute the symmetrized KL divergence d12 between the clusters as follows:

d12 = 1/4 * ( t(μ2 - μ1) * ( Σ1^(-1) + Σ2^(-1) ) * (μ2 - μ1) + trace(Σ1 * Σ2^(-1) + Σ2 * Σ1^(-1)) - 2p )

Both clusters must have the same dimension p.
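
The computation can be sketched in base R by averaging the two directed KL divergences between Gaussians. This is an illustrative sketch under the standard closed form for Gaussian KL divergence, not the flowMatch implementation, and its values are not guaranteed to match what symmetric.KL returns. The helper names kl.gauss and sym.kl.sketch are hypothetical.

```r
# Directed KL divergence KL( N(mu.a, S.a) || N(mu.b, S.b) ).
# Hypothetical helper for illustration; not part of flowMatch.
kl.gauss <- function(mu.a, mu.b, S.a, S.b) {
  p <- length(mu.a)
  d <- mu.b - mu.a
  S.b.inv <- solve(S.b)
  # log(det(S.b) / det(S.a)), computed on the log scale for stability
  log.det.ratio <- as.numeric(determinant(S.b)$modulus - determinant(S.a)$modulus)
  0.5 * (sum(diag(S.b.inv %*% S.a)) +
         as.numeric(t(d) %*% S.b.inv %*% d) - p + log.det.ratio)
}

# Symmetrized divergence: the average of both directions.
sym.kl.sketch <- function(mean1, mean2, cov1, cov2) {
  p <- length(mean1)
  # Both clusters must share the same dimension p.
  stopifnot(length(mean2) == p,
            identical(dim(cov1), c(p, p)),
            identical(dim(cov2), c(p, p)))
  0.5 * (kl.gauss(mean1, mean2, cov1, cov2) +
         kl.gauss(mean2, mean1, cov2, cov1))
}

# Example inputs (made up for this sketch)
m1 <- c(0, 0)
m2 <- c(1, 2)
S1 <- diag(2)
S2 <- matrix(c(2, 0.5, 0.5, 1), 2, 2)
d12 <- sym.kl.sketch(m1, m2, S1, S2)
d21 <- sym.kl.sketch(m2, m1, S2, S1)  # same value with the clusters swapped
```

The log-determinant terms cancel when the two directions are averaged, which is why no determinant appears in the closed-form expression.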

Note that the KL divergence is not symmetric in its original form. We symmetrize it by averaging the KL divergences in both directions. The symmetrized KL divergence is still not a metric because it does not satisfy the triangle inequality.
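
A small univariate example makes the failure of the triangle inequality concrete. The sketch below averages the two directed KL divergences between one-dimensional normals with unit variance centered at 0, 1 and 2. The helper sym.kl.1d is hypothetical and written only for this illustration. Under this definition, going directly from the first to the third cluster costs 2, while the two intermediate steps cost 0.5 each.

```r
# Symmetrized KL divergence between univariate normals N(m1, v1) and N(m2, v2),
# obtained by averaging the two directed KL divergences.
# Hypothetical helper for illustration; not part of flowMatch.
sym.kl.1d <- function(m1, m2, v1, v2) {
  kl <- function(ma, mb, va, vb) {
    0.5 * (log(vb / va) + (va + (ma - mb)^2) / vb - 1)
  }
  0.5 * (kl(m1, m2, v1, v2) + kl(m2, m1, v2, v1))
}

d.ab <- sym.kl.1d(0, 1, 1, 1)  # clusters centered at 0 and 1
d.bc <- sym.kl.1d(1, 2, 1, 1)  # clusters centered at 1 and 2
d.ac <- sym.kl.1d(0, 2, 1, 1)  # clusters centered at 0 and 2

# TRUE: the direct divergence exceeds the sum of the two steps
violates.triangle <- d.ac > d.ab + d.bc
```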

Value

symmetric.KL returns a numeric value measuring the Symmetrized Kullback-Leibler divergence between a pair of normally distributed clusters.

Author(s)

Ariful Azad

References

Abou-Moustafa, Karim T., De La Torre, Fernando and Ferrie, Frank P. (2010) Designing a Metric for the Difference between Gaussian Densities; Brain, Body and Machine, 57–70.

See Also

mahalanobis.dist, dist.cluster

Examples

## ------------------------------------------------
## load data and retrieve a sample
## ------------------------------------------------

library(flowMatch)
library(healthyFlowData)
data(hd)
sample = exprs(hd.flowSet[[1]])


## ------------------------------------------------
## cluster sample using kmeans algorithm
## ------------------------------------------------

km = kmeans(sample, centers=4, nstart=20)
cluster.labels = km$cluster

## ------------------------------------------------
## Create ClusteredSample object
## and compute the symmetrized KL divergence between two clusters
## ------------------------------------------------

clustSample = ClusteredSample(labels=cluster.labels, sample=sample)
clusters = get.clusters(clustSample)
mean1 = get.center(clusters[[1]])
mean2 = get.center(clusters[[2]])
cov1 = get.cov(clusters[[1]])
cov2 = get.cov(clusters[[2]])
symmetric.KL(mean1, mean2, cov1, cov2)

flowMatch documentation built on Nov. 8, 2020, 8:02 p.m.