symmetric.KL {flowMatch}, R Documentation
Compute the Symmetrized Kullback-Leibler divergence between a pair of normally distributed clusters.
symmetric.KL(mean1, mean2, cov1, cov2)
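mean1 | mean vector of length p for the first cluster
mean2 | mean vector of length p for the second cluster
cov1 | p x p covariance matrix of the first cluster
cov2 | p x p covariance matrix of the second cluster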
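Consider two p-dimensional, normally distributed clusters with centers μ1, μ2 and covariance matrices Σ1, Σ2. We compute the symmetrized KL divergence d12 between the clusters as follows:
d12 = 1/4 * ( t(μ2 - μ1) * ( Σ1^(-1) + Σ2^(-1) ) * (μ2 - μ1) + trace( Σ2^(-1) * Σ1 + Σ1^(-1) * Σ2 ) - 2p )
The constant term -2p follows from averaging the two directed KL divergences and makes d12 equal to zero when the two clusters are identical.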
Both clusters must have the same dimension p.
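The closed form can be sketched in base R as follows. This is only an illustration under stated assumptions, not the flowMatch implementation: `sym_kl_sketch` is a hypothetical helper, the toy means and covariances are made up, and the constant term is taken as -2p, which is what averaging the two directed divergences yields.

```r
# Hypothetical helper (not part of flowMatch): symmetrized KL divergence
# between two Gaussian clusters, using base R only.
sym_kl_sketch <- function(mean1, mean2, cov1, cov2) {
  p <- length(mean1)
  # Both clusters must have the same dimension p
  stopifnot(length(mean2) == p,
            is.matrix(cov1), is.matrix(cov2),
            all(dim(cov1) == c(p, p)),
            all(dim(cov2) == c(p, p)))
  inv1 <- solve(cov1)                  # Sigma1^(-1)
  inv2 <- solve(cov2)                  # Sigma2^(-1)
  d <- mean2 - mean1                   # difference of the centers
  quad <- as.numeric(t(d) %*% (inv1 + inv2) %*% d)
  tr <- sum(diag(inv2 %*% cov1 + inv1 %*% cov2))
  0.25 * (quad + tr - 2 * p)
}

# Toy two-dimensional clusters (assumed values, for illustration only)
mu1 <- c(0, 0)
mu2 <- c(1, 2)
S1 <- diag(2)
S2 <- matrix(c(2, 0.5, 0.5, 1), nrow = 2)

d12 <- sym_kl_sketch(mu1, mu2, S1, S2)
d21 <- sym_kl_sketch(mu2, mu1, S2, S1)   # same value: the measure is symmetric
d11 <- sym_kl_sketch(mu1, mu1, S1, S1)   # identical clusters
```

Swapping the two clusters leaves the value unchanged, and comparing a cluster with itself gives zero, which is a quick sanity check for any implementation of the formula.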
Note that the KL divergence is not symmetric in its original form. We symmetrize it by averaging the KL divergences computed in both directions. The symmetrized KL divergence is not a metric because it does not satisfy the triangle inequality.
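The asymmetry and the effect of averaging can be checked numerically. The sketch below assumes the standard closed form of the directed KL divergence between two Gaussians; `kl_directed` is a hypothetical helper and the toy parameters are made up for illustration.

```r
# Hypothetical helper: directed KL divergence KL(N1 || N2) between two
# Gaussians N1 = N(mean1, cov1) and N2 = N(mean2, cov2).
kl_directed <- function(mean1, mean2, cov1, cov2) {
  p <- length(mean1)
  inv2 <- solve(cov2)
  d <- mean2 - mean1
  # log-determinants, computed on the log scale for numerical stability
  logdet1 <- as.numeric(determinant(cov1, logarithm = TRUE)$modulus)
  logdet2 <- as.numeric(determinant(cov2, logarithm = TRUE)$modulus)
  0.5 * (sum(diag(inv2 %*% cov1)) +
         as.numeric(t(d) %*% inv2 %*% d) - p + logdet2 - logdet1)
}

# Toy two-dimensional clusters (assumed values, for illustration only)
mu1 <- c(0, 0)
mu2 <- c(1, 2)
S1 <- diag(2)
S2 <- matrix(c(2, 0.5, 0.5, 1), nrow = 2)

kl12 <- kl_directed(mu1, mu2, S1, S2)   # divergence from cluster 1 to cluster 2
kl21 <- kl_directed(mu2, mu1, S2, S1)   # divergence from cluster 2 to cluster 1
sym  <- 0.5 * (kl12 + kl21)             # average of both directions
```

The two directed values differ, while their average does not depend on the order of the clusters. The log-determinant terms cancel in the average, which is why they do not appear in the symmetrized formula.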
symmetric.KL returns a numeric value measuring the symmetrized Kullback-Leibler divergence between a pair of normally distributed clusters.
Ariful Azad
Abou-Moustafa, Karim T., De La Torre, Fernando, and Ferrie, Frank P. (2010). Designing a Metric for the Difference between Gaussian Densities. Brain, Body and Machine, 57–70.
mahalanobis.dist, dist.cluster
## ------------------------------------------------
## load data and retrieve a sample
## ------------------------------------------------
library(flowMatch)
library(healthyFlowData)
data(hd)
sample = exprs(hd.flowSet[[1]])

## ------------------------------------------------
## cluster sample using kmeans algorithm
## ------------------------------------------------
km = kmeans(sample, centers=4, nstart=20)
cluster.labels = km$cluster

## ------------------------------------------------
## Create ClusteredSample object
## and compute the symmetrized KL divergence between two clusters
## ------------------------------------------------
clustSample = ClusteredSample(labels=cluster.labels, sample=sample)
clusters = get.clusters(clustSample)
mean1 = get.center(clusters[[1]])
mean2 = get.center(clusters[[2]])
cov1 = get.cov(clusters[[1]])
cov2 = get.cov(clusters[[2]])
symmetric.KL(mean1, mean2, cov1, cov2)