# KLD: Kullback-Leibler Divergence (KLD) In LaplacesDemonR/LaplacesDemon: Complete Environment for Bayesian Inference

## Description

This function calculates the Kullback-Leibler divergence (KLD) between two probability distributions, and has many uses, such as in lowest posterior loss probability intervals, posterior predictive checks, prior elicitation, reference priors, and Variational Bayes.

## Usage

```r
KLD(px, py, base)
```

## Arguments

`px` This is a required vector of probability densities, considered as p(x). Log-densities are also accepted, in which case both `px` and `py` must be log-densities.

`py` This is a required vector of probability densities, considered as p(y). Log-densities are also accepted, in which case both `px` and `py` must be log-densities.

`base` This optional argument specifies the logarithmic base, which defaults to `base=exp(1)` (or e) and represents information in natural units (nats), where `base=2` represents information in binary units (bits).
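To make the effect of `base` concrete, here is a minimal base-R sketch that computes one directed divergence by hand in nats and in bits. The vectors `p` and `q` are made-up illustrative values, and the calculation is written out directly rather than calling `KLD`, so it should be read as an illustration of the units, not as the package's implementation.

```r
# Hypothetical discrete distributions (illustrative values, each sums to 1)
p <- c(0.5, 0.3, 0.2)
q <- c(0.4, 0.4, 0.2)

# Directed divergence of p relative to q in nats (natural log) and in bits (log base 2)
kl_nats <- sum(p * log(p / q, base = exp(1)))
kl_bits <- sum(p * log(p / q, base = 2))

# Changing the base only rescales the result: bits = nats / log(2)
all.equal(kl_bits, kl_nats / log(2))
```

The choice of base therefore affects only the unit of the reported divergence, not any comparison between divergences computed with the same base.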

## Details

The Kullback-Leibler divergence (KLD) is known by many names, some of which are Kullback-Leibler distance, K-L, and logarithmic divergence. KLD is an asymmetric measure of the difference, distance, or directed divergence between two probability distributions p(y) and p(x) (Kullback and Leibler, 1951). Mathematically, however, KLD is not a distance, because of its asymmetry.

Here, p(y) represents the “true” distribution of data, observations, or theoretical distribution, and p(x) represents a theory, model, or approximation of p(y).

For probability distributions p(y) and p(x) that are discrete (whether the underlying distribution is continuous or discrete, the observations themselves are always discrete, such as from i=1,...,N),

$$\mathrm{KLD}[p(y) \,\|\, p(x)] = \sum_{i=1}^{N} p(y_i) \log \frac{p(y_i)}{p(x_i)}$$
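The sum above, and the asymmetry noted earlier, can be sketched in a few lines of base R. The helper `kl_directed` and the two probability vectors are hypothetical names and values introduced only for this illustration; they are not part of LaplacesDemon, and the helper assumes strictly positive inputs that each sum to one.

```r
# Hypothetical helper (not part of LaplacesDemon): directed divergence of
# `from` relative to `to`, both assumed positive and summing to 1
kl_directed <- function(from, to, base = exp(1)) {
  sum(from * log(from / to, base = base))
}

py <- c(0.7, 0.2, 0.1)   # plays the role of the "true" distribution p(y)
px <- c(0.5, 0.3, 0.2)   # plays the role of the approximation p(x)

kl_y_x <- kl_directed(py, px)  # KLD[p(y) || p(x)], the sum in the formula above
kl_x_y <- kl_directed(px, py)  # KLD[p(x) || p(y)], the reverse direction

# The two directions generally differ, which is why KLD is not a distance
c(kl_y_x = kl_y_x, kl_x_y = kl_x_y)
```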

In Bayesian inference, KLD can be used as a measure of the information gain in moving from a prior distribution, p(theta), to a posterior distribution, p(theta | y). As such, KLD is the basis of reference priors and lowest posterior loss intervals (`LPL.interval`), such as in Berger, Bernardo, and Sun (2009) and Bernardo (2005). The intrinsic discrepancy was introduced by Bernardo and Rueda (2002). For more information on the intrinsic discrepancy, see `LPL.interval`.
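As a rough illustration of information gain, the sketch below approximates the divergence of a posterior from its prior on a grid. The Beta(1, 1) prior, the data (7 successes in 10 trials), and the grid spacing are all assumptions chosen for this example, and the computation is done by hand in base R rather than with `KLD`.

```r
# Grid over theta in (0, 1); endpoints are excluded to keep densities positive
theta <- seq(0.005, 0.995, by = 0.01)

# Assumed setup: Beta(1, 1) prior and 7 successes in 10 Bernoulli trials,
# which gives a Beta(8, 4) posterior by conjugacy
prior     <- dbeta(theta, 1, 1)
posterior <- dbeta(theta, 1 + 7, 1 + 3)

# Normalize the grid densities so each sums to 1 (a discrete approximation)
prior     <- prior / sum(prior)
posterior <- posterior / sum(posterior)

# Information gain: divergence of the posterior from the prior, in nats
info_gain <- sum(posterior * log(posterior / prior))
info_gain
```

A posterior that is more concentrated relative to the prior yields a larger value, which matches the reading of KLD as information gained from the data.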

## Value

`KLD` returns a list with the following components:

`KLD.px.py` This is KLD[i](p(x[i]) || p(y[i])).

`KLD.py.px` This is KLD[i](p(y[i]) || p(x[i])).

`mean.KLD` This is the mean of the two components above. This is the expected posterior loss in `LPL.interval`.

`sum.KLD.px.py` This is KLD(p(x) || p(y)). This is a directed divergence.

`sum.KLD.py.px` This is KLD(p(y) || p(x)). This is a directed divergence.

`mean.sum.KLD` This is the mean of the two components above.

`intrinsic.discrepancy` This is the minimum of the two directed divergences.
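The relationships among these components can be sketched by hand in base R. This is only an illustration under stated assumptions: the inputs are normalized to sum to one on a shared grid, the local variable names merely echo the component names, and nothing here is claimed to match the internals of `KLD`.

```r
# Illustrative densities on a shared grid, normalized to sum to 1
# (the normalization is an assumption of this sketch)
x  <- seq(-3, 3, by = 0.1)
px <- dnorm(x, 0, 1);     px <- px / sum(px)
py <- dnorm(x, 0.1, 0.9); py <- py / sum(py)

# Element-wise contributions in each direction
kld_px_py <- px * log(px / py)
kld_py_px <- py * log(py / px)

# Directed divergences, their mean, and the intrinsic discrepancy (the minimum)
sum_px_py <- sum(kld_px_py)
sum_py_px <- sum(kld_py_px)
mean_sum  <- (sum_px_py + sum_py_px) / 2
intrinsic <- min(sum_px_py, sum_py_px)
```

Because the intrinsic discrepancy is the smaller of the two directed divergences, it can never exceed their mean.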

## Author(s)

Statisticat, LLC.

## References

Berger, J.O., Bernardo, J.M., and Sun, D. (2009). "The Formal Definition of Reference Priors". The Annals of Statistics, 37(2), p. 905–938.

Bernardo, J.M. and Rueda, R. (2002). "Bayesian Hypothesis Testing: A Reference Approach". International Statistical Review, 70, p. 351–372.

Bernardo, J.M. (2005). "Intrinsic Credible Regions: An Objective Bayesian Approach to Interval Estimation". Sociedad de Estadistica e Investigacion Operativa, 14(2), p. 317–384.

Kullback, S. and Leibler, R.A. (1951). "On Information and Sufficiency". The Annals of Mathematical Statistics, 22(1), p. 79–86.

## See Also

`LPL.interval` and `VariationalBayes`.
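
## Examples

```r
library(LaplacesDemon)
# Normal densities evaluated at 100 uniform random draws on (0, 1)
px <- dnorm(runif(100), 0, 1)
py <- dnorm(runif(100), 0.1, 0.9)
# Returns a list of divergences between px and py
KLD(px, py)
```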