Chemistry Reference and  Research
           
 
Periodic Table
- standard table
- large table
 
Chemical Elements
- by name
- by symbol
- by atomic number
 
Chemical Properties
 
Chemical Reactions
 
Organic Chemistry
 
Branches of Chemistry
Analytical chemistry
Biochemistry
Computational Chemistry
Electrochemistry
Environmental chemistry
Geochemistry
Inorganic chemistry
Materials science
Medicinal chemistry
Nuclear chemistry
Organic chemistry
Pharmacology
Physical chemistry
Polymer chemistry
Supramolecular Chemistry
Thermochemistry

Kullback-Leibler divergence

In probability theory and information theory, the Kullback-Leibler divergence, or relative entropy, is a quantity which measures the difference between two probability distributions. It is named after Solomon Kullback and Richard Leibler , two NSA mathematicians. The term "divergence" is a misnomer; it is not the same as divergence in calculus. One might be tempted to call it a "distance metric", but this would also be a misnomer as the Kullback-Leibler divergence is not symmetric and does not satisfy the triangle inequality.

The Kullback-Leibler divergence between two probability distributions p and q is defined as

\mathit{KL}(p,q) = \sum_x p(x) \log \frac{p(x)}{q(x)} \!

for distributions of a discrete variable, and as

\mathit{KL}(p,q) = \int_{-\infty}^{\infty} p(x) \log \frac{p(x)}{q(x)} \; dx \!

for distributions of a continuous variable.

It can be seen from the definition that

\mathit{KL}(p,q) = -\sum_x p(x) \log q(x) + \sum_x p(x) \log p(x) = H(p,q) - H(p)\, \!

denoting by H(p,q) the cross entropy of p and q, and by H(p) the entropy of p. As the cross-entropy is always greater than or equal to the entropy, this shows that the Kullback-Leibler divergence is nonnegative, and furthermore KL(p,q) is zero iff p = q.

In coding theory, the KL divergence can be interpreted as the needed extra message-length per datum for sending messages distributed as q, if the messages are encoded using a code that is optimal for distribution p.

In Bayesian statistics the KL divergence can be used as a measure of the "distance" between the prior distribution and the posterior distribution. If the logarithms are taken to the base 2 the KL divergence is also the gain in Shannon information involved in going from the prior to the posterior. In Bayesian experimental design a design which is optimised to maximise the KL divergence between the prior and the posterior is said to be Bayes d-optimal .

References

  • S. Kullback and R. A. Leibler. On information and sufficiency. Annals of Mathematical Statistics 22(1):79–86, March 1951.
01-04-2007 01:16:19
The contents of this article are licensed from Wikipedia.org under the GNU Free Documentation License. How to see transparent copy