Kullback-Leibler divergence

Kullback-Leibler divergence (KL divergence) [1-2], also known as relative entropy, is a measure of the discrepancy between two probability distributions $P$ and $Q$. For two distributions $P$ and $Q$ on a set $A$, it is defined as follows:

$$ D(P \| Q)=\sum_{a\in {A}} P(a) \log \frac{ P(a)}{Q(a)}. $$

If $P$ and $Q$ are not discrete, the above sum is replaced by a Lebesgue integral over the densities of $P$ and $Q$ with respect to a common dominating measure.
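For concreteness, the following is a minimal Python sketch of the discrete definition above (the three-letter example distributions are arbitrary choices for illustration). It uses the standard conventions $0 \log (0/q) = 0$ and $p \log (p/0) = \infty$ for $p > 0$.

```python
import numpy as np

def kl_divergence(p, q):
    """Discrete Kullback-Leibler divergence D(P || Q) in nats.

    p, q: arrays of probabilities over the same finite alphabet A.
    Conventions: 0*log(0/q) = 0, and p*log(p/0) = +inf for p > 0.
    """
    total = 0.0
    for pa, qa in zip(np.asarray(p, float), np.asarray(q, float)):
        if pa == 0.0:
            continue          # 0*log(0/q) = 0 by convention
        if qa == 0.0:
            return np.inf     # P is not absolutely continuous w.r.t. Q
        total += pa * np.log(pa / qa)
    return total

# Example: two distributions on a three-letter alphabet.
P = [0.5, 0.25, 0.25]
Q = [1/3, 1/3, 1/3]
print(kl_divergence(P, Q))   # ~0.0589 nats
```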

Properties of the Kullback-Leibler divergence
 * $D(P \| Q) \geq 0$, with equality if and only if $P=Q$.
 * It is not a true metric because it is not symmetric, i.e. $D(P \| Q) \neq D(Q \| P)$ in general (see the numerical check after this list), and it does not satisfy the triangle inequality.
 * It is convex in the pair $(P, Q)$, i.e. if $(P_1, Q_1)$ and $(P_2, Q_2)$ are two pairs of distributions, and $0 \leq \lambda \leq 1$, then
$$ D(\lambda P_1+ (1-\lambda) P_2 \| \lambda Q_1+ (1-\lambda) Q_2) \leq \lambda D(P_1 \| Q_1)+ (1-\lambda) D(P_2 \| Q_2). $$
 * $D(P \| Q)$ is a lower semi-continuous function of $(P,Q)$, and continuous at each $(P,Q)$ with strictly positive $Q$ (when the topology of pointwise convergence is considered).
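As a quick illustration, the following reuses the kl_divergence sketch above to check the nonsymmetry and convexity properties numerically (the distributions and the mixing weight $\lambda = 0.3$ are arbitrary choices):

```python
import numpy as np

# Assumes kl_divergence from the sketch above is in scope.
P1, Q1 = np.array([0.5, 0.25, 0.25]), np.array([1/3, 1/3, 1/3])
P2, Q2 = np.array([0.1, 0.2, 0.7]), np.array([0.25, 0.25, 0.5])
lam = 0.3

# Nonsymmetry: D(P || Q) != D(Q || P) in general.
print(kl_divergence(P1, Q1), kl_divergence(Q1, P1))

# Convexity in the pair (P, Q): the divergence of the mixtures
# never exceeds the mixture of the divergences.
lhs = kl_divergence(lam * P1 + (1 - lam) * P2, lam * Q1 + (1 - lam) * Q2)
rhs = lam * kl_divergence(P1, Q1) + (1 - lam) * kl_divergence(P2, Q2)
print(lhs <= rhs)   # True
```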

Generalizations of the Kullback-Leibler divergence
KL divergence belongs to a broader class of distance measures called f-divergences [3]. Another important measure in this class is Csiszar's distance, also known as information divergence (I-divergence). Csiszar's distance generalizes the KL divergence from probability distributions to nonnegative functions, which do not necessarily sum to the same constant. For two nonnegative functions $P$ and $Q$, Csiszar's distance is defined as follows:

$$ I(P \| Q)=\sum_{a\in {A}} \left[ P(a) \log \frac{ P(a)}{Q(a)}-P(a)+Q(a) \right]. $$

When $P$ and $Q$ are probability distributions, the terms $-P(a)+Q(a)$ sum to zero, so $I(P \| Q)$ reduces to $D(P \| Q)$. Csiszar's distance shares the key properties of the KL divergence: it is always nonnegative, nonsymmetric, and convex in the pair $(P,Q)$.
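The definition translates directly into a sketch in the same style as above (the nonnegative example functions below are arbitrary and need not sum to one):

```python
import numpy as np

def i_divergence(p, q):
    """Csiszar's I-divergence between nonnegative functions p and q.

    Uses the same conventions as kl_divergence above; when both
    arguments sum to one, the -p + q terms cancel and the result
    equals D(P || Q).
    """
    total = 0.0
    for pa, qa in zip(np.asarray(p, float), np.asarray(q, float)):
        if pa == 0.0:
            total += qa       # 0*log(0/q) - 0 + q = q
        elif qa == 0.0:
            return np.inf
        else:
            total += pa * np.log(pa / qa) - pa + qa
    return total

# Nonnegative functions that do not sum to the same constant.
f = [2.0, 0.5, 1.5]
g = [1.0, 1.0, 1.0]
print(i_divergence(f, g))    # nonnegative, ~0.648
print(i_divergence(f, f))    # 0.0: vanishes when the functions coincide
```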

Relations to inverse problems
Selecting "good" discrepancy measures for linear inverse problems is of importance in many areas. Several authors have examined a variety of distance measures which are consistent with some natural axioms including uniqueness and convexity [4-7]. They have showed that for linear inverse problems with nonnegativity constraints the Csiszar's distance is the only measure that satisfies the stated axioms.