# Entropy

The (Shannon) entropy of a discrete random variable $X$ with alphabet $A$ measures its information content. It is given by the following formula: $$H(X) = \sum_{a \in A} P_X(a) \log \frac 1 {P_X(a)}$$ with the convention that $0 \log \frac 1 0 = 0$, so symbols of zero probability contribute nothing.

The base of the logarithm determines the information units. Bases 2 and $e$ correspond to bits and nats, respectively.
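As a quick numeric check of the definition, here is a minimal sketch of the entropy formula in Python (the function name `entropy` and its argument convention are ours, not from the text):

```python
import math

def entropy(probs, base=2):
    """Shannon entropy of a distribution given as a list of probabilities.

    Zero-probability terms are skipped, following the convention
    0 * log(1/0) = 0.
    """
    return sum(p * math.log(1 / p, base) for p in probs if p > 0)

# A fair coin carries exactly 1 bit of information:
print(entropy([0.5, 0.5]))                 # 1.0 (bits)
# The same distribution measured in nats (base e):
print(entropy([0.5, 0.5], base=math.e))    # log 2 ≈ 0.693 nats
```

Changing the base only rescales the result: $H$ in nats equals $H$ in bits times $\ln 2$.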

## Properties of the entropy

• Entropy is nonnegative and is upper bounded by the logarithm of the cardinality of the alphabet:

$$0 \leq H(X) \leq \log |A|$$ The lower bound is achieved by a deterministic random variable, while the upper bound is achieved by the random variable distributed uniformly on $A$, provided that the alphabet $A$ is finite.

• Entropy is a concave function of $P_X$.
• For any deterministic function $f: A \to B$:

$$H(f(X)) \leq H(X)$$ with equality if and only if $f$ is one-to-one almost surely, i.e., injective on the support of $X$.
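The first and third properties can be verified numerically. The sketch below (our own helper `entropy`, not from the text) checks the bounds on a 4-letter alphabet and then applies a non-injective $f$ that merges symbols, which can only lose information:

```python
import math

def entropy(probs, base=2):
    # Shannon entropy; zero-probability terms contribute nothing.
    return sum(p * math.log(1 / p, base) for p in probs if p > 0)

# Bounds 0 <= H(X) <= log|A| on a 4-letter alphabet:
deterministic = [1.0, 0.0, 0.0, 0.0]   # lower bound: H = 0
uniform = [0.25] * 4                   # upper bound: H = log2(4) = 2
print(entropy(deterministic), entropy(uniform))   # 0.0 2.0

# A non-injective f mapping {a, b, c, d} -> {0, 1} with f(a)=f(b)=0,
# f(c)=f(d)=1 induces the distribution P_{f(X)} by summing merged masses:
px = [0.1, 0.2, 0.3, 0.4]
pf = [px[0] + px[1], px[2] + px[3]]
print(entropy(pf) <= entropy(px))      # True: H(f(X)) <= H(X)
```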

## The binary entropy function

The entropy of a binary random variable with bias $p$, viewed as a function of $p$, is called the binary entropy function: $$h_b(p) = p \log \frac 1 p + (1 - p) \log \frac 1 {1 - p}$$

## Rényi entropy

Rényi entropy, named after Alfréd Rényi, is a generalization of the Shannon entropy. The Rényi entropy of order $\alpha$, for $\alpha \geq 0$ and $\alpha \neq 1$, is defined as follows:

$$H_\alpha(X) =\frac{1}{\alpha-1} \log \left(\sum_{a \in A} P_X(a)^\alpha \right)$$ It can be shown that

$$\lim_{\alpha \rightarrow 1}H_\alpha(X) =H(X),$$ that is, the Shannon entropy is recovered as the limiting case of the Rényi entropy when $\alpha \rightarrow 1$.