Layman's introduction to compressive sensing

Compressive sensing is an interdisciplinary research area that became very popular in the mid-2000s. The main idea is that classical data acquisition systems first sample a signal (e.g., at the Nyquist rate) and then compress the sampled representation. If the compression ratio is large, then the sample rate is much larger than the data rate, and much of the sampled information is discarded during compression. By combining sampling and compression into a single operation, compressive sensing may allow the sample rate to be significantly reduced. These operations are combined by allowing the system to sample arbitrary linear combinations of the high-rate samples.

Compressive sensing is typically applied to signals that are approximately sparse (i.e., very few sampled values have significant magnitude) in some transformed basis. Many real-world signals have this property. For example, most natural images are approximately sparse in the wavelet basis.

Noiseless Observations
Let $\mathbf{x} \in \mathbb{R}^n$ be a real $n$-dimensional vector that represents the signal sampled at a high rate. The measurement matrix $\Phi \in \mathbb{R}^{m \times n}$ defines the linear combinations used for compressive sensing; each row of $\Phi$ defines a sample in the new system. Therefore, the measurement vector $\mathbf{y} \in \mathbb{R}^m$ for the compressive sensing system is defined by $$ \mathbf{y} = \Phi \mathbf{x} = (\Phi \Psi^{-1}) (\Psi \mathbf{x}) = \tilde{\Phi} \tilde{\mathbf{x}}, $$ where the matrix $\Psi$ defines a linear transform such that $\tilde{\mathbf{x}} = \Psi \mathbf{x}$ is approximately sparse.
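As a concrete illustration of the measurement model above, the following sketch generates a sparse signal and computes its compressive measurements. The random Gaussian choice for $\Phi$ and the problem sizes are illustrative assumptions, not prescribed by the model:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, K = 200, 40, 5  # signal length, number of measurements, sparsity

# A sparse signal: K non-zero entries at random positions
# (here x is sparse directly, i.e., Psi is the identity).
x = np.zeros(n)
support = rng.choice(n, size=K, replace=False)
x[support] = rng.standard_normal(K)

# A random Gaussian measurement matrix, one common choice for Phi.
Phi = rng.standard_normal((m, n)) / np.sqrt(m)

# Each of the m measurements is a linear combination of all n samples.
y = Phi @ x
print(y.shape)  # (40,)
```

Note that $m \ll n$: the system acquires 40 numbers instead of 200, which is where the reduction in sample rate comes from.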

Given the observation $\mathbf{y}$, the valid set of signal vectors is $$ V(\mathbf{y}) = \{ \mathbf{x} \in \mathbb{R}^n \mid \Phi \mathbf{x} = \mathbf{y} \}.$$ Since this set has dimension at least $n-m$, there is clearly no unique way to solve for $\mathbf{x}$. One can still choose the "best" reconstruction based on the prior knowledge that $\mathbf{x}$ is approximately sparse. In practice, one can do this in a computationally tractable way by using a method, known as basis pursuit, that solves the linear program implied by $$ \hat{\mathbf{x}} = \arg \min_{\mathbf{x} \in V(\mathbf{y})} \| \mathbf{x} \|_1. $$ There are now many computationally efficient ways to solve this problem.
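To make the basis pursuit step concrete, here is a small sketch that recasts the $\ell_1$ minimization as a linear program by splitting $\mathbf{x} = \mathbf{u} - \mathbf{v}$ with $\mathbf{u}, \mathbf{v} \geq 0$, so that $\|\mathbf{x}\|_1 = \sum_i u_i + v_i$, and solves it with SciPy's `linprog`. The problem sizes and the random Gaussian matrix are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
n, m, K = 40, 20, 3

# Sparse ground truth and random Gaussian measurements.
x = np.zeros(n)
x[rng.choice(n, size=K, replace=False)] = rng.standard_normal(K)
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
y = Phi @ x

# Basis pursuit: min ||x||_1 subject to Phi x = y.
# Split x = u - v with u, v >= 0, so ||x||_1 = sum(u) + sum(v).
c = np.ones(2 * n)
A_eq = np.hstack([Phi, -Phi])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
x_hat = res.x[:n] - res.x[n:]

# With enough measurements, recovery is exact (with high probability).
print(np.allclose(x_hat, x, atol=1e-6))
```

The split into non-negative parts is the standard trick for expressing an $\ell_1$ objective in linear-program form; the equality constraint restricts the search to $V(\mathbf{y})$.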

If the number of non-zero entries in $\mathbf{x}$ is at most $K$, then it has been shown that, for $m = O ( K \log n ) $, measurement matrices exist for which reconstruction by basis pursuit is guaranteed to succeed.

Noisy Observations
Consider the case where the observation vector is corrupted by additive noise $$ \mathbf{y} = \Phi \mathbf{x} + \mathbf{z},$$ where $\mathbf{z}$ is zero-mean Gaussian white noise. In this case, reconstruction can be done in a computationally feasible manner by using a method, known as LASSO, that solves the quadratic program implied by $$ \hat{\mathbf{x}} = \arg \min_{\mathbf{x} \in \mathbb{R}^n} \| \mathbf{y} - \Phi \mathbf{x} \|_2^2 + \lambda \| \mathbf{x} \|_1. $$ Note that the minimization is now unconstrained: because of the noise, one no longer requires $\Phi \hat{\mathbf{x}} = \mathbf{y}$ exactly, and the parameter $\lambda$ trades off data fidelity against sparsity. Once again, there are now many computationally efficient ways to solve this problem.
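One simple solver for this problem is the iterative shrinkage-thresholding algorithm (ISTA), which alternates a gradient step on the quadratic term with a soft-thresholding step for the $\ell_1$ term. The sketch below (plain NumPy) uses the common convention with a $\tfrac{1}{2}$ factor on the quadratic term, $\tfrac{1}{2}\|\mathbf{y} - \Phi \mathbf{x}\|_2^2 + \lambda \|\mathbf{x}\|_1$, which is equivalent up to a rescaling of $\lambda$; the step size, $\lambda$, noise level, and problem sizes are illustrative assumptions:

```python
import numpy as np

def ista(Phi, y, lam, n_iter=1000):
    """Solve min_x 0.5 * ||y - Phi x||_2^2 + lam * ||x||_1 by ISTA."""
    # Step size 1/L, where L is the Lipschitz constant of the gradient.
    L = np.linalg.norm(Phi, 2) ** 2
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        z = x - (Phi.T @ (Phi @ x - y)) / L              # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x

rng = np.random.default_rng(2)
n, m, K = 100, 40, 4

# Sparse ground truth with non-zero entries of magnitude at least 1.
x_true = np.zeros(n)
x_true[rng.choice(n, size=K, replace=False)] = (
    rng.choice([-1.0, 1.0], size=K) * (1.0 + rng.random(K))
)
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
y = Phi @ x_true + 0.01 * rng.standard_normal(m)  # noisy measurements

x_hat = ista(Phi, y, lam=0.05)
```

The soft-thresholding step is what drives small coefficients exactly to zero, so the LASSO estimate is itself sparse; note that it also slightly shrinks the surviving coefficients, a bias that grows with $\lambda$.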