This talk addresses two related topics, both built on linear state space models and sparse multi-channel representations of "information" about signals. We first consider unsupervised learning of such representations using normal priors with unknown variance (NUV) and expectation maximization, combining sparse estimation, dictionary learning, signal labeling, and blind signal separation into a single algorithm. The actual computations boil down to iterative multivariate-Gaussian message passing (i.e., recursions as in Kalman smoothing). We then proceed to multilayer networks (features of features) using such representations. In this setting, we also address supervised learning, where the backpropagation of gradients benefits from sparsity. Finally, we point out the suitability of such sparse multi-channel representations for self-timed continuous-time "neuromorphic" computation.
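To illustrate the sparse-estimation core mentioned above, the following is a minimal sketch (not the talk's actual algorithm, which also covers dictionary learning, labeling, and separation) of NUV-plus-EM sparse estimation under an assumed linear Gaussian model y = A x + noise with per-component prior variances s_i. The E-step here uses a direct matrix solve in place of the Gaussian message passing / Kalman-smoothing recursions mentioned in the abstract; all function and variable names are illustrative.

```python
import numpy as np

def nuv_em_sparse_estimate(A, y, noise_var=1e-2, n_iters=100, tol=1e-10):
    """Sketch: sparse estimation with NUV priors and EM.

    Assumed model: y = A @ x + noise, x_i ~ N(0, s_i) with s_i unknown.
    E-step: Gaussian posterior of x for the current s (a dense solve here;
            in a state-space model this would be Kalman smoothing).
    M-step: s_i <- E[x_i^2] = m_i^2 + V_ii, which drives many s_i to zero
            and thereby produces a sparse estimate.
    """
    m_dim, n_dim = A.shape
    s = np.ones(n_dim)  # unknown prior variances, initialized broadly
    m = np.zeros(n_dim)
    for _ in range(n_iters):
        # E-step: posterior precision, covariance V, and mean m of x
        prec = A.T @ A / noise_var + np.diag(1.0 / np.maximum(s, tol))
        V = np.linalg.inv(prec)
        m = V @ A.T @ y / noise_var
        # M-step: EM update of the NUV variances
        s_new = m**2 + np.diag(V)
        if np.max(np.abs(s_new - s)) < tol:
            s = s_new
            break
        s = s_new
    return m, s

if __name__ == "__main__":
    # Toy usage: recover a 3-sparse vector from 50 noisy linear measurements.
    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 100))
    x_true = np.zeros(100)
    x_true[[5, 37, 80]] = [1.5, -2.0, 0.8]
    y = A @ x_true + 0.05 * rng.standard_normal(50)
    x_hat, s_hat = nuv_em_sparse_estimate(A, y, noise_var=0.05**2)
    print("estimated support:", np.where(s_hat > 1e-4)[0])
```

In a multi-channel state-space setting, the dense solve in the E-step is replaced by forward-backward Gaussian message passing, which yields the required posterior means and variances with linear complexity in the signal length.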