We find the capacity of discrete-time channels subject to both frequency-selective and time-selective fading, where the channel output is observed in additive Gaussian noise. We assume a coherent model in which the fading coefficients are known at the receiver. The fading processes in frequency and in time are assumed to be independent of each other, and capacity depends on their first-order distributions. A simple formula is given when one of the processes is i.i.d. and the other is sufficiently mixing. When the frequency-selective fading coefficients are also known to the transmitter, we show that the optimum normalized power spectral density is the waterfilling power allocation for a reduced signal-to-noise ratio, where the gap to the actual signal-to-noise ratio depends on the fading distributions.
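
For readers unfamiliar with waterfilling, the following is a minimal sketch of the classical allocation over a finite set of frequency bins. It is not the formula derived in this work: the function name `waterfill`, the discretization into bins, and the unit noise variance are assumptions made for illustration, and the reduction of the signal-to-noise ratio described above is not reproduced. Under these assumptions, the result would correspond to passing the reduced signal-to-noise ratio as `total_power`.

```python
import numpy as np

def waterfill(gains, total_power):
    """Classical waterfilling over parallel Gaussian channels.

    Maximizes sum_i log(1 + gains[i] * p[i]) subject to
    sum_i p[i] = total_power and p[i] >= 0, assuming unit noise
    variance and strictly positive power gains (illustrative setup).
    Returns the power allocation and the water level.
    """
    gains = np.asarray(gains, dtype=float)
    inv = np.sort(1.0 / gains)  # inverse gains, best channel first
    # Try using the k best channels, from all of them down to one,
    # until the implied water level lies above the k-th inverse gain.
    for k in range(len(inv), 0, -1):
        level = (total_power + inv[:k].sum()) / k
        if level > inv[k - 1]:
            break
    # Channels whose inverse gain exceeds the water level get no power.
    power = np.maximum(level - 1.0 / gains, 0.0)
    return power, level

# Example: three bins with power gains 1.0, 0.5, 0.1 and unit total power.
power, level = waterfill([1.0, 0.5, 0.1], 1.0)
```

In this example only the strongest bin receives power, because the budget is too small for the water level to rise above the inverse gains of the weaker bins.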