We find the capacity of discrete-time channels subject to both
frequency-selective and 
time-selective fading, where the channel output is observed
in additive Gaussian noise. 
A coherent model is assumed where the fading coefficients are
known at the receiver. 
The capacity depends on the first-order distributions of the fading
processes in frequency and in time, which are assumed to be independent
of each other, and a simple formula is given when one of the processes
is i.i.d. and the other is sufficiently mixing.
When the frequency-selective fading coefficients are also known to the
transmitter, we show that the optimum normalized power spectral density
is the waterfilling power allocation for a reduced signal-to-noise ratio,
where the gap to the actual signal-to-noise ratio depends on the fading
distributions.
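
For readers unfamiliar with waterfilling, the following is a minimal sketch of the classical waterfilling allocation over a finite set of frequency bins, normalized to unit average power. It is an illustration only, not the paper's derivation. The function name `waterfill`, the sampled `gains`, and the `gap` factor used to form the reduced signal-to-noise ratio are all hypothetical; the actual gap depends on the fading distributions, and no formula for it is given in the text above.

```python
def waterfill(gains, snr, iters=200):
    """Classical waterfilling over frequency bins (illustrative sketch).

    gains: nonnegative channel power gains, one per frequency bin (hypothetical samples)
    snr:   signal-to-noise ratio at which the water is poured
    Returns powers p_i = max(0, mu - 1/(snr * g_i)) whose average is 1.
    """
    if not any(g > 0 for g in gains):
        raise ValueError("at least one gain must be positive")
    n = len(gains)
    # "Floor" of each bin: inverse of its effective SNR (infinite for a dead bin).
    floors = [1.0 / (snr * g) if g > 0 else float("inf") for g in gains]
    # Bracket the water level mu: at mu = min(floors) + n the best bin alone
    # already receives power n, so the average power is at least 1.
    lo, hi = 0.0, min(floors) + n
    for _ in range(iters):
        mu = 0.5 * (lo + hi)
        mean_power = sum(max(0.0, mu - f) for f in floors) / n
        if mean_power > 1.0:
            hi = mu
        else:
            lo = mu
    mu = 0.5 * (lo + hi)
    return [max(0.0, mu - f) for f in floors]


# Assumption: the reduced SNR is modeled here as snr / gap with a placeholder
# gap value; the true gap is determined by the fading distributions.
snr_actual = 1.0
gap = 2.0  # hypothetical value, for illustration only
powers = waterfill([2.0, 0.5], snr_actual / gap)
```

Bisection on the water level is used because the average allocated power is monotone in that level, so no sorting of the bins is needed. Bins whose floor lies above the water level receive zero power.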