We consider data transmission through a time-selective (correlated) flat Rayleigh fading channel under an average power constraint. The channel is estimated at the receiver with a pilot signal, and the estimate is fed back to the transmitter. The estimate is used both for coherent demodulation and to adapt the data and pilot powers. In particular, the transmitter can conserve power by decreasing the training power when the channel is faded. However, the channel estimate must always be sufficiently accurate to guide this adaptation. By taking a continuous-time limit in which the channel becomes a diffusion (Ornstein-Uhlenbeck) process, estimated with a Kalman filter, we are able to solve for the optimal training policy. Specifically, the optimal pilot power control is "bang-bang": depending on the current system state (the channel estimate and its associated error variance), the pilot power is either the maximum allowable or zero. The associated regions of the state space are explicitly determined as the solution to a free-boundary partial differential equation. Numerical results show a significant increase in achievable rate due to the adaptive training scheme with feedback, relative to constant training, which does not require feedback.
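The setup described above can be illustrated with a minimal discrete-time sketch. It assumes a real-valued first-order Gauss-Markov channel (a sampled Ornstein-Uhlenbeck process), a scalar Kalman filter, and an on/off pilot rule. The function name, all parameter values, and in particular the thresholds `var_on`, `var_cap`, and `fade_level` are hypothetical placeholders chosen for illustration. They are not the optimal switching regions, which come from the free-boundary problem and are not reproduced here.

```python
import numpy as np

def simulate_bang_bang_training(n_steps=2000, dt=0.01, rho=1.0, sigma_h2=1.0,
                                noise_var=0.1, p_max=5.0, var_on=0.2,
                                var_cap=0.6, fade_level=0.3, seed=0):
    """Track a Gauss-Markov channel with an on/off (bang-bang) pilot.

    All defaults are illustrative assumptions, not values from the paper.
    Returns per-step pilot powers, error variances, and channel estimates.
    """
    rng = np.random.default_rng(seed)
    a = np.exp(-rho * dt)              # one-step channel correlation
    q = (1.0 - a**2) * sigma_h2        # process noise keeping Var[h] = sigma_h2

    h = rng.normal(scale=np.sqrt(sigma_h2))  # true channel (real-valued here)
    h_hat, var = 0.0, sigma_h2               # prior mean and error variance

    powers = np.empty(n_steps)
    variances = np.empty(n_steps)
    estimates = np.empty(n_steps)

    for k in range(n_steps):
        # Channel evolution and Kalman prediction step.
        h = a * h + rng.normal(scale=np.sqrt(q))
        h_hat = a * h_hat
        var = a**2 * var + q

        # Hypothetical switching rule (placeholder for the optimal regions):
        # skip training while the estimate looks faded, unless the error
        # variance has grown past a cap and the estimate must be refreshed.
        faded = h_hat**2 < fade_level
        train = (var > var_cap) or (var > var_on and not faded)
        p = p_max if train else 0.0

        if train:
            # Pilot observation y = sqrt(p) * h + noise, then Kalman update.
            y = np.sqrt(p) * h + rng.normal(scale=np.sqrt(noise_var))
            gain = var * np.sqrt(p) / (p * var + noise_var)
            h_hat = h_hat + gain * (y - np.sqrt(p) * h_hat)
            var = var * noise_var / (p * var + noise_var)

        powers[k], variances[k], estimates[k] = p, var, h_hat

    return powers, variances, estimates

if __name__ == "__main__":
    powers, variances, _ = simulate_bang_bang_training()
    print("fraction of time pilot is on:", np.mean(powers > 0))
    print("average pilot power:", powers.mean())
    print("mean error variance:", variances.mean())
```

With the pilot off, the error variance relaxes toward the channel variance `sigma_h2`; with the pilot on, it drops sharply. Comparing the average pilot power against an always-on pilot at a matched power budget gives a rough feel for the trade-off, though any rate comparison would need a data-power and rate model that this sketch omits.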