Predictive state representations (PSRs) are a class of models for dynamical systems that have recently grown in popularity. We interpret PSRs in two ways: first, as recurrent neural networks, and second, as latent-state models. We take advantage of these two interpretations to develop two complementary training algorithms: the RNN view leads to a backpropagation (gradient descent) method, and the latent-state view leads to a spectral system-identification method. These two algorithms have complementary strengths: the spectral method finds a globally good basin of attraction, while the backprop method converges to locally optimal parameters. By combining the two, we achieve state-of-the-art modeling performance.
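To make the two-stage recipe concrete, here is a minimal sketch (not the authors' code) of the idea on a toy problem: a PSR-style observable-operator model is first initialized with an HKZ-style spectral method (SVD of empirical bigram/trigram statistics), then the same parameters are refined by backprop through the model's linear-RNN state update. The toy HMM, the data sizes, and all variable names are illustrative assumptions; the spectral estimator shown is one standard choice, not necessarily the paper's exact one.

```python
# Two-stage PSR training sketch: spectral init + backprop refinement.
# Illustrative only; assumes an HKZ-style spectral estimator.
import jax
import jax.numpy as jnp
import numpy as np

rng = np.random.default_rng(0)
T_hmm = np.array([[0.9, 0.1], [0.2, 0.8]])  # toy HMM transition matrix
O_hmm = np.array([[0.8, 0.2], [0.1, 0.9]])  # emission: P(obs | state)
n_obs, k = 2, 2                             # alphabet size, model rank

def sample_seq(length):
    s, seq = 0, []
    for _ in range(length):
        seq.append(rng.choice(n_obs, p=O_hmm[s]))
        s = rng.choice(2, p=T_hmm[s])
    return seq

seqs = [sample_seq(50) for _ in range(500)]

# --- Stage 1: spectral initialization (latent-state view) ---
P1 = np.zeros(n_obs)                    # unigram probabilities
P21 = np.zeros((n_obs, n_obs))          # P21[j, i] ~ P(o_2 = j, o_1 = i)
P3x1 = np.zeros((n_obs, n_obs, n_obs))  # P3x1[x][j, i] ~ P(o_3 = j, o_2 = x, o_1 = i)
for seq in seqs:
    for t in range(len(seq) - 2):
        P1[seq[t]] += 1
        P21[seq[t + 1], seq[t]] += 1
        P3x1[seq[t + 1]][seq[t + 2], seq[t]] += 1
P1 /= P1.sum(); P21 /= P21.sum(); P3x1 /= P3x1.sum()

U, _, _ = np.linalg.svd(P21)            # subspace from top singular vectors
U = U[:, :k]
b1 = U.T @ P1                           # initial state
binf = np.linalg.pinv(P21.T @ U) @ P1   # normalization vector
Bx = np.stack([U.T @ P3x1[x] @ np.linalg.pinv(U.T @ P21)
               for x in range(n_obs)])  # one operator per observation

# --- Stage 2: backprop refinement (RNN view of the same model) ---
def neg_log_lik(params, seq):
    b1, binf, Bx = params
    b, ll = b1, 0.0
    for o in seq:                       # linear-RNN state update b <- B_o b
        b = Bx[o] @ b
        z = binf @ b                    # per-step probability increment
        ll += jnp.log(jnp.abs(z) + 1e-12)  # abs() guards drift off the simplex
        b = b / (z + 1e-12)             # renormalize for numerical stability
    return -ll

params = (jnp.array(b1), jnp.array(binf), jnp.array(Bx))
loss = lambda p: sum(neg_log_lik(p, s) for s in seqs[:20]) / 20  # small batch
grad = jax.grad(loss)
for step in range(20):
    g = grad(params)
    params = tuple(p - 0.05 * gp for p, gp in zip(params, g))
# Spectral init lands in a good basin; gradient steps refine it locally.
```

In this sketch the spectral stage supplies the "globally good" starting point, and the gradient stage plays the role of backprop through the RNN interpretation; on real data one would batch and `jit` the loss rather than unroll it in Python as done here.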