A central problem in artificial intelligence is that of planning to maximize future reward under uncertainty in a partially observable environment. We discuss algorithms for learning a model of such an environment directly from sequences of action-observation pairs, and for closing the loop by planning in the learned model. Specifically, we present a spectral (or subspace identification) algorithm for learning the parameters of a Predictive State Representation, and we describe two methods of planning in the learned model.

http://arxiv.org/abs/0912.2385
http://arxiv.org/abs/1011.0041
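The core of the spectral approach can be illustrated with a small sketch. The abstract does not give the algorithm's details, so the following is a hedged illustration in the standard spectral/subspace-identification style for uncontrolled systems: an SVD of a Hankel-style matrix of test-history co-occurrence probabilities yields a low-dimensional subspace `U`, from which transformed state-update operators are computed. The toy HMM, the use of exact (rather than empirical) moments, and the omission of actions are all simplifying assumptions for illustration, not the paper's setup.

```python
import numpy as np

# Toy 2-state, 2-observation HMM standing in for the environment
# (assumed for illustration; actions are omitted).
T = np.array([[0.7, 0.3],
              [0.3, 0.7]])           # T[s2, s1] = P(s2 | s1)
O = np.array([[0.9, 0.2],
              [0.1, 0.8]])           # O[o, s]   = P(o | s)
pi = np.array([0.5, 0.5])            # stationary initial distribution
A = [T @ np.diag(O[o]) for o in range(2)]   # A_o[s2, s1] = P(o|s1) P(s2|s1)
one = np.ones(2)

# Low-order moments over tests (future observations) and histories
# (past observations). Here they are computed exactly from the HMM;
# in practice they are empirical estimates from observed sequences.
P1 = np.array([one @ A[t] @ pi for t in range(2)])                    # P(x1)
P21 = np.array([[one @ A[t] @ A[h] @ pi for h in range(2)]
                for t in range(2)])                                   # P(x1=h, x2=t)
P3x1 = [np.array([[one @ A[t] @ A[o] @ A[h] @ pi for h in range(2)]
                  for t in range(2)])
        for o in range(2)]                                            # P(x1=h, x2=o, x3=t)

# Spectral step: SVD of the test-history co-occurrence matrix gives
# the predictive subspace U.
U, _, _ = np.linalg.svd(P21)
k = 2                                # true linear dimension of the system
U = U[:, :k]

# Transformed (observable-operator) parameters of the learned model.
pinv = np.linalg.pinv(U.T @ P21)
b1 = U.T @ P1                                  # initial predictive state
binf = np.linalg.pinv(P21.T @ U) @ P1          # normalization vector
B = [U.T @ P3x1[o] @ pinv for o in range(2)]   # one update operator per observation

def predict(seq):
    """P(o_1, ..., o_n) = binf^T B_{o_n} ... B_{o_1} b1."""
    b = b1
    for o in seq:
        b = B[o] @ b
    return float(binf @ b)
```

With exact moments and `k` equal to the system's true rank, `predict` reproduces the HMM's sequence probabilities; with sampled moments it converges to them as data grows, which is the statistical-consistency property that motivates spectral methods.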