Fast polarization is crucial for the performance guarantees of polar codes. In the memoryless setting, the rate of polarization is known to be exponential in the square root of the block length. A complete characterization of the rate of polarization for models with memory has been missing. We consider polar codes for processes with memory that are characterized by an underlying ergodic finite-state Markov chain. We show that the rate of polarization for these processes is the same as in the memoryless setting, both for the high-entropy set and for the low-entropy set. Thus, polar codes achieve the Markov capacity in many information-theoretic applications.
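
As a rough illustration of what polarization means in the memoryless baseline mentioned above, the following sketch tracks the Bhattacharyya parameters of a binary erasure channel under the standard polar transform, where a channel with erasure probability z splits into two channels with parameters 2z - z^2 and z^2. This is only the textbook memoryless case, not the Markov setting studied here; the function names and the thresholds are illustrative assumptions.

```python
def polarize_bec(erasure_prob: float, levels: int) -> list[float]:
    """Return the 2**levels Bhattacharyya parameters obtained by applying
    the polar transform `levels` times to a BEC with the given erasure
    probability (memoryless case only)."""
    z = [erasure_prob]
    for _ in range(levels):
        nxt = []
        for v in z:
            nxt.append(1.0 - (1.0 - v) ** 2)  # "minus" channel: 2v - v^2
            nxt.append(v * v)                 # "plus" channel: v^2
        z = nxt
    return z


def fraction_below(values: list[float], threshold: float) -> float:
    """Fraction of synthetic channels whose parameter is below threshold."""
    return sum(v < threshold for v in values) / len(values)


def fraction_above(values: list[float], threshold: float) -> float:
    """Fraction of synthetic channels whose parameter is above threshold."""
    return sum(v > threshold for v in values) / len(values)


if __name__ == "__main__":
    eps = 0.5   # assumed erasure probability, chosen for illustration
    beta = 0.4  # any exponent below 1/2
    for levels in (4, 8, 12, 16):
        n = 2 ** levels
        z = polarize_bec(eps, levels)
        delta = 2.0 ** (-(n ** beta))  # threshold of the form 2^(-N^beta)
        print(levels, fraction_below(z, delta), fraction_above(z, 1.0 - delta))
```

The printed fractions show how many synthetic channels fall below 2^(-N^beta) or above 1 - 2^(-N^beta) at each block length N = 2^levels. The average of the parameters stays equal to the original erasure probability at every level, which is a convenient sanity check on the recursion.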