The problem of estimating the Kullback-Leibler divergence D(P||Q) between two large-alphabet distributions P and Q with alphabet size k is studied. It is first shown that no estimator is uniformly consistent over the set of all pairs of distributions. A restricted set, containing pairs of distributions whose density ratio is bounded by a constant that may depend on k, is therefore considered. For this restricted set, a minimax optimal estimator is constructed by combining polynomial approximation with the plug-in approach. This estimator is shown to require a factor of log k fewer samples than the plug-in estimator. An application of the minimax optimal estimator to universal outlier detection is further studied, demonstrating its superior performance compared to the plug-in approach.
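
As a point of reference, the following is a minimal Python sketch of the plug-in baseline discussed above (the abstract itself gives no code, so the function name, the additive smoothing of Q, and all parameters are illustrative assumptions). The smoothing keeps the estimated density ratio finite, loosely mirroring the bounded-ratio assumption; the minimax optimal estimator would additionally replace the plug-in terms for low-count symbols with polynomial approximations, which is not shown here.

```python
import numpy as np

def plugin_kl(x_samples, y_samples, k, smoothing=1.0):
    """Plug-in (empirical) estimate of D(P||Q) over alphabet {0, ..., k-1}.

    `smoothing` pseudo-counts are added to the Q estimate so that
    q_hat > 0 everywhere, keeping the estimated density ratio finite.
    """
    # Empirical distribution of P from the samples
    p_hat = np.bincount(x_samples, minlength=k) / len(x_samples)
    # Smoothed empirical distribution of Q
    q_counts = np.bincount(y_samples, minlength=k) + smoothing
    q_hat = q_counts / q_counts.sum()
    # Symbols with p_hat = 0 contribute nothing to the sum
    mask = p_hat > 0
    return float(np.sum(p_hat[mask] * np.log(p_hat[mask] / q_hat[mask])))

# Hypothetical usage on synthetic distributions:
rng = np.random.default_rng(0)
k = 1000
p = rng.dirichlet(np.ones(k))
q = rng.dirichlet(np.ones(k))
x = rng.choice(k, size=20000, p=p)
y = rng.choice(k, size=20000, p=q)
print(plugin_kl(x, y, k))
```

This sketch exhibits the bias problem that motivates the paper: symbols that appear rarely in the samples distort both p_hat and q_hat, which is why the plug-in approach needs roughly a log k factor more samples than the minimax optimal construction.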