Non-linear functionals of probability densities arise in applications in machine learning, signal processing, and statistical estimation. Important examples of such functionals include the Shannon and R\'{e}nyi entropies and the Csisz\'{a}r $f$-divergences. In many of these applications, the functional of interest must be estimated empirically from sample realizations of the underlying densities. Although this estimation problem has received significant attention in the mathematical statistics community, general results on rates of convergence of estimators are unavailable. Because the rate of convergence relates the number of samples to the performance of the estimator, such rates have great practical utility: they can be used to drive parameter selection in algorithms, yield insight into algorithmic strengths and weaknesses, and predict performance in applications. In this work we derive convergence rates for a class of estimators of non-linear functionals. This class of estimators exploits a close relation between density estimation and the geometry of proximity neighborhoods in the data sample. We present a statistical analysis of the bias and variance of these estimators, including rates of convergence, and establish weak convergence results (a central limit theorem). We apply these results to optimally select estimator tuning parameters and to derive confidence intervals for the non-linear functional. To illustrate the usefulness of the theory, we apply it to (i) accurately determine the intrinsic dimension of high-dimensional data sets and (ii) detect anomalies in wireless sensor networks.
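For concreteness, we recall the standard definitions of the functionals named above; the notation here is ours and may differ from that used in the body of the paper. For a density $f$ on $\mathbb{R}^d$, a second density $g$, an order $\alpha \neq 1$, and a convex function $\phi$ with $\phi(1) = 0$ (we write $\phi$ rather than $f$ to avoid a clash with the density symbol), the Shannon entropy, R\'{e}nyi entropy, and Csisz\'{a}r divergence are
\begin{equation*}
H(f) = -\int f(x) \log f(x)\, dx, \qquad
H_\alpha(f) = \frac{1}{1-\alpha} \log \int f^{\alpha}(x)\, dx, \qquad
D_\phi(f \,\|\, g) = \int g(x)\, \phi\!\left( \frac{f(x)}{g(x)} \right) dx.
\end{equation*}
Each is a non-linear functional of the underlying densities, which is what makes its empirical estimation from samples non-trivial.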