Using a small number of nodal observations, this contribution introduces methods for semi-supervised classification of graph nodes - a task encountered in many real-world networks such as those derived from social interactions and biological dependencies. Furthermore, feature-based semi-supervised classification can also be tackled via graph-based methods by constructing a graph based on similarity metrics. In this context, diffusion-based classifiers such as the personalized PageRank and the heat kernel have been shown to exhibit remarkable classification accuracy at minimal computational overhead. Such classifiers rely on iteratively diffusing (or propagating) information from labeled to unlabeled nodes. The present work develops a suite of schemes for searching within the class of possible diffusion functions and selecting the one that maximizes graph-based metrics such as smoothness or conductance, while also achieving good fit on the labeled nodes. Experiments on real networks demonstrate that adapting the diffusion function to the given graph and observed labels may significantly improve accuracy over standard non-adaptive diffusions. Similar to PageRank and Heat kernel, the complexity of the novel approaches scales linearly with the number of edges, which highlights their merits for learning over large-scale graphs.