This paper studies the reliability of communication over wideband multipath channels in the finite-energy regime. Motivated by measurement results and physical arguments, our analysis focuses on sparse multipath scenarios, where the degrees of freedom (DoF) in the channel scale sub-linearly with the signaling dimension (time-bandwidth product). This is in contrast to the implicit assumption of rich multipath in most existing works, where the DoF scale linearly. We consider signaling using orthogonal short-time Fourier (STF) waveforms, which serve as approximate eigenfunctions for underspread channels, and relate multipath sparsity in the delay-Doppler domain to channel coherence in time and frequency. Our focus is on the non-coherent scenario, where we employ a training-based STF communication scheme and investigate its reliability under a finite-energy constraint using random coding error exponents. Our analysis reveals a fundamental tradeoff between channel learnability and diversity: optimizing this tradeoff at any given packet payload and energy yields the largest error exponent and hence the smallest probability of error. For channels that are asymmetrically sparse in delay and Doppler, we show that the minimum error probability is achievable at all code lengths (signaling dimensions) by appropriately adapting the STF packet configuration (signaling duration and bandwidth) to the level of sparsity in delay and Doppler. For symmetrically sparse channels, we show the existence of an optimal code length at which the minimum error probability is achieved, regardless of the packet configuration. Numerical results illustrate the implications of the analysis.