We consider the problem of learning sparsely used overcomplete dictionaries, where each observation consists of a sparse combination of the mutually incoherent dictionary elements. Our method consists of a clustering-based initialization step that gives a reasonably accurate initial estimate of the true dictionary. This estimate is further improved via an iterative algorithm with the following alternating steps: 1) estimation of the dictionary coefficients for each observation through ℓ1 minimization, given the dictionary estimate and 2) estimation of the dictionary elements through least squares, given the coefficient estimates. We establish that, under a set of sufficient conditions, our method converges at a linear rate to the true dictionary.