In this paper, we propose a novel coupled dictionary training method for single-image super-resolution (SR) based on patchwise sparse recovery, where the learned coupled dictionaries relate the low-resolution (LR) and high-resolution (HR) image patch spaces via sparse representation. The learning process enforces that the sparse representation of an LR image patch in terms of the LR dictionary can accurately reconstruct its underlying HR image patch with the HR dictionary. We model the learning problem as a bilevel optimization problem, where the optimization includes an ℓ1-norm minimization problem in its constraints. Implicit differentiation is employed to calculate the desired gradient for stochastic gradient descent. We demonstrate that our coupled dictionary learning method can outperform the existing joint dictionary training method both quantitatively and qualitatively. Furthermore, for real applications, we speed up the algorithm approximately 10 times by learning a neural network model for fast sparse inference and by selectively processing only the visually salient regions. Extensive experimental comparisons with state-of-the-art SR algorithms validate the effectiveness of our proposed approach.
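To make the patchwise sparse recovery step concrete, the following is a minimal sketch of inference only, not the dictionary training procedure and not the authors' implementation. It assumes the two dictionaries are already given as plain lists of rows, and it swaps in ISTA (iterative soft-thresholding) as a generic ℓ1 solver. The names `sparse_code_ista`, `reconstruct_hr_patch`, `D_l`, `D_h`, and the regularization weight `lam` are all hypothetical and chosen for illustration.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t (proximal operator of t * |v|)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def sparse_code_ista(y, D, lam=0.1, n_iter=500):
    """Approximately minimize 0.5 * ||y - D z||^2 + lam * ||z||_1 with ISTA.

    Assumes D is a non-zero matrix with one row per patch dimension
    and one column per dictionary atom.
    """
    # The squared Frobenius norm upper-bounds the Lipschitz constant of the
    # gradient, so 1 / L is a safe (if conservative) step size.
    L = sum(d * d for row in D for d in row)
    Dt = transpose(D)
    z = [0.0] * len(D[0])
    for _ in range(n_iter):
        residual = [p - q for p, q in zip(matvec(D, z), y)]
        grad = matvec(Dt, residual)
        z = [soft_threshold(zi - gi / L, lam / L) for zi, gi in zip(z, grad)]
    return z


def reconstruct_hr_patch(y_lr, D_l, D_h, lam=0.1):
    """Sparse-code an LR patch over D_l, then reuse the code with D_h."""
    z = sparse_code_ista(y_lr, D_l, lam)
    return matvec(D_h, z)


# Toy example: 2-dimensional LR patches, 3-dimensional HR patches, 2 atoms.
D_l = [[1.0, 0.0], [0.0, 1.0]]
D_h = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
hr_patch = reconstruct_hr_patch([1.0, 0.05], D_l, D_h, lam=0.1)
```

In this toy setup the small second LR coefficient is thresholded to zero, so only the first atom contributes to the HR patch. The key design point mirrored from the abstract is that the same sparse code is shared by both dictionaries; the training method is what makes that shared code reconstruct the HR patch well.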