Noise reduction is an active research area in image processing because of its importance in improving image quality for object detection and classification. In this paper, we develop a sparse-representation-based noise reduction method for hyperspectral imagery. The method relies on the assumption that the noise-free component of an observed signal can be sparsely decomposed over a redundant dictionary, whereas the noise component cannot. The main contribution of the paper is the introduction of nonlocal similarity and the spectral-spatial structure of hyperspectral imagery into sparse representation. Nonlocal similarity refers to the self-similarity of an image, by which the whole image can be partitioned into groups of similar patches. The similar patches in each group are sparsely represented with a shared subset of dictionary atoms, which makes the true signal and the noise easier to separate. Sparse representation with spectral-spatial structure exploits the joint spectral and spatial correlations of hyperspectral imagery by using 3-D blocks instead of 2-D patches for sparse coding, which further helps distinguish the true signal from the noise. Moreover, hyperspectral imagery contains both signal-independent and signal-dependent noise, so a mixed Poisson-Gaussian noise model is used. To make the sparse representation insensitive to the differing noise distributions across blocks, a variance-stabilizing transformation (VST) is applied to make their variances comparable. The advantages of the proposed method are validated on both synthetic and real hyperspectral remote sensing data sets.
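
As an illustration of the variance-stabilization step, the sketch below applies the generalized Anscombe transform, a commonly used VST for mixed Poisson-Gaussian noise. The abstract does not state which VST is used, so this choice is an assumption made for illustration only, as are the function names, the parameters `gain`, `sigma`, and `mu`, and the simulated intensities.

```python
import math
import random
import statistics


def generalized_anscombe(z, gain=1.0, sigma=0.0, mu=0.0):
    """Generalized Anscombe transform (assumed VST, not confirmed by the paper).

    Assumed observation model: z = gain * p + n, with p Poisson distributed
    and n Gaussian with mean mu and standard deviation sigma.
    """
    arg = gain * z + 0.375 * gain ** 2 + sigma ** 2 - gain * mu
    # Clamp at zero so that strongly negative samples do not break the root.
    return (2.0 / gain) * math.sqrt(max(arg, 0.0))


def sample_poisson(lam, rng):
    """Draw one Poisson sample with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate_variances(intensity, gain=1.0, sigma=1.0, n=5000, seed=0):
    """Return (raw variance, stabilized variance) for one signal intensity."""
    rng = random.Random(seed)
    raw = [
        gain * sample_poisson(intensity, rng) + rng.gauss(0.0, sigma)
        for _ in range(n)
    ]
    stabilized = [generalized_anscombe(z, gain, sigma) for z in raw]
    return statistics.pvariance(raw), statistics.pvariance(stabilized)


if __name__ == "__main__":
    # Hypothetical intensities standing in for dark and bright blocks.
    for intensity in (10.0, 30.0, 60.0):
        raw_var, stab_var = simulate_variances(intensity)
        print(f"intensity={intensity:5.1f}  raw var={raw_var:7.2f}  "
              f"stabilized var={stab_var:5.2f}")
```

Under these assumed settings, the raw variance grows with the signal intensity, whereas the variance after the transform should stay close to 1 for all three intensities. This is the property that lets blocks with different noise levels be treated with a comparable sparse-coding error tolerance.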