Hyperspectral Image Denoising Using 3-D Geometrical Kernel With Local Similarity Prior

The majority of existing hyperspectral (HS) image denoising methods exploit local similarity in HS images by rearranging them into the matrix or vector forms. As the typical 3-D data, the inherent spatial and spectral properties in HS images should be simultaneously explored for denoising. Therefore, a 3-D geometrical kernel (3DGK) is developed in this article to describe the local structure. The proposed method assumes that the pixel can be represented by other pixels within a 3-D block efficiently owing to the local similarity with adjacent positions. Then, the HS image is modeled by the 3-D kernel regression with L1-norm constraint, in which the local similarity is captured by the proposed 3DGK. To efficiently compute the parameters in 3DGK, geometrical structures, such as scale, shape, and orientation in the 3-D block, are estimated from the gradient information approximately. Finally, the noises are effectively removed while preserving the structures in HS images. Moreover, experimental results on simulated and real datasets demonstrate that the performance of 3DGK is better than those of the methods based on local similarity prior.


I. INTRODUCTION
W ITH the rapid development of imaging sensors, hyperspectral (HS) remote sensing technology has been attracting much attention and HS images have also been applied in many scene interpretation tasks, such as land-use classification [1], target detection [2], and unmixing [3]- [6], due to containing abundant spectral information. However, during acquiring or transmitting HS images are always corrupted by some undesired noises [7], which have some impacts on the performance of the above-mentioned tasks. Therefore, it is vital to efficiently remove the noises and recover the HS image accurately.
Over the past two decades, many methods are proposed to achieve the denoising of HS images by adopting different tools. For example, Li et al. [8] proposed an HS image denoising method based on distributed sparse representation [9], which exploited the intraband and interband information for dictionary learning. Qian and Ye [10] introduced the spatial and spectral structure into sparse representation for HS denoising, in which similar patches are collected to ensure the sparsity. Fu et al. [11] learned an adaptive spatial-spectral dictionary from the noisy HS images for noise removal. Subsequently, Zhuang and Bioucas-Dias [12] combined low-rank and sparse representation to capture the spatial and spectral correlation in HS images. In [13], Xiong et al. embedded the low-rank prior into tensor to model the correlations in both spectral and spatial domains, which is regularized by L 0 gradient term. Besides, to capture the nonlocal self-similarity [14], Zhang et al. [15] proposed a total variation (TV) regularized nonlocal low-rank tensor decomposition model to restore the clean HS images. For accurate weight estimation, Hu et al. [16] proposed an adaptive nonlocal means denoising method for the denoising of HS images. Xue et al. [17] further explored the global correlation across the spectrum and nonlocal similarity in HS images to regularize the low-rank tensor model. Besides, Xie et al. [18] proposed an intrinsic tensor sparsity measure to encode the low-rank prior in the Tucker decomposition model, which also considered the nonlocal similarity. Chen et al. [19] utilized the tensor-ring decomposition model to construct high-order tensors, which improves the ability for noisy HS images. Moreover, other priors, such as nonnegativity [20], are also incorporated with the tensor model for HS image denoising.
Besides, a large number of methods are also specifically developed for HS image denoising by integrating local information priors, which has been widely focused and extensively applied to HS image denoising. Local prior mainly captures the inherent information of HS image locally, because it is difficult to describe the global prior in the whole HS images directly. As a popular model, TV [21] is proposed to explore the local gradient information in images, which is extended for HS images by considering the spectral property. For example, a spectral-spatial adaptive hyperspectral TV (SSAHTV) is proposed in [22] to further exploit the noise intensity differences and spatial property differences. Further, the anisotropic spatial and spectral TV is designed to deal with the piecewise structures in [23] and low-rank tensor decomposition is considered to capture the global spatial and spectral correlation of HS images. Meanwhile, He et al. [24] also employed This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ TV to regularize the local low-rank decomposition model for denoising of HS images. Fan et al. [25] presented a spatial and spectral TV model to deal with spectral structure loss led by the bandwise TV. In [26], 1-D TV along the spectral dimension is introduced to preserve the spectral signatures. Besides, local similarity (LS) [27] is also an efficient prior commonly used for HS image denoising. It is assumed that the pixel values in a local region are similar and they can represent each other due to the close location among them. For example, Zhao and Yang [28] utilized sparse and low-rank constraints to capture the local redundancy and correlation in HS images, which can well preserve the spatial and spectral details in the denoised HS images. Lu et al. [29] divided all bands in HS images into several groups according to spectral correlation and then spatial adaptive LS is considered to find related pixels. In [30], the local similar pixels are defined by superpixel segmentation, which avoids dividing HS images into square patches. Recently, some HS image denoising methods based on deep neural networks [31]- [34] are also proposed and acquire satisfactory results. These methods hierarchically extracted image features from local regions, which are also inspired by local prior. For example, the local structures in HS images are extracted in convolutional neural networks level by level [35].
As a powerful tool to model local relationships, kernel-based methods are usually employed to describe the LS prior among adjacent pixels and have been generally used for natural image denoising. For example, Takeda et al. [36] analyzed the local structures in images by kernel regression and obtained better denoising results. In [37], Bouboulis et al. adopted the semiparametric formulation to establish the adaptive kernel for image denoising. Subsequently, the local geometrical structures are further exploited in [38] and the sparsity of neighbors is cast on the steerable kernel. Then, the LS prior captured by the kernel is also applied to the denoising of HS images. For example, Zhong and Wang [39] considered the conditional random field to handle the edges and textures in a local region, in which a filter bank is used to represent the structures. Moreover, kernel regression is also utilized in [40] and a structure tensor is defined to explore the local gradient orientation information. However, the spectral information is represented by a fixed vector in [40], which cannot adaptively estimate the features in the spectral direction. These methods obtain better denoised HS images, the inner structures in 3-D blocks of HS images are not analyzed carefully and deeply. The LS between different pixels is not only related to the distance in position but also decided by the geometrical structures, such as edges and textures. As shown in Fig. 1(a), the pixel p locating in the sharp edge is not similar to pixel p 1 although the distance in position between them is close, where the pixels are labeled by a circle. Inversely, p looks very similar to the pixel p 2 in intensity because they are in the same edge structure. Moreover, we compute the correlation coefficients (CC) of the adjacent bands of the HS image in Fig. 1(a) and the CC curve is displayed in Fig. 1(b). From Fig. 1(b), one can see that the similarity in adjacent bands is high owing to the high spectral resolution. Thus, the similarity in the spectral domain should also be considered. Therefore, to simultaneously exploit the local geometrical structure and spectral correlation in HS images, we proposed a new 3-D geometrical kernel (3DGK) to remove the noises in HS images from the perspective of kernel regression in this article. It is assumed that the other pixels in a 3-D block centered by the pixel to be denoised are similar to the center pixel. Then, the similarity weights between the center pixel and the other neighboring pixels are measured by the 3DGK. In the proposed method, the inner structures within a local 3-D block of HS images can be approximated adaptively by the 3DGK, where the scale, shape, and orientation of the kernel are deformable and can be adjusted by elongation, rotation, and scaling. The estimated kernel not only considers the distance information but also makes full use of the spatial and spectral structures. In 3DGK, the similarity weights can be inferred more reasonably. Through the geometrical structure prior, the noises in HS images can be reduced efficiently while preserving the spectral and spatial details.
The contributions of the proposed method can be summarized as follows.
1) We present a novel 3DGK regression model for HS images denoising, which adopts the geometrical structures in 3-D blocks to capture the LS in spatial and spectral domains. The geometrical structures are reflected by the weights assigned to each pixel within the 3-D blocks. 2) We analyze the geometrical structures in the 3-D blocks deeply and infer the scaling, elongation, and rotation factors from the local gradient information in the 3-D blocks to estimate more accurate weights. The rest of this article is organized as follows. In Section II, the 3DGK is constructed and its denoising process is described. In Section III, experimental results on different datasets and noise distributions are given. Finally, Section IV concludes this article.

II. 3DGK DENOISING
In this section, according to the LS assumption, the pixel to be denoised in a 3-D block is estimated by the neighboring pixels under the 3-D kernel regression framework. Then, the inner geometrical structures are adaptively analyzed by elongation, rotation, and scaling. Finally, the denoised HS images are produced by the proposed 3DGK.

A. Kernel Regression in HS Images
Similar to the model in [36], the denoising model can be written as (1) for the noisy HS image according to regression theory [41] p where p i is the noised pixel value.
T denotes the position of the pixel. f (·) is a regression function. Here, the two spatial dimensions correspond to horizontal and longitudinal axes, respectively. The spectral dimension is viewed as the vertical axis. So, x h i , x l i , and x v i are coordinate values of p i . ε i is the noise. N equals to the number of estimated pixels and N = b 3 for a 3-D block with size b × b × b. In a local 3-D block of HS image, the center pixel to be denoised is similar to the neighboring pixels. Then, the center pixel can be well represented. Thus, in the local 3-D block, the Taylor expansion of (1) is calculated as where ∇ and H stand for Gradient and Hessian operators, respectively. Due to the symmetry property of the Hessian matrix, (2) can be rewritten as where ltr lexicographically rearranges all elements in the lower triangular part of a matrix into a column vector. β 0 is equivalent to f (x), which is the pixel value to be estimated. β 1 and β 2 are computed by (4) and (5), respectively Then, β 0 , β 1 , and β 2 can be obtained by minimizing (6). Besides, to produce clearer spatial details, we adopt L 1 -norm in (6) to replace the L 2 -norm used in [36]; then, the optimization of (6) becomes the minimization of the weighted where denotes elementwise multiplication. The related symbols in (6) are defined as follows: is a kernel determined by a smoothing factor H whose size is 3 × 3. To estimate the denoised pixel values, an auxiliary variable is introduced for efficient optimization of (6), which is formulated as Then, the augmented Lagrangian multiplier method (ALM) [42] is considered to solve (11). So, the augmented Lagrange function of (11) is written as where ρ is the penalty parameter and y is the Lagrangian multiplier. According to the iterative scheme, q is obtained by solving the subproblem Equation (13) can be optimized efficiently by the soft threshold operator in [43].
For b, its subproblem is By replacing the elementwise multiplication with the matrix multiplication and simplification, (14) can be reformulated as where W x = diag(w x ). diag(·) transforms a vector into a diagonal matrix. This is a least-squares optimization problem, whose closed-form solution can be obtained by setting the derivate of (15) w.r.t. b as 0. The derivate of (15) w.r.t. b is Finally, the multiplier y is updated by Taking the computation complexity into consideration, the iteration number is set as 20 due to the efficiency and rapid convergence of ALM. It is obvious that the value of the estimated pixel is β 0 , which is the first element in b. Naturally, if the weights in W x are given, the denoised pixel value can be computed easily.

B. 3-D Geometrical Kernel
It can be found that the denoising performance of (6) depends upon the weight matrix W x , which is derived from the kernel K H (x). So, the choice of the kernel is critical. A typical kernel is a Gaussian function, which uses the information of location distance between the center pixel and the neighboring pixels. For example, the Gaussian kernel is employed in [44] to find the smooth or texture components of HS images. It is obvious that the similarity may be low for the close pixels in location distance if they have different spatial or spectral appearances. In addition to the distance information, the pixels are very similar in a 3-D block of HS images, which not only have similar intensity in the spatial domain but also are similar in spectral response due to the high spectral resolution of HS images. Besides, there exist abundant geometrical structures, such as edges or textures in the 3-D block, which also should be considered for HS image denoising. As shown in Fig. 1(a), the pixels along the edges have a more similar intensity and the pixel values are closer in the smooth regions. For spectral domain, the structure similarity also can be found.
Although the geometrical structure prior is utilized in [36] for image denoising, only spatial information is calculated to estimate the weights that cannot take full advantage of the spectral information in HS images. For K H (x), its property is packaged inside H. Therefore, H should carry the location distance, intensity, and structure information in a 3-D block of HS images. Considering the requirement, 3DGK is constructed to simultaneously capture the geometrical information of HS images in the spatial and spectral domains, which is formulated as h is a smoothing parameter. μ i denotes the local density. C i contains the geometrical information, such as the orientation and magnitude of structures in a 3-D block, which can be reflected roughly by the covariance matrix of gradients in spatial and spectral domains. Thus, C i is defined by (19) shown at the bottom of this page, where g h (x j ), g l (x j ), and g v (x j ) are the gradients of the pixel in x j along horizontal, longitudinal, and vertical directions, respectively. B i is the 3-D block whose central point is the pixel p i . However, it is time-consuming to compute C i directly. Fortunately, C i can be decomposed approximately by (20) owing to the symmetry in where γ i is the scaling factor of the 3-D kernel. According to the elongation matrix Λ i , the principal axes of the 3-D kernel are elongated. t i , v i , and w i are the elongation factors on the principal axes of the 3-D kernel, respectively. The rotation of the 3-D kernel is denoted by U i , which can be modeled as (22) shown at the bottom of the next page, where R x (θ i ), R y (φ i ), and R z (ψ i ) are the rotation matrices along with different directions. Thus, the rotation of principal axes is achieved by U i . θ i , φ i , and ψ i are the rotation angles on horizontal, longitudinal, and vertical axes, which can be efficiently from U i by dcm2angle in MATLAB.
For an intuitive understanding, the scaling, elongation, and rotation of a 3-D kernel are shown in Fig. 2, which gradually matches the spatial and spectral structures in the 3-D block. The 3-D ball kernel is first resized by γ i . Then, the lengths of the three axes are adjusted by Λ i . Finally, a triaxial ellipsoid kernel is rotated along with different directions through U i and the 3DGK is obtained. Through the transformation of the above operations, the structures in the 3-D block can be efficiently represented.
According to the image local analysis in [45], the orientation of the structure in the 3-D block can be derived from its gradients.
block. g h , g l , and g v are column vectors, which are comprised of the gradients of the 3-D block in different directions. The relationship between G i and C i can be written as Then, U i and Λ i can be calculated from the singular value decomposition (SVD) of G i , which is (24) where V i , Σ i , and U i are the corresponding decomposed results of G i after SVD. U i in (20) is the same as that U i in (24) and Σ i = diag ([s i,1 , s i,2 , s i,3 ]). So, the parameters in (20) are efficiently inferred from (24) due to the numerical stability and efficiency of SVD. For example, Λ i can be regarded as the amplitude of the principal axes of the 3-D kernel, whose elements are calculated by where α is a regularization parameter to avoid meaningless values. By (25), the elongation factors can be normalized to avoid the large variation caused by pixel values. The shape of the 3-D kernel is controlled by Λ i . In homogeneous or smooth regions, the 3-D kernel is close to a ball in shape. In some heterogeneous or edges regions, the 3-D kernel is adjusted to capture the structure information. Therefore, the geometrical structure is embedded into the 3-D kernel naturally. Besides, the scaling factor γ i is defined as where M is the number of pixels in the 3-D blocks. γ i is also related to the intrinsic structure in the 3-D block. For smooth areas, the values of s i,1 , s i,2 , and s i,3 will be close because the pixel values are similar in smooth areas. So, the kernel size should be large, which is decided by γ i . Through a large kernel, more pixels can be used to reconstruct the central pixel in the 3-D block. In edge or texture areas, the pixel values in the 3-D block will have large differences and s i,1 , s i,2 , and s i, 3 in different directions are also quite different. Then, the kernel will be small. Fig. 3 displays the estimated 3DGK from some typical geometrical structures in the Pavia University (PaviaU) dataset, such as smoothness, edge, and texture, for visual analysis. It can be observed that the structure of the estimated kernel is consistent with that in the corresponding 3-D block in Fig. 3. Because the weights within the 3-D kernel are invisible in 3-D space, we project the estimated 3-D kernels into 2-D space along with three straightforward directions. The horizontal and longitudinal directions are along the spatial dimensions of the 3-D blocks, respectively. The vertical direction is consistent with the spectral dimension. For example, we can see that larger weights mainly concentrate on the center. In smooth blocks, most pixel values are similar. Thus, the weights are mainly dependent upon distance in position, which means the weights are larger for the pixels closer to the center. For the edge block, the projections on horizontal and longitudinal directions reflect that the neighboring pixels are more similar to the center pixel along the vertical direction, namely spectral dimension. Because the similarity is high among adjacent bands. In the vertical projection, the structure looks like the spatial information in the edge block. For the texture block, larger weights are obtained along spectral dimension due to the complex structures in the spatial domain. We can see that the structures in 3-D blocks are dominated by the edge for edge blocks. For texture blocks, their structures are complex and there are no obvious principal directions. Naturally, the kernel area in the vertical projection of the texture blocks is small.
Through 3DGK, the structure in 3-D blocks can be well exploited. Therefore, β 0 that is the desired pixel value can be obtained through (6) once the 3-D kernel is referred from the local structures of HS images. Finally, HS image denoising is realized by the proposed 3DGK.

C. Complexity Analysis
The proposed denoising method is composed of two parts: the estimation of 3DGK for each pixel and the reconstruction of the denoised image. In the first part, gradient calculation and the approximation of covariance matrix are involved. The complexity is O(3W HB) for computing the gradients of the image with size W × H × B. W and H are the width and height of HS images, respectively. B denotes the number of bands in HS images. When estimating the covariance matrix, SVD is utilized whose complexity is O(18NW HB) for total pixels. N stands for the number of pixels in the 3-D block. Thus, the computational complexity is O ((3 + 18N )W HB). For the second part, the complexity is dominated by the update of q and matrix multiplication in (15). The complexity of the update of q is O(NW HB) for the involved elementwise operation. For b, the complexity is O ((10N 2 + 100N )W HB) to solve (15). Therefore, it can be found that the second part of 3DGK dominates most of the computing time, and the computational complexity increases with the increasing of the 3-D block size for a given HS image. Fortunately, the method can be achieved in parallel for acceleration because the pixels in the HS image can be estimated independently.

III. EXPERIMENTAL RESULTS AND COMPARISONS
In this section, experiments on simulated and real datasets are conducted to verify the effectiveness of the proposed method. The denoised results of the proposed methods are compared with those of some kernel-based denoising methods, including SK [36], SDGNLM [38], and AKR [40]. Besides, some methods based on local or global similarity are also considered for comparisons, such as SSAHTV [22], 3DNLM [46], LRMR [47], and E3DTV [48]. The denoised results of all methods are analyzed in visual quality and then some numerical indicators are employed to evaluate the denoised images objectively. The results of simulated datasets are assessed by three indicators [49], i.e., mean peak signal-to-noise ratio (MPSNR), mean structural similarity (MSSIM), mean feature similarity (MFSIM), and spectral angle mapper (SAM). These indicators are the average of PSNR [50], SSIM [51], FSIM [52], and SAM on all bands, which are defined as where C equals to the number of bands in HS images. i i and j i are the spectral vectors of the ith pixel in the reference HS images and the denoised HS images, respectively. For the three indicators, the larger the values, the better the denoised results. CEIQ [53] and blind score (BS) [54] are considered to assess the denoised results of all methods on the real dataset. BS extracts the statistics features in HS images for assessment and has been used in many fields [55], [56]. The denoised images will be better if CEIQ is larger. However, smaller values mean better quality for BS. In the following experiments, h in 3DGK has an important influence on denoising results, which is related to the noise level. According to the formulation in [38], we find a linear function between h and noise standard deviation σ, expressed as: h = 4σ + 0.175. μ i about kernel is set as 1. The regularization parameters α and β are set as 0.01. The size of the 3-D block is 9 × 9 × 9 in 3DGK. For a comprehensive analysis, the influences of some key parameters on denoised images are given in Section III-D for 3DGK. For real datasets, the 3-D block size is set as 11 × 11 × 11. Because the noise level is unknown in real datasets, h is empirically set as 0.4 and 0.25 for Indian Pines (IP) data and Urban data, respectively.

A. Simulated Data Experiments
In the simulated data experiments, PaviaU and Washington DC Mall (WDCM) datasets are used to evaluate the performance of the proposed methods visually and quantitatively. PaviaU dataset is collected by the reflective optics system imaging spectrometer sensor, whose spatial resolution is 1.3 m. The spatial resolution of the WDCM dataset is 2.8 m, which is acquired by the hyperspectral digital imagery collection experiment (HYDICE) sensor in 1995. Owing to space limitation, a subimage with the size 256 × 256 × 191 is cropped from WDCM data as the reference image. For the PaviaU data, some bands are corrupted by heavy noises, which cannot be viewed as reference images. So, these bands in PaviaU data are removed and the size of the selected subimage is 256 × 256 × 98. The two datasets are shown in Fig. 4. For the simulated data, the pixel values of all bands are normalized into [0, 1]. In the simulated   Table I when the noise level increases from 0.05 to 0.2. For different noise levels, the proposed method 3DGK can provide a better performance in the indicators on the whole. For the noise HS image in Case 2, the denoised results have a better performance than the noise HS images with noise levels 0.15 and 0.2 for almost all methods. Besides, Fig. 5 also presents the PSNR values of all bands of the denoised results from all methods. From Fig. 5, one can see that the proposed method has better performance for the first 70 bands. LRMR [47] behaves well on the last several bands.
reference HS image and the denoised HS images from different methods on the 661-nm band. The reference image and noise image are displayed in Fig. 6(a) and (b), respectively. The denoised images for all methods are shown in Fig. 6(c)--(j). From Fig. 6, we can see that the noises in the HS image are removed visibly for all methods when compared with the noise image in Fig. 6(b). For SSAHTV [22], the noises are suppressed but some artificial effects are introduced, especially in the smoothing areas. For Fig. 6(d), the blurring of the image can be observed and the noises in Fig. 6(e) are not eliminated well. For the result of SDGNLM [38], although the noises are efficiently removed, the spatial details and textures are also oversmoothed. For the denoised images in Fig. 6(g) and (h), some noises can be still found. The result of the proposed method has a better visual performance. Besides, a region containing buildings and vegetation is selected for a comparison, which is magnified and put on the bottom right corner of the image. From the enlarged region, we can see that the details in Fig. 6(d) and (e) are more blurry than those in Fig. 6(g) and (h). For the region in Fig. 6(j), the spatial information is blurred and the noises are suppressed well. Besides, we can see that the reconstruction errors of the result from the proposed method are relatively small from the error maps in Fig. 6.
Moreover, some pixels are chosen and their spectral residual curves are plotted in Fig. 7. From Fig. 7, it can be found that the spectral differences are considerable between the reference curve and that of SDGNLM [38]. The curve of 3DNLM [46] also has an obvious difference from the reference curve. Compared with other difference curves, the spectral residual of 3DGK is relatively smaller, which can preserve the spectral features more efficiently.
For the WDCM dataset, the numerical indexes are reported in Table II. Similar results can be seen in Table II, where the denoising performance degrades with the increase of noise level. Compared with other methods, the proposed method provides better performance in the three indicators on the whole. The   PSNR curves from different methods are also displayed in Fig. 8. In Fig. 8, it can be seen that the PSNR values of AKR [40] are lower than those of other methods. The proposed method can produce better performance.
The denoised images of all methods are compared and one band is selected and exhibited in Fig. 9, in which the falsecolor composites of HS images are made up of three bands (2320, 1910, 590 nm). Moreover, Fig. 9 also presents the error maps between the reference HS image and the denoised HS images from different methods on the 2320-nm band. The reference band and the noised band are placed in Fig. 9(a) and (b). In Fig. 9(d) and (e), the blurring effects are visible, which may be caused by improper spatial and spectral structure estimation. Because the spectral structures of HS images are ignored in SK [36] and AKR [40]. For the result of SDGNLM [38], some textures in flat areas are obviously suppressed. In Fig. 9(g) and (h), the noises in some homogenous regions are not eliminated adequately. For the proposed method, the spatial structures are maintained well. A homogeneous area is chosen and enlarged for visual analysis in Fig. 9 and the enlarged area is put on the bottom right corner of the denoised image. For the local area in Fig. 9(c), some spatial information is distorted. The results in Fig. 9(d) and (e) are also blurred. It can be observed that many details are lost in the selected area of Fig. 9(f) although the noises are removed. Besides,  [22]. (d) SK [36]. (e) AKR [40]. (f) SDGNLM [38]. (g) 3DNLM [46]. (h) LRMR [47]. (i) E3DTV [48]. (j) 3DGK. some noises can be seen in the area of Fig. 9(h). For the proposed method, some blur effects can be found in the magnified region. From Fig. 9, it can be observed that smaller reconstruction errors are achieved for the proposed method Fig. 10 displays the residual curves on different locations compared with the reference spectral signatures. From Fig. 10, we can see that although the noises are removed for SDGNLM [38], the spectral features are also damaged. For most methods, the consistency is better when the band number is larger than 100 except SDGNLM [38]. For the proposed method, the residual curve is relatively stable when the band number is smaller than 100.

B. Real Data Experiments
In this part, two real datasets shown in Fig. 11 are utilized for performance analysis. The first data, IP data, is collected by AVIRIS in 1992, whose size is 145 × 145 × 220. The second data, named Urban, is obtained by HYDICE over an urban area, whose size is 307 × 307 × 210.
For the IP dataset, some bands, such as the 488-, 1772-, and 2418-nm bands, are selected from the denoising results of all methods and presented in Fig. 10, where the results in one column are produced by the same methods and the images in one row have the same band index. From Fig. 12, it can be found that the results from SDGNLM [38] are blurry severely. For the images with heavy noises, the visual performance of the results from SSAHTV [22] is unsatisfactory. For the results in Fig. 12(c) and (d), the noises are removed but some blurring effects arise. Fig. 12(g) provides better denoising results, which may be benefited from the constraint of low rank. For the results of the proposed method, spatial information is preserved but some noises can be still found from the image in the middle row. Besides, different regions in different bands are considered, which are enlarged for a direct comparison. In the selected regions of the images in the first row, we can see some spatial details or weak targets are oversmoothed in Fig. 12(f)-(i). For the images with heavy noises in the second row, the spatial information is enhanced well in Fig. 12(f)-(h), but the hue of Fig. 12(f) has an obvious difference from that of Fig. 12(a). In the last row of Fig. 12, the results of AKR [40], LRMR [47], and 3DGK behave well in visual performance. But the region in Fig. 12(d) is more blurry.
The denoising results of the Urban dataset are shown in Fig. 13 and three bands (1470, 1550, and 1650 nm) are provided for comparison. There exist some striping noises and mixed noises in the Urban dataset. From Fig. 13, we can see that the striping noises are suppressed in the results of different methods. But the spatial details are oversmoothed in the results of SDGNLM [38] and 3DNLM [46]. In Fig. 13(i), the stripes can be still viewed from the denoised result in the first row. For the results of AKR [40], blurring effects can be seen in the building areas. For the images in the last row, better visual performance is provided.
Similarly, some interesting regions are selected in the denoised results for comparison. From different regions, we can see a similar performance with the results in Fig. 12. In the second row of Fig. 13, some subtle textures are smoothed in Fig. 13(e) and (f) when removing the strips. The results of LRMR [47] have a better performance in spatial details but the noises are not suppressed well. The proposed method has a better performance in noise removal.
Besides, Table III lists the numerical results of these methods on IP and Urban datasets. Larger CEIQ represents better denoised results. For BS, the higher score means lower denoising quality. For the IP dataset, the proposed method provides better numerical values in CEIQ and BS, which means the denoised result of the proposed method behaves well in general. In the Urban dataset, the best CEIQ is from the proposed method, but LRMR [47] provides the best BS.

C. Robustness of 3DGK
To verify the robustness of the kernel estimation on different noise levels, we use the simulated PaviaU and WDCM datasets and then calculate the difference values of the rotation angles θ i , φ i , and ψ i of the kernel for noise pixels with the corresponding angles of the kernel for clear pixels. Then, the means of all relative errors for different noise levels are computed and plotted in Fig. 14. Fig. 14(a) demonstrates the estimation errors of the PaviaU dataset on rotation angles θ i , φ i , and ψ i when compared with its clear HS image. It can be observed that the relative errors vary with the increase of noise level. However, the relative errors of different angles are less than 0.07, which indicates the estimated kernel is robust to the noises. The relative errors of the WDCM dataset on rotation angles are shown in Fig. 14(b). From Fig. 14(b), we also can see that the relative errors are smaller. Although, the relative error of φ is greater than 0.1 for noise level with 0.2, which is affected by the heavy noises, the relative errors of θ and ψ are still less than 0.1. Compared with the PaviaU dataset, the relative errors of the WDCM dataset are greater, which may be caused by the complex structures of different objects in the WDCM dataset.

E. Running Time
In Table IV, we report the running time of all methods on different datasets for a comprehensive analysis. All methods are implemented on the same computer with Intel Core i7-6700 processor, 3.4 GHz, and 16 GB memory by MATLAB R2017a. From Table IV, it can be observed that regression-based methods, such as SK [36] and AKR [40], spend more time. Besides, 3DNLM [46] is also time-consuming because of the neighbor search in local areas. SSAHTV [22] and LRMR [47] are faster when compared with the proposed method. Compared with SDGNLM [38] and 3DNLM [46], 3DGK has low computational complexity.

IV. CONCLUSION
In this article, an HS image denoising method is developed by extending the kernel regression model to the 3-D formulation, which makes effective use of the spatial and spectral geometrical structures in HS images. In the proposed method, 3DGK is established by estimating the structure and orientation in a 3-D block, whose information is inferred from its gradients. Then, 3DGK is adjusted adaptively by scaling, elongation, and rotation to approximate the inner structure of the 3-D block. So, more proper weights can be assigned to the adjacent pixels in the 3-D block. Finally, the denoised pixels are obtained by efficiently minimizing the weighted L 1 -norm constraint through ALM. Experiments are conducted on simulated and real datasets. Compared with some methods based on LS prior, the proposed method has a better performance in qualitative and quantitative evaluations. In future work, the geometrical structure captured by 3DGK will be further explored and combined with other efficient priors, such as TV.