Remote Sensing Image Fusion based on Nonnegative Dictionary Learning

For the problem of panchromatic and multispectral remote sensing image fusion, we propose a remote sensing image fusion algorithm based on nonnegative dictionary learning. The basic idea of the algorithm is to learn a high-low resolution dictionary pair from the panchromatic image with high spatial resolution, and to improve the fusion result by exploiting the nonnegativity of image data. First, the high-resolution and low-resolution dictionaries are learned from the high spatial resolution panchromatic image by nonnegative dictionary learning. Then the multispectral image is sparsely represented over the low-resolution dictionary to obtain a coefficient matrix. Finally, the high-resolution multispectral image is reconstructed from the coefficient matrix and the high-resolution dictionary. Compared with state-of-the-art methods, the proposed algorithm achieves high spatial resolution while well preserving the spectral information of the multispectral image. Experimental results on real QuickBird and IKONOS remote sensing images validate the effectiveness of the proposed algorithm.


I. INTRODUCTION
Most remote sensing images obtained by optical earth observation satellites (such as IKONOS, QuickBird and Landsat) include panchromatic images with high spatial resolution and multispectral images with low spatial resolution. Panchromatic and multispectral remote sensing image fusion is also called pan-sharpening. Its purpose is to obtain multispectral images with both high spatial resolution and high spectral resolution, providing detailed and accurate image data for subsequent applications such as ground object recognition and classification.
Traditional remote sensing image fusion methods can be roughly divided into two categories: component-substitution methods and multi-resolution analysis methods. Component-substitution methods mainly include the IHS (intensity-hue-saturation) transform method [1], the PCA (principal component analysis) method [2,3] and the GS (Gram-Schmidt) method [4,5]. The basic idea of these methods is to replace a component of the multispectral image with the panchromatic image, thereby improving the spatial resolution of the multispectral image. Multi-resolution analysis methods include the wavelet transform, the pyramid transform and the NSCT transform [6-8]. Their basic idea is to decompose the multispectral and panchromatic images separately, fuse the decomposition coefficients, and then reconstruct the fused image through the inverse multi-resolution transform. This kind of algorithm can retain the spectral information of the multispectral image well, but spatial distortion may appear; typical phenomena are ringing effects, ghosting artifacts, and blurred edges and textures.
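To make the component-substitution idea concrete, the following minimal NumPy sketch performs an IHS-style fusion. The simple band-mean intensity, the moment-based histogram matching, and the name `ihs_fusion` are illustrative assumptions rather than the exact methods of [1-5].

```python
import numpy as np

def ihs_fusion(pan, ms):
    """Minimal IHS-style component substitution (illustrative only).

    pan : (H, W) high-resolution panchromatic image
    ms  : (H, W, B) multispectral image upsampled to the pan grid
    """
    # Intensity component: mean of the spectral bands.
    intensity = ms.mean(axis=2, keepdims=True)
    # Match the pan image's mean/std to the intensity component, then
    # substitute it for the intensity while keeping the spectral offsets.
    pan_matched = (pan - pan.mean()) * (intensity.std() / (pan.std() + 1e-12)) + intensity.mean()
    return ms - intensity + pan_matched[..., None]
```

The fused image inherits the spatial detail of the panchromatic image; spectral distortion appears exactly where the pan intensity differs from the true band intensity, which is the weakness of this class of methods noted above.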
In recent years, with the development of sparse representation theory [9-12], this technique has been applied to the fusion of panchromatic and multispectral remote sensing images and has achieved state-of-the-art fusion results. In [13], a sparse-representation-based remote sensing image fusion algorithm was first proposed, which uses similar high-resolution multispectral images to learn a high-resolution dictionary and realizes high-resolution reconstruction of the low-resolution multispectral image. In [14-17], a low-resolution multispectral dictionary and a high-resolution panchromatic dictionary are obtained by joint dictionary learning; the two dictionaries are then used to estimate a high-resolution multispectral dictionary through different models to realize remote sensing image fusion. In [18,19], the ARSIS concept is introduced into sparse-representation image fusion algorithms to further improve performance. Most of the above algorithms need similar high-resolution multispectral image samples to train the dictionary, but such images are often difficult to obtain, which limits the practical applicability of these algorithms. In addition, with the rise of deep learning [20-22], this technology has also been applied to remote sensing image fusion [23-25], but such algorithms need a large number of samples for training.
In view of the shortcomings of existing sparse-representation-based image fusion algorithms, we propose a remote sensing image fusion algorithm based on nonnegative dictionary learning. First, a high-low resolution dictionary pair is obtained from the panchromatic image by multi-task dictionary learning. Then the sparse representation coefficient matrix of the multispectral image over the low-resolution dictionary is solved. Finally, the high-resolution multispectral image is reconstructed from the coefficient matrix and the high-resolution dictionary to realize remote sensing image fusion. Compared with existing algorithms of this kind, our algorithm does not need a high-resolution multispectral image as a training sample, and because it exploits the nonnegativity of image data, it can obtain a fused image with higher spatial resolution and less spectral distortion.
The rest of this paper is organized as follows. Section II briefly reviews sparse representation and dictionary learning. Section III introduces the proposed remote sensing image fusion algorithm based on nonnegative dictionary learning in detail. Section IV gives the detailed steps of the algorithm and analyzes its computational complexity. In Section V, the performance of the algorithm is verified by experiments on real data. Finally, conclusions are drawn in Section VI.

II. SPARSE REPRESENTATION AND DICTIONARY LEARNING
In this section, we briefly review sparse representation and dictionary learning.

A. Sparse Representation
We first review sparse representation. Given a set of samples β = (β_1, ⋯, β_K) ∈ R^{n×K} and a redundant dictionary A ∈ R^{n×m}, the purpose of sparse representation is to find a sparse coefficient matrix such that each sample can be sparsely represented. Using the ℓ1 norm to measure sparsity, the sparse representation problem is expressed as

min_Γ (1/2)‖β − AΓ‖_F² + λ‖Γ‖_1, (1)

where Γ = (Γ_1, Γ_2, ⋯, Γ_K) ∈ R^{m×K} is the sparse coefficient matrix and λ is a weighting parameter. Common methods for solving sparse representation problems are Orthogonal Matching Pursuit (OMP) and the BPDN algorithm.
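As a concrete illustration of greedy sparse coding, the following NumPy sketch implements OMP for a single sample; the function name `omp` and the fixed sparsity level k are illustrative choices, not a prescription from the text.

```python
import numpy as np

def omp(A, b, k):
    """Orthogonal Matching Pursuit: greedily build a k-sparse code of b
    over the dictionary A (columns assumed to have unit norm)."""
    residual, support = b.astype(float), []
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        # Re-fit coefficients on the chosen support by least squares.
        coef, *_ = np.linalg.lstsq(A[:, support], b, rcond=None)
        residual = b - A[:, support] @ coef
    gamma = np.zeros(A.shape[1])
    gamma[support] = coef
    return gamma
```

OMP trades the convex ℓ1 penalty of (1) for an explicit sparsity budget k; BPDN-type solvers instead minimize (1) directly by convex optimization.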

B. Dictionary Learning
We then review dictionary learning. Given a set of samples β = (β_1, ⋯, β_K) ∈ R^{n×K}, the purpose of dictionary learning is to find a redundant dictionary A ∈ R^{n×m} such that each sample can be sparsely represented by the sparse coefficient matrix Γ = (Γ_1, Γ_2, ⋯, Γ_K) ∈ R^{m×K}. The dictionary learning problem is expressed as

min_{A,Γ} (1/2)‖β − AΓ‖_F² + λ‖Γ‖_1, (2)

where λ is a weighting parameter. Common methods for solving dictionary learning problems are the MOD algorithm, the K-SVD algorithm [26] and the online dictionary learning algorithm [27]. Since the MOD algorithm has a large computational cost, we follow the idea of the K-SVD algorithm for dictionary learning. The K-SVD algorithm divides dictionary learning into two alternating steps: updating the redundant dictionary A ∈ R^{n×m} with the sparse coefficient matrix fixed, and updating the coefficients Γ ∈ R^{m×K} with the redundant dictionary fixed.
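The two-step alternation can be sketched as follows. For brevity this sketch uses ISTA-style coefficient updates and a MOD-like least-squares dictionary refit rather than the K-SVD atom-by-atom update, so it only illustrates the alternating structure; the function name and parameters are ours.

```python
import numpy as np

def dict_learn(B, m, lam=0.1, n_outer=20, n_inner=50):
    """Two-step dictionary learning skeleton: alternate sparse coding
    (ISTA) with a least-squares dictionary refit, then renormalize
    atoms (the product A @ G is kept unchanged by the rescaling)."""
    rng = np.random.default_rng(0)
    n, K = B.shape
    A = rng.standard_normal((n, m))
    A /= np.linalg.norm(A, axis=0)              # unit-norm atoms
    G = np.zeros((m, K))
    for _ in range(n_outer):
        # Step 1: dictionary fixed, update coefficients by ISTA.
        L = max(np.linalg.norm(A, 2) ** 2, 1e-12)   # Lipschitz constant
        for _ in range(n_inner):
            Z = G - A.T @ (A @ G - B) / L
            G = np.sign(Z) * np.maximum(np.abs(Z) - lam / L, 0.0)
        # Step 2: coefficients fixed, refit the dictionary (MOD-style).
        A = B @ G.T @ np.linalg.pinv(G @ G.T)
        norms = np.linalg.norm(A, axis=0)
        A /= np.maximum(norms, 1e-12)
        G *= norms[:, None]                     # keep A @ G unchanged
    return A, G
```

K-SVD replaces the global least-squares refit with a rank-1 SVD update of one atom at a time, which also refreshes the coefficients using that atom.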

III. REMOTE SENSING IMAGE FUSION ALGORITHM BASED ON NONNEGATIVE DICTIONARY LEARNING
In this section, we propose a remote sensing image fusion algorithm based on nonnegative dictionary learning. In our algorithm, a high-low resolution sample set is generated from the high-resolution panchromatic remote sensing image, the high-resolution and low-resolution dictionaries are then learned by nonnegative dictionary learning, and the sparse representation coefficient matrix of the multispectral image over the low-resolution dictionary is solved. Finally, the high-resolution multispectral image is reconstructed from these results to realize remote sensing image fusion.

A. Preprocessing to get high-low resolution panchromatic sample set
The high-resolution panchromatic remote sensing image is first preprocessed to obtain the high-low resolution panchromatic sample set. The low-resolution panchromatic image Z_l is obtained by blurring and down-sampling the high-resolution panchromatic image Y_h, and Z_l is then restored to the size of the original high-resolution panchromatic image by bicubic interpolation. After this processing, the high- and low-resolution panchromatic images are divided into overlapping image blocks of the same size √n × √n, and K block pairs are randomly selected to form the high-low resolution sample set {p_k^h, p_k^l} ∈ R^n (k = 1, ⋯, K).
The generation of the high-low resolution sample set is summarized as follows:

Algorithm 1: Construct the high-low resolution sample set
Input: high-resolution panchromatic remote sensing image Y_h.
(1) Blur and down-sample the image Y_h to get the low-resolution image Z_l;
(2) Restore Z_l to the same size as Y_h by bicubic interpolation, denoted Z̄_l;
(3) Divide Y_h and Z̄_l into overlapping image blocks of the same size and randomly select K block pairs.
Output: the high-low resolution sample set {p_k^h, p_k^l} ∈ R^n (k = 1, ⋯, K).
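A minimal NumPy sketch of Algorithm 1 follows; a 2×2 box blur and nearest-neighbor upsampling stand in for the paper's blur kernel and bicubic interpolation, and the function name and defaults are ours.

```python
import numpy as np

def make_sample_pairs(Yh, patch=8, factor=4, K=1000, seed=0):
    """Build co-located high/low-resolution patch pairs from a single
    panchromatic image (Algorithm 1 sketch)."""
    rng = np.random.default_rng(seed)
    # (1) Blur (simple 2x2 box filter), then downsample by `factor`.
    blurred = (Yh + np.roll(Yh, 1, 0) + np.roll(Yh, 1, 1)
               + np.roll(np.roll(Yh, 1, 0), 1, 1)) / 4.0
    Zl = blurred[::factor, ::factor]
    # (2) Restore Zl to the size of Yh (bicubic in the paper; nearest
    # neighbor here for a dependency-free sketch).
    Zl_up = np.kron(Zl, np.ones((factor, factor)))
    # (3) Extract K co-located overlapping patches, one column each.
    H, W = Yh.shape
    ph, pl = [], []
    for _ in range(K):
        i = rng.integers(0, H - patch + 1)
        j = rng.integers(0, W - patch + 1)
        ph.append(Yh[i:i+patch, j:j+patch].ravel())
        pl.append(Zl_up[i:i+patch, j:j+patch].ravel())
    return np.array(ph).T, np.array(pl).T   # each (patch*patch, K)
```

The key property of the output is that column k of the two matrices covers the same spatial location at two resolutions, which is what couples the two dictionaries in the learning model below.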

B. High-low resolution dictionary pair learning
We use the sample set {p_k^h, p_k^l} to learn the high-resolution and low-resolution dictionaries by nonnegative dictionary learning. Let P_1 = (p_1^h, ⋯, p_K^h) ∈ R^{n×K} and P_2 = (p_1^l, ⋯, p_K^l) ∈ R^{n×K} denote the high- and low-resolution sample matrices. The problems of learning the high-resolution and low-resolution dictionaries are written as

min_{D_1,α} (1/2)‖P_1 − D_1 α‖_F² + λ_1‖α‖_1, (3)

min_{D_2,α} (1/2)‖P_2 − D_2 α‖_F² + λ_2‖α‖_1, (4)

where D_1, D_2 ∈ R^{n×m} are the high-resolution and low-resolution dictionaries respectively, λ_1, λ_2 are weighting coefficients, and ‖α‖_1 = Σ_{i,j} |α_{i,j}| is the sparsity constraint on the shared coefficient matrix α ∈ R^{m×K}. Combining formulas (3) and (4), the dictionary learning model is

min_{D_1,D_2,α} (1/2)‖P_1 − D_1 α‖_F² + (1/2)‖P_2 − D_2 α‖_F² + λ‖α‖_1, (5)

where λ = λ_1 + λ_2. Image data is nonnegative. According to the theory of nonnegative matrix factorization, P_j can be decomposed into the product of two nonnegative matrices. Therefore, by adding nonnegative constraints on the dictionaries and the coefficient matrix in the dictionary learning model, the computational efficiency and fusion performance are improved, and the nonnegative dictionary learning model is obtained as

min_{D_1,D_2,α} (1/2)‖P_1 − D_1 α‖_F² + (1/2)‖P_2 − D_2 α‖_F² + λ‖α‖_1, s.t. D_1 ≥ 0, D_2 ≥ 0, α ≥ 0. (6)

The nonnegative constraints on the redundant dictionaries D_j (j = 1,2) and the coefficient matrix α reflect the nonnegativity of remote sensing image data.
Based on the multiplicative iterative algorithm for nonnegative matrix factorization [28], an iterative scheme is constructed to solve the nonnegative dictionary learning model (6). There are two alternating steps: with the redundant dictionaries D_j (j = 1,2) fixed, update the coefficient matrix α; with the coefficient matrix α fixed, update the redundant dictionaries D_j (j = 1,2).
With the dictionaries D_j (j = 1,2) fixed, the nonnegative dictionary learning model (6) reduces to

min_α (1/2)‖P_1 − D_1 α‖_F² + (1/2)‖P_2 − D_2 α‖_F² + λ‖α‖_1, s.t. α ≥ 0. (7)

Let

L(α) = (1/2)‖P_1 − D_1 α‖_F² + (1/2)‖P_2 − D_2 α‖_F² + λ‖α‖_1. (8)

Its gradient with respect to α is

∇L(α) = D_1^T(D_1 α − P_1) + D_2^T(D_2 α − P_2) + λ. (9)

Splitting the gradient into its nonnegative parts and applying the multiplicative update rule of nonnegative matrix factorization, the iterative form of the coefficient matrix is

α ← α ⊙ (D_1^T P_1 + D_2^T P_2) ⊘ (D_1^T D_1 α + D_2^T D_2 α + λ), (10)

where ⊙ and ⊘ denote element-wise multiplication and division. Similarly, with the coefficient matrix α fixed, the model (6) reduces to

min_{D_j} (1/2)‖P_j − D_j α‖_F², s.t. D_j ≥ 0, j = 1,2, (11)

and the iterative form of the redundant dictionaries is

D_j ← D_j ⊙ (P_j α^T) ⊘ (D_j α α^T), j = 1,2. (12)

Specifically, the iterative form of the high-resolution dictionary is

D_1 ← D_1 ⊙ (P_1 α^T) ⊘ (D_1 α α^T), (13)

and the iterative form of the low-resolution dictionary is

D_2 ← D_2 ⊙ (P_2 α^T) ⊘ (D_2 α α^T). (14)

Repeating the two iterative steps (10) and (14) until convergence yields the high-resolution dictionary D_1, the low-resolution dictionary D_2 and the coefficient matrix α. It can be seen from the iterative schemes (10) and (14) that when the initial values of the dictionaries D_j (j = 1,2) and the coefficient matrix α are nonnegative matrices, the updated dictionaries and coefficient matrix remain nonnegative.
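The alternating updates (10) and (14) can be sketched in NumPy as follows; the random initialization, the small constant guarding divisions, and the function name are illustrative assumptions.

```python
import numpy as np

def nn_dict_learn(P1, P2, m=64, lam=0.1, n_iter=200, seed=0):
    """Nonnegative dictionary-pair learning via NMF-style multiplicative
    updates; D1, D2 and alpha stay nonnegative whenever their initial
    values are nonnegative."""
    rng = np.random.default_rng(seed)
    n, K = P1.shape
    eps = 1e-12
    D1 = rng.random((n, m)) + eps
    D2 = rng.random((n, m)) + eps
    alpha = rng.random((m, K)) + eps
    for _ in range(n_iter):
        # Coefficient update: fix D1, D2, update the shared alpha.
        num = D1.T @ P1 + D2.T @ P2
        den = D1.T @ D1 @ alpha + D2.T @ D2 @ alpha + lam
        alpha *= num / (den + eps)
        # Dictionary updates: fix alpha, update each dictionary.
        D1 *= (P1 @ alpha.T) / (D1 @ alpha @ alpha.T + eps)
        D2 *= (P2 @ alpha.T) / (D2 @ alpha @ alpha.T + eps)
    return D1, D2, alpha
```

Because every update multiplies the current iterate by a ratio of nonnegative quantities, no projection step is needed to enforce the constraints of model (6).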

C. High resolution multispectral remote sensing image reconstruction
The high-resolution multispectral remote sensing image can be reconstructed from the high-resolution and low-resolution dictionaries. The input low-resolution multispectral image is used to estimate its sparse representation coefficient matrix over the low-resolution dictionary, and the high-resolution multispectral image is then reconstructed by combining this coefficient matrix with the high-resolution dictionary. Denote the low-resolution multispectral images as X^s (s = 1, ⋯, S), where S is the number of bands. Each image X^s is restored by bicubic interpolation to a low-resolution interpolated image X̄^s of the same size as the original high-resolution panchromatic image, and X̄^s is divided into P overlapping image blocks of size √n × √n. Denote the vector of the p-th image block as x_p^s and the data matrix of all blocks of band s as x^s = [x_1^s, ⋯, x_P^s] ∈ R^{n×P}. The nonnegative sparse representation model of the coefficient matrix is

min_{ω^s} (1/2)‖x^s − D_2 ω^s‖_F² + λ‖ω^s‖_1, s.t. ω^s ≥ 0, (15)

where ω^s ∈ R^{m×P} is the sparse representation coefficient matrix. Let

L(ω^s) = (1/2)‖x^s − D_2 ω^s‖_F² + λ‖ω^s‖_1, (16)

whose gradient is

∇L(ω^s) = D_2^T(D_2 ω^s − x^s) + λ. (17)

Similar to the iterative scheme (10), we obtain the following iterative scheme:

ω^s ← ω^s ⊙ (D_2^T x^s) ⊘ (D_2^T D_2 ω^s + λ). (18)

The coefficient matrix of each band can be obtained by the iterative scheme (18); as before, when the initial value of ω^s is a nonnegative matrix, the updated ω^s remains nonnegative. Next, we use the high-resolution dictionary D_1 and the coefficient matrix ω^s to reconstruct the high-resolution multispectral image. The reconstructed image data matrix of band s is

x_h^s = D_1 ω^s. (19)

Each column vector of the data matrix x_h^s corresponds to an image block of band s.
The high-resolution multispectral remote sensing image X_h^s (s = 1, ⋯, S) is obtained by splicing the image blocks according to their positions and averaging the overlapping parts.
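The band-wise reconstruction described above, solving the nonnegative coefficients by the multiplicative update (18), multiplying by the high-resolution dictionary, and splicing overlapping blocks with averaging, can be sketched as follows; initialization and function names are ours.

```python
import numpy as np

def fuse_band(xs, D1, D2, lam=0.1, n_iter=200, seed=0):
    """Nonnegative sparse coding of one band's patch matrix xs (n x P)
    over D2 via the multiplicative update, then reconstruction with D1."""
    rng = np.random.default_rng(seed)
    m, P = D2.shape[1], xs.shape[1]
    w = rng.random((m, P)) + 1e-12
    for _ in range(n_iter):
        w *= (D2.T @ xs) / (D2.T @ D2 @ w + lam + 1e-12)
    return D1 @ w   # reconstructed high-resolution patch matrix

def assemble(patches, H, W, patch=8, step=1):
    """Splice overlapping patches back into an image, averaging the
    overlapping parts (column order: row-major patch positions)."""
    img = np.zeros((H, W))
    cnt = np.zeros((H, W))
    k = 0
    for i in range(0, H - patch + 1, step):
        for j in range(0, W - patch + 1, step):
            img[i:i+patch, j:j+patch] += patches[:, k].reshape(patch, patch)
            cnt[i:i+patch, j:j+patch] += 1
            k += 1
    return img / cnt
```

With `step=1` the patches overlap by 7 pixels for 8×8 blocks, matching the experimental settings reported later.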

IV. DETAILED STEPS AND COMPUTATIONAL COMPLEXITY OF THE ALGORITHM

A. Detailed steps of our algorithm
In this subsection, the detailed steps of the proposed algorithm are described, as shown in Algorithm 2.

Algorithm 2: Remote sensing image fusion based on nonnegative dictionary learning
Input: high-resolution panchromatic image Y_h; low-resolution multispectral images X^s (s = 1, ⋯, S).
(1) Construct the high-low resolution sample set {p_k^h, p_k^l} by Algorithm 1;
(2) Learn the dictionaries D_1, D_2 and the coefficient matrix α by alternating the iterative schemes (10) and (14) until convergence;
(3) For each band s, interpolate X^s, divide it into overlapping blocks to form x^s, and solve the coefficient matrix ω^s by the iterative scheme (18);
(4) Reconstruct the data matrix of each band by x_h^s = D_1 ω^s;
(5) Splice the image blocks according to their positions and average the overlapping parts.
Output: high-resolution multispectral image X_h^s (s = 1, ⋯, S).

Note for Algorithm 2: (a) Selection of the dictionary dimension m:
The high-low resolution dictionary pair have the same dimension, and the dictionary dimension is generally taken as 2n ≤ m ≤ 4n. If the dictionary dimension is too large, the computational cost increases. (b) Selection of the weight parameters λ_1 and λ_2: the parameters λ_1 and λ_2 are generally related to the dictionary dimension m, and the weight parameters of our algorithm are set according to the dictionary dimension.

B. Analysis of Computational Complexity
The computational cost of our algorithm mainly comes from two steps: learning the high-low resolution dictionary pair and solving the coefficient matrices ω^s of the low-resolution multispectral image. The cost of dictionary-pair learning comes from the iterative scheme (10) for the coefficient matrix α and the iterative scheme (14) for the dictionaries. Assuming dictionary learning runs N iterations, the costs of the two schemes are O(N(nmK + m²K + m²n)) and O(N(nmK + m²K + nm²)) respectively. Because n < m, the computational cost for learning the high-low resolution dictionary pair is O(Nm²K). The computational cost for the coefficient matrix ω^s comes from the iterative scheme (18); if the number of iterations is N_1, the cost of estimating ω^s is O(N_1 m²P). Because the multispectral image has S bands in total, the overall computational cost of the algorithm is O(Nm²K + SN_1m²P). It can be seen that the time complexity of the algorithm is related to the number of bands S, the numbers of iterations N and N_1, the number of samples K, the number of image patches P and the dictionary dimension m, and it is most affected by the dictionary dimension m, on which it depends quadratically.

V. EXPERIMENTAL RESULTS AND ANALYSIS
In this section, we verify the performance of our algorithm through experiments on real data and compare it with the GS algorithm [4] and the SR-D algorithm [19]. Subsection A compares fusion performance on real QuickBird remote sensing images; Subsection B compares fusion performance on real IKONOS remote sensing images; Subsection C shows the influence of the dictionary dimension on algorithm performance; Subsection D shows the influence of the number of training samples on algorithm performance.
In the following experiments, panchromatic and multispectral images provided by the QuickBird and IKONOS satellites are used. QuickBird remote sensing images include high-resolution panchromatic images with a resolution of 0.7 m and low-resolution multispectral images with a resolution of 2.8 m, where the multispectral image contains four bands. IKONOS remote sensing images include high-resolution panchromatic images with a resolution of 1 m and low-resolution multispectral images with a resolution of 4 m, where the multispectral image also contains four bands. The size of each image block is 8 × 8, adjacent blocks overlap by 7 pixels, the number of dictionary atoms is m = 256, and the number of samples is K = 4000.
Due to the lack of an ideal fused image as a reference, we use the spectral angle mapper (SAM) index and the average gradient (AG) index to measure the degree of spectral and spatial detail preservation respectively. SAM uses the original multispectral image as the reference image to measure how well its spectral information is retained; the smaller the SAM value, the better the spectral preservation. AG uses the high-resolution panchromatic image as the reference to measure the degree of spatial detail preservation; the higher the AG value, the better the spatial detail retention.
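The text does not give the exact formulas for these indices, so the NumPy sketch below uses common definitions as assumptions: SAM as the mean per-pixel spectral angle in degrees, and AG as the mean root-mean-square of the two directional differences.

```python
import numpy as np

def sam(ref, fused, eps=1e-12):
    """Spectral Angle Mapper in degrees, averaged over pixels.
    ref, fused: (H, W, S) multispectral arrays."""
    dot = (ref * fused).sum(axis=2)
    denom = np.linalg.norm(ref, axis=2) * np.linalg.norm(fused, axis=2)
    ang = np.arccos(np.clip(dot / (denom + eps), -1.0, 1.0))
    return float(np.degrees(ang).mean())

def avg_gradient(img):
    """Average gradient of a single-band image; larger values indicate
    more spatial detail."""
    dx = np.diff(img, axis=0)[:, :-1]   # vertical differences
    dy = np.diff(img, axis=1)[:-1, :]   # horizontal differences
    return float(np.sqrt((dx ** 2 + dy ** 2) / 2.0).mean())
```

Note that SAM is invariant to a per-pixel scaling of the spectrum, which is why it isolates spectral distortion from brightness changes introduced by the fusion.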

A. Comparison of fusion performance of real QuickBird remote sensing image
In this subsection, we compare the performance of the three algorithms by fusing panchromatic and multispectral images from the QuickBird satellite. Figure 1 compares the visual effects of three QuickBird remote sensing images after fusion. As can be seen from Figure 1, compared with the GS and SR-D algorithms, the color of the image fused by our algorithm is closer to the original multispectral image, so it maintains the spectral information better, and the image details are clearer. Table 1 compares the SAM and AG indices of the three algorithms on the QuickBird remote sensing images (mean values over the three fused images). As can be seen from Table 1, our algorithm has the smallest SAM value and the largest AG value: its SAM value is about 9 lower than that of the traditional GS algorithm and about 1.2 lower than that of the related SR-D algorithm. This experiment shows that our algorithm has a better visual effect, maintains the spectral information better, and retains more spatial detail.

B. Comparison of fusion performance of real IKONOS remote sensing image
This experiment compares the performance of the three algorithms through the fusion of IKONOS satellite panchromatic and multispectral images. Figure 2 compares the visual effects of three IKONOS remote sensing images after fusion. As can be seen from Figure 2, compared with the GS and SR-D algorithms, the color of the image fused by our algorithm is closer to the original multispectral image in Figure 2(b), and the image details are clearer. Table 2 compares the SAM and AG indices of the three algorithms. As can be seen from Table 2, our algorithm has the smallest SAM value and the largest AG value: its SAM value is about 9 lower than that of the traditional GS algorithm and about 1 lower than that of the related SR-D algorithm. This experiment also shows that our algorithm has a better visual effect, maintains the spectral information better, and retains more spatial detail.

C. Influence of dictionary dimension on algorithm performance
This experiment shows the effect of the dictionary dimension on the performance of our algorithm. Using the three QuickBird remote sensing images of Subsection A, the dictionary dimension (i.e., the number of dictionary atoms) varies from 100 to 300, with the other settings unchanged. Figure 3(a) and Figure 3(b) respectively show the SAM and AG indices of our algorithm for different dictionary dimensions. It can be seen from Figure 3 that when the dictionary dimension is less than 160, the AG value increases gradually and the SAM value decreases gradually as the dictionary dimension increases. When the dictionary dimension is 160 or larger, the performance of the algorithm is stable. Therefore, to maintain good performance, the dictionary dimension should not be less than 160; that is, the number of columns m of the dictionary A ∈ R^{n×m} needs to be at least 160.

D. The effect of sample number on algorithm performance
This experiment shows the effect of the number of samples used to train the high-low resolution dictionary pair on the performance of the algorithm. Using the QuickBird remote sensing images of Subsection A, the number of samples varies from 500 to 5500, with the other settings unchanged. Figure 4(a) and Figure 4(b) respectively show the SAM and AG indices of our algorithm. As can be seen from Figure 4, when the number of samples is less than 3000, the AG value increases gradually and the SAM value decreases gradually as the number of samples increases. When the number of samples is 3000 or larger, the performance of the algorithm is basically stable. Therefore, when training the high-low resolution dictionary pair, the number of randomly selected sample pairs should not be less than 3000; that is, the sample set size K needs to be at least 3000.

VI. CONCLUSIONS
To solve the fusion problem of high-resolution panchromatic and low-resolution multispectral remote sensing images, a remote sensing image fusion algorithm based on nonnegative dictionary learning is proposed. The basic idea of the algorithm is to use the high-resolution panchromatic image and its degraded low-resolution version to learn the high-resolution and low-resolution dictionaries, and to improve the fusion performance by exploiting the nonnegativity of image data. First, the high-resolution and low-resolution dictionaries are obtained from the panchromatic image by nonnegative dictionary learning. Then the sparse representation coefficient matrix of the multispectral image over the low-resolution dictionary is solved. Finally, the high-resolution multispectral image is reconstructed from the coefficient matrix and the high-resolution dictionary to realize remote sensing image fusion. Experiments on fusing real QuickBird and IKONOS remote sensing images show that the fusion result of this algorithm achieves higher spatial resolution while well preserving the spectral information of the multispectral image.