Hyperspectral Remote Sensing Image Classification With CNN Based on Quantum Genetic-Optimized Sparse Representation

Hyperspectral remote sensing images are characterized by the integration of image and spectrum, information redundancy, spectral mixing, and nonlinearity, which makes their classification a major challenge. This paper therefore proposes a hyperspectral remote sensing image classification method named QGASR-CNN. In QGASR-CNN, a quantum genetic-optimized sparse representation method is designed to obtain a sparse over-complete dictionary and to compute the sparse representation of the features, from which the sparse feature matrix of hyperspectral remote sensing image pixel groups is constructed. A convolutional neural network (CNN) then convolves directly with the image pixels to build the feature mapping relation. Finally, real hyperspectral remote sensing image datasets are used to verify the effectiveness of QGASR-CNN. The comparison results show that QGASR-CNN represents the features of hyperspectral remote sensing images sparsely and improves the classification accuracy. It effectively alleviates the small-sample problem and 'salt and pepper misclassification'.


I. INTRODUCTION
A hyperspectral sensor obtains an approximately continuous spectral curve of a ground object in the ultraviolet, visible, near infrared, mid infrared, and other electromagnetic wave bands. It combines the spectral information that reflects the reflection characteristics of the ground object with the image information that reflects its spatial position relationships. Dozens or even hundreds of continuous wave band images form a three-dimensional data cube that combines image and spectrum, and such data have been widely used in resource exploration, environmental monitoring, precision agriculture, disaster assessment, target recognition, and other fields [1]-[3]. Hyperspectral remote sensing image classification learns discriminative information with specific meaning from the original spectral information and accurately assigns each pixel to its ground-object category. Compared with general remote sensing images, hyperspectral remote sensing images have many distinctive characteristics, such as a large number of bands, a fine and continuous spectrum, large data volume and information redundancy, spectral mixing, and data nonlinearity [4]-[6]. The data and the spatial structure are relatively complex and the dimension is very high, so traditional image processing and classification methods struggle to achieve good classification results. For such typical ultra-high-dimensional data, mining the diagnostic information of objects of interest from the massive high-dimensional data has become a bottleneck in the application of hyperspectral remote sensing images.
The associate editor coordinating the review of this manuscript and approving it for publication was Qiangqiang Yuan.
In recent years, much research on feature extraction and classification of hyperspectral images has been conducted using sparse representation and convolutional neural network (CNN) theory. In [7], each test sample to be classified is expressed as a linear combination of training samples. The sparse coefficients are then solved, and the class is determined by minimizing the residual between the test sample and the training samples of each class. However, only the spectral information is used in feature extraction, which makes the classification results non-smooth. A joint sparse representation model based on maximum likelihood estimation is proposed in [8]. The traditional quadratic loss function is replaced by a coding residual function that measures the joint approximation error, and the coding residual estimation transforms the traditional maximum likelihood estimation into an iteratively weighted joint sparse representation. This method can reduce the number of non-uniform neighborhood pixels. However, its robustness to outliers is poor, and the iteration is slow because of the high dimension and large amount of data, which reduces efficiency. The group sparse principal component transformation is studied in [9]. The signal is represented by the eigenvectors of the covariance matrix of the training samples. Each eigenvector is regarded as a base atom of the signal, and the atoms are mutually orthogonal so that the main components of the original spectrum are maintained, but the sparse prior is not used explicitly. To address the loss of similarity information between the coefficients of similar samples, which is caused by the independence of the sparse reconstruction process, a regularized sparse representation algorithm is proposed in [10]. It adds a centralized quadratic constraint as a regularization term to the objective function of sparse representation in order to retain the similarity information.
However, the solution of this sparse representation ignores the Euclidean distance relationship between samples. In [11], hyperspectral remote sensing images are classified with a deep CNN model. The method obtains the spectral vector of each pixel, extracts local spectral information from the spectral vector with CNN convolution layers, takes the feature map generated by the convolution operation as the input of the full connection layer, and finally completes the classification. It exploits the local connection and weight sharing of CNN, which greatly reduces the number of model parameters and the training difficulty and further improves the classification performance. However, it fails to make full use of the rich spatial-spectral information of hyperspectral remote sensing images, which limits its feature learning ability. In [12], the spectral information, spatial structure information, and semantic context-aware information of each pixel are represented jointly by different regions, and features of multiple scales from different layers are combined for deep feature extraction.
However, this method divides the original image into six patches: global, top, bottom, left, right, and center. Each patch corresponds to a CNN structure, and the outputs of the six CNN branches are joined by a full connection operation after feature extraction. The amount of input information of a CNN is not positively correlated with the classification effect. Complex inputs lengthen the training time and classification time and easily lead to over-fitting. In [13], spatial features are extracted by a CNN from hyperspectral image data whose dimension has been reduced by principal component analysis (PCA), and the extracted spatial features are further processed by sparse coding. Finally, the classification task is completed based on the sparse coding features. However, this method considers only the spatial features of hyperspectral remote sensing images and not their rich spectral features, and its classification accuracy needs to be improved. A pixel-pair method is proposed in [14] to increase the number of CNN training samples. For a test pixel, the trained CNN classifies the pairs formed by the central pixel and each surrounding pixel, and the final label is then determined by a voting strategy. In this method, the CNN learns the features of pixel pairs, which gives better classification performance. However, the generalization ability of the CNN model is insufficient, and the classification accuracy needs to be improved because the quantification of pixel pairs is unclear.
To sum up, current methods either use sparse representation alone to reduce the dimension and extract the features of hyperspectral remote sensing images and then apply traditional classifiers such as SVM and random forest [15]-[17], or classify hyperspectral images using CNN alone. Although some progress has been made, most of these methods do not use the sparse prior explicitly, resulting in 'salt and pepper misclassification'. The extraction of joint spatial-spectral features and the use of spatial information still need to be diversified, and the mining of structural information in hyperspectral data remains deficient. On the other hand, traditional dimensionality-reduction methods are used to prepare the inputs for the convolution operations of CNN models [18]-[20].
Therefore, in view of the characteristics of hyperspectral remote sensing images, such as the large number of bands, the combination of image and spectrum, information redundancy, spectral mixing, and data nonlinearity, sparse representation with a quantum genetic algorithm and a convolutional neural network are integrated into a novel QGASR-CNN method, which is used to describe the features effectively and to classify hyperspectral remote sensing images. Real hyperspectral remote sensing data are used to verify the effectiveness of QGASR-CNN.
The rest of this paper is organized as follows. Section 2 introduces the basic methods of the quantum genetic algorithm, sparse representation, and the convolutional neural network. In Section 3, the proposed QGASR-CNN is described and its contribution is highlighted. In Section 4, the experimental verification and analysis are provided. Section 5 concludes the work on QGASR-CNN and suggests future work.
VOLUME 8, 2020

A. QUANTUM GENETIC ALGORITHM
The quantum genetic algorithm (QGA) is a probabilistic search optimization method that combines quantum computing theory with the genetic algorithm. It has better population diversity and computational parallelism, faster convergence speed, higher search efficiency, and stronger global optimization ability [21]-[23]. The QGA introduces the concepts and principles of quantum computation into the genetic algorithm and adopts chromosome coding based on the Q-bit. A Q-bit in the QGA is a unit vector defined in a two-dimensional complex vector space. It can be expressed as $[\alpha, \beta]^{T}$, where $\alpha$ and $\beta$ are the probability amplitudes of the corresponding states of the Q-bit. $|\alpha|^2$ and $|\beta|^2$ represent the probabilities that the quantum state is observed as the ''0'' state and the ''1'' state, respectively. The normalization of states requires $|\alpha|^2 + |\beta|^2 = 1$. To find the optimal solution, the Q-bit probability amplitude representation is applied to chromosome coding, and the quantum rotation gate and other operations are applied to chromosome update.
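The Q-bit encoding described above can be illustrated with a minimal sketch. It assumes real, non-negative amplitudes $(\alpha, \beta) = (\cos\theta, \sin\theta)$, which is a common simplification and not necessarily the setting used in this paper; the function names `make_qbit` and `observe` are hypothetical.

```python
import numpy as np

def make_qbit(theta):
    """Return a real-amplitude Q-bit (alpha, beta) = (cos(theta), sin(theta))."""
    return np.array([np.cos(theta), np.sin(theta)])

def observe(qbits, rng):
    """Collapse each Q-bit to 0 or 1; the probability of observing 1 is |beta|^2."""
    prob_one = np.abs(qbits[:, 1]) ** 2
    return (rng.random(len(qbits)) < prob_one).astype(int)

rng = np.random.default_rng(0)
# A chromosome of 4 Q-bits in equal superposition: alpha = beta = 1/sqrt(2).
chromosome = np.tile(make_qbit(np.pi / 4), (4, 1))
# Normalization check: |alpha|^2 + |beta|^2 for every Q-bit.
norms = np.sum(np.abs(chromosome) ** 2, axis=1)
# One observation yields an ordinary binary string of length 4.
bits = observe(chromosome, rng)
```

Because every Q-bit starts with $|\alpha|^2 = |\beta|^2 = 1/2$, each binary string is equally likely at the first observation, which is what gives the initial population its diversity.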

B. SPARSE REPRESENTATION THEORY
As a state-of-the-art data mining technology, sparse representation can effectively extract hyperspectral terrain information by exploiting the high redundancy of massive high-dimensional data and the sparsity of the signals of interest [24]-[26]. According to sparse representation theory, for a given dictionary, each signal can be expressed as a linear combination of a few atoms in the dictionary. Given the image data set $\{x_1, x_2, \ldots, x_m\}$, its sparse representation model can be described, in the standard $l_1$-regularized form, as follows.
$$\min_{D,\,\{a_i\}} \sum_{i=1}^{m} \left( \|x_i - D a_i\|_2^2 + \lambda \|a_i\|_1 \right)$$
where $D \in \mathbb{R}^{d \times k}$ is the dictionary matrix, $k$ is the sparsity, $\lambda$ is the regularization parameter, and $a_i \in \mathbb{R}^{k}$ is the sparse representation of the sample $x_i \in \mathbb{R}^{d}$. Sparse representation learning of hyperspectral remote sensing images uses an optimization algorithm to obtain the sparse representation coefficient $a_i$, which reconstructs $x_i$ well while being as sparse as possible.
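For a fixed dictionary, the trade-off between reconstruction error and sparsity can be sketched with the iterative soft-thresholding algorithm (ISTA). ISTA is used here only as a simple, well-known solver for the $l_1$-regularized objective; it is not the optimizer proposed in this paper, and the dictionary size, $\lambda$, and signal below are hypothetical toy values.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink every entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(x, D, a, lam):
    """Value of ||x - D a||_2^2 + lam * ||a||_1."""
    return np.sum((x - D @ a) ** 2) + lam * np.sum(np.abs(a))

def ista(x, D, lam, n_iter=200):
    """Minimize ||x - D a||_2^2 + lam * ||a||_1 over a for a fixed dictionary D."""
    # Step size 1/L, where L = 2 * sigma_max(D)^2 bounds the gradient's Lipschitz constant.
    step = 1.0 / (2.0 * np.linalg.norm(D, 2) ** 2)
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * D.T @ (D @ a - x)          # gradient of the quadratic term
        a = soft_threshold(a - step * grad, step * lam)
    return a

rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))               # toy over-complete dictionary (d=20, k=50)
D /= np.linalg.norm(D, axis=0)                  # unit-norm atoms
a_true = np.zeros(50)
a_true[[3, 17, 41]] = [1.5, -2.0, 1.0]          # a signal built from three atoms
x = D @ a_true
a_hat = ista(x, D, lam=0.05)
```

A larger `lam` drives more entries of `a_hat` to exactly zero at the cost of a larger reconstruction error, which is the role the regularization parameter plays in the model above.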

C. CONVOLUTIONAL NEURAL NETWORK
In hyperspectral remote sensing image processing, a convolutional neural network (CNN) makes full use of the information of adjacent areas in the image and, through sparse connectivity and weight sharing, greatly reduces the number of parameters, lowers the computational complexity, and improves the convergence speed of the network. It is usually composed of convolution layers, pooling layers, full connection layers, and an output layer [27]-[29]. Assume that the pixel value of the input image at position $(i, j)$ is $p_{ij}$ and that the element of the convolution kernel matrix at position $(x, y)$ is $k_{xy}$. When the convolution kernel acts on a certain position $(i, j)$ of the image, the output can be written as
$$o_{ij} = f\left( \sum_{x} \sum_{y} p_{i+x,\,j+y}\, k_{xy} + b \right)$$
where $f$ is the activation function and $b$ is the offset term. The CNN can be regarded as a kind of neural network with weight sharing and local connection. According to the previous definition, the forward propagation of a convolution layer applies this operation at every position of its input with the same kernel.
By using the same convolution kernel, i.e., weight sharing and local connection, the CNN greatly reduces the number of trainable parameters in the network, the complexity of the model, and the risk of over-fitting, and thus obtains better generalization ability.
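The effect of weight sharing can be made concrete with a small sketch of a single-kernel convolution. It assumes the cross-correlation form commonly used in CNN libraries (the kernel is not flipped), "valid" borders, and ReLU as the activation $f$; the function name `conv2d_valid` and the toy inputs are hypothetical.

```python
import numpy as np

def conv2d_valid(image, kernel, bias=0.0):
    """Slide one kernel over the image and apply f(sum(patch * kernel) + bias)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]      # local connection: only a small window
            out[i, j] = np.sum(patch * kernel) + bias
    return np.maximum(out, 0.0)                    # ReLU chosen as the activation f

image = np.arange(16.0).reshape(4, 4)
kernel = np.ones((3, 3))
feature_map = conv2d_valid(image, kernel)
# The same 9 weights and 1 bias are reused at every position (weight sharing).
n_shared_params = kernel.size + 1
# A fully connected layer producing the same 2x2 output would need one weight
# per (input pixel, output node) pair plus one bias per output node.
n_dense_params = image.size * feature_map.size + feature_map.size
```

In this toy case the convolution needs 10 trainable parameters where a dense mapping between the same input and output would need 68, and the gap grows with the image size.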

III. A NEW HYPERSPECTRAL REMOTE SENSING IMAGE CLASSIFICATION
A. THE BASIC THOUGHT
Hyperspectral remote sensing images are characterized by numerous bands, a fine and continuous spectrum, the integration of image and spectrum, large data volume and redundant information, spectral mixing, and nonlinear data. The spatial structure of the images is complex and the dimension is high. Therefore, traditional image processing and classification methods struggle to obtain good classification results. Sparse representation can represent the image features with as few atoms as possible and describe the nature of the image with few characteristic coefficients in a given over-complete dictionary. The quantum genetic algorithm combines quantum computing with the genetic algorithm, which gives it better population diversity and computing parallelism, faster convergence speed, higher search efficiency, and stronger global optimization ability. The convolutional neural network is a kind of feed-forward neural network that contains convolution calculations and has a deep structure. Its hierarchical structure gives it stronger classification ability. Therefore, in order to improve the classification accuracy and efficiency, and based on the manifold structure, polymorphism, and low rank of hyperspectral remote sensing images, the quantum genetic algorithm is used to sparsely decompose hyperspectral images over the over-complete dictionary according to the spatial-spectral characteristics and sparse prior of the pixels. Then the sparse representation coefficient of the central pixel is obtained by using spatial continuity constraints, and the spectral information in the image is combined with it. The spatial structure and sparse coefficients are used to obtain the sparse representation vector of each pixel in order to construct the sparse feature matrix of hyperspectral image pixel groups. Finally, the sparse feature matrix is used as the input of the convolution kernels to establish the feature mapping relation of pixels.
In this way, the novel QGASR-CNN method is proposed to classify hyperspectral images effectively.

B. OVERVIEW OF QGASR-CNN
The proposed QGASR-CNN method includes dictionary construction, sparse coefficient solution, and a CNN classifier. Firstly, the data are selected and the over-complete dictionary is constructed by learning. Then, sparse representation with the quantum genetic algorithm is used to extract the features and construct the sparse feature matrix. Finally, the sparse feature matrix is used as the input of the CNN classifier, and the scene classification results are obtained by the Softmax function. The flow of QGASR-CNN is shown in Figure 1.

C. THE REALIZATION OF QGASR-CNN
1) CONSTRUCT DICTIONARY
The sparse representation of a hyperspectral image decomposes the image over an over-complete dictionary to obtain the sparse representation of its features. It mainly includes the design of the over-complete dictionary and the sparse decomposition algorithm. The K-SVD algorithm is usually used to design the over-complete dictionary [31]. It constructs training samples from the original image and obtains, through adaptive learning, a redundant dictionary that suits the original signal. To a large extent, this guarantees the sparse representation effect of the over-complete dictionary. So that the pixels are represented as sparsely as possible while the local structural features among the pixels are fully reflected, the original image information must be reconstructed completely. In hyperspectral image processing, any given test pixel can be represented as a sparse linear combination of a given set of labeled pixels by using an $l_0$-norm or $l_1$-norm regularization operator. For a given test pixel $x$, sparse representation aims to obtain the weight vector $\alpha$ of the linear combination that simultaneously minimizes the reconstruction error $\|x - D\alpha\|_2^2$ and the sparsity constraint $\|\alpha\|_1$. It can be expressed as $\arg\min_{\alpha} \|x - D\alpha\|_2^2 + \lambda_1 \|\alpha\|_1$, where $\lambda_1$ is the regularization parameter.
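The $l_0$-norm variant mentioned above is usually approximated greedily. The sketch below uses orthogonal matching pursuit (OMP), a standard greedy solver that is also commonly paired with K-SVD; it is offered as an illustration and is not claimed to be the decomposition used in this paper. The toy dictionary is orthonormal rather than over-complete so that the expected coefficients are known exactly; all names and values are hypothetical.

```python
import numpy as np

def omp(x, D, n_nonzero):
    """Greedy sparse coding: approximate x with at most n_nonzero atoms (columns) of D."""
    residual = x.copy()
    support = []                                   # indices of the selected atoms
    sol = None
    coef = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # Pick the atom most correlated with what is still unexplained.
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx not in support:
            support.append(idx)
        # Re-fit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
        if np.linalg.norm(residual) < 1e-10:       # stop once x is reconstructed
            break
    if support:
        coef[support] = sol
    return coef

rng = np.random.default_rng(0)
D, _ = np.linalg.qr(rng.standard_normal((8, 8)))   # toy dictionary with orthonormal atoms
x = 2.0 * D[:, 1] - 3.0 * D[:, 5]                  # pixel built from two atoms
alpha = omp(x, D, n_nonzero=2)
```

With a learned over-complete dictionary the atoms are correlated, so recovery is no longer guaranteed to be exact and the sparsity level becomes a parameter that must be tuned.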

2) SPARSE REPRESENTATION
The parameter selection of sparse decomposition greatly affects the performance of sparse representation and the effect of feature extraction. At present, the parameters of sparse decomposition are chosen mainly by experience. They are complex and diverse and have wide value ranges, so parameter values obtained by experience are unlikely to be optimal. Quantum intelligent optimization algorithms combine quantum theory with intelligent computing and make up for the shortcomings of traditional intelligent optimization algorithms. They have the characteristics of quantum parallel computing, accelerate convergence, and avoid premature convergence. Compared with traditional optimization algorithms, quantum evolutionary algorithms have good population dispersion, strong global search ability, and fast search speed. Therefore, the QGA is applied to the parameter optimization of sparse representation. In the QGA, the Q-bit probability amplitude representation is applied to chromosome coding, and the quantum rotation gate and other operations are used to update the chromosomes, so as to find the optimal solution, optimize the parameter values of sparse representation, and improve both the performance of sparse representation and its effect on hyperspectral images. The parameters of sparse representation mainly include the number of atoms $M$ in the over-complete dictionary, the maximum allowable error $\varepsilon$, the sparsity $K$, and the number of iterations $n$. As the number of dictionary atoms increases, the adaptability of the dictionary becomes stronger, but the storage required for the dictionary grows, the sparsity of the pixel data decreases, and the computational complexity increases. When the number of dictionary atoms is too large, the sparse representation of the image loses its meaning. The QGA adopts the coding method based on the Q-bit, which is a unit vector defined in a two-dimensional complex vector space.
A Q-bit can be expressed as $[\alpha, \beta]^{T}$, where $\alpha$ and $\beta$ are the probability amplitudes of the corresponding states of the Q-bit. $|\alpha|^2$ and $|\beta|^2$ represent the probabilities that the quantum state is the ''0'' state and the ''1'' state, respectively. The normalization of states must satisfy $|\alpha|^2 + |\beta|^2 = 1$. In this case, a state of $m$ Q-bits can be expressed as follows.
$$\begin{bmatrix} \alpha_1 & \alpha_2 & \cdots & \alpha_m \\ \beta_1 & \beta_2 & \cdots & \beta_m \end{bmatrix} \tag{4}$$
The quantum population based on Q-bit coding can be defined as $Q(t) = \{q_1^t, q_2^t, \ldots, q_n^t\}$, where $n$ represents the population size, $t$ represents the generation, and $q_j^t$ represents a quantum chromosome, which is defined as follows.
$$q_j^t = \begin{bmatrix} \alpha_1^t & \alpha_2^t & \cdots & \alpha_m^t \\ \beta_1^t & \beta_2^t & \cdots & \beta_m^t \end{bmatrix} \tag{5}$$
where $m$ is the number of Q-bits, i.e., the length of the quantum chromosome, which can represent $2^m$ states. That is to say, a chromosome represents the superposition of multiple states. In addition, as the squared norm of each probability amplitude tends to 0 or 1, the chromosome converges to a single state, which ensures the good convergence of the QGA. The basic steps of parameter optimization for the sparsity coefficients are described as follows.
Step 2: The binary solution set of the states $[\alpha^t, \beta^t]^{T}$ of $Q(t)$ is obtained from the spatial and spectral information of the pixel. A quantum chromosome $q_j^t$ is defined according to formula (5).
Step 3: Save the current optimal solution, measure each gene position (Q-bit) of every chromosome, and obtain a state according to formula (4).
Step 4: Calculate the fitness value for each state. With $\gamma_k = \arg\max_{i=1,\ldots,k} \gamma_{k-1} q_i$, the fitness function is used to evaluate the fitness value of each individual, and the best individual $\gamma_k$ is recorded.
Step 5: Add the best individuals of the genetic evolution to the training dictionary.
Step 6: Update the chromosome population of each generation with the quantum rotation gate $U(0.025\pi)$.
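The measure, evaluate, and rotate cycle of Steps 3, 4, and 6 can be sketched as follows. This is a simplified illustration under several assumptions that are not taken from the paper: amplitudes are real and non-negative, the rotation direction simply moves each Q-bit toward the corresponding bit of the best solution found so far, and the fitness is a toy bit-counting function instead of a sparse-representation criterion. Step 5 (adding the best individuals to the training dictionary) is not modeled, and the names `rotate` and `qga` are hypothetical.

```python
import numpy as np

def rotate(qbits, bits, best_bits, delta=0.025 * np.pi):
    """Rotate each Q-bit by +/-delta so that observing best_bits becomes more likely."""
    theta = np.arctan2(qbits[:, 1], qbits[:, 0])            # angle of (alpha, beta)
    direction = np.where(best_bits == 1, 1.0, -1.0)         # toward state 1 or state 0
    direction = np.where(bits == best_bits, 0.0, direction) # keep bits that already agree
    theta = np.clip(theta + direction * delta, 0.0, np.pi / 2)
    return np.column_stack([np.cos(theta), np.sin(theta)])  # still a unit vector

def qga(fitness, n_bits, pop_size=10, n_gen=50, seed=0):
    rng = np.random.default_rng(seed)
    # Every Q-bit starts in equal superposition: alpha = beta = 1/sqrt(2).
    pop = np.full((pop_size, n_bits, 2), 1.0 / np.sqrt(2.0))
    best_bits, best_fit, history = None, -np.inf, []
    for _ in range(n_gen):
        # Measurement: collapse each chromosome to a binary string, P(1) = beta^2.
        observed = (rng.random((pop_size, n_bits)) < pop[:, :, 1] ** 2).astype(int)
        # Fitness evaluation and bookkeeping of the best individual.
        fits = np.array([fitness(b) for b in observed])
        if fits.max() > best_fit:
            best_fit = fits.max()
            best_bits = observed[fits.argmax()].copy()
        history.append(best_fit)
        # Population update with the rotation gate.
        pop = np.stack([rotate(q, b, best_bits) for q, b in zip(pop, observed)])
    return best_bits, best_fit, history

# Toy fitness: number of ones in the string (maximum is n_bits).
best_bits, best_fit, history = qga(lambda b: int(b.sum()), n_bits=12)
```

In an application to sparse representation, the binary string would instead encode candidate parameter values and the fitness would score the resulting reconstruction; how exactly that is done is specific to the method and is not reproduced here.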
Because the spectral curves of similar pixels have very similar structure, the coefficients obtained from the over-complete dictionary are also similar. $\hat{a}(x)$ can be regarded as a new representation of pixel $x$ ($x \in X$). That is to say, $\hat{a}(x)$ is the sparse representation of pixel $x$ in the over-complete dictionary $A$.

3) A NOVEL SCENE CLASSIFICATION METHOD
When a CNN is used to classify the scenes of hyperspectral images, its depth is the main factor affecting the classification accuracy. Experimental results show that once a certain number of layers has been built, further increasing the number of layers in the CNN does not improve the classification accuracy and takes more time [32]. The CNN model is shown in Figure 2 and consists of a convolution layer, a pooling layer, full connection layers, and a Softmax layer. The sparse vector of each pixel can be regarded as a two-dimensional image with a height of one, so the size of the input layer is $(n_1, 1)$, where $n_1$ is the total number of wavebands. The convolution layer filters the $(n_1, 1)$ input with 20 convolution kernels of size $k_1 \times 1$ ($k_1$ is the sparsity). It contains $20 \times n_2 \times 1$ nodes, where $n_2 = n_1 - k_1 + 1$. To reduce the amount of calculation and avoid over-fitting, average pooling is applied in the pooling layer, which contains $20 \times n_3 \times 1$ nodes, where $n_3 = n_2 / k_1$.
The first full connection layer has $n_4$ nodes, and the number of trainable parameters between the pooling layer and this layer is $(20 \times n_3 \times 1) \times n_4$. The neurons generated by the ReLU activation function are used as the input of the second full connection layer. The second full connection layer has $n_5$ nodes, and its number of trainable parameters is $(n_4 + 1) \times n_5$. The neurons generated by the ReLU activation function are also used as the input of the Softmax layer. In the CNN model, if $x_i$ is the input of the $i$-th layer, the output of that layer is obtained by applying $f_i(\cdot)$, the ReLU activation function, where $S_i^{T}$ denotes the sparse vector of the input pixel data of the layer. The second full connection layer produces $n_5$ class scores, which are normalized by the Softmax function, and the output $y = x_{L+1}$ contains the probabilities of all classes. For an input vector $z$ of the Softmax layer, the Softmax function is defined as
$$\mathrm{softmax}(z)_c = \frac{e^{z_c}}{\sum_{j=1}^{n_5} e^{z_j}}, \quad c = 1, \ldots, n_5$$
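The layer sizes described above can be checked with a small forward pass written in plain NumPy. This is a sketch with random, untrained weights; the values $n_1 = 103$, $k_1 = 8$, $n_4 = 100$, and $n_5 = 9$ are hypothetical and chosen only so that $n_2$ is divisible by $k_1$, and applying ReLU after the convolution is an assumption.

```python
import numpy as np

def relu(v):
    return np.maximum(v, 0.0)

def softmax(v):
    e = np.exp(v - v.max())                        # shift for numerical stability
    return e / e.sum()

def forward(x, params, k1):
    """x: sparse vector of one pixel, shape (n1,). Returns class probabilities, shape (n5,)."""
    W_conv, b_conv, W1, b1, W2, b2 = params
    n_kernels = W_conv.shape[0]
    n2 = x.shape[0] - k1 + 1
    n3 = n2 // k1
    # Convolution layer: each kernel of size k1 x 1 slides along the vector.
    windows = np.stack([x[i:i + k1] for i in range(n2)])         # (n2, k1)
    conv = relu(windows @ W_conv.T + b_conv).T                   # (20, n2); ReLU is an assumption
    # Average pooling with window k1.
    pooled = conv[:, :n3 * k1].reshape(n_kernels, n3, k1).mean(axis=2)  # (20, n3)
    h = relu(W1 @ pooled.ravel() + b1)                           # first full connection layer
    scores = relu(W2 @ h + b2)                                   # second full connection layer
    return softmax(scores)

rng = np.random.default_rng(0)
n1, k1, n4, n5 = 103, 8, 100, 9                    # hypothetical sizes
n2 = n1 - k1 + 1                                   # length of each feature map
n3 = n2 // k1                                      # length after pooling
params = (
    rng.standard_normal((20, k1)) * 0.1, np.zeros(20),
    rng.standard_normal((n4, 20 * n3)) * 0.1, np.zeros(n4),
    rng.standard_normal((n5, n4)) * 0.1, np.zeros(n5),
)
probs = forward(rng.random(n1), params, k1)
# Parameter counts quoted in the text.
n_params_fc1 = params[2].size                      # (20 * n3 * 1) * n4
n_params_fc2 = params[4].size + params[5].size     # (n4 + 1) * n5
```

With these toy sizes the first full connection layer dominates the parameter count, which is why the pooling window has a strong influence on model size.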

IV. EXPERIMENTAL VERIFICATION AND ANALYSIS
A. EXPERIMENTAL DATA AND ENVIRONMENT
To verify the effectiveness of QGASR-CNN, the Pavia University and Indian Pines hyperspectral remote sensing image datasets are selected [32]. PCA, sparse representation (SR), and sparse representation with QGA (QGASR) are used to extract dimensionality-reduced features from the hyperspectral images, and a CNN model is then trained for classification. The experimental results of the resulting methods are compared in order to verify the effectiveness of the proposed QGASR-CNN. The experiments were executed on a PC with an Intel(R) Core(TM) i5-6300HQ CPU @ 2.30 GHz. The Pavia University dataset is a hyperspectral remote sensing image dataset collected over Pavia University in northern Italy by the German airborne Reflective Optics System Imaging Spectrometer (ROSIS) [32]. The spectral imager continuously imaged 115 wavebands in the wavelength range of 0.43-0.86 µm, and the spatial resolution of the image is 1.3 m. Among them, 12 wavebands were eliminated because of noise, and the remaining 103 bands are generally used. The size of the image is 610 × 340, i.e., 207400 pixels in total. After a large number of background pixels are removed, 42776 pixels remain. The basic information of the image is shown in Table 1, and the pixels are categorized into 9 types in Table 2.
The Indian Pines dataset was acquired by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) over the Indian Pines test site in Indiana, USA. A region of 145 × 145 pixels was cropped and labeled as a hyperspectral image classification test dataset. The imaging wavelength range is 0.4-2.5 µm, and the ground objects are imaged in 220 consecutive bands. However, since bands 104-108, 150-163, and 220 are water absorption bands, the remaining 200 bands are used as the research object. The data contain 21025 pixels in total. After a large number of background pixels are eliminated, 10249 pixels containing ground objects are left. The basic information of the image is shown in Table 3, and the pixels are categorized into 16 types in Table 4.

B. DATA CONFIGURATION AND PARAMETER SETTING
To select the size of the convolution kernels, we conducted experiments with kernels of size 3, 5, 7, and 9 on the Indian Pines and Pavia University datasets, respectively. By comparing the overall accuracy (OA), average accuracy (AA), and kappa coefficient (kappa), we evaluate the influence of the kernel size on the classification performance of the model. The influence of different kernel sizes on the classification accuracy is shown in Figure 3 and Figure 4.
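The three evaluation indexes can be computed from a confusion matrix with their standard definitions, as in the following sketch. The function name `classification_scores` and the two-class confusion matrix are hypothetical and are not taken from the experiments.

```python
import numpy as np

def classification_scores(conf):
    """conf[i, j] = number of test pixels of true class i predicted as class j."""
    conf = np.asarray(conf, dtype=float)
    total = conf.sum()
    # Overall accuracy: fraction of all pixels on the diagonal.
    oa = np.trace(conf) / total
    # Average accuracy: mean of the per-class recalls.
    aa = np.mean(np.diag(conf) / conf.sum(axis=1))
    # Kappa: agreement beyond what the row and column totals would give by chance.
    expected = np.sum(conf.sum(axis=0) * conf.sum(axis=1)) / total ** 2
    kappa = (oa - expected) / (1.0 - expected)
    return oa, aa, kappa

# Toy example: 50 test pixels per class.
oa, aa, kappa = classification_scores([[45, 5],
                                       [10, 40]])
```

OA is dominated by the large classes, AA weights every class equally, and kappa discounts chance agreement, so the three indexes can rank methods differently on imbalanced datasets such as Indian Pines.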
Figure 3 and Figure 4 show that the 5 × 5 convolution kernel gives the better classification performance on both datasets. Therefore, the size of the convolution kernel is set to 5 × 5 in this experiment. We have also carried out experiments on how the proportion of training samples to test samples in the Indian Pines dataset influences the classification accuracy. The numbers of selected training samples are 10, 50, 100, 150, and 200, respectively.
As can be seen from Figure 5, the classification performance improves as the number of training samples grows. The tail of the line graph shows that adding training data does not lead to over-fitting. Therefore, 200 training samples are selected from the Pavia University and Indian Pines datasets, and the remaining samples are used as test sets. The data configuration details of the Pavia University dataset are shown in Table 5.
Because some surface feature types in the Indian Pines dataset have few samples, we retain only the 8 surface feature types with a large number of samples in this experiment. Similarly, 200 samples are selected for the 8 surface feature types as the training set, and the remaining samples are regarded as the test set. The data configuration of Indian Pines is shown in Table 6.
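A train/test split of this kind can be sketched as follows. The sketch assumes that the 200 training samples are drawn per class, which is one reading of the configuration above, and it uses synthetic labels with 300 pixels in each of 8 classes instead of the real ground truth; the function name `split_per_class` is hypothetical.

```python
import numpy as np

def split_per_class(labels, n_train, seed=0):
    """Randomly pick n_train pixel indices per class for training; the rest are test."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for c in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == c))  # shuffle this class's pixels
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:])
    return np.array(train_idx), np.array(test_idx)

# Synthetic labels: 8 classes with 300 labeled pixels each (hypothetical sizes).
labels = np.repeat(np.arange(8), 300)
train_idx, test_idx = split_per_class(labels, n_train=200)
```

Fixing the random seed keeps the split identical across the compared methods, so that differences in accuracy come from the features and not from the sampling.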

C. EXPERIMENTAL RESULTS AND ANALYSIS
To evaluate the classification effect of the proposed algorithm, the classification results obtained with PCA, SR, and QGASR features are compared. In the experiment, the commonly used evaluation indexes for comparing classifications are the overall accuracy and the kappa coefficient. All algorithms and experiments are repeated 20 times, and the maximum number of iterations is 80. The reported overall accuracy and kappa coefficient are the mean values over the 20 tests.
Experiment 1: After the experiment is executed on the Pavia University hyperspectral remote sensing image dataset, the comparison between the original hyperspectral remote sensing image and the classification results is shown in Figure 6.
The training time, testing time, total accuracy, and kappa coefficient of the three methods are given in Table 7 and Figure 7.
As can be seen from Table 7 and Figure 7, the total accuracies of PCA-CNN, SR-CNN, and QGASR-CNN are 86.4%, 88.7%, and 91.6%, respectively. The total accuracy of QGASR-CNN is the best and is 5.2 percentage points higher than that of PCA-CNN. The kappa coefficients of PCA-CNN, SR-CNN, and QGASR-CNN are 0.841, 0.863, and 0.882, respectively, so the kappa coefficient of QGASR-CNN is also the best among the three methods. The comparison of the evaluation indexes shows that QGASR-CNN is superior to PCA-CNN and SR-CNN in classification accuracy and kappa coefficient. The proposed QGASR-CNN improves the classification accuracy, maintains the smoothness of the classification results, and effectively reduces the classification error, but its operation efficiency is slightly lower than that of PCA-CNN and SR-CNN.
Experiment 2: After the experiment is executed on the Indian Pines hyperspectral remote sensing image dataset, the comparison between the original hyperspectral remote sensing image and the classification results is shown in Figure 8.
The operation time, total accuracy, and kappa coefficient of the three methods are shown in Table 8 and Figure 9.
As can be seen from Table 8 and Figure 9, the total accuracies of PCA-CNN, SR-CNN, and QGASR-CNN are 88.2%, 90.3%, and 94.1%, respectively. The total accuracy of QGASR-CNN is the best and is 5.9 percentage points higher than that of PCA-CNN. The kappa coefficients of PCA-CNN, SR-CNN, and QGASR-CNN are 0.861, 0.887, and 0.921, respectively, so the kappa coefficient of QGASR-CNN is also the best among the three methods. The comparison of the evaluation indexes shows that QGASR-CNN is superior to PCA-CNN and SR-CNN in classification accuracy and kappa coefficient. The proposed QGASR-CNN improves the classification accuracy, maintains the smoothness of the classification results, and effectively reduces the classification error. The operation efficiency of QGASR-CNN is close to that of PCA-CNN and SR-CNN.
Finally, to test whether QGASR-CNN over-fits, we compare its loss curves on the Pavia University dataset and the Indian Pines dataset. The maximum number of iterations is 120. The loss curves of the two datasets are shown in Figure 10 and Figure 11. Here, loss indicates the error value on the training set, and val_loss indicates the error value on the testing set.
As can be seen from Figure 10 and Figure 11, QGASR-CNN converges well.
The error values of the training set and the testing set both reach their minimum values, which shows that QGASR-CNN has good classification ability, generalization performance, stability, and robustness. By 80 iterations, QGASR-CNN has already achieved a good classification accuracy, which indicates that it converges quickly. At the same time, QGASR-CNN shows no over-fitting when classifying the hyperspectral remote sensing images of the Pavia University and Indian Pines datasets.

V. CONCLUSION
Traditional image processing and classification methods struggle to achieve good classification results for hyperspectral remote sensing images because the same object can exhibit different spectra, different objects can exhibit the same spectrum, the distribution of ground objects is complex, spatial scales differ greatly, labeled samples are few, and noise types are complex and diverse. This paper proposes a new hyperspectral remote sensing image classification method, namely QGASR-CNN, based on sparse representation with the quantum genetic algorithm and a convolutional neural network. Sparse representation is used to extract the characteristics of the image and to represent the image as a linear combination of base atoms in the dictionary. The quantum genetic algorithm is used to sparsely decompose the image and generate its sparse representation. The sparse feature matrix of pixel groups is constructed and used as the input of the convolution kernels to obtain the feature mapping relation of pixels in the CNN model, so as to achieve a better classification effect. The hyperspectral remote sensing images of Pavia University and Indian Pines are used to verify the effectiveness of the proposed method. The classification accuracy reaches 94.1%. The experimental results show that QGASR-CNN improves the classification accuracy compared with the PCA-CNN and SR-CNN baselines, effectively alleviates the problem of ''salt and pepper misclassification'' in hyperspectral remote sensing image classification, and shows no over-fitting.
In future work, we will further study the relationship between parameter selection and feature mapping in quantum-optimized sparse decomposition, and further improve the classification accuracy and operation efficiency.