Dual-Concentrated Network With Morphological Features for Tree Species Classification Using Hyperspectral Image

Deep learning is currently an active research topic in hyperspectral image (HSI) classification. However, in fine-grained classification tasks such as tree species classification, spectral uncertainty remains the major factor limiting classification performance. To address this difficulty in forest tree species classification, a dual-concentrated network with morphological features (DNMF) is proposed. First, mathematical morphology is used to extract the morphological features of the HSI. Then, coarse-grained information is extracted from the original hyperspectral data, and fine-grained information is extracted from the morphological features. After that, both the morphological representations and the spectral inputs are fed into DNMF, and the overall evaluation indices and the visual classification map are obtained. The advantage of DNMF is that it decouples the spatial and spectral information and then simulates a multisource information fusion process. Accordingly, DNMF obtains high tree species classification accuracy. To verify the superiority of DNMF, we use two study areas: the Gaofeng State-owned Forest Farm in Guangxi Province and the Belgium dataset, which was collected near the western part of Belgium. Experiments demonstrate that the DNMF model achieves clearly better classification performance than other competitive baselines.


I. INTRODUCTION
HYPERSPECTRAL remote sensing technology uses hyperspectral sensors to image the target area in dozens to hundreds of continuous bands, so the resulting images contain rich spatial and spectral information. With the improvement of sensor technology, high-resolution images have attracted more and more attention [1], [2]. In recent years, airborne hyperspectral image (HSI) technology has gradually matured, and the resolution of HSI has been significantly improved. In 2016, Pang et al. [3] designed the multisensor airborne system LiCHy (light detection and ranging, CCD, and hyperspectral).
LiCHy can obtain HSI with a spatial resolution of 0.2 m. With high-resolution data blocks, HSI enables more accurate land cover analysis. In 2014, Ren et al. [4] sliced the HSI data into data blocks for efficient computation of the covariance matrix and obtained excellent classification performance. Azar et al. [5] used spectral blocks with spatial groups to effectively exploit spectral and spatial information. In 2017, Zhong et al. [6] used a residual network to train and classify urban hyperspectral data with a spatial resolution of 1.3 m and achieved excellent results. Zhong et al. [6] also concluded that hyperspectral remote sensing earth observation technology is widely used in fields such as environmental monitoring, agricultural survey, mining survey, and atmospheric research. Although hyperspectral datasets are widely used, there are great differences between commonly used hyperspectral datasets and forestry hyperspectral datasets. Specifically, the spectral curves of different ground types in the commonly used public datasets are quite different. For example, Kumar et al. [7] used the open Indian Pines hyperspectral data, which contain 16 ground object classes, and different classes in this dataset exhibit clearly different spectral curves. However, the spectral reflectance of tree species in the same genus is very similar, so a new classification algorithm is needed to solve the problem of "different objects with a similar spectrum" in tree species classification.
Because of this classification challenge, the determination of forest tree species has become a new hotspot in forest hyperspectral data applications. In forest applications, hyperspectral data can provide spatial information on tree species composition and distribution patterns, and establish a foundation for tree species classification, forest map drawing, and forest diversity protection. For example, Korpela et al. [8] proposed to use airborne data to classify tree species in 2010. Marcinkowska et al. [9] aimed to discover the potential of hyperspectral remote sensing data for mapping forest vegetation ecosystems in 2014. On this basis, Marcinkowska et al. [10] identified tree species in a large area of forest vegetation, which was conducive to the drawing of forest maps and the protection of forest diversity. Many studies now try to introduce light detection and ranging (LiDAR) data into the spectrum-driven tree species recognition task to further improve the classification performance [11], [12], [13]. Dalponte et al. [11] analyzed the use of airborne hyperspectral and LiDAR data for tree species classification in the Southern Alps, and the diagnostic spectral resolution can be integrated with the rich geometrical information. Matthew et al. [12] combined the tree crown information (such as the maximum tree height) in LiDAR data with the spectral reflectance knowledge of HSI, and the final classification was implemented by a support vector machine. Similar to tree species classification, many fine-grained recognition tasks rely heavily on the full use of multivariate data, and some researchers have proposed derivative algorithms based on multivariate data fusion. In 2013, Li et al. [14] proposed the convolutional recurrent neural network (CRNN), which used two ways of generating the classification rule set for the CRNN evaluation. In 2018, Li et al. [15] proposed a coupled sparse tensor factorization (CSTF)-based approach for fusing hyperspectral and multispectral images. In 2018, Xu et al. [16] developed a two-tunnel convolutional neural network (CNN) framework to extract spectral-spatial features from HSI. Dian et al. [17] regularized hyperspectral and multispectral image fusion with a CNN denoiser. In 2020, Hang et al. [18] proposed an efficient and effective framework to fuse hyperspectral and LiDAR data using two coupled CNNs.
Returning to the problem of tree species identification, it is sometimes difficult to obtain LiDAR or multispectral images acquired at the same location as the HSI. However, we often have unprocessed external information that is underutilized. If this information can be used, the structural relationship of the original spatial information can be maintained. In 2021, Hang et al. [19] proposed a multitask generative adversarial network (MTGAN) to take advantage of the rich information from unlabeled samples. Other research found that prior information helps to improve the learning ability of CNNs. Accordingly, Hang et al. [20] further proposed an attention-aided CNN model for spectral-spatial classification of HSIs. To weaken the influence of "different objects with a similar spectrum" under small training samples, we plan to use fine-grained information (such as morphological features) to compensate for the characteristics that reflect class differences but are ignored in the coarse-grained details. Fine-grained image analysis is a long-standing problem in computer vision and pattern recognition, and it has been widely used in the real world [21]. For example, Yin et al. [22] proposed to use fine-grained image information (two local dynamic pose features) to solve the problem that people with similar clothes are difficult to distinguish. Furthermore, more fine-grained information in images remains to be mined. In recent years, mathematical morphology has been gradually applied to the mining of fine-grained spatial information of images and to HSI classification. Many morphological algorithms have proposed feature attributes to enhance local geometric information. For example, Li et al. [23] proposed to capture the compactness attribute of circular shapes and the elongation attribute of long strip shapes. It is therefore feasible to extract fine-grained information from HSI with a morphological algorithm.
Before mining fine-grained information in HSIs, it is necessary to review the development of mathematical morphology algorithms. Building on previous research, Plaza et al. [24] proposed morphological profiles (MP), an adaptive mathematical morphology algorithm that introduced the concept of MP into HSI processing. MP can effectively extract the spatial information of the image and achieve the effective classification of fuzzy objects. Since then, a series of mathematical morphology algorithms have been proposed. For example, Mura et al. [25] proposed attribute profiles (AP) based on attribute filtering (AF). In 2016, Ghamisi et al. [26] proposed extinction profiles (EP), which are built on a component tree (maximum or minimum tree) with extinction filtering that prunes the tree nodes according to their extinction values to extract features. In 2019, Li et al. [23] proposed the local contain profiles (LCP), which are designed on the basis of the topology tree. Compared with the component tree, the topology tree significantly enhances the stability of the classification process based on regional context in the image. In 2021, Cao et al. [27] proposed a threshold-based local contain profile (TLCP) on the basis of EP and LCP. Based on the topology tree, TLCP flexibly sets thresholds according to different physical properties of objects and retains more spatial information than LCP. In 2021, Hou et al. [28] proposed multiple morphological profiles (MMPs) to enhance the utilization of HSIs. The aforementioned morphological features enhance the use of spatial information in HSI and improve the performance of HSI classification.
Although traditional morphological algorithms offer effective texture extraction and efficient computation, their classification performance is still limited and strongly dependent on manual parameter selection, as investigated by Dalponte et al. [29]. Therefore, a more efficient and autonomous classification framework is urgently needed. Deep learning methods, which provide automatic feature learning, have been extensively employed for remote sensing image feature extraction and classification. In particular, Hinton [30] found that the CNN is a kind of deep network that involves fewer parameters than a fully connected network does. A CNN can directly process two-dimensional (2-D) images and reduces the burden of manual parameter setting. It mainly consists of three parts: the convolution layer, the pooling layer, and the fully connected layer. In 2017, Raczko and Zagajewski [31] applied CNN to tree species classification for the first time, forming a new HSI classification framework. Based on the CNN framework, scholars have effectively improved the accuracy of HSI classification and reduced the complexity of manual intervention. Chen et al. [32] and Zhao et al. [33] jointly used dimension reduction and deep learning techniques for spectral and spatial feature extraction to achieve better classification performance. Wang et al. [34], [35] proposed two kinds of generic models to solve the classification difficulties caused by intraclass differences and interclass similarities. Lee et al. [36] proposed the contextual deep CNN, which can optimally explore local contextual interactions by jointly exploiting local spatio-spectral relationships of neighboring pixels. However, as investigated by Xue et al. [37], overfitting and information loss remain the greatest challenges in deep networks.
In view of the characteristics of forest hyperspectral remote sensing images, a DNMF is proposed to overcome the shortcomings of traditional morphological methods, namely low classification accuracy and susceptibility to the Hughes phenomenon. The proposed method decouples the spatial-spectral information from a new perspective and recouples it in an appropriate way, which significantly improves the performance of HSI classification. As a morphological feature extraction algorithm, LCP is applied to utilize the fine-grained spatial information of the input image. Specifically, the first step is to extract mathematical morphological features (LCP) from the HSI. The second step is to extract the 1-D vector and the 2-D data block from both the original hyperspectral data and the morphological features. The third step is to acquire the final classification result with a cross-converged dual-concentrated network.
The main contributions of this article are as follows.
1) We propose a DNMF for tree species classification. The spatial information and spectral information are first decoupled by morphological feature extraction, and the morphological features and the original data are then recoupled through the collaborative use of the spectral dimension [one-dimensional (1-D) vector] and the spatial dimension (2-D data block). This imitates the process of multiple information fusion.
2) Owing to the characteristics of forest vegetation, it is difficult to obtain a large number of samples, which hinders tree species classification. The proposed method obtains high tree species classification accuracy with small samples, which suits the present status of forest hyperspectral data. It also addresses the problem of "different objects with a similar spectrum" in tree species classification.

II. DUAL-CONCENTRATED NETWORK WITH MORPHOLOGICAL FEATURES
We propose a new HSI classification framework called DNMF. The proposed framework is mainly composed of three parts: morphological feature (LCP) extraction from the HSI, HSI standardization, and multigranularity feature cross fusion based on the dual-concentrated network (DCN). The overall architecture of the proposed classification framework is shown in Fig. 1.
Part A introduces the LCP feature extraction. Part B explains the detailed settings of the DCN. Part C describes the feature fusion and HSI normalization.

A. Mathematical Morphology for LCP Feature Extraction
Before deep learning algorithms were applied to HSI, morphological algorithms were relatively common in HSI classification, but this does not mean that traditional morphological algorithms are outdated. In this article, the morphological feature extraction algorithm (LCP) is used to decouple the HSI information, which yields more significant texture information and effectively improves the utilization of spatial information.
Among the traditional morphological methods, EP and LCP have been proposed in recent years. Both EP and LCP can effectively extract spatial information from HSI and help to enhance spatial information in woodland HSI. Specifically, EP is built on the component tree (max tree or min tree). The component tree is built from the pixel values of the connected regions, so it is susceptible to external factors. For example, clouds and shadows affect the pixel values, and the component tree changes accordingly. Hence, the feature extraction process is inevitably affected. In contrast, LCP is extracted from a topological tree, which is built on the inclusion relationship between connected regions. The inclusion relationship is less sensitive to external factors, which significantly improves stability compared with the component tree based on pixel values.
The left side of Fig. 1 briefly illustrates the extraction process of LCP, while the detailed settings of LCP are divided into the following steps, as shown in Fig. 2. First, the original hyperspectral data blocks are processed with the principal component analysis (PCA) algorithm, and the corresponding principal components are selected according to our demand for information. Second, topological trees are constructed from the principal components. Third, we set the same extinction filter parameters for the topological trees constructed from different components. The parameters contain seven extinction values corresponding to seven pruned topological trees. Fourth, we reconstruct the seven pruned topological trees corresponding to each attribute (area, height, volume, etc.) and obtain seven reconstructed images. Finally, we stack the seven images of each attribute to obtain a 49-dimensional morphological feature block. To better explain the feature extraction and integration of LCP, we introduce in detail the construction process of the topological trees and the principle of topological tree extinction filtering.
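The first and last steps of this pipeline (PCA and stacking) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `pca_components` and `stack_profiles` are hypothetical, PCA is computed with a plain SVD, and the topological tree filtering in between is replaced by placeholder images.

```python
import numpy as np

def pca_components(cube, k):
    """Project an H x W x B cube onto its first k principal components."""
    h, w, b = cube.shape
    pixels = cube.reshape(-1, b).astype(np.float64)  # astype returns a copy
    pixels -= pixels.mean(axis=0)                    # center each band
    # Rows of vt are the principal axes, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(pixels, full_matrices=False)
    return (pixels @ vt[:k].T).reshape(h, w, k)

def stack_profiles(filtered_images):
    """Stack a list of H x W filtered images into an H x W x N feature block."""
    return np.stack(filtered_images, axis=-1)

rng = np.random.default_rng(0)
cube = rng.random((8, 6, 20))        # toy stand-in for an HSI data block
pcs = pca_components(cube, k=3)

# Placeholder only: seven copies of the first component stand in for the
# seven reconstructed images that extinction filtering would produce.
block = stack_profiles([pcs[..., 0]] * 7)
print(pcs.shape, block.shape)
```

In a real pipeline, each placeholder image would be replaced by the reconstruction of one pruned topological tree.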
The construction of the topological tree can be explained by a simple image inclusion relationship. In Fig. 3, A represents the largest connected region, which contains all pixels of the image, so A is the root node. B, C, and D represent the second largest connected regions and are child nodes of A. B contains regions F and G, so F and G are child nodes of B. Following these principles, the topology tree is constructed.
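The inclusion rule can be sketched on toy data. In the sketch below, each region is a set of pixel indices and the parent of a region is taken to be the smallest region that strictly contains it; the pixel sets and the function name `build_inclusion_tree` are hypothetical and only mirror the A, B, C, D, F, G example.

```python
def build_inclusion_tree(regions):
    """Return {child: parent}; the parent is the smallest strictly containing region."""
    parent = {}
    for name, pixels in regions.items():
        containers = [
            (len(other_pixels), other)
            for other, other_pixels in regions.items()
            if other != name and pixels < other_pixels   # strict subset test
        ]
        parent[name] = min(containers)[1] if containers else None
    return parent

# Toy regions as sets of pixel indices (illustrative values only).
regions = {
    "A": set(range(20)),
    "B": {0, 1, 2, 3, 4, 5},
    "C": {6, 7, 8},
    "D": {9, 10},
    "F": {0, 1},
    "G": {3, 4},
}
tree = build_inclusion_tree(regions)
print(tree)
```

The root A has no parent, B, C, and D attach to A, and F and G attach to B, which matches the description of Fig. 3.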
Extinction filtering (EF) is the most critical part of LCP-based feature extraction and integration after the topology tree has been constructed. The extinction value in the topology tree is defined as follows. Let $M$ be a local minimum connected region of the image $X$, and let $\Psi = (\phi_{\lambda})_{\lambda}$ be a family of decreasing connected antiextensive transformations. The extinction value of $M$ with respect to $\Psi$, denoted by $\varepsilon_{\phi}(M)$, is the largest value of $\lambda$ for which $M$ is still a minimum connected region of $\phi_{\lambda}(X)$ after filtering. The definition of the extinction value can be given by the following formula:
$$\varepsilon_{\phi}(M) = \sup\{\lambda \ge 0 \mid \forall \mu \le \lambda,\ M \subset \mathrm{Min}(\phi_{\mu}(X))\} \qquad (1)$$
where $\mathrm{Min}(\phi_{\mu}(X))$ is the set containing all the minimum connected regions of $\phi_{\mu}(X)$. EF is a connected filter whose principle is to delete or retain the connected region corresponding to a leaf node together with its branch nodes. Let $\mathrm{Min}(X) = \{M_1, M_2, \ldots, M_N\}$ be the set of minimum connected regions in the image $X$, where each $M_i$ ($i = 1, 2, \ldots, N$) has an extinction value $\omega_i$ [defined by (1)]. First, the $M_i$ are sorted in decreasing order of $\omega_i$. Then, according to the threshold $n$, the first $n$ regions $M_i$ are selected, and the corresponding leaf nodes are marked. Finally, the branches with marked leaf nodes are retained according to the filtering strategy, while the branches with unmarked leaf nodes are cut off. This filtering process can be defined as follows:
$$\mathrm{EF}^{n}(X) = A^{\nu}_{g}(X) \qquad (2)$$
where $A^{\nu}_{g}$ is obtained through the reconstruction operation and $g$ is a function for selecting markers. $g$ can be expressed as follows:
$$g = \operatorname*{Max}_{i=1}^{n}\{M^{*}_{i}\} \qquad (3)$$
where Max is the operation of selecting the minimum connected regions with the largest extinction values, and $M^{*}_{i}$ is the minimum connected region with the $i$th highest extinction value.
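The marking and pruning rule described here can be sketched on a toy tree. This is an illustration under stated assumptions: the tree is given as a child-to-parent map, the extinction values of the leaves are hypothetical numbers, and the subsequent image reconstruction step is not shown.

```python
def extinction_filter(parent, extinction, n):
    """Keep the branches that lead to the n leaves with the largest extinction values."""
    # Sort leaves by decreasing extinction value and mark the first n.
    marked = sorted(extinction, key=extinction.get, reverse=True)[:n]
    kept = set()
    for leaf in marked:
        node = leaf
        # Walk up to the root, stopping early on an already retained branch.
        while node is not None and node not in kept:
            kept.add(node)
            node = parent[node]
    return kept

# Child -> parent map of a toy topology tree (A is the root).
parent = {"A": None, "B": "A", "C": "A", "D": "A", "F": "B", "G": "B"}
# Hypothetical extinction values of the leaf nodes.
extinction = {"C": 5.0, "D": 1.0, "F": 8.0, "G": 2.0}

kept = extinction_filter(parent, extinction, n=2)
print(sorted(kept))
```

With n = 2, leaves F and C are marked, so the branches A-B-F and A-C are retained while D and G are cut off.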
After determining the extinction value and the minimum connected region of each node, we need to set appropriate thresholds for EF to extract features and eliminate noise regions. The thresholds are set as $a^{m}$ ($a = 1, 2, 3, \ldots$; $m = 0, 1, 2, \ldots, s-1$), where $a$ is the base parameter and $s$ is the number of thresholds.
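As a small worked example of this threshold rule, the sketch below lists the thresholds for one choice of parameters. It assumes that the exponent m runs over 0, 1, ..., s − 1 so that exactly s thresholds result; the base a = 3 is an illustrative choice, and s = 7 matches the seven extinction values mentioned for LCP.

```python
def extinction_thresholds(a, s):
    """Return the s thresholds a**0, a**1, ..., a**(s - 1)."""
    return [a ** m for m in range(s)]

thresholds = extinction_thresholds(a=3, s=7)   # a = 3 is an assumed base
print(thresholds)
```

Each threshold n in this list would yield one pruned topological tree and hence one reconstructed image per attribute.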

B. Architecture and Details of the DCN
CNN-based classification has been an active topic in the hyperspectral area in recent years. However, the inherent nonlinearity between materials and their spectral profiles still severely restricts the performance of existing CNN-based methods. To address this problem, we propose an HSI processing architecture based on DNMF.
The right side of Fig. 1 is a schematic diagram of the DCN. The DCN can be divided into two parts: a fine-grained information processing branch and a coarse-grained information processing branch. The standardization step is introduced in Part C.
In the architecture of the DCN, we design a new strategy to recouple the spectral-spatial information. Morphological features are extracted by LCP to collect fine-grained information, while the normalized HSI retains the coarse-grained hyperspectral input, thus constructing a coarse-to-fine information set. To make the recoupling process more stable, the fine-grained information (morphological features) and the coarse-grained information are each analyzed separately in the spatial and spectral dimensions. Specifically, both the normalized HSI and the LCP features are divided into 2-D data blocks and 1-D vectors. The processing tunnels of the fine-grained and coarse-grained inputs are consistent with each other. Taking the processing of the normalized HSI as an example, the feed-forward settings are introduced as follows. The size of the normalized HSI is H × W × B, from which the 2-D input blocks and 1-D vectors are first cropped. The 1-D vector is constructed from the target pixel of the normalized HSI. To maintain the stability and registration of the recoupling, the position of the target pixel in the morphological features is the same as that in the normalized HSI. The dimension of the 1-D vector is 1 × B. Here, B is the number of feature channels in the normalized HSI input, which is the same as the number of bands in the original HSI. Zhang et al. [38] demonstrated that an appropriate increase in the size of the input data helps to improve classification performance, so an appropriate window size of S × S is set according to practical demands. To keep the architecture robust, the 2-D data block is extracted around the same target pixel as the 1-D vector and contains the same number of bands as the original HSI. Taking the target pixel as the center, a cube of size S × S × B is selected as the 2-D block.
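The cropping of the two inputs around one target pixel can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `extract_inputs` is hypothetical, S is assumed to be odd, and reflect padding at the image border is an assumption, since the text does not say how edge pixels are handled.

```python
import numpy as np

def extract_inputs(cube, row, col, s):
    """Return the 1 x B vector and the S x S x B block centered on (row, col)."""
    half = s // 2
    # Assumption: reflect-pad the spatial borders so edge pixels get a full window.
    padded = np.pad(cube, ((half, half), (half, half), (0, 0)), mode="reflect")
    vector = cube[row, col, :].reshape(1, -1)
    # Pixel (row, col) sits at (row + half, col + half) in the padded cube.
    block = padded[row:row + s, col:col + s, :]
    return vector, block

cube = np.arange(10 * 12 * 4, dtype=float).reshape(10, 12, 4)  # toy H x W x B cube
vec, blk = extract_inputs(cube, row=0, col=5, s=7)
print(vec.shape, blk.shape)
```

Applying the same function with the same (row, col) to the LCP feature block keeps the two inputs registered on the same target pixel.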
Finally, the inputs are grouped by dimensionality (1-D or 2-D), and the features within each group are fused. The 1-D fusion channel contains two 1-D convolutional layers, one batch normalization layer, two ReLU layers, a max-pooling layer, and a flatten layer. Uezato et al. [39] explained that the spectral curve is continuous. According to this characteristic of the spectral dimension of HSIs, the 1-D vectors are fused in a cascade manner, as shown in Fig. 1. To ensure the consistency of the spatial and spectral information of the HSI, the structure of the 2-D fusion channel is similar to that of the 1-D fusion channel. The difference between the two channels is that the 2-D fusion channel contains two fusions. Both the 1-D fusion channel and the 2-D fusion channel pass through a batch normalization layer and a ReLU layer. Classification results are obtained from the fused features of the 1-D and 2-D fusion channels.

C. Data Fusion and HSI Normalization
In the proposed architecture, morphological features contain a large amount of fine-grained texture information, which greatly enhances the spatial information. Spectral information in forest tree species data has relatively weak discriminative power, while the enhanced spatial information strongly improves the performance of the whole model on tree species classification.
The recoupling of diversified information is embedded in the DCN branch of the proposed model. First, the morphological features and the standardized HSI are preprocessed into 2-D data blocks as input, and then 1-D vectors are obtained from the 2-D data blocks. The 1-D vectors of the morphological features and the standardized HSI are fused by cascade fusion, which is designed to better utilize the spectral information. The 2-D data blocks of the morphological features and the standardized HSI are fused three times, which effectively enhances the utilization of spatial information.
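The shape bookkeeping of the fusion can be sketched as follows, assuming that cascade fusion amounts to concatenation along the feature axis, which the text does not state explicitly. The function name `cascade_fuse` is hypothetical; the sizes use the 125 bands of the GSFF data, the 49-dimensional LCP block, and a 15 × 15 window.

```python
import numpy as np

def cascade_fuse(hsi_features, lcp_features):
    """Concatenate two feature arrays along their last (feature) axis."""
    return np.concatenate([hsi_features, lcp_features], axis=-1)

# 1-D inputs: one spectral vector and one morphological vector per target pixel.
hsi_vec = np.ones((1, 125))
lcp_vec = np.zeros((1, 49))
fused_vec = cascade_fuse(hsi_vec, lcp_vec)

# 2-D inputs: S x S blocks around the same target pixel.
hsi_blk = np.ones((15, 15, 125))
lcp_blk = np.zeros((15, 15, 49))
fused_blk = cascade_fuse(hsi_blk, lcp_blk)

print(fused_vec.shape, fused_blk.shape)
```

In the actual network the fusion operates on learned feature maps inside the 1-D and 2-D channels rather than on raw inputs, so this only illustrates how the two sources are joined.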
In the preprocessing step, the HSI is standardized. The spectral intensity of HSIs changes with weather and illumination. To reduce the differences in spectral curves within the same tree species and to facilitate the final classification, we normalize each band of the input HSI. In practice, each band is normalized separately, and the normalized HSI is obtained by concatenating the normalized bands. This normalization can be defined as follows:
$$\mathrm{Normalization}_{i} = \frac{\mathrm{HSI}_{i} - \mathrm{Min}(\mathrm{HSI}_{i})}{\mathrm{Max}(\mathrm{HSI}_{i}) - \mathrm{Min}(\mathrm{HSI}_{i})}$$
where the numerator is the difference between the spectral intensity of each pixel in the $i$th band and the minimum value of this band, and the denominator is the difference between the maximum value and the minimum value of this band. $\mathrm{Normalization}_{i}$ is the $i$th normalized band, and $\mathrm{HSI}_{i}$ is the $i$th band of the HSI. $\mathrm{Max}(\mathrm{HSI}_{i})$ is the largest spectral intensity among all pixels in the $i$th band of the HSI, and $\mathrm{Min}(\mathrm{HSI}_{i})$ is the smallest spectral intensity among all pixels in the $i$th band of the HSI.
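The band-wise min-max normalization described here can be sketched in a few lines. The function name `normalize_bands` and the small `eps` guard against constant bands are additions for illustration and are not part of the original description.

```python
import numpy as np

def normalize_bands(cube, eps=1e-12):
    """Min-max normalize each band of an H x W x B cube to the range [0, 1]."""
    band_min = cube.min(axis=(0, 1), keepdims=True)
    band_max = cube.max(axis=(0, 1), keepdims=True)
    # eps avoids division by zero for a constant band (an added safeguard).
    return (cube - band_min) / np.maximum(band_max - band_min, eps)

rng = np.random.default_rng(0)
cube = rng.random((5, 4, 3)) * 100 + 7     # toy cube with arbitrary intensities
norm = normalize_bands(cube)
print(norm.min(axis=(0, 1)), norm.max(axis=(0, 1)))
```

Because the statistics are computed per band with `keepdims=True`, broadcasting returns the normalized bands already concatenated in the original H × W × B layout.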

III. EXPERIMENTS AND ANALYSIS
For the proposed DNMF framework, all the programs are implemented in Python, and the network is constructed using the Keras and TensorFlow deep learning frameworks. TensorFlow is an open-source software library for numerical computation using data flow graphs, and Keras can be seen as a simplified interface to TensorFlow.

A. Experimental Data
The performance of the proposed DNMF framework is evaluated on two forestry datasets: the Gaofeng State-owned Forest Farm (GSFF) dataset and the Belgium dataset. The GSFF dataset is illustrated in Fig. 4, and the Belgium dataset is illustrated in Fig. 5. Because the ground truth of the Belgium data contains few sample points and is hard to read at full scale, some sample points are enlarged for display.
1) GSFF Dataset: The GSFF dataset was first introduced by Zhang et al. [40] in 2019 in a study of tree species classification that used a 3-D-CNN model to process HSI. It consists of 906 × 572 pixels and was gathered by the AISA Eagle II sensor at GSFF in Guangxi Province, South China. There are 125 spectral channels covering the range from 400 to 990 nm with a spatial resolution of 1 m [41]. The GSFF dataset originally has 12 different land-cover classes, including 9 forest vegetation categories. Because it is often difficult to obtain enough samples for model training in actual remote sensing image classification tasks, and to keep the model usable in practical applications, we randomly select 2.5% of the labeled pixels per class for training and use all the other pixels in the ground-truth map for testing.
2) Belgium Dataset: The Belgium dataset was proposed by Liao et al. [42]. It consists of 649 × 1079 pixels and was gathered by the airborne prism experiment (APEX) sensor near the western part of Belgium. There are 286 spectral channels covering the range from 400 to 1000 nm with a spatial resolution of 1.5 m. The Belgium dataset originally has 7 forest vegetation categories, containing 1450 trees. For the same practical reasons as for the GSFF dataset, we randomly select 19.8% of the labeled pixels per class for training and use all the other pixels in the ground-truth map for testing.
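The per-class random split used for both datasets can be sketched as follows. This is an illustration under assumptions: `labels` holds the class index of every labeled pixel only, the function name `split_per_class` is hypothetical, and keeping at least one training sample per class is an added safeguard not stated in the text.

```python
import numpy as np

def split_per_class(labels, fraction, seed=0):
    """Return (train_idx, test_idx) with `fraction` of each class used for training."""
    rng = np.random.default_rng(seed)
    train = []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        # Assumption: keep at least one training sample per class.
        n_train = max(1, int(round(fraction * idx.size)))
        train.append(rng.choice(idx, size=n_train, replace=False))
    train = np.sort(np.concatenate(train))
    test = np.setdiff1d(np.arange(labels.size), train)
    return train, test

# Toy label vector with three classes of 400, 200, and 40 labeled pixels.
labels = np.repeat([0, 1, 2], [400, 200, 40])
train_idx, test_idx = split_per_class(labels, fraction=0.025)
print(train_idx.size, test_idx.size)
```

With a 2.5% fraction, the toy classes contribute 10, 5, and 1 training pixels, and all remaining labeled pixels are used for testing.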

B. Parameters Tuning
The numbers of training and testing samples of the GSFF dataset are listed in Table I, and the numbers of training and testing samples of the Belgium dataset are listed in Table II. Parameter settings can greatly affect the classification performance of the proposed framework, so the performance with different parameter settings is evaluated first.
1) The Patch Size: The performance of different image patch sizes is tested. Each experiment is evaluated by the overall accuracy (OA), the average accuracy (AA), and the Kappa coefficient. Experimental results demonstrate that the size of the image block has a certain impact on the classification performance on different datasets. Table III shows that the window size of 15 × 15 offers better performance than the others on the GSFF data. However, the Kappa coefficient of 15 × 15 is almost the same as that of 13 × 13, indicating that further increasing the size of the input window would not continue to improve the consistency of classification. Considering computing efficiency and computing resources as well, we choose the window size of 15 × 15 for the following experiments with the GSFF dataset. Table III also shows that the window size of 7 × 7 offers better performance than the others on the Belgium dataset, so we choose the window size of 7 × 7 for the following experiments with the Belgium dataset.
TABLE I. NUMBERS OF TRAINING AND TESTING SAMPLES FOR THE GSFF DATASET
TABLE II. NUMBERS OF TRAINING AND TESTING SAMPLES FOR THE BELGIUM DATASET
TABLE III. CLASSIFICATION PERFORMANCE (%) OF THE PROPOSED DNMF FRAMEWORK WITH DIFFERENT WINDOW SIZES USING THE GSFF DATASET AND THE BELGIUM DATASET
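The three evaluation indices can be computed from a confusion matrix as sketched below. The sketch uses the standard definitions of OA, AA, and Cohen's Kappa and assumes that rows are true classes and columns are predicted classes; the function name and the toy matrix are illustrative.

```python
import numpy as np

def classification_scores(conf):
    """Return (OA, AA, Kappa) from a confusion matrix (rows: true, columns: predicted)."""
    conf = np.asarray(conf, dtype=float)
    total = conf.sum()
    oa = np.trace(conf) / total                          # overall accuracy
    aa = np.mean(np.diag(conf) / conf.sum(axis=1))       # mean per-class accuracy
    # Expected agreement by chance, from the row and column marginals.
    pe = np.sum(conf.sum(axis=0) * conf.sum(axis=1)) / total ** 2
    kappa = (oa - pe) / (1.0 - pe)
    return oa, aa, kappa

conf = [[45, 5],
        [10, 40]]          # toy two-class confusion matrix
oa, aa, kappa = classification_scores(conf)
print(oa, aa, kappa)
```

For this toy matrix, OA and AA are both 0.85 and the chance agreement is 0.5, which gives a Kappa of 0.7.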
2) The Learning Rate: The learning rate controls the convergence of the model by determining how far the weights move in the gradient direction in a mini-batch. Because the initial random weights are far from the optimal values, the HSI branches of the dual-concentrated network are trained with a large learning rate during the first stage. When the training of the HSI network is completed, the weights of the HSI branches are fixed. The mathematical morphology features are then introduced stage by stage, and the network is fine-tuned with a small learning rate. The Adam optimizer is used, and the best learning rate combination is selected. Table IV lists the optimal learning rates of the two training stages for the GSFF dataset and the Belgium dataset.

3) Analysis of the Number of Training Samples:
Labeled samples are crucial in machine learning. However, data labeling is time-consuming and laborious for remote sensing data. In particular, labeling HSI samples in forest surveys is difficult because of remote locations, large forest areas, and other issues. Therefore, it is very important to construct a tree species classification model from a small amount of labeled data.
Here, we randomly select [10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%] of the samples from the training set. The experiments are conducted ten times, and the average results are reported. As shown in Fig. 6, the OA values tend to decrease as the number of training samples decreases. In particular, the performance of 3DCNN deteriorates significantly in Fig. 6, because training with fewer samples tends to cause overfitting. In addition, the Belgium dataset contains only 1450 samples in total, far fewer than the GSFF dataset. The DNMF framework has the best stability on both the Belgium dataset and the GSFF dataset. In the case of small samples, DNMF performs best for tree species identification. On the GSFF dataset, DNMF achieves 98.66% accuracy with 10% of the training samples, which is significantly better than the other deep learning algorithms. On the Belgium dataset, DNMF achieves 70.63% accuracy with 10% of the training samples, which is also superior to the other deep learning algorithms. In conclusion, DNMF obtains more representative morphological features by decoupling the spatial-spectral information and recouples them during training in the DCN, thereby maintaining the stability of the algorithm and the accuracy of tree species recognition in the case of small samples.
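The protocol of repeated random subsampling can be sketched as follows. The training and evaluation of the network is replaced by a hypothetical `toy_evaluate` stand-in, so the numbers produced here carry no experimental meaning; only the sampling and averaging logic is illustrated.

```python
import random
from statistics import mean

def average_over_runs(train_ids, fraction, evaluate, runs=10, seed=0):
    """Average `evaluate` over `runs` random subsets holding `fraction` of the training set."""
    rng = random.Random(seed)
    n = max(1, round(fraction * len(train_ids)))
    scores = [evaluate(rng.sample(train_ids, n)) for _ in range(runs)]
    return mean(scores)

def toy_evaluate(subset):
    """Hypothetical stand-in for training a model and returning its OA."""
    return len(subset) / 100.0

train_ids = list(range(100))
curve = {f: average_over_runs(train_ids, f, toy_evaluate) for f in (0.1, 0.5, 1.0)}
print(curve)
```

In an actual experiment, `evaluate` would train the classifier on the sampled subset and return the OA on the fixed test set, giving one point of the curve in Fig. 6 per fraction.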

4) The Normalization Module:
In the DNMF framework, HSIs are normalized according to bands. On the GSFF dataset, the overall accuracy (OA), the average accuracy (AA), and the Kappa coefficient improve by 0.15%-0.76% in Table V. This small gain indicates that the spectral curves of the same ground object differ little in the GSFF dataset, so the normalization operation has little impact on the DNMF framework.
On the Belgium dataset, OA, AA, and the Kappa coefficient improve significantly, by 3.96%-5.15%, in Table V. This result shows that the spectral curves of the same ground object differ considerably in the Belgium dataset, so the normalization operation has a large impact on the classification results of the DNMF framework.
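As a rough illustration of band-wise normalization, the sketch below rescales each spectral band independently. The text does not state which normalization is used, so the min-max scaling to [0, 1], the (height, width, bands) array layout, and the function name `normalize_per_band` are assumptions of this example.

```python
import numpy as np

def normalize_per_band(cube, eps=1e-12):
    """Min-max scale each band of an (H, W, B) cube to [0, 1].

    Assumption: min-max scaling; the source only says that HSIs are
    normalized according to bands.
    """
    cube = np.asarray(cube, dtype=np.float64)
    # Statistics are taken over the two spatial axes, one value per band.
    band_min = cube.min(axis=(0, 1), keepdims=True)
    band_max = cube.max(axis=(0, 1), keepdims=True)
    # eps guards against division by zero for a constant band.
    return (cube - band_min) / (band_max - band_min + eps)

# Synthetic cube with 3 bands, used only to exercise the function.
rng = np.random.default_rng(0)
cube = rng.uniform(0.0, 4000.0, size=(4, 5, 3))
normalized = normalize_per_band(cube)
```

Because every band is scaled with its own statistics, differences in dynamic range between bands are removed before the data reach the network.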

C. Classification Performance
To demonstrate the performance of the proposed DNMF framework on preprocessed remote sensing data, it is compared with several traditional and state-of-the-art methods: SVM, extreme deep learning, CRNN [14], Two-Branch CNN [16], Context CNN [36], and 3DCNN [40].
1) Comparison of Classification Performance: For a fair comparison, all the training and testing samples are set the same as in Tables I, III, and IV. OA, AA, the Kappa coefficient, and per-class classification accuracy are used to evaluate each model before and after the LCP feature extraction branch is incorporated. Table VI lists the classification results for the GSFF data. Before the morphological feature branch is fused into any method, the proposed DNMF framework already improves significantly on the other methods, with an overall accuracy of 97.68%. The Kappa coefficient is also significantly improved, indicating that the consistency of the proposed DNMF framework on the GSFF dataset is improved even without the morphological feature extraction branch. After the morphological feature extraction branch is fused into all methods, each method improves its classification results for each category. The DNMF framework still has the highest classification accuracy among the nine categories of tree species, indicating that the proposed network is well suited to forest vegetation classification. Within each model, OA, AA, and the Kappa coefficient increase significantly after proper fusion of the mathematical morphological feature extraction branch. As a traditional feature extraction method, mathematical morphology improves the classification accuracy of SVM the most, by 14.49%. Among the deep learning networks, the accuracy gain is 3.18% for CRNN, 1.63% for Two-Branch CNN, 0.43% for 3DCNN, 2.76% for Context CNN, and 0.91% for Context CNN. Morphological features are therefore useful for species classification.
In addition, each classification algorithm is trained on the morphological features alone, and the results are shown in Table VI. The DNMF framework has the best classification results. However, the results of the different deep learning algorithms show that using morphological features alone adversely affects tree species recognition. The joint use of the original HSI and the morphological features gives the best tree species classification performance.
In the Belgium dataset, the categories are all tree species. Before the morphological feature branch is fused into any method, the proposed DNMF framework already improves significantly on the other methods, with an overall accuracy of 86.48%. The Kappa coefficient is also significantly improved, indicating that the consistency of the proposed DNMF framework on the Belgium dataset is improved even without the morphological feature extraction branch. After the morphological feature extraction branch is fused into all methods, each method improves its classification results for each category. The DNMF framework still has the highest classification accuracy among the seven categories of tree species. Within each model, OA, AA, and the Kappa coefficient increase significantly after proper fusion of the mathematical morphological feature extraction branch. As a traditional feature extraction method, mathematical morphology improves the classification accuracy of SVM the most, by 14.65%. Among the deep learning networks, the accuracy gain is 0.86% for CRNN, 0.17% for Two-Branch CNN, 6.63% for 3DCNN, 1.89% for Context CNN, and 1.73% for Context CNN. Morphological features are therefore useful for species classification.
In addition, each classification algorithm is trained on the morphological features alone after feature extraction, and the results are shown in Table VII. The DNMF framework has the best classification results. However, the results of the different deep learning algorithms show that using morphological features alone adversely affects tree species recognition. The joint use of the original HSI and the morphological features gives the best tree species classification performance.
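For reference, the three summary metrics reported in the comparison (OA, AA, and the Kappa coefficient) can be computed from a confusion matrix using their standard definitions. The sketch below is illustrative only; the function names and the toy labels are assumptions of the example and are not taken from the experiments.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    np.add.at(cm, (y_true, y_pred), 1)  # accumulate repeated index pairs
    return cm

def oa_aa_kappa(cm):
    """Overall accuracy, average accuracy, and Cohen's kappa.

    Assumes every class has at least one true sample (no empty rows).
    """
    cm = cm.astype(np.float64)
    total = cm.sum()
    oa = np.trace(cm) / total                    # fraction of correct labels
    aa = (np.diag(cm) / cm.sum(axis=1)).mean()   # mean per-class accuracy
    # Expected agreement by chance, from the row and column marginals.
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / total ** 2
    kappa = (oa - pe) / (1.0 - pe)
    return oa, aa, kappa

# Toy labels for three classes, used only to exercise the functions.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
oa, aa, kappa = oa_aa_kappa(cm)
```

OA weights every sample equally, whereas AA weights every class equally, so the two diverge when class sizes are imbalanced. Kappa additionally discounts the agreement expected by chance.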

2) Visual Comparison:
For visual evaluation of the proposed DNMF framework, the classification maps of the aforementioned methods for the GSFF data are shown in Fig. 7. In every model, the map obtained after fusing the mathematical morphology feature extraction branch is closer to the ground truth map and contains less noise than the map obtained before fusion. Among all the maps, that of the proposed DNMF framework is closest to the ground truth and has the least noise.

IV. CONCLUSION
In this article, a framework based on the DNMF processing channel has been proposed. Compared with existing processing methods, the DNMF framework integrates the morphological feature extraction branch into HSI processing, decouples the spatial information from the HSI, and obtains richer and more refined spatial information. In different HSI processing frameworks, integrating the LCP morphological feature extraction branch improved the experimental results to varying degrees, which demonstrates that morphological features strongly enhance the use of spatial information in forest HSI processing. The DNMF framework performed best among all baselines because spectral, fine-grained spatial, and coarse-grained spatial information are all fully utilized in the proposed method. Thus, DNMF achieved superior classification performance even with small training samples of forest HSIs.