Multiview Low-Rank Hybrid Dilated Network for SAR Target Recognition Using Limited Training Samples

Deep learning (DL) has been intensively exploited for synthetic aperture radar (SAR) target classification, and previous studies have reported impressive performance. Multiview-based DL methods exhibit great potential for SAR target classification since they can generate adequate sample images from a few raw images. To further improve the generalization performance of the multiview-based framework with limited training samples, we propose a novel multiview low-rank hybrid dilated network (MLHDN) for SAR target recognition. Firstly, we design a parameter-sharing hybrid dilated convolution (HDC) to learn multiview features. Secondly, a composite low-rank bilinear pooling (CLRBP) is proposed to fuse multiview features and reduce their dimensions, yielding class-oriented and compact feature vectors that are distinctive and representative for classification. Finally, a Softmax layer is used as the classifier. Accordingly, MLHDN has fewer parameters than existing multiview-based DL methods. Experimental results demonstrate that MLHDN achieves state-of-the-art performance on the moving and stationary target acquisition and recognition (MSTAR) data sets for SAR target classification, yielding an accuracy of 96.13% with only 10 training samples per class.


I. INTRODUCTION
Synthetic aperture radar (SAR) has been widely used for Earth remote sensing due to its day-and-night and all-weather imaging capabilities [1]. With distinctive characteristics of high resolution, multiple aspects, multiple dimensions, and multiple polarizations, SAR images enable military and civilian surveillance tasks such as automatic target recognition (ATR). In general, a complete SAR-ATR computation scheme comprises three successive stages: detection, discrimination, and classification [2].
Various existing methods implemented at the classification stage of SAR-ATR can be categorized as model-design-based and feature-learning-driven methods. The model-design-based methods make predictions based on a predefined target model and feature matching engineering [3], which makes the model design complex. In contrast, the feature-learning-driven methods rely on target templates or feature representations of the target for classification, e.g., using support vector machines [4], ensemble learning [5], hidden Markov models [6], manifold learning [7], graph learning [8], sparse representation [9], template matching [10], compressive sensing [11], and low-rank representation [12], which are more flexible and can be easily implemented. (The associate editor coordinating the review of this manuscript and approving it for publication was Gerardo Di Martino.)
Recently, deep learning (DL) has been intensively exploited for SAR target classification [13]. Extensive experiments have shown that, by automatically learning data-driven features from SAR images, DL methods achieve excellent classification performance in SAR-ATR. According to the issues they address, most of these methods fall into the following families.
Firstly, many studies aimed at addressing the overfitting problem caused by limited training samples. A seminal work [14] on A-ConvNets (all-convolutional networks) was designed to reduce the number of free parameters: it consists only of sparsely connected layers, rather than the fully connected layers commonly used in convolutional neural networks (CNNs). Other studies addressed this issue with various DL methods and learning strategies. For example, transfer learning was combined with stacked convolutional auto-encoders [15]; a convolutional highway unit architecture was formed by combining a modified convolutional highway layer, a max pooling layer, and a dropout layer [16]; semi-supervised learning was combined with deep convolutional generative adversarial networks [17], [18]; and deep memory convolutional neural networks were trained by transferring parameters in two steps [19]. Secondly, considering target translation, the randomness of speckle noise, and the lack of pose images in training data, different data augmentation operations were introduced into CNNs [20]–[22]. Although data augmentation can increase data diversity and the robustness of learned features, the augmentation rules are human-designed and need experiments to validate their effectiveness. Therefore, a stacked auto-encoder was exploited by using a supervised constraint (a restriction based on Euclidean distance and a dropout step) [23].

VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Thirdly, the spatial and channel correlations between features are rarely exploited in CNN-based SAR target classification. To this end, various attention mechanisms were introduced into CNNs to improve the representational power of salient features. For example, a lightweight CNN model [24] was proposed by incorporating a "Squeeze-and-Excitation" block with a spatial attention model, which the authors term the "channel-wise and spatial attention" block.
Finally, the space-varying backscattering information collected by SAR sensors is often missed by traditional methods for SAR target classification. SAR target classification can benefit from multi-aspect measurements, since a target and its associated SAR image are often highly sensitive to the viewing angle. Therefore, a novel multi-aspect-aware method was proposed to realize this idea through bidirectional long short-term memory recurrent neural networks [25]. Similarly, multiview CNN frameworks were employed to learn classification features autonomously and effectively from sets of multiview SAR images [26]–[28]. In addition, region-based convolutional neural networks were exploited to address the issue of large scale variability of objects in large-scene SAR images [29], [30].
Recent research has shown that DL-based methods are effective for SAR target classification, but they usually require many training samples to ensure accuracy. However, it is often difficult and expensive to obtain enough training samples in practical applications. Multiview-based DL methods can generate sufficient sample images from a few raw images based on the diversity of aspect angles, which makes them a natural choice for tackling the sample limitation issue in SAR target classification.
Based on the above observations, we focus on exploiting a novel multiview-based DL method for SAR target classification using limited training samples.
In this paper, we propose a novel composite low-rank bilinear pooling (CLRBP) to effectively leverage multiview information for SAR target classification. On the one hand, CLRBP can map high-dimensional bilinear features into low-dimensional and compact features by using low-rank representation. On the other hand, CLRBP exerts a composite decomposition to model similar and exclusive information from multiview features, yielding class-oriented feature vectors that are more representative and discriminative for classification. In this context, a novel multiview low-rank hybrid dilated network (MLHDN), composed of a parameter-sharing hybrid dilated convolution (HDC), a CLRBP, and a Softmax layer, is proposed to realize SAR target classification.
It is worth noting that some recent studies, including [25]–[28], focused on multiview-based DL for SAR classification. However, the major conceptual differences between those existing methods and ours are twofold: 1) We exploit bilinear pooling to leverage multiview features and low-rank representation to reduce their dimensions, whereas existing methods simply employed concatenation or summation to fuse multiview features. 2) We introduce unpadded dilated convolutional layers to enlarge the receptive field and aggregate global information, whereas previous studies mainly used standard convolutional kernels, and no pooling layers are used in our network. Moreover, although CLRBP is inspired by [31], there exist several salient distinctions: firstly, CLRBP consists of two independent projection matrices, as we decompose the features into two disparate parts; secondly, one projection matrix in CLRBP is constrained to be shared among different views to learn similar features using fewer parameters; thirdly, a learnable weight is employed to modulate the balance between similar and exclusive features for each class.
In this context, the main contributions of our work can be summarized as follows.
1) We propose a novel fusion method, namely CLRBP, which reduces the dimensions of multiview features by using two low-rank projections for the similar and exclusive information, respectively. Based on a composite decomposition, it has fewer learnable parameters, thus reducing the risk of overfitting under the circumstance of limited training samples. In addition, a parameter-sharing HDC is introduced to extract more informative features since it can increase the receptive field and aggregate global information.
2) We propose a novel end-to-end network, MLHDN, integrating a parameter-sharing HDC, a CLRBP, and a Softmax layer to realize SAR target classification with limited training samples. To the best of our knowledge, this network is unique in the related literature. The experimental results demonstrate that MLHDN is extendable in the number of views. In addition, MLHDN is superior to other counterparts in terms of robustness to the view interval angle, robustness to aspect-angle estimation error, generalization performance, and classification accuracy with limited training samples.

II. PROPOSED METHOD
The proposed end-to-end DL architecture (MLHDN) for SAR target classification is graphically illustrated in Fig. 1. Firstly, the input multiview images are rotated by their specific aspect angles. Secondly, a parameter-sharing HDC is introduced to learn multiview features. Thirdly, a CLRBP is proposed to learn representative and discriminative feature vectors. Finally, a Softmax activation is adopted to yield the prediction probability for each class.

A. MULTIVIEW SAR IMAGE ACQUIREMENT
In general, multiple SAR images of different aspects of a target can provide more information than a single SAR image, which contributes to SAR-ATR [32]. The SAR sensor of an airborne radar system can acquire a series of images of a ground target with diverse aspect angles during a flight. However, there is a tough tradeoff between recognition accuracy and task feasibility in terms of time. Accordingly, only a few SAR images can be obtained, and these images, recording a target from diverse aspect angles, make up a SAR image set. To alleviate this issue, the author of [26] proposed a novel multiview SAR image generation method, which can generate adequate multiview SAR images from a SAR image set based on image combinations, i.e., given the view interval angle and the number of views, a very large number of view combinations of SAR images per class can be obtained.
For example, given a view interval angle of 360° and a view number of 4, the theoretically largest number of combinations from 10 images per class is A^4_10 if the order is considered and C^4_10 if it is not. However, with a smaller view interval angle such as 45°, this combination number declines rapidly because many combinations do not fall within the same interval.
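The combination counts above can be checked with a few lines of Python (a sketch using only the standard library; A^4_10 and C^4_10 denote 4-element permutations and combinations of 10 images):

```python
from math import comb, perm

# 4-view combinations drawn from 10 raw images per class,
# with a 360-degree view interval (any subset is admissible).
ordered = perm(10, 4)    # order considered: A^4_10
unordered = comb(10, 4)  # order ignored:    C^4_10

print(ordered, unordered)  # 5040 210
```

So even 10 raw images per class already yield thousands of ordered multiview samples, which is the basis of the sample-generation strategy above.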
Furthermore, considering the time complexity, a random selection over all generated image combinations is needed [26]. To be specific, a sample-balanced strategy is adopted in our experiments, i.e., supposing the total number is set to 2000 for each class, 200 image combinations are randomly selected for each sample (an image combination belongs to a sample if that sample occupies the first position).
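A minimal sketch of this sample-balanced selection might look as follows (the function name and details are our own illustration under the stated strategy, not the authors' code):

```python
import itertools
import random

def sample_balanced_combinations(image_ids, n_views=4, per_sample=200, seed=0):
    """For each raw image, randomly keep `per_sample` ordered view
    combinations in which that image occupies the first position."""
    rng = random.Random(seed)
    selected = []
    for first in image_ids:
        # All ordered (n_views - 1)-tuples of the remaining images.
        rest_pool = list(itertools.permutations(
            [i for i in image_ids if i != first], n_views - 1))
        picks = rng.sample(rest_pool, min(per_sample, len(rest_pool)))
        selected.extend((first,) + rest for rest in picks)
    return selected

combos = sample_balanced_combinations(list(range(10)))
print(len(combos))  # 2000: 200 combinations per raw image x 10 raw images
```

With 10 raw images per class, each image heads 9 × 8 × 7 = 504 ordered 4-view combinations, so sampling 200 per image is always feasible and yields a balanced set of 2000 per class.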

B. HYBRID DILATED CONVOLUTION
Generally, HDC is composed of several 2D dilated (or atrous) convolutional layers, which are constructed by inserting zero holes between the elements of the convolutional kernel [33]. Compared to a standard convolutional layer, a dilated convolutional layer can increase the resolution of intermediate feature maps. However, a dilated convolutional layer causes a gridding issue, since the center pixel can only perceive information in a checkerboard fashion, losing a large portion of low-level information. To alleviate this problem, the authors of [34] proposed manually designing a group of dilated convolutional layers with different dilation rates to ensure that the final receptive field fully covers a square region without any holes. Supposing n convolutional layers with the same kernel size K × K have dilation rates r = [r_1, r_2, . . . , r_n], a constraint is defined as

M_i = max[ M_{i+1} − 2r_i, M_{i+1} − 2(M_{i+1} − r_i), r_i ], with M_n = r_n,

where M_i is the maximal distance between two nonzero values, and the ultimate goal is to let M_2 ≤ K. For example, take a kernel size K = 3 and r = [1, 3, 9, 27]. Then we have M_4 = 27, M_3 = 9, M_2 = 3, and M_1 = 1. Obviously, M_2 = 3 satisfies this goal, and these four dilation rates are also the theoretical maximum for a four-layer architecture. Based on this criterion, a parameter-sharing HDC is designed as the multiview feature extractor of our network. Compared to other extractors, HDC can effectively enlarge the receptive field and aggregate global information so that it obtains more representative features.
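The constraint can be verified numerically; the short sketch below back-computes M_i from M_n = r_n for the dilation rates used in our HDC (a sanity check written for this section, not the original implementation):

```python
def max_distances(rates):
    """M_i = max[M_{i+1} - 2*r_i, M_{i+1} - 2*(M_{i+1} - r_i), r_i],
    computed backwards from M_n = r_n (HDC design rule of [34])."""
    M = [0] * len(rates)
    M[-1] = rates[-1]
    for i in range(len(rates) - 2, -1, -1):
        M[i] = max(M[i + 1] - 2 * rates[i],
                   M[i + 1] - 2 * (M[i + 1] - rates[i]),
                   rates[i])
    return M

K = 3
M = max_distances([1, 3, 9, 27])
print(M)          # [1, 3, 9, 27]
print(M[1] <= K)  # True: M_2 <= K, so the receptive field has no holes
```

Any larger dilation rate in an earlier layer would push M_2 above K = 3 and reintroduce the gridding issue, which is why [1, 3, 9, 27] is the maximal four-layer setting.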

C. COMPOSITE LOW-RANK BILINEAR POOLING
We propose a CLRBP to obtain class-oriented feature vectors from the high-dimensional bilinear pooling feature maps. Firstly, a bilinear pooling [35] is used to calculate orderless bilinear representations from the features extracted by the HDC. The bilinear pooling operation can be formulated as

B_i = Σ_{f ∈ F_i} f f^T,

where F_i is the set of local features f ∈ R^d extracted by the HDC from the i-th multiview input, and B_i ∈ R^{d×d} is the resulting bilinear feature map. Since B_i has high dimensions, we attempt to reduce its dimensions and acquire a compact feature vector from B_i, which can be formulated as

S_i = vec(B_i) P,

where S_i ∈ R^M is a compact feature vector (M ≪ d × d) and P is a projection matrix. However, it may be unsuitable to directly learn P for B_i since B_i may contain redundant information between different views. On the one hand, different features f ∈ F_i may encompass similar information since they belong to an identical target. On the other hand, they may also encompass exclusive information since they come from diverse views with different aspect angles. In this circumstance, we propose to formulate S_i as

S_i = vec(B_i^s) P^s + vec(B_i^e) P^e,

where B_i^s represents similar features among images, which are irrelevant to the view, and B_i^e represents exclusive features belonging to different images. The addition operation is used to fuse the two features.
However, it is difficult to directly fetch B_i^s and B_i^e because they are closely entangled in B_i. Therefore, we only decompose P to approximate S_i. In this circumstance, S_i can be estimated as

S_i ≈ vec(B_i) P^s + vec(B_i) P^e,

where P^s is constrained to be shared between different views, and vec(B_i)P^s is the estimation of vec(B_i^s)P^s; P^e is unconstrained, and vec(B_i)P^e is the estimation of vec(B_i^e)P^e. Besides, a parameter λ is employed to modulate the balance. Thus, S_i is calculated as

S_i = vec(B_i) P^s + λ vec(B_i) P^e.

Finally, we impose a low-rank constraint when learning P^s and P^e based on the eigen-decomposition used in [31]. The projection matrices P^s and P^e have fewer learnable parameters under the low-rank constraint, and fewer parameters reduce the risk of overfitting when training samples are limited, which is beneficial to the generalization performance. Consequently, the j-th element S_ij of S_i is eventually formulated as

S_ij = tr((U_{j+} U_{j+}^T − U_{j−} U_{j−}^T) B_i) + λ_j tr((V_{j+} V_{j+}^T − V_{j−} V_{j−}^T) B_i),

where U_{j+}, U_{j−} ∈ R^{d×r_s/2} and V_{j+}, V_{j−} ∈ R^{d×r_e/2}. Since r_s and r_e are the two ranks related to P^s and P^e, we can control their respective numbers of learnable parameters by modulating the corresponding ranks: r_s/2 is the second dimension of U_{j+} and U_{j−}, i.e., r_s controls the low-rank level of P^s; similarly, r_e/2 is the second dimension of V_{j+} and V_{j−}, and r_e controls the low-rank level of P^e. With a smaller r_s or r_e, the corresponding projection matrix has a lower degree of freedom, contributing to more compact feature learning.
Once the two hyperparameters r_s and r_e are determined, the dimensions of the learnable parameters are fixed; they are then randomly initialized and learned from data during training. Finally, we obtain a compact feature vector S_i from F_i.
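As an illustration, a forward pass of CLRBP can be sketched in NumPy. The weights below are random stand-ins for the learned parameters (in practice they would be trained end to end), and the shapes follow the paper's settings d = 128, M = 10, r_s = 4, r_e = 2:

```python
import numpy as np

rng = np.random.default_rng(0)
d, M, r_s, r_e = 128, 10, 4, 2   # feature dim, classes, two ranks

# Hypothetical multiview local features for one sample
# (e.g. 4 views x 9 spatial locations, each a d-dim feature).
F = rng.standard_normal((36, d))

# 1) Orderless bilinear pooling: sum of outer products -> d x d matrix.
B = F.T @ F

# 2) Composite low-rank projection: for each class j, the shared
#    ("similar") part uses U_plus, U_minus of rank r_s, the
#    unconstrained ("exclusive") part uses V_plus, V_minus of rank r_e,
#    and lam[j] balances the two.
U_plus = rng.standard_normal((M, d, r_s // 2))
U_minus = rng.standard_normal((M, d, r_s // 2))
V_plus = rng.standard_normal((M, d, r_e // 2))
V_minus = rng.standard_normal((M, d, r_e // 2))
lam = rng.standard_normal(M)

S = np.array([
    np.trace((U_plus[j] @ U_plus[j].T - U_minus[j] @ U_minus[j].T) @ B)
    + lam[j] * np.trace((V_plus[j] @ V_plus[j].T - V_minus[j] @ V_minus[j].T) @ B)
    for j in range(M)
])
print(S.shape)  # (10,): one class-oriented score per class, fed to Softmax
```

Note how the low-rank factorization keeps the parameter count at M(d·r_s + d·r_e + 1) instead of the M·d² a dense projection of vec(B) would require.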

D. MULTIVIEW LOW-RANK HYBRID DILATED NETWORK
With the above major components at hand, we can now build our network. Firstly, the SAR images are spatially downsampled to a fixed size of 83 × 83, and they are rotated by their aspect angles to reduce the sensitivity of classification accuracy to orientation differences [26]. Then, the designed parameter-sharing HDC is introduced to learn multiview features. The parameter-sharing HDC consists of 4 unpadded dilated convolutional layers, where the kernel size is set to 3 × 3 for each layer. In addition, the numbers of filters and the dilation rates are empirically set to [16, 32, 64, 128] and [1, 3, 9, 27], respectively. Note that the filter setting follows the previous network [14], and the dilation-rate setting aims to maximize the dilation rate of each layer. Next, bilinear pooling is applied to the multiview features, and the obtained bilinear pooling feature maps are further projected into class-oriented feature vectors via the proposed CLRBP, where M is set to 10, which equals the number of classes, and r_s and r_e are experimentally set to 4 and 2, respectively. Finally, a Softmax activation is applied to the feature vectors to yield the prediction probability for each class. The whole architecture of the proposed MLHDN is reported in Table 1.
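With these settings, the spatial size of the feature maps after each unpadded dilated layer can be traced as a quick sanity check (input 83 × 83, kernel 3 × 3, dilation rates [1, 3, 9, 27]; this is our own verification sketch, not the network code):

```python
def valid_dilated_out(size, kernel=3, dilation=1):
    """Output spatial size of an unpadded (VALID) dilated convolution:
    the effective kernel size is dilation * (kernel - 1) + 1."""
    return size - dilation * (kernel - 1)

sizes = [83]
for rate in [1, 3, 9, 27]:
    sizes.append(valid_dilated_out(sizes[-1], dilation=rate))
print(sizes)  # [83, 81, 75, 57, 3]
```

The four layers thus shrink the 83 × 83 input to 3 × 3 maps with 128 channels, which are what the bilinear pooling stage consumes, without any pooling layer and without losing resolution through striding.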
It is worth noting that the network weights are randomly initialized, cross-entropy is employed as the loss function, and Adam is adopted as the optimizer. In the training phase, the minibatch size is set to 32, the initial learning rate is empirically set to 0.001, and the number of epochs is set to 10.

III. EXPERIMENTAL RESULTS AND ANALYSIS

A. SAR DATA SETS
The moving and stationary target acquisition and recognition (MSTAR) data sets [36] are utilized to evaluate the classification performance of our proposed method. The MSTAR data sets contain hundreds of thousands of X-band SAR images of ground targets, covering different target types, aspect angles, depression angles, serial numbers, and articulations. The publicly released data sets include ten categories of ground targets: BMP-2, BRDM-2, BTR-60, and BTR-70 (armored personnel carriers); T-62 and T-72 (tanks); 2S1 (rocket launcher); ZSU-234 (air defense unit); ZIL-131 (truck); and D7 (bulldozer). The targets have been measured over the full 360° of azimuth angles with 1°–5° increments. The targets are presented at 0.3 m resolution, and most images have 128 × 128 pixels. Fig. 2 visually depicts the ten types of targets.

B. EXPERIMENTAL SETTINGS
• For the preparation of the training and test image sets, we follow the multiview data formation strategy used in [26]. Note that the training interval in our experiments is set to 360°, and we take the order into consideration to create enough image combinations from a few training samples; 2000 training image combinations per class are selected as mentioned above. For the generation of the test image combinations, we do not consider the order, and the test interval is analyzed in the following experiments. Similarly, 2000 test image combinations per class are randomly selected from a large set. The training and test samples are shown in Table 2.
• For performance comparison, we implement five DL-based methods, including a multi-layer perceptron (MLP) [36], deep convolutional neural networks (DCNN) [37], the all-convolutional networks (A-ConvNet) [14], the convolutional highway unit network (CHU-CNN) [16], and four-view deep CNNs (4-VDCNN) [26]. To better evaluate the proposed method, we also implement MLHDN with 1 view, which only needs one image as input. Note that all six methods use the aspect-angle information to rotate the input images.
• Experimental results are quantitatively evaluated using classification accuracy, parameter size, and computational time. It should be noted that all implementations were carried out using TensorFlow 1.12.0 and Keras 2.2.4 on an Ubuntu 16.04 LTS desktop PC equipped with a GeForce GTX 1070 (8 GB), an Intel Xeon E3 CPU (at 3.3 GHz × 8), and 32 GB of RAM. Moreover, we conducted ten independent runs of each experiment and report the averaged accuracies with standard deviations.

C. EXPERIMENT 1-HYPERPARAMETER ANALYSIS
In this experiment, we analyze the impacts of r_s and r_e on classification accuracy. To this end, r_s and r_e are each set within the same range of {2, 4, 6, 8, 10}. The experimental results are reported in Table 3, where we find that the proposed method is insensitive to the two hyperparameters, and when r_s = 4 and r_e = 2, MLHDN achieves the highest accuracy with an OA of 96.13%. However, with a larger r_s or r_e, the corresponding learnable P^s or P^e becomes more complex, which may harm the generalization performance. Considering this, relatively small values are suitable for these two hyperparameters when using limited training samples. Thus, in the remaining experiments, the values of r_s and r_e are set to 4 and 2, respectively. In addition, the impact of the number of views on classification accuracy is analyzed in this experiment. As we can see from Fig. 3, the accuracy becomes higher as the number of views increases. For instance, the proposed method obtains an OA of 97.23% when the number of views reaches 8. Note that 4-VDCNN was designed for four views, so the number of views is experimentally set to 4 in the following experiments for a fair comparison. This result demonstrates that MLHDN scales well with the number of views.

D. EXPERIMENT 2-ABLATION ANALYSIS
In this experiment, we evaluate the contributions of different ingredients to the classification performance of MLHDN. Firstly, we analyze the impact of the parameter-sharing HDC on OA. To this end, we implement three variants of MLHDN: replacing HDC with standard convolution and max pooling, replacing HDC with standard convolution and average pooling, and replacing HDC with two standard convolutions (stride = 1 and stride = 2). As shown in Table 4, the proposed method achieves the best classification performance among the variants, with improvements of around 11–15% in terms of OA. As for class-specific accuracies, our method obtains the best results for the different classes. We also observe that traditional CNN-based methods with several interleaved standard convolutional layers and max or average pooling layers exhibit similar performance, with OAs lower than 85%. This result demonstrates the superiority of HDC over traditional interleaved convolution and pooling architectures.
Secondly, we evaluate the effects of CLRBP on OA. To fulfill this objective, we implement three variants of MLHDN: replacing CLRBP + Softmax activation with FC, replacing CLRBP + Softmax activation with BP + FC, and replacing CLRBP with LRBP. As reported in Table 5, the proposed method obtains the highest OA, with improvements of 0.35–1.27% over the other counterparts. Note that, although these three variants realize comparable performance, the proposed CLRBP + Softmax activation obtains the best performance with the smallest parameter size. We also find that MLHDN achieves the best class-specific accuracies for six of the ten classes. In particular, by employing BP, LRBP, and CLRBP, MLHDN obtains stepwise improvements in terms of OA, which certifies the effectiveness of the proposed CLRBP operation. In addition, comparing HDC with CLRBP, we can see that they jointly contribute to the performance improvements of MLHDN. However, the contribution of HDC is more significant than that of CLRBP, i.e., HDC contributes to the improvement from 84.58% to 94.86%, and CLRBP leads to a further improvement from 94.86% to 96.13%. Note that there are two reasons for the superiority of MLHDN over the other variants when using limited training samples. Firstly, the parameter-sharing HDC can effectively aggregate global information without resolution loss, while resolution is a crucial factor for accurate target recognition [10]. Secondly, the CLRBP can extract more compact features by utilizing the low-rank property intrinsic to multiview SAR images.

E. EXPERIMENT 3-VIEW INTERVAL ANGLE ANALYSIS
In this experiment, we test the impact of the view interval angle on OA. Due to the limited samples, the training view interval angle is set to 360° in order to generate enough multiview images. In contrast, the test view interval angle may be fixed by an actual reconnaissance task. As shown in Fig. 4, where we also provide the results of the compared methods, the proposed method obtains the highest OAs in all cases. It is interesting to note that MLHDN (1-view) also obtains higher accuracy than the other methods. According to the results, when the time for a reconnaissance task is sufficient, which corresponds to our experiment under a view interval angle of 135°, the proposed method achieves an OA of 98.13%. This is due to the fact that images with a large view interval angle may contain more diverse information. In contrast, when the time is extremely limited, which corresponds to our experiment under a view interval angle of 15°, MLHDN obtains a tolerable OA of 94.67%. Note that in the following experiments the view interval angle is set to 45°, the same as that of 4-VDCNN, for a fair comparison. This result demonstrates that MLHDN is robust to the view interval angle.

F. EXPERIMENT 4-ASPECT-ANGLE ESTIMATION ERROR ANALYSIS
In general, the actual aspect angles have to be estimated, and the estimation errors of the target aspect angle may impact the recognition performance. As shown in Fig. 5, the performance of each method decreases as the standard deviation increases. This can be explained by the fact that, as the aspect-angle estimation error becomes larger, the information induced from the wrongly estimated aspect angles deteriorates the training, thus decreasing the prediction accuracy. It is worth noting that MLHDN still yields better performance than the other methods. For instance, MLHDN obtains an OA of 90.68% when the standard deviation reaches 10, which is still acceptable for practical applications. This result demonstrates that MLHDN is robust to aspect-angle estimation errors.

G. EXPERIMENT 5-GENERALIZATION PERFORMANCE ANALYSIS
In this experiment, we evaluate the generalization performance of the proposed method under limited training samples per class. To this end, we randomly choose different numbers of training samples per class from the range {5, 6, . . . , 15}. Fig. 6 plots the evolution of classification accuracy with the number of training samples per class. As shown in the figure, the proposed method achieves the best result in all cases. In particular, MLHDN obtains a tolerable OA of 88.28% with only 5 training samples per class, which is 6.93–40.90% higher than the others. In addition, MLHDN obtains a remarkable OA of 98.67% when using 15 samples per class for training, which is 2.88–12.62% higher than the others. Therefore, the results demonstrate that MLHDN has better generalization performance than the other counterparts under limited training samples. Note that the performance of 4-VDCNN is poor with limited training samples; however, its OAs increase greatly as the number of training samples becomes larger, and when the number reaches 15, 4-VDCNN can still provide competitive results. In addition, the other traditional methods also yield good results since they resort to the aspect-angle information.

H. EXPERIMENT 6-CLASSIFICATION RESULTS AND COMPLEXITY ANALYSIS
In this final experiment, we report the classification results obtained by the proposed method and other state-of-the-art DL methods with 10 training samples per class. The experimental results are reported in Table 6, where MLHDN significantly outperforms the others with an OA of 96.13%. For class-specific accuracies, MLHDN obtains the highest accuracy in all classes. In particular, MLHDN yields 100% accuracy for three of the ten classes, i.e., T62, ZIL131, and ZSU_23_4. As for the κ statistic, MLHDN also performs best with a κ of 95.70%, which is 4.47–25.16% higher than the other methods.
In addition, we also report the parameter size, training time, and test time of the different methods. As reported in Table 7, MLHDN contains far fewer parameters than MLP, A-ConvNet, CHU-CNN, and 4-VDCNN, which may explain the superiority of our method for SAR target classification, because excessive parameters may lead to poor generalization performance when faced with limited training samples. More importantly, the performance improvement does not sacrifice efficiency. Specifically, the training and test times of our method are on the same level as those of the other methods. For example, the training time of our method is 627 s, which is slightly higher than that of 4-VDCNN, whereas the test time of our method is 144 s, almost the same as that of 4-VDCNN.

IV. CONCLUSION
Focusing on the limitations of existing multiview-based SAR target classification methods when faced with limited training samples, this paper presented a novel multiview low-rank hybrid dilated network. In the proposed method, two major components, a parameter-sharing HDC and a CLRBP, contribute most to the performance improvements. The parameter-sharing HDC is designed to learn multiview features using four dilated convolutional layers with different dilation rates. HDC is capable of extracting more informative features since it can increase the receptive field and aggregate global information. The CLRBP is proposed to fuse multiview features by using bilinear pooling, low-rank representation, and composite decomposition. CLRBP projects the high-dimensional bilinear feature maps into low-dimensional, class-oriented feature vectors. On the one hand, CLRBP improves the discriminability of the learned features for classification. On the other hand, CLRBP reduces the risk of overfitting with limited training samples since the projection directly contributes to a small parameter size.
Through comprehensive experiments on the MSTAR data sets, we have observed that the proposed method outperforms other traditional and state-of-the-art SAR target classification methods using limited training samples. Compared to 4-VDCNN, MLHDN has fewer parameters and incurs tolerable running time. In the tests of view interval angle and aspect-angle estimation error, MLHDN is more robust than the others. As for generalization performance with limited training samples, MLHDN produces the highest accuracies in all cases. As reported in the detailed experimental results with only 10 training samples per class, MLHDN obtains the best OA of 96.13% and κ of 95.70%, and it also yields the best class-specific accuracies in all ten classes. In this context, the superiority of MLHDN over the others has been clearly verified.
Currently, we exploited a parameter-sharing HDC and a CLRBP for multiview-based SAR target classification with limited training samples. In the future, we will extend our method to few-shot learning which only needs very few training samples.