Automatic Modulation Recognition for Radar Signals via Multi-Branch ACSE Networks

Automatic modulation recognition (AMR) for radar signals plays a significant role in electronic warfare. Conventional recognition methods may suffer from low recognition accuracy and high computational complexity under low signal-to-noise ratio (SNR) conditions. In this paper, novel multi-branch Asymmetric Convolution Squeeze-and-Excitation (ACSE) networks, which use multi-domain features and a fusion strategy based on a support vector machine, are proposed to recognize eight kinds of radar signals. First, features of radar signals in the frequency domain, the autocorrelation domain, and the time-frequency domain are extracted. Then the obtained multi-domain features are converted into the input of the proposed networks, which have strong representational power and learning ability. Finally, the outputs of the multi-branch ACSE networks are fused via the fusion strategy to obtain the final results. Simulations verify the robustness and effectiveness of the fusion strategy. The results on the simulation dataset show that the proposed method can achieve more than 93% accuracy at −10 dB for all modulations. Compared with four recently proposed networks, the multi-branch ACSE networks achieve better performance under low SNR conditions. The results on measured signals show that the proposed method outperforms the comparison methods, especially for binary frequency-shift keying (BFSK) signals.


I. INTRODUCTION
In electronic countermeasures, the ability to recognize the modulation modes of enemy radar signals quickly and accurately gives priority in controlling battlefield information and the battlefield situation, which means that automatic modulation recognition (AMR) for radar signals plays a crucial role in electronic warfare [1]. AMR for radar signals is also widely used in many kinds of radars for both ships and airplanes in civilian applications [2]. Traditional AMR methods for radar signals are mainly based on the pulse description word (PDW), which contains the carrier frequency (CF), pulse width (PW), pulse amplitude (PA), time of arrival (TOA), and angle of arrival (AOA), and classify these signal characteristics by matching them with the corresponding feature parameters in a database [3]. With the development of complex system radar technology, radar signal parameters have become changeable and their features more and more hidden. The traditional methods may suffer from low recognition accuracy and high computational complexity under low signal-to-noise ratio (SNR) conditions and in complex electromagnetic environments.
The associate editor coordinating the review of this manuscript and approving it for publication was Ramesh Babu N.
Since the intra-pulse characteristic parameters are stable and robust, intra-pulse feature analysis provides an efficient way to recognize different radar signal modulations. A growing body of literature focuses on intra-pulse feature extraction, including time domain analysis [4], frequency domain analysis, modulation domain analysis [5], high-order statistical analysis [6], and spectral correlation analysis [7]. Thanks to the rise of deep learning (DL), neural networks such as deep neural networks (DNN) [8], [9], convolutional neural networks (CNN) [9]-[12], and recurrent neural networks (RNN) [14], [15], as well as algorithms such as auto-encoders (AE) [16] and restricted Boltzmann machines (RBM) [17], have been combined with feature-extraction-based methods to improve recognition performance in AMR and overcome the shortcomings of conventional methods. In [8], a fully connected DNN with unsupervised pretraining is proposed to classify modulation signals over different channels, and independent auto-encoders are applied to learn multiple hidden-node attributes. The method achieves 100% accuracy for noise-free signals, but the accuracy is 94.5% at 15 dB. In [9], a DNN-based modulation classifier for single-input single-output systems and the corresponding preprocessing methods are introduced. Simulation results show that the classifier approaches the performance of the ideal maximum likelihood method under Rayleigh fading channels and uncertain noise, and outperforms existing maximum likelihood-based classifiers.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
In [10], a cost-efficient CNN for robust AMR in communication systems is proposed and tested on the RML2018.01A dataset. The method achieves over 93% accuracy at 20 dB for 24 challenging modulations. In [11], a method based on the AlexNet convolutional neural network and the features of the smooth pseudo-Wigner-Ville distribution (SPWVD) is proposed to extract various feature details of radar signals. The simulation results demonstrate that the overall recognition accuracy of all radar signal modulations, except the QPSK signal, is more than 90% at −6 dB. In [12], a CNN-based AMR method with multi-feature fusion, which includes cyclic spectra features and constellation diagram features, is proposed to classify eleven kinds of communication signals. The experimental results show that the recognition accuracy is superior or equivalent to that of other DL-based methods. In [13], a deep CNN method is proposed to classify four kinds of radar signals by extracting bi-spectrum information. The simulation results demonstrate that the overall recognition accuracy is more than 88% at −5 dB.
In [14], a method based on RNNs is proposed to automatically classify digital modulation signals with noise at different SNRs. The simulation results show that the noise-added signals can be recognized with a success rate of 94.72%. In [15], a novel framework in which a CNN, an RNN, and a generative adversarial network (GAN) cooperate is constructed for AMR. The framework takes full advantage of spatial and temporal features and is tested on the open-source dataset RML2016.10a. The simulation results demonstrate that the method achieves 94% accuracy at high SNR. In [16], a method using stacked sparse AEs and ambiguity function images of signals is proposed. The simulation results show that the average recognition accuracy is 99.8% at 15 to 25 dB, 99.6% at 5 to 15 dB, 98.4% at −5 to 5 dB, and 90.4% at −10 to 0 dB, respectively. In [17], an AMR method for radar signals based on a deep RBM is proposed to extract features and recognize radar signals. The simulation experiments indicate that the method has a powerful recognition ability and strong robustness. In [18], a deep feature selection network and three features are applied to radar signal recognition. The approach is verified on five different types of radar signals and obtains good classification performance.
However, most existing AMR methods focus on communication or digital modulation signals, such as those in the open-source RML datasets, and these methods are not completely suitable for radar signals because radar signals are non-cooperative and their parameters are confidential. For radar signals, there are problems of strong subjectivity and feature redundancy, especially for handcrafted features, which are determined by the given types of modulations [9]. Some AMR methods for radar signals are limited to a small number of modulation modes (typically 4 to 5) and cannot achieve satisfactory recognition performance under low SNR conditions.
To solve these problems, a novel AMR method for radar signals based on multi-branch Asymmetric Convolution Squeeze-and-Excitation (ACSE) networks and multi-domain feature fusion is proposed to recognize eight kinds of common radar signals. The main novelties and contributions of our work include:
1) Multi-domain feature extraction. We cover more categories of radar signals than most existing literature, and robust features in the frequency domain, autocorrelation domain, and time-frequency domain are extracted and converted into image formats to be recognized.
2) Multi-branch ACSE networks and fusion strategy. Each branch ACSE network is employed to learn and process the features of one domain, and a fusion strategy based on a support vector machine (SVM) is applied to improve the recognition performance.
3) Results on both simulation and measured signals. The simulation datasets for training and validation are constructed by referring to the literature, and the proposed method is tested and verified on three kinds of measured signals.
This paper is organized as follows. Section II introduces the material, which includes the eight kinds of radar signals and their multi-domain features. The details of the proposed method and its important parameters and hyper-parameters can be found in Section III. Section IV presents results and comparisons on simulation and measured signals. The contribution of this paper, the overall results, and the discussion of our future work are summarized in Section V.

II. MATERIAL
In this section, eight kinds of radar signal modulations and methods of feature extraction in the frequency domain, autocorrelation domain, and time-frequency domain are introduced.

A. EIGHT KINDS OF RADAR SIGNAL MODULATIONS
With the development of electronic technology, different radar systems have different intra-pulse modulation modes. There are eight common modulation modes for radar signals: binary amplitude shift keying (BASK), binary frequency-shift keying (BFSK), binary phase-shift keying (BPSK), conventional wave (CW), linear frequency modulation (LFM), sinusoidal frequency modulation, exponential frequency modulation, and stepping frequency wave (SFW).

1) BINARY AMPLITUDE SHIFT KEYING (BASK)
Unipolar non-return-to-zero codes are employed for BASK signals to control the amplitude of the sinusoidal carrier. The mathematical formula of BASK is:

$$s(t) = A_m\, g(t)\cos(2\pi f_c t + \phi)$$

where $A_m$ is the amplitude of the signal, $f_c$ is the carrier frequency, $g(t)$ is a unipolar non-return-to-zero code, and $\phi$ is the phase.

2) BINARY FREQUENCY-SHIFT KEYING (BFSK)
There are two changeable carrier frequencies in BFSK signals, which are modulated by unipolar non-return-to-zero codes. BFSK signals are widely used in low probability of interception radar systems due to their good Doppler and range resolution. The BFSK signal model is expressed in terms of the following quantities: $A_m$ is the amplitude of the signal, $f_i$ is the $i$-th modulated frequency, $M$ is the number of modulated frequencies, $N$ is the number of code elements, and $T_p$ is the width of a code element.

3) BINARY PHASE-SHIFT KEYING (BPSK)
The phases of BPSK signals are modulated by the binary code while the carrier frequency is constant. BPSK is widely used in radar systems with strong concealment and anti-jamming ability. The BPSK signal model is expressed in terms of the following quantities: $A_m$ is the amplitude of the signal, $f_c$ is the carrier frequency, $\varphi_i$ is the $i$-th modulated phase, $M$ is the number of modulated phases, $N$ is the number of code elements, and $T_p$ is the width of a code element. In our study, the modulation code is a Barker code with 13 code elements for BASK, BFSK, and BPSK.
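The three coded modulations above can be illustrated with a minimal NumPy sketch. It assumes a real cosine carrier, a 1 GHz sampling rate, and a 1 µs pulse; the function names, the second BFSK frequency `f2`, and the default carrier frequency are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

# Barker code of length 13 (a standard sequence), written as +1/-1 chips.
BARKER_13 = np.array([1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1])

def chip_sequence(n_samples):
    """Stretch the 13 chips over n_samples samples (chip lengths differ by at most one sample)."""
    idx = np.arange(n_samples) * len(BARKER_13) // n_samples
    return BARKER_13[idx]

def coded_pulse(kind, fc=100e6, f2=150e6, fs=1e9, pw=1e-6, amp=1.0):
    """Return one real-valued pulse; `kind` is 'BASK', 'BFSK' or 'BPSK'."""
    n = int(round(fs * pw))
    t = np.arange(n) / fs
    chips = chip_sequence(n)
    bits = (chips + 1) // 2                 # unipolar code: +1 -> 1, -1 -> 0
    if kind == "BASK":                      # the code switches the carrier amplitude
        return amp * bits * np.cos(2 * np.pi * fc * t)
    if kind == "BFSK":                      # the code selects one of two frequencies
        freq = np.where(bits == 1, fc, f2)  # phase is not kept continuous at chip edges
        return amp * np.cos(2 * np.pi * freq * t)
    if kind == "BPSK":                      # the code flips the carrier phase by pi
        return amp * chips * np.cos(2 * np.pi * fc * t)
    raise ValueError(f"unknown modulation: {kind}")
```

The Barker-13 sequence is chosen in such systems because its aperiodic autocorrelation sidelobes have magnitude at most 1 against a peak of 13.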

4) CONVENTIONAL WAVE (CW)
The carrier frequency of CW signals is constant, with no frequency or phase modulation. CW signals are widely used in conventional pulse radar systems. The mathematical formula of CW signals is:

$$s(t) = A_m\,\mathrm{rect}\!\left(\frac{t}{T}\right)\cos(2\pi f_c t + \phi)$$

where $A_m$ is the amplitude of the signal, $f_c$ is the carrier frequency, $T$ is the pulse width, $\mathrm{rect}(\cdot)$ denotes a rectangular window, and $\phi$ is the phase.

5) LINEAR FREQUENCY MODULATION (LFM)
The frequencies of LFM signals are modulated linearly with time. LFM signals are widely used in radar systems thanks to their good velocity resolution and range resolution. The mathematical formula of LFM signals is:

$$s(t) = A_m\cos(2\pi f_c t + \pi k t^2 + \phi)$$

where $A_m$ is the amplitude of the signal, $f_c$ is the carrier frequency, $\phi$ is the phase, and $k$ is the slope of the frequency modulation.
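To make the role of the slope $k$ concrete, the following sketch generates a complex LFM pulse and estimates its instantaneous frequency from the sample-to-sample phase difference. It assumes that the sweep starts at the carrier frequency and covers a bandwidth `bandwidth` over the pulse, so that $k$ = bandwidth / pulse width; the function names and default values are hypothetical.

```python
import numpy as np

def lfm_pulse(fc=100e6, bandwidth=25e6, fs=1e9, pw=1e-6, amp=1.0, phase=0.0):
    """Complex LFM pulse whose frequency rises linearly from fc to fc + bandwidth."""
    n = int(round(fs * pw))
    t = np.arange(n) / fs
    k = bandwidth / pw                       # chirp slope in Hz/s
    inst_phase = 2 * np.pi * fc * t + np.pi * k * t**2 + phase
    return amp * np.exp(1j * inst_phase), t

def instantaneous_frequency(x, fs):
    """Estimate frequency in Hz from the phase difference of consecutive samples."""
    return np.angle(x[1:] * np.conj(x[:-1])) * fs / (2 * np.pi)
```

With these defaults the estimated frequency grows by $k / f_s$ = 25 kHz per sample, which is the linear sweep the text describes.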

6) NONLINEAR FREQUENCY MODULATION (SIN AND EXP)
The frequencies of NLFM signals are modulated nonlinearly with time (typically sinusoidally or exponentially), and NLFM signals are widely used in novel system radars. The NLFM signal model is expressed in terms of the amplitude $A_m$, the phase $\phi$, and a nonlinear function $f(t)$, which in this paper is either the sinusoidal modulation (SIN) or the exponential modulation (EXP): $f(t) = A\sin(\omega t + \phi)$ stands for a sinusoidal modulation and $f(t) = Ae^{t+\alpha}$ stands for an exponential modulation.

7) STEPPING FREQUENCY WAVE (SFW)
SFW consists of a series of radar pulses with a linearly stepped frequency, and the frequency of SFW increases with each step. SFW can reduce the instantaneous bandwidth required of the digital signal processor while obtaining high range resolution. Therefore, SFW is widely used in synthetic aperture radar (SAR) and inverse synthetic aperture radar (ISAR). The SFW signal model is expressed in terms of the following quantities: $T_r$ is the pulse repetition period, $\tau$ is the pulse width, $f_0$ is the starting frequency, $\Delta f$ is the frequency increment, and $N$ is the number of stepping frequencies.
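The stepping structure can be sketched as a pulse train in which pulse $i$ has carrier frequency $f_0 + i\,\Delta f$. The sketch below is an illustration under stated assumptions: a real cosine carrier referenced to absolute time, and hypothetical default values chosen only so that the example is short.

```python
import numpy as np

def sfw_train(f0=50e6, df=10e6, n_steps=4, tau=0.2e-6, tr=0.25e-6, fs=1e9):
    """Pulse train in which pulse i has carrier frequency f0 + i*df."""
    n_total = int(round(n_steps * tr * fs))
    t = np.arange(n_total) / fs
    x = np.zeros(n_total)
    step_freqs = f0 + df * np.arange(n_steps)
    for i, fi in enumerate(step_freqs):
        start = int(round(i * tr * fs))      # pulse i starts at i * T_r
        stop = start + int(round(tau * fs))  # and lasts for the pulse width tau
        x[start:stop] = np.cos(2 * np.pi * fi * t[start:stop])
    return x, step_freqs
```

Each pulse occupies only a narrow band around its own step frequency, while the train as a whole spans the bandwidth from $f_0$ to $f_0 + (N-1)\Delta f$.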

B. MULTI-DOMAIN FEATURES
The features in the frequency domain, autocorrelation domain, and time-frequency domain are widely used in AMR for radar signals. In our approach, these three features are extracted and fused using the SVM-based fusion strategy of ensemble learning.

1) FREQUENCY DOMAIN FEATURES
In the frequency domain, the frequency information is important in AMR for both constant-frequency signals and frequency-modulated signals. The fast Fourier transform (FFT) is applied to analyze the received radar signals and obtain the frequency domain features. Without loss of generality, the received signal $x(t)$ can be defined as:

$$x(t) = s(t) + n(t), \quad 0 \le t \le T$$

where $s(t)$ is the ideal signal, $T$ is the pulse width, and $n(t)$ is the noise. For the sampled signal $x(n)$ with $N$ samples, the FFT computes the discrete Fourier transform:

$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}, \quad k = 0, 1, \ldots, N-1$$
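As a minimal sketch of the frequency domain feature, the example below builds a noisy constant-frequency pulse $x(t) = s(t) + n(t)$ and takes the normalised magnitude spectrum with NumPy's FFT. The normalisation, the noise level, and the function name are assumptions made for illustration; the paper computes its features in MATLAB.

```python
import numpy as np

def frequency_feature(x):
    """Magnitude spectrum of a real pulse, normalised to a peak of 1."""
    spectrum = np.abs(np.fft.rfft(x))
    return spectrum / spectrum.max()

fs, pw, fc = 1e9, 1e-6, 100e6
t = np.arange(int(round(fs * pw))) / fs
rng = np.random.default_rng(0)
# Received signal: ideal carrier s(t) plus white Gaussian noise n(t).
x = np.cos(2 * np.pi * fc * t) + 0.5 * rng.standard_normal(t.size)
feature = frequency_feature(x)
# Frequency (in Hz) of the strongest spectral line.
peak_hz = np.fft.rfftfreq(t.size, d=1 / fs)[np.argmax(feature)]
```

The carrier concentrates its energy in one bin while the noise is spread over all bins, so the spectral peak remains at the carrier frequency.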

2) AUTOCORRELATION DOMAIN FEATURES
Autocorrelation is the cross-correlation of a signal with itself at different points in time. It measures the similarity of the same signal between two observations. The autocorrelation function (ACF) is widely applied to find periodic signals disturbed by noise and can suppress the influence of noise on the signals to some extent. The analytic signal obtained after sampling is modelled in terms of $A(n)$, $f(n)$, and $\phi(n)$, which are the amplitude modulation function, the frequency modulation function, and the phase modulation function, respectively, together with the sampling frequency $f_s$ and the initial signal phase $\phi_0$. The ACF is defined as the product of the signal and its delayed copy, where $m$ denotes the delay. The ACF depends only on the signal delay, the modulation frequency, and the modulation phase. Therefore, autocorrelation domain features are obtained by calculating the ACFs of radar signals.
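A minimal sketch of an ACF computation follows. It assumes the biased sample estimator, in which the lag products $x(n+m)\,x^*(n)$ are averaged over $n$; this is one common convention and is not necessarily the exact form used in the paper, and the function name is hypothetical.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Biased sample ACF r[m] = (1/N) * sum_n x[n+m] * conj(x[n]) for m = 0..max_lag."""
    n = len(x)
    # np.vdot conjugates its first argument, giving sum conj(x[n]) * x[n+m].
    return np.array([np.vdot(x[:n - m], x[m:]) / n for m in range(max_lag + 1)])
```

For a complex exponential with normalised frequency $f_0/f_s$, the phase of `r[m]` is $2\pi f_0 m / f_s$, which depends on the delay and the frequency but not on the initial phase. White noise, by contrast, contributes mainly at lag 0, which is why the ACF offers some noise suppression.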

3) TIME-FREQUENCY DOMAIN FEATURES
The Fourier transform integrates over all time, which removes the time-varying information of non-stationary signals. It therefore requires the signal to be stationary, and it is difficult for it to fully characterize time-varying non-stationary signals. Time-frequency analysis uses a joint function of time and frequency to describe the change of the signal spectrum with time and achieves an effective analysis of non-stationary signals.
Commonly used time-frequency analysis tools include the short-time Fourier transform (STFT), the Wigner-Ville distribution (WVD), the Choi-Williams distribution (CWD), and so on. Compared with STFT and WVD, CWD can effectively suppress cross-term interference and has higher time resolution and frequency resolution. CWD is a special kind of Cohen distribution, which can be viewed as a smoothed WVD: the Cohen distribution is obtained by smoothing the WVD $W_x(s, \xi)$ with a smoothing function of $(s - t, \xi - f)$. When the smoothing function is the exponential function $e^{-\alpha(\xi\tau)^2}$, the CWD is obtained.
After introducing these three feature extraction methods, multi-domain features are calculated on the MATLAB platform. Then the features are converted into image formats as the input of the multi-branch networks. Features of each modulation in the frequency domain, autocorrelation domain, and time-frequency domain under ideal conditions are shown in Fig. 1 to Fig. 8.
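The step from a time-frequency representation to an image can be sketched as follows. Note that this example substitutes the simpler STFT for the CWD used in the paper, purely to keep the code short, and that the 8-bit grey-level scaling is an assumed conversion; the paper does not state how its MATLAB features are rendered as images. All names and window settings are hypothetical.

```python
import numpy as np

def stft_magnitude(x, win_len=64, hop=16):
    """|STFT| with a Hann window; rows are frequency bins, columns are time frames."""
    window = np.hanning(win_len)
    starts = range(0, len(x) - win_len + 1, hop)
    frames = np.stack([x[s:s + win_len] * window for s in starts], axis=1)
    return np.abs(np.fft.fft(frames, axis=0))

def to_gray_image(feature):
    """Scale a non-negative 2-D feature map to 8-bit grey levels (0..255)."""
    scaled = feature / feature.max()
    return np.round(255 * scaled).astype(np.uint8)
```

Applied to an LFM pulse, the brightest pixel in each column moves to higher frequency bins over time, which is the kind of modulation-specific pattern the networks are expected to learn from the images.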

III. METHODOLOGY
In this manuscript, novel multi-branch ACSE networks with an SVM-based fusion strategy are proposed to learn and process the multi-domain features of radar signals. The multi-domain features of the eight modulations are extracted through the FFT, ACF, and CWD introduced above and converted into image formats as the input of the proposed networks. The proposed multi-branch ACSE networks consist of three single-branch ACSE networks, each of which recognizes the features of one domain; the outputs of the three branches are then fused by an SVM to obtain the final outputs.

A. THE STRUCTURE OF A SINGLE BRANCH ACSE NETWORKS
A single branch of the ACSE networks consists of 34 layers of ACSE units. Each ACSE unit contains an asymmetric convolution (AC) block [19] and a Squeeze-and-Excitation (SE) block [20]. The AC block uses one-dimensional (1-D) asymmetric convolution kernels to enhance the square convolution kernels (typically 3 × 3 kernels). The SE mechanism allows a network to recalibrate features, so that the network can learn to selectively emphasize informative features and suppress less useful ones using global information. An ACSE unit therefore combines these advantages to improve learnability and recognition performance. The specific architecture of an ACSE unit is shown in Fig. 9.

1) AC UNITS
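As shown in Fig. 9, the input images first pass through an AC unit, which consists of three branches of convolution layers and batch normalization layers. For a 3-D image input $M \in \mathbb{R}^{U \times V \times C}$, the $j$-th channel of the output feature map $O \in \mathbb{R}^{R \times T \times D}$ in a branch with batch normalization is:

$$O_{:,:,j} = \left(\sum_{k=1}^{C} M_{:,:,k} * F^{(j)}_{:,:,k} - \mu_j\right)\frac{\gamma_j}{\sigma_j} + \beta_j$$

where $F^{(j)}$ is the 3-D kernel of filter $j$, $\mu_j$ and $\sigma_j$ are the channel-wise mean and standard deviation in batch normalization, and $\gamma_j$ and $\beta_j$ are the scaling factor and bias. Concretely, to replace a 3 × 3 kernel, an AC unit comprises three parallel layers with square 3 × 3, asymmetric 1 × 3, and asymmetric 3 × 1 convolution kernels, whose outputs are added together to enrich the feature space. The three branches and their batch-normalization layers can be merged into a standard convolutional layer by adding the asymmetric kernels (1 × 3 and 3 × 1) onto the corresponding positions of the 3 × 3 square kernel. For filter $j$, let $F'^{(j)}$ denote the fused 3-D kernel and $b_j$ the resulting bias. Following [19], they can be calculated as:

$$F'^{(j)} = \frac{\gamma_j}{\sigma_j}F^{(j)} \oplus \frac{\bar{\gamma}_j}{\bar{\sigma}_j}\bar{F}^{(j)} \oplus \frac{\hat{\gamma}_j}{\hat{\sigma}_j}\hat{F}^{(j)}$$

$$b_j = -\frac{\mu_j\gamma_j}{\sigma_j} - \frac{\bar{\mu}_j\bar{\gamma}_j}{\bar{\sigma}_j} - \frac{\hat{\mu}_j\hat{\gamma}_j}{\hat{\sigma}_j} + \beta_j + \bar{\beta}_j + \hat{\beta}_j$$

where $\bar{F}^{(j)}$ is the corresponding asymmetric 1 × 3 convolution kernel, $\hat{F}^{(j)}$ is the corresponding asymmetric 3 × 1 convolution kernel, barred and hatted batch-normalization parameters belong to the respective branches, and $\oplus$ denotes addition at the corresponding kernel positions. Therefore, the output of an AC unit $O_{AC}$ for filter $j$ is:

$$O_{:,:,j} + \bar{O}_{:,:,j} + \hat{O}_{:,:,j} = \sum_{k=1}^{C} M_{:,:,k} * F'^{(j)}_{:,:,k} + b_j$$

where $O_{:,:,j}$, $\bar{O}_{:,:,j}$, and $\hat{O}_{:,:,j}$ represent the outputs of the square 3 × 3, asymmetric 1 × 3, and asymmetric 3 × 1 branches. Before being input into the next SE unit, the output features $O_{AC}$ are fused, resampled, and aggregated via a ReLU activation layer, a convolution layer, and a batch-normalization layer.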
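The kernel fusion described above can be checked numerically. The sketch below is a single-channel illustration with arbitrary batch-normalization statistics and a hand-written convolution, not the multi-channel PyTorch implementation used in the paper; all names and values are hypothetical.

```python
import numpy as np

def conv2d_same(image, kernel):
    """Single-channel 2-D cross-correlation with zero padding ('same' output size)."""
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros(image.shape)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def batch_norm(x, mu, sigma, gamma, beta):
    """Inference-time batch normalisation with fixed statistics."""
    return (x - mu) * gamma / sigma + beta

rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
k_sq = rng.standard_normal((3, 3))    # square 3x3 kernel
k_hor = rng.standard_normal((1, 3))   # asymmetric 1x3 kernel
k_ver = rng.standard_normal((3, 1))   # asymmetric 3x1 kernel
# One (mu, sigma, gamma, beta) tuple per branch; the values are arbitrary.
bn = {"sq": (0.1, 1.5, 0.9, 0.2), "hor": (-0.3, 0.8, 1.1, -0.1), "ver": (0.4, 1.2, 0.7, 0.05)}

# Three-branch output: convolve and normalise each branch, then sum the results.
branches = (batch_norm(conv2d_same(image, k_sq), *bn["sq"])
            + batch_norm(conv2d_same(image, k_hor), *bn["hor"])
            + batch_norm(conv2d_same(image, k_ver), *bn["ver"]))

# Fused kernel: scale each kernel by gamma/sigma and add the 1-D kernels onto
# the middle row and middle column of the 3x3 kernel.
fused = k_sq * bn["sq"][2] / bn["sq"][1]
fused[1:2, :] += k_hor * bn["hor"][2] / bn["hor"][1]
fused[:, 1:2] += k_ver * bn["ver"][2] / bn["ver"][1]
bias = sum(beta - mu * gamma / sigma for mu, sigma, gamma, beta in bn.values())
fused_out = conv2d_same(image, fused) + bias
```

Because convolution and inference-time batch normalization are both linear in the kernel, `fused_out` matches `branches` up to floating-point error, so the three training-time branches cost no more than one 3 × 3 convolution at inference.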

2) SE UNITS
As shown in Fig. 9, the features from an AC unit are input into an SE unit, which in the proposed method is an improvement of a residual unit. Usually, convolution operations in an ordinary CNN extract informative features by fusing spatial and channel-wise information within local receptive fields. An SE unit can recalibrate channel-wise feature responses by explicitly modelling interdependencies between channels through the squeeze operation and the excitation operation. For any given transformation $F_{tr}: X \to U$, $X \in \mathbb{R}^{H \times W \times C}$, $U \in \mathbb{R}^{H \times W \times C}$, the features $U$ are aggregated across the spatial dimensions $H \times W$ to produce a channel descriptor in a squeeze operation. Without loss of generality, let $F_{tr}$ denote a convolution operator; its outputs $U = [u_1, u_2, \ldots, u_C]$ can be calculated as:

$$u_c = v_c * X = \sum_{s=1}^{C} v_c^{s} * x^{s}$$

where $V = [v_1, v_2, \ldots, v_C]$ is the learned set of filter kernels. Global average pooling is applied to generate channel-wise statistics and squeeze the global spatial information. Mathematically, a statistic $z \in \mathbb{R}^{C}$ is obtained by shrinking $U$ through its spatial dimensions $H \times W$, where the $c$-th element of $z$ can be written as:

$$z_c = F_{sq}(u_c) = \frac{1}{H \times W}\sum_{i=1}^{H}\sum_{j=1}^{W} u_c(i, j)$$

where $F_{sq}(\cdot)$ is the squeeze operation. In an excitation operation, to fully capture the channel-wise dependencies, a simple gating mechanism with a sigmoid activation is employed:

$$s = F_{ex}(z, W) = \sigma\left(W_2\,\delta(W_1 z)\right)$$

where $F_{ex}(\cdot)$ is the excitation operation, $\sigma$ is the sigmoid function, $\delta$ is the ReLU activation function, $W_1 \in \mathbb{R}^{\frac{C}{r} \times C}$, and $W_2 \in \mathbb{R}^{C \times \frac{C}{r}}$. The gating mechanism is thus formed by two fully connected layers around the non-linearity, with reduction ratio $r$, which limits model complexity and improves generalization. The output $\tilde{X}_c$ is obtained by rescaling $U$ with the activations:

$$\tilde{X}_c = F_{scale}(u_c, s_c) = s_c\, u_c$$

where $F_{scale}(u_c, s_c)$ denotes a channel-wise multiplication between the feature map $u_c \in \mathbb{R}^{H \times W}$ and the scalar $s_c$.
Since the activations act as channel weights adapted to the descriptor, SE units can boost feature discriminability by introducing dynamics conditioned on the input. Finally, the output $\tilde{X}_c$ and the original input of the SE unit are summed to obtain the final output of the ACSE unit. In summary, the ACSE networks in a single branch explicitly enhance the representational power and robustness of the model and improve on ordinary convolutional neural networks at a minimal additional computational cost [19], [20]. In a real battlefield environment, noise strongly interferes with the received radar pulse trains and their multi-domain features. Compared with ordinary CNNs, ACSE networks, which have great representational power and adaptive feature selection ability, can reduce the noise interference to some degree and are more suitable for AMR.
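The squeeze, excitation, and scale steps of an SE unit, followed by the residual sum, can be sketched in a few lines of NumPy. The sketch assumes a single feature map of shape (H, W, C), random weights, and a random stand-in for the convolved features $U$; the names and sizes are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(u, w1, w2):
    """Squeeze-and-Excitation recalibration of a feature map u with shape (H, W, C)."""
    z = u.mean(axis=(0, 1))                    # squeeze: global average pooling -> (C,)
    s = sigmoid(w2 @ np.maximum(w1 @ z, 0.0))  # excitation: FC, ReLU, FC, sigmoid -> (C,)
    return u * s, s                            # scale: channel-wise multiplication

rng = np.random.default_rng(0)
H, W, C, r = 4, 4, 8, 2
x_in = rng.standard_normal((H, W, C))   # input of the unit (identity branch)
u = rng.standard_normal((H, W, C))      # stand-in for the convolved features U
w1 = rng.standard_normal((C // r, C))   # first FC layer reduces C to C/r
w2 = rng.standard_normal((C, C // r))   # second FC layer restores C
x_tilde, s = se_block(u, w1, w2)
out = x_tilde + x_in                    # residual sum described in the text
```

Each entry of `s` lies between 0 and 1 and rescales one whole channel, which is how the unit emphasizes some channels and suppresses others based on global information.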

B. THE FUSION STRATEGY
After the results of the three branches are obtained, a fusion strategy is applied to fuse them and output the final results. In general, a single classifier using single-domain features may generalize poorly due to misclassification, while combining multiple classifiers can reduce this risk. In addition, from a computational perspective, a learning algorithm can easily fall into a local minimum, and the generalization performance of some local minima may be very poor. The risk of falling into such poor local minima can be reduced by combining multiple classifiers.
Therefore, for a multi-classification problem, a fusion strategy using multiple classifiers is often applied to improve generalization and classification performance [21]. Among the fusion strategies in ensemble learning, the voting strategy, which includes majority voting and plurality voting, and the learning-based strategy are commonly used. Although the voting strategy is easy to execute, it treats all classifiers equally, whether they are poor or excellent, which can degrade performance. As for the learning-based strategy, the SVM is widely used in multi-classification problems thanks to its small structural risk. More importantly, the SVM method with a kernel function can learn the non-linear relationship between inputs [22]. Consequently, the SVM fusion method in ensemble learning is applied in our model to fuse the three results of the three ACSE branches. The outputs of the three branches, $l_{in}^{(1)}$, $l_{in}^{(2)}$, and $l_{in}^{(3)}$, act as the input of the SVM fusion strategy. The input over all samples is denoted $L_N$, where $N$ is the number of input samples. $L_N$ and the corresponding real labels are employed as the input of the SVM during training. The radial basis function (RBF) is adopted as the kernel function in our model. The mathematical formula of the RBF is:

$$K(x, x_c) = \exp\left(-\frac{\lVert x - x_c \rVert^2}{2\sigma^2}\right)$$

where $x$ is the input, $x_c$ is the center of the kernel function, and $\sigma$ is the width parameter. After training, the optimal classifier for the SVM fusion strategy is obtained. Finally, the optimal SVM fusion classifier is employed to obtain the fused results during the test phase.
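The two ingredients of the fusion step can be sketched as below: a Gaussian RBF kernel between two vectors, and the assembly of the SVM input from the three branch outputs. The sketch assumes that the branch outputs are concatenated sample by sample, which is one plausible way to form the fused input and is not stated explicitly in the text; the function names are hypothetical.

```python
import numpy as np

def rbf_kernel(x, x_c, sigma=1.0):
    """Gaussian radial basis function exp(-||x - x_c||^2 / (2 sigma^2))."""
    diff = np.asarray(x, dtype=float) - np.asarray(x_c, dtype=float)
    return np.exp(-np.dot(diff, diff) / (2.0 * sigma**2))

def fusion_input(l1, l2, l3):
    """Stack the three branch outputs side by side: three (N, K) arrays -> (N, 3K)."""
    return np.concatenate([l1, l2, l3], axis=1)
```

With eight classes and three branches, each sample would contribute a 24-element row to the SVM input under this assumption. In scikit-learn, for instance, `sklearn.svm.SVC(kernel="rbf", gamma=...)` uses the parameterization $\exp(-\gamma \lVert x - x' \rVert^2)$, so $\gamma = 1/(2\sigma^2)$ under the convention above; the source does not say which SVM implementation was used.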

C. THE OVERALL PROCEDURE OF THE PROPOSED METHOD
After obtaining the optimal weights of the multi-branch ACSE networks and the optimal classifier for the SVM fusion strategy, the framework of our method can be further described. The overall framework is given in Fig. 10 and can be summarized as Algorithm 1.

D. SIMULATION DATASETS AND PARAMETERS SETTING
A simulation dataset has been constructed to train and test the proposed method, with parameters chosen close to the actual environment by referring to [23]. The specific parameters are shown in Table 1. Here, ''SIN'' means the sinusoidal modulation and ''EXP'' means the exponential modulation. White Gaussian noise is added to the simulation signals, and the SNR is varied from −20 dB to 20 dB in 2 dB increments. $f_c$ is the center frequency, which is set from 30 MHz to 300 MHz for BASK, BFSK, BPSK, CW, and SFW. The center frequency of LFM and NLFM signals is varied from 30 MHz to 330 MHz. ''B'' is the bandwidth, which ranges from 25% to 30% of the center frequency for LFM, EXP, and SIN. $f_s$ is the sampling frequency, which is set to 1 GHz for all modulations. ''PW'' is the pulse width, which is set to 1 µs. The modulation code is a Barker code with 13 code elements for BASK, BFSK, and BPSK. As a result, about 1150 images are obtained for each modulation in each domain.
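The SNR sweep described above can be sketched as follows. The example assumes real-valued signals and defines SNR as the ratio of average signal power to average noise power; the function name is hypothetical.

```python
import numpy as np

def add_awgn(signal, snr_db, rng):
    """Add white Gaussian noise so that signal power / noise power equals snr_db."""
    signal_power = np.mean(np.abs(signal) ** 2)
    noise_power = signal_power / 10 ** (snr_db / 10)
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

snr_grid_db = np.arange(-20, 21, 2)   # -20 dB to 20 dB in 2 dB steps
```

The grid contains 21 SNR levels. At −20 dB the noise power is 100 times the signal power, which indicates how severely the features are disturbed at the low end of the range.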
As for the specific parameters for the construction of the multi-branch ACSE networks and the settings in both AC units and SE units, we have referred to [19], [20]. Besides, the construction of our method also requires many learnable parameters and some significant hyper-parameters during training. The learnable parameters are optimized in the ACSE network training phase using the cross-entropy loss function to obtain the optimal weights. The mathematical model for cross-entropy is:

$$\mathrm{loss}(x, n) = -\log\left(\frac{\exp(x[n])}{\sum_{j=0}^{N-1}\exp(x[j])}\right)$$

where $x$ is the input array, which is in the one-hot format, and $n$ is a class index in $[0, N-1]$ for $N$ classes. During training of the multi-branch ACSE networks, the weights are randomly initialized between 0 and 1. The algorithm used for optimization is ''Stochastic Gradient Descent (SGD)'' [24]; the momentum is 0.9, the weight decay is 0.00004, and the learning rate is set to 0.01. 70% of the training dataset is used for training and 30% is used for validation. The training phase runs for 200 epochs with a batch size of 32. We have employed the early stopping strategy [25] to avoid overfitting.

Algorithm 1 The AMR for Radar Signals Based on Multi-Branch ACSE Networks
Input: Labelled radar signal training dataset S and test dataset T.
Output: The prediction results.
Step 1: Calculate the FFT, ACF, and CWD of the training dataset S to obtain multi-domain features and convert these features into image formats.
Step 2: Input the images and labels into the corresponding single-branch ACSE network for training and then obtain the optimal weights and outputs of each branch.
Step 3: Input the outputs and corresponding labels into an SVM for training and obtain the optimal fusion classifier.
Step 4: Predict the test dataset T by applying the feature extraction of Step 1 and then the trained branch networks and fusion classifier from Step 2 and Step 3, and output the prediction results.
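The cross-entropy loss and the SGD settings given in this subsection can be sketched in NumPy as below. The loss follows the usual softmax cross-entropy form on raw network outputs, and the update rule follows the formulation used by PyTorch's SGD with momentum and weight decay; both function names are hypothetical, and the sketch is an illustration rather than the training code used in the paper.

```python
import numpy as np

def cross_entropy(logits, target):
    """-log softmax(logits)[target], computed with the log-sum-exp trick."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max()          # avoids overflow in exp
    return float(np.log(np.sum(np.exp(shifted))) - shifted[target])

def sgd_momentum_step(w, grad, velocity, lr=0.01, momentum=0.9, weight_decay=4e-5):
    """One SGD update with momentum and L2 weight decay."""
    grad = grad + weight_decay * w           # weight decay enters as an extra gradient term
    velocity = momentum * velocity + grad    # momentum buffer
    return w - lr * velocity, velocity
```

For eight equally scored classes the loss equals $\ln 8 \approx 2.08$, which is the value to expect from an untrained network that has no preference among the eight modulations.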

IV. RESULTS AND DISCUSSIONS
To demonstrate its effectiveness and robustness, the proposed method is tested and compared with single-branch ACSE networks and other networks on both simulation datasets and a certain number of measured signals. All experiments were run on a computer with an Intel(R) Core(TM) i7-8700K 3.7 GHz CPU, 32 GB RAM, and an NVIDIA GeForce RTX 2060 6 GB GPU, using ''PyTorch'', ''Torchvision'', the ''Python'' programming language, CUDA 10.1, and cuDNN.

A. RESULTS AND COMPARISON WITH THE SINGLE BRANCH ACSE NETWORKS
A test dataset that has the same distribution as the simulation training dataset is used to test the recognition performance of the proposed method. The results before and after fusion are also compared and analyzed.

1) RESULTS OF THE PROPOSED METHOD
The recognition performance of the proposed method under different SNR conditions is shown in Fig. 11. The accuracy for both BASK and SFW signals is close to 100% even at −20 dB. At the same SNR, the accuracy for BFSK and BPSK signals is close to 70% and 65%, respectively, and the average accuracy over all modulations is still higher than 50%. As the SNR increases, the recognition accuracies all exceed 92% at −10 dB, and when SNR > −8 dB the accuracies are close to 100%. To further analyze the recognition ability of the proposed method, the confusion matrices at −12 dB and −10 dB are shown in Fig. 12(a) and Fig. 12(b). At −12 dB, BASK, CW, and SFW signals are hardly misclassified, but the remaining modulations are easily misclassified, especially the SIN signals. A certain number of BFSK, BPSK, EXP, and LFM signals are misclassified because their features are similar under noise. Under low SNR conditions, the multi-domain features of SIN signals are more easily disturbed by the Gaussian white noise, which leads to a large number of misclassifications. As the SNR improves, these multi-domain features become more distinct and easier to distinguish from each other. At −10 dB, there are only a few misclassifications for all modulations except the SIN signals, and the recognition performance for SIN signals is also greatly improved.
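A confusion matrix such as those in Fig. 12 and the per-modulation accuracy derived from it can be computed with a short sketch like the following. It assumes integer class indices and the convention that rows are true classes and columns are predicted classes; the function names are hypothetical.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Count matrix in which rows are true classes and columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    # np.add.at accumulates correctly even when an index pair occurs more than once.
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

def per_class_accuracy(cm):
    """Fraction of each true class predicted correctly (requires no empty rows)."""
    return np.diag(cm) / cm.sum(axis=1)
```

Off-diagonal entries in a row show which modulations a given signal type is confused with, which is how statements such as the SIN signals being the most frequently misclassified can be read off the matrix.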

2) COMPARISON WITH THE SINGLE BRANCH ACSE NETWORKS
Since the proposed method is based on a fusion strategy, it is necessary to compare the results before and after fusion. In general, AMR results are reported both as the overall recognition accuracy of each modulation and as the performance under different SNR conditions. First, the single-branch ACSE networks using single-domain features before fusion, namely the frequency domain feature branch, the autocorrelation domain feature branch, and the time-frequency domain feature branch, are compared with the proposed method. The overall recognition accuracy for the eight modulations is shown in Fig. 13.
Here, ''Frequency'' represents the branch using frequency domain features, ''Corr'' represents the branch using autocorrelation domain features, ''CWD'' represents the branch using time-frequency domain features, and ''Fusion'' represents the proposed method. The overall recognition accuracy of the proposed method is close to 100% for BASK and SFW. For BFSK, BPSK, and CW, the accuracy of the proposed method is higher than 93%. For the remaining modulations, the accuracy of the proposed method is still higher than 86%. Although the accuracy of the frequency feature branch is higher than that of the proposed method for CW signals, the performance of the frequency feature branch is not stable and is the worst for SFW. In particular, the accuracy of the proposed method is at least 3% higher than that of the three single branches. On the whole, the accuracies of the proposed method are higher than those of the three single branches for all eight modulations except CW signals.
As for the performance under different SNR conditions, the results for the frequency, autocorrelation, and time-frequency feature branches and the proposed method are given in Fig. 14. Both our method and the autocorrelation feature branch achieve more than 50% accuracy even at −20 dB, which is at least 20% higher than the other two branches. As the SNR improves, all the accuracies increase. The accuracy of the proposed method converges first and is close to 100% at −8 dB. The accuracies of the autocorrelation feature branch and the time-frequency feature branch then converge and are close to 100% at −4 dB, and the accuracy of the frequency feature branch converges at 4 dB. On the whole, the recognition accuracies of the proposed method are almost always the highest among the compared branches, which supports the effectiveness and robustness of our fusion strategy. The multi-branch ACSE networks achieve better performance under low SNR conditions and improve the recognition accuracy for each modulation.

B. COMPARISON WITH OTHER METHODS
In the following experiments, the comparison results on the simulation test dataset and on measured signals are shown and analyzed. In addition, the computation and time complexity are compared and discussed.

1) RESULTS ON THE SIMULATION DATASET
In essence, AMR for both radar signals and digital communication signals is a multi-classification problem. According to the existing literature, there are many DL-based methods or networks for solving multi-classification problems. In our study, four recently proposed neural networks, ResNet [26], SE-Net [20], ECA-Net [27], and EfficientNet-b4 [28], are employed for comparison. As a baseline, the training and validation datasets, the optimization algorithm, and the hyper-parameters for these four networks are the same as those of the proposed method. Both ResNet and SE-Net are used with 34 layers.
Similarly, the results are given both as the overall recognition accuracy of each modulation and as the performance under different SNR conditions. The overall recognition accuracy for the four comparison methods and the proposed method is shown in Fig. 15. All five methods achieve great recognition performance for SFW signals and more than 80% accuracy for the remaining modulations. Although EfficientNet is 1% better than the proposed method for LFM signals, the accuracies of the proposed method are much higher for the remaining modulations, especially for the BASK, BFSK, and BPSK signals.
As for the performance under different SNR conditions, the results of ResNet, SE-Net, ECA-Net, and EfficientNet are shown in Fig. 16(a) to (d), respectively. All methods achieve close to 100% accuracy for SFW signals even at −20dB. The performance of SE-Net is slightly better than that of ResNet in our experiments, and both of them recognize poorly at low SNR for all modulations except the SFW signals. The accuracies of these four methods converge to 100% at −2dB. The overall accuracies of the five methods under different SNR conditions are given in Fig. 17 for convenient comparison. At −20dB, the average accuracy of the proposed method is more than 55%, which is at least 25% higher than that of the other four methods. The proposed method achieves more than 90% accuracy at −12dB, where the average accuracies of ResNet, SE-Net, and ECA-Net are below 76%. The proposed method takes 12dB to converge to 100% accuracy, whereas the other four methods take 16dB. Therefore, the proposed method has better recognition performance, especially under low SNR conditions, and converges to 100% accuracy faster than the other four methods.
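The notion of an accuracy curve "converging" at a given SNR can be made explicit as the lowest SNR from which the accuracy stays at or above a threshold. The sketch below illustrates one such definition; the 0.99 threshold, the function name `convergence_snr`, and the toy curve are assumptions made for illustration, since the text does not state the criterion used to read the convergence points from Fig. 17.

```python
def convergence_snr(snr_db, accuracy, threshold=0.99):
    """Lowest SNR from which accuracy stays >= threshold at all higher SNRs.

    Returns None if the accuracy at the highest SNR is below the threshold.
    """
    pairs = sorted(zip(snr_db, accuracy))
    result = None
    # Walk down from the highest SNR and stop at the first drop below threshold.
    for snr, acc in reversed(pairs):
        if acc < threshold:
            break
        result = snr
    return result


# Toy accuracy curve invented for illustration only (not data from the paper).
toy_snr = [-20, -16, -12, -8, -4, 0]
toy_acc = [0.40, 0.60, 0.80, 0.99, 1.00, 1.00]

toy_convergence = convergence_snr(toy_snr, toy_acc)  # -8
```

Applying one fixed threshold to every method makes statements such as "converges faster" comparable across networks.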

2) RESULTS ON MEASURED SIGNALS
Because the method must operate in complex real battlefield electromagnetic environments, it is not enough to test it only on the simulation dataset under ideal propagation conditions. However, owing to the confidentiality and particularity of radar signals, it is hard to obtain enough real radar signal data, which is also why most of the literature on AMR for radar signals reports no experiments on measured signals. In our study, we obtained many measured civil aviation signals from a certain radar, from which BASK, BFSK, and LFM signals were sorted out. In addition, these measured signals underwent some preprocessing such as filtering and resampling.
The recognition confusion matrix is given in Table 2. The proposed method achieves 99% accuracy for BASK, 92% for BFSK, and 100% for LFM signals. Since the code elements of the measured BFSK signals differ from those in the simulation dataset, BFSK signals are more likely to be misclassified. The comparison results on measured signals are shown in Fig. 18. For LFM signals, which are the most widely used in radar systems, all five methods achieve great recognition performance. The accuracies of ResNet, SE-Net, EfficientNet, and the proposed method are all above 90% for BASK signals. For the same reason, these four methods also have poor recognition performance for BFSK signals. In our experiments, the recognition and generalization ability of ECA-Net is the worst. Thanks to the fusion of multi-domain features, the proposed method can deeply excavate and fuse the characteristics of the signal in different domains to improve the recognition and generalization ability. On the whole, the proposed method outperforms these four networks, especially under low SNR conditions and for measured signals.
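A confusion matrix such as the one in Table 2 can be built by counting (true label, predicted label) pairs and normalizing each row, so that the diagonal gives the per-modulation accuracy. The sketch below assumes that rows correspond to true labels and columns to predicted labels; this orientation, the helper names, and the toy labels are assumptions and do not reproduce Table 2.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count matrix with rows = true labels and columns = predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    counts = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        counts[index[truth]][index[pred]] += 1
    return counts


def row_normalize(counts):
    """Divide each row by its sum so that the diagonal is per-class accuracy."""
    normalized = []
    for row in counts:
        total = sum(row)
        normalized.append([c / total if total else 0.0 for c in row])
    return normalized


# Toy labels invented for illustration only (not the measured signals).
labels = ["BASK", "BFSK", "LFM"]
y_true = ["BASK", "BASK", "BFSK", "BFSK", "BFSK", "BFSK", "LFM", "LFM"]
y_pred = ["BASK", "BASK", "BFSK", "BFSK", "BFSK", "BASK", "LFM", "LFM"]

counts = confusion_matrix(y_true, y_pred, labels)
rates = row_normalize(counts)
```

In this toy case one BFSK sample is predicted as BASK, so the BFSK row shows where the misclassified samples go rather than only how many there are.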

3) THE COMPUTATION AND TIME COMPLEXITY
The number of floating-point operations (FLOPs), the number of parameters, and the inference time are employed to measure and analyze the computational and time complexity. Since all five methods are CNNs, and a CNN mainly consists of convolution layers and fully connected layers, we give the mathematical formulas for only these two layer types here. The FLOPs of convolution layers and fully connected layers can be calculated following [29], where D is the number of layers, M and K are the lengths of the feature map and kernel of the l-th layer, and C is the number of channels. Table 3 lists the FLOPs of all layers, the corresponding input size, the number of parameters, and the inference time on the hardware platform mentioned above for the five methods. Due to the AC units added to improve the learning ability, the computation and time complexity of the proposed method is higher than that of SE-Net. In terms of computation complexity, the other four methods have fewer parameters than the proposed method. In terms of time complexity, the proposed method takes about 10ms more than SE-Net but about 20ms less than EfficientNet. On the whole, our model achieves better recognition performance and generalization ability with a certain increase in complexity.
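As an illustration of how such a count is obtained, a commonly used estimate that is consistent with the symbols defined above sums M_l^2 * K_l^2 * C_(l-1) * C_l over the D convolution layers and C_in * C_out over the fully connected layers. This form is an assumption, since the exact expressions of [29] are not reproduced here; it counts multiply-accumulate operations, ignores biases, and takes M as the side length of the output feature map. The layer sizes below are hypothetical and do not describe any of the five networks.

```python
def conv_flops(layers):
    """Sum of M^2 * K^2 * C_in * C_out over (M, K, C_in, C_out) conv layers."""
    return sum(m * m * k * k * c_in * c_out for m, k, c_in, c_out in layers)


def fc_flops(layers):
    """Sum of C_in * C_out over (C_in, C_out) fully connected layers."""
    return sum(c_in * c_out for c_in, c_out in layers)


# Hypothetical toy network (not one of the networks compared in Table 3).
toy_conv = [
    (32, 3, 3, 16),   # 32x32 output map, 3x3 kernel, 3 -> 16 channels
    (16, 3, 16, 32),  # 16x16 output map, 3x3 kernel, 16 -> 32 channels
]
toy_fc = [(512, 8)]   # 512 features -> 8 output classes

total_flops = conv_flops(toy_conv) + fc_flops(toy_fc)  # 1626112
```

Under this estimate the convolution layers dominate the total, which is consistent with the added AC units increasing the cost of the proposed method relative to SE-Net.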

V. CONCLUSIONS
A novel AMR method for radar signals based on multi-branch ACSE networks and multi-domain features is proposed to recognize eight kinds of common radar signals, covering amplitude, phase, linear frequency, and non-linear frequency modulation. The proposed method takes full advantage of the learning ability of the multi-branch ACSE networks and the effectiveness of the fusion strategy to improve recognition performance. The simulation results show that the average accuracy of the proposed method is higher than 55% even at −20dB and that the recognition accuracies for all modulations are more than 93% at −10dB. When SNR > 8dB, the accuracies all converge to 100%, which is much better than the results before fusion. Compared with ResNet, SE-Net, ECA-Net, and EfficientNet, the proposed networks achieve better recognition performance under low SNR conditions, especially from −20dB to −10dB. The comparison results on measured signals show that the proposed method has better recognition and generalization performance and outperforms the other four networks, especially for BFSK signals.
Nevertheless, the analysis of computational and time complexity shows that our model has a certain increase in FLOPs and parameters, which may make it unsuitable for real-time processing of radar signals. In the future, the storage requirements of the proposed method should be reduced. We therefore suggest, as future work, reducing the complexity of our method through model pruning and compression while maintaining the existing recognition performance.