Intelligent Modulation Pattern Recognition Based on Wavelet Approximate Coefficient Entropy in Cognitive Radio Networks

In this paper, to address unintentional interference between communication devices and to obtain effective information quickly and accurately in cognitive radio (CR), an intelligent modulation pattern recognition method based on wavelet approximate coefficient entropy (WACE) is proposed. Building on traditional wavelet entropy, an improved wavelet entropy, WACE, is presented, which characterizes the modulated signal pattern and suppresses noise effectively. Furthermore, to avoid the high complexity of linear weighting calculation, a deep neural network (DNN) is adopted, and the WACE vector is used as the input of the DNN to realize intelligent recognition of a variety of typical communication signal modulation patterns. Simulation results verify the correctness of the theoretical analysis and show that the proposed method can effectively recognize the modulation patterns of multiple signals at low signal-to-noise ratio (SNR) with relatively low computational complexity.


I. INTRODUCTION
WITH the rapid increase of wireless devices, unintentional interference between devices is becoming more and more serious. In both military and civil communication, how to suppress such unintentional interference and obtain the effective signal modulation pattern in time is an important and challenging problem. J. Mitola proposed cognitive radio (CR) technology in [1], which uses the learning ability of CR to autonomously sense the surrounding spectrum environment and respond to the actual electromagnetic situation in real time. As one of the most promising solutions to the shortage of spectrum resources, CR technology has attracted a lot of attention for future communication systems. Therefore, anti-jamming modulation recognition technology in the framework of CR has broad prospects. For example, in electronic countermeasures, modulation recognition is applied to the spectrum sensing of receiving equipment, which can provide more of the information needed for battlefield command and decision-making.

A. LITERATURE REVIEW
Wavelet analysis is a local transform in time and frequency, which can effectively extract information from a signal and is conducive to perceiving the surrounding electromagnetic environment. For some time, many CR techniques have addressed modulation recognition of communication signals by using the spectrum and cyclic spectrum [2], characteristic parameters and their statistics [3], time-frequency transforms [4], [5], and high-order cumulants [6]-[8]. However, it is difficult for these methods to achieve multiresolution analysis of the modulated signals, which makes it harder to obtain effective information, and their real-time performance in signal analysis and processing is poor. Thus, in order to grasp the current radio spectrum situation quickly and accurately, wavelet analysis is applied to the modulation recognition of communication signals. In [9], spoken Arabic digits are recognized effectively by using the wavelet coefficients of speech signals after discrete wavelet analysis. In [10], a recognition algorithm based on wavelet variation coefficient difference and similarity features is proposed to classify and recognize common digital modulation signals. In [11], the continuous wavelet transform and multilayer wavelet decomposition are used to extract signal features, and different classification features are adopted for different modulated signals. The algorithm needs neither symbol period estimation nor synchronization time estimation, which improves the operation speed and recognition rate for modulated signals.
Traditional feature extraction and classification methods are mainly based on feature statistics and clustering algorithms. For example, in [12], the density-based spatial clustering of applications with noise (DBSCAN) algorithm is adopted and combined with the distance-based K-means clustering algorithm, and the characteristic values of signals are extracted directly to realize modulation recognition of communication signals. However, when the noise interference is serious, the recognition rate is low. In order to maintain the recognition rate while reducing the complexity as much as possible, a novel method based on constellation structure is proposed in [13] to identify PSK and QAM modulation of different orders in slow and flat fading channels. In [14], two deep autoencoders and the cyclic spectrum features of signals are used for modulation recognition, but the recognition performance is poor in low signal-to-noise ratio (SNR) environments.
Compared with traditional feature extraction and classification methods, deep learning has been applied to modulation recognition of communication signals because of its strong classification ability and its ability to fit nonlinear functions, which provides a new solution to the problems existing in wavelet analysis [15], [16]. In [17], a deep learning method is proposed that combines two convolutional neural networks trained on different data sets to achieve a relatively high automatic modulation recognition rate. In [18], an automatic modulation recognition framework is established to identify radio signals in communication systems. After the signal data are preprocessed, a deep convolutional neural network and a long short-term memory (LSTM) network are considered in the framework, and the recognition rate is significantly improved compared with the original method. In order to improve the accuracy of automatic modulation classification (AMC) with small data sets, a hybrid model named HybridNet is proposed in [19], where a bidirectional gated recurrent unit (Bi-GRU) is placed after a convolutional neural network to capture temporal dependencies explicitly. In [20], a modulation recognition method that exploits the graph convolutional network (GCN) is proposed. Herein, to convert signals to graphs, the modulation dataset is divided into multiple subsets.
However, there are two major problems in applying discrete wavelet analysis and deep neural networks (DNN) to multi-signal modulation recognition. (1) Signal recognition methods that use wavelet analysis alone rely on accurate estimation of the communication signal parameters. Noise, frequency offset and other interference factors introduce large errors into these methods, and the corresponding error reduction methods are also complex, so these methods cannot be applied to a complex and changeable electromagnetic environment. (2) Using the above deep learning models alone is more complex and requires more storage. In the current communication environment, resources are relatively scarce, so it is important to keep the recognition rate stable, or even improve it, while reducing the complexity. To sum up, in a complex electromagnetic environment with low SNR, where accurate modulation parameters are difficult to obtain, finding a method with low complexity and high modulation recognition accuracy is a crucial problem.

B. OUR CONTRIBUTIONS
In order to solve the above problems, this paper combines discrete wavelet analysis with a DNN to realize modulation pattern recognition of communication signals in a complex electromagnetic environment. To reduce the computational complexity of the DNN and improve the modulation recognition performance at low SNR, an intelligent modulation pattern recognition method based on wavelet approximate coefficient entropy (WACE) is presented. Unlike traditional methods, it can effectively extract the correlation features of signals in a complex electromagnetic environment with low SNR, in which accurate modulation parameters are difficult to obtain. At the same time, combining WACE with the DNN avoids the high complexity of multi-signal modulation recognition based on the linear weighting of WACE alone. Conversely, WACE characterizes the modulated signal with a small number of features, which reduces the number of input variables of the DNN to a certain extent and thus reduces its training complexity. Therefore, the method combines the complementary advantages of WACE and the DNN to realize modulation pattern recognition of multiple signals.
In general, the key contributions of this work can be summarized as follows:
1) A novel improved wavelet entropy, WACE, is presented. Compared with traditional wavelet entropy, WACE has a better ability to characterize modulation signal patterns and suppress noise, which is beneficial to modulation pattern recognition.
2) To address the difficulty of recognizing modulation patterns in complex and changeable application scenarios with serious interference between devices, the exponential factor γ and the linear weighting factor α of WACE are defined. By dynamically adjusting the two factors, the portability and engineering applicability of the method are realized.
3) To avoid the high complexity of linear weighting with WACE alone, for a given exponential weight factor γ, WACE is combined with a DNN to realize modulation pattern recognition under low SNR with relatively low computational complexity.
4) For a small training set and low SNR, the model performance and computational complexity of the proposed WACE combined with DNN method are analyzed, and comparison results are given to show that this method has lower complexity, faster convergence and better recognition performance.
The organization of the paper is as follows. In Section II, the system model of communication signal modulation recognition is given. In Section III, the mathematical definition of WACE is given, and the WACE vector of each communication signal is calculated as the input of the DNN. Section IV then describes the DNN in detail and analyzes the influence of related variables on the DNN. In Section V, the simulation performance of the model for modulation recognition is given, and the influence of parameter selection is analyzed. Moreover, comparisons with schemes from other papers are also considered. Finally, Section VI summarizes the paper.

II. SYSTEM MODEL OF MODULATION RECOGNITION
In this section, a complete communication signal modulation recognition system model will be established to simulate the communication situation in complex electromagnetic environment, and five representative and widely used communication signal modulation patterns will be considered.
The communication signal modulation recognition system model in this paper includes an integrated signal processing center and N potential modulation signal transmitters, as shown in Fig. 1.
In the field of wireless communication, the signal processing center is the core of the whole system. It is responsible for receiving and processing signals of various modulation patterns from all directions, including valid communication signals and unintentional interference signals. At the same time, there may also be intentional jamming signals. For example, in electronic warfare, accurately obtaining the information transmitted by friendly forces, intercepting enemy signals to obtain intelligence, and interfering with the enemy's effective communication all depend on modulation recognition technology. In order to better reflect the diverse modulation patterns of communication signals in a complex electromagnetic environment, five typical modulation patterns are selected in this paper: minimum shift keying (MSK), quadrature phase shift keying (QPSK), 16 quadrature amplitude modulation (16QAM), offset-QPSK (OQPSK) and binary phase shift keying (BPSK).

III. MODULATIONS RECOGNITION BASED ON WACE
Wavelet entropy is a theory developed on the basis of wavelet transform, multiresolution analysis and information entropy. In this section, under the guidance of wavelet theory and multiresolution analysis, the traditional wavelet entropy algorithm and the proposed WACE algorithm are analyzed respectively.

A. WAVELET THEORY AND MULTIRESOLUTION ANALYSIS
Wavelet analysis is a time-frequency analysis method. For any function $f(t) \in L^2(\mathbb{R})$, the continuous wavelet transform is

$$W_f(a,b) = \langle f, \psi_{a,b} \rangle = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{+\infty} f(t)\, \psi^{*}\!\left(\frac{t-b}{a}\right) dt \tag{1}$$

where $a$ is the scaling factor, $b$ is the translation factor, $a, b \in \mathbb{R}$, $a \neq 0$, and $\psi_{a,b}(t)$ is the scaled and translated version of the basic wavelet $\psi(t)$. In practice, the continuous wavelet must be discretized. The discrete wavelet transform discretizes the continuous parameters $a$ and $b$ through the integers $m$ and $n$ (with $a = a_0^{m}$ and $b = n b_0 a_0^{m}$, $a_0 > 1$, $b_0 > 0$), and the basic wavelet function of the discrete wavelet transform is

$$\psi_{m,n}(t) = a_0^{-m/2}\, \psi\!\left(a_0^{-m} t - n b_0\right) \tag{2}$$

Then the discrete wavelet transform of any function is

$$W_f(m,n) = \langle f, \psi_{m,n} \rangle = \int_{-\infty}^{+\infty} f(t)\, \psi_{m,n}^{*}(t)\, dt \tag{3}$$

Multiresolution wavelet analysis decomposes a signal into components at different scales using an orthogonal wavelet basis function (the Daubechies 5 (db5) wavelet basis function in this paper). The implementation is equivalent to repeatedly applying a pair of high-pass and low-pass filters to decompose a time series signal: the high-pass filter produces the high-frequency detail component of the signal, and the low-pass filter produces the low-frequency approximate component. At each step, the low-frequency component of the previous scale is decomposed again, and two decomposition components of the next scale are obtained [21], as shown in Fig. 2. In Fig. 2, S represents the original signal, An represents the low-frequency approximate component on scale n obtained after the decomposition of the original signal, and Dn represents the high-frequency detail component on scale n. In wavelet analysis, multiresolution analysis can decompose the signal to be analyzed by one or more layers through the discrete wavelet transform to obtain the approximate component (low-frequency component) and the detail component (high-frequency component). The approximate component can be considered the main part of the signal, while the detail component can be considered a supplement to the approximate component.
The detail component does not change the important attributes of the original signal and mainly reflects its instantaneous state information. At the same time, the detail component may contain some key information of the original signal, or some noise. Thus, this decomposition method is also conducive to denoising a signal polluted by noise. Meanwhile, most of the random noise and high-frequency interference between devices is distributed in the detail component. To some extent, this feature can be used to achieve better modulation recognition performance by reducing the proportion of the detail component and increasing the proportion of the approximate component.
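The repeated low-pass/high-pass decomposition described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it swaps the db5 filter pair for the two-tap Haar pair so that it needs no external library, and the function names `haar_step` and `multiresolution` are hypothetical.

```python
import math

def haar_step(signal):
    """One analysis step: return (approximation, detail) at the next scale.

    Assumption: Haar filters stand in for the db5 filters used in the paper.
    """
    if len(signal) % 2:                       # pad odd-length input by repeating the last sample
        signal = list(signal) + [signal[-1]]
    s = 1.0 / math.sqrt(2.0)
    # Low-pass filter followed by downsampling by 2.
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    # High-pass filter followed by downsampling by 2.
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def multiresolution(signal, levels):
    """Return ([A1..An], [D1..Dn]); each level re-decomposes the previous approximation."""
    approximations, details = [], []
    current = list(signal)
    for _ in range(levels):
        current, d = haar_step(current)
        approximations.append(current)
        details.append(d)
    return approximations, details

# Toy input: 8 samples decomposed over 3 scales.
x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
A, D = multiresolution(x, 3)
```

Because the Haar pair is orthonormal, the energy of `x` equals the energy of the coarsest approximation plus the energies of all detail components, which is a quick sanity check on any such filter bank.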
In order to suppress noise and unintentional interference between devices to the greatest extent, obtain information quickly and effectively, and improve spectrum utilization as much as possible, the limit case may be considered: if all the low-frequency components are kept and the high-frequency components are discarded, the anti-noise performance will be greatly improved, but some key information of the signal will be lost. This paper compensates for this loss by two means. (1) If the wavelet coefficients of the original signal are added to the multiresolution analysis of the modulated signal, so that they form the wavelet domain features together with the analysis results at the other scales, no information of the original signal is lost. (2) If an appropriate wavelet function is selected so that the energy at each scale is more concentrated in the low-frequency components, the denoising effect is better.
The dbN wavelet (N is the order of the wavelet function) performs well in signal denoising; therefore, this paper selects this wavelet family. Two aspects are considered when choosing the order N. (1) N corresponds to the vanishing moment of the wavelet function. The larger the vanishing moment, the smaller the high-frequency coefficients, the more concentrated the signal energy, and the better the noise removal effect. (2) Increasing the vanishing moment N also causes too much noise to be concentrated in the low-frequency component, which degrades the denoising effect, and makes the support length of the wavelet function longer. Furthermore, an excessive support length significantly increases the computational complexity. To sum up, this paper chooses the db5 wavelet function to concentrate the signal energy and obtain the best denoising effect.

B. TRADITIONAL WAVELET ENTROPY ANALYSIS
In this section, two traditional wavelet entropies are analyzed: wavelet energy entropy and adaptive wavelet entropy.

1) Wavelet energy entropy
By combining multiresolution wavelet analysis with information entropy, the definition and calculation method of wavelet energy entropy for any signal can be obtained [21].
Suppose that any digital signal s(n) with n sampling points is decomposed on M scales. On a given decomposition scale m, the wavelet coefficient vector is $A_m = (a_{m1}, a_{m2}, \ldots, a_{mn})$, $m = 1, 2, \ldots, M$. A vector sequence {A} can be formed from the wavelet coefficient vectors $A_1, A_2, \ldots, A_M$ of each decomposition scale. In this paper, the vector norm of the wavelet coefficients is used to describe the closeness of wavelet coefficients at different scales, and the energy on scale m can be defined as

$$E_m = \|A_m\|^2 = \sum_{k=1}^{n} |a_{mk}|^2$$

The normalized energy

$$p_j = \frac{E_j}{\sum_{m=1}^{M} E_m}$$

of the wavelet coefficients at each scale is used as the distribution of the energy sequence instead of the probability distribution of the signal. Furthermore, the entropy based on this energy distribution is called the wavelet energy entropy, which is defined as

$$W_{EE} = -\sum_{j=1}^{M} p_j \log p_j$$

2) Adaptive wavelet entropy
The concept of adaptive wavelet entropy is based on information entropy. In [22], the adaptive wavelet entropy is defined by combining information entropy theory with the discrete wavelet transform. In that definition, the adaptive wavelet entropy E is a real number, S is the original signal s(n) decomposed by the discrete wavelet transform, P is an exponential weight with 1 ≤ P < 2, $S_m$ is the m-th layer signal of the original signal after the discrete wavelet transform, and N is the length of signal $S_m$.
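The wavelet energy entropy of the first subsection (per-scale energy, normalized energy distribution, then Shannon entropy) can be illustrated with a short sketch. The function name `wavelet_energy_entropy` is hypothetical, the natural logarithm is an assumption since the text does not fix the base, and the input is simply a list of per-scale coefficient vectors however they were obtained.

```python
import math

def wavelet_energy_entropy(coeff_vectors):
    """Shannon entropy of the relative energy of each decomposition scale.

    coeff_vectors: list of per-scale coefficient vectors (lists of floats).
    Assumption: natural logarithm; the source does not state the base.
    """
    # Energy of each scale is the squared 2-norm of its coefficient vector.
    energies = [sum(c * c for c in vec) for vec in coeff_vectors]
    total = sum(energies)
    if total == 0.0:
        raise ValueError("all coefficient vectors are zero")
    # Relative energies play the role of a probability distribution.
    probs = [e / total for e in energies]
    # Scales with zero energy contribute nothing (limit of p*log p as p -> 0).
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Two scales with equal energy give the maximum entropy for M = 2.
even_split = wavelet_energy_entropy([[1.0, 1.0], [1.0, -1.0]])
# All energy in one scale gives zero entropy.
one_scale = wavelet_energy_entropy([[3.0, 4.0], [0.0, 0.0]])
```

The two toy calls bracket the possible range for two scales: equal energies give log 2, and a single dominant scale gives 0.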

C. WACE
The two wavelet entropies mentioned above, wavelet energy entropy and adaptive wavelet entropy, have achieved good results in their respective fields. However, if they are used for modulation recognition in CR, especially when the SNR of the modulated signal to be identified is low, neither achieves good recognition results. For example, in [23], adaptive wavelet entropy is used with a back propagation (BP) network for multi-signal modulation recognition, and the average recognition rate is about 95% in the absence of noise. However, when the SNR is low, the recognition performance of this method decreases rapidly for some modulated signals. Based on this, an improved wavelet entropy, WACE, is presented; it is the entropy value calculated from all the wavelet approximation coefficients of the signal. The wavelet approximation coefficients at each scale are collected into a vector sequence {W}, where m is the decomposition scale parameter and γ is the exponential weight term. After weighting, the vector sequence {W} of wavelet approximation coefficients at different scales is transformed into the 2-norm weighted sequence {||W||}. Here, the wavelet coefficients of the original signal are also weighted by the 2-norm and added to the vector sequence, so that no information of the original signal is lost in the wavelet-domain feature extraction. Suppose the signal is decomposed at M scales and the wavelet approximation coefficient vector on scale m is $W_m = (w_{m1}, w_{m2}, \ldots, w_{mn})$.

The energy at scale m is defined from $W_m$. To increase the number of wavelet entropy features of the signal to be identified, and referring to the concept of adaptive wavelet entropy, the WACE is then defined, where A represents the WACE of the m-th layer of the discrete wavelet decomposition, $L_m$ is the length of the m-th layer wavelet approximation coefficient vector, and $\gamma_{approx}$ is an exponential weight vector. In this way, the WACE is the average energy of the wavelet approximation coefficients per unit length of signal at a certain scale, or equivalently the average energy of the wavelet approximation coefficients of the digital signal at each sampling point. Because this improved wavelet entropy represents the average energy per wavelet approximation coefficient and reflects the uncertainty of the signal at different decomposition scales, we call it WACE. For different signals, the WACE at a certain scale reflects the characteristics of the signal at that scale. When a signal is decomposed into M levels by the discrete wavelet transform, M + 1 WACEs can be calculated according to (9). Each layer of WACE represents certain wavelet domain characteristics of the signal. In order to make them represent the signal together, the WACEs of all layers constitute the entropy vector $E_{approx}$, as given in (11).

Compared with wavelet energy entropy and adaptive wavelet entropy, WACE has several advantages in modulation recognition. (1) By discarding the high-frequency coefficients after discrete wavelet decomposition and using the db5 wavelet with a larger vanishing moment, the extracted WACE entropy vector has stronger anti-noise ability. (2) By selecting different weight vectors $\gamma_{approx}$, the proportion of low-frequency components is increased, and the interference of high-frequency noise is suppressed. Consequently, in the same noise environment, the number of decomposition layers can be reduced, which lowers the computational complexity and speeds up recognition.
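As a rough illustration of how a WACE vector might be computed, the sketch below assumes the per-scale form $A_m = \frac{1}{L_m}\sum_i |w_{m,i}|^{\gamma_m}$ for $m = 0, 1, \ldots, M$, with $m = 0$ denoting the original signal. This form matches the description of WACE as an average energy per approximation coefficient, but it is an assumption rather than the paper's exact expression. The sketch also substitutes the Haar low-pass filter for db5 to stay dependency-free, and `haar_approx` and `wace_vector` are hypothetical names.

```python
import math

def haar_approx(signal):
    """Low-pass Haar step: approximation coefficients at the next scale.

    Assumption: Haar replaces the db5 wavelet used in the paper.
    """
    if len(signal) % 2:                       # pad odd-length input by repeating the last sample
        signal = list(signal) + [signal[-1]]
    s = 1.0 / math.sqrt(2.0)
    return [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]

def wace_vector(signal, levels=5, gamma=1.5):
    """Return M + 1 values: one for the original signal and one per approximation scale.

    Assumed form per scale: mean of |w|**gamma over the coefficients of that scale.
    gamma may be a single exponent or a sequence of levels + 1 exponents.
    """
    if isinstance(gamma, (int, float)):
        gammas = [float(gamma)] * (levels + 1)
    else:
        gammas = list(gamma)
    if len(gammas) != levels + 1:
        raise ValueError("need one exponent per scale, including the original signal")
    vectors = [list(signal)]                  # scale 0: the original signal itself
    for _ in range(levels):
        vectors.append(haar_approx(vectors[-1]))
    return [sum(abs(w) ** g for w in vec) / len(vec) for vec, g in zip(vectors, gammas)]

# Toy carrier with 500 samples and a 5-level decomposition: a 6-dimensional feature vector.
carrier = [math.cos(2.0 * math.pi * 0.05 * k) for k in range(500)]
features = wace_vector(carrier)
```

With five decomposition levels the result has six entries, which is the input dimension the DNN in this paper expects.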
In this paper, the exponential weight vector is chosen as 1.5 times the unit column vector, so that every scale uses the exponent 1.5. The reasons are as follows.
(1) With the exponent 1.5, the residual noise in the low-frequency coefficients of each scale can be further weakened, and the key information that is conducive to feature extraction can be amplified. (2) If the exponential weights change in the same direction as the number of decomposition layers, the key information in the lower-scale coefficients is obscured, which disturbs the feature extraction of modulated signals and reduces the recognition rate or speed. Conversely, if the exponential weights change in the opposite direction to the number of decomposition layers, some of the noise in the small-scale coefficients is amplified, so that useful features of the modulated signal may not be extracted, and the recognition rate is also reduced. Of course, depending on the actual problem, different exponential weights can be applied to make WACE achieve a better analysis and processing effect; that is, the improved wavelet entropy has good portability to other fields. In Section V-C, the influence of different choices of the exponential weight vector $\gamma_{approx}$ on multi-signal modulation recognition is analyzed in detail.

VOLUME 4, 2016 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.

A. NECESSITY ANALYSIS OF INTELLIGENT RECOGNITION
In theory, WACE can realize the modulation pattern recognition of two or even multiple signals through linear weighting: the feature quantity T is formed as a linear weighting of the WACE vector, where a is the weight vector.
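A minimal sketch of such a linear weighting is given below, assuming that T is the inner product of the weight vector a with the WACE vector and that two patterns are separated by comparing T with a threshold. The threshold rule, the function names `linear_feature` and `classify_pair`, and all numeric values are illustrative assumptions, not values from this paper.

```python
def linear_feature(weights, wace):
    """Feature quantity T as the inner product of the weight vector and the WACE vector."""
    if len(weights) != len(wace):
        raise ValueError("weight vector and WACE vector must have the same length")
    return sum(a * e for a, e in zip(weights, wace))

def classify_pair(weights, wace, threshold, labels=("pattern_1", "pattern_2")):
    """Separate two modulation patterns by thresholding T (assumed decision rule)."""
    return labels[0] if linear_feature(weights, wace) > threshold else labels[1]

# Toy values only: a 6-dimensional WACE vector and hand-picked weights.
toy_wace = [0.8, 1.1, 1.6, 2.2, 3.0, 4.1]
toy_weights = [1.0, 0.5, 0.5, 0.25, 0.25, 0.0]
T = linear_feature(toy_weights, toy_wace)
decision = classify_pair(toy_weights, toy_wace, threshold=3.0)
```

Note that each pair of patterns needs its own weights and threshold, which is the kind of hand tuning that a trained network removes.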
However, this form of modulation pattern recognition has many shortcomings in practice. (1) When a different signal is received, even one with the same modulation pattern, it needs to be re-identified because of differences in noise, frequency offset, and so on. The selected linear weighting coefficients and exponential weight vector then need to be changed frequently, and a clear calculation formula for them cannot always be found. Furthermore, relying only on experience reduces the recognition efficiency, which affects the performance of the system. (2) In a complex electromagnetic environment, completing the detection and identification of parameters involves high computational complexity and cumbersome steps, and the cost in time and space is very high, which cannot meet the requirements of rapid modulation recognition.
Therefore, this paper uses WACE combined with a DNN for intelligent recognition. (1) The DNN does not require many parameters, and after training and optimization the weights can be used directly for signal modulation recognition, with high recognition efficiency and low complexity. (2) The DNN has strong feature extraction and classification capability for multiple modulated signals, which avoids the cumbersome pairwise recognition of traditional algorithms and identifies multiple modulation patterns in one step. In addition, in a complex electromagnetic environment, DNN-based intelligent modulation recognition is more conducive to building CR systems that intelligently perceive the surrounding radio environment. The superiority of the proposed WACE and the necessity of intelligent recognition are analyzed in Section V-D.

B. THE NETWORK MODEL OF INTELLIGENT RECOGNITION
In this paper, the network model of intelligent recognition is a DNN with three layers: 6 nodes in the input layer, 30 neurons in the hidden layer, and 5 nodes in the output layer, as shown in Fig. 3. The 6 input nodes are determined by the dimension of the WACE vector $E_{approx}$: when the decomposition scale is 5, the dimension of $E_{approx}$ is 6. If the decomposition scale is too small, too few WACE components are extracted and the feature quantity is insufficient, which affects the accuracy of modulation recognition and is not conducive to resisting noise. Conversely, if the decomposition scale is too large, it not only increases the complexity of the DNN but also produces too many features, which worsens the generalization of the DNN and is not conducive to signal modulation recognition.
The learning rule of the DNN is the stochastic gradient descent algorithm, which, when the input complexity is low, offers higher processing speed and stability than the batch algorithm. The learning method is the BP algorithm driven by a cross-entropy cost function. Considering both performance and simplicity, we use the cross-entropy cost function in the training process and the mean square error (MSE) cost function in the subsequent evaluation of the convergence curve of each modulation signal. The activation function of the hidden layer is the Sigmoid function, and the activation function of the output layer is the Softmax function. In addition, [23] selects 200 samples of each modulation signal, calculates their adaptive wavelet entropy vectors, and obtains a good convergence effect through DNN training. Therefore, in the training and optimization process of this paper, the training set consists of 200 calculated WACE vectors $E_{approx}$ of randomly generated MSK, QPSK, OQPSK, BPSK, and 16QAM modulated signals, and the test set consists of 100 calculated WACE vectors $E_{approx}$ of randomly generated MSK, QPSK, OQPSK, BPSK, and 16QAM modulated signals.
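The network described above (6 inputs, 30 Sigmoid hidden neurons, 5 Softmax outputs, stochastic gradient descent on a cross-entropy cost) can be sketched in plain Python as follows. This is an illustrative sketch only: the class name `TinyDNN`, the uniform weight initialization, the learning rate, and the toy input vector are assumptions, not settings reported in this paper.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(z):
    m = max(z)                                # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

class TinyDNN:
    """6-30-5 network: Sigmoid hidden layer, Softmax output layer."""

    def __init__(self, n_in=6, n_hidden=30, n_out=5, seed=0):
        rng = random.Random(seed)             # assumed initialization: uniform in [-0.5, 0.5]
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        h = [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        p = softmax([sum(w * v for w, v in zip(row, h)) + b
                     for row, b in zip(self.w2, self.b2)])
        return h, p

    def sgd_step(self, x, label, lr=0.05):
        """One back-propagation update on a single sample; returns the loss before the update."""
        h, p = self.forward(x)
        loss = -math.log(p[label])            # cross-entropy for a one-hot target
        # Gradient at the Softmax input: predicted probabilities minus one-hot target.
        d_out = [pk - (1.0 if k == label else 0.0) for k, pk in enumerate(p)]
        # Back-propagate through the Sigmoid layer before the output weights change.
        d_hid = [sum(d_out[k] * self.w2[k][j] for k in range(len(d_out))) * h[j] * (1.0 - h[j])
                 for j in range(len(h))]
        for k, dk in enumerate(d_out):
            for j, hj in enumerate(h):
                self.w2[k][j] -= lr * dk * hj
            self.b2[k] -= lr * dk
        for j, dj in enumerate(d_hid):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * dj * xi
            self.b1[j] -= lr * dj
        return loss

# Toy run: repeatedly fit one made-up 6-dimensional feature vector to class index 2.
net = TinyDNN()
x = [0.2, 0.4, 0.1, 0.9, 0.5, 0.3]
losses = [net.sgd_step(x, 2) for _ in range(50)]
probs = net.forward(x)[1]
```

In practice each update would use a different training vector drawn from the mixed training set, and the MSE between `probs` and the one-hot target could be logged per epoch for the convergence curves.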
In [24], a clustering algorithm is used to extract the characteristic parameters of the modulated signal, and then a dual hidden layer DNN is used to identify and classify the modulated signal. However, this method has a higher complexity and a lower recognition rate under low SNR. In contrast, using the WACE vector as the input of the DNN can reduce the number of layers. Especially, when the SNR of the input signal is low, it can reduce the training complexity of the DNN effectively and ensure a certain correct recognition rate.

V. SIMULATION RESULTS AND ANALYSIS
In this section, five modulation signals are simulated using the model proposed in this paper, and the modulation recognition results under different SNRs are compared. Then, the influence of different exponential weight vectors $\gamma_{approx}$ on the recognition is analyzed. Finally, the superiority of the model under low SNR is verified and compared with schemes from other papers. Herein, we mainly compare the performance of WACE with that of the two traditional wavelet entropies when the intelligent recognition network is not used, and the performance of the proposed WACE method with that of the clustering method, the high-order cumulant method, and the cyclic spectrum method for feature extraction. Furthermore, the combination of these features with a DNN for modulation pattern recognition is also taken into account.

A. IDEAL SITUATION IN THE ABSENCE OF NOISE
In the simulation experiments, as described above, the five modulation signal patterns are MSK, QPSK, 16QAM, OQPSK and BPSK. Each modulated signal is generated with the same parameters and oversampled to 500 sampling points, and the carrier frequency of the modulated signal varies according to the actual situation. Under ideal noise-free conditions, the modulation recognition of the five signals is shown in Fig. 4(a) and Fig. 4(b). Fig. 4(a) shows that the recognition rate of the 16QAM, MSK, OQPSK and QPSK signals increases gradually from 0 to nearly complete recognition as the number of DNN training epochs increases. Similarly, Fig. 4(b) shows the change of the recognition rate of the BPSK signal with training epochs in the same noise-free environment. Among the five modulation patterns, the WACE vector feature of the BPSK signal is the most distinct from those of the other four. Thus, after only several training epochs, the gap between the actual output and the true value is very small. In Fig. 5, even when the number of training epochs is small, the value of the BPSK convergence curve is very small, similar to the final convergence state of the other four modulation signals. Therefore, the recognition rate of the BPSK signal is high from the beginning and gradually stabilizes as the training epochs increase, and the correct recognition rate on the test set is higher. In addition, the number of training epochs required for the recognition rate to reach a stationary state differs among the signals, increasing in the order of 16QAM, MSK, OQPSK and QPSK. During training, the five kinds of modulation signals are trained together as a mixed set.
For the test set, the weights of the DNN are updated every epoch, so when recognition is performed, the recognition rates of the modulation signals differ considerably because of their different convergence speeds. When the number of training epochs exceeds 40, all five kinds of modulation signals are recognized almost correctly. Fig. 5 shows the variation of the MSE of the five modulation signals with the number of training epochs in a noise-free environment, indicating the convergence performance of the DNN model during training. It can be seen from Fig. 5 that the MSE of 16QAM drops rapidly after several epochs and thus reaches good convergence first, followed by MSK, whose MSE decreases significantly between 10 and 20 training epochs. The decreasing trends of the MSE of QPSK and OQPSK are basically the same; between 20 and 30 training epochs, OQPSK converges slightly faster than QPSK. In addition, the rapid decrease in the MSE of the training set corresponds to the rapid increase in the recognition rate of each modulation signal in the test set. As Fig. 4(a) shows, the number of training epochs required for each signal to reach the convergence state increases in the order of 16QAM, MSK, OQPSK and QPSK, and the MSE decreases rapidly in the corresponding interval of training epochs.
To examine the discriminative ability of the WACE vector on the test set more closely, the two relatively similar signals, QPSK and OQPSK, are compared. In Fig. 6, as the training epochs increase, the probability of a QPSK signal being mistaken for an OQPSK signal first increases gradually, and the false recognition rate reaches its maximum between 15 and 20 training epochs. As training continues, the false recognition rate falls, and QPSK is hardly ever mistaken for OQPSK once the training epochs exceed 35. This also shows the effectiveness of the proposed method for distinguishing QPSK from OQPSK.
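The per-signal recognition rate and the QPSK-to-OQPSK false recognition rate discussed above can be computed from test-set labels as in the following minimal sketch. The function names and the toy label lists are illustrative assumptions and are not taken from the paper's simulation code.

```python
from collections import Counter


def recognition_rates(true_labels, predicted_labels):
    """Per-class recognition rate: correctly labeled signals / all signals of that class."""
    total = Counter(true_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {label: correct[label] / total[label] for label in total}


def false_recognition_rate(true_labels, predicted_labels, actual, mistaken_as):
    """Fraction of `actual` test signals that the classifier labeled as `mistaken_as`."""
    n_actual = sum(1 for t in true_labels if t == actual)
    n_wrong = sum(
        1
        for t, p in zip(true_labels, predicted_labels)
        if t == actual and p == mistaken_as
    )
    return n_wrong / n_actual if n_actual else 0.0


# Toy labels (hypothetical, for illustration only).
true_labels = ["QPSK"] * 4 + ["OQPSK"] * 4
predicted = ["QPSK", "QPSK", "OQPSK", "QPSK", "OQPSK", "OQPSK", "OQPSK", "QPSK"]

rates = recognition_rates(true_labels, predicted)
qpsk_as_oqpsk = false_recognition_rate(true_labels, predicted, "QPSK", "OQPSK")
```

Evaluating these two quantities after every training epoch would give curves of the kind plotted in Fig. 4 and Fig. 6.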

B. ACTUAL SITUATION WITH NOISE
To explore how well the WACE vector recognizes the above five modulation signals in a noisy environment, MSK and QPSK are selected and their modulation recognition performance at low SNR is examined. The added noise is additive white Gaussian noise (AWGN), and the SNR values used in the simulation are 1 dB, 2 dB, 5 dB and 10 dB. Fig. 7 shows the recognition rates of the MSK and QPSK test sets at each SNR. It can be seen from Fig. 7 that as the SNR decreases, the number of training epochs required for MSK and QPSK to converge increases significantly.
In addition, WACE has a strong ability to extract the key features of a noisy modulation signal, and most of the influence of noise can be removed by increasing the number of training epochs. When the SNR is 1 dB, MSK and QPSK can still be identified accurately, which is attributed to the denoising characteristics of the WACE vector. This also shows that the proposed method can achieve modulation recognition at low SNR at the small cost of slightly increasing the training epochs of the DNN. Fig. 8 shows that the MSE of MSK in the final converged state increases as the SNR decreases. However, this does not prevent convergence at low SNR: at 1 dB, the MSE can still reach about $10^{-7}$, that is, modulation recognition at low SNR can be realized by combining WACE with the DNN. In addition, as shown in Fig. 4(a) and Fig. 7, the recognition rate of MSK increases rapidly between 10 and 20 training epochs. Correspondingly, Fig. 8 shows that at any fixed number of training epochs between 10 and 20, the MSE decreases as the SNR increases. The other modulation signals, 16QAM, OQPSK and BPSK, follow similar trends. To sum up, the results show that the proposed model is effective for modulation recognition of multiple signals at low SNR.
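For reference, the following minimal sketch shows one common way to add AWGN to a real-valued signal at a prescribed SNR, as is done for the 1 dB, 2 dB, 5 dB and 10 dB cases above. The test tone, its normalized frequency of 0.05 cycles per sample, and the function names are assumptions made for illustration; the paper does not specify how its noisy signals were generated.

```python
import math
import random


def add_awgn(signal, snr_db, rng):
    """Add zero-mean white Gaussian noise so that signal power / noise power = snr_db."""
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR in dB from a clean signal and its noisy version."""
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Assumed stand-in for a modulated signal: a 500-sample carrier tone.
clean = [math.cos(2 * math.pi * 0.05 * n) for n in range(500)]

# Noisy copies at the SNR values used in the simulation.
noisy_by_snr = {snr: add_awgn(clean, snr, rng) for snr in (1, 2, 5, 10)}
estimated = {snr: measured_snr_db(clean, noisy) for snr, noisy in noisy_by_snr.items()}
```

With only 500 samples, the estimated SNR fluctuates by a few tenths of a dB around the target value, so longer signals are needed if a tighter match is required.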

C. ANALYSIS OF INDEX WEIGHT VECTOR SELECTION
At the end of Section III-C, the reason why this paper chooses the exponential weight vector $\gamma_{approx}$ to be 1.5 times the unit column vector is explained. In this section, we elaborate on the influence of $\gamma_{approx}$ on modulation recognition from the perspective of simulation analysis. Taking OQPSK as an example, we analyze the influence of different $\gamma_{approx}$ on OQPSK modulation recognition in an ideal noise-free environment and in a noisy environment with an SNR of 1 dB. Fig. 9(a) shows the modulation recognition of OQPSK in the noise-free environment when $\gamma_{approx}$ is α times the unit column vector, where α is 1.0, 1.5, 2.0, 2.5 and 3.0, respectively. Cases A and B in Fig. 9 are $\gamma_{approx} = [1, 2, 3, 4, 5, 6]^T$ and $\gamma_{approx} = [6, 5, 4, 3, 2, 1]^T$, respectively. It can be seen from the figure that as α increases, the number of training epochs required for recognition to reach a steady state gradually decreases, but the decrease becomes slower and slower. This is because, in the noise-free environment, increasing α exponentially amplifies the WACE features at each scale of the modulation signal, so the modulated signals become easier to discriminate and fewer training epochs are required for recognition. Once the exponential weight has increased to a certain extent, however, the further reduction in training epochs is not obvious. In case A, the large-scale weights are large and the small-scale weights are small. Some key information in the large-scale original signal is lost while the amplification is large, so the recognition speed slows down noticeably. In contrast, in case B, the large-scale weights are small and the small-scale weights are large. In the noise-free environment, the original signal features are then emphasized the most, and the fewest training epochs are required for recognition.
Fig. 9(b) shows the recognition of OQPSK in a noisy environment with an SNR of 1 dB. It can be seen that as α increases, part of the original noise is also exponentially amplified, and the fluctuation of the recognition rate gradually increases. When α reaches 3.0, the recognition of OQPSK deteriorates rapidly. As in the noise-free case, some key large-scale information is lost in case A, and the recognition rate increases slowly. In case B, the noise in the original signal is amplified too much, and modulation recognition fails. To sum up, α = 1.5 and α = 2.0 achieve a higher recognition rate at low SNR. Considering recognition stability and computational complexity, this paper chooses α = 1.5 to realize the modulation pattern recognition of multiple signals.
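The trade-off described above, where a larger α separates the signals more strongly but also amplifies noise, can be illustrated with a toy calculation. The sketch below assumes that the exponential weighting raises each positive per-scale feature to the power of the corresponding entry of $\gamma_{approx}$; this reading, the function names and all feature values are assumptions made for illustration only, and the exact WACE definition is the one given in Section III.

```python
ALPHAS = (1.0, 1.5, 2.0, 2.5, 3.0)


def apply_exponent_weights(features, gamma):
    """Raise each positive per-scale feature to the power of its weight (assumed form)."""
    return [f ** g for f, g in zip(features, gamma)]


def summarize(alpha, feat_a, feat_b, rel_noise):
    """Return (separation ratio, relative noise-induced error) at the first scale."""
    gamma = [alpha] * len(feat_a)  # alpha times the unit column vector
    weighted_a = apply_exponent_weights(feat_a, gamma)
    weighted_b = apply_exponent_weights(feat_b, gamma)
    perturbed_a = apply_exponent_weights([f * (1 + rel_noise) for f in feat_a], gamma)
    separation = weighted_a[0] / weighted_b[0]
    noise_error = perturbed_a[0] / weighted_a[0] - 1
    return separation, noise_error


# Hypothetical six-scale feature vectors of two signals (not values from the paper).
feat_a = [2.0, 1.8, 1.5, 1.2, 1.1, 1.05]
feat_b = [1.6, 1.5, 1.4, 1.15, 1.08, 1.04]

# A 5% relative perturbation stands in for the effect of noise on a feature.
results = {alpha: summarize(alpha, feat_a, feat_b, rel_noise=0.05) for alpha in ALPHAS}
```

Under this assumption, the separation ratio at a given scale grows as $(x_a / x_b)^{\alpha}$, while a small relative perturbation ε of a feature becomes approximately αε after weighting. Both quantities therefore increase with α, which is consistent with the faster convergence in the noise-free case and the larger fluctuation at 1 dB.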

D. COMPARISON OF WACE AND TRADITIONAL WAVELET ENTROPY
In this part, we compare the modulation recognition performance of the three wavelet entropies given in the paper without using the intelligent recognition network model. The modulation signal patterns include the five existing patterns and the newly added binary frequency shift keying (2FSK), and the parameters of the modulation signals are the same as above. In the simulation, 500 repeated experiments are performed to analyze the recognition rate of the three kinds of wavelet entropy, and classification is carried out according to the range of the characteristic parameters corresponding to each modulation signal. With a = [8, 16, 32, 16, 8, 4], the recognition rate is shown in Fig. 10.
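The range-based decision rule and the repeated-trial estimate of the recognition rate described above can be sketched as follows. The feature intervals, the Gaussian jitter model and the function names are hypothetical placeholders chosen for illustration; the actual characteristic parameter ranges are those obtained in the paper's simulation.

```python
import random

# Hypothetical, non-overlapping feature intervals (illustrative values only).
RANGES = {
    "BPSK": (0.0, 1.0),
    "QPSK": (1.0, 2.0),
    "2FSK": (2.0, 3.0),
}


def classify_by_range(value, ranges):
    """Return the label whose interval [low, high) contains the value, else None."""
    for label, (low, high) in ranges.items():
        if low <= value < high:
            return label
    return None


def recognition_rate(true_label, center, jitter_std, ranges, trials=500, seed=0):
    """Estimate the recognition rate over repeated trials with Gaussian feature jitter."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        feature = rng.gauss(center, jitter_std)  # assumed noise model for the feature
        if classify_by_range(feature, ranges) == true_label:
            hits += 1
    return hits / trials


# Small jitter stands in for high SNR, large jitter for low SNR.
rate_small_jitter = recognition_rate("QPSK", 1.5, 0.05, RANGES)
rate_large_jitter = recognition_rate("QPSK", 1.5, 0.5, RANGES)
```

In this toy setting, the recognition rate drops as the jitter grows because more feature values fall outside the interval of the true class, which mirrors the qualitative dependence of the recognition rate on SNR.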
We focus mainly on recognition at low SNR. (1) Compared with the adaptive wavelet entropy method, the WACE method increases the recognition rate by about 15% on average at low SNR, and when the SNR is higher than 2 dB, the correct recognition rate of all six modulated signals is above 90%. Moreover, when the SNR is 4 dB, the recognition rate reaches 95%, and when it is higher than 5 dB, the recognition rate is stable above 98%. (2) Compared with the wavelet energy entropy method, the correct recognition rate of the WACE method is much higher at all SNRs except near -5 dB. Therefore, the simulation shows that, compared with the existing methods, the WACE method extracts the features of the modulated signal better and has stronger anti-interference ability, while its computational complexity is almost the same as that of the original methods. However, the recognition rate of WACE alone is still not high enough and the weighting process is still complicated, so this paper introduces an intelligent recognition model to solve this problem.

E. PERFORMANCE COMPARISON WITH OTHER ALGORITHMS
The algorithm proposed in this paper can realize the modulation pattern recognition of many common signals, and it performs especially well on QPSK and OQPSK. OQPSK is not included in the modulation recognition algorithms of [24]-[26], while 16QAM is not considered in [27]. Moreover, most of the modulation recognition algorithms proposed in other papers perform intra-class recognition within one family of modulated signals, such as MQAM or MFSK [28]-[30]. Tab. 1 compares the recognition rate and related DNN parameters of this paper with those of other related papers at low SNR.
Each paper in Tab. 1 uses a different feature parameter extraction method and obtains a different number of feature parameters, but all of them use a DNN as the classifier to recognize the modulation patterns of multiple signals. (1) Compared with [24], the proposed algorithm reduces the number of DNN layers to three, which effectively reduces the complexity of the algorithm, and has better performance at low SNR. (2) Compared with [31], this paper significantly reduces the number of signal feature parameters and the algorithm complexity while achieving similar performance at low SNR. (3) Compared with [32], the recognition performance of this paper is better at low SNR when the complexity of the DNN is similar. To sum up, the proposed method can effectively realize modulation pattern recognition at low SNR and reduce the computational complexity, which makes it more applicable in engineering.

Tab. 1. Comparison of the recognition rate and DNN-related parameters of this paper and related papers at low SNR.

| Paper | Lowest SNR (dB) for recognition rate > 98% | Number of feature parameters | Number of DNN layers |
| --- | --- | --- | --- |
| WACE in this paper | 1 | 6 | 3 |
| Clustering in [24] | 4 | 6 | 4 |
| High-order cumulant in [31] | 0 | 11 | 3 |
| Cyclic spectrum in [32] | 5 | 4 | 3 |

VI. CONCLUSIONS
Based on the traditional wavelet energy entropy and the adaptive wavelet entropy, a new improved wavelet entropy, the WACE, is proposed. It extracts the relevant features from the modulated signal better and has better anti-noise performance. Then, in order to reduce the computational complexity of the DNN and improve the output performance of modulation recognition at low SNR, the system model of modulation recognition is established and five typical modulation patterns of communication signals are selected. Finally, the proposed method is simulated in an ideal noise-free environment and at different SNRs. The simulation results show that the intelligent modulation recognition method based on WACE and DNN not only reduces the complexity of the DNN, but also improves its modulation recognition performance at low SNR. In a complex electromagnetic environment with low SNR, a higher recognition rate can be achieved by slightly increasing the training epochs of the DNN, which demonstrates the effectiveness of the modulation recognition method.