Number of Coherent Nodes Estimation Based on Multiple Feature Extraction in Wireless Sensor Network

Knowing the number of nodes is a precondition for locating them and estimating other parameters in a wireless sensor network; in practice, it can be evaluated from the signals received by a sensor array. Popular algorithms are appropriate only for uncorrelated signals under Gaussian white noise, and their estimation precision deteriorates in harsh environments. Hence, a new algorithm for determining the number of coherent nodes based on multiple feature extraction in a wireless sensor network is presented in this article. First, spatial difference smoothing is performed on the array data. Then the eigenvalues and eigenvectors are acquired by eigen-decomposition, from which multiple features used for training are constructed. Finally, a back propagation neural network and particle swarm optimization are exploited to calculate the node number. Simulation results demonstrate that the algorithm performs well under colored noise and small sample sizes.


I. INTRODUCTION
With the development of sensors, embedded systems, information processing, and wireless communication, wireless sensor networks (WSNs) consisting of many nodes make it possible to obtain effective information from land, sea, air, sky, and underground. A WSN is a network constructed from sensors capable of perception, computing, communication, and ad hoc self-organization. These nodes collaborate with one another [1]-[6] to monitor, perceive, and collect various environmental data in real time; these data can be fused to obtain detailed information and then transmitted to the users [7], [8]. Compared with traditional wireless communication networks, WSNs are self-organizing, low-cost, and dynamic, so they can be deployed in forests [9], [10], deserts [11], battlefields [12], [13], and other hostile environments, as well as in smart homes [14], [15], factories [16], [17], traffic systems [18], [19], and hospitals [20], [21]; the number of nodes can range from a few to tens of thousands.
The associate editor coordinating the review of this manuscript and approving it for publication was Qilian Liang.
The core mission of a WSN is detecting information of interest, and a common requirement is knowledge of the node number [22]-[25]: we can respond correctly only when the number of nodes is known accurately, and it is also a premise for estimating their directions of arrival, positions, powers, and other parameters. In practice, this information can be determined from the signals acquired by an array of sensors. In practical applications there are various disturbances and complex noise; meanwhile, the number of samples is always limited by the hardware and by harsh environments, so the node number is often calculated inexactly, leading to deviations in the other estimates. Generally speaking, the nodes to be estimated are our own cooperative targets, but there are also many non-cooperative nodes, some of which are correlated, or even coherent, with one another. The most typical examples are radars, submarines, satellites, and the decoys set deliberately by an enemy; these form their own wireless networks and bring great difficulty to our localization and other parameter estimations [26], [27]. Therefore, studying an effective node number estimation algorithm has great theoretical significance and practical value.
The problem of signal number estimation dates back to the late 1950s, when it was necessary to set a detection threshold to compare with a test statistic, making the approach susceptible to subjective influence. Since the minimum description length (MDL) criterion [28] and other information theoretic criteria [29] were put forward, their good estimation performance has attracted wide attention, but they are suitable only for Gaussian white noise. Concerning this issue, some scholars proposed Gerschgorin disk estimation [30], which is appropriate for colored noise; these methods were further improved by the spatial smoothing technique so as to apply to coherent sources. Another class of algorithms is based on Bootstrap, which resamples the received data; in 2000, Brcich was the first to use Bootstrap and hypothesis testing to compute the signal number [31]. Based on Bootstrap resampling, Zhang exploited clustering to estimate the number of sources at low signal-to-noise ratio (SNR) with small sample sizes; by making full use of the eigenvalues and eigenvectors, the precision is improved greatly compared with previous similar algorithms [32]. But because Bootstrap requires resampling, recombination, and recalculation of the received data hundreds of times, it is very time consuming.
In 1995, Vapnik proposed the support vector machine (SVM) [33], a statistical learning method aimed at small sample sizes and mainly used for data classification. SVM transforms a nonlinear classification problem into a linear one by increasing the data dimension; it shows good generalization ability and avoids the overfitting common in traditional algorithms, with relatively high efficiency and accuracy. At present, SVM has been applied to text classification, biomedicine, face recognition, and signal number estimation [34], [35]; meanwhile, more and more researchers solve these problems with neural networks [36], [37], but these methods are suitable only for uncorrelated sources and Gaussian white noise.
In general, the coherent node sources that commonly exist in WSNs are signals with the same frequency and fixed phase differences; they lead to rank deficiency of the received-data covariance matrix, resulting in misjudgment of the number. Besides, as long as the background noise does not satisfy the white statistical property, it can be considered colored noise, and the smaller eigenvalues are then no longer approximately equal. Hence this article presents a new algorithm for determining the coherent node number, with three contributions. First, differential smoothing is used for decorrelation to obtain a covariance matrix free of the information of the uncorrelated sources and the noise. Second, multiple features are formed from the eigenvalues and eigenvectors. Third, a back propagation neural network (BPNN) is employed to evaluate the signal number, and in the parameter initialization, particle swarm optimization (PSO) is utilized to solve the problems of unstable training and poor convergence.
The organization of the paper is as follows: Section II models the signal composed of uncorrelated sources, coherent sources, and colored noise. Section III gives the process of node number estimation based on BPNN. Section IV demonstrates the algorithm's performance through simulations. Section V concludes the paper.

II. SIGNAL MODEL
As shown in Figure 3, a uniform linear array (ULA) formed by M sensors receives K far-field mixed narrow-band sources, where K_u independent sources arrive from θ_u (u = 1, 2, ..., K_u), K_c coherent ones arrive from θ_c (c = 1, 2, ..., K_c), and K_u + K_c = K. The array output at time t is written as

x(t) = A_u s_u(t) + A_c s_c(t) + n(t),

and its covariance matrix can be decomposed as

R = R_N + R_NT + R_n, (1)

where R_N is the covariance matrix of the uncorrelated sources and has a Toeplitz structure; that is, a Toeplitz matrix E satisfies E = J E^T J, where J is the permutation (exchange) matrix whose elements on the anti-diagonal equal one and all others are zero. R_NT and R_n are respectively the covariance matrices of the coherent sources and the noise. As in Figure 3, M similar subarrays can be acquired by virtual sliding, each including M sensors; then, following the idea of spatial smoothing, the virtually smoothed data are obtained by averaging the covariance of every subarray.
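The mixed-signal model above can be sketched numerically. The following is a minimal simulation under stated assumptions: the function name, the unit-power complex-Gaussian waveforms, and the random fixed gains for the coherent group are illustrative choices, not the paper's code.

```python
import numpy as np

def ula_output(M, thetas_u, thetas_c, snr_db, snapshots, seed=None):
    """Simulate x(t) = A_u s_u(t) + A_c s_c(t) + n(t) for a half-wavelength ULA.
    thetas_u: DOAs (degrees) of independent sources; thetas_c: DOAs of sources
    that are mutually coherent (one waveform scaled by fixed complex gains)."""
    rng = np.random.default_rng(seed)
    m = np.arange(M)[:, None]                         # sensor index column

    def steer(thetas):
        th = np.deg2rad(np.atleast_1d(thetas))
        return np.exp(-1j * np.pi * m * np.sin(th))   # spacing d = lambda/2

    A_u, A_c = steer(thetas_u), steer(thetas_c)
    Ku, Kc = A_u.shape[1], A_c.shape[1]

    def cgauss(*shape):
        # unit-power circular complex Gaussian samples
        return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

    s_u = cgauss(Ku, snapshots)                       # independent waveforms
    s0 = cgauss(snapshots)                            # single coherent waveform
    gains = np.exp(1j * rng.uniform(0, 2 * np.pi, Kc))
    s_c = np.outer(gains, s0)                         # same frequency, fixed phase offsets

    x = A_u @ s_u + A_c @ s_c
    n = np.sqrt(10 ** (-snr_db / 10)) * cgauss(M, snapshots)
    return x + n
```

The coherent group is built from one waveform replicated with fixed complex gains, which is exactly what makes its covariance contribution rank deficient.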

III. DETERMINE NODE SIGNAL NUMBER

A. REMOVE UNCORRELATED SOURCES AND NOISE
The uncorrelated sources and the noise can be eliminated by introducing the spatial difference matrix R_d = R − J R^T J; since the uncorrelated-source and noise covariances are both Toeplitz and thus satisfy E = J E^T J, the difference cancels them, and the resulting covariance no longer contains their information. The M subarrays are processed and averaged, yielding a covariance matrix R_ds containing only the coherent sources; eigen-decomposition is then performed on R_ds.
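The difference step rests on the Toeplitz property stated above: any Toeplitz matrix E satisfies E = J E^T J, so subtracting J R^T J removes every Toeplitz component of R. A minimal sketch (the function name is ours):

```python
import numpy as np

def spatial_difference(R):
    """Spatial difference matrix Rd = R - J R^T J, where J is the exchange
    matrix (ones on the anti-diagonal). Toeplitz parts of R, i.e. the
    uncorrelated-source and stationary-noise covariances, cancel exactly."""
    M = R.shape[0]
    J = np.fliplr(np.eye(M))          # exchange (anti-identity) matrix
    return R - J @ R.T @ J
```

Because (J E^T J)[i, j] = E[M−1−j, M−1−i], which for a Toeplitz matrix depends only on i − j, the subtraction annihilates Toeplitz terms while the coherent-source term, which is not Toeplitz after decorrelation processing, survives.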

B. FEATURE EXTRACTION
After decomposing R_ds, some features containing signal information can be extracted. The eigenvalues represent the powers of the node signals, so they are selected as the first feature, λ_i (i = 1, 2, ..., M). Besides, the ratio of adjacent eigenvalues reflects their degree of variation, so it is chosen as the second feature, λ_i/λ_{i+1} (i = 1, 2, ..., M − 1). To differentiate the signal subspace from the noise subspace, the corresponding information in the eigenvectors can be utilized likewise: the mean square error of the eigenvector envelope manifests the regularity of the difference between signal and noise, and it is taken as the third feature. Therefore, three kinds of parameters are selected as input features. In fact, more characteristics can be extracted by fusing λ and e.
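The three feature groups can be assembled as follows. This is a sketch under assumptions: the exact envelope-MSE definition is not reproduced in the text, so the per-eigenvector variance of the magnitude envelope is used here as a stand-in; the function name is ours. For M = 8 the three groups give 8 + 7 + 8 = 23 features, matching the input-layer size used later.

```python
import numpy as np

def extract_features(R_ds):
    """Feature vector from the eigen-decomposition of the coherent-only
    covariance: eigenvalues, adjacent-eigenvalue ratios, and a mean-square
    measure of each eigenvector's envelope (assumed definition)."""
    vals, vecs = np.linalg.eigh(R_ds)       # eigh: ascending real eigenvalues
    vals, vecs = vals[::-1], vecs[:, ::-1]  # reorder to descending
    f1 = vals                                        # powers of the signals
    f2 = vals[:-1] / (vals[1:] + 1e-12)              # variation between neighbours
    env = np.abs(vecs)                               # eigenvector envelopes
    f3 = np.mean((env - env.mean(axis=0)) ** 2, axis=0)  # envelope MSE per vector
    return np.concatenate([f1, f2, f3])
```

Eigenvalue features dominate near the signal/noise boundary, while the envelope feature adds subspace information that pure eigenvalue criteria discard.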

C. BPNN TECHNIQUE
BPNN plays an irreplaceable role among intelligent algorithms and greatly raises the level at which problems can be analyzed; it is a widely used extension of the neural network approach. BPNN mainly comprises the forward propagation of input data and the back propagation of error. The principle is to establish the relationship between the input data and the ideal output by constantly adjusting the weights between the layers, with the aim of keeping the output close to the ideal output so as to minimize the error. The number of hidden layers can be set according to the requirements of the project: the more layers, the smaller the error and the better the generalization ability of the network, but the training time increases. Therefore, the number of hidden layers should be determined reasonably by the complexity of the data.
VOLUME 8, 2020
As can be seen from Figure 4, the BPNN has three layers: an input layer, a hidden layer, and an output layer, containing n, p, and m neurons, respectively; the connection weights w_ij and w_kj are limited to (−1, 1).
The input and output of the hidden layer are calculated from the training samples as

net_j = Σ_{i=1}^{n} w_ij x_i + b_j,  z_j = f(net_j),

where the weight vector is W_1 = [w_11 ... w_1n, w_21 ... w_2n, ..., w_p1 ... w_pn]_{1×(np)} and b_j is the intercept. Here net_j and z_j denote the input and output of the j-th neuron in the hidden layer, and f(·) is the transfer function, for which the Sigmoid f(x) = 1/(1 + e^{−x}) can be chosen. Then the input and output of the output layer are reckoned as

yi_k = Σ_{j=1}^{p} w_kj z_j + b_k,  y_k = f(yi_k),

where yi_k and y_k represent the input and output of the k-th neuron in the output layer. When the p-th group of samples is imported, the error is

E_p = (1/2) Σ_{k=1}^{m} (d_k − y_k)^2,

where d_k is the ideal output. E_p can be reduced by modifying the weightings w_ij and w_kj; once E_p is obtained, back propagation is implemented to decrease the error:
(1) Define the partial derivative of the error with respect to the output-layer input as δ_0.
(2) Calculate the corresponding partial derivative δ_h for the hidden layer.
(3) Modify the weighting w_kj with δ_0: w_kj(N + 1) = w_kj(N) + η δ_0 z_j, where η is the learning rate and w_kj(N + 1) is the connection weighting after N + 1 adjustments.
(4) Modify the weighting w_ij with δ_h in the same manner.
The error E_p is adjusted accordingly. The flow chart of BPNN is shown in Figure 5, and the steps are as follows:
(1) Divide the array data into training samples and testing samples.
(2) Initialize the parameters of the network: set the weightings and thresholds in (−1, +1), the minimum error ε, and the maximum learning times T_LM.
(3) Input the training samples, then compute the errors by calculating output of each neuron.
(4) Adjust the weightings and threshold of the network to minimize the error.
(5) Determine whether the parameters satisfy the requirement: if the error is less than the threshold or the number of iterations exceeds the maximum learning times, the testing samples are input for the estimation.
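The forward pass, error, and weight updates above can be sketched as a minimal one-hidden-layer network. This is an illustrative implementation under assumptions (class name, sample-by-sample updates, and zero-initialized structure choices are ours, not the paper's code):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyBPNN:
    """One-hidden-layer BP network: forward propagation, squared-error loss,
    and gradient-descent updates, mirroring steps (1)-(5) above."""
    def __init__(self, n, p, m, eta=0.4, seed=None):
        rng = np.random.default_rng(seed)
        self.W1 = rng.uniform(-1, 1, (p, n)); self.b1 = rng.uniform(-1, 1, p)
        self.W2 = rng.uniform(-1, 1, (m, p)); self.b2 = rng.uniform(-1, 1, m)
        self.eta = eta

    def forward(self, x):
        self.z = sigmoid(self.W1 @ x + self.b1)        # hidden output z_j
        self.y = sigmoid(self.W2 @ self.z + self.b2)   # network output y_k
        return self.y

    def backward(self, x, t):
        # delta_o: output-layer error term; delta_h: back-propagated hidden term
        d_o = (self.y - t) * self.y * (1 - self.y)
        d_h = (self.W2.T @ d_o) * self.z * (1 - self.z)
        self.W2 -= self.eta * np.outer(d_o, self.z); self.b2 -= self.eta * d_o
        self.W1 -= self.eta * np.outer(d_h, x);      self.b1 -= self.eta * d_h

    def train(self, X, T, epochs=2000):
        for _ in range(epochs):
            for x, t in zip(X, T):
                self.forward(x)
                self.backward(x, t)
```

In the proposed scheme the input dimension would be the 23-element feature vector and the single output neuron would regress the node number.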

D. PARAMETER INITIALIZATION
Though BPNN has a strong learning ability, it easily falls into local optima, which can yield a good training result but a poor testing effect. Besides, random initial weightings and thresholds can lead to poor convergence. Therefore, PSO is employed here to optimize the initial parameters of BPNN: all the weightings in the network are encoded into vectors representing particle positions, and the connection weightings are then reconstituted by iterative optimization, so that the problems of local optima and complex computation can be handled effectively.
The PSO algorithm mainly simulates the foraging of a bird flock: during foraging, a bird that finds the best path to the food attracts the other birds to gather. Each bird is thus deemed a particle in the search space, and any location is described by the fitness function, so that the optimal location of each particle, moving with a certain velocity, can be recorded. The moving direction is determined based on the best positions found so far, and the flight speed is adjusted according to the rules of the algorithm. In the process of finding the best position, each particle constantly updates its speed and location, and the final output is the population optimal value selected from all the individuals.
Suppose there are N particles in the space and a D-dimensional objective function is selected as the fitness; the optimal solution of the objective function is then found by iteratively updating each particle. Define the position and velocity vectors of the i-th particle as X_i = (x_i^1, ..., x_i^D) and V_i = (v_i^1, ..., v_i^D). After searching the optimal individual position p_i and the optimal population position p_g, the particles update their velocities and positions to generate new populations:

v_i^d(t + 1) = w(t) v_i^d(t) + c_1 r_1 [p_i^d − x_i^d(t)] + c_2 r_2 [p_g^d − x_i^d(t)], (34)
x_i^d(t + 1) = x_i^d(t) + v_i^d(t + 1), (35)

where v_i^d(t) and x_i^d(t) are respectively the d-th components of the velocity and position of the i-th particle after t iterations, w(t) denotes the corresponding inertia weighting, c_1 and c_2 are learning factors governing the steps toward the best positions of the particle itself and of the whole population, and r_1 and r_2 are randomly selected in [0, 1]. Figure 6 gives the flow chart of PSO; the steps are summarized below:
(1) Initialize the parameters of the particles c_1, c_2, and T_max, and the original velocities and positions in the velocity and search spaces.
(2) Evaluate the fitness of each particle according to the defined fitness function.
(3) Set the individual optimal values, then find out the global optimum location from them.
(4) Update speed and position of each particle according to (34) and (35).
(5) Compare the current value with the former one to obtain the individual optimal value p_i^d; then compare the present global optimum with the historical optimum to get the global optimal position.
(6) If the number of iterations reaches the pre-set maximum, output the result; otherwise, return to (4).
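The steps above can be sketched as a compact PSO loop. This is a generic minimization sketch (function name, linearly decaying inertia schedule, and bounds are our assumptions):

```python
import numpy as np

def pso(fitness, D, N=30, T_max=100, c1=2.0, c2=2.0, seed=None):
    """Minimal PSO following steps (1)-(6): personal bests P, global best g,
    velocity/position updates per eqs. (34)-(35) with inertia weighting w(t)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1, 1, (N, D))            # initial positions
    V = rng.uniform(-0.1, 0.1, (N, D))        # initial velocities
    P = X.copy()                              # individual best positions
    p_fit = np.array([fitness(x) for x in X])
    g = P[np.argmin(p_fit)].copy()            # global best position
    for t in range(T_max):
        w = 0.9 - 0.5 * t / T_max             # decaying inertia weighting w(t)
        r1, r2 = rng.random((N, D)), rng.random((N, D))
        V = w * V + c1 * r1 * (P - X) + c2 * r2 * (g - X)   # eq. (34)
        X = X + V                                           # eq. (35)
        fit = np.array([fitness(x) for x in X])
        better = fit < p_fit                  # update individual bests
        P[better], p_fit[better] = X[better], fit[better]
        g = P[np.argmin(p_fit)].copy()        # update global best
    return g
```

For the proposed scheme, `fitness` would evaluate the BPNN training error for a candidate weight vector, and the returned `g` would seed the network's initial weightings.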
To sum up, the proposed algorithm can be generalized as follows:
Step 1: Remove the uncorrelated sources and noise from the received data to obtain the coherent-source term.
Step 2: Decompose the covariance matrix of the coherent sources, then extract multiple feature parameters from the eigenvalues and eigenvectors.
Step 3: Initialize the network parameters through PSO.
Step 4: Train the network with the corresponding data.
Step 5: Test the data with the trained network.
Step 6: Modify the parameters according to the testing data as appropriate.
As the proposed algorithm employs multiple feature extraction, BPNN, and PSO for coherent node signals, it is abbreviated as MBPC. It mainly comprises solving the covariance, extracting the coherent features, and training, so the complexity is about M^2 Z + 3M + M D T_max T_LM + M p, where Z is the number of snapshots. The proposed algorithm uses the virtual smoothing technique, so regardless of how many coherent and uncorrelated signals there are, it only needs to satisfy M > K and places no special requirements on the modulation mode.

E. DATA PREPROCESSING AND PARAMETER SETTING 1) DATA PREPROCESSING
In order to transform data with different orders of magnitude onto a common scale that can be computed directly, as well as to exclude interferences, the original data are mapped into the domain [−1, +1] by

x' = 2(x − x_min)/(x_max − x_min) − 1.
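This preprocessing can be sketched with standard min-max scaling to [−1, +1] (the function name is ours; the paper's exact mapping is not reproduced, so this is the common linear form assumed here):

```python
import numpy as np

def normalize(x):
    """Linearly map a feature vector onto [-1, +1] (min-max scaling):
    x' = 2 * (x - min) / (max - min) - 1."""
    x = np.asarray(x, dtype=float)
    return 2.0 * (x - x.min()) / (x.max() - x.min()) - 1.0
```

Each of the three feature groups would be scaled before being fed to the input layer, so that eigenvalues and ratios with very different magnitudes contribute comparably.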

2) DETERMINE THE NUMBER OF NEURONS IN EACH LAYER
Three classical layers are used here; the node number is estimated according to the differences between the signal eigenvalues, the noise eigenvalues, and their corresponding eigenvectors. In this article, 23 features, comprising the eigenvalues, the ratios of adjacent eigenvalues, and the mean square errors of the eigenvector envelopes, are picked, so the number of neurons in the input layer is n = 23. The output is the node number, so the neuron number in the output layer is m = 1. To decrease the complexity of the network, only one hidden layer is set. A reasonable neuron number must then be chosen: if it is too small, a good result cannot be acquired; if too large, training takes a long time. The formula for the neuron number in the hidden layer is

p = √(n + m) + α,

where p, n, and m are respectively the neuron numbers of the three layers. According to practical experience, α is chosen in [1, 10], and p is set to 10.
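The sizing above can be checked with the common empirical rule p = √(n + m) + α, which is the form assumed here (the function name and the specific α are illustrative):

```python
import math

def hidden_neurons(n, m, alpha):
    """Empirical hidden-layer sizing rule p = sqrt(n + m) + alpha,
    with alpha chosen in [1, 10] according to the problem."""
    return round(math.sqrt(n + m) + alpha)
```

With n = 23 inputs and m = 1 output, √24 ≈ 4.9, so an α near 5 reproduces the p = 10 used in the network.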

3) TRANSFER FUNCTION
In the course of constructing the BPNN, the non-linear Sigmoid activation function is used; it maps the data into [−1, +1] and makes the input variable reflect the relation between the input and output data.

4) PARAMETER SETTING FOR BPNN
To ensure the stability of the network and the rate of convergence, the learning rate is suggested to be 0.4, the maximum learning times T_LM = 2000, and the minimum desired error 0.001.

5) PARAMETER SETTING FOR PSO
On the basis of actual sample scale, define the size of particle population as N = 30, and set the connection weighting,

6) COLORED NOISE MODEL
Colored noise widely exists in reality; in this case the covariance matrix of the received data contains a noise term R_N whose off-diagonal elements are no longer zero and whose structure departs from a scaled identity. Referring to [38], its elements are expressed in terms of σ_n^2, the colored noise power, and ρ, the spatial correlation coefficient in [0, 1].
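A common spatially correlated noise model of this kind, used here as a stand-in since the exact form of [38] is not reproduced, sets [R_N]_{ij} = σ_n² ρ^|i−j|; samples with that covariance can be drawn via a Cholesky factor (function names are ours):

```python
import numpy as np

def colored_noise_cov(M, sigma2, rho):
    """Spatially correlated noise covariance [R]_{ij} = sigma2 * rho**|i-j|.
    (A common model; the exact form used in [38] may differ.)"""
    idx = np.arange(M)
    return sigma2 * rho ** np.abs(idx[:, None] - idx[None, :])

def sample_colored_noise(M, snapshots, sigma2, rho, seed=None):
    """Draw complex Gaussian snapshots with the covariance above."""
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(colored_noise_cov(M, sigma2, rho))  # R = L L^T
    w = (rng.standard_normal((M, snapshots))
         + 1j * rng.standard_normal((M, snapshots))) / np.sqrt(2)
    return L @ w
```

Setting ρ = 0 recovers white noise, which is how the WN and CN simulation scenarios below differ.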

IV. SIMULATION
Suppose the ULA is composed of eight omnidirectional sensors with spacing d = λ/2, where λ is the wavelength of the node signal. The simulations are run in the MATLAB environment; 80% of the snapshots are used for training and the rest for testing, with 500 Monte Carlo trials. The modified MDL criterion [39] based on forward-backward spatial smoothing (MFBS) [40] and MBPC are separately employed for the estimation.
Example 1: Estimation success probability for node signals with large angle intervals. Assume that three far-field narrow-band node signals arrive at the array from (20°, 40°, 60°); the former two are coherent with each other and the third is uncorrelated with them, in a Gaussian white noise (WN) environment. The success probabilities of the two algorithms versus SNR are shown in Figure 7 for 200 snapshots, and Figure 8 gives the same versus the number of snapshots at an SNR of 10 dB. Meanwhile, the same simulations under additive colored noise (CN) are exhibited in Figure 9 and Figure 10. As seen from Figure 7, MFBS cannot estimate the node signals at all when the SNR is lower than 2 dB, and it reaches 100% above 6 dB. The proposed MBPC algorithm eliminates the noise, which corresponds to enhancing the SNR, and PSO is exploited for the BPNN initialization, so it maintains a higher success probability throughout. Figure 8 demonstrates that MBPC performs better than MFBS with small numbers of snapshots. From Figure 9 and Figure 10, as a result of removing the noise term, the results are nearly the same as those under WN. However, the elimination of CN is inferior to that of WN at low SNR or with few snapshots, so the estimation results under WN are slightly better than those under CN.
Example 2: Estimation success probability for node signals with small angle intervals. Similarly, assume that three far-field narrow-band node signals arrive at the array from (19°, 20°, 40°); the latter two nodes are coherent with each other and the first is uncorrelated with them, with the other conditions the same as in Example 1. Figure 11 and Figure 12 display the simulation results under WN, while Figure 13 and Figure 14 show those under CN.
As seen from Figure 11 and Figure 12, because the latter two nodes are close to each other, their steering vectors in MFBS are more or less correlated, so the covariance matrix after forward-backward smoothing still suffers from rank deficiency, and thus their number and locations cannot be distinguished. However, MBPC has removed the uncorrelated part, so it is not affected by the small angle interval caused by the uncorrelated node; in addition, its performance remains excellent at lower SNR or with few snapshots. It can also be observed from Figure 13 and Figure 14 that, on account of removing the noise term, the results are nearly the same as those under WN.

V. CONCLUSION
To determine the number of coherent nodes in a wireless sensor network under colored noise and uncorrelated interferences, a novel algorithm based on BPNN is proposed. It gets rid of the uncorrelated information and the noise, then utilizes the network's strong learning ability to make a decision. Meanwhile, PSO is employed for the parameter initialization of BPNN, avoiding the problem of falling into local extrema. The new algorithm still performs well at low SNR, with few snapshots, and with nearby uncorrelated interferences to some extent. Moreover, the trained network is adopted so as to reduce the complexity and improve the estimation efficiency, promoting its practical applications in the future.