Information Geometry-Based Fuzzy-C Means Algorithm for Cooperative Spectrum Sensing

In this paper, the spectrum sensing problem is investigated in the context of information geometry, and a novel clustering-based spectrum sensing scheme is developed to obtain a classifier that estimates the channel state of the primary user (PU). To enhance the sensing performance in complex environments, the empirical mode decomposition (EMD) algorithm is applied to remove the noise component of the signals received by all secondary users (SUs). Subsequently, a signal matrix is constructed from the reconstructed signals. Based on information geometry (IG) theory, the covariance matrix of the signal matrix is mapped to a point on a manifold, and the sample points on the manifold are collected as a data set. Moreover, a novel clustering algorithm, namely the Riemannian distance based Fuzzy-c means clustering (RDFCM) algorithm, is developed to cluster the samples on the manifold and obtain a classifier, which is employed to decide the PU state. The simulation results show that, compared with other spectrum sensing methods, the proposed scheme improves the detection performance.


I. INTRODUCTION
With the rapid development of wireless communication technology, the number of wireless devices and services has increased dramatically over the past decade, which requires a great deal of spectrum resources. However, according to the investigation report of the Federal Communications Commission, the utilization of spectrum resources is poor [1]- [3]. To overcome this drawback, spectrum sensing, a vital technique in cognitive radio (CR), has been extensively developed to obtain awareness of the spectrum usage and the existence of primary users (PUs) in an area [4]- [7]. Several conventional spectrum sensing approaches, such as energy detection (ED) [8], [9], matched filtering detection (MFD) [10], [11] and cyclostationary feature detection (CFD) [12], [13], have been developed. The ED approach is a comparatively simple technique, which obtains the sensing decision by calculating the energy of the signal and comparing it with a threshold. However, it is vulnerable to noise uncertainty. By analyzing the spectral correlation function of the signal, the CFD method is able to distinguish the signal from the noise, but its computational burden is enormous. The MFD algorithm requires a priori knowledge of the PU, such as its bandwidth and operating frequency, which restricts its practical application.

The associate editor coordinating the review of this manuscript and approving it for publication was Hayder Al-Hraishawi.

A. RELATED WORK
The aforementioned works are based on a single secondary user (SU), which is susceptible to multipath fading, shadowing and receiver uncertainty. To overcome these bottlenecks, cooperative spectrum sensing (CSS), an indispensable framework in CR, has attracted widespread attention because it improves the sensing performance by integrating the information received from multiple SUs [14]- [17]. In contrast to single-SU sensing, the SUs in CSS transmit their sensing information to a fusion center (FC), which makes a global decision that is more precise than an individual decision. Most CSS approaches are based on random matrix theory (RMT) [18]- [20]. The FC establishes a signal matrix from the signals received from all the SUs. Then, an eigenvalue statistic of the covariance matrix is calculated and compared with a threshold to determine the state of the PU. On the other hand, information geometry (IG) based CSS methods are gradually being investigated [21]- [23]. In [22], a novel wideband spectrum sensing method based on the Riemannian distance and the Riemannian mean was proposed. Based on IG theory, a Riemannian distance detector (RDD), which requires neither the statistical characteristics of the noise nor a priori knowledge of the PU, was designed to detect spectrum holes. In [23], the covariance matrices of the received signal matrix were mapped to a series of coordinates on the statistical manifold, and the signal features were constructed from the geodesic distances between the coordinates and the corresponding Riemannian mean. Then, a constant false alarm rate detector was applied to determine the state of the PU. In the view of IG theory, the statistical features lie on a manifold rather than in a linear space, which provides a novel perspective for analyzing the spectrum sensing problem.
It should be emphasised that all the aforementioned works need to derive a precise threshold to decide the channel state, which is intractable in complicated environments. In recent years, machine learning has been widely applied in various fields due to its capacity to employ mathematical calculations to analyze and interpret patterns and structures in data. In the spectrum sensing community, several machine learning based schemes have been proposed [24]- [28]. In [24], new CSS approaches based on machine learning algorithms, such as K-means clustering, the Gaussian mixture model and K-nearest-neighbor, were developed. The vector of signal energies is used as a feature vector to train a classifier that decides whether the PU is present or not. In [25], a spectrum sensing method based on the K-means clustering and support vector machine algorithms was presented. To reduce the computational burden, a low-dimensional probability vector was designed as the feature vector and used to obtain a classifier for spectrum sensing. In [26], an eigenvalue statistic of the covariance matrix, such as the maximum-minimum eigenvalue (MME), was treated as the feature vector to train a classifier by using the K-means and K-medoids algorithms, respectively. In general, machine learning based spectrum sensing approaches use a feature vector, such as an energy statistic, a probability vector or an eigenvalue statistic of the covariance matrix, to train a classifier with a supervised or unsupervised learning algorithm. Therefore, choosing an appropriate feature and machine learning algorithm is crucial to obtaining the desired sensing performance.

B. MOTIVATIONS
The weaknesses of the different methods are summarized in Table 1.
To overcome these problems, a new spectrum sensing scheme based on IG theory is developed in this paper. To realize clustering directly on the manifold, the scheme uses a newly designed Riemannian distance based Fuzzy-c means clustering (RDFCM) algorithm. In real scenarios, the signals received by the SUs are contaminated by noise, which degrades the detection performance. To address this issue, the empirical mode decomposition (EMD) algorithm, which is regarded as a powerful technique for nonstationary and nonlinear signals, is adopted to eliminate the noise component [29]- [31]. Then, based on IG theory, the covariance matrices of the signal matrices are mapped to a series of coordinate points on the manifold. Subsequently, the RDFCM algorithm is designed, which performs clustering directly on the manifold. Finally, a classifier is trained on the coordinate points on the manifold to perceive the state of the PU.
The novelty and contributions of this work are summarized as follows:
1) This paper develops a new clustering algorithm, namely the RDFCM algorithm. It achieves clustering directly on the manifold by applying the Riemannian distance to measure the distance between two points on the manifold. The original FCM algorithm is suitable for linear spaces only and cannot cluster points that lie on a manifold. Therefore, the developed RDFCM algorithm is an extension of the FCM algorithm.
2) Different from existing methods [20], [21], this paper applies the RDFCM clustering algorithm to the spectrum sensing problem. Unlike the traditional methods, this method is realized on the manifold and does not require a decision threshold. Moreover, based on the sample points on the manifold and the developed RDFCM algorithm, a classifier is trained to decide whether the PU is present or not.
3) This scheme integrates the EMD algorithm to remove the noise component of the signals received by the SUs, which can be considered a data preprocessing step that yields superior features.
In the simulation part, the validity of the developed scheme is analyzed for SUs at different signal-to-noise ratios (SNRs).
The rest of this paper is organized as follows. Section II introduces the CSS scenario. Section III proposes an EMD-based RDFCM approach (EMD-RDFCM) for CSS. In Section IV, the effectiveness of the EMD-RDFCM approach is verified under different conditions. In Section V, the conclusion is given.

II. COOPERATIVE SPECTRUM SENSING
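Consider a CR system with one PU, one FC and $L$ SUs, as displayed in Figure 1. In the CSS scenario, the SUs share their sensing data with the FC, which makes a global decision on the presence or absence of the PU. Typically, spectrum sensing can be considered as a binary hypothesis testing problem, which can be expressed as

$$ y_l(n) = \begin{cases} z_l(n), & H_0 \\ h_l(n)\, x_l(n) + z_l(n), & H_1 \end{cases} \qquad n = 1, \ldots, N, $$

where $y_l(n)$ and $x_l(n)$ represent the received and transmitted signals for the $l$th SU, $N$ is the number of sample points, $z_l(n)$ is Gaussian white noise satisfying $z_l(n) \sim \mathcal{N}(0, \sigma_z^2)$, $h_l(n)$ denotes the channel gain, and $H_0$ and $H_1$ stand for the absence and presence of the PU, respectively. Hence, the signal matrix can be established as

$$ Y = \begin{bmatrix} y_1(1) & y_1(2) & \cdots & y_1(N) \\ y_2(1) & y_2(2) & \cdots & y_2(N) \\ \vdots & \vdots & \ddots & \vdots \\ y_L(1) & y_L(2) & \cdots & y_L(N) \end{bmatrix}. $$

To reflect the sensing performance, the probabilities of detection $P_d$ and false alarm $P_f$ are formulated as

$$ P_d = \Pr(\text{decide } H_1 \mid H_1), \qquad P_f = \Pr(\text{decide } H_1 \mid H_0). $$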
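The signal model above can be illustrated with a minimal NumPy sketch that builds the $L \times N$ signal matrix under either hypothesis. The function name, the unit noise variance, the random ±1 PU waveform and the constant channel gains are illustrative assumptions, not details taken from this paper.

```python
import numpy as np

def simulate_signal_matrix(L, N, snr_db, pu_active, rng):
    """Return an L x N matrix whose l-th row holds y_l(1), ..., y_l(N).

    Assumptions (hypothetical, for illustration only): unit noise
    variance, a random +/-1 PU waveform seen by all SUs, and constant
    gains h_l chosen to give each SU its target SNR in dB.
    """
    snr_db = np.broadcast_to(np.asarray(snr_db, dtype=float), (L,))
    noise = rng.standard_normal((L, N))       # z_l(n) ~ N(0, 1)
    if not pu_active:                         # H0: noise only
        return noise
    x = rng.choice([-1.0, 1.0], size=N)       # hypothetical PU signal
    h = np.sqrt(10.0 ** (snr_db / 10.0))      # gain giving the target SNR
    return h[:, None] * x[None, :] + noise    # H1: h_l x(n) + z_l(n)

rng = np.random.default_rng(0)
Y0 = simulate_signal_matrix(L=2, N=1000, snr_db=-10, pu_active=False, rng=rng)
Y1 = simulate_signal_matrix(L=2, N=1000, snr_db=[-10, -12], pu_active=True, rng=rng)
```

Passing a list for `snr_db` gives each SU its own SNR, which mirrors the "SUs in different SNR" setting considered later.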

III. COOPERATIVE SPECTRUM SENSING BASED ON EMD-RDFCM APPROACH
The structure of the EMD-RDFCM approach is given in Figure 2.
To begin with, the EMD algorithm is employed to eliminate the noise component in the received signal. Subsequently, the reconstructed signal matrix is established and the corresponding covariance matrix is calculated. Moreover, based on the IG theory, the covariance matrices are mapped to coordinate points on the manifold and the new RDFCM clustering algorithm is developed. Finally, a data set is prepared to train a classifier to determine the state of the channel.

A. EMD-BASED SIGNAL DENOISING
In practical sensing scenarios, the SUs are disturbed by noise, which degrades the sensing performance. To address this problem, the EMD algorithm, which is considered an effective technique for decomposing nonlinear and nonstationary signals, is employed to remove the noise component of the noisy signals collected by the SUs. The basic idea of the EMD algorithm is to decompose the signal into a series of intrinsic mode functions (IMFs). The major advantage of EMD is that the basis functions are derived from the signal itself. Hence, the analysis is adaptive, in contrast to traditional methods in which the basis functions are fixed. Most of the noise is concentrated in the higher-frequency modes. Hence, by reconstructing the signal from the low-frequency modes, the noise part can be largely eliminated. Assume that $y_l(t)$ represents the noisy signal received by the $l$th SU. The decomposition process of the EMD algorithm is given in Algorithm 1.
After EMD decomposition, the noisy signal $y_l(t)$ received by the $l$th SU can be represented by

$$ y_l(t) = \sum_{c=1}^{C} \mathrm{IMF}_c(t) + r_C(t), $$

where $C$ is the number of modes and $r_C(t)$ is the residual. To remove the noise components in $y_l(t)$, the signal should be reconstructed from the low-frequency IMF components. Hence, we need to find a mode index $j_s$ after which the IMF components are dominated by the signal. In this paper, the consecutive MSE (CMSE) is adopted to determine $j_s$. Define the partial reconstruction

$$ \tilde{y}_{l,k}(t) = \sum_{c=k}^{C} \mathrm{IMF}_c(t) + r_C(t), \qquad k = 1, \ldots, C. $$

Then, the CMSE is defined as

$$ \mathrm{CMSE}(\tilde{y}_{l,k}, \tilde{y}_{l,k+1}) = \frac{1}{N} \sum_{i=1}^{N} \left[ \tilde{y}_{l,k}(t_i) - \tilde{y}_{l,k+1}(t_i) \right]^2 = \frac{1}{N} \sum_{i=1}^{N} \left[ \mathrm{IMF}_k(t_i) \right]^2, \qquad k = 1, \ldots, C-1. $$

Therefore, the index $j_s$ is obtained by

$$ j_s = \arg\min_{1 \le k \le C-1} \mathrm{CMSE}(\tilde{y}_{l,k}, \tilde{y}_{l,k+1}). $$

After EMD denoising, the received signal of the $l$th SU is expressed by

$$ \bar{y}_l(t) = \sum_{c=j_s}^{C} \mathrm{IMF}_c(t) + r_C(t). $$

Then, the signal matrix is represented as

$$ \bar{Y} = [\bar{y}_1, \bar{y}_2, \ldots, \bar{y}_L]^T. \tag{10} $$
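The CMSE-based mode selection can be sketched as follows, assuming the IMFs have already been computed and are ordered from the highest to the lowest frequency. The function name and array layout are hypothetical, and the sketch follows the common form of the CMSE rule, in which the CMSE between consecutive partial reconstructions equals the mean energy of the dropped IMF.

```python
import numpy as np

def cmse_denoise(imfs, residual):
    """Pick j_s by the consecutive-MSE rule and rebuild the signal.

    imfs: array of shape (C, N), ordered from highest to lowest frequency.
    residual: array of shape (N,), the final EMD residual r_C.
    Returns (j_s, denoised), with j_s a 1-based mode index.
    """
    imfs = np.asarray(imfs, dtype=float)
    residual = np.asarray(residual, dtype=float)
    # CMSE between partial reconstructions k and k+1 is the mean
    # energy of IMF_k, evaluated for k = 1, ..., C-1.
    cmse = np.mean(imfs[:-1] ** 2, axis=1)
    j_s = int(np.argmin(cmse)) + 1
    # Keep modes j_s, ..., C and add the residual back.
    denoised = imfs[j_s - 1:].sum(axis=0) + residual
    return j_s, denoised
```

Applying this function to the IMFs of each SU and stacking the outputs row by row would give the denoised signal matrix.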

B. SPECTRUM SENSING BASED ON INFORMATION GEOMETRY
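Based on (10), the covariance matrix of the signal matrix is calculated as

$$ C = \frac{1}{N} \bar{Y} \bar{Y}^T. $$

Then, the binary hypotheses can be formulated as

$$ H_0: C = \sigma_z^2 I, \qquad H_1: C = C_p + \sigma_z^2 I, $$

where $C_p \in \mathbb{R}^{L \times L}$ is the PU signal matrix and $I \in \mathbb{R}^{L \times L}$ is the identity matrix. From [22], [32], we know that $C$ follows the Wishart distributions $W(L, \sigma_z^2 I)$ and $W(L, C_p + \sigma_z^2 I)$ in the case of $H_0$ and $H_1$, respectively.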

Algorithm 1 EMD Algorithm
Step 1: Extract all the extreme points of $y_l(t)$.
Step 2: Calculate the upper and lower envelopes $y_l^{\max}(t)$ and $y_l^{\min}(t)$ by cubic spline interpolation. Then, the envelope mean is calculated by $m_l(t) = [y_l^{\max}(t) + y_l^{\min}(t)]/2$.
Step 3: Calculate the first component by $h_1(t) = y_l(t) - m_l(t)$. If $h_1(t)$ satisfies the IMF conditions, go to Step 4. Otherwise, repeat Steps 1 and 2 on $h_1(t)$.
Step 4: Let $\mathrm{IMF}_1 = h_1(t)$ and calculate the residual signal by $r_1(t) = y_l(t) - \mathrm{IMF}_1(t)$.
Step 5: If the residual is a monotonic function or a constant, let it be the final residual signal $r_C(t)$. Otherwise, treat the residual as the original signal and repeat Steps 1-5.
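A single sifting pass (Steps 1-3 of Algorithm 1) can be sketched as below. This is a simplified illustration: it uses linear interpolation via `np.interp` for the envelopes instead of the cubic splines named in Algorithm 1, it does not check the IMF conditions, and the function name is hypothetical.

```python
import numpy as np

def sift_once(y):
    """One sifting pass: subtract the mean of the upper and lower envelopes.

    Returns None when there are too few extrema to build envelopes,
    in which case y can be treated as a residual.
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(y.size)
    d = np.diff(y)
    # Interior extrema: sign change of the first difference.
    maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
    if maxima.size < 2 or minima.size < 2:
        return None
    # Linear envelopes (a simplification; Algorithm 1 uses cubic splines).
    upper = np.interp(t, maxima, y[maxima])
    lower = np.interp(t, minima, y[minima])
    mean_env = 0.5 * (upper + lower)
    return y - mean_env
```

Repeating `sift_once` on its own output until the IMF conditions hold, and then subtracting the resulting IMF from the signal, corresponds to Steps 3-5.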
IG is the application of differential geometry to statistics. The main idea is to use families of parameterized probability distributions to construct a statistical manifold and thereby transform a statistical problem into a geometric one. Spectrum sensing is a signal detection problem, which can be solved by analyzing the probability distribution of the detection data. Based on IG theory, the spectrum sensing problem can be transformed into a geometric problem on a manifold, and the properties of the families of probability distribution functions can be analyzed more intuitively by geometric methods. In the context of IG theory, the statistical manifold $S_m$ can be defined as

$$ S_m = \{ p(x|\theta) \mid \theta \in \Theta \subseteq \mathbb{R}^m \}, $$

where $x \in \mathbb{R}^n$ is the random variable, $\theta \in \mathbb{R}^m$ is the parameter vector, $p(x|\theta)$ is the probability density function, and $\Theta$ is the parameter set that indexes the probability distribution space. In IG theory, $\theta$ can be considered as the coordinate on the manifold. From the above analysis, the Wishart distributions $W(L, \sigma_z^2 I)$ and $W(L, C_p + \sigma_z^2 I)$ can be mapped onto the statistical manifold, and the covariance matrices $\sigma_z^2 I$ and $C_p + \sigma_z^2 I$ can be regarded as the corresponding coordinates. The structure of the statistical manifold is shown in Figure 3. In IG theory, the Riemannian distance, i.e., the length of the shortest curve connecting two coordinate points on the manifold, can be used to indicate the similarity of two distributions. Let $\Sigma_1$ and $\Sigma_2$ be two points on the manifold. Then, the Riemannian distance between $\Sigma_1$ and $\Sigma_2$ can be calculated by

$$ D^2(\Sigma_1, \Sigma_2) = \left\| \log\left( \Sigma_1^{-1/2} \Sigma_2 \Sigma_1^{-1/2} \right) \right\|^2 = \sum_{i=1}^{L} \log^2 \lambda_i, $$

where $\|\cdot\|$ is the Frobenius norm and $\lambda_i$ is the $i$th eigenvalue of the matrix $\Sigma_1^{-1} \Sigma_2$.
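As a rough illustration, the sample covariance matrix and the squared Riemannian distance between two symmetric positive definite matrices can be computed with NumPy as follows. The function names are hypothetical, and the distance is evaluated through the eigenvalues of $\Sigma_1^{-1}\Sigma_2$, which is the eigenvalue form described in the text.

```python
import numpy as np

def sample_covariance(Y):
    """L x L sample covariance (1/N) Y Y^T of an L x N signal matrix."""
    Y = np.asarray(Y, dtype=float)
    return Y @ Y.T / Y.shape[1]

def riemannian_distance_sq(S1, S2):
    """Squared Riemannian distance between SPD matrices S1 and S2.

    Computed as sum_i log^2(lambda_i), where lambda_i are the
    eigenvalues of S1^{-1} S2 (real and positive for SPD inputs).
    """
    lam = np.linalg.eigvals(np.linalg.solve(S1, S2)).real
    return float(np.sum(np.log(lam) ** 2))
```

The distance is zero only when the two matrices coincide and is symmetric in its arguments, so it can serve as a dissimilarity between a sample covariance and a cluster center.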

C. COOPERATIVE SPECTRUM SENSING BASED ON RDFCM CLUSTERING ALGORITHM
FCM is an important clustering approach in unsupervised machine learning, which groups the data points into a specified number of clusters by minimizing the distances between the data points and the cluster centers, weighted by their fuzzy memberships [33]- [36]. Let $S = \{s_1, \ldots, s_k\}$ be a data set containing $k$ data points, $q$ be the number of clusters with $2 \le q < k$, and $C_F = \{c_1, \ldots, c_q\}$ be the set of cluster centers. Then, the FCM algorithm performs clustering by solving

$$ \min_{U, C_F} \sum_{i=1}^{q} \sum_{j=1}^{k} \mu_{ij}^m \| s_j - c_i \|^2 \quad \text{s.t.} \quad \sum_{i=1}^{q} \mu_{ij} = 1 \ \ \forall j, \qquad 0 < \sum_{j=1}^{k} \mu_{ij} < k \ \ \forall i, $$

where $\mu_{ij} \in [0, 1]$ is the membership of $s_j$ in class $i$, $m \in [1, \infty)$ is the degree of fuzzification, and $U = \{\mu_{ij}\} \in \mathbb{R}^{q \times k}$ is the membership matrix. The constraints guarantee that each data point has the same overall weight in the data set and that none of the clusters is empty. By using the alternating optimization approach, the update equations at the $p$th iteration (for $m > 1$) are obtained as

$$ \mu_{ij}^{(p)} = \left[ \sum_{t=1}^{q} \left( \frac{\| s_j - c_i^{(p-1)} \|}{\| s_j - c_t^{(p-1)} \|} \right)^{\frac{2}{m-1}} \right]^{-1}, \qquad c_i^{(p)} = \frac{\sum_{j=1}^{k} \left( \mu_{ij}^{(p)} \right)^m s_j}{\sum_{j=1}^{k} \left( \mu_{ij}^{(p)} \right)^m}. $$

Note that the traditional FCM algorithm uses the Euclidean distance to measure the distance between a sample point and a cluster center, which is only valid in a linear space. To achieve clustering on the manifold, a modified FCM algorithm, namely the RDFCM method, is developed, which uses the Riemannian distance instead of the Euclidean distance to measure the distance between two elements on the manifold.
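For reference, the standard Euclidean FCM iteration can be sketched in a few lines of NumPy. The function name, the deterministic initialization from evenly spaced samples, and the small constant that guards against division by zero are illustrative choices rather than details from this paper.

```python
import numpy as np

def fcm(S, q, m=2.0, iters=100, tol=1e-6):
    """Standard FCM on the rows of S (k x d).

    Returns (centers, U), where U is the q x k membership matrix.
    """
    S = np.asarray(S, dtype=float)
    # Simple deterministic initialization: q evenly spaced samples.
    idx = np.linspace(0, len(S) - 1, q).astype(int)
    centers = S[idx].copy()
    for _ in range(iters):
        # Squared Euclidean distances, shape (q, k); epsilon avoids 0-division.
        d2 = ((S[None, :, :] - centers[:, None, :]) ** 2).sum(-1) + 1e-12
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=0, keepdims=True)               # membership update
        W = U ** m
        new_centers = (W @ S) / W.sum(axis=1, keepdims=True)   # center update
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:
            break
    return centers, U
```

Each column of `U` sums to one, so `U.argmax(axis=0)` gives a hard assignment when one is needed.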
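In the spectrum sensing scenario, RDFCM needs to divide the sampled data located on the manifold into two groups, corresponding to an idle PU and an active PU. Before training the classifier, a data set $T$ containing $M$ covariance matrices needs to be obtained, such that

$$ T = \{ T_1, T_2, \ldots, T_M \}. $$

Let $T_j \in \mathbb{R}^{L \times L}$ be the $j$th sample in $T$, and let $\Upsilon_i$ denote the $i$th cluster center. Then, with $q = 2$ clusters, the objective function of the RDFCM algorithm is designed as

$$ J(U, \Upsilon) = \sum_{i=1}^{q} \sum_{j=1}^{M} \mu_{ij}^m D^2(T_j, \Upsilon_i) \quad \text{s.t.} \quad \sum_{i=1}^{q} \mu_{ij} = 1 \ \ \forall j, $$

where $D^2(T_j, \Upsilon_i)$ is the Riemannian distance between $T_j$ and $\Upsilon_i$, $\mu_{ij}$ is the membership, and $m = 2$. The Lagrange multiplier method is employed to minimize the objective function. Hence, we can obtain

$$ L = \sum_{i=1}^{q} \sum_{j=1}^{M} \mu_{ij}^m D^2(T_j, \Upsilon_i) + \sum_{j=1}^{M} \lambda_j \left( 1 - \sum_{i=1}^{q} \mu_{ij} \right). $$

The partial derivatives of $L$ with respect to $\mu_{ij}$ and $\Upsilon_i$ can be calculated as

$$ \frac{\partial L}{\partial \mu_{ij}} = m \mu_{ij}^{m-1} D^2(T_j, \Upsilon_i) - \lambda_j, \qquad \frac{\partial L}{\partial \Upsilon_i} = \sum_{j=1}^{M} \mu_{ij}^m \frac{\partial D^2(T_j, \Upsilon_i)}{\partial \Upsilon_i}. $$

Setting these derivatives to zero yields the update equation (24) for the cluster centers $\Upsilon_i$ and the update equation for the memberships,

$$ \mu_{ij} = \left[ \sum_{t=1}^{q} \left( \frac{D^2(T_j, \Upsilon_i)}{D^2(T_j, \Upsilon_t)} \right)^{\frac{1}{m-1}} \right]^{-1}. \tag{25} $$

From (24) and (25), we know that $\Upsilon_i$ and $\mu_{ij}$ depend on each other. Therefore, an iterative approach is adopted to obtain the optimal solution. The specific process is given in Algorithm 2. After the training is accomplished, a classifier $F(\tilde{T})$ for spectrum sensing is obtained, where $\tilde{T}$ is the data point on the manifold that needs to be classified. If $F(\tilde{T}) > \xi$, the PU is judged to be present and the SUs are not allowed to access the spectrum. If $F(\tilde{T}) \le \xi$, the PU is judged to be absent and the SUs can use the licensed spectrum. $\xi$ is a design parameter that is used to control $P_d$ and $P_f$.

Algorithm 2 RDFCM Algorithm
Step 3: Evaluate (24) and update the cluster centers.
Step 4: Evaluate (25) and update the membership matrix.
Step 6: Output $\Upsilon_i$ and $U$.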
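The alternating scheme can be sketched as follows for a small set of covariance matrices. The membership update uses the Riemannian distance in the usual FCM form. The center update is an assumption: a weighted log-Euclidean mean is used here as a simple stand-in and may differ from the update in (24). All function names and the initialization are hypothetical.

```python
import numpy as np

def _sym_fun(S, fun):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    w, V = np.linalg.eigh(S)
    return (V * fun(w)) @ V.T

def riem_dist_sq(A, B):
    """Squared Riemannian distance: sum of log^2 eigenvalues of A^{-1} B."""
    lam = np.linalg.eigvals(np.linalg.solve(A, B)).real
    return float(np.sum(np.log(lam) ** 2))

def rdfcm(T, q=2, m=2.0, iters=50, tol=1e-8):
    """Fuzzy c-means over SPD matrices T (M x L x L) with Riemannian distance.

    Returns (centers, U), where U is the q x M membership matrix.
    """
    T = np.asarray(T, dtype=float)
    M = len(T)
    logs = np.array([_sym_fun(Tj, np.log) for Tj in T])   # matrix logarithms
    idx = np.linspace(0, M - 1, q).astype(int)            # simple deterministic init
    centers = T[idx].copy()
    prev = np.inf
    for _ in range(iters):
        d2 = np.array([[riem_dist_sq(Tj, c) for Tj in T] for c in centers]) + 1e-12
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=0, keepdims=True)          # membership update
        W = U ** m
        loss = float(np.sum(W * d2))                      # objective value
        # Assumed center update: weighted log-Euclidean mean (stand-in).
        mean_logs = np.einsum("ij,jab->iab", W, logs) / W.sum(axis=1)[:, None, None]
        centers = np.array([_sym_fun(Lg, np.exp) for Lg in mean_logs])
        if abs(prev - loss) < tol:
            break
        prev = loss
    return centers, U
```

The log-Euclidean mean keeps every center symmetric positive definite, which is why it is convenient for a sketch; an iterative Riemannian mean would be another option.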

D. COMPLEXITY ANALYSIS
The complexity of the EMD-RDFCM-based CSS method consists of two parts, i.e., the training phase and the sensing phase. In the training phase, the complexity depends on $M$, $q$ and $I_k$, where $M$ is the number of samples in the data set, $q$ is the number of cluster centers, and $I_k$ is the number of iterations. It is worth mentioning that the complexity of the EMD-RDFCM-based CSS approach in the training phase can be written as $O(M)$ as $M$ increases. In the sensing phase, the classifier has already been obtained and can be used directly. Therefore, compared with the training phase, the complexity of the sensing phase is low and can be ignored.

Remark 1:
The differences between the EMD-RDFCM method and existing methods are as follows: 1) Different from existing methods [8], [10], [12], [18], [22], [23], the EMD-RDFCM method requires neither a fixed decision threshold nor a fixed reference. Therefore, it is more adaptive to different scenarios. 2) Different from existing methods [24]- [27], this method is realized on the manifold by using the novel RDFCM clustering algorithm. Moreover, the EMD algorithm is adopted to remove the noise component of the signals received by the SUs, which can be considered a data preprocessing step that yields superior features.

IV. SIMULATION ANALYSIS
In this section, the effectiveness of the EMD-RDFCM-based CSS method is verified.

A. SUs IN SAME SNR
The detection results of the compared methods, including those of [26], [27], are shown in Figures 4-7. The RDFCM method clearly outperforms the traditional ones. Similarly, the EMD-RDFCM method achieves better detection performance than RDFCM, which indicates that the EMD algorithm removes the noise component effectively. The detailed data are shown in Table 2. At SNR = [−10 dB, −10 dB] and $P_f$ = 0.1, the detection probability of the EMD-RDFCM approach is higher than that of the conventional methods. At SNR = [−12 dB, −12 dB] and $P_f$ = 0.1, the detection probability of the EMD-RDFCM approach is increased by 4.15%, 29.17%, 49.68%, and 78.61%, respectively. Figure 12 shows the convergence of the loss value of the EMD-RDFCM approach at SNR = [−12 dB, −12 dB], which indicates that the developed algorithm is stable.

B. SUs IN DIFFERENT SNR
In this section, the effectiveness of the EMD-RDFCM-based CSS method is demonstrated for SUs at different SNRs. As can be seen from Figures 8-11, the EMD-RDFCM approach obtains the best detection results among the compared methods. The detection probabilities of the different methods are provided in Table 3. The detection probabilities of the other methods are 0.901, 0.676, 0.539 and 0.471, respectively. From Sections A and B, we can conclude that the EMD-RDFCM method achieves better sensing performance than the other approaches in both scenarios. Under the condition of SNR = [−12 dB, −12.5 dB], the convergence of the loss value of the EMD-RDFCM method is shown in Figure 13.

C. DIFFERENT NUMBER OF SUs
In this part, the sensing performance of the EMD-RDFCM method is analyzed for different numbers of SUs. The ROC curves and detection probabilities of the different methods are displayed in Figure 14 and Table 4, respectively. It is clear that the detection performance improves as the number of SUs increases.

The results for different numbers of sampling points are shown in Figure 15. The detailed detection probabilities are provided in Table 5. According to the experimental results, the sensing performance improves as the number of sampling points increases.

V. CONCLUSION
In this paper, an EMD-RDFCM approach is developed to address the spectrum sensing problem in CR. To enhance the detection precision, the EMD algorithm is adopted to remove the noise portion of the received signal. A data set containing the covariance matrices of the reconstructed signals is prepared to train a classifier. Then, an RDFCM clustering algorithm is proposed to cluster the samples on the manifold. After the classifier is obtained, it is used to decide the state of the channel. Finally, the simulations verify the effectiveness of the EMD-RDFCM approach under two different conditions, i.e., SUs at the same SNR and SUs at different SNRs. The major contribution of this paper is the RDFCM clustering algorithm, which obtains a classifier on the manifold. Furthermore, the developed EMD-RDFCM approach achieves better sensing performance than traditional methods. In future work, the delay, delivery ratio, fairness index and efficiency of the CR will be considered, and the developed CSS method will be extended to real-world scenarios.
YONGWEI ZHANG received the B.S. degree from the School of Electronic and Information Engineering, Jiaying University, Meizhou, China, in 2016. He is currently pursuing the Ph.D. degree with the School of Automation, Guangdong University of Technology, Guangzhou, China. His current research interests include cognitive radio, spectrum sensing, adaptive dynamic programming, and optimal control.

PIN WAN received the B.S. degree in electronic engineering and the M.S. degree in circuit and system from Southeast University, in 1984 and 1990, respectively, and the Ph.D. degree in control theory and control engineering from the Guangdong University of Technology, in 2011. He is currently a Professor with the School of Automation, Guangdong University of Technology.