Cognitive Waveform Optimization for Phase-Modulation-Based Joint Radar-Communications System

A dual-function radar communication (DFRC) system enables the implementation of a primary radar operation and a secondary communication function concurrently. A bank of transmit beamforming weight vectors is designed to have the same transmitted radiation pattern to satisfy the target detection requirements, while the phase symbol is selected from a preset dictionary so that communication information can be embedded. However, as the radar channel is time-variant due to the fluctuation in the radar cross-section (RCS) of the target and the Doppler shift that results from the relative motion of the target, a successive waveform design and selection scheme is needed to continually obtain target information. Our work aims at enhancing the target detection performance by maximizing the relative entropy (RE) between two hypotheses (under the first hypothesis the target is absent from the echoes, while under the second hypothesis the target is present) and by minimizing the mutual information (MI) between successive target echoes. The proposed scheme addresses the coexistence of communication and radar detection in intelligent transportation systems (ITSs), where it is necessary to extract the features of target information that is obtained from a vehicle-mounted sensor. Our simulation results demonstrate an improvement in the target detection performance by the proposed two-stage approach. In addition, the system can transmit data at several Mbps with low symbol error rates.

The communication receiver detects the embedded information by estimating which waveform was transmitted. However, the target detection performance is degraded due to the waveform variation from pulse to pulse. Communication data embedding schemes were also proposed for dual functionality with multiple transmit/receive frameworks [9], [19]-[21]. A dual-function radar-communication (DFRC) system that employs sidelobe control and waveform diversity has been presented in [22]. The mechanism of the DFRC system is to transmit multiple orthogonal waveforms, where each waveform is used to embed binary data.
Most recently proposed DFRC systems are based on phased arrays and the multiple-input multiple-output (MIMO) technique [23]-[29]. In [19], time-modulated arrays are used to realize dual functionality, enabling target detection in the main lobe while performing wireless communication in the sidelobe. The main strategy of the approach is to use sparse time-modulated array or phase-only synthesis time-modulated array approaches to control the instantaneous pattern. Both methods enable the excitation of variations in the sidelobe levels (SLLs) toward a specified direction [19]. However, since the number of transmission antennas is constant, the former affords only a few degrees of freedom. Therefore, the method cannot realize many distinct SLLs toward a specified direction. The latter offers superior performance in realizing more SLLs; however, its computational burden is heavy because it relies on nonlinear optimization. An amplitude modulation (AM)-based method for embedding information into the radar waveform has also been considered. The main strategy of the method is to embed data into the radar waveform by controlling the SLLs. The scheme can not only regulate the instantaneous pattern sidelobe but also acquire prominent SLLs toward a specified direction. However, the main lobe must be kept constant throughout the coherent processing interval (CPI) to realize high performance in target detection. Therefore, this technique enables embedded data transmission to a receiver that is positioned within the sidelobe region but cannot perform wireless communication within the main beam of the radar.
Inspired by the MIMO radar system, Hassanien and Amin developed a novel DFRC system with multisensor transmit/receive configurations and bilevel sidelobe control [20]. The multiwaveform DFRC system enables the embedding of binary data via each orthogonal waveform and the emission of multiple independent waveforms simultaneously. The number of bits that can be embedded is equal to the number of transmitted orthogonal waveforms. At least one bit is transmitted for each radar pulse. Convex optimization methods can be used to embed communication data and to realize bilevel sidelobe control. The information-embedding scheme that was developed in [20] employs a sidelobe binary amplitude shift keying (ASK)-based technique. Convex optimization approaches have also been used for the design of transmitted beamforming vectors that satisfy the constraints that are imposed by the radar functions while optimizing the transmit radiation pattern. Similar to the AM-based technique, the ASK-based method only transmits embedded information within the sidelobe region. This drawback of both techniques is due to the main beam remaining the same throughout the entire CPI. A phase-modulation (PM)-based scheme for embedding data into the illumination of a radar system was considered in [21]. The scheme maps the binary bits into a phase symbol that belongs to a phase dictionary of a suitable size. It differs substantially from the AM- and ASK-based techniques. The scheme can embed the binary data toward communication receivers, regardless of whether the communication receivers are positioned within the main lobe or the sidelobe. The PM-based method can be both coherent and noncoherent. Therefore, it can be used for both directional communications and broadcasting. The MIMO radar system uses a set of orthogonal waveforms via each element to generate waveform diversity.
Cheng and Liao discuss the problem of spectrally compatible waveform design for MIMO radar in the presence of multiple targets [25]. The waveform is designed by minimizing the waveform energy of the overlaid space-frequency bands under constraints of waveform similarity and individual SINR requirements. The work of [27] discusses the problem of waveform optimization for MIMO radar with a good transmit beampattern under certain practical constraints in coexistence with communication systems. The waveform is designed by minimizing a weighted summation of the beampattern integrated sidelobe-to-mainlobe ratio and waveform energy over the space-frequency bands. To further increase the data transmission rate, [29] proposed a frequency hopping (FH) coding method for designing a set of orthogonal waveforms. During each radar pulse, the number of embedded symbols is equal to the number of orthogonal waveforms times the length of the FH code. These strategies are inherently secure against interference and interception in directions other than a specified direction [30].
However, the radar reflection characteristics of radar targets and the environment can be regarded as time-variant. In both phased array and MIMO techniques, it is not easy to recognize and detect a radar target in a dynamic environment, especially if the radar target and the interference have the same angle but different ranges. This shortcoming can be overcome by using cognitive techniques [30]-[32]. Cognitive radar (CR) continually uses information acquisition mechanisms to facilitate adaptive emission in dynamic radar scenarios. CR forms an adaptive closed feedback loop from the receiver to the transmitter, which has tremendous potential for enhancing the performance in target recognition and detection, as demonstrated in [33]. The updated information regarding the radar target and interference is used to design a transmitted waveform that is based on the mutual information (MI) maximization criterion [34], [35]. This continuous learning scheme develops waveform optimization methods and offers high performance in target recognition and detection, according to [36].
In this paper, we introduce a PM-based approach for embedding information into the illumination of MIMO radar. VOLUME 8, 2020 It enables binary information delivery to communication receivers that are located not only within the sidelobe region but also within the main beam. Since the radar channel is time-variant due to the fluctuation in the radar cross-section (RCS) of the target and the Doppler shift that results from the relative motion of the targets, a successive waveform design and selection scheme is needed to continually obtain target information from the radar environment. Therefore, we analyze the performance of a cognitive DFRC system. We combine the relative entropy (RE) algorithm that is presented in [36] and the MI strategy [34] to realize an optimized waveform, which facilitates enhanced target detection in a dynamic environment. We propose a cognitive waveform design scheme for the DFRC system. The proposed waveform design algorithm can be divided into two steps, as follows:
Step one: The objective of this step is to design an ensemble of optimized waveforms for illumination. RE can be employed as a measure for evaluating the detection performance of radar systems: the larger the value of RE, the better the target detection performance that can be realized [37]. The main objective is to maximize the RE between two hypotheses (under the first hypothesis the target is absent from the echoes, while under the second hypothesis the target is present) under the transmission power constraint and thereby to obtain an ensemble of optimized waveforms.
Step two: The objective of this step is to minimize the MI between the current backscattering echo and the predicted value in the next moment. The selection strategy ensures that we acquire independent target echoes from the same radar scenario to obtain more target features from pulse to pulse. Therefore, we always choose the most suitable waveforms for illumination that would generate more uncorrelated and independent target echoes.
These two steps correspond to the design of the ensemble of waveforms and the selection of a suitable waveform out of the ensemble, respectively. The waveform optimization scheme is based upon adaptive learning from the radar scenario, which is realized through a feedback loop from the receiver to the transmitter. This feedback includes vital information about the target features that is derived from the target echoes. Via this approach, the transmitter adjusts its waveforms to suit the dynamically changing environment. The novel contributions of this work are summarized as follows:
1) We present the architecture of an adaptive PM-based DFRC system, which benefits from the principle of cognitive radar. The system utilizes a continuous learning approach by updating the estimates of the target scene parameters through multiple interactions with the environment.
2) We develop a novel algorithm for waveform optimization and selection in the PM-based DFRC framework, which is based on the RE maximization criterion and the MI minimization criterion.
3) We provide performance analysis of the PM-based DFRC system network in terms of the receiver operating characteristic (ROC), detection probabilities and communication symbol error rates (SERs) between the proposed nodes.
The remainder of this paper is organized as follows: In Section II, a DFRC signal model that employs a PM-based method and sidelobe control is described. In Section III, a noncoherent PM-based communication approach is discussed. In Section IV, we propose a two-stage cognitive waveform optimization strategy. The transmitted waveforms are designed based on the RE maximization criterion and selected based on the MI minimization criterion. The simulation results for the proposed schemes are presented in Section V and the conclusions of this study are presented in Section VI.

A. NOTATIONS
Throughout this paper, the following notations will be used: We use boldface lowercase letters and boldface uppercase letters to denote vectors and matrices, respectively; $(\cdot)^*$ denotes the complex conjugation operation; $(\cdot)^T$ denotes the transpose operation; $(\cdot)^H$ denotes the Hermitian transpose operation; $\mathrm{vec}(\cdot)$ denotes the vectorization of the columns of a matrix; and $\mathbf{I}_R$ denotes the $R \times R$ identity matrix.

II. SIGNAL MODEL
We present a DFRC signal model that uses a PM-based scheme. A case of the advanced model, which is considered in [26], is discussed in this section. We consider a DFRC system architecture that is configured with one joint radar-communication transmission antenna array, one radar receiving antenna array, and additional communication receiving antenna arrays. The joint radar-communication transmission array consists of $T$ transmission antennas, which are arranged as a uniform linear array. The radar receiving array consists of $R$ receiving antennas, which are arranged as an arbitrary linear array.
To simplify the discussion, we assume that the radar transmission and receiving antennas are closely spaced relative to each other. Therefore, targets that are positioned in the far field are at the same spatial angle relative to both antenna arrays. The objective of the joint radar-communication transmission array is to deliver messages to the intended communication receivers as a secondary operation without impacting the primary target detection. The architecture of the joint radar-communication system is illustrated in Figure 1.
We denote the bandwidth of the DFRC signal and the total budget of transmission power as $B$ and $P_t$, respectively. Two orthogonal transmitted waveforms are denoted as $\phi_u(t)$ and $\phi_v(t)$. Both waveforms occupy the same bandwidth; that is, the spectra of $\phi_u(t)$ and $\phi_v(t)$ overlap in the frequency domain. We assume that both orthogonal signals have been normalized to unit transmission power, namely, $\int_{T_P} |\phi_u(t)|^2\,dt = \int_{T_P} |\phi_v(t)|^2\,dt = 1$, where $t$ is the fast-time index and $T_P$ denotes the radar pulse width. Both waveforms are further assumed to satisfy the orthogonality condition, namely, $\int_{T_P} \phi_u(t)\,\phi_v^*(t)\,dt = 0$. The $T \times 1$ vector of baseband signals $\mathbf{s}(t)$ at the input of the joint radar-communication transmission array is expressed as in (1), where $P_u$ and $P_v$ denote the transmission powers, and $\mathbf{u}$ and $\mathbf{v}$ denote the $T \times 1$ designed transmission weight vectors associated with $\phi_u(t)$ and $\phi_v(t)$, respectively. We denote the total transmission power as $P_t = P_u + P_v$. The vector of baseband signals $\mathbf{s}(t)$ need not satisfy the orthogonality condition.
It is assumed that a target is located in a specified range bin within the main beam of the radar in an additive white Gaussian noise (AWGN) environment. The $R \times 1$ vector of baseband signals caused by backscattering from the target at the output of the radar receiving antennas is expressed as in (2), where $i$ denotes the slow-time index (radar pulse number); $\beta(i)$ denotes the target scattering coefficient during the $i$-th radar pulse; $\mathbf{a}(\theta)$ is the $T \times 1$ steering vector of the transmission array and $\mathbf{b}(\theta)$ is the $R \times 1$ steering vector of the receiving array; and $\mathbf{n}(t)$ denotes the vector of AWGN with zero mean and covariance $\delta^2 \mathbf{I}_R$. It is worth noting that the target scattering coefficient $\beta(i)$ and the target's direction of arrival $\theta$ in the steering vectors are assumed to be known in our work. They remain constant during the entire radar scan duration but vary independently from scan to scan. In the Swerling III model, the RCS samples measured by the radar are correlated throughout an entire scan but are uncorrelated from scan to scan (slow fluctuation), and the target scene is dominated by a single powerful scattering center with many weak reflectors in its vicinity. This model is considered in this paper. For the wireless communication task of the DFRC system, we assume that the communication receiver has complete knowledge of the waveform ensemble that is employed at the transmitter. The baseband signal at the output of the communication receiver, $y_{com}(t;i)$, is expressed as in (3), where $\alpha_{com}(i)$ denotes the attenuation coefficient, which summarizes the propagation channel between the transmission antenna array of the DFRC system and the communication receiver during the $i$-th radar pulse, and $n(t;i)$ is the AWGN with zero mean and variance $\delta_c^2$.
Substituting (1) into (3) and matched-filtering the pulse echo $y_{com}(t;i)$ to the orthogonal waveform $\phi_u(t)$ yields the communication data $y_u(i)$ given by (4), where $A_u$ is the magnitude of the transmitting gain toward the predefined communication receiver, $\phi_u$ is the phase of the transmitting gain that is related to $\phi_u(t)$, and $n_u(i)$ denotes the AWGN at the output of the matched filter with zero mean and variance $\delta_c^2$. In the same way, matched-filtering the pulse echo $y_{com}(t;i)$ to the orthogonal waveform $\phi_v(t)$ yields the communication data $y_v(i)$ given by (5), where $A_v$ is the magnitude of the transmitting gain toward the predefined communication receiver, $\phi_v$ is the phase of the transmitting gain that is related to $\phi_v(t)$, and $n_v(i)$ denotes the AWGN with zero mean and variance $\delta_c^2$. Figure 2 illustrates the architecture of a cognitive PM-based DFRC system.
Next, two PM-based information-embedding approaches are introduced, which are employed to realize coherent and noncoherent communications, respectively. For a coherent wireless communication process, a PM-based information-embedding scheme can be realized by choosing the phase $\phi_u$ or $\phi_v$ from a predetermined phase dictionary during each radar pulse. The communication receiver can recover the embedded binary sequence by estimating the embedded phases. This wireless communication process requires phase synchronization between the DFRC transmitter and the communication receiver [38].
If the two transmission weight vectors $\mathbf{u}$ and $\mathbf{v}$ are designed to satisfy the required design condition, $y_u(i)$ and $y_v(i)$ have the rotational invariance property, according to which $y_u(i)$ is equal to $y_v(i)$ up to a phase rotation $\phi$, as expressed in (6). The binary sequence can be embedded into the emission of each orthogonal waveform by controlling the value of the phase rotation at the transmission antenna array, namely, by choosing the phase rotation from a predetermined phase dictionary. Notice that the two orthogonal waveforms are emitted simultaneously and propagate over the same channel. As a result, any phase synchronization error, for example one caused by propagation distortion, yields the same phase error in both $y_u(i)$ and $y_v(i)$. This results in a common phase term in the numerator and denominator of (6), which has no influence on the phase rotation. The phase symbol is embedded as a phase rotation between the two orthogonal waveforms. Since measuring the phase associated with the first waveform relative to the phase associated with the second waveform cancels out any common phase term, the common initial phase at the transmit array and the common phase error terms have no influence on the estimation of the phase rotation at the receiver. The phase rotation is therefore preserved, and the phase symbol can be recovered by estimating the phase rotation at the communication receiver. Consequently, employing a phase decoder at the communication receiver does not necessarily require phase synchronization.
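The cancellation of a common phase term can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes noise-free matched-filter outputs with equal gain magnitude, and the function name `phase_rotation` as well as all numerical values are hypothetical.

```python
import numpy as np

def phase_rotation(y_u: complex, y_v: complex) -> float:
    """Phase of y_u relative to y_v in radians, wrapped to (-pi, pi]."""
    return float(np.angle(y_u * np.conj(y_v)))

# Hypothetical noise-free matched-filter outputs: equal gain magnitude,
# with y_u leading y_v by the embedded phase rotation.
embedded = np.pi / 4
gain = 0.8
y_v = gain * np.exp(1j * 0.3)
y_u = y_v * np.exp(1j * embedded)

# A synchronization error common to both waveforms multiplies both outputs
# by the same unit-modulus factor.
common_error = np.exp(1j * 1.234)
est_clean = phase_rotation(y_u, y_v)
est_offset = phase_rotation(common_error * y_u, common_error * y_v)
print(est_clean, est_offset)
```

Both estimates agree with the embedded rotation because the common factor appears in both outputs and drops out of their relative phase.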

III. NON-COHERENT PM COMMUNICATIONS
Phase synchronization errors between the DFRC transmitter and the communication receiver and imprecise channel coefficient estimation will lead to degradation of the communication system's performance and failure to satisfy the design requirements. To overcome this problem, we introduce a noncoherent PM-based scheme for embedding communication data into the illumination of a radar system [10]. The communication data are hidden in the phase difference between two radar transmitted waveforms.
An $N$-bit communication data item that is embedded into a radar pulse is represented as the binary sequence $b_n$, $n = 1, \ldots, N$, which can be mapped into a dictionary of $K = 2^N$ phase-rotation symbols $\varphi = \{\varphi_1, \ldots, \varphi_K\}$, where $\varphi_k$ denotes the $k$-th phase-rotation symbol. Hence, to deliver $N$-bit communication data in the illumination of the DFRC system, the corresponding phase-rotation symbol should be embedded in the radar pulse.
A $T \times 1$ weight vector $\mathbf{z}$ is employed to produce a population of $2^{T-1}$ weight vectors of the same dimensionality, which is denoted as $Z = \{\mathbf{z}_1, \ldots, \mathbf{z}_{2^{T-1}}\}$. A dictionary is formed by employing $K$ pairs of weight vectors $(\mathbf{u}_1, \mathbf{v}_1), \ldots, (\mathbf{u}_K, \mathbf{v}_K)$ drawn from the population $Z$. The phase difference that is related to the $k$-th pair of weight vectors $(\mathbf{u}_k, \mathbf{v}_k)$ is expressed as in (7). The $k$-th pair of vectors $(\mathbf{u}_k, \mathbf{v}_k)$ yields the phase difference $\phi_k$, which corresponds to the phase symbol $\varphi_k$. We assume that the transmission power is divided equally between the radar signals that are associated with the two waveforms. The $T \times 1$ vector of the baseband signals can then be reformulated as in (8). Assuming that the $k$-th symbol is embedded into one radar pulse, matched-filtering the received signal to the two orthogonal waveforms $\phi_u(t)$ and $\phi_v(t)$ yields (9) and (10). Therefore, the phase difference at the communication receiver can be estimated as in (11). The embedded data in the received signal are determined by comparing the phase $\hat{\phi}(i)$ in (11) to the predetermined dictionary that is obtained from (7) and by mapping the matched phase-rotation symbol back to the communication data. The noncoherent PM-based communication scheme is angle-dependent. The performance in recovering the embedded data at communication receivers in directions other than the planned direction is limited, particularly for a large-scale dictionary.
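The mapping from bits to phase-rotation symbols and the dictionary-based decoding described above can be sketched as follows. This is an illustrative example under stated assumptions, not the dictionary design of the paper: the dictionary is assumed to be uniformly spaced on $[0, 2\pi)$, the bit-to-index mapping is plain binary (MSB first), and all function names and channel values are hypothetical.

```python
import numpy as np

def phase_dictionary(n_bits: int) -> np.ndarray:
    """K = 2**n_bits phase-rotation symbols, assumed uniformly spaced on [0, 2*pi)."""
    k = 2 ** n_bits
    return 2 * np.pi * np.arange(k) / k

def bits_to_index(bits) -> int:
    """Interpret a bit sequence (MSB first) as a dictionary index."""
    idx = 0
    for b in bits:
        idx = (idx << 1) | int(b)
    return idx

def index_to_bits(idx: int, n_bits: int) -> list:
    """Inverse of bits_to_index."""
    return [(idx >> s) & 1 for s in range(n_bits - 1, -1, -1)]

def decode(y_u: complex, y_v: complex, dictionary: np.ndarray) -> int:
    """Estimate the phase difference and return the index of the nearest symbol."""
    phi_hat = np.angle(y_u * np.conj(y_v))
    # Compare on the unit circle so that wrap-around at 2*pi is handled.
    dist = np.abs(np.exp(1j * phi_hat) - np.exp(1j * dictionary))
    return int(np.argmin(dist))

rng = np.random.default_rng(0)
n_bits = 2
dictionary = phase_dictionary(n_bits)
bits = [1, 0]
k = bits_to_index(bits)

# Hypothetical received pair: unknown common channel phase plus small noise.
channel = 0.5 * np.exp(1j * 2.1)
noise = 0.01 * (rng.standard_normal(2) + 1j * rng.standard_normal(2))
y_v = channel + noise[0]
y_u = channel * np.exp(1j * dictionary[k]) + noise[1]
decoded_bits = index_to_bits(decode(y_u, y_v, dictionary), n_bits)
print(decoded_bits)
```

The receiver never needs the channel phase itself, only the relative phase between the two matched-filter outputs, which is what makes the scheme noncoherent.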

IV. (TWO-STEP) WAVEFORM OPTIMIZATION
In this section, we develop a novel waveform optimization strategy for a cognitive DFRC system to further improve the performance of target detection in a dynamic environment. The waveform ensemble design and optimization selection procedures are described as a two-step algorithm.

A. RE MAXIMIZATION
Step 1: Waveform optimization. This step covers the design of the transmitted signals for the PM-based DFRC transmission antenna array. We aim at maximizing the RE between the distributions with no target and with a target at time $i$. The larger the RE, the better the target detection performance that can be realized. The method ensures optimal waveform matching with the target and noise. The strategy of RE maximization for cognitive waveform design is derived from Stein's lemma [40], which is stated as follows:
Theorem 1: Consider a binary hypothesis testing problem between alternatives $H_0$ and $H_1$, with distributions $p_0$ and $p_1$ under $H_0$ and $H_1$, respectively. The RE between distributions $p_0$ and $p_1$ can be expressed as
$$D(p_0 \,\|\, p_1) = \int p_0(x) \log \frac{p_0(x)}{p_1(x)}\, dx. \tag{12}$$
Let $A_n$ and $A_n^c$ be the acceptance regions for $H_0$ and $H_1$, respectively. The error probabilities of the two types are $\alpha_n = p_0^n(A_n^c)$ and $\beta_n = p_1^n(A_n)$, respectively. Define $\beta_n^{\varepsilon} = \min_{\alpha_n < \varepsilon} \beta_n$, $0 < \varepsilon < \tfrac{1}{2}$. Then
$$\lim_{n \to \infty} \frac{1}{n} \log \beta_n^{\varepsilon} = -D(p_0 \,\|\, p_1). \tag{13}$$
Target detection in radar signal processing can be expressed as the binary hypothesis testing problem given by (14), where $\mathbf{a}(\theta_i)$ and $\mathbf{b}(\theta_i)$ denote $\mathbf{a}(\theta)$ and $\mathbf{b}(\theta)$ during the $i$-th radar pulse. The RE is represented as in (15), where $p_0(\mathbf{x}_i)$ and $p_1(\mathbf{x}_i)$ are the probability density functions (PDFs) of $\mathbf{x}_i$ under alternatives $H_0$ and $H_1$, respectively. It is worth noting that the received backscatter signals without pulse compression are used to optimize the transmitted waveform, as presented in Fig. 2. According to the binary hypothesis testing model, $\alpha_n$ is the false-alarm probability and $\beta_n$ is the missed-detection probability. According to Stein's lemma, if $\alpha_n$ is fixed, $\beta_n$ decays exponentially, with an exponential rate that is equal to $D(p_0(\mathbf{x}_i) \,\|\, p_1(\mathbf{x}_i))$. Therefore, to further improve the performance in target detection, we should maximize $D(p_0(\mathbf{x}_i) \,\|\, p_1(\mathbf{x}_i))$.
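To illustrate why a larger RE corresponds to an easier detection problem, the RE can be evaluated in closed form for a simple Gaussian case. The sketch below assumes, purely for illustration, that the echo is zero-mean circular complex Gaussian with covariance `r_n` under $H_0$ and `r_s + r_n` under $H_1$; the function name and the rank-one signal direction `h` are hypothetical and are not taken from the paper.

```python
import numpy as np

def relative_entropy(r_n: np.ndarray, r_s: np.ndarray) -> float:
    """D(p0 || p1) in nats for p0 = CN(0, r_n) and p1 = CN(0, r_s + r_n)."""
    k = r_n.shape[0]
    r_1 = r_s + r_n
    # tr(r_1^{-1} r_n), computed without forming the inverse explicitly.
    trace_term = np.trace(np.linalg.solve(r_1, r_n)).real
    _, logdet_1 = np.linalg.slogdet(r_1)
    _, logdet_0 = np.linalg.slogdet(r_n)
    return float(trace_term - k + logdet_1 - logdet_0)

k = 4
r_n = np.eye(k)                      # white noise with unit variance
h = np.ones((k, 1)) / np.sqrt(k)     # hypothetical rank-one signal direction
re_values = [relative_entropy(r_n, snr * (h @ h.T)) for snr in (0.0, 1.0, 10.0)]
print(re_values)
```

In this rank-one case the RE reduces to $\log(1+\lambda) + 1/(1+\lambda) - 1$, where $\lambda$ is the signal-to-noise eigenvalue, so it is zero when no signal is present and grows with the signal power.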
Under a transmission power constraint, the PM-based waveform design problem that is based on RE maximization is modeled as in (16). The objective function of this waveform design problem can be expressed as in (17). Using (15), the PM-based waveform design problem can be re-expressed as in (18). Matrices $\mathbf{R}_s(\theta_i)$ and $\mathbf{R}_N$ are both positive semidefinite Hermitian matrices [36], and $\mathrm{rank}\,\mathbf{R}_s(\theta_i) = 1$. Denote the eigen-decompositions of $\beta_i^2 \mathbf{R}_s(\theta_i)$ and $\mathbf{R}_N$ as $\mathbf{U}_i \boldsymbol{\Lambda}_{s_i} \mathbf{U}_i^H$ and $\mathbf{V}_N \boldsymbol{\Lambda}_N \mathbf{V}_N^H$, respectively, where $\boldsymbol{\Lambda}_{s_i} = \beta_i^2\, \mathrm{diag}(\delta_{s_i,1}, \delta_{s_i,2}, \ldots, \delta_{s_i,K})$ and $\boldsymbol{\Lambda}_N = \mathrm{diag}(\delta_{N,1}, \delta_{N,2}, \ldots, \delta_{N,K})$. It is then easy to obtain the expression in (19). According to [36], $\log\det(\mathbf{I} + \mathbf{R}) + \mathrm{tr}[(\mathbf{I} + \mathbf{R})^{-1}]$ is a monotonically increasing function of the positive semidefinite matrix $\mathbf{R}$. Based on maximizing MI, the Pareto-optimal solution $\mathbf{s}_i^{opt}$ of the above waveform design problem (19) can be expressed as in (20). We can then construct an optimal waveform ensemble $C_{s_i}$ based upon the Walsh-Hadamard codes. Each waveform corresponds to a particular column vector of the Walsh-Hadamard matrix. In other words, we start our waveform design with the orthogonal sequences from the Hadamard matrix and modulate the power of the waveform on an individual pulse level using the maximization criterion presented in (20). After the optimized waveform ensemble $C_{s_i}$ has been obtained, the most suitable waveforms for illumination are selected from the ensemble in Step 2.
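The construction of an ensemble from Walsh-Hadamard codes with per-waveform power scaling can be sketched as follows. This is only an illustration of the idea: the power values below are hypothetical placeholders and are not computed from the criterion in (20), and the helper names are not from the paper.

```python
import numpy as np

def hadamard(n: int) -> np.ndarray:
    """Sylvester construction of the n x n Walsh-Hadamard matrix (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = np.array([[1]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h

def scaled_ensemble(n: int, powers) -> np.ndarray:
    """Columns are Hadamard codes scaled to the given per-waveform powers."""
    h = hadamard(n) / np.sqrt(n)             # unit-energy columns
    return h * np.sqrt(np.asarray(powers))   # scale column j by sqrt(powers[j])

powers = [4.0, 1.0, 1.0, 2.0]   # hypothetical power allocation, not taken from (20)
ensemble = scaled_ensemble(4, powers)
gram = ensemble.T @ ensemble
print(gram)
```

The Gram matrix is diagonal with the allocated powers on its diagonal, which shows that the power scaling leaves the orthogonality of the codes intact.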

B. TARGET IMPULSE RESPONSE AND PARAMETER ESTIMATION
The radar receiver has complete knowledge of the transmitted waveform at all instants of time. Hence, we can use this information to extract parameters such as the target impulse response, the target channel covariance matrix $\mathbf{R}_H(\theta)$, and the noise variance $R_n$, where $\mathbf{R}_H(\theta) = \beta^2 \mathbf{a}^T(\theta)\, \mathbf{b}(\theta)\, \mathbf{b}^T(\theta)\, \mathbf{a}(\theta)$. Let $\mathbf{x}_i$ and $\mathbf{x}_{i-1}$ be the received signal vectors at two successive time instants. Using (2), the variances of the received signals at the respective time instants, $R_i^2$ and $R_{i-1}^2$, can each be expressed in terms of $\mathbf{R}_H(\theta)$ and $R_n$. Solving these two equations simultaneously, we can estimate the values of $\mathbf{R}_H(\theta)$ and $R_n$. These values are used to generate the estimate of $\mathbf{x}_{i+1}$ for all values of $\mathbf{s}_{i+1} \in C$ using (2), where $C$ is the ensemble of the transmitted waveforms. We then choose $\mathbf{s}_{i+1} \in C$ based on the proposed MI minimization approach.
The estimation of the target channel covariance matrix and the noise variance is performed at every reception of $\mathbf{x}_i$, and their values are updated and used to generate new estimates for $\mathbf{x}_{i+1}$.
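The idea of solving two variance equations for two unknowns can be illustrated with a deliberately simplified scalar model. The sketch below assumes that each received variance has the form `gain * r_h + r_n`, where `gain` is a known quantity determined by the transmitted waveform; this scalar form, the function names, and the numbers are assumptions made for illustration and do not reproduce the matrix-valued equations of the paper.

```python
import numpy as np

def estimate_channel_and_noise(var_prev, var_curr, gain_prev, gain_curr):
    """Solve  var = gain * r_h + r_n  for (r_h, r_n) from two successive pulses."""
    a = np.array([[gain_prev, 1.0], [gain_curr, 1.0]])
    b = np.array([var_prev, var_curr])
    r_h, r_n = np.linalg.solve(a, b)   # requires gain_prev != gain_curr
    return float(r_h), float(r_n)

def predict_variance(gain_next, r_h, r_n):
    """Predicted variance of the next echo for a candidate waveform gain."""
    return gain_next * r_h + r_n

# Hypothetical ground truth used to synthesize the two measured variances.
true_r_h, true_r_n = 2.0, 0.5
gains = (1.0, 3.0)
variances = tuple(g * true_r_h + true_r_n for g in gains)

r_h, r_n = estimate_channel_and_noise(variances[0], variances[1], gains[0], gains[1])
predicted = predict_variance(2.0, r_h, r_n)
print(r_h, r_n, predicted)
```

Note that the two pulses must use different waveform gains; otherwise the two equations are linearly dependent and the channel and noise contributions cannot be separated.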

C. MI MINIMIZATION
Step 2: Waveform selection. The main objective of this step is to minimize the MI between the current pulse echo $\mathbf{x}_i$ and the next pulse echo $\mathbf{x}_{i+1}$. Making successive pulse echoes as independent of each other in time as possible allows additional information about the radar scenario to be obtained at each radar pulse.
The MI between $\mathbf{x}_i$ and $\mathbf{x}_{i+1}$ is denoted as $I(\mathbf{x}_i, \mathbf{x}_{i+1})$. If $\mathbf{x}_i$ and $\mathbf{x}_{i+1}$ are strongly dependent on each other, $I(\mathbf{x}_i, \mathbf{x}_{i+1})$ is large, and the next echo adds little new information about the dynamic target scenario. Therefore, in the waveform selection procedure, we seek independent pulse echo samples scattered by the dynamic target scenario in order to obtain additional target information from pulse to pulse. We choose the waveforms for illumination that generate more independent successive pulse echoes from the same environment; that is, our objective is to choose the most suitable waveform $\mathbf{s}_{i+1}$ from the waveform ensemble that was designed in Step 1 so as to minimize $I(\mathbf{x}_i, \mathbf{x}_{i+1})$.
We denote the number of samples of the radar pulse echo as $M$, where $M > T$ and $M > R$. The vector of the radar pulse echo at time $i$ is denoted as $\mathbf{x}_i$. The MI between $\mathbf{x}_i$ and $\mathbf{x}_{i+1}$ can be defined as in (21), where $H(\mathbf{x}_i, \mathbf{x}_{i+1} \mid \mathbf{s}_i, \mathbf{s}_{i+1})$ is the entropy of the pair $(\mathbf{x}_i, \mathbf{x}_{i+1})$ given the pair $(\mathbf{s}_i, \mathbf{s}_{i+1})$. The $m$-th samples of the current pulse echo $\mathbf{x}_i$ and the next pulse echo $\mathbf{x}_{i+1}$ can be described as $x = \{x_i(m), x_{i+1}(m)\}$. The PDF of $x$ is given by (22). According to [34], $H(x \mid s)$ can be expressed as in (23). Hence, $H(\mathbf{x}_i, \mathbf{x}_{i+1} \mid \mathbf{s}_i, \mathbf{s}_{i+1})$ can be derived as in (24), where the covariance matrix of the pair is expressed as in (25), in which $R_i^2$ and $R_{i+1}^2$ denote the variances of the current radar pulse echo $\mathbf{x}_i$ and the next radar pulse echo $\mathbf{x}_{i+1}$. Since $\rho_{i,i+1}$ is the correlation coefficient, we can define the term $H(\mathbf{x}_i \mid \mathbf{s}_i)$ as in (26). Similarly, we can deduce the term $H(\mathbf{x}_{i+1} \mid \mathbf{s}_{i+1})$ as in (27). Then, by substituting (24), (26), and (27) into (21), the MI between $\mathbf{x}_i$ and $\mathbf{x}_{i+1}$ can be represented as in (28). Finally, the waveform selection process that is based on MI minimization can be formulated as in (29).
The current pulse echo $\mathbf{x}_i$ and the past pulse echo $\mathbf{x}_{i-1}$ are utilized to calculate the variances of the successive radar pulse echoes, namely, $R_i^2$ and $R_{i-1}^2$, and to estimate the corresponding correlation coefficient $\rho_{i,i-1}$. Then, the variances are used to estimate the next pulse echo $\mathbf{x}_{i+1}$ over all possible transmitted waveforms $\mathbf{s}_{i+1} \in C_{s_i}$, where $C_{s_i}$ denotes the ensemble of all possible transmitted waveforms. All the values of the corresponding $\rho_{i,i+1}$ can then be estimated. Therefore, the waveform selection problem that is based on MI minimization (29) can be solved by selecting a suitable $\mathbf{s}_{i+1} \in C_{s_i}$. It is worth noting that the RCS scintillation of the target varies slowly. The pulse waveforms are designed based on the backscattered echoes with slowly fluctuating RCS and angles. These codes can be used for transmission in the next pulse.
Therefore, the designed waveforms are the best possible waveforms for transmission. The proposed two-step cognitive waveform optimization process is summarized as Algorithm 1.
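The selection rule can be illustrated with the standard closed form for two jointly circular complex Gaussian samples, for which the MI depends only on the correlation coefficient, $I = -\log(1 - |\rho|^2)$. The sketch below uses this textbook expression as a stand-in and does not reproduce (28) or (29); the candidate names and predicted correlation values are hypothetical.

```python
import numpy as np

def gaussian_mi(rho: complex) -> float:
    """MI in nats between two jointly circular complex Gaussian samples."""
    return float(-np.log1p(-abs(rho) ** 2))

def select_waveform(predicted_rho: dict) -> str:
    """Return the candidate whose predicted echo is least correlated with the current one."""
    return min(predicted_rho, key=lambda name: gaussian_mi(predicted_rho[name]))

# Hypothetical predicted correlation coefficients rho_{i,i+1}, one per candidate waveform.
predicted_rho = {"s_a": 0.9, "s_b": 0.2 + 0.1j, "s_c": -0.6}
best = select_waveform(predicted_rho)
print(best, {name: gaussian_mi(r) for name, r in predicted_rho.items()})
```

Because the MI is monotonically increasing in $|\rho|$, minimizing the MI over the ensemble is equivalent, under this Gaussian assumption, to picking the candidate with the smallest predicted correlation magnitude.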

Algorithm 1 The Proposed Two-Step Cognitive Waveform Design Algorithm
1: Initialization: Set the initial transmit weight vector pair $(\mathbf{u}_1, \mathbf{v}_1)$.
2: Generate the current radar pulse echo $\mathbf{x}_1$ and calculate the variance of the received signal $R_1^2$.
3: Solve for the optimized waveform ensemble $C_{s_1}$ for illumination based on the RE maximization criterion, as stated in Step 1.
4: Update the variance of the received signal $R_2^2$ based on the current value $\mathbf{x}_2$. Generate estimates of all the values of the corresponding $\rho_{1,2}$.
5: Solve for the optimized waveform $\mathbf{s}_2 \in C_{s_1}$ on the basis of the MI minimization criterion, as stated in Step 2. Generate an estimate of the transmit weight vector pair $(\mathbf{u}_2, \mathbf{v}_2)$.
6: If $i = I_{max}$, the iterative procedure ends; otherwise, return to 2 and repeat.
Let us observe that, from a practical point of view, the proposed optimization procedure requires a condition to stop the iterations. There are several ways to impose it; for instance considering the maximum number of tolerable iterations I max . The two-step cognitive waveform design algorithm for target detection can be realized according to the block diagram in Figure 3.

D. COMPUTATIONAL COMPLEXITY ANALYSIS
The computational complexity of Algorithm 1 depends on the number of iterations $I_{max}$ as well as on the complexity of each iteration. Precisely, the overall complexity is linear with respect to $I_{max}$, while each iteration includes the computation of the RE maximization criterion (Step 1) and the implementation of the MI minimization criterion (Step 2). In Step 1, the calculation of $\mathbf{R}_s$ requires $O((TR)^3 + (TRM)^2)$ operations [42], and the calculation of $\mathbf{R}_N$ requires $O((RM)^3)$ operations.
In Step 2, the update of $\rho_{i,i+1}$ requires $O(M^3)$ operations, and solving the associated second-order cone program (SOCP) requires $O(M^{3.5} \log(1/\eta))$ operations [42], where $\eta$ is a prescribed accuracy.

V. SIMULATION
In this section, numerical results based on Monte Carlo simulations are provided to validate the effectiveness of the proposed method. Without loss of generality, each entry of the channel matrices follows the standard complex Gaussian distribution. The simulation parameters are based on a radar application with a high PRF, such as X-band radar. A data rate in the range of dozens of Mbps can be achieved. We provide a comparison between the proposed scheme and the method of [36]. To implement the method in [36], we consider a dual-function MIMO system operating in the X-band with carrier frequency $f_c = 8.2$ GHz and bandwidth $B = 500$ MHz. The sampling frequency is $f_s = 10^9$ samples/s, which is taken as the Nyquist rate. The PRI is $T_0 = 10\,\mu$s. We assume an arbitrary linear transmit array consisting of $M = 16$ elements. We further assume that the minimum transmit/receive antenna spacing is sufficiently larger than half a wavelength (distributed MIMO configuration). Hence, the correlation introduced by finite antenna element spacing is low enough that the fades associated with two different antenna elements can be considered independent. To implement the radar function, we further assume that the FH step is $\Delta f = 10$ MHz, the length of the FH code is $Q = 20$, and the FH interval duration is $\Delta t = 0.1\,\mu$s. We generate a set of 16 FH pulse waveforms. The parameter $J = 50$ is used. Therefore, the $16 \times 20 = 320$ FH code entries are generated randomly from the set $\{1, 2, \ldots, J\}$.
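The FH code generation and the consistency of the quoted parameters can be checked with a short script. This is a minimal sketch: the random seed is arbitrary, and reading $J \Delta f$ as the occupied bandwidth and $Q \Delta t$ as the pulse width are assumptions made here for the arithmetic, not statements taken from the paper.

```python
import numpy as np

n_waveforms, q, j = 16, 20, 50      # values quoted in the text
delta_f, delta_t = 10e6, 0.1e-6     # FH step (Hz) and FH interval duration (s)

rng = np.random.default_rng(2020)   # arbitrary seed, only for repeatability
# One row per waveform, one column per FH interval; entries drawn from {1, ..., J}.
fh_code = rng.integers(1, j + 1, size=(n_waveforms, q))

n_entries = fh_code.size            # 16 * 20 code entries
occupied_bandwidth = j * delta_f    # assumed: J hops of width delta_f span the band
pulse_width = q * delta_t           # assumed: Q hop intervals make up one pulse
print(n_entries, occupied_bandwidth, pulse_width)
```

Under these assumptions the hop set spans 500 MHz, which matches the stated bandwidth $B$, and the sampling frequency of $10^9$ samples/s equals $2B$.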

A. TARGET DETECTION PERFORMANCE
The target detection improvement achieved by the two-stage scheme is evaluated in this section. We employ orthogonal sequences of the PM-based pulse over the transmit antenna elements. The backscattered signals are received, and the transmitted signals are then modified by the waveform optimization module, as illustrated in Fig. 2. The optimized transmission sequence at a particular transmit antenna is thereby obtained after the proposed two-stage optimization process. At each iteration of the optimization algorithm, the scattering coefficients of the target and non-target scatterers vary according to the Swerling III model, so the amplitudes of the backscattered signals vary at each instance. However, the amplitudes of the echoes from the target are always assumed to be stronger than those from the clutter sources. For a time-variant radar scenario, 1000 simulations are run for each value of the received SNR. The convex optimization problems are solved with the CVX toolbox [39]. The simulation parameters are listed in Table 1.
Figure 4 presents the detection performance achieved with the RE maximization criterion versus the SNR for various numbers of iterations. The iteration process is run twenty times, and all optimized waveforms are produced via the RE maximization strategy. As shown in Figure 4, for a fixed number of iterations the required SNR increases with the probability of target detection, whereas for a specified probability of detection the required SNR decreases as the number of iterations increases. The detection performance of the RE maximization strategy converges after fifteen iterations and yields a detection probability of 0.9 at SNR = 3 dB, compared with SNR = 13 dB at the beginning of the iteration process. The detection performance of the RE maximization strategy thus improves as the number of iterations increases.
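The Swerling III fluctuation mentioned above can be sketched as follows. In the standard Swerling III model the RCS follows a chi-square distribution with four degrees of freedom, which is equivalent to a Gamma distribution with shape 2 and scale equal to half the mean RCS. The function names, the mean RCS values, and the seed are assumptions for illustration; the sketch does not enforce the assumption that target echoes are always stronger than clutter echoes.

```python
import random

def swerling3_rcs(mean_rcs, rng):
    """One Swerling III RCS draw: Gamma(shape=2, scale=mean_rcs/2), whose mean is mean_rcs."""
    return rng.gammavariate(2.0, mean_rcs / 2.0)

def draw_rcs_sequence(mean_rcs, num_draws, seed=0):
    """Independent RCS realizations, one per optimization iteration (an assumed usage)."""
    rng = random.Random(seed)
    return [swerling3_rcs(mean_rcs, rng) for _ in range(num_draws)]

# Illustrative check with an assumed unit mean RCS.
rcs_samples = draw_rcs_sequence(mean_rcs=1.0, num_draws=20000)
empirical_mean = sum(rcs_samples) / len(rcs_samples)
```

The empirical mean of the draws should approach the chosen mean RCS as the number of draws grows.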
However, the performance improvement of the RE maximization strategy is not substantial after twenty iterations.
Figure 5 plots the probability of target detection achieved with the MI minimization criterion versus the SNR. At high SNR, the variances of successive radar pulse echoes, $R_i^2$ and $R_{i-1}^2$, and the corresponding correlation coefficient $\rho_{i,i-1}$ can be estimated accurately. Consequently, a suitable transmission waveform is selected for target detection at the next time instant, and the value of MI decreases as the number of iterations increases. At low SNR, however, the estimate of the correlation coefficient $\rho_{i,i-1}$ is imprecise, and the value of MI does not decrease substantially after twenty iterations. Therefore, the performance improvement achieved by the MI minimization approach alone is not substantial, and this approach on its own has little potential for yielding substantial performance gains in dynamic radar scenarios.
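The link between the estimated correlation coefficient and the MI can be illustrated with a minimal sketch. It assumes two real, jointly Gaussian scalar echoes, for which the textbook result is $I = -\tfrac{1}{2}\ln(1-\rho^2)$; the MI expression actually used by the algorithm is defined earlier in the paper and may differ. All function names and the candidate labels are hypothetical.

```python
import math

def sample_correlation(x, y):
    """Sample (Pearson) correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def gaussian_mi(rho):
    """MI in nats between two real jointly Gaussian scalars; requires |rho| < 1."""
    return -0.5 * math.log(1.0 - rho ** 2)

def select_min_mi(candidate_rhos):
    """Return the label of the candidate whose echo has the smallest MI with the previous echo."""
    return min(candidate_rhos, key=lambda label: gaussian_mi(candidate_rhos[label]))

# Hypothetical candidates mapped to their estimated correlation coefficients.
best = select_min_mi({"waveform_a": 0.8, "waveform_b": -0.2, "waveform_c": 0.5})
```

Because the MI depends on the correlation coefficient only through its square, an imprecise estimate at low SNR translates directly into an unreliable waveform selection.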
In Figure 6, we compare the target detection performance of the optimized waveform produced by the two-stage scheme with that of the waveforms produced by the RE maximization method and by the MI minimization method. The iteration process is run twenty times. Since the two-stage scheme applies the RE maximization criterion and the MI minimization criterion from pulse to pulse, the DFRC system constantly adapts its transmitted waveform to the dynamic radar environment. The resulting waveform is superior to the waveform generated via the MI minimization strategy alone. In addition, the RE maximization method by itself cannot obtain independent radar pulses scattered by the target, and therefore cannot acquire additional knowledge about the radar scenario at each instant of reception. As shown in Figure 6, the detection performances of the RE maximization method and the MI minimization method are both inferior to that of the proposed two-stage scheme (joint RE maximization and MI minimization criterion), which performs best among the three.
Figure 7 plots the ROC for four schemes with the SNR fixed at 10 dB: (1) a 4 × 4 MIMO radar system that is based on the maximum a posteriori (MAP) criterion; (2) a 4 × 4 MIMO DFRC system that is based on the RE maximization criterion, as defined in [36]; (3) a 4 × 4 MIMO DFRC system that is based on the MI minimization criterion, as presented in [34]; and (4) a 4 × 4 MIMO DFRC system that uses the proposed two-stage scheme.
The curves for the RE maximization strategy, the MI minimization strategy, and the proposed two-stage strategy are plotted for twenty-five iterations. For $P_{fa} = 0.01$, the detection probability achieved by the proposed two-stage method is 0.9, compared with 0.7 for the RE maximization strategy, 0.65 for the MI minimization strategy, and 0.5 for the MAP method. Because the proposed two-stage method exploits the temporal correlation of target information during each pulse interval, the DFRC system continually adapts its transmitted waveform to the fluctuating target RCS. Moreover, the sequential radar pulse echoes can be regarded as independent of each other, which ensures that information about the radar scenario is learned at each instant of reception. As a result, the two-stage method (joint RE maximization and MI minimization criterion) achieves the best detection performance.
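One common way to read a single ROC point, such as the detection probability at $P_{fa} = 0.01$, from Monte Carlo data is sketched below. It assumes that a scalar detection statistic is available for each trial under both hypotheses; the function name and the toy statistics are hypothetical, and the detector used in the simulations is not reproduced here.

```python
def detection_probability(h0_stats, h1_stats, pfa):
    """Empirical Pd at a threshold chosen from target-absent statistics to meet pfa.

    h0_stats: detection statistics from target-absent trials.
    h1_stats: detection statistics from target-present trials.
    """
    ordered = sorted(h0_stats, reverse=True)
    # Number of target-absent trials allowed to exceed the threshold.
    k = min(int(pfa * len(ordered)), len(ordered) - 1)
    threshold = ordered[k]  # at most k target-absent statistics lie strictly above it
    pd = sum(s > threshold for s in h1_stats) / len(h1_stats)
    return threshold, pd

# Toy statistics, for illustration only.
toy_h0 = list(range(100))
toy_h1 = [85, 90, 95, 100]
toy_threshold, toy_pd = detection_probability(toy_h0, toy_h1, pfa=0.1)
```

Repeating this for a sweep of false alarm probabilities traces out an empirical ROC curve. The number of target-absent trials should be well above the reciprocal of the smallest false alarm probability of interest for the threshold to be meaningful.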
We compare the detection capability of the proposed algorithm with that of [36] (referred to as the "RE Approach") and [41] (referred to as the "Minorization Maximization Approach"). Fig. 8 shows the detection probabilities of the waveforms optimized by the algorithms in [36] and [41] and by the proposed algorithm versus the total transmitted power. The probability of false alarm is kept constant at $P_{fa} = 10^{-4}$, and the total transmitted power varies from 0 dB to 25 dB. The detection probability of the orthogonal waveform is plotted as a benchmark.
FIGURE 8. Detection performance of the optimal waveforms by the proposed approach, the methods of [36] and [41], and the orthogonal waveform.
Inspection of the figure shows that our method outperforms the algorithms in [36] and [41], with performance gains of up to 2.4 dB and 2 dB, respectively. This is probably because, unlike its counterparts, the proposed method includes a waveform selection step, which ensures that the acquired target echoes are more statistically independent of each other in time, with the intention of gaining more knowledge about the target features at each instant of reception. In this way, the system adjusts its probing signals to suit the dynamically changing environment (the fluctuating target RCS). The static waveforms, on the other hand, are unable to match the time-varying target response despite multiple iterations. Fig. 9 assesses the performance of the devised constant-modulus waveforms. The curves show that the waveforms devised by the proposed algorithm outperform their counterparts.
FIGURE 9. Detection performance of the optimal constant-modulus waveforms by the proposed approach, the methods of [36] and [41], and the orthogonal waveform.

B. COMMUNICATION PERFORMANCE
A 4 × 4 MIMO DFRC system is considered, operating in the X-band with a carrier frequency of 7.5 GHz and a bandwidth of 500 MHz. The sampling frequency is taken as the Nyquist rate ($5 \times 10^8$ samples/s). The PRF is 100 kHz. The joint radar-communication transmit array consists of four antennas spaced half a wavelength apart. We produce a random waveform and an optimized waveform provided by the proposed two-stage scheme described in Section IV. Figure 10 illustrates the throughput of the proposed optimized waveform versus distance for the BPSK, QPSK, 16-PSK, and 256-PSK constellations. The 256-PSK waveform provides a data rate of approximately 9 Mbps at a distance of 15 m, which is higher than that of the BPSK, QPSK, and 16-PSK constellations. The 256-PSK waveform achieves the highest data rate within a distance of 60 m, and the data rate decreases as the distance between the system nodes increases.
In Figure 11, we compare the symbol error rate (SER) performance of the optimized waveforms offered by the proposed two-stage scheme with that of random waveforms provided by the non-coherent PM-based information embedding scheme, using the BPSK, QPSK, 16-PSK, and 256-PSK constellations. The data rates of these four types of signals are 1.2, 2.4, 4.8, and 9.6 Mbps, respectively. To investigate the SER performance, $14 \times 10^7$ random PM symbols are generated. Figure 11 illustrates the SER performance versus the SNR for each of the four constellations.
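The quoted data rates scale with the number of bits carried by each phase symbol, as the following sketch shows. The figure of 12 embedded symbols per pulse is an assumption inferred from the quoted rates and the 100 kHz PRF; it is not stated in the text, and the function names are hypothetical.

```python
import math

PRF_HZ = 100e3           # pulse repetition frequency from the simulation setup
SYMBOLS_PER_PULSE = 12   # assumption: inferred from 1.2 Mbps / (100 kHz * 1 bit)

def bits_per_symbol(constellation_size):
    """Bits carried by one PSK symbol (constellation size must be a power of two)."""
    return int(math.log2(constellation_size))

def data_rate_bps(constellation_size, symbols_per_pulse=SYMBOLS_PER_PULSE, prf_hz=PRF_HZ):
    """Raw data rate: pulses per second x symbols per pulse x bits per symbol."""
    return prf_hz * symbols_per_pulse * bits_per_symbol(constellation_size)

# BPSK, QPSK, 16-PSK, 256-PSK
rates_mbps = [data_rate_bps(k) / 1e6 for k in (2, 4, 16, 256)]
```

Under this assumption the four rates come out as 1.2, 2.4, 4.8, and 9.6 Mbps, in the 1:2:4:8 ratio of the bits per symbol.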
The curves show that the SER performance of the BPSK random waveform is better by about 5 dB, 16 dB, and 33 dB than that of the QPSK, 16-PSK, and 256-PSK random waveforms, respectively. Meanwhile, as can be seen from Figure 11, for BPSK, QPSK, and 16-PSK the SER performance of the optimized waveforms offered by the proposed two-stage scheme is as good as that of the random waveforms. For 256-PSK, however, the SER performance of the optimized waveforms is relatively poor: as the constellation size increases, the cross-correlation levels between the optimized waveforms offered by the proposed two-stage scheme become higher. As a result, selecting a suitable constellation size involves a tradeoff between communication SER performance and data rate.

VI. CONCLUSION
In this paper, we developed a two-step waveform optimization approach for a cognitive DFRC system that combines the waveform design and selection processes. The proposed scheme is based on continuous learning of the radar scenario at the receivers and reallocation of transmission power to match the time-varying radar target and surroundings. The cognitive process is designed to maximize information extraction from the radar environment. The simulation results demonstrated that the proposed waveform optimization scheme improves target detection performance in DFRC systems without impairing the secondary communication function. The improved DFRC system could form a joint platform, which is crucial to both environmental perception and the establishment of data links. Nevertheless, the radar target in this paper is assumed to be a point scatterer amidst several clutter sources, which limits the applicability of the model to intelligent transportation systems. Extending the proposed model and algorithm to an extended target spanning multiple range cells is a possible topic for future research.