Towards Dual-functional Radar-Communication Systems: Optimal Waveform Design

We focus on a dual-functional multi-input-multi-output (MIMO) radar-communication (RadCom) system, where a single transmitter communicates with downlink cellular users and detects radar targets simultaneously. Several design criteria are considered for minimizing the downlink multi-user interference. First, we consider both the omnidirectional and directional beampattern design problems, where the closed-form globally optimal solutions are obtained. Based on these waveforms, we further consider a weighted optimization to enable a flexible trade-off between radar and communications performance and introduce a low-complexity algorithm. The computational costs of the above three designs are shown to be similar to the conventional zero-forcing (ZF) precoding. Moreover, to address the more practical constant modulus waveform design problem, we propose a branch-and-bound algorithm that obtains a globally optimal solution and derive its worst-case complexity as a function of the maximum iteration number. Finally, we assess the effectiveness of the proposed waveform design approaches by numerical results.


I. INTRODUCTION
I T has been reported that by 2020, the number of connected devices will jump to more than 20 million, which brings forward impending needs for extra frequency spectrum resources. Realizing the scarcity of the spectrum, network providers and policy regulators are exploring the feasibility to use the spectrum that is currently occupied by other applications [1]- [4], such as airborne radars and navigation systems close to the 3.4GHz band [5] and shipborne and Vessel Traffic Service (VTS) radar at 5.6GHz [6], which may be shared with LTE and Wi-Fi systems in the near future. As an emerging research topic, the communications-radar spectrum sharing (CRSS) not only presents the advantage for enabling the efficient usage of the spectrum, but also provides a new way for designing novel systems that can benefit from the cooperation of radar and communications.
As a straightforward way to achieve the spectral coexistence for communication and radar, opportunistic spectrum sharing [7] provides a naive approach, where the communication system transmits when the space and frequency spectra are not occupied by radar. Nevertheless, it does not allow the two systems to work simultaneously. In view of this, the pioneering work [8] proposes a null-space projection (NSP) method, which has been widely applied to different spectral coexistence scenarios between MIMO radar and communication systems [9], [10]. In such schemes, a radar beamformer is designed to project the signals onto the null-space of the interference channel between the radar and base station (BS)/user equipment (UE), such that the interference from the radar to the communication link is zero. This, however, results in performance loss for the radar, since the beamforming is no longer optimal for target detection and estimation. Trade-offs between the performance of radar and communications can be achieved by relaxing the zero-forcing precoder to impose controllable interference levels on the communication systems [11], which offers a more realistic coexistence.
More recent contributions have exploited optimization techniques to realize CRSS. In [12], the radar beamformer and communication covariance matrix are jointly designed to maximize the Signal-to-Interference-plus-Noise-Ratio (SINR) of the radar subject to capacity and power constraints at the communication's side. Similar work has been done for the coexistence between the MIMO-matrix completion (MIMO-MC) radar and point-to-point (P2P) MIMO communications [13], [14], where the radar sub-sampling matrix is further introduced as an optimization variable. To address the more practical coexistence issue between MIMO radar and multi-user MIMO (MU-MIMO) communications, recent work in [15] considers the robust beamforming designs with imperfect channel state information (CSI) at the communication's side, where the detection probability of the radar is maximized subject to SINR constraints of the downlink users and the power budget of the BS. As a further development of the technique, a novel beamforming design has been proposed in [16] that exploits the interference as a useful power source, which demonstrates orders-of-magnitude power-savings. While the above coexistence approaches are well-designed, a critical shortfall is that radar and communication devices are required to exchange side-information for achieving a beneficial cooperation, such as the CSI, radar probing waveforms and communication modulation formats. Typically, these exchanges are realized by an all-in-one control center that is connected to both systems via either a wireless link or a backhaul channel [14], which conducts the coordination of the cooperation. In practical scenarios, however, such a control center brings forward considerable extra complexity to the system, and is therefore difficult to implement.
In contrast to the above coexistence schemes, a more favorable approach for CRSS is to design a novel dual-functional system that carries out both radar and communications, where the above problem does not exist. Note that such methods are distinctly different from the classic cognitive radio based techniques, as they require the use of specific radar constraints and designs. Recent information theoretical work has shown great potential [17], [18], but it remains to be seen what benefits can be implemented in practice. As an enabling solution, dual-functional waveform design can support target detection while carrying information at the same time. Such possibilities have been explored for single-antenna systems, where several integrated waveforms that combine the radar and communication signals have been proposed [19]- [21]. Nevertheless, all of these schemes lead to performance loss for either radar or communication, e.g., high peak-average-powerratio (PAPR) and limited dynamic range [20]. As a step further, recent works consider dual-functional waveform design for MIMO systems. In [22], a transmit beampattern for MIMO radar is designed to embed the information bits in sidelobe levels. Related works consider waveform shuffling across the antennas or Phase Shift Keying (PSK) by different beamformer weighting factors as the communication modulation schemes [23], [24]. It should be noted that in the above approaches, one communication symbol is represented by one or several radar pulses, which leads to a low date rate in the order of the radar pulse repetition frequency (PRF). To support multiuser transmission for the cellular downlink, the previous work [25] develops a series of beamforming approaches for dualfunctional RadCom systems, which will not affect the original modulation scheme and the data rate of the communication system. Nevertheless, the beamforming approaches only focus on the average power constraints, and do not address the design of the constant modulus signals.
As an important requirement for both radar and communication applications, the utilization of constant modulus waveforms can avoid signal distortion when the low-cost nonlinear power amplifiers are used [26], which leads to an energy-efficient transmission. Such topics have been widely studied for massive MIMO communication scenarios [27]- [30] as well as the MIMO radar waveform designs [31]- [34], where optimization problems with non-convex constant modulus constraint (CMC) are formulated. Due to the NPhardness of these problems, only suboptimal solutions can be obtained via either convex relaxation methods or local algorithms, such as Semidefinite Relaxation (SDR) [31], [32] and Riemannian manifold methods [28], [30]. Recent MIMO radar work proposes to approach the constant modulus solution by a successive Quadratic Constrained Quadractic Programming (QCQP) Refinement (SQR) procedure [33]. Nevertheless, this technique still only guarantees the local optimality of the obtained solution. To the best of our knowledge, the efficient global algorithm for constant modulus waveform design is widely unexplored in the existing literature.
In this paper, we propose several optimization-based waveform designs for the dual-functional RadCom systems. It is worth highlighting that all the proposed methods yield probably globally optimal waveforms, which can be used for both target detection and downlink communications. Throughout the paper, we aim to minimize the downlink multi-user interference (MUI) under radar-specific constraints. First, we consider an orthogonal waveform design, which is often used for the initial omnidirectional probing by MIMO radar. Based on this waveform, we extend to the design of a directional radar beampattern that points to the targets of interest. The aforementioned two optimization problems are non-convex, but the optimal solutions can be readily obtained in closedforms. Still, the obtained performance for the communication system is limited. To allow a trade-off between radar and communication performance, we consider a weighted optimization under non-convex power budget constraint, and obtain its global optimum via a well-designed low-complexity algorithm. Given that both radar and communication systems require constant modulus signals for power-efficient transmission, we finally consider a more practical optimization by enforcing constant modulus constraints and similarity constraints (SC) on the waveform design. In contrast to the existing approaches in both radar and communication works that obtain the local minimizers of problems with CMC [27]- [34], we propose a branch-and-bound method that can efficiently yield a globally optimal solution for the problem. Our numerical results show that the proposed branch-and-bound algorithm considerably outperforms the conventional SQR method [33]. For clarity, we summarize our contributions as follows: • We propose dual-functional waveform design approaches for both omnidirectional and directional radar beampatterns, and derive the closed-form solutions. • We propose a weighted optimization that achieves a flexible trade-off between the radar and communication performance, and solve the problem with a low-complexity global algorithm. • We consider the waveform design with CMC and SC constraints, and develop a branch-and-bound algorithm to obtain the globally optimal solutions, which outperforms the conventional SQR algorithm. • We derive the computational complexity for the proposed algorithms analytically.
The remainder of this paper is organized as follows, Section II introduces the system model, Section III proposes the closed-form waveform optimizations for radar beampattern design, Section IV considers the trade-off design between radar and communications, Section V solves the problem with CMC and SC constraints, Section VI provides numerical results, and finally Section VII concludes the paper.
Notations: Unless otherwise specified, matrices are denoted by bold uppercase letters (i.e., H), vectors are represented by bold lowercase letters (i.e., x), and scalars are denoted by normal font (i.e., ρ). Subscripts indicate the location of the entry in the matrices or vectors (i.e., s i,j and l n are the (i, j)-th and the n-th element in S and l, respectively). tr (·), (·) T , (·) H and (·) * stand for trace, transpose, Hermitian transpose and complex conjugate, respectively. Re (·) and Im (·) denote the real and imaginary part of the argument, · , · ∞ and · F denote the l 2 norm, l ∞ and the Frobenius norm respectively.

II. SYSTEM MODEL
We consider a dual-functional MIMO RadCom system, which simultaneously transmits radar probing waveforms to the targets and communication symbols to the downlink users. The joint system is equipped with a uniform linear array (ULA) with N antennas, serving K single-antenna users while detecting radar targets at the same time.

A. Communication Model
The received symbol matrix at the downlink users can be given as where Following [25], we rely on the following assumptions: 1) The transmitted signal matrix X is used as dual-functional waveform for both radar and communication operations. In this case, each communication symbol is also a snapshot of a radar pulse; 2) The downlink channel H is flat Rayleigh fading, and remains unchanged during one communication frame/radar pulse; 3) The channel H is assumed to be perfectly estimated by pilot symbols.
Given the desired constellation symbol matrix S ∈ C K×L for the downlink users, the received signals can be rewritten as For each user, the entry of S is assumed to be drawn from the same constellation. The second term in (2) represents the MUI signals. The total MUI energy can be measured as It has been proven in [27] that the MUI energy in (3) directly links to the achievable sum-rate of the downlink users. For the i-th user, the SINR per frame is given as [27] where s i,j is the (i, j)-th entry of S, E denotes the ensemble average with respect to the time index. It follows that the achievable sum-rate of the users can be given as For a given constellation with fixed energy, the power of the useful signal E |s i,j | 2 is also fixed. Hence, the sum-rate can be maximized by minimizing the MUI energy.

B. Radar Model
It is widely known that by employing uncorrelated waveforms, MIMO radar achieves higher Degrees of Freedom (DoFs) than the traditional phased-array radar [35], [36]. The existing literature indicates that the design of such a beampattern is equivalent to designing the covariance matrix of the probing signals, where convex optimization can be employed. We refer readers to [36]- [38] for more details on this topic. Here we focus on designing the dual-functional waveform matrix X, which has the following spatial covariance matrix To ensure that R X is positive-definite, we assume L ≥ N without loss of generality. Further, the transmit beampattern for the RadCom system can be given as where θ denotes the detection angle, a (θ) = 1, e j2π∆ sin(θ) , ..., e j2π(N −1)∆ sin(θ) T ∈ C N ×1 is the steering vector of the transmit antenna array with ∆ being the spacing between adjacent antennas normalized by the wavelength.
In the following, we formulate optimization problems that minimize P MUI under MIMO radar-specific constraints.

III. CLOSED-FORM WAVEFORM DESIGN FOR GIVEN RADAR BEAMPATTERNS
In this section, we first consider the omnidirectional beampattern design, which is usually used in MIMO radar for initial probing. After that, we consider a directional beampattern design that points to the directions of interest.

A. Omnidirectional Beampattern Design
For an omnidirectional beampattern, the transmitted waveform matrix X has to be orthogonal, i.e., the corresponding covariance matrix must be the identity matrix. To minimize the MUI energy, the optimization problem is formulated as where P T is the total transmit power, I N denotes the N × N identity matrix. Problem (8) is obviously non-convex due to the equality constraint, which indicates that X is a point on the Stiefel manifold. Fortunately, it has been proven that (8) can be classified as the so-called Orthogonal Procrustes problem (OPP), which has a simple closed-form global solution based on the Singular Value Decomposition (SVD), and is given as where and V ∈ C L×L being the unitary matrices, I N ×L is an N × L rectangular matrix composed by an N ×N identity matrix and an N × (L − N ) zero matrix.

B. Directional Beampattern Design
Given a covariance matrix R d that corresponds to a welldesigned MIMO radar beampattern, the MUI minimization problem is given as where R d is the desired Hermitian positive semidefinite covariance matrix, and tr (R d ) = P T . We consider its Cholesky decomposition, which is where F ∈ C N ×N is a lower triangular matrix. Without loss of generality, we assume R d is positive-definite to ensure that F is invertible. Hence, the constraint in (10) can be equivalently written as which is again an OPP problem, and its globally optimal solution is given byX whereŨΣṼ It follows that the solution of the original problem (10) is given as

C. Complexity Analysis
The omnidirectional beampattern design includes two matrix multiplications and one SVD, which needs a total of O N KL + 2N L 2 complex floating-point-operations (flops), where one complex flop is defined as one complex addition or multiplication. The directional beampattern design, which needs one Cholesky decomposition, four matrix multiplications and one SVD, has the total complexity of O 2N L 2 + N 2 L + N KL + N 3 + N 2 K . For the conventional communication-only zero-forcing (ZF) precoding, which involves one pseudoinverse for H, and one matrix multiplication between the precoder and the transmitted symbol matrix, the complexity is O N KL + N 2 K . It is worth noting that the computational costs of the proposed closedform approaches share the same order of magnitude with that of the zero-forcing precoder.

IV. TRADE-OFF BETWEEN RADAR AND COMMUNICATION PERFORMANCE
It should be highlighted that both problem (8) and (10) enforce a strict equality constraint, in which case the radar performance is guaranteed to be optimal while the communication counterpart may suffer from serious performance loss. This is particularly pronounced in the cases that the covariance matrices of the communication channel are ill conditioned, where the resulting MUI minimum is still high. We therefore consider a trade-off design by allowing a tolerable mismatch between the designed and the desired radar beampatterns.

A. Problem Formulation
Let us first denote the optimal solution obtained from (8) or (10) as X 0 . Given X 0 , the trade-off problem can be then formulated as where 0 ≤ ρ ≤ 1 is a weighting factor that determines the weights for radar and communication performance in the dual-functional system. For coherence between (16) and the previous two problems, we enforce an equality constraint for the power budget, as the radar is often required to transmit at its maximum available power in practice. Note that the two Frobenius norms in the objective function can be combined in the form (16) can be written compactly as 5 which is a non-convex QCQP, and can be readily transformed into a Semidefinite Programming (SDP) using SDR technique.
Since it has only one quadratic constraint, according to [40], [41], the SDR is tight, i.e., the solution of the SDR is rank-1, which yields the globally optimal solution of (18). Nevertheless, due to the large number of variables in the problem, SDR is not computationally efficient in general. Hence, we propose a low-complexity algorithm that achieves the global optimum in the following.

B. Low-complexity Algorithm
We further expand the objective function of (18) as (18) can be rewritten as Since Q is a Hermitian matrix, problem (20) can be viewed as the matrix version of the trust-region subproblem (TRS), for which the strong duality holds [42], i.e., the duality gap is zero. Let us formulate the Lagrangian multiplier as where λ is the dual variable associated with the equality constraint. Let X opt and λ opt be the primal and dual optimal points with zero duality gap, the optimality conditions for the above TRS can be given as [43] ∇L (X opt , λ opt ) = 2 (Q + λ opt I N ) X opt − 2G = 0, (22a) where (22b) and (22c) guarantee the primal and the dual feasibility respectively. It follows from (22a) that where (·) † denotes the Moore-Penrose pseudoinverse of the matrix. Based on (22b) and (22c) we have where Q = VΛV H is the eigenvalue decomposition of Q with V and Λ being the orthogonal and diagonal matrices that contain the eigenvectors and eigenvalues respectively, and λ min is the minimum eigenvalue of Q. One can further show that there exists an unique solution for the equations (24). Let us define where λ i is the i-th eigenvalue of Q. It can be seen that P (λ) is strictly decreasing and convex on the interval λ ≥ −λ min , which suggests that λ opt can be obtained by simple line search methods, e.g., Golden-section search [44]. Thanks to the eigenvalue decomposition, in each iteration we only need to calculate the inversion of a diagonal matrix. Once the optimal λ is obtained, the optimal solution to (16) can be computed by (23). For clarity, we summarize the above approach in Algorithm 1.

C. Complexity Analysis
We end this section by analyzing the complexity of Algorithm 1. The Golden-section search method is known to have linear convergence rate, which finds an ε 0 -solution within O (log (1/ε 0 )) iterations. In each iteration we calculate the value of a 1-dimensional function, which suggests that the complexity of the Golden-section search can be omitted in general. Hence the complexity for Algorithm 1 is domainated by the matrix multiplications, the pseudoinverse and the eigenvalue decomposition. Both of the latter two operations involve the computational costs of O N 3 complex flops, and the matrix multiplications involve the complexity of O N 2 L + N KL + N 3 + N 2 K . Therefore, the total complexity for Algorithm 1 is O N 2 L + N KL + 3N 3 + N 2 K , which again shares the same order of magnitude with the communication-only ZF precoding. For the sake of clarity, we summarize the computational costs of the proposed three waveform design approaches in TABLE I.

V. CONSTANT MODULUS WAVEFORM DESIGN
In the previous sections, the dual-functional RadCom waveform is designed under total power constraints, which is not guaranteed to generate constant modulus signals. In this section, we consider the RadCom waveform design that minimizes the communication MUI energy given the CMC.

A. Problem Formulation
Following the same notations in the previous section, our optimization problem can be formulated as where X 0 ∈ C N ×L is a known benchmark radar signal matrix that has constant-modulus entries, e.g., chirp signals, vec (·) denotes the vectorization of a matrix, and x i,j is the (i, j)-th entry of X. The constraint (26b) is called similarity constraint (SC) in the radar literature [32], which controls the difference between the designed waveform and the benchmark with η being the tolerable difference.
It is trivial to see that the objective function of (26) is separable, since Hence, it can be further simplified using the normalized vector variable, which is where ε = η N PT , x ∈ C N ×1 , x 0 ∈ C N ×1 are the columns of X and X 0 normalized by PT N , s ∈ C K×1 is the column of S, and x (n) denotes the n-th entry of x. Since problem (26) can be solved by solving the problem (28) for each column of X concurrently, we will focus on (28) in the following discussion. For notational convenience, we omit the column index.
Note that 0 ≤ ε ≤ 2 since both x and x 0 have unit modulus. According to [32], the similarity constraint can be rewritten as arg x (n) ∈ [l n , u n ] , ∀n, where l n = arg x 0 (n) − arccos 1 − ε 2 2 , For each x (n), the feasible region is an arc on the unit circle as shown in Fig. 2, which makes the problem non-convex, and NP-hard in general. In the following, we consider a global optimization algorithm for solving (28), which is based on the general framework of the branch-andbound (BnB) methodology [45].

B. The Branch-and-bound Framework
A typical BnB algorithm requires to partition the feasible region into several subregions, where we formulate corresponding subproblems. For each subproblem, we obtain a sequence of asymptotic lower-bounds and upper-bounds by well-designed bounding functions. In each iteration, we update the bounds and the set of the subproblems following the BnB rules until convergence, i.e., the difference between the upperbound and lower-bound goes to zero.
It is well-known that the worst-case complexity for the BnB algorithm is of the exponential order with respect to N , i.e., to search all the branches of the subproblems exhaustively [45]. Nevertheless, by carefully choosing the tightest bounds, it is possible to efficiently identify and prune the unqualified branches, which accelerates the algorithm significantly.
Let us denote the feasible region, i.e., the arc shown in Fig.  2, as θ n = arc (l n , u n ). Problem (31) can be compactly written as P (Θ 0 ) : min where Θ 0 = θ 1 ×θ 2 ×...×θ N , and f (x) is defined in (31). By the above notations, a subproblem can be denoted as P (Θ), where Θ ⊆ Θ 0 is the corresponding subregion. We then find a lower-bound of P (Θ) by a lower-bounding function where x l is a relaxed solution that achieves the bound. In order to compute the upper-bound, we find a feasible solution x u for P (Θ). The upper-bounding function is thus given by The above bounding functions (33) and (34) will be specified in the next subsection. Here we only use f L and f U to introduce the BnB framework for notational convenience. In the BnB algorithm, we store all the subproblems in a problem set S, which will be updated together with the global bounds in each iteration by the following rules [45] 1) Branching: Pick a problem P (Θ) ∈ S that yields the smallest lower-bound. Equally divide Θ into two subregions following some subdivision rules detailed in the following, and generate two subproblems. Then delete P (Θ) in the problem set. 2) Pruning (optional): Evaluate the qualification of the two subproblems, if their lower-bounds are less than the current global upper-bound, add them into S. 3) Bounding: Choose the smallest lower-bound and upperbound from S as the bounds for the next iteration. Note that the pruning step is only for saving the memory of storing S, and will not affect the effectiveness of the BnB procedure. This is because by choosing the smallest bounds in S we can always avoid the unqualified branches. For clarity, we summarize our BnB algorithm in Algorithm 2. To ensure that Algorithm 2 converges in a finite number of iterations, the chosen subproblem for branching, the subdivision rule and the bounding functions f L and f U should satisfy the following conditions [45] 1) The branching is bounding-improving, i.e., in each iteration we choose the problem that yields the smallest lower-bound as the branching node. 2) The subdivision is exhaustive, i.e., the maximum length of the subregions converges to zero as the iteration number goes to infinity.
3) The bounding is consistent with branching, i.e., U B − f opt converges to zero as the maximum length of the subregions goes to zero, where f opt is the optimal value of the original problem. Our Algorithm 2 satisfies condition 1) automatically. We then choose the subdivision rules to obtain the subproblems from the branching node. For a given node P (Θ), we consider the following two rules: In (35) x u and x l are the solutions associated with f U (Θ) and f L (Θ), respectively. According to [45, Theorem 6.3 and 6.4], both the above two rules satisfy condition 2). In practical simulations, we observe that ARS has a faster convergence rate than BRS.

C. Upper-bound and Lower-bound Acquisition
It remains to develop approaches to acquire the lower and upper bounds, which are key to accelerating the BnB procedure. Following the approach in [46], we compute the lower-bound by the convex relaxation of (32). As shown in Fig. 2, the convex hull for each entry x (n), denoted as Q (θ n ), is a circular segment, and can be given as By simple analytic geometry, the angle constraint can be equivalently written as which is nothing but a linear constraint. It follows that the constraint for the vector variable is where u = [u 1 , u 2 , ..., u N ] T ∈ R N ×1 , l = [l 1 , l 2 , ..., l N ] T ∈ R N ×1 , and • denotes the Hadamard product. Hence, the convex relaxation can be given as the following QCQP problem QP-LB : min Problem (40) can be efficiently solved via numerical solvers, e.g., the CVX toolbox. By doing so, we can readily obtain the lower-bound for each subproblem. A natural way to compute the upper-bound is to project each entry of the obtained solution x l of (40) on the corresponding arc to get a feasible solution. Such a projector can be given in an element-wise manner as follows where we omit the subscripts for convenience.
The upper-bound obtained by the projector (41) is still loose in general. To get a tighter bound, one can use PR 1 (x l ) as the initial point, and solve the following non-convex QCQP QP-UB : min which can be locally solved via the fmincon solver in MAT-LAB. Since the solver employs descent methods, the obtained local minimizer is guaranteed to yield a smaller value than f (PR 1 (x l )).
To further accelerate the speed for solving QP-LB and obtaining the bounds, we consider accelerated gradient projection (GP) methods [47] in addition to the QCQP solvers. Given x n ∈ C, the projector PR 2 projects x n to the nearest point in the corresponding convex hull Q (θ n ). The details for deriving PR 2 are provided in the Appendix. Here we briefly introduce our iterative scheme as where we start from x (0) and x (1) = x (0) . For the least-squares objective function, we choose the stepsize as s = 1/λ max , whereλ max is the maximum eigenvalue ofH HH , i.e., the Lipschitz constant. Note that the above iteration scheme can only be used for convex feasible regions due to the interpolation operation (43). For the non-convex QP-UB problem (42), we use x (k) instead of the interpolated point v, and replace the projector PR 2 with PR 1 , which projects the point onto the arc, i.e., the feasible region. Similar to (40), we use PR 1 (x l ) as the initial point.
Based on [48], the complexity for using interior-point method to solve the QCQP problems is O N 3 per iteration. For both gradient-based methods, the costs are O (N K) in each iteration, which are far more efficient in terms of a fixed iteration number.

D. Convergence Analysis and Worst-case Complexity
We end this section by analyzing the convergence behavior and the worst-case complexity for the proposed Algorithm 2. The convergence proof is to show that our bounding functions f L and f U satisfy the condition 3). Recall the definitions of φ n and d n in (35) and (36). By denoting φ max = max {φ n } , d max = max {d n }, we have the following Lemma 1.

Lemma 1.
As φ max or d max goes to zero, the difference between U B and LB uniformly converges to zero, i.e., Proof. Let us first denote the points that generate U B and LB as x u and x l , i.e., U B = f (x u ) , LB = f (x l ). Following the Lagrange Mean-value Theorem we have where The upper-bound of the gradient is given as where the second line of (48) is based on the triangle inequality, the third line is based on the definition of the matrix l 2 norm.
For the convex hull of each arc (l n , u n ), the longest line segment is the chord shown in Fig. 2 (φ n ≤ π) or the diameter (φ n ≥ π). By simple geometric relations we have (49) For φ n ≤ π, ∀n, it follows that By using (46), (48) and (50) we obtain Given any δ > 0, let Theorem 1. Algorithm 2 converges in a finite number of iterations to a value arbitrary close to f opt .
Proof. Algorithm 2 satisfies both conditions 1) and 2). Furthermore, according to the definition of U B and LB we have According to Lemma 1, the bounding is consistent with branching for the proposed two subdivision rules. Therefore, Algorithm 2 satisfies condition 3), which completes the proof.
The following Theorem 2 specifies the worst-case complexity of Algorithm 2 for using BRS.
Theorem 2. When the BRS is used, Algorithm 2 converges to a δ-optimal solution for at most iterations, where η 1 is given by (53).
Proof. Define vol (Θ 0 ) = 2 arccos 1 − ε 2 /2 N as the volume of the initialized feasible region. Assume that Algorithm 2 terminates at the T-th iteration. According to [49], we have It follows that which completes the proof.
Although the worst-case costs of both BnB-ARS and BnB-BRS are at the exponential order with N , our simulations show that in most cases, the algorithm terminates at a small iteration number thanks to the tight bounds.

VI. NUMERICAL RESULTS
In this section, we present numerical results to validate the effectiveness of the proposed waveform design approaches. For convenience, we set P T = 1, and each entry of the channel matrix H subject to standard Complex Gaussian distribution, i.e., h i,j ∼ CN (0, 1). In all the simulations, we set N = 16 and employ a ULA with half-wavelength spacing between the adjacent antennas. The constellation chosen for the communication users is the unit-power QPSK alphabet, i.e., the power of each entry in the symbol matrix S is 1.

A. Dual-functional Waveform Design with Given Radar Beampatterns
We first show the communication sum-rate obtained by different approaches as well as the associated radar beampatterns in Fig. 3 and Fig. 4, respectively, where we define SNR = P T /N 0 , and use 'Omni', 'Directional' and 'ZF' to represent omnidirectional beampattern design, directional beampattern design and zero-forcing precoding based on the problems (8) and (10). Further, we denote the waveform design with strict equality constraints (8) and (10) and the trade-off design (16) as 'Strict' and 'Tradeoff' respectively. The length of the communication frame/radar pulse is set as L = 20. For directional beampattern design, we consider three targets of interest with angles of −π/3, 0 and π/3, and exploit the classic Least-Squares techniques [38] to obtain the desired covariance matrix R d as defined in (10). It can be observed in Fig.3 that, the proposed two strict waveform designs outperform the communication-only zero-forcing precoding significantly, despite that their computational costs remain at the same level as we have discussed in Section III. The resultant radar beampatterns are shown in Fig. 4 for 'Strict', which are exactly the same with the desired beampatterns. Moreover, by introducing a small weighting factor ρ = 0.1 to the communication side, the sum-rates for trade-off designs increase significantly by approaching to that of the zero MUI case, i.e., the AWGN capacity. Meanwhile in Fig. 4, the corresponding radar beampatterns only experience slight performance-loss. In Fig. 5 and 6, we aim to explicitly show the tradeoffs between the communication and radar performance. For omnidirectional beampattern design, the detection probability P D is used as the metric, where we consider the constant false-alarm rate (CFAR) detection for a point-like target in the far field, located at the angle of π/5. The receive SNR is fixed at -6dB. The false-alarm probability for radar is P F A = 10 −7 . We calculate the detection probability based on [9, eq. (69)]. It can be seen that there exists a trade-off between the communication rate and the radar detection performance. For a fixed P D , the achievable rate increases with the decrease of the number of users, which suggests that the MUI energy can be further minimized by increasing the DoFs. The same trends appear in Fig. 6 for the directional beampattern, where we employ the mean squared error (MSE) between the desired and obtained directional beampatterns as the radar metric. Both figures prove that our approach can achieve a favorable tradeoff between radar and communications.

B. Dual-functional Constant Modulus Waveform Design
We show the results for solving the waveform optimization problem with CMC and SC in Fig. 7-9. Following the simulation configurations in [33], we employ the orthogonal chirp waveform matrix as the reference signal. The convergence behavior of the proposed BnB algorithm for solving (28) is shown in Fig. 7, with N = 16, K = 4, ε = 1, where we compare the performance of the two different subdivision rules, i.e., ARS and BRS. Both methods converge in a finite number of iterations with a nearly constant upper-bound, which suggests that we can reach the optimal value of problem (28) by iteratively using the local algorithms for several times, e.g., QCQP solver or the proposed gradient projection method. Nevertheless, due to the non-convexity of the problem, we need BnB algorithm to confirm that this is indeed a global optimum. It can be also observed that the BnB-ARS has a faster convergence rate than BnB-BRS, which is consistent with the analysis in [45].
In Fig. 8 and 9, we show the trade-off between communication sum-rate and radar waveform similarity for the constant modulus designs, where we employ the SQR-Binary Search (SQR-BS) algorithm proposed by [33] as our benchmark technique. Fig. 8 demonstrates the communication sum-rate with increasing ε for N = 16, K = 4, SNR = 10dB. As expected, the proposed BnB algorithm outperforms the SQR-BS significantly, since the result obtained by BnB is the global optimum, while SQR-BS can only yield local minimum solutions. It is worth highlighting that the performance of BnB is very close to the convex relaxation bound, which is Average achievable sum rate (bps/Hz)

× 4, SNR = 10dB
Convex Relaxation Bound BnB SQR-BS [33] Zero MUI ZF obtained by solving QP-LB. When the similarity tolerance ε is big enough, our BnB algorithm can approximate the AWGN capacity, i.e., the MUI can be fully eliminated. Fig. 9 shows the results of radar pulse compression with different similarity tolerance ε, where we use the waveform transmitted by the first antenna, and employ the classic FFT-IFFT pulse compression method [50] with a Taylor window to reduce the power of sidelobes. From Fig. 9 we see that there exists a trade-off between the communication sum-rate and radar pulse compression performance. Moreover, the pulse compression results of BnB and SQR-BS are nearly the same, as their performance is guaranteed by the same waveform similarity constraint, which again proves the superiority of the proposed BnB Algorithm.

VII. CONCLUSION
In this paper, we discuss the waveform design for dualfunctional radar-communication system, which can be used for both target detection and downlink communications. First of all, two design approaches are proposed to minimize the multi-user interference while formulating an appropriate radar beampattern, which have been further extended as a weighted optimization to achieve a flexible trade-off between radar and communications. It has been proven that the computational costs for the above three approaches are all at the same level with the communication-only ZF precoding. Numerical results show that all the proposed methods outperform the ZF precoding, while guaranteeing both the radar and communication performance. Moreover, our trade-off design can significantly improve the communication performance by allowing a slight performance loss at radar. Finally, we consider the non-convex constant modulus waveform design with similarity constraints, where an efficient global optimization algorithm based on the branch-and-bound framework has been developed. Gradient projection algorithms are used to efficiently obtain the upper and lower bounds. Simulations show that the proposed BnB algorithm for constant modulus waveform design with similarity constraints considerably outperforms the conventional SQR-BS algorithm by obtaining the global optimum of the problem.

APPENDIX DERIVATION OF THE PROJECTOR PR 2
The projector are derived for two cases respectively, i.e., the open angle of the circular segment is (a) less than π or (b) greater than π. We start from the first case. As shown in Fig.  10 (a), the whole complex plane C has been divided into five parts. The lower and the upper bounds for the angle are l and u respectively. Let us define where T is the midpoint of AB. Given X ∈ C, we aim to find a nearest point PR 2 (X) ∈ M 1 as the projection. Note that ∀X ∈ M 1 , the projection is itself. For X ∈ M 2 and X ∈ M 3 , the nearest points are A and B respectively. For The projector is then given as X T , f 1 (X) , f 4 (X) , f 5 (X) ≤ 0 (X ∈ M 4 ) , X/ |X| , else, (61) where X T is the foot of perpendicular on AB, i.e., XX T ⊥AB, X T ∈ AB. This is given by For the case of φ ≥ π the projector is the same. The only difference is that f 1 (X) should be defined as