Resource Allocation for D2D Cellular Networks with QoS Constraints: A DC Programming-based Approach

Device-to-device (D2D) communications provide efficient ways to increase spectrum utilization ratio with reduced power consumption for proximity wireless applications. In this paper, we investigate resource allocation strategies for D2D communications underlaying cellular networks. To be specific, we study the centralized resource allocation algorithm for controlling transmit powers of the underlying D2D pairs in order to maximize the weighted sum-rate while guaranteeing the quality of service (QoS) requirements for both D2D pairs and cellular users (CUs). A novel DC (difference of convex function) programming-based method, called alternative DC (ADC) programming, is introduced to effectively solve this complicated resource allocation problem. Through updating each D2D pair’s power alternatively, the QoS requirement for each D2D pair can be solvable and incorporated systematically to the introduced ADC programming framework. The simulation results show that the introduced ADC programming achieves the highest weighted sum-rate compared to the state-of-the-art methods while ensuring that the QoS of each D2D pair and CU are satisfied.


I. INTRODUCTION
To meet the explosively increasing demands of data rate services in mobile networks, device-to-device (D2D) communications attracted a lot of attention from both academia and industry. When combined with traditional cellular networks, D2D communications allow mobile users to communicate directly without going through the base station (BS) [1], [2]. It significantly improves the underlying spectral-efficiency by reusing the radio spectrum. D2D communications can be deployed in a distributed fashion based on schemes such as spatial spectrum sensing [3], [4] and vehicle-to-vehicle communications [5] to decrease the signaling overhead from the BS compared to centralized resource allocation algorithms. However, distributed resource allocation algorithms for D2D communications lack global information to guarantee reliable cellular communications [6]. When multiple users coexist on the same spectrum bandwidth, the resource allocation needs to be carefully designed to ensure D2D communications will not generate severe interference to the existing cellular networks. A game-theoretic auction model can be developed to study the interactions between a number of cellular users (CU) and D2D pairs to optimize their underlying payoffs [7]. Learning-based distributed resource allocation algorithms can also be adopted to enable users to learn their transmission strategies through interaction with each other [8], [9]. On the other hand, these distributed strategies may require a very large training overhead to stabilize. In this regard, a centralized controller is needed for designing an efficient resource allocation scheme to maximize the network performance while providing QoS guarantees for CUs and D2D pairs, respectively. In this paper, we follow the network architecture introduced in the Long Term Evolution (LTE)-Advanced D2D networks [10], where base stations (BS) are present to provide centralized power allocations for all D2D pairs. To be specific, we target maximizing the weighted sum-rate while guaranteeing quality of service (QoS) requirements for both D2D pairs and CUs under the power constraint of each D2D pair. The transmission strategies of D2D pairs have to be carefully designed and coordinated to maximize the overall network throughput while satisfying QoS requirements of users.
The sum-rate maximization problem under power constraints of individual users is proved to be NP-hard when more than one user exists [11], so it is difficult if not impossible to obtain the optimal solution. In this paper, we consider a more complicated weighted sum-rate maximization problem for multi-user and multi-channel D2D networks underlaying cellular networks with individual QoS requirements and power constraints. Therefore, existing algorithms have to rely on many sub-optimal and heuristic approaches. To simplify the sum-rate maximization problem, most existing works divide this problem into two steps: channel allocation and power control [12]- [15]. These approaches often assumed that only one D2D pair and one CU share one channel in the channel allocation step because the power control step becomes relatively easy in a two-user single-channel system [12]- [14]. In [15] and [16], multiple D2D pairs are considered to share one channel with one CU. However, the QoS requirements of D2D pairs are not considered in [15] while D2D pairs' throughput is not included in the sumrate objective in [16]. As a result, these studies provide very limited understanding on the resource allocation of a general multi-carrier D2D network with separate QoS constraints for CUs and D2D pairs.
In order to enable multiple users to share multiple channels, certain approximations are made for solving this complicated sum-rate maximization problem. Specifically, geometric programming (GP) based approach that approximates the capacity log 2 (1 + SINR) as log 2 (SINR) is a common way [17]- [19]. In this way, the original non-convex sum-rate maximization problem is converted to a convex optimization problem via a logarithmic change in variables, and then GP can be solved efficiently using mature convex optimization techniques. The drawback of the GP-based approach is that the approximated capacity is not accurate especially when the signal-to-interference-plus-noise ratio (SINR) is low. Therefore, the GP-based approach is assumed to be operated in a high SINR regime. Unfortunately, due to the massive number of devices and the limited radio spectrum, the low SINR regime is the typical operation regime for D2D networks.
Our solution for this weighted sum-rate maximization problem is based on difference of convex functions (DC) programming [20], [21]. By utilizing the quotient property of logarithms, the objective function of the sum-rate maximization problem could be converted into minimizing a difference of two convex functions. DC programming is a powerful optimization technique because DC function encompasses a wide range of real-world non-convex problems [22]. To be specific, many non-convex problems in wireless networks can be solved in the DC programming framework, especially for logarithms based system throughput maximization problems. Compared to previous DC-based methods with the assumption on the single-channel wireless network [15], [23]- [25], we consider the multi-user multi-channel D2D networks in this paper, which is a much more difficult optimization problem. Furthermore, we consider individual QoS requirements.
In general, different metrics such as reliability [13], delayviolation probability [26]- [28], and average delay [29] can be adopted as the underlying QoS constraints. In this paper, due to the spectrum sharing nature of the problem, we utilize the minimum effective data rate for the D2D pairs and the reliability constraint for CUs in the DC optimization framework as the underlying QoS requirements. In single-channel systems, the minimum data rate requirement can be easily recast as a linear inequality constraint [24]. However, handling minimum data rate requirements is much more challenging in multi-channel systems, so the existing DC-based method in multi-channel systems ignores the minimum data rate requirements [30]. Therefore, we introduced an alternative power allocation method based on DC programming. We found that by updating each D2D pair's power alternatively, the minimum rate constraint for each D2D pair constructs a convex set while treating other D2D pairs' interference as constant noise. Accordingly, we developed a method to approximate this convex set through multiple linear inequalities during one D2D pair's power update. The linear inequality constraints can be generated adaptively until the minimum data rate requirement is satisfied and these linear constraints for the minimum data rate can be easily incorporated in the introduced DC programming optimizer.
Our approach for solving this DC programming problem is based on the branch-and-bound algorithm [31], [32], which is widely used in solving non-convex problems. The branchand-bound method is a global optimization approach, by which the optimal solution can be obtained [23], [33]. However, neither [23] nor [33] considers individual minimum data rate requirement, which results in unfair resource allocation among users. Furthermore, both of them impose the maximum power constraint on the sum power of all the users. Nevertheless, individual power constraints are more relevant for D2D networks since the transmit power is restricted by each D2D device. From optimization perspective, individual power constraints on D2D pairs are more difficult to solve since they form a more complicated feasible solution set, especially in multi-channel networks. Another approach for solving the DC programming problem uses the first-order Taylor series approximation with an iterative method [30], [34], [35]. However, this approach is easily stuck in a suboptimal solution if the objective and constraint functions in the DC programming problem is highly non-linear. Since we are considering a complicated sum-rate maximization problem with separate QoS constraints for CUs and D2D pairs, this approach cannot provide a near-optimal solution.
Main contributions of the paper are as follows: 1) The introduced ADC programming algorithm is able to efficiently solve a complicated weighted sum-rate maximization problem for multi-user and multi-channel D2D networks underlaying cellular networks with individual QoS requirements and power constraints for both D2D pairs and cellular users. Note that the weighted sumrate maximization problem under individual QoS requirements and power constraints is an NP-hard problem while existing methods have to rely on specific assumptions to simplify the problem formulation. The ADC programming method provides a comprehensive solution without simplifications, which increases the practical significance of our method. 2) We conduct the theoretical analysis on the statistical performance guarantees of the minimum instantaneous data rate requirement for cellular users. This provides more accurate performance protection for cellular users while improving the overall network throughput. The analytical characterization is also evaluated through simulation to show the accuracy of the theoretical analysis.
3) The theoretical analysis including the convergence guarantee, the performance bound, and the complexity analysis of the ADC programming algorithm in each power update is provided. Furthermore, we prove that the alternative power update scheme is guaranteed to converge. This allows us to provide the theoretical guarantees on the fundamental aspects of the ADC method. 4) Extensive simulation results are conducted to demonstrate the effectiveness and efficiency of the ADC programming algorithm. The network performance under various network topologies are evaluated and the computational time of various methods are compared. The results show that the ADC programming method achieves much better system performance under QoS constraints than the GP method and existing DC programming methods. Meanwhile, the ADC programming requires much less computational time than existing DC programming methods.
As opposed to the conference version in [36], we make the following extensions to further enhance the system model and resource allocation strategy: 1) Compared [36], we consider D2D communications underlaying cellular networks, where the major constraint is to keep the interference to CUs under control. Thus we add the reliability constraints for CUs. 2) We decreased the computation time by fine-tuning the optimization procedure. To be specific, only feasible points that can better approximate the lower-bound are selected and the minimum rate constraint is approximated efficiently by adaptively adding linear constraints. 3) We adopt the GP-based method to ensure that the initial power allocation can satisfy QoS requirements of all users, and then the ADC programming method will further improve system performance significantly.
The remainder of the paper is organized as follows: Section II introduces the system model of D2D communications underlaying cellular networks. Section III describes the problem formulation of the underlying resource allocation. Section IV introduces the ADC programming solution based on the branch-and-bound algorithm. Section V explains how to incorporate the QoS requirements into the ADC programming framework. Section VI provides the theoretical analysis of the ADC programming. Section VII contains the performance evaluation and we conclude the paper in Section VIII. All the mathematical notations used in this paper are provided in Table 1.

II. SYSTEM MODEL
FIGURE 1: System model. We consider the resource allocation problem of D2D communications underlaying cellular networks where the uplink (UL) radio resource is shared by D2D pairs. Our goal is to develop a resource allocation strategy for D2D pairs to increase spectrum utilization without harming the performance of cellular networks. UL spectrum sharing is considered for two reasons. First, UL spectrum is usually underutilized compared to downlink (DL) spectrum in the cellular systems. Second, the interference caused by D2D communications only affects the BS in UL spectrum sharing, which can be mitigated by centralized coordination from the BS. We consider the scenario of N D2D pairs coexisting with M CUs, where CUs are operating on M orthogonal wireless channels in the cell. We assume that CUs operate on separate wireless channels, where the same index set, 1, ..., M , is used to denote both CUs and their operating channel. We focus on developing resource allocation strategies for D2D communications coexisting with incumbent CUs. It is important to note that in this paper we follow the design principle of modern cellular networks where the transmit powers and channel allocation of CUs are determined by the VOLUME 4, 2016 BS scheduler at first either individually or through BS-BS coordination [37]. The underlying CUs' resource allocation is then treated as the input for each BS to conduct the resource allocation algorithms for its D2D pairs. Since CUs are the incumbent users in the system while D2D pairs are opportunistically accessing the spectrum, it is natural for the BS to conduct D2D users' resource allocation based on the dynamics of CUs' resource allocations: In the case where CUs' power controls are not changing rapidly, the BS can conduct the introduced ADC programming-based resource allocations. When the CUs' power controls are very dynamic, the BS can adopt the early stopping strategy discussed in Section IV to make sure the resource allocation strategies of D2D users can be obtained in a timely fashion. On the other hand, it is possible that the BS will decide not to allow D2D users to access by setting transmit powers to be zero when CUs' power controls change too fast.
D2D links can share the spectrum resources with multiple CUs as shown in Fig. 1, where D2DT and D2DR represent the transmitter and the receiver of the D2D link respectively. To realize the centralized control, each D2DR needs to periodically estimate the channel states of the links between itself and all transmitters in the D2D network, and then each D2DR reports measurement results to the BS. After the BS obtains the measurement results, it will determine the resource allocation strategies of all D2D pairs, which will be conveyed by scheduling signaling. It is important to consider a D2D device may have a finite-size queue that can cause overflow issues [38]. If a D2D transmitter's queue is near to full, the BS can allocate more resources to that D2D user by setting a higher weight for that D2D user. Meanwhile, a D2D device may change its transmission mode, i.e., a D2D pair either communicate directly or by relaying its data through the BS [39]. In this case, the D2D receivers can measure the effective channel gains in different modes and report to the BS, and then the BS can conduct the resource allocation based on the effective channel gains.
We assume that a D2D pair can access multiple channels to fully utilize the spectrum opportunities. Let p n m be the transmit power of D2D pair n on channel m. Note that channel assignment is implicitly conducted in the optimization since p n m > 0 suggests that D2D pair n will access channel m. The instantaneous SINR for CU m and the instantaneous SINR for D2D pair n on channel m are and respectively, where B m is the channel bandwidth of channel m, N 0 is the noise spectral density, p C m is the transmit power of CU m, h C m,BS is the channel gain between CU m and the BS on channel m, h D j,BS,m is the channel gain between D2DT j and the BS on channel m, h D n,n,m is the channel gain between the transmitter and the receiver of D2D pair n on channel m, h D j,n,m is the channel gain between D2DT j and D2DR n on channel m, and h C m,n is the channel gain between CU m and D2DR n on channel m. Fig. 1 shows an example of all considered channel gains where two D2D pairs coexisting with two CUs. Given the instantaneous SINRs, the instantaneous data rates of CU m and D2D pair n are given by and respectively, where R D inst,n,m is the instantaneous data rate of D2D pair n on channel m.
Each channel gain consists of both the small-scale fading and large-scale fading components. For example, the channel gain h C m,BS is assumed to follow where g C m,BS is the small-scale fading component and assumed to be exponentially distributed with unit mean, A is the path loss constant, d C m,BS is the transmission distance, γ is the decay exponent, s C m,BS is a log-normal shadow fading random variable with standard deviation σ, and α C m,BS combines all large-scale fading components. Other channel gains (h D j,BS,m , h D n,n,m , h D j,n,m , and h C m,n ) are defined using the same form.
We assume that the large-scale fading of the underlying wireless links is known at the BS since they usually vary slowly and do not require frequent measurements and feedback from D2DRs. On the other hand, the small-scale fading is changing much faster and thus the associated channel knowledge is difficult to obtain at the BS. Therefore, we assume the knowledge of small-scale fading is not available at the BS since otherwise a large amount of feedback and control signaling overhead is needed. Following the approaches introduced in [40]- [43], we can approximate the SINR by the effective SINR, which only requires the knowledge of the large-scale fading. To be specific, the effective SINR for CU m and for D2D pair n on channel m are respectively. The expectation E[·] is taken over the smallscale fading distribution. The achievable data rate of CU m and D2D pair n can be further approximated by [40]- [43]. To be specific, given the effective SINRs, the effective data rate of CU m and D2D pair n are and respectively, where R D n,m is the effective data rate of D2D pair n on channel m.

III. PROBLEM FORMULATION
Based on the knowledge of the large-scale fading, the BS performs the computations of power allocation algorithms, aiming at maximizing the weighted sum-rate of D2D pairs and CUs by controlling the transmit power of D2D pairs. Since the power allocation depends only on the knowledge of the large-scale fading, we use effective data rate to calculate the data rate for D2D pairs and CUs. Accordingly, the weighted sum-rate is written as where b C m ∈ {0, 1} represents channel m is used by a CU (b C m = 1) or not (b C m = 0), w D n is the weight for D2D pair n, and w C m is the weight for CU m. The weights represent the importance of individual data rate in the system-wide objective. They are determined by the BS based on the system-wide objective in order to provide the flexibility to adjust the ratio of each user's data rate in the weighted sumrate maximization problem. Besides maximizing the system-wide goal, the QoS of both CUs and D2D pairs have to be satisfied. Since CUs are the incumbent users in the system while the D2D pairs are opportunistically accessing the spectrum, we consider two different QoS requirements for D2D pairs and CUs: The minimum data rate constraint is imposed on the effective data rate for each D2D pair, and a more stringent reliability constraint is imposed on the minimum instantaneous data rate for each CU. In this way, the BS could provide statistical QoS guarantees [44] to CUs while allowing D2D pairs to share spectrum resources efficiently.
To incorporate QoS constraints into the power control algorithm, we utilized the alternative power update scheme, where all D2D pairs take turns optimizing their power allocations. We define the problem formulation for the D2D pair n as (P1), which is written as: where p n = (p n 1 , . . . , p n M ) T is the power allocation of D2D pair n and k denotes other D2D pairs except for n. When D2D pair n updates its power allocation, other D2D pairs' power allocations are kept fixed. The transmit power of each CU m, p C m , is given as the input before the alternative power update starts. The objective function jointly considers maximizing the data rate of D2D pair n while limiting the interference to other D2D pairs and CUs, where the first term is the data rate of D2D pair n, the second term and the third term are the aggregate data rate of other D2D pairs and CUs, respectively. (11b) and (11c) are the maximum and the minimum transmit power constraints of each D2DT, respectively. Each D2D pair j and each CU m have their minimum data rate requirements, which are denoted as R D min,j and R C min,m , respectively. (11d) is the minimum effective data rate requirement for each D2D pair, and (11e) is the reliability requirement on the instantaneous data rate for each CU, where c min is the minimum rate coverage probability of CUs. Note that the inter-cell interference caused by CUs in other cells can be easily incorporated into the problem formulation by being treated as the noise in the SINR calculation.
The ADC programming is designed to let D2D pairs take turns updating their power allocation, and each D2D pair updates its power allocation by solving a DC programming problem. By using the quotient property of logarithms, (11a) can be written as a DC function, and then (P1) becomes a DC programming problem (P2) as follow: and Both f (p n ) and g(p n ) are convex functions of p n .
In each power update of the ADC programming, the QoS requirements (11d) and (11e) are transformed into multiple linear constraints, and then a near-optimal power allocation of (P2) is found by using the branch-and-bound algorithm. The branch-and-bound algorithm is explained in Section IV and the algorithm that transforms QoS requirements into multiple linear constraints is detailed in Section V.

IV. BRANCH-AND-BOUND SOLUTION
In this section, we explain how the BS uses the branch-andbound algorithm to find the global optimum power allocation in (P2) without individual QoS requirements. We will show that QoS requirements can be incorporated into the branch-and-bound framework as additional linear constraints in Section V. To simplify the notation, we omit the notation n in (P2) in this section. The branch-and-bound algorithm branches the feasible region into smaller subregions recursively. In every subregion, an upper-bound and a lower-bound are found to bound the DC objective function. The lowerbound is found by using the polyhedral outer approximation and then the upper-bound can be found by plugging the selected feasible points. During the branch-and-bound process, the lowest upper-bound among all the subregions is tracked. We stop branching a subregion when its lowerbound is very close to the lowest upper-bound found so far. If no subregion has to be branched, the branch-andbound algorithm terminates and returns the power allocation corresponding to the lowest upper-bound. In real-world deployment, the branch-and-bound algorithm can be stopped early and returns the power allocation corresponding to the latest lowest upper-bound to prevent long convergence time.
This power allocation is guaranteed to be better than or equal to the initial power allocation.

A. BRANCHING PROCEDURE
The power constraints, (11b) and (11c), form a M -simplex that is constructed by the convex hull of M + 1 vertices in M -dimensional space as follows: where (16) Each generated simplex can be represented as a node in the binary tree, and the branching procedure splits one node into two children nodes recursively until the stop criterion is satisfied. Note that the root node of the binary tree stands for the whole feasible region, which is denoted by F (1) . Every node i in the binary tree corresponds to a simplex F (i) . Node i is branched to two children nodes 2i and 2i + 1 if the branching criterion is satisfied.

B. BOUNDING PROCEDURE
The bounding process is to find the upper-bound u (i) and the lower-bound l (i) within F (i) . Since we are solving a minimization problem, the upper-bound u (i) can be easily found by plugging any feasible point within F (i) into the objective function. On the other hand, the lower-bound l (i) is found by using the polyhedral outer approximation.
The search order in the binary tree is based on the breadthfirst search. During the search process, the lowest upperbound among all searched nodes has to be updated as follows: where p init is the initial power allocation from previous power update, i is the index of previously searched node, and β (i) is the lowest upper-bound selected from nodes 1, . . . , i. We let p (i) β be the power allocation that corresponds to the lowest upper-bound β (i) .
The criterion for branching a node i is as follows: where > 0 is the tolerance value to decide whether node i needs to be branched. If β (i) − l (i) > is satisfied, it means that the lowest upper-bound is significantly larger than the lower-bound of node i, so it is possible to find an upper-bound lower than β (i) after branching the node i. The algorithm terminates when no nodes need to be branched, and the final power allocation is set to the power allocation corresponding to β (i) , where i is the index of the last searched node. From (17), we can see that the final power allocation must be better than or equal to the initial power allocation. Therefore, the optimal power allocation can be obtained if is sufficiently small.

C. CALCULATING LOWER-BOUND AND UPPER-BOUND
In this subsection, we explain how to find the lower-bound l (i) and the upper-bound u (i) within F (i) , which means that the power constraints (11b) and (11c) in (P2) are replaced by F (i) . There are three key points to find l (i) . First, the QoS requirements are transformed into multiple linear constraints as discussed in Section V. Second, the convex function f (p) is bounded below using the polyhedral outer approximation. Third, a slack variable t is introduced. The resulting optimization problem for finding l (i) is written as where A QoS p ≤ b QoS represent the linear constraints generated by QoS requirements, f (i) LB (p) is the polyhedral outer approximation of f (p), which is written as (20) where L (i) is a set of feasible points. Since f (i) LB (p) is the underestimator of f (p), the optimal solution of (19) can be the lower-bound l (i) . Selecting more points in L (i) increases the approximation accuracy of f (i) LB (p) but decreases the computation speed of the solver. The accuracy is mainly affected by the points within the region of node i, so we only add feasible points within F (i) to accelerate the solver.
All the constraints in (19) can be transformed to linear constraints. Since (19) is to minimize a concave objective function over linear constraints, the optimal solution must be one of the vertices of its linear constraints. We use existing vertex enumeration toolbox to calculate all the vertices and find the vertex (t The upper-bound u (i) can be obtained by plugging in any feasible points within F (i) . To be specific, the point p (i) UB corresponding to u (i) is found by where U (i) is a set consisting of the vertices of F (i) and the lower-bound power solution p (i)

V. QOS REQUIREMENTS
In this section, we design the algorithm to incorporate the QoS requirements, reliability constraints for CUs and minimum data rate constraints for D2D pairs, into the ADC programming approach. Our main idea is to transform the QoS requirements into multiple linear constraints because linear constraints can be incorporated into the branch-andbound solver of the ADC programming easily.

A. RELIABILITY CONSTRAINT FOR CUS
Lemma 1. The reliability constraint (11e) for CU m can be expressed as − 1 is the minimum SINR to achieve the minimum rate for CU m.
Proof. We first derive the cdf of the cellular user rate as: Then we can obtain Pr{R C inst,m ≥ R C min,m } = 1 − Pr{R C inst,m ≤ R C min,m } to complete the proof.
From Lemma 1, if b C m is equal to 1, then rearranging the inequality in (22) can transform the reliability constraint into a linear constraint of p n , which is written as follows (23) Therefore, we can transform all active CUs' QoS requirements into linear constraints of p n and then incorporate them into the branch-and-bound solver of the ADC programming.

B. MINIMUM DATA RATE FOR D2D PAIRS
The incorporation of the minimum rate constraints for D2D pairs into the branch-and-bound solver consists of three steps. First, multiple linear constraints are generated adaptively to represent the minimum rate requirements of CUs and the VOLUME 4, 2016 D2D user that updates its power. Second, we design an algorithm to guarantee minimum rate requirements of other D2D pairs that do not update their power are achieved. Third, we adopt the GP-based method to ensure that the initial power allocation can satisfy all users' minimum rate requirements.
The minimum data rate constraint of D2D pair n is which cannot be transformed into linear constraints directly. We can prove that this minimum data rate constraint forms a convex set of p n while taking other D2D pairs' power as fixed. To be specific, the left-hand side of the inequality is a concave function of p n . Multiplying both sides by −1 transforms the concave function to the convex function and reverses the direction of the inequality sign. From 3.1.6 in [45], we know that the α-sublevel set {p ∈ domf | f (p) ≤ α} of a convex function f (p) with α ∈ R is a convex set. Therefore, we can conclude that the minimum rate constraint forms a convex set with respect to p n . The supporting hyperplane theorem from 2.5.2 in [45] is adopted to approximate the minimum rate constraint with multiple linear constraints. For any boundary point z of the convex set, there exists a hyperplane passing through it, which is written as a T p = a T z, where a = 0. For any p within the convex set, the linear constraint a T p ≤ a T z is satisfied. The convex set formed by the minimum data rate constraint can be approximated well if there are sufficient number of linear constraints corresponding to different boundary points. To explain how to incorporate D2D pairs' minimum rate constraints, we use an example of a two-channel scenario as shown in Fig. 2, where R 1 and R 2 represent the data rate of D2D pair n on the first and the second channel, respectively, the shadow region is the convex set formed by the minimum rate constraint of D2D pair n, and the first linear constraint is a T 1 p n ≤ a T 1 z 1 . Let p (0) BAB be the power allocation found by the branch-and-bound algorithm without the linear constraint a T 1 p n ≤ a T 1 z 1 . If p (0) BAB is unable to achieve the minimum data rate, then we can prune the invalid search region by adding a T 1 p n ≤ a T 1 z 1 into the branchand-bound algorithm, and then find p (1) BAB . However, it is still possible that p (1) BAB does not achieve the minimum data rate. In this case, more linear constraints have to be generated until the found power allocation achieves the minimum data rate of D2D pair n. To avoid adding many redundant supporting hyperplanes such that the complexity of the branch-andbound solver becomes very high, we design an algorithm that adds a new supporting hyperplane based on the previously found power allocation. To be specific, we would find the next linear constraint a T j+1 p n ≤ a T j+1 z j+1 by letting a j+1 = ∂R D n (p BAB . In this way, we can prune the invalid search regions that is close to the previously found power allocation as shown in Figure 3, which prevents the solver from finding a similar power allocation after adding the new linear constraint. Therefore, the invalid search region can be pruned efficiently. This procedure iterates until the minimum rate requirement of D2D pair n is achieved. Another important issue has to be addressed is other D2D pairs' minimum data rate requirements. After the power allocation p n for the D2D pair n is updated, other D2D pairs may be unable to achieve their minimum rate requirements because the interference caused by the D2D pair n changes. Therefore, we design an algorithm to guarantee the minimum data rate requirements of other D2D pairs by adjusting the D2D pair n's maximum power constraint M m=1 p n m ≤ φ n max . To be specific, φ n max is initialized to P n max and then the power allocation for the D2D pair n is found by using the branchand-bound algorithm. If other D2D pair k = n is unable to achieve its minimum data rate requirement, then we should decrease the maximum power constraint of the D2D pair n in order to decrease its interference to the D2D pair k. The new φ n max is set to below M m=1 p n m because we should exclude previous found power allocation for the D2D pair n from the search region, otherwise the solver will return the same solution again. This procedure iterates until all other D2D pairs can meet their minimum data rate with the found power allocation for the D2D pair n. The last important issue is the power initialization. Since the transmit powers of D2D pairs are updated alternatively, a good power initialization is important to our ADC programming algorithm. An improper power initialization may result in the situation that we cannot find the power allocation that meets minimum data rate constraints of all users. We adopt GP to find a power initialization point. If the GPbased solution satisfies the minimum data rate requirements using log 2 (SINR), then it must satisfy the minimum data rate requirements using log 2 (1 + SINR) because log 2 (1 + SINR) > log 2 (SINR). Note that the error caused by this approximation is not negligible in the typical SINR value of communication systems [46], e.g., the relative error is 10% when SINR is 7dB. Therefore, we initialize the power allocation using GP to let all the CUs and D2D pairs meet their QoS requirements first, and then the system throughput is improved by the introduced ADC programming while ensuring the QoS requirements of users.

VI. THEORETICAL ANALYSIS
The convergence, performance bound, and the complexity analysis of the branch-and-bound algorithm are provided in the following lemmas.
Lemma 2. The branch-and-bound algorithm is guaranteed to converge, i.e., the branching process in (18) must stop as more nodes are generated.

Proof. See Appendix A.
Lemma 3. The difference between the objective value of the branch-and-bound solution and that of the optimal solution is bounded, which is written as: where p n BAB is the branch-and-bound solution of (P 2), p n opt is the optimal solution of (P2), and > 0 is a control parameter of the branch-and-bound algorithm defined in (18).
Lemma 4. The branch-and-bound algorithm is guaranteed to stop if the following condition is met: where U is the feasible region of (P 2), P n max is the maximum power constraint of the D2D pair n, L is the number of total generated nodes during the branch-and-bound algorithm, f (p) is a function defined in (13), ∇ 2 f (p) is the Hessian matrix of f (p), ρ(·) is the spectral radius of a square matrix, and > 0 is a control parameter of the branch-and-bound algorithm defined in (18).

Proof. See Appendix C.
Remark. The complexity of the branch-and-bound algorithm is O(L), where L is the number of total generated nodes. Lemma 4 demonstrates that the lower-bound of L scales with the spectral radius of ∇ 2 f (p). This provides a clear indication that the complexity of the branch-andbound algorithm scales with the spectral radius of ∇ 2 f (p). Furthermore, there is a trade-off between the optimality and the convergence. From Lemma 3, we find that a small results in a near optimal solution. However, from Lemma 4, it can be seen that a small decreases the convergence rate by making the convergence criterion more difficult to achieve.
A round of the alternative power update is from the D2D pair 1 to the D2D pair N . The power update among D2D pairs stops when the objective value converges. To be specific, the convergence criterion is that the objective value difference between the current round j and the previous round j − 1 is less than η, which is written as where h(p 1 BAB,j ) is the branch-and-bound solution of the D2D pair 1 in round j and h(p N BAB,j−1 ) is the branch-andbound solution of D2D pair N in round j − 1.
Lemma 5. The convergence criterion of the alternative power update among D2D pairs is guaranteed to be satisfied.
Proof. See Appendix D.

VII. SIMULATION RESULTS
In this section, we evaluate our ADC programming (ADCP) method for power control by simulation. We consider a single cell network with N D2D pairs and M CUs, where CUs are transmitting on M separate orthogonal channels. In 3GPP 4G and 5G system level evaluation, 10 active CUs per cell is assumed (Section A.2.1.3.2 of [47] and Section 2.2 of [48]). In our work, we adopt a similar assumption where 4/6 CUs and 6/8/10 D2D pairs co-exist. The CUs are randomly dropped within the r C radius of the BS. The transmit distance of D2D pair n is d n , and the location of each D2D pair is within the r D radius of the BS. The rest of simulation parameters are listed in Table 2.
We conduct simulations under scenarios of different numbers of CUs and D2D pairs. In each scenario, the locations of all users are randomly generated for 400 times. The BS determines the power allocation of D2D pairs based on the existing knowledge of large-scale fading components of the channels. The system performance is evaluated with randomly generated small-scale fading components for 200 times. Therefore, we evaluate the system performance 80000 times in each scenario of fixed numbers of D2D pairs and CUs. VOLUME 4, 2016  We demonstrate the effectiveness of the introduced ADC programming method by comparing with 3 state-of-the-art power allocation methods: one GP-based method [18] and two DC-based methods [24], [30]. Similar to [49], [50], we compare with the global optimal solution using the branchand-bound (BB) method to show the performance gaps of different methods with the optimal solution [31]. We consider a complicated power allocation problem in multi-user and multi-channel D2D networks with individual QoS and power constraints, where existing methods have to make certain simplifying assumptions in order to solve the problem. To be specific, the GP method replaces log 2 (1 + SINR) with log 2 (SINR). One DC-based method discards QoS requirements of users (DCP1) [30], and the other DC-based method only considers single-channel systems (DCP2) [24]. The ADC programming method provides a comprehensive solution without simplifications, which increases its practical significance. Fig. 4 compares the weighted sum-rate for different numbers of CUs and D2D pairs through empirical cumulative distributions. The weighted sum-rate increases as the number of D2D pairs increases because deploying more D2D pairs allows more spatial spectrum access opportunities in the network [3], [4]. We can observe that the weighted sumrate of the GP method is much worse than that of the ADCP method. The reason is that the GP method uses log 2 (SINR) to approximate log 2 (1 + SINR), and this approximation is not accurate at high interference level when the SINR is low. To be specific, the approximation becomes extremely conservative when calculating the minimum data rate of D2D pairs because the approximation error accumulates over multiple channels. It can be clearly seen that the DCP1 method achieves the highest weighted sum-rate in every scenario, since the DCP1 method only maximizes the weighted sum-  rate without considering any QoS requirements. The DCP2 method performs poorly in every scenario because it has to pre-allocate the maximum power and minimum rate constraints to multiple channels before conducting optimization. The reason is that the DCP2 method only supports optimization on one channel. However, the system performance is highly correlated to the power and minimum rate constraints among channels, which are difficult to be decided in advance. In sum, the DCP1 method can achieve the highest weighted sum-rate since it does not have any QoS requirements of users. Our ADCP method always achieves higher weighted sum-rate compared to the DCP2 method and the GP method while guaranteeing QoS requirements of users. Both the GP method and the DCP2 method cannot achieve high weighted sum-rate due to their heuristic and suboptimal approaches to satisfy the QoS requirements of users. It is important to note that our ADCP method has the similar weighted sum-rate as the BB method, which is the global optimal solution. Fig. 5 shows the rate coverage probability of individual user rate for different numbers of D2D pairs, where the number of CUs is set to 4. The minimum data rate requirements for D2D pairs and CUs are marked by bold vertical lines and the minimum rate coverage probability of CUs is marked by a bold horizontal line. The ADCP method and the GP method satisfy the reliability constraint of CUs and have similar probability that a D2D pair can meet the minimum rate constraint. Although our ADCP method has similar individual user rate performance as that of the GP method, it is important to note that our ADCP method achieves much higher weighted sum-rate compared to the GP method. Since the DCP1 method does not have any QoS constraints during optimization, it cannot meet the reliability constraint of CUs and has a lower probability to achieve the minimum rate constraint of D2D pairs compared to the ADCP method and the GP method. The DCP2 method has the lowest probability that a D2D pair can meet the minimum rate constraint. This is because the DCP2 method has to pre-allocate each D2D pair's maximum power and minimum rate constraints among channels. However, a D2D pair may not achieve the minimum rate requirement using the pre-allocated power and rate constraints. It is important to note that our ADCP method has very similar individual user rate and weighted sum-rate performance as that of the BB method, which demonstrates that our ADCP method can achieve a near-optimal solution. Fig. 6 demonstrates the impacts of weights in the weighted sum-rate objective. As discussed in (10), different weights are used to reflect different priorities of individual data rate in the system-wide objective. The combination of weight value is determined by the BS based on the system-wide objective of the resource allocation algorithm. We can observe that the sum-rate of CUs scales with the weights for CUs. Compared to CUs, the transmit distances of D2D pairs are relatively short, so the achievable rates of D2D pairs are usually larger than CUs. Therefore, if we use the same weights for these two kinds of users, the weighted sum-rate maximization result usually provides preference towards D2D pairs.
We compare the average computation time for these methods in Fig. 7 when implemented and executed on the same machine with 2.71 GHz Intel i5 CPU and 12 GB RAM. The GP method has the least computation time in every scenario because the GP method can be converted to a convex optimization problem via logarithmic change in variables, (a) Coverage probability of cellular user rate (4 CUs, 6/8/10 D2D pairs).
(b) Coverage probability of D2D user rate (4 CUs, 6/8/10 D2D pairs). which can be solved efficiently using mature convex optimization toolbox. The DCP 1 method is the slowest because it has to iteratively solve a complicated DC function without alternative power update, where the number of variables is large during optimization. Since the DCP2 method only supports optimization over one channel, it has to solve multiple VOLUME 4, 2016  DC problems. Therefore, the DCP2 method is much slower than the GP and the ADCP method. The ADCP method is slightly slower than the GP method but is much faster than the DCP1 and DCP2 methods. The average computational time of our ADCP method is less than 10 sec. our ADCP method can be applied to real-world scenario because we focus on developing resource allocation strategies only based on the knowledge of large-scale fading. Since large-scale fading (path loss, shadow fading) usually vary slowly, 10 sec computational time is tolerable in the real-world deployment. Note that we do not plot the average computational time of the BB method because it is much higher than other methods. Even for the simplest scenario of 4 CUs and 6 D2D pairs, the average computational time is 81.4 sec. It shows that finding the global optimal solution using the BB method is not practical in terms of computational complexity.

VIII. CONCLUSIONS
In this paper, we study a complicated resource allocation problem of the multi-user and multi-channel D2D communications underlaying cellular networks with individual QoS requirements and power constraints. The introduced ADC programming method provides a comprehensive solution of this complicated resource allocation problem, while existing methods have to rely on specific assumptions to simplify the problem formulation. The minimum data rate constraint is imposed on the effective data rate for each D2D pair, and a more stringent reliability constraint is imposed on the minimum instantaneous data rate for each CU. The theoretical analysis on the convergence guarantee, the performance bound, and the complexity analysis of the ADC programming method in each power update is provided. Furthermore, we prove that the alternative power update scheme in the ADC programming is guaranteed to converge. Extensive simulation results show that our approach achieves the best performance compared to the state-of-the-art methods in all scenarios. .

APPENDIX A PROOF OF LEMMA 2
When a node i is branched to two children nodes 2i and 2i + 1, the polyhedral outer approximation of f (p) becomes more accurate, i.e., f (p) ≥ f (i) LB (p). From (19), it can be seen that the feasible regions of t in node 2i and node 2i + 1 are smaller than that in node i. Furthermore, the feasible regions of p in node 2i and node 2i + 1 are also smaller than that in node i. Since (19) is a minimization problem, l (2i) and l (2i+1) must be larger than or equal to l (i) . From (17), β (i) is a non-increasing function with respect to i. Therefore, both β (2i) −l (2i) and β (2i+1) −l (2i+1) are smaller than or equal to β (i) − l (i) . Furthermore, we have lim i→∞ β (i) − l (i) ≤ lim i→∞ u (i) − l (i) = 0, where the inequality follows (17) and the equality follows that the feasible region of p becomes infinitely small when i → ∞. It means that the branching process must stop (β (i) − l (i) ≤ ) as more nodes are generated.

APPENDIX B PROOF OF LEMMA 3
Since (P2) is a minimization problem, we have h(p n BAB ) ≥ h(p n opt ). Let j be the index of the last searched node in the branch-and-bound algorithm, and we have h(p n BAB ) = β (j) ≤ β (i) for any i ≤ j, where the inequality follows (17). From the branching criterion (18), we have h(p n BAB ) − l (i) ≤ β (i) − l (i) ≤ . Since l (i) ≤ h(p n opt ), we obtain h(p n BAB ) ≤ l (i) + ≤ h(p n opt ) + . Therefore, the performance bound of the branch-and-bound solution is 0 ≤ h(p n BAB )−h(p n opt ) ≤ , which is controlled by .

APPENDIX C PROOF OF LEMMA 4
Let i be the index of the last searched node. From (20), we have f According to the alternative power update scheme, when we identify the branch-and-bound solution for D2D pair n, other D2D pairs' power allocations are kept fixed. Let D2D pair k be the D2D pair that updates its power before D2D pair n. Let p n BAB and p n init be the branch-and-bound solution and the initial power allocation, respectively. Note that h(p n init ) = h(p k BAB ) because the initial power allocation of D2D pair n is from previous power update. From (17), we have h(p n BAB ) ≤ h(p n init ). Therefore, we have h(p n BAB ) ≤ h(p k BAB ). This means that the sequence of objective values generated by the alternative power update scheme is a non-increasing sequence. Since the objective value is lower bounded, the sequence will converge according to the monotone convergence theorem. Therefore, the convergence criterion in (27) must be satisfied.

ACKNOWLEDGMENT OF SUPPORT AND DISCLAIMER
(a) Contractor acknowledges Government's support in the publication of this paper. This material is based upon work funded by NSC under NSC-17-7030. (b) Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of AFRL.