A Max-Min Task Offloading Algorithm for Mobile Edge Computing Using Non-Orthogonal Multiple Access

To mitigate the computational power gap between the network core and edges, mobile edge computing (MEC) is poised to play a fundamental role in future generations of wireless networks. In this letter, we consider a non-orthogonal multiple access (NOMA) transmission model to maximize the smallest task offloaded by any user to the network edge server. A provably convergent and efficient algorithm is developed to solve the considered non-convex optimization problem of maximizing the minimum number of offloaded bits in a multi-user NOMA-MEC system. For given MEC delay, power and energy limits, the NOMA-based system considerably outperforms its optimized orthogonal multiple access (OMA) based counterpart in MEC settings. Numerical results demonstrate that the proposed algorithm for NOMA-based MEC is particularly useful for delay-sensitive applications.


I. INTRODUCTION
Non-orthogonal multiple access (NOMA) has gained tremendous popularity in the past few years as a potential scheme to facilitate massive connectivity, as well as to support low-latency communications. On the other hand, mobile edge computing (MEC) is being considered as a key enabler for the communication among devices with low data-processing capability, where such devices offload their computationally-intensive tasks to an edge server. The superiority of NOMA-based MEC has been proven over its orthogonal multiple access (OMA) based counterparts, in terms of energy and/or delay minimization.
In this context, the problem of energy consumption minimization in NOMA-MEC systems was explored in [1]-[4]. On the other hand, the authors in [5]-[9] considered the problem of delay/latency minimization in NOMA-MEC systems. More interestingly, in [10], the optimization problem for a two-user NOMA-MEC system was formulated as a Stackelberg game where the users tend to minimize their energy consumption, whereas the MEC server attempts to minimize the task execution time. The problem of system cost minimization in an ultra-dense NOMA-MEC network was discussed in [11]. The problem of minimizing a weighted sum of energy consumption and latency in a NOMA-MEC system was considered in [12], and an efficient solution was obtained using a deep reinforcement learning (DRL) approach. The problem of minimizing the total energy consumption in a NOMA-MEC system using DRL in the context of the industrial internet of things was presented in [13].
In the case of a NOMA-MEC system, the users often partition their computation tasks into two categories: low-complexity and high-complexity tasks. The low-complexity tasks are executed locally, while the high-complexity tasks are offloaded to the MEC server. Therefore, different from the existing literature, we consider the problem of maximizing the minimum number of bits (of the high-complexity tasks) offloaded by users to an MEC server using NOMA, while constraining the total energy, the transmission power and the offloading delay within given limits. In this context, Huang et al. studied a max-min computation efficiency problem in a NOMA-MEC system [14], and a similar problem in a millimeter wave setting was explored in [15]. In order to improve the physical-layer security of offloaded data in a NOMA-MEC system, a max-min anti-eavesdropping problem was presented in [16].
Our system model is unique in the sense that it exploits NOMA to offload data of multiple users in a single time slot for mobile edge computations without impacting the performance of users. We solve the formulated max-min non-convex problem using the framework of successive convex approximation (SCA). A provably convergent algorithm is derived based on novel techniques to bound the non-convex functions appearing in the main optimization problem. Precisely, our main contributions include: (i) considering a novel model that integrates NOMA with MEC in a multi-user scenario, and formulating the main problem of maximizing the number of bits offloaded by the worst user to the MEC server using NOMA; (ii) deriving a tractable and provably convergent algorithm, based on novel bounding techniques and a successive convex optimization strategy, to solve the formulated problem; and (iii) numerically demonstrating the superiority of the proposed solution over the corresponding orthogonal multiple access (OMA) based approach.

II. SYSTEM MODEL AND PROBLEM STATEMENT
Consider a NOMA-MEC system consisting of a base station (BS), which is assumed to have an integrated MEC server, and a set of $M$ users, denoted by $\mathcal{U} \triangleq \{U_1, U_2, \ldots, U_M\}$. It is assumed that the BS and all users are each equipped with a single antenna. Furthermore, we assume that the users have very limited data processing capability, and therefore, they offload their computationally-intensive tasks to the MEC server. The channel gain between $U_m$ ($m \in \mathcal{M} \triangleq \{1, 2, \ldots, M\}$) and the BS is denoted by $g_m$. The deadline for $U_m$ to offload its task to the MEC server is denoted by $D_m$, and without loss of generality, we consider $D_1 \le D_2 \le \cdots \le D_M$. It is also assumed that a user stops transmitting as soon as it has completely offloaded its data, even if this happens before its assigned deadline. In particular, after $U_{m-1}$ finishes offloading its data, $U_m$, which also transmits in the previous users' time slots, is allowed some additional time $\bar{D}_m$ to finish offloading its data before the hard deadline $D_m$, such that $\sum_{j=1}^{m} \bar{D}_j \le D_m$.¹ As a result, users $U_m$ to $U_M$ simultaneously offload their data to the MEC server during $\bar{D}_m$ by means of NOMA. A timing diagram for an example of data offloading in a three-user NOMA-MEC system is shown in Fig. 1.
When the objective is the sum rate, an arbitrary decoding order can be used for successive interference cancellation (SIC) in uplink NOMA [9], [17]. Nonetheless, the optimal decoding order in the NOMA uplink with fairness considerations is relatively unexplored [18]. For the purpose of this work, we fix the user decoding order according to delay sensitivity, i.e., we consider SIC with respect to the users' delay ordering such that higher-indexed users are decoded first, similar to the arguments in [2], [5], [9] and [19]. In this way, the lower-indexed users (i.e., those with stricter offloading deadlines) do not see interference from the higher-indexed ones that transmit in their time slots. It should be emphasized here that NOMA is a natural candidate to integrate with MEC for the type of system model considered in our paper. In particular, since users also transmit in the previous users' time slots, NOMA, with its interference cancellation and power adaptation capabilities, is required for the smooth operation of the users impacted by such interference.
The transmit power of $U_m$ during $\bar{D}_j$, $j \in \{1, 2, \ldots, m\}$, is denoted by $P_{mj}$. Also, by properly normalizing the channel gains, we assume that the noise at the BS is distributed as a zero-mean unit-variance circularly symmetric complex Gaussian random variable, $\mathcal{CN}(0, 1)$. Therefore, letting $B$ denote the total available bandwidth, the data offloaded (in nats) by $U_m$ to the MEC server is given by
$$N_{\text{off},m} = B \sum_{j=1}^{m} \bar{D}_j \ln\left(1 + \frac{g_m P_{mj}}{1 + \sum_{i=j}^{m-1} g_i P_{ij}}\right). \tag{1}$$
Note that the expression for the rate of $U_m$ (i.e., the logarithmic term in (1)) satisfies the condition for (deadline-dependent) SIC ordering as discussed in [2] and [5].
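As an illustration of the deadline-ordered SIC rule, the following minimal Python sketch evaluates the data offloaded by one user for a given power and time allocation. It assumes that, during slot j, user m is interfered only by the still-active lower-indexed users j, ..., m-1 (the higher-indexed users having been decoded and cancelled first) and that the noise power is normalized to one. The function name `offloaded_nats` and the dictionary layout of `P` are hypothetical choices made for this example.

```python
import math

def offloaded_nats(m, g, P, D_bar, B=1.0):
    """Data (in nats) offloaded by user m (1-indexed) under deadline-ordered SIC.

    g[i-1]     : normalized channel gain of user i (unit noise power assumed)
    P[(i, j)]  : transmit power of user i during slot j (defined for j <= i)
    D_bar[j-1] : duration of slot j
    B          : bandwidth
    """
    total = 0.0
    for j in range(1, m + 1):
        # Users m+1..M are decoded and cancelled first, so only the
        # still-active lower-indexed users j..m-1 interfere with user m.
        interference = sum(g[i - 1] * P[(i, j)] for i in range(j, m))
        sinr = g[m - 1] * P[(m, j)] / (1.0 + interference)
        total += D_bar[j - 1] * math.log(1.0 + sinr)
    return B * total

# Two users, unit gains, unit powers, unit slot durations.
g = [1.0, 1.0]
P = {(1, 1): 1.0, (2, 1): 1.0, (2, 2): 1.0}
D_bar = [1.0, 1.0]
print(offloaded_nats(1, g, P, D_bar), offloaded_nats(2, g, P, D_bar))
```

In this toy setting user 1 sees no interference, whereas user 2 is interfered by user 1 in the first slot and transmits alone in the second one.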
In this paper, we are interested in finding the power allocation $P_{mj}$ and the additional times $\bar{D}_m$ that maximize the worst task to be offloaded among all users, which is formulated as
$$\underset{\{P_{mj} \ge 0\},\, \{\bar{D}_m \ge 0\}}{\text{maximize}} \quad \min_{m \in \mathcal{M}} N_{\text{off},m} \tag{2a}$$
$$\text{subject to} \quad \sum_{m=1}^{M} \sum_{j=1}^{m} \bar{D}_j P_{mj} \le E_{\text{th}}, \tag{2b}$$
$$\sum_{j=1}^{m} \bar{D}_j \le D_m, \quad m \in \mathcal{M}, \tag{2c}$$
$$P_{mj} \le P_m, \quad j \in \mathcal{M},\ m \in \{j, j+1, \ldots, M\}. \tag{2d}$$
It is noteworthy that (2) is different from the conventional max-min rate optimization problem due to the presence of $\bar{D}_j$ in (1), which is one of the optimization variables. The main motivation for considering this problem stems from promoting fairness among users in offloading their data to the MEC server for computational purposes. If this consideration is ignored, the users that are more delay sensitive, and are unable to offload more data within their delay constraints due to adverse channel conditions, suffer the most.² We further remark that, to the best of our knowledge, (2) has not been studied previously. In addition, the formulation of (2) is different from that of classical uplink NOMA systems [20] both in terms of its objective and constraints. In (2b) we constrain the total energy to $E_{\text{th}}$, (2c) ensures that $U_m$ completes its data offloading before its assigned deadline $D_m$ (with $\bar{D}_1 = D_1$, which is not an optimization variable), and in (2d) the transmit power of $U_m$ during any time slot $j$ with $j \in \mathcal{M}$, $m \in \{j, j+1, \ldots, M\}$ is constrained to $P_m$. It is also important to note that the constraint in (2c) does not prevent a user with a longer deadline from finishing its offloading before a user with a shorter one. Specifically, such a scenario will occur when the interference encountered by a higher-indexed user in the time slots of lower-indexed users is small enough that it is able to offload its data even before the lower-indexed users have completed offloading their tasks. If such a scenario does not prevail, the higher-indexed users are allowed to continue transmitting their data until the given task offloading deadline. We remark that, unlike downlink NOMA, in an uplink NOMA based system an explicit SIC constraint is not required, as each user is decoded only once by the BS [21]. To keep the problem formulation tractable, the computation rates of the user devices and the MEC server have not been included as optimization variables. Moreover, in line with the approximation used in existing works [2], [5], [19], [22], we have ignored the processing energy and the time needed to send data back to the users from the MEC server in our problem statement. Including these parameters is a promising direction for future research.
¹ To simplify the exposition, throughout the paper $\bar{D}_1$ is understood as $D_1$, if not mentioned otherwise.
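The role of the three constraints can be made concrete with a small feasibility check. The sketch below is only illustrative: it assumes that the energy constraint sums slot duration times transmit power over all users and all slots in which they are active, and the helper name `is_feasible` and its argument layout are hypothetical.

```python
def is_feasible(P, D_bar, deadlines, E_th, P_max, tol=1e-9):
    """Check a candidate allocation against the energy, deadline and power limits.

    P[(m, j)]      : power of user m in slot j (1-indexed, j <= m)
    D_bar[j-1]     : duration of slot j (D_bar[0] equals the first deadline)
    deadlines[m-1] : hard deadline of user m
    P_max[m-1]     : power budget of user m
    """
    M = len(deadlines)
    pairs = [(m, j) for m in range(1, M + 1) for j in range(1, m + 1)]
    # Powers and durations must be nonnegative.
    if any(P[k] < -tol for k in pairs) or any(d < -tol for d in D_bar):
        return False
    # Total energy: duration times power, summed over all active (user, slot) pairs.
    energy = sum(D_bar[j - 1] * P[(m, j)] for (m, j) in pairs)
    if energy > E_th + tol:
        return False
    # Cumulative slot durations must respect every user's hard deadline.
    elapsed = 0.0
    for m in range(1, M + 1):
        elapsed += D_bar[m - 1]
        if elapsed > deadlines[m - 1] + tol:
            return False
    # Per-slot transmit power must respect each user's budget.
    return all(P[(m, j)] <= P_max[m - 1] + tol for (m, j) in pairs)

P = {(1, 1): 1.0, (2, 1): 1.0, (2, 2): 1.0}
print(is_feasible(P, [1.0, 0.5], [1.0, 1.5], E_th=3.0, P_max=[1.0, 1.0]))
```

In this example the allocation consumes 2.5 units of energy, so it passes with a budget of 3 and would fail with a budget of 2.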

III. PROPOSED SOLUTION
In this section, we propose an efficient solution to problem (2) and prove the convergence of the proposed algorithm. Moreover, for a fair comparison, we also discuss a corresponding optimized OMA-MEC system. Since the problem in (2) is non-convex, we solve it by approximating the non-convex constraints with convex ones. Afterwards, we use the SCA framework to develop an iterative procedure to solve the original problem. One of the novel features of this work lies in how we develop approximating functions that satisfy certain conditions within the SCA framework, as we see below.
For ease of description, we define $\mathbf{p}_{mj} \triangleq [P_{jj}, P_{(j+1)j}, \ldots, P_{mj}]^{T}$ for $m \ge j$, which consists of the power allocation coefficients of $U_j$ to $U_m$ during $\bar{D}_j$, and $\mathbf{d} \triangleq [\bar{D}_2, \bar{D}_3, \ldots, \bar{D}_M]^{T}$. The non-convexity of (2) is due to that of the objective in (2a) and the constraint in (2b). Let us consider an equivalent reformulation of (2) given by
$$\underset{\mathbf{p} \ge 0,\, \mathbf{d} \ge 0,\, \ell}{\text{maximize}} \quad \ell \tag{3a}$$
$$\text{subject to} \quad N_{\text{off},m} \ge \ell, \quad m \in \mathcal{M}, \tag{3b}$$
together with (2b), (2c) and (2d), where $\mathbf{p}$ collects all transmit powers $P_{mj}$. It is clear that the non-convexity of (2) is carried over into (3b) and (2b). Let us focus on (3b) first. In light of the SCA method, to deal with (3b) we need to find a concave lower bound of $N_{\text{off},m}$. In the related literature, a popular method is to introduce some auxiliary variables to make (3b) more tractable, which eventually increases the complexity of the convex subproblems. Instead, to approximate (3b) we present the following theorem.
Proof: See Appendix A.
We also remark that the derivatives of both sides of (4) with respect to each variable involved are identical at the current iterate, which is another requirement for the use of a lower bound in light of the SCA principles. Let where $f_{m,j}(\mathbf{p}_{mj}, \bar{D}_j; \cdot)$ is given by (7). We remark that (6) can be expressed as a few simple second-order cone (SOC) constraints. To this end, auxiliary variables are introduced to obtain a system of linear and quadratic inequalities such that the quadratic inequalities are transformed to SOC constraints using simple manipulations [23]. This is accomplished automatically by a modeling tool such as CVX.
Our next step is to approximate (2b), which requires a convex upper bound of the term $\bar{D}_j P_{mj}$. There are several ways to achieve this. For example, the authors in [23] used the inequality $\bar{D}_j P_{mj} \le \frac{\phi}{2} \bar{D}_j^2 + \frac{1}{2\phi} P_{mj}^2$, which holds for any $\phi > 0$, with equality when $\phi = P_{mj} / \bar{D}_j$. However, in this paper $\bar{D}_j$ and/or $P_{mj}$ can approach zero, which may lead to numerical issues if the above convex upper bound is used. Thus, in this paper we simply use the following convex upper bound:
$$\bar{D}_j P_{mj} \le 0.25 \left(\bar{D}_j + P_{mj}\right)^2 + 0.25 \left(\bar{D}_j^{(n)} - P_{mj}^{(n)}\right)^2 - 0.5 \left(\bar{D}_j^{(n)} - P_{mj}^{(n)}\right) \left(\bar{D}_j - P_{mj}\right) \triangleq q_{mj}\left(P_{mj}, \bar{D}_j; P_{mj}^{(n)}, \bar{D}_j^{(n)}\right). \tag{8}$$
Accordingly, (2b) is approximated by
$$\sum_{m=1}^{M} \sum_{j=1}^{m} q_{mj}\left(P_{mj}, \bar{D}_j; P_{mj}^{(n)}, \bar{D}_j^{(n)}\right) \le E_{\text{th}}. \tag{9}$$
We remark that $q_{mj}(P_{mj}, \bar{D}_j; P_{mj}^{(n)}, \bar{D}_j^{(n)})$ is in fact a convex quadratic function, and thus (9) is an SOC constraint. From the convex approximations of (3b) and (2b) presented above, we arrive at the following second-order cone program (SOCP):
$$\underset{\mathbf{p} \ge 0,\, \mathbf{d} \ge 0,\, \ell}{\text{maximize}} \quad \ell \tag{10a}$$
$$\text{subject to} \quad \text{(2c), (2d), (6), (9)}. \tag{10b}$$
Now we are in a position to present an algorithmic framework to solve our main problem in (2), using the approximation developed in (10). The procedure is outlined in Algorithm 1. We simply set $P_{mj}^{(0)} = 0$ and $\bar{D}_j^{(0)} = 0$ as a starting feasible point. It is insightful to observe that in each run of the proposed algorithm we solve a convex problem whose feasible set is a subset of the feasible set of the original problem in (2). This occurs because we approximate the non-convex function in (5) using a lower bound as given in (6). This not only ensures that the solution in each iteration of Algorithm 1 is feasible for (2), but also results in convergence, as we see in the next subsection.
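The quadratic upper bound on the bilinear term can be checked numerically. The sketch below assumes the bound is obtained by writing $DP = 0.25(D+P)^2 - 0.25(D-P)^2$ and linearizing the concave second term around the current iterate, in which case the gap between the bound and the product equals $0.25\,((D-P) - (D^{(n)}-P^{(n)}))^2$. The names `q_upper` and `max_violation` are hypothetical.

```python
import random

def q_upper(P, D, P_n, D_n):
    """Convex quadratic surrogate for the product D*P around the iterate (P_n, D_n)."""
    a = D_n - P_n  # difference at the current iterate
    return 0.25 * (D + P) ** 2 + 0.25 * a ** 2 - 0.5 * a * (D - P)

def max_violation(trials=1000, seed=0):
    """Largest value of D*P - q_upper over random points (should not be positive)."""
    rng = random.Random(seed)
    worst = float("-inf")
    for _ in range(trials):
        P, D, P_n, D_n = (rng.uniform(0.0, 2.0) for _ in range(4))
        worst = max(worst, D * P - q_upper(P, D, P_n, D_n))
    return worst

print(max_violation())                    # never exceeds zero beyond rounding error
print(q_upper(0.3, 0.7, 0.3, 0.7))        # tight at the iterate: equals 0.7 * 0.3
```

Unlike the bound with the free parameter $\phi$, this surrogate involves no division, so it remains well defined when the iterate is at the all-zero starting point.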

A. Convergence of Algorithm 1
Let the feasible set of (3) be denoted by $\mathcal{F}$, and let $\mathcal{F}^{(n)} \triangleq \{(\mathbf{p}, \mathbf{d}, \ell)$ satisfying the constraints in (10)$\}$ and $\ell^{(n)}$ denote the feasible set and the optimal objective at the $n$-th iteration, respectively. Let the non-convex constraints in (3) be compactly denoted by $\psi_i(\mathbf{p}, \mathbf{d}, \ell)$, and the corresponding approximated convex constraints in (10) by $\Psi_i(\mathbf{p}, \mathbf{d}, \ell; \mathbf{p}^{(n)}, \mathbf{d}^{(n)}, \ell^{(n)})$. Owing to the proposed bounds, the convex constraints used to approximate the non-convex ones satisfy the three properties (11a)-(11c). First of all, we note that $\mathcal{F}^{(n)}$ is a subset of the feasible set of the original problem due to (11a). Further, as a consequence of (11b) and the update rule given in Algorithm 1, the solution of the $n$-th subproblem is feasible for the $(n+1)$-th one, i.e., it belongs to $\mathcal{F}^{(n+1)}$. Hence, the optimal variables corresponding to $\mathcal{F}^{(n+1)}$ yield an objective value in the $(n+1)$-th run of the algorithm that is at least as good, i.e., $\ell^{(n)} \le \ell^{(n+1)}$. It is easy to see that $\mathcal{F}$ is a compact set, so the nondecreasing sequence $\{\ell^{(n)}\}$ is bounded above and thus converges.
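For reference, the three conditions usually required of a convex surrogate in the SCA framework can be written in generic form as below. This is the textbook statement, given under the assumed sign convention that each constraint reads $\psi_i(\mathbf{x}) \le 0$ with $\mathbf{x} \triangleq (\mathbf{p}, \mathbf{d}, \ell)$; it is a sketch of the structure behind the argument above, not a reproduction of the exact expressions used for $f_{m,j}$ and $q_{mj}$.

```latex
\begin{align*}
\psi_i(\mathbf{x}) &\le \Psi_i\bigl(\mathbf{x}; \mathbf{x}^{(n)}\bigr)
  \quad \text{for all feasible } \mathbf{x}
  && \text{(surrogate is a global upper bound)} \\
\psi_i\bigl(\mathbf{x}^{(n)}\bigr) &= \Psi_i\bigl(\mathbf{x}^{(n)}; \mathbf{x}^{(n)}\bigr)
  && \text{(tight at the current iterate)} \\
\nabla \psi_i\bigl(\mathbf{x}^{(n)}\bigr) &= \nabla \Psi_i\bigl(\mathbf{x}^{(n)}; \mathbf{x}^{(n)}\bigr)
  && \text{(matching gradients at the iterate)}
\end{align*}
```

The first condition gives feasibility of every iterate for the original problem, the second gives the monotonicity $\ell^{(n)} \le \ell^{(n+1)}$, and the third is what links the KKT conditions of the subproblems to those of the original problem in the limit.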
We also remark that Slater's condition holds for the subproblem in all iterations. This can be shown by choosing sufficiently small $(\mathbf{p}, \mathbf{d}, \ell)$. Thus, the Karush-Kuhn-Tucker (KKT) conditions must be satisfied for (10) in all iterations. Let $(\mathbf{p}^*, \mathbf{d}^*, \ell^*)$ be the limit of the sequence $(\mathbf{p}^{(n)}, \mathbf{d}^{(n)}, \ell^{(n)})$.³ Without loss of generality we assume that p The same arguments also apply to the convex constraints in (2). Thus, if $(\mathbf{p}^*, \mathbf{d}^*, \ell^*)$ is a regular point of (3), then we can show that $(\mathbf{p}^*, \mathbf{d}^*, \ell^*)$ is also a KKT point of (3). We skip the details here for the sake of brevity.

B. Complexity of Algorithm 1
As shown in Section IV, Algorithm 1 converges in a few iterations, and in each iteration the SOCP in (10) needs to be solved. The worst-case complexity of generic interior-point methods for an SOCP depends on the number of optimization variables, the number of SOC constraints and the dimension of the SOC constraints [25]. In our case, (10) has $(M^2 + 2M)$ optimization variables and $O(M^2)$ SOC constraints, and the total dimension of the SOC constraints⁴ is $O(M^2)$. Using the complexity estimate of [25], the accuracy of the duality gap can be improved by a constant factor in $O((M^2)^{1/2})$ iterations, and the time complexity of each iteration is $O(M^4 \times M^2)$. This leads to an overall complexity of $O(M^7)$. This complexity bound is obtained by simply viewing each linear inequality constraint of the form $\mathbf{c}^T \mathbf{x} + d \ge 0$ as a special case of the SOC constraint $\mathbf{c}^T \mathbf{x} + d \ge \|\mathbf{A}\mathbf{x} + \mathbf{b}\|_2$, where $\mathbf{A} = \mathbf{0}$ and $\mathbf{b} = \mathbf{0}$. Consequently, the problem data of the resulting SOCP contains many zeros. Modern off-the-shelf conic solvers can efficiently exploit this sparsity, and thus the practical run time for solving (10) is much less than the worst-case time complexity suggests. A further reduction in computational complexity could be obtained by using first-order optimization methods to solve the given problem, which is left for future study.
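The arithmetic behind the overall bound is simply the product of the iteration count and the per-iteration cost quoted above, which can be spelled out as follows.

```latex
\[
\underbrace{O\bigl((M^{2})^{1/2}\bigr)}_{\text{interior-point iterations}}
\times
\underbrace{O\bigl(M^{4} \times M^{2}\bigr)}_{\text{cost per iteration}}
= O(M) \times O(M^{6})
= O(M^{7}).
\]
```

Doubling the number of users therefore raises this worst-case estimate by a factor of $2^7 = 128$, although, as noted above, sparsity typically makes the observed run time grow far more slowly.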

C. A corresponding OMA-MEC system
In order to showcase the advantages of the NOMA-based system, in this subsection we present a corresponding baseline OMA-based system. Following the two-user (pure) OMA system in [5], we consider a multi-user OMA-MEC system, where a dedicated time slot is allocated to every user. Note that we reuse the notation from the NOMA-MEC system in this subsection when appropriate. Specifically, $U_m$ offloads its data to the MEC server only during $\bar{D}_m$ such that $\sum_{j=1}^{m} \bar{D}_j \le D_m$. The data offloaded (in nats) by $U_m$ to the server is given by $\bar{N}_{\text{off},m} = B \bar{D}_m \ln(1 + g_m \bar{P}_m)$, where $\bar{P}_m$ is the power transmitted by $U_m$ during $\bar{D}_m$. The problem of maximizing the minimum number of offloaded nats among all users in the OMA-MEC system can be easily formulated but is omitted here due to space constraints. Similar to the case of NOMA, we can derive an SCA-based method to solve the resulting problem. Explicitly, in iteration $n + 1$, an SOCP can be formulated as follows: where $\bar{\mathbf{p}} \triangleq [\bar{P}_1, \bar{P}_2, \ldots, \bar{P}_M]^{T}$, $\bar{\mathbf{d}} \triangleq [\bar{D}_1, \bar{D}_2, \ldots, \bar{D}_M]^{T}$, and $f_m(\bar{P}_m, \bar{D}_m; \cdot)$ and $q_m(\bar{P}_m, \bar{D}_m; \cdot)$ are the OMA counterparts of the bounding functions used for the NOMA-MEC system. Although a dedicated interference-free time slot is allocated to each user in the OMA system to offload data to the MEC server, the combination of power-controlled data offloading by users in previous time slots and low interference at delay-sensitive users due to SIC makes NOMA-MEC perform better for MEC purposes. In the following section, we perform various numerical experiments that corroborate this observation.
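The OMA baseline expression is simple enough to evaluate directly. The following sketch computes the per-user offloaded data and the max-min objective value for a given OMA allocation, using the interference-free rate $B \bar{D}_m \ln(1 + g_m \bar{P}_m)$ stated above; the function names `oma_offloaded_nats` and `oma_min_offloaded` are hypothetical, and no optimization is performed.

```python
import math

def oma_offloaded_nats(g, P_bar, D_bar, B=1.0):
    """Per-user offloaded data (in nats) when each user has a dedicated slot.

    g[m]     : normalized channel gain of user m+1 (unit noise power assumed)
    P_bar[m] : transmit power of user m+1 in its own slot
    D_bar[m] : duration of the slot dedicated to user m+1
    """
    return [B * d * math.log(1.0 + gm * pm) for gm, pm, d in zip(g, P_bar, D_bar)]

def oma_min_offloaded(g, P_bar, D_bar, B=1.0):
    """Objective of the max-min problem evaluated at a given OMA allocation."""
    return min(oma_offloaded_nats(g, P_bar, D_bar, B))

# The second user has a shorter slot but a stronger channel.
print(oma_offloaded_nats([1.0, 3.0], [1.0, 1.0], [1.0, 0.5]))
print(oma_min_offloaded([1.0, 3.0], [1.0, 1.0], [1.0, 0.5]))
```

In this example both users offload the same amount, since halving the slot duration is exactly compensated by squaring the argument of the logarithm.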

IV. NUMERICAL RESULTS
In this section we conduct numerical experiments to assess the performance of our algorithm in offloading computational tasks to the MEC server. Here, $g_m$ is modeled as Note that, for simplicity, we have considered a constant difference between the users' deadlines; however, the proposed SCA-based algorithm remains applicable when the differences between the deadlines are random. Also, throughout this section, we assume that all of the users have the same transmit power budget, $P_t$, i.e., $P_m = P_t$, $\forall m \in \mathcal{M}$. We execute the proposed algorithm using CVXPY with MOSEK as the internal solver on a Linux PC with 7.5 GiB of memory and an Intel Core i5-7200U CPU. Note that in this section we use the logarithm with base 2, and thus the offloaded tasks are expressed in bits. For all of the figures in this section, the average values of offloaded bits are computed over the same set of 100 independent channel realizations. Moreover, in Figs. 2, 3 and 5, we consider $M = 4$.
In Fig. 2, where the numbers in the parentheses denote ($E_{\text{th}}$ in mJ, $P_t$ in dBm), we show the convergence behavior of the proposed algorithm for the NOMA-MEC system. We also show the convergence of the corresponding OMA-MEC system for comparison. It can be noted from the figure that both the NOMA and OMA systems converge within a small number of iterations. Moreover, for the given system parameters, the NOMA-based system results in a much larger number of offloaded bits within the same number of iterations, compared to the OMA-based counterpart. Also, in Table I, we show the average problem solving time for the NOMA-MEC system for $E_{\text{th}} = 5$ mJ, $P_t = 5$ dBm and different numbers of users, with the number of iterations fixed at 20.
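Since the budgets in this section are quoted in dBm and mJ while the results are reported in bits, the unit conversions involved can be sketched as follows. The helper names are hypothetical, and the snippet only illustrates standard conversions; it is not part of the simulation setup itself.

```python
import math

def dbm_to_watts(p_dbm):
    """Convert a power level from dBm to watts (0 dBm corresponds to 1 mW)."""
    return 10.0 ** ((p_dbm - 30.0) / 10.0)

def mj_to_joules(e_mj):
    """Convert an energy budget from millijoules to joules."""
    return e_mj * 1e-3

def nats_to_bits(n_nats):
    """Convert an amount of data from nats (natural log) to bits (base-2 log)."""
    return n_nats / math.log(2.0)

# Example values quoted for Table I: P_t = 5 dBm and E_th = 5 mJ.
print(dbm_to_watts(5.0), mj_to_joules(5.0), nats_to_bits(1.0))
```

Using the base-2 logarithm in the rate expression is equivalent to dividing the value in nats by $\ln 2$, which is what the last helper does.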
In Fig. 3 we report the average minimum number of offloaded bits as a function of the energy threshold, $E_{\text{th}}$, for a given $P_t$. From the figure, the superiority of the NOMA-based system over its OMA-based counterpart is clearly evident. More interestingly, for the NOMA-based system, the objective first increases for small values of $E_{\text{th}}$ (which we refer to as the energy-constrained regime), and then saturates for large values of $E_{\text{th}}$ (which we refer to as the power-constrained regime). It is noteworthy that for two different NOMA systems with the same $E_{\text{th}}$ but different $P_t$, the performance in the energy-constrained regime remains the same. It can also be noted from the figure that, compared to NOMA, the effect of easing the energy constraint is less pronounced in the OMA-MEC system.
To investigate the impact of the number of users, in Fig. 4 we plot the variation in the average minimum number of offloaded bits against the number of users. Note that in the figure legend, the numbers in the parentheses denote ($E_{\text{th}}$ in mJ, $P_t$ in dBm). It can be inferred from the figure that as the number of users, $M$, increases, the performance of both the NOMA-based and OMA-based systems deteriorates. It is also evident that with an increase in the number of users, the performance gap between the NOMA-MEC and OMA-MEC systems decreases. Note that for the NOMA-MEC system, as $M$ increases, the inter-user interference for the higher-indexed users also increases, reducing the achievable rate for such users.
To investigate the impact of the users' delay requirements, in Fig. 5 we plot the variation of the number of offloaded bits with $\Delta$. Note that in the legend of the figure, the numbers in the parentheses denote ($E_{\text{th}}$ in mJ, $P_t$ in dBm, $D_1$ in s). Remarkably, the performance of both the NOMA-MEC and OMA-MEC systems increases linearly with $\Delta$. The poor performance of OMA is due to the fact that $U_m$ has to offload all its data during the $m$-th time slot only. Under the same circumstances, NOMA is able to offload more data due to the availability of additional time slots. Thus, NOMA-based MEC systems can be considered a suitable candidate to achieve the gains associated with edge servers under stringent user delay constraints.

V. CONCLUSION
In this correspondence, we have considered using NOMA to offload computational tasks for edge computing in a multi-user scenario. For the system model under consideration, we have formulated a non-convex optimization problem to maximize the minimum number of offloaded bits among the users, constrained by delay, transmit power and energy requirements. The original non-convex problem is solved using a successive convex optimization strategy that leads to a provably convergent algorithm, each iteration of which can be solved efficiently by SOCP solvers. The numerical results have shown that in NOMA systems with a given power budget, the minimum number of offloaded bits first increases with the energy budget in the energy-constrained regime, and then saturates in the power-constrained regime. It has also been confirmed that as the number of users increases, the performance gap between the NOMA and OMA systems decreases. Our results have shown that for strict delay constraints, NOMA substantially outperforms the OMA-based approach in terms of maximizing the tasks to be offloaded to the MEC server.

Fig. 1. Timing diagram of data offloading in a three-user NOMA-MEC system. The durations shown by $D_1$, $D_2$ and $D_3$ denote the hard deadlines, and those shown by $\bar{D}_2$ and $\bar{D}_3$ denote the additional times.

Fig. 3. Average minimum number of offloaded bits versus E th .

Algorithm 1: Offloading rate maximization algorithm