Two-Level Game Based Spectrum Allocation Scheme for Multi-Flow Carrier Aggregation Technique



I. INTRODUCTION
In recent years, heterogeneous networks (HetNets) have been proposed for 5G wireless communication systems. A HetNet infrastructure consists of conventional macrocells for ubiquitous coverage and different types of overlaid small cells to provide seamless coverage. These small cells have highly heterogeneous characteristics and reside in both licensed and unlicensed spectrum, serving hybrid mobile devices (MDs) that support multiple radio access technologies (RATs). With service available from the HetNet system and the option of using diverse RATs, each mobile device has a choice. Based on the quality of service (QoS) and the price charged, MDs can choose the most suitable cell operator (CO) on the fly. Therefore, heterogeneous network COs compete with each other over the price and QoS offered to attract MDs [1]-[4].
The associate editor coordinating the review of this manuscript and approving it for publication was Kezhi Wang.
In future 5G cellular systems, MDs located at the cell edge can aggregate spectrum bands from different COs simultaneously to boost their data rates. This technique, namely multi-flow carrier aggregation (MCA), is an extension of carrier aggregation (CA). CA was originally introduced in the LTE-Advanced system to aggregate multiple spectra into a virtual carrier for high-throughput transmissions. MDs can increase their peak data rates by transmitting through the virtual carrier, which is aggregated from narrowband ranges of spectrum resource. Hence, the CA technique is considered a practical solution to the LTE spectrum fragmentation problem while offering significant flexibility for efficient spectrum utilization. MCA extends CA to general multi-tier HetNets, which essentially involves multiband modeling and analysis. To achieve the benefits of MCA, it is critical to assign different COs' spectrum resources to each MD, so that each device's application can be served by multiple COs simultaneously [5], [6]. Although some control protocols exist, research on the MCA system is still in its infancy, and several technical issues and challenges remain to be addressed. For the synergistic use of multiple COs' aggregation, a new control paradigm is necessary. In this study, we adopt the basic concept of game theory for the future MCA system [7]. To maximize the performance of the multiuser MCA system, we develop a new joint spectrum allocation algorithm based on game theory.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
A. TECHNICAL CONCEPT OF GAME MODELS
Within game theory, the Stackelberg game is a non-cooperative game model with two kinds of players: a leader and followers. They are forced to act according to their hierarchy level. The leader acts first, and multiple followers react to the leader's decision while attempting to maximize their satisfaction. However, complicated interactive situations, such as the MCA system, require the modeling and analysis of a more complex Stackelberg game with multiple leaders [8]. Generally, the MCA system operation can be formulated with multiple COs and MDs. As leaders, multiple COs set their service prices by considering the possible reactions of MDs. As followers, multiple MDs decide the spectrum amount for their multiple flows based on the leaders' price strategies. Therefore, control decisions are coupled with one another; the result of each player's behavior may affect the behaviors of other players. In this paper, we formulate this situation as a multiple-leaders multiple-followers (MLMF) Stackelberg game.
Within game theory, the efficient max-min game is a cooperative bargaining game model based on an extension of the Kalai-Smorodinsky solution. Conceptually, bargaining problems resemble decision problems in the sense that game players wish to enter into contracts in order to generate a surplus, which must then be divided fairly among the players. Therefore, a bargaining solution enables players to determine their payoffs fairly and optimally in order to reach joint agreements. The main idea of the efficient max-min bargaining solution (EMBS) is a tradeoff between selecting a payoff proportional to the ideal point and Pareto optimality. It may therefore be briefly described as a generalization of the Kalai-Smorodinsky solution to nonconvex n-player bargaining games, and it provides a unique solution [9]. In this study, the main concept of EMBS is applied to implement the follower's reaction, which is the spectrum allocation process for its multiple flows.

B. MAIN CONTRIBUTIONS
In this study, we develop a novel spectrum allocation scheme for the MCA system. To effectively share the spectrum resources of multiple COs, we design a two-level game model. At the upper level, the MLMF Stackelberg game model is adopted to decide the service price of each CO. At the lower level, the EMBS is used to aggregate multiple flows in each MD while allocating the COs' resources. Owing to the hierarchical interconnection of the upper and lower games, the control decisions of different types of game players cause cascading interactions: the COs' price decisions affect the behaviors of MDs, and each MD's resource allocation for the MCA service is fed back as input to the COs. In an online interactive manner, our fine-grained two-level game procedure is repeated to reach a fair-efficient solution, and it leverages the synergy of non-cooperative and cooperative games while comprehensively addressing several control issues. In detail, the major contributions of this study are as follows:
• This study considers the MCA technology for the future network system. To develop the MCA system control scheme, our main issues are i) to decide each CO's spectrum price, and ii) to allocate the different COs' spectrum resources to each user's multiple flows.
• We apply game theory to the above control issues and develop a novel two-level game model. The upper-level and lower-level games are developed as different types of game models, and they work together interactively to reach an agreement that gives mutual advantages to the game players.
• The upper-level game is formulated as the MLMF Stackelberg game model, where COs are leaders and MDs are followers. To effectively adjust the spectrum price, the decisions of leaders are made in a non-cooperative manner, but they have to take into account the decisions made by others to reach a compromise consensus.
• The lower-level game is designed as a bargaining game model. In each individual MD, traffic flows are game players, and they bargain with each other over the COs' different spectrum resources based on the idea of EMBS.
• We present simulation results for the performance of our MCA algorithm and compare it with the performance of the existing state-of-the-art MCA protocols. Using numerical analysis, we validate the superiority of our two-level game approach in terms of user's payoff, MCA system throughput, and fairness among COs.

II. RELATED WORK
Several papers have addressed MCA system operations that utilize the limited spectrum resource. Most algorithms model the interaction between multiple COs and MDs. The paper [16] formulates the distributed power and subcarrier allocation problem in self-organizing small cell networks as an evolutionary game. The strategy adaptation process is modeled by replicator dynamics, and the evolutionary equilibrium is obtained as the solution. In addition, a stochastic geometry-based approach is used to analyze the average achievable SINR and average achievable rate, and a new distributed algorithm is proposed to reach the evolutionary equilibrium. Finally, the stability of the equilibrium is analyzed [16]. The scheme in [17] formulates a Stackelberg game to jointly maximize the system revenue and the individual utilities of different users for the proposed price-based resource allocation. Specifically, two pricing algorithms are proposed: non-uniform pricing, in which different interference-power prices are assigned to different users, and uniform pricing, in which a uniform price applies to all users. In addition, the proposed algorithms can be implemented with low complexity and require minimal information exchange, which makes them useful in spectrum-sharing networks [17].
In [1], P. Yuan et al. proposed the Hierarchical game based Cooperative Carrier Aggregation (HCCA) scheme using a game-theoretic framework. They focus on the CA-enabled HetNet infrastructure in which each CO can allocate multiple spectrum bands to each MD. A hierarchical game model is formulated to solve the resource allocation problem under the constraints of the maximum transmit power and the maximum tolerable interference level. First, a Stackelberg game is established for the COs to regulate the power levels of the MDs while optimizing their payoffs. Second, a coalition is formed by the MDs that can access the same spectrum resource; a simple distributed algorithm is proposed for MDs to autonomously form an optimal coalition structure under different prices. Finally, they show that the HCCA scheme can significantly improve the performance of the HetNet system [1].
The Efficient Multi-stream Carrier Aggregation (EMCA) scheme is an MCA control scheme that maximizes the system capacity while accounting for energy efficiency [2]. The main focus of this work is an effective method of exploiting the MCA technique to improve the energy efficiency in a multi-layer HetNet architecture. COs are typically interested not only in minimizing the energy consumption, but also in maximizing the MCA system capacity. To satisfy these goals, a multi-objective optimization problem is jointly designed, and a solution is provided according to the priority assigned by the COs to each objective. Based on a clear and simple cell-association policy, the EMCA scheme obtains a balanced performance between the conflicting objectives. Finally, a performance evaluation characterizes the tradeoffs between energy minimization and capacity maximization in a multi-layer HetNet infrastructure [2].
In [10], the authors propose the Robust Multi-carrier Resource Allocation (RMRA) scheme for MDs that run real-time and delay-tolerant applications under a utility proportional fairness allocation policy. To effectively allocate multiple spectrum resources among devices, they introduce a robust resource allocation algorithm over multiple carriers based on an MCA scenario. During the resource allocation process, logarithmic and sigmoidal-like utility functions are used for both high-traffic and low-traffic situations. The RMRA scheme ensures fairness in the utility percentage achieved by the allocated resources for all users while providing the minimum price for the allocated rate. Simulation results show that the RMRA scheme can greatly enhance the MCA system performance and cope with traffic fluctuation under different network traffic densities [10].
The HCCA, EMCA and RMRA schemes have addressed unique challenges in efficiently solving the spectrum allocation problem in the MCA system platform. Recently, they have attracted considerable attention due to their various advantages. One of the key differences between our proposed scheme and the previously reported protocols is the control paradigm. In this study, we adopt a new two-level game model that combines collaboration and competition: different game players compete with each other in the upper-level game and cooperate in the lower-level game. Therefore, our approach can reach a fair-efficient spectrum allocation solution while achieving reciprocal advantages. Compared to the existing HCCA, EMCA and RMRA schemes [1], [2], [10], we demonstrate that our proposed approach attains better performance during the MCA system operations.

III. THE SPECTRUM ALLOCATION SCHEME IN THE MCA PLATFORM
A. THE MCA SYSTEM INFRASTRUCTURE AND OPERATING ASSUMPTIONS
We consider a multi-tier cellular network platform with $\mathcal{K} = \{1, \ldots, K\}$ denoting the set of $K$ tiers, which may have different coverage areas. The small cells in a lower tier are regularly deployed within the large cell in a higher tier. COs of different cell types may differ in coverage size, spectrum resource, access technology and service price. Without loss of generality, one CO in the $k$-th tier has a coverage area of radius $r_k$ and a static portion of the spectrum resource to provide the basic services to the users in its area. There are $n$ MDs, $\mathcal{N} = \{D_1, \ldots, D_n\}$, and each $D_i$, $1 \le i \le n$, is assumed to be capable of aggregating multiple traffic flows over several COs. MDs are randomly distributed in the cellular area, and they are heterogeneous in their experienced carrier qualities, carrier aggregation capabilities, and QoS requirements. Therefore, each device's application service can be represented by multiple subcarriers.
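The deployment described above can be illustrated with a small sketch. It is only an illustration under stated assumptions: the tier sizes, radii and area are borrowed from the simulation setup in Section IV, while the square-grid placement and the helper names (`grid_positions`, `build_operators`, `contactable`) are hypothetical choices made here, not part of the proposed scheme.

```python
import math
import random

AREA_KM = 10.0  # side of the square service area (value taken from Section IV)
TIERS = {       # tier name -> (number of COs, coverage radius in km), from Section IV
    "macro": (4, 3.5),
    "micro": (16, 2.0),
    "femto": (64, 1.5),
}


def grid_positions(count, side=AREA_KM):
    """Place `count` COs at the cell centers of a square grid (assumed layout)."""
    per_side = math.isqrt(count)
    if per_side * per_side != count:
        raise ValueError("count must be a perfect square for a regular grid")
    step = side / per_side
    return [((ix + 0.5) * step, (iy + 0.5) * step)
            for ix in range(per_side) for iy in range(per_side)]


def build_operators():
    """Return (tier, index, (x, y), radius) tuples for every CO."""
    operators = []
    for tier, (count, radius) in TIERS.items():
        for idx, pos in enumerate(grid_positions(count)):
            operators.append((tier, idx, pos, radius))
    return operators


def contactable(md_pos, operators):
    """COs whose coverage disc contains the MD position."""
    return [op for op in operators if math.dist(md_pos, op[2]) <= op[3]]


if __name__ == "__main__":
    rng = random.Random(0)
    ops = build_operators()
    mds = [(rng.uniform(0, AREA_KM), rng.uniform(0, AREA_KM)) for _ in range(100)]
    counts = [len(contactable(md, ops)) for md in mds]
    print("COs:", len(ops), "contactable per MD (min, max):", min(counts), max(counts))
```

Under this assumed grid, an MD at the exact center of the area lies just outside all four macro-cell discs, which shows why the set of contactable COs has to be computed per device rather than assumed.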
To satisfy the different goals of COs and MDs, we develop a two-level game model. In the upper-level game, the COs' prices are adjusted based on the Stackelberg game model. In the lower-level game, each MD divides its application into multiple flows and allocates the available COs' spectrum resources according to the cooperative bargaining game. Our two-level game model captures the coexistence of competition and cooperation in the MCA system operation, and it adapts to the current network environment while satisfying multi-objective goals. Formally, we define the game model $G_{MCA} = \langle \mathcal{O}, F_{D_{1 \le i \le n}}, S_{CO \in \mathcal{O}}, S_{D \in \mathcal{N}}, \mathcal{P}_{CO}, U_{CO}, U_D, M, T \rangle$ at each time period of gameplay, and Table 1 lists the notations used in this paper.
Usually, a price control mechanism is designed based on the elastic-demand paradigm: customers adapt their service requests according to the service price ($P_s$). This paradigm is relevant to the MCA system operation. Fig. 1 depicts the elastic-demand paradigm graphically using a demand curve ($C_d$) and a performance curve ($C_p$). $C_d$ shows the aggregated spectrum request of users ($R_u$), and $C_p$ describes the relationship between $R_u$ and $P_s$. Traditionally, $C_d$ is expected to be monotonically decreasing in the generalized $P_s$, while $C_p$ is monotonically increasing in the generalized $R_u$. Under the fixed spectrum capacity ($F_c$), the equilibrium point is determined by combining $C_d$ and $C_p$. In Fig. 1, let us assume that $C_p^3$ and $C_d$ are selected. When the current price is $P_s^1$ (or $P_s^2$), the associated request amount is $R_u^1$ (or $R_u^2$). To move toward the network equilibrium point ($E^*$), the current price should increase (or decrease) by $\Delta P_s^1$ (or $\Delta P_s^2$) [11], [12].
If the total spectrum amount is $R_u^3$ under $C_p^2$, the available spectrum resource is not enough to support all requests, and traffic congestion occurs. The excess requests ($\Delta R$) must be rejected to meet $F_c$. To deal with this traffic congestion problem, an extra charge ($\Delta P_s^e$) is used to reduce the potential spectrum demand, and the demand-supply balance is finally obtained. Under the assumption that $C_p^3$ and $C_d$ are selected, the collective user satisfaction ($S_C$) is maximized at the $E^*$ state. Therefore, each CO attempts to reach $E^*$ by adjusting $P_s$. The maximization of $S_C$ is described by the mathematical programming formulation in Equation (1) [11], [12], where $R_u^A$ is the total amount of all users' requested spectrum resources. Based on the adjustment of the price policy $P_s^i$, $C_d$ is dynamically decided by MDs. In Fig. 1, the solution of Equation (1) can be viewed pictorially as a maximization of the area between $C_d$ and $C_p$, that is, the area under $C_d$ minus the area under $C_p$ up to $E^*$ [11].
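The elastic-demand reasoning can be made concrete with a short numerical sketch. It assumes that both curves are available as functions of the aggregate request, that the demand curve is decreasing and the performance curve is increasing, and that the extra charge under congestion equals the gap between the two curves at capacity. The function names and the two linear example curves are hypothetical and are not taken from the paper.

```python
def equilibrium(demand_price, performance_price, capacity, tol=1e-9):
    """Locate the request level where the two curves meet, capped at capacity.

    demand_price(r): price users accept at aggregate request r (decreasing in r).
    performance_price(r): price implied by the performance curve at r (increasing in r).
    Returns (request, price, extra_charge).
    """
    def gap(r):
        return demand_price(r) - performance_price(r)

    if gap(0.0) <= 0.0:
        # No request is worthwhile at any positive level.
        return 0.0, performance_price(0.0), 0.0
    if gap(capacity) >= 0.0:
        # The curves would cross beyond the capacity: congestion.
        # An extra charge equal to the gap holds demand at the capacity.
        return capacity, demand_price(capacity), gap(capacity)

    low, high = 0.0, capacity
    while high - low > tol:          # bisection on the sign of the gap
        mid = 0.5 * (low + high)
        if gap(mid) > 0.0:
            low = mid
        else:
            high = mid
    request = 0.5 * (low + high)
    return request, performance_price(request), 0.0


def demand(r):
    """Hypothetical linear demand curve."""
    return 10.0 - r


def performance(r):
    """Hypothetical linear performance curve."""
    return 2.0 + r


if __name__ == "__main__":
    print(equilibrium(demand, performance, capacity=10.0))  # uncongested case
    print(equilibrium(demand, performance, capacity=3.0))   # congested case
```

With ample capacity the price settles where the curves intersect; with a capacity below the intersection, the sketch returns the capacity as the served request together with a positive extra charge.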
In 1974, J. M. Smith introduced the fundamental concept of an evolutionary learning game model; it was developed in the biological sciences to explain the evolution of genetically determined social behavior. In the jargon of this idea, the rate of change of the players' selection is defined as Replicator Dynamics (RD), which describes the evolution of the proportion of each strategy toward an equilibrium [13], [14]. In this study, we design a CO's price control algorithm by practically applying the RD approach. When a CO chooses a price strategy, it changes the current MCA system environment and triggers reactions by other COs and MDs. As further changes are made, the selection probability of a specific strategy evolves at a rate equal to the difference between the payoff of that strategy and the average payoff. During MCA operations, COs iteratively change their current price strategies and repeatedly interact with others. For example, if the payoff of the $i$-th price strategy $P_i^{CO}$ is higher than that of the other strategies, the selection probability of the $i$-th strategy, $\eta_{P_i^{CO}}$, increases in proportion to the expected payoff increment. Therefore, a desirable strategy that improves the player's payoff is more likely to be selected. This interaction mechanism gradually leads the MCA system into a stable state, in which no individual CO can improve its payoff by unilaterally changing its strategy and the strategy distribution is immune to further change. This is analogous to the Darwinian evolution mechanism [14].
To represent the RD for the price control problem, let $L$ be the number of possible price strategies, i.e., $L = |S_{CO}|$, and let $\mathcal{P}_{CO}$ be the $L$-dimensional vector $(\eta_{P_1^{CO}}, \ldots, \eta_{P_i^{CO}}, \ldots, \eta_{P_L^{CO}})$. $\dot{\eta}_{P_i^{CO}}$ stands for the variation of $\eta_{P_i^{CO}}$; it is the RD for $P_i^{CO}$. $E_p(P_i^{CO}, P_l^{CO})$ denotes the expected payoff for a player using $P_i^{CO}$ when it encounters a player using $P_l^{CO}$, and $E_p(P_i^{CO}, \mathcal{P}_{CO})$ is the payoff for a CO using $P_i^{CO}$ when it encounters the other COs, whose strategies are distributed according to $\mathcal{P}_{CO}$. Finally, the RD is defined as in Equation (2) [14].
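As an illustration of the verbal description of the RD, the following sketch applies the textbook discrete-time replicator update, in which each selection probability grows in proportion to the gap between its expected payoff and the population average. This is an assumed standard form and not necessarily the exact expression used in this paper; the payoff matrix, the step size and the function name `replicator_step` are hypothetical.

```python
def replicator_step(eta, payoff, step=0.1):
    """One Euler step of the standard replicator dynamics.

    eta: current selection probabilities over the price strategies (sums to 1).
    payoff[i][l]: payoff of strategy i against strategy l (hypothetical values).
    """
    n = len(eta)
    # Expected payoff of each strategy against the current strategy mix.
    expected = [sum(payoff[i][l] * eta[l] for l in range(n)) for i in range(n)]
    # Population-average payoff.
    average = sum(eta[i] * expected[i] for i in range(n))
    # Growth proportional to (own payoff - average payoff), clipped at zero.
    raw = [max(0.0, eta[i] + step * eta[i] * (expected[i] - average))
           for i in range(n)]
    total = sum(raw)
    return [value / total for value in raw]


if __name__ == "__main__":
    # Hypothetical payoffs: the second strategy always earns the most.
    payoff = [[0.2, 0.2, 0.2],
              [0.5, 0.5, 0.5],
              [0.3, 0.3, 0.3]]
    eta = [1.0 / 3.0] * 3
    for _ in range(1000):
        eta = replicator_step(eta, payoff)
    print([round(value, 4) for value in eta])
```

In this toy case the probability mass concentrates on the strategy with the highest payoff, which matches the statement that a more profitable strategy becomes more likely to be selected.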

C. THE CONCEPT OF EFFICIENT MAX-MIN BARGAINING SOLUTION
To define the basic idea of the bargaining solution, we first introduce the notation and preliminary definitions. Let $\mathcal{U}$ be the set of $n$-player bargaining problems $(a, S)$, where $a \in \mathbb{R}^n$ denotes the threat point and $S$ is a feasible set. If the players fail to reach some other outcome $x = \{x_1, \ldots, x_n\} \in S$, the disagreement result is the threat point. Let $\mathbb{R}$ ($\mathbb{R}_+$, $\mathbb{R}_{++}$) denote the set of all (non-negative, positive) real numbers, and let $\mathbb{R}^n$ ($\mathbb{R}^n_+$, $\mathbb{R}^n_{++}$) be the $n$-fold Cartesian product of $\mathbb{R}$ ($\mathbb{R}_+$, $\mathbb{R}_{++}$). $S \subset \mathbb{R}^n$ denotes the set of feasible payoffs, which satisfies i) $S$ is compact, and ii) $S \cap (\{a\} + \mathbb{R}^n_{++})$ is non-empty. For an $n$-player bargaining problem $(a, S)$, $M(S)$ denotes the ideal point, defined as in [9], where $J_j(x)$ is the projection on the $j$-th coordinate of $x$, and $M_j(S)$ is the maximal value of the $j$-th coordinate. With $f(a, S) \in S$, let $f : \mathcal{U} \to \mathbb{R}^n$ be a solution to the bargaining problem $(a, S)$, and let $P(S)$ be the set of strongly Pareto optimal payoffs in $S$. Moreover, let $D(a, S)$ be the diagonal of $(a, S)$, and let $M_d(a, S)$ be the maximal payoff on the diagonal [9]. Using the above terminology, we can define the generalized Kalai-Smorodinsky solution, $G_{KS}(a, S)$, as in [9]. It can be interpreted as the maximal payoff in the intersection of the comprehensive hull of $S$ and the diagonal $D(a, S)$. Finally, a bargaining solution $f^e_{Mm} : (a, S) \to \mathbb{R}^n$ with $f^e_{Mm}(a, S) \in S$ is an EMBS if and only if it satisfies the condition given in [9]. The main property of the EMBS is that it is no worse than the Kalai-Smorodinsky solution in the sense that no coordinate is smaller. Specifically, the EMBS satisfies Pareto Optimality (P), Restricted Symmetry (RS), Restricted Affine Invariance (RAI) and Comprehensive Monotonicity (CM) [9]:
• P: $f^e_{Mm}(a, S) \in P(S)$ for $(a, S)$.
• RS: $\pi(f^e_{Mm}(a, S)) = f^e_{Mm}(\pi(a), \pi(S))$ for $(a, S)$ and permutations $\pi$ of $N = \{1, \ldots, n\}$.
• RAI: $Q(f^e_{Mm}(a, S)) = f^e_{Mm}(Q(a), Q(S))$ for all strictly increasing affine maps $Q : \mathbb{R}^n \to \mathbb{R}^n$ with $Q_j(x) = \gamma_j \times x_j + \delta_j$ for $(a, S)$.
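The verbal description of the EMBS can be explored on a small finite payoff set. The sketch below is a reading of that description under explicit assumptions, not the exact conditions of [9]: the ideal point is taken over individually rational payoffs, the generalized Kalai-Smorodinsky point is the farthest point on the diagonal that some feasible payoff weakly dominates, and an EMBS candidate is any strongly Pareto optimal payoff with no coordinate below that point. All function names and the example set are hypothetical.

```python
def ideal_point(a, feasible):
    """Coordinate-wise maximum over payoffs that are no worse than the threat point."""
    rational = [x for x in feasible if all(xj >= aj for xj, aj in zip(x, a))]
    return [max(x[j] for x in rational) for j in range(len(a))]


def generalized_ks(a, feasible):
    """Farthest diagonal point a + t(M - a) weakly dominated by a feasible payoff.

    Assumes every ideal coordinate strictly exceeds the threat point.
    """
    m = ideal_point(a, feasible)

    def ratio(x):
        return min((xj - aj) / (mj - aj) for xj, aj, mj in zip(x, a, m))

    t = max(ratio(x) for x in feasible)
    return [aj + t * (mj - aj) for aj, mj in zip(a, m)]


def dominates(x, y):
    """True if x is at least as good as y everywhere and strictly better somewhere."""
    return (all(xi >= yi for xi, yi in zip(x, y))
            and any(xi > yi for xi, yi in zip(x, y)))


def strongly_pareto(feasible):
    return [x for x in feasible if not any(dominates(y, x) for y in feasible)]


def embs_candidates(a, feasible, tol=1e-12):
    """Pareto optimal payoffs with no coordinate below the generalized KS point."""
    g = generalized_ks(a, feasible)
    return [x for x in strongly_pareto(feasible)
            if all(xj >= gj - tol for xj, gj in zip(x, g))]


if __name__ == "__main__":
    threat = (0, 0)
    payoffs = [(4, 1), (3, 2), (2, 2), (1, 4), (2, 3)]  # nonconvex, hypothetical
    print(generalized_ks(threat, payoffs))
    print(embs_candidates(threat, payoffs))
```

In this example the diagonal point (2, 2) is feasible but not Pareto optimal, while (3, 2) and (2, 3) both improve on it without lowering any coordinate. On a finite set several payoffs may qualify, so the sketch returns all of them rather than imposing a tie-breaking rule.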

D. THE PROPOSED MCA SPECTRUM ALLOCATION SCHEME
In this study, we adopt the MLMF Stackelberg game model to decide each CO's price strategy, where MDs are followers and COs are leaders. As leaders, COs exploit their price strategies to induce MDs while increasing their payoffs. The CO's payoff $U_{CO}^{T_t}(P_i^{CO})$ with the price strategy $P_i^{CO}$ at time $T_t$ is given by (8).
In (8), $\sigma$, $\eta$ and $\chi$ are control parameters of the $U_{CO}(\cdot)$ function, and $N_{CO}^{T_{t-1}}$ and $M_{CO}$ are the CO's allocated spectrum amount at time $T_{t-1}$ and its total spectrum amount, respectively. The formulation of (8) is a bounded, differentiable, real function that is defined for all real input values and has a non-negative derivative at each point. It is therefore monotonic, and its first derivative is bell shaped. According to the RD, each CO obtains the variation of each price strategy through a step-by-step interactive feedback procedure. The variation of $\eta_{P_i^{CO}}$ at time $T_t$, i.e., $\dot{\eta}^{T_t}_{P_i^{CO}}$, is defined by (9), where $|S_{CO}|$ is the cardinality of $S_{CO}$. Over a sequence of time steps, the vector $\mathcal{P}_{CO}$ is updated using the information of each strategy variation. Therefore, COs select the most profitable strategy while ensuring adaptability.
The major concern of each MD is to optimize the user payoff by effectively allocating the spectrum resources from its contactable COs. To develop the resource allocation algorithm for MDs, the concept of EMBS is applied. As a follower, each individual MD is a game planner, and the multiple flows of its application task are cooperative game players. The payoff of MD $D_i$, $U_{D_i}^{T_t}(C_{D_i}, A_{D_i})$, with the set of its contactable COs $C_{D_i}$ and application task $A_{D_i}$ at time $T_t$ is given by (10).
In (10), the allocation variable indexed by $D_i$ and $CO_k$ is the spectrum amount assigned to $D_i$ from $CO_k$. $\psi_{A_{D_i}}$ is the total spectrum request of $A_{D_i}$. $X^{T_t}(CO_k)$ is the price strategy of $CO_k$ at time $T_t$, and the remaining parameter is a control factor for each MD's payoff. $d^{CO_k}_{D_i}$ is the distance from $CO_k$ to $D_i$, and the radius term indexed by $CO_k$ is that CO's coverage-area radius. To decide the amounts for all $CO_k \in C_{D_i}$, we adopt the concept of EMBS. According to (6) and (7), each amount can be adaptively adjusted, and $D_i$ assigns the spectrum resource from $CO_k$ accordingly.
To implement the minimax decision rule, the formulation of (11) minimizes the possible loss in the worst-case (maximum loss) scenario. When dealing with gains, the rule is referred to as maxmin, that is, maximizing the minimum gain. In (11), $S_{D_i}$ collects the spectrum amounts allocated to $D_i$ from each $CO_k$, and $f^{D_i}_{CO_k}$ represents the flow of $D_i$ that is connected to $CO_k$ in $C_{D_i}$.
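The maxmin rule can be illustrated by exhaustively searching the discrete ways of splitting a request among the flows of one MD. This is a minimal sketch under assumptions: the per-flow payoff functions are hypothetical stand-ins for the flow payoffs, the amounts are counted in basic spectrum units, and brute-force enumeration is chosen here only for clarity and is practical only for small instances.

```python
from itertools import product


def maxmin_allocation(total_units, flow_payoffs):
    """Split `total_units` among flows so that the smallest flow payoff is largest.

    flow_payoffs: one callable per flow, mapping allocated units to a payoff
    (hypothetical stand-ins for the per-flow payoffs of an MD).
    Returns (best_split, best_worst_case_payoff).
    """
    best_split, best_value = None, float("-inf")
    for split in product(range(total_units + 1), repeat=len(flow_payoffs)):
        if sum(split) != total_units:
            continue  # keep only splits that use the whole request
        worst = min(payoff(units) for payoff, units in zip(flow_payoffs, split))
        if worst > best_value:
            best_split, best_value = split, worst
    return best_split, best_value


if __name__ == "__main__":
    # Two flows over two contactable COs; the second yields half the payoff
    # per unit, for example because that CO charges more (assumed values).
    flows = [lambda units: 1.0 * units, lambda units: 0.5 * units]
    print(maxmin_allocation(6, flows))
```

The search balances the two flows: the flow with the lower payoff per unit receives more units, so that the worst-off flow is as well off as possible.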
The main steps of the proposed scheme are described as follows:
Step 1: For our simulation model, the values of system parameters and control factors are listed in Table 2, and the simulation scenario is given in Section IV.
Step 2: Individual MDs generate their application services, which can be split with the MCA technique. For these services, multiple contactable COs provide their spectrum resources.
Step 3: In the upper-level MLMF Stackelberg game model, the price strategies of COs are decided. As leaders, COs dynamically calculate the variation of each price selection probability using Equation (9). Over a sequence of time steps, this variation is used to adjust each strategy's propensity in $\mathcal{P}_{CO}$.
Step 4: According to the probability distribution $\mathcal{P}_{CO}$ for strategy selection, each CO stochastically selects its price strategy $X^{T_t}(CO_k)$ at time $T_t$.
Step 5: Each individual MD has its application task, and each flow is assigned to a specific contactable CO. Based on the CO's price strategy, the allocated spectrum amount is dynamically decided.
Step 6: In the lower-level game model, multiple traffic flows in each MD are game players, and their payoffs are defined using (10). Based on the idea of EMBS, the spectrum amount of each flow is adaptively adjusted, and $S_D$ is decided according to (11).
Step 7: In a distributed manner, each individual MD operates its lower-level game in parallel.
Step 8: To reach a final consensus between the different stances of COs and MDs, they iteratively negotiate with each other based on the hierarchical two-level game approach.
Step 9: Proceed to Step 2 for the next spectrum allocation process.
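The interplay of the steps above can be sketched as a toy simulation loop. Every modelling detail below is an assumption made for illustration: the CO payoff is taken as price times utilization instead of the payoff in (8), the MD side uses a simple inverse-price split instead of the EMBS, every MD can contact every CO, and the replicator-style update works on running payoff estimates. Only the five price levels come from Section IV; all class and function names are hypothetical.

```python
import random

PRICES = [0.1, 0.25, 0.5, 0.75, 1.0]  # price strategies listed in Section IV


class Operator:
    def __init__(self, capacity_units):
        self.capacity = capacity_units
        self.eta = [1.0 / len(PRICES)] * len(PRICES)  # selection probabilities
        self.estimate = [0.0] * len(PRICES)           # running payoff per strategy
        self.choice = 0
        self.load = 0

    def pick_price(self, rng):
        # Step 4: stochastic selection according to the current distribution.
        self.choice = rng.choices(range(len(PRICES)), weights=self.eta)[0]
        self.load = 0
        return PRICES[self.choice]

    def update(self, step=0.2, memory=0.5):
        # Assumed payoff: revenue per unit of capacity, a value in [0, 1].
        payoff = PRICES[self.choice] * self.load / self.capacity
        old = self.estimate[self.choice]
        self.estimate[self.choice] = (1.0 - memory) * old + memory * payoff
        # Step 3: replicator-style adjustment of the selection probabilities.
        average = sum(p * e for p, e in zip(self.eta, self.estimate))
        raw = [max(1e-6, p + step * p * (e - average))
               for p, e in zip(self.eta, self.estimate)]
        total = sum(raw)
        self.eta = [value / total for value in raw]


def allocate(demand, operators, prices):
    """Steps 5-6 stand-in: split a request, favoring cheaper COs."""
    weights = [1.0 / price for price in prices]
    total_weight = sum(weights)
    shares = []
    for op, weight in zip(operators, weights):
        wanted = int(demand * weight / total_weight)
        granted = min(wanted, op.capacity - op.load)  # respect remaining capacity
        op.load += granted
        shares.append(granted)
    return shares


def run(rounds=200, seed=1):
    rng = random.Random(seed)
    operators = [Operator(100), Operator(40), Operator(10)]
    demands = [rng.randint(5, 20) for _ in range(8)]  # Step 2: hypothetical MDs
    for _ in range(rounds):                           # Steps 8-9: repeated interaction
        prices = [op.pick_price(rng) for op in operators]
        for demand in demands:                        # Step 7: each MD acts on its own
            allocate(demand, operators, prices)
        for op in operators:
            op.update()
    return operators


if __name__ == "__main__":
    for op in run():
        print(op.capacity, [round(value, 3) for value in op.eta])
```

The sketch is meant to show the feedback path only: prices shape the MDs' allocations, the resulting loads determine the COs' payoffs, and those payoffs shift the price selection probabilities used in the next round.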

IV. PERFORMANCE EVALUATION
In this section, the performance of our proposed scheme is evaluated by simulations and compared with other existing protocols to confirm the superiority of the two-level game based approach. The assumptions of our simulation environment are as follows:
• The simulated MCA system platform consists of three tiers, $\mathcal{K} = \{1, 2, 3\}$, with 4 macro-cell operators in the first tier, 16 micro-cell operators in the second tier and 64 femto-cell operators in the third tier.
• Multiple COs are regularly positioned in a 10 × 10 kilometer square area; the coverage radii of the macro-, micro- and femto-cell operators are 3.5, 2 and 1.5 kilometers, respectively.
• There are one hundred MDs, $\mathcal{N} = \{D_1, \ldots, D_{100}\}$, and they are distributed randomly over the MCA cellular area.
• There are five strategies for each CO's price policy, $S_{CO} = \{P^{CO}_{min=1}, P^{CO}_2, P^{CO}_3, P^{CO}_4, P^{CO}_{max=5}\}$, defined as $P^{CO}_1 = 0.1$, $P^{CO}_2 = 0.25$, $P^{CO}_3 = 0.5$, $P^{CO}_4 = 0.75$, and $P^{CO}_5 = 1$.
• The process for service request generation is Poisson with a rate measured in services/s, and the offered service load is varied from 0 to 3.0.
• Four different service types are assumed based on connection duration and spectrum requirement. In each MD, application services are selected randomly.
• The total spectrum capacity is 10 Gbps for each macro-cell operator, 2 Gbps for each micro-cell operator and 0.5 Gbps for each femto-cell operator.
• To reduce computation complexity, the amount of spectrum allocation is specified in terms of basic spectrum units (BSUs), where one BSU is the minimum amount (e.g., 512 Kbps in our system) of spectrum adjustment.
• To calculate the EMBS, the utility of the disagreement point, i.e., $d_{f^{D}_{CO}}$, is zero in our system.
• System performance measures obtained on the basis of 100 simulation runs are plotted as a function of the offered service request load.
• Performance measures obtained are normalized user's payoff, system throughput, and fairness among COs in the MCA system.
• For simplicity, we assume the absence of physical obstacles in the wireless communications.
In Fig. 2, the normalized user's payoff is investigated for various service request rates. From the viewpoint of MDs, this performance criterion is related to user satisfaction and service quality. To examine the performance of the proposed technique with respect to different service request rates, we plot the system throughput in Fig. 3. In our simulation model, the system throughput is estimated as the ratio of successfully completed traffic services to all requested applications. In Fig. 4, we plot the achieved fairness among COs for the requested services of MDs. To compare the fairness performance, we use Jain's fairness index [15]. The simulation results displayed in Fig. 2 to Fig. 4 justify the advantages of the proposed approach. Through the reciprocal combination of the MLMF Stackelberg game and the EMBS, we jointly design an interactive two-level game model to achieve an appropriate performance efficiency; in contrast, the HCCA, EMCA and RMRA schemes cannot offer such an attractive outcome under widely different application load intensities.
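For reference, Jain's fairness index used for the fairness comparison can be computed as in the sketch below. The formula is the standard one; which per-CO quantity is fed into it (for example, the utilization of each CO) is not specified here, so the example inputs are hypothetical.

```python
def jain_index(values):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2), between 1/n and 1."""
    n = len(values)
    squares = sum(value * value for value in values)
    if n == 0 or squares == 0:
        raise ValueError("need at least one non-zero value")
    total = sum(values)
    return (total * total) / (n * squares)


if __name__ == "__main__":
    print(jain_index([1, 1, 1, 1]))  # perfectly even shares
    print(jain_index([1, 0, 0, 0]))  # one CO receives everything
```

The index equals 1 when all COs receive equal shares and falls to 1/n when a single CO receives everything.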

V. SUMMARY AND CONCLUSIONS
In this paper, we introduce a novel spectrum resource allocation scheme for the MCA system. To implement our scheme, we adopt a novel two-level game paradigm for COs and MDs. By taking into account the current network conditions, the games at the two levels are combined into a holistic scheme and are mutually dependent on each other. At the upper level, each CO's spectrum price is adjusted based on the MLMF Stackelberg game model. At the lower level, the different COs' resources are dynamically allocated for the MCA services according to the idea of EMBS. Under dynamically changing MCA system environments, our two-level game approach can strike a well-balanced consensus between the conflicting viewpoints of COs and MDs. Finally, we conduct simulations to show the effectiveness of our proposed scheme in terms of user's payoff, system throughput and fairness.
In future work, our current study can be extended in a number of ways. One direction is to explore device-to-device communications for massive IoT system scenarios. Another potential direction is to take the congestion problem into account in the traffic aggregation decision. In addition, we will extend our work by incorporating the mobility of MDs. Last but not least, we will develop a more sophisticated scheme that addresses control issues such as convergence time, service latency and system-level energy efficiency.