Hybrid Controlled User Association and Resource Management for Energy-Efficient Green RANs with Limited Fronthaul

To alleviate the greenhouse effect, high network energy efficiency (EE) has increasingly become an important research target in green wireless communications. Therefore, this paper investigates resource management to mitigate the co-tier interference in the small cell network (SCN). Moreover, with the merits of the cloud radio access network (C-RAN), small cell base stations (SBSs) can be decomposed into a central small cell (CSC) and remote small cells (RSCs). To achieve coordination, split medium access control (MAC) based functional splitting is adopted, with the scheduler deployed at the CSCs and the retransmission functions left at the RSCs. However, limited fronthaul has a significant impact at the RSCs due to the user quality-of-service (QoS) requirements. Accordingly, a traffic control-based user association and resource allocation (TURA) scheme is proposed for centralized resource management. Because it is infeasible for a single CSC to control all RSCs, we further propose a hybrid controlled user association and resource management (HARM) scheme. A CSC performs TURA for its RSCs to mitigate intra-group interference within a localized C-RAN, whereas the CSCs of separate C-RANs play a cooperative resource competition (CRC) game to alleviate inter-group interference. Based on a regret-based learning algorithm, the proposed schemes are analytically proved to reach the correlated equilibrium (CE). Simulation results validate the effect of traffic control in the TURA scheme and the convergence of CRC. Moreover, the proposed TURA, HARM, and CRC schemes are compared with the benchmark. It is observed that the TURA scheme outperforms the other schemes under ideal fronthaul control, whilst the proposed HARM scheme can sustain EE performance under feasible implementation.

guarantee the quality-of-service (QoS) of each user, the serving SBS with overloaded backhaul will offload some users to other SBSs, which is referred to as traffic control [15]. In [16], the authors aim at maximizing the weighted sum rate for backhaul-constrained SCNs with carrier aggregation. Furthermore, it is crucial to appropriately allocate the available frequency resources and the SBS transmit power to fulfill the users' QoS requirements. Non-cooperative games are adopted in [17]-[19] to enhance network capacity or EE, where each SBS selfishly determines the resource block (RB) and power assignments based on its own utility function; however, the overall system performance is degraded due to the lack of coordination. Another framework proposed in [20] executes resource and power allocation by constructing a cooperative game between small cells for cross-tier and co-tier interference mitigation.
Moreover, the scheme proposed in [21] achieves optimal network EE based on a cooperative game for subcarrier assignment. Research in [22] assigns subcarriers and allocates SBS's transmit power by evolutionary game theory, which analyzes the average interference between SBSs.
Another functional split virtualizes all the network functions in the CSC, while the RSCs only contain the radio frequency (RF) processing unit for data transmission and reception (as shown in the top-most case of Fig. 1). This is the classic realization of the C-RAN architecture, and the VNFs are assigned to the most appropriate processor or hardware accelerator in the CSCs to efficiently execute the corresponding network functions of base stations. Although higher system performance can be achieved owing to full coordination among small cells, the requirements of low latency and ideal fronthaul result in considerable expenditure for network operators. Previous work in [23] conducts RB and power allocation for the SCN by a centrally controlled unit, and an optimization problem is formulated to maximize network EE. Nevertheless, energy conservation and QoS requirements have not been taken into account in either [16] or [24]. The data rate requirements and the restrictions of backhaul capacity are simultaneously considered in [14], [25], and [26]. In [14], an energy-efficient resource allocation algorithm for multi-cell orthogonal frequency division multiple access (OFDMA) systems is proposed. The work in [26] presents a joint resource allocation and admission control scheme to minimize the sum of the interference levels that the macrocell suffers from the small cells. Even though a near-optimal solution can be obtained, the signaling overhead and computational load are potential drawbacks of RF-based functional split management.
The above observations imply that there exists a tradeoff between the overall system performance and the deployment cost of fronthaul links. According to the analysis in [2], medium access control (MAC) can be divided into the upper MAC as the scheduler and the lower MAC as the hybrid automatic repeat request (HARQ) mechanism.
Additionally, the functional split of MAC can deliver the benefits of centralization while requiring only a small increase in transported data in such a network scenario. This is well aligned with the existing multi-vendor ecosystem for telecom operators based on the functional application platform interface. However, it is infeasible for a CSC to control a very large number of RSCs in realistic communication systems due to hardware limitations. As a result, a practical scale for the implementation of dense SCNs is analyzed in this paper, where a CSC is in charge of the radio resource management (RRM) for the RSCs in a localized small cell group, such as a shopping mall or commercial building, under the split MAC-based network functions. Interference management with limited fronthaul capacity is invoked within a localized small cell group. Furthermore, a scheme for interference mitigation among the localized small cell groups is also proposed in this paper. Building on the advantages of split MAC-based functional splitting, this paper proposes a framework for subchannel and RSC transmit power allocation in which a CSC centrally maximizes EE in its localized serving small cell group. The traffic control and small cell on/off mechanisms are also designed under the consideration of limited capacity for non-ideal fronthaul.
Traffic control is triggered when the fronthaul of a serving RSC is overloaded: the overloaded RSC offloads users to one of the nearby RSCs that still has enough fronthaul capacity to serve them. Meanwhile, an RSC also tends to offload its associated users to other RSCs when its load is low, so that it can be turned off for energy saving. Hence, the mechanism of user association achieves not only the required QoS under limited fronthaul capacity but also the power conservation of RSCs.
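The offloading rule described above can be illustrated with a small Python sketch. The function name, the data layout, and the tie-breaking choice (prefer the neighbor with the most spare fronthaul) are illustrative assumptions and are not specified in the paper.

```python
def pick_offload_target(demand, neighbors, load, capacity):
    """Return a nearby RSC whose spare fronthaul can carry `demand`, or None.

    Assumption: among feasible neighbors, the one with the most spare
    fronthaul capacity is preferred (the paper does not fix this choice).
    """
    best, best_spare = None, -1.0
    for s in neighbors:
        spare = capacity[s] - load[s]
        if spare >= demand and spare > best_spare:
            best, best_spare = s, spare
    return best


# Hypothetical loads and capacities in Mbit/s for three RSCs.
load = {0: 9.0, 1: 4.0, 2: 7.0}
capacity = {0: 10.0, 1: 10.0, 2: 10.0}
target = pick_offload_target(demand=2.0, neighbors=[1, 2], load=load, capacity=capacity)
```

If no neighbor has enough spare capacity, the function returns `None`, which corresponds to the user staying with its current RSC.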
Based on the above, a joint optimization problem for traffic control-based user association and resource allocation (TURA) is formulated and solved in this paper. To the best of our knowledge, this joint optimization not only reduces the computational load of the CSC but also reduces the power consumption, thereby enhancing network performance. Unlike the ideal implementation in which all VNFs are co-located in a CSC, interference alleviation for the group-edge users among the localized small cell groups makes it challenging for the CSCs to coordinate their RRM, since only limited and asynchronous information is exchanged between them. Consequently, a cooperative resource competition (CRC) game for subchannel and transmit power allocation between CSCs is formulated in order to improve the network EE by effectively managing inter-group interference. By regarding each CSC as a player in the game, a CSC can adapt its transmission configuration based on the observed transmit power of the other CSCs and the received utility. In the proposed CRC scheme, a distributed learning algorithm is employed that evaluates the regret value of the different actions taken by the player. In summary, this paper proposes a hybrid controlled user association and resource management (HARM) scheme, which consists of both distributed and centralized RRM schemes for C-RAN split MAC-based functional splitting SCNs under restricted fronthaul capacity.
The rest of this paper is organized as follows. The system model and problem formulation of the proposed HARM algorithm are described in Section II. Section III presents the proposed TURA scheme for the SCN in a localized C-RAN. Section IV formulates the CRC scheme among multiple localized C-RANs and the distributed algorithm adopted to reach the correlated equilibrium (CE). The performance of the proposed framework is evaluated in Section V. Finally, Section VI draws the conclusions.
Notations: A bold capital letter denotes a matrix and $|\cdot|$ denotes the absolute value. The operator $\max(\cdot)$ returns the largest value in an array. $\mathbb{1}(A)$ is an indicator function which is equal to 1 when the event A happens and 0 otherwise. A random variable with value between a and b is generated by $\mathrm{rand}(a, b)$.

II. SYSTEM MODEL AND PROBLEM FORMULATION
The architecture and operating process of the proposed split MAC-based SCN are described in the first and second subsections. The energy-efficient optimization problem of resource allocation with traffic control and small cell on/off mechanisms, under the relevant constraints on subchannels, RSC transmit power, and fronthaul capacity, is formulated in the third subsection.

A. System Model
The CSCs in different small cell groups communicate with the core network via backhaul links. In the proposed split MAC-based SCN, each CSC conducts centralized resource allocation for its serving RSCs in a localized small cell group. In other words, a localized small cell group can be regarded as a localized C-RAN. It is considered that the channel state information at the transmitter (CSIT) can be measured by the UEs and fed back to the CSC through the RSCs. Furthermore, there are N available subchannels in the system and the bandwidth of each subchannel is W. All the RSCs share the entire frequency band, which leads to inter-cell interference between the RSCs, and a subchannel can only be assigned to a single UE. As a result, the signal-to-interference-plus-noise ratio (SINR) $\gamma^n_{c,s,k}$ of UE k served by RSC s in localized C-RAN c on subchannel n is given as

$$\gamma^n_{c,s,k} = \frac{g^n_{c,s,k}\, p^n_{c,s,k}}{I^n_{c,s,k} + N_0 W}, \quad (1)$$

where $g^n_{c,s,k}$ and $p^n_{c,s,k}$ are the deterministic channel gain and the transmit power from RSC s in localized C-RAN c to UE k on subchannel n, $N_0$ is the power spectral density of additive white Gaussian noise (AWGN), and W is the bandwidth of a subchannel. The term $I^n_{c,s,k}$ in the denominator of (1) is given in (2), which includes both the intra-group interference in a localized C-RAN (the first two terms) and the co-channel inter-group interference caused by other localized C-RANs to localized C-RAN c (the third term). In (2), $\phi_{t,i,j} \in \{0, 1\}$ indicates whether user j is associated with RSC i in localized C-RAN t, and $\psi^n_{t,i,j} \in \{0, 1\}$ decides the assignment of subchannel n to user j served by RSC i in localized C-RAN t. Given the SINR calculated in (1), the achievable data rate $R^n_{c,s,k}$ for user k served by RSC s in localized C-RAN c on subchannel n can be formulated based on the Shannon capacity as

$$R^n_{c,s,k} = W \log_2\left(1 + \gamma^n_{c,s,k}\right). \quad (3)$$
Therefore, the sum-rate $R_c$ within a localized C-RAN c is given by (4), and the transmit power of RSC s in localized C-RAN c is given by (5). The users are initially connected to the RSCs with the largest reference signal received power (RSRP) and may be offloaded to other cells based on the traffic load and the available fronthaul capacity. A signal power overhead $P^{(O)}$ is considered to reduce the ping-pong effect, i.e., back-and-forth handovers between two RSCs that overload the system. The power overhead in RSC s resulting from traffic control is formulated in (6), where $\hat{\phi}_{c,s,k}$ represents the initial state of user association. Based on (5) and (6), the total power consumption $P_c$ within a localized C-RAN is modeled in (7), where $P^{(CA)}$ and $P^{(CS)}$ are the circuit power consumptions of an RSC in active mode and sleep mode, respectively.
Since switching an RSC with low traffic load into sleep mode is an efficient approach to reduce the power consumption of SBSs for green communications [27], $P^{(CA)}$ and $P^{(CS)}$ are taken into consideration in the power model. The indicator function $\mathbb{1}\left(\sum_{k=1}^{K} \phi_{s,k} > 0\right)$ can be viewed as an RSC on/off strategy: it is equal to 0 when no user is associated with the RSC, in which case the RSC falls into sleep mode for power saving, and it is equal to 1 when one or more users are associated with the RSC, in which case the RSC is in active mode.
Accordingly, the EE of a localized C-RAN, which is defined as the ratio of the total achievable data rate to the total power consumption of SBSs, can be expressed from (4) and (7) as

$$\eta_c = \frac{R_c}{P_c}. \quad (8)$$
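As a minimal numerical sketch of the quantities in this subsection, the following Python snippet evaluates the SINR, the Shannon rate, a simple on/off power model, and the resulting EE. The aggregate interference is passed in as a given number because (2) is not expanded here. The way transmit power, overhead, and circuit power are combined in `total_power` is an assumption made for illustration, as are all function names and numeric values.

```python
import math


def sinr(gain, power, interference, n0, bandwidth):
    """SINR of one UE on one subchannel: g * p / (I + N0 * W)."""
    return gain * power / (interference + n0 * bandwidth)


def shannon_rate(bandwidth, gamma):
    """Achievable rate W * log2(1 + SINR) in bit/s."""
    return bandwidth * math.log2(1.0 + gamma)


def total_power(association, tx_power, overhead, p_active, p_sleep):
    """Assumed power model: an RSC with at least one associated user pays
    transmit power, handover overhead, and active circuit power; an RSC
    with no user only pays sleep-mode circuit power."""
    total = 0.0
    for s, users in enumerate(association):
        if sum(users) > 0:          # indicator function equals 1
            total += tx_power[s] + overhead[s] + p_active
        else:                       # indicator function equals 0
            total += p_sleep
    return total


def energy_efficiency(sum_rate, power):
    """EE as the ratio of sum-rate to total power, in bit/J."""
    return sum_rate / power


# Illustrative values only (not taken from the paper).
W, N0 = 180e3, 4e-21
gamma = sinr(gain=1e-9, power=0.1, interference=1e-12, n0=N0, bandwidth=W)
rate = shannon_rate(W, gamma)
association = [[1, 0], [0, 0]]      # RSC 0 serves user 0, RSC 1 is idle
p_c = total_power(association, tx_power=[0.1, 0.0], overhead=[0.0, 0.0],
                  p_active=1.0, p_sleep=0.1)
ee = energy_efficiency(rate, p_c)
```

In this toy setting, moving the only user away from an RSC switches that RSC from the active to the sleep branch, which is how user association affects the denominator of the EE.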

B. Operational Process for Proposed HARM Scheme
The operating flow chart of the proposed HARM system for mitigating both inter- and intra-group interference is illustrated in Fig. 3. Since the interference suffered by the UEs can be decomposed into the intra-group interference within the same localized C-RAN and the inter-group interference from different localized C-RANs, the HARM scheme addresses both by executing the CRC and TURA algorithms repeatedly. With the proposed TURA algorithm, the users are associated with their serving RSCs and allocated a proper configuration of subchannels and RSC transmit power under the limited fronthaul capacity and the QoS requirements. The intra-group interference can be mitigated by the central control of the CSC within each localized C-RAN. The decision strategies in the TURA scheme, including user association, subchannel allocation, and transmit power allocation within a localized C-RAN, are respectively defined as $\boldsymbol{\Phi}_c = \{\phi_{c,s,k} \mid 1 \le s \le S, 1 \le k \le K\}$, $\boldsymbol{\Psi}_c = \{\psi^n_{c,s,k} \mid 1 \le s \le S, 1 \le k \le K, 1 \le n \le N\}$, and $\mathbf{P}_c = \{p^n_{c,s,k} \mid 1 \le s \le S, 1 \le k \le K, 1 \le n \le N\}$. Furthermore, a CRC game is performed between the CSCs based on the EE of their own localized C-RANs to alleviate the inter-group interference. The decision strategies $\boldsymbol{\Phi}_c$, $\boldsymbol{\Psi}_c$, and $\mathbf{P}_c$ obtained in the TURA scheme are utilized by the localized CSC to determine the probability of the resource allocation strategy $w_c$, which is delivered from a localized C-RAN to the other C-RANs. It is considered that erroneous information $\tilde{w}_{-c}$ is received by a CSC from the other localized C-RANs owing to asynchronous communications between CSCs. Note that $w_c$ and $\tilde{w}_{-c}$ are explained in detail in Subsection IV.C.
Moreover, with the adoption of the proposed CRC scheme, the set of upper bounds for the transmit power on each subchannel in a localized C-RAN c can be obtained from $\kappa^{n,\ell}_{c,s,k}$, where $\kappa^{n,\ell}_{c,s,k}$ is the constrained inter-group interference determined by the proposed CRC scheme (shown in Subsection IV.A) and $L_s$ is the total number of power levels. In other words, under bounded inter-group interference, each CSC executes the TURA scheme to determine the user association, RSC transmit power allocation, and subchannel assignment for the mitigation of intra-group interference. The sum-rate can then be enhanced and the power consumption reduced under limited fronthaul capacity within a localized C-RAN. As a result, the proposed HARM scheme is performed by repeatedly executing the TURA and CRC algorithms until convergence.
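The alternation between the two algorithms can be sketched as a simple outer loop in Python. The callables `crc_step` and `tura_step`, the convergence test on the change in EE, and the stub functions in the example are hypothetical placeholders introduced for illustration; the paper's own stopping rule is not reproduced here.

```python
def run_harm(groups, crc_step, tura_step, max_rounds=20, tol=1e-3):
    """Alternate CRC (inter-group) and TURA (intra-group) until the
    per-group EE stops changing by more than `tol` (assumed criterion).

    crc_step(groups)        -> dict mapping group id to its power upper bounds
    tura_step(group, bound) -> EE achieved by that group under the bounds
    """
    prev = None
    for rounds in range(1, max_rounds + 1):
        bounds = crc_step(groups)
        ee = [tura_step(g, bounds[g]) for g in groups]
        if prev is not None and max(abs(a - b) for a, b in zip(ee, prev)) < tol:
            return ee, rounds
        prev = ee
    return prev, max_rounds


# Stub steps that return constant values, only to show the control flow.
ee_values, n_rounds = run_harm(
    groups=[0, 1],
    crc_step=lambda groups: {g: 1.0 for g in groups},
    tura_step=lambda g, bound: 5.0,
)
```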

C. Problem Formulation
The objective of this work is to maximize EE through subchannel assignment, power allocation, user association, and small cell on/off mechanisms under the constraints of data rate requirement of each UE, maximum transmit power allowance of each RSC, and limited fronthaul capacity. The optimization problem of resource allocation can be formulated to acquire the transmission policies for Φ c , Ψ c , and P c in order to improve network EE, which is stated as follows.
The parameter $\boldsymbol{\Phi}$ in (9a) is the set of user association configurations, $\boldsymbol{\Phi} = \{\boldsymbol{\Phi}_c \mid 1 \le c \le C\}$. The sets of decision policies for subchannel assignment and RSC power allocation are defined analogously as $\boldsymbol{\Psi} = \{\boldsymbol{\Psi}_c \mid 1 \le c \le C\}$ and $\mathbf{P} = \{\mathbf{P}_c \mid 1 \le c \le C\}$. The constraint in (9b) bounds the sum of the allocated power on all subchannels by the maximum transmit power of each RSC. (9c) specifies that each user achieves its target data rate $R^{(min)}_{c,k}$ according to the QoS requirement. (9d) states that the sum-rate of each RSC should not exceed the fronthaul capacity allowance $B^{(max)}_{c,s}$. Furthermore, (9e) describes that each user can be served by only a single RSC. The constraint in (9f) indicates that both $\phi_{c,s,k}$ and $\psi^n_{c,s,k}$ are binary integer variables for user association and subchannel assignment, respectively. (9g) restricts the power allocation parameters to non-negative values.
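To make the role of the constraints concrete, the following Python sketch measures by how much a candidate allocation violates (9b) through (9e) in one localized C-RAN. The exact algebraic form of each constraint is assumed from the verbal descriptions above (in particular, (9e) is read here as "exactly one serving RSC per user"), and the function name and nested-list data layout are illustrative.

```python
def constraint_gaps(phi, psi, p, rate, p_max, r_min, b_max):
    """Return non-negative violation amounts for (9b)-(9e).

    phi[s][k]    : 1 if user k is associated with RSC s, else 0
    psi[s][k][n] : 1 if subchannel n of RSC s is assigned to user k, else 0
    p[s][k][n]   : transmit power in W
    rate[s][k][n]: achievable rate in bit/s
    """
    S, K, N = len(phi), len(phi[0]), len(psi[0][0])
    gaps = {"power": 0.0, "qos": 0.0, "fronthaul": 0.0, "association": 0.0}
    user_rate = [0.0] * K
    for s in range(S):
        tx = sum(psi[s][k][n] * p[s][k][n] for k in range(K) for n in range(N))
        served = 0.0
        for k in range(K):
            r = phi[s][k] * sum(psi[s][k][n] * rate[s][k][n] for n in range(N))
            user_rate[k] += r
            served += r
        gaps["power"] += max(0.0, tx - p_max)               # (9b)
        gaps["fronthaul"] += max(0.0, served - b_max[s])    # (9d)
    for k in range(K):
        gaps["qos"] += max(0.0, r_min[k] - user_rate[k])    # (9c)
        gaps["association"] += abs(sum(phi[s][k] for s in range(S)) - 1)  # (9e)
    return gaps


# Two RSCs, one user, one subchannel; RSC 0 exceeds its fronthaul by 0.5.
gaps = constraint_gaps(
    phi=[[1], [0]],
    psi=[[[1]], [[0]]],
    p=[[[0.5]], [[0.0]]],
    rate=[[[3.0]], [[0.0]]],
    p_max=1.0, r_min=[2.0], b_max=[2.5, 2.5],
)
```

A feasible allocation yields zero in every entry; the integrality and non-negativity conditions (9f) and (9g) are taken for granted by the data layout.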

III. PROPOSED TRAFFIC CONTROL-BASED USER ASSOCIATION AND RESOURCE ALLOCATION (TURA) SCHEME WITHIN A LOCALIZED C-RAN
In this section, the proposed TURA scheme is presented, in which the CSC centrally performs resource allocation for its corresponding RSCs in the localized C-RAN. The TURA scheme can mitigate the intra-group interference and provide higher network capacity through efficient traffic offloading, user association, and resource allocation. Moreover, a proper configuration of user association not only satisfies the QoS requirement under limited fronthaul capacity but also turns off lightly loaded RSCs to conserve power, which leads to higher EE. Based on (9), the optimization problem for the proposed TURA algorithm can be formulated for a localized C-RAN c as

$$\max_{\boldsymbol{\Phi}_c, \boldsymbol{\Psi}_c, \mathbf{P}_c} \; \tilde{\eta}_c. \quad (10a)$$

The objective function $\tilde{\eta}_c$ in (10a) is the EE of localized C-RAN c based on the estimated SINR, which is calculated in (11) from the estimated sum-rate $\tilde{R}_c$ of localized C-RAN c obtained in (12). The term $\tilde{I}_{c,s,k}$ in (12) is the estimated interference, which consists of the intra-group interference and the erroneous inter-group interference caused by asynchronous information exchange between localized C-RANs; it is given in (13).
Since the optimization problem in (10) has a nonlinear objective function, namely a ratio that depends on the three transmission decision policies $\boldsymbol{\Phi}_c$, $\boldsymbol{\Psi}_c$, and $\mathbf{P}_c$, it is difficult to solve via conventional linear programming methods. On top of that, the optimization problem is non-convex with respect to $\boldsymbol{\Phi}_c$, $\boldsymbol{\Psi}_c$, and $\mathbf{P}_c$, since discrete variables and co-channel interference are also taken into consideration. In principle, the problem can be solved by exhaustive search, which tries every configuration of user association, subchannels, and RSC transmit power. However, the computational complexity grows exponentially with the number of RSCs.
For conventional optimization methods to deal with this non-convex optimization problem, the discrete variables must be relaxed to continuous variables and the original problem transformed into a convex one. Nevertheless, the transformed optimization problem cannot be directly solved to obtain near-optimal solutions [28]-[29].
To overcome the difficulty of dealing with the optimization problem in (10), the TURA algorithm with stochastic processes is proposed and described in this section. Particle swarm optimization (PSO), which is motivated by the social behaviour of bird flocks and fish schools, has gained increasing popularity during the last decade due to its effectiveness in difficult optimization tasks, especially resource allocation in wireless systems. The potential solutions of the resource allocation problem are called particles in PSO, and the particles spread through the problem space to reach the final resource configuration by considering historical data and the current best particle. Compared with the well-known genetic algorithm (GA), the main merit of PSO lies in the momentum effect of the velocity vectors for particle movement, which quickly moves the current best solution of each candidate particle toward the global best solution and thus results in faster convergence. However, the solutions from PSO can easily be trapped in local optima, which can be effectively alleviated by adopting quantum-behaved particle swarm optimization (QPSO) [30]-[31]. QPSO uses a random number generator with a certain probability distribution to simulate the particle trajectories in order to provide global convergence of the particles. Furthermore, unlike PSO, QPSO does not require velocity vectors for particles and has fewer parameters to adjust, which makes it easier to implement in realistic wireless systems. The position of a particle in QPSO, i.e., the candidate solution $\{\boldsymbol{\Phi}_c, \boldsymbol{\Psi}_c, \mathbf{P}_c\}$ of (10), is iteratively updated based on the particle fitness and the evolution process in order to approach the optimal solution.

A. Fitness Function and Transformation for Unconstrained Form
In the proposed TURA scheme, each potential solution becomes a candidate by evaluating the quality of its fitness function. In the considered optimization problem, the objective function in (10) is the key factor of the fitness function in deciding how to allocate the limited resources in the wireless network. However, the fitness function is generally in an unconstrained form, so a transformation from the constrained objective function is required. To tackle this difficulty, the penalty function is adopted to transform the original optimization problem in (10) into an unconstrained one [30], where the fitness function is defined as the gain between the reward and penalty functions. Note that the reward function is the objective function for achieving higher EE, and the penalty function measures the degree to which the transmission policies do not meet the constraints. Therefore, the fitness function can be formulated as

$$F(\boldsymbol{\Phi}_c, \boldsymbol{\Psi}_c, \mathbf{P}_c) = \tilde{\eta}_c - \alpha\, \Delta(\boldsymbol{\Phi}_c, \boldsymbol{\Psi}_c, \mathbf{P}_c), \quad (14)$$

where $\alpha$ is the penalty factor, a parameter that balances the fitness function between the EE performance and the penalty for unsatisfied constraints. Moreover, the term $\Delta(\boldsymbol{\Phi}_c, \boldsymbol{\Psi}_c, \mathbf{P}_c)$ in (14) represents the penalty function, which is given in (15). It can be observed from (15) that the value of the penalty function depends on the gaps between the transmission policies and the corresponding constraints. The more constraints are unsatisfied, the higher the penalty the particle incurs for this decision of transmission policies, which affects the direction of its path toward the globally best position or causes it to be screened out when compared with its historical particles during the evolution process.
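The reward-minus-penalty idea can be illustrated with a few lines of Python. The penalty is taken here as a plain sum of non-negative constraint gaps, which is an assumption standing in for the paper's penalty function; the function name and the numbers are illustrative.

```python
def penalized_fitness(ee, violations, alpha):
    """Fitness = EE reward minus alpha times the total constraint violation.

    `violations` holds one gap per constraint; negative entries (slack)
    are clipped to zero so that satisfied constraints add no penalty.
    """
    penalty = sum(max(0.0, v) for v in violations)
    return ee - alpha * penalty


# A feasible candidate with lower EE versus an infeasible one with higher EE.
feasible = penalized_fitness(ee=4.0, violations=[0.0, 0.0], alpha=10.0)
infeasible = penalized_fitness(ee=5.0, violations=[0.3, 0.0], alpha=10.0)
```

With a sufficiently large penalty factor, the feasible candidate ranks above the infeasible one even though its raw EE is lower, which is the screening effect described above.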

B. Operation Process for TURA scheme
Since the updating process for each variable is implemented in the same manner, let $X_c$ denote the decision policies $\{\boldsymbol{\Phi}_c, \boldsymbol{\Psi}_c, \mathbf{P}_c\}$ for simplicity. In each iteration t, there are I candidate solutions of the optimization problem in (10) chosen from the search space, where the search space refers to the set of all possible solutions. The TURA scheme is described in detail as follows.
1) Initialization: All the I candidate solutions are initialized at the beginning. For notational simplicity, the i-th candidate solution at iteration t is denoted as $X_i(t)$. The fitness function in (14) of each candidate solution is calculated, and the fitness threshold $F^{(th)}$ used by the algorithm to declare convergence is determined.
2) Evolution: After the initialization of the I particles, we can obtain the historically best solution of the i-th candidate particle, $X^{(HB)}_i(t)$, which yields the largest value of the fitness function in (14) in its history. Moreover, the global best solution among all I particles at iteration t, denoted as $X^{(GB)}(t)$, can also be acquired. The weighted mean $X^{(Mean)}(t)$ of the I elite candidate solutions at iteration t is calculated in (16). As a result, the i-th candidate solution in iteration t is updated by the evolution equation (17) [32]-[33], where $\mu_i(t) = \mathrm{rand}(0, 1)$ and $\varepsilon_i(t) = \mathrm{rand}(0, 1)$. The first term in (17) is the attractor between the local and global optimal solutions for candidate i in iteration t, which is given by (18), where $\lambda_i(t) = \mathrm{rand}(0, 1)$. On the basis of the evolution equation in (17), the candidate solutions are attained based on this attractor. The attractor in (18) is a random variable that lets the candidate at the next iteration start from a new position, which combines the historically best solution with the global best solution instead of starting from either $X^{(HB)}_i(t)$ or $X^{(GB)}(t)$ alone. The factors in the second term of (17) affect the speed at which the position of the next i-th candidate solution converges to the final solution. The term $\beta(t)$ is a coefficient that influences the convergence speed of the algorithm [1] and is given by (19), where $\beta^{(max)}$ and $\beta^{(min)}$ are respectively the maximum and minimum searching ranges in the solution space. The maximum number of iterations is denoted as T. It can be observed from (19) that $\beta(t)$ decreases linearly with t, since $X_i(t)$ approaches convergence for large t. The difference between $X^{(Mean)}(t)$ and $X_i(t)$ also affects the speed of convergence. If the difference is large, which means this candidate solution is far away from the current average position, the candidate solution at the next iteration should accelerate toward convergence, and vice versa. The term $\mu_i(t)$ is a random variable that keeps the next candidate from falling into a local optimum. In addition to $\mu_i(t)$, a further random variable is used to reduce the probability that the next candidate solution is trapped in a local extreme; it can be regarded as the direction in which the next candidate starts to search for potential solutions.
3) Convergence: Ultimately, the proposed TURA algorithm completes once a termination condition is met. There are two termination conditions: the first is reaching the maximum number of iterations, i.e., t = T, whilst the second is satisfying the stop criterion. A gap ratio $\tau_i(t)$ is defined in (20), and convergence is achieved if all the gap ratio values $\tau_i(t)$ are smaller than the convergence threshold $F^{(th)}$, as stated in (21). The gap ratio $\tau_i(t)$ in (20) represents the distance between the i-th elite candidate solution and the global best solution in iteration t, and (21) denotes that the fitness values of all elite candidate solutions are close enough to that of the global best solution, i.e., the fitness function in (14) converges.
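The three steps above follow the general pattern of QPSO, which can be sketched on a toy continuous problem as follows. This is a textbook-style QPSO loop written for illustration, not the paper's Algorithm 1: it uses a plain (unweighted) mean of the historical bests, draws the random numbers per dimension, clamps positions to a box, and stops only on the iteration budget. All names, defaults, and the toy fitness are assumptions.

```python
import math
import random


def qpso_maximize(fitness, dim, n_particles=20, n_iter=200,
                  lo=0.0, hi=1.0, beta_max=1.0, beta_min=0.5, seed=0):
    """Maximize `fitness` over the box [lo, hi]^dim with a basic QPSO loop."""
    rng = random.Random(seed)
    X = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    best = [x[:] for x in X]                     # historically best positions
    best_fit = [fitness(x) for x in X]
    g = max(range(n_particles), key=lambda i: best_fit[i])
    gbest, gbest_fit = best[g][:], best_fit[g]   # global best so far
    for t in range(n_iter):
        # Contraction coefficient decreasing linearly from beta_max to beta_min.
        beta = beta_max - (beta_max - beta_min) * t / n_iter
        mean = [sum(b[d] for b in best) / n_particles for d in range(dim)]
        for i in range(n_particles):
            for d in range(dim):
                lam = rng.random()
                # Attractor: random mix of historical best and global best.
                attractor = lam * best[i][d] + (1.0 - lam) * gbest[d]
                mu = 1.0 - rng.random()          # in (0, 1], avoids log(1/0)
                step = beta * abs(mean[d] - X[i][d]) * math.log(1.0 / mu)
                sign = 1.0 if rng.random() < 0.5 else -1.0
                X[i][d] = min(hi, max(lo, attractor + sign * step))
            f = fitness(X[i])
            if f > best_fit[i]:
                best[i], best_fit[i] = X[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = X[i][:], f
    return gbest, gbest_fit


# Toy fitness with its maximum (value 0) at (0.3, 0.7).
toy = lambda x: -((x[0] - 0.3) ** 2 + (x[1] - 0.7) ** 2)
solution, value = qpso_maximize(toy, dim=2)
```

In the TURA setting, the position would encode the association, subchannel, and power decisions and the fitness would be the penalized EE; handling the binary variables requires a discretization step that this sketch does not attempt.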
The overall procedure of the proposed TURA scheme within a localized C-RAN is given in Algorithm 1.

Algorithm 1: Proposed TURA Scheme
  ...
  Calculate the convergence speed β(t) by (19)
  for i = 1, 2, ..., I do
    Calculate the local attractor
    Update the candidate solution X_i(t + 1) by (16)-(18)
    Calculate the value of the fitness function F(X_i(t + 1)) by (14)
    if F(X_i(t + 1)) > F(X_i(t)) then
      ...
    Calculate the values of the fitness functions F(X^(HB)_i(t + 1)) and F(X^(GB)(t)) by (14)
    if F(X^(HB)_i(t + 1)) > F(X^(GB)(t)) then
      X^(GB)(t + 1) = X^(HB)_i(t + 1)
    ...
  Increment t ← t + 1
until t = T or CONVERGENCE = TRUE

Moreover, the computational complexity of the proposed TURA algorithm is O(I × T × S × K × N), which is linear in the numbers of candidate solutions, iterations, SCs, users, and subchannels [31]. The performance gain provided by TURA will be illustrated via simulation in Section V.

IV. PROPOSED COOPERATIVE RESOURCE COMPETITION (CRC) GAME AMONG LOCALIZED C-RANS
Considering the hardware limitations, the proposed TURA scheme cannot control a large number of RSCs in a dense small cell network. Consequently, the entire network can be viewed as a collection of many localized C-RANs, where a CSC is in charge of the RRM for the RSCs within a group. The CRC scheme of subchannel assignment and transmit power allocation for RSCs among localized C-RANs is proposed and described in this section. After eliminating the intra-group interference by resource allocation within each localized C-RAN based on TURA, each CSC conducts the CRC scheme to further alleviate the inter-group interference so as to achieve higher system performance. Since only limited and asynchronous information is exchanged between the localized C-RANs, it is difficult for all the CSCs to coordinate their RRM under centrally controlled operation. Hence, distributed management based on game theory is adopted to overcome this difficulty, where the resource competition between localized small cell groups is formulated as a cooperative game. Each CSC only requires limited information about the probabilities of the actions chosen by the other CSCs. We design a learning algorithm based on the EE of each localized C-RAN for the CSCs to determine their transmit actions, which consist of the subchannel assignment and transmit power allocation of their serving RSCs.

A. Cooperative Game-Based Resource Competition Game
In this subsection, the proposed CRC scheme among localized small cell groups is formulated and analyzed. Based on cooperative game theory, the proposed CRC game in normal form can be denoted as

$$G = \left\{ \mathcal{C}, \{\mathcal{A}_c\}_{c \in \mathcal{C}}, \{U_c\}_{c \in \mathcal{C}} \right\}, \quad (22)$$

where the CSC set $\mathcal{C} = \{1, \ldots, C\}$ denotes the set of players in the game, who compete for the subchannel set $\mathcal{N} = \{1, \ldots, N\}$ and the transmit power resources of the RSCs, i.e., the CSCs are the decision makers. Also, $\mathcal{A}_c$ is the action space of CSC c for power allocation vectors. Let $\kappa^n_{c,s,k} = \psi^n_{c,s,k} \times p^n_{c,s,k}$ be the product of the subchannel assignment indicator and the allocated transmit power, which determines whether and how much power UE k is allocated on subchannel n from RSC s in localized C-RAN c. A finite and discrete action space is a significant requirement for cooperative games. As a result, to form a finite and discrete action space $\mathcal{A}_c$ for each player, let $L_c \in \mathbb{N}$ be the number of discrete power levels and let $\kappa^{n,\ell}_{c,s,k}$ represent the $\ell$-th transmit power level from RSC s to UE k over subchannel n, where $n \in \mathcal{N}$, $\ell \in \mathcal{L}_c$, and $\mathcal{L}_c = \{1, \ldots, L_c\}$. Notice that if $\psi^n_{c,s,k} = 0$, zero transmit power is allocated and $\kappa^{n,\ell}_{c,s,k} = 0$. Thus, we define $\kappa^{0,0}_{c,s,k}$ as the case where no subchannel is assigned to the user, and the action space of RSC s is then expressed in (23). Denote $\mathcal{A} = \mathcal{A}_1 \times \cdots \times \mathcal{A}_C$ as the entire action space of all the players. Moreover, the last term $U_c$ of (22) is the utility function of CSC c. In game theory, players choose their actions based on their own utility functions so as to obtain the maximum reward. The major purpose of this paper is to maximize the EE of each localized C-RAN, and the performance of the overall system can be further enhanced through cooperation between CSCs, where each CSC determines its strategy based on a higher EE gain. Accordingly, $\eta_c$ can be regarded as the utility function for each CSC to decide its transmission strategy for resource allocation, and $U_c$ is described in (24), where $a_c = \{\kappa^{n,\ell}_{c,s,k} \mid 1 \le s \le S, 1 \le k \le K, 1 \le c \le C, 1 \le n \le N, 0 \le \ell \le L_c, s \in \mathbb{Z}^+, k \in \mathbb{Z}^+, c \in \mathbb{Z}^+, n \in \mathbb{Z}^+, \ell \in \mathbb{Z}^+\}$ represents the vector of actions taken by CSC c.
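The finite, discrete action space can be enumerated directly for a small example. In the Python sketch below each player is simplified to a single RSC-to-UE link, so one action is either "no subchannel" or a (subchannel, power level) pair; this simplification, the function names, and the power values are illustrative assumptions.

```python
from itertools import product


def link_actions(n_subchannels, power_levels):
    """Actions of one link: (0, 0.0) means no subchannel is assigned,
    otherwise (n, p) assigns subchannel n with discrete power level p."""
    actions = [(0, 0.0)]
    actions += [(n, p) for n in range(1, n_subchannels + 1) for p in power_levels]
    return actions


def joint_action_space(per_player_actions):
    """Cartesian product of the players' action spaces."""
    return list(product(*per_player_actions))


a1 = link_actions(2, [0.1, 0.2])        # 1 + 2 * 2 actions
a2 = link_actions(1, [0.1, 0.2, 0.4])   # 1 + 1 * 3 actions
joint = joint_action_space([a1, a2])
```

The size of the joint space is the product of the per-player sizes, which is why the action space grows quickly with the number of subchannels, power levels, and players.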

B. Existence of Correlated Equilibrium in Proposed Scheme
In the proposed CRC game G, each CSC aims to maximize its utility function cooperatively and further improve the EE performance of the entire system by choosing an optimal transmission action $a_c$, which is the solution of (9). Most existing works investigate the potential solutions of the players' strategies through the stable point at which no player can obtain a higher utility gain by unilaterally changing its action, i.e., the Nash equilibrium (NE) [34]. However, a higher utility gain can be acquired when the players decide their strategies cooperatively via information sharing, which is called a cooperative game. In cooperative games, the stable point at which no player will unilaterally deviate from the selected action is captured by the correlated equilibrium (CE) in game theory. The concept of CE is defined as follows.
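Definition 1 (Correlated Equilibrium). Denote by $\Delta\mathcal{A}$ the set of all probability distributions over the finite action space $\mathcal{A}$. The probability of correlated strategy $A$ is given by $P(A)$, where $(P(A))_{A \in \mathcal{A}} \in \Delta\mathcal{A}$. For all $a_c \in \mathcal{A}_c$ and all $c \in \mathcal{C}$, CSC $c$ will choose strategy $a_c$ rather than any other strategy $\tilde{a}_c$, achieving the stable point of CE, if and only if

$$\sum_{A_{-c} \in \mathcal{A}_{-c}} P(a_c, A_{-c}) \left[ U_c(a_c, A_{-c}) - U_c(\tilde{a}_c, A_{-c}) \right] \geq 0, \tag{25}$$

where $A_{-c}$ represents the matrix of transmission strategies taken by all CSCs except for CSC $c$. On top of that, the action space formed by all the CSCs except for CSC $c$ is expressed as

$$\mathcal{A}_{-c} = \prod_{c' \in \mathcal{C},\, c' \neq c} \mathcal{A}_{c'}.$$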
It can be observed from (25) that the players coordinate their actions by cooperatively exchanging the probability distributions of their strategies. In contrast, NE is another well-known concept for analyzing the chosen strategies, in which each player selfishly decides its own actions. Note that every NE is also a CE, corresponding to the extreme case in which the players' strategies are independent. Accordingly, a better overall welfare of the players can intuitively be reached at a CE than at an NE. This implies that a higher network EE can be achieved by the CE approach through cooperation between CSCs. The following theorem proves the existence of a CE in the considered small cell networks.

Theorem 1. A CE exists in the proposed resource competition game between the small cell networks.
Proof. Since a finite number $C$ of CSCs choose from discrete and finite action spaces in the proposed CRC game $G$, it is apparent that $G$ is a finite game. It has been proved by Theorem 1 in [34] that a CE exists in every finite game. Therefore, the existence of a CE in the proposed CRC game $G$ is established.
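To make the CE condition of Definition 1 concrete, the following Python sketch checks it numerically on a toy two-player game. For every player and every pair of actions, it verifies that the expected gain from deviating, weighted by the joint distribution, is not positive. The payoff table (a game of chicken), the function names, and the tolerance are illustrative assumptions and are not the EE utilities of the paper.

```python
from itertools import product


def is_correlated_equilibrium(payoffs, dist, tol=1e-9):
    """Check the CE inequalities of a finite game.

    payoffs[i][joint] is the utility of player i for a joint action tuple.
    dist[joint] is the probability of that joint action; it must list
    every joint action, including those with zero probability.
    """
    joints = list(dist)
    for i in range(len(payoffs)):
        actions = sorted({j[i] for j in joints})
        for a, a_alt in product(actions, repeat=2):
            if a == a_alt:
                continue
            # Expected gain of playing a_alt whenever a is recommended.
            gain = 0.0
            for joint in joints:
                if joint[i] == a:
                    deviated = joint[:i] + (a_alt,) + joint[i + 1:]
                    gain += dist[joint] * (payoffs[i][deviated]
                                           - payoffs[i][joint])
            if gain > tol:
                return False
    return True


def make_dist(mass):
    """Fill in zero probability for joint actions that are not listed."""
    return {j: mass.get(j, 0.0) for j in product("DC", repeat=2)}


# Toy payoffs: "D" = dare, "C" = chicken.
payoffs = [
    {("D", "D"): 0, ("D", "C"): 7, ("C", "D"): 2, ("C", "C"): 6},
    {("D", "D"): 0, ("D", "C"): 2, ("C", "D"): 7, ("C", "C"): 6},
]
third = 1.0 / 3.0
correlated = make_dist({("D", "C"): third, ("C", "D"): third, ("C", "C"): third})
pure_nash = make_dist({("D", "C"): 1.0})
not_stable = make_dist({("C", "C"): 1.0})

print(is_correlated_equilibrium(payoffs, correlated))  # True
print(is_correlated_equilibrium(payoffs, pure_nash))   # True
print(is_correlated_equilibrium(payoffs, not_stable))  # False
```

The second call illustrates the remark that an NE is a special case of a CE, while the first distribution is a CE that is not a product of independent strategies.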
When the subchannels and RSC power are allocated appropriately to UEs, the system can reach a stable condition, i.e., the Pareto optimum. At the Pareto optimum, no player can acquire a higher reward without potentially causing losses to others.

Proof. It has been shown by Theorem 1 that a CE must exist for the proposed resource competition game $G$. Moreover, (25) implies that each player can reach its maximum expected utility at the CE. Hence, the sum of the expected utilities of all players can be achieved by choosing the correlated strategies. The existence of the Pareto optimum can be proved by contradiction as follows. If no Pareto optimum existed in the proposed game $G$, there would be no correlated strategies satisfying (25) and thus no CE, which contradicts Theorem 1. Consequently, a Pareto optimum must exist in the proposed game $G$.

C. Operational Process for CRC Scheme
In this subsection, a distributed learning algorithm based on the regret matching procedure [35] is adopted for each CSC to determine its correlated strategy and thereby reach the CE of the proposed game $G$. Each CSC selects its new strategy for subchannel and power allocation by considering the regret of not having employed other strategies in the past. In other words, a higher regret value for a non-employed strategy indicates a higher probability that the CSC deviates from its current strategy to that one.

1) Initialization:
The allocation of subchannels, transmit power, and user association is initially configured by the TURA scheme within the localized small groups. Each CSC calculates its own utility function by (24) from the subchannels, the transmit power of the RSCs, and the user association, which are the $\Phi_c$, $\Psi_c$, $P_c$ determined in the operational process of the TURA scheme.

2) Evaluation of Regret Value:
Each CSC determines its candidate strategies from the regret value. Given a history of adopted strategies, the average regret value $R^t_c$ of CSC $c$ at time $t$ is expressed as in (26). The $R^t_c$ given in (26) shows that the average regret value depends on the strategy played in the past, $a_c$, and the unused strategy, $\hat{a}_c$, where $a_c, \hat{a}_c \in \mathcal{A}_c$. The term $D^t_c$ represents the difference in EE for CSC $c$ at time $t$ between adopting the transmission alignments $a_c$ and $\hat{a}_c$ for subchannel assignment and RSC transmit power allocation, and is formulated as in (27).

3) Probability Exchanging and Strategy Updating:
The CSCs choose the strategies with the highest utility gain. After evaluating $R^t_c$ of the potential strategies, the probabilities with which the strategies are selected are updated as in (28).
The term $\xi$ in (28) is a sufficiently large positive number that normalizes the sum of the probabilities of the different strategies to one. The CSCs exchange the probabilities of their potential strategies with each other.
However, errors may occur due to asynchronous information exchange among the CSCs. Therefore, the exchanged probability vector, as shown in Fig. 3, is represented as in (29), where $w^t_{-c}$ is the set of probabilities of potential strategies from all the CSCs except for CSC $c$. Moreover, the parameter $\rho \in [0, 1]$ is the error ratio due to asynchronously exchanged information, and $\Delta h \sim \mathcal{CN}(0, 1)$. As a result, each CSC determines its strategy for transmission alignment as in (30).

Algorithm 2
Set the initial iteration as $t = 1$
CSC $c$ initializes its strategy $a^1_c$ of subchannel assignment and transmit power allocation for the users according to the $\Phi_c$, $\Psi_c$, $P_c$ in the TURA scheme
repeat
    Evaluate the average regret value of the different strategies based on its own EE utility function by calculating (26)
    Update the probability of each strategy according to (28)
    Exchange the probabilities of the potential strategies with the other CSCs according to (29)
    Choose the strategy $a^{t+1}_c$ for iteration $t + 1$ by the arg max over the probability distribution $w^t_c(a^t_c)$, as in (30)
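The regret evaluation and probability update described above can be sketched in Python as follows. The sketch follows the standard conditional regret-matching rule of Hart and Mas-Colell, which is assumed here to be the procedure that [35] refers to: the regret for an unused strategy is the time-averaged utility gain it would have brought in the rounds where the current strategy was played, and the switching probability is that regret divided by a normalizer `xi`. The toy payoff table and all function names are illustrative assumptions; the paper uses the EE utility instead.

```python
# Toy utility of one player: PAYOFF[(own_action, others_action)].
PAYOFF = {("D", "D"): 0, ("D", "C"): 7, ("C", "D"): 2, ("C", "C"): 6}


def utility(own, others):
    return PAYOFF[(own, others)]


def average_regrets(history, utility, actions, current):
    """Average regret of `current` against each alternative action.

    history is a list of (own_action, others_action) pairs, one per round.
    Only rounds in which `current` was played contribute to the sum, and
    the sum is divided by the total number of rounds.
    """
    t = len(history)
    regrets = {}
    for alt in actions:
        if alt == current:
            continue
        diff = sum(utility(alt, others) - utility(own, others)
                   for own, others in history if own == current)
        regrets[alt] = max(diff / t, 0.0)
    return regrets


def switch_probabilities(regrets, current, xi):
    """Turn regrets into a distribution over the next action.

    xi must be large enough that the switching probabilities sum to at
    most one; the remaining mass stays on the current action.
    """
    probs = {alt: r / xi for alt, r in regrets.items()}
    stay = 1.0 - sum(probs.values())
    if stay < 0.0:
        raise ValueError("xi is too small to normalize the probabilities")
    probs[current] = stay
    return probs


history = [("C", "C"), ("C", "C"), ("D", "C"), ("C", "C")]
regrets = average_regrets(history, utility, ("D", "C"), current="C")
print(regrets)                                       # {'D': 0.75}
print(switch_probabilities(regrets, "C", xi=3.0))    # {'D': 0.25, 'C': 0.75}
```

In this example the player would have gained one unit in each of the three rounds where it played "C", so the average regret over four rounds is 0.75.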

4) Convergence:
The final strategy of each CSC is determined according to the convergence criterion in (31). The subchannel assignment and transmit power of the RSCs are allocated by their serving CSC within the localized C-RAN once the criterion in (31) is met. Otherwise, the CSCs continue performing the algorithm until the condition in (31) is satisfied. The detailed procedure is described in Algorithm 2.
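The overall iteration can be sketched as a self-contained Python loop on a toy two-player game. Several parts are assumptions rather than the procedure of the paper: the payoffs are a game of chicken instead of the EE utility, the information exchange is error-free ($\rho = 0$), and the stopping rule (the largest average regret falling below a threshold `eps`) merely stands in for the criterion in (31).

```python
import random
from collections import Counter

ACTIONS = ("D", "C")
# Toy two-player payoffs (game of chicken), NOT the EE utility of the paper.
U = [
    {("D", "D"): 0, ("D", "C"): 7, ("C", "D"): 2, ("C", "C"): 6},
    {("D", "D"): 0, ("D", "C"): 2, ("C", "D"): 7, ("C", "C"): 6},
]


def deviate(joint, player, action):
    """Joint action in which `player` plays `action` instead."""
    return joint[:player] + (action,) + joint[player + 1:]


def run_regret_matching(xi=20.0, eps=0.05, max_iter=5000, seed=0):
    """Run regret matching; return (empirical joint distribution, rounds)."""
    rng = random.Random(seed)
    players = range(len(U))
    # cum[p][(a, b)]: cumulative gain player p would have obtained by
    # playing b in the rounds where it actually played a.
    cum = [{(a, b): 0.0 for a in ACTIONS for b in ACTIONS if a != b}
           for _ in players]
    counts = Counter()
    joint = tuple(rng.choice(ACTIONS) for _ in players)
    for t in range(1, max_iter + 1):
        counts[joint] += 1
        for p in players:
            for b in ACTIONS:
                if b != joint[p]:
                    gain = U[p][deviate(joint, p, b)] - U[p][joint]
                    cum[p][(joint[p], b)] += gain
        # Assumed stopping rule: every average regret is below eps.
        if max(v for c in cum for v in c.values()) / t < eps:
            break
        next_joint = []
        for p in players:
            current, draw, acc = joint[p], rng.random(), 0.0
            choice = current
            for b in ACTIONS:
                if b == current:
                    continue
                # Switch to b with probability (positive regret) / xi.
                acc += max(cum[p][(current, b)] / t, 0.0) / xi
                if draw < acc:
                    choice = b
                    break
            next_joint.append(choice)
        joint = tuple(next_joint)
    return {j: n / t for j, n in counts.items()}, t


distribution, rounds = run_regret_matching()
print(rounds, distribution)
```

The returned empirical distribution of joint play is the quantity whose regrets are driven toward zero by this procedure; the number of rounds needed depends on the seed, on `xi`, and on `eps`.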

V. PERFORMANCE EVALUATIONS
In this section, the performance of the proposed HARM, TURA, and CRC schemes is evaluated through simulation. We consider a small cell network in which the RSCs are deployed on a grid in a square coverage area and users are uniformly distributed within that area. Each user is located at a minimum distance of 2 meters from each RSC. The network channel model and system parameters are based on [36] and [37], where the case of indoor dense urban information society in [37] is taken as the reference. as independent and identically distributed Gaussian random variable. The main parameter settings are listed in Table I. Moreover, unless otherwise mentioned, the data rate of each user, the maximum transmit power of each RSC, and the fronthaul capacity of each RSC are respectively set to be identical across all users and RSCs, with $B^{\max}_{c,s} = B^{\max}$, $\forall c, k, s$. All simulation results are averaged over random user locations and channel conditions through Monte Carlo runs.
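The deployment described above can be sketched in Python as follows. RSCs are placed on a regular grid inside a square area and users are dropped uniformly at random, rejecting any position closer than 2 m to an RSC. Placing the RSCs at cell centres, the area size, and the function names are assumptions for illustration; the actual layout parameters are those of Table I.

```python
import math
import random


def grid_rscs(rows, cols, side):
    """Place rows x cols RSCs at the centres of equal cells of a square."""
    dx, dy = side / cols, side / rows
    return [((j + 0.5) * dx, (i + 0.5) * dy)
            for i in range(rows) for j in range(cols)]


def drop_users(num_users, rscs, side, min_dist=2.0, seed=0):
    """Drop users uniformly, rejecting points within min_dist of any RSC."""
    rng = random.Random(seed)
    users = []
    while len(users) < num_users:
        x, y = rng.uniform(0.0, side), rng.uniform(0.0, side)
        if all(math.hypot(x - rx, y - ry) >= min_dist for rx, ry in rscs):
            users.append((x, y))
    return users


# Hypothetical layout: 2 x 3 RSCs in a 60 m x 60 m area, 12 users.
rscs = grid_rscs(rows=2, cols=3, side=60.0)
users = drop_users(num_users=12, rscs=rscs, side=60.0)
print(len(rscs), len(users))  # 6 12
```

Repeating the user drop with different seeds gives the independent realizations over which Monte Carlo averages are taken.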

A. Convergence of CRC Scheme
In this subsection, the convergence of the proposed CRC scheme is studied in Fig. 4 under different traffic loads, i.e., numbers of users $K$. Note that the CRC scheme described in Section IV is designed to perform resource competition and interference management among multiple C-RANs. We consider that each C-RAN consists of a small cell with synchronous ideal backhaul, which integrates a CSC and an RSC so as to have the full capability to perform network functions. Accordingly, the proposed CRC scheme is conducted for resource competition at the small cells. It can be seen from Fig. 4 that the CRC algorithm quickly converges to its CE, and hence to a stable EE, within a few iterations by adopting the regret-based learning algorithm. More iterations are required to reach the CE and the converged EE when the number of users increases, because more strategies for resource allocation are available to choose from.

B. Performance of Proposed TURA Scheme
In this subsection, simulation results are provided to demonstrate the effectiveness of the proposed TURA scheme on traffic control, as well as the impacts of the RSC on/off mechanism and the capacity limitation of the fronthaul. Furthermore, the impact of the signalling overhead for traffic control on EE performance is also discussed. Since the effect of traffic control is a critical concern in this paper, an RSRP-based user association and resource allocation (RURA) method is considered as the benchmark for performance comparison, which adopts the identical power and subchannel allocation as TURA along with RSRP-based user association.

1) User Offloading and Traffic
The left subplot of Fig. 5 shows that no outage occurs in either the TURA or the RURA scheme with K = 4, since the fronthaul capacity is sufficient for the RSC to satisfy the QoS of each user. Furthermore, as the number of users in the network increases, the outage probability of the proposed TURA scheme is lower than that of the RURA method. This illustrates the merit of user offloading in the TURA scheme, shown in the right subplot, in overcoming the restriction of insufficient fronthaul capacity at the RSCs. Note that the probability of traffic control decreases from N = 6 to N = 7 when the network traffic is overloaded, since the total fronthaul capacity is limited and no configuration of user association can satisfy the data rate requirements of the users. Therefore, Fig. 5 demonstrates the capability of traffic control to improve QoS satisfaction under limited fronthaul capacity.

The scenario of RSC on/off with unlimited fronthaul capacity provides better EE performance than the scenario of RSC on/off with limited fronthaul, since the data rates of the users are not restricted by the fronthaul capacity. Additionally, the performance comparison between the scenarios of RSC on/off with limited fronthaul and RSC on with limited fronthaul shows the merit of the RSC on/off mechanism in saving RSC power to enhance the network EE. Furthermore, it can be observed that the RSC on/off mechanism is especially effective under lower traffic loads, i.e., K = 9, where the RSCs tend to be turned off so as to conserve energy.

Hence, the network EE decreases with increasing signalling power due to the user offloading required to achieve the minimum data rate requirements.
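The contrast between RSRP-based association and traffic control discussed for Fig. 5 can be illustrated with a small Python sketch. It is a greedy heuristic written for illustration, not the TURA optimization itself: users are admitted in order of their strongest RSRP, and each user is attached to the strongest RSC that still has spare fronthaul capacity. The function names, the RSRP values (dBm), the rate requirements, and the capacities (Mbps) are hypothetical.

```python
def rsrp_association(rsrp):
    """Attach each user to the RSC with the strongest RSRP.

    rsrp[k][s] is the RSRP of RSC s measured at user k.
    """
    return [max(range(len(row)), key=row.__getitem__) for row in rsrp]


def traffic_control(rsrp, rate_req, capacity):
    """Greedy offloading sketch under per-RSC fronthaul capacity.

    Returns the serving RSC of each user, or None if the user cannot be
    served by any RSC (outage).
    """
    num_rscs = len(capacity)
    load = [0.0] * num_rscs
    assoc = [None] * len(rsrp)
    # Users with stronger links are admitted first.
    order = sorted(range(len(rsrp)), key=lambda k: max(rsrp[k]), reverse=True)
    for k in order:
        candidates = sorted(range(num_rscs), key=lambda s: rsrp[k][s],
                            reverse=True)
        for s in candidates:
            if load[s] + rate_req[k] <= capacity[s]:
                assoc[k] = s
                load[s] += rate_req[k]
                break
    return assoc


rsrp = [[-60.0, -80.0], [-65.0, -75.0], [-70.0, -72.0]]
rate_req = [10.0, 10.0, 10.0]
capacity = [20.0, 20.0]

print(rsrp_association(rsrp))                     # [0, 0, 0]
print(traffic_control(rsrp, rate_req, capacity))  # [0, 0, 1]
```

With RSRP-based association alone, all three users select RSC 0, whose 20 Mbps fronthaul cannot carry the 30 Mbps of demand, whereas the offloading sketch moves the weakest user to RSC 1 and avoids the outage.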

C. Performance of Proposed HARM Scheme
The EE and outage probability of the proposed HARM scheme versus the number of users, under different numbers of localized C-RANs and error ratios of the exchanged information, are shown in Figs. 9 and 10, respectively. Fig. 9 shows that the EE of HARM with zero error ratio $\rho = 0$ degrades as the number of localized C-RANs increases from $C = 1$ to $C = 6$. This is because dividing the network into more localized C-RANs leads to more distributed control of resource allocation. Consequently, the coordination among localized C-RANs under $C = 6$ cannot be performed simultaneously, in contrast to the fully centralized control under $C = 1$. With a higher error ratio of $\rho = 0.01$, HARM has a lower EE due to asynchronously exchanged information and higher power consumption in a large-scale network. Furthermore, as illustrated in Fig. 10, the outage probability of the proposed HARM scheme with zero error ratio $\rho = 0$ asymptotically increases with the number of localized C-RANs due to non-coordinated interference management and restricted traffic control. A higher error ratio of $\rho = 0.01$ potentially leads to a higher probability that erroneous resource allocation strategies are exchanged between CSCs, which induces inappropriate interference management. In other words, the RSCs in the associated localized C-RAN cannot alleviate interference by allocating proper subchannels and transmit power, which is also reflected in the degradation of EE performance from $\rho = 0$ to $\rho = 0.01$ illustrated in Fig. 9.

D. Performance Comparisons with Existing Methods
In this subsection, we evaluate the performance of the proposed HARM scheme to observe the integrated effects of both the TURA and CRC methods. Note that the centralized control of the proposed TURA scheme considers that all RSCs are managed by a single CSC with upper MAC functions, while the lower MAC functions reside in the RSCs with confined fronthaul capacity. Moreover, the performance of HARM is investigated along with the proposed CRC scheme applied in the distributed small cells. Note that the RURA benchmark adopts RSRP-based user association along with non-adjustable power and subchannel allocation.

1) Different Traffic Loads and Fronthaul Capability:
Regarding Fig. 11, note that no traffic control mechanism is conducted in the compared RURA and CRC schemes. It can be observed that the proposed TURA scheme outperforms the other benchmarks due to its centralized configuration of resource management and traffic control. On the other hand, HARM has a higher outage probability and a lower tendency of traffic control because the user cannot be offloaded between two neighbouring RSCs. We can also infer that the outage probability can be substantially reduced with higher fronthaul capacity, i.e., lower outage is achieved under $B^{\max} = 30$ Mbps than under $B^{\max} = 20$ Mbps. However, users suffer from full outage in all schemes under the high traffic loads of $K \in \{18, 21\}$ and the insufficient fronthaul of $B^{\max} = 20$ Mbps. Moreover, Fig. 13 shows that the offloading probability increases as the fronthaul becomes asymptotically saturated. Furthermore, the probability of user offloading starts to decrease from $K = 18$ under $B^{\max} = 30$ Mbps and from $K = 12$ under $B^{\max} = 20$ Mbps, since the overloaded fronthaul is incapable of supporting the data rate requirements. In addition to QoS satisfaction, user offloading is also performed to save energy, resulting in potential sleep-mode RSCs.

RURA algorithms under a denser network, i.e., r ≤ 150, due to significant interference mitigation. On the other hand, inter-RSC interference is intrinsically smaller with longer distances among RSCs, which results in improved EE for the RURA method. However, overloaded RSCs perform the traffic control mechanism under the proposed TURA and HARM schemes in order to satisfy the minimum data rate requirements of users. Therefore, as depicted in Fig. 15, both TURA and HARM achieve lower outage probability than RURA due to the traffic control mechanism, with TURA and HARM respectively reaching around 40% and 20% reduction of outage. Nevertheless, TURA has much lower outage than HARM since user offloading between different C-RANs can be conducted under the centralized operation of TURA, which achieves zero outage under the sparser scenario of 90 ≤ r ≤ 150. This is also reflected in Fig. 16, where user offloading occurs less often for HARM than for the TURA scheme. Moreover, with shorter distances among RSCs, offloading happens more frequently due to the stronger induced interference.

VI. CONCLUSIONS
In this paper, we conceive a hybrid controlled user association and resource management scheme to solve the problem of subchannel and transmit power allocation for EE maximization under limited fronthaul capacity in large-scale green C-RANs. Within a localized C-RAN, the CSC centrally performs the proposed TURA scheme for the RSCs to mitigate the intra-group interference and tackle the issue of limited fronthaul capacity for user QoS requirements. Furthermore, the proposed CRC scheme analytically achieves the CE and alleviates the inter-group interference among localized RSCs. The existence of the Pareto optimum is theoretically proved based on game theory. Moreover, the simulation results have validated the effect of traffic control and verified the convergence of the CRC scheme. Additionally, the EE performance of the proposed TURA scheme outperforms the other existing methods