Introduction
With the advancement of 5G mobile communication technology, new services such as virtual reality (VR), autonomous driving, and smart factories have emerged [1]. Driven by these services, the average data usage per user has grown exponentially [2]. LTE Release 10 introduced a heterogeneous network (HetNet) architecture to handle high mobile traffic load. This architecture deploys multiple small cells within a macrocell to reduce the load on the macrocell and boost network capacity in specific regions [3]. Small cells enable low-power communication by reducing the distance between the base station and users. Additionally, deploying small cells at cell edges and in coverage holes can enhance energy efficiency and improve quality of service (QoS) provisioning. Consequently, small cells play a key role in providing ultra-high-speed, ultra-low-latency data services. While small cells can enhance network capacity, they also bring specific network challenges. Because of the narrow coverage of small cells and fluctuations in mobile traffic demand, significant load imbalances occur among small cells, resulting in poorer spectral efficiency for users [4]. This issue can arise from the received-power-based user association scheme used in traditional homogeneous mobile networks [5]. In this scheme, users associate with the base station that offers the highest received power or signal-to-noise ratio (SNR) (i.e., the Max-SNR scheme). The scheme considers only the channel state between users and base stations, neglecting the load of the base stations and the traffic demands of users. Therefore, if this scheme is applied to small cell networks, users cannot avoid performance degradation caused by load imbalance. To address this issue, numerous studies on user association have been conducted, spanning from HetNets to ultra-dense networks (UDNs) [6].
However, overly simplified traffic models are utilized in these studies, and the results are not practical in real-world environments.
A. Related Works and Contributions
In the context of LTE HetNets, substantial research effort has gone into offloading traffic from macrocells to small cells. The proposed solutions are either centralized schemes based on optimization theory or decentralized schemes based on game theory and reinforcement learning [7], [8], [9]. Interestingly, it has been demonstrated that a simple small cell biasing approach, cell range expansion (CRE), can achieve a near-optimal solution for HetNet user association [10]. As the CRE technique gained popularity, numerous studies were conducted to dynamically adjust bias values [11], [12]. However, within 5G networks, state-of-the-art studies have raised concerns about CRE causing load imbalance among small cells [13], [14]. In [13], a Monte Carlo tree search (MCTS)-based heuristic scheme was suggested to maximize load fairness among small cells. The authors in [14] formulated traffic-aware user association using an open-source dataset [15]. Beyond traditional human-oriented cellular networks, the user association problem for Internet of Things (IoT) devices and computation offloading with multi-access edge computing (MEC) has been investigated in [16] and [17]. However, despite the sophistication of these algorithms, the traffic data they rely on is excessively simple and outdated. The existing traffic datasets, randomly generated traffic, and Shannon traffic model used in much of the literature, including the works mentioned above, are limited in how well they reflect the reality of 5G networks.
While conventional user association problems mainly focus on the mathematical aspects of algorithms with high complexity based on simplified traffic models, we pioneer a simple and practical algorithm leveraging real-world traffic data generated from commercial 5G networks. Recently, studies leveraging traffic data from real-world 5G applications have been published to address resource management for ultra-high-speed and latency-sensitive applications [18]. In this paper, the data used is not limited to specific applications but captures traffic generated by 5G network users in their daily lives. This approach aims to better reflect the traffic characteristics of 5G users who simultaneously use multiple applications (e.g., streaming music while playing real-time games).
Due to the absence of open-source datasets for daily 5G traffic models, we collected the dataset ourselves over three months from ten smartphones. This allows us to consider the temporal characteristics of 5G smartphone users and develop a practical load balancing algorithm with respect to per-user performance. Moreover, because the proposed algorithm is a one-shot scheme like CRE or Max-SNR, making each decision in a single computation without iterative operations, it can be applied effectively to large-scale networks with a large number of base stations and users.
In this paper, we propose an intelligent biasing algorithm based on real-world 5G traffic load to enhance the performance of users at the edge boundaries (i.e., cell-edge users) of heavily-loaded small cells. Our analysis in Section II shows that 5G mobile traffic follows a consistent daily pattern and is predictable. Thus, in the proposed algorithm, small cells can proactively and intelligently adjust coverage by changing their bias value based on traffic load predictions. During times anticipated to have heavy traffic load, a small cell reduces its bias value and contracts its coverage, and vice versa. Therefore, cell-edge users of heavily-loaded small cells are associated with slightly more distant, but lightly-loaded small cells that offer expanded coverage. Considering the abundant presence of small cells around users in 5G networks, cell-edge users can choose under-loaded small cells at the cost of SNR. As shown in Figure 1, the proposed algorithm can be considered as an evolution of cell breathing techniques in macrocell networks [19], which we have termed ‘cell thinking’.
It is worth mentioning that users may experience inter-cell interference when the coverage of small cells is adjusted. Furthermore, from a practical perspective, the proposed algorithm requires a database system to accumulate per-user QoS and AI/ML capability for traffic load prediction. Thanks to the next-generation radio access network (RAN) architecture, open RAN (a.k.a. O-RAN), these concerns can be addressed [20]. In Section II, we design an intelligent small cell network based on the O-RAN architecture.
The major contributions of this paper are summarized as follows:
We develop a user association algorithm using real-world 5G traffic dataset collected by ourselves, instead of relying on outdated datasets or tractable traffic models. Our 5G traffic dataset can capture the fluctuations in traffic demand over time in 5G small cell networks.
By analyzing our dataset, we concluded that 5G traffic exhibits a predictable daily pattern. In light of this fact, we adopt the realistic assumption that traffic from small cells can also be predictable, as detailed in Section II.
We propose a one-shot user association scheme, which makes a decision with a single computation without iterative operations, based on the result of traffic load prediction. Thus, our algorithm has virtually zero computational overhead. We demonstrate that this simple and practical algorithm, leveraging real-world traffic data, can mitigate the load imbalance issue in 5G networks by comparing its performance with the Max-SNR scheme and a near-optimal approach. It is worth mentioning that in small cell networks, the concept of Max-SNR is analogous to CRE, which decides the association with the maximum biased SNR.
We conduct a comprehensive numerical analysis of our proposed algorithm in a realistic 5G small cell network simulation. In the simulation, we propose a novel resource allocation method using our real-world 5G traffic dataset. The inherent simplicity of our algorithm allows it to be applied to networks with hundreds of users. This distinguishes our proposed algorithm from existing work that evaluates complex user association algorithms only in small-scale simulation settings, and it demonstrates the practical nature of our approach.
The remainder of this paper is organized as follows. Section II presents the system model, including our collected dataset, the intelligent 5G network based on the O-RAN architecture, and the problem formulation. The proposed algorithm is described in Section III. The numerical results of the proposed algorithm are presented in Section IV. Section V concludes the paper and discusses future work.
System Model
In this section, we present the 5G traffic dataset we collected ourselves and propose an O-RAN architecture for its intelligent processing. Additionally, we propose a novel resource allocation method based on real-world traffic and formulate optimization problems.
A. 5G Mobile Traffic Dataset and Its Predictability
The dataset was collected from ten graduate students and one professor using 5G smartphones in the Mobile Communication Lab at Kyunghee University over approximately three months, from July 2023 to October 2023. The participants used 5G services as part of their natural daily routines, and each sample in the dataset covers a 24-hour period. They did not use applications artificially for the experiment; instead, they accessed various 5G services (e.g., social network services, games, video streaming) at their convenience. Additionally, they were allowed to use multiple applications simultaneously (e.g., browsing social network services while using music streaming applications). Samsung Galaxy S smartphones were provided for data collection, running the ‘Netmonitor Lite’ Android application, which captures a traffic log every second [21]. This log includes network information, signal strength, and data rate measurements. Leveraging the downlink/uplink data rate information, we can model the mobile traffic demand of 5G users.
Differing from existing mobile traffic datasets that provide cell-aggregated mobile traffic, our dataset records per-user mobile traffic load [22], [23]. In other words, our dataset more accurately demonstrates the dynamic traffic demand of 5G users. Moreover, since the experiments for data collection allowed the simultaneous use of multiple 5G services, our dataset is more realistic than the stochastic traffic models based on single-service scenarios [24].
Figure 2 shows samples from our dataset, which were collected at one-second intervals over a 24-hour period. Figure 2(a) depicts the traffic demand on weekdays over time, while Figure 2(b) illustrates the traffic demand during weekends. It should be noted that the traffic of 5G users is dynamic, but this dynamic nature itself exhibits ‘consistent patterns’ according to the users’ daily routines [23]. For example, on weekdays, peak values intermittently appear at specific times (such as during business hours and commuting periods), whereas on weekends, peak values consistently manifest in the afternoon. This implies that the dynamics of 5G traffic are sufficiently predictable based on users’ predetermined daily patterns. Additionally, in practical settings, as users frequently move along predefined routes, the aggregated traffic at base stations deployed in specific locations also demonstrates predictability.
Examples of real-world 5G traffic generated by a 5G smartphone over a 24-hour period.
Based on the analysis of complex yet predictable 5G traffic patterns, we assume that the small cells in 5G networks can predict their own traffic load to be handled in the near future (e.g., 1 hour later). Figure 3, cited from [23], further validates the soundness of our assumptions. With this realistic assumption and using our 5G traffic dataset, we propose a load balancing algorithm that adjusts the bias factors of each small cell in advance, based on predicted 5G traffic load, as discussed in Section I. This algorithm mitigates network traffic imbalances among small cells that arise when the Max-SNR scheme is applied. Details of our algorithm are provided in Section III.
The temporal patterns of aggregated mobile traffic from 9,600 base stations in Shanghai at different time scales, hourly and daily [23].
In our system model, each user’s traffic model is randomly selected from the samples in our dataset. Thus, we can model a dynamic, real-world 5G network over a 24-hour period for the simulation. The traffic demand generated in our simulation for the different numbers of users, which shows a high similarity to the hourly mobile traffic graph in Figure 3, is shown in Figure 4. The details of the network model we developed are discussed in the following subsection.
Given the predictability of our collected dataset, we can assume that the small cells can predict the traffic load they need to handle over specific intervals. For instance, there is very little traffic to handle from 3 AM to 4 AM, while the traffic is significantly higher from 2 AM to 3 AM. Based on this fact, our proposed temporal model for developing the proposed traffic-aware biasing scheme is illustrated in Figure 5.
The temporal model used to implement the proposed traffic-aware biasing scheme. Our dataset comprises 24 hours of 5G traffic collected at 1-second intervals, allowing us to obtain load balancing results over a 24-hour period in the simulation. Since adjusting the bias every second is highly inefficient, we divide the 24 hours into 24 intervals of 1 hour each and determine the appropriate bias for each interval.
First, we divide the 24-hour period into equal intervals of length T, and calculate the bias for each interval kT based on the expected traffic load
Second, to implement the proposed traffic-aware biasing scheme, it is necessary to model the traffic load
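The interval-based temporal model can be illustrated with a minimal sketch, assuming a user's trace is simply a list of per-second data rates. The function name `hourly_load` and the toy trace are hypothetical and not part of the original work; the code only shows how per-second samples collapse into one aggregate value per interval of length T.

```python
def hourly_load(per_second_rate, interval_s=3600):
    """Sum a per-second data-rate trace over consecutive intervals of length T.

    per_second_rate: list of data-rate samples, one per second (assumed format).
    interval_s: interval length T in seconds (3600 s = 1 hour, as in the text).
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return [
        sum(per_second_rate[start:start + interval_s])
        for start in range(0, len(per_second_rate), interval_s)
    ]


# Toy example: two "intervals" of 4 seconds each instead of 3600 seconds.
trace = [1.0, 2.0, 3.0, 4.0, 10.0, 0.0, 0.0, 0.0]
print(hourly_load(trace, interval_s=4))  # [10.0, 10.0]
```

With a full 24-hour trace (86,400 samples) and the default interval, the function returns 24 values, one per bias-update interval.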
B. Network System Model and Resource Allocation Scheme Based on Our Dataset
We use our real-world 5G user traffic dataset to model the traffic demands of users in our system simulation. Previous research commonly assumed that small cells evenly allocated bandwidth among users. However, this approach is not suitable for 5G networks, where users have varying traffic demands. To address this issue, our simulation leverages our dataset, which accurately reflects the dynamic traffic demands of 5G users, to ensure that small cells allocate bandwidth in proportion to the radio resources requested by users.
Figure 6 shows the proposed resource allocation method used in our simulation. As mentioned in the previous subsection, each user’s traffic demand is modeled using one of the 24-hour data rate patterns from our dataset. Consequently, in our simulation, users will require different amounts of radio resources every second. The bandwidth allocation is then determined based on the ratio of the radio resources requested by the users. It is noteworthy that the actual data rate recorded in the real-world traffic dataset is not used directly to determine the user’s throughput in the simulation. Instead, the user’s throughput is determined by the allocated bandwidth and the channel model applied in the simulation. The actual data rate values for users are aggregated to model the load that each small cell is expected to handle (i.e.,
For simplicity, we neglect small-scale fading in the channel model, assuming that the 5G traffic dataset we collected already includes the effects of real-world fading. The path loss model is based on non-line-of-sight (NLOS) conditions in an urban low-rise environment, as recommended in [25]. We choose the NLOS model because the environments where we collected traffic data were indoor settings with numerous obstacles between 5G base stations and congested commuting routes. We represent this model with the following equation:\begin{equation*} PL(d_{s}, f_{c}) = 10 \alpha \log _{10}{d_{s}} + \beta + 10 \gamma \log _{10}{f_{c}} + X_{\sigma }, \tag {1}\end{equation*}
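As a rough illustration of Eq. (1), the sketch below evaluates the path loss for given parameters. The parameter values are not stated in this section (they come from [25]), so any numbers passed in are placeholders. Treating X_σ as zero-mean Gaussian shadowing in dB is an assumption made here for illustration, as are the function name and the units of d_s and f_c.

```python
import math
import random


def path_loss_db(d_s, f_c, alpha, beta, gamma, sigma=0.0, rng=None):
    """Evaluate Eq. (1): 10*alpha*log10(d_s) + beta + 10*gamma*log10(f_c) + X_sigma.

    d_s, f_c: distance and carrier frequency (units must match those used in [25]).
    alpha, beta, gamma: model parameters from [25] (not given in this section).
    sigma: standard deviation of X_sigma in dB; ASSUMED zero-mean Gaussian here.
    rng: optional random.Random instance for reproducible shadowing.
    """
    if d_s <= 0 or f_c <= 0:
        raise ValueError("distance and frequency must be positive")
    shadowing = (rng or random).gauss(0.0, sigma) if sigma > 0 else 0.0
    return (10 * alpha * math.log10(d_s) + beta
            + 10 * gamma * math.log10(f_c) + shadowing)


# Placeholder parameters chosen only to make the arithmetic easy to follow.
print(path_loss_db(d_s=100, f_c=10, alpha=3.0, beta=5.0, gamma=2.0))
```

With `sigma=0` the function is deterministic, which makes it convenient for checking the distance and frequency terms separately.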
The small cells in the simulated network are placed uniformly with inter-site distances (ISDs) [26]. As mentioned in Section I, our goal is to enhance the capacity for cell-edge users at the cost of SNR. In our system model, as shown in Figure 7, the ISD is set longer than the radius of a small cell to create cell-edge users (it is crucial not to set the ISD too long, to avoid outages). While users within the radius of a small cell are guaranteed high performance, those outside this radius (i.e., cell-edge users) cannot be assured of high SNR. In such cases, it is preferable for cell-edge users to choose an under-loaded cell even if it provides a slightly lower SNR. The proposed algorithm assigns a greater bias to under-loaded small cells, making them more attractive to cell-edge users. Therefore, the proposed algorithm is expected to serve cell-edge users better than the Max-SNR scheme.
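The effect of biasing on cell selection can be sketched as follows, assuming each user picks the small cell with the largest biased SNR, as in CRE. The function name `associate` and the toy SNR values are hypothetical; with all biases set to zero the rule reduces to Max-SNR.

```python
def associate(snr_db, bias_db):
    """Return, for each user, the index of the cell with the largest SNR + bias.

    snr_db[j][i]: SNR in dB between user j and small cell i.
    bias_db[i]:   bias in dB applied to small cell i (all zeros = Max-SNR).
    """
    association = []
    for row in snr_db:
        if len(row) != len(bias_db):
            raise ValueError("each user needs one SNR value per small cell")
        best_cell = max(range(len(row)), key=lambda i: row[i] + bias_db[i])
        association.append(best_cell)
    return association


# User 0 sits near the edge between cell 0 (20 dB) and cell 1 (18 dB).
snr = [[20.0, 18.0], [10.0, 15.0]]
print(associate(snr, [0.0, 0.0]))  # Max-SNR: [0, 1]
print(associate(snr, [0.0, 3.0]))  # cell 1 biased by 3 dB: [1, 1]
```

In the second call, the cell-edge user gives up 2 dB of SNR to join the cell that was given the larger bias, which is the trade-off described above.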
C. Optimization Problem Formulation for User Association Based on Real-World 5G Traffic
Assuming that each user can be associated with only one small cell, we aim to find a user association decision that mitigates load imbalance among the small cells. In the formulated optimization problem, Jain’s fairness index, which depends on the aggregated actual data rate values from our dataset, is used as the objective function [27], [28]. The sets of small cells and users are represented by S and U, respectively. The optimization problem for finding the association matrix a is \begin{align*} & \max_{a} \left\{ 2\log_{2}\left( \sum_{i \in S} \rho_{i,kT} \right) - \log_{2}\left( \sum_{i \in S} \rho_{i,kT}^{2} \right) \right\} \\ & ~~\text{s.t.} \begin{cases} \displaystyle \sum_{i \in S} a_{i,kT}(j) = 1, & \forall j \in U, \\ \displaystyle a_{i,kT}(j) \in \{0,1\}, & \forall i \in S, \forall j \in U, \end{cases} \tag{2}\end{align*}
\begin{equation*} \rho_{i,kT} = \sum_{t}^{kT} \sum_{j \in U} a_{i,kT}(j) D_{j,t}, \quad \forall i \in S \tag{3}\end{equation*}
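The link between the objective in (2) and Jain’s fairness index can be made explicit with a short derivation. Writing J_{kT} for Jain’s index of the loads \rho_{i,kT} (the symbol J_{kT} is introduced here only for illustration), we have \begin{equation*} J_{kT} = \frac{\left( \sum_{i \in S} \rho_{i,kT} \right)^{2}}{|S| \sum_{i \in S} \rho_{i,kT}^{2}} \quad \Longrightarrow \quad \log_{2} J_{kT} = 2\log_{2}\left( \sum_{i \in S} \rho_{i,kT} \right) - \log_{2}\left( \sum_{i \in S} \rho_{i,kT}^{2} \right) - \log_{2} |S|. \end{equation*} Since |S| does not depend on the association matrix and \log_{2} is monotonically increasing, maximizing the objective in (2) is equivalent to maximizing Jain’s fairness index of the small cell loads.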
Once the association matrix is determined based on the actual data rate values from our real-world dataset, we can then calculate the throughput for users in the simulated network. The throughput r_{ijt} and the SNR are given by \begin{align*} r_{ijt} &= \frac{W_{i} D_{j,t}}{\sum_{j \in U} D_{j,t} a_{i,kT}(j)} \log_{2}(1 + SNR_{ij}) \tag{4}\\ SNR_{ij} &= \frac{P_{i} G_{ij}}{\sigma_{s}^{2}} \tag{5}\end{align*}
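Equations (4) and (5) can be read as a demand-proportional bandwidth split followed by a Shannon-rate calculation. The sketch below works through one small cell at one time instant; the function names, the dictionary-based inputs, and the interpretation of W_i as the cell bandwidth, P_i as transmit power, G_ij as channel gain, and σ_s² as noise power are assumptions inferred from the form of the equations.

```python
import math


def snr_linear(p_tx, gain, noise_power):
    """Eq. (5): linear SNR = P_i * G_ij / sigma_s^2 (all in linear units)."""
    if noise_power <= 0:
        raise ValueError("noise power must be positive")
    return p_tx * gain / noise_power


def user_throughputs(bandwidth, demands, snrs):
    """Eq. (4) for the users associated with one small cell at one time t.

    bandwidth: W_i, total bandwidth of the cell (assumed meaning of W_i).
    demands:   {user: D_{j,t}} requested data rate taken from the traffic trace.
    snrs:      {user: linear SNR_ij}.
    Each user gets a bandwidth share proportional to its demand.
    """
    total_demand = sum(demands.values())
    if total_demand <= 0:
        return {user: 0.0 for user in demands}
    return {
        user: bandwidth * demands[user] / total_demand * math.log2(1 + snrs[user])
        for user in demands
    }


# Toy numbers: user "b" requests three times as much as user "a".
rates = user_throughputs(10.0, {"a": 1.0, "b": 3.0}, {"a": 1.0, "b": 3.0})
print(rates)
```

Note that the recorded data rate only sets the bandwidth share; the resulting throughput still depends on the SNR, which matches the statement that trace values are not used directly as throughput.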
We can evaluate the performance of the algorithm based on the fairness of the aggregated user throughput among small cells. Our algorithm that considers \begin{equation*} F_{kT} = \frac {{\left ({{\sum _{i \in \mathrm S}^{}{\sum _{t}^{kT}{\sum _{j \in \mathrm U}^{}{r_{ijt}}}}}}\right )}^{2}}{\left \vert {{ S }}\right \vert \sum _{i \in \mathrm S}^{}{{\left ({{\sum _{t}^{kT}{\sum _{j \in \mathrm U}^{}{r_{ijt}}}}}\right )}^{2}}} \tag {6}\end{equation*}
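For reference, the fairness index in (6) can be computed from the per-cell aggregated throughput with a few lines of code. This is a minimal sketch; the function name `jain_fairness` and the example values are hypothetical.

```python
def jain_fairness(per_cell_totals):
    """Jain's fairness index as in Eq. (6): (sum x)^2 / (|S| * sum x^2).

    per_cell_totals: aggregated user throughput of each small cell over one
    interval. The index is 1 for a perfectly even split and 1/|S| when a
    single cell carries everything.
    """
    n = len(per_cell_totals)
    sum_of_squares = sum(x * x for x in per_cell_totals)
    if n == 0 or sum_of_squares == 0:
        raise ValueError("need at least one cell with non-zero throughput")
    return sum(per_cell_totals) ** 2 / (n * sum_of_squares)


print(jain_fairness([5.0, 5.0, 5.0, 5.0]))  # 1.0 (perfectly balanced)
print(jain_fairness([8.0, 0.0, 0.0, 0.0]))  # 0.25 (one cell carries all load)
```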
D. Intelligent 5G Network Based on O-RAN Architecture
We can cast our system model into the O-RAN specification, where new O-RAN elements intelligently manage the coverage of small cells by proactively adjusting their bias values based on the traffic load they need to handle during interval T. To efficiently manage the large number of small cells deployed in 5G networks, mobile network operators (MNOs) have pursued new standards for automated and intelligent operations. In line with this trend, in 2018, several global MNOs established the O-RAN Alliance and released O-RAN specifications for intelligent network architectures (i.e., the O-RAN architecture [30]). Hereafter, we describe our intelligent network architecture, which consists of RAN intelligent controllers (RICs) and service management and orchestration (SMO).
The O-RAN specification defines RICs to enable intelligent, data-driven RAN control [31]. The RICs fall into two categories: i) the near-real-time RIC (near-RT RIC), responsible for services requiring latency from 10 milliseconds to 1 second, and ii) the non-real-time RIC (non-RT RIC), for services with latency exceeding 1 second. The near-RT RIC is positioned close to the RAN on the radio side to directly enhance the QoS for hundreds of users, while the non-RT RIC, which is hosted by the SMO, orchestrates the network on a broader scale, serving a more extensive range of users [32]. Furthermore, the SMO can support data collection and AI/ML model training systems, enabling intelligent network management to handle big data generated from 5G networks [33].
The RICs and SMO can control the mobile network through a centralized, data-driven approach, which means that these systems require data management systems. According to the O-RAN specification, near-RT RICs have two kinds of databases: the Radio-Network Information Base (R-NIB) and the UE-Network Information Base (UE-NIB) [34]. The UE-NIB stores the data rate provided to each user, while the R-NIB can store the aggregated traffic load of the served users. Additionally, the SMO and non-RT RICs can perform traffic prediction using big data collection systems and AI/ML capabilities. Therefore, by leveraging the databases and AI/ML capability defined in the O-RAN specification, our traffic-aware biasing algorithm can be implemented in an actual system, making our solution highly practical. Figure 8 shows an example of the proposed scheme based on the O-RAN architecture.
Proposed Algorithm
The proposed algorithm calculates differentiated bias values for each small cell, taking into account the traffic load that each small cell is expected to handle. Our proposed algorithm is represented by the following equations:\begin{align*} K_{i,kT} &= \frac{1}{\alpha}\left(\frac{\bar{\rho}_{kT} - \rho_{i,kT}}{\bar{\rho}_{kT}}\right), \tag{7}\\ \beta_{i,kT} &= \beta_{i,(k-1)T}\left(1 + K_{i,kT}\right), \tag{8}\\ \text{Bias}_{i,kT} &= \text{pathloss}\left(\beta_{i,kT}\right) - \text{pathloss}\left(\beta_{i,(k-1)T}\right), \tag{9}\end{align*}
As shown in Figure 9, the virtual radius
If
For simplicity, we assume that each small cell always handles at least a minimum load in each interval and do not consider cases where
Algorithm 1 gives the proposed biasing algorithm. Given T,
Small Cell Bias Calculation in the k-th Interval kT
Let current interval is
Initialize number of Time Interval
Initialize the scaling factor
Initialize the
Predict the traffic load to be handled
Total =0
for
Total = Total
end for
for
Compute the
Compute the
Compute the Bias
end for
Apply biases to each small cell
Determine association matrix
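The bias update in (7)-(9) can be sketched in a few lines, under stated assumptions. The sketch treats β as the virtual radius of a small cell, takes the average load as the arithmetic mean of the predicted loads, and evaluates pathloss(·) with a log-distance model so that only the distance term survives in the difference in (9). The function name `update_biases`, the parameter names, and the guard against K ≤ -1 are illustrative choices, not the authors' implementation.

```python
import math


def update_biases(predicted_load, prev_radius, scale, pl_exponent):
    """One-shot bias calculation for interval kT, following Eqs. (7)-(9).

    predicted_load: predicted load rho_{i,kT} of each small cell.
    prev_radius:    virtual radius beta_{i,(k-1)T} of each small cell.
    scale:          scaling factor alpha in Eq. (7).
    pl_exponent:    path loss exponent (ASSUMED log-distance model, so that
                    pathloss(b1) - pathloss(b0) = 10 * n * log10(b1 / b0)).
    Returns (new_radius, bias_db) as two lists.
    """
    mean_load = sum(predicted_load) / len(predicted_load)
    if mean_load <= 0 or scale <= 0:
        raise ValueError("mean load and scaling factor must be positive")
    new_radius, bias_db = [], []
    for rho, beta_prev in zip(predicted_load, prev_radius):
        k = (mean_load - rho) / (scale * mean_load)   # Eq. (7)
        if k <= -1:
            raise ValueError("scaling factor too small: radius would vanish")
        beta = beta_prev * (1 + k)                    # Eq. (8)
        bias = 10 * pl_exponent * math.log10(beta / beta_prev)  # Eq. (9)
        new_radius.append(beta)
        bias_db.append(bias)
    return new_radius, bias_db


# Three cells: lightly loaded, average, heavily loaded.
radius, bias = update_biases([1.0, 2.0, 3.0], [100.0, 100.0, 100.0],
                             scale=2.0, pl_exponent=2.0)
print(radius)
print(bias)
```

The lightly loaded cell receives a positive bias and a larger virtual radius, while the heavily loaded cell receives a negative bias. Both loops run once over the small cells, so the cost of this step grows linearly with the number of cells.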
To get an insight into the effectiveness of the proposed one-shot algorithm, we need to compare it with the optimal performance obtained through an iterative solution. We examine combinations of cell-edge users and neighboring small cells using an exhaustive search. Since we experiment with hundreds of users, exploring all combinations is not feasible. Therefore, we obtain a near-optimal solution within the limits of available computational and time resources.
Simulation Results
In this section, we evaluate the performance of our algorithm and compare it with the Max-SNR scheme and the near-optimal result using Monte Carlo numerical simulations. We perform 1,000 simulations for random networks, where each network consists of 20 small cells and randomly distributed users. We obtain simulation results for each interval and evaluate the performance based on the average of these results. The simulation parameters are summarized in Table 2.
To evaluate the performance of algorithms for cell-edge users, we analyze the SNR and throughput of users with 10th percentile performance (Low-10% users) and throughput of users with 90th percentile performance (High-90% users). Note that the SNR of High-90% users is naturally very high and thus not subject to evaluation. We can consider the Low-10% users as cell-edge users and the High-90% users as those monopolizing resources in under-loaded cells. As previously mentioned, if our algorithm performs well, Low-10% users will experience improved throughput despite a reduction in SNR compared to when Max-SNR is applied.
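One way to extract the Low-10% and High-90% values from a set of per-user results is a nearest-rank percentile, sketched below. The paper does not state which percentile definition it uses, so the nearest-rank rule, the function name `percentile`, and the sample values are assumptions made for illustration.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of samples <= it.

    values: per-user metric (e.g., throughput or SNR).
    p:      percentile in (0, 100], e.g., 10 for Low-10% and 90 for High-90%.
    """
    if not values or not 0 < p <= 100:
        raise ValueError("need non-empty values and 0 < p <= 100")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


throughput = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(percentile(throughput, 10))  # 1
print(percentile(throughput, 90))  # 9
```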
A. Numerical Analysis of Proposed Algorithm with Other User Association Schemes
Tables 3 and 4 show the SNR and throughput of Low-10% users versus the number of users for the three schemes, respectively. The scaling factor
Table 5 shows the average throughput of High-90% users for the three schemes. We observe that the throughput of High-90% users with the proposed scheme and near-optimal solution is lower than with the Max-SNR scheme. These results indicate that the resources allocated to users associated with under-loaded cells are effectively redistributed to cell-edge users. Table 6 clearly illustrates the load balancing achieved with the proposed and near-optimal schemes. Both schemes offer higher throughput to Low-10% users compared to the Max-SNR scheme. Notably, the performance of the proposed one-shot algorithm shows little difference compared to the near-optimal solution found through exhaustive search.
In Figure 10, we plot the average small cell fairness indices across 24 intervals versus the number of users. As shown in Figure 10, the proposed scheme achieves significantly higher fairness indices than the Max-SNR scheme. Although the fairness achieved with the proposed scheme differs significantly from that of the near-optimal solution, our scheme is considerably more practical when computational complexity is taken into account.
B. Impact of Scaling Factor \alpha on Proposed Scheme
In this subsection, we compare the performance of the proposed algorithm with respect to different values of the scaling factor
Table 7 illustrates the SNR and throughput for cell-edge users when the
C. Time Complexity Analysis
The time complexity of the proposed algorithm only depends on the biasing calculation. Calculating the average traffic load
Conclusion
In this paper, we studied how to enhance load balancing among small cells in 5G networks using real-world 5G user traffic and the O-RAN architecture. We collected 5G user traffic data ourselves and demonstrated its usability by building a 5G network simulation on our dataset. We proposed a one-shot user association algorithm based on the future traffic load of small cells for effective load balancing. In the simulation, our algorithm adjusts the bias of each small cell according to the traffic load anticipated one hour later, based on the realistic assumption that traffic load can be predicted for each individual small cell. We compared the performance of our proposed algorithm with the Max-SNR scheme and a near-optimal scheme. The results showed that the proposed algorithm achieves significant performance improvements for cell-edge users compared to Max-SNR, which is also a one-shot algorithm. This is because users are not merely associated with the small cell providing the maximum SNR; instead, they can be associated with a small cell that has a lower SNR but a lighter load.
Future work could include enhancing the reliability of 5G user traffic datasets. The participants involved in data collection were limited to graduate students, which prevented the dataset from reflecting diverse daily patterns. We hope our paper highlights the need for real-world 5G user traffic studies and encourages the release of diverse 5G user traffic data as open source. Such data would allow simulated networks to model real-world 5G networks more accurately and may pave the way for new research directions for 5G and beyond networks.
ACKNOWLEDGMENT
(Young-Jun Cho and Hyeon-Min Yoo are co-first authors.)