SyntheticNET: A 3GPP Compliant Simulator for AI Enabled 5G and Beyond

The rapid evolution of cellular system design towards 5G and beyond gives rise to a need for investigation of new features, design proposals and solutions in realistic settings for various deployments and use case scenarios. While many system level simulators for 4G and 5G exist today, there is a dire need for a 3GPP compliant, holistic and realistic system level simulator that can support evaluation of the plethora of AI based network automation solutions being proposed in the literature. In this paper we present such a simulator, developed at AI4networks Lab, called SyntheticNET. To the best of the authors' knowledge, SyntheticNET is the very first Python-based simulator that fully conforms to the 3GPP 5G standard Release 15 and is upgradable to future releases. The key distinguishing features of SyntheticNET compared to existing simulators include: 1) a modular structure to facilitate cross validation and upgrading to future releases; 2) flexible propagation modeling using measurement based, ray tracing based or AI based propagation models; 3) the ability to import data sheets of realistic, measurement based, vendor specific base station features such as antenna and energy consumption patterns; 4) support for 5G standard based adaptive numerology; 5) realistic and user-specific mobility patterns derived from actual geographical maps; 6) a detailed handover (HO) process implementation; and 7) incorporation of database aided edge computing. Another key feature of SyntheticNET is the ease with which it can be used to test AI based network automation solutions. As SyntheticNET is the first Python based 5G simulator, this ease stems in part from its built-in capability to process and analyze large data sets and its integrated access to machine learning libraries.
Thus, SyntheticNET simulator offers a powerful platform for academia and industry alike to investigate not only new solutions for optimally designing, deploying and operating existing and emerging cellular networks but also for enabling AI empowered deep automation in the future.


I. INTRODUCTION
Mobile cellular networks are one of the most complex and expensive engineered systems in existence today. Given that a typical modern Base Station (BS) has thousands of configuration parameters, optimal planning, configuration and continuous post-deployment optimization of a nationwide mobile network, often containing hundreds of thousands of diverse sites, is already one of the most challenging and resource hungry engineering problems. In the wake of the internet of everything, e-governance, e-commerce, e-health and ubiquitous consumption of infotainment, optimal design and operation of emerging mobile networks is one of the key propellers of the emerging digital society.
The associate editor coordinating the review of this manuscript and approving it for publication was Jesús Hamilton Ortiz.
While rapid evolution of the cellular technologies towards 5G and beyond is a vital step forward to meet the capacity crunch, it further aggravates the complexity challenge being faced by the operators today. This is because the number of parameters per site and number of sites per unit area continue to rise making mobile network too complex to optimally design, configure, operate and manage. This calls for tools that can enable investigation and realistic evaluation of a myriad of new system level configurations and features in various deployments and use case scenarios.
The academic research community has relied heavily on mathematical models, such as those employing stochastic geometry, to gain insights into the system level performance of various deployment scenarios [1]-[6]. However, to achieve tractability, these models have to build on many restrictive assumptions and simplifications with respect to user and BS location distributions, transceiver architecture and configurations, and propagation characteristics, to name a few. Furthermore, such models are static in nature and fail to capture the impact of dynamics that are peculiar to mobile networks, such as user mobility, handover (HO) and so on.
Field trials offer the most realistic evaluation of a new network design, solution or feature. However, relying on field trials alone to test every proposed design, solution or feature is not practical owing to the cost, time and effort required to conduct field trials. For this reason, only the most promising designs can be worthy of investment and resources needed for the field trials. In addition, given the large investments and stakes at risk, mobile operators want to minimize the chances of significant network performance impairment of a live mobile network even during the trial phase.
For 5G networks, given the further increase in complexity, the process of designing an optimal network configuration that can maximize all the Key Performance Indicators (KPIs) such as coverage, capacity, retainability and energy efficiency is an even more challenging task. Identifying and maintaining the optimal network configuration is necessary for network operators to fulfill the promises of the much anticipated 5G networks. Deploying a new 5G network, together with the innovative network functionalities and solutions being proposed for efficiency enhancement in 5G and beyond (particularly AI based network automation solutions as proposed in [7]-[9]), in the real world without prior testing would be a costly process and is not practical (see Fig. 1).
To address this problem, system level simulators are widely used in both industry and academia. Many 5G simulators have emerged to date but, as per the survey conducted by the authors and summarized in Table 1, none of them comprises all the key components of the 5G standard. Most importantly, as can be seen in Table 1, none of the existing known simulators has the specific features and flexibility needed to implement and evaluate the AI based design and zero touch automation framework envisioned for emerging networks, as proposed for the first time in [7]. To tackle this problem, we have developed a new system level simulator in Python, named SyntheticNET. The SyntheticNET simulator is modular, flexible, microscopic, versatile and built in compliance with 3GPP Release 15 [10]. The presented simulator supports a large number of unique features such as adaptive numerology, actual HO criteria and futuristic database-aided edge computing, to name a few. Instead of an Object-Oriented Programming (OOP) based structure like existing simulators [11]-[23], SyntheticNET simulator supports commonly used database files (like SQL, Microsoft Access, Microsoft Excel). Site and user information, configuration parameters, antenna patterns etc. can be directly imported to the simulator. As a result, the simulation environment is more realistic and close to actual deployment scenarios.
The Python based platform and the flexibility of different input and output data formats in SyntheticNET simulator allow validation of different Self Organizing Networks (SON) related features as well as new AI based network automation solutions [7]. Mobile operators can use it for planning, evaluating or even optimizing 5G networks. The research community can also benefit from it by implementing new ideas on a true 3GPP-based realistic 5G system level simulator.

A. RELATED WORK
Recently, many simulators targeting 5G network characteristics have been developed [11]-[23]. However, the survey of these simulators, summarized in Table 1, reveals that each of them represents only a selected set of the features present in the 5G standard [10]. Among the available 5G simulators, Matlab [11] is the most advanced 5G link-level simulator: it supports a flexible frame structure, offers a choice of resource scheduling techniques and can incorporate mmWave channels as well. However, unlike SyntheticNET, it is not a system level simulator. Important features that the MATLAB based 5G simulator does not support include realistic mobility and HO mechanisms, categorization of User Equipment (UE) as per QoS Class Identifier (QCI) and cloud based network deployment, to name a few. Another popular simulator is the Vienna 5G simulator [17], an open source system-level simulator for academic purposes based on the Matlab platform. Unlike [11], Vienna does support cloud computing. However, this simulator also lacks some vital features needed to mimic a real cellular network, such as realistic mobility modeling and HO support.
There are few other popular discrete-event 5G network simulators such as ns-3 [12], OMNeT++ [13] and OPNET [14]. Event driven simulators have a major portion of the protocol stack implemented in them, and the packet oriented nature of these simulators exhibits quite accurate link-level results. However, their high computational and network deployment complexity hinders them from modeling large Radio Access Networks (RAN) needed for more realistic analysis. These simulators are more suitable for core side modeling and they can not provide visualization of the crucial RAN KPIs such as coverage and capacity. The need of implementing and testing large RAN deployments can be highlighted from the fact that 5G networks will have an ultra-dense BS deployment with huge number of mobile subscribers which will include sensory devices, self-driven cars etc.

B. CONTRIBUTIONS
The need for a Python based system level simulator for 5G and beyond stems from the new use cases and design features anticipated in 5G and beyond [24]. These include smart vehicles and transport, critical control of remote devices, human machine interaction, and broadband and media everywhere.
During the planning and development of SyntheticNET simulator, we ensured flexibility and modularity when implementing both existing network functions (e.g., propagation models, scheduling algorithms) and new network functions (user mobility, HO criteria, database-aided edge computing etc.). Thanks to the modular code structure, the SyntheticNET simulator is well-suited to the requirements of emerging network scenarios and their use cases, even beyond the scope of 5G.
For academic purposes, a free version of SyntheticNET will be available soon. In the following, we highlight the key features that make SyntheticNET simulator fit for the simulation of emerging cellular networks.
• SyntheticNET simulator is the first Python-based 5G network simulator and thus has the capability to handle large amounts of data and has access to Python based Machine Learning (ML) libraries. This capability makes SyntheticNET simulator the first simulator that is purpose-built to test AI based network automation at all layers and to validate the already standardised SON and next generation SON features.
• First microscopic simulator where each cell can have a unique parameter configuration that can be loaded in various industry compliant formats to model real datasheet based features such as antenna patterns, clutter, BS amplifier, etc. Unlike OOP based BS deployment, where BSs are deployed as per an underlying distribution, SyntheticNET simulator can import a database of site information corresponding to a real deployment. Such a database is maintained by mobile network operators with a detailed description of each individual BS. The database consists of location, tilt, power, azimuth, height, signal propagation description, and other low-level details.
• Similar to BS database, SyntheticNET simulator can import UE specific details such as location, type (static/mobile), height, number of antennas from a real database.
• Ease of calibration through ML based models [25] from traces of real RSRP data, and then predicting accurate RSRP information on the bins where real measurement data is unavailable or has similar characteristics (terrain, BS tilt, user mobility).
• Support of vendor specific or measurement realistic antenna pattern specifications for individual cells.
• Apart from having well known mobility patterns like random way point, SLAW model etc., SyntheticNET simulator can replicate realistic user mobility patterns through integration of Simulation of Urban MObility (SUMO) simulator [26]. This realistic mobility path not only includes the user commute from home to office and back, but random trips to marketplace and entertainment areas can be configured as well. This can help us achieve realistic spatial and temporal load distribution across the deployed network area.
• The SUMO based mobility module in SyntheticNET simulator can also incorporate actual street, highway, walk way topographic data in various formats.
• Historical UE mobility paths can be imported directly from a real network, and can be leveraged by SyntheticNET simulator to gain better insights related to mobility, load distribution, user experience etc.
• SyntheticNET simulator supports flexible frame structure of 5G and corresponding Physical Resource Block (PRB) size and Transmission Time Interval (TTI).
• SyntheticNET simulator models realistic HO criteria where each cell can have individual HO related parameter configurations. This feature alone makes SyntheticNET simulator unique, as existing simulators model mobility and HO using at most a few parameters such as CIO and cell offset. SyntheticNET simulator, on the other hand, models mobility management in depth by incorporating dozens of parameters that affect the mobility related KPIs and other system level KPIs in an intricate and interdependent fashion.
• SyntheticNET simulator incorporates both cell-level parameters (e.g. A3 offset) and relation-level parameters (parameters affecting adjacent BSs on the same frequency and on different frequencies, e.g. CIO).
• Database-aided edge computing support where KPIs known through simulation or by importing csv files can be used to test AI enabled proactive network features (e.g. proactive mobility management, proactive resource allocation, proactive load balancing).
This paper describes the key features of SyntheticNET simulator. The rest of the paper is organized as follows. Section II provides a high-level overview and explains the execution flow of the simulator. Section III then presents the salient features of SyntheticNET simulator that make it distinct from existing simulators. These include adaptive numerology, realistic propagation modeling, detailed 3GPP compliant HO triggering and execution mechanism modeling, database-aided cloud computing and the support of realistic mobility patterns. In Section IV, we present a use case that shows how SyntheticNET simulator features such as realistic mobility pattern integration and the HO procedure can aid in realistic evaluation of different mobility prediction techniques. Section V concludes the paper.

II. SIMULATOR STRUCTURE AND EXECUTION
One of the goals of SyntheticNET simulator is to act as a key enabler for AI based revolutionary planning and optimization solutions for 5G cellular networks and beyond. An overview of the structure of the presented simulator is thus necessary to understand its capabilities and usefulness. In this section, we provide this high level overview of the simulator without attempting to explain the functionality of each simulator module in detail.

A. SIMULATOR BLOCK DESCRIPTION
The overall structure of the SyntheticNET simulator is shown in Fig. 2. As shown in this figure, the SyntheticNET simulator can be divided into eight basic blocks. These blocks are briefly described below.

1) NETWORK DEPLOYMENT
Unlike the existing OOP based simulators, SyntheticNET takes its input in commonly used database formats, such as the widely used csv format. This block imports the individual BS characteristics, which include location, cell id, type of BS, operating frequency, transmission power, electrical/mechanical tilt, azimuth angle, number of available antennas, Cell Individual Offset (CIO, herein referred to as cell bias) and even antenna pattern. A screenshot of a sample heterogeneous network deployment is shown in Fig. 3.
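As a rough illustration of this database-style input, the sketch below reads a small BS table from CSV text using only the Python standard library. The column names (cell_id, x_m, tx_power_dbm and so on), the BaseStation record and the sample rows are hypothetical and chosen for this example; they are not the actual SyntheticNET input schema.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class BaseStation:
    # Hypothetical per-cell record; field names are illustrative only.
    cell_id: str
    x_m: float
    y_m: float
    freq_mhz: float
    tx_power_dbm: float
    tilt_deg: float
    azimuth_deg: float
    cio_db: float


def load_base_stations(csv_text):
    """Build one BaseStation per CSV row, so every cell keeps its own settings."""
    stations = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        stations.append(BaseStation(
            cell_id=row["cell_id"],
            x_m=float(row["x_m"]),
            y_m=float(row["y_m"]),
            freq_mhz=float(row["freq_mhz"]),
            tx_power_dbm=float(row["tx_power_dbm"]),
            tilt_deg=float(row["tilt_deg"]),
            azimuth_deg=float(row["azimuth_deg"]),
            cio_db=float(row["cio_db"]),
        ))
    return stations


# Made-up sample data: one macro cell and one small cell.
SAMPLE_CSV = """cell_id,x_m,y_m,freq_mhz,tx_power_dbm,tilt_deg,azimuth_deg,cio_db
macro_1,0,0,2100,46,6,120,0
small_7,350,-80,3500,30,2,0,3
"""

stations = load_base_stations(SAMPLE_CSV)
```

Keeping one record per cell, rather than drawing sites from a distribution, is what allows each cell to carry its own tilt, power and CIO in such a scheme.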

2) PROPAGATION MODELING MODULE
SyntheticNET allows incorporation of wide range of propagation models and associated data. Custom pathloss empirical models, measurement data based models or ray tracing based models can be used for realistic signal strength calculation.   SyntheticNET also allows importing of the detailed topographic data in various industry compliant formats for more realistic pathloss modeling. One key feature of SyntheticNET is its ability to support newly emerging AI based propagation models such as [25].

3) USER ASSOCIATION MODULE
UE location can be imported to the simulator in a similar fashion as the network configuration file described above. This module's main responsibility is to compute the signal strength of each UE located within the defined network boundary. A UE is associated either with the cell at the smallest distance from the UE, or with the BS having the highest Reference Signal Received Power (RSRP). In both cases, the UE is associated with the cell only if the RSRP is higher than a certain threshold (defined as qRxLevMin in 3GPP [10]). The latter approach is recommended for heterogeneous BS scenarios where the transmission power of each BS can vary due to location, surroundings, type of BS etc. While calculating the signal strength, the heights of the BS and UE, and the angular separation between the UE and the respective BS azimuth angle, are taken into consideration.
A key unique feature of this module in SyntheticNET is that it allows testing of custom AI enabled more sophisticated user association criteria that can take into account advance KPIs such as cell current and future load, network energy consumption, mobility pattern, QoE requirements, caching requirements, caching on edge statistics, and UE battery.
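The highest-RSRP association rule with the qRxLevMin check can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the dictionary input and the default threshold of -120 dBm are hypothetical and are not taken from SyntheticNET.

```python
def associate_ue(rsrp_dbm_by_cell, q_rx_lev_min_dbm=-120.0):
    """Return the id of the strongest cell, or None if the UE cannot camp.

    rsrp_dbm_by_cell maps a cell id to the RSRP (dBm) measured by one UE.
    The default threshold is an illustrative value, not a SyntheticNET default.
    """
    if not rsrp_dbm_by_cell:
        return None
    best_cell = max(rsrp_dbm_by_cell, key=rsrp_dbm_by_cell.get)
    # Associate only when the best RSRP is strictly above the threshold.
    if rsrp_dbm_by_cell[best_cell] <= q_rx_lev_min_dbm:
        return None
    return best_cell


serving = associate_ue({"macro_1": -95.0, "small_7": -88.0})
out_of_coverage = associate_ue({"macro_1": -125.0})
```

A custom association criterion of the kind mentioned above (for example one that also weighs cell load) could replace the `max` step while keeping the same threshold check.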

4) USER MOBILITY
Locations of mobile UEs are modelled by this block. The UE location is updated based on the velocity assigned to that individual mobile UE, and the direction is obtained from the selected model (Manhattan model, Random Waypoint, SLAW model etc.). In addition to predefined or historical UE paths, SyntheticNET also supports realistic mobility patterns by integrating SUMO [26] in its mobility pattern modeling module (Section III-D). This feature can help advance AI enabled proactive and holistic mobility management solutions.
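A single position update of the kind described here can be sketched as below. The example assumes a waypoint-style model in which the UE moves in a straight line towards its current waypoint at its assigned speed; the function name and the 2D coordinate tuples are illustrative choices, not SyntheticNET internals.

```python
import math


def step_towards(pos, waypoint, speed_mps, dt_s):
    """Move a UE from pos towards waypoint for dt_s seconds at speed_mps.

    Positions are (x, y) tuples in metres. The UE stops at the waypoint
    if it would otherwise overshoot it within this time step.
    """
    dx = waypoint[0] - pos[0]
    dy = waypoint[1] - pos[1]
    dist = math.hypot(dx, dy)
    step = speed_mps * dt_s
    if dist <= step:
        return waypoint
    return (pos[0] + dx / dist * step, pos[1] + dy / dist * step)


# A UE at the origin heading to (30, 40) at 10 m/s moves 10 m in one second.
new_pos = step_towards((0.0, 0.0), (30.0, 40.0), speed_mps=10.0, dt_s=1.0)
```

In a random waypoint model, a new waypoint would be drawn once the returned position equals the current waypoint.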

5) HO PROCEDURE
HO procedure module provides a realistic 3GPP-based HO criteria evaluation so that vital retainability KPIs like HO attempt and HO success rate can be evaluated.
For HO procedure, configuration files which include the cell level and relation level parameters can be configured internally or can be imported to model a newly proposed or vendor specific HO implementation. More detail on this can be found in Section III-B.
A unique feature of SyntheticNET in this context is that, unlike most existing simulators that consider only one or two basic HO parameters and thus offer inaccurate results on mobility related KPIs, the HO module in SyntheticNET incorporates all 20+ 3GPP defined configuration parameters that affect mobility in a real network. Modeling these parameters in a simulator is a key step to enable holistic AI enabled network automation. These parameters not only affect mobility related KPIs but also determine overall signaling overhead, capacity, UE battery life and QoE.

6) RESOURCE SCHEDULING
Next, the UEs served by the respective BSs are allocated resources according to the selected scheduling scheme. The scheduling scheme can be custom or standard, such as Round Robin, Proportional Fair, Max C/I etc. While allocating resources, priority criteria can also be defined. In the default criteria, priority is given to UEs as per their QCI. Delay and jitter sensitive voice users are scheduled with the highest priority, followed by UEs corresponding to the Vehicle to Everything (V2X) QCI. The remaining PRBs are allocated uniformly to FTP users. Within each QCI class, UEs are allocated resources as per the scheduling approach described earlier.
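The default priority order (voice first, then V2X, then a uniform split of the remaining PRBs among FTP users) can be sketched as follows. This is an illustrative simplification: the service labels, the per-UE PRB demand field and the first-come order within a class are assumptions made for the example and do not reproduce any particular scheduler in SyntheticNET.

```python
def allocate_prbs(ues, total_prbs):
    """Allocate PRBs for one TTI by service priority.

    ues is a list of (ue_id, service, demand_prbs) tuples, where service is
    one of "voice", "v2x" or "ftp" (hypothetical labels standing in for QCI
    classes). Voice and V2X demands are served in that order; whatever is
    left is split as evenly as possible among FTP users.
    """
    allocation = {ue_id: 0 for ue_id, _, _ in ues}
    remaining = total_prbs
    for service in ("voice", "v2x"):
        for ue_id, svc, demand in ues:
            if svc == service and remaining > 0:
                grant = min(demand, remaining)
                allocation[ue_id] = grant
                remaining -= grant
    ftp_ids = [ue_id for ue_id, svc, _ in ues if svc == "ftp"]
    if ftp_ids:
        share, extra = divmod(remaining, len(ftp_ids))
        for i, ue_id in enumerate(ftp_ids):
            # The first `extra` FTP users absorb the PRBs left by integer division.
            allocation[ue_id] = share + (1 if i < extra else 0)
    return allocation


ues = [("u1", "ftp", 0), ("u2", "voice", 2), ("u3", "v2x", 4), ("u4", "ftp", 0)]
grants = allocate_prbs(ues, total_prbs=20)
```

With 20 PRBs, the voice and V2X users receive their 2 and 4 PRBs, and the two FTP users share the remaining 14 equally.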

7) INTERFERENCE MEASUREMENT
The SINR plays a vital role in determining the performance, quality and hence user experience in cellular networks. A large set of accessibility, performance, and retainability metrics, such as coverage, capacity, and mobility related KPIs are heavily dependent on SINR [7]. In most simulators reported in literature, for simplicity just RSRP based interference estimation is done. This abstraction introduces error that makes KPIs estimated by these simulators far different from the performance in a real network. To avoid this source of error, SyntheticNET uses the actual PRB level interference calculation. More detail on the PRB level SINR calculation can be found in Section III-A.
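The difference between RSRP based and PRB level interference estimation can be illustrated with a short sketch: on a given PRB, only the neighboring cells that actually transmit on that PRB contribute interference. The function names and the explicit activity flags are assumptions made for this example; the actual SyntheticNET calculation is not shown in the text.

```python
import math


def dbm_to_mw(power_dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (power_dbm / 10)


def prb_sinr_db(serving_dbm, interferer_dbm, interferer_active, noise_dbm):
    """SINR (dB) on one PRB.

    serving_dbm: received power from the serving cell on this PRB.
    interferer_dbm: received powers from neighboring cells on this PRB.
    interferer_active: True where the neighbor has scheduled this PRB.
    noise_dbm: noise power over the PRB bandwidth.
    """
    interference_mw = sum(
        dbm_to_mw(p)
        for p, active in zip(interferer_dbm, interferer_active)
        if active
    )
    sinr_linear = dbm_to_mw(serving_dbm) / (interference_mw + dbm_to_mw(noise_dbm))
    return 10 * math.log10(sinr_linear)


# Same neighbor, idle versus transmitting on the PRB of interest.
idle_neighbor = prb_sinr_db(-90.0, [-95.0], [False], noise_dbm=-120.0)
busy_neighbor = prb_sinr_db(-90.0, [-95.0], [True], noise_dbm=-120.0)
```

An RSRP based estimate would treat the neighbor as interfering in both cases, which is the source of error described above.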

8) PERFORMANCE EVALUATION
User level and network level performance is evaluated in the performance evaluation module. In this module, the accessibility KPI can be estimated based on the number of static and mobile users located in areas where RSRP or SINR is below the cell association threshold. The default cell association threshold is the RSRP or SINR level below which a UE is unable to camp on the cellular network due to a lower message decode % in the uplink, the downlink or both. Uplink messages are not decoded properly under high pathloss (low RSRP), where the UE transmission power is not adequate to maintain the desired signal quality at the BS. On the other hand, a low downlink message decode % is mainly due to high interference (low SINR). The retainability KPI can be computed in a similar manner to the accessibility KPI: a failure is counted if, during connected mode or the HO phase, RSRP or SINR remains below the Radio Link Failure (RLF) threshold for a certain duration. Configured HO parameters can thus be evaluated from the retainability KPI. This feature can in turn enable the design of AI enabled algorithms to determine optimal HO parameters, an important use case for industry.
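As a simple illustration of the accessibility estimate, the sketch below counts the fraction of UEs whose RSRP and SINR are both at or above the association thresholds. The function name and the threshold values in the usage example are hypothetical; as noted in the text, real thresholds depend on the network parameters and the equipment vendor.

```python
def accessibility_ratio(samples, rsrp_min_dbm, sinr_min_db):
    """Fraction of UEs able to camp on the network.

    samples is a list of (rsrp_dbm, sinr_db) pairs, one per UE. A UE counts
    as accessible only if neither value is below its threshold.
    """
    if not samples:
        return 0.0
    accessible = sum(
        1 for rsrp, sinr in samples
        if rsrp >= rsrp_min_dbm and sinr >= sinr_min_db
    )
    return accessible / len(samples)


# Illustrative measurements: one UE fails on RSRP, one fails on SINR.
ue_samples = [(-100.0, 5.0), (-125.0, 3.0), (-90.0, -8.0), (-95.0, 10.0)]
ratio = accessibility_ratio(ue_samples, rsrp_min_dbm=-120.0, sinr_min_db=-6.0)
```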
SyntheticNET supports a variety of data rate calculation methods to represent vendor specific implementations and deployments. In SyntheticNET simulator, the default UE specific and cell level maximum throughput (Mbps) is computed by employing the 5G NR [10] maximum data rate equation:

$$\text{data rate (Mbps)} = 10^{-6} \sum_{j=1}^{J} \left( v_{Layers}^{(j)} \cdot Q_{m}^{(j)} \cdot f^{(j)} \cdot R_{max} \cdot \frac{N_{PRB}^{BW(j),\mu} \cdot 12}{T_{s}^{\mu}} \cdot \left(1 - OH^{(j)}\right) \right)$$

where $J$ is the number of component carriers, $v_{Layers}^{(j)}$ is the maximum number of layers, $Q_{m}^{(j)}$ is the maximum modulation order, $f^{(j)}$ is the scaling factor, $R_{max} = 948/1024$, $\mu$ is the numerology which denotes the SCS as described earlier, $T_{s}^{\mu}$ is the OFDM symbol duration, $N_{PRB}^{BW(j),\mu}$ is the number of PRBs allocated to the UE and $OH^{(j)}$ is the overhead.
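A numerical sketch of the maximum data rate calculation is given below. It assumes the 3GPP TS 38.306 form of the expression, in which the per-carrier terms (layers, modulation order, scaling factor, 12 subcarriers per PRB divided by the symbol duration, and one minus the overhead) are summed over carriers and scaled to Mbps, and in which the average OFDM symbol duration is 10^-3 / (14 · 2^µ) seconds. The function name and the dictionary keys are hypothetical.

```python
def nr_max_data_rate_mbps(carriers, r_max=948 / 1024):
    """Approximate NR maximum data rate in Mbps, summed over component carriers.

    Each carrier is a dict with keys (names chosen for this example):
      layers - maximum number of layers
      q_m    - maximum modulation order (bits per symbol)
      f      - scaling factor
      mu     - numerology index
      n_prb  - number of allocated PRBs
      oh     - overhead fraction
    """
    total_bps = 0.0
    for c in carriers:
        # Average OFDM symbol duration for numerology mu (assumed from TS 38.306).
        t_s = 1e-3 / (14 * 2 ** c["mu"])
        total_bps += (
            c["layers"] * c["q_m"] * c["f"] * r_max
            * (c["n_prb"] * 12) / t_s
            * (1 - c["oh"])
        )
    return 1e-6 * total_bps


# One 100 MHz carrier at 30 kHz SCS (273 PRBs), 4 layers, 256-QAM, 14% overhead.
rate = nr_max_data_rate_mbps(
    [{"layers": 4, "q_m": 8, "f": 1.0, "mu": 1, "n_prb": 273, "oh": 0.14}]
)
# rate is roughly 2337 Mbps for this configuration.
```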
Another distinct feature of SyntheticNET simulator is to identify the silence period where voice users cannot communicate due to either uplink or downlink issues. Silent period is usually observed when UE experiences RLF or the RSRP and SINR drops below the silence threshold.
Accessibility, RLF and silence thresholds typically depend on the associated network parameters and, to a certain extent, on the equipment vendor. SyntheticNET simulator supports the use of AI based techniques to identify the respective thresholds for a given network deployment. The detailed procedure for identifying these thresholds for a given network layout is beyond the scope of this paper.

B. SIMULATOR EXECUTION OVERVIEW
Initial setup of SyntheticNET simulator requires setting up the following items:
• Simulation duration.
• Transmission Time Interval (TTI) length - which is dependent on the 5G µ parameter.
• Network-level parameter configuration - which may be unique for individual BSs, e.g. HO parameters.
• Relation-level parameter configuration - which includes parameters affecting adjacent BSs on the same frequency and on different frequencies, such as CIO.
• UE description - location, static/mobile, height, number of antennas etc.
• (optional) UE historical location - to help evaluate AI based advanced mobility management, for example proactive HO management, load balancing and energy efficiency.
• (optional) Database of historical KPIs and parameter value pairs for enabling AI based network automation.
VOLUME 8, 2020
Upon execution, SyntheticNET simulator starts processing the data through each module described earlier. In each TTI, network level KPIs and user experience KPIs are calculated. As the simulation proceeds, SyntheticNET simulator executes the HO to a better cell if the HO criteria are met. Next, resource allocation takes place based on the selected resource scheduling scheme, and then PRB-level interference measurement takes place for all scheduled UEs. When the number of elapsed TTIs reaches the simulation duration, an output file is generated which gives the UE level and network level KPIs. Moreover, a log file is generated which contains the signal level from all the nearby BSs along with other network level statistics of interest. This log file provides additional insights that can be leveraged to propose better interference mitigation and mobility management solutions.

III. DETAILED FEATURE DESCRIPTION
Elaborating on each of the 5G specification [10] features incorporated in SyntheticNET simulator is not possible within the scope of this paper. Therefore, we present four of the key features that distinguish SyntheticNET simulator from existing 5G simulators [11]-[23]. These are vital components of the 5G standard and are hence essential to accurately and realistically simulate a 5G network. To the best of the authors' knowledge, none of the existing 5G simulators incorporates all four network features described in the following subsections.

A. NR ADAPTIVE NUMEROLOGY
In 5G, 3GPP provides adaptive numerology in order to accommodate diverse services (eMBB, mMTC, URLLC) and the associated user requirements. The key idea is to adapt the transmission configuration to address the stringent QoE constraints considering the effect of UE mobility and varying channel conditions. 5G frame structure in SyntheticNET simulator supports adaptive numerology where the TTI duration and the number of PRBs per TTI vary in accordance with the flexible SCS. Structure of the 5G flexible frame and the SCS is governed by the µ parameter. When importing site info, the value of µ associated with each carrier frequency should be assigned so that PRB allocation and interference calculation takes place according to the respective frame structure.
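The dependence of the frame structure on µ can be made concrete with a short sketch. It uses the standard NR relations (subcarrier spacing of 15 · 2^µ kHz, slot duration of 1/2^µ ms and 12 subcarriers per PRB) and assumes, for illustration, that one TTI corresponds to one slot; the function name and the returned keys are hypothetical.

```python
def numerology(mu):
    """Return SCS, slot duration and PRB bandwidth for NR numerology mu (0 to 4)."""
    if mu not in range(5):
        raise ValueError("mu must be an integer between 0 and 4")
    scs_khz = 15 * 2 ** mu
    return {
        "scs_khz": scs_khz,
        # Assumption for this sketch: one TTI equals one slot.
        "slot_ms": 1.0 / 2 ** mu,
        # A PRB spans 12 consecutive subcarriers.
        "prb_bandwidth_khz": 12 * scs_khz,
    }


for mu in range(5):
    print(mu, numerology(mu))
```

Doubling the subcarrier spacing halves the slot duration and doubles the bandwidth of one PRB, which is why the per-carrier µ must be known before PRB allocation and interference calculation.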

B. NR HANDOVER CRITERIA
User mobility has been the raison d'etre of wireless cellular systems. To maintain a reliable connection, mobile users must perform HO from the serving cell to the next suitable cell along their trajectory. HO frequency is mainly dependent on the mobile user speed and network deployment characteristics (BS density, heterogeneity, HO parameter configuration etc.). 5G networks will have a large HO rate, primarily because of network densification and a large fraction of mobile UEs. The 5G standard follows a break-before-make HO approach similar to LTE, where a mobile user may observe HO failure due to poor signal strength of the participating BSs, sub-optimal HO parameter configuration or high user velocity. Therefore, apart from coverage and capacity, retainability is a vital KPI to measure user experience in 5G. For this reason, SyntheticNET simulator models the detailed 3GPP-based HO evaluation and execution process for mobile users. For each cell, the intra-frequency Hand Over Margin (HOM) is calculated based on A3-offset, A3-hysteresis, serving cell CIO (Ocp or cell bias) and target cell CIO (Ocn). The HO evaluation procedure initiates when the RSRP of the target cell exceeds the RSRP of the serving cell by HOM. Next, SyntheticNET simulator's mobility block checks that the HOM condition is fulfilled in each TTI until the Time To Trigger (TTT) timer expires. This is followed by HO execution from the serving to the target cell; during this procedure, serving RSRP and SINR are recorded to help realistically quantify user throughput and the retainability KPI for evaluating the QoE metric. For more details of the 3GPP defined HO execution mechanism, as implemented in SyntheticNET, see [9] and Fig. 5 therein.
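The HOM and TTT logic described above can be sketched as follows. The margin is written here as A3-offset + hysteresis + Ocp - Ocn, which follows from the 3GPP event A3 entering condition when frequency-specific offsets are ignored; this exact form is an assumption, since the text lists the inputs but not the formula. Function names and the per-TTI RSRP lists are illustrative.

```python
def handover_margin_db(a3_offset_db, hysteresis_db, cio_serving_db, cio_target_db):
    """Intra-frequency HOM, assuming the event A3 entering condition
    with frequency-specific offsets omitted."""
    return a3_offset_db + hysteresis_db + cio_serving_db - cio_target_db


def first_handover_tti(serving_rsrp_dbm, target_rsrp_dbm, hom_db, ttt_ms, tti_ms):
    """Return the index of the TTI in which HO is executed, or None.

    The A3 condition (target exceeds serving by more than hom_db) must hold
    in every TTI from the moment it is entered until ttt_ms has elapsed.
    Any TTI in which the condition fails resets the timer.
    """
    entered_at = None
    for i, (serving, target) in enumerate(zip(serving_rsrp_dbm, target_rsrp_dbm)):
        if target - serving > hom_db:
            if entered_at is None:
                entered_at = i
            if (i - entered_at) * tti_ms >= ttt_ms:
                return i
        else:
            entered_at = None
    return None


hom = handover_margin_db(a3_offset_db=2.0, hysteresis_db=1.0,
                         cio_serving_db=0.0, cio_target_db=0.0)
serving = [-90.0, -92.0, -94.0, -96.0, -98.0, -100.0]
target = [-95.0, -93.0, -90.0, -91.0, -90.0, -89.0]
ho_tti = first_handover_tti(serving, target, hom, ttt_ms=2.0, tti_ms=1.0)
```

In this example the condition is first met in TTI 2 and the HO is executed in TTI 4, once the 2 ms TTT has elapsed; with a TTT of 0 ms the HO would occur in TTI 2.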
HO parameter configuration files corresponding to each cell in the network are imported to SyntheticNET simulator as discussed in Section II. For HOM calculation, two types of configuration files are needed: a) a cell-level HO parameter list, and b) a relation-level parameter list. The parameters needed for intra-frequency HO criteria evaluation are shown in Fig. 6. A more detailed diagram showing all 28 mobility related parameters and their associations with the 8 mobility related KPIs dictated by these parameters is given in Fig. 5 in [9]. SyntheticNET simulator also supports 3GPP based inter-frequency HO. The description of inter-frequency HO has been omitted in this paper and can be found in [9].
Fig. 7 shows the SINR CDF of a mobile user traversing the network layout shown in Fig. 3. During HO criteria evaluation (Fig. 6), i.e., during the time needed to execute HO, the mobile user penetrates into the coverage of the neighboring cell without performing HO. As a result, the UE observes temporarily negative SINR (on dB scale) due to strong interference from the best server (HO target cell). The magnitude of negative SINR during the HO phase increases with user velocity as the user penetrates deeper into the coverage of the neighboring cell. Similarly, larger HOM and/or TTT may contribute to more severe SINR degradation. This is illustrated in Fig. 7 for different user velocities and HO parameter configurations. For example, from Fig. 7 we see that for 0 dB HOM and 0 ms TTT, the UE always stays on the best server while interference is observed from non top-1 cells. Consequently the UE observes positive SINR (dB) most of the time. However, there are instances where the UE observes negative SINR due to strong interference from multiple non top-1 cells. Since the UE always stays on the top-1 cell and HO delay due to the 3GPP HO criteria (Fig. 6) is not observed, the effect of user velocity on the SINR distribution is negligible. This can be verified in Fig. 7, where UE velocity is changed from 100 km/h to 200 km/h but the SINR CDF remains unchanged.
Fig. 7 also shows the SINR distribution for various HO configuration parameters. UE SINR decreases with more stringent HO criteria. This is in line with the temporary SINR degradation during HO criteria evaluation discussed earlier. There is a trade-off between ping-pong HOs and HO delay duration. Because of shadowing, ping-pong HOs increase dramatically when the HO configuration demands that the UE stay on the best server or when the HO criteria are easily fulfilled. Conversely, ping-pong HOs reduce for tighter HO conditions, but HO delay increases, causing negative SINR or sometimes Radio Link Failure (RLF), especially for high speed users. More detail on this can be found in [9].
It is worth highlighting that most existing simulators do not model HO procedures and associated configuration parameters in such detail to capture aforementioned and other mobility related important phenomena and the associated impact on overall throughput and user QoE that is inevitably experienced in real network.

C. FUTURISTIC DATABASE AIDED EDGE COMPUTING
SyntheticNET simulator also supports database aided edge computing approaches deemed essential for futuristic mobile networks [27]. SyntheticNET simulator divides the target area into a custom bin map whose size can be user defined (see Fig. 8). For a given network layout, SyntheticNET simulator quantifies several KPIs which include SINR distribution, Channel Quality Indicator (CQI) distribution, spectral efficiency, available resources, VoLTE Silence%, HO rate, HO failure% etc.
SyntheticNET simulator then allows a historical database of the above KPIs and selected measurements for the bands of interest (such as RSRP, SINR, CQI, PRB usage, mobility traces and QoE indicators such as RLF reports) to be built and stored. Using tools from machine learning and stochastic optimization, this database can then be leveraged to design algorithms for Data Base Station (DBS) aided cell discovery and selection, proactive radio resource allocation, and proactive rather than reactive switching ON/OFF of DBSs, to jointly maximize both spectral efficiency and energy efficiency without compromising QoE.
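The bin map idea can be illustrated with a small sketch that groups geo-located measurements into square bins and stores one aggregate per bin. The square-grid indexing, the function names and the choice of averaging SINR values directly in dB are assumptions made for this example, not a description of the SyntheticNET database.

```python
from collections import defaultdict


def bin_index(x_m, y_m, bin_size_m):
    """Map a position in metres to the (column, row) index of its square bin."""
    return (int(x_m // bin_size_m), int(y_m // bin_size_m))


def mean_sinr_per_bin(samples, bin_size_m):
    """Average SINR (arithmetic mean of dB values) for every bin with data.

    samples is an iterable of (x_m, y_m, sinr_db) measurements.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for x_m, y_m, sinr_db in samples:
        acc = sums[bin_index(x_m, y_m, bin_size_m)]
        acc[0] += sinr_db
        acc[1] += 1
    return {b: total / count for b, (total, count) in sums.items()}


# Illustrative measurements on a 50 m bin map.
measurements = [(10.0, 10.0, 5.0), (40.0, 20.0, 15.0), (60.0, 10.0, -3.0)]
kpi_map = mean_sinr_per_bin(measurements, bin_size_m=50.0)
```

The same accumulation pattern could hold other per-bin KPIs (for example HO failure counts), which is the kind of historical table a proactive algorithm would query.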
In addition to highlighting the areas with poor coverage or high interference, the database of a list of key KPIs can be utilized to propose and evaluate novel SON and AI based network automation features. For example by feeding the historical UE location, we can predict user location and can proactively perform inter-frequency HO to avoid low retainability KPI. Similar approaches can be designed and tested to achieve better load balancing in a multi-tier heterogeneous network.
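As a rough illustration of the bin map and KPI database idea, the sketch below aggregates per-sample SINR measurements into square bins of user-defined size and flags bins with poor average SINR. The function names, the record layout, the 0 dB threshold and the sample values are hypothetical and are not taken from SyntheticNET; averaging SINR directly in dB is also a simplification.

```python
from collections import defaultdict
from statistics import mean


def bin_index(x_m, y_m, bin_size_m):
    """Map a position in metres to an integer (column, row) bin."""
    return (int(x_m // bin_size_m), int(y_m // bin_size_m))


def build_kpi_database(samples, bin_size_m):
    """Aggregate (x, y, sinr_db) samples into per-bin mean SINR and count."""
    per_bin = defaultdict(list)
    for x_m, y_m, sinr_db in samples:
        per_bin[bin_index(x_m, y_m, bin_size_m)].append(sinr_db)
    return {b: {"mean_sinr_db": mean(v), "n": len(v)}
            for b, v in per_bin.items()}


# Hypothetical measurements: (x in m, y in m, SINR in dB).
samples = [(10.0, 10.0, 12.0), (40.0, 20.0, 8.0), (120.0, 30.0, -2.0)]
db = build_kpi_database(samples, bin_size_m=50.0)

# Bins whose mean SINR falls below an assumed 0 dB threshold.
poor_bins = [b for b, kpi in db.items() if kpi["mean_sinr_db"] < 0.0]
print(db, poor_bins)
```

The same per-bin dictionary could hold further KPIs (CQI, PRB usage, HO rate) keyed by the same bin index, so that downstream algorithms query a location rather than recomputing measurements.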

D. REALISTIC MOBILITY PATTERN
Existing 4G or even 5G simulators [11]-[23] are limited to much simpler and non-realistic mobility models such as random waypoint, the SLAW model, the Manhattan model etc. SyntheticNET simulator, on the other hand, incorporates realistic mobility patterns by integrating Simulation of Urban MObility (SUMO) [26]. SUMO is an open source, highly portable, microscopic and continuous road traffic simulation package designed to handle large road networks. It allows for more realistic simulation, including pedestrians, and comes with a large set of tools for scenario creation.
SUMO can help simulate a given traffic demand where the network scenario consists of individual vehicles moving through a given road network. Each vehicle can be modeled explicitly, has its own route, and moves individually through the network. Mobility patterns in SUMO are deterministic by default, but there are various options for introducing randomness. Randomness can be added to certain aspects of a test case scenario, including speed distribution, departure times, number of vehicles, vehicle type, route distribution etc. SUMO also supports traffic stops, departure speed, arrival speed, intersections, low priority yield lanes etc.
Thus, the SUMO empowered, highly realistic mobility modelling capability of SyntheticNET, which can also incorporate realistic road maps and mobility traces, makes SyntheticNET the first 5G simulator of its kind capable of investigating a large set of mobility management and optimization problems. A sample use case in the following section further elaborates the usefulness of realistic mobility modeling in SyntheticNET.

IV. A CASE STUDY USING SyntheticNET: AI-ASSISTED MOBILITY PREDICTION FOR HetNets
In this section, we illustrate the utility of SyntheticNET simulator through a case study that is not possible with simulators that do not realistically model mobility patterns, mobility management and HO procedures in the network.
This case study briefly shows how AI-assisted mobility prediction of mobile subscribers can be achieved through realistic traffic modeling obtained from SUMO. User mobility prediction can be one of the key enablers for AI based network automation and next generation proactive SON [28]. It can enable the reservation of network resources in the cells a user is predicted to visit, for a seamless HO experience [4]. It can also support traffic forecasting for load balancing [29], drive energy saving SON functions [5], [30], and help optimize battery life [3].
In the first stage, we set up the network deployment by importing the site-info containing the locations of several macro and small cells, along with other associated parameters (power, height, tilt, azimuth etc.). Then we feed the realistic user mobility traces taken from SUMO into the SyntheticNET simulator. To get the required user mobility data from SUMO, SyntheticNET first passes the network file and the population definition file to SUMO. The network file describes the roads and intersections where the simulated vehicles move during the simulation. The population definition file contains general statistical information, including the number of households, the locations of houses, schools and workplaces, the free time activity rate, etc. Mobile users by default travel from home to the workplace and vice versa. However, additional trips in which users visit an entertainment area or grocery shop, as per a defined percentage, are configured as well. The additional trips serve as a proxy for increasing randomness in user trajectories. Moreover, randomness in the daily user routes between home and workplace is configured as well.
SUMO then performs the simulation on the input data from SyntheticNET and generates realistic mobility patterns for the mobile users over the configured time interval. The realistic mobility traces generated by SUMO are then used for mobile users in the user mobility module of the SyntheticNET simulator. Along their trajectories, users perform HO as they move across cells. SyntheticNET simulator also keeps track of user location and serving cell ID, to be used as input to AI enabled solutions for mobility prediction.
In the exemplary scenario, we run the simulation for 10 days over a map of the city of Tulsa obtained from an open source map (see Fig. 9). Several macro BSs and small cells are deployed in the test area. After assigning the home, workplace and entertainment locations, we obtain realistic mobility traces with different degrees of randomness in user paths. Scenario 1 (SC1) represents zero randomness in user trajectories, whereas in the medium randomness scenario 2 (SC2), each user makes as many random trips to any entertainment location as trips between home and workplace. In the high randomness scenario 3 (SC3), in addition to the randomness in user trips defined in SC2, users follow different routes between home and office 50% of the time.
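One way to read the three randomness levels is sketched below as a toy trip generator for a single user-day. It is not the SUMO population configuration used in the case study: the function name, the number of alternative routes and entertainment venues, and the reading of "equal number" as one entertainment trip per commute trip are assumptions made for illustration.

```python
import random


def daily_trips(scenario, rng, n_routes=3, n_entertainment=4):
    """Return a list of (destination, route_id) trips for one user-day."""
    commute = ["work", "home"]
    trips = []
    for dest in commute:
        # SC3: an alternative home-work route is used 50% of the time.
        if scenario == "SC3" and rng.random() < 0.5:
            route = rng.randrange(1, n_routes)  # any non-default route
        else:
            route = 0  # default route
        trips.append((dest, route))
    if scenario in ("SC2", "SC3"):
        # SC2 and SC3: as many random entertainment trips as commute trips.
        for _ in commute:
            venue = rng.randrange(n_entertainment)
            trips.append((f"entertainment_{venue}", 0))
    return trips


rng = random.Random(0)  # fixed seed so the sketch is repeatable
for scenario in ("SC1", "SC2", "SC3"):
    print(scenario, daily_trips(scenario, rng))
```

In SC1 every day yields the same two commute trips, which is why the resulting trajectories are the easiest to predict.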
Finally, we run eXtreme Gradient Boosting trees (XGBoost) and Deep Neural Network (DNN) multi-class classification algorithms on the data obtained from the simulation scenarios described above. The data is split into 70% training data and 30% test data. Results of the AI-assisted mobility prediction techniques on the test data, for different shadowing standard deviations of RSRP, can be found in Fig. 10. Fig. 10 shows that a prediction accuracy of more than 90% can be achieved for certain scenarios. However, the prediction accuracy decreases as we increase the randomness in the training data.
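The evaluation procedure (70/30 split, train, score on held-out data) can be sketched with only the standard library. Note that the classifier below is a deliberately simple stand-in, a most-frequent-next-cell lookup, and not the XGBoost or DNN models used in the case study; the record format and the synthetic, fully repetitive cell sequence are likewise hypothetical.

```python
import random
from collections import Counter, defaultdict


def split_70_30(records, seed=0):
    """Shuffle and split records into 70% training and 30% test data."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = (7 * len(shuffled)) // 10
    return shuffled[:cut], shuffled[cut:]


def fit_next_cell(train):
    """For each current cell, remember the most frequent next cell."""
    counts = defaultdict(Counter)
    for current_cell, next_cell in train:
        counts[current_cell][next_cell] += 1
    return {c: ctr.most_common(1)[0][0] for c, ctr in counts.items()}


def accuracy(model, test):
    """Fraction of test records whose next cell is predicted correctly."""
    hits = sum(1 for cur, nxt in test if model.get(cur) == nxt)
    return hits / len(test)


# Hypothetical (current_cell, next_cell) records from a repetitive route,
# loosely analogous to a zero-randomness scenario.
records = [("A", "B"), ("B", "C"), ("C", "A")] * 40
train, test = split_70_30(records)
model = fit_next_cell(train)
print(len(train), len(test), accuracy(model, test))
```

Because every transition in this toy data is deterministic, the lookup predicts the held-out records perfectly; mixing in random trips would lower the score, which is the qualitative trend reported for Fig. 10.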
Mobility prediction will be one of the key enablers of AI enabled network automation, including next generation proactive SON solutions that aim at efficient resource management of emerging cellular networks. SyntheticNET simulator can help the research community implement their novel research ideas on a realistic 3GPP compliant network simulator and thus evaluate their proposals more reliably. A detailed description of the algorithm is beyond the scope of this paper and is therefore omitted here; see [31] for details.

V. FUTURE WORK
The viability, usefulness and uniqueness of SyntheticNET can only be ensured through continuous development effort. Below, we identify a few of the key features we will incorporate in SyntheticNET:
a) Matrix Pre-Calculation: Existing mobile network simulators take a significant amount of time when executed for a realistic network deployment with a considerably large number of both BSs and UEs. SyntheticNET will use a novel approach to lower this simulation processing time by pre-generating matrices of signal strength and signal quality across the deployed network while still maintaining the effect of shadowing. In a similar manner, UE mobility patterns will also be pre-generated. This approach can reduce simulation time by avoiding the calculation of the UE location and the UE association related signal indicator values every TTI.
b) Radio Link Failures: One of the most common problems experienced by mobile users is Radio Link Failure (RLF). RLF affects some of the core network KPIs such as retainability and throughput. A realistic implementation of RLF will be beneficial in learning why this event happens and in finding ways to avoid it. Thus, it is essential to capture and incorporate the RLF event in SyntheticNET.
c) Support of Complete List of Mobility Events (A1, A2, A3, A4, A5): 3GPP-standardized mobility events used in LTE will still be utilized in 5G. For a simulator that supports both legacy and futuristic networks, this necessitates intricate modeling and implementation of the most commonly used HO parameters in SyntheticNET. This will ease experimentation to learn how KPIs behave with changes in these parameters.
d) Load Balancing Algorithms: Though heavily researched, load balancing is still a challenge in today's cellular networks. Since 3GPP leaves the load balancing algorithm open to innovation, SyntheticNET will model the approach being used by major telecom vendors. Moreover, we will develop new load balancing features and will evaluate the efficacy of the developed algorithms by comparison with the load balancing algorithms currently employed by major telecom vendors.
e) Idle Mode Mobility: Modeling of idle mode users is left untouched in most available simulators. However, it is essential to model these users in order to realistically capture the network dynamics. Even though idle mode users do not transmit any data, they do use signaling, which affects the network, especially in the uplink direction. With that in mind, SyntheticNET will include modeling of idle mode UEs to capture key KPIs such as signaling, battery consumption and accessibility.
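The matrix pre-calculation idea in item a) can be sketched as a lookup table of received power per bin and BS, computed once and then queried by UE position. This is only an illustration under stated assumptions: the log-distance path loss model and its constants, the function names and the two-BS layout are hypothetical, and shadowing is omitted here, although a fixed per-bin shadowing term could be added to the stored values in the same way.

```python
import math


def path_loss_db(d_m, pl0_db=40.0, exponent=3.5):
    """Illustrative log-distance path loss (assumed constants), d >= 1 m."""
    return pl0_db + 10.0 * exponent * math.log10(max(d_m, 1.0))


def precompute_rx_power(bs_xy, tx_dbm, n_cols, n_rows, bin_m):
    """Return table[row][col] -> list of received power (dBm) per BS."""
    table = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            cx, cy = (c + 0.5) * bin_m, (r + 0.5) * bin_m  # bin centre
            row.append([tx_dbm - path_loss_db(math.hypot(cx - bx, cy - by))
                        for bx, by in bs_xy])
        table.append(row)
    return table


def best_server(table, x_m, y_m, bin_m):
    """Look up the strongest BS for a position without recomputing path loss."""
    powers = table[int(y_m // bin_m)][int(x_m // bin_m)]
    return max(range(len(powers)), key=powers.__getitem__)


# Hypothetical layout: two BSs, 400 m apart, on an 8 x 2 grid of 50 m bins.
bs_xy = [(0.0, 0.0), (400.0, 0.0)]
table = precompute_rx_power(bs_xy, tx_dbm=43.0, n_cols=8, n_rows=2, bin_m=50.0)
print(best_server(table, 20.0, 10.0, 50.0), best_server(table, 390.0, 10.0, 50.0))
```

During the simulation loop, each per-TTI association step then reduces to an index computation and a list lookup, at the cost of spatial resolution limited by the bin size.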

VI. CONCLUSION
The importance of a realistic yet practical simulator adhering to the 3GPP standard for cellular networks is underscored by the expected complexity of 5G and beyond networks. However, currently available simulators are limited by excessive simplifications and unrealistic assumptions, and lack implementations of vital network features, making them insufficient for capturing the complexity and dynamics of a real cellular network. To address these challenges, we have developed the first 3GPP 5G standard (Release 15) compliant network simulator, called SyntheticNET. SyntheticNET provides a more realistic and practical evaluation of different network scenarios as well as an implementation of several key network features.
Unlike existing OOP based simulators, where BS locations depend on an underlying distribution and cannot be preassigned, SyntheticNET simulator is microscopic: individual elements (BS and UE) of the network can have unique and hard coded parameters (azimuth, tilt, antenna pattern, height, transmission power etc.), as is the case in an actual network deployment. With the modular approach of SyntheticNET simulator, it is straightforward to extend the already implemented network functionalities with 3GPP Release 16 and upcoming updates, making this simulator future proof. With the flexible implementation of SyntheticNET simulator, it is possible to simulate large-scale networks with several thousand active heterogeneous BSs and several user types, without the need for specialized simulation hardware.
SyntheticNET simulator is the first and only simulator built to date that models more than 20 parameters essential to implementing a detailed 3GPP-based HO process. With the added support of realistic user mobility traces, vital mobility KPIs such as retainability and HO success rate can be precisely evaluated. In addition to mobility, other key components of SyntheticNET simulator include ray tracing based models for accurate signal strength calculation, and an adaptive frame structure to help meet the requirements of several 5G use cases (eMBB, URLLC, mMTC).
SyntheticNET simulator is the first Python based simulator, with the inherent ease of processing, manipulating and analyzing large data sets that Python offers. Similarly, it has easy access to a wide range of machine learning algorithms. This makes it relatively easy to implement and evaluate, in SyntheticNET simulator, AI based solutions for autonomous configuration and optimization of network parameters in a given multi-tier heterogeneous network deployment, making it beneficial for the research community and industry alike.
The presented use case on mobility prediction showcased the power of SyntheticNET in providing a practical network deployment, a detailed handover procedure, and the ease of incorporating realistic mobility patterns from other sources. Together, these enable a realistic evaluation of several machine learning techniques for predicting user mobility, which would have been impossible or inaccurate using the currently available network simulators.
HASAN FAROOQ (Member, IEEE) received the B.Sc. degree in electrical engineering from the University of Engineering and Technology, Lahore, Pakistan, in 2009, the M.Sc. degree in information technology from Universiti Teknologi Petronas, Malaysia, in 2014, and the Ph.D. degree in electrical and computer engineering from the University of Oklahoma, USA. He is currently working with Ericsson, USA. His research focused on developing ad-hoc routing protocols for smart grids. His research areas are big data empowered proactive self-organizing cellular networks focusing on intelligent proactive self-optimization and self-healing in HetNets utilizing dexterous combination of machine learning tools, classical optimization techniques, stochastic analysis, and data analytics. He has been involved in multinational QSON Project on self-organizing cellular networks and is currently contributing to two NSF funded projects on 5G SON. Dr. Farooq was a recipient of the Internet Society First Time Fellowship Award toward Internet Engineering Task Force 86th Meeting, USA, in 2013.
ALI IMRAN (Senior Member, IEEE) received the B.Sc. degree in electrical engineering from the University of Engineering and Technology, Lahore, Pakistan, in 2005, the M.Sc. degree (Hons.) in mobile and satellite communications, and the Ph.D. degree from the University of Surrey, Guildford, U.K., in 2007 and 2011, respectively. He is currently an Assistant Professor of telecommunications with the University of Oklahoma, USA, where he is the Founding Director of Big Data and Artificial Intelligence Enabled Self Organizing Research Center. He has been leading several multinational projects on Self-Organizing Cellular Networks such as QSON, for which he has secured research grants of over $2 million in the last four years, as a Lead Principal Investigator. He is currently leading two NSF funded projects on 5G SON amounting to over $750K. He has authored over 60 peer-reviewed articles and presented a number of tutorials at international forums, such as the IEEE International Conference on Communications, the IEEE Wireless Communications and Networking Conference, the European Wireless Conference, and the International Conference on Cognitive Radio Oriented Wireless Networks, on his topics of interest. His research interests include self-organizing networks, radio resource management, and big data analytics. Dr. Imran is an Associate Fellow of the Higher Education Academy, U.K. and a member of the Advisory Board to the Special Technical Community on Big Data of the IEEE Computer Society.