A Survey of Machine Learning Applications to Handover Management in 5G and Beyond

Handover (HO) is one of the key aspects of next-generation (NG) cellular communication networks that needs to be properly managed, since it poses multiple threats to quality-of-service (QoS), such as reduced average throughput and service interruptions. With the introduction of new enablers for fifth-generation (5G) networks, such as millimetre wave (mm-wave) communications, network densification, and the Internet of things (IoT), HO management is projected to become more challenging, as the number of base stations (BSs) per unit area and the number of connections have been rising dramatically. The stringent requirements newly released in the 5G standards multiply the level of the challenge. To this end, intelligent HO management schemes have been proposed and tested in the literature, paving the way for tackling these challenges more efficiently and effectively. In this survey, we aim at revealing the current status of cellular networks and discussing mobility and HO management in 5G alongside the general characteristics of 5G networks. We provide an extensive tutorial on HO management in 5G networks, accompanied by a discussion on machine learning (ML) applications to HO management. A novel taxonomy in terms of the source of data utilized in training ML algorithms is produced, where two broad categories are considered, namely visual data and network data. The state-of-the-art on ML-aided HO management in cellular networks under each category is extensively reviewed with the most recent studies, and the challenges, as well as future research directions, are detailed.

Even though these are sensible and effective methods of enhancing the capacity of cellular networks, a serious side effect immediately emerges: mobility management [16]. The common ground for the network densification and mm-wave communication concepts is that both lead to more frequent handovers (HOs), where an HO is defined as the user equipment's (UE's) change of channel, resource, or cell association while keeping an ongoing call or session. The underlying reason for this consequence is mainly the reduction of the footprint of BSs. First, in the case of network densification, the footprint is deliberately reduced with the use of small cells (SCs) in order to facilitate more BS deployments through frequency reuse. Second, concerning mm-wave communications, the footprint of BSs shrinks due to the higher propagation losses incurred at mm-wave frequencies (more dependency on line of sight (LOS)). Furthermore, the increased amount of bandwidth also shortens the range of mm-wave signals [20].
As such, the frequency of HOs grows due to the smaller footprints of BSs: mobile UEs need to perform more HOs, given that there are now more BSs in a certain environment. Given that the average throughput of a user is inversely related to the number of HOs [21], this issue has severe consequences in terms of communication quality, as it degrades the quality-of-service (QoS). Besides, as service interruptions are experienced during HOs, user satisfaction rates are also affected negatively, undermining the great promises of 5G networks. These adverse effects are mainly caused by two factors: 1) the number of HOs experienced during a call or data transfer session; and 2) the HO cost incurred for each HO experienced. In this regard, the research activities on HO management have predominantly focused on these two aspects, namely minimizing the number of HOs and/or the cost incurred per HO.
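To make the interplay of these two factors concrete, the following Python sketch uses a simple first-order model that is an assumption for illustration only and is not taken from [21]: the fraction of time lost to HOs is approximated as the HO rate multiplied by the interruption time per HO, and the effective throughput is the nominal rate scaled by the remaining fraction. The function name and all numerical values are hypothetical.

```python
def effective_throughput(rate_bps: float, ho_rate_per_s: float, ho_delay_s: float) -> float:
    """Nominal rate scaled by the fraction of time not spent in HO interruptions.

    Assumed model: HO cost = (HOs per second) * (interruption seconds per HO),
    capped at 1 so the throughput never becomes negative.
    """
    ho_cost = min(1.0, ho_rate_per_s * ho_delay_s)
    return rate_bps * (1.0 - ho_cost)


if __name__ == "__main__":
    nominal = 100e6  # 100 Mbit/s nominal rate (arbitrary example value)
    for ho_rate in (0.01, 0.1, 0.5):  # HOs per second (arbitrary example values)
        rate = effective_throughput(nominal, ho_rate, ho_delay_s=0.5)
        print(f"HO rate {ho_rate:4.2f}/s -> {rate / 1e6:6.1f} Mbit/s")
```

Under this assumed model, the throughput can be improved either by lowering the HO rate or by shortening the interruption per HO, which mirrors the two research directions identified above.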
Although the growing numbers of IoT devices and BSs, along with the increasing demand for data-oriented applications, have been discussed negatively so far, there are some positive impacts as well. The volume of data being generated by cellular networks is also growing considerably, making it a gold mine for network operators to exploit in such a way that more efficient management can be facilitated [16], [22]-[24]. In other words, although growing network sizes result in more complexity, the immense volume of generated data becomes a key to alleviating such complexity: this so-called challenge brings its own opportunity and solution. In that regard, machine learning (ML) techniques have gained significant attention in the field of wireless communications, since such an amount of data can be very well utilized for training ML models, which could help the networks gain experience and take proactive and more informed actions.
Therefore, in this survey, we focus on the application of ML algorithms to HO management in cellular networking, with special attention to 5G and B5G networks in order to keep the discussion timely, as 5G has already become a reality and visionary works about 6G have started to appear in the literature [25], [26]. A common ground of these studies, which try to outline the framework of 6G, is that they all agree that artificial intelligence (AI), and subsequently ML, will play a key role in 6G networks, as intelligence is expected to lie at the core of 6G networks. Moreover, terahertz (THz) frequencies have been projected to be used in 6G due to the abundance of bandwidth available at these frequencies [26]. However, this makes HO management even more crucial, as the THz band includes much higher frequencies than the mm-wave band, and therefore the smaller footprint (and more frequent HOs) will be much more significant. To this end, we review the state-of-the-art on ML-based HO management in cellular networks (with a special focus on 5G and B5G) by taking into account the data used during the implementation of such algorithms; a top-level taxonomy on the source of data generation is provided with two primary classes: visual data aided and wireless network data aided HO optimization. Visual data refers to data that is captured from the environment of interest in a visual format, such as images and video. The data in visual format is then used to assist the HO process by, for example, detecting objects/blockages affecting signal propagation [27]. Wireless network data, on the other hand, is any kind of data that can be acquired by the wireless network, including received signal strength, channel state information, BS traffic load, neighbouring information, user locations, etc.
As such, in addition to reviewing the most recent literature, to the best of the authors' knowledge, this is the only attempt to survey visual data assistance in HO management. Furthermore, discussions on HO management in legacy networks, including 3G and 4G, are omitted in this article, since a plethora of works surveying such networks is already available in the literature [5], [28]-[30].

A. OBJECTIVES AND CONTRIBUTIONS
As HO management is deemed one of the most severe design challenges of 5G and B5G mobile communication networks, in this survey paper, we aim at highlighting the current status of cellular communication networks as well as forthcoming issues related to HO management. Moreover, given that ML-assisted wireless communications have been projected to be at the heart of network management, this survey focuses primarily on ML applications to HO management in the next generations of cellular networks. In this regard, mobility management in 5G networks is thoroughly discussed, with a special interest in HO management, in order to reveal the distinctive mobility management policy included in the 5G standards, which makes it quite different from the legacy networks. Furthermore, we present the main characteristics of 5G networks, including mm-wave communications, heterogeneous networking, IoT, vehicular communications, device-to-device communications, and high-speed train communications, that make HO management even more challenging compared to the legacy networks. One of the most distinctive contributions of this paper is that we provide an outlook on HO management in B5G, especially 6G, where THz communications are projected to be a key component. HO management in THz communications is particularly covered in this work, since the transmission range at THz frequencies is quite short due to the large-scale molecular absorption loss [31], increasing the likelihood of HOs. Furthermore, as AI is considered to be very instrumental in designing 6G networks [25], [26], the validity of this work spans from 5G to B5G networks. In this regard, to the best of the authors' knowledge, this paper is one of the few attempts at discussing HO management in B5G networks, and with this we intend to produce a timely and novel survey paper that both reveals the current status and mentions futuristic applications/technologies.
After that, ML algorithms are categorized as supervised, unsupervised, and RL and briefly introduced, followed by discussions on ML-based HO management. Through these discussions, we aim at providing a basic understanding of the generic principles of the most popular ML algorithms as well as how those algorithms can be applied to the HO management process in cellular networks. Besides, the state-of-the-art on ML-aided HO management is extensively surveyed by reviewing the most recent studies in order to showcase the current status and opportunities. A top-level taxonomy is followed while reviewing the state-of-the-art, such that the ML-aided HO management methodologies are classified based on the source of the data they utilize. As such, two broad categories are encompassed: visual data based and wireless network data based HO management techniques. With this novel taxonomy, the major objective is to recognize the visual data aided HO management schemes, which have long been overlooked in the literature, by giving them a special place along with the traditional network data driven HO schemes. On the other hand, for network data based HO management, the most recent works are extensively reviewed under certain use cases: beam selection and BS selection. In addition, we briefly discuss how intelligent HO schemes can help in emergency situations in the case of mobile clinics, ambulances, and remote hospitals, which could also be beneficial for pandemic scenarios, such as the current COVID-19 pandemic.
Another objective of this paper is to present the grand challenges for the application of ML algorithms to HO management, through which we aim to address the current and future requirements of such implementations, and to identify possible research directions in order to make the ML integration to HO management in 5G and B5G more efficient, effective, and feasible. Therefore, with this section, we try to channel the research focus towards the identified topics to open a road for practical solutions.

B. RELATED WORKS
ML applications to self-organizing cellular networks were surveyed in [32], in which the authors provided in-depth coverage of the ML algorithms along with the characteristics of self-organizing networks. The authors presented the ML applications to cellular networks under the categories of the major functionalities in self-organizing networks: self-configuration, self-optimization, and self-healing. Various use-cases under each of the aforementioned functionalities were provided, leading to a comprehensive picture of ML implementations in cellular networks. Thus, while this work focused on ML implementations and included a brief discussion on HO management, it did not primarily focus on HO management in 5G networks; instead, it drew a comprehensive framework for ML applications to multiple domains of cellular networks, such as radio resource management, anomaly detection, backhauling, etc.
The work in [33] focused on the use-cases of mobility predictions, and provided an extensive review of the characteristics of mobility predictions (e.g., mobility predictability, user location, prediction output, and performance metrics) along with the methods of mobility prediction. Even though the review was not meant for ML alone, the methods covered are predominantly ML algorithms; hence, it can be classified as focusing on ML. However, the scope of the work is not limited to HO management (albeit being included as one of the use-cases), and visual data driven HO management is mostly ignored. The authors in [28] presented a brief survey on HO management in 5G and B5G networks, where they provided a background on 5G networks along with some enabling technologies, such as mm-wave communications, heterogeneous networks (HetNets), software-defined networking, and ML. Although the main focus of their work is HO management in the next generation of cellular networks, the authors reviewed the literature without extensive discussions. ML implementations were included only in general terms, and the authors did not provide an in-depth analysis of how ML can be incorporated in HO management in 5G and B5G networks. In addition, visual data aided HO management and HO in emergency scenarios were completely overlooked in their work.
A very comprehensive and detailed survey was given in [5], in which HO management was elaborated for both long-term evolution (LTE) and 5G networks with comparative discussions. HO procedures in both LTE and 5G were presented step-by-step, and HO types were covered in a detailed manner. The literature was also reviewed without any particular attention to ML algorithms; as such, even though some ML applications were mentioned while reviewing the state-of-the-art methods, the scope of the paper was solely HO management, not the ML applications to HO management. Similarly, an extensive review of mobility management in ultra-dense networks was given in [29]. In particular, the authors included a meticulous tutorial on mobility management in cellular networks, followed by discussions on proactive mobility management in the next generations of cellular networks, which comprise a brief introduction to various ML techniques. Furthermore, the authors included an analysis of AI assistance in mobility management, where they mainly reviewed the literature by identifying the employed AI methods and use-cases. This work seems to be one of the survey papers that overlaps most with our present work; however, the focus of the survey in [29] is broader, as it tries to encompass every single issue in mobility management. In our present work, on the other hand, the scope is kept limited to HO management in order to make the review more comprehensive in terms of HO management. Moreover, ML application is not the main focus in [29], whereas, in our present work, we try to exclusively analyse the integration of ML to HO management in cellular networks by discussing ML-oriented opportunities as well as challenges. Besides, the visual data based HO management covered in our present work constitutes one of its most important novelties and contributions, as it is not available in [29] or in any other survey paper on mobility or HO management in the literature.
Another comprehensive survey was conducted in [34] for mobility management in 5G HetNets. In particular, the authors provided a detailed tutorial on the radio resource control (RRC) states included in 5G NR along with the initial access and reachability. The RRC protocol is essential for cellular communication networks, and it performs several key functionalities including connection establishment/release, configuration/establishment/release of radio bearers (RBs), broadcasting of system information, etc. This topic is elaborated further in Section III-A. Connected mode mobility (i.e., HO) with various types of HOs was also elaborated, followed by beam level mobility management issues. ML implementations were not the main focus of their work, and thus the scope was primarily kept on mobility management. As such, since ML was not the target, the sources of data generation (visual data and wireless network data) were also not discussed. The authors in [35] analysed femtocell HOs in HetNets and provided a detailed background on the LTE HO procedure with a particular interest in femtocell HOs. After identifying some challenges with the HO decision process in two-tier networks, an inclusive review was conducted on the existing HO decision techniques. Although the paper was meant for 5G networks, the main discussion was centred around LTE networks, as there was no particular discussion on mobility management in 5G networks. In addition, the scope was quite limited, as only HO decision techniques were discussed, and even though some of the cited literature included ML implementations, ML was not the main consideration.
A succinct survey on HO-oriented mobility management was conducted in [30], where the authors provided very generic discussions on mobility and HO mechanisms in HetNets. In particular, mobility management was divided into location management and HO management, and each group was elaborated subsequently. However, 5G or B5G cellular communication networks were not mentioned, and no special HO management scheme, such as ML-based HO management, was provided. Therefore, our present survey on intelligent HO management is more advanced compared to the one in [30] in terms of style, the methodology being followed, and comprehensiveness. Another brief survey on mobility management in 5G networks was given in [36]. An overview of the generations of cellular networks from 1G to 5G was first presented, followed by discussions on the 5G structure and mobility management in 5G networks. HO management was also reviewed by introducing different types of HOs as well as HO parameters. However, the discussions were kept very short, and in-depth coverage was not provided for 5G networks or mobility management. Furthermore, the author did not intend to build the survey around the ML applications to HO management.
The authors in [37] produced an extensive survey on mobility management by questioning the readiness of the state-of-the-art solutions for the next generations of cellular networks, namely 5G and B5G. First, the requirements of the next generations of cellular networks in terms of mobility management were identified, followed by an introduction of their own qualitative performance metrics for the existing mobility management solutions. Moreover, a discussion on the effectiveness and sufficiency of the standards for both legacy networks and 5G, as well as the research activities for meeting these requirements, was included in their work. Lastly, potential enabling technologies and existing challenges were reviewed in detail. Compared to our present survey: i) the authors did not focus only on HO management, ii) ML applications were not the main focus, although a mild discussion on deep learning was included, and iii) visual data assistance in HO management and HO management in emergency scenarios were not covered.
A tabular overview of the relevant survey papers on mobility and HO management is given in Table 1, where the included works are analyzed in terms of their focus on 5G and B5G networks, HO management, ML applications to HO management, the use of visual data for HO management, and HO management in emergency scenarios.

C. PAPER ORGANIZATION
The remainder of the paper is structured as follows: the basics of 5G networks along with mobility management-oriented characteristics of 5G networks, including HetNets, IoT, vehicular communications, device-to-device communications, and high-speed train communications, are presented in Section II, while Section III provides an inclusive discussion on mobility management in 5G. Section IV provides a comprehensive tutorial on HO management in 5G networks by detailing the HO types, requirements and performance metrics, and radio resource management. ML applications to HO management are elaborated in Section V with a brief introduction to different branches of ML (namely, supervised, unsupervised, and RL), followed by an extensive literature review on the state-of-the-art in ML-based HO management techniques. Section VI highlights the challenges that ML-assisted HO management schemes would confront, and identifies future research directions. Lastly, Section VII concludes the paper.

II. CHARACTERISTICS OF 5G AND BEYOND: SOME GENERAL CONCEPTS
This section presents the general concepts behind 5G and B5G systems in a cellular network. A short review of the architecture, channel characteristics, and various features and applications of 5G is presented.

A. 5G SYSTEM
Although this survey focuses on HO management in NR, it is useful to provide a brief overview of 5G's architecture, interfaces and connections to serve as a background. According to the 3GPP specifications for LTE and the new 5G systems [38]-[40], the next generation (NextGen) architecture is based on the network function (NF) instead of the network entity (NE) used in LTE. In LTE's core network (CN), also known as the evolved packet core (EPC), the appropriate network protocols and interfaces are defined among the entities for each network entity (e.g., the serving gateway (SGW) and the mobility management entity (MME)). In contrast, network protocols and interfaces in the 5G CN (5GC) are specified for each NF. The NF is the processing functionality in 5G networks, and it can be implemented in three ways [39]: 1) as a network element on dedicated hardware; 2) as a software instance running on dedicated hardware; or 3) as a virtualized function built on an appropriate platform, such as a cloud infrastructure. The advantage of the NF over the NE is that it dramatically decreases latency. This is achieved by carefully controlling the UE mobility scheme (e.g., tracking and paging procedures) and by separating the user plane (UP), also known as the data plane, which is the dedicated channel that carries the network user traffic, from the control plane (CP), which is responsible for routing data traffic through the network and for carrying out other control activities. This separation ensures that each plane's resources are scaled independently and that more NFs can be deployed in a distributed manner [41]. Fig. 1 shows the 5G system architecture along with NFs and reference points. A reference point shows the interaction between the services in two NFs (e.g., N4 is the reference point that connects the UPF and the SMF). The NF in the UP consists of the user plane function (UPF), acting as a gateway for the UE traffic passing through the RAN to external networks such as the Internet.
The UPF is responsible for packet routing and forwarding, packet inspection, QoS handling, packet filtering, and traffic measurement. Several NFs run in the CP, including the access and mobility management function (AMF), session management function (SMF), network slice selection function (NSSF), unified data management (UDM), policy control function (PCF), and authentication server function (AUSF). For further information on these functions, the reader is referred to [41]-[43]. Overall, the 5G architecture is divided into two parts, as shown in Fig. 2. The first part is the CN, whose components have just been discussed, while the second part is the NextGen Radio Access Network (NG-RAN). The NextGen NodeB (gNB) serves as the access point for the 5G network, transmitting CP and UP traffic originating from the N1, N2, and N3 reference interfaces, as shown in Fig. 1. The purpose of the ng-eNB is to provide Evolved Universal Mobile Telecommunications System (UMTS) Terrestrial Radio Access (E-UTRA) UP and CP protocol terminations for UEs. In addition, 5G technology also supports LTE via the ng-eNB, which allows existing 4G radio networks to coexist with the gNB. For example, if both LTE and 5G radio coverage are available, a 5G UE may use either LTE or 5G radio resources, and when there is no 5G coverage, LTE serves the 5G UE using the ng-eNB. The connection interface between the gNB and the ng-eNB is known as the Xn interface, while the NG interface connects the gNB/ng-eNB to the CN; more specifically, the NG user-plane part (NG-U) connects to the UPF and the NG control-plane part (NG-C) connects to the AMF. The last interface that needs to be mentioned is the radio frequency interface between the UE and the active gNB or ng-eNB, which is also known as the Uu interface. This interface supports a broad spectrum from low to high frequencies [44].
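As an illustrative aside, the reference points and interfaces named above can be captured in a small lookup table. The following Python sketch is not part of any 3GPP specification or software library: the names REFERENCE_POINTS and interfaces_of are hypothetical, only a handful of interfaces are listed, and the endpoints reflect our reading of the 5G system architecture (e.g., N4 between the SMF and the UPF, Uu between the UE and the NG-RAN).

```python
# Illustrative lookup of a few 5G reference points/interfaces and the two
# endpoints each one connects. REFERENCE_POINTS and interfaces_of are
# hypothetical names chosen for this sketch, not a standard API.
REFERENCE_POINTS = {
    "N1": ("UE", "AMF"),         # control signalling between UE and AMF
    "N2": ("NG-RAN", "AMF"),     # control-plane part of the NG interface
    "N3": ("NG-RAN", "UPF"),     # user-plane part of the NG interface
    "N4": ("SMF", "UPF"),        # session management control of the UPF
    "Xn": ("NG-RAN", "NG-RAN"),  # between NG-RAN nodes (gNB / ng-eNB)
    "Uu": ("UE", "NG-RAN"),      # radio interface
}


def interfaces_of(node: str) -> list[str]:
    """Return the sorted names of the interfaces that terminate at `node`."""
    return sorted(name for name, ends in REFERENCE_POINTS.items() if node in ends)


if __name__ == "__main__":
    for node in ("UE", "NG-RAN", "AMF", "UPF"):
        print(node, "->", interfaces_of(node))
```

Such a table makes it easy to see, for instance, that the UPF only terminates user-plane and session-control interfaces, which is consistent with the CP/UP separation discussed above.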

B. CHANNEL CHARACTERISTICS OF 5G WIRELESS SYSTEMS
As mentioned earlier, 5G systems use mm-wave frequencies, along with sub-1 GHz and sub-6 GHz spectrum. It is envisaged that B5G networks will use THz frequencies [45], [46]. Compared to the sub-6 GHz band, the advantages of the mm-wave band include more available bandwidth and the use of small antennas in devices. Antenna size is inversely proportional to frequency; therefore, mm-wave antennas for UEs and BSs are small and can be placed in small devices. However, the mm-wave band has some drawbacks that necessitate the use of sub-6 GHz frequencies in 5G. In this subsection, we present the rationale for the co-existence of multiband frequencies in 5G, as well as the characteristics and applications of different spectrum bands from sub-1 GHz to mm-wave.
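The inverse relation between antenna size and frequency follows from the wavelength, lambda = c / f, since typical antenna elements scale with a fraction of the wavelength. The short Python sketch below computes the wavelength and a half-wavelength element length for a few example carrier frequencies; the half-wavelength sizing rule and the chosen frequencies are illustrative assumptions, not design values from the surveyed literature.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def wavelength_mm(freq_ghz: float) -> float:
    """Free-space wavelength in millimetres for a carrier frequency in GHz."""
    return C / (freq_ghz * 1e9) * 1e3


if __name__ == "__main__":
    # Example carriers: sub-1 GHz, sub-6 GHz, and two mm-wave frequencies.
    for f_ghz in (0.7, 3.5, 28.0, 73.0):
        lam = wavelength_mm(f_ghz)
        # Half-wavelength element length, a common rule of thumb (assumption).
        print(f"{f_ghz:5.1f} GHz: wavelength {lam:7.2f} mm, half-wave element {lam / 2:7.2f} mm")
```

At 28 GHz the wavelength is roughly 10.7 mm, compared with roughly 85.7 mm at 3.5 GHz, which illustrates why many antenna elements can fit into a small device at mm-wave frequencies.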

1) SUB-1 GHz AND SUB-6 GHz IN 5G
In its early phases of implementation, 5G's main spectrum options were around 3.5 GHz and 4.5 GHz for sub-6 GHz with time division duplexing (TDD) technology. For the 3.5/4.5 GHz band, 5G aims to use existing BSs to help in the roll-out and implementation [47]. The 3.5 GHz band provides comparatively less coverage than the 2 GHz band used in legacy networks, because propagation loss increases as frequency increases. However, introducing MIMO beam-forming antennas at 3.5 GHz and higher spectrum offsets these propagation losses, thereby significantly increasing coverage for 3.5/4.5 GHz.
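The frequency dependence of coverage can be illustrated with the free-space path loss (FSPL) formula, FSPL(dB) = 20 log10(4 pi d f / c). The Python sketch below is a simplified illustration under a free-space assumption; real deployments experience additional losses (shadowing, penetration, clutter) that are not modelled here, and the link distance is an arbitrary example value.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def fspl_db(distance_m: float, freq_hz: float) -> float:
    """Free-space path loss in dB: 20*log10(4*pi*d*f/c)."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m * freq_hz / C)


if __name__ == "__main__":
    d = 500.0  # example link distance in metres (arbitrary choice)
    for f_ghz in (2.0, 3.5, 4.5, 28.0):
        print(f"{f_ghz:5.1f} GHz: {fspl_db(d, f_ghz * 1e9):6.1f} dB")
    extra = fspl_db(d, 3.5e9) - fspl_db(d, 2.0e9)
    print(f"Extra free-space loss at 3.5 GHz relative to 2 GHz: {extra:.2f} dB")
```

Under this model, moving from 2 GHz to 3.5 GHz adds about 4.9 dB of loss at any fixed distance, which is the order of magnitude that beam-forming gain would need to recover in order to maintain comparable coverage.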
The sub-1 GHz bands are also used through frequency division duplexing (FDD) in 5G, especially for deep indoor penetration [48]. With their broader coverage, low data rate IoT connectivity and other critical communications, such as remote control or automotive applications, can be introduced. Extensive coverage is imperative for these new use cases, which can therefore be served by the sub-1 GHz band [48]-[50].

2) mm-WAVE IN 5G
Wave propagation at mm-wave frequencies is more prone to the adverse effects of obstacles, such as moving people, trees, and foliage in outdoor scenarios, and furniture and walls in indoor scenarios. Since the mm-wave spectrum is severely affected by rain and other atmospheric conditions, previous studies suggested that it was impractical to use this frequency band for mobile communications. However, this has been proven wrong, as recent studies have shown that atmospheric absorption does not create a significant loss when mm-wave is used in picocells (coverage below 200 m from the transmitter) [51], [52]. These studies also show that even under very extreme rainfall, the rain attenuation would cause 1.4 dB and 2 dB loss at 28 GHz and 73 GHz, respectively. The impact of rain attenuation on mm-wave propagation, especially in urban picocell areas, will therefore be insignificant [52]. The short-range coverage of mm-wave has both advantages and disadvantages. Spatial reuse of the frequency band and strong multi-path behaviour due to reflection are among the advantages of using mm-wave, while one of the disadvantages is that many SCs are required to provide coverage due to the high propagation loss of mm-wave.
mm-wave is an inherently directional wave, which means that the transmitter and receiver need to focus their beams towards each other; this is commonly known as beam steering. The main advantage of beam steering is the high gain achieved by focusing the transmitter and receiver towards each other. Beam steering is completed through a beam training/tracking process. Beam training is the process of finding the desired beam to connect the UEs in order to reduce the initial access delay. Another critical parameter to consider is sensitivity to blockage. mm-wave has a higher frequency, making its wavelength small compared to many physical objects, and thus the low ability of mm-wave to diffract around large objects makes it sensitive to blockage. For example, in the 60 GHz band, a 20-35 dB increase in the path-loss is observed if an obstacle (e.g., a human or furniture) is introduced into the path of the mm-wave link [52].
While it has been demonstrated that using mm-wave frequencies such as 28 GHz and 38 GHz is possible even in complex urban environments, many challenges associated with HO, such as low throughput and high signaling overheads, still need to be addressed to realize the full potential of the mm-wave band [52], [53].
3) CO-EXISTENCE OF SUB-1 GHz, SUB-6 GHz AND mm-WAVE
Given the strict transmission efficiency requirements of certain use cases, such as vehicular networks, the use of mm-wave poses some significant difficulties in implementing reliable yet high data rate communication. Critical IoT applications, including remote healthcare systems (for clinical remote monitoring and assisted living), traffic and industrial control (drone/robot/vehicle), and the tactile Internet, require higher availability, higher reliability, safety, and lower latency to ensure the end-user experience, as failure to satisfy these requirements would result in severe consequences, such as vehicle collisions and accidents [15].
A CP and UP decoupled network is designed to circumvent these challenges by using sub-6 GHz frequencies for the CP and mm-wave frequencies for the UP [54]. Using the sub-6 GHz spectrum guarantees that signaling from the CP reaches the UE with high reliability. On the other hand, the use of mm-wave frequencies for the UP provides unprecedented data speeds, due to the vast bandwidth available in the mm-wave spectrum. Therefore, the main purpose of the sub-6 GHz and sub-1 GHz bands is to provide uninterrupted access to the CP or to provide coverage for areas where mm-wave cannot offer adequate coverage.

C. HETEROGENEOUS NETWORKS
HetNets comprise the deployment of BSs with different sizes. They are an attractive low-cost approach to meet the industry's growth requirements and offer a consistent connectivity experience. A HetNet comprises SCs that support aggressive spatial reuse of spectrum coexisting within macrocells, as shown in Fig. 3. A macrocell is a BS used in cellular networks to provide radio coverage for mobile network access over a large area. The macrocell overlaps several SCs and has a high output power, usually in the range of tens of watts. However, the macrocell suffers from interference caused by the use of sub-6 GHz signals, which travel far by nature. Since the macrocell transmits radio waves over a long distance, signal interference with other cells is very likely if not managed properly, which in turn could result in the degradation of network performance [55]. In addition, the macrocell has a low area spectral efficiency, typically measured in (bit/s/Hz) per unit area, which results in less bandwidth and a lower data rate per UE. The data rate is a function of bandwidth, and SCs allow frequency reuse due to their limited range, hence more bandwidth and a higher data rate per UE. Therefore, to increase the data rate, the idea of reducing the BS footprint relative to that of macrocells was introduced [56].
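The dependence of the per-UE data rate on bandwidth and frequency reuse can be illustrated with the Shannon capacity, C = B log2(1 + SNR). The Python sketch below compares one macrocell sharing a band among all UEs with several SCs that each reuse the same band for fewer UEs. It rests on strong simplifying assumptions that are ours for illustration: equal bandwidth sharing among the UEs of a cell, the same SNR in both cases, and no inter-cell interference. All numerical values are arbitrary examples.

```python
import math


def shannon_capacity_bps(bandwidth_hz: float, snr_db: float) -> float:
    """Shannon capacity B*log2(1 + SNR) for a given bandwidth and SNR in dB."""
    snr_linear = 10.0 ** (snr_db / 10.0)
    return bandwidth_hz * math.log2(1.0 + snr_linear)


def per_ue_rate_bps(bandwidth_hz: float, snr_db: float, ues_per_cell: int) -> float:
    """Per-UE rate when a cell's capacity is shared equally among its UEs (assumption)."""
    return shannon_capacity_bps(bandwidth_hz, snr_db) / ues_per_cell


if __name__ == "__main__":
    band_hz, snr_db, total_ues = 20e6, 10.0, 100  # arbitrary example values
    macro = per_ue_rate_bps(band_hz, snr_db, total_ues)
    n_small_cells = 10  # each SC reuses the full band for total_ues / 10 UEs
    small = per_ue_rate_bps(band_hz, snr_db, total_ues // n_small_cells)
    print(f"Macrocell only: {macro / 1e6:.2f} Mbit/s per UE")
    print(f"{n_small_cells} small cells: {small / 1e6:.2f} Mbit/s per UE")
```

In this idealized setting, the per-UE rate grows linearly with the number of cells reusing the band; in practice, inter-cell interference and the more frequent HOs discussed earlier reduce this gain.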
In order to address the challenges facing the macrocell, the simplest way was to bring the transmitter and receiver closer to each other. This solution creates the dual benefit of high-quality links and more spatial reuse. Cells of different sizes are considered based on the transmitted power and the frequency of transmission used: macrocell, microcell, picocell, and femtocell are popular cell types created by gradually reducing the coverage range and transmission power. A summary of the types of cells in terms of coverage and capacity is presented in Table 2. As the BS footprint becomes smaller with smaller BSs, the use of mm-wave becomes more feasible. The mm-wave frequency suffers from high penetration loss, which brings the advantage of enabling the reuse of mm-wave frequencies in indoor environments for femtocells. For further studies on HetNets, please refer to [55], [57]-[59].

D. INTERNET OF THINGS
In this modern era, various applications used by billions of people are made available daily via the Internet, making it an essential tool for interconnecting these applications, as services such as video streaming, file sharing, and electronic commerce increasingly take place online. The types of interconnected devices include smartphones and IoT devices such as sensors and wearables. These IoT devices are able to communicate with each other to share information with little or no human involvement. Fig. 4 illustrates some common IoT use cases. As the number of IoT devices keeps increasing, the traffic generated by these devices also increases; hence, the underlying protocols that support IoT should be reconsidered to support the massive interconnection of both new and conventional devices [60], [61]. Conventional devices need to be made smarter by incorporating advanced technologies such as ubiquitous computing, artificial intelligence, embedded devices, different communication standards and technologies, various application services, and different Internet standards. However, the devices used in IoT are memory-limited and energy-limited, so information should be routed efficiently, and the channel between source and sink should be carefully chosen [61]. IoT has different use cases such as smart cities, smart homes, vehicular sensors, health monitoring, and sport and leisure scenarios. Several of these use cases are discussed in the following subsections, with a focus on the differences in the requirements of their application domains.

1) SMART CITIES
This involves the use of smart technologies to provide relevant information and automated services that improve the standard of living of the people in a particular area. These smart technologies include sensors deployed for traffic monitoring to prevent traffic jams and detect bad roads, automated street lights, smart grids, and waste management systems, as well as environmental sensors deployed in various locations to detect pollution, water level, or fire. In these cases, the early detection of abnormal environmental situations can be used to alert the appropriate authorities so that they can take the necessary actions when an incident occurs [15].

2) SMART HOME
This use case is sometimes classified as a part of smart cities. However, it is mostly limited to user-oriented applications, particularly for home networks [15]. Services that can be classified under the smart home use case include: (1) Connected home appliances: appliances such as smart fridges can automatically order food items or beverages when they detect that supplies are running out, by checking the amount of each item against a pre-defined threshold. (2) Home video monitoring: homes can be equipped with small cameras mounted in different locations, which can stream video to the Internet for remote monitoring. They can also send alarms upon detecting unusual movement or abnormal behavior, smoke, carbon monoxide, etc. in the monitored area.

3) HEALTHCARE/TELEMEDICINE/WEARABLE
This use case is becoming more popular as more devices such as watches and other wearable devices become increasingly available. Patients do not necessarily need to be monitored manually, but smart wearable devices track their health conditions for any abnormality. Such devices send an alarm message to a nearby hospital as soon as they detect anomalies with the patients being monitored.
All the use cases mentioned above face challenges that need to be addressed before IoT can become efficient and able to integrate the heterogeneous devices (devices with different communication standards, i.e., protocols, technologies, and hardware) and applications envisaged for 5G [62]. These challenges include scalability, network management, security and privacy, interoperability and heterogeneity, network congestion and overload, and network mobility and coverage. To interconnect a massive number of devices and accommodate the enormous traffic generated within the 5G system, the conventional sub-6 GHz spectrum is no longer sufficient, hence the need to utilize a new frequency band (mm-wave) [60]. This would lead to enhanced QoS for IoT devices.

E. DEVICE-TO-DEVICE COMMUNICATION
Device-to-device (D2D) communication involves direct communication between two devices without passing through a BS. These devices could be smartphones, vehicles, etc. This kind of communication usually occurs when both devices are in close proximity to each other [63]. The introduction of D2D communication is necessary to cope with the rise in the number of devices as well as the increasing demand for high-speed connections. It is one of the technologies being exploited in 5G and B5G networks, as its use would lead to enhanced link reliability, spectral efficiency, system capacity, and energy efficiency, as well as reduced network delays [64]. The use of mm-wave in 5G would facilitate D2D communications as more direct links would be supported, thereby enhancing the capacity of the network. In addition, due to the directional nature of 5G antennas, it would be possible to support more simultaneous connections in mm-wave systems. Despite the inherent advantages of D2D communications, UE mobility and the fact that the UEs still need to connect to the BS in order to transmit control signals mean that HO needs to be carefully considered in order to prevent ping-pong effects, which result in frequent HOs [65], [66].

F. VEHICULAR COMMUNICATIONS
Vehicle-to-everything (V2X) is a special case of D2D communication. It is a technology that provides communication between vehicles and surrounding devices, including hand-held devices, moving/stationary cars, and all other external IoT appliances. V2X is categorized into two main components: vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I). The former allows communication between two or more vehicles, whereas the latter deals with communication between vehicles and other devices in their external environment, such as traffic/street lights [67]. The most common communication protocol supporting vehicular networks is dedicated short-range communication (DSRC), which can support the full V2X architecture. DSRC uses 75 MHz of bandwidth in the 5.9 GHz band, and was expected to provide data rates up to 27 Mb/s and a transmission range up to 1000 m [67], [68]. As a result of the high mobility of vehicles, one of the major challenges that vehicular networks suffer from is HO. As vehicles move from one location to another, they often move out of the coverage area of one network access point, known as a road side unit (RSU), and into that of another, leading to frequent changes of connection between RSUs. This issue would become more pronounced with the use of mm-wave in 5G, as the coverage area of the RSUs would become smaller [69]. Hence, HO management must be carefully considered for fast-moving vehicles in 5G mm-wave communication networks in order to ensure seamless HO.

G. HIGH SPEED TRAIN COMMUNICATION
High speed train (HST) communication is one of the verticals that would be supported by 5G networks. The availability of large spectrum in the mm-wave frequency band would make it possible to provide enhanced mobile broadband services to passengers in high speed trains [49]. HST communication networks encompass two kinds of communications: critical and non-critical. The former is the communication between the HST and its associated infrastructure, and is necessary to control the speed and ensure the safety, reliability, and smooth functioning of the HST. The latter is required to provide on-board passengers with services such as high-quality video and other data services [70]. Even though mm-wave has great potential for application in HST communications, the high mobility of trains makes HST communication prone to frequent HOs and fast-fading channels, which potentially undermine its availability. As a result, new technologies such as hybrid beamforming, beam management, network slicing, and distributed antenna systems have been introduced in mm-wave communications to enhance its application in HST communications [50].

H. BEYOND 5G SYSTEM
5G is a game-changer, as it can provide data rates up to tens of gigabits per second, far beyond what legacy networks provide [71]. However, with the introduction of new use cases and applications such as virtual and augmented reality, remote surgery, and holographic projection, 5G would not be able to meet the projected explosion in wireless data demand. As a result, research into higher frequencies (beyond mm-wave) has intensified, and the THz band has become the focus of B5G researchers as the new spectrum for B5G systems. Only frequency bands in the THz range can provide the large bandwidth needed for the terabit-per-second data rates that would support the heavy traffic types, such as uncompressed video, envisioned in B5G networks [46], [72].
The use of the THz band in 6G is needed to provide the reliable communication required to support various critical applications, accommodate high data rates per area, and support massive numbers of connected UEs. The THz frequency band has characteristics quite similar to those of mm-wave. However, because it is higher in frequency than mm-wave, it would be prone to all the challenges facing mm-wave alongside additional ones. Therefore, there is a need for more advanced error control mechanisms, mobility management techniques, as well as other new features to enable the utilization of this frequency band in mobile cellular networks.

III. MOBILITY MANAGEMENT IN 5G
Mobility management in 5G is quite different from that of legacy networks (2G-4G). In this section, we present the concepts behind radio access mobility in 5G cellular networks. We also briefly explain the mobility state procedures in the 5G system that make it more efficient than legacy systems.
Definition 1 (Access Stratum): Access stratum (AS) is the set of protocols in 5G that contains the functionality associated with the UE's access to the RAN and the control of active connections between a UE and the RAN.
Definition 2 (Non-Access Stratum): Non-access stratum is the set of protocols in 5G that handles functionality operating between UE and CN.
Definition 3 (RRC Context): The RRC context is the set of parameters necessary for establishing/maintaining communication between the UE and the CN.
Definition 4 (Cell Selection): Cell selection is the process of choosing a suitable cell (a cell whose measured attributes satisfy the cell selection criteria [73]) for the UE to camp on. This process is performed as soon as the UE is switched on [74].
Definition 5 (Cell Re-Selection): Cell re-selection is the process of choosing a suitable cell after the UE camps on a cell and stays in the idle or inactive state.

A. RADIO RESOURCE CONTROL STATE MACHINE
The RRC protocol resides at Layer 3 (the network layer) and is the protocol between the UE and NG-RAN, as specified in 3GPP TS 38.331 [75]. The RRC protocol's essential functions include: 1) broadcast of system information; 2) control of the RRC connection, a procedure that includes paging as well as the establishment, modification, and release of radio bearers (RBs), and also involves establishing an RRC context; and 3) measurement configuration and reporting, along with other functions specified in 3GPP TS 38.331, which are summarized in [43], [75]. The RRC's operation is guided by a state machine that defines the specific states in which a UE may be present. The states differ in the amount of radio resources that the UE can utilize once it enters a particular state. Since different amounts of resources are available in different states, the state machine impacts both the QoS that the user experiences and the energy consumption of the UE [76]. In addition, RRC states provide a clear distinction between HO and cell (re-)selection. The UE can be in one of three RRC states: RRC_Idle, RRC_Connected, and RRC_Inactive. Fig. 5 depicts the UE state machine and state transitions in 5G, while Table 3 summarizes the RRC protocols and functions in each RRC state.

1) RRC_IDLE
In the RRC_Idle state, the UE is not registered to a particular cell; hence, it does not have an AS context or receive any network information. This means that no specific link is established for communication between the UE and CN, and the UE does not belong to any specific cell. From the CN perspective, the UE is in the CN_Idle state. The UE stays in (a kind of) sleep mode and wakes up periodically, according to a configured discontinuous reception (DRX) cycle, to listen for paging messages from the network through the downlink channel. During this period, no data transfer takes place, and the UE enters sleep mode regularly to reduce battery consumption. The network can reach UEs in the RRC_Idle state by sending paging messages to notify them of changes in system information and of warning messages, such as the earthquake and tsunami warning service (ETWS) and the commercial mobile alert system (CMAS), which are sent as short messages. In this state, the UE manages mobility via cell re-selection based on the network configuration. It also performs the neighbouring cell measurements needed for cell re-selection in order to determine which cell to connect to (explained in Section III-B). To reduce the network signaling overhead and the latency experienced in legacy networks (such as LTE) during the transition to the RRC_Connected state, the RRC_Inactive state was introduced in 5G. In 5G, the network initiates the RRC release procedure to move a UE from the connected to the idle state. In addition, as the UE moves from the idle to the connected state, both the UE and the network establish the RRC context.

2) RRC_INACTIVE
5G-NR introduced the RRC_Inactive state based on lessons learned during the development of LTE. The findings revealed that the transition of wireless devices from the idle state to the connected state is the most frequent high-layer signaling event in existing LTE networks, occurring about 500 to 1,000 times a day. This transition involves a significant amount of signaling overhead between the UE and the network, as well as between network nodes, which can lead to increased latency and power consumption in the UE. The solution is to switch to the RRC_Inactive state, which results in a significant reduction in both latency and UE battery consumption. When the UE is in the inactive state, its behaviour is similar to that in idle mode in terms of power saving. However, unlike in the idle state, the RRC context is kept in both the UE and the gNB, and the UE is in the CN_Connected state from the CN perspective, meaning that its connection to the CN is kept intact. The primary purpose of the RRC_Inactive state is thus to reduce the network signaling load and latency involved in the transition from RRC_Idle to RRC_Connected. In the RRC_Inactive state, network signaling becomes faster since the AS context is stored in both the UE and the gNB. While the 5G CN connection is retained (the UE remains in the CN_Connected state), the UE in the RRC_Inactive state is in sleep mode and wakes up repeatedly, according to a configured DRX cycle (which in this case is controlled by the 5G-RAN), to monitor for paging messages from the network. The procedure for notifying UEs about any change of system information or warning message is the same as that of the idle state [78].

3) RRC_CONNECTED
In the RRC_Connected state, the RRC context and all parameters needed to establish communication between the UE and the RAN are known to all entities. This means that, in the RRC_Connected state, the network configures all parameters required for communication between the network and the UE. From the CN point of view, the UE is in the CN_Connected state. The cell to which the UE belongs and the UE's identity are known. In addition, the cell radio-network temporary identifier (C-RNTI), used for signaling purposes between the UE and the network, is configured. The connected state is intended for transmitting data to or from the UE; to minimize excessive power consumption of the UE, DRX is optimized while maintaining the user's quality-of-experience (QoE) [79]. With a configured DRX cycle, the UE only monitors downlink signaling when active, and goes into sleep mode for the rest of the time with the receiver circuitry turned off. This process allows a significant reduction in power consumption: the longer the DRX cycle, the lower the power consumption. For an exhaustive discussion on how DRX reduces excess power consumption, please refer to [78], [80]. Also, since the RRC context is established in the gNB in the connected state, data transmission/reception can commence relatively fast, as no connection setup and its associated signaling are needed. In this state, the network manages mobility through the process known as HO, explained in Section IV-E. As regards cell re-selection when leaving the RRC_Connected state, the UE attempts to camp on a suitable cell according to redirectedCarrierInfo when transitioning from the RRC_Connected state to the RRC_Idle or RRC_Inactive state [73]. In the connected state, if the network initiates the RRC release procedure or the UE and CN are no longer attached, the UE moves into the idle state; on the other hand, if the network initiates the RRC suspend procedure, the UE moves from the connected to the inactive state [73], [75] (see Fig. 5).
One significant difference among the different states, as seen from the preceding discussions, is the mobility mechanism involved. Efficient mobility management is an essential aspect of any mobile communication system. In the following subsections, we describe the different mobility mechanisms, including idle- and inactive-state mobility.
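As an illustration of the state machine described in this subsection, the following minimal Python sketch encodes the three RRC states, the transitions mentioned in the text, and the mobility mechanism that applies in each state. The trigger labels (`establish`, `release`, `suspend`, `resume`) and all identifiers are illustrative names chosen here rather than 3GPP message names, and the table is a simplification rather than a full rendering of TS 38.331.

```python
from enum import Enum


class RRCState(Enum):
    IDLE = "RRC_Idle"
    INACTIVE = "RRC_Inactive"
    CONNECTED = "RRC_Connected"


# (current state, trigger) -> next state.
# Trigger names are illustrative labels, not 3GPP message names.
TRANSITIONS = {
    (RRCState.IDLE, "establish"): RRCState.CONNECTED,
    (RRCState.CONNECTED, "release"): RRCState.IDLE,
    (RRCState.CONNECTED, "suspend"): RRCState.INACTIVE,
    (RRCState.INACTIVE, "resume"): RRCState.CONNECTED,
    (RRCState.INACTIVE, "release"): RRCState.IDLE,
}

# Per-state summary taken from the text:
# (mobility mechanism, entity that initiates paging or None).
MOBILITY = {
    RRCState.IDLE: ("cell re-selection", "CN"),
    RRCState.INACTIVE: ("cell re-selection", "5G-RAN"),
    RRCState.CONNECTED: ("handover", None),
}


def next_state(state, trigger):
    """Return the state reached from `state` on `trigger`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, trigger)]
    except KeyError:
        raise ValueError(f"no transition from {state.value} on {trigger!r}")


if __name__ == "__main__":
    state = RRCState.IDLE
    for trigger in ("establish", "suspend", "resume", "release"):
        state = next_state(state, trigger)
        print(trigger, "->", state.value, MOBILITY[state])
```

Keeping the transitions in a lookup table makes it easy to see that the idle state can only be left through connection establishment, whereas the inactive state can return to the connected state without rebuilding the RRC context.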

B. IDLE AND INACTIVE STATE MOBILITY
Most importantly, RRC states ensure that the mobile UE is reachable via network mobility mechanisms, especially when the UE is in the idle or inactive state, during which it has a limited connection to the network. Through paging, the network communicates with the UE occasionally, and also sends short broadcast messages that carry information about changes in the system [73]. The area over which a paging message is sent is an essential feature of the paging process. Also, in both states, the device can switch from one cell to another via cell re-selection. The UE scans for candidate cells for cell re-selection, and if it discovers a cell with received power sufficiently higher than that of its current one, it deems this the best cell and contacts the network through random access [73]. UE tracking needs to be carried out intelligently to avoid the high overhead due to paging and signaling at the network and cell level, respectively [81]. Hence, a cell-group-level tracking system was introduced in 5G-NR to tackle this challenge. Fig. 6 illustrates how UE tracking in the idle and inactive states is carried out in 5G-NR. To enable effective UE tracking, the cells are organized into cell groups, and the UEs are only tracked at the cell-group level, as shown in Fig. 6. The network only receives new UE location information when the UE moves into a cell group other than its previous one. When paging the UE, the paging message is broadcast to all cells within the specific cell group, which reduces the paging overhead. This is the primary tracking procedure in NR for both states. However, the two states differ in how cells are grouped and in how paging is initiated.
For the idle state, cell groups are grouped into RAN areas, where a RAN area identifier (RAI) identifies each RAN area. The RAN areas, in turn, are grouped into even larger groups known as tracking areas, each identified by a tracking area identifier (TAI). Thus, each cell belongs to one cell group, which in turn belongs to one RAN area and one tracking area, and their respective identities are provided as part of the cell system information.
Tracking areas are the basis for CN-based UE tracking, and the CN is responsible for managing and initiating paging. The CN assigns each UE to a UE registration area, which consists of a list of TAIs. When a UE enters a cell belonging to a tracking area outside its assigned registration area, it accesses the CN and performs a Non-Access Stratum (NAS) registration update. The CN records the UE's location and updates the UE's registration area, then it provides the UE with a new TAI list that includes the TAIs that the UE has been assigned. The UE is assigned a set of TAIs to avoid repeated NAS registration updates in case the UE moves back and forth between two neighbouring tracking areas. If the UE moves back to the old TAI within the updated UE registration area, no new update is needed.
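The effect of assigning a list of TAIs rather than a single TAI can be illustrated with a small simulation. The sketch below is a toy model under an assumed policy: after each registration update, the CN returns a list containing the new TAI plus the TAI the UE has just left. The actual composition of the list is a network choice that is not specified in the text, and the function and variable names are hypothetical.

```python
def count_registration_updates(path, keep_previous=True):
    """Count NAS registration updates along a sequence of visited TAIs.

    path          -- list of TAIs in the order the UE visits them
    keep_previous -- assumed CN policy: if True, the new registration area
                     contains the new TAI and the TAI just left; if False,
                     it contains only the new TAI
    """
    registration_area = {path[0]}
    updates = 0
    for previous_tai, current_tai in zip(path, path[1:]):
        if current_tai in registration_area:
            continue  # still inside the registration area: no update needed
        updates += 1  # UE accesses the CN and performs a registration update
        if keep_previous:
            registration_area = {current_tai, previous_tai}
        else:
            registration_area = {current_tai}
    return updates


if __name__ == "__main__":
    # UE ping-pongs between two neighbouring tracking areas, then moves on.
    path = ["TA1", "TA2", "TA1", "TA2", "TA3"]
    print(count_registration_updates(path, keep_previous=True))
    print(count_registration_updates(path, keep_previous=False))
```

With the TAI list, the back-and-forth movement between TA1 and TA2 triggers a single update, whereas a single-TAI registration area would trigger one on every boundary crossing.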
In the inactive state, the RAN area becomes the basis for UE tracking, which is carried out at the 5G-RAN level. The 5G-RAN is responsible for initiating paging and managing the RAN-based notification area (RNA). UEs are assigned an RNA comprising one of the following: a list of cell identities, a list of RAN areas, or a list of tracking areas. The RNA is assigned to a UE by its serving NG-RAN based on the UE's registration area and can cover a single cell or multiple cells (a subset of the tracking areas). As a result, the UE can move freely within the allocated RNA without contacting the NG-RAN. However, if it moves to an area outside its current RNA, it initiates a RAN-based notification area update (RNAU). Once the serving cell (ng-eNB or gNB) receives the RNAU request from the UE, it may send the UE to one of the following RRC states: RRC_Inactive, RRC_Connected, or RRC_Idle. If the UE remains in the inactive state, the serving NG-RAN may continue to configure the UE with a periodic RNAU timer, which is used to notify the network that the UE is still active. The value of the RNAU timer is assigned based on the RRC_Inactive assistance information (RIAI) [39]. In summary, two levels of paging can be applied for reaching the UE depending on its RRC state: CN-based paging for the idle state and 5G-RAN-based paging for the inactive state (see Table 3).

C. CONNECTED STATE MOBILITY
The connection between UE and network is established in the connected state. Connected-state mobility aims to maintain connectivity without interruption or noticeable degradation as the UE moves within the network. To maintain the connection between UE and network in the connected state, the UE is continuously searching for new BSs to connect to. The BS search is based on current carrier frequency (intra-frequency measurements) and different carrier frequencies (inter-frequency measurements) from the UE perspective.
Cell search in the connected state results in HO if suitable conditions are met, whereas in the idle and inactive states it results in cell re-selection. When it becomes necessary to perform HO in the connected state, the UE is not responsible for the decision. Instead, the UE measures the signals of the serving cell and neighbouring cells and generates the measurement report (MR), which contains cell-level measurement results such as reference signal received power, signal-to-interference-plus-noise ratio, and reference signal received quality, and is sent to the network. Based on this report, the network decides whether or not the UE should hand over to a new cell. The above procedure is not applied to very small SCs (e.g., 5G femtocells) that are tightly synchronized to each other [38], [76].

IV. HANDOVER MANAGEMENT IN 5G AND BEYOND
This section describes the step-by-step procedure for HO in 5G NR, introduces the various categories of HO, and also discusses HO requirements alongside its relationship with radio resources management.

A. TYPES OF HANDOVER
There are two broad categories of HO, namely, intra-/inter-frequency HO and intra-/inter-radio access technology (RAT) HO.

1) INTRA-/INTER-FREQUENCY HANDOVER
Intra-frequency and inter-frequency HO are the HO types for which the carrier frequency is the subject of interest. If the UE moves to a target cell that uses the same carrier frequency as the serving cell, the HO is known as intra-frequency HO, as seen in Scenario 1 of Fig. 7. In contrast, inter-frequency HO occurs if the UE is to use a different carrier frequency in the target cell, as shown in Scenario 2 of Fig. 7. Events A3 and A6 initiate intra-frequency HO. Both are triggered when the RF condition of a neighbouring BS becomes better than that of the serving BS by a configured offset. Moreover, Event A6 is used for intra-frequency HO of the secondary frequency on which the UE camps. Events A4 and A5 are typically used for inter-frequency HO. Event A4 is triggered when the RF condition of one of the neighbouring BSs becomes higher than a configured threshold. On the other hand, Event A5 is triggered when the RF condition of the serving BS falls below a lower threshold and the RF condition of one of the neighbouring BSs rises above an upper threshold (where the threshold values are parameters that are optimized based on the network) [75], [82]. As mentioned in Section III-A3, HO occurs in the connected state, in which the UE regularly sends the serving cell a measurement report (MR) containing cell-level measurement results, such as reference signal received power, signal-to-interference-plus-noise ratio, and reference signal received quality, for all neighbouring cells. More information regarding the HO trigger events can be found in [29], [38], [75].
For inter-frequency cases, the UE carries out the measurements at different frequencies during the measurement gap [5], [82]. The measurement gap is necessary because, without it, the UE would not be able to measure the target carrier frequency while simultaneously transmitting to/receiving from the serving cell. The measurement gap specifies the time interval during which no downlink (DL) or uplink (UL) signal is transmitted. For intra-frequency HO, the measurement gap only applies to some cases, where enhanced UE coverage is not guaranteed to be aligned with the serving gNB's centre frequency [5], [82]. However, the measurement gap is required for all cases of inter-frequency HO, as specified by 3GPP [82]. Researchers are concerned with the fundamental question of how measurement gaps can be reduced, as a large measurement gap results in lower throughput and higher UE energy consumption.
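The trigger events above can be written as simple comparisons on measured signal levels. The following Python sketch gives simplified entering conditions for Events A3, A4, and A5, assuming RSRP values in dBm. It omits the frequency- and cell-specific offsets and the time-to-trigger defined in 3GPP TS 38.331, and the default offset, hysteresis, and threshold values are illustrative assumptions rather than recommended settings.

```python
def event_a3(serving_dbm, neighbour_dbm, offset_db=3.0, hysteresis_db=1.0):
    """A3: neighbour becomes better than the serving cell by an offset."""
    return neighbour_dbm - hysteresis_db > serving_dbm + offset_db


def event_a4(neighbour_dbm, threshold_dbm=-100.0, hysteresis_db=1.0):
    """A4: neighbour becomes better than an absolute threshold."""
    return neighbour_dbm - hysteresis_db > threshold_dbm


def event_a5(serving_dbm, neighbour_dbm,
             threshold1_dbm=-110.0, threshold2_dbm=-100.0, hysteresis_db=1.0):
    """A5: serving cell falls below threshold1 AND neighbour exceeds threshold2."""
    serving_is_weak = serving_dbm + hysteresis_db < threshold1_dbm
    neighbour_is_strong = neighbour_dbm - hysteresis_db > threshold2_dbm
    return serving_is_weak and neighbour_is_strong


if __name__ == "__main__":
    # Hypothetical measurement report: serving cell and one neighbour.
    serving, neighbour = -112.0, -98.0
    print("A3:", event_a3(serving, neighbour))
    print("A4:", event_a4(neighbour))
    print("A5:", event_a5(serving, neighbour))
```

The hysteresis term keeps a cell that hovers around the offset or threshold from repeatedly entering the event, which is one of the knobs typically tuned to limit ping-pong HOs.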

2) INTRA-/INTER-RAT HANDOVER
In the case of intra-RAT HO, the UE hands over from the serving BS (S-BS) to a target BS (T-BS) that uses the same RAT. Intra-RAT HO is commonly referred to as horizontal HO [5], as shown in Fig. 8. Intra-RAT HO can be either intra- or inter-frequency HO. Intra-RAT HO aims to preserve the connectivity of the UE with the existing network, and the primary reason for this kind of HO can be attributed to load balancing or measurement trigger conditions [83]. When HO occurs, the UE prefers to camp on the cell that provides the strongest received signal.
In contrast to intra-RAT HO, inter-RAT (or vertical) HO occurs when the UE hands over to a T-BS that uses a different RAT from the S-BS. Unlike intra-RAT HO, where the cell with the highest received signal is selected, inter-RAT HO considers other factors such as user mobility, service type, and the network property and state when selecting the target cell. It also involves switching the logical interface between the two RATs [84]. The latency incurred during inter-RAT HO is still prohibitive for many applications and services; thus, it poses a severe problem in NexGen mobile systems [84]. To improve the user experience, a centralized architecture for inter-RAT HO, which integrates legacy and NR network protocols, was proposed [84]. Fig. 9 demonstrates how the UE performs inter-RAT HO. From the figure, it can be seen that both distributed and centralized CN architectures for multi-RATs are possible. The advantage of the centralized architecture is that it can lead to a significant reduction in HO signaling and interruption time.⁸ The centralized architecture comprises a unified CN along with the baseband unit (BBU) and remote radio head (RRH), separated through a transport mechanism such as optical fiber. In a C-RAN architecture, the RRHs are connected to the BBU pool through high-bandwidth transport links known as fronthaul.

B. HANDOVER REQUIREMENTS AND KEY PERFORMANCE INDICATORS
Since HO has an adverse effect on the overall performance of wireless networks, different features and requirements are necessary to reduce its impact. Also, various key performance indicators (KPIs) are used to measure how the network performs during a HO. The various HO requirements and KPIs are presented as follows:
• Seamless HO: a seamless HO occurs when the UE perceives a continuous connection during HO, with little or no interruption during the gNB switch. This guarantees the UE's active connection.
• HO interruption time: the period during which the UE is not permitted to send user plane packets to the BS. The UE experiences seamless HO if the interruption time is very small (≤ 1 ms) [85], [86].
• HO cost: defined as the mobility interruption time per HO multiplied by the number of HOs along a particular UE's trajectory. This metric is important as it has a direct relationship to the system throughput [53]. HO cost decreases as the number of HOs and/or the mobility interruption time per HO decreases.
• HO failure rate: for any given UE trajectory or unit time, the HO failure rate is the number of HO failures (unsuccessful HOs) divided by the number of HOs the UE experienced.
• Signaling overhead: the HO signaling overhead is the data generated during the HO process to facilitate the operation. However, the HO process interrupts the data flow and reduces the UE throughput [29].
There are other performance metrics that are essential to ensure optimal performance in wireless networks, particularly for HO optimization. Further details can be found in Tayyab et al. [5].
⁸ Michael Wang, 5G, C-RAN, and the Required Technology Breakthrough, published on 21 Jun. 2018. Available online at https://medium.com/@miccowang/5g-c-ran-and-the-required-technology-breakthrough-a1b2babf774. Accessed on 25 Oct. 2020.
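The two quantitative KPIs defined in the list above, HO cost and HO failure rate, reduce to one-line computations. The sketch below is a direct transcription of those definitions; the function names, the use of milliseconds, and the example numbers are assumptions made for illustration and are not measured results.

```python
def ho_cost(num_handovers, interruption_time_per_ho_ms):
    """HO cost: interruption time per HO multiplied by the number of HOs."""
    return num_handovers * interruption_time_per_ho_ms


def ho_failure_rate(num_failures, num_handovers):
    """HO failure rate: unsuccessful HOs divided by all HOs experienced."""
    if num_handovers == 0:
        return 0.0  # no HOs on this trajectory, so no failures either
    return num_failures / num_handovers


if __name__ == "__main__":
    # Hypothetical trajectory: 12 HOs of 50 ms each, 3 failures out of 60 HOs.
    print("HO cost (ms):", ho_cost(12, 50.0))
    print("HO failure rate:", ho_failure_rate(3, 60))
```

Because HO cost is a product, halving either the number of HOs or the interruption time per HO halves the total time during which the UE cannot exchange user plane data.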

C. HANDOVER AND RADIO RESOURCE MANAGEMENT
In 5G, the term radio resource covers both the traditional (legacy-system) and the extended resource concepts [87]. The legacy resources include energy (cell and UE transmit power), frequency (channel bandwidth and carrier frequency), and antenna configurations. In addition, the extended resource definition in 5G covers hard resources (number/type/configuration of antennas, the existence of nomadic/unplanned nodes, or mobile terminal relays) and soft resources (network node and UE software capabilities). It is also important to meet UE requirements such as QoS or QoE for all UEs while properly managing resources. Moreover, proper resource management can help networks fulfill HO KPIs, for example, by reducing the probability of HO failures while maintaining the QoE during and after HO [88], [89]. To increase wireless system efficiency, it is necessary to address the fundamental issues related to HO and resource management, such as admission control and bandwidth and power control.

D. DUAL CONNECTIVITY
Dual connectivity means that a UE can establish connections to two different cells at the same time [90]. Usually, in dual connectivity, UEs connect simultaneously either to BSs of different sizes (macrocell and SC) or to two different RATs (e.g., 4G and 5G networks), as illustrated in Fig. 10. Since the UE can be connected to two different RATs over different frequency bands simultaneously, the interruption time is reduced to zero. However, this increases the likelihood of HO, since new HO cases are introduced relative to a single connection. These new HO scenarios (see Fig. 10) occur in two situations: when the UE switches its connection either from SC to macrocell or from SC to SC. With the introduction of mm-wave, the use of dual connectivity could lead to an increase in HO probability, thereby causing additional problems with mobility management, including an increase in signaling overhead, synchronization complexity between RATs for multi-RAT connectivity, simultaneous utilization of resources in multiple BSs, and a reduction in battery lifespan. The increase in signaling overhead is due to flow control between the RATs [90], [91]. These issues could be addressed using intelligent approaches.

E. HANDOVER MANAGEMENT IN NR
The NR physical layer differs from the legacy RAT in the following features: dual connectivity, high-frequency spectrum, forward compatibility, ultra-lean design, use of mm-wave, and relay for devices (device-to-device). NR supports both multi-connectivity and single-connectivity selection depending on the configuration set. For both configurations, hard HO is used during path switching [29]. In both licensed and unlicensed spectrum, NR operates between 600 MHz and 73 GHz. Forward compatibility means designing a radio-interface architecture that enables new service requirements and accommodates new technologies while supporting legacy network UEs. The ultra-lean design principle, in turn, aims to decrease the always-on transmissions (for example, signals for BS detection, broadcast of system information) to achieve high data rates with low energy consumption in the network. The main challenge for NR is coverage, since the use of high frequencies with high penetration loss makes the cell footprint smaller. In this section, we describe the NR HO with a brief introduction of critical features and the entities involved in NR mobility. Also, a step-by-step HO procedure is provided for intra-AMF/UPF. The types of HO in NR are described as follows:
1) Intra-gNB HO: This occurs when both the source and target cells belong to the same gNB, as shown in Fig. 11. Here, a cell means the part of a gNB sector that has specific beams and covers a specific environment.
2) Inter-gNB HO without AMF Change: Inter-gNB HO generally occurs when serving and target cells are from different gNBs. There are two different types of HO within inter-gNB HO without AMF change, depending on whether the HO involves a change of UPF or not. However, the inter-gNB HO discussed here does not include a change of AMF in either case, as shown in Fig. 12. Inter-gNB with intra-UPF HO is presented in Fig. 12 scenario 1, where the HO involves a cell change with the same UPF, while Fig. 12 scenario 2 presents inter-gNB with inter-UPF HO, where the cell switch involves a change of UPF.
3) Inter-gNB HO with AMF Change: In this case, the HO requires a change of AMF from the source to the target AMF. However, the HO involves no change of SMF, and only the NG interface is used, as depicted in Fig. 13. There are two cases of inter-gNB HO with AMF change: in the first case (Fig. 13 scenario 1), the same UPF is maintained, while the second case (Fig. 13 scenario 2) involves a change of UPF during HO.
The basic HO procedure in NR is shown in Fig. 14 [5], [38]. It consists of three phases, namely: HO preparation (Steps 0-5), HO execution (Steps 6-8) and HO completion (Steps 9-12), which are described as follows:
• Step 1: the UE measuring procedure is configured according to access restriction and roaming information by the serving gNB (S-gNB), and the UE sends an MR, which covers candidate target gNBs (T-gNBs), to the S-gNB.
• Step 2: the S-gNB determines to HO the UE, based on the MR and radio resource management information.
• Step 3: the S-gNB sends a HO request message to the T-gNB (which includes the necessary information to prepare for HO to the T-gNB).
• Step 4: the T-gNB executes the admission control procedure to determine whether it can grant the resources.
• Step 5: the T-gNB sends a HO request acknowledgement to the S-gNB. As soon as the S-gNB receives the HO request acknowledgement message, data forwarding may be initiated.
• Step 6: the S-gNB sends a HO command to the UE.
• Step 7: the S-gNB sends the Sequence number status transfer message to the T-gNB.
• Step 8: UE detaches from the S-gNB and synchronizes with the T-gNB.
• Step 9: the T-gNB informs the AMF that the UE has changed the cell, through the Path switch request message.
• Step 10: 5GC switches the DL data path towards the T-gNB.
• Step 11: the AMF acknowledges the Path switch request.
• Step 12: the T-gNB informs the S-gNB that the HO was successful and triggers the release of resources by the S-gNB by sending a UE Context Release message. Finally, the S-gNB releases the radio resources associated with the UE. It is essential to point out that the above procedure applies to HO between NR and NR technologies.
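The step list above can be sketched as a small table-driven sequence. This is only an illustrative model, not an implementation of the signalling itself: the names `Phase`, `HO_STEPS`, `phase_of` and `run_handover` are hypothetical, and the behaviour when admission control refuses the request (the procedure simply stops) is an assumption, since the text does not describe that branch.

```python
from enum import Enum


class Phase(Enum):
    PREPARATION = "HO preparation"
    EXECUTION = "HO execution"
    COMPLETION = "HO completion"


# (step number, acting entity, action), taken from the step list above.
# Step 0 is part of the preparation range but is not described in the
# list, so it is omitted here.
HO_STEPS = [
    (1, "UE", "send measurement report (MR)"),
    (2, "S-gNB", "decide to hand over the UE"),
    (3, "S-gNB", "send HO request"),
    (4, "T-gNB", "run admission control"),
    (5, "T-gNB", "send HO request acknowledgement"),
    (6, "S-gNB", "send HO command"),
    (7, "S-gNB", "send sequence number status transfer"),
    (8, "UE", "detach from S-gNB and synchronize with T-gNB"),
    (9, "T-gNB", "send path switch request"),
    (10, "5GC", "switch DL data path towards T-gNB"),
    (11, "AMF", "acknowledge path switch request"),
    (12, "T-gNB", "send UE context release"),
]


def phase_of(step):
    """Map a step number to its phase, using the ranges given in the text."""
    if 0 <= step <= 5:
        return Phase.PREPARATION
    if 6 <= step <= 8:
        return Phase.EXECUTION
    if 9 <= step <= 12:
        return Phase.COMPLETION
    raise ValueError(f"unknown HO step: {step}")


def run_handover(admission_granted=True):
    """Return the trace of (phase, step, actor, action) tuples.

    Assumption: if admission control fails at Step 4, the procedure
    stops there; the failure path is not covered by the text.
    """
    trace = []
    for step, actor, action in HO_STEPS:
        trace.append((phase_of(step), step, actor, action))
        if step == 4 and not admission_granted:
            break
    return trace


if __name__ == "__main__":
    for phase, step, actor, action in run_handover():
        print(f"[{phase.value}] Step {step}: {actor}: {action}")
```

Keeping the steps in a table makes the phase boundaries explicit and lets a simulator attach per-step delays or failure probabilities without changing the control flow.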

F. MOBILITY AND HANDOVER MANAGEMENT IN B5G
Researchers have anticipated some use cases and applications that make B5G different from 5G. Some of these use cases include integrated unmanned aerial vehicle (UAV) communications, high mobility of devices (above 500 km/h), holographic projection, etc. [92]. The high mobility of devices, UAVs, and other applications that use radio waves in the mm-wave and THz spectrum presents unprecedented wireless communication challenges in B5G. Among these challenges, mobility and HO management are anticipated to be the two most challenging issues in B5G networks, since B5G networks would be highly dynamic and multi-layered, which would lead to more frequent HOs. High mobility of devices and UAVs results in uncertainty about their locations; moreover, high frequencies such as mm-wave and THz that would be used in B5G can be easily blocked by humans, buildings, etc.
Heuristic and traditional HO methods would not be able to react quickly. An alternative solution is to adopt artificial intelligence models for mobility prediction and optimal HO strategy in order to guarantee communication connectivity. Even though the introduction of multi-connectivity is a very promising solution, the procedure still needs intelligent HO management strategies to optimize the cell (re-)selection process in order to reduce signaling and guarantee high data rate, high reliability, and low latency in B5G [92]. The HO procedure for B5G might be similar to that of 5G, but there are no standards for B5G systems yet. A summary of the challenges associated with HO management in NR, alongside their causes and potential solutions, is presented in Table 4.

V. MACHINE LEARNING FOR HO MANAGEMENT
The use of mm-wave and higher frequencies in 5G and B5G networks is going to introduce new challenges and complexity to the HO management that would be difficult to handle by conventional methods. Firstly, these frequency ranges suffer from severe attenuation (e.g., larger penetration losses), which means their transmission distance will be small. As a result, more BSs need to be deployed to cover the same area that would have been covered by those utilizing the microwave frequencies [104]. This implies that the size and the complexity of the network is going to greatly increase and the users will be prone to more frequent HOs which would greatly affect their QoS, particularly for high mobility users and applications.
Secondly, due to the use of directional beams for transmission in mm-wave networks, the presence of obstacles on the path of the transmitted beam can partially or completely hinder the user from gaining access to the network or negatively impact the signal quality. As such, in mm-wave communication networks, the users are not only faced with the challenge of selecting the optimal BS but also the optimal beam to connect to per time in order to maximize their QoS. Hence, optimal beam selection has become another factor to consider in HO management process which would further add more complexity to the HO process because of the massive number of beams that the user has to select from during each HO instance [105], [106].
Finally, there is also the need to provide some high-mobility essential services for emergency scenarios, such as medical services for patients in ambulances en route to hospital through real-time consultations with doctors situated in a remote hospital. Especially in the current pandemic situation, these kinds of services may be needed to sustain the lives of patients in critical condition before they get to the hospital to receive proper medical attention [107], [108]. Intelligent HO optimization would help predict the route of the ambulance, determine the optimal BSs to connect to, and pre-allocate the resources that will be needed at the BSs. This will help prevent intermittent service interruptions and guarantee the QoS needed to support the communication between the paramedics in the ambulance and the doctors at the remote location [109], [110]. Therefore, effective HO optimization would enable the selection of the optimal BS and beam for user connection that will maximize user connectivity, reduce excessive or unnecessary HOs, and enable the detection of obstacles and their avoidance. These are some of the issues that make HO optimization in mm-wave communication networks more challenging to handle compared to the previous generations of cellular networks. Moreover, since the HO process involves various network parameters that must be considered and optimized in real time in order to ensure seamless HO, this would be very challenging for most conventional methods to handle. The challenge with conventional methods of HO management is that they are computationally demanding to implement, particularly when the network dimension becomes very large. As such, by the time they decide which target BS to associate the user with, the user may already have moved away from that location. This would result in sub-optimal HO decisions and degradation in user QoS. In addition, they cannot accurately capture certain details of the network such as the presence of different types and sizes of obstacles, as well as the dynamic traffic demand patterns that are typical of 5G and B5G networks, which are also important for making an optimal HO decision [109], [111], [112].
However, ML techniques can assist in bringing intelligence and helping the network to self-optimize. ML techniques are able to learn various network characteristics from data generated by the network, in order to optimize various aspects of the network. They are able to capture hidden details and patterns in the network data that cannot be represented by analytical models [111]. They are self-adaptive and as such can react to changes in the network environment and even predict future network or user demands beforehand, thereby enabling the network to adequately prepare to handle such demands when they occur [109]. They can be designed in a computationally efficient manner such that the training phase of the algorithm, which is often computationally demanding, is carried out offline, and the trained model is then deployed online to carry out real-time optimization, after which the model can be updated periodically as it experiences new data [113].
In this section, we first present an overview of the major categories and types of ML algorithms used for HO optimization. Then, we delve into reviewing the state-of-the-art on ML-aided HO management. A top-level taxonomy is followed while reviewing the state-of-the-art, such that the ML-aided HO management methodologies are classified based on the source of the data they utilize. As such, two broad categories are considered: visual data based and wireless data based HO management techniques. The major objective of this novel taxonomy is to recognize the visual data aided HO management schemes, which have long been overlooked in the literature, by giving them a special place along with the traditional wireless data driven HO schemes. Visual aided wireless communications is an emerging research area in which visual information (pictures/videos) captured from cameras, light detection and ranging (LIDAR), etc., is combined with wireless sensory data for wireless network optimization such as channel prediction, HO optimization, etc. [114], [115]. This is necessary because mm-wave communication networks pose unique challenges that would be difficult to handle using only wireless sensory data, but with the assistance of visual data, some of these challenges can be handled properly. On the other hand, for the wireless data based HO management, the most recent works are extensively reviewed under two use cases: beam selection and BS selection.

A. AN OVERVIEW OF MACHINE LEARNING ALGORITHMS
It has become very important to include AI/ML in the BS and beam selection process during HO, in order to achieve the primary objective of providing a seamless HO and to ensure that the UE achieves maximum throughput during the entire duration of its connection to the network. The HO optimization problem is a decision-making problem, and intelligence is imperative to ensure that the optimal decision is taken at each HO instance in a more efficient and effective manner.
We begin by defining ML and discussing the various categories. According to [116], ML is a set of computation procedures that evolved from formidable techniques in the field of AI that allow the computer to self-learn, discover patterns, and generate models from historical data without being explicitly programmed. The objective of ML is to identify features of a given data set that are likely to influence an outcome of interest given the input, and then use those learned features to predict the result in a new situation not previously encountered [32]. A substantial collection of ML techniques (model and algorithms) has been created to solve various challenges in different domains. These algorithms can be classified according to how learning is performed. They have been broadly categorized into three major classes [117]. Table 5 presents an overview of ML approaches based on their learning styles.
Definition 6 (Labelled Data Set): A labelled data set is a data set with clearly defined features (input) and target (output). The features are usually related to the target and enable the ML algorithm to identify the target or map the input to the output during the training phase.
Definition 7 (Unlabelled Data Set): An unlabelled data set is a data set that does not have labels. That is, there is no clear description of the features or targets in the data set.

Definition 8 (Model Training):
Model training is the process of exposing an ML algorithm to the training data set (i.e., labelled or unlabelled data set) in order to enable it to learn the mapping between the features and the target. Thereafter, a model is obtained that can correctly predict the right target, even when it is fed with a new data set that it has not previously seen.

1) SUPERVISED LEARNING
As the name suggests, supervised learning is the learning technique that requires a labelled training set consisting of input features and outputs. The learning model searches for a function that maps the input to the desired output by minimizing both the bias and variance error of the predicted results. A new data set is then applied to the trained model in order to predict the output. Supervised learning is basically classified into regression, where the predicted output is continuous, and classification, where the predicted output is discrete or categorical. Examples of supervised learning algorithms include: artificial neural networks (ANN), support vector machine (SVM), extreme gradient boosting (XGBOOST), k-nearest neighbour (kNN), decision tree, random forest, etc. [118]. Supervised learning algorithms can help provide user mobility information through prediction of future location, trajectory, cell, etc., which is needed for proactive HO optimization and efficient resource allocation in 5G and B5G networks in order to enhance the QoS of users [32].
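As a minimal illustration of classification on a labelled data set, the sketch below uses kNN to predict the next serving cell from a UE position. The feature choice (2-D coordinates), the toy data set `history`, and the function name `knn_predict` are assumptions made for illustration and do not correspond to any specific surveyed work.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train: list of ((x, y), label) pairs, i.e. a labelled data set.
    query: (x, y) position for which a label is wanted.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labelled data: UE position -> cell that served it next.
history = [
    ((0.0, 0.0), "cell_A"), ((1.0, 0.5), "cell_A"), ((0.5, 1.0), "cell_A"),
    ((9.0, 9.0), "cell_B"), ((10.0, 8.5), "cell_B"), ((8.5, 10.0), "cell_B"),
]

if __name__ == "__main__":
    print(knn_predict(history, (1.0, 1.0)))
    print(knn_predict(history, (9.0, 9.5)))
```

A regression variant would return a continuous quantity (for example, a predicted coordinate) instead of a discrete cell label.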

2) UNSUPERVISED LEARNING
Different from supervised learning, in unsupervised learning, the training data set is unlabelled. The learning model in this case tries to find hidden patterns, structures, and correlations within the training data set. Such models are mainly employed for anomaly detection, pattern recognition, and the reduction of the dimension of a data set. Common examples of unsupervised learning algorithms are k-means clustering, principal component analysis, expectation-maximization (EM), etc. [119]. With the deployment of ultra-dense cellular networks and the use of diverse kinds of devices (conventional UEs and IoT devices) in 5G and B5G, clustering algorithms can enable scalable and decentralized HO optimization, particularly for cases where user mobility patterns are heterogeneous, thereby reducing complexity. As an example, the authors in [120] proposed a two-layer approach to HO optimization in an ultra-dense network, where k-means was first used to cluster the devices with similar mobility patterns, and then deep reinforcement learning was implemented to determine the optimal HO policy of the devices within each cluster.
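The clustering step can be illustrated with a plain k-means sketch that groups devices by mobility features. The two features used here (mean speed and HO rate), the sample values, and the deterministic initialization from the first k points are assumptions for illustration; this is not the procedure of [120].

```python
import math


def kmeans(points, k, iters=20):
    """Plain k-means on tuples of floats.

    Assumption: centroids are initialized from the first k points so the
    result is deterministic; practical implementations usually randomize.
    """
    centroids = [points[i] for i in range(k)]
    for _ in range(iters):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[idx].append(p)
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c, members in enumerate(clusters):
            if members:
                updated.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                updated.append(centroids[c])
        if updated == centroids:
            break
        centroids = updated
    labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
    return centroids, labels


# Hypothetical unlabelled data: (mean speed in m/s, HOs per minute) per device.
devices = [(1.0, 0.1), (30.0, 2.0), (1.5, 0.2), (28.0, 1.8), (0.8, 0.1), (32.0, 2.2)]

if __name__ == "__main__":
    centroids, labels = kmeans(devices, k=2)
    print(labels)
```

Each resulting cluster could then be handled by its own HO policy, which is the role the second layer plays in a two-layer design.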

3) REINFORCEMENT LEARNING
Unlike supervised and unsupervised learning, which deal with continuous or discrete output prediction and identification of hidden patterns or structures in data, RL is concerned with making decisions in order to obtain an optimal policy in a given environment. It is a trial-and-error kind of learning whereby an agent interacts with the environment, takes an action and gets feedback in terms of reward or penalty, depending on whether the action taken is good or bad for a given objective. The outcome of RL is to learn the optimal policy that would enable the agent to make an optimal decision at any given state of the environment. RL algorithms can be value-based (e.g., Q-learning, SARSA) or policy-based (e.g., policy gradient, proximal policy optimization (PPO) and advantage actor-critic (A2C)) [121], [122]. Reinforcement learning algorithms are suitable for mobility management and HO optimization as they are able to adapt to varying user mobility and network conditions in order to determine the optimal HO policy. They are particularly relevant in 5G and B5G networks, where network dimension and complexity are increasing, in order to reduce HO delays and minimize frequent HOs [32].
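To make the agent, state, action and reward loop concrete, the following is a minimal tabular Q-learning sketch for a toy HO decision problem. Everything about the environment is an assumption made for illustration: a UE moves one position per step along a straight road with one BS at each end, the reward is a normalized link quality minus a fixed HO cost, and the names and parameter values are hypothetical.

```python
import random

N_POS = 10        # road positions 0..9 (toy assumption)
BS_POS = (0, 9)   # one BS at each end of the road
HO_COST = 0.2     # assumed penalty paid for each handover
STAY, HANDOVER = 0, 1


def rate(bs, pos):
    """Normalized link quality: 1 at the BS, falling linearly to 0."""
    return 1.0 - abs(pos - BS_POS[bs]) / (N_POS - 1)


def step(pos, serving, action):
    """Apply the action, collect the reward, then move the UE one position."""
    if action == HANDOVER:
        serving = 1 - serving
    reward = rate(serving, pos) - (HO_COST if action == HANDOVER else 0.0)
    return pos + 1, serving, reward


def train(episodes=3000, alpha=0.5, gamma=0.9, eps=0.3, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration and random starts.

    A large learning rate is acceptable here only because this toy
    environment is deterministic.
    """
    rng = random.Random(seed)
    q = {(p, s): [0.0, 0.0] for p in range(N_POS) for s in (0, 1)}
    for _ in range(episodes):
        pos, serving = rng.randrange(N_POS), rng.randrange(2)
        while pos < N_POS:
            state = (pos, serving)
            if rng.random() < eps:
                action = rng.randrange(2)
            else:
                action = max((STAY, HANDOVER), key=lambda a: q[state][a])
            pos, serving, reward = step(pos, serving, action)
            future = max(q[(pos, serving)]) if pos < N_POS else 0.0
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
    return q


def greedy_policy(q, state):
    return max((STAY, HANDOVER), key=lambda a: q[state][a])


def count_handovers(q):
    """Follow the greedy policy from position 0 served by BS 0."""
    pos, serving, n = 0, 0, 0
    while pos < N_POS:
        action = greedy_policy(q, (pos, serving))
        n += action
        pos, serving, _ = step(pos, serving, action)
    return n


if __name__ == "__main__":
    q = train()
    print("handovers along the road:", count_handovers(q))
```

The HO cost term in the reward is what discourages unnecessary or ping-pong HOs; setting it to zero would let the agent switch whenever the other BS is even marginally better.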

4) DISTRIBUTED LEARNING
The conventional approach to ML requires that the training data be stored in a central location, either at the data centre or in the cloud. However, this approach has several challenges including data privacy, latency, increased signaling overhead, increase in the energy consumption of the UEs, etc. This has led to a growing interest in distributed learning, where the data is processed at the location where it is generated, and only the trained ML models are transmitted to the central entity [123] and [124]. In this regard, federated machine learning [123] and [125] has been introduced to handle the aforementioned challenges and is gaining increasing application in the field of wireless communication [124] and [126]. Federated learning is a distributed machine learning approach which enables data generating entities to jointly learn a shared prediction model without having to send their data to the central entity. In this case, only the trained models are transmitted to the central entity. This ensures the preservation of data privacy and requires fewer communication resources for model transmission [127] and [128]. Federated learning has been applied in [129] and [130] for human mobility prediction in order to preserve the privacy of users, and in [103] for proactive HO in mm-wave vehicular networks in order to ensure the privacy of user location information and minimize communication overhead while minimizing frequent HOs.
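The idea that only trained models, and not raw data, reach the central entity can be sketched with a federated averaging loop. This is a simplified illustration under stated assumptions: the model is a one-dimensional linear regressor, the clients hold noise-free toy data, and the function names and hyperparameters are hypothetical rather than taken from the cited works.

```python
def local_update(weights, data, lr=0.1, epochs=20):
    """Gradient descent on y = w * x + b using only one client's local data."""
    w, b = weights
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Average client models, weighted by the size of each local data set."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return tuple(
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    )


def federated_training(clients, rounds=100):
    """The central entity only ever sees model parameters, never raw samples."""
    global_w = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_w, data) for data in clients]
        global_w = federated_average(updates, [len(d) for d in clients])
    return global_w


# Hypothetical local data sets, both generated from y = 2x + 1.
clients = [
    [(0.0, 1.0), (0.5, 2.0), (1.0, 3.0)],
    [(1.5, 4.0), (2.0, 5.0)],
]

if __name__ == "__main__":
    w, b = federated_training(clients)
    print(f"w = {w:.3f}, b = {b:.3f}")
```

Only two floats per client cross the network in each round here, which is the source of the reduced communication cost noted above.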

B. MACHINE LEARNING BASED HANDOVER OPTIMIZATION
HO optimization is necessary when selecting the BS/beam that a user should connect to, in order to minimize frequent HOs due to the small footprint of mm-wave BSs in 5G and of the THz-wave BSs that are envisioned to be used in B5G. This is because frequent HOs increase the HO cost, thereby reducing the network throughput. Throughout this paper, we refer to HO as defined in [53], which establishes the term HO cost.
With efficient HO optimization, the network is able to select the best T-BS that will provide a higher throughput for UE.
Before ML came into play, classical methods for BS selection were based on specific parameter measurement. These methods include selecting the T-BS based on distance, or the BS that provides a higher KPI such as reference signal received power, received signal strength indicator, and signal-to-noise ratio. In the measurement-based approaches, the channel state information (CSI) from the MR of all neighbouring BSs is measured, and the one with the best CSI is selected as the potential T-BS. These approaches are practical for sub-6 GHz frequencies; however, they are inefficient solutions in mm-wave and THz application band due to severe path loss and susceptibility to LOS blockage [102].
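A measurement-based selection rule of this kind can be sketched in a few lines. The function names and sample values are hypothetical, and the hysteresis margin in `should_handover` is an assumption added for illustration (the text only states that the BS with the best measured KPI is chosen).

```python
def select_target_bs(measurements, serving_bs):
    """Return the neighbouring BS with the highest RSRP, or None if there is none.

    measurements: dict mapping BS id -> RSRP in dBm, as reported in the MR.
    """
    neighbours = {bs: rsrp for bs, rsrp in measurements.items() if bs != serving_bs}
    if not neighbours:
        return None
    return max(neighbours, key=neighbours.get)


def should_handover(measurements, serving_bs, hysteresis_db=3.0):
    """Assumed trigger: hand over only if the best neighbour beats the
    serving BS by more than a hysteresis margin, to avoid ping-pong HOs."""
    target = select_target_bs(measurements, serving_bs)
    if target is None:
        return False
    return measurements[target] > measurements[serving_bs] + hysteresis_db


if __name__ == "__main__":
    report = {"BS1": -95.0, "BS2": -90.0, "BS3": -101.0}
    print(select_target_bs(report, "BS1"), should_handover(report, "BS1"))
```

Such a rule reacts only to the current measurement, which is why it struggles when a mm-wave link is blocked abruptly.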
ML techniques can play a significant role in HO optimization and BS selection by reducing delays, computational overhead, and frequent HOs. They help predict the T-BS and also ensure that adequate resources are available at the T-BS before HO occurs in order to ensure a seamless HO. In this section, we consider ML-based HO management in 5G networks from the perspective of visual and wireless data aided HO optimization. In Table 6, we present a summary of the state-of-the-art ML-based HO optimization in 5G mm-wave communication systems.

1) VISUAL DATA AIDED HANDOVER OPTIMIZATION
Successive generations of cellular networks have mainly depended on wireless sensory information such as CSI, received power, etc., for network design and optimization. However, the use of mm-wave and THz frequencies in 5G and B5G networks would mean that BSs will have many antennas, communication will be through a large number of LOS beams, which would be subject to blockages of various types and would limit signal reception at the user end. In addition, much signaling overhead would be involved in the selection of the optimal beam for user connection in mm-wave networks if only wireless sensory data are exploited for optimal beam selection considering the massive number of beams that would be involved [114], [115].
The vision assisted HO optimization has become necessary because of the complexity of the mm-wave networks, and it might not be possible to capture all the conditions of the environments like obstacles, buildings, etc., using wireless sensory data. As a result, detecting or predicting the presence of obstacles that would block the received beam and reduce the throughput at the user end is very difficult to achieve using only wireless sensory data. However, with vision assisted HO optimization, visual data (image/video) is combined with wireless sensory data to enable proactive obstacle prediction and optimized beam/BS selection that would help enhance user QoS [131]. In addition, with the advancement in computer vision, the training overhead that is normally associated with training ML models for optimal beam selection can be greatly reduced by utilizing the images of networks in developing deep learning algorithms for efficient HO operation [132], [133]. In the following paragraphs, we review the research works that have been proposed on the use of visual data for HO optimization in mm-wave networks.
One application of visual data for HO optimization is the prediction of obstacles that might affect the magnitude of the received power or data rate at the user end. In this regard, the authors in [27] proposed a cooperative sensing scheme for proactive HO in mm-wave networks using a combination of images captured from multiple cameras and received power. The idea is to map camera images to HO decisions using DRL such that a proactive HO decision can be initiated before the received signal is blocked by an obstacle. The advantage of using multiple cameras is to cover areas that are inaccessible to other cameras so as to get a complete view of the network environment. The camera images also enable the prediction of obstacles that will affect the mm-wave links. The authors in [134] developed a DRL framework using camera images for optimizing the HO timing by predicting the future data rate of mm-wave links and ensuring that proactive HO is performed before data rate degradation occurs due to the presence of obstacles.
Another application of visual data for HO optimization is the prediction or selection of the optimal beam that the user should connect to, based on user mobility and the presence of obstacles in the mm-wave network environment. Following this research direction, the authors in [135] demonstrated how data obtained from LIDAR sensors could be used to reduce the overhead associated with mm-wave beam selection and LOS detection, and proposed a decentralized architecture using deep CNN. Their work was extended in [136] by developing a deep learning-based centralized architecture for beam selection and the detection of the LOS in vehicular networks by combining location information and LIDAR data. In [137], the authors proposed a novel beam selection scheme that is capable of predicting the optimal beam to connect to at any position in the cell using image-based 3D reconstruction and CNN. They argue that the proposed method takes images from ordinary cameras and is cheaper to implement compared to the LIDAR-based approaches in [135], [136].
The work in [131] considered the problem of beam selection and blockage prediction using camera images, channel state, and deep learning for a single user communication in a mm-wave network. The beam selection problem was formulated as an image classification problem such that the UEs are mapped to a class of beams having a unique beam index, depending on their location in the image. However, to detect users that are blocked, the images are matched with channel information due to the difficulty of detecting obstacles in still images. The authors in [132] first developed a realistic image data set for ML-based mm-wave network optimization that considers many BSs, users, different obstacles, and rich environmental dynamics. Then leveraging the image data set and information regarding previously selected beams, a ML based vision-aided beam tracking framework was proposed to predict the future beams of mobile users in a mm-wave communication system.

2) WIRELESS DATA AIDED HANDOVER OPTIMIZATION
Non-vision assisted HO optimization, on the other hand, does not involve the use of visual sensory information such as images and videos for HO optimization. It uses conventional wireless sensory information, such as received signal level and channel state, and user location information to optimize the switching of user connection from one BS or beam to another. This is the general technique that is commonly used in wireless communication systems. In this section, we review the state-of-the-art in HO optimization in mm-wave communication networks from the perspective of BS and beam selection by exploiting the CSI and user mobility information such as user location, trajectory, etc.

a: BEAM SELECTION
Due to the high path loss and sensitivity to blockage experienced by mm-waves, a large number of BSs comprising multiple directional antenna arrays have to be deployed. The use of multiple antenna arrays enables the formation of narrow signal beams with a high gain when the phase or amplitude of each antenna is adjusted. This approach, commonly known as beamforming [138], enables the formation of directional links between the BSs and UEs. However, because each BS comprises multiple beams, the challenge becomes selecting the optimal beam that will serve the UE in order to satisfy its QoS. In the following paragraphs, we review the most recent works on ML-based beam selection in mm-wave and THz communication systems.
The beam selection problem is sometimes modeled as a multi-classification problem, after which a supervised or deep learning algorithm is used to identify the beam class. In this regard, the authors in [139] proposed a data-driven approach for analog beam selection in hybrid MIMO systems. The beam selection problem was first formulated as a multi-classification problem and then solved using SVM in order to obtain the optimal analog beam for each user. The performance evaluation shows that the proposed method achieves a data rate similar to that of traditional methods but with lower complexity. In [140], the direction of arrival information was leveraged to develop an ML scheme for beam selection in mm-wave communications. The beam selection problem was expressed as a multi-class problem, and three supervised learning algorithms, namely kNN, SVM, and ANN, were used to solve the problem. The authors in [141] proposed a low-complexity ML-based beam selection policy for THz systems. The beam selection problem was first formulated as a multi-classification problem, after which a random forest algorithm was used to determine the optimal beam class.
In [142] and [143], an ML framework for analog beam selection was proposed using SVM, which considers the transmit power of the SCs and channel information as inputs, while the model training was performed using sequential minimal optimization in order to achieve high sum-rate at a lower computational complexity. A DNN model for beam selection where channel knowledge is not required was developed in [144]. The beam selection problem was modeled as an image reconstruction problem, after which the DNN was used for interpolation. The proposed model was first trained offline (to reduce the training overhead) before online implementation of the trained model was performed. A beam selection framework for mm-wave vehicular networks using different ML-based classification models was proposed in [145]. The training data set comprised the vehicle location, the type of receiver vehicle and its surrounding vehicles, as well as previously selected beams. It was observed that the random forest algorithm outperformed other classification algorithms in terms of accuracy and efficiency. A neural network framework for beam selection in THz communication networks was proposed in [146]. The proposed model was trained using data samples obtained from the THz channel based on the multi-classification approach. The proposed model was able to determine the optimal beam for each user with low complexity compared to the conventional exhaustive search method.
Another category of mm-wave beam selection techniques exploits the CSI of sub-6 GHz to minimize the search overhead involved in selecting the optimal beam as well as for initial beam establishment. In this regard, the authors in [147] proposed a DNN based framework for selecting the optimal mm-wave SCs and beam in a HetNet involving mm-wave SCs and sub-6 GHz macrocells. They utilized the CSI from sub-6 GHz macrocells for both SC and beam selection in order to minimize the latency resulting from using the conventional exhaustive search approach for beam selection. The authors in [148] introduced a deep learning approach to mm-wave beam selection in 5G and B5G using sub-6 GHz CSI. They argue that using the sub-6 GHz CSI for the mm-wave beam selection would help reduce the search space required for establishing the initial beam. In [149], a deep learning framework was proposed for predicting mm-wave beams and blockages using the sub-6 GHz channel. They proved that, under certain conditions, a mapping function exists that can be used to predict the optimal beam and blockages in any environment. They then showed that this mapping function can be learnt using a large enough neural network, after which a DNN model was designed to perform both predictions. The work in [150] suggested a deep learning approach for the prediction of the optimal mm-wave downlink beam. The developed DNN model takes as input a combination of features extracted from both the sub-6 GHz channel and the mm-wave band in order to enhance prediction accuracy and achievable data rate.
RL techniques have also been applied to mm-wave beam selection in literature. In this regard, different (deep) RL algorithms such as multi-armed bandit (MAB), Q-learning, deep Q-learning approaches have been proposed. A novel ML-based beam tracking and alignment framework for a sparse and time-varying mm-wave channel was proposed in [151]. The channel tracking was performed using Bayesian learning and Kalman filtering after which the optimal beam selection strategy was obtained using MAB. A fast ML algorithm for beam selection in 5G mm-wave vehicular networks using contextual MAB (CMAB) was proposed in [152] and [153]. The proposed model considers the traffic pattern and different types of blockages in order to select the optimal beam in real-time without prior training of the model. In [154], a beam tracking approach based on MAB is proposed to determine the optimal beams and data rates of the beams in a mm-wave communication system. The proposed model uses the beam quality information, and the feedback obtained from users during initial access to determine the optimal beam and transmission rate during the next transmission.
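As a rough illustration of how a bandit formulation selects beams without prior training, the sketch below treats each beam as an arm and applies the standard UCB1 rule. It is not the algorithm of any of the cited works: the Bernoulli reward model (a transmission on a beam either succeeds or fails), the per-beam success probabilities, and the function name `ucb1_beam_selection` are assumptions made for illustration.

```python
import math
import random


def ucb1_beam_selection(success_prob, rounds=2000, seed=0):
    """Run UCB1 over a set of candidate beams.

    success_prob: assumed per-beam probability that a transmission succeeds;
    it is hidden from the learner and only used to simulate feedback.
    Returns (pulls, mean_reward) per beam.
    """
    rng = random.Random(seed)
    n_beams = len(success_prob)
    pulls = [0] * n_beams
    mean_reward = [0.0] * n_beams
    for t in range(1, rounds + 1):
        if t <= n_beams:
            beam = t - 1  # try every beam once before using the index
        else:
            # Empirical mean plus an exploration bonus that shrinks with use.
            beam = max(
                range(n_beams),
                key=lambda b: mean_reward[b] + math.sqrt(2 * math.log(t) / pulls[b]),
            )
        reward = 1.0 if rng.random() < success_prob[beam] else 0.0
        pulls[beam] += 1
        mean_reward[beam] += (reward - mean_reward[beam]) / pulls[beam]
    return pulls, mean_reward


if __name__ == "__main__":
    pulls, means = ucb1_beam_selection([0.2, 0.5, 0.9, 0.4])
    print("pulls per beam:", pulls)
```

A contextual variant would condition the index on side information such as traffic pattern or vehicle position, instead of keeping a single estimate per beam.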
The authors in [155] proposed an online learning algorithm for optimal beam selection in mm-wave vehicular networks using CMAB. The developed algorithm is able to predict the beam direction of the target mm-wave BS from the serving mm-wave BS depending on the current traffic pattern while considering the user QoS requirements. In [156], a multi-agent RL (MARL) approach for the joint optimization of user scheduling and beam selection in mm-wave networks was developed. The proposed method minimizes the delays associated with beam selection while ensuring that the users' QoS requirements are satisfied. The authors in [157] proposed a framework for mm-wave beam prediction in multi-UAV communication systems using Q-learning. The proposed model exploits the received coupling coefficients (a pair of analog beamforming vectors from the transmitter and receiver sides of the channel) to determine the optimal beam that maximizes the received signal-to-interference-plus-noise ratio.
A learning-based approach for optimizing beam search in mm-wave BSs in an indoor network environment while considering user mobility was proposed in [158]. The proposed approach uses multi-state Q-learning while exploiting user trajectory-based data from the radio. The authors argue that the proposed method is superior to traditional methods because it jointly considers BS and beam selection, can be adapted to different indoor environments and user mobility patterns, and minimizes the delays due to beam search. A beam selection framework for high-mobility vehicular networks, which aims at enhancing the data rate and minimizing the number of HOs and the disconnection time, was proposed in [159]. The proposed framework utilizes parallel Q-learning to determine the optimal beam for each vehicle. The algorithm leverages the possibility of simultaneously collecting information from multiple vehicles on the road to hasten its convergence to the optimal solution. In [160], an RL framework for beam selection in NLOS scenarios was introduced. The proposed framework employs Q-learning to determine the optimal NLOS beam for each user based on the user's QoS requirement.
The user location can also be exploited to identify the optimal beam for user association. The authors in [161] proposed an ML-based beam selection strategy that considers the user position and receiver orientation to select the optimal beam pair, thereby reducing the overhead associated with beam alignment. Moreover, since their approach treats beam selection as a multi-class classification problem, the neural network model is enriched with a large amount of CSI to enable it to select not only the strongest beam based on the magnitude of the received signal but also an alternative beam. This makes the proposed approach resilient against blockages. A hierarchical learning-based beam selection scheme was proposed in [162] for multiple users in mm-wave vehicular networks. The authors developed a graph neural network (GNN) model for beam pair selection while considering CSI and user positions. A deep learning model based on a CNN architecture was proposed in [163] for selecting the beam that gives the best communication performance to users in a massive MIMO system while considering user position. In [164], the authors developed a learning-based beam alignment scheme for mm-wave systems that can determine the optimal BS while exploiting only the user position. The proposed scheme can predict the optimal BS and beam with reduced search time, even with incomplete user location information. A position-based online learning framework for optimal beam pair selection and refinement, which considers only the user position, was proposed in [165]. The beam selection and refinement problem was first modeled as a continuum-armed bandit problem, after which a risk-aware greedy upper confidence bound (UCB) algorithm was developed for beam selection, while hierarchical optimistic optimization (HOO) was used for beam refinement. The authors observed that where more information regarding the environment can be obtained from the BS or user devices, the training overhead can be further reduced.
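The idea of mapping a user position to a ranked list of beams, with an alternative beam kept in reserve for blockages, can be sketched as follows. This is only an illustration under stated assumptions: a nearest-neighbour lookup stands in for the neural network models of the cited works, and the `fingerprints` database, the beam indices, and both function names are hypothetical.

```python
import math


def rank_beams_by_position(user_xy, fingerprints, k=2):
    """Return the k candidate beams stored for the nearest known position.

    fingerprints: list of ((x, y), [beam indices ordered best-first]);
    a hypothetical database built from earlier measurements.
    """
    nearest = min(fingerprints, key=lambda fp: math.dist(user_xy, fp[0]))
    return nearest[1][:k]


def select_beam(candidates, blocked):
    """Pick the first candidate beam that is not currently blocked."""
    for beam in candidates:
        if beam not in blocked:
            return beam
    return None  # no usable candidate: fall back to a full beam sweep


fingerprints = [
    ((0.0, 0.0), [3, 5, 1]),
    ((10.0, 0.0), [7, 6, 2]),
    ((10.0, 10.0), [12, 11, 9]),
]
candidates = rank_beams_by_position((9.0, 1.5), fingerprints, k=2)
print(candidates)                             # [7, 6]
print(select_beam(candidates, blocked={7}))   # 6
```

Only `k` beams need to be tested instead of the whole codebook, and the second candidate provides the blockage resilience described above.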

b: BASE STATION SELECTION
A proactive HO framework that enables users to switch their connection to another BS before link disconnection was proposed in [101]. The proposed method uses deep learning to predict obstacles and trigger HO before link disconnection occurs, thereby ensuring the reliability of the link and preventing data transmission delays due to link disconnection. In [166], a HO mechanism for selecting the optimal BS in mm-wave networks based on the MAB approach was developed in order to ensure that the user has a longer connection time with the BS after HO. The authors considered the user's post-HO trajectory and the blockages along the LOS to predict future HOs. An RL framework for minimizing frequent HOs while satisfying users' QoS requirements was proposed in [167] using MAB. The proposed framework takes into account the channel conditions and user QoS requirements before triggering HO. Furthermore, two BS selection algorithms were developed based on user density, for single-user and multi-user HO scenarios, respectively. An intelligent HO decision framework for BS selection was proposed in [97]. The proposed framework uses a double DRL (DDRL) algorithm to learn the optimal BS for user association in order to minimize the number of HOs and optimize the average throughput along the user trajectory. A distributed learning framework for HO optimization in dense mm-wave networks was proposed in [168] and [169] in order to minimize frequent HOs and optimize user throughput. The framework employs MARL, where each user is modeled as an agent that takes an independent HO decision based on its local observation, thereby reducing signaling overhead.
The authors in [170] introduced a HO optimization algorithm based on RL for 5G systems. They modeled the HO problem as a CMAB, then developed a Q-learning solution.
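How Q-learning can trade link quality against HO cost is illustrated by the toy sketch below. It is not the algorithm of [170]: the one-dimensional road, the two BS positions, the linear `rate` model, and the `HO_COST` penalty are all assumptions made for illustration. The state is the pair (position, serving BS), the action is the BS to use at that position, and the reward is the normalized rate minus a penalty whenever the serving BS changes.

```python
import random

N_POS = 10                 # positions 0..9 along a straight road
BS_POS = (0, 9)            # two hypothetical BSs at the road ends
HO_COST = 0.3              # penalty charged for every handover


def rate(pos, bs):
    """Toy normalized rate: decays linearly with distance to the BS."""
    return 1.0 - abs(pos - BS_POS[bs]) / (N_POS - 1)


def train(episodes=3000, alpha=0.5, gamma=1.0, epsilon=0.5, seed=0):
    rng = random.Random(seed)
    # q[(position, serving_bs)][action], action = BS to use at this position
    q = {(p, s): [0.0, 0.0] for p in range(N_POS) for s in (0, 1)}
    for _ in range(episodes):
        serving = 0
        for pos in range(N_POS):
            state = (pos, serving)
            if rng.random() < epsilon:
                action = rng.randrange(2)                        # explore
            else:
                action = max((0, 1), key=lambda a: q[state][a])  # exploit
            reward = rate(pos, action) - (HO_COST if action != serving else 0.0)
            # Value of the next state; zero after the last position.
            future = max(q[(pos + 1, action)]) if pos + 1 < N_POS else 0.0
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            serving = action
    return q


def greedy_handover_positions(q):
    """Roll out the learned policy and list the positions where a HO occurs."""
    serving, handovers = 0, []
    for pos in range(N_POS):
        action = max((0, 1), key=lambda a: q[(pos, serving)][a])
        if action != serving:
            handovers.append(pos)
        serving = action
    return handovers


q_table = train()
print(greedy_handover_positions(q_table))
```

Because every HO is penalized, the learned policy should hand over once, at the point where the second BS becomes the stronger one, rather than switching back and forth. A large exploration rate is used here only because the toy environment is tiny and Q-learning is off-policy.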
In [171], the authors proposed a deep learning model for user localization and proactive HO management that considers user behaviour in the network. The proposed model uses the received signal measurements to reduce the number of unnecessary HOs and predict the user location while ensuring that the throughput of the network is maintained. A joint optimization framework for minimizing HO frequency and maximizing user throughput was proposed in [172]. The HO and power allocation problem was modeled as a cooperative multi-agent task, after which a MARL framework using proximal policy optimization (PPO) was developed. The model training was performed in a centralized manner, after which decentralized policies were obtained for each user. The authors in [173] proposed a learning framework that jointly optimizes HO and beamforming for mm-wave networks. An RL algorithm was employed to determine the optimal backup BSs along the user trajectory, which helps reduce the signaling overhead during channel estimation for user association and minimize the number of HOs. This ensures an enhanced data rate along the user trajectory.
A learning-based load balancing HO mechanism was proposed in [95]. The user association problem was modeled as a non-convex optimization problem, after which a deep deterministic policy gradient (DDPG) RL algorithm was applied to solve it. The algorithm's goal is to associate all the users on different trajectories in the network environment with the optimal BSs in a way that maximizes their sum rate and reduces the number of HO occurrences. The authors in [174] exploited user-centric information to predict users' future content requests and mobility patterns, after which the optimal user association with UAVs, the UAVs' positions, and the content to cache at the UAVs were determined. The goal of their work was to enhance the QoE of the users while reducing the UAVs' transmission power. An ML framework using echo state networks was proposed to predict the users' future content requests and mobility patterns, after which analytical derivations of the optimal UAV locations and the contents to cache at the UAVs were performed. In [175], a joint optimization framework was proposed for both resource and cache management over licensed and unlicensed bands in UAV networks. To solve the optimization problem, a liquid state machine learning algorithm was developed to predict the distribution of the user content as well as to enable the UAV to select the optimal resource allocation strategy for serving user requests. In addition, a closed-form expression was derived to determine the optimal user content to cache and the optimal resource allocation.

VI. CHALLENGES AND FUTURE RESEARCH DIRECTIONS
Although many studies have been carried out to address the issue of HO management in 5G, specifically for mm-wave applications, many significant challenges still need to be addressed. In this section, we briefly highlight some of the challenges associated with the application of ML techniques to HO management in 5G and present future research directions.
A. DATA SET AVAILABILITY
ML-based implementations rely on the availability of sufficient¹² and quality¹³ data for model training. However, ML-based mobility and HO optimization requires data sets containing user mobility history, which are usually very difficult to obtain due to various data protection regulations [118]. Hence, synthetic data generated via network simulations are normally used for model training. There is also the issue of data uniformity, where a data set generated on one platform cannot be used across different platforms. Hence, there is a need to create quality data sets that can be used as benchmarks to assess the accuracy of the different ML models being proposed for mobility prediction and HO optimization, in order to verify their validity.

B. PRIVACY AND SECURITY
Mobile service providers are typically responsible for protecting the privacy of their customers. As a result, it is very difficult to release complete and quality data sets from mobile networks without revealing the personal identity of the users and compromising their privacy. Moreover, ML model security is another issue that should be considered, as deep learning models can be subject to adversarial attacks. These attacks often inject fake data into the training data set, thereby reducing model accuracy and resulting in sub-optimal network performance [178] and [179]. More research needs to be conducted on how to properly anonymise data sets from mobile operators in order to prevent vital user information from being revealed while retaining the relevant features of the data set. There is also a need for more research on how to secure deep learning models against adversarial attacks that seek to undermine their accuracy. In addition, more privacy-preserving ML algorithms, such as federated learning, need to be developed and employed for mobility management and HO optimization in order to guarantee the security and secrecy of user information.
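The aggregation step at the core of federated learning can be sketched as a sample-weighted average of locally trained parameters, in the style of federated averaging. The three UEs, their weight vectors, and their sample counts below are hypothetical. The sketch omits local training, secure aggregation, and transmission over the air.

```python
def federated_average(local_weights, num_samples):
    """Aggregate per-device model weights without collecting raw data.

    local_weights: one flat list of floats per device (same length each).
    num_samples: number of local training samples behind each update.
    """
    total = sum(num_samples)
    dim = len(local_weights[0])
    return [
        sum(w[i] * n for w, n in zip(local_weights, num_samples)) / total
        for i in range(dim)
    ]


# Three hypothetical UEs report locally trained weights and sample counts.
updates = [[0.0, 2.0], [1.0, 4.0], [4.0, 0.0]]
samples = [10, 30, 60]
global_weights = federated_average(updates, samples)
print(global_weights)
```

Only model parameters leave the devices, so the mobility traces used for local training never need to be shared with the operator or a third party.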

C. GENERALIZATION OF THE ML MODEL
Generalization is the ability of an ML model to learn from seen data and make accurate predictions on unseen data. Nevertheless, it is not always clear whether a trained model truly generalizes, since it is difficult to determine whether the data set used for model training captured all the environment features and parameters, such that the model was exposed to all of them during training. Therefore, when generating synthetic data sets or obtaining real data sets for mobility prediction and HO optimization, it is essential to ensure that all the features of the environment are adequately represented in the data set in order to enhance the generalization of the ML models.

¹² Sufficiency here means that the available data set should be large enough to enable proper training of the ML model.
¹³ Quality here means that the data set must be free from missing entries, duplicated entries or any form of noise that may hinder the ML model from accurately learning from the data set.

D. OFFLINE VERSUS ONLINE LEARNING
Due to the large dimension of 5G and B5G networks as well as the large number of parameters that need to be learnt during mobility prediction and HO optimization, network designers often have to resort to offline training to reduce both time and space complexity. The successful implementation of an offline-trained model depends on the adaptability and generalization ability of the model. However, HO optimization often requires real-time training and decision making. Hence, there is a need to reduce the number of parameters that need to be trained by employing clustering methods [120], and to use hardware acceleration [180] to speed up the ML training process. There might also be a need to combine offline and online learning, where the model goes through periodic updates and refinement during real-time operation.

E. CENTRALIZED VS DISTRIBUTED DEPLOYMENT
ML models can be implemented either centrally or in a distributed manner, depending on the network configuration, with each option having its advantages and disadvantages. On the one hand, decentralized implementation offers low signaling overhead and less computation. At the same time, it faces the challenge of inaccurate network optimization, because only localized information is available and global network information is lacking. On the other hand, centralized learning has global information about the environment and can perform coordinated and collaborative learning that leads to global network optimization. However, it results in massive signaling and computation overhead due to periodic data collection, as well as end-to-end delays. Hence, there must be trade-off considerations between global accuracy and overhead [181]. With the increased network dimensions, complexity, and heterogeneity of UEs in 5G and B5G networks, decentralized ML approaches would be more suitable for mobility management and HO optimization, as they can help preserve user privacy, reduce latency and communication overhead, and minimize the energy consumption of UEs [124]. However, the issue of coordination in decentralized learning and the challenges involved in sending the locally trained models from the user devices to the central entity over imperfect channels [126] require more research on how to effectively implement decentralized ML approaches for mobility management and HO optimization in 5G and B5G networks.

VOLUME 9, 2021

F. FREQUENT HANDOVER
The deployment of a large number of mm-wave small cells (due to the short transmission distance of mm-wave signals) to serve the traffic demands of the growing number of UEs would result in more frequent and unnecessary HOs as well as HO failures. More frequent HO occurrences result in increased signalling overhead, reduced QoS for UEs, and increased device power consumption [182].
Therefore, improved models need to be developed to reduce the number of HO occurrences as well as to optimize the HO decision-making process.
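The effect of the HO decision rule on HO frequency can be illustrated with a simplified sketch. It assumes a two-cell scenario and an A3-style trigger with a hysteresis margin and a time-to-trigger window; the RSRP traces, parameter values, and function name are hypothetical, and the rule is a simplification of the standardized measurement events.

```python
def count_handovers(rsrp_a, rsrp_b, hysteresis_db=0.0, time_to_trigger=1):
    """Count HOs between two cells under a simplified A3-style rule.

    A HO is triggered once the other cell has been stronger than the serving
    cell by more than `hysteresis_db` for `time_to_trigger` consecutive samples.
    """
    serving = 0 if rsrp_a[0] >= rsrp_b[0] else 1
    streak = 0
    handovers = 0
    for a, b in zip(rsrp_a, rsrp_b):
        serving_rsrp, other_rsrp = (a, b) if serving == 0 else (b, a)
        if other_rsrp > serving_rsrp + hysteresis_db:
            streak += 1
        else:
            streak = 0
        if streak >= time_to_trigger:
            serving = 1 - serving
            handovers += 1
            streak = 0
    return handovers


# Hypothetical RSRP traces (dBm) that fluctuate around a cell edge.
cell_a = [-90, -95, -91, -96, -92, -97, -100, -103]
cell_b = [-94, -93, -94, -93, -94, -93, -92, -91]
print(count_handovers(cell_a, cell_b))   # 5: ping-pong HOs without protection
print(count_handovers(cell_a, cell_b, hysteresis_db=3, time_to_trigger=2))  # 1
```

Fixed margins like these suppress ping-pong HOs but delay necessary ones, which is one motivation for learning the HO decision from data instead of hand-tuning it.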

G. SIGNALING OVERHEAD
The increase in the number of UEs as well as the large-scale deployment of mm-wave BSs would make the UEs more prone to frequent HOs. The HO process usually involves the transmission of signaling packets, which increases the signaling overhead in the network. The greater the number of HO occurrences, the higher the signaling overhead. Signaling overhead during HO leads to interruptions in data transmission, which reduce the user throughput and increase latency [183]. Newer approaches need to be developed to reduce the number of HOs and shorten the data interruption time caused by the transmission of HO signaling messages, by optimizing HO parameters and eliminating the concept of area notification.

H. DEVICE POWER CONSUMPTION
The process of HO in mm-wave communications requires that the UEs make intra-frequency or inter-frequency measurements, depending on their carrier frequency and that of the BS. These measurements increase as the number of mm-wave BSs increases, thereby resulting in increased power consumption for the UEs [184]. This power consumption can be minimized by reducing the measurement gaps and ensuring that the UEs take measurements within the DRX cycle. The shorter the DRX cycle, the lower the power consumption [100]. Hence, mobility and HO management schemes that can reduce the DRX cycle need to be developed in order to optimize the device power consumption.

I. LOAD BALANCING
The uneven distribution of UEs within the network, due to random cell positioning and UE mobility, causes some cells to become more heavily loaded than others as more UEs associate with them. This load imbalance among the cells causes frequent HOs and degradation in the QoS of the UEs [94]. Also, in an attempt to prevent frequent HOs, most of the HO optimization methods proposed in the literature suggest HO skipping or prolonged user connection to a BS [185], which could lead to load imbalance in the network. Therefore, further research needs to be conducted on the effects of the proposed HO optimization schemes on the load balance of the network in order to ensure the QoS of UEs and minimize network congestion.
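One way to quantify the load imbalance discussed above, when evaluating a HO optimization scheme, is Jain's fairness index computed over the per-cell loads. The choice of this metric is an assumption made here for illustration, and the UE counts are hypothetical.

```python
def jain_fairness(loads):
    """Jain's fairness index: 1.0 for equal loads, 1/n when one cell has all."""
    total = sum(loads)
    squares = sum(x * x for x in loads)
    if squares == 0:
        return 1.0  # no load anywhere counts as perfectly balanced
    return total * total / (len(loads) * squares)


# Hypothetical number of UEs attached to each of four cells.
print(jain_fairness([25, 25, 25, 25]))   # balanced
print(jain_fairness([70, 10, 10, 10]))   # one overloaded cell
```

Tracking such an index alongside the HO count would expose schemes that reduce HOs only by keeping users attached to already congested cells.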

VII. CONCLUSION
HO management has long been one of the main issues in cellular networks, and it is envisioned to become more critical with the introduction of 5G networks due to the prospective capacity enhancement technologies. ML has been quite pervasive in numerous domains, including healthcare, agriculture, disaster prevention, etc., and it has become a reality in 5G networks with proven capabilities in terms of effectiveness and efficiency. Besides, almost all the visionary works that attempt to draw a framework for 6G networks foresee that ML will lie at the heart of 6G. In this survey, we first took a snapshot of the current status of cellular communication networks, and then gave a comprehensive tutorial on both mobility and HO management in 5G after elaborating on some distinctive characteristics of 5G networks. After that, the major ML branches, namely supervised learning, unsupervised learning, and reinforcement learning (RL), were introduced, and their applications to the HO management process were presented. An extensive review of recent studies on ML-aided HO management techniques was provided under a novel classification based on the source of the data for ML implementations. Lastly, the challenges that can be faced while incorporating ML into the HO management procedure were identified and thoroughly discussed, followed by a discussion of future research directions.
Although there are multiple survey papers in the literature reviewing mobility and/or HO management in 5G networks, this is a unique attempt to focus solely on ML applications to HO management. Further, the scope of the paper was limited to HO management in order to analyze the topic in a concise and clear manner. The state-of-the-art review encompasses the most recent studies in order to demonstrate the current research focus within the community and to keep readers up to date with the most relevant and timely work. In addition, we provided a novel taxonomy based on the source of the data. Visual data aided HO management techniques, which have been overlooked in the existing survey papers, were included in the discussion along with the traditional wireless data driven applications. With this, we aimed to draw research attention from conventional approaches towards visual data sources, given their promising potential in HO management. We also included a discussion on how intelligent HO management can be helpful in emergency scenarios involving mobile clinics, ambulances, remote hospitals, etc. This is quite important in its own right considering the current COVID-19 pandemic, which created chaos in the world and put the need for intelligent and remote control under the spotlight.