A Comprehensive Review on Artificial Intelligence/Machine Learning Algorithms for Empowering the Future IoT Toward 6G Era

The evolution of wireless network systems over the decades has brought new services to users with the help of innovative network and device technologies. 5G network systems are now about to be deployed, creating the opportunity to realize massive connectivity with high throughput, low latency, high energy efficiency and security. 5G also focuses on providing massive Internet of Things (IoT) connectivity as well as services for healthcare, large-scale agricultural and industrial production, intelligent traffic control, and electricity generation, transmission and distribution systems. However, the ever-increasing number of user devices is directing researchers towards beyond-5G systems that can serve these devices with higher bandwidth. Research on 6G wireless network systems has already begun, with the aim of providing higher bandwidth availability and assured quality of service (QoS) for larger, densely connected networks of devices. Researchers are leveraging artificial intelligence (AI)/machine learning (ML) to enhance future IoT network operations and services. This paper discusses AI/ML algorithms that can help in developing energy-efficient, secure and effective IoT network operations and services. In particular, it concentrates on the major issues and factors that influence the design of communication systems for future IoT with the integration of AI/ML. It also highlights application domains, including smart healthcare, smart agriculture, smart transportation, smart grid and smart industry, that can operate efficiently and securely. Finally, the paper ends with a discussion of future research directions for these algorithms in addressing the open issues of future IoT network systems.


I. INTRODUCTION
Wireless communication has been serving the world with a plethora of technological inventions since the discovery of cellular frequency reuse [1]. The evolution of communication systems from the first generation (1G) to the fourth generation (4G) has been driven by innovations in channel coding, multiple access and antenna design technologies. Recent advances in these technologies, together with the invention of new hardware devices and software systems, have given birth to the fifth generation (5G) wireless network systems. These systems have in turn made it possible to realize the concept of the Internet of Things (IoT) and to deploy smart facilities for healthcare, massive agricultural and industrial production, intelligent monitoring for effective traffic control and electricity generation, and many more. While the 5G network systems are on the way to being fully standardized, research on the sixth generation (6G) wireless communication systems has already begun [2], aiming to provide seamless connectivity to the increasing number of users and to enhance IoT services.
The associate editor coordinating the review of this manuscript and approving it for publication was Xiaolong Li.

A. EVOLUTION TOWARDS FUTURE NETWORKS
A revolution in wireless communication systems occurs almost every decade. Several multiplexing and multiple access schemes, such as frequency division multiplexing (FDM), time-division multiple access (TDMA), code-division multiple access (CDMA) and orthogonal frequency-division multiple access (OFDMA), have helped to introduce new services alongside voice calls and the short message service (SMS). In this section, the evolution towards next generation network systems is highlighted. This evolution is represented in Fig. 1.
The 3G wireless network concept was launched in 2000. It enabled people to use internet services such as e-mail, web browsing and video calls with the help of the Enhanced Data rates for GSM Evolution (EDGE) and CDMA2000 standards [3]. The highest available data rate was reported to be 2 Mbps; however, cellular users could obtain a maximum of 384 kbps. CDMA was used as the multiple access scheme to provide these facilities to users. Although 3G was supposed to provide wide bandwidth, the bandwidth becomes narrower in indoor environments [3]. The latency of 3G is a further issue. With the emergence of the 4G network system, improvements in latency and user data rate were observed. 4G offered a 10 Mbps data rate with a 100 Mbps peak, and its latency improved by about 10 times with respect to that of 3G.
LTE-Advanced, the evolution of the long term evolution (LTE) standard, has enhanced the user experience of mobile applications over cellular data.
However, it is shown in [4] that the enormous growth of cellular traffic makes it difficult for 4G network systems to provide uninterrupted connectivity to all users. Besides, new smart devices are being deployed to enable smart applications, which require large bandwidth to stay connected to the internet. The 5G wireless network system realizes the concept of the Internet of Things (IoT) to provide smart services such as smart healthcare, smart manufacturing, smart electrical power systems, smart transportation and smart agriculture. 5G is expected to fulfill the requirements of the International Mobile Telecommunications-2020 (IMT-2020) standard, which is defined by the Radiocommunication Sector of the International Telecommunication Union (ITU-R) [5]. Although 5G supports enhanced mobile broadband (eMBB), ultra-reliable and low-latency communications (URLLC) and massive machine-type communication (mMTC), applications such as holographic communication have more demanding requirements than the current 5G specifications can meet [6]. This is where 6G shows its potential to offer higher bandwidth and transmission speed with even lower latency than 5G.
According to the IMT-2020 standard, the data rate, spectral efficiency and energy efficiency targets for 6G are much higher, and the latency target much lower, than those for 5G/B5G, as shown in Table 1. In [7], the 6G requirements are also reported in terms of reliability,
• It demonstrates the advantage of AI and ML in improving IoT network operations and services in terms of their energy efficiency, level of security, and overall effectiveness.
• The influence of AI/ML in enhancing smart facilities is discussed.
• Future research opportunities with AI/ML-based implementations in the domain of communication systems for future IoT are highlighted.

C. ORGANIZATION OF THE PAPER
This paper is organized as shown in Fig. 4. Section II discusses the applications of AI/ML in different network paradigms and enablers. Section III highlights the AI/ML algorithms and their exploration in recent research areas. Section IV discusses how the smart facilities provided by the communication systems for future IoT can be improved with the help of AI/ML algorithms. Section V provides the future research possibilities and related challenges, and Section VI concludes the paper.
distribution execution [8]. However, HetNets face issues related to resource management, interference and system optimization. For instance, the study in [9] addresses the cell selection approach in HetNets. Cell selection needs to consider different cell sizes and high frequency reuse, as well as issues such as high handover rates caused by users moving within small areas, which can affect both network performance and user experience. In device tracking applications, Received Signal Strength (RSS)-based Wi-Fi fingerprinting is one of the most common methods in modern communication systems. However, this fingerprint-based localization method involves time-consuming procedures during the offline survey phase [10], which may degrade device detection accuracy in HetNets.
Implementing ML in the cell selection approach helps in finding effective solutions. A central controller, for example a software defined network (SDN) controller, can utilize ML to perform effective cell selection in communication systems for future IoT. However, it needs to select the appropriate base station by taking into account the characteristics of the HetNet and user device mobility, while minimizing the handover rate [9], [11]. Mobile user device detection with Wi-Fi fingerprinting and crowdsourcing mechanisms can relieve the effort of surveying procedures. However, the presence of a large number of heterogeneous devices can affect this localization method. An ML model can be integrated with the crowdsourcing-based Wi-Fi localization method to address this issue [10]. ML methods such as supervised learning are generally used to learn the statistical model of the wireless sensors' data, which requires effort-intensive training. In such cases, unsupervised learning models can be employed to determine the statistical model of the collected sensor data [12]. Along with optimum cluster selection in heterogeneous networks, energy harvesting mechanisms can be executed with AI/ML approaches to construct energy-efficient IoT network systems [13], [14].
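To make the RSS fingerprinting idea concrete, the following is a minimal sketch rather than the method of [10]. It assumes a k-nearest-neighbor matcher, a common choice that the text above does not prescribe, and the survey table `FINGERPRINTS` and the function `locate` are hypothetical names with made-up values.

```python
import math

# Hypothetical offline survey (illustrative values only):
# (x, y) position in metres -> RSS in dBm observed from three access points.
FINGERPRINTS = [
    ((0.0, 0.0), (-40.0, -70.0, -75.0)),
    ((10.0, 0.0), (-70.0, -40.0, -75.0)),
    ((0.0, 10.0), (-70.0, -75.0, -40.0)),
    ((10.0, 10.0), (-80.0, -60.0, -60.0)),
]


def locate(rss, fingerprints=FINGERPRINTS, k=2):
    """Estimate a position as the mean of the k fingerprints closest in RSS space."""
    # Rank surveyed points by Euclidean distance between RSS vectors.
    ranked = sorted(fingerprints, key=lambda fp: math.dist(rss, fp[1]))
    nearest = ranked[:k]
    x = sum(pos[0] for pos, _ in nearest) / k
    y = sum(pos[1] for pos, _ in nearest) / k
    return (x, y)


# Online phase: a device reports an RSS vector and is placed between
# the two surveyed points whose fingerprints match it best.
print(locate((-55.0, -55.0, -75.0)))  # -> (5.0, 0.0)
```

The cost of the offline phase is visible here: every row of the survey table has to be measured by hand, which is the effort that crowdsourcing and unsupervised models aim to reduce.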

2) ULTRA-DENSE NETWORKS
The ultra-dense network paradigm is identified as having evolved from HetNets, as mentioned in [15]. This paradigm consists of a large number of base stations, forming pico/femto cells, and access points such as remote radio heads (RRH) and relays. The large number of low-powered base stations increases the system throughput and provides good network coverage. Although the reduced path loss between the base stations and the user devices offers the opportunity to improve network throughput, and thus to serve more users with the available spectrum resources, it also introduces inter-user interference. Besides, the inclusion of new devices within the cells increases the network load and thus complicates resource allocation [16]. As in HetNets, handover optimization needs to consider user device mobility [17].
VOLUME 10, 2022
Most interference mitigation approaches model interference intensity on the basis of geographical distance data and a path loss model, which provides only a moderate estimate of the interference. This is where AI/ML methods can be implemented for more precise modeling of the interference [16]. Load-aware resource allocation can address the resource allocation problem as the number of user devices in ultra-dense networks grows. A clustering-based resource allocation strategy can minimize the processing complexity of resource allocation [18]. However, ignoring the influence of inter-cluster interference and the number of antennas at both the transmitters and receivers in the network can hamper the deployment of clustering schemes. Also, clustering mechanisms that do not estimate the distribution of user device density provide inconsistent performance. ML-based clustering algorithms have attracted the attention of researchers because they are able to address the above-mentioned clustering issues [18]. Beyond resource allocation, ML-based user association schemes can use historical data to form clusters [17]. An energy-efficient network topology can also be developed with the help of unsupervised learning algorithms [19].
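As an illustration of ML-based clustering for resource allocation, the sketch below groups user devices by position with plain k-means (Lloyd's algorithm). It is an assumption-laden toy, not the scheme of [18]: the choice of k-means, the coordinates and the fixed initial centroids are all hypothetical, and inter-cluster interference is ignored.

```python
import math


def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's k-means. Returns (centroids, labels)."""
    centroids = list(centroids)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each user to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, labels


# Hypothetical user positions (metres) forming two hotspots.
users = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0), (11.0, 10.0), (10.0, 11.0)]
# Deterministic initialisation: one seed user per hotspot (an assumption).
centroids, labels = kmeans(users, [users[0], users[3]])
print(labels)  # -> [0, 0, 0, 1, 1, 1]
```

Each resulting cluster could then be served by one small cell or one resource block group, so that allocation is solved per cluster instead of per device.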

3) OPEN RADIO ACCESS NETWORK (O-RAN)
The O-RAN architecture is the evolved form of cloud RAN (C-RAN), a cloud network architecture that processes signals and executes network functionalities [20]. C-RAN utilizes three components to perform these tasks, namely baseband units, RRHs and the fronthaul links that interconnect them. C-RAN enables effortless network deployment and reduces the maintenance cost. Besides, since the baseband units can be represented as cloud devices, network scalability can be achieved by supporting several network protocols. However, issues such as fronthaul latency, overhead in the RRH-cloud links and security threats prevail because C-RAN is a centralized network architecture. This entails the integration of virtualization technologies and edge computing, which gives rise to virtual RAN (V-RAN). Scalability and latency issues can be addressed with the help of mobile edge computing (MEC). Since future wireless networks require automated network processing, ML technologies are introduced in the RAN architecture. The concept of O-RAN has thus emerged.
The O-RAN controller uses ML algorithms to perform network functions through open RAN interfaces. In [21], DL implementations on RAN architectures are surveyed, covering resource management, mobility management and spectrum management. Algorithms such as long short-term memory (LSTM), deep neural network (DNN) and reinforcement learning (RL) can be used for the resource allocation strategies executed by the RAN. Power allocation and resource management strategies may use deep RL (DRL) and federated learning (FL) to deal with the associated problems. Challenges related to mobility management include handover management and the energy utilization of both the base stations and the user devices. Learning algorithms such as recurrent neural network (RNN), Q-learning and actor-critic learning (a DRL-based learning algorithm) are used along with DNN and LSTM. Spectrum management challenges involve channel estimation, signal encoding, decoding, classification and detection, along with beam selection. DNN, LSTM, convolutional neural network (CNN) and RL have been applied to these challenges.
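To show how Q-learning can be applied to a mobility management decision, the following is a toy sketch with tabular Q-learning. Everything in it is a stated assumption rather than a scheme from [21]: the two-zone, two-cell environment, the reward of 1 for attaching to the cell that covers the user's current zone, and the names `train_cell_selector` and `greedy_policy`.

```python
import random


def train_cell_selector(steps=5000, alpha=0.1, gamma=0.5, epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy cell-selection task.

    State: the zone (0 or 1) the user is currently in.
    Action: the cell (0 or 1) the user attaches to.
    Reward (hypothetical): 1.0 if the chosen cell covers the zone, else 0.0.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]  # q[zone][cell]
    zone = 0
    for _ in range(steps):
        # Epsilon-greedy action selection.
        if rng.random() < epsilon:
            cell = rng.randrange(2)
        else:
            cell = 0 if q[zone][0] >= q[zone][1] else 1
        reward = 1.0 if cell == zone else 0.0
        next_zone = rng.randrange(2)  # the user moves at random
        # Standard Q-learning update toward the bootstrapped target.
        target = reward + gamma * max(q[next_zone])
        q[zone][cell] += alpha * (target - q[zone][cell])
        zone = next_zone
    return q


def greedy_policy(q):
    """Best cell for each zone according to the learned Q-table."""
    return [0 if row[0] >= row[1] else 1 for row in q]


q_table = train_cell_selector()
print(greedy_policy(q_table))
```

A realistic handover agent would use a richer state (signal strengths, velocity, cell load) and would penalize unnecessary handovers in the reward, which is where DRL replaces the table with a neural network.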

4) MASSIVE MIMO (mMIMO)
mMIMO employs a large number of base station antennas, which increases network throughput and spectral efficiency. mMIMO has the potential to manage interference through efficient beamforming as well as spatial multiplexing. Research on mMIMO includes channel estimation, precoding, decoding, direction of arrival (DoA) estimation, localization and power control. Conventional channel state information (CSI) estimation techniques use mathematical models that describe MIMO networks without considering non-linear amplifiers. As a result, such conventional channel modeling approaches fail to capture the effect of non-linear amplifiers on the received data, which leads to inaccurate estimation of the received data and an increase in bit error rate (BER). The performance of the successive interference cancellation (SIC) decoding technique also decreases in the presence of such amplifiers because their effect on the received data is not accurately captured. Pilot contamination, channel correlation and the overall energy utilization of mMIMO networks also need to be analyzed.
Recent research on mMIMO includes channel estimation and signal detection mechanisms based on ML approaches. In [22], an extreme learning machine (ELM) is used to perform signal detection in mMIMO systems. In [23], several DL methods are surveyed for the mMIMO system. DNN-based approaches are used to estimate CSI and direction of arrival (DoA), achieving improved performance with respect to least squares (LS) and minimum mean square error (MMSE) estimation. CNN-based methods are used for mmWave band network analysis, CSI feedback, Wi-Fi positioning and power control optimization. Other ML algorithms such as feed-forward neural network (FNN), LSTM and multi-layer perceptron (MLP) are also used for precoding and SIC decoding along with the aforementioned analyses. Such implementations improve BER, signal retrieval and user detection. The influence of non-linear amplifiers on channel equalization, precoding and SIC-based decoding is also properly addressed. The cell-free (CF) mMIMO system aims at expanding wireless network coverage by deploying several access points (APs) connected to a central processing unit (CPU), which can meet the massive connectivity requirements of 6G communication systems for future IoT [24]. Deep learning (DL) can offer high spectral efficiency in the CF-mMIMO system, as explored in [25].
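For reference, the LS baseline against which the DNN-based estimators are compared can be sketched in a few lines. The example below is a deliberate simplification: it assumes a single flat-fading complex coefficient h per link (one antenna pair) and known pilot symbols, whereas mMIMO estimates a full channel matrix. The function name `ls_channel_estimate` and the pilot values are hypothetical.

```python
def ls_channel_estimate(pilots, received):
    """Least-squares estimate of a scalar channel h from y_i = h * x_i + n_i.

    Minimising sum |y_i - h x_i|^2 over h gives
    h_ls = sum(conj(x_i) * y_i) / sum(|x_i|^2).
    """
    numerator = sum(x.conjugate() * y for x, y in zip(pilots, received))
    denominator = sum(abs(x) ** 2 for x in pilots)
    return numerator / denominator


# Hypothetical QPSK-like pilots and a noiseless channel for illustration.
h_true = 0.8 - 0.6j
pilots = [1 + 0j, -1 + 0j, 1j, -1j]
received = [h_true * x for x in pilots]
h_est = ls_channel_estimate(pilots, received)
print(abs(h_est - h_true) < 1e-12)  # -> True
```

LS assumes the linear model y = h x + n; a non-linear power amplifier breaks that model, which is the gap the learning-based estimators discussed above try to close.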

5) VISIBLE LIGHT COMMUNICATION (VLC)
VLC utilizes the visible light spectrum, in the 430-790 THz band, which is license free [26]. In addition to providing lighting, light emitting diodes (LEDs) deliver high-speed communication with high privacy. The LEDs serve as base stations, which keeps construction and maintenance costs low. Since the performance of VLC is not affected by any form of electromagnetic radiation, it is safe to use in places where the probability of electromagnetic emission is high. VLC-based research studies address issues such as resource allocation, channel modeling, handover management, interference and indoor positioning. Precise channel modeling for VLC requires consideration of factors such as optical front-end characteristics, light-induced noise and atmospheric effects [27]. The resource allocation schemes rely on accurate CSI, traffic and data rate demands, as mentioned in [28].
AI/ML-based algorithms help to determine the CSI of a VLC network and to devise resource allocation methods by taking the above-mentioned factors into account [27]. Linear regression models, artificial neural network (ANN), random forest methods, MLP-NN and radial basis function based NN (RBF-NN) can be implemented to model the propagation channel. For resource allocation strategies, genetic algorithm (GA) and particle swarm optimization (PSO) can be used [28]. Indoor positioning for VLC networks uses ML approaches such as K-nearest neighbor, Gaussian process regression, random forest algorithms and Bayesian regularization-based DNN. However, ML algorithms need to minimize the effort required for the site survey. Also, the positioning methods must not be affected by the amount of training data [29]. Deep reinforcement learning (DRL)-based algorithms can be utilized in situations where only a limited amount of training data is available. One study exploits this feature and analyzes a DRL-based adaptive handover mechanism for a hybrid 6G VLC network in order to achieve improved data rate performance [30].

B. ENABLING TECHNOLOGIES
1) RECONFIGURABLE INTELLIGENT SURFACE (RIS)
RIS is an electromagnetic (EM) material surface that can electronically control EM wave propagation [31]. This technology is able to address the spectrum scarcity caused by the obstacles that high-frequency EM signals face. Installing large numbers of relays and base stations to minimize EM wave loss leads to high energy utilization. However, RIS implementation in wireless networks poses some challenges. Accurate CSI estimation is a challenging task because the signal is prone to obstruction and the user end devices may have dynamic characteristics. As a result, RIS-based network analysis is compromised. Besides, the phase shift caused by the reflecting surface complicates the design of beamforming mechanisms. Security and privacy issues also exist in RIS-based network systems and need to be addressed by effective approaches.
ML algorithms have gained popularity for dealing with issues in RIS-based networks. ML is adopted in channel estimation, beamforming development, resource management, detection and security-related operations. In [31], several ML-based approaches and their adaptation in RIS-based networks are surveyed. An RL-based scheme is used to maximize system throughput under both precise and imperfect CSI [32]. A channel estimation scheme proposed in [33] for RIS-enabled mMIMO uses a CNN that is trained with the help of pilot signals. Beyond mMIMO, CNNs are also used to estimate CSI for RIS-mmWave systems [34]. It is shown in [35] that the FL methodology can reduce the transmission overhead created by centralized learning for CSI estimation. Beamforming strategies with DL-based implementations can help in adjusting and optimizing phase shifts in RIS and in reducing the complexity of RIS-based network models. Resource management, symbol detection and security in RIS-based networks use DL, DRL and FL-based algorithms, which improve the prospects for effective deployment of such networks [31].

2) SIMULTANEOUS LOCALIZATION AND MAPPING (SLAM)
Camera-based localization approaches have gained much popularity because they provide valuable information about any object or environment explored by robotic devices such as aerial vehicles and drones, with low-cost implementations and easy hardware setup [36]. The SLAM mechanism is one such localization approach which, as the name suggests, performs both localization and mapping [37], and it is used in applications such as virtual and augmented reality [38]. Upon recognizing the identified object/environment, the robotic device learns its location and builds a map of its surroundings and the trajectory of the identifier. Several visual SLAM schemes exist that are designed with the help of sparse feature points and photometric consistency of dense pixels, as reported in [38]; however, these lack self-learning ability and cannot process large amounts of data.
DL-based algorithms have proven to be effective in identifying the object/environment, classifying the image data and performing semantic segmentation [36], [39]. These algorithms help in addressing problems such as incomplete three-dimensional map reconstruction and the instability of system operations that stems from camera pose estimation, which is affected by the vibration of the mobile bodies, the brightness of the scene and similar factors. In [39], a SLAM framework is proposed which combines information about the camera pose with the depth and semantics of every captured frame in order to recreate the map. DL methods combined with an unsupervised learning strategy can reduce the expense of labeling large datasets. Thus, in [38], an unsupervised learning-based DL method is employed to create the trajectory and maps and to estimate the camera pose. A pose estimation error minimization approach is studied in [40] to accurately extract the features of the tracked objects and environments. Energy-efficient SLAM operations may be ensured by analyzing the minimum number of samples that the sensors/cameras need to collect for effective reconstruction of the map images, which is done in [41].

3) TERAHERTZ (THz) WIRELESS NETWORKS
Future network applications require service provision that satisfies high QoS and quality of experience (QoE) requirements, which RF band allocation cannot provide to users on a massive scale. Thus, future wireless network systems need to acquire THz bands, ranging from 0.1 to 10 THz [42]. The adoption of THz bands in wireless network systems allows the introduction of both nano-scale and macro-scale applications. However, the highly directional communication links required to address the channel attenuation problem necessitate beam alignment schemes among the network devices. Path loss due to molecular absorption affects the beamforming design, resource allocation and user association schemes. Since THz communication uses short wavelengths, CSI and beamforming methods are affected by small variations in the communication channel. Signal detection proposals need to consider the hardware imperfections of high-frequency transceivers. Intelligent handover schemes, routing approaches, traffic prediction and caching schemes also need attention in such high-frequency networks.
The above-mentioned issues can be addressed by ML-based computation methodologies. In [42], the use of ML in applications at different layers is studied. For modulation recognition, CNN, RNN, DNN and expectation maximization (EM) are used. KNN, Bayesian learning and DNN are employed in different studies to perform channel estimation and beam tracking. Signal detection is mostly conducted with DNN-based approaches, as highlighted in [42]. These ML-based operations help to solve problems in the physical layer of THz communication networks. In the MAC layer, operations such as beamforming design, channel allocation and power management are carried out. DNN, K-means clustering, Q-learning and DRL are used for developing beamforming schemes as well as power management. For channel allocation strategy development, DNN, K-means clustering and Q-learning are employed. The network layer deals with user association, mobility management, routing and traffic clustering schemes. Smart user association strategies can be developed by means of DNN, K-means clustering, DRL and Q-learning. Mobility management schemes may employ KNN and Q-learning approaches. Routing algorithm design can utilize decision tree and Bayesian network approaches, while K-means clustering, EM and multilayer perceptron (MLP) may be adopted for traffic clustering scheme development. Transport layer operations such as traffic prediction, caching and computational offloading may incorporate NN, DNN, K-means clustering, Bayesian network, DRL and Q-learning based ML methodologies.

4) BLOCKCHAIN
Blockchain is a distributed and decentralized network which promotes network security [43]. Each transaction is "cryptographically signed" and verified by the participating miners, who hold a record of all the transactions contained in the ledger. After the transaction is validated, the shared ledger is updated by adding it to the last block. The hash value assigned to the ledger blocks prevents alteration of the transaction. Recently, the concept of distributed machine learning (DML) has emerged, which enables the miners to share the interpretation of specific datasets without violating privacy.
Blockchain has the potential to provide effective measures against security threats. However, threats such as majority attacks and Sybil attacks can influence the voting and generate fake miner identities, which can affect the validation of transactions [44]. Blockchain-ML schemes can prevent such attacks to keep transactions safe and secure [45]. SVM, CNN, LSTM and DL can be employed to learn the attack patterns and to devise prevention schemes against them. In another study [46], DRL is used in a blockchain network to jointly optimize consensus protocol selection, computation resource utilization and network bandwidth allocation. Federated learning can also be used to design data sharing mechanisms in blockchain networks that prevent private data leakage [47]. A case study is performed in [48] for a railway system, where an FL-based blockchain system securely accesses data from the rail system to intelligently control the train.
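The way hash values prevent alteration can be illustrated with a minimal hash-chained ledger. This is a sketch under simplifying assumptions, not a description of any system in [43]: it omits transaction signatures, miners and consensus, and the block layout and the helper names `block_hash`, `build_chain` and `is_valid` are hypothetical.

```python
import hashlib
import json


def block_hash(index, transactions, prev_hash):
    """SHA-256 over the block contents, including the previous block's hash."""
    payload = json.dumps(
        {"index": index, "tx": transactions, "prev": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(batches):
    """Link each batch of transactions to the hash of the block before it."""
    chain = []
    prev_hash = "0" * 64  # placeholder hash for the first block
    for index, transactions in enumerate(batches):
        digest = block_hash(index, transactions, prev_hash)
        chain.append(
            {"index": index, "tx": transactions, "prev": prev_hash, "hash": digest}
        )
        prev_hash = digest
    return chain


def is_valid(chain):
    """Recompute every hash and check that the links are unbroken."""
    prev_hash = "0" * 64
    for block in chain:
        if block["prev"] != prev_hash:
            return False
        if block_hash(block["index"], block["tx"], block["prev"]) != block["hash"]:
            return False
        prev_hash = block["hash"]
    return True


ledger = build_chain([["a->b:5"], ["b->c:2"]])
print(is_valid(ledger))  # -> True
ledger[0]["tx"][0] = "a->b:500"  # tamper with an old transaction
print(is_valid(ledger))  # -> False
```

Because each block's hash covers the previous hash, changing one old transaction would require recomputing every later block, which the other miners' copies of the ledger would expose.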

5) QUANTUM COMPUTING
Quantum computing is highly capable of performing fast computations and ensuring the reliability, security and energy efficiency of computational systems [49]. Since it follows the primary concept of quantum mechanics, it supports parallel processing of large volumes of multi-dimensional data. This creates new research opportunities in communication networks. In [49], some applications of quantum computing in wireless communication networks are highlighted, namely multi-user detection, indoor localization, routing and load balancing optimization, and channel estimation. In these studies, the performance of quantum computing-based approaches has been shown to be on par with or superior to that of conventional computing-based approaches.
ML algorithms are already being used to solve several problems related to communication networks. Quantum computing can enhance the performance of ML algorithms in terms of computational speed and complexity [50]. Thus, the concept of quantum machine learning (QML) has emerged. Attempts have been made to execute ML operations such as classification, regression and clustering with QML. An RL method integrated with quantum computing has been studied for spectrum allocation. The training of DL algorithms with quantum computing mechanisms is proposed in [49]. This can enhance the modeling of the algorithm so that it delivers better performance. FL can also be combined with quantum computing to create a learning model without violating data privacy, which is studied in [51]. Optimal quantum key distribution (QKD) protocol selection and intruder detection during the QKD process can be executed by employing ML algorithms [52], [53].

III. RESEARCH TRENDS ON AI/ML ALGORITHMS FOR FUTURE IoT WIRELESS NETWORKS
Several AI/ML algorithms have been proposed for building efficient communication systems for future IoT that provide reliable and secure services to users. AI/ML has been used at different layers of the communication systems to enhance the operations executed at those layers, as shown in Fig. 5. These AI/ML-based communication systems must display low computational complexity and execution time as well as high detection and interpretation accuracy, as shown in Fig. 6. In this section, research trends on communication systems that use heuristic algorithms, supervised and unsupervised learning algorithms, reinforcement learning (RL), deep learning (DL), deep reinforcement learning (DRL) and federated learning (FL) algorithms are highlighted.

A. HEURISTIC ALGORITHMS
This type of algorithm aims at providing quick, effective solutions to a given NP-hard problem [54]. A defined heuristic function is used to find the heuristic value of the artificial network nodes, and the optimum solution is derived based on these values. This approach is applicable to situations where there are no existing solutions to the given problem. Three types of heuristic approaches are discussed here: particle swarm optimization (PSO), ant colony optimization (ACO) and genetic algorithm (GA). Table 2 summarizes the heuristic algorithm-based research for B5G/6G communication systems for future IoT.

1) PARTICLE SWARM OPTIMIZATION (PSO)
PSO is a bio-inspired algorithm in which a set of points, or particles, move in a vector space and share their experiences with each other in order to look for the best possible solution to a given problem [54]. The particles search iteratively, which helps in determining the minima of a given function. PSO is applied to develop resource allocation strategies and to design energy-efficient network systems.
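The iterative search described above can be sketched as a basic global-best PSO. This is a minimal illustration, not the adaptive variant of any cited study: the inertia weight `w`, the acceleration coefficients `c1` and `c2`, the swarm size and the sphere test function are all assumed values chosen for the example.

```python
import random


def pso_minimize(f, dim, bounds, particles=20, iterations=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO. Returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best_pos = [p[:] for p in pos]      # each particle's own best position
    best_val = [f(p) for p in pos]
    g = min(range(particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # best position shared by the swarm
    for _ in range(iterations):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward own best + pull toward swarm best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Move and clamp to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


def sphere(p):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(x * x for x in p)


best_position, best_value = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
print(best_value < 1e-4)
```

In a resource allocation setting the objective `f` would be replaced by a network cost, for example negative throughput, and each particle position would encode one candidate allocation.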

a: APPLICATIONS
A spectrum allocation problem is addressed in [55], where PSO is used to achieve high throughput and QoS. Two scenarios are considered in the study: one optimizes network utilization, and the other optimizes both network utilization and fairness. The adaptive PSO tunes the algorithmic parameters to determine the optimum solution to the problem. The proposed PSO method achieves higher spectrum usage efficiency than the traditional PSO algorithm.
Another study on resource allocation for mobile users is presented in [28]. Data such as channel information, the QoS demands of the users and the overload status of the access points are extracted by an assigned controller. Based on the collected data, the controller allocates spectrum to the users by using a PSO-based algorithm. The proposed resource allocation methodology is compared to the Round Robin algorithm, best channel quality information and genetic algorithm (GA) in terms of system throughput and satisfaction index, which represents the fairness of the resource allocation strategy. An energy efficiency aware coverage control scheme for wireless sensor networks is proposed in [56]. PSO is implemented to optimize location-related information in order to reduce the overlap of the sensing radii. A node sleeping scheme is also introduced to decrease the number of active sensors. Industrial wireless networks use rechargeable sensors for monitoring purposes. This requires a charging schedule to replenish the energy of the sensing devices, which is proposed in [57]. According to the energy status of the sensors, the proposed PSO-based algorithm orders the sensing devices so that they can be charged in due time without hampering the industrial monitoring operation, which in turn enhances the energy utilization rate. Clustering mechanisms enhance network lifetime by reducing energy consumption. However, improper cluster head assignment can accelerate the energy usage of the devices. So, a type-2 fuzzy logic-based PSO algorithm is studied in [58] to elect the cluster head. Along with improved network lifetime, the proposed algorithm also improves packet transmission. Besides improper cluster head selection, high energy depletion prevails in inter-cluster relay networks due to high traffic load. In order to manage energy consumption efficiently, a PSO-based clustering scheme is presented in [59]. The presented scheme also minimizes network faults due to the unexpected failure of master cluster heads by assigning an additional cluster head, which ensures network reliability. Other case studies with PSO-based approaches include underwater target positioning [60], sensor localization [61], path loss modeling [62] and channel parameter estimation [63].

2) ANT COLONY OPTIMIZATION (ACO)
As the name suggests, ACO is derived from the analogy of the behavior of ants searching for food [64]. This optimization method is similar to the PSO method. The sample points, which are known as ants, start moving from one point in different directions of a given space to search for an optimal route to the destination point. The next set of searching objects then follows the determined optimal route to reach the destination point. This algorithm is used in wireless networks for server/network device deployment, routing and energy efficient network design.
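
The route search described above can be illustrated with a minimal sketch. It is not taken from any of the cited studies: the graph `GRAPH`, the function name `aco_shortest_path` and all parameter values (number of ants, evaporation rate, `alpha`, `beta`) are assumptions chosen only for illustration. Ants pick the next hop with a probability that grows with the pheromone level and shrinks with the edge cost, and successful ants reinforce the edges they used.

```python
import random


def aco_shortest_path(graph, source, target, n_ants=20, n_iters=30,
                      evaporation=0.5, alpha=1.0, beta=2.0, seed=0):
    """Toy ant colony optimization for a low-cost path on a weighted graph.

    graph: dict mapping node -> dict of neighbor -> positive edge cost.
    """
    rng = random.Random(seed)
    # One pheromone value per directed edge, initialised uniformly.
    pheromone = {(u, v): 1.0 for u in graph for v in graph[u]}
    best_path, best_cost = None, float("inf")

    for _ in range(n_iters):
        completed = []
        for _ in range(n_ants):
            path, visited = [source], {source}
            while path[-1] != target:
                u = path[-1]
                options = [v for v in graph[u] if v not in visited]
                if not options:  # dead end: this ant gives up
                    break
                # Attractiveness of an edge: pheromone^alpha * (1/cost)^beta.
                weights = [pheromone[(u, v)] ** alpha * (1.0 / graph[u][v]) ** beta
                           for v in options]
                nxt = rng.choices(options, weights=weights)[0]
                path.append(nxt)
                visited.add(nxt)
            if path[-1] == target:
                cost = sum(graph[a][b] for a, b in zip(path, path[1:]))
                completed.append((path, cost))
                if cost < best_cost:
                    best_path, best_cost = path, cost

        # Evaporate old pheromone, then let successful ants deposit new pheromone.
        for edge in pheromone:
            pheromone[edge] *= (1.0 - evaporation)
        for path, cost in completed:
            for a, b in zip(path, path[1:]):
                pheromone[(a, b)] += 1.0 / cost

    return best_path, best_cost


# Hypothetical 5-node network; edge costs could stand for link delay or energy.
GRAPH = {
    "A": {"B": 2, "C": 5},
    "B": {"A": 2, "C": 1, "D": 4},
    "C": {"A": 5, "B": 1, "D": 1, "E": 6},
    "D": {"B": 4, "C": 1, "E": 2},
    "E": {"C": 6, "D": 2},
}
path, cost = aco_shortest_path(GRAPH, "A", "E")
print(path, cost)
```

In this toy graph the cheapest route from A to E passes through B, C and D, and cheaper routes receive more pheromone per traversal, which biases later ants toward them.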

a: APPLICATIONS
An ACO based sensor node deployment scheme is proposed in [65] where conventional ACO is used at first to learn about the sensors deployed in the network. The inessential sensors are then eliminated by means of a modified ACO based algorithm. The proposed algorithm is claimed to achieve an optimal number of deployed sensors with low computational cost. In [66], an energy efficiency oriented virtual machine placement strategy is proposed for cloud computing enabled networks. ACO is coupled with a technique named order exchange and migration (OEM) to involve fewer physical servers and assign an optimum number of virtual machines to ensure lower energy consumption of the overall network system. In the case of target tracking applications, the energy aware scheduling of mobile sensor movements is studied in [67]. A framework is proposed where the sensing area is defined within which the sensors are allowed to move. ACO is modified to configure the tracking parameters at regular intervals to ensure energy saving and effective tracking accuracy. Considering security protocols along with QoS and network lifetime allows network reliability to be evaluated effectively. With this view, an ACO based energy-efficient, QoS aware and secured routing mechanism is proposed in [68]. This computes the end-to-end transmission delay and trust factor of the sensors, which helps in realizing QoS and security of the wireless network.

3) GENETIC ALGORITHM (GA)
GA provides improved solutions to specified problems by adopting the concepts of evolutionary biology [69]. The solutions and their parameters are represented by chromosomes and genes, respectively. A fitness (or objective) function is used to evaluate the fitness of every solution. The three most common operators utilized in such algorithms are selection, crossover and mutation. In the selection stage, individuals with higher fitness values are chosen for the creation of next generation solutions. The mating of two parent solutions is executed in the crossover stage to derive new solutions. The diversity of the population is maintained in the mutation stage by randomly altering the derived solutions. However, the mutation rate should be carefully chosen in order to prevent performance degradation of GA based strategies.
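
The three operators can be sketched on a toy problem as follows. This is only an illustration under stated assumptions: chromosomes are bit strings, the fitness function `count_ones` is a stand-in for a real network objective, and tournament selection, one-point crossover, elitism and all rate values are choices made here rather than details from the cited works.

```python
import random


def genetic_algorithm(fitness, n_genes, pop_size=30, generations=60,
                      crossover_rate=0.9, mutation_rate=0.02, seed=0):
    """Toy GA over bit-string chromosomes that maximizes `fitness`."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]

    def select(pop):
        # Selection: the fitter of two randomly drawn individuals wins.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        # Elitism (an added assumption): keep the best chromosome unchanged.
        next_gen = [max(population, key=fitness)[:]]
        while len(next_gen) < pop_size:
            p1, p2 = select(population), select(population)
            if rng.random() < crossover_rate:
                # Crossover: splice the parents at a random cut point.
                cut = rng.randint(1, n_genes - 1)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Mutation: flip each gene with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)


def count_ones(chromosome):
    """Hypothetical fitness: number of genes set to 1."""
    return sum(chromosome)


best = genetic_algorithm(count_ones, n_genes=20)
print(best, count_ones(best))
```

Raising `mutation_rate` far above a few percent turns the search into a nearly random walk, which is one way to see why the mutation rate has to be chosen carefully.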

a: APPLICATIONS
GAs are mostly used for server deployment, offloading operations, routing configurations and energy efficient network designs. For instance, in an edge computing enabled network, user offloading decisions must be considered to develop server deployment strategies. The joint optimization of the server deployment and the offloading decision scheme is pursued in [70]. GA is used in the proposed scheme to solve the optimization problem while achieving satisfactory service and user delays. In [71], sensor node placement in integrated access and backhaul (IAB) networks and link distribution in non-IAB networks are studied using GA based approaches. This study also evaluates the bypass of temporal blockages by considering the effect of routing mechanisms. Another routing optimization method is studied in [72] that uses GA for optimum sensor node selection, which addresses redundancy and energy depletion problems. While designing an energy efficient wireless network, the performance and the energy depletion of the network devices must be balanced to extract their optimum performance. A compressive sensing algorithm is thus presented in [73] where GA optimizes the quantity of measured samples, the range of transmission and the sensing matrix. Cluster head selection in a sensor network, proposed in [13], uses GA with a fitness function that takes into account the density of the nodes, the distance among them, their energy consumption and the capability of the heterogeneous network devices. Machine learning strategies such as backpropagation neural networks and unsupervised learning are integrated with GA for indoor localization and network topology control [19], [74].

B. SUPERVISED LEARNING
Supervised learning determines the mapping between input and output variables [75]. This mapping is done with the help of training datasets. These types of learning algorithms have two stages, namely the training and testing stages. In the training stage, the training datasets are used to determine the mapping, which includes learning of ML parameters such as weights. Once these parameters are learned, in the testing stage new input data are provided to these ML models to predict the output data. Three types of supervised learning methods are discussed in the following, namely regression analysis, support vector machine (SVM) and K-nearest neighbor (KNN). Table 3 summarizes the supervised learning-based researches for B5G/6G communication systems for future IoT.

1) REGRESSION ANALYSIS
Regression analysis analyzes the relationship between two or more variables [54], [75]. The accuracy of the regression approach is evaluated by means of a defined cost function. Linear and logistic regression are the most common regression algorithms. Nonlinear regression algorithms are also used in wireless network analysis. These types of algorithms are utilized in channel information learning, outage analysis, device localization, target detection and energy efficient system design.

a: APPLICATIONS
The performance of an orthogonal matching pursuit (OMP) incorporated channel estimation technique is evaluated in [76]. The mathematical framework presented in the study is based on several defined normalized mean square errors. Linear regression algorithms are used to predict bit error rate performance. An adaptive linear regression model is studied in [100] which predicts outage in very high throughput satellite (VHTS) networks. This study aims at accurate prediction of outage in harsh environmental conditions, which can largely affect VHTS networks operating in the Q/V frequency bands. Fingerprint based localization systems use a crowdsourcing mechanism for an efficient site survey process. However, such a localization system may perform poorly in a network with heterogeneous devices if the difference between the survey phase and the client phase is large. Therefore, to address this issue, a localization approach with a crowdsourcing mechanism is proposed in [10] which uses linear regression for calibration across the training network devices. In target tracking applications with ultra-wideband (UWB) radar, a linear regression-based parameter estimation for a classification algorithm is proposed in [77]. The proposed classification algorithm addresses the problems of target tracking in real-time data streams with a small number of samples.
A reduction of delay in the data collection process is pursued in [78] with the help of Kernel Ridge Regression (KRR). Ridge regression is formed by combining the linear least squares method with L2-norm regularization. This regression is then combined with the kernel trick to form KRR. In medical video communication systems, the trade-off considerations among video quality, encoding rate and bitrate demands are crucial. With this objective, a video encoding framework is proposed in [79] where an encoding space is constructed and linear regression is used to predict the objective models for quality, bitrate and computational complexity. An energy harvesting scheduling approach is designed in [80] for the wireless power transfer use case, which utilizes linear regression and an artificial neural network (ANN). Other than linear regression, logistic regression is implemented for indoor device detection [81] and data corruption estimation [82]. The identification of cellular connection origin in the indoor environment is studied in [81] with the help of a proposed data-driven model. Logistic regression is used on radio connection data accessed from a network management system. Cross-technology interference can affect the performance of a low-powered network technology operating under a high-powered network system. A case study is conducted in [82] where ZigBee transmission is affected by a WiFi system. A logistic regression-based model is formed by training with known reference/pilot symbols. Low powered clock synchronization is essential in wireless network enabled agricultural applications. Thus, a clock offset prediction method is proposed in [83] that uses a non-linear regression algorithm to learn the sensor clock characteristics. The parameters of the algorithm are tuned with the help of a Gaussian function. This approach is shown to have low energy consumption during the time synchronization operation.
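
The construction of KRR mentioned at the start of this paragraph (least squares, plus L2 regularization, plus a kernel) can be sketched in a few lines. This is a generic textbook formulation, not the method of [78]: the Gaussian (RBF) kernel, the values of `lam` and `gamma`, the helper names and the sine-shaped toy data are all assumptions. In the dual form, the coefficients solve $(K + \lambda I)\,\alpha = y$ and a prediction is $\hat{y}(x) = \sum_i \alpha_i\, k(x, x_i)$.

```python
import math


def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-gamma * (x - z) ** 2)


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def krr_fit(xs, ys, lam=1e-3, gamma=1.0):
    """Dual ridge solution: alpha = (K + lam * I)^-1 y."""
    n = len(xs)
    K = [[rbf_kernel(xs[i], xs[j], gamma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve_linear_system(K, ys)


def krr_predict(x_new, xs, alpha, gamma=1.0):
    """Prediction is a kernel-weighted sum over the training points."""
    return sum(a * rbf_kernel(x_new, x, gamma) for a, x in zip(alpha, xs))


# Toy data (an assumption): noiseless samples of sin(x) on [0, 6].
xs = [i * 0.5 for i in range(13)]
ys = [math.sin(x) for x in xs]
alpha = krr_fit(xs, ys)
pred = krr_predict(1.25, xs, alpha)
print(pred, math.sin(1.25))
```

Setting the kernel to a plain product `x * z` would reduce this to ordinary ridge regression, which is the sense in which the kernel trick extends the linear model to nonlinear relationships.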

2) SUPPORT VECTOR MACHINE (SVM)
The SVM algorithm is trained on sample data points and categorizes them by defining hyperplane(s). The number and position of these hyperplanes depend on the data points lying closest to them, known as support vectors. The best hyperplanes are those for which the margin, i.e., the distance between the hyperplane and the nearest data points of each category, is maximum. SVM is mostly used for classification related problems. Spectrum sensing, resource management, pattern recognition, fault detection in sensor networks, target tracking and secured communication network designs are some examples of the implementation of SVM.
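
A minimal sketch of a linear SVM may help to make the role of the hyperplane concrete. It is an illustration only: training by sub-gradient descent on the regularized hinge loss is one of several possible solvers, and the toy points, learning rate `lr` and regularization weight `reg` are assumptions. Only points that fall on or inside the margin (`margin < 1`) move the hyperplane, while all other points merely let the weights shrink.

```python
def train_linear_svm(points, labels, lr=0.1, reg=0.01, epochs=100):
    """Linear SVM trained by sub-gradient descent on the hinge loss.

    labels must be +1 or -1; returns the weight vector w and the bias b.
    """
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The L2 penalty shrinks w on every step (favors a wide margin).
            w = [wi - lr * reg * wi for wi in w]
            if margin < 1:
                # Point is on the wrong side of, or inside, the margin.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Side of the hyperplane w.x + b = 0 on which x lies."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical, linearly separable two-feature samples.
POINTS = [(2, 2), (3, 3), (2, 3), (3, 2), (-2, -2), (-3, -3), (-2, -3), (-3, -2)]
LABELS = [1, 1, 1, 1, -1, -1, -1, -1]
w, b = train_linear_svm(POINTS, LABELS)
print(w, b, predict(w, b, (4, 4)), predict(w, b, (-4, -1)))
```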

a: APPLICATIONS
A blind spectrum sensing approach is proposed in [84] that uses Cholesky decomposition and SVM classification theories. The studied approach aims at recognizing primary users at low signal-to-noise ratio (SNR) values. In order to obtain received signal strength (RSS) values by estimating the direction-of-arrival (DoA) for electronically steerable parasitic array radiator (ESPAR) antennas, used in wireless sensors, an SVM based estimation scheme is presented in [85]. The SVM algorithm is trained by means of measured radiation pattern related data of the ESPAR antennas. The efficacy of an intelligent resource management strategy in video data supported networks depends on the precision of video traffic estimation. Therefore, an SVM based video traffic estimation method is proposed in [86]. The estimation method uses smoothing mechanisms for preprocessing of the video streams, which helps in predicting video traffic by SVM. Operating mass transit hubs requires data on passengers' trajectories (collected with the help of cameras and the WiFi protocol) and their modelling in order to predict and optimally control the traffic of the transit hubs. This traffic modelling is possible by using machine learning methods, which is performed in [87]. The characteristics of the pedestrians are categorized into chaotic and non-chaotic situations. The trajectory in the former situation is predicted by means of a regression method whereas the trajectory in the latter situation is predicted by using an SVM regression method. As mentioned in III-A1, faults in sensor network devices affect the system reliability. This implies the necessity of designing fault detectors, which is convenient to do with the help of machine learning approaches. Thus, an SVM classifier-based fault estimation methodology is proposed in [88] which is used to define a decision function. The decision function is then used by cluster heads to detect faults.
Another study is conducted in [89] which concerns precise data transmission among the cluster heads and associated cluster sensor nodes. The proposed strategy for identification of any data transmission failure is based on the SVM classification approach, whose parameters are optimized by an improved flower pollination algorithm (IFPA). Target tracking applications may suffer if the sensing accuracy is compromised by closely spaced targets. This problem is addressed in [90] by combining SVM and a Kalman filter. The proposed strategy obtains updated hyperplanes that help the classification operation to improve the target tracking mechanism. Security concerned network system design also uses SVM based strategies. In [91], digital ID spoofing attacks are shown to be handled by means of feature-reduced RF-distinct native attributes fingerprints and SVM. An SVM-based trust management scheme for underwater acoustic networks is proposed in [92] to minimize the probability of misjudging genuine sensors as malicious sensors. For protection from the selective forwarding attack, one type of denial of service (DoS) attack, an SVM enabled screen-confirm method is studied in [93]. The proposed method requires information such as transmission frequency and time, connection duration and energy status to filter out DoS attacks from the wireless networks.

3) K-NEAREST NEIGHBOR (KNN)
KNN is a non-parametric algorithm which is used generally for classification problems [75]. In this algorithm, the selected sample points are grouped into different categories. The Euclidean distances among chosen training data points and new data points are measured to determine under which classification the new data points would fall. This categorization process can also be seen as the voting process where the nearest neighbor data points participate. The majority votes determine the category of the new data points. The implementation of KNN is observed in resource allocation, indoor localization and clusterization of devices.
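
The voting process can be sketched as follows. The example is hypothetical: the two-element tuples stand for received signal strength readings (in dBm) from two access points, the room labels and the function name `knn_classify` are invented for illustration, and `k = 3` is an arbitrary choice.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    distances = sorted(
        (math.dist(p, query), label)
        for p, label in zip(train_points, train_labels)
    )
    # The k closest points vote with their labels.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical fingerprints: (RSS from AP1, RSS from AP2) -> room label.
FINGERPRINTS = [(-40, -70), (-42, -68), (-45, -72),
                (-70, -40), (-68, -45), (-72, -42)]
ROOMS = ["room_A", "room_A", "room_A", "room_B", "room_B", "room_B"]

print(knn_classify(FINGERPRINTS, ROOMS, (-44, -69)))
print(knn_classify(FINGERPRINTS, ROOMS, (-69, -43)))
```

Because no parameters are fitted, the whole training set has to be kept and scanned for every query, which is the practical cost of the non-parametric approach.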

a: APPLICATIONS
Fair spectrum operation may be affected by collusion among the user devices during the auction of spectrum. This requires the construction of spectrum auction approaches which prevent inter-user collusion. A spectrum auction strategy, immune to collusion, is proposed in [94] where KNN classifies small cells by considering their location and radius of interference. Cell selection is an open issue in heterogeneous networks due to the large number of base stations and the differences in their behavior. A KNN based cell selection algorithm is presented in [9] for mmWave communication to choose optimum base stations from two-tier base stations. Indoor localization efforts based on received signal strength fingerprint techniques are affected by instability of the signal strength indicator and by neglect of Wi-Fi signal distribution diversity and network traffic. Besides, achieving precise positioning accuracy creates an enormous workload in fingerprint collection. Thus, in [95], a convolutional neural network-based localization method is proposed which predicts the nearest predefined reference points/positions by using data related to received signal strength indicator instability. The KNN approach is then utilized to determine the target location. The diversity of Wi-Fi signal distribution is considered in [96]. KNN is used to estimate the target location by assigning each access point's degree of contribution as a weight during the computation stage in order to search for matching reference positions. Another KNN based fingerprinting system is studied in [97] which uses an online multitask metric learning technique to improve the distance estimation executed by the proposed system. The high spatial correlation of received signal strength is utilized in [98] for the reduction of fingerprint collection workload. KNN is used in the study to assist the matrix completion model.
The performance of KNN based localization is also observed in [99] for ZigBee and Bluetooth Low Energy along with Wi-Fi operating in the 2.4 GHz band. Superior performance is achieved with KNN compared to the Naive Bayes approach. For the device clustering optimization problem, a cooperative clustering approach is proposed in [18]. The approach uses KNN for the clustering operation in order to achieve maximum system throughput.

C. UNSUPERVISED LEARNING
The unsupervised learning method does not use training data sets to determine the input-output mapping [101]. This method rather learns the pattern from given input data. The input data is fed into the model and the output is computed. A cost function is used for the evaluation of the learning methodology. The labeling and analysis of the input data are performed by the users, which enables the method to execute its functionalities in real-time applications. Examples of unsupervised learning algorithms are K-means clustering, expectation maximization and Principal Component Analysis (PCA). The following discusses ongoing researches on wireless networks with K-means clustering and expectation maximization (EM) algorithms. Table 4 summarizes the unsupervised learning-based researches for B5G/6G communication systems for future IoT.

1) K-MEANS CLUSTERING
As the name suggests, this approach divides the data points into different clusters. Besides clustering the data points, this clustering method updates the cluster center in each iteration. A cost function evaluates the clustering function by measuring Euclidean distances between the center of the cluster and data points located in that cluster. Along with energy efficient network design, k-means clustering is used for antenna arrays, resource allocation strategies, routing protocol design, network node deployment, network system stability and security concerned network system design.
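
The alternation between assigning points and updating centers can be sketched as below. This is the standard Lloyd iteration under illustrative assumptions: the initial centers are simply the first `k` points, the two-dimensional toy points stand for node coordinates, and the returned `cost` is the sum of squared Euclidean distances between each point and its cluster center.

```python
import math


def k_means(points, k, max_iters=100):
    """Plain k-means; returns (centers, labels, cost)."""
    # Deterministic initialization from the first k points (an assumption).
    centers = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest center.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(coord) / len(members) for coord in zip(*members)]
    cost = sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
    return centers, labels, cost


# Hypothetical node coordinates forming two well-separated groups.
NODES = [(1, 1), (1.5, 2), (1, 0.5), (2, 1), (8, 8), (9, 9.5), (8.5, 9), (9, 8)]
centers, labels, cost = k_means(NODES, k=2)
print(centers, labels, cost)
```

With this data the centers settle at (1.375, 1.125) and (8.625, 8.625) after two updates, even though both initial centers start inside the first group.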

a: APPLICATIONS
Antenna array design requires the adoption of a subarray partition technique to reduce the complexity of the feeding network and the system cost. In [102], k-means clustering is utilized to obtain an optimal configuration of the antenna subarray. A joint resource allocation and clustering method for machine-to-machine networks is proposed in [103] where the problem is formulated as an energy efficiency maximization problem. The K-means clustering algorithm is applied to determine the clustering mechanism. A reliable K-means approach-based routing protocol is proposed in [104] where the cluster quantity and cluster heads are selected with the help of a continuous Hopfield network. Another routing protocol based on the K-means algorithm is proposed in [105] which analyzes the data transmission rate and energy consumption of the sensor network. Smart manufacturing systems utilize the k-means clustering algorithm for network node deployment, as shown in [106]. The edge computing based optimal network node deployment aims at finding the balance between the deployment cost of computing resources and network delay. Non-linear equalization in wireless networks has received massive interest for signal detection. The unsupervised learning based non-linear equalization studied in [107] uses K-means clustering, which does not require CSI or power amplifier related information. This helps in meeting hardware constraints and reducing system complexity and cost. Maintaining the stability of the UAV network and reducing packet loss rate and latency is a challenging task. The K-means clustering algorithm, utilized in [108], incorporates the mobility and location of the UAV devices to execute the above-mentioned task. The optimal cluster size is determined from the relationship between the maximum coverage probability of the cluster head and the cluster size.
Several energy consumption concerned network system designs take K-means clustering into account, as observed in [109], [110], [111], and [112]. Energy aware secured data transmission scheme is presented in [113] that uses fuzzy trust evaluation and K-means algorithm-based outlier detection. DoS immune IoT network system configuration is studied in [114] where the clustering of data related to immune network traffic and attacked network traffic is performed by means of K-means algorithm.

2) EXPECTATION MAXIMIZATION (EM)
This iterative algorithm determines the missing data in a given dataset by performing maximum likelihood estimation [127]. It comprises two steps, namely, the expectation step (E-step) and the maximization step (M-step). The E-step estimates the missing data using the current values of the algorithmic parameters. The M-step then uses both the given and the estimated data to update the parameters by maximizing the likelihood. This unsupervised learning method is used for network equipment deployment, user device localization, channel modeling, model parameter estimation, distributed estimation for ubiquitous sensing, signal detection and energy efficient network construction.
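
As an illustration of the two steps, the sketch below fits a two-component, one-dimensional Gaussian mixture, where the missing data are the unknown component memberships of the samples. It is a generic textbook example rather than any of the cited schemes: the data values, the initialization at the minimum and maximum sample, the fixed iteration count and the variance floor `min_var` are all assumptions.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_gmm_1d(data, n_iters=50, min_var=1e-6):
    """EM for a two-component 1-D Gaussian mixture."""
    means = [min(data), max(data)]
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    for _ in range(n_iters):
        # E-step: soft membership (responsibility) of each sample
        # under the current parameters.
        resp = []
        for x in data:
            joint = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: re-estimate the parameters from the weighted samples.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(min_var, var)  # floor avoids a collapsed component
    return weights, means, variances


# Hypothetical measurements drawn around two levels (about 1 and about 5).
SAMPLES = [0.8, 1.0, 1.2, 0.9, 1.1, 4.8, 5.0, 5.2, 4.9, 5.1]
weights, means, variances = em_gmm_1d(SAMPLES)
print(weights, means, variances)
```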

a: APPLICATIONS
An EM based user distribution and downlink traffic demand estimation approach is proposed in [115] in order to execute UAV deployment. Besides, the contract theory framework is used to ensure exchange of trustworthy information between the base station and the UAVs. In [116], EM based fingerprint localization considers both online and offline grid positioning data sample processing. This strategy is claimed to determine the true location of the user devices. A manifold learning-based target positioning algorithm is studied in [117] where EM is utilized to reconstruct the sparse representation of the manifold in a noisy environment. The localization scheme proposed in [118] considers time difference of arrival (TDOA). The joint synchronization and localization problem is solved with EM and Gauss-Newton algorithms. The study in [119] analyzes both the centralized and the distributed passive localization methods and implements the methods in an asynchronous network. A multitarget localization approach is proposed in [120] which is based on compressive sensing theory. In the study, the inaccurate location knowledge of the sensors is treated as tunable parameters, which enables the localization problem to be formulated as a joint sparse signal estimation and parameter optimization operation. EM also helps in learning wireless CSI. In [121], a channel modeling approach is proposed for RIS incorporated wireless networks. The outage probability of the wireless network under various RIS channel conditions is analyzed. A downlink channel characterization scheme for sub-6GHz wireless networks is studied in [122]. This measurement-based channel characterization utilizes the EM algorithm for multi-dimensional channel parameter estimation. As is known, CSI is used for signal detection. However, as stated in [123], in ambient backscatter communication (AmBC) networks, CSI acquisition is an arduous operation.
Therefore, two constellation learning-based signal detection methods are proposed which are derived from the EM algorithm. A distributed blind estimation scheme is presented in [124] where a random transmission approach converts the sensors' sensing values used to monitor their states, and a statistical inference methodology extracts the sensor values with the help of EM and a Gaussian mixture model. In [125], EM is used to estimate the parameters of a Gaussian mixture model in channel multipath clustering applications. The proposed approach investigates the channel multipath propagation characteristics in wireless networks. A power efficient Wi-Fi Direct data transmission mechanism is proposed in [126]. The EM based methods not only optimize the energy consumption but also minimize the transmission delay of the multimedia traffic.

D. REINFORCEMENT LEARNING (RL)
RL learns to execute operations through iterative trial-and-error computations [54]. This learning method consists of several elements such as environment, agent, state, action and reward spaces. An agent is assigned to an environment space and learns to perform actions on the basis of the state it experiences. A reward value is provided to the agent for its choice of action. The agent uses its own experience to train itself and choose the corresponding action/policy to maximize the reward. Table 5 summarizes the reinforcement learning-based researches for B5G/6G communication systems for future IoT.

a: APPLICATIONS
A Markov decision process (MDP) provides a mathematical framework for reinforcement learning that models decision making in situations where the outcomes are partly random and partly under the control of the policy [75]. The decision making agent chooses to perform an action which is available in a given space. The performed action moves the process to a new state and a corresponding reward is provided to the agent. The probability of the MDP moving into the new state depends on the chosen action and the state transition probabilities. MDP is mostly used for network design in dynamic conditions, resource allocation and scheduling, data transmission scheduling, delay optimization and energy aware network systems.
The resource scheduling problem in dense RAN in [128] is formulated as a stochastic game. The decision-making process of the wireless service providers incorporated in the network is based on MDP. A data transmission policy based on MDP not only helps in achieving low complexity in data transmission scheduling but also high energy efficiency and low packet loss [129], [130]. The radio head clustering and server matching problem is addressed with MDP, which in turn reduces the system delay. Computational offloading time minimization in MEC by selecting an optimal offloading node is studied in [131]. The MDP-based offloading scheme selects the offloading node on the basis of the bandwidth of heterogeneous servers and the device location, and the decision problem is solved with the help of the value iteration algorithm (VIA). In another study, the data offloading problem is formulated as an MDP which is solved by the policy iteration (PI) algorithm [132].
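
For readers unfamiliar with value iteration, the following sketch applies it to a very small MDP. The three states, the two actions, the transition probabilities, the rewards and the discount factor are all hypothetical and are not taken from [131]; the sketch only shows the repeated Bellman update $V(s) \leftarrow \max_a \sum_{s'} P(s' \mid s, a)\,[r + \gamma V(s')]$ and the extraction of a greedy policy once the values stop changing.

```python
def value_iteration(transitions, gamma=0.9, tol=1e-9):
    """Value iteration on a finite MDP.

    transitions[state][action] is a list of (probability, next_state, reward).
    Returns the state values and a greedy policy.
    """
    values = {s: 0.0 for s in transitions}

    def q_value(s, a):
        # Expected return of taking action a in state s, then acting optimally.
        return sum(p * (r + gamma * values[s2]) for p, s2, r in transitions[s][a])

    while True:
        delta = 0.0
        for s in transitions:
            best = max(q_value(s, a) for a in transitions[s])
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:  # values have stopped changing
            break

    policy = {s: max(transitions[s], key=lambda a: q_value(s, a))
              for s in transitions}
    return values, policy


# Hypothetical MDP: "move" succeeds with probability 0.8, otherwise the
# state is unchanged; reaching "done" from "near" pays a reward of 1.
MDP = {
    "far": {"wait": [(1.0, "far", 0.0)],
            "move": [(0.8, "near", 0.0), (0.2, "far", 0.0)]},
    "near": {"wait": [(1.0, "near", 0.0)],
             "move": [(0.8, "done", 1.0), (0.2, "near", 0.0)]},
    "done": {"wait": [(1.0, "done", 0.0)]},
}
values, policy = value_iteration(MDP)
print(values, policy)
```

Policy iteration reaches the same policy by alternating full policy evaluation with greedy improvement instead of updating the values directly.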

1) Q-LEARNING
Q-learning is a model-free learning method that determines the appropriate actions to be adopted on the basis of the agent's current state [75]. The expected reward of each state-action pair is estimated with a Q-function, which is initialized to a fixed value. This ''Q'' value is updated every time the agent executes an action and receives the corresponding reward. The application of Q-learning is observed in routing protocol policy making, resource allocation, security concerned network design, power allocation and deployment scheme development.
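
A minimal tabular sketch shows how the Q value is updated after every action. The environment is a hypothetical four-state corridor in which only reaching the last state is rewarded; the learning rate `alpha`, discount `gamma`, exploration rate `epsilon`, episode count and step limit are illustrative assumptions. The update used is $Q(s,a) \leftarrow Q(s,a) + \alpha\,[r + \gamma \max_{a'} Q(s',a') - Q(s,a)]$.

```python
import random

N_STATES, GOAL = 4, 3
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical corridor: reward 1 only when the goal state is reached."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL


def q_learning(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q-table starts at a fixed value
    for _ in range(episodes):
        state = 0
        for _ in range(50):  # step limit per episode
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                best = max(q[state])  # exploit, breaking ties at random
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            nxt, reward, done = step(state, action)
            # Target uses the best next action, whatever is actually taken next.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


q_table = q_learning()
for s in range(GOAL):
    print(s, [round(v, 3) for v in q_table[s]])
```

After training, the "right" entry is the larger one in every non-goal state, so the greedy policy read from the table walks straight to the goal.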

a: APPLICATIONS
A Q-learning-based routing protocol is presented in [94] for wireless sensor networks. The Q-function is defined in terms of data aggregation efficiency, energy consumption of the sensors and their residual energy. In [133], the Q-learning based routing protocol performs the data forwarding operation by considering the received rewards, which depend on the evaluation of possible forwarding operation effects as well as routing performances before choosing the receiving network nodes. A resource allocation methodology is proposed in [134], where Q-learning-based channel assignment supports the energy harvesting procedure and learns the random behavior of the primary user in the cognitive wireless network. The Q-learning method also helps the secondary user to choose good quality channels for data transmission. A spectrum access technique can be derived by Q-learning, as shown in [135]. The proposed method attempts to transmit multimedia data through unused spectrum cavities. The Q-function is defined by considering the delay and throughput requirements of the multimedia data transmission applications. The Q-learning assisted random access approach for clustering-based and non-orthogonal multiple access (NOMA)-based mMTC, proposed in [136], uses a pre-clustering mechanism to allow the network devices to operate with a small Q-table, which accelerates the convergence of the Q-learning algorithm. A packet transmission scheduling mechanism in wireless ad hoc networks with SIC is studied in [137] which uses the Q-learning approach. Cluster formation in CR based ad hoc networks, studied in [138], computes the Q-values by considering the channel quality, energy and condition of the network nodes. The application of the Q-learning algorithm is also observed in energy harvesting [14], data dissemination [139], power allocation [140] and security concerned network system designs [141], [142].

2) STATE-ACTION-REWARD-STATE-ACTION (SARSA)
SARSA is similar to the Q-learning technique [146]. However, it is an on-policy technique whereas Q-learning is an off-policy technique. To be precise, a SARSA learning agent learns the Q-value on the basis of the action actually selected by the present policy, whereas a Q-learning agent learns the Q-value on the basis of the maximum-reward (greedy) action, regardless of the action the present policy goes on to take. This makes Q-learning computationally heavier than the SARSA algorithm.

a: APPLICATIONS
SARSA-based power allocation strategies are studied in [143] for an energy harvesting incorporated MIMO system. The performance of the power allocation strategies is analyzed in terms of the obtained average throughput. A resource allocation policy with SARSA is developed in [144] for a NOMA-based IoT network with light network traffic to maximize the average sum rate performance. It is shown that SARSA offers a resource allocation strategy with low computational complexity. In [145], a SARSA-based malicious attack detection scheme is studied for fog computing enabled networks, where the performance is compared to traditional SARSA and Q-learning in terms of false alarm, miss detection and average error rates. In [146], RL-based intrusion detection mechanisms are analyzed for wireless sensor network systems, which have the potential to achieve a high attack detection rate with high accuracy.
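
The on-policy/off-policy distinction comes down to a single term in the update target, which the following sketch isolates. The Q-table entries, the reward, and the values of `alpha` and `gamma` are hypothetical numbers chosen so that the two targets differ; the function names are likewise only illustrative.

```python
def q_learning_target(q, reward, next_state, gamma):
    """Off-policy target: assumes the best action is taken in next_state."""
    return reward + gamma * max(q[next_state])


def sarsa_target(q, reward, next_state, next_action, gamma):
    """On-policy target: uses the action the policy actually selected."""
    return reward + gamma * q[next_state][next_action]


def td_update(q, state, action, target, alpha):
    """Shared temporal-difference step toward the given target."""
    q[state][action] += alpha * (target - q[state][action])


# Hypothetical situation: from s0 the agent moved to s1 with reward 0 and,
# because of exploration, picked the weaker action 0 in s1.
ALPHA, GAMMA = 0.5, 0.9
q_off = {"s0": [0.0, 0.0], "s1": [0.2, 1.0]}
q_on = {"s0": [0.0, 0.0], "s1": [0.2, 1.0]}

td_update(q_off, "s0", 1, q_learning_target(q_off, 0.0, "s1", GAMMA), ALPHA)
td_update(q_on, "s0", 1, sarsa_target(q_on, 0.0, "s1", 0, GAMMA), ALPHA)

print(q_off["s0"][1])  # about 0.45: bootstraps from the best next action
print(q_on["s0"][1])   # about 0.09: bootstraps from the action actually taken
```

SARSA therefore evaluates the exploring policy it is really following, while Q-learning evaluates the greedy policy even when the agent is not acting greedily.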

E. DEEP LEARNING (DL)
DL analyzes the given data sets to determine the characteristics and interpret the relation between these data [147]. It is similar to artificial neural networks; however, the difference lies in the number of hidden layers in the neural network. The implementation of DL is enabled by the breakthrough development of the learning algorithms and hardware capabilities [54]. The enhancement of learning algorithms such as supervised learning and reinforcement learning can be achieved by adopting DL mechanisms, as observed in [148]. This allows the use of DL in various applications such as network access, routing optimization, device localization, antenna orientation, channel estimation, resource allocation and security threat detection. Table 6 summarizes the deep learning-based researches for B5G/6G communication systems for future IoT.

a: APPLICATIONS
In [148], several ML based network access and routing algorithms are surveyed. DL and other algorithms, namely, supervised learning, reinforcement learning and imitation learning are jointly adopted to develop network access, routing and congestion control algorithms. An indoor localization technique is proposed in [149] where a logistic regression-based approach with a deep learning framework is adopted. In [150], indoor localization is executed by using the Wi-Fi CSI. A deep learning method is used to improve the regression performance of the localization system. A channel estimation technique is presented in [151] for a multi-user mMIMO system with angular-based hybrid precoding. The CSI of the user cluster is estimated by means of DNN and fuzzy c-means clustering approaches. DoA estimation with the help of DL is also studied in [152]. The Q-learning adopted in [153] for the resource allocation problem of an edge computing based IoT network uses a DL mechanism for allocation policy determination. DL is also used to solve the allocation problem before learning the policy. A power allocation strategy using DL is observed in [154], which aims at maximizing the secrecy rate of the secondary user in the CR network. The power allocation problem can be solved under perfect and imperfect channel estimation considerations with DL based methods, as shown in the study. In multimedia streaming applications, DL methods are used to determine the QoS not only from the service provider perspective but also from the user perspective [148], [155]. The launching of jamming attacks along with defense mechanisms against these attacks are analyzed in [156] using DL methods. The DL based jamming attack is shown to effectively degrade the performance of the transmitter.
On the other hand, the defense mechanism allows the transmitter to take a wrong effort in data transmission in order to mislead the jammer, and thus achieves satisfactory throughput performance.

1) TRANSFORMER ALGORITHM
The transformer algorithm is based on the attention mechanism [166] and uses an encoder-decoder architecture. The input data are converted into embeddings, positionally encoded, and then sent in parallel to the transformer network. The encoder block contains a multi-headed attention layer, where the attention vectors are computed from the fed vectors, and a feed forward layer, where the attention vectors are converted into vectors that can be processed by the next encoder/decoder blocks. The final processed vectors are then fed into the decoder block, which also receives the attention vectors generated from the targeted output data (these are produced in the same way as the attention vectors of the input data). Both sets of attention vectors (from the input and the output data) are forwarded to another attention layer, where the input-output data mapping is performed. The mapped attention vectors then go to a feed forward layer, which converts them into vectors to be processed by either the next attention layer or a linear layer. The linear layer (which is a feed forward layer) and the softmax layer operate sequentially to convert the attention vectors into predicted data.
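The attention computation at the core of this architecture can be illustrated with a minimal sketch. It assumes the widely used scaled dot-product form, softmax(QK^T / sqrt(d_k))V, with a single head and no learned projection matrices; the function names and the toy embeddings are hypothetical and are not taken from [166].

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        row = softmax([dot(q, k) / math.sqrt(d_k) for k in keys])
        weights.append(row)
        # Attention vector: weighted sum of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(row, values))
                        for j in range(len(values[0]))])
    return outputs, weights

# Toy self-attention over three 2-dimensional embeddings (illustrative values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attn_weights = attention(tokens, tokens, tokens)
```

A full transformer would apply several such heads in parallel on learned projections of the embeddings and follow them with the feed forward layer described above.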

a: APPLICATIONS
This algorithm can be useful in automating IoT facilities such as smart healthcare and smart transport, and in mobile network operations such as modulation recognition and signal and network threat detection. For instance, in [157], the performance of a transformer-based algorithm is compared to DL models such as VGG11, VGG13, ResNet18 and DenseNet121 in terms of medical image classification accuracy. Although the study reports inferior performance of the transformer-based algorithm with respect to the mentioned DL-based algorithms, it suggests some transformer-based approaches which could improve the classification accuracy for medical images. A traffic sign recognition mechanism is designed in [158] with the help of a DNN consisting of a CNN and a transformer-based algorithm. The designed system is claimed to have high recognition accuracy. In [159], a novel deep ensemble learning-based methodology combines two transformer-based algorithms and a DNN model for classification of wildfire regions and precise region detection. In cellular network systems, modulation recognition and network intrusion detection are also being conducted with transformer-based algorithms to analyze their classification and detection accuracy [160], [161].

2) GRAPH NEURAL NETWORK (GNN)
As the name suggests, GNN works with data having graph-like structures [167]. An input graph, with information such as the features of nodes and edges, is fed into a neural network layer architecture, where the input data are iteratively processed to transform them into useful and operable information. During this iterative operation, the features of each node in the input data are updated by extracting the features of the neighbor nodes and using the extracted features to update the feature of the given node. After the final iteration, the prediction on the provided data is obtained.
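The iterative neighbor-based update can be sketched as follows. This is a minimal illustration that assumes plain mean aggregation with no learned weights or nonlinearity, which practical GNN layers would add; the graph, the feature values and the function name are hypothetical.

```python
def message_passing_round(features, neighbors):
    """One round: each node averages its own feature vector with its neighbors'."""
    updated = {}
    for node, own in features.items():
        group = [own] + [features[n] for n in neighbors[node]]
        # Element-wise mean over the node and its neighbors.
        updated[node] = [sum(vals) / len(group) for vals in zip(*group)]
    return updated

# Hypothetical 3-node path graph a - b - c with 1-dimensional node features.
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
features = {"a": [1.0], "b": [0.0], "c": [-1.0]}

# Two iterations let information travel two hops along the graph.
for _ in range(2):
    features = message_passing_round(features, neighbors)
```

After the final round, the per-node features would typically be passed to a readout or classification layer to obtain the prediction.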

a: APPLICATIONS
The GNN-based approaches can be useful for device tracking and energy efficient topology control in IoT-oriented communication systems as well as data classification and prediction in smart IoT application systems. For instance, in [162], a GNN-based energy-efficient topology control algorithm is constructed. The study focuses on prolonging the network lifetime in a wireless ad-hoc IoT (WAIoT) environment. In [163], a GNN-based DL model is used to track an unknown IoT device with the help of UAVs equipped with RSS indicator sensors. This model, which could potentially be implemented in 6G communication systems, is analyzed in terms of the time taken and distance travelled by the UAVs to reach the target. In smart application sectors, a GNN-based sentiment classification algorithm is designed in [164] to interpret the aspect of a text. In [165], GNN is used to develop a prediction mechanism for smart transportation systems which can predict missing traffic information from a given dataset.

F. DEEP REINFORCEMENT LEARNING (DRL)
The DRL mechanism is somewhat similar to the RL mechanism, which is mentioned in section III-D. However, as the complexity of the studied problem increases, the number of states and probable actions grows, or the states and actions may not be available to extract [54]. The Q-value table, which records the reward value for actions executed at a particular state, thus becomes difficult to construct. The DL mechanism is then implemented to map each state to the corresponding action, which removes the need to store the reward value for every state-action pair in highly complex problems. Even when the reward value is partially available or not available, DRL helps to generalize this reward value. Table 7 summarizes the deep learning-based research on communication systems for future IoT.
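The idea of replacing the Q-value table with a learned mapping can be sketched as below. For brevity, the sketch swaps the deep network for a linear function approximator trained with a semi-gradient Q-learning update; the state features, learning rate `alpha`, discount factor `gamma` and function names are all illustrative assumptions rather than details from [54].

```python
def q_values(weights, state):
    """Approximate Q(state, a) for every action a as a linear function of the state features."""
    return [sum(w * x for w, x in zip(row, state)) for row in weights]

def td_update(weights, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One semi-gradient Q-learning step on the weights of the chosen action."""
    # Bootstrapped target: immediate reward plus discounted best next value.
    target = reward + gamma * max(q_values(weights, next_state))
    error = target - q_values(weights, state)[action]
    # Move the chosen action's weights along the feature vector.
    weights[action] = [w + alpha * error * x for w, x in zip(weights[action], state)]
    return error

# Hypothetical problem: 2 state features, 2 actions, weights start at zero.
weights = [[0.0, 0.0], [0.0, 0.0]]
first_error = td_update(weights, state=[1.0, 0.0], action=1,
                        reward=1.0, next_state=[0.0, 1.0])
```

Because the weights are shared across states, an update for one state also changes the estimates for similar states, which is the generalization property that a table cannot provide. A DRL method such as DQN would use a multi-layer network in place of the linear map.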

a: APPLICATIONS
A resource scheduling algorithm for mobile edge computing (MEC) based IoT networks is proposed in [153]. An MDP is used to formulate the resource allocation problem and a deep Q-learning based algorithm is utilized for learning the resource allocation policy. Network slicing aims at assigning physical resource blocks in such a manner that the QoS requirements of advanced network services such as eMBB, URLLC and mMTC are fulfilled. However, it is an arduous operation since a larger number of time slots places a computational burden on the resource block assignment mechanism. This problem is addressed in [168] with the help of a DRL-based network slicing method in B5G communication systems to maximize the long-term throughput. According to the authors, this mechanism can also be useful for 6G communication systems. In [169], a DRL framework is presented, where a novel algorithm is proposed for optimizing resource block allocation and minimizing power allocation for URLLC service. In addition, a refiner Generative Adversarial Network (GAN) method is proposed in the paper for highly reliable data generation to support the resource and power allocation. The proposed framework is demonstrated to guarantee high reliability and low latency. In [170], the DRL-based algorithm attempts to jointly optimize the computation offloading and resource allocation policies with a view to minimizing the computation overhead of the NOMA-assisted MEC. In [171], a DRL-based approach is used to develop a joint optimization scheme for the beamforming matrices of the base station and the RIS in a B5G vehicle-to-infrastructure (V2I) communication system, addressing non-convex and time-varying issues. The proposed scheme is demonstrated to achieve high network throughput for the RIS-assisted V2I communication system. An adaptive handover mechanism for a VLC-assisted hybrid 6G network system is designed in [30] by means of a DRL-based algorithm. The proposed algorithm is demonstrated to provide a better average downlink data rate than deep Q-network (DQN), Sarsa and Q-learning-based algorithms.

G. FEDERATED LEARNING (FL)
The FL algorithm allows distributed network devices to share the outcomes learned through ML training. In other words, several devices collaboratively undergo ML training to update and share the ML parameters amongst themselves [101]. This allows the network devices to maintain their privacy since they only share the ML training results without revealing the actual data. Besides, distributing the learning mechanism minimizes the hardware capability utilized. FL is mostly applied in security-related applications along with resource allocation strategies and energy efficient network system designs. FL based resource allocation and management schemes are analyzed in [172], [173], [174], and [175]. In [173], the mobile user devices transmit the trained ML models to the base station, which then generates a global ML model from the received local ML models. The spectrum assignment to the user devices is determined with the help of a greedy algorithm. In [172], the bandwidth allocation is based on the energy status of the wireless devices and the present condition of the wireless channels. A stochastic optimization method is considered for both the user device selection and resource allocation. FL based resource management also considers packet errors and energy constraints of the devices along with their privacy, as shown in [174], [175], and [176]. The study in [177] concerns asynchronous FL in wireless networks. A transmission scheduling scheme is proposed to help the edge devices process their own data for the FL training. In [178], both FL and blockchain are implemented in a digital twin incorporated wireless network. This implementation ensures not only low latency but also network privacy. The ML training at the user end and the user data transmission are not controlled by the FL mechanism, which can degrade the global ML model. This issue is addressed in [179] by developing a reputation-aware transmission scheduling policy.
An FL-based threat detection algorithm is presented in [180] for edge computing enabled wireless networks. The proposed approach is claimed to minimize communication overhead and ensure learning convergence.
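The generation of a global model from local models can be illustrated with a short sketch. It assumes sample-count-weighted parameter averaging in the style of federated averaging, which is a common choice but is not stated to be the aggregation rule of the cited studies; the parameter vectors and sample counts are hypothetical.

```python
def federated_average(local_models, sample_counts):
    """Weighted average of local parameter vectors, weighted by local sample count."""
    total = sum(sample_counts)
    size = len(local_models[0])
    return [sum(model[i] * count for model, count in zip(local_models, sample_counts)) / total
            for i in range(size)]

# Two hypothetical devices share only their trained parameters, never raw data.
local_models = [[1.0, 2.0], [3.0, 4.0]]
sample_counts = [1, 3]  # the second device trained on three times as much data
global_model = federated_average(local_models, sample_counts)
```

In a full FL round, the base station would send `global_model` back to the devices, which would continue training locally before the next aggregation.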

IV. ENHANCEMENT OF SMART FACILITIES THROUGH AI/ML IMPLEMENTATION
This section discusses the use of AI/ML technologies in smart facilities such as smart healthcare, smart agriculture, smart transportation, smart grid and smart industry. It focuses on the effective learning strategy implementation for the provision of effective services to the users.

A. SMART HEALTHCARE
IoT has improved the healthcare sector by introducing automation technologies that ensure cost-effective provision of medical services [181]. Technologically advanced devices such as wearable devices, smartphones, and pervasive and ambient sensors can help in monitoring patients' health and providing proper medication from distant places. Besides, emergency medical services can be accessed by using these devices. Since these devices generate large amounts of data, ML algorithms are required for data processing [182].
With the help of cameras, information about calorie intake, human blood pressure, accidents arising from human activity and sign languages can be recorded and analyzed using ML approaches for taking appropriate actions. Regression analysis, Naive Bayes (NB) and CNN based algorithms are used for these purposes [182]. ANN, random forest and gradient boosting algorithms help in predicting glucose level and stroke occurrence, and in monitoring infant health. NB can also be used to predict heart disease at an early stage [183]. Epilepsy risk estimation, COVID-19 identification and Ebola spread control mechanisms can be developed with the help of KNN, CNN and decision tree algorithms. Along with these algorithms, SVM, random forest and ANN are also used in the diagnosis of heart-related diseases, lung cancer and chronic kidney disease (CKD).

B. SMART AGRICULTURE
According to [184], the crop production rate is required to increase by almost 2-3 times with respect to the present production rate. Thus, it is vital to implement advanced technologies and data processing algorithms to enhance crop productivity. Wireless sensors are useful in monitoring crop production processes and detecting diseases, weeds, and anomalies in the irrigation procedures. In [83], it is mentioned that monitoring of the irrigation is influenced by the clock synchronization of the sensors since timely monitoring can enhance the efficacy of the crop production systems. This raises interest in designing an ML based clock synchronization mechanism that takes into account the energy consumed by the sensors in synchronizing time. Since a large volume of agricultural data is generated by the wireless sensors, it needs precise processing for the extraction of valuable information. Big data processing mechanisms and ML methodologies process these data so that farmers and sellers can make decisions related to the crop supply.
In [185], several studies on the application of big data and ML approaches in agricultural production are reported. ML algorithms are able to perform hyperspectral data processing for analyzing irrigation related information. Data related to crop and seed classification and identification, chlorophyll measurements in crop leaves, crop hydration status, and crop field preparation and classification can be processed with the help of NNs, random forests, SVM, ELM, MLP and DL-based algorithms. However, a shortage of data and of high-quality images can degrade the performance of ML algorithms. This calls for efficient algorithms which can provide precise predictions with limited data. Besides, effective and scalable computational architectures are needed for hyperspectral information processing. Ensemble machine learning needs to be explored in agricultural applications to address the above-mentioned challenges.

C. SMART TRANSPORTATION
This smart city facility requires intelligent systems in order to ensure smart and safe utilization of user mobility facilities [182]. Data related to road traffic conditions, parking facilities and available transport can be analyzed with ML based approaches, which allows smart control over the transportation systems. Data collected by wireless sensors are used by ML algorithms such as KNN, SVM, and ANN. The KNN and SVM based algorithms require accurate and pre-processed data to function properly. DL based algorithms such as deep belief network (DBN), stacked autoencoder (SAE) and RNN have shown their efficacy in predicting traffic conditions. LSTM can also be integrated with RNN to predict traffic flow on roads and highways. RL with deep learning mechanisms can be used to adaptively control the traffic signal lights and ensure equal priority for all vehicles [186].
ML can use human trajectory data obtained through GPS, Wi-Fi and cellular connections for human movement estimation and transportation system modeling. A case study is considered in [87], where short-term traffic flow is predicted with the help of a regression model and an SVM model in non-crowded and crowded situations. DL models can be employed to learn about human movement, the available transportation modes and their joint prediction [182] as well as the estimated time of arrival at the destination. LSTM, RNN and deep residual network (ResNet) are commonly applied in such cases. CNN and RNN can be combined to compute the travel time required by vehicles for moving from one location to another. Apart from tracking human mobility, transport mode availability and traffic conditions, parking spot tracking in cities is another pressing issue due to the increase in the number of vehicles. Visual information-based parking space identification is a challenging task. Conventional strategies mostly fail to identify parking spaces precisely since they consider only a few specific situations. An ML model such as CNN can be deployed to find the vacant parking spaces by means of an image-based user interface.

D. SMART GRID
Smart grid enables intelligent monitoring of the overall electrical grid systems [181]. Advanced metering infrastructure (AMI) utilizes smart meters for generation of electrical energy distribution data. Big data analytical methods integrated with AI/ML can process the collected data for their proper understanding [187]. Short-term load forecasting mechanism aids in fulfilling the energy requirement by optimizing the resource and operational costs. ML algorithms such as RBF, random forest, linear regression, decision tree, SVM-based regressor, and ANN-based approaches are commonly used for load forecasting. The choice of the best ML model relies on the accuracy of the forecasting mechanism and convergence rate of the ML model. A study is conducted in [187] that evaluates various ML models in terms of prediction accuracy and time of execution.
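As a simple illustration of short-term load forecasting with linear regression, one of the models listed above, the following sketch fits a line that predicts the next load reading from the current one using ordinary least squares. The load values, the one-step-lag feature and the function name are hypothetical and are not drawn from [187].

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical smart-meter load readings (arbitrary units), one per hour.
hourly_load = [10.0, 12.0, 14.0, 16.0, 18.0]

# Train on (current load -> next load) pairs, then forecast one step ahead.
slope, intercept = fit_line(hourly_load[:-1], hourly_load[1:])
next_load = slope * hourly_load[-1] + intercept
```

A practical forecaster would add features such as time of day, weather and several lagged readings, and would be compared against the other listed models on both accuracy and execution time.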
Smart grids are not immune from mechanical and software faults due to harsh working environments [188]. This raises the necessity of developing systems for detection and diagnosis of faults. However, the diagnosis methodology should incorporate cognitive features to take the necessary steps for detection and elimination of any threats generated from the faults. Hidden Markov models (HMM), SVM and NN can take the fault detection and diagnosis mechanisms to the next level. In [189], matching pursuit decomposition (MPD) is used to learn the characteristics of the voltage signals. A clustering algorithm is used to arrange the collected data in groups. These data are used for training the HMM-based detection algorithms, which detect any faults in the grid system. The location of the fault is determined from a contour map generated from the characterization of the voltage signal clustering. In [190], the fault location is determined by means of Gaussian process regression and the fault is identified by SVM training.
Smart grids become vulnerable to intrusion if the grid systems lack defence mechanisms [191]. Thus, it is essential to implement low-complexity self-learning approaches to protect the smart grid system from security threats which can damage this internet-based infrastructure. Intrusion detection mechanisms based on machine learning approaches are able to protect the smart grid infrastructure from cyber-attacks. Some machine learning algorithm-based intrusion detection techniques are surveyed in [191]. To be specific, rule learning is implemented in such defence systems. GA, decision tree, PSO and Bayesian learning are some of the automated learning mechanisms used for protection against DoS attacks, attacks on over-current protection and privilege escalation attacks. It can therefore be said that the role of machine learning algorithms in ensuring the efficacy of IoT systems is undeniable.

E. SMART INDUSTRY
Industrial IoT (IIoT) has evolved the concept of manufacturing by including wireless sensors, information technology, machinery control mechanisms and computer software for production monitoring [192]. In order to develop even smarter manufacturing processes, big data analytics and automated learning techniques must be incorporated with the computing machineries. ML has the potential to enhance the data processing and analyzing methods for interpretation and thus has received huge interest from the researchers.
As surveyed in [192], ML is applied in maintenance and quality management, production planning and control, and supply chain management. Among these sectors, quality management and production planning and control have received much attention regarding ML implementation. Supervised learning models have been mostly explored in quality management. Defect detection, identification and classification as well as online quality control are the aspects of quality management explored mostly with supervised learning, followed by unsupervised learning. Production planning and control strategy is studied with reinforcement learning along with supervised and unsupervised learning. The research areas covered in this direction are performance prediction, process control and production scheduling. As in the research on quality management, supervised learning is mostly employed in maintenance management. Fewer studies in this area are conducted with unsupervised learning and reinforcement learning. In supply chain management research, the three mentioned learning mechanisms are employed to analyze modeling and coordination, demand forecasting and inventory management. The optimum performance of the manufacturing system can be achieved by applying the appropriate ML algorithm to the required manufacturing operations.

V. FUTURE RESEARCH OPPORTUNITIES AND OPEN ISSUES
This section discusses the scope of future research with AI/ML algorithms in developing the communication systems for future IoT. The discussion covers several areas such as resource allocation, channel modeling, data management, energy efficiency and security and privacy.

A. RESOURCE MANAGEMENT
ML algorithms have reduced the complexity of resource management in wireless networks, as discussed earlier. However, the increase in the number of devices in ultra-dense networks induces intra-cluster and inter-cluster interference. Besides, heterogeneity in wireless networks, such as dynamic traffic conditions and the QoS requirements of heterogeneous devices, adds more complexity in managing resources [101]. In [148], it is mentioned that video streaming applications demand high throughput with low latency and security level. On the other hand, service provision applications demand high security with satisfactory throughput. This raises the necessity of designing protocols that satisfy multi-dimensional QoS requirements and deploy optimal resources on the basis of those requirements. ML algorithms can be employed to design such protocols, which consider the demands of heterogeneous network nodes and manage network resources accordingly.

B. CHANNEL MODELING
One of the key analyses for wireless network system design is the modeling of the wireless channel. Future IoT network systems enable the integration of several devices. To ensure uninterrupted communication among these devices, channel modeling is another obligatory research direction [101]. The traditional methods face problems in estimating wireless channels for several reasons. For instance, mMIMO systems use nonlinear amplifiers in most cases. The traditional methods are not able to address the effect of such amplifiers. In RIS-based networks, the propagated signals become weak while reaching the destination due to the obstacles faced by the signals during transmission. This makes it difficult to acquire the CSI. In the case of THz communication, the short wavelength causes the CSI to be affected by even small variations in the communication channel, as highlighted in section II-B3. Using ML algorithms to model the wireless channel while accounting for these issues could be an interesting research direction.

C. DATA MANAGEMENT
Future network devices generate a huge volume of data which needs data processing techniques for the extraction of valuable information, which is possible through big data analytics. Its application in different organizations and industries has transformed the conventional data accumulation and interpretation processes. Even the communication devices produce heterogeneous data which require intelligent mechanisms for their classification. ML can classify the data into different categories and make interpretations from these data. However, the training load may increase if too much data is utilized for training ML algorithms. Conversely, scarce or inaccurate data can lead to poor outcomes from the ML algorithm. This calls for data management schemes that determine the accuracy and optimum amount of data so that ML algorithms can handle the big data processing and derive useful information for the organizations to utilize.

D. ENERGY EFFICIENCY
Energy efficiency is an essential factor to be considered in network system design. Wireless sensors always face energy depletion issues while collecting the data, transmitting them to the controlling and decision-making units and taking the necessary corresponding actions. This open issue needs to be addressed by designing effective energy saving and harvesting mechanisms. In [54], wireless network system designs with ML algorithms have been surveyed from the energy efficiency point of view. In future, the base stations will act as energy sources as well as provide computational and storage resources. Heterogeneous communication protocols may cause large energy depletion problems if resource and power allocation strategies do not account for power utilization. Incorporating ML algorithms can help in developing energy-efficient resource management and user association as well as routing paths for wireless rechargeable devices. Another challenge arises in designing energy utilization optimization algorithms for the sensors since the accuracy of the sensing mechanism depends on the energy usage. Thus, it is required to develop strategies that balance the tradeoff between accuracy and energy usage, which is another interesting research direction for ML-based implementations.

E. SECURITY AND PRIVACY
Security and privacy related problems in wireless networks have always been a hot topic. DoS attacks are commonly faced by communication networks and services. To protect the network systems and services, ML algorithms are applied to detect any form of anomaly in network behavior and service provision. Since ML algorithms are trained by means of training data, secure utilization of the accessed data must be ensured to strengthen privacy protection. Technologies such as blockchain and quantum computing have emerged to provide immunity from cyber-attacks. As discussed in section II-B4, blockchain uses consensus mechanisms to ensure secure transactions. However, fake identity generation and data leakage can affect the validation of the transactions and violate privacy, respectively. Thus, ML implementation in ensuring secure data sharing and enhancing the security of other technologies holds high research interest.

VI. CONCLUSION
Wireless network systems have been evolving over time to ensure better quality of life with innovative technologies. The 5G network standards have displayed huge potential in realizing smart IoT systems in order to provide good network coverage as well as new and improved services to the massive number of users. As the number of user devices is rising, bandwidth requirements are also rising to ensure uninterrupted network connectivity. Besides, new hardware and software technologies are creating opportunities to offer users new services which are not possible with the 5G standard. This motivates researchers to study beyond 5G and 6G communication standards.
The 5G network systems have already explored the importance of AI/ML for promoting automated network operations. Therefore, in deploying B5G/6G systems, the contribution of AI/ML cannot be overlooked. Realizing such significance, this paper provides a comprehensive survey on leveraging AI/ML in communication systems for empowering future IoT. The evolution of the communication systems is discussed along with the necessity of including AI/ML algorithms in the system. The most common AI/ML algorithms are discussed and recent research conducted using these algorithms is presented. It is also described how the AI/ML algorithms can enhance the IoT services provided by the smart facilities. The paper concludes by highlighting the open research issues and potential research opportunities in AI/ML-based communication systems for future IoT.