Performance Analysis of Deep Learning Based Routing Protocol for an Efficient Data Transmission in 5G WSN Communication

For the past few years, Internet of Things (IoT) based constrained wireless sensor networks (WSNs) have attracted huge interest and seen dramatic development, with the aim of achieving efficient resource utilization and better service delivery. IoT requires a reliable communication network for data transmission between heterogeneous devices and an optimally deployed, energy-efficient WSN. The clustering technique applied for WSN node deployment needs to be efficient so that the entire architecture can obtain a better network lifetime. The entire network is partitioned into various clusters. Moreover, the cluster head (CH) selection process also needs proper attention to achieve efficient data communication towards the sink node via the selected CH and to increase node reachability within the cluster. In the proposed framework, an energy-efficient deep belief network (DBN) based routing protocol is developed, which achieves better data transmission through the selected path and thereby improves the packet delivery ratio (PDR). The nodes in the whole network are initially grouped into clusters using a reinforcement learning (RL) algorithm, which assigns a reward to the nodes belonging to a particular cluster. Then, the CH required for efficient data communication is selected using a Mantaray Foraging Optimization (MRFO) algorithm. The data is transmitted to the sink node via the selected CH using an efficient deep learning approach. Finally, the performance of the proposed deep network-based routing protocol is evaluated using different metrics: network lifetime, energy consumption, number of alive nodes, and packet delivery rate. The evaluated results are compared with a few existing algorithms, and the proposed DBN routing protocol achieves a better network lifetime than all of these algorithms.


I. INTRODUCTION
With 2G (2nd generation) networks, wireless communication fulfils user needs like voice and data transmission. Smartphones utilize 3G technology for multimedia and video transmission with limited bandwidth, and the revolutionary change to 4G communication provided increased bandwidth. Moreover, billions of users worldwide adopt smartphones for their daily use [1]. It is said that the number of smartphones in use is higher than the world's population. Smartphones generate heavy traffic because they rely on 3G and 4G communication for data transmission. Other issues like congestion and quality of service (QoS) problems are also created by this excessive usage of smartphones [2,3]. At this juncture, a further advancement of technology (5G) is needed, and device-to-device (D2D) communications came into existence. 5G is the technology standard for cellular networks that cellular companies worldwide started to deploy in 2019 [4]. However, most current cell phones still utilize 4G networks for their data and video transmission. As in previous networks, the service area of a 5G network is separated into small geographical areas named cells [5]. In each cell, the 5G wireless devices are connected via a local antenna to the telephone network and the internet using radio waves. The benefit of the 5G network is that it provides greater bandwidth, with download speeds of up to 10 gigabits per second (Gbit/s) [6].
IoT is one of the fastest-growing technologies for 5G communication, and it may be used in many aspects of our lives. Large densities of sensing devices, such as WSNs, are used in IoT systems to monitor the surrounding environment. WSN has played a vital role in the communication domain. In wireless communication, data transfer is handled by the transport layer, which launches a particular protocol for data transmission. The transport layer protocol utilizes a congestion control mechanism to ensure good network resource allocation. In addition, congestion control in a wireless network is considered the most critical function of the transport layer. If the congestion control is applied in any mobile wireless communication (e.g., 5G communication), then the entire wireless network's performance will collapse. So, in order to avoid this problem, routing protocols are developed [10,11].
The routing protocol is considered a basic requirement for sensor networks to discover the route from the sender to the destination sink node. To alleviate delay, routing protocols are developed that identify exact routes for data forwarding and thereby accomplish energy efficiency in WSN [12]. Reliable routes are generated by inheriting the benefits of both energy conservation and route discovery methods. For data-intensive sensor networks, the routing protocol adapts to the traffic patterns and the network density [13]. To adjust the scalability and flexibility of the network, neighbour adoption, routing decision, and power conservation are considered the functions of routing protocols. Particular routing protocols concentrate on network lifetime maximization, delay, and hop count rather than energy efficiency. So, these metrics are selected to enhance the design features of the routing protocol [14].
To obtain a better network outcome and solve complex decision-making problems, machine learning algorithms are considered a prominent solution [15]. In routing and energy management, strong machine learning algorithms are combined with WSN assisted IoT to moderate and examine complex decision-making issues. Learning-based algorithms create correct solutions to the problem [16]. In the routing process, machine learning techniques analyse the constraints and automatically learn the dynamics of networks, specifically congestion points, quality of links, topology changes, and new flow arrivals, to enhance service quality. Each sensor node (SN) makes a decision on the basis of its observed state, which may result in intelligent behaviour. Further, the learning and decision-making iterations are repeated until the optimal solution is determined [17].
The proposed routing architecture mainly aims:
➢ To perform SN grouping based on a clustering algorithm.
➢ To present a recent optimization algorithm for CH selection.
➢ To propose a machine learning-based algorithm for efficient routing.
➢ To evaluate the performance of the proposed approach and compare it with recently developed methods.
The main contribution of the proposed architecture is as follows. Initially, cluster formation is achieved using the RL algorithm. The RL algorithm checks all the nodes in the network and assigns a reward to the nodes that belong to a particular cluster. Then, an optimization based CH selection is carried out, which performs a more effective selection than the existing optimization algorithms. The selection algorithm considered in this work is MRFO, which optimally selects the best CH that satisfies the multi-objective functions. The multi-objective functions considered for CH selection are energy, delay, traffic density, and distance. Although several protocols have recently been developed for efficient routing, none has attained a better and more efficient routing.
Artificial intelligence (AI) techniques have recently found their way into routing problems. We have also used a deep learning approach to achieve efficient routing in our proposed approach. The routing results of the proposed DBN architecture are found to be more effective than those of the other existing architectures.
The remaining sections are organized as follows: Section II discusses a few recent works related to clustering and routing. The problem definition and motivation are discussed in Section III. The detailed description of each technique in the proposed framework and the details of the multi-objective functions are discussed in Section IV. The results obtained using the proposed routing protocol are deliberated in Section V. Finally, the proposed architecture is summarized in Section VI.

II. LITERATURE REVIEW
A few recent works related to our proposal are given below. Many studies have been conducted to improve the performance of 5G wireless communication. Routing protocols proposed for wireless communication receive great attention because of major improvements in 5G technology. A brief overview of the existing research on routing protocols is given here.
K. Thangaramya et al. [18] developed a Neuro-Fuzzy based routing technique for WSNs in IoT. Data sensing, data collection, and data transfer from one node to another were done with the help of WSNs in IoT. The network's QoS was enhanced by using intelligent routing in IoT assisted WSN. Plenty of research was conducted on energy-efficient routing in previous years. To improve the existing techniques, this paper developed a Neuro-Fuzzy rule for cluster generation in IoT based WSNs. However, the technique must be improved to integrate WSNs within the IoT environment. Finally, it was found that this routing algorithm provides efficient performance in various parameters like delay, energy utilization, network lifetime, and PDR.
For dynamic cluster-based routing, S. Sujanthi and S. NithyaKalyani [19] proposed a QoS-aware secure deep learning method in WSN assisted IoT. Because of the open and resource-constrained nature of WSN-assisted IoT, security and energy efficiency were considered challenging issues. So, this paper developed a dynamic cluster-based hybrid WSN-IoT network using the Secure Deep Learning (SecDL) approach. Moreover, the network was designed with Bi-Centric Hexagons and mobile sink technology for energy efficiency enhancement. A Two-way Data Elimination and Reduction scheme was enabled to handle the data aggregation in each cluster. For aggregated data, high-level security was accomplished by a One Time-PRESENT (OT-PRESENT) cryptography algorithm. The ciphertext was transferred to the mobile sink through the selected route to ensure high QoS. A Crossover based Fitted Deep Neural Network (Co-FitDNN) was developed to perform optimal route identification. This work also concentrates on user security because IoT users access the sensory data.
Ru Huang et al. [20] established a deep learning based link reliability prediction for routing in WSN. A resilient routing algorithm was developed in this paper for a better routing process in WSN. A deep-learning model referred to as the Weisfeiler-Lehman kernel and Dual Convolutional Neural Network (WL-DCNN) was proposed for lightweight subgraph extraction and labelling. It was leveraged to improve self-learning capacity with strong generality. The WL-DCNN model was designed to perform resilient routing in WSN by estimating the reliability of target links: it captures topological features under routing table attacks that cause varying degrees of damage to the local link community.
Ibrahim A. Abd El-Moghith and Saad M. Darwish [21] proposed a trusted routing scheme for WSNs. The basic operations of WSN were easily disrupted by routing attacks, which significantly damaged the whole network. Guaranteeing the efficiency of WSN and routing protection requires a trustworthy routing scheme. The dependability of routing protocols was enhanced with trust protection, centralized decisions, or cryptographic schemes. So, in this work, trusted routing with Markov Decision Processes (MDPs) and deep-chain was proposed to improve the routing efficiency and security. To authenticate the process of delivering information, the suggested architecture uses a proof of authority mechanism within the blockchain network. A deep learning technique was developed to learn the properties of several nodes. MDPs were used to choose the best neighbouring hop as a forwarding node to transfer the messages quickly and safely.
A composite fuzzy method was proposed by Y. M. Raghavendra and U. B. Mahadevaswamy [22] for energy-efficient routing in WSN. In WSN, optimizing the energy consumption of SN batteries plays a major role. Due to the transmission and sampling rate, the energy in the WSN's batteries got depleted. The energy consumption technique was modelled for the important parameters which affect the longevity of the WSN. In this work, fuzzy membership functions were considered as the parameters to enhance the network lifetime. The parameters were optimized with the help of fuzzy logic at multiple levels.
In [32], a hybrid metaheuristic cluster-based routing (HMBCR) approach was introduced, which performs an efficient clustering and routing process. Levy distribution based brainstorm optimization (BSO-LD) was introduced for efficient clustering. Then, a hill-climbing based water wave optimization (WWO-HC) approach was introduced, which performs the optimal route selection process.
CH selection using an efficient algorithm that includes some critical parameters was performed in [33]. Routing via the selected CH is efficient for performance enhancement. By considering this process, a hybrid optimization algorithm named GA-PSO (genetic based particle swarm optimization algorithm) was introduced for CH selection and routing. An optimal route for sink mobility was identified using PSO. A Distributed Autonomous Fashion integrated with Fuzzy If-then Rules (IDAF-FIT) algorithm was introduced in [34] for efficient clustering. During clustering, the CH was also selected using the if-then rules. Then, an Adaptive Source Location Privacy Preservation Technique using Randomized Routes (ASLPP-RR) was used in this method for optimal route selection. Finally, a security analysis process was introduced to enhance data confidentiality. Along with cluster based routing, the rate control concept was also included in [35], which further enhances the lifetime for long simulation times. For lifetime enhancement, the nodes were initially clustered using a hybrid K-means and greedy best-first search algorithm. Then, firefly (FF) optimization was introduced for rate control. Finally, Ant Colony Optimization (ACO) was introduced, which selects the optimal path for data transmission. African buffalo optimization (ABO) based routing was introduced in [36]. The optimal route selection process was performed based on the behaviour of African buffalos. ABO acts as the main controller, and all the nodes are managed in correspondence with the BS. It effectively transfers the packets from source to sink with a high network lifetime.
The most efficient approach for decision making is the multi-criteria decision making (MCDM) approach. To further improve MCDM, fuzzy logic was introduced, which overcame the issues shown by MCDM. In [37], a fuzzy-based MCDM for CH selection and a hybrid model for routing were introduced. Optimal CH selection was achieved using the generalized intuitionistic fuzzy soft set (GIFSS) method, and a hybrid of shark smell optimization (SSO) and a genetic algorithm (GA) was used to achieve efficient routing. Finally, a few performance metrics were evaluated, which show the effectiveness of the GIFSS-SSO approach.
WSN utilizes a number of nodes to gather data from the surrounding environment. However, during such a process, energy conservation was considered a major objective. It mainly relies on clustering and routing strategies. Therefore, in [38], an Energy-Aware Distance-based Cluster Head selection and Routing (EADCR) protocol was developed, which enhances the lifetime and energy efficiency of the nodes in WSN. A modified form of fitness function was introduced during CH selection which aims to reduce energy consumption. Then, the shortest path identification approach was introduced for the routing process. This approach utilizes Euclidean distance to reduce the consumed energy. This combined approach enhanced the overall network energy and lifetime.
WSN utilized a large number of SNs to achieve complex communication. However, nowadays, the number of SNs has been reduced. Further, the communication and sensing capabilities are also reduced, which automatically reduces the routing QoS performance. To overcome this, a Fuzzy based Relay Node Selection and Energy Efficient Routing (FRNSEER) was introduced in [39], which makes the routing more efficient and effective. In this approach, fuzzy rules were utilized to select the sink node. During data transmission, a better utility factor and energy can be achieved by introducing the active selection of relay nodes. To achieve better communication, a sensor hub with less energy expenditure was scheduled between the sink and the relay node.
In [40], a two-tier distributed fuzzy logic-based protocol (TTDFP) was introduced for efficiency enhancement in multi-hop WSN. Clustering is used to meet the needs of efficient aggregation in terms of energy use. In a clustered network, gathered data were transferred to the CHs, while the CHs relayed the received packets to the base station. Hotspot and/or energy-hole issues may emerge as a result of using a multi-hop topology. These were reduced by the TTDFP approach, which is an adaptive, distributed protocol that scales and executes effectively for WSN applications. Moreover, it utilizes optimization techniques to tune the fuzzy parameters. It achieved a high network lifetime and energy efficiency.
Clustering was identified as an effective communication platform in WSN. Recently, fuzzy logic was considered an efficient approach for clustering, as it provides a crisp output. However, identifying the optimal solution may take a long time. Therefore, a clonal selection with rule-based fuzzy clustering was introduced in [41], which overcomes the issues shown by the fuzzy algorithm. Compared with other fuzzy-based approaches, this CLONALG-M approach has shown better performance. It depends on the principle of clonal selection, which is based on the adaptive immune process. The output that was approximately deployed on the basis of the membership function was determined using the immune system principle to increase the overall performance. Experimental analysis has shown that this approach outperformed other techniques.
Clustering increases the network's energy efficiency, scalability, and communication capacity. Clustering can be categorized as static or dynamic, and as equal or unequal. Hotspots incur a large overhead and are prone to connection issues in wireless sensor networks, which can only be mitigated by unequal clustering. To avoid such hotspot issues, a zonal division based fuzzy logic approach was introduced in [42]. In this approach, the clustering was performed by fuzzy logic, which minimizes the energy consumption rate. It shows better performance by achieving reduced energy consumption, an extended network lifetime, and load balancing.
In general, the existing approaches have introduced optimization based techniques or some other techniques for clustering and routing, but none of these works concentrated on combining AI with an optimization approach. In our work, we have used the RL and DBN approaches for cluster formation and routing. These approaches further improve the overall network lifetime, so that the system can operate for a longer time period.

III. PROBLEM IDENTIFICATION AND MOTIVATION
Routing in WSN assisted IoT is considered a significant task that should be controlled very carefully. Routing establishes data transmission between the SNs and the base station (BS). The data routing process differentiates WSNs from other wireless ad hoc networks and from modern communication strategies in terms of several challenging characteristics like energy consumption and low network lifetime. Three main problems are considered in the WSN routing process. First, because a large number of SNs is deployed, there is no practical way to develop a global addressing scheme, so traditional IP-based protocols are not suitable for sensor networks. Secondly, almost all sensor network applications require a stream of sensed data from multiple sources to a specific sink node or BS, which conflicts with typical communication networks. Thirdly, multiple sensors in the vicinity of a phenomenon generate similar data, which leads to heavy redundant traffic in the whole network. This kind of redundancy leads to more energy consumption as well as more bandwidth utilization. It also leads to many other issues, such as delay, packet loss, and bandwidth degradation. This motivates us to develop an efficient routing process based on the machine learning concept, which learns from previous interactions to efficiently select its actions in the future.

IV. METHODOLOGY
This section describes the RL based routing protocol in detail. Then the energy consumption model is introduced. In addition, some definitions, terminology, and assumptions are presented to aid understanding.

A. NETWORK MODEL
The assumptions that are considered while developing the WSN network model are discussed below:
• The source and SNs are static in nature.
• Data collection from the CH is done using one sink.
• Based on heterogeneous nature, the SNs are classified as advanced, intermediate and normal nodes.
• The sink should be a supernode, which needs to be kept updated regarding the details about all the SNs.
• CH aggregates data from SNs and transfers the gathered information to the sink node.
• Inter-data communication technique is selected in this approach to perform the data communication via CH.
• The node that reaches zero battery level is considered the dead node.
The model for cluster-based single-hop communication in WSN assisted IoT is shown in Fig. 1. This work focuses on the development of optimal path routing in a 5G wireless communication network (WSN assisted IoT) based on the machine learning (deep neural network) concept. Before performing the routing process, it is essential to cluster the sensors into groups of nodes, because clustering is essential for energy-efficient transmission: it extends the survival rate of the network and consumes the minimum amount of energy. So, in this work, clustering is performed based on the RL algorithm. This algorithm is a centralized technique in which the BS or sink node performs the clustering process and assigns each SN to a specific cluster based on its location information. After the allocation of SNs to clusters, CH selection is made by using the optimization algorithm. However, in a hierarchical clustering based WSN, each CH consumes more energy because it is loaded with receiving and aggregating data from its member SNs. So, it is essential to select the CH properly to achieve an extended network lifetime. In order to achieve this, a Multi-Objective MRFO algorithm is introduced here to elect the CH of each cluster. MRFO is a recently developed bio-inspired optimization algorithm for addressing real-world engineering problems. The process flow of the proposed approach is shown in Fig. 2. Multiple constraints, namely delay, energy, traffic density, and distance, are considered as the parameters to finalize the CH of each cluster. The most important task in sensor networks for enhancing WSN performance in terms of data integrity, throughput, energy efficiency, and latency over dynamically changing, asymmetric, and unreliable wireless channels is the right selection of the optimal route.
After the selection of the CH, efficient data transmission is achieved by proposing a Deep Belief Neural Network (DBN) based routing protocol. For this routing process, the neural network considers factors like residual energy, distance from the CH, number of neighbour nodes, and link distance. In this way, the proposed routing protocol learns the nodes' communication behaviour in depth so that energy-efficient routing can be achieved.

B. ENERGY MODEL
The radio energy dissipation model considered in this work is obtained from [23]. In this model, the receiver dissipates energy to run the radio electronics, whereas the transmitter dissipates energy to run the power amplifier and the radio electronics. Depending on the distance $d$ between transmitter and receiver, either the free-space or the multipath fading model is used; the multipath model applies when $d$ is higher than the defined threshold. The energy dissipation encountered in free space, which grows with $d^2$, is represented in (1). Then, the threshold distance between sender and receiver for the tolerable bit error rate (BER) is identified to select between the multipath and the free-space fading model. The energy consumed while receiving a data packet of $k$ bits is represented in (3). During data aggregation, the energy consumed by the CH is represented in (4), where $n$ is the number of messages, $k$ is the number of bits in the data packet, and $E_{agg}^{P}$ is the total energy consumed while aggregating a single bit.
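The following is a minimal sketch of a radio energy model of this kind, assuming the widely used first-order radio model (free-space loss proportional to d^2 below a crossover distance, multipath loss proportional to d^4 above it). The constant values, the function names, and the exact functional form are illustrative assumptions; they are not taken from equations (1) to (4) of this paper.

```python
import math

# Illustrative constants (assumed values, commonly used with the first-order radio model).
E_ELEC = 50e-9       # J/bit, energy to run the transmitter or receiver electronics
EPS_FS = 10e-12      # J/bit/m^2, free-space amplifier coefficient
EPS_MP = 0.0013e-12  # J/bit/m^4, multipath amplifier coefficient
E_AGG = 5e-9         # J/bit per message, energy to aggregate a single bit

# Crossover (threshold) distance between the free-space and multipath models.
D0 = math.sqrt(EPS_FS / EPS_MP)


def tx_energy(k_bits, d):
    """Energy (J) to transmit k_bits over a distance d (metres)."""
    if d < D0:
        return k_bits * E_ELEC + k_bits * EPS_FS * d ** 2
    return k_bits * E_ELEC + k_bits * EPS_MP * d ** 4


def rx_energy(k_bits):
    """Energy (J) to receive k_bits."""
    return k_bits * E_ELEC


def aggregation_energy(n_messages, k_bits):
    """Energy (J) spent by a CH to aggregate n_messages packets of k_bits each."""
    return n_messages * k_bits * E_AGG
```

Under these assumed constants the crossover distance is roughly 88 m, so a 4000-bit packet sent over 50 m falls under the free-space branch.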

C. CLUSTER GENERATION USING REINFORCEMENT LEARNING
RL is a learning process that provides a reward value for favourable actions. The RL process includes a few essential components: agent, action, state, reward, policy, value function, and environment model. RL operates on the basis of the Markov decision process (MDP) and integrates greedy selection and the temporal difference approach as its selection and mathematical modelling process [24,25]. In this work, the SNs are clustered using an RL algorithm. For RL based clustering, the nodes in the WSN act as the learning agents. Based on particular policies, the learning agents analyse the energy level of each adjacent neighbour for clustering. Before forming the clusters, the MDP for each node is evaluated. State, action, policy, and reward are integrated within the MDP. The temporal difference procedure is used by the learning agents to obtain the action policy regarding the network environment.

FIGURE 3. Reinforcement learning
The RL model is shown in Fig. 3. Each SN integrates the RL concept for clustering: it initially evaluates the route cost and provides that information to the CH on the basis of the updated Q-value. The link cost between the present node and the next-hop node is represented by the reward parameter [24]. The MDP is defined by the tuple [S: set of states, T: transition function, A: set of actions, and R: reward function]. In each state of S, the learning agent selects an action from A, and with this selected action, the energy consumption is estimated for each cluster. Finally, a proper decision is made by evaluating the reward R obtained from the estimated energy consumption. Then, the state and action indices are incremented by one, that is, from $S_i$ to $S_{i+1}$ (state) and from $A_i$ to $A_{i+1}$ (action). The optimal policy Q, which increases the reward, is developed by the learning agent from its learning experience. This optimal policy is used for optimal CH selection.
The learning agent aims to improve the intelligent strategy by making the value $V(S_i)$ high. This process is referred to as the policy, and it is represented in (6). Finally, the Q-value is updated using (7).
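As an illustration of the temporal difference procedure described above, the sketch below shows a standard tabular Q-learning update with epsilon-greedy action selection. It assumes the textbook update rule Q(s, a) ← Q(s, a) + α [r + γ max Q(s', a') − Q(s, a)], which may differ from equations (6) and (7) of this paper. The learning rate, discount factor, exploration probability, node names, and reward value are all hypothetical.

```python
import random
from collections import defaultdict

ALPHA = 0.5    # learning rate (assumed)
GAMMA = 0.9    # discount factor (assumed)
EPSILON = 0.1  # exploration probability (assumed)


def choose_action(q, state, actions, rng=random):
    """Epsilon-greedy choice among the candidate actions (e.g. clusters to join)."""
    if rng.random() < EPSILON:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def td_update(q, state, action, reward, next_state, next_actions):
    """Apply one temporal-difference (Q-learning) update and return the new Q-value."""
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + GAMMA * best_next
    q[(state, action)] += ALPHA * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical example: node "n1" joins cluster "c1" and receives a reward
# derived from the estimated energy consumption (0.8 is an arbitrary value).
q_table = defaultdict(float)
first_q = td_update(q_table, "n1", "c1", reward=0.8,
                    next_state="c1", next_actions=["sink"])
second_q = td_update(q_table, "n1", "c1", reward=0.8,
                     next_state="c1", next_actions=["sink"])
```

Repeating the update with the same reward moves the Q-value towards the target, which is the behaviour the learning agent relies on when it refines its clustering policy.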

D. CLUSTER HEAD SELECTION USING MRFO ALGORITHM
A probabilistic process is used for CH selection, which selects the appropriate node of the cluster as CH. It analyses multiple objectives such as traffic density, energy, delay, and distance for CH selection. Nodes consume considerable energy during data gathering, forwarding, and receiving. Compared to the other nodes, the CH consumes far more energy, as it is responsible for receiving data from different SNs and forwarding it. Further, it also aggregates the collected data. Therefore, it is essential to select nodes that retain more energy while performing all these tasks. Such nodes need to be selected as CH, and they are identified by analysing the multiple objectives discussed below.

1) MULTI-OBJECTIVE FOR CH SELECTION
The node having high energy, coverage with the lowest cost, and the closest position to the user will be chosen as CH. The CHs selected from each cluster perform data aggregation and packet forwarding to the BS, either directly or via an additional hop [26]. After picking the CH of each cluster, the route for transferring the aggregated data to the BS is determined. Distance, energy, latency, and traffic density are the multiple constraints utilized to attain energy-aware routing. The relevance of each energy-aware constraint in WSN routing is discussed in this section.

Distance
The distance measure captures the need to account for transmission distance in WSN data transmission. When an SN is converted into a CH, the distance between cluster members is computed so that it is kept to a minimum. The shortest distance between the SN and the CH is taken into account, and the SN closest to the CH is chosen for data transmission. The formula for distance is given in (8). The distance travelled by data from the CH to the sink and the distance travelled by data packets from the sink to the cluster node are taken as the numerator term in the distance formula. The distance should lie in the range of 0 to 1, so the value is normalized.

Energy
The energy parameter of the network node should be maximal, indicating that the node's energy is adequate to carry on data forwarding over the network, whereas the energy spent on data forwarding in WSNs should be minimal. By subtracting the cumulative energies from one, the maximization problem is turned into a minimization problem, as indicated in (9). Energy is the essential metric, and it is estimated by determining the remaining energy in each node. The remaining energy is obtained from the cumulative cluster energy and the sum of energy from all clusters. The energy metric is modelled on this basis, and the cumulative energy related to the CH is also represented in this model. The node which shows the maximum energy is considered the optimal CH.

Delay
For the ideal cluster head, the network latency [27] must be reduced, and it is directly related to the total number of members in that particular cluster. The latency grows in proportion to the number of cluster members, indicating that the number of members aggregated under the ideal cluster should be kept to a minimum. In other words, the transmission delay is determined by the number of cluster members. As a result, the cluster with the smallest number of members starts transferring data packets first.

Traffic density
To maintain an effective network, the traffic density needs to be kept to a minimum. Traffic density mainly depends on packet drop, channel load, and buffer utilization, and the average of these three parameters gives the traffic density. The ratio between the buffer space and the buffer size is evaluated to determine the buffer utilization, which is defined in (13). During data transmission, the ratio between the transmitted packets and the dropped packets is evaluated to determine the packet drop ratio. The channel load is defined in (15),
$C_l = \frac{C_{busy}}{R}$ (15)
where the channel that is in a busy state is represented as $C_{busy}$, and the total number of rounds specified during the simulation time is represented as $R$. The channel load is thus obtained from the number of rounds and the channel state over the simulation time.
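To make the multi-objective evaluation concrete, the sketch below computes the traffic density as the average of buffer utilization, packet drop ratio, and channel load, and combines it with distance, energy, and delay into a single cost to be minimized. The equal-weight sum, the reading of buffer utilization as occupied space over buffer size, the reading of packet drop ratio as dropped over transmitted packets, and all node values are assumptions made for illustration. The paper's equations (8) to (15) may combine the terms differently.

```python
def buffer_utilization(used_space, buffer_size):
    """Fraction of the buffer that is occupied (assumed reading of the ratio)."""
    return used_space / buffer_size


def packet_drop_ratio(dropped, transmitted):
    """Dropped packets relative to transmitted packets (assumed reading)."""
    return dropped / transmitted if transmitted else 0.0


def channel_load(busy_rounds, total_rounds):
    """Channel load C_l = C_busy / R."""
    return busy_rounds / total_rounds


def traffic_density(buf_util, drop_ratio, load):
    """Average of the three traffic parameters."""
    return (buf_util + drop_ratio + load) / 3.0


def ch_fitness(distance, residual_energy, delay, density,
               weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted cost to minimize; all inputs are assumed normalized to [0, 1].

    The energy objective is turned into a cost by subtracting it from one.
    """
    terms = (distance, 1.0 - residual_energy, delay, density)
    return sum(w * t for w, t in zip(weights, terms))


# Hypothetical candidates: (distance, residual energy, delay, traffic density).
density_a = traffic_density(buffer_utilization(20, 100),
                            packet_drop_ratio(5, 100),
                            channel_load(30, 100))
candidates = {
    "node_a": (0.2, 0.9, 0.3, density_a),
    "node_b": (0.6, 0.5, 0.5, 0.4),
}
best_ch = min(candidates, key=lambda name: ch_fitness(*candidates[name]))
```

In this example the node with the shorter distance, higher residual energy, and lower traffic density obtains the lower cost and would be preferred as CH.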

2) MANTARAY FORAGING OPTIMIZATION (MRFO) ALGORITHM:
The manta ray is a marine creature that has two pectoral fins and a flat body [28]. The MRFO algorithm is used in the proposed architecture to analyse the multi-objective function for CH selection. The mathematical modelling of the MRFO algorithm is discussed below:

Mathematical model of MRFO
The mathematical model of the foraging behaviour in MRFO contains three different strategies: chain foraging, cyclone foraging, and somersault foraging.

Chain foraging
Initially, the manta rays search the entire solution space for plankton (the node that satisfies the objective function). After determining the plankton position, they swim towards the optimal solution. The node that has high energy, a short distance to the sink node, low traffic density, and low delay is considered the best CH. Each manta ray moves towards the best plankton by following the preceding manta ray. Based on the identified best solution, each individual updates its current position. The chain foraging model is represented in (16), which involves a weight coefficient, the position of the area with the highest plankton concentration, and the updated position of the $i$-th individual.
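As an illustration of the chain foraging step, the following Python sketch assumes the commonly published MRFO update, in which the first individual follows the best position, every other individual follows its predecessor, and the weight coefficient is $\alpha = 2r\sqrt{|\log r|}$. These formulas are an assumption and may differ from the exact form of (16); the function name and the list-based representation are hypothetical.

```python
import math
import random


def chain_foraging(positions, best, rng=random):
    """One chain-foraging update for all individuals (assumed standard MRFO form)."""
    updated = []
    for i, x in enumerate(positions):
        # The first individual follows the best position; the others follow
        # the individual directly in front of them in the chain.
        leader = best if i == 0 else positions[i - 1]
        new_x = []
        for d in range(len(x)):
            r = rng.random() or 1e-12  # guard against log(0)
            alpha = 2.0 * r * math.sqrt(abs(math.log(r)))  # weight coefficient
            new_x.append(x[d] + r * (leader[d] - x[d]) + alpha * (best[d] - x[d]))
        updated.append(new_x)
    return updated


random.seed(1)
swarm = [[0.0, 0.0], [4.0, 1.0], [2.0, 5.0]]
print(chain_foraging(swarm, best=[1.0, 2.0]))
```

In this sketch each position is a plain coordinate vector; in the CH selection setting it would encode a candidate node evaluated by the multi-objective fitness function.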

Cyclone foraging
All solutions follow the preceding solutions to reach the best position of the plankton. The individuals then move along a spiral path, which is modelled in (17), where the random number takes a value in the range [0, 1]. The mathematical expression for cyclone foraging in the n-dimensional space is defined in (18), where $r$ represents a random number whose value ranges from 0 to 1. Each individual performs a random search based on the reference position (i.e. the plankton). Cyclone foraging attains good exploitation and also improves the exploration capability. To reach the optimal solution, each individual needs to update its position instead of staying in the current position. To achieve this position update, a new reference position is allocated to each individual. This stage is represented in (20).
[Flowchart: decision t/Tmax < rand (Yes/No), with location updates using equations (16), (21), and (22).]
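The cyclone foraging step can be sketched in Python as follows. The sketch assumes the commonly published MRFO form: a spiral coefficient $\beta = 2e^{r_1 (T-t+1)/T}\sin(2\pi r_1)$, and a reference position that is random within the search bounds when $t/T_{max} < rand$ and equal to the best position otherwise. The function name, the scalar bounds `lower` and `upper`, and the clamping to those bounds are illustrative assumptions rather than details taken from (17) to (21).

```python
import math
import random


def cyclone_foraging(positions, best, t, t_max, lower, upper, rng=random):
    """One cyclone-foraging update (assumed standard MRFO form)."""
    dim = len(best)
    # Early iterations favour a random reference (exploration);
    # later iterations favour the best position (exploitation).
    if t / t_max < rng.random():
        ref = [lower + rng.random() * (upper - lower) for _ in range(dim)]
    else:
        ref = list(best)
    updated = []
    for i, x in enumerate(positions):
        leader = ref if i == 0 else positions[i - 1]
        new_x = []
        for d in range(dim):
            r, r1 = rng.random(), rng.random()
            # Spiral weight coefficient; its magnitude shrinks as t approaches t_max.
            beta = 2.0 * math.exp(r1 * (t_max - t + 1) / t_max) * math.sin(2.0 * math.pi * r1)
            value = ref[d] + r * (leader[d] - x[d]) + beta * (ref[d] - x[d])
            new_x.append(min(max(value, lower), upper))  # stay inside the bounds
        updated.append(new_x)
    return updated


random.seed(2)
swarm = [[0.0, 0.0], [4.0, 1.0], [2.0, 5.0]]
print(cyclone_foraging(swarm, best=[1.0, 2.0], t=3, t_max=10, lower=0.0, upper=5.0))
```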

Somersault foraging
Each individual makes a random movement around the plankton and performs a somersault to reach a new position. The somersault foraging performed by the manta rays is described in (22), in which the random number takes values from 0 to 1. Individuals anywhere in the search space may update their position to a point between the current position and the best position. The disturbance applied to a solution at its current position reduces as it moves closer to the best solution. Together, the three strategies of the MRFO algorithm improve the efficiency of the CH selection process. Almost all the nodes come close to the optimal solution, but the node that best satisfies the fitness function is selected as the CH.
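A minimal Python sketch of the somersault step is given below. It assumes the commonly published MRFO update $x_i(t+1) = x_i(t) + S\,(r_2 x_{best} - r_3 x_i(t))$ with somersault factor $S = 2$; the factor value and the function name are assumptions, and the update may differ in detail from (22).

```python
import random


def somersault_foraging(positions, best, somersault_factor=2.0, rng=random):
    """One somersault update around the best position (assumed standard MRFO form)."""
    updated = []
    for x in positions:
        new_x = []
        for d in range(len(x)):
            r2, r3 = rng.random(), rng.random()  # random numbers in [0, 1)
            new_x.append(x[d] + somersault_factor * (r2 * best[d] - r3 * x[d]))
        updated.append(new_x)
    return updated


random.seed(3)
swarm = [[0.5, 0.5], [4.0, 1.0]]
print(somersault_foraging(swarm, best=[1.0, 2.0]))
```

Setting `somersault_factor` to zero leaves every position unchanged, which makes it easy to see that the factor controls the range of the somersault.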

E. DEEP BELIEF NETWORK-BASED ROUTING
DBN is an efficient deep learning network, which is also described as a probabilistic generative network (PGN). It contains multiple layers, and each layer contains a number of hidden and visible neurons. The DBN layers include restricted Boltzmann machine (RBM) and multilayer perceptron (MLP) layers [29,30]. Both kinds of layer contain input and hidden layers, and the MLP additionally contains an output layer. In the DBN, the hidden and input layers are connected with tuneable weights, which are the most significant feature of the DBN architecture. The inputs given to the neural network are described below.
Sink: The destination node that gathers the aggregated data.
Action history: Before the current data is aggregated, the data communication for the previously aggregated k data is performed, and this is considered the action.
Future node: The total number of 'C' aggregated data that wait behind the present aggregated data.
Max-distance node: The maximum possible distance between the node and all of its nearby nodes.
The first hidden layer contains a combination of 4 subsets of hidden neurons. Each subset contains 28 neurons, which are meshed with their respective input neurons. In addition, two hidden layers with 128 neurons are included in this DBN architecture.
Two RBM layers are present, RBM 1 and RBM 2, each containing input and hidden layers. The mathematical model for RBM 1 is represented in (23). For the MLP layer, the total number of input neurons is represented as $r$ and the total number of hidden-layer neurons is represented as $y$. In the definition of the MLP output, $w^{G}_{xz}$ represents the weight between hidden neuron $x$ and output neuron $z$ of the MLP layer. In the definition of the hidden-layer output, $K_n$ represents the bias corresponding to the output of the MLP, and $w_{nx}$ represents the weight that links input neuron $n$ with hidden neuron $x$. The algorithm for DBN routing is presented in Algorithm 2.

Algorithm 2
Initialize sink, action history, future node, and max-distance node as the initial parameters
Define RBM 1
RBM 1 input is defined using equation (23)
Then, apply the weight coefficient
Output from RBM 1 is obtained using equation (25)
Step 1: Train RBM 1 and RBM 2 layers
Input the RBM 1 output to RBM 2
RBM 2 input is defined using equation (26)
Then, apply the weight coefficient
Output from RBM 2 is obtained using equation (29)
Step 2: Train MLP layers
Input the RBM 2 output and the MLP weights to the MLP layer
MLP output is represented as
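The data flow of Algorithm 2 (features into RBM 1, RBM 1 output into RBM 2, RBM 2 output into the MLP) can be sketched as a plain forward pass in Python. This is a hedged illustration, not the authors' implementation: sigmoid activations, the layer sizes, the random initial weights, and all function names are assumptions, and no training is performed.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def dense(inputs, weights, biases):
    """Sigmoid layer; weights[j][i] links input i to unit j."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def random_layer(n_in, n_out, rng):
    """Small random weights and zero biases (illustrative initialization)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dbn_forward(features, layers):
    """Pass the features through each layer in order: RBM 1, RBM 2, then the MLP."""
    activation = features
    for weights, biases in layers:
        activation = dense(activation, weights, biases)
    return activation


rng = random.Random(0)
# Hypothetical sizes: 4 input features (sink, action history, future node,
# max-distance node), two hidden layers of 128 units, one output score.
sizes = [4, 128, 128, 1]
layers = [random_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
score = dbn_forward([0.2, 0.5, 0.1, 0.9], layers)
print(score)
```

In a routing setting, such a score could be computed per candidate path and the highest-scoring path selected; that use is an assumption of this sketch.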

1) TRAINING PHASE OF DBN CLASSIFIER
The DBN classifier must be trained well to obtain the weights and biases needed to identify the best data transmission route. The training procedure mainly aims to tune the MLP and RBM layers, and it depends strongly on the weights obtained from each learning phase.
Step 1: Train RBM 1 and RBM 2 layers: Initially, provide the input features to RBM 1 and identify the probability distribution for each data item. Then, apply the weights to each input to obtain the output from RBM 1, and provide this output as the input to RBM 2. A similar process is carried out in RBM 2 to obtain the input, in vector format, for the MLP layer.
Step 2: Training phase of MLP layer: The MLP layer processes the following steps, and its input is obtained from the RBM 2 layer. Initialization: first, the MLP weights are initialized randomly. The weights of the hidden and visible layers are then defined, and the network error is evaluated from the ground value and the network output. The network error needs to be small so that the best solution can be achieved. Finally, the data transmission is carried out successfully through the selected path with less energy consumption.
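The role of the network error in Step 2 can be illustrated with a short Python sketch. It assumes that the error is the mean squared difference between the ground values and the network outputs, which is a common choice but is not confirmed by the text; the helper names and the toy model are hypothetical.

```python
def network_error(ground_values, network_outputs):
    """Mean squared error between ground values and network outputs (assumed form)."""
    if len(ground_values) != len(network_outputs):
        raise ValueError("both sequences must have the same length")
    n = len(ground_values)
    return sum((g - o) ** 2 for g, o in zip(ground_values, network_outputs)) / n


def pick_best_weights(candidates, evaluate, ground_values):
    """Return the candidate weight set whose outputs give the smallest error."""
    return min(candidates, key=lambda w: network_error(ground_values, evaluate(w)))


# Toy model: a single weight w that scales two fixed inputs.
inputs = [1.0, 2.0]
best_w = pick_best_weights(
    candidates=[0.5, 1.0, 2.0],
    evaluate=lambda w: [w * x for x in inputs],
    ground_values=[2.0, 4.0],
)
print(best_w)
```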

V. RESULTS AND DISCUSSIONS
The performance of the proposed deep learning-based routing protocol is analysed by simulating the proposed architecture on the MATLAB platform. In the experiments, the number of nodes is varied from 200 to 1000. The nodes are deployed in a 1000 × 1000 m² area. The performance of the proposed DBN-RP is compared with five existing algorithms: the genetic-based energy-efficient clustering (GEEC) protocol, TTDFP, EADCR, CLONALG-M, and a deep neural network (DNN). The simulation parameters used in this work are listed in Table I.

A. EVALUATION METRICS
Network lifetime: The network lifetime metric identifies the total number of rounds, or the time, for which the network performs its operation. It also provides information regarding the time at which a node dies while performing the data transmission task [31]. The equation used to evaluate the network lifetime is represented in (36).
Number of alive nodes: This metric gives the total number of nodes that retain enough energy to forward and receive packets. The network lifetime can also be evaluated with this factor.
In (38), the energy utilized by the CH in the network is indicated as CHE, and SE is the energy consumption of the member node.
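These metrics can be illustrated with a small Python sketch. Several details are assumptions, because the corresponding equations are not reproduced in the text: a node counts as alive when its residual energy exceeds a threshold, the network lifetime is taken as the first round in which no node is alive, and the total energy consumption is taken as the sum of the CH energy (CHE) and the member-node energy (SE). All function names are hypothetical.

```python
def alive_nodes(residual_energy, threshold=0.0):
    """Count nodes whose residual energy exceeds the threshold."""
    return sum(1 for e in residual_energy if e > threshold)


def total_energy_consumption(ch_energy, member_energy):
    """Sum of CH energy (CHE) and member-node energy (SE); assumed combination."""
    return sum(ch_energy) + sum(member_energy)


def network_lifetime(energy_per_round, threshold=0.0):
    """First round (1-based) in which no node is alive, or None if nodes survive."""
    for round_no, energies in enumerate(energy_per_round, start=1):
        if alive_nodes(energies, threshold) == 0:
            return round_no
    return None


# Illustrative residual energies of two nodes over three rounds.
history = [[1.0, 1.0], [0.5, 0.0], [0.0, 0.0]]
print([alive_nodes(r) for r in history], network_lifetime(history))
```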

B. PERFORMANCE ANALYSIS
In this section, the performance analysis for CH selection and routing is carried out using different network parameters: network lifetime, throughput, number of alive nodes, and packets transmitted to the CH. The results obtained for these metrics are illustrated and discussed in the paragraphs below. The overall performance obtained by the proposed and existing routing protocols is summarized in Table II.
The total number of alive nodes obtained for different rounds is illustrated in Fig. 5. As the number of rounds increases, the proposed approach retains more alive nodes in the overall area than the other existing algorithms. The major objective of an energy-aware clustering protocol is network lifetime enhancement, and this metric is useful for evaluating the time at which the last SN becomes lifeless. The number of alive nodes achieved by GEEC for various rounds is much lower than that of the proposed routing protocol, whereas the number obtained by DNN is almost the same as that of the proposed protocol. This illustrates that introducing deep learning into WSN routing attains a better network lifetime by identifying the best path for data transmission without causing much reduction in energy efficiency.

FIGURE 6. Packets sent to CH vs number of rounds
The total number of packets successfully transmitted to the CH for different rounds is shown in Fig. 6. The proposed architecture transmits more packets successfully to the sink node via the CH than the other algorithms. This is because the proposed architecture integrates a simple and efficient optimization algorithm for CH selection. The MRFO algorithm considers a multi-objective fitness function for CH selection, with four objectives: energy, delay, traffic density, and distance. The node that satisfies these multi-objective conditions is selected as the CH. The remaining nodes in the cluster then transfer the gathered data to the CH.

FIGURE 7. Energy vs number of rounds comparison
The energy maintained by each node in the network over various rounds is shown in Fig. 7. The energy level retained by the proposed technique is higher than that of the other existing algorithms, i.e. the proposed approach conserves more energy than the existing techniques. This is because the optimal selection of the CH improves the energy conservation efficiency of the proposed architecture. By 5000 rounds, the energy in the network is drained. The existing DNN architecture conserves 1.0348% energy, which is better than the GEEC, TTDFP, EADCR, and CLONALG-M techniques. Energy-efficient networks are in wide demand across various applications. Because the total energy level maintained by the proposed routing protocol is better, the lifetime of the entire network is also enhanced.

FIGURE 8. Packets transmitted to sink vs number of rounds
The average improvement shown by the proposed method while transmitting data packets to the sink is shown in Fig. 8. The overall improvement shown by DNN while transferring data to the sink is 3.0852%, which is larger than that of the other existing algorithms. However, it still shows a slight degradation in network performance, which is overcome by the proposed DBN-based routing protocol. The other existing algorithms, namely GEEC, TTDFP, EADCR, and CLONALG-M, show less improvement in data packet transmission. In contrast, the effective clustering performed by the RL algorithm leads to effective data transmission. Moreover, the proposed protocol performs the packet transmission at a better rate without causing any loss in the transmitted data.

FIGURE 9. Energy consumption vs network size
The energy consumed by the proposed protocol is compared with that of the existing protocols, and the comparison results are shown in Fig. 9. The energy consumption needs to be low so that a better network lifetime can be achieved. All the nodes in the network are deployed randomly; therefore, a particular threshold needs to be defined during CH selection. Further, different route selections need to be defined to achieve an efficient routing process. The energy consumed by the proposed architecture is lower than that of the other existing protocols. An increase in network size increases the energy consumption, which needs to be reduced to achieve effective performance. To achieve this objective, the entire network is initially clustered using an efficient RL algorithm.
The network lifetime achieved by the proposed algorithm and by the five existing algorithms is determined, and the comparison results are illustrated in Fig. 10. The CH selected using MRFO achieves a longer lifetime than the other existing algorithms. The existing algorithms show large changes in network lifetime, whereas the proposed architecture shows only a small variation. This is because the proposed DBN architecture analyses the network repeatedly, assigning different weight parameters to each path. As a result, the proposed architecture attains a better network lifetime.

FIGURE 11. Throughput vs network size
The data sensed by each node in the network is transferred to the CH, which then transmits the gathered data to the sink node. The data is transmitted in the form of packets. The CH that transmits the largest amount of data is considered the most efficient one. The throughput is obtained from the transmitted packets. The throughput obtained by the proposed algorithm is compared with that of the existing algorithms, and the comparison outcomes are shown in Fig. 11. The throughput of the proposed protocol is higher than that of the other existing algorithms. The recently developed CLONALG-M obtains almost the same performance; however, it fails to attain an effective result on energy consumption, and its network lifetime is reduced as a consequence. To avoid this defect, the DBN-based routing protocol is introduced in this work, which maximizes the effectiveness of the entire network.

C. ANOVA ANALYSIS
Analysis of variance (ANOVA) is a well-known and efficient statistical analysis approach. It is performed here to demonstrate the accuracy and reliability of the proposed framework. ANOVA is used to determine the variation that occurs between two or more means. It determines the F-value (test statistic), from which the p-value is obtained. The p-value is used to decide whether the data support the null hypothesis. In mathematical notation, the null hypothesis H0 is n1=n2=n3=n4. The alternative hypothesis (H1) is that at least one of the obtained means is different. Here, the ANOVA is performed for 1000 SNs, the number of simulation instances is 20, and the critical significance level is set to 0.05. The ANOVA result indicates whether the means obtained by the algorithms are similar (accept H0 and reject H1) or not (reject H0). Two conditions are checked for rejecting the null hypothesis: (i) the p-value is smaller than the significance level, and (ii) the F-statistic is greater than the F-critical value. Tables III, IV, and V show the ANOVA results achieved by the proposed and existing techniques in terms of energy consumption, network lifetime, and throughput.
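To make the F-statistic concrete, the following Python sketch computes a one-way ANOVA F-value from several groups of observations, using the standard ratio of the between-group mean square to the within-group mean square. The function name and the sample values are hypothetical and are not taken from Tables III to V; the p-value is not computed here, since it requires the F distribution.

```python
def one_way_anova_f(groups):
    """Return (F-statistic, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Illustrative samples for three algorithms (not simulation results).
samples = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(samples)
print(f_stat, df_b, df_w)
```

The null hypothesis would be rejected when `f_stat` exceeds the F-critical value for (`df_b`, `df_w`) degrees of freedom at the chosen significance level; a statistics library routine such as SciPy's `f_oneway` can be used when the p-value itself is needed.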

D. COMPLEXITY ANALYSIS
Space and time complexity analysis for clustering:
FIGURE 12. Time and space complexity analysis for the clustering process
The time and space complexity of the clustering algorithm is depicted in Fig. 12(a) and (b). In the proposed framework, the RL algorithm is introduced for clustering. The time complexity of the proposed RL algorithm is compared with that of the existing k-means and fuzzy c-means clustering techniques. Compared with these two clustering approaches, the learning algorithm used in this work attains better time complexity. An increase in the number of nodes increases the time complexity; however, the proposed approach takes less time for clustering than the existing techniques. The space complexity behaves inversely to the time complexity, i.e. an increase in the number of nodes reduces the total space complexity. Here too, the proposed approach shows better results than the existing techniques. The learning strategy of the RL clustering algorithm improves the overall clustering performance, which is better than that of traditional unsupervised clustering algorithms.
FIGURE 13. Space and time complexity for the routing process
The space and time complexity analysis of the routing process is shown in Fig. 13(a) and (b). The complexity of the proposed DBN routing is compared with three existing routing techniques: the GIFSS-SSOGA, SSO, and GA algorithms. The comparative analysis of time and space complexity indicates that the proposed approach attains more efficient results than the techniques taken for comparison. The proposed framework is intended to obtain efficient routing. For that purpose, learning-based clustering and an optimized CH selection process are introduced, and these two approaches enhance the overall performance of the DBN-based routing process. The next section summarizes the overall proposed framework and concludes with its improvements and achievements.

VI. CONCLUSION
In this work, a DBN-based routing protocol is developed for IoT-based WSNs, which includes the RL algorithm for node clustering. With this routing protocol, energy-balanced clustering and routing are achieved. The learning algorithm integrated within the proposed architecture improves the network lifetime of the entire architecture. The CH of each cluster then needs to be selected to perform effective data transmission. CH selection is a major consideration in WSNs; therefore, an efficient MRFO algorithm is introduced in the proposed architecture. It considers four objectives to select the best CH for transmitting the data to the sink node: distance, delay, traffic density, and energy. Through the selected CHs, the data is transmitted to the sink node without causing any loss of transmitted packets. The shortest path needed for data transmission is then selected using a deep learning-based routing algorithm. The routing protocols developed in existing architectures have not attained satisfactory results. To avoid this issue, a DBN architecture is introduced in the proposed methodology for shortest path identification. Through the identified path, the data transmission can be achieved in an effective manner. The performance of the proposed DBN is better than that of the existing techniques, and the selection of the shortest path for data transmission attains better network lifetime and energy efficiency. Finally, complexity analysis and statistical analysis are carried out to show the effectiveness of the proposed architecture.