A New Hybrid Routing Protocol Using a Modified K-Means Clustering Algorithm and Continuous Hopfield Network for VANET

Vehicular Ad-hoc Networks (VANET) offer several user applications for passengers and drivers, as well as security and internet access applications. Designing a reliable routing protocol that ensures efficient data transmission between vehicles remains a significant challenge. This paper proposes a new clustering-based routing protocol for VANET (KMRP) that combines a modified K-Means algorithm with the Continuous Hopfield Network and the Maximum Stable Set Problem. The basic input parameters of the K-Means algorithm, such as the number of clusters and the initial cluster heads, are not selected randomly but are determined using the Maximum Stable Set Problem and the Continuous Hopfield Network. The assignment of vehicles to clusters is then carried out according to a Link Reliability Model, a metric that replaces the distance parameter in the K-Means algorithm. Finally, the cluster head is selected by a weight function according to the amount of free buffer space, the speed, and the node degree. The simulation results show that the designed protocol performs better in a highway vehicular environment than the most recent schemes designed for the same objective. KMRP reduces traffic congestion and thus provides a significant increase in Throughput. In addition, KMRP decreases the transmission delay and keeps the clusters stable under high density and mobility, which improves the Packet Delivery Ratio.

Cluster member nodes can send their data to the Cluster Head. Then, the Cluster Head can eliminate data redundancy using data aggregation techniques [19].
Indeed, most clustering algorithms require the number of clusters to be specified in advance. Determining this number is therefore one of the most difficult challenges in the clustering process: it is a basic input for most clustering algorithms and plays a very important role in achieving good-quality clustering. Moreover, while clustering algorithms typically use the Euclidean distance to group a data set into clusters, in a VANET environment the clustering process must treat the reliability of the links between vehicles as a major parameter when assigning each vehicle to a cluster. The reliability of the links depends on the mobility model that describes the vehicular network.
A robust clustering algorithm must capture the mobility of vehicles and provide relatively reliable and stable information to ensure the proper dissemination of data between clusters. In this paper, by combining the Maximum Stable Set Problem, solved by a Continuous Hopfield Network, with the K-Means clustering method, we propose a new routing protocol (KMRP) for VANET that uses a clustering-based topology to create the appropriate route between vehicles to transmit data. We use the Maximum Stable Set Problem (MSSP) [20] combined with the Continuous Hopfield Network (CHN) to select the initial cluster heads (CHs) and to determine the number of clusters [21]. Then, the assignment of each vehicle to a cluster is based on a link reliability model. Finally, in the maintenance step, an appropriate weight function is used to select suitable CH vehicles, taking into account parameters such as the buffer size of the node, the speed, and the node degree.
In short, the main contributions of this paper are:
• An overview of the different clustering-based routing protocol schemes proposed for VANET in the literature.
• A cluster-based routing protocol for VANET that combines MSSP and CHN with a modified K-Means algorithm. The clustering scheme is divided into two levels:
Level 1 - Initial phase: the initial cluster heads and the number of clusters are defined by transforming the set of vehicles into an MSSP instance, which is solved by CHN.
Level 2 - Clustering phase: using a modified K-Means algorithm, each node calculates a link reliability value with all accessible CHs, then joins the most appropriate one according to several metrics such as traffic density, relative speed, and distance. The cluster heads are then updated: in each cluster, the node with the largest weight is selected as the new cluster head. Finally, the re-clustering process is repeated.
The rest of the paper is structured as follows: Section 2 gives a short study of the clustering-based routing protocols suggested in the literature. Section 3 presents a general overview of the K-Means algorithm. Section 4 presents and discusses the proposed method in detail. Section 5 reports the simulation results, and Section 6 gives the conclusion and final comments.

II. RELATED WORK
Several clustering methods and routing protocols have been proposed for VANET networks. In this section, we briefly survey the main solutions suggested for VANETs in the literature. Then, we give a qualitative comparison between these protocols listed in Table 1.
A. CLUSTERING SOLUTIONS BASED ON NEURAL NETWORKS
1) HYBRID ROUTING SCHEME USING IMPERIALIST COMPETITIVE ALGORITHM AND RBF NEURAL NETWORKS FOR VANETs
A hybrid routing scheme using the Imperialist Competitive Algorithm (ICA) and Radial Basis Function (RBF) neural networks for VANET was proposed in [22]. This method is broadly divided into three phases:
1. Cluster formation: considering the degree and velocity of the nodes, the proposed scheme uses the ICA algorithm to create the clusters.
2. Cluster head election: based on the RBF neural networks, the authors proposed a suitable fitness function that considers several parameters such as unoccupied buffer size and the expected transmission count to select the new cluster heads.
3. Cluster head communication: the proposed scheme selects the gateway nodes (GN) in each cluster to establish communication between the different cluster heads.
However, this proposed scheme requires more memory and computation which influences data transmission delay between vehicles.
2) A ROUTING PROTOCOL FOR VEHICULAR AD HOC NETWORKS USING SIMULATED ANNEALING ALGORITHM AND NEURAL NETWORKS
Bagherlou and Ghaffari [23] proposed a cluster-based routing protocol for VANET. This method aims to create an appropriate route for transmitting information between the sending and receiving nodes. The essence of this approach can be summarized in two steps:
1. Generate the appropriate clusters using Simulated Annealing. This step takes into account several parameters such as the degree, coverage, and ability of the nodes.
2. To select the Cluster Head, the authors used the Radial Basis Function neural network and an appropriate fitness function with parameters of velocity and free buffer size. Each cluster head selects two nodes as appropriate gateway nodes for transmitting data from one cluster to another.
However, the authors did not introduce a detailed election scheme.

A clustering-based reliable low-latency multipath routing scheme (CRLLR) was proposed in [24]. The key component of this model contains two processes:
1. The cluster head (CH) is selected using the link reliability value.
2. Ant Colony Optimisation (ACO) is employed as an efficient method to find suitable paths between vehicles.
However, the proposed routing algorithm requires more routing packets in the Route Discovery Process.

2) RMRPTS: A RELIABLE MULTI-LEVEL ROUTING PROTOCOL WITH TABU SEARCH IN VANET
A reliable multi-level routing protocol with Tabu Search (RMRPTS) was proposed in [25]. Moridi and Barati suggested an improved, clustering-based version of the AODV routing protocol in order to provide a multi-level routing algorithm for VANET. Two main processes characterize this model:
1. At the first level, a Fuzzy Logic system is used to determine the most stable links between the member nodes of the clusters, based on several parameters such as Link Expiration Time (LET) and Link Reliability (REL).
2. At the second level, the best path between the cluster heads is selected using Tabu Search. Several mobility parameters such as the distance, direction, and velocity of the nodes are taken into account at this level.
This approach aims to reduce the number of failed links and the number of packets lost during communication between vehicles, but it does not ensure the stability of clusters.

3) CBQoS-VANET: CLUSTER-BASED ARTIFICIAL BEE COLONY ALGORITHM FOR QoS ROUTING PROTOCOL IN VANET
A new QoS-based unicast routing protocol for vehicular networks (CBQoS) was developed in [26]. Fekair et al. applied two methods:
1. A clustering scheme was used in order to optimize the exchange of routing information according to QoS criteria.
2. An artificial bee colony algorithm was used to select the optimal paths between the nodes according to QoS requirements.
However, this method does not provide a cluster maintenance phase.

C. ROUTING PROTOCOL BASED ON CLUSTERING 1) A DATA DISSEMINATION SCHEME BASED ON CLUSTERING AND PROBABILISTIC BROADCASTING IN VANETs
A novel data dissemination scheme based on Clustering and Probabilistic Broadcasting (CPB) was proposed in [27]. Three main processes characterize this model:
1. The clustering procedure is carried out using an appropriate clustering method, which depends on the direction of the vehicles.
2. The exchange of packets between the cluster head and the member nodes is based on the calculation of a probability, which depends on the number of times that the same packet is received in an interval.
3. Then, the elected cluster head continues to disseminate data toward the transmission direction.
However, the clustering method has been designed for certain particular applications.

2) CONTROL OVERHEAD REDUCTION IN CLUSTER-BASED VANET ROUTING PROTOCOL
In order to decrease the control overhead in a clustering process, a Control Overhead Reduction Algorithm (CORA) was proposed in [28]. This routing protocol uses an appropriate mechanism to calculate a suitable period for updating or forwarding CMHELLO packets between the Cluster Head and the member nodes. However, this method does not take into account the mobility parameters when creating clusters and selecting cluster heads.

3) DISTANCE AND CLUSTERING-BASED ENERGY-EFFICIENT PSEUDONYMS CHANGING STRATEGY OVER ROAD NETWORK
A distance and cluster-based energy pseudonyms changing scheme for VANET was designed in [29]. This clustering algorithm includes two stages:
1. In the clustering optimization, each cluster head carries out the pseudonym change using the predicted energy and distance of each cluster member.
2. The nodes which are located near the Road Side Unit (RSU) are responsible for sending data from the entire network to report server. Therefore, vehicles in this area will require less energy to change the pseudonyms.
This method uses the distance and energy parameters to classify vehicles into clusters, which cannot be sufficient to meet the requirements of the characteristics of VANET.

4) CLUSTERING BASED ENERGY EFFICIENT AND COMMUNICATION PROTOCOL FOR MULTIPLE MIX-ZONES OVER ROAD NETWORKS
In order to reduce the loopholes of established clustering protocols for several mix-zones on road networks, Arain et al. [30] suggested a clustering-based energy-efficient and communication protocol (CEECP). To benefit from V2V as well as V2I communication for cooperative traffic information systems (CTIS), the authors designed a novel CEECP for a chain scenario to connect to the RSU. However, the clustering provided by this work relies on the Road Side Unit (RSU), which may not be available in the future due to the cost of deployment.

5) VANET CLUSTERING BASED ROUTING PROTOCOL SUITABLE FOR DESERTS
A novel algorithm for organizing a cluster structure and electing cluster heads (CHE) suitable for VANET (CBVRP) was proposed in [31]. This robust clustering scheme uses location and velocity to form clusters and select cluster heads. Its main objective is to guarantee optimal use of the equipment on each vehicle and to ensure reliable delivery of information. It was proposed for desert environments and can achieve high communication efficiency. Unfortunately, cluster creation is mainly based on the partitioning of the network, which negatively affects the lifetime of the clusters.

6) CLUSTERING IN VEHICULAR AD HOC NETWORKS USING AFFINITY PROPAGATION
A new distributed clustering scheme for VANET, based on the affinity propagation algorithm, was proposed in [32]. This algorithm takes into account the mobility of the nodes during the formation of the clusters and produces highly stable clusters. The cluster creation process is evaluated in terms of the average cluster change ratio, the average number of clusters, the average cluster duration, and the average duration of cluster members. However, the precision of the parameters used in the implementation can have a significant effect on the quality of the result.

7) PassCAR: A PASSIVE CLUSTERING AIDED ROUTING PROTOCOL FOR VEHICULAR AD HOC NETWORKS
Wang and Lin [33] suggested a routing protocol for the one-way, multi-lane highway scenario (PassCAR). This protocol uses the passive clustering method. The selection of the cluster head and gateway nodes is based on a multi-metric election strategy in which several parameters are taken into account, such as the degree of the node, the lifetime of the link, and the number of expected transmissions. This approach aims to create suitable and reliable clusters during the route discovery phase, but it generates high routing overhead.
To summarize this section, these proposed works have one or more of the following limitations: (i) Formation of clusters and cluster heads selection are based on the partitioning of road networks instead of being based on the parameters of mobility and the quality of the links between vehicles; (ii) Clustering algorithm requires more computation and memory which makes the clustering process relatively slow; (iii) Clustering-based routing protocols are based on the RSU which may not be available in the future due to the cost of deployment. However, our proposed scheme provides a complete routing protocol for VANET which consists of an unsupervised and faster classification method and a suitable routing protocol.

III. K-MEANS CLUSTERING ALGORITHM
The K-Means algorithm is considered as one of the most popular algorithms in the clustering procedure [34], [35]. It is widely used in several fields, such as Data Mining, Sensors Networks, and Ad hoc Networks [36]. It is a simple unsupervised learning algorithm that classifies a data set as a set of K clusters specified at the start of the procedure. Its main objective is to reduce the distance between the members of the cluster and the cluster head.
As shown in Figure 1.a, the algorithm starts from an initial number of clusters $K$. The objective is to reorganize a set of points $x_j$, with $1 \le j \le N$, into $K$ clusters. For this, K-Means randomly selects $K$ points $x_i$, with $1 \le i \le K$, of the data set as centroids, where each centroid belongs to a cluster $C_i$.
Then, the algorithm assigns each point in the data set to the nearest centroid. This process is based on an objective function, which calculates the sum of all squared distances in a cluster, for all clusters, as shown in Figure 1.b. The calculation is carried out using the objective function (1):
$$J = \sum_{i=1}^{K} \sum_{x_j \in C_i} \lVert x_j - u_i \rVert^{2} \tag{1}$$
where $\lVert x_j - u_i \rVert$ is the distance between the point and the centroid of the cluster, $x_j$ is the position of the point, $u_i$ is the position of the centroid with $i = 1, \ldots, K$, and $K$ is the number of clusters.
After assigning the points to each cluster, as shown in Figure 1.c, the K-Means algorithm updates the position of each centroid using (2):
$$u_i = \frac{1}{|C_i|} \sum_{x_j \in C_i} x_j \tag{2}$$
Finally, the clusters are formed, as shown in Figure 1.d. The global algorithm is described in the pseudo-code and flowchart in Figure 2.
K-Means has been applied to VANET networks [34]. However, when the number of connected vehicles and the network topology change frequently, it has some drawbacks, especially the random choice of the initial number of clusters and the objective function used to assign each point to a cluster [44].
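The two K-Means steps described above (assignment to the nearest centroid, then centroid update) can be sketched in a few lines of Python. This is only an illustrative sketch of the standard algorithm on 2-D points; the function names `kmeans` and `objective`, the fixed random seed, and the iteration cap are assumptions made for the example and do not come from the paper.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Plain K-Means on 2-D points; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Random initial centroids drawn from the data set (the step KMRP replaces).
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == i]
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
        if new_labels == labels:
            break  # assignments no longer change
        labels = new_labels
    return centroids, labels


def objective(points, centroids, labels):
    """Sum of squared point-to-centroid distances over all clusters."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
```

For instance, `kmeans([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)], 2)` separates the two visually obvious groups. The random draw of the initial centroids is exactly the step whose sensitivity motivates the initial phase of the proposed scheme.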

IV. PROPOSED SCHEME
The integration of clustering methods in the routing protocols intended for VANET has shown its efficiency in terms of bandwidth and routing overhead costs [23], [24], [28]. In fact, routing protocols based on clustering are robust and effective in the face of periodic changes in the network topology, due to the high mobility and frequent disconnection of vehicles. In this sense, our proposed scheme is designed for a highway topology and includes a method for selecting the initial cluster heads using neural networks, and a method for performing clustering, using a modified K-Means algorithm. In the proposed scheme, each cluster has a cluster head that supports receiving, collecting and aggregating information from all member nodes, and distributing this information to other cluster heads. Thus, the member nodes of each cluster will establish one-hop communications with the associated cluster head. In the route discovery process, each node sends routing packets only to its cluster head. Consequently, the proposed method decreases the number of control packets in network and reduces network overhead.
The proposed scheme is divided into two phases: the initial phase and the clustering phase. In the first phase, we start with an initial set of nodes V, to which we apply MSSP and CHN. The result is a set of CHs S CH, which gives the number of clusters K = |S CH |. Then, in the second phase, a modified K-Means clustering algorithm uses K and S CH to establish the final clusters by populating each cluster with its CH and the nearest nodes in terms of link reliability. At the end, we obtain the same initial set of nodes V, now organized into clusters.

A. INITIAL PHASE
1) SELECTING THE INITIAL CLUSTER HEADS USING MSSP AND CHN
Determining the number of clusters and the best initial cluster heads for a data set remains one of the major challenges of cluster analysis [34], [35]. These are basic parameters for the execution of the K-Means algorithm, so appropriate cluster heads must be specified to obtain good results. To address this deficiency, we present a method for determining these parameters.
The procedure is divided into two steps. In the first step, we reformulate the selection of a stable set of cluster heads as a Maximum Stable Set Problem, which is then modeled as a 0-1 quadratic programming (QP) problem. In the second step, CHN is applied to solve the QP problem. This step requires an energy function associated with the CHN and an appropriate parameter-setting method for the MSSP.
To create a stable set S of vehicles, we convert the set of nodes V into a Maximum Stable Set Problem instance. This method aims to select a stable set of vehicles that will serve as the initial cluster heads. These vehicles are pairwise non-adjacent.
Let V be a set of n nodes with V = {V 1 , V 2 , . . . , V n }. Each node of the graph represents a vehicle. Thus, we model the VANET as an undirected graph G = (V , E), where V is the set of vehicles in the network and E is the set of connection links between pairs of vehicles.
Let $V_i$ be a vehicle with position $(x_i, y_i)$ and velocity $v_i$, and let $V_j$ be a vehicle with position $(x_j, y_j)$ and velocity $v_j$. We assume that all vehicles have the same transmission range $R$. To decide whether to build an edge between $V_i$ and $V_j$, we first calculate the similarity $S_{i,j}$ between them (Equation 3). The presence of a connection (edge) in the graph between the vehicle $V_i$ and the vehicle $V_j$ is then decided by comparing $S_{i,j}$ with a similarity parameter $\varepsilon$ that is fixed a priori ($\varepsilon = 0.991833$).
Let $S \subset V$ be a stable set of vehicles. We associate a binary variable $x_i$ with each vehicle $V_i$ in the network as follows:
$$x_i = \begin{cases} 1 & \text{if } V_i \in S \\ 0 & \text{otherwise} \end{cases}$$
Note that two adjacent nodes $V_i$ and $V_j$ cannot both be in the set $S$:
$$x_i x_j = 0 \quad \text{for all } (V_i, V_j) \in E$$
These quadratic constraints can be accumulated into a single one:
$$\sum_{i=1}^{n} \sum_{j=1}^{n} C_{ij}\, x_i x_j = 0$$
where $C$ is defined as:
$$C_{ij} = \begin{cases} 1 & \text{if } (V_i, V_j) \in E \\ 0 & \text{otherwise} \end{cases}$$
The objective is to maximize the size of the stable set $S$, that is, to maximize $\sum_{i} x_i$, which is equivalent to minimizing:
$$f(x) = -\sum_{i=1}^{n} x_i$$
Therefore, the MSSP with $n$ binary variables is a 0-1 quadratic program (QP) with a linear objective function subject to a quadratic constraint. It can be written in the following algebraic form:
$$(QP)\quad \min\; f(x) = -\sum_{i=1}^{n} x_i \quad \text{subject to} \quad \sum_{i=1}^{n} \sum_{j=1}^{n} C_{ij}\, x_i x_j = 0, \quad x \in \{0,1\}^{n} \tag{10}$$
In the next section, we show how to express the MSSP in the form of an energy function associated with the CHN.
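To make the 0-1 formulation concrete, the sketch below enumerates all binary vectors for a small graph and keeps the largest one that satisfies the quadratic constraint (no two selected vehicles share an edge). It is a brute-force check for tiny instances only, not the CHN solver used in the paper; the function names and the five-node path graph in the usage note are hypothetical.

```python
from itertools import product


def adjacency(n, edges):
    """Symmetric 0/1 matrix C with C[i][j] = 1 when vehicles i and j are linked."""
    c = [[0] * n for _ in range(n)]
    for i, j in edges:
        c[i][j] = c[j][i] = 1
    return c


def constraint(x, c):
    """Quadratic constraint sum_i sum_j C_ij x_i x_j; it is 0 for a stable set."""
    n = len(x)
    return sum(c[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def solve_mssp_bruteforce(n, edges):
    """Enumerate x in {0,1}^n and maximise sum(x) subject to constraint(x) == 0."""
    c = adjacency(n, edges)
    best = (0,) * n
    for x in product((0, 1), repeat=n):
        if constraint(x, c) == 0 and sum(x) > sum(best):
            best = x
    # Indices of the selected vehicles, i.e. the candidate initial cluster heads.
    return [i for i, xi in enumerate(best) if xi]
```

On a path of five vehicles, `solve_mssp_bruteforce(5, [(0, 1), (1, 2), (2, 3), (3, 4)])` returns `[0, 2, 4]`, so K = 3 clusters would be created. The enumeration grows as 2^n, which is why a heuristic solver such as CHN is needed for realistic network sizes.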
Step 2: Using the Continuous Hopfield Network to solve the MSSP.
The Hopfield neural network was introduced by Hopfield and Tank in the early 1980s [20]. This method, which belongs to the family of associative-memory neural networks, has shown a great capacity to solve optimization problems in several studies. As a result, a number of researchers have applied it to a variety of optimization problems [21], [43].
The CHN includes interconnected neurons with a hyperbolic tangent activation function. Each neuron represents a processing unit, described by a mathematical language, and inspired by the information processing mechanism of biological neurons [45].
As shown in Figure 3, CHN has some important properties:
- The neural network has symmetric weights with no self-connections: w ij = w ji and w ii = 0.
- CHN introduces neurons with an inverting output and a non-inverting output.
- Each neuron has an output that serves as an input to the other neurons but not to itself.
- If the neuron output is the same as the input, the connection is considered excitatory; otherwise, it is considered inhibitory [37].
FIGURE 3. Hopfield neural network architecture.
VOLUME 9, 2021
Let $W_{ij}$ be the connection weight from neuron $i$ to neuron $j$, and let $i^b_i$ be the offset bias of neuron $i$. The dynamics of the CHN are governed by a differential equation in which $x$, $u$, and $i^b$ are the vectors of neuron states, outputs, and biases. The transfer function $x_i = g(u_i)$ is a hyperbolic tangent, bounded below by 0 and above by 1:
$$g(u_i) = \frac{1}{2}\left(1 + \tanh\left(\frac{u_i}{u_0}\right)\right)$$
where $u_0$ is a parameter used to control the gain of the activation function. To solve a combinatorial problem, the problem must be reformulated in terms of the energy (Lyapunov) function associated with the CHN [38]. Thus, we define the energy function in the following form:
$$E(x) = -\frac{1}{2}\, x^{T} W x - (i^b)^{T} x$$
We use CHN to solve the maximum stable set problem. This problem, expressed as an energy function, is formulated as a 0-1 quadratic program whose goal is to minimize the linear function subject to quadratic constraints (Formula 10).
As indicated in [21], the energy function comprises the objective function f (x) of the QP problem and penalizes violated constraints of the QP problem with a quadratic term and a linear term. The generalized energy function associated with CHN combines these terms, with α > 0, φ ∈ R, and γ ∈ R, and can be expressed in a simplified form [38]. The setting parameters α, φ, γ are determined by the Hyperplane method described in [39].
The parameters satisfy α > 0, φ > 0, γ ≥ 0, together with a condition on −α + φ − γ. We choose the starting points for assigning vehicles to the clusters so that the size of the stable set is maximized, which amounts to solving the maximum stable set problem. The starting points x i, where i ∈ {1, . . . , n} and n is the number of vehicles, are randomly initialized using a random variable z drawn from the interval [−0.5, 0.5].
We fix α and γ in order to calculate the parameter φ, which gives the values of the setting parameters. We then solve the quadratic programming (QP) problem using CHN. The thresholds and weights of CHN are calculated from the setting parameters φ, α, γ, and an equilibrium point of the CHN is reached using the algorithm proposed in [40]. This yields a stable set of vehicles, as shown in Figure 4, which are taken as the initial cluster heads; their number gives the initial number of clusters. This set is a basic input parameter for the modified K-Means clustering method. In Figure 4, the vehicles shown in yellow form the stable set of nodes (initial cluster heads), while the nodes shown in red are regarded as cluster members.
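The following Python sketch illustrates how a continuous Hopfield network can settle on a stable set. It is a simplified illustration under stated assumptions rather than the parameterization used in the paper: the energy is taken as a reward term with weight `alpha` for each selected vehicle plus a penalty term with weight `phi` for each selected adjacent pair (so the weights are `-phi` on edges and the bias is `alpha`), the γ term is omitted, the values `alpha=1`, `phi=2`, `u0=0.1` are arbitrary rather than derived by the hyperplane method, and the dynamics are integrated with a plain Euler scheme instead of the algorithm of [40].

```python
import math
import random


def chn_stable_set(n, edges, alpha=1.0, phi=2.0, u0=0.1, dt=0.01, steps=3000, seed=0):
    """Euler-integrate a small continuous Hopfield network for the stable set problem."""
    neighbours = [[] for _ in range(n)]
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)
    rng = random.Random(seed)
    # Small random start around u = 0, i.e. neuron outputs close to 0.5.
    u = [1e-3 * rng.uniform(-0.5, 0.5) for _ in range(n)]
    for _ in range(steps):
        # Transfer function: hyperbolic tangent rescaled to (0, 1) with gain u0.
        x = [0.5 * (1.0 + math.tanh(ui / u0)) for ui in u]
        # Assumed dynamics: du/dt = -u + W x + bias, with W = -phi on edges.
        du = [-u[i] + alpha - phi * sum(x[j] for j in neighbours[i]) for i in range(n)]
        u = [ui + dt * d for ui, d in zip(u, du)]
    x = [0.5 * (1.0 + math.tanh(ui / u0)) for ui in u]
    # Neurons that settled above 0.5 are read as selected vehicles.
    return [i for i, xi in enumerate(x) if xi > 0.5]


def is_stable(nodes, edges):
    """True when no two selected nodes are joined by an edge."""
    chosen = set(nodes)
    return not any(i in chosen and j in chosen for i, j in edges)
```

Calling `chn_stable_set(5, [(0, 1), (1, 2), (2, 3), (3, 4)])` on a hypothetical five-vehicle chain yields a set that `is_stable` accepts. Choosing `phi` larger than `alpha` is what makes selecting two adjacent vehicles unfavourable in this simplified energy.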
Example of Selecting Initial CHs Using MSSP and CHN: In this example, we take a set of nodes V to build a graph in which each node represents a vehicle. We suppose that this set contains eight nodes, V = (V 1 , V 2 , V 3 , V 4 , V 5 , V 6 , V 7 , V 8 ). We then calculate the similarity between the different nodes using Equation 3. Let M be the matrix obtained after calculating the similarity between all the nodes. It is a symmetric 8×8 matrix whose rows and columns correspond to the vehicles.
In the next step, we determine the connections between the nodes by comparing the entries of M with the similarity threshold ε = 0.991833.
From this step, we obtain an undirected graph G = (V , E) with V = {V 1 , V 2 , V 3 , . . . , V 8 }, where E is the set of edges shown in Figure 5. We associate this graph with the maximum stable set problem and model it in quadratic programming (QP) form. To solve the QP problem, we use the Continuous Hopfield Network. The stable set obtained, shown in Figure 6, is (V 1 , V 3 , V 5 ). These nodes are taken as the initial CHs.

B. CLUSTERING PHASE 1) CLUSTER FORMATION
Our algorithm uses the link reliability model as the objective function for assigning each vehicle to a cluster. Link reliability is a metric that predicts the future state of a link between two vehicles [41]. It is calculated using several metrics, such as traffic density, relative speed, and distance between nodes. The link reliability model is a probabilistic function: it first determines the probability density function of the lifetime of a link for a given mobility model, and then estimates the link reliability either by a discrete sum or by adaptive integration. The reliability of the links in vehicular networks therefore depends mainly on the mobility model of the nodes within the network.
A mobility model is a system that describes the movement of a set of nodes where each node moves independently from other nodes in the network. Each model shows the topology of the network and it only represents a specific scenario of a network. There are several models of mobility mentioned in the literature, such as Manhattan, Gaussian Markov Model, Random Waypoint Model.
Thus, we have used the traditional theory of traffic flow to estimate the reliability of the links. This theory provides a global view of the topology of the vehicles in the network. It is calculated using road density, average speed and traffic flow. The traditional theory of traffic flows is introduced by a statistical probability which gives a distribution of the probability of each metric.
To calculate the link reliability model, we first express the probability function for the link lifetime t ll based on classical vehicular traffic theory [42], using the traffic density λ and the relative speed v, where D r [m] is the vehicle transmission area and v [km/h] is the relative speed, assumed to follow a normal distribution. The relative speed [km/h] captures the mobility situation of two vehicles. Unlike in MANET networks, two neighboring vehicles can have a low relative speed even if they are both traveling at high speed.
Traffic density [veh/km] indicates the number of vehicles on a section of the road and affects the speed of the vehicles on that road: the speed decreases if the traffic density exceeds a critical value; otherwise, it increases. This metric therefore influences both vehicle speed and road capacity.
Let V j be a vehicle; 1 < j < N , with velocity v j and position (x j , y j ). C i is a centroid; 1 < i < K , with velocity v i and position (x i , y i ). As mentioned in [42], the link reliability model is calculated using (18): where λ [veh / km] is the traffic density, λ c is the critical value of traffic density, t 0 is the connection start time, p(t) is probability function for the link lifetime t 0 , δ is the correction factor that improves the influence of traffic density, T ij is the probability that the connection between the vehicle V j and the centroid C i remains available, and it can be calculated using (19): where L ij is the distance between the vehicle V j and the centroid C i , v ij is the relative moving speed of V j and C i . The assignment of a vehicle to a cluster depends on the value of its link reliability model with the corresponding cluster head. This value provides an estimated indication of how long this vehicle can remain in the cluster, taking into account acceleration and change of position. Thus, by this method, the probability that a vehicle joins a cluster does not just depend on the distance between this vehicle and the cluster head, but also on the difference in speed and traffic density. Therefore, the new objective function F for the modified K-Means algorithm based on the link reliability model is expressed by (20): Consequently, a vehicle can join the cluster with the highest link reliability value between this vehicle and the cluster head. In Figure 7, the vehicles that form each cluster are indicated by a red color, whereas each cluster head is indicated by yellow color.
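The assignment rule described above (each vehicle joins the cluster head with which it has the highest link reliability) can be sketched as follows. The scoring function `toy_reliability` is only a placeholder that rewards short distance and low relative speed; it is not the link reliability model of Equation (18), and the `Vehicle` fields, the 250 m radio range, and the speed scaling constant are assumptions made for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Vehicle:
    name: str
    x: float      # position along the road [m]
    y: float      # lateral position [m]
    speed: float  # signed speed [km/h]


def toy_reliability(v, ch, radio_range=250.0):
    """Placeholder score in [0, 1]; NOT the paper's link reliability formula."""
    distance = math.hypot(v.x - ch.x, v.y - ch.y)
    if distance > radio_range:
        return 0.0  # out of range: no usable link
    relative_speed = abs(v.speed - ch.speed)
    return (1.0 - distance / radio_range) / (1.0 + relative_speed / 10.0)


def assign_to_clusters(vehicles, heads, reliability=toy_reliability):
    """Each non-CH vehicle joins the cluster head with the highest score."""
    clusters = {ch.name: [] for ch in heads}
    for v in vehicles:
        if v in heads:
            continue
        best = max(heads, key=lambda ch: reliability(v, ch))
        if reliability(v, best) > 0.0:  # vehicles with no reachable CH stay unassigned
            clusters[best.name].append(v.name)
    return clusters
```

With heads `A` at x = 0 m (100 km/h) and `B` at x = 200 m (60 km/h), a vehicle at x = 120 m travelling at 100 km/h joins `A` even though `B` is closer, because the speed difference with `B` lowers the score. This is the behaviour that distinguishes a reliability-based assignment from the purely distance-based assignment of standard K-Means.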

2) CLUSTER MAINTENANCE
Several characteristics of VANET, such as mobility and frequent disconnection of vehicles, led us to propose a method for maintaining the clusters during the routing protocol process. The maintenance phase contains two steps: updating the CH and re-establishing the clusters.
CH updating: In this step, each member node calculates and sends its weight value to the associated cluster head. Then the node with a weight value greater than the CH value will be selected as a new cluster head. Indeed, the cluster head will be replaced from time to time according to the network topology. The calculation of the weight value is based on three metrics: the buffer size of the node, the speed, and the node degree [33]. Therefore, to calculate this value, we have used the following formula: To calculate the value of N max , we have used the following formula: R is the transmission range of the vehicle; l is the length of the vehicle; n l is the number of lanes.
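The CH update step can be illustrated with the short sketch below. Because the weight formula itself is not reproduced here, the `weight` function is a hypothetical stand-in: an equally weighted sum of the free-buffer ratio, the closeness of the node speed to the cluster mean speed, and the node degree normalized by N max. The coefficients, the normalizations, and the value of `n_max` passed in are all assumptions for illustration.

```python
def weight(free_buffer, speed, degree, mean_speed, n_max, coeffs=(1 / 3, 1 / 3, 1 / 3)):
    """Hypothetical weight in [0, 1]; not the formula used in the paper."""
    a, b, c = coeffs
    # Nodes moving close to the mean cluster speed (assumed > 0) score higher.
    speed_term = 1.0 - min(abs(speed - mean_speed) / mean_speed, 1.0)
    degree_term = min(degree / n_max, 1.0)
    return a * free_buffer + b * speed_term + c * degree_term


def update_cluster_head(nodes, head, n_max):
    """nodes maps name -> (free_buffer_ratio, speed_kmh, degree); returns the new CH name."""
    mean_speed = sum(speed for _, speed, _ in nodes.values()) / len(nodes)
    scores = {
        name: weight(fb, speed, degree, mean_speed, n_max)
        for name, (fb, speed, degree) in nodes.items()
    }
    best = max(scores, key=scores.get)
    # The current CH is replaced only by a node with a strictly larger weight.
    return best if scores[best] > scores[head] else head
```

For example, with `nodes = {"ch": (0.2, 120, 2), "m1": (0.9, 100, 5), "m2": (0.5, 80, 3)}` and `n_max = 10`, the member `m1` has the largest weight and replaces `ch`. Keeping the current head on ties avoids unnecessary CH changes, which is a design choice of this sketch.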

Re-establish clusters:
The VANET environment is characterized by high vehicle mobility, so the clusters that have been formed are likely to be disturbed. The clustering phase must therefore be triggered automatically several times in order to keep the clusters up to date.
The pseudo-code of the Modified K-Means is described in algorithm 2, the pseudo-code of the clustering method is described in algorithm 3, and the algorithm of the KMRP is described in algorithm 4. The flow chart in Figure 8 illustrates the process of our proposed KMRP scheme.

Algorithm 2 - Pseudo Code of Modified K-Means
Input: Number of centroids k; Set of nodes V ; Set of CHs C k .
Output: Set of clusters with their CHs.
Begin
1: Repeat
2: For each node in V do
3: Calculate the link reliability between the node and the CH of each cluster using equation (20)

V. SIMULATION AND RESULTS
In this section, we describe the simulation environment and the parameters used, and we report the simulation results obtained. In the simulation, we evaluate the performance of our proposed scheme against the results obtained by ICA-RBF [22] and RMRPTS [25]. These recent solutions are designed for the same objective.

A. SIMULATION SETUP
In order to evaluate its efficiency, we have simulated our proposed routing protocol KMRP using the network simulator NS2. The performance of KMRP is compared to distributed and centralized routing protocols designed for the same purpose in terms of Throughput, average End-to-End delay (E2ED), and Packet Delivery Ratio (PDR). We have carried out the simulation according to two scenarios: varying the number of vehicles from 100 to 300, and varying the speed of vehicles from 60 km/h to 120 km/h. For both scenarios, we have considered a 5 km road with 2 lanes, as shown in Figure 9. The vehicles move in two directions. The maximum speed of each vehicle has been set at 120 km/h; the actual speed of a vehicle in the highway scenario can take any value between 60 and 120 km/h in the simulation. We have run the simulation 10 times and taken the average of the results obtained. All the parameters of the simulation are described in Table 2.
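To make the three metrics concrete, the sketch below computes them for a single run from per-packet records and then averages several runs, as done over the 10 runs described above. It uses the common textbook definitions of PDR, average end-to-end delay, and throughput. The record layout and the names `run_metrics` and `average_runs` are assumptions for illustration, not the NS2 trace format or the scripts used in this work.

```python
from statistics import mean
from typing import Dict, List, Tuple


def run_metrics(sent: Dict[int, float],
                received: Dict[int, Tuple[float, int]],
                duration_s: float) -> Dict[str, float]:
    """Metrics for one run.

    sent:     packet id -> send time in seconds
    received: packet id -> (receive time in seconds, size in bytes)
    """
    delivered = [pid for pid in received if pid in sent]
    # PDR: share of sent packets that reached their destination, in percent.
    pdr = 100.0 * len(delivered) / len(sent) if sent else 0.0
    # E2ED: mean of (receive time - send time) over delivered packets.
    e2ed = mean(received[p][0] - sent[p] for p in delivered) if delivered else 0.0
    # Throughput: delivered bits per second, expressed in kbps.
    bits = sum(received[p][1] * 8 for p in delivered)
    throughput_kbps = bits / duration_s / 1000.0
    return {"pdr": pdr, "e2ed_s": e2ed, "throughput_kbps": throughput_kbps}


def average_runs(runs: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each metric over independent runs (e.g. 10 repetitions)."""
    return {key: mean(run[key] for run in runs) for key in runs[0]}
```

Averaging the per-run metrics, rather than pooling all packets from all runs, gives each repetition equal weight regardless of how many packets it generated.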
B. SIMULATION RESULTS
1) THROUGHPUT
Figure 10 illustrates the effect of density on Throughput for KMRP, ICA-RBF [22], and RMRPTS [25]. The Throughput reaches 830 kbps for KMRP and decreases slightly when the number of vehicles reaches 300. In general, KMRP offers slightly higher Throughput than ICA-RBF and RMRPTS. This suggests that ICA-RBF and RMRPTS leave insufficient bandwidth to carry a large number of control packets. KMRP, on the other hand, forms an appropriate number of clusters, where each cluster head takes responsibility for exchanging data between the different cluster members, which reduces network congestion. Reduced congestion in turn reduces collisions, which consequently increases throughput. In addition, in KMRP the cluster head selects the appropriate nodes to send data, namely those with a higher probability of successful transmission. Therefore, the proposed scheme improves the throughput compared to the other schemes.
Figure 11 shows the impact of velocity on Throughput for KMRP, ICA-RBF, and RMRPTS. As vehicle speed increases, the Throughput decreases for all the schemes. This is due to a more rapidly changing VANET topology and a higher link failure probability as the speed increases to 120 km/h. Our approach nevertheless provides higher throughput than the other schemes, since the probability of link failure is reduced by using the link reliability model in the clustering process, which predicts the future state of a link between two vehicles. By contrast, the other schemes are more quickly affected by the high mobility of vehicles.
2) AVERAGE END-TO-END DELAY
Figure 12 compares the impact of density on the average End-to-End delay for the proposed scheme, ICA-RBF [22], and RMRPTS [25], while Figure 13 shows the average End-to-End delay for KMRP, ICA-RBF, and RMRPTS when the speed of vehicles varies. As shown in these figures, the average end-to-end delay increases for ICA-RBF and RMRPTS when the number of vehicles is large and when the speed of vehicles increases. The increase in transmission delay is caused by higher use of bandwidth. In addition, these schemes require more computation and memory for clustering, which increases the time needed to partition the network into clusters. KMRP overcomes these difficulties by using faster algorithms that require less computing time and memory. KMRP also experiences fewer link failures, because it selects a more stable and reliable route from source to destination, which further reduces the average end-to-end delay. Consequently, KMRP provides a lower end-to-end delay for all density and velocity values.
3) PACKET DELIVERY RATIO
Figure 14 shows the impact of density on the Packet Delivery Ratio for KMRP, ICA-RBF [22], and RMRPTS [25]. As shown in this figure, KMRP presents a higher PDR than the other schemes. In fact, the PDR of ICA-RBF and RMRPTS increases from 91% and 85% to 96% and 94%, respectively, whereas the PDR of KMRP remains above 94% and reaches 96% when the number of vehicles is 300. This is because KMRP maintains its performance regardless of the state of the network, including at high density. The efficiency of the proposed protocol depends mainly on its ability to adjust the size of the clusters and to choose an appropriate number of CHs, using the CHN algorithm and the MSSP method. KMRP thereby guarantees the stability of the clusters and avoids redundant and repetitive data transmission. In this sense, KMRP manages packet transmission in a fairly dense network better than the other schemes.
Figure 15 illustrates the impact of velocity on PDR for KMRP, ICA-RBF [22], and RMRPTS [25]. In this figure, the PDR decreases as the speed of vehicles increases, because the VANET topology changes more rapidly at higher speeds. However, the PDR of our scheme remains higher than that of the other approaches. In fact, KMRP selects the most suitable nodes as cluster heads using MSSP, CHN, and K-Means during the clustering process. In addition, KMRP periodically performs the cluster maintenance phase when node mobility is high, which makes the clustering process suitable for the VANET environment.

VI. CONCLUSION
In this paper, we have proposed a new clustering-based routing protocol using a modified K-Means algorithm combined with the Maximum Stable Set Problem and the Continuous Hopfield Network in order to improve data transmission in VANET in a high density and high mobility environment. In our approach, the Maximum Stable Set Problem, solved by the Continuous Hopfield Network, is used to select the appropriate CHs. The clustering process is then based on the link reliability model, and the maintenance of the clusters uses parameters such as free buffer space, vehicle speed, and node degree to select a new cluster head. In each cluster, a vehicle is considered as a cluster head if it has maximum free buffer space, an appropriate velocity, and a maximum node degree. To assess the effectiveness of the proposed approach, a simulation was performed in a highway vehicular environment, and a comparison was made with ICA-RBF and RMRPTS. The results obtained from this simulation showed that KMRP reduces traffic congestion and collisions, which results in a significant increase in Throughput. Additionally, KMRP provides a fast algorithm that does not require much computation or memory, which reduces the end-to-end delay. Finally, KMRP provides a better PDR than the other schemes at high density and high mobility. Our proposed approach guarantees the stability of the clusters and thus avoids redundant and repetitive data transmission.
KHALID KANDALI was born in Casablanca, Morocco, on November 26, 1985. He received the master's degree in network and computer science from the Faculty of Sciences and Techniques, University of Hassan 1st, Settat, Morocco. He is currently pursuing the Ph.D. degree in computer science with the Graduate School of Technology, Moulay Ismail University, Meknes, Morocco. He also works on vehicular ad hoc networks, wireless sensor networks, and machine learning.
LAMYAE BENNIS received the Ph.D. degree in computer sciences from Moulay Ismail University, Meknes, Morocco, in 2020. She is currently a member of the Mathematics and Computer Sciences Laboratory, Faculty of Sciences of Meknes. Her research interests include educational technologies, intelligent systems, learning analytics, and serious games. Her current project is the design and development of an authoring tool for generating adaptive ubiquitous learning games.
HAMID BENNIS was born in Meknes, Morocco, in September 1977. He received the Ph.D. degree in computer science and telecommunications from Mohammed V University at Agdal, Rabat, Morocco, in 2011. He is currently a Professor of computer science and telecommunications with the Department of Computer Science, Graduate School of Technology, Moulay Ismail University, Meknes. He is also the Head of the research team Communication Systems, Artificial Intelligence, and Mathematics. He is also a contributing author of a number of refereed journals, book chapters, and proceeding articles in the areas of wireless communications. His current research interests include mobile ad hoc networks, wireless sensor networks, wireless power transmission (WPT), machine learning, and microwave hybrid circuits. VOLUME 9, 2021