Tuning the Evaporation Parameter in ACO MANET Routing Using a Satisfaction-Form Game-Theoretic Approach

A Mobile Ad Hoc Network (MANET) is a communication network that links communicating devices (nodes) without any permanent infrastructure. MANETs have no dedicated routing devices; instead, the routing task is assigned to a routing algorithm installed on every communicating node. In this work, the communicating nodes use one of the most widely used families of routing algorithms: Ant Colony Optimization (ACO). ACO algorithms aim to balance exploring new routes for the communication packets against exploiting the best-known routes discovered during the communication session. This tradeoff is traditionally tuned manually, by trying many values for certain parameters and measuring the network performance after each simulation session. Such manual tuning depends on human intuition and does not cope with the dynamic topology of a MANET. In this research, we introduce a novel method to find an optimal balance for the exploration-exploitation tradeoff during the communication session. We formulate the weighing of the benefits of exploring new routes against those of exploiting known ones, with respect to MANET performance, as a game between two semantic players. The equilibrium of this game is reflected as an optimal value for the pheromone evaporation parameter of the ACO algorithm during the communication session. Experimental results show that this online tuning algorithm achieves higher performance than traditional offline tuning algorithms.

from their nest to the food vs. the notion of artificial ants in communication networks.

Parameter tuning in ACO routing algorithms is performed either online or offline. Offline parameter tuning is performed before the algorithm's execution. It follows a trial-and-error method and relies on human experience to find the optimum parameter values. It may be useful in stationary environments, but it is not suitable for dynamic ones, where the parameters have to cope with different instances of the problem [18]. Online parameter tuning, on the other hand, is more adaptive: the parameters' values are adjusted while the problem instance is being solved. This adaptability has a computational cost. The authors of [19] classified online parameter tuning approaches for meta-heuristic algorithms into three categories: Simple, Iterative, and High-Level. All categories rely on generating some values for the parameters and then evaluating them according to the performance metrics. The simple approach is a single step of setting the parameters' values and then evaluating them. The iterative approach is a repeated process of generating parameters' values and then evaluating the resulting performance metrics. The high-level approach is also iterative, but its generate phase produces elite parameter values selected by search methods instead of random values. Researchers aim to obtain the benefit of dynamic parameter tuning while reducing its computational cost.

AntHocNet, an ACO-based routing algorithm [20], uses the notion of pheromone to rate the suitability of possible routes for ant agents. Pheromone is deposited as ants pass over a route, so the pheromone amount on any route accumulates as more ants pass over it.
An exploratory parameter determines which route the ant agent will follow based on the pheromone level of the available routes. This exploratory parameter was studied in our previous research [21], which tuned it online. On the other hand, a pheromone evaporation process decreases the amount of pheromone on each route. The aim of the evaporation process is to avoid keeping high pheromone values for abandoned, low-quality paths. The evaporation rate is controlled by an evaporation parameter that is tuned offline in the AntHocNet algorithm.
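The iterative online tuning category described earlier in this section (generate parameter values, then evaluate them) can be sketched roughly as follows. This is only an illustration: the function names, the uniform random generator, the rule of keeping the best value seen so far, and the toy metric are all assumptions and do not come from [19]. With `rounds=1` the loop reduces to the simple single-step approach.

```python
import random


def iterative_tuning(evaluate, low, high, rounds, rng):
    """Generate-and-evaluate loop; keeps the best value seen (an assumption)."""
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    best_value = rng.uniform(low, high)
    best_score = evaluate(best_value)
    for _ in range(rounds - 1):
        candidate = rng.uniform(low, high)  # generate phase: a random value
        candidate_score = evaluate(candidate)  # evaluate phase: performance metric
        if candidate_score > best_score:
            best_value, best_score = candidate, candidate_score
    return best_value, best_score


def toy_metric(x):
    """Hypothetical performance metric that peaks at x = 0.3."""
    return -(x - 0.3) ** 2


rng = random.Random(0)  # seeded so the run is repeatable
value, score = iterative_tuning(toy_metric, 0.0, 1.0, 200, rng)
print(f"best value found: {value:.3f}")
```

A high-level approach would replace the `rng.uniform` call with a search method that proposes elite candidates, at the price of extra computation per round.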

This paper is an extension of our previous conference paper [21]. In the previous paper, we introduced an online parameter tuning method for the exploration parameter of the MANET routing algorithm using game theory. In this paper, we tune another parameter, the pheromone evaporation rate parameter, using a different game-theoretic approach: the satisfaction game.

VOLUME 10, 2022

The game notion creates a balance between the two competing players based on the QoS parameters measured from the network environment.

The rest of this paper is structured as follows: Section II presents a literature review of the research that tackles ACO usage in MANET routing and the methods used for parameter tuning. Section III introduces the details of the approach contributed in this paper. In Section IV, we validate the introduced algorithm with a set of experiments and evaluate its results against those of other algorithms. Section V discusses the obtained results and highlights the possible extensions of this research.

A well-known reactive routing protocol is the Ad hoc On-demand Distance Vector (AODV) [22]. The source node searches its routing table for a route to the destination node. If no direct route is found, a chain of broadcast processes is performed to expand the search until a route to the destination node is found. Although AODV ensures finding the destination node, it has a high routing overhead. Naserian [23] used a game-theoretic approach to enhance the AODV protocol. The aim is to reduce the flooding behavior in the route discovery process. Each intermediate node is considered a player. When it receives an RREQ packet to propagate to other nodes, it decides (as its game strategy) whether to propagate the packet or drop it. The decision is based on a network gain factor vs. the cost of forwarding the packet.

The Destination Sequenced Distance Vector (DSDV) is a typical proactive routing protocol in MANETs that is based on the Bellman-Ford algorithm [12]. DSDV keeps at each node a routing table that contains up-to-date routing information for all nodes in the network. This is achieved by forcing each node to frequently send two types of packets to its neighboring nodes: full dump packets and incremental packets. Full dump packets carry all the information in the routing table. Incremental packets carry only the information updated since the last full dump packet was sent. The aim of this process is to keep all nodes aware of network changes. Although this technique is useful for keeping an up-to-date routing table in all nodes, it has a performance drawback in large-scale networks.
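The difference between the two DSDV packet types can be illustrated with a small Python sketch. The `Route` fields, the function names, and the sample table are hypothetical and chosen only for illustration; the sketch ignores route removal and the actual packet encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    next_hop: str
    hops: int
    seq_no: int  # destination sequence number


def full_dump(table):
    """Full dump packet: a copy of every entry in the routing table."""
    return dict(table)


def incremental(table, last_dump):
    """Incremental packet: only entries that differ from the last full dump."""
    return {
        dest: route
        for dest, route in table.items()
        if last_dump.get(dest) != route
    }


# Hypothetical routing table of one node, keyed by destination.
table = {"B": Route("B", 1, 100), "C": Route("B", 2, 98)}
last_dump = full_dump(table)

table["C"] = Route("D", 2, 102)  # the route to C changes
table["E"] = Route("D", 3, 50)   # a new destination is learned

update = incremental(table, last_dump)
print(sorted(update))  # destinations carried by the incremental packet
```

The incremental packet stays small while few routes change, but every full dump grows with the number of nodes, which is consistent with the scaling drawback noted above.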

The AntHocNet routing algorithm is one of the ACO implementations in MANET routing [24]. It is a hybrid routing algorithm that contains two phases: (1) the reactive path setup phase and (2) the proactive path maintenance phase. In the reactive path setup phase, the ant agents of the ACO are used to find a path to the required destination, and the pheromone information is kept in a pheromone table in each node. The aim of the proactive path maintenance phase is to sample paths while no destination is required in order to update the local pheromone [...]

[...] indicates the goodness of the path through neighbor $n$ to reach the destination $d$ beginning from the current node $i$. To choose a certain neighbor $n$ as the next hop for the succeeding ants to reach a certain destination $d$, the node $i$ takes a probabilistic decision according to the following formula:

$$P_{nd} = \frac{\left(T^{i}_{nd}\right)^{\beta_1}}{\sum_{j \in N^{i}_{d}} \left(T^{i}_{jd}\right)^{\beta_1}} \tag{1}$$

$P_{nd}$ is the probability of forcing a FANT to reach the destination node $d$ through the neighbor node $n$, $\beta_1$ is the exploration parameter, and $N^{i}_{d}$ is the set of all neighbors of the current node $i$ that carry route information to destination $d$. The pheromone amounts $T^{i}_{nd}$ are built up cumulatively as more and more BANTs arrive carrying the QoS measurements they encountered. The QoS metric used in this work to calculate $\tau^{i}_{d}$, as in [20], is the number of hops that the BANT passes over on its journey back to node $i$. The BANT uses the inverse of this hop count ($\tau^{i}_{d}$) to update the corresponding pheromone value $T^{i}_{nd}$ in the pheromone table $T^{i}$ as follows:

$$T^{i}_{nd} = \gamma\, T^{i}_{nd} + (1 - \gamma)\, \tau^{i}_{d}, \qquad \gamma \in [0, 1] \tag{2}$$

The parameter $\gamma$ controls how the pheromone [...]

Assume we have a normal form game that consists of a set of players ($I$) containing $m$ players, a strategy profile $S$, and a set of utility functions $U$. Any player selects a single strategy $s_k$ from a set of available strategies $S_k$ such that $s_k \in S_k$. The strategy profile of the game is a vector $s = \{s_1, s_2, \ldots, s_m\}$, which represents the set of strategies chosen by all the $m$ players such that $s_k$ is chosen by player $k$. We denote the set of strategies chosen by all players except a specific player $k$ by the symbol $s_{-k}$. So, the strategy profile chosen by all players of the game can be expressed as $s = \{s_{-k}, s_k\}$. The utility function $u_k(s)$ is the gain of any player $k$ when the set of players $I$ chooses the strategy profile $s$.

A Nash equilibrium is said to be achieved if a strategy profile $s$ is agreed upon among the players of the game such that no single player can gain a benefit by changing its strategy $s_k$ unilaterally [26]. A specific strategy profile $s = \{s^{*}_{1}, s^{*}_{2}, s^{*}_{3}, \ldots, s^{*}_{m}\}$ is said to be the Nash equilibrium [...]

[...] maximize the same parameter. The Exploration player aims to let more ants explore the network rather than use the best-known routes marked by high pheromone values in the pheromone table $T^{i}$. To achieve this target, it tries to reduce the parameter's value so that updates to the pheromone table rely heavily on the new incoming information ($\tau^{i}_{d}$) carried by the BANTs. In this case, we consider the pheromone evaporation to be high. On the other hand, the Exploitation player aims to maximize the parameter to get the most benefit from the existing value in the routing table and to rely less on the new incoming value carried by the BANTs. In Eq. (2), the value of $\gamma$ is between 0 and 1. In our implementation, we give the Exploration player the freedom to set a value $\gamma_1$ for $\gamma$ between 0 and $\gamma_{Limit}$, and the Exploitation player the freedom to set a value $\gamma_2$ for $\gamma$ between $\gamma_{Limit}$ and 1. That is:

$$0 \le \gamma_1 \le \gamma_{Limit}, \qquad \gamma_{Limit} \le \gamma_2 \le 1$$

$\gamma_{Limit}$ is an arbitrary value between 0 and 1 that is the maximum value the Exploration player can assign to $\gamma$ and [...]
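The per-node computations discussed in this section can be sketched in Python as follows. The sketch assumes the AntHocNet forms of Eq. (1), $P_{nd} = (T^{i}_{nd})^{\beta_1} / \sum_{j} (T^{i}_{jd})^{\beta_1}$, and Eq. (2), $T^{i}_{nd} = \gamma T^{i}_{nd} + (1-\gamma)\tau^{i}_{d}$. The function names, the dictionary layout of the pheromone table, the sample values, and the inclusive treatment of the $\gamma_{Limit}$ boundary are illustrative assumptions rather than the authors' implementation.

```python
def next_hop_probabilities(pheromone, beta1):
    """Eq. (1): probability of each neighbor n, proportional to T_nd ** beta1.

    `pheromone` maps each neighbor that has a route to d onto its T_nd value.
    """
    weights = {n: t ** beta1 for n, t in pheromone.items()}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


def update_pheromone(old_value, tau, gamma):
    """Eq. (2): blend the stored pheromone with the value carried by a BANT."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    return gamma * old_value + (1.0 - gamma) * tau


def clamp_gamma(gamma, player, gamma_limit):
    """Restrict gamma to the range assigned to the given player."""
    if player == "exploration":
        return min(max(gamma, 0.0), gamma_limit)
    if player == "exploitation":
        return min(max(gamma, gamma_limit), 1.0)
    raise ValueError(f"unknown player: {player}")


# Hypothetical pheromone entries toward one destination d.
probs = next_hop_probabilities({"a": 0.5, "b": 0.25}, beta1=2)

# A BANT that travelled 4 hops carries tau = 1 / 4.
tau = 1.0 / 4
low_gamma = update_pheromone(0.5, tau, gamma=0.2)   # fast evaporation
high_gamma = update_pheromone(0.5, tau, gamma=0.9)  # slow evaporation

print(probs, low_gamma, high_gamma)
```

A small $\gamma$ pulls the stored value quickly toward the newly reported $\tau^{i}_{d}$, which is the behavior the Exploration player favors, while a large $\gamma$ preserves the existing table entry, as the Exploitation player prefers.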

The utility function of player 1 (the Exploration player) is the average SNR measurement of the next three generations of BANTs after setting the new value of the $\gamma$ parameter in Table 2. Similarly, the EED measurement is the utility function of player 2 (the Exploitation player). So, SNR($\gamma$) and EED($\gamma$) are the utility functions of the two players, respectively. Each is a QoS function measured from the MANET environment and, from the point of view of each player, depends on a single variable ($\gamma$). In other words, after setting the $\gamma$ parameter from [...]

[...] Table 3. The proposed algorithm is compared with the AntHocNet and AODV algorithms regarding the effect of the network size on the average EED, the Packet Loss Ratio, and the Throughput metrics. AODV is a reactive routing algorithm that is discussed in Section II. In the first three experiments, we compare the proposed algorithm against AntHocNet and AODV in terms of EED, Packet Loss Ratio, and Throughput in [...]

The End-to-End delay metric is defined as the average time taken for packets to travel from a source node to a destination node in a network. It gathers all types of delay for a certain packet from its source until it reaches its destination. If we are tracing the packet delay from a sender node $i$ to a destination node $j$, we denote the delay from $i$ to $j$ as $D_{i,j}$ as follows [32]: [...]

[...] route selection that is reflected upon delay time. The AODV algorithm has a greater delay compared with the proposed algorithm and AntHocNet.

The packet loss ratio metric measures the ratio of the number of lost packets to the total number of sent packets. As shown in Fig. 5, the proposed algorithm is roughly equivalent to AntHocNet for fewer than 60 nodes, and then it outperforms AntHocNet in terms of the Packet Loss Ratio for larger networks up to 100 nodes.
The reason is that our algorithm gives the nodes the flexibility to search more routes or retain the best-known routes according to the network conditions. This route selection flexibility reduces the packet loss. The two algorithms are nearly equivalent for small numbers of nodes, as there are not many alternative routes to explore in case of bad performance. In larger networks (>50 nodes), the proposed algorithm's notion of giving the Exploration player more freedom is reflected in the performance metrics. Both the proposed algorithm and AntHocNet outperform the AODV algorithm in terms of packet loss ratio.

The throughput metric is the rate at which information is sent successfully through the network [35]. Fig. 6 shows the throughput of the proposed algorithm against AntHocNet and AODV. Both the proposed algorithm and AntHocNet [...]

[...] player aims to lower it in order to increase pheromone evaporation, while the Exploitation player aims to increase the parameter and decrease the evaporation rate. The proposed algorithm introduces an equilibrium between the two players to tune the parameter. Experimental results show that the proposed algorithm is competitive with AntHocNet in relatively small networks and outperforms it in large networks. Future work is intended to conduct more experiments and to associate more QoS metrics with the game players.