A Centralized Strategy for Multi-Agent Exploration

This paper introduces recently developed Aquila Optimization Algorithm specifically configured for Multi-Robot space exploration. The proposed hybrid framework “Coordinated Multi-Robot Exploration Aquila Optimizer” (CME-AO) is a unique combination of both deterministic Coordinated Multi-robot Exploration (CME) and a swarm based methodology, known as Aquila Optimizer (AO). A novel parallel communication protocol is also embedded to improve multi-robot space exploration process while simultaneously minimizing both the computation complexity and time. This ensures acquisition of a optimal collision-free path in a barrier-filled environment via generating a finite map. The architecture starts by determining the cost and utility values of neighbouring cells around the robot using deterministic CME. Aquila Optimization technique is then incorporated to increase the overall solution accuracy. Numerous simulations under different environmental conditions were then performed to validate the effectiveness of the proposed algorithm. Algorithm consistency aspects in achieving the expected results (area explored rate and time) is demonstrated through statistical means. A perspective analysis is then performed by comparing the performance of the CME-AO algorithm with latest state of art contemporary algorithms namely conventional CME and CME-WO (CME Whale Optimizer). The comparison duly accommodates all pertinent aspects such as % area explored, number of failed runs, and time taken for map exploration for different environments. Results indicate that the proposed algorithm presents two distinct advantages over the other conventional state of the art CME based techniques a) enhanced map exploration in cluttered environment and b) significantly reduced computation complexity and execution time, with almost no fail runs. This makes the suggested methodology particularly suitable for on-board utilization in an obstacle-cluttered environment, where other techniques either fails (stuck locally) or takes longer exploration time.


I. INTRODUCTION
We investigate the issue of using an autonomous robot to explore two-dimensional (2D) regions that are unknown to it beforehand. Such an issue continues to be difficult The associate editor coordinating the review of this manuscript and approving it for publication was Mouloud Denai . since it has to deal with concurrently performing two tasks: 1) online representation update to keep track of the regions that have been investigated, and 2) looking through the representation for a continuous path that can be followed to direct the investigation. Whenever the environment is significant, architecturally intricate, and 3D, and the issue computations get more difficult, and assuring complete VOLUME 10, 2022 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ environment exploration might get difficult. Furthermore, the study deals with the challenge of exploring an unfamiliar area with a finite number of obstacles and bounded perimeter using a group robots. Teams of autonomous mobile robots can perform information-gathering activities including exploration, surveillance, and inspection with increased efficiency, dependability, and robustness, among other benefits [1]. These benefits are attained by utilising some kind of team coordination, which is frequently built assuming the ability to interact without boundaries [2], [3], [4], [5]. However, dealing with communication-challenged circumstances is a common requirement for operations in the real world. Robots in these environments can only communicate with colleagues nearby (locally), depending on their transmission capabilities and the environment itself (e.g., obstructions or disruptions occupancy). It may be difficult to achieve a good degree of coordination as a result.
One of the most difficult challenges in a multi-robot system is determining the best collision-free route for each unit in order to increase system efficiency while adhering to a set of limitations [6], [7]. One of the traditional methods is to take the robotic out to explore its surroundings in order to locate landmarks that can be utilized in trajectory tracking. The robot relies heavily on its sensors, map creation, and location updates in this circumstance. Moreover, the ecosystem lacks information, and mapping approaches need large storage and time and are ineffective in generating the ideal path.
The primary goal of mobile robots automation is to create a physical architecture that can offer independence to robots for persistent trajectory tracking in a congested predictable environment without the involvement of human controllers [8]. Robot routing determines the objective site by avoiding obstacles in its way from the origin place while fulfilling limitations such as distance, energy, and time [9], [10], [11]. This procedure is divided into four distinct configurations: (1) perception: the robot gathers the information needed utilizing sensors. (2) Localization: the robot identifies where it wants to go in the surroundings.
(3) Path planning: the robot determines its guiding path in order to reach the desired place while avoiding obstacles. (4) motion control: change the movement to create the path's needed direction [4], [4], [12], [13], [14], [15], [16], [17]. The topic of multi-robot path planning is concerned with computing pathways for robotic systems in such a way that each device has an optimal route, but the full path of all the merged devices is optimal. This is a more sophisticated work than single-robot path planning, where the factor of coordination among the numerous robots is not appropriate, and the single robot may calculate the path using its methods. The motion planning challenge might be controlled or distributed. A planner simultaneously plans all of the robots in centralized planning, generally considering all of the intricate interactions that they may have. Consequently, an extremely complicated design space is generated, across which the searching has to be done. On the other hand, decentralized planning assigns a separate planner to each robot. Each robot is designed individually in its configuration area, which simplifies planning. Then, attempts may be taken to avoid collisions between the different machines. Compared to decentralized planning, centralized planning requires more time but is more effective.
Therefore, the goal is to develop a technique which utilized primitive algorithm such as deterministic method and integrating it with bio-inspired algorithms.

A. MOTIVATION
According to the literature study and cited publications, hybridized algorithms are the most recent trend with a wide range of applications and utility [37]. Different algorithms are combined to create competing hybrid algorithms that produce the greatest outcomes with the least amount of work and time. The evolution of new hybrid algorithms/techniques is all geared at catering for the limits observed in earlier algorithms, and to improve the optimization parameters while exploring diverse complexity environments, as evidenced by the listed articles. As a result, there is a pressing need to develop novel hybrid tactics that combine bio-inspired methodologies with the CME framework while also meeting spatial restrictions.
In our proposed algorithm, we made sure that different workspace regions are investigated at roughly the same rate, preventing certain areas from being searched considerably later than others. Further we incorporated a novel parallel communication protocol to enhance performance by reducing the search time and increased exploration rates with almost no fail run. Both these aspects are critically desirable for many exploration applications, such as search and rescue.

B. RESEARCH CONTRIBUTIONS
The paper presents, the implementation of of recently developed Aquila Optimization Algorithm specifically which is specifically configured for Multi-Robot space exploration in an unknown environment. In our proposed algorithm, we made sure that different workspace regions are investigated at roughly the same rate, preventing certain areas from being searched considerably later than others. A novel parallel communication strategy is added for effective communication between agents. The result demonstrates that the proposed algorithm explores the unknown environment in an efficient way in a lesser amount of time. Finally, the performance of the proposed Aquila Optimizer has been demonstrated via the simulation. Different workspace regions are investigated for algorithm efficiency. The results acquired are analyzed and compared with other state of the art methods published in the literature.

C. PAPER ORGANIZATION
The rest sections in this paper are organized as in Section II; the related works are worthy of being mentioned. In Section III, the motivation of this research is given. Section IV presents the main procedure of the Aquila Optimizer, which is used to solve the robot path planning problem. In Section V, the definition and mathematical notations of the integrated CME-AO exploration problem are given. Detailed experiments and results are discussed in Section VI. In section VII, perspective analysis to compare the performance of the proposed CME-AO with contemporary algorithms namely conventional CME and CME augmented Whale optimizer (CME-WO) is performed. The conclusion and future work are presented in Section VIII.

II. RELATED WORK
Autonomous exploration is a challenge that has been approached from many directions. The strategy outlined in this work is based on significant information theory findings, frontier-based exploration, topological exploration, and a few random sampling-based techniques that are briefly covered in this section. The most related works that can support this research are worthy of mentioning, as follows.
Using information theory to solve the exploration problem is a common strategy. The techniques maximise the information gained throughout the following few actions [38], [39]. These techniques are also extended for multi-robot exploration system. However, the bulk of these approaches are based on greedy tactics, and their efficiency is constrained by their narrow focus. The appropriate way out could have been prioritizing long-term exploration path rather than just maximizing rapid reward functions [40], [41]. The problem can be formulated using borders (frontiers), or the line separating mapped and unmapped territories. When exploring, the vehicle continues to move toward frontiers, extending the marked regions' boundaries until the entire environment has been studied [42], [43], [44]. Faigl and Kulich's technique [45], in contrast to most of these methods, solves a special case of the ''art gallery issue'' to discover the lowest collection of perspectives necessary to cover the borders.
Fang et al. [46] employed the behavior-based technique known as social potential fields to determine the robots' trajectory and drive them toward uncharted territory. They then adjusted each robot's tilt and speed to enhance the system. The simulation demonstrated that a multi-robot system with robot speeds between 0.7 and 1.0 m/s and angles between 0.2 and 0.2 rad would provide the best coverage of an uncharted space. When compared to current trends, this kind of deterministic optimization has little impact and ignores the other needs of the robot system.
A typical path planning technique for multi-robot devices is provided in [47]. By assessing a different cost function, the additional coverage from the motion control is maximized. Then, a goal determination mechanism is built to promote collaborative exploration for a multi-robot system. The suggested path planning method is tested on several unknown and difficult environment maps. The simulation outcomes and quality assurance show that the proposed joint path planning method is successful. The work produced a new method for determining the ideal path route for multi-robots in a congested scenario by combining enhanced particle swarm optimization (IPSO) with the dynamically perturbed velocity (DV) method [48]. The algorithm aims to reduce the maximum path distance, which equates to minimizing the estimated time of arrival of all robotics in the ecosystem to their different locations. However, other factors such as map exploration rate were not reported. A centralized analytical system for tackling multi-robot route planning issues in generic, two-dimensional, continuous settings is described in [49] to reduce job completion worldwide. The system achieves a high level of performance by combining an ideal discretization of the constant surroundings with a quick, near-optimal resolution of the resulting discrete planning issue. The strategy can tackle difficulties with hundreds of robots that constantly take up more than 30% of the available area. However, real time application aspects which requires analyzing the exploration time was not reported.
Pugi et al. [50] introduces a multi-robot design exploration method based on an optimization algorithm and K-Means cluster analysis to ensure the balanced and sustained exploration of large workspaces. Both properties are demonstrated by comparing the proposed method to several state-of-the-art algorithms. The algorithm although reduces the variance of the average waiting time on certain locations, but samples the work space areas at different rates, which causes certain areas from being searched considerably later than others. They only applied the deterministic global optimization for CME. Similarly Gul et al. [51], utilized Whale optimization algorithm for the purpose of space exploration in an unknown environment by coordinating multiple robots. The research presents a search method for performing search exploration that mimics whale hunting behavior by combining deterministic Coordinated Multi-Robot Exploration (CME), and metaheuristic phase enhanced Whale Optimization Algorithm (WOA) approaches. The frequency is continuously modified using a probabilistic objective function to optimize exploitation and exploration technicians. Although the proposed algorithm showed improvements in results, yet the optimization time and exploration area needed significant improvement for actual on board utilization.
The optimum multi-robot path planning is investigated by Yu et al. [52], on graphs using four minimization goals: the makespan, the total arrival time, and the total distance. A new solution matching is built between multi-robot route planning and a unique form of the multi-flow channel to achieve this purpose. Novel and comprehensive methods are created to maximize several criteria based on this equality and integer VOLUME 10, 2022 linear programming. The use of integer linear programming design algorithms and heuristics proved its quite productive.
Mathew et al. [53] research focuses on a collaborative team of cars making autonomous deliveries in urban contexts. The collaborating team consists of two vehicles with complementing functionality: a truck limited to movement across a road system and a quadrotor micro-aerial device with capacities, which can be launched from the vehicle to execute deliveries. The simulations evaluated the efficiency of the provided methods and showed instances of delivery route calculations across real metropolitan street maps. However, sensors integrating issues were not explicitly discussed.
Das et al. [54] presents a novel approach for determining the optimal path for multi-robots in a constrained environment. The proposed method combined enhanced classical Q-learning based on four fundamental values with enhanced particle swarm optimization (IPSO) by changing variables and the preferentially perturbed velocity (DV) technique for improving convergence. The improved classical Q-learning stores the Q-value of the state's optimal action, saving data storage. However, area complexity factors were not analyzed to determine algorithm complexity in varying conditions.

III. AQUILA OPTIMIZER FRAMEWORK FORMULATION
Abualigah et al. [55] suggested Aquila Optimizer, a new swarm intelligence algorithm, in the year 2021. Aquila has four main hunting behaviours for different types of prey. Aquila can switch hunting techniques quickly for different prey and then attack with its fast speed and strong feet and claws. The following is a brief summary of the mathematical model.
Step 1 (Expanded Exploration): In this strategy, the Aquila soars high above the ground and thoroughly surveys the search zone, then diving vertically once the Aquila has determined the prey's location. This behavior's mathematical representation is as follows: where X best (t) is the best position so far and X M (t) is the average position of all Aquila's in the current iteration. The current iteration and the maximum number of iterations are represented by t and T, respectively. N is total population size, and r1 is a random number between 0 and 1.
Step 2 (Narrowed Exploration): For Aquila, this is the most usual technique of hunting. After descending within the designated area and flying around the prey, it uses short gliding to assault the prey. The formula for updating the position is as follows: where X R (t) denotes a random Aquila location, D is the dimension size, and r2 denotes a random integer inside the dimension [0, 1]. The Levy flight function, abbreviated as LF(D), is presented as follows: u and υ are random numbers between 0 and 1, whereas s and β are constant values of 0.01 and 1.5, respectively. The spiral shape in the search is represented by y and x, which are determined as follows: where r3 is the number of search cycles between [1,20], D 1 is an integer array from 1 to the dimension size (D), and ω is 0.005.
Step 3 (Expanded Exploitation): When the region of prey is roughly determined in the third way, the Aquila descends vertically to launch a preliminary attack. AO uses the chosen region to go close to the target and attack it. This behaviour is presented as follows: where X best (t) signifies the best position so far, and X M (t) denotes the current position's average value. The exploitation adjustment parameters a and d are set to 0.1, ub and lb are the problem's upper and lower bounds, and r4 is random number within the problem [0, 1].
Step 4 (Narrowed Exploitation): The Aquila pursues the prey's escape path and then attacks it on the ground. This behaviour is mathematically represented as: where X(t) denotes the current position and QF(t) is the value of the quality function, which was utilized to balance the search strategy. G 1 is the Aquila's tracking prey movement parameter, which is a random number between [-1,1]. When chasing prey, G 2 symbolizes the flying slope, which falls linearly from 2 to 0. r5 is random value ranging from 0 to 1.

A. PERFORMANCE EVALUATION
The results of Aquila are then compared to benchmark functions (as referred in Table 2, 3 and 4 in [55]). Finding acceptable points that satisfy the stated condition from a set of potential candidate outcomes associated to fmin is the primary goal.
The performance of the aquila algorithm is evaluated using the unimodal function F1 (spherical function) as defined in  Table 2 in [55]. In the first run, a quick convergence rate is observed, as shown in Figure 1, with a total execution time of 1653.903269 seconds. With 5× 5 boundaries, the 2D space is chosen. It took 10 population sizes and 20 iterations to get the best answer. Figure 1 depicts the optimization drift subfigure (d). The main advantages of aquila were found to be (a) rapid convergence, (b) lower memory requirements and ease of implementation, (c) avoidance of local, and (d) fewer parameter tweaking. Despite the algorithm's early rapid convergence, it had an intrinsic limitation of gradually slowing down due to the diversity problem.

IV. INCORPORATION OF PARALLEL STRATEGY
Parallel strategies, such as Data Mining [56] and Deep Learning [57], are commonly utilized to fasten the processing speed and to solve complex and complicated problems [57]. Genetic Algorithm (GA) is one of the most well-known and primitive names in the world of intelligent algorithms [58]. Experiments have shown that using a parallel technique allows for multipoint parallel search in space, which improves performance and communication among population. The search speed is enhanced due to the utilization of multiple CPUs. As a result, the parallel strategy method performs better than the original algorithm [59], [60]. With the growing amount of data, a single communication method is often insufficient. Roddick [61] split the single communication method into three and applied them to PSO. Experiments have shown that the three communication options improved PSO's efficiency. The parallel approach was then improved by several researchers based on Chang's hypothesis. Yang et al. [62] and Zhu et al. [63] proposed a parallel strategy including various groups and techniques. According to its unique communication plan, each population will be updated. When a particular number of iterations have been completed, the populations communicate and share information. Their final update techniques are the identical, with one group's ideal value being used instead of another group's worst value. Nasrabadi et al. [64] used a parallel approach in which numerous groups of the same method worked together. Each group utilized the same technique to evolve separately at first. i.e., after a given number of repetitions, populations begin to share knowledge. As demonstrated in Figure 2, we added randomization to the communications strategy and employ multiple tactics in local and global search which assist groups to completely communicate. For the local, two groups are randomly chosen, and every T iteration, one group of particles with the best fitness is substituted for the other group of particles with the lowest fitness. Every two iterations, the global best particle replaces the worst particle in the group. The goal of implementing these two strategies is to improve the algorithm's randomness, strengthen communication between populations, and avoid premature convergence, all while improving the algorithm's robustness.

A. EXPLORATION
Exploration of an unknown environment is a key topic in autonomous mobile robots, which involves exploring uncharted locations while developing a map of the environment. Traditionally, humans plan the environment ahead of time, and the robot uses that map to navigate the VOLUME 10, 2022 environment later while avoiding impediments. Exploration has the potential to cut the human out of the map-making process in an unknown environment. Exploration algorithms have a wide range of applications in domains such as space robotics, sensor deployment, and defensive robots, among others.
Brian Yamauchi introduced a consistent and clear technique in the field of autonomous exploration in 1997 [38], [65], which is the foundation for most of the current exploration algorithms. The basic question in exploration is: Given our current understanding of the world, where should we move the robot to gather the most data?. The concept of frontiers can provide an answer to this. The terminologies utilized for space exploration are elaborated below: • The zone not yet covered by the robot's sensors is referred to as the Unknown Region.
• The Known Region is the area that the robot's sensors have already surveyed.
• Open-Space is a well-known area that is devoid of obstacles.
• Occupied-Space is a well-known area with an impediment.
• Occupancy Grid is a representation of the surroundings in a grid format. Each cell has a probability that indicates whether or not it is occupied.
• The Frontier is the line that divides known and unknown regions. A frontier is a collection of unknown points with at least one open-space neighbour.

V. FORMULATION OF INTEGRATED CME-AO ALGORITHM
Multi-robot coordination Exploration entails a search process aided by a multi-robot team. They go all over the place surfing. The information they gather during exploration is used to create a map of a specific area. Different communication algorithms are also available for robots. There are two methods of exploration available: centralized and decentralized exploration. The former uses a single map, whereas the later uses a unique map for each robot [66]. Our paper focuses on a centralized application that considers the distance travelled locally by each robot, as well as the cost. The most crucial task when exploring unfamiliar places in an environment is to have knowledge of Frontier cells. It's described as a cell that has been investigated and is directly adjacent to an unexplored cell. Each cell's price is proportionate to the distance between the robot's starting point and the cell. The map is divided into evidence grids that span the length of it. The evidence grids are made out of Cartesian grids with cells. A probability value is assigned to each cell, indicating whether the region is inhabited or not. The sensor model refreshes the grid when a sensor reading is acquired in real time.
Each cell has a numerical value, thus if a cell has been investigated previously, the cost of that cell is added to the new cost and the cell is marked as a Frontier cell, as shown in Equation 13. To reduce the temporal complexity, the condition must be met. K x,y = min x 2 + y 2 .P(occ x+ x , y + y) If the sensor beam has previously passed through the cell, it is labelled as a frontier cell, and the cost is added backwards in the previous step. The probability values of unknown cells, non-occupied cells, and occupied cells are 0.5, 0 and 1. When the sensor beam contacts the cell from a given distance, the likelihood value lowers. Refer to [38] Equation 2.
The goal of cost reduction is to discover the lowest value from all of the neighbouring cells, which determines the robot's next best position. Finding cost for a single robot is simple compared to a multi-robot system that requires constant integration. For this reason, the CME method proposed the utility arrangement for robot-related tasks. The basic goal is to reduce the cost of surrounding cells in order to identify the robot's ideal next best position.

A. INTEGRATED STOCHASTIC OPTIMIZER
Maximizing utility values is defined as: it is assumed that all evidence grid cells in the map have the same utility value at the start. Refer to Equation 14 to see how these values change when the robot(s) traverse the map. In order to explore new positions on the map, the robot(s) are interested in exploring those cells with high utility values.
The state of earlier modification Ugcj−1 is represented by the cell utility value Ugcj. During the exploration phase, the value changes, and the robot's current position is subtracted from a value equal to the probability occupancy of the selected cell. Equation 15 is used to get the ideal value.
The following is the problem statement: Robot motion with sensor coverage is used to explore an unknown area. Static sensors have largely been noticed in the literature being utilized for area coverage or surfing an unknown environment [67]. The proposed method begins with the creation of a map of an environment with the help of robots(s). Aquila optimizer Algorithm is the optimization methodology applied in this case. It assists in the generation of positions that alter the order. The optimizer is a deterministic exploration approach that assists the robot in selection of new position. Subtract RU i c and K gc 8:

return obtained solution
The proposed CME-AO exploration is elucidated by the Algorithm 1. The utility has a maximum value of 1. The robot sensory cell is divided into 8 vector cells, V s , each of which is made up of V 1 x,y , V 2 x,y . . . . . . , V 8 x,y . The cells are regarded as viable contenders for the posts. Equations (1)(2)(3)(4)(5)(6)(7)(8)(9)(10)(11) are used in the suggested technique to update the X(iter+1) position. Using Equation 15, the algorithm then evaluates the cost and subtracts the utilities from the cost for the eight vector cells.
After evaluating those legitimate candidates with priority, the suggested optimizer computes the four utility value for the leading optimizer. Due to G1, G2 and QF (refer to step 11 to 44 of Algorithm 1 and the occupancy probability values of the dominated cells), the priorities have been swapped. The dynamic parametric characterization t ≤ ( 2 3 * T ) is used to choose between exploration and exploitation phases. To swap the phases, a random number r1 is used. When the entire environment is surfed in the proposed approach, G1, G2 and Quality Function tries to converge.
X 2 (t + 1) = P aq,i (occ x+ x,y+ y ) × LF(D) The important point to notice is that we did not take in account the mean value from Equation 1 & 10. Because it is related to the natural behavior of aquila that utilizes the intelligence dominance agent. Finding the average robot placements among the different agents of aquila's is not necessary in the target selection problem. Therefore, the robot(s) does not require to extract mean value information at every instance.
The best aquila operator is automatically fed the robot's next best position X(t+1), and the largest value is assigned to that operator. As a result, Equation 14 reduces the utility values of neighbouring cells. The G1, G2 and QF generates new random values for the next iteration and slowly converges the algorithm.
During each iteration, the hybrid CME-AO updates the next optimal aquila operator value. The expenses and benefits of grid cells surrounding a robot provide data. The best operator is chosen by the G1, G2 and QF. The robot is therefore forced to plan maneuverability using the best cell value, which is calculated using the occupancy probability. CME-AO is employed for the robot's next move. When compared to investigated cells/areas, unexplored cells/areas have a higher utility value. When the cost of examined cells is removed from the utility value, the cells with higher utility values become more appealing to the robot.
When the utilities of the unexplored cells are removed from the costs with least values, the maximal values become appealing targets for the future robot positions. For both ways, this principle is true. The suggested hybrid stochastic technique, on the other hand, has four optimum alternatives for changing the hierarchical order based on stochastic parameters. It means that, unlike the CME, the highest value may have been stored in expanded exploration or expanded exploitation phase, rather than just narrowed exploration or narrowed exploitation.

VI. SIMULATION RESULTS AND DISCUSSIONS
The findings of the proposed multi-coordinated exploration using aquila technique is shown in this section. To test the practicality of the suggested approach, the map's complexity is modified. The number of barriers are varied for modification purpose. Map size is kept as 20m width and height. The entire area is 20m×20m, with a ray length of 1.5m. For the map generation, the robotics toolbox was used. The black region formed represents the barrier-infested area, while the rest of the area represents the open space that must be explored.
As elaborated in Figure 3, the robots complete the first iteration by commencing their search by scanning the path with the help of sensors. This aids in the spatial divergence process, which lowers the utility of the chosen target. Despite the fact that the algorithm depicted a high exploration rate, parametric variables setting was further improved. The cost value is initially lowered, and if an equal cost value is found, the last cost is saved. Another aspect to be taken care was that if the cost and utility values are the same, the robot might get stuck at a certain point. As a result, a proper solution for entire area surfing is essential to avoid this occurrence.
In order to find the total explored cells area, following Equation 20 was further utilized.

TotalArea =
Unsurfed area − Surfed area Surfed area (20) This parameter was used to evaluate the area that the multi-robot will be surfing. It can reach a maximum value of 1, indicating that 100% of the area has been explored, or 0 indicating that no area has been surfed. After the simulation, the value obtained from Total area is utilized to make a comparison. The framework validation was then performed utilizing two different scenarios denoted by Map 1 and Map 2 respectively. Sensor range, map size, number of obstacles, iteration, and initial robot location are among the most noticeable map characteristics. The exploration process achieved is TABLE 2. Statistical analysis to determine average exploration rate and time by CME-Aquila optimizer.  pictorially depicted by Figures 4 & 5 respectively for the two cases. Results indicate that a total of 97.87% and 96.667% of the total were explored efficiently within a short time of just 31 and 32 sec respectively. It remains to mention that performance characteristics could be further increased by increasing the number of iterations. The results clearly demonstrated the effectivity of the proposed algorithm, as not only a significant portion of the area was explored, but at the same time both the computational complexity and exploration times were significantly lowered. CME-AO therefore successfully exhibited that map exploration was achieved efficiently and effectively.
The performance evaluation of CME-AO was further ascertained through statistical means. The analysis is carried out by performing multiple simulations for two different environmental conditions as demonstrated in Figure 4 and 5 respectively. The analysis was performed with an aim to determine algorithm efficiency in terms of the average (%) area explored and total time taken in multiple simulation runs. The results obtained for two environmental conditions denoted by Map 1 and Map 2 respectively, during multiple runs are depicted in Table 2. As elaborated in the table, proposed CME-AO, depicts a consistent high area exploration rate with average rate of exploration for map 1 and map 2 being 97.0780% & 96.1520%. Similarly, the average time consumed by CME-AO is just 31.0440 sec & 32.290 sec for the two map configurations. This clearly demonstrates that proposed methodology not only efficiently explores different configuration maps but also consistently shows the same high rates on multiple runs. This consistency in producing the expected results makes the suggested methodology specially suitable for near real time applications where traditional techniques such as CME either fails (stuck locally) or takes longer exploration time.

VII. COMPARISON OF CME-AO WITH CONTEMPORARY CME AND CME-WO ALGORITHMS
After performing extended simulations under different environmental conditions to evaluate the the performance of our proposed CME-AO, now we further extend our results. In this section, we will perform a detailed study and do a comprehensive analysis by comparing the results achieved from CME-AO with contemporary CME and CME-WO (whale optimizer) algorithms. The analysis will carried out under setting with a higher level of intricacy and introducing complex situation loaded with various obstacles in random order. We will investigate the performance in three different environmental conditions denoted by Map 1, Map 2 and Map 3 respectively. These maps have varying degree of complexity with different orientation, number and length of obstacles. The comparison duly accommodates all pertinent aspects such as % area explored, number of failed runs, and time taken for map exploration for different environments.
Map 1: Figure 6 represents the implementation of coordinated multi-robot exploration implemented with our proposed CME-AO, conventional CME and whale optimizer algorithms under ware-house 2D map. The simulation result shows that in case of the conventional CME algorithm (Figure 6(a)), the robot got stuck due to its incapability to explore the path in correct direction. The entire simulation is therefore required to be re-run. It is important to mention that re-running of the simulation wont change results as CME is incapable to generating random solutions which is why if robot got stuck at any point or in map, the entire map needs to be changed. In case of CME-WO algorithm, the map was explored in a sub-optimal way with map coverage of approx 84.65% (Figure 6(b)) with exploration time of approx 91.34 sec. In case of CME-Aquila Figure(6(c)), it can be evidently seen that the entire map is explored efficiently with an exploration rate of approx 97.931%. The map was explored in almost 300% lesser time i-e just in approx 31.3 sec. Furthermore, the number of failed simulations were zero in case of CME-AO against the non-zero failed simulation numbers of CME and CME-WO. This clearly demonstrates the superior performance (exploration rate, exploration time and Failed simulation runs) of proposed CME-AO algorithm in comparison with both the latest contemporary algorithms. It is also apprised that during MATLAB simulation runs, we captured the cases and generated warnings when the neighbouring cell is occupied by obstacle or hit by another robot (refer Figure 6 (a)).

Map 2:
The performance of CME, CME-WO and CME-AO was again compared in a different condition environment. As evident in Figure 7, map exploration of about 86.7% and 92.9% with exploration time of about 81 sec and 54.87 sec were achieved through CME and CME-WO respectively. CME-AO because of its unique feature of inbuilt parallel strategy and efficient computation logic, exhibited superior performance with map exploration rate of about 98.95% in just about 34 sec.
Map 3: Figures 8, represents performance comparison of the three algorithms in another environmental condition. It is evident that CME alone did not produced satisfactory performance (81% map exploration, 95 sec exploration time, 02 Fail runs). This is mainly due to the inherent limitation of utilization of deterministic methods, which forces the robot to follow the same route every time when the simulation begins. Similar nature performance is achieved by CME-WO (87.65% map exploration, 97.34 sec exploration time, 02 Fail runs), who also did not entirely explore the area (the grey area shows the unexplored area efficiently). A comparatively superior performance was exhibited by CME-AO, with map exploration of about 98.72% in just about 37.15 sec with zero fail runs. It is important to mention that during the entire validation process, the orientation, size and quantity of obstacles were maintained the same for a fair comparison.

A. SUMMARY OF RESULTS
The entire results of Section VII are summarized in Table 3 for speedy reference and readers' convenience. The referenced table depicts the results achieved both the proposed CME-Aquila (CME-AO) and the referenced CME and CME-Whale (CME-WO) algorithms. The results revealed that the proposed CME-AO methodology satisfactorily achieves the primary objective of space exploration in fewer simulation runs and produce consistent results. The conventional CME % CME-Whale algorithm apart from exploring the maps inefficiently, also requires additional runs/time for exploration. We achieved the enhanced results by incorporating a unique parallel strategy embedded in the standard Aquila algorithm. This significantly increased the particle communication, thereby reducing the execution time significantly. Results also indicate that the proposed methodology significantly improved system's robustness and produces a major improvement over the original algorithm. As depicted in referenced Table 3, the execution time is reduced by a factor of 200%-300% in all the tested map configurations. To summarize, through simulations we have demonstrated two salient benefits (a) Enhanced system robustness aspects as the algorithm successfully explore areas where other algorithms fail (b) significantly reduced execution time (worst case improvement factor of almost 200 %). This makes the proposed CME-Aquila (CME-AO) algorithm a prominent choice for on-board practical utilization. VOLUME 10, 2022

B. LIMITATIONS
On basic or ordinary maps, the technique performs similarly to CME, however when taking into account the complexity of the map, the proposed method performs exceptionally well considering the simulations and iteration number. For CME, it is valuable to understand that if the method is ineffective in certain conditions, there is no way to explore the ideal solution given the map conditions. Besides occasionally changing the map circumstances, which is not always possible. Moreover, the proposed optimization strategy technique may be implemented on hardware for actual world applications.

VIII. CONCLUSION
The study formulates the integration of a deterministic CME method with a stochastic aquila strategy for multi-robot exploration. The robot maneuverability is determined by the data collected from sensors in order to set the waypoints. After that, a stochastic method is used to refine the result even further. By fine-tuning the parameters, the hybrid technique aids in a powerful performance. The results were further optimized by incorporating a unique parallel strategy embedded in the proposed architecture. This significantly increased the particle communication, thereby reducing the execution time significantly. As shown in section VII, the proposed algorithm presents two distinct advantages of enhanced map exploration in cluttered environment and significantly decreased execution time. Besides this, algorithm consistency in producing the desired results in different configuration maps and multiple simulations was also demonstrated. The suggested hybrid methodology's intrinsic advantage makes it particularly ideal for a wide range of operations, such as search and rescue missions, disaster management, reconnaissance, and so on, where the search space is congested and constrained. This study is expected to pave the way for academics to develop hybrid tactics for multi-robot space exploration.
FAIZA GUL is currently pursuing the master's degree with the Department of Electrical Engineering, Air University, Aerospace and Aviation Campus Kamra. Her research interests include development of LIDAR systems, development of computational algorithms for autonomous guided vehicles, soft computing techniques, and mobile motion control and its application solutions.
ADNAN MIR received the bachelor's degree from the University of Peshawar and the master's degree from Leeds University, U.K. He has over ten years of experience in teaching at various prominent institutes. He is currently an Associated with the University of Technology Sydney, Australia. LAITH ABUALIGAH received the degree in computer information system and the master's degree in computer science from Al al-Bayt University, Jordan, in 2011 and 2014, respectively, and the Ph.D. degree from the School of Computer Science, Universiti Sains Malaysia (USM), Malaysia, in 2018. He is currently an Assistant Professor with the Department of Computer Science, Amman Arab University, Jordan. He is also a Distinguished Researcher with the School of Computer Science, USM. According to the report published by Stanford University, in 2020. He is one of the 2% influential scholars, which depicts the 100,000 top scientists in the world. He has published more than 80 journal articles and books, which collectively have been cited more than 3100 times (H-index = 27). His main research interests include on arithmetic optimization algorithm (AOA), bio-inspired computing, nature-inspired computing, swarm intelligence, artificial intelligence, metaheuristic modeling, optimization algorithms, evolutionary computations, information retrieval, text clustering, feature selection, combinatorial problems, optimization, advanced machine learning, big data, and natural language processing. He currently serves as an Associate Editor for the journal of Cluster Computing (Springer) and the journal of Soft Computing (Springer). His research interests include the Internet of Things, cyber-physical systems, pervasive computing, cloud, fog, and edge computing, social mining, artificial intelligence, and cyber security. He serves as a PC member of several conferences and journal.