Application of Improved Moth-Flame Optimization Algorithm for Robot Path Planning

Path planning is a central and difficult research topic in the field of mobile robots, and it is the basis for further research on and applications of robots. In order to obtain the global optimal path of a mobile robot, an improved moth-flame optimization (IMFO) algorithm is proposed in this paper. The IMFO features the following two improvements. First, referring to the spotted hyena optimization (SHO) algorithm, the concept of the historical best flame average is introduced into the update law of the moth-flame optimization (MFO) algorithm to increase the algorithm's ability to jump out of local optima. Second, quasi-opposition-based learning (QOBL) is used to perturb the moth positions, increasing population diversity and improving the convergence rate of the algorithm. To evaluate the performance of the proposed algorithm, the IMFO algorithm is compared with three existing algorithms on three groups of benchmark functions of different types. The comparative results show that the IMFO algorithm is effective and performs well in terms of jumping out of local optima and balancing exploitation and exploration. Finally, the IMFO algorithm is applied to mobile robot path planning, and computer simulations confirm the algorithm's effectiveness.


I. INTRODUCTION
Path planning technology is one of the core elements of research in the field of mobile robotics, and its purpose is to generate an optimal or near-optimal collision-free path from the starting point to the end point. Commonly used criteria for generating paths include the shortest path length, the smallest turn angle, and the best safety. Depending on whether the mobile robot has a priori information about its environment, the path planning problem can be divided into two categories: global path planning and local path planning. In global path planning, the robot has a priori information about the environment. In local path planning, by contrast, the mobile robot has completely or partially unknown information about the environment, apart from the starting point and the target point. The traditional path planning algorithms include the artificial potential field method, graphical searching approaches, the genetic algorithm, and so on. Recently, an improved path initialization method was proposed to address the fact that the path initialization process can affect the performance of genetic algorithms, and simulation experiments show that the method can obtain high-quality paths in a shorter period of time [1]. The reference [2] addressed the shortcoming of the artificial potential field method in path planning, which is prone to falling into local optima, improved the repulsive function, and combined it with a fuzzy inference strategy to better achieve multirobot path planning. The reference [3] combined neural networks with hierarchical reinforcement learning: first, neural networks perceive the environment and perform feature extraction, making the action representation adaptive to the environment; then, hierarchical reinforcement learning maps the current state to these actions, satisfying the needs of mobile robots.
In reference [4], an incremental training approach for path planning was proposed to overcome the low efficiency, poor convergence, and weak generalization ability of reinforcement learning algorithms developed in a 3D path simulation environment, and better results were achieved. Although the traditional methods have achieved good results, it is impractical to solve the mobile robot path planning problem with a traditional method alone. For example, the artificial potential field method suffers from unreachable targets, while the neural network and graphical searching approaches require a large amount of computation. In recent years, swarm intelligence algorithms have been favored by scholars and have achieved good results due to their strong optimization ability. The reference [5] applied the quantum evolutionary algorithm (QEA) to robot path planning and compared it with the genetic algorithm; the simulation experiments verified the effectiveness of QEA. The reference [6] studied the robot path planning problem in a static obstacle space based on the dragonfly algorithm and achieved good results. The reference [23] applied the slime mould optimization algorithm (SMOA) to robot path planning, and the simulation experiments verified the effectiveness of SMOA. Although swarm intelligence algorithms perform well, they suffer from problems such as slow convergence speed and a tendency to fall into local optimal solutions. For this reason, some scholars have improved swarm intelligence algorithms from different perspectives. In reference [7], an adaptive firefly algorithm was proposed based on an adaptive strategy, which mitigates the tendency of the traditional firefly algorithm to fall into local optima.
The reference [8] proposed the improved artificial fish swarm algorithm (IAFSA), which improved the update formula of the traditional AFSA and introduced a backward learning strategy to improve the quality of the initial solution and increase the convergence speed of the algorithm. The reference [9] improved the pheromone update strategy of the ant colony algorithm and combined it with a geometric path method to obtain an improved algorithm with faster convergence and better performance. The reference [10] combined the particle swarm algorithm with the gravitational search algorithm to balance the global and local search capabilities of the particles, and also combined a chaos-based particle swarm algorithm with the ant colony optimization algorithm. Comparative simulation experiments show that the chaos-based particle swarm optimization-ant colony optimization has a rapid search speed and can obtain solutions of similar quality. These improved algorithms offer faster convergence, higher accuracy, and better performance than the standard algorithms in path planning applications. However, the research space is limited, so it is necessary to explore new algorithms, for example the MFO algorithm [11], the firefly algorithm [12], the cuckoo search algorithm [13], the bacterial foraging algorithm [14], and the gray wolf optimization algorithm [15], and apply them to robot path planning. Among these, the MFO algorithm, proposed by S. Mirjalili in 2015, has received special attention due to advantages such as its simple search mechanism and few adjustable parameters. Unfortunately, it suffers from slow convergence in the late iterations and a tendency to fall into local optima. The reference [16] combined an improved differential evolution strategy with the MFO algorithm to increase population diversity and improve the algorithm's performance.
The reference [17] hybridized the simulated annealing (SA) algorithm with the MFO algorithm, relying on the SA mechanism of accepting non-optimal solutions to help the MFO algorithm escape from local optima. The reference [18] hybridized the teaching-learning-based optimization algorithm with the MFO algorithm, improving its exploration and exploitation capabilities. The reference [21] used an opposition-based learning strategy to improve the MFO algorithm and verified the effectiveness of the improved algorithm in benchmark function performance tests. The reference [22] improved the MFO algorithm's update formulation and then used a mutation operator to escape local minima; the performance of the improved algorithm was verified in structural optimization experiments. In order to better apply the MFO algorithm to mobile robot path planning, this paper proposes an improved moth-flame optimization algorithm. First, the update formulation of the MFO algorithm is improved with reference to the hunting and attack mechanisms of the SHO algorithm [19]. Then, QOBL is introduced to increase population diversity and help the algorithm jump out of local optima in the late iterations.
The main contributions of this paper are summarized as follows: (1) The moth position update mechanism is improved by introducing the concept of the historical optimal flame average, with reference to the update formula of the SHO algorithm. This mechanism improves the overall performance of the algorithm.
(2) The QOBL strategy is used to update the moth positions during the iterative process to further improve the overall performance of the algorithm.
(3) The IMFO algorithm is capable of achieving results (feasible paths) considering distance (minimum path length), safety (avoiding collisions), and smoothness. This paper is organized as follows. Section II presents the details of the moth-flame optimization algorithm, the spotted hyena optimization algorithm, and quasi-opposition-based learning. Section III discusses the IMFO algorithm in detail. Section IV selects three different types of benchmark functions to evaluate the performance of the IMFO algorithm. Section V solves the mobile robot path planning problem with the IMFO algorithm. Section VI presents the conclusions of this study.

II. RELATED WORKS
A. MOTH-FLAME OPTIMIZATION ALGORITHM
The MFO algorithm is inspired by the special navigation mechanism of moths at night. In this mechanism, the moths fly at a fixed angle relative to the moon. However, this flight mechanism is only effective when the moth is far away from the light source. When an artificial light source is present, the moth is attracted to it and moves in a spiral around it. Inspired by this natural phenomenon, the process of a moth flying in a spiral around a flame is abstracted as the process of finding an optimal solution. The moth is taken to be a candidate solution to the problem, and the variables to be solved are the coordinates of the moth's position in space. Thus, by changing its position vector, the moth can fly in one, two, three, or even higher dimensions. The population position matrix and the fitness function matrix of the moths are denoted by M and OM, respectively.
The flame matrix in the MFO algorithm is similar in structure to the moth population matrix in that both represent solutions to the problem. The difference between the two is that the flame matrix is used to store the elite solutions of the moth population. That is, after each iteration the MFO algorithm sorts the moth population by fitness value and then stores the sorted moths in the flame matrix. The flame position matrix and the flame fitness value matrix are denoted by F and OF, respectively. For swarm intelligence algorithms, the position update mechanism is the central part of the algorithm. In the MFO algorithm, individual moths keep perturbing their positions in spiral flight around the flames as the iterations proceed, until the optimal solution is found. The mathematical description is divided into two parts: flame-tending behavior and flame-abandoning behavior.

1) FLAME-TENDING BEHAVIOR
In nature, moths are flame-tending: a moth moves towards the closest flame according to its special navigation mechanism. This means that in the MFO algorithm, a moth updates its position only according to the flame that corresponds to it. The update mechanism is that the moth spirals around its corresponding flame, moving around it along a logarithmic spiral flight path. The update formula is shown in (1):

S(M_i, F_j) = D_{i,j} · e^{bt} · cos(2πt) + F_j (1)

where D_{i,j} = |F_j − M_i| indicates the distance of the ith moth from the jth flame, b is a constant defining the shape of the logarithmic spiral, and t is a random number in [−1, 1] that determines the distance of the moth's next position from the flame: when t = −1 the moth is closest to the flame, and when t = 1 it is farthest. When there are enough moths and flames, the moths are able to search the vast majority of the solution space, thus ensuring the exploration capability of the algorithm.
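The flame-tending update described above can be sketched in a few lines of Python. This is a minimal illustration of the logarithmic spiral step for one moth and its corresponding flame, with positions represented as plain coordinate lists; the function name is ours, not from the paper.

```python
import math
import random

def spiral_update(moth, flame, b=1.0):
    """One flame-tending step of MFO (Eq. (1)): move `moth` along a
    logarithmic spiral around its corresponding `flame`.
    `moth` and `flame` are coordinate lists of equal length."""
    new_pos = []
    for m_j, f_j in zip(moth, flame):
        d = abs(f_j - m_j)               # distance to the flame, D_{i,j}
        t = random.uniform(-1.0, 1.0)    # t = -1: closest, t = 1: farthest
        new_pos.append(d * math.exp(b * t) * math.cos(2 * math.pi * t) + f_j)
    return new_pos
```

Note that when the moth already sits on its flame, the distance term vanishes and the moth stays put, which is why flame reduction (next subsection) is needed to concentrate the search.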

2) FLAME-ABANDONING BEHAVIOR
The number of flames in the MFO algorithm decreases as the number of iterations increases, enabling the moths to fully search the neighborhood of the better solutions and ensuring the exploitation capability of the algorithm. The update formula is shown in (2):

flame_no = round(K − l · (K − 1) / L) (2)

where l is the current iteration number, K is the maximum number of flames, and L indicates the maximum number of iterations.
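The linear flame-reduction schedule can be expressed directly from the quantities defined above; the following one-liner is a straightforward transcription of Eq. (2):

```python
def flame_count(l, K, L):
    """Number of flames at iteration l (Eq. (2)): decreases
    linearly from K at the start down to 1 at iteration L."""
    return round(K - l * (K - 1) / L)
```

At the first iteration every moth has its own flame; by the last iteration all moths update with respect to the single best flame.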

B. SPOTTED HYENA OPTIMIZATION ALGORITHM
The SHO algorithm is derived from the hunting and foraging mechanism of the African spotted hyena population, which mainly includes the process of searching, encircling, hunting, and attacking prey.

1) ENCIRCLING PREY
Spotted hyenas first rely on vision to determine the location of prey, then sort the population according to the proximity of prey, and set the spotted hyena closest to the prey as the current optimal position. The rest of the individuals in the population are updated with reference to the optimal position, thus surrounding the prey. The specific formulas refer to (1)-(5) in the literature [19].

2) HUNTING
Spotted hyenas rely on a trusted network within the population and on the ability to locate prey for hunting; the mechanism is updated with (3)-(6):

D_h = |B · P_h − P_k| (3)
P_k = P_h − E · D_h (4)
C_h = P_k + P_{k+1} + · · · + P_{k+N} (5)
N = count_nos(P_h, P_{h+1}, . . . , (P_h + M)) (6)

where P_h is the position of the first best spotted hyena, D_h the distance between the first best spotted hyena and the other spotted hyenas, P_k the position of the other spotted hyenas, C_h a group of N optimal solutions, and N the number of spotted hyenas, where M is a random variable in [0.5, 1] and count_nos counts the number of solutions.

3) ATTACKING PREY
After completing the encirclement of the prey, the spotted hyenas wait for an opportunity to attack. When the convergence factor E is less than 1 and greater than −1, the spotted hyenas launch the attack. In the SHO algorithm, the attack behavior is mathematically formulated as taking the average of the current cluster of optimal solutions, as shown in (7):

P(l + 1) = C_h / N (7)

where P(l + 1) is the updated position.

4) SEARCH FOR PREY
Spotted hyenas search for prey based on the locations of the spotted hyenas in the best search group C_h. However, when the convergence factor E is greater than 1 or less than −1, the spotted hyenas move away from each other to search for and attack prey again, thus performing a global search.

C. QUASI-OPPOSITION-BASED LEARNING
The main idea of opposition-based learning is to improve the optimization performance of a swarm intelligence algorithm by simultaneously evaluating the current solution and its opposite solution and retaining the better of the two. The opposite point is defined as follows. Definition 1 (Opposite Point in the n-Dimensional Space) [20]: Let x = (x_1, x_2, . . . , x_n) be a point in n-dimensional space with x_i ∈ (a_i, b_i); then its opposite point x̄ can be calculated using (8):

x̄_i = a_i + b_i − x_i (8)

Definition 2 (Quasi-Opposite Point in the n-Dimensional Space) [20]: The quasi-opposite point is generated between the center of the search space and the opposite point. Mathematically, it is expressed in (9):

x̄^q_i = rand(c_i, x̄_i), c_i = (a_i + b_i) / 2 (9)
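The two definitions above combine into a short routine: compute the opposite point, then draw a uniform sample between it and the interval center. The sketch below works per coordinate; the function name is ours.

```python
import random

def quasi_opposite(x, a, b):
    """Quasi-opposite point of x in the interval (a, b):
    a uniform random point between the interval centre c
    (Eq. (9)) and the opposite point x_bar (Eq. (8))."""
    x_bar = a + b - x                 # opposite point, Eq. (8)
    c = (a + b) / 2.0                 # centre of the search interval
    lo, hi = min(c, x_bar), max(c, x_bar)
    return random.uniform(lo, hi)     # quasi-opposite point, Eq. (9)
```

Because the sample always lies between the center and the opposite point, the quasi-opposite candidate stays inside the search bounds while still probing the "mirror" region of the current solution.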

III. PROPOSED IMFO ALGORITHM
In order to increase the global searching ability and enhance the robustness of the MFO algorithm, an improved moth-flame optimization algorithm, called the IMFO algorithm, is proposed in this paper.

A. MOTIVATION OF IMPROVING MFO ALGORITHM
The MFO algorithm is a swarm intelligence optimization algorithm with a simple structure that is easy to learn and use, and it has few parameters to adjust, making it undemanding of its operating environment. However, the algorithm suffers from the following shortcomings.

1) SLOW CONVERGENCE SPEED
The spiral flight search of the MFO algorithm has good local search ability and good coverage of the target area, and it can quickly converge toward the optimal solution in the early iterations. However, after a certain number of iterations, the spiral flight search makes only small updates to the current population, which slows the convergence of the algorithm in the late iterations.

2) LOCAL OPTIMAL
When the MFO algorithm finds a local optimal solution, the other moth individuals quickly gather near it, and because the MFO algorithm has no mechanism for jumping out of local optima, it is difficult for the moths to escape once they have flown into the local optimal region. This leads the algorithm to become trapped in the local optimal region and exhibit premature convergence. In addition, although the adaptive reduction of the number of flames in the MFO algorithm guarantees that the algorithm converges to some result during the iterative process, this mechanism also reduces the diversity of the population.

B. IMFO ALGORITHM
To address the above two shortcomings, this paper makes improvements in two ways. The concept of the historical best flame average is introduced to improve the update formula of the MFO algorithm; the improved update formula enhances population diversity as well as the ability to escape from local optima. Equation (9) is then applied to the updated moth population to execute the quasi-opposition-based learning strategy, which further enhances population diversity and the ability to escape from local optima. The concept of the historical best flame average is inspired by the SHO algorithm: the hunting mechanism in the SHO algorithm is similar to the construction of the flame matrix in the MFO algorithm, as both build clusters of optimal solutions, but the MFO algorithm does not use the information in the flame matrix effectively. The modified update equations are (10) and (11).
S(M_i, F_j) = Average_y · e^{bt} · cos(2πt) + F_j (10)
Average_y = (1/f) · Σ_{y=1}^{f} F_y (11)

where F_y denotes a flame that is better than the flame F_j corresponding to moth i, f is the number of flames that are better than the flame corresponding to moth i, and Average_y denotes the average of those flames, which is the historical best flame average.
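The improved flame-tending step can be sketched as follows. The spiral term is driven by the average of the flames ranked better than the moth's own flame, rather than by the moth-flame distance. Whether the moth's own flame F_j is included in the average is our assumption for the sketch (here it is included, so the average is always well defined); the function name is also ours.

```python
import math
import random

def imfo_update(j, flames, b=1.0):
    """Improved flame-tending step (Eq. (10)): the spiral term uses the
    historical best flame average (Eq. (11)) computed over the flames
    ranked at or above the moth's own flame F_j. `flames` is a list of
    positions (coordinate lists) sorted best-first; moth i is assigned
    flame index j."""
    better = flames[:j + 1]                         # F_y, y = 1..f (F_j included: our assumption)
    f = len(better)
    avg = [sum(col) / f for col in zip(*better)]    # Average_y, Eq. (11)
    t = random.uniform(-1.0, 1.0)
    s = math.exp(b * t) * math.cos(2 * math.pi * t)
    return [a * s + fl for a, fl in zip(avg, flames[j])]
```

Compared with the original Eq. (1), the moth's own position no longer appears in the update: the perturbation amplitude is set by the elite flames collectively, which is what injects flame-matrix information into the search.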

IV. EXPERIMENTAL COMPARISONS ON CLASSICAL BENCHMARK SET
In the study of optimization, it is common to test the performance of an algorithm using benchmark functions. In order to verify the effectiveness of the IMFO algorithm, three sets of benchmark test functions with different characteristics were selected from the literature [18]: unimodal, multimodal, and fixed-dimension multimodal functions. The unimodal benchmark function has only one global optimal solution and no local optimal solutions, which makes it suitable for verifying the exploitation capability and convergence speed of an algorithm. The multimodal benchmark function has multiple local optimal solutions, and the number of local optimal solutions grows exponentially with the number of dimensions, which makes it suitable for verifying the exploration ability of an algorithm. Heuristic algorithms are stochastic optimization techniques, and therefore they have to be run many times to generate meaningful statistical results. In order to better verify the performance of the IMFO algorithm, the population size is set to 40, the maximum number of iterations is set to 500, and each algorithm is run 30 times independently on each test function. The performance of the four algorithms is compared according to the mean and standard deviation of the optimal solutions of the benchmark functions; the statistical results are listed in Tables 4-6, and a comparison of the convergence curves for four functions is shown in Fig. 1. In this paper, Mean and Std represent the mean fitness value and standard deviation, respectively. The best results are denoted in bold characters.
According to the results in Table 4, the IMFO algorithm provides very competitive results. From this table and the convergence curves of the unimodal benchmark functions in Fig. 1, it can be seen that the IMFO algorithm outperforms the other algorithms in overall performance. Therefore, the proposed algorithm performs well at finding the global minimum of unimodal benchmark functions. From Table 5, Table 6, and the convergence curves of the multimodal and fixed-dimension multimodal benchmark functions in Fig. 1, it can be seen that the algorithm proposed in this paper performs better than the other algorithms in most cases. This shows that the IMFO algorithm is more stable and robust than the other algorithms. Overall, the results in Tables 4-6 and Fig. 1 show that the proposed method is effective at optimizing not only unimodal and multimodal functions but also fixed-dimension multimodal functions.

V. PATH PLANNING OF MOBILE ROBOT BASED ON IMFO ALGORITHM
A. FITNESS VALUE FUNCTION
FIGURE 1. The convergence curves for part of the benchmark functions.

In the mobile robot path planning problem, a feasible path represents a solution of the problem. In order to ensure the validity of a path, its coordinate points must not collide with obstacles, the distance between coordinate points should be as short as possible, and the path should be smooth. Therefore, in this paper, path length, smoothness, and safety are used as the three components that measure path quality.

1) SHORTEST PATH LENGTH
In the field of mobile robots, shortest means the smallest total path length of the robot from the start position to the goal position. Therefore, the shortest path from the starting point to the end point through n path points is found by (12):

L = Σ_{i=1}^{n−1} sqrt((x_{i+1} − x_i)² + (y_{i+1} − y_i)²) (12)

where (x_i, y_i) and (x_{i+1}, y_{i+1}) denote the coordinates of the current point and the next adjacent point, respectively.
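The path-length component is a straightforward sum of Euclidean segment lengths over consecutive waypoints, as this short sketch shows:

```python
import math

def path_length(points):
    """Total Euclidean length of a path through its waypoints:
    the sum of the distances between consecutive points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))
```

For example, the two-segment path (0,0) → (3,4) → (3,4) has length 5, since the degenerate second segment contributes nothing.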

2) PATH SMOOTHNESS
As one of the important criteria for measuring path quality, the smoothness term increases the fitness value of a path as its number of turns increases, thus reducing the probability that a tortuous path is selected. The path smoothness formula is shown in (13), where n_1 indicates the number of 45-degree turning points in the path and n_2 indicates the number of 90-degree turning points.
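Counting the 45- and 90-degree turning points on a grid path can be sketched as below. The penalty weights w45 and w90 are illustrative assumptions of ours, not values from the paper (Eq. (13) itself is not reproduced in this text), so treat this as a hedged sketch of the idea rather than the paper's exact formula.

```python
import math

def smoothness_penalty(points, w45=1.0, w90=2.0):
    """Sketch of the smoothness term: count 45- and 90-degree turns
    along a path and weight them. Weights are illustrative only."""
    n45 = n90 = 0
    for p, q, r in zip(points, points[1:], points[2:]):
        v1 = (q[0] - p[0], q[1] - p[1])           # incoming segment
        v2 = (r[0] - q[0], r[1] - q[1])           # outgoing segment
        ang = abs(math.atan2(v1[0] * v2[1] - v1[1] * v2[0],
                             v1[0] * v2[0] + v1[1] * v2[1]))
        deg = math.degrees(ang)                    # turn angle at q
        if abs(deg - 45.0) < 1e-6:
            n45 += 1
        elif abs(deg - 90.0) < 1e-6:
            n90 += 1
    return w45 * n45 + w90 * n90
```

A straight segment contributes nothing, a 45-degree bend contributes w45, and a right-angle bend contributes the larger w90, so winding paths accumulate a larger fitness penalty.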

3) SAFETY
The safety term improves the safety of the path. When a path point falls inside an obstacle, the path is unsafe, so it is discarded and a new path is initialized. In summary, the fitness function used in this paper is shown in (14).

B. PATH PLANNING STEPS FOR MOBILE ROBOTS BASED ON IMFO ALGORITHM
Step 1: Initialize the algorithm parameters: randomly generate N feasible paths X_i = (x_1, x_2, · · · , x_n), i = 1, 2, · · · , N, and set the maximum iteration number T.
Step 2: Calculate the fitness value F of each individual according to the fitness function (14) and rank the individuals in ascending order.
Step 3: Update the number of flames, update the position of each moth according to (10), and calculate the quasi-opposite point OX of each moth according to (9).
Step 4: Calculate the fitness value OF of the quasi-opposite path according to (14); if OF < F, set X = OX.
Step 5: If the termination condition is satisfied, terminate the program and output the optimal solution; otherwise, jump to Step 2 and continue.
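The five steps above can be sketched as a compact planning loop. To keep the sketch self-contained and runnable, it works in an obstacle-free square environment, uses plain path length as the fitness (omitting the smoothness and safety terms of (14)), and simplifies the spiral update of (10) to a spiral toward the current best path; all names and parameter values here are our illustrative assumptions, not the paper's implementation.

```python
import math
import random

def imfo_path_planning(start, goal, n_way=3, pop=20, iters=50, bound=50.0):
    """Minimal sketch of Steps 1-5 in an obstacle-free square [0, bound]^2.
    Each individual is a list of intermediate waypoints; fitness is the
    total path length; the quasi-opposite candidate (Step 3/4) replaces a
    path when it is fitter. Bounds are not clipped in this sketch."""
    def length(ways):                                          # fitness, simplified (14)
        pts = [start] + ways + [goal]
        return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

    def quasi_opposite(ways):                                  # per-coordinate Eq. (9)
        c = bound / 2.0
        return [(random.uniform(*sorted((c, bound - x))),
                 random.uniform(*sorted((c, bound - y)))) for (x, y) in ways]

    swarm = [[(random.uniform(0, bound), random.uniform(0, bound))
              for _ in range(n_way)] for _ in range(pop)]      # Step 1
    for _ in range(iters):                                     # Steps 2-5
        swarm.sort(key=length)                                 # Step 2
        best = swarm[0]
        for i, ways in enumerate(swarm):
            t = random.uniform(-1.0, 1.0)                      # simplified spiral, cf. (10)
            s = math.exp(t) * math.cos(2 * math.pi * t)
            cand = [(bx + s * (bx - x), by + s * (by - y))
                    for (x, y), (bx, by) in zip(ways, best)]
            cand = min([cand, quasi_opposite(cand)], key=length)   # Steps 3-4
            if length(cand) < length(ways):
                swarm[i] = cand
    return min(swarm, key=length)                              # Step 5
```

In the paper's setting, the length-only fitness would be replaced by the full Eq. (14), and candidates crossing obstacles would be discarded and re-initialized as described in the safety term.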

C. EXPERIMENTAL RESULTS AND DISCUSSION
In order to verify the feasibility and effectiveness of the IMFO algorithm for mobile robot path planning, this paper compares it with the PSO algorithm, the MFO algorithm, the DA algorithm and the GWO algorithm in three environments. The maximum number of iterations for the four algorithms is set to 300 and the population size is set to 30. The starting point and end point are randomly generated in each environment, and each algorithm is run 100 times independently in each group of environments.
The simulations were conducted in Matlab 2019b environment on a personal computer which has an Intel Core i7 9750h 2.6GHz processor and 16 GB RAM.
The starting and ending points of the first environment are (43,49) and (35,2), the starting and ending points of the second are (45,12) and (24,47), and the starting and ending points of the third are (48,5) and (2,45). The results of the three simulation environments are shown in Table 7 to Table 9 and Fig.3 to Fig.7.
Figs. 3 to 7 and Tables 7 to 9 demonstrate that the IMFO algorithm can be applied to the mobile robot path planning problem, as can the PSO, DA, and GWO algorithms. Tables 7 to 9 show that, although in some cases the IMFO algorithm needs more iterations to converge than the MFO, DA, and GWO algorithms, its convergence accuracy is higher than that of those three algorithms. According to Fig. 6, in scenario 1 the IMFO algorithm improves by 1.42%, 2.08%, 3.94%, and 1.61% over the MFO, PSO, DA, and GWO algorithms, respectively. In scenario 2, the IMFO algorithm improves by 0.93%, 0.81%, 2.42%, and 1.12%, and in scenario 3 by 0.56%, 0.68%, 1.83%, and 1.02%, over the same four algorithms, respectively. Fig. 7 shows that the average path length generated by the IMFO algorithm is smaller than those of the MFO, PSO, DA, and GWO algorithms, and its path standard deviation is also smaller, indicating that the IMFO algorithm is relatively stable on the path planning problem. To check whether the algorithms are statistically different, a t-test is performed at the 0.05 level of significance. The results indicate that the IMFO algorithm outperforms the others in all test environments, because the p-values are far smaller than the 0.05 level of significance.

VI. CONCLUSION
In this paper, an IMFO algorithm has been proposed to address the shortcomings of the MFO algorithm, such as slow late-stage convergence and a tendency to fall into local optima. First, the update formula of the MFO algorithm was improved by introducing the concept of the historical optimal flame mean, with reference to the SHO algorithm, so that the MFO algorithm can better utilize the information in the flame matrix. Then, the quasi-opposition-based learning strategy was introduced to increase population diversity. Performance tests on 23 benchmark functions verify that the IMFO algorithm improves on the MFO algorithm in stability and convergence accuracy. Finally, the IMFO algorithm was applied to path planning, and its effectiveness was verified by simulation experiments. To the best of our knowledge, few papers apply the MFO algorithm to mobile robot path planning. However, the above experiments also reveal some remaining imperfections, for example how to apply the IMFO algorithm to dynamic path planning and how to further improve its accuracy and stability. In future work, we will further refine the proposed IMFO algorithm and compare the refined algorithm with more swarm intelligence algorithms.