Wild Horse Optimizer-Based Spiral Updating for Feature Selection

Feature selection (FS) is a vital and challenging process in several domains, including data mining, data clustering, text mining, education, biology, medicine, public health, machine learning, and image processing. Greedy and exhaustive search methods cannot identify the best subset as the number of features rises. Thus, swarm-based algorithms are becoming more popular for identifying the best group of features. This study relies on the spiral-updating position of the Whale Optimization Algorithm (WOA) to propose an improved version of the Wild Horse Optimizer (WHO). This improvement enhances the WHO's ability to update solutions and explore various possibilities in the search domain. The proposed method (WHOW) was assessed using two experiments to confirm the efficacy of the improved optimizer. The first experiment was global optimization using the CEC 2019 benchmark functions, whereas the second was FS on 20 benchmark datasets. The results obtained using the proposed WHOW method were compared with those of several popular algorithms on the benchmarks for global optimization and FS. The experimental results reflect the superiority of WHOW in solving different optimization problems and its ability to select prominent features on most benchmark datasets. These results stem from implementing the WOA's bubble-net spiral movement in the WHO, which promotes flexibility and performance.


I. INTRODUCTION
Dimensionality reduction is a primary concern of many data mining algorithms, particularly classification tasks. The massive number of features in a dataset badly affects the performance of classifiers, since many of those features might be unimportant [1]. Therefore, feature selection (FS) is a mandatory preprocessing step [2]. Two families of methods are used for the FS task: filter methods [3] and wrapper methods [4]. This study focuses on wrapper methods, which select a subset of features and evaluate it by the accuracy of an applied classifier. Although wrapper methods are accurate, they are computationally exhaustive.
The goal is to select the minimum number of features that yields the highest classification accuracy. In this regard, swarm-based algorithms are strong candidates for selecting the best features to represent a dataset. These algorithms, which operate in two phases, mimic the behavior of swarms in their search for a food source. In the exploration phase, the population searches the entire problem space. In the exploitation phase, the population iteratively improves its positions to solve the problem. A balance between these two phases is mandatory to avoid getting stuck in local optima. However, as noted in the no-free-lunch (NFL) theorem [5], no single metaheuristic algorithm can solve all problems. For this reason, researchers have implemented many metaheuristic algorithms, which can be classified into gradient-based (GB) methods and non-gradient-based methods. GB methods, including the gradient-based optimizer [6], Newton's method [7], and the Levenberg-Marquardt algorithm [8], depend on the derivative of the objective function. Thus, they converge slowly, with no guarantee of successfully solving various optimization problems [9], [10], [11], [12]. Non-GB metaheuristic (MH) methods, on the other hand, start at random positions, and the search direction is updated through exploration and exploitation. MH algorithms are robust at finding global optimum solutions but require substantial processing power. Various MH algorithms have binary variants for solving the FS problem, such as particle swarm optimization (PSO) [13], [14], [15], ant colony optimization (ACO) [16], the bee colony algorithm (BCA) [17], the ant lion optimizer (ALO) [18], the salp swarm algorithm (SSA) [19], [20], the grasshopper optimization algorithm [21], and moth-flame optimization (MFO) [22], [23].
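As a concrete illustration of the wrapper idea described above, the sketch below evaluates a candidate feature subset by the accuracy of a simple K-NN classifier trained only on the selected columns. This is a minimal NumPy sketch: the function names and the toy data are illustrative, not taken from the paper.

```python
import numpy as np

def knn_accuracy(X_tr, y_tr, X_te, y_te, k=1):
    """Classify each test sample by majority vote among its k nearest training samples."""
    correct = 0
    for x, y in zip(X_te, y_te):
        d = np.linalg.norm(X_tr - x, axis=1)   # Euclidean distances to all training points
        votes = y_tr[np.argsort(d)[:k]]        # labels of the k nearest neighbours
        pred = np.bincount(votes).argmax()     # majority vote
        correct += int(pred == y)
    return correct / len(y_te)

def evaluate_subset(mask, X_tr, y_tr, X_te, y_te):
    """Wrapper evaluation: classifier accuracy using only the selected columns."""
    if not mask.any():                         # an empty subset is not a valid solution
        return 0.0
    return knn_accuracy(X_tr[:, mask], y_tr, X_te[:, mask], y_te)
```

A swarm-based FS method repeatedly calls such an evaluation with different binary masks, which is exactly why wrapper methods are accurate but computationally exhaustive.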
As mentioned earlier, the origins of metaheuristic algorithms differ: some are found in biology, others in physics. The authors in [24] introduced the league championship algorithm based on sports, and the researchers in [25] applied chaotic maps to improve its convergence rate. The law of light reflection inspired the researchers in [26] to develop a new MH called optics-inspired optimization (OIO), which borrows the concepts of convergence and divergence from mirrors. In this algorithm, the surface of the optimized function is a wavy mirror whose peaks and valleys are concave and convex lenses. The populations in OIO are artificial light points whose rays are reflected on the function surface to determine whether a point is a minimum or a maximum. The classical OIO is improved in [27] through the addition of chaos theory, whereby chaotic maps improved the global convergence rate of the OIO and prevented sticking in local optima. The classical OIO also achieved good results in solving the traveling tournament problem, minimizing its parameters, the total movement of teams, and transportation [28]. Light-based optimization algorithms have been assessed and proven efficient relative to other competitive algorithms in solving specific problems [29]. Nevertheless, the NFL theorem [5] suggests that no optimization algorithm can solve all problems, which explains the need to devise new algorithms to cope with emerging problems. The study presented in [30] enumerated many advantages and disadvantages of metaheuristic algorithms; the shortfalls range from slow convergence and sticking in local optima to poor search-space coverage. These drawbacks have prompted researchers to hybridize metaheuristic algorithms or devise new ones. This study addresses the poor search-space coverage of the WHO.
Thus, the study contributes to the literature by proposing a novel version of the WHO algorithm utilizing the spiral-updating position strategy. The proposed algorithm is evaluated using both global optimization and FS problems.
The rest of the paper is organized as follows: Section II presents some of the related works. Section III describes the methods and materials. Section IV introduces the proposed methods. Section V evaluates the proposed methods and discusses the results. The conclusions are presented in Section VI.

II. RELATED WORK
The Wild Horse Optimizer (WHO) algorithm is relatively new and has seen little application in the literature for FS purposes. However, the authors of [31], interested in increasing the efficiency of photovoltaic applications, used WHO for parameter optimization in their problem domain; the parameters generated by WHO were more accurate than those of the other algorithms. WHO has also been used to increase the reliability and stability of optimal radial distribution network systems [32]. Therefore, WHO is useful for optimizing parameters in many problem domains. The study in [33] proposed an improved version of WHO that generates new candidate solutions using the cuckoo search algorithm, which accelerated the WHO's convergence speed. They evaluated their approach on benchmark functions, where the results showed the superiority of the improved WHO.
The classical WHO has been used to optimize the parameters of an application in a parallel-series system known as the reliability redundancy allocation problem [34]. An improved version of the WHO is introduced in [35] to solve global optimization problems; the authors used three operators to boost the exploitation behavior of WHO, leading to a balance between the exploitation and exploration phases. The whale optimization algorithm (WOA) is also useful in the FS research area. The authors in [36] introduced an approach based on WOA and differential evolution (DE) operators [37] to overcome sticking in local optima. Their approach relies on an elite opposition-based learning technique to enhance the initialization phase of WOA, and DE operators are applied at the end of each WOA iteration. The authors compared their approach with other optimization algorithms and two deep learning techniques on FS for Arabic sentiment analysis; the results showed that the hybrid WOA achieved the best accuracy with fewer features. In [38], the quantum concept is unified with WOA for FS. The quantum-bit individual representation and the quantum-gate variation operator enhanced the exploration and exploitation of the WOA. The traditional WOA was compared with the quantum-based WOA, and the latter was applied for wrapper FS; the quantum-based WOA improved the average classification accuracy, average fitness, and average area under the curve. The continuous positions of whales are converted into binary equivalents using eight different transfer functions in [39], [40], and [41]; in some applications the V-shaped transfer function achieved the best performance, whereas the S-shaped one performed better in others. Many researchers, such as [42], focus on enriching the application field of WOA and treating its shortcomings, which are summarized as slow convergence and an imbalance between global and local search capabilities.
The proposed approach in [42] improved the exploration capability of WOA through a hybrid mutation strategy based on the Gaussian and Cauchy mutation operators. The added operators addressed the random learning problem of the WOA exploration phase, and the improved WOA achieved high accuracy with fewer features. The flower pollination algorithm (FPA) [43] is combined with WOA and opposition-based learning (OBL) to offer a new hybrid wrapper FS method [44]. FPA improves the WOA solutions with global and local search processes in a space opposite to the WOA solutions, while OBL ensures the convergence and accuracy of the algorithm. Their approach was applied to spam e-mail detection and compared with other algorithms, achieving the highest classification accuracy.
The sine cosine algorithm (SCA) [45] is combined with WOA and a chaotic logistic map to produce a new enhanced version of WOA [46]. SCA improves the exploitation process of the traditional WOA, which prevents it from sinking into local optima. Similarly, the chaotic maps improve the exploration phase and the diversity of the population. The new WOA is converted into a binary variant using S-shaped and V-shaped transfer functions to act as a wrapper FS algorithm, and it proved superior to other algorithms when applied to the problem of predicting students' performance. Although WHO is a promising new metaheuristic, it suffers from an imbalance between exploitation and exploration, which affects its convergence speed. By contrast, many past and recent studies have proved the efficiency of WOA in global optimization tasks.

III. METHODS
This section briefly explains the WHO and Whale Optimization algorithms.

A. WHO
The WHO [47] mimics the behavior of wild horses. Wild horses are non-territorial. They live in two kinds of groups: family groups of mares (female horses) and their foals, and ''single groups'' of stallions (male horses). Mating happens between the family groups and the single groups. At the beginning of their lives, foals (young horses) care only about grazing. Female foals leave their family group and join other groups, and male foals are known as stallions when they reach puberty. Stallions join the ''single group'' out of decency, in the sense that gathering the stallions in one group prevents incest. Striving for water in dry seasons shows the role of dominant leaders, who can access the water holes while low-dominance members wait for hours. Although mares guide family groups, they are subordinate and must follow a leader selected from among the stallions. The main steps of the WHO are as follows.

1) POPULATION INITIALIZATION AND LEADERSHIP SELECTION
The initial population X with N members is randomized such that X = {x_1, x_2, . . . , x_N}, and the objective function of each member is calculated to form the fitness vector in Equation 1. The population is divided into G groups, where G = N × PS and PS denotes the percentage of stallions within the whole population. Each group has a leader chosen from the stallions, randomly initialized at first; in later stages of the algorithm, the fitness value controls the selection of leaders.
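The initialization step can be sketched as follows. This is a minimal NumPy sketch: `ps` plays the role of PS, and the round-robin group assignment is an illustrative choice, not necessarily the paper's exact scheme.

```python
import numpy as np

def init_population(N, dim, lb, ub, ps=0.2, rng=None):
    """Random WHO initialisation: N horses in [lb, ub]^dim, split into
    G = round(N * ps) groups, each led by one stallion."""
    rng = rng or np.random.default_rng()
    X = lb + rng.random((N, dim)) * (ub - lb)   # candidate positions in the search box
    G = max(1, round(N * ps))                   # number of groups = number of stallions
    groups = np.arange(N) % G                   # assign each horse to a group
    stallions = rng.permutation(N)[:G]          # indices of the (initially random) leaders
    return X, groups, stallions
```

After the first fitness evaluation, the leader of each group would be replaced by its fittest member, as the text describes.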

2) GRAZING BEHAVIOR
The grazing behavior is shown in Equation 2.
where X^j_{i,G} is the current position of a group member, Stallion^j indicates the position of the group leader, the parameter Z is given in Equation 3, R is a uniformly distributed random number in the range [−2, 2] that makes horses graze at various angles (360 degrees) around the group leader, π is taken as 3.14, the cosine of R and π causes movement at different radii, and X̄^j_{i,G} is the updated position of the member.
where iter denotes the current iteration and maxiter denotes the maximum number of iterations.
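Under the reading of Equations 2-4 given above, one grazing move can be sketched as follows. The exact forms of Z and the TDR decay follow the common statement of the original WHO paper and are an assumption here.

```python
import numpy as np

def grazing_step(X_i, stallion, it, maxiter, rng=None):
    """Grazing move (Eq. 2): circle the group leader at a random angle
    with an adaptively shrinking per-dimension step scale Z (Eqs. 3-4)."""
    rng = rng or np.random.default_rng()
    dim = X_i.shape[-1]
    tdr = 1.0 - it / maxiter                            # Eq. 4: shrinks from 1 to 0
    P = rng.random(dim) < tdr                           # which dimensions stay adaptive
    Z = rng.random(dim) * P + rng.random(dim) * ~P      # Eq. 3 (one common reading)
    R = rng.uniform(-2, 2, dim)                         # random angle factor in [-2, 2]
    return 2 * Z * np.cos(2 * np.pi * R * Z) * (stallion - X_i) + stallion
```

Note that when a member already sits on its leader, the move leaves it there, since the update is anchored on the stallion's position.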

3) HORSE MATING BEHAVIOR
Decency and mating behavior are presented in Equation 5.
where X^p_{G,k} denotes the position of horse p, which leaves group k and is replaced by a horse whose parents left groups i and j at puberty; the parents are unrelated, and they mate and reproduce. X^q_{G,i} is the position of foal q, which belongs to group i and, after reaching puberty, mates with horse z at position X^z_{G,j}, which leaves group j.

B. GROUP LEADERSHIP
The group leader must direct the group to an appropriate water hole. Leaders struggle over the water hole so that the dominant group can use it, and the others are not permitted to use it until the dominant group leaves. Equation 6 shows this behavior, as in (6), shown at the bottom of the next page, where Stallion_{G_i} on the left-hand side of (6) is the next position of the leader of group i, WH is the water-hole position, and Stallion_{G_i} on the right-hand side is the current position of the leader of group i.
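Equation 6's two-branch leader move can be sketched as follows. The branch structure follows the description above; the exact form is an assumption based on the original WHO paper's statement of the water-hole update.

```python
import numpy as np

def leader_step(stallion, wh, Z, rng=None):
    """Leader move toward the water hole WH (Eq. 6): circle WH and settle
    there if allowed (rand > 0.5), otherwise be pushed away from it."""
    rng = rng or np.random.default_rng()
    R = rng.uniform(-2, 2, stallion.shape)              # random angle factor
    move = 2 * Z * np.cos(2 * np.pi * R * Z) * (wh - stallion)
    return move + wh if rng.random() > 0.5 else move - wh
```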

1) EXCHANGE AND LEADERSHIP SELECTION
The leaders are selected randomly at the beginning. However, the fittest population is selected as the leader at a later stage of the algorithm. The positions of the leader and the selected member are shown in (7), as shown at the bottom of the next page.

C. WHALE OPTIMIZATION ALGORITHM
The WOA was introduced in [48]. It mimics the foraging behavior of humpback whales, which encircle their prey using a net of bubbles, a strategy called bubble-net feeding. The steps of WOA can be described as follows:

1) INITIALIZATION AND ENCIRCLING PREY BEHAVIOR
The initial population X with N members is randomized such that X = {x_1, x_2, . . . , x_N}; the objective function of each search agent is calculated, and the best agent is denoted X*.
This behavior simulates the encircling of prey by a set of humpback whales. After defining the position of the best search agent, other agents update their positions accordingly, as shown in Equation 8.
where t denotes the current iteration, A and C are two vectors described in Equations 9-10, X* denotes the best search agent obtained at iteration t, · (dot) denotes element-wise multiplication, and | | denotes the absolute value. In Equations 9 and 10, a is a vector decreasing from 2 to 0 over the iterations, and r is a random vector in the range [0, 1].
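Equations 8-10 can be sketched directly in a few lines (a minimal NumPy sketch; the function name is illustrative):

```python
import numpy as np

def encircle(X, X_best, t, maxiter, rng=None):
    """Encircling-prey update (Eqs. 8-10): move each agent toward the best
    agent X*, with |A| shrinking as `a` decays linearly from 2 to 0."""
    rng = rng or np.random.default_rng()
    a = 2.0 * (1.0 - t / maxiter)        # a decays from 2 to 0 over the iterations
    r = rng.random(X.shape)
    A = 2 * a * r - a                    # Eq. 9
    C = 2 * rng.random(X.shape)          # Eq. 10
    D = np.abs(C * X_best - X)           # distance term of Eq. 8
    return X_best - A * D                # position update of Eq. 8
```

As `a` reaches 0 in the final iteration, A vanishes and every agent collapses onto X*, which is the shrinking behavior described next.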

2) EXPLOITATION PHASE
Bubble-net-attacking behavior is divided into the following stages: • Shrinking encircling stage: As humpback whales encircle prey, a shrinking action is achieved by decreasing the value of a from 2 to 0 over all iterations in Equation 9.
• Spiral-updating position stage: The distance between the prey positioned at (X*, Y*) and the whale positioned at (X, Y) is calculated in this stage, as shown in Equation 11. The helix-shaped movement is presented in Equation 12.

Algorithm 1 WHO Pseudo Code
1: Initialize horse populations.
2: Calculate the fitness values.
3: Divide the population into G groups and select a stallion for each group.
4: Select the fittest stallion as Stallion.
5: m = 1
6: while (m <= maxiter) do
7: for each stallion do
8: Calculate Z as in Equation 3.
9: for each foal inside the group do
10: if rand > PC then
11: Update position by Equation 2
12: else
13: Update position by Equation 5
14: end if
15: end for
16: if rand > 0.5 then
17: Update Stallion_{G_i} by the first part of Equation 6
18: else
19: Update Stallion_{G_i} by the second part of Equation 6
20: end if
21: if fitness(Stallion_{G_i}) > fitness(Stallion) then
22: Stallion = Stallion_{G_i}
23: end if
24: Arrange the foals of the group according to fitness values
25: Select the foal with minimum fitness
26: if fitness(foal) < fitness(Stallion) then
27: Exchange the foal and stallion positions according to Equation 7
28: end if
29: end for
30: m = m + 1
31: end while
32: Return the solution with the best fitness
where D′ = |X*(t) − X(t)| is the distance between the ith whale and the best whale position, b is a constant that defines the shape of the logarithmic spiral, l is a random number in the range [−1, 1], and p is the probability that chooses the update mechanism.
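The spiral update of Equations 11-12 reduces to a few lines, assuming the standard WOA form D′ = |X* − X| and X(t+1) = D′ e^{bl} cos(2πl) + X*:

```python
import numpy as np

def spiral_update(X, X_best, b=1.0, rng=None):
    """Spiral-updating position (Eqs. 11-12): move along a logarithmic helix
    around the best agent; b shapes the spiral, l is uniform in [-1, 1]."""
    rng = rng or np.random.default_rng()
    l = rng.uniform(-1.0, 1.0)
    D = np.abs(X_best - X)                                    # Eq. 11: distance to the prey
    return D * np.exp(b * l) * np.cos(2 * np.pi * l) + X_best  # Eq. 12: helix move
```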

3) EXPLORATION PHASE
Contrary to the exploitation phase, the whale position is updated based on the position of a random whale, as shown in Equation 13.
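Equation 13 mirrors the encircling move but targets a randomly chosen agent instead of the best one (a sketch under the standard WOA form; the function name is illustrative):

```python
import numpy as np

def explore(X, X_rand, a, rng=None):
    """Exploration update (Eq. 13): the encircling move applied around a
    random agent X_rand, used when |A| >= 1."""
    rng = rng or np.random.default_rng()
    r = rng.random(X.shape)
    A = 2 * a * r - a                           # Eq. 9 with the current value of a
    C = 2 * rng.random(X.shape)                 # Eq. 10
    return X_rand - A * np.abs(C * X_rand - X)  # Eq. 13
```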
Algorithm 2 WOA Pseudo Code
1: Initialize whale populations.
2: Calculate the fitness values and select the population with the best fitness value as X*
3: while (t <= maxiter) do
4: for each search agent do
5: Update p, l, C, A, a
6: if p < 0.5 then
7: if |A| < 1 then
8: Update position by the first part of Equation 12
9: else if |A| >= 1 then
10: Select a random population X_rand and update position by Equation 13
11: end if
12: else
13: Update the position of the population using the second part of Equation 12
14: end if
15: end for
16: Amend any population that goes outside the search space
17: Calculate the fitness of each population
18: Update X*
19: t = t + 1
20: end while
21: Return the solution with the best fitness X*

IV. PROPOSED METHOD
This section describes the proposed WHOW method. The proposed WHOW improves the optimization behavior of the WHO using the spiral-updating position of the WOA. This improvement enhances the ability of the WHO to update solutions and explore various possibilities in the search domain. The ''spiral-updating position'' stage increases the exploration behavior of the proposed algorithm, thereby increasing its ability to explore more regions of the search space and helping the standard WHO effectively extend the search for global optima in the problem domain. Additionally, calculating the distance between the whale and the prey helps guide the WHOW toward the best area of the search space, mimicking the helix movement of the whale.
In this regard, the ''spiral-updating position'' is applied based on a probability p, calculated using Equation 14, which determines the operators to be used with the current solution: if p < 0.5, the ''spiral-updating position'' is applied; otherwise, the operators of WHO are used.
where f_i is the fitness value of solution i. The first step in the application of WHOW is to determine the values of all parameters and generate the search domain for the initial population. Secondly, to select the best solution, the quality of each solution in the population is evaluated using the fitness function, as in Equation 15.
where γ_E(x) is the error of the classification process (a K-NN classifier is used in this study), the second part defines the ratio of selected features, and α ∈ [0, 1] balances the classification error against the number of selected features. Thirdly, the optimization steps start by exploring the search domain to find an initial optimal solution by evaluating the current population with the fitness function. Finally, the fitness values are computed, and the best one is determined and saved. The sequence of the WHOW is as follows: • Define the values of the global parameters for the experiment.
• Create the population X with D dimension and N size.
• Compute the fitness values of all solutions, and then start the main iterations of the WHOW. This sequence is iterated until the stop condition is reached. The full structure of the WHOW is illustrated in Figure 1. The complexity of the WHOW depends on the dimension of the problem (D), the population size (N), and the number of iterations (maxiter) of the WHO; the WOA operators are applied based on a probability value and add no further complexity. Therefore, the worst-case complexity of the WHOW is O(N × D × maxiter + N × D + N × maxiter).
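The fitness function of Equation 15 can be sketched as follows. Here α = 0.99 is a common default in the FS literature; the paper's exact value is not stated above, so it is an assumption.

```python
import numpy as np

def fs_fitness(mask, error_rate, alpha=0.99):
    """FS fitness (Eq. 15): alpha weights the classification error against
    the fraction of selected features (smaller is better)."""
    ratio = mask.sum() / mask.size          # |selected features| / |all features|
    return alpha * error_rate + (1 - alpha) * ratio
```

Because both terms are minimized jointly, a subset that keeps accuracy high with fewer features scores better than one that merely matches its accuracy.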

V. EXPERIMENTS AND DISCUSSION
In this section, the proposed WHOW is evaluated using two experiments: the first is global optimization based on the CEC2019 benchmark functions, whereas the second is FS using 20 benchmark datasets. The proposed method is compared with WHO [47], PSO [49], GA [50], SCA [45], MVO [51], SSA [19], SMA [52], MPA [53], and WOA [48]. In the experiments, 100 iterations were performed with a population size of 20, and 20,000 function evaluations were applied as a stopping criterion. For statistical purposes, all methods were run 30 times. The experiments used Matlab 2018b on Windows 10 with a Core i5 CPU and 8 GB of RAM. Table 1 shows the parameters of the compared algorithms.
To evaluate the performance of the proposed method, the following measures are used in the experiments: the minimum (Min) and maximum (Max) of the fitness function, as in Equations 16 and 17, respectively, and the standard deviation, as follows:
• Minimum fitness value (Min): the best fitness value (the smallest, for minimization problems) reached by a particular algorithm.
• Maximum fitness value (Max): the worst fitness value (the greatest, for minimization problems) reached by a particular algorithm.
• Standard deviation (STD): the extent of the dispersion of the fitness values from the average (µ) value; the lower the STD, the more robust the method. Here, fit and µ are the value and mean of the fitness function, respectively, and N is the sample size.
Additionally, some measures are used for the FS experiment, namely the classification accuracy (Acc), as in Equation 19, and the number of features selected by each method. Acc measures the ratio of correct predictions among the samples, where TP and TN denote the true positive and negative results of the algorithm, and FP and FN denote the false positive and negative results. Moreover, statistical tests such as the Wilcoxon rank-sum and Friedman tests are performed to validate the proposed algorithm.

Algorithm 3 WHOW Pseudo Code
1: Initialize horse populations and parameters.
2: Calculate the fitness values.
3: Divide the population into G groups and select a stallion for each group.
4: Select the fittest stallion as Stallion.
5: m = 1
6: while (m <= maxiter) do
7: for each stallion do
8: Calculate Z as in Equation 3.
9: for each foal inside the group do
10: if rand > PC then
11: if p < 0.5 then
12: Update position by the spiral movement (Equations 11-12)
13: else
14: Update position by Equation 2
15: end if
16: else
17: Update position by Equation 5
18: end if
19: end for
20: if rand > 0.5 then
21: Update Stallion_{G_i} by the first part of Equation 6
22: else
23: Update Stallion_{G_i} by the second part of Equation 6
24: end if
25: if fitness(Stallion_{G_i}) > fitness(Stallion) then
26: Stallion = Stallion_{G_i}
27: end if
28: Arrange the foals of the group according to fitness values
29: Select the foal with minimum fitness
30: if fitness(foal) < fitness(Stallion) then
31: Exchange the foal and stallion positions according to Equation 7
32: end if
33: end for
34: m = m + 1
35: end while
36: Return the solution with the best fitness
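The reported measures reduce to simple aggregates over the 30 independent runs. This is a minimal sketch; the use of the population standard deviation for Equation 18 is an assumption about the paper's convention.

```python
import numpy as np

def run_statistics(final_fitness):
    """Per-algorithm statistics over the independent runs: best (Min),
    worst (Max), and dispersion (STD) of the final fitness values (Eqs. 16-18)."""
    f = np.asarray(final_fitness, dtype=float)
    return {"Min": f.min(), "Max": f.max(), "STD": f.std()}

def accuracy(tp, tn, fp, fn):
    """Classification accuracy (Eq. 19): correct predictions over all samples."""
    return (tp + tn) / (tp + tn + fp + fn)
```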

A. GLOBAL OPTIMIZATION USING CEC2019
In this section, the proposed WHOW is evaluated using the benchmark test functions CEC2019 [54]. These functions are presented in Table 2.
The results of this experiment are reported in Tables 3-7. The average fitness function values for the 10 CEC2019 test functions are shown in Table 3. The WHOW algorithm showed the best average in 8 of the 10 functions, followed by the WHO algorithm, which realized the optimal values in four functions. Furthermore, the MPA algorithm reached the optimum values in two functions, whereas the SSA and SMA achieved the best values in only one function each. The rest of the competitor algorithms failed to reach the optimum values in any of the ten functions.
The standard deviation results over the 10 test functions are provided in Table 4. As seen in the table, the WHOW is the most stable metaheuristic among the algorithms, achieving the smallest values in 8 functions. The MPA and WHO algorithms rank second and third in terms of stability. In contrast, the PSO is the least stable algorithm, ranking last among all competitor algorithms.
In terms of convergence speed, as shown in Table 5, the SCA algorithm takes the shortest time to converge, followed by the PSO and SSA algorithms. The WHOW ranked fourth, achieving optimal results in three functions (F2, F3, and F10). The SMA, GA, and MVO algorithms are the slowest in terms of convergence. Moreover, Figure 2 illustrates the convergence curves for all methods, and it is clear from the Figure that the proposed WHOW showed the best convergence behavior in most functions.
The minimum values given by each metaheuristic algorithm are shown in Table 6. The table shows that WHOW and WHO each achieved the optimal values in three functions (F2, F3, and F5 for WHOW; F1, F3, and F6 for WHO). The MVO and SSA algorithms ranked second, since each achieved the best values in two functions (F7 and F9 for MVO; F4 and F6 for SSA). The SMA and MPA algorithms achieved optimal values in only one function (F1 and F2, respectively) and thus rank third among the competitor algorithms. The SCA algorithm obtained the lowest rank among all competitor algorithms, since it failed to reach optimal values in six functions (F4-F6 and F8-F10).
The maximum values attained by each competitor algorithm are shown in Table 7. The WHOW ranked first, showing optimal values in 7 of the 10 functions (F2-F5, F7, and F9-F10). The MPA algorithm came second, reaching the best values in four functions (F2-F4 and F10), and the WHO algorithm ranked third, showing optimal values in three functions (F2-F3 and F10). The SCA algorithm obtained the lowest rank, as it failed to reach optimal values in five functions (F6-F10). Table 8 presents the results of the Wilcoxon rank-sum statistical test (i.e., p-values) used to examine the significant differences between the WHOW and its competitors on the CEC2019 benchmark. Table 8 shows that there is a significant difference between the WHOW and the other algorithms in most functions of the CEC2019 benchmark. The results of the Friedman test are shown in Table 9. Similarly, the WHOW outperformed the other algorithms and ranked first in 7 of the 10 functions, followed by MPA and WHO (see Table 9).

B. SOLVING FS PROBLEMS
This section evaluates the proposed WHOW on twenty benchmark datasets [55]; their description is presented in Table 10.

Considering the average fitness function values in Table 11, the WHOW showed the best values in 19 of 20 datasets, followed by the MPA algorithm, which provided the best results in 5 of 20 datasets (wineDS, tic-tac-toeDS, ZooDS, ExactlyDS, and M-of-nDS). The PSO and MVO ranked third, as each provided optimal results on four datasets. The GA and SCA came fourth, performing well on three datasets. The SSA and WOA failed to return the best values in most tested datasets. Moreover, Figure 3 shows the average results for all datasets based on the fitness function values.

Considering the stability of the tested algorithms in Table 12, the WHOW algorithm is the most stable metaheuristic, having the lowest standard deviation in 9 (wineDS, breastcancerDS, LymphographyDS, tic-tac-toeDS, ZooDS, ExactlyDS, Exactly2DS, M-of-nDS, and krvskpDS) of the 20 benchmark datasets. The MPA ranked second, showing stability in seven datasets. In contrast, the PSO and WOA algorithms ranked third, showing the lowest standard deviations in 4 of the 20 benchmark datasets. The SSA, SMA, and WHO algorithms were less stable, achieving the minimum standard deviations in only three datasets.

Table 13 shows the minimum values of each algorithm on the datasets. From the table, the WHO and PSO algorithms achieved the optimal value in 14 of the 20 benchmark datasets. The MPA algorithm ranked second, achieving the best values in 13 datasets. The WHOW and GA algorithms came third, achieving the best values in 12 datasets. MVO and SSA ranked fourth and fifth, respectively. The SMA algorithm ranked last, showing the best value in just 2 of the benchmark datasets.
The maximum values of each competitor algorithm on each benchmark dataset are reported in Table 14. From the table, the WHOW algorithm achieves the best values in 15 datasets, whereas the MPA reached the second rank by achieving the optimal values in 12 datasets. The PSO showed better results in six datasets, ranking third among the competitors, while the SCA, MVO, and WOA algorithms ranked fourth, achieving the best values in four benchmarks. The SMA algorithm ranked last, as it did not show optimal results in any dataset. Table 15 shows the accuracy results of each competitor on the datasets: the WHOW algorithm achieved the highest accuracy in 17 benchmarks; the MPA algorithm ranked second, performing well in 7 datasets; the GA and SCA ranked third, with the PSO and MVO ranking fourth; and the SMA algorithm ranked last in accuracy across all benchmark datasets. Given the computational times shown in Table 16, the WOA algorithm ranked first, with the least computational time in ten benchmark datasets. The SMA ranked second, running fastest in 6 datasets, and the WHOW was fastest in 3 benchmark datasets. The SCA algorithm was fastest in only one dataset, whereas the rest of the competitor algorithms required longer computational times on all tested datasets. Table 17 presents the number of features selected by each competitor algorithm. The table shows that the WHOW selected the fewest features in 13 datasets, followed by the SCA with the fewest features in six benchmarks. The WHO, PSO, SSA, and MPA selected the fewest features in only one benchmark each. The GA, MVO, SMA, and WOA failed to select the fewest features in any of the 20 benchmark datasets. Moreover, Figure 3 shows the average number of selected features for all datasets.
Table 18 presents the results of the Wilcoxon rank-sum statistical test (i.e., p-values) used to examine the significant differences between the WHOW and the other competitors on the FS datasets. The table shows that there is a significant difference between the WHOW and the other algorithms in most datasets; in particular, it outperformed the GA in 35% of the datasets, and the MVO and WHO in 40% of the datasets. Table 19 shows the results of the Friedman test, in which the WHOW outperformed the other algorithms and ranked first in 18 of the 20 datasets. The MPA, PSO, GA, and MVO follow the WHOW, in that order, in the performance ranking.
Generally, the proposed WHOW method outperforms the other optimization algorithms in both experiments because the spiral-updating position strategy helped the WHOW update its solutions within the optimization process. Therefore, it successfully escapes from local optima in most cases and effectively explores the search space. However, the proposed WHOW method has limitations, such as its computational time, where it could not outperform the other methods in more than 50% of the datasets. Therefore, this study recommends that the complexity of the proposed method be explored and improved in future studies.

VI. CONCLUSION AND FUTURE RESEARCH DIRECTIONS
In this study, we proposed a novel WHOW by integrating the spiral-movement strategy of the WOA. This enhances the local exploitation ability of the WHOW, thereby balancing intensification and diversification. We evaluated the accuracy of the WHOW algorithm using the CEC2019 test functions and compared the results with those of other popular swarm-based algorithms; the results showed that the WHOW algorithm has an excellent ability to solve test-function problems. We also used 20 benchmark datasets from various fields to further examine the efficiency of the proposed WHOW algorithm for FS. The results showed that the proposed WHOW algorithm performed well in selecting the most prominent subset of features. Thus, the simplicity of WHOW and its ability to deliver efficient and effective results showed that the proposed WHOW algorithm is an excellent improved version of the ordinary WHO with enhanced exploration behavior. Future work should evaluate the proposed method in multi-objective optimization and parameter estimation. It could also be tested in image segmentation and in solving real applications, such as engineering problems and predicting potential diseases.