EAOA: An Enhanced Archimedes Optimization Algorithm for Feature Selection in Classification

Feature selection plays a crucial role in mitigating the high dimensionality of the feature space in different classification problems. Reducing the dimension of the feature space lowers the computational cost and improves classification accuracy. Hence, in the classification task, finding the optimal subset of features is of utmost importance. Metaheuristic techniques have proved their efficacy in solving many real-world optimization problems. One of the recently introduced physics-inspired optimization methods is the Archimedes Optimization Algorithm (AOA). This paper proposes an Enhanced Archimedes Optimization Algorithm (EAOA) by adding a new parameter that depends on the step length of each individual while revising the individual location. The EAOA algorithm is proposed to improve the AOA exploration/exploitation balance and enhance classification performance for the feature selection problem on real-world data sets. Experiments were performed on twenty-three standard benchmark functions and sixteen real-world data sets to investigate the performance of the proposed EAOA algorithm. The experimental results on the standard benchmark functions show that the EAOA algorithm provides very competitive results compared to the basic AOA algorithm and five well-known optimization algorithms in terms of improved exploitation, exploration, local optima avoidance, and convergence rate. In addition, the results on the sixteen real-world data sets confirm that the reduced feature subset yields higher classification performance when compared with the other feature selection methods.


I. INTRODUCTION
Classification is a prominent task in data mining and machine learning that focuses on systematically categorizing each object in the data set based on its features [1]. It is not easy to ascertain which features are beneficial without prior knowledge. As a result, numerous features are introduced into the data set, comprising redundant, irrelevant, and relevant features. Irrelevant and redundant features degrade the performance of the classifier because of the enormous search space, termed ''the curse of dimensionality''. Feature selection can solve this problem by selecting only significant features for the classification task. Thus, feature selection can improve the classifier's performance by reducing the number of features, simplifying the learned classification model, curtailing training time, and aiding data visualization and understanding. (The associate editor coordinating the review of this manuscript and approving it for publication was Shadi Alawneh.)
There are four crucial phases in feature selection: a generation process that produces candidate feature subsets; a search technique, essential to find the most favorable candidate subsets, together with an evaluation process employed to determine the most suitable subset among them [2]; a stopping criterion adopted afterward; and a validation process exploited to test the validity of the selected subset. Choosing an optimal subset from a large number of features is a considerable challenge. Various search strategies exist to address it, namely heuristic, complete, and random search strategies. These strategies suffer from certain limitations: high computational cost, multimodality of the search landscape, growing problem dimensionality, and convergence towards local minima, to mention a few. Feature selection methods suffer from overfitting and substantial computational time when the number of features is large, and classification errors may occur because an overfitted model can mistake slight fluctuations for vital variance in the data. Metaheuristic methods have produced outstanding outcomes on different real-life optimization problems [3]. These methods are simple, easy to implement, and can evade local minima. In such methods, striking the balance between exploitation and exploration is key: the successful techniques are those that uphold this balance across a huge range of optimization problems. New optimization methods derived from physics, mathematics, chemistry, and their powerful phenomena have been introduced, besides those inspired by animals and insects.
VOLUME 9, 2021. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
The essential techniques inspired by physics include Henry gas solubility optimization, based on Henry's law [4]; charged system search, based on Coulomb's and Newton's laws [5]; ray optimization, based on ray theory [6]; and the gravitational search algorithm, based on gravitational theory [7]. The laws of physics have already yielded state-of-the-art performance when formulated into optimization methods. Hashim et al. [8] devised the Archimedes Optimization Algorithm (AOA), based on Archimedes' principle from the physics domain. AOA is a population-based metaheuristic method; population-based methods perform the search from several initial points, in a style analogous to swarm-based metaheuristics. Some metaheuristic algorithms have effectively solved several optimization problems, while other methods have been utilized to enhance classifiers' performance.
Archimedes' principle expresses the law of buoyancy [8]. It describes the relationship between the buoyant force and an object submerged in fluid: the upward buoyant force exerted on the object is equal to the weight of the displaced fluid. If the weight of the displaced fluid is less than the object's weight, the object sinks; if the weights of the object and the displaced fluid are equal, the object floats. In AOA, the population individuals are objects submerged in a fluid. Each object has an acceleration, density, and volume, which are the vital factors in its buoyancy. The conception of AOA is to reach a state in which the objects are neutrally buoyant, i.e., the net force from the fluid is zero. AOA handles complex optimization problems with numerous local optima well, since investigating a population of solutions allows the global best solution to be found in a broader area.
We now review some original or modified optimization methods for the feature selection problem. Liu et al. [9] introduced an improved feature selection (IFS) technique utilizing support vector machines with the F-score method and a modified Multi-Swarm Particle Swarm Optimization (MSPSO); higher generalization capability was achieved by performing feature selection and kernel parameter tuning simultaneously. To tackle the feature selection problem, Al-Tashi et al. [10] hybridized grey wolf optimization (GWO) and particle swarm optimization (PSO); the wrapper-based K-nearest neighbors classification method with the Euclidean distance metric was applied to find the best feature subset, and the method was assessed using 18 standard benchmark data sets. Neggaz et al. [11] introduced a new technique based on Henry gas solubility optimization (HGSO) to choose key features and improve classification accuracy. They employed various data sets with feature sizes ranging from small to massive, evaluated against well-known recent metaheuristic algorithms, and performed Wilcoxon's rank-sum non-parametric statistical test to compare the proposed method with the others statistically. Based on a graph clustering approach and ant colony optimization, a novel feature selection technique was devised by Moradi et al. [12]. Their method worked in three stages: in the first stage, the complete feature set was represented as a graph; in the next stage, a community detection method was employed to separate the features into various clusters; and in the last stage, the ACO technique was applied to select the final subset of features using a new search strategy. To tackle the same feature selection problem, a hybrid optimization method termed SSAPSO was proposed by Ibrahim et al. [13] by integrating PSO with the salp swarm algorithm (SSA), enhancing the efficacy of the exploitation and exploration steps.
The empirical results provided evidence that accuracy and performance were improved without affecting the computational effort. The same issue was addressed by Too et al. [14] by introducing a binary form of Harris hawk optimization; to convert the continuous variables into binary ones, the proposed method was equipped with a V-shaped or S-shaped transfer function. Quadratic binary Harris hawk optimization, another variant, was applied to enhance the performance of the earlier version.
In the literature, different metaheuristic algorithms have been proposed to improve performance on the feature selection problem. However, as far as the authors are aware, this is the first time the AOA algorithm or a modified version of it has been proposed to solve the feature selection problem. In this paper, EAOA is utilized to improve the AOA exploration/exploitation balance and enhance classification performance on real-world data sets through optimized feature selection.
The main contributions of this work are summarized as follows: 1. A new Enhanced Archimedes Optimization Algorithm (EAOA) is proposed to enhance the optimization efficiency and accuracy of AOA by adding a new parameter that depends on the step length of each individual while revising the individual location.
2. The proposed EAOA method is tested on twenty-three benchmark functions. 3. The Wilcoxon Rank-sum significance test is applied to present a fair comparison. The results provide evidence that the suggested EAOA escapes local optima, provides a better convergence rate with an enhanced balance between the exploration and exploitation processes, and performs in a statistically significant manner compared to the basic AOA and five well-known optimization algorithms.
4. As a potential application, EAOA is used as a feature selection algorithm, and sixteen real-world data sets are employed to test the performance of the proposed EAOA algorithm in selecting the optimal subset of features from the training data; the results of the same features on the test set are compared with some well-known optimization algorithms.
5. EAOA is used to reduce the features, and learning is performed using the Support Vector Machine (SVM) classifier in all experiments. The Wilcoxon signed-rank test is also applied, and the results show that the proposed feature selection method statistically outperforms the AOA algorithm and other comparable optimization techniques in terms of the accuracy, G-Mean, and F-Measure metrics. Also, EAOA yields superior performance to the basic AOA and the comparable feature selection methods in terms of Runtime, AUC, specificity, precision, and sensitivity for almost all of the real-world data sets used.
All experimental results show that the proposed algorithm has a better ability than other state-of-the-art algorithms to choose the optimal subset of features, reducing the computational complexity (Runtime and space) by reducing the dimension of the feature space while improving classification performance.

II. MATERIALS AND METHODS
A. ARCHIMEDES OPTIMIZATION ALGORITHM (AOA) [8] The Archimedes Optimization Algorithm (AOA) is a population-based algorithm in which the population individuals are immersed objects. Like other population-based algorithms, AOA begins the search with an initial population of individuals with random accelerations, densities, and volumes. AOA first evaluates the fitness of the initial population and then iterates until the termination condition is met. In each iteration, every individual's density and volume are updated. Its acceleration is then revised depending on whether it collides with a neighboring individual. The updated volume, density, and acceleration determine the new location of the individual. The AOA steps are elaborated mathematically below.
Step 1-Initialization: Initialize the positions of all individuals using (1), where x_i is the i-th individual in a population of N individuals, ll_i and ul_i are the lower and upper limits of the search space, and rand is a D-dimensional vector of numbers drawn uniformly at random from [0, 1]. Initialize the density (d) and volume (v) of each individual using (2), and finally initialize the acceleration (ac) of the i-th object using (3). In this stage, evaluate the initial population and choose the individual with the best fitness value. Assign x_best, d_best, v_best, and ac_best, where x_best is the best individual found and d_best, v_best, and ac_best are the density, volume, and acceleration associated with it.
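As a minimal NumPy sketch of this initialization step, assuming the standard AOA formulation from Hashim et al. [8] (uniform random positions, densities, volumes, and accelerations); the sphere objective is a stand-in fitness function, and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)

N, D = 20, 5                 # population size, problem dimension
ll, ul = -100.0, 100.0       # lower/upper limits of the search space

# Eq. (1): random positions within [ll, ul]
x = ll + rng.random((N, D)) * (ul - ll)

# Eq. (2): random initial densities and volumes in [0, 1]
d = rng.random((N, D))
v = rng.random((N, D))

# Eq. (3): random initial accelerations within the search range
ac = ll + rng.random((N, D)) * (ul - ll)

# Evaluate fitness (sphere function as a stand-in) and pick the best individual
fitness = np.sum(x**2, axis=1)
best = np.argmin(fitness)
x_best, d_best, v_best, ac_best = x[best], d[best], v[best], ac[best]
```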
Step 2-Update Volumes and Densities: The volume and density of individual i for iteration t + 1 are updated using (4), where d_best and v_best are the density and volume associated with the best individual found so far, and rand is a uniformly distributed random number.
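The update in (4) pulls each individual's density and volume toward those of the best individual found so far. A hedged sketch, assuming the standard AOA form d_i^(t+1) = d_i^t + rand x (d_best - d_i^t) and likewise for volume:

```python
import numpy as np

rng = np.random.default_rng(0)

d = rng.random((20, 5))          # current densities
v = rng.random((20, 5))          # current volumes
d_best = rng.random(5)           # density of the best individual so far
v_best = rng.random(5)           # volume of the best individual so far

# Eq. (4): move each individual's density/volume a random fraction
# of the way toward the best individual's values
d_new = d + rng.random((20, 5)) * (d_best - d)
v_new = v + rng.random((20, 5)) * (v_best - v)
```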
Step 3-Density Factor and Transfer Operator: At the beginning, the individuals collide with each other; after a while, they attempt to reach an equilibrium state. In AOA, this behavior is modeled with the help of the transfer operator TO, which shifts the search from exploration to exploitation and is defined using (5), where TO increases progressively with time until it reaches one, t is the current iteration, and t_max is the maximum number of iterations. Likewise, the density decreasing factor (ddf) helps AOA move from global to local search. It decreases over time according to (6); ddf^(t+1) declines as time passes, which gives the algorithm the ability to converge toward previously identified promising regions.
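Assuming the TO and ddf definitions from the original AOA paper (an exponential ramp toward one and an exponential decay toward zero, respectively), the two schedules can be sketched as:

```python
import math

def transfer_operator(t, t_max):
    # Eq. (5), assumed form: rises smoothly from exp(-1) toward 1 as t -> t_max
    return math.exp((t - t_max) / t_max)

def density_decreasing_factor(t, t_max):
    # Eq. (6), assumed form: decays over time, shrinking the search step
    return math.exp((t_max - t) / t_max) - t / t_max

t_max = 1000
to_start = transfer_operator(1, t_max)
to_end = transfer_operator(t_max, t_max)          # reaches exactly 1
ddf_start = density_decreasing_factor(1, t_max)   # large early value
ddf_end = density_decreasing_factor(t_max, t_max) # decays to 0
```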
It is worth noting that appropriate handling of this variable settles the equilibrium between exploitation and exploration in the AOA algorithm.
Step 4.1-Exploration Phase (Collision Between Individuals): If TO <= 0.5, a collision between individuals occurs. Choose a random material and update the object's acceleration for iteration t + 1 using (7), where d_i, v_i, and ac_i are the density, volume, and acceleration of the i-th individual, and d_mr, v_mr, and ac_mr are the density, volume, and acceleration of the random material. It is significant to note that the condition TO <= 0.5 guarantees exploration during roughly the first third of the iterations; using a value other than 0.5 alters the exploration/exploitation behavior.
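Assuming (7) takes the form used in the original AOA paper, in which the random material's buoyancy terms are divided by the individual's updated density and volume, the collision update can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(1)

N, D = 20, 5
d_new = rng.random((N, D)) + 0.1   # updated densities (kept away from zero)
v_new = rng.random((N, D)) + 0.1   # updated volumes
ac = rng.random((N, D))            # current accelerations

# Pick a random "material" (another individual) for the collision
mr = rng.integers(0, N)
d_mr, v_mr, ac_mr = d_new[mr], v_new[mr], ac[mr]

# Eq. (7), assumed form: acceleration after collision with the random material
ac_new = (d_mr + v_mr * ac_mr) / (d_new * v_new)
```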

Step 4.2-Exploitation Phase (No Collision Between Individuals):
If TO > 0.5, there is no collision among individuals; update the individual's acceleration for iteration t + 1 using (8). Step 4.3-Normalize Acceleration: Using equation (9), the acceleration is normalized to compute the proportion of change, where u = 0.9 and l = 0.1 define the range of normalization.
The normalized acceleration ac_(i,norm)^(t+1) represents the fraction of the step that each agent will take. If individual i is far from the global optimum, its acceleration value will be high, meaning the individual remains in the exploration phase; otherwise, it is in the exploitation phase. This is how the search transitions from exploration to exploitation. Normally, the acceleration factor starts with a large value that decreases over time, which drives the search agents toward the globally best solutions and away from local ones. Nevertheless, it is significant that a small number of search agents may require more time in the exploration phase than usual. In this way, AOA achieves a balance between exploitation and exploration.
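Assuming (9) is the usual min-max normalization with the stated range u = 0.9 and l = 0.1, so that every step fraction falls in [0.1, 1.0], a short sketch:

```python
import numpy as np

rng = np.random.default_rng(2)

ac = rng.normal(size=(20, 5))      # raw accelerations after Eq. (7)/(8)
u, l = 0.9, 0.1                    # normalization range from the paper

# Eq. (9), assumed form: min-max rescale so the smallest acceleration
# maps to l and the largest maps to u + l
ac_norm = u * (ac - ac.min()) / (ac.max() - ac.min()) + l
```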
Step 5-Update Position: If TO <= 0.5 (exploration phase), the i-th individual's location for the following iteration t + 1 is computed using (10), where Con_1 is a constant equal to 2. Otherwise, if TO > 0.5 (exploitation phase), the individuals revise their locations using (11), where Con_2 is a constant equal to 6 and T increases in proportion to the transfer operator, defined as T = Con_3 x TO, with Con_3 a constant equal to 2. T increases gradually with time within the range [Con_3 x 0.3, 1] and takes a certain proportion of the best location. It begins with a small proportion, which makes the step size of the random walk large, as is the difference between the best and current positions. As the search proceeds, this proportion increases progressively to reduce the difference between the best and current positions and thereby reach a suitable equilibrium between exploration and exploitation. K is a flag that changes the direction of motion, computed using (12), where q = 2 x rand - Con_4 and Con_4 is a constant equal to 0.5.
Step 6-Evaluation: Assess every object using the objective function and remember the best result found so far. Assign x_best, d_best, v_best, and ac_best accordingly.
The complete AOA procedure can be summarized as follows:
Procedure AOA
  Initialize the positions, densities, volumes, and accelerations of all individuals using (1)-(3).
  Evaluate the initial population and choose the individual with the best fitness value.
  Set t = 1.
  While t <= t_max
    For every individual i
      Revise the volume and density of individual i using equation (4).
      Revise the transfer and density declining factors TO and ddf using equations (5) and (6), respectively.
      If TO <= 0.5, update the acceleration using (7); otherwise, update it using (8).
      Normalize the acceleration using (9) and update the position using (10) or (11).
    End For
    Evaluate every individual and select the one with the best fitness value.
    Set t = t + 1.
  End While
  Return the individual with the best fitness value.
End Procedure
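The two position updates of Step 5 can be sketched as follows, assuming the standard AOA forms of (10)-(12); the sample values for ac_norm, ddf, and TO are arbitrary illustrations, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(3)

D = 5
x_i = rng.uniform(-100, 100, D)      # current position
x_rand = rng.uniform(-100, 100, D)   # position of a random individual
x_best = rng.uniform(-1, 1, D)       # best position found so far
ac_norm = 0.5                        # normalized acceleration (Eq. 9)
ddf = 1.5                            # density decreasing factor (Eq. 6)
TO = 0.7                             # transfer operator (Eq. 5)
C1, C2, C3, C4 = 2, 6, 2, 0.5        # Con_1 .. Con_4 from the paper

if TO <= 0.5:
    # Eq. (10), exploration: move relative to a random individual
    x_new = x_i + C1 * rng.random() * ac_norm * ddf * (x_rand - x_i)
else:
    # Eq. (12): flag K flips the direction of motion via q
    q = 2 * rng.random() - C4
    K = 1 if q <= 0.5 else -1
    # Eq. (11), exploitation: move around the best, with T = C3 * TO
    T = C3 * TO
    x_new = x_best + K * C2 * rng.random() * ac_norm * ddf * (T * x_best - x_i)
```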

B. ENHANCED ARCHIMEDES OPTIMIZATION ALGORITHM (EAOA) AND ITS ADAPTION FOR FEATURE SELECTION
In the AOA algorithm, the individuals update their locations based on the revised densities, volumes, and accelerations. The EAOA algorithm is proposed to enhance the basic AOA performance by adding a new parameter Mu that depends on the step length of each individual. In the exploitation phase, the parameter Mu and the upper and lower limits of the search space are employed to enhance the balance between exploration and exploitation of the search space while revising the individual location over the course of iterations. Algorithm 1 shows the pseudocode of the EAOA optimization method.
The performance of EAOA is validated using 23 benchmark functions, and EAOA is then utilized for solving feature selection problems on real-world data sets. The proposed EAOA escapes local optima when tested on the benchmark functions and presents an improved convergence rate with an enhanced exploration/exploitation balance compared to the basic AOA algorithm, Ant Lion Optimizer (ALO) [15], Grey Wolf Optimizer (GWO) [16], Whale Optimization Algorithm (WOA) [17], Particle Swarm Optimization (PSO) [18], and Giza Pyramids Construction (GPC) [19].
Moreover, the EAOA method is applied for feature selection on different real-world data sets. The application of EAOA to the feature selection problem is performed by selecting the optimal subset of features from the training set using the enhanced optimization capability of the proposed EAOA, which starts its search with randomly generated search individuals. The proposed algorithm uses binary encoding as its representation scheme for selecting the optimal features. In this encoding, a binary array represents each search individual, and each feature is treated as either kept (1) or removed (0); the 1s represent the retained features, while the 0s represent the features deleted from the training set. The G-mean of each search individual in EAOA, defined in Equation (17), is evaluated as its fitness value. Figure 1 depicts the flowchart of the EAOA method and its application to the feature selection problem for the real-world data sets. As shown in the figure, the original data set is divided into training, validation, and testing subsets. The proposed EAOA searches for an optimal feature subset on the training set, which is then tested using the validation set. When the termination criteria are satisfied, the search is assumed to have converged, and the search individual with the highest fitness value is decoded into the solution, which consists of a reduced feature set.
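The binary encoding described above can be sketched as follows; the helper names (decode, g_mean) are illustrative assumptions, and the classifier training and validation loop are omitted:

```python
import numpy as np

rng = np.random.default_rng(4)

n_features = 10
# Binary encoding: True (1) keeps a feature, False (0) removes it
individual = rng.random(n_features) > 0.5

def decode(X, mask):
    """Return the data matrix restricted to the selected features."""
    return X[:, mask]

def g_mean(sensitivity, specificity):
    # Fitness used by EAOA (Eq. 17): geometric mean of per-class recalls
    return (sensitivity * specificity) ** 0.5

# Hypothetical training matrix; in the paper, an SVM is trained on the
# reduced matrix and its G-mean on the validation set is the fitness
X_train = rng.normal(size=(30, n_features))
X_reduced = decode(X_train, individual)
```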
To assess the performance of the EAOA feature selection method, experiments are conducted, and the results are compared with state-of-the-art feature selection methods. A Support Vector Machine (SVM) is used in our experiments to measure the classification performance of the reduced data resulting from the EAOA feature selection method, using numerous evaluation metrics as defined in Eqs. (13)-(18).
For the evaluation metrics, let TN represent true negatives, TP true positives, FN false negatives, and FP false positives [20]-[22].
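Eqs. (13)-(18) are not reproduced here, but the metrics named in this paper have standard definitions in terms of these four counts; a sketch under that assumption:

```python
def metrics(tp, tn, fp, fn):
    """Standard classification metrics from confusion-matrix counts."""
    accuracy    = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)          # recall on the positive class
    specificity = tn / (tn + fp)          # recall on the negative class
    precision   = tp / (tp + fp)
    f_measure   = 2 * precision * sensitivity / (precision + sensitivity)
    g_mean      = (sensitivity * specificity) ** 0.5
    return accuracy, sensitivity, specificity, precision, f_measure, g_mean

# Example with hypothetical counts
acc, sens, spec, prec, f1, gm = metrics(tp=40, tn=30, fp=10, fn=20)
```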
The area under the ROC (Receiver Operating Characteristics) curve (AUC) is a performance measurement for classification problems at various threshold settings. The ROC is a probability curve, and the AUC measures the degree of separability between classes.

III. RESULTS AND DISCUSSION
In this section, experiments were performed on twenty-three benchmark functions and sixteen real-world data sets to investigate the performance of the proposed EAOA algorithm.

A. RESULTS OF EAOA ON BENCHMARK FUNCTIONS
Benchmark functions can be divided into three groups: (1) unimodal functions (F 1 − F 7 ), which have one global optimum and can evaluate the exploitation ability and convergence of an algorithm; (2) multimodal functions (F 8 − F 13 ), which have many local optima and can test the exploration ability and local optima avoidance of an algorithm; (3) fixed-dimensional multimodal functions (F 14 − F 23 ), which can test the consistency of an algorithm in finding a global optimum solution.
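As an illustration of the three groups, F1 in suites of this kind is typically the sphere function (unimodal), and the multimodal group commonly contains functions such as Rastrigin's; both have a global optimum of 0 at the origin. The exact composition of this paper's suite is assumed here:

```python
import numpy as np

def sphere(x):
    """Typical F1 (unimodal): a single global optimum of 0 at the origin."""
    return np.sum(x**2)

def rastrigin(x):
    """A common multimodal benchmark: many local optima, global optimum 0."""
    return np.sum(x**2 - 10 * np.cos(2 * np.pi * x) + 10)

zero = np.zeros(30)  # the known optimum of both functions
```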
The details of these functions are presented in Tables 1, 2, and 3. In these tables, V _no is the number of variables, Range represents the range of variation of variables, and F min is the optimum solution reported in the literature. Figure 2 shows a 2D view of selected benchmark functions considered in this paper.
To evaluate the performance of the proposed EAOA algorithm, it is compared to the basic AOA algorithm and five well-known and recent algorithms. These algorithms are Ant Lion Optimizer (ALO), Grey Wolf Optimizer (GWO), Whale Optimization Algorithm (WOA), Particle Swarm Optimization (PSO), and Giza Pyramids Construction (GPC). Each algorithm was run 30 independent times on each benchmark function, and the average and standard deviation of the best optimal solution were reported. For a fair comparison, all experiments were carried out on a 64-bit Windows 10 system with an Intel(R) Core(TM) i7, 16 GB memory, 2.40 GHz CPU, and Matlab R2018a. The population size (N) and maximum number of iterations (t max ) of each algorithm were chosen to be 20 and 1000, respectively. The other parameters are shown in Table 4.
Generally, all metaheuristic algorithms face some challenges due to their stochastic nature and design, and failing to solve any of these challenges produces algorithmic limitations. Some of these limitations are getting trapped in confined areas of the search space, exploration/exploitation imbalance, and high computational cost. The main limitation of the AOA algorithm is its computational cost; since the proposed EAOA is a modification of AOA, it shares this limitation. Several techniques for parameter sensitivity analysis in metaheuristic algorithms can be applied to address the computational cost problem, and the AOA authors performed such a sensitivity analysis to provide general configuration guidance for the AOA control variables Con_1 to Con_4. In our modification, we use the same AOA parameter values in order to focus on the limitation we address in this work: improving the exploration/exploitation balance. The statistical results (average (Ave) and standard deviation (Std)) obtained by each algorithm are shown in Table 5. This table shows that the proposed EAOA algorithm is very competitive with the basic AOA algorithm, ALO, GWO, WOA, PSO, and GPC on most functions. For the unimodal functions, the EAOA algorithm achieves the exact optimum for functions F1 and F3 and the best optimal solutions for functions F2 and F4 in terms of average and standard deviation. For function F5, the PSO algorithm obtains the best average optimal value, but the EAOA algorithm has the best standard deviation. In the case of function F7, the EAOA algorithm obtains the second-best average optimal value. These results show that the EAOA algorithm has better exploitation capability. For the multimodal and fixed-dimensional multimodal functions, the EAOA algorithm obtains the exact optimum value for functions F9 and F11.
Moreover, the EAOA algorithm achieves the best optimal solutions for functions F10 and F20 in terms of average and standard deviation and provides the best average optimal value for functions F8, F16, F18, and F20. The EAOA algorithm also achieves the second-best average optimal value for functions F22 and F23. It is observed from these results that the EAOA algorithm has better exploration capability than the other algorithms. Figure 3 shows the convergence curves of the algorithms for six benchmark functions. As illustrated, the convergence speed of the proposed EAOA algorithm towards the optimum is faster than that of the other algorithms only in the final iterations for functions F1, F3, and F10, while Figure 3 also shows that the proposed EAOA algorithm provides a better convergence rate from the initial iterations for the other plotted functions. In order to validate the results statistically, the non-parametric Wilcoxon Rank-sum test [23] was used to make a meaningful comparison between the proposed EAOA algorithm and the other optimizers. The calculated p-values of the Wilcoxon Rank-sum test are shown in Table 6. Each p-value was compared against a significance level of 0.05.
The NaN entries in Table 6 mean ''Not a Number'' returned by the Wilcoxon Rank-sum test. The symbols ∧, ∨, and ≈ respectively indicate that the proposed EAOA algorithm is significantly better than, significantly inferior to, and statistically similar to the other compared state-of-the-art optimizers. According to Table 6, the EAOA algorithm is significantly better than AOA and GPC for seventeen functions, significantly inferior for three and two functions, respectively, and statistically similar for three and four functions, respectively. The EAOA algorithm is also significantly better than GWO and WOA for twelve functions, significantly inferior for nine and six functions, respectively, and statistically similar for two and five functions, respectively. Moreover, the EAOA algorithm is significantly better than ALO for ten functions, significantly inferior for nine functions, and statistically similar for four functions, while the EAOA algorithm is significantly better than PSO for nine functions, significantly inferior for twelve functions, and statistically similar for two functions. The preceding results show that the EAOA algorithm performs in a statistically significant and comparatively better manner than the other state-of-the-art optimizers.

B. RESULTS OF EAOA ON REAL-WORLD DATA SET
In this subsection, experiments were conducted on 16 real-world data sets with different numbers of instances and attributes. These data sets were chosen from the UCI machine learning repository [24]. Their classification results obtained using some state-of-the-art methods were compared with the proposed method's results to assess its classification performance. Table 7 shows the characteristics of the real-world data sets.
The instances of each data set were divided randomly in all experiments into three sets: training, test, and validation sets. The proposed EAOA algorithm was applied to perform feature selection on each data set, where the SVM classifier was used to evaluate the selected features. 30 independent runs of the algorithm were performed to measure the performance based on the accuracy, G-Mean, F-Measure, and Runtime metrics for each data set; the average of each metric was then recorded, along with the number of selected features (# features) for each method. Figure 4 shows the number of features obtained by the compared methods EAOA, AOA, ALO, GWO, and WOA for each data set. Table 8 presents the performances of the compared feature selection methods on the real-world data sets; the results of the original data set (Or) with full features are provided in the first column for reference. The results in Table 8 clearly show that EAOA outperforms the compared methods in terms of classification accuracy on all data sets except the ''Vowel(2-others)'' data, on which only the AOA method gained the best accuracy. On the G-Mean and F-Measure metrics, our proposed feature selection method also achieved the best values for fourteen out of the sixteen data sets, demonstrating the stability of our proposal against the other feature selection methods. The results in Table 8 also clearly indicate that all the feature selection methods reduced the time consumed for classification; the runtime of the proposed EAOA is less than that of the AOA algorithm for almost all data sets, and EAOA saves more time than the other compared algorithms. It can also be noticed from Table 8 that, even while decreasing the number of features, as in the ''yeast-2_vs_4'' data, EAOA maintains the same performance obtained with the full feature set, while for most other data sets EAOA gains better performance than that obtained with the full features.
From the preceding results, we can conclude that the EAOA feature selection method achieves better performance regarding the computational complexity (Runtime and space) than the other methods, since it had the lowest Runtime and the fewest features while guaranteeing the best performance.

To perform the comparison, the Wilcoxon signed-rank test (a paired-difference, two-sided signed-rank test), which is a non-parametric statistical hypothesis test [25], was used to apply a statistical significance analysis and fairly derive strong conclusions. For each pair of compared methods, the differences over the data sets were calculated, and their absolute values were ranked from smallest ('1') to largest ('16'); '+' or '−' signs were subsequently assigned to the ranks according to the signs of the corresponding differences. R+ and R− were then obtained by summing the positive and negative ranks separately. The T value, where T = min{R+, R−}, was compared against a critical value equal to 29 for 16 data sets at a significance level α = 0.05. The null hypothesis was that all performance differences between any two compared methods might occur by chance, and it was rejected only if the T value was ≤ the critical value 29.
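The ranking procedure described above can be sketched as follows; this is a simplified illustration with hypothetical accuracy values, and ties among the absolute differences, which would require average ranks, are not handled:

```python
import numpy as np

def signed_rank_T(scores_a, scores_b):
    """Wilcoxon signed-rank statistic T = min(R+, R-) as described in the text."""
    diff = np.asarray(scores_a, dtype=float) - np.asarray(scores_b, dtype=float)
    diff = diff[diff != 0]                      # zero differences are discarded
    order = np.abs(diff).argsort()
    ranks = np.empty_like(diff)
    ranks[order] = np.arange(1, diff.size + 1)  # rank 1 = smallest |difference|
    r_plus = ranks[diff > 0].sum()              # sum of '+' ranks
    r_minus = ranks[diff < 0].sum()             # sum of '-' ranks
    return min(r_plus, r_minus)

# Hypothetical accuracies of two methods on 16 data sets, where the
# first method wins on every set
rng = np.random.default_rng(5)
base = rng.uniform(0.7, 0.9, 16)
improved = base + rng.uniform(0.0, 0.05, 16) + 0.001
T = signed_rank_T(improved, base)
CRITICAL = 29   # two-sided, alpha = 0.05, n = 16 (from the paper)
```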
Tables 9-14 present the significance test results, and the '+' symbol in Tables 12-14 indicates that our proposed EAOA outperforms the compared methods. The NaN values are represented in these tables by the value −1 so that they fall at the bottom of the ranked list and are thus excluded.
The significance test results of average accuracy for EAOA vs. AOA, ALO, GWO, and WOA, using the SVM classifier, are presented in Table 9. In the case of EAOA vs. AOA, EAOA has a positive difference for 15 data sets, hence it is better than AOA, while AOA has a negative difference and is better than EAOA for only one data set. The sum of all the positive ranks R+ is 133, and the sum of the negative ranks R− is 3. Since 16 data sets are used, from the table of critical values, the T value should be ≤ 29 to reject the null hypothesis at the 0.05 significance level. It can be concluded that EAOA statistically outperforms AOA, as T = min{R+, R−} = min{133, 3} = 3 < 29. Likewise, in the cases of EAOA vs. ALO, GWO, and WOA, EAOA statistically outperforms these algorithms, where the T values are 0, 2, and 4, respectively, all less than 29.
Tables 10 and 11 present the significance test results of average G-Mean and F-Measure, respectively. In the case of EAOA vs. AOA in both tables, EAOA has a positive difference for all 16 data sets and is hence better than AOA, while AOA has no negative differences and is not better than EAOA on any data set. The sum of positive ranks R+ is 136 for G-Mean and 107 for F-Measure, and the sum of negative ranks R− is 0 in both cases, which is less than 29. In the case of EAOA vs. ALO, GWO, and WOA, EAOA statistically outperforms these algorithms in terms of G-Mean, with T values of 13, 1, and 5, respectively. Likewise, in terms of F-Measure, EAOA statistically outperforms ALO, GWO, and WOA, with T values of 11, 1, and 5, respectively, all less than 29.
All the results of the statistical Wilcoxon signed-rank test confirm that the proposed feature selection method statistically outperforms the AOA algorithm and is strongly competitive against the other comparative methods in terms of all the performance measures used.
Sensitivity, specificity, precision, and AUC measures were also recorded and presented in Table 15 to show the overall performance of the EAOA feature selection method compared with the basic AOA and the three state-of-the-art feature selection techniques ALO, GWO, and WOA. It can be noticed from Table 15 that EAOA outperforms the other feature selection methods in terms of sensitivity, specificity, and AUC on all data sets except four, which demonstrates the superiority of the proposed method in correctly identifying both positive and negative examples compared with the other methods. Furthermore, it achieves better precision than the basic AOA, ALO, GWO, and WOA on all data sets except one. The preceding results can also be seen in Figures 5-8, which present the box-and-whisker plots of sensitivity, specificity, precision, and AUC for all compared methods. These figures give a clear, qualitative picture of the performance of the compared methods: the box plots of all other feature selection methods are visibly lower than those of EAOA in all mentioned measures, which further demonstrates the superiority of the EAOA feature selection algorithm.
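For reference, the three threshold-based measures reported in Table 15 are simple ratios of confusion-matrix entries. A minimal sketch, with hypothetical labels for illustration:

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Sensitivity (TPR), specificity (TNR), and precision from a
    2x2 confusion matrix for binary labels in {0, 1}."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    sensitivity = tp / (tp + fn)   # correctly identified positives
    specificity = tn / (tn + fp)   # correctly identified negatives
    precision = tp / (tp + fp)     # predicted positives that are correct
    return sensitivity, specificity, precision

# Hypothetical ground truth and classifier output
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
sens, spec, prec = binary_metrics(y_true, y_pred)
# sens = 0.75, spec = 0.75, prec = 0.75
```

AUC, by contrast, is computed from the ranking of classifier scores rather than from a single threshold (e.g. via `sklearn.metrics.roc_auc_score`), which is why it complements the three ratios above.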

IV. CONCLUSION
This paper has enhanced the AOA algorithm and applied it to the feature selection problem. The EAOA algorithm is proposed to improve the basic AOA performance by adding a new parameter that depends on the step length of each individual when revising the individual's location over the course of iterations. The new parameter, together with the upper and lower limits of the search space, is employed to improve the balance between the exploration and exploitation processes. The EAOA algorithm's optimization ability relative to state-of-the-art optimizers is tested on twenty-three benchmark functions. Experimental results show that, compared with the other algorithms, the EAOA algorithm has better exploitation capability on unimodal functions and better exploration capability on multimodal and fixed-dimension multimodal functions. The proposed EAOA algorithm can also escape local optima and provides a better convergence rate than the basic AOA method and other state-of-the-art optimizers. After applying the Wilcoxon rank-sum significance test, the results confirmed that the EAOA algorithm achieves statistically significant and comparatively better performance than the other state-of-the-art optimizers.
In addition, as a practical application, EAOA is used to find the optimal set of features, and SVM is applied for the classification task. Sixteen real-world data sets are considered, each divided into three sets: training, test, and validation. To measure the performance on each data set, the average accuracy, G-Mean, F-Measure, runtime, and the number of selected features for each method were recorded. EAOA achieves comparable or superior results in feature selection and classification performance when compared with the basic AOA, ALO, GWO, and WOA.
Results indicate that all the feature selection methods reduce the time consumed for classification, and the proposed EAOA consumes less time than the AOA algorithm on almost all data sets, showing superiority in saving computational time and space while maintaining the best performance among the compared algorithms. The statistical Wilcoxon signed-rank test results also confirm that the proposed feature selection method statistically outperforms the AOA algorithm and is strongly competitive against the other comparative methods in terms of accuracy, G-Mean, and F-Measure. Finally, the box-and-whisker plots of sensitivity, specificity, precision, and AUC for all compared methods show the superiority of the EAOA feature selection algorithm.
SADIQ HUSSAIN received the Ph.D. degree from Dibrugarh University, India. He is currently a System Administrator with Dibrugarh University, Assam, India. He is associated with the computerization of the examination system of Dibrugarh University. He has published various research and conference papers of international repute. He has reviewed research papers for many IEEE, SCI, and Scopus-indexed journals. His research interests include data mining, medical analytics, and machine learning. He has acted as a technical member of many reputed conferences.
SAMINA KAUSAR is currently working as a Researcher and also a Ph.D. Scholar with the School of Computer Engineering and Science, Shanghai University, China. She is also working as an Assistant Professor with the University of Kotli, Azad Kashmir, Pakistan. Her research interests include big data, bioinformatics, computer networks, cloud computing, data mining, E-learning, and machine learning algorithms.
MD. AKHTARUL ISLAM received the B.Sc. and M.S. degrees in statistics, biostatistics, and informatics from Dhaka University, Dhaka, Bangladesh, in 2012 and 2013, respectively. He is currently working as an Assistant Professor in the statistics discipline with Khulna University, Khulna, Bangladesh. He has authored or coauthored around 12 publications in different peer-reviewed journals. His research interests include biostatistics, epidemiology, public health, infectious disease, and meta-analysis.
LAMIAA M. EL BAKRAWY received the M.S. and Ph.D. degrees in computer science from the Faculty of Science, Al-Azhar University, Egypt, in 2009 and 2018, respectively. Her master's topic was optimization using swarm intelligence. Her Ph.D. topic was machine learning in image authentication. She is currently an Associate Professor with the Mathematics and Computer Department, Faculty of Science, Al-Azhar University. She has published many articles in well-known international journals. Her research interests include image processing, social network analysis, computational intelligence, machine learning, metaheuristic optimization, and information security. She is also a program committee member in various international conferences.