Improved Salp Swarm Algorithm Based on Levy Flight and Sine Cosine Operator

The salp swarm algorithm (SSA) is a swarm intelligence optimization algorithm that simulates the chain movement behavior of salp populations in the sea. To address the shortcomings of SSA, such as low precision, weak performance on high-dimensional problems and slow convergence speed, an improved salp swarm algorithm based on Levy flight and a sine cosine operator (LSC-SSA) is proposed. The Levy flight mechanism combines short walks with occasional long jumps to search the solution space, which effectively improves the global exploration capability of the algorithm. The improved sine cosine operator uses sine search for global exploration and cosine search for local exploitation. At the same time, adaptive switching between the two search methods achieves a smooth transition between global exploration and local exploitation. In the simulation experiments, the salp swarm algorithm (SSA), whale optimization algorithm (WOA), particle swarm optimization (PSO), sine cosine algorithm (SCA), firefly algorithm (FA) and LSC-SSA are used to solve function optimization problems. Then, the feasibility of the improved algorithm for solving high-dimensional large-scale optimization problems and the effectiveness of the improvement strategies are evaluated. Finally, LSC-SSA is applied to train a multi-layer perceptron neural network. Simulation results show that introducing Levy flight and the improved sine cosine operator significantly improves the optimization accuracy and convergence speed of LSC-SSA compared with the other swarm optimization algorithms. In addition, the improved algorithm can effectively solve high-dimensional large-scale optimization problems. In training the multi-layer perceptron neural network, the improved algorithm avoids falling into local optima and obtains good classification accuracy.

A gradient-based optimization method needs to calculate the gradient information of the search space, whereas a meta-heuristic algorithm generates the next solution from a random solution. The meta-heuristic algorithm treats the optimization problem as a black box and only needs to consider the input and output, without calculating derivatives over the search space. Therefore, meta-heuristic algorithms are more suitable for practical problems with complex information. Finally, meta-heuristic algorithms are good at avoiding local optima. The search space of a practical problem often has a large number of local optima, which makes the optimization process difficult. Traditional optimization methods easily fall into a local optimum and miss the global optimum. The randomness of meta-heuristic algorithms keeps the search agents widely distributed in the search space, which reduces the probability of falling into a local optimum.
Generally, meta-heuristic algorithms can be divided into two categories: individual-based algorithms and population-based algorithms. Individual-based algorithms initialize a single candidate solution and improve it during the optimization process. For example, Simulated Annealing (SA) [9]-[11], Tabu Search (TS) [12] and Iterated Local Search (ILS) [13] are individual-based algorithms. Population-based algorithms, in contrast, obtain a set of candidate solutions by initializing a population of search agents and optimize this set in subsequent iterations. Compared with individual-based algorithms, population-based algorithms have stronger global exploration ability, which helps them avoid falling into a local optimum. At the same time, the information exchange mechanism within the population is conducive to optimizing the candidate solutions.
There are several main branches of population-based algorithms: evolution-based algorithms, physical phenomenon-based algorithms, and swarm intelligence (SI)-based algorithms. Evolution-based algorithms use the idea of natural evolution. The population retains the best individuals and eliminates the poor ones through recombination, crossover, and mutation, so that each newly generated population improves on the previous generation. Evolution-based algorithms include Evolutionary Strategy (ES) [14], Differential Evolution (DE) [15], [16], Biogeography-Based Optimization (BBO) [17], Population-Based Incremental Learning (PBIL) [18], etc. For example, the Genetic Algorithm (GA) [19] takes Darwinian evolution as its inspiration and passes better individuals on to the next generation, so that the initial population is continuously optimized over the iterations. Physical phenomenon-based algorithms use physical effects such as gravitational, inertial, and electromagnetic forces to exchange information and move search agents through the solution space. Physical phenomenon-based algorithms mainly include the Water Cycle Algorithm (WCA) [20], Henry Gas Solubility Optimization (HGSO) [21] and the Electrostatic Discharge Algorithm (ESDA) [22]. For example, the Black Hole Algorithm (BH) [23] takes black holes and universal gravitation as its inspiration. Through long-term evolution, biological populations have learned to organize hunting, navigating, defending, and foraging behaviors effectively. Birds flying in a V-shape can distribute air resistance evenly across the flock to save energy. Bees search for food and use pheromone to mark the path, guiding other individuals to find the shortest path from the hive to the food.
Swarm intelligence (SI)-based algorithms are inspired by the intelligent behavior of biological swarms, including the Marine Predators Algorithm (MPA) [24], Seagull Optimization Algorithm (SOA) [25], Spotted Hyena Optimizer (SHO) [26], Naked Mole-Rat algorithm (NMR) [27], Equilibrium Optimizer (EO) [28], Parasitism Predation Algorithm (PPA) [29], Manta Ray Foraging Optimization (MRFO) [30], Social Ski-Driver optimization algorithm (SSD) [31], etc. For example, Ant Colony Optimization (ACO) [32] takes ant colonies as its inspiration. Ant colonies can use information exchange mechanisms to find the shortest path to food sources in different environments. SI-based algorithms usually have fewer parameters and operators, which reduces the complexity and computational scale of the algorithm. More importantly, SI-based algorithms usually retain information about the search space for communication among individuals in the swarm. Therefore, in these respects SI-based algorithms are superior to evolution-based algorithms and physical phenomenon-based algorithms.
In general, SI-based algorithms are inspired by the intelligent social behavior of biological swarms. The Salp Swarm Algorithm (SSA) [33] was proposed by Seyedali Mirjalili. SSA solves an optimization problem by establishing a mathematical model that simulates the salp swarm. SSA has few parameters and operators, and its structure is simple and easy to implement. However, the shortcomings of SSA are also obvious. When the leader falls into a local optimum, it misleads the other search agents (followers) into stagnating there. At the same time, the leader's position update model reduces the efficiency of the search for food. In addition, the mathematical model of SSA lacks a transition between exploration and exploitation. Finally, the existing improved versions of SSA mostly target low-dimensional optimization problems, and the performance of SSA on high-dimensional optimization problems is unknown. This paper uses two improvement strategies to make up for these shortcomings and proposes an improved SSA based on Levy flight and a sine cosine operator (LSC-SSA). First, Levy flight with a step size control factor is used to increase the traversal and exploration capabilities of the search agents. Second, an improved sine cosine operator is used to improve both the search efficiency of the leader and the balance between exploration and exploitation. Levy flight is a random-walk model of animal foraging routes proposed by the French mathematician Paul Pierre Lévy. Researchers have found that the foraging routes of many animals follow Levy flight. The frequent short steps in the Levy flight mechanism act as a local search that improves the diversity and traversal of the algorithm, while the occasional long steps act as global jumps that help the search agents escape local optima and improve the global exploration capability. As a global search operator, Levy flight has been applied in many improved algorithms [34]-[36].
Xin-She Yang and Suash Deb proposed the Cuckoo Search Algorithm (CS) in 2009 [37], in which Levy flight is used to update the nest positions, effectively improving the global exploration capability of the algorithm. In addition, algorithms based on mathematical rules form a novel branch of heuristic algorithms, and there is little related literature on them. The Basic Optimization Algorithm (BOA) [38] uses basic mathematical operators and shrinking lengths to guide search agents closer to the optimal value. The Sine Cosine Algorithm (SCA) [39] builds its mathematical model by adaptively using sine and cosine search methods with equal probability. As the name implies, mathematical rule-based algorithms are inspired by mathematical rules. This type of algorithm can balance exploitation and exploration capabilities well.
The innovation of this paper is to introduce the Levy flight mechanism with a step size control factor into the salp swarm algorithm, which improves the traversal and global exploration ability of the algorithm. In addition, an improved sine cosine operator is introduced into the salp swarm algorithm. The improved sine cosine operator uses sine search for global exploration and cosine search for local exploitation, and it uses a more effective convergence factor. A logarithmic spiral search route is also introduced into the sine cosine operator. According to the no free lunch (NFL) theorem [40], the effectiveness of an algorithm on one set of optimization problems may not extend to other optimization problems. In other words, no single algorithm can solve all optimization problems. Because the improved algorithm may be superior to other algorithms on some optimization problems, the NFL theorem supports the motivation of this paper. The paper is organized as follows. Section II reviews related work on SSA. Section III introduces the SSA. Section IV introduces the Levy flight mechanism, the sine cosine algorithm and the proposed LSC-SSA in detail. Section V selects the Salp Swarm Algorithm (SSA), Particle Swarm Optimization (PSO) [41], Whale Optimization Algorithm (WOA) [42], Sine Cosine Algorithm (SCA), Firefly Algorithm (FA) [43] and LSC-SSA to carry out function optimization comparison experiments. The experimental results show that LSC-SSA has the advantages of high optimization accuracy and fast convergence speed. The feasibility of the improved algorithm for solving high-dimensional function optimization and the effectiveness of the improvement strategy are verified. Section VI applies LSC-SSA to train a multi-layer perceptron neural network. Section VII concludes the paper.

II. RELATED WORK OF SSA
SSA relies on the concept of swarm intelligence and has a simple structure, which has attracted the attention of many scholars. With the deepening of research, many improved methods and practical applications of SSA have been proposed [44]-[47]. Neggaz et al. used the sine cosine algorithm and the disrupt operator to improve SSA and proposed ISSAFD [48]. This mechanism improves the exploration phase and avoids local stagnation. Experimental results show that ISSAFD performs well in terms of accuracy and number of selected features. However, its convergence speed and optimization accuracy are not reported. At the same time, SCA has the drawback of slow convergence and should be modified before it is introduced into SSA. Tubishat et al. proposed an improved salp swarm algorithm (ISSA) [49] to solve the feature selection problem and select the best subset of features in wrapper mode. Experiments show that ISSA is superior to other algorithms in accuracy, convergence and feature reduction. However, its optimization accuracy still needs further improvement. Panda et al. used spatial transformation search (STS) to improve the performance of SSA and proposed STS-SSA [50]. Experimental results show that STS-SSA can effectively solve the optimization problem. Abd Elaziz et al. proposed a multi-objective big data optimization method based on a hybrid of SSA and DE [51]. The experimental results on the test problems of the 2015 big data optimization competition show that the proposed method is superior to other methods on all test problems. However, the performance of the improved algorithm in function optimization has not been verified. Faris et al. proposed two methods for feature selection using SSA as a search strategy [52]. The experimental results on 22 UCI data sets show that the proposed methods are clearly superior to other methods.
El-Fergany used SSA to determine the optimal values of the unknown parameters of the polymer exchange membrane fuel cell model [53]. Simulation results show that the proposed SSO-based method can effectively find the optimal solution of the model. Yang et al. expanded the salp population into multiple independent salp chains and proposed the modular salp swarm algorithm (MSSA) [54]. Simulation results show that MSSA is superior to the other eight algorithms.

III. SALP SWARM ALGORITHM
A. MATHEMATICAL MODEL OF SALP SWARM ALGORITHM
Establishing a mathematical model that mimics the intelligent behavior of a swarm is the basic step for SI-based algorithms to solve an optimization problem. Mathematical models of fish swarms, bird swarms, and ant swarms have been widely used in optimization problems. In order to model the salp chain formed by individuals linked end to end, the individuals in the salp swarm are divided into two categories: leader and followers. The leader is the foremost individual of the salp chain; it determines the movement direction and foraging route of the population and guides the salp chain toward the food. The remaining individuals are followers, which follow the leader in turn to form a chain structure.
99742 VOLUME 8, 2020
However, this mathematical model only simulates the formation of the salp chain and cannot directly solve an optimization problem. It needs to be adjusted to fit the optimization problem. Determining the global optimal value is the goal of the optimization problem, so the global optimal value is used as the food that the salp chain needs to find. The position of the global optimal value is unknown. Therefore, the optimal value in the current iteration is taken as the global optimal value (food), so that the salp chain model moves closer to the target value. Updating the leader according to the position of the food brings the entire salp chain closer to the food. This process is represented by the following equation:
$$X_j^1 = \begin{cases} F_j + c_1\left((ub_j - lb_j)c_2 + lb_j\right), & c_3 \ge 0.5 \\ F_j - c_1\left((ub_j - lb_j)c_2 + lb_j\right), & c_3 < 0.5 \end{cases} \tag{1}$$
where $X_j^1$ indicates the position of the leader (the frontmost individual of the salp chain) in the $j$-th dimension; $F_j$ indicates the position of the food in the $j$-th dimension; $ub_j$ is the upper bound of the $j$-th dimension; $lb_j$ is the lower bound of the $j$-th dimension. The upper and lower bounds keep the leader from leaving the search space. Parameter $c_2$ is a random number in [0,1], which is used to control the leader's moving step.
Parameter $c_3$ is a random number in [0,1], which is used to select with equal probability whether the leader's step is taken in the positive or negative direction around the food location. Parameter $c_1$, shown in Eq. (2), is an adjustment factor that is used to balance global exploration and local exploitation.
$$c_1 = 2e^{-\left(\frac{4t}{T}\right)^2} \tag{2}$$
where $t$ is the current iteration and $T$ is the total number of iterations. It can be seen from Eq. (2) that the adjustment factor $c_1$ decreases adaptively over the iterative process. At the beginning of the iteration, $c_1$ decreases slowly, which drives the leader to conduct large-scale global exploration. In the later iterations, the decrease of $c_1$ is pronounced, and the leader can carry out detailed exploitation. In order to make the followers follow the leader and form a chain structure, Newton's law of motion is used to update the position of the followers, which is described as:
$$X_j^i = \frac{1}{2}at^2 + v_0 t \tag{3}$$
where $X_j^i$ indicates the position of the $i$-th follower in the $j$-th dimension when $i \ge 2$; $t$ represents time; $v_0$ represents the initial speed; and $a$ represents the acceleration of the follower's movement. The time variable of the optimization problem is represented by the number of iterations, so the iteration interval is the time interval, $t = 1$. The follower's initial speed is $v_0 = 0$. Eq. (3) can then be rewritten as:
$$X_j^i = \frac{1}{2}\left(X_j^i + X_j^{i-1}\right) \tag{4}$$
While following the leader, the followers may reach a position better than the current best solution (food). In that case, the food is moved to the better position, and the updated leader guides the followers toward the food. The chain movement of the salp swarm in the search space is shown in Fig. 1.
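The simplification of Eq. (3) into the follower update can be made explicit. The following is only a sketch under assumptions that are not stated in the text: the acceleration is taken as $a = (v_{final} - v_0)/t$, the final speed as $v_{final} = (X_j^{i-1} - X_j^i)/t$ (the follower is pulled toward the salp in front of it), and the right-hand side of Eq. (3) is read as a displacement $\Delta X_j^i$ added to the follower's current position.

```latex
\begin{aligned}
a &= \frac{v_{final} - v_0}{t} = X_j^{i-1} - X_j^i
  && \text{(with } t = 1,\ v_0 = 0\text{)} \\
\Delta X_j^i &= \tfrac{1}{2} a t^2 + v_0 t
  = \tfrac{1}{2}\left(X_j^{i-1} - X_j^i\right) \\
X_j^i + \Delta X_j^i &= \tfrac{1}{2}\left(X_j^i + X_j^{i-1}\right)
\end{aligned}
```

Under this reading, each follower moves to the midpoint between its own position and that of its predecessor, which is why a change in the leader's position propagates down the chain with a damping factor of one half per link.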
The advantages of SSA are as follows:
1) It can be seen from the mathematical model of SSA that it has a simple structure and few parameters and operators, so it is easy to implement.
2) In the process of optimization, SSA only uses the optimal solution in the current iteration as food. Even a deterioration of the fitness of the entire population will not affect the quality of the food.
3) The leader explores and approaches the target based on the location of the food. The followers only need to move in a chain according to the position of the leader, which reflects the simplicity of the algorithm.
When considering the advantages, it is also necessary to discuss the shortcomings of SSA, which are as follows:
1) Followers only need to follow the leader, which embodies simplicity. However, when the leader falls into a local optimum, it misleads the entire population into that local optimum.
2) The leader updates its location based on the food. However, updating the position of the leader requires calculating the boundaries, which reduces the direct interaction between the leader and the food.
3) The mathematical model of SSA lacks a transition between exploration and exploitation, which leads to low precision.

B. PROCEDURE OF SALP SWARM ALGORITHM
The procedure of standard SSA is described as follows.
Step 1: Initialize the algorithm parameters: number of iterations $T$, salp population size $N$, and test function dimension $D$.
Step 2: Initialize the salp population according to the upper and lower bounds, $t = 1$.
Step 3: Calculate the fitness value of each search individual, and treat the individual with the best fitness value in the current population as the food $F_j$.
Step 4: Update $c_1$ according to Eq. (2) and generate random numbers $c_2$ and $c_3$.
Step 5: Update the position of the leader according to Eq. (1) and the positions of the followers according to Eq. (4).
Step 6: Determine whether the algorithm has reached the maximum number of iterations or found the optimal value. If the end condition is met, return the optimal value and exit; otherwise, go to Step 3.
The flowchart of salp swarm algorithm is shown in Fig. 2.
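The procedure above can be sketched in a few lines of Python. This is a minimal illustration rather than the reference implementation: the function name `ssa`, the NumPy-based layout, the 0.5 threshold on $c_3$, the clipping of positions to the bounds, and the choice to keep the best-so-far individual as food are all assumptions made for the sketch. The leader update follows the standard SSA form and the followers move to the midpoint with their predecessor.

```python
import numpy as np

def ssa(objective, lb, ub, n_salps=30, max_iter=200, seed=0):
    """Minimal salp swarm sketch for minimization (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    lb = np.asarray(lb, dtype=float)
    ub = np.asarray(ub, dtype=float)
    dim = lb.size

    # Random initial population inside the bounds.
    X = lb + rng.random((n_salps, dim)) * (ub - lb)
    fitness = np.array([objective(x) for x in X])
    best = int(np.argmin(fitness))
    food, food_fit = X[best].copy(), float(fitness[best])

    for t in range(1, max_iter + 1):
        # Adjustment factor c1 and per-dimension random numbers c2, c3.
        c1 = 2.0 * np.exp(-((4.0 * t / max_iter) ** 2))
        c2 = rng.random(dim)
        c3 = rng.random(dim)

        # Leader: step around the food, sign chosen with equal probability.
        step = c1 * ((ub - lb) * c2 + lb)
        X[0] = np.where(c3 >= 0.5, food + step, food - step)

        # Followers: midpoint between each salp and the one in front of it.
        for i in range(1, n_salps):
            X[i] = 0.5 * (X[i] + X[i - 1])

        # Assumption: out-of-range positions are clipped to the bounds.
        X = np.clip(X, lb, ub)

        # Replace the food only when a better position is found.
        fitness = np.array([objective(x) for x in X])
        best = int(np.argmin(fitness))
        if fitness[best] < food_fit:
            food, food_fit = X[best].copy(), float(fitness[best])

    return food, food_fit
```

For example, `ssa(lambda x: float(np.sum(x ** 2)), [-10.0] * 5, [10.0] * 5)` runs the sketch on a 5-dimensional sphere function and returns the best position found together with its fitness.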

IV. IMPROVED SALP SWARM ALGORITHM BASED ON LEVY FLIGHT AND SINE COSINE OPERATOR
Levy flight is a random-walk model of animal foraging routes proposed by the French mathematician Paul Pierre Lévy. Researchers have found that the foraging routes of many animals follow Levy flight. The frequent short steps in the Levy flight mechanism act as a local search that improves the diversity and traversal of the algorithm, while the occasional long steps act as global jumps that help the search agents escape local optima and improve the global exploration capability. In addition, algorithms based on mathematical rules form a novel branch of heuristic algorithms, and there is very little related literature on them. The Sine Cosine Algorithm (SCA) builds its mathematical model by adaptively using sine and cosine search methods with equal probability. SCA can balance exploitation and exploration capabilities well.

A. LEVY FLIGHT
Levy flight is a probability distribution proposed by the French mathematician Paul Pierre Lévy (1886-1971) in the 1930s [55], which is used to simulate bird foraging routes. Some scholars have shown that the foraging trajectories of many birds and insects in nature (such as albatrosses, bees and fruit flies) conform to the Levy distribution. More notably, some marine animals (tuna, swordfish, some sharks, etc.) also follow the Levy distribution when foraging. These studies formed the Levy flight foraging hypothesis: Levy flight can improve the efficiency and accuracy of biological foraging and is better adapted to natural environments. As a global search operator, the Levy flight mechanism searches the space using short-distance walks combined with long-distance jumps. The long-term short-distance walks enable the search agent to carefully search the area near it, which improves the diversity and local exploitation ability of the population. The directional variability of the occasional long-distance jumps ensures, with high probability, that the population searches the entire region, and these abrupt changes are a great advantage when exploring a large space. The combination of short-distance and long-distance moves achieves a thorough search of the solution domain, which greatly improves the global search ability of the algorithm. The 500-step motion trajectory of a Levy flight within the search range is shown in Fig. 3, which illustrates how short-distance walks combined with occasional long-distance jumps explore the solution domain.
The probability density function of Levy flight obeys the Levy distribution, which can be described as follows.
where $0 < \lambda \le 2$ controls the peak sharpness of the Levy distribution and $\beta > 0$ controls its span. When $\lambda = 2$, the Levy distribution reduces to a Gaussian distribution; when $\lambda = 1$, it reduces to a Cauchy distribution. The integral has no closed-form analytical expression in general, which makes it difficult to generate random numbers that obey the distribution. However, when $s \gg s_0 > 0$, that is to say $s \to \infty$, Eq. (5) can be updated as:
The approximate distribution exhibits power-law behavior, and the variance grows as a power of time, that is to say $\sigma^2(t) \sim t^{3-\beta}$. Because this growth is faster than the linear growth of Brownian motion, Levy flight explores a large space more efficiently than Brownian motion. Since then, many scholars have proposed methods for generating random numbers obeying the Levy distribution according to this approximate formula, including a method proposed by Mantegna in 1994 that uses the normal distribution, known as the Mantegna method [56]. The Mantegna method for generating random step sizes obeying the Levy distribution is described as follows:
$$s = \frac{u}{|v|^{1/\beta}} \tag{7}$$
where $u$ and $v$ obey the following Gaussian distributions:
$$u \sim N(0, \sigma_u^2), \quad v \sim N(0, \sigma_v^2)$$
$$\sigma_u = \left\{\frac{\Gamma(1+\beta)\sin(\pi\beta/2)}{\Gamma\left[(1+\beta)/2\right]\beta\, 2^{(\beta-1)/2}}\right\}^{1/\beta}, \quad \sigma_v = 1$$
where $\beta = 1.5$; $\Gamma$ is the Gamma function, which is calculated by:
$$\Gamma(z) = \int_0^{\infty} t^{z-1}e^{-t}\,dt$$
When $z = n$ is a positive integer, $\Gamma(n) = (n-1)!$. A large number of studies have shown that the Levy flight mode can maximize the efficiency of searching for targets under uncertain conditions [44]. When solving function optimization problems, the equation for updating the population position by the Levy flight mechanism can be described as:
where $X_i(t+1)$ indicates the position of the population after the Levy flight operation; $X_i(t)$ is the position of the current population; $s$ is a random step that obeys the Levy distribution shown in Eq. (7); $\otimes$ indicates the element-wise product.
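The Mantegna step generation can be sketched as follows. This is an illustrative sketch, not the authors' code: the function names `mantegna_sigma_u`, `levy_steps` and `levy_walk` are hypothetical, the scale of $u$ is the usual Mantegna expression with unit variance for $v$, and $\beta = 1.5$ is the value given in the text.

```python
import math
import numpy as np

def mantegna_sigma_u(beta=1.5):
    """Standard deviation of u in the Mantegna method (sigma_v is 1)."""
    num = math.gamma(1.0 + beta) * math.sin(math.pi * beta / 2.0)
    den = math.gamma((1.0 + beta) / 2.0) * beta * 2.0 ** ((beta - 1.0) / 2.0)
    return (num / den) ** (1.0 / beta)

def levy_steps(size, beta=1.5, rng=None):
    """Draw Levy-distributed step sizes s = u / |v|^(1/beta)."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.normal(0.0, mantegna_sigma_u(beta), size)
    v = rng.normal(0.0, 1.0, size)
    return u / np.abs(v) ** (1.0 / beta)

def levy_walk(n_steps=500, dim=2, beta=1.5, rng=None):
    """Cumulative positions of a Levy flight that starts at the origin."""
    return np.cumsum(levy_steps((n_steps, dim), beta, rng), axis=0)
```

Calling `levy_walk(500, 2)` gives a 500-step two-dimensional trajectory in which most increments are small and a few are very large, the same pattern of short walks and occasional long jumps described for Fig. 3.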

B. SINE COSINE ALGORITHM
The Sine Cosine Algorithm (SCA) is a novel mathematical rule-based algorithm. As the name implies, this algorithm combines the sine and cosine functions to solve the optimization problem. SCA has fast convergence speed and a simple structure, and it can balance global exploration ability and local exploitation ability well. In general, a population-based algorithm initializes a set of random solutions. After evaluation by the objective function of the optimization problem, this set of random solutions is improved. If the agents are distributed too densely in the search space, the algorithm will fall into a local optimum and miss the global optimum, which reduces its global exploration ability. On the contrary, if the agents are distributed too sparsely, local optimal values will be ignored, which reduces the local exploitation ability of the algorithm. Therefore, balancing global exploration and local exploitation is an important part of optimization algorithms. To achieve this, the sine cosine algorithm uses the sine search method for exploration and the cosine search method for exploitation. The equations of sine search and cosine search are described as follows:
$$X_i^{t+1} = \begin{cases} X_i^t + r_1 \sin(r_2)\left|r_3 p_i^t - X_i^t\right|, & r_4 < 0.5 \\ X_i^t + r_1 \cos(r_2)\left|r_3 p_i^t - X_i^t\right|, & r_4 \ge 0.5 \end{cases} \tag{13}$$
where $X_i^t$ indicates the position of the individual in the $i$-th dimension in the $t$-th iteration and $p_i^t$ indicates the position of the current optimal individual in the $i$-th dimension.
It can be seen from Eq. (13) that there are four main parameters in the sine cosine algorithm: $r_1$, $r_2$, $r_3$ and $r_4$. $r_2$ is a random number in $[0, 2\pi]$ that controls the moving distance of the search agent. $r_3$ is a random weight that enhances ($r_3 > 1$) or weakens ($r_3 < 1$) the effect of the individual's moving distance. $r_4$ is a random number in [0,1] that switches between the two search methods with equal probability. $r_1$ adaptively guides the moving direction of the search agent (the location to be searched next), and is calculated by:
$$r_1 = a - t\frac{a}{T}$$
where $t$ indicates the current number of iterations; $T$ indicates the maximum number of iterations; $a$ is a constant that limits the size of $r_1$, generally $a = 2$.
As an adaptive guidance factor, $r_1$ guides the search agent's movement direction (next search position). When $r_1 < 1$, $r_1$ guides the search agent to the area near the optimal value. When $r_1 \ge 1$, $r_1$ guides the search agent to spread beyond the optimal value. The effect of the adaptive guidance factor $r_1$ on Eq. (13) is shown in Fig. 4, which illustrates that Eq. (13) defines a region around the search agent and the optimal value in the search space, and that the movement direction of the search agent can be changed by $r_1$.
Normally, the range of the sine and cosine functions is [−1,1]. By expanding this range to [−2,2], Eq. (13) can be extended to higher dimensions to accommodate the complex search space of the optimization problem. In this case, parameter $r_2$ (controlling the moving distance of the search agent) ensures that the agent switches between the two search ranges ([−1,1] and [−2,2]). This mechanism effectively coordinates exploration and exploitation in the search space.
The pseudo code of the Sine Cosine Algorithm (SCA) is described as follows.
Initialize the search agent population $X_i^t$ ($i = 1, 2, 3, \ldots, n$)
Calculate the fitness of each search agent, $t = 1$
$p_i^t$ = the best search agent so far
While ($t$ < maximum number of iterations)
    Update $r_1$, $r_2$, $r_3$ and $r_4$
    Update the position of each search agent using Eq. (13)
    Calculate the fitness of each search agent
    Update $p_i^t$ if a better solution is found
    $t = t + 1$
End While
Return $p_i^t$ as the best solution obtained

The pseudo-code and mathematical model show that the sine cosine algorithm optimizes with search moves generated by the sine and cosine functions and that its structure is simple. As a population-based algorithm, SCA continuously improves the initialized random solutions to avoid falling into local optima. SCA adaptively adjusts the search area of the agents using the sine and cosine search methods, and saves the current best solution as the target value (global optimal value). This mechanism balances global exploration and local exploitation without losing the target value, and moves the population toward the best area of the search space.
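The sine cosine search loop can be written out as a short runnable sketch. It is an illustration under stated assumptions rather than the reference implementation: the function name `sca` is hypothetical, $r_3$ is drawn from [0, 2] as in the commonly used form of SCA, $r_1$ decreases linearly from $a = 2$ to 0, and positions are clipped to the bounds.

```python
import numpy as np

def sca(objective, lb, ub, n_agents=30, max_iter=200, a=2.0, seed=0):
    """Minimal sine cosine algorithm sketch for minimization."""
    rng = np.random.default_rng(seed)
    lb = np.asarray(lb, dtype=float)
    ub = np.asarray(ub, dtype=float)
    dim = lb.size

    # Random initial population and best agent so far (the target P).
    X = lb + rng.random((n_agents, dim)) * (ub - lb)
    fitness = np.array([objective(x) for x in X])
    best = int(np.argmin(fitness))
    P, P_fit = X[best].copy(), float(fitness[best])

    for t in range(1, max_iter + 1):
        r1 = a - t * a / max_iter                 # linearly decreasing guide factor
        r2 = 2.0 * np.pi * rng.random((n_agents, dim))
        r3 = 2.0 * rng.random((n_agents, dim))    # assumed range [0, 2]
        r4 = rng.random((n_agents, dim))

        # Sine move when r4 < 0.5, cosine move otherwise.
        dist = np.abs(r3 * P - X)
        X = np.where(r4 < 0.5,
                     X + r1 * np.sin(r2) * dist,
                     X + r1 * np.cos(r2) * dist)
        X = np.clip(X, lb, ub)                    # assumption: clip to bounds

        # Keep the best solution found so far as the target.
        fitness = np.array([objective(x) for x in X])
        best = int(np.argmin(fitness))
        if fitness[best] < P_fit:
            P, P_fit = X[best].copy(), float(fitness[best])

    return P, P_fit
```

Because the target is only replaced when a better agent appears, the returned fitness can never be worse than the best fitness of the initial population.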

C. IMPROVED SALP SWARM ALGORITHM BASED ON LEVY FLIGHT AND SINE COSINE OPERATOR
In the process of solving optimization problems, keeping the algorithm from getting stuck in a local optimum is a challenge. Avoiding local optima requires the agents to search the solution space as widely as possible. As a search operator with strong global performance, the Levy flight mechanism can improve the global exploration capability of the salp swarm algorithm. It combines short-distance walks with long-distance jumps to search the solution space thoroughly. The long-term short-distance walks enable the population to carefully search the area nearest to it, which improves the diversity and local exploitation capacity of the population. The directional variability of the occasional long-distance jumps ensures, with high probability, that the population searches the entire area and improves the global exploration capability of the algorithm. This paper uses an improved Levy operator to update the position of the salp swarm. The improved Levy operator adds a step size control factor to the Levy flight, which can control (weaken or enhance) the walking step size to suit the search space of different optimization problems. When the step size control factor is small, the agents search carefully within a small range, which enhances the exploitation ability of the algorithm without affecting its exploration ability and is suitable for optimization problems with a small search space. When the step size control factor is large, the agents search extensively over a wide range, which increases the probability of the algorithm jumping out of a local optimum and is suitable for high-dimensional large-scale optimization problems. The equation of the improved Levy operator for updating the position of the salp swarm is described as follows.
where $X_j^i$ indicates the position of the $i$-th follower in the $j$-th dimension when $i \ge 2$; $s$ is the random step size following the Levy distribution generated by the Mantegna method shown in Eq. (7); $\otimes$ indicates the element-wise product; $a$ is the step size control factor. When the step size control factor is small, the search agent can carefully search in a small range. In this paper, $a = 0.01$.
At the same time, this paper also introduces an improved sine cosine operator to update the position of the leader. In the salp swarm algorithm, the leader guides the followers so that the population moves according to the position of the food. In other words, updating the leader's position is enough to realize the chain movement of the entire salp swarm. The leader position update method shown in Eq. (1) is similar to the population update method of the sine cosine algorithm shown in Eq. (13). Both select between search methods with equal probability and update the positions of the remaining individuals according to the position of the current optimal value. The difference is that in the former the search method is selected only by probability, while the latter can adaptively switch search methods according to the information returned from the search space. Compared with the salp swarm algorithm, the population update method of the sine cosine algorithm better reflects the balance between exploration and exploitation. Therefore, this paper proposes an improved sine cosine operator to update the position of the leader. First, the parameter $r_1$ of the sine cosine algorithm is replaced with the parameter $c_1$ of the salp swarm algorithm. Essentially, $r_1$ and $c_1$ play the same role. As global convergence factors, both decrease adaptively over the iterative process, which makes the search agents gradually converge from global to local search. This mechanism guarantees the global convergence of the algorithm. However, the convergence behavior of the two factors is different. The convergence of $r_1$ and $c_1$ over 1000 iterations is shown in Fig. 5. It can be seen that $c_1$ converges significantly better than $r_1$, so $c_1$ is used in place of $r_1$.
Second, the exponential function $e^x$ is introduced so that the leader follows a logarithmic spiral path toward the target value. Finally, the random parameter $r_3$ is of limited use to the algorithm, and too many parameters increase the randomness of the algorithm, so $r_3$ is removed from the sine cosine operator.
The equations for updating the position of the leader by the improved sine cosine operator are described as follows.
The position updating method shown in Eq. (16) uses the sine function for global exploration, and Eq. (17) uses the cosine function for local exploitation. Parameter c_1 enables the search agent to switch adaptively between the two search modes and to transition smoothly between exploration and exploitation. As a global convergence factor, c_1 also makes both the sine and cosine search methods converge gradually as the iterations increase, which is shown in Fig. 6.
After adding the global convergence factor c_1 and parameter r_4, the two search methods can be used in combination. The equation is described as follows: where c_1 has the same effect as in Eq. (2), and r_2 and r_4 have the same effect as in Eq. (13). When the range of the sine and cosine functions is expanded to [−2, 2], the search space is expanded as well. This mechanism guarantees that the search agent can search inside or outside the target value, which suits high-dimensional optimization problems. The two search methods can also search different regions based on the returned value, which reduces the possibility of falling into a local optimal value. When the returned value is in the range [−2, −1) or (1, 2], the search agent searches outside the target value, which reflects global exploration. When the returned value is in the range [−1, 1], the search agent searches inside the target value space, which reflects local exploitation. The model of the two search methods after expanding the range is shown in Fig. 7. It shows that after expanding the range of the sine and cosine functions, the search agent can perform different search methods based on the returned value.
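The range rule above can be illustrated with a small sketch. It assumes, purely for illustration, that the returned value is 2·sin(r_2) with r_2 spread uniformly over [0, 2π]; the combined update equation itself is not reproduced here, and `search_mode` is a hypothetical helper name.

```python
import math


def search_mode(value):
    """Classify a returned value in [-2, 2] using the ranges given in the text."""
    if not -2.0 <= value <= 2.0:
        raise ValueError("value must lie in [-2, 2]")
    return "exploitation" if -1.0 <= value <= 1.0 else "exploration"


# Share of evenly spaced angles for which 2*sin(angle) falls outside [-1, 1].
n = 3600
angles = [2 * math.pi * (k + 0.5) / n for k in range(n)]
share = sum(search_mode(2 * math.sin(x)) == "exploration" for x in angles) / n
```

Under this assumption the value lies outside [−1, 1] whenever |sin(r_2)| > 0.5, which holds for two thirds of the angles, so exploration and exploitation are both visited regularly before c_1 shrinks the step.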
In addition, the improved algorithm introduces the idea of ''elite search''. After the search agent performs the sine cosine operator or the Levy flight operation, its position changes, and the updated position may be worse than the position before the update. Therefore, the updated position of the agent is compared with its position in the last iteration. If the fitness value after executing the operator is better than the fitness value before executing it, the agent keeps the updated position. Otherwise, the search agent returns to the position it held before the operator was executed. This elitist search ensures that the search agent moves towards a promising area in each iteration.
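The elite search rule amounts to a greedy acceptance test. A minimal sketch for a minimization problem follows; the function names are illustrative and the sphere function is used only as an example objective.

```python
def elite_select(old_pos, new_pos, fitness):
    """Keep the updated position only if it improves the (minimized) fitness."""
    return new_pos if fitness(new_pos) < fitness(old_pos) else old_pos


def sphere(x):
    """Sphere function: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


kept = elite_select([1.0, 1.0], [0.5, 0.5], sphere)      # improvement, accepted
reverted = elite_select([1.0, 1.0], [2.0, 0.0], sphere)  # worse, reverted
```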
In order to verify the mathematical model of the improved algorithm, 30 search agents of LSC-SSA were put into the search space of the sphere function for simulation experiments. The search range of the optimization function is [−100, 100], the dimension is 30, and the optimal value is 0. The distribution of the search agents in the solution space after 50 and 100 iterations is shown in Fig. 8. The red dots indicate the leader and the black dots indicate the followers. It can be seen from Fig. 8 that as the iterations increase, the leader guides the followers closer to the global optimal value in a chain motion. The experimental results show that the mathematical model of the improved algorithm is effective. In order to further verify the global convergence ability of the improved algorithm, SSA and LSC-SSA were selected to optimize the sphere function. The historical fitness of 40 search agents over 500 iterations is shown in Fig. 9.
The black dots indicate the fitness values of the search agents of SSA, and the red dots indicate the fitness values of the search agents of LSC-SSA. It can be seen from Fig. 9 that the search agents of SSA gradually converge at around 200 iterations, while the fitness values of the LSC-SSA agents complete global convergence within 50 iterations. The simulation experiments show that after the search agent executes the sine cosine operator and the Levy flight operation, the salp chain can move effectively in the searching space. At the same time, the search agent can explore and exploit the area near the target value, which makes it easier for the algorithm to jump out of local optimal values and improves global convergence.
The flow chart of LSC-SSA is shown in Fig. 10. The procedure of the LSC-SSA algorithm is described as follows.
Step 1: Initialize the algorithm parameters: number of iterations T, salp population size N, test function dimension D.
Step 2: Initialize the salp population according to the upper and lower bounds, t = 1.
Step 3: Calculate the fitness value of each search individual, and treat the individual with the best fitness value in the current population as food F_j.
Step 4: Update c_1 according to Eq. (2) and generate random numbers r_2 and r_4.
Step 6: Update the position of the salp swarm according to Eq. (15). t = t + 1.
Step 7: Determine whether the algorithm has reached the maximum number of iterations or found the optimal value. If the end condition is met, the optimal value is returned and the algorithm exits; otherwise, go to Step 3.
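The overall control flow of the steps above can be sketched as follows. This is only a skeleton under stated assumptions: the leader and follower update rules are passed in as functions, and the two toy rules shown are hypothetical placeholders, not Eq. (15) or the improved sine cosine operator. The form of c_1 is the assumed standard form of Eq. (2), and the "optimal value found" stopping test is omitted.

```python
import math
import random


def lsc_ssa_skeleton(fitness, lb, ub, dim, n_agents, max_iter,
                     update_leader, update_follower, rng=None):
    if rng is None:
        rng = random.Random()
    # Steps 1-2: initialise the population inside the bounds.
    pop = [[rng.uniform(lb, ub) for _ in range(dim)] for _ in range(n_agents)]
    fit = [fitness(x) for x in pop]
    for t in range(1, max_iter + 1):
        # Step 3: the best individual of the current population acts as food.
        best = min(range(n_agents), key=fit.__getitem__)
        food = list(pop[best])
        # Step 4: convergence factor (assumed form of Eq. (2)).
        c1 = 2.0 * math.exp(-((4.0 * t / max_iter) ** 2))
        for i in range(n_agents):
            if i == 0:
                cand = update_leader(pop[i], food, c1, rng)
            else:
                cand = update_follower(pop[i], pop[i - 1], rng)
            cand = [min(max(v, lb), ub) for v in cand]  # clamp to the bounds
            f = fitness(cand)
            if f < fit[i]:  # elite search: accept only improvements
                pop[i], fit[i] = cand, f
    best = min(range(n_agents), key=fit.__getitem__)
    return pop[best], fit[best]


def toy_leader(x, food, c1, rng):
    """Placeholder: random move around the food, shrinking with c1."""
    return [f + c1 * rng.uniform(-1.0, 1.0) for f in food]


def toy_follower(x, prev, rng):
    """Placeholder: midpoint between a salp and its predecessor in the chain."""
    return [(a + b) / 2.0 for a, b in zip(x, prev)]
```

Because every move passes through the elite search test, the best fitness in the population never gets worse from one iteration to the next, whatever update rules are plugged in.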
The pseudo code of LSC-SSA is described as follows.
Initialize the search agents population X t i (i = 1, 2, 3 . .

Time complexity is the calculation workload required to execute the algorithm, and it is an important indicator for evaluating the time consumption of the algorithm. The time complexity is usually expressed with the O symbol, which ignores the low-order terms and the coefficient of the leading term. In metaheuristic algorithms, time complexity is related to the number and structure of the operating units of the algorithm. For the basic salp swarm algorithm, the time complexity mainly depends on the population size, the number of iterations, and the position update mechanism. The time complexity of the improved algorithm LSC-SSA proposed in this paper mainly depends on the population size, the number of iterations, and the position update mechanism that includes the improved strategies. In order to evaluate the impact of the improved strategies on the time cost of the algorithm, the time complexity of the salp swarm algorithm and of LSC-SSA were analyzed.

The time complexity of each operation unit in the salp swarm algorithm is described as follows.
1) Initialize N individuals distributed in the D-dimensional search space, which needs to be run N · D times.
2) Calculate the fitness value of each search agent and select the best agent as food, which needs to be run [N · (N − 1)]/2 times.
3) Parameters c_1, c_2 and c_3 are updated once each, which needs to be run 3 times.
4) The leader performs the position update operation in the D-dimensional search space, which needs to be run 1 · D times.
5) The followers perform position update operations in the D-dimensional search space, which needs to be run (N − 1) · D times.
6) Select the optimal individual from the current population and output it, which needs to be run N · D times.

The time complexity of each operation unit of the improved algorithm LSC-SSA is described as follows.
1) Initialize N individuals distributed in the D-dimensional search space, which needs to be run N · D times.
2) Calculate the fitness value of each search agent and select the best agent as food, which needs to be run [N · (N − 1)]/2 times.
3) Parameters c_1, r_2 and r_4 are updated once each, which needs to be run 3 times.
4) The leader updates its position in the D-dimensional search space through the sine cosine operator, which needs to be run 1 · D times.
5) The followers perform position update operations in the D-dimensional search space, which needs to be run (N − 1) · D times.
6) The Levy flight mechanism updates the population position, which needs to be run N · D times.
7) Use the idea of ''elite search'' to test the position quality of the population, which needs to be run N · D times.
8) Select the optimal individual from the current population and output it, which needs to be run N · D times.

Each of the above operation units goes through T iterations, so the total time complexity of LSC-SSA is
Compared with the salp swarm algorithm, LSC-SSA adds only the Levy flight and elite search units, each of order N · D, so the order of the time cost does not increase. The time complexity analysis shows that the introduction of the improved strategy does not destroy the simplicity of the algorithm structure, nor does it raise the order of the computational cost of the algorithm.
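As a worked check, the operation counts listed above can be summed directly. The sketch below takes the listed counts at face value and, as stated in the text, multiplies every unit by T; the symbols C_SSA and C_LSC-SSA are introduced here only for this calculation.

```latex
\begin{align}
C_{\mathrm{SSA}} &= ND + \tfrac{N(N-1)}{2} + 3 + D + (N-1)D + ND
                  = 3ND + \tfrac{N(N-1)}{2} + 3, \\
C_{\mathrm{LSC\text{-}SSA}} &= C_{\mathrm{SSA}} + ND + ND
                  = 5ND + \tfrac{N(N-1)}{2} + 3, \\
T \cdot C_{\mathrm{SSA}},\; T \cdot C_{\mathrm{LSC\text{-}SSA}}
                 &\in O\!\left(T\,(N^{2} + ND)\right).
\end{align}
```

The two extra units change only the coefficient of the N · D term, so both algorithms fall into the same complexity class.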

V. SIMULATION EXPERIMENTS AND RESULT ANALYSIS

A. FUNCTION OPTIMIZATION
Optimization means finding the optimal value among all possible values in a given searching range and outputting it in minimized or maximized form. Without loss of generality, function optimization is considered as a constrained single-objective optimization problem. Therefore, function optimization has only one target value, which is output in minimized form. The equation for the function optimization problem can be defined as: where d is the number of variables; p is the number of equality constraints; m is the number of inequality constraints; lb_i indicates the lower bound of the i-th variable, and ub_i indicates the upper bound of the i-th variable.
In order to verify the optimization performance of the improved algorithm, this paper selected different algorithms for comparative experiments. The algorithms selected in the experiments and their parameter settings are shown in Table 1.
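A constrained single-objective minimization problem with d variables, m inequality constraints, p equality constraints and box bounds is conventionally written as below. The symbols f, g_i and h_i are assumptions chosen for this sketch and may differ from the notation of the original equation.

```latex
\begin{align}
\min_{x}\; & f(x), \qquad x = (x_1, x_2, \ldots, x_d) \\
\text{s.t.}\; & g_i(x) \le 0, \quad i = 1, \ldots, m \\
& h_i(x) = 0, \quad i = 1, \ldots, p \\
& lb_i \le x_i \le ub_i, \quad i = 1, \ldots, d
\end{align}
```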

B. BENCHMARK FUNCTION
The simulation experiments adopted 28 test functions to evaluate the optimization performance of the improved algorithm LSC-SSA. These test functions can be divided into three categories: unimodal functions, multimodal functions, and combined functions. Among them, functions F1-F22 are test functions from CEC2005. As a classic test set, they can comprehensively evaluate the performance of the algorithm. In addition, functions F23-F28 are CEC2017 test functions. As the latest test functions, they can improve the quality of the experiments. Functions F1-F7 are unimodal functions. They have only one global optimal value. These functions are used to evaluate the local exploitation ability and convergence speed of the algorithm. Functions F8-F13 are multimodal functions. Multimodal functions have multiple local optimal values in a continuous searching space. Therefore, the algorithm is prone to fall into a local optimum when solving multimodal functions, and the optimization process is challenging. At the same time, the number of local optimal values increases with the problem size, which makes these functions a useful reference for evaluating the global exploration capability of the algorithm. Functions F14-F22 are combined functions with fixed dimensions, which are used to evaluate the optimization accuracy of the algorithm. The specific information of the test functions is listed in Table 2.
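To make the distinction between the first two categories concrete, the sketch below defines one unimodal and one multimodal function. The sphere function is the one used earlier in this paper; the Rastrigin function is included only as a typical multimodal example and is an assumption, since Table 2 is not reproduced here.

```python
import math


def sphere(x):
    """Unimodal: a single minimum of 0 at the origin."""
    return sum(v * v for v in x)


def rastrigin(x):
    """Multimodal: global minimum 0 at the origin, local minima near integers."""
    return sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) + 10.0 for v in x)


origin = [0.0] * 30
local_point = [1.0, 1.0]  # close to a local minimum of the Rastrigin function
values = (sphere(origin), rastrigin(origin), rastrigin(local_point))
```

A search agent sitting near `local_point` sees the function rise in every nearby direction although the value there is still above the global minimum, which is the trap that multimodal functions set for an optimizer.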

C. COMPARISON EXPERIMENTS AND ANALYSIS WITH OTHER ALGORITHMS
In order to verify the optimization effect of the algorithm, the basic salp swarm algorithm (SSA), the particle swarm optimization (PSO) algorithm, the whale optimization algorithm (WOA), the sine cosine algorithm (SCA), the firefly algorithm (FA) and the improved algorithm proposed in this paper (LSC-SSA) were selected for the optimization comparison experiments. The algorithms use uniform parameter settings, and each test function is run independently 60 times. The test function convergence curves are shown in Fig. 11. The performance results are listed in Table 3.
The simulation results in Fig. 11 show that, except for a few functions, LSC-SSA has obvious advantages on most functions. For functions F1-F5, F7-F11, F13-F15 and F23-F28, the optimization accuracy and convergence speed of LSC-SSA are significantly better than those of the other algorithms. LSC-SSA has an advantage on 68% of the functions, and has the best optimization performance in this experiment. For the unimodal functions F1-F7, LSC-SSA is inferior only to FA on function F6, and performs best on the remaining functions. This shows that the improved algorithm has strong local exploitation ability and can quickly converge to the target value. For the multimodal functions F8-F13, which have a large number of local optimal values, LSC-SSA has obvious advantages and achieves higher optimization accuracy in fewer iterations. LSC-SSA is inferior to PSO and FA only on function F12. This shows that LSC-SSA has a strong global exploration capability: the search agent can avoid falling into a local optimal value and move towards a promising area in the search space. For the compound functions F14-F22, LSC-SSA's optimization advantage on F13-F15 is the best. Among them, the problem dimensions of F16-F19 are small, so the performance difference between the algorithms is not obvious. For functions F20-F22, the improved algorithm is better only than SSA and SCA, but its optimization accuracy is not inferior to that of the other algorithms. For functions F23-F28, the improved algorithm has obvious advantages. The simulation results on the three types of test functions show the advantages of the proposed algorithm in global convergence and optimization accuracy.
FIGURE 11. Simulation experiment results.
The three criteria listed in Table 3 are the optimal value, average value, and variance, which are used to evaluate the optimization accuracy, average accuracy, and stability of the algorithm. As can be seen from the table, the optimization accuracy of the proposed algorithm is significantly better than that of the other algorithms. The improved algorithm found the theoretical optimal value on 17 functions (F1-F4, F8, F9, F11, F13, F17, F20, F22-F28), accounting for 61%, which makes it the algorithm with the highest precision in this experiment. On the remaining functions, the optimization accuracy of LSC-SSA is not much different from the theoretical optimal value. In average accuracy and stability, LSC-SSA also maintains obvious advantages. This shows that the proposed LSC-SSA is not easily affected by randomness, has good robustness, and can stably maintain its optimization accuracy. It is worth mentioning that the optimization performance of WOA is second only to that of LSC-SSA. Such strong performance gives the algorithm room for further development, and it can be expected that the whale optimization algorithm will be improved and applied to more optimization problems.
This paper also uses the p-value of the Wilcoxon rank sum test to test the performance difference between two algorithms. When the p-value is less than 0.05, there is a significant performance difference between the two algorithms. When the p-value is greater than 0.05, there is no significant difference between the two algorithms. If the p-value result is NaN, there is no difference between the two algorithms. The test results in Table 4 show that the p-value of the improved algorithm on most functions is less than 0.05, which shows that LSC-SSA has obvious advantages over the other algorithms.
The optimization process of a meta-heuristic algorithm consists of global exploration and local exploitation. In the global exploration stage, search agents are distributed as widely as possible in the searching space and a set of random solutions is obtained. In the local exploitation phase, the algorithm continuously improves this set of random solutions so that they move towards the global optimal value. The simulation experiments verify that LSC-SSA is more competitive than the other algorithms in solving function optimization problems, indicating that the improved strategy proposed in this paper effectively improves the performance of the algorithm. First, the Levy flight mechanism enables the search agent to avoid falling into a local optimal value and improves the global exploration capability of the algorithm. Second, the introduction of the improved sine cosine operator allows the algorithm to adaptively adjust the search area based on the value returned from the solution space, effectively balancing exploration and exploitation. The improved strategy improves the optimization accuracy, global convergence and stability of the algorithm. The effectiveness of the improvement strategy is analyzed in detail in Section E.
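For readers who want to reproduce this kind of comparison, the sketch below computes a two-sided rank sum p-value with the large-sample normal approximation, using only the standard library. It is an illustration, not the procedure used to produce Table 4: it applies no tie or continuity correction, and the sample data are made up for the example.

```python
import math


def rank_sum_p_value(x, y):
    """Two-sided Wilcoxon rank sum p-value via the normal approximation."""
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the tied ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    mean = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_x - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2.0))


# Made-up results of 60 independent runs for two hypothetical algorithms.
runs_a = [0.001 * k for k in range(60)]
runs_b = [1.0 + 0.001 * k for k in range(60)]
p = rank_sum_p_value(runs_a, runs_b)
significant = p < 0.05
```

With 60 runs per algorithm the normal approximation is reasonable; for very small samples an exact test would be the safer choice.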

D. EFFECTIVENESS ANALYSIS OF SOLVING HIGH DIMENSION FUNCTION OPTIMIZATION PROBLEM
High-dimensional, large-scale, and high-noise features are common in practical optimization problems, and they make the searching space complex and difficult to optimize. In order to verify that the improved algorithm can solve practical problems and to expand the theoretical research of the algorithm, this paper applies LSC-SSA to simulation experiments on high-dimensional function optimization problems. Functions F14-F22 are fixed-dimensional functions, and the number of variables in their solution space cannot be changed. Therefore, for functions F1-F13, the experiments increase the dimension of the test functions from D = 30 to D = 100, D = 200, and D = 300 to evaluate the effectiveness of the improved algorithm for solving high-dimensional large-scale optimization problems. The experimental results are listed in Table 5. In addition, Table 6 shows the p-value results of the Wilcoxon rank sum test.
In general, the calculation size and complexity of the test functions increase as the dimension increases. The expansion of the searching space and the growing number of local optimal values reduce the probability that the algorithm finds the global optimal value. Therefore, the high-dimensional optimization process is full of challenges. Especially for the multimodal functions F8-F13, whose local optimal values multiply as the dimension increases, search agents tend to fall into local optima. Too many local optimal values hinder the global exploration ability of the algorithm, which causes a dimensional disaster for large-scale problems. The experimental results show that LSC-SSA maintains its advantages in optimization accuracy, mean value, and variance, and that its performance is not much different from that on the low-dimensional (30-dimensional) test functions. For D = 100, D = 200 and D = 300, LSC-SSA found the theoretical optimal value (0) of 8 test functions (F1-F4, F8, F9, F11, F13), accounting for 61%. The optimal value of function F8 shifts with the dimension. The improved algorithm can still find the theoretical optimal value of F8, which shows that the algorithm can effectively avoid local optima. For functions F7, F10, and F12, the improved algorithm does not fall into a dimensional disaster, and maintains the advantages it shows on low-dimensional functions. For functions F5 and F6, the optimization capability of LSC-SSA decreases as the dimension increases. The test results in Table 6 show that the performance of the algorithm in high-dimensional optimization is not significantly different from that in 30 dimensions, which shows that LSC-SSA does not fall into a dimensional disaster. The experimental results verify that LSC-SSA can effectively avoid the dimensional disaster, and that its optimization performance on high-dimensional functions is strong.
The improved algorithm can still maintain strong optimization accuracy and robustness when solving large-scale problems, which lays a theoretical foundation for the application of LSC-SSA to practical problems.

E. PROBLEM EFFECTIVENESS ANALYSIS OF IMPROVED STRATEGIES
This paper makes two improvements to the basic strategy of the salp swarm algorithm. First, the introduction of the Levy flight mechanism with a step size control factor increases the population ergodicity and global exploration ability. Second, the position of the leader is updated by the improved sine cosine operator, so that the algorithm can adaptively transition between global exploration and local exploitation. The function optimization experiments in Section C have verified that the improved strategy can improve the performance of the algorithm. In order to further evaluate the effectiveness of the improved strategy, the improved algorithm (LSC-SSA), the salp swarm algorithm that only introduces Levy flight (L-SSA), the salp swarm algorithm that only introduces the sine cosine operator (SC-SSA) and the salp swarm algorithm (SSA) are selected for comparison experiments. The parameter settings of the algorithms are the same as in Table 1. Functions F1-F22 from Table 2 are selected, and each function is run independently 60 times. The experimental results are listed in Table 7, and the convergence curves of some functions are shown in Fig. 12.
The convergence curves show that LSC-SSA is inferior to L-SSA only on F12. For function F5, there is no difference in optimization precision and convergence speed between LSC-SSA and L-SSA. For function F22, the convergence speed of LSC-SSA is inferior to that of SSA. On the other functions, LSC-SSA has the best optimization effect. The optimization accuracy in Table 7 shows that LSC-SSA finds the theoretical optimal value on 50% of the functions, L-SSA on 14%, and SSA on only 9%. LSC-SSA has the best optimization accuracy. Combining the performance in average accuracy and stability, it can be seen that LSC-SSA has better global convergence and robustness than L-SSA and SSA.
In addition, Table 8 shows the p-value results of the Wilcoxon rank sum test. The statistical results show that introducing both improved strategies has more obvious advantages than the other cases, which shows that Levy flight and the improved sine cosine operator have a synergistic effect on the improvement of algorithm performance. The experiments show that the improved sine cosine operator introduced on the basis of Levy flight can significantly improve the optimization performance of the salp swarm algorithm. The effectiveness of the diversified improvement strategies proposed in this paper is thus verified.

F. COMPARISON EXPERIMENTS AND ANALYSIS WITH OTHER IMPROVED ALGORITHMS
In order to verify the effectiveness of the improved algorithm, this paper selected other improved algorithms for comparative experiments, namely CMA-ES [57], PSOGSA [58], SADE and VPSO [59]. Among them, CMA-ES is the winning algorithm of CEC2013. SADE, PSOGSA and VPSO are improved algorithms that have been verified on various optimization problems (function optimization, engineering optimization, etc.). Comparison with these four improved algorithms can more fully demonstrate the optimization capabilities of LSC-SSA. The functions F1-F28 of Table 2 were selected for the experiments. The convergence curves of some functions are shown in Fig. 13. At the same time, Table 9 shows the mean and variance obtained by running each algorithm 60 times independently. The mean and variance intuitively show the average accuracy and robustness of the algorithm. In addition, the p-value results of the Wilcoxon rank sum test are shown in Table 10.
The convergence curves show that the improved algorithm has the best convergence speed and optimization accuracy. It can be seen from Table 9 that the improved algorithm has the best performance in terms of average accuracy and robustness. In addition, the statistical results in Table 10 show that the improved algorithm LSC-SSA has significant advantages. In summary, the improved algorithm LSC-SSA is the best algorithm in this experiment, and CMA-ES is the second best. The comparison with other improved algorithms shows the competitiveness of LSC-SSA more comprehensively.

VI. TRAINING MULTI-LAYER PERCEPTRON BY LSC-SSA
Neural networks (NN) are tools in the field of intelligent computing that solve classification problems by mimicking biological neurons in the brain. Many types of neural networks have been proposed, such as Kohonen self-organizing neural networks [60] and recurrent neural networks [61]. Feed-forward neural networks [62] are another type. In a feed-forward neural network, neurons are arranged in parallel layers with only one-way connections between them. The first layer is the input layer, the last layer is the output layer, and the layers between the input layer and the output layer are hidden layers. Information is passed between neurons in one direction only through the network. In this paper, the feed-forward neural network with only one hidden layer is called the multi-layer perceptron (MLP). The structure of the MLP is shown in Fig. 14. A neural network uses a trainer to learn from existing experience and obtain the best connection weights and error values, so that the deviation of the output layer is minimized; the multi-layer perceptron is no exception. Neural network trainers are divided into deterministic learning and random learning.
The back propagation (BP) algorithm and the gradient descent algorithm are deterministic learning trainers. Deterministic learning has the advantages of fast convergence, simplicity and efficiency. However, the quality of the solution it finds depends on the initial solution, and it easily falls into a local optimal solution. Random learning continuously improves the initial random solution during the learning process, which can prevent the neural network from falling into a local optimum, but its convergence is worse than that of deterministic learning. As a random learning trainer, the meta-heuristic algorithm can effectively solve the problem of local optimal stagnation. The function optimization in Section V is a continuous problem. There are limited variables and target values in a limited searching space. However, training a multi-layer perceptron is a discrete problem, and the values in the searching space are not continuous. At the same time, discrete problems have a large number of local optimal values, and the trainer may mistake a local optimum for the global optimum, which reduces the optimization accuracy. Therefore, training a multi-layer perceptron is challenging for meta-heuristic algorithms. In order to further verify the optimization performance of the improved algorithm and its effectiveness in solving discrete problems, LSC-SSA was applied to train a multi-layer perceptron neural network.

A. LSCSSA-MLP
The goal of the meta-heuristic algorithm for training a multi-layer perceptron is to find a set of connection weights and error values that optimizes the classification accuracy. Generally, the data set samples have the characteristics of high dimensions, multi-modality, noise pollution, and missing data. The representation used by the meta-heuristic algorithm has to be adapted to the multi-layer perceptron, so it is necessary to choose a suitable encoding mechanism. According to [63], encoding mechanisms are divided into matrix coding, vector coding and binary coding. Matrix coding can simplify the decoding process of the algorithm and reduce the operation cost. Therefore, matrix coding is selected as the encoding mechanism of LSCSSA-MLP. The target variables of the MLP are the connection weights and error values, and the target variable of the LSC-SSA algorithm is the global optimal value. Matrix coding is used to represent the connection weights and error values as global optimal values. The equation is described as follows.
where n is the number of input nodes; W_i,j indicates the connection weight from the i-th node to the j-th node; θ indicates the error value. After defining the variables of LSCSSA-MLP, the objective function needs to be defined so that LSCSSA-MLP can achieve the best classification accuracy on the training and test samples. The mean square error (MSE) over all training samples is a common indicator for evaluating an MLP. The equation is described as follows.
where m is the number of input nodes; s is the number of training samples; d_i^k indicates the expected output value of the k-th input node when using the i-th training sample. Therefore, the objective function of the LSC-SSA-based multi-layer perceptron trainer can be defined as: Different data sets have different attribute ranges, so normalizing the data is an important step for LSCSSA-MLP. This paper adopts the minimum-maximum normalization method to map the sample x from the interval [a, b] to the interval [c, d]. The equation is as follows: After encoding, the MSE and classification accuracy obtained by the training samples after MLP learning are passed to the trainer. The LSC-SSA based trainer optimizes the connection weights and error values, which further improves the MSE and classification accuracy until the best classification accuracy is found. The multi-layer perceptron trainer based on LSC-SSA is shown in Fig. 15.
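The two ingredients described above, min-max normalization and an MSE objective, can be sketched as follows. Several choices here are assumptions made for illustration: a flat vector is used instead of the matrix coding chosen in the paper, the network has a single sigmoid output node, the error value θ is treated as a bias that is subtracted from the weighted sum, and the layout of the vector is arbitrary.

```python
import math


def min_max(x, a, b, c, d):
    """Map x linearly from the interval [a, b] to the interval [c, d]."""
    return (x - a) * (d - c) / (b - a) + c


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def mlp_output(vec, x, n_in, n_hid):
    """Forward pass; assumed layout: input weights, output weights, biases."""
    w_ih = vec[: n_in * n_hid]
    w_ho = vec[n_in * n_hid: n_in * n_hid + n_hid]
    b_h = vec[n_in * n_hid + n_hid: n_in * n_hid + 2 * n_hid]
    b_o = vec[-1]
    hidden = [
        sigmoid(sum(x[i] * w_ih[i * n_hid + j] for i in range(n_in)) - b_h[j])
        for j in range(n_hid)
    ]
    return sigmoid(sum(hidden[j] * w_ho[j] for j in range(n_hid)) - b_o)


def mse(vec, samples, n_in, n_hid):
    """Mean squared error over (inputs, expected output) training pairs."""
    errors = [(mlp_output(vec, x, n_in, n_hid) - d) ** 2 for x, d in samples]
    return sum(errors) / len(errors)


# Toy usage: 3 inputs, 2 hidden nodes, all parameters zero.
n_in, n_hid = 3, 2
vec = [0.0] * (n_in * n_hid + 2 * n_hid + 1)
samples = [([0.0, 0.0, 0.0], 0.0), ([1.0, 1.0, 1.0], 1.0)]
loss = mse(vec, samples, n_in, n_hid)
```

A trainer such as LSC-SSA would treat `vec` as the position of a search agent and `mse` as the fitness to be minimized.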

B. ADOPTED DATA SETS
The three data sets used by the LSCSSA-based multi-layer perceptron trainer are from the University of California Irvine (UCI) Machine Learning Repository. Three categorical datasets (XOR dataset, Balloon dataset, and Breast Cancer dataset) are selected. In order to effectively verify the performance of the algorithm, this paper chooses data sets of different difficulty. The number of training / test samples is different, and the number of attributes is also different. Therefore, the multi-layer perceptrons have different structures. The specific information of the data sets and their MLP structures are listed in Table 6. It can be seen from Table 6 that the XOR dataset has 8 training / test samples, 3 attributes and 2 categories. The Balloon dataset and the Breast Cancer dataset have more training / test samples than the XOR dataset, so they are much more difficult than the XOR dataset. The former has 16 training / test samples, 4 attributes and 2 categories, and the optimization dimension is 55. The latter has 599 training samples, 100 test samples, 9 attributes and 2 categories, and the search agent needs to optimize 209 variables. It can be seen that the classification difficulty of the three data sets is increasing.
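The optimization dimension of a trainer follows from the MLP structure. As a rough sketch, assuming one hidden layer, one output node, and one error value (bias) per hidden and output node, the count for the Balloon dataset can be reproduced with 9 hidden nodes; the hidden layer size is an assumption here, since the structure table is not reproduced in this section.

```python
def mlp_dimension(n_in, n_hid, n_out=1):
    """Number of trainable values: connection weights plus one bias per node."""
    weights = n_in * n_hid + n_hid * n_out
    biases = n_hid + n_out
    return weights + biases


balloon_dim = mlp_dimension(4, 9)  # 4 attributes, assumed 9 hidden nodes
```

With these assumptions the Balloon network has 36 + 9 connection weights and 9 + 1 biases, which matches the dimension of 55 stated above.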

C. EXPERIMENTAL RESULTS AND ANALYSIS
In order to ensure the objectivity of the experimental results, the mean square error (MSE) and classification accuracy of LSCSSA-MLP are compared with those of PSO-MLP, ACO-MLP, ES-MLP, and PBIL-MLP in the literature [64]. The experimental results are listed in Table 11. All algorithms use a population size of 200 and a maximum of 250 iterations. The experimental results show that, on the XOR dataset, the mean square error and classification accuracy of LSCSSA-MLP are better than those of the other algorithms. ACO-MLP, ES-MLP and PBIL-MLP show no difference in classification accuracy, and PSO-MLP has the lowest classification accuracy. This shows that the PSO algorithm falls into a local optimum on the XOR dataset, while LSC-SSA can avoid falling into a local optimum. On the Balloon dataset, the mean square error of LSCSSA-MLP is second only to that of PBIL-MLP, and its classification accuracy is no different from that of the other algorithms. All algorithms reached the theoretically best accuracy. The Breast Cancer dataset has the characteristics of large scale and high dimensions, and its classification difficulty is clearly higher than that of the first two datasets. On this dataset, the mean square error and classification accuracy of LSCSSA-MLP are significantly better than those of the other algorithms. ACO-MLP has the second best optimization effect. The classification accuracy of the other three algorithms failed to exceed 20%. This shows that LSC-SSA does not fall into a dimensional disaster and can maintain a stable optimization capability. The experimental results show that the improved algorithm can be used effectively as a trainer for the multi-layer perceptron, and can find the optimal connection weights and error values. This further illustrates that LSC-SSA has good development ability and optimization performance, and can be applied to different optimization problems (continuous / discrete problems).

D. DISCUSSION OF LSC-SSA ALGORITHM
In view of the shortcomings of the SSA, this paper proposes two improvement strategies. First, Levy flight with a step control factor is used to increase the global exploration capability of the search agents. This mechanism compensates for the defect that the leader may mislead the population into a local optimum. Second, the improved sine cosine operator introduces a convergence factor and a logarithmic spiral search route, which increase the search efficiency of the leader. The combination of the two strategies improves the balance between exploitation and exploration. In the function optimization problems, LSC-SSA shows excellent convergence speed and optimization accuracy, and it is more competitive than the other algorithms. As a neural network trainer, the improved algorithm can effectively avoid local optima, and the classification accuracy of LSC-SSA is higher than that of the other algorithms. The experiments demonstrate the effectiveness of the improved algorithm from different angles and lay the foundation for its application.
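The short-walk and long-jump behavior of Levy flight can be illustrated with Mantegna's algorithm, a widely used way to draw Levy-distributed step lengths. The sketch below is not the paper's exact update rule: the constant `step_scale` merely stands in for the step control factor, whose formula is not given in this section, and the names `levy_step` and `levy_perturb` are hypothetical.

```python
import math
import random
from typing import List, Optional, Sequence


def levy_step(beta: float = 1.5, rng: Optional[random.Random] = None) -> float:
    """Draw one Levy-distributed step length with Mantegna's algorithm."""
    gen = rng if rng is not None else random
    sigma_u = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = gen.gauss(0.0, sigma_u)
    v = gen.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero (extremely unlikely)
        v = gen.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def levy_perturb(
    position: Sequence[float],
    best: Sequence[float],
    step_scale: float = 0.01,  # assumed placeholder for the step control factor
    beta: float = 1.5,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Move each coordinate by a scaled Levy step relative to the best solution."""
    return [
        x + step_scale * levy_step(beta, rng) * (x - b)
        for x, b in zip(position, best)
    ]


rng = random.Random(0)
candidate = levy_perturb([1.0, -2.0, 0.5], [0.0, 0.0, 0.0], rng=rng)
print(candidate)
```

Most draws from `levy_step` are small, but the heavy tail occasionally produces a very large step, which is what lets a search agent leave the neighborhood of a local optimum.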

VII. CONCLUSION
As a swarm intelligence optimization algorithm, the salp swarm algorithm has a simple structure. The algorithm is easy to implement because it has few parameters and operators. However, it has the disadvantages of low optimization precision and slow convergence speed. The LSC-SSA proposed in this paper first introduces the Levy flight mechanism with a step size control factor. This mechanism uses short-distance walks and long-distance jumps to search the space, which effectively improves the traversal and global exploration capabilities of the algorithm. In addition, LSC-SSA uses an improved sine cosine operator to update the position of the leader, with sine search for global exploration and cosine search for local exploitation. This mechanism allows the algorithm to switch adaptively between the two search methods and achieve a smooth transition between exploration and exploitation. In the simulation experiments, firstly, 28 benchmark test functions were used to carry out comparison experiments, in which LSC-SSA showed clear advantages: the improved algorithm has higher global convergence and optimization accuracy than the other algorithms. Secondly, the high-dimensional function optimization experiments verify that the proposed LSC-SSA is not affected by the curse of dimensionality and can still maintain its optimization accuracy and stability. LSC-SSA can effectively solve high-dimensional and large-scale optimization problems. At the same time, the effectiveness of the Levy flight mechanism and the improved sine cosine operator has been verified. Finally, the multi-layer perceptron trainer based on LSC-SSA achieved an ideal classification accuracy, indicating that the improved algorithm can avoid falling into local optima. LSC-SSA can not only solve continuous problems (function optimization), but also effectively solve discrete problems (training the multi-layer perceptron).
The simulation results show that the improved algorithm has powerful optimization performance, which is of great significance for further theoretical exploration and practical application of the salp swarm algorithm.
J. ZHANG is currently pursuing the master's degree with the School of Electronic and Information Engineering, University of Science and Technology Liaoning, Anshan, China. His main research interests include intelligent optimization algorithms.
J. S. WANG (Member, IEEE) received the B.Sc. and M.Sc. degrees in control science from the University of Science and Technology Liaoning, China, in 1999 and 2002, respectively, and the Ph.D. degree in control science from the Dalian University of Technology, China, in 2006. He is currently a Professor and a Master's Supervisor with the School of Electronic and Information Engineering, University of Science and Technology Liaoning. His main research interests include modeling of complex industry process, intelligent control, and computer integrated manufacturing.