An Improved Beetle Swarm Algorithm Based on Social Learning for a Game Model of Multiobjective Distribution Network Reconfiguration

With the increasing penetration of distributed generation (DG) and electric vehicle (EV) loads in distribution networks, it is more difficult to ensure safe and economic operation because of the high volatility and randomness of DG and EV loads. In this article, the uncertainties of wind power, photovoltaics, conventional loads, and EV loads are considered. Because both the photovoltaic output and the conventional load are related to solar radiation, the photovoltaic output is subtracted from the conventional load to form the net load. Then, the Wasserstein distance is used to divide the scenes, and K-means clustering is used to reduce them, so that the reconstruction analysis is carried out on a limited number of typical scenes. In addition, to minimize network loss, load balance and maximum voltage deviation, a multiobjective game reconstruction model of the distribution network is established subject to the network constraints. Moreover, a social beetle swarm optimization algorithm that considers two social behaviors is adopted to solve this complex problem. Finally, the method is verified by simulation on the standard IEEE-33 and IEEE-118 systems. The results show that the proposed strategy and algorithm can effectively reduce the network loss, improve the node voltage level and ensure that the load is not overloaded.


I. INTRODUCTION
Distribution network reconfiguration (DNRC) changes the topological structure of the distribution network by changing the state of switches while satisfying network constraints, thereby improving the performance of the distribution network [1]. DNRC has become more significant now that many distributed generations (DGs) are planned for and installed in distribution networks. With an increasing number of new energy sources and electric vehicles connected to the power grid, DNRC has been considered an important solution for achieving power system economy and security [2]. Recently, it has become an inevitable trend to fully use renewable DGs, such as wind turbines (WT) and photovoltaic (PV) units, because they have substantial advantages [3]. The global power structure is gradually changing, and the importance of renewable energy continues to increase. It is estimated that in 2040, renewable energy generation dominated by wind power and photovoltaics will account for approximately 30% of global power generation, and the proportion of global renewable energy, especially wind power generation and solar power generation, will continue to increase. However, the extensive penetration of random DGs and uncertain loads greatly increases the risks to the safe and economic operation of distribution networks. Therefore, it is necessary to perform research on this issue.
The associate editor coordinating the review of this manuscript and approving it for publication was Ying Xu.
Wind power, photovoltaics (PVs), electric vehicles (EVs) and other uncertain factors connected to the grid greatly increase its uncertainty [4], [5]. Wind power and photovoltaic power change with wind speed and solar radiation, respectively, and the EV load is closely related to human social behavior. In distribution network reconfiguration, models with DG variation have been established [6], [7], in which the wind speed, solar radiation and EVs are simulated by the Weibull distribution [8], beta distribution [9] and normal distribution [10], respectively. These three uncertainties are modeled separately. In fact, the normal load is related to the PV output, because solar radiation affects the normal load. Therefore, in this article, PVs and conventional loads are combined to form the net load, which is simulated by the beta function. There are two main methods for addressing these uncertainties: the scene number optimization method and the robust optimization method. The robust optimization method finds the best solution under the worst-case constraints without generating many scenarios. However, the worst-case solution may be too conservative, multilevel iteration is required to solve the min-max problem, and the solution may not meet the constraints under unexpected conditions. To address this issue, [11] proposes that the worst-case value of each uncertain factor be determined flexibly, rather than assuming the worst-case values for all uncertain factors in the box uncertainty set. Furthermore, [12] proposes a novel risk-based uncertainty set optimization method for the energy management of typical hybrid AC/DC microgrids. This method greatly improves the accuracy of uncertainty processing, but it cannot avoid a high computational cost. The scene number optimization method, in contrast, is a stochastic optimization method.
Uncertain variables are assumed to follow a specific probability distribution, and discrete scenarios are generated accordingly [13]. As a result, the generated scenarios are only a rough approximation of the true distribution of the uncertainty. The method is only approximately optimal and needs many scenes, so its accuracy must be traded off against the amount of calculation. However, it is simple and effective. Therefore, multiscenario technology has become the main method for addressing power system stochastic optimization. The Wasserstein probability distance index is widely used to divide wind power generation and photovoltaic power generation scenarios, and a classification probability synthesis multiscenario analysis method is proposed in references [14], [15] to address the differences in the random distribution characteristics of wind power, PVs and EVs. However, generating many scenarios increases the amount of calculation. To reduce the computational burden, it is necessary to simplify the scenarios. Scene reduction methods are primarily heuristic algorithms, including the k-medoid clustering algorithm, K-means clustering algorithm, simultaneous backward reduction (SBR), and fast forward selection (FFS) [15]-[17].
Many scholars use multiobjective optimization for distribution network planning and reconfiguration considering the uncertainty of distributed generations. For example, Yang Li proposed a two-stage optimization method for optimal distributed generation planning considering the integration of energy storage in [18]. Reference [19] introduces an optimal probabilistic study based on a multiobjective problem that integrates the stochastic behavior of renewable resources to improve system performance. Reference [20] presents a multiobjective approach to maximise the loadability of distribution networks by simultaneous reconfiguration and optimal allocation of distributed energy resources using a comprehensive teaching-learning-based optimisation algorithm. These papers consider the location and capacity optimization of DGs and have made great contributions to improving the performance of distribution networks and promoting the consumption of renewable energy. In network reconfiguration, some scholars prioritize the economy and strive to minimize the network loss, so the single objective is the minimum active power loss of the distribution network [21], [22]. Some researchers not only pursue the economy but also attach importance to the security of the power grid, so they take reducing the network loss and the voltage deviation as a double objective [23]. In addition, to ensure that the load in the power grid is not overloaded, Jakus takes the minimum network loss and the minimum load balance as the optimization goals [24]. When addressing multiobjective optimization, the objectives are usually weighted and converted into a single objective. This processing is relatively rough, and the three objectives cannot be optimized at the same time. Therefore, some scholars use a double-layer model to address multiple objectives [25].
Other scholars introduced game theory and constructed a multiobjective game model [26], [27] to solve the multiobjective problem without subjectivity. However, these methods do not consider the mutual restriction between the objectives. In distribution network reconfiguration, the network loss generally represents the economy, while the load balance and voltage deviation generally represent the security and stability of distribution network operation. In this article, considering the cooperative relationship between the load balance degree and the voltage deviation degree, a cooperative game alliance model is formed.
On the other hand, the Pareto optimal solution is widely used in solving multiobjective optimization problems [28]. However, the Pareto optimal solution is only an acceptable solution set of the problem, and there are generally multiple Pareto optimal solutions. Decision makers must then choose among them, which is not objective. Although there are many improved methods for identifying a Pareto frontier as a set of candidate solutions [29], [30], it remains difficult to obtain an optimal solution objectively when solving multiobjective problems, and improving the algorithm generally increases its complexity. In contrast, the game model essentially transforms the multiobjective problem into a single-objective problem, so a heuristic algorithm based on the cooperative game model is effective for solving the multiobjective problem in this article. Furthermore, swarm intelligence algorithms, such as the genetic algorithm (GA) [31], particle swarm optimization (PSO) [32], the moth swarm algorithm (MSA) [33], the cuckoo search algorithm (CSA) [34], [35], and the heuristic algorithm (HA) [36], are often used to solve multiobjective models. These algorithms easily fall into local optimal solutions when solving multiobjective models. In this regard, references [37] and [38] proposed a cat swarm optimization (CSO) algorithm and a modified gray wolf optimization (MGWO) algorithm to improve the GA and PSO algorithms, respectively. In addition, J. Moshtagh proposed a nondominated sorting genetic algorithm II (NSGA-II) to solve the multiobjective problem in the reconfiguration of the distribution network [39], [40]. However, these algorithms are easily affected by their parameters, and their solutions are unstable. In 2017, the beetle antennae search (BAS) was proposed as an efficient and intelligent search algorithm [41]. The BAS algorithm does not need to know the specific form of the search and is simple and efficient.
However, in each iteration, the convergence result of the BAS algorithm depends on the direction of the beetle, which has great randomness [42]. To solve these problems, PSO and BAS are combined to form the beetle swarm optimization (BSO) algorithm. However, the optimization speed of this algorithm remains insufficient, and it cannot fully escape local optimal solutions. In this article, to improve the efficiency of optimization, the search for the global optimal solution is accelerated by changing the operator, adding social learning behavior, and sorting after each iteration. Moreover, chaos theory is added to address premature convergence and avoid falling into a local optimal solution.
The main contributions of this article are as follows:
(1) The uncertainties of DGs and the EV load are considered. The Wasserstein distance and K-means clustering methods are used to address these uncertainties. Additionally, PVs and conventional loads are combined to form the net load, which agrees better with the actual situation.
(2) Considering the mutual restriction and cooperation between the objectives, Objectives 2 and 3 are treated as cooperating. Together with Objective 1, they form the multiobjective game reconstruction model of the distribution network.
(3) The social beetle swarm optimization (SBSO) algorithm, which considers two social behaviors together with the cognitive factor and a replacement method, is used to solve the model and improve the search efficiency, and chaos theory is added to address premature convergence.
The overall framework of this article is shown in Fig. 1.
Section II provides the random model and treatment method for renewable sources and load uncertainty. Section III introduces the objective functions and constraints. Section IV describes the simplification and encoding of the topology based on basic loops. Section V explains the SBSO algorithm and the test results of 3 benchmark functions obtained with the SBSO algorithm. Section VI shows the results for the IEEE 33-bus and IEEE 118-bus radial distribution test networks. Finally, Section VII presents the paper's conclusions.

II. MODEL OF UNCERTAINTY
A. UNCERTAINTY MODELING OF WIND POWER OUTPUT
Wind power depends on wind energy to generate electricity, and the output power of the wind turbine is expressed as:
$$P = \frac{1}{2} \rho s v_w^3 C_p \tag{1}$$
where $P$ is the actual shaft power obtained by the fan in W; $\rho$ is the air density in kg/m^3; $v_w$ is the actual wind speed in m/s; $s$ is the swept area of the wind turbine in m^2; and $C_p$ is the wind energy utilization coefficient, with a maximum value of 0.593. From (1), the output of the fan is proportional to the cube of the wind speed. The wind speed distribution has strong uncertainty. The Weibull distribution has a simple form and fits well. In this article, the Weibull distribution is used to simulate the wind speed distribution [8], as shown in the following equation.
$$f(v_w) = \frac{k_c}{c} \left(\frac{v_w}{c}\right)^{k_c - 1} \exp\left[-\left(\frac{v_w}{c}\right)^{k_c}\right] \tag{2}$$
where $c$ is the Weibull scale parameter and $k_c$ is the Weibull shape parameter. The value range of $k_c$ is [1.8, 2.3]. A Weibull simulation was performed with the actual wind speed sampled every 10 minutes at an altitude of 80 m in the Dabancheng wind field in Xinjiang, China, in June 2019. The actual wind speed distribution and Weibull distribution are shown in Fig. 2, which shows that the Weibull distribution can fit the wind speed distribution well.
The relationship between the wind speed and the wind turbine active power output $P_W$ is as follows:
$$P_W = \begin{cases} 0, & v_w \le v_{ci} \text{ or } v_w \ge v_{co} \\ k_1 v_w + k_2, & v_{ci} < v_w < v_r \\ P_r, & v_r \le v_w < v_{co} \end{cases} \tag{3}$$
where $P_r$ is the rated capacity of the wind turbine; $P_W$ is the actual power of the wind turbine; $v_{ci}$ is the cut-in speed of wind power; $v_r$ is the rated speed of wind power; and $v_{co}$ is the cut-out speed of wind power. According to reference [8], when $v_{ci} < v_w < v_r$, the probability function of wind power output is as follows:
$$f(P_W) = \frac{k_c}{k_1 c} \left(\frac{P_W - k_2}{k_1 c}\right)^{k_c - 1} \exp\left[-\left(\frac{P_W - k_2}{k_1 c}\right)^{k_c}\right] \tag{4}$$
where $k_1 = P_r/(v_r - v_{ci})$ and $k_2 = v_{ci} P_r/(v_{ci} - v_r)$. The output reactive power $Q_W$ is:
$$Q_W = P_W \tan\theta \tag{5}$$
where $\cos\theta$ is the power factor.
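The wind model above can be exercised numerically. The following is a minimal sketch, assuming the piecewise-linear power curve implied by the definitions of k_1 and k_2 and the relation Q_W = P_W tan θ. The function names and the turbine values used in any call (rated power, cut-in, rated and cut-out speeds) are illustrative assumptions, not data from this article.

```python
import math
import random


def wind_power(v, p_rated, v_ci, v_r, v_co):
    """Active power of a turbine at wind speed v (piecewise-linear curve)."""
    if v <= v_ci or v >= v_co:
        return 0.0  # below cut-in or above cut-out: no output
    if v < v_r:
        k1 = p_rated / (v_r - v_ci)
        k2 = v_ci * p_rated / (v_ci - v_r)
        return k1 * v + k2  # linear ramp between cut-in and rated speed
    return p_rated  # rated output between rated and cut-out speed


def reactive_power(p_w, power_factor):
    """Reactive power for a given power factor cos(theta) (assumed form)."""
    theta = math.acos(power_factor)
    return p_w * math.tan(theta)


def sample_wind_speeds(n, scale_c, shape_k, seed=0):
    """Draw n wind speeds from a Weibull distribution (scale c, shape k_c)."""
    rng = random.Random(seed)
    return [rng.weibullvariate(scale_c, shape_k) for _ in range(n)]
```

Feeding the sampled speeds through `wind_power` gives an empirical distribution of turbine output that can be compared with the analytical probability function of the wind power output.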

B. UNCERTAINTY MODELING OF THE NET LOAD
In general, the load and photovoltaics are treated as two independent uncertain variables [43]. However, in practice, people's electricity consumption behavior has a strong relationship with solar radiation. The air-conditioning load and industrial load are heavy under strong light, while the load that is inversely related to the light is small, so the heavy load changes with the light conditions. In this article, considering the interaction between load demand and photovoltaic intensity, a random variable is set as the net load $P_{PL}$, which considers the uncertainty of the linkage between the load and the photovoltaic output.
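$$P_{PL} = P_{load} - P_{pv} \tag{6}$$
where $P_{load}$ is the load at each moment and $P_{pv}$ is the photovoltaic output. The solar radiation can be fitted by the beta function [10], and its probability density function is as follows:
$$f(E) = \frac{\Gamma(\alpha_c + \beta_c)}{\Gamma(\alpha_c)\Gamma(\beta_c)} \left(\frac{E}{E_{max}}\right)^{\alpha_c - 1} \left(1 - \frac{E}{E_{max}}\right)^{\beta_c - 1} \tag{7}$$
where $E$ is the solar radiation, $\Gamma$ is the gamma function, $\alpha_c$ and $\beta_c$ are the shape parameters of the distribution function, and $E_{max}$ is the maximum value of solar radiation.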
The probability density function of the net load is: where P max is the maximum output power of P PL .

C. UNCERTAINTY MODELING OF THE EV LOAD
Electric vehicles have strong spatiotemporal randomness. The EV forecasting method based on the travel chain and charging frequency of electric vehicles [10] is applied to the NHTS2009 statistics, and the lognormal distribution is used to approximate the mileage $D$ of each trip segment of an EV [44], [5]:
$$f(D) = \frac{1}{D \sigma_D \sqrt{2\pi}} \exp\left[-\frac{(\ln D - \mu_D)^2}{2\sigma_D^2}\right] \tag{9}$$
where $\mu_D$ is the expected value of $\ln D$ and $\sigma_D$ is its standard deviation. Take $\mu_D = 3.2$ and $\sigma_D = 0.88$. According to the scale of the electric vehicles, the power consumption per kilometer and other information, the charging load expectation is calculated using the Monte Carlo method [44], and the obtained EV load curve is shown in Fig. 3.
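The Monte Carlo step can be illustrated with a minimal sketch that samples trip mileage from the lognormal distribution with the parameters quoted above (3.2 and 0.88) and converts the mean mileage into an expected energy demand. The function names and the energy-per-distance argument are illustrative assumptions. The travel-chain logic and charging-time modeling of the cited method are not reproduced here.

```python
import random

MU_D = 3.2      # lognormal location parameter quoted in the text
SIGMA_D = 0.88  # lognormal scale parameter quoted in the text


def sample_mileage(n, seed=0):
    """Draw n trip mileages from the lognormal distribution."""
    rng = random.Random(seed)
    return [rng.lognormvariate(MU_D, SIGMA_D) for _ in range(n)]


def expected_charging_energy(mileages, energy_per_distance):
    """Monte Carlo estimate of the mean energy needed per vehicle."""
    mean_mileage = sum(mileages) / len(mileages)
    return energy_per_distance * mean_mileage
```

For a lognormal variable the analytical mean is exp(μ_D + σ_D²/2), which gives a quick sanity check on the sampled mean.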
Fig. 3 shows that most users choose to charge at night. In addition, the more people there are at a location, the greater the conventional load is, and the more likely EVs are to charge there in practice. Therefore, large-scale disordered charging of EVs will threaten the economic and safe operation of the distribution network.
VOLUME 8, 2020

D. TREATMENT METHODS FOR THE UNCERTAINTY
1) SCENE DIVISION BASED ON THE WASSERSTEIN DISTANCE
This article assumes that the uncertainty probability density functions of wind power generation, the net load and the EV load are continuously distributed at a single moment, and a discrete distribution of $Q$ quantiles is used to approximate each continuous probability density function [45]. The optimal quantile $z_q$ $(q = 1, 2, \ldots, Q)$ under the Wasserstein distance for any continuous probability density function can be obtained from (10).
where $p_c(x)$ is an arbitrary continuous probability density function and $r$ is the order. The probability for each quantile is: The derivation of the quantile formula is shown in the literature [15], [46].
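A numerical sketch of the quantile division is given below. It assumes the form commonly used in Wasserstein-based scenario generation: the q-th quantile z_q satisfies ∫_{-∞}^{z_q} p_c(x)^{1/(1+r)} dx = (2q-1)/(2Q) · ∫ p_c(x)^{1/(1+r)} dx, and the probability of each quantile is the probability mass between the midpoints of neighboring quantiles. This form, the midpoint-rule integration, the finite support [lo, hi] and all names are assumptions of the sketch and should be checked against (10) and the cited derivation.

```python
def wasserstein_quantiles(pdf, lo, hi, n_quantiles, r=2, n_grid=20000):
    """Return (quantiles, probabilities) for a pdf supported on [lo, hi]."""
    dx = (hi - lo) / n_grid
    xs = [lo + (i + 0.5) * dx for i in range(n_grid)]  # cell midpoints
    dens = [pdf(x) for x in xs]
    weights = [d ** (1.0 / (1.0 + r)) for d in dens]
    total = sum(weights)

    # Locate each z_q where the cumulative weight reaches (2q - 1) / (2Q).
    targets = [(2 * q - 1) / (2 * n_quantiles) * total
               for q in range(1, n_quantiles + 1)]
    quantiles, cum, idx = [], 0.0, 0
    for x, w in zip(xs, weights):
        cum += w
        while idx < n_quantiles and cum >= targets[idx]:
            quantiles.append(x)
            idx += 1

    # Probability of each quantile: mass between neighboring midpoints.
    bounds = ([lo]
              + [(a + b) / 2.0 for a, b in zip(quantiles, quantiles[1:])]
              + [hi])
    probs = []
    for left, right in zip(bounds, bounds[1:]):
        probs.append(sum(d for x, d in zip(xs, dens) if left <= x < right) * dx)
    return quantiles, probs
```

For a uniform density on [0, 1] with four quantiles, this sketch returns points near 0.125, 0.375, 0.625 and 0.875, each with probability close to 0.25.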

2) SCENARIO SET REDUCTION BASED ON THE K-MEANS CLUSTERING ALGORITHM
The scenarios of wind power generation, net load and EV demand are divided into scenes by the Wasserstein distance, and the scene numbers $N_{WK}$, $N_{PLK}$ and $N_{EK}$ and the corresponding probabilities are obtained. Taking wind power as an example, there are $T$ distribution network reconfiguration times, and $Q_t$ quantiles are used to describe the random output of wind power at moment $t$. Then, the total size of scene set WK at time $t$ is $Q_t$, and the total sizes of scene sets PLK and EK are identical. The K-means clustering algorithm is used to reduce the scenarios and the amount of calculation without losing important information [46].
The specific steps are as follows:
(1) The Wasserstein distance is used to divide the wind power, net load and EV load into $N_{WK}$, $N_{PLK}$ and $N_{EK}$ scenarios with the corresponding probabilities $P_{WK}$, $P_{PLK}$ and $P_{EK}$.
(2) Different scenes of wind power generation, net load and EV load are arranged and combined to obtain all the scenario sets $N_K$. The probability in scenario $k$ is:
(3) $M_k$ scenes are randomly selected as the cluster centers, and the cluster centers are set as $C = \{\eta_k^C\}$ $(k = 1, 2, \ldots, M_k)$. $\eta_k$ $(k = 1, 2, \ldots, N_k)$ represents the $N_k$ different scenes before reduction, and $M_k$ is the target number of scenes.
(4) The remaining scene sets are $G = \{\eta_k^G\}$ $(k = 1, 2, \ldots, N_k - M_k)$. Calculate the scene distances from the remaining scenes to the cluster center scenes:
(5) According to $D_{k,k}$, classify the remaining scenes to the nearest cluster center. The cluster set after this clustering is $\{C_{ks}\}$ $(ks = 1, \ldots, M_k)$, where $C_{ks}$ represents a set of similar scenes.
(6) Assuming that there are $L_k$ scenes in one of the clusters $C_{ks}$, calculate the sum of the distances between each scene and the other scenes: When $CT_{ks} = \min(CT_k)$, $\eta_{ks}$ is the new cluster center, and the cluster center set is determined according to the above method.
(7) Repeat steps (4) to (6) until the cluster centers and clustering results stop changing; then, the scene reduction ends. The probability value of each typical scene is the sum of the probabilities of all scenes in its class, and the clustering centers and the probability $\omega_k$ of each typical scene are obtained.
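The steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the combined scenario probability is taken as the product of the three individual probabilities (that is, the three sources are treated as independent), the scene distance is Euclidean, and each new cluster center is the member scene with the smallest summed distance to the other members, as in step (6). All function names are hypothetical.

```python
import itertools
import random


def combine_scenarios(wind, net_load, ev):
    """Cartesian product of (value, probability) lists; probabilities multiply."""
    return [((w, l, e), pw * pl * pe)
            for (w, pw), (l, pl), (e, pe) in itertools.product(wind, net_load, ev)]


def distance(a, b):
    """Euclidean distance between two scenes (assumed distance measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def assign(points, centers):
    """Attach every non-center scene to its nearest cluster center."""
    clusters = {c: [c] for c in centers}
    for i, p in enumerate(points):
        if i in clusters:
            continue
        nearest = min(centers, key=lambda c: distance(p, points[c]))
        clusters[nearest].append(i)
    return clusters


def reduce_scenarios(scenarios, m, seed=0, max_iter=100):
    """Reduce (vector, probability) scenarios to m typical scenes."""
    rng = random.Random(seed)
    points = [vec for vec, _ in scenarios]
    centers = rng.sample(range(len(points)), m)  # step (3): random centers
    for _ in range(max_iter):
        clusters = assign(points, centers)       # steps (4)-(5)
        new_centers = [
            min(members,
                key=lambda i: sum(distance(points[i], points[j]) for j in members))
            for members in clusters.values()
        ]                                        # step (6)
        if sorted(new_centers) == sorted(centers):
            break                                # step (7): centers stopped changing
        centers = new_centers
    clusters = assign(points, centers)
    # Probability of a typical scene = sum of probabilities in its cluster.
    return [(points[c], sum(scenarios[i][1] for i in members))
            for c, members in clusters.items()]
```

Because the initial centers are random, different seeds can lead to different typical scenes when the data have no clear cluster structure.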

III. GAME MODEL OF DISTRIBUTION NETWORK RECONFIGURATION
A. MULTIOBJECTIVE MODEL
In this article, the minimum network loss, minimum load balance and minimum maximum voltage deviation are taken as the first, second and third objectives, respectively. A multiobjective model of the distribution network is formed.
The minimum network loss of the first target is: where $P_j$ is the active power of the $j$-th branch, $Q_j$ is the reactive power of the $j$-th branch, $R_j$ is the resistance of the $j$-th branch, $V_j$ is the initial voltage of the $j$-th branch, and $D_j$ is a binary value. If the switch of the $j$-th branch is open, $D_j = 0$; otherwise, $D_j = 1$. $J$ is the number of branches and $T$ is the time length.
The minimum load balance of the second target is: where $S_j^{max}$ is the maximum complex power of branch $j$.
The minimum voltage deviation of the third target is: where $V_n$ is the virtual voltage of the $n$-th node, $V_{nr}$ is the rated voltage of the $n$-th node, and $N$ is the total number of nodes.
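As an illustration of how the first and third objectives can be evaluated from a power flow result, the sketch below assumes the commonly used forms: network loss as the sum of D_j R_j (P_j² + Q_j²) / V_j² over the branches, and voltage deviation as the largest relative deviation |V_n - V_nr| / V_nr over the nodes. These forms, the dictionary keys and the function names are assumptions of the sketch, not the article's exact equations, and the summation over time is omitted.

```python
def network_loss(branches):
    """Active power loss summed over closed branches (single time step)."""
    return sum(
        b["closed"] * b["R"] * (b["P"] ** 2 + b["Q"] ** 2) / b["V"] ** 2
        for b in branches
    )


def max_voltage_deviation(voltages, rated):
    """Largest relative deviation of a node voltage from its rated value."""
    return max(abs(v - vr) / vr for v, vr in zip(voltages, rated))
```

An open branch (`closed` equal to 0) contributes nothing to the loss, which mirrors the role of the binary value D_j.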

B. MULTI-SCENE MODEL
The probability of each typical scenario is taken as the weight coefficient of the network loss, load balance degree and maximum voltage deviation in that scenario, and linear weighting is performed to obtain the expected network loss, load balance degree and maximum voltage deviation of the distribution network over the $K$ scenarios. The optimization objective of multiscene reconstruction is shown in Fig. 4. The objective function is:
$$F_i = \sum_{k=1}^{K} \omega_k f_{i,k}, \quad i = 1, 2, 3$$
where $f_{1,k}$ is the minimum network loss corresponding to the $k$-th typical scenario; $f_{2,k}$ is the minimum load balance corresponding to the $k$-th typical scenario; $f_{3,k}$ is the smallest value of the maximum voltage deviation corresponding to the $k$-th typical scenario; $F_1$, $F_2$, and $F_3$ are the expected network loss, load balance and maximum voltage deviation, respectively; and $\omega_k$ is the probability of scene $k$.
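The probability-weighted expectation described above can be written in a few lines. The sketch assumes that each scenario provides its three objective values as a tuple; the function name and data layout are illustrative.

```python
def expected_objectives(scenario_values, probabilities):
    """scenario_values[k] = (f1_k, f2_k, f3_k); returns (F1, F2, F3)."""
    return tuple(
        sum(w * f[i] for w, f in zip(probabilities, scenario_values))
        for i in range(3)
    )
```

With two scenarios of probability 0.25 and 0.75 and network losses of 10 and 20, for example, the expected network loss is 0.25 · 10 + 0.75 · 20 = 17.5.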

C. MULTIOBJECTIVE GAME MODEL
In the game model, the players are the network loss, load balance and maximum voltage deviation in the network reconstruction. The set of players is {Y u |u = 1, 2, 3}, and Y u is the u-th optimal objective function in distribution network reconfiguration. The strategy of player Y u is the switch combination of on-off when Y u is the optimal objective function in the network. The corresponding strategy set S u is: S su is the s-th network reconstruction strategy of Y u , as shown in (22).
is number of the disconnect switches under the s-th network reconfiguration strategy of player Y u , and M is the total number of disconnect switches. The Nash equilibrium strategy set of distribution network reconfiguration is as follows: S * su is the Nash equilibrium strategy of player Y u . The profit function of player Y 1 is as follows: where α, β and γ are the weighting factors of the profit function.
In this article, players $Y_2$ and $Y_3$ cooperate to form the alliance $Z_{2,3}$, which gives a cooperative game alliance model. According to the importance that decision makers attach to each optimization objective, this article proposes the revenue function $F_4(s)$ as shown in (27). The corresponding revenue function is expressed as follows:
$$\min FS_{1,2}(S_{sZ_{2,3}}) = \lambda F_1(S_{sZ_{2,3}}) + \gamma F_4(S_{sZ_{2,3}}) \tag{28}$$
where $F_1(S_{sZ_{2,3}})$ and $F_4(S_{sZ_{2,3}})$ are the network power loss and alliance revenue after being standardized, respectively; $S_{sZ_{2,3}}$ is the alliance strategy set of players $Y_2$ and $Y_3$; and $\lambda$ and $\gamma$ are weight factors.
In summary, the players are $Y_1$ and $\{Y_2, Y_3\}$; the game strategy sets are $S_1$ and $S_{2,3} = \{S_2, S_3\}$; and the profit functions are the profit function (25) of player $Y_1$ and the alliance revenue function (28).

D. CONSTRAINTS
Network reconfiguration also must meet the following constraints.
a. Power flow equation constraints.
$$P_n + P_{w,n} = P_{PL,n} + P_{EV,n} + U_n \sum_{q=1}^{N} U_q \left(G_{nq}\cos\theta_{nq} + B_{nq}\sin\theta_{nq}\right) \tag{30}$$
$$Q_n + Q_{w,n} = Q_{PL,n} + Q_{EV,n} + U_n \sum_{q=1}^{N} U_q \left(G_{nq}\sin\theta_{nq} - B_{nq}\cos\theta_{nq}\right) \tag{31}$$
where $P_n$ and $Q_n$ are the active and reactive power injected on node $n$, respectively; $P_{w,n}$ and $Q_{w,n}$ are the active and reactive power output of the wind turbine on node $n$, respectively; $P_{PL,n}$ and $Q_{PL,n}$ are the active and reactive power of the net load on node $n$, respectively; $P_{EV,n}$ and $Q_{EV,n}$ are the active and reactive power of the EV load on node $n$, respectively; $G_{nq}$, $B_{nq}$ and $\theta_{nq}$ are the conductance, susceptance, and voltage phase angle difference of the branch between nodes $n$ and $q$, respectively; $U_n$ and $U_q$ are the voltages of nodes $n$ and $q$, respectively; and $N$ is the number of system nodes.
b. Node voltage constraints.
$$U_{n,min} \le U_n \le U_{n,max} \quad (n = 1, \ldots, N) \tag{32}$$
$U_{n,min}$ and $U_{n,max}$ are the lower and upper voltage limits of node $n$, respectively.
c. Branch power constraints.
$$S_j \le S_{j,max} \quad (j = 1, \ldots, J) \tag{33}$$
$S_j$ is the power of branch $j$; $S_{j,max}$ is the maximum power allowed by branch $j$; and $J$ is the total number of branches.
d. DG capacity constraints.
$$P_{DG,n,min} \le P_{DG,n} \le P_{DG,n,max} \quad (n = 1, \ldots, N_{DG}) \tag{34}$$
$P_{DG,n}$ is the active output of distributed energy on node $n$; $P_{DG,n,min}$ and $P_{DG,n,max}$ are the minimum and maximum output power of distributed energy on node $n$, respectively; and $N_{DG}$ is the total number of distributed generations.
e. Network constraints. The distribution network must have a radial topology. Loops and islands are infeasible solutions.
When a constraint is violated, the violation is added to a penalty function whose size is determined by the amount by which the limit is exceeded, and the penalty function is then included in the objective function [47].
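A simple way to realize such a penalty is sketched below. It assumes a linear penalty on the total amount of voltage and branch power violation with a single weight. The penalty shape, the weight value and the function name are illustrative assumptions, since the exact penalty function of [47] is not given here.

```python
def penalized_objective(objective, voltages, v_min, v_max,
                        branch_flows, branch_limits, weight=1e3):
    """Add a penalty proportional to the total constraint violation."""
    violation = 0.0
    for u in voltages:
        # Amount by which a node voltage leaves the band [v_min, v_max].
        violation += max(0.0, v_min - u) + max(0.0, u - v_max)
    for s, s_max in zip(branch_flows, branch_limits):
        # Amount by which a branch flow exceeds its limit.
        violation += max(0.0, s - s_max)
    return objective + weight * violation
```

A feasible solution keeps its original objective value, while an infeasible one is pushed away in proportion to how far it exceeds the limits.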

IV. NETWORK SIMPLIFIED CODING
The standard IEEE 33-node system is selected. The system includes 33 nodes, 32 section switches and 5 tie switches. The reference voltage is 12.66 kV, and the reference power is 1000 kW. The specific parameters are found in the literature [48]. The charging node is shown in Fig. 5, and the EV load is allocated according to its basic load ratio of 1:2:4:1:2.
The parameters of the EVs are shown in Table 1, and the DG type, capacity and access node are shown in Table 2 [49].
Many switches are involved in the reconfiguration of the distribution network, so the particle dimension is large and there will be many infeasible solutions. Therefore, the distribution network is coded as shown in Table 3.
In this article, the distribution system is divided into several ring networks using the coding method in reference [46]. Only one switch is disconnected in each ring network, which ensures the radial topology constraints and greatly reduces the number of infeasible solutions.
The network loop switch matrix is defined as: where $w$ is the number of interconnection switches; $SL_{su}$ is a 0/1 matrix; and if the switch corresponding to position $b$ of the particle is in loop $a$, $\alpha_{a,b}$ is 1; otherwise, it is 0. For example, the numbered particle $S_{su}$ = [3 6 11 12 2] corresponds to switches (4, 14, 4, 17, 4). Switch 4 is in loop 1, so $\alpha_{1,1}$ is 1; switch 14 is not in loop 1, so $\alpha_{1,2}$ is 0; switch 4 is in loop 1, so $\alpha_{1,3}$ is 1; switch 17 is not in loop 1, so $\alpha_{1,4}$ is 0; switch 4 is in loop 1, so $\alpha_{1,5}$ is 1; and so on. The corresponding $SL_{su}$ is: The $SL_{su}$ matrix corresponding to $S_{su}$ is a nondiagonal matrix, and the same row appears twice in the matrix, which indicates that the switch shared between loops has been selected three times, so a loop remains in the network and the particle is an infeasible solution. If the elements of a row of $SL_{su}$ are all 0, that ring network has no selected switch, an island will be generated, and the particle is also an infeasible solution.
In summary, if the rank $R(SL_{su})$ of $SL_{su}$ satisfies $R(SL_{su}) < a$, the particle is an infeasible solution.
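The rank test can be sketched in a few lines. The example assumes that each loop is given as a set of switch numbers and that a particle is a list with one selected switch per loop; the three loops used in any call are hypothetical and are not taken from Table 3. The rank is computed by exact Gaussian elimination so that no external library is needed.

```python
from fractions import Fraction


def loop_switch_matrix(loops, opened):
    """Entry (a, b) is 1 if the b-th selected switch belongs to loop a."""
    return [[1 if sw in loop else 0 for sw in opened] for loop in loops]


def matrix_rank(rows):
    """Rank of a small matrix by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank


def is_feasible(loops, opened):
    """A particle passes the test only if the matrix has full row rank."""
    return matrix_rank(loop_switch_matrix(loops, opened)) == len(loops)
```

With hypothetical loops {1, 2, 3}, {3, 4, 5} and {5, 6, 7}, the particle [1, 4, 6] passes the test, while [3, 3, 6] (two identical rows) and [1, 2, 6] (an all-zero row for the second loop) are rejected.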

V. SBSO ALGORITHM WITH A SOCIAL LEARNING STRATEGY
A. PARTICLE SWARM ALGORITHM
Dr. Eberhart and Dr. Kennedy proposed the PSO algorithm, which is based on social behavior, in 1995. The PSO algorithm relies on the velocity v and the position x of each particle for optimization. Each particle searches for its own best solution Pbest in the search space and shares information with the other particles in the swarm, and the best of these individual extrema is taken as the global optimal solution Gbest. All particles in the swarm update their speed and position based on Pbest and Gbest.

B. BAS ALGORITHM
The beetle antennae search algorithm is a new optimization algorithm proposed in 2017. The direction of the beetle's movement is judged by comparing the fitness values at the two antenna positions of the beetle. If the left fitness value of the beetle is greater than that of the right, the beetle moves to the left; otherwise, it moves to the right [41], [42]. In this article, multidimensional function optimization is taken as an example. The steps are as follows:
(1) The position of the beetle in the o-dimensional solution space is
(2) The spatial coordinates of the left and right sides of the beetle are as follows: where $x_l$ is the left side of the search area; $x_r$ is the right side of the search area; $L$ is the distance from the two antennae to the center of mass; and $b$ is a random unit vector.
(3) Update the location of the beetle: where $x^t$ is the centroid coordinate of the beetle at the $t$-th iteration; $x_l^t$ and $x_r^t$ are the left and right antenna positions at the $t$-th iteration, respectively; $f(x)$ is the fitness value of $x$; $\delta^t$ is the step size at the $t$-th iteration; and sign(x) is the sign function.
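The three steps can be sketched as a short routine. This is a minimal sketch for minimization, assuming antenna positions x ± L·b, a move of one step δ away from the antenna with the larger function value, and a geometrically decaying step size. The sign convention, the decay factor, the fixed antenna length and the function names are assumptions of the sketch rather than the exact update rules of the article.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Random direction b with unit Euclidean norm."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def bas_minimize(f, x0, antenna_len=2.0, step=1.0, decay=0.95, iters=100, seed=0):
    """Beetle antennae search for a minimum of f, starting from x0."""
    rng = random.Random(seed)
    x = list(x0)
    best_x, best_f = list(x), f(x)
    for _ in range(iters):
        b = random_unit_vector(len(x), rng)
        x_left = [xi + antenna_len * bi for xi, bi in zip(x, b)]
        x_right = [xi - antenna_len * bi for xi, bi in zip(x, b)]
        diff = f(x_left) - f(x_right)
        direction = (diff > 0) - (diff < 0)  # sign function
        # Minimization: step away from the antenna with the larger value.
        x = [xi - step * direction * bi for xi, bi in zip(x, b)]
        fx = f(x)
        if fx < best_f:
            best_x, best_f = list(x), fx
        step *= decay  # shrink the step as the search proceeds
    return best_x, best_f
```

Because each iteration uses a single random direction, the result depends strongly on the random sequence, which is the weakness that the swarm-based extension is intended to address.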

C. BEETLE SWARM OPTIMIZATION WITH A SOCIAL LEARNING STRATEGY
The BAS algorithm is only for the individual, and PSO focuses on groups. Therefore, this article integrates the BAS and PSO algorithms. To handle the problem of premature convergence, the chaos migration strategy is used in the proposed SBSO algorithm [50].
The similarity of beetles is described as follows: where $d(q,p)$ is the Euclidean distance from individual $q$ to $p$, $q$ is the beetle with the best fitness, $d_{min}$ is the Euclidean distance of the nearest individual to $q$, $d_{max}$ is the Euclidean distance of the beetle farthest from $q$, and $NP$ is the population size. The logistic map is described as follows: where $X_d$ is the $d$-dimensional variable of the chaotic sequence $X$, $X_d \in [0, 1]$, and $\eta \in (3.569, 4)$. The chaos initial population is scaled into [0, 1], and the chaotic sequence $G = \{X_1, \ldots, X_S\}$ is obtained iteratively. The iterative process from $t$ to $t+1$ for individual $p$ is where $P_p^L$ is the learning probability of individual $p$, which is inversely proportional to its own fitness value; and $p_p^t$ is a random number of individual $p$ in [0, 1].
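The chaotic sequence can be generated with the standard logistic map X_{n+1} = η X_n (1 - X_n). The sketch below assumes this standard form, an illustrative value η = 3.9 inside the stated range, and a simple linear scaling of the sequence onto the search bounds. The function names and the starting value are hypothetical.

```python
def logistic_sequence(x0, eta=3.9, length=10):
    """Iterate the logistic map starting from x0 in (0, 1)."""
    if not 0.0 < x0 < 1.0:
        raise ValueError("x0 must lie strictly between 0 and 1")
    seq = [x0]
    for _ in range(length - 1):
        seq.append(eta * seq[-1] * (1.0 - seq[-1]))
    return seq


def chaotic_population(size, dim, lower, upper, x0=0.37, eta=3.9):
    """Map a chaotic sequence in [0, 1] onto the search bounds."""
    seq = logistic_sequence(x0, eta, size * dim)
    return [
        [lower[d] + seq[i * dim + d] * (upper[d] - lower[d]) for d in range(dim)]
        for i in range(size)
    ]
```

The resulting positions stay inside the bounds and can be used to replace individuals that have become too similar to the best beetle.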
On one hand, the cognitive factors and social learning factors are taken into the SBSO algorithm. On the other hand, before the next iteration, the bad particles are replaced by the preference random method, and the particles with bad fitness values are eliminated. Thus an improved beetle search particle swarm optimization algorithm based on social behaviors was proposed.
In the social learning process, beetles not only learn from individuals who are better than themselves, but also learn from the average status of the entire social group; these two social behaviors are combined into social learning factors [51]. Beetles can learn from different individuals in each dimension, which can greatly reduce the number of iterations and helps solve high-dimensional and complex problems. Thus, the speed update rules of the beetle are as follows: where $b$ $(P_b > P_p)$ is the learning object randomly selected by individual $p$ among the individuals with greater fitness than its own, and individual $p$ can learn from different objects randomly in different dimensions. Individual $p$ can learn from many individuals who are better than it, so that it can obtain more information. $x_{pd}^t$ is the position of individual $p$ in dimension $d$ of generation $t$ $(1 \le d \le D)$; $D$ is the dimension of the decision space; $z_0^t$ is a cognitive factor; $z_2$ and $z_3$ are random numbers of individual $p$ in [0, 1]; $\varepsilon$ is the influence factor for controlling the social influence of $x_m^t$, which is taken as 0.01 in this article; $z_{1,max}$ and $z_{1,min}$ are the maximum and minimum inertia weights, respectively; $T_{max}$ is the maximum number of iterations; and $x_{max,d}^t$ and $x_{min,d}^t$ are the upper and lower search bounds of the $d$-dimension variables, respectively.
When the fitness of an individual is lower than the average value, the individual performance is not good, so it is necessary to increase the cognitive factor to strengthen the global search ability; otherwise, it is necessary to reduce the cognitive factor to accelerate the convergence speed. In (45), $F$, $F_{min}$, $F_{max}$ and $F_{avg}$ are the current individual fitness value, population minimum fitness value, population maximum fitness value and average fitness value, respectively; and $z_{0,max}$ and $z_{0,min}$ are the maximum and minimum cognitive factors, respectively.
The fitness values are sorted, and the last NP/6 particles in the ranking are replaced. Updating the positions of these final particles with the preference random method makes full use of the effective information in the current solution set and helps the search jump out of local optimal solutions.
The bad beetles are replaced by (46).
$x_{pr}^{t}$ is the bad beetle; $x_{p,1}^{t}$ and $x_{p,2}^{t}$ are any two particles in the population; and $z_4$ is a random number of individual $p$ in [0, 1].
The probability function is: where $g$ is the rank of individual $p$ in the population after fitness sorting.
Because the SBSO algorithm is a heuristic search method, its optimization results have a certain randomness. A large number of tests show that when the algorithm is run independently 30 times, the minimum value is the value that occurs most frequently. If the difference between two values is less than 10^-7, they are regarded as the same number, and the open switches corresponding to the minimum value are taken as the final solution. In this article, the algorithm is run independently 50 times per batch, and the number of occurrences of the minimum value is counted. If the minimum value appears most often, it is taken as the optimal solution; otherwise, another batch of 50 independent runs is performed. The social learning strategy process is shown in Fig. 6. The algorithm flow chart of this article is shown in Fig. 7.
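The repeated-run acceptance rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this article: `run_once` is a hypothetical callable that performs one independent optimization run and returns a pair (objective value, open switches), and the cap `max_batches` is added here only so that the loop always terminates.

```python
def select_by_frequency(run_once, runs_per_batch=50, tol=1e-7, max_batches=10):
    """Accept the minimum value only if it is also the most frequent value.

    run_once: hypothetical callable returning (objective_value, solution).
    Values closer than `tol` to a group's first value are counted as equal.
    Returns (objective_value, solution), or None if no batch is accepted.
    """
    for _ in range(max_batches):
        results = sorted((run_once() for _ in range(runs_per_batch)),
                         key=lambda r: r[0])
        groups = []  # each entry: [representative_result, count]
        for result in results:
            if groups and abs(result[0] - groups[-1][0][0]) < tol:
                groups[-1][1] += 1
            else:
                groups.append([result, 1])
        best_result, best_count = groups[0]   # group holding the minimum value
        if best_count == max(count for _, count in groups):
            return best_result
        # Otherwise the minimum was not the most frequent: run another batch.
    return None
```

A caller that receives `None` knows that the minimum never dominated within the allowed number of batches, which is a signal to revisit the algorithm parameters rather than to accept a rarely found value.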

D. EXPERIMENTAL DESIGN AND RESULTS ANALYSIS
To preliminarily verify the effectiveness of the SBSO algorithm, it is used to solve the Ackley function. The Ackley function and the negated Ackley function are shown in Fig. 8. The initial parameters are set as follows: the dimension is 2, the number of beetles is 10, the number of iterations is 100, L = 2, $z_{0,\max}$ = 2.5, $z_{0,\min}$ = 1.5, $z_{1,\max}$ = 0.9, $z_{1,\min}$ = 0.4, and the social impact factor is 0.01. The solving process of the SBSO algorithm is shown in Fig. 9, and the number of iterations is shown in Fig. 10.
The SBSO algorithm effectively finds the minimum value of the function: the minimum of the Ackley function is reached after approximately 30 of the 100 allowed iterations, which indicates fast convergence and high efficiency.
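For reference, the commonly used form of the Ackley benchmark function can be written in a few lines of Python. The constants a = 20, b = 0.2, and c = 2π are the standard textbook choices and are an assumption here; the exact variant plotted in Fig. 8 is not specified in the text.

```python
import math

def ackley(x, a=20.0, b=0.2, c=2.0 * math.pi):
    """Standard Ackley function; global minimum 0 at the origin."""
    d = len(x)
    sum_sq = sum(xi * xi for xi in x)
    sum_cos = sum(math.cos(c * xi) for xi in x)
    return (-a * math.exp(-b * math.sqrt(sum_sq / d))
            - math.exp(sum_cos / d) + a + math.e)

# Two-dimensional case, matching the dimension used in the preliminary test.
value_at_origin = ackley([0.0, 0.0])
```

The many local minima produced by the cosine term are what make this function a useful first check of whether an algorithm can escape local optima.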
To further verify the superiority of the SBSO algorithm, three representative benchmark functions are simulated and the results are compared with those of the PSO, BAS and BSO algorithms. F1 is a unimodal function, and F2 and F3 are multimodal functions, which are used to test the ability of the algorithm to jump out of local optimal solutions. The initial parameters are set as follows: the dimension is 5, the number of beetles is 10, and the number of iterations is 300. Table 4 lists the names of the three benchmark functions, the function expression F, the search space range, the minimum value $F_{\min}$, and the dimension D of the variable.
To ensure the fairness and objectivity of the evaluation, each of the four algorithms is run independently 50 times with a variable dimension of 5. The average values of the experimental results are shown in Table 5.
Table 5 shows that the SBSO algorithm is superior to the PSO, BAS and BSO algorithms. The BAS algorithm clearly falls into local optimal values when solving F2 and F3, and it can only jump out of the local optimal value with a small probability when solving other test functions. Overall, BAS has a limited global optimization ability. The global optimization ability of PSO is better than that of BAS, but it still cannot always jump out of the local optimal solution. BSO has a high probability of jumping out of the local optimal value, but the accuracy of its solution is insufficient. By comparison, SBSO has the strongest global optimization ability.
In terms of the number of iterations, PSO and BAS need hundreds of iterations to reach the optimal value, while BSO and SBSO need fewer iterations, so the solving efficiency is substantially improved. SBSO needs fewer iterations than BSO and shows a steady and rapid decline in all three test functions, so the SBSO solution is both effective and efficient.
For each test function, a scatter plot is drawn with the optimal value on the x-axis and the worst value on the y-axis. Fig. 11(a) shows that two points of the BAS algorithm are far from the origin, which means BAS falls into local optimal solutions, while the other three algorithms can jump out of the local optimal solution to a certain extent for the three test functions; the SBSO algorithm is closest to the origin and therefore has the highest accuracy. In addition, the SBSO points lie only a slight distance from the line y = x, which shows that the difference between its optimal value and its worst value is the smallest and that the algorithm is stable and more robust.
Fig. 11(b), which is drawn after sorting the average fitness and the number of iterations of the different algorithms for each test function, shows that the SBSO algorithm has the minimum average fitness for the unimodal function F1 and the multimodal functions F2 and F3. Fig. 12 shows that SBSO requires the fewest iterations and the least running time to find the global optimal solution for all test functions, which means that the computation cost of the proposed approach is the lowest.

VI. TEST AND ANALYSIS
The computer used in this study had an Intel(R) Core(TM) i5-2400 CPU @ 3.10 GHz. The operating system was Windows 7 Ultimate 64-bit, and the running environment was MATLAB 2016a. For the abovementioned IEEE-33 node distribution network system, the cut-in wind speed is $v_{ci}$ = 5 m/s, the cut-out wind speed is $v_{co}$ = 24 m/s, and the rated wind speed is $v_r$ = 11 m/s. The wind speed at the hub height of the wind turbine in this area obeys the Weibull distribution with parameters $k_c$ and $c$, and the solar radiation obeys the beta distribution with parameters $\alpha_c$ and $\beta_c$. The parameters $k_c$, $c$, $\alpha_c$ and $\beta_c$ are obtained from 8760 samples of actual wind speed and actual solar radiation data in a region [52], as shown in Table 6. The weight coefficients of the objective function were α = 2, β = 1, γ = 1, and λ = 2. The evaluation factors were a = 0.6 and b = 0.4. The total number of beetles was 50, the maximum number of iterations was 300, L = 2, $z_{0,\max}$ = 2.5, $z_{0,\min}$ = 1.5, $z_{1,\max}$ = 0.9, $z_{1,\min}$ = 0.4, and the social impact factor was 0.01. The scheme before reconfiguration does not consider the uncertainty of wind power, the net load and the EV load, and it keeps the five tie switches open: in the original network, switches 33, 34, 35, 36 and 37 are open and no network reconfiguration is performed. For this scheme, the total network loss was 3370.042 kWh, the voltage deviation was 0.0880 p.u., and the load balance was 5.1174 p.u.
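The wind speed settings above can be illustrated with a short Python sketch that samples Weibull wind speeds and maps them to turbine output. The linear ramp between the cut-in and rated speeds is a common simplification and is an assumption here, as are the function names and the example Weibull parameters k = 2 and c = 8, which are hypothetical placeholders and not the fitted values of Table 6.

```python
import numpy as np

def wind_power_pu(v, v_ci=5.0, v_r=11.0, v_co=24.0):
    """Turbine output in per unit of rated power.

    Assumption: zero below cut-in and above cut-out, a linear ramp
    between cut-in and rated speed, and rated output up to cut-out.
    """
    v = np.asarray(v, dtype=float)
    ramp = np.clip((v - v_ci) / (v_r - v_ci), 0.0, 1.0)
    return np.where((v < v_ci) | (v > v_co), 0.0, ramp)

def sample_wind_speed(k, c, size, rng=None):
    """Draw Weibull wind speeds with shape k and scale c (m/s)."""
    rng = np.random.default_rng() if rng is None else rng
    return c * rng.weibull(k, size=size)

# Hypothetical parameters for illustration only (not taken from Table 6).
speeds = sample_wind_speed(k=2.0, c=8.0, size=8760, rng=np.random.default_rng(1))
outputs = wind_power_pu(speeds)
```

Sampling 8760 values mirrors the hourly resolution of one year of data mentioned in the text.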
The SBSO algorithm is used to reconfigure the IEEE-33 node distribution network over 24 hours. The results are shown in Table 7 and Fig. 13. The change of each node over the 24 h is shown in Fig. 14.
Table 7 shows that the total active power loss of the distribution network within 24 hours is 2227.739 kWh, the maximum voltage deviation is 0.0438 p.u., and the load balance degree is 3.7515 p.u. The changes of the network loss, maximum voltage deviation and load balance within 24 hours are shown in Fig. 13.
After hourly reconfiguration of the distribution network, Fig. 14 shows the network loss, maximum voltage deviation and load balance of the 33 nodes within 24 hours. Fig. 14(a) is primarily blue, which shows that the network loss of the 33 nodes is maintained at a low level within 24 hours. The loss of each node is mostly in the range of 0-5 kW, and the active power loss of each node is less than 40 kW in 24 hours. Fig. 14(b) is dominated by a yellow tone, which indicates that the voltages of the 33 nodes stay close to the unit value of 1 p.u. over the 24 hours, and the minimum voltage at any time is not less than 0.955 p.u. This result means that the voltage level of the 33 nodes is maintained in a stable state. Fig. 14(c) is primarily blue, which shows that the load balance degree of the 33 nodes in 24 hours is small, and the load balance index of each node is less than 0.1 p.u.

A. DIFFERENT CASES
To validate the presented method and compare it with other methods, the optimal network reconfiguration is solved for four cases, case 1 to case 4. Fig. 15(a) shows that the optimization effects of case 1 and case 2 are similar, those of case 3 and case 4 are close, and the optimization effect of case 4 is clearly better than those of case 1, case 2 and case 3, which means that DG access to the power grid and distribution network reconfiguration can reduce network loss. Fig. 15(b) shows that the optimization effects of case 3 and case 4 are similar and are much better than those of case 1 and case 2, indicating that DG can also reduce voltage deviation. The voltage deviation of case 4 is also less than that of case 3, which indicates that distribution network reconfiguration can reduce the voltage deviation and maintain the voltage level above 0.955 p.u.
A proper amount of DG connected to the distribution network can improve the power flow distribution of the system, balance the line load, reduce the current flowing through the lines and reduce the network loss. The loss and voltage deviation of the reconfigured network with DG are small, which also shows that proper DG access combined with distribution network reconfiguration not only absorbs renewable energy but also substantially improves the system indicators.

B. DIFFERENT SCENARIOS
Scenario partitioning based on the Wasserstein distance and K-means clustering are used to address the uncertainty of wind power, the net load and the EV load, and then the time-sharing reconfiguration principle is used to reconfigure the distribution network in multiple scenarios. The results are shown in Table 9.
Table 9 shows that the network loss, voltage deviation and load balance degree are improved in the different scenarios after using the reconfiguration algorithm in this article. According to the weighted calculation of the above data, the active network loss is 2227.7398 kWh, a decrease of 33.89%; the voltage deviation is 0.0438 p.u., a decrease of 50.22%; and the load balance is 3.7515 p.u., a decrease of 26.69%. From the above analysis, the multiscenario reconfiguration strategy can effectively reduce the active power loss, voltage deviation and load balance index.
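The percentage decreases quoted above follow directly from the values before and after reconfiguration. The short Python check below recomputes them from the figures given in the text; the function and dictionary names are chosen here for illustration.

```python
def percent_reduction(before, after):
    """Relative decrease of an index, in percent of its original value."""
    return 100.0 * (before - after) / before

# Values before reconfiguration (original network) and after (weighted result).
baseline = {"loss_kWh": 3370.042, "voltage_dev_pu": 0.0880, "load_balance_pu": 5.1174}
reconfigured = {"loss_kWh": 2227.7398, "voltage_dev_pu": 0.0438, "load_balance_pu": 3.7515}

reductions = {name: percent_reduction(baseline[name], reconfigured[name])
              for name in baseline}
for name, value in reductions.items():
    print(f"{name}: {value:.4f} % decrease")
```

The computed values agree with the reported 33.89%, 50.22% and 26.69% to within 0.01 percentage points.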

C. DIFFERENT GOALS
The game model in this article and the single-objective models are used to solve the reconfiguration of the distribution network. The results are shown in Table 10.
Table 10 shows that, regardless of which single target is used for distribution network reconfiguration, the index chosen as the optimization objective is improved and reaches its optimal value, but the performance of the other two indicators declines. By contrast, using the multiobjective game model in this article makes the three important indicators of the distribution network optimal at the same time.

D. DIFFERENT ALGORITHMS
The typical scene of each unit is shown in Fig. 16.
For case 4 of the IEEE-33 system, the reconfiguration strategy of the algorithm in this article is used to reconfigure the distribution network, and the active power loss, voltage deviation and load balance are recorded. The total results are compared with those of the PSO, BAS, and BSO algorithms in Table 11 and Fig. 17. Table 11 and Fig. 17 show that SBSO and BSO yield smaller values than BAS and PSO, and that SBSO more easily finds smaller global optimal solutions in a shorter time. The number of iterations is shown in Fig. 18.
SBSO needs substantially fewer iterations than the other three algorithms, and its fitness value is smaller. The BSO algorithm combines the advantages of PSO and BAS: it improves the efficiency of the PSO algorithm and, to a great extent, helps BAS jump out of the local optimal solution. The above results also verify this conclusion: BSO has a better global search ability and a higher solution efficiency than PSO and BAS. Based on BSO, the algorithm in this article improves the operator, adds a social learning process, and sorts and updates each optimization result to form the improved SBSO algorithm. The above results show that the SBSO algorithm has a stronger global search ability than the BSO algorithm, and that it finds the global optimal value with higher efficiency and the highest accuracy in fewer iterations.

E. SCALABILITY
To verify the scalability of the presented method, the SBSO algorithm is used to reconfigure the IEEE-118 node distribution network over 24 hours. A diagram of the IEEE-118 node distribution network is shown in Fig. 19. The parameters of the wind turbines, net load, and EVs are the same as those of the IEEE-33 node system. The DG type, capacity and access node are shown in Table 12 [47].
The results are shown in Table 13 and Fig. 20. The change in each node over the 24 h is shown in Fig. 21. Table 13 shows that the total active power loss of the distribution network within 24 hours is 15173.86651 kWh, the maximum voltage deviation is 0.049378903 p.u., and the load balance degree is 4.79989888 p.u. The changes in network loss, maximum voltage deviation and load balance within 24 hours are shown in Fig. 20.
After hourly reconfiguration of the distribution network, Fig. 20 shows the network loss, maximum voltage deviation and load balance of the 118-node system within 24 hours. Fig. 21 shows the network loss, the maximum voltage deviation, and the load balance degree of each of the 118 nodes over the 24 hours.
The total results are compared with those of the PSO, BAS, and BSO algorithms in Table 14. Table 14 and Fig. 22 show that SBSO and BSO yield smaller values than BAS and PSO, and Fig. 23 shows that SBSO more easily finds smaller global optimal solutions in a shorter time. The number of iterations of SBSO is also substantially lower than that of the other three algorithms, which means that the computation cost of the proposed approach is the lowest. These results indicate that the SBSO algorithm in this article is also effective for the IEEE-118 node system, which is a large-scale power grid. Therefore, the presented method is scalable.

VII. CONCLUSION
Because many DGs and EVs are connected to the distribution network, the safe and economic operation of the network is affected. In this article, wind power, net load, and EVs were considered, and K-means clustering based on the Wasserstein distance was used to address their uncertainty. In addition, the game model of distribution network reconfiguration was established to minimize the sum of the active power loss, load balancing index, and maximum node voltage deviation. Moreover, to handle the high dimensionality, nonlinearity, nonconvexity, and complex constraints of the model, SBSO was designed to solve the problem.
The results show that proper DG access to the distribution network not only helps to absorb renewable energy but also improves the performance of the distribution network indicators. At the same time, distribution network reconfiguration is very important for reducing network loss, maintaining voltage stability, and balancing the load. Moreover, scene partitioning and K-means clustering based on the Wasserstein distance can effectively address the uncertainty caused by distributed energy and EVs. Compared with the single-objective model, the game reconfiguration model of the distribution network can optimize multiple indexes of the distribution network simultaneously. Additionally, the SBSO algorithm in this article can effectively solve the complex multidimensional distribution network reconfiguration problem, and it is more accurate and efficient than the PSO, BAS, and BSO algorithms.
In this article, it is assumed that the changes in wind speed and light intensity in the selected data sample are small, and the method does not consider DG allocation and capacity optimization. These limitations will be studied in future work, in which the presented method will be extended to the simultaneous reconfiguration and allocation of distributed energy resources.