Bilevel multi-objective gray wolf algorithm based on Packet transport network optimization

Packet transport network (PTN), an efficient transmission network technology for mobile communications in the big data era, is used by more and more communication operators. Existing PTNs suffer from low resource utilization and poor network security, so they need to be optimized in several respects. Optimizing a PTN requires considering the decisions of both the operator user and the service product supplier. Therefore, this paper proposes a bilevel multi-objective gray wolf algorithm for the PTN optimization problem. The operator user is the upper-level decision maker, whose objective is to minimize the cost paid to the product supplier. The product supplier is the lower-level decision maker and has two objective functions. The first is to maximize the label switching path overlap rate (LSPOR) evaluation score, which addresses abnormal label switching paths (LSPs) in the network; the second is to maximize the committed bandwidth utilization rate (CBWUR) evaluation score, which addresses excessive committed information rate (CIR) bandwidth usage in the network. Using networks of three different scales in Hubei, China, the improved multi-objective gray wolf algorithm is applied to solve the PTN bilevel programming problem. The experimental results show that the model increases the utilization rate of network resources and reduces the cost paid by the upper-level decision maker.


I. INTRODUCTION
With the large-scale development of Internet technology, various types of high-bandwidth services are used more and more frequently. Faced with a rapidly increasing number of services, major operators need to find a suitable network to meet current business needs [1]. As the basic network of operator communication [2], the transmission network needs to be continuously optimized to maximize its service quality.
Packet Transport Network (PTN), as the mainstream network model in the transport network system, inherits technologies such as linear protection switching in the Multiprotocol Label Switching Transport Profile (MPLS-TP) environment [3], provides a complete quality of service (QoS) system, and maintains the advantages of the traditional synchronous digital hierarchy (SDH) technology [4]. A properly configured PTN should meet the requirements of product suppliers and operators at the same time. How to optimize the PTN is one of the major issues that product suppliers need to solve.
After investigating the current literature with PTN as the research background, we found that such literature mainly outlines the QoS system in PTN and configures its key indicators, but does not give an optimization plan after PTN configuration. Li [5] analyzed the PTN structure of the Hunan Changsha Mobile Metropolitan Area Network and established an indicator system based on the existing network structure. After analyzing it, he found the shortcomings of the network structure and further optimized the network. According to the service requirements carried by the PTN of Guiyang Mobile, Ding [6] gave an optimization plan corresponding to each level of the network. However, optimizing the bottom layer is more complicated, and PTN researchers hope to find a simpler optimization solution. Zhang [7] first analyzed some hidden dangers in the Chengdu Metropolitan Area Network and then gave the corresponding optimization plan. The PTN optimization proposed in this kind of literature is tailored to a specific regional network, so the optimization schemes lack portability.
Ridwan et al. [8] discussed the application of MPLS in various fields and reviewed its important technologies. Ra et al. [9] developed a packet transport layer protection switch integrated circuit (PPSI) to add multiple protection switches to protect network traffic on one or more working paths. Yun et al. [10] proposed an algorithm related to reliability, which solved the problems related to cost and reliability, and obtained the optimal set of primary path and backup path. This kind of literature introduces the related technology of MPLS-TP and gives some protection path schemes.
Yang et al. [11] gave the key indicators of QoS in PTN, including availability, throughput, and delay. For the throughput indicator, product suppliers can specify different committed information rates (CIR) according to business types. Bai et al. [12] gave a flow control method that reflects fairness and compensates CIR, which meets the requirements of Internet service providers; that is, it can provide different service quality according to different business needs. Hou et al. [13] gave the relevant strategy for deploying QoS in PTN, including the configuration of CIR. This kind of literature mainly provides an overview of the QoS system in PTN and configures its key indicators, but it does not explain how to solve the problem when the configuration is unreasonable or the user demand is too high.
The PTN optimization schemes in the above literature each optimize a single indicator, solve problems in MPLS-TP in isolation, or consider only the configuration of QoS. However, real networks are usually more complicated, and the indicators may be correlated. Therefore, the various indicators of a PTN can be optimized at the same time; that is, multi-objective optimization can be applied to it.
Since a multi-objective optimization problem (MOP) has multiple conflicting objective functions, improving the performance of one objective may degrade the performance of another. This makes it very difficult to optimize all objective functions at the same time. As the number of objectives increases, characteristics of MOPs such as dynamics, nonlinearity, and non-differentiability make the computation more complicated, and the search space of solutions also grows sharply, which makes it difficult for researchers to find appropriate solution methods for different MOPs [14]. These difficulties have made solving MOPs one of the most active topics in the field of evolutionary computing at home and abroad. At present, the research results of optimization algorithms have been widely used in resource scheduling, financial investment, automatic control, machine learning, and other fields.
Initially, a multi-objective optimization problem was usually converted into a single-objective problem by linear weighting and then optimized. However, the weight of each objective in this method has a direct impact on the optimization result. Later, heuristic algorithms were combined with multi-objective optimization [15]. Schaffer et al. [16] proposed the vector evaluated genetic algorithm for multi-objective optimization in 1985. Fonseca et al. [17] proposed the Multi-objective Genetic Algorithm (MOGA) in 1993. Deb et al. [18] proposed the Nondominated Sorting Genetic Algorithm (NSGA) in 1994, and in 2002, Deb and Pratap proposed an improved NSGA algorithm (NSGA-II) [19]. On the basis of NSGA, NSGA-II adds an elite strategy and a non-dominated ranking method for groups. Zitzler et al. [20] proposed the Strength Pareto Evolutionary Algorithm (SPEA) in 1999 and an improved version (SPEA2) in 2001 [21]. As the data in practical projects became increasingly dynamic, Yen and Lu proposed the Dynamic Multi-objective Evolutionary Algorithm (DMOEA) in 2003 [22]. In recent years, algorithms that combine evolutionary algorithms with biological information have become widely used due to their simple implementation and fast convergence speed [23]. For example, Coello et al. [24] presented results for the multi-objective particle swarm optimization algorithm (MOPSO) in 2002. Yong Zhang studied a dual-archive multi-objective artificial bee colony algorithm (MOABC) in 2019 to improve the search ability of different types of bees [25]. The multi-objective gray wolf algorithm (MOGWO) proposed by Seyedali Mirjalili in 2015 achieves better results in convergence speed [26].
Through literature research, we found that there is currently no multi-objective optimization solution for PTN. In addition, the existing literature on PTN focuses on product suppliers and ignores the relationship between PTN operator users and product suppliers. An optimization problem involving multiple decision-making users of this type is a multi-level multi-objective optimization problem that is nondeterministic polynomial (NP) hard. At present, bilevel multi-objective optimization has been applied in task allocation [27], transportation network planning [28], and other fields. This paper regards the operator user as the upper-level (UL) decision maker, whose objective function is the lowest cost paid to the product supplier, and regards the product supplier as the lower-level (LL) decision maker, whose objective functions are the highest scores for the label switching path overlap rate (LSPOR) and the committed bandwidth utilization rate (CBWUR) in the PTN. A bilevel model of PTN optimization is established, and the improved multi-objective gray wolf algorithm is used to solve it. The contributions of this paper are as follows: for the first time, a bilevel model is applied to PTN optimization, which can meet the needs of product suppliers and operator users at the same time; for the first time, multi-objective optimization is applied to the indicator optimization of PTN, which improves the speed and performance of objective optimization; and the multi-objective gray wolf algorithm is improved to enlarge the search space and increase the convergence speed of the solution.
The rest of this paper is organized as follows. The second section describes the PTN bilevel multi-objective optimization problem and establishes the model. The third section gives the solution approach for the bilevel model and the improvement scheme for the multi-objective gray wolf algorithm. The fourth section gives the PTN optimization results in different regions and compares them with several multi-objective optimization algorithms to verify the feasibility of the scheme. The fifth section summarizes the experimental results and research contributions of this scheme, discusses its limitations, and looks ahead to future research on PTN bilevel multi-objective optimization.

II. The Establishment of Bilevel Multi-objective Optimization Model of PTN
This section mainly introduces the related problems of PTN bilevel multi-objective optimization and the establishment of optimization model.

A. Problem Description
PTN is a connection-oriented packet transmission technology that can support multiple services. PTN divides the network into a channel layer, a path layer, a section layer, and a physical media layer [29], and each layer has a corresponding task assignment. The hierarchical structure diagram is shown in Fig 1. The optimization of the PTN usually involves two types of decision-making users. One is the product supplier that provides PTN services; its purpose is to optimize the performance of all aspects of the PTN and increase the number of product users, while also considering factors such as cost. The other is the operator user who uses the PTN; its purpose is to purchase service products on the PTN at the lowest cost. In order to best meet the needs of these two types of users, we need to consider two aspects.
First, for operator users, each path in the existing PTN should be as short as possible, which saves service transmission time, enhances user experience, and saves the cost of optical fiber and other equipment. This optimization belongs to channel layer optimization. Second, the PTN optimization of product suppliers can usually be divided into the following categories: network security, network resources, network operation and maintenance, and performance special items. In the network security category, PTN is often optimized based on network structure, network topology, in-network service protection, and equipment protection. In order to evaluate the performance of the PTN more accurately, each category is usually subdivided into specific indicator items, and the indicator items are jointly optimized to meet the various needs of PTN optimization and improve PTN performance. Because there are too many indicator items in PTN, this article selects the two most representative indicators and optimizes them with multiple objectives.
The first indicator is the LSPOR in the in-network service protection category. This indicator measures the proportion of label switching path (LSP) 1:1 [30] protection groups whose primary and backup paths share part of their route; it is used to evaluate cases where the primary and backup paths of a protected LSP pass through the same node, the same board, or the same logical link. The same network element (NE) situation is shown in Fig 2. Both the primary path and the backup path pass through NE C. When a failure occurs at C, the problem cannot be solved even if the primary path is switched to the backup path. The same-board situation is shown in Fig 3. Both the primary path and the backup path pass through the same board in NE C. When the board fails, the LSP 1:1 protection fails. The same-link situation is shown in Fig 4. Both the primary path and the backup path pass through NE B and NE C. When B or C fails, or the BC link fails, the LSP 1:1 protection fails. LSP abnormalities also include combinations, such as the same board together with the same NE, or the same board together with the same link. The optimization of this indicator belongs to the optimization of the channel layer.
The other indicator is the CBWUR in the network bandwidth resource category. This indicator is defined as the ratio of the total committed bandwidth of all services on a logical link to the link bandwidth. The CIR configured for Layer 2 Virtual Private Network (L2VPN) flows is mapped to all associated tunnels, and the CIR bandwidth occupancy rate is then evaluated on each topological link. The CIR must be mapped differently for different services [31]. The first type of service is the E-Tree (Ethernet tree) service, for which the CIR needs to be mapped to all associated tunnels, as shown in Fig 5. The second type is the E-Line service, namely the point-to-point service, for which the CIR only needs to be mapped once, as shown in Fig 6. Assume that a certain PTN sets the threshold of the CIR bandwidth occupancy rate to 80%, the rate of each fiber is 1 Gigabit Ethernet (GE), the maximum transmission bandwidth of one fiber is 1000 Mbps (1000M), and each service occupies 300M of bandwidth. The resulting case of excessive CIR bandwidth occupancy is shown in Fig 7. Three services pass through the abnormal link, so the optical fiber carries 900M of committed bandwidth and the occupancy rate reaches 90%, exceeding the threshold of the CIR bandwidth occupancy rate. This situation reduces the network rate available to users and degrades the user experience. The optimization of this indicator belongs to the optimization of the section layer.
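The arithmetic of the Fig 7 example can be checked with a minimal Python sketch. The function names are hypothetical, and the sketch assumes the occupancy rate is simply the sum of the services' committed bandwidths divided by the link bandwidth, as stated in the definition of CBWUR above.

```python
def cir_bandwidth_occupancy(service_cirs_mbps, link_speed_mbps):
    """Total committed bandwidth on one link divided by the link bandwidth."""
    return sum(service_cirs_mbps) / link_speed_mbps


def exceeds_threshold(service_cirs_mbps, link_speed_mbps, threshold=0.8):
    """True when the CIR bandwidth occupancy rate is above the threshold."""
    return cir_bandwidth_occupancy(service_cirs_mbps, link_speed_mbps) > threshold


# Three services of 300M each on a 1 GE (1000M) fiber, threshold 80%.
occupancy = cir_bandwidth_occupancy([300, 300, 300], 1000)
abnormal = exceeds_threshold([300, 300, 300], 1000)
```

With two services instead of three, the occupancy drops to 60% and the link is no longer flagged.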

B. Model building
Bilevel programming is a special type of mathematical programming problem, usually composed of a UL optimization problem and an LL sub-optimization problem, each with its own optimization goal and corresponding constraints. The objective function and constraints of the UL problem are related not only to the UL decision variables but also to the optimal solution of the LL problem. The objective functions and constraints of the LL problem are in turn affected by the UL decision variables: for given UL decision variables, the LL problem finds its own optimal solution and feeds it back to the UL decision maker [32]. The structural relationship between the upper and lower levels in the bilevel programming model is shown in the accompanying figure. Of the two decision makers in PTN optimization, the product supplier and the operator user, we regard the operator user as the UL decision maker and the product supplier as the LL decision maker.
First, we find the best decision of the operator user. Second, the operator's decision is taken as a prerequisite for finding the optimal decision of the product supplier. Then the operator finds a solution that conforms to the overall interests based on this decision. The objective of the operator is to pay the least cost, and the objective of the product supplier is to maximize the LSPOR and CBWUR evaluation scores.
The operator's goal of paying the least cost can be transformed into minimizing the primary path length of each tunnel in the PTN. In a given PTN, suppose the total number of tunnels equipped with LSP 1:1 protection is NTunnel. For the i-th of these tunnels, $Tunnel_i$, the primary path is denoted $P_i$ and the backup path is denoted $B_i$. The NE set and board set of the primary path are $[NE^{P_i}_1, NE^{P_i}_2, \ldots, NE^{P_i}_{L_{P_i}}]$ and $[Board^{P_i}_{(1,1)}, Board^{P_i}_{(2,0)}, Board^{P_i}_{(2,1)}, \ldots, Board^{P_i}_{(L_{P_i},0)}]$, respectively; the NE set and board set of the backup path are defined in the same way. The meaning of each symbol is explained in Table I.

TABLE I
SYMBOLS AND IMPLICATION CORRESPONDENCE TABLE

SYMBOL                    IMPLICATION
$Tunnel_i$                The i-th tunnel
$NE^{P_i}_k$              The k-th NE of $P_i$
$NE^{B_i}_k$              The k-th NE of $B_i$
$Board^{P_i}_{(k,1)}$     The exit board of $NE^{P_i}_k$
$Board^{P_i}_{(k,0)}$     The incoming board of $NE^{P_i}_k$
$Board^{B_i}_{(k,1)}$     The exit board of $NE^{B_i}_k$
$Board^{B_i}_{(k,0)}$     The incoming board of $NE^{B_i}_k$
$L_{P_i}$                 The length of $P_i$
$L_{B_i}$                 The length of $B_i$
$N_{P_i}$                 The number of NEs on $P_i$
$N_{B_i}$                 The number of NEs on $B_i$
Topo                      Fiber optic link
LinkSpeed                 Fiber speed
NTopo                     Total number of logical fiber links
CBO                       The CIR bandwidth occupancy

The following relationship holds: the length of the primary or backup path of the i-th tunnel equals the number of corresponding NEs minus one, that is, $L_{P_i} = N_{P_i} - 1$ and $L_{B_i} = N_{B_i} - 1$.
The objective of the product supplier is to maximize the LSPOR and CBWUR evaluation scores, that is, to make as many data items as possible normal for each indicator. Normal data are marked as 1 and abnormal data as 0. The preconditions for abnormal LSPOR indicator data are as follows.
The condition for the existence of the same NE in a tunnel is shown in (3).
The condition for the existence of the same board in a tunnel is shown in (4).
The condition for the existence of the same link in a tunnel is shown in (5).
The preconditions for abnormal CBWUR indicator data are as follows. The total number of logical fiber links is denoted as NTopo. The CIR bandwidth occupancy rate of a Topo is shown in (7).
For the CBWUR indicator, the scoring standard is given in (10). The upper model has no constraints, while the lower model is subject to constraints.
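The three LSPOR abnormality cases (same NE, same board, same link) can be illustrated with a minimal Python sketch. It is not a transcription of conditions (3) to (5). The `Hop` record, its field names, the exclusion of the source and sink NEs from the same-NE check, and the treatment of a link as an undirected pair of NEs are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Optional


class Hop(NamedTuple):
    """One NE on a path (hypothetical record layout)."""
    ne: str
    in_board: Optional[str]   # None at the source NE
    out_board: Optional[str]  # None at the sink NE


def shared_nes(primary: List[Hop], backup: List[Hop]) -> set:
    """Intermediate NEs used by both paths (source and sink are always shared)."""
    return {h.ne for h in primary[1:-1]} & {h.ne for h in backup[1:-1]}


def _boards(path: List[Hop]) -> set:
    return {(h.ne, b) for h in path for b in (h.in_board, h.out_board) if b is not None}


def shared_boards(primary: List[Hop], backup: List[Hop]) -> set:
    """(NE, board) pairs used by both paths."""
    return _boards(primary) & _boards(backup)


def _links(path: List[Hop]) -> set:
    return {frozenset((a.ne, b.ne)) for a, b in zip(path, path[1:])}


def shared_links(primary: List[Hop], backup: List[Hop]) -> set:
    """Links (as unordered NE pairs) traversed by both paths."""
    return _links(primary) & _links(backup)


# Same-NE case in the spirit of Fig 2: both paths cross NE C on different boards.
primary = [Hop("A", None, "A-1"), Hop("B", "B-1", "B-2"),
           Hop("C", "C-1", "C-2"), Hop("D", "D-1", None)]
backup = [Hop("A", None, "A-2"), Hop("E", "E-1", "E-2"),
          Hop("C", "C-3", "C-4"), Hop("F", "F-1", "F-2"), Hop("D", "D-2", None)]
```

For this pair of paths, `shared_nes` reports NE C, while `shared_boards` and `shared_links` are empty, so only the same-NE abnormality is present.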

III. Solution
In the bilevel programming model discussed in this article, the upper and lower models are related to each other. This section first gives the solution of the upper model. After obtaining the optimal solution, use it as a constraint condition for the solution of the LL model. Secondly, the solution of the lower model is given, including the improvement of the multi-objective gray wolf algorithm. Finally, the solution scheme of the entire bilevel multi-objective optimization model is given.

A. Upper level model solution
The upper-level decision maker of PTN optimization is the operator user, who wants to pay the lowest cost to the product supplier. We turn this into a problem of finding the shortest path in the network. In a PTN, a tunnel contains a primary path, and a tunnel equipped with LSP 1:1 protection also contains a backup path. When the primary path fails, the backup path takes over. Therefore, the shortest path problem in the PTN is not to find a single path but to find multiple paths between the same source and destination nodes, so common shortest path algorithms such as Dijkstra's algorithm [33] and Floyd's algorithm [34] are not applicable to this research. We need an algorithm that can obtain multiple paths and sort them by path length.

1) KSP ALGORITHM
Yen [35] proposed an algorithm in 1971 for finding the K shortest loopless paths between two nodes. The algorithm finds as many shortest paths as required, which matches the needs of this article. It can be divided into two parts. First, Dijkstra's algorithm is used to calculate the shortest path, referred to as P(1); then the other K-1 shortest paths are calculated in sequence on this basis. When calculating P(i+1), every node on P(i) except the termination node is treated as a deviation node. For each deviation node, the shortest path from that node to the termination node is calculated and spliced onto the segment of P(i) from the starting node to the deviation node, forming a candidate path. The shortest path in the candidate set is taken as P(i+1) [36].
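The procedure described above can be sketched in Python as follows. This is a compact illustration of the general K-shortest-paths idea, not the implementation used in this paper; the adjacency-dictionary graph format and all function names are assumptions.

```python
import heapq


def dijkstra(graph, source, target, banned_nodes=frozenset(), banned_edges=frozenset()):
    """Return (cost, path) of a shortest path, or None, skipping banned nodes/edges."""
    heap = [(0, source, [source])]
    visited = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, weight in graph.get(node, {}).items():
            if nxt in visited or nxt in banned_nodes or (node, nxt) in banned_edges:
                continue
            heapq.heappush(heap, (cost + weight, nxt, path + [nxt]))
    return None


def path_cost(graph, path):
    return sum(graph[a][b] for a, b in zip(path, path[1:]))


def k_shortest_paths(graph, source, target, k):
    """Up to k loopless paths from source to target, shortest first."""
    first = dijkstra(graph, source, target)
    if first is None:
        return []
    accepted = [first[1]]
    candidates = []  # heap of (cost, path)
    while len(accepted) < k:
        last = accepted[-1]
        for i in range(len(last) - 1):          # every node except the target deviates
            spur_node, root = last[i], last[:i + 1]
            # Ban edges that would recreate an accepted path sharing this root.
            banned_edges = {(p[i], p[i + 1]) for p in accepted
                            if len(p) > i + 1 and p[:i + 1] == root}
            banned_nodes = frozenset(root[:-1])  # keeps the spliced path loopless
            spur = dijkstra(graph, spur_node, target, banned_nodes, banned_edges)
            if spur is None:
                continue
            candidate = root[:-1] + spur[1]
            entry = (path_cost(graph, candidate), candidate)
            if entry not in candidates:
                heapq.heappush(candidates, entry)
        if not candidates:
            break
        accepted.append(heapq.heappop(candidates)[1])
    return accepted


# Small directed example graph: node -> {neighbor: weight}.
graph = {
    "C": {"D": 3, "E": 2}, "D": {"F": 4}, "E": {"D": 1, "F": 2, "G": 3},
    "F": {"G": 2, "H": 1}, "G": {"H": 2}, "H": {},
}
paths = k_shortest_paths(graph, "C", "H", 3)
```

On this graph the first two results are C-E-F-H (cost 5) and C-E-G-H (cost 7); several paths tie at cost 8 for third place.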

2) IMPROVED KSP ALGORITHM
The main idea of the KSP algorithm is to find the first K paths between two points. In real networks, however, two adjacent NE nodes are usually connected by a Topo, and the Topo must be carried on a certain port of a certain board of the NE node. This information is stored in the Topo data file of the PTN. If the NE node is used as the vertex in the KSP algorithm, the final path can only be resolved to the NE level, and the boards and ports through which the path passes cannot be located.
In order to solve the above problems, some improvements have been made to the KSP algorithm. The specific steps are as follows.
Step 1: According to the NE data in the PTN, generate the corresponding NE node numbers, build the corresponding directed graph from these numbers, and name it topology map 1.
Step 2: Traverse all the tunnels in NTunnel. For each tunnel, take its source and sink nodes, convert them to the corresponding node numbers in topology map 1, and use the KSP algorithm to find the paths between them.
Step 3: The paths found by the KSP algorithm use numbers as node identifiers. Convert the node identifiers in each path back to the corresponding NE nodes, and mark the resulting set of paths as KSP1.
Step 4: Select the first path in KSP1 that is not equal to the primary or backup path, and record it as Path1.
Step 5: For each link in Path1, find the corresponding board and port information in the Topo data file and append it to that link. Record the resulting path as Path2.
Step 6: Replace the KSP alternate path with Path2. After the replacement is completed, go to Step 2 until the traversal ends.
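Steps 1, 4, and 5 can be sketched in Python as follows. The Topo record layout (dictionary keys such as `src_ne` and `src_board`) and the function names are hypothetical, since the format of the Topo data file is not given here; the KSP search of Step 2 is assumed to have already produced `ksp1`.

```python
def build_node_ids(topo_records):
    """Step 1: assign an integer number to every NE seen in the Topo records."""
    ids = {}
    for rec in topo_records:
        for ne in (rec["src_ne"], rec["dst_ne"]):
            ids.setdefault(ne, len(ids))
    return ids


def first_new_path(ksp1, primary, backup):
    """Step 4: first path in KSP1 that differs from the primary and backup paths."""
    for path in ksp1:
        if path != primary and path != backup:
            return path
    return None


def attach_boards_and_ports(path1, topo_records):
    """Step 5: look up board/port data for every link of Path1, giving Path2."""
    by_link = {(r["src_ne"], r["dst_ne"]): r for r in topo_records}
    return [by_link[(a, b)] for a, b in zip(path1, path1[1:])]


# Hypothetical Topo records: one per directed fiber link.
topo_records = [
    {"src_ne": "A", "src_board": "A-3", "src_port": 1, "dst_ne": "B", "dst_board": "B-1", "dst_port": 2},
    {"src_ne": "B", "src_board": "B-2", "src_port": 1, "dst_ne": "D", "dst_board": "D-1", "dst_port": 1},
    {"src_ne": "A", "src_board": "A-4", "src_port": 2, "dst_ne": "C", "dst_board": "C-1", "dst_port": 1},
    {"src_ne": "C", "src_board": "C-2", "src_port": 1, "dst_ne": "D", "dst_board": "D-2", "dst_port": 1},
    {"src_ne": "B", "src_board": "B-3", "src_port": 1, "dst_ne": "C", "dst_board": "C-3", "dst_port": 1},
]
node_ids = build_node_ids(topo_records)
ksp1 = [["A", "B", "D"], ["A", "C", "D"], ["A", "B", "C", "D"]]
path1 = first_new_path(ksp1, primary=["A", "B", "C", "D"], backup=["A", "C", "D"])
path2 = attach_boards_and_ports(path1, topo_records)
```

Here Path1 is A-B-D, and Path2 carries the board and port of each of its two links, which is the information the plain KSP result lacks.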

3) UL MODEL SOLUTION FLOWCHART
For the solution of the UL model, first use the Topo information in the network data to generate a directed graph, and then traverse all the tunnel data equipped with the LSP 1:1 protection path. For each tunnel, the improved KSP algorithm is used to find Path2, and the primary path of the tunnel is replaced with Path2. If the KSP algorithm cannot find Path2, the next tunnel is processed. When all the tunnels have been processed, the entire PTN reaches the optimum. The transmission speed of each service is then faster than before and the path length is reduced, which reduces the fees paid by the operator users and meets the needs of UL decision-making. The KSP algorithm used in this way is called ULKSP. The flow chart of the upper model solution is shown in Fig 10.

B. Lower level model solution
The LL decision maker of PTN optimization is the product supplier, and the product supplier is required to provide the best PTN optimization service. To judge whether the PTN service meets the standard, the network is usually scored. The higher the score, the better the network. Therefore, we can convert the problem of solving the LL model into a problem of maximizing indicator scores. At the same time, multiple indicators are included in the PTN, and its optimization requires the use of multi-objective optimization technology. Nowadays, intelligent evolutionary algorithms are becoming more and more mature, and choosing a suitable multi-objective intelligent evolutionary algorithm is the key to the solution.

1) MULTI-OBJECTIVE GRAY WOLF ALGORITHM
The gray wolf optimization (GWO) algorithm is a swarm intelligence optimization algorithm proposed by Mirjalili et al. [37], inspired by the cooperative predation process of wolves in nature. In 2015, a multi-objective gray wolf optimization algorithm (MOGWO) was proposed on this basis [26]; it achieves good results in terms of convergence speed.
The gray wolf population can be divided into four levels, namely α, β, δ, and ω. The GWO algorithm is modeled on the predation process of the gray wolf. The position of a wolf in the algorithm represents a possible solution of the problem.
In the GWO algorithm, the three positions with the best objective function values in each iteration are assigned to α, β, and δ in turn, and the remaining individuals update their positions according to these three best individual positions. The next-generation position of a gray wolf individual is given by the following formula.
Where t is the current iteration number, Xp(t) represents the position of the prey at the t-th iteration, Xi(t) represents the position of the gray wolf individual i at the t-th iteration, and A and C are the influence coefficients. The calculation formula is as follows.
Where tmaxiter is the maximum number of iterations. Compared with GWO, two new components have been introduced in MOGWO [38]. The first is the archive, which stores the non-dominated Pareto optimal solutions obtained up to the current iteration; the other is the leader selection mechanism, which is used to update the solutions in the archive. In addition, a grid mechanism is proposed to improve the solutions in the archive, and the selection strategy for the next generation of gray wolf individuals is changed. The archive stores the outstanding individuals produced in each generation, that is, the non-dominated solutions, and is updated and pruned according to a certain strategy. The MOGWO algorithm first selects three outstanding individuals from the archive as α, β, and δ using roulette-wheel selection. After the population individuals have been updated and deleted, the individuals in the external archive form a set of Pareto optimal solutions for the optimization problem.
The research results [39] show that the optimization performance of the gray wolf optimization algorithm is better than that of the DE algorithm [40], the PSO algorithm [41], and the gravitational search algorithm. The main advantages of the algorithm are its simple structure, the small number of parameters to be set, and its ease of implementation. Since its proposal, the GWO algorithm has been applied to attribute reduction, feature selection, economic load dispatch problems, and surface wave analysis.
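For orientation, the position update of the standard GWO algorithm, as commonly stated in the GWO literature, can be sketched in Python. The sketch assumes the usual formulation: a convergence factor a that falls linearly from 2 to 0, coefficients A = 2a·r1 - a and C = 2·r2 with r1, r2 drawn uniformly from [0, 1], and a new position obtained by averaging the moves toward α, β, and δ. Whether this matches the formulas of this paper term by term is an assumption; the function names are hypothetical.

```python
import random


def linear_a(t, t_max):
    """Convergence factor decreasing linearly from 2 to 0."""
    return 2.0 * (1.0 - t / t_max)


def encircle(x, x_p, a, rng):
    """Move position x relative to a leader (prey) position x_p."""
    new = []
    for xi, xpi in zip(x, x_p):
        A = 2.0 * a * rng.random() - a   # influence coefficient A
        C = 2.0 * rng.random()           # influence coefficient C
        D = abs(C * xpi - xi)            # distance to the leader
        new.append(xpi - A * D)
    return new


def gwo_step(x, alpha, beta, delta, t, t_max, rng=random):
    """Next position of one wolf: average of the moves toward the three leaders."""
    a = linear_a(t, t_max)
    moves = [encircle(x, leader, a, rng) for leader in (alpha, beta, delta)]
    return [sum(component) / 3.0 for component in zip(*moves)]


rng = random.Random(0)
next_position = gwo_step([5.0, 5.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0], t=3, t_max=10, rng=rng)
```

At the last iteration a is 0, so A vanishes and the wolf lands exactly on the mean of the three leaders; early iterations allow |A| > 1, which favors exploration.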

2) IMPROVED MULTI-OBJECTIVE GRAY WOLF ALGORITHM
Gray wolf optimization algorithms have disadvantages such as low solution accuracy and slow convergence speed, so improving the multi-objective gray wolf algorithm remains an active research direction, and researchers have proposed many improvement methods. Common improvements focus on the diversity of the initial population, on convergence factors that make the algorithm prone to falling into local optima, and on speeding up the search for the global optimum.
Wang Zhao et al. [42] addressed running time and used parallelization to intelligently optimize individual drones, which improved the search speed of the optimal solution. However, the speed gain of this method comes at the cost of additional computer memory, and the CPU requirements of the device are relatively high. Qi Yan et al. [43] used the MOGWO algorithm to solve the optimization problem of the microgrid and simplified the model. For gray wolf individual initialization and position update, the calculation time was shortened by dividing it into time periods. The convergence factor was also set to converge nonlinearly, but the exponential factor in the convergence mode was set manually, so it is impossible to judge whether the convergence effect is the best. Zhang Tao et al. [44] proposed a multi-objective differential gray wolf algorithm for coordinated reactive power optimization in the distribution network. The algorithm uses chaotic mapping in population initialization to increase the diversity of the initial population, and introduces differential mutation and crossover to address the tendency of the gray wolf algorithm to fall into local optima. Meng Kai et al. [45] built a multi-objective optimization model for assembly line balancing and equipment maintenance in assembly line management, and used an improved multi-objective gray wolf algorithm with improved individual encoding and decoding methods. This model is suitable for solving problems with discrete features. They also introduced the Pareto hierarchy into the classification of gray wolf individuals to construct and calculate the crowding distance, and introduced a crossover operator into the gray wolf position update to expand the search range of the global optimal solution.
This article improves the convergence factor. Here maxiter is the maximum number of iterations, t is the current number of iterations, atraditional is the traditional linear convergence factor, aimproved is the improved nonlinear convergence factor, and rand() is a function that generates a random decimal number between 0 and 1. The convergence factor is changed from the traditional linear descent to a combination of trigonometric functions selected according to the size of the random number, and finally presents a nonlinear convex declining trend that first decreases slowly and then decreases quickly.
In this paper, the population is initialized by taking each tunnel with LSP 1:1 protection as a gray wolf individual. Each gray wolf individual is optimized without considering the data correlation between individuals; that is, each individual is moved in the direction of the optimal solution.
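To make the described shape concrete, the sketch below compares the traditional linear factor with one trigonometric schedule that first decreases slowly and then quickly. The cosine form is purely an illustrative assumption; it is not the aimproved formula of this paper, and it omits the random-number-based switching between trigonometric functions mentioned above. Function names are hypothetical.

```python
import math


def a_traditional(t, max_iter):
    """Traditional linear convergence factor: 2 at t = 0, 0 at t = max_iter."""
    return 2.0 * (1.0 - t / max_iter)


def a_cosine_example(t, max_iter):
    """Illustrative nonlinear factor (assumed form): slow decline early, fast decline late."""
    return 2.0 * math.cos(math.pi * t / (2.0 * max_iter))


# Sample both schedules at the start, midpoint, and end of a 100-iteration run.
samples = [(t, a_traditional(t, 100), a_cosine_example(t, 100)) for t in (0, 50, 100)]
```

Both schedules start at 2 and end near 0, but the cosine curve stays above the straight line in between, which keeps the exploration phase (|A| > 1) longer before the search narrows.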

3) ENCODING AND DECODING
The data used in the multi-objective optimization of the LL of the PTN comes from the data provided by the product supplier, which contains the entire configuration information in the network, such as NE information, board information, port usage information, and business information. The information corresponding to each node in the primary and backup paths can be converted into binary encoding format, and the similarities in the primary path and the backup path can be coded as 1, and code 1 for links that occupy too much bandwidth on all links in the primary and backup paths..
Considering the optimal situation, when the number of NE nodes in the primary and backup paths is the same, it is assumed that there are 8 NE nodes. At this time, the primary and backup paths are the same except for the source and sink nodes, other network elements are different, and the same board phenomenon does not occur on the source and sink NEs, so the binary code of the NE is 10000001; the binary code of the board does not consider the input board of the source NE and the output board of the sink NE, so only 2*8-2=14 bits ; There are topo connections between the ports, so (8-1)*2=14 bits represent topo binary code, and the first seven bits represent the link bandwidth occupation of the primary path, and the last seven bits represent whether the link bandwidth occupation of the backup path exceeds the standard . Table III shows the binary codes corresponding to different problems of 6 gray wolf individuals. Gray wolf 1 represents the optimal individual, gray wolf 2 represents the same NE, and the third NE is the same; gray wolf 3 represents the same board situation, and the outlet of the board on the third NE is the same; Gray Wolf 4 represents the same link problem, and the primary and backup paths are the same from the fourth NE to the sixth NE; Gray Wolf 5 represents the CBO rate of the link between the 6th and 7th NE of the primary path is too high; Gray Wolf 6 means that the CIR bandwidth of the same board and link is too high.
From the above analysis, a decoding method can be formulated from the coding format of the gray wolf individual and the abnormal situations of the different indicator types, as shown in Table IV.
When a same-NE, same-board, or same-link problem occurs, as in gray wolves 2, 3, and 4, the decoding is (1,0). When the CIR bandwidth of a link is too high, as in gray wolf 5, the decoding is (0,1). When both occur, as in gray wolf 6, the decoding is (1,1). The remaining normal situations are decoded as (0,0).
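The decoding rule can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `decode` and the example bit strings are hypothetical, and the sketch assumes that a same-link problem shows up as 1s on interior NE bits, since Table III is not reproduced here.

```python
def decode(ne_bits, board_bits, topo_bits):
    """Return (lsp_flag, cir_flag) for one encoded gray wolf individual."""
    # Source and sink NEs are shared by design, so only interior NE bits count.
    lsp_flag = int("1" in ne_bits[1:-1] or "1" in board_bits)
    # Any 1 in the topo code marks a link whose CIR bandwidth is too high.
    cir_flag = int("1" in topo_bits)
    return lsp_flag, cir_flag


# Hypothetical individuals with 8 NEs per path (8 + 14 + 14 bits).
optimal = ("10000001", "0" * 14, "0" * 14)
same_ne = ("10100001", "0" * 14, "0" * 14)            # third NE shared
cir_high = ("10000001", "0" * 14, "00000100000000")   # primary link 6-7 overloaded
both = ("10000001", "00010000000000", "00000100000000")

for wolf in (optimal, same_ne, cir_high, both):
    print(decode(*wolf))
```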

4) INDIVIDUAL FITNESS CALCULATION METHOD
Each tunnel contains two paths, active and standby, and each node of each path carries three types of information: network element node information, board information, and port usage information. From these three types of information, the optical fiber between two network element nodes can be located. For the LSPOR indicator, when any of the three situations (same NE, same board, or same link) occurs in a gray wolf individual, the fitness = N1/NTunnel, where N1 is the sum of the number of network elements, boards, and links in which LSP abnormalities occur in that gray wolf. If no LSP abnormality occurs, the objective function value is 1/NTunnel; otherwise it is 0. For the CBWUR indicator, when the bandwidth occupation of a segment of the fiber link in the gray wolf individual is too high, the fitness = N2/NTopo, where N2 is the number of links in that gray wolf whose CIR bandwidth is too high. If no CIR abnormality occurs, the objective function value is 1/NTopo; otherwise it is 0. The direct conversion formula between fitness and objective function value is as follows.
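The relation stated in the prose above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and summing the per-individual objective terms over the whole population to obtain an overall score is an assumption, not something the text confirms.

```python
def fitness(n_abnormal, n_total):
    """Fitness of one individual: N1/NTunnel for LSPOR, N2/NTopo for CBWUR."""
    return n_abnormal / n_total


def objective_term(fit, n_total):
    """Contribution of one individual: 1/N if it has no abnormality, else 0."""
    return 1 / n_total if fit == 0 else 0.0


def population_score(abnormal_counts, n_total):
    """Assumed aggregation: sum of the per-individual objective terms."""
    return sum(objective_term(fitness(n, n_total), n_total) for n in abnormal_counts)


# Hypothetical population of 4 tunnels; one tunnel has 2 LSP abnormalities.
lsp_abnormal = [0, 2, 0, 0]
print(population_score(lsp_abnormal, n_total=4))  # 0.75
```

Under this reading, the score is the fraction of individuals without abnormalities, so a fitness of 0 is the best value for an individual while a higher score is better for the population.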

5) POPULATION UPDATE
Because each gray wolf individual in this article carries a large amount of information and has its own independent meaning, the correlation between individuals is not considered for the time being. Individuals are updated one by one, with a different update strategy for each indicator.
For the population update of the LSPOR indicator, the KSP algorithm is used to find a third path in addition to the primary and backup paths. This path must not share its route with the primary path, and a path with low CIR bandwidth is preferred. The KSP algorithm therefore needs to be improved; the improved version is called the LSPKSP algorithm. The steps for updating gray wolf individuals with the LSPKSP algorithm are as follows.
Step 1: Take any gray wolf individual and check whether its fitness value is 0. If it is 0, move on to the next individual; otherwise, proceed to the next step.
Step 2: Determine where code 1 appears in each part of the individual code. If code 1 appears only because the source and sink NEs use the same board, switch the board; otherwise, proceed to the next step.
Step 3: Determine the source and sink NE nodes of the individual, convert them to the corresponding numbers in topology map 1, use the KSP algorithm to find the paths between them, and convert the node identifiers in each path back to the corresponding NE nodes. Mark the set of paths found by the KSP algorithm as KSP2.
Step 4: Compare all the paths in KSP2 with the primary path of the tunnel, find the paths that do not share its route, and store them in the set KSP3.
Step 5: Sort the paths in KSP3 in ascending order of CIR value, and call the resulting set of paths KSP4.
Step 6: Choose the first path in KSP4 and use it to replace the backup path. After the replacement, update the next individual. If the KSP algorithm cannot find a suitable path, the individual cannot be mutated and optimized, and the update moves on to the next individual. Repeat until all individuals have been updated.
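Steps 3 to 6 can be sketched as below. This is a hedged illustration rather than the LSPKSP algorithm itself: a brute-force enumeration of simple paths stands in for a real KSP routine, "not the same route" is interpreted as sharing no intermediate NE and no link with the primary path, and the topology, CIR values, and all function names are hypothetical. The board switch of Step 2 is not modelled.

```python
def build_adj(links):
    """Undirected adjacency lists from a list of (u, v) links."""
    adj = {}
    for u, v in links:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


def links_of(path):
    """Set of undirected links traversed by a path."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def k_shortest_paths(adj, src, dst, k):
    """Stand-in for KSP: enumerate simple paths by DFS, keep the k with fewest hops."""
    found = []

    def dfs(node, path):
        if node == dst:
            found.append(path)
            return
        for nxt in adj.get(node, []):
            if nxt not in path:
                dfs(nxt, path + [nxt])

    dfs(src, [src])
    found.sort(key=lambda p: (len(p), p))
    return found[:k]


def lspksp_backup(adj, cir, primary, k=10):
    """Return a replacement backup path, or None if no suitable path exists."""
    ksp2 = k_shortest_paths(adj, primary[0], primary[-1], k)           # Step 3
    inner, used = set(primary[1:-1]), links_of(primary)
    ksp3 = [p for p in ksp2                                            # Step 4
            if not inner & set(p[1:-1]) and not used & links_of(p)]
    ksp4 = sorted(ksp3, key=lambda p: sum(cir[l] for l in links_of(p)))  # Step 5
    return ksp4[0] if ksp4 else None                                   # Step 6


# Hypothetical topology; links through H carry a high CIR load.
links = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E"), ("E", "F"),
         ("F", "D"), ("A", "G"), ("G", "C"), ("A", "H"), ("H", "D")]
adj = build_adj(links)
cir = {frozenset(l): 100 for l in links}
cir[frozenset(("A", "H"))] = 900
cir[frozenset(("H", "D"))] = 900

new_backup = lspksp_backup(adj, cir, primary=["A", "B", "C", "D"])
print(new_backup)  # ['A', 'E', 'F', 'D']
```

In this toy network the path through G is rejected because it reuses NE C, and the shorter path through H loses to A, E, F, D because its total CIR is higher.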
Since the CBWUR indicator corresponds to the service bandwidth occupancy rate of the topo on each link, we find all the tunnels that the service can pass through and switch the service to a tunnel with lower bandwidth occupancy in order to reduce bandwidth usage. The specific steps are as follows.
Step 1: Take any gray wolf individual and check whether its fitness value is 0. If it is 0, move on to the next individual; otherwise, proceed to the next step.
Step 2: Determine the position where code 1 appears in the optical fiber part of the individual code. If 1 appears in the first 7 digits of the code, the bandwidth occupation of a section of the topo link of the primary path is too high. If 1 appears in the last 7 digits, the affected link belongs to the backup path.
Step 3: According to the source and sink network element nodes of the individual, use the ULKSP algorithm to find K paths, and record the set of paths found by the KSP algorithm as KSP5.
Step 5: Traverse the set KSP5 and evaluate the bandwidth occupation of each path. Find a path on which the bandwidth occupation of every link meets the standard and whose total bandwidth over all links is the smallest among the K paths. Individuals in which both paths have excessive bandwidth occupation need a second path that meets the same conditions.
Step 6: Use the path that was found to replace the corresponding primary or backup path. If the bandwidth usage of both the primary and backup paths is too high, the optimal KSP path replaces the primary path and the suboptimal KSP path replaces the backup path. After the replacement, update the next individual. If the KSP algorithm cannot find a suitable path, the topo is handled by expanding its bandwidth capacity, after which the next individual is updated.
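The selection in Steps 5 and 6 can be sketched as follows, assuming the candidate set KSP5 has already been computed. The function names, the 0.8 occupancy limit, and the example values are hypothetical, and "total bandwidth" is interpreted here as the sum of CIR over the links of a path. Returning None signals that the caller should fall back to capacity expansion.

```python
def links_of(path):
    """List of undirected links traversed by a path."""
    return [frozenset(pair) for pair in zip(path, path[1:])]


def rank_paths(ksp5, cir, capacity, limit=0.8):
    """Keep paths whose every link is within the limit, sorted by total CIR."""
    ok = [p for p in ksp5
          if all(cir[l] / capacity[l] <= limit for l in links_of(p))]
    return sorted(ok, key=lambda p: sum(cir[l] for l in links_of(p)))


def replace_paths(primary, backup, flags, ranked):
    """Apply [flag primary, flag backup]; None means no suitable path was found."""
    flag_primary, flag_backup = flags
    if len(ranked) < flag_primary + flag_backup:
        return None  # fall back to expanding the bandwidth capacity
    if flag_primary and flag_backup:
        return ranked[0], ranked[1]   # optimal for primary, suboptimal for backup
    if flag_primary:
        return ranked[0], backup
    if flag_backup:
        return primary, ranked[0]
    return primary, backup


# Hypothetical data: every link has capacity 1000, link G-C carries CIR 973.
load = {("A", "G"): 200, ("G", "C"): 973, ("C", "D"): 200, ("A", "E"): 100,
        ("E", "F"): 100, ("F", "D"): 100, ("A", "H"): 300, ("H", "D"): 300}
cir = {frozenset(k): v for k, v in load.items()}
capacity = {link: 1000 for link in cir}
ksp5 = [["A", "G", "C", "D"], ["A", "E", "F", "D"], ["A", "H", "D"]]

ranked = rank_paths(ksp5, cir, capacity)
result = replace_paths(["A", "B", "D"], ["A", "G", "C", "D"], (0, 1), ranked)
print(result)  # (['A', 'B', 'D'], ['A', 'E', 'F', 'D'])
```

The sketch does not check that a replacement path is disjoint from the path it is paired with; a fuller version would combine it with the LSPOR route check.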

6) LL MODEL SOLUTION PROCESS
The multi-objective optimization of the lower-level model takes the network data provided by the upper-level model as input, and all tunnel data are used as the initial population individuals of the improved multi-objective gray wolf algorithm. The parameters a, A, and C are initialized; the convergence factor a is improved by using trigonometric functions over partitioned intervals. The fitness values of all individuals are calculated, and the objective function values are derived from them. See part 4 of subsection B of Section III for the fitness calculation method. An individual whose two fitness values sum to 0 is regarded as a leader of the initial population. The algorithm then checks whether the iteration has ended. If it has not, all individuals are updated: all tunnels are traversed, and for each tunnel whose LSPOR fitness value is not 0, the backup path is replaced using the LSPKSP algorithm. For each tunnel whose CBWUR fitness value is not 0, the path that occupies too much bandwidth is identified and recorded as [flag primary, flag backup]. When only one item is 1, the optimal path found by the ULKSP algorithm replaces the path whose flag is 1. If both items are 1, the optimal and suboptimal paths found by the ULKSP algorithm replace the primary and standby paths respectively. After the update, the current objective function values of all individuals are calculated, the current non-inferior solutions are stored in the external archive according to the dominance relationship, and the parameters a, A, and C are updated. When the multi-objective iteration ends, the non-inferior solutions in the current archive are the solutions sought by the lower-level model. The specific flow chart is shown in Figure 11.
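The archive step of this process relies on Pareto dominance. A minimal sketch for two objectives that are both maximized is given below; the function names and the example objective pairs are hypothetical, and the paper's own archive management (for example, any size limit) is not shown in the text and is not modelled here.

```python
def dominates(u, v):
    """u dominates v if it is no worse on every objective and better on at least one."""
    return (all(a >= b for a, b in zip(u, v))
            and any(a > b for a, b in zip(u, v)))


def update_archive(archive, candidate):
    """Insert candidate if it is non-inferior, dropping any solutions it dominates."""
    if any(dominates(kept, candidate) or kept == candidate for kept in archive):
        return archive
    return [kept for kept in archive if not dominates(candidate, kept)] + [candidate]


# Hypothetical (LSPOR score, CBWUR score) pairs produced during the iterations.
archive = []
for objectives in [(0.90, 0.50), (0.80, 0.80), (0.70, 0.70), (0.95, 0.60)]:
    archive = update_archive(archive, objectives)

print(archive)  # [(0.8, 0.8), (0.95, 0.6)]
```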

C. PTN Bilevel multi-objective optimization solution scheme
First, a bilevel multi-objective optimization mathematical model is established. The improved KSP algorithm is then used to solve the upper model; the network optimized by the upper model serves as the input of the lower model, which is solved with the improved multi-objective gray wolf algorithm. The network obtained from the lower model is in turn fed back as the input of the upper model, and the two levels are iterated in sequence to obtain the optimal solution of the bilevel multi-objective optimization model. The specific flow chart is shown in the

A. Data preprocessing
Three regions are selected according to how regular their networks are, and experiments are conducted on them to judge the applicability of the scheme. The area 1 network is the most regular, the area 2 network is fairly regular, and the area 3 network is the most chaotic. In the original data, only some of the tunnels are equipped with the LSP 1:1 protection mechanism, so only tunnels of this type are used as initial population individuals. To facilitate the management and optimization of the network, the data are stored in MongoDB. The experimental machine is configured with an Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz, 16GB of memory, and a 64-bit Windows 10 operating system. The network configuration of the three regions is shown in Table V, where the number of primary and backup hops refers to the sum of all links of the active path and the standby path.

B. Experimental results
The results for the PTN networks in the three regions before optimization are shown in Table VI. The number of LSP abnormalities in area 1 is 329 and the number of CIR abnormalities is 9, so the condition of the entire network is relatively good. The number of abnormal LSPs in area 2 is 2556 and the number of abnormal CIRs is 118, which is worse than in area 1. The number of abnormal LSPs in area 3 is 2751 and the number of abnormal CIRs is 503; compared with the other two areas, it has the most abnormal data and the worst network condition. After optimization with the PTN bilevel multi-objective model, the abnormal data of the three regions are shown in Table VII. Compared with Table VI, the abnormal data of the two indicators in the three regions decreased by 293 and 9, 1437 and 101, and 2060 and 339, respectively. Table VIII shows the objective function values of the upper model and the lower model before and after optimization. The total number of hops in each region is greatly reduced, and the indicators of the lower model have reached a relatively optimal situation.
Because the population in this data set is large, this paper selects a single gray wolf individual for analysis. The gray wolf individual contains two paths, primary and backup, as shown in Fig 13(a), where black is the primary path and red is the backup path. Before optimization, the total number of hops is 8, and the primary and standby paths share NE C; the topo rate on the GC segment link of the backup path is GE, and the total BC is 1000. Before optimization, the BC was 973 and the CIR bandwidth occupancy rate was 97.3%, which is already in the alarm range and needs to be optimized in time.
As shown in Fig 13(b), the backup path is replaced with A->E->F->D. This resolves the same-NE problem and moves the path off the link whose CIR bandwidth was too high, reducing the CIR bandwidth occupied on that link. The total number of hops also falls from 8 to 6, so the network scale is smaller than before.
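The figures quoted in this subsection can be cross-checked with a few lines of arithmetic. The sketch below only recomputes values from the numbers stated in the text; the remaining counts are derived from the stated decreases and are not copied from Table VII, which remains the authoritative source. The variable names are illustrative.

```python
# (LSP abnormalities, CIR abnormalities) before optimization, as stated in the text.
before = {"area 1": (329, 9), "area 2": (2556, 118), "area 3": (2751, 503)}
# Decreases after optimization, as stated in the text.
decrease = {"area 1": (293, 9), "area 2": (1437, 101), "area 3": (2060, 339)}

remaining = {
    area: (before[area][0] - decrease[area][0], before[area][1] - decrease[area][1])
    for area in before
}
for area, (lsp, cir) in remaining.items():
    print(area, "remaining LSP:", lsp, "remaining CIR:", cir)

# CIR bandwidth occupancy of the GC link in the single-individual example.
occupancy = 973 / 1000
print(f"{occupancy:.1%}")  # 97.3%
```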

V. CONCLUSION
This paper proposes a PTN optimization method based on bilevel multi-objective optimization, in which the UL objective is that the operator pays the lowest cost for the service product, and the LL objective is the highest quality of service provided by the product supplier. To make the PTN bilevel multi-objective optimization problem easier to solve, we recast the UL objective as minimizing the overall PTN scale, and the LL objective as maximizing the two major network performance evaluation indicators in the PTN. The higher the indicator score, the better the network performance and the higher the quality of service provided. In this way, the PTN bilevel multi-objective optimization model is established, and the improved multi-objective gray wolf algorithm is used to solve it. For the most regular network, this solution reduces the network size from 334368 to 233428 hops, raises the LSPOR score from 94.93 points to 99.51 points, and raises the CBWUR score from 99.78 points to 100 points. For the most irregular network, this solution reduces the network size from 547050 to 414121 hops, raises the LSPOR score from 75.57 points to 93.86 points, and raises the CBWUR score from 88.02 points to 96.09 points. That is, the more regular the network, the better the optimization effect, which is also in line with the actual situation. The more regular the network, the more solutions the model can find and the faster it solves. Using this model to solve the problem not only increases the utilization of network resources, but also reduces network security risks and meets the needs of decision makers at the upper and lower levels. Therefore, the model proposed in this paper is feasible for solving the PTN bilevel multi-objective optimization problem.
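As a quick check on the network-scale figures quoted above, the relative hop reduction in the two networks can be computed directly from the stated hop counts. The function name is illustrative, and the percentages are derived here rather than reported in the paper.

```python
def hop_reduction(before, after):
    """Relative reduction in total hops, as a fraction of the original size."""
    return (before - after) / before


most_regular = hop_reduction(334368, 233428)
most_irregular = hop_reduction(547050, 414121)

print(f"most regular network:   {most_regular:.2%}")    # 30.19%
print(f"most irregular network: {most_irregular:.2%}")  # 24.30%
```

On these numbers the most regular network shrinks by about 30% and the most irregular one by about 24%.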
This article applies bilevel multi-objective optimization to the PTN for the first time. However, the multi-objective optimization of the LL model considers only two objectives, and other indicators in the PTN are not involved. Future research can consider how to use evolutionary algorithms to solve high-dimensional multi-objective optimization problems in the PTN. This article also does not consider the correlation between individuals; future work can explore this correlation to further optimize the PTN.