Achieving Multi-Time-Step Segment Routing via Traffic Prediction and Compressive Sensing Techniques

Traffic engineering (TE) is one of the most critical issues in networking, as it enables efficient and reliable network operations. With the advent of Machine Learning (ML) techniques, many ML-based TE methods have emerged in recent years, especially those employing Deep Neural Networks for future traffic prediction to enhance the performance of traditional approaches. However, current methods suffer from two major issues. Firstly, most prior works only solve the TE problem based on short-term traffic prediction, neglecting the network traffic dynamics over an extended time period. This oversight results in high network disturbance when numerous traffic flows need to be rerouted to adapt to traffic changes. Secondly, although traffic prediction models rely on historical traffic data to perform future prediction, ML-based TE studies often ignore the high overhead for network traffic monitoring. To address these issues, we propose a traffic prediction-based routing algorithm in which the routing rules can be applied to multiple time-steps without requiring changes, ultimately leading to reduced network disturbance. We employ the segment routing (SR) technique as the routing algorithm and formulate the multi-time-step segment routing method that incorporates future traffic prediction. To address the high monitoring overhead, we present an approach that combines partial traffic prediction and compressive sensing techniques to estimate unmeasured data. Through extensive experiments on real backbone network traffic datasets, we demonstrate that our proposal can achieve more than 80% of the optimal performance in reducing maximum link utilization while significantly reducing the number of routing changes and traffic monitoring cost.

Index Terms-Traffic engineering, segment routing, traffic prediction, graph neural network, network monitoring, compressive sensing.

I. INTRODUCTION
The Cisco Annual Internet Report [1] predicts that there will be 5.3 billion Internet users by 2023, an increase from 3.9 billion in 2018. Due to the high demand for Internet services such as video streaming and VoIP, backbone network traffic has experienced exponential growth. As a result, traffic engineering (TE) tasks like optimizing traffic routing and network monitoring face significant challenges.
Recently, many studies have leveraged Machine Learning (ML) and Deep Neural Network (DNN) techniques, combined with traditional TE solutions, to address network problems. ML/DNN techniques can be used to predict future traffic demands or to directly generate routing rules [2]. However, these approaches often suffer from two main problems. Firstly, they cause high network disturbance, i.e., a large number of rerouted flows, which degrades the overall Quality of Service (QoS) of the network [3]. Most of the proposed solutions only address the routing problem in a single snapshot (called a "time-step" in this paper) or use a short-term traffic prediction to calculate the routing rules without considering a long time horizon [4], [5], [6], [7]. Because network traffic is dynamic, the traffic matrix often varies over time, and the network controller may need to reroute many flows to balance the traffic loads, leading to significant network disturbance and service disruption. Depending on the traffic fluctuation, network optimization and traffic rerouting may be performed at high frequency (e.g., every minute). Secondly, current ML-based TE solutions impose a high network monitoring overhead. Many prior works focus only on solving routing problems and assume that network statistics such as the traffic matrix or link utilization are available. However, with the explosion of traffic and the expansion of the physical network, obtaining all the network statistics imposes a high monitoring overhead. In addition, applying ML/DNN to networking requires a huge amount of monitored data for the training and prediction processes. Although the quality of the measurements may have a huge impact on the performance of the TE solution [8], only a few studies consider the joint problem of network monitoring and traffic engineering. Moreover, the scalability of ML/DNN techniques is often overlooked in ML-based TE studies.
© 2023 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
To address these issues, this paper proposes a new approach that uses segment routing to optimize routing over a long time horizon and mitigates the need for frequent routing changes. A graph-based deep neural network is used to accurately predict future traffic demand, and based on the predicted values, a long-term routing method is proposed to optimize the routing rules. In this regard, we employ the segment routing (SR) technique as the routing algorithm and formulate the multi-time-step segment routing method (called MTSR) that incorporates future traffic prediction. To reduce network monitoring overhead, a partial traffic prediction approach is combined with the compressive sensing technique. Specifically, a partial future traffic matrix is predicted using a small amount of observed traffic, and the compressive sensing technique is used to reconstruct the entire traffic matrix. This approach reduces monitoring overhead while still achieving high performance in network routing. This work is an extension of a previous study [9]. The main contributions of this paper, which target high network disturbance and high monitoring overhead, are as follows.
• We address the issue of network disturbance by introducing the multi-time-step segment routing (MTSR) method, which utilizes an Integer Linear Programming formulation and advanced Deep Neural Network models for multi-step prediction to minimize the number of rerouting flows over a long time horizon.
• To account for prediction accuracy, we present three versions of the MTSR and provide a theoretical analysis of their performance.
• To enhance the practicality of MTSR by reducing network monitoring cost, we propose an extended approach called MTSR-CS that combines partial network traffic prediction with the compressive sensing technique.
• We conduct extensive experiments using different real backbone network datasets to evaluate the performance of our proposed methods and compare them to state-of-the-art approaches.
The remainder of the paper is organized as follows: Section II provides an overview of related work and problem discussion, Section III-C presents our proposed method MTSR to address the high network disturbance problem, Section IV presents an extension of MTSR (called MTSR-CS) which can reduce the network monitoring overhead, Section V provides extensive experimental results, and finally, we conclude the paper in Section VI.

II. BACKGROUND AND EXISTING WORKS
This section first provides a brief overview of traffic engineering (TE) and related works. Then, we discuss the two problems that are addressed in this paper.

A. Traditional Traffic Engineering (TE)
The minimization of congestion is widely considered one of the most significant objectives in traffic engineering. Typically, achieving this objective involves reducing the maximum link utilization (MLU) in a network. In theoretical works, the Multi-Commodity Flow (MCF) problem is often used to obtain fractional solutions in which traffic flows are split and directed through various paths. However, in practice, many ISP networks rely on shortest-path routing techniques, such as OSPF and IS-IS, due to their simplicity. By adjusting link weights in a distributed manner, shortest path routing can compute near-optimal forwarding paths. Nonetheless, this method has drawbacks, such as lengthy re-convergence time and poor performance when network topology or traffic demands change. Recent advancements in routing techniques, such as MPLS and RSVP, have offered increased flexibility and improved traffic engineering performance by enabling explicit routing paths. However, MPLS-TE solutions are known to have long convergence times and a high cost of maintaining the TE tunnels. Another routing approach, known as oblivious routing and described in [10] and [11], performs traffic routing based solely on network topology without any knowledge of current network traffic. This method is easier to implement and does not cause the rerouting problem associated with adaptive routing techniques.
Fig. 1. Illustration of 2-segment routing [12].

B. Traffic Engineering With Segment Routing
Segment routing is a routing paradigm that operates on the basis of source routing. It allows the source node (or ingress node) to embed a list of Segment Identifiers (SIDs) in the packet header. This segment list serves as a set of instructions (SR policy) to direct the packet through the network devices. While SIDs can distinguish both nodes and links in the network, this paper focuses solely on the node segment for simplicity. Packets originating from the source node must traverse all nodes in the segment list before being forwarded to the destination. Shortest path routing techniques, such as OSPF, are used to route the packet within the segment. Figure 1 demonstrates an instance of 2-segment routing, wherein a traffic flow originating from node i and destined for node j is directed through two segments, specifically, the i − k and k − j segments.
Due to its flexible routing capabilities, SR has been extensively investigated both theoretically and practically. Bhatia et al. [12] formulated the TE problem with SR as a linear programming problem, where only two segments were considered. Subsequent studies have focused on utilizing more than two segments. For instance, 3-SR [13] proposed an optimization model that uses three segments and combines both node and edge segments. Jadin et al. [14] further improved the TE problem with SR by fully exploiting n-SR (n-segment routing with both node and edge segments) and proposed the first Column Generation-based approach. In practical approaches, the authors in [15] considered unexpected traffic fluctuation and link failure problems and proposed a local search-based algorithm to solve the targeted problems under a sub-second constraint. Recently, SR Tunnel [16] and the study in [17] extended the work in [12] by proposing an optimization model to minimize the number of deployed SR policies.
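As a minimal illustration of the 2-segment idea described above, the following Python sketch builds the path of a flow from i to j through an intermediate node k by concatenating two shortest paths. It is only a sketch under stated assumptions: the topology `ADJ` is hypothetical, and hop-count BFS stands in for the IGP (e.g., OSPF) shortest path, which in practice depends on link weights.

```python
from collections import deque


def shortest_path(adj, src, dst):
    """Return one hop-count shortest path from src to dst using BFS."""
    if src == dst:
        return [src]
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in parent:
                parent[nxt] = node
                if nxt == dst:
                    # Walk back from dst to src through the parent pointers.
                    path = [dst]
                    while parent[path[-1]] is not None:
                        path.append(parent[path[-1]])
                    return path[::-1]
                queue.append(nxt)
    raise ValueError(f"no path from {src} to {dst}")


def two_segment_path(adj, i, k, j):
    """Concatenate the i->k and k->j shortest paths (node segment k)."""
    first = shortest_path(adj, i, k)
    second = shortest_path(adj, k, j)
    return first + second[1:]


# Hypothetical directed topology; neighbor order acts as the tie-break.
ADJ = {
    "i": ["a", "k"],
    "a": ["j"],
    "k": ["b"],
    "b": ["j"],
    "j": [],
}

direct = two_segment_path(ADJ, "i", "i", "j")  # k == i: plain shortest path
via_k = two_segment_path(ADJ, "i", "k", "j")   # detour through node segment k
print(direct, via_k)
```

Choosing k equal to i (or j) reduces the policy to ordinary shortest-path routing, which is why the single binary choice of k per flow is enough to describe a 2-SR policy.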

C. The High Network Disturbance Problem
The majority of proposed traffic engineering solutions tackle the routing optimization problem by considering a snapshot of the traffic demand at the current time-step. Although optimizing network routing at each time-step may effectively fulfill the traffic engineering objective, such as minimizing the peak link utilization, it gives rise to a notable problem: a considerable number of flows are rerouted to accommodate fluctuations in traffic demands, which results in pronounced disturbances across the network (e.g., service disruption). The impact of the network disturbance problem has been examined in [3]. The findings of that study indicate that substantial rerouting of flows can reduce total network throughput by up to 50%. Furthermore, rerouting can cause out-of-order problems in certain flows. The utilization of oblivious routing techniques [10], [11], [12] presents a viable method for mitigating the problem of network disturbance. By calculating the routing rules in advance using only the network topology (without prior knowledge of the traffic demand), these methods do not require changing the path of the network flows. While this approach can reduce the overhead caused by routing path changes, it may not perform well under dynamic network behaviors. Additionally, an oblivious routing method such as Traffic Matrix Oblivious Segment Routing in [12] is only practical for offline traffic engineering on relatively small networks, as mentioned in [15].
Recently, several studies [18], [19], [20] utilized machine learning techniques to mitigate this problem. For example, the work in [18] introduced an approach based on Reinforcement Learning. This approach identifies a subset of flows, termed the "critical flow set", and reroutes only those flows. Consequently, the number of rerouted flows is limited to k% of the total number of flows in the network (e.g., k = 10%). This method was further explored in [19] and [20], which integrated a graph neural network so that the approach generalizes to network scenarios not seen during training. However, this method still results in a large number of rerouted flows over the time horizon when frequently executed.

D. The High Network Monitoring Cost of ML-Based Approach
Recently, machine learning (ML) techniques have been increasingly utilized to address various networking issues, including traffic routing, demand prediction, and anomaly detection. In traffic engineering, ML can be employed to forecast future traffic demands, enabling proactive calculation of routing rules to adapt to anticipated changes in traffic patterns. Previous studies [4], [6], [7], [21] have utilized diverse Deep Neural Network models, such as Convolutional Neural Network and Long Short-Term Memory, for traffic prediction and have obtained promising outcomes compared to traditional methods. In general, applying ML techniques requires a significant amount of data for the training and prediction processes. Most ML-based TE solutions implicitly assume that the necessary data, such as historical traffic demands, is readily accessible, and the cost of network monitoring is frequently overlooked. To enhance the practicality of ML-based TE methods, some studies focus on alleviating the network monitoring overhead.
For instance, the authors in [7] reduced the monitoring cost by only measuring a subset of network flows and proposed a method that exploits the forward and backward ConvLSTM layers to correct the input data. More recently, VAE (Variational AutoEncoder) models have been used by Kakkavas et al. in [22] to learn the traffic distribution during the training phase. In the testing phase, the trained decoder of the VAE model is used to reconstruct the end-to-end traffic matrix from the links' load. This approach can be applied to reduce monitoring overhead since the number of links is typically less than the number of traffic flows, and link-level measurements are easier to obtain than flow-level measurements. However, this approach has not been evaluated for addressing traffic engineering issues.

III. REDUCING NETWORK DISTURBANCE LEVERAGING TRAFFIC PREDICTION
In this section, we first provide a concise overview of the network model and the traffic engineering problem using segment routing. Then, we present a traffic prediction-based routing algorithm named Multi-Time-Step Segment Routing (MTSR) to reduce network disturbance. We formulate the MTSR problem using three different traffic prediction approaches, each of which is based on the complexities inherent in the traffic prediction methods. Finally, we conduct a theoretical analysis of the three approaches. It is essential to note that within this section, we assume that historical traffic data is readily available for utilization in the prediction tasks. In practical scenarios, this data is typically acquired through network monitoring modules, which can cause substantial monitoring overhead. Consequently, to enhance the practicality of MTSR, we introduce the MTSR-CS approach in Section IV.

A. Network Model
The traffic engineering-based segment routing was first introduced in [12] as a traffic matrix-aware segment routing method. Hence, several notations and figures from [12] have been incorporated in this paper. The network is represented as a directed graph $G = (V, E)$, where $V$ is the set of nodes ($|V| = N$) and $E$ is the set of links, with each link $e \in E$ having a capacity of $c(e)$. The traffic matrix at time-step $t$ is denoted as $M^{t} \in \mathbb{R}^{N \times N}$, where $m^{t}_{ij} \in M^{t}$ signifies the traffic flow from node $i$ to $j$ at time-step $t$. The term "flow $ij$" represents the total traffic that enters the network at node $i$ and exits at node $j$. We define the binary variable $\alpha^{k}_{ij}$ as the routing policy, where $\alpha^{k}_{ij} = 1$ indicates that flow $ij$ is routed through the intermediate node $k$, and $\alpha^{k}_{ij} = 0$ otherwise. It is assumed that a central controller, such as an SDN controller [23], manages the network by collecting information about the network, predicting future traffic demands, and applying the routing policy to devices through PCEP [24]. However, the implementation of the controller is beyond the scope of this paper.

B. Single Time-Step Segment Routing
The 2-SR technique requires selecting a single intermediate node $k$ for each flow $ij$, as shown in Figure 1. The traffic between nodes $i$ and $k$ and between nodes $k$ and $j$ is directed through the shortest path connecting them. If $k$ is equal to $i$ or $j$, the flow $ij$ follows the shortest path from $i$ to $j$. The problem of segment routing for a single time-step ($P_0$) can be expressed as an integer linear program [12], where the variable $\theta \in \mathbb{R}^{+}$ denotes the maximum link utilization in the network. Although, in practice, $\theta$ cannot be larger than 1 as the total traffic on a link cannot exceed its capacity, the problem formulation allows $\theta$ to exceed 1, indicating an over-congested network.
$P_0$: single time-step segment routing
The equation $g^{k}_{ij}(e) = f_{ik}(e) + f_{kj}(e)$ is used to calculate the total traffic load on a link $e$ for flow $ij$ with intermediate node $k$. Here, $f_{ik}(e) = 1$ if link $e$ is part of the shortest path from $i$ to $k$ of flow $ij$ with intermediate node $k$, and $f_{ik}(e) = 0$ otherwise. Eq. (2) and (4) ensure that all traffic from $i$ to $j$ is routed and that it cannot be distributed across multiple paths. Eq. (3) ensures that the total traffic load on link $e$ is less than or equal to its capacity.
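For reference, the formulation that Eqs. (1)-(4) refer to can be sketched as follows. This is a reconstruction from the surrounding definitions and the 2-SR model of [12], not a verbatim copy of the original equations; in particular, the placement of $\theta$ on the right-hand side of the capacity constraint is assumed from the statement that $\theta$ denotes the maximum link utilization and may exceed 1.

```latex
\begin{align}
\mathbf{P_0}:\quad & \min_{\alpha,\,\theta}\ \theta \tag{1}\\
\text{s.t.}\quad & \sum_{k \in V} \alpha^{k}_{ij} = 1, && \forall i, j \in V \tag{2}\\
& \sum_{i, j \in V} \sum_{k \in V} m^{t}_{ij}\, g^{k}_{ij}(e)\, \alpha^{k}_{ij} \le \theta\, c(e), && \forall e \in E \tag{3}\\
& \alpha^{k}_{ij} \in \{0, 1\}, && \forall i, j, k \in V \tag{4}
\end{align}
```

Under this reading, (2) together with (4) selects exactly one intermediate node per flow, and (3) bounds the load on every link by $\theta$ times its capacity, so minimizing $\theta$ minimizes the maximum link utilization.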

C. Multi-Time-Step Segment Routing
The problem $P_0$ provides a routing policy for a single time-step $t$. Therefore, we need to re-solve problem $P_0$ and update the routing policy at every time-step. This approach may lead to a considerable number of rerouting flows and high network disturbance. To this end, we propose an extension of $P_0$, which addresses the segment routing problem by applying multi-time-step traffic prediction. Our strategy for mitigating network disturbance revolves around reducing the frequency of path alterations for traffic flows. This approach involves solving the TE problem taking into consideration the anticipated values of forthcoming traffic demand. Figure 2 illustrates the process within a routing cycle, which includes three tasks. At the beginning of the cycle (denoted by time-step $t$), we use the historical traffic data collected from the previous cycle to estimate future demands. Subsequently, these forecasted demand values are utilized for solving the TE problem. The routing rules derived from this approach can be applied for routing the traffic in the network over the $T$ time-steps of the cycle without necessitating frequent updates while retaining the adaptability required to accommodate dynamic demand fluctuations. Meanwhile, the traffic data is measured from time-step $t$ through time-step $t + T$ and will be used in the next routing cycle.
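To make the notion of network disturbance concrete, the following minimal Python sketch counts rerouted flows as the number of flows whose intermediate node changes between consecutively installed policies. The policies, node names, and the choice of two time-steps per cycle are hypothetical and only illustrate the bookkeeping; they are not results from the paper.

```python
def count_rerouted(old_policy, new_policy):
    """Count flows whose intermediate node differs between two policies."""
    return sum(1 for flow, k in new_policy.items() if old_policy[flow] != k)


def total_rerouting(policies):
    """Sum rerouted flows over a sequence of consecutively installed policies."""
    return sum(count_rerouted(a, b) for a, b in zip(policies, policies[1:]))


# Hypothetical policies: flow (src, dst) -> intermediate node k.
# Re-solving at each of four time-steps yields four policies.
per_step = [
    {("a", "b"): "a", ("a", "c"): "b"},
    {("a", "b"): "c", ("a", "c"): "b"},
    {("a", "b"): "a", ("a", "c"): "a"},
    {("a", "b"): "a", ("a", "c"): "b"},
]

# Holding one policy per routing cycle (here, two time-steps per cycle)
# yields only two policies over the same horizon.
per_cycle = [
    {("a", "b"): "a", ("a", "c"): "b"},
    {("a", "b"): "a", ("a", "c"): "a"},
]

per_step_changes = total_rerouting(per_step)
per_cycle_changes = total_rerouting(per_cycle)
print(per_step_changes, per_cycle_changes)
```

Fewer policy updates give fewer opportunities for a flow to change path, which is the effect the multi-time-step formulation aims for.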
In this context, we compare three different prediction schemes aimed at estimating the future traffic demand for the subsequent $T$ time-steps. First, we provide some notations regarding the traffic prediction used in this paper. The proposed approaches are subsequently utilized to formulate the multi-time-step segment routing problem in three variants, denoted as $P_1$, $P_2$, and $P_3$, respectively. Each variant is based on a specific traffic prediction approach, and their corresponding theoretical analyses are presented. We use a prediction model $\hat{y} = f(x, \omega)$ to estimate the traffic demands (i.e., traffic matrices) of the next $T$ time-steps using historical traffic data. Let $t$ be the current time-step. The input of the prediction model is denoted as $x = [M^{t-1}, \ldots, M^{t-H}]$, which contains the previous $H$ traffic matrices. The predicted values and the model parameters are represented by $\hat{y}$ and $\omega$, respectively. We describe the three traffic prediction approaches below:
• Full prediction: This approach involves estimating the traffic matrices of all network flows at every time-step in the next routing cycle. Therefore, we have $\hat{y} = [M^{t+1}, \ldots, M^{t+T}]$.
• Max prediction: In this approach, we only predict the largest traffic demand for each traffic flow in the next $T$ steps. The prediction is therefore a single traffic demand matrix that holds these per-flow maxima.
• Period-max prediction: This approach involves dividing the routing cycle into $P$ sub-periods, each of which has $T_p$ time-steps. We then estimate the maximum values of flow $ij$ in each sub-period. Accordingly, we have $\hat{y} = [M^{*}_{p}]$ $(p = 1, \ldots, P)$.
An example of these three prediction approaches is shown in Figure 3. While the full prediction approach provides more information to the routing algorithm about future traffic fluctuation, it may suffer from low prediction accuracy. For example, the performance of some proposed prediction models [7], [25] deteriorates as the number of predicted values increases. On the other hand, the max prediction approach reduces the prediction task's complexity by only estimating the worst-case scenario of network demands. The third approach can be seen as a trade-off between the first two methods.
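The three prediction targets can be illustrated on a toy sequence of future traffic matrices. The following pure-Python sketch computes, from a known future, what each approach would be asked to predict; the function names and the sample data are illustrative assumptions, and the sketch says nothing about how a model such as GWN produces these values.

```python
def elementwise_max(matrices):
    """Per-flow maximum over a list of equally sized square matrices."""
    n = len(matrices[0])
    return [[max(m[i][j] for m in matrices) for j in range(n)] for i in range(n)]


def full_target(future):
    """Full prediction: all T traffic matrices of the next routing cycle."""
    return [[row[:] for row in matrix] for matrix in future]


def max_target(future):
    """Max prediction: one matrix with the largest demand of each flow."""
    return elementwise_max(future)


def period_max_target(future, num_periods):
    """Period-max prediction: one per-flow maximum matrix per sub-period."""
    if len(future) % num_periods != 0:
        raise ValueError("T must be divisible by the number of sub-periods")
    steps = len(future) // num_periods  # T_p time-steps per sub-period
    return [
        elementwise_max(future[p * steps:(p + 1) * steps])
        for p in range(num_periods)
    ]


# Hypothetical future traffic for T = 4 time-steps on a 2-node network.
FUTURE = [
    [[0, 1], [2, 0]],
    [[0, 5], [1, 0]],
    [[0, 2], [4, 0]],
    [[0, 3], [3, 0]],
]

print(len(full_target(FUTURE)))      # T matrices
print(max_target(FUTURE))            # a single matrix
print(period_max_target(FUTURE, 2))  # P matrices
```

The number of values to predict shrinks from T matrices (full) to P matrices (period-max) to a single matrix (max), which is the trade-off discussed above.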

1) $P_1$ (MTSR With Full Prediction):
We present the problem formulations of the multi-time-step segment routing, denoted by $P_1$, $P_2$, and $P_3$, which are based on the three traffic prediction approaches. First, we provide the problem formulation for $P_1$ using the output of the full prediction approach. Let $M = [M^{1}, M^{2}, \ldots, M^{T}]$ be the predicted traffic matrices for the next $T$ time-steps, where the index $t$ for the current time-step is omitted for simplicity. To formulate $P_1$, we extend the problem $P_0$ for $T$ future steps by introducing additional constraints that correspond to the traffic demands at each step.
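The formulation of $P_1$ can be sketched as follows. This is a reconstruction that assumes $P_1$ keeps the structure of $P_0$ and only repeats the link capacity constraint for every future time-step, with equation numbers chosen so that the capacity constraint is (7), as the text refers to it; $m^{t}_{ij}$ here denotes the predicted demand of flow $ij$ at future step $t$.

```latex
\begin{align}
\mathbf{P_1}:\quad & \min_{\alpha,\,\theta}\ \theta \tag{5}\\
\text{s.t.}\quad & \sum_{k \in V} \alpha^{k}_{ij} = 1, && \forall i, j \in V \tag{6}\\
& \sum_{i, j \in V} \sum_{k \in V} m^{t}_{ij}\, g^{k}_{ij}(e)\, \alpha^{k}_{ij} \le \theta\, c(e), && \forall e \in E,\ t = 1, \ldots, T \tag{7}\\
& \alpha^{k}_{ij} \in \{0, 1\}, && \forall i, j, k \in V \tag{8}
\end{align}
```

Note that the routing variables $\alpha^{k}_{ij}$ carry no time index: a single policy must satisfy the capacity constraints of all $T$ steps, which is what allows it to be kept unchanged throughout the routing cycle.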
It can be observed that $P_1$ differs from $P_0$ due to the link capacity constraints (7). More specifically, in $P_1$, the routing policy $\alpha^{k}_{ij}$ must satisfy the link capacity constraints for each time-step in the future. The increased number of constraints makes $P_1$ more intricate than $P_0$. Additionally, $P_1$ necessitates predicting all the traffic matrices for the next $T$ time-steps, which remains a significant challenge even with the latest deep learning models. Notably, prediction accuracy plays a critical role in the performance of $P_1$. However, this approach might exhibit low prediction accuracy due to the large number of values that need to be predicted. Therefore, to mitigate the burden on traffic prediction tasks and reduce the problem complexity, we formulate $P_2$ and $P_3$ using the max prediction and period-max prediction methods. $P_2$ and $P_3$ can be considered as the relaxed versions of $P_1$.
2) $P_2$ (MTSR With Max Prediction):
In problem $P_2$, the formulation only takes into account the maximum values of each flow $ij$ over the next $T$ time-steps. This approach results in the same number of constraints as that of $P_0$. Furthermore, it only requires the prediction of a single traffic matrix with elements representing the maximum value of each traffic flow.
3) $P_3$ (MTSR With Period-Max Traffic Prediction):
Similar to $P_2$, we formulate $P_3$ using the maximum values of flow $ij$ in each sub-period $T_p$. To solve $P_3$, we only need to predict $P$ traffic matrices, which represent the maximum traffic of every flow $ij$ in each sub-period $T_p$. The differences between the proposed problem formulations are summarized in Table I. In general, the number of required predicted traffic matrices depends on the method used to estimate future traffic.
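The capacity constraints of $P_2$ and $P_3$ can be sketched in the same style. This is a reconstruction under assumptions: the symbols $m^{*}_{ij}$ (maximum demand of flow $ij$ over the cycle) and $m^{*}_{p,ij}$ (maximum demand of flow $ij$ in sub-period $p$) are notation introduced here for illustration, and the number (11) is taken from the later reference to constraint (11) in $P_2$. The objective and the constraints on $\alpha^{k}_{ij}$ are assumed to be the same as in $P_0$.

```latex
\begin{align}
\mathbf{P_2}:\quad & \sum_{i, j \in V} \sum_{k \in V} m^{*}_{ij}\, g^{k}_{ij}(e)\, \alpha^{k}_{ij} \le \theta\, c(e), && \forall e \in E, \quad m^{*}_{ij} = \max_{t = 1, \ldots, T} m^{t}_{ij} \tag{11}\\
\mathbf{P_3}:\quad & \sum_{i, j \in V} \sum_{k \in V} m^{*}_{p,ij}\, g^{k}_{ij}(e)\, \alpha^{k}_{ij} \le \theta\, c(e), && \forall e \in E,\ p = 1, \ldots, P \notag
\end{align}
```

With this reading, $P_2$ has one capacity constraint per link (as in $P_0$), $P_3$ has $P$ per link, and $P_1$ has $T$ per link.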

D. Theoretical Analysis
In this part, we theoretically analyze the performance ratios of $P_2$ and $P_3$ to $P_1$. We denote $(\theta^{*}_{1}, \alpha^{*}_{1})$, $(\theta^{*}_{2}, \alpha^{*}_{2})$, and $(\theta^{*}_{3}, \alpha^{*}_{3})$ as the optimal solutions obtained by solving $P_1$, $P_2$, and $P_3$, respectively. We denote by $u(t, e, \alpha^{*}_{p})$ the utilization of link $e$ when we apply routing policy $\alpha^{*}_{p}$ in routing cycle $t$. Then, the maximum link utilization of the network when applying routing policy $\alpha^{*}_{p}$, denoted as $u(\alpha^{*}_{p})$, is defined as follows:
Theorem 1: $u(\alpha^{*}_{p}) \le \theta^{*}_{p}$ for $p = 1, 2, 3$.
Theorem 2: Let $e^{*}$ be the link with the largest capacity, and $e^{*}_{2}$ be the link where the equal sign of constraint (11) in $P_2$ holds. Then, the performance ratio of $P_2$ to $P_1$ (i.e., $\theta^{*}_{2} / \theta^{*}_{1}$) is upper bounded by $\lambda$, which is calculated as follows.
The proofs are presented in Appendix A. According to Theorem 1, when applying the routing policies obtained from solving the MTSR problem, the actual maximum link utilization of the network does not exceed its theoretical value $\theta^{*}$. In addition, since the performance of $P_3$ is bounded by $P_1$ and $P_2$ (i.e., $\theta^{*}_{1} \le \theta^{*}_{3} \le \theta^{*}_{2}$), we derive the upper bound of $\theta^{*}_{2} / \theta^{*}_{1}$, which is also an upper bound of $\theta^{*}_{3} / \theta^{*}_{1}$. Assuming that all the network links have the same capacity, the performance ratio $\lambda$ largely depends on the network situation in the routing cycle. If all flows in the network have similar demands and are stable within the routing cycle, the value of $\lambda$ could be much greater than 1. However, network traffic is usually unbalanced: a small number of elephant flows account for a large portion of total network traffic, especially in ISP and data center networks [26], [27], [28]. Due to the unbalanced traffic flows, we may have $\sum_{ij} \max_{t} m^{t}_{ij} \approx \max_{t} \max_{ij} m^{t}_{ij}$. Therefore, the more unbalanced and dynamic the traffic flows, the lower the performance ratio $\lambda$ is.
According to Theorem 2, by solving the MTSR problem with formulation $P_2$, we can significantly reduce the problem complexity while still achieving a good performance for the routing policy. In addition, $P_2$ only requires the predicted values of the maximum demand instead of the predicted demands of every time-step in the next routing cycle. By doing so, $P_2$ alleviates the difficulty of the traffic prediction task. Therefore, we formulate the MTSR problem using the $P_2$ formulation. In addition, by using the maximum traffic prediction, we can easily extend the method to n-segment routing. There are several n-segment routing algorithms (e.g., SRLS [15]) that take the traffic matrix as input to compute routing rules. Hence, we can use the maximum traffic matrix obtained from the prediction model as the input for these n-segment routing algorithms. In the remaining part of this paper, when mentioning MTSR without any specification, we mean MTSR with formulation $P_2$.

E. Traffic Prediction-Based Graph Neural Network
As mentioned in the previous section, accurately predicted traffic matrices are required as input for all the problem formulations except $P_0$, which uses the current traffic for TE without any prediction. Note that our objective is not to develop a new prediction model but to leverage the existing models for addressing the traffic engineering problem. A large number of models have been proposed for traffic matrix prediction, such as those in [5], [7]. Here we adapt the prediction model to meet the requirements of our problem formulations. For example, in problem $P_1$, at the beginning of each routing cycle, the prediction model uses the historical data of the last $H$ time-steps to predict the traffic of the next $T$ time-steps. In $P_2$, the prediction model only needs to infer the maximum demand for each traffic flow.
In this paper, we use Graph WaveNet (GWN) [25] as the prediction model. Motivated by [29], GWN adopts stacked dilated causal convolutions to extract temporal features in the historical traffic data. Its structure is depicted in Figure 4. The model comprises multiple layers, with each layer consisting of two principal modules: the temporal convolution module and the graph convolution module. The temporal convolution module utilizes dilated causal convolution to capture a node's temporal trends, while the graph convolution module extracts a node's features based on its structural information. The final output is obtained by combining the outputs of all layers through a fully connected layer. It should be noted that each layer may include multiple blocks, each comprising the two modules. The GWN model was originally developed for predicting transportation traffic. In order to adapt GWN for network traffic prediction, we treat each traffic flow (i.e., source-destination flow $ij$) as a node in the graph, with the traffic volume of the flow serving as the attribute of the node. Since we do not have an explicit graph that represents the relationships among traffic flows, we employ the self-adjacency matrix module described in [25]. This module includes two learnable vectors that can dynamically learn the relationships among flows based on the current input data.
By combining the temporal convolution and the graph convolution modules, GWN is able to handle spatial-temporal data (e.g., traffic flows) and achieve better prediction accuracy than other time-series models such as Long Short-Term Memory (LSTM) and Auto-Regressive Integrated Moving Average (ARIMA).The implementation of GWN for traffic prediction used in this work can be found at [30].
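The dilated causal convolution used by the temporal module can be illustrated with a few lines of plain Python. This is a didactic sketch of the operation only, assuming zero padding on the left and hand-picked kernel weights; it is not the GWN implementation referenced in [30], which also includes gating, graph convolution, and learned parameters.

```python
def dilated_causal_conv(series, kernel, dilation):
    """Compute y[t] = sum_s kernel[s] * x[t - dilation * s].

    Samples before the start of the series are treated as zero, so the
    output at time t never depends on inputs later than t (causality).
    """
    out = []
    for t in range(len(series)):
        acc = 0.0
        for s, weight in enumerate(kernel):
            idx = t - dilation * s
            if idx >= 0:
                acc += weight * series[idx]
        out.append(acc)
    return out


# Hypothetical traffic volume of one flow over six time-steps.
x = [1, 2, 3, 4, 5, 6]

# Stacking layers with dilations 1 and 2 (kernel size 2) widens the
# receptive field to four past samples with only two layers.
layer1 = dilated_causal_conv(x, [1.0, 1.0], dilation=1)
layer2 = dilated_causal_conv(layer1, [1.0, 1.0], dilation=2)
print(layer1)
print(layer2)
```

With these unit weights, the last output of the second layer is the sum of the last four inputs, which shows how doubling the dilation per layer grows the history covered without increasing the kernel size.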

F. Obtaining Routing Rules
After getting the predicted traffic demand, we need to solve the optimization problem (e.g., $P_2$) to obtain the routing rules. To solve the MTSR problem, we can use optimization solvers or heuristic algorithms. However, MTSR is an integer programming problem with a vast search space, so traditional optimization solvers scale poorly to large topologies with more than 20 nodes [15]. Thus, using heuristic or metaheuristic algorithms is a practical way to solve this problem rapidly. Gay, Hartert, and Vissicchio proposed a local search algorithm called SRLS [15] to solve the n-segment routing problem. We adapted SRLS [15] and proposed an algorithm called Local Search 2 Segment Routing (LS2SR) that can effectively solve the MTSR problem (with the $P_2$ formulation) by exploiting the intrinsic structure of 2-segment routing. We also improved LS2SR by adding a mechanism to refine the routing policy from the previous cycle, thereby reducing the routing policy variation over different routing cycles. The details of this method have been presented in our prior study [9]. We have evaluated the scalability of LS2SR with different network sizes (see Appendix B) to show that LS2SR can work on large-scale networks and achieve comparable performance with SRLS.
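To give a feel for how a local search can operate on a 2-SR policy, the sketch below repeatedly changes the intermediate node of one flow whenever that lowers the maximum link utilization. It is a deliberately simple first-improvement search written for illustration and is not LS2SR or SRLS; the topology, capacities, demand values, and the use of hop-count BFS for shortest paths are all hypothetical assumptions.

```python
from collections import deque


def path_links(adj, src, dst):
    """Links on one hop-count shortest path from src to dst (BFS)."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in adj[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    if dst not in parent:
        raise ValueError(f"no path from {src} to {dst}")
    links, node = [], dst
    while parent[node] is not None:
        links.append((parent[node], node))
        node = parent[node]
    return links[::-1]


def max_link_utilization(adj, capacity, demand, policy):
    """MLU when each flow (i, j) is routed through policy[(i, j)]."""
    load = {link: 0.0 for link in capacity}
    for (i, j), volume in demand.items():
        k = policy[(i, j)]
        # A link on both segments is counted twice, as in g = f_ik + f_kj.
        for link in path_links(adj, i, k) + path_links(adj, k, j):
            load[link] += volume
    return max(load[link] / capacity[link] for link in capacity)


def local_search(adj, capacity, demand, max_rounds=100):
    """First-improvement search over the intermediate node of each flow."""
    policy = {flow: flow[0] for flow in demand}  # k = i: shortest-path routing
    best = max_link_utilization(adj, capacity, demand, policy)
    for _ in range(max_rounds):
        improved = False
        for flow in demand:
            for k in adj:
                if k == policy[flow]:
                    continue
                candidate = dict(policy)
                candidate[flow] = k
                value = max_link_utilization(adj, capacity, demand, candidate)
                if value < best - 1e-12:
                    policy, best, improved = candidate, value, True
        if not improved:
            break
    return policy, best


# Hypothetical bidirectional diamond topology with uniform capacities.
ADJ = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"], "d": ["b", "c"]}
CAPACITY = {(u, v): 10.0 for u in ADJ for v in ADJ[u]}
# Hypothetical max-prediction demand: flow (src, dst) -> peak volume.
DEMAND = {("a", "d"): 8.0, ("a", "b"): 6.0}

policy, best = local_search(ADJ, CAPACITY, DEMAND)
print(policy, best)
```

Because the max-prediction matrix is the only traffic input, the same routine could be rerun once per routing cycle; a practical solver would also seed the search with the previous policy to limit routing changes, as described above.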

IV. PARTIAL TRAFFIC PREDICTION AND COMPRESSIVE SENSING
In the previous sections, we have presented the multi-time-step segment routing (MTSR) strategy to address the network disturbance by utilizing future traffic prediction. The performance of MTSR relies on the accuracy of the traffic prediction model (i.e., GWN), which requires a large amount of data for the training and predicting processes. Note that, in Section III-C, we assume the data for traffic prediction tasks are available. However, collecting the historical data of all flows incurs a high network monitoring cost. As mentioned in Section II-D, this problem is often omitted in prior ML-based TE studies. To enhance the practicality of our proposed method, we introduce an extension of MTSR called MTSR-CS, which combines MTSR and the compressive sensing technique. In MTSR-CS, network measurements and traffic predictions are conducted only for a subset of flows. Subsequently, a complete matrix is reconstructed from the partial traffic predictions using compressive sensing before being used to calculate the routing rules.

A. Compressive Sensing-Based Network Traffic Reconstruction
According to compressive sensing theory, a signal can be reconstructed or recovered from a few samples by exploiting the sparsity of the original signal. Therefore, compressive sensing can be used to reconstruct the network traffic from a few measurements as follows:

$$Z = \Phi X, \tag{18}$$

where $X \in \mathbb{R}^{F \times 1}$ is a vector that contains the traffic volume of all flows, and $F = N \times N$ is the total number of source-destination flows in the network ($N$ is the number of nodes). Note that, instead of using an $N \times N$ matrix to represent the network traffic, it is represented by a vector that has $F$ elements. $Z \in \mathbb{R}^{L \times 1}$ represents the measured traffic volumes, $L$ is the number of flows that are measured ($L < F$), and $\Phi \in \{0, 1\}^{L \times F}$ is a binary matrix that indicates the measured flows. Figure 5 shows an example of partial traffic measurement.

However, since the network traffic $X$ is not sparse in practice, compressive sensing cannot be directly applied to reconstruct $X$ from $Z$. To overcome this problem, the authors in [31] use a transformation matrix $D \in \mathbb{R}^{F \times F}$ to sparsely project $X$ in the transformation domain $D$ as:

$$X = D S, \tag{19}$$

where $S \in \mathbb{R}^{F \times 1}$ is a sparse projection of $X$. From Eq. (18)-(19), we have:

$$Z = \Phi D S. \tag{20}$$

Since $S$ is a sparse vector, we can apply compressive sensing to reconstruct $S$ from the measurement $Z$. Then the full network traffic $X$ can be obtained by using Eq. (19).

Fig. 5. Partial traffic measurement. The traffic matrix X is represented as a vector.

Fig. 6. Illustration of partial traffic prediction and matrix reconstruction using compressive sensing.
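The role of the binary measurement matrix can be illustrated with a short NumPy sketch. The helper name `measurement_matrix`, the 3-node toy traffic matrix, and the chosen monitored indices are assumptions for illustration; the only point is that multiplying by Φ selects the monitored entries of the flattened traffic vector.

```python
import numpy as np

def measurement_matrix(monitored, num_flows):
    """Binary L x F matrix with a single 1 per row, at the index of a monitored flow."""
    phi = np.zeros((len(monitored), num_flows), dtype=int)
    phi[np.arange(len(monitored)), monitored] = 1
    return phi

N = 3                                                 # toy network with N nodes
traffic_matrix = np.arange(1.0, 10.0).reshape(N, N)   # hypothetical volumes
X = traffic_matrix.reshape(-1)                        # flattened vector, F = N * N
monitored = [1, 4, 8]                                 # indices of the L monitored flows
Phi = measurement_matrix(monitored, X.size)
Z = Phi @ X                                           # measured traffic volumes
print(Phi)
print(Z)
```

Because each row of Φ contains exactly one non-zero entry, `Phi @ X` is equivalent to indexing `X[monitored]`, so in practice Φ never needs to be stored densely.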

B. Reconstruct the Traffic Matrix From the Partial Traffic Prediction
In contrast to the approach proposed in [31], our methodology does not utilize compressive sensing to reconstruct the network traffic itself. Rather, we leverage it to reconstruct the maximum traffic demand from a partial traffic prediction. The conceptual framework behind this is presented in Figure 6. The routing algorithm (MTSR) takes a traffic matrix that contains the anticipated values of the maximum demand of all traffic flows in the next routing cycle. Our approach reduces the monitoring overhead by reconstructing this matrix from a partial traffic prediction. To this end, we first estimate the maximum demands of L flows via a prediction model. Subsequently, compressive sensing is applied to obtain the maximum traffic demands of all flows. Finally, the routing rules are calculated via the LS2SR algorithm [9]. Thus, instead of monitoring and predicting the entire traffic matrix, we monitor and predict the future demands of a subset of traffic flows, thereby decreasing the monitoring overhead.

Fig. 7. Generating the training dataset from historical traffic.
To apply the methodology outlined in [31], we define the following terms. Let Z (as defined in Eq. (18)) be a vector whose elements are the predicted maximum demands of the L monitored flows. Let X denote the vector comprising the predicted maximum demands of all F flows. The proposed technique is composed of two phases, namely the training phase and the testing phase. In the training phase, a training dataset is used to calculate the transformation matrix D and to train the prediction model. In the testing phase, partial traffic prediction for a subset of flows (Z) is carried out at the onset of each routing cycle. The maximum demands of all flows, X, are then reconstructed from Z using compressive sensing and the transformation matrix D. The routing rules are subsequently computed using the LS2SR algorithm. The details of the process are described in the following sections.

C. Training Phase
The training phase comprises two fundamental tasks, namely the acquisition of the transformation matrix D and the training of the prediction model. The training dataset encompasses the historical measured traffic volume of all flows. However, since max prediction targets the maximum demand during a routing cycle, the original dataset is converted into a set of maximum traffic matrices, as depicted in Figure 7. Each maximum matrix is constructed by computing the maximum value of each flow over the T time-steps within a routing cycle. The resulting matrix is then flattened into a vector that comprises F elements. Thereafter, the generated data is employed to train the prediction model and acquire the transformation matrix D.
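The conversion from raw traffic matrices to per-cycle maximum vectors can be sketched in a few lines of NumPy. The function name `cycle_max_vectors`, the array layout `(num_steps, N, N)`, and the choice to drop a trailing incomplete cycle are assumptions made for this illustration.

```python
import numpy as np

def cycle_max_vectors(traffic, T):
    """Convert (num_steps, N, N) traffic matrices into (num_cycles, F) max vectors.

    Each output row holds, for every flow, its maximum volume over the T
    time-steps of one routing cycle, flattened to F = N * N elements.
    """
    traffic = np.asarray(traffic, dtype=float)
    num_steps, n, _ = traffic.shape
    num_cycles = num_steps // T            # a trailing incomplete cycle is dropped
    cycles = traffic[:num_cycles * T].reshape(num_cycles, T, n * n)
    return cycles.max(axis=1)

# Toy history: 5 time-steps of a 2-node network (hypothetical volumes).
history = np.zeros((5, 2, 2))
history[0, 0, 1] = 5.0
history[1, 0, 1] = 3.0
history[2, 1, 0] = 7.0
history[3, 1, 0] = 2.0
max_vectors = cycle_max_vectors(history, T=2)
print(max_vectors)   # one row per routing cycle, one column per flow
```

The rows of the result can then serve both as prediction targets and as the columns of the training matrix used to learn D.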
The transformation matrix D can be acquired by solving the optimization problem:

$$\min_{D, \hat{S}} \; \|X - D\hat{S}\|_F^2 \quad \text{s.t.} \quad \|\hat{S}(i)\|_0 \le K, \; i = 1, 2, \ldots, T,$$

where $D \in \mathbb{R}^{F \times F}$ and $\hat{S} = (\hat{S}(1), \hat{S}(2), \ldots, \hat{S}(T)) \in \mathbb{R}^{F \times T}$ are two unknown variables. Furthermore, $X = (X(1), X(2), \ldots, X(T)) \in \mathbb{R}^{F \times T}$ denotes T vectors within the training set, and each column $\hat{S}(i)$ of $\hat{S}$ is a sparse representation of the traffic vector $X(i)$. The parameter K is an upper bound on the number of non-zero entries in the sparse representation.
As reported in [31], the Alternating Least Squares (ALS) algorithm can be employed to solve the aforementioned problem. First, we randomly assign values to the transformation matrix D. Then, given D, ALS is used to determine $\hat{S}$. Based on the obtained $\hat{S}$, we update the values of D. The iterative updates of $\hat{S}$ and D continue until a termination criterion (e.g., a maximum number of iterations) is met. The updating process of D can be described as follows. One column of the transformation matrix D is updated at a time. Let $d_k$ be the k-th column of D, and $\hat{s}_k$ be the k-th row of $\hat{S}$ (where k = 1, 2, ..., F). The multiplication $D\hat{S}$ is decomposed into the sum of M rank-1 matrices: $D\hat{S} = \sum_{j=1}^{M} d_j \hat{s}_j$. We have:

$$\|X - D\hat{S}\|_F^2 = \Big\| \Big( X - \sum_{j \neq k} d_j \hat{s}_j \Big) - d_k \hat{s}_k \Big\|_F^2.$$

Then, to update the k-th column of D, we solve the following optimization problem:

$$\min_{d_k, \hat{s}_k} \; \Big\| \Big( X - \sum_{j \neq k} d_j \hat{s}_j \Big) - d_k \hat{s}_k \Big\|_F^2.$$

We use Singular Value Decomposition (SVD) to update $d_k$ and $\hat{s}_k$. For the details of solving the above problem, please refer to [31]. Then, we repeat the process above to update the other columns of D.
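A rank-1 SVD update of one column can be sketched as follows. This is a K-SVD-style illustration under our own assumptions, not the exact procedure of [31]: the function name `update_column` is hypothetical, the update is restricted to the training samples that already use the column so that the sparsity pattern is preserved, and the data are random.

```python
import numpy as np

def update_column(X, D, S, k):
    """Refit column k of D and row k of S with the best rank-1 fit of the residual."""
    support = np.flatnonzero(S[k])       # training samples that use column k
    if support.size == 0:
        return D, S                      # unused column: nothing to refit
    D, S = D.copy(), S.copy()
    # Residual of the supported samples with the contribution of column k removed.
    residual = X[:, support] - D @ S[:, support] + np.outer(D[:, k], S[k, support])
    U, sigma, Vt = np.linalg.svd(residual, full_matrices=False)
    D[:, k] = U[:, 0]                    # new column: leading left singular vector
    S[k, support] = sigma[0] * Vt[0]     # new coefficients on the same support
    return D, S

# Random toy problem (hypothetical sizes): F flows, T training vectors, K non-zeros.
rng = np.random.default_rng(0)
F, T, K = 6, 12, 2
X = rng.normal(size=(F, T))
D = rng.normal(size=(F, F))
S = np.zeros((F, T))
for i in range(T):
    S[rng.choice(F, size=K, replace=False), i] = rng.normal(size=K)

errors = [np.linalg.norm(X - D @ S)]
for k in range(F):
    D, S = update_column(X, D, S, k)
    errors.append(np.linalg.norm(X - D @ S))
print(errors)   # the fitting error should not increase from one update to the next
```

Each update solves the rank-1 subproblem exactly on the supported samples, so the overall fitting error cannot grow, which is what makes the column-by-column sweep a sensible inner loop.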

D. Testing Phase
1) Traffic Reconstruction: During the testing phase, there are three main tasks to be accomplished: predicting the future traffic (i.e., Z), reconstructing the vector X, and calculating the network routing rules. At the beginning of each routing cycle, we estimate the maximum future demands of L flows using the monitored data obtained from the last cycle. After that, we reconstruct the vector X from the partial prediction. Let $Z \in \mathbb{R}^{L \times 1}$ be the predicted traffic values. We solve an optimization problem to find the sparse representation S using the transformation matrix D, the measurement matrix Φ, and the predicted traffic vector Z. Then, we obtain the reconstruction of X using Eq. (19).
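One way to carry out this reconstruction step is sketched below. The sparse-recovery solver is not specified in this section, so Orthogonal Matching Pursuit (OMP) is swapped in here purely as an assumption; the function names `omp` and `reconstruct`, the random matrix D, and the exactly sparse synthetic signal are all hypothetical and idealized (real traffic is only approximately sparse in the learned domain).

```python
import numpy as np

def omp(A, z, sparsity):
    """Greedy Orthogonal Matching Pursuit: find a sparse s with A @ s close to z."""
    col_norms = np.linalg.norm(A, axis=0)
    col_norms[col_norms == 0] = 1.0
    residual = z.astype(float).copy()
    support, coef = [], np.zeros(0)
    for _ in range(sparsity):
        scores = np.abs(A.T @ residual) / col_norms
        if support:
            scores[support] = -1.0       # never pick the same column twice
        support.append(int(np.argmax(scores)))
        # Least-squares refit on all columns selected so far.
        coef, *_ = np.linalg.lstsq(A[:, support], z, rcond=None)
        residual = z - A[:, support] @ coef
        if np.linalg.norm(residual) < 1e-10:
            break
    s = np.zeros(A.shape[1])
    s[support] = coef
    return s

def reconstruct(D, monitored, z, sparsity):
    """Recover the full vector X = D @ S from the values z of the monitored flows."""
    A = D[monitored, :]                  # equals Phi @ D for a row-selecting Phi
    return D @ omp(A, z, sparsity)

# Synthetic example: F flows, L monitored, 3 non-zero coefficients (assumed values).
rng = np.random.default_rng(0)
F, L = 64, 48
D = rng.normal(size=(F, F))
D /= np.linalg.norm(D, axis=0)
s_true = np.zeros(F)
s_true[[5, 20, 41]] = [3.0, -2.0, 1.5]
X = D @ s_true
monitored = np.sort(rng.choice(F, size=L, replace=False))
Z = X[monitored]                         # partial "prediction" of L flows
X_hat = reconstruct(D, monitored, Z, sparsity=3)
print(np.max(np.abs(X_hat - X)))         # reconstruction error over all F flows
```

In this idealized setting the unmonitored entries are recovered as well, because the sparse coefficients are fully determined by the monitored entries.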
2) Selecting Monitored Flows: After applying the routing rules, we determine which flows (a set of L flows) will be monitored in the routing cycle. We measure the demand of these flows within the cycle (from time-step t to time-step t + T). The monitored data is used as input to perform the max prediction in the next routing cycle. Intuitively, flows with large traffic volumes tend to have a significant impact on the routing performance. However, accurately monitoring the L largest flows is not feasible, since all traffic flows would need to be measured beforehand. Therefore, we propose a simple monitoring scheme based on the traffic volumes of the L monitored flows from the previous cycle. The new set of selected flows is a combination of the ϕL largest flows from the previous set and (1 − ϕ)L flows randomly selected from the other traffic flows, where 0 < ϕ < 1. Note that when performing max prediction, we estimate the maximum traffic demand of the L flows that have been measured in the preceding routing cycle. Therefore, the traffic data for executing max prediction is available.
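The selection rule can be sketched as below. The function name `select_monitored_flows` and the toy volumes are hypothetical, and the sketch assumes that "other traffic flows" means flows outside the previously monitored set and that at least (1 − ϕ)L such flows exist.

```python
import numpy as np

def select_monitored_flows(prev_monitored, prev_volumes, num_flows, phi, rng):
    """Keep the phi*L largest previously monitored flows and draw the rest at random.

    prev_monitored: indices of the L flows monitored in the previous cycle.
    prev_volumes:   their measured volumes, in the same order.
    """
    prev_monitored = np.asarray(prev_monitored)
    L = len(prev_monitored)
    num_keep = int(round(phi * L))
    order = np.argsort(prev_volumes)[::-1]           # largest volume first
    kept = prev_monitored[order[:num_keep]]
    # Assumption: random flows come from outside the previously monitored set.
    others = np.setdiff1d(np.arange(num_flows), prev_monitored)
    drawn = rng.choice(others, size=L - num_keep, replace=False)
    return np.sort(np.concatenate([kept, drawn]))

rng = np.random.default_rng(0)
new_set = select_monitored_flows(
    prev_monitored=[0, 1, 2, 3],
    prev_volumes=[5.0, 1.0, 9.0, 2.0],
    num_flows=10,
    phi=0.5,
    rng=rng,
)
print(new_set)   # flows 0 and 2 are kept; two more are drawn from flows 4..9
```

The random part gives previously unmonitored flows a chance to enter the set, so a flow that has recently become large can be discovered without measuring every flow.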

V. EVALUATION
We design four different experiments to evaluate our approach in both hypothetical and practical scenarios with different backbone networks, including real and synthetic datasets. The experimental results can be reproduced using the code at [30]. In the first two experiments, we evaluate the performance of the MTSR method under the assumption that the data of all traffic flows can be accurately measured. Subsequently, we examine the performance of the MTSR-CS method with different numbers of monitored flows (i.e., L). In the last experiment, we study the impact of several factors including: (1) different prediction models, and (2) the routing cycle length T. In addition, other ablation studies such as the scalability assessment of the routing algorithm LS2SR and the time dedicated to traffic prediction/reconstruction are presented in Appendix B.

A. Datasets and Performance Metrics
We conducted experiments on four datasets: Abilene, Geant, Germany, and Gnnet-40, available at [32], [33]. Abilene and Geant are well-known public datasets that are used for evaluating traffic routing algorithms in many studies such as [9], [18], [34]. Gnnet-40 is one of the synthetic network datasets used in the Graph Neural Networking challenge 2021 [33]. Details of the datasets are presented in Table II. Figure 8 illustrates the differences in traffic among all the datasets. Figure 8(a) shows the cumulative distribution function (CDF) of the traffic demands in the network, normalized to the range [0, 1]. Among these datasets, Gnnet-40 exhibits a uniform distribution in traffic demands.
The traffic data is divided into three sets, namely train, validation, and test, based on the time horizon, which correspond to 70%, 10%, and 20% of the complete dataset, respectively. Although the monitoring granularity of the datasets is relatively coarse, our problem formulation and proposed approach can be generally applied to network systems that have finer monitoring granularity.
We evaluate the performance of our proposed approach using two main metrics: the MLU ratio ($r_{mlu}$) and the rerouting disturbance (RD) adopted from [18]. The MLU ratio is calculated using Eq. (35):

$$r_{mlu} = \frac{\theta^*}{\theta}, \tag{35}$$

where $\theta^*$ is the maximum link utilization of the network obtained using $P_0$ and $\theta$ is the maximum link utilization obtained by the evaluated routing algorithm. Since in $P_0$ the routing rule is computed for every time-step using the current traffic demand, the MLU of $P_0$ can be considered the optimal result. Therefore, $r_{mlu} = 1$ means that the routing algorithm performs as well as the optimal routing.
The rerouting disturbance (RD) is defined in Eq. (36) as the ratio of the number of rerouted flows to the total number of traffic flows in a time-step:

$$RD = \frac{f_r}{F}, \tag{36}$$

where $f_r$ is the number of rerouted flows per time-step, and $F = N \times N$ is the total number of traffic flows.
We employ the Mean Absolute Error (MAE) as a metric for assessing the accuracy of predicted and reconstructed values in the tasks of future traffic forecasting and traffic matrix reconstruction. The MAE is calculated using Eq. (37):

$$MAE = \frac{1}{\Gamma} \sum_{i=1}^{\Gamma} |\hat{y}_i - y_i|, \tag{37}$$

where $\Gamma$ is the total number of predicted/reconstructed values, $\hat{y}_i$ denotes the predicted or reconstructed value, and $y_i$ denotes the ground-truth value.

Fig. 9. The empirical CDF MLU ratio of three MTSR methods.

Fig. 10. The MLU ratio $\theta_1/\theta_2$ over routing cycles. $\theta_1$ and $\theta_2$ are the maximum link utilization of the network obtained using methods $P_1$ and $P_2$, respectively.
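For reference, the three metrics can be computed with a few lines of Python. This is a sketch under stated assumptions: the MLU ratio is taken as the optimal MLU divided by the achieved MLU (so that a value of 1 is optimal), a routing policy is represented as a dictionary from flow to intermediate node, and the function names and sample values are hypothetical.

```python
import numpy as np

def mlu_ratio(theta_opt, theta):
    """Optimal MLU divided by the achieved MLU; 1.0 means optimal routing."""
    return theta_opt / theta

def rerouting_disturbance(prev_routing, new_routing):
    """Fraction of flows whose routing rule differs between two time-steps."""
    changed = sum(
        1 for flow, rule in new_routing.items() if prev_routing.get(flow) != rule
    )
    return changed / len(new_routing)

def mean_absolute_error(predicted, truth):
    """Average absolute difference between predicted and ground-truth values."""
    predicted = np.asarray(predicted, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return float(np.mean(np.abs(predicted - truth)))

# Hypothetical values for a quick check.
prev = {(0, 1): 1, (0, 2): 1, (1, 2): 2, (2, 0): 0}
new = {(0, 1): 1, (0, 2): 3, (1, 2): 2, (2, 0): 0}
r = mlu_ratio(theta_opt=0.4, theta=0.5)
rd = rerouting_disturbance(prev, new)
mae = mean_absolute_error([1.0, 2.0, 3.0], [1.0, 4.0, 6.0])
print(r, rd, mae)
```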

B. Experiment 1: Performance Comparison of Three MTSR Approaches
In the first experiment, we assess the performance of the multi-time-step segment routing with different traffic prediction methodologies, denoted as $P_1$, $P_2$, and $P_3$, respectively. These approaches are segment routing algorithms that employ full, max, and period-max traffic prediction techniques, as described in Section III-C. The objective of this experiment is to determine the best MTSR approach in the absence of prediction errors. For this purpose, we assume that future traffic matrices can be accurately predicted by utilizing the actual traffic matrices from the test set as input. We use the solver from the PuLP library [35] to solve all the problems and obtain the optimal solutions. In all experiments, the routing cycle length T for MTSR schemes is set to 12.
Figure 9 presents the empirical cumulative distribution function (CDF) of the MLU ratio ($r_{mlu}$) on the Abilene and Geant datasets. The results demonstrate that all MTSR approaches exhibit strong performance, achieving approximately 90% of the optimal results. Among the three approaches, $P_1$ yields the highest performance, followed by $P_3$ and $P_2$, respectively. However, the differences between their performances are marginal. Figure 10 illustrates the performance ratio $\theta_1/\theta_2$ across different routing cycles. As evidenced by the results, $P_2$, which uses the max prediction approach, can attain over 90% of the performance of $P_1$ with full prediction in both datasets. The observations reveal three key benefits of addressing the MTSR problem by adopting the max prediction approach ($P_2$): (a) mitigating the complexity of the problem, (b) alleviating the challenges associated with traffic prediction, and (c) attaining performance that is comparable to other approaches. Consequently, for the rest of this paper, MTSR refers to $P_2$ with the max prediction approach.

C. Experiment 2: Performance Comparison of Different Routing Algorithms
We conduct an experimental evaluation to assess the performance of the proposed MTSR approach, which utilizes the max prediction approach ($P_2$), and compare its performance with other routing algorithms. To predict future traffic matrices, we train a Graph WaveNet (GWN) model [25]. The prediction model leverages data from the last H time-steps to predict the maximum demand of each flow in the next routing cycle. The implementation of the Graph WaveNet model utilized in this study is adopted from [36]. Table III displays the parameters of the GWN model and the experiment settings employed in the training and testing phases.
At the beginning of each cycle, we utilize the predicted traffic matrices to solve the problem and obtain the routing policy.Subsequently, the actual traffic matrix is utilized to calculate the maximum link utilization for each time-step.We adopt the proposed LS2SR algorithm [7] to solve the MTSR problem, with the predicted traffic serving as the input.We set the routing cycle length to T = 6 and the number of historical steps utilized for the prediction model to H = 15.We set the values of β = 16 and γ = 1, which are adopted from [15].
We compare the performance of our proposed MTSR approach against several other routing algorithms: $P_0$ (same as in Experiment 1), Traffic Matrix Oblivious Segment Routing (OR) [12], shortest path routing based on link weights (SP), critical flow rerouting using Deep Reinforcement Learning (CFR-RL), TopK Critical (C-TopK), and TopK. For a fair comparison, we set the value of k% to 10% for CFR-RL, TopK, and TopK Critical, as adopted from [18]. Note that, in [18], new routing paths for the critical flows are obtained by solving the MCF problem, whereas we solve it using 2-segment routing. Since the network traffic is measured every five minutes (e.g., Abilene), in this paper, we set a time limit of 60 seconds for solving the routing rules for all the routing algorithms. The time for running the traffic prediction task is negligible, as shown in Appendix B.
Figures 11 and 12 present the empirical CDF of the MLU ratio and the rerouting disturbance, respectively.The results demonstrate that MTSR can significantly reduce network disturbance while achieving over 80% of the optimal routing performance in both datasets.Additionally, MTSR outperforms other routing algorithms in terms of MLU ratio.Regarding rerouting disturbance, CFR-RL can keep the disturbance at a similar value to our method by setting a maximum number of flows that can be rerouted per step (10%).However, in larger networks such as the Geant network, the rerouting disturbance of CFR-RL is considerably higher than our method.Shortest path routing and Oblivious Segment Routing cause no rerouting disturbance but are not adaptive to dynamic network demands, often resulting in high MLU.

D. Experiment 3: Performance Evaluation of MTSR-CS
The objective of this experiment is to assess the effectiveness of the MTSR-CS technique, which has been developed to minimize the monitoring overhead while maintaining the routing performance. The degree of monitoring overhead is quantified by the number of monitored flows, denoted by L. Intuitively, as the value of L increases, the monitoring overhead also increases. Thus, in this experiment, we vary the value of L from 10% to 100% of the total number of flows and measure the MLU ratio ($r_{mlu}$). Regarding our proposed monitoring scheme (Section IV-D2), the new set of selected flows is a combination of the ϕL largest flows from the set of the previous routing cycle and (1 − ϕ)L flows randomly selected from the other traffic flows. In this experiment, we set ϕ = 0.5. It is important to note that in the case of L = 100%, we carry out the MTSR method in the same manner as in the second experiment, i.e., without the utilization of compressive sensing.
As depicted in Figure 13, our results indicate that the use of compressive sensing leads to a significant improvement in r mlu , particularly when L < 30% total flows.In comparison with the conventional approach of full traffic monitoring (L = 100% total flows), the MTSR-CS method successfully reduces the monitoring overhead while still achieving comparable routing performance.Notably, in the Geant network, MTSR-CS is able to obtain the same performance result by monitoring only 10% to 20% of the total number of flows.This observation can be attributed to the uneven distribution of traffic in the Geant network, where a mere 10% of the flows (i.e., 48/484 flows) correspond to more than 80% of the total network traffic.By monitoring only 10% of the flows, we can effectively capture the traffic data of all the large flows within the network.Consequently, MTSR-CS with L = 10% of total flows can achieve similar routing performance to the case of full network monitoring.Note that in this experiment, MTSR only performs traffic prediction for the selected flows and refrains from employing compressive sensing techniques to reconstruct the entire traffic matrix.As a result, its performance does not exceed that of MTSR-CS, which incorporates compressive sensing methodologies.
Subsequently, we conduct the experiment using all four datasets to evaluate the impact of different traffic distributions on the matrix reconstruction and routing tasks.Here, we compare the MTSR-CS approach using different monitoring schemes and visualize the reconstruction error (MAE) and the routing performance (MLU ratio) in Figure 14 and Figure 15 respectively.In total, we have three monitoring schemes: Random, TopK, and our proposed monitoring scheme (see Section IV-D2).In the TopK monitoring scheme, we assume that the data of the L largest flows can be measured during each routing cycle.On the other hand, in the Random scheme, L flows are randomly selected for monitoring in each routing cycle.
Generally, it is observed that a lower reconstruction error of the traffic matrix corresponds to an enhanced routing performance. Considering the Abilene, Geant, and Germany datasets, our proposed scheme yields similar results to the TopK scheme in terms of reconstruction error, thereby achieving the same routing performance as the TopK approach, while significantly outperforming the Random monitoring scheme. The proposed monitoring scheme is considered a trade-off method between the Random and TopK monitoring schemes. Although the TopK monitoring scheme yields the best results, it is deemed impractical due to the strong assumption that the L largest flows can be identified prior to monitoring.

Fig. 15. The routing performance (MLU) of MTSR-CS using different monitoring schemes.
In the case of Gnnet-40, which has a uniform traffic distribution, distinct outcomes are observed. Notably, the TopK scheme exhibits the poorest performance among the monitoring schemes. This decline in the performance of TopK can be attributed to the rapid alteration of the monitored flow set throughout the routing cycle, as discussed in Section V-A. In the TopK scheme, measurements are conducted on the L largest flows, and this data is subsequently employed in the next routing cycle. Given that the majority of these flows will not constitute the largest flows in the following routing cycle, utilizing data from the prior cycle results in high reconstruction errors and consequently lower MLU ratios. In conclusion, our proposed scheme is not only more practical than the TopK scheme but also attains good performance across varying network sizes and traffic patterns.

E. Ablation Studies
In this section, we perform ablation studies to evaluate the performance of different prediction models and investigate the impact of the routing cycle length, and traffic distribution on the proposed routing algorithm.
1) The Performance of Different DNN-Based Models: We demonstrate the effectiveness of the Graph WaveNet (GWN) model for long-term traffic prediction by comparing its prediction error with that of other deep neural network (DNN) models, namely LSTM [37], GRU [38], STGCN [39], and MTGNN [40].We conduct the maximum traffic prediction with varying prediction steps T, which represent the length of the routing cycle, and investigate their impact on both the prediction error (measured as Mean Absolute Error) and the routing performance (measured as MLU ratio).We consider routing cycles of length 3 to 15 steps and perform each experiment five times to obtain the average results.Figure 16 illustrates the training processes of five different DNN models.Graph-based models including GWN, STGCN, and MTGNN can outperform RNN-based methods in the training process.Among them, GWN demonstrates superior performance in reducing the prediction error.
2) The Impact of the Routing Cycle Length: The relationship between the number of prediction steps and the error is illustrated in Figure 17.Overall, the GWN model outperforms the other prediction models in terms of MAE.In all cases, GWN achieves a reduction in prediction error ranging from 40% to 70% compared to the other models.Additionally, Figure 17 shows that increasing the length of the routing cycle leads to an increase in the prediction error.Figure 18 displays the MLU ratio of the network when using the predicted traffic demands from different prediction models.The results reveal that having the smallest error in traffic prediction, as achieved by the GWN model, helps to improve the MLU ratio.Furthermore, an increase in the length of the routing cycle leads to a degradation in network routing performance.

A. Conclusion
This paper presents a study on the use of multi-time-step segment routing (MTSR) with long-term traffic prediction to address the high network rerouting disturbance and high traffic monitoring overhead. To achieve this, we proposed a solution that utilizes traffic prediction to perform traffic engineering while reducing the number of flows that need to be rerouted. Given the complexity of the problem and the difficulty of multi-step traffic prediction, we formulated three versions of the MTSR problem and provided a theoretical analysis of these formulations. We also introduced an extension called MTSR-CS, which combines MTSR and the compressive sensing technique. In MTSR-CS, we only monitor and predict a subset of flows and use compressive sensing to reconstruct the full matrix from the partially predicted values, thereby reducing the network monitoring overhead.
Our evaluation of different network datasets showed that our proposed approach can significantly reduce network disturbance and meet the requirements for minimizing the maximum link utilization.The experimental results on MTSR-CS demonstrated that the monitoring cost can be significantly reduced while maintaining routing performance comparable to full network monitoring.

B. Discussion and Future Directions
There are two main differences in our network model compared to other studies that use segment routing as the traffic routing scheme. First, we only use 2-segment routing (2-SR) [12] instead of n-segment routing (n-SR) [15]. As pointed out in [12], [41], [42], [43], [44], 2-SR can achieve performance close to that of n-segment routing while significantly reducing the problem complexity. In addition, the set of possible paths between two nodes in n-SR is generally larger than that of 2-SR, which increases the probability of path changes and network disturbance when solving the routing problem. Therefore, using 2-SR can naturally reduce the network disturbance. However, even when using 2-SR, the network disturbance remains high, as demonstrated by the experimental results presented in Section V. Second, since our main targets are to reduce the number of rerouted flows and the network monitoring overhead, we do not consider a network in which a traffic flow can be arbitrarily split and routed over different paths, nor ECMP routing. However, our problem formulation can easily be applied to a network system with arbitrarily split flows or ECMP routing by treating the variable $\alpha_{ij}^{k}$ in our problem formulation as a real number (i.e., $\alpha_{ij}^{k} \in [0, 1]$ representing the split ratio of flow ij). It is important to emphasize that this adjustment pertains exclusively to the "problem solving" task (in Fig. 2) of our approach, and there is no difference in the training process of the deep neural network for traffic prediction.
Currently, our methods rely on a centralized controller for executing multiple tasks including collecting network traffic, performing traffic prediction, solving the routing problem, and distributing the routing rules. When applied to large-scale networks, this approach brings forth several challenges, notably the risk of single-point failure, heightened network monitoring overhead, and substantial delays in the distribution of routing rules. Therefore, we intend to address these issues specific to large-scale networks through an ML-based distributed routing algorithm leveraging the Multi-agent Reinforcement Learning technique [45].
First, we will prove the following hypothesis: "If $(\alpha_3, \theta_3)$ is a feasible solution of $P_3$, then it is also a feasible solution of $P_1$; if $(\alpha_2, \theta_2)$ is a feasible solution of $P_2$, then it is also a feasible solution of $P_3$". According to this hypothesis, we derive that $\theta_1^* \le \theta_3^*$ and $\theta_3^* \le \theta_2^*$. According to (15), $(\alpha_3, \theta_3)$ satisfies constraint (7); thus it is a feasible solution of $P_1$.
Similarly, $(\alpha_2, \theta_2)$ satisfies constraint (11), which gives (40). Let $T_p$ be an arbitrary sub-period of $T$; then $\max_{t \in T_p} m_{ij}^{t} \le \max_{t \in T} m_{ij}^{t}$. Therefore, from (40), $(\alpha_2, \theta_2)$ satisfies constraint (15); thus it is a feasible solution of $P_3$.
Theorem 4: Let $e^*$ be the link with the largest capacity, and $e_2^*$ be the link where the equal sign of constraint (11) in $P_2$ holds. Then, the performance ratio of $P_2$ to $P_1$ is upper bounded by $\lambda$, which is calculated as follows. Eq. (47) holds for all time-steps t. Therefore, from (46) and (47

1) Traffic Prediction and Reconstruction Time:
In this part, we measure the running time of the traffic prediction and matrix reconstruction tasks in the proposed MTSR and MTSR-CS methods (Fig. 19). All of the experiments are conducted on a single machine with a 40-core Intel Xeon Silver 4210R CPU @ 2.40GHz and an NVIDIA GeForce RTX 3090 card (CUDA version 12.1). According to Fig. 19, the prediction and reconstruction times of MTSR and MTSR-CS are less than 0.5 seconds even for the large network. Therefore, the time for estimating future traffic demand is negligible. In addition, the prediction time in MTSR-CS can be smaller since we only consider L traffic flows.
2) The Scalability of the Proposed Routing Algorithm: To evaluate the performance of our proposed methods on large-scale networks, we use the REPETITA dataset [46]. The dataset contains more than 200 topologies of real backbone networks, whose number of nodes varies from 4 to more than 100. For each topology, five traffic matrices were generated and adjusted so that the optimal values of the MLU (obtained from solving the MCF problem) are around 90%. This is the same dataset that was used to evaluate the scalability of the SRLS approach in [15]. The dataset was divided into three groups based on the number of nodes in the network topology. Group 1 comprised networks with fewer than 20 nodes, Group 2 comprised networks with 20 to 40 nodes, and Group 3 contained large networks with more than 40 nodes. We compared the performance of our approach with SRLS [15] and the Shortest Path (SP) routing approach. It is worth noting that SRLS used n-segment routing for solving the TE problem while LS2SR only used 2-segment routing.
We conducted the experiment using five synthetic traffic matrices of each topology and calculated the MLU, the number of rerouting flows per time-step (RC), and the average delay as performance metrics. The delay was computed by averaging the delay of all the traffic flows, where the delay of a flow was calculated as the sum of the link delays along its path. The link delays provided in the REPETITA dataset represented the geometrical distance between two nodes. Note that since SP has zero rerouting flows, its results were not shown in Fig. 21. The results of the experiment, shown in Fig. 20, Fig. 21, and Fig. 22, indicate that our proposed approach, LS2SR, achieved the same performance in terms of MLU as SRLS. However, LS2SR had the best results in reducing the number of rerouting flows and had similar results compared to SP, consistently outperforming SRLS regarding the average delay metric.
Achieving Multi-Time-Step Segment Routing via Traffic Prediction and Compressive Sensing Techniques
Van An Le, Yusheng Ji, Fellow, IEEE, Huu Huy Tran, Phi Le Nguyen, Member, IEEE, and John C. S. Lui, Fellow, IEEE

Fig. 3. An illustration of three traffic prediction approaches on one traffic flow. The red crosses indicate the values that need to be predicted corresponding to each method.

Fig. 8. The distribution and dynamicity of all traffic datasets.

Fig. 13. The MLU of MTSR and MTSR-CS with different percentages of monitored flows.

Fig. 14. The MAE of reconstructed matrices using different monitoring schemes.

Fig. 16. The MAE of different prediction models in case H = 15 and T = 6.

Fig. 17. The prediction errors of different models with T varying from 3 to 15.

Fig. 18. The routing performance of different models with T varying from 3 to 15.

Fig. 19. The run time of traffic prediction and traffic reconstruction tasks.

Fig. 20. The maximum link utilization of different groups of network topology.

Fig. 21. The number of rerouting flows per time-step of different groups of network topology.

Fig. 22. The average delay per time-step of different groups of network topology.

TABLE I: THE DIFFERENCES OF THREE PROBLEM FORMULATIONS

TABLE II: DETAILS OF THE REAL TRAFFIC DATASETS

TABLE III: THE PARAMETERS OF THE GWN MODEL AND THE EXPERIMENT SETTING

, we have: ij is the total amount of traffic routed through link e at time-step t, it should be greater than or equal to the traffic of any pair ij routed through e, it means that ij m t ij (47) ), we deduce that