Multi-Path Routing in Green Multi-Stage Upgrade for Bundled-Links SDN/OSPF-ECMP Networks

This paper considers the novel problem of upgrading a legacy network into a Software Defined Network (SDN) over multiple stages and saving energy in the upgraded network, or hybrid SDN. That is, in each stage, the problem at hand is to select and replace legacy switches with SDN switches and reroute traffic to power off as many unused cables as possible to save energy. Also, the operator must consider: (i) the available budget at each stage, (ii) maximum path delays, (iii) maximum link utilization, (iv) per-stage increase (decrease) in traffic size (upgrade cost), and (v) the Open Shortest Path First - Equal Cost Multi-Path protocol. This paper addresses two multi-path routing scenarios: 1) non-link-disjoint and 2) link-disjoint. It outlines a Mixed Integer Program and a heuristic algorithm for each scenario. The experimental results show that: (i) both solutions produce at most 0.63% higher energy saving in scenario-1 than in scenario-2, (ii) the mixed integer program (heuristic algorithm) gives an energy saving of up to 71.93% (71.64%) across both scenarios, (iii) using a larger budget and/or number of stages can increase the energy saving, and (iv) the saving achieved by the heuristic solution for each scenario is within 4% of the optimal saving.


I. INTRODUCTION
A Software Defined Network (SDN) offers operators a new network management paradigm [1]. It consists of a set of SDN-switches or s-switches and one or more controllers [1]. A controller provides a global view of a network. It helps an operator optimize network performance metrics such as the maximum link utilization (MLU) [2] and/or energy saving [3]. Consequently, network operators are keen to upgrade their legacy networks to SDNs. To do so, they must consider their available budget, advances in SDN equipment, and the cost reduction or depreciation of network equipment over time. Hence, legacy switches or l-switches are likely to be upgraded over multiple stages, creating so-called hybrid-SDNs, which contain l-switches along with s-switches.
The associate editor coordinating the review of this manuscript and approving it for publication was Peng-Yong Kong. VOLUME 9, 2021. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Another recent consideration is energy efficiency. It is well known that current networks are overprovisioned, e.g., in link bandwidth, which satisfies traffic demands during peak hours but is underutilized during off-peak periods [4]. To this end, backbone networks now utilize IEEE 802.1AX [5], a bundled-link technology where logical links consist of multiple physical cables. IEEE 802.1AX enables network operators to scale the bandwidth or the number of cables in each link as per traffic demands [4]. More importantly, during off-peak hours, unused cables can be switched off to reduce their energy cost. For example, the work in [4] and [6] aimed to switch off as many cables as possible and reroute the traffic flows carried by these cables onto other paths. They considered multi-path routing using Multi-Protocol Label Switching (MPLS). On the other hand, the work in [7] considered Open Shortest Path First - Equal Cost Multi-Path (OSPF-ECMP) routing to maximize energy saving. Further, multi-paths that do not share a common link, called link-disjoint paths, are used to provide path resiliency against link failures [8]. Reference [9] showed how to save energy in legacy networks while maintaining link-disjoint multi-paths. Multi-path routing is ideal for use in SDNs, e.g., [2] and [3], because an SDN controller allows: (i) s-switches to use non-shortest paths and (ii) each source s-switch to split unequal amounts of traffic onto each path. Hence, this paper considers a novel problem in network upgrade. Specifically, it presents solutions for upgrading a subset of l-switches into s-switches over multiple stages.
In addition, the resulting hybrid-SDN must support multi-path routing and allow each s-switch to turn off the maximum number of unused cables. The upgrade maintains the same routing service to users. More specifically, if a traffic demand in a legacy network is routed via link-disjoint paths, the demand must be routed via at least two link-disjoint paths after the network upgrade. Otherwise, the demand can be routed via multi-paths that can share common link(s), called non-link-disjoint paths, or even a single path. The upgrade is also subject to the following constraints: (i) active cables must have sufficient capacity to carry traffic demands, (ii) each path has a delay no larger than a given delay constraint, (iii) there is a maximum budget to upgrade switches per stage, and (iv) each l-switch complies with OSPF-ECMP. In addition, the solution must consider increasing traffic volume and decreasing switch upgrade cost over multiple stages.
Given the above research aim, the main contributions of this paper are as follows:
• It presents a novel problem to maximize energy saving in a hybrid-SDN. It consists of two sub-problems: (i) multi-stage l-switch upgrade, and (ii) splitting traffic optimally via s-switches and setting link cost to ensure that each l-switch complies with OSPF-ECMP.
• It contains a novel Mixed Integer Program (MIP) that can be used to compute the optimal solution for small-size networks. The MIP considers two multi-path routing scenarios: 1) non-link-disjoint paths, and 2) link-disjoint paths. The paper also presents an analysis of the complexity of MIP and its NP-Hardness. Note that our solution for routing scenario-1 can be used to upgrade a legacy network where users do not require link-disjoint paths. Further, it produces an upper bound on energy saving for scenario-2.
• It proposes a heuristic algorithm that can be used in large-scale networks for each of the aforementioned routing scenarios. It also outlines the time complexity of the algorithm as well as a proof of correctness.
Next, Section II discusses existing works on minimizing energy expenditure and those that carry out multi-stage upgrade of SDNs. Section III presents our network model, notations and MIP. Section IV describes our proposed heuristic solution. Section V outlines our results. Finally, Section VI concludes the paper and provides future research directions.

II. RELATED WORK
Works on green routing aim to reroute traffic so that it uses the minimal number of network components, e.g., line cards or links and switches. The unused components are then powered off [10]. For example, the efforts in [11] and [12] introduced energy-aware routing via single path routing with an MLU constraint. References [9] and [7] maximized the energy saving of legacy networks with non-bundled links by respectively employing MPLS and OSPF-ECMP based multi-path routing that satisfies MLU. The authors of [13] designed an energy-efficient bundled-link with two types of cables, namely cables with different energy levels and cables with sleep mode. On the other hand, the work in [14] considered traffic load distribution among IEEE 802.3az cables in a bundled-link to minimize their usage. Other works, such as [4] and [6], aimed to maximize energy saving in backbone networks that support bundled links and MPLS. The work in [4] assumed that the power consumption of all links and l-switches is independent of traffic load. On the other hand, the authors of [6] assumed each link and l-switch has a different power usage. Further, they considered routing over multi-paths, delay tolerance, and MLU.
There are many works on improving the energy efficiency of SDNs. For example, the work in [15] and [16] powered down unused links in pure SDNs that only have s-switches. Both works considered a single communication path bounded by MLU and path delay for each pair of s-switches and from each s-switch to its associated controller. Many works have considered an incremental upgrade strategy; e.g., the authors of [17] considered hybrid SDNs with s-switches and l-switches. Moreover, these works consider the co-existence of an SDN controller that programs s-switches and legacy routing protocols, such as OSPF and MPLS. To date, research into hybrid SDNs, e.g., [2], [3], [18]-[22] and [23], has assumed that the SDN controller has access to all required network information, including that from l-switches. Our work follows the same assumption, and the placement of multiple SDN controllers is deferred to future work.
A hybrid SDN can be formed by incrementally upgrading l-switches with s-switches [17]. The upgrades are performed over one stage [2], [3], [18]-[20] or multiple stages [21]-[23]. Reference [2] used a greedy algorithm to upgrade a set of l-switches with the highest total traffic load on their outgoing links. The authors considered multi-path routing to minimize MLU. In [3], s-switches are randomly and uniformly distributed in a hybrid SDN. Each s-switch can split traffic to maximize energy saving; however, each l-switch uses OSPF to compute a single shortest path. The work in [18] used a given set of partially deployed s-switches to minimize the power usage of both s-switches and their adjacent links. Another work in [19] considered traffic routing via a single path to minimize the power consumption of s-switches and links that are adjacent to the s-switches. The authors first select a set of l-switches based on different criteria, e.g., in decreasing order of their number of l-links, before performing traffic routing. On the other hand, the authors of [20] jointly addressed the problems of upgrading up to m l-switches and traffic routing to minimize the power usage. They assume OSPF routing for all l-switches and single path routing for all s-switches. Similar to [20], we jointly optimize the upgraded l-switches and traffic routing to maximize the number of unused cables.
An operator incurs less risk in terms of performance and security degradation if a network is upgraded over multiple stages [22]. To this end, given a total budget (in $), the work in [21] and [22] aimed to upgrade l-switches in order to maximize the number of paths available to s-switches over T stages. Moreover, the authors of [21] considered a fixed upgrade cost (in $) for each l-switch. In contrast, Poularakis et al. [22] consider an upgrade cost that decreases over time and assume that traffic size (in bytes) increases over multiple stages. In addition, the authors of [22] aimed to maximize traffic controllability, i.e., the traffic flows that pass through at least one s-switch.
Recently, the work in [23] addressed a multi-stage SDN deployment problem. Its goal is to maximize energy saving by shutting down as many unused cables in each link as possible. The authors of [23] considered: (i) decreasing switch upgrade cost and increasing traffic volume over time, (ii) using a maximum budget at each stage, (iii) satisfying MLU, and (iv) ensuring the upgraded network is able to support existing flows. Their work ensured each flow is routed via a single path whose delay may be longer but does not exceed the given delay constraint. They proposed an Integer Linear Program (ILP) formulation to solve the problem for small networks and a heuristic algorithm called GMSU that can be used for larger networks. In contrast, our recent work in [24] considered two types of multi-path routing for each demand: (i) those that traverse only l-switches and (ii) those that traverse at least one s-switch. For type (i), the traffic flow of each demand is routed using OSPF-ECMP, i.e., each l-switch splits a flow equally over multiple shortest paths. In contrast, for type (ii), the traffic flow of each demand can be split unequally over multi-paths that are not necessarily the shortest paths. Moreover, the work addressed two main challenges in maximally switching off unused cables, i.e., for (i), link costs may need to be adjusted to ensure each l-switch complies with OSPF-ECMP and, for (ii), each s-switch needs to optimally split traffic among its selected multi-paths.
We summarize the differences between this paper and our previous work [24] as follows:
• This paper considers two alternative routing scenarios: 1) multi-path routing as in [24], and 2) link-disjoint multi-path routing, where the selected paths for each demand have no common link. In the case where a demand has no link-disjoint paths, the demand is routed as per scenario-1.
• This paper proposes an alternative MIP as well as a heuristic algorithm to implement scenario-2, and reports their simulation results.
• This paper provides a qualitative analysis of the proposed MIP and heuristic solution.
• This paper discusses the effect of our solutions in terms of traffic controllability [22].

III. PRELIMINARIES
Section III-A first describes the network model. Table 1 summarizes our notations. Section III-B presents a mathematical model of the optimization problem. Finally, Section III-C analyzes the problem complexity.

A. NETWORK MODEL
Let \(G^0(V, E)\) be a legacy network with \(|V|\) nodes or l-switches and \(|E|\) directed links. Each link \((u, v) \in E\) has a bundle size of \(b_{uv}\) cables and a propagation delay of \(\pi_{uv}\) (in seconds). The capacity of each cable is \(\gamma\) (in bytes). Thus, the link capacity is \(c_{uv} = b_{uv} \times \gamma\) (in bytes). Let \(T \geq 1\) be the given planning horizon. The duration of each stage \(t \leq T\) is determined by the lifetime of network devices, e.g., three to five years [22]. Let \(G^t(V, E)\) be the network after undergoing an upgrade at stage \(t\). Let \(V^t \subset V\) denote the l-switches that have been upgraded to s-switches. Each s-switch is a hybrid switch; an example is the OpenFlow-hybrid switch in [25], which supports both OpenFlow and normal Ethernet switching operation. Each link \((u, v) \in E\) in \(G^t(V, E)\) is a c-link if it is adjacent to at least one s-switch; otherwise it is an l-link. As per [18], [19], and [20], only cables in a c-link are powered off when they have no traffic. Also, every cable of each c-link runs IEEE 802.3az [26], meaning it can be placed in either an active or a sleep state. Without loss of generality, this paper assumes each l-switch does not turn off unused cables, i.e., the switch does not comply with the IEEE 802.3az [26] standard.
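The link model above can be illustrated with a minimal Python sketch. It is not the authors' implementation; the function names `link_capacity` and `classify_links`, and the representation of links as node pairs, are assumptions made only for this example.

```python
def link_capacity(bundle_size, gamma):
    """Capacity c_uv of a bundled link: b_uv cables of capacity gamma each."""
    return bundle_size * gamma


def classify_links(links, upgraded):
    """Label each directed link (u, v) as a c-link or an l-link.

    A link is a c-link if at least one endpoint is an s-switch
    (i.e., is in the set of upgraded switches); otherwise it is an l-link.
    """
    return {
        (u, v): "c-link" if (u in upgraded or v in upgraded) else "l-link"
        for (u, v) in links
    }


# Hypothetical four-node chain in which only node "a" has been upgraded.
labels = classify_links([("a", "b"), ("b", "c"), ("c", "d")], upgraded={"a"})
```

In this toy chain only link (a, b) touches an s-switch, so it is the only link whose cables may be powered off.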
Let \(B\) be the total budget (in $) over time \(T\), and let \(B^t \leq B\) denote the maximum budget at stage \(t\). The total cost to upgrade the l-switches in \(V^t\) cannot exceed the budget \(B^t\).
Any unused budget in stage \(t\), denoted by \(\bar{B}^t\), can be spent in subsequent stages. Thus, we set \(B^t = B/T + \bar{B}^{t-1}\). Let \(p^t_v\) (in $) be the cost of upgrading switch \(v\) in stage \(t\). The upgrade cost of a switch may vary over time depending on its model and type, e.g., edge or core switch [22]. We use \(\rho\) to denote the depreciation rate in switch upgrade cost, where \(0 \leq \rho < 1\). Hence, the cost \(p^t_v\) decreases at rate \(\rho\) per stage. Let \(D^t\) denote a set of traffic demands in \(G^t(V, E)\). Nodes \(s_d \in V\) and \(\tau_d \in V\), respectively, represent the source and destination of each demand \(d\). Demand \(d\) has a traffic volume \(\omega^t_d > 0\) (in bytes). Let \(D^0 = D^1\) denote the set of traffic demands in \(G^0(V, E)\) and \(\omega^0_d\) be the initial traffic volume of demand \(d \in [1, |D^0|]\). The traffic volume of each demand \(d\) increases with each successive stage at rate \(\mu_d \in [0, 1]\). We assume network \(G^0(V, E)\) has sufficient capacity to carry all demands at their maximum volume, i.e., \(\omega^T_d\) for each demand \(d\).
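The budget carry-over rule, in which each stage receives B/T plus whatever was left unspent in the previous stage, can be sketched as follows. The function name `stage_budgets` and the list of per-stage spending are hypothetical inputs chosen for illustration; in the paper the spending is decided by the optimization itself.

```python
def stage_budgets(total_budget, num_stages, spent_per_stage):
    """Return the budget available at each stage under carry-over.

    Each stage receives total_budget / num_stages plus the unused budget
    of the previous stage. spent_per_stage[t] is the (hypothetical) amount
    actually spent on upgrades in stage t.
    """
    per_stage = total_budget / num_stages
    carry = 0.0
    available = []
    for spent in spent_per_stage:
        budget = per_stage + carry
        if spent > budget:
            raise ValueError("spending exceeds the stage budget")
        available.append(budget)
        carry = budget - spent  # unused budget rolls into the next stage
    return available


# Total budget of $1.2M over three stages with assumed spending per stage.
budgets = stage_budgets(1_200_000, 3, [350_000, 400_000, 450_000])
```

Here the $50K left over in stage 1 raises the stage-2 budget to $450K, and the same amount rolls over again into stage 3.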
Let \(P^x_d\) be a set of paths from node \(x\) to node \(\tau_d\). Let \(y^t_{d,uv,i}\) be a binary variable that is set to [...] \(\pi_{uv}\). We assume that the propagation delay \(\pi_{uv}\) of link \((u, v)\) is proportional to the distance between nodes \(u\) and \(v\) [27]. Let [...]
Finally, unused or idle cables are switched off by powering off their line card to save energy. Specifically, a cable can be powered off, in which case it is called an off-cable, if it is connected to at least one s-switch, i.e., it is a cable of a c-link. Note that line cards account for a significant fraction of a router's energy consumption [4]. Thus, without loss of generality, we assume a cable's energy consumption is equivalent to that of its line card. Also, similar to the energy saving model of [4], each cable with traffic consumes the same amount of energy. For example, an on-cable with 1% load and another with 100% load consume the same amount of energy. Note that in practice, each port of energy-efficient switches continues to consume the maximum power even with 10% traffic load [30]. Let \(\varepsilon^t\) be the energy saving in stage \(t\). Formally, it is computed as
\[ \varepsilon^t = \frac{\sum_{(u,v) \in E} (b_{uv} - n^t_{uv})}{\sum_{(u,v) \in E} b_{uv}}. \tag{1} \]
In words, the energy saving \(\varepsilon^t\) is the ratio between the total number of off-cables and the total number of cables in the network. For each l-link \((u, v)\), we set \(n^t_{uv} = b_{uv}\) because we assume an l-switch cannot turn off an unused cable. Finally, \(\varepsilon_T\) denotes the average energy saving over \(T\) stages, i.e., \(\varepsilon_T = \frac{1}{T} \sum_{t=1}^{T} \varepsilon^t\). Table 1 summarizes the notations used in this paper.
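As a small worked example of the energy saving ratio and its average over stages, the following Python sketch computes both from per-link cable counts. The dictionaries and function names are illustrative assumptions, not part of the original formulation.

```python
def energy_saving(bundle, on_cables):
    """Ratio of off-cables to all cables for one stage.

    bundle[(u, v)] is the bundle size b_uv and on_cables[(u, v)] is the
    number of on-cables n_uv of link (u, v) in that stage.
    """
    total = sum(bundle.values())
    off = sum(bundle[link] - on_cables[link] for link in bundle)
    return off / total


def average_saving(per_stage_savings):
    """Average energy saving over all stages."""
    return sum(per_stage_savings) / len(per_stage_savings)


# Two links with four cables each; the second is an l-link, so all four
# of its cables stay on, while the first keeps only one cable on.
bundle = {("a", "b"): 4, ("b", "c"): 4}
saving = energy_saving(bundle, {("a", "b"): 1, ("b", "c"): 4})
```

With three of the eight cables off, the saving for this stage is 3/8.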

B. MATHEMATICAL MODEL
We formulate our problem as a Mixed Integer Program (MIP). We consider two routing scenarios: 1) multi-path routing: the traffic of a demand \(d\) is split onto multi-paths and these paths can have common link(s), and 2) link-disjoint path routing: the selected paths of a demand \(d\) must not have any common link(s). First, we outline the MIP for scenario-1, see (2b), before outlining MIP (2b) for scenario-2. In constraint (2h), variable \(x^t_u\) is an indicator of whether l-switch \(u\) is upgraded at stage \(t\). This constraint ensures each switch is upgraded only once. Constraint (2i) ensures the total upgrade cost at each stage is less than or equal to \(B^t = B/T + \bar{B}^{t-1}\), where \(\bar{B}^{t-1}\) is the unused budget from stage \(t-1\), while constraint (2j) enforces that all cables of l-links are powered on. Note that only cables in c-links can be turned off.
Let \(z^t_{a,uv}\) indicate whether link \((u, v)\) at stage \(t\) is on the shortest path from node \(u\) to \(a\). Further, let \(h^t_{u,a}\) denote the path cost from \(u\) to \(a\). Constraints (2k)-(2m) ensure that the traffic volume from l-switch \(u\) to destination \(a\) is split into equal-sized segments, each of which has volume \(o^t_{u,a}\) and is routed via each shortest path from \(u\) to \(a\). Thus, the cost \(h^t_{u,a}\) is minimum and \(\psi^t_{uv}\) is in the range \([1, I]\). Finally, constraint (2m) defines the domain of all decision variables.
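To make the roles of the path cost and the shortest-path indicator concrete, the sketch below computes both for a fixed set of link costs using Dijkstra's algorithm on the reversed graph. It only evaluates a given cost assignment and does not solve the MIP; the function names and the dictionary-based graph are assumptions for illustration.

```python
import heapq


def path_costs_to(dest, cost):
    """Least path cost h[u] from every node u to dest.

    cost maps each directed link (u, v) to a positive integer link cost.
    Dijkstra is run backwards from dest over incoming links.
    """
    incoming = {}
    for (u, v), c in cost.items():
        incoming.setdefault(v, []).append((u, c))
    h = {dest: 0}
    heap = [(0, dest)]
    while heap:
        dist, v = heapq.heappop(heap)
        if dist > h.get(v, float("inf")):
            continue  # stale heap entry
        for u, c in incoming.get(v, []):
            candidate = dist + c
            if candidate < h.get(u, float("inf")):
                h[u] = candidate
                heapq.heappush(heap, (candidate, u))
    return h


def on_shortest_path(dest, cost):
    """Indicator per link: 1 if (u, v) lies on a shortest path from u to dest."""
    h = path_costs_to(dest, cost)
    return {
        (u, v): int(u in h and v in h and h[u] == c + h[v])
        for (u, v), c in cost.items()
    }


# Two equal-cost two-hop paths from u to a, plus a costlier direct link.
cost = {("u", "x"): 1, ("x", "a"): 1, ("u", "y"): 1, ("y", "a"): 1, ("u", "a"): 3}
z = on_shortest_path("a", cost)
```

In this toy graph the direct link (u, a) is not on any shortest path, so an OSPF-ECMP switch at u would split traffic equally over the two paths via x and y.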
[...], while (2e) considers all demands and \(|P^{s_d}_d|\) paths for each demand. Constraints (2f), (2g) and (2j) exist for all links \((u, v) \in E\), while constraint (2h) applies to each \(u \in V\). Finally, constraints (2k)-(2m) are evaluated for every destination \(a \in V\) and each link \((u, v) \in E\) whose starting node \(u \in V\) is an l-switch, i.e., \(x^t_u = 0\).

2) SCENARIO-2: LINK-DISJOINT PATH ROUTING
We show how to revise MIP (2b) to support link-disjoint path routing; we call the revised MIP DP-MIP. More specifically, if traffic demand \(d\) in legacy network \(G^0(V, E)\) is routed via two or more link-disjoint shortest paths, i.e., \(|R^{s_d,0}_d| > 1\), DP-MIP must route the demand via at least two link-disjoint paths in \(P^{s_d}_d\). Otherwise, DP-MIP can route the demand via any one or more paths in \(P^{s_d}_d\), which is a set of paths from \(s_d\) to \(\tau_d\), each of which has delay within \(\delta_{max,d}\). Let [...]

C. PROBLEM COMPLEXITY
Our problem is related to two NP-hard problems: (i) the OSPF cost setting problem [29]: given a network \(G(V, E)\), a maximum link utilization for each link \((u, v) \in E\), and a set of traffic demands, assign an integer cost to each link to optimize a given network performance metric, e.g., network delay; and (ii) the 0-1 Multiple Knapsack Problem (MKP) [31]: given m items, each of which has a profit and weight, and \(T\) knapsacks, each of which has a maximum weight capacity, select \(T\) disjoint subsets of items that maximize the total profit, subject to each subset having a total weight no more than its knapsack's capacity. With respect to problem (i), the network performance of interest is the minimum number of on-cables that have sufficient capacity to carry traffic demands. The cost assigned to each link is used to calculate the shortest path from each l-switch to any destination \(\tau_d\) in \(D^t\). These shortest paths define the total traffic volume on each link, which then determines the number of on-cables. Thus, our problem is at least as hard as the problem in (i).
Our problem reduces to MKP when (a) there is no depreciation in switch upgrade cost, and (b) the number of on-cables per link \((u, v) \in E\) is known, i.e., the traffic splits and shortest paths used to carry traffic flows of demand \(d \in [1, |D^t|]\) are fixed at each stage \(t\). Note that the profit and weight of each item in MKP are respectively equivalent to the number of off-cables for each switch \(v \in V\) and the switch upgrade cost \(p^t_v\). Further, the maximum budget \(B^t\) at each stage is the same as a knapsack's capacity in MKP. Finally, our problem aims to upgrade \(T\) disjoint subsets of l-switches that minimize the total number of on-cables over the \(T\) stages, i.e., it maximizes the total number of off-cables instead of the total profit in the MKP. Thus, our problem is also at least as hard as MKP. The following section describes our heuristic solution for the optimization problem.

IV. SOLUTION
This section outlines our greedy heuristic solution called Multi-Paths Green Multi-Stage Upgrade (M-GMSU). Section IV-A first describes M-GMSU, where it routes each traffic demand via multi-paths that may have common link(s). Then, Section IV-B presents our approach called DP-GMSU, which uses M-GMSU but adopts link-disjoint path routing. Section IV-C gives an example. Section IV-D analyzes the correctness of M-GMSU and DP-GMSU as well as their time complexity.

A. DETAILS OF M-GMSU
One can run M-GMSU offline in a centralized server that may also act as the SDN controller. As per Algorithm 1, it consists of three phases: (1) initialize traffic routing, (2) upgrade switches, and (3) reroute traffic and set link cost. Phase 1 is used only in stage \(t = 1\), while Phase 2 runs at the beginning of each stage (in years). On the other hand, rerouting in Phase 3, in addition to being computed at the beginning of each stage, can be used whenever a significant change occurs in network traffic within the stage, e.g., every week. At each upgrade stage \(t\), M-GMSU produces: (i) a set of upgraded switches \(V^t\), (ii) a set of paths \(R^{s_d,t}_d\) to route each demand \(d\), (iii) the number of on-cables \(n^t_{uv}\) on each link \((u, v)\), and (iv) the energy saving \(\varepsilon^t\).

1) PHASE 1: INITIAL ROUTING
Given a legacy network \(G^0(V, E)\), Phase 1 initially routes each traffic demand according to OSPF-ECMP. For each [...]
Algorithm 1: M-GMSU [listing truncated; recoverable steps: route a flow of size \(\omega^T_d\) divided by the number of paths, Phase 2: upgrade switches, and compute \(\varepsilon^t\) using (1)]
The term \((b_{uv} - n^T_{uv})\) in (3) denotes the number of off-cables in each link \((u, v)\) at the last stage \(T\). Equation (3) uses \((b_{uv} - n^T_{uv})\) to compute \(w_u\) because we observe that the largest flow for each demand occurs at stage \(T\). Recall that the size of each traffic demand \(d\) grows at a rate of \(\mu_d \geq 0\) per stage. Thus, if a link \((u, v)\) that has \(n^T_{uv}\) on-cables can carry the traffic demands at stage \(T\), its on-cables can also carry the traffic demands at any stage \(t < T\). This implies that we have \(n^t_{uv} \leq n^{t+1}_{uv}\) and \((b_{uv} - n^t_{uv}) \geq (b_{uv} - n^{t+1}_{uv})\) for each link \((u, v)\). In this case, the \((b_{uv} - n^t_{uv})\) unused cables at stage \(t\) include the \((b_{uv} - n^T_{uv})\) unused cables, which can remain off at the next stage \(t + 1\). Thus, upgrading a set of l-switches with the highest total number of unused cables at the earliest possible stage can maximize the overall energy saving. Line 12 concludes Phase 1 by initializing \(X\) with all l-switches in \(V\). In summary, Phase 1 produces (i) the set of alternative paths \(P^{s_d}_d\) and initial shortest paths \(R^{s_d,0}_d\) for each demand \(d\), (ii) the total on-cables \(n^T_{uv}\) of each link \((u, v) \in E\) at stage \(T\), and (iii) the weight \(w_v\) for each node \(v \in V\). This set of information will be used in Phase 2 and Phase 3.
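The equal split used in the initial routing can be sketched as follows: each demand's volume is divided evenly across its shortest paths and the shares are accumulated per link. This is a simplified illustration that assumes the shortest paths of each demand are already known; the function name `ecmp_link_loads` and the input layout are hypothetical.

```python
def ecmp_link_loads(demands):
    """Accumulate per-link load when each demand is split equally over its paths.

    demands is a list of (volume, paths) pairs, where each path is a list
    of nodes from the demand's source to its destination.
    """
    loads = {}
    for volume, paths in demands:
        share = volume / len(paths)  # equal share per shortest path
        for path in paths:
            for u, v in zip(path, path[1:]):
                loads[(u, v)] = loads.get((u, v), 0.0) + share
    return loads


# One demand split over two paths, and a second demand on a single path.
loads = ecmp_link_loads([
    (8.0, [["s", "x", "t"], ["s", "y", "t"]]),
    (2.0, [["s", "x", "t"]]),
])
```

The resulting link loads are what determine how many cables of each bundle must stay on.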

2) PHASE 2: SWITCH UPGRADES
For each stage \(t\), Phase 2 calls Selection(), shown as Algorithm 2, in Line 14. It generates a set \(V^t\) that contains upgradeable l-switches, which is defined as follows.
Definition 1: A set \(V^t\) is upgradeable if (i) each switch \(v \in V^t\) has a non-zero weight \(w_v > 0\), and (ii) the total cost to upgrade all switches in \(V^t\) is at most \(B^t\).
Phase 2 uses the ratio \(w_v / p^t_v\) to upgrade a switch with the maximum off-cables per cost unit. It starts from the largest ratio \(w_v / p^t_v\) in order to maximize the number of off-cables, and hence, energy saving, over \(T\) stages.
Algorithm 2: Selection() [listing truncated]
The details of Selection() are as follows. Line 1 considers only each candidate switch \(v \in X\) that has (i) an upgrade cost \(p^t_v\) within budget \(B^t\), i.e., \(p^t_v \leq B^t\), and (ii) weight \(w_v > 0\), i.e., switch \(v\) has cables to switch off. Among all nodes that satisfy the two criteria, Line 2 selects a node, say \(v\), that has the largest ratio \(w_v / p^t_v\). Line 3 removes node \(v\) from \(X\). Line 4 includes \(v\) in the set of upgradeable nodes \(V^t\) and Line 5 computes the remaining budget \(B^t\). For each l-switch neighbor, denoted as \(u\), of the upgraded l-switch \(v\), Line 7 reduces its weight \(w_u\) by the total cables to be switched off by node \(v\). Lines 8-10 place each c-link \((u, v)\) into the set \(L\) if some traffic demand passes through the link, i.e., \(n^T_{uv} > 0\). Lines 1-12 are repeated until the remaining budget \(B^t\) is not sufficient to upgrade any remaining l-switch in \(X\), or each switch \(v \in X\) has no unused cable to turn off, i.e., \(w_v = 0\). Finally, Line 13 records the remaining budget \(B^t\) as the unused budget \(\bar{B}^t\). Line 15 of M-GMSU then adds the unused budget \(\bar{B}^t\) to the budget for stage \(t + 1\). In summary, function Selection() returns a set \(V^t \subset V\) of upgraded l-switches, the remaining l-switches \(X\), the set \(L\) that stores each c-link \((u, v)\) with non-zero traffic flow, and the unused budget \(\bar{B}^t\). The upgraded switches \(V^t\) are used in Phase 3 to increase the number of off-cables on every c-link, when possible.
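The greedy selection described above can be sketched in Python as follows. This is a simplified reading rather than the authors' Selection(): it omits the bookkeeping of the link set L, and it assumes that a neighbor's weight is reduced by the off-cables on the links it shares with the newly upgraded switch, which is one plausible interpretation of the weight update. All names are illustrative.

```python
def select_upgrades(candidates, weight, price, budget, off_cables):
    """Greedily pick l-switches with the largest weight-to-price ratio.

    candidates: set of l-switches still eligible for upgrade
    weight:     dict of switch -> number of cables it could switch off
    price:      dict of switch -> upgrade cost in the current stage
    budget:     budget available in the current stage
    off_cables: dict of directed link (u, v) -> off-cables on that link
    Returns (upgraded switches in pick order, remaining switches, unused budget).
    """
    remaining = set(candidates)
    w = dict(weight)
    upgraded = []
    while True:
        feasible = [v for v in sorted(remaining) if price[v] <= budget and w[v] > 0]
        if not feasible:
            break  # budget exhausted or nothing left to switch off
        best = max(feasible, key=lambda v: w[v] / price[v])
        remaining.remove(best)
        upgraded.append(best)
        budget -= price[best]
        # Assumed update: off-cables on links shared with the upgraded switch
        # are now credited to it, so neighbors lose that weight.
        for u in remaining:
            shared = off_cables.get((u, best), 0) + off_cables.get((best, u), 0)
            w[u] = max(0, w[u] - shared)
    return upgraded, remaining, budget


picked, left, unused = select_upgrades(
    candidates={"a", "b", "c"},
    weight={"a": 6, "b": 4, "c": 1},
    price={"a": 100, "b": 50, "c": 100},
    budget=150,
    off_cables={("a", "b"): 2, ("b", "a"): 2},
)
```

In this toy instance switch b is picked first because it has the best ratio, which lowers the weight of its neighbor a before a is picked with the rest of the budget.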

3) PHASE 3: TRAFFIC REROUTING AND LINK COST SETTING
Phase 3 uses function MGTE(), or Algorithm 3, in Line 16. The function adapts the greedy approach proposed in [4] and [6]. Specifically, MGTE() switches off as many cables of c-links as possible and reroutes the traffic flows carried by these cables onto other paths. It starts from the cable that has the smallest used capacity. The rationale for this greedy approach is that such a cable has the smallest amount of traffic to be rerouted, and thus is more likely to be switched off. However, switching off a cable is feasible only if each traffic flow of demand \(d\) that passes through the cable can be rerouted via a set \(R^{s_d,t}_d\) of routable paths defined as follows. [...]
Algorithm 3: MGTE() [listing truncated; recoverable Lines 13-20]
13: if (\(r_{uv} > 0\)) then success = false
14: else
15: {\(\psi^t\), success} = LinkCost(\(R^t\), \(X\))
16: end if
17: if (success == false) then
18: Revert each changed set \(R^{s_d,t}_d\) to its previous paths
19: \(n^T_{uv} = n^T_{uv} + 1\)
20: [...]
To this end, LinkCost() is formally defined as: [...] If the total excess cost, i.e., (4a), is not zero, LinkCost() sets success to false and returns \(\psi^t\) without updating link costs. Further, Lines 18-20 revert the routable paths \(R^t\) to their previous paths, set the cable(s) in link \((u, v)\) back to on, and remove link \((u, v)\) from the set \(L\). If LinkCost() successfully updates set \(\psi^t\) with new link costs, it sets success to true. If link \((u, v)\) has no on-cable, i.e., \(n^T_{uv} = 0\), Line 22 (Line 23) removes the link from sets L (L). This allows all cables in the c-link \((u, v)\) to remain off in subsequent stages. Line 26 of MGTE() then updates the weight \(w_x\) of each l-switch \(x \in X\) because the new routing produced by Lines 3-25 is able to increase the number of off-cables. Finally, Line 17 of M-GMSU computes \(\varepsilon^t\). Overall, Phase 3 produces a set \(R^t\) that contains all routable paths for all demands in \(D^t\) at each stage \(t \in [1, T]\) and a set of link costs \(\psi^t\).

B. DETAILS OF DP-GMSU
This section presents our approach to enable link-disjoint path routing in M-GMSU; we call this approach DP-GMSU. In DP-GMSU, we replace the function Reroute() in Line 8 of function MGTE() with the function RerouteDP(). As in Reroute(), the function RerouteDP() aims to reroute traffic carried by path \(R^{s_d,t}_{d,i} \in Q_{uv}\) via \(m \geq 1\) alternative paths in set \(P^{s_d}_d\). For each demand \(d\), the function considers two possible cases: 1) the demand is initially routed via non-link-disjoint paths, or 2) the demand is initially routed via link-disjoint paths. For case 1), function RerouteDP() uses the function Reroute() to reroute demand \(d\) using paths that are not necessarily link-disjoint. For case 2), the function RerouteDP() aims to reroute demand \(d\) via at least two link-disjoint paths. The function carries out the following three steps:

D. ALGORITHM ANALYSIS
The following two propositions analyze M-GMSU in terms of the algorithm's compliance with all constraints in MIP (2b) and its time complexity, respectively. [...] |D||E|+α)). Since in general we have \(|E| \leq |V|^2\), \(|D| \leq |V|^2\), and \(T = 5\) and \(K \leq 20\) are constants, the time complexity of M-GMSU is \(O(|V|^2 |E|^2 + \alpha |E|)\). The time complexity of DP-GMSU is the same as that of M-GMSU because their only difference is the use of Reroute() and RerouteDP(), respectively, which have the same time complexity.

V. EVALUATION
We implemented M-GMSU in C++ and used Gurobi [33] to solve our MIP. Our experiments are conducted on a 64-bit Linux machine with an Intel Core i7 CPU @ 3.60 GHz and 16 GB of memory. We use five actual network topologies, which are also used in [23]; see Table 2. For Abilene and GÉANT, we use their actual traffic matrices. For DFN, Deltacom and TATA, we use the gravity model [34] to generate traffic flows as there are no public traffic matrices. We set \(\gamma = 2.5\) Gbps, \(b_{uv} = 4\) cables, and \(U_{max} = 80\%\). As per [22], we set \(\rho = 40\%\) and \(\mu = 22\%\). We assign an initial upgrade cost \(p^0_v\) of $50K, $100K or $150K by drawing a random number from \(N(2, 0.5)\) for each node \(v\). We then round it to the nearest integer, where a value of one maps to $50K, two to $100K, and three to $150K. Each experiment uses M-GMSU and MIP with delay multiplier \(\sigma = 1.1\).
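The cost assignment just described can be sketched as follows. Two details are assumptions of this sketch rather than statements from the text: the second parameter of N(2, 0.5) is treated as a standard deviation, and draws that round outside the range one to three are clamped to the nearest valid level. The names `COST_BY_LEVEL` and `sample_upgrade_cost` are illustrative.

```python
import random

# Mapping from the rounded draw to an initial upgrade cost in dollars.
COST_BY_LEVEL = {1: 50_000, 2: 100_000, 3: 150_000}


def sample_upgrade_cost(rng):
    """Draw one initial upgrade cost for a switch.

    Assumes 0.5 is the standard deviation and clamps the rounded value
    to the levels 1..3 (both are assumptions of this sketch).
    """
    level = round(rng.gauss(2, 0.5))
    level = min(3, max(1, level))
    return COST_BY_LEVEL[level]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
costs = [sample_upgrade_cost(rng) for _ in range(1000)]
```

Under these assumptions most switches receive the middle cost of $100K, with the remainder spread over $50K and $150K.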
This section is organized as follows. First, Section V-A evaluates the scalability of MIP, DP-MIP, M-GMSU and DP-GMSU in terms of their running time in CPU seconds. Then, Sections V-B and V-C analyze the effect of increasing budgets and stages on energy savings, respectively. Next, Section V-D and Section V-E study the effect of using single path routing and link-disjoint multi-path routing, respectively, on energy saving. Further, Section V-F reports the energy saving performance of MIP and M-GMSU against prior techniques in [22] and [19]. Finally, Section V-G provides additional findings.

A. RUNNING TIME
We set the budget to \(B\) = $1.2M and consider \(T = 3\) stages to compare the run-time performance (in CPU seconds) of MIP, DP-MIP, M-GMSU and DP-GMSU. From Table 2, we see that the run time of all solutions increases with network size and traffic demands. The table shows that the run time of M-GMSU is far less than that of MIP, e.g., 1.57 versus 71942.81 seconds for GÉANT. Similarly, DP-GMSU runs significantly faster than DP-MIP, e.g., 1.24 versus 95599.2 seconds for GÉANT. Further, MIP and DP-MIP failed to produce results for DFN, Deltacom and TATA because the optimizer ran out of memory. Thus, for the remaining simulations, we compare M-GMSU against MIP and DP-GMSU against DP-MIP using only Abilene and GÉANT.
[...] Deltacom and TATA have a larger number of l-switches to upgrade than the other three networks. This means that an allocated budget can only upgrade a significantly smaller percentage of l-switches. As the energy saving \(\varepsilon_T\) is the result of turning off the unused cables in c-links, more s-switches can potentially lead to more switched-off cables.
Note that in the last upgrade stage T , both MIP and M-GMSU route the majority of traffic demands via single paths. For example, when the budget B is $1.2M and T = 3 stages, MIP routes only 1.26% and 6.44% of traffic demands via multi-paths for Abilene and GÉANT, respectively. Similarly, M-GMSU routes 37.77%, 17.24% and 3.19% of traffic demands via multi-paths for GÉANT, DFN, and Deltacom, respectively. For Abilene, M-GMSU routes each of its traffic demands via a single path. Similarly, M-GMSU routes only two demands of TATA via multi-paths. Note that there are 18.18%, 76.61%, 61.83%, 67.35% and 73.9% of traffic demands with multi-paths within delay tolerance for Abilene, GÉANT, DFN, Deltacom and TATA, respectively.

C. EFFECT OF INCREASING STAGES
Next, we investigate how the number of stages, namely T ∈ {1, 2, 3, 4, 5}, impacts energy saving ε T . The budget B is $1.2M. As shown in Figure 4, the energy saving ε T for Abilene, GÉANT, and DFN decreases as T increases. For example, the energy saving ε T for M-GMSU when it runs over Abilene (GÉANT) decreases from 74.56% to 66.67% (75% to 61.15%) when T increases from one to five. Notice that for Abilene and GÉANT, M-GMSU produces an ε T value that is on average only 0.94% and 4%, respectively, off from the optimal energy saving, which is produced by MIP. In contrast, energy saving ε T for Deltacom (TATA) increases from 34.47% to 37.61% (25.4% to 29.52%) when T increases from one to five. For these two larger networks, there are more switches to upgrade in later stages, which results in larger ε T values. For smaller networks such as Abilene, however, a budget of B = $1.2M can be used to upgrade a larger percentage of switches in earlier stages. As a result, fewer switches remain to be upgraded in later stages, and thus fewer unused cables can be turned off. In addition, as the later stages have a higher traffic volume, it is unlikely that these remaining switches have idle or off cables. In other words, upgrading these switches does not significantly increase ε T .

D. MULTI-PATH VERSUS SINGLE PATH ROUTING
In this section, we aim to compare the energy saving ε T calculated by MIP and M-GMSU against that computed by ILP and GMSU [23], respectively. Briefly, ILP and GMSU use a single path that satisfies a given delay tolerance to route each traffic demand. ILP is the optimal approach that provides the optimal energy saving ε T , while GMSU is the heuristic approach that produces a sub-optimal ε T value. Further, similar to MIP and M-GMSU, ILP and GMSU perform rerouting at each upgrade stage. Here, we consider budget B = {$200K, $400K, $600K, $800K, $1M, $1.2M} and T = 3 upgrade stages.
As shown in Figure 5, the energy saving of MIP is very close to that of ILP for each budget. Similarly, Figure 6 shows that M-GMSU and GMSU result in similar ε T values. For Abilene, MIP and ILP produce the same saving. On average, for GÉANT, MIP produces a 0.91% lower ε T value as compared to ILP. Similarly, M-GMSU produces 0.29% and 1.62% less energy saving than GMSU for Abilene and GÉANT, respectively. Further, the ε T value of M-GMSU is 1.77%, 0.72%, and 0.06% lower than that of GMSU for DFN, Deltacom and TATA, respectively. GMSU is more likely to have successful traffic rerouting because M-GMSU requires each l-switch x to distribute traffic over k ≥ 1 shortest paths from x to the flow's destination, whereas traffic rerouting in GMSU is subject only to the path delay tolerance and MLU threshold. Note that ILP and GMSU are computationally faster than MIP and M-GMSU, respectively. For example, ILP runs in 0.06 and 30.31 seconds for Abilene and GÉANT, respectively, while GMSU requires less than 2 seconds for each network. This is because neither ILP nor GMSU includes link-cost setting.
Overall, as shown in Figure 3 and Figure 7, DP-MIP and DP-GMSU produce energy savings that are close to those of MIP and M-GMSU, respectively. For Abilene, all solutions, i.e., DP-MIP, DP-GMSU, MIP and M-GMSU, produce the same ε T . For GÉANT, the average saving ε T of DP-MIP is only 0.32% less than that of MIP. Similarly, the saving of DP-GMSU is only 0.63% less than that of M-GMSU. For DFN and Deltacom, the energy saving obtained by DP-GMSU is only 0.36% and 0.06% off, respectively, from the saving of M-GMSU. Moreover, DP-GMSU and M-GMSU produce the same saving for TATA. This is because DP-MIP and DP-GMSU route the majority of traffic demands via single paths. More specifically, for Abilene with budget B = $1.2M, DP-MIP cannot route any traffic demand via link-disjoint paths. It routes only 2.27% of demands via non-link-disjoint multi-paths.
For GÉANT and budget B = $1.2M, DP-MIP routes 10.3% and 7.58% of traffic demands via link-disjoint and non-link-disjoint paths, respectively. Similarly, DP-GMSU routes all demands of Abilene via single path routing, while for GÉANT, it uses link-disjoint and non-link-disjoint paths to route only 10.3% and 29.4% of traffic demands, respectively. Note that the percentage of traffic demands routed over link-disjoint paths that also satisfy a given delay tolerance for Abilene, GÉANT, DFN, Deltacom, and TATA is 6.06%, 55.15%, 9.83%, 2.96%, and 3.86%, respectively.
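The distinction between link-disjoint and non-link-disjoint multi-paths can be illustrated with a minimal sketch. It assumes that a path is given as a list of node names and that links are undirected; the node names and the helper functions `links_of` and `are_link_disjoint` are hypothetical and are not part of DP-MIP or DP-GMSU.

```python
from itertools import combinations


def links_of(path):
    """Return the set of undirected links traversed by a node path."""
    return {frozenset(hop) for hop in zip(path, path[1:])}


def are_link_disjoint(paths):
    """True if no link is shared by any two of the given paths."""
    return all(
        links_of(p).isdisjoint(links_of(q)) for p, q in combinations(paths, 2)
    )


# Hypothetical paths between nodes "a" and "d", for illustration only.
disjoint_paths = [["a", "b", "d"], ["a", "c", "d"]]
shared_paths = [["a", "b", "d"], ["a", "b", "c", "d"]]  # both use link a-b

print(are_link_disjoint(disjoint_paths))
print(are_link_disjoint(shared_paths))
```

In this sketch, the second pair of paths would qualify only for the non-link-disjoint scenario because both paths traverse link a-b.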

F. M-GMSU VERSUS TWO EXISTING SOLUTIONS
In this section, we compare the performance of M-GMSU against two existing solutions, i.e., Local Search (LS) [22], and Energy-Efficient Genetic Algorithm for hybrid SDNs (EEGAH-MNL) [19], in terms of traffic controllability and energy saving. For brevity, in this paper we refer to EEGAH-MNL as GA. Briefly, LS aims to upgrade l-switches over multiple stages subject to a given total budget B. However, its goal is to maximize the total traffic controllability over T ≥ 1 stages, denoted by TC. Moreover, LS is allowed to use its entire budget in one stage. On the other hand, GA aims to minimize the power consumption of links that are adjacent to an s-switch (c-links) and s-switches in a single upgrade stage, i.e., T = 1. Both LS and GA consider single path routing. Note that GA generates each shortest path using only the powered-on links, thus producing paths with long delays. Both LS and GA consider non-bundled links, where each link has only one cable.
We compare the performance of M-GMSU, LS, and GA using the following scenarios: (i) single path routing with 10% delay tolerance, (ii) initial 3) The fitness function of GA is changed to the sum of the total number of powered-on links, assuming that the power rate of all links is the same. Note that LS fails to produce results for DFN, Deltacom and TATA after running for three days. Thus, we use only Abilene and GÉANT to compare the TC and ε T performance of M-GMSU, LS and GA.

1) PERFORMANCE ON TC
The TC values for non-bundled and bundled link models are exactly the same. Thus, the TC results in Figure 8 apply to both link models. Figure 8 shows that LS consistently produces, on average, higher TC than M-GMSU and GA for Abilene and GÉANT. The results are expected as the goal of LS is to maximize TC. As an example, for budget B = $200K, LS produces 38.35% and 49.75% higher TC than M-GMSU, and 43.1% and 48% higher TC than GA for Abilene and GÉANT, respectively. However, as the budget increases to B = $1.2M, the difference between TC of LS and M-GMSU (LS and GA) reduces to only 5.95% (9.01%) and 4.69% (5.7%) for the two respective networks. This is because the budget at each stage becomes larger with increasing budget B. Thus, M-GMSU and GA upgrade most of the l-switches at earlier stages and hence produce TC values closer to that of LS. As shown in Figure 8, M-GMSU and GA produce comparable TC values. At maximum, M-GMSU results in 3.17% and 9.49% lower TC than GA for Abilene and GÉANT, respectively. This is because GA upgrades l-switches with the highest total number of l-links or node degree. On the other hand, M-GMSU selects l-switches that do not necessarily have the highest node degree. Note that switches with the highest node degree are likely to be traversed by more end-to-end paths [35]. To further analyze TC performance, we show the value of TC at each stage in Figure 9. Note that M-GMSU and GA produce a similar trend, and thus, the figure only compares the results of M-GMSU and LS. Figure 9 shows the TC produced by M-GMSU and LS at stages 1 to 3 using a budget of B = $200K. M-GMSU consistently produces higher TC at the last stage, whilst TC of LS remains the same over the three stages. For Abilene, the TC produced by M-GMSU increases drastically from 18.58% to 94.97%, while LS yields the same TC of 77.86% from stage t = 1 to t = 3.
Similarly, for GÉANT, the TC achieved by M-GMSU escalates from 11.07% to 74.17%, whilst LS produces the same TC of 69.04% for each stage t. This is because LS spends its entire budget upgrading l-switches in the first stage. In contrast, M-GMSU has a maximum budget to spend at each stage. Further, M-GMSU aims to maximize ε T , while LS aims to maximize TC. Thus, on average, LS results in a higher TC.

2) ENERGY SAVING PERFORMANCE
This section first evaluates the energy saving ε T produced by M-GMSU, LS and GA for non-bundled and bundled link models. Then, it analyzes the saving ε t of M-GMSU and LS at each stage t ∈ {1, 2, 3}. Lastly, it shows the advantage of saving more energy at later stages on energy cost. Figure 10 shows the energy saving ε T for the non-bundled link model. We see that LS uses all links to route a set of end-to-end traffic demands via shortest paths, and hence achieves no energy saving. In contrast, M-GMSU and GA can save energy because both solutions turn off as many links as possible and route traffic demands using the remaining active links. For Abilene, both solutions produce the same energy saving. As an example, for budget B = $200K, M-GMSU and GA produce the same ε T = 4.44%, which increases to ε T = 6.67% for the larger budget B = $1.2M. On the other hand, for GÉANT and budget B = $200K, GA results in 12.5% less ε T than M-GMSU. As the budget increases to B = $1.2M, M-GMSU significantly outperforms GA with 26.32% higher saving. Note that the energy savings of M-GMSU also exceed those of GA for the other networks, i.e., DFN, Deltacom and TATA.

b: BUNDLED LINK MODEL
We evaluate the energy saving performance of LS and GA for the bundled-links model. Figure 11 shows that LS can save energy. It produces a lower ε T value than M-GMSU and GA with a budget up to B = $200K for Abilene and GÉANT. For budget B = $200K, LS gives ε T = 21.05% and ε T = 21.96% for Abilene and GÉANT, respectively. On the other hand, M-GMSU respectively produces higher ε T of 29.24% and 22.3% for Abilene and GÉANT. Similarly for Abilene, GA produces the same ε T = 29.24%, which is higher than that of LS. However, GA produces ε T of 21.28%, which is slightly less than that of LS for GÉANT. As the budget increases, however, LS produces higher energy saving than M-GMSU and GA. For Abilene and GÉANT with budget B = $800K, LS obtains respectively 16.67% (16.67%) and 14% (16%) higher ε T value than M-GMSU (GA). We use Figure 12 to explain the reasons for the higher ε T values that are produced by LS when the budget increases. We consider only M-GMSU in Figure 12 to analyze the energy saving performance against LS at each stage. This is because the energy savings produced by GA for all budgets, as shown in Figure 11, are the same for Abilene and only 0.05% off from the savings produced by M-GMSU for GÉANT.
FIGURE 11. Energy saving ε T of M-GMSU, LS [22], and GA [19] for bundled link model.
Figure 12 shows a comparison of the ε T produced by M-GMSU and LS at each stage t ∈ {1, 2, 3} for Abilene and GÉANT using a budget of B = $200K and bundled link model. As shown in Figure 12, ε T increases at each stage for M-GMSU, while ε T of LS decreases slightly at later stages, especially for GÉANT. This is because LS uses its entire budget at stage t = 1 and thus, the number of s-switches upgraded by LS remains the same from stage 1 to 3. Recall that the traffic size increases at a rate of µ = 22% per stage, and thus some cables need to be switched on, which decreases the energy saving of LS over T = 3 stages.
Moreover, budget B = $400K and B = $200K are not sufficiently large for LS to upgrade all l-switches of Abilene and GÉANT in only one stage, respectively. On the other hand, M-GMSU constrains the maximum budget that can be spent at each stage. Thus, M-GMSU is able to upgrade more switches at the later stages, which increases energy savings. It is important to note that, in general, M-GMSU would upgrade a larger number of switches than LS since the upgrade cost decreases over time/stages.
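The effect of traffic growth on the number of powered-on cables can be illustrated with a minimal sketch. It assumes that the load on a bundled link grows by µ per stage starting from stage 1, that all cables in a bundle have the same capacity, and that a link needs enough on-cables to keep its utilization at or below the MLU threshold. The load, capacity and threshold values are hypothetical; only µ = 22% comes from the text.

```python
import math


def cables_needed(load, cable_capacity, mlu_threshold):
    """Smallest number of on-cables keeping utilization within the threshold."""
    return math.ceil(load / (cable_capacity * mlu_threshold))


def on_cables_per_stage(initial_load, growth, stages, cable_capacity, mlu_threshold):
    """On-cables required at each stage when the load grows by `growth` per stage."""
    return [
        cables_needed(initial_load * (1 + growth) ** t, cable_capacity, mlu_threshold)
        for t in range(stages)
    ]


# Hypothetical link: 3.0 Gb/s initial load, 1.0 Gb/s per cable, 80% MLU threshold.
required = on_cables_per_stage(3.0, 0.22, 3, 1.0, 0.8)
print(required)  # [4, 5, 6]
```

Under these assumed values, the link needs one more on-cable at each stage, which is consistent with the slight decrease in the saving of LS when no further switches are upgraded.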

d: BENEFIT OF MORE ENERGY SAVING AT LATER STAGE
The following case study shows the benefit of saving more energy in later stages. Note that, in general, electricity cost increases in later years. For example, in the United States, reference [36] projects an annual increase in energy prices of 3.29% from 2020 to 2025. Assume Abilene and GÉANT carry out an upgrade every two years using a total budget of B = $200K for T = 3 upgrade stages. For Abilene, M-GMSU and LS produce {15.79%, 26.32%, 64.91%} and {21.05%, 21.05%, 21.05%} of energy saving, respectively, at each stage. Assuming an initial energy cost of $1 per on-cable, M-GMSU will be able to save $(0.1579 + 0.2632 × 1.0329^2 + 0.6491 × 1.0329^4) = $1.1775, while LS saves only $0.6747. For GÉANT, M-GMSU saves $0.8538, which is slightly higher than LS, which saves only $0.7033.
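The arithmetic of this case study can be reproduced with a minimal sketch. It assumes that the energy price compounds annually at 3.29%, that stages are two years apart, and that the first stage is priced at the initial cost of $1 per on-cable. The function name `escalated_saving` is hypothetical.

```python
def escalated_saving(savings, annual_increase, years_between_stages, base_cost=1.0):
    """Sum per-stage savings, each priced at the energy cost of its own year."""
    return sum(
        base_cost * s * (1 + annual_increase) ** (years_between_stages * i)
        for i, s in enumerate(savings)
    )


# Per-stage energy savings for Abilene with B = $200K and T = 3 (from the text).
m_gmsu = escalated_saving([0.1579, 0.2632, 0.6491], 0.0329, 2)
ls = escalated_saving([0.2105, 0.2105, 0.2105], 0.0329, 2)

print(round(m_gmsu, 4))
print(round(ls, 4))
```

Because the price multiplier grows with each stage, a solution that defers most of its saving to the last stage benefits more from the price increase than one whose saving is constant across stages.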

G. ADDITIONAL RESULTS
This section reports additional findings in terms of increased path delays and link utilization when using our approach. Further, it analyzes the benefit of using l-switches that can turn off unused cables, e.g., those that support the IEEE 802.3az standard. We refer to such a green l-switch as a gl-switch. In addition, it discusses the energy saving performance of our approach against existing techniques in non-SDNs and pure SDNs. Sections V-G1, V-G2, and V-G3 use the total budget of B = $1.2M and the total number of upgrade stages is T = 3. On the other hand, Sections V-G4 and V-G5 use a different total budget B over the same total number of upgrade stages, i.e., T = 3.
Figure 13 shows a small increase in path delays for all networks when B = $1.2M is used. More specifically, path delays produced by MIP (M-GMSU) for Abilene and GÉANT, on average, are increased by 0.43% (1%) and 2.84% (0.01%), respectively. Further, only 4.3% (12.12%) and 28.39% (0.25%) of the paths in the respective networks have 10% longer delays. Note that all simulations allow a 10% delay tolerance and each path originally uses the shortest path. Thus, no path can have a lower delay or more than a 10% increase in delay. For DFN, Deltacom and TATA, there is no increase in path delay for a budget of B = $1.2M. This is because the said budget can only upgrade 37.93%, 25.66%, and 19.31% of switches in the respective networks. However, when we increase the budget such that M-GMSU can upgrade more switches and all links are c-links, M-GMSU is able to route some demands via longer paths to maximize energy saving; see the results in Figure 13 for B > $1.2M. For example, there are respectively 22.39% and 16.51% of traffic demands that use longer paths for Deltacom and TATA. In this case, the average path delay of these networks is increased by 2.24% and 1.65%, respectively.
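The delay constraint described above can be illustrated with a minimal sketch. It assumes that each rerouted path is compared against the delay of its original shortest path and is admissible when the increase does not exceed the 10% tolerance. The delay values and function names are hypothetical.

```python
def delay_increase_pct(path_delay, shortest_delay):
    """Percentage increase of a path delay over its shortest-path delay."""
    return 100.0 * (path_delay - shortest_delay) / shortest_delay


def within_delay_tolerance(path_delay, shortest_delay, tolerance=0.10):
    """True if the path delay is at most (1 + tolerance) times the shortest delay."""
    return path_delay <= (1 + tolerance) * shortest_delay


# Hypothetical delays in milliseconds; the shortest path takes 20 ms.
print(within_delay_tolerance(21.0, 20.0))  # 5% longer, admissible
print(within_delay_tolerance(23.0, 20.0))  # 15% longer, rejected
print(delay_increase_pct(21.0, 20.0))
```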

2) LINK UTILIZATION
To see the effect of our MIP and M-GMSU on link utilization, we first measure the initial utilization of all links of each network, i.e., before upgrading the network. Recall that the initial routing of each demand follows the OSPF-ECMP protocol. As shown in Figure 14, we find that the maximum link utilization in the five networks ranges between 18% and 36% when all cables are turned on. More specifically, for Abilene and GÉANT, the maximum link utilization is 18.16% and 35.05%, respectively. Then, we measure link utilization of each network after upgrading the network using MIP or M-GMSU. Using MIP, the maximum link utilization in Abilene and GÉANT decreases to 17.81% and 23.18%, respectively. On the other hand, M-GMSU does not change the maximum link utilization for any network. This is because M-GMSU limits the number of on-cables on each link at each stage according to the number of on-cables used at the last upgrade stage T when performing traffic rerouting. Moreover, M-GMSU reroutes any traffic demand at each stage by using its largest volume at stage T . Thus, the maximum link utilization is less likely to increase significantly.
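The measurement of maximum link utilization over bundled links can be illustrated with a minimal sketch. It assumes that the utilization of a link is its load divided by the total capacity of its powered-on cables, and that all cables in a bundle have equal capacity. The link names, loads and capacities are hypothetical.

```python
def link_utilization(load, on_cables, cable_capacity):
    """Utilization of a bundled link given its load and powered-on cables."""
    return load / (on_cables * cable_capacity)


def max_link_utilization(links):
    """Maximum utilization over a dict mapping link -> (load, on_cables, capacity)."""
    return max(link_utilization(*values) for values in links.values())


# Hypothetical bundled links: (load in Gb/s, on-cables, capacity per cable in Gb/s).
all_on = {("a", "b"): (2.0, 4, 2.5), ("b", "c"): (4.5, 5, 2.5)}
after_power_off = {("a", "b"): (2.0, 2, 2.5), ("b", "c"): (4.5, 5, 2.5)}

print(max_link_utilization(all_on))
print(max_link_utilization(after_power_off))
```

In this sketch, powering off two cables of link a-b raises its utilization from 20% to 40%, which illustrates why the number of on-cables must be chosen with the MLU threshold in mind.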

3) ENERGY SAVINGS IN NETWORKS WITH GREEN LEGACY-SWITCHES
This section examines the effect of using gl-switches, i.e., l-switches that support energy efficient technology, e.g., IEEE 802.3az, to turn off unused cables in each l-link. Recall that the reported energy savings in all previous sections consider non gl-switches, and thus unused cables in each l-link are still on. For this examination, we modify Equation 1 to include unused cables in both c-links and l-links. Figure 15 shows that MIP increases the energy saving of Abilene and GÉANT from 71.93% to 75.44% and 66.1% to 77.7%, respectively. Similarly, M-GMSU improves the energy saving of Abilene and GÉANT from 71.64% to 75.15% and 63.4% to 74.78%, respectively. Further, for DFN, Deltacom and TATA, M-GMSU increases their saving from 52.68% to 74.81%, 32.22% to 77.43%, and 23.97% to 74.55%, respectively. The additional saving arises because the unused cables in each l-link can now be powered off by gl-switches, hence saving more energy.
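The difference between the two accounting rules can be illustrated with a minimal sketch. It assumes that energy saving is the percentage of cables that are powered off out of all cables in the network, which is a simplification and not necessarily the exact form of Equation 1. Without gl-switches, only unused cables in c-links count as off; with gl-switches, unused cables in l-links count as well. The link data are hypothetical.

```python
def energy_saving(links, green_legacy=False):
    """Percentage of cables powered off.

    links: iterable of (total_cables, used_cables, is_c_link) tuples.
    Unused cables in l-links are counted only when green_legacy is True.
    """
    total = sum(total_cables for total_cables, _, _ in links)
    off = sum(
        total_cables - used
        for total_cables, used, is_c_link in links
        if is_c_link or green_legacy
    )
    return 100.0 * off / total


# Hypothetical network: two c-links and one l-link, four cables each.
links = [(4, 1, True), (4, 2, False), (4, 1, True)]

print(energy_saving(links))
print(energy_saving(links, green_legacy=True))
```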

4) ENERGY SAVINGS IN NON-SDNs
This section evaluates the energy savings in legacy networks or non-SDNs. More specifically, we use the greedy-based heuristic solution, called MSPF-LS, and its simulation results in [6] to represent the energy saving in non-SDNs. Similar to M-GMSU, the MSPF-LS approach in [6] considered multi-path routing, delay constraint, maximum link utilization threshold, and bundled links. Further, the results reported in [6] use the same network topologies as ours, i.e., Abilene and GÉANT. In addition, MSPF-LS used gl-switches that can also perform traffic rerouting. Thus, for M-GMSU, we use a total budget B that is sufficiently large to upgrade l-switches that can control all traffic flows and turn off all unused cables at the beginning of each upgrade stage. As reported in [6], MSPF-LF produced 73% and 74% energy saving for Abilene and GÉANT respectively; see Figure 16. On the other hand, the energy saving produced by M-GMSU is 75.14% and 76.46% for the respective networks. Thus, our results are better than those reported for MSPF-LF.

5) ENERGY SAVINGS IN PURE SDNs
This section presents the energy saving in pure SDNs. To represent the energy savings, we use GA [19] and LS [22]. In this case, except for the total budget B, we use the same scenarios and settings as in Section V-F for M-GMSU, LS and GA. We use a sufficiently large budget to upgrade all l-switches at the first stage and calculate energy saving ε T over T = 3 stages. Figure 16 shows that MIP obtains the optimal energy saving of 75.44% and 77.7% for Abilene and GÉANT, respectively. M-GMSU and GA produce the same energy saving of 75.44% for Abilene and 77.03% and 75.34%, respectively, for GÉANT. For LS, the energy saving for the two networks is 73.68% and 74.44%, respectively. The results show that our solutions, i.e., MIP and M-GMSU, outperform both GA and LS for pure SDNs. Further, we observe that the energy saving achieved in pure SDNs is higher as compared to those in non-SDNs and hybrid SDNs; viz. Section V-F.

VI. CONCLUSION
This paper considers the problem of upgrading a legacy network that supports OSPF-ECMP into an SDN over multiple stages. A key aim is that an upgraded network must maximize energy saving. To do so, we consider the maximum available budget at each stage, MLU, maximum path delay, and the requirement that each l-switch must comply with OSPF-ECMP. This paper considers two routing scenarios: 1) multi-path and 2) link-disjoint. We have formulated an MIP for scenario-1 and its extension, called DP-MIP, for scenario-2. In addition, we have proposed two heuristic solutions: M-GMSU for scenario-1 and DP-GMSU for scenario-2. Our simulations have shown that M-GMSU and DP-GMSU require significantly less CPU time than MIP and DP-MIP, respectively. Further, M-GMSU and DP-GMSU obtain energy saving that is only up to 4% off from the optimal saving obtained by MIP and DP-MIP, respectively. The energy saving of DP-MIP and DP-GMSU when considering link-disjoint paths is only 0.63% off from the saving attained by MIP and M-GMSU. Moreover, M-GMSU produces up to 1.77% less energy saving than GMSU, which uses single path routing. We find that increasing the budget and the number of stages can result in larger energy savings. Further, M-GMSU produces higher energy saving at later stages than an existing technique, called LS, that tends to spend its entire budget at the first stage. As the energy price (in $) is expected to increase every year, M-GMSU is expected to perform better than LS in terms of reducing the OPEX of networks. As future work, we plan to consider multi-controllers and their placement in an upgraded hybrid SDN.