Minimum Cost Survivable Routing Algorithms for Generalized Diversity Coding

Generalized diversity coding is a promising proactive recovery scheme against single edge failures for unicast connections in transport networks. At the source node, the user data is split into two parts, and their bitwise XOR is computed as a third redundancy sub-flow. In order to guarantee instantaneous failure recovery without costly node upgrades, the network must ensure that any two of the three sub-flows reach the destination node in case of a single edge failure only by allowing flow duplication or merging identical flows, and avoiding any coding operation in the core network. In this paper, we investigate the corresponding routing problem to calculate capacity-efficient routes for these sub-flows. We propose a polynomial-time algorithm for topologies without capacity constraints on the links and without capability limitations of the nodes. We show that with node limitations the presented algorithm (as well as a minimum cost disjoint path-pair) provides a 4/3-approximation for the routing problem. Furthermore, we formulate an integer linear program to provide a minimum cost solution with arbitrary constraints in general graphs and we propose a polynomial-time algorithm in directed acyclic graphs. Our simulation results suggest that with upgrading only a small set of core network nodes with flow duplication and merging capabilities most of the benefits of generalized diversity coding can be achieved.

Diversity coding (DC) requires the existence of three edge-disjoint paths between the communication endpoints, which are rarely present in transport networks.
In order to tackle the connectivity issue, recent works [17], [18] generalized diversity coding and provided polynomial-time network coding algorithms to route the three sub-flows on minimum cost survivable subgraphs instead of disjoint simple paths. While [17] focused on algebraic properties such as the necessary field size for coding, in [18] we revisited the problem with a pure graph theoretical mindset, and demonstrated that no in-network coding is necessary at all. These works assumed that a minimum cost subgraph is given for coding; finding such subgraphs (survivable routing) was first discussed in [19]. In this paper we extend [19] in order to make the generalized diversity coding concept a viable alternative to 1 + 1 in transport networks. To be more specific, from a practical perspective we introduce an approximation algorithm for networks with limited node capabilities, and discuss incremental network node upgrade strategies to deploy our method in real networks. Furthermore, from a theoretical perspective, we propose a polynomial-time survivable routing algorithm in directed acyclic graphs.
The rest of the paper is organized as follows. In Section II we formulate our problem, and reveal important structural properties of the minimum cost survivable routings. In Section III a polynomial-time algorithm is presented for fully upgraded networks without capacity constraints. In Section IV we prove that 1 + 1 approximates our routing problem in partially upgraded networks, and provide a 4/3-approximation algorithm for this scenario. As the routing problem is NP-complete with scarce bandwidth resources in partially upgraded networks [19], in Section V we present an integer linear program for general topologies and a polynomial-time algorithm for directed acyclic graphs. In Section VI we show our simulation results, which reveal the network scenarios where the generalized diversity coding approach can be a real alternative to 1 + 1 with a minimal (or even without any) network upgrade. Finally, Section VII concludes the paper.

A. Problem Formulation
A transport network is a collection of routers and switches (referred to as nodes) and high-bandwidth communication channels (referred to as edges) between them. It may be represented by a directed graph G = (V, E, k, c) with node set V and edge set E. Each edge e ∈ E has two attributes, namely its capacity k(e) ∈ N, i.e., the number of bandwidth units available for data transmission, and its cost c(e) ∈ R+, which is defined as the cost of using one unit of bandwidth along edge e. We are given a connection request D = (s, t, d) with information source s ∈ V, information sink t ∈ V, and the number of data units d requested for transmission (the notation is summarized in Table I). Our goal is to allocate a non-negative bandwidth f(e) to each edge e such that the allocation is resilient against single edge failures. This goal can be achieved either by using three link-disjoint paths (Fig. 1a), or by using three directed acyclic graphs which might share common edges (Fig. 1b), such that even upon the failure of these edges all data units are received at the sink without any network reconfiguration, formally:
Definition 1: The allocated bandwidth f(e) for each edge e implements a survivable routing of connection request D = (s, t, d) in G, if ∀e ∈ E : f(e) ≤ k(e), and there is an s − t flow of value at least d in G with edge capacities f, even if we delete any single edge of G.
Our goal is to find a survivable routing f for connection D with minimum bandwidth cost, formally:
\[ \min \sum_{e \in E} c(e)\, f(e) \tag{1} \]
Note that the survivable routing problem was later extended to include different delay requirements of the applications [20]. However, in the current paper we deal with the original problem [19] without any delay constraint.
We say that a routing is vulnerable if it is not survivable. Furthermore, a survivable routing is critical if we cannot further decrease the bandwidth value f(e) along any edge e ∈ E without making the routing vulnerable. Intuitively speaking, critical means the routing is a local minimum.
The rest of the paper is devoted to finding the global minimum.
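Definition 1 can be checked directly with a maximum-flow computation. The following is a minimal sketch, not the method of the paper: it assumes the allocation f and the capacities k are given as dictionaries keyed by (tail, head) pairs, that s ≠ t, and it uses a plain Edmonds-Karp routine. The function names and the example data are hypothetical.

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp max-flow value on a dict {(u, v): capacity}; assumes s != t."""
    res = {}  # residual capacities: res[u][v]
    for (u, v), c in cap.items():
        res.setdefault(u, {})
        res.setdefault(v, {})
        res[u][v] = res[u].get(v, 0) + c
        res[v].setdefault(u, 0)
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:  # BFS for a shortest augmenting path
            u = queue.popleft()
            for v, c in res.get(u, {}).items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
        flow += bottleneck

def is_survivable(f, k, s, t, d):
    """Definition 1: f respects k and still carries d units after any single edge deletion."""
    used = {e: b for e, b in f.items() if b > 0}  # edges with f(e) = 0 never matter
    if any(b > k[e] for e, b in used.items()):
        return False
    if max_flow(used, s, t) < d:
        return False
    return all(
        max_flow({a: b for a, b in used.items() if a != e}, s, t) >= d
        for e in used
    )

# Hypothetical example: three edge-disjoint two-hop paths from s to t.
PATH_EDGES = [("s", "a"), ("a", "t"), ("s", "b"), ("b", "t"), ("s", "c"), ("c", "t")]
K2 = {e: 2 for e in PATH_EDGES}
diversity_coding = {e: 1 for e in PATH_EDGES}   # one unit on each of three paths
one_plus_one = {e: 2 for e in PATH_EDGES[:4]}   # two units on each of two paths
two_unit_paths = {e: 1 for e in PATH_EDGES[:4]} # vulnerable: one failure leaves one unit
```

With these inputs, `is_survivable(diversity_coding, K2, "s", "t", 2)` and `is_survivable(one_plus_one, K2, "s", "t", 2)` both hold, while `two_unit_paths` fails the check because a single edge failure leaves only one unit of flow.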
This optimization problem has been investigated for decades in the literature, and it was shown that finding the optimal survivable routing for a connection with d > 2 data parts, or finding the optimal survivable routing for multiple edge failures, are NP-complete problems [21]-[23]. However, in current transport networks, single edge failures are the most relevant failure scenarios [2], while dividing user data into more than two parts is impractical from an operational point of view. Furthermore, the minimum cost routing solution in most real-world networks can be reached by dividing the input flow into 2 sub-flows [9]. The first results on the complexity of this practically relevant special case of single edge failure minimum cost survivable routing with d = 2 were presented in [19], where it was shown that the problem is NP-complete with topological constraints, while it is polynomial-time solvable in the unconstrained case.
In this paper we focus on the algorithmic techniques solving the survivable routing problem for this practical scenario, i.e., the connection can be routed as two parts of equal (unit) size, denoted by A and B, considering multiple constrained scenarios. We are searching for a survivable routing for a single demand D = (s, t, 2) at a time. Our algorithms exploit the special structural property of critical survivable routing solutions, which is detailed in Section II-B.

B. Structure of Critical Survivable Routing Solutions
First we define a couple of auxiliary graphs. In [17], [18] R was called the "coding graph", and several of its properties have been proved, which we overview in this subsection. For the sake of easier presentation of our results, we introduce the auxiliary graph G* = (V, E*, c). The node set of G* is the same as the node set of G, and each e ∈ E is replaced by k(e) parallel edges (i.e., edges which have the same tail and head node as e), each with cost c(e). Note that k(e) is a non-negative integer, and a single edge failure e in G corresponds to the failure of all k(e) corresponding edges in G*. A critical survivable routing R* = (V_{R*}, E_{R*}) forms a Directed Acyclic Graph (DAG) in G* according to Lemma 4 of [18]. It represents the routing of the connection, where V_{R*} ⊆ V and E_{R*} ⊆ E*, while the objective function in Eq. (1) can be rewritten as:
\[ \min \sum_{e \in E_{R^*}} c(e) \tag{2} \]
Definition 2: A routing DAG H ⊂ G* is a subgraph of G* which is a DAG connecting s to t in such a way that there exist a positive integer l and distinct nodes s = v_0, v_1, ..., v_l = t of H such that, for every i with 1 ≤ i ≤ l, node v_{i−1} is connected to v_i in H by a directed path or by two fully-edge-disjoint directed paths, and H is the edge-disjoint union of these segments. If the segment from v_{i−1} to v_i consists of two directed paths, then v_{i−1} is called a splitter node and v_i a merger node (for obvious reasons). The edge set between a splitter node and the corresponding merger node is called an island.
[Figure caption (partially recovered): "... 2) with the corresponding routing DAGs E_A, E_B and E_{A⊕B} denoted with dashed, dotted and solid edges, respectively."]
A critical survivable routing R* of G* is the edge-disjoint union of three routing DAGs H_1, H_2, H_3. Moreover, for any edge e ∈ E at most two corresponding parallel edges are in R*, and if two such edges appear, then one of them is part of an island (e.g., in Fig. 1b the routing DAG corresponding to sub-flow A has an island between splitter node p and merger node t, and this island contains parallel edges with the other two routing DAGs). Therefore, if we delete from all the H_i DAGs the edges corresponding to an edge e ∈ E, then at least two of the resulting DAGs H_i \ {e} will still include directed paths from s to t and implement a survivable routing. Please note that in the routing problem under investigation (i.e., d = 2) the three routing DAGs whose union is R* are denoted as E_A, E_B, E_{A⊕B}, indicating that on the first DAG we send data part A, on the second one data part B, and on the third one A ⊕ B. We have the following facts about diversity coding:
Theorem 1: If G contains a survivable routing, then it contains a critical routing R as well. If R is critical, then it is a DAG, and R* can be obtained as the union of edge-disjoint routing DAGs H_1, H_2, H_3. Any node of a critical R can be a splitter (or merger) in at most one of the three routing DAGs.
The proofs of the claims of Theorem 1 are included in [17], [18]. To be more specific, in [17] the authors proved that a critical survivable routing is a DAG, while in [18] it was shown that its corresponding R* can be decomposed into three edge-disjoint routing DAGs with disjoint sets of splitter and merger nodes. As a corollary, R* can be obtained as the union of three appropriately selected routing DAGs, which gives the basic concept of the routing algorithms proposed in this paper.
We will refer to the routings satisfying Theorem 1 as Survivable Routing with Diversity Coding (SRDC). Note that in an arbitrary SRDC solution (one is shown in Figure 2) each of the three routing DAGs always carries the same data part (either A, B or A ⊕ B), regardless of the failure (i.e., no data retransmission or flow rerouting is necessary). Hence, if two routing DAGs remain s − t connected, the source data parts A and B can be reconstructed at the destination node with an XOR operation (if necessary at all). In diversity coding all of E_A, E_B, E_{A⊕B} are s → t paths. However, the deployment of an SRDC solution might require splitting and merging of the routing DAGs at the core nodes (e.g., nodes p and m in Figure 2). In Figure 2, E_A consists of an s → v path and a v → t island, E_B is an s → t path, while E_{A⊕B} consists of a path s → p, an island p → m, and a path m → t.
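The end-point operations described above can be illustrated with a few lines of code. This is a minimal sketch assuming the two data parts are equal-length byte strings; the function names and the dictionary keys ("A", "B", "AxB") are hypothetical and only serve the illustration.

```python
def xor_bytes(p: bytes, q: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(p, q))

def encode(a: bytes, b: bytes) -> dict:
    """Source side: the two data parts plus the redundancy sub-flow A xor B."""
    if len(a) != len(b):
        raise ValueError("data parts must have equal size")
    return {"A": a, "B": b, "AxB": xor_bytes(a, b)}

def decode(received: dict) -> tuple:
    """Sink side: recover (A, B) from any two of the three sub-flows."""
    if "A" in received and "B" in received:
        return received["A"], received["B"]          # no XOR needed at all
    if "A" in received and "AxB" in received:
        return received["A"], xor_bytes(received["A"], received["AxB"])
    if "B" in received and "AxB" in received:
        return xor_bytes(received["B"], received["AxB"]), received["B"]
    raise ValueError("fewer than two sub-flows arrived")

def drop(flows: dict, lost: str) -> dict:
    """Model a single failure that disconnects one routing DAG."""
    return {name: data for name, data in flows.items() if name != lost}
```

For example, `decode(drop(encode(b"\x0f\xf0", b"\x33\x55"), "A"))` returns the original pair, and the same holds when "B" or "AxB" is lost instead.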

C. Incremental Upgrade of Node Capabilities
In [24]-[26] the possible extension of node capabilities in Software Defined Networking (SDN) is discussed. Several implementations of network coding are presented, where besides merging and splitting the much more complex NC capability is also implemented. In [24], [25] Multiprotocol Label Switching (MPLS) labels are utilized to distribute sequence numbers [27]. With the sequence numbers, we are able to identify, duplicate (split) or merge given flows. Therefore, a splitter can be deployed by applying regular flow rules, while the merger functionality can be implemented as a network function [22], [28]. Thus, we believe that implementing splitting and merging operations is reasonably simple in SDN; however, a software update is still necessary, which might be performed incrementally in the network.
Hence, in our model the sets of the currently available splitter and merger nodes are given as the input of the problem and are denoted as P ⊆ V and M ⊆ V, respectively. If all nodes are capable of performing the splitting and merging operations, i.e., P = V and M = V, then we say that the network is fully upgraded. If only a given set of nodes is capable of performing these operations, then we deal with the partially upgraded network scenario. Note that for a given connection request D = (s, t, 2) we always assume that s ∈ P and t ∈ M, as these operations can be done by the application instead of network node upgrades.

III. POLYNOMIAL-TIME SURVIVABLE ROUTING ALGORITHM IN FULLY UPGRADED NETWORKS
In this section we show that the minimum cost survivable routing problem for d = 2 with diversity coding is solvable in polynomial time if P = V, M = V and there are no capacity constraints on the edges, meaning that f(e) can be an arbitrarily large positive integer. We shall see later that large capacities are not really necessary in this setting, and in fact, ∀e ∈ E : k(e) = 2 is equivalent to the no capacity constraint scenario.
Suppose that we have a critical survivable routing R such that R* is the sum of three routing DAGs E_A, E_B, and E_{A⊕B}. We show here an important property of the islands of these DAGs:
Lemma 1: Let R* be a critical survivable routing, which is a subgraph of G* corresponding to a network G that has no capacity constraints. Let R* be the union of three routing DAGs E_A, E_B, and E_{A⊕B}. Assume that E^{R*}_{p,m} is an island for a given splitter (p) and merger (m) node in E_A. Let E^{G}_{p,m} denote an arbitrary edge-disjoint dipath-pair connecting p to m in G, with the corresponding fully-edge-disjoint dipath-pair E^{G*}_{p,m} in G*. Then the routing R = (R* \ E^{R*}_{p,m}) ∪ E^{G*}_{p,m} is also survivable. (For brevity, we use "dipath" instead of directed path in the proofs.)
Proof: Since we have no capacity constraints, we can select the edges for the new island in G* to be different from the edges used in E_B and E_{A⊕B}. The survivability of routing R* implies that no edge e of G appears in two routing DAGs, unless e appears in an island of one of the DAGs. This holds also in R as the non-island edges of R* and R are the same, hence the deletion of all edges corresponding to e can disconnect at most one of the three routing graphs. As a consequence, after the deletion of e we still have two s − t dipaths in R \ {e}.
Corollary 1: Let R* be a minimum cost survivable routing and E^{R*}_{p,m} an island for a given splitter (p) and merger (m) node. If the network has no capacity constraints, then E^{R*}_{p,m} is a minimum cost fully-edge-disjoint dipath-pair from node p to node m in G*.
Proof: R* is a minimum cost survivable routing, hence it is also critical. This implies that it is the union of three routing DAGs, and these may have islands. Now if E^{R*}_{p,m} is not a minimum cost dipath-pair for a splitter-merger pair p, m, then with an optimal dipath-pair the construction of Lemma 1 would give a survivable routing R with cost lower than R*, which is a contradiction.
An optimal dipath-pair for p, m can be calculated with Suurballe's algorithm in O(|E| + |V| log_2 |V|) steps [3]. Note that E^{G*}_{p,m} survives a single edge failure, as it corresponds to a disjoint path-pair in G. Thus, we can substitute it with a fail-safe edge between p and m in E_A. This gives the basic idea for the algorithm, which searches for a survivable routing in a tractable form.
Claim 1: Let R* be a critical survivable routing, decomposed into three routing DAGs E_A, E_B, and E_{A⊕B}. Replacing every island E^{G*}_{p,m} with an edge (p, m) results in three edge-disjoint s → t paths.
Now we are ready to present our constructive proof, which gives a polynomial-time algorithm to find an optimal survivable routing. Let T denote the set of node-pairs that have an edge-disjoint dipath-pair between them in G. For each node-pair (u, v) ∈ T we compute the minimum cost disjoint dipath-pair and save the total cost as cost(u, v). We construct the following auxiliary (multi-)graph G′ = (V, E′, c′). The node set of G′ is the same as the node set of G, and G′ has |E| + |T| edges. The edges of G′ are the edges of E with cost c′(e) = c(e) for every e ∈ E, and we add an edge e_n = (u, v) for every (u, v) ∈ T with cost c′(e_n) = cost(u, v). We refer to the newly added edges as virtual edges.

Proof: Equality of costs is straightforward. Since π_A, π_B, π_{A⊕B} are edge-disjoint in the auxiliary graph G′, every edge e in E is contained in at most one path as a non-virtual edge, and may be contained in other island(s) used for substituting virtual edges. In case of a failure of an e ∈ E, the latter remain connected, hence at most one of the edge sets corresponding to π_A, π_B, π_{A⊕B} can be disconnected, which proves the claim.
The Lemma above implies that any three edge-disjoint s → t dipaths in the auxiliary graph G′ can be transformed into a feasible survivable routing in G with the same total cost as the three dipaths. To complete the proof of correctness, we need to show that a minimum cost survivable routing R is mapped to a union of three edge-disjoint s → t dipaths in G′ with minimal cost. Theorem 1 implies that R must be critical. Now according to Claim 1, R corresponds to three edge-disjoint s → t dipaths in G′. Moreover, the cost of the three edge-disjoint s → t dipaths equals the bandwidth cost of the survivable routing. This cost must be minimal for the three s − t dipaths in G′ according to Lemma 2. Finally, finding three edge-disjoint paths of minimum cost can be done in O(|E| log_{(1+|E|/|V|)} |V|) time [3]. In the construction of G′, finding the shortest pair of edge-disjoint paths from a single source to every destination takes O(|E| log_{(1+|E|/|V|)} |V|) steps [29]; this should be launched for every source node, resulting in O(|V||E| log_{(1+|E|/|V|)} |V|) steps, which proves the theorem.
It was shown in [18], as a consequence of Theorem 1, that in a critical survivable routing for a connection with d = 2 the bandwidth values are f (e) ≤ 2 for every e ∈ E. Thus, without loss of generality, we may build the auxiliary graph G * with k(e) = 2 (i.e., at most 2|E| edges) when searching for a solution in the no capacity constraint scenario.

IV. APPROXIMATION SURVIVABLE ROUTING ALGORITHM IN PARTIALLY UPGRADED NETWORKS
In this section we present an approximation algorithm to solve the SRDC problem in partially upgraded networks with no capacity constraints on the edges. First, we show that the algorithm provided by Theorem 2 cannot solve the survivable routing problem when not all nodes are capable of performing the splitting and merging operations. In Figure 3 only node m is upgraded, i.e., only node m can be a splitter or merger in addition to the source (s) and the destination (t) node. If diversity coding is used, the total cost of the solution is 22, since the user data is sent along three edge-disjoint paths (i.e., […] 12). If 1 + 1 is used, the cost of the solution is 20 (twice the cost of the π_1 and π_2 paths). The optimal survivable routing has cost 19 (given by the dotted, dashed and densely dotted edges in Figure 3). Note that between nodes v_4 and m two copies of the same data are transferred in order to reach merger node m. However, using the polynomial-time algorithm provided by Theorem 2 to find the three routing DAGs between s and t would use v_4 as a merger node to remove the duplicate copies, although v_4 is not upgraded.
In order to solve this issue, Algorithm 1 is based on finding three edge-disjoint paths in an auxiliary graph G″, which is constructed in the same way as the auxiliary graph G′ in Section III, with the exception that virtual edges are added only between upgraded nodes between which a disjoint path-pair exists (∀u ∈ P, v ∈ M : u ≠ v), instead of between every pair of distinct nodes between which a disjoint path-pair exists. Obviously, if P = V and M = V, we get back G′ and the constructive algorithm of Theorem 2. The computational complexity of Algorithm 1 is dominated by the creation of the auxiliary graph, which takes O(|V||E| log_{(1+|E|/|V|)} |V|) steps.
A. Algorithm 1 Approximates SRDC
1 + 1 was proved to be a 2-approximation [23] for the general survivable routing problem. However, our evaluations and simulations on hundreds of graphs showed that the ratio between the cost of 1 + 1 and the cost of the optimal SRDC solution is below 4/3 in all investigated topologies. This led us to the conjecture that 1 + 1 is a 4/3-approximation for the special case of d = 2 data units.
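The auxiliary-graph idea can be sketched in a few dozen lines. The code below is only an illustration under stated assumptions, not the implementation evaluated in the paper: it replaces Suurballe's algorithm with a simple successive-shortest-path min-cost flow (Bellman-Ford), which is slower but computes the same minimum cost of k edge-disjoint paths; it assumes no capacity constraints; and the function names, the arc-list input format and the example network are hypothetical. Passing no splitter and merger sets corresponds to the fully upgraded case of Section III.

```python
from itertools import product

INF = float("inf")

def min_cost_disjoint_paths(arcs, s, t, k):
    """Minimum total cost of k arc-disjoint s->t paths in a multigraph, or None."""
    to, cap, cost, adj = [], [], [], {}
    for u, v, c in arcs:
        # forward arc at an even index, its residual twin at the next odd index
        for a, b, residual, w in ((u, v, 1, c), (v, u, 0, -c)):
            adj.setdefault(a, []).append(len(to))
            adj.setdefault(b, [])
            to.append(b)
            cap.append(residual)
            cost.append(w)
    if s not in adj or t not in adj:
        return None
    total = 0
    for _ in range(k):
        dist = {v: INF for v in adj}
        dist[s] = 0
        pred = {}
        for _round in range(len(adj)):        # Bellman-Ford on the residual graph
            changed = False
            for u in adj:
                if dist[u] == INF:
                    continue
                for e in adj[u]:
                    if cap[e] > 0 and dist[u] + cost[e] < dist[to[e]]:
                        dist[to[e]] = dist[u] + cost[e]
                        pred[to[e]] = e
                        changed = True
            if not changed:
                break
        if dist[t] == INF:
            return None
        v = t
        while v != s:                          # augment one unit along the path
            e = pred[v]
            cap[e] -= 1
            cap[e ^ 1] += 1
            v = to[e ^ 1]
        total += dist[t]
    return total

def srdc_cost(arcs, s, t, splitters=None, mergers=None):
    """Cost of three arc-disjoint s->t paths in the auxiliary graph with virtual edges."""
    nodes = {x for u, v, _ in arcs for x in (u, v)}
    P = (set(nodes) if splitters is None else set(splitters)) | {s}
    M = (set(nodes) if mergers is None else set(mergers)) | {t}
    aux = list(arcs)
    for u, v in product(P, M):
        if u != v:
            pair = min_cost_disjoint_paths(arcs, u, v, 2)
            if pair is not None:
                aux.append((u, v, pair))       # virtual edge standing for an island
    return min_cost_disjoint_paths(aux, s, t, 3)

# Hypothetical example network: (tail, head, cost per bandwidth unit).
ARCS = [("s", "a", 5), ("s", "b", 5), ("s", "c", 5),
        ("a", "t", 1), ("b", "t", 1), ("c", "a", 1), ("c", "b", 1)]
```

On this example, `srdc_cost(ARCS, "s", "t")` gives 21 (two paths of cost 6 plus the arc s → c followed by an island from c to t of cost 4), whereas with no upgraded core node, `srdc_cost(ARCS, "s", "t", splitters=set(), mergers=set())`, the only virtual edge is (s, t) and the result is the 1 + 1 cost of 24. Upgrading only node c as a splitter already recovers the cost of 21.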
Proof: Let the two edge-disjoint paths of the 1 + 1 solution be denoted by π_1, π_2, and denote their costs by |π_1| and |π_2|, respectively. Furthermore, the paths of the SRDC solution in the auxiliary graph are denoted by π_{E_a}, π_{E_b}, π_{E_c}. The cost |π_1| + |π_2| of the path-pair of the 1 + 1 solution is not higher than the cost of any path-pair from the SRDC solution, since otherwise we would utilize the lower cost paths for 1 + 1. Hence we know that:
\[ |\pi_1| + |\pi_2| \le |\pi_{E_a}| + |\pi_{E_b}|, \qquad |\pi_1| + |\pi_2| \le |\pi_{E_a}| + |\pi_{E_c}|, \qquad |\pi_1| + |\pi_2| \le |\pi_{E_b}| + |\pi_{E_c}| \]
We have to show that the following inequality always holds:
\[ 2\,(|\pi_1| + |\pi_2|) \le \frac{4}{3}\,(|\pi_{E_a}| + |\pi_{E_b}| + |\pi_{E_c}|) \tag{3} \]
We emphasize that 1 + 1 sends both data parts (A and B) on both paths, resulting in a cost of 2(|π_1| + |π_2|), while SRDC transfers A, B, and A ⊕ B on three disjoint paths, resulting in an overall cost of |π_{E_a}| + |π_{E_b}| + |π_{E_c}|.
If we add the three inequalities and multiply by 2 we get that:
\[ 6\,(|\pi_1| + |\pi_2|) \le 4\,(|\pi_{E_a}| + |\pi_{E_b}| + |\pi_{E_c}|) \]
Dividing both sides by 3, it follows that the inequality in Eq. (3) is always satisfied.
Built on this fact, the following theorem can be stated.

Theorem 3: Algorithm 1 is a 4/3-approximation algorithm for SRDC when ∀e ∈ E : k(e) = 2.
Proof: Since the source s and target node t are always allowed to be splitter and merger, Algorithm 1 can return 1 + 1 as a worst case solution for every possible input. As 1 + 1 is a 4/3-approximation algorithm for SRDC according to Claim 2, Algorithm 1 provides a 4/3-approximation as well.
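To see that the factor 4/3 cannot be improved for 1 + 1, consider a hypothetical instance (not taken from the paper's figures) consisting of exactly three edge-disjoint s → t paths, each of total cost x, and no other edges. Every path must then carry one bandwidth unit in any survivable routing, so diversity coding on the three paths is optimal, and the costs compare as follows:

```latex
\begin{align*}
\mathrm{cost}_{\mathrm{SRDC}} &= |\pi_{E_a}| + |\pi_{E_b}| + |\pi_{E_c}| = 3x, \\
\mathrm{cost}_{1+1} &= 2\,(|\pi_1| + |\pi_2|) = 2\,(x + x) = 4x, \\
\frac{\mathrm{cost}_{1+1}}{\mathrm{cost}_{\mathrm{SRDC}}} &= \frac{4x}{3x} = \frac{4}{3}.
\end{align*}
```

In this instance all three pairwise inequalities used in the proof hold with equality, which is exactly the case in which the bound is attained.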

V. SURVIVABLE ROUTING WITH LIMITED FREE CAPACITIES
In practice some edges might have limited capacities (i.e., k(e) = 1, referred to as "bottleneck edges" in the rest of the paper), depending on the previously allocated demands. It was previously shown that with capacity constraints in partially upgraded networks the SRDC problem becomes NP-complete [19]. Hence, in Section V-A we present an Integer Linear Program (ILP) for general network topologies. On the other hand, in Section V-B we give a polynomial-time algorithm for directed acyclic graphs. (Regarding the worst case solution of Algorithm 1, note that 1 + 1 can be considered as sending A ⊕ B along two edge-disjoint paths, i.e., on the island between the source s and target node t.)
First, we show that the algorithm presented in Theorem 2 cannot cope with networks in which some edge capacities are k(e) = 1. The problem is that in such a capacity constrained case E^{G}_{p,m} depends on the route of the other two routing DAGs, i.e., another routing DAG may use the single available capacity unit along an edge e ∈ E^{G}_{p,m} of the minimum cost disjoint path-pair. For example, Figure 4(a) shows a network with an optimal survivable routing of cost 20. Note that the virtual edge e_n = (v_1, t) has cost c(e_n) = 5, because cost(v_1, t), i.e., the cost of the shortest path-pair v_1 → v_2 → v_3 → t and v_1 → v_5 → t, is 3 + 2 = 5. The minimum cost three edge-disjoint paths in the auxiliary graph of Section III are shown in Figure 4(b). Clearly, this is not a valid solution in the capacity constrained case, as edge e = (v_2, v_3) has only k(e) = 1 available capacity in G, while two routing DAGs should use it in the optimal solution.
A next attempt at a solution would be to modify the algorithm to find the minimum cost three edge-disjoint paths with Suurballe's algorithm using the augmenting path technique. Applying this technique to SRDC, the virtual edges are only traversed by the 3rd augmenting path, after two edge-disjoint paths were already found. A natural extension of the polynomial-time algorithm provided in Section III may be to run the disjoint path search for each virtual edge (e.g., for (v_1, t)) as a disjoint path-pair search between nodes v_1 and t. During this search the reverse edges of the already found two edge-disjoint paths can be used (shown in Figure 4(c)) similarly as in Suurballe's algorithm, and additionally it can use the reverse edges of the third edge-disjoint path's segment between s and v_1 (which is […]). This could result in an augmenting path between splitter v_1 and merger t of v_1 → v_5 → v_6 → v_3 → t. In this case the second augmenting path between splitter v_1 and merger t would be v_1 → v_4 → v_5 → t. This in fact results in a vulnerable routing shown in Figure 4(d) with cost 16.

A. Optimal Solution in General Graphs
In this section we present an ILP to obtain an optimal survivable routing R in terms of bandwidth cost. The ILP formulation provides the three routing DAGs for SRDC even with capacity constraints and node limitations. To do so, we need to introduce the so-called reduced capacity function [17] […]
The following constraints are required:
(5): […]
(6): ∀w ∈ W, ∀i ∈ P \ M: […]
(7): ∀w ∈ W, ∀i ∈ M \ P: […]
(8): ∀w ∈ W, ∀i ∈ V \ (P ∪ M): […]
∀e ∈ E: […]
∀e ∈ E: […]
∀e […]
The constraint in Eq. (5) formulates the flow conservation for each routing DAG w. Additionally, Eqs. (6)-(8) formulate the constraints needed for representing the different node capabilities. Namely, Eq. (6) represents the set of nodes that can only perform the splitting operation (P \ M), Eq. (7) is needed for the nodes that are only capable of merging the data stream (M \ P), and Eq. (8) is for non-upgraded nodes. Note that we do not need extra constraints for the nodes (P ∩ M) that can both split and merge the data. Eq. (9) sets the maximal flow value based on the reduced capacity function, while Constraints (10) […]
Note that the f_w(e) variables correspond to the edge set of w ∈ W in the solution, i.e., they provide the three DAGs. Since Constraint (5) ensures that x_A + x_B + x_{A⊕B} gives an s − t flow of value 3 in G, from Theorem 4 we get that f(e) is indeed survivable.
In order to analyze the complexity of the ILP we have to assess the number of constraints and variables necessary to formulate the problem. For the formulation of the flow and node capability constraints, i.e., for Eq.

B. Polynomial-Time Algorithm in Directed Acyclic Graphs
Although finding the optimal SRDC solution is NP-complete in general graphs, here we give a polynomial-time algorithm for the special case when the input topology is a DAG. Given a DAG G, let v 1 , v 2 , . . . , v n be a fixed topological order of the nodes in G, that is, for every edge e = (v i , v j ), i < j holds. For capacities k(e) and cost c(e) we are going to give an algorithm to find a minimum cost survivable routing solution for demand D = (s, t, 2). We can assume that s = v 1 and t = v n .
Definition 3: For any 1 ≤ i < n, let S_i := {v_1, ..., v_i} and T_i := {v_{i+1}, ..., v_n}, and let C_i denote the set of edges of G in the S_i − T_i cut, that is, those with tail in S_i and head in T_i. We call these cuts topological cuts.
Let C_i be a topological cut of G and let L_i, P_i, Y_i be three, not necessarily disjoint, 1- or 2-element subsets of C_i. We call such an ordered triplet τ_i = (L_i, P_i, Y_i) a coloring of C_i, where L_i ∪ P_i ∪ Y_i are the colored edges, and the edges in L_i, P_i, Y_i are called lime, purple and yellow, respectively. We say that this coloring is survivable if, after the removal of any edge e in C_i, at least two of the sets L_i, P_i and Y_i remain non-empty. A coloring of cut C_i and a coloring of cut C_{i+1} are compatible if they are the same on C_i ∩ C_{i+1} and for every colored edge in C_{i+1} with tail v_{i+1} there is an edge entering v_{i+1} with the same color (see Figure 5). A coloring τ_i of C_i is feasible for a capacity function k if, for every edge e in C_i, the number of colors containing e is at most k(e). For a subset of edges F ⊆ E let […]
Intuitively, a coloring of C_i intends to capture the parts of the survivable routing DAGs E_A, E_B, E_{A⊕B} which are subsets of C_i. As we seek a minimum cost solution, these parts cannot have more than two edges (according to Theorem 1).
Proof: In a survivable routing every edge intersects at most two of the three routing DAGs, hence the removal of any edge from a cut C_i leaves at least one of the corresponding color classes untouched, which proves the survivability of the cuts. Since an edge of capacity 2 appears in at most two out of the three routing DAGs, the corresponding edge sets in a cut are also feasible. Finally, compatibility of the cuts follows from the fact that the routing DAGs are s-reachable.
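The definitions of survivable and feasible colorings translate almost literally into code. The sketch below is an illustration only: it assumes a cut is given as a list of hashable edge identifiers and a coloring as a triple of frozensets (lime, purple, yellow); the function names are hypothetical, and compatibility between consecutive cuts is not modeled.

```python
from itertools import combinations, product

def is_survivable(coloring):
    """After removing any single edge, at least two color classes stay non-empty."""
    colored = set().union(*coloring)  # removing an uncolored edge changes nothing
    return all(
        sum(1 for cls in coloring if cls - {e}) >= 2
        for e in colored
    )

def is_feasible(coloring, k):
    """No edge carries more colors than its capacity k(e) allows."""
    return all(
        sum(1 for cls in coloring if e in cls) <= k[e]
        for e in set().union(*coloring)
    )

def colorings(cut, k):
    """All survivable, feasible colorings (L, P, Y) of a cut, by brute force."""
    subsets = [frozenset(c) for r in (1, 2) for c in combinations(cut, r)]
    return [
        triple for triple in product(subsets, repeat=3)
        if is_survivable(triple) and is_feasible(triple, k)
    ]
```

For a cut with three bottleneck edges (k(e) = 1) the three color classes must be disjoint single edges, so `colorings(["e1", "e2", "e3"], {"e1": 1, "e2": 1, "e3": 1})` has 6 elements. A cut of two edges with k(e) = 2 also admits 6 colorings, each using one class with both edges and one singleton class per edge.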
Lemma 4: If for three s-reachable subsets of edges L, P, Y the coloring τ_i = (L_i, P_i, Y_i) is survivable for every topological cut C_i, then L, P and Y form survivable routing DAGs of G.
Proof: Assume for contradiction that there is an edge e = (v_i, v_j) the removal of which disconnects at least two DAGs. Then it is easy to check that cut C_i is not survivable.
Now we are ready to describe our algorithm, which is based on dynamic programming. For every 1 ≤ i < n, let G_i denote the graph obtained from G by the contraction of the nodes in T_i. We are going to calculate the minimum cost of three survivable routing DAGs in G_i with a fixed survivable, feasible coloring τ_i on C_i. This value will be denoted by opt(τ_i).
For i = 1, the cost of a survivable, feasible coloring of C_1 is just the sum of the costs of the colored edges with multiplicity (an edge may have multiple colors). For 1 < i < n, let a survivable coloring τ_i = (L_i, P_i, Y_i) be given. Then […]
From Lemma 3 and Lemma 4, the cost of a minimum cost survivable routing is min{opt(τ_{n−1}) | τ_{n−1} is a survivable, feasible coloring of C_{n−1}}. Since edge capacities in a minimum cost survivable routing can be assumed to be 1 or 2, for every edge there are at most 6 possible colorings. Hence the number of survivable, feasible colorings of a topological cut C_i is O(|C_i|^6), and the above recursion yields a polynomial-time algorithm. Note that the case of splitter and merger node sets (when P and M are given) can be easily integrated in the algorithm by modifying the definition of compatibility, e.g., only a merger node v_{i+1} can have two entering edges and one outgoing edge of the same color (see Figure 5).

VI. EXPERIMENTAL RESULTS
In our simulations we assume that a set of connection requests D is given between all possible source-target pairs, and we plot the average capacity reserved per connection for every survivable routing approach. We compare our methods to the theoretical lower bound [22] (data can be divided into an arbitrary number of parts) and to 1 + 1 protection, which is a 2-approximation of the survivable routing problem against single edge failures in general [17], [23] and a 4/3-approximation of the SRDC problem with d = 2 data units. As a baseline, we also plot the optimal solution of the ILP(100) presented in Section V-A and the 4/3-approximation line of the optimal solution. The number in parentheses beside the algorithms refers to the percentage of upgraded nodes, e.g., (10) means that 10% of the nodes are upgraded with splitter/merger functionality. We investigate randomly generated real-like planar topologies G = (V, E, k, c) with different sizes and densities, as well as some real-world transport network topologies. For the real-like topologies, the simulation results are obtained by averaging several instances of topologies with the same properties (the 95% confidence interval is plotted).
Note that we do not compare our method to the DC since the blocking probability of the DC is extremely high, due to the fact that it requires the existence of three edge-disjoint paths between the communication endpoints.

A. Fully Upgraded Networks Without Capacity Constraints
Here, we present the simulation results without capacity constraints in Figure 6. The x-axis represents the node numbers of the random networks, while the y-axis shows the average capacity reserved per connection. Our results in Figure 6a show why 1 + 1 is still the most often deployed protection scheme, as the gap between the bandwidth cost of 1 + 1 and the theoretical lower bound for survivable routing is small. However, our SRDC algorithm given by Theorem 2 outperforms 1 + 1, and reaches the theoretical lower bound. This also demonstrates that the lower bound can be achieved by dividing the data into two parts in these topologies. On the other hand, in maximal planar graphs in Figure 6b the theoretical lower bound requires that connection data is divided into more than two data units. Although our algorithm still approaches the lower bound, 1+1 reserves one more edge (bandwidth unit) per connection to provide the same simplicity as our SRDC method.

B. Partially Upgraded Networks Without Capacity Constraints
In Figure 7 we show a scenario where not all nodes are upgraded with the splitter/merger functionality. In particular, upgrading just 10% of the nodes (which we consider a typical scenario of incremental network upgrade) yields a significant improvement over 1 + 1, both in sparse (Figure 7a) and dense networks (Figure 7b). We can observe that Algorithm 1 provides results close to the optimal solution of the ILP(100), which demonstrates that even with 10% of the nodes upgraded, our SRDC approach can bring real benefits. As the source and destination nodes are always considered to be splitter/merger nodes for a given connection demand, in the denser networks (Figure 7b) three disjoint paths almost always exist, and no in-network splitting and merging is required. Hence, network node upgrades cannot bring large capacity savings in this setting.

C. Experimental Results With Capacity Constraints
In this subsection, we investigate the capacity-constrained case through the performance of our methods in real network topologies (SNDLib [30] and Rocketfuel ASs [31]). In this scenario the network is heavily loaded, so that some edges lack free capacity and are considered bottleneck edges. To achieve this, we continuously increase the traffic load and analyze the given state of the network. For a fair comparison, we only take into account the non-blocking scenarios, i.e., where there is still a disjoint path-pair between all source and destination pairs even for 1 + 1. Since no traffic matrix is given beforehand, we identified a certain number of edges which are most prone to congestion based on their betweenness centrality value. We considered these edges as bottlenecks in the simulations (i.e., only a single capacity unit k(e) = 1 is available on them). Three traffic scenarios are distinguished:
• Light traffic load: no bottleneck edges in the network,
• Medium traffic load: at most 10 bottleneck edges,
• Heavy traffic load: at most 20 bottleneck edges.
Note that in each scenario the maximum number of bottlenecks is chosen only if it does not violate the non-blocking condition.
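The bottleneck selection described above can be sketched as follows. The code ranks edges by edge betweenness centrality (Brandes' algorithm for unweighted graphs) and picks the highest-ranked ones up to a given maximum. The names `edge_betweenness`, `pick_bottlenecks` and the callback `keeps_non_blocking` are hypothetical; in particular, how the non-blocking condition is tested is left to the caller, and skipping an edge that would violate it is an assumption of this sketch rather than the procedure used in the simulations.

```python
from collections import defaultdict, deque


def edge_betweenness(adj):
    """Edge betweenness for an unweighted, undirected graph.

    adj maps every node to an iterable of its neighbours (symmetric).
    """
    score = defaultdict(float)
    for s in adj:
        # BFS from s: count shortest paths and record predecessors.
        dist = {s: 0}
        sigma = defaultdict(float)
        sigma[s] = 1.0
        preds = defaultdict(list)
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies, farthest nodes first.
        delta = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += share
                delta[v] += share
    # Every unordered node pair was counted from both endpoints.
    return {e: val / 2.0 for e, val in score.items()}


def pick_bottlenecks(adj, max_count, keeps_non_blocking=lambda chosen: True):
    """Pick up to max_count edges with the highest betweenness.

    keeps_non_blocking is a hypothetical predicate supplied by the caller.
    """
    scores = edge_betweenness(adj)
    ranked = sorted(scores, key=lambda e: (-scores[e], sorted(e)))
    chosen = []
    for e in ranked:
        if len(chosen) == max_count:
            break
        if keeps_non_blocking(chosen + [e]):
            chosen.append(e)
    return chosen
```

Each selected edge would then be assigned k(e) = 1 in the capacity function, with `max_count` set to 0, 10 or 20 for the light, medium and heavy scenarios.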
In Figure 8 we show the results when both the capacity and the node capabilities are constrained, which is the most challenging SRDC subproblem. One can observe that as the traffic load increases, the average bandwidth cost of 1 + 1 increases dramatically (as 1 + 1 cannot use bottleneck edges), while the average bandwidth cost of the ILP(100), i.e., the optimal solution in fully upgraded networks, remains low and scales well with the traffic load. Furthermore, as the percentage of splitter/merger nodes increases, the average capacity reserved per connection decreases, demonstrating the benefits of incremental deployment.

D. Incremental Deployment
In this subsection, we intend to give network operators insight into how incremental deployment of SRDC improves the overall performance, and into how the nodes to be upgraded should be selected according to the budget. For selecting the upgradeable nodes, we compare two approaches:
• Random: nodes are selected uniformly at random,
• Smart (S in the figures): in a pre-processing phase we run the algorithm given by Theorem 2 for each source-target pair, assuming there are no capacity constraints on the edges. We count how many times a given node was utilized as a splitter/merger in these solutions, and greedily upgrade the nodes with the highest values until the budget is reached.
In Figure 9 we show the effects of the traffic load increase. We see that the gap between the ILP(100), ILP(10) and ILP(S10) increases gradually as the traffic load increases. Furthermore, it also demonstrates that even with randomly upgrading 10% of the nodes, SRDC performs close to the optimal solution even in a heavy traffic scenario.
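The "Smart" selection rule can be sketched as a simple counting step followed by a greedy pick. The sketch assumes that the pre-processing phase has already produced, for each source-target pair, the list of in-network nodes used as splitter/merger; the function name `smart_upgrade_set`, the input format, the choice to count a node once per demand, and the tie-breaking by node label are all illustrative assumptions.

```python
from collections import Counter


def smart_upgrade_set(splitter_merger_nodes_per_demand, budget):
    """Greedily choose up to `budget` nodes with the highest usage counts.

    splitter_merger_nodes_per_demand: one iterable of node labels per
    source-target pair (hypothetical output of the pre-processing phase).
    """
    usage = Counter()
    for nodes in splitter_merger_nodes_per_demand:
        # Assumption: a node is counted once per demand it serves.
        usage.update(set(nodes))
    # Highest count first; ties are broken by node label for determinism.
    ranked = sorted(usage, key=lambda n: (-usage[n], n))
    return ranked[:budget]
```

For example, with the hypothetical input `[["v3", "v5"], ["v3"], ["v5", "v7"], ["v3", "v7"]]` and a budget of two nodes, the sketch selects `v3` (used by three demands) and then `v5` (tied with `v7` at two, chosen by label).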
In Table II we demonstrate the benefits of a more fine-grained incremental deployment strategy on real network topologies in the heavily loaded network scenario. In particular, Table II compares the results of 1 + 1 and the presented ILP solution, where the number in the table refers to the number of upgraded core nodes (besides the source and destination nodes of each connection request, which are always considered as splitter/merger), e.g., S4 means that 4 of the core nodes are upgraded with splitter/merger functionality with the help of "smart" selection. Note that 0 refers to the case where only the source and destination nodes are capable of performing the splitting and merging operation, i.e., the survivable routing is either 1 + 1 or traditional diversity coding, whichever is better. Even in this case we can achieve a significant gain: the average capacity consumption of the ILP solution can drop to as little as half of that of 1 + 1. Furthermore, even by upgrading a small number of (random/cheap) core nodes we can further approach the optimal solution.

E. Run-Time Analysis
The simulations were performed on a computer running Debian Stretch Linux Version 9, with four 2.67 GHz Intel Core2 Quad Processors with 12 GB RAM. To solve the ILPs, we used the Gurobi Optimizer version 6.0.4.
The running time of the algorithm given by Theorem 2 and of Algorithm 1 is dominated by the creation of the auxiliary graphs G and G, respectively. Hence, when just a few nodes are upgraded (the number of virtual edges between splitters and mergers is small), the computation time is around 80 ms in the 40 node random networks. However, if all the nodes are upgraded the time is about ten times higher, but always less than 1.5 s per demand. In other words, the running time of both algorithms is strongly influenced by the size of the network and the number of upgraded network nodes.
In the capacity-constrained cases, the computation time of the ILPs is around 320 ms in the 40 node networks. It depends on the size of the network (increased number of variables and constraints), but it is independent of the number of bottleneck links. With incremental deployment, as we upgrade more nodes in the network the running time decreases slightly, since fewer constraints (Eqs. (6)-(7)) are needed.

VII. CONCLUSIONS
Generalized diversity coding is a novel, easily deployable routing scheme in transport networks which keeps the ultra-fast recovery and simplicity (both in computation and operation) of 1 + 1. As a missing link of its practical implementation, we investigated the minimum cost survivable routing problem (SRDC), showed that a minimum cost subgraph can be computed in polynomial time without capacity constraints on the edges (and in directed acyclic graphs), provided an approximation algorithm for partially upgraded networks, and proposed an integer linear program for the other scenarios. Our simulation results suggest that even with upgrading only a small set of network nodes, we can reduce the bandwidth cost of 1 + 1 in most network scenarios and use up to three to four fewer edges per connection, which could lead to a significant capacity saving when the number of connections is large. We argue that the novel method can provide a viable alternative to 1 + 1 in transport network protection for the price of a minimal network upgrade.