Enhanced Logical Representations of a Real Network Based on an Algebraic Model

The algebraic model and logical representations of a real network are important but very challenging issues in automatic network verification. In this paper, we abstract the concrete network to its corresponding abstract graph with enhanced vertices and edges by splitting the vertices according to the protocols that they run. Based on classical routing algebra, we consider combining the interactions of different protocols and routing records and then give a newly modified algebraic structure. To apply the abstract modified algebra routing into the concrete network, we make use of the SMT solver to encode the components of algebra into logical constraints motivated by the work of Ryan Beckett et al. By encoding the network behaviors and properties into logical formulas, we can compute whether the experimental network that we configured satisfies any property, including reachability, routing loop, and so on. In the previous work, the routing algebra or the representation of the network is only applied to several network properties. However, in this paper, we extract all kinds of properties into exact logical formulas.


I. INTRODUCTION
A network is composed of a control plane configuring the behavior of the data plane, which in turn is in charge of forwarding the actual packet. Therefore, the control plane can take configuration files and the current network environment and then generate the data plane. The data plane can be treated as a director that tells a packet its next station. The data plane takes a packet and its location as input and outputs a packet at a different location.
To ensure that a large scale of the network runs correctly, engineers have to spend considerable time writing the configuration files and maintaining the network. They have to ensure that each routing protocol locally computes a set of correct routes for the routers and collaborates well with each other protocol. However, most of the time, the oversight of the configuration files may lead to a terrible disaster. These issues can be addressed by adding a level of abstraction separating the high-level decisions of what a routing protocol should compute and how it should be configured from the lowlevel implementation files of how it performs the necessary computation [1]. The behaviors of each device in a real network are complicated due to the different mechanisms The associate editor coordinating the review of this manuscript and approving it for publication was Eyuphan Bulut . of each routing protocol and the interactions among them. Obtaining a unified and abstract algebraic representation of the network at a high level is worth researching. The purpose of routing algebra is to model the complex routing protocols into a high-level abstraction structure to obtain the stable state of the network. As is known, network devices use several protocols to exchange messages and forward packets according to their destinations. The configuration files of each device decide how these routers or switches process packets and select paths for them. These files usually have thousands of command lines and are vendor specific, which makes it hard to determine the behavior of the network. Based on this approach, the low-level configurations of devices are modeled into concise specifications. The abstract models of kinds of tools enable engineers to handle these tasks and networks with high efficiency and reliability.
The earliest network diagnostic tools, such as traceroute and ping, can analyze whether and how a given packet reaches its destination to find the configuration errors. The issue is that they have poor data plane coverage. In other words, they just analyze the forwarding behavior for only a single packet for the data plane but ignore the corresponding packets.
HSA [2] and Veriflow [3] cover the data plane better to analyze the data plane behaviors. The problem is that it is too VOLUME 8, 2020 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ late to detect errors from the data plane because errors have occurred. An effective way is to translate network configuration files into logical formulas capturing the stable states to which the network forwards. This method will converge as a result of interactions between routing protocols, such as OSPF [4], BGP [5], [6], and static routes. Minesweeper [7] tries to achieve both high network design coverage and high data plane coverage while remaining scalable enough to enable verification of many real networks. This approach models the whole network instead of a single path by using a graph-based model. Encoding all interactions of the route messages, there are many logical constraints on these nodes and edges of the network. Moreover, these network properties, such as reachability, isolation, and routing loop, can also be translated to logical formulas according to the constraints of the network behaviors. By using an SMT solver [8], which combines the constraints of both network behaviors and the network property that we check, we can obtain the computational result.
Covering the control plane is hard work because devices may run different kinds of routing protocols. From this point of view, our goal is to modify the basic routing algebra to model the dynamic network and explore more abstract and concise logical representations of network behaviors and properties based on our approach of the algebraic model.
Much research focuses on raising the level of abstraction of network configuration files and the mechanism of routing protocols, which is an effective and unified way to synchronize the structure of different networks [9], [10]. However, it is difficult to contact algebra with real network behaviors. In this paper, we advocate an approach to the algebraic structure of routing based on the aforementioned algebraic models. There are mainly two significant improvements to our contributions. The first is the definition of the network graph. We split the nodes in the directed graph according to the multiple protocols that they run. In this way, we can clearly analyze and represent the interactions of protocols and forwarding behaviors. Moreover, we add the routing records part into the algebraic model to catch the interactions between different routes rather than following a single route. The second point is that we try to concretize the abstract algebraic model by logical representations. Based on the modified routing algebra, we complete the encoding of the network behaviors and properties into logical formulas and obtain all the concise logical representations of the main ten network properties.
The paper is organized as follows. Section II illustrates the procedures of modeling a real network using our modified routing algebra. First, we model the real network into a directed graph G as basic routing algebra does. Then, we illustrate how we split the nodes for the different protocols that they run. Thus, we give a new model of the network as G and show the structure of our approach to algebra routing by adding the routing records and filtering function. Thus, a high-level abstract structure to model the dynamic route changes in the real network is obtained. In section III, we apply the modified routing algebra to encode the whole network. At the highest level, the operating progress of the network can be understood as the mechanism and policies of routing protocols and the way that the devices process them. Then, we can divide the encoding model into three parts: data plane, control plane, and routing policy. We show the encoding details of network properties in section IV. It is important to point out that we encode all ten kinds of network properties in abstract logical formulas, which are not done completely in the previous work. Section V concludes with our contributions, a discussion of directions for future research, and practical use case scenarios.

II. ROUTING ALGEBRA
Recalling the basic routing algebra, Griffin et al. showed that network control planes solve the stable path problem [11], and these paths can be described by constraints on edges. In their work, each node represents an autonomous system, and these nodes process the BGP. BGP is referred to as a pathvector protocol that implements its routing policies by route announcement passed between routers or switches [12]. BGP announcements are represented as records that have several attributes. The basic routing algebra models the network into a directed graph and comprises a set of labels, a set of signatures, and a set of weights. Each network link has a label, and the path in the network has a signature. There is an operation to obtain the signature of a path from the labels of its constituent links and a function mapping signatures to weights [13]. Some related work is enriched by properties that combine its elements, allowing stronger statements about protocol convergence to emerge and permitting a computationally easy characterization of free cycles. Motivated by previous work [1], [10], [14], [15], we modify the routing algebra to combine routing records with basic algebraic structure, which is more helpful to verify the network properties. For ease of encoding and computing, we assume that the network is already convergent. We prefer the relation of routing algebra and logical formulas to the convergence of the network.

A. MODEL A NETWORK TO A GRAPH
Existing algebraic models mostly reason a single path from a certain source node to its destination. In data plane analysis, this method can effectively obtain results, such as HSA. Because of ignoring the interference between different routes, this approach can cause problems in real network analysis, especially in control plane verification. For example, several routes may use a common edge to forward traffic. If the edge is down, these routes all fail to reach destinations. Encoding a network into a graph can combine associated routing messages to analyze the current route. To represent the network in a graph, we first abstract the network to graph G, G = (V , E, W ) V : Nodes in the network. E : Links between each pair of nodes. W : Metrics of each link.
(1) Then, we need many logical constraints to model the nodes and links. The more complicated the network behavior is, the more logical formulas are needed, but we only compute those relevant constraints to obtain the result. Figure 1 presents a sample network to illustrate the dynamic state of the network. There are two autonomous systems, AS1 and AS2. Each AS has three internal routers, A to C and E, F, and H , which runs OSPF protocol separately. For AS1, A connects to an external neighbor N 1 via an external BGP. C connects AS2 through E running the external BGP as well. A, B, and C share the external BGP route messages from N 1 via the internal BGP. For AS2, H receives routes from external neighbor N 2 . E, F and H process the OSPF protocol to deliver route messages.
Suppose N 1 tries to establish a communication with N 2 . What will the routing path then be? First, A receives external BGP routes from N 1 and then announces to the routers in AS1. Next, A chooses a proper path to send the route out of the current AS. According to the short path first algorithm of OSPF, A selects a lower-cost path to reach C, which is the border router of AS1. The route will then be passed to E via the external BGP and passed through AS2, finally reaching N 2 .
Let us assume that each link in AS1 has the same cost. Then, the shorted path from A to C is the edge between them. Thus, the path from However, if the link between A and C breaks down, A will choose to reach C by way of B. Then, the path will be N 1 −A−B−C −E −F −H −N 2 . We represent all the possible outcomes to logical formulas and to compute whether N 1 can reach N 2 . By using modern SMT solvers, we can obtain the result to check the reachability.
To model the routing messages in more detail, the form and definition of graph G are modified. First, we extend the topology of the graph by splitting the nodes [7], [16]. The splitting number of a node depends on the number of protocols associated with the current route running on this node. For example, in figure 1, router C runs OSPF with its companies in AS1 and external BGP exchanging messages with router E in AS2. Router C splits to C OSPF and C eBGP two vertices. The number of edges E will increase with the splitting of nodes. The edge between C OSPF and C eBGP represents the protocol interaction of OSPF and the external BGP process on C. Moreover, the metric on this edge is also meaningful. Then, we can modify the graph as G, ( Hence, we have |V| ≥ |V |, |E| ≥ |E|. In general case, different paths can interfere with one another. To make a proper and correct analysis, all possible interactions should be taken into consideration. In our modified graph model, all possible paths can be constructed into a synthesized topology model, G. Then, the complexity of this structure should be O(V + E). However, in many existing network models, the complexity is up to O(V E V ).

B. ABSTRACT THE GRAPH TO ALGEBRA STRUCTURE
The control plane defines the behaviors of each router and the rules regarding how they treat different routing messages. The enhanced graph of a network is a set of nodes and edges. Based on the algebraic models of routing [10], [13], [17], [18], we try to combine the routing records and algebraic models to advocate a new destination-oriented algebraic structure. The destination-oriented refers to the fact VOLUME 8, 2020 that though the nodes of the graph model all routers in the network, we only consider those associated with the destination that we need to check. To model different routing protocols, we use a set of attributes to distinguish the different attributes of each protocol, such as the local preference and as-path of external BGP, the hop counts of RIP, and the cost OSPF. Then, we use R to model the routing records. When a router receives several routing messages to the same destination, it may select the best one and process it according to some filtering rules as configured. Thus, we also need a notation to model the routing attributes and a function F to model filters and routing policies. Then, we can define the algebraic structure as : routing attributes.
: comparison order over R. R : routing records.
F : filtering function. ( Here, r is a routing record associated with a certain destination d, and r ∈ R. To model the records in more detail, each r is a tuple of attributes. For example, for an external BGP routing record, r can be written as a vector of attributes, such as r(ad, lp, as _ path, med, . . .). We use φ to represent a missing attribute. Suppose r 0 is a BGP routing record with med value missed; then, r 0 can be denoted as r(ad, lp, as _ path, φ, . . .).
The operates on the routes in lexicographic order. In other words, can be treated as the total order that ranks and models the routing decision procedure using some combination of record fields [19], [20]. For example, if a router receives two records of different protocols, we just compare the ad values of the two records to obtain a better one. For the records of a same protocol, suppose we have two BGP routing records r 1 (lp 1 , as _ path 1 , med 1 , · · · ) and r 2 (lp 2 , as _ path 2 , med 2 , · · · ). To compare r 1 and r 2 , we first compare the values of lp 1 and lp 2 . If they are the same, we then compare the values of as _ path 1 and as _ path 2 and so on. Thus, we compare the two records in lexicographic order, and only the corresponding attributes can be compared, such as lp 1 and lp 2 . Then, lp and as _ path cannot be compared. Although a record r contains different attributes, any two records can be compared to obtain a better one.
The function F maps pairs with a node and a routing record to a record, which is where x ∈ V is a node and r ∈ R. The filtering function describes how routing records are modified, especially dropped by the current router.
A well-defined algebra structure of routing needs to satisfy two properties [20], loop free and faithful. The loop free is different from the network property routing loop.
Property 1 (Loop Free): A graph must not contain selfloops.
The routing records are strictly mapped from the routing messages from the network.
If router x blocks the current record, we have Here, r φ represents that the record is dropped. Then, router x will not advertise a record to its neighbor.
Since we model the network to a graph with a well-defined algebraic structure, we need an approach to concrete the algebraic structure to calculate the real network behaviors and verify some network properties. Motivated by the previous work [7], we try to encode the algebraic structure to logical constraints. Then, it is possible to transfer the symbols of the algebra to the corresponding logical variables.

III. THE BASIC NETWORK MODEL
In this section, we model the algebraic network into a set of SMT constraints. We first traverse the signatures in to their corresponding variables. The variables are set proper ranges to encode certain attributes of protocols or packets. For example, local preference is an important attribute of BGP; then, we can encode it with lp, and lp ∈ . According to different routing records, these variables of attributes form constraints, which include the network behaviors for a certain packet we might concern and the network environment that might influence the behaviors. To model the network state in detail, we also set constraints to the nodes in V and edges in E, such as variables encoding the active or down state of a node or a link. For ease of calculation, we denote these constraints to the notation C and denote the network property that we want to check as P. By using a modern SMT solver, such as z3, we combine C and ¬P to obtain the verification result. Why do we solve ¬P instead of P? The principle of z3 is interesting. If the network behaviors of a packet C are consistent with the network property, we do not need a specific outcome. In contrast, if C violates the property P, we do need to locate the problem; thus, the computation result shows the counterexample.
Based on the process of passing a packet from the source to its destination, we model the network into three parts.

A. DATA PLANE
To model the data plane into symbolic representation, we encode the header space of the packet. There are five important elements: source IP address, destination IP address, source port, destination port, and protocol. When a packet arrives at the router, the router needs to know where the packet is from and where to go. The symbolic notation we turn to the work of Beckett et al. [7]. The first is the source IP address of the current packet, which is modeled by an integer variable sIp. The destination IP address is modeled as dIp. Both of them range from 0 to 2 32 −1. We also need the source port and destination port of the packet. Then, they are denoted as sPort and dPort, which both range from 0 to 2 16 − 1. Additionally, the protocol in which the routers process is encoded as prot. Apart from the integer variables, the forwarding decision in the data plane should also be encoded. Thus, a Boolean viable choice x,y encodes whether router x forwards traffic to router y in the data plane. If router x finally decides to forward a packet to router y in the data plane, then choice x,y equals 1.
If not, it equals 0. Because routers will be equipped with filtering policies, some routes to certain destinations may be blocked, and then, we also need a variable to encode the effectiveness of the routes. We list the main variables in table 1.

B. CONTROL PLANE
The control plane is difficult to model due to the complexity of each routing protocol and the interactions among them. A network equipped with relatively complete functions usually processes several different protocols. In an autonomous system, OSPF, IS-IS, or RIP may be used, and the external BGP runs across multiple autonomous systems. Additionally, static routes and connected routes can be configured in the same network. Routers in the network usually process multiple routing protocols. For example, in figure 1, router C receives a route from A via the internal BGP and then announces the route to B via the OSPF and to E via the external BGP. To model the routing messages clearly, we try to split each node in the network into multiple nodes according to the protocols that it runs.
In figure 2, we illustrate the splitting of C. Routing messages are exchanged and processed among C iBGP , C eBGP , and C OSPF . We turn to the work of Ryan et al. to use records of symbolic values corresponding to protocol messages. It is difficult to model the message exchanges among different protocols because each protocol has specific values to filter and process the routes. For instance, BGP has various values to evaluate routing messages, such as as as-path, local  preference, metric, and multi-exist discriminators. OSPF runs the short path first algorithm; then, it needs a cost value to record the metric of a path. RIP needs a hop value to count the metric, and in SMT constraints, the hop is less than 15.
We can roughly take three steps to follow the process of the router treating a message. First, it needs to filter the message. Then, the current protocol processes the message. Third, the message traverses an export filter and is then exported to the next router. To model the procedure, we turn to the arcoriented symbol, and we encode the routing record r into r o and r i . For example, in figure 3, considering the edge between C OSPF and C eBGP , we give the edge a tuple of symbols, r o represents the record r exported by the OSPF process of C. r i encodes the message after traversing the import filter of C's external BGP process. The r i and r o symbols are the records of the routing message. In other words, each of them has the values associated with each protocol. Specifically, r i .ad encodes the administrative distance value of the imported record. For BGP, r i .lp encodes the local preference value of the imported record. For OSPF, r o .cost encodes the cost of the current path, and so on. The type of these variables is integer; thus, we can map them to specific values in SMT constraints. VOLUME 8, 2020 We list the main variables in the control plane in table 2. For convenience, we encode the routing messages as records, and each record includes the variables according to different protocols. When a router receives a routing record, it first compares the prefix or the record with its prefix of the destination IP address. The longest IP prefix matched will be chosen. The variable r prefix ranges from 0 to 2 32 − 1. Administrative distance (ad) is the feature that routers use to select the best path when there are two or more different routes to the same destination from different routing protocols. Each routing protocol is prioritized in order of most to least reliable (believable) with the help of an administrative distance value. The variable ad ranges from 0 to 2 8 − 1. For BGP, the records include some basic attributes to integer variables. For example, local preference (lp) is the BGP attribute ranking after administrative distance. If there are several paths to the same destination, the local preference attribute with the highest value indicates the preferred outbound path from the local autonomous system. The Boolean variable ibgp is used to mark the route via the internal BGP. The metric is also contained in these records, which is measured distinctly in different protocols. For BGP, the metric is the length of AS path, and for OSPF is the cost of a path. When a route is redistributed, the metric usually has an initial value, and the value depends on the configurations. After all the steps of the route selection, if several routes are still equally good, we need the router ID(rid) to break ties. Moreover, we need a variable canUse to encode the forwarding behavior in the control plane.
To make a forwarding decision in the control plane, the router needs a valid routing record, which is r valid = 1, and the next hop is the best choice among all possible choices. Then, we can encode the forwarding decision in the control plane as where best all denotes the highest-ranking symbolic record when comparing multiple possible choices, which is explained in detail in section route selection. The forwarding decision in the control plane and the rules of routing policies, such as ACLs or IP prefix lists, contribute to the final forwarding decision in the data plane. Then, the relationship between canUse and choice can be If the filtering function drops the routing record for some specific ACL rules, then F(r) = r φ . Thus, we can obtain choice x,y = 0 in the data plane.

1) ROUTE SELECTION
When a router has multiple choices to the same destination, it selects the best paths. In the current protocol process, the available paths are ordered in a uniform standard. For example, OSPF uses Dijkstra's algorithm to compute the minimum cost paths from a source to destinations, and the router chooses the min-cost path as its local best route. For the BGP, available routes are ordered according to the rank of those attributes, and the router that processes the BGP selects the high-rank path as its local best path. When a router can reach its destination via multiple different protocols, it selects a prior rank protocol to process according to the value of these protocols' administrative distance. As mentioned above, we turned to the symbolic variable [7] r to denote the record of a routing message. Then, we modify the form of r. A record r contains a series of attributes. Taking the BGP as an example, a record r can be treated as r(ad, lp, metric, as _ path, origin, . . .). Suppose a router receives two BGP routing records r 1 and r 2 , which announce that they all can reach the same destination. The router needs to select a better record. Then, the records r 1 and r 2 are compared lexicographically. r 1 .lp and r 2 .lp are first compared, and BGP prefers the higher value. If those are equal, BGP prefers the higher value of metric and then shorter as _ path, and so on. OSPF prefers a lower cost path. The RIP selects a lower hop count path. Encode this mechanism in SMT constraints in the following form: where means that record r 1 is superior to r 2 or at least as preferred as r 2 in lexicographic order. If a router is configured with k routing protocols, then each protocol maintains a routing information base in the control plane. Then, we can encode a symbolic variable prior i for each protocol instance i. Here, i denotes the ith protocol and 1 ≤ i ≤ k. The variable prior i is equated with the highest available routing record in the order for each protocol instance i. After comparing all the possible best paths prior i of each protocol, according to the administrative value of each protocol, the router finally chooses the best one to use, which we denote as best all . More importantly, the best all record is superior to any of the prior i and at least equals one of them.

2) ROUTE REDISTRIBUTION
Route redistribution allows a network that uses one routing protocol to route traffic dynamically based on information learned from another routing protocol. Each router usually runs more than one routing protocol instance. These routing protocol instances operate independently and announce or receive the best route from others. Suppose a router R runs two protocols, p 1 and p 2 . There are k 1 routers running the p 1 protocol and k 2 routers running the p 2 protocol, and the two routing protocol instances communicate through R. As mentioned above, p 1 and p 2 operated independently; then, we only consider the symbolic variables on the link between R p 1 and R p 2 . Due to the different metric standards of each protocol, such as the cost of OSPF, the length of BGP and the hop count of RIP, we modify the metric attribute of variable r only when r is to be redistributed. The metric is only a value according to the configuration files.
In figure 3, when calculating the formulas of route redistribution, we only consider the variables on the link between C OSPF and C eBGP . Now we consider the variable metric. When in an external BGP instance, metric represents the number of AS of the external BGP path. When the routing record is redistributed from the external BGP to OSPF, the metric is defined to the default value that is understandable to the receiving protocol (OSPF) to unify the measure.

C. ROUTING POLICY
All routing protocols place their routes into the routing information base or routing table. When advertising routes, the routing protocols by default advertise only a limited set of routes from the routing table. Moreover, each routing protocol exports specific routes according to the configuration files.
Routing policies enable operators to filter which routes a routing protocol imports into the routing table and which routes a routing protocol exports from the routing table. We can conclude this function as the import filter and export filter. A routing policy also enables operators to modify the information associated with a route when it is being imported into or exported from the routing table. Filtering imported routes enables the current router to control the routes used to determine active routes. Filtering routes being exported from the routing table enables the router to control the routes that a protocol advertises to its neighbors. Operators usually use ACLs or IP prefix lists to achieve filtering routes. The mask is used with IP addresses in ACLs to specify what should be permitted. This ACL rule will permit any packet whose destination IP prefix matches the first 24 bits of 192.168.0.1. Thus, we can convert this kind of rule into computable function h(p, n) technically, which takes the first n bits of p. For the above ACL fragment, we have the following logical formula. , 24) = h(192.168.1.0, 24) The IP prefix list is more flexible for controlling routes. The operators can configure prefix lists to match an exact prefix length or a prefix range. For example, considering the following IP prefix list, Additionally, when a routing table imports routing information from a routing protocol, a routing policy can modify the route's preference. When a routing table exports routes into a routing protocol, a policy might assign metric values, modify the BGP community information, tag the route with additional information, or prevent the route from being exported altogether. Then, in our logical formulas, we can modify the corresponding attributes of the record r. For instance, if router R is configured to change the local preference value of an external BGP route to 100, then we set r lp = 100.

IV. NETWORK PROPERTIES
The operators usually test some common properties to verify the network runs as expected. In our representations of the network, these network properties are also encoded into logical formulas. Combining the constraints of network behaviors and properties, we can solve whether a network holds such a property that we want to check. Here, we denote the network behaviors as C, a certain property as P. If C ∧ P = 1, then the network holds the property. However, to capture more details of the network, we use the SMT solver to compute C ∧ ¬P. If the result is satisfiable, the network does not hold the property that we check, and a counterexample is obtained. On the other hand, if the result is unsatisfiable, the property of interest is held. We illustrate the encoding of several common network properties.

A. REACHABILITY
To verify whether a source node can reach the destination node, we set an additional integer variable hopCount x,S 1 to each node and initiate the variable to −1 at the beginning. If a node does forward to its neighbor in the data plane, then we update the hop count of the neighbor by increasing hopCount. Specifically, the hopCount S,S of the source node equals 0. For a certain router, if its hop count is greater or equal to 0, 1 The two variables at subscript represent the hop counts from the source node S to the current node R. VOLUME 8, 2020 the router can reach its destination. Thus, we can claim that the source node can reach the destination node. Then, we have Here, y is an out neighbor of x. Then, we can check the reachability of source node S to destination node D by solving the hop count of D.
More generally, if we want to check whether a set of routers in δ can reach the destination, we can compute x∈δ hopCount D,x .

B. ISOLATION
Isolation is the negative property of reachability. If a node is isolated from the other node, we can test the reachability between the pair of nodes. In other words, if node A cannot reach node B, node A is isolated from B. Then, the isolation property can be encoded as In another situation, nodes A and B cannot communicate with each other in direction, and we can check Similarly, if the configurations need to ensure that two regions are isolated from each other, we can check Here, δ 1 and δ 2 are two sets of routers.

C. ROUTING LOOP
A routing loop is a serious network problem that happens when a data packet is continually routed through the same routers over and over. The data packets continue to be routed within the network in an endless circle. Figure 4 shows a routing loop occurring in a simple network. A routing loop can cause a catastrophic impact on a network and, in some cases, completely disable the network. Due to the loop preventing mechanism of BGP and OSPF, a normal routing loop is a problem associated with distance-vector protocols.
To encode this property, we add one Boolean variable to those specific nodes that are most likely to create a routing loop. The variable goesThrough denotes whether a data packet has gone through this node. It is initiated to be false at first, goesThrough = 0. When a router forwards a packet to its neighbor, the goesThrough variable is updated to be true, goesThrough = 1. If a router forwards to any neighbor with the goesThrough = 1, then a routing loop occurs. We can verify whether a routing loop occurs between x and y via the following formula: To check the routing loop between the source node and destination node, we can also compute the iteration of the above formula.

D. BLACK HOLES
In networking, black holes refer to some routers in the network that can receive packets but silently drop them, without informing the source that the data did not reach its intended destination. We can verify if a node is a black hole by checking its income forwarding and outcome forwarding. Denote the Boolean variable blackHole x as whether node x is a black hole in the network. Then, we have where I is the set of in-neighbors of node x, and O is the set of out-neighbors.

E. WAYPOINTING
Under certain circumstances, operators want to ensure traffic from source node S to destination node D will traverse a certain router of a chain of specific routers. Suppose the current traffic is configured to traverse k routers. As figure 5 shows, routers from R 1 to R k are the waypointing routers in green, and the general routers are blue nodes. First, we set each router a variable M with k + 1 dimensions. For a router x, M x = (1, m 1 x , m 2 x , · · · , m k x ). The first bit 1 denotes that the origin of the traffic is S, and m i x denotes whether the ith special router has been traversed for x. We also initiate these m i     (16) where I is the in-neighbor set of node x. Another way to verify waypointing is to check the reachability between S and D by shutting down these waypoints. To model the network more exhaustively, we also need to encode the state of the physical links and node in the scope of network topology. Therefore, we use state x to represent whether node x is active and state x,y to encode the state of the link between x and y. If state x = 0, then node x is shut down. Thus, we can set state w = 0 and ∃ w ∈ {waypoints} and then check reachability S,D .

F. DISJOINT PATH
The problem of disjoint paths in the network has been researched for years, and there are many clever algorithms to solve the problem. However, encoding such a property in a well-defined network model is still challenging. There are two subproblems of this property, the node-disjoint path and the link-disjoint path. We just take the link-disjoint path to illustrate the way we encode. Take figure 6 as an example. We want to verify that sources S 1 and S 2 traverse the same link (x, D) to destination D. We add a two-dimensional variable p to each link, p = (p 1 , p 2 ), and each p i is a Boolean variable, i = 1, 2. p 1 represents whether S 1 uses this link to forward traffic, and p 2 denotes whether S 2 forwards traffic via this link. A link with a variable p = (1, 1) is used by two routes. We can compute the variable p via the forwarding behavior,

G. PATH LENGTH
In some configurations, the operators may need to ensure that the packets traverse bounded-length paths. For instance, source S is configured to reach destination D, and the path length is not more than k hops. Otherwise, the traffic will occupy too much bandwidth of the network. Then, we can make use of the variable hopCount aforementioned to encode this property. If a source node S hops n counts to reach the destination node x, then the length of the path is n. We can check hopCount x,S ≤ k to ensure that the path is bounded. Some routers are configured to have an equal length to reach the destination. Violations can be found if these routers have different hop counts to reach the destinations.

H. MULTIPATH EQUIVALENCE
The routers configured ECMP (equal cost multipath) may have several paths to the destination. Multipath consistency VOLUME 8, 2020 ensures that these paths have equal statuses. For example, in figure 7, source S has three choices to reach destination D.
If D is reachable for S, then all three paths are valid. If one port of B fails, the path S − B − F − D is invalid. Multipath consistency ensures that the other two paths are also invalid. Then, we can make use of the reachability in an iterative way for each node.
Here, the variable reachability is defined as equation (10) shows. Through the above formulas, B can reach D via C but not F, which violates the multipath equivalence property.

I. FAULT INVARIANCE
It is inevitable that links or ports in the network will fail for some reason. Fault invariance maintains that the network still runs as expected when one or more nodes or links fail. Then, we can add a new variable fail to each node and link. If state x of router x is false, then fail x = 1. fail (x,y) = 1 if status (x,y) of link (x, y) is false. Thus, we can encode the property to and (x,y)∈E The equation above means that the network still runs as expected when there are k nodes or links that fail.

J. LOAD BALANCE
Some routers are usually configured ECMP to forward traffic through multiple paths to properly make use of the bandwidth of the link. Consider the example in figure 8; routers are all configured ECMP. In figure 8, router A receives traffic from N and forwards to B and C separately. Then, B forwards to D and the subnet S1. C forwards to D and D forwards to the subnet S2. We add a variable to each link. First, there is a total traffic variable t on the link from N to A. We can initiate t = t 0 as a whole input according to the configuration. Then, A receives t traffic and forwards w 1 A to B and w 2 A to C. Thus, we have w 1 Similarly, we can encode other links, where t 0 is a constance. These variables depend on the forwarding behavior, similar to the work [7] described. Take router A as an example, where a 1 is the traffic from A to C.
We can verify whether the links (A, C) and (A, B) average the traffic from N by checking |w 1 A − w 2 A | ≤ h, and h is the constant threshold that we set.

V. CONCLUSION AND DISCUSSION
In conclusion, our main idea is to abstract a network more practically, and combining the routing records into the algebraic structure is a way of integrating multiple routes into the model. We first redefine the definition of the graph model of the network. A node in the graph splits into k nodes, where k is the number of protocols that it runs. Thus, the information of the routes advertised in the network is clear to catch. Then, we advocate a modified routing algebra (G, , , R, F) by abstracting the routing records and filtering mechanism to R and F. In this way, we can concisely and accurately analyze and model the interactions of protocols and forwarding behaviors. Based on the modified routing algebra, we instantiate each element of the algebraic model. For instance, we use different types of variables to encode the nodes and edges in the network graph and the attributes of routing protocols. These variables compose logical formulas to encode the network behaviors, such as forwarding. Then, the filtering function F implements the functions of routing policies to block certain routes or modify the attributes of routing records. The other important improvement of our work is to make use of the constraints and formulas to abstract all the common kinds of network properties into logical representations and obtain a complete abstract model of both network behaviors and the properties in a mathematical way. Then, it is possible to program the whole network model and verify different properties of interest for computer engineers. The main contribution of our work is to model the real network into an algebraic structure and encode the network behaviors and properties in theoretically mathematical logical formulas. Coding the whole network model into a program and using real data for implementation are complicated and challenging. As further research, we that believe our contributions can be programmed and encoded to an available verification tool.
As is known, the mechanism of routing protocols is quite complicated. There are also some configuration techniques we have not considered, such as route maps and route reflectors in internal BGPs. Thus, determining how to model these features into the routing algebra and encode properly is still worth researching. On the other hand, applying the routing algebra to check the correctness of the configuration files is also important but a hard problem. Encoding the algebraic structure into logical formulas is an effective method to compute and code. As mentioned above, the logical formulas that we encode for nodes and edges are all SMT constraints. However, as the scale of the network increases, the number of SMT constraints will increase exponentially. Then, the time of verification programming may be out of control. Though some tools aim to improve the efficiency of verifying the reachability of the network by compressing the scale of the large network [19] or abstracting the control plane in a systematic form [15], they are obviously at the cost of losing the possibility of verifying QoS network properties. Therefore, it is still a significant and challenging issue to model and encode the network concisely and concretely. The algebraic model is applied not only in IP network verification but also in other related research areas. In the work of [21], the abstraction network representations are used to solve the space network, such as verification of the periodically changed topology of satellites in space. We believe that it is valuable to explore more applications of network verification in other research areas.