Towards Efficient and Delay-Aware NFV-Enabled Unicasting With Adjustable Service Function Chains

Network Function Virtualization (NFV) has become an emerging technology for ensuring the reliability, security, and scalability of data flows. The Virtual Network Function (VNF) embedding problem, which seeks to minimize the embedding cost and the link connection cost toward customers, or to maximize network throughput for a given set of NFV-enabled requests, has attracted extensive interest recently. However, existing works assume a fixed execution order of VNFs, which limits their applicability. In this paper, we investigate the VNF embedding problem without this limitation. First, we propose a general transformation framework for the NFV-enabled unicast routing problem with arbitrary order of service function chains, together with an optimal algorithm for the unicast VNF embedding problem without a delay constraint. Second, we propose an efficient algorithm with a theoretical guarantee for the same problem under a delay constraint. Third, we investigate the throughput maximization problem for a set of unicast requests with delay constraints, and propose an efficient algorithm that maximizes the number of admitted requests while minimizing the total traffic delivery cost. Finally, we evaluate the proposed algorithms via extensive simulations, which demonstrate their high efficiency.

location of VNFs; the link connection cost, which is related to the routing path. Note that, when deploying VNFs, reusing existing VNFs for unicasting may increase the link connection cost; conversely, instantiating a new VNF on another server node to balance the traffic load may increase the deployment cost. These trade-offs bring many challenges for orchestrating an NFV-enabled unicast service. Additionally, some NFV-enabled unicast requests carry a predetermined end-to-end delay constraint, and multiple unicast services may share the underlying network resources.
To address the above challenges, several works have investigated the NFV-enabled unicast orchestration problem [4], [5], [6], [7], [8], [9], [10], [11], and many effective approaches have been proposed. Some try to balance the deployment cost of VNFs and the link connection cost [4], some try to meet the delay requirements [12], and others try to maximize the throughput [13]. These methods usually rely on a given SFC to deploy the VNFs and compute the routing path. However, all these existing works assume that the SFCs follow a fixed order, which limits their applicability, especially when some server nodes can only instantiate specific VNFs. For example, a unicast request may require the SFC (VPN, firewall, monitoring, load balancing), which means the data flow should go through these four VNFs one by one. In practice, however, the firewall and monitoring functions can be executed in either order. This means that the SFC can be embedded in the order (VPN, firewall, monitoring, load balancing) or in the order (VPN, monitoring, firewall, load balancing). Relying on a fixed order alone may greatly increase the total traffic delivery cost. Fig. 1 depicts an example of embedding different SFCs with the same VNF requirements in an SDN network. The network shown in Fig. 1(a) consists of 11 nodes and 19 links. Assume the source node is s k and the destination node is t k . The nodes attached with servers are {A, B, C, D, E , F, G, H, I}; VNFs can be deployed on these servers, and the corresponding costs are shown around each node. The capacity of each node is set to 1, which means that only one function can be instantiated on it. Note that, although the four server nodes {E , F, G, I} can all instantiate functions f 3 and f 4 , they may have different deployment costs. The weight attached to each link is its connection cost.
The unicast task is to deliver the data from the source node s k to the destination node t k through the required SFC, in which the order of f 3 and f 4 can be changed. In the traditional SFC embedding with a fixed order ( f 1 → f 2 → f 3 → f 4 → f 5 ), the total cost reaches 62 in Fig. 1(b). If we compute the deployment and routing plan by considering the adjustable order, the total traffic delivery cost can be reduced to 46 (as shown by the red path in Fig. 1(c)).
In this paper, we present the first study of the SFC embedding problem with adjustable order for unicast in SDN. To minimize the total traffic delivery cost, a general transformation framework and several efficient algorithms are proposed. Our contributions are summarized as follows:
• We present the first study of the SFC embedding problem with adjustable order, and prove it to be NP-hard under the delay constraint. A transformation framework is proposed for embedding SFCs with adjustable order.
• To minimize the total traffic delivery cost, an optimal algorithm is proposed for unicast routing without delay constraint. For the case with a delay constraint, an efficient algorithm with a theoretical guarantee is also proposed.
• For the case of multiple unicast requests, a throughput-efficient NFV-enabled unicasting algorithm is proposed, which tries to maximize the number of admitted unicast requests.
• We evaluate the proposed algorithms through extensive simulations on both synthetic and real networks. Experimental results show the high efficiency of the proposed algorithms.
The rest of the paper is organized as follows. Section II reviews the related work. Section III formally defines the proposed problem. Section IV introduces the detailed design of the NFV-enabled unicasting algorithms that minimize the total traffic delivery cost. Section V presents the proposed throughput-efficient NFV-enabled unicasting algorithm. Section VI presents the performance evaluation of the proposed algorithms. Finally, we draw the conclusion in Section VII.

II. RELATED WORK
In recent years, the NFV-enabled routing problem has attracted increasing interest from researchers. According to their objectives, existing works can be grouped into three categories: minimizing the total traffic delivery cost [4], meeting the end-to-end delay requirements [12], and maximizing the network throughput [13].
Some works focus on minimizing the total traffic delivery cost [4], [5], [7], [8], [14]. Chen and Wu [5] propose a VNF placement algorithm to minimize the total traffic delivery cost for the unicast request by balancing the costs of VNF placement and link bandwidth consumption. Mirjalily et al. [7] study the problem of optimal VNF placement and routing for multicast services, and propose a two-stage heuristic method to reduce the total traffic delivery cost in wireless mesh networks. Ren et al. [9], [10] embed VNFs by transforming the SFC into a service function tree (SFT) to minimize the total traffic delivery cost for NFV-enabled multicasting. Alhussein et al. [11] observe that multicast replication points can appear before and after network function (NF) instance deployment, incorporate multipath routing between embedded NFs, and develop a flexible multicast routing and NF placement framework to improve resource utilization during traffic routing. Kuo et al. [15] design a dynamic programming algorithm for NFV-enabled request routing by considering bandwidth requirements. Lin et al. [16] develop a new abstraction model to minimize the embedding cost of SFCs with parallel VNFs by combining a Directed Acyclic Graph (DAG) and a hybrid SFC (an SFC with parallel VNFs).
Some works try to meet the end-to-end delay requirement while reducing the total traffic cost. Yu et al. [6] try to maximize the total profit when placing VNFs into a set of locations by taking the delay constraint of each unicast request into account. Jin et al. [17] design a novel two-stage delay-aware VNF deployment scheme and propose three heuristic algorithms for deploying VNF chains with latency guarantees and improving resource utilization at the network edge. Li et al. [18] study the NFV-enabled routing problem when the requests come from a single tenant with an end-to-end delay constraint, and propose a dynamic deployment strategy. Asgarian et al. [19] propose a source-destination-VNF graph-based two-stage approach to address the problem of embedding multicast service chains onto NFV-enabled underlying networks.
Some other works try to maximize the throughput of NFV-enabled unicast in SDN networks. Mike et al. [8] propose a general framework for delay-bounded unicast to maximize the number of admitted unicast requests. In mobile edge networks, Ren et al. [20] propose an approximation algorithm with a provable approximation ratio for single request admission, and then design an efficient heuristic algorithm for a given set of latency-aware NFV-enabled requests. Xu et al. [21] consider delay-aware task offloading designs with network function requirements in MEC networks, for which they propose efficient online algorithms with provable competitive ratios. Yue et al. [22] aim to optimize the throughput of SFC requests and propose an adjustment algorithm for MEC-NFV-enabled networks.
With the advent of the 5G era, Gharbaoui et al. [23] propose a service chain orchestration system in a 5G network with virtualization functions in geographically distributed edge clouds. Kumazaki et al. [24] propose a dynamic service chain construction based on Model Predictive Control (MPC) to utilize network resources. Pei et al. [25] study the differential routing problem considering SFC in SDN and NFV-enabled networks, and propose a resource-aware routing algorithm.
However, all the above existing works assume a fixed execution order of VNFs. In practice, the order of some VNFs in an SFC can be adjusted. Existing works, which do not consider this order adjustability, may embed the SFC in a non-optimal order. In this paper, we consider that the execution order can be adjusted between VNFs in each particular service chain, thereby providing flexibility for topology customization, which is particularly important for geographically distributed SFCs.

A. NETWORK MODEL
In this paper, the SDN is modeled as a graph G = (V, E ), where V and E are the sets of nodes and links, respectively. The nodes are composed of switch nodes (represented by the set V S ) and server nodes (represented by the set V M ), i.e., V = V S ∪ V M . Switch nodes are capable of forwarding and replicating traffic, while server nodes can both play the role of switch nodes and host and operate VNFs to process traffic. Each node v ∈ V M is treated as a switch node without an attached server if its server is not used for implementing VNFs deployed on VMs. Otherwise, the VNF implementation cost of node v must be taken into account. Let cap(v) denote the computing resources of the server node v ∈ V M available to hold VNFs, B e represent the bandwidth resources of each link e ∈ E , and c e denote the usage cost of one unit of bandwidth at each edge e ∈ E . When data is transmitted on a link e ∈ E , a traffic delivery cost and a transmission delay are incurred on that link. In G, an SDN controller is in charge of managing the allocation of resources for each admitted unicast request, and the above properties of each switch can be obtained by the controller using the methods in [26], [27]. Fig. 2 gives an example of an SDN network with VNFs. Consider service providers supplying m ≥ 1 categories of VNFs in the network G. We define the VNF set F = { f 1 , f 2 , f 3 , . . ., f m }. Furthermore, we let f 0 denote the dummy VNF instantiated on the source node s k . Any VNF incurs a deployment cost when deployed on a server node. For simplicity, we assume that each VNF instance can serve traffic flows of any size. Let c f j ,v represent the deployment cost of a new VNF instance f j on node v, v ∈ V M , and let a binary variable β f j ,v indicate whether or not a new VNF instance f j is deployed on node v. Let π f j ,v represent the amount of resources required to deploy f j on node v.
In a given SDN network G, the NFV-enabled unicast request r k is represented by a five-element tuple r k = (s k , t k , b k , SC k , S aop ), which implies transferring the data traffic from a source node s k to a destination node t k through the SFC SC k with adjustable order. In addition, b k represents the required bandwidth resource, which can be derived from historical information, and S aop indicates the set of adjustable-order parts in SC k .
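The request tuple can be mirrored by a small data structure. The following Python sketch is illustrative only: the class name UnicastRequest, the function admissible_orders, and the assumption that every adjustable-order part in S aop is a contiguous run of SC k are hypothetical choices made here, not part of the model above. It enumerates the execution orders that a request permits.

```python
from dataclasses import dataclass
from itertools import permutations, product
from typing import Tuple


@dataclass(frozen=True)
class UnicastRequest:
    source: str                               # s_k
    destination: str                          # t_k
    bandwidth: float                          # b_k
    chain: Tuple[str, ...]                    # SC_k in its nominal order
    adjustable: Tuple[Tuple[str, ...], ...]   # S_aop: parts whose internal order is free


def admissible_orders(req):
    """Enumerate every execution order allowed by the adjustable parts.

    Assumption (hypothetical): each adjustable part is a contiguous run of the chain.
    """
    part_of = {f: frozenset(part) for part in req.adjustable for f in part}
    segments = []
    i = 0
    while i < len(req.chain):
        f = req.chain[i]
        if f in part_of:
            run = req.chain[i:i + len(part_of[f])]
            if frozenset(run) != part_of[f]:
                raise ValueError("adjustable part is not a contiguous run of the chain")
            segments.append(list(permutations(run)))  # any internal order is allowed
            i += len(run)
        else:
            segments.append([(f,)])                   # fixed-order VNF: single option
            i += 1
    return [tuple(f for seg in combo for f in seg) for combo in product(*segments)]


# Example in the spirit of Fig. 1: f3 and f4 may be swapped.
req = UnicastRequest("s", "t", 1.0, ("f1", "f2", "f3", "f4", "f5"), (("f3", "f4"),))
orders = admissible_orders(req)
```

For the example request, the enumeration yields the nominal order and the order with f3 and f4 exchanged. The number of orders grows factorially with the size of each adjustable part, so explicit enumeration is only practical for short parts.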

B. NFV-ENABLED UNICASTING PROBLEM
For the NFV-enabled unicasting problem, when given a request r k , we first try to minimize the total traffic delivery cost of implementing the request, which consists of the computing and bandwidth resource consumption costs. After that, we investigate the NFV-enabled unicasting problem with the goals of meeting the end-to-end delay requirements and maximizing the network throughput. Thus, we define the following three optimization problems.

Definition 1 (The Resource-constraint NFV-enabled Unicasting Problem): For a unicast task r k = (s k , t k , b k , SC k , S aop ), under the condition that the computing resources of each server node v ∈ V M and the bandwidth resources of each link e ∈ E in G are constrained, the resource-constraint NFV-enabled unicasting problem is to find the routing path such that i) the unicast request is satisfied and ii) the total traffic delivery cost is minimized.
Definition 2 (The Delay-aware NFV-enabled Unicasting Problem): Based on Definition 1, the delay-aware NFV-enabled unicasting problem in an SDN G = (V, E ) for an NFV-enabled request r k = (s k , t k , b k , SC k , S aop ) is to find a routing path for r k , such that its total traffic delivery cost is minimized while the end-to-end delay of the path is no greater than a given end-to-end delay constraint D k .
Definition 3 (The Throughput-aware Delay-aware NFV-enabled Unicasting Problem): This problem in an SDN G = (V, E ) is to admit as many NFV-enabled unicast requests as possible from a given set of NFV-enabled unicast requests, while the accumulated implementation cost of all admitted requests that meet the end-to-end delay requirements is minimized, subject to the computing resource constraint at each server node v ∈ V M and the bandwidth resource constraint at each link e ∈ E .
As the example shown in Fig. 1(a), which depicts a SDN network. For simplicity, the switch nodes between server nodes are removed. In Fig. 1, the source node is s k , the destination node is t k , and the SFC required by the unicast task is (  total traffic delivery cost is 67. Another solution for the SFC deployment order of ( Fig. 3(b) with the cost 62. In the same way, there are also two routing schemes for the sequence SFC such as the total traffic delivery cost of the request can be greatly reduced by exchanging the execution order of the VNFs with adjustable order. For example, the best scheme with execution order ( can reduce the cost of admitting r k by 25.8% compared to the one of (

C. PROBLEM FORMULATION
To formulate the total traffic delivery cost and delay model, we define the following variables.
• $\theta^{f_j}_{u,v}$: a binary variable indicating whether the edge $e_{u,v}$ is located between VNF $f_j$ and $f_{j'}$, where $f_{j'}$ is the next right join VNF of $f_j$.
• $\varphi_{f_j,t_k,u}$: a binary variable denoting whether the flow is processed by the $f_j$ function at node $u$ before reaching the destination node $t_k$.
• $\sigma_{u,v}$: an integer variable representing the number of times the edge $e_{u,v}$ is traversed repeatedly.
• $P_{f_j,f_{j'}}$: the path collection between VNF $f_j$ and VNF $f_{j'}$ for routing traffic.
• $p^{\tau}_{f_j,f_{j'}}$: a specific routing path between the two VNFs $f_j$ and $f_{j'}$, where $\tau$ is a flag used to distinguish it from the other paths in the routing set $P_{f_j,f_{j'}}$.
• $x^{j}_{\tau,j'}$: a binary variable indicating whether the real path $p^{\tau}_{f_j,f_{j'}}$ is chosen to implement the path between a node in the AOP and a node in the FOP.
• $y^{j}_{\tau,j'}$: a binary variable indicating whether the real path $p^{\tau}_{f_j,f_{j'}}$ is selected to implement the path between two nodes that are either both in the AOP or both in the FOP.
Then, the Resource-constraint NFV-enabled Unicasting problem can be formulated as: Note that constraint (2) ensures that the node is not overloaded. Constraint (3) ensures that all links in the network will not exceed the bandwidth limit. For each NFV-enabled unicast request r k , its data traffic needs to flow through a series of network functions before reaching the destination node, and each flow is processed only once by the same VNF. Constraint (4) ensures that it will not be repeatedly processed by the same VNF. Constraint (5) ensures that the destination node can obtain the complete VNF services in the required SFC.
For a delay-aware NFV-enabled unicast request, we consider both the processing delay in the selected server nodes and the data transmission delay on the links in network G, which are defined in the following.
The processing delay of unicast request r k is related to the data traffic that needs to be processed and the computing resources allocated to process the traffic. We consider that the processing delay d j p,k of VNF f j in SC k with 1 ≤ j ≤ l for each request r k is proportional to the amount of traffic, i.e.,
$$d^{j}_{p,k} = \beta_j \cdot b_k,$$
where β j denotes a given proportional factor of VNF f j and b k is the bandwidth demand of r k . Therefore, the accumulative processing delay in SC k is:
$$d_{p,k} = \sum_{j=1}^{l} d^{j}_{p,k}.$$
Let P k be a candidate routing path from source s k to destination t k , and d t,k (e ) be the transmission delay on link e ∈ P k . The transmission delay d t,k on path P k is:
$$d_{t,k} = \sum_{e \in P_k} d_{t,k}(e).$$
As a result, the total delay experienced by r k is:
$$d_k = d_{p,k} + d_{t,k}.$$
For a service satisfying the delay demand, the total delay cannot be greater than the specified delay demand D k , i.e.,
$$d_k \le D_k.$$
Theorem 1: The delay-aware NFV-enabled unicasting problem is NP-hard.
Proof: We prove NP-hardness by showing that a special case of the delay-aware NFV-enabled unicasting problem is NP-hard. Assume the SFC SC k is empty; then the problem is equivalent to the traditional delay-constrained shortest path problem, which is NP-hard even on directed acyclic graphs [28]. Thus, the delay-aware NFV-enabled unicasting problem is NP-hard.
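The delay model of this section can be checked numerically with a few lines of code. The sketch below is a minimal illustration under stated assumptions: the per-VNF processing delay is taken as β j · b k (the traffic amount is assumed to be the bandwidth demand b k ), and the function names and sample numbers are hypothetical.

```python
def processing_delay(betas, b_k):
    """Accumulated processing delay: sum over the chain of beta_j * b_k (assumed form)."""
    return sum(beta * b_k for beta in betas)


def total_delay(betas, b_k, link_delays):
    """Processing delay of the chain plus transmission delay of the links on the path."""
    return processing_delay(betas, b_k) + sum(link_delays)


def meets_delay(betas, b_k, link_delays, D_k):
    """True if the end-to-end delay does not exceed the delay demand D_k."""
    return total_delay(betas, b_k, link_delays) <= D_k


# Hypothetical numbers: two VNFs, bandwidth demand 4, two links on the path.
example_delay = total_delay([0.5, 0.25], 4, [1, 2])
```

With these sample values the processing part contributes 3 and the links contribute 3, so the request would be admitted for any delay demand D k of at least 6.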

IV. NFV-ENABLED UNICASTING ALGORITHM FOR A SINGLE REQUEST
In this section, we introduce efficient algorithms for the NFV-enabled unicasting problem with and without an end-to-end delay constraint.

A. CONSTRUCTION OF PMCD
Before introducing the algorithms, we first construct a Parallel Multilayer Compression auxiliary Diagram (PMCD).
Let V f j be the set of server nodes that can instantiate VNF f j . The node set V k in PMCD is composed of the source node s k , the destination node t k , and the server node sets V f j of the VNFs in SC k . To guarantee that the network functions of SC k are traversed in the specified order, the nodes in V k are connected according to all possible SFC chains. The specific steps to construct PMCD, as shown in Fig. 4, are as follows:
• Step 1: … and do the same for the second column. Then create the third column V 3 th with the node set V f 3 , and the fourth column V 4 th with V f 4 . The result is shown in Fig. 5. Note that V f 1 is the set of nodes that can instantiate f 1 , which contains two nodes in Fig. 2, i.e., {v 4 , v 5 } (the numbers in the nodes in Fig. 5, i.e., 1 and 3, are the instantiating costs, which will be introduced later).
• Step 2: Each node in a column is connected to all nodes in the adjacent column, except for the nodes that instantiate the same function. The weight of each edge is the routing cost of the corresponding shortest path in the original network, which can be obtained by the shortest path algorithm [29]. If different adjacent VNFs are implemented on the same server node, then the weight of this edge is 0.
• Step 3: Connect the source node s k to each node v ∈ V 1 th in the first column with a directed edge, and set the edge weight to the total cost of the shortest path between the source node s k and v ∈ V 1 th in graph G. Similarly, connect each node v ∈ V l th in the last column to the destination t k , and set the edge weight to the total cost of the shortest path between the node v ∈ V l th and t k in graph G.
The edge set of PMCD therefore consists of the edges created in Steps 2 and 3. The constructed PMCD auxiliary graph ensures that the optimal deploying and routing strategy for the NFV-enabled unicasting is a subgraph of the PMCD.
In the following, we will introduce how to set the weight of nodes in the PMCD.
Consider a unicast request r k in the SDN network shown in Fig. 2, which contains two server nodes v 4 and v 5 . As mentioned above, the first two functions of the SFC ( f 1 , f 2 , f 3 , f 4 ) are adjustable. The deployment costs of these four functions on the two server nodes are represented by a matrix M, which is shown below.
Note that, since V f i denotes the set of server nodes that can instantiate function f i , then for each function f i and each node in V f i in PMCD, its weight is set as the cost of instantiating the corresponding VNF at that server node. For example, as shown in Fig. 5, V f 1 = {v 4 , v 5 }, and the cost of instantiating f 1 on v 4 and v 5 are 1 and 3, respectively, as in the matrix M. Then, for the nodes in V f 1 in the first column, their weights are just set 1 and 3, respectively. One can find that this auxiliary graph in Fig. 5 contains all possible SFCs of r k , which is a routing path from s k to t k .
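The edge and node weights described in this subsection can be sketched in code for a single candidate execution order. This is a simplified illustration rather than the PMCD itself: it builds one layered graph per order and does not merge the parallel columns of adjustable functions. The function names, the node labels ("src", "dst"), and the toy topology are assumptions made for the example.

```python
import heapq


def dijkstra(adj, src):
    """Shortest-path costs from src in a graph given as adj[u] = {v: link_cost}."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def build_layered_graph(adj, src, dst, order, hosts, deploy_cost):
    """Columns, edge weights, and node weights for one execution order of the SFC."""
    sp = {u: dijkstra(adj, u) for u in adj}
    columns = [[("src", src)]]
    columns += [[(f, v) for v in hosts[f]] for f in order]
    columns += [[("dst", dst)]]
    edges = {}
    for left, right in zip(columns, columns[1:]):
        for a in left:
            for b in right:
                if b[1] in sp[a[1]]:
                    # Shortest-path routing cost in G; 0 when both VNFs share a server.
                    edges[(a, b)] = sp[a[1]][b[1]]
    node_weight = {(f, v): deploy_cost[(f, v)] for f in order for v in hosts[f]}
    return columns, edges, node_weight


# Hypothetical 4-node topology with symmetric link costs.
adj = {
    "s": {"a": 1, "b": 4},
    "a": {"s": 1, "b": 2},
    "b": {"s": 4, "a": 2, "t": 1},
    "t": {"b": 1},
}
hosts = {"f1": ["a", "b"], "f2": ["b"]}
deploy_cost = {("f1", "a"): 1, ("f1", "b"): 3, ("f2", "b"): 2}
columns, edges, node_weight = build_layered_graph(
    adj, "s", "t", ("f1", "f2"), hosts, deploy_cost
)
```

In the toy example, the edge from ("f1", "b") to ("f2", "b") has weight 0 because both functions sit on the same server, which matches the rule stated in Step 2.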

B. THE OPTIMAL ALGORITHM FOR HANDLING UNICAST REQUEST WITHOUT DELAY CONSTRAINT
With the constructed PMCD, we now introduce the optimal algorithm for the NFV-enabled unicasting problem without delay constraint. For ease of description, we first introduce two symbols used in the proposed algorithm. Let c(p s k ,v i ) denote the shortest path distance from the source node s k to the node v ∈ V i th in the i-th column. Let c(e) denote the shortest path distance from the node v in the i-th column to the node u in the (i + 1)-th column. Note that one cannot simply employ the traditional shortest path algorithm here, because it may return a result with the SFC ( f 1 , f 2 , . . ., f i , . . ., f j , f j+1 , f j ), as shown by the red route in Fig. 4, where some functions are duplicated or missing. To address this issue, we propose an efficient min-path column-by-column iterative search algorithm. It works as follows.
• Step 1: For each node v 1 in the first column V 1 th , calculate the shortest path distance from s k to v 1 , denoted by c(p s k ,v 1 ). Then store the path from s k to v 1 and the VNFs that have already been deployed on this path.
• Step 2: Use the formula min(c(p s k ,v i−1 ) + c(e v i−1 ,u i )), taken over all v i−1 ∈ V (i−1) th , to calculate the shortest distance from the source node s k to the node u i (∀u i ∈ V i th ) on each column V i th (2 ≤ i ≤ l ). Then store the path from s k to u i ∈ V i th (2 ≤ i ≤ l ), the VNFs already deployed on this path, and the shortest path distance c(p s k ,u i ). Note that, if there exist redundant VNFs along this shortest path, it is simply ignored. As shown in Fig. 5, when calculating the shortest path from s k to u 2 ∈ V 2 th , one selects the node v 1 ∈ V 1 th in the column V 1 th that has the smallest sum of c(p s k ,v 1 ) and c(e v 1 ,u 2 ), and then stores the path from s k to u 2 ∈ V 2 th and the VNFs deployed on this path, i.e., ( f 1 , f 2 ). This process repeats until it reaches the last column V l th .
• Step 3: Calculate the shortest path distance from the nodes in the last column to the destination node t k . Finally, the path with the smallest distance is the one used to implement the unicast request.
Note that, when processing the function deployment of each column, it is necessary to determine whether the function has already been deployed. Also note that the total amount of computing resources of each node and of bandwidth resources of each link is limited. Hence, the routing path P k obtained by Step 3 may not be feasible in such a scenario. In this case, we need to check the obtained routing path so that it satisfies the network resource constraints.
Based on this idea, we further check whether there exist any overloaded server nodes and links. Specifically, let R e denote the remaining bandwidth capacity of link e ∈ E k . If R e < 2 · b k , we remove the corresponding link e in PMCD. Furthermore, we check all VNFs in the SFC. If the computing resources required to deploy f j ∈ SC k on node v ∈ V M are greater than the remaining resources R v on the node v ∈ V M , we disconnect the link between the overloaded node and its connected nodes.
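The resource check described above amounts to filtering links and candidate servers before the search. The following sketch shows one way to express it; the function name prune_for_request, the dictionary layout, and the sample values are hypothetical, while the threshold R e < 2 · b k and the comparison against R v follow the text.

```python
def prune_for_request(residual_bw, residual_cpu, hosts, demand, b_k):
    """Keep only links and candidate servers with enough residual resources.

    residual_bw:  {link: R_e}            remaining bandwidth per link
    residual_cpu: {node: R_v}            remaining computing resources per server
    hosts:        {vnf: [nodes]}         servers able to instantiate each VNF
    demand:       {(vnf, node): amount}  resources needed to deploy the VNF on the node
    """
    # A link is removed when R_e < 2 * b_k.
    usable_links = {e for e, r in residual_bw.items() if r >= 2 * b_k}
    # A server is dropped for a VNF when the required resources exceed R_v.
    usable_hosts = {
        f: [v for v in nodes if demand[(f, v)] <= residual_cpu[v]]
        for f, nodes in hosts.items()
    }
    return usable_links, usable_hosts


links, candidates = prune_for_request(
    residual_bw={("a", "b"): 10, ("b", "c"): 3},
    residual_cpu={"a": 1, "b": 4},
    hosts={"f1": ["a", "b"]},
    demand={("f1", "a"): 2, ("f1", "b"): 2},
    b_k=2,
)
```

If pruning leaves some VNF without any candidate server, the request cannot be embedded and would be rejected before the path search starts.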
Theorem 2: For the resource-constrained NFV-enabled unicasting problem, the optimal solution with the optimal VNF execution order can be obtained by Algorithm 1, which takes O(|V|^3) time.
Proof: We first prove that Algorithm 1 returns an optimal result for the resource-constrained NFV-enabled unicasting problem. For the nodes in the first column, the least delivery cost from the source s k to any node in the first column V 1 th is obtained by the shortest path algorithm. Then, for the nodes in the (l − 1)-th column, the shortest path from the source s k to any node in the (l − 1)-th column V (l−1) th can also be obtained as c(p s k ,v l−1 ). For the nodes in the l-th column, the shortest path from s k to any node in the l-th column V l th can be obtained as c(p s k ,v l−1 ) + min(c(e) | e = (v l−1 , u l ), ∀v l−1 ∈ V (l−1) th , ∀u l ∈ V l th ). Since the constructed PMCD considers all the possible adjustable orders of VNFs, the result obtained by Algorithm 1 is also an optimal result with the optimal VNF ordering.
We then analyze the time complexity of Algorithm 1. Since in the constructed PMCD graph $|V^k| = \sum_{i=1}^{l} |V_i^{th}| + 2 \le |V|^2$ and $|E^k| \le \sum_{i=1}^{l} (|V_i^{th}| \times |V_i^{th}|) \le |V|^3$, the construction of PMCD takes O(|V|^3) time. The weights of the edges in the PMCD graph can be obtained by calculating the shortest path between the corresponding two nodes in graph G, which takes O(|V|^3) time. Since the PMCD is a directed acyclic graph, finding the shortest path takes linear time [26]. Therefore, finding the shortest path in PMCD from s k to t k takes $O(|V^k| + |E^k|) = O(|V|^3)$ time. After finding the shortest path, one needs to convert it to the corresponding path in the original network graph, which takes O(|V| + |SC k |) time. Thus, Algorithm 1 takes O(|V|^3) time.
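The column-by-column search can be illustrated with a short dynamic program. The sketch below is not Algorithm 1 itself: instead of searching the PMCD, it runs the column-by-column recurrence once per admissible execution order and keeps the cheapest result, which is a simpler (and, for long adjustable parts, slower) substitute. Node capacities are ignored, and all names and numbers are hypothetical.

```python
def layered_min_cost(order, hosts, deploy_cost, sp, src, dst):
    """Column-by-column search for one execution order.

    sp[u][v] is the precomputed shortest-path cost between u and v in G.
    Returns (total cost, placement), where placement lists (vnf, server) pairs.
    """
    # best[v] = (cheapest cost of a partial embedding ending at v, its placement)
    best = {src: (0, [])}
    for f in order:
        nxt = {}
        for u in hosts.get(f, []):
            options = [
                (cost + sp[v][u] + deploy_cost[(f, u)], placement + [(f, u)])
                for v, (cost, placement) in best.items()
                if u in sp.get(v, {})
            ]
            if options:
                nxt[u] = min(options, key=lambda o: o[0])
        best = nxt
    finals = [
        (cost + sp[v][dst], placement)
        for v, (cost, placement) in best.items()
        if dst in sp.get(v, {})
    ]
    if not finals:
        return float("inf"), []
    return min(finals, key=lambda o: o[0])


def best_order(orders, hosts, deploy_cost, sp, src, dst):
    """Try every admissible order and return (cost, order, placement) of the cheapest."""
    results = []
    for order in orders:
        cost, placement = layered_min_cost(order, hosts, deploy_cost, sp, src, dst)
        results.append((cost, tuple(order), placement))
    return min(results, key=lambda r: r[0])


# Hypothetical line topology s - a - b - t with unit link costs.
sp = {
    "s": {"s": 0, "a": 1, "b": 2, "t": 3},
    "a": {"s": 1, "a": 0, "b": 1, "t": 2},
    "b": {"s": 2, "a": 1, "b": 0, "t": 1},
    "t": {"s": 3, "a": 2, "b": 1, "t": 0},
}
hosts = {"f1": ["b"], "f2": ["a"]}
deploy_cost = {("f1", "b"): 1, ("f2", "a"): 1}
result = best_order([("f1", "f2"), ("f2", "f1")], hosts, deploy_cost, sp, "s", "t")
```

In this toy instance the nominal order ( f 1 , f 2 ) forces the flow to backtrack from b to a and costs 7, whereas the exchanged order ( f 2 , f 1 ) follows the line topology and costs 5, which illustrates why the execution order matters.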

C. HANDLE UNICAST REQUEST WITH DELAY CONSTRAINT
In practice, routing requests may include a predetermined end-to-end delay constraint [30], [31]. In this section, we propose an efficient method for the delay-aware NFV-enabled unicast problem.
The main idea is to find the shortest path P k with delay limitation D k based on the constructed PMCD G′. To satisfy the end-to-end delay requirement, we try to adjust the weight of the edges in the PMCD, with (c k (v), c k (e), d p,k (v), d t,k (e)) for each node v and each link e in the original graph G, where c k (v) and d p,k (v) represent the processing cost and delay in the node v, respectively. In addition, c k (e) is the link transmission cost, and d t,k (e) is the transmission delay.

Algorithm 1: Handle Unicast Request Without Delay Constraint.
Input: A network G = (V, E ), request r k = (s k , t k , b k , SC k , S aop ), server nodes V M , the resources of server nodes cap(v), link resources B e
Output: A solution shortest path P k , and its cost C k .
1: Construct a PMCD G′ for r k , under the node resource constraints.
2: Delete edges in G′ that do not have sufficient resources.
3: if the link or node resources are insufficient then
4:   return 0.
5: end if
6: for each node v 1 in the first column V 1 th in G′ do
7:   Calculate the shortest path distance from s k to v 1 , denoted by c(p s k ,v 1 ). Then store the path from s k to v 1 and the VNFs that have already been deployed on this path.
8: end for
9: for each column V i th (2 ≤ i ≤ l ), starting from i = 2, in G′ do
10:   for ∀u j ∈ V i th do
11:     Calculate the shortest distance from s k to the node u j by the formula: min(p s k , …
23:   Connect node v to destination t k with minimum total cost C v ; meanwhile, store C v in the cost set and the shortest path P v in the path set, respectively.
24: end for
25: Compare each C v in the cost set to get the smallest cost C k and its corresponding path P k .
26: return C k and P k .
For the delay-aware NFV-enabled unicast problem, we try to find an end-to-end delay-constrained shortest path P k from s k to t k in the PMCD graph in terms of its implementation cost C p defined in equation (1), under the constraint $\sum_{v \in V(P_k)} d_{p,k}(v) + \sum_{e \in P_k} d_{t,k}(e) \le D_k$, where V (P k ) denotes all used server nodes on path P k .
Before introducing the proposed algorithm, we first transform the delay-aware NFV-enabled unicast problem into the delay constrained least cost (DCLC) path scheduling problem [32]. It works as follows. Each node in the PMCD graph has an attached server to implement the VNFs, which incurs a node consumption cost in the total traffic delivery cost and a processing delay d p,k (v) at the server node v ∈ V M . Therefore, formula (1) can be expressed in terms of the DCLC problem, i.e., $\min\{C_P \mid P \in \mathcal{P}_{s_k,t_k},\ d_{k,P} \le D_k\}$, where $\mathcal{P}_{s_k,t_k}$ represents the set of paths from s k to t k for the request r k , V (P) denotes all used server nodes on path P, C P denotes the total traffic delivery cost defined in (1), and $d_{k,P} = \sum_{v \in V(P)} d_{p,k}(v) + \sum_{e \in P} d_{t,k}(e)$ denotes the total delay of P. We next propose a modified edge cost function, i.e.,
$$c_\lambda(e) = c(e) + \lambda \cdot d(e),$$
where c(e) and d(e) denote the cost and the delay attached to edge e of the PMCD and λ ≥ 0 is the Lagrangian multiplier. With the adjusted edge weights of PMCD, the traffic delivery cost of the path P in PMCD can be expressed as $C_{P,\lambda} = C_P + \lambda \cdot d_{k,P}$. When d k,P > D k , we try to find a suitable solution for r k that meets the delay constraint by changing the value of λ.
In general, we adopt the ideas of Juttner et al. [28], who use the concept of aggregated costs and exploit an efficient Lagrangian relaxation based algorithm to obtain the delay-constrained routing path. Based on the PMCD graph, the detailed operations to find a feasible solution are stated as follows.
• Step 1: Set λ = 0 and find the shortest path P c from s k to t k with cost C P c by Algorithm 1. If the path P c meets the delay constraint D k , stop the algorithm and return the optimal path P c as P k .
• Step 2: Otherwise, store the path P c and d k,P c . Then, using the link delay d t,k (e) and node delay d p,k (v) as the cost of the PMCD graph, find the minimum-delay path P d from s k to t k . If the obtained path P d satisfies the delay constraint, the path P d with total delay d k,P d is stored as the best path found so far, and the algorithm goes to Step 3. Otherwise, there is no path in the network that can meet the delay constraint for this request r k , and the algorithm stops.
• Step 3: Set λ = (C P c − C P d )/(d k,P d − d k,P c ). Adjust the edge weights in PMCD by λ. Then use Algorithm 1 to find another path P r . If d k,P r ≤ D k , set P d = P r ; otherwise, set P c = P r .
• Step 4: Repeat Step 3 until C P r ,λ = C P c ,λ . Return the path P d as P k .
• Step 5: Finally, replace all edges in P k with their corresponding shortest paths in network G to derive the delay-constrained shortest path.
Theorem 3: Let LB(λ) = min{C P,λ |P ∈ P s k ,t k } − λ · D k , for any λ ≥ 0. Then LB(λ) is a lower bound for the proposed algorithm on the delay-aware NFV-enabled unicasting problem.

Algorithm 2: Handle Unicast Request With Delay Constraint.
Input: A network G = (V, E ), request r k = (s k , t k , b k , SC k , S aop , D k ), server nodes V M , the resources of server nodes cap(v), link resources B e .
Output: A solution shortest path P k with the cost C k .
1: Find the shortest path P c from s k to t k with the cost C P c by Algorithm 1.
2: if d k,P c ≤ D k then
3:   Get P c and C P c ;
4:   Derive P k by replacing each edge in P c with its corresponding shortest path in network G;
5:   return P k and C P k .
6: else
7:   Find another shortest path P d from s k to t k with the cost C P d , using the edge delay d t,k (e) and the node delay d p,k (v) in PMCD.
8:   if d k,P d > D k then
9:     Reject r k , return 0.
10:   else
11:     while true do
12:       λ ← (C P c − C P d )/(d k,P d − d k,P c ), where C P and d k,P are the sums of costs and delays on the edges and nodes in P, respectively;
13:       Find another shortest path P r from s k to t k by Algorithm 1, using the modified edge weights;
14:       if C P r ,λ = C P c ,λ then
15:         break.
16:       else
17:         if d k,P r ≤ D k , P d ← P r ; otherwise P c ← P r .
18:       end if
19:     end while
20:     Derive P k by replacing each edge in P d with its corresponding shortest path in network G;
21:     return P k and C P k .
22:   end if
23: end if
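Proof: Let $P_{opt}$ denote the optimal solution to Definition 2, and let the minimum cost of implementing an NFV-enabled unicast request with delay constraint be $C_{P_{opt}}$. Since $P_{opt} \in \mathcal{P}_{s_k,t_k}$, $d_{k,P_{opt}} \le D_k$, and $\lambda \ge 0$, with the modified cost $C_{P,\lambda} = C_P + \lambda \cdot d_{k,P}$ we have

$$LB(\lambda) = \min\{C_{P,\lambda} \mid P \in \mathcal{P}_{s_k,t_k}\} - \lambda \cdot D_k \le C_{P_{opt},\lambda} - \lambda \cdot D_k = C_{P_{opt}} + \lambda \cdot (d_{k,P_{opt}} - D_k) \le C_{P_{opt}}. \qquad (12)$$

Theorem 4: For the delay-aware NFV-enabled unicasting problem, Algorithm 2 can find a feasible solution in $O(|V|^3)$ time.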
Proof: First, Algorithm 2 also searches the PMCD for the shortest path that satisfies the end-to-end delay constraint. From Theorem 2, it follows that the obtained solution does not violate the bandwidth and node computing resource constraints. In addition, by the algorithm proposed by Juttner et al. [28], the path found in PMCD satisfies the end-to-end delay demand of request $r_k$. Therefore, this solution is feasible.
By Theorem 2, constructing a PMCD takes $O(|V|^3)$ time. Based on the time complexity analysis of Juttner et al. [28], Algorithm 2 takes $O(|V_k|^2 \log^4 |V_k|) = O(|V|^2 \log^4 |V|)$ time to find the path from source $s_k$ to destination $t_k$ in PMCD. After the path is found, converting it to the corresponding path in the original network graph takes $O(|V| + |SC_k|)$ time. Thus, the overall running time of Algorithm 2 is $O(|V|^3)$.
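The iteration of Algorithm 2 can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the multiplier update of the Lagrangian relaxation method of Juttner et al. [28], $\lambda = (C_{P_c} - C_{P_d}) / (d_{k,P_d} - d_{k,P_c})$, uses a plain Dijkstra search as a stand-in for Algorithm 1, and runs on a small hypothetical graph whose edges carry a (cost, delay) pair, with node costs and delays assumed to be folded into the edges. The names `dijkstra`, `totals`, and `delay_constrained_path` are illustrative only.

```python
import heapq


def dijkstra(graph, src, dst, weight):
    """Shortest path from src to dst under weight(cost, delay); None if unreachable."""
    dist, prev, done = {src: 0.0}, {}, set()
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            break
        for v, (cost, delay) in graph.get(u, {}).items():
            nd = d + weight(cost, delay)
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def totals(graph, path):
    """Total (cost, delay) of a path."""
    hops = list(zip(path, path[1:]))
    return (sum(graph[u][v][0] for u, v in hops),
            sum(graph[u][v][1] for u, v in hops))


def delay_constrained_path(graph, s, t, max_delay, eps=1e-9):
    """Return (path, lambda), or None if the request must be rejected."""
    p_c = dijkstra(graph, s, t, lambda c, d: c)        # Step 1: min-cost path
    if p_c is None:
        return None
    if totals(graph, p_c)[1] <= max_delay:
        return p_c, 0.0
    p_d = dijkstra(graph, s, t, lambda c, d: d)        # Step 2: min-delay path
    if totals(graph, p_d)[1] > max_delay:
        return None                                    # no feasible path exists
    while True:                                        # Steps 3 and 4
        c_c, d_c = totals(graph, p_c)
        c_d, d_d = totals(graph, p_d)
        lam = (c_c - c_d) / (d_d - d_c)                # assumed update from [28]
        p_r = dijkstra(graph, s, t, lambda c, d: c + lam * d)
        c_r, d_r = totals(graph, p_r)
        if abs((c_r + lam * d_r) - (c_c + lam * d_c)) < eps:
            return p_d, lam                            # modified costs coincide
        if d_r <= max_delay:
            p_d = p_r                                  # feasible: new best path
        else:
            p_c = p_r                                  # infeasible: new P_c


# Hypothetical instance: each edge maps to (cost, delay).
example = {
    "s": {"a": (1, 5), "b": (5, 1), "t": (5, 6)},
    "a": {"t": (1, 5)},
    "b": {"t": (5, 1)},
}
result = delay_constrained_path(example, "s", "t", max_delay=7)
```

On this instance with a delay bound of 7, the cheapest path s-a-t (cost 2, delay 10) is infeasible, and the sketch ends with the direct path s-t (cost 5, delay 6). The PMCD construction and Step 5 (mapping PMCD edges back to paths in $G$) are omitted.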

V. ALGORITHM FOR THROUGHPUT-AWARE NFV-ENABLED UNICASTING
In this section, we introduce the method for the throughput-aware NFV-enabled unicasting problem, where there exists a set of unicast requests with delay requirements. We try to admit as many requests in the set as possible while minimizing the sum of the implementation costs of all admitted requests, subject to the limited network resources in $G$.
To determine the order in which the unicast requests are processed, we define a weight $wt_k$, which compares the amount of resources consumed by the requests in terms of the computing resource $c_k^v$ of $r_k$ and the bandwidth resource $b_k$, i.e., $wt_k = \frac{1}{2}(b_k + c_k^v)$. For example, for the request $r_k$ and its SFC shown in Fig. 2, the resulting weight is $wt_k = 2.625$. Then, the proposed throughput-aware NFV-enabled unicasting algorithm works as follows, of which the pseudo code is shown in Algorithm 3.
• Step 1: According to their SFCs, the requests are grouped into different categories. The requests in the same category share the constructed auxiliary graph, as shown in Fig. 6.
• Step 2: Calculate the weight $wt_k$ of each request $r_k$. The weight $wt_k$ is then appended to the six-tuple of $r_k$ to construct a new seven-tuple $r_k = (s_k, t_k, b_k, SC_k, S_{aop}, D_k, wt_k)$. Based on $wt_k$, all the requests $r_k$ are sorted in non-decreasing order.
• Step 3: Employ Algorithm 2 to handle the first request (i.e., the request with the smallest weight) in each category. If Algorithm 2 returns 0, the request is rejected; otherwise, it is admitted. Afterwards, remove that request from the category and handle the first request in the next category. If the remaining resources in the network are insufficient, the algorithm terminates.

Algorithm 3: Throughput-Aware NFV-Enabled Unicasting.
Input: A network $G = (V, E)$, a given set of requests $r_k = (s_k, t_k, b_k, SC_k, S_{aop}, D_k)$, server nodes $V_M$, the computing resources of server nodes $cap(v)$, link resources $B_e$.
Output: The admitted request set and the results.
for each request $r_k$ do
    According to the SFC required by $r_k$, assign $r_k$ to its corresponding subset $Set_{cate,i}$.
    Calculate the weight $wt_k$ of request $r_k$, let $r_k = (s_k, t_k, b_k, SC_k, S_{aop}, D_k, wt_k)$
end for
Rank the requests in each subset $Set_{cate,i}$ by weight $wt_k$.
while the remaining resources in $G$ are sufficient do
    for each category $Set_{cate,i}$ do
        if $Set_{cate,i}$ is processed for the first time then
            Construct the auxiliary graph $PMCD_i$ for this $Set_{cate,i}$.
        end if
        Employ Algorithm 2 to process the first request $r_k$ in $Set_{cate,i}$.
        if Algorithm 2 returns 0 then
            reject $r_k$.
        else
            admit $r_k$ and update the resources in $G$ and $PMCD_i$.
        end if
        Remove $r_k$ from $Set_{cate,i}$.
    end for
end while
return the admitted request set and the results.

There are two points to note. First, the unicast requests of the same category may have different source nodes and different targets. Therefore, we need to adjust the auxiliary graph after each unicast request is accepted, i.e., remove the source and target nodes of the previous request and add the source and target nodes of the current request. This means that instead of constructing a new auxiliary graph, we adjust the already constructed auxiliary graph before accepting the next unicast request $r_{k+1}$. This update operation continues in each iteration until no more requests can be accepted in this category. Second, the auxiliary graph PMCD needs to be constructed only once when processing requests of the same type, but it must be reconstructed in case of insufficient network resources or delay conflicts.
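The grouping, weighting, and round-robin processing described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: requests are hypothetical dictionaries with keys `id`, `sfc`, `b`, and `c`; categories are keyed by the SFC tuple; and `try_admit` is a hypothetical callback standing in for Algorithm 2 together with the resource bookkeeping. The loop here runs until every category is empty, leaving the resource check to the callback.

```python
from collections import defaultdict, deque


def request_weight(bandwidth, computing):
    """wt_k = (b_k + c_k^v) / 2, as defined in the text."""
    return 0.5 * (bandwidth + computing)


def schedule(requests, try_admit):
    """Group requests by SFC, sort each group by weight, then serve the
    categories in round-robin order, lightest request first."""
    categories = defaultdict(list)
    for r in requests:
        categories[tuple(r["sfc"])].append(r)          # Step 1: classify
    queues = [
        deque(sorted(group, key=lambda r: request_weight(r["b"], r["c"])))
        for group in categories.values()               # Step 2: sort by weight
    ]
    admitted, rejected = [], []
    while any(queues):                                 # Step 3: round-robin
        for q in queues:
            if not q:
                continue
            r = q.popleft()
            (admitted if try_admit(r) else rejected).append(r["id"])
    return admitted, rejected


def make_budget_admitter(capacity):
    """Toy stand-in for Algorithm 2: admit while a single shared budget lasts."""
    remaining = {"cap": capacity}

    def try_admit(r):
        need = r["b"] + r["c"]
        if need > remaining["cap"]:
            return False
        remaining["cap"] -= need
        return True

    return try_admit


# Hypothetical requests in two SFC categories.
reqs = [
    {"id": "r1", "sfc": ("fw", "nat"), "b": 2, "c": 4},
    {"id": "r2", "sfc": ("fw", "nat"), "b": 1, "c": 1},
    {"id": "r3", "sfc": ("ids",), "b": 3, "c": 3},
    {"id": "r4", "sfc": ("ids",), "b": 5, "c": 5},
]
admitted, rejected = schedule(reqs, make_budget_admitter(15))
```

With a budget of 15 units, the processing order is r2, r3, r1, r4, and only r4 is rejected. Handling the lightest requests first leaves room for more admissions, which is the intuition behind sorting by $wt_k$.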
Theorem 5: Given a network $G = (V, E)$ and a set of NFV-enabled unicast requests, where each unicast request $r_k = (s_k, t_k, b_k, SC_k, S_{aop}, D_k)$ requires transferring an amount $b_k$ of data from its source to its destination through the VNFs in $SC_k$ with an end-to-end delay requirement $D_k$, the proposed Algorithm 3 can obtain a feasible solution in $O((N_{cate} \cdot |V| + N_{req} \cdot \log^4 |V|) \cdot |V|^2)$ time, where $N_{cate}$ is the number of categories in this group of requests and $N_{req}$ is the total number of requests.
Proof: To prove that the result generated by Algorithm 3 is feasible, we need to prove that the classification of the unicast requests does not affect the feasibility of Algorithm 2. Suppose the unicast request that Algorithm 3 is currently processing is $r_{k+1}$. If its previous request $r_k$ is admitted, the corresponding resources on the server node of each VNF used to process the traffic of $r_k$ are updated. Otherwise, the corresponding server nodes and links on the auxiliary graph do not change. In addition, Algorithm 3 ensures that the corresponding PMCD is updated every time a request is admitted and is reconstructed when the resources are insufficient. Recall that Theorem 4 has shown that Algorithm 2 generates a feasible solution for a single request. Therefore, the set of unicast requests admitted by Algorithm 3 is feasible.
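We then analyze the time complexity of the proposed Algorithm 3. First, we classify a given set of NFV-enabled unicast requests. It is assumed that in the implementation of Algorithm 3, the auxiliary graphs (PMCDs) for requests with the same SFC requirements are the same. Suppose $N_{req}$ requests are considered, and these $N_{req}$ requests can be divided into $N_{cate}$ categories. Therefore, constructing the $N_{cate}$ auxiliary graphs takes $O(N_{cate} \cdot |V|^3)$ time by Theorem 2, and processing the $N_{req}$ requests with Algorithm 2 takes $O(N_{req} \cdot |V|^2 \log^4 |V|)$ time, which gives an overall running time of $O((N_{cate} \cdot |V| + N_{req} \cdot \log^4 |V|) \cdot |V|^2)$.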

VI. EXPERIMENTAL EVALUATION
In this section, we evaluate the performance of the proposed algorithms through extensive simulations on both synthetic networks and a real network.

A. EXPERIMENT SETTING
To evaluate the proposed algorithms, we conduct experiments using the networks generated by the ER random graph construction algorithm and the real network topology as in GÉANT [33], respectively. The settings of the network parameters are given as follows.
• Network size: Following the previous work [9], [33], we set the network size in the range of [50, 250], covering both small and large networks.
• Server node ratio: the number of server nodes is set as 10% of the total network nodes.
• Computing resources: the computing resources of the server nodes are randomly generated in the range of [4000, 12000].
• Unicast request task: For each unicast request, the source node and the destination node are randomly selected from the network. The bandwidth $b_k$ consumed by each request is in the range of [1, 5].
• VNF deployment cost: the cost of deploying function $f_i$ on a server node is in the range of [1, 25].
• SFC size: According to [34], the range of the SFC length in the unicast request is set as [5, 20].

For comparison, the following algorithms are used as baselines in the evaluation.
Firstly, we compare Algorithm 1 (Min-path column-by-column iterative search algorithm, MCIS) with two other benchmark algorithms: i) the existing ASR algorithm in [35], which can handle the NFV-enabled unicast request by mapping different resource usages in the SDN network to the edge weights in an auxiliary graph; ii) the existing UNICAST algorithm in [8], which proposes a general framework for solving the NFV-enabled unicast request with fixed order.
Secondly, the two baseline algorithms for Algorithm 2 (Heu_Lag for short) are as follows: i) ASR_Delay, which is based on ASR and directly discards any request that cannot meet the delay constraint; ii) the existing UNICAST_Delay algorithm in [8], which employs the Lagrange relaxation algorithm to balance the delay and resource constraints to meet the delay constraint.
Thirdly, we compare Algorithm 3 (referred to as Heu_GW) with the following two baseline algorithms: i) the existing algorithm GAP_Alg in [35], which employs the idea of the GAP problem to deal with the throughput optimization problem; ii) Order_Alg, which processes the unicast requests one by one in their original front-to-back order until the resources can no longer admit more requests.
All the algorithms are evaluated on a computer configured with an Intel CPU 11700 with a maximum frequency of 4.8 GHz and 16 GB of memory (RAM). Additionally, since the baseline algorithms cannot handle adjustable order, we randomly choose a function deployment order for these algorithms in each run. We conduct experiments using the generated random graphs of different sizes, taking the final delivery cost and the running time of the algorithm as performance metrics. Furthermore, we perform experiments on the real network topology GÉANT.
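As an illustration of the settings above, the following Python sketch generates one synthetic instance and one request. It is only a sketch under several assumptions that the text does not specify: values are drawn uniformly from the stated ranges, the ER edge probability (`edge_prob`) and the size of the function catalogue (`num_functions`) are arbitrary placeholders, the source and destination are taken to be distinct, and connectivity of the generated graph is not enforced.

```python
import random


def generate_instance(num_nodes, edge_prob=0.1, num_functions=30, seed=0):
    """Build a hypothetical ER-style instance following the listed parameter ranges."""
    rng = random.Random(seed)
    nodes = list(range(num_nodes))
    # ER random graph: each undirected node pair is linked with probability edge_prob.
    edges = [(u, v) for u in nodes for v in nodes
             if u < v and rng.random() < edge_prob]
    # 10% of the nodes act as server nodes.
    servers = rng.sample(nodes, max(1, num_nodes // 10))
    # Computing resources in [4000, 12000].
    capacity = {v: rng.uniform(4000, 12000) for v in servers}
    # Cost of deploying function f on server v, in [1, 25].
    deploy_cost = {(f, v): rng.uniform(1, 25)
                   for f in range(num_functions) for v in servers}
    return {"nodes": nodes, "edges": edges, "servers": servers,
            "capacity": capacity, "deploy_cost": deploy_cost,
            "num_functions": num_functions}


def generate_request(instance, rng):
    """Draw one unicast request: endpoints, bandwidth in [1, 5], SFC length in [5, 20]."""
    s, t = rng.sample(instance["nodes"], 2)
    length = rng.randint(5, 20)
    sfc = rng.sample(range(instance["num_functions"]), length)
    return {"s": s, "t": t, "b": rng.uniform(1, 5), "sfc": sfc}


instance = generate_instance(50)
request = generate_request(instance, random.Random(1))
```

Fixing the seed makes a run repeatable, which helps when comparing several algorithms on the same instance.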

B. EXPERIMENT ON SYNTHETIC NETWORK
In this group of experiments, we use synthetic networks to evaluate the proposed methods, varying the network size from 50 to 300.

1) PERFORMANCE WITHOUT DELAY CONSTRAINT
Firstly, we compare the total traffic delivery cost and running time of each method without delay constraint. Fig. 7(a) shows the total traffic delivery cost under different network sizes. One can observe that the total traffic delivery cost of the proposed MCIS algorithm is much lower than that of ASR and UNICAST: it is 143.06% lower than that of ASR and 22.21% lower than that of UNICAST. This is because ASR only considers deploying all VNFs on one server node, which results in much worse performance. Compared to UNICAST, MCIS can further reduce the cost by considering the adjustable order of SFCs. Fig. 7(b) shows the average running time of each method. One can find that the running time of MCIS is slightly lower than that of UNICAST, while the average running time of ASR is much lower than that of the other two algorithms. This is because ASR simply deploys all functions on one server node, which means that ASR does not need to construct a complex auxiliary graph to calculate the deployment of VNFs.

2) PERFORMANCE WITH DELAY CONSTRAINT
Secondly, we evaluate the performance of each method with delay constraint. Fig. 8(a) shows the total traffic delivery cost under different network sizes. The total traffic delivery cost of the proposed Heu_Lag is about 117.06% lower than that of ASR_Delay and 18.27% lower than that of UNICAST_Delay in this case, which demonstrates the efficiency of the proposed methods. As shown in Fig. 8(b), the average running time of Heu_Lag is about 15.04% less than that of UNICAST_Delay, while the average running time of ASR_Delay is the shortest. This is because ASR_Delay only deploys all VNFs on a single server and directly rejects the requests that do not meet the delay constraint, which also results in much worse cost performance.

3) PERFORMANCE OF THROUGHPUT
Fig. 9 compares the numbers of admitted requests of Heu_GW, GAP_Alg and Order_Alg. In the experiments, we restrict the number of requests in a group to 150. As the size of the network increases, the total amount of network resources provided increases, and so does the number of requests admitted by each algorithm. The average number of requests accepted by the proposed Heu_GW algorithm is about 11.9% higher than that of GAP_Alg, and about 105.7% higher than that of Order_Alg. This demonstrates the efficiency of the proposed algorithms.

C. EXPERIMENTS ON GÉANT
In this group of experiments, we employ the real network topology of GÉANT to evaluate the performance of the proposed methods by changing the length of the SFC.

1) PERFORMANCE WITHOUT DELAY CONSTRAINT
Fig. 10(a) shows the total traffic delivery cost of each method without delay constraint. In this group of experiments, the length of the SFC varies from 5 to 20, while the number of functions with adjustable order is fixed at 3 and the number of server nodes is 5. One can observe that, when the length of the SFC increases, the total delivery cost of ASR increases greatly, while those of MCIS and UNICAST are almost unchanged. The total delivery cost of MCIS is about 2.31% lower than that of UNICAST. For the running time, which is shown in Fig. 10(b), the average running time of MCIS is about 36.26% lower than that of UNICAST, while that of ASR is the lowest. This is because ASR simply deploys all functions on one server node.

2) PERFORMANCE WITH DELAY CONSTRAINT
Considering the delay constraint, we evaluate the impact of the length of the SFC on each method, while the number of functions with adjustable order is fixed at 3 and the number of server nodes is 5. Fig. 11(a) displays the total traffic delivery cost of each method with delay constraint. The total traffic delivery cost of Heu_Lag is 38.73% lower than that of ASR_Delay. In Fig. 11(b), one can observe that the average running time of Heu_Lag is about 50.89% less than that of UNICAST_Delay, while the average running time of ASR_Delay is lower than that of the other two algorithms.

3) PERFORMANCE OF THROUGHPUT
Last, we compare the numbers of admitted requests of Heu_GW, GAP_Alg and Order_Alg in the GÉANT topology. The number of requests in a group is restricted to 150. The number of requests admitted by each method is shown in Fig. 12. As the length of the SFC increases, the network resources required by each request also increase, which is why the number of requests admitted by each algorithm decreases. The average number of requests accepted by the proposed Heu_GW algorithm is about 9.63% higher than that of GAP_Alg, and about 41.72% higher than that of Order_Alg, which demonstrates the efficiency of the proposed algorithms.

VII. CONCLUSION
In this paper, we present the first work on NFV-enabled unicasting with adjustable SFCs. First, we propose a general PMCD transformation framework for NFV-enabled unicast problems in which the SFCs have adjustable order. To minimize the total traffic delivery cost, a min-cost column-by-column iterative search algorithm is proposed for the case without an end-to-end delay constraint. An efficient algorithm with theoretical guarantee is also proposed for NFV-enabled unicasting with delay constraint. When there exist multiple unicast requests, we design an efficient algorithm that accommodates as many requests as possible in order to maximize network throughput. Finally, we evaluate the performance of the proposed algorithms through extensive simulations.

YONGCHAO TAO received the B.S. and master's degrees from the School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China. He is currently a Senior Engineer with the Shenzhen Academy of Aerospace Technology, Shenzhen, China. His research interests include wearable and ubiquitous computing, and IoT networks.

JIZHOU SUN received the bachelor's degree from the Nanjing University of Aeronautics and Astronautics, Nanjing, China, in 2009, the master's degree from China Aerospace Science and Industry Corporation, Beijing, China, in 2012, and the Doctor of Philosophy degree in computer software and theory from the Harbin Institute of Technology, Harbin, China, in 2020. He is currently a Lecturer with the Huaiyin Institute of Technology, Huaian, China. His research interests mainly include massive data computing, data cleaning, data management, and time series data.