Delay-Aware SFC Provisioning in Hybrid Fog-Cloud Computing Architectures

Network function virtualization (NFV) technology enables service providers to implement software-based network processing functions on standard computing servers (nodes). This approach requires mapping the virtual network functions (VNFs) of service function chains (SFCs) onto these nodes for incoming service requests. Traditional VNF mapping schemes use cloud nodes for their abundant resources, at the cost of prolonged network delays. Alternative schemes use fog nodes that offer reduced delays, at the cost of limited resources. Hence, this work proposes a novel SFC provisioning scheme for a hybrid fog-cloud architecture with heterogeneous resources. The architecture is composed of a single fog layer and a single cloud layer, which accommodate delay-sensitive and delay-tolerant requirements, respectively, for large numbers of incoming requests. When implemented on the proposed hybrid architecture, this scheme yields a tradeoff between standalone cloud and fog solutions in terms of the number of satisfied requests, network delay, resource consumption, energy consumption, and realization cost at large traffic volumes. The proposed scheme achieves 15–40% higher traffic capacity than fog solutions, as well as 21–43% reduced delay, 45–52% lower energy consumption, and 28–30% lower cost compared to cloud solutions.


I. INTRODUCTION
Cloud computing technology has emerged as a distributed paradigm that provides a large-scale and dynamic pool of host infrastructure [1]. This technology brings multiple advantages, such as flexibility, scalability, and efficient resource utilization. Here the computing resources are shared among multiple networks of different applications and traffic volumes. This allows service providers to provision applications at reduced cost by eliminating on-site hardware and maintenance. However, cloud computing suffers from increased network delays for time-sensitive applications, e.g., real-time online gaming. These delays can in turn result in service outages, network congestion under high traffic, limited bandwidth, as well as security and privacy challenges. For instance, cloud computing has recently been considered as a potential solution for underlying Internet of Things (IoT) infrastructure, to benefit from its abundant storage capabilities [2].
The associate editor coordinating the review of this manuscript and approving it for publication was Aniello Castiglione .
However, it can be inefficient for time-sensitive applications due to the introduced delays, which can degrade quality-of-service (QoS) and thus reduce network performance. Therefore, alternative computing solutions are necessary for time-sensitive applications.
Fog computing architectures [3] have been proposed to overcome the aforementioned cloud limitations. This is achieved by utilizing edge devices to locally perform a substantial amount of the network computing and storage (memory) requirements. Hence, this approach acts as a service-oriented intermediate interface between terminals (end-users) and cloud nodes. A major advantage here is reducing the propagation delays associated with links connecting distant cloud nodes. It also alleviates extended bandwidth usage on these links. In spite of that, fog nodes feature low computing and storage capabilities at high traffic volumes, i.e., limited capacity. This limits their support of high-bandwidth applications.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Hence this paper proposes a hybrid fog-cloud (HFC) architecture that merges fog and cloud technologies, as depicted in Figure 1. It combines the advantages of both technologies in one architecture with reduced limitations. This architecture can then support different types of applications and services. Namely, it aims to process specific application components at the fog nodes with reduced network delays and low bandwidth usage, while providing high-bandwidth capabilities at the cloud nodes at higher delays. Specifically, if a request demands a delay-sensitive service, then it is mapped on the fog nodes, thereby achieving reduced delays. This reduces mapping on the cloud nodes, which minimizes traffic bottlenecks in the fronthaul of the network. Meanwhile, if a request demands a delay-tolerant service, then it is mapped on the fog cluster heads (FCHs) or the cloud nodes, since it has no stringent delay requirements.
Here, mapping delay-tolerant requests on the FCHs or cloud nodes keeps the fog node resources better utilized for future incoming delay-sensitive requests.

Furthermore, NFV is another key technology that decouples network functions (NFs) from proprietary hardware (nodes) [4]. Namely, it enables VNFs to run as software instances over dispersed cloud or fog nodes at various locations, in order to execute specific NFs such as firewall, DNS, caching, and evolved packet core functions. A major design task here is VNF mapping onto the underlying cloud or fog nodes, in order to realize the network virtualization paradigm. Moreover, the VNFs here are often chained together in a certain sequence specified by terminal requests, thereby forming an SFC. Performing SFC provisioning on NFV-based fog or cloud nodes with efficient resource utilization is a challenging task, and has been investigated by various efforts as presented in Section II.

This paper is organized as follows. Section II reviews recent efforts on SFC provisioning and VNF mapping, along with existing fog/cloud architectures, and the proposed contributions. Section III presents the novel multi-layer HFC architecture. This is followed by the system model in Section IV, composed of a proposed terminal request model, design metrics, and assumptions. The SFC provisioning scheme is then proposed in Section V. Then the network setup, performance evaluation, and simulation results are presented in Section VI. Finally, Section VII presents the concluding remarks and future directions.

A. SURVEY OF SFC PROVISIONING SCHEMES IN CLOUD ARCHITECTURES
A wide range of studies have been proposed to achieve SFC provisioning objectives in cloud architectures; see the recent survey articles [5], [6] for a detailed review. First, global optimization techniques have been widely proposed to solve the problem of SFC provisioning in cloud computing. Foremost, the authors in [7] propose a mixed-integer linear programming (MILP) model and a heuristic-based algorithm to jointly optimize NFV resource allocation across SFC composition, VNF forwarding graph embedding, and VNF scheduling. Furthermore, the work in [8] formulates the provisioning problem as an optimization model and proposes a mixed-integer quadratically constrained program (MIQCP) solution. Also, multiple dependent directed acyclic graph schemes are presented in [9]-[11]. These schemes consider the priority dependence between nodes at reduced usage of link bandwidth. An integer linear program (ILP) scheme is proposed in [9] and [10] that maps requests of multiple instances onto a single node to minimize resource consumption. However, the aforementioned schemes in general require traffic to traverse longer paths to reach the nodes hosting the VNFs, thus yielding increased network delays and bandwidth consumption. Moreover, the work in [11] proposes a graph-based heuristic that combines graph centrality and multi-stage graphs for the placement of service function chains. The authors in [12] implement a scalable SFC placement by proposing an eigendecomposition-based approach for VNF mapping on the cloud nodes. The proposed method reduces complexity, with convergence times that essentially depend only on the physical graph sizes. Finally, the work in [13] formulates the physical network and the SFC request as two weighted graphs and casts the SFC placement problem in the NFV environment as graph matching and VNF mapping. The authors then propose an LP-based approach and a Hungarian-based algorithm to solve the graph matching and SFC mapping problem.

B. SURVEY OF SFC PROVISIONING IN HYBRID FOG-CLOUD ARCHITECTURES
The work in [14] integrates NFV with fog computing to leverage the advantages of both technologies in support of a handover scheme for 5G systems. This integration achieves reduced overhead and network flexibility for fog nodes with fixed caches. Also, a VNF mapping approach is proposed in [15] that accounts for stringent delay constraints. The mapping problem is formulated as a graph-clustering optimization model, and a genetic algorithm is applied to reduce cost. However, the aforementioned schemes lack SFC provisioning.
Few studies have looked into HFC architectures along with VNF mapping. Foremost, the authors in [16] propose an architecture for platform-as-a-service (PaaS) that splits VNF mapping onto cloud and fog nodes to measure network delay. However, this study is limited to a single VNF and lacks SFC requirements. The authors in [17] also propose to span VNF components between cloud and fog infrastructure for IoT healthcare applications. This work focuses only on mechanisms for providing control, signaling, and data interfaces between the cloud and fog nodes. Also, the work in [18] proposes to migrate and monitor application components between cloud and fog nodes, while considering the tradeoff between power consumption and delay. Here the mapping problem is again formulated as an optimization model. Then an approximate solution is proposed that decomposes the primary problem into three subproblems (solved independently). This work achieves reduced response times and enhanced throughput at the expense of high computation resources. However, it lacks VNF mapping onto any of the nodes.
Moreover, metaheuristic schemes have been leveraged in fog architectures to solve the SFC provisioning problem. For instance, the work in [19] proposes a Tabu search VNF mapping scheme, where the feasible region consisting of all nodes is divided into smaller, equally-sized adjacent sub-regions (of fewer nodes). The scheme then conducts an inner pattern search to select a host node based upon the shortest and least-loaded path to the source.
Additionally, application component placement in an NFV-based HFC architecture is presented in [20] and [21]. Placement decisions are determined by an ILP solver to achieve cost minimization. This work is limited to small-scale scenarios and considers a single VNF type in a single fog layer. The authors in [22] present a multi-layer fog and cloud architecture for video-streaming applications, as opposed to single-layer fog solutions. Three layers are classified here in terms of their coverage, computing, and storage capacities. Nonetheless, this work only demonstrates suitable services for multi-layer architectures. It lacks SFC provisioning and is limited to video-streaming applications. Also, the three-layer model here can increase realization costs.
Furthermore, some studies leverage software-defined networking (SDN) technology in fog and cloud architectures. Foremost, the authors in [23] integrate cloud and fog computing in conjunction with SDN and NFV for 5G systems based on an SFC model. The work here only considers the type of hypervisors, virtualization, and security issues. It lacks consideration of the constrained resources in the single fog layer. Moreover, it does not account for delay-sensitive applications. Additionally, the work in [24] studies the benefits associated with service migration from cloud to multi-layer fog nodes based on SDN for video distribution applications. Namely, it measures the required time for migration between the different layers in an effort to reduce traffic at the core network. However, this study does not consider resource constraints in highly congested multi-layer fog nodes. It also lacks consideration of delay-tolerant applications with high capacity demands, as well as VNF mapping. Moreover, the authors in [25] use a heuristic scheme based on ILP to provide SFCs in SDN-based networks with a failure recovery scheme. This work considers computational complexity and average failure probability in the selected paths, in addition to link and server utilization. Finally, a combined architecture based on SDN and fog computing is presented in [26] for vehicular ad-hoc networks (VANETs) with delay-sensitive and location-aware services. Also, the work in [27] proposes an edge architecture for vehicular service composition that aims to replace conventional vehicular cloud solutions in next-generation networks. The architecture is based on vehicular service clouds (VSC) that are available on-the-fly and accessed as per the needs of vehicular users. This provides an intra-vehicle resource sharing model that delivers a wide range of cloud services, such as on-demand entertainment and speech recognition for driver assistance, at reduced latencies.
The overall objective here is to achieve low latency based on reduced end-to-end task completion times.
Moreover, the authors in [28] propose a cooperative communication scheme for fog-to-fog and fog-to-cloud nodes. The scheme provides a range of terminal-defined services that involve a one-time requester/provider interaction. Hence, short-term service level agreements (SLAs) are developed based upon a Tabu search heuristic method that uses previous solutions when selecting new optimal choices. However, this work focuses on establishing workflows for new service requests. It lacks the definition of the VNFs for the terminal-defined multimedia services, and it does not present an SFC model on the fog or cloud nodes. Finally, the work in [29] proposes an SLA-aware fine-grained QoS provisioning (SFQP) scheme for multi-tenant software-defined networks (MTSDN), i.e., a service platform that manages and shares VNFs between tenants and terminals. The proposed scheme automatically extracts the eigen-characteristics of each packet via an application-aware methodology. It also leverages the K-nearest neighbor (KNN) algorithm to predict SLA popularity.

C. MOTIVATIONS AND CONTRIBUTIONS
The aforementioned SFC provisioning schemes on cloud nodes map the VNFs without accounting for the delay requirements of incoming requests. Note that applying these schemes to fog or hybrid fog-cloud (HFC) architectures can yield various limitations, due to the heterogeneous nodes with different available resources. These schemes also lack network delay and cost models for SFC deployment. Moreover, the deployment and communication costs here are highly dependent on the node location at which the VNF is hosted, making the provisioning task more challenging. The lack of SFC provisioning in HFC architectures presents a major motivation for this work. Specifically, existing efforts often focus on dependency-unaware VNF mapping schemes with application component splitting between fog and cloud nodes. Hence, this paper proposes a novel SFC scheme for application components in a synergetic HFC paradigm, termed delay-aware fog-cloud (DAFC) provisioning. Namely, the proposed scheme accounts for the ultra-low latency requirements of real-time applications, where requests are categorized as either delay-sensitive or delay-tolerant based upon their delay requirements.
The goal here is to achieve SFC provisioning for HFC architectures that yields practical advantages to network operators and regulators, as specified in the fog computing standard of the OpenFog Consortium [30]. This includes reduced network delays and latencies, minimized energy consumption, and lower realization costs. This work also enables efficient and dynamic resource utilization in terms of processing usage (e.g., CPU usage), memory, and link bandwidth, hence achieving a high number of satisfied requests (reducing access attempts) and thus increasing network capacity and scalability.

III. PROPOSED HYBRID FOG-CLOUD ARCHITECTURE
The proposed hierarchical HFC architecture is composed of three layers: terminals in Layer I, fog nodes and fog cluster heads in Layer II, and cloud nodes in Layer III, as detailed next; see Figure 1 [31].

A. LAYER I (TERMINALS)
This layer is composed of a set of terminals that connect to the substrate network via radio links, such as mobile devices and vehicles, as well as stationary devices. Terminals here initiate SFC requests.

B. LAYER II (FOG NODES)
This layer consists of resource-constrained fog nodes that cooperatively process, compute, and temporarily store received requests from nearby terminals within the fog node footprint (coverage area). Adjacent fog nodes in a specific geographical area form fog clusters. The nodes in each cluster communicate with each other to exchange traffic and available-resource information. The fog nodes in each cluster are interconnected with fog cluster heads (FCHs) of higher computing and storage resources. The FCH allows traffic offloading to adjacent nodes that possess abundant resources, either within the same cluster or among neighboring cluster members. The FCH can also be used to host medium-delay or delay-tolerant services for requests relayed from the fog nodes (which only host delay-sensitive requests). Consequently, this saves resources in the fog nodes to accommodate additional delay-sensitive requests. This layer also includes multiple devices, such as edge routers, gateways, switches, access points, and set-top boxes.

C. LAYER III (CLOUD NODES)
This is the highest layer, composed of high-end cloud nodes connected to the fog nodes in the lower layers via a high-bandwidth network backbone. These nodes feature very high computing resources to process and store enormous amounts of data, at the cost of high network delays. Hence this layer is utilized for delay-tolerant requests demanding high resource consumption.

IV. SYSTEM MODEL
Consider the following assumptions for the proposed DAFC provisioning scheme, implemented on the aforementioned HFC architecture. I) The incoming service requests from terminals in Layer I are first processed at the fog nodes in Layer II through a one-hop or multi-hop connection. II) Requests are classified into two different types based upon their delay requirements, namely delay-sensitive and delay-tolerant requests. III) Different VNFs in the requests demand various amounts of resources, such as processing and storage resources, as well as delay. IV) The total capacity at each fog or cloud node is represented by the total amount of available resources at the node. The various design models are presented next.

A. MULTI-LAYER SUBSTRATE NETWORK TOPOLOGY
The multi-layer substrate network for the proposed HFC architecture (Figure 1) is modeled as a set of nodes interconnected by edges. Here an edge is denoted by e, and it can be a link between a terminal and a fog node, fog cluster head, or cloud node, denoted by e_{t,f}, e_{t,h}, and e_{t,c}, respectively. Also, it can be a link between a fog node and an FCH or a cloud node, denoted by e_{f,h} and e_{f,c}, respectively. Moreover, it can be a link between a cluster head and a cloud node, e_{h,c}. Finally, it can be a link between fog nodes, e_{f,f}, cluster heads, e_{h,h}, or cloud nodes, e_{c,c}. Furthermore, each node can host one or more VNFs, v_u ∈ V_r, i.e., u = 1, 2, ..., U, where U denotes the total number of VNF types, and V_r is the set of all required VNFs in request r. Moreover, the available substrate resources are bounded by a finite set of constraints. This is necessary for practical and realistic operational settings. Namely, nodes at higher layer numbers possess higher overall resource capacities; for example, cloud nodes have much higher resources than fog cluster heads. Each node has a specific computing capability, which contributes to the amount of available resources in its layer. These resources are represented as a set of three main attributes, i.e., processing (CPU), memory, and bandwidth. Here, the total memory at any node is bounded by Q_me(n), and the total processing capacity is bounded by Q_pr(n). Also, the available link bandwidth on a substrate link between two nodes is bounded by B(e). Note that a key saliency of the proposed scheme is scalability to larger numbers of node or link resources at high traffic volumes.
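The topology above can be encoded as a minimal sketch; the node names, capacity numbers, and dict layout below are illustrative assumptions rather than values from the paper, but they mirror the fog < FCH < cloud ordering of resource capacities and the bounded link bandwidth B(e).

```python
# Hypothetical encoding of the multi-layer substrate network: per-node CPU/memory
# bounds Q_pr(n)/Q_me(n) and per-link bandwidth B(e). All figures are assumed.
nodes = {
    "f1": {"layer": 2, "cpu": 8,   "mem": 16},    # fog node (Layer II)
    "h1": {"layer": 2, "cpu": 64,  "mem": 128},   # fog cluster head (Layer II)
    "c1": {"layer": 3, "cpu": 512, "mem": 1024},  # cloud node (Layer III)
}
links = {  # undirected edges keyed by endpoint pair, B(e) in Mb/s
    ("f1", "h1"): 100.0,   # e_{f,h}
    ("h1", "c1"): 1000.0,  # e_{h,c}
}

def bandwidth(u, v):
    """Available bandwidth B(e) on the substrate link between u and v (0 if absent)."""
    bw = links.get((u, v))
    if bw is None:
        bw = links.get((v, u), 0.0)
    return bw
```

A mapping algorithm would query `bandwidth` when checking whether a substrate edge can carry a virtual link, regardless of endpoint ordering.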

B. TERMINAL REQUEST MODEL
A novel model is developed here for a request r ∈ R with specific resource and delay requirements, where R is the set of total requests from terminals. Each request r is expressed by the 6-tuple r = <src, dst, V_r, b_r, δ_r, L_r>, where src ∈ N denotes the source node (a terminal in Layer I), and dst ∈ N denotes the destination node hosting the last VNF in the SFC. Also, V_r denotes the set of desired VNF types, v_u, ordered in the SFC of request r and drawn from the U VNF types defined in the network. Furthermore, each hosted VNF in the network nodes requires specific processing Q_pr(v_u) and memory Q_me(v_u) resources, i.e., Q_pr(v_u) ∈ Z+ and Q_me(v_u) ∈ Z+. The variable b_r is the required link bandwidth to interconnect the VNFs in the SFC, δ_r denotes the network delay requirement for the request (either delay-sensitive or delay-tolerant), and L_r represents the service lifetime. Note here that different numbers of instances of the same VNF can be generated from the requests R; hence the same VNF can be shared on the same node by more than one request. Figure 2 shows various SFC provisioning examples on the proposed HFC architecture for requests demanding various delay and resource levels, mapped on nodes in Layers II and III, e.g., r_1, r_2, and r_3. First, a delay-sensitive request r_1 is received at a nearby fog node from the terminal layer with a dependency requirement of v_1 → v_3 → v_2. Hence it is required here to map v_1 first, then v_3, followed by v_2. These VNFs are all mapped on the fog layer in order to achieve the delay requirements.
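The 6-tuple request model can be sketched as a small data structure; the class and field names are assumptions for illustration, since the paper defines the tuple but not an implementation.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SFCRequest:
    """Sketch of the 6-tuple r = <src, dst, V_r, b_r, delta_r, L_r>."""
    src: str                 # source node (terminal in Layer I)
    dst: str                 # destination node hosting the last VNF in the SFC
    vnf_chain: List[str]     # V_r: ordered VNF types forming the SFC
    bandwidth: float         # b_r: required link bandwidth between VNFs
    delay_bound_ms: float    # delta_r: network delay requirement
    lifetime_s: float        # L_r: service lifetime

    def is_delay_sensitive(self, threshold_ms: float) -> bool:
        # Classification used by DAFC: delay-sensitive if delta_r < delta_th
        return self.delay_bound_ms < threshold_ms

# Example mirroring r_1 in Figure 2: dependency v1 -> v3 -> v2 (values assumed)
r1 = SFCRequest("t1", "f3", ["v1", "v3", "v2"], 10.0, 5.0, 60.0)
```

The ordered `vnf_chain` captures the dependency requirement, so a mapper can iterate it front to back when placing VNFs.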
The red dotted lines in Figure 2 show the SFC route for r_1, initiated from the terminal and passing through the hosting fog nodes to the final destination. Similarly, the delay-sensitive request r_2 has a dependency requirement of v_3 → v_4, i.e., mapping v_3 followed by v_4. Lastly, the received delay-tolerant request r_3 features a dependency requirement of v_4 → v_5. Hence VNF v_4 is mapped first on an FCH in Layer II, followed by the next VNF in the SFC, v_5, which is mapped on a cloud node. Each VNF is mapped on a nearby node that satisfies the least delay and cost while achieving the shortest path. Here the green dotted lines depict the SFC route from the terminal through the hosting nodes (FCH and cloud) to the destination.
Prior to presenting the SFC provisioning scheme, key design measures that consider the resource requirements and availability are taken into account; in particular, the node and edge capacity requirements, as detailed next.

C. NODES CAPACITY REQUIREMENTS
Each VNF, v_u, in the request list is mapped to either a fog node n_f^i, fog cluster head n_h^k, or cloud node n_c^l on the substrate network, n ∈ N − {n_t^i}, that has enough available computing Q_pr(n) and memory Q_me(n) resources. A key condition here is that the sum of the processing and memory capacities required by the VNF instances mapped to a node cannot exceed the amount of available physical resources. This in turn avoids node overloading and request drops. In notation, λ_r^{v_u} denotes the number of VNFs of a specific type in request r mapped to a node, and S_{v_u} is the set of instances of the VNF of type v_u.

D. EDGE CAPACITY REQUIREMENTS
Here, each link between two consecutive VNF nodes in the SFC request must be mapped to a substrate link e ∈ E with enough available bandwidth B(e), i.e., larger than the request bandwidth b_r. This is necessary in order to carry a sufficient amount of flow (required by the SFC requests) at reduced overloading. In this model, r_e denotes the number of virtual links in request r mapped to the network.
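The two capacity constraints above (Sections C and D) can be sketched as feasibility checks; the data layout is an assumption for illustration, but the logic matches the paper's conditions: summed CPU/memory demand on a node must stay within Q_pr(n)/Q_me(n), and each virtual link needs B(e) ≥ b_r.

```python
def node_fits(mapped_vnfs, cpu_cap, mem_cap, new_vnf):
    """Node constraint sketch: total demand of already-mapped VNF instances plus
    the new VNF must not exceed the node's capacities.
    mapped_vnfs: list of (cpu, mem) demands already hosted on the node."""
    cpu_used = sum(c for c, _ in mapped_vnfs)
    mem_used = sum(m for _, m in mapped_vnfs)
    new_cpu, new_mem = new_vnf
    return cpu_used + new_cpu <= cpu_cap and mem_used + new_mem <= mem_cap

def link_fits(available_bw, b_r):
    """Edge constraint sketch: the substrate link must carry the request bandwidth b_r."""
    return available_bw >= b_r
```

A mapper would call both checks before reserving resources, rejecting or rerouting a VNF whenever either constraint fails.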

E. DELAY MODELS
For each request, as data packets travel from the source node through subsequent nodes along the path until reaching the destination node, they encounter multiple types of network delay. In this work, the network delay is composed of the processing, queuing, transmission, and propagation delays, as presented next. Given a delay bound δ_r, the network delay along the designated path, D(p), for request r must satisfy the delay constraint, where p ∈ P is the valid path among the P total paths. This path delay is the sum of D_pr(n), D_qe(n), D_tran(e), and D_prop(e), which represent the node processing, node queuing, link transmission, and link propagation delays, respectively.
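The path-delay model above can be sketched as a simple aggregation; the function signatures are assumptions, but the computation follows the stated decomposition: per-node processing and queuing delays plus per-link transmission and propagation delays, checked against the bound δ_r.

```python
def path_delay(node_delays, link_delays):
    """D(p) sketch: node_delays is a list of (D_pr, D_qe) pairs for nodes on p,
    link_delays a list of (D_tran, D_prop) pairs for edges on p."""
    node_part = sum(pr + qe for pr, qe in node_delays)
    link_part = sum(tr + pg for tr, pg in link_delays)
    return node_part + link_part

def meets_bound(node_delays, link_delays, delta_r):
    """Delay constraint: total path delay must not exceed the request bound delta_r."""
    return path_delay(node_delays, link_delays) <= delta_r
```

DAFC would evaluate `meets_bound` for each candidate route when pruning the set of delay-feasible paths.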

1) PROCESSING DELAY
This value measures the total time required by a node to process a mapped VNF, v_u, in the SFC of request r. The total processing time for the R requests is formulated in terms of A_r, the traffic load per request, and δ_{v_u}(n), the processing rate of VNF type u on a cloud or fog node, expressed in request traffic units per unit time (in ms).

2) QUEUING DELAY
The average queuing delay for each request, D_qe(n), depends on the instantaneous request arrival rate to the queue at the node, ϒ(n) (requests/sec), the processing capacity requirements of the request, Q_pr(r), the service processing rate at node n, δ_pr(n), as well as the transmission rate of the link, δ(e). Moreover, the queuing delay depends on the nature of the traffic, e.g., exponential during peak hours or a Poisson arrival rate. The above parameters form the traffic intensity, the key factor in determining the queuing delay. Here the request processing capacity requirement is the total of the VNF requirements, Q_pr(v_u), ∀v_u ∈ V_r. The traffic intensity starts at a very low value when the number of requests is small (low request arrival rate). Hence when a request arrives, it is unlikely to find another request in the queue. As a result, the average queuing delay is very small (on the order of microseconds). Meanwhile, as the number of requests increases, i.e., at higher arrival rates, the traffic intensity grows until the arrival rate exceeds the transmission rate and processing capacity of the link, and the queuing delay increases accordingly. Note that if the traffic intensity exceeds 1, then the average arrival rate at the queue at node n exceeds the transmission rate of link e. However, the work here assumes a bounded traffic intensity, in order to avoid an unbounded increase in the queuing delay. This in turn avoids network failure for incoming requests whose delay bounds would otherwise be exceeded. This is achieved by keeping the arrival rate less than or equal to the transmission capacity of the link, which makes the service time the major factor determining the queuing delay. Along these lines, the request arrival rate at the node is modeled as a Poisson process, and the queuing delay grows exponentially with load.
This model is characterized by the M/M/n queuing model and the Erlang C formula, thereby achieving a bounded traffic intensity.
Overall, the queuing delay at node n for each request r is formulated using the Erlang C formula E(M(n), ζ(n)) [32], where ζ(n) is the utilization rate of the node and δ_pr(n) is the instantaneous processing rate provided by node n. Here the parameter ψ_r(n) represents the maximum percentage of requests that node n can process, ϒ_max(n), out of the total arriving requests at the same node, ϒ_tot(n). The parameter ϒ_max(n) thus represents the maximum workload at the node, i.e., the maximum request arrival rate that a node can receive, as a fraction of the total arrival rate ϒ_tot(n). It avoids excessive queuing delay when the fog node is heavily loaded.
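The M/M/n queuing model above can be sketched with the standard Erlang C formula; the parameter names (`c` servers, arrival rate `lam`, per-server service rate `mu`) are generic textbook notation rather than the paper's symbols, and the mean-wait expression assumes the usual M/M/c result W = E_C / (c·μ − λ).

```python
from math import factorial

def erlang_c(c, lam, mu):
    """Erlang C probability that an arriving request must wait, for an M/M/c
    queue with c servers, Poisson arrival rate lam, and service rate mu."""
    a = lam / mu        # offered load in Erlangs
    rho = a / c         # utilization; bounded traffic intensity requires rho < 1
    assert rho < 1, "traffic intensity must be bounded below 1"
    top = (a ** c / factorial(c)) * (1 / (1 - rho))
    bottom = sum(a ** k / factorial(k) for k in range(c)) + top
    return top / bottom

def mean_queuing_delay(c, lam, mu):
    """Mean waiting time in queue: D_qe = E_C / (c*mu - lam)."""
    return erlang_c(c, lam, mu) / (c * mu - lam)
```

For a single server (c = 1) the Erlang C probability reduces to the utilization itself, which matches the intuition that the queuing delay vanishes at low traffic intensity and grows sharply as the intensity approaches 1.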

3) TRANSMISSION DELAY
This delay relates to the transmission rate of the link, i.e., the amount of traffic units that are forwarded from one node to another. Note that the work here adopts a first-come-first-served transmission scheme. Therefore, the transmission delay for all requests, D_tran(R), is gauged in terms of δ(e), the link transmission rate, which defines the elapsed forwarding time per traffic unit of the request.

4) PROPAGATION DELAY
This delay represents the time for data to propagate over link e between any two nodes. It is gauged by the separation distance between the nodes divided by the propagation speed of the medium (e.g., wireless, fiber). The overall propagation delay for request r accounts for all interconnecting links from the source node, through the nodes hosting the VNFs, to the destination node (i.e., the propagation delay over all links joining the nodes on which VNFs are mapped).
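The two link-level delays above reduce to simple ratios; the units and the fiber propagation speed below are assumptions for illustration, not values from the paper.

```python
def transmission_delay(traffic_bits, link_rate_bps):
    """D_tran sketch: traffic volume divided by the link transmission rate delta(e)."""
    return traffic_bits / link_rate_bps

def propagation_delay(distance_m, speed_mps=2e8):
    """D_prop sketch: node separation distance over the medium's propagation speed.
    2e8 m/s (~2/3 of c, typical for fiber) is an assumed default."""
    return distance_m / speed_mps
```

For a request, the per-link values would be summed over every edge on the SFC route, from the source through the hosting nodes to the destination.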

F. COST MODELS
The cost model includes deployment, processing, and communication costs, whose summation forms the overall realization cost of SFC provisioning in the different network architectures: fog, cloud, or HFC.

1) DEPLOYMENT COST
This is the total license cost of deploying the VNF software instances, where ρ(v_u) is the license cost of VNF type u.

2) PROCESSING COST
This is the cost of the resources assigned and reserved for the overall number of mapped VNFs in the SFC requests, for all n ∈ N − {n_t^i}, where ρ_pr(n) and ρ_me(n) are the node processing and memory costs per resource unit, respectively.

3) COMMUNICATION COST
This is the total cost of the edges assigned and used for all the mapped VNF edges in the requests. It includes the communication cost between terminals and their affiliated VNFs on fog and/or cloud nodes, where ρ(e) accounts for the transmission cost per traffic unit between links in different layers. See Table 1 for the different edge costs [24].
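The three cost components above can be sketched as one aggregation; the argument names and the prices in the example are illustrative assumptions, but the structure follows the paper's model: per-VNF license costs, per-unit processing and memory costs, and per-traffic-unit edge costs.

```python
def total_cost(deployed_vnfs, license_cost,
               cpu_units, mem_units, cpu_price, mem_price,
               traffic_units, edge_prices):
    """Realization-cost sketch: deployment + processing + communication.
    deployed_vnfs: VNF types instantiated; license_cost: rho(v_u) per type;
    cpu/mem_units: resources reserved at prices rho_pr(n)/rho_me(n);
    edge_prices: rho(e) per traffic unit for each edge used."""
    deployment = sum(license_cost[v] for v in deployed_vnfs)
    processing = cpu_units * cpu_price + mem_units * mem_price
    communication = traffic_units * sum(edge_prices)
    return deployment + processing + communication
```

In an HFC setting, `edge_prices` would differ by layer (e.g., terminal-fog versus FCH-cloud links), which is how node location drives the communication cost.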

V. DELAY-AWARE FOG-CLOUD MAPPING (DAFC)
A novel scheme is now presented for delay-aware SFC provisioning implemented on the proposed multi-layer HFC architecture, termed delay-aware fog-cloud mapping (DAFC). In particular, the scheme uses a heuristic search method in an effort to achieve a tradeoff between delay and cost. The detailed pseudocode for this scheme is presented in Algorithm 1. Consider a group of terminals in Layer I within a specific footprint that generate service requests R, traversed via radio interfaces to fog nodes n_f^j in Layer II. The neighboring fog nodes here form a cluster to process delay-sensitive requests. Moreover, the fog nodes in each cluster are connected to cluster heads, n_h^k, over which delay-tolerant requests are processed. Overall, the communication and management protocols among nodes in different layers follow a hierarchical clustering approach, as depicted in Section III. Thus nodes can communicate horizontally with nodes in the same layer and vertically with nodes in higher layers for relaying purposes.
Consider the set of incoming online requests, R, generated from the terminals and received at the closest fog node within proximity (over a wireless link). The closest fog node examines the delay δ_r and lifetime L_r requirements when mapping the VNFs v_u ∈ V_r in the SFC of each request r. The scheme initializes the variables D(r) and C(r) to account for the network delays and costs, respectively. Also, a route vector is required that establishes routes between the previous hosting node with a mapped VNF, Prev, and the candidate node that maps the subsequent VNF, Subs, thus forming route vectors.
The VNFs are mapped at different layers based on delay requirements, where the node selection policy over the fog nodes and cluster heads aims to achieve the lowest latency and maximum available capacity. For instance, a close neighboring fog node that provides the minimum total delay and maximum remaining resources at the current state is selected for an incoming request. Namely, if the request is delay-sensitive, then it is mapped on the fog nodes in Layer II. For all VNFs in the SFC, the network is first pruned to build a feasible graph G composed of candidate nodes with sufficient resources. The scheme then computes the delay-bound candidate route set, P, from Prev to all n_j^f ∈ N. Each route p here satisfies the delay bound of the incoming request, δ_r, i.e., min{D(r) + D(e)} < δ_r. From these routes, a candidate node is selected based on minimum delay, hence achieving minimum delay to the previous node that hosts the last mapped VNF in the same layer.
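The pruning and route-selection step above can be sketched as follows. This is a minimal illustration, not the paper's Algorithm 1: the names `FogNode`, `Route`, `prune`, and `select` are hypothetical, and the tie-breaking on remaining capacity is an assumption based on the stated "minimum delay and maximum remaining resources" policy.

```python
from dataclasses import dataclass

@dataclass
class FogNode:
    name: str
    cpu: float          # remaining processing capacity
    mem: float          # remaining memory capacity

@dataclass
class Route:
    dst: FogNode
    delay: float        # accumulated link delay from Prev to dst
    bandwidth: float    # bottleneck bandwidth along the route

def prune(nodes, routes, cpu_req, mem_req, bw_req):
    """Build the feasible graph G: keep only nodes/routes with sufficient resources."""
    feasible = {n.name for n in nodes if n.cpu >= cpu_req and n.mem >= mem_req}
    return [p for p in routes if p.dst.name in feasible and p.bandwidth >= bw_req]

def select(routes, current_delay, delay_budget):
    """Among delay-bound routes (D(r) + D(e) < delta_r), pick the minimum-delay
    candidate; break ties toward the node with the most remaining resources."""
    bound = [p for p in routes if current_delay + p.delay < delay_budget]
    if not bound:
        return None     # no feasible fog candidate for this VNF
    return min(bound, key=lambda p: (p.delay, -(p.dst.cpu + p.dst.mem)))
```

A node that fails the resource check is removed before routing, so the delay bound is only evaluated over candidates that could actually host the VNF.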
Overall, nodes are selected based upon delay requirements. If the request is delay-sensitive, specified by δ_r less than a threshold value, δ_th, then its VNFs are mapped in Layer II. Once a VNF is mapped, resources are reserved in G = (N, E) by updating the node resources and link bandwidths, i.e., subtracting the resource and bandwidth requirements of all mapped VNFs from the hosting node and links. Note that mapping here is contingent upon the available resources at the nodes in this layer. Namely, if a fog node in the cluster does not possess sufficient resources, then it hands over the VNF to its adjacent neighbors within the same layer. Moreover, if no resources are available at any node in the cluster, then the VNF is relayed to the cluster heads, n_k^h. Similarly, if Layer II does not possess enough resources, then the VNF mapping is performed at Layer III, i.e., cloud nodes n_l^c ∈ N. Finally, when all the VNFs in r are mapped (see pseudocode in Algorithm 2), then r is flagged as successful. The VNFs are instantiated using a hypervisor or container on the node [33]. The data transmission phase is then initiated (service starts) for the terminals, and resources are reserved for the entire lifetime period, L_r, of the request. Namely, a counter is initiated here to record the elapsed time over which a request is in the transmission phase. Once this time reaches the request lifetime, Counter = L_r, the data transmission phase is terminated and the resources are released by updating the node and link capacities in G. As a result, this setting relieves the limited resources at the nodes for use by other requests. In turn, the network capacity and admissible traffic volume increase, i.e., more requests are accommodated.
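The reserve/serve/release cycle above can be sketched in a few lines. This is a toy illustration of the lifetime-counter mechanism (the `Node` class, `tick` helper, and dictionary request records are hypothetical names, not from the paper):

```python
class Node:
    def __init__(self, capacity):
        self.capacity = capacity

    def reserve(self, demand):
        """Reserve resources for a VNF; False means hand over to a neighbor/higher layer."""
        if demand > self.capacity:
            return False
        self.capacity -= demand    # held for the entire request lifetime L_r
        return True

    def release(self, demand):
        self.capacity += demand    # freed once Counter reaches L_r

def tick(active, node):
    """Advance per-request elapsed-time counters; release resources of expired requests."""
    for req in list(active):
        req["counter"] += 1
        if req["counter"] >= req["lifetime"]:   # Counter = L_r
            node.release(req["demand"])
            active.remove(req)
```

Releasing expired reservations is what lets saturated fog nodes admit new requests later, as noted in the energy-consumption discussion.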
Meanwhile, if the request is delay-tolerant, δ_r > δ_th, then its VNFs are directly traversed to the fog cluster heads (FCHs) for mapping. This saves resources at the fog nodes for other delay-sensitive requests. Similarly, the network is pruned to select the hosting nodes for all the VNFs in the SFC of the delay-tolerant request. Here a feasible graph G is built that is composed of candidate nodes n_k^h ∈ N with Q_me(n_k^h) > Q_me(v_u) and Q_pr(n_k^h) > Q_pr(v_u), and links e_{h,h} ∈ E with B(e_{h,h}) > b_r. The scheme then also computes P routes from Prev to all n_k^h ∈ N, where each route p satisfies the minimum network delay for request r, i.e., min{D(r) + D(e)} < δ_r. Following the route and node selection process, service is provided for the request lifetime L_r, during which the resources are reserved. Note that if no resources are available at the FCHs in Layer II, then the VNFs are mapped to the cloud nodes in Layer III. Furthermore, if no resources are available at the cloud nodes either, then the request is dropped, consequently increasing the dropping rate.
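The layered fallback order described in the last two paragraphs can be summarized in one sketch: delay-sensitive requests try the fog cluster first, then the FCHs, then the cloud, while delay-tolerant requests skip the fog cluster. The function name and dictionary layout are illustrative, not the paper's notation.

```python
def map_vnf(demand, delay_sensitive, fog_cluster, cluster_heads, cloud_nodes):
    """Return the name of the layer that hosts the VNF, or None if dropped.

    Delay-tolerant requests bypass the fog cluster to preserve those
    resources for delay-sensitive traffic, as described in the text.
    """
    layers = [("fog", fog_cluster)] if delay_sensitive else []
    layers += [("fch", cluster_heads), ("cloud", cloud_nodes)]
    for name, nodes in layers:
        for node in nodes:
            if node["free"] >= demand:
                node["free"] -= demand   # reserve for the request lifetime
                return name
    return None                          # no resources anywhere: request dropped
```

A `None` result corresponds to an unsuccessful request, which is what drives the dropping rates discussed in the evaluation.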

VI. PERFORMANCE EVALUATION
The proposed DAFC provisioning scheme is now evaluated for the multi-layer HFC architecture versus traditional cloud and fog architectures using key performance metrics. These include the number of successful (satisfied) requests, network delay, energy consumption, and realization cost. See Table 2 for the network parameters and their assigned values [23], [24], [37]-[39]. Now consider the following simulation settings. The generated requests from the terminals are sent to a request pool. They are then managed by the provisioning scheme based upon their arrival times, where the scheme aims to find the best combination of fog and cloud nodes to serve the incoming requests. Thereafter, statistical data are recorded once all requests are processed, such as the average number of satisfied requests, the network delay, and the cost, as presented next.

A. NUMBER OF SUCCESSFUL REQUESTS
High traffic volumes from the terminals in Layer I result in an aggregated number of incoming requests with various requirements at the higher layers. These requests impose high demand on the available resources in the substrate network and pose challenges to the SFC provisioning process. Consequently, some requests can be dropped when node resources and links become more congested. Hence the SFC provisioning scheme is performed here on the three network architectures to determine an optimum placement with the highest number of satisfied requests. This value is gauged in Figure 3, which shows that cloud architectures return the highest capacity in terms of the number of satisfied incoming requests. Cloud solutions accommodate approximately 90-95% of the first 1000 incoming requests and 65-75% of 1600-2000 requests. This is attributed to the abundant available resources at the cloud nodes, at the detriment of increased propagation delays. Meanwhile, fog architectures suffer from significant request drop rates due to the limited resources at the fog nodes, i.e., satisfying 70%, 37%, and 25% of 400, 800, and 1600-2000 incoming requests, respectively. Note that the fog nodes become highly saturated after the first 800 incoming requests, which results in dropping much of the incoming traffic from Layer I (60-70% dropping rates). Meanwhile, the proposed HFC architecture yields a tradeoff between the cloud and fog solutions. Despite the close success rates of the hybrid and fog architectures for the first 400 requests, the proposed HFC architecture is capable of satisfying more requests as the traffic volume increases further. For instance, the HFC architecture yields 75%, 62%, and 50% success rates for 400, 800, and 1600-2000 incoming requests, respectively. Hence it achieves 15-40% higher capacity as compared to fog solutions at the same traffic volume.
Moreover, the architecture here utilizes the fog cluster heads (FCHs) to accommodate the increased demand on resources at reduced processing times and latencies.
Note that the number of satisfied requests is a key metric for delay-sensitive requests. When mapping the incoming delay-sensitive requests on the fog nodes, these requests are deemed successful if D(r) < δ_r.
Here it is important to avoid mapping delay-sensitive requests on the cloud nodes, as this can prolong the request delay, D(r), until it exceeds δ_r. Consequently, the request is dropped from the network and denied access. The terminal therefore repeatedly attempts to access the network again. These attempts result in additional propagation, queuing, and processing times in the network, hence yielding increased control-plane latencies for delay-sensitive applications.

B. OVERALL NETWORK DELAY
The overall network delay is composed of the processing, queuing, transmission, and propagation delays. The processing component is the time required by the VNFs to process incoming packets of various applications, e.g., firewall, load balancer, and VPN functions. This parameter is gauged by the processing times of the overall number of software-implemented VNFs at each fog or cloud node, as per Table 3 [34]. Figure 4 shows the network delay for the proposed DAFC provisioning scheme implemented on the cloud, fog, and HFC architectures. It is noticed here that cloud architectures yield excessive network delays, e.g., 6.4 and 9 seconds for 800 and 1600 incoming requests, respectively. This is attributed to the long propagation delays of packets traversing nodes separated by large geographical areas. Finally, fog architectures yield very short delays due to the low processing times at the fog nodes and their proximity to the terminals (short propagation times), e.g., 4 and 6 seconds for 800 and 1600 requests, respectively (at the expense of low capacity).
Meanwhile, the proposed HFC architecture achieves reduced delays at various numbers of requests. Namely, it returns a significant reduction as compared to the cloud architecture, and a slight increment over fog solutions at larger numbers of requests, e.g., 9.2 and 13.6 seconds for 800 and 1600 requests, respectively. Note that the small advantage of fog architectures here is contingent upon resource availability at the fog nodes, in order to support such a high number of satisfied requests. Also, standalone fog architectures can suffer from increased delays at high traffic volumes, in contrast to the proposed scheme, which can accommodate high traffic volumes. A key saliency of the proposed HFC architecture is its relatively low processing times at increased numbers of requests. For instance, fog architectures can suffer from aggregated processing times when resources are limited; consequently, a high number of incoming requests can be dropped. This is in contrast to the abundant resources in the HFC solution. Moreover, this multi-layer HFC solution achieves a significant reduction in processing times versus cloud architectures, i.e., approximately 25% and 48% less time at 1000 and 2000 requests, respectively. See Table 2 for the various delay ranges between the different layers.

C. ENERGY CONSUMPTION
The energy consumption of the SFC provisioning scheme is plotted in Figure 5 for the various architectures. Specifically, this value is gauged by measuring the overall power consumption, W, during the entire network delay D(r). It accounts for the power consumption levels w(n) at the fog and cloud nodes in Layers II & III, as well as the w_x power consumption of the X switches between these nodes. See Table 2 for the parameter settings [35], [36]. In notation, W sums the consumption of the hosting nodes and switches, where the power consumption at each node follows the linear model w(n) = β(n)·w(n)_max + (1 − β(n))·ζ(n)·w(n)_max, where β(n) denotes the power consumption rate in idle mode, w(n)_max is the maximum power consumption of nodes in the different layers, and ζ(n) represents the utilization (saturation) factor of any node used for SFC provisioning. Figure 5 shows that the proposed HFC architecture consumes reduced energy levels at high numbers of incoming requests. For example, the HFC architecture requires on the order of 11 and 16 kJ for 1000 and 2000 requests, versus 21-27 kJ for cloud and 4-8 kJ for fog architectures at the same numbers of requests. Note here that the fog nodes in Layer II approach saturation (100% utilization) earlier, after 300 requests, ζ(n_j^f) = 1. However, this architecture still accommodates requests, as previously occupied resources are released when earlier hosted requests reach their lifetime counters. As a result, resources become available to host new requests (e.g., at 1600 incoming requests). This is compared to 900 requests for the cluster head nodes, ζ(n_k^h) = 1, which demonstrates the benefit of the fog cluster heads (FCHs) in the proposed architecture. Finally, cloud architectures require excessive amounts of energy to accommodate the large number of requests, e.g., 22 kJ for mapping 1000 requests, owing to their abundant resources, ζ(n_l^c) = 0.7 at 1000 requests. Overall, the proposed HFC architecture again yields a tradeoff between its cloud and fog counterparts.
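The power model can be sketched directly from the definitions above. Note that the exact form of the node equation is reconstructed from the stated variables (idle rate β(n), maximum draw w(n)_max, utilization ζ(n)) and the common linear idle-plus-dynamic model, so it should be read as an assumption rather than the paper's verbatim formula; function names are illustrative.

```python
def node_power(beta, w_max, zeta):
    """Power draw of one node: idle share beta*w_max plus a
    utilization-proportional dynamic share (assumed linear model)."""
    return beta * w_max + (1.0 - beta) * zeta * w_max

def total_energy(nodes, num_switches, w_switch, duration):
    """Energy over the service duration D(r): hosting nodes in
    Layers II-III plus the X switches between them."""
    power = sum(node_power(b, wm, z) for (b, wm, z) in nodes) \
            + num_switches * w_switch
    return power * duration   # total power W integrated over D(r)
```

Under this model, a fully saturated node (ζ = 1) draws its maximum power w_max, while an idle node (ζ = 0) still draws the idle share β·w_max, which matches the saturation-factor discussion above.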

D. REALIZATION COST
This cost is calculated for the different architectures at various numbers of incoming requests, as per Figure 6. Here the proposed HFC architecture yields an average cost tradeoff between the cloud and fog architectures. Namely, it leverages the reduced processing times of fog architectures and the abundant resources available in cloud architectures. Also, the proposed DAFC scheme yields efficient fog and cloud node utilization to satisfy each type of request separately according to its service requirements, while keeping costs and delay at a minimum.
Overall, the proposed provisioning scheme shows that the HFC architecture yields a tradeoff between cloud and fog solutions. It approaches the performance of cloud architectures in terms of available resources and successful requests. Meanwhile, it approaches the performance of fog architectures in terms of reduced network delays, realization costs, and energy consumption. This makes the proposed architecture a suitable solution for delay-sensitive and delay-tolerant requests with various requirements, such as real-time video applications and data storage.

E. COMPUTATIONAL COMPLEXITY
The computational complexity of the different SFC provisioning schemes in cloud and fog computing architectures can be classified as follows. First, graph-based heuristics such as the works in [11]-[13] and [18] implement graph centrality, graph clustering, and multi-stage graphs to find the best path for request r in the SFC placement. Generally, these methods reduce the complexity and convergence times, since the computational complexity essentially depends only on the physical graph size. These schemes feature polynomial-time computational complexity, modeled as O(N^k). See Table 4 for the run-time computational complexity of the different provisioning schemes.
Meanwhile, global optimization techniques such as the MILP and ILP in [7] and [21] suffer from high computational complexity, i.e., exponential run-time, modeled as O(2^N). This high complexity is attributed to the evaluation of the objective function over the entire search space (nodes) in the network, in order to find the best solution to host the VNFs. Furthermore, metaheuristic schemes such as Tabu search [19] perform a greedy process to select the host node from each sub-region, maintaining a locally optimal selection for each VNF. The computational complexity here is characterized by an exponential process, O(2^n), n ≪ N. Meanwhile, the computational complexity of the proposed SFC provisioning scheme implemented on the HFC architecture, along with the fog and cloud solutions, is now analyzed. The proposed provisioning scheme determines E routes from the source to the first candidate node in the network, each composed of multiple nodes and links. Therefore, the algorithm iterates over the E routes between the previous node (e.g., source or Prev) and the candidate node, in order to select a single node from the route that yields the least delay and load. Thereafter, the scheme examines whether the route possesses a node with sufficient resources and the least delay, i.e., the first-fit (first suitable) node in the shortest route. Hence a single route is selected for each request from the network graph G(N, E). Assuming the worst-case scenario (utilizing all network nodes and routes), the run-time computational complexity is bounded by O(|N| log |E|).
In light of the above, the proposed provisioning scheme on the HFC architecture features reduced computational complexity as compared to its cloud and fog counterparts for the same number of successful requests. This is because it consumes fewer network resources, i.e., reduced node (link) usage compared to standalone fog (cloud) solutions. Moreover, the run-time complexity of the proposed hybrid model is based on a first-fit approach, where it selects the first best node in the selected shortest path p that yields the least delay or load.
Overall, the aforementioned performance results of the proposed DAFC provisioning scheme benefit network operators in achieving higher user capacities and accommodating larger traffic volumes at improved quality-of-service (QoS) in terms of latency and bandwidth. It also yields reduced operating and capital expenses (OPEX and CAPEX), as fewer nodes and links are used to accommodate incoming requests, at reduced power and energy consumption levels at the nodes and switches.
It is also interesting to consider the proposed SFC provisioning scheme on multi-tier fog architectures composed of heterogeneous nodes, namely, implementing fog nodes with different available resources at the edge of the network, over which the network capacity, delay, cost, and energy efficiency can be investigated. Moreover, this work can be extended to include the failure probabilities associated with nodes and links in the network. Hence it is also important to consider SFC provisioning schemes that account for network failures. Here, developing various restoration methods becomes important for network recovery, while taking into account single- and multi-failure scenarios and recovery times.

VII. CONCLUSION
There is a growing need to implement service function chaining support in emerging fog-cloud infrastructures. Hence this paper presents a novel delay-aware provisioning scheme that maps virtual network functions in a hybrid fog-cloud architecture composed of fog nodes, fog cluster heads, and cloud nodes. Moreover, the scheme accounts for several key attributes of incoming requests, foremost the delay, resource and bandwidth requirements, and lifetime. Overall, the findings confirm that the proposed hybrid architecture outperforms its fog and cloud counterparts at high numbers of incoming requests. Future efforts will investigate failure-aware service provisioning on the proposed hybrid architecture. Moreover, these efforts will study the impact of the cluster heads as protection nodes in network survivability.