Incremental Multilayer Resource Partitioning for Application Placement in Dynamic Fog

Fog computing platforms have become essential for deploying low-latency applications at the network's edge. However, placing and managing time-critical applications on a Fog infrastructure with many heterogeneous and resource-constrained devices connected by a dynamic network is challenging. This paper proposes an incremental multilayer resource-aware partitioning (M-RAP) method that minimizes resource wastage and maximizes service placement and deadline satisfaction in a dynamic Fog with many application requests. M-RAP represents the heterogeneous Fog resources as a multilayer graph, partitions it based on the network structure and resource types, and constantly updates it upon dynamic changes in the underlying Fog infrastructure. Finally, it identifies the device partitions for placing the application services according to their resource requirements, which must overlap in the same low-latency network partition. We evaluated M-RAP through extensive simulation and two applications executed on a real testbed. The results show that M-RAP can place 1.6 times as many services, satisfy deadlines for 43% more applications, lower their response time by up to 58%, and reduce resource wastage by up to 54% compared to three state-of-the-art methods.


I. INTRODUCTION
HIGHLY distributed applications, such as video processing [1], e-commerce [2], virtual reality [3], object recognition, face detection [4], or natural language processing [5], raise significant time-critical and low-latency challenges that centralized Cloud data center technologies cannot fulfill [6]. To mitigate this problem, Fog computing [7] has emerged as a distributed paradigm encompassing highly heterogeneous and dynamic devices, hierarchically split between the high-end Cloud and the low-end user devices. The Fog supports applications with time-critical low-latency constraints through service placement [6], [8] across devices closer to the users' needs (e.g., computation, communication, storage). Existing works use resource partitioning [9], machine learning [10], or distributed deadline-aware [11] methods to place time-critical applications in the Fog. Despite employing different technologies, they all use greedy heuristics that overload the fastest devices to lower the response time. However, Fog infrastructures have limited capacity-constrained resources with sporadic availability, requiring a trade-off between utilization and response time to satisfy various time-critical application requests. The following example illustrates the disadvantage of greedy heuristics mitigated by a similarity placement method.
Example 1. Let us assume a Fog network of four devices with different memory sizes (see Table I). The devices $d_1$ and $d_3$ are always available. We assume two concurrent applications $A_1 = \{s_1, s_2\}$ and $A_2 = \{s_3, s_4\}$ of two sequential services each, with specific memory and execution deadline requirements (see Table I(b)). The services exchange a message of fixed size, transmitted between the hosting devices with the duration shown in Table I(a). After 9 ms from the application requests, device $d_2$ joins and $d_4$ leaves the network.
To mitigate the failures caused by the leaving device $d_4$, we consider a monitoring interval of 10 ms, equal to the minimum service deadline.
1) Greedy placement unnecessarily overloads the fastest device $d_1$, leading to resource wastage (the percentage of unused memory) and deadline violations. For example, placing $s_1$ onto $d_3$ satisfies its required memory and provides a response time of 9 ms, ahead of its deadline. However, the method places $s_1$, which has the lowest deadline, onto $d_1$ and occupies 40% of its memory with a low response time of 5 ms. Similarly, it places $s_2$ with a lower deadline on $d_1$, eliminating the transmission time and providing the lowest response time of 5 + 0 + 7 = 12 ms. The utilization of $d_1$ is 60%, and the remaining devices have insufficient memory to host $s_3$ despite being idle. Moreover, they cannot host $s_4$ due to the dependency on $s_3$. The method places half of the services and, thus, has a low placement rate of 0.5 and a deadline satisfaction rate of 0.5. The average device memory wastage is $1 - \frac{2+1}{15} = 0.8$ for the first monitoring interval and $1 - \frac{1}{11} \approx 0.9$ for the second.
2) Similarity placement maps each service to a device with a capacity similar to the required resources and a response time within the deadline. Unlike the greedy method, it places $s_1$ on $d_3$ with an equal memory size and a response time of 9 ms, which eliminates the wastage on $d_3$, satisfies the deadline, and spares $d_1$ for other services. Likewise, placing $s_2$ on $d_3$ eliminates the transmission time and generates a response time of 9 + 0 + 12 = 21 ms, which satisfies the deadline. Afterward, it places $s_3$ onto $d_1$ with a response time of 7 ms, which satisfies its deadline and eliminates memory wastage. Finally, it places $s_4$ on all 4 GB of memory of $d_2$ with a lower transmission time to $d_1$.
As $d_2$ leaves the network before completing $s_4$, the placement relocates $s_4$ onto $d_4$ with a failure overhead of 3 ms (i.e., 2 ms execution on the failed device plus 1 ms to detect the failure), a transmission time of 2 ms (from $d_1$, hosting $s_3$), and a response time of 7 + 3 + 2 + 13 = 25 ms. Despite the lost computation of 3 ms, $s_4$ can still reach its deadline. The average resource wastage is $1 - \frac{5+4+2}{15} \approx 0.27$ for the first interval, and $1 - \frac{0+1+4}{11} \approx 0.55$ for the second. Notably, although a lower resource wastage leads to a higher placement rate, one can still reach a high placement rate and waste resources.
Research Proposal. To address this challenge, we propose an incremental multilayer resource-aware partitioning (M-RAP) method for adaptive placement of distributed time-critical applications, represented as directed acyclic graphs (DAG) of services with soft absolute response deadlines [12], in a dynamic Fog. M-RAP approaches this problem as a multi-objective optimization of three goals: Fog placement rate, resource wastage, and service deadline satisfaction rate. M-RAP models the Fog as a dynamic multilayer graph comprising the changing network structure [13] and devices with different resources and availability [14]. M-RAP splits the Fog devices into overlapping partitions considering different resources and network structures to accelerate application placement. Afterward, it incrementally updates the partitions by tracking infrastructure changes to detect clusters of dynamic networks with minimum time and cost, avoiding re-partitioning. Each resource partition has an associated feature, defined as a quadruplet of the average number of cores, memory and storage sizes, and processing speed of its devices. Afterward, the dynamic application placement needs two steps.
1) Similarity placement step maps all application services to feature partitions based on their resource requirements, which must share the same network partition underneath to satisfy their deadlines.
2) Monitoring and replacement step mitigates critical situations in case of device failures and invalid placements and maps the affected services onto new feature partitions.
We performed extensive simulations with an intensive workload of application requests in a Fog environment with devices that dynamically join or leave the network. Compared to three related methods [9], [15], [16], M-RAP improves the Fog placement rate by 1.4-1.6 times, satisfies deadlines for 5-43% more placement requests, decreases the application response time by 16-58%, and reduces the resource wastage by 1-54%. We validate the simulation using two e-commerce and video streaming applications in a real computing continuum testbed [17].
Extensions. This paper extends our early work [18] on application placement in a static multilayer Fog in three areas. 1) We model the evolving Fog as a dynamic multilayer graph based on the incremental network changes and device availability; 2) We design a new dynamic placement algorithm based on an incremental multilayer resource partitioning method considering infrastructure changes; 3) We compare our results against three state-of-the-art methods using two applications running on a real testbed.
Outline. The paper has ten sections. Section II summarizes the related work. Section III presents the model underneath M-RAP representing the Fog as a dynamic multilayer graph. Section IV proposes a two-phase architecture for placing an application in a dynamic Fog: multilayer partitioning described in Section V and service placement described in Section VI. Section VII provides the experimental setup. Section VIII evaluates M-RAP compared to related work using simulation experiments, confirmed on a real testbed in Section IX. Section X concludes the paper.

II. RELATED WORK
This section revisits existing static and dynamic Fog placement methods, summarized in Table II.

A. Static Placement
Static placement methods model the Fog without inspecting the unstable network and the dynamic service workloads.
1) Resource wastage: Taneja et al. [15] proposed a resource-aware application mapping to maximize utilization in Fog. Similarly, Shooshtarian et al. [19] proposed a two-phase allocation method on hierarchical Fog resources using local clustering in each layer to optimize resource utilization. Jie et al. [20] modeled resource allocation in Fog and Cloud as a two-stage noncooperative game to minimize the service cost and maximize resource utilization. Unfortunately, these methods ignore the network structure of Fog devices and introduce high latencies.
TABLE II: RELATED WORK COMPARISON ("D": DEADLINE SATISFACTION, "T": MESSAGE TRANSMISSION, "W": RESOURCE WASTAGE)
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
2) Transmission time: Sajjad et al. [21] proposed a decentralized resource partitioning method based on multiple biased random walks to reduce the transmission time. Similarly, Filiposka et al. [22] designed a community-based resource management method in Fog that exploits distributed hierarchical clustering to minimize latency and service migration. Sun et al. [23] partitioned the Fog devices based on the network connection and modeled the application placement in each cluster as a multiobjective optimization of execution and transmission time. Lera et al. [9] proposed a greedy application placement that optimizes availability and latency by partitioning the Fog resources based on their connectivity. Ma et al. [24] introduced a collaborative method for service caching and placement in edge computing formulated as a mixed integer nonlinear programming problem to minimize the transmission and execution time. These methods clustered the Fog devices based on network connectivity without considering the other resources such as processing speed, memory, or storage.

B. Dynamic Placement
Related work studied two dynamic placement aspects.
1) Dynamic applications: Yousefpour et al. [25] investigated changes in the application structure by formulating the placement as an integer non-linear programming problem using reactive service provisioning for optimizing cost and service delay. Similarly, Sami et al. [10] inspected the changes in the application structure using a proactive demand-driven deep reinforcement method to optimize the deadline. Meng et al. [11] proposed a deadline-aware task dispatching and scheduling method for Edge computing as an unrelated parallel machine problem that minimizes resource wastage and transmission  [27] proposed a collaborative caching and processing method based on integer linear programming Edge scheduling to minimize the transmission time considering dynamic application requests.
2) Dynamic infrastructure: Mseddi et al. [28] modeled the availability of Fog devices based on discrete Markov processes to maximize the application deadline satisfaction. They extended their work with an integer linear programming method [16] to optimize the monolithic application placement rate and deadline satisfaction, considering the mobility of Fog devices rather than workflows.
3) Contribution: Most related works [10], [11], [24], [25], [26] focused on dynamic workflows and neglected network instability that affects their deadlines. A few efforts [16], [28] partially considered the deadlines or resource wastage in a dynamic Fog for simpler monolithic applications. M-RAP augments the related works by modeling the network instability and device availability as a dynamic multilayer graph to decrease resource wastage and maximize deadline satisfaction of workflow applications in Fog.

III. MODEL
This section presents the abstract model underneath M-RAP, using a formal notation summarized in Table III.

A. Infrastructure Model
We model the distributed infrastructure in three layers.
1) Cloud layer represents a high-performance data center suitable for placing resource-intensive tasks.
2) Fog network layer $F = (D, N)$ lies between the Cloud and the end-users and provides proximity computational and storage services on top of two resource sets [7]:
a) Devices $D = \{d_1, \ldots, d_n\}$, where each device $d_j = (R_{j1}, R_{j2}, R_{j3}, R_{j4})$ has four resources: $R_{j1}$ represents the number of cores, $R_{j2}$ the memory size in GB, $R_{j3}$ the storage size in GB, and $R_{j4}$ the processing speed in millions of instructions (MI) per second;
b) Network connections $N$ between devices, where a connection $n_{qj} = (BW_{qj}, LAT_{qj})$ depends on the bandwidth $BW_{qj}$ and the latency $LAT_{qj}$ between the devices $d_q$ and $d_j$.
3) Client layer consists of a set of users $U = \{u_1, \ldots, u_z\}$, including sensor and actuator client devices that request Fog resources for placing their applications.
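The infrastructure model above can be sketched as plain data structures. This is a minimal illustration, not the authors' implementation: the class and field names, the units of bandwidth and latency, and the sample values are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class Device:
    """Fog device d_j with its resource quadruple (R_j1, ..., R_j4)."""
    cores: int          # R_j1: number of cores
    memory_gb: float    # R_j2: memory size in GB
    storage_gb: float   # R_j3: storage size in GB
    speed_mips: float   # R_j4: processing speed in MI per second


@dataclass(frozen=True)
class Connection:
    """Network connection n_qj = (BW_qj, LAT_qj)."""
    bandwidth: float    # BW_qj (unit is an assumption, e.g. Mbit/s)
    latency: float      # LAT_qj (unit is an assumption, e.g. ms)


@dataclass
class FogNetwork:
    """Fog network layer F = (D, N)."""
    devices: Dict[str, Device] = field(default_factory=dict)
    connections: Dict[Tuple[str, str], Connection] = field(default_factory=dict)

    def connect(self, q: str, j: str, conn: Connection) -> None:
        # Only link devices that are part of D.
        if q not in self.devices or j not in self.devices:
            raise KeyError("both endpoints must be known devices")
        self.connections[(q, j)] = conn


# Hypothetical sample values, chosen only for illustration.
fog = FogNetwork()
fog.devices["d1"] = Device(cores=4, memory_gb=5, storage_gb=64, speed_mips=2000)
fog.devices["d3"] = Device(cores=2, memory_gb=2, storage_gb=32, speed_mips=1000)
fog.connect("d1", "d3", Connection(bandwidth=100.0, latency=2.0))
```

Keeping devices and connections in separate maps mirrors the two resource sets $D$ and $N$ and lets later steps update them independently.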

B. Multilayer Fog Resource Model
The single Fog network layer only represents the topological interconnection of the physical devices, thereby failing to capture their heterogeneous resources. To better handle resource diversity, we model the Fog across four layers $L = \{l_0, l_1, l_2, l_3\}$ that incorporate various device relationships based on the network topology and resource types.
1) Inter-layer edges $E_{ll'} = \{E^j_{ll'} \mid (d_j, d_j) \in D \times D\}$ connect each device $d_j$ in the layer $l \in L$ with the corresponding device $d_j$ in all the other layers $l' \in L$, $l' \neq l$. They uncover relations among resource types within the same device. We consider a weight $w^j_{ll'} = 1$ for the inter-layer edges of the device $d_j$ in the layers $l$ and $l'$.
2) Intra-layer edges $E_{ll} = \{E^l_{qj} \mid (d_q, d_j) \in D \times D \wedge q \neq j\}$ connect two Fog devices inside one layer $l \in L$ using a weight function $w^{(l)} : E_{ll} \to \mathbb{R}$ representing their similarity score. The score derives from the Euclidean distance between the device resources, calculated on the features of each layer, where $\hat{R}_{j1}$ and $\hat{R}_{j4}$ are the normalized number of cores $R_{j1}$ and processing speed $R_{j4}$ of $d_j$. This facilitates the partitioning of Fog devices based on their specific resource capacity. Fog devices with a similarity score of 1 have identical resources in the layer $l$.
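As a rough sketch of how intra-layer weights could be derived, the snippet below scores every device pair in one layer. The exact similarity formula is not recoverable from the text, so the mapping `1 / (1 + distance)` is an assumption chosen only because it yields 1 for identical resources; the function names and memory sizes are hypothetical as well.

```python
import math


def similarity(a, b):
    """Assumed score: 1 / (1 + Euclidean distance); equals 1 for identical features."""
    return 1.0 / (1.0 + math.dist(a, b))


def intra_layer_edges(features):
    """Weight every unordered device pair (q != j) of one layer.

    features maps a device name to the tuple of features used in that layer,
    e.g. (memory size,) for the memory layer.
    """
    names = sorted(features)
    return {
        (q, j): similarity(features[q], features[j])
        for i, q in enumerate(names)
        for j in names[i + 1:]
    }


# Hypothetical memory sizes in GB.
memory_layer = {"d1": (5.0,), "d2": (4.0,), "d3": (5.0,)}
edges = intra_layer_edges(memory_layer)
```

Here `d1` and `d3` receive a weight of 1 because their memory sizes match, while `d1` and `d2` receive 0.5 under the assumed mapping.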

C. Dynamic Fog Network Model
We define a dynamic Fog infrastructure as a temporal sequence of network layers $\{F^0, \ldots, F^{t-1}, F^t, \ldots\}$ changing over time. Every temporal Fog network represents the resources available on a device $d^t_j$ during the time interval $\Delta t$.
1) Fog devices: We update the set of available devices $D^{t+1} = D^t \cup DJ^t \setminus DL^t$ with:
a) Joining devices $DJ^t$ in the Fog network during the time interval $\Delta t$;
b) Leaving devices $DL^t$ from the Fog network during the time interval $\Delta t$.
2) Network connections $N^t$, where a network connection $n^t_{qj} = (BW^t_{qj}, LAT^t_{qj})$ represents the available bandwidth $BW^t_{qj}$ and latency $LAT^t_{qj}$ between the devices $d^t_q$ and $d^t_j$.
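The device-set update can be written directly with set operations. The sketch below assumes the expression is evaluated left to right (union first, then difference); the function name is illustrative, and the device names follow Example 1.

```python
def update_devices(current, joining, leaving):
    """Next device set: (D^t union DJ^t) minus DL^t."""
    return (set(current) | set(joining)) - set(leaving)


# Example 1: d2 joins and d4 leaves during the interval.
devices_next = update_devices({"d1", "d3", "d4"}, joining={"d2"}, leaving={"d4"})
```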

D. Dynamic Multilayer Fog Resource Model
We model the dynamic Fog resources as a temporal sequence of multilayer graphs $\{G^0, \ldots, G^{t-1}, G^t, \ldots\}$, where $G^t = (D^t, E^t, L)$ models the multilayer Fog resources during the time interval $\Delta t$ (see Section III-B). We represent the multilayer graph changes $\Delta G^t = (GJ^t, GL^t)$ in the Fog infrastructure during the time interval $\Delta t = [t-1, t]$ using the incremental growth $GJ^t$ and shrink $GL^t$.
1) Incremental growth $GJ^t = (DJ^t, EJ^t)$ denotes the set of joining devices $DJ^t = \{d_{n+1}, \ldots, d_{n+k}\}$ and their associated edges $EJ^t = \{E^t_{ll'} \mid \forall l, l' \in [0, L)\}$ in the Fog network during the time interval $\Delta t$. We update the multilayer resource graph by replicating all new Fog devices in all the layers $L$ of the multilayer graph $G^t$. We finally generate the set of bidirectional edges $EJ^t$ containing the new inter-layer edges.
2) Incremental shrink $GL^t = (DL^t, EL^t)$ denotes the set of leaving devices $DL^t = \{d_{n-1}, \ldots, d_{n-k}\}$ and their associated edges $EL^t = \{E^t_{ll'} \mid \forall l, l' \in [0, L)\}$ in the Fog network during the time interval $\Delta t$. We update the Fog multilayer graph by deleting the devices in $DL^t$ and their associated edges $EL^t$.
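A minimal sketch of the growth and shrink updates follows, assuming a multilayer graph stored as a set of `(device, layer)` nodes and a dictionary of undirected weighted edges. The layer labels and function names are illustrative, and computing the intra-layer similarity weights of a joining device is left out.

```python
from itertools import combinations

LAYERS = ("l0", "l1", "l2", "l3")  # network, processing, memory, storage


def grow(nodes, edges, device):
    """Replicate a joining device in every layer and add weight-1 inter-layer edges."""
    for layer in LAYERS:
        nodes.add((device, layer))
    for la, lb in combinations(LAYERS, 2):
        edges[frozenset({(device, la), (device, lb)})] = 1.0


def shrink(nodes, edges, device):
    """Delete a leaving device from every layer together with all its edges."""
    for layer in LAYERS:
        nodes.discard((device, layer))
    for key in [k for k in edges if any(n[0] == device for n in k)]:
        del edges[key]


nodes, edges = set(), {}
grow(nodes, edges, "d1")
grow(nodes, edges, "d5")
# Hypothetical intra-layer edge in the memory layer between d1 and d5.
edges[frozenset({("d1", "l2"), ("d5", "l2")})] = 0.5
shrink(nodes, edges, "d5")
```

After the shrink, only the four copies of `d1` and their six inter-layer edges remain, so no part of the graph outside the leaving device is touched.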

E. Application Model
An online time-critical application has three characteristics.
1) Application structure $A = (S, M, \Theta, u)$ requested by a user $u$ is a DAG of services $S = \{s_1, s_2, \ldots, s_m\}$ interconnected through request messages $M$. Every service $s_i \in S$ has a quadruple of resource demands $s_i = (r_{i1}, r_{i2}, r_{i3}, r_{i4})$, where $r_{i1}$ is the required number of cores, $r_{i2}$ is the required memory size, $r_{i3}$ is the required storage size, and $r_{i4}$ is the workload in MI. We assume that the service resource demand is static and does not change over time.
2) Request message $m_{pi} = (SZ_{pi}, s_p, s_i) \in M$ has a size $SZ_{pi}$, a source service $s_p \in S$, and a destination service $s_i \in S$. A user $u \in U$ triggers the application execution via an initial request message $m_{u1}$ to the service $s_1$.
3) Absolute soft deadlines $\Theta = \{\theta_1, \theta_2, \ldots, \theta_m\}$ define the target completion time $\theta_i$ for each service $s_i$. Following the theory of soft real-time computing, our utility function maximizes the number of met service deadlines to optimize the overall application quality of service [29].

F. Workload Model
We consider a workload $W = \{AS^0, \ldots, AS^t, \ldots\}$ of applications requested by different users $U$ during multiple time intervals $\Delta t = [t-1, t]$ in a dynamically evolving Fog infrastructure.
1) Placement function $\mu^t$ assigns a Fog device $d^t_j = \mu^t(s_i)$ to a service $s_i$. An invalid placement $\mu^t(s_i) = \emptyset$ indicates that no device in $D^t$ satisfies the service constraints.
2) Execution time of a service $s_i$ is the ratio between its workload $r_{i4}$ and the processing speed $R_{j4}$ of the hosting device: $ET_{ij} = r_{i4} / R_{j4}$.
3) Transmission time of a request message depends on the latency $LAT^t_{qj}$ and the bandwidth $BW^t_{qj}$ of a network connection $n^t_{qj} \in N^t$ at the time interval $\Delta t$.
4) Failure overhead caused by a leaving device $d^t_j \in DL^t$ to a service $s_i$ during the time interval $\Delta t$ is the sum of its response time $RT^t_{ij}$ and the time $PT_{ij}$ for executing the placement algorithm to select a new device (see Algorithm 3, Section VI-A).
5) Response time $RT^t_{ij}$ of a service $s_i$ on a device $d^t_j$ during the time interval $\Delta t$ is the sum of a) the maximum response time $RT^t_{pq}$ of its predecessors $s_p$ (including its request message transmission time in case of an initial message request), b) its execution time $ET_{ij}$, and c) any potential failure overhead caused by the leaving device $d^t_j$.
6) Resource reservation holds the resources of the hosting device for the required execution time intervals $t, t+1, \ldots, t + \frac{RT^t_{ij}}{\Delta t}$, and releases them for the intervals following its completion.
7) Application response time is the highest response time of all services $s_i \in S$: $RT_A = \max_{s_i \in S} RT^t_{ij}$. The service with the highest response time has no successors in the application DAG structure.
8) Deadline fulfillment requires that the response time of the application placement satisfies the deadline $\Theta$: $RT_A < \Theta$.
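The response time definitions above can be sketched as a single pass over the application DAG. This is an illustration under assumptions: services are given in topological order, transmission times are supplied as inputs rather than computed from latency and bandwidth, and the function and parameter names are hypothetical. The sample values reproduce the similarity placement of Example 1.

```python
def execution_time(workload_mi, speed_mips):
    """Execution time: workload r_i4 divided by processing speed R_j4."""
    return workload_mi / speed_mips


def response_times(order, predecessors, exec_time, trans_time, failure_overhead=None):
    """Response time per service for a DAG listed in topological order.

    trans_time maps (predecessor, service) to the message transmission time;
    the key ("user", service) holds the initial request transmission time.
    """
    failure_overhead = failure_overhead or {}
    rt = {}
    for s in order:
        preds = predecessors.get(s, [])
        if preds:
            # Wait for the slowest predecessor, including its message transfer.
            ready = max(rt[p] + trans_time.get((p, s), 0) for p in preds)
        else:
            ready = trans_time.get(("user", s), 0)
        rt[s] = ready + exec_time[s] + failure_overhead.get(s, 0)
    return rt


# Similarity placement of Example 1 (all times in ms).
rt_a1 = response_times(["s1", "s2"], {"s2": ["s1"]},
                       {"s1": 9, "s2": 12}, {("s1", "s2"): 0})
rt_a2 = response_times(["s3", "s4"], {"s4": ["s3"]},
                       {"s3": 7, "s4": 13}, {("s3", "s4"): 2},
                       failure_overhead={"s4": 3})
app_rt_a1 = max(rt_a1.values())  # application response time of A1
app_rt_a2 = max(rt_a2.values())  # application response time of A2
```

The two application response times match the 21 ms and 25 ms derived in Example 1, with the 3 ms failure overhead added to the relocated service $s_4$.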

G. Placement Objectives
We define three performance objectives for placing a requested application set $AS^t$ in a dynamic Fog infrastructure $F^t$ during the time interval $\Delta t = [t-1, t]$:
1) Maximize the Fog placement rate, defined as the ratio $|S_\mu| / |S|$ between the number of services $s_i$ placed in the Fog and the total number of requested services for all applications $A \in AS^t$, where $|S|$ and $|S_\mu|$ represent the cardinality of the two service sets.
2) Minimize processing, memory, and storage wastages, defined as the percentage that remains after relating the resource units $r_{ik}$ consumed by the placed services to the total resource units $R^t_{jk}$ of the devices, measured in units of $unit_1 = 1$ core and $unit_2 = unit_3 = 1$ GB (memory and storage sizes).
3) Maximize the service deadline satisfaction rate, defined as the percentage $|S_\Theta| / |S|$ of services fulfilling their absolute soft deadlines, where $|S_\Theta|$ and $|S|$ represent the cardinality of the two service sets.
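The three objectives can be sketched as small functions. The greedy placement of Example 1 serves as input; the per-service deadlines are placeholders, since Table I is not reproduced here, and the function names are illustrative.

```python
def placement_rate(placed, requested):
    """Share of requested services that received a device."""
    return len(placed) / len(requested)


def wastage(consumed_units, capacity_units):
    """Unused share of one resource type: 1 - consumed / total capacity."""
    return 1 - sum(consumed_units) / sum(capacity_units)


def deadline_satisfaction_rate(response_time, deadline, requested):
    """Share of requested services finishing strictly before their deadline."""
    met = [s for s in requested
           if s in response_time and response_time[s] < deadline[s]]
    return len(met) / len(requested)


requested = ["s1", "s2", "s3", "s4"]
greedy_rt = {"s1": 5, "s2": 12}                        # ms, from Example 1
deadlines = {"s1": 10, "s2": 25, "s3": 10, "s4": 30}   # placeholder values

rate = placement_rate(greedy_rt, requested)
memory_waste = wastage([2, 1], [15])                   # GB, first interval
deadline_rate = deadline_satisfaction_rate(greedy_rt, deadlines, requested)
```

Unplaced services count against both rates, which is why the greedy method reaches only 0.5 for placement and deadline satisfaction while wasting 80% of the memory.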

A. Multilayer Graph Modeling
A multilayer graph models the Fog devices based on their network interconnections and similarity in processing speed and the number of cores, memory, and storage sizes, as described in Section III-B. A dynamic multilayer graph constantly updates the multilayer graph based on the incremental temporal changes (i.e., growth, shrink) in the availability of Fog devices (i.e., joining, leaving), as defined in Section III-D.

B. Fog Multilayer Partitioning
The multilayer partitioning of the Fog devices from a multilayer graph based on their network connections, processing, memory, and storage resources has four steps (see Section V-B and Algorithm 1):
1) Layer partitioning splits each layer $l \in L$ of the multilayer graph into a set $P(l)$ of disjoint partitions that cluster the devices based on their resource types, targeting a single objective such as transmission time, processing speed, or resource wastage (see Section III-B).
• Network layer $l_0$ partitioning clusters the highly interconnected devices based on their network connections $N$.
• Processing layer $l_1$ partitioning clusters Fog devices with a similar number of cores and processing speed.
• Memory $l_2$ and storage $l_3$ layer partitioning clusters Fog devices with similar memory and storage sizes.
2) Graph compression shrinks the disjoint partitions from the processing, memory, and storage layers into a simpler representation associated with similar resources. The compressed graph enables the detection of overlapping partitions without analyzing individual devices, resulting in considerable reductions in both time and cost.
3) Feature partitioning splits the compressed graph into disjoint partitions, where each partition clusters devices with similar average processing, memory, and storage sizes to address multiple objectives, such as transmission time, processing speed, and resource wastage.
4) Incremental multilayer partitioning updates the multilayer partitions, compressed graph, and feature partitions upon the temporal availability of the Fog devices and their resources (see Section V-C and Algorithm 2) to adaptively detect clusters in a dynamic Fog with minimum cost and time.

C. M-RAP Placement
The placement phase maps all application sets AS t arriving during the time interval Δt in two phases.
1) Dynamic application placement allocates resources to all services simultaneously to ensure their placement in the same network partition and satisfy their deadlines.
a) Similarity placement maps every service $s_i$ onto the proper feature partition $FP^t_k$ based on its deadline and resource requirements according to the placement function (see Section VI-A and Algorithm 3). This step narrows the search space by exploring the partitions rather than all Fog devices.
b) Monitoring and replacement softens the hard deadline in critical cases of invalid placements and leaving devices and relocates the affected services to new feature partitions with minimal deadline violation.
2) Service placement assigns a Fog device $d^t_j = \mu^t(s_i)$ in the selected feature partition to each service $s_i$ while ensuring the placement of the entire application within the same network layer partition (see Section VI-B and Algorithm 4).

V. FOG MULTILAYER PARTITIONING
This section describes the multilayer partitioning method.

A. Modularity
We convert the multilayer graph of Fog resources into multilayer partitions using the modularity [30] metric $Q \in [-1, 1]$ to measure their connectivity strength. The metric combines the following terms:
• the connectivity strengths of $d_j$ and $d_q$ with the layer $l$'s devices;
• the total sum of the intra-layer edges' weights between devices $d_j$ and $d_q$ in layer $l \in L$;
• $W = \sum_{l \in L} W_l$, the total sum of the intra-layer edges' weights between Fog devices, $\forall l \in L$;
• $E^j_{ll'}$, the number of inter-layer edges of the Fog device $d_j$ from layer $l$ to layer $l'$;
• $\alpha_{ll'}$, equal to 1 if $l = l'$ and 0 otherwise;
• $B_{jq}$, equal to 1 if $j = q$ and 0 otherwise;
• $\lambda_{jq}$, equal to 1 if devices $d_j$ and $d_q$ belong to the same partition, otherwise 0.
$Q \le 0$ represents low-quality partitions of disassortative Fog devices with sparse connections among them. $Q > 0$ represents high-quality partitions with better connectivity strength among densely connected Fog devices. Hence, the goal is to find a set of partitions in a multilayer graph with the highest modularity ($Q \to 1$).
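To make the metric concrete, the sketch below computes the standard weighted modularity of a single layer, i.e., the special case without inter-layer edges. It is not the multilayer formula of the paper; the function name and the toy graph of two triangles joined by one edge are assumptions for illustration.

```python
def modularity(edges, partition):
    """Weighted single-layer modularity.

    edges maps an undirected pair (u, v) to its weight;
    partition maps each node to its partition identifier.
    """
    total = sum(edges.values())
    inside = {}     # weight of edges that stay within a partition
    strength = {}   # summed connectivity strength of a partition's nodes
    for (u, v), w in edges.items():
        for node in (u, v):
            c = partition[node]
            strength[c] = strength.get(c, 0.0) + w
        if partition[u] == partition[v]:
            inside[partition[u]] = inside.get(partition[u], 0.0) + w
    return sum(inside.get(c, 0.0) / total - (s / (2 * total)) ** 2
               for c, s in strength.items())


# Two triangles (0, 1, 2) and (3, 4, 5) joined by the edge (2, 3).
toy_edges = {(0, 1): 1, (0, 2): 1, (1, 2): 1, (2, 3): 1,
             (3, 4): 1, (3, 5): 1, (4, 5): 1}
q_split = modularity(toy_edges, {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"})
q_merged = modularity(toy_edges, {n: "a" for n in range(6)})
```

Splitting the toy graph into its two triangles gives $Q = 5/14 \approx 0.36$, while merging everything into one partition gives $Q = 0$, illustrating why positive values indicate denser internal connectivity.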

B. Multilayer Partitioning
We convert the multilayer graph of Fog resources into a multilayer partition $(P(l), P(G_P))$ of layer partitions $P(l)$ and feature partitions $P(G_P)$ in three steps.
1) Layer partitioning defines a set of disjoint partitions $P(l) = \{p_1, p_2, \ldots\}$ in a layer $l$, and employs the Louvain clustering [31] and the modularity [30] metric to obtain high-quality partitions with densely connected devices. The Louvain algorithm applies two phases in multiple iterations until achieving a partition with the maximum modularity.
a) We consider each Fog device $d_j$ in a layer $l$ as a single partition, calculate the modularity, and define it as $Q_{max}$. Afterward, we consider all its neighboring devices $d_q$ (i.e., $(d_j, d_q) \in E_{ll}$ and $(d_q, d_j) \in E_{ll}$) and calculate the modularity $Q_1$ by considering the new possible partition. If the gain in modularity is positive (i.e., $Q_1 - Q_{max} > 0$), we place $d_j$ and $d_q$ in the same partition and consider $Q_1$ as $Q_{max}$. We repeat this step sequentially for all devices in the layer $l$ until no further modularity gain is possible.
b) We consider all partitions from the first phase as nodes of each layer $l$, connected according to the edges between their devices. The edges between the Fog devices in the same partition represent self-loops. This new graph and its maximum modularity represent the input to the next iterative step starting with the first phase.
2) Graph compression merges similar resources from a partition into a single node with an average speed or size in the processing, memory, and storage layers. This high-level intermediate representation of the partitions associated with similar resources automates the detection of overlapping partitions without a detailed analysis of individual devices.
A compressed graph $G_P = (V_P, E_P)$ corresponding to a multilayer graph $G = (D, E, L)$ consists of two sets:
• Layer partition nodes are the union of the partitions in the processing ($l_1$), memory ($l_2$), and storage ($l_3$) layers: $V_P = P(l_1) \cup P(l_2) \cup P(l_3)$;
• Inter-layer partition edges represent connections between a partition $p$ in a layer $l$ and a partition $p'$ in a layer $l' \neq l$, such that there is at least one inter-layer edge $(d_q, d_q) \in E_{ll'}$ in the original graph $G$ between a device $d_q \in p \in P(l)$ and a device $d_q \in p' \in P(l')$: $E_P = \{(p, p') \mid p \in P(l) \wedge p' \in P(l') \wedge l \neq l' \wedge \exists\, d_q \in p \cap p'\}$.
3) Feature partitioning splits a compressed graph $G_P = (V_P, E_P)$ into a set $P(G_P)$ of disjoint feature partitions (exhibiting similar features) by applying several Louvain clustering steps to achieve maximum modularity (similar to the layer partitioning from Section V-B1). A feature of a layer partition $p \in V_P$ is a quadruple $F_p = (R_{p1}, R_{p2}, R_{p3}, R_{p4})$ with the average number of cores, memory and storage sizes, and processing speed across all devices in $p$.
Example 4. We partition the multilayer graph in Fig. 2.
1) Layer partitioning: The Louvain algorithm finds the highest modularity partitions in each layer in two steps.
a) We create four partitions $P(l) = \{p_1, p_2, p_3, p_4\}$ in the layer $l$ and another four $P(l') = \{p'_1, p'_2, p'_3, p'_4\}$ in the layer $l'$ (see Fig. 4). Each partition in $P(l)$ and $P(l')$ consists of one Fog device (i.e., $d_1$, ...). Afterward, we consider all the neighboring devices and create two partition sets $P(l) = \{p_1, p_2\}$ and $P(l') = \{p'_1, p'_2\}$ with a positive modularity gain, indicating better device connectivity in each layer partition. Thus, we select the partitions $P(l)$ and $P(l')$, update the maximum modularities (i.e., $Q_{max} = Q_1$, $Q'_{max} = Q'_1$), and consider $p_1, p_2, p'_1, p'_2$ as nodes in a new multilayer graph.
b) We build one partition in each layer considering the partitions $p_1, p_2, p'_1, p'_2$ with the highest modularities $Q_{max}$ and $Q'_{max}$ from the first step, and the neighboring nodes, $P(l) = \{p_1\}$ and $P(l') = \{p'_1\}$, with the modularities $Q_2 = Q'_2 = 0$ (see Fig. 4). As the maximum modularities from the first step are positive for both layers, we consider them as the highly connected partition output.
2) Graph compression: Fig. 5(a) illustrates a compressed graph representing the four partitions in the $l$ and $l'$ layers from Fig. 4 as nodes. Similar to the nodes, we compress the edges between two partitions.
3) Feature partitioning:
a) We start from a feature partition set in which each feature partition $FP_k$ consists of a single layer partition (i.e., $p_1, p_2, p_3, p_4$). Afterward, we consider the neighboring partitions to generate a new feature partition set $P(G_P) = \{FP_1, FP_2\}$ with a positive modularity gain $Q = 0.37 > Q_{max}$, which becomes the maximum modularity $Q_{max} = Q$.
b) We start from the feature partition set $P(G_P)$ with the maximum modularity from the first step. We iteratively check the neighboring partitions of each feature partition $FP_k \in P(G_P)$, considered as a node of the new graph, and obtain a single partition $FP_1$ with a lower modularity $Q = 0$. We select the feature partition set $P(G_P) = \{FP_1, FP_2\}$ with the highest modularity from the first step as output.
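The first Louvain phase used in the layer and feature partitioning can be sketched as a greedy local-moving loop. This is a simplified single-layer illustration that recomputes the modularity from scratch for each candidate move, which is easy to follow but slower than the incremental gain computation of the real algorithm; the second (aggregation) phase is omitted, and all names and the toy graph are assumptions.

```python
def modularity(edges, community):
    """Weighted single-layer modularity of a node -> community assignment."""
    total = sum(edges.values())
    inside, strength = {}, {}
    for (u, v), w in edges.items():
        for node in (u, v):
            c = community[node]
            strength[c] = strength.get(c, 0.0) + w
        if community[u] == community[v]:
            inside[community[u]] = inside.get(community[u], 0.0) + w
    return sum(inside.get(c, 0.0) / total - (s / (2 * total)) ** 2
               for c, s in strength.items())


def local_moving(nodes, edges, eps=1e-12):
    """Phase one: move nodes to neighboring partitions while modularity grows."""
    community = {n: n for n in nodes}          # every device starts alone
    neighbors = {n: set() for n in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    q_max = modularity(edges, community)
    improved = True
    while improved:
        improved = False
        for n in nodes:
            home, best, best_q = community[n], community[n], q_max
            for target in sorted({community[m] for m in neighbors[n]} - {home}):
                community[n] = target          # try the candidate partition
                q1 = modularity(edges, community)
                if q1 > best_q + eps:          # keep only positive gains
                    best, best_q = target, q1
            community[n] = best
            if best != home:
                q_max, improved = best_q, True
    return community, q_max


# Two triangles (0, 1, 2) and (3, 4, 5) joined by the edge (2, 3).
toy_edges = {(0, 1): 1, (0, 2): 1, (1, 2): 1, (2, 3): 1,
             (3, 4): 1, (3, 5): 1, (4, 5): 1}
community, q = local_moving(list(range(6)), toy_edges)
```

On the toy graph, the loop converges to the two triangles as separate partitions, the same kind of outcome as the two-partition result of step a) in Example 4.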

C. Incremental Multilayer Partitioning
We update the multilayer partitions $(P^t(l), P^t(G_P))$ during the time intervals $\Delta t$ in three steps to maximize the modularity in response to the availability of Fog devices.
1) Incremental layer partitioning updates the previous layer partitions $P^{t-1}(l)$ to $P^t(l)$ based on the layer partition changes $\Delta P^t(l)$ computed from the multilayer graph changes $\Delta G^t$.
a) Incremental growth $GJ^t = (DJ^t, EJ^t)$ upon every joining device $d_{n+1} \in DJ^t$ has three cases.
Case-1: If $d_{n+1}$ has no intra-layer edges in the layer $l \in L$, we create an isolated partition $p = \{d_{n+1}\} \in \Delta P^t(l)$ in $l$, and consider the other layer partitions unchanged.
Case-2: If $d_{n+1}$ has intra-layer edges in the same layer partition $p \in P^t(l)$, we add $d_{n+1}$ to the partition $p \in \Delta P^t(l)$ and consider the other layer partitions unchanged.
Case-3: If $d_{n+1}$ has intra-layer edges in more than one layer partition, we add it to the partition $p \in \Delta P^t(l)$ with the highest aggregated similarity score $\Delta w^l_{(n+1)p} = \sum_{d_j \in p} w^l_{(n+1)j}$ across all its edges, where $m$ is the number of devices in $p$.
b) Incremental shrink $GL^t = (DL^t, EL^t)$ upon every leaving device $d_{n-1} \in DL^t$ has two cases.
Case-1: If $d_{n-1} \in p \in P^t(l)$ has no edges, we delete it from its associated partition $p \in \Delta P^t(l)$ and consider the other layer partitions unchanged.
Case-2: If $d_{n-1}$ has several connecting edges, we restructure all the devices in the partition $p \in \Delta P^t(l)$ containing $d_{n-1}$ and its neighboring partitions $p' \in \Delta P^t(l)$ into single partitions, and apply the Louvain algorithm.
2) Incremental graph compression updates the compressed graph $G^t_P = (V^t_P, E^t_P)$ at the time interval $\Delta t$ based on the layer partition changes $\Delta P^t(l)$ in the processing, memory, and storage layers during the time interval $\Delta t$. We compute the compressed graph changes $\Delta G^t_P = (\Delta V^t_P, \Delta E^t_P)$ by merging the layer partition changes $\Delta P^t(l)$ and their inter-layer edges into two sets of single nodes $\Delta V^t_P$ and edges $\Delta E^t_P$.
3) Incremental feature partitioning updates the feature partitions $P^t(G_P)$ at the time interval $\Delta t$ using the compressed graph changes $\Delta G^t_P = (\Delta V^t_P, \Delta E^t_P)$ and the feature partitions during the interval $\Delta t = [t-1, t]$. We restructure the layer partition changes $\Delta V^t_P$ in the processing, memory, and storage layers to single partitions and repeat the feature partitioning steps for the compressed graph changes $\Delta G^t_P$ (see Sections V-B2 and V-B3). We leave the other partitions unchanged.
Example 5. We update partitions upon incremental changes in the multilayer graph $\Delta G^t = (GJ^t, GL^t)$ from Section III-D.
1) Incremental layer partitioning updates the partitions based on the incremental growth and shrink in two steps:
GJ t : The joining device d 5 in the layer l belongs to Case-3, since it has two edges to the layer partitions p 1 , p 2 ∈ P(l) with the aggregated similarity scores of w l 51 = 0.5 and w l 52 = 1. We, therefore, assign d 5 to the layer partition p 2 with the highest score and update the layer partition changes ΔP t (l) = p 2 . The joining device d 5 in the layer l′ has two edges to the same layer partition p′ 2 ∈ P(l′) and belongs to Case-2. Thus, we add it to the layer partition p′ 2 and update the layer partition changes ΔP t (l′) = p′ 2 .
GL t : The leaving device d 1 in the layers l and l′ represents Case-2 in Section V-C. We restructure the layer partitions p 1 ∈ P(l) and p′ 1 ∈ P(l′) to single partitions and update them to p 1 = {d 2 , d 3 } and p′ 1 = {d 2 } using the Louvain algorithm. Finally, we update the layer partition changes to ΔP t (l) = {p 1 , p 2 } and ΔP t (l′) = {p′ 1 , p′ 2 }.
2) Incremental graph compression joins the devices in the layer partitions {p 1 , p 2 } ∈ ΔP t (l) and {p′ 1 , p′ 2 } ∈ ΔP t (l′) into single nodes. Similarly, it compresses the inter-layer edges between these partitions to single edges:

D. Multilayer Resource Partitioning Algorithm
Algorithm 1 receives the following input parameters: 1) a Fog multilayer graph with four layers L, 2) a set of Fog devices D and their processing, memory, and storage resources, and 3) the inter-layer and intra-layer edges E. First, line 1 initializes five empty sets corresponding to the partitions in the network (l 0 ), processing (l 1 ), memory (l 2 ), and storage (l 3 ) layers, and the feature partitions. Line 2 clusters densely connected Fog devices in the same network layer partition P(l 0 ). Similarly, lines 3-5 partition the Fog devices in the processing P(l 1 ), memory P(l 2 ), and storage P(l 3 ) layers (see Section I). Line 6 creates a compressed graph G P (V P , E P ) using the inter-layer edges between the processing, memory, and storage layers, where V P = {P(l 1 ), P(l 2 ), P(l 3 )} (see Section II). Afterward, line 7 computes the feature F P of each partition p ∈ V P as the average number of cores, memory size, storage size, and processing speed of its devices d j ∈ p. Line 8 performs feature partitioning on the compressed graph, as presented in Section III. Finally, line 9 returns the feature partition set P(G P ) and the set of partitions in the network layer P(l 0 ).
The computational complexity of Algorithm 1 is O(|L| · |E ll | + |E P |), where |L| is the number of layers in the Fog multilayer graph, |E ll | is the number of intra-layer edges in layer l, |E P | is the number of edges in the compressed graph G P , and |L| ≪ |E P | ≪ |E ll |. Hence, the algorithm has a linear complexity of O(|E ll |).
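The feature computation in line 7 of Algorithm 1 can be illustrated with a minimal Python sketch. The device names, resource values, and the helper `partition_feature` are hypothetical and only show the averaging step; they are not taken from the authors' implementation.

```python
from statistics import mean

# Hypothetical device resources: (cores, memory in GB, storage in GB, speed in MIPS).
DEVICES = {
    "d1": (4, 8, 64, 2000),
    "d2": (2, 4, 32, 1000),
    "d3": (8, 16, 128, 4000),
}


def partition_feature(partition, devices):
    """Average every resource dimension over the devices of one partition."""
    rows = [devices[d] for d in sorted(partition)]
    return tuple(mean(column) for column in zip(*rows))


# Feature vectors for two hypothetical partitions of the compressed graph.
features = {
    "p1": partition_feature({"d1", "d2"}, DEVICES),
    "p2": partition_feature({"d3"}, DEVICES),
}
print(features)
```

Under these assumed values, the partition containing d1 and d2 is summarized by the mean of their four resource dimensions, which is the vector that the later feature partitioning and fitness steps operate on.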

E. Incremental Multilayer Partitioning Algorithm
Algorithm 2 has the following parameters: 1) a Fog multilayer graph during the time interval Δt with four layers L, 2) the inter-layer and intra-layer edges E t , 3) the incremental multilayer graph changes ΔG t during the time interval [t − 1, t], and 4) the layer P t−1 (l) and feature partitions P t−1 (G p ) of the previous time interval. First, line 1 initializes five empty sets corresponding to the partitions in the network (l 0 ), processing (l 1 ), memory (l 2 ), and storage (l 3 ) layers, and the feature partitions P t (G p ) during the time interval Δt. Lines 2-4 update the layer partitions and calculate the changes ΔP t in the network, processing, memory, and storage layers based on the incremental multilayer graph changes ΔG t during the interval Δt and the previous layer partition P t−1 (l i ) (see Section V-C). Line 5 generates a compressed graph of partition changes in the processing, memory, and storage layers and their associated inter-layer edges. Line 6 calculates the feature F p of layer partitions changes as the average number of cores, memory and storage size, and processing speed of their devices d t j ∈ ΔV t P . Line 7 updates the feature partitions using the compressed graph changes ΔG t p (ΔV t p , ΔE t p ), and the previous feature partitions P t−1 (G p ) (see Section V-C). Finally, line 8 returns the network layer and feature partitions (P t (l 0 ), P t (G p )) during the time interval Δt.
The dLayerPartition function updates the partitions upon changes in the Fog (line 10). Line 11 initializes the layer partition changes ΔP t and the layer partitions P t (l i ) with the empty set. Afterward, lines 12-14 iterate through all joining devices d t j ∈ DJ t , and line 13 updates the partitions based on the three incremental growth cases defined in Section V-C. Similarly, lines 15-17 iterate through all the leaving devices d t j ∈ DL t , and line 16 updates the partitions based on the two incremental shrink cases explained in Section V-C. Line 18 creates a new set of partitions P t (l i ) based on the layer partition changes ΔP t (l i ) and the previous layer partitions P t−1 (l i ). Line 19 returns the layer partition changes ΔP t (l i ) and the updated layer partitions P t (l i ).
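The three incremental growth cases handled for each joining device can be sketched as follows. This is a simplified illustration under stated assumptions: partitions are plain Python sets, intra-layer edges are a dictionary of similarity weights, and the function name `place_joining_device` is hypothetical. The shrink cases, which require rerunning the Louvain algorithm, are not covered.

```python
def place_joining_device(device, edges, partitions):
    """Assign a joining device to a layer partition and return its index.

    edges:      {neighbor: similarity weight} for the intra-layer edges of `device`.
    partitions: list of sets of devices; modified in place.
    """
    # Aggregate the edge weights towards every partition that holds a neighbor.
    scores = {}
    for index, members in enumerate(partitions):
        weights = [w for neighbor, w in edges.items() if neighbor in members]
        if weights:
            scores[index] = sum(weights)

    if not scores:
        # Case-1: no intra-layer edges, so the device forms an isolated partition.
        partitions.append({device})
        return len(partitions) - 1

    # Case-2 (one candidate) and Case-3 (several candidates):
    # pick the partition with the highest aggregated similarity score.
    best = max(scores, key=scores.get)
    partitions[best].add(device)
    return best


# Hypothetical layer with two partitions; d5 has edges with weights 0.5 and 1.0.
partitions = [{"d1", "d2", "d3"}, {"d4"}]
changed = place_joining_device("d5", {"d2": 0.5, "d4": 1.0}, partitions)
isolated = place_joining_device("d6", {}, partitions)
```

With these assumed weights, d5 joins the second partition because its aggregated score of 1.0 exceeds 0.5, and d6 forms a new isolated partition.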
The compressGraph function initializes the compressed node changes ΔV t p and their associated edges ΔE t p with the empty set in line 22. Lines 23-26 iterate through the partitions in the layer partition changes and add them together with their associated inter-layer edges to the compressed graph changes ΔG t p (ΔV t p , ΔE t p ) returned in line 27.
The dFeatPart function updates the feature partitions P t (G p ) upon incremental changes ΔP t (G p ) (line 29), initialized in line 30 with the empty set. Lines 31-33 iterate through the changed partitions V t p in the compressed graph ΔG t p . Line 32 updates the previous feature partitions P t−1 (G p ) by repeating the feature partitioning step for every change V t P in the compressed graph, added to the feature partition changes ΔP(G p ). Line 34 updates the feature partitions P t (G p ) by adding the changes ΔP t (G p ) to the previous feature partitions P t−1 (G p ), returned in line 35.
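A minimal sketch of the compression step follows, assuming that graph nodes are (layer, device) pairs and that changed partitions carry string identifiers. The function name `compress_changes`, the identifiers, and the choice to count merged inter-layer edges as an edge weight are illustrative assumptions, not the authors' implementation.

```python
def compress_changes(changed, inter_edges):
    """Merge changed layer partitions into single nodes and their edges into single edges.

    changed:     {partition id: set of (layer, device) nodes}.
    inter_edges: iterable of ((layer, device), (layer, device)) inter-layer edges.
    Returns (delta_v, delta_e), where delta_e maps a partition pair to the
    number of inter-layer edges merged into that single edge.
    """
    owner = {node: pid for pid, nodes in changed.items() for node in nodes}
    delta_v = set(changed)
    delta_e = {}
    for a, b in inter_edges:
        pa, pb = owner.get(a), owner.get(b)
        # Skip edges that touch unchanged partitions or stay inside one partition.
        if pa is None or pb is None or pa == pb:
            continue
        key = tuple(sorted((pa, pb)))
        delta_e[key] = delta_e.get(key, 0) + 1
    return delta_v, delta_e


# Hypothetical changes in the processing ("cpu") and memory ("mem") layers.
changed = {
    "cpu:p1": {("cpu", "d2"), ("cpu", "d3")},
    "mem:p1": {("mem", "d2")},
}
inter_edges = [
    (("cpu", "d2"), ("mem", "d2")),
    (("cpu", "d3"), ("mem", "d2")),
    (("cpu", "d3"), ("mem", "d3")),  # endpoint outside the changed partitions
]
delta_v, delta_e = compress_changes(changed, inter_edges)
```

Only the changed partitions enter the compressed graph changes, which is why the update cost depends on the size of the changes rather than on the whole multilayer graph.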
The computational complexity of Algorithm 2 is O(|L| · |EJ ll + EL ll | + |ΔE P |), where |EJ ll | and |EL ll | are the numbers of intra-layer edges of joining and leaving devices in the layer l, |ΔE P | is the number of edges in the compressed graph changes ΔG p , and |L| ≪ |ΔE P | ≪ |EJ ll + EL ll |. Hence, the algorithm has a linear complexity of O(|EJ ll + EL ll |), which is negligible compared to Algorithm 1 (|EJ ll + EL ll | ≪ |E ll |), since it only updates the partitions based on incremental changes.

VI. M-RAP PLACEMENT
This section describes the placement of application services in dynamically selected feature partitions based on two dynamic application and service placement algorithms.

A. Dynamic Application Placement Algorithm
The dynamic application placement algorithm maps each application service during a time interval to a feature partition with the highest fitness. We define the fitness of a feature partition F P t k for a service s i ∈ S based on two terms: the similarity of its Fog devices to the service resource demands and the lowest transmission time to the requesting user u.
1) The partition similarity is the maximum similarity between the service demand s i = (r i1 , r i2 , r i3 , r i4 ) and the feature F p = (R t p1 , R t p2 , R t p3 , R t p4 ) in F P t k at the time interval Δt, calculated using the Euclidean distance in the [0, 1] interval. Placing a service on a device in a partition with the highest similarity satisfies the service demands and avoids resource wastage on devices with higher capacity.
2) Rank(p, T R uj ) ranks the partition p based on the transmission time T R uj between the device d t j ∈ p ∈ F P t k and the user u requesting the service s i . The partition with the highest rank, normalized in the [0, 1] interval, contains the device d t j with the lowest transmission time to the user.
Algorithm 3 receives five parameters for computing feature partitions that satisfy the application deadline Θ and individual service resource demands: 1) an application workload W over t time intervals, 2) the feature partition set P t (G P ), 3) the network partition set P t (l 0 ), 4) the message transmission times T between the users and the Fog devices, and 5) the leaving devices during the time interval Δt. First, line 1 sorts the applications AS t requested in the time interval Δt to prioritize those with the lowest deadline. The algorithm places the applications in two steps.
1) Similarity placement (lines 2-4) selects the appropriate feature partitions for all applications A ∈ AS t requested during the time interval Δt and places their services s i ∈ S on the Fog devices that satisfy their resource demands close to the requesting users. In this first stage, the placeApp function considers the service deadlines and declares an invalid placement μ t (s i ) = ∅ in case of any violation.
2) Monitoring (lines 5-13) iterates through all executing services and takes two actions: a) release the resources of completed tasks (lines 6-8); b) replace the mapping devices in case of leaving devices (lines 9-10) or invalid placements (lines 11-12) by invoking the placeApp function again to select a new feature partition for placing the affected service close to its requesting user while satisfying the resource demands. In the case of an invalid placement, the new invocation relaxes the hard deadline (set to infinity) and places the affected service on the fastest available device. The algorithm returns an array of placement functions μL [W ] in line 13.
The placeApp function selects the feature partition (line 16) that satisfies the resource demand of each service s i ∈ S (line 17). Lines 19-22 iterate through each feature partition F P t k ∈ P t (G P ), calculate its fitness to s i , and insert it in an fpRank list in descending fitness order (line 20). Line 21 sorts the devices d t j ∈ F P t k in ascending order based on their transmission time T R uj to user u and stores them in a matrix dMatr, where each row contains the transmission time to Fog devices in a feature partition F P t k ∈ P t (G P ). Line 23 invokes a service placement function (see Algorithm 4) that maps all application services s i ∈ S onto devices with appropriate features in the same network layer partition. The function returns the device μ t (s i ) in line 25, which updates the application placement μ t returned in line 23.
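The fitness-based ranking of feature partitions can be sketched in Python as below. This is only an illustration under explicit assumptions: the demand and feature vectors are already scaled to [0, 1], the similarity is one minus the Euclidean distance divided by its maximum, and the two terms are simply added. The exact way the paper combines similarity and rank is not reproduced in this excerpt, so the additive form and all names (`similarity`, `rank_partitions`) are hypothetical.

```python
import math


def similarity(demand, feature):
    """Similarity in [0, 1] from the Euclidean distance of two [0, 1]-scaled vectors."""
    max_distance = math.sqrt(len(demand))  # distance between opposite corners
    return 1.0 - math.dist(demand, feature) / max_distance


def rank_partitions(demand, partitions):
    """Return partition names in descending fitness order (an fpRank-like list).

    partitions: {name: (feature vector, lowest transmission time to the user in ms)}.
    """
    times = [time for _, time in partitions.values()]
    lowest, highest = min(times), max(times)
    fitness = {}
    for name, (feature, time) in partitions.items():
        # Normalized rank: 1 for the lowest transmission time, 0 for the highest.
        rank = 1.0 if highest == lowest else (highest - time) / (highest - lowest)
        # Assumption: fitness adds both terms with equal weight.
        fitness[name] = similarity(demand, feature) + rank
    return sorted(fitness, key=fitness.get, reverse=True)


demand = (0.25, 0.25, 0.25, 0.25)
candidates = {
    "fp1": ((0.25, 0.25, 0.25, 0.25), 5.0),
    "fp2": ((1.0, 1.0, 1.0, 1.0), 2.0),
    "fp3": ((0.5, 0.5, 0.5, 0.5), 20.0),
}
fp_rank = rank_partitions(demand, candidates)
```

In this example, the oversized partition fp2 has the lowest transmission time but ranks behind fp1, whose resources match the demand exactly, which mirrors the stated goal of avoiding resource wastage on devices with higher capacity.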
The computational complexity of Algorithm 3 is O(|S| · |P t (G p )|), where |S| is the number of services and |P t (G p )| is the number of feature partitions at the time interval Δt. The algorithm reduces the service placement complexity by clustering similar devices into a feature partition. Therefore, it only searches for appropriate feature partitions instead of iterating through all devices D, where |P t (G p )| ≪ |D|. Since

B. Service Placement Algorithm
The service placement algorithm allocates a Fog device to each application service in the selected feature partitions. Placing the services of an application across tightly connected Fog devices in the same network layer partition during a time interval brings two advantages: 1) less network instability due to alternative connections between Fog devices; 2) lower transmission time between devices inside a partition, which reduces application response time. Algorithm 4 receives the following input parameters: 1) a service s i to place on a Fog device μ t (s i ) in the same network layer partition as the other application services; 2) an absolute deadline θ i for a service s i , 3) a network layer partition set P t (l 0 ), 4) a feature partition set fpRank ranked based on the fitness to s i , and 5) a sorted list of devices based on their transmission time in each feature partition F P t k ∈ fpRank. Lines 2-3 extract the network layer partition of the first service placement μ t (s 1 ) in p 1 (if available). Lines 4-12 iterate through the feature partitions F P t k ∈ fpRank at the time interval Δt in descending order of their fitness. Afterward, line 5 extracts the set of Fog devices dList in each feature partition F P t k sorted by the transmission times to the requesting user. To place the service s i onto a Fog device, lines 6-12 iterate through each device d t j ∈ dList and line 7 extracts its network layer partition in p i . If this partition is the same as p 1 and the device d t j meets the resource constraints of the service s i (including its deadline θ i ), line 9 performs the placement and line 10 updates the available resources of device d t j . If no service placement on the same network partition is possible, line 14 assigns an invalid device. Lines 11 and 14 return the placement result.
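The device selection loop of the service placement can be sketched as follows. The data structures (`fp_rank`, `d_matr`, `net_part`, `free`, `resp_ms`) and the function name `place_service` are hypothetical simplifications; in particular, the estimated response time per device is assumed to be precomputed, which this excerpt does not specify.

```python
def place_service(demand, deadline_ms, p1, fp_rank, d_matr, net_part, free, resp_ms):
    """Return the first device that fits the service, or None for an invalid placement.

    demand:   required resources, e.g. (cores, memory in GB).
    p1:       network partition of the first placed service (None if not yet placed).
    fp_rank:  feature partitions in descending fitness order.
    d_matr:   {feature partition: devices in ascending transmission time}.
    net_part: {device: network layer partition id}.
    free:     {device: available resources}; updated on placement.
    resp_ms:  {device: estimated response time in ms} (assumed to be given).
    """
    for partition in fp_rank:
        for device in d_matr[partition]:
            same_network = p1 is None or net_part[device] == p1
            fits = all(f >= r for f, r in zip(free[device], demand))
            if same_network and fits and resp_ms[device] <= deadline_ms:
                # Reserve the resources on the selected device.
                free[device] = tuple(f - r for f, r in zip(free[device], demand))
                return device
    return None


fp_rank = ["fpA", "fpB"]
d_matr = {"fpA": ["d1", "d2"], "fpB": ["d3"]}
net_part = {"d1": 0, "d2": 1, "d3": 1}
free = {"d1": (4, 8), "d2": (1, 2), "d3": (4, 8)}
resp_ms = {"d1": 10, "d2": 12, "d3": 15}

placed = place_service((2, 4), 50, 1, fp_rank, d_matr, net_part, free, resp_ms)
rejected = place_service((2, 4), 5, 1, fp_rank, d_matr, net_part, free, resp_ms)
```

Here d1 is skipped because it lies in a different network partition, d2 lacks capacity, and d3 hosts the service; the second call fails because no device meets the 5 ms deadline.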
The computational complexity of Algorithm 4 is O(1) in the best case, by placing a service onto a feature partition F P t k with the highest fitness and the device with the lowest user transmission time. If no device with sufficient resources exists, the algorithm searches for other feature partitions and their associated devices, leading to a worst-case complexity of O(|P t (G p )| · |F P t K |), where |P t (G p )| is the number of feature partitions during the time interval Δt and |F P t K | is the number of devices in the feature partition F P t K . Since |P t (G p )| ≪ |F P t K |, the algorithm has a linear worst-case complexity of O(

VII. EXPERIMENTAL SETUP
We perform experiments based on a real computing testbed and real-world applications and compare the results against three related methods. We first validate M-RAP using simulation in Section VIII followed by a real testbed in Section IX.

A. Carinthian Computing Continuum
We used the Carinthian Computing Continuum (C 3 ) testbed consisting of eight heterogeneous virtual resource instances from four providers distributed in seven geographical locations across the Cloud and Fog layers, displayed in Table IV.
Cloud resources consist of virtualized large instances provisioned on-demand on the Exoscale data centers (https://www.exoscale.com/compute/) in Sofia.
Fog devices comprise Exoscale provider instances in the data centers of the A1 network operator in Vienna, Zurich, and Munich with a 10Gbit/s network throughput. The instances are of types medium, small, and tiny running Ubuntu 18.04 LTS. The University of Klagenfurt (AAU) [17] provisions virtualized large instances with 12-core AMD Ryzen Threadripper 2920X processors at 3.5GHz and 32GB of memory, and medium with 8-core processor and 16GB of memory running Ubuntu 18.04 LTS. Finally, five NVIDIA Jetson Nano (NJN) running Linux for Tegra (L4T), three Raspberry Pi-3 B+ (RPi3B+), and 32 Raspberry Pi-4 (RPi4) with Raspberry Pi OS complete the testbed.

B. Related Work Comparison
We compare M-RAP with three state-of-the-art methods investigating application placement on Fog and Cloud.
1) Availability-aware placement (AAP) [9] uses a greedy algorithm to improve the application deadline satisfaction upon failures. For this purpose, it partitions the Fog devices into hierarchical clusters based on network connectivity without considering the resource characteristics. AAP places applications with higher deadlines onto the Cloud upon insufficient Fog devices.
2) Resource-aware placement (RAP) [15] uses a fractional selectivity model to place services onto the Fog and Cloud based on their requirements. RAP optimizes resource utilization and response time but ignores network connectivity.
3) Joint container placement (JCP) [16] uses particle swarm optimization to improve the placement rate and deadline satisfaction of monolithic applications in a dynamic Fog and Cloud environment. JCP does not support workflows.
4) Extended M-RAP uses Cloud instances upon insufficient Fog devices to enable a fair comparison against the previous related methods. After Algorithm 3 sorts and prioritizes the placement of applications with lower deadlines onto the available Fog devices, it places the remaining ones with higher deadlines onto Cloud instances with the highest similarity score and the lowest transmission time to the requesting user to satisfy their resource demands.

VIII. SIMULATION EXPERIMENTS
We evaluated the M-RAP method using the YAFS [32] simulator on an Intel Core™ i7-8650U processor at 1.90GHz and 16GB of RAM, running Ubuntu 18.04 LTS.

A. Workload Simulation
We use the gn_graph function from the NetworkX package to generate workflows compliant with the model in Section III-E. Each request has a size in the range of 0.002 MB-50 MB, which generates a service workload in the range of 100 MI-500 MI according to a representative data stream workflow [1]. Each service has a deadline in the range of 20 ms-200 ms based on the analysis of five types of time-critical applications (see Table V). We generated the application microservices and their resource requirements (i.e., processing speed, number of cores, memory, and storage size) using a uniform random distribution based on a previous study [9].
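As a rough illustration of the workload parameters, the following sketch samples service attributes uniformly from the ranges stated above. It is an assumption-laden simplification: the request size and the workload are drawn independently (the mapping between them is not specified here), the workflow structure is omitted, and the function name `sample_service` and the seed are hypothetical.

```python
import random


def sample_service(rng):
    """Draw one service with attributes uniformly sampled from the stated ranges."""
    return {
        "request_mb": rng.uniform(0.002, 50.0),   # request size in MB
        "workload_mi": rng.uniform(100.0, 500.0),  # workload in million instructions
        "deadline_ms": rng.uniform(20.0, 200.0),   # deadline in ms
    }


rng = random.Random(42)  # fixed seed for a reproducible sample
services = [sample_service(rng) for _ in range(100)]
```

A fixed seed keeps the generated workload reproducible across simulation runs.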

B. Fog Simulation
We simulated the Fog as a bidirectional Barabási-Albert random graph [33] with 100 C 3 devices, as recommended in [9], [34], using the Python NetworkX package. We selected 25 devices with the highest betweenness centrality [35] as Fog gateways. The device with the lowest betweenness centrality represents the Cloud data center. We configured the network latency and bandwidth by sending ICMP echo requests and measuring the maximum throughput using the iPerf3 tool [17] (see Table IV). We simulated the dynamic Fog over 10000s and configured the joining and leaving Fog devices using a uniform random distribution based on an availability model in the range of 90-99% [36]. We selected a simulation time interval of 20ms, representing the lowest service deadline of the simulated time-critical applications (see Table V). We monitor the leaving and joining devices in each time interval and perform incremental multilayered resource partitioning upon Fog infrastructure changes.
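One possible reading of the availability model is sketched below: every device receives an availability drawn uniformly from [0.90, 0.99], and in each time interval it is reachable with that probability. This interpretation, the function name `simulate_availability`, and the seed are assumptions for illustration, not the configuration used in the paper.

```python
import random


def simulate_availability(n_devices, n_intervals, seed=0):
    """Return a list of (interval, joining devices, leaving devices) events."""
    rng = random.Random(seed)
    # Assumption: one availability value per device, uniform in [0.90, 0.99].
    availability = [rng.uniform(0.90, 0.99) for _ in range(n_devices)]
    up = [True] * n_devices  # all devices start as reachable
    events = []
    for t in range(n_intervals):
        joining, leaving = [], []
        for d in range(n_devices):
            now_up = rng.random() < availability[d]
            if now_up and not up[d]:
                joining.append(d)
            elif not now_up and up[d]:
                leaving.append(d)
            up[d] = now_up
        if joining or leaving:
            events.append((t, joining, leaving))
    return events


# 100 devices over 50 intervals of 20 ms each (a short excerpt of the full run).
events = simulate_availability(100, 50)
```

A full run of 10000s with 20ms intervals corresponds to 500,000 intervals; each event would trigger one incremental partitioning update.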

C. Simulation Scenarios
We divide the simulation into three parts, shown in Table VI. We split the 10000s simulation in five simulation periods (i.e., ΔT 1 -ΔT 5 ) of 2000s each to simplify the analysis and direct comparison with the AAP method [9]. 1) Dynamic service placement evaluates the Fog placement rate, failure rate, resource wastage, and hop distance in three scenarios (i.e., SMALL, MEDIUM, LARGE). Each scenario contains randomly generated application sets, the number of services, users, and application requests.

D. Dynamic Service Placement
We evaluate the dynamic placement regarding Fog placement rate, resource wastage, and hop distance.
1) Fog Placement Rate: Fig. 6 shows the placement and failure rates (marked "F" and calculated as the ratio between the number of failed services and the total requested services S µ ) on the Fog devices over the 10000s simulation time in the three scenarios.
SMALL: Fig. 6(a) shows that M-RAP achieved an average Fog placement ratio of 0.94 during all simulation periods with only 0.03 failed services. Although AAP placed a similar number of services in the Fog during the ΔT 1 , ΔT 3 , and ΔT 5 periods, the failure rate increased from 0.1 to 0.46, leading to an average Fog placement ratio of 0.60. JCP performed worst with an average placement ratio of 0.23.
MEDIUM: Fig. 6(b) shows that M-RAP outperformed the other methods during all time intervals with an average Fog placement ratio of 0.68. M-RAP incurs a failure ratio of only 0.034 thanks to its incremental partitioning approach. Although JCP can also execute applications in a dynamic Fog with a low failure ratio (0.03), it only achieved an average placement ratio of 0.23. In contrast, AAP and RAP cannot cope with the changing device availability and placed only 0.44 and 0.45 of the services, respectively, with average failure ratios of 0.14 and 0.15.
LARGE: Fig. 6(c) shows that M-RAP and JCP placed the services in the Fog with only 0.02 and 0.01 failed services, respectively. M-RAP achieved a higher average ratio of 0.4, while AAP and RAP placed only 0.29 and 0.26 of the services, respectively, with average failure rates of 0.11 and 0.07 (increasing from ΔT 1 to ΔT 5 ).
Summary: M-RAP outperformed the related methods by placing services onto a consolidated set of devices with high resource similarity, leaving powerful devices for other placements. M-RAP considers the device's availability and introduces low failures. In contrast, AAP and RAP are static greedy methods that deliver higher wastage, lower placement, and higher failure rates since they do not consider network changes. Although JCP exhibited fewer failures, it placed fewer applications by considering them monolithic.
2) Resource Wastage: Fig. 7 compares the four methods' average core, memory, and storage wastage (see Section III-G) over a simulation time of 10000s. Figs. 8 and 9 analyze the idle resources, indicating the percentage of underutilized cores and memory on each device d t j .
SMALL: M-RAP and RAP provide an average core wastage of 0.60 and a memory wastage of 0.47, while 50% of the Fog devices are idle. AAP delivered a slightly better core wastage of 0.57 and an average memory wastage of 0.54; however, it did not reduce the idle Fog devices below 50%. In contrast, JCP has the highest core wastage of 0.97 and a memory wastage of 0.90 with 95% of idle Fog devices.
MEDIUM: AAP and RAP delivered an average core wastage of 0.31 and 0.3 with 20%, respectively 38% of idle Fog devices. M-RAP outperformed them with a lower average core wastage of 0.29 and 10% idle Fog devices. Once again, JCP performs worst with average core and memory wastage of 0.97 and 85% idle Fog devices.
LARGE: M-RAP has average core wastage of 0.27 and 10% idle Fog devices. AAP and RAP performed equally well (0.3) with insignificantly more idle devices. JCP had the highest wastage of 0.88 and 78% idle Fog devices. We can observe the same trends for memory wastage and idle memory in all three methods. In contrast, the average storage wastage was very high in all cases, with an average of 0.9 for M-RAP, AAP, and RAP and 0.95 for JCP.
Summary: The core, memory, and storage resources remained vastly underutilized in all methods. M-RAP maximizes the Fog placement rate and consumes more resources, which reduces the core and memory wastage compared to the other methods. We observe no significant differences in storage wastage. M-RAP consolidates resource capacity by placing services on fewer devices with similar requirements, keeping powerful devices available, and reducing resource fragmentation and waste.
3) Hop Distance: Fig. 10 compares the average hop distance, indicating the proximity of the hosting devices to the requesting users over the physical network channels during the simulation time of 10000s. One hop distance indicates a service placement at the Fog gateway devices. We compute a hop distance histogram to increase the placement rate at a low hop distance across all application services.
SMALL: Fig. 10(a) shows that M-RAP and AAP placed 42 and 38 services, respectively, at the first hop distance, while RAP and JCP did not manage to place any services. M-RAP and AAP outperformed RAP and JCP by placing 33 and 46 services at hop distance 2, and 56 and 70 services at hop distance 3, respectively. In contrast, RAP and JCP placed 5 services at hop distance 2, and 36 and 25 services, respectively, at hop distance 3.
MEDIUM: Fig. 10(b) shows that M-RAP placed more services near end-users, i.e., 41 services at hop distance 1 and 48 services at hop distance 2. AAP placed 28 services at hop distance 1 and 35 at a hop distance 2. RAP and JCP performed worst and placed few services at hop distances of 1 and 2, and the rest at hop distances of 4 and 5.
LARGE: Fig. 10(c) shows that M-RAP outperformed the other methods by placing 44, 58, and 109 services at the first three hop distances, respectively. In comparison, AAP placed slightly fewer services at the first three hop distances (i.e., 39, 55, and 102). RAP and JCP performed worst by placing 94% of the services at the distances of 3, 4, and 5.
Summary: M-RAP places more services across highly connected Fog devices closer to the users by considering the message transmission time as part of its methodology.
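A hop distance histogram of this kind can be computed with a breadth-first search from the requesting user, as in the following sketch. The toy topology, the placement list, and the function name `hop_distances` are hypothetical and serve only to show the computation.

```python
from collections import Counter, deque


def hop_distances(adjacency, source):
    """Breadth-first search returning the hop distance from `source` to every node."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance


# Hypothetical topology: the user reaches the Fog through a gateway (one hop).
adjacency = {
    "user": ["gw"],
    "gw": ["user", "f1", "f2"],
    "f1": ["gw", "cloud"],
    "f2": ["gw"],
    "cloud": ["f1"],
}
placements = ["gw", "f1", "f2", "f1"]  # hosting device of each placed service

distance = hop_distances(adjacency, "user")
histogram = Counter(distance[device] for device in placements)
```

In this toy case, one service sits at the gateway (hop distance 1) and three services sit one hop further.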

E. Runtime Analysis
We model the total application response time in three aggregated components, following our methodology in [37]:
1) Failure overhead caused by the leaving Fog devices has two components: a) lost computation overhead from a service failure, detected at the end of a time interval, and b) placement overhead from executing the incremental resource partitioning and placement algorithms.
2) Transmission and execution times represent all services' average successful response time after placing the failed ones on new devices.
3) Delay is the difference between the response time and the deadline in case of a deadline miss.
1) Response Time: Fig. 11 shows the average response time and the failure overhead (marked as "F") in a dynamic Fog. Fig. 12(a) and (b) indicate the simulation's average transmission and service execution times.
D-SMALL: Fig. 11(a) shows that M-RAP outperformed the other methods for all time intervals with an average response time of 32.42ms and 4.3ms failure overhead. Although AAP performed slightly better for the first three simulation periods, it provided a higher average response time of 92.17ms due to the higher failure placement. RAP and JCP performed worst with average response times of 128.11ms and 116.6ms due to higher transmission time (see Fig. 12(a)). RAP had a high failure overhead of 41.5ms.
D-MEDIUM: Fig. 11(b) shows that M-RAP reduces the response time and failure overhead to an average of 75ms and 6.6ms, respectively, due to the lower transmission time (see Fig. 12(a)). The improvements are most evident during the ΔT 5 period due to more devices joining and leaving the network. AAP provides a higher response time with an average of 117.38ms by placing fewer services onto the Fog and ignoring the network changes, leading to higher transmission times (see Fig. 12(a)) and failure overheads. RAP performed worse with an average response time of 168.07ms since it ignores service dependencies and network changes, causing higher transmission times (see Fig. 12(a)) and high failure overheads of 29.13ms. Although JCP provides low execution times (see Fig. 12(b)), this negligible improvement does not affect the response time.
D-LARGE: Similar to other scenarios, Fig. 11(c) shows that M-RAP provides the lowest response times with an average of 82.36ms, negligible failure overheads of 4.84ms and lower transmission time (see Fig. 12(a)). Although JCP had an average failure overhead of 3.36ms, it provides a higher average response time of 101ms by placing more services in the Cloud with higher transmission time (see Fig. 12(a)). AAP and RAP performed worse with average response times of 94ms and 124ms, and failure overheads of 19.75ms and 21.76ms. The response time difference is again evident over multiple time intervals, as the static AAP and RAP methods do not consider network changes.
Summary: Fig. 11 shows that M-RAP outperforms the related methods by considering the network interconnections and introducing very low failure overheads. M-RAP places services onto highly connected devices with low transmission time close to users in MEDIUM and LARGE scenarios. In contrast, AAP performs better in the SMALL scenario by applying a greedy algorithm. Fig. 12(b) shows that M-RAP provides a higher execution time by placing most services in the Fog, while still outperforming JCP, which places most services in the high-performance Cloud. However, JCP achieves higher response times due to the higher transmission time to the Cloud (see Fig. 12(a)). Finally, AAP and RAP use static algorithms that do not consider network changes and fail to place all services successfully.
2) Failure Overhead: Fig. 11(d) performs an aggregated failure overhead analysis that splits the total average response time of all simulated applications that encountered at least one failed device in three components, following our methodology in [37]. Since most executions exhibited one service failure, the average lost computation is approximately equal to one time interval of 20ms. The placement overhead of executing the low-complexity incremental resource partitioning and placement algorithms is around 8ms (see Section IX).
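The overhead arithmetic can be made concrete with a short sketch. The 20ms lost computation and 8ms placement overhead come from the text above; the 112ms response time and the 100ms deadline are hypothetical values chosen only to illustrate the severity and delay computations, and the function names are illustrative.

```python
def failure_overhead(lost_computation_ms, placement_ms):
    """Failure overhead as the sum of lost computation and placement overhead."""
    return lost_computation_ms + placement_ms


def delay(response_ms, deadline_ms):
    """Delay beyond the deadline; zero if the deadline is met."""
    return max(0.0, response_ms - deadline_ms)


def severity(component_ms, response_ms):
    """A component's share of the application response time."""
    return component_ms / response_ms


overhead = failure_overhead(20, 8)  # values reported in the text
share = severity(overhead, 112)     # 112 ms is a hypothetical response time
late_by = delay(112, 100)           # 100 ms is a hypothetical deadline
```

With these numbers, the failure overhead amounts to 28ms, which would be a quarter of the assumed response time.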
Table VII summarizes the severity [37] of the failure overhead metrics, normalized against the application response time. We restrict the analysis to the executions that exhibited at least one failure to maximize the severity of the failures. Despite the high severity of the lost computation and placement overheads, all applications fulfill their deadlines in the D-SMALL scenario. The number of applications missing their deadlines increases with the scale of the simulated scenario due to the limited number of available Fog devices and the high number of requests. However, our method manages to keep the average delay reasonable (around 38.4ms) in the D-LARGE scenario, representing 25% of the total application response time.

F. Deadline Satisfaction
1) Reliable Fog Infrastructure: Fig. 13(a) shows the average deadline satisfaction ratio without faults.
D-SMALL: M-RAP, AAP, and RAP fulfilled the deadlines, whereas JCP did not, since it placed more applications in the Cloud.
D-MEDIUM: M-RAP outperformed other methods with a slightly higher average deadline satisfaction rate of 0.69 compared to AAP and RAP. JCP performed worst due to its higher response time.
D-LARGE: M-RAP satisfied the deadlines with an average rate of 0.60, while AAP and RAP exhibited a lower 0.48. JCP performed worst with an average rate of 0.39.
Summary: M-RAP has a better deadline satisfaction rate for increasing application requests by placing dependent services across tightly connected Fog devices with lower latency and higher bandwidth within the same network partitions (see Figs. 11 and 12(a)).
2) Faulty Fog Infrastructure: We randomly failed Fog devices during the service execution every 20s, such that all devices are not reachable at the end of each simulation period of 2000s. Fig. 13(b) evaluates this faulty Fog's average deadline satisfaction rate.
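A failure schedule with these properties could be generated as in the sketch below. The even spreading of failures over the 20s steps, the function name `failure_schedule`, and the seed are assumptions; the text only states that devices fail randomly every 20s and that none is reachable after 2000s.

```python
import random


def failure_schedule(devices, period_s=2000, step_s=20, seed=0):
    """Map failure times (in seconds) to the devices failing at that time."""
    rng = random.Random(seed)
    order = list(devices)
    rng.shuffle(order)  # random failure order
    steps = period_s // step_s
    schedule = {}
    for i, device in enumerate(order):
        # Assumption: spread the failures evenly over the available steps.
        step = (i * steps) // len(order) + 1
        schedule.setdefault(step * step_s, []).append(device)
    return schedule


schedule = failure_schedule(range(100))
```

For 100 devices and a 2000s period, this yields exactly one failing device every 20s.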
D-SMALL: M-RAP and AAP fulfilled the deadlines with an average satisfaction rate of 0.34 and 0.35 upon randomly failing Fog devices. RAP performed slightly worse with a cumulative satisfaction rate of 0.31.
D-MEDIUM: All methods fulfilled the deadlines with a lower satisfaction rate than in the D-SMALL scenario. However, M-RAP performed slightly better than AAP, with an average rate of 0.21. In contrast, RAP and JCP exhibited lower deadline satisfaction rates of 0.17 and 0.15, respectively.
D-LARGE: JCP performed worst and fulfilled deadlines with a low average rate of 0.1, while M-RAP fulfilled the deadlines with a better average satisfaction ratio of 0.14.
Summary: We draw three observations from Fig. 13(b): 1) The number of deadline-satisfied applications decreases for fewer Fog devices leading to a high Cloud utilization with higher transmission times. 2) RAP and JCP exhibit a lower deadline satisfaction rate than M-RAP and AAP upon faults since they do not consider device failures and provide higher response times. 3) M-RAP achieves a higher average deadline satisfaction rate by placing all services across highly connected Fog devices in the same network layer partition. The request routes through another path upon network failures leading to a better deadline satisfaction rate. In contrast, AAP places dependent services across weakly connected devices prone to failures.

A. Implementation
We installed a Docker engine 20.10 on all C 3 testbed devices, which partitions their resources by deploying containerized services isolated from each other based on their resource requirements. The minimal scripts to create and run the containerized services are available in the GitHub repository (https://github.com/SiNa88/M-RAP). The lack of interference among the isolated containers hosted by the same device ensures that the service execution times follow the model defined in Section III-F.
We ran the M-RAP placement algorithms on a medium C 3 Fog instance type (see Table IV). The multilayer resource partitioning has an average execution time of approximately 3s and is executed once for the complete Fog network before the application requests. The incremental multilayer partitioning and service placement algorithms execute significantly faster, in approximately 3ms and 5ms, respectively, due to their low complexity and high responsiveness to the incremental changes in the Fog network.

B. Real-World Applications
In this section, we validate the simulation results using two real-world applications from the e-commerce and video processing domains, executed on the C 3 testbed (see Fig. 14). We generated ten simultaneous application requests (five for each application) from different users to utilize the testbed entirely while avoiding over-utilization. We generated artificial network delays and emulated the distributed locations of user devices using the Linux tc utility, as summarized in Table VIII.
1) E-Commerce: E-commerce is a containerized online store application composed of the following services.
Web interface (WI) service provides the graphical interface for interaction with the user.
Catalogue (Cat) service previews product information.
Account (Act) service provides the user profile storing the history of orders.
Shopping cart (Shc) service adds or removes products.
Payment (Pay) service connects the user to the bank payment application and performs the transaction.
Order (Ord) service places the selected product from the stock after inspecting the catalogs.
Shipping (Shp) service manages the shipping information by publishing it to the RabbitMQ message broker.
2) Video Stream Processing: Video stream processing is a traffic sign classification application following road safety inspection concerns [1].
Encoding (En) service receives and encodes the high-resolution raw video stream.
Framing (Fr) service uses OpenCV to produce still frames from different video scenes.
Low-accuracy training (LT) service trains a convolutional neural network aiming for a 70% accuracy.
High-accuracy training (HT) service improves the multi-class classification model from newly collected data aiming for an accuracy of 90%.
High-accuracy inference (HI) service uses the 90% accurate model to classify the signs in the video frames.
Transcoding (Tr) service converts the video into different resolutions and bitrates and prepares it for delivery using the ffmpeg software suite.
Packaging (Pa) service delivers the transcoded stream.
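The artificial network delays mentioned at the beginning of this section can be emulated with the netem queuing discipline of tc. The following minimal sketch only builds the commands. The helper names, the interface, and the delay value are hypothetical and do not reproduce the settings of Table VIII.

```python
def tc_delay_args(interface, delay_ms):
    """Build a tc command that adds a fixed egress delay with netem."""
    return [
        "tc", "qdisc", "add", "dev", interface,
        "root", "netem", "delay", f"{delay_ms}ms",
    ]


def tc_clear_args(interface):
    """Build a tc command that removes the root qdisc and thus the delay."""
    return ["tc", "qdisc", "del", "dev", interface, "root"]


if __name__ == "__main__":
    # Hypothetical example: 20 ms of additional delay on eth0.
    print(" ".join(tc_delay_args("eth0", 20)))
    print(" ".join(tc_clear_args("eth0")))
    # Running these commands requires root privileges, e.g. via
    # subprocess.run(tc_delay_args("eth0", 20), check=True).
```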

C. Experimental Results
Fig. 15 shows the placement heatmap for e-commerce and video processing applications for M-RAP and the three related works on the C 3 testbed. We observe that M-RAP places most services onto Fog devices and very few onto the Cloud instances. AAP and RAP use static greedy methods with higher wastage and lower Fog placement. Finally, JCP places most services into the Cloud instances by considering the applications as monolithic.
Fig. 15. E-commerce and video processing placement heatmap for four related methods in the C 3 testbed.
E-commerce: Table IX shows that M-RAP provides lower response times by placing all services onto the AAU cluster and Exoscale instances in Klagenfurt, close to the end-users. AAP provides a slightly higher response time by placing several services onto the AAU cluster and the rest onto the higher latency Exoscale instances in Vienna, Klagenfurt, and Sofia. In contrast, RAP places several services onto the AAU cluster and distributes the rest onto the Exoscale instances in Vienna, Zurich, Munich, and Sofia without considering their dependencies, leading to high transmission time. Lastly, JCP places most services onto Sofia's A1 Cloud data center with a high response time.
Video processing: Table X shows that M-RAP again achieves the lowest response time by placing most services on the Exoscale instances in Vienna and Klagenfurt, close to the end-user. However, M-RAP's advantage is less pronounced due to the high video stream processing workload, which diminishes the role of the transmission time. AAP places the services onto the smaller Exoscale instances, RPi4, and NJN, with higher response times because e-commerce services with an earlier deadline occupy the more powerful instances. RAP places the services onto the different clusters (AAU and Exoscale in Vienna, Munich, and Sofia) without considering their dependencies, leading to higher transmission time. Although JCP places all services onto the Cloud instances, it provides response times similar to AAP and even lower than RAP since the powerful Cloud instances compensate for the higher transmission time.

D. Result Validation
We validate the correctness of the simulation conducted in Section VIII-E by comparing the response time and the Fog placement in the D-SMALL scenario (see Table VI) with the real testbed measurements for ten e-commerce and video processing requests.
Response time: Fig. 16(a) and (b) reveal that M-RAP has a lower median response time and follows the same trend in both simulated and testbed scenarios. The simulation results show a higher difference in response time than the real testbed due to the higher number of requests and devices and the longer transmission time to the Cloud. M-RAP and AAP placed most applications on the Fog with a lower simulated response time deviation than the real testbed, which placed more applications in the Cloud due to a lower number of Fog devices. Except for JCP, the response time of all methods follows the same trend in simulation and real testbed. The real testbed shows a lower response time median for JCP compared to RAP because it placed most of the applications in the A1 Cloud data center in Sofia with a lower transmission time than the AWS or Google data centers in Virginia and Frankfurt.
Fog placement rate: Fig. 16(c) and (d) show that M-RAP provides the highest Fog placement rate, and JCP performs worst in both simulation and real testbed experiments. The simulated Fog placement rate shows higher deviations for all methods except M-RAP due to more requests with different workloads than the real testbed. The higher number of simulated Fog devices leads to a significant placement rate difference between the methods. The fewer Fog devices in the real testbed cause a lower Fog placement rate. Comparing the placement rates between the simulation and the real testbed, M-RAP, AAP, and RAP reached high Pearson correlations of 0.87, 0.95, and 0.91, respectively, while JCP attains 0.65 because of its higher variation.
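For reference, the Pearson correlation between two paired series of placement rates can be computed as in the following minimal sketch. It assumes that each method contributes one series of simulated rates and one series of testbed rates of equal length. The sample values are hypothetical and not measurements from the evaluation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (dev_x * dev_y)


if __name__ == "__main__":
    # Hypothetical Fog placement rates of one method.
    simulated = [0.90, 0.85, 0.80, 0.88, 0.92]
    testbed = [0.70, 0.66, 0.60, 0.69, 0.73]
    print(round(pearson(simulated, testbed), 2))
```

A coefficient close to 1 indicates that the two series rise and fall together, even when their absolute levels differ.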

X. CONCLUSION
We introduced a multilayer resource-aware partitioning (M-RAP) method for adaptive application placement in a dynamic Fog infrastructure. M-RAP represents the heterogeneous Fog resources as an incremental multilayer graph and partitions it by considering network connections and resource characteristics. M-RAP constantly updates the multilayer graph and its partitions upon changes in the availability of Fog devices. M-RAP places the application in two steps. The first step matches the requested application services based on their requirements with feature partitions overlapping in the same network layer partition. The second step places the services on Fog devices closest to the user in the selected partitions. We evaluated M-RAP based on three simulation scenarios considering incremental changes in a dynamic Fog infrastructure during five simulation periods.
The results indicate that M-RAP can place 1.6 times as many services, satisfy deadlines for 43% more application requests, lower the application response time by up to 58%, and reduce resource wastage by up to 54% compared to state-of-the-art methods. We confirmed the simulation using e-commerce and video processing applications executed on a real testbed comprising eight heterogeneous Cloud and Fog instances distributed over seven geographical locations.
We published our simulation code on the Code Ocean platform (https://doi.org/10.24433/CO.7045020.v1) to support the journal reproducibility initiative. In the future, we plan to extend M-RAP to support mobility by: 1) assigning and maintaining unique identities of leaving devices, and 2) incrementally assigning the mobile devices to the network partitions with the highest proximity and connectivity.