Digital Twin-Aided Orchestration of Mobile Edge Computing With Grant-Free Access

Digital twin (DT)-aided edge computing is investigated, where users utilize grant-free random access with adaptive rate to offload their tasks to the edge server. A novel probabilistic partial offloading scheme with lower implementation complexity is introduced, while each device is assumed to have an infinite buffer to store its tasks. The aim of the proposed work is to minimize the average delay of the partial offloading. To that end, the average queueing delay, the offloading delay, and the local computation delay are derived using queueing theory tools. Then, the non-convex problem of minimizing the average delay of all clients is formulated, while taking into account DT imperfections. Successive convex approximation (SCA), alternating optimization (AO), and various algebraic manipulations are utilized to transform the problem into an equivalent convex problem with a tractable solution. Finally, simulation results showcase the value of the proposed analysis and offer important insights for the proposed DT-aided edge network. Specifically, the proposed partial offloading scheme is shown to be more delay efficient than both local computing and full offloading, particularly for greater task generation rates at the users. Also, the impact of the DT imperfections on the average delay is shown to be more notable as the number of users, or the tasks' size, increases.


I. INTRODUCTION
NEXT-GENERATION Internet of Things (IoT) networks are expected to rely on smart devices at which edge intelligence and computing can be fully realized, e.g., smartphones, vehicles, machines, and robots [1], [2]. Such a device-centric network poses new challenges and requirements on the design and operation of wireless communication, since smart devices will not only generate or exploit data, but will also actively join the network management [1], [2]. Moreover, the large number of potentially active devices complicates resource allocation. Therefore, contention-based protocols have recently attracted attention as a way to avoid too many resources remaining idle due to intermittent traffic [3], [4].
Toward reducing the access delay and signalling overhead of traditional contention-based protocols, the 3rd generation partnership project (3GPP) in Release 16 of the 5th generation new radio (5G NR) proposed a two-step random access scheme, namely grant-free (GF) access [5]. The main idea behind GF access is that an active device does not wait for a response from the base station (BS) after transmitting its preamble (1st step), but immediately transmits its data packet (2nd step). Therefore, the handshaking process between the user and the base station is avoided, consequently reducing the associated signalling overhead. The key challenge for GF transmissions is contention, as multiple users may choose to transmit on the same channel resources at the same time. However, future networks, as they are shaped by the sixth-generation (6G) concept [6], will need to support novel use cases, for example, virtual, augmented, and extended reality (VR/AR/XR), tactile Internet, intelligent power grids and smart cities [1], [2], [7]. Those use cases will require scalable wireless sensor networks, but will also demand intensive computing and medium to high data rates [1], [7]. Moreover, intelligent functionalities will be extended to the edge nodes, due to their advanced computational capabilities, thus enabling the convergence of artificial intelligence (AI), communications, and edge computation [1]. As such, edge computing architectures will provide intelligence physically closer to the end users [8]. This can significantly improve the end-to-end latency, especially for users who repeatedly offload intensive tasks to the server.
Furthermore, edge computing has recently been combined with the emerging technology of digital twins (DTs). A DT is a comprehensive software representation of an individual physical object or system that includes its real-life properties, conditions and behaviours [9]. By combining mobile edge computing (MEC) and DT, the MEC server status and the status of the end devices, such as the distance of the end users from the MEC server, the average task arrival at the users' buffers or the CPU frequency at the edge servers, can be directly transferred to the network's orchestrator. Then, based on that knowledge, the orchestrator provides the physical network with intelligent and optimal decisions [9]. Therefore, the DT's goal is to capture the physical features of the network, while the orchestrator's aim is to develop an optimal strategy based on those features. Because a DT continuously evolves alongside its physical entity, it provides the orchestrator with an improved view of the underlying physical network.
MEC has also been extensively studied, often by taking into account congestion occurring at the buffers of the users or at the buffers of the MEC server [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26]. However, the MEC state-of-the-art cannot be generalized for GF transmissions, since continuous-time queueing models are adopted. For instance, in [16], the M/M/c/c queueing system was exploited and a holistic QoS-aware framework for Industrial IoT systems was designed, whereas in [17] a heuristic scheduling model was designed to maximize the offloading energy and execution efficiency of an Erlang queueing MEC system. Similarly, in [27], three Erlang-based queueing models were applied, one at the mobile users, one at the edge server and one at the cloud server. However, their model is based on orthogonal multiple access, which is fundamentally different from GF access. Moreover, the delay of a MEC serving multiclass users, based on continuous-time queueing, was studied in [18], while in [20] a closed-form water-filling computation offloading solution was proposed to investigate the average delay. Furthermore, in [21], a distributed task offloading scheme was investigated with consideration of the upper-layer queueing dynamics and the lower-layer coupled wireless interference. Also, in [25], a stochastic buffer-aided relay-assisted MEC was examined.
Furthermore, in [28], a cross-layer MEC design was studied for URLLC and enhanced mobile broadband (eMBB) services with short packet transmissions. The delay of the system was analyzed; however, the duration of the data transmission slot and partial offloading were not considered, while the channel access was not GF-aided. Furthermore, in [29], the users' power consumption under partial offloading was minimized, while statistical constraints were imposed on task queue lengths by applying extreme value theory. Moreover, in [30], a deep reinforcement learning algorithm was designed to study the joint optimization of task offloading for a MEC system with application to AR. In addition, in [31], resource allocation was investigated for URLLC vehicular edge computing, while in [32] the tradeoff between latency and reliability in URLLC MEC was examined. Nonetheless, queueing delay was not considered in any of [30], [31], [32].
Moreover, several studies have aimed at designing DTs combined with MEC [9], [33], [34], [35], [36], [37], [38], where the MEC-DT concept is utilised to extract optimal network orchestration strategies based on real-time data. For instance, in [33], [34], [35], [36], deep reinforcement learning (DRL)-based algorithms were proposed to train the DT of the MEC network for making offloading decisions, edge association and resource allocation, thus increasing the network's performance. In addition, [37] and [38] minimize end-to-end latency in a DT-aided MEC system, while also considering deviations between the DT and its physical counterpart. Finally, in [9], a secure and latency-aware edge computing architecture was designed. Despite their merits, none of the aforementioned works has studied the integration of DT and MEC for GF access schemes, which is fundamental for the next-generation IoT.

B. MOTIVATION AND CONTRIBUTIONS
Inspired by the above, as well as by the advantages of DT-aided edge computing and GF access protocols, this work aims to combine DT-aided MEC with grant-free access. To the authors' best knowledge, this is the first work that proposes DT-aided edge computing with GF access and partial offloading. Next-generation IoT's role will be to improve intelligence in physical systems, such as smart cities, transportation and power grids, by monitoring their unique characteristics [1], [2]. Those data will then be processed, either locally or by the edge, in order to improve the system's intelligence. Since IoT devices produce tasks sporadically, they do not need to access the wireless medium constantly. On top of that, the large number of IoT devices makes scheduling quite complicated, as scheduled wireless resources are likely to stay idle due to intermittent traffic. Thus, GF access is a strong candidate to enable the communication between next-generation IoT devices and the MEC server.
Moreover, the majority of the existing literature on MEC or DT-aided MEC considers partial offloading as a procedure that splits a task into two parts, which in practice may not be straightforward to implement, since separating a task into smaller subtasks depends on the task's structure and functions. In our analysis, a task is either transmitted to a MEC server or processed locally, which offers lower implementation complexity and is more appropriate for next-generation IoT. Therefore, the proposed partial scheme is a probabilistic binary offloading scheme, which adjusts the ratio between the tasks transmitted to the MEC server and the tasks processed locally, according to the channel conditions, the density of the network, the available preambles, etc. Furthermore, in contrast to the state of the art on discrete-time queueing theory, where instant packet transmissions on error-free channels are assumed, in our analysis an error-prone channel is considered, while the data transmission duration is also optimized. The contributions of the paper are summarized below:
• A DT-aided MEC system with grant-free random access (GFRA) is designed under a novel partial offloading scheme. The partial offloading is a probabilistic binary offloading scheme that can be visualized as a switch, which sends a packet to the transmission buffer with probability $\theta$ and to the local computing buffer with probability $1 - \theta$. Closed-form delay expressions are extracted under the assumption of infinite buffer capacity. In contrast to the state of the art on discrete-time queueing theory, error-prone channel conditions are studied, while the transmission delay is also considered.
• The average delay until a task is processed, either locally or by the MEC server, is minimized. Moreover, the data transmission phase duration is also optimized. It should also be noted that the formulated optimization problem can be easily modified so that the devices' power consumption is minimized, subject to delay constraints. To tackle the non-convex delay minimization problem, an efficient algorithm is proposed, which utilizes successive convex approximation (SCA) and alternating optimization (AO).
• Numerical results illustrate the superiority of the proposed scheme over full-binary approaches. Moreover, valuable insights about the DT-aided MEC network are extracted, while the convergence of the proposed optimization algorithm is illustrated.

C. STRUCTURE
In Section II, the proposed system model is introduced. In Section III, the delay analysis for the buffers is demonstrated, followed by Section IV, where a brief stability analysis is presented for the buffers, while in Section V, the computation model of the DT is presented. In Section VI, the delay minimization problem is formulated and solved. In Section VII, numerical results are shown and different insights are discussed. Finally, in Section VIII a conclusion is drawn.

II. SYSTEM MODEL
We consider a DT-empowered MEC server-client system model, which consists of $N$ devices and one MEC server, as shown in Figure 1. The architecture relies on two layers, the physical layer and the DT layer. The physical layer consists of all the physical components of the network, such as devices, edge servers, and base stations, while also taking into account each component's limitations regarding the hardware, the transmission power, etc. Each device $k$ lies at a distance $d_k$ and employs GFRA to communicate with the server and to transmit computationally expensive tasks. The communication between the users and the MEC server is separated into two phases, a preamble phase and a data transmission phase. The preamble phase duration is much shorter than the data phase duration, while the available number of preambles is denoted as $L_p$. The preambles are orthogonal, therefore a collision between the users can only happen during the preamble phase. An active user chooses one of the available preambles uniformly at random and transmits it to the MEC server. Then, immediately, the user enters the data transmission phase, without waiting for a response from the MEC server. The access probability, i.e., the probability that a user becomes active and gets access to the channel, is defined as $q_k$. Also, let $P_k$ represent the transmission power of the data transmission phase. Because a user is not always active, the total number of active users is unknown. In Table 1, the list of symbols and notations is presented. Moreover, a device has the ability to perform basic computations by itself and hence full offloading is not mandatory. We assume that each device $k$ generates tasks at a rate of $\lambda_k$ tasks per second, where each task has a size of $L_k$ bits. Also, the devices' computing capabilities are described by their processors' CPU cycles per second, $f_k$. Each task is described by its size and a required number of CPU cycles per bit, $X_k$.
The splitting factor $\theta_k \in [0, 1]$ represents the fraction of the tasks that are transmitted to the MEC server and thus, $1 - \theta_k$ represents the fraction of the tasks that will be computed locally. We assume that each device has a buffer in order to store its generated tasks. Since the devices utilize a partial offloading strategy, it is convenient to assume that the storage capability of the devices is separated into two buffers, one for the tasks that are waiting to be transmitted to the MEC server and one for the tasks that are in line to be computed locally, as shown in Figure 2. Both buffers are considered to have infinite capacity. In practice, the loss of tasks cannot be afforded, therefore the assumption of an infinite buffer is meaningful.
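As a minimal sketch of the probabilistic switch described above, the following Python snippet routes randomly generated tasks to either the transmission buffer or the local computing buffer. The function name `route_tasks` and the numerical values are illustrative assumptions and do not come from the paper; task generation is assumed to be a Bernoulli event per time slot.

```python
import random

def route_tasks(num_slots, arrival_prob, theta, seed=0):
    """Route Bernoulli task arrivals to the transmission or the CPU buffer.

    arrival_prob: assumed per-slot task generation probability.
    theta: probability that a generated task is offloaded to the MEC server.
    Returns the (offloaded, local) task counts.
    """
    rng = random.Random(seed)
    offloaded = local = 0
    for _ in range(num_slots):
        if rng.random() < arrival_prob:   # a task is generated in this slot
            if rng.random() < theta:      # switch: offload with probability theta
                offloaded += 1
            else:                         # otherwise queue it for local computing
                local += 1
    return offloaded, local

offloaded, local = route_tasks(num_slots=100_000, arrival_prob=0.3, theta=0.7)
share = offloaded / (offloaded + local)
print(f"offloaded share = {share:.3f}")
```

Each task is kept whole, so the empirical offloaded share approaches $\theta_k$ without any task ever being split.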
The DT layer is an exact replica of the underlying physical layer, which takes into account all hardware components of the physical devices and the network topology. The DT interacts with the physical layer in order to gather data that help improve the digital representation of the physical world and capture physical changes in real time. Based on the information provided by the DT, an orchestrator efficiently manages the network's available resources. The DT of the physical layer is represented as $DT = \{(\mathcal{M}, \tilde{\mathcal{M}}), (\mathcal{N}, \tilde{\mathcal{N}})\}$, including the MEC server and the $N$ users. Without loss of generality, it is assumed that the DT of the MEC server is perfect due to the superiority of the wired backhaul channels. Therefore, the set containing all the parameters and variables that describe the physical MEC server, $\mathcal{M}$, is exactly the same as its DT counterpart, $\tilde{\mathcal{M}}$. The DT of the $k$-th user is denoted as $DT_k = \{(\mathcal{N}, \tilde{\mathcal{N}})\}$. Here, $\tilde{\mathcal{N}}$ is the set containing all the parameters and variables describing the physical users. As in [37], a deviation $n$ of the available CPU frequency is used to describe the deviation between real users and their digital counterparts, which can be either positive or negative. The deviation $n$ will be assumed known in advance [37]. The following assumptions hold in our analysis:
• The channel is not error-free. A packet can be lost either due to a collision during the preamble phase or due to an outage event caused by the channel's conditions during the data transmission phase.
• The preamble phase duration is negligible compared to the data phase duration, due to the preambles' small size. Therefore, the data transmission phase approximately captures the whole duration of a time slot, which equals $\tau$ seconds. Due to synchronization issues, the time slot duration is equal for all users in the system and the duration of the time slot is limited by the worst user or the user with the biggest task [4].
• A packet that failed to be transmitted will be retransmitted at the next time slot with the same probability $q_k$. Also, the failure or the success of a packet is considered known instantly.
• Tasks are generated with an average rate of $\lambda_k$ tasks per second. The communication time is divided into time slots of duration $\tau$, therefore, a task is equivalently modelled to be generated at the end of a time slot with probability $\lambda_k^{*}$, and no task is generated with probability $1 - \lambda_k^{*}$. From [39], the following holds,
$$\lambda_k^{*} = \lambda_k \tau. \tag{1}$$
Since $0 \le \lambda_k^{*} \le 1$, it needs to hold that $\tau \le \frac{1}{\lambda_k}, \ \forall k$.
• Regarding GF transmissions, a task departs from the queue at the beginning of a time slot, with the assumption that only one task departs at a slot. Note that a late arrival model is adopted, and so a task arrives just before the departure of a task.
• During partial offloading, a task's size is not altered, but the task is either computed locally or transmitted to the MEC server as a whole. The partial factor $\theta_k$ expresses the fraction of tasks that are transmitted to the MEC server. Therefore, $\theta_k \lambda_k$ tasks per second enter the transmission buffer and $(1 - \theta_k)\lambda_k$ tasks per second enter the CPU buffer. Since $0 \le \theta_k \le 1$, the partial offloading is a probabilistic strategy, where a task is sent to the MEC server with probability $\theta_k$, otherwise it is locally processed. It is noted that the proposed offloading offers lower implementation complexity compared to the conventional partial offloading, since separating a task into subtasks depends on the task's structure, content and functions.
The average throughput (bits/sec) of the proposed GF transmission, under Rayleigh fading conditions, with normalized bandwidth, and for one available preamble, is given in (2) [40], where $R_k$ is the fixed data rate (bits/sec) of each device and $h$ denotes the channel fading coefficient, which follows a Rayleigh distribution. Effectively, the fading term is related to the average received power, so it can be utilized to include the path loss. $N_0$ is the power spectral density of noise. The second term, from the left, of (2) expresses the non-outage probability during the data phase, assuming a Rayleigh channel. The last term, from the left, of (2) is the probability of no collision during the preamble phase, for the case of one preamble. The probability of no collision for $L_p$ available preambles can easily be found as in [14]. Also, the power consumption due to local computation is given as $k_k f_k^3$ [8], where $k_k$ is a constant related to the hardware architecture.
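To make the preamble-collision mechanism concrete, the sketch below estimates the no-collision probability of a tagged active user by Monte Carlo and compares it with a product form obtained from a simple independence argument: another user $j$ collides only if it is active (probability $q_j$) and picks the same preamble (probability $1/L_p$). This product form is an assumption made here for illustration and is not claimed to be the exact expression of [14]; the function names and parameter values are hypothetical.

```python
import random

def no_collision_closed_form(q_others, num_preambles):
    """Product over the other users of (1 - q_j / L_p)."""
    prob = 1.0
    for q in q_others:
        prob *= 1.0 - q / num_preambles
    return prob

def no_collision_monte_carlo(q_others, num_preambles, trials=100_000, seed=1):
    """Fraction of trials in which no other active user picks the tagged preamble."""
    rng = random.Random(seed)
    clean = 0
    for _ in range(trials):
        tagged = rng.randrange(num_preambles)   # preamble chosen by the tagged user
        collided = any(
            rng.random() < q and rng.randrange(num_preambles) == tagged
            for q in q_others
        )
        clean += not collided
    return clean / trials

q_others = [0.2] * 9   # nine other users, each active with probability 0.2
print(no_collision_closed_form(q_others, 4))
print(no_collision_monte_carlo(q_others, 4))
```

Increasing the number of preambles moves the no-collision probability toward one, which is the lever behind the "available preambles" dependence mentioned in the system model.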

III. DELAY ANALYSIS

A. DELAY ANALYSIS OF THE GFRA BUFFER
We consider a GFRA scheme where each user has an infinite buffer capacity and a discrete-time queueing model. The analysis of the queue can be carried out similarly to [41]. The total probability flow through any closed boundary must be zero, so for the $k$-th user, according to Figure 3, the balance relation (5) needs to hold, where $g_{i,k}$ denotes the probability that the buffer of the $k$-th user contains $i$ tasks. The probability that a task arrives at the transmission queue follows from (1). Also, $\mu_k$ is the probability of successful transmission of the $k$-th user, which for a GFRA scheme with imperfect channel conditions follows from (2). From (5), $g_{i,k}$, $i \in \mathbb{Z}$, can be calculated with respect to $g_{0,k}$, as expressed in (8). Substituting (8) into the total probability law, i.e., $\sum_{i} g_{i,k} = 1$, yields $g_{0,k}$ in terms of $\rho_k = \lambda_k / \mu_k$. Now that $g_{0,k}$ is known, every $g_{i,k}$ can be calculated using (5). Furthermore, the average queue length of the $k$-th user can be found as in [41], both for a finite buffer of size $K$ and for an infinite buffer, i.e., $K \to \infty$. Hence, the average response time, in seconds, is obtained using Little's law [41].

VOLUME 4, 2023
Note that the average response time is defined as the duration from the moment a packet enters the queue until its successful departure.
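The discrete-time behaviour of the transmission buffer can be sketched with a short simulation. The code below is not the closed-form analysis of the paper; it assumes Bernoulli arrivals at the end of each slot (late arrival), at most one departure per slot, and a fixed per-slot success probability that stands in for $\mu_k$. The function name and parameter values are hypothetical.

```python
import random
from collections import deque

def simulate_transmission_buffer(arrival_prob, success_prob,
                                 num_slots=200_000, seed=2):
    """Mean response time, in slots, of a late-arrival discrete-time queue."""
    rng = random.Random(seed)
    queue = deque()          # stores the arrival slot of every waiting task
    total_response = 0
    served = 0
    for slot in range(num_slots):
        # The head-of-line task attempts transmission during this slot.
        if queue and rng.random() < success_prob:
            total_response += slot - queue.popleft()
            served += 1
        # Late arrival: a new task may join at the end of the slot.
        if rng.random() < arrival_prob:
            queue.append(slot)
    return total_response / served

fast = simulate_transmission_buffer(arrival_prob=0.2, success_prob=0.6)
slow = simulate_transmission_buffer(arrival_prob=0.2, success_prob=0.3)
print(f"mean response: {fast:.2f} slots vs {slow:.2f} slots")
```

Multiplying the result by the slot duration $\tau$ gives the response time in seconds. Lowering the success probability, for example through more collisions or outages, visibly inflates the response time, which is the congestion effect that the optimization later trades against local computing.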

B. DELAY ANALYSIS OF THE LOCAL COMPUTATION BUFFER
The local computation buffer is also assumed infinite, and there is no need for retransmission mechanisms at the output of the CPU buffer. The CPU buffer acts in a deterministic way, therefore it can be modelled as a Geo/D/1 queue [39]. Its average departure rate, in tasks per time slot, is then given as
$$\hat{\mu}_k = \frac{f_k \tau}{X_k L_k}. \tag{12}$$
The average input rate is given as $\hat{\lambda}_k = (1 - \theta_k)\lambda_k \tau$. The Geo/D/1 queueing delay, in seconds, is known and given in [39] in terms of $\hat{\rho}_k = \hat{\lambda}_k / \hat{\mu}_k$. Note that the queueing delay measures the delay spent until a task reaches the end of the queue and begins to be served by the CPU.

IV. STABILITY ANALYSIS
A queueing system is said to be unstable if the queue size goes to infinity with non-zero probability, so it is important to examine the stability of queueing systems with infinite buffer capacity. In this section, we study the stability of the proposed buffer architecture with infinite buffer capacity. From [41], an infinite buffer system is globally stable if and only if its mean input data rate is less than or equal to its mean output data rate. For the buffer of the $k$-th device that is dedicated to offloading, stability is provided when it holds that
$$\theta_k \lambda_k \tau \le \mu_k.$$
Moreover, when a packet successfully leaves the queue, all of its bits are pushed to the MEC server, which does not happen instantaneously. The duration of this transfer has to be less than the duration of the data transmission phase of one time slot. Otherwise, the next packet in the queue of any user may suffer an unnecessary collision in the next time slot, since by assumption all packets are transmitted at the beginning of the data transmission phase. Thus, the following has to hold,
$$\frac{L_k}{R_k} \le \tau. \tag{14}$$
Note that with (14), an adaptive rate is introduced to the GFRA transmission, in contrast to the existing literature, where it is assumed that a packet is transmitted instantly. On the other hand, for the buffer dedicated to local computing to be globally stable, the following is required to hold,
$$(1 - \theta_k)\lambda_k \tau \le \frac{f_k \tau}{X_k L_k}.$$
It should be noted that a system that can perform partial offloading is more flexible and stable than a system that performs binary offloading, since utilizing partial offloading means $\theta \le 1$, therefore each buffer experiences less congestion in comparison to the case of full offloading or local computing. Moreover, the stability constraints have to be taken into account in any optimization problem, otherwise the optimal solution of the problem might lead to an unstable system and huge queueing delays.
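The stability requirements of this section can be collected into a small feasibility check. The sketch below assumes that the per-slot arrival probabilities are $\theta_k \lambda_k \tau$ and $(1-\theta_k)\lambda_k \tau$ and that the CPU serves $f_k \tau / (X_k L_k)$ tasks per slot; the function name, the dictionary keys and the numerical values are hypothetical.

```python
def check_stability(lam, theta, tau, mu, task_bits, rate, cpu_hz, cycles_per_bit):
    """Evaluate the three per-device conditions discussed in this section."""
    tx_arrivals = theta * lam * tau                  # offloaded tasks per slot
    cpu_arrivals = (1.0 - theta) * lam * tau         # local tasks per slot
    cpu_service = cpu_hz * tau / (cycles_per_bit * task_bits)  # tasks per slot
    return {
        "transmission_buffer": tx_arrivals <= mu,    # input <= success probability
        "fits_in_slot": task_bits / rate <= tau,     # whole task sent within a slot
        "cpu_buffer": cpu_arrivals <= cpu_service,   # input <= CPU departure rate
    }

# Illustrative values: 20 tasks/s, 10 ms slots, 10 kbit tasks, 2 Mbit/s, 100 MHz CPU.
report = check_stability(lam=20, theta=0.6, tau=0.01, mu=0.3,
                         task_bits=1e4, rate=2e6, cpu_hz=1e8, cycles_per_bit=100)
print(report)
```

With these values all three checks pass, whereas setting `theta=1.0` with `lam=50` violates the transmission-buffer condition, illustrating why full offloading saturates earlier than the partial scheme.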

V. COMPUTATION MODEL OF PHYSICAL AND DT COUNTERPARTS
The computation latency until a task of $L_k$ bits is processed, when a physical device's CPU operates with a frequency of $f_k$, is known and given as $X_k L_k / f_k$. The DT of the $k$-th device is expressed in terms of $\tilde{f}_k = f_k + n_k$, which is the estimated frequency at the DT, where $n_k$ is the frequency deviation between the virtual representation and its $k$-th physical counterpart. Assuming that the deviation of the CPU processing frequency between the physical devices and their DT can be acquired in advance [37], the computing latency gap between the real value and the DT estimation can be calculated as in [37], from which the total computation latency estimated at the DT follows. It is also assumed that the DT has perfect knowledge of the MEC's condition, which can be justified by the superiority of the wired backhaul channels. By following a similar approach, the queueing delay at the devices' buffers is estimated at the DT by using $\tilde{\mu}_k = \frac{(f_k + n_k)\tau}{X_k L_k}$ and $\tilde{\rho}_k = \hat{\lambda}_k / \tilde{\mu}_k$.
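The effect of the frequency deviation on the latency estimate can be illustrated numerically. The sketch below assumes the latency $X_k L_k / f_k$ for the physical device and $X_k L_k / (f_k + n_k)$ for its DT estimate, and it defines the gap as the physical value minus the estimate, which is a sign convention chosen here and not necessarily the one of [37]. The function names and values are hypothetical.

```python
def computation_latency(task_bits, cycles_per_bit, cpu_hz):
    """Seconds needed to process one task at the given CPU frequency."""
    return cycles_per_bit * task_bits / cpu_hz

def dt_latency_gap(task_bits, cycles_per_bit, cpu_hz, deviation_hz):
    """Return (physical latency, DT-estimated latency, physical - estimated)."""
    physical = computation_latency(task_bits, cycles_per_bit, cpu_hz)
    estimated = computation_latency(task_bits, cycles_per_bit,
                                    cpu_hz + deviation_hz)
    return physical, estimated, physical - estimated

# A 10 kbit task at 100 cycles/bit on a 100 MHz CPU, with the DT
# underestimating the frequency by 20 MHz.
physical, estimated, gap = dt_latency_gap(1e4, 100, 1e8, -2e7)
print(physical, estimated, gap)
```

Under this convention, a negative deviation makes the DT overestimate the latency (negative gap), while a positive deviation makes it optimistic.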

VI. DELAY MINIMIZATION

A. PROBLEM FORMULATION
We aim to minimize the average delay of every device while taking into account their power and stability requirements. Therefore, the access probability $q_k$, the offloading factor $\theta_k$ and the data transmission duration will be optimized according to the total number of devices in the system, the average channel statistics, and the average task generation rate of each device. We note that the proposed problem also provides a lower bound of the delay when the activation probability $q_k$ is fixed or unknown. Furthermore, the formulated analysis can also be adjusted for a power consumption minimization problem subject to delay constraints.
The average delay between the $k$-th user and the MEC server, $d_{o,k}$, includes $d_{t,k}$, which expresses the delay from the moment a task successfully leaves the queue until all of its bits are pushed to the MEC server and is given as $d_{t,k} = L_k / R_k$. The computation time of the MEC server and the delay caused during the downlink communication between the MEC and its clients is denoted as $d_{mec,k}$ and it is omitted, since the MEC server has superior capabilities compared to the devices it serves. Similarly, the average local computation delay of every device is denoted as $d_{l,k}$. Thus, the overall average delay until a task of the $k$-th device is completed, either at the MEC server or locally, is obtained from $d_{o,k}$ and $d_{l,k}$. Therefore, the proposed delay minimization problem with power constraints is formulated as (24), where $\tilde{f}_k = f_k + n_k$. Constraint $C_1$ limits the transmission power during the uplink communication using GFRA and the power consumed during the local computing, so that every device is power efficient. Constraints $C_2$-$C_4$ ensure that the optimal solution of the proposed problem is also stable.

B. CONVEX TRANSFORMATION
The problem is non-convex due to its non-convex objective function and constraints $C_2$-$C_4$, which contain products of several optimization variables. In order to formulate it as a convex problem, we first transform it into its epigraph form as follows,
$$\min_{R, q, f, \tau, \tau_k, \theta, P} \ \sum_{k=1}^{N} \tau_k \quad \text{s.t.} \quad C_1: P_k + k_k \tilde{f}_k^3 \le P_{\max,k}, \ \forall k \in \mathcal{N}, \quad C_2: \ldots \tag{25}$$
The problem is still non-convex. By substituting the relations for $\tilde{\mu}_k$, $\lambda_k$, $\hat{\lambda}_k$, $d_{o,k}$ and $d_{l,k}$, problem (25) is equivalently written as (26). Problem (26) is non-convex, due to $\mu_k$ containing the product of several optimization variables, as well as because of the constraints $C_2$-$C_6$. To formulate the problem as convex, an auxiliary variable will be introduced. One way to deal with the product of optimization variables in $C_2$-$C_6$ is to take the logarithm of both sides of those constraints. However, due to the summation on the left side of constraints $C_5$ and $C_6$, those constraints will be difficult to handle. To that end, the variables $\tau_k^{(i)}$, $i \in \{1, 2, 3, 4\}$, will be introduced, for which it holds that $\tau_k^{(1)} + \tau_k^{(2)} \le \tau_k$, together with a corresponding relation for $\tau_k^{(3)} + \tau_k^{(4)}$. Utilizing the above formulations, problem (26) is written as (29). Problem (29) is still non-convex. From constraint $C_2$ and relations (1), (12), it is concluded that the time slot duration $\tau$ is lower bounded by $\max_k \{L_k / R_k\}$ and upper bounded by the smaller of $\min_k \{1 / \lambda_k\}$ and $\min_k \{X_k L_k / f_k\}$. Therefore the following has to hold,
$$\max_k \frac{L_k}{R_k} \le \min\left\{ \min_k \frac{1}{\lambda_k}, \ \min_k \frac{X_k L_k}{f_k} \right\}.$$
Otherwise, the problem is either infeasible, due to (1) and (12), or collisions occur between two packets sent in different time slots, due to $C_2$. To make the problem simpler to solve, the concept of AO will be utilized, by fixing the value of $\tau$ when optimizing the rest of the variables. To deal with the non-convex constraints, the logarithm of both sides of constraints $C_5$-$C_8$ and $C_{11}$ will be taken.
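The admissible range of the time slot duration described above can be computed directly from the per-device parameters. The following sketch is only an illustration of that bound check, with a hypothetical function name and made-up parameter values for two devices.

```python
def slot_duration_bounds(task_bits, rates, arrival_rates, cycles_per_bit, cpu_hz):
    """Return (lower bound, upper bound, feasible) for the slot duration tau."""
    # tau must be long enough for every device to push a whole task.
    lower = max(L / R for L, R in zip(task_bits, rates))
    # tau must keep the per-slot arrival and CPU departure probabilities <= 1.
    upper = min(
        min(1.0 / lam for lam in arrival_rates),
        min(X * L / f for X, L, f in zip(cycles_per_bit, task_bits, cpu_hz)),
    )
    return lower, upper, lower <= upper

lower, upper, feasible = slot_duration_bounds(
    task_bits=[1e4, 2e4],        # bits per task
    rates=[2e6, 2e6],            # bits/sec
    arrival_rates=[20, 40],      # tasks/sec
    cycles_per_bit=[100, 100],
    cpu_hz=[5e7, 1e8],           # cycles/sec
)
print(lower, upper, feasible)
```

When the lower bound exceeds the upper bound, no slot duration satisfies all devices, which corresponds to the infeasible case mentioned in the text.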
Consequently, the problem is formulated as (32), which minimizes $\sum_{k=1}^{N} \tau_k$ over $R, q, f, \tau_k, \theta, P$ subject to the constraints $C_1, C_2, C_3, C_4, C_9, C_{10}$ of (29) and the logarithmic form of the remaining constraints, where, in order to cope with the non-convex negative logarithmic terms of constraints $C_2$, $C_3$, $C_6$-$C_8$ and $C_{11}$ that occurred due to taking the logarithm of both sides, SCA was exploited. Specifically, each negative logarithmic term was approximated by its first-order Taylor approximation.
Algorithm 1 Solution to (32)
1: Choose the maximum number of iterations $n_{\max}$, an initial point $x_0$, initial points $\theta_{k,0}$, $r_{k,0}$ for the SCA procedure and tolerance $\epsilon$
2: while ($n \le n_{\max}$ and $\|x_k - x_{k-1}\|_\infty \ge \epsilon$) do
3: solve problem (32) and obtain optimal $x^*$
4: update the initial point of the SCA procedure with $x^*$
8: Problem is infeasible
9: end if
10: end while

In [42], three conditions are mentioned that are required to hold when approximating a non-convex constraint. Assume that $\gamma(x)$ is the non-convex term and $\tilde{\gamma}(x, x_k)$ is its convex approximation. Then the three conditions of [42] need to hold for $\gamma(x)$ and $\tilde{\gamma}(x, x_k)$. It can be easily verified that for the $\log(\cdot)$ function and its Taylor approximation all three conditions hold. Eventually, problem (32) is convex. Furthermore, it should be noted that the formulated average delay minimization problem can be easily written as a power consumption minimization problem subject to average delay constraints. Specifically, by fixing the values of $\tau_k$ in constraints $C_9$ and $C_{10}$ and by substituting the objective of the problem with constraint $C_1$, the formulation describes a power minimization problem subject to average delay constraints.
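As a hedged illustration of why the first-order Taylor expansion is a valid surrogate for $\log(\cdot)$, the sketch below numerically checks three properties that are commonly required in the SCA literature: the surrogate upper-bounds the function, it matches the function value at the expansion point, and it matches the gradient there. It is assumed here, without access to the missing equations, that these are the three conditions referred to in [42]; the function names and the test grid are hypothetical.

```python
import math

def log_taylor(x, x0):
    """First-order Taylor expansion of log(x) around x0 > 0."""
    return math.log(x0) + (x - x0) / x0

def check_sca_conditions(x0, grid, h=1e-6):
    """Check bound, value and gradient agreement of the surrogate at x0."""
    # (i) The surrogate is never below log(x), because log is concave.
    upper_bound = all(math.log(x) <= log_taylor(x, x0) + 1e-12 for x in grid)
    # (ii) Both functions coincide at the expansion point.
    same_value = abs(math.log(x0) - log_taylor(x0, x0)) < 1e-12
    # (iii) Central finite difference of log at x0 versus the surrogate slope 1/x0.
    grad_log = (math.log(x0 + h) - math.log(x0 - h)) / (2 * h)
    same_gradient = abs(grad_log - 1.0 / x0) < 1e-6
    return upper_bound, same_value, same_gradient

grid = [0.1 * i for i in range(1, 101)]   # sample points in (0, 10]
print(check_sca_conditions(2.0, grid))
```

Because the surrogate upper-bounds the original term, any point that is feasible for the approximated constraint is also feasible for the original one.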

C. PROPOSED ALGORITHM
The problem can be solved by any general-purpose convex optimization method, following Algorithm 1. A common approach to solve non-linear constrained convex problems is the interior-point method with complexity of roughly $O(N^3)$ [43], where $N$ is the number of variables. For convex problems, the interior-point method converges to a globally optimal point with great accuracy. Moreover, in line 4 of Algorithm 1, the initial point of the SCA procedure is updated by the optimal solution obtained from the interior-point method. Therefore, at each iteration, the Taylor approximation is more accurate, since the approximation is closer to the optimal point of (32). Note that the complexity of Algorithm 1 in conjunction with the interior-point method is $O(n_{\max} N^3)$, where $n_{\max}$ is the maximum number of iterations allowed.
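The iterative structure of Algorithm 1 can be sketched on a toy one-dimensional problem. The example below is not problem (32): it minimizes the hypothetical non-convex function $(x-3)^2 + 4\log(1+x)$ over $x \ge 0$, linearizes the concave logarithmic term at the previous iterate, and minimizes the resulting convex surrogate in closed form instead of calling an interior-point solver. Only the loop structure (solve, update the expansion point, stop on tolerance or iteration limit) mirrors the algorithm.

```python
import math

def sca_minimize(x0, n_max=100, tol=1e-9):
    """SCA on the toy problem  min_{x >= 0} (x - 3)^2 + 4*log(1 + x)."""
    x_prev = x0
    for n in range(1, n_max + 1):
        # Linearise 4*log(1 + x) around x_prev. The convex surrogate
        # (x - 3)^2 + 4*x / (1 + x_prev) + const is minimised in closed form.
        x_new = max(0.0, 3.0 - 2.0 / (1.0 + x_prev))
        if abs(x_new - x_prev) < tol:      # stopping rule on successive iterates
            return x_new, n
        x_prev = x_new                     # update the SCA expansion point
    return x_prev, n_max

x_star, iterations = sca_minimize(x0=0.0)
print(f"x* = {x_star:.6f} after {iterations} iterations")
```

The iterates approach the stationary point $1 + \sqrt{2}$ of the toy objective, with each surrogate becoming tighter as the expansion point moves toward the solution.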

VII. NUMERICAL RESULTS
In this section, simulation results are presented for a GFRA MEC network. Unless otherwise stated, the simulation parameters are given in Table 2. To highlight the effect of average received power in the proposed method, the total number of users $N$ is separated into two clusters, which are denoted by $i \in \{1, 2\}$. The users of the same cluster are considered to have equal average received power. Without loss of generality, a path loss model is adopted for the average received power. In order to extract insights about the network's performance, the optimal resource allocation strategy presented in this section has been evaluated by using Algorithm 1.
In Figure 4, the offloading strategy and the average delay are plotted against the average task generation rate of the devices. In Figure 4a, we observe that, for the parameters chosen, the local computing delay is by far worse than both the full offloading and the proposed partial scheme. For low average task generation rates, the proposed partial scheme is slightly more efficient than the full offloading. However, as the task generation rate increases, the performance gap between the proposed scheme and the full offloading increases as well. Both local computing and full offloading experience congestion that results in greater delays compared to the proposed strategy. It is interesting to note that the delay of the proposed scheme can be 5 times lower than the full offloading delay and 20 times lower than the local computing delay. The partial scheme will eventually also experience congestion, but at greater task generation rates, thus it is more delay efficient.
The delay efficiency of the proposed offloading scheme is attributed to the fact that it adjusts the percentage of tasks transmitted to the MEC server or the percentage of tasks that are computed locally. In Figure 4b, the offloading strategy is shown for the same values of $\lambda$. Because full offloading causes smaller delays in comparison to local computing, for low task generation rates the majority of the tasks are sent to the MEC server. When the task generation rate increases, full offloading, due to collisions or outage, cannot provide the required queue stability, therefore a percentage of the tasks are eventually computed locally. That causes $\theta$ to decrease, thus helping the partial scheme retain stability at its buffers, as can be verified from constraints $C_3$-$C_4$ of (24).
In Figure 5, the impact of the tasks' size is investigated. In Figure 5a, the average delay is illustrated. It is observed that both local computing and full offloading experience congestion for smaller task sizes than the proposed scheme, which results in increased delays. The delay of the partial scheme can be seen to be 4 times lower than that of full offloading and 10 times lower than that of local computing, while both binary strategies experience congestion. The proposed scheme offers increased flexibility, since from Figure 5a and Figure 4a it is concluded that the partial scheme supports both greater task generation rates and larger task sizes.
In Figure 5b, the offloading strategy is presented. Note that the task size L does not affect the partial strategy θ directly, as can be verified from the theoretical analysis. Nonetheless, the task size has a great impact on the time slot duration, since, for collisions between successive time slots to be avoided, a task has to be transmitted to the MEC server in less than one time slot. As such, Figure 5b shows that the task size has a great impact on the offloading strategy, since θ rapidly diminishes for greater values of L. This is attributed to the fact that transmitting bigger tasks requires greater data rates R. However, greater data rates cause the outage probability to increase, as can be seen from (2), which undermines the stability of the transmission buffer.
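The chain from task size to outage can be made concrete with a short sketch. It does not reproduce (2): it assumes Rayleigh fading (an exponentially distributed SNR), a fixed slot duration, and the Shannon rate, and all function names and numerical values are hypothetical.

```python
import math


def required_rate(task_bits, slot_seconds):
    """Data rate needed to deliver one task within a single time slot."""
    return task_bits / slot_seconds


def rayleigh_outage(rate_bps, bandwidth_hz, mean_snr):
    """P(B * log2(1 + snr) < rate) for an exponentially distributed snr.

    Assumed textbook expression, used here in place of the paper's (2).
    """
    threshold = 2.0 ** (rate_bps / bandwidth_hz) - 1.0
    return 1.0 - math.exp(-threshold / mean_snr)


# Hypothetical setting: 1 MHz bandwidth, 1 ms slot, 20 dB average SNR.
bandwidth_hz, slot_seconds, mean_snr = 1e6, 1e-3, 100.0
for task_bits in (1000, 2000, 4000, 8000):
    rate = required_rate(task_bits, slot_seconds)
    print(task_bits, rayleigh_outage(rate, bandwidth_hz, mean_snr))
```

Under these assumptions the SNR threshold grows exponentially with the required rate, so the outage probability rises quickly with the task size.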
In Figure 6, the number of devices in the network and its impact on the optimal offloading strategy are studied. As expected, the local computing delay is constant for any number of users. On the other hand, full offloading experiences congestion for a relatively small number of users, which causes full offloading to be less delay efficient than local computing. This is due to the fact that more clients are likely to transmit data to the MEC server; therefore, the probability of collisions dramatically increases. The partial scheme is robust, since it can adjust its partial strategy. As a consequence, the average delay increases slowly with the number of users and, for 20 users, the delay is about half the delay incurred by local computing. Nonetheless, the delay of the partial scheme gradually approaches the delay of local computing, which can be verified from Figure 6b, where the offloading factor θ is shown to rapidly decrease as the number of users increases.
In Figure 7, the impact of the distance from the MEC server is examined. In this setup, we have assumed that the users of the 1st cluster lie at a distance of 50 m from the MEC server, while the distance of the 2nd cluster is varied. It is observed that the users of the 2nd cluster choose to offload a greater amount of tasks to the MEC server than the users of the 1st cluster, which lie closer to the MEC server. At first glance, that may seem contradictory. However, because of the greater path loss of the 2nd cluster, its users have to consume more power when offloading their tasks to the MEC server, compared to the users of the 1st cluster. If we take into account the probabilistic offloading scheme, with which some tasks will be offloaded and others will be locally processed, it is concluded that less power is available to the users of the 2nd cluster for local computing, which limits the tasks that are locally processed. Therefore, the offloading factor of the 2nd cluster is greater than that of the 1st cluster. Moreover, we note that the users of the 2nd cluster, due to their higher outage probability, access the channel more frequently, causing increased interference to the users of the 1st cluster during the preamble phase. Furthermore, the offloading factor of the 2nd cluster exhibits non-monotonic behaviour. In general, offloading the tasks to the MEC server is faster than local computing. As the distance increases, the users of the 2nd cluster offload their data to the MEC server for efficiency and the offloading factor increases. However, beyond some distance, offloading a task to the server is less efficient than processing it locally; therefore, the offloading factor rapidly diminishes to the point where the users of the 2nd cluster offload fewer tasks than the users of the 1st cluster.
Regarding the impact of the available preambles on the network's performance, from Figures 3-5 it is evident that as the number of preambles increases, the delay diminishes rapidly, because fewer collisions can occur. For an equal number of preambles $L_p$, the proposed partial framework is more delay efficient, as can be seen from the figures depicting the average delay for $L_p = 1$. From Figure 6, it is also verified that increasing the number of preambles increases the system's connectivity, since both full and partial offloading experience much less congestion for $L_p = 3$ than for $L_p = 1$.
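The influence of the number of users and of the available preambles on collisions can be sketched with a standard random access argument. This is a simplified illustration rather than the expression used in the analysis: every user is assumed to access the channel independently with the same probability and to pick one of the preambles uniformly at random. The function and parameter names are hypothetical.

```python
def preamble_success_probability(num_users, num_preambles, access_prob):
    """Probability that a tagged active user selects a collision-free preamble.

    Each of the other num_users - 1 users transmits with probability
    access_prob and then picks the tagged user's preamble with probability
    1 / num_preambles (assumed uniform, independent selection).
    """
    if num_users < 1 or num_preambles < 1:
        raise ValueError("num_users and num_preambles must be positive")
    if not 0.0 <= access_prob <= 1.0:
        raise ValueError("access_prob must lie in [0, 1]")
    return (1.0 - access_prob / num_preambles) ** (num_users - 1)


# Hypothetical access probability, used only for illustration.
for num_preambles in (1, 3):
    for num_users in (2, 6, 12, 20):
        print(num_preambles, num_users,
              preamble_success_probability(num_users, num_preambles, 0.5))
```

In this model the success probability decays geometrically with the number of users, and adding preambles slows that decay, which is consistent with the reduced congestion observed for three preambles.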
In Figure 8, the impact of the imperfection between the DT and its underlying physical counterparts is examined. In this setting, we have assumed that a constant deviation exists between the devices' CPU frequency and its DT counterpart, as in Section V. On top of that, due to the unreliability of the wireless channels, the DT is also assumed to have imperfect knowledge of the hardware configuration of the devices; therefore, $\hat{P}_{\max,k} = P_{\max,k} + n_k$ and $\hat{\lambda}_k = \lambda_k + n_k$. The imperfection is expressed as a percentage of the real value of the parameters [37], so for $n_k = 1\%$, $\hat{\lambda}_k = \lambda_k \pm \frac{1}{100}\lambda_k$, etc. From Figure 8a, it is observed that a slight uncertainty between the DT and the physical system causes greater errors in the optimal resource allocation as the number of users increases. Therefore, imperfections tend to be more damaging as the size of the physical space and of the DT increases. Moreover, in Figure 8b, the percentage of error caused for various task sizes is plotted. The error increases as the task size increases. Consequently, inaccuracies between the DT and the physical network cannot be ignored when the state of the network approaches congestion. These imperfections might cause significant errors under computationally demanding network states, for instance, in cases of excessive packet generation.
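The reason a small percentage deviation matters most near congestion can be seen in a toy sensitivity check. The sketch below uses an M/M/1 delay as a stand-in for the delay expressions of this paper, which is an assumption, and perturbs the task generation rate by a fixed percentage. The function names and the rates are hypothetical.

```python
import math


def mm1_delay(arrival_rate, service_rate):
    """Mean sojourn time of an M/M/1 queue (assumed model); inf if unstable."""
    if arrival_rate >= service_rate:
        return math.inf
    return 1.0 / (service_rate - arrival_rate)


def relative_delay_error(lam, mu, deviation):
    """Worst relative delay error when the DT sees lam * (1 +/- deviation)."""
    true_delay = mm1_delay(lam, mu)
    errors = [abs(mm1_delay(lam * (1.0 + sign * deviation), mu) - true_delay)
              for sign in (-1.0, 1.0)]
    return max(errors) / true_delay


# 1% deviation at increasing load, with a hypothetical service rate of 1.
for lam in (0.5, 0.8, 0.9, 0.95):
    print(lam, relative_delay_error(lam, 1.0, 0.01))
```

With a 1% deviation, the relative delay error in this toy model is about 1% at half load but exceeds 20% at 95% load, which mirrors the trend reported for Figure 8.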
In Figure 9, the convergence of Algorithm 1 is shown. More users in the network imply a larger number of constraints and optimization variables. Starting from a random point that satisfies the stability constraints, for the case of 2 users the algorithm converges slowly, reaching an accuracy of approximately $5 \cdot 10^{-2}$ within 20 iterations. However, for the case of 6 users, the algorithm needs only 5 iterations to converge to an accuracy below $10^{-4}$. Moreover, for 12 users, only 4 iterations are needed. The fact that a warm start approach is used between the runs greatly accelerates the proposed algorithm. Therefore, a good strategy for choosing the initial point is to run the algorithm for a small and easy problem, for example the case of 2 users, and then to use the optimal point found as the initial point for the other cases.
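The warm start idea can be illustrated on a toy problem. The sketch below is not Algorithm 1: it runs plain gradient descent on a separable quadratic, and the per-user targets, the step size, and the rule of replicating the mean of the small solution are all assumptions made for illustration.

```python
def gradient_descent(targets, start, step=0.5, tol=1e-4, max_iter=1000):
    """Minimize 0.5 * sum((x_k - t_k)^2); return (solution, iterations used)."""
    x = list(start)
    for iteration in range(max_iter):
        grad = [xk - tk for xk, tk in zip(x, targets)]
        if max(abs(g) for g in grad) < tol:
            return x, iteration
        x = [xk - step * gk for xk, gk in zip(x, grad)]
    return x, max_iter


# Hypothetical per-user optima for a 2-user and a 6-user problem.
small_targets = [0.60, 0.62]
large_targets = [0.60, 0.62, 0.61, 0.59, 0.63, 0.58]

# Solve the small problem first, then reuse its solution for the large one.
small_solution, _ = gradient_descent(small_targets, [0.0, 0.0])
mean_value = sum(small_solution) / len(small_solution)
warm_start = [mean_value] * len(large_targets)
cold_start = [0.0] * len(large_targets)

_, cold_iterations = gradient_descent(large_targets, cold_start)
_, warm_iterations = gradient_descent(large_targets, warm_start)
print("cold start iterations:", cold_iterations)
print("warm start iterations:", warm_iterations)
```

Because the warm start point already lies close to the optimum of the larger problem, fewer iterations are needed to reach the same tolerance, which is the effect exploited when the 2-user solution is reused.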

VIII. CONCLUSION
In this paper, we studied the average delay of a DT-aided MEC system with GF random access by using queueing theory tools. A novel partial offloading scheme was proposed, in which a task is probabilistically either computed locally or offloaded to the MEC server. The duration of the data transmission phase, an arbitrary number of preambles, and the average outage probability were taken into account, while an adaptive data transmission rate was utilized. Then, considering imperfections between the DT and its physical counterpart, closed-form expressions were extracted for the average delay of each device, and an optimization problem aiming to minimize the average delay of all users was formulated and solved by utilizing SCA and AO. Finally, simulation results were presented, which give insights into the network's resource allocation strategy under different scenarios. The impact of different network parameters, such as the number of users and the task generation rate, was examined, and it was shown that the proposed scheme can efficiently adjust its partial strategy to avoid congestion. Possible future extensions of this work could study semi-GF access for MEC or further investigate the dynamic characteristics of the DT-MEC architecture in the case of stochastic and unknown imperfections between the DT and the physical world.