Edge Computing Technology Enablers: A Systematic Lecture Study

With the increasingly stringent QoS constraints (e.g., latency, bandwidth, jitter) imposed by novel applications (e.g., e-Health, autonomous vehicles, smart cities), as well as the rapidly growing number of connected IoT (Internet of Things) devices, the core network is becoming increasingly congested. To cope with those constraints, Edge Computing (EC) is emerging as an innovative computing paradigm that leverages Cloud computing and brings it closer to the customer. “EC” refers to transferring computing power and intelligence from the central Cloud to the network’s Edge. In doing so, EC promotes the idea of processing and caching data at the Edge, thus reducing network congestion and latency. This paper presents a detailed, thorough, and well-structured assessment of Edge Computing and its enabling technologies. We start by defining EC from the ground up, outlining its architectures and evolution from Cloudlets to Multi-Access Edge Computing. Next, we survey recent studies on the main cornerstones of an EC system, including resource management, computation offloading, data management, and network management. We then emphasize EC technology enablers, starting with Edge Intelligence, the branch of Artificial Intelligence (AI) that integrates AI models at resource-constrained edge nodes with significant heterogeneity and mobility. Moving on to 5G and its empowering technologies, we explore how EC and 5G complement each other. After that, we study virtualization and containerization as promising hosting runtimes for edge applications. We further delineate a variety of EC use-case scenarios, e.g., smart cities, e-Health, and military applications. Finally, we conclude our survey by highlighting future concerns for EC, namely green energy integration and standardization.


I. INTRODUCTION
In recent years, the number of connected devices has grown tremendously, causing congestion issues that push toward handling more data at the network's Edge. According to Gartner [1], by 2025, 75% of data generated by enterprises will be processed at the Edge rather than in the centralized Cloud. Beyond relieving those network constraints, EC promises excellent service quality (latency, throughput), which will encourage the evolution toward this revolutionary paradigm. As IBM pointed out [2], moving the computing workload to the Edge will reduce data circulation time from 20 ms to 10 ms. The added value of EC offers tremendous market opportunities to multiple market participants, including Cloud providers, ISPs (Internet service providers), and numerous intermediate hardware and software companies. Based on research by Grand View Research [3], the EC market is forecast to grow from US$3.4 billion in 2020 to US$43.4 billion in 2027, at a growth rate of 37.4% per year. From a different perspective, EC has attracted much attention from academia in the last ten years, as evidenced by the exponential increase in published papers on topics ranging from EC architectures to deployment challenges, orchestration platforms, EC use-cases, and related technologies. Fig. 1 depicts the evolution of the number of EC papers indexed in Google Scholar [4]. In this work, we present a survey on edge computing, beginning with an explanation of this novel computing concept and how EC can meet the growing demand for computing and memory resources with low-latency communications, as well as how EC can solve the rising privacy problems associated with processing data at the cloud level. Beyond that, edge computing is defined by the collaboration of numerous heterogeneous devices at various levels of the network edge.
EC enables those resources to be efficiently managed, scaled, and secured, allowing them to act as high-performance hosts for workloads received from end devices. Moreover, EC is associated with several innovative technologies, notably the Internet of Things (IoT). Along with fifth-generation networks (5G), EC is a vital solution for enabling the proliferation of connected objects. In addition, artificial intelligence (AI) and machine learning (ML) technologies are becoming more prevalent in novel applications. Consequently, there is a growing need for computing resources. Not only will EC address this requirement, but it will also adapt AI models to the network edge environment, promoting the idea of "Edge Intelligence."

A. SURVEY ORGANISATION
The rest of the paper is organized as follows. Section II presents a definition of EC and a review of the related EC surveys, and underscores our contribution and novelty. Section III gives a brief history of the evolution of EC from Cloudlets to Multi-access Edge Computing (MEC) while also highlighting the differences in architectural design of each EC sub-concept. Section IV discusses the recent advancements made in the main pillars of EC, including resource management, computation offloading, data management, network management, security and privacy, and EC pricing and billing. Section V examines the three major enabling technologies of EC, namely Edge Intelligence, 5G, and Containerization, and demonstrates how these technologies are crucial for EC's success. Section VI presents the various scenarios and use cases in which EC is proving to be greatly useful, ranging from e-health to smart cities, and from entertainment to military applications. Section VII discusses the future concerns facing Edge Computing, such as standardization and efficient green energy integration. Lastly, Section VIII concludes our work. Fig. 2 shows the survey structure and a map for assisting the reader. Table 1 defines the acronyms and abbreviations used in the survey.

A. EC: DEFINITION
There is no standard definition of EC, but many researchers view EC as an abstract computing paradigm that aims to move cloud computing and storage capabilities to the network edge, near where the end-users reside, for two main reasons. The first is to meet the quality of service (latency and bandwidth) demanded by the latest applications, and the second is to relieve the growing pressure on the core network. There are also some notable definitions of edge computing. One of the first papers to use the term "Edge Computing" is [5], in which the authors define the concept as "the enabling technology that allows the computation to be performed at the edge of the network, on downstream data on behalf of cloud services, and upstream data on behalf of IoT services". To fully comprehend the EC concept and its functions, the following subsections tackle the two main questions in EC: where is the Edge located, and what exactly is the purpose of edge computing?

1) Where is the Edge located?
As a term, the word "Edge" signifies the extreme part of any given network. In the case of a telecommunications network, it refers to the RAN (Radio Access Network) part. In the case of the data network (the Internet), the Edge or, more precisely, the Edge Device (ED) is any end-user or IoT device at the network extremity (mobile phones, cars, smartwatches, etc.) [6]. In EC, however, the devices responsible for executing computation tasks are referred to as Edge Servers (ESs) or Edge Nodes (ENs). Those computing devices can sit one hop or a few hops away from the edge devices. In any case, processing data at the Edge typically means handling it before it crosses any WAN (Wide Area Network), since passing through a WAN entails a significant data transfer delay. Additionally, the nature of an Edge Server varies depending on the architecture and context in which it is deployed.

2) Why Edge Computing?
Numerous factors drove the need for a new computing paradigm known as Edge Computing; we divide them into two categories: QoS and necessity.
• Quality of Service (QoS) is one of the most important characteristics of novel applications, and it consists of two elements: low latency and high bandwidth. The first element allows numerous novel applications (e.g., autonomous vehicles) to access cloud services with the lowest response time. For the second element, data travels over shorter paths between the Edge and the end-users, allowing for a higher-bandwidth exchange between EC servers and end-users. Soon, many industries, homes, and hospitals will strive to meet those performance requirements, and with EC, they will be able to do so effectively while edge computing suppliers handle the deployment and management of edge servers.
• Necessity: due to the rapidly rising number of IoT devices (tens of billions) and the limited bandwidth, the more computing is performed locally, the better the network capacity is preserved, hence the necessity of Edge Computing. Another subject that raises much concern today is privacy. Many users and companies are not comfortable sending their data to the distant Cloud. Therefore, EC, with its ability to keep the data close to where the users requested it, could be a well-suited solution for this issue. EC has also been found in [7] to be more environmentally friendly than the Cloud: a video analysis experiment showed that computing at the Edge would reduce CO2 emissions by 50% compared to the Cloud.
Related EC surveys listed: Sharghivand et al. (2021), Xu et al. (2021), Sonkoly et al. (2021), Luo et al. (2021), Khan et al. (2020), Cao et al. (2020), Hamdan et al. (2020).

This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication.

Moreover, many of them had fully or partially addressed some of EC's main pillars, including computation offloading, managing resources, managing data, and protecting confidential information [34], [37]. Furthermore, with the extensive exploration of the EC concept and its challenges, the number of publications also started to rise dramatically, as Fig. 1 demonstrates. Along with that, the road map of our targeted subject started to expand, and its branches grew tremendously, to the point where EC began to converge with and touch other related technologies (e.g., IoT, 5G, EI, etc.). As a result, most surveys in the third group tend to focus on a single topic or a sub-branch within EC, such as the recent resource placement survey in [12], resource scheduling in [13], or the work in [9] that covers security and privacy issues in EC.

1) Our contribution
In our work, we drew inspiration from the excellent EC systematic mapping study in [40], recognizing the need to fill the gap and produce a systematic lecture study on the subject.

III. HISTORY AND ARCHITECTURES: THE EVOLUTION OF EDGE COMPUTING
This computing paradigm has evolved through several stages since 1997. As shown in Fig. 3, each stage influenced the concept directly or indirectly and changed the way EC is conceived.

A. CONTENT DELIVERY NETWORKS (CDNS)
Content delivery networks (CDNs) were developed at MIT by a group of researchers who were trying to solve the flash crowd problem [41], where an individual server is unable to serve a large number of requests. The MIT researchers recommended replicating the content on numerous intelligently spread edge servers. Hence, CDNs were the first technology capable of delivering memory resources at the Edge of the network [42].
In the last decade, CDN usage has become increasingly popular among website owners, who cache their files (HTML, images, JavaScript) at a CDN provider to offer a better web experience to their users. Fig. 4 describes the process of requesting content from a CDN provider.

B. CLOUDLETS
To get into the history of Edge Computing, one must first understand its parent computing paradigms: pervasive computing and cloud computing [43]. Pervasive, or ubiquitous, computing promotes the idea of making computation accessible everywhere, so that clients can access capable computers from any place and at any time. Motivated by the ubiquitous concept, the Cloud Computing (CC) framework developed as a modern model that delivers computing and memory resources on a pay-as-you-consume basis. It makes computing assets a service instead of a product [44]. In 2006, CC became prevalent with Amazon's "Elastic Compute Cloud" (EC2). CC consolidates numerous heterogeneous servers to provide infrastructure as a service (IaaS), platform as a service (PaaS), and software as a service (SaaS), all in an adaptable and versatile design. However, CC lacked one critical performance indicator: latency. Because centralized data centers were so far from the end-users, they could not guarantee short communication delays. In response to this need, in 2009, Microsoft proposed Cloudlets [45], a concept in which Cloud users can request computing resources from micro-datacenters (from one to forty servers) called Cloudlets: widespread small data centers with virtualized infrastructure located closer to the end-users, thus offering low-latency connections to Cloud users. With Cloudlet computing, users can request cached content as in CDNs, offload computation tasks to Cloudlets, and, most importantly, get a response in a few milliseconds (≪ Cloud WAN) [8]. Over the following years, Cloudlets came to be referred to as edge servers, a computing component of EC.

C. FOG COMPUTING
The origin of fog computing can be traced back to the vision of Cisco, a company that served as a bridge between the end-users and the Cloud. In 2012, it saw an opportunity to introduce a new computing paradigm known as Fog Computing (FC) [46]. FC aims to provide a continuous computing capability between IoT devices and the Cloud. FC is built on collaboration between multiple heterogeneous devices known as fog nodes [47]. Those devices may exist at different levels of the network (e.g., switches, commodity servers, micro-data centers, etc.). Fig. 5 shows the fog computing paradigm's architecture and its computing elements. The fog computing paradigm differs from Cloudlets because it does not consider fog nodes as isolated devices but as part of a pool of computing resources that can be extended to the Cloud. Among Fog Computing's keywords is "Orchestration" [48], which is the essential mechanism responsible for automating and managing fog resources across multiple network levels.
FIGURE 5: The different EC network levels
Besides that, FC requires serious cooperation between different network and cloud provider entities. The lack of such collaboration was the reason why, two years later, Cisco canceled the fog computing project [49], as the workload was too much for Cisco to handle on its own. Additionally, as described in [50], there are two types of fog computing architectures: the hierarchical architecture, where nodes from different network layers collaborate to perform tasks together, and the flat architecture, where nodes from the same layer join forces to perform fog computing.

D. MOBILE EDGE COMPUTING
From an alternative viewpoint, integrating rich computation capabilities into mobile devices has always been a significant concern, especially since the emergence of smartphones and their associated
Mist computing is a computing paradigm that exploits the participation of multiple extreme edge components (such as Micro-controllers, mobile devices, sensors, etc.) to provide a computing platform that is based on the IoT devices themselves without relying on outsider computing nodes located at the Edge, Fog, or Cloud level [55].

2) Dew computing
Dew computing is a computation concept that was introduced in 2015 [56]. Dew computing focuses on the formation of a collaborative link between Cloud Computing components and end-user personal computing devices. This collaboration allows resources to migrate between the two components depending on the network conditions.

3) Osmotic computing
Osmotic computing is a new computing archetype that supports the efficient execution of Internet of Things (IoT) services at the network edge. This paradigm is founded on the need to couple microservices deployed at the Edge with those services running in large-scale data centers [57]. If CC is created by the consolidation of multiple data centers, Osmotic Computing is characterized by connecting the Cloud, the Fog, and the Edge for the seamless and free movement of microservices between them.
• [59]: a research area aimed at equipping satellites with computing power. Here, satellites utilize their collaborative coverage and one-hop connection with end-users to deliver EC services.
• "UAV-EC" (Unmanned Aerial Vehicle Edge Computing) [60]: a concept in which a collection of unmanned aerial vehicles (UAVs) swarm over a region to cover customers' needs for computing resources with low-latency connections.
• Robotics Edge Computing [61]: a field that entails the merging of different robotic/industrial resources to enhance production processes and robot-human interactions.

G. USE-CASES ORIENTED COMPUTING PARADIGMS
In the following chapters of our survey, if the type of EC architecture is not specified, we will refer to any node in Fog, MEC, or Cloudlets as an edge server or an edge node.

A. RESOURCE MANAGEMENT
Resource management is the act of providing the right and adequate scale of edge resources (CPU, memory, I/O) to any requesting edge application while also optimizing the usage of the existing pool of resources. In CC, a good resource management strategy gives the cloud provider a flexible and efficient way to manage its IT resources, making it an essential element of any successful cloud business. Resource management is likewise one of the essential pillars of EC, as it enables the consolidation of multiple dynamic, heterogeneous, and dispersed edge nodes [62]. Resource management can be broken down into several connected phases, as shown in Figure 6, including generating and distributing a pool of resources, monitoring those resources and provisioning them ahead of time, and lastly, allocating those resources to forthcoming demands.

1) Resource allocation & scheduling
Allocating or scheduling resources entails assigning each upcoming workload to the most appropriate edge server (physical or virtual) to host it. Since QoS is a critical differentiator between the Edge and the Cloud, knowing the exact quantity and quality of resources to allocate for a pending request is critical to the success of Edge Computing [63]. In EC, resource scheduling is a complex function because there are numerous factors and conditions to consider:
□ Cost-driven: in resource scheduling, one of the goals is to reduce computation and bandwidth costs while maintaining the required response time.
Numerous studies, such as [64] and [65], have sought a reasonable balance between reducing the consumed energy and minimizing the latency.
□ Crowd management: whenever a workload is mapped to the closest edge resources, the scheduler must be aware of the load percentage of nearby edge devices, so that overcrowding a specific host can be avoided [66]. Additionally, bursty requests are one of the overloading issues that may affect a particular server or region. The work in [67] provides a collaborative edge-to-edge method to address the issue of bursty requests, in which the authors recommend dispersing requests to other surrounding nodes or regions.
□ Dynamic demand: applications and services at the Edge are often characterized by dynamic changes in the quantity and quality of resources they require over time and space. Thus, the scheduler should take these changes into account and dynamically recalculate and adjust the allocated resources in order to avoid any degradation of QoS or under-utilization of resources [68]. Mobile augmented reality applications provide an excellent example of fluctuating resource demand [69].
□ Priorities: EC environments feature concurrency, where many users compete for a spot in an edge node. In this case, the scheduler should be as fair as possible with all users' requests. The study in [70] compares different scheduling algorithms according to different priorities (for instance, order of arrival, the type of client, the nature of the task, etc.). As an example, analytical queries follow a scheduling selection based on the type of the tasks: the task assigner prioritizes edge nodes with the most relevant data statistics for receiving the analytical tasks [71].
□ Agile learning: based on its past scheduling performance, the resource allocation optimizer can learn to make better decisions in the future, as demonstrated in the platform Deft [72]. In addition, predicting upcoming workloads in space and time provides the edge scheduler with valuable information on which to base its future-aware scheduling decisions. Regression models [73], LSTM [74], and Bayesian learning [75] are among the machine learning models that have been used in the literature for predicting upcoming workloads.
□ Fault-tolerance: because edge nodes can lose power or connectivity at any time, keeping backup copies of any scheduled application improves the overall fault-tolerance of EC services [76]. Another reason for duplicating service instances on several edge servers is to avoid a software multi-tenancy architecture [77], in which one server serves numerous users; this might result in a significant reduction in data throughput, hence violating the EC latency requirement.
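To make the crowd-management criterion above more concrete, the following is a minimal Python sketch of a load-aware scheduler. It is not the algorithm of any cited work: the `EdgeNode` fields, the `schedule` function, and the 75% utilization cap are assumptions chosen purely for illustration. A workload goes to the lowest-latency node whose utilization would stay at or under the cap, and spills over to the least-utilized node with a free slot otherwise.

```python
from dataclasses import dataclass


@dataclass
class EdgeNode:
    name: str
    latency_ms: float  # latency from the requesting device (assumed known)
    capacity: int      # maximum number of concurrent workloads
    load: int = 0      # workloads currently hosted


def schedule(nodes, max_utilization=0.75):
    """Place one workload and return the chosen node, or None if the pool is full."""
    # Prefer nodes that stay at or under the utilization cap after placement.
    under_cap = [n for n in nodes if (n.load + 1) / n.capacity <= max_utilization]
    if under_cap:
        chosen = min(under_cap, key=lambda n: n.latency_ms)
    else:
        # Spill over: any node with a free slot, least utilized first.
        free = [n for n in nodes if n.load < n.capacity]
        if not free:
            return None
        chosen = min(free, key=lambda n: n.load / n.capacity)
    chosen.load += 1
    return chosen


# Hypothetical pool: a small nearby node and a larger, more distant one.
nodes = [EdgeNode("near", 2.0, 4), EdgeNode("far", 8.0, 10)]
placements = [schedule(nodes).name for _ in range(5)]
print(placements)
```

In this toy pool, the nearby node absorbs requests until it reaches the cap, after which requests are dispersed to the farther node, which mirrors the edge-to-edge dispersal idea in spirit only.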

2) Resource placement and migration
The placement of resources refers to designing and engineering the optimal distribution strategy for physical and virtual edge servers. Since edge/fog computing is still in its infancy in terms of real-world deployment, many contemporary studies are attempting to determine the best placement strategy for edge nodes, taking into account a variety of criteria (latency, reliability, user preferences) and considering a variety of scenarios (metropolitan network, vehicular network, etc.).
People and companies in metropolitan areas can utilize EC for their computation and memory needs. In this scenario, edge nodes are dispersed in a similar way to antennas, which means that as the density of a region increases, more edge servers are needed to satisfy demand [78]. Meanwhile, a more precise approach is to consider not only the number of users in a region but also the degree to which the end-users are interested in using latency-aware applications [79]. Moreover, in the case of multi-access edge computing, MEC servers are distributed around cells. In practice, it is advisable that when two or more users from different cells interact (play virtual reality games together, stream with each other), they should be served by the same MEC server to avoid a third-party aggregation server, which typically exists in the cloud [80]. Further, many researchers suggest placing edge servers as close as possible to access points to minimize latency. However, this approach raises concerns about spending CAPEX (capital) and OPEX (operational) funds. In remedying those costs, the authors of [81] investigated the best balance between QoS offering and cost reduction in 5G-MEC servers placement.
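As a hedged illustration of demand-aware placement, the sketch below greedily selects edge server sites so as to reduce the demand-weighted distance to the nearest server. Greedy selection is a common heuristic and is used here only as a stand-in; it is not the method of [78], [79], or [81]. The function names, coordinates, and weights are hypothetical, with each weight standing for a combination of user count and interest in latency-aware applications.

```python
import math


def weighted_distance(sites, demand):
    """Total of weight x distance from each demand point to its nearest site."""
    return sum(w * min(math.dist(p, s) for s in sites) for p, w in demand)


def greedy_placement(candidates, demand, k):
    """Pick up to k candidate sites, one at a time, each minimizing the total cost."""
    chosen = []
    for _ in range(min(k, len(candidates))):
        remaining = [c for c in candidates if c not in chosen]
        if not remaining:
            break
        best = min(remaining, key=lambda c: weighted_distance(chosen + [c], demand))
        chosen.append(best)
    return chosen


# Demand points as ((x, y), weight); all values are made up for the example.
demand = [((0, 0), 100), ((10, 0), 10)]
candidates = [(0, 0), (10, 0), (5, 0)]
print(greedy_placement(candidates, demand, k=2))
```

A cost term for CAPEX and OPEX could be added to the objective to reflect the QoS versus cost balance discussed above; it is left out to keep the sketch short.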
Furthermore, an important criterion to take into consideration when distributing edge resources is robustness [82], defined as the ability of a system to survive or function normally despite multiple edge nodes failing or being attacked. The resource distribution must be robust so that if an edge node dies, there is another one in that region that can replace it. Depending on budget constraints, this placement approach may trade users' coverage for failure resilience [83]. Another safety factor to consider when placing edge resources is the handling of uncertain or unexpected workloads. To address this issue, the authors of [81] suggested learning about workload patterns before deciding on edge server placement strategies.
Besides physical resource placement, virtual machine placement is equally important. A simple technique to arrange VMs (virtual machines) is to pack as many of them as possible into a few physical edge servers, minimizing the number of active servers and therefore reducing the consumed energy [84]. However, this approach can create more congestion on the network, because it leads to more VM migrations to follow widespread demand and users' mobility [85]. In addition to resource placement, resource migration is a key mechanism for balancing the load on edge nodes and accommodating the mobility challenges that exist in EC. When migrating VMs, the researchers in [86] propose using artificial intelligence models to predict user mobility, allowing VMs to migrate proactively before new workloads arrive. In addition, the following are the three main items to be considered when VMs are migrated with user mobility:
• The handover effect: one technique to lessen the frequency of this effect in a VEC (Vehicular Edge Computing) situation is to apply an intelligent server placement strategy, in which vehicle resources are transferred based on the user's movements, thereby enhancing resource availability [87].
• The task deadlines: workloads should be shifted not just to the closest server but to the most capable one that can help meet the deadlines [88].
• The cost of migration: the edge service provider should study the users' paths to place services in a way that reduces the overall communication costs [89].
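The VM consolidation idea described above (packing VMs onto as few active servers as possible) can be sketched with the classic first-fit decreasing bin-packing heuristic. This is an illustrative choice, not necessarily the technique used in [84]; the single-dimensional demand (e.g., CPU cores), the `consolidate` function, and the sample numbers are assumptions.

```python
def consolidate(vm_demands, server_capacity):
    """First-fit decreasing: return one list of VM demands per active server."""
    servers = []  # each entry is [remaining_capacity, [vm_demand, ...]]
    for demand in sorted(vm_demands, reverse=True):
        if demand > server_capacity:
            raise ValueError(f"VM demand {demand} exceeds server capacity")
        for server in servers:
            if server[0] >= demand:
                # The VM fits on an already active server.
                server[0] -= demand
                server[1].append(demand)
                break
        else:
            # No active server can host the VM, so power on a new one.
            servers.append([server_capacity - demand, [demand]])
    return [vms for _, vms in servers]


# Hypothetical CPU demands (in cores) packed onto 10-core edge servers.
packing = consolidate([4, 8, 1, 4, 2, 1], server_capacity=10)
print(len(packing), packing)
```

Such dense packing minimizes the number of active servers but ignores where users are, which is exactly why it can trigger the additional migrations noted above.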

3) Resource provisioning
Resource provisioning is the technology that binds the quantity and quality of resources to users' desired quality of service. In EC, provisioning resources requires planning, estimating, and pooling the necessary amount of physical and virtual machines, along with their exact customization in terms of processor, memory, and network interfaces, all before they are passed to the scheduler to be used by upcoming requests [90].
However, unlike Cloud Computing, in which the costs of the resources are likely to stay steady, the edge environment is well known for its spatial-temporal variation in prices; resource provisioning actions must therefore track recent changes and better balance the QoS with the costs [91]. A good example of spatial changes can be seen in the case of vehicular edge computing, where resources should be provisioned according to traffic [92]. To illustrate the importance of resource provisioning, consider the case of an edge application provider that rents servers from an edge infrastructure provider (EIP). On one side, over-provisioning can result in a loss of energy and money. On the other side, under-provisioning can have a destructive impact on the offered QoS and will also result in a variety of overflow incidents [93]. Therefore, resource planning is highly dependent on knowing the available budget while estimating the demand by understanding edge client behavior patterns [94]. Additionally, an EIP may run out of resources in some locations or at certain times. To overcome these constraints, collaborative resource provisioning across various EIPs can make edge services available everywhere [95]. Another option for dealing with edge infrastructure constraints is to employ public cloud backup services [96], which can keep an application running even if the edge infrastructure runs out of capacity.
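The over- versus under-provisioning trade-off can be illustrated with a small cost model. The sketch below is an assumption-laden example rather than a method from the cited works: it charges a hypothetical `idle_cost` per unused server and a hypothetical `shortage_cost` per unit of unmet demand, and picks the provisioning level with the lowest average cost over a made-up demand history.

```python
def provisioning_cost(provisioned, demand, idle_cost, shortage_cost):
    """Cost of one period: pay for idle servers and for unmet demand."""
    idle = max(provisioned - demand, 0)
    shortage = max(demand - provisioned, 0)
    return idle * idle_cost + shortage * shortage_cost


def best_provision(demand_history, idle_cost, shortage_cost):
    """Provisioning level with the lowest average cost over the observed demand."""
    def average_cost(n):
        total = sum(
            provisioning_cost(n, d, idle_cost, shortage_cost) for d in demand_history
        )
        return total / len(demand_history)

    levels = range(min(demand_history), max(demand_history) + 1)
    return min(levels, key=average_cost)


# Hypothetical number of servers needed in five past periods.
history = [2, 4, 4, 6, 10]
print(best_provision(history, idle_cost=1, shortage_cost=3))
print(best_provision(history, idle_cost=1, shortage_cost=1))
```

When shortages are penalized more heavily than idle capacity, the chosen level moves above the median demand, which is one simple way to express that QoS violations are costlier than wasted resources.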
Furthermore, the resource provisioning strategy must be aware of the preparedness of resources. In fog/edge, nodes are characterized by high variability in terms of connectivity and availability. Given that, a continuous resource monitoring technique should be adopted [97], within which the aliveness and readiness of edge nodes are predicted by analyzing parameters such as the battery level, the movement patterns of edge nodes, etc. The monitoring of resources is conducted by a group of edge nodes that aggregate information about the states of the surrounding ENs [98]. One of the resource monitoring techniques is overlay gossip [99], which is widely used in wireless mesh networks [100], where a set of nodes distribute data to the whole network without overloading the system, and the nodes' correct reaction to that data is interpreted as a sign of their aliveness.
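A rough simulation can convey how gossip-style dissemination doubles as a liveness check. The sketch below is a simplified push-gossip model, not the overlay gossip protocol of [99]: the `gossip_probe` function, its fanout, the fixed number of rounds, and the node names are all assumptions, and the simulation presumes at least one live node. Nodes that never acknowledge the probe are reported as suspected failures.

```python
import random


def gossip_probe(nodes, dead, fanout=2, rounds=10, seed=0):
    """Spread a probe by push gossip; return the nodes that never acknowledged it."""
    rng = random.Random(seed)
    live = [n for n in nodes if n not in dead]
    acked = {live[0]}  # the probe starts at an arbitrary live node
    for _ in range(rounds):
        for sender in list(acked):
            # Each informed node forwards the probe to a few random peers.
            for peer in rng.sample(nodes, k=min(fanout, len(nodes))):
                if peer not in dead:  # dead nodes never react to the probe
                    acked.add(peer)
    return set(nodes) - acked


# Hypothetical pool of 20 edge nodes, two of which have failed.
nodes = [f"en{i}" for i in range(20)]
dead = {"en3", "en7"}
suspected = gossip_probe(nodes, dead)
print(sorted(suspected))
```

Dead nodes always end up in the suspected set; a live node can be wrongly suspected only if the probe fails to reach it, which becomes unlikely as the number of rounds grows.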

4) Resource pooling
Creating a pool of resources that can be provisioned or scheduled according to incoming requests is known as resource pooling. Resource pooling aims to group heterogeneous edge nodes and arrange them into a coherent community by allowing them to interact and use each other's resources (computation, networking) [101]. One way of creating a sufficient pool of resources is to encourage multiple edge infrastructure providers (EIPs) to collaborate. In light of that, the work in [102] proposes a cooperative game-theoretic approach. The game's goal is to reduce the overall service latency by creating a coalition of multiple EIPs and rewarding the coalition members based on their contributions. Besides EIP/EIP collaboration, MEC/Cloud collaboration was explored in [103], where the MEC provider can buy resources from the Cloud when it faces excess demand. Conversely, the cloud provider can buy MEC resources to offer premium QoS to its clients. Resource sharing is also crucial for some edge computing architectures, for example vehicular edge computing (VEC), where there is always a need for portal vehicles to lease their resources for the benefit of all, being rewarded in return [104]. In Fog Computing, resource discovery and selection mechanisms need to be implemented to create a pool of resources from the massive number of heterogeneous Edge nodes. Resource discovery, or node discovery, helps locate new resources and add them successfully to the pool [105]. The thesis [106] is an excellent work that covers discovering fog nodes in the surroundings using customized WiFi beacon techniques. Another way to discover fog nodes is to provide them with a metadata description that makes them known to the other fog nodes [107].
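One standard way to reward coalition members according to their contributions is the Shapley value from cooperative game theory. It is used below only as an illustrative stand-in; the source does not state which reward rule [102] adopts. The value function (demand served by the pooled capacity), the provider names, and the capacities are hypothetical.

```python
from itertools import permutations


def shapley(players, value):
    """Average marginal contribution of each player over all join orders."""
    shares = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            joined = coalition | {p}
            shares[p] += value(joined) - value(coalition)
            coalition = joined
    return {p: s / len(orders) for p, s in shares.items()}


# Hypothetical EIPs with their capacities, and a total demand of 8 units.
capacity = {"A": 6, "B": 4, "C": 2}
DEMAND = 8


def served(coalition):
    """Demand a coalition can serve: pooled capacity, capped by the demand."""
    return min(sum(capacity[p] for p in coalition), DEMAND)


rewards = shapley(list(capacity), served)
print(rewards)
```

Enumerating all join orders is only practical for a handful of providers; larger coalitions would need sampling or a different reward rule.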

B. COMPUTATION OFFLOADING
Computation offloading is a branch of computer science that deals with whether to run a process locally or send it to be processed by an outside commodity server. Computation offloading gained popularity with the rise of mobile cloud computing [51]. A resource-constrained mobile device always requires more resources to run sophisticated applications, like Google Assistant or Apple Siri. As a result, those voice recognition tasks are offloaded to the Cloud. As interest in edge computing has grown, the question of offloading has become more prevalent than ever and has taken on new forms. With EC, offloading is not only vertical or unidirectional but also horizontal: from IoT device to IoT device, from edge server to edge server, and from any IoT or end device to any destination server in the mist-fog-cloud continuum. In [108], the authors presented a literature review that answers the central questions in computation offloading: when and where to offload, and according to what measurement should the decision be taken? Offloading thus entails selecting appropriate resources, filtering them, and deciding which ones are the most suitable for the given task [109]. EC recognizes four types of offloading directions, listed below:
• End-device-to-End-device: end devices close to one another can collaborate, as IoT and end-user devices are becoming more powerful. Tasks are executed locally if possible or forwarded to lightly loaded collaborative IoT devices in the surroundings [110].
• End-device-to-Cloud: because not all tasks are time-sensitive, incorporating the Cloud into the offloading equation can significantly increase the system capacity [111].
• End-device-to-Edge-to-Cloud: also known as hierarchical offloading [112], this is a technique in which an end-device sends requests to the most appropriate edge server (ES), and the ES decides which parts to execute and which to offload to the Cloud.
• Vertical and horizontal offloading: end-devices can simultaneously transfer tasks vertically (to edge/fog/cloud) and horizontally (to neighboring nodes). This offloading type is well illustrated in VEC (Vehicular Edge Computing) [113].
The following sections examine the different aspects that can influence the offloading choice, ranging from workload study to the decision-making process to the various offloading algorithms in use.

1) Workload studying
Before a task can be offloaded, it must first be understood and studied. Almost any task can be represented as a DAG (Directed Acyclic Graph) with multiple interdependent subtasks (see Fig. 9). Given the limited resources of edge nodes, effective task partitioning methods are highly valued in EC. The ultimate goal of task partitioning is to reduce the execution latency of the main task [114], which can be accomplished by creating as many parallel subtasks as possible. However, reducing latency through task partitioning should be accompanied by lower communication costs, as distributing subtasks across many edge nodes may cause network congestion and increase the probability of bit errors during data transmission [115].
FIGURE 9: A task divided into a DAG, the result of both executing C and D is required to run the subtask D in a Raspberry Pi

Besides, while most theoretical studies assume that the complexity of a task is known, in practice, it is usually unknown before the task is executed. Consequently, the offloading brain should always estimate the runtime of each task before carrying it out [116]. Additionally, because tasks are divided into interdependent subtasks, the offloading process should take those dependencies into account to reduce transfer delays between subtasks, as well as to give priority to subtasks that are required to complete other subtasks [117].
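As an illustration of how subtask dependencies shape latency, the following minimal sketch computes the earliest finish time of each subtask in a DAG. It assumes that runtimes are already estimated, that enough edge nodes exist to run all ready subtasks in parallel, and that transfer delays are zero. The graph, the runtimes, and the function name are hypothetical and are not taken from the surveyed works.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def earliest_finish_times(deps, runtime):
    """Return the earliest finish time of every subtask.

    deps maps a subtask to the set of subtasks it depends on;
    runtime maps a subtask to its estimated execution time.
    """
    finish = {}
    # static_order() yields each subtask only after all its predecessors.
    for task in TopologicalSorter(deps).static_order():
        start = max((finish[d] for d in deps.get(task, ())), default=0.0)
        finish[task] = start + runtime[task]
    return finish


# Hypothetical job: B and C depend on A, and D depends on both B and C.
deps = {"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B", "C"}}
runtime = {"A": 2.0, "B": 3.0, "C": 1.0, "D": 4.0}

finish = earliest_finish_times(deps, runtime)
makespan = max(finish.values())  # latency of the whole task
```

In this example the chain A, B, D determines the overall latency, so B is the kind of subtask that a dependency-aware offloader would prioritize over C.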

2) Decision making
Making the correct offloading decision is a complex problem, and offloading should only be chosen when necessary, since sending requests to the edge/cloud can cause transfer delays. When offloading a deep learning inference task, the effort [118] recommends employing an intermediate layer to measure the accuracy of a neural network (NN): the authors suggest sticking with the intermediate result if its accuracy exceeds a threshold, and otherwise sending the rest of the NN to the Cloud to be processed by a larger neural network. Similarly, the reference [119] proposes using an estimator to determine whether a small NN is sufficient or a larger one is required. Meanwhile, when outsourcing an NN inference to the Edge, the handover should take place at a layer with a small number of neurons to reduce data transfer costs [120]. Moreover, one of the criteria that influence the offloading decision is the state of the network, since sending packets over congested networks may add extra delays to EC tasks [121]. In this context, data compression techniques can help reduce network congestion when offloading [122], although they may add extra latency due to compression and decompression delays.
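The trade-off between transfer delay and remote speed-up can be sketched with a simple latency comparison. This is a minimal model under stated assumptions: the task is described only by its input size and CPU cycle count, the link rate is constant, and result download time is ignored. All names and numbers are hypothetical.

```python
def local_latency(cycles, device_hz):
    """Time to run the task on the end device."""
    return cycles / device_hz


def offload_latency(data_bits, uplink_bps, cycles, server_hz, rtt_s=0.0):
    """Time to upload the input and run the task on a remote server."""
    return rtt_s + data_bits / uplink_bps + cycles / server_hz


def should_offload(data_bits, uplink_bps, cycles, device_hz, server_hz, rtt_s=0.0):
    """Offload only when the remote path is strictly faster than local execution."""
    remote = offload_latency(data_bits, uplink_bps, cycles, server_hz, rtt_s)
    return remote < local_latency(cycles, device_hz)


# Hypothetical task: 8 Mbit of input, 2 Gcycles of work.
fast_link = should_offload(8e6, 20e6, 2e9, device_hz=1e9, server_hz=10e9, rtt_s=0.02)
slow_link = should_offload(8e6, 1e6, 2e9, device_hz=1e9, server_hz=10e9, rtt_s=0.02)
```

With the fast uplink the upload costs 0.4 s and offloading wins; with the congested uplink the upload alone takes 8 s, so local execution is preferable.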
Generally, one of the most challenging aspects of FC is dealing with uncertainties when the system operates in a black-box environment, where the task offloading assigner is unaware of the computing capabilities of the surrounding fog nodes. In that scenario, [123] proposes a Coded Computing approach based on the map-reduce model [124], which splits jobs into sub-jobs, each of which is sent to multiple edge servers. Only the first completed replica of a sub-job is taken into account, which protects the system from slow or untrustworthy servers. Additionally, [125] discusses another study that aimed to perform well in those uncertain environments (unknown state of nodes, lack of feedback from the environment), where a reinforcement learning approach was used to learn to adapt to those uncertainties.
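The effect of keeping only the first completed replica can be illustrated with a few lines of arithmetic. The sketch below models plain replication, which is a simplification and not the coded scheme of [123]. It assumes that the completion time of every replica is known after the fact; the values are hypothetical.

```python
def job_completion_time(replica_times):
    """Completion time of a job whose sub-jobs are each replicated.

    replica_times[i] lists the completion times of the replicas of sub-job i.
    A sub-job ends with its fastest replica; the job ends with its slowest sub-job.
    """
    return max(min(times) for times in replica_times)


# Hypothetical measurements (seconds): three sub-jobs, two replicas each.
times = [[1.0, 5.0], [2.0, 2.5], [9.0, 1.5]]

with_replication = job_completion_time(times)
without_replication = max(t[0] for t in times)  # only the first server of each sub-job
```

Here the slow server that needs 9 s for the third sub-job no longer dominates the job, because its faster replica finishes in 1.5 s.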
Computation offloading can benefit significantly from resource monitoring: without a resource orchestrator that tracks, detects, and selects the appropriate quantity of resources, the offloading decision remains unidirectional and unaware of dynamic changes in the resource pool [126]. For that reason, several efforts focus on combining computation offloading with resource scheduling [127]. Meanwhile, the offloading decision-maker is responsible not only for improving performance but also for maintaining system stability [128]. This stability is measured by the queue state of the edge nodes that receive workload; when the receiving queue of an edge node is saturated (congested, or approaching a state where received tasks can no longer be scheduled), the assigner must bypass that node.
Furthermore, the quality of experience (QoE) is a crucial criterion that needs to be addressed in the offloading decision-making process. For example, offloading a video stream job is assessed by the low latency and high throughput that an Edge Service Provider (ESP) can supply [129]. One of the approaches used to increase user QoE is workload prediction [130]. Many edge-enabled apps use predictive information about upcoming workloads to perform parts of their tasks even before the user asks for them.

3) Offloading algorithms
Many mathematical algorithms and methods have been proposed in the literature for making the offloading decision [12]; we highlight them in the diagram of Fig. 10.

C. DATA MANAGEMENT
The act of acquiring, storing, distributing, and using data is referred to as data management. Data management aims to assist edge nodes in placing, sharing, analyzing, and retrieving data from one another to make decisions and take actions that maximize their overall utility while minimizing the end-to-end latency. However, data management in EC is more complicated than it is in CC. This difficulty is due to the large number of heterogeneous and widely distributed edge nodes that can adhere to any spatial topology distribution. As a result of reviewing various and recent research related to data management in EC, we listed several keywords and terminologies related to data management in Fig. 11.

FIGURE 10: Computation offloading algorithms (recoverable labels: optimization techniques, heuristic algorithms)

1) Content caching
Content replication, also known as content caching, is an old and essential concept that encapsulates the idea of delivering content close to where the user requested it. In content caching, several servers are installed at the network's Edge, creating what is referred to as a CDN (content delivery network) [42]. Further, caching content aims to optimize the cache hit reward, measured by the number of times users request data stored on edge servers, while lowering the cost of using those servers [131]. The caching problem is modeled as a multi-objective optimization problem with many parameters, as illustrated in Table III. This problem is difficult to solve in general. However, it can be approximated by a single-objective optimization problem by considering the costs as constraints or by employing extra weight variables to control the optimization preferences between the rewards and the costs, as shown in [132].

FIGURE 11: Data management keywords

Caching parameter | Description
System components | The number of ESs and the number of users covered by each ES; the number of files and the size of each file.
Constraints | The memory and bandwidth restrictions imposed by each ES.
Decision variable | Binary variables that encode the decision to cache a given file in a given EN at a given time slot.
Cost | Costs associated with the use of a set of ESs over a given period of time.
Reward | The reward gained from users requesting files from ESs.

In Edge Computing, the two main factors that influence the caching policy are content popularity and edge environment capacity. In an edge environment, ENs are defined by memory and connectivity constraints. As a result, the caching policy at the Edge should consider load balancing among different edge servers to improve fairness in terms of exploitation and avoid exhausting a single node or a group of nodes [133]. Additionally, when deciding on a caching policy for MEC-based caching, it is critical to consider the connectivity capabilities of base stations as well as bandwidth limitations [134]. Another capacity criterion is stability. In [135], the cache hit and system stability are optimized concurrently to maximize cache capacity and improve the overall system robustness. Moreover, there is no doubt that the rewards obtained from cache hits are directly proportional to content popularity. Therefore, many recent studies have focused on predicting the popularity of content using machine learning and deep learning models. Some of those studies use K-means [136], GRU (Gated Recurrent Unit) [137], and reinforcement learning methods such as the multi-armed bandit, which balances exploration to find popular content with exploitation to cache the in-demand content [138].
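The interplay between content popularity and edge capacity can be illustrated with a greedy heuristic that ranks files by expected requests per unit of memory. This is a generic knapsack-style sketch, not the method of any cited work, and it is not guaranteed to be optimal. The file names, sizes, and request counts are hypothetical, and popularity is assumed to be known in advance.

```python
def greedy_cache(files, capacity):
    """Select files to cache on one edge server.

    files maps a file name to (size, expected_requests);
    capacity is the memory available on the server, in the same unit as size.
    """
    # Rank by expected requests per unit of memory, highest first.
    ranked = sorted(files, key=lambda f: files[f][1] / files[f][0], reverse=True)
    cached, used = [], 0
    for name in ranked:
        size = files[name][0]
        if used + size <= capacity:  # memory constraint of the ES
            cached.append(name)
            used += size
    return cached


# Hypothetical catalogue: name -> (size, expected requests per time slot).
catalogue = {"a": (4, 40), "b": (3, 15), "c": (2, 30), "d": (5, 10)}

cached = greedy_cache(catalogue, capacity=7)
expected_hits = sum(catalogue[name][1] for name in cached)
```

In practice the request counts would come from a popularity predictor such as those cited above, and a cost term could be subtracted from the ranking score to reflect the weighted single-objective formulation.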
In the last 20 years, the telecommunication industry has grown tremendously, and one of the main reasons for its success is infrastructure sharing [139]. As a result of this collaboration, mobile users enjoyed a better QoE, and the mobile infrastructure was exploited to its full potential. Similarly, MEC researchers have begun studying the possibility of sharing MEC infrastructure between different MEC providers (Fig. 12 depicts a simple architecture for a MEC providers' data caching collaboration). In this collaborative caching scenario, a MEC provider's added value is the cache hit gained from serving users subscribed to other MEC providers [140].

2) Data dissemination
Data dissemination, or data circulation, is a fundamental element of the Internet of Things domain. In the case of wireless sensor networks (WSNs) [141], where a massive amount of data is generated each second, finding the best strategy to circulate data between the Edge and the Cloud is a challenging task. On the one hand, the strategy should avoid crowding the network. A solution is presented in [142], in which the authors introduce a new broadcasting protocol that uses neighbor knowledge around each EN to prevent redundant broadcasting. On the other hand, if the circulating data is emergency data, it should be kept away from overcrowded parts of the network [143]. In addition, data disseminated at the Edge must also avoid packet loss, for which the work [144] envisions a parallel push technique.

FIGURE 12: Sharing MEC infrastructure between multiple mobile operators
One of the fundamental aims of data dissemination is to maintain data availability, which is especially important when edge nodes enter and leave the network at any moment and intermittent wireless connections are the norm. In response to these circumstances, [145] proposed a strategy for regulating data transmission in fog computing using epidemic models. In addition, network stability is another goal of data dissemination. For example, in an Internet of Vehicles (IoV) scenario, according to the study in [146], the disseminated data between vehicles should use a restricted number of hops before being transferred to a roadside unit to ensure stability.

3) Data streaming
As of 2020, video streaming accounted for 71% of all downstream traffic according to Comcast Cable [147]. Just on Twitch, 17 billion hours of live streams have been viewed. As a result, streaming is now more prominent than ever. Currently, hierarchical streaming from the Cloud to the Edge and then from the Edge to the covered users is the most popular architecture, particularly in companies like Netflix, which have as a primary goal the prevention of central servers from becoming bottlenecks [148].
In edge data streaming, users' QoE (Quality of Experience) is influenced by video quality (high, medium, low), latency, and bit rate variance. In [149], the authors present Elephanta, an adaptive bit rate algorithm that adapts to the end user's preferences regarding average bit rate, rebuffering, switching, and buffer occupancy to select the appropriate bit rate. Likewise, to avoid the video flicker effect caused by changes in video bit rate, and in the context of vehicle fog computing (VFC), the authors of [150] modeled bit rate selection optimization as an actor-critic reinforcement learning (ACRL) problem. Even though Edge Computing helps reduce upstream traffic to the Cloud, the need to upload data to the Cloud persists, particularly for applications requiring high-performance computing. [151] proposes a new stream sampling technique that only uploads to the Cloud what is required for incremental learning (IL). IL is defined as the ability of a model to learn continuously from the newest data. In downstream traffic, by contrast, edge devices are expected to extract important features from streamed data without repeatedly processing it [152]. One of the objectives of stream processing at the Edge is to discover anomalies and novelties, such as determining the skyline sets in data streams [153], which represent the most significant data points or data objects.
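To make the idea of adaptive bit rate selection concrete, the following sketch applies a generic throughput and buffer rule. It is not Elephanta [149] and not the ACRL formulation of [150]; the safety factor, the low-buffer threshold, and the bit rate ladder are hypothetical parameters chosen for illustration.

```python
def select_bitrate(ladder_kbps, throughput_kbps, buffer_s,
                   safety=0.8, low_buffer_s=5.0):
    """Pick the highest bit rate that fits a conservative throughput budget."""
    budget = throughput_kbps * safety
    if buffer_s < low_buffer_s:
        budget *= 0.5  # be more cautious when the player is close to rebuffering
    feasible = [b for b in sorted(ladder_kbps) if b <= budget]
    # Fall back to the lowest rung when nothing fits the budget.
    return feasible[-1] if feasible else min(ladder_kbps)


ladder = [300, 750, 1500, 3000, 6000]  # hypothetical encoding ladder in kbit/s

healthy = select_bitrate(ladder, throughput_kbps=4000, buffer_s=12.0)
draining = select_bitrate(ladder, throughput_kbps=4000, buffer_s=2.0)
starved = select_bitrate(ladder, throughput_kbps=200, buffer_s=12.0)
```

A preference-aware algorithm would replace the fixed safety factor with weights on average bit rate, rebuffering, and switching frequency.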
Furthermore, in EC, transcoding data streams into different formats and different quality levels is essential because of the heterogeneity of IoT and edge devices. Transcoding can be done proactively at the Edge to improve system efficiency, as demonstrated in [154]. Nonetheless, reducing transcoding time at the Edge is also critical for providing high-quality services. In that scope, [155] put forward a new method for reducing transcoding time by extracting information from the encoding time and saving it as meta-data, which is then used to reduce transcoding time at the Edge. Besides, [156] proposed the novel concept of collaborative live stream transcoding, in which viewers can transcode using their own devices and are rewarded in return. This method is a low-cost solution for reducing delays caused by cloud transcoding.

VOLUME 4, 2016
This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication.

4) Data analytics
Nowadays, vast amounts of data are generated every second, all over the place. There is a huge demand for data analytics tools; hence, different data processing frameworks have gained popularity, including Hadoop [157] and Spark [158], due to their ability to extract information from large chunks of data. However, those tools require a lot of computing power and memory, making them more suited to the Cloud than the Edge. With today's trend toward EC, Edge Analytics is a key technology that will meet the requirement of keeping data at the Edge in order to maintain real-time performance and reduce congestion in the core network [159]. However, edge analytics faces numerous challenges regarding accuracy decline due to insufficient computational resources and data decentralization. Table IV illustrates the advantages and disadvantages of centralized and distributed analytics. Additionally, some studies such as [160] focus on balancing analytic accuracy and bandwidth consumption based on the degree of data decentralization. Similarly, another bandwidth-accuracy trade-off occurs in cloud-edge analysis, where edge devices can compress images before sending them to the Cloud to save bandwidth, but this can reduce accuracy, especially for highly detailed images [161].

5) Data placement
Data placement, or data scheduling, is the art of finding the best place to store data. Placing data in an Edge-Cloud environment depends on the nature of the service that requires the data. In the case of edge application data, the ultimate goal of a placement strategy is to meet the latency demand while also reducing the cost of transferring the data [165]. Meanwhile, since sensors and IoT devices are becoming more powerful, they can handle a reasonable amount of data, making it more appropriate to store the data at the mist/fog level rather than in the cloud [166]. Additionally, [167] conducted a comparative study of three different algorithms in different network topologies, concluding that the algorithm that favors placing data at the Edge outperforms the one that chooses a fog mapping or the standard cloud data placement. Besides, the effort [168] suggested the use of reinforcement learning for data scheduling in VEC, where the RL model can intelligently decide whether to place data locally, transfer it to an RSU, or forward it to a collaborative vehicle.
Before placing data, the placement strategy should anticipate data retrieval by shortening the retrieval routing path [169]. A good data location service is required to retrieve data; [170] proposes HDS (Hybrid Data Sharing), a fast data location service adapted to the MEC environment. Further, data placement must account for the heterogeneity and mobility of edge nodes. Vehicle Edge Computing (VEC) is a good example, where mobility is the norm and heterogeneity influences how data is placed based on the type of vehicle (public, private, or emergency) and the importance of the content [168].

D. NETWORK MANAGEMENT
Network management encapsulates the advancements made in network infrastructure and architectures that adapt the network parameters to the new computing paradigm. The network management function focuses on monitoring, analyzing, and dynamically adjusting network status as needed. For edge computing to be successful, it must enable resilient and cost-effective network management methods, ranging from access control to traffic engineering to the adoption of the newest network technologies. References [35] and [58] are two of the most notable publications that have surveyed the networking elements of EC. The following sections discuss notable network technologies that assist EC, including network abstraction (SDN, NFV), radio access networks (F-RAN, C-RAN), and radio resource allocation.

1) SDN
SDN (Software Defined Network) is an abstraction technique that decouples network control from data transmission by logically centralizing the network command functions in a NOS (Network Operating System), or SDN controller. The NOS instructs the network's forwarding devices on how to handle data packets (where and when to transfer data). For that to be possible, the data plane must consist of programmable switches that support a protocol such as OpenFlow [171]. Overall, SDN allows the network to be more flexible and programmable. Fig. 13 illustrates the two layers of SDN architecture in fog computing. Besides, by designing the optimal path and employing the best packet forwarding procedures, SDN pairs well with the EC environment [172]. Taking all of these factors into account, [173] studied the characteristics and limits of SDN technology for edge-cloud computing.
Many scaling functions in edge computing, such as computation offloading or load balancing, necessitate the cooperation of multiple network data plane components. In this regard, SDN architecture is helpful in EC because it enables the SDN controller to distribute bandwidth resources optimally across different data flows [174]. Additionally, because SDN has a global overview of network topology and users' movement, SDN controllers aid in determining the best edge server destination to host migrated services [175]. Other specifications, such as throughput or user preferences, can also be taken into account in a mobility scenario to guide the SDN controller in performing the optimal handover for edge services [176]. Aside from mobility, a network load balancing mechanism can be implemented by configuring SDN switches across several edge servers [177] to ensure network resilience and protect against traffic spikes [178].
Moreover, SDN can help improve the performance of many edge applications such as CDNs (content delivery networks), where the SDN controller can take advantage of its aggregated network information to compute the shortest paths between users and the content provider server [179]. Additionally, UAV air-ground communication is another application enabled by SDN abstraction [180], in which the SDN controller can utilize the predicted traffic load to perform data forwarding with the highest throughput efficiency. Aside from network decisions, the SDN controller can perform a variety of other required decision-making tasks, including decisions about an ML inference task [181], in which the NOS, based on the accuracy, can decide whether to transmit the ML task to the Edge or to keep it at the level of IoT devices.
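A controller with a global view of the topology can compute such paths with a standard shortest-path routine. The sketch below uses Dijkstra's algorithm on a hypothetical four-node topology with non-negative link costs; it illustrates the computation only and does not reproduce the scheme of [179] or any controller API.

```python
import heapq


def shortest_path(graph, src, dst):
    """Dijkstra's algorithm. graph maps a node to {neighbor: non-negative cost}."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue  # stale heap entry
        visited.add(u)
        if u == dst:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None, float("inf")  # destination unreachable
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1], dist[dst]


# Hypothetical topology as seen by the controller (link costs in ms).
topology = {
    "user": {"s1": 1, "s2": 4},
    "s1": {"s2": 1, "cdn": 6},
    "s2": {"cdn": 2},
    "cdn": {},
}

path, cost = shortest_path(topology, "user", "cdn")
```

The controller would then translate the resulting path into forwarding rules on the switches along it.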
Furthermore, one of the functions improved by the SDN architecture is traffic classification. Traffic classification is the study of categorizing data flows, encrypted or not, into multiple categories based on packet byte information, for example to differentiate video surveillance traffic from e-health and email traffic. Numerous studies have been conducted on the traffic classification problem, with many of them employing machine learning techniques [182]. With the help of traffic classification technologies, SDN and its integration with MEC could handle traffic better [183]. When MEC servers benefit from their communication with the SDN controllers, they can store delay-tolerant traffic (for example, email traffic) and then redirect it after a reasonable delay. Alternatively, in the case of latency-critical tasks, the SDN controller can select reliable links that are less prone to failure and use them for critical traffic, such as in e-health forwarding schemes [184].
In the last few years, there has been an increased demand for SDN abstraction at the network's edges, which can be accomplished with the use of an SDN controller that tracks edge nodes in the data plane [185]. Traditionally, SDN controllers were deployed in the Cloud to provide a global view of the network. However, with the shift to eURLLC (enhanced ultra-reliable low-latency communication) in 5G, SDN controllers have been moved to the Edge (e.g., MEC servers, gateways, etc.) to facilitate efficient edge infrastructure provisioning. Additionally, ETSI sees MEC as a location for many SDN-based services [185], such as edge packet service management, data plane IP forwarding, adaptive routing for specific applications, etc. The Internet of Vehicles (IoV) is an excellent example of this expansion [186]; if the vehicles and roadside units (RSUs) are added to the data plane, the SDN controller can obtain timely information about vehicle movement changes, improving vehicle-to-X communication.

2) NFV
NFV (network function virtualization) is a network abstraction technique initiated by ETSI [187]. Traditionally, network companies used specific types of hardware for each component of the network (firewall, switch, load balancer). However, this approach was costly and inflexible. Now with NFV, network functions (NFs) can be deployed as software applications on top of a VM or a container hosted by a blade server, and many NFs can be hosted on one server (as illustrated in Fig. 13), making the deployment of NFs more flexible and scalable at lower cost. Similar to how virtualization enabled CC, NFV is the technology that enables Fog Computing. Within fog commodity servers, network functions are deployed along with edge applications in VMs or containers. NFV provides fog users with the ability to devise service placement strategies that meet the end user's requirements [188].
Moreover, the main challenge in NFV is the placement of network functions (NFs) on the most appropriate fog nodes. The work [189] modeled this as an optimization problem, where the deployment of an NF in a fog node is represented as a binary decision variable. The variables are chosen to reduce the time it takes to deploy, process, and communicate network functions. Additionally, as studied in [190], the NF placement challenge can be paired with the optimal physical placement of fog nodes to offer the best network/computing services.
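The binary placement formulation can be illustrated on a toy instance by enumerating every assignment of NFs to fog nodes. This exhaustive search is only a sketch for small inputs and is not the solution method of [189]. It assumes that each NF runs on exactly one node, that a single cost value stands in for deployment, processing, and communication time, and that CPU capacity is the only constraint. All names and numbers are hypothetical.

```python
from itertools import product


def place_functions(demand, capacity, cost):
    """Return the cheapest feasible NF-to-node assignment and its total cost.

    demand[f]   : CPU units required by network function f
    capacity[n] : CPU units available on fog node n
    cost[f][n]  : latency cost of running f on n
    """
    funcs, nodes = list(demand), list(capacity)
    best, best_cost = None, float("inf")
    # Each tuple fixes one node per NF, i.e. one setting of the binary variables.
    for choice in product(nodes, repeat=len(funcs)):
        load = {n: 0 for n in nodes}
        for f, n in zip(funcs, choice):
            load[n] += demand[f]
        if any(load[n] > capacity[n] for n in nodes):
            continue  # violates a capacity constraint
        total = sum(cost[f][n] for f, n in zip(funcs, choice))
        if total < best_cost:
            best, best_cost = dict(zip(funcs, choice)), total
    return best, best_cost


# Hypothetical instance with three NFs and two fog nodes.
demand = {"firewall": 2, "nat": 1, "lb": 2}
capacity = {"fog1": 3, "fog2": 4}
cost = {
    "firewall": {"fog1": 1, "fog2": 3},
    "nat": {"fog1": 2, "fog2": 1},
    "lb": {"fog1": 1, "fog2": 4},
}

assignment, total_cost = place_functions(demand, capacity, cost)
```

The search space grows exponentially with the number of NFs, which is why realistic instances call for integer programming solvers or heuristics.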

3) Radio access control
In 4G/5G networks, the radio access network comprises two main components: the remote radio unit (RRU), which performs radio frequency reception/transmission tasks, and the baseband unit (BBU), which performs signal processing functions. C-RAN (Cloudified/Centralized Radio Access Network) is a network architecture proposed by China Mobile in 2010. C-RAN groups BBUs from different antennas in a remote central office, known as a BBU hotel, away from their corresponding RRUs; the grouped BBUs are distinguished using internal routers inside the BBU hotel. The C-RAN architecture supports lower power consumption, efficient operation, and higher reliability. However, one of the disadvantages of C-RAN is the long distance between the RRUs and the BBU pool, which causes extra latency between the end users and the BBU pool. This issue gave birth to a new RAN architecture called F-RAN [191], in which each BBU hotel is constructed by grouping only a small number of RRUs, giving EC users more options for deciding on the optimal BBU hotel to run their services [191]. Likewise, the reference [192] conducted a comparison study between F-RAN and C-RAN, demonstrating that F-RAN provides a faster response but at a higher cost. Fig. 14 represents the architectural differences between F-RAN and C-RAN.

4) Radio resources allocation
Radio resource allocation aims to dynamically match edge users with their most adequate radio resources, allowing them to select the best transmission channel to maximize their throughput while improving their SINR (Signal to Interference plus Noise Ratio). To that end, multiple orthogonal multiple access (OMA) techniques are exploited, including OFDMA (Orthogonal Frequency Division Multiple Access), in which every Edge User (EU) uses an orthogonal range of frequencies, and TDMA (Time Division Multiple Access), in which EUs access a sub-channel periodically. Moreover, studying the radio resource allocation problem is crucial in EC, especially when multiple edge users request EC resources, because the wireless channel condition has a direct impact on the QoS requirements of EC applications [193].
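The link between SINR and throughput can be made concrete with the Shannon capacity formula, rate = B log2(1 + SINR). The sketch below picks the channel with the highest achievable rate for one edge user. It assumes linear (not dB) quantities, a single interference term per channel, and normalized, hypothetical values.

```python
import math


def sinr(tx_power, channel_gain, interference, noise):
    """Signal to Interference plus Noise Ratio, in linear scale."""
    return tx_power * channel_gain / (interference + noise)


def achievable_rate_bps(bandwidth_hz, sinr_linear):
    """Shannon capacity of a channel with the given bandwidth and SINR."""
    return bandwidth_hz * math.log2(1.0 + sinr_linear)


def best_channel(bandwidth_hz, tx_power, channels, noise):
    """channels maps a channel name to (gain, interference). Returns (name, rate)."""
    rates = {
        name: achievable_rate_bps(bandwidth_hz, sinr(tx_power, gain, interf, noise))
        for name, (gain, interf) in channels.items()
    }
    name = max(rates, key=rates.get)
    return name, rates[name]


# Hypothetical normalized values: unit transmit power and unit noise power.
channels = {"ch1": (6.0, 1.0), "ch2": (7.0, 0.0)}

name, rate = best_channel(1e6, tx_power=1.0, channels=channels, noise=1.0)
```

Here ch1 has an SINR of 3 (2 bit/s/Hz) and ch2 an SINR of 7 (3 bit/s/Hz), so the user is matched with ch2.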

E. SECURITY & PRIVACY
Since the network edge environment is very different from the Cloud, security is considered one of EC's biggest challenges. In CC, data is stored in multiple large data centers, which are very well protected physically, with many guards, fences, and security protocols. In addition to physical security, Cloud providers are investing heavily in cybersecurity. In contrast, physical edge devices are much more dispersed and heterogeneous, which makes them more vulnerable to physical attacks such as cooling system attacks [194], in which the attackers inject extra thermal load on the cooling systems. Additionally, the heavy data offloading and circulation at the network Edge make ESs (Edge servers) prone to cyber vulnerabilities [195]. However, since the data is kept close to the end users, EC provides better privacy protection than the Cloud. The following sections cover the progress made in detecting and defending against EC cyberattacks.

1) Attack detection and defense
Attack detection is the study of the adversarial attacks and vulnerabilities that target edge devices and servers, particularly embedded ones. Because some of those ESs are lightweight, they cannot run defensive tools such as antivirus software or firewalls; therefore, there is a need to develop software anomaly detectors and testing techniques adapted to those edge nodes [196].
One of the most well-known threats to edge servers is DDoS (Distributed Denial of Service). A DDoS attack aims to overwhelm the server's capacity by flooding it with requests from thousands of zombie machines. These attacks are significantly more effective at the Edge than in the Cloud. In CC, solutions like CDNs are utilized to spread content across numerous servers to relieve the load on a single server. Edge users, on the other hand, cannot use this approach since they must connect to the nearest edge server. Nonetheless, some defenses against DDoS attacks have been proposed, including the collaborative edge nodes solution in [197]; this approach allows the targeted server to redistribute incoming requests to its neighbors, reducing the pressure on it.
Moreover, in order to detect network attacks, an intrusion detection system (IDS) is required, which can detect anomalous packet transmissions by analyzing historical data from packet transfers, processors, and memory [198]. Many machine learning models are being investigated in the literature to discover those hidden intrusion patterns, among them [199] and [200]. Meanwhile, because most edge devices are fog/mist devices with minimal energy resources, some attacks take advantage of this weakness by targeting ES batteries and causing them to consume a significant amount of energy. The Stretch attack [201], for example, sends data packets with headers that contain extended routing and looping paths, forcing these edge devices to waste energy on unnecessary data transmission routes. Another example is the Droplet attack [202], in which the adversary sends an 802.15.4 data frame and then stops, putting the receiving edge server in continuous reception mode. Further, the clone node attack, mainly exploited in wireless sensor networks (WSNs) [203], is one in which the attackers create a clone of a sensor node using its ID. If the cloning attack is not detected, it can lead to a false data injection attack [204]. To recognize clone nodes, studying the Channel State Information (CSI) is commonly used as an effective defense [205].

Citation information: DOI 10.1109/ACCESS.2022.3183634. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

Furthermore, edge computing is tightly coupled with computation offloading since it is one of its main pillars. However, offloading makes edge devices vulnerable to attacks, for example, the Byzantine attack [206], in which a malicious receiver of the offloaded task can corrupt the operation or change values as it sees fit. One of the proposed solutions to detect these types of attacks is homomorphic hash functions [206].

2) Data integrity
Verifying the integrity and consistency of data that is distributed over a network is known as data integrity. Ensuring data integrity in EC is essential because many edge devices can be manipulated or corrupted intentionally by adversaries or accidentally due to sensor malfunctions or transmission errors [207]. For verifying data integrity, the work [208] put forward a protocol called EDI-V. Their approach is based on giving each data block a tag before storing it in an edge node and then having a robust and trusted third-party server audit the data changes by comparing the initial tags with the newest ones. However, adding a third party raises privacy concerns. To address this, the effort [209] proposed a distributed and lightweight auditing approach based on Merkle Hash Trees. Moreover, data integrity issues are well illustrated by data replication in CDNs, where consistency may be neglected in favor of performance. In this case, the authors of [210] proposed a technique for locating corruption incidents that relies on generating signatures at inspection time and comparing them to the signature of the original data. Another method for ensuring data integrity is to use Blockchain-based architectures, which are well known for their integrity and traceability protection [211].
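The principle behind Merkle Hash Tree auditing can be shown with a short root computation: any change to a stored block changes the root, so an auditor only needs to compare one hash. This is a generic sketch, not the protocol of [209]. The use of SHA-256 and the convention of duplicating the last hash on odd levels are assumptions made for illustration.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(blocks):
    """Compute the Merkle root of a non-empty list of byte blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    level = [_h(b) for b in blocks]  # leaf hashes
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last hash on odd levels
        # Hash each pair of children to obtain the parent level.
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Hypothetical data blocks cached on an edge node.
original = [b"block-0", b"block-1", b"block-2"]
tampered = [b"block-0", b"block-X", b"block-2"]

root = merkle_root(original)
is_intact = merkle_root(original) == root
is_tampered_detected = merkle_root(tampered) != root
```

A full auditing scheme would also exchange sibling hashes so that a single block can be verified without reading the others.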

3) Access control
Access control is a process that allows only authorized users to access a given edge server. Since authentication time adds to the total latency, the challenge in EC lies in making access control as lightweight as possible [212]. In addition, user mobility and the wide distribution of edge servers make the authentication process more challenging in EC. One of the first user authentication works in fog/edge computing is [213], in which edge users receive a master key that, together with the public key of the targeted edge server, allows them to access any edge server; the edge server then decrypts the request to check the user's authorization status. Another approach to controlling access is the use of user tokens [214]: the edge server receives the tokens, analyzes their cryptographic representation, and compares them to those in its database. Meanwhile, effective authentication is known to resist and counter many attacks, including privileged-insider and man-in-the-middle attacks [215].

4) Privacy
Today, user privacy is a controversial topic, especially with the expansion of camera surveillance systems and the exploitation of confidential user data by social media platforms. Connected edge devices collect information about people in order to provide services, which may jeopardize their privacy. In CC and EC, data processing passes through three main stages: data cleaning, data aggregation, and data analysis. The cleaned data is often less representative, with fewer attributes than the raw data. For privacy reasons, [216] put forward a distributed data cleaning algorithm that only asks users for a data representation without transferring the actual data to the Cloud. After cleaning, data analysis requires transferring the data to one or more centralized servers to run analytic models. Many approaches have been proposed in the literature to prevent this process from violating privacy rules, such as lightweight encryption of the data before transfer [217] or query encryption to protect the content of the data [218]. Additionally, [219] suggests adding noise to the data before sending it and training ML models on the noisy data. Similarly, [220] proposes multiplying the data by projection matrices before sending it to the Edge/Cloud for training. Besides methods that alter or de-identify data before analysis, differential privacy has emerged as a powerful technique that helps defend against re-identification attacks [221]. For data analysis, Federated Learning (FL) is one of the leading solutions for extracting information from data without centralizing it [222]. FL allows end-users to share model parameters without sharing their private data; each user trains the ML model locally and then sends the parameter updates to an external aggregator. A good example of FL is vehicular edge computing, in which car owners refuse to share their data with others [223].
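As an illustration of how noise can protect individual records, the sketch below applies the Laplace mechanism, a standard building block of differential privacy, to the mean of bounded sensor readings. It is not taken from [219] or [221]; the function names, the clipping bounds, and the privacy budget `epsilon` are assumptions chosen for the example.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential variables."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, rng):
    """Release the mean of `values` with epsilon-differential privacy."""
    # Clip each record so that one user can only move the mean by a bounded amount.
    clipped = [min(max(v, lower), upper) for v in values]
    # Sensitivity of the mean when a single record is replaced.
    sensitivity = (upper - lower) / len(clipped)
    return sum(clipped) / len(clipped) + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical readings collected by an edge node.
values = [30.0, 42.0, 55.0, 61.0, 47.0]
noisy = private_mean(values, lower=0.0, upper=100.0, epsilon=0.5, rng=random.Random(42))
```

A smaller `epsilon` gives stronger privacy but a noisier answer, which mirrors the accuracy/privacy trade-off faced when training on noised data.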

5) Edge application Safety
Edge computing enables a wide range of applications, each vulnerable to different types of attacks depending on its nature. In EC content caching, for example, some attackers regularly request unpopular content in order to exhaust the storage of caching servers and force normal users to request their files from the cloud [224]. Meanwhile, many edge applications are AI-based and are therefore subject to high-level attacks that aim to fool them without any virus injection or network intrusion. Today, AI-based applications face a variety of adversarial attacks. These attacks are divided into two types: white-box attacks [225], in which the attacker has access to the model parameters, and black-box attacks [226], in which the attacker does not know the model parameters but generates adversarial inputs from similar models. These attacks are not limited to AI-based edge applications; they can also target AI models used to support the primary functions of EC, such as AI-based offloading mechanisms [227] and network intrusion detection models [228]. Proposed approaches for preventing adversarial attacks include adversarial learning [229], in which the NN is trained on the discovered adversarial examples. Another technique for increasing NN robustness is to train neural networks on data perturbed by a small amount of noise, with the same labels as the corresponding clean data [230].
Furthermore, a positive indication of the safety and robustness of AI at the Edge is reported in [231], [232], in which neural networks adapted to the Edge using compression techniques (see section V.1) such as quantization or distillation are found to be more robust than their non-compressed counterparts.
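To illustrate what a white-box attack can look like, the following sketch perturbs an input of a tiny logistic-regression classifier with the fast gradient sign method (FGSM). FGSM is used here only as a well-known example of a white-box attack and is not necessarily the method of [225]; the model weights, the input, and the step size `eps` are hypothetical.

```python
import math


def predict(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def loss(w, b, x, y):
    """Binary cross-entropy loss for one example."""
    p = predict(w, b, x)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def fgsm(w, b, x, y, eps):
    """Move each input feature by eps in the direction that increases the loss."""
    p = predict(w, b, x)
    grad_x = [(p - y) * wi for wi in w]  # gradient of the loss w.r.t. the input
    return [xi + eps * ((g > 0) - (g < 0)) for xi, g in zip(x, grad_x)]


# The attacker knows w and b (white-box assumption).
w, b = [2.0, -1.0], 0.0
x, y = [1.0, 0.5], 1
x_adv = fgsm(w, b, x, y, eps=0.5)
clean_loss = loss(w, b, x, y)
adv_loss = loss(w, b, x_adv, y)
```

Adversarial training in the sense described above would add pairs such as `(x_adv, y)` back into the training set.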

F. BILLING AND PRICING
Pricing or billing of edge computing services is an important concern for any ESP (Edge Service Provider). In the EC market, there are four main players: clients (individuals or businesses), ISPs (internet service providers), cloud providers, and ESPs. These players are interconnected and can affect one another directly or indirectly [233]. The following section discusses the various mechanisms for pricing EC services.

1) Edge service pricing
The ultimate goal of EC pricing is to find a strategy that maximizes edge service prices while taking the competition and clients' willingness to pay into account [234]. EC prices are regulated dynamically based on supply and demand, with the supply usually known but the demand having to be estimated [235]. The work in [236] suggests that edge resources should be priced based on allocated rather than used resources in order to maximize profits. Overall, pricing helps mitigate the abuse of edge resources, and it is an essential factor when selecting the best hosting ES [237]. Moreover, achieving full EC service coverage by a single entity is extremely difficult. As a result, collaboration among EIP providers is the norm for serving multiple users across a wide range of areas, and pricing policies that regulate this cooperation are required. To that end, the authors of [238] propose a peer-to-peer payment system for fog computing based on virtual coins, in which each fog node owner has a budget of coins and pays with them whenever they request resources from a neighboring fog node; those coins can then be converted into real money. In a similar manner, a Blockchain-based exchange of credit values for resources was adopted in [239] to regulate the cooperation of multiple edge nodes.
In contrast to users paying ESPs, in VEC, vehicle owners are compensated for providing their vehicle resources. Accordingly, offloading decisions from users to vehicles should be based on jointly minimizing the cost of running servers locally and the cost of running them on a collaborating vehicle [240]. The amount vehicle owners are paid during this process is determined by the demand for MEC services [241]; if demand increases, MEC providers should raise the remuneration to attract more vehicle owners.

2) Auction based pricing
In an auction pricing system, each user or agency submits a bid or a request for a given quantity and quality of edge resources. The objective of the auction is to design a payment system for all truthful users that reduces the difference between their valuations of the allocated resources and the proposed prices. One of the best-known auction methods is Vickrey-Clarke-Groves (VCG) [242], in which competitors submit valuations of an item or a service without knowing each other's bids. Many VCG-based auction pricing methods have been proposed for edge computing, including [243], where the aim was to maximize users' rational valuation without considering any envy intention. Another auction method is the McAfee auction [244], used in environments with multiple ES (Edge Services) sellers and ES buyers. Additionally, as demonstrated in [245], an auction matching system can be used in computation offloading, where ENs accept or deny offloaded tasks based not only on their computation and communication resources but also on the bid proposed by the users; if no edge node accepts the task, the user must raise their bid to be competitive.
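For a single indivisible resource, VCG reduces to the sealed-bid second-price (Vickrey) auction: the highest bidder wins but pays the second-highest bid, which makes truthful bidding a dominant strategy. The sketch below shows this special case only; it is not the mechanism of [243], and the bidder names and amounts are hypothetical.

```python
def second_price_auction(bids: dict[str, float]) -> tuple[str, float]:
    """Return (winner, price) for a sealed-bid second-price auction."""
    if len(bids) < 2:
        raise ValueError("at least two bidders are required")
    # Rank bidders from highest to lowest bid.
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]  # the winner pays the second-highest bid
    return winner, price


# Hypothetical sealed bids for one edge VM slot.
bids = {"user_a": 8.0, "user_b": 5.0, "user_c": 6.5}
winner, price = second_price_auction(bids)
```

Here `user_a` wins and pays 6.5, so overstating or understating its true valuation of 8.0 could not have lowered its payment without risking the loss of the slot.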

3) Market analysis
The pricing of edge services is determined not only by the number of users in a market but also by the competition among multiple sellers, which is why the pricing problem is usually modeled as a game, typically a Stackelberg game [246]. Because there are three main players (ISP, ESP, EU) in the EC pricing game, [247] designs two nested Stackelberg games, one between the ISP and the EU and another between the ISP and the ESP. The EUs (Edge Users) subscribe to an ISP for radio resources allocated by access points, plus edge services, while the ISP pays the ESP for leasing its MEC resources.
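The leader-follower logic of a Stackelberg game can be sketched with a single provider and a single user. The utility model below is a toy assumption and does not come from [246] or [247]: the user buys an amount x of resources to maximize a*log(1 + x) - p*x, and the provider, anticipating that response, picks the unit price p that maximizes its profit (p - c)*x. Under this assumption the optimal price has the closed form sqrt(a*c), which the grid search reproduces.

```python
import math


def follower_demand(price: float, a: float) -> float:
    """User's best response: maximize a*log(1 + x) - price*x over x >= 0."""
    return max(a / price - 1.0, 0.0)


def leader_profit(price: float, a: float, cost: float) -> float:
    """Provider's profit when it anticipates the user's best response."""
    return (price - cost) * follower_demand(price, a)


# Hypothetical parameters: user benefit factor a and provider unit cost c.
a, cost = 9.0, 1.0
prices = [cost + 0.01 * i for i in range(1, 801)]  # candidate prices in (c, a]
best_price = max(prices, key=lambda p: leader_profit(p, a, cost))
closed_form = math.sqrt(a * cost)  # analytical optimum for this toy model
```

The nested games described above chain two such leader-follower problems, with the best response of one level plugged into the objective of the level above.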
In 5G, many private operators plan to enter the market to meet the demand for MEC services in a region. To do so, they will need to purchase and operate MEC servers and rent backhaul connections from legacy telecom providers. To obtain a good return on investment, private operators must have a reasonable pricing strategy adjusted to the QoS offered to their clients [248]. Besides, to energize the EC market, intermediate platforms are being widely launched, such as the one suggested in [249], on which edge service owners can rent out the part of their edge resources that they do not need to users who request them; in exchange, the platform receives a commission fee on each transaction.

4) Pricing parameters:
Many parameters influence edge computing service pricing. Those parameters are presented in Table V.

| Pricing parameter | Definition | Ref |
|---|---|---|
| Offloading ratio | The Edge/Cloud pricing determines which task portions should be offloaded to the Edge and which should be completed in the Cloud. | [250] |
| Users' reputation | Discount on edge services, based on users' reputation. | [251] |
| Stability | The pricing strategy must remain stable over the long run, avoiding selfishness of players. | [252] |
| Time delay | When a task is delayed, the users get a discount. | [253] |

A. EDGE INTELLIGENCE
Edge Intelligence (EI) is the study of the co-convergence of AI and EC. EI field highlights and

NNs are made up of many neurons with weighted connections between them (see Fig. 15). During the learning process, the network attempts to modify the weights using an optimizer such as gradient descent [255] to improve model performance. There are several well-known neural network architectures, including CNNs (Convolutional Neural Networks), which are used primarily in image processing and object recognition, and RNNs (Recurrent Neural Networks), which are used to analyze time series.

2) Pruning
Pruning in EI refers to removing elements from an AI model, generally in the context of deep learning: neurons, weights, filters, and even entire layers of an NN are removed because they have a negligible impact on the neural network's construction or performance. Fig. 15 shows a case of pruning neurons and weights in an FFNN, as well as filters in a CNN. As a result of pruning, the computation and memory footprint of neural networks is reduced, making them better suited to edge nodes. What should be pruned? Many studies discard weights or neurons based on their magnitude or their influence on the loss function; typically, this influence is calculated using the derivative of the loss function with respect to a given weight or neuron. Then, various norms, such as the standard L1 norm [256] or the nuclear norm, which yields more sparsity [257], can be used to measure and compare those values; after comparing the normalized magnitudes, the goal is to discard a given number of the least important elements.
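The simplest instance of this idea is magnitude-based weight pruning, sketched below for one weight matrix. The absolute value of each weight is used as its importance score, which is an assumption made for brevity; gradient-based scores or other norms would slot into the same structure. The function name and the example weights are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest absolute value."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    magnitudes = sorted(abs(w) for row in weights for w in row)
    k = int(len(magnitudes) * sparsity)  # number of weights to discard
    if k == 0:
        return [row[:] for row in weights]
    threshold = magnitudes[k - 1]
    # Weights tied with the threshold are also removed, so slightly more
    # than k weights can be pruned when magnitudes repeat.
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]


# Hypothetical 2x3 layer; prune half of its weights.
layer = [[0.8, -0.05, 0.3], [-0.01, 0.6, -0.2]]
pruned = magnitude_prune(layer, sparsity=0.5)
```

The zeroed entries can then be skipped or stored in a sparse format, which is where the memory and computation savings on edge nodes come from.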
Meanwhile, due to the changes brought by pruning, it is necessary to recalibrate the portion of the NN connected to the pruned elements. In the case of a pruned neuron, the correction consists of deleting the weights connected to that neuron. Unfortunately, pruning can also cause a sharp degradation of accuracy. To preserve the NN's previous knowledge in this case, [258] proposed redistributing the pruned part of the NN parameters (weights, biases) using linear regression, so that the sum of the outputs of the pruned layer approximates the total outputs of the original layer (before pruning a neuron). Alternatively, to avoid retraining the model after pruning, it is more effective to prune the NN during the training phase [259].
In CNNs, convolutional (filtering) layers perform most of the computation (more than 85%). As a result, they are the prime targets of the pruning process. Along these lines, [260] modeled filter pruning as an optimization problem that minimizes the cross-entropy loss function, with the binary decision vector of whether to keep each filter as the variable. An improvement in [261] treats the decision vectors as Bernoulli probabilities to make the pruning optimization problem easier to solve. Additionally, [262] uses an advanced pruning method to select the pruning strategy that achieves the best accuracy/speed trade-off.

weight matrix factorization, as demonstrated in [265]. By decomposing the weight matrix with SVD (singular value decomposition), W = USV^T, and employing a sparsification technique that drives many singular values to 0 because they are unimportant compared to the others, this approach maximizes the number of multiplications by 0 and thus reduces the computation time.
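The effect of discarding small singular values can be sketched with NumPy. The example below simply truncates the SVD to a fixed rank, which is a simpler stand-in for the sparsification technique of [265]; the matrix sizes, the rank, and the function name are assumptions for illustration.

```python
import numpy as np


def low_rank_factors(W: np.ndarray, rank: int):
    """Factor W (m x n) into A (m x rank) and B (rank x n) via truncated SVD."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]  # scale the kept left singular vectors
    B = Vt[:rank, :]
    return A, B


rng = np.random.default_rng(0)
# Hypothetical 64x32 weight matrix that has rank 8 by construction.
W = rng.standard_normal((64, 8)) @ rng.standard_normal((8, 32))
x = rng.standard_normal(32)

A, B = low_rank_factors(W, rank=8)
y_full = W @ x       # 64 * 32 = 2048 multiplications
y_low = A @ (B @ x)  # 8 * 32 + 64 * 8 = 768 multiplications
```

When the discarded singular values are exactly zero, as in this constructed example, the two outputs match; with real weights the truncation introduces an approximation error that has to be traded against the saved multiplications.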

4) Quantization
Quantization is the process of reducing the bit representation of a neural network's elements (weights, neurons, and activations) from a high-precision representation (32 bits) to a lower-precision one (8 bits). Its goal is to reduce the hardware execution time of the NN training/inference phase. In quantization, real variables (with 32 bits) are typically rounded to the nearest number in the new low-precision representation; however, rounding to the nearest is not always optimal, particularly during the training phase, because it can disrupt the NN learning process, as demonstrated in [266]. Quantization can be classified into three types. The first is fixed-bit quantization [267], in which the number of fixed bits required to represent all numbers is calculated from the highest and lowest float values that may exist in the NN. The second is uniform quantization [268], in which each variable is represented in the space R based on its size using a uniform distribution. The third is non-uniform quantization, which is similar to uniform quantization but uses a different distribution, typically a Gaussian one [269].
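A minimal sketch of uniform quantization with round-to-nearest is given below. The affine mapping from the observed value range to 8-bit integer codes is an assumption chosen for illustration and is not claimed to be the scheme of [267] or [268]; the function names and example weights are hypothetical.

```python
def quantize(values, num_bits=8):
    """Map floats to integer codes in [0, 2**num_bits - 1] on a uniform grid."""
    lo, hi = min(values), max(values)
    levels = 2 ** num_bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0  # width of one quantization step
    codes = [round((v - lo) / scale) for v in values]  # round to the nearest level
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate float values from the integer codes."""
    return [lo + c * scale for c in codes]


# Hypothetical 32-bit weights stored as 8-bit codes.
weights = [-1.0, -0.25, 0.0, 0.4, 1.0]
codes, scale, lo = quantize(weights)
restored = dequantize(codes, scale, lo)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

With round-to-nearest, the reconstruction error of each value is at most half a quantization step, which is the accuracy loss that the training-time techniques discussed next try to control.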
Meanwhile, one disadvantage of quantization is that it reduces neural network accuracy and makes training difficult due to non-differentiability. In [270], the authors proposed a progressive quantization approach that starts NN training with low-precision quantization and gradually increases the precision, allowing control over the trade-off between resource consumption and performance. Similarly, an adaptive learning method was introduced in [271], which enables the number of representation bits to be modeled as a learnable hyperparameter.

5) Knowledge distillation
Knowledge distillation is a transfer learning technique that transfers knowledge from a complex model (teacher) to a simpler model (student) with fewer parameters. It is an efficient solution for bringing high-performance neural networks to the Edge. Examples of knowledge distillation applications include increasing image resolution [272], fast person re-identification in a camera surveillance environment [273], and visual dialog comprehension [274].
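One common way to transfer knowledge is to train the student to match the teacher's softened output distribution. The sketch below computes such a soft-target loss; this formulation with a softmax temperature is a widely used variant assumed here for illustration and is not claimed to be the method of [272]-[274]. The logits are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax with temperature; higher temperatures give softer distributions."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the softened teacher and student distributions."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))


# Hypothetical logits for a 3-class problem.
teacher = [4.0, 1.0, 0.2]
aligned_student = [3.5, 1.2, 0.1]
misaligned_student = [0.2, 1.0, 4.0]
loss_aligned = distillation_loss(aligned_student, teacher)
loss_misaligned = distillation_loss(misaligned_student, teacher)
```

In practice this term is usually combined with the ordinary loss on the ground-truth labels, and the student is updated by gradient descent on the sum.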

6) Tiny Models construction
Besides compression or reduction techniques, other approaches suggest building a small neural network from the start, since not many intelligent tasks require a large NN. Many efforts in the literature have followed this direction, including [275], where the authors focused on reducing the dimensions of filters in a way that does not affect the accuracy of the CNN model. Similarly, the tiny TRU-Net in [276] was built using a single GRU (Gated Recurrent Unit) cell. Aside from reducing NN size, changing the optimizer in the learning process can save considerable computation time, as in [277], in which back-propagation is avoided in favor of a computation-friendly automatic adjustment of the NN weights based on the loss function. Likewise, some works such as [278] even propose bringing back multi-layer perceptrons, which require far less computation than the newest NNs. Meanwhile, another way of reducing AI inference time is to combine two or more AI models into one, as was done in SSD [279], where the model performs object boxing and recognition at the same time, saving considerable time compared with using two separate NNs. Along with reducing model parameters, input compression is considered a valuable means of reducing NN training time, as seen in [280].

7) Distributed edge computing
Aside from the compression and reduction techniques discussed earlier, the edge environment is well known for its large number of widely distributed edge nodes that perform computing tasks. As a result, distributed computing is an excellent solution for enabling EC to perform AI tasks. In this section, we cover both distributed inference and distributed training. There are two ways of distributing deep NN inference: vertically or horizontally. In vertical distribution, also known as IoT-edge-cloud collaboration, the IoT device usually computes the first small part of the neural network locally for privacy reasons, then sends a small part of the remainder to the Edge and the large remaining part to the Cloud [281]. For horizontal distribution, one of the first proposed frameworks is MoDNN [282], a map-reduce mechanism that partitions the input features according to the number of ENs; each node performs part of the inference calculation, and an aggregation layer then reduces the results. In addition, for CNNs, Fused Tile Partitioning (FTP) was proposed in [283], inspired by the observation that the dot product of the input data with each filter depends only on a specific region of the input data. Thus, good parallelism can be achieved if each edge device takes a portion of the input features. Moreover, in an FFNN (Feed Forward Neural Network), the heavy computation lies in multiplying weight matrices by activation layers. Therefore, the work in [284] suggests dividing the weight matrix into multiple sub-matrices so that each edge node is responsible for multiplying a small sub-matrix by the corresponding part of the activation layer.
Meanwhile, according to [285], when developing a distributed inference process, it is necessary to consider not only reducing the overall latency but also memory limits and the communication costs of parallel computing.
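The sub-matrix idea for FFNN layers can be sketched as follows: the columns of the weight matrix and the matching entries of the activation vector are split among the edge nodes, each node computes a partial product, and the partial results are summed. The round-robin column assignment and the function names are assumptions for illustration and are not taken from [284]; in a real deployment each call to `partial_product` would run on a different node.

```python
def partial_product(W, x, cols):
    """Work of one edge node: its column block of W times its slice of x."""
    return [sum(row[j] * x[j] for j in cols) for row in W]


def distributed_matvec(W, x, num_nodes):
    """Compute W @ x by summing the partial products of num_nodes nodes."""
    n = len(x)
    # Assumed partitioning: node i handles columns i, i + num_nodes, ...
    col_blocks = [range(i, n, num_nodes) for i in range(num_nodes)]
    partials = [partial_product(W, x, cols) for cols in col_blocks]
    # Aggregation step: element-wise sum of the partial output vectors.
    return [sum(vals) for vals in zip(*partials)]


# Hypothetical 2x4 layer and activation vector, split across two nodes.
W = [[1, 2, 3, 4], [5, 6, 7, 8]]
x = [1, 0, -1, 2]
y = distributed_matvec(W, x, num_nodes=2)
```

Each node only needs its own block of weights in memory, but the aggregation step adds one round of communication, which is the kind of trade-off highlighted in [285].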
Unlike distributed inference, distributed training is not a time-critical task for ML applications because the training phase is generally completed offline. Although most deep neural networks are trained in the Cloud (public or private), Edge Intelligence brings the new perspective of training ML models at the Edge, following the fundamental goal of keeping data at the Edge for better privacy and less network congestion. The standard method for training an ML model is to send all the input data to a centralized server and perform the learning process there. However, in EC, this approach is impractical since a single ES does not have enough memory and computing power to handle the training of an entire NN. Thus, a new ML training paradigm called Federated Learning (FL) has emerged [286]. FL distributes an NN's input data across multiple nodes, where each node trains on its local data and transfers the resulting parameters to an aggregator. Next, the aggregator broadcasts the parameter changes to all collaborating nodes.
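The train-locally, aggregate, and broadcast loop can be sketched with a one-parameter model y = w * x. Averaging the local parameters weighted by local dataset size is the common federated averaging rule and is assumed here; the datasets, learning rate, and number of rounds are hypothetical.

```python
def local_update(w, data, lr=0.1, epochs=5):
    """Gradient descent on one node for y = w * x with squared loss."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w_global, node_datasets):
    """One round: local training on every node, then weighted averaging."""
    local_weights = [local_update(w_global, data) for data in node_datasets]
    total = sum(len(data) for data in node_datasets)
    # The aggregator weights each node by the size of its local dataset.
    return sum(w * len(d) for w, d in zip(local_weights, node_datasets)) / total


# Hypothetical private datasets on three edge nodes, all following y = 2x.
nodes = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(0.5, 1.0), (1.5, 3.0), (2.5, 5.0)],
]
w = 0.0
for _ in range(20):
    w = federated_round(w, nodes)  # the new value is broadcast to all nodes
```

Only the scalar `w` leaves each node in every round; the `(x, y)` pairs never do, which is the privacy argument for FL.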
However, federated learning is plagued by high communication costs, and there is an urgent need to reduce the data transfer overhead between collaborating training devices. To that end, [287] recommends sharing only the essential values of the gradient matrix, since sharing the entire matrix is a communication-intensive task. Alternatively, the aggregator can receive updates from only a small subset of ENs in order to ease its throughput limitations. The work in [288] selects those collaborating nodes so that the staleness of the non-selected edge nodes at each timestamp is reduced.
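One simple way to share only part of a gradient is top-k sparsification, sketched below: a node transmits just the k entries with the largest magnitude as index/value pairs, and the aggregator treats the rest as zero. Interpreting the "essential values" as the largest-magnitude entries is an assumption and may differ from the selection rule of [287]; the gradient values are hypothetical.

```python
def top_k_sparsify(gradient, k):
    """Keep the k largest-magnitude entries as an {index: value} mapping."""
    ranked = sorted(range(len(gradient)), key=lambda i: abs(gradient[i]), reverse=True)
    return {i: gradient[i] for i in ranked[:k]}


def densify(sparse, length):
    """Rebuild a dense gradient on the aggregator; missing entries are zero."""
    dense = [0.0] * length
    for index, value in sparse.items():
        dense[index] = value
    return dense


# Hypothetical flattened gradient computed on one edge node.
gradient = [0.01, -0.9, 0.05, 0.4, -0.02, 0.0]
message = top_k_sparsify(gradient, k=2)  # only 2 of 6 values are transmitted
received = densify(message, len(gradient))
```

Practical schemes often accumulate the dropped entries locally and add them to the next round's gradient so that small updates are delayed rather than lost.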

8) Adapting the Edge to AI
The second part of edge intelligence concerns adapting the Edge to AI. In this section, we highlight various technologies that allow AI algorithms to run well at the Edge; we mainly focus on adapting edge hardware to AI models, since the software part is covered in the containerization section IV.3. According to the recent survey [289], the most frequently asked question in AI hardware adaptation is which type of edge hardware is optimal for hosting AI model inference and training: CPU, GPU, FPGA, or ASIC (application-specific integrated circuit)?
Starting with the CPU (Central Processing Unit), it is the computer's brain; a CPU is known to be more flexible since it can do a variety of jobs without favoring one over another. GPUs (Graphics Processing Units), in contrast, differ from CPUs in that they devote more transistors to their arithmetic logic units, making them more powerful than CPUs at mathematical calculations. Further, unlike a CPU, a GPU can have thousands of cores, allowing it to perform well on parallel tasks such as training a neural network.
Aside from GPUs and CPUs, the FPGA (Field Programmable Gate Array) is a candidate integrated circuit for hosting AI inference. By structuring, configuring, and interconnecting a group of logic blocks, an FPGA can perform well on targeted logical functions. An FPGA can accelerate AI inference by creating registers for NN input and weight values and arranging multiplication and addition blocks in a way that couples memory and computation and reduces data flow latency inside the circuit, thus speeding up inference [290]. Because FPGAs are known for their high customization and future-proofing, they are regarded, if designed well, as one of the best hardware options for hosting efficient model inference. However, FPGAs suffer when it comes to training AI models [291], especially in comparison to GPUs, due to their limited memory space. Nevertheless, as [292] pointed out, training on an FPGA is possible with the help of low-precision quantization and compression methods.
Further, an ASIC (application-specific integrated circuit) is a non-programmable hardware architecture used for specific applications. ASIC chips are among the best hardware for running AI inference at maximum efficiency due to their high design customization. The disadvantage of ASICs is that they are hard to design, especially when it comes to integrating and supporting multiple DNNs (deep neural networks) [293]. Good examples of ASIC chips are the Tesla D1 Dojo chips [294] and the Google TPU (Tensor Processing Unit) [295], which is specifically designed for models of the well-known TensorFlow framework. In summary, many types of edge hardware can be used to accelerate AI training or inference fully or partially, ranging from high flexibility to precise adaptation. Based on the benchmarking done in [296], ASICs (TPU) and GPUs outperform other types of hardware for training, while FPGAs and ASICs prove to be the best for neural network inference. Besides electronic hardware, photonic AI accelerators are now appearing as candidate hardware for hosting AI models [297].
In addition to digital hardware, analog circuits, including neuromorphic ones, are emerging as a powerful alternative. Neuromorphic hardware is a set of electrical circuits that mimic the human biological brain [298]. They are built with silicon-based artificial physical neurons. The primary electrical component in neuromorphic hardware is the memristor [298]. A memristor crossbar directly performs the multiplication of two matrices as an analog operation [299]. If neuromorphic hardware overcomes all of its current challenges, it will be the most advanced and suitable edge hardware for hosting AI models, since neuromorphic computing is all about computing in memory [300]. Fig. 16 shows the evolution of AI hardware accelerators.

B. 5G AND ITS EMPOWERING TECHNOLOGIES
5G refers to the fifth generation of telecommunications networks, the latest set of advancements over the previous 4G LTE (Long-Term Evolution) networks. 5G promises three primary services: eMBB (enhanced Mobile Broadband), mMTC (massive Machine-Type Communication), and URLLC (Ultra-Reliable Low-Latency Communication). EC is one of the most important technologies enabling 5G, and it is recognized by ETSI as an essential component of 5G [24]. With the help of MEC services, 5G networks can provide URLLC services to clients by hosting their services at the edge/access level of the network. In turn, MEC resources are used to host many 5G functions, for instance, cellular traffic prediction [301].

1) MIMO
5G is empowered by multiple technologies, including massive MIMO (multiple-input, multiple-output). MIMO is a radio technology that uses multiple transmitter and receiver antennas. MIMO employs spatial diversity to reconstruct signals received by multiple antennas or to transmit a signal using multiple antennas. Integrating MIMO technology with edge computing allows multiple edge users to send computation requests simultaneously and with high efficiency [302].

2) IRS (Intelligent reflecting surfaces)
IRS (Intelligent reflecting surfaces) are signal-reflecting surfaces that control the signal's transfer angle. An IRS focuses energy by creating a beam directed toward a receiver. An IRS can be deployed at various locations between the transmitter (antennas) and the receiver (mobile device); when it is deployed at the signal source in base stations, it creates what is known as holographic beamforming. By reconfiguring the wireless propagation environment, MIMO technologies improve the offloading links (throughput, data rate) between edge devices and MEC servers by intelligently adjusting the phase-shift parameters of the IRS [303].

3) 5G optimization of communication resources
5G networks are well known for supporting massive IoT communication via cellular networks; however, this can only be achieved by optimizing machine-to-machine and machine-to-x communications, which are regarded as energy-draining communication types. Consequently, multiple efforts in 5G have focused on developing DRX (Discontinuous Reception) techniques for MTC (machine-type communication) [304]; these methods allow end devices to save power by sleeping whenever there is no packet to receive. Some of this work is based on modeling packet arrival times using statistical distributions. Alternatively, as illustrated in [304], machine learning models can be used to predict future arrivals, allowing machine-type end devices to schedule ideal sleeping times. Additionally, the machine learning models for DRX could be hosted on MEC servers. Moreover, [305] introduced Discontinuous Mobile Edge Computing (D-MEC), a DRX technique adapted for MEC servers. Furthermore, avoiding the execution of redundant or identical tasks is another technique for optimizing ED resources. This is accomplished by fully or partially reusing previously completed compute tasks, exploiting what is known as the Computer Reuse Architecture [306].
Besides, end devices at the mist level are energy-hungry and battery-limited. The groundbreaking WPT (Wireless Power Transfer) systems have emerged as one of the solutions to this energy problem. Some efforts on integrating WPT and MEC are documented in [307]. The energy harvested from surrounding antennas within range of the IoT/mobile devices can be used to offload tasks to the MEC servers associated with those antennas.

4) Network Slicing
4G LTE networks are known for the concept of "one network fits all": all users share a single network with the same performance. However, this architecture was not ideal for many applications. Thus, 5G networks set out to improve on 4G by introducing a new concept called Network Slicing. Network slicing aims to divide a network into many slices, each tailored to serve a specific group of applications with specified requirements (latency, peak data rate, cell throughput, etc.). The goal is to provide adequate resource sharing in 5G networks [308]. A network slice is an end-to-end connection that contains elements from different parts of the network (RAN, core network, transport, and so on). Fig. 17 demonstrates how a network can be partitioned into several slices, each with its own set of customizations. One of the most challenging aspects of network slicing is automating network slicing based on requests [62]. Further, to combine edge computing with network slicing, physical MEC resources are typically divided into multiple VMs or containers, each with its own set of capabilities designed for a specific slice. However, hosting multiple VMs on one physical machine reduces throughput and therefore increases the latency of the applications hosted in those VMs [309]. Also, network slicing adds to the difficulty of scheduling MEC resources, since in 5G networks provisioning and resource selection should be combined by picking the most appropriate network slice [310].

of telecommunication networks (5G and 6G) is moving toward a non-orthogonal multiple access (NOMA) strategy.
NOMA allows numerous devices to send data on the same band of resources (frequency time), maximizing spectrum efficiency. NOMA is based on the superposition of numerous signals at the transmitter side, followed by interference cancellation via successive interference cancellation (SIC) at the receiver side. NOMA is considered the leading technology that helps optimize the workload offloading process to MEC servers, as it enables multiple machinetype end-users to access 5g-MEC resources simultaneously. However, the main challenge in NOMA and EC is the establishment of the right radio/MEC resource assigning strategy that respects task delays while lowering total offloading energy [311].
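To make the SIC principle more concrete, the following minimal sketch computes the per-user spectral efficiency of an uplink NOMA cluster in which the receiver decodes users in descending order of received power and subtracts each decoded signal before decoding the next one. The function name, the ideal (error-free) cancellation, and the power, gain, and noise values are assumptions chosen for illustration; they are not taken from [311].

```python
import math

def uplink_noma_rates(powers_w, gains, noise_w):
    """Per-user spectral efficiency (bit/s/Hz) for uplink NOMA with ideal SIC.

    Users are decoded in descending order of received power. Signals that
    have not been decoded yet act as interference; decoded ones are removed.
    """
    received = [p * g for p, g in zip(powers_w, gains)]
    order = sorted(range(len(received)), key=lambda i: received[i], reverse=True)
    rates = [0.0] * len(received)
    remaining = sum(received)
    for i in order:
        remaining -= received[i]  # what is left interferes with user i
        rates[i] = math.log2(1 + received[i] / (remaining + noise_w))
    return rates

# Hypothetical two-device cluster sharing one frequency/time resource block.
rates = uplink_noma_rates(powers_w=[0.2, 0.1], gains=[1e-6, 4e-7], noise_w=1e-8)
print(rates, sum(rates))
```

Under ideal SIC, the sum of the per-user rates equals the capacity of the shared channel, which is why the decoding order affects how the rate is split between devices but not the total.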

6) 6G
The sixth network generation moves toward spreading the use of intelligence across the network, from the core to the Edge. 6G networks promise ultra-smart and robust network functions; however, those functions will come with high computing and memory consumption. Along with Fog and Cloud computing, Edge computing will play a critical role in hosting 6G functionalities while providing them with ultra-low latency. The 6G evolution is characterized by the 'softwarization' of many parts of the network, which leads to what is known as the convergence of computing and networking [312].

C. VIRTUALIZATION AND CONTAINERIZATION
Virtualization is a powerful technology that enables cloud computing. Today, every data center uses virtualization to create large pools of resources (CPUs, memory, disks, network) and offer them to customers as scalable, consolidated VMs. Virtualization has changed the way people think about computing and communication resources. In cloud computing with VMs, computation has evolved into a service rather than a product. Similarly, in 5G, with NFV and SDN, network components are becoming network services. This section provides a brief introduction to containers and container orchestration and explains why containerization is one of the key enabling technologies of EC.

1) Container
Over recent years, containers have grown in popularity as a promising alternative to virtual machines, offering a lightweight form of virtualization. Containers are not a new idea. In 2008, Linux Control Groups (cgroups) and namespaces were combined to develop Linux Containers (LXC). LXC aimed to provide complete OS-level virtualization built on these Linux kernel features, which were later adopted by many projects, the best known of which is Docker. Fig. 18 depicts the differences between VMs and containers in terms of architecture. The main advantages of using VMs and containers are consolidation and elasticity. Consolidating workloads reduces hardware, power, and space requirements. Elasticity allows resources to be allocated dynamically as they are needed [313]. With VMs, companies are no longer required to own physical servers sized to accommodate peak demands whenever they occur. Importantly, due to their lightweight dependencies, containers offer higher portability than VMs. On the other hand, VMs provide a high level of isolation and thus better security [314]. Although the VMs on a physical machine share its resources, mechanisms such as virtual paging ensure that each VM's resources are entirely isolated. Finally, as shown in Fig. 18, containers share a common operating system, unlike VMs, each of which can run a different operating system at the cost of additional memory and storage overhead.

3) Containers orchestration
Orchestration is the automated configuration, management, and coordination of computer systems and services [315]. The goal of orchestration is to help manage complex tasks and workflows more efficiently. Container orchestration automates the deployment, management, scaling, and networking of containers. There are many orchestration tools to choose from [315]. The orchestration tool manages the life-cycle of the running containers according to the specifications laid out in the container's definition file.
Kubernetes, Greek for "helmsman", is an open-source container orchestration tool originally developed by Google and released as open source in 2014. Kubernetes's main responsibility is making sure that all the containers that execute various workloads are scheduled to run on physical or virtual machines [316]. Kubernetes comprises many components, but the main ones are:
• Cluster: a collection of nodes containing at least one master node, while the rest are worker nodes.
• Node: also known as a minion, a single host whose job is to run pods.
• Master: responsible for the overall cluster-level scheduling of pods and the handling of events.
• Pods: the basic unit of work in Kubernetes. Each pod contains one or more containers.
• Deployments, replicas, and ReplicaSets: a deployment is an object, usually written in YAML, that defines the pods and the desired number of pod instances, called replicas.
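As an illustration of how a deployment ties pods and replicas together, the sketch below builds a minimal `apps/v1` Deployment manifest as a Python dictionary and prints it as JSON, a format Kubernetes accepts alongside YAML. The helper name `make_deployment`, the application name, and the image reference are hypothetical placeholders.

```python
import json

def make_deployment(name, image, replicas):
    """Return a minimal Kubernetes Deployment manifest as a dictionary."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,                # desired number of pod instances
            "selector": {"matchLabels": labels}, # must match the pod template labels
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }

# Hypothetical edge workload: three replicas of one container image.
manifest = make_deployment("edge-inference", "example.org/edge-inference:1.0", replicas=3)
print(json.dumps(manifest, indent=2))
```

The orchestrator then keeps the number of running pods equal to `replicas`, replacing pods that fail or are evicted.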

4) Containers at the Edge
Recent research shows that using containers at the Edge has many advantages. The advantages stem primarily from the low deployment time [317] and the quick migration time [318] provided by containerization technology. Container orchestration allows consolidating multiple IoT devices with heterogeneous hardware for an increased quality of service at the edge [319].
In that respect, containerization matches the edge environment well, where mobility and constrained resources are the norm. In addition, containers help expand the elasticity and resilience of the edge ecosystem, as their advanced task recovery methods allow tasks to run uninterruptedly at the edge [320]. Moreover, when edge nodes send data to the Cloud, they do not typically need to send raw data streams. Instead, the node only sends critical information. This event-driven approach in EC can be supported by having the orchestrator spawn containers whenever needed. Furthermore, thanks to their light weight and portability, containers are considered the best runtime for edge/mist devices such as SBCs (Single Board Computers) [321]. Meanwhile, in a MEC system hosting edge applications, base station handover needs to be coordinated with container migration, as presented in [322]. However, one of the issues that containers at the Edge suffer from is the cold start problem [323], which refers to the time needed to bring up a new container when no warm container is available. The solution to this issue is to keep warm containers available for use by the event-driven application, although this method may add extra energy consumption [323].
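The trade-off between cold-start latency and the energy cost of warm containers can be sketched with a simple slot-based model. The function name and all numeric defaults below (cold and warm start times, idle power, slot length) are assumed values for illustration only and are not measurements from [323].

```python
def warm_pool_tradeoff(requests_per_slot, warm_pool_size,
                       cold_start_s=2.0, warm_start_s=0.1,
                       idle_power_w=1.5, slot_s=60.0):
    """Return (mean start-up latency in s, idle energy in J) for a warm pool.

    In each time slot, up to `warm_pool_size` requests reuse a warm container;
    the rest pay the cold-start delay. Unused warm containers burn idle power.
    """
    total_latency = 0.0
    total_requests = 0
    idle_energy_j = 0.0
    for n in requests_per_slot:
        warm_hits = min(n, warm_pool_size)
        cold_starts = n - warm_hits
        total_latency += warm_hits * warm_start_s + cold_starts * cold_start_s
        total_requests += n
        idle_energy_j += (warm_pool_size - warm_hits) * idle_power_w * slot_s
    mean_latency = total_latency / total_requests if total_requests else 0.0
    return mean_latency, idle_energy_j

# Hypothetical bursty event trace: requests arriving in four one-minute slots.
trace = [0, 3, 1, 0]
no_pool = warm_pool_tradeoff(trace, warm_pool_size=0)
small_pool = warm_pool_tradeoff(trace, warm_pool_size=2)
print(no_pool, small_pool)
```

With these assumed numbers, a pool of two warm containers cuts the mean start-up latency substantially but spends energy during the slots in which no events arrive.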

D. THE MOVE TOWARD FUNCTION AS A SERVICE (FAAS)
FaaS (Function as a Service) refers to the ability to decouple an application into a group of interconnected units of computation (code) known as functions. It is the next abstraction layer above software as a service in CC, and its goal is to provide cloud clients with a platform for running their software without regard for the OS or any virtualization dependencies. The ability to create event-driven apps is one of the benefits of serverless platforms. With that, some functions, such as sensing and anomaly detection, can be activated based on events, making serverless ideal for a wide range of edge-enabled applications, including precision agriculture, image processing, etc. Meanwhile, present serverless platforms are better suited to the Cloud and are far from being viable on an edge node with limited computation resources. However, recent efforts such as [324] aim to reduce the issues of deploying serverless computing at the edge.
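The event-driven activation of functions described above can be sketched with a tiny in-process dispatcher. This is only a conceptual illustration: the decorator `on`, the `emit` helper, the event name, and the temperature limit are hypothetical and do not correspond to the API of any particular serverless platform.

```python
from collections import defaultdict

# Maps an event type to the functions subscribed to it.
_handlers = defaultdict(list)

def on(event_type):
    """Decorator that registers a function to run when `event_type` occurs."""
    def register(fn):
        _handlers[event_type].append(fn)
        return fn
    return register

def emit(event_type, payload):
    """Invoke every function registered for the event and collect results."""
    return [fn(payload) for fn in _handlers[event_type]]

@on("temperature_reading")
def detect_abnormal_temperature(payload, limit_c=45.0):
    # Runs only when a reading arrives; nothing executes between events.
    return {"sensor": payload["sensor"], "abnormal": payload["celsius"] > limit_c}

results = emit("temperature_reading", {"sensor": "greenhouse-1", "celsius": 51.2})
print(results)
```

On a real platform, the dispatcher role is played by the serverless runtime, which also decides on which node each function instance is started.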

A. SMART CITIES
Intelligent and connected cities are the image of our near future. Smart cities encompass a wide range of subdomains, including smart buildings, smart farms, smart roads, smart banks (Blockchain), and so forth.

1) Smart buildings
Smart buildings are a sub-branch of smart cities that make buildings more efficient, smart, and dynamic. Smart buildings' appliances (kitchens, light fixtures, TVs) are intelligently monitored and controlled based on users' preferences. Controlling the building environment can be computationally expensive, and most modern intelligent functions require low-delay interaction with the user; this makes EC a valuable tool for meeting the fast-response and low-cost requirements of smart buildings [325]. Some advanced smart building functionalities include room occupancy estimation [326], outdoor video surveillance [327], and person tracking and identification [328]. Meanwhile, intelligent buildings have been criticized for consuming much energy, although they also contribute to energy savings through actions such as turning off lights or heaters [329].

2) Smart farms
Smart farming (SF), or precision agriculture, as a component of smart cities, employs the most recent and advanced ICT technologies to improve farm sustainability and profitability. Smart farming is focused on controlling actuators (motors, pumps, light regulators, and so on) based on various aggregated data from sensors (temperature, humidity, brightness, etc.). Moreover, UAV (Unmanned Aerial Vehicle) computing is a type of application-oriented edge computing that is widely used in agriculture today [330]. The UAVs are deployed in the form of a swarm, outfitted with cameras and computation resources, and fly over large farms to monitor crop health and plant stress. Aside from crop profitability, EC can be used to analyze farm animal behavior [331], which is vital for animal welfare and health. Meanwhile, one of the challenges agricultural areas face is isolation and the lack of solid and reliable connections to the data network. Private edge computing and communication infrastructure is a valuable way to address this issue in agricultural areas. Another reason to rely on edge/fog computing is to protect the core network from congestion [332], as transferring all camera records and sensor data from different farms to the centralized Cloud will place a heavy burden on the network in the future. In addition, LoRa (Long Range) is a well-known low-energy communication technology that is widely used in precision agriculture [333]. It consists of two components: LoRa gateways, which are responsible for transferring data from sensors, and a LoRa central unit, which functions as an edge node for processing the acquired data.
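As a minimal sketch of how an edge node might turn aggregated sensor data into an actuator command, the following function drives an irrigation pump from averaged soil-moisture readings. It uses a hysteresis band so the pump does not toggle on every small fluctuation. The function name and the two thresholds are assumptions for illustration.

```python
from statistics import mean

def pump_command(moisture_readings, pump_on, low=30.0, high=45.0):
    """Decide whether the irrigation pump should run.

    moisture_readings: soil moisture values (percent) from several sensors.
    pump_on: current pump state, kept unchanged inside the [low, high] band.
    """
    avg = mean(moisture_readings)
    if avg < low:
        return True    # soil too dry: start (or keep) watering
    if avg > high:
        return False   # soil wet enough: stop the pump
    return pump_on     # inside the band: keep the current state

# Hypothetical readings aggregated from three field sensors.
print(pump_command([25, 28, 27], pump_on=False))
```

Running this rule on the edge node rather than in the Cloud keeps the control loop working even when the farm's uplink is unreliable.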

3) Smart industry
EC aligns well with the most recent Industry 4.0 requirements; this integration is also known as industrial edge computing [334]. Predictive maintenance is an approach that new industries rely on to reduce their CAPEX and OPEX. In it, the machine is equipped with various IIoT (industrial IoT) sensors, including temperature, vibration, and pressure sensors, which gather data that is then transferred to fog nodes and processed there to predict machine failures and errors [335]. Further, the fourth industrial generation intends to incorporate artificial intelligence (AI) into its manufacturing processes. Because industrial companies cannot transfer their private data (for example, videos from the production scenes) to the Cloud, they must rely on the edge [336]. Object recognition with robots [337], automated guided vehicles (AGV) [338], and human pose estimation [339] are some examples of how Industry 4.0 is working to improve itself using EI.
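To illustrate the kind of lightweight processing a fog node could apply to IIoT sensor streams for predictive maintenance, the sketch below flags vibration samples that deviate strongly from a short rolling history. The rolling z-score rule, the window length, and the threshold are assumptions chosen for this example; the approach in [335] may differ.

```python
from statistics import mean, stdev

def flag_anomalies(samples, window=5, threshold=3.0):
    """Flag samples lying more than `threshold` standard deviations away
    from the mean of the previous `window` samples."""
    flags = []
    for i, x in enumerate(samples):
        history = samples[max(0, i - window):i]
        if len(history) < window:
            flags.append(False)  # not enough history yet
            continue
        mu, sigma = mean(history), stdev(history)
        flags.append(sigma > 0 and abs(x - mu) > threshold * sigma)
    return flags

# Hypothetical vibration amplitudes with one spike at index 6.
vibration = [1.0, 1.1, 0.9, 1.0, 1.1, 1.0, 3.5, 1.0]
flags = flag_anomalies(vibration)
print(flags)
```

Only the flagged events, rather than the raw sensor stream, would then need to be forwarded upstream.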
Outside of industry, e-commerce enterprises are now in great need of real-time interaction with their customers [340], as fast content delivery through EC is one of the best ways to improve their customers' browsing experience. Likewise, EC can help secure in-store payment stations by upgrading their cameras with computer vision capabilities [341].

4) Smart Grid
Smart grid (SG) enhances electrical energy delivery with efficient, flexible control of grid components through information and communication technologies (ICT). SG has three main branches, namely smart grid metering and monitoring, intelligent control of grid functions, and effective integration of renewable energy sources into the grid [342]. Smart grids interact with edge computing via their energy management systems, which can be hosted in edge servers in a scalable and distributed fashion. One of the smart grid's primary functions is monitoring. Smart grids can monitor their grid equipment (voltage alarms, cables, towers, etc.) using EC. In the event of an overhead line failure, the smart grid can quickly restore power thanks to its low-latency communication with the ES [343].
One of the primary smart grid energy measurement approaches is load forecasting, also known as electricity consumption prediction. The prediction is usually performed using ML models, and the training of those models can be accomplished using FL methods, in which each smart meter is connected to an ES that performs the training on its electrical data [344]. In addition, malfunctioning smart meters are one of the issues EC helps to resolve in SG [345]; with EC, smart meters can be empowered with intelligence at the Edge, allowing them to detect whether or not their monitored data is erroneous. Moreover, in the intelligent grid control phase [346], the edge nodes can play a critical role in communicating with the various grid sensors and power sources and then sending the appropriate commands to generators and actuators in real time. Renewable energy integration is also a component of the smart grid; one of the services that VEC provides to EVs (electric vehicles) is the use of MEC servers, represented by roadside units, to calculate the best time and location for charging based on tariffs and battery level [347].
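The aggregation step at the heart of many FL schemes can be sketched as a sample-weighted average of the model weights trained locally at each ES. This is a generic federated-averaging sketch under the assumption that every ES reports a weight vector and the number of samples it trained on; it is not necessarily the method used in [344], and the numbers are illustrative.

```python
def federated_average(client_updates):
    """Combine local models into a global one.

    client_updates: list of (weights, num_samples) pairs, one per edge server,
    where `weights` is a list of floats of identical length for all clients.
    """
    total = sum(n for _, n in client_updates)
    dim = len(client_updates[0][0])
    return [sum(w[k] * n for w, n in client_updates) / total for k in range(dim)]

# Hypothetical load-forecasting models from two edge servers.
global_weights = federated_average([
    ([0.5, 1.0], 100),   # ES 1 trained on 100 meter readings
    ([0.7, 2.0], 300),   # ES 2 trained on 300 meter readings
])
print(global_weights)
```

Only model weights leave each ES in this scheme, so the raw consumption data of individual households stays local.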

5) Smart roads
Edge computing makes the roads and transportation systems of new smart cities safer and more intelligent. Smart roads, as surveyed in [348], aid in the spread of global awareness among road elements by employing an intelligent traffic management system, where vehicles assisted by EC can communicate with one another to maintain road traffic safety and equilibrium [349]. Using this Vehicle-to-Thing communication, it is also possible to treat special road scenarios more efficiently, for example, in the case of an accident or of emergency vehicles (ambulance, police car), where vehicles are proactively commanded to free up some road lanes.
Based on the aggregated data from the road environment, one of the functions that will be hosted in the ES is vehicle collision detection [350]; the ES can then instruct and calibrate vehicle speeds and trajectories, as well as lighting systems, in real time to avoid collisions. Additionally, FC is used to enhance road cameras with functionalities such as vehicle detection and tracking [351]. Another issue in road safety is surface condition; the effort in [352] proposed deploying a crowdsourced road-surface sensing system with the assistance of vehicles, in which data is aggregated and analyzed using fog nodes.
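One simple quantity an ES could compute from aggregated vehicle data is the time to collision between two vehicles in the same lane. The sketch below is a simplified one-dimensional illustration with constant speeds; the function names and the 3-second threshold are assumptions and are not drawn from [350].

```python
def time_to_collision(gap_m, follower_speed_mps, leader_speed_mps):
    """Seconds until the follower reaches the leader at constant speeds."""
    closing_speed = follower_speed_mps - leader_speed_mps
    if closing_speed <= 0:
        return float("inf")  # the gap is not shrinking
    return gap_m / closing_speed

def advise(gap_m, follower_speed_mps, leader_speed_mps, ttc_threshold_s=3.0):
    """Return the instruction the edge server would send to the follower."""
    ttc = time_to_collision(gap_m, follower_speed_mps, leader_speed_mps)
    return "brake" if ttc < ttc_threshold_s else "keep"

# Hypothetical case: 20 m gap, follower at 25 m/s, leader at 15 m/s.
print(advise(20, 25, 15))
```

Because the warning is only useful if it arrives well before the computed time elapses, this kind of function benefits directly from the low latency of an ES close to the road.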

VOLUME 4, 2016
This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2022.3183634

6) Smart banks (Blockchain)
Blockchain is an electronic transaction system invented in 2008, and it is free of any third-party controller (banks, governments, etc.). Transaction verification in Blockchain is performed by miners, who work on solving a computationally difficult mathematical problem known as the Proof of Work. Blockchain technology is the driving force behind smart banks. Because of the computationally intensive nature of mining tasks, Blockchain cannot be directly integrated into IoT devices. As a result, offloading mining tasks to the Cloud, Fog, or Edge represents a valuable solution [353]. Moreover, one of the challenges in Blockchain is pricing collaborative miners, as there is a need to improve the revenue of ECSPs (Edge/Cloud service providers) while also protecting miners' investment gains from offloading to the Edge/Cloud [354].
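The computational asymmetry of Proof of Work, costly to solve yet cheap to verify, can be illustrated with a simplified hash puzzle. This sketch searches for a nonce whose SHA-256 digest starts with a given number of zero hexadecimal digits; real blockchains use more elaborate block formats and difficulty targets, and the block content below is a hypothetical placeholder.

```python
import hashlib

def mine(block_data, difficulty):
    """Find the smallest nonce whose digest has `difficulty` leading hex zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1

def verify(block_data, nonce, difficulty):
    """Verification needs a single hash, unlike the search performed by mine()."""
    digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)

# Hypothetical transaction; difficulty 3 needs about 16**3 attempts on average.
data = "A pays B 5 units"
nonce, digest = mine(data, difficulty=3)
print(nonce, digest)
```

Each additional zero digit multiplies the expected search effort by 16, which is why a constrained IoT device would offload `mine` while still being able to run `verify` locally.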

B. E-HEALTH
In this century, electronic and information systems have already permeated the health sector, resulting in what we now call "e-Health." This new health paradigm is characterized by the use of electronic devices for diagnosing and treating patients and the widespread use of computers for collecting and analyzing health records. EC benefits e-health in a variety of ways, including medical records storage, dealing with privacy concerns, and coping with long retrieval delays from the cloud [355]. Moreover, using fog/edge for continuous remote health monitoring of patients [356] will save a massive amount of network bandwidth. EC is essential for next-generation e-health applications, including telesurgery [357], in which doctors command, from their homes, actions to be performed by robots in operating rooms; this communication requires a very low latency that can only be achieved with EC. Further, with the increasing use of AI in all human activities, e-health is now widely benefiting from this intelligence in many tasks; the survey in [358] highlights the different EI deployment use cases in healthcare systems. Some of those e-health-enabled tasks include arrhythmia detection [359], in which electrocardiogram sensors are enabled by Edge Intelligence to classify heartbeats. Another application is electroencephalogram monitoring [360], or human brain activity classification, where recent advancements in embedded brain-computer interfaces (BCI) combined with EI allow brain seizure detection to be performed [361].

C. ENTERTAINMENT
There is no doubt that edge computing, in conjunction with 5G networks, will transform the gaming experience, particularly for VR and AR games. The two things gamers despise the most are high ping caused by high latency and poor FPS linked to low computing resources. In the case of augmented reality, EC promises to improve these two gaming quality requirements by connecting gamers' AR goggles and mobile devices, with low latency, to resource-rich edge computing nodes [362]. In addition to AR goggles, another type of wearable gaming device is the haptic device [363], which provides sensory feedback to users whenever an action occurs within a virtual game. To come close to an instant human reaction, such as when touching fire, these devices require ultra-low latency, which can only be achieved using edge computing. Additionally, by means of smart wearable devices, EC promises to enhance and improve edge users' entertainment experience [364]. Aside from gaming, EC enhances mixed reality applications, such as assistance for blind and visually impaired people [365].

D. MILITARY & SPACE
A trend toward edge computing is a trend toward Tactical-Edge Computing. Data cannot be backhauled to a central office in military operations because network infrastructure is one of the first tactical targets in a war, rendering the Cloud, fog, or MEC non-existent or out of service. In a war scenario, the military is left with distributed edge nodes that must be consolidated and scaled up in a highly fault-tolerant environment before being used to augment other military equipment with computation resources [366]. Given that EC favors decentralization, the work in [367] investigated the case of a swarm of drones as an effective provider of fog computing services. Smart wearable devices (e.g., smart clothing, smartwatches) powered by EC have proven to be extremely useful on battlefields, as demonstrated by Microsoft's sale of the HoloLens smart glasses to the US Army as part of a 22 billion dollar contract [368]. Another EC-powered military device is ground penetrating radar [369], which is typically carried by drones or low-flying aircraft. Adding local intelligence to these radars allows for instant action against detected underground objects. Similarly, EI can play an essential role in the surveillance of remote desert borders [370]. In a different scenario, maritime rescue operations in the middle of the ocean can be performed using UAV-based edge computing, with UAVs equipped with object detection models [371]. Furthermore, because terrestrial edge computing (TEC) is vulnerable to disasters or tactical attacks, orbital edge computing (OEC) [372] is becoming a valuable backup in many scenarios, particularly with the breakthrough advances in space-ground data transmission rates and the decreasing cost of manufacturing and operating satellites. Nonetheless, as [373] highlights, OEC still faces numerous challenges, such as high-speed movement and channel condition changes.
Meanwhile, many studies, including [374] and [375], propose outfitting satellites with embedded processing boards to perform tasks like data analysis and AI inference.

A. GREEN ENERGY
Climate change is one of the most pressing issues of the current decade. Climate change has compelled the globe to rely more on clean and sustainable energy. However, with today's increased electricity usage due to novel edge applications, the demand for integrating renewable energy as the primary source of EC energy consumption is at its all-time high. Although EC promises to reduce the energy consumption driven by cloud data centers, there is an increasing need for powering ES with clean energy and harvested energy approaches [376]. On the other hand, many efforts are working on reducing ENs' energy consumption while also making offloading decisions that favor EC powered by clean energy [377].
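An offloading decision that favors clean energy can be sketched as follows: among the candidate nodes that meet the task's latency deadline, pick the one that draws the least non-renewable energy. The field names, the scoring rule, and all values are assumptions for illustration and do not reproduce the method of [377].

```python
def pick_node(nodes, deadline_ms):
    """Choose the node minimizing non-renewable energy within the deadline.

    Each node is a dict with hypothetical fields: name, latency_ms,
    energy_j (energy to run the task), and renewable_share in [0, 1].
    Returns the node name, or None if no node meets the deadline.
    """
    feasible = [n for n in nodes if n["latency_ms"] <= deadline_ms]
    if not feasible:
        return None
    best = min(feasible, key=lambda n: n["energy_j"] * (1 - n["renewable_share"]))
    return best["name"]

# Hypothetical candidates for one offloaded task.
nodes = [
    {"name": "edge-solar", "latency_ms": 18, "energy_j": 5.0, "renewable_share": 0.9},
    {"name": "edge-grid", "latency_ms": 12, "energy_j": 4.0, "renewable_share": 0.1},
    {"name": "cloud", "latency_ms": 80, "energy_j": 3.0, "renewable_share": 1.0},
]
print(pick_node(nodes, deadline_ms=20))
```

In this example the solar-powered edge node is selected even though it consumes more energy in total, because most of that energy is renewable and the deadline is still met.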

B. STANDARDIZATION
Edge Computing has emerged as a compelling and vital paradigm for industry and research. Several standardization institutions have put considerable effort into creating recommendations and references on how to integrate EC, either from the Cloud-Edge side or from the MEC-5G network standards side [378]. In terms of cloud architectures, the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) have done substantial work to define Cloud/Edge technical architectures, software platforms, virtual machine and container management, and orchestration. ETSI is a prominent player in the 5G-MEC sector, and its numerous white papers have helped to standardize Multi-access Edge Computing platforms [54].

VIII. CONCLUSION
The world is undergoing a massive shift toward digital services. As a result, computing and memory resources are in high demand. Furthermore, novel applications like smart cities, e-health, smart grid, and others require resources (computing and memory) with low-latency services and a stable network free of security and privacy issues. EC has emerged to provide all of this. In this work, we presented a survey on the evolution and construction of this computing paradigm. We discussed how related technologies such as 5G, Edge Intelligence, and containerization have pushed the evolution toward keeping and handling data at the Edge. Finally, we investigated how EC will respond to future concerns such as green energy and standardization.