A Heuristic for Load Distribution on Data Center Hierarchy: A MEC Approach

Mobile Edge Computing (MEC) extends cloud computing to the edge of the network, creating a hierarchy of data centers. This paradigm provides computing capacity close to end users, relieves the backhaul and core network, and serves latency-sensitive applications. When computing services are provided at different network levels, the coexisting resources must be distributed efficiently: a random allocation can lead to a low service acceptance rate and backhaul overload. To maximize service acceptance and ensure a fair distribution of services among the kinds of servers while guaranteeing their QoS requirements, we propose a Mixed Integer Linear Programming (MILP) model. The model performs an optimal allocation of applications in a two-level hierarchy of data centers: (i) MEC and (ii) cloud computing. At large scale, the MILP becomes unfeasible due to its high computational cost, so we propose a heuristic based on application profiles. We compare the proposed heuristic with two metaheuristics: (i) the Genetic Algorithm (GA) and (ii) Particle Swarm Optimization (PSO). The solutions are compared in terms of service acceptance rate, large-scale performance, and efficient use of available resources. Results show that the proposed heuristic reaches 91% of the optimal solution and over 140% of the GA and PSO solutions.


I. INTRODUCTION
Mobile traffic in 2022 is expected to be seven times greater than in 2017 [1]. This growth will be a consequence of the new applications that the fifth generation of mobile networks (5G) will enable [2]. The Internet of Things (IoT), education 4.0, and industry 4.0 are some examples of applications that will increase the demand for resources [3]. The high demand volume and the pressure of high network costs will drive the need for changes to maintain quality of service (QoS) and quality of experience (QoE) for end users [4], generate revenue for service providers, and optimize operations [5].
The increasing reliance of users on intensive computing and storage operations requires offloading applications to the cloud [6]. In addition, ultra-low latency and high bandwidth availability will be standard requirements among new applications. Applications will therefore often not be served adequately by the cloud because of the latency required to reach its servers (round-trip time, router delays, queuing delays, bandwidth bottlenecks, etc.). Service providers should adopt sophisticated planning and operation approaches that consider the inherent heterogeneity of new applications [7]. Maintaining good QoS levels would be difficult and expensive without bringing the cloud closer to the edge of the network and to the end users [6].
Mobile Edge Computing (MEC) supports these new applications [8]. MEC provides an Information Technology (IT) service environment and computing resources in the Radio Access Network (RAN), in proximity to mobile subscribers [8]. MEC servers are implemented directly on cellular base stations (BSs) or local wireless access points (APs) using a generic computing platform [9]. Unlike the traditional cloud computing system, which uses remote public servers, MEC servers belong to the network operator. Fig. 1 illustrates the MEC architecture, which has three basic components: (i) edge devices, ranging from smartphones and smartwatches to IoT devices connected to the network; (ii) the edge cloud, which has fewer computing resources and is deployed at each base station or access point; and (iii) the public cloud, the cloud infrastructure hosted on the internet.
MEC is a crucial architecture and technology concept to enable the evolution to 5G, as it contributes to satisfying the demanding requirements in terms of QoS [5]. According to [10] and [11], some of the key features of MEC are proximity, low latency, high bandwidth, flexible deployment, and real-time response. Therefore, with new and heterogeneous requirements, the new set of applications can benefit from the MEC implementation.
The resources of the MEC platform are scarce compared to the cloud and can support only a limited number of applications. Thus, cloud servers or even regional MEC servers can serve applications with less stringent latency requirements. Edge servers therefore do not exclude the cloud; they work in collaboration [12], creating a data center (DC) hierarchy in which processing and storage power decreases toward the edge of the network.
There are several solutions to meet application requirements and provide a good quality of service to users, such as (i) deploying MEC servers at all access points and (ii) increasing the number of servers in the cloud and the capacity of backhaul links. However, these solutions are not ideal for hardware consumption or operating and planning costs. Therefore, it is necessary to have adequate load distribution solutions that respect application requirements and balance the available servers, maximizing the use of available resources.
Resource allocation planning is a necessary step toward better QoS levels. To this end, this work presents a Mixed Integer Linear Programming (MILP) model to maximize the number of allocated applications. This work considers a fixed set of application demands and a static scenario with steady-state traffic. Due to the high computational complexity of exact linear-programming solutions, a new heuristic with low computational cost is proposed, based on the known behavior of different application profiles. To validate the quality of the solutions found by the heuristic, we compare its results with the optimal solution of the MILP. Furthermore, we compare the heuristic with two metaheuristics: the Genetic Algorithm (GA) [13] and Particle Swarm Optimization (PSO) [14]. By making the most efficient use of available resources, it is possible to improve QoS levels.
The rest of this paper is organized as follows. Section II presents related works. Section III describes the system model and formulates the load distribution problem. The heuristics are presented in Section IV, the numerical results are discussed in Section V, and Section VI concludes the paper.

II. RELATED WORKS
Recently, the literature has discussed the MEC architecture and the challenges of maintaining the QoE and QoS levels of end users [15]-[17]. A common need of all network operators is to investigate the allocation of resources distributed across the network to meet new application profiles (e.g., IoT, smart home, and the increasing volume of data). Therefore, in this section, we present some works related to the problem of using MEC resources for new application profiles.
The authors in [18] considered a scalable Software Defined Networking (SDN) scenario where cloud, edge, and IoT resources are connected; intending to minimize different network metrics according to specific requirements and provide more efficient solutions, they used a Dijkstra algorithm that produces routing tables capable of reducing network latency. The authors in [19] proposed a reinforcement learning method to minimize the delay of NOMA-MEC (Non-Orthogonal Multiple Access) systems with multiple users and one BS (integrated with a MEC server). By applying NOMA to MEC offloading, they effectively reduced system latency.
In [20], an algorithm is proposed to minimize total energy consumption while considering the latency requirements of users' time-sensitive computing tasks. The simulation results demonstrate that the proposed algorithm can effectively reduce the total consumption of computing tasks and outperform reference schemes with better computational performance. In [10], an offloading method for MEC-enabled smart mobile devices is proposed that seeks energy efficiency under latency restrictions. The results show that energy efficiency is achieved when offloading occurs as the number of devices increases.
In [21], the authors present an iterative algorithm that combines offloading decisions, cooperative relay selection, and resource allocation, based on double Lagrangian decomposition, the Shen-Jing formula method, and monotonic optimization, with the objective of minimizing task execution latency. Simulation results show that the proposed Cooperative Offloading Decision Method (CODM) algorithm obtains the lowest execution latency for different network parameters, surpassing existing schemes. In [22], an offloading mechanism is proposed to maximize the total data rate of a multiuser MEC system in a heterogeneous 5G network while ensuring that the maximum delay of the applications is respected. The proposed method guarantees communication quality and improves transfer rate and spectrum use.
In [23], the authors propose an auction mechanism that assigns edge servers to mobile device tasks and defines payments for computing services, carrying out the process through a pair of deep neural networks; it effectively improves edge server utility while making the allocation satisfy task-completion delays. Virtual machine (VM) placement and workload assignment in a collaborative MEC system are discussed in [24], where a MILP is proposed to minimize the hardware consumed to deploy VMs while meeting the heterogeneous latency requirements of different applications. The authors in [25] propose two mechanisms for resource allocation in edge computing systems that achieve results close to optimal.
In [9], the authors design a holistic solution for joint task offloading and resource allocation in a MEC network assisted by several servers, aiming to maximize users' offloading gains. In [26], the authors formulate a geographic clustering problem for the MEC system through a MILP and a graph-based heuristic that finds partitions of MEC areas to maximize server utilization. In [27], resource allocation in a mutative MEC environment is investigated, and an intelligent Deep Reinforcement Learning based Resource Allocation (DRLRA) algorithm built on the emerging DRL technology is proposed, which performs better than the classic Open Shortest Path First (OSPF) algorithm.
In [12], the orchestration of resources at the edge, caching, collaborative processing, and interference cancellation are addressed to show the benefits of real-time, context-aware collaboration between MEC systems and mobile devices. In this work, we present a heuristic based on the application behavior described in [28], [29], which differs from the other papers cited. The distribution of available resources on the network is optimized, maximizing the number of requests served with their latency requirements respected.

III. MEC LOAD BALANCING
This section presents the load distribution problem in scenarios with smart devices and gives the corresponding mathematical model, adapted from the work introduced in [28] and [29]. Fig. 2 illustrates a DC hierarchy with three layers of computational resources. The first layer represents end devices with extremely limited computing resources. The second is the MEC layer, whose network-edge resources are associated with a BS or AP. The third layer is cloud computing, whose resources sit at the network's core. Compared to mobile devices, servers have more computing resources and greater energy efficiency. The hardware on a server comprises CPU, memory, and storage.

FIGURE 2. The computational hierarchy: an orderly distribution of computing power that decreases toward the edge of the network. The first layer consists of end users, who have limited computing resources and a low power reserve; the second is the MEC layer, where servers associated with base stations or access points have computational power that is limited but larger than that of the end devices; the third layer is cloud computing, which sits at the core of the network and has virtually unlimited computing resources.

A. SYSTEM MODEL
The model consists of two sets: a set of devices, D, where each device has a request, R, and a topology, T(N, L), with associated node and link attributes.

1) Topology Profile
The model considers two layers of computational resources available to fulfill requests. The available resources consist of N servers, divided into MEC servers, E, and cloud servers, C; therefore, E ⊂ N and C ⊂ N. CP_n denotes the computational capacity of node n, where n = {1, 2, ..., N}. L_{n1,n2} is a binary parameter that takes the value 1 when there is a link between nodes n1 and n2, and 0 otherwise. DL_{n1,n2} represents the network propagation delay of the link between n1 and n2. Wl_{n1,n2} is the maximum traffic capacity that the link between n1 and n2 can support without causing congestion or communication failures.

2) Device Profile
There are D mobile devices making service requests. Each device i, where i ∈ D, has a task request R_i and is described by the set of parameters presented in Table 2. The fixed set of application demands is formed by the set of requests R_i, i ∈ D.

TABLE 2. Description of end device parameters. Each parameter represents a characteristic common to the users that is important for the allocation problem; the parameters serve as input to the model and to the heuristics discussed in this paper.
R_i represents the type of application that device i is requesting. Θ^i_p, where p ∈ R, represents the maximum delay allowed by the application profile of device i ∈ D. The delay considered is not the total end-to-end latency, but only the traffic latency from the application's source node S_i to the destination node where the application will be processed or stored. γ^i_p represents the cost that an application profile incurs to travel through the link between nodes n1 and n2, where n1, n2 ∈ N. ϕ^i_p defines the hardware cost that an application profile needs to serve 1 Megabit per second (Mbps). β^i_p represents the packet size, in Mb, of the application profile of device i.

3) Variable Definition
For each mobile device that performs offloading, the application can run on a MEC server, E, or a cloud server, C. The model must therefore decide where application p from device i will be processed. Once the destination is decided, the application must traverse the topology to reach server n. Thus, the model has two binary variables. The first, A_{i,n}, indicates which server n serves request p from device i: A_{i,n} takes the value 1 if request R_i is served by server n, and 0 otherwise. The second, X^i_{n1,n2}, defines the path of application p from device i between its source S_i and the destination server n for which A_{i,n} = 1: X^i_{n1,n2} takes the value 1 for every application p from device i that travels between nodes n1 and n2, and 0 otherwise.

B. MIXED INTEGER LINEAR MODEL FOR MEC LOAD BALANCING
To formulate the load distribution problem, each mobile user request is associated with only one BS; it is thus possible to remove the UE-BS connection from our modeling. As the MEC servers can coexist with the BSs on the same site, the source node of a request can also be its destination node. Equation (1) describes the objective function, which maximizes the number of allocated applications, producing a fairer distribution of the available resources. The formulation of the distribution problem is presented below. Equation (2) guarantees that each request is served by at most one server. Equation (3) limits the resources used by requests to the available hardware capacity. The variable A_{i,n} is defined in III-A3. Equations (4) and (5) guarantee the flow of a request from its source S_i to the destination n for which A_{i,n} takes the value 1: when the application of device i passes through nodes n1 and n2, the variable X^i_{n1,n2} takes the value 1.
Equation (6) enforces that a request from device i cannot travel between two nodes, n1 and n2, that have no link connecting them. Equation (7) guarantees that the request of device i will not loop between two nodes, n1 and n2, wasting resources. Equation (8) ensures that the maximum delay accepted by application p, where p ∈ R, of device i is respected. Equation (9) limits the number of requests p that can travel between nodes n1 and n2. Note that we do not consider node affinity, i.e., the ability to define the node at which an application will be executed; however, this can be included in the model by extending the application definition and adding a few constraints.
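The equation bodies of (1)-(9) did not survive extraction. A plausible reconstruction, consistent with the constraint descriptions in this subsection and with the notation of Section III-A, is sketched below; it should be read as an illustration rather than the authors' exact formulation:

```latex
\begin{align}
\max \; & \sum_{i \in D} \sum_{n \in N} A_{i,n} \tag{1}\\
\text{s.t.}\;
& \sum_{n \in N} A_{i,n} \le 1 && \forall i \in D \tag{2}\\
& \sum_{i \in D} A_{i,n}\,\varphi^{i}_{p}\,\beta^{i}_{p} \le CP_{n} && \forall n \in N \tag{3}\\
& \sum_{n_2 \in N} X^{i}_{S_i,n_2} = \sum_{n \in N} A_{i,n} && \forall i \in D \tag{4}\\
& \sum_{n_1 \in N} X^{i}_{n_1,n} - \sum_{n_2 \in N} X^{i}_{n,n_2} = A_{i,n} && \forall i \in D,\; n \in N \setminus \{S_i\} \tag{5}\\
& X^{i}_{n_1,n_2} \le L_{n_1,n_2} && \forall i,\, n_1,\, n_2 \tag{6}\\
& X^{i}_{n_1,n_2} + X^{i}_{n_2,n_1} \le 1 && \forall i,\, n_1,\, n_2 \tag{7}\\
& \sum_{n_1,n_2} X^{i}_{n_1,n_2}\, DL_{n_1,n_2} \le \Theta^{i}_{p} && \forall i \in D \tag{8}\\
& \sum_{i \in D} X^{i}_{n_1,n_2}\,\beta^{i}_{p} \le Wl_{n_1,n_2} && \forall n_1,\, n_2 \tag{9}
\end{align}
```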

IV. OPTIMIZATION METHODS
In this section, we introduce the proposed low-cost heuristic for the problem of heterogeneous load distribution in a hierarchy of DCs, along with the objective function and the routing algorithm used. The GA and PSO are also presented.

A. OBJECTIVE FUNCTION AND SOLUTION ENCODING
This subsection presents how the objective function value is calculated for the heuristics proposed in this work. We first present how solutions are encoded in the proposed techniques. It is worth mentioning that a solution in the GA is called a chromosome, while in PSO it is a particle. Equation (10) describes the vector of possible solutions, PS.
A vector of D positions represents a solution. The index i of the vector PS_i represents the request identifier, R_i, i ∈ D, and its value x represents the server identifier, where x ∈ N. The value of x is defined according to the individual strategy of each of the proposed heuristics, presented below. This encoding guarantees the constraint presented in equation (2). Fig. 3 represents the encoding model for a solution to our problem. Given the PS vector, a possible solution to the problem, the heuristic can call the objective function to check the quality of that solution. Algorithm 1 presents the sequential objective-function scheme used by the algorithms.
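Equation (10) itself was lost in extraction; a reconstruction consistent with the encoding just described (a vector of D server identifiers) would be:

```latex
PS = \left[\, x_1,\; x_2,\; \dots,\; x_D \,\right], \qquad x_i \in \{1, \dots, N\} \tag{10}
```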
For each position of the solution vector, the first step is to check whether the destination DC has enough computational capacity to serve the request. After this verification, it is necessary to check whether there is a route that meets the application's latency requirement. If all conditions are met and the application can reach its destination, the values are updated.

Algorithm 1: Objective Function Algorithm
Input: PS. Output: the fitness of PS (the number of accepted requests).
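A minimal sketch of Algorithm 1, assuming hypothetical data structures not defined in the paper: `requests` maps a request index to its hardware cost, source node, and latency bound; `capacity` holds the remaining hardware of each server; and `route_delay` stands in for the routing step of Section IV-B.

```python
def objective_function(ps, requests, capacity, route_delay):
    """Count how many requests in solution vector `ps` can be served."""
    accepted = 0
    remaining = dict(capacity)          # do not mutate the caller's state
    for i, server in enumerate(ps):
        req = requests[i]
        # Step 1: is there enough computational capacity left on the
        # destination DC to serve this request?
        if remaining[server] < req["hw_cost"]:
            continue
        # Step 2: does some route meet the latency requirement?
        delay = route_delay(req["source"], server)
        if delay is None or delay > req["max_delay"]:
            continue
        # Both conditions hold: accept the request and update the state.
        remaining[server] -= req["hw_cost"]
        accepted += 1
    return accepted
```

The returned count is the fitness used by all three techniques compared in this paper.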

B. DIJKSTRA'S ALGORITHM FOR ROUTING
Dijkstra's algorithm finds the shortest paths between a specific source node and a destination node in a network [18], [30]. In the graph, vertices represent network devices, edges represent network links, and edge weights represent different network parameters. A full-duplex connection between two network devices can be arranged as a pair of simplex connections. Since we assume at least one path between any two network devices, the derived graph is connected.
Dijkstra's algorithm follows a simple procedure: its essence is to find the shortest path between two nodes of a graph T(N, L). Following the connected graph model, it is easy to set the weights of the links according to different evaluation parameters (such as communication latency and maximum link capacity) so that they also meet the requirements of applications in MEC scenarios. Algorithm 2 presents the pseudocode.
In this work, among the paths that meet the application's minimum requirements, such as latency, we consider the one with the greatest available capacity Wl. Fig. 4 presents a model of the applied algorithm. Despite the existence of shorter paths (with a smaller value of Dist), the algorithm opted for the one with the highest available capacity (Wl). Considering that all paths are within the delay allowed by the application, the chosen path is the one with the greatest number of resources available at that instant, conserving network resources for more latency-sensitive applications.
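The path selection of Fig. 4 can be sketched with a widest-path variant of Dijkstra's algorithm. This is a simplification, not the paper's exact Algorithm 2: it first maximizes the bottleneck capacity Wl and only afterwards verifies the delay bound, and it assumes a hypothetical graph representation `{node: [(neighbor, delay, capacity), ...]}`.

```python
import heapq

def widest_path(graph, src, dst):
    """Return (bottleneck_capacity, total_delay) of the widest src->dst path."""
    heap = [(-float("inf"), 0.0, src)]  # max-heap on bottleneck via negation
    visited = set()
    best = {src: float("inf")}          # node -> best bottleneck seen so far
    while heap:
        neg_cap, delay, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == dst:
            return -neg_cap, delay
        for nb, d, cap in graph.get(node, []):
            bottleneck = min(-neg_cap, cap)
            if nb not in best or bottleneck > best[nb]:
                best[nb] = bottleneck
                heapq.heappush(heap, (-bottleneck, delay + d, nb))
    return None  # dst unreachable

def route(graph, src, dst, max_delay):
    """Accept the widest path only if it respects the application's delay bound."""
    res = widest_path(graph, src, dst)
    if res is None:
        return None
    cap, delay = res
    return (cap, delay) if delay <= max_delay else None
```

With two equal-delay paths of capacities 10 and 5, the sketch picks the capacity-10 path, matching the behavior described for Fig. 4.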

C. HEURISTIC BASED IN APPLICATION PROFILE FOR MEC LOAD DISTRIBUTION
The proposed heuristic for the resource allocation problem is based on the probability of a MEC server serving an application profile R. The probability Pr is calculated from the average of the optimal solutions found by the MILP model for MEC load balancing. After simulating the experiments, we verified the probability, in the optimal solution, of a particular application being allocated to the edge or to the core of the network. We therefore obtain the probability that a request of type R will be served on a MEC server and use this value as the maximum threshold for the number of requests of that type that MEC servers will serve.
We assume that each MEC server has an intelligent system that knows the proportion of each type of application it can handle without compromising the network, and that can decide whether to accept a given application R based on its current state. The strategy is to limit the serving of less latency-sensitive applications at the network's edge, directing them to the central cloud. When an application enters the network, it looks for an available server, whether a MEC server or a cloud server, but MEC servers preferentially serve the most latency-sensitive requests. The first step of the proposed heuristic is to create an empty PS vector in which the destination of each existing request will be stored. Pseudocode 3 presents the flow of creation of a possible heuristic solution.
Here, O^r_n is the occupancy rate of server n by application profile r. If the occupancy of server n by profile r is less than the proportion Pr calculated previously, applications of that profile can be allocated to it; when serving a request, the server updates its current state O^r_n. After deciding where each application will be allocated when building PS, the heuristic calls the objective function (Algorithm 1) to measure the fitness of the solution, calculated as described in IV-A. As long as the stopping criterion is not satisfied, the solution goes through an update process in search of better results. It is worth mentioning that the heuristic does not prevent applications from being served on a given node; it only limits the number of applications of a specific type R that can be served on the MEC servers. Algorithm 3 presents the proposed solution, which is based on the behavior of the application profiles discussed in Section V.
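The admission rule of Algorithm 3 can be sketched as follows. The names are illustrative, not from the paper: `pr[profile]` is the precomputed threshold P_r that a profile is served at the edge in the optimal solution, and `occupancy[n][profile]` is server n's current occupancy rate O^r_n for that profile.

```python
import random

def choose_destination(profile, mec_servers, cloud_servers, pr, occupancy):
    """Return a destination server for a request, preferring MEC for
    profiles whose edge occupancy is still below their threshold P_r."""
    for n in mec_servers:
        # Admit at the edge only while this profile's share of server n
        # stays below the proportion learned from the optimal solutions.
        if occupancy[n].get(profile, 0.0) < pr[profile]:
            return n
    # Otherwise fall back to the central cloud, which is never excluded.
    return random.choice(cloud_servers)
```

A profile with threshold 0 (e.g., one never edge-allocated in the optimal solutions) is always routed to the cloud, while latency-sensitive profiles keep their reserved share of edge capacity.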

D. GENETIC ALGORITHM FOR MEC LOAD DISTRIBUTION
A Genetic Algorithm (GA) is a metaheuristic for solving optimization problems inspired by the process of natural selection. In the algorithm, a population of possible solutions, called chromosomes, evolves within the optimization problem's domain toward an optimal solution [31]. Each individual corresponds to a chromosome, the coded representation of a solution, which can be represented by a vector with values in the problem domain. Each chromosome is associated with a fitness level, calculated by the objective function presented in Algorithm 1. The GA is an iterative process in which each iteration, called a generation, creates a new population through the random recombination and mutation of individuals selected from the current population. Individuals are stochastically selected, with those of better fitness favored over those of lower fitness. Generally, the evolution process ends upon reaching a certain number of generations or finding a satisfactory fitness level.
The first step of the GA is initializing the population, a set of vectors with random values in which each vector represents a chromosome. The population is randomly initialized using a linear distribution. The proposed chromosome (C) representation is presented in Section IV-A.
Equation (10) represents how chromosomes are generated: C is a vector of D positions, where each position has a random value in [1, N]. After initialization, the objective function evaluates each individual in the population. Each of the subsequent generations, called daughter generations, is created from the previous generation, called the parent generation, using the selection, crossover, and mutation operators. This process is carried out until the predetermined stopping criterion is satisfied.
The parents must be chosen from the current population to generate the child population in each generation. The proposed algorithm uses the roulette method, where individuals from a generation are chosen through a roulette drawing. In this method, each population member is represented on the roulette wheel according to their fitness level. Thus, individuals with high fitness are given a more significant portion of the roulette wheel. In contrast, those with lower fitness are given a relatively minor portion of the roulette wheel. Finally, the roulette wheel is spun a certain number of times, depending on the size of the population, and those drawn on the roulette wheel are chosen as parent individuals.
Once the parents are selected, the crossover process can be performed with a probability rate called the crossover rate. In this study, we used a one-point crossover. A crossing point is chosen, and from this point, the genetic information of the parents will be exchanged. Information prior to this point in one parent is linked to information after this point in the other parent, generating a new individual. After crossover, the mutation occurs. The mutation operator is necessary for the introduction and maintenance of the genetic diversity of the population, arbitrarily altering one of the genes of the chosen individual, thus providing means for introducing new elements into the population. In this way, the mutation ensures that the probability of reaching any point in the search space will never be zero, in addition to circumventing the problem of local minima since, with this mechanism, the direction of the search is slightly altered. The mutation operator is applied to individuals with a probability named mutation rate.
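The three GA operators described above (roulette selection, one-point crossover, per-gene mutation) can be sketched as follows, for chromosomes encoded as in Section IV-A, i.e., vectors of D server ids in [1, N]. The function names are illustrative.

```python
import random

def roulette_select(population, fitness):
    """Pick one parent with probability proportional to its fitness."""
    total = sum(fitness)
    r = random.uniform(0, total)
    acc = 0.0
    for chrom, fit in zip(population, fitness):
        acc += fit           # each member owns a slice of the wheel
        if r <= acc:
            return chrom
    return population[-1]    # guard against floating-point rounding

def one_point_crossover(p1, p2):
    """Exchange the parents' genetic material after a random cut point."""
    point = random.randint(1, len(p1) - 1)
    return p1[:point] + p2[point:]

def mutate(chrom, n_servers, mutation_rate):
    """Arbitrarily alter genes to maintain population diversity."""
    return [random.randint(1, n_servers) if random.random() < mutation_rate else g
            for g in chrom]
```

Fitter chromosomes receive a larger slice of the wheel and are therefore drawn more often, while mutation keeps the probability of reaching any point of the search space nonzero, as the text notes.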

E. PARTICLE SWARM OPTIMIZATION FOR MEC LOAD DISTRIBUTION
The term "Swarm Intelligence" describes algorithms inspired by the collective behavior of colonies and other animal societies [32]. Particle Swarm Optimization (PSO) is an algorithm based on the collective, decentralized behavior of self-organizing systems. The PSO simulates a flock of birds or a school of fish in search of food: it treats each solution to the optimization problem as a bird that flies at a certain speed through the search space, with its speed dynamically adjusted [14]. A swarm solution is thus a particle in a multidimensional search space. Consider that a solution is a vector of D positions, where each position represents a request and its value is the request's destination, that is, the address of a MEC or cloud server. The encoding is discussed in IV-A. Each particle has a flight acceleration that determines its direction and speed, so particles move within the search space at a speed adjusted at each iteration according to cognitive and social factors. As in IV-D, the population is initialized at random through a linear distribution. Once created, each solution is evaluated using Algorithm 1, as in IV-A. After each particle has a fitness level, the particles move through the search space looking for better solutions, and with each movement their fitness level is updated. The particle's motion is described by equation (11).
Particle_i is an element of the set of solutions; Particle_i corresponds to C_i. V_i is the speed at which the particle moves through the search space at time t; V is presented in equation (12).
W_t is the inertia at time t, that is, the tendency of the particle to continue in the same direction. C_1 is the cognitive factor, the tendency of the particle to move based on its own past learning. C_2 is the social factor, the tendency of the particle to move based on the learning obtained from neighboring particles. rand_1 and rand_2 assume random values between 0 and 1. Pseudocode 4 describes the application of these functions.
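Equations (11) and (12) were lost in extraction. The standard PSO update consistent with the description above is sketched below, where Pbest_i (the particle's best-known position) and Gbest (the swarm's best) are labels introduced here to match the cognitive and social terms:

```latex
\begin{align}
Particle_i(t+1) &= Particle_i(t) + V_i(t+1) \tag{11}\\
V_i(t+1) &= W_t\, V_i(t)
  + C_1\, rand_1 \left( Pbest_i - Particle_i(t) \right)
  + C_2\, rand_2 \left( Gbest - Particle_i(t) \right) \tag{12}
\end{align}
```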

Algorithm 4: Particle Update
Input: Particle_i, V_i(t+1). Output: Particle_i(t+1).
begin
  for i ← 1 to D do
    Particle_i(t+1) = Particle_i(t) + V_i(t+1)
    Round(Particle_i(t+1))
    while Particle_i(t+1) > N do
      Particle_i(t+1) = Particle_i(t+1) − N
    end while
  end for
end

The first step of the update is to add the velocity vector to the particle vector, updating each of the i elements within the search space. After that, elements that fall outside the sample space undergo a correction to make the solution valid. The updated solutions are then evaluated again until the stopping criterion is satisfied, either a number of moves or a determined fitness level. The parameters used are presented in Table 4.
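A sketch of Algorithm 4 in Python, assuming positions are server ids in [1, N]. Note that the original pseudocode only wraps values above N; wrapping values below 1 is added here analogously as an assumption, so that negative velocities also yield valid server ids.

```python
def update_particle(particle, velocity, n_servers):
    """Move a particle and fold out-of-range positions back into [1, N]."""
    new = []
    for pos, v in zip(particle, velocity):
        p = round(pos + v)          # Round(), as in the pseudocode
        while p > n_servers:        # upper-bound wrap from Algorithm 4
            p -= n_servers
        while p < 1:                # assumed lower-bound wrap (not in the paper)
            p += n_servers
        new.append(p)
    return new
```

After each move, the corrected particle is re-evaluated with the objective function until the stopping criterion is met.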

V. PERFORMANCE EVALUATION
In this section, the numerical results are presented and the performance of the proposed algorithm is analyzed under different traffic conditions. For each scenario profile, thirty experiment runs were performed. We consider a MEC-enabled mobile network system in which each BS is equipped with a MEC server that assists mobile users in performing computational tasks. To obtain realistic propagation delays, seven cities in the eastern United States were chosen as edge nodes, which host small DCs, and two cities in the west as core nodes, which host cloud DCs. The network propagation latency between each pair of DCs is obtained from [33]. The details of the edge and cloud nodes are listed in Table 5. The total load in each city is defined randomly using a linear distribution. The average over the various experiments was calculated to minimize the effect of outliers and to draw a common behavioral profile for each scenario; the confidence interval used is 95%. We consider that requests always originate at an edge node, a base station, or an access point with an associated MEC server. A parameter Ω ∈ [1, 10] is defined as a scale factor to linearly expand and reduce the number of applications across the seven cities and to observe the performance of the proposed algorithms; for Ω = 1, there are 500 service requests. The request sets are formed by applications drawn at random following a linear distribution. Eight application profiles are used as input to the model, and their latency and hardware requirements are listed in Table 6 [7]. The results are compared, in terms of service acceptance rate, with the optimal solution found by the MILP and with the metaheuristics presented. The proposed heuristic (IV-C) has an acceptance rate 40% and 82% higher than the GA and PSO, respectively; relative to the optimal solution, the acceptance rate of the proposed heuristic is 91%.
These good results reflect the optimization strategy used by the heuristic, which preferentially allocates applications that tolerate high latency to cloud servers farther from the edge of the network. In addition to maximizing the acceptance rate, this implies better use of the available network resources, minimizing their unnecessary consumption. Fig. 5 presents the acceptance rate of the different techniques. It is also noted that, as the network system saturates, the heuristics exhibit a similar drop in acceptance, with the proposed heuristic showing the smoothest decline. In a high-density scenario, the proposed heuristic has an acceptance rate 60% higher than the GA and 130% higher than the PSO.
If the number of requests on the network grew uncontrollably, the computing system could collapse and fail to serve all requests, due to the overload to which the network is exposed and the lack of a solution that optimizes the use of the available resources. As observed, with better solutions such as the proposed heuristic, the gain in efficient resource usage can be up to approximately twice that of other techniques proposed in the literature. Without a good resource allocation and distribution strategy, the system, especially at the edge, would not be able to maintain QoS levels for end-users.
The proposed heuristic consumes up to 3.5 times more cloud-server resources than the GA and 11.7 times more than the PSO in dense scenarios. These high values are directly related to the strategy's higher acceptance rate. Fig. 6 presents the resource consumption at each level of the hierarchy for each of the proposed techniques, considering the average consumption in low- and high-saturation scenarios. As discussed in the previous sections, MEC does not exclude CC but extends its capabilities to the access network, and because of this extension, the best use of the available resources occurs when the applications are well distributed, which in turn leads to a better service acceptance rate. Thus, the inefficient use of cloud resources by the GA and the PSO is due to a poor distribution of requests: this maldistribution quickly saturates the edge servers, so latency-sensitive requests cannot be met. The superior resource usage of the proposed heuristic therefore stems from a better distribution of the existing requests, which allows latency-sensitive applications to be served at the MEC. The proposed heuristic achieves results close to the optimal solution, with about 96% utilization of the available resources in saturated scenarios (Ω = 10).
As latency sensitivity increases, applications tend to seek servers closer to the edge of the network. Fig. 7 presents the behavior of the application profiles for the optimal solution in low-density (Ω = 1) and high-density (Ω = 10) scenarios. It is observed that the most latency-sensitive applications, such as Tactile Internet and AR/VR, are placed exclusively on edge servers at the MEC, whereas applications that do not require real-time responses, such as Data Backup, tend to be served frequently on servers at the center of the network, in the CC. The influence of latency on the destination of the different application profiles is therefore clear.
Therefore, requests that are more sensitive to latency seek servers at the edge of the network and, as this sensitivity decreases, tend to move away from it. This behavior allows the more sensitive applications to be served with their requirements respected. The optimal solution found by the MILP maximizes the use of resources while serving as many requests as possible, which is its main objective. The numerical results presented in Fig. 7 are thus used to calculate the probability P used in the proposed heuristic: knowing in advance the average optimal behavior, it is possible to embed this knowledge in the allocation strategy, maximizing the satisfaction of the final customers and thus increasing the QoS.
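A minimal sketch of how such a probability P could be derived from the optimal solution's behavior, assuming the MILP's placements are summarized as (profile, tier) pairs; `placement_probability` and the toy placements are assumptions for illustration, not the paper's procedure.

```python
from collections import Counter

def placement_probability(optimal_placements):
    """Empirical probability that each profile lands on each tier,
    estimated from the MILP's optimal placements."""
    pair_counts = Counter(optimal_placements)
    profile_totals = Counter(profile for profile, _ in optimal_placements)
    return {(profile, tier): count / profile_totals[profile]
            for (profile, tier), count in pair_counts.items()}

# Toy placements: AR/VR always at the edge, Data Backup split across tiers.
P = placement_probability([("AR/VR", "MEC"), ("AR/VR", "MEC"),
                           ("Data Backup", "CC"), ("Data Backup", "MEC")])
print(P[("AR/VR", "MEC")])       # 1.0
print(P[("Data Backup", "CC")])  # 0.5
```

The heuristic can then bias each profile toward the tier where the optimal solution most often places it, which is the sense in which the average optimal behavior is "inserted" into the allocation strategy.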

VI. CONCLUSIONS
MEC is an efficient technology to support the growing communication and computing demands of future mobile networks. Service providers are adapting their architectures and seeking to offer better levels of QoE and QoS to end-users, so it is necessary to have an adequate operating plan for future architectures, including the MEC system. MEC will enable the distribution of computing resources from the cloud to the edge of the network, allowing delay-sensitive applications to run close to end-users while relieving the backhaul links. In this work, we propose an optimal model and a new low-computational-cost heuristic for allocating network applications, maximizing the use of the available computational resources. The proposed heuristic is based on the behavior of the application profiles. Its results reach, on average, 91% of the optimal solution found by the MILP, while the GA and the PSO reach 65% and 50%, respectively. The allocations made meet the applications' QoS requirements, such as latency and the capacity of the available resources. The results show that the heuristic proposed in this article can effectively improve service quality and user experience. Challenges remain open, such as the mobility of users and the handover between MEC servers during a session.

ACKNOWLEDGMENT
This study was financed in part by the Federal University of Pará (UFPA) and the National Council for Scientific and Technological Development (CNPq).