An Online Orchestration Mechanism for General-Purpose Edge Computing

In recent years, the fast development of mobile communications and cloud systems has substantially promoted edge computing. By pushing server resources to the edge, mobile service providers can deliver their content and services with enhanced performance, and mobile-network carriers can alleviate congestion in the core networks. Although edge computing has been attracting much interest, most current research is application-specific, and analysis is lacking from the business perspective of edge cloud providers (ECPs) that provide general-purpose edge cloud services to mobile service providers and users. In this article, we present a vision of general-purpose edge computing realized by multiple interconnected edge clouds, analyzing the business model from the viewpoint of ECPs and identifying the main issues that must be addressed to maximize their benefits. Specifically, we formalize the long-term revenue of ECPs as a function of server-resource allocation and public data-placement decisions subject to physical-resource and inter-cloud data-transportation cost constraints. To optimize the long-term objective, we propose an online framework that integrates the drift-plus-penalty and primal-dual methods. With theoretical analysis and simulations, we show that the proposed method approximates the optimal solution in a challenging environment without requiring future knowledge of the system.


INTRODUCTION

Motivation
In recent years, the fast development of mobile devices, communications, and cloud computing has produced a surge of Internet-of-Things services and applications. Because most mobile services and applications adopt the device-cloud architecture, mobile traffic across the Internet is increasing dramatically, becoming a challenging issue for mobile-network carriers. Due to the great distances
between mobile devices and cloud datacenters, however, mobile users suffer from large latency and do not receive satisfactory service. To solve these issues, deploying server resources at the edge of the Internet and serving mobile users with edge servers is a promising approach. With edge computing, mobile content and service providers, as well as mobile devices, can take advantage of nearby computing and storage resources and alleviate congestion in the mobile core network.
One of the most interesting uses of edge computing is data caching. An ECP can cache popular data in edge clouds when the data traverse its networks, and then it can serve data requests with the cached data rather than forward the requests to the Internet. Task offloading is also an extensively researched use of edge clouds. Mobile devices can offload computation-intensive tasks, such as video analysis, to edge servers to save battery and accelerate processing. Recently, more advanced, complicated edge services, such as edge-computing-assisted online gaming, virtual reality, and augmented reality [1] [2] [3], have been studied and developed, showing the great potential of edge computing for emerging applications. Currently, however, most edge-computing research is service- and application-specific, and there is little discussion of general-purpose edge computing.

Our vision
In this work, rather than an application-specific edge-computing service, we consider edge-computing systems for general purposes. We analyze the business model, identify the main issues, and provide a solution. We illustrate general-purpose edge computing in Fig. 1. ECPs operate multiple distributed edge datacenters, which are interconnected with high-speed networks. Recently, adopting C-RAN has been a major trend in mobile-network evolution. In this work, we consider the case of edge datacenters co-located with C-RANs. With a distributed edge cloud network, an ECP provides general-purpose edge-computing services to mobile service providers and users, much like a distributed cloud provider. Specifically, ECPs provide isolated environments with virtual machines (VMs) or containers in which mobile devices associated with specific mobile services (identified by service ID) can execute computing tasks. The general-purpose edge-computing framework is supposed to involve three kinds of business entities (ECPs, edge service providers, and mobile users) that have flexible business relationships among them. For example, a mobile user could subscribe to a mobile service from an edge service provider, and the edge service provider has a contract with the ECP so that, when the ECP receives a request associated with the service, the ECP allocates the VMs to the mobile device and charges the edge service provider. Fig. 1 shows the monetary flow. Notice that the ECP and edge service provider are logical and conceptual roles. In practice, the mobile-network carrier is likely to act as an ECP because it operates the datacenters of the C-RAN. The ECP could also be an edge service provider that provides original services to mobile users.

Challenges and contributions
From the viewpoint of ECPs, a complicated joint optimization problem exists regarding VM allocation and public data placement. As for the resource-allocation problem, due to the heterogeneous physical resources of distributed edge datacenters, optimizing resource-allocation decisions without statistical knowledge of future requests is challenging. To maximize long-term revenue, maintaining the availability of each resource in each edge datacenter is important. Allocating resources considering only proximity could deplete certain resources in certain edge datacenters under specific request sequences. Besides revenue, the cost incurred by inter-cloud data transportation cannot be ignored. Placing popular data in distributed caches helps reduce inter-cloud data transportation, but resource-allocation and data-placement decisions have complicated interactions. For example, in Fig. 1, assembling the type-2 VM in C-RAN 1 can save the inter-cloud data-transportation cost of fetching o_1 and o_4; however, it would also cause a possible resource-availability issue for C-RAN 1. Caching o_1 in C-RAN 2 earlier helps save the inter-cloud data-transportation cost if a type-2 VM is assembled in C-RAN 2, but due to the size limit of the caching device, other data may have to be evicted. Without knowing the future request sequence, it is challenging to design an algorithm that makes such online decisions. Even if the future request sequence can be predicted well, well-known online decision frameworks, such as the Markov decision process, fail to compute results in a reasonable period given a huge state space. In this article, by comprehensively analyzing the properties and interactions of resource-allocation and data-placement decisions, we present an online mechanism that solves the joint optimization problem efficiently and needs no prior knowledge of future request sequences.

Article organization
The remainder of this article is organized as follows: Section 2 presents the problem definition with a mathematical model, and Section 3 presents the detailed solution, including the framework and subroutines. The performance of the proposed method is analyzed theoretically in Section 4 and experimentally in Section 5. We then introduce related work in Section 6 and conclude the article in Section 7.

SYSTEM MODEL AND PROBLEM FORMULATION
In this section, we present our vision of general-purpose edge computing in detail, followed by the settings of the considered model. Finally, we formulate the time-average revenue-optimization problem subject to various constraints. Tab. 1 summarizes the key notations used in this article.

Settings
We assume that an ECP operates multiple geographically distributed edge clouds interconnected with a high-speed network.

Tab. 1: Key notations
- Q(T): the virtual queue backlog of coarse-grained time slot T
- d_{i,o}(T): the aggregated demand from edge cloud i for data o during coarse-grained time slot T
- S_i: the placed public data set in edge cloud i
- S_i: the cache size in edge cloud i
- α_l: the Lagrangian multiplier corresponding to constraint (1)
- β_{i,r,t}: the Lagrangian multiplier corresponding to constraint (2)

Each edge cloud is indexed by i (sometimes j). The latency between edge clouds i and j is denoted as w_{i,j}. Each edge cloud has multiple kinds of physical resources (e.g., CPU, memory, and storage) that can be used to assemble specific types of VMs for mobile users. We index a specific resource with r. The amount of resource r needed to assemble one type-k VM is denoted as g_{k,r}. The ECP provides K types of VM, each of which is indexed with k, and obtains revenue at the rate p_k for providing a type-k VM per unit time. In addition to the resources for computing purposes, there are also low-cost caching devices with which an edge cloud can place popular data locally to reduce inter-cloud data-transportation costs and better serve mobile users. The size of data o is denoted as s_o, and the size of the cache in edge cloud i is denoted as S_i. The ECP operates caching as a supplementary mechanism to reduce the operational cost, and it is transparent to mobile service providers and users. We index requests with l. A request contains the service identifier, a time length, a set of VMs associated with the data list to process, and the data list to upload. The service identifier allows the ECP to identify the associated mobile service provider for charging purposes.
The time length is used to specify the period of VM usage.
The data requested could either be existing data on the Internet (public data) or uploaded by the mobile user that initiates the request (private data). For example, in Fig. 1, the requested data o_1 is public (i.e., can be found on the Internet), while o_4 is uploaded from a mobile device. We denote the set of data from request l associated with the type-k VM as O_{l,k}. For a request l, there are various resource-allocation strategies that satisfy the request. Denoting A ∈ A_l as a specific feasible strategy, we associate a binary decision variable x_A^l ∈ {0, 1} with it; x_A^l equals 1 if strategy A is adopted for l and 0 otherwise. A request needs at most one resource-allocation strategy to be satisfied. Formally, we have the following constraint on x_A^l:

$$\sum_{A \in \mathcal{A}_l} x_A^l \le 1 \quad \forall l. \qquad (1)$$
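To make the resource model concrete, the following minimal sketch checks whether a candidate allocation strategy fits the capacities of the edge clouds it touches. The dictionaries `g` and `capacity` and the helper `strategy_fits` are illustrative names, not part of the paper's notation.

```python
# Sketch of the resource model: VM type k needs g[k][r] units of resource r,
# and a feasible allocation strategy assigns VM counts to edge clouds.
# All names and numbers are illustrative.
g = {1: {"cpu": 10, "mem": 20, "disk": 30},
     2: {"cpu": 30, "mem": 20, "disk": 10}}

def strategy_fits(strategy, capacity):
    """strategy: {(edge_cloud, vm_type): count}; capacity: {edge_cloud: {r: amount}}.
    Returns True if every touched edge cloud can host its share of the strategy."""
    need = {}
    for (i, k), n in strategy.items():
        for r, amount in g[k].items():
            need.setdefault(i, {}).setdefault(r, 0)
            need[i][r] += n * amount
    return all(need[i][r] <= capacity[i][r] for i in need for r in need[i])

capacity = {"c1": {"cpu": 100, "mem": 100, "disk": 100}}
print(strategy_fits({("c1", 1): 2}, capacity))   # 20/40/60 units fit -> True
print(strategy_fits({("c1", 2): 4}, capacity))   # 120 cpu units needed -> False
```

A request's strategy set A_l would enumerate such mappings, and the constraint (1) above says at most one of them is selected.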

Time scales
In this article, we adopt two kinds of time scales: a coarse-grained time slot indexed by T and a fine-grained time slot indexed by t. The coarse-grained time slot captures long-term properties, such as time-average revenue, time-average transportation cost, and data popularity. We make decisions related to these relatively time-insensitive properties on the coarse-grained time scale. Besides the time-insensitive properties and tasks, there are also time-sensitive ones; for example, when a request arrives, the ECP should not buffer the request until the end of a certain time slot but must allocate resources immediately. For resource allocation, because the decision is subject to the amount of each kind of resource in each edge cloud, we introduce a dual variable for every such constraint to evaluate the shadow price of the physical resource. Specifically, we construct the dual problem from the primal problem. The arrival of a new request l introduces a new set of decision variables x_A^l to the primal problem and a new set of constraints to the dual problem. When a certain resource is allocated, we increase the dual variable (shadow price) associated with the resource exponentially to 1) reflect the scarcity of the resource and 2) keep the ratio of the primal-value increment to the dual-value increment fixed so that the solutions are competitive against any sequence of request arrivals. However, when a request finishes, the allocated resources are released, increasing the amount of available resources; correspondingly, the competitiveness changes. For this reason, we further discretize each coarse-grained time slot into multiple fine-grained time slots. In a fine-grained time slot t, we only consider the requests arriving during t and neglect the finished requests. At the end of t, we update the available resource amounts by accounting for all resources released during t and initiate a new primal-dual problem pair for t + 1.
Correspondingly, the available resource amount is subscripted by t, and for each t, the following constraint must be satisfied:

$$\sum_{l:\, t \in t_l} \sum_{A \in \mathcal{A}_l} x_A^l \sum_k N_{A,i,k}^l\, g_{k,r} \le c_{i,r,t} \quad \forall i, r, \qquad (2)$$

where t_l is the set of fine-grained time slots covered by l's request, g_{k,r} denotes the amount of resource r needed to assemble each type-k VM, and N_{A,i,k}^l is the number of type-k VMs allocated in edge cloud i if the allocation strategy A ∈ A_l is adopted. This method helps approximate the theoretically competitive solutions. In this work, the requests arrive at arbitrary times, and our method responds immediately rather than buffering the requests until the end of a time slot.
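The fine-grained bookkeeping described above can be sketched as follows: within a slot t, allocations are subtracted immediately, while releases are batched and applied only at the slot boundary. The class and method names are illustrative assumptions.

```python
# Hedged sketch of the per-slot bookkeeping: allocations take effect
# immediately; releases are deferred to the end of the fine-grained slot,
# after which a fresh primal-dual problem pair starts for t+1.
class SlotResources:
    def __init__(self, initial):
        self.available = dict(initial)   # c_{i,r,t} for the current slot
        self.pending_release = []        # resources freed during t

    def allocate(self, demand):
        if all(self.available[r] >= demand[r] for r in demand):
            for r in demand:
                self.available[r] -= demand[r]
            return True
        return False

    def release(self, demand):
        self.pending_release.append(demand)   # deferred to the slot boundary

    def end_of_slot(self):
        for demand in self.pending_release:
            for r in demand:
                self.available[r] += demand[r]
        self.pending_release.clear()

res = SlotResources({"cpu": 100})
res.allocate({"cpu": 60})
res.release({"cpu": 60})
print(res.available["cpu"])  # 40: the release is not yet visible within t
res.end_of_slot()
print(res.available["cpu"])  # 100 after the boundary update
```

This mirrors the text's rule of neglecting finished requests inside a fine-grained slot and updating the available amounts only at its end.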

Revenue
In this work, instead of the revenue during a specific period, we aim to maximize long-term ECP revenue, which we present with a time-average formulation. Specifically, denoting the revenue obtained from a coarse-grained time slot T as R(T), we have

$$R(T) = \sum_{l:\, \tau_l \in T} \sum_{A \in \mathcal{A}_l} x_A^l \sum_{i,k} N_{A,i,k}^l\, p_k\, |t_l|,$$

where τ_l is the arrival time of l, and p_k is the price rate for a type-k VM. Given the revenue obtained per time slot, we can define the time-average revenue of the ECP as

$$\bar{R} = \lim_{T \to \infty} \frac{1}{T} \sum_{T'=0}^{T-1} R(T').$$
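The per-slot revenue and its running time-average can be computed directly from the definitions above; a minimal sketch follows, with hypothetical request tuples.

```python
# Illustrative computation of per-slot revenue R(T) and its time-average.
# The request tuples (vm_type, vm_count, time_length) are hypothetical.
p = {1: 10, 2: 20}   # price rate p_k per type-k VM per unit time

def slot_revenue(requests):
    """requests: list of (vm_type, vm_count, time_length) served in slot T."""
    return sum(p[k] * n * length for k, n, length in requests)

history = [slot_revenue([(1, 2, 3)]),   # R(0) = 10 * 2 * 3 = 60
           slot_revenue([(2, 1, 5)])]   # R(1) = 20 * 1 * 5 = 100
time_avg = sum(history) / len(history)
print(time_avg)   # 80.0
```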

Data-transportation costs
Inter-cloud data transportation is inevitable; it decreases the quality of service (QoS) due to latency and, even worse, forces the ECP to use the limited inter-cloud links to transport bulk data at high cost. In this article, inter-cloud data-transportation costs occur for two reasons: 1) when the ECP does not provision the VMs for a mobile user in the edge cloud with which the mobile device is associated, the ECP must transport the user's uploaded data over the inter-cloud links to the edge clouds where the VMs are provisioned; 2) if a user's VM cannot find the data to be processed in the local cache, the VM fetches the data from neighboring edge clouds, which incurs inter-cloud data-transportation costs. In this article, we denote the data-transportation cost for data o with size s_o between edge clouds i and j as s_o w_{i,j}, where w_{i,j} is the latency between i and j. We also denote the data cached in edge cloud i as S_i and the dataset required by l associated with each type-k VM as O_{l,k}. Given the data requirements and denoting f_{i,j,o} as the amount that i obtains from j for o, we can formulate the transportation cost in a coarse-grained time slot T as

$$C(T) = \sum_i \sum_o \sum_j f_{i,j,o}\, s_o\, w_{i,j}. \qquad (5)$$

Given that the transportation cost for unit traffic between two locations is fixed, the fractional routing f_{i,j,o} becomes superfluous because the request for object o is always directed to the j with the lowest w_{i,j}, that is,

$$j^*(i,o) = \arg\min_{j:\, o \in S_j} w_{i,j}.$$

Thus, (5) becomes

$$C(T) = \sum_i \sum_{o \notin S_i} d_{i,o}(T)\, s_o\, w_{i,j^*(i,o)}.$$

The transportation cost is inevitable and should be kept under a certain level. However, the transportation-cost constraint differs from the resource-capacity constraints in that the resource capacity can never be violated, whereas the transportation-cost constraint may be violated in certain time slots. In particular, in this article, we consider controlling the time-average transportation cost under a predetermined number C.
Specifically, we denote the time-average transportation cost as

$$\bar{C} = \lim_{T \to \infty} \frac{1}{T} \sum_{T'=0}^{T-1} C(T'),$$

and let

$$\bar{C} \le C. \qquad (8)$$

With such relaxation, we can better take advantage of the request dynamics.
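The nearest-source rule above, where each cache miss for object o at cloud i is served from the cheapest neighbor holding o, can be sketched as a per-slot cost computation. The function and variable names are illustrative.

```python
# Sketch of the simplified transportation cost: once the per-unit cost is
# fixed, each miss is served from the j minimizing w[i][j] among the clouds
# that hold the object. All names are illustrative.
def slot_transport_cost(demand, placed, size, w):
    """demand: {(i, o): request count}; placed: {i: set of cached objects};
    size: {o: s_o}; w: {i: {j: latency}}. Returns C(T) for the slot."""
    cost = 0.0
    for (i, o), d in demand.items():
        if o in placed.get(i, set()):
            continue                               # local hit: no inter-cloud cost
        sources = [j for j in w[i] if o in placed.get(j, set())]
        if sources:
            cost += d * size[o] * min(w[i][j] for j in sources)
    return cost

w = {"c1": {"c2": 2.0, "c3": 5.0}}
placed = {"c1": set(), "c2": {"o1"}, "c3": {"o1"}}
print(slot_transport_cost({("c1", "o1"): 3}, placed, {"o1": 4.0}, w))  # 3 * 4 * 2 = 24.0
```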
Notice that the above problem definition contains two subproblems: the computing resource-allocation problem and the public data-placement problem. The two problems have different temporal and spatial characteristics. Temporally, because a request can arrive at an arbitrary time, the ECP must make computing resource-allocation decisions immediately; however, because the popularity distribution of the data changes relatively slowly, data placement should be determined based on observations over a relatively long period, and frequent re-allocation does not help enhance the hit ratio. Spatially, VM migration between edge clouds is complicated, so once a computing resource-allocation decision is made, it does not change in the future, whereas data should be re-allocated to improve the hit rate and thus reduce the data-transportation cost. In the next section, we describe our approach to solving the computing resource-allocation and public data-placement optimization problems jointly.

THE ONLINE APPROACH TO SOLVE THE JOINT OPTIMIZATION PROBLEM
In this section, we present our approach to solving the computing resource-allocation and data-placement problems jointly. Our approach works with hybrid timescales: a fine-grained time slot for computing resource allocation and a coarse-grained time slot for public data placement. Fig. 2 shows the framework.

Fig. 2: Framework of the online mechanism with hybrid timescales

For the coarse-grained timescale, we notice that constraint (8) has a time-average formulation, which allows us to trade off the objective function and the constraint violation. For this purpose, we introduce a virtual queue whose length represents the "budget" for transportation costs that can be "consumed" in the next coarse-grained time slot, and then we employ the drift-plus-penalty method to optimize the long-term revenue and stabilize the virtual queue. Specifically, at the beginning of each coarse-grained time slot, we accumulate the transportation cost from the previous time slot, update the backlog of the virtual queue, and then make computing resource-allocation decisions to optimize the sum of the weighted revenue and the "drift" of the virtual queue. This drift-plus-penalty-based method has attracted much interest in recent years because it can provide near-optimal solutions without assumptions about future knowledge. However, it usually works in a buffer-and-decide manner, that is, buffering the requests for a certain amount of time and then deciding how to deal with the buffered requests. For this reason, the vanilla drift-plus-penalty method works well for delay-tolerant tasks but is incapable of addressing real-time problems. In this work, we overcome this shortcoming of the drift-plus-penalty method by introducing a primal-dual online algorithm to address the real-time requirement of computing resource allocation. Because the vanilla form of the primal-dual algorithm cannot be applied to the scenario of tasks finishing in a finite amount of time, we further discretize the
coarse-grained time slot into fine-grained time slots to approximate the results obtained from the vanilla primal-dual algorithm.
When a request arrives, it is responded to in real time. At the end of each coarse-grained time slot, we re-allocate the data among the edge clouds to reflect the data-popularity change in the past time slot. With the proposed method, the immediate effect of computing resource-allocation decisions can be reflected in the long term, and long-term data-allocation decisions can guide computing resource allocation for a relatively long period.

The online joint optimization framework
We introduce a virtual queue to represent the difference between the actual transportation cost and the predetermined upper bound C. By introducing the virtual queue, constraint (8) can be satisfied with queue-stabilization technology. Specifically, we denote the queue backlog of time slot T as Q(T), and the dynamics are defined as

$$Q(T+1) = \max\left[Q(T) + C(T) - C,\; 0\right], \qquad (10)$$

where C is the predetermined upper bound of the time-average transportation cost. The fact that the virtual queue is stable implies that constraint (8) is satisfied. This can be seen from the following lemma:

Lemma 1. If the virtual queue is rate stable, i.e., lim_{T→∞} Q(T)/T = 0, then constraint (8) is satisfied.

Proof. From (10), we have

$$Q(T+1) \ge Q(T) + C(T) - C,$$

so summing over time slots yields Q(T) − Q(0) ≥ Σ_{T'=0}^{T−1} (C(T') − C). Dividing by T and taking the limit as T → ∞, we have

$$\lim_{T \to \infty} \frac{Q(T)}{T} \ge \bar{C} - C.$$

From [4], we know that the virtual queue is rate stable only if the left-hand side equals zero, which yields C̄ ≤ C. Thus, the lemma is proved.
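The queue dynamics (10) amount to a one-line update; the following minimal sketch traces the backlog over a few illustrative per-slot costs.

```python
# Minimal sketch of the virtual-queue dynamics (10): the backlog grows by
# the cost overshoot C(T) - C and is truncated at zero.
def update_queue(Q, cost, C):
    return max(Q + cost - C, 0.0)

Q = 0.0
for cost in [30.0, 50.0, 10.0]:   # illustrative per-slot transportation costs
    Q = update_queue(Q, cost, C=35.0)
# Trace: 0 -> 0 (30 < 35) -> 15 (50 > 35) -> 0 (15 + 10 < 35)
print(Q)   # 0.0
```

A persistently positive backlog signals that the time-average cost is running above C, which is exactly what the drift term penalizes.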
To stabilize the virtual queue, we define the Lyapunov function as

$$L(T) = \frac{1}{2} Q(T)^2$$

and the 1-slot drift as

$$\Delta(T) = L(T+1) - L(T).$$

Following the drift-plus-penalty method [4], we can keep the time-average revenue R(T) within a constant optimality gap and stabilize the virtual queue simultaneously by maximizing a lower bound of the following term in each coarse-grained time slot:

$$V R(T) - \Delta(T), \qquad (11)$$

where V is a predetermined non-negative parameter that controls the trade-off between revenue and the data-transportation cost violation. We assume that C^2(T) is deterministically upper-bounded by C_max^2; a lower bound of the above term can then be obtained from the dynamics of the virtual queue. In particular,

$$\Delta(T) \le B + Q(T)\left(C(T) - C\right),$$

where

$$B = \frac{C_{\max}^2 + C^2}{2}.$$

Multiplying both sides of the above inequation by (−1) and adding V R(T) to each side yields a lower bound of (11):

$$V R(T) - \Delta(T) \ge V R(T) - Q(T)\left(C(T) - C\right) - B. \qquad (13)$$

In each coarse-grained time slot, we greedily maximize the right-hand side of (13) subject to constraints (1) and (2); in the long run, we approximate the global optimum and keep the virtual queue stable. Alg. 1 summarizes the entire process:

Algorithm 1 The online joint optimization framework

Input:
The backlog of the virtual queue: Q(T ) Output: The computing resource allocation decision {x l A } The public data-placement decision {S i } 1: Update the virtual queue in accordance with (10) 2: for each request arriving between T and T + 1 do In the following sections, we present the two subroutines of this algorithm in detail.

Online computing-resource allocation
The requests arrive at arbitrary times, and the original drift-plus-penalty method only works in a buffering-and-deciding manner. In this work, we make a fundamental extension of the drift-plus-penalty algorithm with a primal-dual approach. To explain in detail, we reformulate the optimization problem of a coarse-grained time slot T as follows:

$$\max \sum_l \sum_{A \in \mathcal{A}_l} R_A^l\, x_A^l \quad \text{s.t. (1) and (2)},$$

where R_A^l denotes the objective contribution (the V-weighted revenue minus the Q(T)-weighted transportation cost) of serving request l with strategy A. For convenience, we denote the component of edge cloud i in R_A^l as R_{A,i}^l, and it is clear that R_A^l = Σ_i R_{A,i}^l. We introduce the Lagrangian multipliers α_l for constraint (1) and β_{i,r,t} for constraint (2), respectively, so we have the dual problem as follows:

$$\min \sum_l \alpha_l + \sum_{i,r,t} c_{i,r,t}\, \beta_{i,r,t} \quad \text{s.t.} \quad \alpha_l \ge R_A^l - \sum_i \sum_r \sum_{t \in t_l} \beta_{i,r,t} \sum_k N_{A,i,k}^l\, g_{k,r} \quad \forall l,\, A \in \mathcal{A}_l.$$

With the Lagrangian multipliers, we propose the following online algorithm to respond to users' requests:

Algorithm 2 Online computing-resource allocation algorithm
Input:
The backlog Q(T) of the virtual queue
The public data-placement profile {S_i}
The request sequence (indexed by l)
Output: Computing resource-allocation decisions
1: Initialize β_{i,r,t} ← 0 ∀i, r, t
2: for each request l do
3:    Compute the A* ∈ A_l that maximizes R_A^l − Σ_i Σ_r Σ_{t∈t_l} β_{i,r,t} Σ_k N_{A,i,k}^l g_{k,r}
4:    if this maximum is non-positive then reject the request
5:    else
6:        Allocate resources to l as A*
7:        Update β_{i,r,t} for each consumed resource
8:    end if
9: end for
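The flavor of Alg. 2's admission rule can be sketched as follows. The paper's exact multiplicative update for β is not reproduced here; the sketch substitutes the classic online-packing rule β ← β(1 + use/c) + use/((e − 1)c) as a stand-in, and all names are illustrative.

```python
import math

# Sketch of a primal-dual admission rule with a multiplicative shadow-price
# update, in the spirit of Alg. 2. The update rule below is the standard
# online-packing rule, used here as an assumed stand-in for the paper's.
def admit(value, use, beta, cap):
    """value: objective gain of the best strategy A*;
    use/cap/beta: per-resource dicts of usage, capacity, and shadow price."""
    shadow = sum(beta[r] * use[r] for r in use)
    if value <= shadow or any(beta[r] >= 1.0 for r in use):
        return False, beta                     # reject: the dual constraint binds
    new_beta = dict(beta)
    for r in use:
        frac = use[r] / cap[r]
        new_beta[r] = beta[r] * (1 + frac) + frac / (math.e - 1)
    return True, new_beta

beta = {"cpu": 0.0}
ok, beta = admit(10.0, {"cpu": 10}, beta, {"cpu": 100})
print(ok, beta["cpu"] > 0)   # True True: admitted, and the price rises from 0
```

The exponential growth of β is what makes a nearly exhausted resource prohibitively expensive, so later low-value requests are rejected before the capacity constraint is actually violated.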

Public data placement
Notice that the public data-popularity distribution reflects relatively long-term aggregate requirements; we do not update the data placement on the arrival of individual requests but at the end of each coarse-grained time slot.
We denote the size of content o as s_o and the size of the storage in edge cloud i as S_i. Given the aggregated data demand in time slot T, the data-placement problem can be modeled as a transportation-cost minimization problem subject to the storage-size constraints. Remember that d_{i,o}(T) is the aggregated data demand in edge cloud i for o during time slot T. For edge cloud i, we denote the placed data set as S_i. We call a data placement in i feasible if the total size of the placed data does not exceed the storage size, that is,

$$\sum_{o \in S_i} s_o \le S_i.$$

We define the set of all the feasible data-placement profiles of i as F_i and introduce an indicator vector y_i = (y_{i,F}) ∈ {0, 1}^{|F_i|} to denote whether a feasible data-placement profile is adopted, where |F_i| is the number of all feasible data-placement profiles of i. For example, suppose the universe of data is {o_1, o_2, o_3} with s_{o_1} + s_{o_2} ≤ S_i, s_{o_1} + s_{o_3} ≤ S_i, and s_{o_1} + s_{o_2} + s_{o_3} > S_i; then {o_1, o_2} and {o_1, o_3} are feasible profiles in F_i, while {o_1, o_2, o_3} is not. Only one feasible caching profile in F_i can be selected for a specific time slot. We then have the following data-placement problem:

$$\min \sum_i \sum_{F \in \mathcal{F}_i} y_{i,F} \sum_{o \notin F} d_{i,o}(T)\, s_o\, w_{i,j^*(i,o)}, \qquad (22)$$

$$\text{s.t.} \quad \sum_{F \in \mathcal{F}_i} y_{i,F} = 1 \quad \forall i, \qquad y_{i,F} \in \{0, 1\}, \qquad (23)$$

where j*(i,o) is the nearest edge cloud holding o.

Lemma 2. The public data-placement problem defined with (22) and (23) is NP-hard.
The NP-hardness follows from a reduction from the generalized assignment problem (GAP). Before presenting our approximation algorithm, we introduce some important concepts needed for further exposition [5].
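The feasible-profile set F_i above is small enough to enumerate in toy instances; the following sketch does so by brute force and reproduces the {o_1, o_2, o_3} example, with illustrative sizes.

```python
from itertools import combinations

# Illustrative enumeration of the feasible placement profiles F_i for one
# edge cloud: every subset of the data universe whose total size fits the
# cache S_i. The sizes below are hypothetical.
def feasible_profiles(sizes, cache_size):
    objs = list(sizes)
    out = []
    for r in range(len(objs) + 1):
        for combo in combinations(objs, r):
            if sum(sizes[o] for o in combo) <= cache_size:
                out.append(frozenset(combo))
    return out

sizes = {"o1": 4, "o2": 3, "o3": 3}
F_i = feasible_profiles(sizes, cache_size=7)   # pairs fit; all three (size 10) do not
print(frozenset({"o1", "o2"}) in F_i, frozenset(sizes) in F_i)   # True False
```

In practice |F_i| is exponential in the number of objects, which is precisely why the greedy approximation of Alg. 3 is used instead of exact enumeration.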

Definition 1 (Submodular function). Let X be a finite set. A function f : 2^X → R is submodular if, for all A ⊆ B ⊆ X and i ∈ X \ B,

$$f(A \cup \{i\}) - f(A) \ge f(B \cup \{i\}) - f(B).$$

An equivalent definition of "submodular" is based on the marginal value. Denoting the marginal value of i with respect to A as f_A(i) = f(A ∪ {i}) − f(A), f is submodular if f_A(i) ≥ f_B(i) for all A ⊆ B ⊆ X and i ∈ X \ B.

Definition 2 (Simple partition matroid). Let X be a ground set partitioned into l disjoint sets X_1 ∪ X_2 ∪ ... ∪ X_l with associated integers k_1, k_2, ..., k_l, and let I = {A ⊆ X : |A ∩ X_i| ≤ k_i, i = 1, ..., l}. Then M = (X, I) is a partition matroid. Specifically, if k_i = 1 ∀i, the matroid is called a "simple partition matroid."

We define the ground set X as {(i, F) | F ∈ F_i} and let I contain the subsets of X with at most one pair (i, F) per edge cloud i, so that each edge cloud adopts at most one feasible profile. We can then reformulate the data-placement optimization problem as selecting an independent set in M = (X, I).

Lemma 3. The function f({(i, F_i)}) is a monotone supermodular function.
The proofs of Lemmas 3 and 4 are similar to the proofs of Lemmas 3 and 2 of [6]. Thus, we establish that the data-placement problem is the minimization of a monotone supermodular function subject to simple-partition-matroid constraints. Therefore, a greedy algorithm, such as the one shown in Alg. 3, obtains a 1/2-competitive solution.

Input:
The aggregated data requirements {d_{i,o}(T)}
The storage sizes {S_i}
The size of each data item {s_o}
Output: The data-placement decision S
Initialize S ← ∅; I ← {i}
1: while I ≠ ∅ do
2:    for i ∈ I do
3:        Find the profile F_i ∈ F_i with the largest marginal benefit f_S(F_i)
4:    end for
5:    Let (i*, F*) be the (edge cloud, profile) pair with the largest marginal benefit
6:    S ← S ∪ {F*}; I ← I \ {i*}
7: end while

Notice that we use f_S(F_i) to denote the marginal benefit of adding F_i to the already-chosen set S.
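The greedy rule of Alg. 3 can be sketched as follows: each round commits the (edge cloud, profile) pair with the largest marginal benefit, and each cloud receives exactly one profile, respecting the simple partition matroid. The saving function used here is a simplified stand-in (locally served demand), not the paper's exact objective.

```python
# Sketch of the greedy selection in Alg. 3. `saving` is an assumed stand-in
# marginal-benefit function; all names and numbers are illustrative.
def greedy_placement(clouds, profiles, saving):
    """clouds: iterable of ids; profiles: {i: [candidate profile sets]};
    saving(i, F, chosen): marginal benefit of placing F at i given `chosen`."""
    chosen = {}
    remaining = set(clouds)
    while remaining:
        i, F = max(((i, F) for i in remaining for F in profiles[i]),
                   key=lambda p: saving(p[0], p[1], chosen))
        chosen[i] = F          # one profile per cloud: simple partition matroid
        remaining.discard(i)
    return chosen

demand = {("c1", "o1"): 5, ("c2", "o1"): 1, ("c2", "o2"): 4}
def saving(i, F, chosen):
    # Stand-in benefit: demand served locally by caching F at cloud i.
    return sum(d for (j, o), d in demand.items() if j == i and o in F)

result = greedy_placement(["c1", "c2"],
                          {"c1": [{"o1"}], "c2": [{"o1"}, {"o2"}]}, saving)
print(result["c2"])   # {'o2'}: locally, o2 has higher demand than o1
```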

PERFORMANCE ANALYSIS
In this section, we present a performance analysis of the proposed method. We first analyze the performance of the computing resource-allocation algorithm and the data-placement algorithm within a coarse-grained time slot and then present the long-term performance guarantee of the ECP's revenue.

The optimality within a coarse-grained time slot
We begin analyzing online computing-resource allocation for a certain coarse-grained time slot T with the following claim.
Proposition 1. Alg. 2 returns an integral solution and has the competitive ratio 1 − 1/e.
We prove this claim by showing the feasibility of the dual problem with Lemma 5, the primal-to-dual ratio with Lemma 6, and the near-feasibility of the primal problem with Lemma 7.

Lemma 5. Alg. 2 produces a feasible dual solution.
Proof. Consider a dual constraint corresponding to request l. If the request is rejected, the constraint is already satisfied by the current multipliers. Otherwise, the algorithm allocates the resources to l as A*, and setting α_l to the maximum value computed in step 3 of Alg. 2 guarantees that the constraint is satisfied for all l.

Lemma 6. Denoting by δP and δD the changes in the primal and dual objective values in one iteration of Alg. 2, in each iteration, δP = (1 − 1/e)δD.
Proof. Whenever the algorithm updates the primal solution, the change in the objective function is R_{A*}^l, where A* denotes the allocation for request l. The change in the dual objective function has two parts: the change corresponding to α_l and the change corresponding to β_{i,r,t}. Denoting the former as δD_α and the latter as δD_β, we have δD = δD_α + δD_β. A direct computation of the two updates then yields δP = (1 − 1/e)δD. Thus, the lemma is proved.
Lemma 7. Algorithm 2 produces an almost-feasible primal solution in which each resource constraint c_{i,r,t} is violated by at most max_{i,r,t} Σ_k N_{A*,i,k}^l g_{k,r}.
Proof. Consider a primal constraint for resource r in edge cloud i at time t. Whenever we increase some x_A^l to 1, we increase β_{i,r,t} by some factor, so the value of β_{i,r,t} behaves like a geometric sequence. Formally, we claim that

$$\beta_{i,r,t} \ge \frac{1}{e-1}\left(\exp\!\left(\frac{\sum_l \sum_k N_{A^*,i,k}^l\, g_{k,r}}{c_{i,r,t}}\right) - 1\right).$$

This can be proved by induction. Note that β_{i,r,t} = 0 initially, so the statement is trivially true. We assume that the inequality is satisfied up to request l − 1. After the arrival of l, the first inequality follows from the updating rule of β_{i,r,t} in Alg. 2, the second follows from the fact that (1 + x) approaches e^x as Σ_k N_{A*,i,k}^l g_{k,r}/c_{i,r,t} approaches 0, and the third is achieved by scaling p_k. Consequently, Σ_l Σ_k N_{A*,i,k}^l g_{k,r} ≥ c_{i,r,t} implies that β_{i,r,t} ≥ 1, in which case we reject any further request, so the resource constraint c_{i,r,t} is violated by at most max_{i,r,t} Σ_k N_{A*,i,k}^l g_{k,r}, which is far less than c_{i,r,t}.

Optimality of public data placement
Proposition 2. The proposed public data-placement algorithm (i.e., Alg. 3) achieves a competitive ratio of 1/2.
This follows from the proposed algorithm greedily optimizing a monotone submodular set function over constraints associated with a matroid.
Optimality against the N-slot look-ahead mechanism

Because the request arrival rates and the data popularities are arbitrary, finding the globally optimal solution is difficult. Instead of comparing with the optimal result directly, we introduce an N-slot look-ahead mechanism as an approximation of the global optimal revenue. The N-slot look-ahead mechanism has the same objective function as (9) but assumes that the request sequence and data-popularity changes in N coarse-grained time slots are known in advance. In particular, in the N-slot look-ahead mechanism, time is divided into frames, each consisting of N coarse-grained time slots. Suppose the zth time frame consists of the following coarse-grained time slots: {zN, zN + 1, ..., zN + N − 1}. In each time frame, the frame revenue is maximized subject to constraints (1), (2), and the frame-average transportation-cost constraint, where z = 0, 1, .... The N-slot look-ahead mechanism assumes that all the request arrivals in the coming N coarse-grained time slots are known in advance and that the data-popularity estimation is perfect, thus approximating the global optimal solution. We show that the time-average revenue R(T) yielded by the proposed method has a constant gap from the result of the N-slot look-ahead mechanism.
Theorem 1. Let R_N(z) denote the optimal objective function value of the N-slot look-ahead problem in the zth time frame, and consider a period of ZN coarse-grained time slots, where Z is a constant. We have

$$\frac{1}{ZN}\sum_{T=0}^{ZN-1} R(T) \ge \left(1 - \frac{1}{e}\right)\frac{1}{ZN}\sum_{z=0}^{Z-1} R_N(z) - \frac{B N}{V}.$$

Proof. Recall that for the 1-slot drift, we have

$$\Delta(T) \le B + Q(T)\left(C(T) - C\right),$$

where B = (C_max^2 + C^2)/2. We multiply both sides by −1 and add V R(T) to each side:

$$V R(T) - \Delta(T) \ge V R(T) - Q(T)\left(C(T) - C\right) - B \ge \left(1 - \frac{1}{e}\right) V R^*(T) - Q(T)\left(C^*(T) - C\right) - B,$$

where R*(T) is the revenue and C*(T) is the data-transportation cost of coarse-grained time slot T under any alternative resource-allocation decision. The last inequality holds because the proposed method maximizes V R(T) − Q(T)(C(T) − C) − B for every T with the primal-dual online algorithm, yielding a (1 − 1/e)-competitive solution (Proposition 1). Now consider a process starting from coarse-grained time slot zN + n, with n ∈ {0, ..., N − 1}. Substituting the above bound into the drift over the frame and denoting the N-slot drift as Δ_N(zN) = L(zN + N) − L(zN), we sum the inequality over n = 0, ..., N − 1, take the alternative decisions to be those of the N-slot look-ahead mechanism, and then sum over z ∈ {0, ..., Z − 1}; the telescoping queue terms cancel. Dividing both sides of the resulting inequality by V N Z proves the theorem.

PERFORMANCE EVALUATION
Basic settings. To evaluate the proposed method, we developed a discrete-event simulator in MATLAB. In this section, we present the evaluation results of the simulations.
We consider an ECP of moderate scale that operates 5 edge clouds. Each cloud has three kinds of resources (e.g., CPU, memory, and storage) for mobile-task computing. To keep the research general, we do not specify the resources. We assume the resources are divisible and that the initial amount of each resource is 5,000 in each location. The resources can be used to assemble two types of VMs. A type-1 VM needs 10, 20, and 30 units of the respective resources, while a type-2 VM needs 30, 20, and 10. The price of a type-1 VM is 10 per unit time, and the price of a type-2 VM is 20. Besides the resources needed to assemble VMs, we also assume that there are caching devices in each edge cloud that can transparently cache popular data. The size of the caching device is the same in each edge datacenter. Unless otherwise specified, the total cache size is 40% of the size of the universal content. In this research, we consider not only the data that can be obtained from the Internet (public data) but also the data uploaded by mobile users (private data). Unless otherwise specified, we set the total amount of private data to twice the amount of public data. The edge clouds are interconnected with high-speed networks. The end-to-end latency of fetching data from the local cache, from a neighboring cache, and from the remote cloud is randomly assigned following uniform distributions in the ranges [5, 10] ms, [20, 50] ms, and [100, 200] ms, respectively [7]. We discretize time with coarse-grained and fine-grained time slots, with each coarse-grained time slot containing 500 fine-grained time slots. The lifetime of each VM is assumed to be uniformly distributed in the range [1, 5] fine-grained time slots. The data popularity follows a Zipf distribution with an exponent of 0.6 [7]. In the rest of this section, we compare the proposed method with some benchmark methods and discuss the impact of the parameters.

Evaluation with dynamics of VM requests. In this simulation, we let requests follow Poisson
arrivals; moreover, the expected arrival rate λ varies randomly from 0 to 50 every 25 fine-grained time slots. Type-1 and type-2 VMs are requested with equal probability. Fig. 3 shows the dynamics of the requests. During the simulation, we kept the data popularity fixed. We implemented two benchmark methods: myopic resource allocation with cooperative caching (MyopicCoop) and myopic resource allocation with non-cooperative caching (MyopicNoCoop). Myopic resource allocation refers to the method in which, when the ECP receives a request, it assembles the required VMs with the minimum transportation cost. With cooperative caching, the ECP makes caching decisions with Alg. 3, while with non-cooperative caching, each caching device independently caches the data with the highest popularities. We set the time-average transportation cost constraint L = 35,000 and the drift-plus-penalty parameter V = 100,000. Fig. 4 compares the time-average revenues of the proposed method and the benchmark methods, and Fig. 5 shows the transportation costs of the three methods.
Fig. 5: Transportation cost under dynamic requests
We can see that the proposed method performs much better than the myopic methods while satisfying the long-term transportation cost constraint. Fig. 5 partly illustrates the advantage of the proposed method: it takes better advantage of the time-average expression of the transportation cost constraint. In some coarse-grained time slots, even if the transportation cost greatly exceeds L, requests can still be accepted; in the successive time slots, the weight of the transportation cost then increases relative to revenue. Intuitively, given the time-average expression of the transportation cost constraint, the proposed method has a larger feasible region than the myopic methods. To study the stability of the system, we also checked the length of the virtual queue during the simulation period, as shown in Fig.
6; the queue length approaches 0 over time.

Evaluation with dynamics of data popularity. In this research, the dynamics depend not only on VM requests but also on data popularity. Data popularity changes with time, and as far as we know, no effective method exists to formally define and measure this change. In this experiment, we avoid directly measuring the popularity change of each data item; instead, we introduce the concept of a "popularity estimation error rate." The estimation error rate is a Poisson random number with an average value of 0.3; it represents the probability that data contributing to the top 50% of the traffic are not cached. Fig. 7 shows the dynamics of the estimation error. In this experiment, we fix the request-arrival rate at 25 per fine-grained time slot to focus on the impact of data-popularity dynamics. We set the transportation cost constraint L = 35,000 and the drift-plus-penalty parameter V = 100,000. As in the earlier experiment, we compare the time-average revenues in Fig. 8, the transportation costs in Fig. 9, and the virtual-queue lengths in Fig. 10. Fig. 9 shows the transportation costs of the three methods. In Fig. 8, we can see that the proposed method has the best performance, and Fig. 9 shows the reason: the proposed method takes better advantage of the time-average constraint expression, while the myopic methods cannot. Fig. 10 shows that the length of the virtual queue approaches 0 over time, implying the stability of the system.
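The virtual-queue behavior observed in these experiments follows the standard drift-plus-penalty bookkeeping. The sketch below is our own simplified rendering, not the paper's full decision rule (the `admit` test here stands in for the richer allocation logic of Alg. 2):

```python
# Minimal sketch of drift-plus-penalty bookkeeping: a virtual queue tracks
# cumulative violation of the time-average transportation cost budget L, and
# its length weights cost against revenue via the parameter V.

L_BUDGET = 35_000.0   # time-average transportation cost constraint L
V = 100_000.0         # drift-plus-penalty parameter

def update_virtual_queue(q, slot_cost):
    """Q(t+1) = max(Q(t) + cost(t) - L, 0): grows only when a slot overspends the budget."""
    return max(q + slot_cost - L_BUDGET, 0.0)

def admit(q, revenue, cost):
    """Accept a request when V*revenue outweighs the queue-weighted cost."""
    return V * revenue - q * cost >= 0.0

# While Q is small, even costly requests are accepted; as Q grows, cost is
# weighted more heavily, which matches the over-then-under-budget pattern
# visible in the transportation-cost plots.
```

Stability of the system corresponds to the virtual queue being pushed back toward 0, as observed in Figs. 6 and 10.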
Evaluation of the impact of total cache size. The cache size deeply influences the system. Larger cache sizes allow more data to be cached, thereby incurring smaller transportation costs; mathematically, larger cache sizes produce larger feasible regions for the optimization problem. In this experiment, we set the ratio of the total cache size to the universal data size to 0.1, 0.5, and 0.9, the time-average transportation cost constraint L = 35,000, and the drift-plus-penalty parameter V = 100,000, and we carry out the simulation for 150 coarse-grained time slots. Figs. 11, 12, and 13 show the time-average revenues, the transportation costs, and the time-average virtual-queue lengths, respectively. Fig. 11 shows that larger cache sizes yield higher time-average revenue. Fig. 12 shows that the time-average transportation cost constraint is satisfied for all three cache sizes, and Fig. 13 implies the stability of the system.
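The effect of the cache budget is easy to see from the popularity-greedy filling rule of the non-cooperative baseline: a larger budget simply admits more of the popularity-ranked items. A hypothetical sketch (names are ours):

```python
def greedy_cache(popularity, sizes, budget):
    """Cache items in descending popularity until the size budget is exhausted."""
    order = sorted(range(len(popularity)), key=lambda i: -popularity[i])
    cached, used = set(), 0.0
    for i in order:
        if used + sizes[i] <= budget:
            cached.add(i)
            used += sizes[i]
    return cached
```

Every item admitted by a smaller budget is also admitted by a larger one, which is why growing the cache can only enlarge the feasible region.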
Evaluation of the impact of data sources. Besides the cache size, the ratio of the private-data volume to the public-data volume is also a key factor in the performance of the system. This ratio reflects the composition of the edge applications. An example of private data is a video uploaded from a mobile device for further processing; such private data cannot be shared with others, so caching them has no value. An example of public data is a short video from an over-the-top (OTT) provider. A large portion of public data increases the usage of the caching devices. In this simulation, we set the ratio of private- to public-data volume to 0.5, 2.0, and 3.5, the time-average transportation cost constraint L = 35,000, and the drift-plus-penalty parameter V = 100,000, and we carry out the simulation for 150 coarse-grained time slots. Figs. 14, 15, and 16 show the time-average revenues, the transportation costs, and the time-average virtual-queue lengths. The results are similar to the results with different cache sizes, because more public data implies that more data can be cached, thereby enlarging the feasible region of the optimization problem. Figs. 15 and 16 show that the proposed method is feasible and stable with different data compositions.
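Since only public data can be cached, the private-to-public ratio r directly bounds the cacheable share of the data universe, which is what shrinks or enlarges the feasible region. A trivial but clarifying identity (function names are ours):

```python
def cacheable_fraction(private_to_public_ratio):
    """Public share of the total data volume: public / (public + private) = 1 / (1 + r)."""
    return 1.0 / (1.0 + private_to_public_ratio)

def coverable_share(total_cache_size, total_data_size, private_to_public_ratio):
    """Largest fraction of the public data that a cache of the given size can hold."""
    public_volume = total_data_size * cacheable_fraction(private_to_public_ratio)
    return min(total_cache_size / public_volume, 1.0)
```

For r = 3.5, only 1/4.5 of the data universe is cacheable at all, which explains the degraded revenue at high private-data ratios.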
Comparison with the N-look-ahead algorithm. Recall that we have proved that our method is competitive, meaning that the performance gap between the proposed method and the theoretically optimal method is bounded by a constant. Because obtaining the theoretically optimal results requires vast computing resources, in this experiment we implemented an N-slot look-ahead algorithm to approximate the optimal solution. The algorithm assumes that the next N coarse-grained time slots can be perfectly predicted and then solves the corresponding optimization problem. Clearly, as N approaches infinity, the N-look-ahead algorithm becomes the theoretically optimal solution. Fig. 17 shows the comparison results.

Evaluation of the impact of V. In our algorithm, there is only one parameter, the drift-plus-penalty parameter V, which trades off revenue against cost. In this experiment, we vary the time-average cost constraint L from 30,000.

CONCLUSION

With general-purpose edge computing, service providers have more opportunities to develop emerging services and applications. However, most current research focuses on application-specific edge computing systems, which cannot take full advantage of edge computing. In this work, we presented the vision of general-purpose edge computing and analyzed the applications and the ecosystem in a general-purpose edge computing environment. We identified the main challenges in realizing efficient general-purpose edge computing, namely the interaction of resource allocation and data placement under dynamic requests in a heterogeneous environment, and then proposed a novel online framework and algorithms that approximate the optimal solution with a constant gap, without any assumptions about or knowledge of future system states. Both theoretical analysis and simulation results show that with the proposed method, the edge cloud provider can obtain near-optimal benefit.

3: Execute Alg. 2 to decide computing-resource allocation
4: end for
5: Compute the aggregated data demand {d_{i,o}(T)} at the end of the current time slot
6: Execute Alg. 3 to update the data placement {S_i} based on {d_{i,o}(T)}
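Read as pseudocode, the fragment above implies the following per-slot control flow. This Python sketch is our reading of it, with `alg2_allocate` and `alg3_place` as placeholders for the paper's Algs. 2 and 3:

```python
FINE_SLOTS_PER_COARSE = 500  # as in the simulation settings

def run_coarse_slot(requests_per_fine_slot, alg2_allocate, alg3_place):
    """One coarse-grained slot: allocate resources per request, then update data placement."""
    demand = {}  # aggregated data demand {d_{i,o}(T)} over the slot
    for t in range(FINE_SLOTS_PER_COARSE):
        for req in requests_per_fine_slot(t):
            alg2_allocate(req)                   # step 3: per-request resource allocation
            for item in req.get("data", []):     # accumulate which data each request touches
                demand[item] = demand.get(item, 0) + 1
    return alg3_place(demand)                    # steps 5-6: re-place data from aggregated demand
```

Resource allocation thus reacts at fine-grained time scale, while data placement is updated once per coarse-grained slot, from demand aggregated over the whole slot.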

Fig. 4: Time-average revenue comparison of myopic methods with dynamic requests

Fig. 6: Time-average length of the virtual queue

Fig. 8: Time-average revenues of myopic methods with dynamic data popularity

Xun Shao received his Ph.D. in information science from the Graduate School of Information Science and Technology, Osaka University, Japan, in 2013. From 2013 to 2017, he was a researcher with the National Institute of Information and Communications Technology (NICT), Japan. Currently, he is an Assistant Professor at the School of Regional Innovation and Social Design Engineering, Kitami Institute of Technology, Japan. His research interests include distributed systems and networking. He is a member of the IEEE and IEICE.

Go Hasegawa received his M.E. and D.E. in information and computer sciences from Osaka University, Japan, in 1997 and 2000, respectively. From July 1997 to June 2000, he was a Research Assistant at the Graduate School of Economics, Osaka University. From 2000 to 2018, he was an Associate Professor at the Cybermedia Center, Osaka University. He is now a Professor at the Research Institute of Electrical Communication, Tohoku University. His research work is in information network architecture. He is a member of the IEEE and IEICE.

TABLE 1: Symbols used in this article