Joint Resource Management and Pricing for Task Offloading in Serverless Edge Computing

We consider the problem of resource allocation, pricing and application caching for latency-sensitive task offloading in serverless edge computing. We model the interaction between a profit-maximizing operator and cost-minimizing Wireless Devices (WDs) as a Stackelberg game where the operator is the leader and decides the price, resource allocation and set of applications to cache, while the WDs are the followers and decide whether to offload their tasks. We first show that the game has a Subgame Perfect Equilibrium (SPE), but that computing it is NP-hard. Importantly, we show that an SPE, which maximizes the operator's revenue, results in minimal energy consumption among the WDs. For computing an approximate SPE, we propose a linear time approximation algorithm with bounded approximation ratio for resource allocation and pricing, and we propose an efficient heuristic based on the utility density of individual applications for the joint optimization of caching, resource allocation and pricing. Our results show that the proposed algorithm outperforms state-of-the-art methods by up to an order of magnitude both in terms of revenue and total energy savings, and that it has small computational overhead. An interesting feature of our results is that the utility of the operator is maximized by a solution that maximizes the WDs' energy savings through computation offloading, which makes it a promising candidate for energy-efficient edge cloud deployments.


Feridun Tütüncüoglu and György Dán, Senior Member, IEEE

Index Terms: Combinatorial optimization, convex optimization, edge computing, function as a service, Stackelberg game.

I. INTRODUCTION
SERVERLESS computing (also called Function as a Service (FaaS)) is transforming the cloud computing landscape by offering a paradigm shift in the way applications are developed and deployed [1], [2], [3]. It eliminates the need for users to manage the server infrastructure, enabling them to focus solely on writing code to implement business logic [4], [5]. Its ease of use combined with the pay-as-you-go billing model makes serverless computing a particularly appealing service model from a user perspective. At the same time, it allows the cloud operator more freedom in managing its communication and computing resources for serving the user demand.
Serverless computing at the edge could provide low-latency access to computing resources on-demand to mobile Wireless Devices (WDs), enabling task offloading for computationally intensive applications without advance reservation of resources, thereby saving battery power [6], [7], [8]. Nonetheless, from the operator's perspective, the inherent capacity constraints in edge computing make the adoption of serverless computing at the edge challenging compared to centralized clouds [9], as communication, storage and computing resources have to be orchestrated for meeting application latency requirements, and at the same time, the operator's financial interests have to be catered for.
The orchestration of wireless and compute resources at the edge has been extensively studied in recent years [10], [11], [12], [13], [14]. Nonetheless, the management of storage and the availability of executable code at the edge servers, which are prerequisites for serverless computing, have received less attention. Existing works focus mainly on minimizing the total cost of the WDs in terms of delay, energy consumption, or their combination [15], [16], [17], [18], but they do not consider the financial interests of the edge operator: the operator's objective is arguably the maximization of its profit, while the minimization of the total cost of the WDs and their latency constraints should rather be considered a possibly conflicting secondary objective or a constraint.
Indeed, storage, computing and communication resource allocation and pricing are mutually dependent. The availability of code determines whether a WD is able to offload, the compute and communication resources determine whether offloading would meet the latency requirements, and the price determines whether it is worthwhile to offload. The decisions of the WDs in turn determine the revenue of the operator and hence its decision on which applications to make available. Optimal pricing and resource management is thus inherently challenging and at the same time fundamental for realizing a serverless edge ecosystem.
In this work, we address this challenging problem. We model the interaction between a profit-maximizing operator that performs storage management, resource allocation and pricing, and cost-minimizing autonomous WDs that can offload their computation subject to code availability and latency requirements, as a Stackelberg game. Based on the model, we propose a pricing scheme that maximizes the operator's revenue and simultaneously incentivizes the WDs to make energy optimal decisions. Our main contributions are as follows:
• we propose a Stackelberg game to model the interaction between the operator and latency-sensitive WDs,
• we show that a Subgame Perfect Equilibrium (SPE) exists in the proposed Stackelberg game,
• we show that the joint optimization of pricing and the allocation of wireless and computational resources is a convex problem for a given set of offloading WDs, and the solution results in an equilibrium that minimizes the energy consumption of the WDs,
• we show that computing an optimal set of offloading WDs is NP-hard, and we propose a linear complexity approximation algorithm,
• we show that computing the optimal set of applications to cache is NP-hard, and we propose an efficient algorithm for computing an approximate solution,
• we provide numerical results based on simulations that show that our proposed algorithm is efficient for the joint optimization of caching, resource allocation and pricing, and that it outperforms state-of-the-art algorithms.
The rest of the paper is organized as follows. We present the system model and the problem formulation in Section II. We show the best response of the WDs and the existence of equilibria in Section III. We address optimal resource allocation and pricing for a fixed set of offloaders in Section IV, and we propose an approximation algorithm to compute a near optimal set of offloaders in Section V. We address the problem of caching in Section VI, and we show numerical results in Section VII. We discuss related work in Section VIII, and Section IX concludes the paper.

II. SYSTEM MODEL AND PROBLEM FORMULATION
We consider a multi-access edge computing system that consists of an edge server with storage capacity $S$ managed by an operator, and a set $\mathcal{N} = \{1, 2, \ldots, N\}$ of WDs that can offload their computational tasks for execution at the edge server through a wireless link. WD $i \in \mathcal{N}$ wants to execute a task of type $\phi_i \in \mathcal{J}$, where $\mathcal{J}$ denotes the set of applications (i.e., the set of task types). The applications are the software images required for the execution of the tasks. The computational task of WD $i$ is characterized by the size $D_i$ of the input data in bytes, by the expected number $L_j$ of instructions (I) per byte required to perform the task for $j = \phi_i$, and by the completion time requirement $\bar{\tau}_i^l$ defined by the WD. The edge operator can decide to cache a subset $\mathcal{X} \subseteq \mathcal{J}$ of applications subject to its storage capacity constraint $\sum_{j \in \mathcal{X}} s_j \le S$, where $s_j$ is the memory size of the image of application $j$. If the application that WD $i$ intends to use is cached by the operator ($\phi_i \in \mathcal{X}$), then WD $i$ can decide whether to offload the computation to the edge server. We denote by $a_i$ the offloading decision of WD $i$; $a_i = 1$ corresponds to offloading and $a_i = 0$ to local computing. If WD $i$ offloads, then it is charged a price of $\pi_i \ge 0$ and is given a portion of the finite processing capacity $F$ and of the finite bandwidth capacity $W$ by the edge operator. The price and the resource allocation are decided by the operator before the WDs decide whether or not to offload. Fig. 1 illustrates a system with $|\mathcal{J}| = 6$ applications and $N = 7$ WDs. We next present our model of local computing and computation offloading, followed by the problem formulation.

A. Local Computing
If WD $i$ chooses not to offload, the task needs to be executed using local computing resources (i.e., the local CPU). We denote by $f_i^l$ the local processing power (measured in Giga Instructions per Second (GIPS)) of WD $i$ and we use it to express the local processing time as $\tau_i^l = D_i L_{\phi_i} / f_i^l$. We consider that $f_i^l$ is chosen such that local computing completes exactly at the task completion deadline $\bar{\tau}_i^l$ of the task of WD $i$, i.e., $\tau_i^l = \bar{\tau}_i^l$. Thus, the task completion deadline $\bar{\tau}_i^l$ will influence the decision of the WD whether or not to offload. This assumption is reasonable, as dynamic voltage and frequency scaling is widely used for reducing the energy consumption of battery powered WDs while meeting performance needs [19], [20], [21].

B. Computation Offloading
If WD $i$ decides to offload, it has to transmit $D_i$ amount of data over the wireless channel to the edge server via an Access Point (AP), and then processing is performed at the edge server. We denote by $w_i$ the bandwidth allocated to WD $i$ by the edge operator, and we make the common assumption of a Gaussian channel [11]. We can then express the upload time of WD $i$ using the Shannon formula [22] as $\tau_i^u(p_i, w_i) = D_i / \left(w_i \log_2\left(1 + p_i h_i / \sigma_i^2\right)\right)$, where $h_i$ is the channel coefficient from WD $i$ to the AP, $p_i$ is the transmit power of WD $i$, and $\sigma_i^2$ is the noise power at the AP. We consider that the transmit power is bounded by the maximum transmit power $\bar{p}_i$, i.e., $p_i \le \bar{p}_i$. This model of the transmission rate corresponds to Orthogonal Frequency Division Multiple Access (OFDMA), adopted in 5G and Wi-Fi 6 [23], [24], which allocates resource blocks (also called resource units) to WDs for data transmission and avoids intra-cell interference despite simultaneous transmissions from multiple WDs by using non-overlapping subcarriers [25]. Similar to previous works [15], [26], [27], we make the common assumption that the time needed to transmit the results of the computation from the edge server to the WD is negligible, because for many applications (e.g., object detection, recognition and tracking) the size of the output is significantly smaller than the size of the input data.
We denote by $f_i$ the allocated computing power of the edge server (measured in GIPS) and we express the processing time at the edge server as $\tau_i^e(f_i) = D_i L_{\phi_i} / f_i$.
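As a rough illustration of the timing model in this section, the following Python sketch computes the local, upload and edge processing times and checks the completion deadline. The function names are hypothetical, the Shannon rate is assumed to take the form $w_i \log_2(1 + p_i h_i / \sigma_i^2)$, and all quantities are assumed to be expressed in mutually consistent units; it is a sketch of the model, not the authors' implementation.

```python
import math


def local_time(data_size, instr_per_unit, f_local):
    """Local processing time: total instructions over local processing power."""
    return data_size * instr_per_unit / f_local


def upload_time(data_size, bandwidth, power, gain, noise):
    """Upload time, assuming a Shannon rate w * log2(1 + p * h / sigma^2)."""
    rate = bandwidth * math.log2(1.0 + power * gain / noise)
    return data_size / rate


def edge_time(data_size, instr_per_unit, f_edge):
    """Processing time at the edge server for allocated computing power f_edge."""
    return data_size * instr_per_unit / f_edge


def offloading_meets_deadline(data_size, instr_per_unit, bandwidth, power,
                              gain, noise, f_edge, deadline):
    """True if uploading plus edge processing finishes by the deadline."""
    total = (upload_time(data_size, bandwidth, power, gain, noise)
             + edge_time(data_size, instr_per_unit, f_edge))
    return total <= deadline
```

Keeping the three delays as separate functions mirrors the structure of the model: the operator influences `upload_time` through the bandwidth and `edge_time` through the computing power, while the WD only controls the transmit power.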

C. WD Cost Model
Computation and offloading incur energy consumption and monetary cost at the WDs. In case of local computing, the cost $C_i^0$ is the cost of the energy consumed by the WD for executing the task, where $\kappa_i^l$ is the energy efficiency parameter of the WD with unit J per Hz per GI$^2$ and $\gamma_i$ is the unit local energy cost in dollars per J. The parameter $\gamma_i$ is determined by the cost of electricity and by the cost of charging the battery of WD $i$ (e.g., in terms of time), and it serves as the conversion factor from energy consumption to its monetary cost. We make the reasonable assumption that $\gamma_i$ is known to WD $i$ and thus $C_i^0$ can be computed. In case of offloading, we define the offloading cost as the sum of the energy consumption cost for transmitting the input data and the price $\pi_i$ that is to be paid, where $a_{-i}$ denotes the offloading decisions of the WDs in $\mathcal{N} \setminus \{i\}$ and $\beta_i$ denotes the transmit antenna power efficiency parameter of WD $i$. Let us define the vector $\boldsymbol{\rho}_i = [f_i, w_i]$ of resources allocated to WD $i$; then the cost of WD $i$ is its local computing cost if $a_i = 0$ and its offloading cost if $a_i = 1$. We consider that the WDs have a preference for saving the state of charge of their batteries; thus, in case of a tie between the local computing cost and the offloading cost, the WD chooses to offload. The local cost of a WD represents its valuation of task execution, and its formulation is consistent with the modeling approach used formerly in cloud computing [28], [29].
In economic terms, this valuation corresponds to the reservation price, which is the highest price that a customer would pay for a particular product or service [30], [31].

D. Problem Formulation
We consider that the WDs and the operator are rational, strategic entities. The objective of WD $i$ is to minimize its cost subject to its completion time requirement, the constraint on the maximum transmission power, and the caching decision of the operator. Thus, WD $i$ aims to solve a cost minimization problem over its offloading decision $a_i$ and its transmit power $p_i$ subject to three constraints. The first constraint ensures that WD $i$ does not offload if $\tau_i^u(p_i, w_i) + \tau_i^e(f_i) > \bar{\tau}_i^l$, i.e., if the completion time when offloading exceeds the task completion deadline. The second constraint ensures that the WD can only offload if its application is cached by the operator, so that offloading will not result in a cold start of the application. The last constraint ensures that the transmit power remains within the limit of the maximum transmit power. We refer to $\mathcal{A}_i = \{0, 1\}$ as the action set of WD $i$, where $a_i = 0$ denotes local computing and $a_i = 1$ denotes offloading.
Aligned with FaaS pricing models used today, we consider that the income of the operator depends on the price it sets for offloading and on whether or not WDs offload. Thus the operator's utility from the offloading WDs is $U = \sum_{i \in \mathcal{N}} \pi_i \, a_i \, \mathbb{1}_{\phi_i \in \mathcal{X}}$, where $\mathbb{1}_{\phi_i \in \mathcal{X}}$ is the indicator function, and we refer to the collection $\boldsymbol{a} = (a_i)_{i \in \mathcal{N}}$ as the offloading decision of the WDs. We refer to the collection $\boldsymbol{\rho} = (\boldsymbol{\rho}_i)_{i \in \mathcal{N}}$ as the resource allocation decision, and to the collection $\boldsymbol{\pi} = (\pi_i)_{i \in \mathcal{N}} \in [0, \bar{\pi}]^N$ as the pricing decision, where $\bar{\pi} \in \mathbb{R}$ is a sufficiently large constant that serves as an upper bound on the price. We consider that the operator aims at maximizing its utility by choosing the resource allocation $\boldsymbol{\rho}$, the prices $\boldsymbol{\pi}$, and the caching decision $\mathcal{X}$, i.e., the operator wants to solve problem (13)-(14). The resulting problem is a multiple-follower single-leader Stackelberg game, where the operator is the leader and the WDs are the followers. We refer to the problem as the Joint Pricing, Caching and Resource Allocation Game (PICRA). We are interested in the existence of Stackelberg equilibria and in the complexity of computing equilibria under complete information, i.e., when the system parameters and utilities are known. While this assumption may seem strong, it enables us to analyze the structure of the game and to formulate an approach for computing an equilibrium. Moreover, analyzing the complete information case serves as an initial step for subsequent analysis under incomplete information [10], [12], [32], [33]. We start with solving the problem faced by the WDs, and we then turn to solving the problem faced by the operator.

TABLE I SUMMARY OF FREQUENTLY USED NOTATIONS

III. WD BEST RESPONSE CHARACTERIZATION AND EXISTENCE OF EQUILIBRIA
We start the analysis with characterizing the best response of the WDs for a given caching decision $\mathcal{X}$, pricing $\boldsymbol{\pi}$ and resource allocation $\boldsymbol{\rho}$ announced by the operator. For caching decision $\mathcal{X}$, we denote by $\mathcal{N}_{\mathcal{X}} = \{i \in \mathcal{N} \mid \phi_i \in \mathcal{X}\}$ the set of WDs whose applications are cached by the operator, i.e., the potential offloaders, and we define $N_{\mathcal{X}} = |\mathcal{N}_{\mathcal{X}}|$. We first show that the best response of the WDs has a threshold structure and can be computed efficiently.
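Lemma 1: The best response $a_i^*$ of WD $i$ to $(\boldsymbol{\rho}, \boldsymbol{\pi}, \mathcal{X})$ has the threshold structure given in (15).
Proof: If offloading cannot meet the deadline $\bar{\tau}_i^l$ for any transmit power $p_i \le \bar{p}_i$, then WD $i$ cannot complete the task on time if it offloads; thus, to complete the task before the deadline it has to perform local computing, i.e., the optimal offloading decision is $a_i^* = 0$. Otherwise, WD $i$ should choose a transmit power that minimizes its cost while ensuring timely completion. Observe that the uploading time $\tau_i^u(p_i, w_i)$ is a strictly monotonically decreasing function of $p_i$, and the transmission energy $p_i \tau_i^u(p_i, w_i)$ is a strictly monotonically increasing function of $p_i$. Thus, WD $i$ minimizes its cost by choosing the transmit power $p_i^*$ that yields $\tau_i^u(p_i^*, w_i) + \tau_i^e(f_i) = \bar{\tau}_i^l$. Since the optimal transmit power yields this equality, we can substitute the upload time and (4) into (16) and obtain (15), which proves the result.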
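The threshold structure of the best response can be sketched in Python as follows. The sketch assumes the Shannon rate $w_i \log_2(1 + p_i h_i / \sigma_i^2)$ and takes the transmission energy cost as an input, since its exact expression is not restated here; `min_transmit_power` and `best_response` are hypothetical names introduced only for illustration.

```python
def min_transmit_power(data_size, instr_per_unit, bandwidth, f_edge,
                       deadline, gain, noise):
    """Smallest power for which upload time + edge time equals the deadline.

    Returns None when edge processing alone already uses up the deadline.
    """
    slack = deadline - data_size * instr_per_unit / f_edge
    if slack <= 0:
        return None
    # Invert the assumed Shannon rate: w * log2(1 + p * h / noise) = data / slack.
    spectral_eff = data_size / (bandwidth * slack)
    return (noise / gain) * (2.0 ** spectral_eff - 1.0)


def best_response(cached, p_star, p_max, tx_energy_cost, price, local_cost):
    """Threshold rule of a WD: 1 means offload, 0 means compute locally."""
    if not cached or p_star is None or p_star > p_max:
        return 0  # offloading is unavailable or cannot meet the deadline
    # Ties are broken in favour of offloading, as stated in Section II-C.
    return 1 if tx_energy_cost + price <= local_cost else 0
```

Because the upload time decreases and the transmission energy increases with the transmit power, the smallest feasible power is also the cheapest one, which is why a single closed-form evaluation suffices for each WD.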
We know by Lemma 1 that the operator can compute the WDs' best replies for a given strategy $(\boldsymbol{\rho}, \boldsymbol{\pi}, \mathcal{X})$. Given the best response of the WDs, we next show the existence of an SPE.
Theorem 1: The PICRA game possesses an SPE.
Proof: By Lemma 1, for given $(\boldsymbol{\rho}, \boldsymbol{\pi}, \mathcal{X})$, the best response $\boldsymbol{a}^*$ is unique and can be computed efficiently. Then, by the extreme value theorem [35], there exists a solution to problem (13)-(14), and this solution is by definition an SPE. This proves the result.
Observe that the operator could use Theorem 1 for computing an SPE. Thus, we turn to the analysis of the complexity of computing an SPE.

IV. OPTIMAL RESOURCE ALLOCATION AND PRICING FOR A FIXED SET OF OFFLOADERS
We start by considering a feasible caching decision $\mathcal{X}$ of the operator, i.e., $\sum_{j \in \mathcal{X}} s_j \le S$, and a set $\mathcal{N}^o_{\mathcal{X}} = \{i \in \mathcal{N}_{\mathcal{X}} \mid a_i = 1\}$ of offloaders. We are interested in computing the optimal resource allocation and pricing, i.e., one that results in the optimal utility $U_{\mathcal{X}}^{\mathcal{N}^o_{\mathcal{X}}}$ for the given caching decision $\mathcal{X}$ and set of offloaders $\mathcal{N}^o_{\mathcal{X}}$. This optimal utility of the operator is the solution to the maximization problem (19)-(23).
Our first result characterizes the optimal pricing strategy $\boldsymbol{\pi}^*$ of the operator, i.e., $\boldsymbol{\pi}^* = (\pi_i^*)_{i \in \mathcal{N}_{\mathcal{X}}}$.
Proposition 1: Consider that problem (19)-(23) is feasible for $\mathcal{N}^o_{\mathcal{X}}$, i.e., there is a resource allocation $\boldsymbol{\rho}$ and price $\boldsymbol{\pi}$ such that the WDs $i \in \mathcal{N}^o_{\mathcal{X}}$ can offload. Then the operator's optimal pricing strategy is to charge each offloader $i \in \mathcal{N}^o_{\mathcal{X}}$ the difference between its local cost $C_i^0$ and its transmission energy cost under the resource allocation that solves (24)-(28).
Proof: Observe that in the optimal solution of problem (19)-(23), the constraint (23) is always active. Hence, the optimal price of WD $i$ equals its local cost minus its transmission energy cost, and substituting it into the objective yields problem (29)-(33). Observe that $\sum_{i \in \mathcal{N}^o_{\mathcal{X}}} C_i^0$ is constant and hence, the solution set of (29)-(33) is the same as that of (24)-(28), which proves the result.
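A minimal Python sketch of this pricing rule is given below, assuming that the local costs and the transmission energy costs under the chosen resource allocation have already been computed. The names `optimal_prices` and `operator_utility` are hypothetical, and the sketch does not compute the resource allocation itself.

```python
def optimal_prices(local_cost, tx_energy_cost):
    """Price each offloader at its local cost minus its transmission energy cost.

    Both arguments are dicts keyed by WD identifier (an assumed data layout).
    """
    prices = {}
    for i, c0 in local_cost.items():
        price = c0 - tx_energy_cost[i]
        if price < 0:
            # Offloading would cost this WD more than local computing.
            raise ValueError(f"WD {i} would not offload at any non-negative price")
        prices[i] = price
    return prices


def operator_utility(prices):
    """Operator revenue when every priced WD offloads."""
    return sum(prices.values())
```

For example, with local costs `{"a": 5.0, "b": 3.0}` and transmission energy costs `{"a": 1.0, "b": 0.5}`, the prices are 4.0 and 2.5, so each WD is exactly indifferent and, by the tie-breaking assumption, offloads.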
Importantly, Proposition 1 implies that computing the optimal price requires computing the resource allocation that minimizes the total transmission energy cost. We next show that problem (24)-(28) is convex; thus, an optimal strategy can be computed using numerical solvers for a given set of offloaders [36].
Theorem 2: Problem (24)-(28) is a convex problem.
Before providing the proof of the theorem, we present two auxiliary results.
Lemma 2: The optimal transmit power $p_i^*$ is a convex function of the allocated resources $(f_i, w_i)$.
We provide the proof in the Appendix, available online.
Lemma 2 shows that the optimal transmit power of the WD is a convex function of the allocated resources. We now use this result to show that the offloading energy cost is a convex function of the computing and wireless capacity allocation.
Lemma 3: The offloading energy cost of WD $i$ is a convex and monotonically decreasing function of $(f_i, w_i)$.
We provide the proof in the Appendix, available online.
Proof of Theorem 2: The convexity of constraints (27), (28) and of the objective function (24) follows from Lemma 2 and Lemma 3. The capacity constraints in (20) define a convex and compact set. This proves the convexity of the problem.
Thus, if (24)-(28) is feasible, then it can be solved in polynomial time, e.g., using interior point methods [36]. Using numerical solvers is, however, computationally not feasible if decisions are to be taken in real time. In what follows, we thus propose a closed-form approximate solution that can obtain a good solution at minimal computational effort.
Proposition 2: Let $\mathcal{N}^o_j$ be the set of offloaders that execute application $j \in \mathcal{J}$, and consider that the constraints (27) and (28) are not binding, corresponding to the high capacity case. Then the optimal solution is given by (34).
Proof: Since the problem (24)-(28) is convex, any feasible allocation $\boldsymbol{\rho}^*$ that satisfies the Karush-Kuhn-Tucker (KKT) conditions will be optimal if Slater's condition holds. To obtain the KKT conditions, consider the Lagrangian dual [36] and denote by $\boldsymbol{\lambda}^*$ the KKT multipliers in the optimal solution. Recall from Lemma 3 that the objective function is monotonically decreasing in $(f_i, w_i)$, $\forall i \in \mathcal{N}^o_j$; thereby the capacity constraints in (20) will always be binding in the optimal solution, i.e., $\sum_{i \in \mathcal{N}^o_{\mathcal{X}}} f_i^* = F$ and $\sum_{i \in \mathcal{N}^o_{\mathcal{X}}} w_i^* = W$.
Since we consider the high capacity case where constraints (27) and (28) are not binding, the KKT multipliers satisfy $\lambda_k^* = 0$ for $k > 2$ in order to satisfy the complementary slackness conditions. Next, we express the stationarity conditions; see (70) and (71).
Observe that the term A in (70) is approximately 1 in the high capacity case, and the expression A − 1 − log(2)A Lw i K is approximately −(Lw i K) −2 in (71). These approximations are indeed valid for Lw i K >> 1 and $f_i > f_i^l$. We then obtain the expressions in (34), which satisfy the primal feasibility conditions. Moreover, $\lambda_1^* \ge 0$ and $\lambda_2^* \ge 0$ by Lemma 3, hence dual feasibility is satisfied. Finally, the complementary slackness conditions are satisfied since $\lambda_1^* \ge 0$, $\lambda_2^* \ge 0$ and $\lambda_k^* = 0$, $k > 2$. Thus, we found the primal and dual optimal points $\boldsymbol{\rho}^*$, $\boldsymbol{\lambda}^*$ that satisfy the KKT conditions.
Proposition 2 shows that under high capacity conditions a closed-form approximate solution could be used to further decrease the computation time. Unfortunately, (24)-(28) need not be feasible in general, i.e., there may be a set $\mathcal{N}^o_{\mathcal{X}}$ such that some $i \in \mathcal{N}^o_{\mathcal{X}}$ cannot offload, in which case the optimal utility is $U_{\mathcal{X}}^{\mathcal{N}^o_{\mathcal{X}}} = 0$. We thus turn to computing the optimal set of offloaders.

V. CHOOSING AN OPTIMAL SET OF OFFLOADERS
In this section we show how to choose the set of offloaders that maximizes the operator's utility for a given caching decision, i.e., we address problem (37)-(41). Recall that for a given $\mathcal{N}^o_{\mathcal{X}}$, the inner maximization problem is computable by Theorem 2 and Proposition 1. Thus the optimization problem (37)-(41) is a set function maximization problem over the ground set $\mathcal{N}_{\mathcal{X}}$, and can be equivalently written as $\max_{\mathcal{N}^o_{\mathcal{X}} \subseteq \mathcal{N}_{\mathcal{X}}} U_{\mathcal{X}}^{\mathcal{N}^o_{\mathcal{X}}}$.

A. Complexity Analysis
Our first result in this section shows that problem (37)-(41) is NP-hard.
Proposition 3: Problem (37)-(41) is NP-hard.
Proof: Before providing the proof we first introduce some notation. We write the optimal utility as the difference of the total local cost of the set of offloaders $\mathcal{N}^o_{\mathcal{X}}$ and the total energy consumption cost due to transmitting data to the edge server, $U_{\mathcal{X}}^{\mathcal{N}^o_{\mathcal{X}}} = \sum_{i \in \mathcal{N}^o_{\mathcal{X}}} C_i^0 - E(\mathcal{N}^o_{\mathcal{X}})$, where $E(\mathcal{N}^o_{\mathcal{X}})$ denotes the total transmission energy cost for the set $\mathcal{N}^o_{\mathcal{X}}$ of offloaders. In what follows we prove the result through reduction from the partition problem, which is known to be NP-hard.
Problem 1 (Partition Problem): Given positive integers $b_1, \ldots, b_k$ with $\sum_{i=1}^{k} b_i = 2A$, decide whether there is a subset of them that sums to $A$.
Given an instance of the partition problem, we let $N_{\mathcal{X}} = k$, and let the corresponding set of WDs be $\mathcal{N}_{\mathcal{X}}$. We set the parameters of each WD $i$ based on $b_i$. For the reduction to work, we need to ensure that the total energy consumption cost satisfies $E(\mathcal{N}^o_{\mathcal{X}}) < 1$ for any feasible $\mathcal{N}^o_{\mathcal{X}}$. With this, the operator would always choose the set $\mathcal{N}^o_{\mathcal{X}}$ with maximal total local cost. We first set a suboptimal resource allocation for the WDs that satisfies $E(\mathcal{N}^o_{\mathcal{X}}) < 1$. If a suboptimal allocation satisfies this inequality, so does the optimal one. Next, we set the transmission power, which determines the transmission energy cost in (45). Assume a hypothetical case in which all $b_i = 1$ and $k = 2A$. In this scenario, $f_i$ will be the lowest by its definition, thus the energy consumption cost will be the highest by Lemma 3. Observe that the only unknown variable in (45) is $W$, thus one can find $W$ such that the inequality is satisfied. After setting $W$, we calculate $p_i$, $\forall i \in \mathcal{N}_{\mathcal{X}}$, and set $\bar{p} > \max_i p_i$. Observe that by construction, if the answer to the partition problem is YES, then the solution set $\mathcal{N}^*_{\mathcal{X}}$ of our problem gives $\sum_{i \in \mathcal{N}^*_{\mathcal{X}}} b_i = A$; if the answer is NO, then our problem has a solution with $\sum_{i \in \mathcal{N}^*_{\mathcal{X}}} b_i < A$. This concludes the proof.
The NP-hardness of the problem implies that an optimal set of offloaders cannot be computed efficiently. Thus, we are interested in designing an approximation algorithm that can compute a near optimal solution efficiently.

B. Singleton Greedy Maximization
Before we describe our proposed algorithm, let us recall the definition of monotonicity and submodularity of set functions. These two properties of set functions are widely relied upon in the design of approximation algorithms.
Definition 2 (Monotonicity): Let $\Omega$ be a finite set and $V : 2^{\Omega} \to \mathbb{R}$ a set function. $V$ is monotone if for any $\Omega^{\dagger} \subset \Omega$ and $i \in \Omega \setminus \Omega^{\dagger}$ we have $V(\Omega^{\dagger} \cup \{i\}) \ge V(\Omega^{\dagger})$. That is, adding a new element to any feasible input of the function does not decrease its value.
Definition 3 (Submodularity): Let $\Omega$ be a finite set. The set function $V : 2^{\Omega} \to \mathbb{R}$, where $2^{\Omega}$ denotes the power set of $\Omega$, is submodular if for every $\Omega^{\dagger} \subseteq \Omega^{\ddagger} \subseteq \Omega$ and $\omega \in \Omega \setminus \Omega^{\ddagger}$ it satisfies $V(\Omega^{\dagger} \cup \{\omega\}) - V(\Omega^{\dagger}) \ge V(\Omega^{\ddagger} \cup \{\omega\}) - V(\Omega^{\ddagger})$.
Monotonicity and submodularity are known to allow efficient approximation algorithms [37], but the utility of the operator is neither monotone (see Proposition 6 in the Appendix, available online) nor submodular in general (see Proposition 7 in the Appendix, available online). Nonetheless, as we show next, the operator's utility is submodular in the high capacity region.
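The two definitions can be checked by brute force on small ground sets, which can be handy for building counterexamples of the kind referred to above. The following Python sketch is only illustrative: the function names are hypothetical, the submodularity test uses the standard diminishing-returns form, and the enumeration is exponential in the size of the ground set.

```python
from itertools import combinations


def subsets(ground):
    """Yield every subset of the ground set as a frozenset."""
    items = list(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def is_monotone(v, ground):
    """True if adding any element never decreases the set function v."""
    ground = frozenset(ground)
    return all(v(s | {i}) >= v(s) for s in subsets(ground) for i in ground - s)


def is_submodular(v, ground, tol=1e-12):
    """True if marginal gains never increase as the base set grows."""
    ground = frozenset(ground)
    for big in subsets(ground):
        for small in subsets(big):
            for w in ground - big:
                gain_small = v(small | {w}) - v(small)
                gain_big = v(big | {w}) - v(big)
                if gain_small < gain_big - tol:
                    return False
    return True
```

For instance, `lambda s: min(len(s), 2)` passes both checks on a three-element ground set, while `lambda s: len(s) ** 2` is monotone but fails the submodularity check.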
Lemma 4: Under high capacity conditions, the utility function $U_{\mathcal{X}}^{\mathcal{N}^o_{\mathcal{X}}}$ is submodular, and can be expressed as in (48).
Proof: As shown in Proposition 1, for all $i \in \mathcal{N}^o_{\mathcal{X}}$ the optimal price equals the local cost minus the transmission energy cost; with the allocation in (34), the optimal utility takes the form (48). To show submodularity we need to show that (49) holds. We show that (49) holds by showing that the inequality holds for each individual WD using (48). For WD $i$ and WD $j$, the local costs on both sides of the inequality (49) cancel out. Observe from (34) that $w_i^*$ and $f_i^*$ decrease as the number of offloaders increases; hence, the corresponding inequalities for the transmission energy costs of WD $i$ and WD $j$ hold, since the transmission energy cost is a monotonically decreasing function of $(f_i, w_i)$ by Lemma 3. Thus, for WD $i$ and WD $j$ alone, inequality (49) holds.
Next, we need to show that the inequality (49) holds for any WD $i \in \mathcal{N}^o_{\mathcal{X}}$. For a WD $i$, (49) can be expressed as (50) after substituting $f_i^*$, $w_i^*$ from (34) and applying algebraic manipulations. We next define an auxiliary function $g$, which is convex for $x \le F$. By the definition of convexity, (52) holds. To conclude the proof, let us define the function $\chi(y, z)$ such that showing $\chi(y, z) \ge 0$ is equivalent to showing that (50) holds. Thus, if we show that $\partial \chi(y, z) / \partial y \ge 0$ and $\partial \chi(y, z) / \partial z \ge 0$ for any $y, z \ge 0$, this would imply that $\chi(y, z) \ge 0$, since $\chi(0, 0) \ge 0$ from (52). Observe that $\partial \chi(y, z) / \partial y \ge 0$ and $\partial \chi(y, z) / \partial z \ge 0$ for any $y, z \ge 0$, hence (50) holds for any $i \in \mathcal{N}^o_{\mathcal{X}}$. We already showed that the inequality holds for WD $i$ and WD $j$. Thus, (49) holds as well. This concludes the proof.

Algorithm 1: SGM.
Lemma 4 shows that the utility function is submodular and monotone under certain conditions, and thus existing approximation algorithms for monotone submodular functions can guarantee an approximation ratio bound of 1/2, e.g., by always adding an element based on marginal gain [38] with $O(N_{\mathcal{X}}^2)$ time complexity. In what follows we propose an approximation algorithm with lower computational complexity called Singleton Greedy Maximization (SGM). The algorithm greedily adds to the set of offloaders the WD $i^*$ with the highest singleton utility, i.e., the utility obtained when only WD $i^*$ offloads. Then, since the utility is not monotone with respect to the set of offloaders, the algorithm checks for the increase of the utility after the addition of WD $i^*$ in Line 4. It then removes WD $i^*$ from the ground set and keeps iterating until the ground set becomes empty. A flow chart of the proposed SGM algorithm is shown in Fig. 2. The flow chart also marks the steps in the proposed algorithm that have been made possible by our analytical results.
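The description of SGM above can be sketched in Python as follows. This is an interpretation of the textual description rather than a transcription of Algorithm 1: the utility oracle `utility` (which would wrap the pricing and resource allocation step for a given set of offloaders) is assumed to be supplied by the caller, a strict increase is assumed as the acceptance test, and `toy_utility` is a made-up congestion model used only to exercise the code.

```python
def sgm(devices, utility):
    """Singleton-greedy selection of offloaders (sketch).

    devices: iterable of WD identifiers (the ground set).
    utility: callable mapping a frozenset of WDs to the operator's utility.
    """
    devices = list(devices)
    # Utility obtained when only a single WD offloads.
    singleton = {i: utility(frozenset([i])) for i in devices}
    chosen, best = frozenset(), 0.0
    # Visit WDs in decreasing order of singleton utility.
    for i in sorted(devices, key=singleton.get, reverse=True):
        candidate = chosen | {i}
        value = utility(candidate)
        if value > best:  # keep the WD only if the utility increases
            chosen, best = candidate, value
    return chosen, best


def toy_utility(offloaders):
    """Hypothetical utility: summed valuations minus a congestion penalty."""
    valuation = {"a": 5.0, "b": 3.0, "c": 1.0}
    return sum(valuation[i] for i in offloaders) - 0.5 * len(offloaders) ** 2
```

With `toy_utility`, `sgm(["a", "b", "c"], toy_utility)` keeps WDs `a` and `b` and rejects `c`, because adding `c` lowers the utility from 6.0 to 4.5. Each WD triggers one singleton evaluation and one evaluation of the candidate set, so the number of utility evaluations grows linearly with the number of WDs.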
SGM is computationally very efficient, and at the same time it allows an approximation ratio bound. As we show in Propositions 6 and 7, the considered problem is neither submodular nor monotone in general. An approximation ratio bound for this kind of set maximization problem is not known in general. Nonetheless, in what follows, we develop an approximation ratio bound based on two properties of the total transmission energy cost function $E(\cdot)$, introduced in the next two lemmas.
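Lemma 5: Let $\mathcal{N}^o_{\mathcal{X}}$ be such that Problem (24)-(28) has a feasible solution. Then $E(\mathcal{N}^o_{\mathcal{X}}) \ge \sum_{i \in \mathcal{N}^o_{\mathcal{X}}} E(\{i\})$.
Proof: Observe that for any $i \in \mathcal{N}^o_{\mathcal{X}}$, the transmission energy cost of WD $i$ is lowest when it is the only offloader, since in this case the optimal solution for the operator is to allocate all resources to WD $i$, and the transmission energy cost is monotonically decreasing in $(f_i, w_i)$ by Lemma 3.
Lemma 6: For any $\mathcal{N}^o_{\mathcal{X}} \subseteq \mathcal{N}_{\mathcal{X}}$, $\sum_{i \in \mathcal{N}^o_{\mathcal{X}}} U_{\mathcal{X}}^{i} \ge U_{\mathcal{X}}^{\mathcal{N}^o_{\mathcal{X}}}$. (57)
Proof: We first use the decomposition of the utility in the form of (43). Written out in this form, (57) holds by Lemma 5 if $\mathcal{N}^o_{\mathcal{X}}$ admits a feasible solution. If $\mathcal{N}^o_{\mathcal{X}}$ does not admit a feasible solution, then $U_{\mathcal{X}}^{\mathcal{N}^o_{\mathcal{X}}} = 0$, and (57) holds trivially since $U_{\mathcal{X}}^{i} \ge 0$, $\forall i \in \mathcal{N}_{\mathcal{X}}$. This concludes the proof.
Lemma 6 shows the intuitive result that the sum of the utilities of WDs offloading individually is higher than the utility when all WDs offload simultaneously. Based on this result, we are now ready to derive a bound on the approximation ratio of SGM.
Theorem 3: Let $\mathcal{N}^*_{\mathcal{X}}$ be the optimal set of offloaders for a given caching decision $\mathcal{X}$, and let $\mathcal{N}^{SGM}_{\mathcal{X}}$ be the set of offloaders computed by SGM. Then the utility of $\mathcal{N}^{SGM}_{\mathcal{X}}$ is within a bounded factor of the utility of $\mathcal{N}^*_{\mathcal{X}}$.
Proof: By using Lemma 6 we write the upper bound (61) of the optimal utility, where $i^* = \arg\max_{i \in \mathcal{N}_{\mathcal{X}}} U_{\mathcal{X}}^{i}$. In the first iteration of the algorithm, the WD with maximal singleton utility, i.e., $i^*$, is chosen by the algorithm. Thus, $i^* \in \mathcal{N}^{SGM}_{\mathcal{X}}$ provided that there is $i \in \mathcal{N}_{\mathcal{X}}$ such that $U_{\mathcal{X}}^{i} > 0$. In the rest of the iterations, if the algorithm chooses a new WD $j \ne i^*$, that implies $U_{\mathcal{X}}^{i^*} \le U_{\mathcal{X}}^{\{i^*, j\}}$ thanks to Line 4. Thus, the utility of $\mathcal{N}^{SGM}_{\mathcal{X}}$ is at least $U_{\mathcal{X}}^{i^*}$, which together with (61) yields the bound. This concludes the proof.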
Importantly, Theorem 3 provides a bound on the worst case performance of the proposed SGM algorithm.
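The chain of inequalities that the proof of Theorem 3 appears to rely on can be written out as below. This is a reading of the proof steps, and the factor $|\mathcal{N}^*_{\mathcal{X}}|$ is an assumption; the constant stated in the theorem itself may be expressed differently (e.g., in terms of $N_{\mathcal{X}}$).

```latex
U_{\mathcal{X}}^{\mathcal{N}^*_{\mathcal{X}}}
  \;\le\; \sum_{i \in \mathcal{N}^*_{\mathcal{X}}} U_{\mathcal{X}}^{i}
  \;\le\; |\mathcal{N}^*_{\mathcal{X}}| \, U_{\mathcal{X}}^{i^*}
  \;\le\; |\mathcal{N}^*_{\mathcal{X}}| \, U_{\mathcal{X}}^{\mathcal{N}^{SGM}_{\mathcal{X}}}
```

The first step is Lemma 6, the second uses that $i^*$ has the maximal singleton utility, and the third uses that SGM starts from $i^*$ and only accepts additions that do not decrease the utility.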

VI. OPTIMAL CACHING POLICY
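In this section, we address the problem (62) of choosing an optimal set of applications to cache. Choosing an optimal set of applications is NP-hard, a result that can be shown using the same approach as in Proposition 3 by setting $\phi_i \notin \bigcup_{i' \in \mathcal{N}_{\mathcal{X}} \setminus \{i\}} \{\phi_{i'}\}$, $\forall i \in \mathcal{N}_{\mathcal{X}}$, i.e., all WDs want to execute different applications, by setting $s_{\phi_i} = b_i$, and by setting $F$ and $W$ high enough such that $E(\mathcal{N}) < 1$. We next show that despite the non-monotonicity of the utility function with respect to the addition of new WDs to the set of offloaders, the utility of the operator is a monotone function with respect to the addition of new applications to the cached set.
Proposition 4: Let $\mathcal{X} \subseteq \mathcal{J}$ and $j \in \mathcal{J} \setminus \mathcal{X}$. Then $U^*_{\mathcal{X} \cup \{j\}} \ge U^*_{\mathcal{X}}$.
Proof: We prove the statement by contradiction. Let $U_{\mathcal{X}}^{\mathcal{N}^*_{\mathcal{X}}}$ be the optimal utility for caching decision $\mathcal{X}$, and let $U_{\mathcal{X} \cup \{j\}}^{\mathcal{N}^*_{\mathcal{X} \cup \{j\}}}$ be the optimal utility for caching decision $\mathcal{X} \cup \{j\}$. Assume that the former is strictly higher. Since every WD in $\mathcal{N}_{\mathcal{X}}$ can still offload when $\mathcal{X} \cup \{j\}$ is cached, the set $\mathcal{N}^*_{\mathcal{X}}$ gives higher utility compared to $\mathcal{N}^*_{\mathcal{X} \cup \{j\}}$, and the operator would choose $\mathcal{N}^*_{\mathcal{X}}$ instead of $\mathcal{N}^*_{\mathcal{X} \cup \{j\}}$ when the set $\mathcal{X} \cup \{j\}$ is cached. Hence, $U_{\mathcal{X} \cup \{j\}}^{\mathcal{N}^*_{\mathcal{X} \cup \{j\}}}$ cannot be optimal, which is a contradiction. This concludes the proof.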
At the same time, the operator's utility with respect to the addition of a new application need not be submodular in general.
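Proposition 5: Let $\mathcal{X} \subseteq \mathcal{X}'$ and $j \in \mathcal{J} \setminus \mathcal{X}'$, then the inequality $U^*_{\mathcal{X} \cup \{j\}} - U^*_\mathcal{X} \geq U^*_{\mathcal{X}' \cup \{j\}} - U^*_{\mathcal{X}'}$ need not hold, i.e., the optimal utility $U^*_\mathcal{X}$ need not be submodular with respect to the set of cached applications.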
Proof: The proof is based on a counterexample and is given in the Appendix, available online.
Thus, the problem (62) is a monotone non-submodular set function maximization problem subject to a knapsack constraint, imposed by the cache capacity constraint. Recently proposed solutions for such problems provide approximation guarantees, but at the price of high computational cost [39]; we thus propose a fast heuristic called Singleton Revenue Maximization (SRM), which uses SGM for pricing and resource allocation. The algorithm first calculates the utility of each individual application $j \in \mathcal{J}$ using SGM. Then, it selects the application $j^*$ with the highest utility to storage size ratio (Line 3) and adds it to the caching decision set if storage capacity allows (Lines 4-6). The algorithm then removes the application $j^*$ from the ground set $\mathcal{J}$ (Line 7). The algorithm stops when all applications have been considered or if the storage capacity is exceeded. Fig. 3 shows the flow chart of the proposed SRM algorithm, including the interaction between SGM and SRM.
Algorithm 2: SRM.
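The steps of SRM described above can be sketched in Python as follows. This is a minimal illustration rather than the listing of Algorithm 2: the function name `srm_sketch` and the callback `singleton_utility` (a stand-in for running SGM with a single cached application) are hypothetical, and the sketch assumes that an application that does not fit is skipped while the search continues until the cache is full or no application remains.

```python
from typing import Callable, Dict, Hashable, Set


def srm_sketch(
    sizes: Dict[Hashable, float],
    capacity: float,
    singleton_utility: Callable[[Hashable], float],
) -> Set[Hashable]:
    """Greedy caching by utility-to-size ratio (illustrative sketch only)."""
    # Utility of caching each application on its own. In the paper this value
    # comes from SGM; here it is supplied by the caller (assumption).
    utility = {j: singleton_utility(j) for j in sizes}
    remaining = set(sizes)
    cached: Set[Hashable] = set()
    used = 0.0
    while remaining and used < capacity:
        # Pick the application with the highest utility per unit of storage.
        j_star = max(remaining, key=lambda j: utility[j] / sizes[j])
        # Cache it only if it still fits into the storage budget.
        if used + sizes[j_star] <= capacity:
            cached.add(j_star)
            used += sizes[j_star]
        # Remove it from the ground set whether or not it was cached.
        remaining.remove(j_star)
    return cached
```

With singleton utilities computed once up front, the sketch needs one SGM call per application plus a greedy pass over the ground set, which is the source of its low computational cost compared to methods that re-evaluate marginal gains.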

VII. NUMERICAL RESULTS
We used extensive simulations to evaluate the performance of the proposed algorithm in terms of operator utility, simulation time, total energy saving and consumption through task offloading, and the number of offloaders in SPE. We also provide a sensitivity analysis of the proposed algorithm to faults in wireless communication.
For the evaluation we consider a system with up to $N = 200$ WDs, and up to $|\mathcal{J}| = 50$ applications. The storage capacity is $S = 10$ GB, the computational complexity $L_j$ is drawn from a uniform distribution on [100, 500] I/B, and the size of the application $s_j$ is drawn from a uniform distribution on [1, 2.5] GB. The computational capacity of the edge server is $F = 200$ GIPS, and the total channel bandwidth is $W = 200$ MHz. The task type $\phi_i$ of WD $i$ is chosen uniformly at random from the set $\mathcal{J}$. The maximum transmission power $p_i$ is drawn from a uniform distribution on [100, 1000] mW, $f^l_i$ is drawn from a uniform distribution on [0.5, 3] GIPS, and $D_i$ is drawn from a uniform distribution on [5, 50] MB. The channel noise variance $\sigma^2_i$ and the channel gain $h_i$ are both uniformly distributed on [0.1, 1]. The energy efficiency parameter $\kappa^l_i$ and the unit energy cost parameter $\beta_i$ are drawn from a uniform distribution on $[10^{-22}, 10^{-19}]$ J/Hz/GI$^2$, and on $[10^{-3}, 1]$, respectively. We set $\gamma_i = 0.1$ \$/J, $\forall i \in \mathcal{N}$. These choices of parameters are similar to those used in works [17], [32]. The results shown are the averages of at least 200 simulations, together with 95% confidence intervals, which are within 1% of the mean. The simulations were conducted using Matlab on a desktop computer with Intel i9 and on a server with AMD EPYC 7543P CPU.

TABLE II: OVERVIEW OF THE SIMULATION PARAMETERS
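The parameter ranges of the simulation setup can be turned into a random instance generator along the following lines. This is a sketch in Python, not the Matlab code used for the evaluation; the class and field names are hypothetical, and all parameters are sampled uniformly on a linear scale, as the text states.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Application:
    complexity_ipb: float  # L_j in I/B, uniform on [100, 500]
    size_gb: float         # s_j in GB, uniform on [1, 2.5]


@dataclass
class WirelessDevice:
    task_type: int         # phi_i, index of the requested application
    p_max_mw: float        # maximum transmission power in mW, [100, 1000]
    f_local_gips: float    # f^l_i in GIPS, [0.5, 3]
    data_mb: float         # D_i in MB, [5, 50]
    noise_var: float       # sigma_i^2, [0.1, 1]
    channel_gain: float    # h_i, [0.1, 1]
    kappa: float           # kappa^l_i, [1e-22, 1e-19]
    beta: float            # beta_i, [1e-3, 1]
    gamma: float = 0.1     # gamma_i in $/J, identical for all WDs


def sample_instance(
    num_wds: int = 200, num_apps: int = 50, seed: Optional[int] = None
) -> Tuple[List[Application], List[WirelessDevice]]:
    """Draw one random problem instance from the stated uniform ranges."""
    rng = random.Random(seed)
    apps = [
        Application(rng.uniform(100, 500), rng.uniform(1.0, 2.5))
        for _ in range(num_apps)
    ]
    wds = [
        WirelessDevice(
            task_type=rng.randrange(num_apps),
            p_max_mw=rng.uniform(100, 1000),
            f_local_gips=rng.uniform(0.5, 3.0),
            data_mb=rng.uniform(5, 50),
            noise_var=rng.uniform(0.1, 1.0),
            channel_gain=rng.uniform(0.1, 1.0),
            kappa=rng.uniform(1e-22, 1e-19),
            beta=rng.uniform(1e-3, 1.0),
        )
        for _ in range(num_wds)
    ]
    return apps, wds
```

Passing a fixed `seed` makes an instance reproducible, which is convenient when averaging over a few hundred instances per data point.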
We consider five baselines for the evaluation. The first baseline is exhaustive search over the set of offloaders and computes the optimal resource allocation and pricing. The second baseline is called Random Greedy Selection (RGS); it randomly chooses a set of WDs, calculates the optimal utility and returns it if it is positive, else chooses another set of WDs. The third baseline is called Marginal Greedy Maximization (MGM), and is based on the greedy algorithm proposed in [40]. MGM computes the marginal utility of each WD, and adds the WD with the highest marginal gain to the set of offloaders if doing so increases the utility. The last two baselines are approaches that are widely used in previous works [10], [32]. The first, Equal Sharing (ES), allocates resources equally among offloading WDs. The second, Load Proportional (LP) allocation, allocates resources to all WDs proportionally to their task complexity $L_{\phi_i} D_i$.
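The two allocation baselines ES and LP are simple enough to state in a few lines. The Python sketch below is illustrative only; the function names are hypothetical, and `load[i]` is assumed to hold the task complexity $L_{\phi_i} D_i$ of WD $i$.

```python
from typing import Dict, Hashable, Iterable


def equal_sharing(total: float, offloaders: Iterable[Hashable]) -> Dict[Hashable, float]:
    """ES: split a resource (e.g. F or W) equally among the offloading WDs."""
    ids = list(offloaders)
    if not ids:
        return {}
    share = total / len(ids)
    return {i: share for i in ids}


def load_proportional(total: float, load: Dict[Hashable, float]) -> Dict[Hashable, float]:
    """LP: split a resource proportionally to each WD's task complexity."""
    total_load = sum(load.values())
    if total_load <= 0:
        # Degenerate case with no load: nothing to allocate proportionally.
        return {i: 0.0 for i in load}
    return {i: total * l / total_load for i, l in load.items()}
```

Neither rule looks at prices or at the WDs' willingness to pay, which is the property the comparison with the jointly optimized allocation is meant to expose.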
For computing the optimal set of cached applications we use three baselines. The first baseline is Popularity Based Caching (PBC), which selects the set of applications with the highest number of WDs $|\{i \mid \phi_i \in \mathcal{X}\}|$ while satisfying the storage constraint. The second baseline is Utility Based Caching (UBC), which chooses the set of applications with the highest $\sum_{i \in \mathcal{N}_\mathcal{X}} L_{\phi_i} D_i$. The third baseline is Random Selection (RS), which chooses a random set of applications satisfying the storage constraint.

A. Validation of the Approximate Resource Allocation
We first evaluate the accuracy of the proposed approximation in (34). Fig. 4 shows the mean absolute relative error of the resource allocation vector and of the utility computed using the proposed approximation and using the interior point method. To create high capacity conditions, we set $W$ to 2 GHz, $p_i$ is drawn uniformly from [1, 2] W, $\beta_i$ is drawn uniformly from $[10^{-8}, 10^{-6}]$, $L_{\phi_i}$ is drawn uniformly from [1500, 2000] and $\kappa^l_i$ is drawn uniformly from $[10^{-20}, 10^{-19}]$ J/Hz/GI$^2$ so that the constraints (27) and (28) are not binding for all WDs. The figure shows that under these conditions the approximation is very accurate.

Fig. 4. Relative error in resource allocation vector and utility for 200 randomly generated problem instances, obtained using a numerical method and using the proposed approximation.
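For reference, a mean absolute relative error between an approximate and a reference vector can be computed as in the Python sketch below. The exact normalization used for Fig. 4 is not stated in this section, so the element-wise definition here, the function name and the small `eps` guard against division by zero are all assumptions.

```python
from typing import Sequence


def mean_absolute_relative_error(
    approx: Sequence[float], reference: Sequence[float], eps: float = 1e-12
) -> float:
    """Average of |approx_k - reference_k| / |reference_k| over all entries."""
    if len(approx) != len(reference) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [
        abs(a - r) / max(abs(r), eps)  # eps avoids dividing by an exact zero
        for a, r in zip(approx, reference)
    ]
    return sum(errors) / len(errors)
```

Here `reference` would hold the allocation (or utility) returned by the interior point method and `approx` the values given by the closed-form approximation.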

B. Choosing Optimal Set of Offloaders
Figs. 5 and 6 show the utility as a function of the number of WDs and the simulation time as a function of the number of WDs for a single cached application, respectively. The figures show that for a small number of WDs ($N \leq 10$), the approximation algorithms for joint pricing and resource management, namely MGM and SGM, perform close to the optimal solution with a much lower simulation time. For $N > 10$, the proposed SGM performs close to MGM at a much lower computational cost. For a high number of users, MGM becomes practically infeasible as the operator would have to solve problem (13)-(14) in real time. This highlights the significance of the proposed SGM algorithm: as the number of WDs increases, the only algorithm that provides high utility at low computational cost is SGM. It is interesting to note that RGS is computationally more expensive than SGM, as finding a set of offloaders that gives a positive utility becomes harder as the number of WDs increases. Although RGS could be considered one of the most lightweight solutions, SGM can still be computed faster, and the computation time difference increases above 80 WDs, which shows that our proposed approach strikes a good balance between time complexity and the achieved utility. Lastly, the poor performance of ES and LP justifies the importance of joint optimization of resource allocation and pricing for maximizing the utility.

C. Operator's Profit
Fig. 7 shows the operator's utility as a function of the number of WDs for $|\mathcal{J}| = 20$. The figure shows that SRM, which jointly optimizes caching, resource allocation and pricing, outperforms ES-UBC, LP-UBC, ES-PBC and LP-PBC by up to an order of magnitude, particularly for a high number of WDs. More importantly, SRM outperforms UBC-SGM and PBC-SGM, i.e., the baselines that use the proposed SGM to compute optimal resource allocation and pricing, showing that joint caching and pricing provides significant benefits compared to pricing-unaware caching.
Fig. 8 shows the utility as a function of the number of applications for $N = 50$. The utility obtained by the algorithms that use SGM for pricing and resource allocation decreases monotonically in $|\mathcal{J}|$, as the number of WDs per application decreases. On the contrary, the utility obtained by the algorithms that use ES or LP for resource allocation increases, as for a few WDs per application they get closer to the optimal allocation. Importantly, the proposed SRM algorithm is almost insensitive to the increase in the number of applications, and its performance advantage increases with the number of applications.

D. Energy Optimal Resource Allocation, Pricing and Number of Offloaders in SPE
Figs. 9 and 10 show the number of offloaders as a function of the number of WDs and the number of applications, respectively. The figures show that the superior utility of SRM compared to the baselines is correlated with the fact that it allows more WDs to offload, owing to its computation of the optimal resource allocation and pricing. Finally, Figs. 11 and 12 show the total energy saving through task offloading, defined as a sum over the WDs $i \in \mathcal{N}$, as a function of the number of WDs and as a function of the number of applications. The figures show that SRM achieves the highest total energy saving. Consequently, the objective of the operator to maximize its utility combined with the objective of the WDs to minimize their cost leads to an energy efficient solution for WDs.

E. Energy Consumption of the WDs
We also evaluate the energy efficiency of the proposed solution from the WDs' perspective. Fig. 13 shows the total energy cost of the WDs as a function of the number of WDs. The results in the figure are aligned with those in Fig. 11, showing that the operator's optimal strategy for maximizing utility indeed leads to lower energy cost for the WDs. The figure also shows that as the number of WDs increases, the difference in terms of energy consumption between SRM and the baselines increases, showing the superiority of the proposed approach.
Fig. 14 shows the empirical CDF of the energy consumption of the WDs for $N = 20$ and $N = 40$, with $|\mathcal{J}| = 20$, across 500 randomly generated problem instances. Aligned with the results shown in Fig. 13, the proposed SRM outperforms the state-of-the-art methods, and the performance difference becomes more pronounced as the number of WDs increases (cf. the subfigures in Fig. 14). The figure also shows that compared to local computing, SRM decreases the 99th percentile of the energy consumption by 50% and by 33% and its median by 53% and by 39% for $N = 20$ and for $N = 40$, respectively. We show corresponding results for SGM in the Appendix, available online.

F. Sensitivity to Communication Failure
Finally, we evaluate the sensitivity of the proposed solution to communication failure due to channel outage. Channel outages are inherent to wireless communication [41], [42], and are often caused by imperfect knowledge of the channel state information at the transmitter due to, e.g., fading and mobility [43], [44], [45]. An outage happens when the intended transmission rate exceeds the instantaneous channel capacity [46], and its probability can be up to 1% [47]. In what follows we show results considering that WDs may experience channel outage and cannot perform offloading even if they would like to; this information is not known to the operator during the optimization, leading to loss of revenue. We consider two models of outage based on the model presented in [46]. In the first model the outage probability of a WD is proportional to its transmission rate, i.e., $w^*_i \log_2(1 + \cdot)$ [48]. In the second model the outage probability is proportional to the amount of transmitted data $D_i$ [49].
Fig. 15 shows the utility as a function of the average outage probability among offloading WDs for the proposed SRM algorithm and for the baselines. The figure shows that the utility of the operator decreases approximately linearly with the average outage probability under both outage models, both for SRM and for the baselines. We can observe a slight difference between the shapes of the curves due to the correlation between the outage of a WD and the revenue it would give to the operator under different pricing and resource allocation schemes, but based on the results we can conclude that all algorithms exhibit graceful degradation under communication failure.
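A simple way to see why the decrease is close to linear is to compute the expected revenue when each offloading WD independently fails with its own outage probability. The Python sketch below is an illustration under assumptions that go beyond the text: the operator loses exactly the payment of a WD in outage, the allocation of the other WDs is unchanged, and per-WD probabilities are obtained by scaling a weight (the transmission rate in the first model, $D_i$ in the second) so that their mean equals the target average. The function names are hypothetical.

```python
from typing import List, Sequence


def outage_probabilities(weights: Sequence[float], avg_outage: float) -> List[float]:
    """Per-WD outage probabilities proportional to `weights`.

    The probabilities are scaled so that their mean equals `avg_outage`,
    and capped at 1 (capping only matters for very skewed weights).
    """
    if len(weights) == 0:
        return []
    mean_w = sum(weights) / len(weights)
    if mean_w <= 0:
        raise ValueError("weights must have a positive mean")
    return [min(1.0, avg_outage * w / mean_w) for w in weights]


def expected_utility(revenues: Sequence[float], outage_probs: Sequence[float]) -> float:
    """Expected operator revenue when WD i pays only if it is not in outage."""
    return sum(r * (1.0 - q) for r, q in zip(revenues, outage_probs))
```

Under these assumptions the expected utility equals the outage-free utility minus a term that is proportional to the average outage probability, with a slope that depends on how strongly a WD's weight is correlated with the revenue it generates. This is consistent with the slightly different slopes across schemes, although the sketch is not the evaluation procedure behind Fig. 15.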

VIII. RELATED WORK
A number of recent works deal with energy efficient computation offloading for a single mobile user and show the energy reduction obtained by computation offloading [50], [51], [52], [53], [54]. [50] introduces a system that facilitates energy-aware offloading to the infrastructure. [51] conducted an investigation into cloud computing with a focus on the utilization of bandwidth and energy consumption. They presented the results obtained from an experimental platform, specifically Amazon EC2. The findings of the study indicate that cloud offloading can be considered sustainable in terms of energy consumption. The authors also propose an algorithm that aims to maximize energy savings while minimizing the computational burden associated with offloading. [52] proposes CPU frequency scaling and transmission power adaptation to optimize the energy consumption of the computation of a task. [53] presents a dynamic offloading algorithm in order to achieve energy savings under time constraints. In [54], experimental results are used to show that battery power savings can be achieved using computation offloading. Inspired by these works that show the potential energy savings through offloading, we consider a system level optimization problem of computation offloading with an emphasis on the interaction between the WDs and the operator.
Going beyond offloading by a single device, a number of works consider computation offloading for multiple WDs [55], [56], [57]. [55] considers a model in which tasks arrive simultaneously to the cloud through a single wireless link and proposes a non-cooperative game among users that minimize their own energy use. [56] considers a hierarchical MEC network, where mobile users can make offloading decisions, and can decide the uplink transmission power, perform cloud selection, and route the tasks. A distributed offloading approach is developed based on game theory, in which user equipment collaborate with each other to minimize the network cost in terms of energy consumption and latency. [57] models the load-balancing problem as a stochastic congestion game in which each user aims to minimize its task execution time. Unlike these works that focus on the WDs' costs only, our problem formulation accounts for the financial incentives, i.e., the pricing of the operator, and for service caching together with the optimization problem faced by WDs, resulting in a Stackelberg game formulation.
Related to our work are recent works that address the pricing problem in edge computing [11], [58], [59], [60]. Authors in [11] consider a Bayesian Stackelberg game in which the operator is the leader, and the WDs are followers. The objective of the operator is to maximize its revenue through pricing storage. In contrast, the WDs minimize a combination of the price paid and the delay. Different from ours, this work does not consider the optimization of communication and computing resources. Authors in [58] examine various models to optimize pricing for the task offloading problem, but they do not optimize pricing and resource allocation jointly; instead, they allocate compute resources to WDs proportional to their payment, and do not take into account communication resources. Authors in [59] proposed an auction for resource allocation and offloading. Resource allocation is based on bids from the WDs for a portion of the available edge resources, but joint optimization of pricing and resource allocation is not considered. Authors in [60] consider the problem of offloading, pricing and risk awareness in edge computing, modeled by a Stackelberg game played between the WDs and the edge servers. Compared to our work, the model does not consider the optimization of the edge resources.
Most related to ours are recent works that consider application caching and computation offloading [15], [16], [17], [18]. In [15], the authors consider a computation offloading and service caching problem with the objective of minimizing the total system cost defined as the weighted sum of energy consumption and completion time. Different from our work, they do not consider the joint optimization of bandwidth and computing resource allocation, as they do not consider bandwidth in the proposed optimization problem. In [16], the authors consider computation offloading, resource allocation (wireless and computation resources) and service caching. They formulate the problem of total delay minimization subject to the capacity of the operator without considering the energy consumption of the WDs. Similarly, in [17], the authors consider computation offloading, wireless and computation resource allocation and service caching, and they formulate the problem of minimizing the total weighted sum of the delay and the computation energy cost of the WDs. The model was extended in [18] to consider maximization of the users' quality of service focusing on a multi-edge server scenario, and a decentralized solution was proposed.
Different from the above works, our paper is the first to jointly consider service caching, wireless and computation resource allocation, as well as the financial incentives of the edge operator, specifically pricing. While previous works have focused on minimization of the total cost, with various cost definitions, they have not taken the operator's financial incentives into account in conjunction with it. In contrast, our game-theoretic model considers the interaction between WDs and the profit maximizing operator in the form of a Stackelberg game. Our results confirm that caching, pricing and the strategic interactions of the WDs need to be jointly considered for maximizing the operator's utility and for minimizing the WDs' cost.

IX. CONCLUSION
In this work we have provided a game theoretic analysis of pricing, caching, wireless and computation resource allocation for edge computing. We modeled the interaction between WDs and the operator as a Stackelberg game. We showed that the operator's utility maximization problem is NP-hard and we proposed an efficient approximation based on a decomposition of the problem and by characterizing the subproblems. Our numerical results show that joint optimization of caching, pricing and resource allocation provides significant advantages compared to non-joint optimization, and our proposed algorithm can indeed find a near optimal solution, outperforming state-of-the-art methods.
There are a number of interesting avenues of future work concerning pricing and resource allocation in edge computing. One example is the case of incomplete information, where the operator has to learn the applications' and the WDs' utilities and resource requirements in real time. Another direction is the case of a dynamic population of users, where pricing and resource allocation have to anticipate the effect of future arrivals of WDs.

Fig. 1. A system with $\mathcal{J} = \{A, B, C, D, E, F\}$ and $|\mathcal{N}| = 7$. The operator chooses applications $\mathcal{X} = \{B, C, E\}$ to cache. The dotted arrows indicate the WDs that choose to offload to the edge server to reduce their computation costs. The WDs without an arrow choose to execute their task locally.