Optimal Scheduling of Content Caching Subject to Deadline

Content caching at the edge of network is a promising technique to alleviate the burden of backhaul networks. In this paper, we consider content caching along time in a base station with limited cache capacity. As the popularity of contents may vary over time, the contents of cache need to be updated accordingly. In addition, a requested content may have a delivery deadline within which the content needs to be obtained. Motivated by these, we address optimal scheduling of content caching in a time-slotted system under delivery deadline and cache capacity constraints. The objective is to minimize a cost function that captures the load of backhaul links. For our optimization problem, we prove its NP-hardness via a reduction from the Partition problem. For problem solving, via a mathematical reformulation, we develop a solution approach based on repeatedly applying a column generation algorithm and a problem-tailored rounding algorithm. In addition, two greedy algorithms are developed based on existing algorithms from the literature. Finally, we present extensive simulations that verify the effectiveness of our solution approach in obtaining near-to-optimal solutions in comparison to the greedy algorithms. The solutions obtained from our solution approach are within 1.6% from global optimality.


Ghafour Ahani and Di Yuan, Senior Member, IEEE
Index Terms: Base station, content caching, deadline, time-varying popularity

A. Motivations
Whereas the amount of data traffic is growing exponentially, it has been realized that the major portion of the data traffic originates from duplicated downloads of a few popular contents [1]. These duplicated downloads congest the backhaul links, hence lowering the quality of service. It is costly to increase the capacity of backhaul links, hence they should be used more effectively. A promising technique is to store the popular contents at the edge of the network, such as in base stations (BSs) with caching capability [2]-[4]. This technique improves the efficiency of communication systems by providing the contents of interest from the BSs instead of from the core network. In fact, the measurement studies in [5], [6] showed up to 66% of traffic reduction in 3G and 4G networks via caching techniques.
Optimal content caching heavily depends on two main factors, namely the number of requests for the contents and the delivery deadlines of such requests. The number of requests for a content, referred to as the popularity of the content, may vary over time. Therefore, the contents of the cache need to be updated accordingly. An update incurs a downloading cost due to getting contents from the server to the BS cache.
It is commonly assumed that a content request needs to be served as soon as it is made. We extend the problem setup and investigate a scenario in which a user can put a deadline on the delivery time of the requested content. To the best of our knowledge, the joint impact of delivery deadline and content downloading cost in content caching has not been studied in the literature. In order to close this gap, we study content caching along time in a BS with limited caching capacity. We address optimal scheduling of cache updates, taking into account the downloading cost, subject to delivery deadline and cache capacity constraints.

B. Related Works
Content caching has been studied in various system scenarios in the context of wireless communication networks. We provide a review with emphasis on the recent developments. We refer the reader to [7] for a comprehensive survey.
The works in [4], [8]-[11] studied content caching in BSs when the popularity distributions of contents are known. In [4], the objective was to minimize the expected downloading time of contents. In [8], [9], collaborative content caching among BSs was considered with the objectives of minimizing an operational cost and the average downloading delay, respectively. In [10], decentralized content caching was studied in the presence of multi-hop communications. In [11], the user's hit probability was maximized.
The studies in [12]-[16] enhanced the system models in the works mentioned above to take into account the impact of user mobility on content caching in BSs. The works in [12], [13] took into account the movement of users where the trajectories of users are known. In [14], caching contents in both BSs and users was investigated with the objective of minimizing energy consumption. The works in [15], [16] further improved the system model in [14] and considered caching on mobile users such that they can obtain their contents of interest from each other via device-to-device (D2D) communications.
In contrast to the aforementioned works, the studies in [17]-[24] investigated content caching in BSs when the popularity distributions of contents are unknown. The work in [17] determined the popularity of a content based on the previously stored contents. The work in [18] computed the popularity of a content using a big dataset, and proposed an optimal content caching algorithm to minimize the delivery time of contents. In [19], the authors estimated the popularity of contents via local interest in the content and then proposed a caching algorithm to maximize the hit rate. In [20], an online algorithm was proposed to estimate the popularity of contents based on the incoming requests. The works in [21]-[24] proposed learning-based methods to estimate the popularity of contents.
In all works mentioned so far, the popularity distribution of contents is invariant along time. The studies in [25]-[31] relaxed this assumption and considered content caching with time-varying popularities. In [25], [26], caching contents of uniform size was studied; however, the cost of cache updating was neglected. In [27], the authors studied content caching in a set of BSs from a learning perspective. In [28], [29], content caching with updates was considered in D2D and vehicle-to-vehicle networks, respectively. In [30], content caching in a BS was studied in which the cost of cache updates and the freshness of the contents were jointly optimized. In [31], collaborative caching was studied, where the cost of updates is accounted for. In [30], [31], the authors assumed that a requested content needs to be served instantly after the request is made. This may not hold when a requester can wait for the content up to a certain time point, i.e., the deadline.
The works just mentioned are the most related to ours in the sense that they have also considered cache updating along time. However, in these investigations either the main effort was devoted to estimating the popularity distributions of contents rather than designing effective content caching algorithms, or the cost of performing updates was neglected, or the deadlines of content requests were not considered. Therefore, we aim to complement the above works and devote our effort to designing an effective content caching algorithm where the deadline constraints and the cost of cache updates are considered jointly.

C. Our Contributions
We investigate scheduling of content caching in a BS with limited caching capacity in a time-slotted system under delivery deadline and cache capacity constraints. Our main contribution lies in the joint consideration of time-varying popularity of contents and the deadlines of requested contents. Our objective is to optimally schedule the updates across the time slots so as to minimize the total cost of obtaining the contents requested by the users. The main contributions of this work are summarized as follows:
• We formally prove the NP-hardness of the problem based on a reduction from the Partition problem.

A. System Scenario
The system scenario consists of a content server, a base station (BS), $U$ users within the coverage of the BS, and $F$ contents. The set of users is denoted by $\mathcal{U} = \{1, 2, \dots, U\}$. The server has all the contents, and the BS is equipped with a cache of size $S$. Denote by $\mathcal{F} = \{1, 2, \dots, F\}$ the set of contents, and by $l_f$ the size of content $f \in \mathcal{F}$. The system scenario is shown in Fig. 1. We consider a time-slotted system in which a time period is divided into $T$ time slots. Denote by $\mathcal{T} = \{1, 2, \dots, T\}$ the set of time slots. At the beginning of each time slot, the contents of the cache are subject to updates. Namely, some stored contents may be removed from the cache and some new contents may be added to the cache by downloading from the server.
The popularity of a content is determined by the number of requests for the content. In our model, user $u \in \mathcal{U}$ requests at most $R_u$ contents within the $T$ time slots based on its interest. The set of requests of user $u$ is denoted by $\mathcal{R}_u$. A time slot is long enough to complete the downloading of the requested contents from the BS or the server. We assume the time of making each request is known or can be predicted by a prediction model [32]. In addition, each request has a deadline before which the requested content must be delivered to the user. For user $u$ and its $r$-th request, the requested content, the time slot of the request, and the deadline of the request are denoted by $h(u, r)$, $o(u, r)$, and $d(u, r)$, respectively.
A content may become available or unavailable in the cache from one time slot to another due to cache updates. A content is downloaded from the cache if it is available in the cache in at least one of the time slots between $o(u, r)$ and $d(u, r)$; otherwise, it is downloaded from the server. Denote by $c_s$ and $c_b$ the costs for downloading one unit of data from the server and from the cache, respectively. Intuitively, $c_s > c_b$ to encourage downloading from the cache. The time duration for downloading data from the server to the BS is neglected as the backhaul capacity is significantly higher than that of wireless access. The problem of optimally scheduling content caching subject to deadline is abbreviated to SCCD. The objective is to minimize the total cost of content downloading.

B. Complexity Analysis
In this section, we formally prove the NP-hardness of SCCD based on a reduction from the Partition problem.
Theorem 1. SCCD is NP-hard.
Proof. The proof is based on a polynomial-time reduction from the Partition problem, which is NP-complete [33]. Consider a Partition problem with a set $\mathcal{N} = \{n_1, \dots, n_N\}$ of $N$ integers. The task is to determine whether it is possible to partition $\mathcal{N}$ into two subsets $\mathcal{N}_1$ and $\mathcal{N}_2$ with equal sum.
We construct an instance of SCCD from the Partition instance as follows. We set $\mathcal{F} = \{1, \dots, N\}$, $l_f = n_f$ for $f \in \mathcal{F}$, $S = \frac{1}{2} \sum_{f \in \mathcal{F}} l_f$, and $T = 1$. In this case, there is no updating cost and we only have downloading cost. The time slots of requests and deadlines for all requests are set to 1, i.e., $o(u, r) = d(u, r) = 1$ for $u \in \mathcal{U}$ and $r \in \mathcal{R}_u$. Denote by $m_{1f}$ the number of users requesting content $f$ in this time slot. We set $m_{1f} = 2$ for $f \in \mathcal{F}$, $c_s = 2$, and $c_b = 1$.
If content $f$ is stored in the cache, the $m_{1f}$ users download it from the cache. Otherwise, the $m_{1f}$ users have to download content $f$ from the server, giving rise to the downloading cost of $m_{1f} c_s l_f$. By this construction, the total gain that can be achieved is upper-bounded by $\frac{1}{2} \sum_{f \in \mathcal{F}} l_f$. Now the question is whether we can achieve this gain. Solving the defined instance of SCCD will answer this question and also the Partition problem. Namely, after solving this instance of SCCD, if a total gain of $\frac{1}{2} \sum_{f \in \mathcal{F}} l_f$ is achieved, then the answer to the Partition problem is yes, and the contents inside and outside the cache correspond to the two subsets $\mathcal{N}_1$ and $\mathcal{N}_2$, respectively.
Otherwise, the answer to the Partition problem is no. Hence the conclusion.

A. Cost Model
Denote by $y_{urt}$ a binary optimization variable which equals one if and only if the $r$-th request of user $u$ is downloaded from the cache in time slot $t \in \mathcal{D}_{(u,r)} = \{o(u, r), \dots, d(u, r)\}$. The downloading cost for user $u$ to obtain the content requested in the $r$-th request, denoted by $C_{ur}$, is expressed as:
$$C_{ur} = c_b\, l_{h(u,r)} \sum_{t \in \mathcal{D}_{(u,r)}} y_{urt} + c_s\, l_{h(u,r)} \Big(1 - \sum_{t \in \mathcal{D}_{(u,r)}} y_{urt}\Big), \tag{1}$$
where the first term indicates that if the content is downloaded from the cache before its deadline, the downloading cost is $c_b l_{h(u,r)}$. Otherwise, it is downloaded from the server with cost $c_s l_{h(u,r)}$. The downloading cost for completing all requests of user $u$, denoted by $C_u$, is:
$$C_u = \sum_{r \in \mathcal{R}_u} C_{ur}. \tag{2}$$
Thus, the downloading cost for completing all requests of all users, denoted by $C_{\text{download}}$, is expressed as:
$$C_{\text{download}} = \sum_{u \in \mathcal{U}} C_u. \tag{3}$$
For the cache, the cost due to cache updates is referred to as the updating cost. This cost over the time slots, denoted by $C_{\text{update}}$, is expressed as:
$$C_{\text{update}} = \sum_{t \in \mathcal{T}} \sum_{f \in \mathcal{F}} l_f (c_s - c_b)\, a_{tf}, \tag{4}$$
where $a_{tf}$ is a binary variable which equals one if and only if the cache does not store content $f$ in time slot $t - 1$, but stores the content in time slot $t$, and $l_f (c_s - c_b)$ is the cost for downloading content $f$ from the server to the cache.
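The cost model can be checked numerically for a given caching schedule. The following is a minimal sketch, assuming the cache is empty before the first time slot and that a request is served from the cache whenever its content is cached in some slot of its window; the function name `total_cost` and the data layout are illustrative assumptions, not part of the paper.

```python
from typing import Dict, List, Tuple

# A request is (content f, request slot o, deadline slot d); slots are 1-based.
Request = Tuple[int, int, int]


def total_cost(x: Dict[int, List[int]], sizes: Dict[int, float],
               requests: List[Request], c_s: float, c_b: float) -> float:
    """Return C_download + C_update for a caching schedule.

    x[f][t-1] is 1 if content f is cached in slot t. The cache is assumed
    to be empty before slot 1 (an assumption of this sketch).
    """
    download = 0.0
    for f, o, d in requests:
        # Served from the cache iff f is cached in some slot of {o, ..., d}.
        hit = any(x[f][t - 1] == 1 for t in range(o, d + 1))
        download += (c_b if hit else c_s) * sizes[f]

    update = 0.0
    for f, row in x.items():
        prev = 0
        for cur in row:
            if prev == 0 and cur == 1:  # corresponds to a_tf = 1
                update += sizes[f] * (c_s - c_b)
            prev = cur
    return download + update
```

The sketch does not check the cache capacity; it only evaluates the two cost terms for a schedule that is assumed to be feasible.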

B. Problem Formulation
In general, as the popularity of contents changes over time, storing the most popular contents in each time slot reduces the downloading cost, but significantly increases the updating cost. On the other hand, if the stored contents remain unchanged over the time slots, the updating cost is low, but the downloading cost will be high. Based on this, our optimization problem is to minimize the total cost, consisting of the downloading and the updating costs, by optimizing the caching decisions over the time slots. Denote by $\mathbf{x} = [x_{tf}]$, $f \in \mathcal{F}$, $t \in \mathcal{T}$, an $F \times T$ matrix of optimization variables for $F$ contents and $T$ time slots, where $x_{tf}$ is a binary variable that takes value one if and only if content $f$ is stored in time slot $t$. SCCD can be formulated as an integer linear program (ILP), shown in (5).
Constraints (5b) indicate that the total amount of cache space used for storing the contents is less than or equal to the cache capacity in each time slot. Constraints (5c), (5d), (5e), and (5f) together ensure that $a_{tf}$ is one if and only if the cache does not store content $f$ in time slot $t - 1$, but stores the content in time slot $t$. Constraints (5g) state that $y_{urt}$ can take value one only if $x_{t h(u,r)} = 1$, i.e., content $h(u, r)$ is stored in the cache in time slot $t$. Constraints (5h) say that request $r$ from user $u$ is met in at most one of the time slots between the time slot of the request and its deadline. ILP (5) can be solved by an off-the-shelf integer programming algorithm from optimization packages. However, for large-scale problem instances, solving the problem needs significant computational effort. Therefore, we develop a column generation algorithm and a rounding mechanism, presented in Section V, to obtain near-to-optimal solutions of SCCD.
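For reference, one way to write an ILP that matches the verbal description of constraints (5b)-(5h) is sketched below. The objective and constraints (5b), (5g), and (5h) follow directly from the text; the particular linearization used for $a_{tf}$ and the assignment of the labels (5c)-(5f) are assumptions of this sketch and may differ from the authors' formulation.

```latex
\begin{align}
\min_{\mathbf{x},\mathbf{a},\mathbf{y}} \quad & C_{\text{download}} + C_{\text{update}} \tag{5a}\\
\text{s.t.} \quad & \sum_{f \in \mathcal{F}} l_f x_{tf} \le S, && t \in \mathcal{T} \tag{5b}\\
& a_{tf} \ge x_{tf} - x_{(t-1)f}, && t \in \mathcal{T},\ f \in \mathcal{F} \tag{5c}\\
& a_{tf} \le x_{tf}, && t \in \mathcal{T},\ f \in \mathcal{F} \tag{5d}\\
& a_{tf} \le 1 - x_{(t-1)f}, && t \in \mathcal{T},\ f \in \mathcal{F} \tag{5e}\\
& x_{0f} = 0, && f \in \mathcal{F} \tag{5f}\\
& y_{urt} \le x_{t\,h(u,r)}, && u \in \mathcal{U},\ r \in \mathcal{R}_u,\ t \in \mathcal{D}_{(u,r)} \tag{5g}\\
& \sum_{t=o(u,r)}^{d(u,r)} y_{urt} \le 1, && u \in \mathcal{U},\ r \in \mathcal{R}_u \tag{5h}\\
& x_{tf},\ a_{tf},\ y_{urt} \in \{0, 1\}.
\end{align}
```

Since $c_s > c_b$, the minimization pushes $\sum_t y_{urt}$ to one whenever the requested content is cached in some slot of the request window, so (5h) can be stated as an inequality.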

IV. PROBLEM REFORMULATION
In this section, we provide a reformulation of SCCD that enables a solution approach based on column generation. We will see in Section VII that the algorithm achieves near-to-optimal solutions.
We define sequence $\mathbf{x}_f = [x_{1f}, x_{2f}, \dots, x_{Tf}]^{\mathsf{T}}$ to represent the caching solution of content $f$ over the $T$ time slots. As $x_{tf} \in \{0, 1\}$ for $t \in \mathcal{T}$, in total $K = 2^T$ possible sequences exist for content $f$. However, as will be clear later on, the algorithm needs to deal with only a small subset of the candidate sequences. Denote by $\mathcal{K} = \{1, 2, \dots, K\}$ the index set of the sequences. Denote by $w_{fk}$ a binary variable where $w_{fk} = 1$ if and only if the $k$-th sequence of content $f$ is selected, otherwise zero. Exactly one of the sequences is used in the solution of the problem, thus $\sum_{k=1}^{K} w_{fk} = 1$. For any given sequence, the total cost of the sequence can be calculated, as the sequence contains known caching decisions. Denote by constants $x^{(k)}_{tf}$ the caching decisions of the $k$-th sequence of content $f$; given these, the corresponding values of $a_{tf}$ and $y_{urt}$, denoted by $a^{(k)}_{tf}$ and $y^{(k)}_{urt}$, can be determined. The total cost for content $f$ with respect to the $k$-th sequence is denoted by $C_{fk}$ and is expressed in (6):
$$C_{fk} = \sum_{u \in \mathcal{U}} \sum_{r \in \mathcal{R}_u : h(u,r) = f} l_f \Big[ c_b \sum_{t \in \mathcal{D}_{(u,r)}} y^{(k)}_{urt} + c_s \Big(1 - \sum_{t \in \mathcal{D}_{(u,r)}} y^{(k)}_{urt}\Big) \Big] + \sum_{t \in \mathcal{T}} l_f (c_s - c_b)\, a^{(k)}_{tf}. \tag{6}$$
Based on the above notation, SCCD is reformulated as (7). Constraints (7b) formulate the cache capacity over the time slots. These constraints have the same meaning as constraints (5b). Constraints (7c) say that exactly one sequence has to be selected for each content. In formulation (7), the deadline and updating constraints (i.e., constraints (5c)-(5h)) are not present, as they are embedded in the sequences. As can be seen, both (5) and (7) are valid optimization formulations of SCCD. However, they differ in structure.
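The sequence-based formulation described above can be written out as follows. This is a sketch assembled from the verbal description of the capacity and selection constraints; the superscript notation $x^{(k)}_{tf}$ for the value of $x_{tf}$ in the $k$-th sequence is an assumption introduced here for illustration.

```latex
\begin{align}
\min_{\mathbf{w}} \quad & \sum_{f \in \mathcal{F}} \sum_{k \in \mathcal{K}} C_{fk}\, w_{fk} \tag{7a}\\
\text{s.t.} \quad & \sum_{f \in \mathcal{F}} \sum_{k \in \mathcal{K}} l_f\, x^{(k)}_{tf}\, w_{fk} \le S, && t \in \mathcal{T} \tag{7b}\\
& \sum_{k \in \mathcal{K}} w_{fk} = 1, && f \in \mathcal{F} \tag{7c}\\
& w_{fk} \in \{0, 1\}, && f \in \mathcal{F},\ k \in \mathcal{K}.
\end{align}
```

Relaxing the last line to $0 \le w_{fk} \le 1$ gives the continuous version to which column generation is applied.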
V. ALGORITHM DESIGN
In this section, we present our solution approach. We first consider the continuous version of formulation (7) and apply column generation to derive its global optimum. This obviously gives a lower bound on the global optimum of SCCD. Next, if the solution obtained from the column generation algorithm (CGA) is fractional, we use a tailored rounding algorithm (TRA) to obtain integer solutions. Using TRA, some of the caching decisions will be fixed, and CGA will be used again to re-solve the new problem subject to these decisions. This process continues until an integral solution is obtained. We refer to this solution approach as the repeated column generation algorithm (RCGA).

A. Column Generation Algorithm
For some structured linear programming problems, column generation can reduce the computational complexity of solving large-scale scenarios [34]. The main advantage of column generation is that the optimal solution can be obtained without considering the set of all possible columns, the number of which is typically exponential. In column generation, the problem under consideration is decomposed into a so-called master problem (MP) and a subproblem (SP). The algorithm iterates between a restricted MP (RMP) and the SP. The idea is to start with a very limited set of columns. The algorithm solves the SP to generate one or multiple new columns that improve the objective function of the RMP. This process is repeated until no improving column exists. In SCCD, a column is defined as a value assignment of sequence $[x_{1f}, x_{2f}, \dots, x_{Tf}]^{\mathsf{T}}$.
1) MP and RMP: MP is the continuous version of formulation (7). CGA starts with a small subset $\mathcal{K}'_f \subset \mathcal{K}$ for each content $f \in \mathcal{F}$. This leads to a restricted version of the MP, referred to as RMP, which is expressed in (8). Denote by $K'_f$ the cardinality of $\mathcal{K}'_f$, and by $x^{(k)}_{tf}$ the value of $x_{tf}$ in the $k$-th sequence of content $f$.
$$\text{(RMP)} \quad \min_{\mathbf{w}} \ \sum_{f \in \mathcal{F}} \sum_{k \in \mathcal{K}'_f} C_{fk}\, w_{fk} \tag{8a}$$
$$\text{s.t.} \quad \sum_{f \in \mathcal{F}} \sum_{k \in \mathcal{K}'_f} l_f\, x^{(k)}_{tf}\, w_{fk} \le S, \quad t \in \mathcal{T} \tag{8b}$$
$$\sum_{k \in \mathcal{K}'_f} w_{fk} = 1, \quad f \in \mathcal{F} \tag{8c}$$
$$0 \le w_{fk} \le 1, \quad f \in \mathcal{F},\ k \in \mathcal{K}'_f \tag{8d}$$
2) Subproblem: The SP uses the dual optimal solution to generate new columns. Denote by $\mathbf{w}^*$ the optimal solution of (8). Denote by $\boldsymbol{\pi}^*$ and $\boldsymbol{\beta}^*$ the optimal values of the dual variables of constraints (8b) and (8c), respectively. Here, $\boldsymbol{\pi}^* = [\pi^*_1, \dots, \pi^*_T]$ and $\boldsymbol{\beta}^* = [\beta^*_1, \dots, \beta^*_F]$. After obtaining $\mathbf{w}^*$, whether $\mathbf{w}^*$ is also the optimum of MP can be determined by finding a column with the minimum reduced cost for each content $f \in \mathcal{F}$. If all these values are nonnegative, then the current solution is optimal. Otherwise, we add the columns with negative reduced costs to their respective sets. Given $\boldsymbol{\pi}^*$ and $\boldsymbol{\beta}^*$, the reduced cost of a column of content $f$ is $C_{fk} - \sum_{t \in \mathcal{T}} l_f \pi^*_t x^{(k)}_{tf} - \beta^*_f$, in which the constants of the sequence are replaced with their counterparts of optimization variables. To find the column with minimum reduced cost for content $f \in \mathcal{F}$, we need to solve subproblem $\mathrm{SP}_f$, shown in (9). Denote by $\mathbf{x}^*_f$ the optimal solution of $\mathrm{SP}_f$. Note that $\beta^*_f$ is a constant and thus dropped from the objective function.
Even though $\mathrm{SP}_f$ is an ILP, we show that it can be solved in polynomial time by mapping it to a shortest path problem.

B. Subproblem as a Shortest Path Problem
For $\mathrm{SP}_f$, we construct an acyclic directed graph in which finding the shortest path from the source to the destination is equivalent to solving the subproblem. Denote by $Q_f$ the total downloading cost for content $f$ when all requests over all time slots are served from the server, i.e., $Q_f = \sum_{u \in \mathcal{U}} \sum_{r \in \mathcal{R}_u : h(u,r) = f} l_f c_s$. Denote by $q_f = l_f (c_s - c_b)$ the updating cost when the content is not stored in the previous time slot, but is stored in the current time slot. Denote by $p_{tf} = -l_f \pi^*_t$ the cost related to the dual optimal solution in time slot $t$. Denote by $g^{\ge d}_{tf}$ the cost of the requests made for content $f$ in time slot $t$ with deadline greater than or equal to time slot $d$, that is:
$$g^{\ge d}_{tf} = \sum_{u \in \mathcal{U}} \ \sum_{r \in \mathcal{R}_u :\, h(u,r) = f,\ o(u,r) = t,\ d(u,r) \ge d} l_f (c_s - c_b).$$
The graph is shown in Fig. 2. We first introduce the vertices and then the arcs. Two vertices $S_f$ and $D_f$ are defined to represent the source and destination, respectively. $V_{00f}$ is a vertex representing $x_{0f} = 0$. For time slot $t \in \mathcal{T}$, in total $t + 1$ vertices are defined, represented by $V_{t1f}$ and $V^k_{t0f}$, $k \in \{0, \dots, t - 1\}$. Vertex $V_{t1f}$ represents decision $x_{tf} = 1$, and vertices $V^k_{t0f}$, $k \in \{0, \dots, t - 1\}$, represent decision $x_{tf} = 0$ for the following scenarios. Vertex $V^0_{t0f}$ indicates that the content has not been stored in the cache in time slots $1, \dots, t$, i.e., $x_{jf} = 0$ for $j \in \{1, \dots, t\}$. Vertex $V^k_{t0f}$, $k \in \{1, \dots, t - 1\}$, indicates that the content has been in the cache in time slot $k$, but not in the subsequent time slots until time slot $t$, i.e., $x_{kf} = 1$ and $x_{jf} = 0$ for $j \in \{k + 1, \dots, t\}$. These vertices are defined to trace the most recent time slot in which the content was in the cache. This tracing makes it possible to define the cost of each arc with respect to the deadlines.
Now, we introduce the arcs and their weights. There is an arc from $S_f$ to $V_{00f}$ with weight $Q_f$. For time slot 1, there are two outgoing arcs from $V_{00f}$, one to $V_{11f}$ with weight $q_f + p_{1f} - g^{\ge 1}_{1f}$ and the other to $V^0_{10f}$ with weight zero. Consider time slot $t \in \{2, \dots, T\}$. For vertex $V_{t1f}$ there are $t$ incoming arcs: one comes from $V_{(t-1)1f}$ with weight $p_{tf} - g^{\ge t}_{tf}$, and the others come from $V^k_{(t-1)0f}$ for $k \in \{0, \dots, t - 2\}$ with weight $q_f + p_{tf} - \sum_{i=k+1}^{t} g^{\ge t}_{if}$. Selecting vertex $V^k_{(t-1)0f}$ in the path means that no request has been served in time slots $k + 1, \dots, t - 1$, as $x_{jf} = 0$ for $j \in \{k + 1, \dots, t - 1\}$; hence the third term in the weight accounts for serving all requests that are made in time slots $k + 1, \dots, t$ with deadline later than or equal to time slot $t$. For each vertex $V^i_{t0f}$, $i \in \{0, \dots, t - 2\}$, there is one incoming arc from $V^i_{(t-1)0f}$ with weight zero. For vertex $V^{t-1}_{t0f}$, the arc comes from $V_{(t-1)1f}$ with weight zero. There are $T + 1$ arcs from vertices $V_{T1f}$ and $V^i_{T0f}$, $i \in \{0, \dots, T - 1\}$, to $D_f$, all having weight zero.
Fig. 2. Graph of the shortest path problem for $\mathrm{SP}_f$.
Theorem 2. For each content $f \in \mathcal{F}$, $\mathrm{SP}_f$ can be solved in polynomial time as a shortest path problem.
Proof. We show that the optimal solution of the subproblem can be obtained from the shortest path of the graph defined above. Assume the optimal solution of $\mathrm{SP}_f$, i.e., $\mathbf{x}^*$, $\mathbf{a}^*$, and $\mathbf{y}^*$, is given. The path is constructed as follows. One of the following three scenarios may happen in time slot $t \in \mathcal{T}$. First, if $x^*_{tf} = 1$, the vertex on the path is $V_{t1f}$. Second, if $x^*_{jf} = 0$ for $j = 1, \dots, t$, the vertex is $V^0_{t0f}$. Third, if $x^*_{if} = 1$ for some $i < t$ and $x^*_{jf} = 0$ for $j = i + 1, \dots, t$, the next vertex is $V^i_{t0f}$. By construction of the graph, this path from $S_f$ to $D_f$ gives the same objective function value of $\mathrm{SP}_f$ as $\mathbf{x}^*$, $\mathbf{a}^*$, and $\mathbf{y}^*$.
Conversely, assume the shortest path is given. For time slot $t$, if the path contains one of the vertices $V^i_{t0f}$ for $i \in \{0, \dots, t - 1\}$, we set $x_{tf} = 0$. Otherwise, the path contains vertex $V_{t1f}$, and we set $x_{tf} = 1$. As soon as the values of $x_{tf}$ for $t \in \mathcal{T}$ are known, the values of $a_{tf}$ for $t \in \mathcal{T}$ and of $y_{urt}$ for $u \in \mathcal{U}$, $r \in \mathcal{R}_u$, and $t \in \mathcal{D}_{(u,r)} = \{o(u, r), \dots, d(u, r)\}$ can be easily determined. The value of $y_{urt}$ is set to one for the first time slot in which the request can be served. By the construction of the graph, this solution gives the same objective function value as the shortest path. To clarify why this is correct we give an example. Assume that the shortest path is $S_f \to V_{00f} \to V^0_{10f} \to V_{21f} \to V^2_{30f} \to \dots \to V^2_{T0f} \to D_f$. Then, we set $x_{tf} = 0$ for $t \in \mathcal{T} \setminus \{2\}$ and $x_{2f} = 1$, $a_{2f} = 1$, and $y_{ur2} = 1$ for all requests that can be served in time slot 2. With this setting of variables, the objective function has the same value as the length of the shortest path, as shown in (10):
$$Q_f + q_f + p_{2f} - \sum_{i=1}^{2} g^{\ge 2}_{if}. \tag{10}$$
Based on the rationale illustrated in the example, it is straightforward to conclude the correctness in general.
Finally, the shortest path problem can be solved in polynomial time [35]. Hence the conclusion.
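The idea behind the graph, namely remembering only the most recent time slot in which the content was cached, can be illustrated with a small dynamic program. The sketch below is not the authors' implementation; it is an equivalent recursion over that state, assuming the saving per cache-served request is $l_f (c_s - c_b)$, that the cache is empty before slot 1, and that $\beta^*_f$ is left out. The names `solve_sp`, `schedule_cost`, and `brute_force` are hypothetical, and the brute-force routine is included only to cross-check small instances.

```python
from itertools import product


def schedule_cost(x, size, c_s, c_b, p, reqs):
    """Objective of SP_f (without beta_f) for a 0/1 schedule x over T slots."""
    cost = sum(p[t] for t in range(len(x)) if x[t])
    prev = 0
    for cur in x:
        if cur and not prev:          # 0 -> 1 transition: updating cost q_f
            cost += size * (c_s - c_b)
        prev = cur
    for o, d in reqs:                 # each request is an (o, d) slot pair
        hit = any(x[t - 1] for t in range(o, d + 1))
        cost += size * (c_b if hit else c_s)
    return cost


def solve_sp(T, size, c_s, c_b, p, reqs):
    """Return (minimum cost, schedule) using the 'latest cached slot' state."""
    q = size * (c_s - c_b)            # q_f
    gain = size * (c_s - c_b)         # saving per request served from the cache
    Q = len(reqs) * size * c_s        # Q_f: everything served from the server

    def newly_served(k, t):
        # Requests made in slots k+1..t whose deadline is at least t.
        return sum(1 for o, d in reqs if k < o <= t <= d)

    # best[t]: cheapest partial schedule whose latest cached slot is t;
    # best[0] = 0 encodes "never cached".
    best = [0.0] + [float("inf")] * T
    pred = [0] * (T + 1)
    for t in range(1, T + 1):
        for k in range(t):
            update = 0.0 if (k == t - 1 and k >= 1) else q
            cand = best[k] + update + p[t - 1] - gain * newly_served(k, t)
            if cand < best[t]:
                best[t], pred[t] = cand, k
    last = min(range(T + 1), key=lambda t: best[t])
    x, t = [0] * T, last
    while t > 0:                      # backtrack over the cached slots
        x[t - 1] = 1
        t = pred[t]
    return Q + best[last], x


def brute_force(T, size, c_s, c_b, p, reqs):
    """Enumerate all 2^T schedules; usable only for small T."""
    return min(schedule_cost(x, size, c_s, c_b, p, reqs)
               for x in product([0, 1], repeat=T))
```

The recursion examines $O(T^2)$ transitions, mirroring the number of arcs in the graph, whereas the enumeration grows as $2^T$.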

C. Rounding Algorithm
As the solution obtained from the RMP (i.e., $\mathbf{w}^*$) may be fractional, we need a mechanism to obtain a feasible integer solution. One straightforward way is to round the fractional elements of $\mathbf{w}^*$. However, this way of rounding has some limitations. First, the solution may easily become infeasible. Second, even if the solution is feasible, it may be far from the global optimum. Third, when an element of $\mathbf{w}^*$, say $w_{fk}$, becomes fixed in value, the caching decisions of content $f$ for all time slots are made, and consequently there is no opportunity to improve the solution of content $f$. In order to overcome the above limitations, we make a rounding decision for one content and one time slot at a time. More specifically, the caching decision of content $f$ in time slot $t$ is made based on the value of $z_{tf}$, where $z_{tf}$ is the sum of those elements of $\mathbf{w}^*$ whose corresponding columns store content $f$ in time slot $t$, that is, $z_{tf} = \sum_{k \in \mathcal{K}'_f} x^{(k)}_{tf} w^*_{fk}$, with $x^{(k)}_{tf}$ denoting the value of $x_{tf}$ in the $k$-th column of content $f$. In fact, the value of $z_{tf}$ can be viewed as an indicator of how probable it is to store content $f$ in time slot $t$ at optimum. In the following, we prove a relationship between $\mathbf{z}$ and $\mathbf{w}^*$ and then base our algorithm on this result.
Theorem 3. For any content $f \in \mathcal{F}$ and $k \in \mathcal{K}'_f$, $w^*_{fk}$ is binary if and only if every element of $\mathbf{z}_f$ is binary, where $\mathbf{z}_f = [z_{1f}, z_{2f}, \dots, z_{Tf}]^{\mathsf{T}}$.
Proof. For necessity, for any content $f \in \mathcal{F}$, if $w^*_{fk}$ is binary for all $k \in \mathcal{K}'_f$, it is obvious that all elements of $\mathbf{z}_f$ are binary. Now, we prove the sufficiency. For any content $f \in \mathcal{F}$, assume that every element in $\mathbf{z}_f$ is binary. Assume that $w^*_{fk}$ is fractional for $k \in \mathcal{K}''_f \subseteq \mathcal{K}'_f$. Then, for each time slot $t$, the entries of the columns in $\mathcal{K}''_f$ for time slot $t$ must be either all zero or all one. Otherwise, as $\sum_{k \in \mathcal{K}''_f} w^*_{fk} = 1$, one of the elements of $\mathbf{z}_f$ will become fractional. This means that all columns corresponding to $w^*_{fk}$ for $k \in \mathcal{K}''_f$ must be the same. Having two columns with the same values violates the fact that the sequences of any two $w^*_{fk}$ differ in at least one element. Therefore, for any content $f \in \mathcal{F}$, if $z_{tf}$ is binary for all $t \in \mathcal{T}$, then $w^*_{fk}$ is binary for all $k \in \mathcal{K}'_f$. Hence the proof.
A family of rounding algorithms can be derived based on how the caching decisions of the contents are made. We make them gradually. First, for content $f$ and time slot $t$, if $z_{tf} = 1$ then the decision is to store this content in this time slot, i.e., $x_{tf} = 1$. Next, we find the fractional element of $\mathbf{z}$ that is closest to zero or one, and round the value, giving the caching decision of the corresponding content and time slot. Next, CGA is applied subject to the rounded values to obtain the new $\mathbf{w}^*$. This process is repeated until a feasible integer solution is obtained. Note that a caching decision for a content and time slot, once made, remains in all the subsequent iterations. An important observation is that $\mathrm{SP}_f$, $f \in \mathcal{F}$, with the given caching decisions can still be solved via shortest path. If $x_{tf} = 1$, we simply remove vertices $V^i_{j0f}$, for $j = t, \dots, T$ and $i = 1, \dots, t$, and the arcs connected to these vertices from the graph. If $x_{tf} = 0$, we remove vertex $V_{t1f}$ and its connected arcs.
TRA is presented in Algorithm 2. Symbol $\leftarrow$ is used when a value is assigned to a programming variable, and symbol $\Leftrightarrow$ is used when an optimization variable is fixed to a value. The details of TRA are as follows. First, in Line 1, $\mathbf{z}$ is calculated. For each $t \in \mathcal{T}$ and $f \in \mathcal{F}$, if $z_{tf}$ has value one, then TRA fixes $x_{tf} = 1$ in $\mathrm{SP}_f$ by Line 2. In addition, as $x_{tf}$ is fixed to one, the columns in $\mathcal{K}'_f$ that have value zero in time slot $t$ cannot be used any more and they are discarded. To achieve this, we fix the corresponding variables $w_{fk}$ to zero. This is done by Line 3.
Second, as long as $\mathbf{w}^*$ is not an integer solution, by Theorem 3 at least one element of $\mathbf{z}$ must be fractional. The fractional value of $\mathbf{z}$ nearest to zero, its corresponding time slot, and content are calculated by Lines 4-5, and these are denoted by $\underline{z}$, $\underline{t}$, and $\underline{f}$, respectively. Likewise, the fractional value of $\mathbf{z}$ nearest to one, its corresponding time slot, and content are calculated by Lines 6-7, and these are denoted by $\bar{z}$, $\bar{t}$, and $\bar{f}$, respectively. If $\underline{z}$ is closer to zero than $\bar{z}$ is to one, TRA fixes the value of time slot $\underline{t}$ to zero in $\mathrm{SP}_{\underline{f}}$ by Line 9. Furthermore, those columns not compatible with the decision are discarded from $\mathcal{K}'_{\underline{f}}$. This is done by Line 10. Otherwise, TRA checks whether there is enough spare space to store content $\bar{f}$. If yes, then the value of time slot $\bar{t}$ is fixed to one in $\mathrm{SP}_{\bar{f}}$ by Line 12, and the columns with value zero in time slot $\bar{t}$ are discarded from $\mathcal{K}'_{\bar{f}}$ by Line 13. If no, the value of time slot $\bar{t}$ is fixed to zero by Line 15 and the columns with value one in time slot $\bar{t}$ are discarded from $\mathcal{K}'_{\bar{f}}$ by Line 16.
Third, TRA fixes $x_{tf} = 0$ for the contents that have size larger than the remaining spare cache space. This is done by Lines 21-23.
Finally, the above operations may lead to discarding all columns of a content such that the RMP becomes infeasible. To avoid this, an auxiliary column for each content is added such that the column has value one in the time slots that are fixed to one so far, and zero in the other time slots. This is accomplished by Line 25. Note that the fixed variables remain in effect in all subsequent iterations of RCGA.
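The quantities driving the rounding step can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' Algorithm 2: it computes $z_{tf}$ from the current columns and RMP weights and returns the fractional entry closest to zero or one together with the value it would be rounded to. The function names, the dictionary layout, and the tolerance `eps` are hypothetical, and the spare-capacity check and column discarding of TRA are omitted.

```python
def compute_z(columns, w):
    """z[f][t-1] = sum over columns k of w[f][k] * columns[f][k][t-1].

    columns[f] is a list of 0/1 sequences over the T slots;
    w[f] holds the RMP weight of each of those sequences.
    """
    z = {}
    for f, cols in columns.items():
        T = len(cols[0])
        z[f] = [sum(w[f][k] * cols[k][t] for k in range(len(cols)))
                for t in range(T)]
    return z


def pick_rounding(z, eps=1e-9):
    """Return (t, f, target) for the fractional z_tf nearest to 0 or 1.

    target is the value (0 or 1) that the entry is nearest to.
    Returns None when every entry of z is (numerically) binary.
    """
    best = None
    for f, row in z.items():
        for t, v in enumerate(row, start=1):
            if eps < v < 1 - eps:
                dist = min(v, 1 - v)
                target = 0 if v < 1 - v else 1
                if best is None or dist < best[0]:
                    best = (dist, t, f, target)
    return None if best is None else best[1:]
```

When `pick_rounding` returns `None`, every $z_{tf}$ is binary, which by Theorem 3 means the current weights already form an integer solution.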

D. Framework of RCGA
Note that as none of the variables in the SPs or RMP is fixed when CGA is applied for the first time (i.e., in the first iteration of Algorithm 3), the cost from CGA provides a lower bound on the global optimum of SCCD. This lower bound can be used to measure the effectiveness of the final solution from Algorithm 3 or of the solution obtained from any other suboptimal algorithm. The RCGA framework is shown in Algorithm 3. The number of iterations required to obtain a feasible solution is bounded by $F \times T$: each time TRA is used, the caching decision of at least one content in one time slot is made, and as there are $F$ contents and $T$ time slots, Algorithm 3 terminates in at most $F \times T$ iterations.

VI. GREEDY ALGORITHMS
In this section, we consider computationally cheap algorithms. We propose two greedy algorithms that deal with one time slot at a time. These algorithms are developed based on two conventional caching algorithms in the literature, i.e., popularity-based caching (PBC) [36] and random-based caching (RBC) [37]. In PBC, a content is chosen as a candidate to be stored in the cache based on how frequently it is requested. In RBC, the candidate content is chosen randomly, with probability proportional to its popularity. That is, the more frequently a content is requested, the more likely it is to be selected as a candidate content.

Algorithm 2: Tailored Rounding Algorithm (TRA)
    $x_{tf}$ ⇔ 0 in $SP_f$
10: $x_{tf}$ ⇔ 1 in $SP_f$
13: $x_{tf}$ ⇔ 0 in $SP_f$
16: $\mathcal{F}' \leftarrow \{f \in \mathcal{F} \mid x_{tf} \text{ is fixed to one}\}$
19: for $f \in \mathcal{F} \setminus \mathcal{F}'$ do
21:     if $l_f > S'$ then
22:         $x_{tf}$ ⇔ 0 in $SP_f$
23: Apply TRA to $w^*$

The popularity of content $f$ in time slot $t$ is modeled by the total number of requests that must be satisfied in this time slot, namely, all requests with deadline $t$. Denote by $\mathcal{P}_{tf}$ the set of these requests for content $f$ in time slot $t$, and by $P_{tf}$ the cardinality of set $\mathcal{P}_{tf}$. $P_{tf}$ can be computed as:

The flow of the two algorithms is similar, and a general description is as follows. The time slots are considered one by one, starting from the first time slot. The cache is initialized with $S$ units of spare capacity. For each time slot under consideration, the algorithms treat contents one by one, based on popularity in PBC and on randomness in RBC. Once a content is selected as a candidate to be stored in the cache, the algorithms use an updating strategy based on the one in [25] to decide whether to store the content in this time slot. The updating strategy is as follows. For candidate content $f$, one of the following scenarios may arise:
1) If there is not enough spare space in the cache to store content $f$, the algorithms set $x_{tf} = 0$.
2) If the cache has enough spare space and the content was stored in the previous time slot, the decision is to keep the content, i.e., $x_{tf} = 1$.
3) If there is enough spare space but the content needs to be downloaded from the server, then the algorithms store the content if it is at least as popular as some of the contents stored in the previous time slot. Specifically, content $f$ should be at least as popular as the least popular contents with total size similar to $l_f$. This comparison is made because storing the candidate content leads to deleting contents that were in the cache in the previous time slot. Thus, it is beneficial to put this content in the cache only if it is at least as popular as them.

The flow of the two algorithms is shown in Algorithm 4.

6:     RBC: select contents randomly, with probability proportional to their popularity, and put them in the resulting order in set $\mathcal{F}$
7:     for $f = 1$ to $F$ do
8:         if $l_f > S'$ then
9:             $x_{tf} \leftarrow 0$
10:        else if ($l_f \le S'$ and $x_{(t-1)f} = 1$) then
11:            $x_{tf} \leftarrow 1$
12:            $S' \leftarrow S' - l_f$
13:        else if ($l_f \le S'$ and $x_{(t-1)f} = 0$) then
14:            $\Psi \leftarrow \{i \in \{f+1, \ldots, F\} \mid x_{(t-1)i} = 1\}$
15:            $E_{del} \leftarrow 0$
16:            while ($l_{del} \le l_f$ and $|\Psi| > 0$) do
18:            if $P_{tf} \ge E_{del}$ then
23:                $x_{tf} \leftarrow 1$
24:                $S' \leftarrow S' - l_f$
25:            else
26:                $x_{tf} \leftarrow 0$
27: return $x$
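The updating strategy for a single time slot can be sketched in Python as follows. This is an illustrative reading under stated assumptions, not the authors' implementation. The data layout (a list of `(content, deadline)` pairs) and the function names are hypothetical. The body of the loop that accumulates the popularity of the least popular previously cached contents is an assumption derived from the textual description of scenario 3. The `+ 1` smoothing in `rbc_order` is also an assumption, added so that contents without requests can still be drawn.

```python
import random
from collections import Counter

def popularity(requests, t):
    """P_tf: number of requests for content f whose deadline is slot t.
    `requests` is a list of (content, deadline) pairs (assumed layout)."""
    return Counter(f for f, deadline in requests if deadline == t)

def rbc_order(P, contents, rng):
    """RBC ordering: draw contents one by one without replacement, with
    probability proportional to popularity (+1 smoothing is an assumption)."""
    remaining, order = list(contents), []
    while remaining:
        weights = [P[f] + 1 for f in remaining]
        f = rng.choices(remaining, weights=weights, k=1)[0]
        order.append(f)
        remaining.remove(f)
    return order

def greedy_slot(order, P, size, prev, S):
    """One time slot of the updating strategy.
    order: contents in treatment order (PBC: decreasing popularity)
    P: popularity per content, size: l_f, prev: contents cached in slot t-1
    Returns the set of contents cached in the current slot."""
    spare, cached = S, set()
    for pos, f in enumerate(order):
        if size[f] > spare:              # scenario 1: does not fit
            continue
        if f in prev:                    # scenario 2: keep stored content
            cached.add(f)
            spare -= size[f]
            continue
        # Scenario 3: compare f with the least popular previously cached
        # contents later in the order (assumed reading of the lost steps).
        later = [i for i in order[pos + 1:] if i in prev]
        l_del, e_del = 0, 0
        while l_del <= size[f] and later:
            i = later.pop()              # least popular candidate first
            l_del += size[i]
            e_del += P[i]
        if P[f] >= e_del:
            cached.add(f)
            spare -= size[f]
    return cached
```

For PBC the order would be obtained with something like `sorted(size, key=lambda f: -P[f])`; for RBC it would come from `rbc_order`. Running `greedy_slot` for $t = 1, \ldots, T$, with `prev` set to the result of the previous slot, gives the overall greedy schedule.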

VII. PERFORMANCE EVALUATION
In this section, we conduct simulations to evaluate the performance of RCGA, PBC, and RBC by comparing them to a lower bound on the global optimum; the lower bound is hereafter referred to as LB. As explained in Section V-D, the LB is provided by the solution of the first iteration of Algorithm 3. In general, the deviations of RCGA, PBC, and RBC from the global optimum are hard to obtain, because SCCD is NP-hard and its global optimum is therefore difficult to compute. Instead, we use the LB to measure the effectiveness of the algorithms, since the deviation from the global optimum cannot exceed the deviation from the LB. Hereafter, we refer to the relative deviations of RCGA, PBC, and RBC from the LB as the (worst-case) optimality gaps.
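As a small illustration of why the gap to the LB is a worst-case figure, the sketch below computes the relative deviation in percent. Normalizing by the lower bound is an assumption about how "relative deviation" is defined, and the numbers in the usage note are made up for illustration only.

```python
def optimality_gap(cost, lower_bound):
    """Relative deviation of a feasible cost from a bound, in percent."""
    if lower_bound <= 0:
        raise ValueError("lower bound must be positive")
    return 100.0 * (cost - lower_bound) / lower_bound
```

Since LB ≤ optimum ≤ cost for a minimization problem, replacing the optimum by the LB can only enlarge the numerator and shrink the denominator. For example, with a hypothetical LB of 1000, optimum of 1010, and algorithm cost of 1016, `optimality_gap(1016, 1000)` is larger than `optimality_gap(1016, 1010)`.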

A. Simulation Setup
For the simulation setup, we set $T = 24$, where each time slot has a length of one hour [38], [39]. Similar to the works in [17], [22], we set $U = 600$ and $F = 200$, where the sizes of contents are uniformly generated within the interval $[1, 10]$. The capacity of the cache is set as $S = \rho \sum_{f \in \mathcal{F}} l_f$. Here, $\rho \in [0, 1]$ is a parameter that expresses the size of the cache in relation to the total size of all contents. The number of requests of each user is uniformly distributed in the interval $[1, 10]$. The values $o(u, r)$, $u \in \mathcal{U}$ and $r \in \mathcal{R}_u$, are randomly selected between time slots 1 and $T$. The deadlines of content requests are uniformly selected in the interval $[o(u, r), \alpha(T - o(u, r))]$, in which $\alpha$ indicates the tightness of deadlines. We will show the impact of $\alpha$ on the system cost.
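A minimal Python sketch of how such an instance could be generated is given below. Several details are assumptions rather than facts from the setup: content sizes and request counts are drawn as integers, $o(u, r)$ is treated as the slot in which the request is made, the defaults for `rho` and `alpha` are arbitrary, and the deadline is drawn from $o(u, r)$ plus an offset in $[0, \alpha(T - o(u, r))]$ so that $\alpha = 0$ yields the tightest deadline. The function name and the returned layout are hypothetical.

```python
import random

def generate_instance(T=24, U=600, F=200, rho=0.5, alpha=0.5, seed=0):
    """Draw content sizes, cache capacity, and request times/deadlines."""
    rng = random.Random(seed)
    sizes = [rng.randint(1, 10) for _ in range(F)]    # l_f in [1, 10]
    S = rho * sum(sizes)                              # cache capacity
    requests = []                                     # (user, o, deadline)
    for u in range(U):
        for _ in range(rng.randint(1, 10)):           # requests per user
            o = rng.randint(1, T)
            # Assumed reading of the deadline interval: o + [0, alpha*(T-o)]
            d = o + rng.randint(0, int(alpha * (T - o)))
            requests.append((u, o, d))
    return sizes, S, requests
```

Which content each request asks for is left out here, since that depends on the popularity model.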
As in many works in the literature (e.g., [4]), the content popularity distribution is modeled by a Zipf distribution, i.e., the probability that a user requests the $f$-th content is

$$\frac{f^{-\gamma}}{\sum_{i \in \mathcal{F}} i^{-\gamma}}.$$

Here $\gamma$ is the shape parameter of the Zipf distribution and is set to $\gamma = 0.56$ [4]. The requests for contents are generated with varying content popularity over time. We will vary the parameters $\alpha$, $T$, $U$, $F$, $\rho$, and $\gamma$ in the simulations to show their impact on the system cost. Table I summarizes the definitions of parameters for reference.
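The popularity model can be sketched in Python as below, assuming the standard Zipf form in which the probability of the $f$-th content is proportional to $f^{-\gamma}$. The function names are hypothetical, and the sketch does not model how popularity varies over time, since that mechanism is not specified here.

```python
import random

def zipf_pmf(F, gamma):
    """Probability of requesting content f = 1..F under a Zipf law."""
    weights = [f ** (-gamma) for f in range(1, F + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def sample_contents(pmf, n, rng):
    """Draw n requested content indices (1-based) according to pmf."""
    return rng.choices(range(1, len(pmf) + 1), weights=pmf, k=n)
```

With `gamma = 0.56` the distribution is fairly flat: the most popular content is only `2 ** 0.56` times as likely as the second one. Larger values of `gamma` concentrate the requests on fewer contents.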

B. Performance Comparison
The performance results of the algorithms are reported in Figs. 3-8. The lines in black, green, blue, and red represent the costs originating from the LB, RCGA, PBC, and RBC, respectively. The curves of RCGA and the LB are virtually overlapping in all figures, and the optimality gap of RCGA is consistently at most 1.6%, which indicates that RCGA delivers high solution quality.

Fig. 3 shows the impact of the tightness of deadlines on the cost. When $\alpha$ increases from 0 to 1, the costs obtained from RCGA, PBC, and RBC decrease by 31.9%, 35.9%, and 33.5%, respectively. The reason is that with less stringent deadlines, the system has more opportunities to satisfy the requests via caching. The optimality gap of RCGA increases slightly from 0.6% to 1.6%, while the corresponding values for PBC and RBC decrease from 26.1% and 27.2% to 19.3% and 24.6%, respectively.

Fig. 4 shows the impact of the number of time slots on the cost. The costs decrease with respect to the number of time slots. There are two reasons for this: with larger $T$, a) there are more opportunities to update the contents of the cache, and b) more requests can be satisfied via the cache during the time period. The optimality gap of RCGA always stays below 1%. However, the gap for PBC is 9.6% for $T = 6$ and increases to 20.1% for $T = 36$. The reason is that with larger $T$, the problem becomes more difficult, which results in a higher optimality gap. The gap of RBC stays around 20.8% for all values of $T$.
Figs. 5 and 6 show the impact of $U$ and $F$ on the cost, respectively. As can be seen, the cost increases with respect to $U$ and $F$. With larger $U$, the total number of requests increases accordingly, which leads to a higher cost. Also, when $F$ increases, the diversity of requested contents increases, and as the cache capacity is limited, more requests need to be downloaded from the server, which leads to a higher cost. In general, the optimality gaps of RCGA, PBC, and RBC are approximately 1%, 18.5%, and 19.5%, respectively, for all values of $U$. The gaps of all algorithms slightly increase with respect to $U$, and this is more apparent for RBC. We can say that even though the problem size increases with $U$ (i.e., the problem becomes more difficult), the solution quality of the algorithms decreases only slightly.
Increasing $F$ from 100 to 300, the optimality gap of RCGA decreases from 1.6% to 0.2%, while the optimality gaps of PBC and RBC increase from 15.4% and 18.6% to 19.8% and 21.1%, respectively. This shows that RCGA can effectively utilize the cache capacity, while PBC and RBC are not able to achieve this. In fact, with larger $F$, the diversity of requests increases and the problem becomes more challenging.

Fig. 7 shows the effect of the cache size in relation to the total size of contents. Overall, it can be observed that when $\rho$ grows from 0.1 to 0.9, the costs and optimality gaps obtained from RCGA, PBC, and RBC all decrease. This is due to the fact that a cache with more space can store more contents. RCGA outperforms both PBC and RBC and has nearly optimal solutions. The optimality gaps of RCGA, PBC, and RBC for $\rho = 0.1$ are 1.4%, 21.1%, and 35.5%, respectively, and they decrease to 0.1%, 4.5%, and 4.5% when $\rho$ increases to 0.9. The reason is that when $\rho = 0.1$, the capacity is extremely limited, and it is crucial to utilize the capacity efficiently. RCGA is able to achieve this, in contrast to PBC and RBC. When the caching space increases, the costs and optimality gaps start to decrease. When the caching space becomes so large that most of the requested contents can be stored in the cache, optimizing the caching space becomes a rather trivial task and all algorithms have similar performance.

Finally, Fig. 8 shows the impact of the popularity of contents on the cost. As can be seen, the costs and optimality gaps decrease with respect to $\gamma$. Note that when $\gamma$ increases, the popularities of contents become more distinct, and thus it is easier for the algorithms to determine which contents should be stored in the cache in order to achieve a low cost.

VIII. CONCLUSIONS
This paper has investigated a content caching problem in which the joint impact of content downloading cost and deadline constraints is accounted for. First, the problem is formulated as an integer linear program (ILP). Even though the ILP can provide optimal solutions, it needs significant computational time for large-scale problem instances. Thus, three algorithms are developed for problem solving. The first one is a solution approach based on a repeated column generation algorithm (RCGA). The second and third algorithms are developed from popularity-based caching (PBC) and random-based caching (RBC) from the literature. PBC and RBC are simple and fast, and thus they are suitable for very large problem instances. Simulation results have demonstrated that RCGA outperforms the PBC and RBC algorithms and provides nearly optimal solutions, within a gap of approximately 1.6% from the global optimum. In addition, simulation results show that about one-third of the system cost can be cut when content requests have longer deadlines. PBC and RBC are suitable for scenarios in which the cache capacity is fairly large or the popular contents are apparent, because for such scenarios they can provide solutions of nearly the same quality as RCGA.
tf the values of $x_{tf}$, $y_{urt}$, and $a_{tf}$ with respect to the $k$-th sequence, respectively. Note that given the values of $x^{(k)}_{tf}$ the value of $y^{(k)}$

$x_{tf}$ is previously fixed to one and $x_{tf} = 0$ otherwise

Algorithm 3: Framework of RCGA
1: STOP ← 0
2: while (STOP = 0) do
3:     Apply CGA with fixed variable values so far and obtain $w^*$
4:     if ($w^*$ is an integer solution) then

Algorithm 4: The flow of PBC and RBC
Input: $S$, $l_f$, $c_b$, and $c_s$
Output: $x$
1: $x_{0f} \leftarrow 0$, $\forall f \in \mathcal{F}$
2: for $t = 1$ to $T$ do
3:     $S' \leftarrow S$
4:     Calculate $P_{tf}$, $f \in \mathcal{F}$
5:     PBC: sort contents based on their popularity and put them in the sorted order in set $\mathcal{F}$