Cached Files Updating Revisited: The Distribution of Popularity-Weighted Average File Age

Low-latency information transmission is required in 5G communication networks, where caches are commonly used to speed up data access. Meanwhile, a large number of IoT applications need real-time information exchange, which makes information freshness more important. In this article, under the discrete-time model, we consider the average age of all files in a cached-files-updating system. To keep the cached files fresh, in each time slot the server refreshes files with certain probabilities. The age of a file, or its age of information (AoI), is defined as the time the file has stayed in the cache since it was last sent to the cache. Assuming that each file in the cache has its own request popularity, we determine the probability distribution of the popularity-weighted average file age. For the random age of a single file, both the mean and the probability distribution can be obtained by establishing a simple Markov chain. Following the same line of thinking, we show that an $N$-dimensional stochastic process can be constructed to characterize the changes of the $N$ file ages simultaneously. By solving for the steady state of the resulting process, we obtain an explicit expression for the stationary probability of an arbitrary state vector. Then, the distribution of the popularity-weighted average file age can be found by merging a proper set of stationary probabilities, which gives a complete description of the average age. We also provide numerical results of the distribution for some simple cases. The age distribution can be utilized to compute certain stochastic performance indices, such as the AoI violation probability of some status updating systems.


I. INTRODUCTION
With the emergence of many real-time applications, especially in IoT networks, low-latency information transmission becomes more and more important. For a data server, by putting the popular contents in a high-rate cache, the expected download delay of the users can be effectively reduced. Lots of caching policies have been proposed to minimize the data access delay in communication networks, such as work [1] and [2]. In addition, in order to maintain a low transmission delay, some other authors considered the best scheme to place the caches geographically [3], [4].
In recent years, apart from download delay, the freshness of the cached contents has received attention. Observe that a system with minimal transmission delay does not guarantee that the receiver always obtains fresh messages. To keep the delay small, the source should decrease the generation rate of the packets so that network congestion is alleviated. However, in this case the message at the receiver will become too old due to the lack of updates. Real-time information is needed for the central node to take actions or make decisions in a large number of practical applications, such as autonomous vehicles and the positioning of moving objects.
The associate editor coordinating the review of this manuscript and approving it for publication was Zhenyu Zhou.
The concept of cache freshness was first introduced in [5] for opportunistic mobile networks, where the authors assumed one data source updating multiple caching servers, each of which can further update other nodes. Almost a decade ago, articles [6] and [7] proposed a new freshness metric called age of information (AoI), which characterizes the time elapsed since the message was generated at the source. Combining the cache system and the AoI metric, research on the expected age of cached contents began. In the AoI literature, such caching models are called cached contents updating systems or cached files updating systems; the latter term is used in the current paper.
The cached files updating system consists of one server and a local cache. The server generates $N$ files and sends them to the local cache. The users can only request files from the cache since the server is located far away. The files contained in the cache are refreshed by the server so that their ages will not be too large. The freshness of a file is measured by its age, or age of information (AoI), which is defined as the time the file has stayed in the cache since the last time it was delivered to the cache. Notice that the canonical AoI was considered in the continuous-time model, while in this article its discrete form is adopted. The AoI is a newly proposed metric and is widely used to characterize the timeliness of an information transmission system. An introduction to and survey of the AoI was given in a recent paper [8], where the authors summarize the recent contributions in the broad area of the AoI.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
In general, some cached files are popular while others are less so. Suppose that each file has its own request popularity, which is proportional to the number of times it is downloaded by users over a sufficiently long time. Then, the popularity-weighted average file age can be defined. Refreshing the cached files in real time was first formulated in [9]. In [9], a remote server generates multiple different files and transmits them to the local cache, so that the users can request files. The authors assume that each file has its own request popularity and define the average age of all cached files according to the file popularities. They then ask how the server should update files such that users obtain the most recent version of their requests. By solving a relaxed optimization problem, it was proved that an asymptotically optimal policy should update each file in proportion to the square root of its popularity. Other variants concerning minimum-AoI cached files updating include [10]-[12]. The authors in [10] considered the case where several sources generate updating packets and deliver them to a local server. At every source, the successive packet arrivals are assumed to form a Poisson process that is independent of the other arrival processes. They analyzed the average AoI of one user's requests during a period of time, assuming that all the source states are updated periodically. In addition, another related metric called age of synchronization (AoS) is also discussed. The results obtained in [10] suggest that the optimal updating frequency of each source should depend only on the square root of the source popularity, which is similar to the conclusion of [9]. In [11], the refreshing frequency of each file is supposed to be a function of its instantaneous AoI. For this case, the average age of all the files is optimized over all the feasible schemes that the server can use. Furthermore, in [12] the cache size is supposed to be limited.
Notice that in this case the cache cannot store all the files and part of the user demands may fail. Therefore, the user was allowed to download files directly from the server with a higher download cost, compared with requesting the files from the local cache. Then, the average AoI and the total download costs are jointly optimized over all the possible updating schemes used at the server.
The existing work on cached files updating systems mainly focuses on determining the long-term average AoI, or on finding AoI-optimal system settings, usually by means of optimization theory. Many system design problems based on AoI minimization and related metrics such as transmission delay have been considered in recent years, such as [13]-[25]. A variety of optimization methods are used to find the optimal cache updating systems, some combined with machine learning approaches [26]-[32]. Many average or peak AoI expressions have been proposed in terms of different measures, including the file length, the download cost, the download energy consumed for each file, and so on. However, to the best of our knowledge, no work concerning the average age distribution for cache updating systems has been published before.
Notice that the cached files updating model is a special case of the general status updating system. For the AoI of the basic status updating system, previous works mainly focus on its mean, and the AoI analysis is carried out under the continuous-time model. Under the discrete-time setting, the majority of the AoI literature considers system designs that minimize the average AoI at the monitor. Mathematically, regarding the steady-state AoI as a random variable, the mean of the AoI gives very limited knowledge of its statistical characteristics. If instead the AoI distribution is known, then all the knowledge about the AoI is obtained: we can derive the mean, the variance, and any higher moments of the AoI, all of which can be used to evaluate the performance of a given status updating system. From an application perspective, apart from the average AoI, other performance indices are of interest, such as the outage probability used in [33], which represents the probability that the peak AoI exceeds a given threshold, and the outage update probability in [34], which denotes the portion of time during which the updating packets have age larger than a certain threshold. To determine such stochastic guarantees, we also have to know the stationary distributions of the peak AoI and the AoI. Therefore, both in theory and in practice, it is necessary to find the AoI distribution of status updating systems.
While great progress has been made in the research on the AoI in the continuous-time model, the discussion of the discrete AoI is just beginning. The analysis of the discrete AoI for the general status updating system was first proposed in [35], where the average AoI is determined for some basic queue models including Ber/G/1 and G/G/1. In a later paper [36], the stationary distributions of the AoI and peak AoI are derived using their probability generating functions (PGFs). Although explicit expressions of the AoI distribution are not given, [36] is the first paper studying the AoI distribution in the discrete-time model.
In this article, rather than a practical updating scheme, we assume that the files are updated randomly by the server in each time slot. For the cached files updating system we obtain the stationary distribution of the popularity-weighted average age, which gives a complete description of this average age. Introducing an $N$-dimensional vector tracking the ages of all cached files, we show that an $N$-dimensional discrete stochastic process can be constructed which describes the transitions between the state vectors. By solving the stationary equations of the resulting process, we find the exact stationary probability of every state vector. For a given set of file popularities, the probability that the popularity-weighted average age takes any value can be obtained by merging a certain group of stationary probabilities. In this way, the explicit popularity-weighted AoI distribution is determined for the cached files updating system. Both [36] and the current paper start by describing a sample path of the AoI stochastic process. The authors of [36] observe the AoI curve in a piecewise manner, while we record the ages of all files in every time slot and characterize the transitions of these file ages from the current time slot to the next.
The rest of this article is organized as follows. We describe the cached files updating system model and define the popularity-weighted average file age in Section II. The simple case where only two files are generated at the server is solved in Section III, and the mean of the average file age is given at the same time. The main results of this article are presented in Section IV. To illustrate the idea and the solution method, we first consider the distribution of the popularity-weighted average age for the case $N = 3$ by establishing a three-dimensional stochastic process. Afterwards, the same idea and solution procedure are applied to the general case and the explicit expression of the AoI distribution is derived. Finally, we consider the age distribution of some more general cached files updating models to which our idea and approach apply. In particular, for the case where multiple files may be updated in each time slot, we construct a stochastic process and give its stationary equations. Numerical results are offered in Section V, where we plot the distribution of the popularity-weighted average age for the cases $N = 2$ and $N = 3$ to show the trend of the average age distribution. We conclude this article and briefly discuss future work in Section VI.

II. SYSTEM MODEL AND PROBLEM FORMULATION
In this section, we describe the cached files updating system model and formulate the problem of interest.
The server generates $N$ files and sends them to a local cache. At the server, the files are supposed to be new, i.e., all the files maintained at the server have age zero. Assume that the user can only download the files from the cache because the server is located far away. Each file in the cache has its own request popularity, which reflects the download frequency from the users. As time goes on, the files stored in the cache become obsolete. To keep the files fresh, in each time slot the server refreshes the cached files with certain probabilities. The cached files updating system model is depicted in Figure 1.
We intend to find the probability distribution of the popularity-weighted average age over all cached files. In this article, we assume that transmitting each file from the server to the cache consumes exactly one time slot. Thus, when a new file arrives at the cache, it has age 1. First, for a set of file popularities $\{p_i, 1 \le i \le N\}$, the time-average popularity-weighted age is defined. Then, we give the formal definition of the popularity-weighted average file age.
Definition 1: Denote by $\bar{\Delta}$ the time-average AoI over all cached files according to a group of file popularities $\{p_i, 1 \le i \le N\}$. Then, $\bar{\Delta}$ is written as
$$\bar{\Delta} = \lim_{T \to \infty} \frac{1}{T} \sum_{k=1}^{T} \sum_{i=1}^{N} p_i a_i(k), \tag{1}$$
where $a_i(k)$ represents the age of the $i$th file at the $k$th time slot.
For every file in the cache, its age is a random variable taking values $1, 2, \ldots$. We define the popularity-weighted average file age $\Delta_{pw}$ as follows.
Definition 2: Let $a_1, a_2, \ldots, a_N$ be the random ages of files $f_1, f_2, \ldots, f_N$. The popularity-weighted average file age $\Delta_{pw}$ is defined as
$$\Delta_{pw} = \sum_{i=1}^{N} p_i a_i, \tag{2}$$
which is also a random variable.
Assume that the cache updating system is stationary and ergodic, as is common in the AoI literature. These assumptions ensure that the time-average AoI defined in (1) converges to the expectation of $\Delta_{pw}$ as time goes to infinity. Therefore, we have
$$\bar{\Delta} = E[\Delta_{pw}] = E\Big[\sum_{i=1}^{N} p_i a_i\Big]. \tag{3}$$

III. DISTRIBUTION OF POPULARITY-WEIGHTED AVERAGE AGE WHEN N = 2
In this section, based on the age distribution of a single file, we first obtain the mean of the popularity-weighted average file age, $E[\Delta_{pw}]$. Next, the probability distribution of $\Delta_{pw}$ is computed for the first case $N = 2$. We derive this distribution in two different ways, i.e., by using the convolution formula directly and by constructing a two-dimensional stochastic process.
At the beginning of each time slot, assume that the file $f_i$ is refreshed by the server with probability $c_i$, $1 \le i \le N$, identically and independently. In this article, we assume that the file refreshing probabilities are independent of their request popularities. Denote the age of $f_i$ by $a_i$, $1 \le i \le N$. We claim that all the $a_i$'s are independent. This claim is easily explained: the file age $a_i$ depends only on its value in the previous time slot and on the file updated by the server at the current time slot. As a result, $a_i$ has nothing to do with $a_j$, $j \in \{1, 2, \ldots, N\} \setminus \{i\}$.
Therefore, from equation (3) the mean of $\Delta_{pw}$ is equal to
$$E[\Delta_{pw}] = \sum_{i=1}^{N} p_i E[a_i], \tag{4}$$
which implies that the average age $E[\Delta_{pw}]$ is determined as long as the mean of every file's age is known. Next, we derive the mean and the probability distribution of a single file's age. These basic results are useful in determining the probability distribution of $\Delta_{pw}$ in the remainder of this section.
For a single file $f$ in the cache, assume that $f$ is updated with probability $c$ at each time slot. In this circumstance, the successive updates of $f$ form a Bernoulli process with parameter $c$. Equivalently, the time interval between consecutive updates follows a geometric distribution.
If we define the age of $f$ as the discrete state $n$, $n \ge 1$, then we have the following single-step transition probabilities. For $n \ge 1$,
$$P_{n,n+1} = 1 - c \quad \text{and} \quad P_{n,1} = c,$$
where $P_{i,j}$, $i, j \ge 1$, is defined as the one-step probability that the state transfers from $i$ to $j$. We depict the state transition diagram in Figure 2.
Let $\pi_n$ be the stationary probability of state $n$. Then, the stationary equations of the resulting Markov chain can be written as
$$\pi_1 = c \sum_{n=1}^{\infty} \pi_n, \qquad \pi_{n+1} = (1 - c)\,\pi_n, \quad n \ge 1. \tag{5}$$
Since all the stationary probabilities add up to 1, it is not hard to obtain
$$\pi_n = c\,(1 - c)^{n-1}, \quad n \ge 1. \tag{6}$$
Hence, the age of a single file $f$ is geometrically distributed after the cache updating system reaches the steady state. The mean of the age is calculated as
$$E[a] = \sum_{n=1}^{\infty} n\,\pi_n = \frac{1}{c}. \tag{7}$$
Given these results, we can now obtain the mean of the popularity-weighted average age $E[\Delta_{pw}]$. From equation (4), we have
$$E[\Delta_{pw}] = \sum_{i=1}^{N} p_i E[a_i] = \sum_{i=1}^{N} \frac{p_i}{c_i}.$$
We summarize the above results as a theorem.
Theorem 1: For the average age of all cached files, assume that the server updates file $f_i$ with probability $c_i$, $1 \le i \le N$, in each time slot and that the file popularities are represented by $p_i$, $1 \le i \le N$. Then, the file ages $a_i$, $1 \le i \le N$, are independent geometric random variables when the cache updating system reaches the steady state. The mean of the popularity-weighted average file age $E[\Delta_{pw}]$ is obtained as
$$E[\Delta_{pw}] = \sum_{i=1}^{N} \frac{p_i}{c_i}. \tag{8}$$
Theorem 1 has been proved in the preceding paragraphs.
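The single-file chain and the mean in Theorem 1 can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names, the truncation level `n_max`, and the example values of `p` and `c` are illustrative assumptions.

```python
import random


def stationary_age_pmf(c, n_max):
    """Geometric stationary probabilities pi_n = c (1 - c)^(n - 1), n = 1..n_max."""
    return [c * (1 - c) ** (n - 1) for n in range(1, n_max + 1)]


def simulate_single_file_age(c, slots, seed=0):
    """Empirical time-average age of one file refreshed with probability c per slot."""
    rng = random.Random(seed)
    age, total = 1, 0
    for _ in range(slots):
        # A refresh resets the age to 1; otherwise the age grows by one slot.
        age = 1 if rng.random() < c else age + 1
        total += age
    return total / slots


def mean_weighted_age(p, c):
    """Mean popularity-weighted age, sum_i p_i / c_i."""
    return sum(p_i / c_i for p_i, c_i in zip(p, c))


if __name__ == "__main__":
    print(sum(stationary_age_pmf(0.25, 400)))        # close to 1
    print(simulate_single_file_age(0.25, 200_000))   # close to 1 / 0.25
    print(mean_weighted_age([0.5, 0.3, 0.2], [0.5, 0.25, 0.25]))
```

The simulated time average approaches $1/c$, in line with the ergodicity assumption of Section II.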
To make the mean age $E[\Delta_{pw}]$ smaller, expression (8) shows that larger refreshing probabilities should be assigned to the files with greater request popularities. This result can be proved by using the following simple inequality.
Lemma 1 (Rearrangement Inequality [37]): For two sets of numbers $x_1 \le x_2 \le \cdots \le x_n$ and $y_1 \le y_2 \le \cdots \le y_n$, and any permutation $\sigma$ of $\{1, 2, \ldots, n\}$,
$$\sum_{i=1}^{n} x_i y_{n+1-i} \le \sum_{i=1}^{n} x_i y_{\sigma(i)} \le \sum_{i=1}^{n} x_i y_i.$$
In addition, notice that the mean of the popularity-weighted average age (8) is obtained by letting the server update the files randomly. Therefore, there must exist some practical policy that achieves a mean age less than $\sum_{i=1}^{N} p_i / c_i$.
Next, the probability distribution of $\Delta_{pw}$ is considered. Observe that $\Delta_{pw}$ is the sum of the $N$ independent random variables $p_i a_i$, so its distribution can be obtained as the convolution of the distributions of $p_i a_i$, $1 \le i \le N$. We have
$$\Pr\{\Delta_{pw} = j\} = \big(P_{\tilde{a}_1} * P_{\tilde{a}_2} * \cdots * P_{\tilde{a}_N}\big)(j), \tag{9}$$
where $P_{\tilde{a}_i}$, $1 \le i \le N$, represents the stationary distribution of $p_i a_i$, which is again of geometric form since $a_i$ is geometrically distributed.
So far, the mean of the average age $E[\Delta_{pw}]$ is obtained in (8), and in equation (9) we determine the probability distribution of $\Delta_{pw}$ as the convolution of the $N$ distributions $P_{\tilde{a}_i}$, $1 \le i \le N$. Although expression (9) can be applied to the general case, calculating the explicit convolution becomes harder as $N$ grows. Furthermore, expression (9) relies heavily on the assumption that all the file ages are independent. For more general cached-files-updating models, it is certainly possible that the ages of different files are correlated. Apart from this, in some cache updating systems it is possible that no file or multiple files are updated in a time slot, and the refreshing probabilities may depend on the file popularities. For these reasons, in order to find the distribution of $\Delta_{pw}$ for such cache updating models, we have to develop other approaches.
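The convolution in expression (9) can be evaluated numerically on a truncated support. The sketch below assumes independent geometric file ages, as in Theorem 1; the truncation level `n_max`, the function names, and the use of exact `Fraction` popularities (to avoid floating-point keys) are illustrative choices, not part of the original text.

```python
from collections import defaultdict
from fractions import Fraction


def scaled_geometric_pmf(p, c, n_max):
    """Distribution of p * a for a geometric on {1, 2, ...}, truncated at n_max."""
    return {p * n: c * (1 - c) ** (n - 1) for n in range(1, n_max + 1)}


def convolve(pmf_x, pmf_y):
    """Distribution of X + Y for independent X and Y given as value -> prob dicts."""
    out = defaultdict(float)
    for x, px in pmf_x.items():
        for y, py in pmf_y.items():
            out[x + y] += px * py
    return dict(out)


def weighted_age_pmf(pops, cs, n_max=60):
    """Convolution of the N scaled geometric distributions, as in expression (9).

    pops should be Fractions (or strings such as "3/10") so that equal
    weighted sums are merged exactly.
    """
    pmf = {Fraction(0): 1.0}
    for p, c in zip(pops, cs):
        pmf = convolve(pmf, scaled_geometric_pmf(Fraction(p), c, n_max))
    return pmf


if __name__ == "__main__":
    pmf = weighted_age_pmf([Fraction(1, 2), Fraction(1, 2)], [0.5, 0.5])
    for value in sorted(pmf)[:5]:
        print(value, pmf[value])
```

The cost grows quickly with $N$, which illustrates why the explicit convolution becomes impractical for larger systems.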
In the following, we show that the average age analysis can be solved by establishing a discrete stochastic process. The idea has already been used to obtain the age distribution of a single file, but now a multi-dimensional stochastic process is needed. We derive the distribution of the average age $\Delta_{pw}$ for the case $N = 2$ in this section, while the general cases $N \ge 3$ are discussed in Section IV.
Let $N = 2$. We now derive the probability distribution of the popularity-weighted average age.
Assume that the server generates only two files and that the cache size equals two as well. Define the two-dimensional integer vector $(a_{1,k}, a_{2,k})$ to be the discrete state of the cache updating system. The vector components $a_{1,k}$ and $a_{2,k}$ denote the ages of the two files at the $k$th time slot. We construct the stochastic process $\mathrm{Age}_2 = \{(a_{1,k}, a_{2,k}), k \ge 1\}$. Observe that at each time slot exactly one of the two files is updated by the server, so either $a_{1,k}$ or $a_{2,k}$ is equal to 1 at all times. At the beginning, let both files have age 1. As before, the stationary probability of the state $(n_1, n_2)$ is denoted by $\pi_{(n_1,n_2)}$.
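Suppose that the updating system has reached the steady state. The stationary equations for the process $\mathrm{Age}_2$ are written as
$$\begin{aligned}
\pi_{(n,1)} &= \pi_{(n-1,1)}\, c_2, \quad n \ge 3,\\
\pi_{(2,1)} &= c_2 \sum_{k=1}^{\infty} \pi_{(1,k)},\\
\pi_{(1,n)} &= \pi_{(1,n-1)}\, c_1, \quad n \ge 3,\\
\pi_{(1,2)} &= c_1 \sum_{k=1}^{\infty} \pi_{(k,1)}.
\end{aligned} \tag{10}$$
Equation (10) is explained as follows. First, for $n \ge 3$, the state $(n, 1)$ can be obtained from $(n-1, 1)$ when the server updates $f_2$. Since at each time slot one of the two ages has to be 1, it is impossible to reach $(n, 1)$ from a state of the form $(1, n_2)$, whose first component equals 1. Therefore, there is only one way the state $(n, 1)$ can be reached, which gives the first line of (10). However, beginning with any state $(1, k)$, $k \ge 1$, as long as the second file $f_2$ is refreshed at the next time slot, the state vector jumps to $(2, 1)$. This gives the second relation in (10). The last two equations can be explained accordingly.
Applying the first line of (10) repeatedly, for the state $(n, 1)$ we have
$$\pi_{(n,1)} = c_2\,\pi_{(n-1,1)} = \cdots = c_2^{\,n-2}\,\pi_{(2,1)} = c_2^{\,n-1} \sum_{k=1}^{\infty} \pi_{(1,k)}, \tag{11}$$
where in the last step the second equation of (10) is used. Equation (11) is valid for $n \ge 2$.
Similarly, for the state $(1, n)$, $n \ge 2$, we have
$$\pi_{(1,n)} = c_1^{\,n-1} \sum_{k=1}^{\infty} \pi_{(k,1)}. \tag{12}$$
Next, summing equations (11) and (12) from $n = 2$ to infinity yields
$$\sum_{n=2}^{\infty} \pi_{(n,1)} = \frac{c_2}{1 - c_2} \sum_{k=1}^{\infty} \pi_{(1,k)}, \qquad \sum_{n=2}^{\infty} \pi_{(1,n)} = \frac{c_1}{1 - c_1} \sum_{k=1}^{\infty} \pi_{(k,1)}. \tag{13}$$
Since all the stationary probabilities must add up to 1 and $\pi_{(1,1)} = 0$, we have
$$\sum_{n=2}^{\infty} \pi_{(n,1)} + \sum_{n=2}^{\infty} \pi_{(1,n)} = 1. \tag{14}$$
Combining (13) and (14), and using $c_1 + c_2 = 1$, gives
$$\sum_{n=2}^{\infty} \pi_{(n,1)} = c_2, \qquad \sum_{n=2}^{\infty} \pi_{(1,n)} = c_1. \tag{15}$$
Substituting (15) into equations (11) and (12), the explicit stationary probability of every state vector is determined to be
$$\pi_{(n,1)} = c_1 c_2^{\,n-1}, \qquad \pi_{(1,n)} = c_2 c_1^{\,n-1}, \quad n \ge 2.$$
With all these results, the probability distribution of $\Delta_{pw}$ can finally be obtained. Assume that the file popularities are denoted by $p_1$ and $p_2$. Then, we have
$$\Pr\{\Delta_{pw} = j\} = \sum_{p_1 n_1 + p_2 n_2 = j} \pi_{(n_1,n_2)}.$$
On the other hand, recall that the distribution of $\Delta_{pw}$ can also be calculated by directly using the convolution formula (9) when $N$ is not too large. In that calculation, step (19) uses the observation that one of the two file ages must equal 1, and step (20) uses the relation $c_1 + c_2 = 1$.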
In the above calculations, the distribution of a single file's age, which is geometric by (6), is used in determining the probability distribution of $\Delta_{pw}$.
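The two-file process can also be explored by simulation. The sketch below assumes, as described above, that exactly one of the two files is refreshed in every slot, $f_1$ with probability $c_1$ and $f_2$ with probability $c_2 = 1 - c_1$. The closed form used for comparison, $c_1 c_2^{n-1}$ for state $(n, 1)$ and $c_2 c_1^{n-1}$ for state $(1, n)$, follows from solving the balance relations under that assumption; the function names are illustrative.

```python
import random
from collections import Counter


def closed_form_two_files(c1, n_max):
    """Candidate stationary probabilities for states (n, 1) and (1, n), n >= 2.

    Assumes c2 = 1 - c1, i.e. exactly one file is refreshed per slot.
    """
    c2 = 1.0 - c1
    probs = {}
    for n in range(2, n_max + 1):
        probs[(n, 1)] = c1 * c2 ** (n - 1)
        probs[(1, n)] = c2 * c1 ** (n - 1)
    return probs


def empirical_two_files(c1, slots, seed=0):
    """Empirical state frequencies of the process started from (1, 1)."""
    rng = random.Random(seed)
    a1, a2 = 1, 1
    counts = Counter()
    for _ in range(slots):
        if rng.random() < c1:
            a1, a2 = 1, a2 + 1   # f1 refreshed, f2 ages by one slot
        else:
            a1, a2 = a1 + 1, 1   # f2 refreshed, f1 ages by one slot
        counts[(a1, a2)] += 1
    return {state: k / slots for state, k in counts.items()}


if __name__ == "__main__":
    exact = closed_form_two_files(0.3, 50)
    emp = empirical_two_files(0.3, 200_000)
    for state in [(2, 1), (3, 1), (1, 2), (1, 3)]:
        print(state, exact[state], emp.get(state, 0.0))
```

Given popularities $p_1$ and $p_2$, the distribution of the weighted average age follows by assigning each state probability to the value $p_1 n_1 + p_2 n_2$.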

IV. DISTRIBUTION OF POPULARITY-WEIGHTED AVERAGE AGE FOR GENERAL CASE
For $N = 2$, in Section III we found the stationary distribution of $\Delta_{pw}$ in two different ways. Of the two, the stochastic process method is more powerful and can handle the average age analysis for more general cache updating systems.
In this section, for general $N$ we obtain the $\Delta_{pw}$-distribution by constructing an $N$-dimensional stochastic process
$$\mathrm{Age}_N = \{(a_{1,k}, a_{2,k}, \ldots, a_{N,k}),\ k \ge 1\}.$$
At the beginning, assume that all the cached files have age 1. As preliminary work, in Proposition 1 we first discuss the characteristics of the state vectors in the state space of the process $\mathrm{Age}_N$. Afterwards, the stationary equations of $\mathrm{Age}_N$ are established in Theorem 2. The remainder of this section is divided into three subsections. In subsection A, we derive the $\Delta_{pw}$-distribution for the case $N = 3$. The idea and the solution procedure used in this case are then extended to the general situation in subsection B, where we find the explicit solution for every state vector and obtain the distribution of $\Delta_{pw}$ in Theorem 4. Finally, several more general cache updating models are discussed in subsection C. For these models, the analysis of the average file age can be solved by constructing an appropriate stochastic process. For example, for the case where multiple files may be updated in a time slot, we construct the stochastic process $\mathrm{Age}^{L}_{N}$ and determine its stationary equations in subsection C.
Proposition 1: For the state vectors of the process $\mathrm{Age}_N = \{(a_{1,k}, a_{2,k}, \ldots, a_{N,k}), k \ge 1\}$, the following statements hold:
(i) except for the initial state, every state vector has exactly one component equal to 1;
(ii) if the files $f_i$ and $f_j$, $i, j \in \{1, 2, \ldots, N\}$, $i \ne j$, have each been updated by the server at least once, then they never have the same age, i.e., $a_i \ne a_j$ at all times;
(iii) multiple file ages are equal only when none of the corresponding files has been refreshed by the server since the beginning. In addition, the value of this age is maximal over all the components of the state vector.
Moreover, the state vectors with several identical components have stationary probability zero. As a result, when solving the stationary equations, we only need to consider the states in which all file ages are different.
Proof: The first statement is obvious. Since at each time slot one file in the cache is refreshed, one of the $N$ file ages is reset to 1. Thus, at all times there is one component equal to 1 in an arbitrary state vector. This explains the first statement. Assume that $f_i$ and $f_j$ are updated at two different time slots. Then their ages $a_i$ and $a_j$ will not be equal, because the two ages start at different times and both start from 1. Therefore, statement (ii) holds. Finally, since we assume that the initial state is $(1, \ldots, 1)$, if some files are never updated by the server, then their ages remain equal. Observe that updating a file can only reduce its age, so the files that have never been refreshed have the maximal age. This proves statement (iii).
Next, we prove that the stationary probability of a state vector is zero as long as it has identical vector components. Suppose that the state vector at the $k$th time slot is $(1, n_2, n_3, \ldots, n_i, M, \ldots, M)$, where the last $(N - i)$ file ages are all equal to $M$. By statement (iii), none of these files has ever been updated by the server, so their ages are maximal. Without loss of generality, for the first $i$ components of the state vector, assume that $1 < n_2 < n_3 < \cdots < n_i$. By repeatedly finding the state at the previous time slot, we obtain
$$(1, 2, n_3 - n_2 + 2, \ldots, n_i - n_2 + 2, M - n_2 + 2, \ldots, M - n_2 + 2)$$
at the $(k - n_2 + 2)$th time slot. Going backward one more time slot, we observe that the first file $f_1$ was updated. Denoting the age of $f_1$ before this refresh by $l_1$, the ages of all files in the $(k - n_2 + 1)$th time slot are
$$(l_1, 1, n_3 - n_2 + 1, \ldots, n_i - n_2 + 1, M - n_2 + 1, \ldots, M - n_2 + 1).$$
Since the last $(N - i)$ files have the largest age among all the files, the relation $l_1 \le M - n_2 + 1$ holds.
Continuing with this procedure, assume that $l_1 > n_i - n_2 + 1$, so that the age of $f_3$ returns to 2 at the $(k - n_3 + 2)$th time slot. Then, in the $(k - n_3 + 1)$th time slot, $f_2$ was updated by the server. Denote the state vector as
$$(l_1 - n_3 + n_2, l_2, 1, \ldots, n_i - n_3 + 1, M - n_3 + 1, \ldots, M - n_3 + 1).$$
Again, $l_2 \le M - n_3 + 1$ holds, and we assume that $l_2 > n_i - n_3 + 1$. Proceeding in this manner, eventually we obtain a state vector containing multiple 1's. For a concrete example, starting with $(1, 3, 5, 7, 7)$, we show the evolution of the file ages in Table 1.
In particular, if $l_1$, $l_2$ and $l_3$ in Table 1 equal the maximal values they can take, the state vector returns to $(1, 1, 1, 1, 1)$ in the end. The detailed state evolution is given in Table 2.
In the first case, we reduce the $N$ file ages to a state vector with multiple 1's. However, statement (i) shows that there is only one file having age 1 in an arbitrary state vector. Therefore, these states are impossible and their stationary probabilities are zero. In Table 2, we give a procedure that reduces a state with identical components to the initial state $(1, \ldots, 1)$. This implies that the state is achievable. But notice that the stationary probability of $(1, \ldots, 1)$ itself is equal to zero, since the state vector never returns to $(1, \ldots, 1)$ once it has left this state. Summarizing both cases, we conclude that all state vectors with identical components, i.e., all cases where multiple files have the same age, have stationary probability zero. This completes the proof of Proposition 1.
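The three statements of Proposition 1 can be sanity-checked on a simulated sample path. This is a minimal sketch, assuming that exactly one file is refreshed per slot and that file $i$ is chosen with probability `c[i]`; the function names and the bookkeeping of which files have been refreshed are illustrative.

```python
import random


def simulate_age_vectors(c, slots, seed=0):
    """Sample path of the age vector, started from (1, ..., 1).

    Returns a list of (ages, updated) pairs, where updated[i] records whether
    file i has been refreshed at least once so far.
    """
    rng = random.Random(seed)
    n = len(c)
    ages = [1] * n
    updated = [False] * n
    history = []
    for _ in range(slots):
        i = rng.choices(range(n), weights=c)[0]
        ages = [a + 1 for a in ages]   # every file ages by one slot
        ages[i] = 1                    # the refreshed file restarts at age 1
        updated[i] = True
        history.append((tuple(ages), tuple(updated)))
    return history


def check_proposition_1(history):
    """True if statements (i)-(iii) hold at every recorded slot."""
    for ages, updated in history:
        if ages.count(1) != 1:                       # statement (i)
            return False
        refreshed = [a for a, u in zip(ages, updated) if u]
        if len(set(refreshed)) != len(refreshed):    # statement (ii)
            return False
        never = [a for a, u in zip(ages, updated) if not u]
        if never and (len(set(never)) != 1 or never[0] != max(ages)):
            return False                             # statement (iii)
    return True


if __name__ == "__main__":
    path = simulate_age_vectors([0.5, 0.3, 0.2], 2000, seed=1)
    print(check_proposition_1(path))
```

A passing check on one path is of course not a proof; it only illustrates the structure of the state space used in the stationary equations.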
Assuming that the stochastic process $\mathrm{Age}_N$ has reached the steady state, in the following we determine the stationary equations of $\mathrm{Age}_N$.
Theorem 2: For state vectors $(n_1, n_2, \ldots, n_N)$ of the process $\mathrm{Age}_N$ in which all the components are different, the stationary probabilities $\pi_{(n_1,\ldots,n_N)}$ satisfy the system of stationary equations (24). Notice that in Theorem 2, all the permutations of the state vector $(n_1, n_2, \ldots, n_N)$ should be included. Thus, there is a separate group of equations like (24) corresponding to every permutation of $(n_1, n_2, \ldots, n_N)$.
Proof: Define $\pi_{(n_1,n_2,\ldots,n_N)}$ as the stationary probability of the state $(n_1, n_2, \ldots, n_N)$. According to Proposition 1, we only need to consider the state vectors whose components are all different.
First of all, consider the state vector $(1, n_1, n_2, \ldots, n_{N-1})$ where all the ages are different and $n_k \ge 3$, $1 \le k \le N - 1$. It is easy to see that the probability $\pi_{(1,n_1,\ldots,n_{N-1})}$ is equal to $\pi_{(1,n_1-1,\ldots,n_{N-1}-1)}\, c_1$. Notice that at any time exactly one of the $N$ files has age 1. Here, the file of age 1 at the previous time slot must again be $f_1$, since the other $(N - 1)$ files, whose current ages are at least 3, had ages of at least 2. In this case, at the previous time slot, the state vector can only be $(1, n_1 - 1, \ldots, n_{N-1} - 1)$.
A total of $N(N - 1)$ equations like (23) can be obtained by considering all the cases $n_i = 1$ and $n_j = 2$, where $i, j \in \{1, 2, \ldots, N\}$ and $i \ne j$.
This completes the proof of Theorem 2.

A. SOLVING THE STATIONARY EQUATIONS FOR THE CASE N = 3
In the following, we solve the system of equations (24) in Theorem 2 so that all the stationary probabilities can be found. To show the idea and the solution method, we first consider the case $N = 3$. The explicit expression of the stationary probability of each state vector is obtained and given in Theorem 3.
Theorem 3: For the stochastic process $\mathrm{Age}_3$, the solution to the stationary equations, i.e., the stationary probabilities of the state vectors, is determined as follows. First, the state $(1, 2, 3)$ and all of its permutations have the same stationary probability,
$$\pi_{(1,2,3)} = \pi_{(1,3,2)} = \pi_{(2,1,3)} = \pi_{(2,3,1)} = \pi_{(3,1,2)} = \pi_{(3,2,1)}.$$
For the state vectors $(1, 2, n)$, $n \ge 3$, and their permutations, the stationary probabilities are given in closed form. Finally, closed-form expressions are given for the stationary probabilities of the states $(1, n_1, n_2)$ with $n_1, n_2 \ge 3$, and likewise for $\pi_{(n_1,1,n_2)}$ and $\pi_{(n_1,n_2,1)}$.
In Theorem 3, for the case $N = 3$ we completely solve the stationary equations (24) and determine the stationary probability of each state vector. The detailed calculations are postponed to Appendix A.
Given all the stationary probabilities, in the following we compute the probability distribution of the popularity-weighted average file age $\Delta_{pw}$.
Continuing the calculation by hand is tedious and unnecessary, since we know the stationary probability of every state vector $(m_1, m_2, m_3)$, and the integer triples satisfying $p_1 m_1 + p_2 m_2 + p_3 m_3 = j$ can be found rapidly by a simple computer program. We will provide numerical results for some examples in Section V.
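The search for integer triples mentioned above can be sketched as a brute-force enumeration. The function name, the search bound `max_age`, and the example popularities are illustrative assumptions; exact `Fraction` popularities are used so that the equality test is not affected by rounding.

```python
from fractions import Fraction
from itertools import product


def states_with_weighted_age(pops, target, max_age):
    """Admissible age vectors (m_1, ..., m_N) with sum_i p_i * m_i == target.

    Only states allowed by Proposition 1 are kept: exactly one age equals 1
    and all ages are distinct. Ages above max_age are not searched.
    """
    hits = []
    for ages in product(range(1, max_age + 1), repeat=len(pops)):
        if ages.count(1) != 1 or len(set(ages)) != len(ages):
            continue
        if sum(p * m for p, m in zip(pops, ages)) == target:
            hits.append(ages)
    return hits


if __name__ == "__main__":
    pops = (Fraction(1, 2), Fraction(3, 10), Fraction(1, 5))
    print(states_with_weighted_age(pops, Fraction(17, 10), max_age=10))
    print(states_with_weighted_age(pops, Fraction(5, 2), max_age=10))
```

Summing the stationary probabilities of the returned states gives the probability that the weighted average age equals the target value.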
The idea and solving method for the simple case N = 3 are helpful when we derive the explicit solutions to the stationary equations for the general case.

B. DETERMINING THE AVERAGE AGE DISTRIBUTION FOR GENERAL CASE
The stationary equations for the general case are solved in two steps. We first calculate the probabilities of certain state vectors. Then, the other stationary probabilities are found by introducing a permutation operator and a one-to-one mapping induced by the permutation. Without loss of generality, the probability $\pi_{(1,n_1,n_2,\ldots,n_{N-1})}$ is computed assuming that the file ages satisfy $n_{N-1} > n_{N-2} > \cdots > n_2 > n_1$.
Lemma 2: The following statements concerning the stationary probabilities hold.
Assume that the state vector at the beginning is $(j_1, j_2, j_3, \ldots, j_k, n_{k+1} - k, \ldots, n_N - k)$. By refreshing the first $k$ files in the order from $f_k$ to $f_1$, the state vector eventually jumps to $(1, 2, \ldots, k, n_{k+1}, \ldots, n_N)$. Therefore, equation (42) holds. As mentioned in Proposition 1, at any time there must be one file having age 1, so the summation in (42) can be divided into the $k$ disjoint sums in (43) according to which file has age 1. For equation (44), let the server update the first $k$ files, except the file of age 1, in an arbitrary order. Different update orders for the other $(k - 1)$ files create different state vectors. However, notice that in (43) the last $(N - k)$ components of those vectors are the same and the first $k$ file ages are always some permutation of $(1, 2, 3, \ldots, k)$. Since we have proved in statement (ii) that the stationary probabilities $\pi_{(\sigma_l(1,2,3,\ldots,k),\,n_{k+1}-1,\ldots,n_N-1)}$ in (43) are all equal, the last equation (45) follows, and finally we obtain that π (1,2,3,...,k,n k+1 ,...,n N ) = π (1,2,3,...,k,n k+1 −1,...,n N −1

This completes the proof of Lemma 2.

So far, for any $N$ file ages $(n_1, n_2, n_3, \ldots, n_N)$, we have determined the stationary probability $\pi_{(n_1,n_2,n_3,\ldots,n_N)}$ in Theorem 4, assuming that the cache updating system has reached the steady state. Taking the request popularities $p_i$, $1 \leq i \leq N$, of the cached files into consideration, the probability that the popularity-weighted average age pw equals $j$ is obtained as

$$\Pr\{\mathrm{pw} = j\} = \sum_{p_1 n_1 + p_2 n_2 + \cdots + p_N n_N = j} \pi_{(n_1,n_2,n_3,\ldots,n_N)}, \qquad (46)$$

where each stationary probability in the sum is given explicitly by Theorem 4. Although (46) contains a sum over a set of state vectors, determining these states by numerical calculation is not complicated. Thus, expression (46) essentially gives the probability distribution of pw.
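The merging step in (46) can be illustrated with a short sketch. It assumes the stationary probabilities are already available as a mapping from state vectors to probabilities; the function name `merge_by_weighted_age` and this data layout are hypothetical choices made for the illustration, not part of the article.

```python
from collections import defaultdict
from fractions import Fraction


def merge_by_weighted_age(stationary_probs, popularities):
    """Group stationary probabilities by the weighted age p1*n1 + ... + pN*nN.

    stationary_probs: dict mapping a state tuple (n1, ..., nN) to its
    stationary probability (assumed to be supplied by the caller).
    popularities: request popularities, given as exact rationals.
    """
    p = [Fraction(x) for x in popularities]
    distribution = defaultdict(float)
    for state, prob in stationary_probs.items():
        # Weighted average age of this state, kept exact as a Fraction.
        j = sum(pi * ni for pi, ni in zip(p, state))
        distribution[j] += prob
    return dict(sorted(distribution.items()))
```

Because the weighted ages are kept as exact fractions, states with the same value of $j$ are merged reliably, which would not be guaranteed if the keys were floating-point numbers.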

C. POPULARITY-WEIGHTED AVERAGE AGE DISTRIBUTION OF MORE GENERAL CACHED FILES UPDATING MODELS
In the previous paragraphs of this Section, we showed that by establishing an $N$-dimensional stochastic process, which describes the changes of the $N$ file ages simultaneously, the probability distribution of the popularity-weighted average age over all the cached files can be determined. We obtained the mean E[pw] and the pw-distribution for the general case where $N$ files are generated at the server. More importantly, the idea of constructing a discrete stochastic process can be applied to cache updating models with more general settings. For example, assume that at each time slot the server can update multiple files simultaneously, or that no file is updated at all. Allowing multiple updates in one time slot dramatically increases the number of possible state vectors, but the random transitions of the $N$ file ages can still be characterized by the stationary equations of the newly established age process. In the following, we provide the stationary equations for this case without further calculation.

Now, let the cache size be $N$. We consider the cached files updating problem assuming that the server can refresh multiple files at each time slot. Here, an $N$-dimensional random process is established as well, and only its stationary equations are given. Finding closed-form solutions for the stationary probabilities is not included in this article. We may solve the stationary equations in future work.
At each time slot, suppose that the server can update $l$ files randomly, $0 \leq l \leq L$, resetting the ages of these files to 1. The refreshing probabilities are defined as $c_I$, $I \in \mathcal{A}$, where $\mathcal{A}$ denotes the collection of all the subsets of $\mathcal{N} = \{f_1, f_2, \ldots, f_N\}$ whose size is no greater than $L$. For the sake of simplicity, the probability $c_I$ can also be indexed by the set of indices of the files in $I$. The probability $c_\emptyset$ corresponds to the case in which no file is updated, so that all the file ages increase by one in this time slot. Define the $N$-dimensional random process $\mathrm{Age}_N^{(L)}$, whose state at the $k$th time slot is $(a_{1k}, a_{2k}, \ldots, a_{Nk})$, where $a_{ik}$ represents the age of the file $f_i$ at the $k$th time slot.
At the current time slot, assume that the $N$ file ages form the state $s = (n_1, n_2, \ldots, n_N)$. Recall that the files from the server are always fresh and that the transmission of each file from the server to the cache takes exactly one time slot. This ensures that at the next time slot the age of any file $f_i$ jumps to either $n_i + 1$ or 1, depending on whether or not this file is updated at the current time slot. Let $A$ denote the set of files that are refreshed by the server, and define another state vector $s' = (n'_1, n'_2, \ldots, n'_N)$, where

$$n'_i = n_i \, I\{f_i \notin A\} + 1, \quad 1 \leq i \leq N.$$

The indicator function $I(\cdot)$ equals 1 if its argument holds and 0 otherwise. Thus, the single-step transition probability $P_{s,s'}$ that the state vector changes from $s$ to $s'$ is equal to the refreshing probability of the set $A$, that is, $P_{s,s'} = c_A$.

Since the server can update several files in a time slot, at most $L$ files in the cache have age 1. In addition, it is also possible that all the file ages are greater than 1, because with a non-zero probability $c_\emptyset$ the server does not refresh any file. The characteristics of the state vectors in the state space of the process $\mathrm{Age}_N^{(L)}$ can be analyzed as in Proposition 1. Now, in a state vector there are at most $L$ components equal to 1. Without loss of generality, assume that all these 1s occur in the first $L$ locations of the state vector $s$. The case in which some 1s are contained in the last $(N - L)$ components can be converted to the former case by suitable vector permutations and can be discussed accordingly. Suppose that the updating system reaches the steady state, and let $\pi_s$ be the stationary probability of the state vector $s$.
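The one-step transition of the multiple-update model can be sketched in a few lines. This is an illustrative sketch only: files are indexed from 0, the refreshed set is a `frozenset` of indices, and the names `next_state`, `step` and `refresh_probs` (a mapping from each admissible subset to its refreshing probability) are assumptions introduced here.

```python
import random


def next_state(state, refreshed):
    """Apply one time slot: refreshed files reset to age 1, the others age by one."""
    return tuple(1 if i in refreshed else n + 1 for i, n in enumerate(state))


def step(state, refresh_probs, rng):
    """Draw the refreshed set according to refresh_probs and return the new state.

    refresh_probs: dict mapping frozenset of file indices -> probability,
    including frozenset() for the slot in which no file is updated.
    """
    subsets = list(refresh_probs)
    weights = [refresh_probs[s] for s in subsets]
    refreshed = rng.choices(subsets, weights=weights, k=1)[0]
    return next_state(state, refreshed)
```

For example, from the state $(3, 1, 5)$ refreshing the first and third files gives $(1, 2, 1)$, while refreshing nothing gives $(4, 2, 6)$, so a state may contain several components equal to 1 or none at all.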

V. NUMERICAL SIMULATION
In this Section, we depict the probability distribution of pw for the cases $N = 2$ and $N = 3$ to show the trend of the distribution curves.
The explicit expression of the pw-distribution for the case $N = 2$ has been obtained in equation (18). We depict this probability distribution in Figure 3. Assuming that the refreshing probabilities are $c_1 = 0.3$ and $c_2 = 0.7$, we consider two different groups of file popularities and draw the distribution curves. For the case $N = 2$, the curves show that when one of the two files has both the larger updating probability and the larger file popularity, the probability distribution decays at a faster rate.
Notice that the age of each file is an integer, so the average file age takes only discrete values. In addition, the minimal average age under our setting is equal to

$$\min\{0.6 \times 1 + 0.4 \times 2,\; 0.6 \times 2 + 0.4 \times 1\} = \min\{1.4,\, 1.6\} = 1.4, \qquad (50)$$

which holds for both distribution curves.
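An empirical check of this setting can be sketched with a short Monte Carlo simulation. The sketch assumes the single-update model in which exactly one file is refreshed per slot, file $f_i$ with probability $c_i$; the function name `simulate_weighted_age`, the number of slots and the seed are arbitrary choices made here, and the output is an estimate rather than the closed-form distribution in (18).

```python
import random
from collections import Counter
from fractions import Fraction


def simulate_weighted_age(refresh_probs, popularities, num_slots, seed=0):
    """Estimate the distribution of the popularity-weighted average age.

    refresh_probs: c_i, probability that file i is the one refreshed in a slot.
    popularities: p_i as Fractions, so the weighted ages are exact dictionary keys.
    """
    rng = random.Random(seed)
    n = len(refresh_probs)
    ages = list(range(1, n + 1))  # arbitrary valid starting state
    counts = Counter()
    for _ in range(num_slots):
        updated = rng.choices(range(n), weights=refresh_probs, k=1)[0]
        # The refreshed file returns to age 1; every other file ages by one slot.
        ages = [1 if i == updated else a + 1 for i, a in enumerate(ages)]
        counts[sum(p * a for p, a in zip(popularities, ages))] += 1
    return {j: c / num_slots for j, c in sorted(counts.items())}
```

Running the sketch with $c = (0.3, 0.7)$ and popularities $(3/5, 2/5)$ or $(2/5, 3/5)$ yields a smallest observed weighted age of $7/5 = 1.4$ in both cases, in agreement with (50).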
To draw the average age distribution for the case $N = 3$, we have to compute the explicit expression of the pw-distribution. Here, we use the convolution formula (9). For the sake of simplicity, assume that the three files have identical request popularity. Then, for $j \geq 2$, we have

where $I(\cdot)$ denotes the indicator function. The calculation details are given in Appendix B. Expression (51) shows that when $j = 7/3, 9/3, 11/3, \ldots$, the last three terms are present and are subtracted. We first depict the exact discrete probability points of pw in Figure 4. In Figure 5 we also provide the upper bound and lower bound of the pw-distribution, obtained by setting the indicator function equal to 0 and 1 at all times, respectively.
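The values of $j$ at which the subtracted terms appear can be listed with a small sketch. It assumes, following Appendix B, that the correction applies exactly when the two ages larger than 1 could coincide, that is, when $(3j - 1)/2$ is an integer; the function names are hypothetical.

```python
from fractions import Fraction


def equal_age_case_exists(j):
    """True if two files could share the age (3j - 1)/2 while the third has age 1."""
    total = 3 * Fraction(j) - 1  # required sum of the two ages larger than 1
    return (
        total.denominator == 1
        and total.numerator % 2 == 0
        and total.numerator >= 4  # each of the two equal ages must be at least 2
    )


def correction_points(max_l):
    """Average ages j = l/3, 6 <= l <= max_l, at which the indicator equals 1."""
    return [
        Fraction(l, 3)
        for l in range(6, max_l + 1)
        if equal_age_case_exists(Fraction(l, 3))
    ]
```

Calling `correction_points(11)` gives the values $7/3$, $9/3$ and $11/3$, matching the pattern stated for expression (51).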
In Figure 6 we show the pw-distribution under two different groups of refreshing probabilities, where the upper bound of pw is depicted. The numerical results show that as the updating probabilities get closer to the uniform distribution, the distribution of pw falls faster, but from a larger initial value.

VI. CONCLUSION
For the cached files updating system, in this article we consider the stationary distribution of the popularity-weighted average file age in the discrete-time model. We show that an $N$-dimensional discrete stochastic process can be constructed to describe the random transitions of the $N$ file ages simultaneously. The stationary probability of an arbitrary state vector can be found once the stationary equations of the resulting process are solved. Then, the probability distribution of the popularity-weighted average age for a given set of file popularities is obtained by merging a proper group of stationary probabilities. For the cases $N = 2$ and $N = 3$, numerical results are given, which show that when the updating probabilities are well matched with the file popularities, the average age distribution falls rapidly. Therefore, to obtain a smaller average file age, the server should update the files with higher request popularities more often. This conclusion is consistent with the results of earlier work on AoI-minimizing system design. Since the random updating scheme is used in this article, there exist certain practical policies whose average AoI is less than the mean age given in (8).
Finally, some more general cache updating models are discussed. In particular, for the case in which the server can update several files in one time slot, we give the stationary equations of the discrete stochastic process constructed for this case. In future work, we intend to consider the probability distribution of the popularity-weighted average age for more general models, for example, models in which the popularities and the updating probabilities of the files are dependent or even time-varying.

APPENDIX A PROOF OF THEOREM 3
In this part, we prove Theorem 3.
The two summations in (53) are dealt with further as follows.
The explicit expressions of $\pi_{(n_1,1,n_2)}$ and $\pi_{(n_1,n_2,1)}$ are calculated by the same method and are given directly as follows. For $\pi_{(n_1,1,n_2)}$ we have

So far, all the stationary probabilities have been obtained, which completes the proof of Theorem 3.

APPENDIX B COMPUTING THE DISTRIBUTION EXPRESSION FOR THE CASE N = 3
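Suppose that the file popularities are identical, i.e., $p_1 = p_2 = p_3 = 1/3$. Then, we have that

$$\Pr\{\mathrm{pw} = j\} = \Pr\left\{\frac{a_1 + a_2 + a_3}{3} = j\right\} \qquad (69)$$

$$= \Pr\{a_2 + a_3 = 3j - 1\} + \Pr\{a_1 + a_3 = 3j - 1\} + \Pr\{a_1 + a_2 = 3j - 1\}. \qquad (70)$$

From expression (69), we obtain that $3j \geq 6$, because the three ages are distinct positive integers and their sum is therefore at least $1 + 2 + 3$. Hence, in this case the average age $j$ can be written as $l/3$, where $l \geq 6$.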
For the last equality, we use the fact that one of the three files has age 1 at all times. Recall that all the file ages are independent and geometrically distributed. Thus, the first term in (70) can be calculated as

For the first probability, we let $a_1 = 1$. Since all the file ages are different, the age $a_2$ is larger than 1. Writing $a_2 = k$, the file age $a_3 = 3j - 1 - k$ becomes smaller as $k$ gets larger. Similarly, this age is also larger than 1, which gives the maximal value that $k$ can take, namely $k \leq 3j - 3$. Notice that the file ages $a_2$ and $a_3$ are also different, so we should exclude from (71) the case in which $a_2$ and $a_3$ are equal, which arises if $a_2 = a_3 = (3j - 1)/2$ is an integer. This event occurs with probability

The other two probabilities in equation (70) can be computed and discussed accordingly; the case in which two files have identical age should be discarded as well. Using the indicator function $I(\cdot)$, we represent the final result as