Multi-Antenna Coded Caching for Location-Dependent Content Delivery

Human-computer interaction continuously evolves towards a genuinely immersive experience, submerging users in a three-dimensional (3D) virtual world. A realistic, immersive experience necessitates a highly reliable and agile wireless connection to support immense data transmission. Yet, there are abundant but underutilized memory resources available at the devices which can be harnessed as supplementary assets to reduce the excessive burden on the wireless medium. What is more, the use of Coded Caching (CC) techniques enables the cumulative cache memory of users in the network to be used as an additional communication resource. To this end, a location-dependent multi-antenna CC-based content delivery scheme tailored specifically for wireless extended reality applications is proposed in this paper. First, a novel memory allocation process is developed, enabling an appropriate trade-off between local and global caching gains. In this regard, the local caching gain is maximized when the memory is mostly dedicated to locations with poor connectivity conditions (absolute fairness). In contrast, the global caching gain is maximized when the memory is uniformly allocated among all the locations. As a result of the memory allocation process, unequal fractions of location-dependent multimedia content are cached by each user. Given the asymmetric cache placement, a novel algorithm is proposed to create suitable codewords for each user during the subsequent delivery phase, which simultaneously achieves a global and local caching gain. The proposed delivery scheme also combines global caching and spatial multiplexing gains using a weighted max-min multicast beamformer design with multi-rate modulation. Numerical experiments and mathematical analysis demonstrate significant performance gains, in terms of the 95-percentile expected delivery time, compared to unicast and multicast scenarios where either the local or global caching gain is maximized.


I. INTRODUCTION
It is expected that 5G penetration will surpass the ten percent mark by 2023, while the average per-user throughput will see more than a ten-fold increase compared with what was achievable five years earlier with 4G-LTE [1]. This is primarily due to new data-intensive services such as wireless extended reality (XR) applications offered by 5G and beyond [2]-[8]. Wireless XR applications necessitate stringent quality of service (QoS) in terms of both low latency (< 10 ms) and high throughput (6.37-95.55 Gbps) [2]-[8]. Indeed, supporting the high data-rate wireless connectivity with low latency necessitated by such data-intensive applications calls for more advanced solutions than merely increasing the available bandwidth [4]. Meanwhile, improving caching and computing capabilities at end-users has been deemed highly effective in increasing the transmission efficiency [7]-[9]. As such, upcoming mobile broadband applications rely heavily on asynchronous content reuse [10], and hence, proactive caching of popular content at the end-users could relieve network congestion and bandwidth consumption during peak traffic demand times [11]. In this regard, various works have considered proactive caching in a single-input single-output (SISO) setting to demonstrate its potential [12]-[15]. Specifically, by utilizing the caching and computing capabilities of XR mobile gadgets, the traffic burden over the wireless network can be effectively alleviated. Moreover, significant bandwidth and delay-reduction gains have also been demonstrated in [12]-[15].
The coded caching (CC) technique, initially proposed by Maddah-Ali and Niesen in [16], has recently gained attention due to an additional global caching gain compared to traditional (local) caching schemes. This gain is achieved by intelligent utilization of the aggregate cache memory available throughout the network. Remarkably, the global caching gain scales linearly with the total number of users in the network, making it appealing for multi-user collaborative use cases such as XR applications [17]. In this regard, a recent work in [18] has reduced the transmission bandwidth for delay-constrained XR applications by leveraging coded cache placement and mobile devices' computing capabilities in a SISO setup. However, an exciting property of CC schemes is their capability to combine global caching and spatial multiplexing gains resulting from multi-antenna transmissions [19]. This makes CC even more appealing, as multi-antenna connectivity plays a critical role in future communication systems [3]. Nevertheless, there is a gap in the literature when it comes to applying multi-antenna CC techniques to XR setups, especially taking advantage of their location-dependent content access characteristics.
In this paper, we introduce a new multi-antenna CC delivery scheme with location-dependent content requests, well-tailored for future collaborative XR applications. In the proposed setting, a single transmitter equipped with multiple antennas has access to a library and serves a group of cache-enabled users. We consider a wireless connectivity scenario where the users are free to move, and their requested contents depend on their instantaneous locations in the application environment. Such a scenario entails a substantial volume of multimedia traffic with guaranteed QoS throughout the operating environment. In this regard, a location-dependent, uneven memory allocation is carried out based on the approximated or predicted data rate at each given location.
Specifically, the portion of memory dedicated to each location is affected by the quality of wireless connectivity at that location. This is in contrast to conventional CC schemes, where the same portion of the memory is dedicated to each file in the library, necessitating new delivery schemes to be devised. Thus, a novel packet generation scheme is introduced to handle the irregularity by creating packets with sizes proportional to the corresponding uneven cache ratios.
Finally, a multicast beamforming scheme with an underlying multi-rate modulation is proposed to leverage global caching and multiplexing gains simultaneously and hence, to improve the QoS compared to the state-of-the-art.

A. Prior Art
Single- and multi-antenna coded caching. Encouraged by the appealing CC gains, the original error-free single-server system model in [16] was later extended to various other practical scenarios such as multi-server and wireless multi-antenna coded caching [19]-[21]. Interestingly, the global caching gain was shown to be additive with the spatial multiplexing gain when CC is applied to a multi-antenna setup [19]. Moreover, an optimized multi-antenna precoder design was shown to be crucial for CC, especially in the low signal-to-noise ratio (SNR) regime, to appropriately account for the inter-stream interference [21]. Device-to-device (D2D) extensions of multi-antenna coded caching can also be found in [22]-[25]. Specifically, while [22] and [23] considered an infrastructure-less network where the only available link is D2D, works [24] and [25] extended this system model to a general framework where the downlink transmission is assisted with D2D links. Meanwhile, various practical limitations of CC were also addressed by the research community. Most notably, it is well known that to achieve the original caching gain proposed in [16], the underlying scheme requires splitting finite-length files into an exponentially growing number of subpackets (with respect to the network size) [26]. This exponential growth is even more severe in multi-antenna setups [19]-[25], motivating the research on reduced-subpacketization CC schemes with no or moderate performance loss [26]-[28]. In a similar work, the effect of the subpacketization on the low-SNR rate was also investigated in [29]. Unlike [19]-[29], which consider perfect channel state information at the transmitter (CSIT), the authors in [30] devise a scheme for imperfect CSIT that scales with the number of users. Finally, as in a CC network a common message is transmitted to several users, users' privacy requirements to prevent information leakage are also addressed in the literature (e.g., [31] and [32]).
Coded caching with multi-rate transmission.A less-studied problem of CC schemes, affecting content delivery applications in general and XR applications in particular, is the near-far issue.
Specifically, due to the underlying multicasting nature of conventional CC schemes [19]-[21], the achievable rate in any multicast message is limited to the rate of the user with the worst channel conditions. In fact, as studied in [33], the effective gains of conventional SISO-CC schemes could entirely vanish in the low-SNR region due to the near-far issue. To address this shortcoming, a congestion control technique was proposed in [34] to avoid serving users in adverse channel conditions, and multiple descriptor codes (MDC) were utilized in [35] to serve ill-conditioned users with a lower quality of experience (QoE). Similarly, a stochastic CC model considering queue minimization and packet control was introduced in [36], and joint power minimization and scheduling over a wireless CC network for delay-constrained applications was also proposed in [37]. Using a different perspective, it was discussed in [38] that as long as user mobility patterns were known at the server, different cache profiles could be assigned to multiple cache-enabled helper nodes scattered throughout the environment to improve the CC transmission rate. Moreover, guiding users towards locations with preferable conditions in an immersive XR application using learning-based techniques was considered in [39], and an order-optimal location-based coded cache placement was proposed in [40] to assign different cache profiles to cache-enabled transmitters located in distinct locations.
Unlike [35]-[40], which were based on standard XOR-ing of data elements, nested code modulation (NCM) was utilized in [41] to allow building codewords that serve every user in the multicasting group with a different rate. Several other multi-rate modulation schemes can also be found in [42]-[45]. The multi-rate property in these schemes was achieved by altering the modulation constellation using side information available to each user. Later on, the authors in [46] and [47] benefited from the shared-cache idea of [26] along with the NCM scheme to compensate for the near-far problem caused by the users with adverse channel conditions. However, all these works are either limited to single-antenna transceivers or to fixed-connectivity network conditions where the users' rates are fixed and known to the transmitter. Thus, the near-far issue still needs to be addressed in both multi-antenna setups and real-time applications where users frequently move within the network, and their achievable rate changes accordingly.

B. Our contribution
This paper proposes a novel CC-based multi-antenna content delivery scheme for location-dependent data requests, particularly focusing on collaborative XR applications that require high data-rate connectivity and are bound to strict delay constraints. In realistic multiuser environments, available radio resources should be shared among all the users of the given XR application, limiting the available link qualities due to a higher network load [17]. In such a scenario, efficient utilization of in-device memories available to the users will be highly beneficial. In a collaborative XR setup, all users are served simultaneously in a bounded environment, where the actions and choices of each user affect the final results perceived by all users. Specifically, we follow the XR connectivity framework in [17], where the XR content is decomposed into the so-called static and dynamic parts (see Figure 1) and multi-antenna CC techniques are used to deliver the cacheable part efficiently. In addition, cache-enabled users are scattered in the application environment and move freely. As users change their position in the environment, their achievable rate varies based on their location. We assume that the XR application environment is split into several single transmission units (STU) such that a separate 3D omnidirectional image is needed to reconstruct the virtual environment in each STU. Each user requests the server to receive the missing data to reconstruct its field of view (FoV). After collecting all users' requests, the server transmits the missing data for dynamic and static file parts to all users (over the air) while also instructing them on reconstructing their FoVs using cache content and the delivered data.
Depending on the distance from the transmitter and possible infrastructure elements obstructing the wireless link in the XR environment, communication quality could vary for different STUs.
Therefore, this paper aims to design caching and delivery schemes that minimize the transmission time and avoid excessive delays in serving all the XR users in a non-uniform wireless connectivity scenario with location-dependent content requests. Intuitively, larger cache portions are allocated to the contents requested in the wireless connectivity bottlenecks to avoid excessive delivery duration and minimize the average transmission time. In this regard, we first design a new memory allocation process that uses the available predictions of the achievable rate within the application environment to prioritize the caching of the content requested in STUs with reduced connectivity. The proposed allocation process enables a trade-off between local and global caching gains, such that the local caching gain is maximized when the memory is mostly dedicated to locations with poor connectivity conditions (absolute fairness), and the global caching gain is maximized when the memory is uniformly allocated among all the locations. Then, for the resulting non-uniform cache allocation setup, a novel content delivery algorithm is introduced for achieving a global caching gain additive to the spatial multiplexing gain. The proposed delivery scheme relies on underlying multi-rate transmission techniques to simultaneously serve users with diverse channel conditions, i.e., to transmit smaller amounts of data to users in poor-connectivity STUs while simultaneously delivering larger amounts of data to other users. Indeed, the non-uniformity in the cache allocation causes a degrees-of-freedom (DoF) loss compared with the existing multi-antenna coded caching schemes that use symmetric content placement. Nevertheless, the proposed scheme is better tailored to the considered XR application scenario as it avoids excessive delivery time for serving users in areas with poor communication quality.
The current paper is an extension of our earlier conference publications [48] and [49]. In [48], a novel location-dependent CC scheme is proposed for a single-antenna transmitter, and [49] is the multi-antenna extension of [48], where we assume the memory allocation process is done such that the global caching gain at each location is an integer. In this paper, 1) the beamforming design is described in more detail, 2) the requirement for integer global caching gains is relaxed, and 3) a two-phase delivery scheme comprising both multicast and unicast transmissions is introduced to improve the overall performance. In [50], a similar location-dependent CC scheme was proposed based on a signal-level cache-aided interference cancellation scheme from [27].
The scheme in [50] benefits from a lower subpacketization and a simpler beamformer design than the scheme proposed herein. Yet, the scheme in [50] is limited to scenarios where the spatial multiplexing gain exceeds the coded caching gain (thus limiting the actual benefit of applying coded caching techniques). In addition, the finite-SNR performance of [50] is strictly inferior to the scheme proposed herein as it lacks the multicasting gain of the XOR-ing approach of [16] (c.f., [27]).

C. Notation and structure
Matrices and vectors are presented by boldface upper- and lower-case letters, respectively, and calligraphic letters are used to denote sets. For the set A and vector v, |A| and ‖v‖ represent the cardinality of A and the norm of v, respectively. Also, for two sets A and B, A\B includes the elements of A that are not in B. Moreover, [m] denotes the set of integer numbers {1, ..., m}, and ⊕ represents addition in the corresponding finite field. Finally, Table I lists some of the main notations used throughout the paper.
The rest of this paper is organized as follows. In Section II, we describe our location-based system model. A two-phase cache placement scheme comprised of memory allocation and cache arrangement processes is described in Section III, while Section IV discusses the delivery procedure. In Section IV-A, weighted max-min beamforming, tailored for the considered location-based cache placement setup, is introduced. In the end, numerical results are provided in Section V, while Section VI concludes the paper.

II. SYSTEM MODEL
We envision a bounded environment (gaming hall, operating theatre, etc.) where a base station (BS) with L transmit antennas serves K single-antenna users through wireless communication links. The set of users is denoted by K = [K]. The users are equipped with finite-size cache memories and are free to move throughout the environment. Every user requests data from the BS at each time slot based on its location and the application's needs. The requested data content can be divided into static and dynamic parts (see Figure 1), and a user needs to obtain both parts to reconstruct the detailed FoV. Typically, the major share of the FoV is comprised of the static part, which is the main target in the delivery phase. The dynamic part is delivered in parallel with the static part by allocating a portion of the available radio resources (frequency, time, space), depending on the current content demands for both static and dynamic parts. However, in typical virtual gaming scenarios, also the dynamic parts (e.g., geometrical shapes, textures, and avatars) of the FoV are almost entirely cacheable and can be partially stored in advance at the end users [7]. Due to the interaction of the objects in the virtual world, low-overhead control data describing how to reconstruct the dynamic content from both the cached elements and the multicast data must also be provided to the users. This paper only focuses on the wireless delivery of the location-dependent cacheable content, partially aided by in-device cache memories. A real-world application of such a setup is a wireless XR environment, where the requested data is needed to reconstruct the location-dependent 3D FoV at each user. As a particular example, a 3D XR gaming environment can be considered, where obstacles, walls, buildings, surrounding nature, etc., constitute the static part. On the other hand, the players themselves and how they interact with the environment can be considered dynamic content (see Figure 1). Naturally, users located in different locations experience distinct channel conditions due to varying wireless connectivity.
Thus, the goal is to design a cache-aided communication scheme that maximizes the achievable rate over the wireless link while also avoiding extensive transmission delays at wireless bottleneck areas.
As discussed in Section I-B, we assume that the application environment is mapped into several STUs, and a separate file is required to construct the scenery at each STU. Specifically, we assume the requested file contains enough data to render the whole 360-degree spherical FoV around the user. Any dynamic change prompted by the users' head rotation or other changes in the environment is assumed to be locally rendered by the users, in accordance with the locally available sensory data or after receiving the necessary instruction set from the server for reconstructing and overlaying the dynamic content. Let us assume that the STU mapping is done such that all points in a given STU have almost the same expected level/quality of wireless connectivity. For simplicity, we use the term state interchangeably with STU. A graphical representation of a simple application environment with eight states is provided in Figure 2.
We use S to represent the set of states and assume that |S| = S. Also, the file requested by a user in state s ∈ S is denoted by W(s). Without loss of generality, we assume for every state s ∈ S that the size of W(s) is F bits. If not stated otherwise, we consider a normalized data unit in the following and drop F in subsequent notations.
Similar to other centralized coded caching schemes, our new location-dependent scheme works in two distinct phases: I) cache placement and II) content delivery. Each user k is equipped with a cache memory of M (normalized) data units and has a message Z_k stored in its cache during the placement phase, where Z_k(·) denotes a function of the files W(s), ∀s ∈ S, with entropy not larger than M data units.
Upon a set of requests d_k ∈ S, ∀k ∈ K, at the content delivery phase, the BS multicasts several coded messages, such that at the end of transmission, all users can reliably recover their requested files. Let us assume that coded messages are transmitted in different time intervals and use x_K̄ to denote a coded message that sends data to all the users in K̄ ⊆ K. The number of coded messages and their generation process is detailed in Section IV. However, as a general description, every message x_K̄ comprises several codewords x_U, where each codeword x_U contains useful data for a subset of users U ⊆ K̄. Thus, x_K̄ is built as x_K̄ = Σ_{U⊆K̄} v_U x_U, where v_U ∈ C^L is the precoding vector dedicated to the users in set U. After sending x_K̄, every user k ∈ K̄ receives

$$y_{k,\bar{\mathcal{K}}} = \mathbf{h}_{k,\bar{\mathcal{K}}}^{H}\,\mathbf{x}_{\bar{\mathcal{K}}} + z_k, \qquad (1)$$

where the channel vector between the BS and user k is denoted by h_{k,K̄} ∈ C^L, and z_k ∼ CN(0, N_0) represents the additive white Gaussian noise. Note that to reproduce the requested file W_{d_k}, the decoder of user k makes use of the local cache content Z_k as well as the received signals from the wireless channel over different time intervals (i.e., y_{k,K̄}). Throughout the rest of the text, we present the delivery procedure for a specific transmission and assume that the same procedure is repeated at each transmission. Hence, we use y_k and h_k interchangeably with y_{k,K̄} and h_{k,K̄}, respectively. We assume that instantaneous and error-free channel state information is available at the transmitter (e.g., via reciprocal reverse-link pilot measurements) and is used for beamformer design and rate allocation during the delivery phase.
For the exact location-dependent cache placement, we would need to know the normalized achievable throughput r(s) [files/second] at each state s. However, this is not possible, since, to compute r(s), we would need prior information about the delivery scheme. This includes, for instance, the number of users scheduled in parallel, all users' locations and channel states, and the precoding algorithms used for data transmission. However, such data is not available during the placement phase. Therefore, expected or approximated delivery rates at each state must be considered for placement purposes. The expected location-specific data rates r(s) can be attained through various means, e.g., by collecting statistics from past active users. In this paper, since the number of users served in parallel in each transmission interval and their respective channel states are not yet known during the placement phase, a hypothetical single-user scenario is considered for the placement to approximate the expected achievable rates r(s). As a result, the expected interference-free throughput attained in state s ∈ S, normalized by the file size F [bits], can be roughly approximated as

$$r(s) = \frac{C_p\,\Omega}{F}\;\mathbb{E}\left[\log_2\!\left(1+\frac{P_T\,\|\mathbf{h}_{ks}\|^2}{N_0\,\Omega}\right)\right], \qquad (2)$$

where C_p is a pre-log scaling factor containing any practical overhead, P_T is the transmission power, Ω is the communication bandwidth, and h_ks ∈ C^L is the channel vector between the server and a user k located in state s. Note that the expectation is taken over all user locations and channel realizations in state s. It is worth noting that (2) is an 'upper bound' for the achievable rate at any state. The main objective is to have a relative throughput measure among the states.
The negative throughput scaling due to serving multiple users in parallel (including practical overheads, the impact of scheduling, etc.) would not change the memory allocation process since it can be considered almost the same across all the states.
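To make the placement-phase rate approximation concrete, the following is a minimal Monte-Carlo sketch of how the normalized, interference-free throughput r(s) in (2) could be estimated for one state; the function name, the Rayleigh-fading draw, and the parameter bookkeeping are illustrative assumptions for this sketch, not taken from the paper.

import numpy as np

def approx_state_rate(num_samples, L, P_T, N0, Omega, F, C_p, mean_gain):
    """Monte-Carlo estimate of the single-user, interference-free rate r(s)
    for one state, normalized by the file size F [bits] (cf. Eq. (2))."""
    rates = np.empty(num_samples)
    for i in range(num_samples):
        # illustrative small-scale fading draw, scaled by the state's mean path gain
        h = np.sqrt(mean_gain / 2) * (np.random.randn(L) + 1j * np.random.randn(L))
        snr = P_T * np.linalg.norm(h) ** 2 / (N0 * Omega)
        rates[i] = C_p * (Omega / F) * np.log2(1 + snr)
    return rates.mean()  # [files/second], used only as a relative measure among states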

III. LOCATION-DEPENDENT CACHE PLACEMENT
Different from the existing works, our cache placement phase comprises two consecutive processes, memory allocation and cache arrangement. The placement phase is executed, for example, before the users enter the application environment or when they pass through specific high data-rate locations (data shower), e.g., nearby the transmitter. During this phase, users' cache memories are proactively filled with valuable data aiming to minimize the required transmission time during the upcoming delivery phase. Note that, different from the existing works where the delivery time is optimized only during the content delivery phase, we proactively consider minimizing the delivery time also in the placement phase. As a result, the contents relevant to locations with poor wireless connectivity are prioritized in the users' memories to help prevent excessive delays during upcoming transmissions.
Memory Allocation: Due to the considered real-time application, it is crucial to guarantee delivery of the requested data within a limited time. Intuitively, this requires reserving a larger share of the total cache memory for storing data needed in locations with poor communication quality.
In this regard, the amount of cache memory dedicated to storing (parts of) every state-specific content file W(s) at each user is determined during the memory allocation process. In this paper, we assume that there is no a priori knowledge about the users' spatial locations during the placement phase. Hence, for memory allocation, we consider uniform access probability for all the states (using prior knowledge about the states' popularity, the performance can be further improved). Let us use m(s) to denote the normalized cache size at each user allocated to storing (parts of) W(s). Since the size of W(s) is normalized to one, a user in state s needs to receive 1 − m(s) data units over the wireless link to reconstruct the FoV of state s. Intuitively, the delivery time for a user in state s can be approximated by (1 − m(s))/r(s) in the single-user case, where r(s) is the approximated rate at state s (c.f. Eq. (2)). However, for the multi-user case, the approximate delivery time will be somewhat different. The approximated delivery time T̄_T when multiple users are served in parallel will be detailed in Sec. IV, where we show that if the m(s) values are known, T̄_T is formulated as

$$\bar{T}_T = \frac{K}{\bar{t}+\alpha}\,\max_{s\in\mathcal{S}}\frac{1-m(s)}{r(s)}, \qquad (3)$$

where t̄ = K·min_{s∈S} m(s) is the minimum achievable global caching gain given the non-uniform memory allocation. We use α ≤ L to denote the spatial multiplexing gain, which can be tuned for a given scenario based on, e.g., available transmit power, constraints on the beamformer design, the number of users in the network, etc. (c.f. [21]).
Note that the t̄ + α term in the denominator of (3) represents a lower bound on the achievable DoF for the non-uniform memory allocation scenario (for the uniform allocation, the DoF of t + α is achievable), and (3) approximates the worst-case delivery time across all the states when all K users are served. Next, we first rewrite (3) as

$$\bar{T}_T = K \max_{s\in\mathcal{S}} \frac{1-m(s)}{K\left(\bar{m}+\frac{\alpha}{K}\right) r(s)},$$

where m̄ equals min_{s∈S} m(s). Then, to minimize the delivery time for any possible realization of user locations, we formulate the memory allocation process as the following linear fractional programming (LFP) problem:

$$\min_{\bar{m},\,\{m(s)\}} \;\max_{s\in\mathcal{S}} \frac{1-m(s)}{K\left(\bar{m}+\frac{\alpha}{K}\right) r(s)}
\quad \text{s.t.} \;\; \sum_{s\in\mathcal{S}} m(s) \le M, \quad 0 \le \bar{m} \le m(s) \le 1, \;\forall s\in\mathcal{S}. \qquad (4)$$

Note that at the optimal point, m̄ = min_{s∈S} m(s). Using the Charnes-Cooper transformation [52], with ξ = 1/(Km̄ + α), m′(s) = ξ m(s), and m̄′ = ξ m̄, this problem can be reformulated as an equivalent linear program (LP):

$$\min_{\theta,\,\xi,\,\bar{m}',\,\{m'(s)\}} \;\theta
\quad \text{s.t.} \;\; \frac{\xi - m'(s)}{r(s)} \le \theta,\;\; \sum_{s\in\mathcal{S}} m'(s) \le M\xi,\;\;
K\bar{m}' + \alpha\xi = 1,\;\; 0 \le \bar{m}' \le m'(s) \le \xi,\;\forall s\in\mathcal{S}. \qquad (5)$$

Note that after solving this problem, the actual allocated memory is m(s) = m′(s)/ξ for all s ∈ S (a numerical sketch of solving (5) is given below, after the cache arrangement description).

Cache Arrangement: After the memory allocation process, we store data fragments in the cache memories of the users following a similar method as proposed in [16]. To this end, for every state s ∈ S, we first split W(s) into $\binom{K}{t(s)}$ sub-files denoted by W_V(s)(s), where t(s) = Km(s) and V(s) can be any subset of the user set K with |V(s)| = t(s). Then, at the cache memory of user k ∈ K, we store W_V(s)(s) for every state s ∈ S and every subset V(s) ∋ k. The cache arrangement process is outlined in Algorithm 1. For simplicity, here we assume that for every s ∈ S, m(s) > 0, and t(s) is an integer (the general case where these assumptions are not necessarily met is discussed in Appendix A). Also, for notational simplicity, we ignore the brackets and commas while explicitly referring to a given V(s), e.g., W_ij(s) ≡ W_{i,j}(s).
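As a concrete illustration, the following CVXPY sketch solves the LP reformulation (5), as reconstructed above, and recovers the per-state memory shares; the variable names and the exact constraint form are assumptions made for this sketch rather than the paper's reference implementation.

import numpy as np
import cvxpy as cp

def allocate_memory(r, M, K, alpha):
    """Memory allocation via the Charnes-Cooper LP (5): returns m(s) per state."""
    r = np.asarray(r, dtype=float)
    S = len(r)
    m_p = cp.Variable(S, nonneg=True)   # m'(s) = xi * m(s)
    m_bar = cp.Variable(nonneg=True)    # xi * min_s m(s)
    xi = cp.Variable(nonneg=True)
    theta = cp.Variable()
    constraints = [
        cp.multiply(1.0 / r, xi - m_p) <= theta,   # epigraph of the max over states
        cp.sum(m_p) <= M * xi,                     # total cache budget
        K * m_bar + alpha * xi == 1,               # Charnes-Cooper normalization
        m_bar <= m_p,                              # m_bar is the minimum share
        m_p <= xi,                                 # m(s) <= 1
    ]
    cp.Problem(cp.Minimize(theta), constraints).solve()
    return m_p.value / xi.value                    # m(s) = m'(s) / xi

# toy usage: five states with unequal expected rates, M = 2.25, K = 4, alpha = 2
print(allocate_memory([1.0, 2.0, 4.0, 2.0, 1.0], M=2.25, K=4, alpha=2))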
Example 1. Consider a simplified XR application scenario with K = 4 users, where the application area is split into S = 5 states, and for each state, the required data size is F = 400 MB. Each user has a cache size of 900 MB; hence, the normalized cache size is M = 2.25 data units. Assume that the spatial distribution of the approximated normalized throughput is as given in Table II, where the memory allocations resulting from solving (4) are also shown.
It can be easily verified that t(1) = t(5) = 1, t(2) = t(4) = 2, and t(3) = 3. As a result, W(1), W(3) and W(5) should be split into 4 sub-files, while W(2) and W(4) are split into $\binom{4}{2} = 6$ sub-files. The resulting cache placement is visualized in Figure 3. To show that the memory constraint is strictly satisfied, we remind that each W(s) is split into $\binom{K}{t(s)}$ sub-files W_V(s)(s), where V(s) can be any subset of users with size t(s). Then, each user k stores every W_V(s)(s) for which k ∈ V(s). In other words, W(s) is split into $\binom{K}{t(s)}$ sub-files, from which $\binom{K-1}{t(s)-1}$ sub-files are stored in the cache memory of each user. Hence, the total memory size dedicated to W(s) at each user is $\binom{K-1}{t(s)-1}/\binom{K}{t(s)} = t(s)/K = m(s)$, and summing over all states gives $\sum_{s\in\mathcal{S}} m(s) \le M$, i.e., the proposed algorithm satisfies the cache size constraints.
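The cache-arrangement step of Algorithm 1 reduces to enumerating, per state, all user subsets of size t(s); a minimal sketch is given below (the function and data-structure names are illustrative, and integer t(s) is assumed as in the text).

from itertools import combinations

def arrange_cache(K, t):
    """For each state s, split W(s) into C(K, t(s)) sub-files W_V(s) and let every
    user k store the sub-files whose index set V contains k (cf. Algorithm 1).
    t is a dict {state: t(s)} with integer t(s) > 0."""
    cache = {k: [] for k in range(1, K + 1)}
    for s, t_s in t.items():
        for V in combinations(range(1, K + 1), t_s):  # all subsets of size t(s)
            for k in V:
                cache[k].append((s, V))               # user k caches sub-file W_V(s)
    return cache

# Example 1: K = 4 users, t(1) = t(5) = 1, t(2) = t(4) = 2, t(3) = 3
placement = arrange_cache(4, {1: 1, 2: 2, 3: 3, 4: 2, 5: 1})
# each user then stores 1/4 of W(1) and W(5), 1/2 of W(2) and W(4), and 3/4 of W(3),
# i.e., 2.25 data units in total, matching the cache size constraint.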

IV. ASYMMETRIC CACHE-AIDED CONTENT DELIVERY
At the beginning of the delivery phase, every user k ∈ K reveals its requested file W_{d_k} ≡ W(s_k). Note that, according to the system model, W_{d_k} depends on the state s_k where user k is located. The server then builds and transmits several nested codewords, such that after receiving the codewords, all the users can reconstruct their requested files. As detailed in Section II, user k requires a total amount of one normalized data unit to reconstruct W_{d_k}. However, only a subset of this data, with size m_k ≡ m(s_k), is available in its cache, and the remaining part should be delivered by the server. Note that the conventional multi-server CC-based delivery schemes (e.g., [53] and [21]) that assume all users cache the same amount of data do not apply to our considered scenario, where each user has cached a different amount of its requested file. Thus, a new delivery mechanism is required to achieve a proper multicasting gain.
The new delivery algorithm is outlined in Algorithm 2. First, the server builds and transmits multiple transmission vectors x_K̄ in a time-division multiple access (TDMA) manner, one for every subset of users K̄ ⊆ K with |K̄| = t̄ + α, where t̄ = min_{k∈K} t_k is the common global caching gain and t_k ≡ t(s_k). In this paper, we consider the NCM scheme [44] to support multi-rate transmission; however, the proposed scheme is oblivious to the modulation procedure, and any other multi-rate modulation scheme could be used (e.g., [42]-[45]).

The transmitted signal vector is comprised of multiple nested codewords x_U, where U can be any subset of K̄ with |U| = t̄ + 1, i.e.,

$$\mathbf{x}_{\bar{\mathcal{K}}} = \sum_{\mathcal{U}\subseteq\bar{\mathcal{K}}:\,|\mathcal{U}|=\bar{t}+1} \mathbf{v}_{\mathcal{U}}\, x_{\mathcal{U}}. \qquad (7)$$

The elements (constellation points) of every nested codeword x_U are drawn from a complex Gaussian distribution such that E[|x_U|²] = 1. The details of the nesting operation, as well as the coding and decoding procedure, are explained in [44, Section 4]. Also, every x_U is precoded with a tailored beamformer vector v_U ∈ C^L, designed to suppress (or null out) the interference caused by x_U on every user in K̄ \ U. After the transmission of x_K̄, the corresponding received signal at user k ∈ K̄ follows equation (1).
The nested codeword x_U is built to include a useful data term (packet) G_U,k for every user k ∈ U, where (*) denotes the nesting operation (c.f., [44]). The data term G_U,k is chosen to be available in the cache memory of every other user in U \ {k}, so that these users can remove its interference using their cache contents. To satisfy this condition, denoting U_−k ≡ U \ {k}, we build G_U,k to include (parts of) every suitable sub-file W_V(s_k)(s_k) for which U_−k ⊆ V(s_k) and k ∉ V(s_k). Since W_V(s_k)(s_k) is cached in the memory of every user in V(s_k), every user in U_−k can then cancel its interference. Moreover, we may find more than one suitable sub-file W_V(s_k)(s_k) to be included in G_U,k: in fact, there exist exactly χ_k = $\binom{K-\bar{t}-1}{t_k-\bar{t}}$ such suitable sub-files, which should be split into smaller parts and concatenated while building x_U. Note that every sub-file W_V(s_k)(s_k) appears in $\binom{t_k}{\bar{t}}$ different U_−k sets, and each user set U is targeted $\binom{K-\bar{t}-1}{\alpha-1}$ times during the delivery phase (c.f. Algorithm 2). Hence, to send fresh content in each transmission, we need to divide every sub-file W_V(s_k)(s_k) suitable for user k into exactly ϕ_k = $\binom{t_k}{\bar{t}}\binom{K-\bar{t}-1}{\alpha-1}$ segments before the concatenation. In other words, we split every suitable sub-file into ϕ_k segments, and then concatenate χ_k of these segments (one fresh segment from each suitable sub-file) to build G_U,k; here, (A, B) denotes the bitwise concatenation of files A and B.
The function CHUNK in Algorithm 2 ensures none of the segments of a sub-file is sent twice, and the functions CONCAT and NEST denote the bit-wise concatenation (·, ·) and nesting (*) operations, respectively. We will later discuss in Section IV-A that, using the nesting operation (c.f. [44]) to create the codeword x_U, we can simultaneously transmit every G_U,k with a rate R_k proportional to the amount of data it carries.

Example 2. Consider the network in Example 1, for which the cache placement is visualized in Figure 3. Assume that there exist two antennas at the transmitter (i.e., L = α = 2). Let us consider a specific time slot, in which s_1 = 1, s_2 = 2, s_3 = 4, s_4 = 5. Denoting the set of requested sub-files for user k with T_k and assuming A ≡ W(1), B ≡ W(2), C ≡ W(4), and D ≡ W(5), we have T_1 = {A_2, A_3, A_4}, T_2 = {B_13, B_14, B_34}, T_3 = {C_12, C_14, C_24}, and T_4 = {D_1, D_2, D_3}. Note that the sizes of the sub-files of A, B, C, and D are 1/4, 1/6, 1/6, and 1/4 data units, respectively. As L = 2 and the common global caching gain is t̄ = 1, our proposed algorithm can deliver data to t̄ + α = 3 users during each transmission. Let us consider the transmission vector x_123 for users K̄ = {1, 2, 3}. Following equation (7), we have x_123 = v_12 x_12 + v_13 x_13 + v_23 x_23, where the nested codewords x_12, x_13, and x_23 deliver a portion of the requested data to the user sets {1, 2}, {1, 3}, and {2, 3}, respectively. Based on the users' request sets T_k, for each of the nested codewords x_12 and x_13, there exists only one suitable sub-file for user 1 (i.e., χ_1 = 1), and these sub-files should be split into ϕ_1 = 2 segments. However, for users 2 and 3, we have χ_2 = χ_3 = 2, and the segmentation factor is ϕ_2 = ϕ_3 = 4. As a result, x_12 is built as x_12 = A^1_2 * (B^1_13, B^1_14), where superscripts are used to differentiate the various segments of a sub-file. The nesting operation in x_12 ensures that A^1_2 and (B^1_13, B^1_14) are delivered with the proportional rates R_1 = (3/2) R_2. Following the same procedure, x_13 = A^1_3 * (C^1_12, C^1_14), and x_23 nests fresh segments of {B_13, B_34} for user 2 and of {C_12, C_24} for user 3, which completes the transmission vector x_123.

Now, let us consider the decoding process for x_123 at user 1. Following Eq. (1), user 1 receives y_1 = h_1^H v_12 x_12 + h_1^H v_13 x_13 + h_1^H v_23 x_23 + z_1, where the term h_1^H v_23 x_23 + z_1 contains the (suppressed) interference terms and the noise at user 1. Now, to recover its requested data terms A^1_2 and A^1_3, user 1 has to jointly decode the two desired messages x_12 and x_13 (using successive interference cancellation (SIC), c.f. [21]), benefiting from its cache contents (i.e., (B^1_13, B^1_14) and (C^1_12, C^1_14)) as a priori knowledge for demodulation. Specifically, to achieve a symmetric rate over the two messages A^1_2 and A^1_3, a SIC receiver combined with appropriate time-sharing between different decoding orders in the resulting MAC region is required [21]. For example, the server first allocates rates such that user 1 is able to first decode A^1_2 assuming A^1_3 as interference, then remove A^1_2 from y_1, and finally decode A^1_3 interference-free. Then, for another time interval, the rate allocation changes such that user 1 decodes A^1_3 first and A^1_2 last. Similarly, users 2 and 3 can also decode their desired messages (e.g., {B^1_13, B^1_14} for user 2 from x_12) following the same procedure. Note that, compared to [21], in all these transmissions we serve users 1 and 4 with a higher rate (1.5 times) compared to users 2 and 3 using the NCM modulation [44].
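The per-user counting that drives the codeword construction can be summarized in a few lines; the sketch below uses the expressions for χ_k and ϕ_k as reconstructed above (they reproduce the values quoted in Example 2) and is only an illustration of the bookkeeping, not the paper's code.

from math import comb

def segmentation_factors(K, alpha, t_bar, t_k):
    """Counting used when building nested codewords: chi_k suitable sub-files are
    concatenated into G_{U,k}, each sub-file is split into phi_k segments, and each
    segment carries 1 / (phi_k * C(K, t_k)) data units."""
    chi_k = comb(K - 1 - t_bar, t_k - t_bar)                   # suitable sub-files per codeword
    phi_k = comb(t_k, t_bar) * comb(K - t_bar - 1, alpha - 1)  # segments per sub-file
    seg_size = 1.0 / (phi_k * comb(K, t_k))                    # segment size [data units]
    return chi_k, phi_k, seg_size

# Example 2 numbers (K = 4, alpha = 2, t_bar = 1):
print(segmentation_factors(4, 2, 1, 1))  # user 1: (1, 2, 0.125)
print(segmentation_factors(4, 2, 1, 2))  # users 2 and 3: (2, 4, 1/24)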
Lemma 1. Using the proposed cache placement and content delivery algorithms, every user receives its requested data.

Proof. The user k in state s_k needs to receive 1 − m_k data units during the delivery phase. This data is delivered by $\binom{K-1}{\bar{t}+\alpha-1}$ transmission vectors x_K̄ for which k ∈ K̄, i.e., by all the user subsets K̄ that include user k. The number of nested codewords x_U with k ∈ U in every such vector x_K̄ is $\binom{\bar{t}+\alpha-1}{\bar{t}}$, and each such x_U delivers to user k a data term G_U,k that is comprised of χ_k segments, each of size $1/\big(\phi_k \binom{K}{t_k}\big)$ data units. Hence, the total data size delivered to user k is

$$\binom{K-1}{\bar{t}+\alpha-1}\binom{\bar{t}+\alpha-1}{\bar{t}}\,\frac{\chi_k}{\phi_k\binom{K}{t_k}} = 1 - m_k.$$
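As a numerical check of the counting in the proof (using the expressions reconstructed above), plugging in the Example 1/2 values for user 2 (K = 4, α = 2, t̄ = 1, t_2 = 2, m_2 = 1/2) gives

$$\binom{K-1}{\bar{t}+\alpha-1}\binom{\bar{t}+\alpha-1}{\bar{t}}\frac{\chi_2}{\phi_2\binom{K}{t_2}}
= \binom{3}{2}\binom{2}{1}\frac{2}{4\binom{4}{2}}
= \frac{3\cdot 2\cdot 2}{4\cdot 6} = \frac{1}{2} = 1 - m_2.$$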

A. Weighted Max-Min Beamforming
In this section, we illustrate how the beamforming vectors v_U in (7) are built. Note that, due to the underlying multi-rate transmission requirement of our proposed scheme, the optimized beamformer design in [21] is not readily applicable here. Thus, unlike [21], which considers max-min fairness to design the precoders, here we formulate the objective function as a weighted max-min (WMM) problem, where the weights reflect the non-uniform amounts of data transmitted to different users.
As discussed in Section IV, for the proposed scheme, data delivery is done using $\binom{K}{\bar{t}+\alpha}$ transmission vectors x_K̄. Also, every vector x_K̄ comprises $\binom{\bar{t}+\alpha}{\bar{t}+1}$ data terms x_U and the same number of beamforming vectors v_U, as shown in (7). As a result, after the transmission of x_K̄, the received signal in (1) can be rewritten as

$$y_{k,\bar{\mathcal{K}}} = \sum_{\mathcal{U}\ni k} \mathbf{h}_{k}^{H}\mathbf{v}_{\mathcal{U}}\, x_{\mathcal{U}} + \sum_{\mathcal{U}\not\ni k} \mathbf{h}_{k}^{H}\mathbf{v}_{\mathcal{U}}\, x_{\mathcal{U}} + z_k, \qquad (9)$$

where each of the D = $\binom{\bar{t}+\alpha-1}{\bar{t}}$ terms in the first sum contains fresh data for user k with size c_k = $\chi_k/\big(\phi_k\binom{K}{t_k}\big)$ data units, and the rest I = $\binom{\bar{t}+\alpha-1}{\bar{t}+1}$ terms in the second sum are seen as interference. Thus, from user k's perspective, y_k is a multiple-access channel (MAC) with D desired messages and I interference terms. Let us use D_k = {U | U ⊆ K̄, |U| = t̄ + 1, U ∋ k} to denote the set of all the desired message indices for user k, i.e., |D_k| = D, and I_k = {U | U ⊆ K̄, |U| = t̄ + 1, U ∌ k} to denote the set of interfering message indices for user k. To minimize the overall time to decode all the D desired messages, they should all be transmitted with the same rate R_k, i.e.,

$$R_k \le \frac{1}{|\mathcal{Q}|}\, R^{\mathcal{Q}}_{\mathrm{sum}}, \;\; \forall \mathcal{Q}\subseteq\mathcal{D}_k, \qquad
R^{\mathcal{Q}}_{\mathrm{sum}} = \log_2\!\Big(1 + \frac{\sum_{\mathcal{U}\in\mathcal{Q}} |\mathbf{h}_k^H \mathbf{v}_{\mathcal{U}}|^2}{\sum_{\mathcal{U}'\in\mathcal{I}_k} |\mathbf{h}_k^H \mathbf{v}_{\mathcal{U}'}|^2 + N_0}\Big), \qquad (10)$$

where R^Q_sum denotes the sum rate over all |Q| messages. Note that R_k is the symmetric rate per message, and since user k receives D messages in each transmission, its overall symmetric rate would be DR_k. Moreover, since the total size of received data at this user is Dc_k, the required delivery time for user k is T_k = Dc_k/(DR_k) = c_k/R_k, and the delivery time for x_K̄ would be T_K̄ = max_{k∈K̄} T_k seconds. Now, as we aim to minimize the delivery time, the beamformer optimization problem can be formulated as min_{v_U} max_{k∈K̄} c_k/R_k. So, the weighted rate maximization for a given transmission can be formulated as

$$\max_{\{\mathbf{v}_{\mathcal{U}}\},\{R_k\}} \;\min_{k\in\bar{\mathcal{K}}}\; \frac{R_k}{c_k} \quad
\text{s.t.}\;\; (10), \quad \sum_{\mathcal{U}} \|\mathbf{v}_{\mathcal{U}}\|^2 \le P_T, \qquad (12)$$

where P_T is the total available power at the transmitter. Problem (12) can be equivalently rewritten in epigraph form as

$$\begin{aligned}
\max_{R,\{R_k\},\{\gamma_{k,\mathcal{U}}\},\{\mathbf{v}_{\mathcal{U}}\}} \;& R \\
\text{s.t.}\;\; & c_k R \le R_k, \;\; \forall k\in\bar{\mathcal{K}}, & (13a)\\
& |\mathcal{Q}| R_k \le \log_2\!\Big(1+\sum_{\mathcal{U}\in\mathcal{Q}} \gamma_{k,\mathcal{U}}\Big), \;\; \forall \mathcal{Q}\subseteq\mathcal{D}_k,\; \forall k\in\bar{\mathcal{K}}, & (13b)\\
& \gamma_{k,\mathcal{U}} \le \frac{|\mathbf{h}_k^H\mathbf{v}_{\mathcal{U}}|^2}{\sum_{\mathcal{U}'\in\mathcal{I}_k}|\mathbf{h}_k^H\mathbf{v}_{\mathcal{U}'}|^2 + N_0}, \;\; \forall \mathcal{U}\in\mathcal{D}_k,\; \forall k\in\bar{\mathcal{K}}, & (13c)\\
& \sum_{\mathcal{U}} \|\mathbf{v}_{\mathcal{U}}\|^2 \le P_T. & (13d)
\end{aligned}$$

Note that R is an auxiliary variable ensuring that the user-specific rates R_k are dedicated based on the corresponding weights c_k (c.f. (13a)). Moreover, condition (13b) ensures that each user's dedicated rate R_k lies within the MAC region. Auxiliary variables γ_{k,U} are considered to help facilitate the convexification of conditions (13b) and are limited by the message-specific SINR in (13c), which is a non-convex constraint. Finally, constraint (13d) ensures that the power dedicated to the beamformers {v_U} does not exceed the available transmit power P_T. Problem (13) is similar to the max-min optimization in [21] with the extra convex conditions (13a). Thus, it can be efficiently solved following the same successive convex approximation (SCA) approach proposed in [21]. Here, we briefly recap the steps for the sake of completeness. First, Eq. (13c) is rewritten as

$$\sum_{\mathcal{U}'\in\mathcal{I}_k}|\mathbf{h}_k^H\mathbf{v}_{\mathcal{U}'}|^2 + N_0 \le \frac{|\mathbf{h}_k^H\mathbf{v}_{\mathcal{U}}|^2}{\gamma_{k,\mathcal{U}}}, \qquad (14)$$

and then, using the first-order Taylor expansion, the right-hand side of (14) is lower bounded by

$$\frac{|\mathbf{h}_k^H\mathbf{v}_{\mathcal{U}}|^2}{\gamma_{k,\mathcal{U}}} \ge
\frac{2\,\mathrm{Re}\big(\bar{\mathbf{v}}_{\mathcal{U}}^H\mathbf{h}_k\mathbf{h}_k^H\mathbf{v}_{\mathcal{U}}\big)}{\bar{\gamma}_{k,\mathcal{U}}}
- \frac{|\mathbf{h}_k^H\bar{\mathbf{v}}_{\mathcal{U}}|^2}{\bar{\gamma}_{k,\mathcal{U}}^2}\,\gamma_{k,\mathcal{U}}, \qquad (15)$$

where v̄_U and γ̄_{k,U} are the fixed approximation points. Now, substituting (15) in (14), we can approximate (13) as a convex problem (16), in which (13c) is replaced by the resulting linear constraint. Finally, following the same approach as [21], the beamformers v_U are found by iteratively solving (16) until convergence.
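To make the SCA step concrete, the following CVXPY sketch builds the convexified version of one SINR constraint, i.e., (14) with its right-hand side replaced by the linear lower bound (15); the data-structure layout (dicts keyed by the subset U) and the names are assumptions made for this illustration only, not the paper's implementation.

import numpy as np
import cvxpy as cp

def convexified_sinr_constraint(h_k, v, gamma, v_bar, gamma_bar, U, I_k, N0):
    """One SCA-approximated SINR constraint for user k and desired message index U:
    interference + noise <= first-order lower bound of |h_k^H v_U|^2 / gamma_{k,U}
    around the previous iterate (v_bar, gamma_bar)."""
    H = np.outer(h_k, h_k.conj())                      # h_k h_k^H
    lin = (2 * cp.real(v_bar[U].conj() @ H @ v[U]) / gamma_bar[U]
           - (np.abs(h_k.conj() @ v_bar[U]) ** 2 / gamma_bar[U] ** 2) * gamma[U])
    interference = cp.sum_squares(cp.hstack([h_k.conj() @ v[Up] for Up in I_k]))
    return interference + N0 <= lin

In a full SCA loop, these constraints for all (k, U) pairs would be combined with (13a), (13b), and (13d), and the resulting convex problem (16) re-solved with updated approximation points until convergence.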
Remark 2. Denoting the total delivery time of the proposed scheme with T_T, we have

$$T_T = \sum_{\bar{\mathcal{K}}\subseteq\mathcal{K}:\,|\bar{\mathcal{K}}|=\bar{t}+\alpha}\;\max_{k\in\bar{\mathcal{K}}}\;
\frac{1-m(s_k)}{\binom{K-1}{\bar{t}+\alpha-1}\, D\, R_k}. \qquad (17)$$

This simply follows from the fact that every user k needs to receive 1 − m(s_k) units of data from the server during the delivery phase, and this data is delivered using $\binom{\bar{t}+\alpha-1}{\bar{t}}$ data terms in $\binom{K-1}{\bar{t}+\alpha-1}$ transmission vectors (cf. the proof of Lemma 1).
It should be noted that although the discussions so far applied to the delivery phase, we also need an approximation of the expected delivery time to optimize the memory allocation during the placement phase (c.f. Section III). However, during the placement phase, actual user locations are not known. Hence, the common global caching gain t̄, the actual achievable rates R_k, and the actual delivery time T_T cannot be computed. To tackle this issue, we use an approximation of T_T assuming uniform access probability for all the states, as follows.
Lemma 2. The total delivery time T_T calculated in (17) can be approximated as

$$T_T \lesssim \bar{T}_T = \frac{K}{\bar{t}+\alpha}\,\max_{s\in\mathcal{S}}\frac{1-m(s)}{r(s)}. \qquad (18)$$

Proof. Noting that the total rate dedicated to user k for all the D = $\binom{\bar{t}+\alpha-1}{\bar{t}}$ messages is DR_k, we first substitute DR_k with its upper bound r(s_k) to approximate (17) as

$$T_T \approx \sum_{\bar{\mathcal{K}}\subseteq\mathcal{K}:\,|\bar{\mathcal{K}}|=\bar{t}+\alpha}\;\max_{k\in\bar{\mathcal{K}}}\;
\frac{1-m(s_k)}{\binom{K-1}{\bar{t}+\alpha-1}\, r(s_k)}. \qquad (19)$$

Then, using the inequality max_{k∈K̄} (1 − m(s_k))/r(s_k) ≤ max_{s∈S} (1 − m(s))/r(s), we substitute the RHS of (19) with its upper bound and note that the $\binom{K}{\bar{t}+\alpha}$ summands satisfy $\binom{K}{\bar{t}+\alpha}/\binom{K-1}{\bar{t}+\alpha-1} = K/(\bar{t}+\alpha)$. Finally, using the inequality t̄ ≤ t, T_T is approximated as (18).
The delivery time approximation in (18) can be further simplified by assuming that R_w = r(s)/(1 − m(s)) is independent of the state s. The intuition behind this assumption is that, with the proposed location-dependent cache placement and nested data delivery, the amount of data sent to each target user during every transmission is directly proportional to its delivery rate. With this assumption, we have T̄_T ≈ K/((t̄ + α) R_w), and the symmetric rate will be R^s_w = K/T̄_T = (t̄ + α) R_w. Following a similar argument for the symmetric multi-antenna CC scheme in [21], the symmetric rate there could also be approximated as R^s_u = K/T_T = (t + α) R_u, where R_u = r/(1 − M/S), r is the common max-min sum rate, and t = KM/S is the global caching gain (c.f. [21], Section IV).

This results in

$$\frac{R^s_w}{R^s_u} = \frac{(\bar{t}+\alpha)\,R_w}{(t+\alpha)\,R_u}, \qquad (20)$$
which indicates that, compared with the symmetric scheme of [21], the proposed location-dependent scheme can improve the delivery time if the DoF loss resulting from the non-uniform cache placement (i.e., (t̄+α)/(t+α) ≤ 1) can be compensated by the rate improvement (i.e., R_w/R_u ≥ 1) due to the multi-rate transmission support.

B. Resolving the Imbalanced Global Caching Gain Bottleneck
Our proposed location-dependent CC scheme enables the global caching gain of t̄ = min_{k∈[K]} t_k to be achieved together with the spatial multiplexing gain of α, while also addressing the wireless connectivity bottleneck problem. However, the min operation in t̄ could cause the global caching gain to vanish if a subset of users were located in states with strong wireless connectivity (i.e., a subset of users have small t_k values). This is an undesired effect, as the users with better channel conditions limit the performance improvement enabled by the underlying multi-antenna CC mechanism.
To address this issue, we use the phantom user concept introduced in [54].In a nutshell, phantom users are virtual, non-existent users that are assumed to be part of the network when the transmission codewords are designed, and their effect is removed before the actual transmission.
Considering the fact that the global caching gain t̄ is limited by users with strong wireless connectivity, the idea is to exclude such users from the CC-aided (i.e., multicast) delivery phase and serve them through multi-user unicasting (i.e., considering spatial multiplexing and (possible) local caching gains only). Then, to make CC-aided delivery work for the rest of the users, the excluded users are substituted by the same number of phantom users, all located in poor-connectivity states (hence, with large t_k values). This results in an improved global caching gain for users with poor channel conditions, as the min operation would no longer be limited by users with strong wireless connectivity. As discussed in [27], the DoF loss caused by phantom users can be (partly) compensated through an improved beamforming gain enabled by optimized beamformers.
The enhanced multicast content delivery with phantom users is summarized in Algorithm 3.
In this algorithm, we keep substituting the users with the best channel conditions with phantom users until the global caching gain t̄ becomes larger than some threshold t_target, while also checking that the real achievable DoF for the remaining users does not fall below t̄ + α. The following example clarifies this procedure.
Example 3. Consider the network of Example 1, but now with users 1-3 located in state 3 (so that t_1 = t_2 = t_3 = 3) and user 4 located in a state with t_4 = 1. Using Algorithm 2 for this users' distribution, t̄ = 1 results in four transmissions where three users are served during each transmission. However, following Algorithm 3, user 4 is first excluded from the set of CC-aided users (i.e., K_p = {4}), and hence, the common global caching gain and the potential DoF increase to t̄ = min_{k∈{1,2,3}} t_k = 3 and t̄ + α = 5, respectively. However, since the actual remaining number of CC-aided users is three, the real achievable DoF in this case remains equal to three, but the excess spatial multiplexing gain could be used for an enhanced beamforming gain (and a better rate). Thus, using Algorithm 3, the data is delivered to the user groups U_1 = {1, 2, 3} and U_2 = {4} interference-free, respectively.
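The user-exclusion idea can be illustrated with a few lines of Python; the exact stopping criterion below (keep excluding the best-connectivity user as long as the real DoF of the remaining CC-aided users does not drop below the DoF of the original user set) is an assumption inferred from Example 3, not the literal condition of Algorithm 3.

def split_cc_and_unicast_users(t, alpha, t_target):
    """Move best-connectivity users (smallest t_k) to unicast delivery and replace
    them by phantom users until the common caching gain of the remaining CC-aided
    users reaches t_target (sketch of the idea behind Algorithm 3)."""
    cc = dict(t)                                        # user -> t_k
    unicast = []
    baseline_dof = min(len(cc), min(cc.values()) + alpha)
    while len(cc) > 1 and min(cc.values()) < t_target:
        k_best = min(cc, key=cc.get)                    # best-connectivity user
        trial = {k: v for k, v in cc.items() if k != k_best}
        if min(len(trial), min(trial.values()) + alpha) < baseline_dof:
            break                                       # would reduce the real DoF
        unicast.append(k_best)
        cc = trial
    return cc, unicast

# Example 3-like setup: users 1-3 with t_k = 3 (poor connectivity), user 4 with t_k = 1
print(split_cc_and_unicast_users({1: 3, 2: 3, 3: 3, 4: 1}, alpha=2, t_target=3))
# -> ({1: 3, 2: 3, 3: 3}, [4])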

V. SIMULATION RESULTS
The performance of the proposed location-dependent scheme is evaluated by numerical simulations. We consider an XR application in a bounded 30 × 30 [m²] room, where a different 3D image is needed to rebuild the FoV in every tile of size 1 × 1 [m²], resulting in S = 900 states. The requested data is served by a transmitter with L = 32 antennas and a spatial multiplexing gain of α ≤ L, located in the middle of the room on the ceiling at a height of 5 [m]. The small-scale fading of the channel h_k is assumed to follow a Rayleigh distribution, while the path loss for a user at state s ∈ [S] is modeled as [55]: PL(s) = 32.4 [dB] + 20 log10(f) + 10 η log10(d_s) + ζ, where d_s is the distance between the center of state s and the transmitter, η = 3 is the path-loss exponent, and f is the frequency. The term ζ ∼ N(0, σ) with standard deviation σ is used to model the impact of randomly-placed objects obstructing the propagation path between the transmitter and the receivers, similar to the shadowing effect in outdoor propagation environments. To compute the state-specific expected throughput r(s) for the initial cache placement in (5), the expectation in (2) is taken over all possible user locations and channel realizations in state s. Unless otherwise mentioned, we assume that the transmit power is set such that the received SNR at the room borders is equal to 0 [dB] (ignoring the shadowing effect ζ). During the delivery phase, we assume every user k ∈ [K] can be located in any state s ∈ [S] with uniform probability. In all simulations, we use optimized beamformers obtained by solving (12).
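For reference, the large-scale model used above can be captured in a short helper; the frequency unit (GHz, as in 3GPP-style formulas) and the transmit/noise power bookkeeping are assumptions of this sketch.

import numpy as np

def received_snr_db(d_s, f_ghz, tx_power_dbm, noise_power_dbm, eta=3.0, sigma=0.0, rng=None):
    """Received SNR [dB] for a state at distance d_s [m], using
    PL(s) = 32.4 + 20 log10(f) + 10*eta*log10(d_s) + zeta, with zeta ~ N(0, sigma)."""
    rng = np.random.default_rng() if rng is None else rng
    zeta = rng.normal(0.0, sigma) if sigma > 0 else 0.0
    pl_db = 32.4 + 20 * np.log10(f_ghz) + 10 * eta * np.log10(d_s) + zeta
    return tx_power_dbm - pl_db - noise_power_dbm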
As discussed throughout the paper, the proposed location-dependent CC scheme applies to application scenarios such as XR gaming, where the QoS of users is affected by the delivery time. Therefore, we do not just compare the average delivery time (rate) as done in most related CC studies (e.g., [21]) but also consider the 95-percentile of the expected delivery time as a figure of merit. The performance of the following placement and delivery schemes is compared:
• a unicast baseline, where the placement phase is carried out by setting φ ≫ α/K in Alg. 1, but delivery is done by unicasting (i.e., by using the spatial multiplexing gain only and ignoring coded caching techniques);
• Proposed, φ ≫ α/K and Proposed, φ = α/K, where the corresponding φ value is used in Alg. 1 (the latter considering multicast content delivery already in the placement phase) and delivery is done using Alg. 3;
• MS, where the cache placement is uniform (i.e., φ ≪ α/K is used in Alg. 1), and the baseline content delivery algorithm of [21] is used.
We first compare different schemes based on their delivery times accumulated over 500 random user drops. Fig. 4 plots the cumulative distribution function (CDF) of the total delivery times for all realizations. For reference, we have also added simulation results for two other schemes, which are very similar to Proposed, φ ≫ α/K and Proposed, φ = α/K but use Algorithm 2 for data delivery (i.e., they do not use phantom users to address the caching gain bottleneck, as discussed in Sec. IV-B). As can be seen, incorporating phantom users always improves the performance; hence, throughout the rest of the text, we always assume that Algorithm 3 is used for data delivery. From Fig. 4 it is clear that the MS scheme has the largest variation in total delivery time among all the schemes, which is undesirable in our considered use cases with location-dependent content requests (e.g., XR gaming). The reason for this considerable variation is that the MS scheme only intends to maximize the global caching gain, which results in better performance than other schemes when all the users have good channel conditions but deteriorates the rate when a subset of users experience poor connectivity. On the other hand, our proposed schemes provide a robust performance by keeping the variance in delivery time very small, with the Proposed, φ = α/K scheme providing the best results. This robust performance is a direct result of the proposed non-uniform cache placement, as it makes the algorithm immune to wireless connectivity bottleneck areas by allocating more memory to store the content of such areas.

Figures 5 and 6 compare the performance of different schemes with respect to the standard deviation of the obstructed-locations parameter (σ). As illustrated, for small σ (i.e., less variation in large-scale fading among states), the traditional MS scheme outperforms the other methods. This is because, in our proposed schemes, we sacrifice the global caching gain (i.e., t = KM/S) for a higher local caching gain (i.e., m(s)), which results in a better transmission rate for individual users (the R_w/R_u ratio in (20)) but reduces the number of users served simultaneously (the (t̄+α)/(t+α) ratio in (20)). The rate improvement for individual users is more prominent when they experience relatively poor channel quality, which is not the case when σ is small. However, as σ becomes larger (i.e., there are more attenuated states), the MS scheme performs worse than the proposed schemes. This is because, with larger σ, users are more likely to experience poor connectivity, increasing the effectiveness of the local caching gain in decreasing the total delivery time. It should be noted that in both Figures 5 and 6, if σ > 10, the Proposed, φ ≫ α/K scheme outperforms all other schemes. This is because, in that regime, the large variety in the expected achievable rate of different states forces the memory allocation to become more non-uniform. As a result, the minimum achievable global caching gain (i.e., t̄) becomes very small, and we need to rely more on phantom users to resolve the imbalanced global caching gain bottleneck (c.f. Section IV-B). However, phantom users work better with the Proposed, φ ≫ α/K scheme as it allows even more non-uniform memory allocations than the Proposed, φ = α/K scheme.

Figures 7 and 8 compare the performance of different methods with respect to the SNR value at the room border and the spatial multiplexing gain α, respectively. As shown in Fig. 7, for smaller SNR values at the room border, the performance gap between the proposed schemes and the MS scheme widens. This is because, with smaller SNR values (i.e., smaller transmit power), the achievable rate in different states is highly affected by large- and small-scale fading, resulting in larger variations in the achievable rate of different users. As a result, the proposed schemes, which use non-uniform cache placement to compensate for such variations, perform better than the MS scheme. On the other hand, for higher received SNR at the cell edge, the performance gap between MS and the other schemes decreases as the uniform memory allocation becomes almost optimal. Also, as illustrated in Fig. 8, for a larger spatial multiplexing gain, our proposed schemes perform better than MS. The reason is that, with a larger α, the DoF value gets less sensitive to the global caching gain, i.e., the (t̄+α)/(t+α) ratio in (20) converges to one. As a result, the rate improvements for individual users due to the increased local caching gain of the proposed schemes become more effective in reducing the total delivery time.

In Fig. 9, we have compared the performance of different schemes with respect to the (normalized) available cache memory at the users, M/S. As depicted, the performance gap between the proposed schemes and the MS scheme grows rapidly at first but then narrows as M/S is increased. The reason for this behavior is that when M/S is small, the performance improvement is due to the local caching gain; i.e., all the available memory is used to cover wireless connectivity bottlenecks. However, as more memory becomes available, we reach a point where all the bottleneck areas are covered, and cached data starts to be used to improve the global caching gain. In fact, as the amount of available memory grows, the result of the memory allocation process in (4) gets closer to the uniform allocation of the MS scheme.

Finally, Fig. 10 shows how the user count parameter K affects the delivery time of the various schemes. As depicted, since a larger K also means more data to be delivered, the delivery time generally grows with the number of users. However, since the global caching gains also scale with K, the number of users served in parallel (i.e., t̄ + α) also increases for larger K, resulting in an overall performance improvement for all the CC-based schemes. Still, the Proposed, φ = α/K scheme provides the best performance among all the schemes, and its required delivery time is affected minimally by the increase in K. This is because, with larger K, there is a higher chance of having users with poor connectivity, increasing the effectiveness of the improved local caching gain resulting from the underlying non-uniform cache placement.

VI. CONCLUSION AND FUTURE WORK
A centralized location-dependent coded caching scheme with multi-rate content delivery, tailored for future XR applications, was proposed in this paper. Initially, the area of interest was divided into many small states such that the achievable rate at every point in each state could be considered the same. Then, based on each state's approximated achievable delivery rate, a memory allocation process was performed to reduce the burden on wireless resources when serving ill-conditioned locations. This resulted in an uneven memory allocation, where larger cache portions were allocated to the contents requested in poor-connectivity states. Then, during the content delivery phase, a novel algorithm based on coded caching and multi-rate transmission techniques was devised to enable combined global caching and spatial multiplexing gains at the transmitter. Finally, the proposed method was shown to outperform the state-of-the-art in terms of the 95-percentile expected delivery time.

TABLE I: Main Notations
W^i_{d_k}      File segment of W_{d_k}
χ_k            User k file-concatenation factor
D              Desired message count
I              Interfering message count
W_{V(s)}(s)    Sub-file of file W(s)
G_{U,k}        Transmitted data to user k ∈ U
R^Q_sum        Sum rate over |Q| messages

Remark 1. Substituting the term α/K in (4) with a general constant term φ enables a trade-off between the local and global caching gains. Selecting a large φ ≫ α/K prioritizes the local caching gain m(s) at the expense of the minimum global caching gain t̄ = Km̄ (as the denominator in the objective function becomes almost constant). On the other hand, if φ ≪ α/K, the minimum allocated memory m̄ converges to M/S and the minimum global caching gain is maximized at the cost of a lower local caching gain for the states with poor expected connectivity r(s), resulting in higher delivery time fluctuations.





TABLE II: Location-specific rate and memory allocation for Example 1.