Scalable Cell-Free Massive MIMO Networks With LEO Satellite Support

This paper presents an integrated network architecture that combines a cell-free massive multiple-input multiple-output (CF-M-MIMO) terrestrial layout with a low Earth orbit (LEO) satellite segment while explicitly taking the scalability of the terrestrial segment into account. The main purpose of such an integrated scheme is to transfer to the satellite segment those users that limit the performance of the terrestrial network. To this end, a correspondingly scalable technique is proposed to govern the ground-to-satellite user diversion, which can be tuned to different performance metrics. In particular, in this work the proposed technique is configured to yield a heuristic that improves both the minimum per-user rate and the sum-rate of the overall network. Simulation results serve to identify the conditions under which the satellite segment becomes an attractive solution to enhance users’ performance. Generally speaking, although the availability of the satellite segment always improves users’ rates, it is when the terrestrial CF-M-MIMO network exhibits low densification that the satellite backup becomes crucial.


I. INTRODUCTION
Little more than a lustrum has passed between the identification of the technological pillars supporting the fifth generation (5G) of mobile systems [1] and the first incarnations of the standard currently being rolled out worldwide. Despite not being a priority in the initial 5G studies, the incorporation of a satellite segment into the 5G ecosystem has gained momentum [2], [3], especially with the advent of low Earth orbit (LEO) satellites and their current massive deployment in the form of mega-constellations such as Starlink, OneWeb or Telesat [4], with this trend bound to continue in 6G [5], [6]. Owing to their much lower altitude in comparison to classical geostationary orbits, communications using LEO satellites are subject to low latencies, with round-trip delays in the range of 30-100 ms (rivalling those found in terrestrial networks). Furthermore, the possibility of combined terrestrial-satellite terminals able to communicate directly with both the ground and the satellite segments [7], [8] is pushing the move towards integrated space-terrestrial architectures. As an example, the authors in [9], [10] have recently introduced data-offloading schemes from the terrestrial to a LEO-based satellite segment with the objective of maximizing the spectral and energy efficiencies of the whole network.
The associate editor coordinating the review of this manuscript and approving it for publication was Kai Li.
In sheer contrast with the late consideration of the satellite element, the ultradense network paradigm was one of the pivotal concepts in the genesis of 5G and one that will surely keep playing a fundamental role in the future evolution of 5G towards beyond 5G (B5G)/6G systems. One specific flavour of network densification that has attracted considerable attention is the cell-free massive multiple-input multiple-output (MIMO) architecture. Initially proposed in [11], cell-free massive MIMO (CF-M-MIMO) assumes the existence of a single central processing unit (CPU) to which a plethora of access points (APs), irregularly distributed over the area to be covered, are connected via fronthaul links. Although the first CF-M-MIMO proposal advocated the use of distributed precoding in the form of AP-based conjugate beamforming (CB) processing, the potential of centralized CPU-based precoding was investigated shortly afterwards [12], showing that it greatly outperforms CB precoding in terms of rate, although at the cost of having to centralize the precoder design at the CPU and thus requiring the exchange of fast-fading instantaneous information (e.g., channel gains, precoding vectors) over the CPU-AP fronthaul links [13]. Interestingly, by applying a so-called max-min power control strategy, CF-M-MIMO is able to provide a uniform quality-of-service (QoS) throughout the coverage area, resulting in all mobile stations (MSs) attaining the same rate. According to virtually all recent 6G surveys, cell-free topologies are one of the key architectural proposals that will be at the core of future B5G/6G networks [14], [15].
It was already pointed out in [16] that the advent of LEO constellations is likely to make this space-based solution significantly more cost-effective and flexible than the classical fiber-linked macrocellular architecture, thus making the study of hybrid CF-M-MIMO/LEO networks attractive from a B5G/6G perspective. In fact, there are even early proposals to implement CF-M-MIMO technology solely based on a LEO satellite segment [17], although such architectures seem appropriate only in scenarios where there is no terrestrial segment available. Satellite communication coverage tends to be specified in terms of the satellite footprint, typically encompassing, in the specific case of LEO satellites, areas of several hundreds of square-km. Interestingly, recent advances in antenna array technology allow the radiation of multiple independent beams, with each beam providing coverage to much smaller areas (a few square-km, comparable to the cells of terrestrial mobile networks) within the overall satellite footprint, while allowing an increase in spectral efficiency by employing a per-beam full frequency re-use of the available bandwidth [18]. The combination of analog and digital processing even allows the definition of smaller coverage areas (in the order of hundreds of square-meters) by forming beams whose angular widths are as narrow as 0.4° [19].
While a major thrust for the deployment of satellite-based communication systems has been the provision of broadband coverage to under-served areas [20], it is especially appealing in the case of LEO satellites, given their rather narrow footprint, to investigate how this extra network capacity can be exploited to complement the existing terrestrial infrastructure when a LEO satellite is in sight of a given terrestrial segment. Very recently, we proposed in [21] a hybrid CF-M-MIMO/LEO network that combined these two promising B5G/6G trends, namely, satellite integration and cell-free topology, in an attempt to address a well-known caveat of CF-M-MIMO max-min network optimization whereby the performance of the whole network may be seriously compromised by a few ill-conditioned users that bring down the common user rate. This has been shown to occur in both centralized massive multiple-input multiple-output (M-MIMO) [22] and decentralized CF-M-MIMO systems [23]. Nevertheless, when trying to bring the CF-M-MIMO concept to reality, one key aspect to be addressed is that of scalability, that is, how to adapt the theoretical CF-M-MIMO results to network roll-outs over extensive geographic areas with a large number of users, since the original cell-free concept of serving any user from all APs is not feasible in practice. Very recently, [24] has studied how scalability impacts the performance of CF-M-MIMO by considering various degrees of cooperation and information exchange among the CPU and the APs, establishing distributed dynamic clustering (DCC) as a key mechanism for CF-M-MIMO implementation. Under DCC, every user is served by a finite number of APs rather than by the whole network, resulting in precoder designs that take into account this partial connectivity (e.g., partial zero-forcing (ZF)/minimum mean square error (MMSE); see [25] for a thorough review of different alternatives).
Likewise, the initial power allocation techniques proposed in the CF-M-MIMO context, most notably the max-min technique, are virtually impossible to implement in practice given that they require complete statistical channel state information (CSI) among all the nodes in the network. Consequently, more practical strategies have been proposed, such as fractional power allocation (FPA) [26], which, despite not achieving a perfectly equal rate among all users, are able to approximate this objective with milder statistical CSI requirements and, moreover, do so in a scalability-friendly manner.
37558 VOLUME 10, 2022
Although our earlier work already demonstrated the benefits of the backup the satellite segment provides [21], no scalability aspects were taken into account, since the CF-M-MIMO ground segment relied on non-scalable (ZF-based) precoding and max-min power allocation, which may, in practice, somewhat misjudge the effect of the satellite segment. Furthermore, the extrapolation of the results to larger geographic areas is dubious, since it is unlikely that a single CPU can gather all the statistical and instantaneous CSI our earlier scheme required. To address these shortcomings, this paper proposes a scalable hybrid CF-M-MIMO/LEO networking solution that is able to cover arbitrarily large areas with finite computational/fronthaul resources. In particular, the main contributions of this paper are:
• A two-tier architecture is proposed where a scalable CF-M-MIMO network is backed by a constellation of LEO satellites that provide an alternative link for ill-conditioned terrestrial users [23]. This ill-conditioning might be caused by the poor propagation conditions certain users may experience, by excessive interference levels in their surroundings, or by a side effect of scalability (e.g., an MS served by only a few APs). Additionally, a poor densification of certain parts of the terrestrial infrastructure may render the space component a key element towards the maintenance of acceptable transmission rates.
• Both centralized (CPU-oriented) and distributed (AP-oriented) precoding strategies are incorporated into the framework, each requiring a different level of CSI at the different terrestrial nodes. Note that the centralized precoder, despite being computed at the CPU, remains scalable by adhering to the DCC principle.
• A new adaptive and scalable scheduling algorithm is proposed that governs the diversion of users to/from the satellite segment and that allows the likelihood of a user being transferred to the space segment to be adjusted in accordance with the network operator requirements. The algorithm can be tuned to act in accordance with the terrestrial segment policy (i.e., favouring either minimum rate performance or sum-rate).
• An extensive set of simulation results is presented illustrating the impact the LEO satellite segment can have on a wide variety of settings. Interestingly, these results can pave the way for network operators to explore the economic viability of two competing network roll-out strategies, namely, densification or the hiring of satellite resources, and to identify the optimal trade-off.
To conclude these contributions, it is worth pointing out that the satellite business cases considered in standards (e.g., 3GPP) mostly target underserved areas (i.e., geographic zones with very poor or non-existent terrestrial infrastructure). Since LEO constellations are made of a large number of satellites with fast orbits around the Earth, it is expected that many elements of the constellation spend significant fractions of their orbits illuminating geographic areas where there is some degree of terrestrial infrastructure, rendering the satellite segment non-essential. However, the terrestrial-to-satellite user diversion introduced here exploits the transient availability of the communication resources the satellite provides, which would otherwise be underutilized when orbiting such a non-underserved region. In fact, satellite operators can benefit from this extra usage of the space segment to further reinforce their role in future broadband networks.
The rest of the paper is organized as follows: Section II introduces the characteristics of the two networks (terrestrial and satellite) alongside their channel models and corresponding estimators, while detailing the scalability aspects of the ground component. Section III describes the integration methodology of both network segments alongside an algorithm that governs the diversion of users from the terrestrial to the satellite component. Section IV presents a complete set of numerical results that help identify under which terrestrial network conditions it is most beneficial to resort to the satellite link. Finally, Section V recaps the main outcomes of this work and provides hints for further work.
This introduction concludes with a brief notational paragraph: vectors and matrices are represented by lower- and upper-case bold symbols, respectively. The symbol $E\{\cdot\}$ denotes mathematical expectation. $D(\mathbf{x})$ represents a diagonal square matrix with $\mathbf{x}$ on its main diagonal, whereas $D(\mathbf{X}_1, \ldots, \mathbf{X}_M)$ corresponds to a block-diagonal matrix with $\mathbf{X}_1, \ldots, \mathbf{X}_M$ denoting a set of square matrices. The symbol $\|\mathbf{x}\|$ denotes the 2-norm of vector $\mathbf{x}$. Superscripts $\mathbf{X}^T$ and $\mathbf{X}^H$ denote the transpose and conjugate transpose of a matrix $\mathbf{X}$, respectively, and $\mathbf{X}^*$ is the complex conjugate.

II. SYSTEM MODEL
This work focuses on a scenario like the one shown in Fig. 1, whereby a cell-free network consisting of a multitude of APs, all connected to one or more CPUs, serves users distributed over a prescribed coverage area. The CPUs are in turn linked (physically or logically) to the gateway of a multi-beam LEO-based satellite segment (SAT-GW), with both network elements, CPUs and SAT-GW, being connected to the network core. It can therefore be safely assumed that they can be jointly managed through the deployment of an integrated control plane between both segments. This joint operation allows MSs to be transferred from one segment to the other, as postulated in the EU H2020-SANSA project and in many other terrestrial-satellite integrated proposals [27]-[29] as well as in recent 3GPP standardization activities [20]. It is noteworthy that such an integrated approach has already been implemented in the first generation of commercial LEO systems, Iridium, albeit mainly targeting voice traffic and low bit-rate communications [30]. Unlike [21], this work focuses on the downlink (DL) of LEO deployments operating on the S-band (2.2 GHz), as this allows direct LEO-to-MS communication (i.e., satellite-to-handheld terminal) without the need to route traffic through a satellite gateway. While the available S-band spectrum is much narrower than that in the Ka/Ku bands (i.e., 30 MHz rather than 250 MHz), this frequency band allows user terminals to share the RF front-end with that devoted to a sub-6 GHz terrestrial segment, thus significantly easing the implementation of dual terrestrial-satellite handsets [31].

A. TERRESTRIAL SEGMENT
We consider a terrestrial segment like the one depicted in Fig. 1, with a CF-M-MIMO deployment consisting of $M$ APs, each with available transmit power $P_{AP}$, operating over a bandwidth $B_{CF}$ and equipped with a uniform linear array (ULA) of $N_{AP}$ antennas. These APs are randomly distributed following a uniform distribution throughout the coverage area and connected by means of fronthaul links (dashed lines) to one or more CPUs, which in turn are connected to the SAT-GW node through backhaul links (solid lines). This infrastructure provides service to $K$ single-antenna MSs randomly distributed throughout the coverage area of the CF-M-MIMO network (rounded boxes in Fig. 1). This ground segment could be arbitrarily expanded by incorporating many different CF-M-MIMO segments and allowing information exchange among the CPUs of different segments through a backhaul network [23].
For the sake of practical implementation, and unlike our prior work in [21], this paper places special emphasis on the scalability aspects of the proposed hybrid system. In particular, scalability is achieved by ensuring that the computational and fronthaul requirements of the terrestrial and satellite segments are kept finite even when the number of APs and/or users in the network grows unboundedly. In practice, this condition can be met by guaranteeing that each AP, as well as the satellite node, serves only a finite number of users, as this ensures that the precoder design and power allocation remain computationally feasible and that the uplink (UL) training and fronthaul requirements are bounded [25].

B. SATELLITE COMPONENT
The availability of a multi-satellite LEO constellation is assumed along the lines of those currently being considered for 5G New Radio networks [32]. Each satellite potentially illuminates an area of $A_{SAT}$ square-km whose footprint, typically on the order of thousands of square-km (e.g., 25,000 square-km), is defined by a prescribed minimum elevation angle between the user position on Earth and the satellite. State-of-the-art satellites are equipped with multiple antennas and signal processing capabilities to generate many different spot beams, each providing coverage to a specific area within the satellite footprint that typically ranges from hundreds of square-meters to a few square-km [19]. Remarkably, each spot beam can be directed to an arbitrary area within the satellite footprint [32]. For conciseness, the scenario shown in Fig. 1 is considered, where a narrow LEO beam illuminates the terrestrial area where a single CF-M-MIMO segment is located. LEO satellites are located at height $h_{SAT}$ above the Earth, with typical values in the range $300 \le h_{SAT} \le 2000$ km, and each has an available per-beam transmit power $P_{SAT}$ and per-beam bandwidth $B_{SAT}$ (this bandwidth is assumed here to be exploited through full-frequency re-use across multiple beams, but it could also be regarded as one frequency band of a 4-colour reuse scheme). It is assumed that the satellite antenna architecture has a single feed per beam and can therefore, for all modelling purposes, be considered a single-antenna system. While it is technically possible to illuminate a given Earth region by superposing multiple beams from the same or adjacent satellites, this work concentrates on the case where one ground region made of a single CF-M-MIMO segment has access to a specific beam from a LEO satellite, leaving more complex architectures (i.e., multiple beams from the same or different satellites serving various CF-M-MIMO segments in a coordinated fashion) for further work.
Also, we note that, commonly, two orthogonal polarizations are available, effectively doubling the actual bandwidth. In this work, the variable $B_{SAT}$ will be used to denote the channel bandwidth irrespective of whether this is available on a single polarization or split into the two orthogonal components.

C. CHANNEL MODELS
Let us denote by $\beta^{CF}_{mk}$ the large-scale propagation gain (i.e., path gain and shadowing) of the link joining AP $m$ and MS $k$, which can be expressed as $\beta^{CF}_{mk} = \zeta_{mk}\,\chi^{CF}_{mk}$, with $\zeta_{mk}$ representing the distance-dependent path gain
$$\zeta_{mk} = \zeta_0\, d_{mk}^{-\alpha},$$
where $\zeta_0$ is the path gain at a reference distance of 1 meter, $d_{mk}$ is the distance (in meters) from AP $m$ to MS $k$ and $\alpha$ is the path gain exponent. These two coefficients may vary under line-of-sight (LOS) and non-line-of-sight (NLOS) conditions. The component $\chi^{CF}_{mk}$ corresponds to a log-normally distributed random variable statistically characterized by $\log_{10}(\chi^{CF}_{mk}) \sim \mathcal{N}(0, \sigma_\chi^2)$ whose spatial correlation model is described in [11, (54)-(55)]. The link between the $m$-th AP and the $k$-th MS will be considered to be either in LOS or NLOS, with the LOS probability $p_{LOS}(d_{mk})$ following the distance-dependent model in [33], which is parameterized by a reference distance $d_0$. The resulting DL channel vector $\mathbf{g}_{mk} \in \mathbb{C}^{N_{AP} \times 1}$ between the $k$-th MS and the $m$-th AP (including both large-scale and small-scale fading) can then be generically characterized as a Ricean fading channel consisting of a LOS component, present with probability $p_{LOS}(d_{mk})$, on top of a Rayleigh distributed component [33]. That is,
$$\mathbf{g}_{mk} = \sqrt{\frac{K_{mk}}{K_{mk}+1}}\,\bar{\mathbf{h}}_{mk} + \sqrt{\frac{1}{K_{mk}+1}}\,\tilde{\mathbf{h}}_{mk},$$
with $\bar{\mathbf{h}}_{mk} = \sqrt{\beta^{CF}_{mk}}\, e^{j\kappa_{mk}}\, \mathbf{a}_{mk}$, where $\kappa_{mk} \sim U(0, 2\pi)$ represents the random phase factor and the vector $\mathbf{a}_{mk} = [1, e^{j\pi \sin\psi_{mk}}, \cdots, e^{j(N_{AP}-1)\pi \sin\psi_{mk}}]^T$ is the ULA response vector at the AP, with $\psi_{mk}$ denoting the angle of arrival (AoA) between the $k$-th MS and the $m$-th AP. The NLOS component $\tilde{\mathbf{h}}_{mk}$ follows a distribution $\mathcal{CN}(\mathbf{0}, \mathbf{R}_{mk})$, with $\mathbf{R}_{mk}$ representing the spatial correlation of the antenna array at AP $m$ as seen from user $k$, with an AoA that follows a Gaussian distribution around the nominal angle $\psi_{mk}$ and is modelled as in [34, Chapter 2], subject to $\mathrm{Tr}(\mathbf{R}_{mk}) = N_{AP}\,\beta^{CF}_{mk}$. Parameter $K_{mk}$ denotes the Ricean $K$-factor, with $K_{mk} = 0$ for NLOS propagation links and $10 \log_{10}(K_{mk}) \sim \mathcal{N}(\mu_K, \sigma_K^2)$ for LOS propagation links.
In line with most CF-M-MIMO literature, block-fading is assumed, with the channel remaining static for the duration of one block and then varying independently from block to block.
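As an illustration of the terrestrial channel model just described, the following minimal Python sketch draws one Ricean channel vector for a half-wavelength ULA. It is only a sketch under stated assumptions: the function names are hypothetical, the LOS and Rayleigh parts are combined with the usual K-factor weighting, and the Rayleigh part is taken as spatially uncorrelated ($\mathbf{R}_{mk} = \beta^{CF}_{mk}\mathbf{I}$), which is simpler than the Gaussian angular spread model referred to in the text.

```python
import cmath
import math
import random


def ula_response(n_ap, psi):
    """ULA response vector [1, e^{j*pi*sin(psi)}, ...] for an AoA psi in radians."""
    return [cmath.exp(1j * n * math.pi * math.sin(psi)) for n in range(n_ap)]


def ricean_channel(n_ap, beta, k_factor, psi, rng):
    """Draw one channel vector made of a LOS part plus a Rayleigh part.

    Assumption (not from the paper): the Rayleigh part is uncorrelated,
    i.e. R = beta * I, so that Tr(R) = n_ap * beta still holds.
    """
    kappa = rng.uniform(0.0, 2.0 * math.pi)  # random LOS phase
    los = [math.sqrt(beta) * cmath.exp(1j * kappa) * a
           for a in ula_response(n_ap, psi)]
    std = math.sqrt(beta / 2.0)  # standard deviation per real dimension
    nlos = [complex(rng.gauss(0.0, std), rng.gauss(0.0, std))
            for _ in range(n_ap)]
    w_los = math.sqrt(k_factor / (k_factor + 1.0))
    w_nlos = math.sqrt(1.0 / (k_factor + 1.0))
    return [w_los * l + w_nlos * s for l, s in zip(los, nlos)]


if __name__ == "__main__":
    g = ricean_channel(n_ap=4, beta=1e-9, k_factor=0.0, psi=0.3,
                       rng=random.Random(0))
    print(len(g))
```

Setting `k_factor = 0` yields a pure NLOS (Rayleigh) link, whereas a very large `k_factor` makes the vector collapse onto the scaled ULA response.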
Turning our attention now to the space segment, three conditions are assumed when modelling the satellite channel [35]: 1) as in most previous literature, the land mobile satellite (LMS) channel is safely assumed to be frequency flat, especially when considering S-band scenarios mostly subject to moderate-to-large elevation angles with predominant LOS propagation [36]; 2) the channel is deemed constant during a frame transmission; and 3) receivers are provisioned with satellite ephemeris and are equipped with global navigation satellite system (GNSS) receivers, thus MSs are capable of compensating Doppler effects. Under these assumptions, the satellite-to-MS channel can be modelled using a scalar gain $v_k$ characterized by a Ricean distribution whose generation conforms to the specifications in [37], which for user $k$ follows
$$v_k = \sqrt{\frac{K_k}{K_k+1}}\,\bar{v}_k + \sqrt{\frac{1}{K_k+1}}\,\tilde{v}_k,$$
where $\tilde{v}_k$ is the multipath (flat fading) component with distribution $\mathcal{CN}(0, \beta^{SAT}_k)$, with the coefficient $\beta^{SAT}_k$ modelling the large-scale propagation gain (path gain and shadowing), $K_k$ is the $k$-th satellite-user Ricean $K$-factor, and $\bar{v}_k$ is the direct path given by
$$\bar{v}_k = \sqrt{\beta^{SAT}_k}\, e^{j\theta^{LOS}_k},$$
where $\theta^{LOS}_k$ represents the phase term that results from the beam radiation pattern and the radiowave propagation in the direct path which, in contrast to the terrestrial environment, can be assumed to be perfectly known since, under the specified assumptions, its rate of variation is considerably slower than that of the multipath component. Large-scale propagation gains (path gain and shadowing) are modelled as
$$\beta^{SAT}_k = \frac{G_R\, G^k_T\, \chi^{SAT}_k}{L_k},$$
with $G_R$ denoting the MS antenna gain and $G^k_T$ the satellite antenna gain in the direction of the position of the $k$-th user. The shadowing component $\chi^{SAT}_k$ is a zero-mean log-normal random variable, $\log_{10}(\chi^{SAT}_k) \sim \mathcal{N}(0, \vartheta_k^2)$, whose specific values of $\vartheta_k$ depend on whether the specific user experiences good or bad propagation conditions.
The Ricean $K$-factors for each user, $K_k$, are assumed to conform to a log-normal distribution whose mean and variance are specified in [37, 6.7.1], again in accordance with the user's good/bad status. Finally, $L_k$ represents the large-scale propagation loss (due to free-space propagation), defined as $L_k = 10^{L^{dB}_k/10}$, with $L^{dB}_k$ denoting the losses of the Friis model in dB [37]
$$L^{dB}_k = 32.45 + 20 \log_{10}(f^{SAT}_c) + 20 \log_{10}(d^{SAT}_k), \quad (5)$$
with $f^{SAT}_c$ denoting the carrier frequency (in GHz) of the satellite component and $d^{SAT}_k$ the distance (in m) separating the satellite from user $k$. Note that the MS-to-satellite distance can be computed as in [37] from the Earth's radius $R_E$, the satellite height and the elevation angle from the CF-M-MIMO network to the LEO satellite.
Since the satellite height is very large in comparison to the dimensions of the CF-M-MIMO segment, the elevation angle can be considered to be the same for all users, and so can the MS-to-satellite distance, $d^{SAT}_k = d^{SAT}$. Regarding the fast fading term, this is generated in accordance with the two-state model specified in [37], whereby users are considered to be in either a good or a bad state, assumed here to correspond to the probability of being in a LOS/NLOS scenario as defined in Table 6.6.1-1 in [37].
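The free-space loss in (5) can be evaluated numerically once the MS-to-satellite distance is known. The following minimal Python sketch does so; the slant-range expression used here is the standard geometric relation for a satellite at a given height seen under a given elevation angle (the same form that appears in 3GPP non-terrestrial network studies), and both the function names and the Earth radius value are assumptions made for illustration rather than quantities taken from this paper.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters (assumed value)


def slant_range_m(h_sat_m, elevation_deg, r_e=EARTH_RADIUS_M):
    """Distance from a ground user to a satellite at height h_sat_m.

    Standard geometry: d = sqrt(R^2 sin^2(e) + h^2 + 2 h R) - R sin(e).
    """
    s = math.sin(math.radians(elevation_deg))
    return math.sqrt((r_e * s) ** 2 + h_sat_m ** 2
                     + 2.0 * h_sat_m * r_e) - r_e * s


def friis_loss_db(f_ghz, d_m):
    """Free-space loss of (5): carrier frequency in GHz, distance in meters."""
    return 32.45 + 20.0 * math.log10(f_ghz) + 20.0 * math.log10(d_m)


if __name__ == "__main__":
    d = slant_range_m(600e3, 60.0)   # hypothetical 600 km orbit, 60 degrees
    loss_db = friis_loss_db(2.2, d)  # S-band carrier at 2.2 GHz
    print(d, loss_db, 10.0 ** (loss_db / 10.0))
```

At an elevation of 90 degrees the slant range reduces to the satellite height, and it grows as the satellite approaches the horizon, which is why a minimum elevation angle bounds the footprint.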

D. CHANNEL ESTIMATION
As in most of the massive multiple-input multiple-output (MIMO) literature, time-division duplex (TDD) is assumed, whereby during the UL terrestrial training phase all $K$ MSs simultaneously transmit pilot sequences of $\tau_p$ samples to the APs, thus resulting in an $N_{AP} \times \tau_p$ matrix of received training samples at the $m$-th active AP. In this training model, $P^{MS}_p$ is the available pilot symbol power at the MS, $\boldsymbol{\varphi}_k$ denotes the $\tau_p \times 1$ training sequence assigned to MS $k$, with $\|\boldsymbol{\varphi}_k\|^2 = 1$, and $\mathbf{N}^p_m \in \mathbb{C}^{N_{AP} \times \tau_p}$ is a matrix of independent identically distributed (iid) zero-mean circularly symmetric Gaussian random variables with standard deviation $\sigma_u$. From these samples, each AP obtains the phase-agnostic Bayesian linear MMSE channel estimate $\hat{\mathbf{g}}_{mk}$ derived in [38]. The channel estimation error, $\boldsymbol{\varepsilon}_{mk} = \mathbf{g}_{mk} - \hat{\mathbf{g}}_{mk}$, conforms to a distribution $\boldsymbol{\varepsilon}_{mk} \sim \mathcal{CN}(\mathbf{0}, \mathbf{A}_{mk})$, with $\mathbf{A}_{mk}$ denoting the estimation error covariance matrix. For later convenience, the $M N_{AP} \times 1$ vector collecting the channel responses from all APs to user $k$ in the network is defined as $\mathbf{g}_k = [\mathbf{g}^T_{1k} \ldots \mathbf{g}^T_{Mk}]^T$, with $\hat{\mathbf{g}}_k$ denoting its corresponding MMSE estimate. Similarly, the matrix $\mathbf{G} = [\mathbf{g}_1 \ldots \mathbf{g}_K]$ collects the channels between the $M$ APs and the $K$ MSs, and $\hat{\mathbf{G}} = [\hat{\mathbf{g}}_1 \ldots \hat{\mathbf{g}}_K]$ denotes its MMSE estimate.
Unlike the terrestrial CF-M-MIMO segment, satellite links tend to rely on frequency division duplexing (FDD) and therefore a dedicated training phase is required for both UL and DL. Since our focus here is on the DL, the satellite needs to send a training sequence to enable satellite channel estimation at the MS receiver. Owing to its frequency-flat character, channel estimation of the satellite-MS link is conducted by periodically transmitting a unit-norm pilot symbol $\phi_k$ with power $P^{SAT}_p$ from the satellite and performing matched pilot filtering at the MS to obtain the estimate $\hat{v}_k$. The distribution of this estimate is easily derived, and from it the channel estimation error $\epsilon_k = \hat{v}_k - v_k$ can be defined, whose variance $\sigma^2_{\epsilon,k}$, given its dependence on slowly-varying large-scale parameters, can be safely assumed to be known at the receiver. For simplicity of presentation, it is assumed that all $K$ users estimate both the terrestrial and satellite channels.

E. USER CLUSTERING
Let us assume that the set of $K$ users to be served, denoted by $\mathcal{U}$, can be split into two subsets $\mathcal{U}_{CF}$ and $\mathcal{U}_{SAT}$ with cardinalities $K_{CF}$ and $K_{SAT}$, which denote the MSs served by the terrestrial APs and by the LEO satellite, respectively. In compliance with the cell-free scalability requirement, every MS assigned to the ground segment is served only by a fraction of the APs. To formalize this selective transmission, and following [39], we define the $M \times K$ terrestrial connectivity matrix $\mathbf{C}$ whose entries $c_{mk}$ are defined as
$$c_{mk} = \begin{cases} 1, & k \in \mathcal{U}^{CF}_m \\ 0, & \text{otherwise,} \end{cases}$$
where $\mathcal{U}^{CF}_m$ denotes the set of users served by AP $m$. Note that any user $k$ in $\mathcal{U}_{SAT}$ results in the $k$-th column of $\mathbf{C}$ being an all-zero vector. For convenience, we define at this point the $1 \times K$ vector $\mathbf{c}_{[m]}$ as the $m$-th row of $\mathbf{C}$, which represents the connectivity of AP $m$, and the $M \times 1$ vector $\mathbf{c}^{[k]}$ as the $k$-th column of $\mathbf{C}$, which corresponds to the connectivity of MS $k$. It is worth pointing out that this selective connectivity has two implications: on the one hand, it bounds the processing requirements each AP has to endure (i.e., estimation of a finite number of channel responses and calculation of a finite number of precoding vectors), thus guaranteeing scalability; on the other hand, an MS may be served by only a few APs or even be left totally unserved, thus resulting in an outage from the terrestrial segment for that particular user. In this work the connectivity matrix entries are set in accordance with the scalability principle established in [24], whereby each AP is forced to serve a maximum of $U^{AP}_{max}$ users, thus resulting in the scalability constraint $|\mathcal{U}^{CF}_m| \le U^{AP}_{max}$. Among the many potential strategies to determine each $\mathcal{U}^{CF}_m$, in this paper the technique from [24] is adopted, whereby each AP $m$ serves the $\min\{U^{AP}_{max}, K\}$ users experiencing the strongest large-scale propagation gains from that specific AP.
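The AP-centric selection rule just described can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names and the nested-list layout of the large-scale gains (`beta[m][k]` for AP `m` and MS `k`) are assumptions made here.

```python
def connectivity_matrix(beta, u_ap_max):
    """Build the M x K 0/1 connectivity matrix.

    Each AP (row of beta) serves the min(u_ap_max, K) users with the
    strongest large-scale gains towards that AP.
    """
    conn = []
    for gains in beta:
        n_served = min(u_ap_max, len(gains))
        strongest = sorted(range(len(gains)),
                           key=lambda k: gains[k], reverse=True)[:n_served]
        row = [0] * len(gains)
        for k in strongest:
            row[k] = 1
        conn.append(row)
    return conn


def unserved_users(conn):
    """Indices of users whose column is all-zero (terrestrial outage)."""
    n_users = len(conn[0])
    return [k for k in range(n_users) if not any(row[k] for row in conn)]


if __name__ == "__main__":
    beta = [[0.9, 0.1, 0.5],
            [0.8, 0.2, 0.7]]
    C = connectivity_matrix(beta, u_ap_max=2)
    print(C, unserved_users(C))
```

In this toy example both APs pick users 0 and 2, so user 1 ends up with an all-zero column, which is the kind of terrestrial outage that makes a user a natural candidate for diversion to the satellite segment.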
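Note that (14) must be recomputed in accordance with the rate of change of the large-scale propagation coefficients $\beta^{CF}_{mk}$.

[...] the $N_{AP}M \times 1$ vector $\bar{\mathbf{w}}_k$ representing the overall precoding action for user $k \in \mathcal{U}_{CF}$. Note that the average power constraint at the APs implies that the signal $\mathbf{x}_m$ transmitted by AP $m$ satisfies $E\{\|\mathbf{x}_m\|^2\} \le P_{AP}$. The signal received at the $k$-th MS in $\mathcal{U}_{CF}$ is the superposition of the precoded signals from the APs and a Gaussian noise component $\upsilon_k \sim \mathcal{CN}(0, \sigma_\upsilon^2)$. Distributed and centralized schemes are now discussed.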
1) Distributed precoding. Splitting the precoder at the $m$-th AP into its user-specific components, we can write $\bar{\mathbf{W}}_m = [\bar{\mathbf{w}}_{m1} \cdots \bar{\mathbf{w}}_{mK}]$, where each precoding vector $\bar{\mathbf{w}}_{mk}$ is derived using a two-step procedure: one step determines the directivity of the precoding vector and the other normalizes the precoder's power. Formally,
$$\bar{\mathbf{w}}_{mk} = \sqrt{p^{DL}_{mk}}\, \frac{\mathbf{w}_{mk}}{\sqrt{E\{\|\mathbf{w}_{mk}\|^2\}}}, \quad (16)$$
with $p^{DL}_{mk}$ defining the power assigned by AP $m$ to MS $k$. The divisive factor in (16), $\sqrt{E\{\|\mathbf{w}_{mk}\|^2\}}$, represents the power normalization step, which eases the subsequent derivation of different power allocation policies (i.e., computing $p^{DL}_{mk}$) by relying on the fact that the normalized directivity vector has unit average norm. Note that the normalization guarantees that the average power of the precoding vector fulfills $E\{\|\bar{\mathbf{w}}_{mk}\|^2\} = p^{DL}_{mk}$. Turning our attention now to the precoder directivity, for conciseness we focus on the local MMSE (L-MMSE) precoder [25], which combats the interference among the users served by a given AP by exploiting the fact that the strongest interferers for an arbitrary user $k \in \mathcal{U}_{CF}$ will typically be those closely positioned on the coverage area. It is defined as
$$\mathbf{w}_{mk} = p^{UL}_k \left( \sum_{i=1}^{K} c_{mi}\, p^{UL}_i\, \mathbf{B}_{mi} + \sigma_u^2\, \mathbf{I}_{N_{AP}} \right)^{-1} \hat{\mathbf{g}}_{mk}, \quad (17)$$
where $\mathbf{B}_{mi} = \hat{\mathbf{g}}_{mi}\hat{\mathbf{g}}^H_{mi} + \mathbf{A}_{mi}$, and $p^{UL}_k$ denotes the power coefficient used by user $k$ in the UL. Recall that the sum inside the inverse in (17) needs only be computed over the users served by that AP (i.e., those for which $c_{mi} = 1$). We note that a signal-to-interference-plus-noise ratio (SINR) optimal precoder is not analytically known. Fortunately, by relying on the UL-DL duality theorem [25, Theorem 6.2], selecting the precoder's directivity as in (17), which is a scaled version of the corresponding uplink MMSE combiner (whose SINR-maximizing optimality is demonstrated in [40]), constitutes a very effective heuristic towards the maximization of the downlink SINR.
2) Centralized precoding. The centralized MMSE (C-MMSE) precoder, introduced in [40], aims at minimizing the interference among all users in the network and can be considered to provide a performance upper bound due to its near optimality [25]. Looking at the global precoder $\bar{\mathbf{W}}$ column-wise, $\bar{\mathbf{W}} = [\bar{\mathbf{w}}_1 \cdots \bar{\mathbf{w}}_{K_{CF}}]$, the overall $N_{AP}M \times 1$ user-specific precoder follows
$$\bar{\mathbf{w}}_k = \sqrt{p^{DL}_k}\, \frac{\mathbf{w}_k}{\sqrt{E\{\|\mathbf{w}_k\|^2\}}}, \quad (18)$$
where $p^{DL}_k$ is the overall DL power for user $k$ and the directivity $\mathbf{w}_k$ takes into account only the set of MSs that are served by partially the same APs as MS $k$ [25]. Note that (18) follows the same structure as its distributed counterpart by first determining the precoder's directivity and then applying a normalization to enforce a unit average norm, thus ensuring that $E\{\|\bar{\mathbf{w}}_k\|^2\} = p^{DL}_k$. Also, as in the distributed case, the specific choice of the precoder's directivity stems from the UL-DL duality theorem, which ensures that the C-MMSE precoder is an attractive strategy to maximize the resulting SINR for each user.
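Both precoders share the same two-step structure: a directivity vector followed by a power normalization. The normalization step alone can be sketched as below. This is a minimal Python illustration under stated assumptions: the expectation over channel realizations is replaced by a sample average over a list of directivity vectors, and the function name is hypothetical.

```python
import math


def normalise_precoder(direction_samples, power):
    """Scale directivity vectors so their average squared norm equals `power`.

    direction_samples: list of complex vectors, one per channel realization
    (assumption: the expectation is approximated by a sample mean).
    """
    mean_sq_norm = sum(sum(abs(x) ** 2 for x in w)
                       for w in direction_samples) / len(direction_samples)
    scale = math.sqrt(power / mean_sq_norm)
    return [[scale * x for x in w] for w in direction_samples]


if __name__ == "__main__":
    samples = [[1 + 1j, 0j], [0j, 2j]]  # toy directivity vectors
    scaled = normalise_precoder(samples, power=0.6)
    avg = sum(sum(abs(x) ** 2 for x in w) for w in scaled) / len(scaled)
    print(avg)
```

Because a single scale factor is applied to all realizations, the direction of each precoding vector is untouched and only the average transmitted power is fixed, which is what allows the power coefficients to be designed separately from the directivity.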
Many previous works on CF-M-MIMO rely on max-min principles to derive the power weights as the solution to a quasi-convex optimization problem that results in equal user rates across the network [12]. Unfortunately, such a solution is only feasible when all MSs are served by all APs (i.e., when $\mathbf{C} = \mathbf{1}_{M \times K}$), a condition that compromises the scalability requirement. Therefore, and for the sake of scalability, the fractional power allocation of [25] is adopted for both centralized precoding and distributed processing, with $\upsilon \in [-1, 1]$ denoting a parameter used to steer the power allocation towards different performance targets (e.g., sum-rate, max-min).
In particular, values of $\upsilon < 0$ tend to favour max-min fairness whereas values of $\upsilon > 0$ strive for maximum sum-rate [25].
In accordance with the CF-M-MIMO spirit (i.e., providing uniform QoS), the former option ($\upsilon < 0$) will be adopted. Although our focus is on the DL, the precoders depend on the UL power coefficients (see (17) and (19)), which are set in accordance with the UL fractional power allocation described in [25, eq. (7.34)], with $P_{MS}$ denoting the available transmit power at each MS. Under the assumption of statistical CSI at the receiver (see footnote 1), the instantaneous DL SINR for user $k$, denoted by $\mathrm{SINR}^{CF}_k$, is then expressed in terms of $\mathrm{DS}_k = \mathbf{g}^H_k \bar{\mathbf{w}}_k$, $\mathrm{BU}_k = \mathbf{g}^H_k \bar{\mathbf{w}}_k - E\{\mathbf{g}^H_k \bar{\mathbf{w}}_k\}$ and $\mathrm{UI}_{kk'} = \mathbf{g}^H_k \bar{\mathbf{w}}_{k'}$. This SINR expression allows the ergodic rate for user $k \in \mathcal{U}_{CF}$ to be derived, where the expectation operator is taken with respect to multiple small-scale and large-scale fading realizations. Note that, unlike the case of conjugate beamforming precoding and its variants, which admits the derivation of a closed-form expression for a tight lower bound of the rate [11], [41], the use of more advanced precoding schemes implies that this expectation can only be evaluated via numerical simulation. For an exhaustive and recent discussion of the different precoding schemes and power allocation strategies in a CF-M-MIMO context, the interested reader is referred to [25].
Footnote 1: The statistical CSI knowledge required to evaluate $\mathrm{SINR}^{CF}_k$ basically consists of estimating $E\{\mathbf{g}^H_k \bar{\mathbf{w}}_k\}$, a procedure that can be implemented at the receiver side by relying on the user-available large-scale information (i.e., propagation losses, power coefficients) and the average of the received samples in (15).

G. SATELLITE TRANSMIT PROCESSING
Considering that an arbitrary satellite-served user is indexed by $k \in U_{\mathrm{SAT}}$, the transmitted symbol from the satellite conforms to $x_k = \sqrt{\eta^{\mathrm{SAT}}_{k}}\, q_k$, with $q_k$ representing the corresponding information symbol and $\eta^{\mathrm{SAT}}_{k}$ the power weight. At the reception end, the user terminal implements a matched filter to maximize the received SNR. Relying on the fact that $\hat{v}_k = v_k + \epsilon_k$, the user's estimated symbol can be expressed in terms of the channel estimate, the channel estimation error and $w_k$, a zero-mean AWGN sample with variance $\sigma^{2}_{w,\mathrm{SAT}}$. Based on this expression, and conditioned on the knowledge of the channel estimate $\hat{v}_k$ and the corresponding channel estimation error variance $\sigma^{2}_{\epsilon,k}$, an achievable rate for this satellite user can then be derived as a function of $\mathrm{SNR}^{\mathrm{SAT}}_{k}$, an equivalent instantaneous SNR for user $k \in U_{\mathrm{SAT}}$. Note that the bandwidth $B_{\mathrm{SAT}}$ has been assumed to be equally split among all the $K_{\mathrm{SAT}}$ users diverted to the satellite segment in an FDMA-like fashion. In line with the terrestrial segment, power loads are chosen using the same fractional power allocation policy, given in (28). No scalability issues arise in this case, as the power coefficients are jointly determined for all satellite-diverted users. It is easy to check that this power allocation guarantees that $\sum_{k \in U_{\mathrm{SAT}}} \eta^{\mathrm{SAT}}_{k} = P_{\mathrm{SAT}}$. Note that the equal-bandwidth split among all diverted users and the fact that the satellite received signal strength is greatly influenced by the free-space propagation loss (common to all users), coupled with a power allocation policy in (28) that tends to equalise differences arising from user-specific shadowing effects, effectively imply that all diverted users achieve virtually the same transmission rate (i.e., $R^{\mathrm{SAT}}_{k} \approx R^{\mathrm{SAT}}\ \forall k \in U_{\mathrm{SAT}}$).
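The equalising effect of the satellite power allocation can be illustrated with a short sketch. It assumes a generic fractional rule in which each power weight is proportional to the large-scale gain raised to the exponent υ, and a simplified SNR that ignores channel estimation errors; neither is claimed to be the exact form of (28) or of the paper's rate expression. All function names and numerical values are hypothetical.

```python
import math


def satellite_fpa(betas, p_sat, upsilon):
    """Fractional power allocation (assumed form): eta_k proportional to beta_k**upsilon.

    The weights are normalized so that they add up to the satellite power p_sat.
    """
    weights = [beta ** upsilon for beta in betas]
    total = sum(weights)
    return [p_sat * w / total for w in weights]


def satellite_rates(betas, etas, bandwidth_hz, noise_psd_w_per_hz):
    """Per-user rate with the bandwidth equally split among the diverted users.

    Simplification of this sketch: perfect channel knowledge, so the SNR is
    eta_k * beta_k divided by the noise power in the user's sub-band.
    """
    sub_band = bandwidth_hz / len(betas)
    noise_power = noise_psd_w_per_hz * sub_band
    return [
        sub_band * math.log2(1.0 + eta * beta / noise_power)
        for eta, beta in zip(etas, betas)
    ]


# Hypothetical large-scale gains of three diverted users (linear scale).
betas = [2e-13, 1e-13, 5e-14]
etas = satellite_fpa(betas, p_sat=10.0, upsilon=-1.0)
rates = satellite_rates(betas, etas, bandwidth_hz=30e6, noise_psd_w_per_hz=2e-20)
print(sum(etas))                # total power, equal to p_sat up to rounding
print(max(rates) - min(rates))  # spread of the per-user rates
```

With υ = -1 the product of power weight and gain is the same for every user, so under these assumptions the rates coincide; values of υ between -1 and 0 would only partially compensate the gain differences.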

III. INTEGRATED CF-M-MIMO/LEO OPERATION
Using a large number of APs throughout the coverage area and employing FPA with parameter υ tuned for max-min performance (i.e., υ < 0) results in a similar-rate performance for all users in the coverage area. Nonetheless, as happens when using strict max-min (non-scalable) power optimization, a few ill-positioned users often condition the performance of the whole network, a situation frequently found in LOS environments [22]. A possible solution to this problem consists of discarding these bad users, which has been shown to significantly increase the aggregate throughput at the cost of introducing a certain outage probability [42]. In this work we consider the potential benefit that a LEO satellite, visible from the CF-M-MIMO segment, can have in increasing the user rate. Towards this end, this paper advocates offloading the users that limit the performance of the terrestrial network to the satellite segment. In doing so, we combine the CF-M-MIMO precoder design and power allocation with the satellite power allocation in (28) while still aiming at providing a similar quality of service (QoS) to any user in the coverage area. The scalable LEO-enhanced CF-M-MIMO optimization problem is formally posed in (29), where ρ, with 0 ≤ ρ ≤ 1, is a designer-chosen weighting parameter that serves to bias the optimization to favour the use of the terrestrial segment. In particular, when ρ = 0, the proposed hybrid system totally neglects the satellite segment and thus becomes a conventional CF-M-MIMO network; that is, (29) falls back to the system defined in Section II-F. The function f(·) is an arbitrary function depending on the ground and satellite user rates. For conciseness, in this paper we focus on the function f = min(·), as this matches the heuristic defined by FPA with υ < 0 in an attempt to favour the performance of the worst users in the network.
It is important to recognize at this point that, by relying on the specific precoder designs and power allocation procedures detailed in Section II, the solution to (29) is guaranteed to satisfy the power constraint at each AP and the maximum transmit power condition at the satellite while enforcing scalability in the CF-M-MIMO segment. Problem (29) is a mixed-integer optimization problem and, as such, it is non-convex and its solution requires an exhaustive search over all possible user groupings in $U_{\mathrm{CF}}$ and connectivity patterns $\mathbf{C}$. Such a search quickly becomes infeasible, even for a modest number of users, given that the evaluation of each possible combination entails the recalculation of the CF-M-MIMO precoding filters. Given the combinatorial nature of (29), we tackle it using a computationally viable greedy approach, detailed in Algorithm 1, to be executed at the BSC/SAT-GW. In particular, we take as the starting point of the search a setup where all users are served through the CF-M-MIMO network (i.e., $U_{\mathrm{CF}} = U$, $U_{\mathrm{SAT}} = \emptyset$), with the corresponding precoders and connectivity calculated as if no satellite segment were present. The algorithm then proceeds by trying to offload one user at a time from the terrestrial to the satellite segment. The candidate user to be diverted is the one experiencing the lowest transmission rate on the CF-M-MIMO network. Interestingly, the removal of a user from the terrestrial segment allows the rest of the CF-M-MIMO users to see their rates enhanced, as the diverted user often tends to badly condition the precoders (and resulting SINRs) of the remaining users in the CF-M-MIMO segment. The procedure terminates whenever it is detected that the next user to be transferred to the satellite segment has a better terrestrial rate than the rate the last diverted user is enjoying on the satellite link. The rationale for this condition is rooted in the fact that satellite users achieve nearly perfect rate fairness and, therefore, the inclusion of a new user in the satellite segment would unavoidably push the rate of all users diverted so far below that of the worst terrestrial user. The resulting optimal satellite user set is trivially derived from the CF-M-MIMO one as $U^{\mathrm{SAT}}_{\mathrm{opt}} = U \setminus U^{\mathrm{CF}}_{\mathrm{opt}}$.

Initialization:
1) Selected users/index: $U_{\mathrm{CF}}(0) = U$, $U_{\mathrm{SAT}}(0) = \emptyset$, i = 0.
2) Derive initial connectivity matrix $\mathbf{C}(0)$ using (14).
3) Determine, using L-MMSE (17) or C-MMSE (19), the precoding matrix $\mathbf{W}$.
4) Compute power allocation coefficients: $p^{\mathrm{DL}}_{mk}$ using (21) for L-MMSE or $p^{\mathrm{DL}}_{k}$ using (20) for C-MMSE.
5) Determine baseline user rates.
Update iteration:
It is important to recognize that the terrestrial/satellite user partition needs only be carried out on a large-scale basis, as it is solely governed by the large-scale parameters ($\beta^{\mathrm{CF}}_{mk}$, $\beta^{\mathrm{SAT}}_{k}$). Also, it is worth mentioning that although in this work we focus on improving the worst user performance, other strategies could be pursued: sum-rate improvement could be targeted by choosing υ > 0 and a loop exit condition that enforces ground-to-satellite diversion until the combined sum-rate of both segments is maximized.
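The greedy diversion loop described above can be sketched as follows. This is only an illustration under stated assumptions: the two rate models are toy placeholders standing in for the recomputation of connectivity, precoders and power allocation, the weighting parameter ρ is omitted, and, when no user has been diverted yet, the satellite rate is taken as the rate a single diverted user would obtain. All names and numbers are hypothetical.

```python
def greedy_offload(users, cf_rate_fn, sat_rate_fn):
    """Divert the worst terrestrial user, one at a time, to the satellite.

    cf_rate_fn(cf_users) -> dict mapping each terrestrial user to its rate;
                            stands in for recomputing the terrestrial segment.
    sat_rate_fn(n)       -> common rate of n satellite users (equal split).
    """
    cf_users = set(users)
    sat_users = []
    while len(cf_users) > 1:
        rates = cf_rate_fn(cf_users)
        worst = min(rates, key=rates.get)  # lowest-rate terrestrial user
        # Rate currently enjoyed on the satellite link (assumption: rate of a
        # single user when nobody has been diverted yet).
        sat_now = sat_rate_fn(max(len(sat_users), 1))
        if rates[worst] >= sat_now:
            break  # candidate already does better on the ground: stop
        cf_users.remove(worst)
        sat_users.append(worst)
    return cf_users, sat_users


# Toy models (hypothetical): terrestrial rates improve as users leave,
# satellite capacity is shared equally among diverted users.
quality = {0: 1.0, 1: 0.9, 2: 0.2, 3: 0.1, 4: 0.8}


def toy_cf_rates(cf_users):
    return {k: quality[k] * 100.0 / len(cf_users) for k in cf_users}


def toy_sat_rate(n_users):
    return 48.0 / n_users


cf_set, sat_list = greedy_offload(quality.keys(), toy_cf_rates, toy_sat_rate)
print(sorted(cf_set), sat_list)  # [0, 1, 4] [3, 2]
```

In this toy run the two weakest users are diverted and the loop stops at the third candidate, whose terrestrial rate already exceeds the rate shared by the satellite users.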

IV. NUMERICAL RESULTS
We consider a DL scenario where a single CF-M-MIMO terrestrial network provides coverage to a square area of side L through the random (uniform) deployment of M multiantenna APs, each with $P_{\mathrm{AP}}$ = 200 mW and operating over a bandwidth $B_{\mathrm{CF}}$ = 20 MHz. Spatially correlated antenna arrays are assumed at the APs, with an azimuth angular spread of 10° according to the model defined in [34]. The terrestrial segment is supplemented by a LEO satellite situated at a height of 600 km above the Earth with an elevation angle, unless otherwise stated, of 70°, thus resulting in a satellite to CF-M-MIMO distance of roughly 635 km. The satellite link operates at a carrier frequency of 2.2 GHz (S-band) over a per-beam bandwidth of $B_{\mathrm{SAT}}$ = 30 MHz. The satellite is assumed to be equipped with directional antennas with a maximum gain of 30.5 dBi and a maximum transmit power per beam of $P_{\mathrm{SAT}}$ = 10 W, resulting in an effective isotropic radiated power (EIRP) of 40 dBW/MHz. The noise figures for the terrestrial and satellite transceivers are 9 dB and 7 dB, respectively [37]. Please refer to Table 1 for an exhaustive list of the parameters used to generate these simulation results. Users are also deployed in a random fashion (uniformly) throughout the terrestrial coverage area and they are assumed to have dual connection capability (terrestrial/satellite) [7]. Users transmit pilot sequences with power $P^{\mathrm{MS}}_{p}$ = 100 mW to enable the terrestrial channel estimation, while the weighting parameter is set to ρ = 1, thus giving equal weight to the terrestrial and satellite components. Power allocation coefficients for both terrestrial and satellite users are obtained using FPA as explained in Section II-F. The coherence interval is set to $\tau_c$ = 200 samples with the pilot size fixed to $\tau_p$ = 16 samples. For scalability reasons, the maximum number of users any AP can serve is fixed to $U^{\mathrm{AP}}_{\max}$ = 16.
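The quoted satellite distance can be checked with the standard slant-range geometry for a spherical Earth. The sketch below assumes a mean Earth radius of 6371 km and adds a textbook free-space path loss helper; the function names are hypothetical and the path loss helper is not the channel model used in the paper.

```python
import math

EARTH_RADIUS_KM = 6371.0          # mean Earth radius (assumed value)
SPEED_OF_LIGHT_M_S = 299_792_458.0


def slant_range_km(altitude_km, elevation_deg):
    """Distance from a ground terminal to a satellite seen at a given elevation."""
    eps = math.radians(elevation_deg)
    r = EARTH_RADIUS_KM
    return (
        math.sqrt((r + altitude_km) ** 2 - (r * math.cos(eps)) ** 2)
        - r * math.sin(eps)
    )


def free_space_path_loss_db(distance_km, carrier_hz):
    """Textbook free-space path loss, 20*log10(4*pi*d*f/c), in dB."""
    distance_m = distance_km * 1e3
    return 20.0 * math.log10(
        4.0 * math.pi * distance_m * carrier_hz / SPEED_OF_LIGHT_M_S
    )


d70 = slant_range_km(600.0, 70.0)
print(round(d70))  # 635
print(free_space_path_loss_db(d70, 2.2e9))                          # loss at 70 degrees
print(free_space_path_loss_db(slant_range_km(600.0, 30.0), 2.2e9))  # loss at 30 degrees
```

Lower elevation angles lengthen the slant range and therefore increase the free-space loss, which is the geometric effect behind the elevation-angle results discussed later in this section.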
Figures 2-5 are intended to identify the benefits that the satellite segment brings under different operating conditions by depicting average and minimum user rates for different levels of terrestrial densification (varying number of APs per unit area) for a fixed user load of K = 30 users. In all these figures Ricean fading is considered in both the terrestrial and the satellite segments. Digging deeper into each figure, Fig. 2 assumes the use of C-MMSE with APs and MSs uniformly distributed over a square area of side L = 1000 meters, with each AP having $N_{\mathrm{AP}}$ = 4 antennas. Average and (average) minimum user rates are shown for both the stand-alone CF-M-MIMO system and the integrated LEO-CF-M-MIMO one. For the latter case, aside from the overall user performance and to gain further insight, the disaggregated average rates are also depicted for the users on each of the two segments. Overall average user rates show a slight superiority of the integrated scheme over the stand-alone CF-M-MIMO that is more apparent when only a small number of APs are deployed. However, a much more significant difference is observed when looking at the minimum rates: in this case the LEO-CF-M-MIMO scheme more than doubles the minimum user rate for a low number of APs, with this difference progressively diminishing as more APs are incorporated. This behaviour confirms that the heuristic of Algorithm 1, whereby ill-conditioned terrestrial users are diverted to the satellite segment, works as intended. Note, nonetheless, that for dense AP deployments the benefit of the satellite access is marginal. It is interesting to explain the behaviour of the average satellite user rate: when only a small number (20 to 60) of APs are present, a fair number of users are diverted to the satellite segment, thus resulting in satellite user rates below those achieved by the users left on the ground segment.
As the CF-M-MIMO network becomes denser, fewer and fewer users are diverted, up to the point that if only a single user is diverted, it will enjoy the full 30-MHz bandwidth and the full transmit power, and hence attain a high data rate. This effect could be modulated by setting the ground-satellite weight ρ to a value less than one. Recall that Algorithm 1 has been tuned to improve the performance of the users with the lowest rates, an objective that is indeed satisfied, while also improving the average rate with respect to the stand-alone CF-M-MIMO system. Figure 3 examines the same configuration as that in Fig. 2 except that now the coverage area is defined by a square of side length L = 2000 meters, thus quadrupling the coverage area previously considered. Clearly, the benefits that the satellite network brings are now far more apparent and, importantly, fairly consistent regardless of the number of APs in the terrestrial network. The minimum rate improvement is drastic: for example, to achieve a 10 or 20 Mb/s per-user minimum rate, the integrated LEO-CF-M-MIMO network needs roughly 40 fewer APs than the stand-alone CF-M-MIMO network. The improvement is also significant when examining average rates, with the integrated network offering an average user rate 10 Mb/s higher than that achieved in a pure CF-M-MIMO network. It is interesting to observe in this figure that the average data rate of the ground users in the integrated LEO-CF-M-MIMO system consistently and significantly outperforms that achieved by the stand-alone CF-M-MIMO scheme.
This implies that, as expected, removing the worst users from the ground segment improves the performance of the remaining ones owing to two different effects: on the one hand, the calculation of the precoding matrix is potentially better conditioned as harmful users have been dropped and, on the other hand, the connectivity matrix can be reconfigured to better serve the existing ground users by exploiting the empty connections left by the diverted users. Figure 4 examines the performance when using the distributed L-MMSE precoder (with L = 1000 meters and $N_{\mathrm{AP}}$ = 4), where the first important effect to note is that the achieved rates are considerably lower than those offered by the C-MMSE precoder. As in Fig. 3, the integrated LEO-CF-M-MIMO network clearly outperforms the stand-alone CF-M-MIMO network both in terms of average and minimum user rates. Finalizing this group of simulations, Fig. 5 considers the use of C-MMSE precoding but now with single-antenna APs ($N_{\mathrm{AP}}$ = 1) and L = 1000 meters. Again, the integrated scheme offers substantial benefits both in terms of average user rate and minimum user rate. As a reference value, note how the LEO-CF-M-MIMO network is able to secure a minimum 30 Mb/s user rate with 70 APs, whereas a 100-AP deployment is needed to achieve the same value when considering a terrestrial network only. Comparing these results to those obtained with multi-antenna APs (see Fig. 2), it is clear that the satellite segment becomes increasingly influential as the terrestrial infrastructure becomes simpler. This result helps to point out the trade-off between deploying a more complex ground infrastructure and relying on the satellite segment to attain a prescribed user performance.
In order to assess how flexible the satellite connection can be, Fig. 6 shows the average minimum rate performance for a stand-alone CF-M-MIMO network in comparison to the integrated approach when the serving satellite is situated at different elevation angles. In this case, and to examine a broader range of propagation conditions, we consider a setup with Rayleigh fading in the terrestrial network and Ricean fading for the satellite link when using L-MMSE precoding with M = 100 APs and $N_{\mathrm{AP}}$ = 4 antennas scattered over a square area with L = 1000 meters and serving a varying number of users. Logically, decreasing the satellite elevation angle results in a decreased benefit of the satellite component, mostly due to a longer distance between the satellite and the terrestrial network and, correspondingly, a larger propagation loss. As expected, the average user rate falls with increasing network load due to the larger degree of interference in the CF-M-MIMO segment and the higher spectral sharing among users in the satellite component. Nevertheless, the important fact this figure puts forward is that even for elevation angles as low as 30°, there are significant improvements in the minimum user rates regardless of the network load (10 ≤ K ≤ 70). As most planned LEO constellations (e.g., Starlink, Kuiper) envisage that any point on Earth will have a multitude of satellites potentially in sight, the large resilience to the elevation angle provides a high degree of flexibility to pair a given satellite/beam with a certain terrestrial network. Moreover, it opens the door to jointly managing an integrated network comprising many satellites/beams and terrestrial segments (for example, many beams could be directed towards a small terrestrial region in case of events attracting a large number of subscribers).
Concluding the numerical study, the last set of results analyzes performance as a function of the network load. Figure 7 depicts the average and minimum user rates when assuming that APs are equipped with $N_{\mathrm{AP}}$ = 4 antennas. In terms of average performance, results show that the improvement brought by the satellite segment ranges between 25% for lightly loaded networks (10 users) and 10% for highly loaded environments (70 users). However, and as previously observed, it is when assessing the minimum rate performance that the integrated LEO-CF-M-MIMO system dramatically outperforms the stand-alone CF-M-MIMO scheme. As an example, note that to guarantee a minimum rate of 15 Mb/s per user, the pure CF-M-MIMO network can only support 10 users, whereas the integrated one quadruples this figure by supporting up to 40 users. In fact, regardless of the user load, the integrated network is able to at least double the minimum rate of the CF-M-MIMO network. As already mentioned for previous graphs, it is important to note how the average rate of the terrestrial users of the integrated network consistently outperforms that of the stand-alone CF-M-MIMO network under any network load. This is again caused by the better conditioning of the precoding filter when the most ill-conditioned users need not be served from the APs. Note that, in the case of the integrated LEO-CF-M-MIMO scheme, when only a few users are present the minimum rate performance is limited by the terrestrial users. However, once the network load increases, the minimum rate becomes far more conditioned by that of the satellite users. Figure 8 depicts the per-user rate Jain's fairness index (JFI) for both schemes (CF-M-MIMO and LEO-CF-M-MIMO) and for $N_{\mathrm{AP}}$ = 4 transmit antennas. The JFI, when applied to the throughput of K users denoted by $R_1, \ldots, R_K$, is usually defined as $J = \left(\sum_{k=1}^{K} R_k\right)^2 / \left(K \sum_{k=1}^{K} R_k^2\right)$, and its value is constrained to the range [1/K, 1], with unity indicating perfect fairness.
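As a quick illustration of the JFI definition given above, the following minimal sketch evaluates the index for its two extreme cases. The function name and the rate values are hypothetical.

```python
def jain_fairness_index(rates):
    """Jain's fairness index: (sum of rates)^2 / (K * sum of squared rates)."""
    total = sum(rates)
    sum_of_squares = sum(r * r for r in rates)
    return total * total / (len(rates) * sum_of_squares)


# All users get the same rate: perfect fairness.
print(jain_fairness_index([10.0, 10.0, 10.0, 10.0]))  # 1.0
# A single user gets everything: the index drops to 1/K.
print(jain_fairness_index([40.0, 0.0, 0.0, 0.0]))     # 0.25
```

The index is scale invariant, so it captures how evenly the rates are spread rather than how large they are.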
While the use of a strict max-min power allocation policy would result in a fairness index virtually equal to 1 (see [21]), this index, as shown in Fig. 8, is now somewhat lower due to the use of the fractional power allocation strategy. Note, in any case, that JFI values above 0.9 are considered indicative of very fair operation. Results demonstrate that having more antennas at the APs leads to a larger degree of fairness by capitalizing on the many more degrees of freedom available at the precoder when $N_{\mathrm{AP}}$ = 4. As depicted in the figure, increasing the network load leads to imbalances in the user performance as a result of an increased chance of having ill-conditioned users suffering from lower-than-average SINRs. This situation is significantly improved by the satellite segment in the integrated LEO-CF-M-MIMO setup, as the diverted users are those whose performance is most severely degraded in the terrestrial network. Note also that in the integrated scheme, if required, the weighting factor ρ could be used as a tuning parameter to improve fairness between the two segments. Finally, in order to gain further insight into the overall user rate performance, Fig. 9 plots the cumulative distribution function (CDF) of the per-user data rate for both the stand-alone CF-M-MIMO and the integrated LEO-CF-M-MIMO networks for the two considered coverage areas (L = 1000 and L = 2000 meters). From the CDFs it is clear that the integrated system brings some improvement regardless of the examined user rate; however, the gain is more prominent when examining the lowest rates, which correspond to the worst users in the network. Indeed, these plots reinforce the fact that the users with the worst rates are the ones that benefit most from having the potential to connect to the satellite segment.
In particular, if we focus on the 10% of worst user rates, as pinpointed in the graph by means of the dashed horizontal line, it is clear that the satellite connection, for the case of a service area of L = 2000 meters, helps to increase the throughput experienced by the worst 10% of users from 13 Mb/s to 20 Mb/s (53% improvement). In the case of L = 1000 meters, the gain is more moderate but still manages to raise the 10%-user rate from 38 Mb/s to 43 Mb/s (13% improvement). Gains are even more significant when considering APs with only one antenna.

V. CONCLUSION
This paper has presented an integrated space-terrestrial framework combining the benefits offered by an ultradense terrestrial deployment (CF-M-MIMO) with the large coverage of a LEO satellite segment. Unlike prior work, the terrestrial segment is designed following scalability principles in the precoding, power allocation and fronthaul requirement designs. A key ingredient of this integration is the proposal of an algorithm that governs the diversion of users from the terrestrial to the satellite segment. This algorithm can potentially be tuned to favour a variety of metrics but in this paper, and in agreement with the original cell-free philosophy, the improvement of the worst user rate is targeted. Numerical results have identified conditions under which the satellite segment can provide very substantial gains and situations where the gains are just marginal. In particular, it has been observed that when the area to be served is poorly densified in the terrestrial segment, and especially when the APs are single-antenna, the possibility of diverting users to the satellite segment provides large performance improvements both in average and worst-case user rates. The proposed integrated scheme has been shown to be robust to the elevation angle between the terrestrial network and the satellite. This opens the door to investigating the role the satellite segment can play in coordinating a number of adjacent CF-M-MIMO segments jointly using several satellites/beams. Findings in this paper reveal the significant potential a LEO satellite can have in supporting and expanding the coverage of a CF-M-MIMO terrestrial segment. Future work will address the performance that a multi-satellite configuration may have when providing service to an area where a mixture of single-segment MSs (terrestrial-only and satellite-only) and hybrid MSs are present.
GUILLEM FEMENIAS (Senior Member, IEEE) received the degree in telecommunication engineering and the Ph.D. degree in electrical engineering from the Technical University of Catalonia (UPC), Barcelona, Spain, in 1987 and 1991, respectively. From 1987 to 1994, he worked as a Researcher with the UPC, where he became an Associate Professor in 1992. In 1995, he joined the Department of Mathematics and Informatics, University of the Balearic Islands (UIB), Mallorca, Spain, where he became a Full Professor in 2010. He is currently leading the Mobile Communications Group, UIB, where he has been the Project Manager of projects ARAMIS, DREAMS, DARWIN, MARIMBA, COSMOS, ELISA, and TERESA, all of them funded by the Spanish and Balearic Islands Governments. In the past, he was also involved with several European projects (ATDMA, CODIT, and COST). His current research interests and activities span the fields of digital communications theory and wireless communication systems, with particular emphasis on radio resource management strategies applied to 5G and 6G wireless networks. On these topics, he has published more than 100 journal and conference papers, as well as some book chapters. He has served for various IEEE conferences as a technical program committee member, as the Publications Chair for the IEEE 69th Vehicular Technology Conference.

She is currently the Director of the Centre Tecnològic de Telecomunicacions de Catalunya, Spain. She is the Coordinator of the Networks of Excellence on Satellite Communications, financed by the European Space Agency: SatnexIV-V. She has more than 60 journal articles and 300 conference papers. She is the coauthor of seven books. She has led more than 20 projects and holds eight patents. Her research interests include signal processing for communications, focused on satellite communications. She is a member of the BoG of the IEEE SPS and the Vice-President for conferences (2021-2023).
She is a member of the Real Academy of Science and Arts of Barcelona (RACAB). She was a recipient of the 2018 EURASIP Society Award. In 2020, she was awarded the ICREA Academia Distinction by the Catalan Government. She was the General Chair of the IEEE ICASSP 2020 (the first big virtual conference held by IEEE, with more than 15,000 attendees). She has been an Associate Editor of the IEEE TRANSACTIONS ON SIGNAL PROCESSING, EURASIP SP, and EURASIP ASP. She is a Senior Area Editor of the IEEE OPEN JOURNAL OF SIGNAL PROCESSING.
VOLUME 10, 2022