Super-MAC: Data Duplication and Combining for Reliability Enhancements in Next-Generation Networks

A piece of user equipment (UE) typically has access to multiple radio access technologies (RATs). Moreover, apart from the standard primary cellular network, secondary cellular networks can assist the primary network in downlink UE communications. In this way, the data can reach the UE through multiple entities. This paper exploits this idea by proposing a cross-layer scheme that combines data to improve the block error rate (BLER) and the throughput. For this, we define a new entity, called the super-MAC, just above the Medium Access Control (MAC) layer. More specifically, we propose data duplication (at the transmitter) and combining (at the UE) at the super-MAC, where the super-MAC gets the Radio Link Control protocol data unit (RLC-PDU) and sends multiple copies across various interfaces to different MAC entities. In doing so, the super-MAC attaches a unique sequence number to a group of RLC-PDUs. At the UE, the data from different MAC entities are combined at the super-MAC to clear any block error. The super-MAC operates between the Cyclic Redundancy Check and Forward Error Correction stages of the HARQ process. The additional complexity introduced by the scheme is negligible compared with the existing operations. Moreover, the average latency improves because the combining scheme offers a significantly lower BLER than the conventional standalone system. Also, since the errors reduce significantly, the throughput improves significantly. Finally, the proposed scheme is an advancement in HARQ that reduces retransmissions, and hence the super-MAC is suitable for adoption in next-generation networks such as B5G or 6G.


I. INTRODUCTION
With an increase in the number of users, there has been an exponential increase in demand for data, which has put tremendous pressure on current cellular networks such as 4G-LTE and non-standalone 5G-NR. In the future, 5G-NR based networks will also face tremendous pressure to meet user requirements, primarily because of the scarcity of vacant sub-6 GHz bandwidth, where the transceiver design is relatively more manageable than at the high mmWave frequencies.
The B5G networks need to enhance reliability to reduce the redundant reuse due to retransmissions.
The associate editor coordinating the review of this manuscript and approving it for publication was Matti Hämäläinen.
The 3GPP has sought to utilize the unlicensed band in the sub-6 GHz range, where the widely used IEEE 802.11xx standards operate. Hence, the 3GPP has primarily shown interest in operating at these frequencies in harmony with IEEE 802.11xx or WiFi technology, through three primary approaches. The first is LTE-unlicensed (LTE-U) or licensed assisted access (LAA), where LTE operates in the unlicensed bands [1] either by sensing the WiFi signals before transmitting (listen-before-talk) [2]-[4], with orthogonal coexistence [5] or with fair coexistence [6]-[8]. Deep-learning-based resource management for fair coexistence of LTE and WiFi is proposed in [9]. However, many challenges remain to be addressed [10], [11], including interference management and power control.
Such approaches are also suggested for 5G-NR (NR-U) [12]-[14]. The second approach is LTE-WiFi Aggregation (LWA), where the packet data convergence protocol (PDCP) layer splits the data at the eNodeB, sends part of the data from the eNodeB to the UE directly, and offloads the other part to the WiFi access point (AP) through the Xw interface and finally to the UE [4], [15]-[18]. The data can be split [19], [20] or switched between LTE and WiFi depending upon which path performs better [21]. Commercial deployment scenarios are also considered for LWA [22]. The aggregation of NR with WiFi is also an active area of research [23], [24]. Finally, the third approach is LTE-WiFi integration through secured IP tunnelling, where the IP layer sends the data across either LTE or WiFi [15], [25]-[27]. Note that the central idea of these approaches is to enhance the achievable throughput by offloading some of the traffic from LTE to WiFi. In particular, the data sent across LTE and WiFi are distinct. LWA works because the UE can arrange the PDCP protocol data units (PDUs) that have traversed two different paths in the correct order using the appropriate sequence number (SN).
The 3GPP has also prescribed dual connectivity to provide non-standalone access for 5G-NR, where the 5G cells are connected to 4G core network [28], [29].
Only recently, in release 15 [30], has attention also been given to PDCP layer data duplication, where the same PDCP PDU is duplicated and sent through different RLC entities with the aim of achieving reliability. The different RLC PDUs can use the same MAC entity through carrier aggregation or pass through completely different MACs [31]. Using multi-RAT connectivity, the PDCP PDU can be duplicated and passed through different RATs through the Xw interface. The primary aim of duplication is achieving reliability [32]-[35] for URLLC applications. It is also proposed for video applications through a network coding framework at the PDCP layer [36].
PDCP duplication has its own set of challenges [37]. So far, the works that have addressed PDCP duplication accept data from only one path and discard the other, which is feasible as long as one of the paths achieves reliable communication. However, when all the paths (two in the simplest case) fail, the entire block must be retransmitted. Given that the round trip time is significantly larger than the other fixed processing delays, the latency can increase [33]. Also, since the PDCP sits at the top of the L2 layer, the cross-layer scheme has relatively higher computational complexity. Hence, schemes that perform operations similar to LWA offloading, but at the RLC layer, have also been reported [38].
Note that the HARQ process sits below the RLC layer. Thus any modifications to it must be made below this layer. There are two steps in HARQ. First, the physical layer performs a cyclic redundancy check (CRC) to detect errors. Second, a forward error correction (FEC) procedure corrects up to some fixed number of errors. It is well known that FEC is an expensive procedure, both computationally and in terms of bandwidth. However, without FEC, the round trip times that result from retransmissions are quite severe and increase the latency beyond current specifications. Hence, a scheme that relies less on FEC for error correction and is computationally cheaper is needed.
In this paper, we propose data duplication and combining schemes at a newly defined entity, the super-MAC. The super-MAC sits between the RLC and the MAC layer of the primary network, and can hence be part of the HARQ process. The super-MAC supplies the same version of the data and the essential control information to the secondary MAC entities. At the receiver, the super-MAC receives data from all the MAC entities. If any MAC entity reports a successful reception of data, the super-MAC discards the data from the other entities. However, if all the MAC entities declare a failed CRC check, the super-MAC combines the data and provides the primary MAC with the combined version before proceeding to FEC at the primary MAC entity. This combined version is likely to be less corrupted than any of the individual corrupted versions. The redundancy used in the coding scheme to correct the combined packet can be significantly less than that in the standalone scenario. Thus the overhead associated with FEC can be significantly reduced. The usage of the super-MAC is shown to improve the BLER and enhance the throughput. These schemes offer cell-edge users better reliability and other users power savings for a given target BLER. Also, in many practical regimes of interest, our scheme offers better latency performance.
Against this background, the following are our contributions in this paper:
• We first present the super-MAC based combining scheme and derive the optimal rule, which minimizes the bit errors, for combining soft-bits or log-likelihood ratios (LLRs).
• When soft-bits are not available, we suggest a hard-combining scheme with the same performance as the selection combining scheme. We analytically derive the block error rate and highlight the improvement achieved by combining data for hard-combining.
• We take an example of LTE assisted by WiFi. Using simulations, we demonstrate the performance gains achieved in terms of BLER and throughput for both the hard-combining and soft-combining schemes. Also, we study the impact on latency due to combining data. The average round-trip-times decrease significantly because of substantial improvement in the BLER, while the increase in fixed one-way latency due to super-MAC is negligible, resulting in improved average latency.
• We also take examples of NR-LTE, NR-WiFi and LTE-WiFi and compare the performance of our combining scheme with that of the PDCP-based duplication scheme. We show that combining at the super-MAC is superior to the PDCP-based approach in terms of BLER and throughput.
• We discuss how our scheme plays an important role in reducing retransmissions and sits between the CRC and the FEC stages of the HARQ process. This new addition thus reduces the complexity associated with FEC while also ensuring good reliability performance.
The rest of the paper is organized as follows. We first discuss the different LTE (NR) and WiFi coexistence mechanisms in Section II. We then introduce a generic theoretical model of the system in Section III. We follow with the description of the super-MAC combining scheme in Section IV. We derive analytical expressions for the BLER in Section V and analyse the latency in Section VI. Next, we describe the two combining scenarios in Section VII-A. We present the simulation results and their interpretation in Section IX. Finally, we conclude in Section X.

II. LTE-U, LAA, MULTEFIRE, LWA, LWIP
This section summarises the various approaches adopted to meet the exponential growth in cellular traffic in 3GPP networks. Although these approaches speak of LTE and WiFi coexistence mechanisms, they are essentially coexistence of 3GPP networks with WiFi. Hence, instead of LTE, these may very well be 5G-NR or even B5G networks.

A. LTE-U AND LAA
In LTE-U, LTE operates in the unlicensed band. Originally, Qualcomm designed LTE-U to work in regions which do not mandate the ''listen before talk'' (LBT) protocol. LBT [39], [40] refers to a procedure that checks for ongoing transmissions in the unlicensed band, and essentially avoids interference and improves spectral efficiency. Hence, regulations in Europe and Japan mandate the use of LBT in unlicensed bands.
To incorporate this and provide a globally deployable solution, the 3GPP in its release 13 [1], [41] proposed LTE in the unlicensed band equipped with LBT and termed it Licensed Assisted Access (LAA). The unlicensed transmission for the licensed user will be governed by the 3GPP standard [42], which is being studied and continually updated. Note that, similar to LTE-U, NR-U is also under active consideration for operation of NR in the unlicensed band [43].
The new standard IEEE 802.11ax allows multi-user transmission by partitioning the frequency band into a substantially large number of resource units. Thus the 3GPP networks have to perform LBT at all frequencies in the band and not merely calculate the whole band's energy. Hence, careful mechanisms that are adaptive [44], robust [45], and fair [6] have to be designed for LBT, considering the incumbent traffic in these unlicensed bands.
LAA has shown promising results in significantly improving throughput. However, when LAA employs multiple secondary base stations (LTE-eNBs) as in dense deployments, LBT tends to hinder WiFi operations and hence it is necessary to check on the density of deployment, and the LTE frame structure [39]. It has also led to suggestions of a more dramatic approach of MulteFire that allows standalone LTE operations in the unlicensed band.

B. MulteFire
MulteFire and its enhancement, eMTC-U, suggested by the MulteFire forum, operate in a standalone fashion in an unlicensed or shared medium [46]. MulteFire can be useful for an easy-to-deploy scenario that uses the WiFi APs to route cellular traffic [47]. MulteFire dramatically increases the throughput; however, because of the scarcity of available bandwidth, it saturates relatively fast [48]. Nevertheless, it can improve system capacity and coverage in some specific environments and small cell deployments [49]. Moreover, it is a possible candidate for deploying the massively connected Narrow-Band IoT [50] as well as massive Machine Type Communications (mMTC) [51].
Also, in the newer releases that focus on 5G-NR, NR in the unlicensed band (NR-U) in release 16 has provisions for standalone LTE operations in the unlicensed band, as envisioned by eMTC-U.

C. LWA
LWA is perhaps the most widely studied approach towards the coexistence of LTE and WiFi to enhance the cellular user experience. In fact, substantial 3GPP standardization activities have taken place in releases 13 [41] and 14 [4]. A new Xw logical interface between the eNB and the WLAN termination (WT) [52] is suggested. A new data transport procedure over the Xw interface [53] is proposed. The PDCP layer function is defined for an LWA bearer [54], and finally, a new sublayer, the LWA adaptation protocol (LWAAP) layer that sits between the PDCP and the RLC [55], is specified. The 3GPP is continually updating the standards in later releases.
In LWA, the PDCP PDU is offloaded to the WiFi LLC. The LTE-eNB and WiFi-AP can either be collocated or non-collocated. When collocated, both the LTE-eNB and the WiFi-AP are in the same device. By contrast, when non-collocated, the eNB and AP are connected through the Xw logical interface [56].

D. LWIP
Unlike LWA, LWIP operates above the PDCP layer. LWIP works by encapsulating the IP data in an IPsec tunnel, which is presented directly to the WiFi STA using the legacy WiFi architecture. A separate specification by the 3GPP describes the data encapsulation in LWIP [57]. LWIP works on the user plane in both the uplink and downlink directions.
We summarize the LTE-WiFi co-existence schemes in Table 1.

III. SYSTEM MODEL
We keep our system model generic. We assume N different heterogeneous paths through which a UE is connected to the NG-eNB [58]. These heterogeneous networks can either be different eNodeBs of the same radio access technology (RAT) or different RATs.

A. OVERVIEW
Consider a piece of user equipment (UE) that connects to N heterogeneous networks (such as LTE, WiFi and 5G-NR). There is one network, say i = 1, that is the primary network of the UE. The other networks, i = 2, . . . , N, are called the secondary networks. Each of the N networks receives the signals (data) in parallel. The received signals can either be hard-decoded at the physical layer to obtain binary bits or soft-decoded to obtain LLRs. The receiver is equipped with sufficient buffer storage so that the data arriving through networks with lower delay can wait for data from the other networks. Typically, the primary connection is assumed to be slower than the secondary networks. We provide an overview of the scheme in Fig. 1. Next, we explain the transmitter and the receiver parts.

B. TRANSMITTER
At the NG-eNB, the arriving IP data undergoes PDCP and RLC operations. The MAC layer of the primary network defines the size of the Transport block. Depending upon this size, the super-MAC groups the RLC PDUs into blocks and assigns a unique sequence number to each block. For the secondary networks, the super-MAC supplies essential information like the addresses of the source and destination, among other things. Note that this sequence number is different from the sequence number assigned at the PDCP and the RLC layers. The sequence numbers assigned here are essentially for synchronized combining of data from multiple connections and not for in-order packet combining.
The different MAC entities now transmit the same data according to the procedures defined in their standards. We now proceed to define a generic system model.
Let the $k$-length bit sequence of the RLC PDUs that belong to a single MAC payload at the primary network be denoted as $B = \{b_1, b_2, \ldots, b_k\}$. Multiple copies with the same SN are transmitted through $N$ different MAC entities. The data at the physical layer of MAC entity $i$ is encoded in an $n_i$-length bit sequence $C_{i,B} = \{c_{i,1}, c_{i,2}, \ldots, c_{i,n_i}\}$. This data is modulated using an appropriate scheme and transmitted as a signal whose baseband representation is $X_i$, $i \in \{1, 2, \ldots, N\}$. Note that there is an average transmit power constraint at the transmitter, $E[|X_i|^2] \le P_i$.
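The transmitter-side grouping and duplication described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the names `SuperMacBlock`, `group_rlc_pdus` and `duplicate`, the greedy packing policy and the byte-level sizes are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SuperMacBlock:
    sn: int         # super-MAC sequence number (distinct from PDCP/RLC SNs)
    payload: bytes  # concatenated RLC PDUs forming one block


def group_rlc_pdus(pdus: List[bytes], tb_size: int, first_sn: int = 0) -> List[SuperMacBlock]:
    """Greedily pack RLC PDUs into blocks of at most tb_size bytes (assumed policy)."""
    blocks: List[SuperMacBlock] = []
    current = b""
    sn = first_sn
    for pdu in pdus:
        if len(pdu) > tb_size:
            raise ValueError("RLC PDU larger than the transport block")
        # Close the current block when the next PDU would overflow it.
        if current and len(current) + len(pdu) > tb_size:
            blocks.append(SuperMacBlock(sn, current))
            sn += 1
            current = b""
        current += pdu
    if current:
        blocks.append(SuperMacBlock(sn, current))
    return blocks


def duplicate(block: SuperMacBlock, n_paths: int) -> List[SuperMacBlock]:
    """Hand the same block, with the same SN, to each of the N MAC entities."""
    return [block for _ in range(n_paths)]
```

For instance, three 40-byte PDUs with a 100-byte transport block give two blocks (80 and 40 bytes) with consecutive sequence numbers, and `duplicate` returns N references to the same block so every path carries an identical SN.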

C. RECEIVER
The received signal $Y_i$ is as follows,
$$Y_i = G_i X_i + Z_i,$$
where $G_i$ denotes the random multiplicative channel fading affecting the transmission through network $i$ and $Z_i$ is the associated additive white Gaussian noise. We do not specify the distribution of $G_i$, whereas $Z_i$ is zero-mean, unit-variance Gaussian noise. Thus the signal-to-noise ratio at the receiver conditioned on $G_i$ is $|G_i|^2 E[|X_i|^2] \le |G_i|^2 P_i$. The receiver then demodulates and decodes the LLRs for each bit $j \in \{1, \ldots, k\}$ from each network $i$. At this point the receiver has two options: either hard-decode the LLRs into bits or keep the LLRs as they are. We adopt both approaches and compare them. Let the bits decoded from network $i$ be denoted by $\tilde{B}_i = \{\tilde{b}_{i,1}, \ldots, \tilde{b}_{i,k}\}$. Along with decoding the bits, the receiver does one of the following two, as shown in Fig. 1: it either stores the LLRs of the RLC PDUs, or discards the LLRs and keeps only the channel gain values corresponding to each bit.
The receiver checks all the N paths for an error-free block. If any path declares the reception of an error-free block, it reports the sequence number of this block to the primary MAC (PMAC), and the PMAC sends the error-free data to the RLC layer. By contrast, if all blocks are in error, the super-MAC initiates combining. The super-MAC accumulates and synchronises the data using the SN, and combines them according to sequence numbers. This combining can be soft-combining, where the receiver stores and combines the LLRs. Alternatively, the receiver may employ hard-combining, where it uses the channel state information along with the hard-decoded bits to combine data. We present details on the two types of combining in the sections that follow. The receiver then checks the combined data for errors. If data combining corrects the errors, the primary network of the UE sends an ACK to the NG-eNB, and the NG-eNB initiates new data transmission. On the other hand, if there is an error despite combining, the primary network of the UE sends a NACK and requests retransmission of the same erroneous block of data.
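The receiver-side decision flow just described (forward any error-free copy, otherwise combine and re-check, then ACK or NACK) can be sketched as below. This is only an illustration under stated assumptions: `super_mac_receive` is a hypothetical name, CRC-32 from `zlib` stands in for the physical-layer CRC, and the bitwise majority vote is a toy stand-in for the soft- and hard-combining rules of the paper.

```python
import zlib
from typing import Callable, Optional, Sequence, Tuple


def attach_crc(payload: bytes) -> bytes:
    """Append a CRC-32 trailer (stand-in for the PHY CRC)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def crc_ok(frame: bytes) -> bool:
    """Check the CRC-32 trailer of a received frame."""
    return len(frame) >= 4 and zlib.crc32(frame[:-4]) == int.from_bytes(frame[-4:], "big")


def majority_combine(frames: Sequence[bytes]) -> bytes:
    """Toy combiner: bitwise majority vote over equal-length frames."""
    n = len(frames)
    out = bytearray(len(frames[0]))
    for pos in range(len(out)):
        for bit in range(8):
            ones = sum((f[pos] >> bit) & 1 for f in frames)
            if 2 * ones > n:
                out[pos] |= 1 << bit
    return bytes(out)


def super_mac_receive(
    frames: Sequence[bytes],
    check: Callable[[bytes], bool],
    combine: Callable[[Sequence[bytes]], bytes],
) -> Tuple[str, Optional[bytes]]:
    """Return ("ACK", data) or ("NACK", None) for copies sharing one SN."""
    # Step 1: if any path delivered an error-free block, use it directly.
    for frame in frames:
        if check(frame):
            return "ACK", frame
    # Step 2: all paths failed, so combine and re-check.
    combined = combine(frames)
    if check(combined):
        return "ACK", combined
    # Step 3: combining did not help; request retransmission.
    return "NACK", None
```

With three copies that are each corrupted in a different bit, every individual CRC fails but the combined frame passes, so the sketch returns an ACK without a retransmission.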
Remark 1: Although we discuss downlink operations, our cross-layer technique requires data to flow back from super-MAC to the physical layer at the UE, increasing the fixed receiver delay in decoding. However, we show that the technique's reliability ensures lower retransmissions, resulting in low average decoding delay. We discuss the receiver operations in the event of super-MAC combining.

1) PHYSICAL LAYER OPERATIONS AT THE RECEIVER
The receiver has to perform de-interleaving, de-scrambling and cyclic redundancy check (CRC) at the physical layer. In hard-combining, the receiver performs these operations in the standard fashion. However, in soft-combining (where we keep the LLRs), only de-interleaving is trivial as it requires pre-multiplying LLRs by a permutation matrix. De-scrambling involves modulo two addition of bits, and in our case de-scrambling is performed as explained next.

2) DE-SCRAMBLING OF LLRS AT THE PHYSICAL LAYER
Let us denote the LLR of the $j$-th bit from network $i$ by $\mathrm{LLR}(c_{i,j})$, $j \in \{1, \ldots, n_i\}$. The LLRs of the descrambled bits are obtained from $\mathrm{LLR}(c_{i,j})$ and the scrambling sequence [equation missing]. For an explanation, the reader can refer to Appendix X. Next, we discuss the MAC sub-layer.
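One standard way to de-scramble in the LLR domain is sketched below; it is offered as an assumption about what the omitted expression does, not as the paper's exact formula. Since the de-scrambled bit is the received bit XORed with the scrambling bit, a scrambling bit of 1 swaps the roles of 0 and 1 and therefore negates the LLR, while a scrambling bit of 0 leaves it unchanged. The function name `descramble_llrs` is hypothetical.

```python
from typing import List, Sequence


def descramble_llrs(llrs: Sequence[float], scrambling_bits: Sequence[int]) -> List[float]:
    """Flip the sign of each LLR whose scrambling bit is 1.

    Equivalent to multiplying LLR j by (1 - 2 * s_j), where s_j is the
    j-th bit of the scrambling sequence.
    """
    if len(llrs) != len(scrambling_bits):
        raise ValueError("LLR and scrambling sequences must have equal length")
    return [-llr if s else llr for llr, s in zip(llrs, scrambling_bits)]
```

With the sign convention used in this paper (positive for bit 1, negative for bit 0), taking the sign of the output recovers the original, unscrambled bits whenever the input LLRs have the correct signs.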

3) AT THE MAC AND RLC SUB-LAYER
The MAC sub-layer converts the LLRs of the MAC headers into hard-bits and leaves the MAC payload as LLRs. All the MAC entities report their erroneous payload receptions to the super-MAC. The super-MAC aligns all the erroneous receptions according to the sequence number. Note that since in LTE there are 8 HARQ processes, the super-MAC will also maintain a maximum of 8 MAC payloads. Next, we describe the combining schemes employed by the super-MAC. However, before that, we provide a little background on soft-combining, which will allow us to build our combining scheme.

4) SOFT-COMBINING
Soft-combining (SC) involves combining soft-values (SVs) instead of bits. The receiver assigns higher weights to those bits received with higher SNR. Consequently, SVs can be in the form of the log-likelihood ratios [59] or can be receiver-computed confidence values [60]. Chase combining (CC) and incremental redundancy combining (IRC) use SC in hybrid ARQ procedures [61], [62], where a receiver stores a corrupted (not decodable) packet and soft-combines it with a retransmitted packet when it arrives. The retransmitted packet may either be an identical copy of the first transmission, as in CC, or different, as in IRC.
The soft-combining approach of combining bits is akin to maximal ratio combining [63], where instead of bits, different copies of the same (weak) received signal are combined to form a strong signal with optimum SNR. In soft-combining, since we combine the SVs of each bit, the idea is to obtain a combined SV that has the correct sign, i.e. positive for bit 1 and negative for bit 0. Next, we discuss the proposed combining schemes.

IV. COMBINING DATA AT THE SUPER-MAC
At the super-MAC, LLRs corresponding to the RLC PDUs are combined. Since, at the transmitter these bits were [text missing]. Notice that $X_{i,j}$ is now $X_j$, that is, each path transmits the same value: $X_j = 1$ for $b_j = 1$ and $X_j = -1$ for $b_j = 0$. Note that at the receiver $Y_j$ is the received signal and $H_j$ is estimated. We now have the following result.
Lemma 1: The optimal soft-combining rule for $b_j$ is calculated as [equation missing].
Proof: In Appendix E.
In the case of hard-combining, the following test statistic can be employed: [equation missing]. Here, $\tilde{b}_{i,j}$ is the hard-decoded bit estimate on the $i$-th path. Also, since we have lost the LLRs in hard decoding and wish to incorporate their role in the combined bit decision, we estimate the LLR as $\widehat{\mathrm{LLR}}(b_j)$ for the $j$-th bit. The decision rule will now be similar to the above rule for soft combining and is arrived at as follows.
The estimate of the baseband BPSK signal is [equation missing]. Thus the decoding rule here becomes [equation missing]. We have the following result.
Lemma 2: The hard-combining scheme of (5) has the same probability of error as that of selection combining for N = 2.
Proof: In Appendix F.
This implies that when hard-decoding is followed by data combining from two paths, selection combining is optimal. Next, we derive the improvement in the block error rate as a result of combining.
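The two combining styles discussed in this section can be illustrated with a short sketch. The forms used here are assumptions rather than the paper's expressions: soft-combining is shown as a per-bit sum of LLRs across paths (consistent with the cost of N real additions per bit quoted in the latency analysis), and hard-combining is shown as a vote in which each hard decision, mapped to +1 or -1, is weighted by that bit's channel gain. The function names are hypothetical.

```python
from typing import List, Sequence


def soft_combine(llrs_per_path: Sequence[Sequence[float]]) -> List[int]:
    """Sum the LLRs of each bit over all paths and take the sign (>= 0 means bit 1)."""
    k = len(llrs_per_path[0])
    return [1 if sum(path[j] for path in llrs_per_path) >= 0 else 0 for j in range(k)]


def hard_combine(bits_per_path: Sequence[Sequence[int]],
                 gains_per_path: Sequence[Sequence[float]]) -> List[int]:
    """Gain-weighted vote over hard decisions (assumed weighting, not the paper's eq. (5))."""
    k = len(bits_per_path[0])
    out = []
    for j in range(k):
        stat = sum(g[j] * (2 * b[j] - 1) for b, g in zip(bits_per_path, gains_per_path))
        out.append(1 if stat >= 0 else 0)
    return out


def selection_combine(bits_per_path: Sequence[Sequence[int]],
                      gains_per_path: Sequence[Sequence[float]]) -> List[int]:
    """For each bit, keep the decision from the path with the largest gain."""
    k = len(bits_per_path[0])
    out = []
    for j in range(k):
        best = max(range(len(bits_per_path)), key=lambda i: gains_per_path[i][j])
        out.append(bits_per_path[best][j])
    return out
```

For two paths with distinct gains, the weighted vote follows the stronger path whenever the two decisions disagree, so in this sketch `hard_combine` and `selection_combine` return the same bits, in line with the statement of Lemma 2.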

V. BLER CALCULATIONS
In this section, we compute the improved probabilities of block error. Let $M_i$ be the total number of bits present in the PHY and MAC headers of network $i$, and let $N_M$ be the number of RLC-PDU bits that each MAC layer presents to the super-MAC for combining. Note that $N_M$ is constant across different MACs.
Each of the $N$ physical layer entities, PHY$_i$, will do a CRC check and then initiate combining at the super-MAC. After super-MAC combining, the PMAC will do FEC for error correction. If FEC is successful, the super-MAC passes the data to the higher layers; otherwise it requests retransmission. We analyse the BLER improvement due to super-MAC combining next.
We consider a block of data at MAC$_i$. Let $K_i$ denote the random variable (r.v.) of the number of bit errors in a block at MAC$_i$, $i \in \{1, \ldots, N\}$. Also let $t_i$ be the error correcting capability of MAC$_i$. Let the probability of bit error be denoted by $p_i$. Also, let $E_i = \{K_i \ge t_i\}$ denote the event that PHY$_i$ has a block error after HARQ (without combining). Then the probability of $E_i$, denoted by $p_{b,i}$, is given by [equation missing], where $p_i$ is the average probability of bit error.
Now let $q_i$ be the probability that an incorrect bit in MAC$_i$ was corrected post combining, and let $r_i$ be the probability that a correct bit in MAC$_i$ was erred post combining. For soft combining, [equation missing]. We can compute this probability for a given fading distribution, but this is a complicated computation. For hard-combining, we derive the expression for this probability in Lemmas 3 and 4 for the cases where the fading is identically Rayleigh distributed and where the fading is Rayleigh but with distinct mean values, respectively.
Lemma 3: [statement missing]
Proof: See Appendix C.
Lemma 4: [statement missing]
Proof: See Appendix D.
We then have the next result, which characterizes the probability of the event $\bar{E}^c_i$, the event that the super-MAC corrected a block in error.
The probability of $\bar{E}^c_i$ is then [equation missing], where $l_i = \min(t_i, N_M - k_i)$ and $m_i = k_i - t_i + e_i$.
Proof: In Appendix G.
Notice that the minimum number of errors that need to be corrected by the super-MAC to declare an error-free block at MAC$_i$ is $C_i = [K_i + E_i - t_i]^+$. However, this number can be at most $K_i$ and thus, if $E_i \ge t_i$, the block cannot be corrected. On the other hand, if $C_i - E_i > K_i - t_i$, the block is corrected. Thus finally the BLER is given by
$$\tilde{p}_b = \prod_{i=1}^{N} p_{b,i} \prod_{i=1}^{N} (1 - q_{b,i}).$$
That is, if we employ a simple selection rule that approves the data from any path that has decoded successfully, the BLER would be $\prod_{i=1}^{N} p_{b,i}$. Due to the super-MAC, the BLER is improved by a factor $\prod_{i=1}^{N} (1 - q_{b,i})$. In the next section, we provide the latency analysis and show that the system's average latency also improves due to the improved BLER.
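The quantities in this section can be evaluated numerically with a small sketch. Two assumptions are made and should be read as such: bit errors within a block are taken to be independent, so that $K_i$ is binomial, and the block length `n_bits` is supplied by the caller because its exact definition is not recoverable from the text. The event definition $\{K_i \ge t_i\}$ follows the text. The function names are hypothetical.

```python
from math import comb, prod
from typing import Sequence


def block_error_prob(n_bits: int, p_bit: float, t: int) -> float:
    """P(K >= t) for K ~ Binomial(n_bits, p_bit), i.e. the event {K_i >= t_i}."""
    return sum(
        comb(n_bits, k) * p_bit**k * (1.0 - p_bit) ** (n_bits - k)
        for k in range(t, n_bits + 1)
    )


def selection_bler(p_blocks: Sequence[float]) -> float:
    """BLER when a block is lost only if every path fails: product of p_{b,i}."""
    return prod(p_blocks)


def combined_bler(p_blocks: Sequence[float], q_blocks: Sequence[float]) -> float:
    """Selection BLER scaled by the factor prod(1 - q_{b,i}) due to combining."""
    return prod(p_blocks) * prod(1.0 - q for q in q_blocks)
```

For example, two paths with standalone BLERs of 0.1 and 0.2 give a selection BLER of 0.02; if combining corrects a failed block with probability 0.5 on each path, the combined figure drops to 0.005.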

VI. LATENCY ANALYSIS
The 3GPP [64] defines latency as the time taken for a PDCP SDU to reach from the transmitter to the receiver.
Note that the total user-plane latency comprises two components [65]: (i) the one-way latency due to the transmitter and receiver processing delays and the transmission time interval, and (ii) the average round trip time, which depends upon the block error rate that dictates the random number of round trips.
Since the data is combined at the super-MAC and then sent down to the lower layers, the fixed one-way latency increases. However, the average delay in the system is expected to decrease because combining the data reduces the block error rate and, with it, the average number of round trips.
Let $T_t$ and $T_r$ denote the fixed transmitter and receiver processing delays in the standalone system. Also, let $T_{TTI}$ and $T_{RTT}$ denote the transmission time interval and round trip time, respectively. Then the average latency in the standalone scenario for FDD based communications is given by [equation missing], where we have assumed, without loss of generality, that $i = 1$ represents the primary network. Although the latencies of the secondary networks are lower than that of the primary network, the PMAC will dictate the latency value because the acknowledgements are sent through the primary network.
There is an additional delay component due to the super-MAC and the re-performing of the CRC checks for the combined system. We denote this collectively by $T_s$. The latency for the combined system will then be [equation missing]. We scale the delay component at the super-MAC by $\prod_{i=1}^{N} p_{b,i}$ because the combining scheme is employed only when there is a block error in all MAC entities. Also, per block, the combining takes only $kN$ real additions in the case of soft combining, and $2kN$ real additions and $kN$ real multiplications in hard combining. If $N$ is high, then $\prod_{i=1}^{N} p_{b,i} \to 0$, while if $N = 2$, the number of real additions and multiplications is small. Thus, the computational complexity of these operations is negligible. Secondly, the CRC check forms a small component of $T_r$, which involves decoding, demodulating, deinterleaving and descrambling along with other processes up to the super-MAC. Coupled with the fact that the delay occurs with a small probability, its overall contribution can be safely neglected. Hence the average latency with combining can be approximated as [equation missing].
However, we also look at an upper bound on the latency, where we assume that the receiver processing delay is at most twice that of the standalone system when a block is in error before combining. We make the mild assumption that the absence of signal detection in the second traversal offsets the delay in data combining. The combining of LLRs incurs $N$ real additions per bit. Assuming that the block of RLC PDUs is $k$ bits, we have $kN$ such additions. The processing delays of modern DSPs are minimal. Even if we consider the 5G NR fixed receiver processing delay of a fraction of a millisecond, the processing delay $T_p$ for $N$ additions is significantly smaller than $T_r$, that is, $T_p \ll T_r$. Moreover, in the combining scheme, the fixed receiver operations such as detecting, deinterleaving and descrambling are not performed. Hence, we can safely assume that the fixed receiver processing delay in combining is less than $2T_r$. Nevertheless, we assume the upper bound and show that combining still achieves lower latency in many cases of interest.
Hence, the latency in the combining scheme is [equation missing]. For $L_c < L_s$, a sufficient condition is [equation missing]. Since [equation (15) missing], a sufficient condition for $L_c < L_s$ is [equation missing]. Typically, $T_{RTT} \approx 5T_r$ [66] (or $8T_{TTI}$ due to 8 simultaneous HARQ processes), which implies that the average latency improves through data combining if $\prod_{i=2}^{N} p_{b,i} < \frac{5}{6}$, which is certainly true in most cases of practical interest. In the sections that follow, we will see how the combining scheme helps in reducing latency.
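The 5/6 threshold can be reproduced with a simple first-order latency model, sketched below. The model is an assumption chosen because it is consistent with the stated condition, not a transcription of the paper's equations: a block in error costs one extra $T_{RTT}$, the post-combining BLER is bounded by the probability that all paths fail, and the receiver delay doubles to $2T_r$ only when all paths fail. The function names are hypothetical.

```python
from math import prod
from typing import Sequence


def latency_standalone(t_t: float, t_r: float, t_tti: float, t_rtt: float, p_b1: float) -> float:
    """Fixed one-way delay plus one round trip weighted by the primary BLER."""
    return t_t + t_r + t_tti + p_b1 * t_rtt


def latency_combined_upper(t_t: float, t_r: float, t_tti: float, t_rtt: float,
                           p_blocks: Sequence[float]) -> float:
    """Upper bound: an extra T_r and a round trip, both weighted by P(all paths fail)."""
    p_all_fail = prod(p_blocks)
    return t_t + t_r + t_tti + p_all_fail * (t_r + t_rtt)


def secondary_threshold(t_r: float, t_rtt: float) -> float:
    """Combining wins if the product of secondary BLERs is below this value."""
    return t_rtt / (t_r + t_rtt)
```

Under this model, comparing the two expressions and cancelling the primary BLER gives the condition that the product of the secondary BLERs be below $T_{RTT} / (T_r + T_{RTT})$, which equals 5/6 when $T_{RTT} = 5T_r$.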

VII. COMPATIBLE DATA RATES AND POSSIBLE SCENARIOS
The average data rates achievable through the secondary MAC (SMAC) entities must be higher than those achievable through the PMAC. For example, if WiFi has to assist NR through the super-MAC, then WiFi must be capable of bearing the NR MAC payload with an average latency lower than that of NR, and such that the super-MAC achieves the target BLER after combining. In such a case, we say that the SMAC (WiFi in this case) is compatible with the PMAC. In most cases, we can ensure that the SMAC is compatible with the PMAC by tuning various SMAC parameters that affect the data rates. These parameters include the modulation and coding schemes, the number of resource units and the MIMO capability. When multiple SMACs are involved, we require that all the individual SMACs be compatible with the PMAC. When all SMACs are compatible with the PMAC, we say that the different paths have compatible data rates.
Note that applications that perform data offloading do not require the SMAC to be compatible with PMAC in the sense we have defined above.
In the next section, using an example of 4G-LTE as the PMAC, and 5G-NR and IEEE 802.11ac WiFi as SMACs, we show that a wide range of compatible rates exists that allows WiFi to work with LTE and also 5G-NR to work with LTE.

A. POSSIBLE SCENARIOS
Since we have assumed a generic model, N can be arbitrary. However, in practice, multi-connectivity has not yet been realized for N > 2. The 3GPP has defined new interfaces. These interfaces can establish a multi-RAT and multi-connectivity as shown in Fig. 2, which depicts the multi-connectivity of primary NG-eNB, a standard 4G-eNodeB, a 5G-gNB and a WiFi AP. Here NG-eNB is the master which hosts the super-MAC and passes the RLC PDU to the (i) eNodeB through the X2 interface, (ii) 5G-gNB through the Xn interface and (iii) to the WiFi AP through the Xw interface.
The super-MAC achieves synchronization by marking a group of RLC PDUs with an SN. For latency-sensitive applications, it is essential to ensure that all the SMACs have latency lower than that of the NG-eNB direct path. However, for bandwidth-hungry applications that are latency-tolerant, the multi-connected system can enhance throughput while ensuring reliability. We simulate an example at the intersection of dual-RAT and dual-connectivity for reliability enhancement, through super-MAC based combining of 4G-LTE and WiFi (802.11ac), followed by the LTE-NR and NR-WiFi scenarios. This simulation-based study demonstrates the performance improvement achieved by the combining scheme in terms of BLER and throughput. In many regimes of interest, combining also improves the latency along with the BLER and throughput.

VIII. COMBINING SCENARIOS
We consider two scenarios for further study. In the first scenario, we consider LTE-WiFi and analyze the BLER, Throughput and Latency performance for sub-scenarios spanning different modulation and coding schemes. This simulation demonstrates the performance improvement as a result of the two combining schemes. In the second scenario, we consider three sub-scenarios: LTE-NR, NR-WiFi and LTE-WiFi for BLER and throughput. The second scenario compares the performances of both our schemes to that of the PDCP based duplication scheme as proposed in [33].
We first take an example of LTE assisted by WiFi. We depict the interaction of the super-MAC with the MACs of LTE, WiFi and NR in Fig. 3.
From the figure, we see that a group of RLC PDUs forms a block. The block size can be large so as to make a long transport block. However, we assume that at least one block fits in the WiFi MAC frame body. Selecting compatible rates ensures this. We examine the peak rates achievable for LTE, WiFi and NR, which allow us to characterize the rates of WiFi and NR that can be compatible for a given LTE peak rate, and also the rates of WiFi that can be compatible for a given NR peak rate.

A. LTE PEAK DATA RATE
As per 3GPP Release 12 [16], for LTE, each frame is of 10 ms duration and is divided into 10 subframes, each of $T_{s,l} = 1$ ms duration. Each subframe consists of $T_s = 2$ time slots. A UE is assigned at least one physical resource block (PRB). A PRB consists of $N_c = 12$ subcarriers, each of which has a fixed bandwidth of $F_L = 15$ kHz. Also, each PRB consists of $N_s = 7$ OFDM symbols when the normal cyclic prefix is used and $N_s = 6$ when the extended cyclic prefix is used. Each OFDM symbol consists of $b_L$ bits for $M$-QAM, where $b_L = \log_2 M$. Let $B_L$ be the bandwidth available to a UE. Corresponding to this bandwidth, let $N_{prb}$ be the number of PRBs that can be allocated to the UE. Let the code rate be $C_l$. For MIMO systems, let the multiplexing gain be $G_l$. We then have $r_{LTE}$, that is, the peak data rate available to a UE, as follows
As an example rate calculation, suppose a total bandwidth of $B_L = 10$ MHz is available to a UE. This corresponds to $N_{prb} = 50$. Assuming $C_l = 3/4$ transmission, 4×4 MIMO with normal cyclic prefix and 16-QAM, we have the rate
The reader can refer

B. WIFI PEAK DATA RATE
As per [67], in 802.11ac, for bandwidth $B_w$ there is a fixed number of subcarriers $N_w$, each having a subcarrier spacing of $F_w = 312.5$ kHz. Thus the symbol duration is 3.2 µs. A short guard interval lasts 400 ns while a long one lasts 800 ns. The total symbol duration is therefore $T_{s,w} = 3.6$ µs for the short guard interval and 4 µs for the long guard interval. For $M$-QAM modulation, with coding rate $C_w$ and multiplexing gain $G_w$, we have the peak data rate for bandwidth $B_w$
As an example, for $B_w = 20$ MHz, $N_w = 56$. If 16-QAM is used with a short guard interval, $C_w = 3/4$ and $G_w = 4$, we have
Thus there are scenarios where the achievable rates in WiFi are higher than those of LTE. Similarly, there are many more achievable data rates for WiFi compatible with the LTE rate of 96.13 Mbps. However, as the LTE rate increases, the set of compatible WiFi rates becomes smaller.
An important point to note is that we are interested in scenarios where LTE performance is poor and LTE will typically have low data rates. Using different modulation and coding schemes and choosing an appropriate MIMO size, one can achieve compatible rates.
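The compatibility check between an LTE and a WiFi configuration can be sketched numerically. The display expressions for the peak rates are not reproduced in this copy, so the sketch below assumes, hypothetically, that each peak rate is the plain product of the quantities defined above divided by the relevant duration. The function names are illustrative and not taken from the paper. No control or reference-signal overhead is modelled, so the LTE figure comes out at 100.8 Mbps, slightly above the 96.13 Mbps quoted in the text.

```python
from math import log2


def lte_peak_rate(n_prb, n_c=12, n_s=7, slots=2, m_qam=16,
                  code_rate=0.75, mimo_gain=4, t_subframe=1e-3):
    """Overhead-free LTE peak rate in bit/s (assumed product form)."""
    bits_per_subframe = (n_prb * n_c * n_s * slots
                         * log2(m_qam) * code_rate * mimo_gain)
    return bits_per_subframe / t_subframe


def wifi_peak_rate(n_w, m_qam=16, code_rate=0.75, mimo_gain=4,
                   t_symbol=3.6e-6):
    """Overhead-free 802.11ac peak rate in bit/s (assumed product form).

    All n_w subcarriers are treated as data-bearing, as in the text.
    """
    return n_w * log2(m_qam) * code_rate * mimo_gain / t_symbol


def is_compatible(smac_rate, pmac_rate):
    """Rate part of the compatibility condition: the SMAC must be faster."""
    return smac_rate > pmac_rate


r_lte = lte_peak_rate(n_prb=50)    # 10 MHz, 16-QAM, rate 3/4, 4x4 MIMO
r_wifi = wifi_peak_rate(n_w=56)    # 20 MHz, 16-QAM, rate 3/4, short GI
print(f"LTE  peak rate: {r_lte / 1e6:.2f} Mbps")
print(f"WiFi peak rate: {r_wifi / 1e6:.2f} Mbps")
print("WiFi compatible with LTE:", is_compatible(r_wifi, r_lte))
```

Lowering the WiFi modulation order, code rate or number of spatial streams in this sketch shows how quickly the set of compatible WiFi configurations shrinks as the LTE rate grows.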

C. 5G-NR PEAK DATA RATE
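As per [68], the maximum data rate computed for a given number of aggregated carriers in a band or band combination is as follows.
$$\text{data rate (in Mbps)} = 10^{-6} \sum_{j=1}^{J} \left( v_{Layers}^{(j)} \, Q_m^{(j)} \, f^{(j)} \, R_{max} \, \frac{N_{PRB}^{BW(j),\mu} \cdot 12}{T_s^{\mu}} \left(1 - OH^{(j)}\right) \right)$$
wherein $J$ is the number of aggregated component carriers (CCs) in a band or band combination and $R_{max} = 948/1024$ (for LDPC codes). For the $j$-th CC, $v_{Layers}^{(j)}$ is the maximum number of layers, $Q_m^{(j)}$ is the maximum modulation order, $f^{(j)}$ is the scaling factor, $\mu$ is the numerology (as defined in [69]), $T_s^{\mu} = 10^{-3}/(14 \cdot 2^{\mu})$ is the average OFDM symbol duration in a subframe for numerology $\mu$ with normal cyclic prefix, $N_{PRB}^{BW(j),\mu}$ is the maximum number of resource blocks that can be allocated in bandwidth $BW^{(j)}$ with numerology $\mu$, $BW^{(j)}$ is the UE-supported maximum bandwidth in the given band or band combination, and $OH^{(j)}$ is the overhead. As an example, for a 20 MHz bandwidth with $\mu = 1$, corresponding to a subcarrier spacing of 30 kHz, $N_{PRB} = 51$. For $J = 1$, $v_{Layers} = 1$, $Q_m = 4$, 14% overhead and unit scaling factor, the maximum data rate is 109.15 Mbps, which is clearly compatible with an LTE rate of 96 Mbps.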
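The NR example can be checked with a short script. The sketch below assumes the maximum data rate expression of 3GPP TS 38.306, which the symbols defined above appear to follow; the function name and the dictionary keys are illustrative. Under that assumption, the stated parameters with a single layer give about 54.57 Mbps, and the quoted 109.15 Mbps is matched with two layers, so the layer count in the example may have been altered in this copy.

```python
def nr_max_data_rate_mbps(carriers, r_max=948 / 1024):
    """Maximum NR data rate in Mbps, summed over component carriers.

    Each carrier is a dict with the (illustrative) keys
    layers, q_m, f, mu, n_prb and overhead.
    """
    total_bps = 0.0
    for cc in carriers:
        # Average OFDM symbol duration for numerology mu, normal CP.
        t_s = 1e-3 / (14 * 2 ** cc["mu"])
        total_bps += (cc["layers"] * cc["q_m"] * cc["f"] * r_max
                      * cc["n_prb"] * 12 / t_s
                      * (1 - cc["overhead"]))
    return 1e-6 * total_bps


# 20 MHz, 30 kHz subcarrier spacing, 16-QAM, 14% overhead, f = 1.
base = {"q_m": 4, "f": 1.0, "mu": 1, "n_prb": 51, "overhead": 0.14}
rate_one_layer = nr_max_data_rate_mbps([dict(base, layers=1)])
rate_two_layers = nr_max_data_rate_mbps([dict(base, layers=2)])
print(f"1 layer : {rate_one_layer:.2f} Mbps")
print(f"2 layers: {rate_two_layers:.2f} Mbps")
```

Either value can then be compared with the LTE peak rate to decide whether the NR configuration is compatible in the sense defined earlier.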
Next, we perform simulations for the LTE-WiFi based dual-connected system in detail. We also simulate the LTE-NR and NR-WiFi scenarios.

IX. SIMULATIONS
We perform simulations for the two scenarios as follows:
• Scenario 1 compares the performance of standalone LTE with that of WiFi-assisted LTE, where LTE is the primary network. The performance indicators are BLER, throughput and latency. Simulation parameters for this scenario are listed in Table 2. We choose the SNRs such that three different MCSs, each with compatible rates, are selected. The MCSs for standalone LTE are MCS-3, 8 and 12, while their compatible counterparts in WiFi are MCS-0, 1 and 2, respectively. One can certainly select a higher-order MCS for WiFi; however, this will offer better performance at the cost of more resources.
• Scenario 2 compares the performance of the PDCP based duplication (PDCP) scheme [33] with that of the super-MAC based schemes proposed in this paper. Simulation parameters for this scenario are listed in Table 3. The performance indicators are BLER and throughput. We consider three sub-scenarios as follows. Firstly, LTE-NR with LTE as the primary network, then NR-WiFi with NR as the primary network, and finally a version of LTE-WiFi with LTE as the primary network. Note that the MCSs chosen are such that all three combining strategies have compatible rates.
Since the key performance indicators (KPIs) of interest are the BLER, throughput and latency (the latter only for Scenario 1), we first define each of the terms as per 3GPP.
Definition 1 (BLER): The BLER is defined as the fraction of code-blocks that have failed the CRC after all error correction schemes. The physical layer segments the MAC-PDUs or the transport blocks into code-blocks of appropriate size and appends CRC bits to each code-block.
Definition 2 (Throughput): The measured UE Application Layer Throughput is defined as the number of useful user data bits per unit of time delivered by the network from the source endpoint to the destination endpoint, excluding protocol overhead (headers) and retransmitted data packets. Although this refers to the application layer throughput, we can extend this definition to the lower layers, like the PDCP, where the volume of data measured is successfully transmitted to the same layer at the receiver, excluding the lower layer headers.

Definition 3 (Latency): The 3GPP defines Latency as the time taken for a PDCP SDU to reach from the transmitter to the receiver (UE).
Note that this definition applies to 3GPP networks. Similarly, one may define the latency of WiFi to be the delay encountered in MAC-PDUs to reach from the transmitter (AP) to the receiver (STA).

A. RESULTS AND ANALYSIS
We first discuss the results and present a detailed analysis of Scenario-1. A discussion and analysis of Scenario-2 follow this.

B. SCENARIO-1
Table 4 summarizes the standalone LTE's BLER and throughput performance for the three different MCS scenarios. When the received SNR increases beyond a certain level, the UE notifies the eNodeB, which then moves to an MCS offering better throughput while staying below a certain BLER level. For example, from the table, we can conclude that when the SNR reaches 10 dB and above, the MCS can change to MCS-8, which offers higher throughput while maintaining the BLER around 0.3. Certainly, this BLER is not desirable; however, we will show that by employing the combining scheme the BLER reduces to desirable values of near 0.1. Next, we summarize the BLER results for the combined scheme, with both soft-combining (SC) and hard-combining (HC), in Table 5 and plot them in Fig. 4. Two observations are immediate. Firstly, the SC-based scheme is superior to the HC-based scheme. Secondly, both schemes offer superior BLER to that obtained by the standalone system. This is better demonstrated in Fig. 5, which plots the BLER against a range of LTE SNRs. Notice that the BLER has a sawtooth-like plot, which results from an MCS change with SNR. Since the transmitter then uses a higher-order modulation, the throughput increases; however, the average bit-error rate also increases, leading to a higher BLER. We have selected parameters to achieve a target BLER of around 10%. Notice from the figure that without WiFi, the standalone LTE suffers a significantly higher BLER than when WiFi is employed.
Another important observation is that the performance gap between SC and HC reduces as the SNR increases within a particular MCS, which implies that HC can be used instead of SC in such regimes and when the memory requirements for storing soft bits are stringent.
We now demonstrate the latency reduction due to a decrease in BLER. The average latency for WiFi (802.11ac in the 5 GHz band) is lower than that of 4G LTE but higher than that of 5G NR URLLC applications. Hence, WiFi can assist 4G LTE and 5G NR eMBB applications but not the URLLC application in 5G. We present the latency numbers in Table 6. An important observation from the latency table is that we achieve latency gains by combining when the SNR is low for a given MCS. The low SNR leads to a higher standalone BLER, and hence combining helps reduce the BLER significantly, which in turn reduces the latency.
Throughput for the combined system is summarized in Table 7 and plotted in Fig. 7. The throughput of the combined scheme is substantially higher than that of the standalone system. The throughput increase due to combining over the standalone LTE is relatively larger at lower SNRs. Hence, even if a user is at the cell boundary where the signal is weak, the achievable throughput is substantial.
scheme and bandwidth. The modulation and coding scheme defines the code rate, which essentially specifies the fraction of information bits in the total number of bits after coding. Hence, when the MCS and bandwidth are fixed, the throughput increases when the BLER decreases. However, when the MCS changes to a higher index, the throughput increases as the code rate or bandwidth or both increase, although the BLER may increase in such a case. We can observe this behaviour when Figs. 5 and 7 are viewed together. In our simulations we have maintained the BLER values at around 10% after combining.
[Table caption fragment] S1 = {(5G,23),(4G,12)}, S2 = {(5G,23),(W,1)}, S3 = {(L,12),(W,1)}. The column SNRs are those of the first network: for S1, the column SNRs are of (5G,23) and the row SNRs are of (4G,12). The same convention applies to S2 and S3.
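The dependence of throughput on BLER at a fixed MCS and bandwidth can be illustrated with a first-order model. The sketch below assumes, hypothetically, that the goodput scales as the peak rate times (1 − BLER) and that block failures are independent across transmission attempts. The function names and the 50 Mbps peak rate are illustrative and do not come from the simulations; the BLER values 0.3 and 0.1 are the approximate standalone and post-combining levels mentioned for MCS-8.

```python
def goodput(peak_rate_bps, bler):
    """First-order goodput: only error-free blocks carry useful bits."""
    return peak_rate_bps * (1.0 - bler)


def mean_transmissions(bler):
    """Mean number of attempts per block for independent failures."""
    return 1.0 / (1.0 - bler)


peak = 50e6  # hypothetical peak rate for a fixed MCS and bandwidth
standalone = goodput(peak, bler=0.3)
combined = goodput(peak, bler=0.1)
gain = combined / standalone

print(f"standalone goodput: {standalone / 1e6:.1f} Mbps")
print(f"combined goodput  : {combined / 1e6:.1f} Mbps")
print(f"relative gain     : {gain:.3f}")
print(f"mean attempts     : {mean_transmissions(0.3):.3f} -> "
      f"{mean_transmissions(0.1):.3f}")
```

In this simple model the relative gain depends only on the two BLER values, which is consistent with the observation that combining helps most where the standalone BLER is highest.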

C. SCENARIO -2
Table 8 shows the BLER and throughput for the standalone NR and LTE scenarios.
Note that since we wish to show the benefits of combining, we choose the standalone BLER to be high. We will see that all three sub-scenarios of Scenario-2 reduce the BLER. Table 8 summarizes the BLER improvement achieved through combining. Firstly, both the combining schemes outperform the PDCP based duplication scheme. In the PDCP based duplication scheme, if both paths fail to decode the block successfully, then a block error is declared. By contrast, in our schemes combining is initiated when both paths fail. From this argument and the table, we conclude that combining rectifies errors in such blocks, which otherwise could not be corrected using coding. Thus, we can view our scheme as an intermediate step in a three-step HARQ process: first the CRC check, then combining, and finally FEC. This three-step procedure reduces the BLER and improves the throughput, as shown in Fig. 9, which plots the throughput for the three sub-scenarios of Scenario-2. The throughput is also summarized in Table 10.

X. CONCLUSIONS AND FUTURE WORK
In this work, we propose a new L2 diversity scheme based on combining, which next-generation networks can adopt to enhance reliability. When the CRC and FEC procedures fail, the proposed scheme optimally combines the RLC PDUs arriving at the UE from various MAC entities. To the best of our knowledge, this is the first time an optimal MAC-level diversity combining scheme has been proposed. Simulations demonstrate that optimally combined PDUs reduce the block error rate substantially. Moreover, the average latency also improves in various regimes of interest, especially when the SNR for a given MCS is low. The scheme can help in designing strategies for beyond-5G networks by combining the essential 5G advantages with this cross-layer technique.
In the future, it will be of much interest to investigate multi-RAT and multi-connectivity based solutions to tackle the rising demand for bandwidth. The main challenge will be to ensure latency below the prescribed limits. Another challenge is to compare the performance of the hard-combining scheme with that of the selection-combining scheme for multi-connectivity with more than two connected paths.

APPENDIX A DESCRAMBLING OF LLRs
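To obtain the LLRs of the descrambled bits, LLR$(c_{i,j})$, the receiver employs a soft-descrambling procedure described as follows. The receiver determines the descrambler's initial state based on the rules set in a standard and obtains the corresponding pseudo-random sequence. Since the LLRs are available only at the pre-descrambling stage, the LLRs of the scrambled bits are used to obtain the LLRs of the descrambled bits. To do so, observe the following. The likelihood ratio of a scrambled bit $\tilde{c}_{i,j}$ is
$$\Lambda(\tilde{c}_{i,j}) = \frac{\Pr(\tilde{c}_{i,j} = 0 \mid Y_{i,j})}{\Pr(\tilde{c}_{i,j} = 1 \mid Y_{i,j})}.$$
Also note that $\tilde{c}_{i,j} = c_{i,j} \oplus s_{i,j}$, where $s_{i,j}$ is the binary element of the scrambling sequence that is used to scramble bit $j$ in the $i$-th channel. Thus we have $c_{i,j} = \tilde{c}_{i,j} \oplus s_{i,j}$. Now,
$$\Lambda(c_{i,j}) = \frac{\Pr(\tilde{c}_{i,j} \oplus s_{i,j} = 0 \mid Y_{i,j})}{\Pr(\tilde{c}_{i,j} \oplus s_{i,j} = 1 \mid Y_{i,j})}.$$
Thus if $s_{i,j} = 0$, then
$$\Lambda(c_{i,j}) = \Lambda(\tilde{c}_{i,j}),$$
whereas if $s_{i,j} = 1$, then
$$\Lambda(c_{i,j}) = \frac{1}{\Lambda(\tilde{c}_{i,j})}.$$
Finally, LLR$(c_{i,j}) = \ln \Lambda(c_{i,j})$.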
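In the log domain, the two cases of the soft-descrambling rule collapse into a sign flip: the LLR is kept where the scrambling bit is 0 and negated where it is 1. A minimal sketch follows; the function names are illustrative, and the hard-decision helper assumes the convention that a positive LLR favours bit 0.

```python
def soft_descramble(llr_scrambled, scrambling_bits):
    """Return LLRs of descrambled bits from LLRs of scrambled bits.

    The LLR is kept when the scrambling bit is 0 and negated when it
    is 1, i.e. LLR(c) = (1 - 2 s) * LLR(c_tilde).
    """
    if len(llr_scrambled) != len(scrambling_bits):
        raise ValueError("LLR and scrambling sequences differ in length")
    return [(1 - 2 * s) * llr
            for llr, s in zip(llr_scrambled, scrambling_bits)]


def hard_decision(llrs):
    """Hard bits, assuming a positive LLR favours bit 0."""
    return [0 if llr >= 0 else 1 for llr in llrs]


llr_in = [2.5, -1.0, 0.3]   # LLRs of the scrambled bits
seq = [0, 1, 1]             # scrambling sequence s
llr_out = soft_descramble(llr_in, seq)
print(llr_out)
```

Taking hard decisions after soft descrambling gives the same bits as descrambling the hard decisions with an XOR, which is a convenient sanity check for an implementation.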

APPENDIX B PROBABILITY OF BIT-ERROR FOR WELL KNOWN DISTRIBUTIONS
For BPSK signalling with slow Rayleigh fading we have [70], while for Rician fading we have, where κ is the ratio of the power received in the line-of-sight path to the power received in the other paths. For Nakagami-m fading, $m \in \{1, 2, \ldots\}$, we have
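For reference, the Rayleigh and integer-m Nakagami cases have well-known closed forms that are easy to evaluate. The sketch below uses the textbook expressions in terms of the average received SNR per bit (a linear value, not dB); the symbol names are assumptions, and the notation may differ from that of [70]. The Rician case is omitted because it generally requires numerical integration.

```python
from math import comb, erfc, sqrt


def ber_bpsk_awgn(snr):
    """BPSK bit-error probability without fading, Q(sqrt(2*snr))."""
    return 0.5 * erfc(sqrt(snr))


def ber_bpsk_rayleigh(avg_snr):
    """BPSK bit-error probability averaged over slow Rayleigh fading."""
    return 0.5 * (1.0 - sqrt(avg_snr / (1.0 + avg_snr)))


def ber_bpsk_nakagami(avg_snr, m):
    """BPSK bit-error probability for Nakagami-m fading, integer m >= 1."""
    mu = sqrt(avg_snr / (m + avg_snr))
    head = ((1.0 - mu) / 2.0) ** m
    tail = sum(comb(m - 1 + k, k) * ((1.0 + mu) / 2.0) ** k
               for k in range(m))
    return head * tail


avg_snr = 10.0  # linear scale, i.e. 10 dB
for m in (1, 2, 4):
    print(f"m = {m}: {ber_bpsk_nakagami(avg_snr, m):.3e}")
print(f"Rayleigh: {ber_bpsk_rayleigh(avg_snr):.3e}")
print(f"AWGN    : {ber_bpsk_awgn(avg_snr):.3e}")
```

Setting m = 1 recovers the Rayleigh value, and increasing m moves the result towards the non-fading case.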

APPENDIX C PROOF OF LEMMA 3
To begin with, we have
Notice that we have removed the condition $\hat{b}_i = 0$ because it is now incorporated in the probability calculation. Since $\hat{b}_j \in \{0, 1\}$, $j \in S_{-i}$, we condition the above probability on every possible combination of the $b_j$'s over $j \in S_{-i}$. Consider the power set $2^{S_{-i}}$ of $S_{-i}$, which contains all the subsets of $S_{-i}$. Each element of the power set indicates one possible combination of the $b_j$'s. That is, consider $P_i \in 2^{S_{-i}}$. Since a power set is closed under complementation, $P_i^c \in 2^{S_{-i}}$. Let $\hat{b}_j = 1$ if $j \in P_i$. Running over all possible $P_i \in 2^{S_{-i}}$, we cover every scenario. Note that different $P_i$ represent disjoint combinations. Then the RHS of (33) becomes
For a fixed $b$, the $\hat{b}_j$'s are independent, and hence
Thus (34) reduces to
Notice that the conditioning in the first factor is no longer required, as we have incorporated its effect in the probability calculation. The probabilities depend upon the distribution of $|H_j|^2$. For Rayleigh fading, $|H_j|^2$ is exponentially distributed. Certainly the $|H_j|^2$ are independent over $j$, but they are not identical, because we have a multi-RAT system and such an assumption will, in general, not hold. However, it is clear that if they were indeed identical, then the r.v.s $\sum_{j \in P_i} |H_j|^2$ and $\sum_{j \in P_i^c} |H_j|^2$ both follow the Erlang distribution [72], but with different parameters. So if $E[|H_j|^2] = \lambda$ and $|S|$ is the number of elements in the set $S$, then $\sum_{j \in P_i} |H_j|^2 \sim \text{Erlang}(\lambda, |P_i|)$ while $\sum_{j \in P_i^c} |H_j|^2 \sim \text{Erlang}(\lambda, |P_i^c|)$, where Erlang$(\alpha, n)$ has the following density
$$f_E(x, \alpha, n) = \frac{\alpha^n x^{n-1} e^{-\alpha x}}{(n-1)!} \quad \text{for } x, \alpha \ge 0.$$
Thus,
We can now state the result for $q(i)$. Along similar lines, we can derive the probability for $r(i)$. For this purpose, let $\tilde{P}_i = P_i \cup \{i\}$. The Lemma then follows.
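The two ingredients of this proof, the enumeration of the subsets $P_i$ with their complements and the Erlang density, can be sketched as follows. The function names and the example index set are illustrative only. The density treats its first argument as a rate parameter, and it is defined here for n ≥ 1, so the empty subset has to be handled separately by any caller.

```python
from itertools import combinations
from math import exp, factorial


def power_set(indices):
    """Yield every subset of the given index set as a frozenset."""
    items = sorted(indices)
    for size in range(len(items) + 1):
        for subset in combinations(items, size):
            yield frozenset(subset)


def erlang_pdf(x, alpha, n):
    """Erlang density with rate alpha and integer shape n >= 1."""
    if n < 1:
        raise ValueError("shape n must be at least 1")
    if x < 0:
        return 0.0
    return alpha ** n * x ** (n - 1) * exp(-alpha * x) / factorial(n - 1)


# Hypothetical set S_{-i}: the paths other than path i.
s_minus_i = frozenset({2, 3, 4})
# Each P_i is paired with its complement within S_{-i}.
pairs = [(p, s_minus_i - p) for p in power_set(s_minus_i)]
for p, p_c in pairs:
    print(sorted(p), sorted(p_c))
```

Each pair in `pairs` corresponds to one disjoint conditioning event in the sum over the power set.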

APPENDIX D PROOF OF LEMMA 4
For the case when the $|H_j|^2$ are independent and exponentially distributed but with distinct means, say $E[|H_j|^2] = \lambda_j$, let $\lambda_{P_i} = \{\lambda_j : j \in P_i\}$. Instead of the Erlang distribution, we now call the resulting distribution the modified Erlang or M-Erlang distribution. Note that this distribution takes all the distinct mean values as inputs. Then $\sum_{j \in P_i} |H_j|^2 \sim$ M-Erlang$(\lambda_{P_i}, |P_i|)$ and
where $\beta_j > 0\ \forall j$ and $x > 0$. The Lemma then follows.

APPENDIX E PROOF OF LEMMA 1
The LLR for $b_j$,
This is for BPSK modulation, which can be extended (nontrivially) to higher-order modulation schemes.
The noise $Z_{i,j}$ is Gaussian, independent and identically distributed over time and paths. Thus, given $X_j$ and the estimate of $H_{i,j}$ at the receiver, the $Y_{i,j}$'s are independent over $i$. Thus, The LLR thus becomes,
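Because the observations are independent over the paths, the joint likelihood factorizes and the combined LLR is the sum of the per-path LLRs. The sketch below illustrates this for BPSK under assumptions that are not taken from the paper: the model $Y_i = H_i X + Z_i$ with $X \in \{+1, -1\}$, bit 0 mapped to $+1$, and circularly symmetric complex Gaussian noise of variance $N_{0,i}$ on path $i$. The function names are illustrative.

```python
def path_llr(y, h, n0):
    """Per-path BPSK LLR, ln p(y | x=+1) - ln p(y | x=-1)."""
    return 4.0 * (h.conjugate() * y).real / n0


def combined_llr(ys, hs, n0s):
    """Soft-combined LLR of one bit observed over independent paths."""
    return sum(path_llr(y, h, n0) for y, h, n0 in zip(ys, hs, n0s))


# One bit received over two paths (illustrative values).
hs = [1 + 0j, 0.5j]            # channel estimates
ys = [0.8 + 0.1j, 0.4j]        # received samples
n0s = [1.0, 0.5]               # noise variances
llr = combined_llr(ys, hs, n0s)
print(f"combined LLR = {llr:.3f}")
```

A positive combined LLR favours bit 0 under the assumed mapping, and a path with a weak channel or a high noise variance contributes proportionally less to the sum.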
Thus $p_{e,h} = p_{e,s}$ for $N = 2$.

APPENDIX G PROOF OF LEMMA 5
The proof of Lemma 5 follows from successively conditioning the probability. First, $K_i - C_i + E_i$ is conditioned on $K_i = k_i$, $k_i \ge t_i + 1$. Then, for each $K_i = k_i$, we are left with the r.v. $E_i - C_i$, and we calculate the probability that $C_i - E_i \ge k_i - t_i$, for which we again condition on $E_i = e_i$. Then, for each $K_i = k_i$ and $E_i = e_i$, we are left with the r.v. $C_i$, and we calculate the probability that $C_i \ge k_i + e_i - t_i$. Since $C_i$ can take a maximum value of $k_i$, it is a binomial r.v. with $k_i$ trials and probability of success $q_i$. Its probability mass function (pmf) is exactly $G(c_i; k_i, q_i)$. To correct the errors, $c_i \in \{k_i + e_i - t_i, \ldots, k_i\}$. For this set to be non-empty, we must have $e_i \le t_i$, as argued below. Now, $E_i$ ranges from 0 to $N_M - k_i$; it is also a binomial r.v., with $N_M - k_i$ trials and probability of success $r_i$. The pmf of $E_i$ is thus $G(e_i; N_M - k_i, r_i)$. The limits of the sum on $e_i$ are from 0 to $\min(N_M - k_i, t_i)$ for the following reason. Notice that the minimum number of errors that the super-MAC has to correct to declare an error-free block at MAC $i$ is $C_i = K_i + E_i - t_i$. But this number can at most be $K_i$ and thus, if $E_i > t_i$, then combining fails to correct the block. Thus the maximum value of $E_i$ for which the block error can be corrected is $\min(t_i, N_M - k_i)$.
Finally, $K_i$ is a binomial r.v. with $N_M$ trials and probability of success $p_i$. Its pmf is $G(k_i; N_M, p_i)$. Since the super-MAC initiates combining for $K_i \ge t_i + 1$, the limits of the sum on $k_i$ are from $t_i + 1$ to $N_M$.
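The nested conditioning described above translates directly into a triple sum over $k_i$, $e_i$ and $c_i$. The sketch below evaluates that sum, assuming $G(\cdot\,; n, p)$ is the binomial pmf as the proof states. The function names are illustrative, and the statement of Lemma 5 itself is not reproduced here, so the exact quantity the lemma reports may be normalized differently.

```python
from math import comb


def binom_pmf(x, n, p):
    """Binomial pmf G(x; n, p); zero outside 0..n."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p ** x * (1.0 - p) ** (n - x)


def prob_block_corrected(n_m, t, p, q, r):
    """Probability that combining is initiated and clears the block.

    n_m: block length, t: number of errors tolerated after combining,
    p: error probability (K), q: probability that an error is
    corrected (C), r: probability that a new error is introduced (E).
    """
    total = 0.0
    for k in range(t + 1, n_m + 1):              # combining is initiated
        over_e = 0.0
        for e in range(0, min(n_m - k, t) + 1):  # new errors introduced
            over_c = sum(binom_pmf(c, k, q)
                         for c in range(k + e - t, k + 1))
            over_e += binom_pmf(e, n_m - k, r) * over_c
        total += binom_pmf(k, n_m, p) * over_e
    return total


print(prob_block_corrected(n_m=8, t=2, p=0.1, q=0.9, r=0.01))
```

Two limiting cases are useful checks: with q = 1 and r = 0 every initiated combining succeeds, so the result equals the probability that $K_i \ge t_i + 1$, and with q = 0 and r = 0 combining never succeeds.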