Current Status and Directions of IEEE 802.11be, the Future Wi-Fi 7



I. INTRODUCTION
In September 2020, we celebrate the 30th anniversary of the IEEE 802.11 project [1] that has changed our connectivity habits. Nowadays, Wi-Fi, defined by a family of IEEE 802.11 standards, is the most popular wireless technology used for data transmission. Wi-Fi transmits more than half of user traffic. While cellular technologies are rebranded every decade, e.g., switching from 4G to 5G, for Wi-Fi users, increasing data rates, new services, and new features arrive almost invisibly. Only a few customers care about the letters "n" [2], "ac" [3], or "ax" [4] that follow "802.11" on consumer electronics boxes. But this does not mean that Wi-Fi does not evolve.
One of the witnesses of this evolution is a dramatic increase in nominal data rates: from the legacy 2 Mbps of IEEE 802.11 to almost 10 Gbps in the latest 802.11ax [4], also known as Wi-Fi 6. Modern Wi-Fi achieves such a performance gain thanks to faster modulation and coding schemes (MCSs), wider channels, and the adoption of Multiple Input Multiple Output (MIMO) technologies.
The associate editor coordinating the review of this manuscript and approving it for publication was Xijun Wang.
In addition to the main track of high-rate wireless local area networks, Wi-Fi evolution includes several niche projects. For example, Wi-Fi HaLow (802.11ah) [5] brings Wi-Fi to the market of wireless Internet of Things. Millimeter-Wave Wi-Fi (802.11ad/ay) [6] supports nominal data rates up to 275 Gbps at the cost of a very low range. New applications and services related to 8K video, Virtual Reality, Augmented Reality, Gaming, Remote Office, and Cloud Computing, as well as the need to support a high number of users with heavy traffic in wireless networks, push the community forward to Extremely High-Throughput (EHT) wireless networks. In May 2019, Task Group BE (TGbe) [7] started its work on a new amendment to the Wi-Fi standard that will increase the nominal throughput to more than 40 Gbps in ≤ 7 GHz channels and provide support for real-time applications (RTA) [8]. Although the standard development process is at the very initial stage, by now, about 500 submissions have been made proposing new features for the future Wi-Fi 7, also known as IEEE 802.11be. In addition to increasing the data rates and reducing latencies, these features rethink important concepts of Wi-Fi operation, such as the forward-compatible physical layer (PHY), scalable sounding, and Multiple Access Point (Multi-AP) cooperation, which will form the foundation for further Wi-Fi evolution. As 11be is quite a novel project, there is no comprehensive analysis of its candidate features. In the literature, we have found only [9]-[16]. Submission [9] just mentions the 11be features, [10] provides a very brief view on 11be, while [11], [12] focus mostly on RTA features. The authors of [13], [14] study Distributed MIMO, and [15], [16] analyze how Full Duplex (FD) works in Wi-Fi.
In this paper, we overview the main challenges related to Wi-Fi 7, thoroughly investigate possible innovations discussed in the IEEE 802.11 Working Group, raise open issues and provide the academic community with ideas on fruitful areas of research in the context of Wi-Fi 7. Many ideas described in the paper are only discussed in TGbe but are not approved yet. The other ideas have been recently approved. In the absence of a draft standard, the interested reader can find the list of already approved features in the latest version of the Specification Framework Document [17]. In the paper, we indicate which proposals are approved and which ones are only discussed. In both cases, the contribution from academia related to evaluation and further development of the proposed ideas is highly valuable.
The rest of the paper is organized as follows. In Section II, we take a look at Wi-Fi history, explain the 11be standard development process timeline, and briefly point out the main candidate features of Wi-Fi 7. In Section III, we describe them in detail, focusing on open issues, challenges, and problems that can be solved by the research community. In Section IV, we summarize the paper.

II. IEEE 802.11be IN THE Wi-Fi LANDSCAPE
A. Wi-Fi EVOLUTION
By the end of the seven-year-long development process of the first Wi-Fi standard, it had become apparent that its maximal nominal data rate of 2 Mbps was too small to replace 100 Mbps Ethernet. That is why quite soon the community developed a palette of standard amendments, namely 802.11a/b/g, that increased data rates up to 54 Mbps by using new MCSs in both the 2.4 and 5 GHz bands. Having introduced orthogonal frequency-division multiplexing (OFDM) with a channel bandwidth of 20 MHz, 64 tones, and a symbol length of 3.2 µs plus 0.8 µs for the guard interval, 802.11a formed the framework for the following Wi-Fi versions.
Wi-Fi 4 (802.11n) [2] gives further growth of data rates (up to 600 Mbps) by exploiting several techniques. First, it introduces higher coding rates of 5/6 as opposed to the previous 3/4 and optionally reduces the guard interval between OFDM symbols from 0.8 µs to 0.4 µs. Second, it doubles the channel width to 40 MHz. Third, it introduces the MIMO technology that is the most significant 802.11n breakthrough. With 802.11n, a pair of devices can use multiple antennas to transmit up to four spatial streams (SSs) simultaneously between them. High nominal data rates at the PHY would not provide end-user benefits if it were not for new MAC features. The most significant MAC features are two aggregation methods, namely Aggregated MAC Service Data Unit (A-MSDU) and the Aggregated MAC Protocol Data Unit (A-MPDU), which have significantly reduced the overhead induced by headers and inter-frame spaces. A-MSDU appends several aggregated packets with a single MAC header and checksum. A-MPDU assigns a MAC header and frame checksum to each aggregated packet. Thus, A-MPDU improves transmission reliability by allowing the decoding of at least some packets in case of short noise bursts, at the expense of slightly increased overhead.
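The reliability difference between the two aggregation methods can be illustrated with a minimal sketch. It assumes a simplified model in which each aggregated packet is either received correctly or corrupted; the function names and the example burst pattern are hypothetical and are not taken from the standard.

```python
from typing import List


def delivered_amsdu(packet_ok: List[bool]) -> int:
    """A-MSDU model: one MAC header and one checksum cover all packets,
    so a single corrupted packet invalidates the whole aggregate."""
    return len(packet_ok) if all(packet_ok) else 0


def delivered_ampdu(packet_ok: List[bool]) -> int:
    """A-MPDU model: every packet has its own header and checksum,
    so each uncorrupted packet can still be decoded."""
    return sum(packet_ok)


# Hypothetical aggregate of 8 packets; a short noise burst hits packets 3 and 4.
burst = [True, True, True, False, False, True, True, True]
print("A-MSDU delivers", delivered_amsdu(burst), "packets")
print("A-MPDU delivers", delivered_ampdu(burst), "packets")
```

In this toy model the burst costs A-MSDU the whole aggregate, while A-MPDU loses only the two corrupted packets, which is the trade-off against its slightly higher per-packet overhead.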
The next 10x increase of data rates is implemented with the 802.11ac amendment [3], [19], [20] (Wi-Fi 5). The amendment expands the approaches used in the previous version of Wi-Fi. Thus, it increases the constellation order of quadrature amplitude modulation (QAM) from 64-QAM to 256-QAM, i.e., the maximal number of raw bits per symbol grows from six to eight. The channel bandwidth increases up to 160 MHz. Since such wide bands are not available in 2.4 GHz, 802.11ac operates only in 5 GHz. Because of spectrum scarcity, the amendment allows using non-contiguous 80 + 80 MHz channels that can be separated by some frequency gap. To cope with interference, before each packet transmission, every device adaptively selects the bandwidth used for this packet: 20, 40, 80, or 160 MHz, as described in Section III-C.1. As for MIMO, 802.11ac doubles the number of SSs up to 8. The developers of the standard have noticed that it is hardly possible to deploy more than two antennas to some devices. Moreover, the access point (AP) may have only a small portion of data intended for each client station (STA). To address these issues, 802.11ac introduces downlink (DL) multi-user (MU) MIMO that allows an AP to assign different DL SSs to various STAs. All these means increase throughput up to 7 Gbps. To reduce the header-induced overhead at such high data rates, the amendment increases the maximal length of an aggregated frame from 65 535 of 802.11n to 4 692 480 octets.
The development of Wi-Fi 6 (802.11ax) [4] is connected with a paradigm shift. Instead of increasing the nominal data rates, the 802.11 Working Group focuses on improving the efficiency of Wi-Fi networks, specifically in dense 2.4 GHz and 5 GHz deployments. Primarily, they introduce orthogonal frequency-division multiple access (OFDMA) to Wi-Fi, which allows allocating small but the most efficient portions of time-frequency resources for STAs. Apart from that, Wi-Fi 6 enables uplink (UL) MU MIMO and OFDMA transmissions and introduces more flexible rules for channel bonding and carrier sense. The AP fully controls the parameters of UL MU transmissions, such as MCS, duration, etc. In particular, it sends Trigger frames that include these parameters and initiate UL MU transmissions.
To improve performance in outdoor scenarios and add more flexibility to OFDMA, 11ax downclocks the OFDM numerology four times, quadrupling the number of tones. So the OFDM symbol duration becomes 12.8 µs plus the guard interval of 0.8, 1.6, or 3.2 µs. With the shortest guard interval, the overhead reduces by 10% with respect to Wi-Fi 5. To increase the nominal throughput, Wi-Fi 6 enables 1024-QAM that carries 25% more raw data than 256-QAM of Wi-Fi 5. Summing up, the nominal data rates are increased by 37%, which is negligible in contrast to the tenfold growth shown by its predecessors. Despite much better performance in dense deployments, such low gains in the nominal throughput may not attract new customers. Skeptics claim that focusing on the quality of operation and ignoring the quantity performance indicators may slow down the sales of Wi-Fi 6 devices. Such a concern is one of the reasons why the 802.11 Working Group switches back to increasing the nominal throughput in Wi-Fi 7, together with improving user experience (e.g., when watching 8K video with an uncompressed rate of 20 Gbps) and providing real-time communications with the required latency below 5 ms for gaming.
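As a rough cross-check of the nominal rates quoted above, the sketch below computes the peak PHY rate as the number of coded data bits per OFDM symbol divided by the symbol duration. The data-tone counts (108, 468, and 1960) are our assumptions about the widest channel of each amendment and are not stated in the text; the function name is illustrative.

```python
def nominal_rate_mbps(data_tones, bits_per_tone, coding_rate, streams,
                      symbol_us, guard_us):
    """Nominal PHY rate in Mbps: data bits per OFDM symbol / symbol time in µs."""
    bits_per_symbol = data_tones * bits_per_tone * coding_rate * streams
    return bits_per_symbol / (symbol_us + guard_us)


# Assumed data-tone counts: 108 (40 MHz, 11n), 468 (160 MHz, 11ac),
# 1960 (160 MHz, 11ax). Coding rate 5/6 and the shortest guard interval.
wifi4 = nominal_rate_mbps(108, 6, 5 / 6, 4, 3.2, 0.4)      # 64-QAM, 4 SSs
wifi5 = nominal_rate_mbps(468, 8, 5 / 6, 8, 3.2, 0.4)      # 256-QAM, 8 SSs
wifi6 = nominal_rate_mbps(1960, 10, 5 / 6, 8, 12.8, 0.8)   # 1024-QAM, 8 SSs

for name, rate in [("Wi-Fi 4", wifi4), ("Wi-Fi 5", wifi5), ("Wi-Fi 6", wifi6)]:
    print(f"{name}: about {rate:.0f} Mbps")
print(f"Wi-Fi 6 / Wi-Fi 5 ratio: {wifi6 / wifi5:.2f}")
```

Under these assumptions the ratio of the unrounded Wi-Fi 6 and Wi-Fi 5 rates is about 1.39; the 37% quoted above corresponds to the rounded values of 9.6 and 7 Gbps.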
High data rates are not enough for supporting RTA since the packets may wait a long time for the channel to become idle or the previous packets to be served. Thus, in addition to providing high data rates, the 802.11be amendment deals with the Quality of Service (QoS) of RTA. In Wi-Fi networks, there is a palette of methods to provide QoS. However, only one of them, namely Enhanced Distributed Channel Access (EDCA), is used in practice. EDCA distinguishes voice, video, best effort, and background traffic types by assigning them different access categories (AC). As EDCA extends the basic parametric channel access, it cannot guarantee QoS. In contrast, such standardized mechanisms as Hybrid Coordination Function Controlled Channel Access that take into account specific QoS requirements and use deterministic
88666 VOLUME 8, 2020
[21] (see Fig. 1). Its primary goal was to define new features of 802.11 on bands between 1 and 7.125 GHz with the primary objective of increasing peak throughput by scaling the PHY of 11ac and 11ax. In July 2018, the EHT TIG was transformed into the EHT Study Group that defined the scope of the new project and identified the list of candidate features of 11be.
In parallel, 802.11 discussed how to support RTA in Wi-Fi networks. The work in this direction started in November 2017 [22] with a presentation on Wi-Fi Time-Sensitive Networking (TSN) as a part of the activities of the 802.11 Wireless Next Generation Standing Committee. The proposal attracted much attention, and in July 2018 the RTA TIG was launched [23]. Since supporting RTA requires both high nominal data rates and some MAC features, the 802.11 Working Group agreed, in order to speed up the standard development process, to provide support of RTA as a part of the future 11be amendment.
In 2018, the FD TIG studied how to implement FD in Wi-Fi and how much gain this technology could provide. The results of these activities should be taken into account by the 11be developers, too.
In March 2019, the EHT Study Group was transformed into TGbe [7], which is developing the 11be amendment. It aims at finishing the initial draft in two years, i.e., by March 2021. The final version is expected by early 2024. While the draft standard is not ready, all the approved features can be found in the latest version of the Specification Framework Document [17].
To meet the challenging timeline, the group evaluates various features in parallel, in two ad-hoc groups that focus on PHY and MAC features, respectively. Despite such optimization, there are many submissions in the queue, and the waiting time exceeds several months. In January 2020, the authors of [24] raised a concern that TGbe would not meet the timeline at the current work pace. To accelerate the standard development process, the group agreed to select a small set of high-priority features that could be released by 2021 (Release 1). Such features should provide high gain with low complexity. The set should include support of 320 MHz, 4K-QAM, obvious OFDMA improvements, and multi-link operation. The main concern against this proposal is related to the complexity of the changes to PHY and MAC that would be needed to support the features postponed for Release 2.
One more important issue related to Wi-Fi 7 is its coexistence with 3GPP technologies of cellular networks operating in the same unlicensed frequency bands. To study the coexistence issues related to Wi-Fi and cellular networks, IEEE 802.11 launched a Coexistence Standing Committee (Coex SC). The task of Coex SC is to establish contact with 3GPP to set up synchronous work. Despite many activities and even a joint workshop with both 3GPP and IEEE 802.11 participants in July 2019 in Vienna, no technical solutions have been approved yet. A possible explanation for such fruitless activities is that neither IEEE 802 nor 3GPP is willing to change its own technology to align it with the competing one. So, at the moment, it is not clear which of the solutions discussed within Coex SC will become a part of Wi-Fi 7.

C. Wi-Fi 7 AT A GLANCE
The 11be project has incorporated very ambitious goals related to higher nominal data rates, higher spectrum efficiency, better interference mitigation, and providing RTA support. To achieve these targets, the 802.11 Working Group has discussed about 500 proposals from different areas, which can be mapped to one of the seven major innovations of Wi-Fi 7.

1) EHT PHY
Wi-Fi 7 is approved to scale the PHY of the previous Wi-Fi standards by doubling both the bandwidth and the number of SSs in MU-MIMO, which increases the nominal throughput 2 × 2 = 4 times. The PHY also introduces higher-rate MCSs by utilizing 4K-QAM, adding 20% to the nominal throughput. Thereby, Wi-Fi 7 will provide up to 2 × 2 × 1.2 = 4.8 times higher nominal data rates compared with 9.6 Gbps of Wi-Fi 6. Thus, the maximum nominal throughput of Wi-Fi 7 is 9.6 Gbps × 4.8 ≈ 46 Gbps. Additionally, a revolutionary change in the PHY protocol is related to the generalization of the previous PHY headers and developing a forward-compatible frame format.
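The scaling arithmetic above can be reproduced in a few lines. This is only a sketch of the nominal calculation: the variable names are ours, and the individual gains are derived from the channel widths, stream counts, and constellation orders mentioned in this paper.

```python
WIFI6_GBPS = 9.6              # nominal rate of Wi-Fi 6 quoted in the text

bandwidth_gain = 320 / 160    # 160 MHz -> 320 MHz
stream_gain = 16 / 8          # 8 SSs -> 16 SSs
qam_gain = 12 / 10            # bits per tone: 1024-QAM (10) -> 4K-QAM (12)

total_gain = bandwidth_gain * stream_gain * qam_gain
wifi7_gbps = WIFI6_GBPS * total_gain

print(f"total gain: x{total_gain:.1f}")
print(f"nominal Wi-Fi 7 rate: about {wifi7_gbps:.0f} Gbps")
```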

2) EDCA WITH 802 TSN FEATURES
To support RTA, TGbe examines the main findings of IEEE 802 TSN [25] and discusses how to improve EDCA. The ongoing discussions in the standard committee are related to backoff procedure, ACs, as well as packet service policies.

3) ENHANCED OFDMA
Introduced in 11ax [4], OFDMA provides new opportunities in optimal resource allocation. However, in 11ax, OFDMA is insufficiently flexible. First, it allows the AP to allocate only one resource unit (RU) of a predetermined size to a client STA. Second, it does not support direct link transmissions. Both drawbacks reduce spectrum efficiency. Besides, the lack of flexibility of the legacy OFDMA degrades performance in dense deployments and increases the latency, which is crucial for RTA. TGbe addresses these OFDMA challenges.

4) MULTI-LINK OPERATION
One of the approved revolutionary changes of Wi-Fi 7 is native support of the multi-link operation, which is favorable for both tremendous data rates and extremely low latency. Although modern chipsets currently can use several links simultaneously, the links are independent, which limits the efficiency of such operation. 11be strives to find such a level of synchronization between the links that allows efficient use of the channel resources and does not suffer from interference in dense deployments.

5) CHANNEL SOUNDING OPTIMIZATION
High orders of MU-MIMO and OFDMA in wide channels require the devices to exchange a large amount of channel state information. The colossal amount of overhead induced by the sounding procedure eliminates the gains provided in theory by the scaled PHY. Therefore, much attention is paid to methods that can reduce the channel sounding overhead.

6) ADVANCED PHY TECHNIQUES IMPROVING SPECTRUM EFFICIENCY
Before TGbe was launched, the 802.11 Working Group had discussed several advanced PHY techniques that should significantly improve spectrum efficiency in case of transmission retries and simultaneous transmissions in the same or opposite directions. Although Hybrid Automatic Repeat Request (HARQ), FD operation, and Non-orthogonal Multiple Access (NOMA) are widely studied in the literature, it is not clear yet whether the gain provided by these technologies is sufficiently high to justify the necessary changes. While TGbe focuses during the work on Release 1 on straightforward high-priority features, about which the group has no doubts, the community has some time for further evaluation of HARQ, NOMA, and FD in the context of Wi-Fi.

7) MULTI-AP COOPERATION
Another important innovation introduced in 11be is multi-AP cooperation. Until now, the 802.11 Working Group has focused mostly on fully-distributed coordination between nearby APs. Although many vendors have their own centralized controllers for enterprise Wi-Fi networks, the ability of such controllers has been limited to configuring long-term parameters and channel selection. TGbe discusses much tighter cooperation between nearby APs, which includes coordinated scheduling, beamforming, and even distributed MIMO systems. Some of the considered approaches rely on successive interference cancellation (SIC). 11be will support the coordinated scheduling, but there is a level of uncertainty related to more complex approaches. That is why they are postponed for Release 2.
By now, it seems that the majority of proposals related to the first five innovations will become a part of Wi-Fi 7, while the proposals related to the last two innovations require much additional research to prove their efficiency. Table 2 summarizes the main innovations of 11be, the features used to implement them, and how these innovations improve the performance of 11be networks.

III. IEEE 802.11be CANDIDATE FEATURES
A. EHT PHY
11be extends PHY from 11ax [4]. The nominal data rates are increased by exploiting the same ideas that are used in 11n [2] and 11ac [3]. It is quite natural to accelerate nominal data rates by increasing (i) the order of modulation up to 4K-QAM, (ii) the bandwidth up to 320 MHz and beyond, and (iii) the number of spatial streams in MU-MIMO up to 16. A revolutionary part of EHT PHY is the forward-compatible frame format, which simplifies introducing new PHYs to 802.11 and supporting various PHY formats in the same network.

1) 4K-QAM
Each additional increase in the order of constellation gives a smaller and smaller gain. While introducing 256-QAM in 802.11ac provides a 33% gain with respect to 64-QAM of 802.11n, 1024-QAM of 802.11ax increases nominal data rates by only 25%. 4096-QAM gives only 20%. At the same time, the cost of such a small gain is high. The signal-to-noise ratio (SNR) needed at the receiver side to accept 4096-QAM is about 40 dB, which is too high for a typical Wi-Fi scenario [26]. Such a high SNR can be achieved with beamforming. So this modulation can be fruitful when the AP has many antennas and serves only one client STA with a few antennas. In such a case, MU transmissions cannot be used, and the number of SSs is low. Thus, the only way to increase throughput is by using a high order of constellation. That is why 4096-QAM will be optionally supported by EHT.
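The diminishing returns follow directly from the number of raw bits carried per symbol, which is the base-2 logarithm of the constellation order. A minimal sketch (the function names are ours):

```python
from math import log2


def bits_per_symbol(qam_order: int) -> int:
    """Raw bits carried by one symbol of a square QAM constellation."""
    return int(log2(qam_order))


def relative_gain(previous_order: int, new_order: int) -> float:
    """Relative increase of raw bits per symbol between two constellations."""
    return bits_per_symbol(new_order) / bits_per_symbol(previous_order) - 1


orders = [64, 256, 1024, 4096]
for previous, new in zip(orders, orders[1:]):
    print(f"{previous}-QAM -> {new}-QAM: +{relative_gain(previous, new):.0%}")
```

Each step adds two bits per symbol, so the relative gain shrinks from 2/6 to 2/8 to 2/10.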
2) 320 MHz
A much higher gain is possible with doubling the bandwidth. Recently opened for ISM usage, the 6 GHz band brings hundreds of MHz available to Wi-Fi. To exploit these frequencies, TGbe introduces as wide as 320 MHz channels, which can double the maximal nominal throughput with respect to 11ax. Moreover, this feature improves real data rates if the distance between the transmitter and the receiver is moderate, as the achievable rate linearly increases with the bandwidth, while the effect from twice smaller SNR is logarithmic.
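The last argument can be made concrete with the Shannon capacity formula. The sketch below is an idealized illustration, not a model of a real Wi-Fi link: it assumes an additive white Gaussian noise channel and a fixed transmit power, so that doubling the bandwidth doubles the noise power and halves the SNR. The function names are ours.

```python
from math import log2, log10


def capacity_mbps(bandwidth_mhz: float, snr_db: float) -> float:
    """Shannon capacity of an AWGN channel in Mbps."""
    return bandwidth_mhz * log2(1 + 10 ** (snr_db / 10))


def gain_from_doubling(snr_db: float, bandwidth_mhz: float = 160.0) -> float:
    """Capacity ratio after doubling the bandwidth at fixed transmit power."""
    narrow = capacity_mbps(bandwidth_mhz, snr_db)
    halved_snr_db = snr_db - 10 * log10(2)   # about 3 dB lower SNR
    wide = capacity_mbps(2 * bandwidth_mhz, halved_snr_db)
    return wide / narrow


for snr_db in (0, 10, 20, 30):
    print(f"SNR {snr_db:2d} dB: capacity grows x{gain_from_doubling(snr_db):.2f}")
```

In this model the gain is always above 1 and approaches 2 as the SNR grows, which matches the observation that the wider channel pays off most at short and moderate distances.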
By now, it is approved that in addition to 320 MHz channels, 11be supports 160 + 160 MHz channels [27], [28] that are formed by two non-adjacent 160 MHz channels similar to 80 + 80 MHz channels of 11ac. Non-contiguous bandwidth facilitates the coexistence of neighboring networks and provides a high bandwidth if no contiguous spectrum is available. Moreover, the technology will enable 240/160 + 80 MHz channels [29].
When a wide bandwidth is used, the legacy preamble is duplicated every 20 MHz. A simple duplication may cause a high peak-to-average power ratio in the preamble. To reduce it, previous amendments rotate duplicated parts by multiplying them by ±1 or ±j. The EHT part will use the same idea [27], [30], but the exact method has not been agreed yet.
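Why the rotation helps can be seen from a simplified model. If the same 20 MHz block is placed on M adjacent, equally spaced frequency segments, the resulting time-domain waveform equals the waveform of one block multiplied by a periodic combining factor. The sketch below evaluates only this factor. The rotation pattern (+1, -1, -1, -1) is used as an assumed example for four segments, and the function names are ours; the peak-to-average power ratio of a complete preamble also depends on the base waveform itself.

```python
import cmath


def combining_factor(rotations, n):
    """Factor multiplying the base waveform at time sample n when
    len(rotations) rotated, frequency-shifted copies of it are summed."""
    segments = len(rotations)
    return sum(r * cmath.exp(2j * cmath.pi * m * n / segments)
               for m, r in enumerate(rotations))


def factor_papr(rotations):
    """Peak-to-average power ratio of the combining factor over one period."""
    powers = [abs(combining_factor(rotations, n)) ** 2
              for n in range(len(rotations))]
    return max(powers) / (sum(powers) / len(powers))


plain = [1, 1, 1, 1]        # simple duplication
rotated = [1, -1, -1, -1]   # assumed example rotation pattern

print(f"plain duplication: factor PAPR = {factor_papr(plain):.1f}")
print(f"with rotation:     factor PAPR = {factor_papr(rotated):.1f}")
```

With plain duplication all the power is concentrated in every fourth sample, while with the rotation the factor has a constant magnitude, so the duplication itself adds no peaks.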
Since operation in such wide channels may be inefficient in dense deployments, with frequency-selective fading, and by power-limited devices, together with wider channels, TGbe considers band aggregation, i.e., joint usage of several links established at different frequencies. Obviously, with this approach, the total channel width may exceed 320 MHz, which results in extremely high total throughput. This feature is described in detail in Section III-D.

3) MU-MIMO
Spatial multiplexing gain has been a key technology driver for 802.11 in the last few standards cycles. It substantially improves spectrum efficiency. Continuing this trend, 11be will support MU-MIMO with a total of 16 SSs across all the scheduled STAs [31]. This improvement will double throughput [32]. Although, in theory, various SSs can have different capacities, 11be will likely not take advantage of it: all SSs to one STA will use the same modulation and coding scheme (MCS) to reduce implementation complexity. One more problem acute for a high number of SSs is the sounding overhead, which is discussed in Section III-E.

4) PHY FRAME FORMAT
To support all new PHY features, 11be needs to modify frame formats. Many changes are related to the PHY preamble of the frames (see Fig. 2). For backward-compatibility, in the 5 GHz and 6 GHz bands, all Wi-Fi frames start with the legacy preamble of 11a. The legacy preamble contains a short training field and a long training field used for frame detection and receiver synchronization. The next OFDM symbol carries the legacy signal field (L-SIG) that tells which MCS is used for subsequent signal and what is the frame length. Wi-Fi does not have an explicit way to indicate the PHY protocol version. In all Wi-Fi versions beyond 11n, the MCS and frame length indicated in L-SIG are fake, but legacy devices can calculate frame duration. Thus, they consider the channel as busy when the frame is in the air. The real values of MCS, frame size, and other parameters are transmitted in the following symbols according to a particular version.
To differentiate frame formats, 11n, 11ac, and 11ax manipulate the modulation of the OFDM symbols following L-SIG as well as the content of L-SIG (see Fig. 3). Specifically, in 11n, L-SIG is followed by an HT-SIG field, consisting of two OFDM symbols that are BPSK with 90 degrees rotation (QBPSK). As QBPSK is not used in previous Wi-Fi amendments, having received such symbols, a device understands that the format of this frame is described in the 11n amendment. 11ac has two VHT-SIG-A symbols after L-SIG: the first one is modulated with BPSK, while the second one uses QBPSK. Also, the length indicated in L-SIG is a multiple of three. As for 11ax, first, it repeats L-SIG and indicates the length equal to one or two modulo three. Second, its High Efficiency (HE) signal field contains two OFDM symbols. The first one is modulated with BPSK, while the second one is modulated with either BPSK or QBPSK. The result of the modulo operation combined with BPSK/QBPSK selection identifies one of the four 11ax frame types.
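The implicit detection rules described above can be summarized as a small decision procedure. The sketch below is a simplified illustration rather than a receiver implementation: the function name, its arguments, and the returned labels are hypothetical, and the branch for an L-SIG length divisible by three with a repeated L-SIG relies on our assumption that 11be and later frames also repeat L-SIG.

```python
def classify_preamble(lsig_length, lsig_repeated, sig_mods):
    """Guess the frame format from the cues described in the text.

    lsig_length   -- length value carried in L-SIG
    lsig_repeated -- True if L-SIG is followed by a repetition of itself
    sig_mods      -- modulations ("BPSK" or "QBPSK") of the two signal-field
                     symbols that follow L-SIG (or its repetition)
    """
    first, second = sig_mods
    if lsig_repeated:
        remainder = lsig_length % 3
        if remainder == 0:
            # Assumption: 11be and later also repeat L-SIG; the text only
            # says that their L-SIG length is divisible by three.
            return "11be or later"
        # The remainder (1 or 2) and the second symbol select one of four types.
        return f"11ax (length mod 3 = {remainder}, second symbol {second})"
    if first == "QBPSK" and second == "QBPSK":
        return "11n"
    if first == "BPSK" and second == "QBPSK" and lsig_length % 3 == 0:
        return "11ac"
    return "legacy or unknown"


print(classify_preamble(120, False, ("QBPSK", "QBPSK")))
print(classify_preamble(120, False, ("BPSK", "QBPSK")))
print(classify_preamble(121, True, ("BPSK", "BPSK")))
```

The number of special cases in even this simplified procedure illustrates why implicit indication scales poorly as new PHY versions are added.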
The frame formats of 11be and later amendments will use the L-SIG length divisible by three. Additionally, the developers of 11be decided to stop the bad practice of implicit indication of the frame formats and introduced a two-OFDM-symbol long universal SIG (U-SIG), which would provide forward compatibility [33]. This novel paradigm provides a long-term mechanism for frame type detection and future modifications of the frame formats. U-SIG contains version-independent information, followed by version-dependent information [33], [34]. Version-independent information includes a three-bit PHY identifier [35], one-bit UL/DL flag, Basic Service Set (BSS) color, transmission (TX) opportunity (TXOP) duration [34], bandwidth [36], etc. Version-dependent information will likely include similar information as in the HE signal field (e.g., the number of EHT long training field symbols, midamble periodicity, and space-time block coding flag) and some information for 11be features.
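To illustrate the idea of an explicit, version-independent header, the sketch below packs the fields listed above into a bit field and reads them back. Only the widths of the PHY identifier (three bits) and the UL/DL flag (one bit) come from the text; the widths of the other fields and the field order are hypothetical placeholders, since the exact layout is defined by the standard and not by this paper.

```python
# (name, width in bits). Widths of bss_color, txop, and bandwidth are
# hypothetical placeholders, as is the order of the fields.
VERSION_INDEPENDENT_FIELDS = [
    ("phy_id", 3),
    ("ul_dl", 1),
    ("bss_color", 6),
    ("txop", 7),
    ("bandwidth", 3),
]


def pack_fields(values: dict) -> int:
    """Pack field values into one integer, first field in the lowest bits."""
    word, shift = 0, 0
    for name, width in VERSION_INDEPENDENT_FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit into {width} bits")
        word |= value << shift
        shift += width
    return word


def unpack_fields(word: int) -> dict:
    """Inverse of pack_fields."""
    values, shift = {}, 0
    for name, width in VERSION_INDEPENDENT_FIELDS:
        values[name] = (word >> shift) & ((1 << width) - 1)
        shift += width
    return values


header = {"phy_id": 0, "ul_dl": 1, "bss_color": 42, "txop": 100, "bandwidth": 5}
print(unpack_fields(pack_fields(header)) == header)
```

A receiver built for a later amendment could read the PHY identifier first and only then decide how to parse the version-dependent part, which is the essence of forward compatibility.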
The next EHT-SIG field stores information not included in U-SIG but needed for new 11be features. To accommodate all the mentioned information, EHT-SIG can use its own MCS (different from data MCS) and can occupy a variable number of symbols [37], [38], which is indicated in U-SIG [39]. The EHT-SIG field consists of the common field and a user-specific field [38]. The common field contains information about MCS, the number of space-time streams, coding, the duration of the guard interval, and RU allocation, etc. [39]. User-specific fields are present for MU frames and carry dedicated information for individual STAs. In the case of UL MU transmissions, the Trigger frame defines the information carried in the EHT-SIG field. To avoid unnecessary duplication of the same information, EHT-SIG is omitted in this case.
EHT short training field (STF) and EHT long training field (LTF) follow EHT-SIG and, similar to HE analogs, serve for the fine time and frequency tuning when MIMO/OFDMA is used. 11be inherits longer variants of STF and LTF from 11ax, which are favorable for extended range and better channel estimation [40], [41]. If a frame is transmitted in a wide bandwidth, EHT-STF and EHT-LTF are repeated every 20 MHz. The phase of every 20 MHz copy is rotated to reduce the peak-to-average power ratio and enhance correlation performance [42]. New 320 MHz channels require a new phase rotation design that considers the 320 MHz tone plan and possible puncturing.

5) OPEN ISSUES OF THE EHT PHY
In addition to the issues mentioned above, there are plenty of questions to solve. Having increased the nominal data rates, the 11be amendment shall provide some means to use them efficiently. It is not clear whether high data rates will be used in real deployments where the STAs are located at different distances from the AP and have different capabilities. Moreover, new PHY brings many challenges. Twice wider spectrum and the doubled number of SSs complicate and enlarge the channel sounding procedure. So the channel sounding overhead exceeds the typical duration of data transmission. This problem is discussed in detail in Section III-E.2.
Apart from that, by now, it is unclear whether the 320 MHz channels will provide gains in dense deployments. To address the problem of frequency selective interference and fading, the standard developers should make OFDMA more flexible than it is in 11ax, see Section III-C. Also, in addition to wide channels, TGbe shall consider multi-link aggregation described in Section III-D.
As in previous amendments, the new PHY format raises issues. The new EHT preamble is longer than that in the previous Wi-Fi versions, e.g., in 11a. Despite increased data rates, the usage of the EHT preamble for short frames is hardly fruitful. So the vendors shall determine which frames shall be transmitted with which preamble.
The gain of multiple antennas can only be achieved with advanced scheduler algorithms. Because of the low costs of Wi-Fi equipment and the unlicensed spectrum, these algorithms shall have low computational complexity and high robustness in scenarios with variable interference. Although the optimal scheduling problem is beyond the scope of the standard, this topic is fruitful for the academic community.
These features increase the nominal gains manifold, but this fact alone does not prove high gains for users. We need to test the new PHY performance in real scenarios. These tests are essential because they can shed light on previously unnoticed effects. For example, we need to test how the EHT PHY performs in scenarios where the STAs are located at different distances from the AP. Another example is scenarios with the moving STAs, where 16 SSs can suffer. We should also consider a typical scenario where the number of antennas at the STAs is limited. Such tests are essential to fix potential problems early and to confirm practical gains for customers.

B. EDCA WITH 802 TSN FEATURES
1) LESSONS FROM IEEE 802 TSN
High nominal data rates are not enough to support RTA in Wi-Fi networks. Channel access may still last for a long time, which increases latency. That is why many MAC solutions are proposed for RTA. The majority of them are not brand new, being initially proposed for wired networks and widely known as TSN.
IEEE 802 has a significant background in TSN with its TSN Task Group [25] and the corresponding standards developed for Ethernet. However, many of the 802 TSN solutions are not directly applicable for Wi-Fi, because Wi-Fi works in unlicensed channels and uses random access, so it is hardly possible to guarantee the required performance in terms of latency and reliability.
To address this issue, TGbe distinguishes two kinds of operation scenarios. The unmanaged operation scenario represents a typical Wi-Fi hotspot, a home or an office network with interference and contention for the channel. RTA-aware solutions can improve throughput, latency, and jitter, but they cannot guarantee any exact values because the overall performance is limited by interference. In the managed operation scenario, all BSSs and STAs are managed, and interference can be controlled to support time-sensitive requirements. Such networks can be designed for factories and enterprises. Under the assumption that there is no unmanaged interference, the network can provide predictable low latency, jitter, and high reliability.
To enable RTA in Wi-Fi networks, the RTA TIG has studied various approaches originating from 802 TSN. One of them is the ability of the network to detect the type of operation scenario. Depending on the scenario, the set of used solutions may vary.
Another TSN feature is based on the ability to stop the ongoing transmission of a long delay-tolerant packet when an urgent packet arrives. If the long packet is transmitted by the same device that needs to transmit the urgent one, this feature can be easily implemented. If the devices are different, one of them shall ask the other one to stop the ongoing transmission. However, the link is busy, and no control packets can be sent explicitly. To address this challenge in Ethernet, collision detection can be used. In other words, the device with an urgent packet generates a signal to induce the collision at the long packet sender. Having detected the collision, the long packet sender stops its transmission, and the device with the urgent frame can access the channel.
In Wi-Fi networks, a device cannot sense the channel while transmitting. So this method cannot be directly applied to Wi-Fi. A possible workaround is using a busy tone signal in the main or separate channel, which is sent when a device asks the other ones to stop ongoing transmissions. However, the efficiency of this approach depends on whether the interfering devices support this busy tone and process it correctly [22], [43].
802 TSN widely uses scheduled transmissions, which effectively improve the worst-case latency [44], [45]. In Wi-Fi networks, RTA-aware deterministic scheduling could be implemented with Hybrid Coordination Function Controlled Channel Access, which is too heavy and hardly used in practice. Another approach is to use 11ax trigger frames [4] to allocate periodic time resources for time-sensitive frames [46], [47]. This scheme can be improved by using persistent UL allocation [48]. With this method, resource allocation information is unchanged for some time. Hence, the length of trigger frames can be shortened to reduce overhead [49]. However, in Wi-Fi, the trigger-based channel access works on top of Carrier Sense Multiple Access with Collision Avoidance, and nobody can guarantee that the AP will be able to transmit the trigger frame in case of congestion. Moreover, such transmissions are not protected from interference from neighboring networks.
Real-time traffic is very sensitive to interference and congestion, which can increase delays in the network. Such an issue can be solved with admission control [50]. Since neighboring networks share the same channel time, to make admission control efficient, all the devices shall support this feature and follow the same rules.

2) LATENCY ANALYSIS FOR EHT
The authors of the submission [51] make a simplified network simulation with a single BSS, constant MCS and RU size. This submission gives a very helpful breakdown of the latency components (see Fig. 4): packet scheduling time, channel contention time, transmission, and retransmission times. The study reveals that a packet spends more time in contention or retransmissions in EDCA compared with OFDMA. However, if the AP sends packets to many STAs in OFDMA, the scheduling time can take up to 80% of the total latency, leading to deadline expiration. In the case of few STAs, DL OFDMA needs much time for actual transmission, so larger RUs can help transmit data quicker. UL OFDMA latency behaves differently than DL: scheduling time and transmission time do not vary much as the number of STAs grows (until the traffic is saturated). Instead, the TXOP duration plays a major role. Notably, UL OFDMA has no collision and contention in the considered scenario, which makes scheduling the dominating latency component.
In a later work [52], the same authors improve their simulation. They also have adopted deterministic service and bounded latency traffic. Deterministic service [53] means that packets are never lost due to congestion. The authors have studied how well the 11ax OFDMA system can operate as a deterministic service in a residential scenario. For the scenario of real-time streaming (data rate 150 Mbps, frame rate 120 Hz), the authors observe that the network struggles to maintain the 10 ms latency boundary, even though the throughput is above 340 Mbps. For the case of cloud gaming (20 Mbps for DL, 100 kbps for UL), UL traffic is much more vulnerable to the overlapping BSS (OBSS) environment than DL regardless of the data rate. Supporting four STAs is only possible without OBSS interference. A shorter trigger frame duration can improve UL latency, but it is not effective in a dense environment.

3) EDCA IMPROVEMENTS
To provide RTA support, it is highly important to limit the worst-case latency instead of the average one. If the RTA packets are small and not frequent, the required worst-case latency can be achieved in Wi-Fi networks without packet losses [54]. An example of such traffic is real-time gaming. Wi-Fi is known to be bad for gaming: online console games suffer from lags and high ping time. A reason for these effects could be a streaming device used in the same network. A simulation study [55] has shown that a video stream generates considerably more packets and requires much more throughput compared with gaming traffic. The key problem of such scenarios is that the video traffic is prioritized over the gaming traffic, which is mapped to the Best Effort AC [56].
To solve this problem, EDCA can be upgraded to reuse an existing Alternative Voice (A-VO) AC queue for RTA traffic or to introduce new ACs [57], as was done in 11aa [58]. Such RTA queues can appear and disappear dynamically, adapting to the changes in the channel and traffic. The authors of [54] suggest speeding up the backoff countdown if the RTA frame lifetime is about to expire. If the AP operates in a mixed environment and RTA traffic has different requirements, it would be fruitful if EDCA could prioritize packets by their remaining lifetime or by other parameters.
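The idea of ordering RTA packets by their remaining lifetime can be illustrated with an earliest-deadline-first queue. The sketch below is only an illustration under our own assumptions and is not a mechanism taken from any TGbe submission; the class `LifetimeQueue` and its methods are hypothetical names.

```python
import heapq
import itertools


class LifetimeQueue:
    """Hypothetical RTA queue that serves the packet closest to expiry first."""

    def __init__(self):
        self._heap = []
        # Tie-breaker: packets with equal deadlines leave in arrival order.
        self._order = itertools.count()

    def push(self, packet_id, arrival_ms, lifetime_ms):
        deadline = arrival_ms + lifetime_ms
        heapq.heappush(self._heap, (deadline, next(self._order), packet_id))

    def pop(self, now_ms):
        """Return the most urgent unexpired packet id, or None if none is left."""
        while self._heap:
            deadline, _, packet_id = heapq.heappop(self._heap)
            if deadline >= now_ms:
                return packet_id
            # Otherwise the packet has already expired and is dropped.
        return None
```

In this sketch, a gaming packet with a 10 ms lifetime is served before a video packet with a 100 ms lifetime even if the video packet was queued earlier, and packets whose lifetime has already expired are discarded instead of occupying the channel.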
Another useful tool for worst-case latency reduction is persistent channel allocation. The authors of [59] improve this tool to further reduce channel access delay. If RTA traffic is small and periodic, a STA can predict the next packet arrival and prepare for channel access beforehand. The STA starts counting backoff before the RTA packet is queued. Once backoff finishes, the STA can send the RTA packet if it arrives by this time. If it does not arrive and the expected arrival is soon, the STA can reserve the channel, e.g., with a null packet.
Also, TGbe discusses modifications of the TXOP rules. In legacy networks, having obtained a TXOP, a STA can send only data of one AC. 11be can allow using a TXOP of any AC to send RTA traffic as soon as possible. Moreover, an AP may temporarily capture TXOP ownership from any associated STA to deliver RTA traffic. The AP can also grant channel access to another STA if the AP knows that this STA has RTA traffic. After the RTA traffic is delivered, the AP returns the TXOP to the original TXOP owner.

4) OPEN ISSUES OF RTA-AWARE MAC
Wi-Fi networks operate in the unlicensed spectrum, which significantly complicates the support of RTA. Although there are many solutions designed for wired or cellular networks, they cannot be directly applied to Wi-Fi. The optimal choice of an appropriate set of solutions depends on the environment.
In an unmanaged network, if strict latency bounds cannot be set, the AP could at least measure and report the feasible latency and jitter parameters [60]. By prioritizing RTA traffic and introducing channel diversity with enhanced OFDMA (see Section III-C), MU-MIMO, and multi-link operation (see Section III-D), 11be aims at increasing the probability that RTA packets are delivered within a given delay budget. However, none of these methods can improve the worst-case latency.
At the same time, there are plenty of ideas that work in managed environments and improve the worst-case performance: busy tone to notify about low-latency traffic, trigger-based persistent scheduling, admission control, multi-AP cooperation (see Section III-G), etc. TGbe should identify the most profitable ideas that can be standardized with moderate effort but provide huge gains. To evaluate the proposed ideas, accurate tests should be conducted.
Besides, more evidence appears that in both managed and unmanaged deployments, scheduled access can be efficient for supporting RTA. Moreover, this approach is compliant with channel access regulations for unlicensed bands and robust against interference [45], [61]. However, the design of RTA-aware, backward-compatible, efficient scheduling techniques is still an open issue.

C. ENHANCED OFDMA
OFDMA in 11be uses the same scheduling approach as in 11ax [4]. An AP can initiate a DL MU transmission using OFDMA and/or MIMO, or a trigger-based UL MU transmission (see Fig. 5). A new degree of scheduling freedom in 11be is related to assigning multiple RUs to one STA. Besides, multi-link operation may require scheduling synchronization, as discussed in Section III-D.

1) PREAMBLE PUNCTURING
In 11ac networks, the STAs can adaptively select the bandwidth for each transmitted frame [3]. For that, the AP defines a hierarchy of embedded channels shown in Fig. 6. Having accessed the medium in the primary 20 MHz subchannel, a STA can expand the bandwidth by step-by-step concatenation of the secondary channels if they are idle. In other words, if the secondary 20 MHz channel is idle, the STA can transmit in 40 MHz bandwidth. If both the secondary 20 MHz and the secondary 40 MHz channels are idle, 80 MHz bandwidth can be used, etc. In contrast, even if the secondary 40 MHz channel is idle, but the secondary 20 MHz channel is busy, the STA can only transmit in the primary 20 MHz channel. This situation may happen if the secondary 20 MHz channel of a network is the primary 20 MHz channel of a neighboring network [4].
To avoid such underutilization of channel resources caused by the rigid channel bonding rules, 11ax introduces preamble puncturing. For an MU transmission in a ≥80 MHz channel, some busy ≥20 MHz subchannels can be punctured. It means that the frame preamble is not transmitted, and RUs are not allocated in these subchannels. In dense deployments, puncturing allows using channel resources in a much more flexible way [4]. TGbe extends the preamble puncturing to 320 MHz bands and also improves it. 11be enables puncturing for single user frames [62], which is not supported in 11ax. This extension improves channel utilization. The exact design for puncturing is actively discussed. One topic of discussion is that 11be will operate in 6 GHz, where other incumbent technologies are present. To avoid them, TGbe has considered additional puncturing options [63]. Moreover, a TXOP protection mechanism was discussed for the cases of wide channel frames with puncturing [64].
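The difference between the rigid bonding rule and puncturing can be sketched as follows. This is a simplified illustration under our own assumptions: the function names are hypothetical, the subchannels are listed in the order of the bonding hierarchy (primary 20 MHz first), and the puncturing model simply skips every busy 20 MHz subchannel, whereas the standard permits only specific puncturing patterns in ≥80 MHz channels.

```python
def bonded_width_mhz(idle):
    """Usable width under 11ac-style channel bonding.

    `idle` holds one boolean per 20 MHz subchannel, primary first, ordered so
    that idle[0:2] is the primary 40 MHz, idle[0:4] the primary 80 MHz, etc.
    """
    if not idle or not idle[0]:
        return 0  # the primary 20 MHz subchannel must be idle
    width = 1
    # Add the next secondary channel only if it is idle as a whole.
    while width * 2 <= len(idle) and all(idle[width:width * 2]):
        width *= 2
    return 20 * width


def punctured_width_mhz(idle):
    """Usable width if every busy 20 MHz subchannel is punctured (simplified)."""
    if not idle or not idle[0]:
        return 0
    return 20 * sum(idle)


# Busy secondary 20 MHz, idle secondary 40 MHz, as in the example above.
state = [True, False, True, True]
print(bonded_width_mhz(state), punctured_width_mhz(state))
```

For the state used in the last lines, bonding leaves only the primary 20 MHz subchannel, while puncturing the busy subchannel keeps 60 MHz usable.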

2) MULTI-RU
11ax has introduced OFDMA to Wi-Fi. In an OFDMA frame, tones are grouped into RUs (see Fig. 7). An 11ax AP can assign each STA only a single RU. The authors of [66] show that such a restriction leads to channel waste and network throughput degradation in some scenarios with a small number of STAs. As a first example, in a scenario with two STAs in an 80 MHz channel, if one 242-tone RU is assigned to one STA, the other STA can at most take a 484-tone RU, wasting 25% of the bandwidth. Second, if the AP has data for a single STA, and the AP punctures the secondary 20 MHz channel while the secondary 40 MHz channel is idle, the AP can assign only the primary 20 MHz channel to the STA. It means that the AP uses three times less bandwidth than it could. Third, one RU per STA hurts the diversity gain, which is fruitful for RTA.
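The arithmetic of the first example can be checked with a short sketch. The mapping of RU sizes to approximate bandwidth (a 242-tone RU occupies about 20 MHz, a 484-tone RU about 40 MHz, a 996-tone RU about 80 MHz) is the usual 11ax correspondence; the names `RU_MHZ` and `wasted_fraction` are hypothetical.

```python
# Approximate bandwidth occupied by large 11ax RUs, in MHz.
RU_MHZ = {"RU242": 20, "RU484": 40, "RU996": 80}


def wasted_fraction(channel_mhz, assigned_rus):
    """Share of the channel left unused by the given RU assignment."""
    used = sum(RU_MHZ[ru] for ru in assigned_rus)
    return (channel_mhz - used) / channel_mhz


# One RU per STA: STA 1 gets RU242, STA 2 gets at most RU484.
print(wasted_fraction(80, ["RU242", "RU484"]))           # 0.25
# Multi-RU: STA 2 additionally receives the remaining RU242.
print(wasted_fraction(80, ["RU242", "RU484", "RU242"]))  # 0.0
```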
For the mentioned reasons, 11be supports the assignment of multiple RUs per STA. The main issues here are how to reduce overhead and how to describe the set of RUs in the simplest way. In 11ax, each RU is described by a long list of parameters, such as MCS and the number of SSs. In the case of multiple RU assignment, the data is likely to be transmitted with the same parameters in all RUs. To simplify the receiver, the descriptions of the various RUs assigned to the same STA contain the same information. To compress such information, the authors of [67] propose to provide full information only for the first RU of a set of RUs assigned to a STA. For the rest of the RUs, the description contains only a reference to the first RU of the set. Recently, TGbe has discussed [18] that if the aggregated size of RUs assigned to a STA is ≤ 80 MHz, coding and interleaving shall be done jointly. However, if the size exceeds 80 MHz, they shall be done separately for each 80 MHz segment.
The major drawback of the multi-RU assignment is implementation and scheduling complexity. To address this issue, the authors of [65] propose to limit the set of possible RU combinations. Also, they believe that frequency diversity will provide rather small gains and prefer to use only those combinations of RUs that enhance the spectrum use. They propose to divide RUs into two groups: small RUs (<20 MHz) and large ones (≥20 MHz). Only RUs of the same group can be combined because, for example, combining a small RU with a large RU does not increase spectrum utilization much.
Following the described principles, submissions [65], [68], [69] offer possible RU combinations, filtering out the low-gain ones. For example, small RUs combinations shall not cross the boundaries of 20 MHz subchannels (maybe except for RU106 plus center RU26). For the OFDMA transmission in 320/160 + 160 MHz, RUs can be grouped only within one 160 MHz subchannel.

3) OFDMA WITH DIRECT LINK
In infrastructure Wi-Fi networks, if a STA has data for another STA, associated with the same AP, the data exchange is typically done via the AP. Such two-hop transmission leads to channel waste if the STAs are in the transmission range of each other. 11e has enabled direct links between the STAs, so the neighboring STAs can transmit the data directly to each other. TGbe has agreed to design a method of how the AP can dedicate channel resources for direct link operation, though the exact method is under development.
Introduced in 11ax, OFDMA supports only UL and DL transmissions, and TGbe can extend OFDMA to support direct links. This extension will help to avoid collisions between two peer communicating STAs and a nearby BSS [70]. For that, the AP allocates dedicated RUs for direct links. Having received a data frame in a particular RU via a direct link, the STA acknowledges it in the same RU. It is not decided yet, but the acknowledgment (ACK) is likely sent with the same transmission parameters as the data frame. The submission [71] does not imply using OFDMA, but it proposes that an AP can send a Trigger frame to initiate transmission via a direct link between two STAs.

4) OFDMA FOR RTA
The authors of [58] emphasize that OFDMA is a powerful tool for supporting delay-sensitive traffic because the AP can centrally manage DL and UL transmission. However, the current OFDMA may need more enhancement if we want to support extremely low-latency traffic. As evaluated in [52], DL and UL transmissions are vulnerable to OBSS interference. It occurs at a random time, causes collisions, and defers channel access, so the latency grows even in the case of high average SNR. Hence, the network can support the demands of only a few STAs. So we need to focus on the following issues.
First, to allocate some RU for a STA, the AP should know that the STA has some urgent data to be delivered to the AP. Apart from that, for the optimal scheduling of channel resources, it is important to know the traffic parameters, packets' remaining lifetime, etc.
Second, OFDMA of 11ax allows allocating RU for random access either for all STAs or for the STAs that try to associate with the AP. To improve random access for RTA, it is worth allocating RU for RTA packets only.
The paper [72] introduces OFDMA random access for a dynamic set of STAs. Skipping the usage of MU-MIMO, in 11ax, an AP can assign an RU either to one STA or to all STAs that share this RU with random access. The authors of [72] show that this approach is fruitful for massive RTA. They modify OFDMA random access so that only the STAs that have RTA UL traffic will transmit in the dedicated RU. Such a modification speeds up the collision resolution process and can increase the number of RTA STAs in the network by 50%.

5) OPEN ISSUES OF OFDMA
There are many open issues related to enhanced OFDMA. First, although flexible preamble puncturing allows usage of wider bands, it may lead to spectrum fragmentation. The pros and cons of such an approach shall be studied in various scenarios.
Second, the efficient usage of OFDMA requires rethinking the scheduling policies implemented in Wi-Fi devices. In Wi-Fi networks, RUs form a complex structure, so the resource allocation algorithms are not as straightforward as in LTE. However, by now, only a few papers consider the problem of resource allocation in Wi-Fi 6 networks [73]-[76]. Remarkably, these papers reveal important effects relevant to scheduling in Wi-Fi networks. With much more flexible RU allocation in Wi-Fi 7, the scheduling problem becomes more challenging. Since the resource allocation policies are left out of the scope of the standard, these open issues require much attention from the vendors and the academic community.
Finally, resource allocation shall depend on the types of traffic to be delivered. Heavy delay-tolerant flows shall be delivered within the RUs with the highest spectrum efficiency, while RTA packets require bounded delays.
To improve the quality of experience for users, 802.11 shall consider the exact required values of the QoS parameters.
Until now, the 802.11 standard has considered the PHY and MAC layers separately from the rest of the protocol stack. Thus, it cannot exploit the full potential of cross-layer cooperation, as presented in [77], [78]. Considering RTA may be a trigger for such changes in the 802.11 ideology.

D. MULTI-LINK OPERATION

1) LEGACY APPROACHES FOR WIDE SPECTRUM USAGE
Wider and wider channels available in the Wi-Fi technology provide higher throughput and reduce delays. However, the usage of wide channels is not efficient for the following reasons. First, even with OFDMA, all the transmissions at different subbands are fully synchronized. Second, the channel access is mostly controlled by the primary 20 MHz subchannel. It means that the whole wide channel is blocked if the primary channel is busy, even when the rest of the subchannels are idle. Third, operation in a wide channel consumes more power, which is critical for mobile devices. Fourth, as the channel width grows, so does the number of OFDM tones and, consequently, the peak-to-average power ratio. The latter effect has been significantly aggravated by a 4x increase in the number of tones in 11ax [4]. Fifth, various parts of a wide channel may have different properties and interference levels, so they may require different parameters of the channel access and other mechanisms. Finally, if a device supports only single-channel operation, the vendors cannot indicate on the box a higher throughput than is defined in the standard. However, if a device could use several channels, the total throughput could be a multiple of the standard limit.
That is why, in addition to wide channels, many modern off-the-shelf APs support dual- or tri-band operation. In these APs, the Wi-Fi MAC and PHY of various bands work almost independently and provide multiple independent links to STAs [79], [80]. For example, in 11ad, multi-band is used to protect transmission reliability when the line-of-sight mmWave link becomes non-line-of-sight because of a sudden high attenuation, e.g., when a person appears between the transmitting and receiving devices. In such cases, Fast Session Transfer provides a seamless transition between different channels.
Given the data transmission methods in wide channels described in Section III-C.1 at one end and independent multiple band operation at the other one, the developers of 11be try to find such a level of synchronization between various links that provides high spectrum efficiency, low delays, and low power consumption. The designed solution significantly extends the multi-band functionality of 11ad/ay. To achieve better performance, 11be will allow sending packets concurrently on multiple channels. Channels can occupy either different bands or even the same band. The latter case can enjoy the multi-link benefits in one band unavailable for a simple non-contiguous wide spectrum, like asynchronous channel access, power save mode, etc. The multi-link operation can aggregate a varying number of links of different widths, e.g., 160 MHz + 20 MHz.

2) MULTI-LINK ARCHITECTURE
11be introduces the concept of a Multi-Link Device (MLD) (see Fig. 8), which consists of several so-called affiliated Wi-Fi devices (each has a PHY interface to the wireless medium), but with a single interface to the LLC layer. In other words, the upper-layer protocols consider the MLD as a single device [82]. Despite having multiple physical radio interfaces, an MLD has a single MAC address [82], and the sequence numbers are generated uniquely from the same sequence number space [83]. This solution simplifies fragment and packet reassembly, duplicate detection, and dynamic link switching. It also allows packet retransmission on any link regardless of the link of the initial transmission of the packet.
TGbe discusses that establishing a connection (in terms of Wi-Fi standards, association and authentication) with an MLD on various affiliated devices may occur independently or jointly [84]. In the latter case, all links' capabilities should be explicitly indicated, as they may vary between the links. For example, as we see further, MLDs should exchange their capabilities to transmit and receive on the links simultaneously [85].
11be has discussed two modes of multi-link operation, referred to as restricted and dynamic link switch [86]. In the restricted mode, data frames and ACKs are bound to one link. Management exchanges transmitted over one link, such as related to power save mode, security key negotiation, Block ACK (BA) negotiation, etc., apply only to this link. It is a simple scheme of multiple independent links with enabled aggregation. In the dynamic link switch mode, multiple links can be used for transmission of the same flow. Management information and negotiations sent over one link can apply to other links. This mode enables load balancing and congestion avoidance. It also improves peak throughput and reduces latency [87], [88], overhead, and power consumption. However, this mode requires reconsidering the protocol limitations of the mentioned mechanisms. For example, it is suggested to increase the size of BA bitmap indicating the subset of received frames [18].

3) MULTI-LINK CHANNEL ACCESS
An important advantage of the multi-link operation in comparison with the single extra-wide channel is the ability of an MLD to perform channel access and transmit data via multiple links asynchronously, as in Fig. 9. Such MLDs can do simultaneous transmission and reception in different bands, i.e., 2.4/5/6 GHz [89]. The power leakage between the bands is minimal because the spectral distance between them is high. However, since affiliated devices within an MLD share the same antennas or antennas are located in the very neighborhood of each other, they may interfere if links are in the same band. The spectrum mask of a Wi-Fi signal is not ideal, the signal strength on the transmitter is much higher than that on the receiver, and the interference between the affiliated devices can be significant even if they use different channels. The closer the channels of the affiliated devices of an MLD are, the stronger the power leakage from a transmitting affiliated device to the others is. Such interference complicates the simultaneous transmission and reception capability.
To address this issue, in addition to asynchronous multi-link operation, synchronous transmissions are proposed (see Fig. 9). Synchronous multi-link operation avoids this problem at the cost of reduced throughput caused by less frequent channel access [90].
Another potential solution for cross-device interference is to forbid transmissions to an MLD while that MLD is itself transmitting. For example, if an MLD transmits over one link, it cannot receive any frames (e.g., BAs) at another link. Thus, to receive the BA successfully, the MLD should stop transmission in neighboring bands [91].

4) MULTI-LINK POWER SAVE
The concept of an MLD usually implies that it has at least two embedded devices. In the common case, together, they consume twice as much energy as a single device. The issue is especially important for battery-powered devices, which do not need to constantly listen to more than one link if their traffic is light. Hence, a power management mechanism for MLDs is required.
The basic power management mechanism works as follows. A Wi-Fi device can operate in two modes: active and power save (PS). In the active mode, the device is always awake and can transmit and receive frames. In PS, it can switch off its radio from time to time. When the radio is off, i.e., the device is in the doze state, it can neither transmit nor receive.
In infrastructure networks, a STA shall notify the AP before changing the operation mode. If the STA is in the PS mode, the AP buffers all frames destined for this STA. To notify PS STAs about the buffered packets, the AP includes in each beacon a Traffic Indication Map (TIM) that indicates the presence of packets destined for each STA. Every PS STA periodically wakes up to receive beacons. If the beacon says that no buffered packets are destined for the STA, the STA returns to the doze state right after the beacon. Otherwise, the STA sends a PS-Poll frame. As a response to the PS-Poll, the AP sends buffered frames.
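The beacon-based procedure described above can be sketched as follows. This is a toy model under our own assumptions, not an implementation of the standard: the TIM is represented as a set of STA identifiers, each PS-Poll retrieves one buffered frame, and the names `AccessPoint` and `wake_up_for_beacon` are hypothetical.

```python
from collections import defaultdict, deque


class AccessPoint:
    """Toy AP that buffers frames for STAs in the PS mode."""

    def __init__(self):
        self._buffers = defaultdict(deque)

    def buffer_frame(self, sta_id, frame):
        self._buffers[sta_id].append(frame)

    def tim(self):
        """STAs for which the beacon indicates buffered frames."""
        return {sta_id for sta_id, queue in self._buffers.items() if queue}

    def ps_poll(self, sta_id):
        """Answer a PS-Poll with one buffered frame, or None if there is none."""
        queue = self._buffers[sta_id]
        return queue.popleft() if queue else None


def wake_up_for_beacon(ap, sta_id):
    """A PS STA wakes up, checks the TIM, and polls until its buffer is empty."""
    received = []
    if sta_id in ap.tim():
        frame = ap.ps_poll(sta_id)
        while frame is not None:
            received.append(frame)
            frame = ap.ps_poll(sta_id)
    return received  # the STA returns to the doze state afterwards
```

A STA that is absent from the TIM receives nothing and can return to the doze state right after the beacon.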
The most flexible power saving mechanism designed to date in Wi-Fi is Target Wakeup Time (TWT). While a detailed description of TWT in 802.11ah and 802.11ax can be found in [4], [5], in this paper, we briefly mention its main peculiarities. TWT allows a STA to negotiate with the AP the moments when the STA wakes up for some time (called TWT service periods) and exchanges frames with the AP. With TWT, the STA can stay in the doze state at all times except for the negotiated service periods and does not need to wake up for beacons anymore, which reduces energy consumption significantly.
With the multi-link framework, the described power management mechanisms can work independently at various links [92]. However, much better performance will be achieved if the MLDs can exchange power management information about one link via another one. For example, a STA can notify the AP via a link that another link is switched on. Multiple TWT transmissions over various links can be scheduled via one of them [92].
Depending on the traffic, channel loads, and interference at different links, the AP can command the associated STAs to switch off the links that will not be used for data transmission. The authors of [93] propose to dedicate a so-called anchor link for management and groupcast traffic. All other links can be disabled in the absence of intensive traffic [94]. In this case, the multi-link AP may use the anchor links to wake up the other ones when needed. Various multi-link STAs associated with a multi-link AP may use different channels for anchor links. A simple example of [94] shows a ≈ 35% reduction in power consumption for this scheme with three available links.

5) MULTI-LINK OPERATION FOR RTA
The multi-link operation is considered to be a prospective approach to enhanced reliability and reduced latency in TGbe [95]- [97] thanks to channel diversity. The report of RTA TIG [98] evaluates two modes for multi-link operation: Duplicate Mode and Joint Mode (see Fig. 10). In the Duplicate Mode, the transmitter sends copies of each frame over multiple links. Once the receiver obtains a frame, it drops all its copies that are delivered later. Such a scheme notably increases the robustness of the transmission. In the Joint mode, the transmitter produces no copies but distributes frames over available links. This mode reduces transmission latency.
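The two modes can be contrasted with a minimal sketch. It rests on our own assumptions: frames are identified by sequence numbers drawn from a single space shared by all links, the Joint Mode distributes frames round-robin (the report does not prescribe a distribution rule), and all function names are hypothetical.

```python
def duplicate_mode(seq_numbers, n_links):
    """Every link carries a copy of every frame."""
    return [list(seq_numbers) for _ in range(n_links)]


def joint_mode(seq_numbers, n_links):
    """Frames are split over the links without copies (round-robin here)."""
    return [list(seq_numbers[i::n_links]) for i in range(n_links)]


def receive(per_link_frames):
    """Keep the first copy of each sequence number and drop later copies."""
    seen = set()
    for link in per_link_frames:
        for seq in link:
            seen.add(seq)  # adding an already seen number has no effect
    return sorted(seen)


frames = [1, 2, 3]
print(duplicate_mode(frames, 2))  # [[1, 2, 3], [1, 2, 3]]
print(joint_mode(frames, 2))      # [[1, 3], [2]]
```

In both cases the receiver reconstructs the same set of frames; the Duplicate Mode spends twice the airtime to tolerate the loss of either link, while the Joint Mode halves the load per link.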
The authors of [99] propose a Conditional Packet Duplication mode. With this mode, an MLD initially tries to deliver a frame only via one link. If within some time interval, it does not succeed, it replicates the packet and tries to deliver it via other links with the highest priority. The time interval should be chosen concerning the packet delay budget, say, 60% of it, as proposed in [99]. Once the packet is delivered via a link, its copies are dropped.
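The decision logic of this mode can be sketched as follows. The sketch is a simplification under our own assumptions and is not the algorithm of [99] itself: per-link delivery delays are given as inputs, copies are queued on all backup links exactly when the waiting interval elapses, and the function name and parameters are hypothetical.

```python
def conditional_duplication(link_delays_ms, delay_budget_ms, wait_fraction=0.6):
    """Return (delivery_time_ms or None, duplicated) for a single packet.

    link_delays_ms[0] is the delay on the initial link. The remaining entries
    are delays on the backup links, counted from the moment a copy is queued.
    """
    threshold = wait_fraction * delay_budget_ms
    first = link_delays_ms[0]
    if first <= threshold:
        return first, False  # delivered before duplication is triggered
    # The original keeps competing with the copies sent at the threshold.
    candidates = [first] + [threshold + d for d in link_delays_ms[1:]]
    best = min(candidates)
    return (best if best <= delay_budget_ms else None), True


print(conditional_duplication([3, 10], delay_budget_ms=10))  # (3, False)
print(conditional_duplication([20, 2], delay_budget_ms=10))  # (8.0, True)
print(conditional_duplication([20, 5], delay_budget_ms=10))  # (None, True)
```

The third call shows the trade-off behind the choice of the waiting interval: with 60% of the budget already spent, a backup link that needs more than the remaining 40% cannot save the packet.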

6) OPEN ISSUES OF MULTI-LINK
In Wi-Fi networks, the multi-link operation may significantly improve throughput, latency, and reliability. The efficient usage of multiple links requires tight coordination between them.
The multi-link operation raises many issues related to both asynchronous and synchronous channel access. The asynchronous operation needs optimization to reduce worst-case latency and enhance the reliability of the RTA data transmission. Some ideas exist [99], but their usability should be verified with a set of tests in practical scenarios.
How to organize the synchronous channel access is an open issue. There could be a primary link that performs channel access, or all links could participate concurrently. The first option provides rare channel access, whereas the second option has a potential fairness issue. It is unclear how to find such an intermediate design that provides reasonable gain compared with the simple wide-band operation and has sufficiently fast channel access.
Both synchronous and asynchronous channel access methods require mathematical models to evaluate and optimize their performance in case of finite flows. Obviously, Bianchi's model [100], which is widely used to evaluate Wi-Fi channel access, cannot be directly applied to the multi-link case. However, hopefully, it can be extended.
Finally, the algorithms of packet distribution among multiple links shall both minimize channel wasting and prevent the head-of-line blocking delay problem [91].

E. CHANNEL SOUNDING OPTIMIZATION

1) CHANNEL SOUNDING INDUCED OVERHEAD
Nowadays, MU-MIMO is a key technique used in many wireless network technologies to increase their capacity. This technique requires precise knowledge of the channel state. Unfortunately, the channel properties vary significantly with time, so devices need to measure the channel quite often.
The sounding procedure in 802.11ax [4] consists of sending a reference signal by the AP and getting the explicit channel state information (CSI) feedback from the STAs. The procedure starts with a Null Data Packet (NDP) Announcement (NDPA) sent by the AP to notify the STAs about the following reference signal. A short inter-frame space after the NDPA, the AP broadcasts an NDP, which is used by the STAs to assess the channel. The NDP includes an HE-LTF of duration 7.2, 8, or 16 µs for each SS. Having received the NDP, the STAs reply with Beamforming Reports (BFRs) either sequentially or in parallel thanks to OFDMA. Briefly speaking, the BFR provides the following information.
• The average SNR for each SS. Each SNR value is an 8-bit integer.
• The Givens rotation [102] angles (φ,ψ) of the feedback matrix, for every 4th or 16th subcarrier. The size of a φ-ψ pair is 6 or 10 bits for a single user and 12 or 16 bits for MU.
• For MU-MIMO, an array of 4-bit differences between the SNR for each reported subcarrier and the average SNR in the SS.
Given BFR, the CSI for the remaining subcarriers is interpolated. Thus, for a 160 MHz channel, one SS, and every 16th subcarrier being reported, the BFR contains information about 128 subcarriers. For a larger number of SSs, the size of both the NDP and BFR grows too. For example, 4 × 4 MIMO requires just six pairs of φ and ψ angles per subcarrier, whereas 8 × 8 MIMO needs 28 pairs, and 16 × 16 MIMO needs 120 pairs. The developers of Wi-Fi 7 aim at supporting up to 16 SSs. However, such a high order of MU-MIMO and twice wider channels make the overhead huge (see Fig. 11). As the duration of the procedure reaches 10 ms, channel measurements become useless, since the channel varies significantly between the NDP and the following data transmission. Thus, the measured channel state becomes outdated [103].
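The growth of the feedback size can be reproduced with a short calculation. The sketch assumes the usual count of Givens rotation angles for a feedback matrix with N_r rows and N_c ≤ N_r orthonormal columns, N_c(2N_r − N_c − 1) angles in total, half of them φ and half ψ; this closed form reproduces the 6, 28, and 120 pairs quoted above. The function names are hypothetical, and the estimate covers only the angle part of the BFR, ignoring the SNR fields and frame headers.

```python
def angle_pairs(n_rows, n_cols):
    """Number of (phi, psi) pairs per reported subcarrier, n_cols <= n_rows."""
    n_angles = n_cols * (2 * n_rows - n_cols - 1)  # always an even number
    return n_angles // 2


def bfr_angle_bits(n_rows, n_cols, reported_subcarriers, bits_per_pair):
    """Bits spent on the angles in one report (SNR fields are not counted)."""
    return angle_pairs(n_rows, n_cols) * reported_subcarriers * bits_per_pair


for n in (4, 8, 16):
    print(n, angle_pairs(n, n))  # 4 6, 8 28, 16 120

# 16 x 16, 128 reported subcarriers, 16 bits per pair (MU feedback).
print(bfr_angle_bits(16, 16, 128, 16) // 8, "bytes")  # 30720 bytes
```

Under these assumptions, a single 16 × 16 report for 128 subcarriers already carries about 30 KB of angles, before doubling the channel width.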
Thereby, the main challenge on the way towards increasing the order of MIMO is the reduction of the overhead induced by channel sounding and BFR. As this challenge is crucially important for Wi-Fi 7, many submissions propose various approaches to reduce the overhead. Here we describe the most promising ones.

2) SOUNDING ENHANCEMENTS
One of the purposes of the preamble is to sound the channel at all subcarriers in the transmission bandwidth and all spatial streams. For sounding, the transmitter sends a linear combination of reference signals, defined by a P matrix, in the EHT-LTF field. Subcarriers in the i-th LTF symbol for j-th stream are multiplied by P ij for differentiating between the streams when estimating the channel.
Channel estimation requires a certain type of matrix arithmetic with P matrices, so they shall be designed to minimize the computational complexity [104]. Examples of good design are the orthogonal discrete Fourier transform matrices, the orthogonal (±1, ±j)-matrices, and the (±1)-matrices, listed in the decreasing order of their implementation complexity. However, not all matrix sizes of a particular type exist. The discrete Fourier transform matrices exist for all sizes, but there are no (±1, ±j)-matrices of sizes 9,11,13, and 15; and no (±1)-matrices of sizes 9, 10, 11, 13, 14, and 15 [104]. The authors of [104] propose to consider discrete Fourier transform matrices as a baseline, and not to consider any more complex solutions.
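The role of an orthogonal P matrix can be illustrated on a single subcarrier. The following is a minimal noise-free sketch, assuming a DFT-type P matrix and the indexing used above (entry [i][j] weights stream j in LTF symbol i); the function names are hypothetical and do not come from any submission.

```python
import cmath


def dft_p_matrix(n):
    """n x n DFT matrix; entry [i][j] weights stream j in LTF symbol i."""
    return [[cmath.exp(-2j * cmath.pi * i * j / n) for j in range(n)]
            for i in range(n)]


def sound(h, p):
    """Noise-free LTF observations y[r][i] at RX antenna r for LTF symbol i."""
    n = len(p)
    return [[sum(h_r[j] * p[i][j] for j in range(n)) for i in range(n)]
            for h_r in h]


def estimate(y, p):
    """Recover h[r][j] by correlating with column j of P (orthogonal columns)."""
    n = len(p)
    return [[sum(y_r[i] * p[i][j].conjugate() for i in range(n)) / n
             for j in range(n)] for y_r in y]
```

Because the columns of the DFT matrix are mutually orthogonal, correlating the received LTF symbols with one column removes the contribution of all the other streams.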
Apart from the above matrices, submission [105] suggests allowing zero entries in large P matrices. Zero entries mean that in the corresponding SSs, certain LTF symbols do not contain reference tones. This approach allows receivers to make fewer computations. Combined with tone-alternating P matrices, it keeps the transmission power consistent throughout all LTFs and provides full frequency-domain resolution in every SS.
With HE-LTF of 16 µs and 16 SSs, the total duration of the reference signal is as long as 256 µs, which is comparable to the data transmission duration. To shorten the reference signal, tone interleaving is proposed [106], [107]. Every stream uses a unique periodic subset of tones for sounding. The receiver interpolates the channel estimation for every stream. An alternative way of sounding is a technique called Orthogonal Sequence-based Reference Signal (OSRS) [108]. With OSRS, groups of tones are orthogonally coded for every stream. The receiver applies these orthogonal codes to determine channel estimation for every stream and every tone group. The authors of [109] compare the techniques in terms of noise resistance. The current results show that the OSRS technique has worse performance than tone interleaving because of cross-stream leakage, as explained by [109]. More evidence is needed to decide which method to use, so this is work in progress.
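As an illustration of tone interleaving, the sketch below lets each stream sound only every n_streams-th tone and fills in the remaining tones at the receiver. Linear interpolation and the nearest-value rule at the band edges are assumptions made for brevity; the submissions do not prescribe a particular interpolation method, and all names are hypothetical.

```python
def interleaved_tones(stream, n_streams, n_tones):
    """Tones on which a given stream transmits its reference signal."""
    return list(range(stream, n_tones, n_streams))


def interpolate(sounded, values, n_tones):
    """Linear interpolation of per-tone estimates; edges hold the nearest value."""
    est = []
    for k in range(n_tones):
        if k <= sounded[0]:
            est.append(values[0])
        elif k >= sounded[-1]:
            est.append(values[-1])
        else:
            # index of the sounded tone located at or just below tone k
            idx = max(i for i, t in enumerate(sounded) if t <= k)
            t0, t1 = sounded[idx], sounded[idx + 1]
            w = (k - t0) / (t1 - t0)
            est.append((1 - w) * values[idx] + w * values[idx + 1])
    return est
```

The shorter reference signal comes at the price of interpolation error, which grows when the channel changes quickly across neighboring tones.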

3) EXPLICIT FEEDBACK OVERHEAD REDUCTION
Much effort has been put into reducing the size of BFR. Submission [110] summarizes some of these efforts. One of the methods is the so-called wideband precoding [111]. Just as the SNR is averaged over the whole channel in the BFR, we can implement a wideband precoder by averaging the channel over all the tones. This approach shrinks the feedback information manyfold at the cost of some degradation in the accuracy and, consequently, performance. In the case of a higher-order MIMO and wide channels, such degradation can be significant, so the authors of [111] propose to implement narrowband precoders on top of the wideband one. Such an approach improves wideband beamforming while keeping the precoder matrix size smaller than a regular per-tone precoder. Notably, this technique uses the same basic hardware blocks as in legacy Wi-Fi devices, but the smaller size of matrices reduces the number of complex multiplications up to three times [111]. Simulation results show that such an approach reduces the overhead by 25-30% at the cost of less than 0.5 dB loss in throughput or up to 70% with 1.0 dB throughput loss depending on the parameters.
A later study [112] shows that feedback can be compressed even further. The study of the mixed beamforming in the TGn-D channel [113] reveals that more than 20% of the entries of narrowband matrices have under 1% of total power influence. It means that we can ignore these very small values and reduce the feedback size additionally by 20% with a very small reduction in total power.

4) IMPLICIT SOUNDING
Wireless channel reciprocity, i.e., the identical impulse responses for the UL and DL transmission, allows reducing the overhead further. Thus, the authors of [101], [114] reintroduce implicit BFR to 11be. Originally designed for 802.11n [2], it was never used in the off-the-shelf products because an accurate AP self-calibration seemed to be very complicated a decade ago [115]. Consequently, this approach was not used in 11ac and 11ax.
Implicit BFR means that instead of sending sounding from the AP to the STAs and then gathering feedback with the explicit sounding, the STAs can send NDP sounding in the UL, while the AP directly measures the channel. For a large number of STAs and SSs, one BFR takes more time than one NDP. As no BFRs are sent in implicit BFR, the overhead goes down (Fig. 11), and the channel estimation latency reduces, hence the channel ages less. As the channel matrix is calculated directly from the measurements, no channel information is lost because of coarse quantization and information compression. Consequently, such a sounding procedure improves scheduling and spectrum efficiency.
Each STA can send its NDP to the AP either sequentially or simultaneously (e.g., using P-matrices or tone interleaving). The first approach should be used when sounded clients have dissimilar receive powers. The AP can send individual trigger frames to the STAs to collect all NDPs. A higher number of STAs extends the duration of the procedure, but it is still quicker compared with the explicit sounding for a large number of SSs. Sounding can be accelerated if NDPs are simultaneous, but this method requires that the STAs generate signals with almost the same receive power at the AP.
However, the baseband-to-RF (radio frequency) and RF-to-baseband conversion chains are not necessarily reciprocal. As a result, the effective DL baseband channel is not equal to the effective UL baseband channel unless this mismatch is explicitly compensated by self-calibration. A more recently developed Local AP Calibration, which does not require the STAs to be involved in the calibration process, may be applied [116], [117]. If each antenna's CSI estimation deviates from the real CSI by the same multiplicative factor, beamforming will still result in the same beam pattern. Thus, the AP can select a reference antenna, send a pilot signal from every antenna, and estimate the baseband-to-RF gain of each of its non-reference antennas relative to the reference one. Such CSI estimation deviates from the real CSI by the baseband-to-RF gain of the reference antenna.
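The statement that a common multiplicative error leaves the beam pattern intact can be verified numerically. The sketch below is only an illustration under assumed conditions: maximum-ratio transmit weights, a single-antenna STA, and a uniform linear array with half-wavelength spacing. None of these choices comes from [116], [117].

```python
import cmath
import math


def mrt_weights(h):
    """Maximum-ratio transmit weights for a single-antenna receiver."""
    norm = math.sqrt(sum(abs(x) ** 2 for x in h))
    return [x.conjugate() / norm for x in h]


def beam_pattern(w, angles, spacing=0.5):
    """|array response| of a uniform linear array (spacing in wavelengths)."""
    pattern = []
    for theta in angles:
        phase = 2 * math.pi * spacing * math.sin(theta)
        pattern.append(abs(sum(w_n * cmath.exp(1j * phase * n)
                               for n, w_n in enumerate(w))))
    return pattern
```

Scaling every entry of h by the same complex factor only rotates all weights by a common phase, so the magnitude of the array response is unchanged in every direction.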
The proposal [118] presents a trigger-based scheme for enabling Implicit Sounding in 802.11be (see Fig. 12). The AP starts with a trigger frame requesting UL NDPs from the STAs. They reply with NDPs simultaneously, which can be orthogonalized by using P-matrices or LTF subcarrier interleaving [106], as described in Section III-E.2. Having received the NDPs, the AP implicitly assesses the channel to each of the STAs and then sends a beamformed Data frame to them. The authors of [101] have adopted the described sounding protocol and demonstrated that the new protocol saves more than 60% of the original airtime (Fig. 11). For a higher number of STAs and SSs, the gain is even higher.

5) OPEN ISSUES OF CHANNEL SOUNDING
As 11be increases the number of SSs and bandwidth, shortening the channel sounding becomes especially crucial. Unaffordable overhead induced by the legacy sounding methods makes TGbe change the paradigm and develop methods efficient for massive MIMO systems. Depending on the channel conditions, such methods as tone interleaving or OSRS require finding a trade-off between reducing the duration of training signals and losses in spectrum efficiency. A similar trade-off shall be found in matrix compression techniques.
The efficiency of the implicit sounding is still debatable. Its main disadvantage is related to low performance caused by weak uplink. To overcome this problem, the STAs may need a longer reference signal. Also, the design of an implicit sounding procedure for multi-AP operation discussed in Section III-G is still an open issue.

F. ADVANCED PHY TECHNIQUES IMPROVING SPECTRUM EFFICIENCY
TGbe also considers a palette of advanced PHY techniques, such as HARQ, Non-orthogonal Multiple Access, or FD, that improve spectrum efficiency in case of retransmissions and parallel transmissions in the same or opposite directions.
Although numerous academic papers promise huge gains, the observed gains in real Wi-Fi deployments are not obvious, and additional performance evaluation is required.

1) HARQ
In Wi-Fi networks, if the checksum computed over a received packet does not match the value in the corresponding field of the packet, the receiver drops the obtained data, and the transmitter repeats the whole packet. To improve spectrum efficiency of 11be, many contributions [119]- [127] propose to introduce HARQ to 11be. In contrast to the legacy retransmission procedure, HARQ exploits the information from the previous attempts. The receiver combines the signals from several transmission attempts, which increases SNR and, consequently, the probability that the receiver decodes the packet correctly.
HARQ proves to be more robust to the errors in the estimation of the SNR at the receiver [125]- [129]. It allows the transmitter to select a higher MCS opportunistically. Either the transmission is fast with the good channel, or the receiver extracts some information anyway with the poor channel [130] and decodes the packet with a transmission retry. Moreover, HARQ avoids reducing MCS for such retries.
TGbe has discussed three popular HARQ methods: Chase Combining (CC), Punctured CC, and Incremental Redundancy (IR). With CC, every retry contains the same information as the initial transmission. So it is quite easy to combine the signals and to achieve gains in the SNR. The cost of low complexity is worse performance with respect to the other HARQ methods. With Punctured CC, the transmitter repeats only a portion of the coded bits that have low SNR. Thus, Punctured CC reduces HARQ-induced overhead even more. With IR, every retransmission uses a different set of coded bits, representing the same set of information bits. Thus, at every retransmission, the receiver gains extra information [131]. IR is the most difficult to implement. However, this method is the most efficient [125], [131].
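The simplicity of CC is visible when the receiver works with per-bit log-likelihood ratios (LLRs): combining reduces to adding the LLRs of the same coded bits across attempts. The sketch below is a toy illustration with made-up LLR values and the common convention that a positive LLR favors bit 0; it is not taken from any TGbe submission.

```python
def chase_combine(llr_attempts):
    """Sum per-bit LLRs over all attempts of the same codeword."""
    return [sum(bit_llrs) for bit_llrs in zip(*llr_attempts)]


def hard_decision(llrs):
    """LLR > 0 is read as bit 0, following the log(P0/P1) convention."""
    return [0 if llr > 0 else 1 for llr in llrs]


# Transmitted coded bits (illustrative): 0, 0, 0, 1
first = [1.2, -0.4, 0.9, -2.0]   # attempt 1: the second bit is received wrongly
retry = [0.9, 0.7, -0.2, -1.5]   # attempt 2: the third bit is received wrongly
print(hard_decision(chase_combine([first, retry])))
```

In this example, neither attempt is error-free on its own, whereas the combined LLRs yield the transmitted bits.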
For these methods, 11be developers often imply low-density parity-check (LDPC) coding. Supporting HARQ with binary convolutional coding seems excessive, and it does not provide any gain anyway [131], [132]. Although cellular systems use HARQ, introducing HARQ in Wi-Fi raises many issues described below.

a: DATA UNIT
One of the most crucial issues related to HARQ is the HARQ data unit, i.e., the piece of information that the transmitter shall repeat in case of delivery failure. Typically, in Wi-Fi networks, every MPDU has a checksum. In case of failure, the whole MPDU is repeated [133].
HARQ can inherit this behavior from the legacy Wi-Fi, and in case of transmission failure, repeat the whole MPDU. However, this approach raises many issues. The first one is that the original transmission and the retry carry different information because of the retry bit, different ciphertext, different CRC bits, and different scramblers. Thus, the signals and coded bits are different and cannot be combined directly. Additionally, if the lost MPDU is encapsulated in an A-MPDU, we have a problem that LDPC codewords are not aligned to MPDUs within the A-MPDU, see Fig. 13. If MPDU-2 is not correctly decoded, what MAC-layer data shall be repeated to generate appropriate codewords? Note that in a new A-MPDU, the MPDUs will be mapped to codewords differently.
FIGURE 13. MPDU-codeword misalignment [121].
To address all the aforementioned issues, we could repeat the whole A-MPDU without any changes, except for the following one. To avoid packet duplication at the MAC layer, we need to add a retry flag to the PHY header, which is encoded separately. However, repeating the whole A-MPDU generates too much overhead because the transmitter needs to repeat both lost and delivered data.
A good solution could be repeating only the damaged codewords. This method requires the receiver to identify the erroneous codewords at the PHY layer to request their retransmission [121]. Codewords do not have any robust checksum except for the Parity Check of LDPC, which can be used together with the checksum of MPDUs. Based on this information, PHY can request the codewords associated with the failed MPDUs. Implementing codeword retransmissions requires tight MAC-PHY interaction, which may cause implementation issues.
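The misalignment between MPDUs and codewords, and the lookup that the PHY would need in order to request the codewords of a failed MPDU, can be sketched as follows. The sketch makes simplifying assumptions: MPDUs are concatenated back to back without delimiters or padding, and every codeword carries the same number of information bytes. The function name and the sizes are hypothetical.

```python
def codewords_for_mpdu(mpdu_lengths, index, cw_info_bytes):
    """Indices of the codewords that carry any byte of MPDU `index`."""
    start = sum(mpdu_lengths[:index])        # offset of the first MPDU byte
    end = start + mpdu_lengths[index] - 1    # offset of the last MPDU byte
    return list(range(start // cw_info_bytes, end // cw_info_bytes + 1))


# Three MPDUs of 300, 500, and 200 bytes; 200 information bytes per codeword.
print(codewords_for_mpdu([300, 500, 200], 1, 200))
```

Here the second MPDU spans codewords 1 to 3, and codeword 1 also carries the tail of the first MPDU, so a codeword-level retransmission inevitably touches data belonging to a neighboring MPDU.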
The authors of [123] propose to group several codewords in an HARQ block that carries one or several MPDUs with padding bits. The size of such an HARQ block can be negotiated or predefined. The BA mechanism can be reused for HARQ Blocks' acknowledgment. Moreover, since HARQ Blocks are larger than codewords, they require a lower feedback overhead. However, the scheme incurs overhead in the form of MAC padding. In addition, if several small MPDUs are concatenated within an HARQ Block, retransmission may contain already delivered MPDUs. The authors of [134] suggest that, instead of MAC padding, HARQ Blocks could contain multiple MPDUs and their fragments. But the retransmission overhead will be large if a fragmented MPDU fails.

b: PROTOCOL
TGbe discusses how to implement the HARQ protocol [135]. Adapting the BA mechanism is especially attractive, and it suits best for MPDU and HARQ Block units. Feedback overhead is small for these units, but retransmission overhead can be large. Reusing BA for codeword-level HARQ is problematic because of the necessity of additional MAC-PHY interactions. Codeword-level HARQ can bypass MAC interaction by performing communication only between PHY levels of the transmitter and receiver. If this scheme is approved, the amendment will need to define the corresponding communication protocol.
An HARQ retransmission can occur in a new TXOP, which requires minimal changes to the standard [134], or in the same TXOP. If the HARQ retransmission occurs in a new TXOP, the AP needs to support many HARQ processes because the AP can receive HARQ frames from multiple STAs. Making HARQ attempts in the same TXOP as the original transmission requires much less memory than the first approach and speeds up retries. However, the number of HARQ retransmissions is limited by the TXOP limit. Moreover, the sender needs to choose such a TXOP duration that is enough for both the original transmission and retries. Thus, this approach may result in resource waste or an unfinished HARQ process.
One of the approaches to improve HARQ performance is to take benefits of frequency diversity and to make additional transmission attempts at different frequencies [126] or even via different links.
Another method to enhance HARQ is the so-called multilayer HARQ [130], [136]. This method exploits the fact that in a high-order modulation (e.g., 16-QAM, 64-QAM, etc.), various bits have different reliability [137]. If these bits belong to different codewords, the codewords are transmitted with different reliability, too. By appropriate mapping of the codewords to the modulated bits, the multi-layer HARQ can improve transmission reliability in the low-SNR area [130]. Notably, this approach requires no instantaneous SNR information and may provide much better performance than traditional schemes for rate adaptation [138].
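The unequal reliability of bits within one modulation symbol can be seen on a single axis of a 16-QAM constellation. The sketch below assumes a Gray-labelled 4-PAM axis with levels −3, −1, 1, 3 and measures, for each bit position, the average distance from a constellation point to the nearest point that carries the opposite bit value. Both the labelling and the metric are illustrative choices rather than definitions from [130] or [137].

```python
# Gray-labelled 4-PAM, i.e., one axis of a 16-QAM constellation (assumed mapping).
LEVELS = {-3: (0, 0), -1: (0, 1), 1: (1, 1), 3: (1, 0)}


def mean_flip_distance(bit_pos):
    """Average distance from a point to the nearest point whose bit differs."""
    total = 0.0
    for x, bits in LEVELS.items():
        total += min(abs(x - y) for y, other in LEVELS.items()
                     if other[bit_pos] != bits[bit_pos])
    return total / len(LEVELS)


print(mean_flip_distance(0), mean_flip_distance(1))
```

A larger distance means that more noise is needed to flip the bit, so the first bit of the label is better protected than the second one, and codewords mapped to it are received more reliably.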
While considering Punctured CC and IR, TGbe faces the problem of how to puncture the codewords and what type of codes shall be used with HARQ [124], [125]. Wi-Fi networks use traditional binary convolutional coding of lower complexity and modern LDPC with higher performance. Introducing HARQ will require the development and evaluation of various puncturing methods, and maybe new code constructions [125].
Another issue is the amount of information to be sent in case of transmission failure. The authors of [139] show that in terms of overall throughput, the optimal percentage of retransmitted information is inversely related to the SNR (i.e., if the SNR is high, the portion of the retransmitted information in case of transmission failure shall be low, and vice versa).
Apart from that, TGbe discusses introducing an additional retry counter for HARQ retransmission attempts. In contrast to usual retransmissions, which can be done with different transmission parameters, additional transmission attempts done with the same codewords become almost useless after the fourth attempt. Many members declare that the limit of two transmission attempts is enough to achieve reasonable gains in goodput [126], [139], [140].

c: IMPLEMENTATION ISSUES
Implementing HARQ in cheap Wi-Fi devices raises many issues related to memory consumption and required computational speed.
HARQ requires that the receiver saves log-likelihood ratios (LLRs) for received bits. This requirement makes HARQ a memory-hungry technique rather than a computationally complex one. There are some first-order estimates of the required memory size, but they differ considerably. Submission [135] claims up to 35.8 MB of the required memory, which takes 44.5 mm² of the chip area. This is a significant portion of a typical 802.11ax chipset, which measures from 10 × 10 mm to 20 × 20 mm. The authors of [141] soften this estimate to 9.3 MB, requiring 2.55 mm² with a 7 nm chipset fabrication technology. We need to derive a more accurate and more objective assessment of the required memory and its anticipated performance.
Besides, HARQ operations shall be done very quickly. This problem is aggravated in the case of Punctured CC and IR, where only a subset of bits is retransmitted. A small number of retransmitted bits means short transmissions and a high processing rate required to keep up with the incoming LDPC codewords [135]. Fortunately, the LDPC decoder processes codewords iteratively, and combined codewords would require fewer decoding iterations [141]. However, in the case of codeword-based HARQ, MAC processing shall be accelerated. If only a few codewords fail, they can be retransmitted in a single short frame. The receiver has to process an entire A-MPDU during this short time and respond with an ACK. This fact may limit possible HARQ gains, and a thorough evaluation is required.
The performance of HARQ in Wi-Fi deployments is still an open issue. For example, the authors of [122] notice that HARQ shows its highest gain only if SNR is low. Beamforming significantly increases SNR, which reduces the benefits of HARQ. Besides, the performance of HARQ is not well studied in dense deployments, where the packets are lost by random collisions rather than by a slight change in SNR. This issue is another fruitful area for investigation.

2) FULL-DUPLEX
Another technique that has the potential to be included in the future 802.11 standards is FD. In January 2018, the 802.11 Working Group formed an FD TIG to study this type of communication. In-Band FD allows simultaneous UL and DL on the same spectrum. This technique maximizes the use of available spectrum and provides benefits such as reduced latency, high scalability, and increased throughput. FD has additional features, like collision reduction: a DL signal prevents potential hidden nodes from transmitting during UL. FD can also relax issues for relay-based networks: multiple relays supporting FD can transmit simultaneously. Moreover, apart from the increase in throughput, FD with interference cancellation enables a new version of channel access, ''Listen While Talk.'' Wired standards like DOCSIS [142] have already adopted FD to leverage its advantages. Doing the same in Wi-Fi is harder due to rapid channel variations and MIMO. The challenge is the requirement of quick adaptation and schemes that scale well with the number of antennas [143].
SIC makes simultaneous transmission and reception feasible, and yet it is the most complex problem to solve. SIC shall mitigate internal reflections (15-20 dB lower than TX signal), non-linear components (30-40 dB), and multipath (50-60 dB). This task is often divided into two parts: analog SIC that reduces the strongest components and digital SIC that reduces the interference below the noise floor. SIC can operate correctly only if the STA knows the figure of its internal reflections and non-linearity. To get this figure, the STA needs an efficient calibration procedure with minimum system-level overhead [144].
To use FD in Wi-Fi, we need to focus on identifying where FD reaches its theoretical benefits with respect to fundamental aspects above. The accurate decision should account for RF properties, achievable residual interference, and noise floor.
In November 2018, the majority of FD TIG decided that the EHT project should include FD. However, by now, none of the FD solutions received sufficient support within TGbe because of unclear gains in real deployments.
There have been a few proposals that consider SIC implementation [145] and MAC enhancements [146]. Nevertheless, incorporation of FD in the Wi-Fi technology raises many questions: how to modify the transmission protocol, which STAs can be involved in FD, how to combine FD and MIMO/OFDMA, how to keep backward compatibility, etc. Another issue related to FD is when to apply FD. The accurate decision should take into account RF properties, achievable residual interference, and noise floor.

3) NON-ORTHOGONAL MULTIPLE ACCESS
NOMA is designed to increase peak throughput and improve efficiency. The basic idea of NOMA is that an AP can serve multiple STAs simultaneously in the same baseband by allocating portions of the total transmission power for each STA (see Fig. 14). The AP can perform NOMA transmissions with the superposition coding, which is a simple superposition of multiple signal components with different coefficients subject to a power constraint [148]. Hence, the higher the power of a component, the more reliable its reception. In the two-STA case, the high-power component is destined for a far STA with worse channel conditions, whereas the other component is destined for a near STA. The far STA receives the composite signal as is, perceiving interference from the low-power component as noise. The near STA separates the signal components using SIC.
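The two-STA case can be illustrated with a toy example. The sketch below assumes BPSK components, unit total power with 80% allocated to the far STA, and a noise-free channel with unit gain; these values and the function names are hypothetical and serve only to show the order of operations at both receivers.

```python
import math


def superpose(far_bit, near_bit, far_power_share=0.8):
    """BPSK superposition coding with unit total power (assumed parameters)."""
    s_far = 1.0 if far_bit == 0 else -1.0
    s_near = 1.0 if near_bit == 0 else -1.0
    return (math.sqrt(far_power_share) * s_far
            + math.sqrt(1.0 - far_power_share) * s_near)


def decode_far(y):
    """Far STA: treat the low-power component as noise."""
    return 0 if y >= 0 else 1


def decode_near(y, far_power_share=0.8):
    """Near STA: decode the far component, subtract it (SIC), decode its own."""
    far_bit = decode_far(y)
    s_far = 1.0 if far_bit == 0 else -1.0
    residual = y - math.sqrt(far_power_share) * s_far
    return 0 if residual >= 0 else 1
```

The far STA needs no knowledge of the superposition, while the near STA relies on decoding the high-power component correctly before it can recover its own.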
The proposal [149] introduces a variation of NOMA: Semi-orthogonal Multiple Access (SOMA). SOMA does not just do superposition but makes an artificially designed gray-mapped superposed constellation. This feature makes the low-power signal component more resilient to noise. Moreover, SIC becomes unnecessary, which makes the receiver less complex. This concept of SOMA is similar to the Multi-User Superposition Transmission feature included in the 3GPP LTE Advanced [150]. The gain of this feature, approximately 20-30%, is already proven at both the link level and the system level.
Moreover, there is an experimental study of NOMA/SOMA Wi-Fi systems [147], [151]. The authors provide evidence of up to 40% gain of the geometric average throughput for two STAs. Additionally, they show that the implementation of NOMA/SOMA can be backward-compatible: the far STA can be legacy. As the far STA receives the signal as-is and does not apply SIC, it does not need to know that the composite signal uses SOMA.
The proposals on NOMA initiated an extensive discussion regarding its effectiveness with MU-MIMO. NOMA is not a competing technology against MU-MIMO, but rather a complementary one. MU-MIMO shows high performance if the STAs have similar attenuation but orthogonal MIMO channels. In contrast, NOMA works better with the STAs that have dissimilar attenuation and correlated channels. Intuitively, the AP can use MU-MIMO to form several spatial beams to STA groups, while each beam carries NOMA signal, separated within the STA group. There are plenty of theoretical works dedicated to MU-MIMO and NOMA cooperation, but still, no experimental results are available.

G. MULTI-AP COOPERATION
1) BASIC IDEA
While 11ax [4] improves Wi-Fi performance in dense deployments by introducing the features that can be implemented in a single network, TGbe goes further and discusses a list of features that require coordination of nearby networks [8]. The discussion in TGbe demonstrates a paradigm shift from interference mitigation to cooperation between the neighboring APs.
The state-of-the-art enterprise Wi-Fi networks often use cloud-based solutions such as Cisco Meraki [153], Quantenna MAUI [154], or Huawei Agile [155]. They enable seamless roaming between Wi-Fi networks and simplify network configuration, e.g., selecting the channels for operation. However, except for such long-term parameters, the networks operate in almost uncoordinated modes. In this context, TGbe aims at improving network performance by much tighter coordination of channel access, transmission schedule, and joint transmissions of the same data.
TGbe discusses allowing a set of APs to form a multi-AP system, which can have distributed or centralized coordination. In the latter case, the central node is often called Master AP, while the remaining ones are Slave APs. In contrast to existing cloud-based solutions, in which the APs are connected to the controller by a wire, some proposals, e.g., [156], assume that all the Slave APs are in the transmission range of the Master AP. However, some Slave APs may be hidden from each other. The AP roles are not bound to certain devices, but can be dynamically changed.
TGbe considers two types of multi-AP systems: Coordinated and Joint [157]. Coordinated systems send/receive each portion of data by a single AP, whereas Joint systems send/receive data by multiple APs. The considered multi-AP systems are listed below, starting from the ones that are easier to implement.

a: COORDINATED SPATIAL REUSE (CSR)
CSR [152], [158], [159], the simplest multi-AP system, is an evolution of a spatial reuse (SR) system introduced in 802.11ax. It can be used when inter-BSS interference is weak, but the channel state is perceived as busy. For adequate SNR at all the STAs, the APs mitigate interference by cooperatively controlling TX power (see Fig. 15). It differs from the uncoordinated 11ax SR, where one AP transmits with the maximum TX power while the other APs should decrease TX power. CSR requires small inter-AP feedback, and it better combats interference compared with uncoordinated SR from 11ax.

b: COORDINATED OFDMA (Co-OFDMA)
The Co-OFDMA [158], [159], [161], [162] multi-AP system allows the APs to coordinate their schedules in time and frequencies. With Co-OFDMA, the nearby APs can assign the same RUs for some STAs if such transmission does not interfere, or they can assign different RUs to avoid interference (see Fig. 16). Preliminary simulation results show that Co-OFDMA is effective for medium or large AP density [159].
Among the list of the multi-AP coordination types, CSR and Co-OFDMA are the most likely to be supported in the new amendment [163] because of their simplicity, flexibility, and plenty of possible solutions. For example, the authors of [164] improve the scheme by allowing the owner of TXOP to directly schedule the channel resources among the STAs associated with the neighboring APs. This approach requires additional signaling between the APs. However, it may improve the spectrum efficiency and flexibility of multi-AP transmission.
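The Co-OFDMA rule of reusing an RU where transmissions do not interfere and separating them otherwise resembles graph coloring. The following is a minimal greedy sketch under assumed inputs: the list of conflicting STA pairs is taken as known (e.g., from measurements), STAs of the same AP never share an RU, and MU-MIMO is ignored. The function name and the data layout are hypothetical, and the sketch is not the scheduling method of any cited submission.

```python
def assign_rus(stas, conflicts):
    """Greedy RU assignment: conflicting STAs never share an RU index.

    stas      -- list of (sta_name, ap_name) in scheduling order
    conflicts -- set of frozenset({sta_a, sta_b}) pairs that would interfere
    """
    ap_of = dict(stas)
    ru_of = {}
    for sta, ap in stas:
        # RUs already used by STAs of the same AP or by interfering STAs
        taken = {ru for other, ru in ru_of.items()
                 if ap_of[other] == ap or frozenset((sta, other)) in conflicts}
        ru = 0
        while ru in taken:
            ru += 1
        ru_of[sta] = ru
    return ru_of


stas = [("s1", "AP1"), ("s2", "AP1"), ("s3", "AP2"), ("s4", "AP2")]
print(assign_rus(stas, {frozenset(("s2", "s3"))}))
```

In this example, s1 and s3 reuse the same RU because they do not interfere, whereas the conflicting pair s2 and s3 is placed on different RUs.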

c: NULL STEERING
The idea of Null Steering [159], [162] (or, as it is also called for DL transmissions, Coordinated beamforming, CBF) is that while forming the beams to its STAs, an AP also aims to null its interference to particular neighboring STAs (see Fig. 16). This approach avoids mutual interference between nearby networks. One of the most significant challenges of null steering is that the APs need to acquire CSI from the non-served STAs associated with other APs.
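For a single served STA and a single protected STA, both with one antenna, a null can be obtained by projecting the served channel onto the subspace orthogonal to the protected one. The sketch below shows this textbook projection under the assumption of perfect CSI; it is an illustration, not the precoder design of the cited submissions, and the function names are hypothetical.

```python
def inner(a, b):
    """Hermitian inner product sum(a_n * conj(b_n))."""
    return sum(x * y.conjugate() for x, y in zip(a, b))


def null_steering_weights(h_served, h_victim):
    """Weights that aim at the served STA and put a null on the victim STA."""
    scale = inner(h_served, h_victim) / inner(h_victim, h_victim)
    u = [hs - scale * hv for hs, hv in zip(h_served, h_victim)]
    norm = abs(inner(u, u)) ** 0.5
    return [x.conjugate() / norm for x in u]


def received(h, w):
    """Signal seen by a single-antenna STA with channel h under weights w."""
    return sum(h_n * w_n for h_n, w_n in zip(h, w))
```

The price of the null is a loss of beamforming gain toward the served STA, which grows as the two channels become more correlated.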
Per-AP Interference Cancellation is a null steering approach but used for UL transmission [165]. Before UL frame reception, each AP collects information about the channel to all the nearby STAs. Later the AP configures its receiver to acquire a frame from its associated STA and ignore interference induced by the other STAs. Such an approach requires no data exchange across APs. Also, it allows simultaneous transmission by different STAs to the corresponding APs.

d: JOINT TRANSMISSION AND RECEPTION
This method [162], [166] allows multiple APs to serve the same STA by creating a dynamic distributed MU-MIMO system. This system runs jointly on multiple APs. In DL, the profit of joint transmission and reception is experimentally proven to be the highest compared with the coordinated multi-AP systems [167]. However, this method is too complicated and has severe requirements, such as high-speed backhaul and accurate synchronization across the APs.
Joint transmission in UL can be organized in different ways, providing higher reliability in various scenarios. TGbe has discussed the following approaches sorted in the increasing order of their complexity [165]:
• Distributed SIC improves UL data delivery in case of overlapped transmissions. With Distributed SIC, frame reception consists of two stages. At the first stage, each AP decodes data from its STAs and forwards the data to other APs for interference subtraction. Then, the other APs remove the interfering signal from the received one and obtain their own data. This technique enables simultaneous UL, which is especially useful for RTA.
• Joint Frame Reception implies that the APs jointly process the received data from all the STAs. Such a scheme can provide high gains in deployments with non-uniform distribution of STAs and with high interference, but it requires tight synchronization across APs, extremely high-speed and low-latency backhaul. So it is not clear yet, whether it will become a part of the 11be amendment.
Joint transmission and reception methods require joint signal processing at the APs. The performance of these methods can be improved by appropriate precoders used by the STAs to transmit data in parallel [168].
The described types differ in the level of synchronization needed between the APs. CSR can operate with synchronization as coarse as frame-level, whereas CBF and Co-OFDMA require symbol-level synchronization [169]. At the same time, joint systems require tight time and phase synchronization [170]. For CSR, Co-OFDMA, and CBF, the schedule needs to be disseminated between the APs. In Joint TX, the APs additionally need to have the same data for transmission. In Joint Frame Reception, the APs need to exchange signal time samples. The last two methods are the most difficult to implement, and it is the most challenging for them to provide notable gains in real deployments. From this perspective, CSR [171] and Co-OFDMA [172] seem more appealing and have been supported recently at the TGbe teleconferences.

2) SOUNDING
A multi-AP transmission starts with a sounding procedure during which the APs and the STAs measure the channel between them. In an example of sounding procedure for JTX, shown in [173], the Master AP initiates the sounding with a multi-AP trigger frame that requests multi-AP NDPA and NDP frames from all the involved Slave APs (see Fig. 17). The multi-AP NDPs from different APs are sent simultaneously, based on P-matrices [106]. After receiving all NDPs, the STAs send the NDP feedback frames. Each STA can address its feedback to its associated AP, to Master AP only [174], or to the APs from which the STA has heard NDPs.
For a high order of MIMO, multi-AP NDPs may take much airtime, so the submission [107] proposes to use LTF subcarrier interleaving, described in Section III-E. The sounding feedback from the STAs can also take much time, so the submission [175] suggests a method to reduce feedback information. The main idea is that a STA does not need to provide a detailed beamforming feedback for a far AP because involving this AP in joint transmission cannot provide much gain for the STA anyway. If a STA measures a poor channel to an AP, it can send either no feedback or a short channel quality indicator of the link, so the channel resources are saved.
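The feedback reduction idea can be expressed as a simple per-AP decision at the STA. The sketch below uses two SNR thresholds to choose between a full report, a short channel quality indicator, and no feedback; the threshold values, the labels, and the function names are assumptions made for illustration and are not specified in [175].

```python
def feedback_type(snr_db, full_threshold_db=15.0, cqi_threshold_db=5.0):
    """Pick how much sounding feedback a STA sends for one AP (assumed thresholds)."""
    if snr_db >= full_threshold_db:
        return "full_bfr"      # AP is close enough to help in joint transmission
    if snr_db >= cqi_threshold_db:
        return "cqi_only"      # short channel quality indicator
    return "none"              # far AP: skip the feedback entirely


def plan_feedback(snr_per_ap):
    """Map every AP heard during sounding to a feedback type."""
    return {ap: feedback_type(snr) for ap, snr in snr_per_ap.items()}


print(plan_feedback({"AP1": 25.0, "AP2": 8.0, "AP3": -2.0}))
```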
Based on the obtained sounding feedback, the Master AP selects which APs serve which STAs. It is crucially important to ensure that all the selected Slave APs correctly contribute to the transmission. Otherwise, the multi-AP system can lose its anticipated gain [176]. Although the selection procedure is not discussed yet, and the Slave AP selection algorithm will be out of the scope of the standard, it is proposed that the STAs can measure the fronthaul links and recommend a set of Slave APs for a multi-AP transmission [156], [177].

3) COLLECTING ACKNOWLEDGMENTS
After the data is transmitted, the STAs send BAs. In the MU case, the acknowledgments from multiple STAs shall be coordinated; otherwise, they collide. Thus, BAs should be sent sequentially or using UL OFDMA [178]. In the case of JTX, the APs need to synchronize information about the delivery of each frame. For that, each AP may listen to all the BAs and then disseminate information about the BAs it has heard to the other APs. Such an approach is rather complicated in the case of MU transmissions and requires much channel time. Another approach allows one AP (e.g., the Master AP) to collect all the BAs and then share this information with the other APs [178].
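A rough way to compare the two approaches is to count the inter-AP reports they generate. The Python sketch below uses a deliberately simple model that is our assumption, not part of [178]: every report is a unicast message from one AP to another, and losses and retransmissions are ignored.

```python
def reports_all_listen(num_aps):
    """Every AP hears the BAs and reports them to each of the other APs."""
    return num_aps * (num_aps - 1)


def reports_single_collector(num_aps):
    """One AP (e.g., the Master AP) collects the BAs and informs the others."""
    return num_aps - 1


for n in (2, 4, 8):
    print(n, reports_all_listen(n), reports_single_collector(n))
```

Under this model, the number of reports grows quadratically with the number of APs in the first approach and only linearly in the second.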

4) VIRTUAL BSS 4
In both DL and UL, the set of participating APs may vary dynamically based on the link quality, load balancing, etc. [179], [180]. Seamless exchange of frames between a STA and a group of cooperating APs is desired without additional negotiation overhead. Moreover, if a set of cooperating APs act as a single transmitter or receiver, the connection between the STA and the APs shall be secured. For that, the authentication and association procedures shall be done with all the APs of the set.
The submission [179] proposes to consider such a set of APs as a virtual BSS. All the APs of the set share the association/authentication and can have the same BSS ID. Hence, having finished the association procedure with a Virtual BSS, the STA does not need to re-associate when the physical AP serving it changes [181]. The coordinator of the Virtual BSS decides which AP serves an associated STA. The decision can take into account link qualities, channel capacity, the APs' capabilities, the number of served STAs, etc.
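The Virtual BSS concept can be sketched as a small data structure. The Python code below is a minimal illustration under stated assumptions: the class and method names are hypothetical, and the selection rule (strongest link, ties broken by the lower number of served STAs) is just one possible coordinator policy, not one proposed in [179] or [181].

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalAP:
    name: str
    served_stas: int = 0


@dataclass
class VirtualBSS:
    bssid: str                                   # shared by all APs of the set
    aps: list
    associated: set = field(default_factory=set)

    def associate(self, sta):
        # A single association is valid for every AP of the set.
        self.associated.add(sta)

    def select_ap(self, sta, rssi_dbm):
        """Coordinator decision; changing the AP needs no re-association."""
        if sta not in self.associated:
            raise ValueError("STA is not associated with the virtual BSS")
        # Hypothetical policy: best link first, then the least loaded AP.
        return max(self.aps,
                   key=lambda ap: (rssi_dbm[ap.name], -ap.served_stas))


vbss = VirtualBSS("02:00:00:00:00:01",
                  [PhysicalAP("AP1", 3), PhysicalAP("AP2", 1)])
vbss.associate("STA1")
print(vbss.select_ap("STA1", {"AP1": -60.0, "AP2": -70.0}).name)
```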

5) IMPLEMENTATION ISSUES
A large palette of approaches discussed in TGbe in the context of multi-AP cooperation raises many open issues related to their implementation and provided gains. The main issue is coexistence with other networks. Multi-AP cooperation requires tight synchronization between APs that can be done via wireless or backbone links. Similar attempts were made earlier in 11aa with HCCA TXOP negotiation [182], a rather unclear feature that is not implemented in real devices. Besides, centralized and distributed clusterization is a part of mmWave Wi-Fi [6], which is a niche project. In traditional Wi-Fi networks, it is typically assumed that the nearby APs may be produced by different vendors and have different owners. Thus, delegating any decisions to a competing AP is a questionable strategy. Multi-AP cooperation reflects a paradigm shift in the 802.11 Working Group. This innovation assumes that in an enterprise network, the majority of APs may belong to a single owner. Thus, any centralized decisions could be fruitful. However, it is not clear how efficient such decisions are in the presence of alien STAs that are not under control.
4 In the Wi-Fi standard, Basic Service Set (BSS) means a network.
Multi-AP operation requires advanced scheduling techniques that are left beyond the scope of the standard. Even in a simple scheme, namely Co-OFDMA, the APs need to exchange information about channel resource demands. Moreover, the sounding mechanism, which is required for all the considered methods of multi-AP operation, has not yet been studied for UL. The solutions proposed in TGbe for the DL scale poorly for UL. Besides, any centralized or distributed Multi-AP scheduling approach raises the fairness issue.
Many open issues are related to joint transmission and reception. One of them is the time, frequency, and phase synchronization of the APs. Synchronization can be affected by independent carrier frequency offsets, time shifts between the NDP and Data frames, carrier frequency drifts at the APs between the NDP and Data frames, and propagation delays from the APs to the STA [183]. The authors of [184] explain the impact of these impairments and give an approximation of the cumulative distribution function of the timing offset. Submission [183] suggests introducing midambles into long data frames to reduce the negative effect of any residual carrier frequency offset across APs.
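To see why a residual carrier frequency offset matters in long frames, recall that an offset of Δf rotates the phase by 2πΔf·t after t seconds. The Python sketch below evaluates this relation; the numeric values and the assumption that each midamble lets the receiver fully re-estimate the phase are illustrative simplifications, not results from [183] or [184].

```python
import math


def phase_drift_deg(residual_cfo_hz, elapsed_s):
    """Phase rotation (degrees) accumulated by a residual CFO over elapsed_s."""
    return math.degrees(2 * math.pi * residual_cfo_hz * elapsed_s)


def worst_case_drift_deg(residual_cfo_hz, frame_s, midamble_period_s=None):
    """Largest drift inside a frame.

    Simplifying assumption: the phase is re-estimated at every midamble,
    so the drift accumulates only since the last one.
    """
    window = frame_s if midamble_period_s is None else min(frame_s, midamble_period_s)
    return phase_drift_deg(residual_cfo_hz, window)


# Hypothetical numbers: 100 Hz residual offset, 1 ms data frame.
print(worst_case_drift_deg(100.0, 1e-3))            # without midambles
print(worst_case_drift_deg(100.0, 1e-3, 0.25e-3))   # midamble every 0.25 ms
```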
The variable nature of the wireless channel complicates joint transmission and reception. The relative gain of the channel across APs, which is used for precoding, should be very close to the relative gain at the time of transmission [185]. Any difference beyond 0.8 dB needs to be corrected.
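The 0.8 dB rule can be expressed as a simple check. In the Python sketch below, the function names are hypothetical, and we assume that the gains are amplitude gains (hence the factor 20 in the dB conversion); [185] should be consulted for the exact definition.

```python
import math


def relative_gain_db(gain_ap1, gain_ap2):
    """Relative channel gain between two APs in dB (amplitude gains assumed)."""
    return 20 * math.log10(gain_ap1 / gain_ap2)


def needs_correction(sounding_gains, tx_gains, limit_db=0.8):
    """True if the relative gain changed by more than limit_db since sounding."""
    change_db = abs(relative_gain_db(*tx_gains) - relative_gain_db(*sounding_gains))
    return change_db > limit_db


# A 5% amplitude change at one AP stays within the limit; 20% does not.
print(needs_correction((1.0, 1.0), (1.05, 1.0)))
print(needs_correction((1.0, 1.0), (1.2, 1.0)))
```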
Submission [185] additionally highlights the backhaul requirements for joint transmission and reception. This method requires that the data of all participating STAs be available at all APs, which can be achieved with the backhaul. In theory, the backhaul can be deployed on the same channel as the fronthaul or on another wireless/wired channel. In practice, the usage of a wireless backhaul is questionable because it shall have a huge capacity, exceeding the cumulative throughput of all the STAs in the network.
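The capacity condition stated above amounts to a one-line check. The following Python sketch uses a hypothetical function name and made-up throughput figures purely for illustration.

```python
def backhaul_is_sufficient(sta_throughputs_mbps, backhaul_capacity_mbps):
    """True if the backhaul capacity exceeds the cumulative STA throughput."""
    return backhaul_capacity_mbps > sum(sta_throughputs_mbps)


# Three STAs with hypothetical throughputs need more than 2500 Mbps in total.
print(backhaul_is_sufficient([500, 800, 1200], 2000))
print(backhaul_is_sufficient([500, 800, 1200], 3000))
```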

IV. CONCLUSION
The 802.11be amendment is the next significant milestone in the long-term success story of Wi-Fi. Its core features are related to providing extremely high throughput and supporting real-time applications. Although the standard development process is at a very early stage, we can already sketch the future technology and point out its advantages and limitations together with open issues, which require additional efforts from the community.
In the tutorial, we introduce seven significant innovations of Wi-Fi 7 and describe in detail the related proposals. In theory, higher nominal data rates and lower latencies can be achieved by using only the first innovation: the EHT PHY. However, in practice, the EHT PHY alone cannot provide notable gains in goodput and latencies for end users because of the unlicensed spectrum, interference, and massive overhead. That is why, in addition to the EHT PHY, TGbe discusses the other six innovations for Wi-Fi 7. The modified EDCA and OFDMA will provide support for RTA. Furthermore, OFDMA will become more flexible to improve spectrum efficiency. Bringing multi-link operation inside the Wi-Fi standard adds flexibility to resource usage and offers a complementary approach to higher bandwidth utilization and even higher throughputs. Much effort towards minimizing the channel sounding overhead aims to open the door for efficient massive MIMO Wi-Fi systems. Finally, TGbe discusses advanced PHY approaches, such as HARQ, NOMA, and FD, that can increase spectral efficiency, and various multi-AP cooperation approaches. Within the latter group of proposals, we see another paradigm shift from interference mitigation by separating transmissions in time, frequency, space, or power to joint transmission within a distributed massive antenna system. Although TGbe may postpone many of the advanced PHY and multi-AP cooperation features for the next Wi-Fi versions, they show us the direction of the further evolution beyond Wi-Fi 7.
In addition to introducing the reader to the anticipated features of 11be, we try whenever possible to give further hints on open issues interesting for industry and academia. These open issues are related both to the mechanisms that should be included in the standard (e.g., sounding procedure, HARQ framework, multi-link channel access) and to the algorithms beyond the scope of the standard. We describe many optimization problems and implementation issues. We hope that our survey will attract researchers from the top telecommunications companies and leading universities to the 802.11be challenges, and that these researchers will ultimately contribute to the new technology with thorough studies and efficient solutions.