Introduction
In the last 40 years, telecommunications have progressed significantly. To briefly recall, in the 1980s the first cellular systems (1G) emerged in Europe and the United States with the Total Access Communication System (TACS), the Nordic Mobile Telephone (NMT), and the Advanced Mobile Phone System (AMPS), while around 1990 we witnessed the global spread of “the Internet”. Developments then continued with 2G cellular systems such as the Global System for Mobile Communications (GSM), Japanese Digital Cellular (JDC) and Interim Standard 95 (IS-95), and with 3G cellular systems such as the Universal Mobile Telecommunications System (UMTS) and CDMA2000. A further step forward in telecommunications and services occurred when smartphones entered the market and the throughput provided by cellular systems became sufficient to give users fast access to the Internet on the move. This led to the era of the “Mobile Internet” around 2005. In those years, High-Speed Packet Access (HSPA) and Long Term Evolution (LTE) spread across the telecommunications landscape [1], [2].
Ten years later, a new service began to emerge: the “Internet of Things” (IoT). In the IoT, communications between objects are enabled, allowing services where sensors or actuators can be accessed remotely through software platforms [3]. The concept had already been present for several years, and experimental communications between objects and networks had already been conducted (e.g., remote asset monitoring and vehicle fleet management). The development of the IoT has been slower, and it is still ongoing. The reasons are multiple: 1. The market is not very remunerative. For services dedicated to people (e.g., voice and mobile Internet), telecommunication operators had a high payback, allowing them to invest heavily, at least until 2010, in technological developments and network deployments; 2. Communication technologies were not equally well suited to connecting sensors and other objects with low processing capacity (i.e., the mobile network was not designed for the typical characteristics of sensor communications, such as low transmission frequency, many sensors in an area, and low power consumption); 3. Terminals have always played a prominent role, and without suitable terminals the telecommunications market struggles to develop, as happened for the IoT market. Sensors, often with low processing capacity and energy constraints, have been developed by those directly interested in monitoring services rather than by international vendors. This resulted in lower business margins compared to smartphones and tablets.
The advent of 5G has brought about new services thanks to significant technological developments. Three communication classes potentially enable new services: enhanced Mobile Broadband (eMBB), which provides higher data rates; massive Machine Type Communications (mMTC), which enables a high number of connections per area; and ultra Reliable Low Latency Communications (uRLLC), which guarantees low latency in communication. However, the full potential of 5G is not yet fully exploited, and the services provided struggle to be monetized by telecommunication operators and vendors [4].
The question now is, what’s next? What will be the new services that will spread in the coming years? Before answering this question, it is worth stating that, in the author’s opinion, the role of future communications will go far beyond the simple connectivity of devices or sensors: it will be to provide access to information and knowledge. According to [5], future networks will enable some of the United Nations Sustainable Development Goals (UN-SDGs) such as sustainability, circular economy, zero-waste, and zero-emission. They will also enable new verticals such as digital health, smart transportation, and smart factories. New business models will be possible (local demand-supply-consumption), reducing barriers to entry for service provision thanks to the decoupling of hardware and software (meaning that even small companies will contribute to innovation). Finally, users will contribute to data generation, promoting both human-centric knowledge and pluralism and diversity of views.
Going back to the question “what’s next?”, according to the author, as well as many experts, researchers and visionaries in the telecommunications sector, we are close to a new and significant step forward that will be taken in the coming years. Advancements in the telecommunications systems currently being developed will soon be capable of providing connectivity for new services. Alongside these, there will be the development of devices (although working products are becoming available on the market, some steps still need to be taken to contain costs and leverage economies of scale) and monetization by stakeholders (i.e., telecommunication operators or over-the-top providers), who will need to finalize their business plans.
According to the author, the next services that will be available to a wide audience are depicted in Figure 1. One of the services expected to spread in the future is Immersive Communications. After phone calls and video calls, communications will become increasingly immersive and engaging for the human user. In addition to smartphones and tablets, other devices such as goggles, viewers, or Head-Mounted Displays (HMDs) will be utilized. This involves combining the features of both the eMBB and uRLLC classes defined in 5G. Typical use cases include remote collaboration for joint design and training, gaming and entertainment, and virtual tourism.
A second service will involve not only people but also every type of object connected to the network, accessible remotely and from nearby devices, capable of providing microservices. The types of devices will have different characteristics, as described in Section III. The concept of mMTC introduced in 5G will be extended because the environments in which the Everything Connected concept will be provided will be very different, ranging from wide-area environments like smart cities to local environments like smart offices and smart buildings, up to personal environments, involving sensors and devices such as smartwatches and medical devices with the aim of improving our daily lifestyle. Typical use cases include monitoring health parameters of hospitalized or home patients, predictive maintenance for vehicles and industrial machinery, supply chain management, and remote control of security cameras.
Finally, a third very important service will be High-Positioning. This is not simply about locating a smartphone or an IoT device at the level of a cell or within a
Applications may require more than one of the three main services. For example, the combination of high-positioning and everything connected can enable a smart warehouse. An operator wearing a headset can be made aware of packages, their contents and, most importantly, their location, based on the activity to be carried out, significantly reducing operational time. Applications based on everything connected and immersive communications can enable a person to pilot a drone through a headset, allowing interaction with the drone’s controls and potentially with other objects nearby. Finally, high-positioning and immersive communications can support remote driving and smart transportation.
To provide these services, the next-generation wireless networks (or 6G) need to achieve much more stringent performance requirements than those in 5G. To this end, it is necessary to define new performance indicators for proper design and evaluation of the network in 2030, as well as new traffic models. 3GPP has defined numerous advancements to increase data rate, reliability, and coverage, reduce latencies, and improve network management. In the paper, the upcoming developments envisaged in Release 18 and Release 19 will be described, and the most promising enabling technologies to deliver the three main innovative services will be investigated.
The three main services represent advancements beyond the previous classical services from which they derive, and, most importantly, they form the foundation for providing new applications. Their definition implies proactively understanding the key and fundamental functionalities common to a certain group of applications, which enables the definition of a set of techniques, technologies, and telecommunication architectures. These elements are crucial in designing the bearer communication system and specific applications, thereby creating a business market. This is likely to propel us forward, similarly to how voice transmission paved the way for cellular systems, data transmission for the Mobile Internet, and machine-to-machine or human-to-machine communication for the IoT.
The paper contributions are outlined as follows:
Presenting, according to the author’s perspective, the new foundational services that will be provided by the next-generation telecommunications systems (6G). In line with this vision, the author introduces their characteristics and proposes new specific Key Performance Indicators (KPIs) for each bearer service. The aim is to facilitate the effective design of the next-generation network and assess its performance optimally.
Critically examining the developments defined or proposed by the 3GPP in Releases 17, 18, and 19, and exploring other potential systems and technologies that could be integrated in the network 2030. The advancements from 3GPP are analyzed within the contexts of the three new bearer services, distinguishing this paper from many in the literature that often limit themselves to addressing only specific aspects (e.g., holography, virtual reality, or smart cities).
Reporting and proposing the requirements that the next-generation network must fulfill to support the bearer services alongside applications that are expected to proliferate in the next 3–8 years. The analysis refers to applications ready to be available in the short term (2–4 years), such as smart cities and smart agriculture; in the medium term (3–5 years), such as smart factories, smart transportation, and digital health; and in the long term (not before 6–8 years), such as autonomous driving and remote surgery. In addition, the paper identifies the most promising enabling technologies.
Some other papers in the literature are dedicated to the expectations for the network in 2030. However, they usually treat only limited aspects of the entire context, differently from what is presented in this paper. In 2018, ITU presented the market drivers for the network in 2030 and some possible vertical markets [6]. In [7], a high-level vision of future wireless networks is given in terms of applications and technological trends, but it is limited to a list without going into deep detail. In [8], the author gave an overview of the latest advances in Release 18. In [9], the authors extensively presented enabling technologies for 6G and the corresponding projected timeline. In [10] and [11], the authors focused on XR standardization activities within 3GPP. Some vendors have also proposed their use cases supported by future networks, including Nokia [12], Ericsson [13], Huawei [14] and Samsung [15].
Each section describing a new service is divided into three areas: the specific services supported by the bearer service, the KPIs necessary for a proper performance evaluation and design, and the set of enhancements defined or expected in the upcoming releases of 3GPP. In particular, the C subsections detail the main advancements investigated in the 3GPP releases in order to support each bearer service, including new techniques, optimized algorithms and architectures, or simple parameter settings. Refer to Figure 2 for organizational details. After presenting the three bearer services, the main and most innovative applications for our lives and interactions related to the three bearer services are described. Finally, according to the author, the most promising technologies enabling services in the next-generation network are presented.
Before concluding this section, some useful definitions are reported:
Service. In telecommunications, a service refers to the method of performing a communication process, provided to users over a network; it includes the infrastructure, protocols and other mechanisms involved. Classic telecommunications services include voice calls, messaging, Internet access, video conferencing or streaming, etc., usually provided by a network operator but also by external service providers.
Application. In telecommunications, an application refers to a software program or component that runs on a device and utilizes telecommunication services to perform specific tasks or provide functionalities to users according to their settings. Classic examples include email, chat, web browsing, social media and soccer match streaming. Applications, accessed by user devices, are typically developed by individual software developers or companies.
Use Case. A use case in telecommunications refers to a specific scenario or situation in which telecommunication technologies, services, or applications are utilized to achieve a particular goal. It outlines the interactions between users, devices, and systems within a given context, detailing the sequence of steps and actions required to accomplish a specific task or objective. The aim of use cases is to visualize how telecommunication solutions can address real-world issues, thus helping designers and developers in understanding the behavior of the telecommunication system. They typically describe the use of specific services in a given situation, such as voice calls in a high-speed train, online shopping using goggles, remote monitoring, and Internet access in remote areas.
Enabling technologies. In telecommunication, enabling technologies refer to the group of techniques, advancements and technologies that support and implement services and applications in the telecommunication system. These technologies are designed to support the network key capabilities such as full coverage, multi broadband connectivity, low-latency, confidentiality and high-precision positioning.
In detail, the paper is organized as follows. In Section II, the characteristics of immersive communications are described. Novel KPIs for their evaluation and the enhancements defined in 3GPP Releases 17, 18, and 19 are also reported. In Section III, the characteristics of the types of smart things within the everything connected service are described. The various enhancements defined in 3GPP to ensure proper communications, the platforms, and the environments where the new smart objects will be deployed are reported. Section IV describes the implemented techniques and technologies to ensure accurate and timely positioning. Moreover, enhancements defined in 3GPP and new services supported by high-positioning are also reported. In Section V, service requirements and main applications enabled by the three services are provided. Before presenting the conclusions in Section VII, Section VI analyzes the enabling technologies that will be studied and implemented in the path from 5G-Advanced to 6G. Enjoy reading.
Immersive Communications
A. Specific Services
Immersive Communications (IC) are the evolution of classical telecommunications services, following the voice, video-streaming and video-call services that telecommunication systems have provided from the beginning up to now. IC provide a natural, high-quality human experience by merging the physical world with a digital or simulated reality. They are based on multi-channel and multi-sensory interactivity, with the adoption of digital equipment such as HMDs or hologram devices. Users can interact with each other in virtual spaces, regardless of their physical location. In addition to videoconferencing, IC have the potential to transform various industries and fields such as remote collaboration, education, training, telemedicine, virtual events, and more. They enable more natural and engaging communication, facilitating better understanding, collaboration, and innovation among individuals or groups, regardless of their physical location [16].
This group encompasses services that focus on providing engaging and immersive user experiences. Some relevant examples are Extended Reality (XR), Holographic-Type Communications and Haptic Communications, also including scents [17]. Under the umbrella of XR fall Augmented Reality (AR), Mixed Reality (MR) and Virtual Reality (VR). While they may have been initially considered similar, they actually have distinct characteristics. According to the author, it is preferable to consider AR and MR as XR, because they are technologies that blend the physical and digital realms by extending reality, while VR is best kept separate from them because it completely replaces the real world. Moreover, AR and MR have different use cases and applications with respect to VR [18].
1) Extended Reality (XR)
In the first step (the lowest level of virtuality), some digital objects (such as vehicles, faces and simple animals) are overlaid onto the real world, enhancing the user’s perception of reality. This is called AR, and it can be experienced through devices like smartphones or AR glasses, allowing users to see and interact with virtual objects in their physical environment. MR goes a step further by combining virtual content with the real world, creating an interactive and immersive experience. It allows users to see and interact with virtual objects that appear to coexist with their physical environment. MR often uses specialized hardware like headsets or glasses (e.g., HoloLens) to achieve this, with the aim of extending the computer functionalities. In this way, MR reports information with up to six Degrees of Freedom (6DoF), i.e., the 3D position [x, y, z] and the three rotation axes [yaw, pitch, roll]. The implementation procedure is based on three main steps: i. content transmission, where the video frames are captured by an AR/MR device; ii. rendering, where, after some processing on the collected frames (e.g., object detection/recognition, object positioning and physical/virtual mapping), data are added to the augmented frames and rendered by the AR/MR device; iii. feedback collection by the AR/MR device to select the content to deliver next. Depending on the collected contents (e.g., from gloves or inertial sensors), some scene processing can be performed locally (i.e., on the device) or in a nearby edge server to offload the device.
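As a concrete illustration of this three-step loop, the following minimal Python sketch mimics capture, (local or edge-offloaded) processing, rendering and feedback collection; the device-load threshold, function names and data structures are purely hypothetical and not taken from any specification.

from dataclasses import dataclass

@dataclass
class Frame:
    pixels: bytes        # raw camera capture from the AR/MR device
    pose_6dof: tuple     # (x, y, z, yaw, pitch, roll)

def process_scene_locally(frame):
    # Placeholder for on-device object detection and physical/virtual mapping.
    return ["overlay_object"]

def process_scene_at_edge(frame):
    # Placeholder for the same processing offloaded to a nearby edge server.
    return ["overlay_object"]

def ar_mr_loop(frames, device_load, offload_threshold=0.7):
    # Illustrative capture -> render -> feedback loop for an AR/MR device.
    for frame in frames:
        # i. content transmission: the device captures a video frame.
        # ii. rendering: detect/recognize objects and augment the frame,
        #     offloading to the edge when the (hypothetical) device load is high.
        if device_load > offload_threshold:
            overlays = process_scene_at_edge(frame)
        else:
            overlays = process_scene_locally(frame)
        rendered_frame = (frame, overlays)
        # iii. feedback collection: pose/sensor data steer the next content.
        feedback = {"pose": frame.pose_6dof}
        yield rendered_frame, feedback

for rendered, feedback in ar_mr_loop([Frame(b"", (0, 0, 0, 0, 0, 0))], device_load=0.9):
    print(feedback)   # {'pose': (0, 0, 0, 0, 0, 0)}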
2) Virtual Reality (VR)
Differently from AR/MR, VR creates a fully immersive digital experience by completely replacing the real world with a simulated environment. Users typically wear a VR headset that covers their eyes and ears, transporting them to a virtual world. This is often accompanied by hand controllers or other input devices for interaction. Human vision is not uniform like a screen: it varies with the distance from the central view, due to binocular vision and to the cone and rod density over the retina. Knowing how human vision works is fundamental to reducing the complexity of the optical hardware, as well as of the software and content structure, without degrading the user’s display immersion and comfort [19].
In general, the width of the observable virtual world that can be seen when using VR/AR/MR headsets is referred to as the Field of View (FoV). To have a realistic experience, a wide FoV, usually measured in degrees, is necessary. Typical headsets available on the market have a horizontal FoV ranging from
3) Holographic-Type Communication (HTC)
HTC refers to technologies for transmitting and receiving 3D holographic images and videos, creating advanced and engaging communication experiences. Differently from traditional two-dimensional images or videos, holograms provide a more immersive and realistic experience, as they can display objects and scenes in the environment with depth and parallax.
HTC is composed of three processes [20]. First, the capturing process collects a 3D representation of an object, person or environment. Nowadays, the most common technologies are LiDARs, based on the time-of-flight (ToF) mechanism, which measures distance from the time it takes a light pulse to travel to its destination and back, or stereo/multi cameras for depth information. Some post-processing steps may be necessary to provide one single stream and to reduce the data size (e.g., removing redundancy and noise). Second, the transmission process is not just a matter of bit rate and bandwidth constraints; it also involves encoding and compression of the data based on real-time constraints and on the semantic meaning of the holographic images. This is fundamental for offloading heavy processing from the displaying device and for limiting network congestion. Third, the display or rendering of the content is the generation of the 3D holographic images. At the moment, the main device types are smartphones/tablets (i.e., handheld devices), holographic displays and AR/VR glasses. With these devices, a hologram can be placed on a table or in a room and the user can move around it. Generally, the 3D scene is processed in the cloud, but 6DoF tracking of the head movement allows the rendering to be split with the network edge. In addition, Artificial Intelligence (AI) can support content generation in case of delays in reaching the cloud or bandwidth constraints.
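To give an idea of why the compression step matters, the short sketch below estimates the bit rate of a point-cloud hologram stream; all figures (points per frame, bytes per point, compression ratio) are illustrative assumptions, not values taken from the cited specifications.

def holographic_stream_rate(points_per_frame, bytes_per_point,
                            frames_per_second, compression_ratio):
    # Rough bit-rate estimate (Mbit/s) for a point-cloud hologram stream:
    # raw rate divided by the compression factor achieved by the encoding step.
    raw_bps = points_per_frame * bytes_per_point * 8 * frames_per_second
    return raw_bps / compression_ratio / 1e6

# Hypothetical figures: a 100k-point object at 30 fps, 9 bytes per point
# (three 16-bit coordinates plus RGB colour), compressed 30:1.
print(f"{holographic_stream_rate(100_000, 9, 30, 30):.1f} Mbit/s")   # 7.2 Mbit/s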
4) Haptic Communications (HC)
HC involves touch in the interaction between two or more people, or between a person and an object. The definition has been extended as electronic devices such as smartphones or smartwatches are able to provide feedback via taps, vibrations, and pressing and releasing sensations [21]. The next step is remote HC. Current remote communications are still mainly audio-visual, but HC is growing and has a potential impact in several fields of life [22]. In some cases, HC can be considered part of MR, used to enhance VR or simply to make a cloud game more realistic. Nevertheless, it has a quite different function. First, HC development and diffusion depend on the haptic interface needed to collect the touch or kinesthetic feedback and to provide it to the user [23]. At the moment, haptic interfaces available on the market or as prototypes include finger-skin devices, gloves, bracelets and sleeves, but also vests and jackets (as wearables) or consoles and smartphones (as external devices), able to provide vibro-tactile [24], force [25] or thermal [26] feedback. Second, the delay requirements are more stringent than those of video or audio. Third, the applications are quite different. Touch feedback may be important in some dementia pathologies such as Alzheimer’s, where HC can provide comfort and calm, or for people with visual and hearing impairments, for whom HC may enable remote communication, or simply to amplify the emotions experienced by people and their quality of experience.
Similarly to holograms, HC is composed of a generation phase for the touch data, a transmission phase, and local reproduction at the user side through the haptic interface. With regard to tactile data acquisition, force sensors, thermistors, and laser scanners are mainly employed to measure or assess factors such as friction and hardness, warmth, and macroscopic roughness. In haptic data transmission, it is possible to perform data reduction to counteract bandwidth limits. This can be done at the sender haptic interface or in an intermediate server. The haptic stream may be composed of several sub-streams generated by different haptic interfaces of the sender or by several senders located in different places. For an optimal experience, all sub-streams should be properly synchronized at the recipient haptic interface [27]. In any case, it is expected that XR and HC will be integrated in the near future; an example is reported in [28]. The latency requirements may be even lower than those of MR/VR (e.g., 0.5–3 ms), but the data rate is certainly lower (about 2–5 Mbit/s).
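A minimal sketch of the sub-stream synchronization just mentioned is given below: timestamped samples from several haptic interfaces are merged and grouped into synchronized events before reproduction. The 1 ms tolerance and the data layout are illustrative assumptions only.

import heapq

def synchronize_substreams(substreams, tolerance_ms=1.0):
    # Merge timestamped samples from several haptic sub-streams and group
    # those falling within tolerance_ms of each other into one event, to be
    # reproduced jointly at the recipient haptic interface.
    merged = heapq.merge(*substreams)          # global time ordering
    event, event_time = [], None
    for ts, sample in merged:
        if event_time is None or ts - event_time <= tolerance_ms:
            event.append(sample)
            event_time = ts if event_time is None else event_time
        else:
            yield event
            event, event_time = [sample], ts
    if event:
        yield event

# e.g., a vibro-tactile and a thermal sub-stream from two sender interfaces
vibro = [(0.0, "v0"), (2.0, "v1")]
thermal = [(0.4, "t0"), (2.3, "t1")]
print(list(synchronize_substreams([vibro, thermal])))   # [['v0', 't0'], ['v1', 't1']]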
B. Traffic Model and Advancements From Release 15 to Release 17
At the beginning (Release 15), 5G introduced several enhancements suitable for increasing the data rate and reducing the latency, such as mini-slot transmissions, downlink pre-emption, grant-free transmissions, and Demodulation Reference Signals (DMRS) to increase the Multiple-Input Multiple-Output (MIMO) effectiveness for users. Nevertheless, all these new technical functionalities were not designed directly for IC (e.g., MR/VR or HTC applications). Small enhancements for IC were introduced in Release 16, as reported in the following [11]. Concerning latency reduction, dynamic power boosting is allowed for a User Equipment (UE) transmitting time-sensitive traffic with respect to eMBB, and uplink pre-emption is enabled. In order to preserve UE battery life, Discontinuous Reception (DRX) is enhanced also in Connected Mode (i.e., Connected Mode Discontinuous Reception, CDRX), where the UE may remain asleep even if it is in Connected Mode unless the network sends a wake-up signal on the Physical Downlink Control Channel (PDCCH) for the ‘OnDuration’ time period. Moreover, the UE may indicate preferred power-saving parameters and CDRX configurations within the UE Assistance Information (UAI), adjusted according to the running application. Concerning the scheduling strategy, Semi-Persistent Scheduling (or Configured Grant, CG) is defined, where the network may configure one or more Physical Resource Blocks (PRBs) for a UE for a given time period. This complements dynamic scheduling, where the gNode B (gNB) assigns the PRBs to the UE after its Scheduling Request (SR). With dynamic scheduling (or Dynamic Grant, DG), the UE has to monitor the Downlink Control Information (DCI) to know when the network allows it to transmit uplink data, resulting in a variable transmission time instead of the fixed and constant one of the CG strategy. Concerning data rate increases, Carrier Aggregation (CA) and MIMO are also enhanced by increasing the number of carriers that can be aggregated and the supported layers. Moreover, possible flow adaptation may be provided through the UAI based on application requirements.
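To make the difference between the two grant strategies tangible, here is a toy delay model (not a 3GPP procedure): with CG the UE simply waits for the next pre-configured occasion, while with DG it must first send an SR and then wait for the grant. All periodicities and delays below are invented for illustration only.

def uplink_access_delay(arrival_ms, scheme, cg_period_ms=4.0,
                        sr_period_ms=2.0, grant_delay_ms=3.0):
    # Time between data arrival at the UE and its uplink transmission.
    # CG: wait for the next pre-configured occasion.
    # DG: wait for the next Scheduling Request opportunity, then for the grant.
    if scheme == "CG":
        return (-arrival_ms) % cg_period_ms
    if scheme == "DG":
        return (-arrival_ms) % sr_period_ms + grant_delay_ms
    raise ValueError("scheme must be 'CG' or 'DG'")

for t in (0.5, 1.2, 3.9):
    print(t, round(uplink_access_delay(t, "CG"), 2), round(uplink_access_delay(t, "DG"), 2))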
The real XR/HTC standardization started with Release 17. In 3GPP TR 26.928 [18], XR has been clearly defined and the most common use cases have been described in terms of categories (e.g., MR, VR, Cloud Gaming), devices (e.g., smartphone, HMD, AR glasses) and Quality of Service (QoS)/Quality of Experience (QoE) requirements (e.g., in terms of Mbit/s, maximum allowed delay, frames per second). The 23 envisaged use cases can be grouped into five main categories: social and immersive sharing (e.g., 3D shared experience, emotional sharing, shared museum visits), gaming (e.g., immersive gaming as player or spectator, gaming party), 3D communications (e.g., 360-degree conference, convention, avatar calls, XR meeting), online shopping (e.g., shopping from a catalogue, shopping with the shop assistant) and dedicated applications for industries and police forces (e.g., AR guided remote assistance).
Moreover, in 3GPP TR 38.838 [29] the XR traffic model and KPIs were described. Basically, 3GPP considered four traffic types: downlink (DL) Video, uplink (UL) Video, DL audio and data, and finally UL Pose and Control Traffic. DL Video applies to both VR and MR, while for MR video can also be carried in the uplink. A single frame, which contains data for the left and right eye, is modelled as a set of IP packets. The stream can be single or multiple in case the scene is encoded separately and sent in different streams. The third traffic type is for aggregated video and audio/data, both in DL for VR (e.g., for gaming) and in DL/UL for MR. This traffic has separate streams for audio and external data. The fourth traffic type is related to the pose of the UE device (e.g., movements and rotations) and control information (e.g., auxiliary data, user commands and local sensor data). This information is important so that the XR server can provide the UE with the most up-to-date content.
The generic video traffic for MR/VR is modelled as a sequence of video frames arriving at the gNB with an inter-arrival time determined by the video frame rate and a random jitter term: \begin{equation*} T_{f}=\frac {1}{\lambda _{f}} + J_{f} \tag {1}\end{equation*}
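A minimal Python sketch of the arrival process in (1) is given below; the jitter is drawn from a zero-mean Gaussian clipped to a small range, whose parameters are illustrative and should be replaced by those of the evaluation methodology in use.

import random

def xr_frame_arrivals(frame_rate_fps, n_frames,
                      jitter_std_ms=2.0, jitter_clip_ms=4.0):
    # Generate frame arrival times (ms) following (1): T_f = 1/lambda_f + J_f.
    period_ms = 1000.0 / frame_rate_fps          # 1/lambda_f
    arrivals, t = [], 0.0
    for _ in range(n_frames):
        arrivals.append(round(t, 2))
        jitter = max(-jitter_clip_ms,
                     min(jitter_clip_ms, random.gauss(0.0, jitter_std_ms)))
        t += period_ms + jitter                  # next inter-arrival time T_f
    return arrivals

print(xr_frame_arrivals(60, 5))   # 60 fps video: nominal period 16.67 ms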
Given the application characteristics, the traffic is composed of one or more sets of Packet Data Units (PDUs), each one carrying the data of a video frame or any MR/VR payload (see [18]). Data generated by the application in a short period of time, which should be transferred by the network, therefore usually form a burst. A PDU Set is then composed of quasi-periodic PDUs at the period of
Data burst for XR traffic model: (a) its relation with the PDU sets; (b) mismatch between XR traffic at 60 Hz and SPS scheduling in Rel-16 for 16 slots and 17 slots.
Based on the application, the PDU Set can be formed by information units that can still be recovered by the application layer in case one or more PDUs are missing. For this reason, the KPI used for the system capacity assessment is \begin{equation*} C_{S} \equiv N_{\mathrm {total}\; \mathrm {UE}} \quad \mathrm {if} \quad \Pr \left \{{{\frac {N_{\mathrm {satisfied}\; \mathrm {UE}}}{N_{\mathrm {total}\; \mathrm {UE}}}\geq Y_{\%} }}\right \} \tag {2}\end{equation*}
\begin{align*} & Pr\left \{{{\frac {N_{\mathrm {frame}\; \mathrm {in} \; \mathrm {time}}}{N_{\mathrm {total}\; \mathrm {frame}}}\geq X_{\%} }}\right \} \; \mathrm {or} \\ & \Pr \left \{{{\frac {N_{\mathrm {correct} \; \mathrm {packet}}}{N_{\mathrm {total}\; \mathrm {packet}}}}}\right \} \geq \mathrm {PER_{T}} \tag {3}\end{align*}
In (3), the frames counted as in time are those delivered within the Packet Delay Budget (PDB) target: \begin{equation*} N_{i} \in \{\mathrm {set \; of \; the \; frames \; in \; time} | \mathrm {PDB}(N_{i})\lt \mathrm {PDB}_{T}\} \tag {4}\end{equation*}
A received packet is ‘correct’ in case it is readable by the application. Equations (2) and (3) should be evaluated statistically (i.e., with the probability Pr{}), as they depend on traffic and environmental characteristics such as the source data rate, the frame generation rate, the jitter, the environment and the distance between the UE and the gNB. From the Release 17 XR evaluation methodology [29], the baseline parameters in (2) and (3) are
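The capacity criterion in (2)-(4) can be evaluated per simulation drop as in the sketch below; the per-UE delay lists and the percentages passed as arguments are placeholders, with the baseline values to be taken from [29].

def system_capacity_check(per_ue_frame_delays_ms, pdb_target_ms, x_pct, y_pct):
    # One simulation drop: a UE is 'satisfied' when at least x_pct of its
    # frames arrive within the PDB target (4); the drop passes the capacity
    # criterion (2) when at least y_pct of the UEs are satisfied.
    satisfied = 0
    for delays in per_ue_frame_delays_ms:
        in_time = sum(1 for d in delays if d < pdb_target_ms)   # condition (4)
        if 100.0 * in_time / len(delays) >= x_pct:              # condition (3)
            satisfied += 1
    ratio = 100.0 * satisfied / len(per_ue_frame_delays_ms)
    return ratio >= y_pct, ratio                                # condition (2)

# Toy check: 3 UEs, 10 ms PDB, 99% of frames in time, 90% of UEs satisfied.
print(system_capacity_check([[5, 6, 7], [4, 12, 6], [3, 4, 5]], 10, 99, 90))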
C. Enhancements in Release 18
The introduction of the KPIs for MR/VR allowed for a more precise definition of aspects to be improved and considered in Release 18. The 5G System (5GS) architecture integrating XR and Media (XRM) services is the classic one defined in Release 15, for supporting external services/applications as reported in Figure 4 [18], [30].
Consider a 5G-XR Application Provider, i.e., an XR application provider that makes use of 5G System functionalities for its services. For this purpose, it provides a 5G-XR Aware Application on the UE to make use of a 5G-XR Client and network functions using network interfaces and APIs, potentially defined in 5G-XR related specifications. Within the 5GS, there are a dedicated Application Function (AF) for XRM (namely, the 5G-XR AF), with the aim of providing some control functionalities to the XR Session Handler in the 5G-XR Client and to the 5G-XR Application Provider, and the 5G-XR Application Server (AS), hosting the 5G-XR media functions. The 5G-XR AF may interact directly with the Policy Control Function (PCF) or with the Network Exposure Function (NEF) for exploiting other network functions in case it is located in a trusted network. On the contrary, the 5G-XR AF may interact only with the NEF if it is on an external network. Finally, in the 5G-XR Client the XR Engine completes the functionalities to access XRM data from the 5G-XR AS and to process sensor and tracking data with the XR Session Handler for XR session control. In addition, a 5G-XR Aware Application may use the Uu interface (i.e., directly through a gNB) or the PC5/sidelink interface (i.e., relaying data through another UE) to access the XRM services according to [31], similarly to the Vehicle-to-everything (V2X) service as in [32].
In addition to the XR architecture within the 5GS, other enhancements have been investigated and defined in Release 18 as reported in the following [33].
XR Awareness. As reported in Figure 3, PDU Sets and Data Bursts in XR carry content that the application treats as a single unit, such as a segment of an image or a video/audio frame, and determine the duration of a data transmission. Then, in case a Data Burst is recognized, the Radio Access Network (RAN) may improve its transmission. Some information may be provided by the Core Network (CN) to the RAN [34], such as: i. the traffic jitter of the frames in the PDU Set, UL/DL traffic periodicity and PDU Set QoS parameters useful for QoS flow matching; ii. data on PDU Sets, such as their size, the last PDU of the PDU Set, the sequence number within a PDU Set, and the PDU Set Importance (PSI) compared to other PDU Sets within a QoS Flow; iii. new performance parameters such as the PDU Set Error Rate (PSER) and the PDU Set Delay Budget (PSDB), which are applicable to the whole PDU Set and described by the PDU Set Integrated Handling Indication (PSIHI). This allows the RAN to take countermeasures in case of congestion and during the scheduling phase, in order not to degrade the XR application performance (e.g., discarding less important PDUs or scheduling earlier the PDUs that are about to exceed their PDB; see also, for example, [35]). In [36], four strategies for mapping the PDU Sets (at the User Plane Function, UPF) onto QoS flows (at the Service Data Adaptation Protocol (SDAP) in the gNB) have been proposed. In 111 mapping, each PDU Set is mapped into one single QoS flow, which is then mapped into one single Dedicated Radio Bearer (DRB); in NN1 mapping, each PDU Set is mapped into one single QoS flow, but all QoS flows in the SDAP are then mapped into one single DRB in the Packet Data Convergence Protocol (PDCP); in N11 mapping, all PDU Sets are multiplexed into one single QoS flow, which is then mapped into one single DRB; finally, in N1N mapping, all PDU Sets are multiplexed into one single QoS flow, which is then split into multiple DRBs.
Power Saving. In order to save the UE battery, the receiver is requested to periodically turn off, thus implementing the DRX. The problem with XR traffic is that the alignment between the XR frame rate (typically 30, 60, 90 and 120 fps, i.e., periods of 33.33, 16.67, 11.11 and 8.33 ms, respectively) and the 5G frame periodicity with its integer DRX cycle values is critical (see Figure 3.b), causing extra power consumption, since the UE has to stay awake, or introducing extra latencies; the drift is quantified in the sketch after this list of enhancements. Thanks to the PDU Set information, the PDCCH monitoring in DRX can be adjusted correspondingly, as well as adapted when a PDU Set is starting or ending, with respect to the original settings.
Capacity enhancements. Three main enhancements for XR capacity are defined in Release 18. The first one is at the PHY layer and foresees the possibility of signaling multiple Physical Uplink Shared Channel (PUSCH) occasions with one single configured grant, thus saving overhead. The second enhancement, also at the PHY layer, relies on the UE indicating unused PUSCH occasions through the Uplink Control Information (UCI), based on the actual size of the application uplink data. This allows the gNB to potentially reallocate some or all of the unused uplink resources to other users. No enhancement is foreseen in downlink. The third one occurs at the Radio Link Control (RLC) layer. As high bit rates and low latency are required for XR services, the structure of the Buffer Status Report (BSR) is improved in order to reduce quantization errors and latency, by also including knowledge of the delay of the buffered data.
5G System enhancements. As reported above, the XR architecture within the 5GS follows the typical implementation. However, some architectural improvements have been defined in Release 18 as follows.
Policy control enhancements to support multi-modality. For XR services, there are multiple types of traffic, e.g. audio, video, sensor, pressure, and tactile, all generated to enhance the user experience. In the existing 5GS, the policy control is applied to a single data flow. To guarantee that the multi-modal data is transmitted in a coordinated fashion, the policy control of 5GS is now being enhanced with alignment policy control, e.g. same 5G QoS Identifier (5QI), same priority, or integrated network resource allocation.
Use of the information exposure in 5G. In order to assist the application, the 5GS exposes network information to allow the application to properly adjust its parameters, such as the generated frame rate and data encoding. For XR, it is possible to use the evolved Explicit Congestion Notification (ECN) marking already adopted in wired networks, namely Low Latency, Low Loss and Scalable Throughput (L4S). Then, when the RAN detects congestion, the UPF may mark the received packets at the IP layer, before their transmission to the UE in downlink or towards the data network (or another UPF) in uplink. Moreover, in addition to classical 5GS information such as congestion, QoS Notification Control and data rate, which can be sent to the Session Management Function (SMF)/PCF or to the UPF, the UPF may also evaluate the Round-Trip Time (RTT) and data rate of flows (not only the ones experienced by the RAN) and report them to the PCF and to the NEF/AF, which can signal any impairment to the XR application, thus timely enabling the server to take countermeasures (see the flow arrows in Figure 4). Furthermore, this information can also be exploited to evaluate the stability of the network, through the Packet Delay Variation, thus allowing the PCF to take specific countermeasures in case it is too high for the XR application.
Uplink-downlink transmission coordination. To meet RTT requirements, the PCF may split the RTT target into an UL PDB and a DL PDB, by generating two Policy and Charging Control (PCC) rules for the uplink stream and the downlink stream and assigning the 5QIs according to PDBs, respectively.
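As anticipated in the Power Saving item above, the following short sketch quantifies the mismatch between a 60 fps XR stream and integer CDRX cycles of 16 ms or 17 ms (the values in Figure 3.b); it is plain arithmetic for illustration, not a 3GPP procedure.

def drx_misalignment(frame_rate_fps, drx_cycle_ms, n_frames):
    # Offset (ms) between the k-th XR frame arrival and the start of the
    # k-th DRX cycle, for an integer DRX cycle that cannot match 1000/fps.
    frame_period = 1000.0 / frame_rate_fps
    return [round(k * frame_period - k * drx_cycle_ms, 2) for k in range(n_frames)]

print(drx_misalignment(60, 16, 7))   # frames drift later:   [0.0, 0.67, ..., 4.0]
print(drx_misalignment(60, 17, 7))   # frames drift earlier: [0.0, -0.33, ..., -2.0]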
Before concluding this section, it is worth noting that new 5QIs should be defined in order to satisfy the typical requirements of IC services, i.e., high data rates and low latency. Currently, the 5QIs used for MR/VR flows are among those already available and standardized in [18, Table 4.3.3-1] (identical to [30, Table 5.7.4.1-1]). Among others, potential 5QIs relevant for MR/VR are those ensuring delays within 20–150 ms (up to 300 ms) and typical data rates of 50–100 Mbit/s. For instance, the 5QI with value 80 has a PDB of 10 ms among the Non-Guaranteed Bit Rate (Non-GBR) types, while the 5QI with value 3 has a PDB of 50 ms among the Guaranteed Bit Rate (GBR) types. However, it would be advisable for 3GPP to define new 5QIs combining low latency and high data rate.
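The gap can be seen with a trivial selection sketch over the two example 5QIs quoted above (values as reported in the standardized table referenced in the text): requiring a GBR flow with a 20 ms budget leaves no candidate, which is exactly why new 5QIs are advocated.

CANDIDATE_5QIS = {
    80: {"resource_type": "Non-GBR", "pdb_ms": 10},
    3:  {"resource_type": "GBR",     "pdb_ms": 50},
}

def pick_5qi(required_pdb_ms, need_guaranteed_rate):
    # Return the candidate 5QIs compatible with the flow requirements.
    wanted = "GBR" if need_guaranteed_rate else "Non-GBR"
    return [q for q, p in CANDIDATE_5QIS.items()
            if p["resource_type"] == wanted and p["pdb_ms"] <= required_pdb_ms]

print(pick_5qi(20, need_guaranteed_rate=False))   # [80]
print(pick_5qi(20, need_guaranteed_rate=True))    # []: no low-latency GBR 5QI available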
Everything Connected
The number of non-human devices, commonly referred to as ‘things’, connected to the Internet is rapidly growing worldwide. The IoT encompasses sensors used for infrastructure monitoring and environmental automation. Looking ahead, additional devices with increased processing capabilities and fewer energy constraints, such as vehicles and equipment in smart factories, will join the network. Furthermore, connectivity distances will vary, ranging from body and local area networks to wide and very wide area networks, for example, in cities or extensive agricultural fields.
In this vision, the connectivity extends beyond individual sensors or actuators, becoming crucial for aspects such as interoperability between communication networks (from local to wide) and interconnection between products, processes, and applications. Technical considerations, including mobility, speed, and data transmission latency, also play a fundamental role, paving the way for the digitization of society.
A. Types of Smart Things
In practice, every single element in our life able to provide a simple service or life functionality will be connected to the Internet, thus enabling a plethora of applications. It is envisaged that at least four groups of devices will be part of the digital ecosystem, as follows.
Smartphones. In addition to classical user terminals, such as smartphones and tablets, advanced mobile devices will be available in the near future, aiming to provide communication and Internet access. Beyond that, they will offer a wide range of applications and services to the human user. For example, they will work as remote controllers for home appliances, gate access devices, and collectors of personal data, enabling access to various services like banking, office, fitness, and health data. For these reasons, they need to have high-performance capabilities, reduced energy constraints, high processing capabilities, and high-resolution screens for optimal data visualization and user experience.
Sensors and Actuators. Belonging to this category are classical detection devices, such as environmental sensors capable of collecting and transmitting data about the surrounding environment, including humidity, temperature, and video monitoring. Additionally, simple actuators fall into this category, capable of interacting with the environment when triggered by events, such as opening a gate, activating a thermostat, or automatically alerting security after an infraction. Generally, these devices have low processing capabilities and limited energy constraints if wired power is provided, as data is remotely collected in a software platform where the service intelligence resides.
In the near future, new services and applications will emerge, such as smart agriculture, smart transportation, and the smart city, going beyond the mMTC concept within the 5G framework. In these scenarios, simple sensors will interact with more complex devices requiring higher data rates (up to 5–10 Mbit/s), greater processing capabilities, and lower latencies (less than 1 s). Moreover, depending on mobility and the operational environment, the future communication system should ensure full coverage in various areas, including homes, factories, large fields, cities, or even globally (for logistics or support of military operations, for example). In addition to mMTC, the MTC umbrella also encompasses critical MTC (cMTC), which has more stringent latency requirements (e.g., 10 ms or lower, as seen in a smart factory), as well as synchronization accuracy and reliability in communication. In this case, the operational environment is typically small and well-confined, such as a factory, thereby reducing coverage and energy constraints.
Bio-devices. The proliferation of smart ‘things’ will provide individuals with several local devices capable of performing simple or quite complex functionalities in their lives. For example, in the near future, smart glasses will offer functions currently provided by smartphones (e.g., street navigation views and additional information about a visited site) or quick previews of messages and incoming calls through a smartwatch. Other devices will assist users in monitoring their health parameters (e.g., fitness trackers, blood pressure, heart rate, or blood insulin levels) or simply in tracking their belongings (e.g., house keys or the parking location of their car). A smart ring (or a smart finger) equipped with an RFID may allow users to access their homes or cars, including security checks through multi-biometric parameters (e.g., fingerprint, voice, face recognition). Figure 5 provides an example of a body area network that can be available in proximity to a user and composed of some local sensors. Furthermore, it is also expected that, in addition to wearable sensors, other types of terminals will become widespread, such as brain sensors, skin patches, and bio-implants (e.g., pacemakers, insulin meters), to enhance user perception (e.g., haptic, gesture) and for health monitoring [37]. Requirements for this group of sensors strictly depend on the related application supported by the device. Typically, data rates are moderate (lower than 1 Mbit/s), as well as reliability and latency (i.e., typical applications do not require stringent reliability values since the data can be transmitted several times before becoming obsolete, with latency in the range of a few hundred milliseconds or seconds). Similar considerations apply to energy constraints, which strongly depend on the device (e.g., a pacemaker should be powered by batteries capable of signaling their low energy status in advance) and localization (e.g., for security purposes, localization can be in the order of a few meters; for a key tracker, it is probably better to have localization lower than 1 m, with 10 cm being preferable).
Drones and vehicles. Machines or mobile devices, such as vehicles and drones, typically of large size and equipped with their own battery or motor for movement, fall into this category. They are fitted with a transceiver chipset, which connects them to the Internet.
The proliferation of drones, namely unmanned aerial vehicles (UAVs), for military and hobbyist purposes is rapidly increasing. Moreover, the possibility of equipping drones with cameras and other payloads (e.g., packages, fertilizers, first aid medicines), coupled with Internet connectivity, opens the way for several applications, including surveillance, goods delivery, support for agriculture and exploration, and acting as coverage or capacity enhancers when equipped with a gNB or a Wi-Fi access point [38].
Connecting drones poses challenges due to their high speed, environmental variability, massive deployment, and flight safety requirements [39]. This can lead to handover issues for control and high-variability pathloss, especially in dense urban environments [40]. The next service anticipated from UAVs is Urban Air Mobility (UAM), where people become the payload, requiring rapid and safe transportation from one location to another (e.g., from the airport to the city center), aiming to alleviate traffic congestion [41]. Communications for drone control and the transmission of data collected by on-board sensors (e.g., infrared cameras, videos, or other sensors deployed on the ground) should be reliable, be in coverage, and meet specific latency and data rate requirements, depending on the application [42], [43]. Figure 6.a reports the typical scenario for UAV communication strategies.
Similarly, vehicles are becoming fully connected and capable of interacting with other vehicles, infrastructure elements (namely, Road Side Users or RSUs), and pedestrians and cyclists (referred to as Vulnerable Road Users or VRUs). Unlike UAVs, communication between vehicles needs to be established locally and quickly to facilitate rapid data exchange. Three primary areas are envisioned: 1. traffic optimization, encompassing route planning and traffic congestion reduction; 2. safety for onboard occupants, roadside individuals, and infrastructure; and 3. infotainment, related to movies, music, and social connectivity within the vehicle during the journey. Safety use cases include advanced driving maneuvers (e.g., coordinated maneuvers, adjusting vehicle courses for safety, Cooperative Collision Avoidance), extended sensors (e.g., Sensor and State Map Sharing, Collective Perception of Environment, Video data sharing for automated driving), and remote driving (e.g., controlling a remote vehicle, assisting a remote driver through a V2X Application Server, AS).
In addition to the classical Vehicle-to-Network or V2N (where the vehicle is connected to an external server through a gNB), direct connectivity between vehicles and other road users enables other communication modes: Vehicle-to-Vehicle (V2V), Vehicle-to-Infrastructure (V2I), when the vehicle exchanges data with RSUs like a traffic light or local road signals, and Vehicle-to-Pedestrian (V2P), in the case of communication between a vehicle and a pedestrian or cyclist equipped with a UE. Refer to Figure 6.b for an illustration of the communication types. Requirements for vehicle communications vary greatly depending on the specific application [44]. The most critical parameters include data rate, latency, packet size, transmission rate, link reliability, and coverage. For example, the data rate may range between 10 Mbit/s and 50 Mbit/s, but it may require up to 700 Mbit/s in the case of V2V video sharing. Latency is on the order of 20–50 ms, but it can be as low as 3–5 ms in the case of remote driving and trajectory alignment. The communication range typically falls between 50 and 400 m. Anyway, a comprehensive requirement analysis can be found in [45].
B. Key Performance Indicators for Everything Connected
The evolution of mMTC toward novel device typologies forces researchers, vendors, and service providers to consider new KPIs for a proper design and evaluation of the specific service under consideration. The following are some KPIs for the Everything Connected service:
Coverage. The service area for all types of UEs encompasses the entire global world, including remote and difficult-to-access areas. The related KPI is the percentage of covered area with respect to the entire Earth. A target value could be an uncovered fraction of 10^{-5}, equivalent to approximately 5,000 km2 worldwide (a numerical cross-check of these target values is sketched after this list).
Number of simultaneous connections per square meter or cubic meter. The increase in available devices in a close area necessitates simultaneous transmissions. This is evident, for instance, when a person has several personal devices in their local area or a group of people gathers during a soccer match in a stadium, in an open-air gathering area, or in the metro during rush hour. A target value may be 50 connections/m3, considering 2–5 people per square meter having 10–25 personal devices each.
Data Rate per square km. In some new use cases, providing access to the network is not enough; it is critical to guarantee a minimum data rate for UEs (i.e., sensors) in a given area. A target value may be 5–50 Gbit/s/km2, equivalent to 0.5–5 Mbit/s over an area of 10×10 m2.
Low Power Massive Connections. In addition to a minimum data rate, battery life is critical in some cases. Massive connections need to be guaranteed while respecting a maximum amount of battery usage. A target value may be 100 Mbit/s/mW.
First Packet Latency. In high-mobility or fast variable environments, IoT devices such as trains, vehicles, or drones may need to transmit a packet (or a set of packets), including the association with the receiver and eventually completing security procedures. A target value may be 0.01-1 s.
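As referenced above, a quick back-of-the-envelope check of these target values, using only the figures quoted in the list (with the Earth surface area approximated at 510 million km2), is sketched below.

# Back-of-the-envelope checks for the KPI targets listed above.

earth_surface_km2 = 510e6                   # approximate Earth surface area
uncovered_fraction = 1e-5
print(round(uncovered_fraction * earth_surface_km2), "km2 uncovered")   # ~5,000 km2

people_per_m2, devices_per_person = 5, 10   # crowded venue assumption
print(people_per_m2 * devices_per_person, "connections per m2/m3")      # ~50

rate_per_patch_mbit = 0.5                   # 0.5 Mbit/s per 10 m x 10 m patch
patches_per_km2 = 1e6 / (10 * 10)           # 10,000 patches of 100 m2 per km2
print(rate_per_patch_mbit * patches_per_km2 / 1e3, "Gbit/s/km2")        # ~5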
C. Advancements in Rel-18 and Rel-19 Within Everything Connected Service
1) Advancements for Low-Complexity Sensors
The connectivity of a large number of low-complexity and low-cost devices, coupled with wide area coverage, was introduced in 3GPP standards starting from Release 13 with LTE-M and Narrowband IoT (NB-IoT) UE terminals, both falling under the mMTC use case in 5G, and both using Orthogonal Frequency-Division Multiplexing (OFDM). LTE-M has reduced capabilities compared to UE Cat.0, allowing it to address a wide range of simple and low-complexity services, even including voice. LTE-M has a sub-carrier spacing (SCS) of 15 kHz and consists of two device categories: LTE-M1 for low data rates (approximately 1 Mbit/s), operating with a bandwidth of 1.4 MHz, and LTE-M2 for low-to-medium data rates, using a bandwidth of 5 MHz. NB-IoT further reduces the bandwidth to 3.75 kHz in the uplink (while maintaining 180 kHz in the downlink) to cover ultra-low-end mMTC applications (about 20–50 kbit/s). While LTE-M utilizes the typical control channels of LTE and 5G, NB-IoT has its own narrow dedicated channels, such as the Narrowband Primary/Secondary Synchronization Signals (NPSS, NSSS) used for synchronization, the Narrowband Physical Downlink Control Channel (NPDCCH), indicating for which UEs there are data and in which PRBs, and the Narrowband Physical Downlink Shared Channel (NPDSCH), used for user data. In case of further extension of the PRBs beyond those assigned to the NB-IoT system, the transport block size is appropriately increased.
Release 18 defines new IoT devices with ultra-low complexity and ultra-low power consumption, known as Ambient IoT [46]. Their purpose is to serve verticals where batteries are not allowed due to extreme environmental conditions (e.g., very high pressure or temperature), maintenance-free requirements, or simply very small size (e.g., millimeter-sized). Use cases include identification (in warehouses, medical instruments, and logistics), positioning (e.g., finding a lost item), and sending commands to modify the status of a remote device (e.g., activating/deactivating a medical instrument or a controller in smart agriculture). They aim to overcome the limitations of RFID in terms of range, number of devices in the same area, and scanning operations [47]. Four communication topologies are defined: i. direct (and bidirectional) communication between the Ambient IoT device and the base station; ii. bidirectional communication between the Ambient IoT device and the base station only with the assistance of an intermediate node, which acts as a relay; iii. the Ambient IoT device is assisted in uplink (downlink) transmission by the intermediate node, but it can directly communicate with the base station in downlink (uplink); iv. bidirectional communication between the Ambient IoT device and a UE. Allowed strategies for spectrum duplexing are licensed Frequency Division Duplex (FDD), licensed Time Division Duplex (TDD) and unlicensed operation, with the aim of providing 0.1–5 kbit/s, latencies of 1 s (shorter) to 10 s (longer) and a coverage of 50 m (indoor) and 500 m (outdoor). The positioning accuracy ranges between 1–3 m (indoor) up to tens of meters (outdoor), depending on the use case, with a reliability of 90%. Ambient IoT devices have a speed of up to 10 km/h and a target density of 150 devices per 100 m2 for indoor scenarios and 20 devices per 100 m2 for outdoor scenarios. Finally, three device classes are defined: Class A with no energy storage, using backscattering transmission; Class B with energy storage but still utilizing backscattering transmission; and Class C devices equipped with both energy storage and Radio Frequency (RF) components for active transmission [48].
2) Advancements for RedCap
The need for services such as video surveillance, industrial wireless sensors, and wearable devices (e.g., smartwatches or goggles) has prompted updates and the development of new devices with requirements for reduced energy consumption and complexity. These devices are capable of ensuring a data rate of several Mbit/s and latencies around one second. Existing 5G device classes do not match these needs: they either target the very low energy consumption of massive IoT or the high data rates of eMBB and the stringent requirements of uRLLC for critical IoT. Then, in Release 17 (with continued developments in Release 18), 3GPP introduced several capability restrictions for a group of devices known as Reduced Capability (or RedCap). These devices fall between mMTC, uRLLC, and eMBB devices, positioning themselves in the center of the classic 5G classes triangle. RedCap devices offer a balanced solution for data rate, latency, and battery life, somewhat resembling the Cat.4 UE available in LTE Release 8 [49].
Before presenting the main restrictions enabling RedCap devices, as reported below, it is noteworthy that a RedCap device is identified by the network during its Random Access (RA) procedure, where some parameters are sent by the terminal to the gNB [50]. Then, proper parameters may be set by the network to match the UE capabilities. The RedCap bandwidth can be 20 MHz in Frequency Range 1 (FR1) and 100 MHz in Frequency Range 2 (FR2), with a transmit power of 23 dBm (class 3, and optionally class 7). A RedCap device is allowed to support one TX antenna, while one or two RX antennas are supported, to shrink the complexity as well as the size of the device. The highest allowed modulation scheme is 64QAM (compared to 1024QAM for eMBB devices). Half-Duplex Frequency Division Duplex (HD-FDD) transmissions are optionally allowed for RedCap devices, to further reduce complexity. In case of HD-FDD, priorities between transmission and reception are specified in TS 38.213 [51], for example, to favor the data upload (e.g., the PUSCH or Sounding Reference Signals, SRS) or the signaling reception (e.g., the PDCCH or System Information Blocks, SIBs). Following the device complexity simplification, some advanced 5G features are not allowed, such as wide CA and Dual Connectivity (DC), which forces RedCap devices to be connected only to Stand-Alone networks, while relaxed Radio Resource Management (RRM) measurements are permitted for the handover procedure, since a RedCap device is assumed to be stationary. Other restrictions are related to higher layers and network signaling, such as the number of DRBs reduced to 12 (optionally 16) due to the lower allowed data rate, shorter RLC/PDCP sequence numbers of 12 bits (optionally 18 bits) to save UE memory, and restriction of the accessible band (namely, the Bandwidth Part, BWP). Initially, every 5G device is allowed to transmit and receive in the whole BWP of the cell signaled during the RA procedure, which may conflict with the RedCap capabilities. To avoid any possible conflict, the network should configure a RedCap BWP not exceeding the RedCap capabilities (i.e., a specific RedCap BWP or dedicated BWP for RedCap devices). See [50] for procedural details.
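As a compact illustration of the restrictions listed above, the sketch below collects the quoted limits in a dictionary and checks a candidate BWP configuration against them; the field names are paraphrased for readability and are not the actual capability information elements.

REDCAP_LIMITS = {
    "max_bandwidth_mhz": {"FR1": 20, "FR2": 100},
    "max_tx_antennas": 1,
    "max_rx_antennas": 2,
    "max_dl_modulation": "64QAM",
    "max_drbs": 12,            # optionally 16
    "tx_power_dbm": 23,        # power class 3
}

def valid_redcap_bwp(freq_range, bwp_mhz):
    # A dedicated RedCap BWP must not exceed the device bandwidth capability.
    return bwp_mhz <= REDCAP_LIMITS["max_bandwidth_mhz"][freq_range]

print(valid_redcap_bwp("FR1", 20))    # True: fits the 20 MHz FR1 limit
print(valid_redcap_bwp("FR1", 100))   # False: a full-cell BWP would exceed it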
Extended RedCap (eRedCap) features are defined in Release 18. Enhancements go in the direction of a further bandwidth reduction, down to 5 MHz, in order to fulfill the requirements of communication terminals in specific sectors, such as the Future Railway Mobile Communication System (FRMCS) and Mission Critical Communications (MCX). eRedCap aims to replace GSM-R (GSM for Railway), currently used for communication between trains and the control center, in the first case, and Digital Mobile Radio (DMR) or Terrestrial Trunked Radio (TETRA)/Project 25 (P.25), adopted in public safety and protection operations, in the second case. Existing MCX devices already support direct device-to-device communication; therefore, eRedCap is required to implement sidelink communication, similar to V2X (explained later). Furthermore, eRedCap devices are enhanced with optimized methodologies for positioning, utilizing the Positioning Reference Signal (PRS), time-of-arrival measurements, bandwidth aggregation to improve granularity, and ranging based on sidelink signals [52]. For details on positioning enhancements, refer to the dedicated section below.
3) Advancements for Non-Terrestrial Networks
The deployment of satellites in new constellations has sparked and fueled the interest of researchers and companies. Traditionally, satellites have been used for broadcasting, providing low-data-rate connectivity in remote areas, or during relief operations. Due to technological advancements, novel applications and services related to service continuity (e.g., providing connectivity to moving objects such as ships, airplanes, and trains), throughput enhancement (e.g., offering DC with a second satellite link in wide areas served by a single gNB), and ubiquitous connectivity (e.g., logistics tracking in areas where terrestrial coverage is neither present nor possible, such as in maritime and air), can now be provided worldwide [53].
Release 16 extends New Radio (NR) to Non-Terrestrial Networks (NTNs) [54]. The architecture encompasses a Non-Terrestrial Element (NTE), which may be a Geostationary (GSO) satellite, a Medium-Earth Orbit (MEO) satellite, or a Low-Earth Orbit (LEO) satellite, as well as a High-Altitude Platform Station (HAPS) or a UAV. The adoption of NTN is fundamental both for offering services that require worldwide coverage and for improving local connectivity. The general NTN architecture is depicted in Figure 7. Additional general characteristics of NTN are outlined in [55] and [56]. The satellite or flying platform (i.e., a HAPS or a UAV) is connected to the ground station gateway through a Satellite Radio Interface (SRI), or feeder link, and to the terrestrial UE through the service link. The integration within the 5G RAN has three options. The transparent mode envisions the NTE (e.g., a LEO satellite or even a UAV) merely implementing a frequency conversion; in this case, the gNB functionalities are implemented on the ground, behind the ground station. The second option is the regenerative architecture, which has the gNB mounted on board: the radio procedures (i.e., Radio Resource Control (RRC) and lower layers in the control plane, and SDAP and lower layers in the user plane) are confined to the NTE. In the third option, the gNB functionalities are split into a Distributed Unit (DU) on board and a Central Unit (CU) on the ground, connected through the F1 interface carried over the feeder link. Layers up to RLC are implemented on the NTE, while PDCP and RRC for the control plane and PDCP and SDAP for the user plane are on the CU.
Figure 7. General NTN architecture with a generic NTE connecting the UE to the ground network, with three options: i. gNB on the ground (transparent NTN); ii. gNB on board (regenerative NTN); iii. gNB functionalities split between the NTE and the ground (DU/CU split NTN).
The considered frequencies are at 2 GHz and at 20/30 GHz. The goal of 3GPP was to minimize changes to NR as much as possible. However, due to the propagation distance in NTN, the delay is several tens of milliseconds, depending on the altitude of the NTE, much higher than that of the terrestrial UE-gNB link (lower than 1 ms). For this reason, some timers (mostly in the physical layer procedures, such as those for Timing Advance (TA), HARQ retransmissions, preambles in the RA procedure, and DRX setting) have to be modified and adapted for NTN. In Release 16 and Release 17, several enhancements have been defined to adapt NR for NTN. Limited adaptations are needed at the MAC level (e.g., for the SR and BSR) and at the RLC level. Another aspect to consider is the speed of LEO and MEO satellites, which causes moving cells in the case of fixed antenna beams. This results in Tracking Areas (TAs) sweeping over the ground, so that handovers of terrestrial UEs, even fixed ones, are frequently triggered. One possible implementation to support UEs with limited power and processing capabilities is to have fixed TAs on the ground, with the broadcast Tracking Area Code (TAC) updated as the satellite moves. Enhancements in Release 18 aim: i. to improve the uplink coverage for RedCap UEs, by modifying the PUCCH repetition for HARQ and the DMRS bundling for PUSCH, and for IoT NTN by disabling the HARQ feedback; ii. to integrate the use of the Global Navigation Satellite System (GNSS) for power saving in long connections of IoT-NTN. It is expected that many procedures for NTN will be supported by Machine Learning (ML) techniques [57].
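To give an idea of why these timer adaptations are needed, the following back-of-the-envelope Python sketch computes the one-way service-link propagation delay for typical NTE altitudes, assuming the platform is at the zenith and ignoring the feeder link; the resulting values are orders of magnitude above the sub-millisecond delay of a terrestrial UE-gNB link:

# Rough one-way propagation delay (service link only, platform at zenith)
C = 299_792_458.0  # speed of light, m/s

def one_way_delay_ms(altitude_km: float) -> float:
    return altitude_km * 1e3 / C * 1e3

for name, alt_km in [("HAPS", 20), ("LEO", 600), ("MEO", 10_000), ("GEO", 35_786)]:
    d = one_way_delay_ms(alt_km)
    print(f"{name}: ~{d:.2f} ms one-way, ~{2 * d:.2f} ms round trip")
# e.g., LEO ~2 ms one-way versus GEO ~119 ms one-way (~239 ms round trip)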
4) Advancements for Vehicle-to-Everything (V2X)
Several developments and enhancements have been investigated and defined in 3GPP standards to support vehicle and drone communications, which are quite distant from classic mMTC in terms of data rate and latency requirements, as well as energy constraints. Release 14 introduced support for V2X communications (enabling the first Cellular-V2X, C-V2X), defining communications on the Uu interface with the network and on the PC5 interface for direct UE communications. PC5 is based on Proximity Services (ProSe), which first appeared in Release 12 to support proximity communications, primarily for business purposes between pedestrians and local markets. Release 16 extended V2X from LTE to 5G, using the Uu interface for unicast links, while Release 17 enabled broadcast and groupcast. PC5 uses sidelink channels to support two direct communication modes: in mode 1, the gNB or eNB assigns radio resources for the subsequent V2V communication, while in mode 2, UEs manage their own transmissions, without requiring base station coverage.
Being part of 5G, transmissions on Uu and PC5 share most of the characteristics of NR, such as OFDM, numerology, frame duration, and frame organization. V2X is assigned two frequency ranges: 410 MHz – 7.125 GHz for FR1, with a bandwidth of 200 MHz, and 24.25 GHz – 52.6 GHz for FR2, with a bandwidth of 400 MHz. Note that for Intelligent Transportation Systems (ITS), a 5.9 GHz frequency with a bandwidth of 40 MHz is dedicated by governments worldwide to enable generic V2X independently of the technology used (i.e., Dedicated Short-Range Communication (DSRC), or 802.11p, is allowed too) and to reduce possible interference. Even though 5G bandwidths can be managed by gNBs, UEs are provided with an indication of a contiguous portion of bandwidth within the carrier bandwidth where a single numerology is employed (i.e., adopting the concept of BWP) to overcome their processing limitations or high power consumption. The BWP reduces the V2X bandwidth and signals the resource pool allocated locally. A detailed description of the sidelink channels (e.g., Sidelink Primary/Secondary Synchronization Signal (S-PSS/S-SSS), Physical Sidelink Control Channel (PSCCH), and Physical Sidelink Shared Channel (PSSCH)) and of resource scheduling for V2X is available in [58]. 3GPP also defined the service architecture for V2X, as reported in [32] and [59], together with the discovery procedure. The V2X architecture is integrated within the 5GS architecture, where V2X applications run over UEs and external Application Servers, highlighting both the Uu and PC5 interfaces [60].
Resource allocation for V2X includes the classical DG and CG (i.e., SPS) modes. With DG, the UE must send an SR, and after receiving it, the gNB indicates the allocated sidelink resources (i.e., PRBs) in the DCI over the PDCCH. The UE can then use them to transmit one or two Transport Blocks (TBs). In the case of CG, the UE sends a message to the gNB informing it about the periodicity of TBs, the maximum TB size, and the required QoS (e.g., latency and priority). The gNB uses this information to configure the radio resources in terms of time-frequency allocation, periodicity, and duration, resulting in faster transmission than DG. V2X QoS is based on the 5G QoS flow management. The V2X QoS model maps application QoS requests into the PC5 QoS Flow, identified by a PC5 QoS Flow Identifier (PFI), which is used to derive the PC5 QoS profile and configure the Sidelink Radio Bearer (SLRB). The PC5 QoS profile is a set of parameters: Resource Type (i.e., GBR, Non-GBR, and Delay Critical GBR), default priority level (from 1 to 7), PDB (from 3 ms to 500 ms), PER (from
Further enhancements have been completed in, or moved to, Release 17 and Release 18. One of these regards the adoption of beamforming in sidelink channels. Some V2X use cases (e.g., platooning, advanced driving) require high data rates (about 50–770 Mbit/s). To this aim, transmissions at FR2 can be used; however, in order to compensate for the path loss and improve coverage, beamforming in sidelink channels is defined, which also allows radio resources to be reused and interference to be reduced. In particular, Release 17 defined two procedures. One is for the transmission of the Sidelink Synchronization Signal Block (S-SSB), to favor the synchronization of out-of-coverage vehicles and thus expand coverage. The second one is for properly adapting the transmit power of an in-coverage UE using one antenna beam towards the gNB on the Uu interface and another antenna beam towards the out-of-coverage UE on the PC5 interface. Some V2X use cases can benefit from the positions of the involved UEs, such as Cooperative Collision Avoidance, Cooperative Lane Change, and Coordinated Maneuvers. Release 17 defines the use of sidelink channels to derive relative positioning for increasing maneuver accuracy. Position accuracy may be enhanced by measurements performed by the gNB, which can provide the absolute position, although this may be affected by errors and, in the considered use cases, may not be essential (i.e., the relative position alone may be sufficient); moreover, vehicles would need to be in coverage. The relative position provided by sidelink channels (possibly improved by GNSS data exchange between the involved vehicles) can be based on ranging, angular, or time-difference measurements.
Release 17 introduces enhancements related to resource allocation, as follows: i. Sensing in mode 2 can be performed over a partial bandwidth, which is particularly useful for saving energy in VRUs and RSUs; ii. A UE may assist another UE by recommending resources in V2V scenarios, simplifying its selection process since these resources have already been detected by the assisting UE (i.e., mitigating the hidden terminal effect); iii. An in-coverage UE vehicle may be designated as the scheduler of a vehicle group, with the aim of scheduling resources within the group and thus avoiding collisions; it acts as an intermediary of the gNB. The concept of UE relaying was first introduced in Release 13 within ProSe, for proximity marketing purposes or to support Search and Rescue teams in public safety operations, and then evolved in Release 14 and Release 15. Release 17 standardized UE relaying between commercial devices, such as smartphones, and critical safety devices, extending the concept to vehicles in V2X scenarios [62]. UE vehicles thus benefit from increased reliability, coverage, and higher data rates whether they are in-coverage, partial-coverage, or out-of-coverage. In particular, relaying enhancements concern the relay discovery and selection procedures as well as the communication switching between the Uu and PC5 interfaces.
Finally, Release 17 introduces advancements in multicast transmissions. Data for several use cases, such as Intersection Safety Information Provisioning, Sensor and State Map Sharing, Collective Perception of Environment, and video data sharing, can be efficiently delivered in multicast to the UE vehicles subscribed to that service, saving radio resources compared to multiple unicast transmissions. The transmissions, or equivalently the group subscription, can take into account the navigation route, the position of the vehicles, and their direction (e.g., a car accident may affect just one lane and not the other). Advancements include feedback provision to improve reliability and the enabling of transmission in FR2 with a higher numerology, as required by the service data rate.
Before concluding this paragraph, it is noteworthy that three performance KPIs have been proposed for V2X [63]. The Packet Reception Ratio (PRR) is defined as the ratio X/Y between the number X of vehicles successfully receiving a packet (within a distance D) and the number Y of all vehicles within the range D. The Packet Inter-Reception (PIR) is defined as the time between two consecutive successful receptions of two different packets sent by vehicle A to vehicle B within a distance D. The third KPI is the throughput [bit/s], defined as the amount of data [bit] delivered divided by the time needed to download it [s]. Five traffic models have been proposed for V2X evaluation: three periodic models and two aperiodic models. Each of them specifies the packets in terms of inter-packet generation rate, packet size, and latency requirement (ranging between 10 ms and 100 ms). The packet size can follow a fixed pattern (varying between 300 bytes and 190 bytes), a variable pattern (1200 bytes or 800 bytes), or a uniform random generation (between 30, 0.2, 10 kbytes and 60, 2, 30 kbytes, at steps of 10, 0.2, 4 kbytes, respectively). See [63, Sec. 6.1.5] or [58, Table 14].
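As a minimal illustration of the first KPI, the following Python sketch computes the PRR for a single transmitted packet, given the positions of the candidate receivers and whether each of them decoded the packet; the scenario values are arbitrary:

from math import dist

def prr(tx_pos, rx_positions, received_flags, D):
    """PRR = X / Y over the receivers located within distance D of the transmitter."""
    in_range = [ok for pos, ok in zip(rx_positions, received_flags) if dist(tx_pos, pos) <= D]
    if not in_range:
        return None  # KPI undefined if no vehicle lies within D
    return sum(in_range) / len(in_range)

# Three vehicles within D = 300 m, two of which decode the packet (the fourth is out of range):
print(prr((0, 0), [(100, 0), (250, 40), (290, 10), (500, 0)], [True, False, True, True], 300))  # 0.666...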
5) Advancements for Drones
The initial attempts at UAV communications using cellular technologies were made in Release 15. In fact, some aspects of NR are already well-suited to the unique characteristics of UAVs. Features such as scalable numerology, flexible bandwidth assignments, the capability to establish dedicated network slicing for specific traffic, and the implemented enhancements of beamforming and MIMO provide a solid foundation for addressing the communication requirements of unmanned aerial vehicles. These attributes contribute to the adaptability and efficiency necessary to support the diverse and dynamic communication needs of UAVs in various scenarios [64]. However, it was only in Release 18 that the focus shifted towards improving radio measurements, mobility, and interference mitigation. The objective was to seamlessly integrate UAVs into the existing NR interface. Given their typical flight height above rooftops, Line-of-Sight conditions prevail, enabling UAVs to receive synchronization beacons from multiple cells.
NR measurements can be made dependent on the UE height, allowing the H1 and H2 threshold events, used for handover purposes, to determine the UAV altitude. Consequently, the network can appropriately configure the UAV UE, for example by adjusting the measurement report frequency and establishing different configurations for UAVs above or below rooftops. This enables the network to receive reports on the flight path, facilitating advanced planning of gNB radio resources.
Significant efforts have also been directed towards UAV identification through the Broadcasting UAV ID (BRID) over the PC5 radio interface. This allows authorities or authorized users to detect flying drones in specific areas. Additionally, UAV UEs can transmit Detect-And-Avoid (DAA) messages over PC5 for authentication and to minimize radio interference, as illustrated in Figure 6.a [65].
6) Advancements for IoT Platforms
Within Release 13 (2012-2016), 3GPP established the technical specifications (TS) for the Mission Critical Push-to-Talk (MCPTT) service. Numerous functionalities were outlined to support aspects such as group management, identity management, and configuration management. However, the standardization process was time-consuming and resource-intensive. It became evident that standardizing the core functionalities was crucial to enable similar services across the 5GS. To address various vertical applications within the 5GS, 3GPP defined a functional model that focuses on providing functional support to external systems, primarily those connecting smart things, implemented at the application layer. The reference model is illustrated in Figure 8. The figure also highlights the adoption of the PC5 interface within the framework, which is significant for systems where direct communications between UEs may occur.
In the model, the application server in the Vertical Application Layer (VAL) consists of the Application-specific Server (responsible for providing services and harmonizing operations between other services within the same vertical) and the Vertical Application Enabler (VAE), facilitating message delivery between different vertical application servers. These functionalities are also present on the UE side as client versions, aiming to support the server side.
The application-specific server and its client counterpart provide the functionalities of specific applications. For instance, this involves interactions with Unmanned Aircraft System (UAS) Traffic Management (UTM)/UAS Service Supplier (USS) for UAV applications, interactions with vehicles, or with both vehicles and road infrastructure, for V2X applications, or IoT deployments in body area networks, specifically Personal IoT Networks (PINs). In the last case, the PIN enabling architecture is further specialized to support multiple connected local devices managed by the PIN Gateway Client (PINGC) and the PIN Element with Management Capability (PEMC), aiding the server in managing and delivering messages within the local IoT network. The VAE client and server support the application-specific client and server functionalities, respectively. VAE client functionalities may include general tasks, such as handling communication modes (Mode 1 or Mode 2) and application message communication, or application-specific tasks (e.g., for UAVs, receiving and storing Multi-USS and DAA application policies). Additionally, VAE server functionalities involve group-based QoS management through SEAL, determining the location information of other specific UEs, providing proximity information relative to specific areas, and monitoring the status of specific UEs, among others.
Application servers/clients in the VAL provide specific vertical application functionalities and require common auxiliary services from the Service Enabler Architecture Layer (SEAL). SEAL is essential for developing common functionalities based on the 5GS to support multiple verticals [66]. SEAL manages location, groups, and configuration; moreover, it is in charge of identity management, supporting authentication, key management, and network resource management [67]. The IoT platform may implement the Common API Framework (CAPIF), acting as a gateway for the VAL server (i.e., as an API invoker). CAPIF provides connectivity to API providers (API exposures) and discovers service APIs published by the Service Capability Exposure Function (SCEF) in LTE and the Network Exposure Function (NEF) in 5G. It authenticates the VAL server and provides it with service API links, enabling API providers to offer services to API invokers [68].
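The CAPIF interaction just described can be summarized, very informally, by the following Python sketch; the class, method, and endpoint names are purely hypothetical and do not reproduce the 3GPP-defined CAPIF APIs:

# Hypothetical, simplified view of the CAPIF roles described above.
class CapifCore:
    def __init__(self):
        self._published = {}               # service API name -> exposure endpoint
        self._authorized_invokers = set()

    def publish(self, api_name, endpoint):       # API provider side (e.g., NEF/SCEF exposure)
        self._published[api_name] = endpoint

    def onboard_invoker(self, invoker_id):       # authenticate the VAL server acting as API invoker
        self._authorized_invokers.add(invoker_id)

    def discover(self, invoker_id, api_name):    # return the service API link to an authenticated invoker
        if invoker_id not in self._authorized_invokers:
            raise PermissionError("invoker not authenticated")
        return self._published.get(api_name)

core = CapifCore()
core.publish("monitoring-events", "https://nef.example/exposure")  # placeholder URL
core.onboard_invoker("val-server-1")
print(core.discover("val-server-1", "monitoring-events"))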
This functional model allows the straightforward addition and implementation of external systems (i.e., the verticals) on top of the 5GS as 3GPP releases advance or are newly defined. Examples of external systems include V2X, whose specific functionalities are given in [69], UAS [65], EDGE [70], Factories of the Future (FF) [71], Personal IoT [72], and the messaging delivery service in 5G (MSGin5G) with its Application Architecture for the MSGin5G Service (5GMARCH) [73]. MSGin5G is designed for massive IoT device communication, covering thing-to-thing and person-to-thing communication, including point-to-point, application-to-point, group, and broadcast messaging. The development of Mission Critical (MC) services, including MCPTT, MC video, and MC data, is ongoing in Release 19 [74].
High-Resolution Positioning
Positioning has always been crucial, and various techniques and systems have been adopted for user positioning in both indoor and outdoor environments. Satellite-based positioning systems, including GPS, Galileo, GLONASS, and Beidou (as part of GNSS), are widely used for outdoor positioning. Indoor positioning relies on different strategies, including fingerprinting (commonly used for Wi-Fi), proximity detection based on passive or limited-range tags like Bluetooth Low Energy (BLE) and RFID, dead reckoning, Angle-of-Arrival/Angle-of-Departure (AoA/AoD), Time Difference of Arrival (TDoA), and Received Signal Strength (RSS). Extensive surveys on these methods are available in [75] and [76]. The limitations of current methods, causing degradation in accuracy, are primarily attributed to multi-path propagation, time synchronization of transceivers, loss of Line-of-Sight, and complexity in antennas and receiver hardware [77].
The accuracy of positioning varies based on technology: 1–5 m for Wi-Fi (using RSS and TDoA), 2–5 m for Bluetooth and ZigBee (using RSS and TDoA/AoA), 0.01-1 m for UWB (using TDoA and AoA), 2.5-25 m for cellular systems (using RSS and TDoA), and 3–5 m for satellite-based systems (using TDoA) [78]. Ensuring positioning requirements with a single technology is challenging. Therefore, in addition to GNSS, other technologies may be combined, such as visible light, local sensors, and communication signals.
A. Specific Services Enabled by High-Positioning
To enable new services, it is essential to provide not only centimeter-level (or better) accuracy but also position availability within a few hundred milliseconds or less. Moreover, the ability to supply high-precision positioning not only for a user (i.e., their smartphone) but also for an object, including its location, speed, and direction, can enable smart environments, enhance productivity, and improve safety services. Additionally, consideration must be given to position availability everywhere, i.e., in open areas, dense urban areas, and indoors, while accommodating possible device constraints such as processing power and energy consumption [79].
High-precision positioning can be leveraged for various services, including:
Real-Time Kinematic (RTK) Movements. Utilizing MR for the automatic control of tools and equipment, where cameras enable gesture and motion recognition [80].
Smart Factory Operations. Enabling precise positioning for industrial control, factory automation, and process automation. This includes continuous tracking of goods in the factory to optimize material flow in warehousing and logistics processes. Fleet management and autonomous driving systems in a factory can also benefit from localization, improving their interaction with the environment. Real-time monitoring of production machinery, assembly cells, and the localization of workpieces in the assembly line can be quickly adapted to new circumstances and material changes [81].
V2X Use Cases. Enabling low-latency and relatively low-distance interaction between vehicles and between vehicles and RSUs. In addition to remote driving, V2X services include scenarios where the ego vehicle interacts with objects (e.g., other vehicles, emergency vehicles, road obstacles, pedestrians/VRUs), and remote events (e.g., accidents, approaching a specific location, roadworks). Examples include lane merge, curve speed warning, cooperative adaptive cruise control, and pedestrian detection. Relative position is often more critical than absolute position in these cases [45].
Assistive Services. Leveraging the positioning of people and objects within smart environments for context-aware service provisioning, activity detection, and elderly monitoring [82].
Retail Tracking. Retailers utilizing customer position and habits for consumer tracking within malls/airports, enabling proximity marketing and advertisements. Various technologies, including Wi-Fi, RFID, BLE, and video cameras, can be used for locating and tracking customer movements [83].
UAV Positioning. Precision positioning for drone track control, enabling services such as UAVs for package delivery, precision farming, surveillance, or other urban and dense urban area applications where GNSS signals may degrade [84], [85].
Livestock and Object Localization. Achieving precise localization of small objects or monitoring livestock movements and migration using passive or active tags embedded in objects or animals. Depending on the use case, UAVs and open gates can be used for monitoring and tracking.
High-Resolution Mapping Services. Enabling services based on high-resolution mapping and environment reconstruction by providing images with enhanced object localization. Technologies for capturing images of the surrounding environment can be improved by integrating object localization data, contributing to map construction and reconstruction.
These examples highlight the diverse applications and benefits of high-precision positioning across various sectors and use cases.
B. Key Performance Indicators for High-Positioning
Positioning KPIs go beyond mere accuracy in determining the position of an object or user. Release 17 has outlined several of these KPIs, as detailed below [86], [87].
Positioning Accuracy. It is the measure of how accurately a spatial object is positioned on the map with respect to its true position.
Positioning Service Availability. It is the ratio between the time during which the positioning service is provided according to the performance requirements and the time during which the system is expected to provide the positioning service in the targeted service area.
Positioning Integrity. It is a measure of the trust in the positioning accuracy.
Positioning Service Latency. This is the time elapsed between the event of a positioning request and the availability of position-related data at the system interface.
In addition to these KPIs, Release 17 has set requirements to address Industrial Internet of Things (IIoT) and automotive use cases (i.e., V2X), particularly in scenarios where GPS signals degrade, such as indoor environments. For commercial use cases, the positioning requirements aim for horizontal accuracy below 1 m and vertical accuracy below 3 m for 90% of UEs. In IIoT use cases, the requirements are even more stringent, with horizontal accuracy below 0.2 m and vertical accuracy below 1 m for 90% of UEs. The expected positioning service latency is between 0.1 and 1 s, and the positioning service availability ranges between 95% and 99.9%, depending on the specific use cases [86, Table 7.3.2.2-1].
Release 17 has also introduced requirements for Low-Power High-Accuracy Positioning (LPHAP) for IIoT, focusing on devices with low power and very low power constraints. One typical use case involves tracking workpieces both indoors and outdoors in assembly areas and warehouses. The positioning requirements in this scenario include accuracy below 1 m and a positioning interval of 15–30 s [81]. Furthermore, Release 17 has evaluated use cases for V2X and public safety supporting sidelink, establishing horizontal accuracy below 0.5-1.5 m and vertical accuracy below 2–3 m for V2X, and horizontal accuracy below 1 m and vertical accuracy below 0.2 m for public safety [88].
C. Positioning in NR and Enhancements in Rel-17 and Rel-18
Positioning techniques have been implemented in LTE and 5G cellular systems to provide services to users [89]. In NR, positioning is based on the Location Management Function (LMF), responsible for estimating the UE position based on UE measurements and NG-RAN assistance. Although similar to LTE, the protocol carrying position information between the NG-RAN and the LMF is new, known as NR Positioning Protocol A (NRPPa). Positioning methods in 5G are based on PRSs and SRSs, configured by the LMF and by RRC, respectively. The PRS is specifically designed for extensive coverage and interference suppression. To achieve this, the PRS is spread across the entire NR bandwidth, transmitted over multiple symbols, and combined with the PRSs sent by other gNBs, allowing power accumulation. The 5G positioning methods are described in the following [90]:
NR Enhanced Cell ID (E-CID). This straightforward method relies on the network reporting the location of the serving cell and its coverage based on the Broadcast Channel (BCH) measurements performed by the UE.
MIMO and Beamforming. Utilizing MIMO in NR, along with beam sweeping, enables AoA determination through SRS measurements by the gNB in uplink and AoD determination through PRS measurements by the UE in downlink. The gNB and UEs report these measurements to the LMF for position estimation, which is limited by the RSS reporting resolution (i.e., 1 dB).
Time Difference of Arrival (TDoA). This method involves continually measuring the PRS and SRS. In the downlink, the UE receives PRSs from multiple gNBs, and its position is determined by intersecting the hyperbolas created by the timing differences (a minimal numerical sketch is given after this list). Similarly, in the uplink, neighboring gNBs receive the SRS sent by the UE, and the LMF computes the UE position. Limitations are due to potential Non-Line-of-Sight paths and gNB synchronization.
Multi-cell Round-Trip Time (Multi-RTT). Using the PRS and SRS, the gNB and UE measure RX-TX time differences over multiple cells. The LMF estimates the position using Multi-RTT, which relaxes the synchronization requirements thanks to the multiple measurements.
These methods collectively contribute to accurately positioning UE in 5G networks, catering to diverse use cases and scenarios.
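As a numerical illustration of the downlink TDoA method listed above, the following Python sketch recovers a 2D UE position from the time differences measured toward three gNBs with known positions; synchronization errors and multipath, which limit the method in practice, are ignored, and the coarse grid search stands in for a proper least-squares solver:

import numpy as np

C = 299_792_458.0
gnbs = np.array([[0.0, 0.0], [500.0, 0.0], [0.0, 500.0]])   # known gNB positions (m)
ue_true = np.array([180.0, 240.0])                          # ground truth, used to simulate measurements

ranges = np.linalg.norm(gnbs - ue_true, axis=1)
tdoa = (ranges[1:] - ranges[0]) / C                         # TDoA w.r.t. the first (reference) gNB

def residual(p):
    r = np.linalg.norm(gnbs - p, axis=1)
    return (r[1:] - r[0]) / C - tdoa

# Coarse grid search (a real implementation would refine it, e.g., with Gauss-Newton)
grid = np.mgrid[0:500:2.0, 0:500:2.0].reshape(2, -1).T
best = min(grid, key=lambda p: np.sum(residual(p) ** 2))
print(best)   # close to the true position (180, 240), within the grid step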
Nevertheless, further enhancements in positioning accuracy have been achieved through updates implemented in 5G releases, focusing on improving various communication aspects, such as bandwidth expansion up to 400 MHz (providing better timing and multipath resolution), larger antenna sizes (enhancing angular estimates through narrower beams), transmission at higher frequencies in mmWave bands (24.25 GHz to 52.6 GHz or higher for THz communications), enabling narrower or pencil beams, scalable SCS (higher SCS increasing resolution in resolving Line-of-Sight from multipath), and higher antenna gain (permitting higher received power). The densification of access points is crucial to ensuring Line-of-Sight and reducing range, inherently improving positioning accuracy.
Release 18 introduces enhancements based on carrier phase-based positioning, leveraging the conceptual method of GPS. The UE correlates the pseudo-range sequence transmitted by the gNBs and calculates its distance from each gNB based on the corresponding transmitter-to-receiver signal propagation time. The accuracy depends on the correlation properties of the pseudo-range sequence, primarily influenced by the UE sampling resolution and the signal bandwidth. Carrier phase measurements can also be exploited for AoA estimation when the UE is equipped with at least two antennas. In this case, the different phases of the PRSs detected by the two antennas allow the AoA to be estimated, resulting in a more accurate estimate compared to the beam-sweep method. However, gNB clock mismatch may degrade the pseudo-range accuracy. As mentioned earlier, further positioning enhancements utilize sidelink signals in V2X use cases, where relative positioning supports remote and autonomous driving. Sidelink signals, such as the SL-PRS/SRS, the NR sidelink carrier phase, and positioning in RRC_INACTIVE mode are employed to enhance the positioning of V2X, public safety, RedCap, and LPHAP devices, employing similar techniques implemented through sidelink signals to save energy [52].
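The fine-but-ambiguous nature of carrier phase measurements can be illustrated with a few lines of Python: the phase provides the fractional part of the gNB-UE distance in carrier wavelengths, while the integer number of cycles must be resolved from a coarse timing-based estimate (the numbers below are arbitrary, and the carrier frequency is an assumption):

C = 299_792_458.0
f_c = 3.5e9                       # assumed FR1 carrier frequency
lam = C / f_c                     # wavelength, roughly 8.6 cm

true_distance = 123.456                      # metres (ground truth for the example)
phase = (true_distance / lam) % 1.0          # measured carrier phase, in cycles

coarse = 123.43                   # coarse timing-based estimate; must be within ~lam/2 for naive rounding
N = round(coarse / lam - phase)   # resolve the integer cycle ambiguity
print(f"{(N + phase) * lam:.4f} m")          # ~123.4560 m, much finer than the coarse estimate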
The next challenges in providing high positioning accuracy include: i. The synchronization precision of the nodes sending reference signals (e.g., gNBs); ii. The use of THz sensing for 3D images, enabling Simultaneous Localization and Mapping (SLAM); iii. AI/ML positioning techniques to enhance signal estimates, especially in scenarios with unpredictable radio propagation characteristics (e.g., Non-Line-of-Sight links), as in indoor environments; AI/ML may support fingerprinting or ray-tracing [91]; iv. The usage of external sensors such as video cameras and LiDAR mounted on vehicles or deployed in squares, roads, or indoor spaces. Computer Vision, with proper settings like fixed anchor points in the image, can capture an object or person within an image, estimate its position in the scene or its distance, and track it. Similarly, point cloud data collected by LiDAR can detect a person or object in an environment.
Applications and Service Requirements
The three bearer services introduced earlier will enable a range of foundational applications to develop in the near future, significantly enhancing our lives. Before briefly outlining their characteristics, it is useful to analyze and assess the technical requirements that the immersive communications and everything connected services must meet, according to a set of parameters.
Table 2 presents the requirements of specific services related to immersive communications, as described in Section II. Requirements are expressed based on values from the literature (examples are in [11], [14], [20], [92], [93], and [94]) and reworked and refined by the author according to the description provided in Section II-A. The parameters considered for comparing services offered by immersive communications include the data rate, evaluated for both downlink and uplink, the reliability, defined as the percentage of packets successfully delivered to the application, the packet transmission latency, and the refresh rate for packet generation. The power limitations of devices are also taken into account; in this case, the device's mobility, which translates into reduced battery capacity, is particularly assessed. As expected, higher data rates are required for HTC and VR. HC requires higher refresh rates than the other cases. Low latency values are crucial for MR/AR, especially for typical use cases such as pedestrian crossing and manufacturing assistance.
Table 3 presents the requirements for devices related to everything connected, as described in Section III. In order to evaluate the four types of devices introduced in Section III, the following parameters have been considered: data rate, reliability defined as the percentage of packets successfully delivered to the application, latency range for packet transmission in the communication system, degree of mobility or allowed speed of the device, power constraints, and the level of accuracy required in the device’s localization.
For smartphones and vehicles, power constraints are less stringent thanks to their long battery life. However, for devices implanted in the human body or low-cost sensors with limited processing capacity, battery life becomes crucial for service provision. Time-sensitive sensors, as well as remote controls for sending commands to drones and autonomous vehicles, demand stringent latency and transmission reliability requirements. As expected, the most stringent data rate requirements are for smartphones, goggles, viewers, and other devices essential for users to enjoy high-quality videos. In the following, a brief description of the main basic applications with the related requirement characteristics is reported.
A. Smart Factory
With the definition of Industry 4.0 in 2013 and the need to automate factories for competitiveness and cost-related reasons, the traditional manufacturing paradigm has shifted towards the smart factory. The concept is based on a collaborative manufacturing system aimed at responding to customer needs (mass customization). It integrates a vast amount of data collected from sensors, appliances, and devices, driven by an AI engine [95]. The smart factory relies on Big Data, Cyber-Physical Systems (CPS), IoT sensors, and cloud/virtualization resources to obtain a Digital Twin of the manufacturing chain, in order to become fully automated and flexible by optimizing products and processes. Some definitions and models are in [95] and [96] and the references therein. In a smart factory, collaborative robots (or co-bots) are synchronized and perfectly integrated into the product supply chain, requiring low latency and a medium-to-high data rate in the factory area for the interaction with customers, for possible resource re-allocation, or for flexibility in rearranging the assembly line. High-precision positioning is required for the cooperation between humans and mechanical arms, while a less stringent localization is needed for workpieces along the assembly line [97]. Specific applications include MR and VR for the design of customized products and sensors integrated with final products for predictive maintenance.
B. Smart City and Smart Agriculture
Applications belonging to this group involve an extensive deployment of sensors and actuators across a wide area, such as in the case of smart cities and smart agriculture, with the aim of optimizing human life, promoting economic growth and environmental sustainability. These sensors and actuators collect data and perform functions through a close interaction between data and processes, involving communication systems, processing capabilities, and autonomous devices such as drones and self-guided vehicles. In the case of smart cities, citizens and other operators can benefit from optimized processes and information retrieval to access services that enhance their quality of life [98], [99]. In the context of smart agriculture, farmers, breeders, and farm managers can optimize functions such as irrigation, fruit harvesting, and animal feeding [100], [101], [102]. Significant importance lies in the proper positioning of autonomous vehicles, including tractors and rovers, to provide support.
C. Automotive and Smart Transportation
The increase in vehicles in urban and suburban areas highlights the urgency of traffic optimization and the enhancement of safety for both passengers and pedestrians on the roadside. Advancements in connectivity and processing technology have facilitated the proliferation of connected vehicles and smart transportation, commonly referred to as Intelligent Transport Systems (ITS) [103]. ITS aims to integrate vehicles, passengers, freight, and road infrastructure by leveraging IoT, Big Data, and AI to reduce traffic congestion, enhance safety, and enable autonomous/remote driving [104]. Some use cases have been proposed by 3GPP in [45] and by the 5G Automotive Association (5GAA) in [105]. In addition to these, scenarios related to logistics and package delivery should be considered. Immersivity is only crucial in a few use cases (e.g., remote driving), while data collection from on-board sensors and those deployed along the road or on packages is critical for data sharing and flow optimization. However, only a few sensors are employed simultaneously in a certain area of interest. Extreme localization is very important, both for safety and to avoid confusion among the elements involved (e.g., vehicles or packages).
D. Digital Health
The possibility of using smartphones/tablets and communication systems has opened up new opportunities in the health sector, with the aim of providing care services to a larger number of people, optimizing patient information retrieval and sharing, and reducing costs in the healthcare field [106]. Several use cases or specific applications may be implemented, as introduced in the following. The simplest use case is remote patient monitoring, enabling faster intervention and predictive e-health; it requires interfacing medical devices installed on the body with storage and communication capabilities. Another use case is the possibility of sharing patient data between doctors, between hospitals, and with ambulances in case the patient is involved in an incident. The third use case involves the possibility for a doctor to perform remote medical exams and diagnoses. This is particularly useful for patients in remote areas, on cruise ships, or in locations where doctors and nurses are not readily available; in this scenario, the doctor connects remotely to a medical device located where the patient is and can conduct a clinical examination. A more challenging application is telesurgery, which is present in many research studies and potentially achievable in some cases with future telecommunications systems, as they will be capable of ensuring minimal latency (below 2–5 ms), extreme reliability exceeding 99.999%, and excellent maneuverability of surgical instruments [107]. However, it remains only potentially feasible in the coming years (more than 5–8 years) due to equipment costs. The integration of medical devices and sensors with transceiver units or smartphones is important for monitoring and detecting patients' biomedical parameters. The use of visors and HMDs can facilitate collaboration among multiple doctors in holographic mode for diagnoses.
E. Smart Environments
A smart environment is an ecosystem where people and objects interact to deliver a high-level service within a confined area, coordinating and controlling complex information and flows. Examples of smart environments include offices, homes, schools, and warehouses, each tailored to provide distinct services for humans. In this scenario, immersion (i.e., the capability for humans to be part of a scene) and presence (i.e., the capability of being physically and spatially located in an environment) are fundamental for the provided services, such as teaching, virtual meetings, or managing home appliances [18]. Connectivity with numerous sensors simultaneously and object positioning in a room are less critical for the overall functionality.
1) Public Safety
Communication networks employed in public safety are specifically dedicated to emergency forces, including the police, fire, and emergency medical services. These networks are strategically designed to prevent or respond to incidents that could potentially be dangerous for people and properties. Current technologies utilized by first responders and law enforcement include TETRA, TETRAPOL, and P.25, with LTE gaining adoption following the development of device-to-device interfaces and group communication functionality [108]. Further enhancements are deemed necessary to enable fast communication, connect with available sensors in the area of interest, and accurately localize both people and law enforcement officers, even in adverse environments.
Before concluding this section, the importance of new bearer services for the main application fields is summarized in Table 4.
It is noteworthy that the development and diffusion of the applications just described require, on the one hand, the advancement of the next-generation network (and thus the three main services) and, on the other hand, the equally important development of application-specific devices. This implies a close collaboration between global partners involved in direct business with network stakeholders, i.e., vendors and telecommunication network operators. An example is the automotive sector, where vehicle manufacturers need to integrate chipsets to enable V2X in their vehicles. It becomes important to use standardized chipsets that allow interoperability in vehicle communications across different automakers. The same issue arises in communications with road infrastructure, which will need to deploy chipsets in roadside elements (such as traffic lights and city gates) that ensure high interoperability both in terms of communication and of service providers, even when motorists travel in other countries. Another example is related to the spread of NTN, where it is necessary to use a single piece of equipment to guarantee service continuity and ubiquitous connectivity. In this case, the terminal used by the user must be able to connect seamlessly to both the terrestrial and satellite networks and adapt to various constellations (i.e., LEO and possibly even higher-orbit constellations such as MEO and GEO) in terms of power and frequency, or even foresee the implementation of dedicated radio interfaces.
In Table 5, there are examples of innovative products and project pilots that support advancements in applied research, carried out by private companies.
Enabling Technologies
The implementation of new services and their associated applications requires the adoption of several enabling technologies in the next-generation wireless networks. These technologies are organized starting from the physical layer, moving to technologies implemented in the access points and at the link level. Next, improvements achievable at the architectural level are considered, followed by the utilization of integrated sensing, positioning, and communication technologies, as well as the integration of AI or ML at all levels of the protocol stack and within each element of the communication system. Finally, potential future technologies are explored. The following outlines the most promising technologies envisioned for implementation in the next few years, with the goal of transitioning from 5G-Advanced to the initial phases of 6G networks. As expected, the use of each technology or technique can be more or less appropriate depending on the goal to be achieved (such as increasing the data rate or the number of users to be served simultaneously), its limitations (such as power, size, and latency requirements), and the given use case.
A. Spectrum Increase and Dynamic Band Allocation
Probably the simplest improvement to apply is increasing the bandwidth assigned to the telecommunication system, consequently boosting data rates for novel services [109]. With advancements in antenna technology, the use of millimeter waves (mmWave), ranging from 30 GHz to 300 GHz (i.e., wavelengths from 1 to 10 mm), has become viable for cellular systems, extending frequencies beyond 6 GHz. With Release 17, the 3GPP 5G FR2 has been extended up to 71 GHz (frequencies between 24.25 GHz and 71 GHz) [110]. The next frontier is terahertz communications (THz comms), collectively ranging from 0.1 THz to 10 THz (or wavelengths between 3 mm and 30 µm).
Researchers are also exploring Visible Light Communications (VLC), using LED or laser transmissions within the visible light spectrum, which extends from 380 nm to 780 nm [113]. IEEE defined a standard for the PHY and MAC about ten years ago as part of IEEE 802.15.7 [114]. Higher frequencies provide larger bandwidths (on the order of GHz) but pose challenges such as atmospheric absorption (due to oxygen and water vapor).
In conjunction with the spectrum increase, further techniques are envisaged for the upcoming telecommunication systems to accommodate very high data rates and flexible access. Dynamic bandwidth allocation, including variable bandwidth, a common pool of bands, licensed and unlicensed bandwidth exchange, and the adoption of sharing strategies between primary and secondary network operators, may be considered in areas with specific traffic density or variability. Furthermore, novel duplexing configurations can be adopted, such as Sub-Band Full Duplex (SBFD). Part of the bands in TDD can be assigned to UEs requiring high uplink throughput, while the rest is left by the base station to downlink-heavy traffic, enabling simultaneous transmission in both directions according to the traffic requirements in the cell [115].
B. Proliferation of Access Points and Link Enhancements
The throughput increase certainly involves updating the RAN and implementing advanced channel coding and modulation techniques at the link level [116]. This involves various aspects, ranging from chip-set processing in devices to updating coding techniques for novel services like XR and HC, as well as for short messages in IoT. Additionally, novel transmission waveform techniques are being explored [117]. Several coding technique updates are proposed at the link level, including Low-Density Parity-Check (LDPC) codes, Polar codes, and 4G turbo codes, making performance more robust in various environments [118], [119]. Another technique gaining attention is Semantic Communications (SemCom), aiming to convey the meaning behind a transmitted message by transmitting only semantically-relevant information, rather than supporting symbol-by-symbol reconstruction. This approach minimizes power usage, bandwidth consumption, and transmission delay [120] (see also [121] for details). SemCom is based on extracting semantic information from the source message to be transmitted, usually accomplished using a semantic encoder based on transformers [122]. Prototypes are already available [123].
Another topic aimed at increasing spectral efficiency in 6G is the investigation of new waveforms and multiple-access schemes in addition to OFDM, which was widely adopted in 5G and 5G-Advanced. Proposals include Non-Orthogonal Multiple Access (NOMA), where two or more users share the same physical resource (i.e., frequency, time, or code) and are separated in the power domain by assigning different power levels, or through distinctive codebooks, similar to CDMA but with more complex receivers [124]. Sparse Code Multiple Access (SCMA) is another proposal, using two or three constellations over multiple subcarriers per user, providing redundancy but also adding complexity [125]. Orthogonal Time Frequency Space (OTFS) modulation combines a radar-like delay-Doppler representation, which captures user movement, with the communication signal [126]. However, it is widely accepted that ML will heavily influence both waveform and coding definitions in 6G.
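A minimal numerical sketch of power-domain NOMA may help: two users are superposed with different power levels, and the near user applies successive interference cancellation (SIC), decoding the stronger far-user signal first and subtracting it before decoding its own symbols; BPSK and a noiseless channel are assumed for brevity:

import numpy as np

rng = np.random.default_rng(0)
n = 8
far_sym = 2 * rng.integers(0, 2, n) - 1          # BPSK symbols of the far (cell-edge) user
near_sym = 2 * rng.integers(0, 2, n) - 1         # BPSK symbols of the near user

p_far, p_near = 0.8, 0.2                         # power split: more power to the far user
tx = np.sqrt(p_far) * far_sym + np.sqrt(p_near) * near_sym   # superposed signal

# Near-user receiver (noise omitted for clarity)
far_hat = np.sign(tx)                            # decode the stronger (far) user first
residual = tx - np.sqrt(p_far) * far_hat         # SIC: remove the far user's contribution
near_hat = np.sign(residual)                     # then decode the near user's own symbols

print(np.array_equal(far_hat, far_sym), np.array_equal(near_hat, near_sym))  # True True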
The proliferation of a large number of access points enables the possibility for a user to be connected with more than one antenna simultaneously, a concept known as DC in cases of slow/non-optimal backhaul connection between two base stations. Increasing throughput also involves enhancements in MIMO, which provides spatial multiplexing and/or diversity gains, thereby increasing spectral efficiency without requiring additional frequency, power, or time resources [127].
Various MIMO versions have been proposed in the literature, such as Single-User MIMO and Multi-User MIMO (based on the number of users served simultaneously), massive MIMO (which employs a larger number of antennas relative to the low number of served users), and centralized MIMO and distributed MIMO (depending on whether a compact large-scale antenna array is adopted at the base station or a large number of antennas is geographically spread out over a cell). Emerging concepts related to MIMO include Coordinated Multi-Point (CoMP), distributed antenna systems (DAS), network MIMO, where a UE is served by more than one multi-antenna base station over an optimized backhaul/fronthaul, and the recent cell-free network. In the cell-free network, all UEs in an area are served by a large number of local access points (with one or more antennas) through joint transmission, exploiting local pre- and post-coding channel information. For more details, refer to [128] and [129], and the references therein. Despite MIMO improvements, practical limitations arise, including pre- and post-processing challenges, especially in the distributed case, network synchronization, low-latency requirements in the fronthaul, channel estimation challenges, especially in FDD, channel information acquisition, RF channel calibration, and power consumption.
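The spatial-multiplexing gain mentioned above can be quantified with the classical equal-power MIMO capacity expression, C = log2 det(I + (SNR/Nt) H H^H); the short Monte Carlo sketch below compares a 4x4 i.i.d. Rayleigh channel with the SISO case at the same SNR (an idealized setting that ignores the practical limitations just listed):

import numpy as np

rng = np.random.default_rng(1)

def avg_mimo_capacity(nt, nr, snr_linear, trials=2000):
    caps = []
    for _ in range(trials):
        H = (rng.standard_normal((nr, nt)) + 1j * rng.standard_normal((nr, nt))) / np.sqrt(2)
        M = np.eye(nr) + (snr_linear / nt) * H @ H.conj().T
        caps.append(np.log2(np.linalg.det(M).real))
    return float(np.mean(caps))

snr = 10 ** (20 / 10)   # 20 dB
print(f"SISO: {np.log2(1 + snr):.1f} bit/s/Hz")
print(f"4x4 : {avg_mimo_capacity(4, 4, snr):.1f} bit/s/Hz")  # a multi-fold gain at high SNR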
MIMO was introduced in 3GPP Release 6 for HSPA and has been continually enhanced in subsequent releases. LTE Release 8 already made full use of MIMO through its downlink transmission modes, with dual-layer beamforming (TM8) added in Release 9, while Release 10 (LTE-Advanced) extended spatial multiplexing up to eight downlink layers with TM9.
Lastly, the Reconfigurable Intelligent Surface (RIS) is gaining growing interest due to its extensive potential in mmWave and future 6G applications [131], [132]. The concept involves modifying the surface properties by dynamically adjusting the amplitude, delay (phase), and polarization of each surface element to control incident electromagnetic waves in real time. RIS improves communication links and extends the coverage of wireless networks. Current RIS elements can passively modify their impedance in response to external stimuli, but new developments can also refract electromagnetic waves effectively, implementing the Simultaneously Transmitting And Reflecting RIS (STAR-RIS) [133].
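The basic RIS operating principle can be sketched in a few lines: with per-element cascaded channels h_i (transmitter to RIS) and g_i (RIS to receiver), setting the i-th element phase to -(arg(h_i) + arg(g_i)) makes all reflected paths add coherently at the receiver, and the received power then grows roughly with the square of the number of elements (an idealized, noise-free illustration):

import numpy as np

rng = np.random.default_rng(2)
N = 64                                                    # number of RIS elements
h = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)   # gNB -> RIS
g = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)   # RIS -> UE

random_config = np.exp(1j * rng.uniform(0, 2 * np.pi, N))
aligned_config = np.exp(-1j * (np.angle(h) + np.angle(g)))  # phase-align all reflected paths

def rx_power(phases):
    return np.abs(np.sum(h * phases * g)) ** 2

print(f"random RIS configuration : {rx_power(random_config):.1f}")
print(f"aligned RIS configuration: {rx_power(aligned_config):.1f}")   # roughly N^2 larger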
C. Innovative Architectures
To ensure very high data rates and global connectivity, the next-generation network must evolve toward novel architectural models. The primary concern is to ensure flexibility and adaptability, allowing for the coexistence of various possible architectural models as new scenarios emerge, giving rise to the concept of a network of networks. Several of these models have been analyzed in this paper, and a brief recap follows. The integration of terrestrial networks with NTN (including UAVs, HAPSs, and new very small satellites, namely cubesats) will be essential to ensure worldwide connectivity. Additionally, the multi Radio Access Technologies (RAT) and the proliferation of access nodes enable the ultra-dense network, supporting flexible accessibility to large base stations, small base stations, femto base stations, Wi-Fi, private networks, NTN, etc. This adaptability is crucial to cater to various use cases and scenarios, guaranteeing full mobility (e.g., for high-speed users, drones, and vehicles), meeting local requirements (e.g., in a smart factory or for smart agriculture), and supporting emergency situations for public safety operations. Possible device-to-device connections may be necessary to extend local coverage or reduce communication latency between two devices.
With the deployment of the 5G Service-based Architecture (SBA), the system architecture has undergone a transformation from a traditional mobile system to a shared platform capable of providing open services. All network functionalities and data within a common repository may be delivered through a set of interconnected Network Functions (NFs) by exposing APIs to other NFs or to third parties. This approach involves network softwarization, where hardware, such as antennas, switches, routers, and network servers, is separated from software [134]. The software configures the hardware based on the required services in specific locations, giving rise to concepts like i. Network Function Virtualization (NFV), which implements network functions such as routing, switching, firewalling, and other functionalities for mobility, session management, and user authentication on virtual machines, rather than as physical devices; ii. Software-Defined Networking (SDN), which allows the network administrator to configure the network devices (e.g., switches and routers) by software instead of manually, in order to manage the traffic flows; iii. Software-Defined Radio (SDR), which defines functionalities related to radio access technology and RF front-end programmability [135]. Softwarization offers numerous advantages, including automated network management processes, rapid development of new services, and reconfigurability to meet changing traffic flows and foster innovation. Different logical networks supporting various verticals, such as V2X and mIoT, can be easily implemented over the same infrastructure using network slicing [136]. Another enhancement involves the deployment of private networks in local areas dedicated to specific verticals, such as closed-loop manufacturing, logistics and warehousing, and venue security. Private networks, utilizing technologies like 5G and 6G, but also 4G or the latest version of Wi-Fi (IEEE 802.11be), can support various industrial applications and operate in licensed and unlicensed spectrum. Managed by private owners, these networks offer control over scheduling strategies, security levels, resource allocation, and access permissions [137]. However, industrial applications demand more stringent network performance in terms of throughput, latency, device density, and reliability. To address this, research and standardization efforts are focused on Deterministic Networks (DetNets). DetNets feature bounded delay, jitter, and packet loss rates by coordinating scheduling and forwarding in each node and utilizing redundant links or end-to-end time-based resource reservation. The IETF and IEEE have defined specifications for deterministic latency at layer 3 (IP/MPLS) and layer 2 (Time-Sensitive Networking (TSN) for Ethernet-based networks), respectively [138]. Additionally, in Releases 17 and 18, 3GPP has initiated studies on DetNet [139].
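In practice, each of the logical networks mentioned above is identified by an S-NSSAI, composed of a Slice/Service Type (SST, with values 1, 2, and 3 standardized for eMMB-, uRLLC-, and mIoT-type slices) and an optional Slice Differentiator (SD). The following Python sketch shows a hypothetical operator policy that maps traffic types to slices; the SD values and the mapping itself are illustrative only:

from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class SNssai:
    sst: int                    # Slice/Service Type
    sd: Optional[str] = None    # optional Slice Differentiator (e.g., per tenant/vertical)

SLICE_POLICY = {                # hypothetical operator policy
    "v2x": SNssai(sst=2, sd="0A1B2C"),       # latency-critical traffic -> uRLLC-type slice
    "metering": SNssai(sst=3, sd="000001"),  # massive IoT slice
    "video": SNssai(sst=1),                  # default eMMB slice
}

def select_slice(traffic_type: str) -> SNssai:
    return SLICE_POLICY.get(traffic_type, SNssai(sst=1))

print(select_slice("v2x"))      # SNssai(sst=2, sd='0A1B2C')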
In summary, the network supporting 6G must be highly flexible for the rapid deployment of a diverse range of services and dynamic to respond swiftly to user requests across time and space. This flexibility is achieved through softwarization, while dynamicity is ensured by integrating various networks (e.g., NTN, private, DetNets), locally allocating spectrum, and deploying dense access points.
Before concluding this subsection, two further aspects deserve consideration, beyond any cost limitations. First, new architectures must ensure interoperability with existing systems. In particular, the 5G SBA is equipped with the Non-3GPP Interworking Function (N3IWF), which is responsible for interworking between untrusted non-3GPP networks and the 5G core [30]. It allows seamless connectivity (e.g., handover, IPsec) to non-3GPP networks such as Wi-Fi and existing satellite networks through the N2 and N3 interfaces and through the interaction of 5G elements such as the AMF, SMF, and UPF. In other cases, interoperability can be achieved at a local level, for example by employing short-range technologies (e.g., Near Field Communication (NFC), Bluetooth, or ZigBee) capable of locally connecting sensors, smartwatches, or peripherals instead of using the PC5 interface. The second aspect is regulation. The deployment of any service is linked to the regulation of operating frequencies and the bands assigned globally. For example, in the case of transportation, a 40 MHz band is assigned at 5.9 GHz regardless of the technology (i.e., DSRC or cellular connectivity). This does not prevent extending the band to provide additional services, as in the US, where 75 MHz (licensed 5.850-5.925 GHz) has been assigned at 5.9 GHz, or in Japan, with an extra band at 700 MHz (from 755 to 765 MHz) for safety ITS applications. Refer to [140] and [141] for ITS usage. In the satellite domain, proprietary solutions coexist with solutions standardized at the 3GPP level, making it more complicated to connect to more than one satellite network. Possible solutions include: 1. multi-technology terminals (capable of connecting to more than one network); 2. terminals equipped with SDR (capable of sensing the signals transmitted in a certain area, detecting the technology, and adapting their transmission accordingly); 3. defining standardized transmission frequencies and bands globally, partly allocated to unlicensed transmission where more than one technology can transmit (similar to the ISM bands) and partly assigned to a dedicated technology (e.g., NR for NTN) and its service providers.
D. Convergence of Sensing, Positioning, Computing and Communications
It is evident that most evolutions and enhancements of cellular systems are geared towards increasing data rates to provide innovative and advanced services (e.g., XR, holography, or simply the ability to detect finger movements and translate them into actions). However, some new services also require low latency and accurate positioning. It has been observed that technologies that combine sensing and positioning, sensing and communication, or communication and processing meet the individual requirements better than disjoint, separately designed technologies.
The importance of positioning has been extensively discussed earlier, including a detailed overview of the main techniques in Section IV. Sensing, on the other hand, refers to the capability of detecting and, if possible, tracking an object or a person within an area. It often involves estimating speed, size, and relative position within the area, taking into account the other objects present. Emerging use cases include gesture/facial expression recognition, human position recognition, detection of unauthorized vehicles/drones, monitoring of objects in smart factories for interaction with manufacturing robots, and tracking of devices without communication capabilities (e.g., micro-robots within the human body and biodiversity control) [142]. Sensing and positioning are closely intertwined: once an object is detected, it can be accurately positioned within the area, depending on the sensing technique employed. One method relies on large antenna arrays and higher frequencies, which will be deployed in 5G-Advanced and 6G to achieve pencil beams and mitigate multipath, improving angular and range resolution. Another method involves cameras capable of detecting potential targets through Computer Vision; as a field within ML, this and similar approaches require large amounts of training data to detect one or more objects within an image or video [143]. Additionally, sensors can be mounted on the UE, such as on the wheels or steering of a vehicle to determine the UE’s trajectory, or a barometer on a UAV to ascertain its altitude. GPS can be utilized to synchronize local sensors, enhancing TDoA estimation [144].
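To make the TDoA principle concrete, the following minimal sketch estimates a 2D position from time-difference measurements using nonlinear least squares. The anchor coordinates, timing jitter, and solver choice are assumptions made for illustration only and are not a method prescribed in [144].

```python
import numpy as np
from scipy.optimize import least_squares

C = 3e8  # propagation speed (m/s)

# Hypothetical anchor positions (e.g., gNBs) and true UE position, in metres.
anchors = np.array([[0.0, 0.0], [500.0, 0.0], [0.0, 500.0], [500.0, 500.0]])
ue_true = np.array([180.0, 320.0])

# TDoA measurements relative to anchor 0, corrupted by ~1 ns of timing jitter (assumed).
rng = np.random.default_rng(0)
dists = np.linalg.norm(anchors - ue_true, axis=1)
tdoa = (dists[1:] - dists[0]) / C + rng.normal(0, 1e-9, size=3)

def residuals(p):
    """Difference between predicted and measured TDoAs for a candidate position p."""
    d = np.linalg.norm(anchors - p, axis=1)
    return (d[1:] - d[0]) / C - tdoa

est = least_squares(residuals, x0=np.array([250.0, 250.0])).x
print(f"estimated position: {est}, error: {np.linalg.norm(est - ue_true):.2f} m")
```

With 1 ns of timing error the residual range error is on the order of a few tens of centimetres, which is why tight synchronization (e.g., via GPS) directly improves TDoA accuracy.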
Sensing and communication techniques have traditionally been developed independently. However, due to the increasingly crowded spectrum and the demand for a higher density of smart devices in specific environments (e.g., robots and drones), recent research has investigated waveforms that jointly perform Integrated Sensing And Communications (ISAC), sometimes referred to as Joint Communication and Sensing (JC&S). This approach has the potential advantage of using the same hardware for both functions, addressing cost and performance issues [145]. ISAC is particularly useful when traditional sensing technologies like cameras and LiDAR cannot be mounted on objects, or when the environment has obstructions or weather limitations (e.g., fog and rain); in such cases, radio sensing can provide a solution. It is worth noting that sensing can be monostatic, when transmitter and receiver are co-located (i.e., sensing is performed by the same device), or bistatic, when transmitter and receiver are in different positions. Possible implementations include: i. frequency division between data transmission and radar measurements, ii. simultaneous usage with time duplexing for backscattering reception, and iii. joint use of data/radar transmission in a bistatic fashion, involving two additional nodes besides the target node to be positioned. One proposal utilizes IEEE 802.11bf, which defines sensing procedures both for frequencies below 7 GHz (2.4 GHz, 5 GHz, and 6 GHz) and, through the Directional Multi-Gigabit (DMG) sensing procedure, in the mmWave range (above 45 GHz). The IEEE 802.11bf Task Group has introduced enhancements dedicated to channel estimation, including training sequences and the beam refinement protocol (BRP) for the sensing procedure. The general principle of radar sensing and the trade-off between sensing accuracy and overhead are explained in [146]. Radar techniques aim to achieve resolution in separating objects or people in terms of range, angle, or velocity. Another method involves ultra-wideband (UWB) within the IEEE 802.15.4z standard, utilizing radio pulse sequences [147]; this technology is endorsed by car connectivity consortia for collision avoidance. The MAC layer of UWB is highly flexible, allowing various configuration options in terms of repetition frequency, header lengths, and the number of devices involved in the measurements, enabling joint measurements and data transmission simultaneously. Further proposals [148] explore the integration of radar functionalities within cellular radio systems, some of which are described in Section IV. In [149], the adoption of OFDM signals and processing strategies for interference cancellation is investigated.
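The accuracy/overhead trade-off mentioned above can be quantified with the classical radar resolution relations. In the sketch below, the carrier frequency, sensing bandwidth, and observation time are assumed values chosen only to show the orders of magnitude involved; they are not figures taken from [146].

```python
# Classical radar resolution relations for an ISAC waveform (illustrative values only).
c = 3e8          # speed of light, m/s
fc = 28e9        # carrier frequency, Hz (assumed mmWave carrier)
B = 400e6        # sensing bandwidth, Hz (assumed)
T_obs = 10e-3    # coherent observation time, s (assumed)

range_resolution = c / (2 * B)                  # minimum range separation between targets
wavelength = c / fc
velocity_resolution = wavelength / (2 * T_obs)  # minimum velocity separation between targets

print(f"range resolution:    {range_resolution:.2f} m")     # ~0.37 m
print(f"velocity resolution: {velocity_resolution:.2f} m/s") # ~0.54 m/s
```

Wider sensing bandwidth improves range resolution and longer observation improves velocity resolution, but both consume time-frequency resources that would otherwise carry data, which is precisely the overhead trade-off.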
With the rise of new Internet services, such as 4K/8K video streaming, 360-degree MR/VR, and autonomous driving, upcoming communication systems are being enhanced by strategically locating processing (and storage) capabilities within the network. ETSI introduced the Multi-access Edge Computing (MEC) architecture [150] with the objective of improving the economic sustainability of telecommunication operators by reducing the Total Cost of Ownership (TCO) of their networks. The basic concept is to bring large content as close as possible to the user, thereby minimizing traffic within the network and offloading remote servers. MEC represents an evolution of the Cloud, which is typically deployed deeper within the broader Internet. In addition to reducing transport network traffic, MEC also minimizes latency in data transmission, enhancing overall network performance and facilitating various services. Due to congestion control, the throughput is bounded as \begin{equation*} T_{H}\le \min \left [{{c \cdot \frac {M_{S}}{\mathrm {RTT}} \cdot \frac {1}{\sqrt {P_{L}}}, \max [R_{b}]}}\right ] \tag {5}\end{equation*} where M_S is the segment size, RTT is the round-trip time, P_L is the packet loss probability, c is a constant, and max[R_b] is the maximum bit rate of the path. By shortening the RTT, MEC therefore directly increases the achievable throughput until the link rate becomes the limiting factor.
(Figure: Throughput vs. Round-Trip Time.)
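The effect captured by (5) can be checked numerically. The short sketch below assumes a 1460-byte segment, a 0.01% loss probability, a 1 Gbit/s access rate, and the commonly used value c ≈ sqrt(3/2); these figures are illustrative assumptions, not measurements from the paper.

```python
import math

def throughput_bound(mss_bytes, rtt_s, loss_prob, max_rate_bps, c=math.sqrt(1.5)):
    """Congestion-controlled throughput bound of Eq. (5), returned in bit/s."""
    congestion_limit = c * (mss_bytes * 8 / rtt_s) / math.sqrt(loss_prob)
    return min(congestion_limit, max_rate_bps)

# Decreasing RTT values, e.g., from a remote cloud towards progressively closer MEC hosts.
for rtt_ms in (40.0, 20.0, 10.0, 5.0):
    th = throughput_bound(1460, rtt_ms / 1e3, 1e-4, 1e9)
    print(f"RTT = {rtt_ms:4.1f} ms -> bound = {th / 1e6:7.1f} Mbit/s")
```

Halving the RTT roughly doubles the congestion-limited throughput, which is the quantitative argument for serving large content from the edge.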
In addition to bringing content closer, MEC can support the computational needs of IoT sensors/actuators with limited computing power and battery capacity, or offload the processing of personal devices (e.g., goggles and viewers in XR) [152]. Other concepts have been introduced to emphasize the possibility of performing processing very close to the edge of the network or even on the device itself. In the first case, it is referred to as Fog Computing, where processing occurs in the access point (i.e., in the last section of the network, such as the gNB or the Wi-Fi modem). In the second case, it is known as Mist Computing, which pre-processes data directly on constrained devices, applying basic rules to filter, aggregate, or fuse the data collected by sensors/actuators [154]. Nonetheless, MEC is a general concept, and its definition may sometimes overlap with these other concepts; refer to [153] for detailed information, differences, and advantages.
In upcoming communication systems, effective coordination of processing and communication resources is fundamental. Local edge servers, with their limited content storage and processing capacity, need precise coordination between communication resources and processing/storage resources within specific local areas. The management of how content is loaded onto and removed from a local server becomes important, involving techniques such as ML [155] or popularity-based strategies that cache the most frequently requested content (following Zipf’s law). An additional consideration is the coordination of the radio resources within a given area with the latency requirements of certain applications. Depending on the availability of radio resources and the time-critical nature of the application (e.g., arm movement for XR or face recognition for restricted-area entry), content can be processed in a local MEC or in a second-level MEC positioned deeper within the network.
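As a quick illustration of such popularity-based caching, the sketch below estimates the hit ratio obtained when an edge server stores only the most popular items under a Zipf request distribution. The catalogue size, cache size, and Zipf exponent are assumptions chosen for illustration, not values from [155].

```python
import numpy as np

def zipf_hit_ratio(catalogue_size, cache_size, exponent=0.8):
    """Hit ratio when the cache holds the `cache_size` most popular items
    and requests follow a Zipf(exponent) popularity law."""
    ranks = np.arange(1, catalogue_size + 1)
    popularity = ranks ** (-exponent)
    popularity /= popularity.sum()
    return popularity[:cache_size].sum()

# Assumed scenario: 100k-item catalogue, MEC cache holding 1% of it.
print(f"hit ratio: {zipf_hit_ratio(100_000, 1_000):.2%}")
```

Even a small cache captures a disproportionate share of requests because popularity is heavily skewed, which is what makes edge caching effective at reducing transport traffic.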
In summary, emerging joint techniques that enable Integrated Sensing, Positioning, Computing, and Communications (ISPCC) simultaneously are instrumental in meeting requirements that are challenging to achieve with separate waveforms and technologies. The networks of 2030 will be built on this integration, serving as an enabler for applications that demand object tracking, robot automation, and human-object interaction.
E. Native AI for Communication System
AI, or its subset ML, is rapidly advancing in various aspects of human life. Its application in telecommunications can be categorized into two distinct groups: at the application level and within telecommunication systems. The first group involves applications such as predicting human movements, content retrieval, and extracting objects from images. The second group focuses on optimizing the physical transmission of signals and managing network functionalities, such as predicting UE handovers, resource provisioning, optimizing routing, or RRC setup. AI and ML can support communication systems in the following areas: i. Determining fast and flexible models for hard-to-model and challenging problems; ii. Solving complex tasks to deliver practical solutions for accurate and near-ideal predictions; iii. Modeling non-linear functions through generative processes; iv. Providing continuous enhancements to wireless communication systems.
Next-generation communication systems are being designed from the ground up to natively incorporate AI/ML, while existing systems are integrating AI/ML features across all elements. A comprehensive study [156] highlights ML adoption within cellular systems across all layers of the protocol stack. For instance, at the PHY layer, ML can support decoding, MIMO, channel estimation, and interference reduction. At the MAC/RLC layer, ML aids scheduling strategies, buffer status reporting, and retransmissions. At the RRC layer, ML contributes to user mobility through handover prediction and threshold settings for handover policy. AI/ML also play a crucial role in Self-Organizing Networks (SON) and in the Open RAN architecture through the RAN Intelligent Controller (RIC). The RIC is responsible for controlling and optimizing RAN functions [157], enabling wireless networks to adapt, reconfigure, and optimize functions and parameters based on specific network conditions and user demands [158], [159]. However, AI/ML often focuses on solving individual problems without optimizing the entire system. In response, Generative AI (GenAI) is emerging as a revolutionary candidate in many fields, including telecommunications. As a subset of AI, GenAI can generate new content, such as text, images, and videos, based on patterns learned from extensive datasets. Its most prominent expression is Large Language Models (LLMs), which have attracted the research community in the fields of text and image generation, question answering, sentiment analysis, conversational agents, human-machine interaction, automation, and many more [160]. Moreover, products such as Falcon LLM, the Generative Pre-trained Transformer behind ChatGPT, and Bidirectional Encoder Representations from Transformers (BERT) are already available on the market and widely adopted. More generally, Large GenAI Models (LGMs) are demonstrating improvements in wireless systems, better predicting the behavior of elements such as beamforming, handover routes, power allocation, and spectrum management. GenAI’s strength lies in considering the multi-modal features of typical telecommunication systems, including RF signal variation, the wireless environment, contextual situations, and timeliness awareness. However, current AI/ML has drawbacks. Firstly, ML algorithms require a substantial amount of data for training, initially resulting in time-consuming and sub-optimal outputs that may persist for an extended period. Secondly, training data may be representative of only certain situations, limiting the model’s generalizability to different areas without degradation (i.e., data dependency). Thirdly, ML models are usually complex and perceived as “black boxes,” posing challenges in interpreting their outputs and leading to a lack of transparency or interpretability. Lastly, continuous learning is necessary to adapt to dynamic situations, potentially introducing biases or inaccuracies in the models.
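As a toy example of network-level ML, the sketch below trains a logistic-regression classifier to predict whether a UE is about to hand over, using purely synthetic RSRP and speed measurements. The features, the label-generation rule, and the model choice are invented for illustration only and are not drawn from [156].

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 2000

# Synthetic features: serving-cell RSRP and strongest-neighbour RSRP (dBm), UE speed (km/h).
serving = rng.normal(-95, 8, n)
neighbour = rng.normal(-97, 8, n)
speed = rng.uniform(0, 120, n)

# Synthetic label: handover is more likely when the neighbour is stronger and the UE is fast.
handover = ((neighbour - serving) + 0.02 * speed + rng.normal(0, 2, n)) > 3

X = np.column_stack([serving, neighbour, speed])
model = LogisticRegression(max_iter=1000).fit(X[:1500], handover[:1500])
accuracy = model.score(X[1500:], handover[1500:])
print(f"held-out handover prediction accuracy: {accuracy:.2f}")
```

A real RIC-hosted model would be trained on measurement reports rather than synthetic data and would feed its predictions into mobility-management policies; the sketch only shows the shape of the learning problem.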
The next-generation wireless network, or 6G, will be based on the integration of native AI at every level and in every element of the network, leveraging the collective knowledge distributed across the network and the mobile terminals [161]. Several key aspects are emerging in this context. Firstly, the extensive data exchange among the agents performing actions in the network requires strategies that overcome bandwidth limitations and resource constraints. Semantic Communications can be employed to compress raw data using LGMs and to facilitate learning on a spatial basis for the information exchanged between agents [162]. Moreover, existing algorithms and protocols, such as those for scheduling and routing, lack flexibility, making them inadequately adaptable to future services like MR/VR and fast-moving IoT devices. To address this, novel and goal-oriented learning protocols should be implemented to enable effective collaboration and interaction between agents across various network elements, including devices, base stations, and NF/AF elements. These protocols need to comprehend and learn when, where, and what information should be exchanged to take appropriate actions, optimizing both the network resources and the control commands of vehicles, robots, and drones. The integration of distributed and autonomous agents is expected to lead to a fully autonomous and self-organized network, capable of perceiving the real environment. LGMs can contribute to making better decisions based on the sequences of actions required to accomplish a task, or by distributing sub-tasks to different devices or network elements according to a Distributed LGM Agent model [163].
F. Next Future Technologies
In addition to the enabling technologies discussed above, numerous research groups are investigating technologies that will become available in the coming years. Below are some of the most promising. One concerns quantum communications, where the qubit (quantum bit) is utilized as the elementary element for data transmission. Based on the physics concept of a quantum, qubits can take on a state of superposition, allowing them to represent multiple combinations of 1 and 0 simultaneously. The final outcome of a calculation is revealed only upon measurement of the qubits, i.e., when a measurement basis is selected, which immediately causes their quantum state to “collapse” to either 1 or 0 [164]. Quantum Key Distribution (QKD) protocols are employed in quantum communications for secure data transmission: the encrypted data are sent as classical bits over networks, while the keys needed to decrypt the information are encoded and transmitted in a quantum state using qubits. Various approaches, or protocols, have been developed for implementing QKD, such as BB84, BBM92, Six-State, E91, S09, KMB09, and others [165]. Note that the typical media for the quantum channel are optical fiber or free space, with the superposition in qubits induced using precision lasers or microwave beams, respectively.
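As an abstract illustration of how a QKD protocol distils a shared key, the sketch below simulates only the sifting step of BB84 over an ideal, noiseless channel with no eavesdropper. It manipulates classical random bits and does not model physical qubits or error estimation.

```python
import secrets

def random_bits(n):
    return [secrets.randbelow(2) for _ in range(n)]

n = 64
alice_bits = random_bits(n)
alice_bases = random_bits(n)   # 0 = rectilinear basis, 1 = diagonal basis
bob_bases = random_bits(n)

# Ideal channel, no eavesdropper: Bob recovers Alice's bit when the bases match,
# otherwise his measurement outcome is random.
bob_bits = [b if ab == bb else secrets.randbelow(2)
            for b, ab, bb in zip(alice_bits, alice_bases, bob_bases)]

# Sifting: both parties publicly compare bases and keep only the matching positions.
sifted_key = [a for a, ab, bb in zip(alice_bits, alice_bases, bob_bases) if ab == bb]
assert sifted_key == [b for b, ab, bb in zip(bob_bits, alice_bases, bob_bases) if ab == bb]
print(f"sifted key length: {len(sifted_key)} of {n} raw bits (~50% expected)")
```

In a real deployment, a fraction of the sifted key is then sacrificed to estimate the error rate and detect eavesdropping before privacy amplification produces the final key.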
The network of the telecommunication operator is rapidly evolving towards a distributed and disaggregated architecture, with an increasing number of elements and interfaces being added and more external players participating. In this context, blockchain, or any distributed ledger technology (DLT), can play a significant role in supporting the disaggregated ecosystem by managing untrusted environments and participants [166]. Two main use cases can be envisioned for blockchain in this scenario. The first relates to the Metaverse vision, where VR operates on a distributed infrastructure. For example, one provider may offer an asset (e.g., a piece of clothing), another provides the glasses, and both are connected through a platform provided by yet another provider; blockchain enables traceable interactions in this complex ecosystem. The second use case involves the interaction of smart things or individuals in a local environment, such as a road intersection. In this setting, various actors (e.g., vehicles, pedestrians) enter and exit the area, interacting with fixed elements like traffic lights and cameras and with other moving elements like vehicles and pedestrians. Blockchain can record the states and positions of each actor, facilitating the management of incidents and emergencies.
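A minimal sketch of the second use case follows: a hash-chained log of actor states at an intersection. It is not a full blockchain (there is no consensus, distribution, or signing), and the field names and values are invented; it only illustrates how chained hashes make recorded states tamper-evident.

```python
import hashlib, json, time

def add_block(chain, record):
    """Append a record linked to the previous block by its SHA-256 hash."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    block = {"prev_hash": prev_hash, "timestamp": time.time(), "record": record}
    block["hash"] = hashlib.sha256(
        json.dumps(block, sort_keys=True).encode()).hexdigest()
    chain.append(block)
    return block

chain = []
add_block(chain, {"actor": "vehicle-17", "position": [45.07, 7.69], "state": "approaching"})
add_block(chain, {"actor": "traffic-light-3", "state": "green"})
add_block(chain, {"actor": "pedestrian-a2", "state": "waiting"})

# Any later tampering with a stored record breaks the hash chain.
chain[0]["record"]["state"] = "stopped"
recomputed = hashlib.sha256(json.dumps(
    {k: chain[0][k] for k in ("prev_hash", "timestamp", "record")},
    sort_keys=True).encode()).hexdigest()
print("tamper detected:", recomputed != chain[0]["hash"])
```

A production DLT would add distributed validation and consensus among the participating operators and road-side units, which is what makes the record trustworthy across mutually untrusting parties.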
Finally, 5G is starting to utilize mmWave frequencies for wireless communication networks, and 6G is targeting even higher bands, so transistor speed is becoming a bottleneck [111], [167]. This trend poses an increasing challenge for the semiconductor industry, which must provide components able to deliver high output power at extremely high frequencies. The whole chain, from transistors and their associated materials to transceivers, antennas, and packaging, should be considered in the development of 6G chipsets, since noise, power, linearity, signal conversion, and the generation of clean, high-quality RF and clock references, among many other factors, limit RF transceivers. Research is advancing the following semiconductor technologies. Gallium Arsenide (GaAs) can reach extremely high frequencies up to the upper W-band (i.e., below 110 GHz), but its material cost is high. Gallium Nitride (GaN) is suitable for high-power, high-temperature applications, delivers decent noise performance, and can reach frequencies up to 500 GHz. Thanks to its very high electron mobility and saturation velocity, Indium Phosphide (InP) can achieve frequencies above 1 THz; however, its material cost is high and wafer/chip handling is difficult [168].
G. Further Concepts
Before concluding this section, it is crucial to emphasize the need for trustworthiness and sustainability in the forthcoming network of 2030. Trustworthiness encompasses two key aspects. On one side, the network must be secure, ensuring confidentiality, integrity, and availability, along with a guarantee of user privacy. On the other side, trustworthiness involves the network’s ability to provide accurate and genuine information and data. This goes beyond addressing issues such as fake news or misleading content; it is fundamental to deliver information to users in a fair, timely, and unbiased manner, eliminating discrimination based on political, economic, gender, religious, or ethical characteristics and avoiding biases in information provision and decision-making [169]. Sustainability has become a crucial concern in recent years. Technological sustainability focuses on using resources for products and services so as to minimize environmental impact, specifically by reducing carbon emissions and greenhouse gases. The future telecommunication system has two objectives. Firstly, it aims to bring benefits and improvements across various sectors, including transportation, smart cities, digitalization, healthcare, logistics, and manufacturing, to cite a few. The second concerns its own sustainability [170]. According to several reports [171], [172], [173], the ICT sector accounts for around 2% of global emissions, with a potential upward trend towards 15%. In response, 3GPP has proposed a 90% reduction in the energy consumption of NR compared to LTE. Future networks need to be designed with energy efficiency and low power consumption in mind; various implementations and enhancements are discussed in [174] and related references. In summary, it is fundamental that the 6G network be both sustainable and efficient.
Conclusion
The constant evolution of telecommunication systems enables the provision of new services that will enhance our lives in the near future. The paper defined three new classes of bearer services: immersive communications, everything connected, and high-positioning. To accommodate these new services, next-generation wireless networks need to achieve much more stringent performance requirements than those in 5G. Therefore, new KPIs and traffic models have been proposed for designing and evaluating future networks.
Techniques and advancements introduced in Release 18 and Release 19 have also been analyzed and investigated with the aim of increasing data rate, reliability, and coverage, reducing latencies, and improving network management. In addition to devices like smartphones and legacy sensors, drones, connected vehicles, and devices with specific functionalities, such as medical devices and support for train control with stringent latency requirements, will be connected to the future network. Fundamental for providing innovative services (e.g., gesture recognition, autonomous factories, and object tracking) is the ability to guarantee both the accuracy and the latency of positioning data for various devices. Classical positioning techniques (e.g., received signal strength, fingerprinting, and time-difference-of-arrival evaluation) will need to be integrated with communication techniques developed in 5G-Advanced and 6G, such as mmWave, pencil beams, and horizontal and vertical massive MIMO.
Finally, the paper analyzed the most promising technologies based on improvements in the physical layer, the radio interface, and bandwidth extension. These include the proliferation of access points and the introduction of innovative architectures to ensure global coverage and performance in adverse environments. New joint techniques were also described that integrate Sensing, Positioning, Computing, and Communications, along with the native and extensive application of AI and ML throughout the network to enhance both single functions and more complex functionalities that interact at various levels and elements within the network. To seize the new opportunities, various challenges must be addressed to overcome the limitations posed by enabling and emerging technologies. An important challenge will involve developing increasingly high-performance antennas with a high number of active elements and highly directional beams, compact enough for building installations and miniaturized for device integration, particularly in sensors. Another challenge will be designing bandpass filters with constant gain over extremely wide bands, extending up to frequencies on the order of terahertz. The processing capacity required in the various types of devices supporting new services, as well as in network elements like MEC and RIS, will also increase significantly, necessitating measures to balance energy consumption. Last but not least, the deployment of a large number of access points, practically everywhere, will be crucial to providing ubiquitous and multi-broadband connectivity.