Het-SDVN: SDN-Based Radio Resource Management of Heterogeneous V2X for Cooperative Perception

The transformation and innovation brought by vehicle-to-everything (V2X) communications are ongoing. Enhanced road safety, traffic efficiency, and vehicle automation are all promised on condition that the envisioned V2X applications can be realized. “Share what I can see” is one of the visions, known as cooperative perception (CP). CP enables sensor data sharing, while the types of data are diverse according to the levels of vehicle automation. To support sharing raw, pre-processed, and fully processed data, V2X is advancing. Emerging radio access technologies (RATs), like millimeter waves (mmWave), are being introduced, and V2X networks are becoming more heterogeneous than ever before. To address the lack of interoperability in existing V2X networks and the absence of a network function to manage heterogeneous V2X, this paper proposes Het-SDVN, a software defined networking (SDN)-based vehicular network (SDVN) architecture for heterogeneous V2X. Compared to its counterparts, Het-SDVN features hierarchical control planes (C-planes), which can reduce overhead and latency, provide scalability, and guarantee the availability of CP. This paper also presents a demonstration of Het-SDVN in a real-life proof of concept. By responding to diverse CP requests, the demonstration reveals the required V2X performance, the capability of local SDVN, and the importance of interference management. In addition, a benchmarking test of flow installation for local and global SDVN controllers is carried out. Finally, this paper shows the visualization results of CP.


I. INTRODUCTION
Vehicle-to-everything (V2X) communication is a revolutionary technology for transportation systems. One of its greatest contributions is the integration of automobiles, the most pervasive mobility platform in modern society, into data networks through vehicle-to-vehicle (V2V), vehicle-to-infrastructure (V2I), vehicle-to-pedestrian (V2P), and vehicle-to-network (V2N) communications. This integration enables intelligent transportation systems (ITS) [1], which foster a variety of disruptive applications that make the roads safer, the traffic more efficient, and the vehicles smarter. To enhance vehicle automation and resolve the concerns about its safety, cooperative perception (CP) has drawn extensive research attention in recent years.
The associate editor coordinating the review of this manuscript and approving it for publication was Hang Shen.
CP is a safety-critical application that allows vehicles to exchange sensor data with other vehicles or roadside units (RSU) via V2X. Without CP, vehicles, which merely rely on drivers or local sensors, are unreliable for safe driving due to the physical constraints such as perception range, field of view (FOV), and occluding objects [2]. The data for CP can be categorized by its processing levels. For instance, highly processed sensor data is encoded into V2X messages such as cooperative awareness messages (CAM) and collective perception messages (CPM). The CAM and CPM-based CP have been systematically demonstrated in [3] and [4].
Meanwhile, CP with raw sensor data is crucial for high-level automated driving. The detection performance can be improved with high data density. In addition, from a security perspective, the risks from untrusted CP contributors doing detection can be eliminated. However, enabling raw sensor data-based CP is quite challenging. Traditional V2X technologies cannot support the stringent requirements posed by raw data exchange (a data rate of over 1 Gbps and an end-to-end latency of less than 10 ms) [5]. The peak data rates of the first-generation V2X technologies, the Institute of Electrical and Electronics Engineers (IEEE) 802.11p and the long-term evolution V2X (LTE-V2X), are limited to 27 Mbps and 100 Mbps, respectively. The only solution is to introduce new radio access technologies (RATs) that work on higher frequency bands, such as millimeter waves (mmWave), to enhance V2X.
As V2X networks become more and more heterogeneous, bottlenecks should be resolved from two aspects: 1) Complexity of network management: Low-frequency V2X (mainly at 5.9 GHz) and high-frequency V2X (over 30 GHz) will coexist in the network. They feature different characteristics in propagation (path loss, penetration rate, etc.). The network management must be adaptive to CP requirements and selected V2X. For example, broadcasting that used to be applied in IEEE 802.11p is not effective for mmWave. Multi-hop transmission is required by mmWave V2X, so the routing topology needs to be determined efficiently. 2) Interoperability between diverse networks: Currently, vehicular ad-hoc networks (VANETs) led by the European Telecommunications Standards Institute (ETSI) and cellular V2X (C-V2X) led by the 3rd Generation Partnership Project (3GPP) are independent networks. They possess their own V2X technology stacks. IEEE 802.11p and its successor IEEE 802.11bd are used for VANETs, while LTE and new radio (NR) are adopted by C-V2X. The interaction and handover among these technologies have not been clearly specified.
Software defined networking (SDN) is an emerging network technology for 5G and beyond [6]. It shows tremendous advantages in network resource management and hence can be introduced to vehicular networks, namely software defined vehicular networks (SDVN). SDN centralizes the network by decoupling the control plane (C-plane) and data plane (D-plane) from distributed network devices. Through a unified C-plane, the SDN controller can track the real-time status of D-plane networks, create highly abstracted resource maps for communications, and make globally optimized decisions. Moreover, third-party applications can be programmed against the SDN controller, which frees developers from complicated network protocols and tedious hardware configurations. Regarding SDVN, most prior work is theoretical and simulation-based [7], [8], [9]. The authors in [10] made a solid step forward by prototyping SDVN in an emulator called Mininet-WiFi. They demonstrated the interoperability empowered by SDVN through a network handover scenario (Wi-Fi→LTE). To fill the gap of lacking practical implementations, the authors in [11] and [12] established real SDVN testbeds with commercial off-the-shelf (COTS) equipment (Raspberry Pi, Zodiac FX, etc.), and demonstrated routing and mobility management, respectively. However, only a single RAT (Wi-Fi) was considered in their studies.
In our previous work [13], we carried out a large-scale proof of concept for SDN-based mmWave V2X networks, in which both the 3GPP network (LTE) and IEEE network (IEEE 802.11ad) were involved. Their interoperability was achieved by serving different planes in SDVN (LTE as C-plane, IEEE 802.11ad as D-plane). To the best of the authors' knowledge, it was the first prototype of mmWave SDVN and the first demonstration of SDVN supporting CP. Nevertheless, it left some defects. One was at the D-plane. Only mmWave was exploited, so the management focused on multi-hops, which was inefficient for CP with processed sensor data that could be broadcast. The other was at C-plane. The whole network only relied on a single C-plane. The scalability and stability were questionable. It may fail to meet the required latency when the network gets huge and crowded.
Therefore, in this paper, we make novel and solid contributions, which lie in three aspects:
• A hierarchical SDVN architecture is designed. By separating the roles and scopes of local and global C-planes, the global SDVN controller will only focus on cross-network operations and extreme scenarios. The local SDVN controllers will handle the networking within sub-SDVN networks. In this way, control overhead can be spread across controllers, and latency can be kept low.
• Heterogeneous V2X is considered in this new hierarchical SDVN (Het-SDVN). After presenting a survey on the existing and emerging V2X technologies, the applicability of IEEE-based V2X, 3GPP-based V2X, and even satellite networks, to the Het-SDVN C/D-planes is comprehensively discussed.
• A full-fledged Het-SDVN testbed is set up. The testbed incorporates major ITS components, such as vehicles with onboard units (OBU), RSUs, and multi-access edge computing (MEC) servers. Based on the testbed, a proof of concept is carried out to demonstrate Het-SDVN's capability to provide CP within and across sub-SDVN networks. The system performances of C-planes and D-planes are comprehensively evaluated.
The remainder of this paper is organized as follows. Sect. II presents the state of the art of CP in order to help our readers better understand this safety-critical application and the motivation to innovate the current V2X networks. Sect. III provides a detailed introduction to our proposed Het-SDVN architecture, where the definitions of ''local'' and ''global'', the utilization of heterogeneous V2X, and the supports to CP are covered. Sect. IV reviews our efforts in setting up a real-life proof of concept for Het-SDVN. The experimental topology and orchestration mechanism are explained. The evaluation results from the field trials of CP are discussed. Finally, Sect. V gives concluding remarks of this paper.

II. STATE-OF-THE-ART OF COOPERATIVE PERCEPTION
Cooperative perception (CP), also referred to as ''collective perception'' by the European Telecommunications Standards Institute (ETSI) Technical Report (TR) 103 562 [14], and ''extended sensors'' by the 3rd Generation Partnership Project (3GPP) TR 22.886 [15], is one of the recent efforts to bring cooperation to transportation systems. Cooperation will enhance safety, efficiency, and sustainability. In the context of automated driving systems, the Society of Automotive Engineers (SAE) has extended the classic definitions of driving automation to cooperative driving automation, through the J3216 standard [16]. The capability to share ''where I am and what I can see,'' consistent with the CP concept, is emphasized across all levels of automation, ranging from Level 1 (driver assistance) to Level 5 (full driving automation).
To realize CP, current research interests center around three pivotal aspects: 1) How should sensor data be shared, considering the data type? 2) How can the received sensor data be effectively utilized through fusion techniques? 3) How does the V2X network provide support for CP? The following subsections will review the state of the art.

A. TYPES OF DATA SHARING
The authors in [17] categorized sensor data sharing into three types: (1) raw/low-level data sharing; (2) feature/middle-level data sharing; and (3) object/track-level data sharing. Considering the diversity of sensors and variations among manufacturers, the last type is easier to standardize due to its high-level abstraction. Cooperative awareness message (CAM) is the first-generation V2X message, standardized by ETSI [18], to share high-level sensor data, i.e., the vehicle position, velocity, and heading derived from the global navigation satellite system (GNSS) and inertial measurement unit (IMU). Now, ETSI is drafting a new specification for collective perception message (CPM) [19], which will include a list of objects perceived by the cameras or light detection and ranging (LiDAR) on vehicles or RSUs in its format.
Camera videos and LiDAR point cloud data are raw/low-level sensor data. However, extensively sharing ultra-high-definition (UHD) videos or three-dimensional (3D) point cloud data will quickly approach network bottlenecks due to the huge data volumes and limited bandwidth. As a compromise, feature/middle-level data sharing gains popularity. The definition of features varies in the way and depth of data processing. In [20], the authors proposed feature maps for CP which are generated from point cloud data by the convolutional layers in a neural network. Another work introduced spatial confidence maps as features which highlight object detection scores in each frame of point cloud data [21].
Feature/middle-level data sharing involves initial abstractions, resulting in a reduced data size compared to raw/low-level data sharing, while still preserving more details than object/track-level data sharing. However, this approach increases the complexity of CP as it necessitates consistency between the sender and receiver sides (e.g., the design of the encoder/decoder). Moreover, the definitions of features are too diverse to generalize for all CP use cases. Therefore, no single type of sensor data sharing dominates. They are selected based on specific circumstances, including processing capabilities, channel status, and application preferences.

B. FUSION TECHNIQUES
Fusion merges data from onboard sensors with data received from shared extended sensors. Fusion techniques determine the effectiveness of CP. With respect to the three types of data sharing in II.A, the fusion can also be categorized into (1) raw/low-level data fusion; (2) feature/middle-level data fusion; and (3) object/track-level data fusion.
Fusion in the first and second types needs to take accuracy requirements, bandwidth limitations and computational capabilities into account. To that end, the authors in [22] proposed two fusion paradigms known as early fusion and late fusion, respectively. For instance, in the case of shared raw LiDAR data, i.e., point cloud data frames, early fusion involves merging these frames as a whole and then feeding them into the detection model. On the other hand, late fusion entails performing detection on each frame individually and subsequently merging the final results (similar to the third type). Both paradigms have pros and cons. Early fusion has the potential to produce optimal results as it preserves all information prior to detection. However, it necessitates strong computational capabilities, requiring the introduction of an edge server, as highlighted by the authors in [23]. In contrast, late fusion mitigates the bandwidth requirements but compromises accuracy. Given the diversity in shared data quality and sensor positions, late fusion may lead to detection failures or duplicate detections, thereby impacting vehicle operation and driving safety.
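The difference between the two paradigms can be illustrated with a minimal Python sketch. It is not the method of [22]: `toy_detect` is a hypothetical stand-in for a real detector (it simply reports grid cells that contain enough points), the frames are assumed to be already expressed in a common coordinate frame, and the thresholds are arbitrary illustration values.

```python
from collections import defaultdict
from math import floor, hypot

def toy_detect(points, cell=2.0, min_points=3):
    """Hypothetical stand-in for a detector: return the centroid of every
    cell x cell metre grid square that contains at least min_points points."""
    cells = defaultdict(list)
    for x, y in points:
        cells[(floor(x / cell), floor(y / cell))].append((x, y))
    detections = []
    for pts in cells.values():
        if len(pts) >= min_points:
            cx = sum(p[0] for p in pts) / len(pts)
            cy = sum(p[1] for p in pts) / len(pts)
            detections.append((cx, cy))
    return detections

def early_fusion(frames):
    """Merge all point frames first, then run detection once on the merged cloud."""
    merged = [p for frame in frames for p in frame]
    return toy_detect(merged)

def late_fusion(frames, merge_radius=1.0):
    """Detect on each frame separately, then merge the results, dropping any
    detection within merge_radius of one already kept (de-duplication)."""
    kept = []
    for frame in frames:
        for cx, cy in toy_detect(frame):
            if all(hypot(cx - kx, cy - ky) > merge_radius for kx, ky in kept):
                kept.append((cx, cy))
    return kept

# An object that each sensor sees only partially (two points per frame).
frame_rsu = [(10.2, 5.1), (10.4, 5.3)]
frame_vehicle = [(10.6, 5.2), (10.8, 5.4)]
early = early_fusion([frame_rsu, frame_vehicle])
late = late_fusion([frame_rsu, frame_vehicle])
print(len(early), len(late))
```

In this toy case early fusion finds the object because the merged cloud is dense enough, while late fusion misses it because neither frame alone crosses the detection threshold. With `merge_radius` set too small, late fusion would instead report the same object twice when both sensors detect it.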
For all types of fusion, perspective transformation is the fundamental step to utilizing CP data [24]. Positional information from CAM, CPM, or other localization sensors will be useful to derive perspective (coordinate) relationships. For example, when an object is detected at a position in the RSU perspective, that position is mapped into the vehicle perspective by $T_{rv}(t)$, the transformation matrix from the RSU perspective to the vehicle perspective. $T_{rv}(t)$ changes dynamically with the vehicle movement.
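As an illustration of this step, the following Python sketch builds an RSU-to-vehicle transformation from two poses. It is a standard planar rigid-body formulation and not necessarily the exact expression used in [24]: it assumes that the RSU and vehicle poses (x, y, yaw) are known in a common world frame, for example from GNSS or CAM data, and all function names are hypothetical.

```python
from math import cos, sin, pi

def pose_to_matrix(x, y, yaw):
    """3x3 homogeneous matrix mapping points from a local frame located at
    (x, y) with heading yaw (radians) into the common world frame."""
    c, s = cos(yaw), sin(yaw)
    return [[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]]

def invert_rigid(T):
    """Invert a planar rigid transform [R t; 0 1] as [R^T, -R^T t; 0 1]."""
    (a, b, tx), (c, d, ty), _ = T
    return [[a, c, -(a * tx + c * ty)],
            [b, d, -(b * tx + d * ty)],
            [0.0, 0.0, 1.0]]

def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rsu_to_vehicle(rsu_pose, vehicle_pose):
    """Assumed form of T_rv(t): world-to-vehicle composed with RSU-to-world.
    It must be recomputed whenever the vehicle pose changes."""
    world_to_vehicle = invert_rigid(pose_to_matrix(*vehicle_pose))
    return matmul(world_to_vehicle, pose_to_matrix(*rsu_pose))

def transform_point(T, point):
    """Apply a homogeneous transform to a 2D point."""
    x, y = point
    return (T[0][0] * x + T[0][1] * y + T[0][2],
            T[1][0] * x + T[1][1] * y + T[1][2])

# RSU at the world origin; vehicle at (10, 0) heading along the world +y axis.
T_rv = rsu_to_vehicle((0.0, 0.0, 0.0), (10.0, 0.0, pi / 2))
px, py = transform_point(T_rv, (10.0, 5.0))  # object seen by the RSU
print(round(px, 6), round(py, 6))
```

Here an object that the RSU sees at (10, 5) lies 5 m straight ahead of the vehicle, so it maps to approximately (5, 0) in the vehicle frame. A 3D version would use 4x4 matrices with full rotation matrices in the same way.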

C. V2X NETWORKS
The two widely recognized vehicular networks are the vehicular ad-hoc network (VANET) and cellular V2X (C-V2X). Both of them can support CP. The VANET led by ETSI is built on a fully distributed architecture. In VANETs, ego vehicles and RSUs share their sensor data in CAM or CPM formats mainly by broadcasting. The radio access technology (RAT) serving VANETs is IEEE 802.11p. IEEE 802.11p provides a peak data rate of 27 Mbps, which is sufficient to transmit object/track-level sensor data in CAM or CPM because of the small payload (61.25 bytes per object [25]). However, the performance of VANETs could severely deteriorate when sharing raw/low-level sensor data. The drastically increased packet numbers and payload size (1248 bytes [26]) can easily cause channel congestion, leading to unpredictable delay, since the MAC layer of IEEE 802.11p adopts carrier-sense multiple access with collision avoidance (CSMA/CA) [27]. In this regard, the long-term evolution V2X (LTE-V2X) can handle the dilemma of VANETs as it introduces central nodes (i.e., base stations) to allocate channel resources [28]. Nevertheless, LTE-V2X, with its limited data rate performance of 100 Mbps, also struggles to meet the requirements of sharing raw/low-level sensor data, as summarized in Table 1. Stricter requirements can be estimated from challenging scenarios, such as overtaking scenarios [29] and intersection scenarios [30]. The required data rate for sharing raw data on a V2V link can reach multiple gigabits per second. For this reason, mmWave is introduced to enable raw/low-level sensor data sharing. Its capability to deliver ultra-high data rate and ultra-low latency transmission has been comprehensively demonstrated [31]. Eventually, in the new radio V2X (NR-V2X), a mmWave band is involved, namely FR2 [32].
In conclusion, V2X networks are expected to effectively support various types of sensor data sharing. However, the interoperability between IEEE-based RATs and 3GPP-based RATs has not been sufficiently discussed in research or standards. Besides, the current C-V2X architecture lacks a network function to efficiently utilize heterogeneous V2X for CP. This gap sparked our interest in introducing SDN to enhance the vehicular network management.
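The gap between object-level and raw-data sharing can be made concrete with a rough offered-load calculation. The sketch below uses only the payload sizes and peak rates quoted above; the number of objects, message rate, number of senders, and packet rate are hypothetical inputs, and protocol headers and MAC overhead are ignored.

```python
BYTES_PER_OBJECT = 61.25   # object payload quoted in the text [25]
RAW_PACKET_BYTES = 1248    # raw sensor packet payload quoted in the text [26]
PEAK_RATE_MBPS = {"IEEE 802.11p": 27, "LTE-V2X": 100}  # peak rates quoted in the text

def object_load_mbps(objects_per_message, messages_per_second, senders):
    """Offered load of object-level sharing in Mbps (payload only)."""
    bits = BYTES_PER_OBJECT * 8 * objects_per_message * messages_per_second * senders
    return bits / 1e6

def raw_load_mbps(packets_per_second, senders):
    """Offered load of raw-data sharing in Mbps (payload only)."""
    return RAW_PACKET_BYTES * 8 * packets_per_second * senders / 1e6

def fits(load_mbps, rat):
    """True if the offered load stays below the RAT's peak data rate."""
    return load_mbps < PEAK_RATE_MBPS[rat]

# Hypothetical scenario: 30 senders, each reporting 20 objects at 10 messages/s.
objects = object_load_mbps(20, 10, 30)
# Hypothetical raw stream: one sender emitting 10,000 packets/s.
raw = raw_load_mbps(10_000, 1)
print(objects, fits(objects, "IEEE 802.11p"))
print(raw, fits(raw, "IEEE 802.11p"), fits(raw, "LTE-V2X"))
```

Under these assumed inputs, object-level sharing needs about 2.94 Mbps in total, well within the IEEE 802.11p peak rate, whereas a single raw stream of about 99.84 Mbps already exceeds it and leaves almost no margin on LTE-V2X.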

III. HET-SDVN: ARCHITECTURE AND COMPONENTS
This section provides an overview of the Het-SDVN architecture and elaborates on the separation of C/D-planes, the roles of heterogeneous V2X, and the benefits for cooperative perception (CP).

A. DEMYSTIFYING ARCHITECTURE
The primary goal of the proposed Het-SDVN architecture is to effectively fulfill the stringent and diverse QoS requirements of V2X applications, specifically those pertaining to CP. To that end, it is crucial to have global optimization on network decisions and minimize control overhead. Extensive studies have exposed that a fully centralized control architecture exhibits remarkable efficiency in adapting to highly dynamic vehicular networks [33]. However, this architecture encounters significant performance degradation when the service range of the SDVN controller expands substantially. In order to retain the benefits of centralization while simultaneously mitigating excessive overhead, a two-layer hierarchical SDVN C-plane is introduced and illustrated in Fig. 1.
1) Local SDVN C-plane: This C-plane manages one sub-SDVN network, where the central node is an RSU. The RSU hosts C-plane functions and orchestrates communication resources for D-plane applications. The local SDVN controller processes the requests sent by the onboard units (OBU) of connected vehicles and dynamically controls V2X networks (typically, V2V and V2I) to transfer the contents.
2) Global SDVN C-plane: The global C-plane manages geographically decentralized sub-SDVN networks. This C-plane takes advantage of MEC or Cloud resources and wide-area communication systems such as cellular networks and satellite networks. The global SDVN controller holds an extended network view by synthesizing the context information (e.g., topology, resource distribution, and traffic information) from the underlying local SDVN controllers. If vehicles send requests that exceed the capabilities of local SDVN controllers, or vehicles move to infrastructure-less areas, control of the network is handed over to the global SDVN controller.
3) RSU and OBU D-plane: The Het-SDVN D-plane consists of stationary RSUs and mobile OBUs. There have been a variety of wireless interfaces that support data transfer in this D-plane. IEEE-based interface technologies (IEEE 802.11p, 802.11n/ac, 802.11ad/ay, etc.) and their 3GPP-based counterparts (LTE and NR sidelinks) are two significant representatives. Application data, for example, raw or processed sensor data for CP, is routed over a relay network of RSUs and OBUs, which is configured by SDVN controllers using southbound interfaces like OpenFlow. The vehicle density, channel conditions, QoS requirements, etc., may affect the selection of heterogeneous V2X interfaces at each hop.
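The division of labor between the two C-plane layers can be sketched as a simple dispatch rule. The classes below are hypothetical and greatly simplified: they are not the paper's implementation, and they only model the decision of which controller serves a request, based on whether the requested region belongs to the local sub-SDVN.

```python
from dataclasses import dataclass, field

@dataclass
class CpRequest:
    vehicle_id: str
    roi: str          # identifier of the sub-SDVN whose sensing data is wanted
    data_format: str  # e.g. "objects" or "raw"

@dataclass
class GlobalController:
    """Hypothetical global SDVN controller: serves cross-network requests."""
    handled: list = field(default_factory=list)

    def handle(self, request):
        self.handled.append(request)
        return f"global:{request.roi}"

@dataclass
class LocalController:
    """Hypothetical local SDVN controller hosted on one RSU."""
    sub_sdvn: str
    parent: GlobalController

    def handle(self, request):
        # Requests inside the local sub-SDVN are served locally, which keeps
        # control traffic away from the global controller.
        if request.roi == self.sub_sdvn:
            return f"local:{self.sub_sdvn}"
        # Anything beyond the local scope is escalated to the global controller.
        return self.parent.handle(request)

global_ctrl = GlobalController()
rsu1_ctrl = LocalController("RSU1", global_ctrl)
print(rsu1_ctrl.handle(CpRequest("Vehicle1", "RSU1", "raw")))
print(rsu1_ctrl.handle(CpRequest("Vehicle1", "RSU2", "objects")))
```

A vehicle in an infrastructure-less area would, in this simplified model, call `global_ctrl.handle` directly because no local controller is reachable.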

B. UTILIZATION OF HETEROGENEOUS V2X
The proposed Het-SDVN architecture aims to accommodate heterogeneous V2X, for which existing RATs are designated to serve different network planes (C-plane or D-plane). The SDVN controllers should be able to select the optimal V2X technology and communication method based on the environment and specific V2X application requirements. Table 2 summarizes the legacy and emerging V2X technologies.
76258 VOLUME 11, 2023 Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.

1) LOW-FREQUENCY V2X TECHNOLOGIES
Low-frequency V2X mostly operates in the sub-6 GHz band. IEEE 802.11p is a representative of the first-generation V2X technologies and is the basis of DSRC in America, ITS-G5 in Europe, and ITS-Connect in Japan. IEEE 802.11p operates at 5.9 GHz and provides a data rate of up to 27 Mbps to support basic V2X applications as defined in ETSI technical report (TR) 102 638 [34]. The 3GPP then launched formal discussions on enhancing cellular networks for V2X, and LTE-V2X was finalized in 2016. Further enhancements to the LTE-V2X sidelink (PC5) for direct communications, and to the legacy air interface (Uu), were studied under Rel-15. In addition, NR-V2X has been developed since Rel-16, which was frozen in 2020, while the discussions on its enhancements continue in Rel-17 and Rel-18. NR-V2X targets two frequency bands (FR1: 410 MHz to 7.125 GHz, FR2: 24.25 GHz to 52.6 GHz), but the primary focus of the first-generation NR-V2X design is on FR1 to enable low-latency, reliable sidelink communications at 5.9 GHz. Cellular technology-based V2X (C-V2X) outperforms IEEE 802.11p in terms of data rates, latency, and reliability. However, IEEE 802.11p will not become obsolete, owing to its successful mass-market applications over the last decades and its low cost. The coexistence and interoperability of the two should be one of the major focuses of a newly designed V2X architecture.
Other than the IEEE 802.11p and C-V2X, Wi-Fi based on the IEEE 802.11 family of standards, which works in the unlicensed 2.4/5/6 GHz bands, can be potentially used for V2X due to its low cost and ease of deployment. Wi-Fi has become ubiquitous in urban areas and its performance is improving drastically as the generation evolves. In Wi-Fi 6 (802.11ax), a data rate of over 1 Gbps is supported under the single-carrier and single-stream settings. However, one of the well-known drawbacks of Wi-Fi is the catastrophic interference which heavily degrades its practical performance.
Since the aforementioned technologies all operate in low frequency bands, they experience relatively small path loss and penetration loss and thus achieve good coverage. In this Het-SDVN, they can serve as C-planes in different layers. For example, Wi-Fi, DSRC, PC5-based LTE, and NR (FR1) provide local C-planes, while Uu-based LTE provides global C-planes as it can reach several kilometers. Moreover, low-frequency V2X can contribute to C-planes and D-planes simultaneously (in-band mode) or separately (out-band mode), because it fully meets the QoS requirements of basic V2X applications (including detected object-based and feature-based CP).

2) HIGH-FREQUENCY V2X TECHNOLOGIES
V2X operating in higher-frequency bands now attracts great interest due to the vast unexploited spectrum resources. The mmWave, terahertz (THz), and visible light communication (VLC) are promising technologies to be used for V2X. NR-V2X has to fulfill the peak data rate of 20 Gbps as required by IMT-2020, thus introducing the mmWave band (FR2). In a field trial of remote driving [35], the NR-V2X system deployed at FR2 demonstrates its ability to provide broadband, reliable and low-latency communications. At the same time, the applicability of IEEE 802.11ad to V2X is being evaluated by many projects [31]. IEEE 802.11ad, also known as WiGig, is an earlier, commercialized mmWave technology working in the unlicensed 60 GHz band. It enables the peak data rate of 6.75 Gbps with 2.16 GHz bandwidth.
Although THz and VLC were not included in the design of NR-V2X, they have been identified as key 6G-V2X technologies [36]. Of course, their utilization still faces serious challenges, such as the propagation measurement and channel modeling for THz communications, and the interference management for VLC. An indoor THz system measurement validates a 104 Gbps wireless link performance [37], which shows THz's potential to enable immersive V2X applications with AR/VR/XR. In [38], an automotive VLC system using light-emitting diode (LED) transmitters and camera receivers achieves a system performance of 45 Mbps without bit errors. These preliminary evaluations pave the way for their future adaptation to more challenging V2X scenarios.
Owing to abundant bandwidth allocation, high-frequency V2X enables ultra-high data rate and ultra-low latency communications, so that raw data-based CP is supported. However, the coverage is sacrificed as the frequency increases. In addition to severe path loss, antenna directivity and weakened penetration capability make broadcast-based communication infeasible. Hence, in this Het-SDVN, high-frequency V2X is only suitable for D-plane transmission. The SDVN controllers should manage their multi-hop communications and select appropriate D-plane nodes as relays.
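The resulting role assignment can be summarized as a small selection rule. The sketch below is illustrative only: the catalog holds three RATs with the peak rates quoted in this paper, the plane rule follows the text (high-frequency V2X serves the D-plane only and needs multi-hop management), and peak rate is used as an optimistic upper bound rather than achievable throughput. The names `Rat`, `CATALOG`, and the helper functions are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rat:
    name: str
    peak_mbps: float       # peak data rate quoted in the text
    high_frequency: bool   # True for mmWave-class technologies

CATALOG = [
    Rat("IEEE 802.11p", 27, False),
    Rat("LTE-V2X", 100, False),
    Rat("IEEE 802.11ad", 6750, True),
]

def c_plane_candidates():
    """Only low-frequency RATs have the coverage needed for a C-plane."""
    return [r for r in CATALOG if not r.high_frequency]

def d_plane_candidates(required_mbps):
    """RATs whose peak rate can, at best, carry the requested data rate."""
    return [r for r in CATALOG if r.peak_mbps >= required_mbps]

def needs_multihop_management(rat):
    """High-frequency links are directional and short-range, so the controller
    has to pick relays instead of relying on broadcast."""
    return rat.high_frequency

raw_cp = d_plane_candidates(1000)   # raw-data CP: over 1 Gbps
print([r.name for r in raw_cp])
print([r.name for r in c_plane_candidates()])
```

With a 1 Gbps requirement, only the mmWave entry remains as a D-plane candidate, which is why the controller must also manage multi-hop relaying in that case.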

3) OTHER V2X TECHNOLOGIES
Other V2X technologies target both low- and high-frequency communications. In March 2018, a new task group, TGbd, was established by IEEE for next generation V2X (NGV). The task is to develop IEEE 802.11bd as the successor to IEEE 802.11p. IEEE 802.11bd will work in the 5.9 GHz band and optionally in the mmWave band from 57 GHz to 71 GHz. In addition to having interoperability, coexistence, backward compatibility, and fairness with IEEE 802.11p, IEEE 802.11bd will inherit advanced PHY and MAC techniques implemented in the latest Wi-Fi standards (IEEE 802.11n/ac/ax) to enhance the 5.9 GHz performance, targeting at least two times higher throughput and communication range. As for the mmWave band, it is proposed to reuse portions of the PHY and MAC layers from IEEE 802.11ad/ay [39]. However, the detailed specifications are still in progress. Considering the coverage, for this Het-SDVN, 5.9 GHz IEEE 802.11bd can play roles in the local C-plane and D-plane, while its mmWave version should serve the D-plane.
In February 2021, SpaceX, the US company, launched its commercial low Earth orbit (LEO) satellite communication service, known as Starlink. This system comprises over 4000 small satellites in LEO and operates within the Ku (12-18 GHz) and Ka (27-40 GHz) frequency bands, as authorized by the Federal Communications Commission (FCC). According to the disclosed specification [40], it offers impressive mobile user performance, with up to 250 Mbps downlink and 30 Mbps uplink speeds, comparable to LTE systems. Notably, Starlink ensures ubiquitous coverage, enabling internet access even in rural and infrastructure-less areas. This unique capability makes it an exceptional choice, besides LTE-V2X, for the global C-plane and D-plane of Het-SDVN.

C. SUPPORTS TO COOPERATIVE PERCEPTION
The proposed Het-SDVN resolves the connectivity and scalability issues of existing SDVN architectures. With reliable C-plane associations, local/global SDVN controllers master real-time context information in D-plane and respond quickly to application requests from vehicles.
Regarding CP, this Het-SDVN provides support from two aspects. Firstly, the local SDVN provides CP for road/driving safety. This type of CP requires the exchange of sensing data focused on the immediate regions. Typical scenarios include safe overtaking and safe ramp merging, where an RSU with local SDVN controller manages the transmission of raw or processed sensing data over V2I and V2V links, allowing for the notification or assistance in detecting the presence of vehicles that pose a risk of collision. Secondly, the global SDVN provides CP for traffic efficiency. Along the planned route that an automated vehicle will take, congestion and accident information perceived by sensors in other sub-SDVNs are useful for route re-planning. The global SDVN controller manages the distribution of the above information over backhauls according to requests. There is a special case of CP where the global SDVN guarantees safety. It is on roads with harsh driving conditions (lighting, surface quality, etc.) and without infrastructure support (usually in rural or mountainous areas). Vehicles can get the latest sensing information from passing vehicles for safety by sending requests to the global SDVN controller, which establishes V2N2V connections using wide-area communication systems.

IV. PROOF OF CONCEPT: AN OUTDOOR EXPERIMENT
This section presents a practical proof-of-concept demonstration for the proposed Het-SDVN architecture. Firstly, we explain the experimental topology that reflects the essence of Het-SDVN, and then review the proof-of-concept setup. Finally, the experimental results are discussed.

A. EXPERIMENTAL TOPOLOGY
To simplify the experimental topology without losing hierarchical mechanisms, we consider a scenario in which a vehicle benefits from cooperative perception (CP) in the current local sub-SDVN network and also requests CP from another sub-SDVN network, as shown in Fig. 2. In this scenario, Vehicle1 runs CP for safety. The nearest CP server at RSU1 handles the CP request containing an accurate vehicle position. Sensing data from RSU1 is then delivered to Vehicle1 in its demanded format (detected objects, raw data) through a V2X network organized by the local SDVN controller. For the purpose of efficiency, Vehicle1 also needs perceptual information from RSU2, which lies outside the area served by RSU1. Hence, this type of CP request goes to the CP server at MEC or Cloud. The global SDVN controller is responsible for sensing data routing across these two sub-SDVN networks.
In this topology, D-plane consists of two parts: the local SDVN D-plane within RSU1 coverage and the global D-plane connecting the two sub-SDVN networks. The former D-plane includes three nodes, two vehicles and one RSU, as well as multiple V2V/V2I links to forward sensing data. The latter D-plane is mainly wired networks, where sensing data needs to pass through large amounts of backhaul switches and Ethernet links in between.
As for C-plane, the described CP relies on the local SDVN C-plane of RSU1 and the global SDVN C-plane to schedule network resources. These two C-planes require different coverage. The former C-plane only needs to cover a target region, like an intersection with high accident probability where RSU1 can be placed for safety, while the latter C-plane must be ubiquitous to mitigate the risk of local SDVN controller failures and provide CP even in RSU-less areas.

B. PROOF-OF-CONCEPT SETUP
The smart mobility field for the proof of concept is located at Tokyo Institute of Technology, Ōokayama campus [41]. This field includes smart vehicles and RSUs for automated driving and V2X-related tests. The proof-of-concept environment is shown in Fig. 3. Hardware equipment for sensing, communication, and computation is installed on the vehicles and RSUs used in the experiment. Table 3 lists the details of the equipment. Here, LiDARs are used to monitor dynamic 3D objects. Therefore, the dense point clouds become major data sources of CP. The 32-laser LiDARs and 80-laser LiDARs are deployed on vehicles and RSUs, respectively.
For D-plane implementation, Wi-Fi (IEEE 802.11a/n/ac) at 5 GHz and WiGig (IEEE 802.11ad) at 60 GHz are two options for the local SDVN D-plane. To support the WiGig, directional antennas are installed on RSUs and the rooftops of vehicles so that mmWave V2I and V2V are enabled. The global SDVN D-plane directly uses campus Ethernet, but a backup could be 4G LTE. Pocket LTE devices are put on vehicles to provide access to the global SDVN C-plane. As for the local SDVN C-plane, Wi-Fi (IEEE 802.11b/g/n) at 2.4 GHz is adopted. These Wi-Fi APs are set up on RSUs.
In addition, computation devices with customized capabilities are supplied in order to perform CP and the functions of SDVN controllers on vehicles, RSUs, and the MEC server.

C. NETWORK ORCHESTRATION
The design of network orchestration in this proof of concept showcases the hierarchy and heterogeneity principles of the proposed Het-SDVN architecture. Figure 4 shows the application, networking, and data sequences when Vehicle1 runs CP and requests various data formats.
In Fig. 4(a) and (b), the CP client on Vehicle1 broadcasts CP requests, indicating a region of interest (ROI) and format preference, which are received by the nearest local CP server on RSU1. As shown in Fig. 4(a), when Vehicle1 specifies the format as detected objects and the ROI as the current sub-SDVN, the local CP server on RSU1 sends a request to the local SDVN controller and performs LiDAR detection. The local SDVN controller on RSU1 creates OpenFlow tables and installs them on the Open vSwitches of Vehicle1 and RSU1, respectively. The OpenFlow table for RSU1 is used to broadcast the detected objects of RSU1 via UDP over 5 GHz Wi-Fi. When the ROI changes to the RSU2 sub-SDVN, the local CP server on RSU1 notifies the global CP server on MEC and requests the local SDVN controller to enable broadcast-based networking. The global CP server forwards the Vehicle1 request to RSU2 and simultaneously sends a request to the global SDVN controller for routing the detected-object data over the backhaul networks.
The sequence is almost the same when raw LiDAR data becomes the preferred format, as shown in Fig. 4(b). The major difference lies in the local SDVN. The large size of LiDAR point cloud data makes broadcasting impractical. Therefore, the local SDVN controller on RSU1 installs new flow tables on Vehicle1, Vehicle2, and RSU1. The OpenFlow table for Vehicle2 is used to relay the raw data to Vehicle1 through a V2I link and a V2V link of 60 GHz WiGig.
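The difference between the two sequences of Fig. 4 can be summarized as a small decision rule in the local SDVN controller. The following Python sketch is illustrative only: the names `CPRequest`, `FlowRule`, and `plan_local_flows`, as well as the port labels, are hypothetical and do not reproduce the Ryu application used in the proof of concept.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CPRequest:
    """CP request sent by the CP client (field names are assumptions)."""
    requester: str    # e.g. "Vehicle1"
    roi: str          # sub-SDVN covering the region of interest
    data_format: str  # "objects" (detected objects) or "raw" (point cloud)


@dataclass(frozen=True)
class FlowRule:
    """Simplified stand-in for one OpenFlow entry on a node's Open vSwitch."""
    node: str
    out_port: str     # hypothetical label, not a real OVS port number
    mode: str         # "broadcast" or "unicast"


def plan_local_flows(req: CPRequest, source: str = "RSU1",
                     relay: str = "Vehicle2") -> List[FlowRule]:
    """Return the flow rules the local controller would install."""
    if req.data_format == "objects":
        # Small payload: broadcast via UDP over 5 GHz Wi-Fi.
        return [
            FlowRule(source, "wifi5", "broadcast"),
            FlowRule(req.requester, "local", "broadcast"),
        ]
    if req.data_format == "raw":
        # Large payload: relay over 60 GHz WiGig (V2I link, then V2V link).
        return [
            FlowRule(source, "wigig_v2i", "unicast"),
            FlowRule(relay, "wigig_v2v", "unicast"),
            FlowRule(req.requester, "local", "unicast"),
        ]
    raise ValueError(f"unknown data format: {req.data_format!r}")


if __name__ == "__main__":
    for rule in plan_local_flows(CPRequest("Vehicle1", "RSU1", "raw")):
        print(rule)
```

In this sketch the data format alone selects the transmission mode; a real controller would also take link state and ROI into account.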
In this proof of concept, both the CP client and servers are implemented as ROS2 nodes [42], leveraging the detection algorithm (CenterPoint) in autoware.universe [43], a ROS2-based open-source framework for automated driving.
In addition, Ryu, an SDN controller framework written in Python, is used to realize the local and global SDVN functions [44]. Ryu provides OpenFlow components and well-defined APIs. For the development of the various interfaces (i.e., the CP client-server interface, the CP server-to-SDVN controller interface, and the SDVN local-to-global interface), Python sockets and Python Flask are utilized. Flask enables the creation of HTTP-based RESTful APIs, which are widely used by web applications and as northbound interfaces of SDN controllers.

D. RESULTS AND DISCUSSIONS
Proof-of-concept results are discussed from three aspects: the required data rate for two different types of CP, the performance of the deployed local SDVN network, and the overall performance, including the backhaul performance (global D-plane), controller reaction delay, and the visualization of CP.

1) REQUIRED DATA RATE
Table 4 shows the practical output data rate of the RSU1/RSU2 LiDAR and the data rates of detected objects on RSU1 and RSU2, respectively. The former value depends on the LiDAR specification. The RoboSense 80-laser LiDAR Ruby-Lite is used by RSU1 and RSU2. The measured average data rate is 46.5 Mbps, about 3 times that of the RoboSense 32-laser LiDAR used by Vehicle1. This is expected because RSU LiDARs are supposed to monitor the road more precisely to ensure safety. The latter values are affected by the number of surrounding objects, the number of detected objects, the detection algorithm, the message format, etc. In this measurement, object detection is done by the ROS2 node lidar_centerpoint of autoware.universe. It publishes detected objects in the ROS2 message format DetectedObjects, which are transferred via UDP. Over 10 minutes, the average detection data rates of RSU1 and RSU2 are 2.84 Mbps and 1.39 Mbps, respectively.
Comparing the measured data rates shows that sharing raw LiDAR data for CP poses greater challenges to network performance, especially throughput (about 20 times higher than that of the detected objects, even with only one LiDAR). It is therefore crucial to consider different radio access technologies and transmission modes of V2X in the local SDVN network.
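The "about 20 times" figure can be checked against the averages quoted from Table 4. The short Python calculation below is only a sanity check on those published numbers; the variable names are chosen for illustration.

```python
# Average data rates quoted from Table 4 (Mbps).
RAW_RSU_MBPS = 46.5     # RSU1/RSU2 80-laser LiDAR output
OBJ_RSU1_MBPS = 2.84    # detected objects on RSU1
OBJ_RSU2_MBPS = 1.39    # detected objects on RSU2

# Ratio of raw data rate to detected-object data rate, per RSU.
ratio_rsu1 = RAW_RSU_MBPS / OBJ_RSU1_MBPS
ratio_rsu2 = RAW_RSU_MBPS / OBJ_RSU2_MBPS

# Ratio against the mean detected-object rate of the two RSUs.
mean_obj = (OBJ_RSU1_MBPS + OBJ_RSU2_MBPS) / 2
ratio_mean = RAW_RSU_MBPS / mean_obj

if __name__ == "__main__":
    print(f"RSU1: {ratio_rsu1:.1f}x, RSU2: {ratio_rsu2:.1f}x, "
          f"mean: {ratio_mean:.1f}x")
```

The per-RSU ratios are roughly 16 and 33, and the ratio against the mean detected-object rate is roughly 22, which is consistent with the order of magnitude stated above.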

2) LOCAL SDVN PERFORMANCE
Vehicle1 receives CP data within the local SDVN of RSU1. Therefore, key performance metrics such as the C-plane coverage and D-plane throughput should be evaluated. Figure 5 shows the received signal strength indicator (RSSI) maps of the RSU1 Wi-Fi used as the local C-plane and D-plane. As shown in Fig. 5(a), RSU1 is located at a T-shaped intersection (red circle). Since the RSSI values are recorded only where Wi-Fi association is available, the map demonstrates that the 2.4 GHz Wi-Fi used as the local C-plane on RSU1 covers all three main roads leading to the intersection. The same holds for the 5 GHz Wi-Fi D-plane, as shown in Fig. 5(b). Four red triangles are marked along the experimental course (from bottom to top: A→B→C→D), indicating the positions where the Wi-Fi D-plane throughput is recorded. Figure 6 plots the average Wi-Fi D-plane throughput. The error bars represent standard deviations, and the dashed lines indicate the required data rates of the four processes depicted in Fig. 4. The practical throughput of outdoor 5 GHz Wi-Fi degrades as the distance from RSU1 increases. At point B, this D-plane already fails to support the broadcasting of RSU1 LiDAR data. Even at the closest point D, where the throughput improves, it remains far below the requirement for transmitting both RSU1 and RSU2 LiDAR data. This demonstrates that the 5 GHz Wi-Fi D-plane is only capable of detected-object-based CP. Heterogeneous V2X, especially high-frequency V2X such as mmWave, is necessary to enable raw-data-based CP.
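The comparison in Fig. 6 amounts to checking a measured throughput against the required data rate of each CP process. The Python sketch below assumes that the four processes of Fig. 4 correspond to detected objects or raw data from RSU1 alone or from RSU1 and RSU2 together, and derives the requirements from the Table 4 averages. The process labels and the example throughput are hypothetical, since the per-point values of Fig. 6 are not reproduced in the text.

```python
# Average data rates quoted from Table 4 (Mbps).
RAW_RSU_MBPS = 46.5
OBJ_RSU1_MBPS = 2.84
OBJ_RSU2_MBPS = 1.39

# Assumed mapping of the four processes in Fig. 4 to required rates (Mbps).
REQUIRED_MBPS = {
    "objects_rsu1": OBJ_RSU1_MBPS,
    "objects_rsu1_rsu2": OBJ_RSU1_MBPS + OBJ_RSU2_MBPS,
    "raw_rsu1": RAW_RSU_MBPS,
    "raw_rsu1_rsu2": 2 * RAW_RSU_MBPS,  # both RSUs use the same LiDAR model
}


def supported_processes(throughput_mbps: float) -> list:
    """List the CP processes whose required rate fits in the throughput."""
    return [name for name, need in REQUIRED_MBPS.items()
            if throughput_mbps >= need]


if __name__ == "__main__":
    # 30 Mbps is a made-up measurement used only to exercise the function.
    print(supported_processes(30.0))
```

With a throughput between the detected-object and raw-data requirements, only the two detected-object processes are returned, which mirrors the conclusion drawn for the 5 GHz Wi-Fi D-plane.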
In contrast, 60 GHz WiGig, the other option for the local SDVN D-plane, delivers far higher throughput. The peak throughput observed on a single WiGig V2I/V2V link reaches 2.20 Gbps. Due to the directivity of WiGig antennas, this performance is sustained only in line-of-sight communications. In addition, when multi-hop mmWave V2X occurs on a straight lane or road, as in the topology of this proof of concept (RSU1→Vehicle2→Vehicle1), interference is non-negligible. This impact has been studied in [45], where the authors proposed a ZigZag configuration of mmWave antennas to mitigate inter-vehicle interference. In this work, channel management is implemented as a function of the local SDVN controller to mitigate interference. In Fig. 3, Ch2 (60.48 GHz) and Ch3 (62.64 GHz) are allocated to the mmWave V2I and V2V links, respectively. Table 5 compares the throughput and packet delivery ratio (PDR) of the mmWave D-plane with and without such channel control. With channel control, the average D-plane throughput increases from 602.1 Mbps to 1.84 Gbps, and the PDR improves from 66.33% to 99.99%.
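The channel control described above can be pictured as assigning different 60 GHz channels to consecutive hops of the relay path. The Python sketch below is a simplified illustration, not the controller logic of the proof of concept: `assign_channels` is a hypothetical helper that alternates between the allowed channels hop by hop, and `channel_center_ghz` uses the IEEE 802.11ad channelization (2.16 GHz spacing, channels 1 to 4).

```python
def channel_center_ghz(channel: int) -> float:
    """Center frequency of an IEEE 802.11ad channel (1 to 4) in GHz."""
    if channel not in (1, 2, 3, 4):
        raise ValueError("this sketch only covers 802.11ad channels 1 to 4")
    return round(56.16 + 2.16 * channel, 2)


def assign_channels(hops, channels=(2, 3)):
    """Give consecutive hops different channels by cycling through the list.

    With two channels, only directly adjacent hops are guaranteed to differ.
    """
    return {hop: channels[i % len(channels)] for i, hop in enumerate(hops)}


if __name__ == "__main__":
    path = [("RSU1", "Vehicle2"), ("Vehicle2", "Vehicle1")]  # V2I, then V2V
    for hop, ch in assign_channels(path).items():
        print(hop, f"Ch{ch}", channel_center_ghz(ch), "GHz")
```

For the two-hop path of this proof of concept, the sketch yields Ch2 for the V2I link and Ch3 for the V2V link, matching the allocation in Fig. 3. The reported throughput gain from 602.1 Mbps to 1.84 Gbps corresponds to a factor of about 3.06.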

3) OVERALL PERFORMANCE
This proof of concept aims to effectively integrate the local and global SDVN networks. To examine network operations, the data rates of the mmWave and backhaul interfaces on Vehicle1, Vehicle2, and RSU1 are recorded using Wireshark after time synchronization. Figure 7 shows the data rate variation when the SDVN controllers are requested to orchestrate the network for CP following the sequence of Fig. 4(b). The mmWave V2I and V2V links start to carry raw LiDAR data (from RSU1) after the local SDVN controller responds. When Vehicle1 demands raw LiDAR data from RSU2, the global SDVN controller successfully arranges the backhaul network, which can be observed from the significant growth of the backhaul data rate. Meanwhile, the data rates on the mmWave V2I and V2V links double, which implies that the RSU2 LiDAR data are properly transmitted over the local SDVN network.
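A data rate trace such as the one in Fig. 7 can be derived from a packet capture by summing frame sizes in fixed time bins. The Python sketch below assumes the capture has already been exported as (timestamp in seconds, frame length in bytes) pairs; the function name `data_rate_mbps` and the sample packets are hypothetical and are not taken from the measurements.

```python
from collections import defaultdict


def data_rate_mbps(packets, bin_s: float = 1.0) -> dict:
    """Aggregate (timestamp_s, length_bytes) pairs into Mbps per time bin.

    Returns a dict mapping the start time of each non-empty bin to its rate.
    """
    bits_per_bin = defaultdict(int)
    for timestamp_s, length_bytes in packets:
        bits_per_bin[int(timestamp_s // bin_s)] += length_bytes * 8
    return {index * bin_s: bits / bin_s / 1e6
            for index, bits in sorted(bits_per_bin.items())}


if __name__ == "__main__":
    # Made-up packets: two in the first second, one in the next second.
    sample = [(0.1, 125_000), (0.9, 125_000), (1.5, 250_000)]
    print(data_rate_mbps(sample))
```

Comparing such traces across interfaces requires a common clock, which is why the captures are synchronized before analysis.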
The local and global C-planes use heterogeneous networks (2.4 GHz Wi-Fi and Ethernet), so it is necessary to evaluate their HTTP latency and flow installation latency separately. Table 6 presents the test results. The HTTP latency over the Wi-Fi C-plane is slightly larger and less stable than that over Ethernet, probably due to outdoor wireless interference. However, if the global CP server were deployed in the Cloud instead of on the MEC server used in this proof of concept, the HTTP latency could range from tens to hundreds of milliseconds. Therefore, safety-critical requests need to be processed first by the local CP server. The flow installation latency is measured with CBench [46], a benchmarking tool for SDN controllers. CBench emulates one OpenFlow switch (connected to the Ryu controllers) and calculates the number of flow modifications per second. The number of test iterations is set to 20, and all other parameters are left at their default values for both controllers. The local and global SDVN controllers show similar flow installation efficiency regardless of network type. Since a network flow can be configured within 3 ms, the controllers are likely to support advanced V2X applications that require an end-to-end latency of less than 10 ms.
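Because CBench reports flow modifications per second, a per-flow latency can be estimated as the reciprocal of that rate, assuming the emulated switch has only one request outstanding at a time. The Python sketch below illustrates this conversion and the remaining share of a 10 ms end-to-end budget. The function names and the sample rates are hypothetical; the values of Table 6 are not reproduced here.

```python
from statistics import mean


def mean_flow_latency_ms(flow_mods_per_s) -> float:
    """Mean per-flow latency (ms) over iterations, as 1000 / rate each.

    Assumes one outstanding request at a time, so latency = 1 / rate.
    """
    return mean(1000.0 / rate for rate in flow_mods_per_s)


def remaining_budget_ms(flow_latency_ms: float,
                        end_to_end_ms: float = 10.0) -> float:
    """Latency budget left for transmission and processing after flow setup."""
    return end_to_end_ms - flow_latency_ms


if __name__ == "__main__":
    rates = [400.0, 500.0]  # made-up per-iteration CBench results
    latency = mean_flow_latency_ms(rates)
    print(latency, remaining_budget_ms(latency))
```

Under this reading, a flow installation latency within 3 ms leaves at least 7 ms of a 10 ms budget for the remaining steps of the request.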
On the Vehicle1 PC, the received CP data are visualized using the ROS2 tool Rviz, as shown in Fig. 8. Figure 8(a) shows the detected objects from RSU1 and RSU2 as colored bounding boxes with motion vectors, representing their categories, sizes, and mobility. These attributes are extracted from raw LiDAR data using the detection method CenterPoint, which significantly reduces the data size so that legacy V2X is still capable of broadcasting detected objects. Although sharing raw data is resource-consuming, demanding high-frequency V2X and additional bandwidth allocation, the LiDAR point cloud data from RSU1 and RSU2, as shown in Fig. 8(b), are imperative for high-level automated driving in terms of reliability and liability. Regarding reliability, these raw data complement the point cloud density of the Vehicle1 LiDAR and fill in the blind spots along the driving course, which can increase maneuvering agility and the tolerable time for safe maneuvering [29], [30]. As for liability, automated vehicles bear a higher share of liability for incidents as the level of autonomy increases, because they gradually take over the driving responsibility from human drivers [47]. RSUs, however, bear very limited liability when they do not control vehicles directly. Therefore, Vehicle1 is obliged to assess the trustworthiness of the detection results from RSU1 and RSU2 if their authentication information (e.g., detection capability) is incomplete. With raw data at hand, Vehicle1 can perform detection by itself, as shown in Fig. 8(c), so that safety-critical decisions (braking, deceleration) can be double-checked. The visualization results demonstrate the capability of the proposed Het-SDVN to manage heterogeneous V2X resources and enable diverse CP effectively.

V. CONCLUSION
The emergence of advanced V2X applications drives the advancement of V2X communications and innovation in traditional vehicular networks. In order to effectively support various types of cooperative perception (CP) and ensure that this safety-critical V2X application is deliverable to vehicles anywhere and anytime, this paper proposes Het-SDVN, an SDN-based V2X network architecture consisting of hierarchical C-planes, geographically distributed but logically centralized sub-SDVNs, and heterogeneous V2X. The network roles are clearly defined. For CP in local SDVNs, communications are orchestrated by RSUs with local SDVN controllers; for CP in other SDVNs or regions with little infrastructure, communications across sub-SDVNs are orchestrated by global SDVN controllers at MEC/Cloud servers. Het-SDVN is set up at Tokyo Institute of Technology, and its support for CP has been demonstrated via a proof of concept. Evaluation results validate the feasibility of Wi-Fi as the local C-plane and stress the importance of mmWave and the necessity of interference management when sharing raw sensor data. Moreover, no significant difference in flow installation performance is observed between the local and global SDVN controllers over different networks. Both stay under 3 ms, showing the potential to meet the latency requirement (less than 10 ms) of advanced V2X applications [15].
In our future work, the network mechanisms of Het-SDVN for supporting other advanced V2X services will be considered. Regarding 3GPP networks, only LTE is prepared in this proof of concept, to provide access to the global C-plane when RSUs are not visible; we plan to integrate real 5G NR systems with both mmWave and sub-6 GHz bands into our testbed to enhance the D-plane, which will enable many new tests. In addition, a large-scale simulation is necessary to examine the scalability of Het-SDVN with the deployment cost taken into account.