Zero-Knowledge Proof of Traffic: A Deterministic and Privacy-Preserving Cross Verification Mechanism for Cooperative Perception Data

Cooperative perception is crucial for connected automated vehicles in intelligent transportation systems (ITSs); however, ensuring the authenticity of perception data remains a challenge, as vehicles cannot independently verify events that they do not witness. Various approaches to establishing the authenticity of data have been studied, such as trust-based statistical methods and plausibility-based methods. However, these methods are limited as they require prior knowledge, such as previous sender behaviors or predefined rules, to evaluate the authenticity. To overcome this limitation, this study proposes a novel approach called zero-knowledge Proof of Traffic (zk-PoT), which generates cryptographic proofs of traffic observations. Multiple independent proofs regarding the same vehicle can be deterministically cross-verified by any receiver without relying on ground truth, probabilistic, or plausibility evaluations. Additionally, no private information is compromised during the entire procedure. A full on-board unit software stack that reflects the behavior of zk-PoT is implemented within a specifically designed simulator called Flowsim. A comprehensive experimental analysis is then conducted using synthesized city-scale simulations, which demonstrates that zk-PoT's cross-verification ratio ranges between 80 % and 96 %, and 80 % of the verification is achieved in 2 s, with a protocol overhead of approximately 25 %. Furthermore, the analyses of various attacks indicate that most of the attacks can be prevented, and some, such as collusion attacks, can be mitigated. The proposed approach can be incorporated into existing works, including the European Telecommunications Standards Institute (ETSI) and the International Organization for Standardization (ISO) ITS standards, without disrupting backward compatibility.


I. INTRODUCTION
ROAD transportation has been one of the most essential services for human mobility since ancient times. It has undergone minimal changes despite the passage of time, with the driver of the vehicle still being responsible for determining its operation based on the surrounding environment. Modern vehicles are fitted with various sensors, such as LiDAR [1], millimeter wave [2], and stereo cameras [3], which provide data to support driving assistance features and enable autonomous driving capabilities. However, the effectiveness of these sensors is limited by their inherent locality, i.e., sensing ranges [4]. As sensors are attached to the vehicle, they present a view that is comparable to that available to the driver. Consequently, they cannot detect objects that are beyond the sensing ranges or obstructed by obstacles.
Cooperative perception is a groundbreaking technology to break such limitations. It can expand our field of vision and reduce blind spots by leveraging the sensors of other vehicles. The standardization of cooperative perception in the International Organization for Standardization (ISO) [5] and the European Telecommunications Standards Institute (ETSI) [6] demonstrates its significance in the field. The Intelligent

First, the local perception and V2X data elements extraction modules run their own processing to output a list of objects. Then, the augmented perception is composed of two important functionalities:
• A Collective Perception Application, which is in charge of generating CPM based on the list of objects obtained from the local perception.
• A Cooperative Fusion system, which is in charge of managing the list of objects provided by the local perception and V2X communication. This module maintains a list of internal traces for every object and matches the input data with the current traces. It is in charge of verifying the consistency of the data and of storing the fused information in the LDM.
As shown in Figure 5, the augmented perception system has external interfaces with the V2X messages and the local dynamic map. It also integrates the local perception that can rely on multiple sensors and on a local fusion module. To handle the internal information within the augmented perception system, a common format is introduced to represent the list of objects. This list of objects is exchanged using internal messages between local perception, the V2X data elements extraction, collective perception application and cooperative fusion.

Common data format for augmented perception
As mentioned in the previous section, a common data format is used for information exchange between the modules of the augmented perception system, in particular the description of a list of objects provided by the local perception and description of V2X elements function to the cooperative fusion. In addition, such data format shall be as generic as possible and the data elements shall include the fields required for the CP service to facilitate the generation of CPM. The description of the complete data format is given in Figure 6. As presented in Figure 6, it has to describe at the same time the information source, i.e. the module from which the data are issued (here, the local perception or the V2X data elements extraction), and the objects that have been detected. First, a common header is used to give general information such as the version and a timestamp associated with the data. Then, the source description is divided into two parts:
• A source identification, which characterizes the source type (Lidar, Camera, V2X Communication,…) and provides physical parameters on the source (physical location, aperture angles, detection,…) that may be used for further interpretation of the list of objects.
• A region of interest, which is the source's coverage area (sensor coverage area or V2X transmitter's coverage area). The region of interest can be considered as a circular region for a 360° LIDAR or for the transmitter

The CPMs received from other vehicles can influence the decisions made by the vehicles. Thus, ensuring the authenticity of these messages is crucial for road safety, as the CPMs are generated by vehicles rather than a centralized and trusted authority [8], [9]. Unlike a centralized authority, individual vehicles may be incentivized to deceive other vehicles to maximize their own profits [10]. As an example, a malicious vehicle could simulate congestion on the road ahead by sending synthesized CPMs, prompting other vehicles to take an alternative route and thereby clearing the way for itself.
Numerous solutions have been suggested to detect and eliminate such fraudulent activities. One such solution is the Public Key Infrastructure (PKI), which has been standardized in the VANET [11]. This standard mandates that every message transmitted over the VANET must be signed using a secret key provided by law enforcement agencies (LEAs). Consequently, every node in the network is forced to obtain the keys from LEAs before sending any message.
Nevertheless, vehicles can still send fraudulent messages on purpose and sign them with their valid LEA-issued keys. Most existing security solutions for the Internet are impractical owing to VANET's highly decentralized and mobile nature. Various trust management methods have been proposed to overcome this problem [12], [13], [14], [15].
In this work, we propose a zero-knowledge proof (ZKP) based deterministic traffic cross-verification method called zero-knowledge Proof of Traffic (zk-PoT). Zk-PoT enables vehicles to prove the existence of vehicles they observed independently, and then enables remote parties to deterministically cross-verify the observations without any knowledge of the ground truth. Owing to the zero-knowledge property, the public cannot gain any extra information in the whole process; thus, the privacy of the observed vehicles is preserved.
The proposed mechanism can enhance the security, efficiency, and verification latency while preserving the privacy of the target vehicles. Integrating zk-PoT into existing cooperative perception standards such as ISO and ETSI requires minimal changes to the architecture, while still maintaining backward compatibility. Additionally, it can be used as a bootstrap method alongside existing trust management methods.
The basic concept of this work was proposed in our previously published paper [16]. In this paper, we refine the original concept and propose a more concrete design and a comprehensive quantitative analysis. Furthermore, a detailed proof-of-concept implementation is performed for more realistic evaluations. As the ETSI ITS CPS standard, which is considered the basis of this work, has been updated in the recently published version of ETSI TS 103 324 [6], zk-PoT is further adapted to the latest version. Additional results regarding efficiency, security, and communication overhead are analyzed based on the new implementation.
The rest of this paper is organized as follows. Section II reviews the related works and highlights the challenge of balancing location privacy and trust management. We also discuss the cryptographic tools required for the proposed solution, including the Elliptic Curve Digital Signature Algorithm (ECDSA) and zero-knowledge proofs. Section III defines the problem, outlines reasonable assumptions, and presents the approaches to address the problem. Section IV describes how the problem can be transformed into a cryptographic problem and then solved using zero-knowledge proofs, and how the proposed solution can be applied by extending the existing CPS standard. Section V presents quantitative analyses performed using the previously proposed simulator, Flowsim [17]. Section VI analyzes the robustness of the proposed method against common threats, including various attacks and privacy leakage. Lastly, Section VII summarizes the contributions of the proposed method and presents future research directions.

II. RELATED WORK
A. MISBEHAVIOR DETECTION
Misbehavior detection (MBD) is the process of identifying and mitigating malicious or inappropriate behavior within VANETs. This involves identifying vehicles or nodes that engage in unauthorized or harmful activities, such as sending false data or participating in attacks. MBD methods can be categorized into four types by two orthogonal criteria: node-centric vs. data-centric, and autonomous vs. collaborative [18]. Node-centric methods perform evaluations based on the behavior of individual vehicles, whereas data-centric methods analyze the content and characteristics of the transmitted data. Autonomous (local) MBD methods involve independent node detection without depending on other nodes, whereas collaborative MBD methods employ neighbor nodes to verify the data and identify misinformation. Zacharias et al. [19] proposed an autonomous node-based MBD system based on the local traffic density. This approach utilized multiple independent sensors to measure the traffic density and combined the evidence using the Dempster rule of combination to detect misbehavior, particularly focusing on illusion attacks. Al-Ali et al. [20] proposed a blockchain-based collaborative approach to validate traffic events and authenticate vehicles in VANET. It utilizes reputation scores, proof of authority (PoA) and proof of event (PoE) consensus algorithms, as well as mutual authentication between vehicles and roadside units (RSUs), to improve the event validation accuracy and detect internal attackers.
Numerous data-centric methods have also been proposed over the years. Ghosh et al. [21] proposed an autonomous MBD scheme for the post crash notification (PCN) application. It involves identifying the root cause of misbehaviors by constructing a cause tree and using logical reduction. This scheme achieves adequate detection rates and exhibits robustness to small errors in the parameter estimation. Lo et al. [22] introduced a security threat in VANETs called the illusion attack, where an adversary broadcasts false traffic warning messages based on the current road conditions, thereby creating an illusion for nearby vehicles. This illusion can manipulate the drivers' behaviors, leading to car accidents, traffic jams, and decreased VANET performance. The authors proposed an autonomous data-centric security model called the plausibility validation network (PVN) to address this issue by cross-verifying the plausibility of incoming message fields. The problem they aimed to solve is very similar to ours: we also rely on cross-verification, so the requirements are closely aligned. Ercan et al. [23] proposed a machine learning-based Intrusion Detection System (IDS) to detect position falsification attacks in VANETs. The IDS utilizes three new features corresponding to the sender's position, along with the k-nearest neighbor (kNN) and random forest (RF) classification algorithms. The results demonstrate that the proposed mechanism outperforms the existing approaches in terms of classification performance and computation time. Kristianto et al. [24] proposed a semi-supervised federated learning MBD system for V2X communications. This model addresses the challenges of limited labeled data and bandwidth consumption by leveraging semi-supervised learning and federated learning approaches. The experimental results show that the model achieves high performance, outperforming centralized supervised learning methods in terms of F1-score and recall while reducing bandwidth utilization.
Despite these proposals, the MBD problem still cannot be solved perfectly in all situations. Data-centric methods require a significant amount of data, resulting in higher channel occupation, packet loss, and increased latency. Furthermore, feasibility evaluations are the only options available in cases where the ground truth data are missing or difficult to obtain. In entity-centric methods, data correctness is still a major problem owing to the presence of attackers and sensor limitations.

B. THE DILEMMA OF LOCATION PRIVACY AND TRUST MANAGEMENT
Connected autonomous vehicles (CAVs) are designed to share their position and velocity information with other CAVs on the road, unlike conventional vehicles that share either limited or no information. However, preserving location privacy while ensuring accurate trust evaluation is challenging for most trust management methods, as they are required to track the vehicles' historical behaviors.
Efforts have been made to resolve this dilemma. Some introduce a pseudonymous authentication scheme that can be used to protect the vehicles' location privacy [25]; however, this increases the difficulty of tracking specific vehicles for trust management. Recently, some methods that employ modern cryptographic tools, such as self-blindable signatures and zero-knowledge proofs, have also been proposed for specific applications [26].

C. DIGITAL SIGNATURE AND ECDSA
A digital signature is a mathematical scheme that uses asymmetric cryptography to verify the authenticity of digital messages or documents. The Digital Signature Algorithm (DSA) is a digital signature standard based on the discrete logarithm problem. ECDSA [27] is a variant of the DSA that employs elliptic curve cryptography; it is widely used in VANET and V2X applications, providing authentication, integrity protection, and privacy enhancements [28], [29], [30]. ECDSA also plays an essential role in the existing ETSI ITS standards [11], [31]. Many experiments and analyses regarding ECDSA's performance and impact on V2X communication have been carried out [32], [33], [34], [35], [36], providing a comprehensive understanding of ECDSA's performance in the context of ITS and VANETs.

D. ZERO-KNOWLEDGE PROOFS
In 1989, Goldwasser et al. proposed the zero-knowledge proof (ZKP) system [37]. This system allows one party, called the prover, to convince another party, called the verifier, that a specified statement is true, without revealing any additional information to the verifier, except for the truth of the statement. While ZKP systems are widely used in cryptocurrency protocols such as Zerocoin [38] and Zerocash [39], they are still relatively new in other fields of study.
Nowadays, research based on zero-knowledge proofs is emerging in the ITS field.
McEntyre et al. [40] present a privacy-preserving electronic toll collection (ETC) protocol for V2X communications, utilizing a ZKP challenge set to ensure security while addressing the limitations of embedded technology. The protocol achieves toll verification without disclosing sensitive subscriber information, such as GPS location, by generating unique randomized challenges localized to toll areas, thereby reducing the likelihood of false tolling.
Chaudhry et al. [41] introduce a secure communication framework for vehicle-to-healthcare everything (V2HX) communications in fog computing environments, utilizing a combination of ZKP and statistical fingerprinting (SF) protocols. The proposed framework enables vehicle authentication through ZKP and ensures secure communication between VANETs and healthcare enterprises through SF.
Rasheed et al. [42] present an Adaptive Group-based Zero Knowledge Proof Authentication Protocol (AGZKP-AP) for VANETs, addressing privacy concerns in authentication. The protocol enables anonymous authentication, distributed privilege control, and customizable privacy settings for users while minimizing the disclosure of authentication parameters, enhancing user privacy in VANETs.
Li et al. [43] introduce an aggregated zero-knowledge proof and blockchain-based authentication system for privacy-preserving identity verification in autonomous truck platooning, addressing security and privacy concerns. The proposed approach enhances security, provides fast performance, and ensures data integrity while allowing truck companies to define access control policies. Experimental results on the Hyperledger platform demonstrate the system's feasibility for real-world truck platooning applications.

III. PROBLEM STATEMENT, ASSUMPTIONS AND APPROACHES
A. PROBLEM STATEMENT
The cooperative perception ability provided by CPMs is significant; however, the potential risks associated with such messages must be considered. Blindly trusting CPMs could lead to suboptimal or even risky vehicle decisions, which can considerably compromise road safety.
Some studies [12] are based on trust estimation and management methods that depend on past statistics, making it nearly impossible to detect one-shot misinformation. Other studies rely on assessing the "plausibility" of data, which also necessitates the use of past statistics. Furthermore, these models cannot deterministically verify data as they are statistical models.
Therefore, we designed a cryptography-based mechanism that allows vehicles to prove their traffic observations, enabling other vehicles to cross-verify and deterministically trust the observations without knowing the ground truth.
Our study is primarily focused on proving the existence of observed vehicles, as "existence" is the most fundamental and accurate event that occurs on the road and remains unchanged despite the high speed of vehicles. This is in contrast to a vehicle's location, which can change quickly, leading to inaccuracies. Although fundamental, proving existence can already block many data-fabricating attacks, such as the widely discussed phantom vehicle attack.

B. ASSUMPTIONS
In this study, we make several assumptions. First, we assume that all vehicles have a front camera system and can recognize other vehicles' number plates using computer vision. We also assume that every vehicle on the road joins the VANET, enabling all vehicles to perceive and share information with other vehicles. Consequently, we do not differentiate between conventional vehicles and vehicles with cameras, or between connected and non-connected vehicles, in our subsequent discussions.
Additionally, we assume that a public key infrastructure (PKI) run by LEAs is deployed to all the infrastructures and vehicles. Furthermore, vehicle certificates with key pairs are distributed to all the vehicles by LEAs; each vehicle signs every message it sends and uses the public key to verify the integrity and authority of the messages it receives. We also assume that all the vehicle certificates are provided by a pseudonym system with perfect unlinkability. This implies that after a vehicle changes its pseudonym, i.e., its pseudonymous certificate, the new pseudonym cannot be linked with the old one.
Based on these assumptions, we define "A heard B" as A receiving a message that contains B's pseudonym from V2X communication. Conversely, we define "A saw B" as A identifying B's number plate, extracted from the video feed of its front camera.

C. APPROACHES
We can design a naïve "proof system" that lets vehicles broadcast the number plates of the vehicles they observe. However, such a "proof system" is not secure, as plate numbers can be easily synthesized, which makes it subject to data forging, replay attacks, and other threats. Moreover, the location privacy of the observed vehicles is compromised, since the number plates are in plaintext and can be used by malicious parties to track vehicles.
To address these issues, we propose a zero-knowledge proof (ZKP) based system that enables vehicles to prove that they have observed another vehicle. Based on the definitions of "seeing" and "hearing," we consider that a vehicle that can link the number plate and pseudonym must have closely observed the target vehicle, which presents strong evidence of the existence of the target. Additionally, our ZKP-based system ensures that both the pseudonyms and number plates remain undisclosed in the proofs, thus ensuring location privacy.
In our system, it is difficult, if not impossible, to falsify the existence of a random vehicle. However, because the proofs disclose neither the pseudonym nor the number plate, a receiver cannot check a single proof in isolation. To overcome this limitation, we propose a novel approach that employs a cross-verification scheme. Our approach involves multiple vehicles generating individual proofs for the same target vehicle. Thus, any third party can verify the existence of the target vehicle by comparing multiple proofs without requiring any information regarding the specific vehicle.

IV. ZERO-KNOWLEDGE PROOF OF TRAFFIC
In this section, we present our proposed solution, Zero-Knowledge Proof of Traffic (zk-PoT), to the problem of verifying whether two vehicles have observed the same target vehicle while preserving their location privacy. We convert the problem into a cryptographic protocol called zero-knowledge proof of shared secret (zk-PoSS). We then provide a solution to this model and apply it to the existing ETSI standard by extending the packet structure and station behaviors. We use ETSI standards as an example; our approach can also be applied to other cooperative perception systems, such as ISO standards.

FIGURE 2: Procedure of a Zero-Knowledge Proof of Traffic. [Panels: two observers each "hear" and "see" the target vehicle and broadcast a proof; a receiver holding both proofs knows nothing about the target vehicle but can tell that both proofs concern the same vehicle.]

A. PROBLEM CONVERSION
Fig. 2 depicts a scenario where vehicles A and B observe a vehicle named Target and acquire its pseudonym, ID, and number plate, Plate. These observations essentially yield a shared secret, which is derived from the target vehicle's ID and Plate. To prove that they indeed know the shared secret without disclosing the plaintext ID and Plate, we propose a cryptographic protocol called zk-PoSS. This model involves creating two distinct proofs, P and P′, using a one-way proof function, F_proof, a shared secret, SS, and two cryptographic salts, m and m′. Subsequently, another function, F_verify, is used to determine whether the pairs (P, m) and (P′, m′) correspond to the same shared secret, SS.
By doing so, the problem of proving a common observation of a vehicle can be reduced to the cryptographic model of proving knowledge of a shared secret.
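The model described above can be written compactly as follows. This is only a sketch in our own shorthand: the symbol SS′ for the second prover's secret is an assumption introduced here for illustration, and the equivalence is meant to hold up to a negligible failure probability.

```latex
\begin{aligned}
P  &= F_{\mathrm{proof}}(SS,\, m), \\
P' &= F_{\mathrm{proof}}(SS',\, m'), \\
F_{\mathrm{verify}}\bigl((P, m),\,(P', m')\bigr) &= \mathrm{true}
  \;\Longleftrightarrow\; SS = SS'.
\end{aligned}
```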

B. ZERO-KNOWLEDGE PROOF OF SHARED SECRET
After converting the practical problem of zk-PoT into the theoretical problem zk-PoSS, it can be solved using conventional ECDSA.
The conceptual overview of zk-PoSS is shown in Fig. 3. By concatenating the ID and plate of a specific observed vehicle, the provers essentially possess the same shared secret with sufficient entropy. Since the private key of ECDSA is just a plain integer, they can interpret the hash of such a shared secret as a private key. Therefore, two valid but different signatures become proof that the provers possess the same private key. In the proposed scheme, holding the private key implies knowledge of the shared secret.
Utilizing this feature of digital signatures, two parties can create different proofs regarding the shared secret by signing different random messages as salts. Subsequently, a third party that receives both proofs can recover the public keys from the signatures and verify whether the two public keys are identical. If these public keys are identical, it can deduce that the private keys are identical, and that the private key is, in essence, a hash of the shared secret. Therefore, by contradiction, the only feasible option is that both provers indeed share the same secret.
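The public key recovery step relies on a standard property of ECDSA, which can be sketched as follows. The symbols z (the hash of the signed message as an integer), k (the per-signature nonce), and R (the nonce point) are standard ECDSA notation introduced here for illustration; they do not appear in the text above.

```latex
\begin{aligned}
R &= kG, \qquad r = x_R \bmod n, \qquad s = k^{-1}\,(z + r\cdot sk) \bmod n \\
\Rightarrow\quad sR &= (z + r\cdot sk)\,G = zG + r\,PK \\
\Rightarrow\quad PK &= r^{-1}\,(sR - zG).
\end{aligned}
```

Since PK = sk · G is determined by sk alone, two signatures over different messages recover the same PK exactly when they were produced with the same sk.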
The exact protocol is defined in Protocol zk-PoSS.

Protocol: zk-PoSS

Preconditions: Common Agreement
Prior to initiating the protocol, there is a set of common parameters that all parties must agree upon. This ensures that all parties are working with the same fundamental assumptions, enabling coherent and meaningful communication.
1) All parties agree on the same set of ECDSA parameters: {CURVE, G, n}. These parameters define the elliptic curve, the base point, and the order of the base point, respectively.
2) All parties adopt the same cryptographic hash function, denoted by H. This function is capable of producing a fixed-length output that matches the binary length of n. This uniformity ensures consistency across all hashed outputs.

Prover: Proof Construction
The prover's role is to generate a proof of knowledge of a secret, without revealing the secret itself. The following steps outline this process.
1) The prover computes the hash of the secret SS using the agreed cryptographic hash function: sk = H(SS). Here, the hashed secret, sk, can serve as a private key in the ECDSA context.
2) There may be rare cases where sk ≥ n, since in ECDSA n is a prime that is slightly less than a power of two. To address these outliers, we perform an iterative process of recalculating the hash by repeating SS, until sk < n.
3) The prover then computes PK = G × sk on the curve CURVE. The generated pair (sk, PK) can be interpreted as the private key and public key of the ECDSA algorithm, respectively.
4) Using the private key sk, the prover signs a random message m with ECDSA, producing the signature Sig. This signature attests to the ownership of sk.
5) The prover publishes the message m and its corresponding signature Sig, making them available for any verifier to check.
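Steps 1 and 2 of the prover can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: SHA-256 as H, the secp256k1 order as n, and the reading of "repeating SS" as appending another copy of SS to the hash input are all assumptions made here. The sketch also rejects sk = 0, which ECDSA does not allow as a private key.

```python
import hashlib

# Order n of the secp256k1 base point (the curve named later in the text).
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def derive_sk(shared_secret: bytes) -> int:
    """Compute sk = H(SS); re-hash with SS repeated until 0 < sk < n."""
    data = shared_secret
    while True:
        # SHA-256 stands in for H (assumption); its output is 256 bits like n.
        sk = int.from_bytes(hashlib.sha256(data).digest(), "big")
        if 0 < sk < N:
            return sk
        data += shared_secret  # assumed meaning of "repeating SS"


# Probability that a uniformly random 256-bit hash lands in [n, 2^256).
retry_probability = (2**256 - N) / 2**256
print(f"retry probability per hash: {retry_probability:.2e}")
```

With these parameters a retry is needed with probability on the order of 2^-128 per hash, which is why the text can treat the case sk ≥ n as a rare outlier.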

Verifier: Pairing and Verifying Proofs
The verifier's role is to match and validate the proofs provided by the provers, thus affirming the provers' common knowledge of the shared secret. The verifier follows the steps below.
1) The verifier recovers the public key PK from the message m and signature Sig using ECDSA public key recovery. The recovered public key is then stored in a local database.
2) If the database already contains another message m′ with a corresponding valid signature Sig′ that was signed under the same public key PK, it implies that the provers of m and m′ share the same secret. This is a significant conclusion, as it allows us to infer shared knowledge without revealing the secret itself.
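The complete protocol can be exercised end to end with a small pure-Python sketch. It is illustrative only and rests on several assumptions not stated in the text: SHA-256 is used as H, the nonce is drawn at random rather than derived deterministically, the arithmetic is not constant time, and the byte strings used as ID, Plate, and pseudonyms are hypothetical. A production system would use a vetted ECDSA library instead.

```python
import hashlib
import secrets

# secp256k1 domain parameters {CURVE, G, n}; P is the field prime.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def h(data: bytes) -> int:
    """Hash function H as an integer (SHA-256 is an assumption)."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % P == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], -1, P) % P
    else:
        lam = (b[1] - a[1]) * pow((b[0] - a[0]) % P, -1, P) % P
    x = (lam * lam - a[0] - b[0]) % P
    return x, (lam * (a[0] - x) - a[1]) % P


def mul(k, point):
    """Scalar multiplication by double-and-add (not constant time)."""
    acc = None
    while k:
        if k & 1:
            acc = add(acc, point)
        point = add(point, point)
        k >>= 1
    return acc


def derive_sk(ss: bytes) -> int:
    """Prover steps 1-2: sk = H(SS), re-hashing repeated SS until 0 < sk < n."""
    data = ss
    while not 0 < h(data) < N:
        data += ss
    return h(data)


def prove(ss: bytes, m: bytes):
    """Prover steps 3-5: sign the salt m with sk and return Sig = (v, r, s)."""
    sk, z = derive_sk(ss), h(m)
    while True:
        k = secrets.randbelow(N - 1) + 1           # fresh random nonce in [1, n-1]
        rx, ry = mul(k, G)
        r = rx % N
        s = pow(k, -1, N) * (z + r * sk) % N
        if rx < N and r and s:
            return ry & 1, r, s                    # v records the parity of R.y


def recover(m: bytes, sig):
    """Verifier step 1: recover PK = r^-1 (s*R - z*G) from (m, Sig)."""
    v, r, s = sig
    y = pow((r**3 + 7) % P, (P + 1) // 4, P)       # square root, valid as P % 4 == 3
    if y & 1 != v:
        y = P - y
    r_inv = pow(r, -1, N)
    return add(mul(s * r_inv % N, (r, y)), mul(-h(m) * r_inv % N, G))


# Two provers that observed the same target sign their own pseudonyms as salts.
ss = b"target-pseudonym" + b"TARGET-PLATE"         # hypothetical ID || Plate
proof_a = prove(ss, b"pseudonym-of-A")
proof_b = prove(ss, b"pseudonym-of-B")
same = recover(b"pseudonym-of-A", proof_a) == recover(b"pseudonym-of-B", proof_b)
print("proofs refer to the same shared secret:", same)
```

The verifier never sees `ss` or `sk`; it only compares the two recovered public keys, which is verifier step 2.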
A zero-knowledge proof system must satisfy three properties [37]:
1) Completeness: If the statement is true, the verifier can be convinced by the prover.
2) Soundness: If the statement is false, the verifier cannot be convinced by the prover, even if the prover tries to cheat.
3) Zero-knowledge: The verifier cannot gain any knowledge apart from the truth of the statement.
Zk-PoSS is considered a zero-knowledge proof system because it satisfies all three properties.
1) Completeness: If the PKs are identical, the verifier can deduce that the SS used to generate the PKs are identical, although it cannot know the actual secret.
2) Soundness: If the SS are not identical, then the same sk can only be produced through a hash collision, which is a computationally difficult task. Thus, a cheating prover has a negligible chance of convincing the verifier.
3) Zero-Knowledge: In the whole process, the verifier cannot derive any information regarding sk or SS from the PKs. This property is guaranteed by any public key cryptography, including ECDSA.

The ETSI CPS system can be updated by incorporating the zk-PoSS protocol to provide proof of traffic capability.
The standard curve secp256k1 is used with a 256-bit key length, which produces a 65-byte signature. The message, m, used in zk-PoSS is the prover's pseudonym, which is selected for its high entropy and because it is already transmitted in every packet, which conserves bandwidth. Furthermore, it helps in preventing the naïve replay attack, which will be discussed in Section VI.

1) Changes to the CPM packet structure
The WrappedCPMContainer structure in the CPM has been extended with an additional section called the Proof Container, which is located at the end and contains a list of 0 to 8 Proof Entries, as shown in Fig. 4. Each Proof Entry contains an object ID that is present in the perceived object container, a 32-bit prefix of the vehicle's ID to assist quick filtering, and the actual proof of this ID. The ASN.1 definition of ProofEntry is as follows.

ProofEntry ::= SEQUENCE {
    objectID   Identifier2B,
    pidPrefix  Integer32,
    v          BOOLEAN,
    r          OCTET STRING (SIZE(32)),
    s          OCTET STRING (SIZE(32))
}

The objectID field represents the ID of the target vehicle assigned in PerceivedObjects, the pidPrefix field represents the 32-bit prefix of the target vehicle's pseudonym, and the three fields v, r, and s represent the corresponding fields in the ECDSA signatures, as defined in [44].
It is recommended that the proofs be sent intermittently rather than in every CPM to minimize the amount of data transmitted and preserve bandwidth. As the CPM can be broadcast at a frequency of up to 10 Hz [45], the length of the proofs (71 bytes per proof entry under ASN.1 UPER encoding) can add up quickly. Therefore, implementations are encouraged to include an inclusion management process that limits the number of proof entries sent by the prover, and the receiver should store each entry for a specified period. The exact time interval for transmitting the proofs is not yet determined; however, sending them once every 3 s is considered sufficient to handle vehicle topology changes.
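The bandwidth argument can be made concrete with a short calculation. The sketch below assumes that Identifier2B is encoded as a 16-bit value and Integer32 as a 32-bit value under UPER (the text does not state these widths), and it treats a full Proof Container of 8 entries at 71 bytes each as a rough upper bound; the function name is ours.

```python
import math

# Assumed UPER field widths in bits for one ProofEntry.
FIELD_BITS = {
    "objectID": 16,    # Identifier2B, assumed 16-bit
    "pidPrefix": 32,   # Integer32, assumed 32-bit
    "v": 1,            # BOOLEAN
    "r": 32 * 8,       # OCTET STRING (SIZE(32))
    "s": 32 * 8,       # OCTET STRING (SIZE(32))
}

entry_bits = sum(FIELD_BITS.values())       # total payload bits per entry
entry_bytes = math.ceil(entry_bits / 8)     # padded to whole bytes

MAX_ENTRIES = 8          # upper bound on entries per Proof Container
CPM_RATE_HZ = 10         # maximum CPM broadcast frequency
PROOF_INTERVAL_S = 3     # suggested interval between proof transmissions


def proof_bytes_per_second(entries: int, sends_per_second: float) -> float:
    """Proof payload rate for a given number of entries and send rate."""
    return entries * entry_bytes * sends_per_second


every_cpm = proof_bytes_per_second(MAX_ENTRIES, CPM_RATE_HZ)
intermittent = proof_bytes_per_second(MAX_ENTRIES, 1 / PROOF_INTERVAL_S)
print(entry_bits, entry_bytes, every_cpm, round(intermittent, 1))
```

Under these assumptions one entry occupies 561 bits, which rounds up to the 71 bytes quoted above, and sending proofs every 3 s instead of in every CPM reduces the proof payload rate by a factor of 30.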

2) Changes to the ITS Station
Various implementations may produce different module designs; however, the fundamental structure of a vehicle equipped with the CPS should be similar to the one depicted in Fig. 5a.The receiver module retrieves objects from the received CPMs, and in conjunction with locally perceived objects, feeds them into the planning module for further processing.Additionally, the objects obtained from the local module are sent into the sender module to be packed as CPMs and broadcasted to the vehicles nearby.
To accommodate the zk-PoT, we introduced two new modules: prover and verifier.The prover module obtains objects from local, creates proof for each potential object, and then transfers the proofs along with the original objects to the sender module for CPM preparation and transmission, as shown in Fig. 5b.On the receiver side, the verifier module, positioned after the receiver, functions as a gatekeeper, ensuring that only verified received objects can be used for the planning module.It receives raw objects and proofs, stores them in an internal database,  ConnectedVehicle Perception and V2X communication capable vehicles, could understand CPMs PoTVehicle Proof of Traffic capable vehicles, could send and verify proofs SpamAttacker Attacker vehicles sending random fake objects to make the road look congested or blocking signals ReplayAttacker Attackers replaying CPM of both received and local perceived objects to confuse or overload others SilenceAttacker Selfish vehicles that only listen to V2V communications but do not contribute anything and continually verifies these proofs.When a match is found, the corresponding objects are considered authentic, and all the recent data of this object, including that stashed in the internal database, are passed to the planning module.Additionally, any subsequent data from the same objects are immediately forwarded to the planning module.
This design ensures minimal modifications to the preexisting modules while incorporating the proof of traffic functionality. Furthermore, this approach maintains backward compatibility with the current standards.
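The gatekeeper behavior of the verifier module can be sketched as follows. This is an illustrative simplification, not the implementation used in the study: the cryptographic cross-verification of two proofs is replaced by a plain equality check on a hypothetical `proof_tag`, and the class and method names are assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Verifier:
    """Stashes received objects until two independent provers match."""

    # proof_tag -> IDs of the provers that reported it (hypothetical model)
    provers_by_tag: dict = field(default_factory=lambda: defaultdict(set))
    # proof_tag -> objects waiting for cross-verification
    stash: dict = field(default_factory=lambda: defaultdict(list))
    # proof tags that have already been cross-verified
    verified: set = field(default_factory=set)

    def on_receive(self, prover_id, proof_tag, obj):
        """Return the objects to forward to the planning module."""
        if proof_tag in self.verified:
            # Subsequent data of a verified object passes immediately.
            return [obj]
        self.provers_by_tag[proof_tag].add(prover_id)
        self.stash[proof_tag].append(obj)
        if len(self.provers_by_tag[proof_tag]) >= 2:
            # A second, different prover matched: release everything stashed.
            self.verified.add(proof_tag)
            del self.provers_by_tag[proof_tag]
            return self.stash.pop(proof_tag)
        return []
```

In this sketch, repeated proofs from the same prover never release an object; only a proof from a different prover does, which mirrors the requirement for independent observations.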

V. EVALUATION
Although certain aspects, namely the vehicle traces, number plate recognition, and packet delivery, may not be considered realistic enough within a simulated setting, zk-PoT, functioning as a behavioral model, depends solely on the outputs of these modules and is therefore not directly influenced by their simulated performance. Thus, we consider that a simulation environment incorporating deterministic inputs suffices for the evaluation of a deterministic model.
Conversely, the pivotal consideration lies in evaluating the model's scalability across extensive geographic areas and ensuring its effectiveness in diverse traffic scenarios, including distinctions between highway and urban settings, as well as variations between rush hours and standard conditions. Considering the cost and complexity of conducting a real-world large-scale experiment, a simulation-based approach is considered more viable.
To evaluate the efficiency and performance of the zk-PoT compared to non-verified CPS in a large-scale and realistic environment, we proposed Flowsim [17], a modular simulation platform for microscopic behavior analysis of city-scale connected autonomous vehicles. The Flowsim simulator has a simplified perception model, enabling us to evaluate the performance of zk-PoT under the different visibility conditions caused by diverse vehicle densities and occlusions. Various microscopic simulations were conducted on Flowsim to evaluate the robustness, attack resistance, and network efficiency of zk-PoT.
The vehicles currently implemented are listed in Table 1.

A. PARAMETER AND METRIC SETTINGS
A set of four experiments is designed to compare the zk-PoT with the conventional methods under different conditions:
• Local Perception Only: all the vehicles operate standalone and do not communicate with each other.
• Conventional CPS: all the vehicles participate in the V2V communication and share their perception based on the ETSI CPS standards.
We collect the following metrics from all the experiments:
Time to Verify (TTV) of a specific observed vehicle denotes the time delay between the first proof of this vehicle being received and successful cross-verification, which involves receiving the proof from another observer in our case. TTV conceptually represents the time required for an observed vehicle to be covered by cross-verification. This metric can facilitate the design of the planning module. It can also evaluate the overall performance of the zk-PoT.
Local / Received / All Objects represent the number of accumulated objects collected in the different modules. Local Objects ($N_l$) is obtained from LocalPerception; Received Objects ($N_r$) is obtained from CPSReceiver; All Objects ($N_a$) is obtained from Planner. These three metrics can evaluate the performance of the CPS, as well as the impact of zk-PoT in the long term.
Verification Ratio ($R_{veri}$) is defined as the ratio of cross-verified objects ($N_a$) to the total received objects ($N_r$). Both counts exclude the locally perceivable objects ($N_l$) because this portion does not contribute additional information to the ego vehicle.
Bandwidth Consumption measures the bandwidth overhead produced by the proofs, and can also evaluate the effectiveness of the bandwidth conservation of PoT3.
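The two verification metrics can be illustrated with a short sketch. The set-based formula for the verification ratio is one possible reading of the prose definition, not a formula given in the text, and the function names and input formats are hypothetical.

```python
def time_to_verify(proof_events):
    """TTV for one observed vehicle.

    `proof_events` is a list of (time_s, prover_id) tuples. Returns the delay
    between the first received proof and the first proof from a different
    prover, or None if the vehicle was never cross-verified.
    """
    events = sorted(proof_events)
    if not events:
        return None
    first_time, first_prover = events[0]
    for time_s, prover in events[1:]:
        if prover != first_prover:
            return time_s - first_time
    return None


def verification_ratio(all_ids, received_ids, local_ids):
    """Share of received objects that reached the planner, excluding local ones.

    Assumption: objects are compared by ID, and locally perceived objects are
    removed from both the numerator and the denominator.
    """
    local = set(local_ids)
    received_remote = set(received_ids) - local
    verified_remote = (set(all_ids) - local) & received_remote
    if not received_remote:
        return None
    return len(verified_remote) / len(received_remote)
```

For example, if the planner holds objects a, b, and c, the receiver collected b, c, d, and e, and only a is perceived locally, this reading gives a ratio of 2/4.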

B. MANHATTAN SCENARIO: BOOTSTRAP AND CONVERGENCE
First, a set of small-scale experiments was conducted on a synthetic grid map, also known as the Manhattan scenario. This scenario primarily facilitates debugging and testing during the development of Flowsim, but it can also be used to evaluate the bootstrapping and convergence process. The map contains 10 by 10 grid cells, each measuring 100 × 100 meters, as shown in Fig. 6 (vehicles scaled up for better visibility). Initially, 100 vehicles are spawned in the scenario, with random starting points and destinations. No vehicles are spawned or removed afterward during the whole experiment.
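A setup of this kind could be generated along the following lines. This is only a sketch and does not reproduce the Flowsim scenario loader: the grid size, cell size, and vehicle count come from the text, while spawning at intersections, requiring distinct start and destination points, and all names are assumptions.

```python
import random

GRID_CELLS = 10     # 10 by 10 grid cells (from the text)
CELL_SIZE_M = 100   # each cell is 100 m x 100 m (from the text)
NUM_VEHICLES = 100  # vehicles spawned once at the start (from the text)


def spawn_vehicles(n=NUM_VEHICLES, seed=0):
    """Assign each vehicle a random start and destination intersection."""
    rng = random.Random(seed)
    # 10 x 10 cells are bounded by 11 x 11 intersections.
    nodes = [
        (x * CELL_SIZE_M, y * CELL_SIZE_M)
        for x in range(GRID_CELLS + 1)
        for y in range(GRID_CELLS + 1)
    ]
    vehicles = []
    for vehicle_id in range(n):
        # sample() returns two distinct nodes, so start != destination.
        start, dest = rng.sample(nodes, 2)
        vehicles.append({"id": vehicle_id, "start": start, "dest": dest})
    return vehicles
```

A fixed seed keeps the initial placement reproducible, which is in line with the deterministic inputs used for the evaluation.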
Each experiment was run for two hours (7200 ticks) and the number of vehicles available for Planner module is
The final and most important group of experiments is the LuST scenario. We use the same scenario in this work as that of the previous work, which is the 24-hour realistic traffic data generated from Luxembourg (a.k.a. the LuST scenario) [46]. The peak number of vehicles simultaneously simulated and inspected is more than 8,000 in the evening rush hours.
As shown in Fig. 8, the road network contains diverse types of roads and intersections, which makes it suitable for analyzing the effectiveness of zk-PoT in different situations. On the map, we can observe a U-shaped road stretching around the city. This is the European Route E44, also known as the A1 Motorway of Luxembourg, the road with the densest traffic. However, in the LuST scenario, there is a problematic roundabout on this road, which has been highlighted in Fig. 8 and will be discussed later.
We have selected the same vehicle dimensions and communication and perception parameters, and then added parameters for extended vehicle behaviors, such as pseudonym changing. Flowsim is further optimized to ensure that the full-sized LuST scenario (i.e., 86400 ticks) can be finished within a wall time of one day.

1) Traffic density and the problem
Fig. 9 depicts the number of active on-road vehicles during the day. In the experiment, there are three rush hours in total: 10:00, 13:00, and 19:00. Thus, we selected three samples to analyze the behavior of the system under different scenarios: 7:00, i.e., before the morning rush hours; 12:00, i.e., at noon; and 19:00, i.e., during the peak of the evening rush hours. Fig. 10 depicts the traffic density distribution in the three samples. In the heatmap of 7:00, the data points form a notable U-shaped path, representing the normal traffic on the E44. However, in the sub-figures of 12:00 and 19:00, we can see that the data points on this road are very sparse, particularly in the southeast region, i.e., the bottom right area. This scarcity is caused by the complete congestion observed at the roundabout shown in Fig. 12; refer to the marked part in Fig. 8. In Fig. 12, the color of each lane represents its occupancy. Notably, the occupancy on the E44 appears to be very low, while the roundabout and the entrances/exits of the E44 are completely blocked by vehicles. This severe congestion has even produced a large number of teleportation events [47], where vehicles approaching the roundabout from other roads were instantaneously teleported to different locations. The congestion prevents vehicles from entering the E44, leading to missing data in the southeast corner of the road during those time intervals in Figs. 10 and 11.
We believe that this is caused by the well-known border traffic issue and the teleportation issue of SUMO scenarios; however, we have not modified the LuST scenario itself to ensure that it corresponds with that of other researchers. Nevertheless, we can still analyze the other part of the E44 to observe behavior in similar highway traffic situations.

2) Verification ratio
The verification ratio is the main performance metric of the system. We measure the long-term verification ratio throughout the entire day, which better reflects real-world conditions, instead of calculating the verification ratio (named success ratio in the previous work) separately for each tick as in the previous work.
Fig. 11 depicts the verification ratio at the same sampling times. We can observe that the verification ratio has a positive correlation with vehicle density. It starts at approximately 80 % at 7:00 and peaks at over 96 % at 19:00. Furthermore, the vehicles in the downtown area generally perform better than the ones on the highway. This is because the vehicles on the highway are typically platooning and do not encounter different vehicles for a relatively long period. Conversely, the diversity of roads in the downtown area and the randomness of the vehicle behaviors in that area make the platoons of vehicles very volatile, thus increasing the cooperative perception and the verification ratio. This phenomenon is particularly evident in the sub-figure for 19:00.

3) Time-to-Verification
The timeliness of the cooperative perception data, i.e., the TTV, is the second aspect to be considered. Fig. 13 depicts the TTV distribution of every hour, scaled to 100 %. We can observe that the system is bootstrapping and that the traffic is sparse during the period from 0:00 to 7:00, resulting in longer verification delays. From the data after 9:00, we can observe that the trend becomes relatively flat, indicating that the system has reached dynamic equilibrium. Based on the trend, we can observe that approximately 35 % of the data can be verified in the same tick in which it first appears, which produces a sub-second verification delay. Furthermore, over 65 % and 80 % of the observations can be verified within 1 s and 2 s, respectively. The result indicates that the proposed zk-PoT presents a very short latency from the point of receiving data to considering them trustworthy. Conversely, the conventional statistics and plausibility approaches require a significantly longer time to collect "evidence" to build the trust of either the sender or the received data.

4) Protocol overhead
Lastly, we must consider the protocol overhead as well. Fig. 14 shows that the system is not fully bootstrapped in the simulation time before 9:00, similar to the previous pattern. Beyond this point, we can consider it as the normal working condition of our system. Therefore, we primarily focus on analyzing the differences in this phase.
In the conventional CPS with no PoT enabled, the average bandwidth consumption of a single vehicle stabilizes at approximately 6 kbps. In the PoT-enabled system with proofs repeated every second, the system presents a 40 % increase in bandwidth when compared with the baseline system, reaching 9 kbps. However, if we change the PoT to repeat proofs every 3 s, the bandwidth increment reduces to about 25 %, resulting in a bandwidth of 7.5 kbps, which is considered stable.
Drawing upon the quantitative results presented earlier, a comparative analysis with conventional misbehavior detection (MBD) approaches can be conducted. While MBDs assume data are trustworthy and try to detect lies, we assume the opposite and try to prove truths. Despite the different intentions and methodologies employed by these two methods, they converge in providing similar functionality, specifically ensuring data authenticity. Table 3 presents a comparative analysis between zk-PoT and various MBD approaches, specifically those focusing on cooperative perception. The first classification is methodological taxonomy, distinguishing between node-centric and data-centric approaches. Within the data-centric domain, two representative strategies, namely plausibility-based and consistency-based methods, were selected for examination. The second categorization involves the identification of applicable attack types, specifically ghost vehicle attacks (fabricating non-existent vehicles) and omission attacks (intentionally excluding vehicles).
The analysis reveals that zk-PoT demonstrates notable attributes, including a zero false positive ratio, an elevated verification ratio, and a comparatively short decision delay. However, it is imperative to acknowledge zk-PoT's susceptibility to omission attacks. This limitation is inherent to zk-PoT's fundamental nature: nonexistence means no data and thus no entropy at all, so it is impossible to prove that "nothing is there".

E. ADAPTABILITY TO NON-V2X VEHICLES
In the zk-PoT mechanism, an imperative requirement for the identification and subsequent verification of a target vehicle is that the vehicle has a unique identifier with sufficient entropy. For the sake of simplicity, this study assumes the utilization of the V2X station ID of each vehicle. It is noteworthy, however, that such an arrangement is not obligatory.
Different technologies already on the market also feature their own distinct identifiers, thereby offering flexibility in implementation and deployment. Notably, certain states in the United States exhibit a notable prevalence of Radio-Frequency Identification (RFID) based tags [51], [52], while Japan has witnessed an escalating adoption of Electronic Toll Collection Systems 2.0 (ETC 2.0) [53]. Vehicles equipped with these existing technologies can thus still be observed and proved, even without V2X capabilities themselves. zk-PoT can be adapted to utilize some or all of those technologies with little effort.

VI. THREAT ANALYSIS
The zk-PoT provides vehicles with the capability to swiftly and deterministically validate data received from the CPS. This eliminates the possibility of naïve attacks and data tampering, such as fabricating non-existent vehicles and duplicating existing vehicles. Furthermore, various complex attacks can be prevented or mitigated. Although the following list is not exhaustive, it covers some of the most common attacks that can be encountered.

A. BRUTE-FORCE ATTACK AND DICTIONARY ATTACK
Malicious entities may attempt to perform brute-force attacks or dictionary attacks on the number plate used in the proof.
In the case of a brute-force attack, the attacker can try to match the public keys used in other vehicles' proofs by only "hearing" the candidate observed vehicles. As the range of "hearing" is considerably larger than that of "seeing," attackers can still pretend that they "saw" the target vehicles. Furthermore, this attack can be done in an offline fashion and does not require any exchange of information with other vehicles. We can counter these types of attacks by employing key derivation functions (KDFs), such as PBKDF2 [54] or scrypt [55], to increase the time required for a successful brute-force attack, making it longer than the lifespan of a pseudonym.
In the case of a dictionary attack, the attacker can gather previously known number plates and use them as a reference. This type of attack presents a higher success rate than that of a brute-force attack due to the locality of the vehicles. Mechanisms such as an ephemeral salt can be employed to prevent this attack.
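The effect of a KDF combined with an ephemeral salt can be illustrated with Python's standard `hashlib.pbkdf2_hmac`. This sketch is not the key derivation used by zk-PoT: the plate strings, the iteration count, the salt handling, and the function names are all hypothetical and serve only to show why every guess becomes expensive and why a precomputed dictionary stops working once the salt changes.

```python
import hashlib
import os


def derive_key(plate: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Stretch a low-entropy number plate into a 32-byte key with PBKDF2."""
    return hashlib.pbkdf2_hmac(
        "sha256", plate.encode("utf-8"), salt, iterations, dklen=32
    )


def dictionary_attack(target_key, salt, candidates, iterations):
    """Attacker's view: every guess costs one full KDF evaluation."""
    for plate in candidates:
        if derive_key(plate, salt, iterations) == target_key:
            return plate
    return None


if __name__ == "__main__":
    salt = os.urandom(16)  # assumed to be refreshed periodically (ephemeral)
    key = derive_key("AB1234", salt)
    # A table built for an older salt is useless for the current one.
    old_table_key = derive_key("AB1234", os.urandom(16))
    print(key != old_table_key)
```

Raising `iterations` scales the cost of each guess linearly, so the parameter can be tuned until an exhaustive search would outlast the pseudonym lifetime.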

B. LOCATION PRIVACY AGAINST TRACKING
The use of pseudonyms is crucial in ensuring location privacy in V2X communications. With the combination of pseudonyms and KDFs, brute-force attacks cannot be performed quickly enough to create a timely proof. However, an attacker can still collect data on the road and recover a vehicle's trajectory.
Fortunately, after the target changes its pseudonymous ID, remote attackers can no longer track it, even if they have information about its old ID and number plate. As such, the proposed mechanism provides at least the same level of privacy as that of traditional pseudonyms.

C. SPAM ATTACK
Verifiers are vulnerable to spam attacks as partial proofs cannot be falsified. Attackers can easily create seemingly valid proofs at random, which can deplete the victim verifiers' computing power and memory, thus potentially causing a denial-of-service situation.
This type of attack can be mitigated by setting a limit on the number of unmatched proofs from each prover. If a vehicle exceeds this limit, the verifier must disregard any subsequent proofs sent by the vehicle until some of them are matched by other vehicles.
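The per-prover limit described above can be sketched as a small bookkeeping class. The limit value, the class name, and the use of an opaque `proof_tag` to identify a proof are assumptions for illustration; the text does not specify them.

```python
from collections import defaultdict


class UnmatchedProofLimiter:
    """Caps the number of unmatched proofs stored per prover."""

    def __init__(self, limit: int = 32):  # hypothetical default limit
        self.limit = limit
        self.unmatched = defaultdict(set)  # prover_id -> unmatched proof tags

    def accept(self, prover_id, proof_tag) -> bool:
        """Return True if the proof may be stored for later matching."""
        pending = self.unmatched[prover_id]
        if proof_tag in pending:
            return True  # already stored; a repeat needs no extra memory
        if len(pending) >= self.limit:
            return False  # prover is over its limit; disregard the proof
        pending.add(proof_tag)
        return True

    def mark_matched(self, prover_id, proof_tag) -> None:
        """Free a slot once another vehicle has matched this proof."""
        self.unmatched[prover_id].discard(proof_tag)
```

An honest prover regains capacity as its proofs are matched by others, whereas a spammer sending random proofs stays blocked because none of them are ever matched.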

D. SIGNAL JAMMING ATTACK
In the proposed scheme, a malicious vehicle cannot simply replay a message. However, the attacker can still jam the signal from the original prover and then replay its proof as their own. This kind of attack can be mitigated by including the prover's own pseudonym in the proof. Because the pseudonym of a vehicle cannot be easily forged, it securely identifies the original prover.

E. COLLUSION ATTACK AND SYBIL ATTACK
Collusion and Sybil attacks are both specific to applications that require voting or cross-verification, including the proposed method. A collusion attack is a type of attack in which multiple malicious entities work together to deceive a system. In a Sybil attack [56], the attacker creates multiple pseudonymous identities to impersonate multiple users. These two types of attacks can be combined to create a set of fake vehicles that can deceive victim verifiers.
While these attacks are typically challenging to counter, we can implement certain strategies to mitigate them. We can mitigate the Sybil attack by implementing mechanisms that prevent the simultaneous use of different pseudonyms belonging to the same vehicle. The design of pseudonyms is not directly related to the proof system and will be addressed in the future. For the non-Sybil collusion attack, the number of vehicles controlled by the attacker is limited as they require a fleet to perform the attack. In this case, we can limit the revenue from the data. The revenue could comprise a trust value or credits, and it varies depending on the system built on top of the proof system. For example, we can set the trust value for a successful match between two specific provers to be the reciprocal of the number of accepted matches over the past 15 minutes. This countermeasure can cause a sharp decrease in trust, making further collusion unprofitable.
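The reciprocal trust rule in the example above can be sketched with a sliding window per prover pair. The 15-minute window comes from the text; counting the current match in the denominator (so that the first match is worth 1.0), the class name, and the requirement that timestamps arrive in non-decreasing order are assumptions.

```python
from collections import defaultdict, deque

WINDOW_S = 15 * 60  # 15-minute window (from the text)


class PairTrust:
    """Trust gained per match decays with repeated matches of the same pair."""

    def __init__(self, window_s: float = WINDOW_S):
        self.window_s = window_s
        # unordered prover pair -> timestamps of accepted matches
        self.history = defaultdict(deque)

    def record_match(self, prover_a, prover_b, now_s: float) -> float:
        """Record a match and return the trust value it earns."""
        times = self.history[frozenset((prover_a, prover_b))]
        # Drop matches that fell out of the window.
        while times and now_s - times[0] > self.window_s:
            times.popleft()
        times.append(now_s)
        # Assumption: the current match is included in the count.
        return 1.0 / len(times)
```

Under this reading, a pair of colluding provers earns 1, 1/2, 1/3, and so on for successive matches within the window, so repeatedly vouching for each other yields rapidly diminishing returns.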

VII. CONCLUSION
This study proposed a novel deterministic cross-verification scheme called zk-PoT. By enabling vehicles to generate zero-knowledge blind proofs for their independent observations, remote parties can be convinced by cross-verifying these proofs, without leveraging the ground truth or trust evaluation. Owing to the zero-knowledge property, the zk-PoT does not reveal any information about any particular vehicle, thereby preserving the location privacy of the vehicles.
The quantitative simulation analyses in the Luxembourg scenario verify that zk-PoT performs well. In the experiments, about 80 % to 96 % of observed vehicles can be cross-verified by zk-PoT. Over 80 % of cross-verifications happened within 2 s and over 90 % happened within 5 s. The bandwidth consumption overhead is approximately 25 % compared to the original 1 Hz CPS standard.
The threat analysis against various kinds of attackers shows that zk-PoT can withstand most of the common threats, such as brute-force attacks, dictionary attacks, spam attacks, and signal jamming attacks. It can maintain location privacy at the same level as the original pseudonym system. The collusion attack is challenging, but it can still be mitigated by extra rules.
zk-PoT can be either used as a standalone method or integrated with existing cooperative perception standards, as shown in the aforementioned example, where zk-PoT is implemented for the CPS in the ETSI/ITS standards. Furthermore, zk-PoT is particularly helpful for bootstrapping the trust establishment process in several existing trust management models as it can provide strong evidence that can be cross-verified by other vehicles [57].
Owing to the limited scope of this study, not all aspects have been examined comprehensively. The following issues should be considered in the future. First, the vehicles are not incentivized to share data and tend to be selfish. This will reduce the number of proofs, thus harming the cross-verification ratio. Based on the "privacy-trust dilemma," an economic model could be proposed. Second, Sybil attacks are not solved even when collusion mitigation techniques are introduced. This is a potential research direction to enhance zk-PoT. Lastly, the non-connected vehicles are not considered in this study because they lack an identity with enough entropy. Other mechanisms should be considered to make zk-PoT better adapted to mixed traffic environments.

Figure 5: General architecture of augmented perception system

Figure 6: Data format used for information exchange within the modules of the augmented perception system

FIGURE 1: General structure of CPS

FIGURE 3: Overview of Zero-knowledge Proof System

FIGURE 4: Structure of the extended part of collective perception message (CPM) for zk-PoT

FIGURE 8: LuST scenario with the position of the E44 and the roundabout

FIGURE 9: Number of active vehicles in the LuST scenario

FIGURE 10: Number of vehicles in Luxembourg city at different times

FIGURE 11: Verification ratio in Luxembourg city at different times

TABLE 1: Implemented vehicle types in Flowsim

| Name | Description |
| --- | --- |
| UnconnectedVehicle | Traditional vehicles with perception, but not connected to V2X |

• Proof of Traffic (1s repeat): all the vehicles are capable of Proof of Traffic and include up to 8 proofs for recently seen vehicles in the CPMs they send, repeated every second.
• Proof of Traffic (3s repeat): identical to PoT, except the vehicles only repeat proofs of the same vehicle once every three seconds. This setting can potentially decrease the overall bandwidth overhead while maintaining comparable performance.
These experiments are conducted in scenarios with different scales. Table 2 lists the common parameters throughout all the experiments.

TABLE 2: Common Parameter Settings

TABLE 3: Comparison of zk-PoT with misbehavior detection methods

FIGURE 14: Bandwidth consumption per vehicle (axes: Time, kbps)