A Lightweight Stochastic Blockchain for IoT Data Integrity in Wireless Channels

Trustworthy validator selection is crucial because validators determine whether a block should be added to the chain. In this article, we propose a novel confidence-score-based lightweight stochastic blockchain for wireless Internet of Things (IoT) systems. The received signal strength is used to define the trust level of a wireless IoT node, which facilitates the selection of more trustworthy validator nodes and thereby reduces the possibility of selecting malicious validators. We use a lightweight authentication protocol called Burrows-Abadi-Needham (BAN) logic to prevent unauthorized information leakage. A formal security analysis using BAN logic is provided to prove that data storage is secure and fresh. Our analysis and simulations reveal that the probability of successful defense against data integrity attacks (i.e., the probability that the majority of the validator nodes are not compromised) can be up to twice that of the closest competing scheme, namely random selection in stochastic blockchains. Our results further reveal that the probability of successful defense depends on the total number of nodes in the network and the number of validator nodes. The proposed blockchain concept can be easily implemented in various wireless IoT environments to improve the system's defense and maintain IoT data integrity.


I. INTRODUCTION
Wireless Internet of Things (IoT) technology connects numerous devices and sensors through a distributed wireless network environment. The collected IoT data can facilitate intelligent decision-making. Wireless sensor networks (WSNs) are an integral part of the IoT because they facilitate the wireless interconnection of things [1]. Our research focuses on the security of resource-constrained, wirelessly connected IoT devices. The wireless characteristics of this technology complicate the secure exchange of IoT data among heterogeneous IoT devices [2]. Because the untrusted communication environment may cause data leakage [3], an authentication mechanism is required to identify the communicating party and to thwart spoofing and hijacking attempts. A lightweight authentication scheme is therefore needed for low-powered IoT devices.
A major security issue in IoT networks is data integrity. Conventional data verification methods that rely on a trusted central entity are not suitable for distributed IoT systems. In addition, IoT devices are usually resource-limited [4], which precludes the implementation of complex data verification algorithms. These security challenges can be resolved by implementing suitable lightweight authentication techniques and by integrating wireless IoT with blockchain.
In a blockchain network, transaction records are stored as blocks, each of which contains the hash value of the previous block to which it is linked [5]. Any change in a block changes the corresponding hash, which results in an immutable chain. Notably, blockchains can achieve data consistency among nodes through a consensus mechanism. The preservation of data integrity in wireless IoT relies on both the chain structure and the consensus mechanism. Trustworthy validator nodes verify the consistency of the data via the consensus mechanism before the data are included in the chain. Moreover, the immutability inherent in blockchain technology serves as a protective barrier that impedes unauthorized modification of the stored data.

II. RELATED WORK
The authors of [6] proposed a lightweight blockchain-based secure distributed key management scheme for flying ad hoc networks (FANETs). The authors of [7] surveyed blockchain for securing vehicular networks. The authors of [8] addressed the coexistence of heterogeneous networks in unlicensed spectrum by introducing a blockchain implementation with a proof-of-strategy consensus mechanism. The majority of the papers on blockchain-enabled wireless networks analyzed the latency, scalability, and throughput of the system. Very few studies [9] address the successful defense of blockchain against malicious attacks. Some studies have integrated blockchains into wireless IoT by using various consensus mechanisms, such as Proof of Work (PoW) [10] and Proof of Stake (PoS) [11]. The main limitation of these mechanisms is the high computational cost of block mining, which becomes a bottleneck because IoT involves more data transfer than traditional cryptocurrency scenarios. Specifically, each block appended under PoW and PoS mechanisms must be verified by all the nodes within the blockchain network. This is not practical for resource-limited IoT devices [12]. An alternative to PoW and PoS protocols is the Practical Byzantine Fault Tolerance (PBFT) algorithm, in which only a few preselected validators are required to reach consensus [13]. Its throughput and storage efficiency are superior to those of PoW and PoS protocols, and its computational costs are relatively low. This comes at the cost of security, however: the attack tolerance is less than 33% [14].
The author of [15] proposed the concept of fast and secure consortium blockchains with lightweight block verifiers (LBVs). LBVs are edge devices that help typical miners verify the blocks. To provide high data integrity, the selection of trustworthy miner nodes is also important. The author of [16] proposed a validator selection technique for integrating blockchains into drones in 5G. Validators are selected based on their interaction frequency and on direct and indirect opinions from other drones. The author of [17] proposed a consensus algorithm for blockchain-based IoT in which one master node is selected among all nodes by voting, and the master node then selects a few validator nodes to verify the data, so that resources are not wasted on competing to become validators. However, this scheme may not be very secure because the master node itself can be attacked.
The author of [18] proposed a trust-based privacy-preserving scheme for IoT networks to improve cooperative sensing. Trust is adaptive and based on the nodes' historical and current performance. Data are stored in a blockchain to maintain immutability. Trust is an important factor to consider when dealing with data integrity. A few parameters [19] are commonly used to calculate the trustworthiness or confidence of a node: direct trust and indirect trust, historical behaviour and current behaviour, and adaptive trust based on the current and historical behaviour of the node. Very few studies specify the exact parameters that define trust. In [20], the concept of age of information is introduced to verify data freshness. Further, the author of [21] analyzed the impact of timely updates of information in the blockchain.
Therefore, the authentication, freshness, and trustworthiness of data are important factors in maintaining data integrity. When blockchain-IoT integration is considered in wireless scenarios, wireless channel characteristics play an important role. To the best of the authors' knowledge, no existing research on validator selection considers both wireless characteristics and stochastic nature. An efficient trust-based lightweight consensus mechanism is needed [22] for maintaining data integrity in blockchain-enabled IoT.

A. MOTIVATION
An authentication scheme is required for nodes to prevent external attackers from gaining unauthorized access to data. We employ BAN logic authentication owing to its lightweight nature and suitability for resource-constrained IoT devices [3], [23]. Furthermore, low block overhead and trustworthy validator nodes are essential to ensure high data integrity in blockchain-IoT environments. Since IoT devices are typically resource-limited, nodes cannot compete for block validation. An attacker can manipulate the blockchain by compromising the majority of the validator nodes because they perform block mining. Our previous work [9] proposed a stochastic blockchain network with multiple randomly selected nodes as validator nodes. We demonstrated data integrity with low block verification overhead by introducing randomness in validator selection. However, that work does not consider reliability between IoT nodes, device authentication, or data freshness, and each node has an equal chance of being elected as a validator node. In this article, we discuss preventing unauthorized access to data and maintaining the freshness of data using BAN logic. Further, we propose a novel trust-aware validator selection scheme to reduce the possibility that compromised nodes are selected as validator nodes. By implementing weighted validator selection, the consensus mechanism becomes lightweight and efficient.

B. RESEARCH CONTRIBUTIONS
The main contributions of this research are summarized as follows:
- Our novel method, which can be used in the stochastic selection of trusted block validators, is based on estimates of the confidence scores of IoT nodes. The confidence score of a node can be calculated by comparing the strength of the signal it receives with its reported location. In our design, each node has a probability of being selected as a validator on the basis of its confidence score.
- The probability of successful defense, defined as the probability that the number of compromised validators is less than half of the total number of validators, is analyzed. The impacts of the number of nodes and the number of validators on the security performance under varying levels of attacker ability are also analyzed.
- Extensive simulations in various network attack scenarios have been performed. The validator selection scheme outperformed the random selection scheme.
- Based on the design goal of high data integrity, Burrows-Abadi-Needham (BAN) logic-based data authentication is performed to prove data security and freshness. A mathematical analysis using BAN logic in the considered scenario is also presented.
The remainder of the article is organized as follows. In Section III, we present a background on blockchains. In Section IV, our solution is described. Section V presents the mathematical analysis of the proposed scheme. Section VI presents the performance evaluation, and the conclusions are provided in Section VII.

III. BACKGROUND ON BLOCKCHAINS
The data in a blockchain are stored in the form of blocks after verification by validator nodes, and data storage is distributed in nature [24]. Each block contains the previous block's hash value, which makes the blockchain immutable because any change to the data affects both the current and all subsequent blocks. Based on the controlling authority, blockchains may be broadly classified as public, private, and consortium blockchains. A public blockchain, such as Bitcoin [26], has no central authority and is open to everyone [25]. In contrast, a private blockchain is managed by a single organization with full control of validator selection. Hyperledger Fabric [27], managed by the Linux Foundation, is the most common open-source platform for supporting private blockchains. The consortium blockchain is a hybrid blockchain that is controlled by a group of validator nodes. It is suitable for heterogeneous IoT systems with various administrative domains [28].
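The hash linking described at the start of this section can be illustrated with a minimal sketch. It is not the authors' implementation: the block fields (`index`, `data`, `prev_hash`), the use of SHA-256 over a JSON encoding, and the sample RSSI strings are all assumptions made for illustration.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the block (assumed format)."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: list, data: str) -> None:
    """Append a block that stores the hash of its predecessor."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "data": data, "prev_hash": prev_hash})


def is_consistent(chain: list) -> bool:
    """Check that every block still points at the hash of the block before it."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = []
for reading in ["rssi=-61", "rssi=-64", "rssi=-70"]:  # hypothetical sensor data
    append_block(chain, reading)

print(is_consistent(chain))    # True: links match for the chain as built
chain[0]["data"] = "rssi=-10"  # tamper with an early block
print(is_consistent(chain))    # False: block 1 no longer matches block 0
```

Changing any stored block invalidates the link held by its successor, which is the property the text refers to as immutability.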
Depending on the required security level and the network environment, different consensus mechanisms are used by validator nodes to verify the blocks. PoW, PoS, and PBFT are the most commonly used consensus mechanisms. In PoW, nodes compete to solve a computational puzzle; the node that solves the puzzle first is rewarded. PoS is intended to solve the problem of high energy consumption in PoW. The validator nodes in the PoS mechanism are selected on the basis of the value of coins held (i.e., the stake). The probability of being selected as a validator is determined by the nodes' respective stakes. In case of malicious behavior, the nodes are punished and their stakes are reduced [29]. In PBFT, consensus on a new block is reached if and only if no less than two-thirds of the validators confirm the block within a given time period [30]. This is intended to reduce transaction time and increase network scalability.
This article considers the consortium blockchain because it is more suitable for resource-constrained IoT devices. As a consensus mechanism, PoW requires high computational power, which is incompatible with resource-constrained IoT devices. In PoS protocols, the probability of a node being selected as a validator is positively correlated with the value of the stake it holds. The public nature of stake information makes it possible to predict which nodes will participate in the block validation process. To resolve this security vulnerability, we introduced the stochastic consensus mechanism in our previous work [9]. We demonstrated that the randomness introduced during the validator selection process can significantly reduce the attack success probability. Validator nodes are responsible for verifying the block data as well as maintaining data integrity. Therefore, selecting trustworthy nodes is an important issue for blockchain, especially in the open wireless communication scenario. Because the validator selection mechanism proposed in [9] uses a uniform probability, it is inherently unable to reflect node heterogeneity. Hence, in this article, we propose a stochastic weighted selection of validators for IoT data integrity using a confidence score calculated from wireless characteristics.

IV. SYSTEM MODEL
The system model has three types of nodes: sensor nodes, cluster head nodes, and validator nodes. Let $S = \{S_1, S_2, S_3, \ldots, S_N\}$ be the set of N randomly distributed sensor nodes. Sensor nodes have low computational power and sense the target continuously. The target T, which may be either the transmitter to be localized or any primary user, is under continuous detection by the sensor nodes. Sensor nodes with high confidence scores are selected as validator nodes. The sensor nodes transmit data, together with the corresponding sensor's location (Loc) information and the received signal strength indicator (RSSI), to their designated cluster head (CH). Cluster head nodes are IoT edge nodes with greater computational capability than sensor nodes.
The CH receives data (RSSI, Loc) from its nearby sensors and calculates the confidence score of each sensor node. The CH then transfers the data to the base station (BS), also known as the destination node. The destination node (D) is a highly secure node that selects validator nodes stochastically based on their weights. The validator nodes then send their data to the smart contract (SC), and the majority-based data are selected for blockchain (BC) storage. The architecture of our proposed system is shown in Fig. 1.
Sensor model: After detecting a target, the sensor node sends (RSSI, Loc) to the nearest cluster head. The IoT edge nodes act as the cluster heads, and sensors are assigned to them. Each sensor is associated with the cluster head that has the shortest Euclidean distance among the nearby cluster heads. Based on the data obtained from the sensors, the IoT edge nodes estimate the target's position with a certain error $d_{\mathrm{err}}$ (Fig. 2). As shown in Fig. 2, an annulus corresponds to each sensor node. The lower and upper limits of the annulus represent the minimal and maximal approximated distances of the sensor, respectively. The thickness of the annulus reflects the degree of certainty regarding the position of the sensor zone: the thinner the annulus, the higher the certainty.
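The sensor-to-cluster-head association described in the sensor model can be sketched as follows. The 2-D coordinates and the node identifiers are hypothetical and serve only to illustrate the shortest-Euclidean-distance rule.

```python
import math


def nearest_cluster_head(sensor_pos, cluster_heads):
    """Return the id of the cluster head with the smallest Euclidean distance."""
    return min(
        cluster_heads,
        key=lambda ch_id: math.dist(sensor_pos, cluster_heads[ch_id]),
    )


# Hypothetical 2-D coordinates in metres.
cluster_heads = {"CH1": (0.0, 0.0), "CH2": (100.0, 0.0)}
sensors = {"S1": (10.0, 5.0), "S2": (80.0, -20.0), "S3": (49.0, 0.0)}

assignment = {
    sensor_id: nearest_cluster_head(pos, cluster_heads)
    for sensor_id, pos in sensors.items()
}
print(assignment)  # {'S1': 'CH1', 'S2': 'CH2', 'S3': 'CH1'}
```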
Blockchain model: The blockchain considered here is a consortium blockchain in which the data read and write operations are controlled by designated validator nodes. The probability of a node being selected as a validator node depends on its confidence score. The blockchain consensus is therefore very similar to PoS, except that validator selection is trust-based rather than stake-based. It turns the resource-inequality flaw of PoS to advantage, in the form of unequal validator selection probabilities. The final data selection for block mining is performed according to the majority-based selection of the validator nodes. Each node's confidence score and timestamp are stored in the blockchain.
Threat model: Any node that attempts to tamper with the data or inject malicious data is considered an attacker. Any number of nodes can be randomly attacked [9]. This depends on the attack capability $C_A$, which is defined as the number of nodes compromised in a single attempt. If a sensor node is tampered with in an attack, it presents falsified data (RSSI, Loc). The confidence score of each sensor node is calculated from its reported data using the log-distance path loss model. The IoT edge node knows the true position of the target and can estimate the position of the sensor node. A thicker annulus reduces the confidence score and consequently the probability of being selected as a validator. In this way, we increase the rate of successful defense through a strategy change.
Protocol: Our system comprises four key phases of activity.
- Sensing phase: Sensors detect the target and report to the nearest IoT edge node.
- Weight assignment phase: The confidence score of a node corresponds to its probability of selection. The weight of each node is directly proportional to this probability.
- Validator selection phase: The higher the weight assigned to a node, the greater its likelihood of being selected as a validator.
- Blockchain phase: The block is mined according to a majority-based data selection process and is broadcast to all the nodes for state updates.

A. PROPOSED CONFIDENCE SCORE BASED WEIGHT ASSIGNMENT
The goal of weight assignment is to identify the truthfulness of the sensor node in a distributed manner by using fundamental sensor-reported information. IoT edge node $I_j$ estimates the position of the target by using the log-distance path loss model. Localization methods [31] include trilateration and multilateration. Trilateration determines the node position from the intersection of three circles centered on three anchor nodes. Hence, at least three anchor nodes are required for localization. If the distance measurements are noisy, the accuracy of position estimation is compromised. Multilateration requires distance measurements from more than three nodes. The author of [32] explains various localization techniques based on distance, the angle of arrival, and the time of arrival. However, to maintain system simplicity and energy efficiency, we use a log-distance path loss model. The estimated position of the target may vary because of one or both of the following:
1) the falsification of sensor data by malicious sensors;
2) model noise and other inaccuracies.
To allow for error, we incorporate an error factor $d_{\mathrm{err}}$ into our formula; $d_{\mathrm{err}}$ is the error in the position of the target estimated by the IoT edge nodes, caused by noise or other factors such as signal distortion. The true location of the target lies within the circular region of radius $d_{\mathrm{err}}$ centered on the approximated target position.
The proposed weight assignment steps are presented in Algorithm 1. The distance from the target to the IoT edge node $I_j$ is denoted $d_{I_j}$, the estimated target location is $\mathrm{Loc}_T$, and the location of the IoT edge node is $\mathrm{Loc}_{I_j}$ (line 4). Owing to uncertainty in the target location, the actual distance from the target to the IoT edge node lies in the range $[d_{I_j} - d_{\mathrm{err}},\ d_{I_j} + d_{\mathrm{err}}]$. Lines 6-13 give the steps for weight assignment for each sensor within the set S. For weight estimation, if the confidence score is greater than the threshold, the corresponding probability is calculated and the weight is assigned accordingly. Otherwise, the node is assigned a probability of 1%. Such nodes contain anomalies or a smaller amount of true data. Assigning a lower confidence score to malicious nodes prevents them from becoming validator nodes. In the present study, we assign a higher weight to more truthful nodes to maximize their likelihood of being selected as validators.

1) ESTIMATION OF THE TARGET POSITION ZONE
Using the log-distance path loss model, the power of the signal received from the target at the IoT edge node and at the sensor node is calculated. The estimated minimal and maximal distances are calculated using (1), where $x_g$ is a zero-mean Gaussian random variable that represents the shadowing effect and $\gamma$ is the path loss exponent, whose value varies according to the characteristics of the area considered (e.g., rural or urban). As for the other components of the equations, $d_{\mathrm{err}}$ is the error in the estimated distance, $d_{I_j}$ is the distance from the target to the IoT edge node, $P_{r,I_j}$ is the power received at the IoT edge node from the signal transmitted by the target, and $P_{r,s_i}$ is the power received at the sensor node from the signal transmitted by the target. The estimated target position zone is given by $(d^{\max}_{s_i}, d^{\min}_{s_i})$, which defines the annulus of sensor $s_i$.
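The following sketch shows one way the annulus limits could be obtained from the quantities defined above. It is an assumption rather than a reproduction of (1): it takes the standard log-distance model, in which received power falls by $10\gamma\log_{10}(d)$ dB, so that the ratio of the two distances from the target is $10^{(P_{r,I_j} - P_{r,s_i} + x_g)/(10\gamma)}$, and it applies that ratio to both ends of the range $d_{I_j} \pm d_{\mathrm{err}}$. The function name, the sign convention for $x_g$, and the sample values are hypothetical.

```python
def sensor_annulus(p_r_edge_dbm, p_r_sensor_dbm, d_edge, d_err, gamma, x_g=0.0):
    """Return (d_min, d_max) for the sensor-to-target distance.

    Assumed model: both receivers hear the same target, so the dB difference
    between their received powers maps to a distance ratio.
    """
    if gamma <= 0:
        raise ValueError("path loss exponent must be positive")
    if not 0 <= d_err < d_edge:
        raise ValueError("d_err must satisfy 0 <= d_err < d_edge")
    ratio = 10 ** ((p_r_edge_dbm - p_r_sensor_dbm + x_g) / (10.0 * gamma))
    return (d_edge - d_err) * ratio, (d_edge + d_err) * ratio


# Hypothetical values: the sensor hears the target 20 dB weaker than the
# edge node does, with gamma = 2 (free-space-like propagation).
d_min, d_max = sensor_annulus(-50.0, -70.0, d_edge=10.0, d_err=1.0, gamma=2.0)
print(d_min, d_max)      # roughly 90 and 110
print(d_max - d_min)     # annulus thickness, roughly 20
```

Under this assumed form, the annulus thickness grows with $d_{\mathrm{err}}$ and with the distance ratio, which is consistent with the text's use of thickness as an indicator of how certain the position estimate is.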

2) CONFIDENCE SCORE CALCULATION AND WEIGHT ASSIGNMENT
After $d^{\max}_{s_i}$ and $d^{\min}_{s_i}$ are calculated, the position of the target can be estimated. Moreover, the confidence score $C_{s_i}$ can be calculated from these distances, with $d_0$ as the reference distance used for normalization.
According to the confidence score, the truthfulness of a node can be estimated; accordingly, the probability that a node is selected as a validator is determined. The greater the confidence score $C_{s_i}$, the more truthful the node, because $C_{s_i}$ is defined on the basis of $d^{\max}_{s_i} - d^{\min}_{s_i}$, i.e., the thickness of the annulus. A higher confidence score corresponds to a higher likelihood of being selected as a validator node. The probability of being selected as a validator, $P_i$, is a linear function similar to the linear function of confidence scores used in [33]. It depends on N, the total number of sensor nodes in the network; V, the number of validator nodes selected; a constant k; and $C_{s_i}$, the confidence score of the i-th node. The value of $P_i$ may be biased by choosing a higher value of k. To avoid any bias, we used k = 0 in our calculations. On the basis of the probability value, the weight $W_i$ of each node is determined from $P_i$ and $\sum_i P_i$, the sum of the probabilities of all nodes. The weighted selection of validators can proceed once all nodes have been assigned weights.
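The weight assignment step can be sketched as below. Two points are assumptions and are not taken from the paper's equations: the mapping from confidence score to probability is passed in as a placeholder function `prob_fn`, and the weight is taken to be the probability normalized by the sum of all probabilities, $W_i = P_i / \sum_j P_j$. The 1% floor for nodes at or below the threshold follows the description of Algorithm 1; the threshold and confidence values are hypothetical.

```python
def assign_weights(confidence, prob_fn, threshold, floor=0.01):
    """Map confidence scores to normalized selection weights.

    confidence: dict of node id -> confidence score
    prob_fn:    placeholder for the paper's linear probability function
    threshold:  scores not above this value receive the floor probability
    """
    probs = {
        node: prob_fn(score) if score > threshold else floor
        for node, score in confidence.items()
    }
    total = sum(probs.values())
    # Assumed normalization: each weight is the node's share of total probability.
    return {node: p / total for node, p in probs.items()}


# Hypothetical confidence scores; the identity function stands in for P_i.
confidence = {"S1": 0.9, "S2": 0.6, "S3": 0.1}
weights = assign_weights(confidence, prob_fn=lambda c: c, threshold=0.5)
print(weights)
```

With these inputs the low-confidence node S3 keeps a small but non-zero weight, so it can still be selected, only rarely.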

B. VALIDATOR SELECTION AND BLOCK MINING
After weight assignment, the validator nodes are selected stochastically using weighted random selection (WRS) [34], as shown in Algorithm 2. In Algorithm 2, $S_{id}$ denotes the list of sensor IDs, $S_{weight}$ denotes the list of weights of all sensors, and $V$ denotes the list of validator nodes, which is initially empty. The total number of validators to be selected is denoted by $v$, $P_i(k)$ denotes the probability that the k-th node is selected as a validator, and $W_i$ is the weight of the i-th sensor node. In line 1, $S_{id}$, $S_{weight}$, and $V$ are given as input to the algorithm. Steps 3 and 4 are repeated $v$ times to select $v$ validator nodes. The validator nodes are selected based on the formula given at step 3. Once a node is selected as a validator node, it is removed from $S - V$ and inserted into $V$. For Algorithm 2, the probability that the node with weight $W_n$ is the first validator to be selected, the corresponding probability for the second validator, and so on, are given in [34].
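Since Algorithm 2 is not reproduced here, the following is only a generic sketch of weighted selection without replacement, assuming that each draw picks a remaining node with probability proportional to its weight among the nodes not yet selected. It may differ in detail from the formulation in [34]. The function name and the sample weights are hypothetical.

```python
import random


def select_validators(sensor_ids, weights, v, rng=random):
    """Draw v distinct validators; each draw is proportional to remaining weights."""
    if len(sensor_ids) != len(weights):
        raise ValueError("sensor_ids and weights must have the same length")
    if v > len(sensor_ids):
        raise ValueError("cannot select more validators than there are sensors")
    remaining = dict(zip(sensor_ids, weights))  # candidates not yet selected
    validators = []                             # initially empty
    for _ in range(v):
        ids = list(remaining)
        chosen = rng.choices(ids, weights=[remaining[i] for i in ids], k=1)[0]
        validators.append(chosen)               # insert into the validator list
        del remaining[chosen]                   # remove from the candidate set
    return validators


# Hypothetical weights; a seeded generator keeps the run repeatable.
rng = random.Random(42)
print(select_validators(["S1", "S2", "S3", "S4"], [0.4, 0.3, 0.2, 0.1], 3, rng))
```

Because selected nodes are removed before the next draw, no node can appear twice, and a node with zero weight is never chosen while positively weighted nodes remain.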
Next, we discuss the validator selection strategy, the system defense strategy, and block mining in blockchains, which consume relatively less time and energy than other methods and provide high data security.
Validator selection: The concept behind validator selection is shown in Fig. 3. According to the data received from the sensors, the IoT edge node calculates the confidence score of each node. Nodes with high and low confidence scores have high and low probabilities of being selected as validators, respectively. When nodes are compromised by an attacker, the compromised sensor nodes begin sending falsified data to the IoT edge node. This immediately reduces the confidence scores of all compromised nodes. As the confidence scores decrease, the estimated position of the sensor node falls outside the annulus. The weights of these nodes decrease, meaning that they are less likely to be selected as validator nodes. Hence, the IoT edge node selects new nodes with high confidence scores as validator nodes. More truthful nodes have very high probabilities of selection. By selecting validator nodes with awareness of wireless channel characteristics, the data security of the system is increased.
Fig. 4 presents cases of successful and unsuccessful defense in blockchains. In the example shown in Fig. 4, three validator nodes are involved. In the successful defense case, only one validator node is compromised in the attack; the remaining validator nodes contain true data. The validator nodes send their data to the destination node, and the destination node selects the data to be stored by majority-based selection. A successful defense therefore involves block mining that hinges on majority-based selection of true data. In the unsuccessful defense case, two of the three validator nodes are compromised. In other words, the majority of the validator nodes hold falsified data and, under majority-based selection, these falsified data are used in block mining. After majority-based selection, the data selected by the destination node are updated in the blockchain, as the blockchain is distributed and transparent in nature.
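For intuition about the success criterion (fewer than half of the validators compromised), the probability can be computed in closed form for the uniform random baseline. This sketch is not the analysis of the proposed weighted scheme: it assumes that a fixed set of compromised nodes exists before selection and that validators are drawn uniformly without replacement, in which case the number of compromised validators follows a hypergeometric distribution. The function name and the example values are hypothetical.

```python
from math import comb


def p_defense_uniform(n_nodes, n_validators, n_compromised):
    """P(fewer than half of the validators are compromised), uniform selection."""
    total = comb(n_nodes, n_validators)
    favourable = 0
    for k in range(n_validators + 1):
        if 2 * k < n_validators:  # strictly fewer than half are compromised
            favourable += comb(n_compromised, k) * comb(
                n_nodes - n_compromised, n_validators - k
            )
    return favourable / total


# Three validators as in the Fig. 4 example, drawn from 5 nodes of which 2 are compromised.
print(p_defense_uniform(5, 3, 2))  # 0.7
```

With three validators, the defense succeeds when at most one of them is compromised, matching the two cases discussed for Fig. 4.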
Block mining: Block mining is performed by the destination node based on the data received from the validator nodes. The final selection of data for block storage is performed through majority-based selection. Because more truthful nodes are selected as validators, the likelihood that these nodes are falsifying data is extremely low. The complete workflow of our proposed system is shown in Fig. 5.
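Majority-based data selection can be sketched as follows. The report format (a mapping from validator ID to a reported value) and the behaviour when no value has a strict majority (returning `None`) are assumptions; the text does not specify how ties are handled.

```python
from collections import Counter


def majority_value(reports):
    """Return the value reported by more than half of the validators, else None."""
    if not reports:
        return None
    value, count = Counter(reports.values()).most_common(1)[0]
    return value if 2 * count > len(reports) else None


# Successful defense: one of three validators is compromised.
print(majority_value({"V1": "true_data", "V2": "true_data", "V3": "false_data"}))
# Unsuccessful defense: two of three validators report the same falsified data.
print(majority_value({"V1": "true_data", "V2": "false_data", "V3": "false_data"}))
```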

C. RELATIONSHIP BETWEEN TRUSTWORTHINESS AND RECEIVED SIGNAL STRENGTH OF A NODE
Algorithm 1 implies that the value of the confidence score depends on the RSSI value of the sensor node. A higher RSSI value supports higher trustworthiness of the node. The author of [35] proposed a method that uses the RSSI value to calculate the trust of a node. Fig. 6 presents this relationship based on the Neyman-Pearson hypothesis test [36]. A ROC curve illustrates the performance of a detector (binary classifier) by plotting the probability of detection ($P_d$) against the probability of false positive ($P_f$) for different values of the signal-to-noise ratio (SNR) [37]. As the SNR increases, $P_d$ increases and $P_f$ decreases, which makes it possible to identify node behaviour. This implies that nodes with higher SNR values are more trustworthy.
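The trend described for the ROC curve can be reproduced with a textbook example. The detector behind Fig. 6 is not specified in the text, so this sketch assumes the simplest Neyman-Pearson case, a known signal in Gaussian noise, for which $P_d = Q\big(Q^{-1}(P_f) - \sqrt{\mathrm{SNR}}\big)$ with $Q$ the Gaussian tail function. The function name and the SNR values are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # zero mean, unit variance


def detection_probability(p_f, snr_db):
    """P_d of a Neyman-Pearson detector for a Gaussian mean shift (assumed model)."""
    snr = 10 ** (snr_db / 10.0)                    # dB to linear scale
    threshold = _STD_NORMAL.inv_cdf(1.0 - p_f)     # Q^{-1}(P_f)
    return 1.0 - _STD_NORMAL.cdf(threshold - snr ** 0.5)


# For a fixed false-positive probability, P_d grows with the SNR.
for snr_db in (-5, 0, 5, 10):
    print(snr_db, round(detection_probability(0.1, snr_db), 3))
```

Each SNR value yields one ROC curve; sweeping `p_f` between 0 and 1 for a fixed `snr_db` traces that curve.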

D. APPLICATION SCENARIOS FOR OUR PROPOSED MECHANISM
Our proposed system model, shown in Fig. 1, can be applied to various IoT applications, such as secure data collection, trustworthy data fusion and aggregation for cooperative sensor fusion, efficient target handover, and environmental monitoring of large agricultural areas, even in the presence of malicious nodes. The use case of secure data collection using cooperative sensor data fusion is explained below.
In cooperative sensor data fusion, all the nodes send their data to the corresponding CH. The cluster head assigns the weight of each sensor node based on our proposed scheme in Algorithm 1 and sends the data to the base station. The complete workflow of the system is shown in Fig. 5. The final data storage is based on a majority scheme, so the presence of some malicious nodes does not affect data integrity. Typically, in such scenarios the CHs are the points of interest for attackers. However, our scheme does not use the CH in the validation process and relies on weight-based validator selection. This makes our system more robust even in the presence of malicious nodes.

V. MATHEMATICAL ANALYSIS
A. FORMAL ANALYSIS USING BAN LOGIC
This section presents the formal analysis of our scheme using BAN logic [38]. The analysis aims to prove the correctness and freshness of the data stored in the blockchain. First, we introduce the notation and logical postulates of BAN logic.

1) BASIC NOTATIONS OF BAN LOGIC
BAN logic has its own syntax and semantics for security proofs. The logic considers several objects: principals, encryption keys, and formulas. The principals may be people, computers, or services. The encryption keys may be shared keys, public-private key pairs, session keys, or secret keys, depending on the considered scenario. Formulas are also known as statements. We take M and N as principals, K as the shared key, SK as the session key between principals, and X as a statement. The notation is as follows:
- $M \mid\equiv X$: Principal M believes the statement X and acts as though X is true.
- $M \stackrel{SK}{\longleftrightarrow} N$: SK is the shared session key between M and N for communication.
- $M \triangleleft X$: Principal M sees statement X and can read it.
- $M \mid\sim X$: Principal M once said statement X.
- $M \Rightarrow X$: Principal M has jurisdiction over X, which means that M believes X and has authority over X.
- $\#(X)$: Statement X is fresh, which implies that X is shared for the first time in the current run of the protocol.
- $\{X\}_K$: Message X is encrypted under key K.

2) LOGICAL POSTULATES
This section discusses logical postulates used in proofs using BAN logic.
- Message meaning rule: This rule concerns the interpretation of communicated messages, i.e., how a principal derives beliefs about the origin of a message. For a shared key K, the message meaning rule is postulated as follows:
$$\frac{M \mid\equiv M \stackrel{K}{\longleftrightarrow} N,\quad M \triangleleft \{X\}_K}{M \mid\equiv N \mid\sim X}$$
It states that if principal M believes that the key K is shared with N and sees message X encrypted under K, then M believes that N once said X.
- Nonce-verification rule: This rule establishes that the message is fresh and that the sender still believes in the message X. It is postulated as follows:
$$\frac{M \mid\equiv \#(X),\quad M \mid\equiv N \mid\sim X}{M \mid\equiv N \mid\equiv X}$$
- Jurisdiction rule: This rule states that if M believes that N has jurisdiction over X, then M trusts N about the truth of statement X. It is postulated as follows:
$$\frac{M \mid\equiv N \Rightarrow X,\quad M \mid\equiv N \mid\equiv X}{M \mid\equiv X}$$
- Freshness conjunction rule: This rule states that if one part of a formula is fresh, then the whole formula must be fresh. It is postulated as follows:
$$\frac{M \mid\equiv \#(X)}{M \mid\equiv \#(X, Y)}$$
If principal M believes in the freshness of formula X, then it also believes in the freshness of the formula (X, Y).
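As a generic illustration of how these postulates chain together (this is not one of the proofs of the goals of this article), suppose that M shares key K with N, sees a message encrypted under K, believes the message to be fresh, and believes that N has jurisdiction over it. Applying the message meaning, nonce-verification, and jurisdiction rules in turn yields M's belief in X:

```latex
\begin{align*}
\text{Premises:}\quad
  & M \mid\equiv M \stackrel{K}{\longleftrightarrow} N, \quad
    M \triangleleft \{X\}_K, \quad
    M \mid\equiv \#(X), \quad
    M \mid\equiv N \Rightarrow X \\
\text{Message meaning:}\quad
  & M \mid\equiv N \mid\sim X \\
\text{Nonce verification:}\quad
  & M \mid\equiv N \mid\equiv X \\
\text{Jurisdiction:}\quad
  & M \mid\equiv X
\end{align*}
```

Each line uses only the premises and the conclusion of the line before it, which is the pattern a BAN proof of origin, freshness, and trust typically follows.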

3) METHOD
In the BAN logic scenario, we have principals that wish to communicate with each other but do not trust each other. A server has jurisdiction over the keys, and both principals trust it. The server helps the principals establish trusted communication based on three major considerations:
- Verification of message origin
- Verification of message freshness
- Verification of the origin's trustworthiness

4) GOALS OF AUTHENTICATION
This section discusses the goals we want to prove using BAN logic in our scenario. BAN logic focuses on the proof of good and fresh data. In our case, we want to store true and fresh data in the blockchain. The base station (BS) performs a weighted selection of validator nodes (V) after receiving data from all the cluster heads (CH). The smart contract (SC) selects the majority-based data (X) from the validator nodes for storage in the blockchain (BC). Based on the considered scenario, four goals are defined in this article as follows: Goal 1 defines the trust between the cluster head and the sensor node. Goal 2 defines the trust between the blockchain and the validator node. Goal 3 defines the trust of BC in the stored X, and Goal 4 defines the freshness of the data X finally stored in BC.

5) ASSUMPTIONS
These define the keys initially shared between principals, which principals generate new nonces, and the ways in which principals are trusted [38]. Assumptions are made to guarantee the success of the protocol and act as premises for the logic analysis. We define eight assumptions, $A_1$ to $A_8$: $A_1$ to $A_4$ correspond to the first set of BAN logic, and $A_5$ to $A_8$ correspond to the second set. $A_1$ defines the shared key $SK$ between CH and BS, and $A_2$ defines the shared key $SK$ between S and BS. $A_3$ and $A_4$ define the freshness of the timestamps shared between (CH, BS) and (S, CH), respectively. $A_5$ and $A_6$ define the shared keys $K_{VBS}$ and $K_{SCBC}$ between (V, BS) and (BC, SC), respectively, together with the belief in these keys.

6) COMMUNICATED MESSAGES
These are defined by the Kerberos protocol [39], which is based on the shared-key Needham-Schroeder protocol [40]. Timestamps serve as nonces for verifying message freshness. The messages are defined in two hierarchical layers, as shown in Fig. 7. First, a timestamp $T_S$ is defined between the sensor node (S) and the target (T). The clocks of the different units of the system are considered fully synchronized so that the freshness of data can be maintained and confirmed.

The first layer of messages covers the communication between the base station (BS), the cluster head (CH), and the sensor node (S). CH and S act as principals, and BS acts as the server that defines the keys between them. Whenever it intends to establish a communication link with S, CH initiates a communication request to the BS using Message 1.

To maintain synchronization between the two BAN scenarios, $T_S$ is first shared between BS and V. When the target position changes, the parameters (RSSI, $C_s$) of the sensor nodes change. This results in the expiry of $T_S$ and in the base station selecting a new set of validators. As in BAN 1, in the second layer of messages the validator (V) and the blockchain (BC) are the principals, and the smart contract (SC) is the server, because SC selects the majority-based data ($X$) for storage in BC. In this second set of BAN logic, SC has a twofold responsibility: along with establishing the key between V and BC, it selects the majority-based data $X$ from the data it receives from the validators and encrypts it with the key $K_{SCBC}$ so that only SC and BC can read it, as shown in Messages 6 and 7. The messages of BAN 2 are defined below:

7) IDEALIZED FORM OF THE PROPOSED SCHEME
A message in idealized form is called a formula. In an idealized form, the message is presented in encrypted form rather than cleartext form. The idealized form of the messages defined in the previous section is presented below:

This is the conclusion of the analysis of Message 2. CH passes Message 3 to S along with the timestamp. S can decrypt it with the knowledge of key $K_{CS}$. Using the same logical steps used for Message 2, the analysis conclusion of Message 3 is shown below:

Applying message meaning and nonce-verification to equations (13) and (14), we get

Message 4 assures that S believes CH and got the latest message from CH. The final results from the above analysis are as follows:

From the final results of the analysis, it is concluded that there is now a direct belief between the cluster head and the sensor node. Both believe that only CH and S can see the data and that it is always fresh, as a timestamp is used to verify its freshness. This proves our defined goal 1.

Similar steps from equations (5) to (13) can be repeated, and the conclusion of the analysis of Message 6 is shown below:

V passes Message 7 to BC with the timestamp. BC can decrypt it with the key $K_{VBC}$, and the analysis conclusion of Message 7 is shown below:

Applying message meaning and nonce-verification to equations (16) and (17), we get

Message 8 assures that BC believes V and got a fresh message from V. The final results from the analysis of Messages 5-8 are as below:

From the final results of the analysis, it is concluded that there is now a direct belief between the validator node (V) and the blockchain (BC). Both believe that only V and BC can see the data and that it is always fresh, as a timestamp is used to verify its freshness. This proves our defined goal 2.
From the conclusion of BAN 2, we have

In Message 7, $\{X\}_{K_{SCBC}}$ is forwarded to BC by V. Because BC has access to $X$ with key $K_{SCBC}$, we can write

Applying the message meaning rule to (19) and (20), we get

According to Message 8, before starting communication, BC cross-checks the freshness of the data ($X$) and the key ($K$) it got from V. Therefore, BC believes that $X$ is fresh.

Because BC knows that SC has jurisdiction over $X$, i.e.,

which means that

Applying the jurisdiction rule to (23) and (24), we get

We have thus proved goals 3 and 4 in (25) and (22), respectively. This completes the mathematical analysis of BAN logic.
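The chain of inferences behind goals 3 and 4 can be replayed mechanically. The sketch below is not the formal proof given in this article: it encodes three BAN rules as Python functions over a hypothetical tuple encoding of statements, and it takes BC's belief in the freshness of $X$ as a premise standing in for the check described for Message 8. All function and tuple names are assumptions introduced for illustration.

```python
def message_meaning(facts, p, q, key, x):
    """p believes key is shared with q, and p sees {x}_key  =>  p believes q said x."""
    if ("believes", p, ("shared_key", key, q)) in facts and ("sees", p, ("enc", x, key)) in facts:
        facts.add(("believes", p, ("said", q, x)))


def nonce_verification(facts, p, q, x):
    """p believes x is fresh, and p believes q said x  =>  p believes q believes x."""
    if ("believes", p, ("fresh", x)) in facts and ("believes", p, ("said", q, x)) in facts:
        facts.add(("believes", p, ("believes", q, x)))


def jurisdiction(facts, p, q, x):
    """p believes q controls x, and p believes q believes x  =>  p believes x."""
    if ("believes", p, ("controls", q, x)) in facts and ("believes", p, ("believes", q, x)) in facts:
        facts.add(("believes", p, x))


def derive_bc_beliefs():
    # Premises: a hypothetical encoding of the assumptions and of Messages 7 and 8.
    facts = {
        ("believes", "BC", ("shared_key", "K_SCBC", "SC")),
        ("sees", "BC", ("enc", "X", "K_SCBC")),
        ("believes", "BC", ("fresh", "X")),
        ("believes", "BC", ("controls", "SC", "X")),
    }
    message_meaning(facts, "BC", "SC", "K_SCBC", "X")
    nonce_verification(facts, "BC", "SC", "X")
    jurisdiction(facts, "BC", "SC", "X")
    return facts


facts = derive_bc_beliefs()
print("Goal 3 (BC believes X):", ("believes", "BC", "X") in facts)
print("Goal 4 (BC believes X is fresh):", ("believes", "BC", ("fresh", "X")) in facts)
```

Each rule only adds a conclusion when both of its premises are already present, so removing any premise from the initial set blocks the derivation of goal 3.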

B. DEFENSE FAILURE ANALYSIS OF THE SYSTEM
This section presents an analysis of defense failures under different conditions. In existing blockchain systems, if more than 50% of the nodes are compromised, the entire blockchain is compromised. In our proposed system, only a few nodes are selected as validators, and the selection is weighted. More trustworthy nodes are therefore more likely to be selected, resulting in a more secure system. Our system retains a considerable probability of successful defense even when the attack capability exceeds 50%. The core idea of our formulation is from [9]; the difference is that we propose confidence score-based weighted validator selection rather than the random selection proposed in [9].
• Obscured capacity $o_c$: This represents the number of nonvalidator nodes in the network, that is, the nodes that do not affect system security even if they are tampered with by the attacker. The higher the value of $o_c$, the lower the probability of defense failure.
• Residual capacity $r_c$: This represents the number of validator nodes that can still be compromised after the attacker has compromised $V^*$ nodes.

Here, $N$ is the number of sensors in the network, $V$ is the number of validator nodes selected, $V^*$ is the lower limit of system failure and is equivalent to $V/2$, and $C_A$ is the attack capability (i.e., the number of nodes tampered with by an attacker).
Therefore, according to [9], the probability of defense failure $f_d$ is formulated in (28) based on the residual capacity, the obscured capacity, the attack capability, and the number of validator nodes. The different cases of (28) are explained below:
• $C_A < V^*$: $f_d$ is always equal to 0 because the attack capability is less than the lower limit of system failure. Therefore, the system is fully secure.
• $V^* < C_A < V$ and $r_c < o_c$: The probability of defense failure is the sum, over all distinct possible combinations, of the probability that more than half of the $V$ validator nodes are compromised.
• $C_A > V$ and $r_c < o_c$: The attack capability exceeds the total number of validator nodes, and the probability of defense failure increases further.
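The boundary cases above can be checked numerically for the random-selection baseline. The sketch below is not equation (28). It assumes that validators are drawn uniformly at random without replacement, so that the number of compromised validators follows a hypergeometric distribution, and that defense fails when strictly more than half of the validators are compromised. It also reads the prose definitions as $o_c = N - V$ and $r_c = C_A - V^*$ with $V^* = V/2$, which is an interpretation. All function names are hypothetical.

```python
from math import comb


def defense_failure_random(n, v, c_a):
    """Probability that a strict majority of v uniformly chosen validators
    falls among the c_a compromised nodes out of n nodes in total."""
    if not (0 < v <= n and 0 <= c_a <= n):
        raise ValueError("require 0 < v <= n and 0 <= c_a <= n")
    first_failing = v // 2 + 1  # smallest compromised count that is a strict majority
    ways = sum(
        comb(c_a, k) * comb(n - c_a, v - k)  # comb returns 0 when v - k > n - c_a
        for k in range(first_failing, min(v, c_a) + 1)
    )
    return ways / comb(n, v)


def capacities(n, v, c_a):
    """Assumed reading of the definitions: o_c = N - V and r_c = C_A - V/2."""
    return n - v, c_a - v / 2


n, v = 100, 20
for c_a in (5, 15, 50, 95):
    o_c, r_c = capacities(n, v, c_a)
    print(f"C_A={c_a:3d}  o_c={o_c}  r_c={r_c}  f_d={defense_failure_random(n, v, c_a):.4f}")
```

Under these assumptions, with $N = 100$ and $V = 20$, the failure probability is exactly 0 whenever $C_A < V^* = 10$ and exactly 1 whenever $r_c > o_c$ (that is, $C_A > 90$), which matches the qualitative cases listed above.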

VI. RESULTS AND DISCUSSION
Simulation environment: To evaluate the security of the proposed stochastic blockchain with weighted validator selection, the Google Colaboratory and MATLAB platforms were used. We considered a random distribution of sensor nodes, each with an omnidirectional antenna, and a single target node T. The network diameter $d_0$ is 300 m. The simulation parameters are presented in Table 1, and the notation used in the paper is described in Table 2.

A. VALIDATOR PERFORMANCE
1) EFFECT OF THE PERCENTAGE OF VALIDATOR NODES
Fig. 8 presents a comparison of the probability of successful defense ($S_d$) of the weighted stochastic blockchain system under varying proportions of validator nodes $V$ and varying percentages of attack capability ($C_A$) for a total of 100 nodes. When $C_A \le 50\%$, a higher proportion of validators yields a higher probability of successful defense, because compromising more than 50% of the validator nodes then requires compromising a greater number of nodes. However, once $C_A$ exceeds 50%, the trend reverses: when the proportion of validator nodes is excessively high (e.g., 70% or 90%), the system begins behaving like a regular blockchain in which every node validates, and the validators are easily attacked. Therefore, maintaining a lower percentage of validator nodes is recommended. When $V$ is 30%, the probability of successful defense is still 20% even when $C_A$ is 65%.
2) EFFECT OF THE TOTAL NUMBER OF NODES
Fig. 9 presents a comparison of the probability of successful defense ($S_d$) of the weighted stochastic blockchain system for different total numbers of nodes in a network with 20 validator nodes ($V$). Herein, $C_A$ varies from 0 to 100. Increasing $N$ from 100 to 125 yields an $S_d$ that is approximately 40% higher for the same attack capability ($C_A = 60$). Increasing $N$ further to 150 results in an $S_d$ that is approximately 75% higher than when $N = 100$, at $C_A = 70$.
If $N$ increases from 100 to 200, for the same $C_A$, $S_d$ is approximately 85%. From these results, we can conclude that for the same number of validators, as the total number of nodes increases, $S_d$ also increases. Therefore, a system with a greater number of nodes is more secure than one with fewer nodes.

B. PERFORMANCE COMPARISON OF THE RANDOM AND WEIGHTED SELECTION OF VALIDATOR NODES
1) EFFECT OF THE PERCENTAGE OF VALIDATOR NODES
$S_d$ is higher for the weighted validator selection than for the random validator selection. When the proportion of validators is lower, weighted selection sustains a successful defense up to a higher $C_A$. The weighted validator scheme is more secure, providing a comparatively high $S_d$ even when the proportion of validator nodes is high. When $C_A = 60\%$ and $V = 20\%$, the percentages of successful defense for weighted and random validator selection, $S_d^W$ and $S_d^R$, are 40% and 10%, respectively. When $C_A = 50\%$ and $V = 80\%$, $S_d^W = 90\%$ and $S_d^R = 40\%$.

2) EFFECT OF THE TOTAL NUMBER OF NODES
When the number of nodes increases to 150, $S_d^W$ is up to 45% higher than $S_d^R$ for $C_A = 80$. Also, for $N = 200$, $S_d^W$ is 25% higher than $S_d^R$ when $C_A = 80\%$. In short, $S_d^W$ is always higher than $S_d^R$, and $S_d$ increases as the number of nodes increases for the same number of validator nodes and the same attack capability.

3) PERFORMANCE COMPARISON OF DIFFERENT CONSENSUS ALGORITHMS
Fig. presents a comparative analysis of $S_d$ for a total of 100 nodes with 20 validator nodes under weighted and random selection of validator nodes. For PoS and PBFT, the tolerated attacker power is less than 51% and 33%, respectively [40]. The graph shows that as the attacker capacity increases, $S_d$ for PoS and PBFT falls abruptly to zero beyond their threshold values, whereas random and weighted validator selection still retain a considerable $S_d$. Further, weighted selection of validators yields a higher probability of successful defense than random selection.

C. COMPARISON OF THE WEIGHTED AND RANDOM SELECTION OF VALIDATOR NODES FOR DIFFERENT PARAMETERS WITH REGARD TO PERCENTAGE SUCCESSFUL DEFENSE ($S_d$)
This section discusses the minimal, average, and maximal probability of successful defense ($S_d$) for different parameters. Fig. 11 presents a comparison of $S_d$ for the weighted and random selection of validator nodes. As the attack capability increases, $S_d$ decreases, and this reduction is greater for random validator selection than for weighted validator selection. When the maximal $C_A$ is 70%, weighted and random selection of validator nodes yield an $S_d$ of 8% and less than 1%, respectively.
Fig. 13 presents a comparison of the percentage of successful defense for random node selection ($S_d^R$) and weighted node selection ($S_d^W$) under varying numbers of total nodes. As the total number of nodes increases, the $V/N$ and $C_A/N$ ratios decrease, and both $S_d^R$ and $S_d^W$ increase.
Fig. 14 presents a comparison of $S_d^R$ and $S_d^W$ under varying numbers of nodes, with the $V/N$ and $C_A/N$ ratios kept constant for all values of $N$. Here too, $S_d$ increases with the total number of nodes. This suggests that even for the same proportion of validator nodes and the same attack capability, networks with a larger number of nodes are more likely to mount a successful defense than networks with fewer nodes. The results demonstrate that the $V/N$ and $C_A/N$ ratios are not the only factors influencing $S_d$; the total number of nodes also increases this probability.
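The effect of network size at fixed ratios can be illustrated for the random-selection baseline. The sketch below is not the simulation behind Fig. 14. It assumes uniform random selection without replacement (a hypergeometric model), counts a defense as successful when at most half of the validators are compromised, and fixes $V/N = 20\%$ and $C_A/N = 40\%$. The function name and the chosen values of $N$ are hypothetical.

```python
from math import comb


def success_random(n, v, c_a):
    """P(at most half of v uniformly chosen validators are compromised)."""
    # comb(c_a, k) is 0 when k > c_a, and comb(n - c_a, v - k) is 0 when
    # v - k > n - c_a, so impossible draws contribute nothing to the sum.
    ok = sum(comb(c_a, k) * comb(n - c_a, v - k) for k in range(0, v // 2 + 1))
    return ok / comb(n, v)


for n in (100, 200, 400):
    v, c_a = n // 5, 2 * n // 5  # V/N = 20%, C_A/N = 40%
    print(f"N={n}  V={v}  C_A={c_a}  S_d={success_random(n, v, c_a):.3f}")
```

Under this model the success probability rises with $N$ because a larger validator set concentrates the fraction of compromised validators more tightly around $C_A/N$, which lies below one half here.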

D. COMPARISON OF DIFFERENT TYPES OF BLOCKCHAINS
Table 3 presents a comparison of different types of blockchains and their relevant parameters. In regular blockchains, all nodes act as validator nodes; by contrast, in stochastic blockchains, only a few selected nodes act as validators. In regular blockchains and in stochastic blockchains with random validator selection, every node is equally likely to be selected as a validator. In blockchains with weighted validator selection, this probability is based on the confidence score of the node. Regarding attack capability, when more than half of the nodes in the network are compromised, regular blockchains have a 0% probability of successful defense. By contrast, stochastic blockchains with random validator selection have a 40% probability, and stochastic blockchains with weighted validator selection have a probability of up to 80%.
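The difference between equal-probability and confidence-score-based selection can be sketched with a small Monte Carlo experiment. The code below is an illustration under stated assumptions, not the simulation used in this article: the confidence scores (0.9 for honest nodes, 0.3 for compromised nodes) are hypothetical stand-ins for the RSSI-derived scores, validators are drawn without replacement in proportion to the remaining weights, and a defense counts as successful when at most half of the validators are compromised. The resulting numbers are therefore not expected to reproduce Table 3.

```python
import random


def pick_validators(weights, v, rng):
    """Draw v distinct node indices, each draw proportional to the remaining weights."""
    remaining = list(range(len(weights)))
    chosen = []
    for _ in range(v):
        idx = rng.choices(remaining, weights=[weights[i] for i in remaining], k=1)[0]
        remaining.remove(idx)
        chosen.append(idx)
    return chosen


def success_rate(weights, compromised, v, trials, rng):
    """Fraction of trials in which at most half of the validators are compromised."""
    wins = 0
    for _ in range(trials):
        bad = sum(1 for i in pick_validators(weights, v, rng) if i in compromised)
        wins += bad <= v / 2
    return wins / trials


n, v, c_a = 100, 20, 60
compromised = set(range(c_a))  # nodes 0..59 are compromised
uniform = [1.0] * n            # random selection: equal probability for every node
scores = [0.3 if i in compromised else 0.9 for i in range(n)]  # hypothetical scores

rng = random.Random(0)
print("random  :", success_rate(uniform, compromised, v, 500, rng))
print("weighted:", success_rate(scores, compromised, v, 500, rng))
```

The gap between the two rates depends entirely on how well the scores separate honest from compromised nodes; with identical scores the weighted scheme reduces to random selection.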

VII. CONCLUSION
This article has proposed a data security mechanism for resolving data integrity and trust issues by integrating IoT with blockchain technology. Selecting only a few nodes as validators mitigates problems concerning computational cost, energy, and latency. The security of data within this system is strengthened through the prudent weighted selection of validator nodes. Both excessively high and excessively low proportions of validator nodes reduce the probability of successful defense against an attack. When the number of validator nodes is low, an attack on a single validator node contributes to a high level of compromise. When validator nodes are overly numerous, they are easily located and attacked. Moreover, when the proportion of validator nodes is high, the system begins behaving like a regular blockchain. Thus, the proportion of validator nodes must be selected very carefully to maximize data security and the probability of successful defense. In our simulations, a system with a larger number of nodes was consistently more secure than a system with a smaller number of nodes. In this study, weighted validator selection almost doubled the probability of successful defense. It has been demonstrated that a weighted stochastic blockchain is more secure than a random stochastic blockchain under the same set of parameters. With the same proportion of validator nodes and the same attack capability, stochastic blockchains with random validator selection correspond to a 40% probability of successful defense, whereas stochastic blockchains with weighted validator selection correspond to an 80% probability.
As for future work, mobile wireless IoT requires fast and efficient security and consensus mechanisms; UAVs and self-driving cars are typical application scenarios. Our proposed lightweight stochastic blockchain mechanism therefore holds great potential for extension to these dynamic IoT environments.

Algorithm 1: Proposed Confidence Score Based Weight Assignment Algorithm.
Function Weight assignment
Input (map, [RSSI, Loc])
    $d_0$ = Diameter(map)
    [$Loc_T$, $d_{err}$] = Distributed target localization
    $d_{I_j}$ = Euclidean distance ($Loc_T$, $Loc_{I_j}$)
    $d_{s_i}$ = Euclidean distance ($Loc_T$, $Loc_{s_i}$)
    for k = 1:N do
        Estimate annulus by calculating [$d^{min}_{s_i}$, $d^{max}_{s_i}$]
        Calculate confidence score as

FIGURE 4. Illustration of the attacker scenario for successful and failed defense of the blockchain, with an example of $V = 3$.

FIGURE 5. Illustration of the complete workflow of the system model for authentication, key selection and validator selection.

• $\xrightarrow{K} M$: Key $K$ is a public key of $M$.
• $M \stackrel{K}{\leftrightarrow} N$: Principals $M$ and $N$ use the shared key $K$ for communication.

FIGURE 7. BAN logic message flow based on Kerberos protocol.

In Message 1, CH shares the node IDs of CH and S with BS as parameters. In Message 2, the BS shares the timestamp $T_S$ and the session length $L$ (in our case, the lifetime of the session depends on the change in the RSSI value). Then, in Message 3, CH shares the message with sensor S along with $T_S$, $K_{CHS}$, and the encrypted message shared by BS for S. Finally, in Message 4, S verifies the freshness of CH with the timestamp and the received key. The messages of BAN 1 are defined below:
Message 1. CH → BS: CH, S.
Message 2. BS → CH: {$T_S$, $L$, $K_{CHS}$, S,

• $C_A < V$ and $r_c > o_c$: In this case, the attacker can always compromise more than half of the validator nodes. Therefore, $f_d$ is always equal to 1.
• $C_A > V$ and $r_c > o_c$: The attacker can always compromise more than half of the validator nodes because $r_c > o_c$. Therefore, $f_d$ is always equal to 1.

FIGURE 10. Comparison of $S_d^R$ and $S_d^W$ for different percentages of validator nodes with 100 nodes in the network.

Fig. 10 presents a comparison of the probability of successful defense for the proposed weighted selection of validator nodes ($S_d^W$) and the random selection of validator nodes ($S_d^R$) for varying proportions of validator nodes ($V$) and varying percentages of attack capability ($C_A$) for a total of 100 nodes. The percentage of successful defense ($S_d$) is higher for weighted than for random validator selection.

FIGURE 11. Comparative analysis of $S_d$ for different numbers of nodes in the network with $V = 20$ for random and weighted selection of validators.

Fig. 11 presents a comparison of the probability of successful defense $S_d$ for weighted and random validator selection for varying numbers of total nodes in the network, with $V = 20$ and $C_A$ varying from 0 to 100%. The probability of successful defense for weighted validator selection, $S_d^W$, is 18% higher than $S_d^R$ for random validator selection at $C_A = 40\%$ and $N = 100$.

FIGURE 13. Comparison table for random and weighted selection of validator nodes with different attacker capacity ($N = 100$, $V = 20\%$).

FIGURE 14. Comparison of the random and weighted selection of validator nodes for varying numbers of nodes ($V = 20\%$, $C_A = 40\%$).