Fault-Tolerant Data Aggregation Scheme Supporting Fine-Grained Linear Operation in Smart Grid

Smart grid combines traditional power system engineering with information and communication technology and provides users with convenient services through real-time data updates. Multi-dimensional data aggregation makes statistical analysis of electricity information more flexible. However, most existing multi-dimensional data aggregation schemes require the participation of a trusted third party and do not support fault tolerance. In this paper, we propose a fault-tolerant data aggregation scheme supporting fine-grained linear operations in smart grid. Firstly, we use the Chinese remainder theorem to encode the user's multi-dimensional data and the corresponding weights. Secondly, we construct a privacy-preserving data aggregation scheme without a trusted third party by combining the Paillier homomorphic encryption scheme and a secure key agreement protocol. Finally, we use the extended Shamir secret sharing scheme to construct a fault-tolerant data aggregation scheme that supports the reuse of shared key shares. Security analysis shows that our scheme satisfies semantic security and protects user data privacy. Experimental results show that, compared with existing multi-dimensional data aggregation schemes that require a trusted third party, our scheme does not incur additional computation or communication overhead.


I. INTRODUCTION
Smart grid is a new generation of power grid that combines traditional power grid technology with information and communication technology to achieve efficient power generation, transmission, distribution, and control services [1], [2], [3]. Smart grid will play an increasingly important role in meeting user needs, improving data reliability, and ensuring energy control and management [4]. In recent years, smart grid has been developed rapidly in more and more countries.
The associate editor coordinating the review of this manuscript and approving it for publication was Salvatore Favuzza.
Smart grid monitors and analyzes the operation status of the entire grid mainly through the real-time collection of user data. However, the collected data can directly reveal users' private behavior. For example, if a user consumes no electricity during a certain time of day, it can be inferred that no one is home during that time. If criminals obtain these data, they can carry out illegal activities at those times. On the other hand, if power companies have access to the data, they can use big data technology to analyze it and improve the quality of service for customers. Clearly, exposing users' electricity consumption data poses a threat to users' privacy [5]. Therefore, protecting the private information of smart grid users from being leaked is particularly important [6].
Security and privacy are two important aspects of the smart grid. How to monitor regional power consumption without disclosing individual users' power consumption has become a research direction for scholars. Homomorphic encryption schemes ensure that algebraic operations on ciphertext are equivalent to direct operations on plaintext. Therefore, homomorphic encryption schemes have a wide range of applications in the privacy protection of smart grid data. Data aggregation uses the homomorphic properties of ciphertext in homomorphic encryption to aggregate multiple ciphertexts into one ciphertext to save bandwidth and reduce delay. Therefore, to solve the privacy protection problem in the smart grid, the idea of combining data aggregation and a homomorphic encryption algorithm is proposed.
With the development of science and technology and the Internet of Things, the computing efficiency of smart meters keeps improving, and the terminal's demand for user data is becoming more and more detailed. A series of works have been presented to protect the privacy of users in the smart grid. Three properties are particularly important in these privacy-preserving smart grid scenarios: no trusted third party, fine-grained linear operation, and fault tolerance. The early schemes require the participation of a trusted third party. However, in real life it is difficult to find a trusted third party, and even if one exists it is expensive, so data aggregation schemes without a trusted third party are particularly important. In [10], [12], [13], [14], [20], [21], and [31], the authors propose aggregation schemes without a trusted third party by using a secure key agreement protocol, which reduces the running cost of the system. In [8], [14], [21], [22], and [31], the authors propose aggregation schemes that allow users to flexibly join and leave the smart grid, which reduces the operating cost of new households joining the smart grid. In [7], [8], [9], [14], [15], [20], [23], and [31], the authors propose multidimensional data aggregation schemes. Multidimensional data (such as power consumption, time of use, and variance) provide a more detailed basis for analysis, so that the terminal can better develop its electricity consumption strategy. In real life, a user's smart meter may fail, and the terminal hopes that a faulty smart meter will not affect the operation of the whole system. Therefore, in [10], [21], [23], [24], and [28], the authors propose fault-tolerant aggregation schemes, which allow the terminal to decrypt correctly when a smart meter is faulty.
In the literature [15], the authors propose an aggregation scheme that supports linear homomorphic operations. The scheme enables the terminal to give different weights to different smart meters according to their needs for flexible statistical analysis.
These schemes address different needs of the smart grid. However, none of them supports fine-grained linear homomorphic operations. With the rapid development of smart grids, the terminal's demand for statistical data is becoming more and more detailed, and the terminal places more and more functional requirements on the scheme. For example, a power company may want to calculate the total price of electricity that should be paid in a certain area, which requires a linear calculation over the electricity consumption information of the residents. Therefore, we propose a fault-tolerant data aggregation scheme that supports fine-grained linear operation. Our scheme allows the terminal to decrypt correctly when a user's smart meter fails, without a trusted third party. In addition, most cities have implemented step-by-step billing rules. Our scheme can calculate the electricity charge of interval electricity consumption, which makes it convenient for the power company to check its accounts and prevents internal personnel from tampering with users' ladder electricity consumption data and causing economic losses to the power company. As a result, our scheme is more practical.
Our contributions. 1). We apply the Chinese remainder theorem to encode the user's multidimensional data and the corresponding multidimensional weights into one-dimensional data and one-dimensional weights. The linear operation on the encoded one-dimensional data and one-dimensional weight is equivalent to the linear operation on the multidimensional data and the corresponding weights, and the data of each dimension remain independent of each other during terminal decryption.
2). Based on the Paillier homomorphic encryption scheme and bilinear mapping, we construct a homomorphic data aggregation scheme that protects user privacy and supports batch verification. In addition, we use a secure key agreement protocol to construct a data aggregation scheme that does not require a trusted third party.
3). We use the extended Shamir secret sharing scheme to construct a data aggregation scheme with fault tolerance. When a user's smart meter fails, our scheme can achieve correct decryption. At the same time, our scheme does not reveal the secret shares of faulty users, which can prevent malicious attacks that intentionally compromise smart meters.
The remaining sections of this paper are arranged as follows. In section II, we review existing works. In section III, we introduce the theoretical knowledge used in the article. In section IV, we introduce the system model, threat model, and design goals of our scheme. In section V, we present our scheme in detail. In section VI, we analyze our scheme. In section VII, we conduct an experimental evaluation of the scheme. In section VIII, we draw our conclusions.

II. RELATED WORK
In this section, we review related work on privacy protection schemes in the smart grid. In 2010, Li et al. [32] proposed a privacy-preserving data aggregation method that combines an aggregation tree with a homomorphic encryption algorithm. The method ensures, with minimum communication overhead, that all devices take part in the aggregation while no device can obtain the intermediate aggregation results. In 2018, Lyu et al. [26] proposed a fog computing aggregation scheme, which uses fog nodes to collect and transmit data for efficient processing and calculation. This method greatly improves the efficiency of terminal computing in privacy-preserving data aggregation schemes. Therefore, in recent years, methods based on fog computing have been widely used.
Multidimensional data contains multiple information about users, which can facilitate power companies to analyze the whole smart grid system in more detail and provide better services for users. At present, most schemes to achieve multi-dimensional data aggregation are mainly through the combination of super-increasing sequences and homomorphic encryption algorithm [7], [8], [30]. However, constructing multidimensional data using super-increasing sequences leads to an exponential increase in computational overhead and is less efficient. In the literature [25], [27], and [33], the author constructed a multidimensional data aggregation scheme by encrypting data of each dimension and generating different ciphertexts respectively. However, this method will cause a waste of corresponding resources in computation overhead and communication overhead. In literature [23], the author constructs a multidimensional data aggregation scheme through a special coding method. But this method gives each dimension a certain length. This coding method will cause a waste of space and is not convenient for flexible adjustment of the number of dimensions. In literature [14], the author used the Chinese remainder theorem to construct a multidimensional data aggregation scheme. The user's multidimensional data was encoded into one-dimensional data, which reduced the computational overhead during encryption and ensured that each dimension was independent of the other during decryption.
Compared with the above schemes, we propose a fault-tolerant data aggregation scheme that supports fine-grained linear operation. We use the Chinese remainder theorem to encode the user's multidimensional data. Owing to the linear homomorphism of the Chinese remainder theorem, the terminal can successfully restore the result of the linear operation in each dimension when decrypting. In addition, we use an extended Shamir secret sharing scheme to achieve fault tolerance. In Table 1, we compare the existing schemes from six aspects: Data Confidentiality (DaC), No Trusted Authority (NTA), Dynamic Users (DyU), Multidimensional Data (MD), Linear Operation (LO), and Fault Tolerance (FT). As can be seen from Table 1, our scheme is more functional.

III. PRELIMINARIES
Our scheme is based on the Chinese remainder theorem, the Paillier homomorphic encryption scheme, bilinear mapping, secret sharing schemes, and combinatorial mathematics. Next, we describe the preliminary knowledge used in our scheme.

A. CHINESE REMAINDER THEOREM
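Let $b_1, b_2, \cdots, b_n$ be pairwise coprime positive integers, and let $x_1, x_2, \cdots, x_n$ and $y_1, y_2, \cdots, y_n$ each be $n$ integers. Then the congruence equations $X \equiv x_i \pmod{b_i}$, $1 \leqslant i \leqslant n$, have the following properties [17]: the solution is unique modulo $\prod_{i=1}^{n} b_i$ and can be expressed as $X = \sum_{i=1}^{n} x_i B_i B_i' \bmod \prod_{i=1}^{n} b_i$, where $B_i = \prod_{t=1}^{n} b_t / b_i$ and $B_i' = B_i^{-1} \bmod b_i$. If $Y$ is the corresponding solution for $y_1, y_2, \cdots, y_n$, then $X + Y \equiv x_i + y_i \pmod{b_i}$ and $X \cdot Y \equiv x_i \cdot y_i \pmod{b_i}$, so the linear operation of the corresponding dimension can be achieved by using the properties of the Chinese remainder theorem.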
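The per-dimension behaviour of the encoding can be checked with a minimal Python sketch. The moduli and the helper names `crt_encode` and `crt_decode` are illustrative assumptions, not part of the scheme, and the values are kept small enough that each per-dimension result stays below its modulus.

```python
from math import prod

# Hypothetical pairwise-coprime moduli b_1..b_n, chosen only for illustration.
MODULI = [101, 103, 107]


def crt_encode(residues, moduli=MODULI):
    """Return the unique X in [0, prod(moduli)) with X = residues[i] (mod moduli[i])."""
    big_b = prod(moduli)
    x = 0
    for r, b in zip(residues, moduli):
        b_i = big_b // b              # B_i = product of all moduli except b
        b_i_inv = pow(b_i, -1, b)     # B_i' = B_i^{-1} mod b (Python 3.8+)
        x += r * b_i * b_i_inv
    return x % big_b


def crt_decode(value, moduli=MODULI):
    """Recover the per-dimension residues of an encoded value."""
    return [value % b for b in moduli]


xs = [3, 5, 7]
ys = [2, 4, 6]
X, Y = crt_encode(xs), crt_encode(ys)

# Adding or multiplying the encoded values acts dimension by dimension.
sum_per_dim = crt_decode(X + Y)
prod_per_dim = crt_decode(X * Y)
```

Here `sum_per_dim` equals the element-wise sums of `xs` and `ys`, and `prod_per_dim` equals their element-wise products, as long as every result is smaller than the corresponding modulus.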

B. PAILLIER HOMOMORPHIC ENCRYPTION
The Paillier public key encryption algorithm [18] is a popular homomorphic encryption scheme that supports homomorphic addition.
-Encryption. For any plaintext $m \in \mathbb{Z}_N$, select a random integer $r$ with $0 < r < N$ and $r \in \mathbb{Z}_{N^2}^*$, that is, $r$ has a multiplicative inverse modulo $N^2$. Calculate the ciphertext $c = g^m r^N \bmod N^2$.
-Homomorphic addition. For any plaintexts $m_1, m_2 \in \mathbb{Z}_N$, integers $r_1, r_2$ are randomly selected to satisfy $0 < r_1, r_2 < N$ and $r_1, r_2 \in \mathbb{Z}_{N^2}^*$. After encryption we obtain $E(m_1) \cdot E(m_2) = g^{m_1} r_1^N \cdot g^{m_2} r_2^N = g^{m_1 + m_2} (r_1 r_2)^N \bmod N^2 = E(m_1 + m_2)$. That is, decrypting the product of the ciphertexts yields the sum of the plaintexts. Therefore, the Paillier homomorphic encryption algorithm is mainly used in scenarios of plaintext accumulation with privacy protection.
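The additive property can be illustrated with a toy Python sketch. Key generation and decryption are not spelled out in this section, so the standard Paillier construction with $g = N + 1$ is assumed here; the two primes are small known Mersenne primes used only for illustration and are far too small for real use.

```python
import math
import secrets

# Toy primes (assumption): 2^31 - 1 and 2^61 - 1 are known Mersenne primes.
P, Q = 2**31 - 1, 2**61 - 1


def keygen(p, q):
    """Standard Paillier key generation with g = N + 1 (assumed, not from the text)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p - 1, q - 1)
    g = n + 1
    mu = pow(lam, -1, n)   # with g = N + 1, L(g^lam mod N^2) = lam mod N
    return (n, g), (lam, mu)


def encrypt(pub, m):
    """c = g^m * r^N mod N^2 for a random r coprime to N."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1   # 0 < r < N
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    """m = L(c^lam mod N^2) * mu mod N, with L(x) = (x - 1) / N."""
    n, _ = pub
    lam, mu = priv
    l_value = (pow(c, lam, n * n) - 1) // n
    return (l_value * mu) % n


pub, priv = keygen(P, Q)
c1, c2 = encrypt(pub, 12), encrypt(pub, 30)

# Multiplying ciphertexts adds the underlying plaintexts.
c_sum = (c1 * c2) % (pub[0] ** 2)
total = decrypt(pub, priv, c_sum)
```

Decrypting `c_sum` gives the sum of the two plaintexts, even though neither plaintext was decrypted on its own.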

C. BILINEAR PAIRING
Let $G_1, G_2$ be cyclic groups of large prime order $p$, on which a pairing $e : G_1 \times G_1 \to G_2$ is defined to meet the following conditions [19]:
-Bilinearity. For any $g, h \in G_1$ and $a, b \in \mathbb{Z}_p$, $e(g^a, h^b) = e(g, h)^{ab}$.
-Non-degeneracy. If $g$ is a generator of $G_1$, then $e(g, g) \neq 1$.
-Computability. For any $g, h \in G_1$ and $a, b \in \mathbb{Z}_p$, there is an efficient polynomial-time algorithm to calculate the values $e(g^a, h^b)$ and $e(g, h)^{ab}$.

D. SECRET SHARING SCHEME
In this section, we present the Shamir secret sharing scheme and its extension.

1) SHAMIR SECRET SHARING SCHEME
Shamir secret sharing scheme [29] is a threshold secret sharing scheme based on the Lagrange interpolation theorem, which mainly includes secret distribution and secret reconstruction.

a: SECRET DISTRIBUTION STAGE
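Let $s \in \mathbb{Z}_p$ (where $p$ is a large prime) be the secret information to be shared. The distributor randomly selects coefficients $a_1, a_2, \cdots, a_{k-1} \in \mathbb{Z}_p$, constructs the polynomial $f(x) = s + a_1 x + a_2 x^2 + \cdots + a_{k-1} x^{k-1} \bmod p$, and computes $y_i = f(x_i)$ for $1 \leqslant i \leqslant n$, where $x_i$ is the ID information of the $i$-th user, $k$ is the threshold, and $n$ is the total number of users sharing the secret. Finally, the distributor sends $y_i$ to the corresponding user as that user's secret share.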

b: SECRET RECONSTRUCTION STAGE
The reconstructor collects the shares $y_i$ of $k$ users, and the secret information $s$ can be restored using Lagrange interpolation: $s = f(0) = \sum_{i=1}^{k} y_i \prod_{j=1, j \neq i}^{k} \frac{x_j}{x_j - x_i} \bmod p$. With the shares of fewer than $k$ users, the polynomial $f(x)$ cannot be restored, so the secret information $s = f(0)$ cannot be obtained. Therefore, the Shamir secret sharing scheme is resistant to collusion attacks by fewer than $k$ sharing users.
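The distribution and reconstruction stages can be sketched in a few lines of Python. This is a minimal illustration under assumed parameters: the field modulus is the Mersenne prime $2^{61} - 1$, the user IDs are small integers, and the helper names `share` and `reconstruct` are hypothetical.

```python
import secrets

P = 2**61 - 1   # a known Mersenne prime, used here as the field modulus p


def share(secret, k, ids, p=P):
    """Split `secret` into shares y_i = f(x_i) of a random degree-(k-1) polynomial."""
    coeffs = [secret % p] + [secrets.randbelow(p) for _ in range(k - 1)]

    def f(x):
        acc = 0
        for a in reversed(coeffs):   # Horner evaluation of f(x) mod p
            acc = (acc * x + a) % p
        return acc

    return {x: f(x) for x in ids}


def reconstruct(shares, p=P):
    """Recover f(0) from a dict {x_i: y_i} by Lagrange interpolation at zero."""
    s = 0
    for x_i, y_i in shares.items():
        num, den = 1, 1
        for x_j in shares:
            if x_j != x_i:
                num = (num * x_j) % p
                den = (den * (x_j - x_i)) % p
        s = (s + y_i * num * pow(den, -1, p)) % p
    return s


all_shares = share(123456789, k=3, ids=[1, 2, 3, 4, 5])

# Any 3 of the 5 shares are enough to recover the secret.
subset = {x: all_shares[x] for x in (1, 3, 5)}
recovered = reconstruct(subset)
```

Note that reconstruction in this basic form requires the users to hand over the shares themselves, which is the limitation the extended scheme addresses.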

2) EXTENDED SHAMIR SECRET SHARING SCHEME
When the Shamir secret sharing scheme is used for secret reconstruction, a sharing user's secret share is disclosed as soon as the user provides it. Therefore, Wu et al. [24] proposed an extended Shamir secret sharing scheme, based on the Shamir secret sharing scheme, that supports the reuse of secret shares.

a: SECRET DISTRIBUTION STAGE
First, the distributor randomly selects a large prime number $p$ and two large prime factors $u, v$ of $p - 1$, and calculates $N = uv$. In the finite field $GF(p)$, a generator $g$ of order $N$ is selected. Then, the distributor shares the secret information $s$ modulo $N$ and sends $y_i$ to the relevant participant (as in the Shamir secret sharing scheme). If the distributor later wants to share a new secret message $s'$ at the time $T_s$, the distributor calculates $\Delta_{T_s} = s' - g^{r \cdot s} \bmod p$, where $r = H(T_s)$ is the time-dependent blind factor and $H$ is a hash function. Finally, the distributor publishes $\Delta_{T_s}$.

b: SECRET RECONSTRUCTION STAGE
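When the reconstructor needs to reconstruct the secret information $s'$, it collects the contributions of $k$ users. In this case, the sharing users do not provide their secret shares $y_i$ but the equivalent blind shares $Y_i = g^{r \cdot y_i} \bmod p$. The reconstructor computes $\prod_{i=1}^{k} Y_i^{L_i} = g^{r \sum_{i=1}^{k} y_i L_i} = g^{r (s + tN)} = g^{r \cdot s} \bmod p$, where $L_i = \prod_{j \neq i} x_j (x_j - x_i)^{-1} \bmod N$ is the Lagrange coefficient, $t \in \mathbb{Z}$, and $g^N = 1 \bmod p$, and then adds the value published by the distributor for the time $T_s$ to obtain $s'$. Thus, the secret information $s'$ can be reconstructed without disclosing the shares.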
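The reuse of shares can be illustrated with a toy Python sketch of the blinded reconstruction. All parameters are illustrative assumptions: $u = 11$, $v = 13$, $p = 859$ (so that $uv$ divides $p - 1 = 858$), SHA-256 stands in for $H$, and the function names are hypothetical. The sketch assumes the published value has the form $s' - g^{r \cdot s} \bmod p$.

```python
import hashlib
import secrets

# Toy parameters (assumptions): 11 and 13 are prime factors of p - 1 = 858.
U, V, P = 11, 13, 859
N = U * V


def find_generator_of_order_n():
    """Find g in Z_p^* whose multiplicative order is exactly N."""
    for h in range(2, P):
        g = pow(h, (P - 1) // N, P)
        if g != 1 and pow(g, U, P) != 1 and pow(g, V, P) != 1:
            return g
    raise ValueError("no element of order N found")


G = find_generator_of_order_n()


def blind_factor(time_slot):
    """r = H(T_s), with SHA-256 standing in for the hash function H."""
    digest = hashlib.sha256(time_slot.encode()).digest()
    return int.from_bytes(digest, "big") % N


def share_mod_n(secret, k, ids):
    """Shamir shares of `secret` computed modulo N instead of modulo a prime."""
    coeffs = [secret % N] + [secrets.randbelow(N) for _ in range(k - 1)]
    return {x: sum(a * pow(x, e, N) for e, a in enumerate(coeffs)) % N for x in ids}


def lagrange_at_zero(ids, i):
    """Lagrange coefficient for ID i, computed modulo N (differences must be units)."""
    num, den = 1, 1
    for j in ids:
        if j != i:
            num = (num * j) % N
            den = (den * (j - i)) % N
    return (num * pow(den, -1, N)) % N


def publish_offset(new_secret, base_secret, time_slot):
    """Public value s' - g^(r*s) mod p for the new secret s'."""
    return (new_secret - pow(G, blind_factor(time_slot) * base_secret, P)) % P


def blind_share(y_i, time_slot):
    """Blind share Y_i = g^(r*y_i) mod p; the raw share y_i is never sent."""
    return pow(G, blind_factor(time_slot) * y_i, P)


def reconstruct(blinded, offset):
    """Interpolate in the exponent, then add the published offset."""
    ids = list(blinded)
    acc = 1
    for i in ids:
        acc = (acc * pow(blinded[i], lagrange_at_zero(ids, i), P)) % P
    return (offset + acc) % P


s, s_new, T = 57, 421, "2023-01-01T00:00"
shares = share_mod_n(s, k=3, ids=[1, 2, 3, 4, 5])
offset = publish_offset(s_new, s, T)
blinded = {i: blind_share(shares[i], T) for i in (1, 2, 4)}
recovered = reconstruct(blinded, offset)
```

Because only $g^{r \cdot y_i}$ leaves each user and $r$ changes with the time slot, the same shares can serve a later reconstruction with a different blind factor.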

IV. SYSTEM DESIGN
In this section, we introduce the structure of smart grid system from three aspects: system model, threat model, and design goals.
A. SYSTEM MODEL
In our scheme, there are three main participants: smart meters (SM), fog nodes (FN), and the control center (CC). Figure 1 shows the information transmission process among the three participants. Their roles and functions in the system are as follows.
-Smart meter (SM). Smart meters are smart devices that power companies install in customers' homes. An SM is responsible for collecting and encrypting the user's electricity data and then transmitting the encrypted data to the nearest FN.
-Fog node (FN). FN has powerful storage and computing capabilities. It collects, processes, and aggregates data from smart meters. Then, FN transmits the aggregated data to CC.
-Control center (CC). CC is responsible for receiving the data from FN, verifying it, and decrypting it. Then, CC processes this data. By analyzing the decrypted results, CC can learn the operation status of the whole smart grid in real time, check whether the system is faulty, adjust the operation strategy, etc.

B. THREAT MODEL
In our scheme, all parties are semi-honest. Typically, participants will process the data exactly as the protocol requires. At the same time, participants will try to obtain sensitive information about users. We assume that the attacker has the following capabilities.
-An attacker can intercept the communication data among SM, FN, and CC and try to obtain the user's sensitive information from these data.
-An attacker can invade FN's and CC's databases, to steal the user's data and related parameters. The attacker attempts to recover the user's sensitive information from the data.
-In order to obtain the sensitive information of a specific user, the attacker can cooperate with some users to conduct a collusive attack. In addition, the attacker can also forge the relevant user identity, inject false information into the system and destroy the integrity of the system.

C. DESIGN GOALS
Our goal is to design a privacy-preserving data aggregation scheme that supports linear operations on multidimensional data. Specifically, our scheme aims to achieve the following functions.
-Security. User data should be safe in the whole system transmission process. Specifically, it includes user data confidentiality, integrity, and terminal accessibility. External attackers cannot tamper with or forge user data and the system can check the validity of user identities.
-Privacy. Users' electricity data is sensitive. The electricity consumption data of an individual user must not be obtainable by any participant, so that users' private information cannot leak through their electricity data.
-Practicality. The designed scheme can realize the linear operation of user multidimensional data. No trusted third party involvement is required in the entire process. In addition, considering the limited computing power of smart meters, the designed encryption algorithm should be efficient.
-Fault tolerance. In actual situations, some smart meters may fail, leading to incorrect decryption, so the scheme should have a certain degree of fault tolerance.

V. OUR SCHEME
In this section, we introduce our scheme in detail. Figure 2 shows the flow chart of our scheme. Table 2 lists the acronyms and symbols used in our scheme and their meanings.

A. SCHEME CONSTRUCTION

1) INITIALIZATION STAGE
Step 1: Given the security parameter $k$, the system randomly selects two large prime numbers $p, q$. For the user set $U = \{u_1, u_2, \cdots, u_n\}$, the system generates the related parameters $(G_1, G_2, e, h, \omega_i)$, where the groups $G_1, G_2$ satisfy the bilinear relation $e : G_1 \times G_1 \to G_2$ and $h$ is a generator of the group $G_1$.
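Step 2: CC calculates $N = p \cdot q$ and $\lambda = \mathrm{lcm}(p - 1, q - 1)$, and selects $g = N + 1$ and a large prime $Q = \varepsilon N + 1$, where $\varepsilon$ is a small integer. CC chooses $l$ pairwise coprime large primes $B = \{b_1, b_2, b_3, \cdots, b_l\}$ such that $b_j \geqslant n \, d_{max} \, \omega_{max}$ for $1 \leqslant j \leqslant l$, where $d_{max}$ is the upper bound of the user's single-dimension data, $\omega_{max}$ is the upper bound of the user's single-dimension weight, and $n$ is the number of users, and computes $B_j = \prod_{t=1}^{l} b_t / b_j$ and $B_j' = B_j^{-1} \bmod b_j$. CC randomly selects $n$ blind factors $\beta_i \in_R \mathbb{Z}_N^*$ and assigns them to the users, and chooses the hash functions $H_1 : \{0, 1\}^* \to G_1$ and $H_2 : \{0, 1\}^* \to \mathbb{Z}_N^*$. Meanwhile, CC assigns a weight $\omega_i$ to each user according to its needs, where $\omega_i$ satisfies $\omega_i \equiv \omega_{i,j} \pmod{b_j}$ for $1 \leqslant j \leqslant l$, denoted as $\omega_i \xleftarrow{CRT} (\omega_{i,1}, \omega_{i,2}, \omega_{i,3}, \cdots, \omega_{i,l})$, where $l$ is the number of dimensions and $\omega_{i,j}$ is the weight of the data in dimension $j$. Then CC discloses the relevant parameters $l, N, Q, g, G_1, G_2, h, e, H_1, H_2, \varepsilon, B, B_j, B_j'$ and retains the private key $\lambda$.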
Step 3: Each user $u_i \in U$ registers by randomly choosing $x_i \in_R \mathbb{Z}_N^*$ as its private key and computing $Y_i = h^{x_i}$ as its public key. FN then registers by randomly choosing $x_f \in_R \mathbb{Z}_N^*$ as its private key and computing $Y_f = h^{x_f}$ as its public key.
Step 4: User $u_i \in U$ randomly selects $R_{ij} \in_R \mathbb{Z}_{N^2}^*$, where $1 \leqslant j \leqslant n - 1$. The values $\{R_{i1}, \cdots, R_{i,n-1}\}$ form the user's shared key set. Through the secure channel, user $u_i$ sends $R_{ij}$ to user $u_j$ and receives the shared keys of the other users, which form the key set $\{R_{1i}, \cdots, R_{ni}\}$. The user's complete key set $\{R_{i1}, \cdots, R_{i,n-1}, R_{1i}, \cdots, R_{ni}\}$ is recorded as $k_i$, and the user $u_i \in U$ uses this key set to calculate the encryption key $sk_i$. To illustrate this, an example with three users is given in Figure 3.
Step 5: The user $u_i \in U$ splits his private key $sk_i$ into two parts $sk_{i,1}$ and $sk_{i,2}$, satisfying $sk_i = sk_{i,1} + sk_{i,2}$. The user $u_i$ randomly selects $n'$ ($n' \leqslant n$) users in this area and shares $sk_{i,1}$ with these $n'$ users by using the extended Shamir secret sharing scheme modulo $N$. The share of each user is denoted as $y_{*,i}$ (where $*$ stands for a generic user, and $y_{*,i}$ is the share of the first part $sk_{i,1}$ of the user $u_i$'s private key held by the corresponding user). At the same time, $u_i$ sends $sk_{i,2}$ to CC using the secure channel. At the time interval $T_s$, the user $u_i$ calculates the time-dependent public parameter required by the extended Shamir secret sharing scheme. In addition, two linear functions are defined.

2) REPORTING PHASE
When the user $u_i \in U$ needs to send smart meter data to FN at the time interval $T_s$, the user collects the multidimensional data $(m_{i,1}, m_{i,2}, \cdots, m_{i,l})$ and encodes it with the Chinese remainder theorem into the one-dimensional data $m_i$. Then, the user encrypts the message $m_i$ as $c_i = g^{\omega_i \cdot m_i} H_2(T_s)^{N (sk_i + \beta_i)} \bmod N^2$ and generates the signature $\sigma_i = H_1(ID_i \| T \| c_i)^{x_i}$ with his private key $x_i$, where $T$ is the current time, which is used to defend against replay attacks. After that, the user sends the data $(c_i, \sigma_i, T, ID_i)$ to the nearest FN.

3) READING PHASE
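Firstly, FN verifies the legitimacy and integrity of the messages received from the users by checking $e(\sigma_i, h) = e(H_1(ID_i \| T \| c_i), Y_i)$. To improve the efficiency of verification, FN can verify the legitimacy and integrity of the current user set in batch by checking $e(\prod_{i=1}^{n} \sigma_i, h) = \prod_{i=1}^{n} e(H_1(ID_i \| T \| c_i), Y_i)$. If the verification fails, FN asks the user to resend the data. Secondly, after the batch verification is passed, FN aggregates the data as $c = \prod_{i=1}^{n} c_i \bmod N^2 = g^{\sum_{i=1}^{n} \omega_i \cdot m_i} H_2(T_s)^{N \sum_{i=1}^{n} (sk_i + \beta_i)} \bmod N^2$, where the exponent of $g$ corresponds to the linear function $f(\omega, m)$ of the users' weights and data.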
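The correctness of the signature check follows from bilinearity. The short derivation below assumes the signature form $\sigma_i = H_1(ID_i \| T \| c_i)^{x_i}$ and the public key $Y_i = h^{x_i}$ from the earlier phases, and it assumes that the verification equation takes the usual pairing form shown on its first and last lines.

```latex
\begin{align*}
e(\sigma_i, h)
  &= e\big(H_1(ID_i \| T \| c_i)^{x_i},\, h\big) \\
  &= e\big(H_1(ID_i \| T \| c_i),\, h\big)^{x_i} \\
  &= e\big(H_1(ID_i \| T \| c_i),\, h^{x_i}\big)
   = e\big(H_1(ID_i \| T \| c_i),\, Y_i\big), \\
e\Big(\prod_{i=1}^{n} \sigma_i,\, h\Big)
  &= \prod_{i=1}^{n} e(\sigma_i, h)
   = \prod_{i=1}^{n} e\big(H_1(ID_i \| T \| c_i),\, Y_i\big).
\end{align*}
```

Under this form, checking $n$ signatures one by one costs $2n$ pairings, while the batch check costs $n + 1$.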

4) DECRYPTION PHASE
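After receiving the report from FN, CC first verifies FN's signature $\sigma_f$ by checking the pairing equation whose left-hand side is $e(\sigma_f, h)$ against FN's public key $Y_f$. Then, CC calculates $\beta_0$ satisfying $\sum_{i=1}^{n} \beta_i + \beta_0 = 0 \bmod \lambda$ to decrypt the ciphertext $c$ using its private key $\lambda$.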
Then, CC reduces the decrypted aggregate modulo each $b_j$, where $1 \leqslant j \leqslant l$. In this way, the sum of the linear operations of all users in each dimension can be obtained, which supports CC in conducting dynamic analysis and adjusting the power supply strategy.
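The reporting, reading, and decryption phases can be traced end to end with a toy Python sketch that omits signatures. Several points are assumptions rather than the scheme's definitions: the key agreement is modelled as $sk_i = \sum_j R_{ij} - \sum_j R_{ji}$ so that the keys sum to zero, SHA-256 stands in for $H_2$, decryption is modelled as removing the mask with $\beta_0$ and applying $L(x) = (x - 1)/N$, and the primes and moduli are small illustrative values. Negative exponents in `pow` require Python 3.8 or later.

```python
import hashlib
import math
import secrets
from math import prod

# Toy parameters (all values are illustrative assumptions).
P, Q = 2**31 - 1, 2**61 - 1                        # known Mersenne primes for p, q
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lambda = lcm(p - 1, q - 1)
G = N + 1
B_MODULI = [101, 103, 107]                         # b_j >= n * d_max * w_max = 3 * 10 * 3
BIG_B = prod(B_MODULI)


def crt_encode(values):
    """Encode one value per dimension into a single integer via the CRT."""
    total = 0
    for v, b in zip(values, B_MODULI):
        b_j = BIG_B // b
        total += v * b_j * pow(b_j, -1, b)
    return total % BIG_B


def h2(time_slot):
    """Hash a time slot to an element coprime to N (stand-in for H_2)."""
    counter = 0
    while True:
        digest = hashlib.sha256(f"{time_slot}|{counter}".encode()).digest()
        value = int.from_bytes(digest, "big") % N
        if value > 1 and math.gcd(value, N) == 1:
            return value
        counter += 1


def make_keys(n):
    """Assumed key agreement: sk_i = sum_j R_ij - sum_j R_ji, so the sk_i sum to 0."""
    r = [[secrets.randbelow(N) if i != j else 0 for j in range(n)] for i in range(n)]
    return [sum(r[i]) - sum(r[j][i] for j in range(n)) for i in range(n)]


def encrypt(weights, data, sk_i, beta_i, time_slot):
    """c_i = g^(w_i * m_i) * H_2(T_s)^(N * (sk_i + beta_i)) mod N^2."""
    w, m = crt_encode(weights), crt_encode(data)
    mask = pow(h2(time_slot), N * (sk_i + beta_i), N2)
    return (pow(G, w * m, N2) * mask) % N2


def aggregate(ciphertexts):
    """FN multiplies the ciphertexts modulo N^2."""
    c = 1
    for c_i in ciphertexts:
        c = (c * c_i) % N2
    return c


def decrypt(c, betas, time_slot):
    """CC picks beta_0 with sum(betas) + beta_0 = 0 mod lambda, then unmasks."""
    beta_0 = (-sum(betas)) % LAM
    unmasked = (c * pow(h2(time_slot), N * beta_0, N2)) % N2
    aggregate_plain = (unmasked - 1) // N           # L(x) = (x - 1) / N
    return [aggregate_plain % b for b in B_MODULI]


weights = [[1, 2, 3], [2, 1, 1], [3, 3, 2]]        # one row of weights per user
data = [[4, 10, 0], [7, 2, 5], [1, 9, 6]]          # one row of readings per user
T = "slot-0"
sks = make_keys(3)
betas = [secrets.randbelow(N - 1) + 1 for _ in range(3)]
cts = [encrypt(weights[i], data[i], sks[i], betas[i], T) for i in range(3)]
per_dim = decrypt(aggregate(cts), betas, T)
```

Each entry of `per_dim` is the weighted sum over all users for one dimension, recovered from a single aggregated ciphertext.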

B. FAULT TOLERANCE
1) THEORETICAL ANALYSIS
To study the probability of smart meter failure and successful recovery, we assume that the smart meters are independent of each other and that the failure rate of each smart meter is $\theta$.
As long as at least $k$ of the $n$ meters work, we can recover the equivalent ciphertext by the Shamir secret sharing scheme. Therefore, the probability of successfully recovering the equivalent ciphertext is $P = \sum_{i=k}^{n} \binom{n}{i} (1 - \theta)^i \theta^{n-i}$. Furthermore, we present two examples to illustrate the probability of successful recovery. When $\theta = 3\%$, $k = 3$, $n = 5$, we have $P = 99.97\%$. When $\theta = 5\%$, $k = 13$, $n = 20$, we have $P = 99.99\%$.
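The two examples can be reproduced numerically. The sketch below assumes independent meter failures with rate $\theta$ and evaluates the probability that at least $k$ of $n$ meters work as a binomial tail; the function name `recovery_probability` is illustrative.

```python
from math import comb


def recovery_probability(n, k, theta):
    """Probability that at least k of n independent meters work (failure rate theta)."""
    return sum(
        comb(n, i) * (1 - theta) ** i * theta ** (n - i)
        for i in range(k, n + 1)
    )


p_small = recovery_probability(n=5, k=3, theta=0.03)
p_large = recovery_probability(n=20, k=13, theta=0.05)
```

With these inputs, `p_small` rounds to 99.97% and `p_large` exceeds 99.99%.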

2) DETAILS OF FAULT TOLERANCE
At the time $T_s$, FN may be unable to collect the information of some users because their electric meters are damaged, which would lead to incorrect decryption. Our scheme is fault-tolerant and uses the extended Shamir secret sharing scheme to achieve correct decryption. To illustrate the fault tolerance of our scheme, assume that at the time $T_s$ the user $u_r \in U$ fails and cannot send his information to FN. To complete decryption, FN collects the secret shares that the user $u_r$ shared with other users. After the user $u_v$ ($v = 1, 2, 3, \cdots, n$) receives the request, it calculates $c_{r,v} = H(T_s)^{y_{v,r}} \bmod Q$ (where $y_{v,r}$ is the secret share of the user $u_r$ held by the user $u_v$). FN collects at least $k$ valid blinded shares of the user $u_r$, denoted as $U_r$, and combines them by Lagrange interpolation in the exponent into $c_r$. At the same time, the public parameter $\Delta_{r,T_s}$ of the user $u_r$ is used to calculate $c_r' = c_r + \Delta_{r,T_s}$, while CC computes the second part $H_2(T_s)^{N \cdot sk_{r,2}} \bmod N^2$ and sends it to FN for the aggregation calculation.

3) CORRECTNESS
The correctness of fault tolerance: CC updates β 0 to satisfy decrypts to obtain where (1 ⩽ j ⩽ l). End proof.

C. DYNAMIC USER MANAGEMENT
Our scheme supports dynamic user joining and revocation. A user failure can be treated as a user revocation.

1) USER REVOKE
When the user $u_{n'} \in U$ is revoked, FN broadcasts the identity $ID_{n'}$ of the revoked user. Each remaining user removes from its own key set the shared key $R_{n'i}$ received from the revoked user and the shared key $R_{jn'}$ sent to that user. The remaining users then update their sets of shared keys and recalculate the private key used in encryption.

2) USER JOIN
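When the user $u_{i'}$ joins, FN broadcasts the identity information $ID_{i'}$ of the new user. Then, the other users update their key sets as in the initialization stage. CC randomly generates a blind factor $\beta_{i'} \in_R \mathbb{Z}_N^*$, sends it to the user, and updates its own decryption blind factor $\beta_0$ so that the sum of all blind factors, including $\beta_{i'}$ and $\beta_0$, remains $0 \bmod \lambda$. At the same time, CC assigns the weight $\omega_{i'}$ to the user $u_{i'}$.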

D. APPLICATION BASED ON THE SCHEME
The step electricity price, also called the step-increasing electricity price, divides average household electricity consumption into several steps or grades, each with its own price. After the oil crisis in the 1970s, Japan, South Korea, and some parts of the United States adopted the step pricing system for residential electricity. The less electricity used, the lower the price, while the more electricity used, the higher the price. In real life, almost all regions have implemented some form of ladder pricing.
Step electricity consumption is generally divided into three types: industrial and commercial electricity consumption, agricultural electricity consumption, and residential electricity consumption. Each type follows a different step charging standard. Implementing step-increasing prices for residential electricity consumption can improve energy efficiency, and segmented pricing realizes differentiated pricing across market segments. Establishing a tiered pricing mechanism of ''the more you use, the more you pay'' helps to form a social consensus on energy conservation and emission reduction and promotes the construction of a resource-conserving and environment-friendly society. Therefore, real-time statistics of electricity consumption and electricity charges in a region are valuable for the company to conduct dynamic analysis and adjust the electricity price. At the same time, collecting users' electricity bills in real time can prevent internal staff from tampering with the recorded power consumption of different user types at different ladder levels.
In order to better illustrate the practicability of our scheme, as shown in Figure 4, we briefly illustrate the step accounting process with specific examples. It is assumed that there are three types of electricity consumption in a region: industrial and commercial electricity consumption, agricultural electricity consumption, and residential electricity consumption, and each type of electricity consumption executes different charging standards. Using our linear homomorphic operation scheme, the electricity consumption of all users of different types in the local region can be counted. Different types of user billing ladder is k 1 , k 2 , k 3 three stages. Firstly, the initialization phase is carried out to complete the key negotiation and the allocation of the billing unit price  price, flexibly adjusting the charging strategy, saving energy, etc.
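The billing itself is a linear operation: the per-tier consumption of a user plays the role of the multidimensional data and the unit prices play the role of the weights. The plaintext Python sketch below shows only this arithmetic, without encryption; the prices, tier limits, and user readings are hypothetical values chosen for illustration.

```python
# Hypothetical unit prices per tier k1, k2, k3 for each user type.
PRICES = {
    "industrial": [8, 10, 12],
    "agricultural": [3, 4, 5],
    "residential": [5, 6, 8],
}

# Hypothetical upper bounds of tiers k1 and k2; tier k3 is everything above.
TIER_LIMITS = [200, 400]


def split_into_tiers(consumption, limits=TIER_LIMITS):
    """Split a total consumption into the amounts falling into each tier."""
    tiers = []
    lower = 0
    for upper in limits:
        tiers.append(max(0, min(consumption, upper) - lower))
        lower = upper
    tiers.append(max(0, consumption - lower))
    return tiers


def bill(user_type, tiers):
    """Linear operation: sum of unit price times consumption over the tiers."""
    return sum(price * amount for price, amount in zip(PRICES[user_type], tiers))


users = [("residential", 350), ("industrial", 500), ("agricultural", 100)]
bills = [bill(kind, split_into_tiers(usage)) for kind, usage in users]
regional_total = sum(bills)
```

In the scheme, the same weighted sums would be obtained by the control center from the aggregated ciphertext rather than from individual readings.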

VI. SCHEME ANALYSIS
A. SEMANTIC SECURITY OF ENCRYPTED DATA
In our scheme, each user encrypts the electricity consumption as $c_i = g^{\omega_i \cdot m_i} H_2(T_s)^{N (sk_i + \beta_i)} \bmod N^2$ at the time $T_s$ and submits it to FN for data aggregation. We show that this encryption is semantically secure with the following lemma.
Lemma 1: For given messages $m_0$ and $m_1$, the corresponding ciphertexts are indistinguishable.
Proof: Setup. The challenger obtains the parameters of the system. The attacker A obtains the public key of the system.
Ciphertext query. The attacker $\mathcal{A}$ inputs a plaintext message $m_x$, and the challenger returns the corresponding ciphertext $c_x$ to the attacker $\mathcal{A}$. The attacker $\mathcal{A}$ can query the ciphertexts corresponding to different plaintexts multiple times.
Challenge. When the queries are finished, the attacker $\mathcal{A}$ sends two messages $m_0$ and $m_1$ to the challenger, where $|m_0| = |m_1|$. The challenger randomly selects $b \in \{0, 1\}$ and returns the ciphertext $c_b$ of the message $m_b$.
Guess. The attacker $\mathcal{A}$ outputs its guess $b' \in \{0, 1\}$. If $b' = b$, the attacker $\mathcal{A}$ wins the game. The advantage of the attacker in winning the game is defined as $Adv_{\mathcal{A}}^{CPA}(\kappa) = \left| \Pr[b' = b] - \frac{1}{2} \right|$. If for any polynomial-time attacker there exists a negligible function $\varepsilon(\kappa)$ such that $Adv_{\mathcal{A}}^{CPA}(\kappa) \leqslant \varepsilon(\kappa)$, the scheme is said to be semantically secure.
In our scheme, the challenger randomly selects $b \xleftarrow{\$} \{0, 1\}$ and encrypts the message $m_b$. For the attacker $\mathcal{A}$, the computed $H_2(T_s)$ is indistinguishable from an element sampled uniformly from $\mathbb{Z}_{N^2}$. By the indistinguishability of the Paillier encryption scheme, $H_2(T_s)$ plays the role of the randomly chosen $r$ ($0 < r < N$, $r \in \mathbb{Z}_{N^2}^*$) in the Paillier encryption scheme. Therefore, the ciphertext $c_b$ ($b \in \{0, 1\}$) is also indistinguishable from an element sampled uniformly from $\mathbb{Z}_{N^2}$, and the attacker's guess $b'$ is a blind guess. Hence the attacker's advantage $Adv_{\mathcal{A}}^{CPA}(\kappa)$ is negligible. In other words, the attacker $\mathcal{A}$ cannot distinguish the messages $m_0$ and $m_1$, and our scheme is semantically secure. End proof.

B. PRIVACY-PRESERVING
We analyzed our scheme of privacy protection through the following situation.
Case 1: Even if the attacker compromises the FN and obtains all messages sent by SM to the FN, the attacker still cannot obtain the sensitive information of a single user.
Analysis: The attacker destroys FN and obtains all the messages sent by SM to FN. Since the key of our scheme is generated through a secure key negotiation protocol, the attacker cannot infer the private key of any single user through this information. Because our scheme is semantically secure, no one can get a plaintext message through ciphertext without knowing the key. Therefore, even if the attacker destroys the FN, he cannot obtain any individual user's sensitive information.
Case 2: Even if the attacker compromises the CC, the attacker still cannot obtain any single user's sensitive information.
Analysis: The attacker compromises the CC and obtains the aggregated ciphertext, which the attacker can decrypt to get the aggregated plaintext. Since these plaintexts are aggregates and our scheme is semantically secure, the attacker still cannot derive any individual user's sensitive information from them. Therefore, our scheme can protect the sensitive information of a single user.
Case 3: The attacker intentionally disables some SMs, so the FN cannot collect the data of these SMs. The scheme uses the extended Shamir secret sharing scheme to achieve correct decryption, and the attacker cannot obtain any useful information in this process.
Analysis: We use the extended Shamir secret sharing scheme to achieve fault tolerance. The user $u_i$ divides the private key $sk_i$ into two parts: one part is secret-shared and the other is saved by the CC. In the process of secret reconstruction, after receiving the request, the user $u_v$ ($v = 1, 2, 3, \cdots, n$) calculates the equivalent ciphertext $c_{r,v} = H(T_s)^{y_{v,r}} \bmod Q$ of the secret share, where $y_{v,r}$ is the share held by user $u_v$ of the faulty user $u_r$'s secret. The original share is not exposed, so the secret share is reusable. At the same time, all the users in our scheme are semi-honest, so the $k$ users taking part in the reconstruction do not collude. Therefore, no useful information is revealed during the reconstruction process.
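The idea of reconstructing from blinded shares can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the scheme's exact construction: it assumes a toy safe prime $Q = 2q + 1$, shares taken modulo $q$, a base $h$ of order $q$ standing in for $H(T_s)$, and Lagrange interpolation carried out in the exponent. The function names and parameters are hypothetical.

```python
import random

# Toy parameters (assumptions): Q = 2q + 1 is a safe prime, so the squares
# modulo Q form a subgroup of prime order q.
Q = 2039
q = 1019

def make_shares(secret, k, n, rng):
    """Split `secret` (mod q) into n Shamir shares with threshold k."""
    coeffs = [secret % q] + [rng.randrange(q) for _ in range(k - 1)]
    return {v: sum(c * pow(v, j, q) for j, c in enumerate(coeffs)) % q
            for v in range(1, n + 1)}

def lagrange_at_zero(v, ids):
    """Lagrange coefficient of point v for interpolation at x = 0, mod q."""
    num, den = 1, 1
    for w in ids:
        if w != v:
            num = num * (-w) % q
            den = den * (v - w) % q
    return num * pow(den, -1, q) % q

def blind_share(h, share):
    """Equivalent ciphertext of a share: h^share mod Q (the share stays hidden)."""
    return pow(h, share, Q)

def reconstruct_in_exponent(blinded):
    """Combine k blinded shares {v: h^y_v} into h^secret mod Q."""
    ids = list(blinded)
    result = 1
    for v, c in blinded.items():
        result = result * pow(c, lagrange_at_zero(v, ids), Q) % Q
    return result
```

Because only $h^{y_v}$ leaves each user, the combiner learns $h^{\text{secret}}$ for the current base but not the shares themselves, which is why the same shares can be reused with a different base in a later round.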

VII. PERFORMANCE EVALUATION
In this section, we evaluate our scheme in terms of computational overhead and communication overhead. Our experiment is based on the JPBC library. The computer used is configured with an AMD R7-5800H CPU @ 3.2 GHz and 16 GB RAM, and the operating system is Windows 11. We use IntelliJ IDEA 2022.2.3 (Community Edition) and the OpenJDK 64-Bit Server VM to run Java programs. The relevant parameters of the bilinear mapping are shown in Table 3, and the sizes of the selected parameters are shown in Table 4. We use $T_e$ to represent the computational cost of an exponentiation modulo $N^2$, $T_m$ to represent the computational cost of a multiplication modulo $N^2$, and $T_a$ to represent the computational cost of an addition modulo $N^2$. $T_\sigma$ represents the computational cost of a bilinear mapping signature, and $T_v$ represents the computational cost of a bilinear mapping verification. $T_{h_1}$ represents the computational cost of mapping $\{0, 1\}^*$ to $G_1$; $T_{h_2}$ represents the computational cost of mapping $\{0, 1\}^*$ to $\mathbb{Z}^*_N$. The running times of these operations are shown in Table 5.

A. COMPUTATION OVERHEAD
In our scheme, the computing cost required by SM is … and by CC is $T_v + T_m + T_e + T_{h_2}$, so the total computing cost is $3T_e + (n + 1)$ …
In the scheme [7], the calculation cost is related to the dimension. To facilitate calculation, the dimension is set to 10. The calculation cost required by SM is $11T_e + 10T_m + T_\sigma + T_{h_1}$, by FN is $nT_v + (n - 1)T_m + T_\sigma + T_{h_1}$, and by CC is $T_v + T_m + T_e$, so the total calculation cost is $12T_e + (n + 10)T_m + 2T_\sigma + (n + 1)T_v + 2T_{h_1} = 14.58n + 427.19$ ms.
In the scheme [15], the calculation cost required by SM is $2T_m + T_a + T_e + T_\sigma + T_{h_1}$, by FN is $nT_v + (n - 1)T_e + (n - 1)T_m + T_\sigma + T_{h_1}$, and by CC is $T_v + T_m + T_e$, so the total calculation cost is $(n + 1)T_e + (n + 2)T_m + 2T_\sigma + (n + 1)T_v + 2T_{h_1} + T_a = 42.99n + 113.68$ ms.
In the scheme [16], considering that common electrical appliances usually have 4 energy efficiency levels, the calculation cost required by SM is $4T_m + 5T_e + T_\sigma + T_{h_1}$, by FN is $nT_v + (n - 1)T_m + T_\sigma + T_{h_1}$, and by CC is $T_v + T_m + T_e$, so the total calculation cost is $6T_e + (n + 4)T_m + 2T_\sigma + (n + 1)T_v + 2T_{h_1} = 14.58n + 255.95$ ms.
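The symbolic totals for the compared schemes can be checked by adding the per-role costs term by term. The sketch below is illustrative: each cost is written as a map from an operation name to a pair (coefficient of $n$, constant), and the names `add_costs` and `evaluate` as well as the operation labels are hypothetical. The measured unit times from Table 5 are not reproduced here, so `evaluate` takes them as an input.

```python
def add_costs(*parts):
    """Sum cost expressions; each maps an operation to (n-coefficient, constant)."""
    total = {}
    for part in parts:
        for op, (a, b) in part.items():
            a0, b0 = total.get(op, (0, 0))
            total[op] = (a0 + a, b0 + b)
    return total

def evaluate(total, unit_ms, n):
    """Total time in ms for n users, given per-operation unit times in ms."""
    return sum((a * n + b) * unit_ms[op] for op, (a, b) in total.items())

# CC cost shared by schemes [7], [15], [16]: Tv + Tm + Te
CC = {"Tv": (0, 1), "Tm": (0, 1), "Te": (0, 1)}
# FN cost of schemes [7] and [16]: n*Tv + (n - 1)*Tm + Tsig + Th1
FN_A = {"Tv": (1, 0), "Tm": (1, -1), "Tsig": (0, 1), "Th1": (0, 1)}
# FN cost of scheme [15]: n*Tv + (n - 1)*Te + (n - 1)*Tm + Tsig + Th1
FN_15 = {"Tv": (1, 0), "Te": (1, -1), "Tm": (1, -1), "Tsig": (0, 1), "Th1": (0, 1)}

SM_7 = {"Te": (0, 11), "Tm": (0, 10), "Tsig": (0, 1), "Th1": (0, 1)}
SM_15 = {"Tm": (0, 2), "Ta": (0, 1), "Te": (0, 1), "Tsig": (0, 1), "Th1": (0, 1)}
SM_16 = {"Tm": (0, 4), "Te": (0, 5), "Tsig": (0, 1), "Th1": (0, 1)}

TOTALS = {
    "[7]": add_costs(SM_7, FN_A, CC),
    "[15]": add_costs(SM_15, FN_15, CC),
    "[16]": add_costs(SM_16, FN_A, CC),
}
```

For example, `TOTALS["[7]"]` gives 12 exponentiations, $n + 10$ multiplications, 2 signatures, $n + 1$ verifications, and 2 hashes to $G_1$, matching the stated total for scheme [7].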
As shown in Figure 5, we compare the computational overhead with other schemes, and our scheme has certain advantages in computational overhead. In addition, as shown in Figure 6, we test the computational overhead of share generation for SMs with a (13, 20) threshold. As shown in Figure 7, we also test the computational overhead for the FN to recover the equivalent secret key using the extended Shamir secret sharing scheme when some of the smart meters fail.

B. COMMUNICATION OVERHEAD
The communication overhead is calculated based on the size of the messages sent by the smart meter to the fog node (SM-to-FN) and by the fog node to the cloud center (FN-to-CC).
In our scheme, each user SM reports the message $(c_i, \sigma_i, T, ID_i)$ to FN, so the communication overhead of SM-to-FN is $n(|c_i| + |\sigma_i| + |T| + |ID_i|) = 3168n$ bits. FN sends data $(c, \sigma_f, T, ID_f)$ to CC, so the communication overhead of FN-to-CC is $|c| + |\sigma_f| + |T| + |ID_f| = 3168$ bits.
In the scheme [7], each user SM reports the message $(c_i, R_A, u_i, T_S, \sigma_i)$ to FN, so the communication overhead of SM-to-FN is $n(|c_i| + |R_A| + |u_i| + |T_S| + |\sigma_i|) = 3200n$ bits. FN sends data $(C, R_A, G_W, T_S, \sigma_g)$ to CC, so the communication overhead of FN-to-CC is $|C| + |R_A| + |G_W| + |T_S| + |\sigma_g| = 3200$ bits.
In the scheme [15], each user SM reports the message $(c_{ij}, k_{ij}, s_{ij})$ to FN, so the communication overhead of SM-to-FN is $n(|c_{ij}| + |k_{ij}| + |s_{ij}|) = 4096n$ bits. FN sends data $(c_i, k_i, s_i)$ to CC, so the communication overhead of FN-to-CC is $|c_i| + |k_i| + |s_i| = 4096$ bits.
In the scheme [16], each user SM reports the message $(c_{j,i}, s_i, t_i, tid_{sm_{i,q}})$ to FN, so the communication overhead of SM-to-FN is $n(|c_{j,i}| + |s_i| + |t_i| + |tid_{sm_{i,q}}|) = 3168n$ bits. FN sends data $(c_j, s_j, t_j, ID_{FN_j})$ to CC, so the communication overhead of FN-to-CC is $|c_j| + |s_j| + |t_j| + |ID_{FN_j}| = 3168$ bits.
As shown in Figure 8, we compare the communication overhead with other schemes, and our scheme does not add an additional communication burden.
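The communication totals quoted in this subsection follow directly from the per-message sizes. As a small illustrative sketch (the dictionary and function names are hypothetical; the bit sizes are the ones stated in the text), the overhead for $n$ smart meters can be tabulated as:

```python
# Per-message sizes in bits, as quoted in the text for each scheme.
MESSAGE_BITS = {"ours": 3168, "[7]": 3200, "[15]": 4096, "[16]": 3168}

def sm_to_fn_bits(scheme, n):
    """Total SM-to-FN traffic: one report per smart meter."""
    return MESSAGE_BITS[scheme] * n

def fn_to_cc_bits(scheme):
    """FN-to-CC traffic: a single aggregated message."""
    return MESSAGE_BITS[scheme]

def overhead_table(n):
    """Map each scheme to its (SM-to-FN, FN-to-CC) overhead in bits."""
    return {s: (sm_to_fn_bits(s, n), fn_to_cc_bits(s)) for s in MESSAGE_BITS}
```

For any $n$, the SM-to-FN overhead of our scheme equals that of scheme [16] and is below those of schemes [7] and [15].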

VIII. CONCLUSION
In this paper, we propose a fault-tolerant data aggregation scheme that supports the linear computation of multi-dimensional data without the participation of a trusted third party. The multi-dimensional data and weights of users are encoded with the Chinese remainder theorem, which can improve the computational efficiency of encryption. In addition, we implement fault tolerance and user data integrity checking with an extended Shamir secret sharing scheme and bilinear mapping. The security analysis results show that our scheme protects the user's electricity consumption data. The experimental results show that, compared with existing multi-dimensional data aggregation schemes that require a trusted third party, our scheme does not add computation or communication overhead.

ACKNOWLEDGMENT
(Tanping Zhou and Zichao Song contributed equally to this work.) ZICHAO