Efficient Identity-Based Public Integrity Auditing of Shared Data in Cloud Storage With User Privacy Preserving

The Provable Data Possession (PDP) model provides an efficient means to audit the integrity of data stored in the cloud. When sensitive data is shared among multiple users through cloud storage, it is critical to preserve the anonymity of the data uploader against the auditor; that is, the auditor should not learn the uploader's identity from the auditing process. To address this problem, many PDP schemes with user identity privacy preservation have been proposed. However, most of them are built on PKI, which suffers from the heavy burden of certificate management. Moreover, the auditor in most of these schemes bears a heavy computation cost, which lowers the efficiency of the scheme. To overcome these shortcomings, we present a novel identity-based PDP protocol that efficiently audits the integrity of group shared data while preserving the uploader's privacy. Owing to the inherent structural advantage of identity-based cryptography, our PDP scheme avoids certificate management. Unlike previous works, our scheme establishes the relationship between the data and the data uploader in the proof generation phase rather than in the integrity auditing phase. Therefore, the auditor learns neither this relationship nor the exact uploader of the challenged data. At the same time, having the cloud server establish the relationship during proof generation greatly reduces the computational cost of the auditor. Furthermore, the relationship between the uploader and the challenged data is randomized in the proof to strengthen the security of the scheme. Together, these measures efficiently protect the anonymity of the data uploader. We give a detailed security proof of our scheme under the computational Diffie-Hellman assumption. Experiments evaluating the efficiency of our scheme show that it is efficient and feasible.


I. INTRODUCTION
Nowadays, the explosive growth of data places a heavy burden on people who store and manage data locally. To reduce the cost of data storage and management, more and more people rent cloud storage services and outsource their data to cloud servers. Furthermore, cloud storage lets people conveniently share data with each other and work as a team [1]-[3]. However, the cloud service provider (CSP) is not fully trustworthy. The data stored at the CSP may be corrupted or deleted due to accidental hardware errors, network exceptions, software bugs, or human mistakes [4]-[6]. To escape economic compensation and keep a good reputation, the CSP may not tell the truth to the data user. Therefore, users need to periodically check whether the data on the cloud storage server is kept intact.
The associate editor coordinating the review of this manuscript and approving it for publication was SK Hafizul Islam.
The PDP model gives users an efficient method to remotely verify the integrity of data in cloud storage. PDP divides the outsourced data into many small data blocks and binds one tag to each block. Since the tag contains the value of the data block, the user can learn the integrity status of the data block by checking the validity of the corresponding tag.
Until now, several PDP schemes [7]-[37] with public verification have been proposed. Most PDP protocols focus only on checking the integrity of data that belongs to a single user [7]-[21]. However, in real applications, sharing data among multiple users is common, and the shared data can be used by any member of the workgroup. Therefore, auditing the integrity of shared data has become an attractive issue. When sensitive data is shared in a group, the data uploader's anonymity against the third party auditor (TPA) must be preserved. Specifically, TPA should not learn who the data uploader is after auditing the data integrity; that is, the auditing process should not reveal the uploader's identity to TPA. To this end, Wang et al. [22] proposed a concrete PDP protocol with the notion of user privacy preserving for shared data, which resorts to the ring signature technique to keep user identities private from TPA. Subsequently, several schemes [23]-[37] with user privacy preserving were proposed. However, most of these PDP schemes [23]-[32] are constructed on PKI, which suffers from certificate management problems such as certificate generation, distribution, revocation, renewal, update and verification.
To address this problem, some researchers utilize identity-based cryptography [38] and certificateless cryptography [39] to design PDP schemes [33]-[37] with user privacy preserving. Nevertheless, these schemes are not computationally efficient enough for practical applications. Therefore, a more efficient PDP scheme with user privacy preserving is needed for cloud data auditing.

A. MOTIVATION AND CONTRIBUTIONS
Most previous PDP schemes concentrate only on verifying the integrity of personal data. However, sharing data with others through a cloud platform is a growing trend. Because any user can save sensitive data on the cloud, the privacy of the data uploader's identity should be guaranteed. That is to say, TPA can audit data integrity but cannot identify the exact data uploader. For example, anyone can report criminal behavior to the government through an open complaint platform. To prevent criminals from taking revenge on the reporter, it is necessary to protect the reporter's identity.
To address the problems above, this paper proposes a novel identity-based PDP scheme for group shared data with user privacy protection. In our scheme, CSP establishes the relationship between user identity and data block in the integrity proof phase, so TPA can audit the correctness of the proof without knowing the relationship. As a result, user privacy is preserved against TPA. Moreover, a detailed proof is given for the security of our proposal under the defined security model. The experimental evaluation shows that our PDP scheme is efficient and practical.

B. RELATED WORK
Ateniese et al. [7] first considered checking data integrity with the PDP model and proposed two concrete schemes based on the RSA algorithm. Similar to PDP, the PoR model proposed by Juels and Kaliski [9] also supports remotely checking data integrity. To improve efficiency, Shacham and Waters [8] developed a compact PoR scheme with shorter authentication tags. To support dynamic operations, Ateniese et al. [10] designed a more flexible PDP scheme based on symmetric key encryption, in which data blocks can be appended, updated and deleted. Erway et al. [11] proposed a PDP protocol with full data block dynamic operations including data insertion. To improve the efficiency of dynamic operations, Yan et al. [12] realized a PDP scheme with a new data structure. Similarly, Shen et al. [13] designed another new data structure to realize data operations in their PDP scheme. To increase data durability, Liu et al. [14] proposed a multi-replica data integrity checking protocol, which supported fully dynamic data updates. Wang [15] developed an integrity checking protocol for data on multiple cloud servers. Li et al. [16] further considered a more complex environment in which multiple copies are stored at multiple CSPs and constructed a concrete scheme to check the integrity of all copies at one time.
To support delegation of data checking, Wang [17] proposed a proxy PDP scheme in which a commitment was used to authenticate the validity of the auditor. Further, Yan et al. [18] strengthened the restriction on the verifier and proposed a verifier-designated PDP scheme. To preserve data privacy, Wang et al. [19] proposed a notion of data privacy protection and designed a publicly auditable PDP scheme. To get rid of the certificate management problem, Yu et al. [20] presented a PDP scheme with data privacy protection based on identity-based cryptography [38]. Shen et al. [21] proposed a PDP protocol to guarantee the privacy of authenticators.
Wang et al. [22] proposed the first PDP model for data shared in a group, which utilized the ring signature technique to generate tags so as to support public auditing and user privacy preserving. Wang et al. [23] proposed a new PDP scheme for shared data with user privacy preserving. Furthermore, the scheme in [23] also supported dynamic groups, allowing users to join or leave the group at any time. Liu et al. [25] designed a PDP scheme based on broadcast encryption [24] supporting dynamic groups. Wang et al. [26] considered the user revocation issue and proposed a PDP scheme which outsourced user revocation to CSP by the proxy re-signature technique. Yang et al. [27] designed a PDP protocol for group data with user identity privacy and traceability. Wu et al. [28] developed a PDP scheme for data shared among multiple uploaders. Nayak and Tripathy [29] proposed the SEPDP scheme with data privacy preserving. They embedded the challenged data block in the proof as an exponent so that TPA cannot recover the block from the proof. Moreover, they extended their scheme to support batch auditing and data dynamics. However, Yu and Hao [30] proved that the scheme in [29] was not secure against the forgery attack of a malicious CSP. Mara et al. [31] presented the CRUPA scheme to audit the data shared in a group. CRUPA made use of a regression technique to resist the collusion attack of CSP and revoked users. Lu et al. [32] designed a data integrity verification mechanism for mobile terminals in cloud computing. The scheme supported data privacy preserving and authorized access to the data. The schemes mentioned above mainly relied on PKI, which bears the heavy burden of certificate management. To address the problem, Yu et al. [33] utilized identity-based cryptography to propose a PDP protocol with user privacy preserving in dynamic groups. However, this scheme was only suitable for devices with limited computational ability.
VOLUME 9, 2021
To avoid certificate management and key escrow, Li et al. [34] proposed a PDP scheme for group shared data based on certificateless cryptography. However, the scheme lost the user privacy preservation feature. Similarly, Yang et al. [35] presented a scheme for shared data that is also based on certificateless cryptography. Although the scheme claimed to guarantee user identity privacy, TPA can learn the relationship between the data and the public keys in the verification phase. Thus, it did not really realize user privacy preserving. Wu et al. [36] presented a new PDP scheme with user privacy protection, but the communication and computation overheads of the scheme were too heavy, especially in the challenge phase.

II. PRELIMINARIES
We first review the preliminary cryptographic knowledge used throughout this paper.

A. BILINEAR MAPS
Let $G_1$ and $G_2$ be two multiplicative cyclic groups of large prime order $q$, and let $g$ be a generator of $G_1$. A map $e : G_1 \times G_1 \to G_2$ is a bilinear map if it has the following properties.
(i) Computability: for any $u, v \in G_1$, there exists an efficient algorithm to compute $e(u, v)$.
(ii) Bilinearity: for any $x, y \in Z_q^*$ and $u, v \in G_1$, $e(u^x, v^y) = e(u, v)^{xy}$.
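The bilinearity property can be illustrated with a deliberately insecure toy model. This is not a real pairing: an element $g^x \in G_1$ is stored as its exponent $x$ modulo $q$, and $e(g^x, g^y)$ is defined as $g_T^{xy}$ in an order-$q$ subgroup of $Z_p^*$. The parameters `Q = 1019`, `P = 2039`, `G_T = 4` and all function names are assumptions chosen only for this sketch.

```python
# Toy model only: G1 elements are stored as exponents x (meaning g^x), which
# makes discrete logarithms trivial. It illustrates bilinearity, nothing more.
Q = 1019            # assumed small prime group order q
P = 2 * Q + 1       # 2039 is prime, so Z_P^* has a subgroup of order Q
G_T = 4             # 4 = 2^2 is a quadratic residue mod P, hence of order Q


def g1_exp(x: int, k: int) -> int:
    """Return the exponent that represents (g^x)^k in G1."""
    return (x * k) % Q


def pairing(x: int, y: int) -> int:
    """Toy e(g^x, g^y) = G_T^(x*y) in the order-Q subgroup of Z_P^*."""
    return pow(G_T, (x * y) % Q, P)


def check_bilinearity(u: int, v: int, x: int, y: int) -> bool:
    """Check e(u^x, v^y) == e(u, v)^(x*y) for G1 elements given as exponents."""
    lhs = pairing(g1_exp(u, x), g1_exp(v, y))
    rhs = pow(pairing(u, v), (x * y) % Q, P)
    return lhs == rhs
```

A production implementation would use a pairing library over an elliptic curve instead; the exponent representation here exists only so that the identity can be checked with plain integer arithmetic.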

Definition 1. Computational Diffie-Hellman assumption:
Let $g$ be a generator of the multiplicative cyclic group $G_1$. Given $(g, g^a, g^b)$ with unknown $a, b \in Z_q^*$, computing $g^{ab}$ is computationally intractable. For any probabilistic polynomial-time (PPT) adversary $\mathcal{A}$, the probability that $\mathcal{A}$ solves this problem (CDH) is negligible, which can be denoted as:
$$\Pr[\mathcal{A}(g, g^a, g^b) = g^{ab}] \le \epsilon,$$
where $\epsilon$ is a negligible value.

A. SYSTEM MODEL
There are four participants in our scheme: key generation center, CSP, users and TPA.
(1) Key generation center (KGC) generates the private keys for all users. We assume the keys are transmitted over a secure channel.
(2) CSP maintains users' data and generates the proofs for data integrity challenges from TPA.
(3) Users generate tags for their data and outsource the data with tags to CSP. Here, all users in a group share their data with each other.
(4) TPA audits the integrity of data shared within a group. TPA first sends an integrity challenge to CSP and gets a proof from CSP. Then TPA validates the correctness of the proof and reports the checking result to users. We assume TPA honestly executes the auditing process.
The system model is illustrated in Figure 1. It is assumed that CSP is semi-trusted: it executes the auditing protocol honestly, but may lie to TPA when data is corrupted. TPA is assumed to be honest-but-curious: it audits data integrity honestly and returns the real audit result to users, but it is curious about the identity of the data uploader.

B. DEFINITION OF OUR SCHEME
A public identity-based auditing scheme for shared data supporting user privacy preserving consists of six algorithms, Setup, Extract, TagGen, Challenge, Proof and Audit, which are described below.
$\mathrm{Setup}(1^k) \to (pp, msk)$: With the security parameter $k$, this algorithm outputs the public parameter $pp$ and the master key $msk$.
$\mathrm{Extract}(ID_j, msk) \to sk_{ID_j}$: This algorithm outputs the user secret key $sk_{ID_j}$ from the user's identity $ID_j \in \{0, 1\}^*$ and the master key $msk$.
$\mathrm{TagGen}(sk_{ID_j}, m_i) \to T_{i,j}$: This algorithm generates one authentication tag for each data block. It takes the user's secret key $sk_{ID_j}$ and the data block $m_i$ as inputs and outputs the tag $T_{i,j}$.
$\mathrm{Challenge}(Fid) \to chal$: This algorithm is performed by TPA to generate a data integrity challenge $chal$ for the data named $Fid$.
$\mathrm{Proof}(F, T, chal) \to P$: This algorithm generates the data integrity proof $P$ for $chal$. It takes the challenged data $F$, the tag collection $T$ and the challenge $chal$ as inputs.
$\mathrm{Audit}(chal, P, Fid) \to \{0, 1\}$: This algorithm is used to audit the correctness of the integrity proof. It takes the challenge $chal$, the proof $P$ and the name $Fid$ as inputs. It returns '1' if $P$ passes the audit, and '0' otherwise.
We give the detailed process flow of our scheme: KGC runs the Setup to initialize the system and runs the Extract to generate the private keys for all users. Users in the group prepare their data and compute all the tags of the data by TagGen. They outsource the data and the tags to CSP. TPA runs Challenge to send an integrity challenge request to CSP. CSP generates an integrity proof for the challenge request by Proof and submits the proof to TPA. TPA runs Audit to check the correctness of the proof and return the checking result to users.

C. SECURITY MODEL
A public identity-based auditing scheme for group shared data supporting user privacy preserving should achieve three security features: completeness, soundness and user privacy protection against TPA. Completeness means the integrity of shared data is audited correctly when CSP and TPA execute the protocol honestly. Soundness means that when data is corrupted, CSP cannot cheat TPA by forging the proof. Namely, if CSP does not maintain the challenged data blocks, it cannot output a correct data integrity proof. User privacy preserving means the data uploader's identity is protected against the auditor. That is to say, TPA should not obtain the data uploader's identity during the procedure of data integrity auditing.
Completeness of the scheme is defined as:
Definition 2: A public identity-based auditing scheme for data shared with multi-users is effective, if the equation
Soundness of the scheme can be captured by a game. The game involves an adversary $\mathcal{A}$ and a challenger $\mathcal{C}$. We describe the game as below:
Setup Phase: $\mathcal{C}$ runs the Setup algorithm to set the public parameter $pp$ and the master key $msk$. $\mathcal{C}$ stores $msk$ and gives $pp$ to $\mathcal{A}$.
Queries Phase: $\mathcal{A}$ makes three types of queries to $\mathcal{C}$ a polynomial number of times. $\mathcal{C}$ returns the query results to $\mathcal{A}$.
(a) Hash Query. $\mathcal{A}$ queries the hash values of any hash function in the scheme. $\mathcal{C}$ replies with the hash values.
(b) Private-Key Query. $\mathcal{A}$ can query any user's private key with the identity $ID_j$. $\mathcal{C}$ calculates the private key $sk_{ID_j}$ by the algorithm Extract and returns the key to $\mathcal{A}$.
(c) Tag Query. $\mathcal{A}$ can send randomly selected blocks to $\mathcal{C}$ and query their tags generated by any user in the group. $\mathcal{C}$ runs the algorithm TagGen to generate the tag of the queried block and sends the tag back to $\mathcal{A}$. If $\mathcal{C}$ does not have the user's private key, it can compute the key by the Extract algorithm.
Challenge Phase: $\mathcal{C}$ runs Challenge to get a challenge $chal$ and submits it to $\mathcal{A}$. Note that at least one block in $chal$ has not been queried by $\mathcal{A}$. $\mathcal{C}$ asks $\mathcal{A}$ to respond with a proof for $chal$.
Forge Phase: Finally, $\mathcal{A}$ submits a proof $P$ to $\mathcal{C}$ for the challenge $chal$. If $P$ passes the audit, $\mathcal{A}$ wins the game.
Definition 3: A public identity-based auditing scheme for group shared data supporting user privacy preserving is secure, if any adversary $\mathcal{A}$ wins the game above only with negligible probability.
User privacy preserving is an important security feature of the scheme. In our setting, multiple users share data with each other in a group and each one can upload data to the group. Since the data is sensitive and crucial, the data uploader prefers to remain anonymous to TPA. However, an honest-but-curious TPA tries to determine the identity of the data uploader during the data verification process. This may leak user information and pose a security threat to the data uploader. Thus, the scheme should guarantee the data uploader's anonymity against TPA.
Definition 4: A public identity-based auditing scheme for shared data is user privacy-preserving, if TPA cannot learn the identity of the data uploader during the procedure of data auditing.

IV. CONCRETE CONSTRUCTION OF OUR SCHEME
We show the detailed construction of our identity-based auditing scheme for group shared data, which realizes public auditing and user privacy protection.
Suppose $U$ users work together as a team. Each user in the team is denoted by $u_j$ $(1 \le j \le U)$, whose identity is $ID_j$. The team deals with the data $F$, which is split into $n$ blocks. Therefore, the data $F$ can be represented as $F = \{m_i\}_{1 \le i \le n}$, where $i$ is the block index. The symbol $T_{i,j}$ is a block tag generated by the user $u_j$ for the block $m_i$. The algorithms in our scheme are defined as follows.
$\mathrm{Setup}(1^k) \to (pp, msk)$: KGC selects a large random prime $q$ with $|q| = k$, where $k$ is the security parameter. It selects two cyclic multiplicative groups $G_1$ and $G_2$ of order $q$ with a bilinear map $e : G_1 \times G_1 \to G_2$. It chooses a generator $g$ of $G_1$ and two different hash functions $H_1$ and $H_2$, which are defined as $H_1 : \{0, 1\}^* \to G_1$ and $H_2 : \{0, 1\}^* \to G_1$. It chooses a pseudo-random function $\varphi : Z_q^* \times \{1, \cdots, n\} \to Z_q^*$ and a pseudo-random permutation $\pi : Z_q^* \times \{1, \cdots, n\} \to \{1, \cdots, n\}$. Then, KGC randomly selects two values $s \in Z_q^*$ and $u \in G_1$, and sets the master secret key $msk = s$ and the master public key $P_0 = g^s$. Thus, the system parameter is $pp = (q, g, G_1, G_2, u, e, P_0, H_1, H_2, \varphi, \pi)$.
$\mathrm{Extract}(ID_j, msk) \to sk_j$: On receiving the identity $ID_j$ of the user $u_j$, KGC computes $sk_j = H_1(ID_j)^s$ as $u_j$'s private key and sends it to the user $u_j$ over a secure channel.
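As a worked consequence of bilinearity, a user who receives $sk_j$ could check it against the public parameters. This check is not stated in the scheme itself; it is the standard sanity check for keys of the form $H_1(ID_j)^s$ with master public key $P_0 = g^s$:

$$e(sk_j, g) = e(H_1(ID_j)^s, g) = e(H_1(ID_j), g)^s = e(H_1(ID_j), g^s) = e(H_1(ID_j), P_0).$$

Both sides use only public values and the received key, so the master secret $s$ is never needed for the check.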
$\mathrm{TagGen}(sk_j, m_i) \to T_{i,j}$: Each user can run this algorithm to compute the tag for any block. Taking $u_j$ as an example, $u_j$ selects a random value $\lambda_j \in Z_q^*$ and generates the tag for the block $m_i$ by equation (1). Here, $Fid$ is the unique identifier of the data $F$. After getting the tag $T_{i,j}$, $u_j$ chooses a secure signature scheme SIG (such as BLS [36]) to compute the signature $\mu_j = \mathrm{SIG}(R_j \| ID_j)$, in which $R_j = g^{\lambda_j}$. Finally, $u_j$ uploads $(m_i, T_{i,j}, ID_j, R_j, \mu_j)$ to CSP. Note that the values $(ID_j, R_j, \mu_j)$ only need to be uploaded once, because they are bound to the user $u_j$ and remain unchanged in the system.
On receiving $(m_i, T_{i,j}, ID_j, R_j, \mu_j)$ from the user, CSP first verifies the correctness of $\mu_j$ with the signature scheme SIG. If $\mu_j$ is invalid, CSP drops the data and notifies the user $u_j$. Otherwise, CSP validates the correctness of the tag by equation (2). The correctness of equation (2) can be confirmed from the construction of the tag in equation (1).
$\mathrm{Challenge}(Fid) \to chal$: TPA runs this algorithm to challenge the integrity of the data named $Fid$. TPA first sets the number of challenged blocks $c$ and then randomly chooses two values $k_1, k_2 \in Z_q^*$. TPA submits the challenge request $chal = (c, k_1, k_2)$ to CSP.
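The challenge $chal = (c, k_1, k_2)$ stays compact because both CSP and TPA can expand it locally into the block indexes $v_l = \pi(k_1, l)$ and the coefficients $a_l = \varphi(k_2, l)$ for $1 \le l \le c$. The sketch below shows one way this expansion could look. The paper does not fix an instantiation, so HMAC-SHA256 as $\varphi$, a key-seeded shuffle as a stand-in for $\pi$, and all function names are assumptions; the seeded shuffle in particular is not a cryptographic pseudo-random permutation.

```python
import hashlib
import hmac
import random


def prf(k2: int, l: int, q: int) -> int:
    """phi(k2, l): HMAC-SHA256 output mapped into Z_q^* (assumed instantiation)."""
    key = k2.to_bytes(32, "big")
    digest = hmac.new(key, l.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % (q - 1) + 1


def prp_indices(k1: int, n: int, c: int) -> list[int]:
    """First c outputs pi(k1, 1..c) of a key-seeded permutation of {1..n}.

    random.Random is NOT cryptographically secure; it only stands in for pi.
    """
    order = list(range(1, n + 1))
    random.Random(k1).shuffle(order)
    return order[:c]


def expand_challenge(c: int, k1: int, k2: int, n: int, q: int) -> list[tuple[int, int]]:
    """Return the pairs (v_l, a_l), l = 1..c, that CSP and TPA both derive."""
    indices = prp_indices(k1, n, c)
    return [(v, prf(k2, l, q)) for l, v in enumerate(indices, start=1)]
```

Because the expansion is deterministic in $(c, k_1, k_2)$, TPA only has to transmit three values regardless of how many blocks are challenged.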

V. SECURITY PROOF
A. COMPLETENESS PROOF
The completeness of the scheme can be proved as follows:
B. SOUNDNESS PROOF
The soundness of our scheme can be proved in two steps. First, we prove that the tag of any block cannot be forged by CSP no matter who the tag generator is. Second, we prove that the integrity proof cannot be forged by CSP no matter what the challenge request is.
Theorem 1: If there exists a PPT adversary that wins the security game with advantage $\epsilon$ in time $t$, making at most $q_{H_1}$, $q_{H_2}$, $q_k$ and $q_T$ queries to $H_1$-Query, $H_2$-Query, PrivateKey-Query and Tag-Query respectively, then the CDH problem can be solved with probability $\epsilon' \ge \epsilon / ((q_k + q_T) \cdot 2e)$ in time $t' \le t + O(q_{H_1} + q_k + q_{H_2} + q_T)$, where $e$ in the bound denotes the base of the natural logarithm.
Proof: Assume the PPT adversary $\mathcal{A}$ wins the security game. We can then build a simulator $\mathcal{B}$ that solves the CDH problem by using $\mathcal{A}$. Let $(g, G_1, g^a, g^b)$ be a CDH instance; $\mathcal{B}$ computes $g^{ab}$ by the following steps.
Setup: $\mathcal{B}$ sets the master public key $P_0 = g^a$, which means the master private key is $msk = a$. Note that $a$ is unknown to $\mathcal{B}$. $\mathcal{B}$ randomly selects the public parameters $\lambda \in Z_q^*$, $u \in G_1$ and sets $R = g^{\lambda}$. Then $\mathcal{B}$ gives $\mathcal{A}$ all the public parameters and the value $R$.
$H_1$-Query: $\mathcal{A}$ adaptively queries the hash value of any identity $ID^*$. $\mathcal{B}$ keeps a table $L_1 = \{(ID, h_1, Q_1, \tau)\}$ for the $H_1$-Query. If $L_1$ contains the row $(ID^*, *, *, *)$, $\mathcal{B}$ gets the row $(ID^*, h_1^*, Q_1^*, \tau^*)$ from $L_1$ and responds with $Q_1^*$ to $\mathcal{A}$. Otherwise, $\mathcal{B}$ randomly chooses a number $h_1^* \in Z_q^*$. Then $\mathcal{B}$ tosses a coin $\tau \in \{0, 1\}$. Suppose the probability of $\tau = 1$ is $\gamma$ and the probability of $\tau = 0$ is $1 - \gamma$.
Forge: At last, $\mathcal{A}$ gives a forged tag $T_{i^*,j^*}$ for the block $m_{i^*}$ with the identity $ID_{j^*}$. The block $m_{i^*}$ has not been submitted to the Tag-Query under such conditions before.
Analysis: It is easy to see that if $\mathcal{A}$ wins the game, the values $(m_{i^*}, T_{i^*,j^*}, ID_{j^*})$ have to satisfy equation (2). Then, we can get equation (4). To compute the value of $g^{ab}$, $\mathcal{B}$ first searches for the row $(ID_{j^*}, h_1, Q_1, \tau)$ in $L_1$. If $\tau = 0$, $\mathcal{B}$ outputs "Fail" and exits the game. Otherwise, $\mathcal{B}$ continues to find the row $(Fid, i^*, Q_2)$ in $L_2$. Based on these values, equation (4) can be rewritten as $e((T_{i^*,j^*})^{\lambda}, g) = e(g^{abh_1} \cdot Q_2 \cdot u^{m_{i^*}}, g)$. Therefore, we can compute the result of the given CDH instance:
$$g^{ab} = \left( (T_{i^*,j^*})^{\lambda} \cdot (Q_2 \cdot u^{m_{i^*}})^{-1} \right)^{1/h_1}.$$
In the query phases, if $\tau = 1$ for the queried identity, $\mathcal{B}$ outputs "Fail" and exits the game; otherwise, the simulation is perfect. Therefore, the probability that $\mathcal{B}$ plays the game with $\mathcal{A}$ without aborting is at least $(1 - \gamma)^{q_k + q_T}$. As a result, $\mathcal{B}$ can successfully output $g^{ab}$ with probability $\epsilon' \ge \epsilon \cdot \gamma \cdot (1 - \gamma)^{q_k + q_T} \ge \epsilon / ((q_k + q_T) \cdot 2e)$. The time cost of this process is $t' \le t + O(q_{H_1} + q_k + q_{H_2} + q_T)$.
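The final inequality of the bound can be checked with a short calculation. Write $q_s = q_k + q_T \ge 1$ and assume the simulator uses $\gamma = 1/(q_s + 1)$, which is the usual choice for this coin-tossing technique; the text does not state the value of $\gamma$, so this is an assumption.

$$\gamma (1 - \gamma)^{q_s} = \frac{1}{q_s + 1} \left( \frac{q_s}{q_s + 1} \right)^{q_s} = \frac{1}{q_s + 1} \cdot \frac{1}{(1 + 1/q_s)^{q_s}} \ge \frac{1}{e\,(q_s + 1)} \ge \frac{1}{2e \cdot q_s}.$$

The first inequality uses $(1 + 1/q_s)^{q_s} < e$, and the second uses $q_s + 1 \le 2 q_s$. Multiplying by $\epsilon$ gives the success probability $\epsilon / ((q_k + q_T) \cdot 2e)$ claimed in Theorem 1.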
Theorem 2: If all hash functions in the scheme are collision-resistant, CSP can generate a forged proof that cheats TPA only with negligible probability.
Proof: The preliminary procedures are the same as those in the proof of Theorem 1.
Suppose $chal = (c, k_1, k_2)$ is the challenge request to $\mathcal{A}$. $\mathcal{A}$ outputs a forged proof $P' = (\sigma_1', \sigma_2', M')$ which passes the audit.
Analysis: For the challenge $chal = (c, k_1, k_2)$, we can compute the indexes of all challenged blocks $v_l = \pi(k_1, l)$ $(1 \le l \le c)$ and all the random parameters $a_l = \varphi(k_2, l)$ $(1 \le l \le c)$. Assume the forged proof is $P' = (\sigma_1', \sigma_2', M')$. Because the forged proof $P'$ passes the audit, $P'$ has to satisfy equation (3), which gives equation (5). We assume the true proof for the challenge $chal = (c, k_1, k_2)$ is $P = (\sigma_1, \sigma_2, M)$, where $M = \sum_{l=1}^{c} a_l m_{v_l}$. Both $\sigma_1'$ and $\sigma_1$ are computed from the user identities, regardless of the blocks and tags, so it is easy to get $\sigma_1' = \sigma_1$. Moreover, because $P$ passes the audit, we also get equation (6). Comparing equations (5) and (6), we can see that if $M' = M$, then $\sigma_2' = \sigma_2$. This means $P' = P$, which contradicts the assumption that $P'$ is forged. Therefore, $M' \ne M$. Under this condition, we consider two cases: $\sigma_2' \ne \sigma_2$ and $\sigma_2' = \sigma_2$. If $\sigma_2' \ne \sigma_2$, consider the extreme situation in which there is only one challenged block in the forged proof. This means the adversary can forge the tag for a single block, which contradicts Theorem 1. If $\sigma_2' = \sigma_2$, then according to equations (5) and (6) we get $M' = M$, which contradicts our assumption that $M' \ne M$. Hence, Theorem 2 is proved.

C. PRIVACY PRESERVING
Theorem 3: TPA cannot get the identity of data uploader within the process of data auditing.
Proof: By examining the complete procedure of data integrity auditing, it is not difficult to see that TPA cannot learn the uploader of the challenged data. First, the user's identity is stored privately by CSP; no one knows the relation between data and user identity except CSP and the user. When auditing the data, TPA sends a challenge to CSP which contains no information about the user. In the audit phase, TPA checks the correctness of the proof by equation (3), which does not refer to the user identity either. Moreover, CSP hides the user identity in the proof $\sigma_1 = h \cdot \prod_{(v_l, a_l) \in C} H_1(ID_{j,v_l})^{a_l}$ by the random value $h$. Even if there is only one user identity in $\sigma_1$, TPA cannot obtain it. Therefore, our scheme guarantees user privacy against TPA.

D. PROBABILITY OF MISBEHAVIOR DETECTION
Our scheme adopts random sampling to detect the misbehavior of CSP, which reduces the workload of TPA. Assume user data is divided into $n$ blocks which are outsourced to CSP. With the challenge $chal = (c, k_1, k_2)$, CSP selects $c$ different blocks determined by the pseudo-random permutation $\pi$. Assume that CSP modifies $c_1$ blocks out of $n$ blocks, so the fraction of tampered blocks is $P_t = c_1/n$. To detect the misbehavior, at least one tampered block must be among the $c$ challenged blocks. Therefore, the probability of misbehavior detection is $P = 1 - (1 - P_t)^c$. Obviously, the detection probability grows as the number of challenged blocks increases. Figure 2 shows the detection probability for different numbers of challenged blocks. In this experiment we divide user data into 100000 blocks and set $P_t$ to 0.5%, 1%, 2% and 3% respectively. The number of challenged blocks increases from 100 to 1000 for each $P_t$. From Figure 2, we can see that if $P_t = 1\%$, we only need to challenge about 400 blocks to achieve $P > 98\%$. For $P_t = 2\%$, about 200 blocks are enough to achieve $P > 98\%$. Thus, our scheme can efficiently detect the misbehavior of CSP by randomly sampling a few blocks.
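The sampling argument can be reproduced numerically. The sketch below assumes the common approximation $P = 1 - (1 - P_t)^c$, which treats the $c$ challenged blocks as independent draws and is accurate when $c$ is much smaller than $n$; the function names are hypothetical.

```python
import math


def detection_probability(p_t: float, c: int) -> float:
    """Probability that at least one of c sampled blocks is tampered."""
    return 1.0 - (1.0 - p_t) ** c


def blocks_needed(p_t: float, target: float) -> int:
    """Smallest c for which detection_probability(p_t, c) reaches target."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_t))
```

Under this approximation, `blocks_needed(0.01, 0.98)` evaluates to 390 and `blocks_needed(0.02, 0.98)` to 194, so the number of challenged blocks required for a fixed detection probability depends on the tampered fraction and not on the total number of blocks.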

VI. PERFORMANCE ANALYSIS A. PERFORMANCE EVALUATION
We summarize the performance of our protocol in terms of computational and communication cost as follows.
Computational Cost: Let $T_p$, $T_{exp\text{-}G_1}$ and $T_{exp\text{-}G_2}$ represent the computational cost of a pairing, an exponentiation in $G_1$ and an exponentiation in $G_2$ respectively. Other operations, such as hash functions and addition and multiplication in $Z_q$, are omitted. Suppose the data has $n$ blocks in total and each challenge refers to $c$ blocks. The Extract algorithm needs only one $T_{exp\text{-}G_1}$ operation. The TagGen algorithm needs $2T_{exp\text{-}G_1}$ to generate one tag. Thus, the computational cost of generating all $n$ tags is $2nT_{exp\text{-}G_1}$. The Challenge algorithm only selects two values, so its cost is negligible. The Proof algorithm generates proofs at a cost of $2cT_{exp\text{-}G_1} + (c + 1)T_p$. To audit data integrity, TPA runs the Audit algorithm, which costs $2T_p + (c + 1)T_{exp\text{-}G_1}$. Moreover, we compare our scheme with three similar schemes, ACAMU [28], CL-PGSDP [35] and CLCA [36], in terms of computational cost in Table 1, in which $U$ is the number of group users.
Table 1 shows that in the tag generation phase our scheme has the same cost as the others. In the challenge phase, our scheme incurs only negligible cost while the other three schemes incur considerable cost. In proof generation, the computational cost of our scheme is slightly higher than that of CL-PGSDP, but lower than that of the other two schemes. In the proof verification phase, our scheme has the best performance. In summary, our scheme is computationally efficient.
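The cost expressions above can be turned into a small estimator. This is only a sketch of the stated formulas: the class `UnitCosts`, the function names, and any timings passed in are hypothetical placeholders, not measurements from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitCosts:
    """Per-operation timings in milliseconds (caller-supplied placeholders)."""
    pairing: float   # T_p
    exp_g1: float    # T_exp-G1


def taggen_cost(n: int, u: UnitCosts) -> float:
    """Cost of generating all n tags: 2n * T_exp-G1."""
    return 2 * n * u.exp_g1


def proof_cost(c: int, u: UnitCosts) -> float:
    """Cost of Proof at CSP: 2c * T_exp-G1 + (c + 1) * T_p."""
    return 2 * c * u.exp_g1 + (c + 1) * u.pairing


def audit_cost(c: int, u: UnitCosts) -> float:
    """Cost of Audit at TPA: 2 * T_p + (c + 1) * T_exp-G1."""
    return 2 * u.pairing + (c + 1) * u.exp_g1
```

Plugging in timings measured on a given machine makes the asymmetry visible: the number of pairings grows with $c$ only on the CSP side, while TPA always performs two.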
Communication Cost: In our scheme, a tag is one element of $G_1$, the challenge size is bounded by $3|Z_q|$, and the proof consists of one element of $G_1$, one element of $G_2$ and one element of $Z_q$. The total communication cost of our scheme is very low. We also compare our scheme with the other three similar schemes in Table 2.
From Table 2, we can see that the proof in our scheme is slightly longer than in the others, but the gap is very small and remains constant. However, our scheme has a great advantage in terms of challenge size, and this advantage grows as $U$ and $c$ increase. Thus, our scheme has better communication performance.
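The message sizes stated above can be written as simple functions of the element sizes. The element bit lengths depend on the chosen curve and encoding and are not given in this section, so they are left as caller-supplied parameters; the function names are hypothetical.

```python
def challenge_bits(zq_bits: int) -> int:
    """Size of chal = (c, k1, k2), bounded by 3|Z_q|."""
    return 3 * zq_bits


def proof_bits(g1_bits: int, g2_bits: int, zq_bits: int) -> int:
    """Size of a proof: one G1 element, one G2 element and one Z_q element."""
    return g1_bits + g2_bits + zq_bits


def tag_storage_bits(n: int, g1_bits: int) -> int:
    """Storage for n tags, each one element of G1."""
    return n * g1_bits
```

Neither the challenge size nor the proof size depends on the number of challenged blocks or on the number of group users, which is why both stay constant as $c$ and $U$ grow.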

B. EXPERIMENT RESULTS
We implemented a prototype of our scheme with the PBC library [41], which is based on the GMP library [42]. Our experiments use a workgroup of 100 users, and the size of the data shared in the group is 2 MB. The experiments are executed on the Ubuntu Kylin 15.10 operating system in VMware Workstation. We assign 1 CPU and 1 GB of RAM to the virtual machine and use a Lenovo X270 laptop as the host, which runs Windows 10 with a Core i5 CPU and 8 GB of RAM. We choose the typical 'Type A' elliptic curve supplied by PBC in our experiments. In order to accurately show the advantage of our scheme, we also implement the CL-PGSDP and CLCA schemes.
First, we run experiments to evaluate the performance of tag generation in our scheme. In these experiments, we generate 100 to 1,000 tags for different data blocks. The experimental results are shown in Figure 3. Overall, the cost of tag generation is linear in the number of data blocks. Generating 1,000 tags takes only about 15.2 seconds, which is efficient for practical applications. Furthermore, if the computation of TagGen is done offline, the cost decreases greatly. Besides, each tag is generated only once, so it has little impact on the overall performance of the scheme.
The second experiment evaluates the performance of the 'Challenge' phase. In this experiment, the number of group users is 100, and the number of challenged blocks varies from 100 to 500. The results are shown in Figure 4. From Figure 4, we can see that the cost of 'CL-PGSDP' increases linearly with the number of challenged blocks and is much greater than that of 'CLCA' and our scheme. The challenge cost of the 'CLCA' scheme is almost constant, because its cost depends on the number of users rather than the number of challenged blocks. Our scheme has negligible cost and is much more efficient than the others.
Figure 5 shows the computational cost of the 'proof generation' phase. We can see that our scheme needs more computation than CL-PGSDP and CLCA in this phase. According to Figure 2, if $P_t = 1\%$, we can challenge about 400 blocks to achieve a 98% probability of misbehavior detection. Under this condition, our scheme needs only 4 seconds more than the CL-PGSDP scheme. Moreover, proof generation is performed by CSP. Since CSP has great computational ability, the gap in computation cost in this phase has negligible impact on the overall efficiency of the scheme.
The cost of proof audit is presented in Figure 6. We can see that the cost of all three schemes grows linearly with the number of challenged blocks in the verification phase. The CL-PGSDP scheme incurs greater overhead than the CLCA scheme and our scheme. The CLCA scheme has a cost similar to ours, but still higher. In summary, our scheme is the most efficient one in this phase.
Finally, we run experiments to summarize the computation cost of CSP and TPA in the three schemes. The number of challenged blocks is set to 500. The results are shown in Figure 7.
As Figure 7 shows, the TPA in CL-PGSDP bears more computation cost than the TPA in our scheme and in the CLCA scheme. Specifically, our scheme assigns the lightest workload to TPA. Furthermore, in CL-PGSDP the computation cost of CSP is much lower than the computation cost of TPA, whereas in our scheme the situation is the opposite. It is well known that CSP has great computational ability, while TPA is usually a normal workstation or personal computer. Transferring more work from TPA to CSP is a reasonable way to improve the efficiency of a PDP scheme. Thus, our scheme realizes a better mechanism than the others.
Overall, compared with recent work, our scheme is efficient, especially for TPA.

VII. CONCLUSION
In this paper, we present a public identity-based PDP protocol for secure data storage, which supports identity privacy protection for multiple users. With our scheme, TPA can correctly check the integrity of group shared data but cannot learn who uploaded the challenged data. We give the security model for our scheme and prove that it achieves completeness, soundness and identity privacy preserving. Experimental results demonstrate that our proposal is efficient.