Medical Data Sharing Scheme Based on Attribute Cryptosystem and Blockchain Technology

Electronic medical data have significant advantages over paper-based patient records when it comes to storage and retrieval. However, most existing medical data sharing schemes have security risks, such as being prone to data tampering and forgery, and do not support verifying the authenticity of the data source. To solve these problems, we propose a medical data sharing scheme based on an attribute cryptosystem and blockchain technology. First, the encrypted medical data are stored in the cloud, and the storage address and medical-related information are written into the blockchain, which ensures efficient storage and prevents undetected modification of the data. Second, the proposed scheme combines attribute-based encryption (ABE) and attribute-based signature (ABS), which achieves the sharing of medical data in many-to-many communications. The ABE achieves data privacy and fine-grained access control, and the ABS verifies the authenticity of the source of the medical data while protecting the signer’s identity. Moreover, the data user outsources most of the operations of medical data ciphertext decryption to the cloud service provider (CSP), which greatly reduces the user's computational burden. Finally, the analysis shows that our scheme satisfies the requirements for confidentiality and unforgeability in the random oracle model, and that it offers higher computational performance than other similar schemes.


I. INTRODUCTION
Traditional paper-based medical records [1] contain patient information, such as personal medical history, prescriptions, immunizations, results of examination, medical images, and family history of genetic diseases. However, it was difficult and time-consuming for two or more medical institutions to share paper-based medical data. A solution to this problem is to share the patient's medical data electronically. Medical data in digital format enable the long-term preservation and on-demand recall of medical data, and they can help doctors make more accurate diagnoses of patients' conditions. Nevertheless, the skyrocketing amount of medical data requires costly storage overhead. When electronic medical data are distributed between different medical institutions across an open network, it is easy for attacks such as tampering, monitoring, and forgery to take place.
The associate editor coordinating the review of this manuscript and approving it for publication was Shuiguang Deng.
Cloud storage technology can store massive amounts of electronic medical data with powerful computation capabilities and low cost. More and more users and medical institutions are uploading medical data to the cloud for storage and sharing. This can reduce the cost of local storage of medical data and enable users to access shared medical data anytime and anywhere. Several cloud-based strategies for medical data sharing have been proposed [2]-[4]. To guarantee the security of data, Cheng et al. [2] designed a method to cut big data into a number of ordered data columns and then place these data on a variety of cloud servers. Shen et al. [3] proposed a data storage scheme to verify data integrity and security, which can detect whether data have been tampered with before they are downloaded. Xhafa et al. [4] presented a medical record system based on the cloud, which used attribute-based encryption (ABE) to encrypt medical data based on patients' conditions, without exposing the detailed description of the symptoms and the department of the doctor. These schemes [2]-[4] are highly dependent on the cloud service provider (CSP). On the one hand, the CSP may delete data that users have rarely or never accessed to save space for storing other users' data, thereby earning more revenue. On the other hand, the data stored in the cloud may be damaged due to the failure of the cloud server, management errors or malicious attacks; however, the CSP may intentionally hide the fact of data loss to maintain its reputation. Therefore, we can conclude that the cloud makes it convenient for medical institutions to share electronic medical data, but the data stored in the cloud are subject to tampering, forgery, and being accessed by unauthorized individuals.
Blockchain is an emerging Internet database technology characterized by decentralization, transparency, and data non-tamperability, which was first proposed in [5]. With blockchain technology, there is no need to rely on third parties to store data reliably, nor to worry about the unavailability of data. Consequently, it can avoid the security problems of data in the cloud. Many researchers have recently conducted studies on the application of blockchain in the medical field. To achieve the secure sharing of medical data, Peterson et al. [6] proposed a consensus mechanism based on blockchain technology, but the communication overhead of node consensus is high. Liang et al. [7] presented a secure data transmission scheme based on the Fabric blockchain to improve the security of data transmission and reduce the communication overhead. Ekblaw [8] presented a management system for electronic medical records that ensures the accuracy of medical records by using the non-tamperability of blockchain. However, scheme [8] does not specify an access control policy for data access, which may lead to the accidental exposure of medical records. Scheme [9] achieves the management of distributed medical data by utilizing smart contracts and access control, but the use of the proof-of-work consensus mechanism results in a high computational cost for blockchain. Siyal et al. [10] analyzed the challenges faced by the application of blockchain in the field of medicine, and argued that the public verifiability of blockchain allows electronic medical records to be verified without any third party. Nevertheless, scheme [10] cannot ensure the reliability of the data source, resulting in the decrease of data availability. Sun et al. [11] applied an attribute-based signature (ABS) scheme to a blockchain system, which enables medical data to be shared between medical institutions and the authenticity of the data source to be verified. In this mechanism, the signer's attributes are verified and the identity of the signer is protected simultaneously. However, as the volume of medical data grows, the blockchain system incurs a large storage overhead as well as computational overhead. To store medical data efficiently and securely, Wang and Song [12] presented a secure medical data storage system that combines cloud storage and blockchain technology, thereby significantly improving the supervision of the system and ensuring the integrity and traceability of medical data. However, the computational overhead was not decreased significantly.
In this paper, we propose the first open medical data sharing scheme based on blockchain and cloud storage technology, which allows multiple users to share and access medical data at the same time, and satisfies the requirements for confidentiality, integrity, non-tamperability, anonymity, and verifiability simultaneously.

A. OUR CONTRIBUTIONS
In this paper, we design a medical data sharing scheme based on blockchain technology and attribute-based cryptosystem, which provides efficient storage and secure sharing services for medical data. Our contributions are as follows.
• We design a new medical data sharing scheme, which submits the encrypted medical data to the cloud and writes the corresponding storage address and medical-related information into the blockchain. This can ensure efficient storage of medical data and prevent the CSP from tampering with the data.
• The proposed scheme achieves the confidentiality and privacy of medical data. In our scheme, the patient formulates specific access policies and authorizes doctors to encrypt medical data by using ABE, which can ensure flexible access control to cloud medical data and enable the patient to fully participate in the sharing of medical data.
• Our scheme can guarantee the integrity of the medical data and verify the authenticity of the medical data source without revealing the patient's identity or compromising its privacy. The ABS protocol permits the signer to sign medical data by using a set of attributes rather than his or her identity, which plays a very important role in data authentication and identity-privacy preservation.
• The proposed scheme has lower computation overhead than similar schemes. Based on the outsourced decryption mechanism, the data user entrusts the CSP to perform the partial decryption of the medical data ciphertext. Therefore, the data user only needs to execute simple calculations to complete the decryption operation, thereby reducing the computational burden on users who access data. In addition, our scheme provides a verification function for transformed ciphertext, which can prevent malicious attacks in the cloud and ensure the correctness of the transformed ciphertext.
• Our scheme can resist the chosen ciphertext attack (CCA) under the modified decisional Diffie-Hellman (MDDH) assumption. Meanwhile, the proposed scheme satisfies the requirements of unforgeability and anonymity.

B. ORGANIZATION
The remainder of this paper is organized as follows. Section II introduces related work. Section III reviews some preliminaries, including bilinear maps, linear secret-sharing schemes (LSSS), AND gate policy, and MDDH assumption. The system model and detailed description of our scheme are presented in Sections IV and V, respectively. Section VI analyzes the security and performance of our scheme from the perspectives of confidentiality, unforgeability, and anonymity. Section VII presents our conclusion.
VOLUME 8, 2020

II. RELATED WORK
This section is mainly concerned with the cryptographic technology used to achieve secure sharing of medical data. Doukas et al. [13] used the traditional public key infrastructure (PKI) technology to encrypt medical data, avoiding attacks by eavesdroppers, but this scheme incurs a large certificate management overhead. Scheme [14] achieved simple role-based access control by using identity-based encryption (IBE), but it suffers from the key escrow problem. Schemes [13], [14] both realize many-to-one encryption; that is, only one user can decrypt data. To address these issues, Li et al. [15] proposed a medical data sharing scheme based on attribute encryption, in which medical data are encrypted according to the users' set of attributes, so that multiple users with corresponding keys can decrypt the data. Moreover, this makes data encryption and key management more efficient.
With the development of ABE, research related to ABS began to appear. Scheme [16] designed a medical data sharing scheme based on attribute signature, which can verify the source and integrity of medical data and protect the privacy of the signer. To reduce the computational burden on data users, scheme [17] applies an outsourcing calculation mechanism to share medical data. Shamir [18] first introduced the notion of IBE. Boneh and Franklin [19] put forward the first secure IBE scheme by employing bilinear mapping. On the basis of [18], [19], Sahai and Waters [20] constructed a fuzzy IBE scheme, and at the same time extended their ideas to propose the notion of ABE. In order to express a more flexible access strategy, Goyal et al. [21] presented a key-policy attribute-based encryption scheme, but the scheme only supports monotonic access control structures. Ostrovsky et al. [22] proposed an ABE scheme with non-monotonic access structures, extending the access control structure of attribute-based schemes from monotonic to non-monotonic. However, data owners lack control over access strategies. Cheung and Newport [23] presented a ciphertext-policy attribute-based encryption scheme, which takes the user's identity information as the attribute and allows the data owner to formulate, and thus fully control, the access policy. The access structure of this scheme supports the logical relationship between positive and negative attributes, and the scheme was proved to resist ciphertext forgery attacks. Nevertheless, in this scheme, the access structure is simple and the public parameters are long, so its efficiency is low. Subsequently, many ABE schemes with special properties appeared [24]-[27], and the schemes [26], [27] were extended and applied in practice.
Yang et al. [28] first introduced a fuzzy identity-based signature scheme and proved the unforgeability of the scheme by using the computational Diffie-Hellman assumption. Maji et al. [29] put forward an ABS scheme that can protect the signer's identity and resist collusion attacks by different users. However, the computing performance of this scheme is relatively low. In addition, this scheme only resists selective message attacks in the generic group model. To enhance the security of this scheme, Li et al. [30] presented two ABS schemes that support the threshold structure. Lewko and Waters [31] presented a distributed attribute-based encryption (DABE) scheme, which can be applied to distributed networks. Inspired by [31], Sun et al. [11] presented a decentralizing attribute-based signature (DABS) scheme, which is applicable to the blockchain system.
Green et al. [32] introduced an attribute-based encryption with outsourced decryption (OD-ABE) scheme, which delegates a large amount of decryption calculation to the proxy server, and the proxy server sends the partially decrypted ciphertext to the subscriber, who can decrypt the ciphertext with only a small amount of computation. However, the proxy server is considered to be semi-honest. In order to ensure that the proxy server correctly performs the partial decryption process, Lai et al. [33] proposed a verifiable OD-ABE scheme. Li et al. [34] presented a checkable OD-ABE scheme that can ensure the validity of the message and outsourced calculation results. At the same time, many OD-ABE schemes [35]-[37] were proposed, but these schemes cannot achieve CCA security. Zuo et al. [38] proposed a CCA secure OD-ABE scheme and proved its security. However, designing a secure and efficient ABE scheme remains an open challenge.

III. PRELIMINARIES
In this section, we review the notations and definitions related to the proposed scheme.

A. BILINEAR MAPS
Let $G$ and $G_T$ be two multiplicative cyclic groups of prime order $p$, where $g$ is a generator of $G$. The map $\hat{e} : G \times G \to G_T$ is said to be a bilinear map if it satisfies the following properties [39].
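For reference, the three properties usually required of a bilinear map are stated below. This is the standard textbook formulation, and it is assumed rather than quoted to match the list in [39].

```latex
\begin{align*}
&\text{Bilinearity: } \hat{e}(u^{a}, v^{b}) = \hat{e}(u, v)^{ab}
   \quad \text{for all } u, v \in G \text{ and } a, b \in Z_p,\\
&\text{Non-degeneracy: } \hat{e}(g, g) \neq 1_{G_T},\\
&\text{Computability: } \hat{e}(u, v) \text{ can be computed efficiently for all } u, v \in G.
\end{align*}
```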

B. LINEAR SECRET SHARING SCHEME
The aim of the linear secret-sharing scheme (LSSS) [40] is to split a secret into shares in an appropriate way, with each share managed by a different participant. A single participant cannot recover the secret, and only an authorized set of participants can cooperate to recover it. The specific description is as follows.
(1) Secret generation: The secret distributor chooses an $x \times j$ matrix $M$, called the share-generating matrix, and a column vector $v = (s, r_2, \dots, r_j)^{T}$, where $s \in Z_p$ is the secret value to be shared and $r_2, \dots, r_j \in Z_p$ are random elements.
(2) Secret distribution: The secret distributor distributes the secret value $s$ among $x$ members $U_1, \dots, U_x$, where the secret share owned by the $k$-th member $U_k$ is $\lambda_k = M_k \cdot v$, $M_k$ is the $k$-th row of the matrix $M$, and the function $\rho(k)$ labels the $k$-th row.
(3) Secret recovery: Let $K$ be an authorized set. Then there exist constants $\{\omega_k \in Z_p\}_{k \in K}$ such that the authorized users can recover the secret value from their shares $\{\lambda_k\}_{k \in K}$ as $s = \sum_{k \in K} \omega_k \lambda_k$.
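The three steps above can be illustrated with a small numeric sketch. It assumes a toy prime modulus, a hand-picked 2 × 2 matrix for the policy "attribute 1 AND attribute 2", and hypothetical function names. None of these come from the paper, and the recovery constants are found here by plain Gaussian elimination rather than by any method the authors specify.

```python
# Toy LSSS over Z_p: share a secret with a matrix M, then recover it.
# P, M, and all function names are illustrative assumptions.
import random

P = 2**61 - 1  # a Mersenne prime standing in for the group order p


def make_shares(M, s, rng):
    """Return the shares lambda_k = M_k . v with v = (s, r_2, ..., r_j)."""
    j = len(M[0])
    v = [s] + [rng.randrange(P) for _ in range(j - 1)]
    return [sum(m * x for m, x in zip(row, v)) % P for row in M]


def recovery_constants(rows):
    """Solve sum_k w_k * rows[k] = (1, 0, ..., 0) mod P.
    Returns None when the rows cannot produce the target (unauthorized set)."""
    n, j = len(rows), len(rows[0])
    # One equation per matrix column c; unknowns are w_0 .. w_{n-1}.
    A = [[rows[k][c] % P for k in range(n)] + [1 if c == 0 else 0]
         for c in range(j)]
    pivots, r = [], 0
    for col in range(n):
        piv = next((i for i in range(r, j) if A[i][col]), None)
        if piv is None:
            continue
        A[r], A[piv] = A[piv], A[r]
        inv = pow(A[r][col], -1, P)
        A[r] = [a * inv % P for a in A[r]]
        for i in range(j):
            if i != r and A[i][col]:
                f = A[i][col]
                A[i] = [(a - f * b) % P for a, b in zip(A[i], A[r])]
        pivots.append(col)
        r += 1
    if any(row[-1] for row in A[r:]):
        return None  # the system is inconsistent
    w = [0] * n
    for i, col in enumerate(pivots):
        w[col] = A[i][-1]
    return w


def recover(rows, shares):
    """Recombine shares as s = sum_k w_k * lambda_k, or None if unauthorized."""
    w = recovery_constants(rows)
    if w is None:
        return None
    return sum(wk * lk for wk, lk in zip(w, shares)) % P


rng = random.Random(0)
M = [[1, 1], [0, -1]]              # policy: attribute 1 AND attribute 2
shares = make_shares(M, 42, rng)
print(recover(M, shares))          # both rows together
print(recover(M[:1], shares[:1]))  # a single row is not authorized
```

With both rows the constants are (1, 1), since (1, 1) + (0, -1) = (1, 0), so the two shares sum to the secret. A single row cannot be scaled to (1, 0), so recovery fails.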

D. MODIFIED DECISIONAL DIFFIE-HELLMAN (MDDH) ASSUMPTION
The MDDH problem is to distinguish between $(g, g^{a}, \hat{e}(g, g)^{b}, \hat{e}(g, g)^{ab})$ and $(g, g^{a}, \hat{e}(g, g)^{b}, Z)$, where $a, b \in Z_p$ are random and $Z$ is a random element of $G_T$.
Definition 1: We say that the MDDH assumption holds if there is no polynomial time algorithm that solves the MDDH problem with a non-negligible probability [41].

IV. MODEL OF SYSTEM
There are six entities in the system model of our scheme, namely the attribute authority organization (AAO), patients, hospitals, the blockchain system, the CSP, and medical data users (such as medical institutions and insurance companies), as illustrated in Figure 1.
(1) The AAO is mainly responsible for distributing the corresponding attribute signature key $SIK_{i,GID}$, transformation key $tk$ and private key $d$ to the patient, medical data users and hospitals respectively.
(2) The patient formulates the access control policy and sends it to the hospitals. Then, the patient generates the signature σ of the medical-related information $m_0$ according to the LSSS. Finally, the patient sends σ and $m_0$ to the data pool.
(3) The hospital encrypts the medical data according to the access structure specified by the patient and sends the ciphertext CT to the data pool.
(4) The blockchain system consists of a data pool, a blockchain and a consensus network. The data pool stores medical data ciphertext, medical-related information and its signature. To improve the security of the medical data, a consortium blockchain is constructed in the system. The alliance members include medical data users, hospitals, research institutes and accounting nodes, and they maintain the blockchain jointly. The consortium blockchain is responsible for storing medical-related information and the address of encrypted medical data and ensuring that the content on the blocks is immutable. In the consensus network, the proof-of-stake (PoS) mechanism is used in the consensus process to ensure the security of the blockchain ledger. The consensus nodes first implement the PoS mechanism to select the accounting nodes, which can realize the distributed consensus of the blockchain. Next, the accounting nodes send the medical data ciphertext to the cloud and obtain the data access address from the cloud. Finally, the accounting nodes write the storage address of the cloud medical data ciphertext and the medical-related information to the blockchain. Compared with a distributed server, blockchain has the characteristics of decentralization, verifiability and immutability, which are essential in our system.
(5) The CSP mainly stores medical data ciphertext and sends the address of the ciphertext to the blockchain. The CSP is also authorized by the data users to perform a partial decryption of the encrypted medical data.
(6) Medical data users initiate medical data access requests by submitting their attribute sets to the blockchain system. If the verification is passed, data users obtain the medical data ciphertext address sent by the accounting node. Then, the data users send the address and the transformation key to the CSP, which is authorized to partially decrypt the medical data ciphertext. Finally, the users completely decrypt the medical data ciphertext by using the retrieval key.
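Item (4) states that the consensus nodes use PoS to select the accounting nodes but does not say how. One common reading is stake-weighted random selection driven by a seed that all nodes share. The sketch below follows that reading only as an assumption. The node identifiers, the stake table, and the use of a previous block hash as the seed are all hypothetical.

```python
import hashlib
import random


def select_accounting_nodes(stakes, seed, count):
    """Pick `count` distinct nodes, each draw weighted by remaining stake.

    stakes: dict mapping node id -> positive stake (hypothetical data model)
    seed:   bytes known to every consensus node (e.g. a previous block hash),
            so that all honest nodes derive the same selection
    """
    rng = random.Random(hashlib.sha256(seed).digest())
    remaining = dict(sorted(stakes.items()))  # fixed order for determinism
    chosen = []
    for _ in range(min(count, len(remaining))):
        nodes = list(remaining)
        weights = [remaining[n] for n in nodes]
        pick = rng.choices(nodes, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]  # a node is selected at most once
    return chosen


stakes = {"hospital_a": 50, "institute_b": 30, "insurer_c": 20}
print(select_accounting_nodes(stakes, b"previous-block-hash", 2))
```

Because the random generator is seeded from shared data, every node that runs the function on the same stake table obtains the same accounting nodes without extra communication.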
• Key Generation: When users register in the system, they can get the corresponding key from the AAO.

Phase 1 (Generation of Private Key):
After receiving the attribute set $S = \{S_1, \dots, S_w\} \subseteq N$ sent by a hospital, the AAO generates the corresponding attribute private key. The detailed steps are as follows.
1) Randomly choose $r_i \in Z_p$, where $i \in [1, n]$, and calculate $r = \sum_{i=1}^{n} r_i \bmod p$ and $D = g^{\phi - r}$.
• Medical Data Upload
Phase 1 (Signature Generation):
First, the patient chooses an AND gate $\bigwedge_{i \in I} \tilde{i}$ for the medical data $m$, where $I \subseteq N$ and each attribute $i \in I$ satisfies $\tilde{i} = +i$ or $\tilde{i} = -i$. Then, medical-related information $m_0$ (including medical record type, diagnostic date, and access control structure) is generated. Finally, the patient generates the signature σ of $m_0$ by the following steps.
1) First, select an $x \times j$ access matrix $M$ with a function ρ that maps each row of $M$ to an attribute, and randomly select $z \in Z_p$ and a vector $\nu \in Z_p^{j}$, where $z$ is the first element of $\nu$. Then, define $M_k$ as the $k$-th row of $M$ and calculate $\mu_k = M_k \cdot \nu$. Finally, randomly select a vector $\omega \in Z_p^{j}$, where the first element of $\omega$ is 0, and calculate $\omega_k = M_k \cdot \omega$.
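The reason for fixing the first elements of the two vectors can be made explicit. Assuming constants $\tau_k$ with $\sum_k \tau_k M_k = (1, 0, \dots, 0)$, as computed by the accounting nodes in the data upload phase, the values $\mu_k$ recombine to $z$ and the values $\omega_k$ recombine to 0:

```latex
\begin{align*}
\sum_k \tau_k \mu_k
  &= \sum_k \tau_k (M_k \cdot \nu)
   = \Big(\sum_k \tau_k M_k\Big) \cdot \nu
   = (1, 0, \dots, 0) \cdot \nu = z,\\
\sum_k \tau_k \omega_k
  &= \Big(\sum_k \tau_k M_k\Big) \cdot \omega
   = (1, 0, \dots, 0) \cdot \omega = 0.
\end{align*}
```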

Phase 3 (Data Upload):
After receiving the ciphertext CT of the medical data $m$ and the signature σ of the medical-related information $m_0$, the accounting nodes perform the following three steps.
1) Calculate $\tau_k$ such that $\sum_k \tau_k M_k = (1, 0, \dots, 0)$, then verify whether the signature verification equation holds. If the equation holds, perform the following steps; otherwise, discard the data.
2) Upload the ciphertext CT to the cloud.
3) Write the medical-related information $m_0$ and the address of the ciphertext CT to the blockchain.
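The control flow of these three steps can be sketched as follows. The sketch makes several assumptions that are not in the paper: the cloud is an in-memory store addressed by the SHA-256 hash of the ciphertext, the blockchain is a simple hash chain, and the signature check is passed in as a caller-supplied function `verify_signature`. All class and function names are hypothetical.

```python
import hashlib
import json


class CloudStore:
    """In-memory stand-in for the CSP; the address is the ciphertext's SHA-256."""

    def __init__(self):
        self._blobs = {}

    def put(self, ciphertext: bytes) -> str:
        address = hashlib.sha256(ciphertext).hexdigest()
        self._blobs[address] = ciphertext
        return address

    def get(self, address: str) -> bytes:
        return self._blobs[address]


def _digest(prev_hash, record):
    body = {"prev_hash": prev_hash, "record": record}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class Ledger:
    """Append-only hash chain standing in for the consortium blockchain."""

    def __init__(self):
        self.blocks = []

    def append(self, record: dict) -> dict:
        prev_hash = self.blocks[-1]["hash"] if self.blocks else "0" * 64
        block = {"prev_hash": prev_hash, "record": record,
                 "hash": _digest(prev_hash, record)}
        self.blocks.append(block)
        return block

    def is_consistent(self) -> bool:
        """Recompute every hash; any edit to a stored record is detected."""
        prev_hash = "0" * 64
        for block in self.blocks:
            if block["prev_hash"] != prev_hash:
                return False
            if block["hash"] != _digest(block["prev_hash"], block["record"]):
                return False
            prev_hash = block["hash"]
        return True


def upload(ciphertext, m0, sigma, verify_signature, cloud, ledger):
    """Steps 1-3 of the data upload phase; returns the new block or None."""
    if not verify_signature(m0, sigma):   # step 1: discard on failure
        return None
    address = cloud.put(ciphertext)       # step 2: ciphertext goes to the cloud
    return ledger.append({"m0": m0, "address": address})  # step 3
```

Keeping only `m0` and the address on the ledger reflects the division of labor described in the text: the bulky ciphertext stays in the cloud, while the chain holds a small record whose later modification would break the hash links.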

• Medical Data Access
Phase 1 (Outsourced Decryption of Medical Data):
After receiving the transformation key $tk$ and the ciphertext CT of medical data submitted by the medical data users, the CSP performs the following three steps.
1) Check whether the verification equations hold. If any of them does not hold, the operation is aborted; otherwise, perform steps 2 and 3.
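The division of work between the CSP and the data user can be illustrated with a deliberately simplified toy. It is not the ABE construction of this paper. It uses plain ElGamal over a tiny group, and it borrows only the key-blinding idea commonly used for outsourced decryption (in the spirit of [32]): the user blinds the secret key with a random $z$, hands the blinded key to the CSP as the transformation key, and keeps $z$ as the retrieval key. The parameters and function names are assumptions, and the ciphertext verification step is omitted.

```python
import secrets

P = 1019  # toy safe prime, P = 2*Q + 1 (far too small for real use)
Q = 509   # prime order of the subgroup of squares modulo P
G = 4     # generator of that subgroup


def keygen():
    x = secrets.randbelow(Q - 1) + 1      # long-term secret key
    return x, pow(G, x, P)                # (secret x, public h = g^x)


def encrypt(h, m):
    """ElGamal encryption of m under the public key h."""
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), m * pow(h, r, P) % P


def make_transformation_key(x):
    """User side: tk = x / z mod Q goes to the CSP, rk = z stays local."""
    z = secrets.randbelow(Q - 1) + 1
    tk = x * pow(z, -1, Q) % Q
    return tk, z


def csp_transform(tk, ct):
    """CSP side: partial decryption, yielding g^(r*x/z) but not the mask h^r."""
    c1, c2 = ct
    return pow(c1, tk, P), c2


def user_finish(rk, partial):
    """User side: one exponentiation with rk removes the blinding."""
    t, c2 = partial
    mask = pow(t, rk, P)                  # (g^(r*x/z))^z = h^r
    return c2 * pow(mask, -1, P) % P


x, h = keygen()
m = pow(G, 123, P)                        # a message encoded as a group element
tk, rk = make_transformation_key(x)
partial = csp_transform(tk, encrypt(h, m))
print(user_finish(rk, partial) == m)
```

In this toy the user saves nothing, because ElGamal decryption is already a single exponentiation. The benefit appears in ABE, where the transformation step absorbs the pairing operations whose number grows with the number of attributes.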

VI. SECURITY AND PERFORMANCE ANALYSIS
A. SECURITY ANALYSIS
1) CONFIDENTIALITY
Theorem 1: The proposed scheme satisfies confidentiality in the random oracle model if the MDDH assumption holds.
Proof: We use the proof method of Zuo et al. [38] to prove the security of our scheme. If there exists an adversary $\mathcal{A}$ that breaks the confidentiality of the proposed scheme with a non-negligible probability ε, then there is an algorithm $\mathcal{B}$ that can solve the MDDH problem. Given an MDDH instance $(g, A = g^{a}, B = \hat{e}(g, g)^{b}, Z) \in G^{2} \times G_T^{2}$, $\mathcal{B}$ performs a security game with $\mathcal{A}$ to determine whether $Z = \hat{e}(g, g)^{ab}$.
The following proves that the confidentiality in our scheme can be reduced to the hardness of the MDDH problem under the chosen ciphertext attack.
Initialization: After receiving the challenge access structure $\bigwedge_{i \in I} \tilde{i}$ sent by the adversary $\mathcal{A}$, $\mathcal{B}$ randomly selects $\phi, t_1, \dots, t_{3n} \in Z_p$, and calculates $\varphi = \hat{e}(g, g)^{\phi}$ and the remaining public parameters.
Phase 1: $\mathcal{A}$ initiates the following hash queries, and $\mathcal{B}$ responds as follows.
• $H_1$-queries: $\mathcal{B}$ creates a list $L_1$ (initially empty). $\mathcal{A}$ sends $V_1 \in \{0, 1\}^{2l}$ to $\mathcal{B}$. If $(V_1, h_1)$ exists in list $L_1$, $\mathcal{B}$ returns $h_1$ to $\mathcal{A}$. Otherwise, $\mathcal{B}$ randomly selects $h_1 \in Z_p$, sends it to $\mathcal{A}$, and adds $(V_1, h_1)$ to list $L_1$.
• $H_2$-queries: $\mathcal{B}$ creates a list $L_2$ (initially empty). $\mathcal{A}$ sends $V_2 \in G_T$ to $\mathcal{B}$. If $(V_2, h_2)$ exists in list $L_2$, $\mathcal{B}$ returns $h_2$ to $\mathcal{A}$. Otherwise, $\mathcal{B}$ randomly selects $h_2 \in \{0, 1\}^{l}$, sends it to $\mathcal{A}$, and adds $(V_2, h_2)$ to list $L_2$.
• $H_3$-queries: $\mathcal{B}$ creates a list $L_3$ (initially empty), $\mathcal{A}$ sends ...
• $O_{rk}$: Similar to $O_{tk}$, the only difference is that $\mathcal{B}$ sends the retrieval key $rk$ to $\mathcal{A}$.
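• $O_{od}$: After receiving $(C_i, S, i)$ sent by $\mathcal{A}$, $\mathcal{B}$ first checks whether $S$ satisfies the access structure corresponding to $C_i$. If not, $\mathcal{B}$ outputs ⊥; otherwise, $\mathcal{B}$ sends $(tk, C_i)$ to $\mathcal{A}$.
• $O_{dec}$: After receiving $(C_i, S, i)$ sent by $\mathcal{A}$, $\mathcal{B}$ first checks whether $S$ satisfies $C_i$ and the corresponding access structure. If not, $\mathcal{B}$ outputs ⊥; otherwise, $\mathcal{B}$ performs the following two steps.
1) If $i \neq i^{*}$ or $S$ does not satisfy the challenge access structure, $\mathcal{B}$ uses $rk$ to calculate $m$, and sends $m$ to $\mathcal{A}$.
2) If $i = i^{*}$ and $S$ satisfies the challenge access structure, it can be found that the pairs ... If not, $\mathcal{B}$ outputs ⊥; otherwise, it checks whether $C_5 = \hat{e}(A, g)^{\phi \cdot h_1}$ holds. If it does not, $\mathcal{B}$ outputs ⊥; otherwise, $\mathcal{B}$ sends $m$ to $\mathcal{A}$.
Challenge: $\mathcal{A}$ selects two equal-length plaintexts $m_0^{*}$ and $m_1^{*}$. $\mathcal{B}$ sets $C_1 = \zeta \oplus H_2(B^{\phi})$, $C_2 = m_\theta^{*} \oplus H_3(\zeta)$, and $C_3 = Z^{\phi}$, where $\zeta \in \{0, 1\}^{l}$. If $Z = \hat{e}(g, g)^{ab}$ is true, the challenge ciphertext is valid.
Guess: $\mathcal{A}$ outputs a guess $\theta'$. If $\theta' = \theta$, then $\mathcal{B}$ concludes that $Z = \hat{e}(g, g)^{ab}$ and CT is a valid ciphertext; otherwise, CT is an invalid ciphertext.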
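If $\mathcal{A}$ succeeded with a non-negligible probability, $\mathcal{B}$ could use the above simulation to solve the MDDH problem, which contradicts the assumed hardness of MDDH. Therefore, the probability of a successful attack is negligible; that is, our scheme meets the requirement for confidentiality.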

2) UNFORGEABILITY
According to the analysis results of [11], we suppose that the signature $\sigma = (\sigma_0, \sigma_1, \sigma_2)$ is valid for the medical data. In our scheme, the ABS protocol ensures that the medical data cannot be forged. Only the users who meet the access control policies formulated by the patient can calculate the signature of the medical data. It is impossible for a user who does not have the corresponding attribute $i$ to compute $\hat{e}(g^{\alpha_{\rho(M_k)}}, g)$ and $\hat{e}(H_4(GID)^{\gamma_{\rho(M_k)}}, g^{\tau_k})$. Since $\tau_k \in Z_p$ is randomly chosen by the user, the signature σ cannot be forged.

3) ANONYMITY
The AAO assigns corresponding attributes to all of the entities in the system, which are bound to the global identity identifier GID. When other users in the system check the correctness of the signature, only the attribute verification key associated with the signature can be successfully checked. That is to say, when data users look over the medical data, they can verify that the medical data were created by a legitimate user without learning that user's true identity. Therefore, our scheme can achieve the requisite anonymity.

B. PERFORMANCE ANALYSIS
In this section, we analyze and compare the computation cost between our scheme and that of Wang and Song [12]. The experimental environment is a Windows 10 (64-bit) operating system with an Intel Core i5 3GHz processor and 8 GB RAM. The encoding is implemented by using the JPBC 2.0 library. The notations used in this section are defined as follows:
Notation   Description
$E$        the exponential operations in group $G$
$E_T$      the exponential operations in group $G_T$
$E_Z$      the exponential operations in ring $Z_p$
$P$        the pairing operations
As can be seen in Table 1, the calculation cost of encryption and decryption operations in our scheme is significantly lower than that of Wang and Song [12]. In the phase of medical data encryption, our scheme needs $(n + 2)E + E_T$ operations. However, scheme [12] costs $P + (3n + 1)E + E_T$ operations, which requires an additional pairing and more exponential operations in group $G$. During the medical data decryption stage, our scheme outsources the medical data ciphertext to the CSP for partial decryption, so that the data user only needs $P + E_Z + 2E_T$ operations. In contrast, $(2n + 1)P + E_T$ operations are required by the data users to decrypt the medical data ciphertext in [12]. Hence, the computational overhead of our scheme is greatly reduced compared with [12].
As shown in Fig. 2, compared with scheme [12], our scheme has lower computational overhead in the stages of encryption, signature, and decryption. As illustrated in Fig. 3, the encryption time of both scheme [12] and our scheme grows with the number of attributes, but the time cost of [12] is always higher than that of our scheme. Fig. 4 shows that the decryption time of Wang and Song [12] grows linearly as the number of attributes increases. The decryption time cost of our scheme remains basically unchanged as the number of attributes grows, since the most complex decryption work is delegated to the CSP. In summary, the proposed scheme has high computational performance.
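The operation counts quoted above can be tabulated for any number of attributes $n$ with a short script. The counts are taken from the text. The per-operation timings in `UNIT_MS` are hypothetical placeholders, not measurements from the paper, and serve only to show how the counts translate into a rough cost estimate.

```python
def operation_counts(n):
    """Operation counts stated in the text for n attributes.

    P = pairing, E = exponentiation in G, E_T = exponentiation in G_T,
    E_Z = exponentiation in the ring.
    """
    return {
        "ours_encrypt":  {"P": 0,         "E": n + 2,     "E_T": 1, "E_Z": 0},
        "ref12_encrypt": {"P": 1,         "E": 3 * n + 1, "E_T": 1, "E_Z": 0},
        "ours_decrypt":  {"P": 1,         "E": 0,         "E_T": 2, "E_Z": 1},
        "ref12_decrypt": {"P": 2 * n + 1, "E": 0,         "E_T": 1, "E_Z": 0},
    }


def estimated_ms(counts, unit_ms):
    """Weighted sum of operation counts under the given unit costs."""
    return sum(counts[op] * unit_ms[op] for op in counts)


# Hypothetical unit costs in milliseconds; replace with measured values.
UNIT_MS = {"P": 10.0, "E": 5.0, "E_T": 1.0, "E_Z": 0.1}

for n in (5, 10, 20):
    row = {name: estimated_ms(c, UNIT_MS)
           for name, c in operation_counts(n).items()}
    print(n, row)
```

Whatever unit costs are substituted, the user-side decryption count of our scheme does not depend on $n$, while the decryption count of [12] grows linearly in $n$, which matches the trend described for Fig. 4.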

VII. CONCLUSION
This paper proposes a new medical data sharing scheme that combines the advantages of cloud storage and blockchain technology. Our scheme uses the cloud server to store encrypted medical data, and the blockchain system to preserve the address of corresponding medical data ciphertext and medical-related information. Therefore, the proposed scheme satisfies the requirements for immutability and unforgeability. By using the attribute-based cryptosystem, the confidentiality of medical data in the cloud can be ensured, and the authenticity of the medical data source can be verified. Furthermore, the computational burden of medical data users can be alleviated through the use of the OD-ABE mechanism. The analysis results show that the proposed scheme has high performance in both computation overhead and security.