A Federated Framework for Fine-Grained Cloud Access Control for Intelligent Big Data Analytics by Service Providers

This paper proposes a novel data-owner-driven, privacy-aware cloud data acquisition framework for intelligent big data analytics by service providers. To realize this idea, we propose three main components. The first is a new global identity provider concept that supports fine-grained access control for a federated outsourcing cloud, called P-FIPS (Privacy-enhanced Federated Identity Provider System), in which data owners perform identity access control with the operator of the federated outsourcing cloud so that service providers can selectively use the owners' encrypted data on the cloud for various purposes, such as intelligent big data analytics. In P-FIPS, data owners manage the access privileges of service providers over their encrypted data on the cloud by (a) labeling the scope of use (e.g., user connection, user disconnection, user tracking) on each piece of encrypted data, and (b) selectively providing information about themselves to the service provider. The label also includes attributes related to the data owner's identity, which allows service providers to locate the target data, with the assistance of cryptographic computation at the cloud outsourcing server, according to the scope of use. The second component is a new ambiguous data acquisition mechanism, integrated with P-FIPS, from the cloud to a service provider. The last is the Decentralized Audit and Ordering (DAO) Chain mechanism, which assures the service provider of the correctness of the obtained data and assures owners that their data is being used only for the approved purposes. Most importantly, we show that our framework is much more efficient than existing alternative schemes.


I. INTRODUCTION
IoT Analytics predicts that by 2025, more than 20 billion devices will be connected to the internet [1]. With the explosive increase of IoT devices in everyday environments, many global services are making innovative changes centered on user data by discovering hidden data insights through digitization and intelligent big data analytics, and the speed of such changes continues to increase [2]. The problem, however, is that the amount of raw data generated every day is truly enormous, and the most apparent problem is managing and optimizing a variety of complex systems. It might seem that a centralized approach can maximize processing efficiency, but it is challenging to apply to legacy systems. Data silos make it difficult to integrate the data needed for advanced analysis [3]. These limitations of data significantly affect the quality of the data used by artificial intelligence (AI) and the reliability of a model's predictions. New technologies such as distributed learning provide a path forward, but unfortunately, a lack of transparency tends to undermine confidence in the data used for analysis [4]. In addition, although the data consists of personal information or content produced by users of the service, control over this data is monopolized by the service providers. Therefore, how to guarantee each user sovereignty over the data they produce, and how to solve the privacy problems that may arise from misuse of personal data, are emerging issues.
(The associate editor coordinating the review of this manuscript and approving it for publication was Yang Xiao.)
Typically, all of a user's data collected from IoT devices resides on a competent IT infrastructure (e.g., a cloud center) as a digital/logical/cyber/virtual representation or replica of a physical system [5]. The cloud data center builds a central cloud server and processes the real-time data generated by each distributed outsourcing server. Semi-trusted outsourcing servers are vulnerable to malicious attacks (data learning, data inference, forgery, man-in-the-middle attacks, etc.) [6]. Therefore, users usually apply conventional encryption for privacy protection. However, once the data is stored and processed, returning the entire ciphertext to the user would significantly increase the user's computing and network overhead, contrary to the original intent [7], [8]. Customized information can be provided to users through attribute-based encryption [9]-[11]. However, if attribute policies are published, attackers can link and infer relationships between attributes and individual users. This is the first challenge: data sharing must be both lightweight and privacy-preserving. Outsourced servers provide support for data security, and searchable encryption technology has been extensively studied alongside cloud adoption [12]-[14].
To preserve privacy, outsourcing servers should hold as few associations between identifiers and encrypted files as possible [15], [16]. To address these issues, several ambiguous cryptographic transmission schemes have been adopted. However, in order to provide a single version of truth for the interaction, an additional trust manager is needed, resulting in undesirable delays in responding to requests [17], [18]. Blockchain, a new decentralized database paradigm, can provide the promised value for digital artifacts and provide transparency by recording the parties' transactions without a third party [19]. However, from a practical viewpoint, it is difficult for a peer-to-peer (P2P) blockchain to support a big data environment.
To use data after it is stored, users request identification and storage authorization from the service company (server). The cloud IDP (Identity Provider) provides efficient user identification and data access control at the same time by mediating data access between users and service providers [20]. However, the traditional IDP structure remains rooted in the client-server structure of web-based services. Numerous users (clients) provide their personal information to internet service companies (servers) in exchange for free use of the service, and the internet service companies exclusively manage all information, such as user identifiers and data [21]. Each user receives exclusive services from their respective Service Providers (SPs) in a form centralized at each company. Each service provider individually performs all of the essential service functions: identification (I: Identification), data storage (S: Storage), and service application (A: Application). Therefore, users can neither recognize nor control this form of corporate monopoly [22]. The recent IDP model provides horizontal trust, unlike the previous one. The blockchain-based decentralized IDP [23]-[25] specified by the World Wide Web Consortium (W3C) is called DID; it provides a single truth and defines the privacy boundaries of users. First, the user makes a decision and then negotiates through communication with peers. However, a DID [26] is just a token that temporarily grants permission for a specific task and does not contain any information about the user. User identifiers can be accumulated through the blockchain and linked to user personal information. This is the second challenge: to technically dissolve a specific operator's monopoly on user data, identification, storage, and application need to be related within one framework [27], [28]. Table 1 summarizes the requirements that the framework should consider.
Balancing the limited computing power of IoT devices against the user's data availability, the framework considers the following four perspectives (availability, efficiency, privacy, security) and their sub-functions.

A. NAIVE SOLUTION
Our research goal is to allow access to privacy-enhanced attribute data by using blockchain and cloud outsourcing. To state our research objectives, we introduce the privacy policies that the GDPR and global companies consider, as follows [29]: (a) data minimization, which allows access to only the minimum necessary data using technology; and (b) transparency and data management, which allow users to check the collected data and make their own selections. The key contributions of the proposed research are summarized as follows. (a) P-FIPS: Users control service providers' use of their data by labeling the scope of use of information (user connection, user disconnection, user tracking). Privacy labeling stores user attribute data on outsourcing servers and allows service providers to access information through cryptographic computation according to the user's labeling. We provide efficiency by applying an outsourced cloud and searchable encryption. Here, users lead the privacy labeling and the search keywords for it, thereby enhancing privacy [22]. (b) DAO Chain: Considering the honest-but-curious model, a blockchain is applied for auditing without a TTP. The blockchain acts as a rewind simulator by capturing the valid information of participants and the behavior of the protocol. Objective verification provides a balance of information among users, information providers, and service providers. In addition, we achieve efficiency by separating the chain that records operation performance and verification information via a state channel [30], [31]. (c) Ambiguous Data Acquisition Mechanism: Oblivious keyword search with authorization (OKSA) creates a trapdoor based on the keyword set, the information provider's token, and the service provider's private key, so that the parties are mutually authenticated and data is shared [27].
In addition, the oblivious trapdoor ensures that the user-connected keywords are known to belong to the keyword set but cannot be distinguished, for privacy. Our framework is not a trivial combination of blockchain technology and an outsourcing cloud. It is built on a new combination of oblivious search and an on-/off-chain design. Through the concept of approval, each entity authenticates its relations and preserves privacy by checking the others through information imbalance. In other words, the entities combine their information to prevent contamination of the entire system and to provide forensics for legal and investigative audits if necessary. Users reduce their computation burden through a cooperative system and improve privacy through objective verification and data leadership. Beyond simple qualification verification, service providers can make data available, a concept that was not implemented in existing blockchains.

B. ORGANIZATION
The rest of this paper is organized as follows. Section II gives background on federated identity credentials and searchable encryption needed to understand our idea, Section III details the preliminaries from previous research, and Section IV presents an overview of the proposed framework, defining its objective, workflow, and system requirements. Section V introduces the construction of our framework and its protocol. Then, in Section VI, we analyze our framework against the system requirements, implement a simulation, and verify its efficiency in comparison with other schemes. Finally, Section VII concludes this research along with future directions.

II. RELATED WORK
This chapter describes frameworks and cryptographic algorithms for identification, data processing, and storage. The following related work describes existing research relevant to improving cloud identity provider frameworks.

A. FEDERATED IDENTITY CREDENTIAL
In the IDP paradigm, the IDP manages the user's personal data and identity, while various providers handle multiple accounts and attributes. Open authentication (OAuth) works by defining a shared secret, called a ''token,'' between the SP and the user-mediated IDP. The OAuth token, however, has no privacy properties [15]. While the user facilitates the token creation, attributes such as personal data are transferred directly from the IDP to the SP once the user's token has been approved. The user is ''out of the loop'' at this point, and the nature of the personal data being transferred is uncertain. In short, traditional solutions applied a centralized identity provider, and such an IDP can neither reject improper user authentication nor protect against denial-of-service attacks on permissions.
Personal data management systems [19] have been proposed with the help of blockchain technology to enable users to control their data. Blockchain-based identity management is provisioned using decentralized trust methods without a single identity provider [24]. Key pairs explicitly represent the user, and a single blockchain is maintained through agreements between verifiers and nodes that use consensus algorithms to agree on the state of the blockchain [26]. However, user data may face more privacy issues, such as correlation attacks, than traditional federated IDs with OAuth, because the user's attributes are distributed on the blockchain [32]. Thus, privacy-enhancing techniques such as anonymous authentication credentials authenticate users and transfer data while ensuring privacy through encryption [32].
Many schemes for optional public credentials based on RSA or DH assumptions have emerged as a result of Chaum's work on blind signatures [34], but these schemes usually require a centralized credential provider and cannot be publicly verified [20]. Later, a healthcare chain [33] was proposed to promote data interoperability and confidentiality in health information networks. Also, [26] proposed Enhanced Privacy ID (EPID), which applied Direct Anonymous Attestation (DAA) and zero-knowledge proofs (i.e., the Camenisch-Lysyanskaya signature) to the blockchain. The authors in [36] proposed issuing and managing internal and external certification separately for anonymous certification. To ensure both data utilization and data confidentiality, the authors in [27] proposed data exchange on distributed storage based on OKSA in their framework instead of specific algorithms. However, existing studies, including existing DIDs, provided validation and range proofs of attributes but did not consider making the attributes usable. Existing schemes do not consider token distribution and transfer, or consider only environments tailored to peer-to-peer public blockchains [35]. Blockchain adopts the evolution of personal information in the concept of security tokens. A security-token-based blockchain can help users address some fundamental issues related to privacy and governance and improve trust and scalability [9], [33].
Therefore, first, privacy can be kept largely off-chain, relying heavily on trusted central institutions to access information and keep it local. Next, privacy solutions based on state chains can separate data into different sets and hide it across the public network. Finally, privacy can reside directly on the chain in a more specialized security-token blockchain, so that owner and property information can control privacy access levels.

B. SEARCHABLE ENCRYPTION (SE)
A cloud outsourcing server (COS) uses SE technology to offer critical information retrieval services to cloud clients while protecting their privacy. Symmetric searchable encryption (SSE) is more efficient, but it has a more difficult secret key distribution/management process during data sharing [12]. To address the key management issue, Zhang et al. introduced the principle of public key encryption with keyword search (PEKS) [13]. Following that, combinable multi-keyword search techniques based on public key encryption [10], [14] have been proposed to include a range of search functions. Both of the above methods, however, assume honest-but-curious cloud environments and cannot check the validity of search results. Usually, third parties have only partial confidence in the COS, which is inadequate since it can deliberately return false search results under various motivations. An honest-but-curious COS, for example, can run only part of the search job or return partially incorrect search results to save compute and bandwidth resources. To allow authorized private searches, public-key encryption with oblivious keyword search (PEOKS) [19] has been proposed. The k-out-of-n oblivious transfer (OT) system (OT_k^n) is the most popular OT variant: S has n messages, and R retrieves k messages at a time, while S does not learn which messages R receives. Based on a two-way OT protocol between the SP and the user, Ogata and Kurosawa [9] introduced the concept of oblivious keyword search to resolve user privacy concerns in keyword searches.
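To make the SSE idea above concrete, the following is a minimal sketch (not any cited scheme's actual construction): keywords are replaced by deterministic HMAC tags, so the server can match a trapdoor against its index without ever seeing a keyword in the clear. The function names are illustrative.

```python
import hashlib, hmac, os

def keyword_tag(key: bytes, keyword: str) -> bytes:
    # Deterministic tag: the server matches tags, never sees keywords.
    return hmac.new(key, keyword.encode(), hashlib.sha256).digest()

def build_index(key: bytes, docs: dict) -> dict:
    # docs maps document id -> keyword list; the index maps a keyword
    # tag to the ids of documents containing that keyword.
    index = {}
    for doc_id, keywords in docs.items():
        for kw in keywords:
            index.setdefault(keyword_tag(key, kw), []).append(doc_id)
    return index

def search(index: dict, trapdoor: bytes) -> list:
    # The COS evaluates the query while seeing only the opaque trapdoor.
    return index.get(trapdoor, [])

key = os.urandom(32)
index = build_index(key, {"d1": ["iot", "cloud"], "d2": ["cloud"]})
```

A client holding `key` derives trapdoors locally; anyone without `key` sees only pseudorandom tags, which is the core privacy/efficiency trade-off SSE makes relative to PEKS.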

III. PRELIMINARIES

A. BILINEAR PAIRINGS
A bilinear pairing maps a pair of elements from two cryptographic groups to a third group, G_1 × G_2 → G_T. Let G_1, G_2 be two additive cyclic groups of prime order q, and G_T another cyclic group of order q, written multiplicatively. A pairing is a map e : G_1 × G_2 → G_T which satisfies the following properties: • Bilinearity: e(aP, bQ) = e(P, Q)^{ab} for all P ∈ G_1, Q ∈ G_2, and a, b ∈ Z_q. • Non-degeneracy: e ≠ 1. • Computability: there exists an efficient algorithm to compute e.
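As a quick illustration of the bilinearity property (a standard identity for any bilinear pairing, not specific to our scheme), exponents move freely between the input points and the target group:

```latex
e(aP,\, bQ) = e(P, Q)^{ab}
  \quad\Longrightarrow\quad
e(2P,\, 3Q) = e(P, 3Q)^{2} = e(P, Q)^{6}.
```

This is what lets a server test relations between a trapdoor and an index (e.g., cancel a blinding exponent) directly on group elements, without learning any secret exponent.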

B. OBLIVIOUS KEYWORD SEARCH WITH AUTHORIZATION
In OKSA, the data user, such as a service provider (SP), produces a keyword token for any keyword in the authorized keyword set. Then the data provider, such as a cloud outsourcing server (COS), establishes the trapdoor with the obtained token, its private key, and the authorized keyword set [17]. OKSA is composed of the following algorithms, which are defined in detail in [27].
(a) Setup. A public/private key pair is generated using the security parameter δ and an integer n. The user and the COS negotiate a keyword set W, where |W| ≤ n.
(b) Encryption. The user encrypts each message for a keyword w, producing a ciphertext CT_i using w and the COS's public key. All ciphertexts are committed to the COS by the user.
(c) Data Request.
(i) Request. The keyword token P(w_i) is generated using the COS's input of the allowed keyword set W, a given keyword w_i, and the public key. Finally, the COS uses its private key to compute the accountability signature.
(ii) Commit. P(w_i) is sent from the COS to the SP. From the public/private key, keyword token, and keyword collection, P(w_i) is verified. The obtained token is used to create a trapdoor for exactly one keyword in the allowed keyword set; the signature helps the COS verify accountability.
(d) Data Retrieval.
(i) Trapdoor. The SP takes the authorized keyword set W, the keyword w′, and the COS's public key, then generates a trapdoor T_w′ and sends it to the COS.
(ii) Verification. The COS sends the trapdoor T_w′ back to the service provider once the verification is complete.
(iii) Data Decryption. The message m is decrypted using a ciphertext CT_i, the trapdoor T_w′, and the SP's private key if w = w′; otherwise the output is ⊥.
(e) Correctness. An oblivious keyword search with authorization is correct if the SP obtains the message of its choice whenever all entities obey the protocol steps above. Furthermore, the verification of accountability implies that the trapdoor was generated from a single specific keyword in the received token and that this specific keyword is in the permitted keyword set.
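The "oblivious" part of the exchange above can be sketched with a blind-exponentiation (OPRF-style) round trip. This is a simplified analogue, not the OKSA construction of [27]: the SP blinds its keyword before sending it, the COS applies its secret key blindly, and the SP unblinds the result into a per-keyword token. All parameter values are toy-sized and illustrative only.

```python
import hashlib, secrets

# Toy group: p = 2q + 1 is a safe prime and g = 4 generates the
# subgroup of prime order q. Far too small for real use.
p, q, g = 467, 233, 4

def hash_to_group(keyword: str) -> int:
    # Map a keyword to a non-identity element of the order-q subgroup.
    e = int.from_bytes(hashlib.sha256(keyword.encode()).digest(), "big")
    return pow(g, e % (q - 1) + 1, p)

def sp_blind(keyword: str):
    # SP blinds its keyword so the COS cannot see which one is queried.
    r = secrets.randbelow(q - 2) + 2
    return pow(hash_to_group(keyword), r, p), r

def co_eval(blinded: int, k: int) -> int:
    # COS applies its secret key k without learning the keyword.
    return pow(blinded, k, p)

def sp_unblind(evaluated: int, r: int) -> int:
    # SP removes the blinding to obtain H(w)^k, the per-keyword token.
    return pow(evaluated, pow(r, -1, q), p)
```

Because H(w)^r is uniformly distributed for random r, the COS learns nothing about w, while the SP ends up with a token it could not have computed without the COS's key: the same information imbalance OKSA relies on.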

IV. PROPOSED FRAMEWORK
We propose a labeled data access system with cloud outsourcing and a keyword search protocol with a data linking token. The communication costs between the information issuer service and the service provider are constant in scale. The proposed P-FIPS ensures that the service provider can produce the trapdoor data access token for any permitted keyword in the package, while the cloud operator cannot guess which one.

A. MAIN SYSTEM ENTITIES
The following are the major system entities in our proposed framework.
(a) System Manager. The system manager is in charge of the whole system. All users, information issuers, and service providers must register with the system manager.
It generates the system parameters and maintains the public keys of all clients. It also generates the consensus vector a for the blockchain network. The environmental factors of existing blockchain systems, such as endorsing peers and block generation leaders, are beyond the scope of this discussion. (b) User. As the data owner and service consumer, the user submits data through the CO to the service provider for a service. The user performs data labeling by specifying the data disclosure scope. Then, they encrypt the data, including the index, by extracting connection keywords. They then send the index and the signed ciphertext to the CO. Also, as participants in the blockchain network, users hold only the block headers. The detailed algorithm used to encrypt each data item is beyond the scope of this discussion, so any public key cryptographic algorithm can be applied. (c) Cloud Operator (CO). The CO, with expertise and capabilities as an outsourcing server, can provide data storage and resource access services to authorized cloud clients (users and service providers) through keyword authentication based on ambiguous data acquisition. It may infer available sensitive information and return false retrieval results with various motives. It performs the cryptographic operations and records and shares the performance details within the blockchain network. (d) Resource Server. This is the provisioning system of the CO, which stores user indexes, ciphertexts, and users' data linking keywords. It stores the data connection ciphertext generated by the outsourcing server. It returns resources as requested by the outsourcing server. (e) Outsourcing Server. This is the CO's processing system, which performs the operations for data access between users and service providers. It generates encrypted data containing the service provider's trapdoor generated from users' data linking keywords. (f) Service Provider (SP).
After obtaining permission from the user, it can submit a trapdoor to the CO to request retrieval queries on the user's data of interest and subsequently use the user's data.
The CO, as in [17]-[23], is assumed to be honest but curious. It may conduct only part of the retrieval operations, and it is curious about the sensitive data. It can also return false retrieval results in order to save computing resources. The DAO, on the other hand, is decentralized and can ensure the correctness of data retrieval. The approved data consumer can also issue retrieval queries without giving the CO any sensitive details.

B. WORKFLOW
We compose a system with the entities mentioned above. According to the data labeling, our goal is for the service provider to obtain users' data from the CO. Our framework mainly consists of the following five phases [27]:
(a) Initialization. In the initialization phase, the system manager's global setup and key generation are performed. The system manager generates the transaction parameters, of which the public parameters are published. All clients generate public/secret key pairs from the public parameters and integers.
(b) Data Storage. Users utilize encryption modules to process sensitive plaintext data before providing it to the information provider. Data intended for processing is provided through linking keywords. Encrypted data must provide both access and confidentiality through the information provider's outsourcing operations, meaning that only valid service providers can access the data.
(c) Data Retrieval Request. The service provider requests data acquisition. According to the user's labeling, it acquires data by creating a valid trapdoor through the connection keyword. Meanwhile, a state channel for the data is created. The operation is updated to the latest state and recorded off-chain.
(d) Data Retrieval. The information provider checks the validity of the request. It verifies that the request's trapdoor is in the data connection keyword set and then distributes the data access token to the service provider node. The final state is distributed on the blockchain as a transaction.
(e) Data Verification. Upon receiving the data access token, the service provider can check its validity on the blockchain and perform a retrieval through the token to access the user data stored by the information provider.
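The five phases above can be walked through end to end with a toy harness. This sketch replaces OKSA and pairings with a single HMAC labeling key so the control flow is runnable; the class and method names are hypothetical, not the paper's algorithms.

```python
import hashlib, hmac, os

class Framework:
    """Toy five-phase flow: label, store, request, retrieve, audit."""

    def __init__(self):                 # (a) initialization
        self.key = os.urandom(32)       # user-held labeling key
        self.store = {}                 # CO storage: label tag -> data
        self.authorized = set()         # scopes the user approved
        self.log = []                   # DAO: append-only audit records

    def _tag(self, kw: str) -> str:
        # Privacy label for a keyword; the CO never sees the keyword.
        return hmac.new(self.key, kw.encode(), hashlib.sha256).hexdigest()

    def store_data(self, kw: str, data: bytes, allow: bool = True):
        # (b) data storage: labeled payload goes to the CO.
        self.store[self._tag(kw)] = data
        if allow:
            self.authorized.add(kw)

    def retrieve(self, kw: str):
        # (c)+(d) retrieval request and retrieval, audited by the DAO.
        ok = kw in self.authorized
        result = self.store.get(self._tag(kw)) if ok else None
        self.log.append(hashlib.sha256(f"{kw}:{ok}".encode()).hexdigest())
        return result

    def audit_length(self) -> int:
        # (e) verification: every request left a record on the DAO log.
        return len(self.log)
```

The point of the sketch is the separation of duties: the user holds the labeling key, the CO holds only tagged ciphertexts and the approved scopes, and every retrieval, successful or not, leaves an auditable record.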

C. SYSTEM REQUIREMENTS
We adopt the semi-honest security model in our study and assume the cloud servers follow the honest-but-curious model, which has been widely applied in [17]. In this model, the cloud servers honestly execute the customized protocols but capture and analyze the meaningful information of the data, query requests, and query results. To enhance the security of the system, we adopt the framework of two non-colluding clouds (C1 and C2), which has also been widely adopted in recent works [17]. In practice, these non-colluding clouds can be provided by competing cloud service providers, such as Google and Amazon. Such well-known companies are highly unlikely to collude with each other. In addition, we assume query users are trusted and will not collude with the cloud or other users. In detail, the privacy requirements are described as follows.
(a) Data Privacy. The user controls the scope of their data connections, and the information provider cannot learn the connection keywords selected by the service provider. This protects service consumers' data and related privacy. When the service provider accesses data from the CO, the CO can verify the connection keyword's authorization but does not learn the retrieved ciphertext data or the associated keyword. In other words, the service provider learns nothing about the data other than the query result of the linked data, and the plaintext of the query results should not be learned by any party other than the query user. (b) Preservation of Trading Orders. The framework must ensure that transactions are executed in sequence, meaning that each transaction block is linked to the previous chain in an uninterrupted order. (c) Result Verification. Being honest but curious, the CO can return incorrect retrieval results, compromising data security and seriously impacting the service browsing experience. A result verification mechanism is needed to ensure data retrieval accuracy. (d) Efficiency and Feasibility. To quickly retrieve user data and avoid wasting bandwidth and computing resources, the service provider should be able to retrieve encrypted data and submit multiple labels. (e) Security Goals. In addition, since the keywords are selected from a small space, the scheme needs to resist keyword guessing attacks in the standard model. Moreover, our proposed framework needs to objectively verify the correctness of retrieved results against the honest-but-curious CO without a trusted third-party auditor.

V. PROTOCOL
This section explains the construction of the P-FIPS protocol with OKSA and DAO. Privacy labeling allows the SP to retrieve user data from the data linking fields set by the user. At the same time, OKSA obliviously negotiates the data access token and retrieves user data between the SP and the CO [27]. The DAO guarantees the transparency of the sequential record order. The following algorithms mainly focus on building the index and generating each authorization so that a data linking keyword search query, issued with a service provider's token via the user's labeling, can be processed efficiently by the cloud operator. The DAO verifies that the retrieval results are accurate. For the sake of the following discussion, we define a function F(·) that maps the subscripts in the set [1, q] to their corresponding subscripts in the set [1, n]. As mentioned above, P-FIPS consists of five phases, and the algorithm steps of each phase, based on [27], are shown as follows.

A. INITIALIZATION PHASE
Step 1. The system manager takes as input a security parameter k and an integer n. It chooses a bilinear map system PG = {p, G_1, G_2, e}.
Step 3. Each user randomly selects g, h ∈ G and a, x ∈ Z_p, and computes g^a and h_i = h^{a^i} for i = 1, 2, 3, . . . , n. The user's key pair is pk_u = (g^a, h_1, h_2, . . . , h_n), sk_u = a.
Step 4. The CO first selects two elements α ∈_R G and a_s ∈_R Z_p^* and computes β = g^{a_s}; the issuer's public/private key pair is denoted pk_s = (α, β), sk_s = a_s.
Step 5. The SP chooses an element y ∈_R Z_p^* and computes Y = g^y; it then sets the SP's public/private key pair as pk_sp = Y, sk_sp = y.
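The key shapes in Steps 3-5 can be sketched over a toy prime-order subgroup (no pairing needed for key generation itself). Parameters and function names are illustrative; real deployments would use pairing-friendly curves.

```python
import secrets

# Toy parameters: p = 2q + 1 is a safe prime; g and h are squares mod p,
# so both generate the subgroup of prime order q (illustrative sizes only).
p, q = 467, 233
g, h = 4, 16

def user_keygen(n: int):
    # pk_u = (g^a, h_1, ..., h_n) with h_i = h^{a^i}; sk_u = a.
    # Exponents a^i are reduced mod q, the order of h.
    a = secrets.randbelow(q - 2) + 2
    pk = [pow(g, a, p)] + [pow(h, pow(a, i, q), p) for i in range(1, n + 1)]
    return pk, a

def co_keygen():
    # pk_s = (alpha, beta = g^{a_s}); sk_s = a_s.
    a_s = secrets.randbelow(q - 2) + 2
    alpha = pow(g, secrets.randbelow(q - 2) + 2, p)  # random group element
    return (alpha, pow(g, a_s, p)), a_s

def sp_keygen():
    # pk_sp = Y = g^y; sk_sp = y.
    y = secrets.randbelow(q - 2) + 2
    return pow(g, y, p), y
```

The h_i = h^{a^i} ladder is what lets a user later authorize keyword positions up to n without revealing a, at the cost of a public key linear in n.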

B. DATA STORED PHASE
Step 1. The user selects data linking keywords kw, where kw ⊆ KW and the size of KW is |KW| = k ≤ n.
Step 2. The user encrypts keywords for retrieving the identities that the CO will use later. The data linking keyword field is denoted as KW with size n. Each data item has its associated data linking keyword. Given a data item d_i ∈ {0, 1}^l and a keyword w_i ∈ KW, the user chooses r_i ∈ Z_p and computes the encrypted data as ENC_i = (enc_{1i} = g^{r_i(a+w_i)}, . . .).
Step 3. For each identity f_i ∈ F (i ∈ [1, n]) with identity id_i, each user u ∈ U generates a signature sig_{i,t} = (h_1(id_i), g^{h_2(c_i)})^{sk_u}, and the ordered signature set generated by the user is formed.

C. DATA RETRIEVAL REQUEST PHASE
Step 1. The CO sends each identity's index f_i, which is built on the keyword fields KW = {kw_1, . . . , kw_m}.
Step 2. The CO stores the signature set OSig = {sig_1, . . . , sig_n}, the index set I = {I_1, . . . , I_n}, and the encrypted data ENC at the outsourcing server, where I_i = . . .
Step 3. Given the authorized keyword set KW, a keyword kw_i ∈ KW, and the public key pk_sp, the CO picks s ∈ Z_p as a private value and computes the token T_KW and the proof.
Step 4. Given the tuple P(kw_i), W, and the public key pk_sp, the SP checks accountability by the verification equations; if both equations hold, the SP accepts that the received keyword token is the trapdoor token for one keyword, and we denote this as Q(kw_i) = 1; otherwise, ⊥.
Step 5. Given a_i and kw, the CO computes the trapdoor P by the following equations and then returns the trapdoor P to the SP.
Step 7. The SP computes T_KW and O_2 by the following equations and sends the data access token T_KW to the CO.
D. DATA RETRIEVAL PHASE
Step 1. After gaining the data access token T_KW, the SP first computes γ = e(I_{i,2}, α^{a_s}), and the CO returns the relevant retrieval data CT = {ct_1, . . . , ct_q} and its corresponding identity set ID = {id_1, . . . , id_q} to the DAO; otherwise, it returns ⊥.
Step 2. At the beginning, given the trapdoor T_KW and the index I_i for each record f_i (1 ≤ i ≤ n), the issuer pre-processes the retrieval query by performing m exponentiation operations.
Step 3. Afterwards, the CO checks whether the submitted trapdoor matches the index I_i. If so, the CO returns the corresponding encrypted data enc_i; otherwise, it returns ⊥. Finally, the CO returns the whole relevant encrypted data set ENC = {enc_1, . . . , enc_q} and its corresponding identity set ID = {id_1, . . . , id_q}, or ⊥, to the DAO.
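The index/trapdoor match in Step 3 can be illustrated by tracking exponents directly, which only a sketch can do: we emulate the pairing via e(g^u, g^v) = e(g, g)^{uv} and assume a trapdoor of the common form T = g^{1/(a+w′)}. This is an illustrative analogue, not the exact equations of the scheme.

```python
import secrets

q = 233  # toy prime group order; we manipulate exponents only (sketch)

def encrypt_index(a: int, w: int):
    # ENC_i = g^{r_i (a + w_i)}: we track only the exponent r_i (a + w_i).
    r = secrets.randbelow(q - 1) + 1
    return r, (r * (a + w)) % q

def trapdoor_exp(a: int, w_prime: int) -> int:
    # Assumed trapdoor form T = g^{1/(a + w')}; here, its exponent.
    return pow(a + w_prime, -1, q)

def matches(enc_exp: int, t_exp: int, r: int) -> bool:
    # Emulated pairing check: e(ENC, T) = e(g,g)^{r(a+w)/(a+w')}
    # equals e(g,g)^r exactly when w == w'.
    return (enc_exp * t_exp) % q == r % q
```

Because r is random per record, a non-matching trapdoor yields a value unrelated to e(g, g)^r, so the CO can test membership of the queried keyword without learning which keyword it was.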
Step 4. As the value of m is very small in practice, the verification computation will not exert a heavy computational burden on the DAO. Thus, the retrieval algorithm is feasible and practical in actual scenarios.
Step 1. After receiving the retrieval results CT, the service provider first selects an element π_τ ∈_R Z_p^* for each encrypted identity id_τ.
Step 2. The CO sends the challenge information (τ, π_τ)_{τ∈[1,q]} to the service provider. After gaining the challenge information, the CO first computes ϕ.
Step 3. The SP proves the information (ϕ, σ) through the DAO, where sig_τ = ∏_{t=1}^{d} sig_{ρ(τ),t}. Finally, the DAO verifies whether Eq. (15) holds.
Step 4. In the above equation, id_τ = id_{ρ(τ)} and c_τ = c_{ρ(τ)}. If it holds, the DAO judges that the retrieval results C are correct and sends them to the specific data user u; otherwise, it aborts. At the beginning, the DAO interacts with the CO based on the challenge-response mode.
Step 5. Afterwards, the DAO computes ∏_{t=1}^{d} pk_t and verifies whether the retrieval results C are correct or not. Finally, the DAO draws the conclusion and returns the correct retrieval results to the SP.
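The challenge-response idea behind these steps is a random linear combination over committed values: the DAO challenges with fresh random coefficients π_τ, the CO aggregates over the data it actually returned, and the DAO recomputes from commitments recorded at storage time. The following is a hash-based sketch of that pattern, not the paper's signature-based Eq. (15).

```python
import hashlib, secrets

M = (1 << 127) - 1  # toy prime modulus for the aggregation

def h_int(data: bytes) -> int:
    # Commitment to one ciphertext: its hash as an integer mod M.
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % M

def co_respond(results, coeffs) -> int:
    # CO aggregates hashes of the data it actually returned.
    return sum(c * h_int(ct) for c, ct in zip(coeffs, results)) % M

def dao_verify(commitments, coeffs, phi: int) -> bool:
    # DAO recomputes the same combination from on-chain commitments.
    return phi == sum(c * h for c, h in zip(coeffs, commitments)) % M

results = [b"ct1", b"ct2", b"ct3"]
commitments = [h_int(ct) for ct in results]              # stored on-chain
coeffs = [secrets.randbelow(M - 1) + 1 for _ in results]  # fresh challenge
```

Substituting even one ciphertext changes the aggregate except with negligible probability, so the DAO catches a cheating CO without ever seeing plaintexts.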
Step 6. Given ENC_i, P, and b_i, the SP executes the retrieval operation by the following equation.
Step 7. If the above equation holds, the SP computes the decryption operation by the corresponding equation.

VI. ANALYSIS
We compare our framework against the requirements discussed in Section IV-C. We also analyze the efficiency of our framework and others through performance evaluation.

A. ANALYSIS OF SYSTEM REQUIREMENT
We compare the security of the proposed framework with that of the others according to the system requirements in Section IV-C.

1) PRIVACY
When a user sets a data label and a service provider requests data access, an exposed data label allows the user's information to be inferred. We prevent this through the data acquisition mechanism based on OKSA [27], which achieves IND-KGA security [34]. From Eq. (1): the user sends the CO the encrypted data together with the data-linking keyword.
Regarding security, IND-KGA ensures that an outside attacker cannot infer the relationship between the target trapdoor and the challenge keyword set, even if it can obtain other trapdoors. Our framework is IND-KGA secure in the standard model because the DDH problem is intractable.
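The DDH assumption underlying this guarantee can be illustrated numerically. The toy prime and generator below are for demonstration only; the actual construction works in bilinear (pairing-friendly) groups. The point is that a DDH challenge (g^a, g^b, Z) can be linked to Z = g^{ab} only by a party holding an exponent, which is exactly why an outsider cannot relate a trapdoor to its keyword.

```python
import secrets

p = 2**127 - 1  # toy prime; real schemes use pairing-friendly curve groups
g = 3

def ddh_instance(real: bool):
    # A DDH challenge: (g^a, g^b, Z) where Z = g^{ab} (real)
    # or Z = g^c for random c (random case).
    a = secrets.randbelow(p - 1) + 1
    b = secrets.randbelow(p - 1) + 1
    if real:
        z = pow(g, (a * b) % (p - 1), p)
    else:
        z = pow(g, secrets.randbelow(p - 1) + 1, p)
    return (pow(g, a, p), pow(g, b, p), z), a

# Only a holder of exponent a (e.g., the trapdoor issuer) can link the
# tuple by testing Z == (g^b)^a; an outsider sees three group elements.
(ga, gb, z), a = ddh_instance(real=True)
linkable = pow(gb, a, p) == z
```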
Two hard problems provide the foundation for the security of OKSA, i.e., the (f, n)-DHE problem and the (f, q)-MSE-DDH problem. The (f, n)-DHE problem is based on [34], [36].
Definition 1 ((f, n)-DHE Problem): Let G be a group of prime order p, h ∈ G, and a ∈ Z_p. Given h, h^a, ..., h^{a^n}, output h^{f(a)} [34].
Definition 2 ((f, q)-MSE-DDH Problem): Let PP be a bilinear map group system and let g_0, h_0 be generators of the group G. Assume two pairwise co-prime polynomials f and q of degree 1 and n − 1, respectively, where n is an integer. Given the instance elements . . . , h_0^{α^n f(α)q(α)}, and Z ∈ G_T, the goal is to distinguish whether Z = e(g_0, h_0)^{r q(α)} or a random group element of G_T [36].

2) PRESERVATION OF TRADING ORDERS
We ensure that the transactions are executed in sequence, meaning that each transaction block is linked to the previous chain in an uninterrupted order. All parameters for data retrieval are set between the cloud operator and the service provider in the current transaction. Our framework's security follows directly from the security of the DAO blockchain and of the ordered signature. First, each transaction operates on a state basis and is ordered by timestamp together with a transaction ID; since each transaction is signed with the user's private key, it is hard to violate this order. Second, the unforgeability and ordering properties of the signature [24] make it impossible to reorder the positions of blocks in the chain. In addition, the indistinguishability of OKSA guarantees the secrecy of m_i and w_i in our framework, which cannot be obtained without the corresponding secret key of the user, while the privacy and accountability of OKSA provide the oblivious retrieval of our framework: the CO can learn that w_i ∈ W but cannot learn the specific label w_i in an oblivious way, and the CO also learns nothing about the retrieved plain-cipher data pair (m_i, CT_i).
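A minimal sketch of this ordering guarantee follows, with HMAC standing in for the ordered multi-signature of [24]; the block fields and helper names are our own, not the paper's on-chain format. Each block binds the previous block's hash, a timestamped transaction, and a signature, so reordering or tampering breaks verification.

```python
import hashlib
import hmac
import json

def sign(sk: bytes, payload: bytes) -> str:
    # HMAC stands in for the ordered multi-signature of [24]
    return hmac.new(sk, payload, hashlib.sha256).hexdigest()

def make_block(prev_hash: str, tx: dict, sk: bytes) -> dict:
    # bind previous hash + timestamped transaction, then sign and hash
    body = {"prev": prev_hash, "tx": tx, "ts": tx["ts"]}
    payload = json.dumps(body, sort_keys=True).encode()
    return {**body, "sig": sign(sk, payload),
            "hash": hashlib.sha256(payload).hexdigest()}

def chain_valid(chain, sk: bytes) -> bool:
    prev, last_ts = "0" * 64, -1.0
    for b in chain:
        body = {"prev": b["prev"], "tx": b["tx"], "ts": b["ts"]}
        payload = json.dumps(body, sort_keys=True).encode()
        if b["prev"] != prev:          # uninterrupted linkage
            return False
        if b["ts"] < last_ts:          # timestamp order preserved
            return False
        if not hmac.compare_digest(b["sig"], sign(sk, payload)):
            return False               # signature binds content and position
        prev, last_ts = b["hash"], b["ts"]
    return True

sk = b"user-private-key"  # symmetric stand-in for the user's signing key
chain, prev = [], "0" * 64
for i, ts in enumerate([1.0, 2.0, 3.0]):
    blk = make_block(prev, {"id": i, "ts": ts}, sk)
    chain.append(blk)
    prev = blk["hash"]
```

Swapping two blocks breaks the `prev` linkage, and editing a transaction invalidates its signature, which is the intuition behind the preservation-of-trading-orders property.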

3) RESULT VERIFICATION
Since the CO is honest-but-curious, it may return incorrect retrieval results, compromising data security and seriously degrading the service browsing experience. A result verification mechanism is therefore needed to ensure the accuracy of data retrieval. According to the proposed protocol, the cloud grants data access to service providers without relying on a third party. We assume that the service provider specifies the label w_i and retrieves the data m_i associated with w_i, and that the block σ is verified by more than half of the nodes. From Eq. (16): Service Provider → DAO Chain (PK_sp).
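The "more than half of the nodes" condition reduces to a simple threshold predicate, sketched below with hypothetical node names:

```python
def block_approved(approvals: set, validators: set) -> bool:
    # sigma is accepted once a strict majority of validators endorse it;
    # endorsements from unknown nodes are ignored
    return len(approvals & validators) > len(validators) / 2

validators = {"n1", "n2", "n3", "n4", "n5"}
```

With five validators, three genuine endorsements suffice, while endorsements from nodes outside the validator set do not count toward the majority.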

B. ANALYSIS OF SYSTEM EFFICIENCY
We deploy a static environment composed of ten nodes, without adding or revoking nodes. A block is considered valid once two nodes and the monitor nodes, respectively, approve it via Proof-of-Authority (POA) consensus (Fig. 3). We employ the ordered multi-signature [9] and the oblivious keyword search with authorization construction [9] to instantiate our protocol. We implement the algorithms on a virtual CPU with 2 GB-4 GB of memory. We use RPC and JSON over WebSocket, and the contract language is Solidity. We select the asymmetric elliptic α-curve [17], where the base field size is 512 bits and the embedding degree is 2. The α-curve has a 160-bit group order, which means p is a 256-bit prime. The system is designed using the Ethereum Virtual Machine (EVM) based Kaleido enterprise blockchain-as-a-service and block cloud. We also use a Metamask RPC account and deploy the smart contract via Remix. The proposed approach is modeled by developing data processing and queries through peer collaboration between users (data owners), COs, and service providers. Here, the user uploads the encrypted data and connection attributes, while the CO generates an access passphrase from the connection attributes.
Other configurations are shown in Table 3 and Table 4. We quantify the transmission bandwidth between two nodes and the computational overhead of several steps in one transaction with ten regular messages. We compare our framework with searchable-encryption and blockchain schemes in the following experiments.
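Our overhead measurements follow the repeated-timing pattern sketched below; `encrypt_stub` is a placeholder for the actual labeling/encryption step, not our implementation, and the run count matches the 20-measurement protocol used later.

```python
import hashlib
import secrets
import statistics
import time

def bench(fn, runs: int = 20) -> float:
    # mean wall-clock seconds over `runs` repetitions of fn()
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - t0)
    return statistics.mean(samples)

def encrypt_stub(n_labels: int) -> bytes:
    # placeholder for record encryption; in our design its cost should
    # not grow with the number of data labels attached to the record
    return hashlib.sha256(secrets.token_bytes(64)).digest()

t_small = bench(lambda: encrypt_stub(2))
t_large = bench(lambda: encrypt_stub(200))
```

Bandwidth is measured analogously, by serializing each parameter and recording its byte length per transaction.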

1) TRANSMISSION BANDWIDTH
We test the bandwidth of three parameters, including cryptographic data, requests, and keys, and include the response bandwidth of other nodes for block acknowledgment. We choose a hash function with a 256-bit output and plain data of the same length. The connection keyword is adjusted according to the user's connection range. The experimental results are shown in Figure 4. Each parameter remains almost constant as the size of the associated keyword set increases; therefore, the transmission bandwidth is independent of the keyword set. This is more efficient than PEKS [17] and multi-keyword conjunctive search schemes. We also measure the computational overhead of several steps; the details and assumptions required depend on the step.

2) DATA STORED PHASE
We test the computation time to generate cryptographic data. Encryption of each message is an independent process, so 20 measurements are run for each of the ten pre-set messages. Figure 4b shows the data encryption speed against the data label size. It can be seen that the computation time of the data storage step is independent of the size of the data label.

3) DATA REQUEST AND RETRIEVAL PHASE
We test the computation time of the data retrieval request and the data retrieval itself. This time includes transferring the keyword and token to commit the searchable encryption of the data and retrieving one message. The computational work is primarily contributed by data retrieval and decryption. The results are shown in Figure 4c. When the size of the keyword set increases, no noticeable change occurs in message retrieval, since the retrieval algorithm is aggregated into the protocol independently of the keyword set, as in a state blockchain.

4) VERIFICATION PHASE
We measure the verification rate, including block approval, request responsibility, and access creation. In the experiment, block validation is performed by combining two nodes and one system manager through POA consensus. In contrast, the approach in [17] requires majority consent, and the approach in [33] is valid only once six nodes approve it. Figure 4d shows the validation rate against the size of the keyword set. It shows that the computation time of [27] increases linearly as more keywords are included in the keyword set.

VII. CONCLUSION
Due to recent developments in IoT and intelligent big data, global companies provide user-centric services. However, the legacy systems of global services make it challenging to integrate the data required for high-quality analysis; it is therefore hard to respond accurately and to expect reliable big data processing. In addition, the absence of an integrated framework leads to the loss of user data sovereignty and the misuse of personal data. For data processing, cloud data centers have centrally built cloud servers; each distributed outsourcing server supports data security and provides efficient data processing through searchable encryption technology. Among them, the cloud IDP provides authorization to storage through identification. However, the existing identification and access control structure remains in the client-server structure of web-based services. Moreover, providing consistent effectiveness for interactions requires a separate manager, resulting in undesirable delays and responses to requests. Blockchain, a new decentralized database paradigm, can realize the promised value of digital artifacts and provide transparency by recording each party's actions without a third party. The blockchain-based decentralized IDP [10] specified by the World Wide Web Consortium (W3C) is called DID; it provides a single source of truth and defines users' privacy boundaries. However, a DID [11] is just a token that temporarily grants permission for a specific task and does not contain any information about the user. User identifiers can be accumulated through the blockchain and linked to users' personal information. Unfortunately, from a practical point of view, it is difficult for a peer blockchain to support a big data environment. We still face challenges in availability, efficiency, trust, and security, including privacy.
This paper proposed a new user-centric, privacy-aware cloud data sharing framework for users and service providers. It introduces a new global identity provider concept that supports fine-grained access control for a federated outsourcing cloud, called P-FIPS (Privacy-enhanced Federated Identity Provider System), in which data owners perform identity access control with the operators. To provide encrypted data efficiently, the user labels the scope of use (e.g., user connection, user disconnection, user tracking) on each piece of data in the cloud to manage service providers' access to the searchable encrypted data. The service provider computes data tokens within the labeled keyword scope and locates the data via the cloud server. In addition, the DAO chain mechanism provides the service provider with correctness guarantees for the received data and assures the user that the data is used only for authorized purposes. As a result, we showed through security analysis that our framework satisfies the security requirements of existing schemes. At the same time, a simple simulation implementation demonstrated that the overhead does not increase with the number of data labels. In the future, we plan to develop a framework whose overhead remains light regardless of the number of objects.