The Cloud we Share: Access Control on Symmetrically Encrypted Data in Untrusted Clouds

Along with the rapid growth of cloud environments comes the problem of secure data storage, a problem that both businesses and end-users take into consideration before moving their data online. Recently, many solutions have been proposed based either on Symmetric Searchable Encryption (SSE) or Attribute-Based Encryption (ABE). SSE is an encryption technique that offers security against both internal and external attacks. However, since an SSE scheme uses a single key to encrypt everything, revoking a user implies downloading the entire encrypted database and re-encrypting it with a fresh key. In an ABE scheme, on the other hand, the problem of revocation can be addressed. Unfortunately, the proposed solutions are based on the properties of the underlying ABE scheme and hence the revocation costs grow along with the complexity of the policies. To this end, we combine these two cryptographic techniques, both of which fit cloud-based environments well, into a hybrid encryption scheme based on ABE and SSE that utilizes the best of both. Moreover, we exploit the functionalities offered by Intel's SGX to design a revocation mechanism and an access control mechanism that are agnostic to the cryptographic primitives used in our construction.


I. INTRODUCTION
Over the past few years, cloud computing has grown to an extent that affects the day-to-day life of almost everyone. From big corporations to casual internet users, the cloud has become an integral part of our lives. However, many users still feel reluctant to outsource their personal files, since cloud services are hosted and run by untrusted third parties and the files are thus vulnerable to internal attacks. To this end, both key industrial players and researchers have turned for solutions to the promising technique of Symmetric Searchable Encryption [8], [9] and to the well-studied field of Attribute-Based Encryption [10].
In an SSE scheme, users encrypt their files locally before outsourcing them to the Cloud Service Provider (CSP). Thus, the CSP, who does not possess the encryption key, cannot extract any valuable information about the users' data. However, the most fascinating property of SSE is that it allows users to search directly over their encrypted data for files that contain specific keywords. Unfortunately, SSE schemes do not support the revocation of users, a problem of paramount importance in cloud-based environments. Hence, revoking a user is equivalent to downloading the entire database and re-encrypting it with a fresh key.
The associate editor coordinating the review of this manuscript and approving it for publication was Muhamamd Aleem.
Another technique that fits cloud-based environments is ABE. In ABE schemes, all files are encrypted using a master public key but, in contrast to traditional public key cryptosystems, the resulting ciphertext is bound by a policy. Moreover, each user has a unique secret key associated with the user's attributes (e.g. id, age, organization, etc.). Thus, decrypting a file is possible if and only if the user's attributes satisfy the policy bound to the ciphertext. However, using an asymmetric encryption scheme to encrypt large volumes of data is rather inefficient.
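To make the "attributes satisfy the policy" condition concrete, the following is a minimal sketch of the access semantics only. It performs no cryptography; the policy encoding (nested AND/OR tuples), the function name `satisfies` and the attribute strings are assumptions made for illustration and are not taken from any ABE scheme cited here.

```python
from typing import Union

# Hypothetical encoding: a policy is either an attribute name (leaf) or a
# tuple ("AND" | "OR", sub_policy, sub_policy, ...).
Policy = Union[str, tuple]


def satisfies(attributes: set, policy: Policy) -> bool:
    """Return True when the attribute set satisfies the policy tree."""
    if isinstance(policy, str):
        # Leaf: the user must hold this exact attribute.
        return policy in attributes
    gate, *children = policy
    results = [satisfies(attributes, child) for child in children]
    if gate == "AND":
        return all(results)
    if gate == "OR":
        return any(results)
    raise ValueError(f"unknown gate: {gate}")


# Illustrative policy and users (all names are made up).
policy = ("AND", "org:hospital", ("OR", "role:doctor", "role:nurse"))
alice = {"org:hospital", "role:doctor"}
bob = {"org:hospital", "role:admin"}
```

In a real CP-ABE scheme this check is enforced cryptographically by decryption itself; the sketch only shows which users would succeed (here `alice`) and which would fail (here `bob`).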

II. RELATED WORK
In [21], the authors present HardIDX, a scheme that supports range queries by utilizing the functionality offered by SGX. Their construction minimizes the leakage by hiding the search pattern, but the proposed scheme is static; thus, file additions and deletions are not supported. A dynamic SSE scheme with stronger security guarantees, Sophos, is presented in [14]. Sophos is a forward private SSE scheme in the sense that newly added keywords cannot be linked to previous queries. A more efficient forward private scheme was presented in [19], where the authors improved the search time by presenting a parallelizable scheme. However, all the aforementioned schemes only support the single-client model, where the only communication is between a data owner and a cloud service provider. Since we are interested in a multi-client scheme, for this work we chose the scheme we designed and presented in [9], which extends [19] to the multi-client model. A promising scheme is presented in [20], where the authors present IRON, a functional encryption scheme based on SGX. IRON's main functionalities (such as decryption of a file and application of a function on the decrypted file) are executed in the isolated environment offered by SGX. In our construction, we use the same hardware principles to design our revocable hybrid encryption scheme, and we further exploit SGX by designing a revocation mechanism that is solely based on SGX enclaves.
In [28], the authors tackle the problem of storing data on untrusted clouds by designing a revocable hybrid encryption scheme, enhanced with a key rotation mechanism to avoid key scraping attacks. The authors use an All-or-Nothing Transformation (AONT) [16] to prevent revoked users from accessing the stored data. In particular, they use Optimal Asymmetric Encryption Padding (OAEP) as the AONT, since reversing OAEP requires the entire output to be known. Thus, by changing random bits, reversing OAEP becomes infeasible. Naturally, to decrypt a file, the changed bits need to be stored so that the AONT can later be reversed. However, this implies that with each re-encryption the size of the ciphertexts grows and, as a result, decrypting a file that has been re-encrypted multiple times becomes an expensive operation. Moreover, to make the scheme more efficient, the authors suggest that the AONT could be applied by the online storage server. However, this implies the existence of a fully trusted server and hence the scheme can be vulnerable to internal attacks.
A revocable Ciphertext-Policy Attribute-Based Encryption (CP-ABE) scheme presented in [23] proposes embedding the revocation list in the ciphertexts. However, this embedding results in bigger ciphertexts, making the decryption and file modification operations much more demanding. To overcome this problem, the authors in [12] propose a method based on Hierarchical Identity-Based Encryption (HIBE). In their construction, the users' secret keys expire after a specified period of time. Thus, the revocation list only contains the keys revoked before the expiration time. Similarly, in [11] the authors constructed a Key-Policy ABE (KP-ABE) scheme by extending their work on Revocable Identity-Based Encryption. In their design, the revocation of users relies on frequent key updates for all the different attributes; hence, their solution does not scale well for practical usage.
Another promising technique is presented in [30], where the authors propose a Traceable CP-ABE scheme that supports the revocation of malicious users. In particular, they design a mechanism that can trace users who have leaked information about the key from the system. However, on each revocation, a new group key for a group of users needs to be generated and then distributed to all eligible users. Moreover, the authors place sensitive operations, such as the re-encryption and partial decryption of ciphertexts, in untrusted entities. In our construction, even if a malicious adversary cannot be traced, we ensure that no adversary will be able to tamper with a user's access rights to bypass the system's authentication. Moreover, in our scheme, even though we make use of the trusted execution environment offered by Intel's SGX, all sensitive operations occur on the user's side to minimize the leakage from side-channel attacks.
VOLUME 8, 2020
This work is an extension of [7], [24], [26], [27], where the authors presented a hybrid encryption scheme based on ABE and SSE. The constructions in [7], [24] lacked a proper implementation as well as an access control mechanism like the one we introduce in this paper. Apart from that, our work differs from the schemes presented in [26], [27], since we extend the underlying access control mechanism while at the same time enabling users to search over multiple datasets in one round. Moreover, in contrast to all previous works, the entire protocol is redesigned to support symmetric key cryptography instead of asymmetric, resulting in a much more efficient approach. In addition, we used a far more modern SSE scheme, presented in [9], which supports forward privacy [15]. On top of this, to further examine the efficiency of our construction, we used a more efficient CP-ABE scheme presented in [5]. This way, we could evaluate the performance of our construction even under the most demanding policies.

III. ARCHITECTURE
The underlying architecture consists of the components described below.

Cloud Service Provider (CSP)
We consider a cloud computing environment similar to the one described in [29]. We assume that the CSP is SGX-enabled, and that core entities will be running in the trusted execution environment offered by SGX.

Master Authority (MS)
Similarly to CSP, MS is SGX-enabled and running in an enclave called the Master Enclave. MS enclave generates and distributes ABE keys to registered users.

Key Tray (KT)
KT is also SGX-enabled and running in an enclave called the KT Enclave. KT enclave is responsible for storing the ciphertexts of the symmetric keys generated by data owners. Such symmetric keys are needed to decrypt the data.

Revocation Authority (REV)
Similarly, REV is also SGX-enabled and running in an enclave called the REV Enclave. REV is responsible for maintaining the valid scopes of users.

User (u i )
In our scenario, a user interacts with the CSP to manage the files that she has access to. A user can (1) store data in the cloud and (2) share data with other users. A user is referred to as a data owner when she is storing data in the cloud.
Moreover, we assume the existence of a registration authority that is responsible for the registration of users. However, registration is out of the scope of this paper and we assume that all users have been already registered.

SGX:
We briefly present the main SGX functionalities which are used for our construction. Further detailed description can be found in [17], [20].
Isolation: Enclaves are located in a hardware-guarded area of memory of 128MB, of which only 90MB can be used by the software. The processor tracks which parts of memory belong to which enclave and ensures that each enclave's memory can be accessed only by that enclave.
Attestation: SGX supports attestation between enclaves of the same (local attestation) and different platforms (remote attestation). In local attestation, an enclave enc i can verify another enclave enc j and the program/software running in the latter through a report generated by enc j . The report contains information about the enclave and the program running in it and is signed with a secret key sk rpt . This key is the same for all enclaves of the same platform. In the case of remote attestation, the verification is performed through a report signed with a special private key provided by Intel. Therefore, it requires contacting Intel's Attestation Server.
Sealing: When stored in untrusted memory, data is encrypted with a Root Seal Key provided with every SGX processor. The sealed data can be recovered even after an enclave is destroyed and rebooted on the same platform.

A. NOTATION
The set of all users is $U = \{u_1, \ldots, u_n\}$. The public/private key pair of a user $u_i$ is denoted by $(pk_i, sk_i)$ and the signature of $u_i$ on a message $m$ is $\sigma_i(m)$. The symmetric key of $u_i$ is denoted by $K_i$. The access rights of a user $u_i$ are denoted by a list of valid scopes $SC_i = \{(j, s^j_i), (k, s^k_i), \ldots, (z, s^z_i)\}$, in which $j, k, \ldots, z$ represent collections of files encrypted under the symmetric keys $K_j, K_k, \ldots, K_z$ and $s^j_i$ is a one-dimensional bit array of length five that represents the scopes (i.e. view, add, delete, manager, owner) assigned to $u_i$. For instance, if $u_i$ has the access rights view and delete for data encrypted under the symmetric key $K_j$, then $s^j_i = [10100]$. The output $y$ of an algorithm $A$ is denoted by $y \leftarrow A$ if $A$ is probabilistic, and by $A \rightarrow y$ if $A$ is deterministic. A function $negl(\cdot)$ is called negligible if $\forall n > 0$, $\exists N_n$ such that $\forall x > N_n$: $|negl(x)| < 1/poly(x)$. A probabilistic polynomial time (PPT) adversary ADV is a randomized algorithm for which there exists a polynomial $poly(\cdot)$ such that for all inputs $x$, the running time of ADV($x$) is bounded by $poly(|x|)$.
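The scope notation above can be illustrated with a short sketch. The helper names `make_scope` and `has_scope`, and the use of a Python dictionary for the list of valid scopes, are assumptions made for illustration; only the bit order (view, add, delete, manager, owner) comes from the text.

```python
# Bit positions follow the order given in the notation:
# view, add, delete, manager, owner.
SCOPES = ("view", "add", "delete", "manager", "owner")


def make_scope(*granted: str) -> list:
    """Build the length-five bit array for the granted scope names."""
    unknown = set(granted) - set(SCOPES)
    if unknown:
        raise ValueError(f"unknown scopes: {sorted(unknown)}")
    return [1 if name in granted else 0 for name in SCOPES]


def has_scope(bits: list, name: str) -> bool:
    """Check a single scope bit by name."""
    return bits[SCOPES.index(name)] == 1


# Valid scopes of a user u_i: dataset index -> bit array.
sc_i = {"j": make_scope("view", "delete")}
```

With this encoding, `sc_i["j"]` is `[1, 0, 1, 0, 0]`, which matches the example bit array [10100] for view and delete.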

B. CRYPTOGRAPHIC PRIMITIVES
We now present the cryptographic primitives used in our construction. As already mentioned, we make use of an ABE scheme and a dynamic SSE scheme. We now proceed with the corresponding definitions as described in [10] and [9] respectively.
Definition 1 (Ciphertext-Policy ABE): A revocable CP-ABE scheme is a tuple of the following four algorithms:
• CPABE.Setup is a probabilistic algorithm that takes as input a security parameter $\lambda$ and outputs a master public key MPK and a master secret key MSK. We denote this by $(MPK, MSK) \leftarrow Setup(1^\lambda)$.
• CPABE.Gen is a probabilistic algorithm that takes as input a master secret key, a set of attributes $A$ and the unique identifier of a user, and outputs a secret key which is bound both to the corresponding list of attributes and to the user. We denote this by $sk_{A,u_i} \leftarrow Gen(MSK, A, u_i)$.
• CPABE.Enc is a probabilistic algorithm that takes as input a master public key, a message $m$ and a policy $P \in \mathcal{P}$. After a proper run, the algorithm outputs a ciphertext $c_P$ which is associated with the policy $P$. We denote this by $c_P \leftarrow Enc(MPK, m, P)$.
• DSSE.InGen is a probabilistic algorithm that takes as input a secret key $K$ and a collection of files $f$ and outputs an encrypted index $\gamma$ and a sequence of ciphertexts $c$. It is used by the client to obtain the ciphertexts corresponding to her files as well as an encrypted index, which are then sent to the storage server.
• DSSE.AddFile is a probabilistic algorithm that takes as input a secret key K and a file f and outputs an add token τ α (f ) and a ciphertext c f . The token and the ciphertext are then sent to the storage server, where c f will be added to the collection of ciphertexts and the index γ will be updated accordingly.
• DSSE.Search is a deterministic algorithm that takes as input a secret key K and a keyword w and outputs a search token τ s (w). The token is then sent to the storage server who will output a sequence of file identifiers I w ⊂ c.
• DSSE.Delete is a deterministic algorithm that takes as input a secret key $K$ and a file identifier $id(f)$ and outputs a delete token $\tau_d(f)$ for $f$. The token will be sent to the storage server, who will delete $c_f$ and update the index $\gamma$ accordingly.
The security of a DSSE scheme is based on the existence of a simulator that is given as input the information leaked during the execution of the protocol. In particular, to define the security of SSE we make use of the leakage functions $L_{in}, L_s, L_a, L_d$ associated with index creation, search, add and delete operations [9].
Definition 3 (Dynamic CKA 2-Security): Let DSSE = (KeyGen, InGen, AddFile, Search, Delete) be a dynamic index-based symmetric searchable encryption scheme and $L_{in}, L_s, L_a, L_d$ be leakage functions associated with index creation, search, add and delete operations. We consider the following experiments between an adversary ADV and a challenger C:
Real experiment: C runs $Gen(1^\lambda)$ to generate a key $K$. ADV outputs a file $f$ and receives $(\gamma, c) \leftarrow Enc(K, f)$ from C. ADV makes a polynomial number of adaptive queries $q = \{w, f_1, f_2\}$ and for each $q$ receives back either a search token for $w$, $\tau_s(w)$, an add token and a ciphertext for $f_1$, $(\tau_\alpha(f_1), c_1)$, or a delete token for $f_2$, $\tau_d(f_2)$. Finally, ADV outputs a bit $b$.
Ideal experiment: ADV outputs a file $f$. S is given $L_{in}$ and generates $(\gamma, c)$, which is sent back to ADV. ADV makes a polynomial number of adaptive queries $q = \{w, f_1, f_2\}$ and for each $q$, S is given the leakage of the corresponding operation. S then returns a token and, in the case of addition, a ciphertext $c$. Finally, ADV outputs a bit $b$.
We say that the SSE scheme is $L_i$-secure if for all probabilistic polynomial adversaries ADV, there exists a probabilistic simulator S such that ADV can distinguish between the two experiments with at most negligible advantage. In the cases of file addition and deletion, the simulator must also generate ciphertexts and update the current indexes.
In addition to ABE and SSE, we rely on SGX functionalities to attest among the components.
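The token flow of the DSSE algorithms defined in this section (InGen, AddFile, Search, Delete) can be illustrated with a deliberately simplified toy index. This is a sketch under strong assumptions, not the scheme of [9]: keywords are hidden only behind HMAC tags, file identifiers are stored in the clear, file ciphertexts are omitted, and there is no forward privacy. All function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def keygen() -> bytes:
    """Sample a fresh 256-bit secret key K."""
    return secrets.token_bytes(32)


def _tag(key: bytes, keyword: str) -> bytes:
    # Deterministic keyword tag; this doubles as the search token.
    return hmac.new(key, keyword.encode(), hashlib.sha256).digest()


def in_gen(key: bytes, files: dict) -> dict:
    """Client: build the index from {file_id: set_of_keywords}."""
    index = {}
    for file_id, keywords in files.items():
        for w in keywords:
            index.setdefault(_tag(key, w), set()).add(file_id)
    return index


def search_token(key: bytes, keyword: str) -> bytes:
    return _tag(key, keyword)


def add_token(key: bytes, file_id: str, keywords: set):
    return file_id, [_tag(key, w) for w in keywords]


def delete_token(key: bytes, file_id: str) -> str:
    # Toy simplification: the token is the identifier itself, so the key is unused.
    return file_id


def server_search(index: dict, token: bytes) -> set:
    """Server: return the identifiers stored under the token."""
    return set(index.get(token, set()))


def server_add(index: dict, token) -> None:
    file_id, tags = token
    for t in tags:
        index.setdefault(t, set()).add(file_id)


def server_delete(index: dict, token: str) -> None:
    for ids in index.values():
        ids.discard(token)
```

The server never sees a keyword in the clear, yet it learns which tags repeat and which identifiers match; this is the kind of information that the leakage functions in the definition are meant to capture.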
During the execution of the protocol, all parties have access to the secure hardware as defined in [20]. In the beginning, HW.Setup runs to produce the secret key needed to verify reports. Each enclave is then initialized by loading a program $P$ and producing a handle $hdl$, which is used as an identifier for the enclave running $P$. This is done by running the HW.Load interface. After the initialization of the enclave, HW.Run is executed with different inputs. For simplicity, we assume that all enclaves run on the same host, so they only perform local attestations with each other. To do so, an enclave $enc_i$ first runs HW.RunReport, which produces a report $rpt_i$ that is sent to $enc_j$. Upon reception, $enc_j$ executes HW.ReportVerify and verifies the validity of $rpt_i$. A more detailed description of the hardware algorithms used by the enclaves is given below:
• HW.Load(Q): Takes as input a program $Q$. An enclave $enc_i$ is created in which $Q$ will be loaded. Moreover, a handle $hdl_{enc}$ is created that will be used as an identifier for the enclave.
• HW.Run(hdl, in): Takes as input a handle hdl and some input in. It runs the program in the enclave specified by hdl with in as input.
• HW.RunReport(hdl, in): Takes as input a handle hdl and some input in. It will output a report that is verifiable by any other enclave on the same platform. The report contains information about the underlying enclave signed with sk rpt .
• HW.ReportVerify(hdl , rpt): Takes as input a handle hdl and a report rpt. Uses sk rpt generated by HW.Setup to verify the MAC of the report.
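The hardware interface listed above can be mimicked in software to show how local attestation fits together. The sketch below is a toy model under stated assumptions: the class name `Hardware`, the report layout and the use of HMAC-SHA256 as the report MAC are illustrative choices, and nothing here provides the isolation that real SGX hardware offers.

```python
import hashlib
import hmac
import secrets


class Hardware:
    """Toy model of HW.Setup, HW.Load, HW.Run, HW.RunReport, HW.ReportVerify."""

    def __init__(self):
        # HW.Setup: the platform-wide report key sk_rpt.
        self._sk_rpt = secrets.token_bytes(32)
        self._programs = {}

    def load(self, program) -> str:
        # HW.Load(Q): create an "enclave" for the program and return its handle.
        hdl = secrets.token_hex(8)
        self._programs[hdl] = program
        return hdl

    def run(self, hdl: str, inp):
        # HW.Run(hdl, in): run the loaded program on the input.
        return self._programs[hdl](inp)

    def run_report(self, hdl: str, inp):
        # HW.RunReport(hdl, in): run the program and MAC a description of the run.
        out = self.run(hdl, inp)
        body = repr((hdl, self._programs[hdl].__name__, out)).encode()
        mac = hmac.new(self._sk_rpt, body, hashlib.sha256).digest()
        return body, mac

    def report_verify(self, hdl: str, report) -> bool:
        # HW.ReportVerify(hdl, rpt): only enclaves on this platform may verify.
        if hdl not in self._programs:
            raise KeyError(hdl)
        body, mac = report
        expected = hmac.new(self._sk_rpt, body, hashlib.sha256).digest()
        return hmac.compare_digest(mac, expected)


def echo(message: str) -> str:
    """Stand-in enclave program."""
    return message.upper()
```

Because the report key never leaves the `Hardware` object, a report produced by one enclave verifies for any other enclave on the same platform, while a modified report body fails the MAC check.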

V. THE CLOUD WE SHARE (CLASH)
In this section, we present The Cloud we Share (CLASH), the core of this paper's contribution.

A. HIGH-LEVEL OVERVIEW
Before we proceed to the formal description of our construction, we present a high-level overview. CLASH is divided into a Setup phase and four main phases: Initialization, Key Sharing, Data Processing and Scope Management. In the Setup phase, all entities receive a public/private key pair that will be used to establish secure communication channels. During the Initialization phase, a data owner encrypts her data using the SSE scheme, uploads the encrypted files to the CSP, and encrypts the SSE key using an ABE key. The ciphertext of the key is bound by a policy specified by the user and is stored on the Key Tray. In the Key Sharing phase, different users contact the Key Tray and request the ciphertext of the symmetric key. Upon receiving the ciphertext, they can decrypt it if and only if their attributes satisfy the policy bound to the key. If the decryption of the key is successful, the Data Processing phase commences, in which the users can search for different files, add new ones or delete existing ones, according to their access rights (scopes). Finally, in the Scope Management phase, a data owner can modify the scopes of the users and even fully revoke their right to access the encrypted dataset. This high-level approach of our construction is depicted in Figure 1.

B. FORMAL CONSTRUCTION
Setup Phase: In the Setup phase, each enclave is initialized and generates a public/private key pair $(pk, sk)$ for a CCA2-secure public key cryptosystem and a signing/verification key pair for an EUF-CMA-secure signature scheme. An enclave is initialized as follows:
CLASH.Setup("initialize", $1^\lambda$): Each enclave is initialized by loading the program $Q^{init}_{ID}$:
• On input ("initialize", $1^\lambda$):
1) Run $(pk, sk) \leftarrow PKE.KeyGen(1^\lambda)$.

Q init ID
Moreover, the MS enclave loads a program Q Setup MS that outputs the master public/private key pair (MPK, MSK).

Q Setup MS
Finally, MS is responsible for generating secret ABE keys for registered users. To do so, MS retrieves MSK and a list of attributes $A$ associated with each user.
CLASH.ABEUserKey("KeyRequest", MSK, $u$, $A$, $1^\lambda$): The program $Q^{sKey}_{MS}$, which is responsible for generating secret ABE keys for registered users, is defined as follows:

Q sKey MS

Finally, the different entities, including the users, use their public/private key pairs to establish secure channels between them. This way, all the exchanged messages will be encrypted symmetrically. Clearly, all the symmetric keys will be different. For example, $K^i_{REV}$ denotes the symmetric key shared between the user $u_i$ and the REV enclave.
Initialization Phase: During the initialization phase, the data owner $u_i$ stores her encrypted data on the cloud and stores the secret key $K_i$ in the KT so that it can be shared with other users. To do so, she first runs CLASH.Store and then CLASH.KeyTrayStore.
CLASH.Store("store", $cred_i$): Assuming that all the enclaves are already initialized and that all registered users have received their secret ABE keys, a data owner $u_i$ can start interacting with the CSP to store her files. To this end, she contacts the CSP by sending $m_{req} = \langle r_1, E(K^i_{CSP}, cred_i), StoreReq, HMAC(K^i_{CSP}, r_1 \| cred_i \| StoreReq) \rangle$, where $r_1$ is a random number generated by $u_i$. Upon reception, the CSP verifies that $u_i$ is a registered user and sends $m_{ver} = \langle r_2, E(K^i_{CSP}, Auth), HMAC(K^i_{CSP}, r_2 \| u_i \| Auth) \rangle$ to $u_i$. After $u_i$ gets the authorization message from the CSP, she generates a DSSE key $K_i$ and its unique index $idx_{K_i}$, encrypts her files $f_i$ with the key, and sends them to the CSP via $m_{store}$.
• On input ("StoreReq", $m_{req}$):
1) Open $m_{req}$; verify the message (a); if the verification fails, output ⊥.
2) Compute and output $m_{ver}$.
Run $m_{ver} \leftarrow$ HW.Run($hdl_{CSP}$, ("StoreReq", $m_{req}$)).
• On input ("store", $m_{store}$):
1) Open $m_{store}$; verify the message; if the verification fails, output ⊥.
Run HW.Run($hdl_{CSP}$, ("store", $m_{store}$)).
(a) By this, we mean that the entity receiving the message verifies the freshness and the integrity of the message and can also authenticate the sender.

Q Store CSP

CLASH.KeyTrayStore("store", $K_i$, $P$): To enable efficient sharing of $K_i$ between registered users, the data owner $u_i$ encrypts $K_i$ under the ABE master public key MPK and binds it to a policy $P$, resulting in a ciphertext $c^{K_i}_P$ that is sent to KT via $m_{keystore}$. Finally, $u_i$ sends the list of valid scopes $L_{vs}$ for every registered user to REV via $m_{scope}$.
• On input ("store", $m_{keystore}$):
1) Open $m_{keystore}$; verify the message. If the verification fails, output ⊥.
1) Open $m_{scope}$; verify the message. If the verification fails, output ⊥.

Q Scope REV
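The "verify the message" step used throughout the Initialization Phase (freshness, integrity and sender authentication) can be sketched as follows. This is a minimal illustration under stated assumptions: the encrypted credential is treated as an opaque byte string, the MAC is computed over that opaque value rather than over the plaintext credential as in $m_{req}$, and freshness is modelled by remembering nonces already seen. The names `build_request` and `Receiver` are hypothetical.

```python
import hashlib
import hmac
import secrets


def build_request(key: bytes, enc_cred: bytes, label: bytes = b"StoreReq"):
    """Sender: assemble (r, enc_cred, label, tag) under the shared key."""
    r = secrets.token_bytes(16)  # fresh random value, playing the role of r_1
    tag = hmac.new(key, r + enc_cred + label, hashlib.sha256).digest()
    return r, enc_cred, label, tag


class Receiver:
    """Receiver side: checks integrity/authenticity first, then freshness."""

    def __init__(self, key: bytes):
        self._key = key
        self._seen = set()  # nonces accepted so far

    def verify(self, message) -> bool:
        r, enc_cred, label, tag = message
        expected = hmac.new(self._key, r + enc_cred + label,
                            hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            return False  # modified message or wrong sender key
        if r in self._seen:
            return False  # replayed nonce, so the message is not fresh
        self._seen.add(r)
        return True
```

Since only the two endpoints of a channel hold the shared symmetric key, a valid tag authenticates the sender, and rejecting repeated nonces prevents a recorded request from being replayed.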
Key Sharing Phase: The goal of this phase is to share data between legitimate users. This is done by running CLASH.KeyShare. For a registered user $u_j$ to access files encrypted by $u_i$, she first needs to acquire the symmetric key $K_i$. With $K_i$ in her possession, $u_j$ will be able both to generate the DSSE tokens required to access the encrypted database and to decrypt the files she receives back from the CSP. To this end, $u_j$ sends a request to KT via $m_{verReq} = \langle r_6, E(K^j_{KT}, u_j \| u_i), HMAC(K^j_{KT}, r_6 \| u_j \| u_i) \rangle$. Upon receiving the message, the Key Tray replies with $m_{idxkey}$ to the user, who then forwards this message to REV. REV then locates $s^i_j$ and creates a report $rpt_{REV}$ containing $m_{rev} = \langle r_8, E_{pk_{KT}}(s^i_j), \sigma_{REV}(H(r_8 \| s^i_j)) \rangle$, which is sent to KT. At this point, KT verifies $rpt_{REV}$ and retrieves $c^{K_i}_P$. Finally, $u_j$ uses her private CP-ABE key to recover $K_i$.
CLASH.KeyShare("share", $m_{verReq}$): The KT and REV programs, $Q^{Share}_{KT}$ and $Q^{Share}_{REV}$, are defined as follows:

Data Processing Phase:
In the Data Processing Phase, a user $u_j$ who has already received and successfully decrypted $c^{K_i}_P$ can start interacting with the CSP to access files encrypted under $K_i$. To this end, $u_j$ can run CLASH.Search, CLASH.Update or CLASH.Delete, depending on her access rights. CLASH.Search allows users to search directly on the encrypted files for those that contain a specific keyword $w$. User $u_j$ first needs to create a search token for the keyword, $\tau_s(w)$. After the search token is created, $u_j$ sends $m_{search} = \langle \tau_s(w), m_{key}, HMAC(K^j_{CSP}, \tau_s(w) \| m_{key}) \rangle$ to the CSP, where $m_{key}$ is the message received in the Key Sharing Phase. Upon reception, the CSP opens $m_{key}$ to check the freshness of the timestamp and to verify that $s^i_j[0] = 1$. Finally, the CSP searches for the files containing $w$ and sends back to $u_j$ a sequence of file identifiers. This is done by loading the program $Q^{Search}_{CSP}$ to the CSP enclave:
To revoke the scope manager, $u_j$ must have owner's rights. Finally, ownership rights are assigned and revoked only by the data owner (i.e. $u_i$). In particular, for $u_j$ to revoke a scope from a user $u$, she first contacts REV by sending $m_{manage} = \langle r_{10}, E(K^j_{REV}, u_j \| u \| n \| \text{"assign/revoke"}), c^{K_i}_P, HMAC(K^j_{REV}, r_{10} \| u_j \| u \| n \| c^{K_i}_P \| \text{"assign/revoke"}) \rangle$, where $n \in [0, 4]$ is an index into the one-dimensional bit array $s^i$ and specifies which bit of the array will be flipped. Upon reception, REV verifies the message and generates a report $rpt_{REV}$ containing $m_{idx.req} = \langle r_{11}, E(K^{KT}_{REV}, u), c^{K_i}_P, HMAC(K^{KT}_{REV}, r_{11} \| c^{K_i}_P \| u) \rangle$, which is sent to KT. After KT verifies $rpt_{REV}$, it sends a report $rpt_{KT}$ containing $idx_{K_i}$ back to REV. REV then verifies $rpt_{KT}$ and uses $idx_{K_i}$ to identify the bit arrays $s^i_j$ and $s^i$ (of $u_j$ and $u$, respectively), and checks whether $u_j$ has the right to revoke or assign scopes to other users. If so, REV revokes or assigns the requested scope for $u$ by setting $s^i[n] = 0$ or $s^i[n] = 1$, respectively.
In case of assigning the scope owner ($n = 4$), REV further sets $s^i = [11111]$. The programs $Q^{Rev}_{REV}$ and $Q^{idx}_{KT}$ responsible for handling this procedure are the following:

C. SEARCHING ON MULTIPLE DATASETS
In a realistic scenario, a user would want to perform a search operation on multiple datasets at once. However, our construction so far focuses on searching a single dataset per search query. The problem that arises is that each data owner uses a different symmetric key to encrypt her data and thus, to perform a global search, the CSP would require all the indexes $idx_{K_i}$. To solve this problem, we slightly modify the key sharing protocol as follows: $u_j$ sends a request to KT via $m_{verReq} = \langle r_6, E(K^j_{KT}, u_j \| L_u), HMAC(K^j_{KT}, r_6 \| u_j \| L_u) \rangle$, where $L_u$ is a list containing the unique identifiers of the data owners that have granted $u_j$ at least one scope. Upon reception, KT replies with $m_{idxkey}$, which contains $L_{c^K_P}$, the list of all the corresponding ciphertexts of the symmetric keys. The user then forwards this message to REV, who locates $s^i_j$ for all $i$ such that $c^{K_i}_P \in L_{c^K_P}$ and stores them in a list $L_{s^i_j}$. Finally, after the KT and REV enclaves execute the local attestation protocol just like in the original construction, $u_j$ receives a message $m_{key}$ containing the nonce $r_9$ and a payload encrypted under $pk_{CSP}$ that includes $u_j$, the timestamp $t$ and the list $L_{s^i_j}$, and recovers the different symmetric keys. To perform a search operation on multiple datasets, $u_j$ can now send the new $m_{key}$ to the CSP as part of $m_{search}$, and the CSP will proceed with searching on every dataset specified by $L_{idx_{K_i}}$.

Programs $Q^{Rev}_{REV}$ and $Q^{idx}_{KT}$ (Scope Management phase):
• On input ("idx", $m_{manage}$):
1) Verify the message. If the verification fails, output ⊥.
2) Generate $rpt_{REV}$ containing $m_{idx.req}$.
Run HW.Run($hdl_{REV}$, ("idx", $m_{manage}$)), and $rpt_{REV} \leftarrow$ HW.RunReport($hdl_{REV}$, ("idx", $m_{manage}$)).
… output ⊥.
4) Generate and output a report $rpt_{KT}$ containing $idx_{K_i}$.
Run HW.ReportVerify($hdl_{KT}$, $rpt_{REV}$), then $rpt_{KT} \leftarrow$ HW.RunReport($hdl_{KT}$, ("idx.request", $rpt_{REV}$)).
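The authorization check that REV performs in the Scope Management phase of Section V-B can be sketched in a few lines. The text states that revoking the manager scope requires owner's rights, that ownership is assigned and revoked only by the data owner, and that assigning owner sets all five bits. The remaining rules in this sketch are assumptions: that assigning manager also requires owner's rights, and that a manager or owner may change the view, add and delete bits. The function name `manage` and the dictionary layout are hypothetical, and message verification and attestation are left out.

```python
VIEW, ADD, DELETE, MANAGER, OWNER = range(5)


def manage(scopes: dict, data_owner: str, requester: str,
           target: str, n: int, assign: bool) -> bool:
    """Apply an assign/revoke request on bit n; return True if it was allowed.

    scopes maps a user identifier to that user's five-bit array for one dataset.
    """
    req = scopes.get(requester, [0] * 5)
    if n == OWNER:
        # From the text: only the data owner handles ownership rights.
        allowed = requester == data_owner
    elif n == MANAGER:
        # From the text for revocation; assumed here for assignment as well.
        allowed = req[OWNER] == 1
    else:
        # Assumption: managers and owners may change view/add/delete.
        allowed = req[MANAGER] == 1 or req[OWNER] == 1
    if not allowed:
        return False
    bits = scopes.setdefault(target, [0] * 5)
    if assign and n == OWNER:
        bits[:] = [1] * 5  # assigning owner sets the array to [11111]
    else:
        bits[n] = 1 if assign else 0
    return True
```

Keeping this decision inside a single function mirrors the idea that the scope arrays are held by one component, so a user cannot alter her own rights by editing a message in transit.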

VI. SECURITY ANALYSIS

A. SIMULATION-BASED SECURITY
To prove the security of our construction, we assume the existence of a simulator S. The main purpose of S is to simulate the algorithms of the real protocol in such a way that any polynomial time adversary ADV will not be able to distinguish between the real protocol and S. We assume that S intercepts ADV's communication with the real protocol and replies with simulated outputs. Before we proceed with the proof, we define the capabilities of S and ADV.
1) Everything ADV observes in the real experiment can be simulated by S.
2) ADV intercepts all communication between different entities. Since we use an IND-CCA2 public key encryption scheme, if ADV can distinguish between real and simulated answers, then she can also break the IND-CCA2 security.
3) ADV can load different programs in the enclaves and record the output. This assumption significantly strengthens ADV, since we need to ensure that only honest attested programs will be executed in the enclaves.

Definition 4 (Sim-Security): We consider the following experiments. In the real experiment, all algorithms run as defined in our construction. In the ideal experiment, a simulator S intercepts ADV's queries and replies with simulated responses.
Real Experiment
We say that CLASH is sim-secure if, for all PPT adversaries ADV, the real and ideal experiments are indistinguishable.
At a high level, we construct a simulator S that replaces the CLASH algorithms. In particular, in the real experiment the adversary ADV observes the algorithms being executed honestly, while in the ideal experiment S responds with simulated answers. The idea is the following: ADV has full control of the client. Thus, she can trigger Setup, Search, Update and Delete operations for the DSSE scheme. For each of these operations, S gets as input the corresponding leakage function $L_i$ and simulates the CLASH.Search, CLASH.Update and CLASH.Delete oracles. Finally, we exclude the KeyShare and Manage oracles from the security game, as they are not required to produce any simulated output for ADV. However, for completeness, we include them in the proof of the theorem provided below.
Theorem 1: Assuming that PKE is an IND-CCA2 secure public key cryptosystem, SKE is an IND-CPA secure symmetric key cryptosystem and Sign is an EUF-CMA secure signature scheme, CLASH is a sim-secure protocol according to Definition 4.
Proof: We start by defining the algorithms used by the simulator. Then, we replace the real algorithms with the ones executed by the simulator. The algorithms not mentioned below work just like in the real experiment. This does not affect the security of our construction, as we are mainly focusing on the access control mechanism and the SSE scheme. For example, to further strengthen the threat model, we assume that the adversary has a real SSE key. Thus, there is no point in providing her with a simulated ABE secret key. However, in the security proof, we show that ADV cannot tamper with any of the messages that are exchanged during a run of the protocol. With the help of a hybrid argument, we will prove that the two distributions are indistinguishable.
• CLASH.Setup * : Will only generate MPK that will be given to ADV.
• CLASH.Store * : S generates a dictionary that will enable it to consistently reply to search queries even after file additions and deletions. In particular when ADV triggers CLASH.Store, she actually triggers the InGen algorithm of the SSE scheme. Thus, S gets as input the corresponding leakage function and simulates the SSE indexes.
• CLASH.KeyShare * : S encrypts K ADV under MPK and sends it back to ADV. Moreover, S simulates and sends to ADV m idxkey and m key . Finally m key and m idxkey are stored in a list L in order to prevent an attack in which ADV would try to use a different set of valid scopes than the one she received.
• CLASH.Search * : When ADV performs a search operation for the files containing a keyword w, S gets as input the leakage function L s and outputs a simulated token τ s (w). Based on the simulated τ s (w), S can retrieve the files ADV is looking for without performing the real search operation.
• CLASH.Update * : When ADV generates an add token τ α (f ), S gets as input the leakage function L a and outputs a simulated response. S will simulate the add token, the ciphertext to be added to the database, and will also update the encrypted index.
• CLASH.Delete * : When ADV generates a delete token τ d (f ), S gets as input the leakage function L d and outputs a simulated response. Apart from simulating τ d (f ), S will also update the encrypted index so that, if ADV later searches for a keyword that is contained in the deleted file, the file will not be included in the result.
• CLASH.Manage * : S gets as input the list L VS . By getting this list, an attack in which ADV would try to assign/revoke scopes from a legitimate user can be avoided. In contrast to the real algorithm, CLASH.Manage * does not assign/revoke any scopes from other users.
In a pre-processing phase, S runs HW.Setup(1 λ ), just as in the real experiment, in order to acquire sk rpt . ADV outputs a file collection f and encrypts it using SSE. Finally, she receives a set of scopes SC ADV that she can use during the run of the game. We will now use a hybrid argument to prove that ADV cannot distinguish between the real and the ideal experiments.
Hybrid 0
CLASH runs normally.

Hybrid 1
Everything runs like in Hybrid 0, but we replace CLASH.Setup with CLASH.Setup * .
The difference between CLASH.Setup and CLASH.Setup * is that in CLASH.Setup * , S only generates a key MPK instead of a (MPK, MSK) pair. Since in the real experiment MSK is not given to ADV anyway, ADV cannot distinguish between the two hybrids except with negligible probability.

Hybrid 2
Like Hybrid 1, but CLASH.KeyShare * runs instead of CLASH.KeyShare. Also, the algorithm outputs ⊥ if HW.ReportVerify is queried with (hdl KT , ("share", rpt REV )) but ADV never contacts REV.

Lemma 1: Hybrid 2 is indistinguishable from Hybrid 1.
Proof: The simulator encrypts K ADV with MPK and sends it to ADV. Moreover, since ADV does not possess K KT REV , she can only generate m idxkey with negligible probability. Finally, ADV can only generate a valid MAC of the report sent from KT to REV with negligible probability. Hence, ADV can only distinguish between Hybrids 1 and 2 with negligible probability.
At this point, ADV can start making search, add and delete queries. The simulator now gets access to all leakage functions L from the SSE scheme.

Hybrid 3
Like Hybrid 2, but when HW.Run is queried with (hdl CSP , ("search", m search )), S is given the leakage function L s and generates a simulated search token. Moreover, the algorithm outputs ⊥ if the m key message it receives is different from the one stored in L.

Lemma 2: Hybrid 3 is indistinguishable from Hybrid 2.
Proof: The algorithm already outputs ⊥ if m key is different from the one stored in L, since the verifications would fail. Assuming the L i -security of the SSE scheme, the token sent by ADV to the CSP, as part of m search , is generated by S with L s as input. As a result, when the CSP receives m search , it will send back to ADV the correct files without running DSSE.Search. ADV cannot distinguish between the real and the ideal experiment since she receives a sequence of files corresponding to a search token that was simulated by S given L s as input. Moreover, ADV can only generate m search without having contacted KT earlier with negligible probability, since she does not possess the secret key used to MAC this message. As a result, ADV can only distinguish between Hybrids 2 and 3 with negligible probability.

Hybrid 4
Like Hybrid 3, but when HW.Run is queried with (hdl CSP , ("update", m add )), S is given the leakage function L a and tricks ADV into thinking that she updated the database. Moreover, the algorithm outputs ⊥ if the m key message it receives is different from the one stored in L.

Lemma 3: Hybrid 4 is indistinguishable from Hybrid 3.
Proof: The proof is similar to the previous one but simpler, since ADV does not expect an output from this algorithm. By assuming the L i -security of the SSE scheme, we know that ADV will not be able to distinguish between the real add token and the simulated one. Additionally, the CPA-security of the symmetric encryption scheme ensures that ADV cannot distinguish between the encryption of an actual file and that of zeros. Moreover, if ADV can generate m add without having contacted KT, then she can also forge KT's MAC, which can only happen with negligible probability. Finally, the ciphertext sent along with the add token is stored in a list L, so that the simulator will consistently answer future search queries. Hence, ADV can only distinguish between Hybrids 3 and 4 with negligible probability.

Hybrid 5
Like Hybrid 4, but when HW.Run is queried with (hdl CSP , ("delete", m del )), S is given the leakage function L d and simulates the delete token.

Lemma 4: Hybrid 5 is indistinguishable from Hybrid 4.
Proof: Just like before, the algorithm already outputs ⊥ if m key is different from the one stored in L. By assuming the L i -security of the SSE scheme, we know that ADV will not be able to distinguish between the real delete token and the simulated one. Thus, ADV can only distinguish between Hybrids 4 and 5 with negligible probability.
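
The role played by S in Hybrids 3 to 5 can be illustrated with a toy sketch. It assumes a simplified, hypothetical leakage format (a repeat-query identifier plus the matching file identifiers for a search, and only an identifier and a length for an addition); the actual leakage functions L s , L a and L d are those of the underlying SSE scheme and are not reproduced here. The class and method names are illustrative only.

```python
import secrets


class ToySimulator:
    """Toy stand-in for S: it answers from leakage only and never sees keywords or plaintexts."""

    def __init__(self):
        self.tokens = {}       # query id (assumed search-pattern leakage) -> simulated token
        self.ciphertexts = {}  # file id -> simulated ciphertext, kept for consistency
        self.deleted = set()   # file ids removed so far

    def search(self, query_id, leaked_file_ids):
        # A repeated query must yield the same token, so cache it per query id.
        token = self.tokens.setdefault(query_id, secrets.token_bytes(32))
        # Deleted or unknown files must never appear in a result.
        hits = [f for f in leaked_file_ids
                if f in self.ciphertexts and f not in self.deleted]
        return token, [(f, self.ciphertexts[f]) for f in hits]

    def add(self, file_id, length):
        # Random bytes of the leaked length stand in for a ciphertext here;
        # the proof itself relies on IND-CPA security and encrypts zeros instead.
        self.ciphertexts[file_id] = secrets.token_bytes(length)
        self.deleted.discard(file_id)
        return secrets.token_bytes(32)  # simulated add token

    def delete(self, file_id):
        self.deleted.add(file_id)
        return secrets.token_bytes(32)  # simulated delete token


sim = ToySimulator()
sim.add("f1", 64)
sim.add("f2", 128)
token_a, files_a = sim.search(query_id=0, leaked_file_ids=["f1", "f2"])
sim.delete("f2")
token_b, files_b = sim.search(query_id=0, leaked_file_ids=["f1", "f2"])
```

The point of the sketch is consistency: the same query maps to the same simulated token, stored ciphertexts are returned unchanged on later searches, and a deleted file never reappears in a result.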

Hybrid 6
Lemma 5: Hybrid 6 is indistinguishable from Hybrid 5.
Proof: Since the valid scope list is not retrievable during the execution of the protocol, ADV can never tell if she really revoked any scope from a specific user. ADV could try to bypass KT's authentication by generating and sending rpt directly to REV. However, since ADV does not possess sk rpt , she can only do that with negligible probability. Hence, ADV can only distinguish between Hybrids 5 and 6 with negligible probability.
By combining inequalities (2)-(7) and using the triangle inequality, we bound the probability that ADV distinguishes Hybrid 0 from Hybrid 6 by a finite sum of negligible functions. It is a standard result in analysis that a finite sum of negligible functions is still negligible, which implies that the real and ideal experiments are indistinguishable. Hence, our proof is complete: we managed to replace the expected outputs with simulated responses in such a way that no PPT ADV can distinguish between the real and ideal experiments.
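
The structure of the hybrid argument can be summarized in the following worked form. This is a hedged reconstruction from the prose of the proof, not the original typesetting: the notation H_i for the output of Hybrid i and negl_i for the individual negligible bounds is an assumption introduced here for illustration.

```latex
% One bound per step: Hybrid 0 -> Hybrid 1, followed by Lemmas 1-5 (i = 1, ..., 6).
\[
  \bigl|\Pr[\mathsf{H}_{i-1} = 1] - \Pr[\mathsf{H}_{i} = 1]\bigr|
  \le \mathsf{negl}_i(\lambda), \qquad i = 1, \dots, 6 .
\]
% Triangle inequality over the six steps, then closure of negligible
% functions under finite sums.
\[
  \bigl|\Pr[\mathsf{H}_{0} = 1] - \Pr[\mathsf{H}_{6} = 1]\bigr|
  \le \sum_{i=1}^{6} \bigl|\Pr[\mathsf{H}_{i-1} = 1] - \Pr[\mathsf{H}_{i} = 1]\bigr|
  \le \sum_{i=1}^{6} \mathsf{negl}_i(\lambda)
  = \mathsf{negl}(\lambda) .
\]
```

Here Hybrid 0 corresponds to the real experiment and Hybrid 6 to the ideal one, so the final bound is exactly the sim-security condition.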

B. SGX SECURITY
Recent works [17], [22], [31], [32] have shown that SGX is vulnerable to software attacks. However, according to [20], these attacks can be prevented if the programs running in the enclaves are data-oblivious. That is, leakage can be avoided if the programs do not have memory access patterns or control flow branches that depend on the values of sensitive data. In our construction, no sensitive data (such as decryption keys) are used by the enclaves. KT acts as a storage space for the symmetric keys and does not perform any computation on them. Hence, all the c K i p are data-oblivious. Moreover, L VS is stored in plaintext and every entry in the list is padded to the same length. We can also prevent timing attacks on L VS by ensuring that every time REV accesses the list, it goes through the whole list. Finally, as also mentioned in [6], encryption and decryption using the AES-NI hardware instructions ensure that there is no leakage of the encryption key during search and update operations. This is because AES encryption and decryption using these instructions have data-independent timing and involve only data-independent memory accesses. Thus, by assuming a constant-time implementation, our construction is not vulnerable to side-channel attacks.
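
The full-scan access to the padded list can be sketched as follows. This is an illustration of the access pattern only: Python gives no constant-time guarantees, the real enclaves are written in C/C++, and the entry width, the zero-byte padding and the scope names below are assumptions made for the example.

```python
import hmac


def pad_entry(entry: bytes, width: int) -> bytes:
    """Pad an entry with zero bytes so that all stored entries share one length."""
    if len(entry) > width:
        raise ValueError("entry longer than the fixed width")
    return entry.ljust(width, b"\x00")


def full_scan_contains(padded_entries, target: bytes, width: int) -> bool:
    """Check membership while always touching every entry of the list."""
    needle = pad_entry(target, width)
    found = 0
    for entry in padded_entries:
        # No early exit: the number of iterations does not depend on where
        # (or whether) the target occurs. compare_digest avoids a
        # content-dependent short-circuit inside each comparison.
        found |= int(hmac.compare_digest(entry, needle))
    return bool(found)


WIDTH = 32  # assumed fixed entry width in bytes
valid_scopes = [pad_entry(s, WIDTH) for s in (b"scope:hr", b"scope:finance", b"scope:it")]
```

Accumulating the result with a bitwise OR instead of returning on the first match is what keeps the number of list accesses independent of the queried value.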

VII. EVALUATION AND EXPERIMENTAL RESULTS
In this section, we present our experimental results that aimed at measuring the processing time of our construction. For the implementation of the SSE scheme, we used the forward private scheme presented in [9], while for the ABE scheme we used the library provided by [5]. Finally, to construct the cryptographic parts of the protocol within SGX secure containers (i.e. enclaves), we used the SGX-OpenSSL library in [3].
As we aimed to evaluate the performance of CLASH under realistic conditions, we used different machines, depending on the process to be measured. The setup of the SSE scheme was measured on a Microsoft Surface Book laptop with a 2.1GHz Intel Core i7 processor and 16GB RAM running Windows 10 64-bit. The reason is that, in a practical scenario, this process would take place on a user's machine; conducting these experiments on a powerful server would produce unrealistic results.
The parts running in an enclave were measured on a powerful desktop PC with an Intel Core i7-8700 at 3.20GHz (6 cores) and 32GB of RAM, running Ubuntu 64-bit with the Intel SGX Hardware Debug mode build configuration. The reason for running these parts on such a computer is the assumption that these processes will be running on the CSP.

A. SYMMETRIC SEARCHABLE ENCRYPTION

1) THEORETICAL EVALUATION AND COMPARISON
While our construction can work with any dynamic SSE scheme, we chose to use the scheme we developed and presented in [9]. Our SSE scheme is amongst the most efficient schemes that also support the crucial notion of forward privacy in the multi-client model. Informally, an SSE scheme is said to be forward private when the adversary cannot link a newly added keyword to previous search queries. More information on forward privacy can be found in [9]. More precisely, our scheme achieves optimal search and update costs O(ℓ) and O(m) respectively, where ℓ is the number of files returned by a search operation and m is the number of unique keywords in a file. Additionally, the scheme is parallelizable and hence, distributing the load to p processors would reduce the cost of the search and update operations by a factor of p, resulting in a search cost of O(ℓ/p) and an update cost of O(m/p) respectively. Finally, our scheme in [9] supports the multi-client model and is SGX-assisted. Hence, it shares a very similar architecture with the one presented in this work. In Table 1, we compare the SSE scheme used in this work with the SSE schemes presented in Section II.

2) EXPERIMENTAL RESULTS
For this part of our experiments, we mainly focused on (1) Indexing and (2) Searching for a keyword w. The SSE scheme was implemented in Python 2.7 using the PyCrypto library [2]. To extensively test the performance of the SSE scheme, we extracted various datasets, illustrated in Table 2, from the Gutenberg project [1]. Finally, the dictionaries were stored in a MySQL database.

a: Indexing & Encryption
This is the setup phase of the SSE scheme. This phase includes (1) reading plaintext files, extracting the keywords and creating the necessary dictionaries, (2) encrypting the files and (3) building the encrypted indexes. We ran each process ten times for each dataset in Table 2 and measured the average time. The results are illustrated in Figure 2. To index and encrypt 1,370,023 keywords, the average time was measured at 22.48 min, while for 12,124,904 keywords, the corresponding time was 203.28 min. Note that this is the most demanding phase of the protocol and that it only occurs once, on the data owner's side. As a result, it does not affect the overall efficiency of our construction. Moreover, based on the results of other SSE schemes that are not forward private [18], the times measured are acceptable. Finally, to recreate a realistic scenario, this phase of the experiments was measured on a commodity laptop.
In addition to the index of unique keywords, the SSE scheme makes use of one more index containing a mapping between keywords and file identifiers. The total number of these mappings can be seen in Table 3.
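
As a quick sanity check on the indexing times reported above, the two data points can be converted into a throughput. The short sketch below uses only the figures quoted in the text; the function and variable names are illustrative.

```python
# Figures reported in the text: (number of keywords, average time in minutes).
measurements = [(1_370_023, 22.48), (12_124_904, 203.28)]


def keywords_per_second(keywords: int, minutes: float) -> float:
    """Average indexing and encryption throughput for one dataset."""
    return keywords / (minutes * 60.0)


rates = [keywords_per_second(k, m) for k, m in measurements]
for (k, m), r in zip(measurements, rates):
    print(f"{k:>10,} keywords in {m:7.2f} min -> {r:7.1f} keywords/s")
```

Both datasets come out at roughly 1,000 keywords per second, which is consistent with the setup cost growing approximately linearly in the number of keywords over this range.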

b: Search
To measure the exact time needed to perform a search operation, we need to take into account (1) the time required to generate a search token and (2) the time needed by the CSP to find and return the identifiers of the files that contain the specified keyword. On average, the creation time for a search token was measured at 9 µs, while searching for a specific keyword over a set of 12,124,904 distinct keywords and 39,747,904 addresses required 3.2 sec.

B. CIPHERTEXT-POLICY ATTRIBUTE-BASED ENCRYPTION
For the implementation of CP-ABE, we used the scheme presented in [5], offered by Charm-Crypto Framework version 50.0 in a Docker container. The experiments were implemented in Python 3.6 and conducted on a Desktop machine with Intel Core i7-8700 at 3.20GHz (6 cores), 32GB RAM.

c: Setup Phase
The first phase of our experiments was devoted to measuring the time required to generate a master public/private key pair for a master entity. In our setup, we considered the existence of a single master entity responsible for the generation of CP-ABE keys. The time to generate a single pair was less than a second, while the total time for the generation of 200 master key pairs was measured at almost 6 seconds. These results are illustrated in Figure 3.

d: Users Key Generation
In the second phase of the experiments, we measured the average time needed to generate users' secret keys. In particular, we measured the time to generate a user's key while increasing the number of attributes associated with it. As can be seen in Figure 4, generating a user key with 1,000 attributes took almost 6.41 sec on average, while a key with 500 attributes required approximately 3.23 sec. These results are suitable for covering even more complex cases where big companies are required to generate large keys based on a wide variety of information. Thus, covering a long list of attributes is realistic and should not prevent an organization from adopting such an approach.
Moreover, as can be seen from Figure 5, the size of the key is almost linear in the number of attributes associated with it. In particular, the size of a key associated with 1,000 attributes is around 420KB, while for a key associated with 500 attributes the disk size was measured at almost 215KB. Finally, a key associated with 100 attributes has a size of approximately 45KB on the disk.
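
The near-linear behaviour can be checked directly from the reported figures by computing the cost per attribute. The sketch below uses only the numbers quoted in the text; the variable names are illustrative.

```python
# Reported key generation times (attributes -> seconds) and key sizes (attributes -> KB).
keygen_seconds = {500: 3.23, 1000: 6.41}
key_size_kb = {100: 45, 500: 215, 1000: 420}

# Per-attribute cost; roughly constant values indicate linear scaling.
ms_per_attribute = {n: 1000.0 * t / n for n, t in keygen_seconds.items()}
kb_per_attribute = {n: s / n for n, s in key_size_kb.items()}

for n, ms in ms_per_attribute.items():
    print(f"{n:>5} attributes: {ms:.2f} ms per attribute")
for n, kb in kb_per_attribute.items():
    print(f"{n:>5} attributes: {kb:.2f} KB per attribute")
```

Key generation stays close to 6.4 to 6.5 ms per attribute, and the key size stays between 0.42 and 0.45 KB per attribute, which matches the linear trend described for Figures 4 and 5.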

e: Encryption & Decryption
CLASH only uses CP-ABE to encrypt a symmetric key and not large volumes of data. Hence, we measured the time needed to encrypt and decrypt a symmetric key under policies of different sizes. We used access policies of type {1 AND 2 AND . . . AND n}, similar to [5]. Such policies are the most demanding since all attributes are required for successful decryption. The experiment can be divided into two stages. In the first stage, we measured the encryption process. In particular, we ran the encryption algorithm on a message with different policies. In the second stage, we decrypted the freshly generated ciphertexts with keys that are associated with a different number of attributes. In addition, we added access policies of a different structure to record the performance of decryption not only when all conditions need to be fulfilled (the most demanding case), but also when a random number of attributes is needed to satisfy the underlying policy. Figure 6 demonstrates the time required to encrypt a symmetric key with a random policy of size up to 1,000 attributes. Similarly, Figure 7 illustrates the time needed to decrypt a ciphertext by using a key with up to 1,000 attributes. As can be seen from the figures, the time to encrypt and decrypt a message depends on the particular attributes available and the size of the policy. In particular, the encryption of a key with a policy of 1,000 attributes took approximately 6.5 seconds, while the decryption time was measured at almost 0.068 seconds. However, for more realistic scenarios where policies contain around 200 attributes, the encryption time was around a second and the decryption time was almost 0.028 seconds. These results indicate that the underlying CP-ABE scheme adds little computational burden to the overall performance of the protocol.
In the second stage of the experiment, we focused on analyzing the behavior of the underlying ABE scheme. More precisely, we created an algorithm that randomly generates a policy that contains numerical attributes as well as conditions such as {(1 AND 2) OR (3 AND 4)}. This condition requires that at least one of the two parenthesized clauses is satisfied by the attributes of a user's key. Figure 8 shows the time needed to decrypt a ciphertext bound to a policy of up to 1,000 attributes. From the results shown in the graph, we can observe that the decryption time is linear regardless of the randomness of the policy.
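
The two kinds of policies used in these experiments can be illustrated with a small sketch. It performs plain boolean evaluation only and involves no cryptography; the nested-tuple representation and the function names are assumptions made for the example and are not the policy format of the CP-ABE library.

```python
def and_policy(n: int) -> str:
    """Build the all-attributes policy string '1 AND 2 AND ... AND n'."""
    return " AND ".join(str(i) for i in range(1, n + 1))


def satisfies(policy, attributes) -> bool:
    """Evaluate a nested policy: an int leaf, or ('AND' | 'OR', [sub-policies])."""
    if isinstance(policy, int):
        return policy in attributes
    op, children = policy
    results = (satisfies(child, attributes) for child in children)
    return all(results) if op == "AND" else any(results)


# The example condition from the text: (1 AND 2) OR (3 AND 4).
example = ("OR", [("AND", [1, 2]), ("AND", [3, 4])])

print(and_policy(4))
print(satisfies(example, {1, 2}), satisfies(example, {1, 3}))
```

A key holding attributes {1, 2} or {3, 4} satisfies the example policy, while a key holding only {1, 3} does not, since neither parenthesized clause is fully covered.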

C. IMPLEMENTATION AND EVALUATION OF The Cloud we Share
We used the SGX OpenSSL cryptographic library [3] to implement RSA with 4096-bit keys. The reason for selecting such a long key size was that we wanted to test the performance of our construction under the most demanding circumstances. Additionally, the development was done in C/C++ using Intel(R) SGX SDK 2.6 for Linux [4].
An SGX application is divided into two different parts: the trusted part (i.e. the enclave) and the untrusted part (i.e. the application). To make a call to the enclave, the untrusted application uses SGX's ECall function, which allows the application to enter the enclave. Similarly, SGX's OCall function is used to exit the enclave back to the untrusted space.

f: Setup Phase
The setup phase consists of launching the enclaves and generating all the necessary keys. Each enclave contains multiple functions that correspond to different parts of our construction. Therefore, we measured the time taken to launch each enclave separately. To acquire more accurate results, each enclave was launched 10,000 times and we measured the average completion time. The average time to launch the MS enclave, containing all the functions required for the generation of the RSA keys, was 25.4ms, while the average time to launch the REV enclave was 28ms. Similarly, the time needed to launch the KT enclave was measured at 27.6ms, and finally, the launching of the CSP enclave required 28ms. Note that the enclaves can be launched in parallel. Therefore, the time required to launch all four enclaves is 28ms. The final step of this setup phase is the generation of the RSA keys. In our construction, each enclave generates a 4096-bit RSA key pair. The time required for the generation of such a pair was measured at 840ms. However, this procedure can also be run in parallel, since each enclave generates its own key pair independently from the other enclaves. These results are illustrated in Table 4, along with the functions contained in each enclave.
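
The effect of launching the enclaves in parallel can be made explicit with a short calculation over the reported averages. The sketch assumes ideal parallelism, with no contention between the enclaves, so the parallel figures are lower bounds rather than measurements.

```python
# Average launch times reported in the text, in milliseconds.
launch_ms = {"MS": 25.4, "REV": 28.0, "KT": 27.6, "CSP": 28.0}
rsa_keygen_ms = 840.0  # one 4096-bit RSA key pair per enclave

sequential_launch = sum(launch_ms.values())  # one enclave after another
parallel_launch = max(launch_ms.values())    # all enclaves started together

sequential_keygen = rsa_keygen_ms * len(launch_ms)
parallel_keygen = rsa_keygen_ms              # each enclave generates its own pair

print(f"launch: {sequential_launch:.1f} ms sequential, {parallel_launch:.1f} ms parallel")
print(f"keygen: {sequential_keygen:.0f} ms sequential, {parallel_keygen:.0f} ms parallel")
```

Under this assumption, the parallel launch time is bounded by the slowest enclave (28ms), compared with about 109ms when the four enclaves are launched one after another.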

g: Enclave Attestation
Different enclaves can attest to each other to demonstrate the integrity of their software. SGX offers two different kinds of attestation: local and remote. Local attestation occurs between two or more enclaves running on the same platform, while remote attestation enables a third party to attest an enclave. Currently, verifying a quote from a third party involves contacting Intel's attestation server, a process that requires a license. Thus, for our experiments, we consider the case of local attestation. We measured the time needed for the KT enclave and the REV enclave to attest to each other, as part of CLASH.Manage. We ran the experiment 10,000 times and the average time was measured at 1.1ms.

h: Execution Time
In the last part of our experiments, we determined the running time of CLASH's functions by measuring the time required to (1) generate, (2) exchange and (3) verify all messages of our protocol. Each experiment was run 100,000 times to achieve a better estimation of the average time. Our focus was to measure the application execution time while it was running in secure containers (i.e. enclaves). Namely, we measured ECall functions at the moment of entering and exiting enclaves from the untrusted part of the application.
Our results are presented in Figure 9, which shows the average time needed for each of the functions. As can be seen, CLASH.Manage, CLASH.KeyTrayStore and CLASH.KeyShare are the most demanding functions, as they were measured at 474µs, 245µs and 81µs respectively. This result was expected due to the large number of exchanged messages. Moreover, CLASH.Store took 23µs. Finally, the total execution times of CLASH.Search, CLASH.Update and CLASH.Delete were measured at 17µs, 17µs and 20µs respectively.
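
The averaging procedure described above can be sketched as a small timing harness. The measurements in this section were taken in C/C++ around ECall boundaries; the Python version below only illustrates the method, and the placeholder workload and function names are assumptions.

```python
import time


def average_seconds(fn, runs: int) -> float:
    """Call fn() `runs` times and return the mean wall-clock time per call."""
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs


def dummy_operation():
    # Placeholder workload; in the experiments this would be an ECall into an enclave.
    sum(range(100))


avg = average_seconds(dummy_operation, runs=10_000)
print(f"average: {avg * 1e6:.2f} microseconds per call")
```

Timing the whole loop and dividing by the number of runs keeps the overhead of the clock calls out of each individual sample, which matters when a single call takes only a few microseconds.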

i: Open Science and Reproducible Research
To support open science and reproducible research, and to allow other researchers to use, test, and hopefully extend or enhance our protocol, our CLASH prototype and the ABE experiments have been uploaded to GitLab and are publicly available online. 1 In addition, the dataset that we used to perform the SSE experiments has been uploaded as a research artifact (Open Access) on Zenodo [25].

VIII. CONCLUSION
In this paper, we proposed The Cloud we Share, a hybrid encryption scheme based on SSE and ABE. Our construction allows a data owner to share her data in a privacy-preserving way and manage the access rights of the rest of the users. Moreover, we show that we can rely on the functionalities offered by Intel SGX, to design an access control mechanism that is agnostic to the underlying cryptographic primitives. In addition to that, we strongly believe that cloud-based services will rely less on traditional decryption of information, and more on computations over encrypted data. We hope that this work will kick-start a period of greater research in the area of privacy-preserving computations in untrusted clouds.