R-PEKS: RBAC Enabled PEKS for Secure Access of Cloud Data

In the recent past, a few works have combined attribute-based access control with multi-user PEKS, i.e., public key encryption with keyword search. Such attribute enabled searchable encryption is most suitable for applications where privileges change only once in a while. However, to date, no efficient and secure scheme is available in the literature for applications where privileges change frequently. In this paper our contributions are twofold. Firstly, we propose a new PEKS scheme for string search which, unlike the previous constructions, is free from bi-linear mapping and is 97% more efficient than the PEKS for string search proposed by Ray et al. in TrustCom 2017. Secondly, we introduce role based access control (RBAC) to multi-user PEKS, where an arbitrary group of users can search and access the encrypted files depending upon roles. We term this integrated scheme R-PEKS. The efficiency of R-PEKS over the PEKS scheme is up to 90%. We provide formal security proofs for the different components of R-PEKS and validate these schemes using a commercial dataset.


INTRODUCTION
In the cloud, encryption is a suitable mechanism to protect data at rest. However, encryption prevents searching within the data, which is essential for the usability of the encrypted data. This gives rise to a new area of research called searchable encryption (SE). One problem of using SE in the symmetric setting is the maintenance of the index, especially for applications where the dataset undergoes frequent updates. A variant of SE, called public key encryption with keyword search or PEKS, is the most popular encryption technique; it was introduced in [1] and is free from any index generation. Following this, much research has been carried out on single-user searchable encryption (SUSE) with access control mechanisms. However, multi-user searchable encryption (MUSE) is becoming more relevant for most commercial applications, which involve large groups of users with complex access structures. Some work has also been done on MUSE by delegating the permission of searching among multiple users in an access controlled environment. Most of these works involve an attribute based access structure.
SE for keyword search yields huge outputs for most commercial applications, which deal with large datasets. Most of these outputs are not intended and give rise to unnecessary network traffic. SE for string search is of special interest as it customizes the search. In [2], a non-adaptively secure SE for string search was proposed, but the approach was in a symmetric key setting. In [2], the authors also pointed out why it is impossible to design adaptively secure SE in symmetric search. In [3], the authors introduced the idea of string search using PEKS, which is adaptively secure. All existing PEKS schemes are based on pairing based cryptography, where the basic component is bi-linear mapping. One problem with this technique, especially in the context of string search, is the huge computational cost, which makes it less suitable for commercial applications [4]. In this paper, we introduce a new adaptively secure PEKS scheme which is free from bi-linear mapping and compatible with the role based access control (RBAC) mechanism for secure search and access of outsourced encrypted data.
Recently, most researchers have applied SE with attribute based access control (ABAC) to provide restricted access, based on attributes, to outsourced personal health records (PHR). In PHR datasets, changes to user permissions are minimal [5], [6]. Since ABAC is the most suitable approach for cases where changes to user permissions are minimal, protecting PHR data in the cloud by means of attribute based keyword search over encrypted data is the preferred choice of researchers [4], [7]-[9].
In big organizations, a large number of employees access data under a complex access structure. MUSE finds its application in such cases. If the search needs to be customized using string search, PEKS is the best possible solution. Since the access structure is defined based on the roles of a large number of employees, which are subject to frequent modifications, RBAC is considered the most preferred access control mechanism [6]. An example is securing wireless networks with RBAC in small and medium sized businesses. Here the ad-hoc process of granting and revoking network privileges and access to users becomes extremely difficult to manage as the number of individuals involved increases beyond a certain point. Thus a combination of RBAC with PEKS is a natural solution for designing access controlled MUSE for frequently changing string search privileges. Additionally, RBAC has a minimal overhead when a large number of employees enter and exit the organization. Here we describe an industrial application where our model can provide a suitable solution; due to a confidentiality agreement, the company's identity is not disclosed.
As per the PCI DSS (Payment Card Industry Data Security Standard), credit/debit card information should not be stored in plain text. So companies use a hashing technique (SHA-256) to store the card information in the database. During a transaction, whenever the user enters his card details, information such as name and address is auto filled based on the hash value of the card details.
It may be noted that the use of a hashing technique is also exposed to the risk of statistical attack. In our model, even if the adversary gets the trapdoor, he cannot search unless he gets access to the search algorithm. Migrating from hashing to PEKS comes with additional cost, but to make the searching process efficient we introduce role based PEKS, i.e., R-PEKS, for such situations. To the best of our knowledge, no secure searching scheme is available using PEKS with RBAC.
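The hash-based lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the table `card_index`, the helpers `card_digest`, `autofill` and `guess_cards`, and the card numbers are all hypothetical and do not come from the application discussed. The sketch shows why an unsalted, deterministic digest can be tested against guesses offline, which is one way such a store is exposed.

```python
import hashlib

def card_digest(card_number: str) -> str:
    """Unsalted SHA-256 digest of a card number (hypothetical helper)."""
    return hashlib.sha256(card_number.encode("utf-8")).hexdigest()

# Hypothetical store: digest -> customer details; no plaintext card number is kept.
card_index = {
    card_digest("4111111111111111"): {"name": "Alice", "address": "1 Main St"},
}

def autofill(card_number: str):
    """Return the stored details for a card, or None if the digest is unknown."""
    return card_index.get(card_digest(card_number))

def guess_cards(stored_digests, candidates):
    """Equal inputs always give equal digests, so guesses can be checked offline."""
    return [c for c in candidates if card_digest(c) in stored_digests]
```

A trapdoor-based search, in contrast, requires the secret key to produce the query, which is the property the R-PEKS model relies on.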
In this paper, we consider the following three real world scenarios where PEKS, integrated with RBAC, is a good choice for commercial applications. We coin the term R-PEKS for it. We validate and compare our model with existing schemes using the methodologies proposed in [10].
Scenario 1. An organization wants to outsource a large amount of data to the cloud service provider (CSP). Such data can be accessed by the set of employees who are privileged by their roles to search.
Scenario 2. When an employee P wants to access certain data of some employee Q who is working under him, the privilege is given to P to access Q's data.
Scenario 3. When a group G is formed from a subset of employees who have different roles, the group level privilege is given to all members for accessing data pertaining to the group.
Most threat models assume that the data owner and data user are trusted. However, the cloud service provider (CSP) is considered honest-but-curious [2], [3], [11]. Recently, in MUSE, a threat model was developed in which data users collude with the CSP and both are honest-but-curious [12]. We develop R-PEKS under this threat model. The main contributions of this paper are as follows:
1) Our proposed model is at the intersection of RBAC and PEKS, which enables the data owner to provide restricted data search and access based on the RBAC configuration. We implement the access control in three different modes depending on the above scenarios, namely, single user (Scenario 1), multi-user peer to peer (Scenario 2) and multi-user group (Scenario 3).
2) We introduce a new PEKS scheme which, unlike the previous schemes, is free from bi-linear mapping and is more practical and 97% more efficient than the earlier scheme of [3], while providing an equivalent level of security.
3) With normal PEKS, the search request of a user u is spread over the whole dataset of the organization, of which only a limited portion is meant for the access of user u. In such PEKS, the access control is handled manually after the search. In this context, R-PEKS enforces the access control during the search, and the efficiency of R-PEKS over the PEKS scheme is up to 90% when the part of the dataset pertaining to user u is 10% of the whole cloud data.
The rest of the paper is organized as follows. In Section 2 we discuss the related work. Section 3 highlights the architecture of our proposed model R-PEKS. Section 4 defines the R-PEKS components and Section 5 describes the R-PEKS design. In Section 6 we discuss the security analysis of the different components of R-PEKS. In Section 7 we provide experimental results. In Section 8 we focus on comparison with other schemes, and Section 9 concludes the paper and outlines areas for future work.

RELATED WORK
Access control mechanisms with SUSE schemes provide restricted access to data based on roles, keys and attributes. In role based access, i.e., the role based encryption of [13], the security for cloud storage is enabled by the data owner, who encrypts data using some role and public parameters. In the key based access control of [14], the decryption key of each accessible file is given to the users. Integrating such a model with SE increases the complexity of key management whenever a user accesses a large number of files. In [8], [9], multi-field search queries and fine-grained access control were implemented in SE with ABAC during file access in the cloud. The file in the cloud is attached with an encrypted index to label the keywords and the access policy. So the user can access the file if the keyword is matched and the access policy mechanism grants the permission. Further, it allows the user to locally derive the search capability. String search can be viewed as ordered multiple keyword search. A few schemes were proposed for string search in symmetric searchable encryption (SSE) [2] and PEKS [3], but both these schemes are without an access control mechanism.
Access control mechanisms with MUSE schemes are constructed using broadcast encryption, along with attributes, coarse-grained and fine-grained access control. The key problems identified in MUSE are key management and access control. Broadcast encryption [15] was the first mechanism to introduce access control in a multi-user environment, where the message is encrypted by the broadcaster and can be decrypted only by those users who are part of the broadcast channel. Further, based on the broadcast encryption technique, coarse-grained access control was introduced into MUSE in the cloud environment in [16]. In [17], Key-Aggregate Searchable Encryption (KASE) was introduced in cloud storage to reduce key management during data sharing, where each document was encrypted with a different key. This was implemented by generating a single key, called the key-aggregate, in MUSE. In [18], single or multi-keyword search based on attributes and access control was implemented, which is known as fine-grained access control.
In [19], multi-keyword search along with authorization in the multiple user setting was studied, where the authorization was meant for a specified period of time. Work has also been done to improve the search efficiency among multiple users using SSE for cloud applications [20]. In the recent past, PEKS was combined with ABAC to provide restricted access to personal health records (PHR). In [7], the authors designed PEKS integrated with ABAC to support multi-keyword search in multi-user settings. In [4], an expressive SE scheme in PEKS was studied and developed for outsourced PHRs. In this work the authors treated keyword search predicates as access policies and expressed them as conjunctions or any other boolean expressions of keywords. Further, in [4], bi-linear mapping in a prime-order group was used to make it somewhat more secure than bi-linear mapping in a composite-order group. It may be noted that ABAC in PEKS is not a suitable mechanism when access policies change quite often. This is because, with every change in the policies, the data owner needs to download, decrypt and re-encrypt the data [5]. In [12], data users colluding with the CSP is considered a mandatory requirement of MUSE. Moreover, in SE, most threat models focus only on privacy against an honest-but-curious CSP and data user [12], [16].
In the light of the above discussion, it is clear that an access control mechanism in SE is essential to maintain a large number of privileged accesses. Such privileged access in SE should be amenable to future analysis and efficient change. In this paper, we propose R-PEKS based on RBAC and a new PEKS. Unlike the previous ABAC based schemes, R-PEKS is more suitable for applications where change of user permissions is a frequent activity. Also, the underlying PEKS in R-PEKS is free from bi-linear mapping and is thus more efficient than earlier schemes. In Table 1, we provide a comparison of our scheme with the existing models. It may be noted that our PEKS security relies on the CDH (Computational Diffie-Hellman) assumption, whereas in [3], [4] the security was based on the BDH (Bi-linear Diffie-Hellman) assumption. CDH is a very popular hard problem and many state-of-the-art schemes are based on the CDH assumption. For example, the authors of [21], [22] also used CDH as the basis of their security proofs.

ARCHITECTURE OF OUR PROPOSED MODEL -R-PEKS
The design of the single-user and multi-user R-PEKS settings for secure access of cloud data is given in Figure 1. The data owner of Figure 1 is the enterprise, which can grant or deny a certain set of permissions to its users. Here encryption and searching are the two operations which constitute a permission. Therefore, to maintain restricted access on such permissions, the data owner maintains the role manager. In Figure 1, the roles assigned to users as well as the permissions are maintained across the corporate network and public cloud respectively. A detailed description of the activities is given below:
1) Initially, the data owner generates the RBAC configuration and a key pair for all the authorized users, and enables two different modes of searchable encryption, i.e., SUSE and MUSE.
2) The data owner encrypts using PEKS and AES (see Figure 1) and stores the dataset using single-user and multi-user R-PEKS in the public cloud.
3) The trapdoor is generated by the data owner on receiving the string search request and the mode of R-PEKS from the data user. Such trapdoors are used in the public cloud to search and identify the required files.
4) Finally, the public cloud returns the ids of the files that match the search to the data owner, which then decrypts the corresponding AES-encrypted files using AES decryption for the data user.
Remark 1. It may be noted that PEKS encryption is a one way function, as searchable ciphers produced by PEKS encryption cannot be decrypted. It is used only for searching using the trapdoor. In R-PEKS, to retrieve the plain text file, we encrypt the plain text files using PEKS as well as AES and maintain a map between these two encrypted file pointers. Whenever some PEKS encrypted file pointer is detected by the search algorithm, the corresponding AES encrypted file is sent to the data owner for decryption and retrieval of the plain text file.
Remark 2. Since the AES encryption and decryption components of R-PEKS are straightforward, we skip this component while describing R-PEKS in the following sections.

R-PEKS COMPONENTS

PEKS: Definitions and Preliminaries
Document collections and Data Structures: Let $\Delta = \{w_1, w_2, \ldots, w_d\}$ be a dictionary of $d$ words and $P(\Delta)$ be the set of all possible documents which are collections of words. Let $D \subseteq P(\Delta)$ be the collection of $n$ documents $D = (D_1, D_2, \ldots, D_n)$. Let $id(D_i)$ be the unique identifier for the document $D_i$. We denote the list of all $n$ document identifiers in $D$ by $id(D)$, i.e., $id(D) = \{id(D_1), \ldots, id(D_n)\}$. Furthermore, let $D(w_j)$ be the collection of all documents in $D$ containing the word $w_j$. A string $s$ of $l$ words is an ordered tuple $(w_1, w_2, \ldots, w_l)$. Let $D(s)$ denote the collection of documents in $D$ that contain the string $s$. It is easy to check that $D(s) \subseteq \bigcap_{i=1}^{l} D(w_i)$. We denote by $\delta(D)$ all the distinct keywords connected to the document collection $D$.
Cryptographic Primitives: Here we define cryptographic primitives that are needed for our PEKS scheme for string search.
We typically denote an arbitrary negligible function by negl, such that for any arbitrary polynomial $p(\cdot)$ there exists an integer $a$ such that for all $\lambda > a$, $negl(\lambda) < \frac{1}{p(\lambda)}$ [23]. We also use a pseudo prime number generator [24], denoted by $PPNG(1^\lambda)$, which outputs a $\lambda$-bit probabilistic prime number. For a finite set $S$, we denote the operation of picking an element uniformly at random from $S$ by $x \leftarrow S$.
Public Key Encryption with Searching:
Definition 1. A non-interactive public key encryption with keyword search scheme consists of the following polynomial time randomized algorithms:
1. KeyGen($\lambda$): This algorithm takes a security parameter $\lambda$ and generates a public/private key pair $A_{pub}$, $A_{priv}$.
2. PEKS Enc($A_{pub}$, $w$): For a public key $A_{pub}$ and a word $w$, this algorithm produces a searchable encryption of $w$.
3. Trapdoor($A_{priv}$, $w$): Given a private key $A_{priv}$ and a word $w$, this algorithm produces a trapdoor $t_w$.
4. Search($A_{pub}$, $C$, $t_w$): Given the public key $A_{pub}$, a searchable encryption $C$ = PEKS Enc($A_{pub}$, $w'$) and a trapdoor $t_w$, this algorithm outputs 'yes' if $w = w'$, otherwise 'no'.
Two cryptographic hash functions are used in our scheme $\Pi_{PEKS}$, namely $H^1$ and $H^2$, which are as follows: For our implementation we take $G$ as $Z_p$. Thus, for a word

Our PEKS scheme
In this section we present our PEKS scheme $\Pi_{PEKS}$ for string search.
Scheme 1 ($\Pi_{PEKS}$). The scheme $\Pi_{PEKS}$ is a collection of four polynomial time algorithms (KeyGen, PEKS Enc, Trapdoor, Search) such that:
1. KeyGen($1^\lambda$): KeyGen is a probabilistic key generation algorithm that is run by the data owner to set up the scheme. It takes a security parameter $\lambda$ and returns the setup. Since KeyGen is randomized, we write it as
2. PEKS Enc: PEKS Enc is a deterministic algorithm run by the data owner to generate the cipher text $C_i$ corresponding to document $D_i$. Since PEKS Enc is deterministic, we write this as
3. Trapdoor($a$, $s$): Trapdoor is a deterministic algorithm run by the data owner to generate a trapdoor for a given string of words $s = (w_1, w_2, \ldots, w_l)$.
It takes $k_1$, $k_2$, $a$ and $s$ as input and outputs $t = (t_1, t_2, \ldots, t_l)$, where $t_i$ is the trapdoor corresponding to the word $w_i$. Since Trapdoor is deterministic, we write this as $t$ = Trapdoor($k_1$, $k_2$, $a$, $s$).
4. Search($C$, $t$, $k_2$): Search is run by the server in order to search for the documents in $D$ that contain the string $s$. It takes a ciphertext collection $C$ corresponding to $D$ and the trapdoor $t$ corresponding to $s$, and returns $D(s)$, the set of identifiers of documents containing the string $s$.
Remark 3. To encrypt the document $D_i$, we read the whole document as a stream of words and form an ordered sequence $(w_1, w_2, \ldots, w_l)$. A string is a subsequence of this sequence, which is encrypted by applying PEKS Enc to every word in the string. If the same word occurs multiple times, the encrypted footprint of each instance will be different, depending on the position (denoted by $k$ in the algorithm) of the word in the string.
$j \leftarrow 1$;
while $j \leq l$ do
$A_j = H^1_{k_2,k_1}(w_j)$; $t_j = A_j^a$; $j \leftarrow j + 1$;
end while

Remark 4. To search in a cipher, we first read all the ciphers of the form $[B]$ into a list called listB. To find the match of the first trapdoor, i.e., $t_1$, we check it against each entry of listB. This is done in the second while loop. If a match is found, the index of that block is stored in start pointer and the remaining $l - 1$ trapdoors are checked against the next $l - 1$ blocks in listB, starting from start pointer + 1. If a match is found in all $l$ successive steps, we add the file pointer to encrypted file pointer and go to the next file. If the match fails in any step, we repeat the matching of $t_1$ for the remaining blocks in the same way until the file is exhausted.
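The trapdoor loop and the block-by-block matching of Remark 4 can be sketched in Python as follows. This is a toy illustration, not the scheme's implementation: the modulus `P`, the HMAC-based stand-ins `h1` and `h2` for the two hash functions, the position-dependent exponent in `peks_enc`, and the matching rule in `search` (raise each trapdoor to the candidate block position before hashing) are all assumptions made so that the example runs end to end.

```python
import hashlib
import hmac

P = 2**127 - 1  # toy prime modulus (assumption; far too small for real use)

def h1(k1: bytes, k2: bytes, word: str) -> int:
    """Stand-in for the keyed hash H1: maps a word to an element of Z_P^*."""
    digest = hmac.new(k1 + k2, word.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % (P - 2) + 2

def h2(k2: bytes, element: int) -> str:
    """Stand-in for the keyed hash H2: maps a group element to a block."""
    return hmac.new(k2, element.to_bytes(16, "big"), hashlib.sha256).hexdigest()

def peks_enc(k1, k2, a, words):
    """Block for the word at 1-based position m is H2(A^(a*m)), with A = H1(word)."""
    return [h2(k2, pow(h1(k1, k2, w), a * m, P)) for m, w in enumerate(words, start=1)]

def trapdoor(k1, k2, a, words):
    """t_j = A_j^a for each word of the query string, in order."""
    return [pow(h1(k1, k2, w), a, P) for w in words]

def search(blocks, t, k2):
    """True if the l trapdoors match l successive blocks of the file."""
    if not t:
        return False
    for start in range(len(blocks) - len(t) + 1):
        # The block at index start + j sits at 1-based position start + j + 1.
        if all(h2(k2, pow(t[j], start + j + 1, P)) == blocks[start + j]
               for j in range(len(t))):
            return True
    return False
```

For example, encrypting the word sequence `pay invoice to bank invoice` and querying for `invoice to` matches, while the reordered query `to invoice` does not, since the words must appear in successive blocks.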
Remark 5. In the PEKS, we assume that the data owner creates the data, encrypts it using PEKS and uploads it to the public cloud for future search. So KeyGen and PEKS Enc are run by the data owner. To search, the query is converted to a trapdoor by the data owner and is given to the cloud. The cloud runs the search using these trapdoors along with the public key of the data owner. In R-PEKS, the data owner is the data administrator who is responsible for creating employee accounts and providing employees with role based permissions to search on the encrypted data stored in the server. So, in R-PEKS, data owner runs KeyGen for  2). Depending on the roles, the data owner fetches the keys of the corresponding employee/user and encrypts the files. To search, employees request from the data owner the legitimate trapdoor for a given search query. Search is run in the server using the trapdoor and the public key of the corresponding data, and the search results are sent to the data owner, who further decrypts and redirects the files to the employee based on the permissions.
In the next lemma, we study the correctness of the search algorithm.

Lemma 1 (Correctness). is the trapdoor corresponding to $s$ taken in order, i.e., $id(D_i) \in$ Search($C$, $t$, $k_2$).
Proof: Let $A_1 = H^1_{k_2,k_1}(w_1)$. Note that then $B_1 = H^2_{k_2}(A_1^{am})$ for some integer $m$ depending on the position of the word $w_1$ in the file. Since $t_1 = A_1^a$, we have $H^2_{k_2}(t_1^m) = H^2_{k_2}(A_1^{am}) = B_1$. This detects $[B_1]$ for $t_1$. Two consecutive blocks $B_j$ and $B_{j+1}$ are detected for two consecutive words $w_j$ and $w_{j+1}$ if there are two consecutive numbers, say $k$ and $k + 1$, such that $B_j = H^2_{k_2}(A_j^{ak})$ and $B_{j+1} = H^2_{k_2}(A_{j+1}^{a(k+1)})$, where $A_{j+1} = H^1_{k_2,k_1}(w_{j+1})$ and $A_j = H^1_{k_2,k_1}(w_j)$. Hence the result follows.
Remark 6. It may be noted that the second while loop of the string search algorithm (i.e., Algorithm 4) is responsible for maintaining the ordering of keywords, which is essential to detect a string. We have also achieved multi-keyword search by relaxing this ordering condition.
For the experimental results we have used the variant which deals with string search only, i.e., Algorithm 4.
Therefore our PEKS scheme $\Pi_{PEKS}$ can search for strings and multiple keywords based on the user's request. In order to perform a selective search, we integrated our PEKS scheme $\Pi_{PEKS}$ with the RBAC model (i.e., R-PEKS). A detailed explanation of the RBAC components and functions which are crucial for integration with our PEKS scheme $\Pi_{PEKS}$ is given in Section 4.3.

RBAC model
The NIST RBAC reference model defines different RBAC elements, RBAC assignments and mapping functions [25].
The pictorial representation of the NIST RBAC model is shown in Figure 2.
Fig. 2. NIST RBAC [25] (users U, roles R and permissions P, linked by the assignments UA and PA).
Let us define some of the elements, assignments and functions which are crucial for R-PEKS. Let $U$ denote a set of users. Let $R$ denote a set of roles, where each role is a job assigned to some user in an organization. Let $P$ denote a set of permissions, where each permission is an approval to perform an operation on an object. Formally we express this by $P = 2^{OP \times OB}$, where $OP$ is the set of operations and $OB$ is the set of objects. Let $S_u$ denote a set of users who authorize the user $u$ to search and access data pertaining to them. Let $G$ denote a set of authorized users, i.e., a group. Let UA denote a many-to-many mapping which is the user-to-role assignment relation, i.e., UA $\subseteq U \times R$. Let PA denote a many-to-many mapping which is the permission-to-role assignment relation, i.e., PA $\subseteq P \times R$. Also from [25], assigned permissions: $R \rightarrow 2^P$ is a mapping function from a role to a set of permissions. So, for some $r \in R$, assigned permissions($r$) = $\{ p \in P \mid (p, r) \in$ PA $\}$.
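The assignment relations and the role-to-permission mapping above can be illustrated with plain Python sets. This is a minimal sketch: the user names, role names and (operation, object) pairs are hypothetical sample data, and `assigned_permissions` simply mirrors the set-builder definition.

```python
# Hypothetical sample data: a permission is an (operation, object) pair.
UA = {("alice", "manager"), ("alice", "clerk"), ("bob", "clerk")}
PA = {
    (("search", "sales_db"), "manager"),
    (("search", "hr_db"), "manager"),
    (("encrypt", "sales_db"), "clerk"),
}

def assigned_permissions(role, pa):
    """All permissions p with (p, role) in PA."""
    return {p for (p, r) in pa if r == role}
```

Storing UA and PA as sets of pairs keeps both relations many-to-many, as in the reference model.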

RBAC Configuration.
Let UP denote a many-to-many mapping which is the user-to-permission assignment relation, i.e., UP $\subseteq U \times P$. Also let us consider the function RoleMining which on input UP outputs UA and PA, i.e., RoleMining: $U \times P \rightarrow \{$ UA, PA $\}$. The access control on RBAC can be defined as a CheckAccess function which is responsible for the authorization process. The function is defined as CheckAccess: $U \times P \rightarrow P$. CheckAccess takes a user $u$ and the set of all permissions and returns the legitimate set of permissions pertaining to user $u$ through roles. Formally, CheckAccess($u$, $P$) = $\{p : p \in P \wedge \exists r, (u, r) \in$ UA $\wedge (p, r) \in$ PA $\}$. Similarly, we define CheckGroupAccess: $G \times P \rightarrow P$, where $G$ is a group of users who may have different roles. Through CheckGroupAccess the permission is granted to all members of the group depending on the role assigned to the group.
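A small Python sketch of the two authorization functions follows. It assumes that a user receives a permission when at least one of the user's roles carries it, and it introduces a hypothetical group-to-role relation `GA` (not named in the text) to model the role assigned to a group; all sample data are made up.

```python
# Hypothetical sample data.
UA = {("alice", "manager"), ("bob", "clerk")}
GA = {("audit_team", "auditor")}          # assumed group-to-role relation
PA = {
    (("search", "sales_db"), "manager"),
    (("encrypt", "sales_db"), "clerk"),
    (("search", "audit_log"), "auditor"),
}
ALL_PERMISSIONS = {p for (p, _) in PA}

def _permissions_for_roles(roles, permissions, pa):
    """Permissions granted through at least one of the given roles."""
    return {p for p in permissions if any((p, r) in pa for r in roles)}

def check_access(user, permissions, ua, pa):
    """Permissions a user holds through any of the user's roles."""
    roles = {r for (u, r) in ua if u == user}
    return _permissions_for_roles(roles, permissions, pa)

def check_group_access(group, permissions, ga, pa):
    """Permissions shared by all members of a group via the group's roles."""
    roles = {r for (g, r) in ga if g == group}
    return _permissions_for_roles(roles, permissions, pa)
```

A user or group without any role receives the empty permission set, so access is denied by default.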

R-PEKS DESIGN
In this section we present our R-PEKS for single-user and multi-user settings. Those components of $\Pi_{PEKS}$ that are reused in the R-PEKS schemes as they are, are not described here. For those components of $\Pi_{PEKS}$ which are modified with access control functionality, we use the meta character "*" for the same set of parameters used in the corresponding component in $\Pi_{PEKS}$. Also, we modify the name of such components by prefixing the original name used in $\Pi_{PEKS}$ by R.

R-PEKS scheme
The single-user R-PEKS scheme is a SUSE in R-PEKS, where user $u$ is authorized to search in the encrypted domain restricted by the assigned role. The construction is given below.
Scheme 2. A single-user R-PEKS scheme, $\Pi_{suse}$, is a collection of five polynomial-time algorithms (KeyGen, R-PEKS Enc, Trapdoor, R-Search, CheckAccess) such that:
1. R-PEKS Enc(*, CheckAccess($u$, .)): R-PEKS Enc is a probabilistic algorithm run by the data owner to generate the cipher text $C_i$ for $D_i$, if the permission set $P_u$, obtained from CheckAccess, allows access to $D_i$.
2. R-Search(*, CheckAccess($u$, .)): R-Search is run by the CSP in order to search only for the privileged documents in $D$ which are accessible under the permissions obtained by CheckAccess.

Remark 7.
In the single-user R-PEKS scheme, the data owner performs only the selective encryption and the CSP performs only the selective search, based on the roles assigned to the user using CheckAccess.
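The effect of enforcing access control during the search, rather than after it, can be sketched as follows. This is an illustration under stated assumptions: `r_search`, its parameters and the substring predicate `toy_match` are hypothetical stand-ins (the real Search operates on PEKS ciphertexts and trapdoors), and `allowed_ids` plays the role of the document set permitted by CheckAccess.

```python
def r_search(ciphertexts, trapdoor, match, allowed_ids):
    """Scan only the documents the user's permissions allow.

    ciphertexts: dict mapping document id -> stored searchable cipher
    match:       predicate standing in for the underlying Search algorithm
    Returns (matching ids, number of documents scanned).
    """
    hits, scanned = [], 0
    for doc_id in sorted(ciphertexts):
        if doc_id not in allowed_ids:
            continue  # skipped without running the costly match
        scanned += 1
        if match(ciphertexts[doc_id], trapdoor):
            hits.append(doc_id)
    return hits, scanned

def toy_match(cipher, trapdoor):
    """Placeholder for Search: plain substring test, for illustration only."""
    return trapdoor in cipher

# Ten hypothetical documents; the user's role permits only document 3.
store = {i: f"record {i} invoice" for i in range(10)}
```

With one permitted document out of ten, the restricted search scans one document instead of ten, which is the kind of saving behind the figure of up to 90% quoted for a user whose share is 10% of the data.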
The multi-user R-PEKS: peer to peer scheme is a MUSE in R-PEKS. Here user $u$ is authorized to search in the encrypted domain of the users in $S_u$. The construction is given below.
Scheme 3. A multi-user R-PEKS: peer to peer scheme $\Pi_{musePP}$ is a collection of seven polynomial-time algorithms (KeyGen, R-PEKS Enc, AddUser, RevokeUser, Trapdoor, R-Search, CheckAccess) such that:
Remark 8. In the multi-user R-PEKS: peer to peer scheme, the requisite permission given by the data owner to search on others' data is managed by AddUser and RevokeUser, which add and revoke the user's privilege respectively. The data owner performs the selective encryption for users based on the assigned roles, using the keys generated for the individual users. Finally, the CSP performs the selective search for the authorized user who has the requisite permission to search on others' data.
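One way to picture the bookkeeping behind AddUser and RevokeUser is a map from each user $u$ to the set $S_u$. The class below is a hypothetical sketch only: the name `PeerAccess`, its methods and the in-memory dictionary are assumptions, and the key handling that the scheme performs alongside these operations is omitted.

```python
class PeerAccess:
    """Tracks, for each user u, the set S_u of users whose data u may search."""

    def __init__(self):
        self._s = {}  # u -> S_u

    def add_user(self, u, owner):
        """Grant u the privilege to search owner's encrypted data."""
        self._s.setdefault(u, set()).add(owner)

    def revoke_user(self, u, owner):
        """Withdraw the privilege; a no-op if it was never granted."""
        self._s.get(u, set()).discard(owner)

    def searchable(self, u):
        """Return a copy of S_u."""
        return set(self._s.get(u, set()))
```

In the setting of Scenario 2, granting P access to Q's data corresponds to `add_user("P", "Q")`, and a later `revoke_user("P", "Q")` removes it.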
The multi-user R-PEKS: group scheme is a MUSE in R-PEKS. Here group $G$ is authorized to search in the encrypted domain of some users (i.e., a subset of $U$) depending on the role assigned to $G$.
Scheme 4. A multi-user R-PEKS: group scheme $\Pi_{museG}$ is a collection of seven polynomial-time algorithms (R-KeyGen, R-PEKS Enc, Group AddUser, Group RevokeUser, Trapdoor, R-Search, CheckGroupAccess) such that:
1. R-KeyGen($1^\lambda$): R-KeyGen is a probabilistic key generation algorithm that is run by the data owner to set up the scheme for the group $G$.
2. R-PEKS Enc(*, CheckGroupAccess($G$, .)): R-PEKS Enc is a probabilistic algorithm run by the data owner to generate the cipher text $C_i$ for $D_i$, if the permission set $P_G$, obtained from CheckGroupAccess, allows access to $D_i$.
3. Group AddUser and Group RevokeUser are deterministic algorithms run by the data owner to add a user to and remove a user from the group respectively. Since these functions are very similar to AddUser and RevokeUser, they are not shown explicitly.
4. R-Search(*, CheckGroupAccess($G$, .)): R-Search is run by the CSP in order to search only for the privileged documents in $D$ which are accessible under the permissions obtained by CheckGroupAccess over group $G$.

Remark 9.
In the multi-user R-PEKS: group scheme, the group, which is a subset of users, is created by the data owner. Key generation, group privileges and management of the group are handled by the data owner. Further, the data owner performs the selective encryption based on the roles assigned to the group, using the generated keys. Finally, on request, the CSP performs the selective search based on the roles assigned to the group.
Remark 10. We observe that R-PEKS is independent of the underlying PEKS, i.e., $\Pi_{PEKS}$, in the sense that any PEKS system can be used to instantiate R-PEKS. It may be noted that in R-PEKS for a single user, i.e., $\Pi_{suse}$, the encryption is done using the underlying PEKS encryption whenever there is a permission derived from the role based structure given by CheckAccess(.) (see Scheme 2). Once the permission is given, the encryption is done using the encryption algorithm of $\Pi_{PEKS}$, i.e., PEKS Enc(.). A similar analysis holds for Search(). Thus the idea of R-PEKS can be instantiated with respect to any arbitrary PEKS system. For example, in Subsection 7.2, for the purpose of analysis, RBAC is instantiated with the PEKS proposed in [3] under the name r-PEKS. To the best of our knowledge, prior to this work, the PEKS scheme of [3] was the only adaptively secure PEKS scheme for string search, and so we select this scheme for comparison.

SECURITY ANALYSIS
Threat model. In the R-PEKS scheme, we consider both the CSP and the data users as honest-but-curious. Under this assumption, the CSP can infer additional private information, i.e., role-permission assignments and other information related to the search pattern and access pattern of the data [15]. Further, malicious data users may collude with the CSP to access unauthorized files by tweaking the roles. Unlike [18], the key and role management is handled by the data owner. So in our model, the CSP cannot collude with any revoked malicious users to access unauthorized privileges.
In the next two subsections we discuss the security of R-PEKS and its components. In Subsection 6.1, we show that our PEKS is adaptively secure under the CDH assumption. In Subsection 6.2, we show that our R-PEKS system ensures data confidentiality through secure access.

Security of PEKS
In this section we show that $\Pi_{PEKS}$ is secure against adaptive string attack. The basic idea behind an adaptive string attack is that the adversary $A$ is allowed to ask for PEKS encryptions of multiple strings chosen adaptively. The definition of security requires that $A$ should not be able to distinguish the PEKS encryptions of two arbitrary strings of the same length, even when $A$ is given access to the PEKS Enc() and Trapdoor() oracles. We first define an experiment for any PEKS scheme $\pi$ = (KeyGen, PEKS Enc, Trapdoor, Search), any adversary $A$, and any value $k$ of the security parameter.

Definition 2. [Game Adaptive Generic$_{A,\pi}(1^k)$]
1. A key $(pk, sk)$ is generated by running KeyGen($1^k$).
2. The adversary $A$ is given input $1^k$ and oracle access to PEKS Enc(.) and Trapdoor(.), and outputs a pair of strings $s_0$, $s_1$ of the same length, say $m$.
3. $b \leftarrow \{0, 1\}$ is chosen, and then a ciphertext $c \leftarrow$ PEKS Enc($s_b$) is computed and given to $A$. We call $c$ the challenge ciphertext.
4. The adversary $A$ continues to have oracle access to PEKS Enc(.) and Trapdoor(.), and outputs a bit $b'$.
5. The output of the experiment is defined to be 1 if $b' = b$, and 0 otherwise. In case Game Adaptive Generic$_{A,\pi}(1^k) = 1$, we say that $A$ succeeded.
Since our scheme deals with two cryptographically strong hash functions, namely $H^1$ and $H^2$, we would like to extend Definition 2 by giving the adversary oracle access to these two hash functions, which leads to the modified security definition given below.

Definition 3. [Game Adaptive$_{A,\pi}(1^k)$]
1. A key $(pk, sk)$ is generated by running KeyGen($1^k$).
2. The adversary $A$ is given input $1^k$ and oracle access to PEKS Enc(.), $H^1(.)$, $H^2(.)$ and Trapdoor(.), and outputs a pair of strings $s_0$, $s_1$ of the same length, say $m$.
3. $b \leftarrow \{0, 1\}$ is chosen, and then a ciphertext $c \leftarrow$ PEKS Enc($s_b$) is computed and given to $A$. We call $c$ the challenge ciphertext.
4. The adversary $A$ continues to have oracle access to PEKS Enc(.), $H^1(.)$, $H^2(.)$ and Trapdoor(.), and outputs a bit $b'$.
5. The output of the experiment is defined to be 1 if $b' = b$, and 0 otherwise. In case Game Adaptive$_{A,\pi}(1^k) = 1$, we say that $A$ succeeded.
It is easy to observe that if a scheme is secure under Definition 3, then it is also secure under Definition 2. From the adversary's point of view, withdrawing the oracle access to the two hash functions from Definition 3 yields exactly Definition 2. Removing these two oracles only reduces the adversary's freedom to fetch information, so the adversary cannot obtain more information than it could under Definition 3. We record this fact in the following theorem without proof.

Theorem 1. If a PEKS scheme is secure under Definition 3, then it is also secure under Definition 2.
So it suffices to show that our scheme is secure under Definition 3. Consider the following definition, which is based on Definition 3.

Definition 4. A PEKS scheme, denoted by π = (KeyGen, PEKS_Enc, Trapdoor, Search), is said to be adaptively secure under chosen plaintext attack if for all probabilistic polynomial time adversaries A, there exists a negligible function negl such that $\Pr[Game^{Adaptive}_{A,\pi}(1^k) = 1] \le \frac{1}{2} + negl(\lambda)$, where the probability is taken over the random coins used by A, as well as the random coins used in the game.
In the next theorem, we prove that our scheme $\Pi_{PEKS}$ is adaptively secure. The proof relies on the hardness of the Computational Diffie-Hellman (CDH) problem. We now provide a variant of the CDH problem for a multiplicative group, which is suitable for our scheme.

Computational Diffie-Hellman Problem (CDH): Let (G, ·) be a multiplicative abelian group. Let g be a generator of G and let x, y, z ∈ Z be such that $a = g^x$, $b = g^y$ and $c = g^z$. The CDH problem is as follows: given a, b, c, g as input, compute $g^{xyz}$. CDH is said to be intractable if all polynomial time algorithms have a negligible advantage in solving CDH.
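To make the shape of this CDH variant concrete, here is a toy instance in the multiplicative group modulo a small prime. This is a sketch under assumptions: the prime p = 23 with generator g = 5, the helper names, and the choice to hand $g^z$ to the solver as part of the instance are all illustrative and not taken from the paper. The brute-force solver works only because the group is tiny; the intractability assumption concerns groups where such a search is infeasible.

```python
def cdh_instance(p, g, x, y, z):
    """Return the public part (g^x, g^y, g^z) of an instance in Z_p^*."""
    return pow(g, x, p), pow(g, y, p), pow(g, z, p)


def brute_force_dlog(p, g, h):
    """Find e with g^e = h (mod p) by exhaustive search; feasible only for toy p."""
    acc = 1
    for e in range(p - 1):
        if acc == h:
            return e
        acc = (acc * g) % p
    raise ValueError("h is not in the subgroup generated by g")


def solve_toy_cdh(p, g, a, b, c):
    """Recover g^(xyz) from (a, b, c) by first recovering x and y."""
    x = brute_force_dlog(p, g, a)
    y = brute_force_dlog(p, g, b)
    return pow(c, x * y, p)  # c^(xy) = g^(xyz)


p, g = 23, 5  # toy parameters; 5 generates Z_23^*
a, b, c = cdh_instance(p, g, x=3, y=4, z=5)
answer = solve_toy_cdh(p, g, a, b, c)
print(answer == pow(g, 3 * 4 * 5, p))
```

The exhaustive search takes time proportional to the group order, which is why it says nothing about the hardness of the problem in groups of cryptographic size.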
Suppose A is a polynomial size adversary that has advantage negl(λ) in breaking $\Pi_{PEKS}$, i.e., $\Pr[Game^{Adaptive}_{A,\Pi_{PEKS}}(1^k) = 1] \le \frac{1}{2} + negl(\lambda)$. Suppose A makes at most $n_{H_2}$ hash function queries to $H_2$ and at most $n_T$ trapdoor queries. Here we construct a simulator S and in Theorem 2 we show that S solves the CDH problem with probability at least , where e is the base of the natural logarithm. The construction of the simulator and the proof technique of Theorem 2 are similar to those of [1], but in a different setting, on strings instead of keywords, and the proof is done against the CDH assumption.

The simulator S. Let g be the generator of G. Let the simulator S be given g, $p_1 = g^a$, $p_2 = g^b$, $p_3 = g^c$, $p_4 = g^{ac}$. The simulator S simulates the challenger and interacts with the adversary A as follows:

(Simulating $H_1^*$): Whenever A queries $H_1$, S maintains a list of tuples $\langle w_j, h_j, a_j, c_j \rangle$, called $list_1$, which is initially empty. When A queries for $w_i$, S responds as follows:

A produces a pair of challenge strings $s_0 = (w_{0,1}, \ldots, w_{0,m})$ and $s_1 = (w_{1,1}, \ldots, w_{1,m})$ which are of the same length, say m.

where S chooses the $B_i$'s randomly from $\{0, 1\}^{\lambda}$.

IV. A can continue to issue trapdoor queries for strings other than $s_0$ and $s_1$.
Before going into Theorem 2, we prove some inequalities which are crucial for the theorem.

Lemma 2. Let us consider the following events:
$\varepsilon_1$: The event that S does not abort while A is making trapdoor queries.
$\varepsilon_2$: The event that S does not abort while A is making the challenge.
$\varepsilon_3$: The event that A queried against at least one of the strings $s_0$ and $s_1$.
Then, (1)

Proof: 1. Since in $list_1$ the distribution of the $c_i$'s is independent of the distribution of the $h_i$'s, we have P[one trapdoor query triggering abort] =

P[S will abort during challenge]

3. Let $\varepsilon_3$ be the event that A queried against at least one of the strings $s_0$ and $s_1$.
Theorem 2. $\Pi_{PEKS}$ is adaptively secure against chosen keyword attack in the random oracle model, assuming CDH is intractable.

Proof: It may be noted that the challenge implicitly defines $B_1$ as

So a similar computation for $w_{b,k}$, k = 2, ..., m, indicates that this is a valid PEKS for the string $s_b$.

Output phase:
Eventually A outputs the guess b′ ∈ {0, 1}. Then S picks a random pair (t, v) from $list_2$ and outputs $t / (p_4)^{a_b}$ as its guess for $g^{abc}$, where $a_b$ is the value used in the challenge phase. A must have issued the query for either $s_0$ or $s_1$, as otherwise A's view of the PEKS would be independent of $s_0$ and $s_1$, and thus A could not have the claimed advantage in breaking the scheme. Therefore with probability $\frac{1}{2}$, $list_2$ contains an entry (t, v) such that $t = g^{ac(b + a_b)}$. Let $\varepsilon_0$ be the event that S selects this pair (t, v). Then P , and then
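The guess output by S can be checked from the quantities named in the proof. The following short derivation assumes that the guess is the quotient of t by $(p_4)^{a_b}$, which is the only reading of the expression under which the stated form of t yields $g^{abc}$; it uses $p_4 = g^{ac}$ from the simulator's input and $t = g^{ac(b + a_b)}$ from the output phase.

```latex
\[
\frac{t}{(p_4)^{a_b}}
  = \frac{g^{ac(b + a_b)}}{\left(g^{ac}\right)^{a_b}}
  = \frac{g^{abc} \cdot g^{ac \, a_b}}{g^{ac \, a_b}}
  = g^{abc}.
\]
```

So whenever S selects the pair (t, v) with this value of t, its output is exactly the CDH answer it is asked for.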

Discussion on Adaptive Security of $\Pi_{PEKS}$.
The basic idea behind the security proof of Theorem 2 is that the adversary A is allowed to ask for PEKS encryptions and trapdoors of multiple keywords and strings chosen adaptively. This is formalized by allowing A to interact freely with an encryption oracle and a trapdoor oracle. So from A's point of view, the encryption and trapdoor functionality comes as a black box that encrypts keywords and strings of A's choice using the secret key, which is unknown to A. Since keywords are mapped into finite field elements using the hash function $H_1$, and the final output of PEKS encryption maps the encrypted keyword to $\{0, 1\}^*$ using the hash function $H_2$, A is also provided with oracle access to these two hash functions. When A queries its oracle by providing it with a keyword as input, the PEKS encryption oracle returns a ciphertext as the reply. Since PEKS encryption is randomized, the oracle uses fresh random coins each time it responds to a query. The definition of security requires that A should not be able to distinguish the encryptions of two arbitrary keywords or strings of the same length, even when A is given access to PEKS_Enc(.), Trapdoor(.), $H_1$(.) and $H_2$(.). The security of our scheme relies on the CDH assumption, i.e., so long as the CDH assumption holds, we need to show that A cannot win the game defined in Definition 3 with probability much more than $\frac{1}{2}$. In the proof, we constructed a simulator S, which simulates the challenger in such a way as to solve CDH. We have shown that if the adversary A wins the game defined in Definition 3, then S solves CDH with non-negligible probability. Thus, if CDH is believed to be intractable, we may reasonably infer that A cannot win the game of Definition 3.

Security of R-PEKS: Data Confidentiality by Secure Access
In this section we show that R-PEKS ensures hosted data confidentiality, i.e., protection of access privilege during string search.

Definition 5. R-PEKS is said to have secure access under injection of faulty roles if R-PEKS accesses only the files within the scope of the user's privilege during string search.

In the next theorem, we prove that R-PEKS accesses only the files permitted by the roles during string search, which ensures hosted data confidentiality by secure access.

Theorem 3. Even when the data user colludes with a CSP by the injection of faulty roles, they are unable to access the hosted data by string search over any file that is not in the scope of the data user's assigned roles/privileges.

Proof: In R-PEKS, let P be the set of all permissions. Let $r_i \in R_u$ and $r_j \notin R_u$. Also let user u access the permission set $P(r_i)$ through the assigned role set $R_u$, i.e., CheckAccess(u, P) = $P(r_i)$. So, $P(r_i) = \{p : p \in P, (u, r_i) \in UA \wedge (p, r_i) \in PA\}$.
From the role mining algorithm, there is a threshold δ such that $|assigned\_permissions(r_i) \cap assigned\_permissions(r_j)| \le \delta$. Therefore the similarity rate ranges from 0 to δ. Following [26], we carry out the security mutation analysis by focusing on fault injection of roles, done by the CSP while colluding with the data user, as given below. The original search request of user u is served by the data owner by presenting this request as R-Search(*, CheckAccess(u, P)). Since $r_i \in R_u$, R-Search(*, CheckAccess(u, P)) should be transformed into R-Search(*, $P(r_i)$). Under the fault injection assumption, let $r_i$ be replaced by the CSP with a faulty role, say $r_j$. Thus the original search request, i.e., R-Search(*, CheckAccess(u, P)), is transformed into R-Search(*, $P(r_j)$), where $P(r_j) = \{p : p \in P, (u, r_j) \in UA \wedge (p, r_j) \in PA\}$. There are two possibilities. In Case 1, string search is performed on the files common to $r_i$ and $r_j$. In Case 2, no operation is performed. Suppose the actual number of files identified for a string search by a user without fault injection is F.
Also let us assume that the same number of files is identified for the successive k−1 searches on the same string. Finally, let the number of files identified for the same string search under the fault injection of roles in the k-th iteration of search be δ. If the files returned after a search are the files that are accessible according to role $r_i$, we call it a success. If T is the expected number of files satisfying a particular search string after k iterations, then we define the score S on success as $S = \frac{(k-1) F + \delta}{T}$. It is easy to check that S ≤ 1. This is because searches for user u are performed using trapdoors generated from u's private key on data encrypted under u's public key. So, by PEKS privacy, under no circumstances will the files corresponding to the permissions [$assigned\_permissions(r_j) - assigned\_permissions(r_i)$] respond to the search, even if they contain the query string, as these files are not encrypted using u's public key. Thus these files never contribute to S.
If S < 1, then the system is under attack by the injection of a faulty role which affected the string search. If S = 1, then δ = F and it is a fuzzy state. In such cases the string search is not affected even though the system is under attack. Therefore in all the above cases the R-PEKS scheme is secure under Definition 5, which ensures a high level of hosted data confidentiality when the data user colludes with the CSP by injecting a faulty role during the string search.
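The score used in the proof can be evaluated with a few lines of code. This is a minimal sketch, assuming that the expected total T equals k·F (k searches that each identify F files); that choice is an assumption made here because it is the value for which S = 1 holds exactly when δ = F, as stated above. The function name `success_score` is hypothetical.

```python
def success_score(k, F, delta, T=None):
    """Score S = ((k - 1) * F + delta) / T for k searches on the same string.

    k     : number of search iterations
    F     : files identified per search without fault injection
    delta : files identified in the k-th search under fault injection
    T     : expected total; assumed to be k * F when not given
    """
    if T is None:
        T = k * F  # assumption: every one of the k searches is expected to return F files
    return ((k - 1) * F + delta) / T


# No effect on the search: the faulty role returns the same F files.
print(success_score(k=5, F=20, delta=20))

# Fault injection visible: only 4 of the 20 files are common to both roles.
print(success_score(k=5, F=20, delta=4))
```

Because δ never exceeds F (files outside the user's scope do not respond to the user's trapdoor), the score stays at or below 1 under this assumption.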

EXPERIMENTAL RESULTS
In this section we present an experimental set-up obtained by generating the RBAC model on the PEKS environment. We also provide the performance analysis of our proposed model on the TIMIT dataset [27]. Finally, we validate the correctness of our model, i.e., the data confidentiality, using a test automation framework [28]. The implementation is done on an Intel Core(TM) i5-7500 with 16 GB RAM using Java on the Windows platform. For the cryptographic primitives, the 'Jpair' library is used.

Creation of RBAC configuration using RMiner
The creation of the RBAC configuration using RMiner [29] is achieved in two phases. Firstly, we generate the UP (see Subsection 4.3), and secondly we generate roles and their assignments to users as well as permissions, based on the UP. The objects required to generate a UP are taken from the TIMIT speech corpus [27], which is a phonemically and lexically transcribed speech of American English speakers of different dialects. This dataset comprises 42 distinct phonetic symbols giving rise to an average of approximately 200 words in 4607 files in a document of size 19 MB. For the purpose of experimentation we considered file level access, i.e., each file is treated as an object and each object is assigned a permission, which is either 1 (grant) or 0 (deny), under two operations, namely encryption and search. It is assumed that if a file has a permission for encryption then it also has a permission for search. Further, we have incorporated 500 users in the implementation. The users and files are given as input to RMiner [29], which is a role mining tool used for the generation of the UP and for the creation of the RBAC configuration, i.e., UA and PA (see Subsection 4.3). The generated UP for the given number of users and permissions is listed in Table 2. Also, in Table 2, the parameters presented in the fourth, fifth, sixth and seventh rows are given as input and the rest are generated by the RMiner tool. We use

Performance analysis on PEKS and R-PEKS
All performances are analyzed based on the average time taken by 5 different users to perform encryption and search. These are illustrated on the speech dataset by comparing with the existing models. For the purpose of analysis, RBAC is instantiated with the PEKS proposed in [3], and this instantiation is referred to as r-PEKS.

Performance of PEKS. In the next lemma, we provide the complexity analysis of our scheme. Figure 3 compares the average PEKS encryption time for our PEKS scheme against the PEKS of [3]. We plotted the graph by considering the average time in seconds along the Y-axis against the number of files to be encrypted along the X-axis. For both schemes, the graph reflects a linear growth with the increase in the number of files. However, the average time taken by our PEKS scheme to encrypt 250 files is 5 seconds, which is far less than that of the existing PEKS [3], which took 188 seconds for the same operation.

Fig. 3. Comparison of encryption time for our PEKS and PEKS [3]

Figure 4 represents the performance of searching over encrypted files. We plotted the average time of search (in seconds along the Y-axis) against the number of encrypted files (along the X-axis). The graph shows a linear growth with the increase in the number of files. The average time taken by our PEKS scheme to search over 250 encrypted documents is 2.5 seconds, which is 97% more efficient than the PEKS of [3], which took 88 seconds for the same search operation.

Performance of R-PEKS and PEKS. We note that both PEKS schemes can have a better performance when integrated with RBAC. Now, we discuss the enhancements of R-PEKS over our PEKS. In Figure 5, we provide a comparison of the search performance of R-PEKS and our PEKS. We plotted the graph by considering the average time of search (in seconds along the Y-axis) against the available number of files (along the X-axis) using R-PEKS and our PEKS scheme. In all the cases, the graph shows a linear growth with the increase in the number of files; however, the rate of growth reflects the efficiency of R-PEKS against the PEKS scheme. In PEKS, since the search is not refined by privileges (i.e., by roles), the search is performed among all the existing files. So the average search time using our PEKS scheme for any percentage of matching files is 2.5 seconds on 250 files. Since R-PEKS is a selective search, the string search request is performed only on the privileged files. So the average search time using R-PEKS is 0.25 seconds, 0.5 seconds and 1.25 seconds when the assigned roles contain 10%, 20% and 50% privileged files over 250 files, respectively. So effectively, from the user's point of view, R-PEKS will be faster compared to PEKS. This explains the efficiency of R-PEKS over our PEKS by 90% when the required data is present only in 10% of the 250 files.
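The efficiency percentages quoted above are consistent with reading "efficiency" as the relative reduction in running time, (baseline − improved) / baseline. That reading is an assumption made here, since the formula is not stated explicitly; the sketch below recomputes the figures from the reported timings, with the hypothetical helper `relative_saving`.

```python
def relative_saving(baseline_s, improved_s):
    """Percentage of the baseline time that is saved by the improved scheme."""
    return 100.0 * (baseline_s - improved_s) / baseline_s


# (baseline seconds, improved seconds), taken from the timings reported in the text
reported = {
    "encryption of 250 files, PEKS of [3] vs our PEKS": (188.0, 5.0),
    "search over 250 files, PEKS of [3] vs our PEKS": (88.0, 2.5),
    "search, 10% privileged files, our PEKS vs R-PEKS": (2.5, 0.25),
    "search, 20% privileged files, our PEKS vs R-PEKS": (2.5, 0.5),
    "search, 50% privileged files, our PEKS vs R-PEKS": (2.5, 1.25),
}

for label, (baseline, improved) in reported.items():
    print(f"{label}: {relative_saving(baseline, improved):.1f}% less time")
```

Under this reading the saving of R-PEKS over plain PEKS simply tracks the share of files outside the user's privileges, which matches the linear growth of search time with the number of files.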
Performance of different PEKS schemes with RBAC.
In [10], the authors proposed the idea of generating synthetic datasets to evaluate the performance of role mining algorithms. Following this, we have generated the two synthetic datasets presented in Table 3 and Table 4 to compare the performance of R-PEKS. Both synthetic datasets consist of five parameters, namely, number of users (#U), number of roles (#R), number of permissions for each user (#P), maximum number of permissions for a role, and scope of privileges (range of files). In the first synthetic dataset, the user's scope of privileges is a varying parameter and the other parameters are constant, as shown in Table 3, and the corresponding comparison study is presented in Figure 6. In the second synthetic dataset, the user's permission is a varying parameter and the other parameters are constant, as shown in Table 4, and the corresponding comparison study is presented in Figure 7. In Figure 6 and Figure 7 we plot the average time (along the Y-axis) needed by the users to encrypt the documents and to access the data by searching (along the X-axis), based on the values tabulated in Table 3 and Table 4 respectively. Figure 6 reflects a linear growth of the average time of encryption against the scope of privileges (range of files). Figure 6 shows that for R-PEKS the average time varies from 0.5 seconds to 1.7 seconds, whereas for r-PEKS the average time varies from 8 seconds to 9.2 seconds, over a scope of privileges varying from 50 to 250 for a constant number of 10 permissions. In the case of searching, R-PEKS takes a constant average time of 0.1 seconds whereas r-PEKS takes 3.6 seconds under the same scope of privileges and permissions as mentioned in Table 3. Therefore, the above comparison reveals that the efficiency of R-PEKS over r-PEKS during encryption and search is 81% and 97% respectively.
Figure 7 reflects a linear growth with the increase in the number of permissions for a constant scope of privileges. Further, the average times needed for encrypting 50 files in a scope of 80 privileges using R-PEKS and r-PEKS are around 2.5 seconds and 40 seconds respectively. The average times needed to search the data over the 50 encrypted files using R-PEKS and r-PEKS are 0.5 seconds and 17.5 seconds respectively. Above comparison reveals that the efficiency of

To determine the correctness, 10 different users are retrieved randomly from the log and all the user behavior patterns are monitored. After one hour of experiment, the data saved in the logger are studied, which reveals not a single instance of unauthorized access.

COMPARISON WITH OTHER SCHEMES
Comparison with scheme in [30]

In [30], the authors proposed a MUSE scheme with efficient access control for cloud storage, in which the keyword index and trapdoor are generated with the help of an additional proxy server. It may be noted that in our scheme we need not generate any such index, and thus our scheme is free from the additional index management and is also free from the risk of leakage from the index.
Also in [30], the core mathematical construct for the searchable encryption is bilinear pairing, which is a computationally intensive module. It may be noted that the scheme proposed in this paper is a PEKS scheme which is free from such bilinear pairing. To the best of our knowledge this is the only such PEKS scheme which is free from bilinear pairing and is much more efficient and easy to implement.
The search complexity of the scheme proposed in this paper is the number of finite field operations, which is linear in |C| for a cipher file of size |C|. Although the complexity analysis of search is not explicitly mentioned in [30], it is clear from Algorithm 4 of [30] that the search complexity is O(n), where n is the number of keywords in a cipher file. Since we are doing sentence search where each word is of length λ, n in [30] is equivalent to $|C|/\lambda$, where |C| is the size of the cipher file. So, when expressed in our notation, the search complexity of [30] is also linear in |C|, but its operations are on elliptic curves, which are costly compared to finite field operations.

Comparison with scheme in [31]
In [31], the authors proposed a template of an adaptively secure searchable encryption scheme with access control, without any practical instantiation, and so the comparison is difficult. However, we observed that their scheme is based on the SSE of [15]. One problem with SSE is that adaptive security and string search cannot be achieved simultaneously. In this paper, we emphasize customized search under three application scenarios by enforcing string search without compromising security. So the scheme of [31] cannot be adapted for the application scenarios which are considered in this paper.

CONCLUSION AND FUTURE WORK
RBAC enabled PEKS is the most suitable and efficient solution for secure search applications in a multi-user setting where user permissions are updated frequently. Towards this, we have designed R-PEKS, together with a new PEKS scheme, called $\Pi_{PEKS}$. In the threat model, we have considered an honest-but-curious CSP as well as data users, and have analyzed the security requirements. We have shown that R-PEKS is secure under the definition of adaptive security and also provides hosted data confidentiality by secure access. The experiment is conducted on the TIMIT speech dataset [27] to evaluate the performance of our proposed model. We have shown that with respect to the PEKS of [3] for string identification, our PEKS scheme, i.e., $\Pi_{PEKS}$, is efficient by 97%. We have also shown that using R-PEKS, a user can gain up to 90% efficiency on searching when the share of data pertaining to that user is 10% of the whole cloud data. For the comparison study, we have also implemented the PEKS of [3] with RBAC and named it r-PEKS. With the generated synthetic dataset, we have shown that the efficiency of R-PEKS against r-PEKS is up to 87% for encryption and 97% for searching. We have developed a test automation framework and used it to determine the functional correctness of R-PEKS. It might be of interest to explore how PEKS can be integrated with dynamic user permission assignment using the least privilege user-role assignment problem for better performance and security.

Fig. 1. Design of the R-PEKS settings for secure access of cloud data

Lemma 3. Using $\Pi_{PEKS}$, the number of group operations for searching a query of an l-word string in one ciphertext document C is $O\left(\frac{|C|}{\lambda} + (l - 1)\right)$.

Proof: It is easy to observe that each block of the form [B] is of size λ bits. Thus in the encrypted document C, the number of blocks is $\frac{|C|}{\lambda}$. Detecting the first block requires $O\left(\frac{|C|}{\lambda}\right)$ group operations. Since the blocks are not shuffled, detecting the remaining l − 1 blocks needs l − 1 group operations. Thus the number of group operations is $O\left(\frac{|C|}{\lambda} + (l - 1)\right)$.
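The counting argument of the lemma can be mimicked on plain data. The sketch below is an illustration only: it replaces the actual PEKS test on a block by an ordinary equality check and counts each check as one "group operation", and it stops at the first block that matches the first query word, mirroring the way the proof counts. The names `worst_case_ops` and `scan_for_string` are hypothetical.

```python
def worst_case_ops(cipher_bits, lam, l):
    """Bound from the lemma: |C| / lambda block tests plus l - 1 follow-up tests."""
    return cipher_bits // lam + (l - 1)


def scan_for_string(blocks, query):
    """Locate the first query word, then test the next l - 1 blocks in order.

    Returns (found, ops), where ops counts block tests. Each test stands in
    for one group operation of the real scheme (an assumption of this sketch).
    """
    ops = 0
    for i, block in enumerate(blocks):
        ops += 1  # test this block against the first word of the query
        if block == query[0]:
            # Blocks are not shuffled, so the rest of the string must follow directly.
            tail = blocks[i + 1 : i + len(query)]
            ops += len(query) - 1
            return tail == list(query[1:]), ops
    return False, ops


lam = 128
blocks = ["the", "quick", "brown", "fox", "jumps"]
found, ops = scan_for_string(blocks, ["brown", "fox"])
print(found, ops, worst_case_ops(len(blocks) * lam, lam, l=2))
```

The count never exceeds the bound of the lemma because the scan visits each block at most once before the l − 1 follow-up tests.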

Fig. 6. Comparison of encryption and search time for r-PEKS and R-PEKS on varying scope of privileges for a constant number of permissions

Fig. 7. Comparison of encryption and search time for r-PEKS and R-PEKS on varying number of permissions for a constant scope of privileges

R-PEKS over r-PEKS during encryption and search are 93% and 97% respectively.

Automation of Secure Access by R-PEKS. In this section we determine the functional correctness of our proposed model in accessing only permitted files using R-PEKS in single-user and multi-user environments. To validate this experimentally, we have carried out the testing by developing a test automation framework, which can spawn 100 parallel users having different roles every 30 seconds, as shown in Figure 8. In the logger block of Figure 8, the user behavior patterns are saved for analysis.
, the searchable encryption mode component enables SUSE and MUSE as two different modes for searching the string request based on the privileges in the PEKS domain.Key pairs are generated for each authorized user in Key generation component.A data user may be an individual or group with limited data scope who can request for string search.Public cloud is responsible to store the encrypted data and search.For the security reason, the roles that are

TABLE 1
Properties of different searchable encryption schemes.
1. AddUser(u, U): AddUser is a deterministic algorithm run by the data owner to add a new user. It is invoked with a new user u ∈ U and updates $S_u$.
2. RevokeUser(u, U): RevokeUser is a deterministic algorithm run by the data owner to remove an existing user. It is invoked with the existing user u ∈ U and updates $S_u$.
3. R-Search(*, CheckAccess(u, .)): Search is run by the CSP in order to search only the privileged documents in D which are accessible under the permissions obtained by CheckAccess over $S_u$.
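The interplay of these three algorithms can be sketched with ordinary sets for the user-role relation UA and the permission-role relation PA. This is a minimal sketch under assumptions: the class `RBACState`, its method names, and the `matches` callback (a stand-in for the PEKS test on an encrypted file) are hypothetical, each permission is identified with the file it grants access to, and no cryptography is performed.

```python
class RBACState:
    """Toy RBAC state holding UA (user, role) and PA (permission, role) pairs."""

    def __init__(self, pa_pairs):
        self.users = set()
        self.UA = set()
        self.PA = set(pa_pairs)

    def add_user(self, user, roles):
        """AddUser: register the user and assign the given roles."""
        self.users.add(user)
        self.UA |= {(user, role) for role in roles}

    def revoke_user(self, user):
        """RevokeUser: drop the user together with all role assignments."""
        self.users.discard(user)
        self.UA = {(u, r) for (u, r) in self.UA if u != user}

    def check_access(self, user):
        """CheckAccess: permissions p with (user, r) in UA and (p, r) in PA for some role r."""
        roles = {r for (u, r) in self.UA if u == user}
        return {p for (p, r) in self.PA if r in roles}


def r_search(state, user, documents, matches):
    """R-Search: test only the documents inside the user's privileges."""
    allowed = state.check_access(user)
    return [doc for doc in documents if doc in allowed and matches(doc)]


state = RBACState({("file1", "r1"), ("file2", "r1"), ("file3", "r2")})
state.add_user("alice", ["r1"])

documents = ["file1", "file2", "file3"]
contains_query = lambda doc: doc != "file2"  # placeholder for the PEKS test
print(r_search(state, "alice", documents, contains_query))

state.revoke_user("alice")
print(r_search(state, "alice", documents, contains_query))
```

Filtering by privileges before testing is what makes the search selective: files outside the user's roles are never tested, which is also the source of the reduced search time reported for R-PEKS.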