An Efficient Privacy-Preserving Multi-Keyword Query Scheme in Location Based Services

With the proliferation of location-aware mobile devices and the prevalence of wireless communications, location-based services (LBS) have attracted considerable attention in recent years. For flexibility and cost savings, the LBS provider outsources the LBS data to the cloud in order to serve the increasing number of mobile users. To guarantee users' privacy and data confidentiality, several works have been proposed that focus on secure queries over the location server. However, these existing works have two limitations. On the one hand, they cannot preserve users' location privacy and query content privacy simultaneously. On the other hand, they fail to support multi-keyword queries. In this article, aiming at multi-keyword queries in LBS, we propose a novel, efficient, and privacy-preserving multi-keyword query scheme (PPMQ) over the outsourced cloud, which satisfies the requirements of location and query content privacy protection, query efficiency, confidentiality of LBS data, and scalability with respect to the data users. To improve the efficiency of our proposed scheme, we utilize the linear quad-tree technique to build a grid system that represents the location information in the query condition as well as in the searchable index. To protect location privacy, we combine decimal Morton code and public-key cryptography techniques to build the searchable index and to generate trapdoors. To enable the cloud server to perform a secure multi-keyword query, we systematically construct a privacy-preserving query scheme with bilinear pairing-based cryptography. In particular, our proposed scheme is scalable and well suited to multi-user environments due to its flexible user registration and revocation mechanisms. Furthermore, a detailed security analysis shows that the proposed scheme ensures the confidentiality of LBS data and protects the location and query content privacy. Extensive experiments are conducted on a real LBS dataset, and the simulation results confirm the security and efficiency of our scheme.

especially those regarding users' location and query content privacy, should be addressed before the deployment of LBS in the real world [4], [5]. In an LBS system, the data queriers first submit their accurate locations and query contents to the LBS provider, and the LBS provider then returns the desired POI records to the data queriers. By collecting and analyzing query users' current locations and query contents, the LBS provider can easily obtain a great deal of sensitive information, such as users' real identities, health status, trade secrets, hobbies, and so on [6]. At the same time, the querying user is usually interested in only the few POIs that are most relevant to her/his query. Returning a larger number of POIs causes considerable computation and communication costs. Hence, designing a privacy-preserving and efficient query scheme for LBS systems that protects both the user's location and the query content privacy is still an active topic of LBS research.
Let us consider the following application scenario. To achieve low computation cost and flexible LBS deployment, the LBS provider can be regarded as a data owner, which outsources his/her LBS data (i.e., POIs) to a cloud server to enjoy the benefits brought by cloud computing, such as easy access, great flexibility, cost savings, and excellent computation performance. To protect the sensitive LBS data, all data are encrypted before being outsourced to the cloud server. The authorized data users, i.e., registered users, can issue a multi-keyword query to the cloud server to find the desired POI records within a given distance of their current location. Then, the cloud server searches its database and returns the corresponding POIs to the data user. Such results help the data user to find the desired POIs accurately and quickly. For example, a tourist can issue the following multi-keyword query to the LBS provider through his/her smartphone: "what is the lowest price for the most popular hotel within 1000 meters from my current location?". Then, the cloud server searches for the POIs within the given distance and returns the matching hotels, taking price and reviews into account.
To protect the privacy of location data in the LBS system, scholars have dedicated many research efforts to designing privacy-preserving schemes for LBS [7]. Many solutions have been proposed in the literature [6], [8]- [24]. However, most schemes protect only either the location privacy [9]- [11] or the query content privacy [14]. They fail to preserve the location privacy and query content privacy simultaneously. In [12], [13], [15]- [17], approaches have been proposed to protect both the location information and the query content privacy. However, they degrade the accuracy of the user's location information and increase the communication cost between the user and the LBS provider. Encryption-based schemes [2], [4], [18], [19], [21] can fully ensure the security of data and provide accurate results to users. However, most of those techniques incur relatively high computation or communication costs on the user side, which leads to high energy consumption on mobile devices. Furthermore, most prior works that address privacy preservation result in a poor user experience for POI queries. For instance, such works only support location coordinate queries or single-keyword search. However, nowadays, users prefer to submit multiple keywords to retrieve the most relevant POIs. Therefore, it is challenging to develop a secure and efficient multi-keyword query scheme over encrypted LBS location data in a flexible and scalable manner.
In this article, different from existing works and aiming at the challenges mentioned above, we propose an efficient privacy-preserving multi-keyword query scheme in LBS, named PPMQ, which provides fine-grained queries and returns more accurate POIs to users without divulging users' sensitive information to either the cloud server or unregistered users. To prevent the cloud server and unauthorized users from learning the actual LBS data of the data owner, we adopt symmetric encryption to encrypt the outsourced LBS data. To protect the location and query content privacy, we first utilize a linear quad-tree to design a grid system, in which the real location information of the user and of each POI record can be represented as a grid location coordinate. Then, combining the quad-tree with decimal Morton code technology, we develop a secure index construction and trapdoor generation algorithm. To enable the cloud server to perform secure multi-keyword searches, we construct a secure query protocol based on bilinear pairing cryptography. Users can issue a multi-keyword query to locate the desired POIs accurately, quickly, and conveniently without revealing any sensitive information, which improves the user experience. Furthermore, PPMQ is well suited to multi-user environments because it provides flexible registration and revocation mechanisms for users, which help the data owner to carry out user identity authentication. Finally, we give a rigorous security analysis and conduct extensive experiments on real-world LBS datasets to confirm the proposed scheme's security and efficiency.
To summarize, the main contributions of this article are:
• We propose an efficient and privacy-preserving multi-keyword query scheme that can simultaneously preserve the location and query content privacy. Besides, with flexible user registration and revocation mechanisms, our scheme is well suited to the multi-user environment.
• We systematically construct a secure multi-keyword query protocol, which not only enables the cloud server to perform a secure multi-keyword search without knowing the actual value of either the query condition or the POIs, but also allows the data owner to encrypt keywords of POIs with his/her secret keys, such that the registered data users can query the POIs without knowledge of any of the data owner's secret keys.
• To achieve query efficiency, we first utilize the quadtree technique to build a grid system representing the location information of users and POIs. Then, combining it with the Morton coding algorithm, we build a searchable index and generate trapdoors securely. We develop a fine-grained query protocol, in which the data user can query the POIs by initiating a multi-keyword query and obtain the desired POIs according to his/her preference.
• We give a rigorous security analysis and conduct extensive experiments on a real LBS dataset. The analysis shows that our scheme can protect user location privacy and guarantee the confidentiality of LBS data simultaneously. Experimental results confirm the efficiency and effectiveness of our proposed scheme.
The rest of our paper is structured as follows. We review related work in Section II. In Section III, we formalize the system model, threat model, and design goals. The preliminaries are presented in Section IV. In Section V, we provide the location representation model and define the proposed scheme. Our construction of the PPMQ scheme is presented in Section VI. The security analysis and the performance evaluation are given in Sections VII and VIII, respectively. Finally, we conclude our paper in Section IX.

II. RELATED WORK

A. LOCATION PRIVACY PRESERVATION IN LOCATION BASED SERVICE
Protecting user's location privacy in LBS has drawn a lot of attention from researchers in recent years. Existing privacy-preserving techniques in the LBS ecosystem can be broadly categorized into four groups [7]: Anonymity based schemes [9]- [11], Obfuscation mechanisms [12]- [17], Encryption based schemes [18]- [20], and shared information reduction mechanisms [21]- [23]. We briefly discuss some of them as follows.
Anonymity based schemes, such as k-anonymity [9], l-diversity [10], and p-sensitivity [11], aim to break the links between users' identity and location information. These schemes can protect the user's identity and location privacy effectively. However, they need a trusted third party (TTP) to blur users' exact location information into a cloaked area. The TTP can easily become the target of attacks and constitutes a single point of failure. To avoid using a TTP, obfuscation based schemes reduce the precision of users' location information by adding dummy locations [12] or noise [15], [16] and by generalizing location data [13], [14]. Nevertheless, most of these schemes introduce additional system costs or sacrifice the utility of the location data. Encryption based schemes, such as homomorphic encryption schemes [17], [18], [20], can provide accurate LBS while protecting the confidentiality of users' location data. Nevertheless, most of them impose relatively high computation requirements on the user side, which is not suitable for resource-constrained mobile devices.
Unlike existing works, we propose a privacy-preserving multi-keyword query protocol in LBS, which enables the data user to obtain the desired POIs more accurately and more conveniently without divulging his/her location or query content. The proposed scheme is flexible and efficient.

B. SECURE KEYWORD SEARCH
To protect the sensitive information of users and enable the cloud server to perform a keyword search, Wang et al. [25] defined and solved the problem of secure ranked keyword search over encrypted cloud data. In [25], they found that invariably retrieving all files and returning undifferentiated results would incur considerable communication costs for the data querier to get the most relevant files. They proposed a secure keyword search scheme that returns the top-k relevant files for a single keyword based on an order-preserving symmetric encryption technique. To further enhance search efficiency, Curtmola et al. [26] developed a single encrypted hash table index for the entire file collection and then proposed a per-keyword based scheme. However, as indicated by the common practice of today's web search engines (e.g., Google search), data users tend to issue multi-keyword searches rather than single-keyword searches to retrieve the most relevant data. Further, in [27], based on secure inner product computation, Cao et al. proposed a privacy-preserving multi-keyword ranked search over encrypted cloud data (MRSE). The work in [28] also extended the secure keyword search to multi-keyword queries. Their approaches employ "inner product similarity" to quantitatively evaluate the similarity between query keywords and outsourced data files. Zhang et al. [29] proposed a scheme that deals with secure ranked multi-keyword search in a multi-owner model. To tolerate both minor typos and format inconsistencies in the user's search input, Li et al. [30] and Chuah and Hu [31] proposed fuzzy keyword search over encrypted data. To enrich query predicates, conjunctive keyword search [32], [33] over encrypted data has also been proposed. As a more general search approach, predicate encryption schemes [34], [36] have recently been proposed to support both conjunctive and disjunctive search.
Motivated by multi-keyword search in cloud computing, we focus on the secure multi-keyword query in LBS, enabling the LBS provider to return the most relevant POIs to query users accurately. We also seek to achieve an efficient and scalable query system without sacrificing the user's privacy or the security of the data, so that the scheme is suitable for a larger number of data users.

III. PROBLEM FORMULATION

A. SYSTEM MODEL
First, we describe the notations used in this article, as shown in Table 1. Then, we present the system model. As shown in Fig. 1, our system model consists of three different entities: the data owner, the cloud server, and multiple data users. The data owner (i.e., the LBS provider) collects location and related information from businesses, called poi records, so as to provide LBS to data users. These poi records comprise an LBS location dataset. Since the cloud server can provide low-cost storage and powerful computation services, we assume that the data owner is willing to outsource the location dataset to the cloud server in order to offer better LBS to data users. To protect the confidentiality and privacy of the location data, the data owner encrypts each poi i record before outsourcing it to the cloud server. When a data user wants to join the system, the LBS provider provides an authentication and registration service for the data user. The data users must provide their own identity information to register with the data owner (i.e., the LBS provider). If the data user passes the authentication, the data owner grants the search capability to the legal user by sending him/her some important security parameters. The legal data users can enjoy the LBS by submitting their multi-keyword queries to the cloud server. After that, the cloud checks the identity information and search capabilities of the data users and then performs the query process. Without the valid security parameters of the data users, the cloud cannot complete the query process for the data users. At last, the cloud returns the query results to the data users. Correspondingly, the LBS provider can also revoke any expired data user, who then no longer has the search capability over the outsourced LBS data. Note that how to grant decryption capabilities is out of the scope of this article; some excellent work on this problem can be found in [36], [37]. Attribute based encryption is a better way to manage users' access to the outsourced LBS data.
Next, we describe each entity of our model in detail.
1) Data Owner: The data owner assumes the role of an LBS provider (LBSP) who owns a large-scale set of poi records, denoted as POI = {poi 1 , poi 2 , . . . , poi n }. For convenience, we assume each record poi i contains three elements. In particular, W i = {w i1 , · · · , w im } is the skeleton description of the poi i record containing m keywords. To prevent the cloud server from knowing the actual content of the poi records, the data owner encrypts each poi i record to form an encrypted location dataset, denoted as POI = { poi 1 , poi 2 , . . . , poi n }. To enable efficient search on the encrypted LBS location data POI, the data owner has to build a secure searchable index I for the location data. Finally, both the searchable index I and the encrypted dataset POI are outsourced to the cloud server.
2) Cloud Server: The cloud server stores the encrypted LBS data POI and verifies the search capabilities of the data users. Once a data user has obtained the search capability from the data owner, the cloud server stores the search public key and identity information of the registered data user. Thus, the registered data user can pass the authentication as a valid user. When receiving encrypted queries from authorized users, the cloud server performs the secure multi-keyword search over the encrypted data POI and then returns the matching query results to the users. The cloud server does not know the content of the poi i records, the user's query content, or the location of authorized users. Only the authorized users can search the encrypted dataset and recover the query results after sending a multi-keyword query to the cloud.
3) Data Users: The data users are authorized LBS users, who enjoy convenient location-based services by submitting multi-keyword queries to the cloud server anywhere and anytime. The data user first registers him/herself with the data owner and then obtains a secret key and a secret hash function from the data owner. For example, the query request "find a 24-hour Indian curry restaurant near me" contains three keywords: "24-hour", "Indian curry", and "restaurant". To hide the query request and thus protect query privacy, the user can use his/her secret key to encrypt the query keywords and then send the encrypted query keywords to the cloud server. When a data user receives the query results returned from the cloud server, he/she can recover the actual content of the POIs by decryption. Users who are unregistered with, or have been revoked by, the data owner cannot enjoy the LBS query service.

B. THREAT MODEL
In this article, we mainly consider two kinds of attacks: external attacks and internal attacks. External attacks are initiated by unauthorized outsiders. To resist external attacks, we can build secure communication channels between all parties using standard security protocols, such as the Secure Sockets Layer (SSL) [37] and Secure Shell (SSH) [38] protocols. The SSL and SSH protocols use a combination of cryptographic processes to provide secure access to a computer over an unsecured network. Thus, we only focus on the internal attacks initiated by the cloud server and the unregistered data users. In our threat model, we assume that the data owner and authorized data users are trusted. However, the cloud server is not trusted. We regard it as "honest but curious", which is the same assumption as in previous works [25], [27], [36], [39]. That is to say, the cloud server follows the protocol but is "curious" to learn and infer information from the encrypted records poi i and the received messages. It may attempt to deduce the actual information of the user's query content, the stored LBS data, and the location information of users. We also assume that the cloud does not collude with revoked users to derive additional information about the data owner's encrypted poi records.

1) Security Guarantee: The proposed scheme should prevent the cloud server from inferring the accurate location of users, the user's query content, and the actual contents and keyword values of the encrypted poi records stored in the cloud. The cloud server should not know the user's query interest, the exact location of users, or the content of each encrypted poi record. Besides, we still need to guarantee that the cloud server cannot recover the actual content of the query result.
2) Access Control: The cloud server only provides the multi-keyword query service to current data users who have been authorized by the data owner (i.e., the LBSP). Unregistered or revoked users cannot enjoy this LBS service.
3) Scalability: The system can provide a multi-keyword query service for a large number of data users at the same time. The proposed scheme allows a data user to enter or leave the system without affecting other data users.
4) Computation Efficiency: The cloud server should process the multi-keyword query efficiently without disclosing the query content or location privacy of data users. The data owner should be able to encrypt the poi records quickly and then send them to the cloud. Moreover, the data users should be able to compute the trapdoors quickly according to the query condition. The proposed scheme should be as efficient as possible.

IV. PRELIMINARIES
In this section, we recall the bilinear pairing map and review the quad-tree technique and the Morton code, which will serve as the basis of our proposed PPMQ scheme.

A. BILINEAR PAIRING MAP
Let $G$ and $G_T$ be two multiplicative cyclic groups with the same large prime order $q$, and let $g$ be a generator of $G$.
A bilinear pairing map $e : G \times G \to G_T$ has the following properties: 1) Bilinearity, i.e., for all $u, v \in G$ and $a, b \in Z_q^*$, $e(u^a, v^b) = e(u, v)^{ab}$; 2) Non-degeneracy, i.e., $e(g, g) \neq 1$; 3) Computability, i.e., $e(u, v)$ can be computed efficiently for all $u, v \in G$.
Definition 1 (Discrete Logarithm Problem): Given a multiplicative cyclic group $G$ with prime order $q$ and a generator $g$ of $G$, we first select an element $a$ from $Z_q^*$ and compute $g^a \in G$. Then, it is difficult to compute the correct value of $a$. In other words, given a tuple $(G, q, g^a, g)$, there is no efficient algorithm that outputs $a$.
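As a small worked consequence of the bilinearity property of e (a standard identity, stated here only for illustration and not taken from the scheme's own equations), exponents can be moved freely between the two arguments of the pairing and into the target group:

```latex
\begin{align*}
e(g^{a}, g^{b}) &= e(g, g)^{ab} = e(g^{b}, g^{a}) = e(g^{ab}, g),
\qquad a, b \in Z_q^{*}.
\end{align*}
```

This is the kind of manipulation that lets a party holding only $g^{a}$ and $g^{b}$ test a relation between the hidden exponents without learning $a$ or $b$, which by Definition 1 are assumed hard to recover.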

B. QUADTREE STRUCTURE
A quadtree is a tree data structure in which each internal node has exactly four children, as shown in Fig. 2. Quadtrees are most often used to partition a two-dimensional space by recursively subdividing it into four quadrants or regions [40], and they store point data in a two-dimensional space efficiently. For example, a quadtree provides a uniform space decomposition mechanism for spatial data such as coordinates in a Geographic Information System (GIS). A quadtree decomposes the location coordinate space into a hierarchical tree. Due to its simplicity and regularity, the quadtree technique has been widely applied in many applications, such as image compression, collision detection, and nearby point search.
The process of constructing a linear quadtree from a two-dimensional spatial area is described as follows. First of all, we assume that the depth of the quadtree has a maximum value. Then, we divide a city area on the map into four equally sized parts, forming four grids of the same size. As long as the depth of the quadtree has not reached the maximum value, each grid is split again into four smaller grids of the same size, and the POIs in the parent grid are inserted into the child grids. In this way, a hierarchical quadtree is built from this region. As we can see from Fig. 2, if the city area has been recursively divided τ times, the depth of the quadtree is τ. This quadtree has 2^τ × 2^τ leaf nodes. Thus, the city area is divided into 2^τ × 2^τ grids, which together comprise a grid system. The grid coordinates in each dimension are numbered uniquely from 0 to 2^τ − 1. In our proposed scheme, we construct a linear quadtree from the city area. Since each distinct poi is located in a unique leaf node of the linear quadtree, we can use a tuple of grid coordinates (x i , y i ) to represent the location coordinates (lat i , lon i ) of the poi in the city area. Thus, the GPS location information of a poi can be represented in the form of grid location coordinates in our grid system.
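The mapping from a GPS coordinate to a leaf-level grid coordinate can be sketched as follows. This is a minimal illustration under stated assumptions: the city area is taken to be an axis-aligned rectangle given by hypothetical bounds `lat_min`, `lon_min`, `lat_max`, `lon_max`, latitude is mapped to the first grid coordinate and longitude to the second, and the function name `gps_to_grid` is not from the paper.

```python
def gps_to_grid(lat, lon, lat_min, lon_min, lat_max, lon_max, tau):
    """Map a GPS point to grid coordinates (x, y) in a 2^tau x 2^tau grid.

    Assumption: the city area is the rectangle
    [lat_min, lat_max] x [lon_min, lon_max], split uniformly tau times.
    """
    n = 2 ** tau  # number of cells per side after tau subdivisions
    if not (lat_min <= lat <= lat_max and lon_min <= lon <= lon_max):
        raise ValueError("point lies outside the city area")
    # Relative position of the point inside the area, each in [0, 1].
    fx = (lat - lat_min) / (lat_max - lat_min)
    fy = (lon - lon_min) / (lon_max - lon_min)
    # Scale to cell indices; clamp so the upper boundary maps to n - 1.
    x = min(int(fx * n), n - 1)
    y = min(int(fy * n), n - 1)
    return x, y


if __name__ == "__main__":
    # Unit-square "city" with tau = 3, i.e. an 8 x 8 grid.
    print(gps_to_grid(0.45, 0.55, 0.0, 0.0, 1.0, 1.0, 3))
```

With a unit-square area and τ = 3, the point (0.45, 0.55) falls in grid cell (3, 4). Computing the cell directly by scaling gives the same leaf that τ rounds of recursive quartering would reach, because every split halves the current cell in both dimensions.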

C. MORTON CODE
Morton [41] first proposed the concept of the Morton code in 1966. The Morton code is often used to map multidimensional data to one dimension while preserving the locality of the data points. It can be applied to generate a unique index for a tuple of integers. For example, a point in the two-dimensional space can be uniquely indexed by its encoded value. These index values are ordered in a "Z" shape. Fig. 3 illustrates the space partition and the corresponding decimal Morton codes of the linear quadtree, where the depth of the linear quadtree is 3. The z-order of Morton codes has the useful characteristic that points with adjacent Morton code numbers are usually also spatially close to each other in the multidimensional space. In recent years, the Morton code has been extensively used in computer graphics, for example in tree construction, raster data compression, and spatial sorting.
In our proposed scheme, we utilize this feature of the Morton code to find the nearest neighboring poi records. The basic idea of the nearby point search is to test whether the Morton code value of a point is equal or close to that of another point in the grid system. The z-value of a point in two-dimensional space can be calculated by interleaving the binary representations of its coordinate values. We assume that II and JJ represent the row and column number of a point in the two-dimensional space, respectively. Given a row coordinate II of n bits whose binary representation is $II = (i_n i_{n-1} \cdots i_1)_2$ and a column coordinate JJ of n bits whose binary representation is $JJ = (j_n j_{n-1} \cdots j_1)_2$, where $i_n$ and $j_n$ are the most significant bits, the decimal Morton code is $M = (i_n j_n i_{n-1} j_{n-1} \cdots i_2 j_2 i_1 j_1)_2$. For example, if the grid coordinate of a point p 1 is (3, 4) as shown in Fig. 3, the binary representations of the row and column coordinates are 011 and 100, respectively. The Morton code value of this point is $M_1 = (011010)_2 = 26$.
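The bit interleaving described above can be sketched in a few lines. This is an illustrative implementation only: the function names `morton_encode` and `morton_decode` are hypothetical, and the bit order (row bit before column bit, most significant bit first) is chosen to match the worked example of the point (3, 4).

```python
def morton_encode(row, col, n_bits):
    """Interleave the bits of row and col, most significant bit first.

    The row bit is placed before the column bit at every level, so that
    (3, 4) with n_bits = 3 gives (011010)_2.
    """
    code = 0
    for b in range(n_bits - 1, -1, -1):
        code = (code << 1) | ((row >> b) & 1)  # append row bit
        code = (code << 1) | ((col >> b) & 1)  # append column bit
    return code


def morton_decode(code, n_bits):
    """Inverse of morton_encode: recover (row, col) from a Morton code."""
    row = col = 0
    for b in range(n_bits - 1, -1, -1):
        row = (row << 1) | ((code >> (2 * b + 1)) & 1)
        col = (col << 1) | ((code >> (2 * b)) & 1)
    return row, col


if __name__ == "__main__":
    print(morton_encode(3, 4, 3))  # row 011, column 100
    print(morton_decode(26, 3))
```

For the point (3, 4) with 3-bit coordinates, `morton_encode` yields 26, and `morton_decode(26, 3)` returns (3, 4), so the encoding is a bijection between grid cells and integers in [0, 4^n − 1].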

V. OVERVIEW OF PPMQ SCHEME
In this section, we first introduce the location representation model. Then, we describe the formal definition of the proposed scheme.

A. LOCATION REPRESENTATION MODEL
In this article, we utilize the quadtree technique to describe the location coordinate of the poi in the searchable index I and in the encrypted query condition. For simplicity, we construct a balanced linear quadtree with depth τ based on the city area. That is to say, the city area is recursively divided τ times. Thus, the city area is partitioned into N × N grids, where N = 2^τ, and the side length of a basic grid is denoted as σ. The grid coordinates are numbered uniquely from 0 to N − 1.
As described above, we use a tuple of coordinates to represent a point in the city area's grid system. The GPS latitude and longitude coordinates of the poi can thus be represented as the grid coordinates of the poi. Fig. 4 shows a grid system for the linear quadtree, where N = 8. In Fig. 4, the grid coordinates of the points p 1 , p 2 , p 3 , p 4 can be expressed as (3, 4), (5, 2), (5, 5), (2, 5), respectively.
To improve the efficiency of the query process, we encode the quadtree leaf nodes based on the Morton code. As we can see from Fig. 4, the Morton values of the points p 1 , p 2 , p 3 , p 4 are 26, 38, 51, and 25, respectively. Two points that are close to each other in the two-dimensional space usually have Morton values that are close to each other. To perform a location query, the data user can specify his/her region of interest, which is represented as a circle centered at the query user's current location l i = (x i , y i ) with a radius of d in our grid system. The radius d can be regarded as the search range. Based on the query location l i = (x i , y i ) and the radius d, the data user can compute the minimum bounding rectangle, where the ranges of the row and column numbers are denoted as [x min , x max ] and [y min , y max ], respectively. Note that data users can adjust the size of the radius d. The larger d is, the more POIs will be searched. For simplicity, if most of a grid is covered by a data user's region of interest, this grid is added to the query region. For example, if the query location is (4, 3) and d = 1000 meters, the region of interest of the data user is shown in Fig. 4. In Fig. 4, we can see that the query ranges of the row and column for the data user are [1, 5] and [2, 6], respectively. After that, the set of Morton code values of the query region is QM = {14, 15, 26, 36, 37, 38, 39, 45, 48, 49, 50}. The cloud server only needs to return to the data user the POIs that meet the multi-keyword query condition and whose Morton code values are in the set of Morton code values of the query region.
As described above, the location of a poi can be encoded as an integer, and the query location range of the data user can be encoded as an integer set by the Morton encoding algorithm. We can convert the test of whether the location of a point falls within a query region into the test of whether the Morton code value of the point is one of the elements in the set of Morton code values of the query region. That is to say, if a point is within the query region, the Morton code value of the point is a member of the set of Morton code values of the query region. At last, to protect location and query content privacy, we can design a public key encryption to encrypt the set of Morton code values of the query region and the location coordinates of the POIs.
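The reduction from a range test to a set-membership test can be sketched on plaintext values as follows. This is a simplified illustration, not the scheme itself: it works on unencrypted Morton codes, it assumes the radius is already expressed as a whole number of grid cells (hypothetical parameter `radius_cells`, e.g. d divided by σ and rounded), and it includes a cell when its index lies within that many cells of the query cell, which is a simpler rule than the "mostly covered" rule above and so need not reproduce the set QM exactly. All function names are hypothetical.

```python
def morton_encode(row, col, n_bits):
    """Interleave row and column bits, most significant bit first."""
    code = 0
    for b in range(n_bits - 1, -1, -1):
        code = (code << 1) | ((row >> b) & 1)
        code = (code << 1) | ((col >> b) & 1)
    return code


def query_region_codes(cx, cy, radius_cells, tau):
    """Morton codes of all cells within radius_cells of cell (cx, cy)."""
    n = 2 ** tau
    r = radius_cells
    # Minimum bounding rectangle, clipped to the grid.
    x_min, x_max = max(0, cx - r), min(n - 1, cx + r)
    y_min, y_max = max(0, cy - r), min(n - 1, cy + r)
    codes = set()
    for x in range(x_min, x_max + 1):
        for y in range(y_min, y_max + 1):
            # Assumed inclusion rule: Euclidean distance between cell indices.
            if (x - cx) ** 2 + (y - cy) ** 2 <= r * r:
                codes.add(morton_encode(x, y, tau))
    return codes


def in_query_region(poi_xy, region_codes, tau):
    """A POI matches iff its Morton code is a member of the region set."""
    return morton_encode(poi_xy[0], poi_xy[1], tau) in region_codes


if __name__ == "__main__":
    region = query_region_codes(4, 3, 2, 3)
    for p in [(3, 4), (5, 2), (5, 5), (2, 5)]:
        print(p, in_query_region(p, region, 3))
```

With the query cell (4, 3), a radius of two cells, and τ = 3, the points p 1 = (3, 4) and p 2 = (5, 2) are reported as inside the region, while p 3 = (5, 5) and p 4 = (2, 5) are not. In the actual scheme the same membership test is carried out over encrypted values, so the cloud server never sees the codes in the clear.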

B. FORMAL DEFINITION
The proposed scheme is formally defined as follows. Our scheme consists of six algorithms: System Setup, User Register, Encryption, Generate Trapdoor, Query, and User Revocation.
• User Register(PK , MSK , u) → (SK u , ERK u ): The LBSP takes the public key PK , the master secret key MSK , a random number $s_u \in Z_q^*$, and a user-defined random number $r_u \in Z_q^*$ as input, and outputs the user's ID UId u , the value $g^{k \cdot r / s_u}$, and the user's search public key ERK u . At last, the LBSP delivers UId u , $g^{k \cdot r / s_u}$, hk 0 , the grid system parameters τ , σ , and $s_u$ to the user u via a secure communication channel, and (UId u , ERK u ) is sent to the cloud server via a secure channel.
• Encryption(POI, PK , MSK ) → C = I|| POI: Based on the LBS dataset POI, the data owner first builds a searchable index I. After that, each record of the LBS dataset is independently encrypted with a symmetric encryption scheme, which yields the encrypted dataset POI. At last, C = I|| POI is outsourced to the cloud server.
• Generate Trapdoor( W, loc i , d) → T W : The data user first extracts t keywords from user's query content. With t keywords of interest in W, the query location loc i and query range d as input, it generates a corresponding trapdoor T W .
• Query(C, T W ) → POI W : The cloud server takes the trapdoor T W and ciphertext C as input, and outputs the identifier list of relevant encrypted POIs, namely POI W .
• User Revocation(UId u , UL) → UL′: The cloud server takes the identity information UId u of user u and the registered users list UL as inputs, and then deletes the corresponding user's identity information to obtain a new users list UL′.
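The registration and revocation bookkeeping on the cloud side can be pictured with a small sketch. It assumes that the registered users list UL is a simple in-memory mapping from UId u to ERK u; the class and method names are hypothetical and are not part of the scheme's definition, and the stored key is treated as an opaque value.

```python
class RegisteredUsers:
    """Toy model of the registered users list UL kept by the cloud server."""

    def __init__(self):
        self._ul = {}  # maps user ID (UId_u) -> search public key (ERK_u)

    def register(self, uid, erk):
        """Store the tuple (UId_u, ERK_u) received at registration."""
        self._ul[uid] = erk

    def revoke(self, uid):
        """User Revocation: delete the user's entry, leaving others untouched."""
        self._ul.pop(uid, None)

    def search_key(self, uid):
        """Return ERK_u for a registered user, or None if unknown or revoked."""
        return self._ul.get(uid)


if __name__ == "__main__":
    ul = RegisteredUsers()
    ul.register("alice", 1234)
    ul.register("bob", 5678)
    ul.revoke("alice")
    print(ul.search_key("alice"), ul.search_key("bob"))
```

Because each user has an independent entry, revoking one user does not require re-keying or notifying the others, which is the property the scalability goal asks for. A query from a user whose entry is missing would simply be rejected before any search is performed.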

VI. CONSTRUCTION OF PPMQ SCHEME
In this section, we use the bilinear pairing map and the quad-tree technique to construct the PPMQ scheme. The construction is presented as follows.

A. SYSTEM SETUP
Given a system parameter λ, the LBSP first generates two multiplicative cyclic groups G and G T with the large prime order q, and a bilinear pairing map e : G × G → G T , where e is a non-degenerate bilinear pairing operation. Let g be a generator of G. Then, the LBSP defines a random oracle H 1 : {0, 1} * → G, a hash secret key hk 0 , the grid system parameters σ , τ , and a key derivation function KF(.), which are shared with the valid query users. Next, the LBSP generates a secret key k ∈ Z * q and a random number r ∈ Z * q , which helps improve the flexibility and security of our system. Finally, the LBSP keeps the master key MSK = (k, r) secret and publishes the public key PK = {G, G T , e, H 1 , g, KF(.)}.

B. USER REGISTER
When a user u wants to enjoy the LBS, he/she first needs to register with the LBSP to obtain the search capability. The LBSP verifies the identity information of the user u. After the user u has passed the authentication, the LBSP assigns an ID number UId u to user u. Then, the LBSP selects a random number $s_u \in Z_q^*$ for u and computes $g^{k \cdot r / s_u}$. Next, the LBSP sends $g^{k \cdot r / s_u}$, $s_u$, hk 0 , UId u , and the grid system parameters τ , σ to the user u via a secure communication channel. When the user u receives $g^{k \cdot r / s_u}$, $s_u$, UId u , hk 0 , τ , and σ , he/she randomly selects $r_u \in Z_q^*$, computes $g^{r_u}$, and keeps the private key SK u = {$r_u$} secret. According to the received $g^{k \cdot r / s_u}$, he/she calculates his/her search public key ERK u as
$$ERK_u = g^{k \cdot r / s_u} \times g^{r_u} = g^{k \cdot r / s_u + r_u}.$$
At last, the user u also keeps ($s_u$, $r_u$, hk 0 , σ , τ ) secret. The tuple (UId u , ERK u ) is sent to the cloud server, which stores it in a registered users list UL.
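The exponent arithmetic behind the search public key can be checked with a toy computation. This sketch is only an illustration under loud assumptions: it replaces the pairing group G by the order-q subgroup of the integers modulo a small safe prime (P = 2039, Q = 1019, generator 4), which is far too small to be secure and supports no pairing; the division by s u in the exponent is interpreted as multiplication by the modular inverse of s u modulo q; and all function names are hypothetical.

```python
import secrets

# Toy parameters (assumption, insecure): P = 2*Q + 1 with P, Q prime.
# G = 4 is a quadratic residue mod P, so it generates the subgroup of order Q.
P = 2039
Q = 1019
G = 4


def rand_zq_star():
    """Sample uniformly from Z_q^* = {1, ..., Q - 1}."""
    return secrets.randbelow(Q - 1) + 1


def lbsp_register(k, r):
    """LBSP side: pick s_u and compute the token g^(k*r/s_u)."""
    s_u = rand_zq_star()
    exponent = (k * r * pow(s_u, -1, Q)) % Q  # k*r/s_u in Z_q (Python 3.8+)
    return s_u, pow(G, exponent, P)


def user_search_key(token):
    """User side: pick r_u and compute ERK_u = token * g^(r_u)."""
    r_u = rand_zq_star()
    erk_u = (token * pow(G, r_u, P)) % P
    return r_u, erk_u


if __name__ == "__main__":
    k, r = rand_zq_star(), rand_zq_star()      # master secret MSK = (k, r)
    s_u, token = lbsp_register(k, r)
    r_u, erk_u = user_search_key(token)
    # ERK_u equals g raised to (k*r/s_u + r_u), as in the registration step.
    expected = pow(G, (k * r * pow(s_u, -1, Q) + r_u) % Q, P)
    print(erk_u == expected)
```

The final comparison holds for every choice of k, r, s u, and r u, because multiplying group elements adds their exponents modulo q. Note that the user never sees k or r individually, only the blinded token.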

C. ENCRYPTION
To make the system secure and easy to search, before uploading POIs to the cloud server, the LBSP should first build a secure index $I$ for the encrypted data POI. As mentioned before, the searchable index $I$ consists of the encrypted keywords $W$ that describe the POIs in all aspects, and the encrypted Morton code values $M$ of the location coordinates of the POIs.
To enable the registered users to find the desired POIs conveniently and quickly, the LBSP uses several keywords to describe the POIs in a fine-grained model. Each keyword $w_{i,j}$ ($1 \le j \le m$) in $W_i$ describes a certain aspect of $poi_i$. For a point $poi_i$, the LBSP encrypts each keyword $w_{i,j}$ in $W_i$ using $H_1$, $k$, and $r$,
where $1 \le j \le m$, $H_1$ is a random oracle shared between the LBSP and the registered data users, $k$ is the secret key with which the LBSP encrypts the keywords of the POIs, and $r \in Z_q^*$ is a random number chosen by the LBSP to improve the flexibility and security of our system.
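For reference, the keyword ciphertext can be written out explicitly. The expression below is a reconstruction, assuming the index entry is exactly the quantity $g^{k \cdot r \cdot H_1(w_{ij})}$ that the challenger sends in the keyword game of Section VII and that appears as a pairing argument in Eq. (9); the tilde notation is introduced here only for illustration.

```latex
\widetilde{w}_{i,j} = g^{\,k \cdot r \cdot H_1(w_{i,j})}, \qquad 1 \le j \le m
```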
To protect the location privacy, the LBSP first converts the GPS location coordinate of $poi_i$ to a grid coordinate based on the grid system. Then, the LBSP computes the Morton code value $M_i$ of the grid coordinate $(x_i, y_j)$ of $poi_i$ with the Morton encoding algorithm $Ed(\cdot)$. Finally, the LBSP encrypts $M_i$ in the same way as the keywords.
Through the above operations, the searchable index $I_i$ has been built; as described above, it consists of the encrypted keywords of $poi_i$ and the encrypted Morton code value of its grid coordinate. To preserve the confidentiality of the POIs, the LBSP encrypts each data item $poi_i \in POI$ with $Enc(\cdot)$ under a secret key $hk_i$,
where $Enc(\cdot)$ is a symmetric encryption method, such as AES or DES. The secret key $hk_i$ is obtained with the key derivation function $KF(\cdot)$ shared between the query users and the LBSP; it is derived from $hk_0$ and the Morton code value $M_i$ of $poi_i$. After that, we obtain the resultant ciphertext of the encrypted POIs, denoted as $I \| POI$, where $\|$ is the concatenation operator.
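The Morton encoding step $Ed(\cdot)$ can be illustrated with a minimal sketch. It assumes the common bit-interleaving (Z-order) definition on a $2^\tau \times 2^\tau$ grid, with the bits of x in the even positions and the bits of y in the odd positions, and reads the result as a decimal integer; the paper does not spell out its exact convention, and the function names `morton_encode` and `morton_decode` are hypothetical.

```python
from typing import Tuple


def morton_encode(x: int, y: int, tau: int) -> int:
    """Interleave the tau low bits of x and y into one decimal Morton code.

    Assumed convention (not taken from the paper): x bits occupy the even
    bit positions and y bits the odd ones.
    """
    side = 1 << tau
    if not (0 <= x < side and 0 <= y < side):
        raise ValueError("grid coordinate outside the 2^tau x 2^tau grid")
    code = 0
    for bit in range(tau):
        code |= ((x >> bit) & 1) << (2 * bit)      # x bit -> even position
        code |= ((y >> bit) & 1) << (2 * bit + 1)  # y bit -> odd position
    return code


def morton_decode(code: int, tau: int) -> Tuple[int, int]:
    """Inverse of morton_encode under the same assumed convention."""
    x = y = 0
    for bit in range(tau):
        x |= ((code >> (2 * bit)) & 1) << bit
        y |= ((code >> (2 * bit + 1)) & 1) << bit
    return x, y
```

With τ = 5, as in the experiments, every cell of the 32 × 32 grid maps to a distinct integer in [0, 1023], so a grid coordinate can be hashed and encrypted as a single value.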

D. GENERATE TRAPDOOR
To implement the fine-grained query, the data user uses multiple keywords to describe the query requirements accurately. The data user first extracts $t$ keywords from the query content, which we call the query keywords. Then, the data user computes the query region based on the grid system, which is defined by the parameters σ and τ. To preserve the query content and location privacy, the data user encrypts each query keyword and the query region before submitting a query request to the cloud. The data user takes two steps to generate a trapdoor. First, the data user encrypts the query keywords. Second, the data user uses the same method to encrypt the set of Morton code values of the query region.
To let the data users generate trapdoors securely, the query keyword encryption should satisfy two main conditions. First, for the same keyword, the data user can generate a different trapdoor each time. Second, the data user does not need to ask the data owner for the secret key to generate a trapdoor. That is to say, the data user can generate a trapdoor independently. The data user encrypts each query keyword $w_i \in W$ ($1 \le i \le t$) as $T_{w_i} = g^{r' \cdot s_u \cdot H_1(w_i)}$, the first pairing argument in Eq. (9), where $r' \in Z_q^*$ is a random number whose value is variable. The data user can set $r'$ to a different value each time, which helps to improve the randomness and security of the query content.
To prevent attackers from obtaining the location of the data user by analyzing the set of Morton code values of the query region, or from impersonating a valid query user to launch a query, the data user encrypts each Morton code value $M_j \in QM$ ($1 \le j \le \mu$) in the same way, which gives $g^{r' \cdot s_u \cdot H_1(M_j)}$, the first pairing argument in Eq. (8). Here $M_j$ is the Morton code value produced by the Morton encoding algorithm $Ed(\cdot)$ from the grid coordinate $(x_i, y_j)$ of a point in the query region. Through the above operations, the trapdoor has been generated: $T_W = (T_{w_1}, T_{w_2}, \cdots, T_{w_t}) \| (M_1, \cdots, M_j, \cdots, M_\mu)$, where the second component holds the encrypted Morton code values.
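The construction of the Morton code set $QM$ for a query region can be sketched as follows. This is only an illustration under stated assumptions: the user's position is already a grid coordinate, the query region is taken to be the square of cells within ⌈d/σ⌉ cells of that position (clipped to the grid), and the Morton code uses plain bit interleaving. The names `morton_encode` and `query_region_codes` are hypothetical, and the paper may define the region differently.

```python
import math
from typing import List


def morton_encode(x: int, y: int, tau: int) -> int:
    """Bit-interleaved Morton code (assumed convention: x even bits, y odd bits)."""
    code = 0
    for bit in range(tau):
        code |= ((x >> bit) & 1) << (2 * bit)
        code |= ((y >> bit) & 1) << (2 * bit + 1)
    return code


def query_region_codes(cx: int, cy: int, d: float, sigma: float, tau: int) -> List[int]:
    """Morton codes of all cells within ceil(d / sigma) cells of (cx, cy).

    cx, cy : grid coordinate of the querying user
    d      : query range in meters
    sigma  : side length of a basic grid cell in meters
    tau    : number of recursive divisions (grid is 2^tau x 2^tau)
    """
    reach = math.ceil(d / sigma)
    last = (1 << tau) - 1  # largest valid cell index on each axis
    codes = []
    for y in range(max(0, cy - reach), min(last, cy + reach) + 1):
        for x in range(max(0, cx - reach), min(last, cx + reach) + 1):
            codes.append(morton_encode(x, y, tau))
    return sorted(codes)
```

Under these assumptions the number of codes, and hence the number of values the user must encrypt, grows roughly quadratically with d/σ, which is in line with the observation in Section VIII that trapdoor generation time increases with the query range.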

E. QUERY
After receiving a trapdoor T W from data user u, the cloud server searches I i ∈ I one by one. The query process is conducted in three steps: In the first step, the cloud server first reads the encrypted location dataset I|| POI. Next, the cloud server parses the ciphertext, and then gets the searchable index I. Afterward, for each subindex I i , the cloud server can easily obtain the W i and M i for the point poi i .
In the second step, the cloud server tests whether the grid coordinate of the point $poi_i$ is located in the query region of $u$. The cloud server matches the encrypted Morton code value $M_i$ of the point $poi_i$ against each element $M_j$ in the set of encrypted Morton code values of the query region $QM$ using the following equation.

$$
\begin{aligned}
e\left(g^{r' \cdot s_u \cdot H_1(M_j)}, ERK_u\right)
&= e\left(g^{r' \cdot s_u \cdot H_1(M_j)}, g^{k \cdot r/s_u + r_u}\right) \\
&= e\left(g^{r' \cdot s_u \cdot H_1(M_j)}, g^{k \cdot r/s_u}\right) \cdot e\left(g^{r' \cdot s_u \cdot H_1(M_j)}, g^{r_u}\right) \\
&= e\left(g^{k \cdot r \cdot H_1(M_j)}, g^{r'}\right) \cdot e\left(g^{r' \cdot s_u \cdot r_u \cdot H_1(M_j)}, g\right) \\
&= e\left(g^{k \cdot r \cdot H_1(M_i)}, g^{r'}\right) \cdot e\left(g^{r' \cdot s_u \cdot r_u \cdot H_1(M_j)}, g\right)
\end{aligned}
\tag{8}
$$

where $ERK_u$ is the search public key of $u$. If the above equation holds, the grid coordinate of $poi_i$ is located in the query region of $u$. The cloud server thus obtains all POIs whose location coordinates are located in the query region.
In the third step, the cloud server judges whether the keywords of the point match the query keywords submitted by the data user. The cloud server tests whether the keyword $w_{ji} \in W_j$ of the point $poi_j$ is contained in the query keyword set $W$. If the following equation holds, the keyword $w_{ji}$ matches the query keyword $w_i$.

$$
\begin{aligned}
e\left(g^{r' \cdot s_u \cdot H_1(w_i)}, ERK_u\right)
&= e\left(g^{r' \cdot s_u \cdot H_1(w_i)}, g^{k \cdot r/s_u + r_u}\right) \\
&= e\left(g^{r' \cdot s_u \cdot H_1(w_i)}, g^{k \cdot r/s_u}\right) \cdot e\left(g^{r' \cdot s_u \cdot H_1(w_i)}, g^{r_u}\right) \\
&= e\left(g^{k \cdot r \cdot H_1(w_i)}, g^{r'}\right) \cdot e\left(g^{r' \cdot s_u \cdot r_u \cdot H_1(w_i)}, g\right) \\
&= e\left(g^{k \cdot r \cdot H_1(w_{ji})}, g^{r'}\right) \cdot e\left(g^{r' \cdot s_u \cdot r_u \cdot H_1(w_i)}, g\right)
\end{aligned}
\tag{9}
$$

The last equality holds exactly when $H_1(w_i) = H_1(w_{ji})$, so if the above equation holds, $w_i$ is equal to $w_{ji}$. Note that a point $poi_j$ can be returned if and only if it has at least one keyword matching the user's query keywords. After that, the cloud server filters out the unqualified results obtained in the previous step. With these three steps, the cloud server finds all qualified POIs whose location and keywords match the query condition, and then returns the qualified encrypted POIs to the querying user $u$. The querying user uses $hk_0$ and the corresponding Morton code value $M_i$ of $poi_i$ to generate the decryption key $hk_i$. Lastly, the querying user utilizes $hk_i$ to recover the plaintext of $poi_i$ with the AES decryption algorithm. The cloud server cannot obtain any sensitive data from the returned POIs.
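The algebra behind Eq. (9) can be checked numerically with a toy model. The sketch below is not a pairing implementation and provides no security: it assumes each group element $g^a$ is represented by its exponent $a$ modulo a prime, so that $e(g^a, g^b) = e(g, g)^{ab}$ becomes a modular product and multiplication in $G_T$ becomes a modular sum. The prime `Q`, the SHA-256 stand-in `h1`, and all function names are hypothetical choices made for illustration; the sketch also says nothing about which of these values the cloud server actually holds.

```python
import hashlib
import secrets

# Toy stand-in for the group order q (2^127 - 1 is a Mersenne prime).
Q = 2**127 - 1


def h1(data: str) -> int:
    """Hypothetical stand-in for H_1, mapped into Z_q^* because the paper
    uses H_1 values as exponents."""
    digest = hashlib.sha256(data.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (Q - 1) + 1


def rand_zq() -> int:
    """Uniform element of Z_q^*."""
    return secrets.randbelow(Q - 1) + 1


def pair(a: int, b: int) -> int:
    """Exponent of e(g^a, g^b) relative to e(g, g)."""
    return (a * b) % Q


def gt_mul(a: int, b: int) -> int:
    """Product of two G_T elements, expressed on their exponents."""
    return (a + b) % Q


def eq9_holds(index_keyword: str, query_keyword: str) -> bool:
    """Evaluate both sides of Eq. (9) for one index keyword and one query keyword."""
    k, r = rand_zq(), rand_zq()                    # LBSP master key (k, r)
    s_u, r_u = rand_zq(), rand_zq()                # per-user values
    erk_u = (k * r * pow(s_u, -1, Q) + r_u) % Q    # exponent of ERK_u
    index_ct = (k * r * h1(index_keyword)) % Q     # exponent of g^{k r H1(w_ji)}
    r_prime = rand_zq()                            # fresh trapdoor randomness r'
    trapdoor = (r_prime * s_u * h1(query_keyword)) % Q
    lhs = pair(trapdoor, erk_u)
    blind = (r_prime * s_u * r_u * h1(query_keyword)) % Q
    rhs = gt_mul(pair(index_ct, r_prime), pair(blind, 1))
    return lhs == rhs
```

In this model the two sides differ by $k \cdot r \cdot r' \cdot (H_1(w_i) - H_1(w_{ji}))$ in the exponent, which is zero modulo the prime only when the two hash values coincide. The same check applies to Eq. (8) with Morton code values in place of keywords.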

F. USER REVOCATION
User revocation is an important and challenging task in an LBS system. If the LBSP wants to revoke a user $u$, the LBSP first sends the ID of user $u$, $UId_u$, to the cloud server. Then, the cloud server scans the registered users list $UL$ to find the information of user $u$. Next, the cloud server deletes $(UId_u, ERK_u)$ to obtain a new users list, denoted as $UL'$. Once $(UId_u, ERK_u)$ is deleted from the users list stored at the cloud server, the data user $u$ no longer has the capability to query the encrypted LBS location data, because without $ERK_u$ the cloud server cannot match the trapdoor against the encrypted keywords in the index. After being revoked by the LBSP, user $u$ can still generate a well-formed trapdoor, but he/she can no longer search the encrypted POIs, and the cloud server can reject query requests from $u$.
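The bookkeeping at the cloud server can be sketched as a small key-value store. This is a minimal illustration, assuming $UL$ is kept as a mapping from $UId_u$ to $ERK_u$ and that $ERK_u$ is stored as an opaque byte string; the class and function names are hypothetical and not part of the scheme's specification.

```python
from typing import Dict, Optional


class RegisteredUsers:
    """Hypothetical in-memory model of the registered users list UL."""

    def __init__(self) -> None:
        self._erk_by_uid: Dict[str, bytes] = {}

    def register(self, uid: str, erk: bytes) -> None:
        """Store the tuple (UId_u, ERK_u) sent after registration."""
        self._erk_by_uid[uid] = erk

    def revoke(self, uid: str) -> bool:
        """Delete (UId_u, ERK_u); return True if the user was registered."""
        return self._erk_by_uid.pop(uid, None) is not None

    def search_key(self, uid: str) -> Optional[bytes]:
        """Return ERK_u, or None if the user is unknown or revoked."""
        return self._erk_by_uid.get(uid)


def accept_query(users: RegisteredUsers, uid: str) -> bool:
    """The server can only run the pairing tests if it still holds ERK_u."""
    return users.search_key(uid) is not None
```

Revocation in this model is a single deletion at the server and requires no interaction with the remaining users, which is the property the scheme relies on for scalability.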

VII. SECURITY ANALYSIS
In this section, we analyze the security of our proposed scheme step by step to demonstrate that the security and privacy requirements are satisfied for the POIs, the keywords, the queries, and the location information.
POIs: In our proposed scheme, the POIs are encrypted with a semantically secure symmetric encryption algorithm, such as AES or DES, before being uploaded to the cloud server. As long as the encryption algorithm is secure, the attacker cannot learn the actual content of a POI. Since the secret key $hk_i$ is kept secret by the data users who have registered with the LBSP, it is hard for an unauthorized data user to obtain the actual contents of these POIs. Thus, the POIs are protected from unauthorized access, and their privacy is preserved.
Keywords: The keywords that describe each POI record are encrypted before being uploaded to the cloud server. Let us consider a game played between a challenger C and a probabilistic polynomial-time (PPT) attacker A. In our scheme, the cloud server can be regarded as the attacker A, and the LBSP or an authorized data user acts as the challenger C. For keyword encryption, C first generates the parameters $k$, $r$, $g$, $q$, and a random oracle $H_1$. Then, C makes these parameters public. Next, C selects a keyword $w_{ij}$ from $W_i$ of $poi_i$, and subsequently sends $g^{k \cdot r \cdot H_1(w_{ij})}$ to A. Based on this information, A tries to guess $w_{ij}$. However, because the Decisional Diffie-Hellman Problem (DDHP) is hard, it is difficult to obtain $g^{H_1(w_{ij})}$. As the Discrete Logarithm Problem (DLP) is also hard, it is intractable to compute $H_1(w_{ij})$ in polynomial time. Even if the attacker managed to strip the ciphertext and obtain the hash value $H_1(w_{ij})$, the one-wayness and collision resistance of the hash function would prevent A from recovering the keyword $w_{ij}$ by semantic analysis. Our scheme achieves semantic security against adaptive chosen keyword attack (IND-CKA) using the random oracle and the bilinear pairing technique. Therefore, the security of the keywords is well preserved.
Trapdoors: The security of the trapdoor can be analyzed from three aspects. Recall the trapdoor construction in Section VI-D. First, as the discrete logarithm problem is hard to solve, it is difficult for the attacker to obtain $H_1(w_i)$. The attacker would need to know something about $H_1(\cdot)$ and use semantic analysis to guess the queried keyword $w_i$. However, since the attacker has no knowledge of $H_1(\cdot)$, it is infeasible to recover $w_i$. Second, in our scheme, only the registered data users and the LBSP know the random oracle $H_1(\cdot)$. Without knowing $H_1(\cdot)$, the attacker cannot generate a correct trapdoor for a chosen keyword, so it is infeasible to try out many possible keyword values to find $w_i$. Third, the attacker may try to infer sensitive information from the query results, for example by going to a specific place to obtain the corresponding POIs and then guessing the query content of the data users. Nevertheless, this attack is impractical: to prevent the attacker from learning the query content, a random number $r'$ is used when encrypting each query keyword, which blurs the encrypted keywords. The same keyword is therefore encrypted into a different ciphertext each time.
For the encrypted location information $QM_j$, we can adopt the same method to analyze its security. Because the attacker cannot distinguish a specific encrypted query keyword in the trapdoors by semantic analysis, the attacker cannot guess the query content of the data user. Hence, the query privacy is preserved.
Location Information: In our scheme, the location information is encoded with the Morton coding algorithm and then encrypted in the same way as the keywords. Since the DDHP is hard, the location information encryption algorithm is IND-CKA secure. The reason is as follows. We consider a game played between a PPT adversary A and a challenger C, which acts as the authorized data user and the LBSP in the location information encryption algorithm. We assume that the adversary A has a non-negligible advantage $\varepsilon$ as the attacker in this game. The game is conducted in the following steps.
Step 1: C runs the system setup algorithm to generate the public parameters $(g, q, H_1, e, G, G_T, g^{r}, C_1)$, and then sends these parameters to A.
Step 3: Based on this information, A tries to calculate $M_j$ and outputs a guess $j' \in \{0, 1\}$ for $j$. We denote by $t$ the bit that A is trying to guess. When $t = 0$, $C_1$ is set to $g^{k \cdot r}$; otherwise, $C_1$ is chosen randomly in $G$. The adversary outputs 0 if and only if $j' = j$. If $j' = j$, we say that the adversary A wins the game. Let $\Pr[j' = j]$ be the probability that the adversary guesses correctly. The advantage of A in this game is $Adv_A = |\Pr[j' = j] - 1/2|$.
Proof: If $t = 0$ (i.e., $C_1 = g^{k \cdot r}$), then $C_1 \cdot g^{H_1(M_j)}$ is a valid location information encryption of $M_j$. In this case, the adversary A guesses correctly with probability $1/2 + \varepsilon$. If $t = 1$, the adversary receives a ciphertext of the form $g^{r}$, where $r$ is a random number in $Z_q^*$. In this case, the value of $j$ is hidden from A, and A guesses it correctly with probability $1/2$. Hence, $\Pr[j' = j] = 1/2 \cdot (1/2 + \varepsilon) + 1/2 \cdot 1/2 = 1/2 + \varepsilon/2$.
Since $\varepsilon$ is non-negligible, $Adv_A = |\Pr[j' = j] - 1/2| = \varepsilon/2$ is also non-negligible. This conclusion contradicts the assumption that the DDHP is hard. Thus, our location information encryption is semantically secure, and the privacy of the location is preserved.

VIII. PERFORMANCE EVALUATIONS
In this section, we measure the efficiency of PPMQ in terms of POI encryption time, trapdoor generation time, and query processing time on a real LBS dataset. The proposed scheme supports multi-user settings and provides a flexible multi-keyword location query service to data users.

A. SIMULATION EXPERIMENT SETTINGS
We conduct a thorough experimental performance evaluation of the proposed scheme on a real LBS dataset, the OpenStreetMap project in Singapore [42]. The dataset's POIs are extracted from the LBS resource items in Singapore and comprise 32730 POIs. Most POIs in the dataset have fewer than 5 keywords, while a few of them contain more than 5 keywords. In our experiment, we randomly register 100 users within the geographic area of Singapore. We build a linear quadtree and then obtain a grid system based on the geographic area of Singapore. We assume that the side length of the basic grid cell is σ = 1000 meters, and the number of recursive divisions of the geographic area is set to τ = 5. Every 5 minutes, 10% of the users are randomly selected to issue queries. The experiment programs are written in Java on JDK 1.8 and run on a PC with a 3.30 GHz Intel(R) Xeon(R) E3-1225 CPU, 8 GB of memory, and the Linux Mint 17.3 Rosa operating system. We use the type-A elliptic curve parameters, where the group order $q$ is 160 bits, and SHA-1 as the random oracle for hashing keywords and Morton code values. We use the AES algorithm with a 128-bit key to encrypt POI records. In our evaluations, we implement the exponentiation (pow) and pairing operations under type-A parameters without preprocessing. All experiments are run 10 times to calculate the average time cost of each phase. We implement all necessary routines: for the LBS provider to process location data (such as the encryption of POIs), for data users to generate trapdoors, and for the cloud server to perform multi-keyword searches.

B. THE TIME COST OF DATA PROVIDER
In our proposed PPMQ scheme, the main operation of the LBSP is to prepare the POIs. Before outsourcing the POI records to the cloud server, the LBSP must encrypt them. The encryption executed at the LBSP consists of building the secure index for the POI records and encrypting the records themselves. The factors affecting the computation cost of the encryption algorithm are the number of POIs $n$ and the number of keywords $m$ of each POI record. Thus, we test the efficiency with different numbers of keywords $m$ per POI and different numbers of POI records $n$. The number of keywords of each POI ranges from 1 to 5. As shown in Fig. 5(a), the time cost of POI record encryption increases with $m$. Fig. 5(b) shows that, as $n$ increases, the time cost of POI record encryption increases approximately linearly. Given $m = 5$, when $n$ increases from 100 to 1000, the average time cost increases from 5.27 s to 51.75 s. The reason is that a larger $m$ or $n$ results in a longer POI record ciphertext. Hence, more time is needed to construct the secure searchable index and encrypt the POI records.
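As a quick check of the near-linear growth, the two reported endpoints for $m = 5$ can be converted into a per-record cost. The calculation only divides the figures quoted above and assumes that the measured time is dominated by per-POI work.

```latex
\frac{5.27\ \text{s}}{100} \approx 52.7\ \text{ms per POI},
\qquad
\frac{51.75\ \text{s}}{1000} \approx 51.8\ \text{ms per POI}
```

The two values differ by less than 2%, which is consistent with an encryption cost proportional to $n$.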

C. THE TIME COST OF DATA USER
The primary operation of the data user is to generate the trapdoor. To see whether the time cost is acceptable for a mobile LBS user, we measure the time cost of trapdoor generation for different numbers of query keywords $t$ and different query ranges $d$. Five query ranges are uniformly selected from 1000 to 5000 meters, and for each query range 10 different query coordinates are randomly selected from the OpenStreetMap data of Singapore. The number of query keywords increases from 1 to 5. For each combination of query keywords and query range, we execute the trapdoor generation algorithm 10 times with different query coordinates and calculate the average time cost. Fig. 6 shows the time cost of generating query conditions. From Fig. 6, we can observe that the trapdoor generation time increases as $t$ increases, and that it also increases as the query range $d$ increases. This is because a larger number of query keywords $t$ requires more encryption operations and, for a fixed $t$, a larger query range $d$ requires more time to construct the query region and encrypt its Morton code values. When the query range $d$ increases from 1000 to 5000 meters, the average trapdoor generation time stays below 2.72 s. In general, a data user chooses only a few query keywords and a relatively small query range to search for the desired POIs. Note that, when the query range $d \le 2000$, the time cost of query generation is about 0.5 s. Thus, the time cost at the data user is acceptable for mobile devices.

D. THE TIME COST OF CLOUD SERVER
We now consider the query algorithm. The execution of the query algorithm at the cloud server consists of matching the location and matching the keywords. To match the location information, the cloud server computes a bilinear pairing over a prime order group for each location coordinate in the query region. Then, the cloud server matches the query keywords with the keywords of each POI. The number of query keywords $t$, the number of keywords $m$ of each POI, the query range $d$, and the number of POI records $n$ may all affect the computation cost at the cloud server. Therefore, different values of $t$, $m$, $d$, and $n$ are chosen to illustrate the time cost of a query. Both $t$ and $m$ vary from 1 to 5, and the number of POI records $n$ grows from 100 to 500. From Fig. 7(a), we can see that, given $m = 5$ and $n = 100$, the time cost increases as the number of query keywords $t$ increases, and it also increases as the query range $d$ increases. The influence of the query range is greater than that of the number of query keywords, because the larger the query range $d$, the more POIs need to be matched and the more query operations the cloud server needs to perform. Fig. 7(b) illustrates that, given $n = 100$ and $d = 2000$, the time cost at the cloud server rises with the number of query keywords $t$. When the number of keywords $m$ of each POI increases, the time cost increases approximately linearly, since a larger $m$ requires the cloud server to perform more keyword matching operations. Fig. 7(c) demonstrates that, given a fixed $d = 2000$ and $m = 5$, the time cost increases with the number of POI records, because the larger $n$ is, the more time is needed to match the query keywords and location information. Furthermore, the more query keywords there are, the more time is required for keyword matching. Note that our experiments were simulated on a PC playing the role of the cloud, and only one CPU core was used for computation. The performance of our scheme is therefore expected to be considerably better on a real cloud server, which has far more computing resources.

IX. CONCLUSION
In this article, we explore the problem of multi-keyword queries in LBS. Different from prior works, we have proposed a novel efficient and privacy-preserving multi-keyword query scheme in LBS over the outsourced cloud, named PPMQ, which preserves location and query content privacy and achieves confidentiality of the location data. The designed scheme can prevent the LBS provider or unregistered users from deducing the query content. An authorized data user can efficiently obtain accurate LBS query results without divulging his/her location information. Specifically, we developed flexible user registration and user revocation mechanisms, which make the proposed scheme scalable. Furthermore, we give a security analysis and conduct extensive experiments on a real LBS dataset to evaluate the performance of our scheme, and the experimental results demonstrate its efficacy and efficiency. In future work, we will consider integrity verification for the query results.

WEI LIANG received the Ph.D. degree from Hunan University, in 2013. He was a Postdoctoral Scholar with the Department of Computer Science and Engineering, Lehigh University, USA, from 2014 to 2016. He is currently an Associate Professor with the College of Information Science and Engineering, Hunan University. His research interests include networks security protection, embedded system and hardware/IP protection, fog computing, and security management in WSN.