A Top-K Query Scheme With Privacy Preservation for Intelligent Vehicle Network in Mobile IoT

The emergence of mobile Internet of Things (IoT) has brought much more convenience to our daily life than ever before, modern intelligent vehicles are usually equipped with various mobile IoT components or devices. However, when we are enjoying the convenience from the rapid development of mobile IoT services, our privacy can be misused by attackers in an easier way. In this paper, we devise a query scheme with intelligent vehicle privacy guarantees, which enables a vehicle to acquire accurate query service from the service provider (SP) without providing its explicit private information, such as location or query interest privacy. And more than that, we introduce network coding for the first time to make our scheme applicable to a more sophisticated top- $K$ query by multiple vehicles cooperation in mobile IoT. Furthermore, we also consider the unnecessary reveal issue of service data at SP side since the data is its asset. Performance analysis and experiments verify the validity of our scheme and demonstrate a better accuracy and efficiency compared with existing solutions in mobile IoT scenario.


I. INTRODUCTION
With explosively emerging of mobile Internet of Things, numerous smart components are equipped on mobile IoT terminals, such as intelligent vehicles, wearable electronics and smart phones, these mobile terminals usually cooperate with each other to collect, process and transmit data of users, which is the most popular application in our daily life [1].
There are three types of entities in mobile IoT framework: intelligent terminals as we mentioned above, intermediate nodes which carry out data aggregation and preliminary processing, and service providers (SP) which respond the service requests from intelligent terminals, as shown in Fig.1.
Intelligent vehicle network which utilizes mobile terminals are becoming one of the emerging applications in mobile IoT, and the mobile services in intelligent vehicle networks have given rise to widespread interests due to its convenience and The associate editor coordinating the review of this manuscript and approving it for publication was Weizhi Meng . effectiveness [1], [2], such as location-based service, navigation, querying for the K nearest neighbor (K NN) places of interest (PoI), etc. With a variety of sensors or smart components inside a vehicle, diverse and plenty of personal data can be collected and utilized, such as location, velocity, voice, image and even fingerprint. More than that, nearly all of the mobile services ask intelligent vehicles to provide sensing or private data for generating accurate service results, which may extremely lead to privacy exposure of these vehicles.
Malicious attackers are capable to infer deeper private information from a mobile service request associating with background knowledge [2]- [5], such as the financial situation, hobbies, health status, and home address. Therefore, when we enjoy the convenience from intelligent vehicles, we must take effective measures to preserve the sensing and private data collected by the smart IoT components inside a vehicle. For sake of simplicity, we pick location and query interest as two typical privacy of an intelligent vehicle, and devise a service query scheme with privacy preservation for the intelligent vehicle network in mobile IoT scenario.
On the one hand, cloaking region refers to an area which is employed by different mobile terminals to blur their sensing location in a query, it is generally constructed by a vehicle or a trusted third-party server. However, owing to the route uncertainty of all involved vehicles, it's hard to construct successive cloaking regions maintaining k-anonymity in mobile IoT [9], i.e., successive cloaking regions should retain most vehicles in the initial cloaking region, such that attackers are uncertain to find the actual vehicle who initiates the query. And most trusted servers confront a service bottleneck at the demands of cloaking region construction [10], [15]. Wang et al have discussed the balance between privacy preservation and data integrity in various IoT and the trust issues of data collection [16]- [23]. Arain et al have proposed a remarkable location protection approach and a dynamic pseudonymous authentication protocol based on mix-zones, which are practical and efficient in real road network scenario [24], [25].
On the other hand, dummies are pseudo locations instead of the actual sensing location by sensors in a mobile service request, dummies are usually generated around the actual location by smart components [11]. A typical service request with dummy is first proposed by Yiu [12], which is a valid solution to guarantee location privacy, but it has the issues of inaccuracy, failing to achieve k-anonymity and location diversity. To deal with these issues, Zhou has improved the service accuracy with dummy and presented a query scheme in mobile IoT scenario [11]. Gong and Huang [13], [14] have made significant improvements of Yiu's work in the aspect of achieving k-anonymity. But most of these protections are devised in the Euclidean space without considering building blockage, which leads to inaccurate query result, little work has paid attention to the real scenario in mobile IoT.

B. INTEREST PROTECTIONS IN MOBILE IoT
Different from those sensitive sensing data captured by smart components, such as vehicle location, speed, query interest directly reflects vehicle privacy, while most service providers ask vehicles to provide their interests for high-quality service. Trustworthy services should always have as little access to private data as possible without sacrificing service quality.
An existing effective protection for user interest privacy is private information retrieval (PIR) [3], [4], [15], [26]- [28]. By means of PIR, vehicles are capable to calculate desired query results after obtaining a response cryptographic data block from SP without revealing any private information. However, most existing PIR schemes are devised on strong cryptology and inapplicable to mobile IoT due to expensive processing overhead from frequent service requests with constantly updating sensing data [15], [26]. Moreover, most PIR schemes have to utilize a trusted third party [27], [28] between a vehicle and an SP to preprocess the database, which increases the risk of data exposure.
Remarkably, two practical PIR schemes [3], [4] have been devised in Euclidean space with divided cells, both of which make good use of the homomorphic properties [29], [30] of public key cryptosystems [28], [31], but they obviously have the distance inaccuracy issue due to building blockage in a real mobile scenario, and the vehicle can be identified in a particular cell risking location privacy disclosure, and querying in a cell also gives rise to more extensive searching area with too much additional processing overhead.
Besides, intelligent vehicles always have to repeatedly send service requests to the SP as long as it enters into a new cell for obtaining the latest service results, so the vehicle may get a great deal of reduplicative service results which have been obtained in the former cell already, that's a waste of bandwidth. We will consider these mentioned issues and devise a novel and practical PIR query scheme for intelligent vehicle network in mobile IoT.

C. CHALLENGES OF TOP-K PIR QUERY IN MOBILE IoT
Specifically, we take the most common PoI query from intelligent vehicles as an example to illustrate our query scheme. Currently, a majority of query schemes in mobile IoT focus on the K NN PoI query service [5], [27], while most vehicles always expect to retrieve the most popular PoIs which are ranked by the click rate, comments and etc., i.e., top-K query [26], [32], [33], rather than only ranked by distance in the K NN scheme. Therefore, the SP should record all the query interests from the intelligent vehicles, and periodically rank the PoI records according to the popularity.
However, it means the SP needs to be aware of the vehicle interests in a top-K query, but it is opposite to the core idea of PIR which facilitates a vehicle to receive desired service without disclosing its privacy. It's absolutely a great challenge to achieve top-K query with PIR which provides no vehicle interest [27]. To the best of our knowledge, this is the first paper to deal with this issue. We will achieve top-K query maintaining PIR by introducing network coding (NC) [34] with intelligent vehicle cooperation in mobile IoT scenario based on our previous work [11].
NC has been formerly considered as a promising data dissemination approach to improve the network performance [35], [37]. A forwarding node in networks no longer follows traditional storage and forward mode, it combines the input packets into one output packet, and the receiver could calculate the original data with these combined packets in Fig.2. There is a compelling characteristic of NC, i.e., the receiver is capable to calculate the original data iff enough encoded packets are collected, we will take advantage of this beneficial characteristic to prevent SP from acquiring vehicle interest confidently for privacy concerns. There are also security and privacy solutions with NC [36], [37], but they are not applicable to deal with these issues. It's the first time to introduce NC to achieve the top-K query in mobile IoT.
To sum up, for private data preservation which is collected and utilized by intelligent vehicles in mobile IoT, we summarize the issues to be resolved as follows: • Initially, private data collected by intelligent vehicles should be preserved well in a mobile service request without revealing privacy to any service provider (SP); • However, SP has to be aware of the private data, such as vehicle's service interest, to calculate the popularity and then ranks the top-K service records in its database, it's hard to achieve a top-K query with privacy preservation; • Most solutions for privacy preservation in mobile IoT are either built on strong cryptography accompanying with excessive overhead, or built in Euclidean space without considering building blockage, both of which lead to inaccurate result and impractical implementation. And SP also desires strong control on its data security.

D. CONTRIBUTIONS
To deal with the mentioned issues, we achieve a top-K query scheme with mobile IoT devices cooperation in this paper, it protects vehicle privacy in mobile IoT scenario. Our scheme is built on a strong cryptosystem for security concerns, but with less computational and communicational overhead, we summarize our contributions as follows: • Firstly, we devise a query scheme with vehicle privacy guarantees based on oblivious transfer (OT) and private information retrieval (PIR) in vehicle network. Taking full consideration of the influential factors in the road network scenario, our scheme has lower overhead on vehicle terminals in spite of being built on strong cryptology, and its query results are more accurate than other schemes which are proposed in Euclidean space; • And then we further achieve a more complicated top-K query based on our aforementioned query scheme by introducing NC for the first time as far as we know. In our top-K query scheme, neighbor vehicles cooperate with each other to transmit their encoded query interests, and SP couldn't decode the interest privacy for ranking popularity until enough data block are collected and couldn't relate any interest to a specific vehicle; • We also consider the data preservation at SP side since it's the asset of SP, i.e., SP will release no extra data to the vehicles in our scheme. Besides, we did extensive performance analysis and experiments to verify the validity of our scheme in mobile IoT scenario.

A. NOTATIONS AND SYSTEM MODEL
We utilize the ElGamal algorithm [38] in our scheme and describe the encrypting operation of a message m with a public key pk = g x as E(m, pk) = (ε, δ) = (g r , g m (pk) r ), where the generator is g of group G, the integers x and r are randomly picked. Assume a cyclic group G 0 which is a multiplicative subgroup on a finite field F p with an order q and a generator g, respectively. Note that p and q are both large primes which satisfies q divides p − 1. We also assume the prime p and q are long enough for the security reasons. It has been proved that the ElGamal algorithm satisfies the sematic security [3], [4], [38]. TABLE 1 describes the main notations used in our scheme.  As mentioned above, there are mainly three types of entities in the system model of our scheme: intelligent vehicles, intermediate nodes and service providers, as shown in Fig.3, we describe the functionality and credibility of these entities in detail as follows: • An intelligent vehicle collects its location and interest with the help of inside smart components, and take these private data as input to acquire mobile service from SP, but the vehicle has to preprocess its private data before transmitting it to other entities since the vehicle doesn't trust any other entity for privacy exposure concerns. The vehicle travels along the road segments and observe the traffic regulations, it has a certain of computing resource, but tries to minimize overhead to acquire the mobile service as soon as possible; • An intermediate node is a sink node in a mobile sensor network, such as a roadside unit or a base station, it recodes the interest packets from vehicles and outputs one combination for sake of achieving the top-K query and privacy preservation in our scheme and then it transmits the combination to SP to rank the PoI records in the database of SP. The intermediate nodes are not trusted by vehicles or SP in our scheme; • An SP responds to the query requests from vehicles, it aperiodically receives interest combinations from intermediate nodes and decodes them after getting enough ones to calculate the popularity and periodically rearrange the top-K results in its database. A distributed SP only possesses various PoI data and responds to the service requests from vehicles in its governing area. SP is not trusted by vehicles or intermediate nodes in our query scheme. We also make the threat assumptions for attack model in our query scheme: • Semi-honest attackers: as described in aforementioned entity credibility, there exists no trusted entity in a query except the user himself, any other vehicle, intermediate node or SP could be a semi-honest attacker, in other words, they honestly carry out the scheme while they are interested in digging deep into the identities and query interests of vehicle users for commercial interests; • Malicious attackers: they are passive attackers, and try to infer a vehicle's location privacy and query interest when they eavesdrop the communications between the vehicle and SP, they also try to decode the interest combinations when they eavesdrop the communications between vehicles and SP through intermediate nodes. Besides, they could collude with some vehicles and launch traffic analysis attacks in the NC phase. In this way, each packet in networks can be calculated by the equation y(e) = h i=1 g i (e)x i to be a linear combination of the original packets X = (x 1 , x 2 , . . . , x h ) from the original generating node, the vector g(e) = [g 1 (e), g 2 (e), . . . , g h (e)] represents a global encoding coefficient which is calculated successively by each intermediate node with distinct local encoding coefficients by the equation g(e) = e C e (e)g i (e ). Suppose a receiver acquires a combination set from different routes (y(e 1 ), y(e 2 ), . . . , y(e h )), it can be expanded with an operational matrix of original packets X in Equation (1): . . .
Note that, once enough combinations are received, the receiver can decode original packet X iff G m is invertible, i.e., the matrix G m is linear independence. It has been proved that if the encoding coefficients are chosen in a finite field F q (2 16 ) and all the intermediate nodes perform random linear coding on their inputting packets to output one combination, the matrix G m , can be full rank with a probability close to 1 [36], which is crucial to decode at SP.
The invertibility of G m are also discussed in [37]. This characteristic of network coding can be applied to privacy preservation, i.e., only enough combinations of original vehicle interest packets are received, the SP can further decode what types of interests have been queried by vehicles without inferring vehicle privacy as a result of the decoding time delay and the resist to traffic analysis. But NC cannot be applied to VOLUME 8, 2020 top-K query directly in mobile IoT, such as the original data block division issue, and the security issue of G m and y(e). We will deal with these issues and make NC applicable to top-K query.

C. SCHEME OVERVIEW
In this paper, we adopt three-party framework without any trusted entity, i.e., ''Vehicle-Intermediate Node-SP'' in Fig.3, where SP is honest-but-curious, i.e., SP strictly carries out query schemes and provides service to vehicles, but hopes to dig out more information for commercial interests. And all vehicles are not trusted by SP, either. SP wouldn't like to release other PoI record to vehicles. Our query scheme includes three phases and is briefly described as follows: • An oblivious transfer query is generated by a vehicle and then sent to SP for the symmetric key to its desired PoI without revealing vehicle privacy or extra data of SP, as described in phase x y of Fig.3; • Then, the vehicle initiates a PIR query to get the top-K PoIs coordinates and their details which are encrypted by the symmetric key mentioned above, and the SP cannot find any clue of the vehicle's interest or location in this phase, as described in phase z{ of Fig.3; • Finally, a vehicle generates an encrypted packet of previous query interest to a vehicle who is elected by other nearby vehicles, the elected vehicle performs encoding operation on the packets and delivers the combinations to the SP in different routes, the SP cannot decode the original interests until it receives enough combinations, which is essential to our top-K query, as described in phase |} of Fig.3.

D. A BENEFICIAL DATA STRUCTURE
We will introduce a beneficial data structure to store vehicle interests at SP side, such as the PoIs they queried. Assume that all PoIs in the map are distributed on distinct road segments in Fig. 4, we can similarly denote a vehicle u k locating at a directed road segment − − → v m v n as u k ∈ − − → v m v n , and a vehicle initiates a query with its location and interest for top-K PoI set P K u k ,which includes two parts P K uk = P X uk + P . If u k wants to get to its desired PoI on any other road segment, it has to reach the vertex in front firstly anyway.
Note that each vertex in front of a vehicle can be taken as a starting point to reach the PoIs nearby, so all the vertexes can be taken as generators to organize top-K PoI records in the data structure in TABLE 2. In another word, each vehicle initiates a query with its forward vertex as a dummy to obtain a top-K PoI set P K u k , in which K PoIs are mainly ranked based on road distance, rather than the Euclidean distance which is adopted by most other K NN query schemes and leads to service inaccuracy because of the building blockage or road constrains. More than that, only one new query is generated by the vehicle as long as it enters into a new road segment − − → v m v n with its forward vertex in mobile IoT, which significantly reduces the query number.
Six columns exist in TABLE 2 and in sequence indicate the name of vertex v i in road networks, the neighbor vertexes of v i , various PoI types departing from v i , the unique code j of distinct PoI type, the coordinate set of a PoI type ranked by the following Equation (2) departing from vertex v i and detailed information of different PoIs. Each detail of PoI records in the last two columns is encrypted with different symmetric keys in a bid to protect data security stored in SP. Accordingly, if a geography map contains m vertexes and at most n PoI types for each vertex, as shown in TABLE 2. the SP has to generate m × n symmetric keys k ij , (1 ≤ i ≤ m, 1 ≤ j ≤ n) for each PoI type. In addition, SP extracts the columns from 1 to 4 in TABLE 2 as an index and periodically releases it to the vehicles, so that the smart components in a vehicle can determine its road segment.
Furthermore, we consider more factors to rank the popularity of a PoI rather than only the distance in road networks, so we add a criterion value T v to each PoI, T v is computed by SP in the light of other factors, such as query rate, user reviews, financial profit, etc., such that the top-K PoIs can be calculated precisely by the following weighted Equation (2).
wherein dis norm is road distance function by means of taking the longest road segment in entire map as a road distance normalization factor, and 0 < α < 1 is the weighing factor.
In our scheme, to preserve the vehicle's privacy and SP's data security, the vehicle initially carries out the oblivious transfer protocol to insensibly acquire the unique symmetric key k ij of its desired PoI in TABLE 2 and without revealing any other PoI record stored in the SP or the location and interest privacy of the vehicle. Then, the vehicle carries out the PIR protocol to get the top-K PoI coordinates and top-K detailed information in the last two columns of TABLE 2, which are encrypted by the symmetric key k ij , whereas SP hardly relates the PoI record to any vehicle interest which already had been retrieved by a vehicle.
Finally, each vehicle encrypts its interest with SP's public key and transmit it to a sink vehicle who is elected to combinate their input packets, and the sink vehicle carries out a random linear encoding operation on these cryptographic packets and delivers them to SP. The SP decodes and decrypts the original vehicle interests after receiving enough combinations without relating these interests to any specific vehicle and then rearranges the PoI records in TABLE 2. Note that three protocols can be performed simultaneously.
We also assume that the whole map of the road networks is divided into different governing areas, each distributed SP server responds to query requests from vehicles and possess various information in its governing area, such as the map, the PoI distribution information in its own TABLE 2. As shown in Fig. 4, it's a governing area with a distributed SP i .

III. OUR SCHEME
As mentioned before, SP generates distinct keys, such as a key k ij , to separately encrypt the top-K coordinates and details of each PoI record. Owing to the correlation between the symmetric key and a PoI, the vehicle privacy will be exposed if the vehicle queries for the key k ij directly. To deal with this issue, we utilize oblivious transfer to obtain a symmetric k ij from the SP privately.
Originally, SP encrypts each key k ij by Equation (3), wherein H (·) and ⊕ indicate a hash function and an exclusive or operation respectively in the equation, we set keyij = g a i ||g b j , (1 ≤ i ≤ m, 1 ≤ j ≤ n), wherein a i and b j are integers randomly picked by SP and correspond to different rows and columns in TABLE 2. Each row of the same vertex v i has the same a i and distinct b j , which reduces the computational overhead than other solutions without security compromise, the symbol || is a concatenating operation. Then a vehicle can obtain a unique key K ij each time by the oblivious transfer protocol in the following subsection III.A.
In the light of the beneficial data structure in the subsection II.D, we are able to implement the following two cryptographic protocols more efficiently in vehicle networks.

A. OUR OT PROTOCOL
Aiming at mobile IoT scenario in road networks, we propose an oblivious transfer (OT) protocol based on Elgamal cryptosystem [35], which works between vehicles and SP in our query scheme with SP data and vehicle privacy preservation. We will firstly describe the OT protocol in this subsection and its interaction process between a vehicle and the SP is show in TABLE 3, it contains 3 steps: Step 1: Owing to travel restrictions from building blockage or road constraint, a vehicle u k has to travel along the road segment until it reaches the forward vertex and enters a new segment to get to desired PoIs. Thus, the vehicle adopts its forward vertex v i as a starting point in a query rather than its actual location to retrieve top-K PoI records, and then it utilizes the index table from SP to determine the unique code j of its desired PoI record from vertex v i . Besides, the i and j reveal the vehicle's location and interest privacy when it queries for the key keyij, i.e., the vehicle locates on a road segment with vertex v i and it queries for the jth PoI type in TABLE 2. To deal with this issue, we utilize oblivious transfer to encrypt the i in symbol v i and the unique code j, then a vehicle sends the query (OT 1 , OT 2 ) to SP for keyij.
Step 2: In order to acquire the corresponding key secretly from SP, as soon as receiving the OT query (OT 1 , OT 2 ), for each integer ξ and η within 1 ≤ ξ ≤ m and 1 ≤ η ≤ n, SP calculates two candidate vectors (R, S) as potential query result for vehicle to calculate the key keyij, then the cryptographic vector (R, S) are transmitted to the vehicle.
Step 3: On the basis of oblivious transfer, a vehicle is merely capable to calculate its desired key key ij according to (OT 1,i , OT 2,j ) in R and S when ξ = i, η = j, and it computes key ij = w 1 ||w 2 by the following Equation (4): and then it further calculates the k ij = K ij ⊕ H (key ij ) in the light of Equation (3) by its self-inverse of ⊕ operator. With the symmetric key, the vehicle is capable to only decrypt one PoI detail it desired before.

B. OUR PIR PROTOCOL
Similarly, we devise a PIR protocol based on Gentry et al's cryptosystem [28] and it works between vehicles and SP. The novel improvements in this paper are these: we implement a PIR protocol in the real road networks scenario rather than in the Euclidean space, which makes the query result more accurate than other typical solutions [3], [4], [28]. And more than that, our PIR protocol is capable to acquire the desired top-K PoIs accurately just at one time, rather than excessively searching for the K PoIs in broad area around the vehicle successively in other similar solutions [3], [4], [28], our PIR protocol significantly decreases the query scopes and communication cost in this paper. Besides, in mobile IoT scenario, the vehicle sends a query only once as soon as it gets in another segment rather than issuing the same query to SP periodically according to location updates in the solutions [3], [4]. Its interaction process of our PIR protocol between a vehicle and the SP is show in TABLE 4, it contains 3 steps.
In our PIR protocol, we assume all PoI records belonging to a certain vertex v i in terms of a data block, denoted as an integer C vi . Then an integer set PE = {p N 1 1 , p N 2 2 , . . . , p N m m } are calculated by SP and assigned to each data block corresponding to distinct vertexes, wherein ω i = p Ni i in PE is a prime power, p i and N i are respectively a prime number and a relatively small natural number. Particularly, the size of each data block is smaller than its relevant p Ni i , such that |C vi | < ω i , all vehicles are aware of the set PE. At last, a minimum integer C LBS is calculated with Chinese Remainder Theorem, such that C LBS = C vi (modω i ), 1 ≤ i ≤ m, wherein the C LBS indicates the database of a certain distributed SP who governs an area in a divided map.
In our PIR protocol, a vehicle initially determines the vertex v i in front of its current road segment, and its relevant ω i = p N i i which divides | < g > |, the order of an element g in the random group G generated by vehicle. Eventually, the vector (G, g) is sent back to SP.
After receiving (G, g), SP computes g C LBS straightway and delivers it to the vehicle. Then, the vehicle is capable to calculate desired data block C vi by following Equation (5).
wherein c ω i indicates the calculation C LBS ( mod ω i ). With the acquired symmetric key before, the vehicle can uniquely decrypt one PoI record desired in the data block C vi without revealing any vehicle privacy to SP, or extra PoI data in SP to vehicle. Therefore, the vehicle privacy and SP data security are well protected in this PIR scheme.

C. IMPLEMENTATION OF THE TOP-K QUERY
The two above-mentioned protocols aim at getting K PoIs without revealing vehicle privacy or SP data. Such query scheme, however, cannot achieve top-K query since SP has to know the query interests from vehicles to rearrange the top-K PoI records in its database, which is incompatible to the core idea of PIR. We will maintain PIR with achieving top-K query by adding networks coding (NC) into our query scheme. With the cooperation of intelligent vehicles in mobile IoT, the SP cannot calculate the query interest from vehicles until enough NC packets are collected, and it cannot relate any of these query interests to a specific vehicle as well, which insures PIR against being invalid. In the road networks, each vehicle in different road segments can be considered as a member in a vehicle ad hoc network. After or before a query, a vehicle encrypts its query interest with the public key published by the SP, and delivers the ciphertext to a nearby vehicle who is elected randomly by its neighbor vehicles as a cluster head (receiver), the election algorithms [39], [40] and relevant data protections [41]- [44] have been well explored in previous work, we will not focus on it in this paper. The vehicle changes its pseudonym in each snapshot query to prevent correlations.
The algebraic relations mentioned in the subsection II.B can be applied to these query interest packets. After receiving h packets X = (x 1 , x 2 , . . . , x h ), the receiver adds a unit matrix to these packets, which fundamentally helps packets to record each linear transformation on intermediate nodes and facilitates the decoding operations at SP, denoted as the following Equation (6): and then the receiver generates its local encoding coefficient matrix and calculates a combination of these packets by Equation (7), wherein β i is the ith line of matrix X. The h packets can be accordingly called as the corresponding packets of aforementioned combination.
Finally, the receiver delivers the combination and corresponding packets with different unit vectors in h distinct paths to SP.
Actually, the network is dynamic in mobile IoT, with vehicle login or logout in an ad hoc way. But these query combinations and their corresponding packets are always forwarded to different sink nodes, such as roadside units, base stations, etc. After receiving a certain number of the packets of the same combination from a receiver, a sink node will recode the packets with its own local encoding vector by performing random linear coding and output a new combination, so each output can be taken as a combination of original h packets (x 1 , x 2 , . . . , x h ) in Equation (8) where E i is a unit matrix, and each globe encoding coefficient g(e) is time varying and diversified, but only the terminal node, i.e., SP, is capable to decode the combinations due to adding the unit vectors which records the random linear transformation during the transmission and facilitates the VOLUME 8, 2020 calculation of G m at SP without being aware of the network topology knowledge or message routing path. For sake of the confidentiality of G m , homomorphic encryption with the public key from SP can be implemented the on original encoding coefficient at each selected receiver, and at each sink node the linear recoding can be carried out on the encrypted coefficient which is equivalent to the same operation on the plaintext due to the homomorphic feature [28], [31], so any other malicious attacker in the vehicle networks cannot obtain the G m , so the original interest packets of vehicles are protected well in the transmission.
After gathering enough combining packets of the same generation and decrypting the coefficient G m , SP is capable to decode the original vehicle interest with the following Equation (9) as long as G m is invertible.
The invertibility of the matrix G m can be easily achieved as long as the corresponding packets at a receiver are forwarded in distinct edge-separating paths [34]- [37]. It can be seen that each combination seems like the same as an encryption and meaningless to any malicious attacker, and only after gathering enough combinations, SP can calculate the original packets which are the exact query interests from cooperative vehicles, and then SP can periodically update its top-K PoI records stored in TABLE 2 without inferring any of these query interests to a specific vehicle. To sum up, the aforementioned three phases, OT, PIR and NC constitute a scheme that achieves the top-K query with vehicle's location and query interest privacy preservation and SP's data security guarantees in mobile IoT scenario.

IV. PERFORMANCE ANALYSIS A. PRIVACY AND SECURITY ANALYSIS
In the our oblivious transfer protocol, only one symmetric key can be acquired by the vehicle each time, where the encryption key key ij = w 1 ||w 2 = g a i ||g b j can be worked out if and only if ξ = i and η = j in Equation (4) and the following Equation (10).
Because only the vehicle knows x, and it is hard to work out other keys in term of brute force due to the intractability of discrete logarithm, and the vehicle cannot deduce any other encryption key of vertex v i with the same g a i , either.
Respectively, the i of v i and the unique code j imply location privacy and query interest privacy of a vehicle, we utilize ElGamal [38] to encrypt them, such that SP or attackers are hard to crack vehicle's relevant privacy on account of the semantic security provided by ElGamal. During the whole process of our scheme, the vehicle's locations are generalized uniformly to the entire governing area of a certain SP. Compared with the schemes in [3], [15], in which a vehicle is restricted to a small region, it's easy for the vehicle to be identified by a malicious attacker if the vehicle is the only one in that region.
Additionally, SP is never aware of any vehicle interest, e.g., the g C LBS in PIR protocol is calculated by SP who gets no significant information rather than an exponentiation, and consequently the vehicle works out its desired top-K PoIs itself with g C LBS . Despite many similar solutions have already made great improvements [3], [15], we achieve preferable accuracy and efficiency with costly cryptography for mobile IoT in real road networks.
Cooperative vehicles' interests are also preserved well in the NC procedure since each packet is a combination of h vehicle query interests in Equation (8), due to none of these combinations signifies an explicit query, the SP is hard to relate it to any specific vehicle. Moreover, homomorphic encryption is employed on original local encoding coefficient so that each linear recoding can be confidentially carried out on each sink nodes, which effectively guarantees the query interest privacy in mobile IoT.
Besides, traffic analysis can be performed on intermediate nodes or malicious attackers in the NC phase. Generally, there are three typical technologies in traffic analysis: size correlation, time order correlation and content correlation. In our scheme, each packet or combination is set to the same size to resist the size correlation, and it's hard to implement time order correlation since each packet or combination can be randomly sent to the SP. We also assume the resistances to the aforementioned size or time order correlation are compromised, the original global encoding coefficient can be encrypted by Paillier cryptosystem, attackers are hard to decoded the combinations without being aware of the G m , even if an attacker compromises several intermediate nodes to analyze the query interests from vehicles.
Moreover, the sink vehicle is also elected randomly, the cost is extremely expensive for attackers to collude with each potential receiver, the vehicle can preserve its query interest well by means of querying with a new pseudonym each time even if SP colludes with a large number of vehicles. Furthermore, NC also helps to enhance transmission capability in vehicle networks without privacy compromise.

B. COMPUTATIONAL COST
As most researches concern, we will focus on the computation cost analysis and make a comparison between our work and similar solutions [3], [15].

1) OT PROTOCOL ANALYSIS
Similar to [3], [15], we take modular exponentiation as the major cost in our scheme. As shown in TABLE 3, a vehicle performs encrypting operation to generate an OT query in OT 1 = E(i, pk) = (g r 1 , g −i+xr 1 ), therefore 4 exponentiations are executed in OT 1 and OT 2 . After receiving the cryptographic result, the vehicle performs w 1 = Y 1,i (X 1,i ) x to calculate the accurate query result, therefore 2 exponentiations are executed in calculating w 1 and w 2 . Thus, 6 exponentiations are executed in an OT query at the vehicle side in total, as shown in TABLE 5. SP carries out 4 exponentiations in (ε r ξ 1 , g a ξ (g ξ δ 1 ) r ξ ) for every ξ ∈ [1, m] to create an OT 1,ξ after receiving a query. And g ξ can be calculated ahead of time for cutting down the processing time since SP possesses the element g, and consequently 3m exponentiations are executed for calculating the (OT 1,1 , . . . , OT 1,m ), take another 3n exponentiations in (OT 2,1 , . . . , OT 2,n ) into account, the number of exponentiations adds up to 3(m + n) at SP side, as shown in TABLE 5.
Finally, another two exponentiations are performed at vehicle side to calculate w 1 , w 2 . The computational costs in our protocol are compared with two typical schemes [3], [15] who employ the OT protocol in their scheme as well, the comparison of computational cost is shown in TABLE 5.
These solutions [3], [15] whose geographical maps are all described with cells in Fig. 5. For obtaining the latest PoI query result in mobile IoT, the vehicle has to issue a new query as long as it gets in a new cell accompanied with location updates. But in our scheme, the vehicle only issues one query as long as it gets into a new road segment which traverses a number of cells. That implies, if u i enters into −−→ v 13 v 8 which traverses four cells, it only issues one query in our query scheme rather than issue four queries in [3], [15]. As a result, if each road segment traverses a certain of cells at an average of γ , the computing cost is 1/γ of solutions [3], [15].

2) PIR PROTOCOL ANALYSIS
In [15], PIR query is devised in Euclidean space with m × n cells, and accordingly the number of multiplication is N √ m × n and m × n executed at vehicle and SP ends in TABLE 6, respectively. Compared with that, the scheme in [3] contains 3 exponentiations, and the number of its multiplication is 2|N | and |e| at vehicle and SP end. Four exponentiations exist in our scheme for homomorphic encryption on encoding coefficient at the sink nodes, (S NC + 1) × h exponentiations in recoding process, where S NC and h are the average number of sink nodes and the combinations received on each node, respectively. And owing to utilizing the Pohlig-Hellman algorithm, it brings in a process with a complexity of O(N log p N + √ p) in [3], whereas we have a process with a complexity of O( √ p) in the logarithmic solving, which improves the efficiency. Thus, the PIR computational overhead of three mentioned schemes is summarized in TABLE 6, where exp represents the cost of an exponentiation. It shows that our scheme has an advantage of computational overhead in a snapshot query. More than that, if each road segment traverses γ cells on average, the computational cost of our mobile service request is 1/γ than the other two schemes in road networks. We do not consider the computational cost in NC due to its linear operations on sink nodes in mobile IoT scenario, which are negligible compared with modular exponentiations.
Furthermore, we adopt distributed SP whose service data set in TABLE 2 is smaller than that in a centralized database, so when the minimum integer C LBS = C vi (modω i ), 1 ≤ i ≤ m is calculated by SP with the Chinese Remainder Theorem to generate a cryptographic result, in which C LBS indicates the database of a certain distributed SP who governs an area in a divided map, the computational overhead is obviously much less than that in the centralized database. So due to the less computation amount, our scheme generates query result faster than the other solutions who are also built on strong cryptology, it is applicable to the mobile IoT scenario.

3) NETWORK CODING ANALYSIS
The computational overhead can be investigated from three aspects: original encoding on the select cluster vehicle, recoding on the intermediate node and decoding on the SP.
Consider an h × h encoding matrix G m on the cluster vehicle, for sake of security and recoding on intermediate nodes, each element in G m can be encrypted by Paillier cryptosystem to make good use of its homomorphic property. So that the encryption computational cost on G m is O(h 2 ) according to Equation (7), and each encrypting operation includes 1 multiplication, 1 modulus and 2 exponentiations. Thus, the computation cost is O(h 2 · logn) in total. Intermediate nodes can perform random linear recoding on the ciphertext of inputting encoding packets, which is equivalent to the recoding operation on plaintext packets.

Similarly, the computational cost on each local vector g(e)
is O(h 2 · log n). According to the Equation (8), the total computational cost of a G m with hg(e) is O(h 3 · logn) due to the multiplication operation provided by the homomorphic property of Paillier cryptosystem.
Only SP can decrypt the encoded combination according to Paillier cryptosystem, it includes 1 multiplication, 1 exponentiation and 1 division operation on a combination, so the computational cost is O(h · logn). Only after collecting h linear independent combinations, SP can decode the original packets, so the computational cost is O(h 3 ).

V. EXPERIMENTS
We employ the geographical data in dataset OL [45] and generate mobile objects with TBG [46] as vehicle continuous locations in road networks. Our scheme was implemented on Windows 7 platform by Java programs. The primary parameters are recorded as follows in TABLE 7, each service request packet is normally set to 1500 bytes in total, in which includes (1500 − 40)/300 ≈ 5 PoI records as defined that each PoI record and the packet header cost 300 bytes and 40 bytes, respectively.

A. QUERY ACCURACY
Vehicles are capable to obtain accurate top-K PoIs according to its interest, which are arranged and stored in advance in the data structure of TABLE 2, it implies the top-K PoIs are the nearest ones along road segments without blockages in mobile IoT scenario rather than the Euclidean space, such as building blockage, road restrictions. All the top-K records are rearranged corresponding to the updating popularity of vehicle interests by introducing NC. The query accuracy is compared with [3], [15] in Fig. 6 and Fig. 7.
As show in Fig.6, our scheme is more accurate than [3], [15] since it is devised based on real road networks, e.g., despite the K NN PoIs are gotten in [3], [15], these PoIs are  distributed in the Euclidean space, which may be unreachable for a long time along the road, so it implies the accuracy for querying PoIs cannot be guaranteed all the time. As shown in Fig. 7, when the average length of road segments S becomes longer, the query accuracy decreases as a result of discarding some PoIs caused by pruning and merging in our scheme.

B. PROCESSING TIME AND COMMUNICATIONS
As shown in Fig.8, when the number of vehicles increasing, the SP faces more service requests and hence the average processing time of three schemes increases to a certain extent.
Our scheme has lower computational cost compared with the scheme [3]. We also have a certain of preponderances on account of query number declines and discrete logarithm improvements in PIR phase, its average processing time reaches the half of [15], whereas a mobile user in [3], [15] has to send more queries in different cells.
Our scheme makes expensive cryptographic scheme more practical due to the reduction of query number. As shown in Fig. 9, as location updating in time interval from 30s to 5s, the query packets tend to increase rapidly in [3], [15], whereas  our scheme remains stable because the vehicle doesn't issue queries corresponding to the location updating.

VI. CONCLUSION
To preserve the private data which are collected by the smart components in vehicle networks, we propose a top-K query scheme with privacy preservation by introducing NC into mobile IoT scenario for the first time. In our scheme, vehicles with smart components cooperate with each other to access to mobile service without revealing their location or interest privacy, and our scheme has more accurate service result since taking full consideration of the factors in real road network. Moreover, it has less computational and communicational overhead even built on strong cryptology. Compared with existing solutions, we verified its accuracy and efficiency in performance analysis and experiments.