A New Method With Swapping of Peers and Fogs to Protect User Privacy in IoT Applications



I. INTRODUCTION
Internet of Things (IoT) now contains numerous innovative technologies in addition to billions of smart devices and objects [1] connected to the Internet. These smart things, spread everywhere around us, are helping to make our routine actions swift, flexible and sophisticated [2].
Many applications of IoT use location based services (LBS). Smart City is one of the natural and most powerful applications in IoT [3], and it depends heavily on technologies like wireless sensor networks (WSNs) and RFIDs [4]. The job of WSNs is to sense environmental conditions, which can involve numerous variables like pressure, heat, noise, pollution, humidity, lighting, movement, leakage, sounds, images and so on [5]. When combined, WSN and RFID transform artifacts of a designated space into smart objects, which can be used to share data with other objects as well as human beings.
The associate editor coordinating the review of this manuscript and approving it for publication was Danping He.
As expected, these smart tools are also limited in terms of energy, storage, and processing capacity [6]. Smart City applications rely on giant data centers, service providers and Clouds to provide a suitable environment to store and process the data generated by smart devices and objects. Thus, most smart city applications (whether related to environment, society, energy, health, economy, transport, etc.) rely on cloud computing [7], [8]. Cloud computing processes, stores and analyzes data about users, and then tries to discover new knowledge and features that help smart cities to improve these services, thus providing smarter apps which are better adapted to each user [9].

A. PROTECTING AND PRESERVING PRIVACY
Preservation of security and privacy in smart cities has now assumed greater prominence as it poses an ominous threat to the future of smart applications and objects. As these smart things often work with and transmit sensitive data to faraway locations (clouds), sensitive data is susceptible to being hacked. In some situations, sensitive information in the wrong hands can cause devastation to users or the people associated with the information [10], [11].
A service provider (SP), normally a cloud, is always in an advantageous position to garner a significant amount of information about habits, behavior, personality, mindset, and ambitions of users or customers [12]. For example, while searching for points of interest, location based services such as Smart Street, Smart Car, Ubiquitous Health, and Smart Alert enable the SP to track the user's location and find their whereabouts at a given time [13], [14]. An attacker can use the location of the user to gain vital information such as the user's identity, home location, and habits, in addition to details of their job, religion, social leanings, and health data. As the number of IoT devices continues to climb, so does the vulnerability to data intrusion [15]. Thus, the protection of users' ID and location from the SP is crucial to preserve the privacy of users.

B. PRIVACY: THE FOCUS OF OUR RESEARCH
This article mainly addresses the privacy of users from the SP in an LBS environment. The concepts of privacy and security are sometimes mixed up, although they are different. As shown in Table 1, privacy seeks to prevent miscreants from tracing, linking or identifying users' personal data. On the other hand, security protects the confidentiality and integrity of data and the availability of services for applications. In view of [16]-[18], privacy is "The right of the user to determine when, where, why, how, and who can access and use their data". Protection of privacy is a complex problem in many situations, but it could be achieved by using methods that hide the user's identity from attackers or malicious parties, such as the SP in the LBS environment, to prevent users from being profiled [19]. In other words, when sending a query to the SP, the user must ensure that the SP is denied access to any link which could provide it with useful information about the user [20].

C. CONTRIBUTIONS AND STRUCTURE OF THIS PAPER
In this article we present a new method to protect the privacy of users by utilising a multi-swapping scheme in the LBS space involving peers and fog nodes (fogs). We call this method the 'Swapping of Peers and Fogs' (SPF). Details of the SPF method, including the swapping scheme and its justification, advantages, resilience, novelty and limitations are provided in Sections 3, 4 and 5.
In section 2, a literature review is presented, wherein Cloud and Fog computing are discussed. These technologies play a pivotal role in the SPF method. Also, a brief discussion of the strengths and weaknesses of the existing methods is provided, which are vital for comparing these methods with the SPF.
In section 3, we formally introduce the SPF method and offer justifications for its swapping scheme. The SPF method can be used in a variety of situations depending on the user requirement, which we analyse in section 4. In section 5, we present two algorithms to summarise the operational details of the SPF method. The first of these demonstrates how the swapping scheme works, and the second describes the cache management.
In section 6, we compare the swapping schemes of the SPF and other methods, and discuss the superiority, resilience, and limitations of the SPF method in detail. In section 7, we provide an example of deployment of the SPF in the case of connected vehicles, as well as an application of the SPF in a Smart City. In section 8, we analyse management issues associated with the swapping scheme, including the estimation of delay, disruptions, and the routing scheme of the SPF method. In section 9, by means of a set of performance metrics, we analyse the swapping scheme of the SPF and compare it with the other methods. In section 10, we provide simulations to compare the performance of the SPF method with the existing methods, including the details of the experiments.
Acronyms used throughout this article are presented in Table 2.

II. LITERATURE REVIEW
In this section we present a list of technologies and tools, and offer justification for using them. We also provide a summary of some existing privacy methods, including their weaknesses and relative performance. These methods are referenced several times throughout this article.

A. CLOUD AND FOG COMPUTING
Cloud and fog play a critical role in the SPF method. The service provider, SP, is a cloud, and fogs are used in the swapping scheme. The number of devices that are connected to the internet is estimated in the billions [21]. With so many devices, cloud computing can no longer provide a prompt response to the huge number of smart applications, especially medical apps, which are time-sensitive [22], [23]. To meet such requirements, many solutions such as mobile clouds, multiple clouds and fog computing emerged [24] in 2012. Fog computing, whose features are better suited to fast responses than those of cloud computing, is part of the solution.
In typical applications, fog nodes (which we shall simply call 'fogs') are widely distributed at the edge of the network, next to the IoT devices (perception layer), which is closer to the user [25]. This setup manages a cluster or region with tools to provide responsiveness, especially in emergencies, as well as initial processing of data before sending it to a cloud for permanent storage. Fogs can store data for a short period of up to two hours, which is usually enough for nodes to collect and summarize the data [26]. The next step for the fogs is to send the data directly to the cloud, eliminating the need for hundreds of connections from numerous devices to interact with the cloud every few seconds. Batch transmission also plays a significant role in reducing load and in improving performance and privacy within applications that use this technology [27]. Fog nodes form a hierarchical structure and share information with the core fog. The core fog is headed by the cloud, which is a distributed structure instead of a centralized one [28]. The main differences between fog and cloud structure are summarized in Table 3. For more details, refer to articles [25]-[27], [29]. From Table 3, it is evident that fog computing cannot be a substitute for cloud computing, but with their integration, a higher level of services, applications and features can be provided.
A user query, when submitted to the LBS, has a number of components, which are shown in Table 4. The rapid and massive growth in the number of IoT objects, spread all around us and across cyberspace, has resulted in heightened security and privacy concerns [30], [31]. Many of the existing protection techniques rely on the SP as a trusted party, and only focus on external attackers. Since the SP cannot be trusted, any privacy technique reliant on the trust of the SP is not reliable.
However, the trust of the SP is not critical in the application of data security protection methods, if transmission occurs and nicknames for users are used. Accordingly, advanced methods based on the trust of the SP should have an inbuilt system to alert users to grant permission for their data to be accessed [32]. How to avoid dependence on the trust of the SP has been an open problem, which has recently been addressed in the Blind Approach [33], by using a pair of keys in addition to the third party, and in the Double Obfuscation Approach (DOA) [34].

B. EXISTING APPROACHES, THEIR STRENGTHS AND WEAKNESSES
In this subsection we review existing approaches with the specific aim of highlighting their strengths and weaknesses.
There are many approaches and methods to preserve privacy but most of them suffer from one or more anomalies [16], [35]. Moreover, some of the existing approaches have given rise to challenging issues and open problems concerning performance, trust, and the impact on the core service and applications they provide. Most of the existing approaches depend on the trust of service providers, which is a major weakness and a serious deficiency. Performance of the main approaches against several criteria is summarised in Table 5. Their description and weaknesses, which follow, are relevant to our research in this article.
1) DUMMY APPROACH [36]
Purpose: The main purpose of using a dummy is to conceal the real query by mixing it with a set of dummy (unreal) queries to mislead the SP. This method can be used to protect the query or location. The SP will not be able to identify the actual query, and hence would be misled to collect inaccurate information about the users.
Hypothesis:
• Users are able to create dummy queries by themselves.
• User resources enable them to create 30 dummy queries for every real query.
Weakness:
• This approach causes overhead on the user as well as the SP as the number of dummy queries grows.
• After observing for a while, the SP can distinguish the user from others.
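The overhead of the Dummy Approach can be made concrete with a minimal Python sketch. It is an illustration only and not the implementation from [36]: the function name `build_dummy_batch`, the `poi-*` query strings and the way dummies are drawn from a candidate pool are all assumptions made for this example.

```python
import random


def build_dummy_batch(real_query, candidate_queries, k=30, rng=None):
    """Mix one real query with k dummy queries and shuffle the batch.

    Hypothetical helper: dummies are drawn at random from a pool of
    plausible queries, which is an assumption of this sketch.
    """
    rng = rng or random.Random()
    pool = [q for q in candidate_queries if q != real_query]
    if len(pool) < k:
        raise ValueError("not enough candidate queries to draw k dummies")
    batch = rng.sample(pool, k) + [real_query]
    rng.shuffle(batch)  # the real query must not sit at a fixed position
    return batch


candidates = [f"poi-{i}" for i in range(100)]
batch = build_dummy_batch("poi-7", candidates, k=30, rng=random.Random(0))
print(len(batch), "queries are sent to the SP for 1 real query")
```

With k = 30, as in the hypothesis above, the user and the SP each handle 31 queries for every real one, which is the overhead listed under Weakness.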
2) OBFUSCATION APPROACH [37]
Purpose: In this approach, the combination of the query and data of the user is changed before it is sent to the SP, unlike having to send many queries in the dummy approach. The level of privacy is related to the amount of noise and obfuscation on the query. Privacy can be increased at the cost of accuracy of results.
Hypothesis:
• Users are prepared to sacrifice the accuracy of results to protect their privacy.
• User has enough resources to recover the returned result.
Weakness:
• Increasing privacy would also increase the cost of processing. Newer Obfuscation techniques require the user to send their area instead of the location. But this method also adversely affects performance and cost. More importantly, Obfuscation is not suitable for smart street applications as it changes the locations of vehicles.
3) DOUBLE OBFUSCATION APPROACH [34]
Purpose: Double Obfuscation Approach (DOA) is a recent hybrid method to protect the privacy of users in LBS applications. It depends on obfuscation and Fog as the third party (TP) to enhance privacy compared to the traditional obfuscation, and addresses some drawbacks related to overhead and accuracy of results in the Obfuscation Approach [37]. To achieve that, it bifurcates the obfuscation area (one for the user and another for fog), and divides the returned results into five parts with the help of fog.
Hypothesis:
• Same as in the case of the Obfuscation Approach, with additional overhead for processing.
Weakness:
• The DOA application results in overheads on the user and server, and the approach does not provide adequate protection for the data of the query.
4) PRIVATE INFORMATION RETRIEVAL [38], [39]
Purpose: Private Information Retrieval (PIR) provides privacy by utilising a large amount of data from the SP.
Hypothesis:
• This method assumes that the user can access a huge amount of data from the SP without the SP.
• Assumes that the user has resources to store information of the whole city and deal with it.
Weakness:
• Accessing a huge amount of data from the SP may not be feasible at all times.
• This approach is not practical to use with smart devices of IoT, which have scarce resources.
• Some PIR techniques use encryption.

5) COOPERATION AMONG PEERS [40]
Purpose: The main goal of this approach is to reduce the number of contacts with the SP. In this approach, each peer in the same cell seeks the answer of their query from other peers, before sending it to the SP. In other variations of this method, peers collaborate with each other and send the same query to the SP to prevent profiling.
Hypothesis:
• Assumes that there are many users in each cell and most of them agree to send the same data to the SP.
Weakness:
• This is not suitable for smart street services.

6) CACHE APPROACH [41]
Purpose: This approach is similar to other approaches in caching some queries' answers, and reusing them to respond to future queries.
Hypothesis:
• Assumes that there is an open access point with self-management for storing the result of previous queries of users.
Weakness:
• This method is effective only when the cache-hit ratio is increased, which is proportional to the privacy and performance of the query.

7) BLIND THIRD PARTY PEERS [33]
Purpose: The BTP encryption is used by the Blind Approach, and its role is to change the identities of users.
Hypothesis:
• In this approach, the user avails all the benefits of using a third party (peer) without having to reveal any data to them.
Weakness:
• There is a possibility of collusion between the third party (BTP) and the SP to breach users' privacy.
• Encryption may cause overload on some users' devices.
• BTP encryption usually results in more power consumption by users' devices.

General Data Protection Regulation (GDPR) is an initiative of the European Union (EU), which came into effect on 25th May 2018. The purpose of enacting GDPR is to strengthen the privacy of users and clients by making it mandatory for service providers, who collect or manipulate data, to only access private information with prior permission from users or in a number of defined emergency situations. Moreover, the access control of the GDPR enables users to access and manage their data in the SP database, providing a simple solution to protect their privacy. However, after its promulgation, the GDPR has been facing compliance issues. Many service providers are not enforcing it the way it should be enforced. Moreover, so far, no one has been penalised for not enforcing or misusing the GDPR. As a result, many malicious parties are still able to violate user privacy in their domain. In other words, despite being a very good initiative, the GDPR is not effective enough, and hence we still need other means and methods to ensure the privacy of users. More details, including the principles of the GDPR, are available in [42]-[44].

III. A NEW METHOD TO PROTECT PRIVACY IN IOT APPLICATIONS
Use of Location-based services (LBS) is widespread. Amongst others, these services are used in smart cities and IoT applications including smart streets, self-lighting, pedestrian safety, smart parking, connected cars, medical applications and emergency response, congestion handling, alerts and warnings for drivers and other road users, remote surveillance, location related advertisements, search for points of interest, and smart signals.
In this article we propose, and provide details of, a new method to preserve users' privacy in IoT applications. We call this method the "Swapping of Peers and Fogs" (SPF). The LBS environment can be regarded as a set of clusters/cells, each having a fog and a number of users (peers), who can cooperate with each other. With the SPF, swapping can take many forms involving peers and fogs. Ideally, a user, P1, would send their query to a peer, P2 (surrogate peer), who would relay it to the fog node (fog), N1, in cluster C1, who would send it to another fog, N2, in cluster C2. Then N2 would send it to another peer, P3 (submitting peer), in C2, who would submit it to the LBS. Indeed, the main purpose of the swapping scheme of the SPF method is to temporarily change the ownership (identity) of the query using a surrogate peer and a submitting peer, successively. The result of the query is sent back to P1 by the same route in the opposite direction. In this scenario, it is evident that the privacy of P1 would be protected from the SP, N1, N2 and P3 but not from P2.

A. JUSTIFICATION FOR THE SWAPPING SCHEME
A number of existing privacy methods use some form of swapping involving peers. The P2PCache Approach [45] uses a swapping scheme between the user and another peer in the same cell to create a smart dummy (a query submitted by a peer on behalf of the owner), to change the identity of the owner of the query before sending it to the SP. This is very similar to the swapping between the user and surrogate peer in the SPF method, except that in the P2PCache the same peer acts as a surrogate as well as submitting peer. As a result, the user and the cooperator peer would be located in the same cell, making the user's location vulnerable. Incidentally, none of the methods developed before the SPF method resolved this issue.
Indeed, the swapping scheme of the SPF addresses the issue of submitted location. When a query is submitted to the LBS, the SP will not get information about the real owner, P1, and instead would receive misleading information (query and location) about the submitting peer P3. So, there is no way for the SP to find out any information about P1, who would be quite far from P3 and would appear to the SP to be in a different cell. Existing approaches (Dummy [46], Obfuscation [47], [48], PIR [39], and Cooperation [49]) have also used some techniques to preserve privacy but each of them has created serious issues. These issues are successfully resolved by the swapping scheme of the SPF method.
In section 8, we analyse a number of aspects of the swapping scheme used in the SPF method, including efficiency, the extent of the processing delays, management of the processing in case a participating peer leaves the LBS area without completing their assignment, and estimates of the times for the forward and backward routing.

IV. DIFFERENT SCENARIOS OF SWAPPING IN THE SPF METHOD
The SPF facilitates several combinations of swapping between peers and fogs. In an ideal situation, the first swapping would occur between the user and the surrogate peer to convert the real query of the user into a smart dummy in the same cell. In the second swapping, the surrogate peer (smart dummy) would transfer the query to a fog in the same cell. The third swapping would occur when one fog transfers the query to another fog, which in the fourth swapping would transfer it to the submitting peer. This is only one scenario of swapping. There are several other possibilities of which we only describe five. These scenarios take into consideration the possibilities of extraordinary situations at the time of applying the method. On the other hand, they demonstrate that the SPF method is flexible and adaptive. At any given time, a user might face one of the following (rare, but possible) situations, some of which will be taken into consideration in the discussion of the five scenarios.
• In an unlikely situation, a user is alone in the cell and has to connect to the SP directly.
• A peer in the same cell is, in some situations (e.g., a flat battery), unable to cooperate.
• If a user does not trust peers in the same cell or trusts Fog more than peers, the query may be sent directly to the Fog.
• A user does not trust the Fog. In such a case, they can increase the cooperation amongst peers in the same cell.
• A user wants to enhance the level of privacy. In such a case, they may increase cooperation amongst peers at the expense of creating overload.
a: TASK: A user wants a query (Q) to be processed by the SP, but at the same time does not want to disclose the ID, location or query to the SP, the submitting peer or the two fogs.
b: NOTATION: The LBS area is divided into different clusters/cells, which we denote as C1, C2, C3, C4, ..., Cn. Each Ci is provided with a Fog Node Ni, and may contain a number of Peers (users) P1, P2, P3, P4, ..., Pn at a given time. Following are the different phases of swapping which can take place involving Peers and Fogs.
1) First Swapping: P1 sends Q to another peer P2 in order to hide the real ID of P1 from N1.
2) Second Swapping: P2 sends Q to N1 in a cell C1.
3) Third Swapping: N1 sends Q to N2 in a nearby cell C2.
4) Fourth Swapping: N2 sends Q to a peer P3 in C2 (submitting peer), who submits it to the SP.
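The four phases can be traced with a short Python sketch. It is a minimal illustration under stated assumptions, not the algorithm of Section V: the `Query` class and the helper names `forward_route` and `seen_by` are hypothetical, and parties are represented only by their labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    text: str
    location: str  # location attached to Q; it is not changed by any swap


def forward_route(query, user, surrogate, fog_1, fog_2, submitter):
    """Return the hops (sender, receiver, query) for the four swaps plus submission."""
    chain = [user, surrogate, fog_1, fog_2, submitter, "SP"]
    return [(chain[i], chain[i + 1], query) for i in range(len(chain) - 1)]


def seen_by(hops, party):
    """Labels of the parties that hand the query directly to `party`."""
    return [sender for sender, receiver, _ in hops if receiver == party]


q = Query(text="nearest charging station", location="C1")
hops = forward_route(q, "P1", "P2", "N1", "N2", "P3")
for sender, receiver, _ in hops:
    print(f"{sender} -> {receiver}")
print("The SP receives the query from:", seen_by(hops, "SP"))
print("N1 receives the query from:", seen_by(hops, "N1"))
```

In this trace only P2 receives the query directly from P1, which matches the observation that the surrogate peer is the one party from which P1 is not hidden.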

A. FIRST (MAIN) SCENARIO OF THE SPF APPLICATIONS
This is the best case scenario. As shown in Figure 1, it involves all four phases of swapping. This scheme of swapping provides complete protection of privacy (ID, location, and query) to P1 from the SP, the submitting peer and the fogs, but not from the surrogate peer P2, which we discuss in the next section. It should be noted that the location of the query Q will remain the same in all phases of swapping.

B. SECOND SCENARIO OF THE SPF APPLICATIONS
Figure 2 shows the second scenario, in which N2 in C2 does not have a suitable peer, and so N2 sends the query to the SP directly. This would enhance performance but the privacy protection, compared to the first scenario, would be slightly less because the SP could be curious about the source of the query.

C. THIRD SCENARIO OF THE SPF APPLICATIONS
If there is no suitable peer in C1 then P1 can send Q directly to N1, who would forward it to N2, who would then assign it to P3 to deal with the SP. In this scenario the privacy of P1 would still be protected from the SP but N1 can access some information about P1 in its cell. Details are shown in Figure 3.
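The first three scenarios differ only in whether a surrogate peer is available in C1 and whether a submitting peer is available in C2. The following Python sketch builds the forward route for each case. The function names `build_route` and `first_contact` are hypothetical and introduced here only for illustration.

```python
def build_route(surrogate=None, submitter=None):
    """Forward route of a query from P1 to the SP.

    surrogate: label of a cooperating peer in C1, or None if there is none
    submitter: label of a cooperating peer in C2, or None if there is none
    """
    route = ["P1"]
    if surrogate is not None:
        route.append(surrogate)      # first swapping (skipped in the third scenario)
    route += ["N1", "N2"]            # the fog-to-fog swap is kept in all three cases
    if submitter is not None:
        route.append(submitter)      # fourth swapping (skipped in the second scenario)
    route.append("SP")
    return route


def first_contact(route):
    """The party that receives the query directly from P1."""
    return route[1]


scenarios = {
    "first": build_route(surrogate="P2", submitter="P3"),
    "second": build_route(surrogate="P2"),
    "third": build_route(submitter="P3"),
}
for name, route in scenarios.items():
    print(name, " -> ".join(route), "| first contact:", first_contact(route))
```

The first contact is P2 in the first two scenarios and N1 in the third, which is why N1 can access some information about P1 only in the third scenario.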

D. FOURTH SCENARIO OF THE SPF APPLICATIONS
It is possible that, during the processing of a query, a cooperator peer abandons the process and exits the LBS area without completing their task. If this happens, the user would have to start the process again, which would cost additional time. To avoid such a situation and ensure timely resolution, the user can send the query to two (surrogate) peers. In this case, there would be two parallel processes for the same query, as shown in Figure 4. It is highly unlikely that both of the processes would fail. However, the duplication of the process would strain resources.
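A small worked example shows why two parallel processes rarely fail together. It rests on an assumption that is not stated in the text, namely that each process fails independently with the same probability; the function name and the value 0.1 are illustrative only.

```python
def all_fail_probability(p_fail, processes):
    """Probability that every parallel process fails.

    Assumes independent failures with equal probability p_fail,
    which is a simplification made for this sketch.
    """
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be a probability")
    if processes < 1:
        raise ValueError("at least one process is required")
    return p_fail ** processes


for n in (1, 2, 3):
    print(n, "process(es):", all_fail_probability(0.1, n))
```

Under this assumption, a second process lowers the chance of having to restart from p to p squared, while doubling the number of messages, which is the trade-off described above.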

E. FIFTH SCENARIO OF THE SPF APPLICATIONS
In this case, P1 trusts N1 more than peers. Therefore, N1 would manage the swap between peers. As shown in Figure 5, in this case only one fog N1 would be used. As a result, the location of P1 would not be protected.
It should be noted that using a submitting peer in a different cell (like P3 in the First Scenario) boosts the level of privacy (by misleading the SP about the location) without impacting the distribution of users in the cells/clusters of a smart city. So, the accuracy of the main services of a smart city such as Smart Street will not be affected after applying the SPF. This is one of the major differences between the SPF and earlier approaches to protect privacy. To avoid any adverse effect on performance, the SPF uses the cache of the fog node. The query usually passes through two fogs before contacting the SP. In other words, the SPF gains in both effectiveness and cache-hit ratio compared with existing approaches, which only deal with one cache. Use of fog in addition to clouds boosts efficiency for managing peers, caches, and other operations. Moreover, the SPF applies a Bloom filter (hash function) before searching the cache, to avoid the time lost on cache misses [50]. Furthermore, by default, the SPF only stores real queries in the cache (without dummies and noise), which enhances the cache-hit ratio.

V. ALGORITHMS FOR THE SPF METHOD
Here we provide two algorithms, which describe the processing of the SPF method. The first algorithm demonstrates the navigation of the swapping between peers and nodes, as described in the First Scenario (Figure 1), and is also included in the General Case of Vehicles (Figure 6).
The second algorithm describes the management of the pair of caches inside the SPF. As each fog has a cache, this algorithm makes use of them. We use a Bloom filter (hash table) to check whether the query exists in the cache. If the answer is affirmative, we change the position of the query and make it the first item in the list, with MAX_ID. In case of a negative answer (miss-hit), we wait for the response of the query from the SP, upon receipt of which we delete the last query in the list (which has MIN_ID), and insert the new one with (MAX_ID + 1) in the first position. This process keeps the items with a higher number of requests in the cache list.
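The cache management described above can be sketched in Python as follows. This is a minimal illustration and not the authors' algorithm: the class names `BloomFilter` and `FogCache`, the filter size, the number of hash functions and the capacity are all assumptions. An `OrderedDict` stands in for the ID-ordered list, with the most recently used entry (MAX_ID) kept at the end and the eviction candidate (MIN_ID) at the front.

```python
import hashlib
from collections import OrderedDict


class BloomFilter:
    """Simple Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


class FogCache:
    """Fixed-size cache of query answers, guarded by a Bloom filter."""

    def __init__(self, capacity=3):
        self.capacity = capacity
        self.items = OrderedDict()  # last entry = most recently used (MAX_ID)
        self.filter = BloomFilter()

    def lookup(self, query):
        if not self.filter.might_contain(query):
            return None  # definite miss: the cache list is not searched
        if query in self.items:
            self.items.move_to_end(query)  # a hit becomes the newest item
            return self.items[query]
        return None  # filter false positive, or the entry was evicted

    def store(self, query, answer):
        if query not in self.items and len(self.items) >= self.capacity:
            self.items.popitem(last=False)  # evict the oldest item (MIN_ID)
        self.items[query] = answer
        self.items.move_to_end(query)
        self.filter.add(query)


cache = FogCache(capacity=2)
first_lookup = cache.lookup("q1")   # miss: the answer must come from the SP
cache.store("q1", "a1")
cache.store("q2", "a2")
hit = cache.lookup("q1")            # hit: q1 becomes the newest item
cache.store("q3", "a3")             # cache is full, so q2 (the oldest) is evicted
```

A standard Bloom filter does not support deletion, so an evicted query still passes the filter in this sketch and costs one extra lookup in the cache list; a counting filter or a periodic rebuild would be one way to avoid that.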

VI. COMPARISON, ADVANTAGES, RESILIENCE AND LIMITATIONS OF THE SPF METHOD
The SPF method provides an efficient way to protect the privacy of the user's identity, location and the query from the service provider in the LBS environment such as smart cities. The level of protection is proportional to the number of users at the time of processing. The most important characteristic of this approach is the scheme of spreading fogs and using a pair of caches in smart cities, which are exploited to play a pivotal role in protecting the location of users from the service provider.

A. COMPARISON OF SWAPPING OF SPF AND OTHER METHODS
In case of the SPF, each submitting peer can only send one query, belonging to some other user. This mechanism eliminates the need to generate dummy queries or for the user to send their own query to the SP. As a result, the SP would not receive any credible information about the user. On the other hand, the swapping schemes in the Obfuscation and P2PCache approaches, which use smart dummies, suffer from the following two anomalies.
1) The responsibility of managing the peers in each cell is assigned to the fog of that cell.
2) The first (normal) swap between the user and surrogate peer in the same cell is used only to protect the privacy of the user from the Fog node of the cell (in case the node is malicious).
3) The second swap, which occurs between two nodes (of different cells), is the most critical in the SPF Method, because it would transfer the query from cell C1 to cell C2, where N2 would send it to P3. When P3 submits the query to the LBS, the SP would be misled into recording the ID of P3 with the location and query of P1 in C1. In this way, privacy of the location of P3 would also be protected, which is further discussed in section 6C.
4) The cache in each of the two fogs (double cache) would enhance the privacy and performance of the query.

B. SUPERIORITY OF THE SPF METHOD
The double swapping scheme of the SPF method encourages peers to cooperate with each other. In doing so, they boost their level of privacy. In particular, when P3 submits the query of P1, the query becomes a smart dummy, and hence the SP would record the wrong information about P3. In other words, not only does the owner of the query, P1, get full protection but the privacy of the submitting peer, P3, also increases. The first swap between P1 and P2 also removes the need to trust N1. The SPF method also has the following advantages:
• Provides three tiers of protection for privacy, and there is no significant impact on performance.
• Addresses the problem of the dummy approach by creating smart dummies without creating an overload on the system.
• Solves the problem of the trusted third party (TTP) approach by employing multiple fog nodes without having to trust them; the fogs also facilitate the management of peers.
• Addresses the Double Obfuscation and Obfuscation Approach issues without affecting the accuracy of processed results or creating an overhead to the user.
• Provides a solution to the caching technique problem by improving the hit ratio (because the cache only contains real queries of users) and using the bloom filter with the help of an available pair of caches.
• Offers a solution to the problem of the cooperation approach, in which cooperating users are located in the same homogeneous area and within close vicinity.
• Provides resistance to most types of attacks.
• Semantic Context: When an attacker has some additional information like the profession or the age of the user, then the attacker can use this information to break into the protection technique. However, in the SPF, the user would not deal with the SP at all, removing the possibility of such attacks.
• Homogeneity Attack: If the area of protection is homogeneous (has one stamp or same type of building), the obfuscation or cooperation among peers will not be useful. But the SPF uses additional swap among fogs to change the area of the user completely. Hence, a Homogeneity attack would not be successful in this case.
• Path Tracking: An attacker may try to draw a path of the user's positions over time (historical data), to detect the direction and target of the user. This becomes easier with minor obfuscation, or cooperation among close peers. However, the SPF uses swapping between two different areas with different nicknames, ensuring that the attack would not be effective in such cases.
• Inversion Attack: If an attacker has information about the protection method, they can access the private data by breaching the protection. However, in the SPF, despite knowing the steps of the SPF, the SP cannot link the received data to real users.
• Knowledge of Map: The attackers can use their skills or knowledge about the map to eliminate the dummy queries or noise from the obfuscated ones. However, in the SPF, real queries are known only to the real users, so the attacker cannot do anything here.
• Malicious Peer: This is an open problem found in the cooperation approach. In the SPF, the nickname mechanism solves this issue, and dealings with the same peers rarely happen in the mobile objects environment.
In section 9, we provide formal analysis, which further elaborates the resilience of the SPF method.

D. LIMITATIONS OF THE SPF METHOD
There is no method which can effectively protect privacy from all nodes in the LBS environment, or serve all types of applications of IoT. Some of the available methods rely either on the trust of the SP or some other node, whereas others suffer from a range of operational anomalies. In the case of the LBS environment, the most dangerous node to breach the user privacy is the SP, followed by the two fogs (N1, N2). The least dangerous are the surrogate and submitting peers (P2 and P3). In view of the preceding discussion, the SPF method protects user privacy from the SP, the two fogs, and the submitting peer but not from the surrogate peer. As a user randomly chooses a surrogate peer, it is highly improbable that a user would choose the same peer again to act as a surrogate. In general, the chances of the surrogate peer being malicious are slim. Nevertheless, the exposure of the user's privacy to the surrogate peer is indeed a limitation of the SPF method. There are some other minor disadvantages of the SPF method, which are listed below.
• There may arise, although very rarely, a situation when there are no peers in the application area. In such a case, the user should either randomly generate dummies or rely on the swap of the fog node only.
• If, for some reason, the surrogate peer P 2 leaves the LBS environment before the result of the query comes back, the user can choose another peer, as discussed in detail in Section VIII-B.
• Despite using fog and cache, the SPF process may cause delays in some user queries, although the effect would not be noticeable in the system overview. The delay can happen if there is no other peer in the same cell and the user decides to wait for a peer. Although rare, it would still be a possibility. Instead of waiting for a peer, the user can deal with the fog node directly like in the Third Scenario shown in Figure 3.
• The system focuses on the protection of privacy, not on ensuring the credibility of the processed results, which is an issue related to the reputation algorithms of both the service provider and the fogs responsible for providing services. If a service provider tampers with the results, this would immediately be detected and reported. As a result, the service provider would lose users after a short time. We shall deal with this issue in our future studies.

VII. APPLICATIONS OF THE SPF METHOD
In this section we present the general form of the case of vehicles, and then an application in Smart Streets.

A. GENERAL CASE OF VEHICLES
We first present the general form of the case of vehicles.
1) A vehicle with a query (Q) would generate a random nickname like P 1 .
2) P 1 would swap its query with another vehicle P 2 in the same zone.
3) P 2 would send P 2 .Q to a fog node N 1 in its cell.
4) N 1 would swap P 2 .Q with another fog node N 2 in another cell C 2 .
5) N 2 would assign P 2 .Q to vehicle P 3 .
6) P 3 would send P 3 .Q to the SP.
7) The SP would return the results to P 3 .
8) P 3 would forward them to N 2 .
9) N 2 would provide the results to N 1 according to the ID of the query.
10) N 1 would return the results to P 2 , which in turn would return them to P 1 .
11) Fogs would save the results in their caches to answer future queries without having to deal with the SP again.
12) For another query, the vehicle can deal with another peer and use another nickname, to prevent any possibility of linking data to vehicles over time.
13) Then the fogs would search in their caches before contacting the SP.
In the previous scenario, if an intruder (an outside attacker or a malicious SP, fog, or peer) traces any vehicle such as P 1 over that time, the attacker will build a false profile of P 1 and the other vehicles, and will obtain random paths for each one (P 1 , P 2 , etc.) (see Figure 6).
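The steps above can be sketched in a few lines of Python. This is only an illustrative model under simplifying assumptions, not the implementation used in our experiments: the `Fog` class, the `nickname` and `service_provider` helpers, and the `hops` trace are hypothetical names introduced here, and peers are represented only by their nicknames.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class Fog:
    """A fog node with a result cache keyed by query text (hypothetical model)."""
    name: str
    cache: dict = field(default_factory=dict)


def nickname() -> str:
    """Return a fresh random nickname so successive queries are hard to link."""
    return "P-" + secrets.token_hex(4)


def service_provider(sender: str, query: str, sp_log: list) -> str:
    """Stand-in for the SP: it records who submitted the query, then answers."""
    sp_log.append((sender, query))
    return f"result({query})"


def spf_query(query: str, n1: Fog, n2: Fog, sp_log: list):
    """Route one query through the SPF steps and return (result, hops)."""
    p1 = nickname()                      # step 1: the user picks a nickname
    p2 = nickname()                      # step 2: swap with surrogate peer P2
    hops = [p1, p2, n1.name]             # step 3: P2 sends the query to N1
    if query in n1.cache:                # step 13: cache lookup before the SP
        return n1.cache[query], hops
    hops.append(n2.name)                 # step 4: N1 swaps the query with N2
    if query in n2.cache:
        n1.cache[query] = n2.cache[query]
        return n1.cache[query], hops
    p3 = nickname()                      # step 5: N2 assigns the query to P3
    hops.append(p3)
    result = service_provider(p3, query, sp_log)   # steps 6-7
    hops.append("SP")
    n2.cache[query] = result             # steps 8-9 and 11: N2 caches the result
    n1.cache[query] = result             # steps 10-11: N1 caches and returns it
    return result, hops
```

Calling `spf_query` twice with the same query and the same pair of fogs contacts the stand-in SP only once; the second call is answered from the cache of N 1 , and the SP log only ever contains the nickname of the submitting peer.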

B. AN APPLICATION OF THE SPF IN SMART STREETS
Smart streets are among the most important applications provided by smart cities, and they include a large number of diverse services. The most important of these services include automated handling of congestion and flow management, ubiquitous medical services, immediate response to emergencies, easy and accurate search for points of interest, self-lighting, pedestrian safety, smart parking, pollution, noise and leak sensors, alerts and warnings for drivers, remote monitoring, location-based advertisements, monitoring of violations, and others [54].
A characteristic of all the aforesaid services is that they depend on the location service. For example, to solve the problem of congestion in smart streets, the services might rely on automated and continuous counts of the number of vehicles in each area. These counts depend on the current locations of the vehicles, which connect to the SP responsible for guiding them to less congested roads [55]. To protect privacy in this case, the first option is to rely on third-party trust, which is not a real solution. The second option is to rely on one of the dummy or jamming methods, which protect the location. But in this kind of application, the use of dummies would distort the number of vehicles, and the use of noise would distort the distribution of vehicles in the region. This means that the protection technology would degrade the quality and effectiveness of the core congestion-handling service. The steps are shown in Figure 6. Using the SPF in this setting provides several benefits:
• Benefit from the presence of fogs distributed in smart streets.
• Achieve protection of identity, location and query together.
• The protection approach does not affect the distribution rates of vehicles in each area due to the process of double swapping between the fogs where the same number will remain at each node.

VIII. MANAGEMENT OF QUERY PROCESSING AND PEERS IN THE SPF METHOD
In this section, we (a) provide an estimate of the possible delay in query processing by the SPF scheme, (b) discuss the management of cooperating peers, and (c) explain the reverse routing process from the LBS to the user.

A. ESTIMATION OF POSSIBLE DELAY CAUSED BY THE SPF PROCESS
It is known that searching in the DB of the SP (T s ) takes more time than searching in the cache (T c ), where, in the worst-case scenario, T c = Time Access Cache × Number of Elements. However, for our calculations, we regard T s and T c as equal, and so we do not take them into account in our comparison. Moreover, we use a Bloom filter, whose check costs 1ms, to avoid searching the cache when there is no result. However, if the query already exists in the cache, the search time depends on the size of the cache (in our experiment, with a 100kB cache, the search time was 50ms). The average time for a small piece of information (like a Ping Test) to travel over an online 4G connection to the SP is about 100ms (which may vary a little according to the speed of the connection), whereas a similar query over an offline wireless connection takes about 10ms. Using these figures, we can estimate the possible delay that would occur in the SPF process. In the main scenario of the SPF, we have four traversals over a wireless connection before the final internet connection is established with the SP. In this case, we have the following time estimates.
• Time taken by the query without protection will be T 1 = A (Example A = 100ms)
• Time taken by the SPF to process the query in a worst case scenario will be T 2 = T 1 + 4 * B (Example B = 10ms)
• Time taken by the SPF process when the result is in the cache of a fog node (1 or 2) would be T 3 = 3 * B
• If H = cache hit-ratio, the total time for processing N queries will be N*T 3 *H + N*T 2 *(1 − H)
To substantiate the foregoing discussion, we conducted a small experiment to quantify the delay caused by the SPF process in milliseconds (ms) by using a Ping Test (with a 4G network). We repeated the experiment ten times and calculated the average of each measurement, with the following results: the first four traversals in the forward routing took 4B = 4*10 = 40ms in total (WiFi), and the last connection with the SP took A = 100ms (Internet). Thus, the total time, denoted by TT, comes out as TT = 40 + 100 = 140ms. So, there is a 40ms delay per query after using the SPF method when the query result is returned from the SP in the normal way. However, if the result exists in the cache, there is no need to connect to the SP, and hence the total time would be less than 100ms, resulting in no delay.
To address the issue of delay, the SPF method depends on the double cache of fog nodes. Figure 8 shows the total time for 10 queries with different H values. It should be noted that, in normal cases, if H = 0.36 then the SPF will not cause any delay, and if H < 0.36 then there will be some delay. On the other hand if H > 0.36, then the SPF will enhance the performance of the query.
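The time estimates above can be checked with a short calculation. The following sketch assumes the example values A = 100ms and B = 10ms given earlier; the function names are illustrative only.

```python
def spf_times(a_ms: float = 100.0, b_ms: float = 10.0):
    """Return (T1, T2, T3): unprotected, SPF worst case, SPF cache hit."""
    t1 = a_ms
    t2 = t1 + 4 * b_ms
    t3 = 3 * b_ms
    return t1, t2, t3


def total_time(n_queries: int, hit_ratio: float,
               a_ms: float = 100.0, b_ms: float = 10.0) -> float:
    """Total time for N queries: N*T3*H + N*T2*(1 - H)."""
    _, t2, t3 = spf_times(a_ms, b_ms)
    return n_queries * t3 * hit_ratio + n_queries * t2 * (1 - hit_ratio)


def break_even_hit_ratio(a_ms: float = 100.0, b_ms: float = 10.0) -> float:
    """Hit ratio H at which the SPF average time per query equals T1.

    Solving T3*H + T2*(1 - H) = T1 for H gives (T2 - T1) / (T2 - T3).
    """
    t1, t2, t3 = spf_times(a_ms, b_ms)
    return (t2 - t1) / (t2 - t3)
```

With the example values, the break-even hit ratio is 40/110, or about 0.36, which agrees with the threshold quoted above.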

B. WHAT IF PARTICIPATING PEER ABANDONS THE PROCESS MIDWAY?
In a real and dynamic environment, like the LBS and Connected Vehicles, there is a possibility that the surrogate or submitting peer leaves the environment before completing their part of the process. Such a situation creates a challenge to any privacy scheme. This problem is dealt with in the following two ways:

1) USE TWO SURROGATE PEERS
If the query is time sensitive, employ two surrogate peers, and hence duplicate the process in parallel. As described in the Fourth Scenario, and shown in Figure 4, it would ensure a timely response.

2) MANAGE WITH A PAIR OF CACHES
This is a way to manage the situation by exploiting a pair of caches, Cache 1 of N 1 and Cache 2 of N 2 , in the following four cases:
1) If P 2 abandons before sending the query to N 1
2) If P 2 abandons after sending the query to N 1
3) If P 3 abandons before submitting the query for processing
4) If P 3 abandons after submitting the query for processing
Let T be the average time to send the query for processing to the SP, and receive a response back. The value of T would depend on the type of application and environment. To determine the new response time in the above four cases, let Tcache 1 be the average time to send a query to N 1 and receive the response back.
1) First case: If P 2 leaves the LBS area before submitting the query to N 1 , then P 1 has to send the query again to another peer. The new response time in this case will be NT 1 = T + T .
2) Second case: If P 2 leaves the LBS area after submitting the query to N 1 , then, as in the first case, the user should resend the query to another peer. However, in this case the time would be NT 2 = T + Tcache 1 , which is less than NT 1 , because the result would already be available in cache 1 of N 1 .
3) Third case: Peer P 3 would rarely abandon the query before sending it to the SP, because N 2 would select P 3 by monitoring the entry time to C 2 . But if this does occur, then N 2 will send the query to another peer in C 2 , save the query result in cache 2 , and return the response to cache 1 of N 1 . So NT 3 = NT 2 = T + Tcache 1 , just like in the second case.
4) Fourth case: It is similar to the third case, and N 2 would deal with it accordingly. So, NT 4 = NT 2 = T + Tcache 1 .
To summarise the foregoing discussion, if a cooperating peer leaves the LBS area before completing the task, the additional delay needs to be taken into account, with the following two possibilities:
1) If the query is not sent to the SP, and therefore the user doesn't get the result back after 140ms, then they need to send the query again, in which case TT = 140 + 140 = 280ms, resulting in a net delay of 180ms relative to the 100ms unprotected query.
2) If the query is sent to the SP and the response arrives at N 2 , but the user doesn't receive the result back after 140ms, then they need to resend the query. This time, the result would already be in Cache 1 , so TT = 140 + 10 + 10 = 160ms, giving a delay of 60ms.
In general, in any protection method, there is a trade-off between the protection level and the processing time and performance/cost. In the future, we shall propose a novel idea to create a reputation for each peer, which would enable the user to only deal with trusted peers.
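The response times for the abandonment cases can be reproduced with the following sketch. It assumes that the user waits the full 140ms before resending, and that the two 10ms terms in the second possibility are two wireless traversals; the function names and default values are illustrative.

```python
def new_response_time(case: int, t: float, t_cache1: float) -> float:
    """NT for the four cases: NT1 = T + T, NT2 = NT3 = NT4 = T + Tcache1."""
    if case not in (1, 2, 3, 4):
        raise ValueError("case must be 1, 2, 3 or 4")
    return t + t if case == 1 else t + t_cache1


def total_time_after_retry(reached_sp: bool, tt_ms: float = 140.0,
                           hop_ms: float = 10.0, baseline_ms: float = 100.0):
    """Return (TT, net delay) in ms when the user resends after waiting tt_ms.

    If the first attempt never reached the SP, the whole process is repeated.
    Otherwise the retry is served from Cache 1 over two wireless traversals
    (an assumption made here to match the 10 + 10 terms).
    """
    retry_ms = 2 * hop_ms if reached_sp else tt_ms
    tt = tt_ms + retry_ms
    return tt, tt - baseline_ms
```

For example, `total_time_after_retry(False)` gives a total of 280ms and a net delay of 180ms, while `total_time_after_retry(True)` gives 160ms and 60ms.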

C. HOW WILL THE ROUTING INFORMATION BE MANAGED WHEN THE QUERY CROSSES THE PEERS, FOGS AND CELLS?
Backward routing of the processed query is the reverse of the forward routing, and both are listed below.
The peers in each cell will be managed by its fog, and each peer will have a different internal IP address to connect only with other peers in the same cell (WiFi) or with the fog itself to refresh the list of available peers. At the same time, each peer will have a separate internet connection (like 4G) to connect to the SP directly. Each cluster can be managed by its fog, and all fogs can be managed by the admin of the smart city.
To measure time, we use simulation (Packet Tracer) in addition to a real test on a small network, using ''Ping'' to check the time of sending and receiving the query, implemented as a short program in Visual Studio .NET. We also use ASP.NET (C#) to implement the fog functions (managing peers, swapping Q) and the search for the result in the cache by using a Bloom filter (hash table).
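The idea of placing a Bloom filter in front of the fog cache can be illustrated as follows. This is a minimal Python sketch rather than the C# code used in our tests; the class names, the filter size, the number of hash functions, and the use of SHA-256 to derive bit positions are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, n_bits: int = 1024, n_hashes: int = 3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits)  # one byte per bit, for clarity

    def _positions(self, key: str):
        # Derive n_hashes positions by salting the key with the hash index.
        for i in range(self.n_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        return all(self.bits[pos] for pos in self._positions(key))


class FogCache:
    """Hash-table cache guarded by a Bloom filter (hypothetical model)."""

    def __init__(self):
        self.filter = BloomFilter()
        self.table = {}

    def put(self, query: str, result: str) -> None:
        self.filter.add(query)
        self.table[query] = result

    def get(self, query: str):
        # A negative filter answer is definite, so the table is not searched.
        if not self.filter.might_contain(query):
            return None
        # A positive answer may be a false positive, so the table decides.
        return self.table.get(query)
```

A lookup for a query that was never stored usually returns immediately from the filter check, which corresponds to the cheap miss path described above.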

IX. ANALYSIS OF THE SWAPPING SCHEME
To analyse the efficiency and effectiveness of the swapping scheme of the SPF method, we need some measurable metrics. Well-known methods, namely the Dummy enhanced-CaDSA [36], Obfuscation [37], and Blind Third Party Encryption [33], have used six metrics to analyse their schemes: K-Anonymity, entropy (E), EE, cache-hit ratio (H), cost in time (T), and cost in size (S). We shall use the same metrics to analyse the scheme of the SPF method. In our analysis, we mainly focus on the user (P 1 ) and the submitting peer (P 3 ), to calculate the six metrics in relation to the SP as a perceived attacker, and compare them with the aforementioned three methods. First of all, we provide the values of these metrics in the case of query processing without any privacy protection method.

a: CASE OF QUERY PROCESSING WITHOUT PROTECTION
When there is no privacy protection, the metrics are as follows:
1) K-Anonymity = 1/1 = 1, which means that the SP knows that the query belongs to the submitting peer P 3 .
2) $E = -\sum_{i=1}^{K} P_i \log_2 P_i$, where $P_i$, the probability of the query belonging to the submitting peer, is 1, and so E = 0.
The size of a query is ... + Range(int16) = 4 + 8 + 8 + 50 + 2 = 72 Bytes, which is less than 1 KB for each query.
From this analysis, it is evident that to enhance privacy, K-Anonymity, E, EE, and H should be increased, while T and S should not be increased. These metrics in the above three methods are examined below.

b: CASE OF QUERY PROCESSING WITH DUMMY APPROACH
This approach [36] is based on creating many (K) dummy queries which are sent along with the real query. The six metrics can be summarised as follows:
1) K-Anonymity = 1/(1 + K), so increasing K would enhance the privacy.
2) E would be maximum if all the dummy queries were similar to the real query. In such a case, all queries would have the same probability, and $\mathrm{Max}(E) = \log_2(K + 1)$. But this cannot be achieved by the Dummy approach because it is very difficult to generate dummies similar to the actual query. In other words, E would be enhanced but not to the extent of Max(E).
3) EE = E * 100%, showing its dependency on the value of E.
4) H would be smaller because the dummy data is stored in the cache, which is an adverse effect.
5) Cost-Time = T * (K + 1), an indicator of adverse impact.
6) Cost-Size = S(Q) * (K + 1), another indicator of negative impact.
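The metrics of the Dummy approach can be evaluated numerically. The sketch below uses the formulas given in this section (Shannon entropy in bits, K-Anonymity = 1/(1 + K), and the two cost formulas); the function names and the sample values are illustrative.

```python
import math


def entropy(probabilities) -> float:
    """Shannon entropy in bits: E = -sum(P_i * log2(P_i))."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


def k_anonymity_dummy(k: int) -> float:
    """K-Anonymity of the real query when K dummies are sent with it."""
    return 1 / (1 + k)


def dummy_costs(t: float, s: float, k: int):
    """Return (Cost-Time, Cost-Size) for one real query plus K dummies."""
    return t * (k + 1), s * (k + 1)


def max_entropy(k: int) -> float:
    """Upper bound reached only if all K + 1 queries are equally likely."""
    return math.log2(k + 1)
```

With no protection the SP sees a single query with probability 1, so `entropy([1.0])` is 0. With K = 3 equally plausible dummies, `entropy([0.25] * 4)` reaches `max_entropy(3)`, which is 2 bits, at the price of four times the time and size of a single query.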

c: CASE OF QUERY PROCESSING WITH OBFUSCATION APPROACH
This approach [37] is used to change the location (location privacy) of the user before sending the query for processing.
1) K-Anonymity = 1/D, where D is the distance between the real and obfuscated locations. So, higher privacy results from a greater distance, which comes at the cost of the accuracy of the result.

The BTP encryption is part of the Blind Approach [33], and its role is to change the identities of users.
1) K-Anonymity = maximum = 0, because the user does not deal with the SP in a normal case.
2) E = MAX(E)
3) EE is maximum for the SP.
4) H = 0, because encrypted data cannot be stored in the cache, which is an adverse impact.
5) Cost-Time = 2*T (because of the connection with the BTP and then the connection from the BTP to the SP).
6) Cost-Size ≥ S, because a few additional bits are added to the last block if it is incomplete, plus the size of the sent key.

e: CASE OF SPF
1) K-Anonymity = maximum = 0 for P 1 and P 3 , as P 1 does not connect to the SP, and P 3 submits a query belonging to someone else.
2) E = MAX(E) for P 1 because of no contact with the SP, while E for P 3 would be enhanced because K becomes K + 1 after each new query.
3) EE is maximum because E is maximum.
4) Cache-Hit Ratio = H, because two caches are employed here, and only real queries are saved in the cache.
5) Cost-Time = T + the time of the swaps between P 1 , P 2 , N 1 , N 2 , and P 3 , which is an adverse impact. However, H will compensate for this impact, as discussed earlier.
6) Cost-Size = S, no change.
From the above analysis, we conclude that the SPF is significantly superior in protecting privacy with only an insignificant impact on cost.

X. COMPARISON EXPERIMENTS AND SIMULATION RESULTS
Here we provide a comparison of the SPF method with those which use a dummy (Enhance-Cache) [36], cooperation among peers P2PCache [45], encryption and TP (BTP) [33], and Double Obfuscation Approach DOA [34]. All of these approaches also use the cache technique. In order to facilitate comparison, we use the following hypotheses which were used by these methods.
• The smart area contains 100*100 clusters/cells, and each cell has a fog node
• Each fog node has a cache
• The size of each fog's cache is 100KB, while the size of one query is less than 1KB
• There are 10000 peers/customers who are spread randomly in the cells
• There are 100 POIs
• There are Wi-Fi connections (3G/4G Network)
The Performance Metrics consist of (a) the number of queries sent to the server in each request, (b) the percentage of time needed to process the query after and before being sent, and (c) the Cache-hit Ratio (the number of queries that can be answered by the cache without needing to connect to the SP). Logic Metrics consist of (a) the kinds of attacks that the protection technique is resilient to, as discussed before, and (b) the impact of the privacy approach on the core service. It should be noted that Privacy and Performance will be affected by the Cache-hit Ratio. Furthermore, obfuscation or dummies can distort the number of vehicles in each area or street, which can adversely affect applications of smart streets such as those used to find the path of lowest congestion and/or traffic.
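The hypotheses listed above can be captured in a small configuration object, shown here as a hedged Python sketch rather than the code used in the experiments. It assumes that the smart area is a grid of 100 by 100 cells and that peers are placed uniformly at random; the names `SimulationConfig`, `scatter_peers`, and `peers_per_cell` are hypothetical.

```python
import random
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SimulationConfig:
    """Experimental hypotheses (field names are illustrative)."""
    grid_side: int = 100          # assumed: 100 x 100 cells, one fog per cell
    n_peers: int = 10_000         # peers spread randomly over the cells
    n_pois: int = 100             # points of interest
    cache_size_kb: int = 100      # size of each fog cache
    max_query_size_kb: int = 1    # upper bound on the size of one query


def scatter_peers(config: SimulationConfig, seed: int = 0):
    """Assign each peer to a random cell, returned as (row, column) pairs."""
    rng = random.Random(seed)
    return [(rng.randrange(config.grid_side), rng.randrange(config.grid_side))
            for _ in range(config.n_peers)]


def peers_per_cell(peers) -> Counter:
    """Count how many peers fall in each cell."""
    return Counter(peers)


def min_cached_queries(config: SimulationConfig) -> int:
    """Lower bound on the number of queries that fit in one fog cache."""
    return config.cache_size_kb // config.max_query_size_kb
```

Under these assumptions there are as many cells as peers, so a uniformly random placement leaves some cells empty, which is the situation in which a user may have to deal with the fog node directly.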

A. DETAILS OF EXPERIMENTS
Our simulation was carried out with the help of Visual Studio 2015 (ASP.NET C# and SQL Server 2012) and Microsoft Excel (Office 365), in addition to the Cisco Packet Tracer simulator. We wrote code to conduct the experiments in accordance with the hypotheses. To evaluate the performance metrics, we generated random queries and took a part of the data from the Geo-life dataset (which contains more than 17000 GPS paths for 182 clients over three years). Then we applied our SPF algorithm to check the time and the number of queries which were sent to the SP. To find the response time, we repeated the experiment ten times for each query on different devices and then took the average. After conducting our tests for all queries, we collected all the results and generated the figures. Then we checked the database to find out what data was recorded by the SP. Finally, we linked the true information about each user with their ID, and compared it with what was recorded by the SP.
We shall not compare the SPF method with the Obfuscation approach, because of the fundamental difference in the nature of applications. Our comparison focuses on the Privacy and Performance Metrics [16], [33], [34], [36], [45], [51], [53]. As described earlier, the Privacy Metric consists of
1) K-Anonymity (the percentage of the real queries of the user out of the K queries sent to the SP)
2) Entropy (the amount of true data out of the whole data received by the SP from the same user)
3) Ubiquity (degree of the spread of the user in the study area)

B. RESULTS OF SIMULATIONS
Figure 9 shows the efficiency of the SPF method over the other approaches according to the performance metric (average response time vs. number of queries). This result reflects the use of one query for each request instead of a set as in the Dummy approach. This superiority is due to the fact that the SPF doesn't require any change or additional process for query processing. The improvement is also due to the fact that the SPF uses a pair of fogs and a pair of caches instead of just one, as is the case with other approaches, and also because it does not use any encryption, as is the case with the BTP Approach. Figure 10 highlights the fact that the SPF sends fewer queries to the SP compared to other approaches, which is the result of using a pair of caches. This means that the cost of the SPF is the lowest. We have also accounted for the Cache-hit Ratio, which is the same in all approaches except Enhanced-CaDSA, where it is the worst: the BTP, P2PCache, SPF, and DOA approaches each send only one query to the SP and use the same method to manage the cache. Figure 11 shows that the SPF method achieves the maximum amount of Entropy (E = 1), which is the same as in the case of BTP, P2PCache, and DOA because the user in all of these approaches does not deal with the SP for query resolutions.
However, the SPF method creates a higher level of privacy for location than P2PCache because P 1 and P 3 reside in different cells. Unlike in the case of DOA, the SPF approach does not affect the accuracy of results because it does not add noise or obfuscation to the query, and it does not deal with the same peers (TP) for different queries as is the case with the BTP.
As shown in Figure 12, the SPF achieves maximum ubiquity (U = 2), because this value is related to the value of E. Also, the SPF achieves higher ubiquity because the users will be distributed randomly in a larger space compared to other approaches. Figure 13 shows the average time required to carry out a search in the cache. The BTP approach shows better performance because it uses only one cache, not two as in the SPF method. However, the result of the SPF method is very close to the best result because it uses a Bloom filter to avoid the time spent on cache misses, and each cache is managed by a fog node.
As shown in Figure 14, the SPF achieved the highest Cache-hit ratio, because each query could be served from two caches in two fogs, not just one as in the other methods. This value is very important because it affects both the privacy and performance metrics. Figure 15 shows that the SPF, the BTP, P2PCache, and the DOA achieve a fixed ratio of fresh items in the cache over time, because they use the same algorithm in addition to a Bloom filter.

XI. CONCLUSION
The foregoing discussion has described the characteristics, properties, advantages, disadvantages, implementations, and applications of the SPF method. We have demonstrated that the swapping mechanism in the SPF method eliminates most of the issues and open problems of the existing methods. We have also pointed out that this method relies on trusting the surrogate peer, and is not suitable when there are not enough users. However, the number of users associated with Smart Street applications at any given time is usually high enough for the SPF method to be very effective. In summary, the SPF method protects users' privacy from the SP more than any known method, without affecting the accuracy of the core service and without significant drawbacks.