A New Cloudlet Placement Method Based on Affinity Propagation for Cyber-Physical-Social Systems in Wireless Metropolitan Area Networks

Cyber-Physical-Social Systems (CPSS) integrate the cyber, physical and social spaces. A large number of mobile users in CPSS need low-latency services, and mobile edge computing (MEC) is a technology that can provide them. The edge server plays a key role in MEC, but managing edge servers is challenging. On the one hand, the number of cloudlets and their resources are limited. On the other hand, mobile devices (MDs) are numerous and randomly distributed. It is therefore important to determine a suitable number of cloudlets while serving the maximum number of MDs. To this end, a new cloudlet placement method based on an improved Affinity Propagation (AP) algorithm is proposed. More specifically, the improved AP algorithm obtains the least number of cloudlets while covering the largest number of MDs. In addition, a load balancing strategy is used to keep the load of each cloudlet balanced. Last but not least, the proposed method can be used in scenarios where users move.


I. INTRODUCTION
Cyber-Physical-Social Systems (CPSS) integrate the cyber, physical and social spaces. One of the ultimate goals of CPSS is to make our lives more convenient and intelligent by providing prospective and personalized services for users [1]-[6]. In CPSS, there are a large number of mobile devices (MDs), and some applications running on MDs have various constraints (e.g., time constraints, sequential restrictions) and therefore need to be served with low latency [7]-[13]. Meanwhile, wireless sensor networks and RFID systems are also generating massive data, which further increases the processing pressure on the cloud [14]-[23].
Fortunately, a new computing architecture named Mobile Edge Computing (MEC), which computes offloaded tasks at the edge of the network, has been proposed to reduce latency [24]-[29]. Generally, MEC can be divided into three layers (i.e., cloud, cloudlets and MDs), and its core is the edge server, which can provide various types of services [30]. A cloudlet is a typical edge server with advantages in economy and utility, which makes placing cloudlets in a network much easier than placing large servers [31], [32]. On the one hand, a cloudlet provides fast, low-latency service for mobile users. On the other hand, it relieves the burden on the remote cloud, because it performs a large number of user tasks as an intermediate node between users and the cloud. This architecture of numerous nodes provides better computing capacity and is more suitable for large-scale networks, for example, wireless metropolitan area networks (WMAN) [33], [34]. However, a WMAN is a wide area network consisting of a large number of wireless access points through which users obtain the services they need, and the distribution of these users is often complex and unpredictable. Thus, a large number of cloudlets must be put into the network to ensure that most users can be served effectively. How to place these cloudlets reasonably remains a major challenge.
The associate editor coordinating the review of this manuscript and approving it for publication was Xiaokang Wang.
VOLUME 8, 2020 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/
The placement of cloudlets in WMAN has been well studied in [35], [36], where cloudlets and MDs are both assumed to be fixed. In reality, however, MDs are usually moving. Therefore, static placement methods may not be directly applicable to a dynamic environment.
To solve this problem, researchers have proposed feasible methods for cloudlet placement in dynamic environments (e.g., [37], [38]). However, how to minimize the number of cloudlets while serving more MDs and keeping the workload of each cloudlet balanced remains an important issue.
In view of this, we focus on how to deploy cloudlets efficiently. We propose a method called Affinity Propagation with Optimal Preference (APOP) for cloudlet placement. Our main contributions are summarized as follows.
1) We construct a set of candidate preferences from which the optimal preference is selected. More specifically, with the selected preference the AP algorithm obtains the least number of cloudlets while covering the largest number of MDs.
2) A load balancing strategy is proposed to even out the number of MDs covered by the cloudlets, so that the load of each cloudlet remains balanced.
3) Compared with other methods, such as K-Means and Mini Batch K-Means, our proposed APOP method is more effective and efficient.
The remaining parts of this paper are organized as follows. Section II presents the related work. Section III introduces the network model and the problem formulation of cloudlet placement, cloudlet movement and load balancing. Section IV presents the APOP algorithm for cloudlet placement. Section V gives an experimental evaluation and discussion. Section VI concludes the paper and discusses future work. Key terms and their descriptions are listed in TABLE 1 for conciseness.

II. RELATED WORK
In recent years, owing to the convenience of MEC services, an increasing number of CPSS applications have been deployed at the edge of the network, so mobile users can obtain higher-quality services through cloudlets in CPSS. However, how to place cloudlets reasonably is still an important but difficult problem.
Some researchers focus on the user allocation of cloudlets. Lai et al. [39] formulated the user allocation problem as a bin packing problem and proposed an optimal method to solve it on the basis of the Lexicographic Goal Programming technique. Guo et al. [40] focused on the edge clouds at the candidate locations and allocated the mobile users to these clouds. Additionally, He et al. [41] proposed a decentralized, game-theoretic algorithm that finds a Nash equilibrium to address the edge user allocation problem.
Wang et al. [30] formulated the placement of edge servers as a multi-objective optimization problem in which the delay of MDs is optimized and the workload of cloudlets is balanced. Chen et al. [34] designed two efficient methods for different situations, in which the placement of cloudlets is determined based on historical statistics. Jia et al. [35] focused on the problem of cloudlet placement and user allocation in WMAN. They proposed a method to decide how to deploy the cloudlets and assign the users to them, which effectively reduces the time consumption of MDs. Xu et al. [36] formulated the problem as an integer linear program and solved the problem of capacitated cloudlet placement and dynamic request assignment. However, the above placement methods may not be directly applicable to a dynamic environment, because mobile users are often moving. The mobility of users should therefore be taken into consideration in the cloudlet placement method.
Lai et al. [42] took into consideration the dynamic Quality of Service (QoS) levels of mobile users and formulated a dynamic QoS edge user allocation problem. Correspondingly, they proposed an optimal approach to find solutions that maximize the overall Quality of Experience of mobile users. In addition, clustering algorithms are widely used for cloudlet deployment. Liang and Li [43] proposed a location-aware service deployment method based on K-Means, which divides MDs into multiple clusters according to their geographical locations and then deploys service instances onto the cloudlets closest to the cluster centers. Shen et al. [37] proposed a dynamic cloudlet placement method based on clustering. They used the K-Means algorithm to obtain the initial positions of the cloudlets and then applied a movement strategy. Zhang et al. [38] proposed an adaptive clustering method based on the locations of MDs. Additionally, they used a tracking method to collect the initial and target positions of the cloudlets to follow the dynamic changes. Both the access latency of MDs and the number of deployed cloudlet servers are optimized.
Different from the existing research, this paper focuses on cloudlet placement for CPSS in WMAN based on an improved AP algorithm. Both the mobility of the users and the load balancing of the cloudlets are taken into consideration. Our objective is to find the best number of cloudlets, which covers the largest number of MDs while keeping the load of the cloudlets balanced. More specifically, the load balancing strategy is applied both in the initial state and in the state after moving.

III. PRELIMINARY
In this section, the system model in WMAN is presented. Then, the problem formulation of how to place cloudlets to optimize the CPSS is described in detail.

A. THE SYSTEM MODEL
The structure of MEC can be divided into three layers (i.e., cloud, cloudlets and MDs), as shown in Fig. 1. Compared with the cloud, a cloudlet is placed at the edge of the network, closer to the users, and users interact with it through wireless access points. With the shorter physical distance and better computing capacity, the service latency can be reduced greatly. Furthermore, when a cloudlet is placed at a more critical position, its resource utilization may improve significantly, which means that the cloudlet placement method may play a vital role in providing CPSS users with high-quality service [6], [8]. The dynamic placement process of cloudlets in WMAN is illustrated in Fig. 2. Cloudlets with access points are placed at specific positions and provide services for all MDs within their coverage, and users within the coverage obtain lower-latency service than users outside it. More specifically, the more MDs a cloudlet covers, the better the objectives in WMAN are optimized.
In Fig. 2, owing to the numerous MDs in CPSS, the density of MDs should be taken into consideration. The MDs can be broadly divided into three categories by density: the Dense Cluster of MDs (DCMDs), the Sparse Cluster of MDs (SCMDs) and the Discrete MD (DMD). After the location of a cloudlet is changed according to the distribution of MDs, more MDs are covered by the cloudlet. Then, not only can more users enjoy high-quality services, but the resource utilization of the cloudlet is also improved significantly.
Thus, it is of considerable significance to propose a feasible algorithm that places each cloudlet at the optimal position according to the movement trajectories of MDs.

B. PROBLEM FORMULATION
In this paper, we mainly focus on how to make the cloudlets serve the MDs with less latency while improving the resource utilization of the cloudlets. The method for cloudlet placement is presented first. Then, considering the users' mobility and the random assignment of tasks, a movement strategy and a load balancing method are presented.

1) CLOUDLET PLACEMENT METHOD
It is assumed that the service scope of a cloudlet is a circular area, denoted as $R_j(t)$. An MD inside the circular area is provided with better service by the cloudlet. The positions of the cloudlets and MDs at initial time $t$ are denoted as $P^{cl}_j(t)$ and $P^{d}_i(t)$, respectively. The set of MDs covered by the $j$th cloudlet is denoted as $D_j(t) = \{d_{1j}(t), d_{2j}(t), \ldots, d_{ij}(t)\}$, which is computed by using (1), and the number of MDs in $D_j(t)$ is denoted as $k_j$.
Here $d_{ij}(t)$ denotes the $i$th MD covered by the $j$th cloudlet, and the distance between an MD and a cloudlet is denoted as $dis(P^{d}_i(t), P^{cl}_j(t))$, which is computed by (2). The total number of MDs covered by the cloudlets at initial time $t$ is denoted as $T_d(t)$, which is computed by using (3).
Considering that the computing resources of a cloudlet are limited, we define the largest number of MDs that one cloudlet can serve as $L$ and the largest number of cloudlets as $B$. Then, the capacity of the cloudlets to accommodate MDs should satisfy (4) and (5).
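The coverage model above can be illustrated with a short sketch. It is one possible reading of (1)-(5), not the authors' implementation: it assumes a common service radius `radius`, a Euclidean `dis`, and that each MD is assigned to its nearest cloudlet so that the covered sets are disjoint. All function names are hypothetical.

```python
import math

def dis(p, q):
    """Euclidean distance between two 2-D positions (assumed form of (2))."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def covered_sets(md_positions, cloudlet_positions, radius):
    """Build D_j(t): each MD joins its nearest cloudlet if it lies in that
    cloudlet's circular service area (assumption: disjoint sets)."""
    sets = [[] for _ in cloudlet_positions]
    for i, p in enumerate(md_positions):
        nearest = min(range(len(cloudlet_positions)),
                      key=lambda j: dis(p, cloudlet_positions[j]))
        if dis(p, cloudlet_positions[nearest]) <= radius:
            sets[nearest].append(i)
    return sets

def total_covered(sets):
    """T_d(t): total number of covered MDs, i.e. the sum of all k_j."""
    return sum(len(s) for s in sets)

def is_feasible(sets, L, B):
    """Capacity limits: at most L MDs per cloudlet and at most B cloudlets."""
    return len(sets) <= B and all(len(s) <= L for s in sets)
```

For example, with MDs at (0,0), (1,0), (10,10) and (50,50), cloudlets at (0.5,0) and (10,11), and a radius of 2, the first cloudlet covers two MDs, the second covers one, and the MD at (50,50) is left uncovered.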

2) THE NUMBER OF CLOUDLET
If the MDs are served by too many or too few cloudlets, the computing resources will be mismatched with the demand and the service latency may be high. Specifically, when the number of cloudlets is too small, many MDs may get poor service, which leads to intolerable latency. When too many cloudlets are placed in the WMAN, MDs get better service, but the cost of the cloudlets becomes high and their resource utilization is low. Thus, it is important to compute the optimal number of cloudlets according to the MDs. The AP method generates the number of clusters from the preference. Once we obtain the optimal preference, the AP algorithm can compute the optimal number of cloudlets according to the number of MDs, which ensures that the least number of cloudlets covers as many MDs as possible.

3) THE CLOUDLET MOVEMENT STRATEGY
At time $t$, the cloudlet $cl_j$ placed at the initial position $P(x, y)$ by the clustering algorithm is denoted as $P^{cl}_j(t)$. After the movement of MDs, the set of MDs covered by $P^{cl}_j(t)$ is denoted as $P^{d}_j(t')$ at time $t'$. Additionally, the potential cloudlet position $P'(x', y')$ computed by the clustering algorithm satisfies the strategy of cloudlet placement, and the set of MDs covered by $P^{cl}_j(t')$ is denoted as $D_j(t')$. It is assumed that, in the region with $P$ as the center and $T$ as the radius, if the cloudlet $cl_j$ at position $P'$ can serve more MDs than at the initial position $P$, the cloudlet is suitable to move from $P$ to $P'$.
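The movement rule can be sketched as a simple predicate. This is a minimal illustration under assumptions: the candidate position must lie within distance `T` of the current one, coverage is counted with a fixed service radius `radius`, and the function names are hypothetical.

```python
import math

def count_covered(center, md_positions, radius):
    """Number of MDs that lie inside the circular service area around center."""
    return sum(1 for p in md_positions if math.dist(center, p) <= radius)

def should_move(old_pos, new_pos, md_positions_after, radius, T):
    """True if new_pos is inside the region of radius T around old_pos and
    covers strictly more of the moved MDs than old_pos does."""
    within_region = math.dist(old_pos, new_pos) <= T
    gain = (count_covered(new_pos, md_positions_after, radius)
            - count_covered(old_pos, md_positions_after, radius))
    return within_region and gain > 0
```

A cloudlet at (0,0) whose MDs have drifted to around (3,0) would move to (3,0) when T is 5, but would stay in place when T is 2 because the candidate position falls outside the allowed region.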

4) THE LOAD BALANCING STRATEGY FOR CLOUDLET
The number of MDs covered by different cloudlets is often different, which leads to a significant waste of the resource. Furthermore, once the task load exceeds the capability of the cloudlet, the latency will become intolerable. In order to settle this problem, the load balancing strategy is proposed.

IV. AFFINITY PROPAGATION WITH OPTIMAL PREFERENCE (APOP) FOR CLOUDLET PLACEMENT
In this section, the method for the cloudlet placement problem is described in detail. In order to select the positions of the cloudlets, the Affinity Propagation with Optimal Preference (APOP) algorithm is proposed first. Unlike K-Means [44], Affinity Propagation (AP) can compute the number of cloudlets itself, which is determined by the preference. Thus, we propose a method to select the optimal preference by creating a set of candidate preferences. Then, the load balancing method is introduced. Finally, a dynamic optimization strategy that considers the movement of MDs is introduced.

A. PRELIMINARY
The process of our proposed method APOP can be summarized as follows and the execution order of algorithms corresponds to the process of APOP.

1) PROCESS 1: DECIDE THE NUMBER OF CLOUDLETS
The AP algorithm has an important parameter, the preference, which decides the number of clusters. The optimal preference, which generates the least number of clusters, is selected from the candidate set of preferences. More specifically, the optimal number of cloudlets not only improves resource utilization but also provides low-latency service. If the AP algorithm uses a random preference, the number of cloudlets may be very large, which is unreasonable. Thus, it is important to choose the optimal preference.

2) PROCESS 2: GET THE CENTER POSITION OF MDs
The AP algorithm relies on two messages, availability and responsibility, to decide which data points serve as exemplars. In this way, the center positions of the MDs, where the cloudlets can be deployed, are computed.

3) PROCESS 3: THE STRATEGY OF LOAD BALANCING
When a cloudlet is deployed, we consider its efficiency because its computing resources are limited. That is to say, we balance the number of MDs covered by each cloudlet to ensure its efficiency. When the position of a cloudlet changes, the load balancing strategy is applied again.

4) PROCESS 4: THE STRATEGY OF MOVEMENT
In reality, the MDs move randomly. If the positions of the cloudlets are fixed, some MDs may, after moving, lose the service provided by the cloudlets and suffer intolerable latency. So, we use the movement strategy to update the positions of the cloudlets based on the movement of the devices.

B. THE BASIC STEP OF APOP
AP is a high-performance clustering algorithm that was first proposed in 2007 [45]. Compared with popular clustering algorithms, AP can compute the number of clusters from the data set and the preference. However, it is hard to obtain an optimal preference for a data set with the original AP algorithm. Therefore, the APOP algorithm is proposed to obtain a better clustering result for the cloudlet placement problem.
In this algorithm, each data point can be seen as a potential exemplar, which represents the center point of a cluster. $X = \{x_i \mid i = 1, 2, \ldots, n\}$ represents the set consisting of $n$ data points. Firstly, the similarity matrix $S = [S_{ij}]_{n \times n}$, $i, j \in \{1, 2, \ldots, n\}$, is constructed, where $S_{ij}$ shows the level of similarity between $x_i$ and $x_j$ [46]. If $i = j$, the similarity $S_{ii}$ of data point $x_i$ is given by the preset exemplar preference. The similarity is computed from the Euclidean distance by (6). Then, the AP algorithm takes each data point as a potential exemplar and recursively transmits two types of messages, responsibility $r(i, j)$ and availability $a(i, j)$, to find high-performance exemplars. The exemplar preference $p$ is the similarity $S_{ij}$ with $i = j$ (i.e., $S_{ii} = p$). The important parts of the AP algorithm are as follows:
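The construction of the similarity matrix can be sketched as follows. The sketch assumes the convention of the original AP formulation [45], in which the similarity of two distinct points is the negative squared Euclidean distance. The exact form of (6) in this paper may differ, and the function name is hypothetical.

```python
def similarity_matrix(points, preference):
    """Return S with S[i][j] = -||x_i - x_j||^2 for i != j (assumed form)
    and S[i][i] = preference on the diagonal."""
    n = len(points)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                S[i][j] = float(preference)  # exemplar preference p
            else:
                dx = points[i][0] - points[j][0]
                dy = points[i][1] - points[j][1]
                S[i][j] = -float(dx * dx + dy * dy)
    return S
```

With this convention, closer points have a larger (less negative) similarity, and a lower preference makes every point less willing to act as an exemplar.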

1) RESPONSIBILITY
In order to find the exemplar, the data point $x_i$ regards all the other data points as potential exemplars and sends the responsibility information $r(i, j)$ to the candidate exemplar $x_j$ [45]. Its process diagram is shown in Fig. 3. More specifically, the responsibility information $r(i, j)$ describes how well suited data point $x_j$ is to serve as the clustering center of data point $x_i$ when $x_j$ is selected as the candidate exemplar. The responsibility information $r(i, j)$ is computed by (7) [45], [47]. A higher value of $r(i, j)$ means that $x_j$ is more likely to serve as the exemplar of $x_i$ and that its performance may be better.

2) AVAILABILITY
When $x_i$ sends the responsibility information $r(i, j)$ to $x_j$, $x_j$ also sends the availability information $a(i, j)$ to $x_i$ to confirm whether $x_i$ will choose $x_j$ as its exemplar [45]. Its process diagram is shown in Fig. 4. If $x_i$ selects another data point $x_j$, $i \neq j$, the availability $a(i, j)$ is computed by using (8). If $x_i$ selects itself as the candidate exemplar (i.e., $i = j$), the availability $a(i, j)$ is computed by using (9). In all, the availability $a(i, j)$ and responsibility $r(i, j)$ are updated recursively by (7), (8) and (9). Higher values of $a(i, j)$ and $r(i, j)$ mean that data point $x_j$ is more likely to be the candidate exemplar, and that $x_i$ is more likely to belong to the cluster with $x_j$ as its center. Through this recursive computing process, the candidate exemplars are finally selected.
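The recursive exchange of the two messages can be sketched in plain Python. The update rules in the comments are those of the standard AP formulation in [45], which (7)-(9) are taken to follow. The damping factor and the fixed iteration count are assumptions that do not come from this paper, and the function name is hypothetical.

```python
def affinity_propagation(S, damping=0.5, iterations=200):
    """Run AP message passing on similarity matrix S (n >= 2) and return,
    for each point i, the index j that maximizes a(i, j) + r(i, j)."""
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, j)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, j)
    for _ in range(iterations):
        # r(i, j) = s(i, j) - max_{j' != j} [a(i, j') + s(i, j')]
        for i in range(n):
            scores = [A[i][j] + S[i][j] for j in range(n)]
            for j in range(n):
                best_other = max(scores[k] for k in range(n) if k != j)
                new = S[i][j] - best_other
                R[i][j] = damping * R[i][j] + (1 - damping) * new
        for j in range(n):
            positive = [max(0.0, R[i][j]) for i in range(n)]
            support = sum(positive) - positive[j]  # sum over i' != j
            for i in range(n):
                if i == j:
                    # a(j, j) = sum_{i' != j} max(0, r(i', j))
                    new = support
                else:
                    # a(i, j) = min(0, r(j, j)
                    #           + sum_{i' not in {i, j}} max(0, r(i', j)))
                    new = min(0.0, R[j][j] + support - positive[i])
                A[i][j] = damping * A[i][j] + (1 - damping) * new
    return [max(range(n), key=lambda j: A[i][j] + R[i][j]) for i in range(n)]
```

On two well-separated groups of three collinear points each, with a moderate preference, every point is expected to select the middle point of its own group as exemplar. The triple loop is cubic per iteration, so this sketch is only suitable for small inputs.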

3) THE EXEMPLARS WITH HIGH-PERFORMANCE
The set of candidate exemplars $c = \{c_1, c_2, \ldots, c_m\}$ represents the center points of the clusters. According to the evaluation of responsibility $r(i, j)$ and availability $a(i, j)$, the data point $x_j$ is selected as the exemplar of $x_i$, denoted as $c_i = j$. In order to select the high-performance exemplars from the candidate set $c$, a way to estimate the performance of the exemplars for the clustering is needed. The sum of the similarities between each data point and its exemplar estimates the effect of the candidate exemplars $c$ on the whole clustering, and we call it the net similarity [45]. It is represented by $CluSim(c)$, which is computed by (10), where $S_{ic_i}$ represents the similarity obtained when data point $x_i$ chooses data point $x_j$ as its exemplar. In order to avoid the condition of $x_i$ choosing data point $x_j$ as its exemplar when $x_j$ has not chosen itself, $CluSim(c)$ is constrained by an exemplar-consistency constraint $\xi_{jc_i}$ (11). Constraint (11) means that if $c_i = j$, the data point $x_j$ must choose itself as exemplar. If not, $-\infty$ is used as a penalty. Then, the set of high-performance exemplars is selected by maximizing the net similarity $CluSim(c)$.
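The net similarity and its consistency constraint can be sketched in a few lines. The sketch assumes that `c[i]` holds the index of the exemplar chosen by point `i` and that the diagonal of `S` holds the preference, as in the standard AP formulation. The function name is hypothetical.

```python
def net_similarity(S, c):
    """CluSim(c): sum of S[i][c[i]] over all points, or -infinity when some
    point chose an exemplar that did not choose itself."""
    for j in c:
        if c[j] != j:  # exemplar-consistency constraint violated
            return float("-inf")
    return sum(S[i][c[i]] for i in range(len(c)))
```

For a three-point example with preference -5, the assignment `[1, 1, 1]` is consistent and scores the two off-diagonal similarities plus one preference, whereas `[1, 2, 2]` is penalized because point 0 chose point 1, which itself chose point 2.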

4) SELECT THE OPTIMAL PREFERENCE
The preference of the AP algorithm controls the number of clusters. Therefore, it is important to select an optimal preference according to the size of the data set. Firstly, the set of candidate preferences is constructed, denoted as $P = \{p_1, p_2, \ldots, p_{n_c}\}$, where $n_c$ is the preset number of candidate preferences. According to [46], [48], during the process of updating the two types of information $r(i, j)$ and $a(i, j)$, the exemplar preference also changes between different data points, like the similarity. So, the optimal preference should be chosen from $[p_{min}, p_{max}]$, taking into account the bounds of the values in the similarity matrix.
The preferences $p_{min}$ and $p_{max}$ lead to a small number of clusters and a large number of clusters, respectively. The optimal preference is selected by applying each candidate preference to the AP algorithm and analyzing the resulting number of clusters. Once the problem of selecting the preference is solved, the AP algorithm will have the best clustering effect.
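The candidate-based selection can be sketched as follows. Several details are assumptions made only for illustration: the candidates are evenly spaced, $p_{min}$ and $p_{max}$ are taken as the smallest and largest off-diagonal similarities, and the selection rule prefers the largest coverage and then the fewest clusters. The callback `evaluate` and both function names are hypothetical.

```python
def candidate_preferences(S, n_c):
    """Evenly spaced candidates between the smallest and largest
    off-diagonal similarity (assumed bounds p_min and p_max)."""
    n = len(S)
    off_diagonal = [S[i][j] for i in range(n) for j in range(n) if i != j]
    p_min, p_max = min(off_diagonal), max(off_diagonal)
    if n_c == 1:
        return [p_min]
    step = (p_max - p_min) / (n_c - 1)
    return [p_min + l * step for l in range(n_c)]

def select_preference(candidates, evaluate):
    """Pick the candidate with the largest coverage, breaking ties by the
    fewest clusters. evaluate(p) is a hypothetical callback that runs AP
    with preference p and returns (number_of_clusters, covered_mds)."""
    best_key, best_p = None, None
    for p in candidates:
        clusters, covered = evaluate(p)
        key = (-covered, clusters)
        if best_key is None or key < best_key:
            best_key, best_p = key, p
    return best_p
```

In practice `evaluate` would wrap a full AP run followed by the coverage computation; here it is left abstract so that the selection logic can be read on its own.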

C. THE STRATEGY OF LOAD BALANCING
The position of each cloudlet and the set of MDs it covers are determined by the APOP algorithm, which means that the service range of each cloudlet and the MDs it serves are also determined. However, APOP merely produces a good clustering result. If too many MDs are served by one cloudlet, its computing load becomes too heavy, which increases the response latency. Therefore, a load balancing strategy that balances the number of MDs served by each cloudlet is proposed to optimize the method, as shown in Algorithm 1. Algorithm 1 is used to optimize the effect of the cloudlets, so it is usually applied after the positions of the cloudlets are decided; that is to say, Algorithm 1 is executed after Algorithm 2 and Algorithm 3. The algorithm is introduced in detail below. When one cloudlet is under a heavy load, the load balancing strategy moves MDs from the heavily loaded cloudlet to a nearby cloudlet with ample computing resources left. Firstly, the load of a cloudlet is evaluated via the number of MDs it covers.
When the number of MDs covered by one cloudlet exceeds a certain amount, the cloudlet works in a heavily loaded state, which leads to high-latency service. Therefore, it is important to select the optimal number of MDs as the evaluation criterion for the maximum load capacity of a cloudlet, while also taking the average into account. The reasonable load range of cloudlets is determined by experiments and is given in the situations below. The distance between $cl_k$ and $cl_j$ is denoted as $d$, which is used to evaluate which MDs can be moved from one cloudlet to another. More specifically, in order to ensure the quality of service, an MD that is moved away from a heavily loaded cloudlet should lie within the service range of another cloudlet. Normally, the MD is moved to a nearby cloudlet, so the distances between the MD and the two cloudlets are compared to evaluate whether the MD is suitable to be moved. The potential range of MDs that can be moved is given in situation 3). The situations that change which cloudlet an MD belongs to are as follows.
1) The number of MDs in $D_j$ minus the average number of MDs is more than five percent of the average.
2) The number of MDs in $D_k$ minus the average number of MDs is less than five percent of the average.
3) The distance $dis_{ij}$ between MD $d_{ij}$ and cloudlet $cl_j$ minus $d/2$ is farther than $\beta$ percent of $d$, where $\beta$ is the experimental test value.
When the above situations are satisfied, the MD $d_{ij}$ is migrated from cloudlet $cl_j$ to cloudlet $cl_k$. In this way, the MD gets better and faster service from the cloudlets under load balancing.

Algorithm 1 The Strategy of Load Balancing
Input: The set of cloudlets $CL$, the set of MDs $M$, the positions of MDs $p^{d}_i(t)$ and cloudlets $p^{cl}_j(t)$ at the initial moment
Output: The set of MDs covered by the cloudlet at time $t$, $D_j(t)$
1: Compute the distance $d$ between cloudlet $cl_j$ and $cl_k$
2: for $j = 1$ to $K$ do
3:   for $i = 1$ to $n$ do
4:     if $j \neq k$ then
5:       if number($D_j$) − average > 0.05 * average and average − number($D_k$) < 0.05 * average then
6:         if abs($dis_{ij}$ − $d/2$) < $d/9$ and $dis_{ik}$ < 1.4 * $d$ then
7:           The cloudlet that covers $d_i$ changes from $cl_j$ to $cl_k$

Algorithm 2
   $W$ ← the distance between $P^{cl}_i(t)$ and $P^{cl}_i(t')$
5: end while
6: for $i = 1$ to $k$ do
7:   for $j = 1$ to $k$ do
8:     if $P^{cl}_j(t')$ matches to the $P^{cl}_i(t)$ then
9:       Using Dijkstra Algorithm to get the shortest trace
10:      Get the $\xi$ from $P^{cl}_i(t)$ and $P^{cl}_j(t')$
11:    end if
12:  end for
13: end for
14: Get the moving trace of all cloudlets

Algorithm 3
   Get the $l$th candidate preference $p_l$ and apply it to the AP
6: Initialize the value of responsibility $r(i, j)$ and availability $a(i, j)$ with 0
7: for $i = 1$ to $n$ do
8:   for $j = 1$ to $n$ do
9:     Update the responsibility $r(i, j)$ ← (7)
10:    Update the availability $a(i, j)$ ← (8), (9)
11:    Compute the sum of $r(i, j)$ and $a(i, j)$
12:  end for
13: end for
14: Maximize the sum of $r(i, j)$ and $a(i, j)$
15: Choose the $j$th MD as exemplar and add the $i$th MD to the $D_j$
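The migration conditions of the load balancing strategy can be sketched as follows. The load tests follow situations 1) and 2) with a five percent tolerance. The distance test uses the thresholds $d/9$ and $1.4d$ that appear in the Algorithm 1 listing rather than $\beta$, which is an assumption about how the two descriptions relate. The function name and the `assignment` representation are hypothetical.

```python
import math

def balance_load(md_pos, cl_pos, assignment, tol=0.05):
    """Return a new assignment (assignment[i] = cloudlet index of MD i) after
    moving MDs from overloaded cloudlets to nearby cloudlets with room."""
    assignment = list(assignment)
    K = len(cl_pos)
    average = len(md_pos) / K  # assumes every MD is assigned to a cloudlet
    for j in range(K):
        for k in range(K):
            if j == k:
                continue
            d = math.dist(cl_pos[j], cl_pos[k])
            for i, p in enumerate(md_pos):
                if assignment[i] != j:
                    continue
                load_j = assignment.count(j)
                load_k = assignment.count(k)
                overloaded = load_j - average > tol * average  # situation 1)
                has_room = load_k - average < tol * average    # situation 2)
                dis_ij = math.dist(p, cl_pos[j])
                dis_ik = math.dist(p, cl_pos[k])
                # Distance test with the thresholds listed in Algorithm 1.
                movable = abs(dis_ij - d / 2) < d / 9 and dis_ik < 1.4 * d
                if overloaded and has_room and movable:
                    assignment[i] = k
    return assignment
```

Loads are recounted before every decision, so migration out of a cloudlet stops as soon as its load is back within the tolerance around the average.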

D. THE STRATEGY OF MOVEMENT
The MDs in CPSS have become smaller and easier for users to carry around, so the movements of MDs are also more frequent. In the time period from $t$ to $t'$, the position of cloudlet $cl_j$ at time $t$, denoted as $p^{cl}_j(t)$, may not be suitable for serving the MDs at time $t'$. Therefore, the position of the cloudlet should be changed according to the positions of MDs before and after moving. We use Algorithm 3 to get the cloudlet positions at time $t$ and $t'$. Then, Algorithm 2 is used to move each cloudlet from its initial position to its position at time $t'$ along the shortest path. To this end, a method based on the Dijkstra algorithm is proposed to move the cloudlets while ensuring their resource utilization. The directed acyclic graph $G_f = (V, \xi, W)$ is used to compute the shortest path. $V = \{v_{11}, v_{12}, \ldots, v_{mm}\}$ is the vertex set of the path graph and represents the positions of the cloudlets, as shown in Algorithm 2. $\xi = \{\xi_1, \xi_2, \ldots\}$ is the edge set of the path graph and represents the potential movements of the cloudlets. $W$ is the position weight and represents the distance between the original position of a cloudlet and its potential position at time $t'$. The cloudlet position before moving is set as the initial point. We obtain the shortest path for cloudlet movement by iteratively comparing the distances between the initial position of one cloudlet and the other potential positions. Finally, the directed acyclic graph $G_f = (V, \xi, W)$ is output, which shows the moving paths of the cloudlets. However, the number of cloudlets before and after moving may be different. This is handled as follows. If the number of original cloudlets $k$ is less than the number of potential cloudlets $k'$, the original cloudlets are first matched to the potential cloudlets with the shortest paths, and then the extra cloudlets are added. If $k$ is greater than $k'$, the redundant cloudlets having a larger position weight are disabled.
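The handling of unequal cloudlet counts can be illustrated with a simplified matching sketch. It replaces the Dijkstra-based path search with a greedy pairing by straight-line distance, which is a substitution made for brevity and not the authors' procedure. The function name is hypothetical.

```python
import math

def match_cloudlets(old_pos, new_pos):
    """Greedily pair original and potential positions by increasing distance.
    Returns (moves, added, disabled): moves maps an original index to a
    potential index, added lists unmatched potential positions, and
    disabled lists unmatched original cloudlets."""
    pairs = sorted((math.dist(p, q), i, j)
                   for i, p in enumerate(old_pos)
                   for j, q in enumerate(new_pos))
    moves, used_old, used_new = {}, set(), set()
    for _, i, j in pairs:
        if i not in used_old and j not in used_new:
            moves[i] = j
            used_old.add(i)
            used_new.add(j)
    added = [j for j in range(len(new_pos)) if j not in used_new]
    disabled = [i for i in range(len(old_pos)) if i not in used_old]
    return moves, added, disabled
```

Because pairs are taken in order of increasing distance, the original cloudlets left unmatched when $k > k'$ tend to be those with the larger position weights, in line with the rule described above.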

E. THE PLACEMENT FOR CLOUDLET BASED ON APOP
The set of MDs is represented by $M = \{m_1, m_2, \ldots, m_n\}$, a two-dimensional data set containing the position of each MD $d_i$. The positions of MDs follow the Gaussian distribution $G$. The positions of the cloudlets are the positions of the clustering centers obtained by APOP. With an optimal preference, the AP algorithm can generate a high-performance clustering result. That is to say, fewer cloudlets need to be placed in the WMAN while most MDs can still be served by a cloudlet, as shown in Algorithm 3.

V. EXPERIMENTAL EVALUATION AND DISCUSSION
In order to validate the performance of the APOP algorithm, it is compared with methods using the K-Means algorithm or the Mini Batch K-Means algorithm, which perform well and have been widely used.
K-Means is one of the most typical unsupervised clustering algorithms and has a good clustering effect, so it is well suited as a comparative method. When K-Means is applied to a given sample set, the number of clusters $K$ is determined in advance, and the samples are partitioned so that the samples within a cluster are as close together as possible and the distance between clusters is as large as possible. More specifically, given the number of clusters $K$, the algorithm randomly chooses $K$ points as the cluster centers. Each point is marked as belonging to the cluster whose center is closest to it. Then, the $K$ centers are recomputed, and the previous step is repeated until the positions of the cluster centers no longer change. However, there is no reference for selecting $K$, which requires many experiments. Moreover, when the sample set contains more than one hundred thousand samples, the convergence rate slows down. Therefore, Mini Batch K-Means, which runs the traditional K-Means on subsets of the sample set, was proposed for large sample sets; it greatly accelerates convergence at the cost of some decrease in clustering accuracy.
Using these two algorithms for comparison, the performance of our APOP-based method can be fully tested. Extensive simulation experiments are conducted in this section. The experimental parameters are set first, followed by an analysis of the experimental results.
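The K-Means procedure described above can be sketched in plain Python. This is a minimal illustration of the baseline, not the implementation used in the experiments. The `seed` and `max_iter` parameters and the rule of keeping the old center for an empty cluster are assumptions.

```python
import random

def kmeans(points, k, seed=0, max_iter=100):
    """Lloyd's iteration on 2-D points: pick k random points as centers,
    assign each point to its nearest center, recompute the centers, and
    stop when they no longer change."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda c: (p[0] - centers[c][0]) ** 2
                                        + (p[1] - centers[c][1]) ** 2)
            clusters[nearest].append(p)
        new_centers = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl))
            if cl else centers[c]  # keep the old center of an empty cluster
            for c, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters
```

Unlike AP, the number of clusters `k` has to be supplied by the caller, which is the limitation noted in the text.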

A. DATASET AND EXPERIMENTAL SETTING
In the WMAN environment, we set up a uniform standard for cloudlets: every cloudlet is equipped with access points to access the network. We generate data sets that obey the Gaussian distribution $G = \{(\mu, \sigma)\}$ to simulate the MDs, where the parameters $\mu$ and $\sigma$ represent the mean and standard deviation, respectively. The data sets that obey the Gaussian distributions with three different sets of parameters, $G_1$, $G_2$ and $G_3$, represent sets with different numbers of MDs at time $t$: 500, 1000 and 1500, respectively. The Gaussian distributions are set as follows:

B. PERFORMANCE ANALYSIS
In this section, several metrics are used to evaluate the clustering algorithms and the load balancing optimization, including the average coverage of cloudlets $Cov$ and the standard deviation of MDs $\sigma(N)$, which are given by (14), (15) and (16), where $Cov(i)$ represents the coverage of the $i$th cloudlet, $n^{cov}_i(t)$ represents the number of MDs covered by the $i$th cloudlet at time $t$ and $n_i$ represents the number of MDs served by the $i$th cloudlet. The standard deviation of MDs $\sigma(N)$ shows the evenness of the number of MDs in each cluster, which reflects the effectiveness of the load balancing optimization algorithm, and is computed by using (16). The average number of MDs covered by a cloudlet is denoted as $\mu$. Under the same conditions, the smaller the value of $\sigma(N)$, the more balanced the number of MDs in each cluster.
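The two evaluation metrics can be sketched as follows. The sketch assumes that the coverage of a cloudlet is the ratio of covered to served MDs and that $\sigma(N)$ is the population standard deviation of the per-cloudlet MD counts around $\mu$. The exact forms of (14)-(16) may differ, and the function names are hypothetical.

```python
import math

def average_coverage(covered, served):
    """Mean over cloudlets of Cov(i) = covered[i] / served[i] (assumed form)."""
    return sum(c / s for c, s in zip(covered, served)) / len(covered)

def load_std(counts):
    """Population standard deviation of the number of MDs per cloudlet."""
    mu = sum(counts) / len(counts)
    return math.sqrt(sum((n - mu) ** 2 for n in counts) / len(counts))
```

A perfectly balanced placement, in which every cloudlet serves the same number of MDs, gives a standard deviation of zero.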

1) THE AVERAGE COVERAGE
The result of APOP is compared with those of the other two clustering algorithms to show the performance difference more clearly. The key index for evaluating clustering performance is the coverage of the cloudlets. That is to say, the higher $Cov$ is, the better the algorithm performs and the more clearly its advantage in the placement problem is reflected. K-Means is a typical algorithm with excellent clustering performance, especially for fewer than one hundred thousand MDs, whereas Mini Batch K-Means was proposed to deal with more than one hundred thousand MDs. Accordingly, the average coverage of MDs using K-Means is higher than that of Mini Batch K-Means in Fig. 5, where Mini Batch K-Means is abbreviated as Mbkms. Additionally, regardless of whether the number of MDs is 500, 1000 or 1500, the average coverage of MDs with APOP is higher than with K-Means and Mini Batch K-Means. Although the average coverage of the different clustering algorithms is close when the number of MDs $n$ is 1000, the APOP algorithm still performs better than the others. Even after the MDs move, the cloudlets can still serve more MDs than with the other two algorithms. For better cloudlet efficiency, load balancing is used, and the resulting coverage is shown in Fig. 6. In the histogram, the bar heights for the original data and for load balancing are quite close, which means that load balancing not only improves the efficiency of the cloudlets but also preserves the quality of service for users.
During the experiments, we find that the performance of the AP algorithm is greatly influenced by the preference. More specifically, when the preference yields the optimal number of cloudlets, the AP algorithm gives the cloudlets higher coverage. If an unreasonable preference results in too few cloudlets, the average coverage obtained by the AP algorithm decreases rapidly and may be lower than that of the other two clustering algorithms.
2) THE LOAD BALANCING
From Fig. 5, we can see that the cloudlets can serve more users with the APOP method. To keep the cloudlets efficient and avoid a heavy computing burden on any one of them, the load balancing strategy is proposed. The load balancing result, which reflects how evenly the MDs are distributed across cloudlets, is shown in Fig. 7. To evaluate the effect of load balancing, the standard deviation before and after moving is measured for further comparison. After load balancing, the standard deviation decreases noticeably in Fig. 7. The lower the standard deviation, the more balanced the number of MDs in each cloudlet.
To examine load balancing at the level of individual cloudlets, we conduct further experiments, which are shown in Fig. 8, Fig. 9 and Fig. 10. The load balancing strategy moves MDs from a cloudlet with a heavy load to other cloudlets with low resource utilization. In Fig. 8, the number of MDs served by each cloudlet is more balanced and becomes closer to the average number. Additionally, the number of cloudlets after moving rises to 6, more than the initial number, because the cloudlets must maintain the quality of service for MDs that are now more dispersed.
In the condition after moving in Fig. 9 in particular, the number of cloudlets increases greatly compared to the condition before moving. This is because the distance between the MDs and the cloudlets grows while the service scope is limited, so more cloudlets are needed to serve the more dispersed MDs. When the number of cloudlets increases, as in the condition after moving in Fig. 9 and Fig. 10, the number of MDs per cloudlet changes only slightly after load balancing, but it is still more balanced than the initial result without load balancing. When the number of MDs in a cloudlet is much larger than the average, some MDs are moved to other cloudlets. Conversely, if the resource utilization of a cloudlet is quite low, MDs are more likely to be served by this cloudlet. The number of MDs changes only slightly because an MD covered by a cloudlet is already in the best position for low response latency and cannot be moved. Most of the MDs are covered by a cloudlet and access it with low latency, so the number of MDs that can be moved is quite small. In other words, the load balancing strategy is applied with coverage guaranteed, which preserves the global service.
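The behaviour described above, moving MDs away from heavily loaded cloudlets only when coverage is preserved, can be sketched in Python. This is a minimal greedy sketch under stated assumptions, not the authors' algorithm. It assumes that every cloudlet has the same circular service radius, that an MD may be moved only to a cloudlet that still covers it, and that a move is allowed only if the target stays lighter than the source afterwards. The function name balance and its parameters are hypothetical.

```python
from math import dist


def balance(md_pos, cloudlet_pos, assign, radius):
    """Greedily move MDs from heavy cloudlets to lighter ones that cover them.

    md_pos and cloudlet_pos are lists of coordinate tuples, assign[j] is the
    index of the cloudlet serving MD j, and radius is the assumed service
    radius shared by all cloudlets. Returns a new assignment list.
    """
    assign = list(assign)
    k = len(cloudlet_pos)
    load = [0] * k
    for c in assign:
        load[c] += 1

    moved = True
    while moved:
        moved = False
        # Try the heaviest cloudlets first.
        for src in sorted(range(k), key=lambda c: -load[c]):
            for j, c in enumerate(assign):
                if c != src:
                    continue
                # Targets must cover MD j and remain lighter than src
                # after the move, so each move reduces the imbalance.
                targets = [
                    t for t in range(k)
                    if t != src
                    and load[t] + 1 < load[src]
                    and dist(md_pos[j], cloudlet_pos[t]) <= radius
                ]
                if targets:
                    dst = min(targets, key=lambda t: load[t])
                    assign[j] = dst
                    load[src] -= 1
                    load[dst] += 1
                    moved = True
                    break
            if moved:
                break
    return assign
```

Because an MD is moved only to a cloudlet that still covers it, MDs outside every other cloudlet's radius stay where they are. This is consistent with the observation that only a small number of MDs can be moved.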

VI. CONCLUSION
In this paper, we focus on cloudlet placement for CPSS in WMAN. Technically, a new method named APOP, based on the improved Affinity Propagation algorithm, is proposed. More specifically, APOP does not need the K value to be specified in advance, yet it can obtain the least number of cloudlets while covering the largest number of MDs. In addition, the load balancing strategy is used to ensure that the load of each cloudlet remains balanced.
In our future work, we will investigate cloudlet deployment and computation offloading problems over 5G networks and in other scenarios, such as vehicular networks [49]- [51]. Meanwhile, privacy preservation and data caching will be considered in the cloudlet deployment process [52], [53].
XINGDA QIAN is currently pursuing the B.S. degree in Internet of Things engineering with Huaqiao University. His research interests include mobile edge computing, cloud computing, and the optimization of cloudlet placement.
BOHAI ZHAO is currently pursuing the B.S. degree in Internet of Things engineering with Huaqiao University. His research interests include mobile edge computing and the computing offloading of workflow applications.
KEJIA ZHANG received the Ph.D. degree in cryptography from the Beijing University of Posts and Telecommunications, in 2015. From 2016 to 2017, he was a Visiting Scholar with the Center of Quantum Techniques (CQT), National University of Singapore (NUS). He is currently an Associate Professor with Heilongjiang University, China. His current research interests include cybersecurity, cryptography (including post-quantum and quantum cryptography), machine learning, and information theory.
YANG LIU (Member, IEEE) received the M.S. degree in computer application from the University of Science and Technology Liaoning, China, in 1998. She is currently an Associate Professor with the Computer Science and Software Engineering, University of Science and Technology Liaoning. Her research interests include DM and AI.