Fuzzy Logic-Based Path Planning for Data Gathering Mobile Sinks in WSNs

Mobile sinks (MSs) are capable of collecting data along specified paths in wireless sensor networks (WSNs). They are deployed as a popular alternative to data loggers, to which all nodes have to send sensory data. If MSs’ paths (or cycles) are not well determined, it might take a relatively long time for the MSs to make a round trip. Recent research works have proposed methods to determine rendezvous points (RPs) that the MSs must pass by to collect data, with the aim of reducing the data-collection time. Determination of the number of RPs is important, and it is challenging to ensure that there are sufficient RPs widely located throughout a sensor network, forming a cycle along which the MSs can spend limited time traveling. This research presents a method for designing paths and pinpointing RPs for MSs to collect data, as well as determining the next hop to relay data for each sensor node. Instead of reducing the MSs’ travel time, the focus of this research is to preserve the energy of all sensor nodes in WSNs. Our method determines the maximum number of RPs such that the MSs can run through each RP’s communication range (within a time constraint) without depleting their own energy. The method comprises three main steps. First, we calculate the number of RPs and design the MS path. Second, the exact data-collection points are determined. The last step is to specify the path along which sensory data are relayed to the MS. In our experiments, we simulate two WSNs of different sizes. The results show that our method outperforms the others by 70%-80% in terms of the sensor node uptime, power consumption, MS traveling time and the number of RPs.


I. INTRODUCTION
One important characteristic of a wireless sensor network (WSN) is the continuous communication among a large set of sensor nodes (SNs) that are responsible for collecting various data according to their function, such as detecting objects and motion and sensing temperature, humidity, and flicker [1]-[5]. The data are then relayed via the WSN until they reach the base station (i.e., the sink node) [6], which is responsible for storing data for further usage [3]. WSNs are often large, as they are composed of a large number of sensor-node devices for deployment in large areas, e.g., forests for wildfire detection [7], [8], nature monitoring [9]-[11], and deep sea exploration [12], [13]. WSNs are also suitable for use in hard-to-reach areas, where wiring for data transmission can be costly and troublesome. Generally, SNs send sensory data to the sink node via another SN within their transmission range. Hotspot problems are one of the most common problems in WSNs. The SNs close to the sink node tend to lose power rapidly when relaying data. As a large number of distant SNs may exist, the relaying SNs close to the sink node will run out of power before the others [14], [15], causing the entire system to fail.
The associate editor coordinating the review of this manuscript and approving it for publication was Stefano Scanzio.
Mobile sinks (MSs) are often deployed to address the abovementioned problem. MSs travel close to the SNs to collect sensory data [16]-[21] so that the data do not have to be relayed by SNs along paths to the sink node, thereby reducing the SNs' transmission power consumption. However, the travel times of the MSs often exceed the delay limits (DLs) [22]. A method for planning MS travel paths is therefore important, as it helps determine the rendezvous points (RPs) [23], [24], each of which is an SN selected to be in charge of collecting data from other SNs within its range and relaying the aggregated data to the MSs as they arrive. Instead of spending considerable time visiting all SNs, the MSs only need to pass by the RPs to gather the aggregated sensory data from the associated SNs [23], [25]. The RPs must be pinpointed to cover the whole coverage area to reduce the power consumption. Nevertheless, an excessive number of RPs might cause the MS round-trip times to exceed the DLs.
When selecting RPs, we must consider the number, lifetime (remaining power), and locations of the RPs. For example, if a selected RP has low power, instability in data reception may occur. In the case where the selected RP is too far from the other SNs, the other SNs will waste more power on data transmission. As the MSs can communicate with any SN if they are within the SN's communication range, Alsaafin et al. [26] proposed the idea of storing data at RPs located at the boundary of the SNs' communication range to reduce the MS's traveling distances.
However, storing data at such locations might require the SNs to spend more power on transmitting data long distances. To plan relay routes, previous research adopted the minimum spanning tree (MST) method [27], which could greatly simplify the system at the expense of substantial energy spent on relaying data via multiple hops [28]. Other previous works designed RPs to be close to SNs. Such approaches yield low performance in cases when SNs are far from the RPs.
Our proposed method adopts a fuzzy logic system (FLS), which involves the usage of appropriate weights, to assist decision-making processes. When selecting RPs, FLS assists in determining each SN's weight based on its remaining energy and location. The weight reflects the suitability (also the likelihood) of the SN being selected as an RP. When calculating the data-collection distances, FLS is used to help determine the distance from each SN the MS must reach to start collecting the data. The MS should closely approach low-powered RPs before collecting data.
On the other hand, an MS can collect data immediately once it reaches the boundary of the RPs' communication range if the RPs still have a high level of remaining power. When selecting routes, the FLS is responsible for helping each SN decide which node it should send data to among the closest RP and the other SNs.
This research proposes a route planning method that focuses on efficiently planning a route for MSs under the DL constraint while attempting to reduce the energy consumption of all SNs in WSNs. Our method has three phases. The first phase involves determining the number of RPs and planning the paths for the MS. In this phase, we apply FLS to rank each SN's potential to be an RP. We then determine the minimal number of RPs that the MS will spend time visiting before returning to the sink node such that the DL is not violated. Subsequently, we increase the number of RPs by choosing more potential SNs within the MS's communication range while it is traveling along the path.
In the second phase, we determine areas for the MS to collect data. As the MS can communicate with an RP if they are in each other's communication range, our method does not require the MS to reach an RP before it starts collecting data. By calculating weights based on each RP's energy and the number of SNs within its communication range, we leverage FLS to determine the distance from each RP for the MS to collect data. In the last phase, in consideration of each RP's communication range and remaining power, the number of SNs within the range, and the distance from the SNs to itself, FLS is adopted to determine how SNs (not selected as RPs) send data to the MS (i.e., whether to send the data via the RP or via a neighboring SN).
The structure of this paper is as follows. Section II reviews the literature on path design for MS data collection, including designs in which the MS has to reach each RP and designs in which it only visits the area within the RPs' communication range. Section III defines our network models and energy model for conducting experiments and performance evaluation. Section IV presents a path design method consisting of the three main phases mentioned above. Section V presents the running time analysis and comparison with previous work. Section VI presents the system simulation settings and analyzes the experimental results, showing the efficiency of our proposed method. Section VII discusses our design choices, our method's limitations, and future work. Finally, Section VIII concludes our work.

II. RELATED WORKS
This section provides a review of the literature related to path design for MSs by determining data-collection locations and adopting FLS in WSN.
In 2010, Almi'ani et al. [22] discussed the use of MSs to collect data in WSNs to avoid relaying data via multihop paths, thereby reducing the energy consumption of the relaying SNs. The authors proposed a cluster-based algorithm that uses binary search to determine the RP count. An algorithm for solving traveling salesman problems (TSPs) [29] was also used to determine the sequence of nodes in the path along which the MSs travel to collect data. However, cluster-based algorithms are not deterministic when they select RPs: they might occasionally choose improperly positioned SNs to be RPs.
In 2013, Salarian et al. [30] introduced a weighted rendezvous planning (WRP) algorithm to address the problem. The algorithm selected SNs as RPs based on weights determined by the proposed heuristic process, resulting in the distribution of RPs throughout the area. A shortest-path-tree algorithm (SPT) [23], [31] was adopted to determine the optimal RP for relaying the data of each SN. An algorithm for solving the TSP was used to calculate a cycle for the MS to collect the data from all RPs in order. Although this RP-selection method is efficient, SPT routing may result in excessive data transmission/receipt at specific nodes.
VOLUME 9, 2021
In 2016, Cheng and Yu [32] presented a method to reduce the path lengths given by TSP by allowing the MS to collect data once it reaches the SN's communication range. The authors suggested data collection in the area where the MSs could collect data from multiple RPs (where RPs' communication ranges overlap). The method involved determining the number of RPs whose communication ranges overlap and selecting data-collection locations in the overlapping areas. However, RPs might have to transmit data to an MS located farther away, thereby consuming more energy. This method yielded deterministic paths based solely on the RPs' locations, which could lead to inefficient paths.
In Nayak and Devulapalli [33], the base station is able to move to collect sensory data. FLS is applied to assign each SN's weight value, based on battery power, mobility and centrality. The weights are used to cluster SNs and select super cluster heads, aiming to efficiently distribute the load among the SNs, thereby increasing the network lifetime. Nevertheless, the proposed method did not address the path design problem for data collection.
In 2017, Kaswan et al. [34] introduced methods called reduced k-means (RkM) and delay bound reduced k-means (DBRkM) for route planning. The work focused on clustering RPs based on weights and reducing the number of nodes within the RPs' communication ranges to obtain a wider distribution of RPs. A TSP algorithm was also adopted to arrange data-collection locations for the MS to travel along in order. However, MSs were not fully utilized, as the work focused solely on SN management. Cheng et al. [35] presented a data-collection path planning method that depended on area partitioning and the density of SNs in each area that could increase the efficiency in terms of energy consumption. However, the work did not address the case where data might have to be relayed via multihop paths.
In 2018, Alsaafin et al. [26] presented three path-planning methods: reduced energy path (REP), reduced delay path (RDP), and delay bound path (DBP). REP was the most energy efficient; RDP yielded paths along which the MS took the least amount of time to travel; and DBP planned traveling paths for the MS that satisfied the DL constraint by choosing data-collection locations located between paths obtained from REP and RDP. This research allowed data collection once the MS traveled inside the RPs' communication range, thereby reducing the travel path distances. Wen et al. [36] proposed a non-TSP-based routing method called path construction to determine round-trip routes for the MS. The method first created a convex-polygon path formed by all the outer RPs and then adjusted the path by including all the remaining inner RPs.
Amgoth and Annavarapu [37] proposed a method based on the technique called ant colony optimization (ACO) for path design. The method also relies on a directed spanning tree representing the network relay of sensor data. For efficient path design, pheromones are used as weights for selecting RPs. The method gradually selects one RP at a time until the whole path would take the MS longer than the DL to travel. However, the method depends on the tree and selects the next RP only from the neighbors of the current RP, and it also requires the MS to reach the node before gathering the data. Thus, in scenarios where the DL is limited, the path might cover only the nodes lying on certain connected branches of the tree. In contrast, in our technique, we consider the whole WSN topology and are able to select RPs that are distributed across the network. In scenarios where the DL is large, the ACO-based method does not consider increasing the network lifetime. Our proposed method adds RPs as long as the path does not exceed the DL so that the path can be longer and closer to the SNs, thus extending the network lifetime.
Sert et al. [38] addressed the problem of data gathering in WSNs and proposed a method called the two-tier distributed fuzzy logic-based protocol (TTDFP). The authors suggested the use of FLS, relative node connectivity, distance to the sink, and remaining energy for clustering and selecting the clusters' heads. Fuzzy rule parameters were adaptive (instead of being fixed) to avoid inefficient trial and error methods and any human bias. The FLS was also applied to routing based on the average link remaining energy and relative distances. However, the work required head nodes to relay sensory data to the base station. Once the head nodes are selected, their energy might be depleted rapidly, thus reducing the network lifetime.
In addition, Qadori et al. [39] proposed an MS-based data collection method called fuzzy-based mobile agent migration (FuMAM). The FLS-based method arranged the order of the SNs that the MS has to visit to collect data by taking into account the remaining energy, the distance to the source node, and the number of neighbors. However, as the path was based solely on FLS (and not on TSP), it was possibly not the shortest path that passes through all SNs. The work also did not select any SNs as RPs.
In 2019, Wang and Chen [24] introduced a technique for planning paths along which the MS could travel and collect data with the limitation of buffers. The technique, called EARTH, created a tree formed by all SNs and selected RPs based on the hop counts, tree height, and the amount of relay data. The method examined the suitability of SNs before promoting them as RPs. The authors also attempted to reduce the amount of data required for transmission per RP. This research demonstrated a technique in which RP selections were revalidated by considering nearby SNs. As the focus of this work was on managing data traffic, the travel paths for MSs were significantly longer than others.
In 2020, Donta et al. [40] presented a method called hierarchical agglomerative clustering-based data collection (HACDC) to address the path design problem in 3-dimensional WSNs. The work is based on hierarchical agglomerative clustering to determine SN groups and on dendrogram statistical methods to determine RPs. The virtual RPs are determined to increase the efficiency and reduce the RPs' load before designing the paths based on the method called MS tour planning. However, once the method specifies the RPs and the paths, it does not perform path fine-tuning and does not address the DL constraint.
In this paper, we select RPs by leveraging an FLS to determine weights, in contrast to other studies that calculate weights based on equations. The FLS allows multiple membership functions (MFs). Each variable value is transformed by the MFs into multiple membership levels, which are used as inputs in the FLS's rules, thereby providing more flexible value interpretations. We take into account SN energy in the path-determining process, while previous works consider the quantity of neighboring SNs and distances.
While previous works (except [32]) are based only on the TSP algorithm to determine the traveling path for the MS, we propose a method to reduce the path obtained via TSP to determine data-collection locations (within the RPs' communication ranges), allowing more RPs on the path under the same delay-time constraint and reducing the amount of data transmitted by each RP. When selecting the relaying RP for each SN, previous studies choose the RP closest to the SN and depend on MST for determining the relaying paths, while our work is based on each RP's weight (not just the distance). Thus, the loads can be distributed among all RPs with remaining energy, increasing the overall RP longevity.

III. SYSTEM MODELS
We provide a brief description of network and energy models in this section.

A. NETWORK MODELS
To simulate WSNs to study the performance of MSs' data collection, we define variables and model networks based on relevant research [26], [30], [34]. Each network consists of a single sink node that acts as a base station that receives data from an MS. SNs with sufficient energy at appropriate locations are chosen to be RPs. The unselected SNs (non-RP SNs) send their data to specific RPs, which then relay the data to the MS. After the algorithm plans an RP route, the MS travels a round trip along the route to collect data from all the SNs.
In contrast to SNs with always-on power sources, all the SNs are sensitive to energy loss (while the MS is sensitive to travel time). Our method also allows an RP to transmit data once the MS enters the RP's communication range, without the MS reaching the RP's exact location. Certain non-RP SNs are also responsible for receiving data from nearby SNs far from any RPs before forwarding the data to the specific next hop (whether it is an RP or another non-RP SN).
Note that Table 1 presents all the notations used in this paper. Figure 1 shows the network model, which consists of a sink node (shown as the black square) and SNs (shown as the circles). The black circles represent SNs selected as RPs, and the white circles represent the SNs not chosen as RPs (i.e., non-RP SNs). Our model also makes the following assumptions [22], [26], [30], [34]:
• All SNs have the same initial energy level and operate under a limited power constraint.
• Every SN is always able to send data to the sink node via hop-by-hop transmission.
• After node deployment, all nodes are immobile and know their own exact coordinates and the location of the sink node (e.g., from GPS or various localization methods).
• The MS must travel to collect the information from every RP.
• The MS wastes no time collecting data from RPs.
• The MS has enough energy to travel a round trip to collect the data. The battery can be either swappable or a fast-charge battery based on current battery technology.
• If the MS travels to collect data within an RP's communication range but does not truly reach the RP, the data transmission requires more power (due to the longer distance), as in the case when SNs send data to their associated RP.

B. ENERGY MODELS
The most popular fundamental means of modeling WSNs is to focus on SNs' power consumption. According to [41], [42], the power consumption depends on the amount of transmitted data (k) and the distance (d) between the sender and the receiver.
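To illustrate how the consumed energy scales with k and d, the following minimal sketch assumes the widely used first-order radio model, in which transmitting k bits over distance d costs E_elec·k + E_amp·k·d² and receiving k bits costs E_elec·k. The exact form of the paper's energy equations is an assumption here, the function names are hypothetical, and the constants are the 50 nJ/bit and 10 pJ/bit/m² values that the paper adopts from [34].

```python
# Assumed first-order radio model; not necessarily the paper's exact equations.
E_ELEC = 50e-9   # J/bit, electronics energy (value quoted from [34])
E_AMP = 10e-12   # J/bit/m^2, amplifier energy (value quoted from [34])


def tx_energy(k_bits: float, d_m: float) -> float:
    """Energy (J) to transmit k bits over distance d under the assumed model."""
    return E_ELEC * k_bits + E_AMP * k_bits * d_m ** 2


def rx_energy(k_bits: float) -> float:
    """Energy (J) to receive k bits under the assumed model."""
    return E_ELEC * k_bits


if __name__ == "__main__":
    k = 4000  # hypothetical packet size in bits
    for d in (10.0, 50.0, 100.0):
        print(f"d={d:5.1f} m  tx={tx_energy(k, d):.3e} J  rx={rx_energy(k):.3e} J")
```

Under this assumed model the distance term grows quadratically, which is why bringing the MS closer to a low-energy RP reduces that RP's transmission cost.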
Equations (1) and (2) are used to calculate the power consumed by an SN while transmitting and receiving data, respectively. As presented in Table 7, $E_{elec}$ and $E_{amp}$ are set to 50 nJ/bit and 10 pJ/bit/m$^2$, respectively, according to [34].

IV. FUZZY LOGIC-BASED PATH PLANNING (FLPP) FOR DATA GATHERING MSs IN WSNs
Figure 2 shows an overview of the proposed system, which consists of three modules described in the subsections RP-selection module, Collection-spot module, and Data-forwarding module.
1) RP-selection module. This module determines the appropriate number of RPs. Our proposed method starts by considering all SNs according to their weight. SNs with higher weight values are considered first. We also consider the number of RPs along two paths: the one close to all SNs (called TxEPath, to preserve SNs' energy when transmitting data) and the other (called STPath, which is the shortest path for the MSs to collect data at the exact locations of all RPs). Both paths require the MS to travel without exceeding the DL.
2) Collection-spot module. This module pinpoints spots (within the RPs' communication ranges) for the MS to collect data. It considers the TxEPath and STPath to determine a shorter MS traveling path. The FLS weights help in determining how close to the RPs the data-collection locations should be.
3) Data-forwarding module. This module specifies which neighboring node of each SN should be designated the next hop to relay the SN's data. The designated node can be chosen from 1) the RPs within the SN's communication range and 2) other non-RP SNs that are also within the SN's communication range and already have their own designated node. We use FLS to select a designated node based on its remaining power, the number of SNs having the node as their designated node, and its distance from the SN.
The results from the FLPP modules can be improved by redesigning the paths (as discussed in detail in Section IV-C.2). In Section VI, we compare previous works with our proposed method without path redesign (i.e., the paths are determined only once). After the comparisons, we present the experimental results of our enhanced FLPP.

A. RP-SELECTION MODULE
This module adjusts the MS traveling path so that the MS's round-trip time is shorter than or equal to the DL.

1) RANKING SNs (FOR DETERMINING CANDIDATE RPs)
This step ranks SNs in terms of their ability to be RPs. FLS is adopted to determine each SN's weight based on the following characteristics.
• Remaining energy (RE). The SNs that are effective RPs are the ones with high remaining power levels to support data transmission.
• Number of nodes in the transmission range (NNR). Effective RPs are in an area of dense SNs and could have a large number of associated SNs, thereby reducing the overall number of data relays. However, an excessive number of associated SNs could overload the RPs: a suitable area is neither overcrowded nor too sparse.
• Average energy consumption (AEC). AEC is the average energy consumed by each node per MS traveling trip. If an RP consumes a relatively large amount of power, the RP will run out of power before other SNs. Thus, rotation of SNs as RPs (based on their AEC) is necessary.
• Distance from the SN to the centroid of all SNs in its communication range ($D_{SN,C}$). SNs closer to their associated centroid (thus being closer to the SNs within their communication range) should be chosen as RPs. All the associated SNs would then consume a similar amount of power when transmitting data and thus run out of energy almost simultaneously.
After the weights are determined, we rank the SNs as illustrated in Figure 3. Figure 3(a) shows the weight of each node. Figures 3(b-e) and 3(f-h) show the node ranking in the first and second iteration, respectively. In Figure 3(b), $SN_5$ is the first node to be considered as an RP as its weight is highest. $SN_6$ is in $SN_5$'s communication range and is neither ranked nor selected until the next iteration. (To differentiate nodes ranked in the current and the next iteration, the former's communication range is represented by a solid circle while the latter's is represented by a dashed circle.) The process continues ranking the next maximal-weight SN and excluding the nodes in its radius (for the next iteration) until no SNs are available for ranking, as shown in Figure 3(e). Then, the process is repeated for the second iteration, as shown in Figures 3(f-h).
To rank the potential of SNs to be RPs, Algorithm 1 requires the weight of each SN, which is obtained via FLS.
Lines 1-5 initialize the input variables. listSNs is the variable used to store the list of SNs. listRE stores the remaining energy of all SNs. listAEC stores the average energy consumed by each node per MS traveling trip (averaged over all previous trips); it initially has a value of 0. list_of_ListNinR is a variable that stores a list of SNs within the communication range of each SN. $listD_{SN,C}$ is used to store the distance from each SN to its centroid. listCRP stores the sequence of SNs to be considered as RPs obtained by considering the FLS-based weights according to lines 7-9. In lines 10-11, the highest-weight SN is determined. The list of SNs within range of the highest-weight SN is stored in listNinR (line 11). The SN is removed from listSNs (line 13) once
In this paper, we choose triangular MFs, as shown in Figure 4, as they can represent values whose degrees of membership (membership levels) are different. We adjust the MF model based on the results of preliminary experiments and then define the conditions of all the fuzzy rules. The number of rules depends on the number of inputs and the number of MF stages per input. For example, the W1 weighting requires two inputs: RE and NNR. Each input consists of three MF stages (low: L, medium: M, and high: H). Thus, there are 9 possible fuzzy rules ($3^2$), which output 5 MF stages (i.e., very low: VL, low: L, medium: M, high: H and very high: VH), as shown in Table 2. According to the rules, if an SN has more remaining energy (RE), it is assigned a greater W1 value, while the SN's NNR should be at a moderate level.
The proposed fuzzy inference engine (based on an aggregation method called intersection operation) is used to obtain W1 as an output. Figure 5 shows an example of using the fuzzy inference engine according to the 9 rules in Table 2. We assume RE and NNR are 0.8 and 0.6, respectively. In each rule, the membership degrees of RE and NNR are compared: the smaller one is used as the output's membership degree. The outputs of all rules are then aggregated, and defuzzified by determining the center of gravity (CoG) of the aggregated output, yielding the output MF in the first stage of FLS (W1).
In short, the output is obtained by taking the normalized RE and NNR values, which are then transformed by MFs into membership degrees. The degrees are then used for inference by means of each fuzzy rule. The inferred results are weighted and finally aggregated. In other words, the output (W1) is the sum of all membership degrees obtained by applying the MF to the normalized RE and NNR values (according to all the rules) based on the aggregation method.
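The inference steps described above (fuzzification with triangular MFs, rule firing by taking the smaller membership degree, aggregation, and CoG defuzzification) can be sketched as follows. This is only an illustrative sketch: the MF breakpoints, the rule table `RULES`, the use of the maximum to aggregate clipped outputs, and the function names are all assumptions made for the example, since the actual shapes in Figure 4 and the rules in Table 2 are not reproduced in the text.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


# Hypothetical MF breakpoints on normalized [0, 1] inputs and outputs.
IN_MF = {"L": (-0.5, 0.0, 0.5), "M": (0.0, 0.5, 1.0), "H": (0.5, 1.0, 1.5)}
OUT_MF = {"VL": (-0.25, 0.0, 0.25), "L": (0.0, 0.25, 0.5),
          "M": (0.25, 0.5, 0.75), "H": (0.5, 0.75, 1.0),
          "VH": (0.75, 1.0, 1.25)}

# Hypothetical 3 x 3 rule table (RE stage, NNR stage) -> W1 stage.
RULES = {("L", "L"): "VL", ("L", "M"): "L", ("L", "H"): "VL",
         ("M", "L"): "L", ("M", "M"): "H", ("M", "H"): "M",
         ("H", "L"): "M", ("H", "M"): "VH", ("H", "H"): "H"}


def infer_w1(re, nnr, steps=1000):
    """Mamdani-style inference: min for rule firing, max for aggregation, CoG."""
    mu_re = {lab: tri(re, *p) for lab, p in IN_MF.items()}
    mu_nnr = {lab: tri(nnr, *p) for lab, p in IN_MF.items()}
    strength = {}
    for (a, b), out in RULES.items():
        fire = min(mu_re[a], mu_nnr[b])          # smaller degree wins
        strength[out] = max(strength.get(out, 0.0), fire)
    num = den = 0.0
    for i in range(steps + 1):                   # sample the output universe
        y = i / steps
        mu = max(min(s, tri(y, *OUT_MF[out])) for out, s in strength.items())
        num += y * mu
        den += mu
    return num / den if den > 0 else 0.0         # center of gravity


if __name__ == "__main__":
    print(infer_w1(0.8, 0.6))  # the RE = 0.8, NNR = 0.6 example from the text
```

With these assumed rules, an SN with more remaining energy at the same NNR receives a larger weight, which mirrors the qualitative behavior described for Table 2.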
After obtaining W1, we determine W2 by applying FLS to the values of AEC and $D_{SN,C}$. This step is similar to the previous one. The MF for AEC and $D_{SN,C}$ is shown in Figure 4(a), while that for W2 is shown in Figure 4(b). Note that if AEC (average consumed power) and $D_{SN,C}$ (distance between the SN and its centroid) are large, then the SN's potential to be an RP decreases. Table 3 shows all 9 fuzzy rules, whose outputs are aggregated later to determine W2.
To calculate the W3 of each SN, both W1 and W2 are used by FLS according to the fuzzy rules in Table 4. The MF for W1 and W2 is shown in Figure 4(a), while that for W3 is shown in Figure 4(b). The SN with the largest W3 is stored in the CRP. This SN and the other SNs within its communication range are excluded from the weight calculation in the current iteration to avoid selecting RPs that are close to each other. Then, the SN with the second highest W3 is determined. The process is repeated until there are no more SNs to consider. At the end of each iteration, if the number of SNs in the CRP is still not equal to the number of all SNs, the process starts the next iteration by reconsidering the SNs (excluded in the current iteration but not yet in the CRP) to be selected as RPs.
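The iterative ranking described above can be sketched as a greedy loop. This is a simplified illustration rather than Algorithm 1 itself: it assumes the W3 weights are precomputed and fixed, that node identifiers are integers (used only to break ties), and that `neighbors` (a hypothetical name) maps each SN to the set of SNs within its communication range.

```python
def rank_candidates(weights, neighbors):
    """Return the candidate-RP order (CRP).

    weights:   {sn_id: W3 weight}
    neighbors: {sn_id: set of sn_ids within communication range}
    """
    crp = []
    remaining = set(weights)            # SNs not yet placed in the CRP
    while remaining:
        available = set(remaining)      # SNs eligible in this iteration
        while available:
            # Highest weight first; ties go to the lower id (assumption).
            best = max(available, key=lambda s: (weights[s], -s))
            crp.append(best)
            remaining.discard(best)
            available.discard(best)
            # Defer the neighbours of the chosen SN to the next iteration.
            available -= neighbors.get(best, set())
    return crp


if __name__ == "__main__":
    w = {1: 0.9, 2: 0.8, 3: 0.7, 4: 0.6}
    nb = {1: {2}, 2: {1, 3}, 3: {2}, 4: set()}
    print(rank_candidates(w, nb))
```

In the small usage example, node 2 is deferred to the second iteration because it lies within range of the higher-weight node 1, so spatially close nodes do not appear next to each other at the top of the ranking.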

2) TSP TO REDUCE PATH LENGTH BASED ON RADIUS (RBR)
After the CRP is obtained, we focus on determining the appropriate number of RPs (avgRP) to be deployed on the path the MS travels to collect sensor data. However, there are two other outcomes to address: TxEPath (the path that conserves the RPs' transmission energy) and STPath (the path that conserves the MS's energy).
The TSP to RBR procedure has two phases.
Phase I (TSP): Determine TxEPath based on TSP [29] to provide cycles visiting all given RPs.
Phase II (RBR): Determine STPath, which is the path obtained by applying our distance-shortening process, called RBR (reduce path length based on radius), to the TxEPath. RBR attempts to reduce the MS's traveling distance by redirecting the MS path through the area where the RPs' communication ranges overlap because the MS only has to reach the RPs' communication ranges, not the RPs' locations, to collect data. This phase is described in detail below.
6, and $RP_7$) is already selected. The intersections of the first area are $c_1$, $c_2$, and $c_3$; the second area has intersections at $c_4$, $c_5$, and $c_6$. The intersections of ($RP_8$, $RP_9$) are $c_7$ and $c_8$. As shown in Figure 6, $c_1$, $c_4$, and $c_7$ reduce the path distance the most.
2) Path reduction based on shortcuts. In this step, we consider reducing the path length when the MS has to travel to collect data from the remaining RPs whose communication range does not overlap with others. Instead of visiting the exact RPs' locations to collect their data, the MS collects data when it is within the RPs' communication range; see Figure 6 as an example. The MS does not have to travel from $c_1$ to $RP_4$ and from $RP_4$ to $c_4$ because it just needs to reach the boundary of $RP_4$'s communication range to collect $RP_4$'s data.
To reduce the path length based on shortcuts to collect data from $RP_n$, we first consider the dashed line from $RP_{n-1}$ to $RP_{n+1}$. If the line passes through $RP_n$'s communication range, the line is included in the MS traveling path. (The data-collection location lies on the straight line from $RP_{n-1}$ to $RP_{n+1}$ and is closest to $RP_n$, shown as a cross in Figure 7a.) Otherwise, the new data-collection location is the intersection between the boundary and the line parallel to the direct line from $RP_{n-1}$ to $RP_{n+1}$, as shown in Figure 7b.
The two steps in RBR are represented by reduce() in line 7 of Algorithm 3 and Equation (5). Algorithms 2 and 3 illustrate the detailed processes for determining TxEPath and STPath, respectively.
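The shortcut step can be sketched geometrically as follows. This is one possible reading of the description, not the authors' reduce() routine: it assumes 2-D coordinates, treats the line from the previous to the next RP as a segment, and interprets the "parallel line" case as the boundary point of the RP's range that is nearest to that segment. The function names are hypothetical.

```python
import math


def closest_point_on_segment(p, a, b):
    """Closest point to p on the segment from a to b."""
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:
        return (float(ax), float(ay))
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / seg_len2
    t = max(0.0, min(1.0, t))            # clamp to the segment
    return (ax + t * dx, ay + t * dy)


def collection_point(prev_pt, rp, next_pt, radius):
    """Data-collection location for rp between its path neighbours."""
    q = closest_point_on_segment(rp, prev_pt, next_pt)
    dist = math.dist(rp, q)
    if dist <= radius:
        # The direct line crosses the range: collect at its closest point.
        return q
    # Otherwise move from rp toward the line and stop at the range boundary.
    scale = radius / dist
    return (rp[0] + (q[0] - rp[0]) * scale, rp[1] + (q[1] - rp[1]) * scale)


if __name__ == "__main__":
    print(collection_point((0, 0), (5, 3), (10, 0), 5))  # line crosses range
    print(collection_point((0, 0), (5, 8), (10, 0), 5))  # line misses range
```

In both cases the returned point lies within the RP's communication range, so the MS can collect the data without visiting the RP's exact location.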
The greater the number of RPs on the TxEPath is, the less energy the SNs consume during data transmission. Algorithm 2 determines the upper bound of the number of RPs. The binary-search algorithm is used to determine the maximum number of RPs (on the TxEPath) possible while the round-trip traveling time is still less than DL.
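The binary search over the number of RPs can be sketched as below. This is a minimal illustration under stated assumptions: a nearest-neighbor tour stands in for the TSP solver of [29], the MS moves at a constant `speed`, the candidates in `ranked_sns` are already ordered by weight, and the round-trip time is assumed to grow with the number of RPs (which the binary search relies on). All names are hypothetical.

```python
import math


def nn_tour_length(sink, points):
    """Round-trip length of a nearest-neighbour tour (stand-in for TSP)."""
    unvisited = list(points)
    cur, total = sink, 0.0
    while unvisited:
        nxt = min(unvisited, key=lambda p: math.dist(cur, p))
        total += math.dist(cur, nxt)
        unvisited.remove(nxt)
        cur = nxt
    return total + math.dist(cur, sink)


def max_rps(sink, ranked_sns, speed, delay_limit):
    """Largest m such that touring the first m candidates meets the DL."""
    lo, hi = 0, len(ranked_sns)
    while lo < hi:
        mid = (lo + hi + 1) // 2         # upper middle avoids an endless loop
        if nn_tour_length(sink, ranked_sns[:mid]) / speed <= delay_limit:
            lo = mid                     # feasible: try more RPs
        else:
            hi = mid - 1                 # infeasible: try fewer RPs
    return lo


if __name__ == "__main__":
    candidates = [(i, 0) for i in range(1, 7)]
    print(max_rps((0, 0), candidates, speed=1.0, delay_limit=7.0))
```

The same search could be applied with the reduced path length in place of the tour length to obtain the corresponding bound for STPath.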
Algorithm 2 differs from Algorithm 3 in that the former applies TSP to determine cycles (which are the MS traveling paths) to specify the visitation sequence of the RPs, while the latter is based on RBR to yield paths with reduced length. Algorithm 3 also provides the maximum number of RPs possible under the DL constraint.
After TxEPath and STPath (with the highest number of RPs possible) are obtained such that the MS's round-trip traveling time does not exceed the DL, avgRP can be obtained according to Equation (3). The average value is the number of RPs, which can be increased or decreased to adjust the MS traveling path. TxEPath is determined using avgRP according to Equation (4). STPath is obtained by applying RBR to TxEPath, as shown in Equation (5). In our preliminary results, a number of RPs equal to the average of maxRP_Tx and maxRP_ST yields the best results. A number of RPs close to maxRP_ST is insufficient: RPs waste energy transmitting data to the MS over longer distances. A number of RPs close to maxRP_Tx is excessive, as it results in premature death of the RPs receiving data from the large number of SNs.
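The averaging described above can be written out as follows. This is a reconstruction from the prose rather than a quotation of Equation (3); in particular, whether and how the average is rounded to an integer is not stated in this section and is left open here.

```latex
avgRP = \frac{maxRP\_Tx + maxRP\_ST}{2}
```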
$$\mathrm{TxEPath} = \mathrm{TSP}(\mathit{sink\_Node}, RP_1, RP_2, \ldots, RP_{avgRP}) \tag{4}$$
$$\mathrm{STPath} = \mathrm{reduce}(\mathrm{TxEPath}) \tag{5}$$

• The distance from the SN to FinPath must be less than that from the SN to its closest RP.

Figure 8 shows an example of including more RPs after FinPath has been obtained. FinPath is shown as the dashed line. The distances from the candidate SNs (SN 5 and SN 6 in the figure) to their closest RP are the lengths of the bold lines. The shortest distances from the candidate SNs to FinPath (within its communication range) are equal to the thin lines' lengths. SN 1 , SN 2 , SN 4 , and SN 7 are not candidates for RPs as they are farther from FinPath than from their closest RP. SN 3 has no FinPath in its communication range. Because they are closer to FinPath than to their closest RPs, SN 5 and SN 6 are chosen as additional RPs.
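The selection illustrated in Figure 8 can be sketched as a distance test against the path. The sketch below is an illustration under stated assumptions: FinPath is given as a list of waypoints (repeat the first waypoint at the end to close the cycle), an SN qualifies when FinPath passes within its communication range and is closer to it than its closest RP, and the helper names are hypothetical.

```python
import math

def dist_point_segment(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.dist(p, a)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.dist(p, (ax + t * dx, ay + t * dy))

def dist_to_path(p, waypoints):
    """Shortest distance from p to a polyline given by its waypoints."""
    return min(dist_point_segment(p, a, b)
               for a, b in zip(waypoints, waypoints[1:]))

def additional_rps(sns, rps, fin_path, comm_range):
    """SNs that FinPath passes close enough to serve as extra RPs."""
    chosen = []
    for sn in sns:
        d_path = dist_to_path(sn, fin_path)
        d_rp = min(math.dist(sn, rp) for rp in rps)
        # FinPath must be in range and closer than the SN's closest RP.
        if d_path <= comm_range and d_path < d_rp:
            chosen.append(sn)
    return chosen
```

An SN such as SN 3 in Figure 8 fails the first test (no part of FinPath in range), while SN 1 , SN 2 , SN 4 and SN 7 fail the second.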

B. COLLECTION-SPOT MODULE (REDUCE PATH LENGTH FROM STPATH TO TxEPATH)
After the previous module, we obtain STPath (reduced by means of the proposed RBR method). In this section, we attempt to form FinPath based on the locations where the MS collects data from each RP. The locations are determined based on the RP's SN density and remaining energy with the aim of balancing the load among RPs with different SN densities.
If RPs are heavily loaded, the MS should approach closer to the RPs before collecting their data so that the RPs' energy lasts longer. In contrast, if the RPs' load is light, we can adjust the MS traveling path (i.e., the RPs transmit data at longer distances) to reduce the MS traveling time. The data-collecting positions depend on the weights, each of which is based on the corresponding RP's energy trend. The weight values are calculated from a fuzzy system and are converted into percentages for adjusting STPath towards TxEPath and finally attaining FinPath (which is still under the DL constraint), as shown in Figure 9.
To determine the RPs' energy-trend weights (W4), FLS is used in a similar way as when determining the RP-candidate weights, as previously described for the RP-selection module. Two factors are considered: the RP's remaining energy (RE) and the number of nodes in the transmission range (NNR), based on Algorithm 4. The MFs of RE and NNR are depicted in Figure 4(a), while that of W4 is shown in Figure 4(b). The FLS rules are shown in Table 5.
Lines 1-2 in Algorithm 4 initialize the variables. W4 in Line 3 represents the energy-trend weight obtained from FLS based on the values of RE and NNR. Line 5 shows ratioRP, which is the greatest possible distance percentage the data-collection locations can be adjusted away from each RP's position (in each iteration). Lines 6-14 show how FinPath is determined by gradually adjusting each RP's data-collection location in TxEPath towards the one in STPath until the new path has a round-trip traveling time less than DL, as illustrated in Figure 9.
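The gradual adjustment in Lines 6-14 can be pictured as an interpolation between the two paths. The sketch below is not Algorithm 4 itself: the per-RP values in shares are hypothetical stand-ins for the percentages derived from W4 and ratioRP (a larger share lets the collection point of that RP move faster towards its STPath location), the step size is arbitrary, and both paths are assumed to list the sink and the RPs in the same order.

```python
import math

def cycle_length(points):
    """Length of the closed tour visiting points in order."""
    closed = points + points[:1]
    return sum(math.dist(a, b) for a, b in zip(closed, closed[1:]))

def blend(tx_pts, st_pts, shares, scale):
    """Move each TxEPath point towards its STPath point by scale * share."""
    pts = []
    for (tx, ty), (sx, sy), share in zip(tx_pts, st_pts, shares):
        f = min(1.0, scale * share)
        pts.append((tx + f * (sx - tx), ty + f * (sy - ty)))
    return pts

def find_fin_path(tx_pts, st_pts, shares, speed, deadline, step=0.1):
    """First blended tour whose round-trip time is below the deadline."""
    if any(s <= 0 for s in shares):
        raise ValueError("shares must be positive")
    k = 0
    while True:
        scale = k * step
        pts = blend(tx_pts, st_pts, shares, scale)
        if cycle_length(pts) / speed < deadline:
            return pts
        if all(scale * s >= 1.0 for s in shares):
            return None  # even the fully reduced path misses the deadline
        k += 1
```

Stopping at the first feasible tour keeps the data-collection locations as close to the RPs as the deadline allows, which matches the stated aim of sparing heavily loaded RPs.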
C. DATA-FORWARDING MODULE
1) SELECT RP CANDIDATES
After RPs are obtained, this step determines the designated nodes for each SN. The designated nodes can be selected from either RPs or SNs, which are in charge of relaying sensor data further to an RP. To choose a designated node for each SN, we consider not only the distance between the SN and RPs but also its remaining energy. The selection starts by considering the designated node for the SN closest to one of the RPs.

Then, the process assigns one designated node to the SN that is second closest to one of the RPs. The process continues until all SNs have a designated node.
To determine a designated node appropriate for relaying data for SN i , we consider all candidate nodes (SN j ) within the SN i 's communication range. Note that a candidate SN j can be either one of the RPs or another SN that is a designated node already assigned to other SNs. In addition, it is necessary to consider the remaining energy of all nodes lying in the j th candidate path from SN i via SN j to the corresponding RP (RP j ), denoted as candidatePath i,j .
The node closest to SN i is not always SN i 's designated node. As shown in Figure 10, FLS is adopted to determine the weights of all candidate nodes (SN j ) based on the following factors:
• The average energy of all the nodes in candidatePath i,j (AvgE i,j ). (If AvgE i,j is already low, assigning SN j to SN i might cause the energy of the nodes in the j th path to be depleted faster.)
• The distance from SN i to RP j via the candidate node SN j (Dist2RP i,j ).
• The number of nodes lying in candidatePath i,j (NNG i,j ).

Table 6 shows all 27 new rules, each of which yields a W5 value as output based on 7 membership levels: very low (VL), low (L), little low (LL), medium (M), little high (LH), high (H), and very high (VH).

Figure 11 shows an example of selecting designated nodes. A designated node is selected for SN 1 first since it is closest to one of the RPs. Figure 11(a) shows the weights of the candidate nodes (RP 1 and RP 2 with the arrow dashed line) obtained from the FLS. Figure 11(b) shows that the designated node of SN 1 is RP 1 (shown with the arrow solid line) since its weight is higher than that of RP 2 . Figure 11(c) shows the designated-node selection for SN 3 (SN 3 is the second closest SN to one of the RPs). SN 3 's candidate nodes are SN 1 and RP 2 . Although SN 1 is closer to SN 3 , the path from SN 3 to RP 2 is shorter than that to RP 1 via SN 1 : SN 1 's weight is therefore less than that of RP 2 . Figure 11(e) shows the designated-node selection for SN 2 , where SN 2 can choose to send data via 3 candidate paths: the direct path from SN 2 to RP 1 , candidatePath 3,1 to RP 1 via SN 1 , and candidatePath 3,2 to RP 2 via SN 3 . Because the weight of SN 1 is the highest, it is assigned as the designated node for SN 2 .
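The order in which designated nodes are assigned can be sketched independently of the fuzzy rules. In the sketch below, the callback weight(sn, candidate) is a hypothetical placeholder for the W5 value that the FLS would produce from AvgE, Dist2RP and NNG; the rules of Table 6 are not reproduced. Nodes are (x, y) tuples, and an SN with no candidate in range at its turn is left unassigned (None), which is a simplification made here.

```python
import math

def assign_designated_nodes(sns, rps, comm_range, weight):
    """Greedy assignment, starting from the SN closest to any RP.

    Returns a dict mapping each SN to its designated node (an RP or
    an already-assigned SN), or to None if nothing is in range.
    """
    def nearest_rp_dist(sn):
        return min(math.dist(sn, rp) for rp in rps)

    assigned = {}
    for sn in sorted(sns, key=nearest_rp_dist):
        # Candidates: RPs in range, plus SNs that already have a route.
        candidates = [rp for rp in rps if math.dist(sn, rp) <= comm_range]
        candidates += [other for other, node in assigned.items()
                       if node is not None
                       and math.dist(sn, other) <= comm_range]
        # Pick the candidate with the highest weight (W5 in the paper).
        assigned[sn] = max(candidates, key=lambda c: weight(sn, c),
                           default=None)
    return assigned
```

With a single RP at (0, 0), a range of 4 m and SNs at (3, 0) and (6, 0), the nearer SN is attached to the RP first and then becomes the only candidate for the farther SN.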

2) ENERGY-AWARE DYNAMIC PATHS
Each node's remaining energy must be monitored as its power might run out prematurely. We propose a metric (T DP ) that compares 1) the average remaining energy (i.e., the difference between the initial energy (E init ) and the average expended energy (AEC)) with 2) the least remaining energy (min(RE)) among all nodes in the network, according to Equation (6).
Based on our experiments, path redesign (rerouting) should be performed when T DP is greater than a threshold, indicating that the gap between the average remaining energy and min(RE) might be excessive. If the threshold is set too low, path redesign is triggered more often, but there might be very little or no change in the new path. If the threshold is set higher, the resulting path may change significantly from the original. However, the larger the threshold is, the greater the number of nodes that run out of power prematurely. Therefore, T DP is calculated after each round of MS data collection. If T DP is less than the threshold, the MS travels along the same path in the next round; otherwise, the MS roams along a newly designed path.

V. RUNNING TIME ANALYSIS & SPACE COMPLEXITY
In this section, we provide the running time analysis and space complexity of our proposed method consisting of 3 major modules: RP selection, data-collection-location determination, and designated-node selection.

A. CALCULATING THE NUMBER OF RPs AND DESIGNING THE MS TRAVELING PATH
Given that there are N SNs deployed throughout a WSN area, the process of selecting SNs as RPs consumes the most running time, O(3N + 2N^2) = O(N^2), consisting of the running times of the following subprocesses: obtaining

D. SPACE COMPLEXITY
In regard to Algorithm 1, 10 lists and 4 integer variables require 40N + 16 bytes. Algorithm 2 involves 2 lists and 6 integer variables, demanding an additional 8N + 24 bytes. Algorithm 3 works similarly to Algorithm 2 but requires one more list, so 12N + 24 bytes are needed. In Algorithm 4, data are stored in 6 lists and 3 integer variables (i.e., 24N + 12 bytes). In total, the space requirement of our method is 84N + 76 bytes, i.e., O(N). However, all the algorithms run consecutively (from Algorithm 1 to 4), and the space that is no longer used can be released. Therefore, our method requires only 40N + 16 bytes at most.
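The byte counts above can be checked with a short calculation. The sketch assumes 4 bytes per list element and per integer variable, which is the size implied by "10 lists and 4 integer variables require 40N + 16 bytes"; the names used are illustrative only.

```python
WORD_BYTES = 4  # assumed size of one list element or integer variable

# (number of lists, number of integer variables) per algorithm, from the text
FOOTPRINT = {1: (10, 4), 2: (2, 6), 3: (3, 6), 4: (6, 3)}

def algorithm_bytes(n, lists, ints):
    """Memory of one algorithm for n sensor nodes."""
    return WORD_BYTES * (lists * n + ints)

def total_bytes(n):
    """Memory if all four algorithms kept their data simultaneously."""
    return sum(algorithm_bytes(n, l, i) for l, i in FOOTPRINT.values())

def peak_bytes(n):
    """Memory if each algorithm releases its data before the next runs."""
    return max(algorithm_bytes(n, l, i) for l, i in FOOTPRINT.values())
```

For any N, total_bytes(N) equals 84N + 76 and peak_bytes(N) equals 40N + 16, since Algorithm 1 has the largest footprint of the four.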

VI. EXPERIMENTAL RESULTS
In this section, we provide the experimental settings and results.

A. EXPERIMENTAL SETTINGS
We evaluate the performance of our proposed system via simulations using MATLAB R2018b. The proposed algorithm is evaluated in three different network scenarios (i.e., areas of 220 m × 220 m [34], 500 m × 500 m [26] and 1000 m × 1000 m), in which SNs are deployed randomly with a uniform distribution, to demonstrate the efficiency of our proposed algorithm and determine whether it is flexible for both small and large areas. The sink node is always at (0,0) in the lower left position, as in [26]. All SNs have the same initial energy (2 J) [34] and have 128 bytes of packet data to send [26]. We also assume that the MS travels at 2 m/s [34] and has no energy constraint but is subject to the DL, i.e., 275 [34], 1200 [26] and 2400 seconds when the area size is 220 m × 220 m, 500 m × 500 m and 1000 m × 1000 m, respectively. The T DP threshold is set to 5.
We compare FLPP and FLPP_RPP with another algorithm called delay bound reduced K-means (DBRkM) [34], which selects RPs based on K-means and determines the MS traveling path based on a weight function. In addition, we compare WRP [30] and CB [22] (operating under the DL constraint) with these three methods. All comparison results are shown in graphs (together with the corresponding 95% confidence intervals) regarding the following aspects: 1) average energy consumption of all data-collection rounds, 2) network lifetime, i.e., the duration between the time the system starts and the time the first SN's energy is depleted, 3) the MS's average traveling distance for data collection, and 4) the number of RPs. Each evaluation consists of 20 trials on average. Finally, we average the remaining results.

B. PERFORMANCE EVALUATION OF FLPP AND FLPP_RPP IN A 220 M × 220 M AREA
We evaluated the performance of FLPP and FLPP_RPP for a small area in terms of the following aspects.

1) AVERAGE ENERGY CONSUMPTION
Figure 12 shows the average power consumption versus the number of SNs. FLPP and FLPP_RPP always consume less power than DBRkM, WRP, and CB do because of the different numbers of RPs. (FLPP spends 14%, 54% and 64% less energy than DBRkM, WRP and CB, respectively. FLPP_RPP outperforms FLPP by 0.6%.) CB randomly selects RPs, resulting in RPs at unsuitable locations. WRP uses weights to select RPs, so it outperforms CB. DBRkM yields higher performance than WRP because of its enhanced RP-selection method. FLPP and FLPP_RPP initially attempt to select as many RPs as possible, thereby increasing the likelihood that nodes transmit data directly to the MS without passing data through the RP. The 95% confidence intervals are shown as the I shape on the top of each bar.

2) NETWORK LIFETIME
Figure 13 shows a performance evaluation of the network lifetime (i.e., duration from the network start time until the first node runs out of energy) as the number of SNs increases. The network lifetimes of FLPP and FLPP_RPP are significantly longer than those of DBRkM, WRP, and CB because they allow more RPs to be deployed. (FLPP allows nodes to live 58%, 53% and 60% longer than DBRkM, WRP and CB do, respectively. In this aspect, FLPP_RPP outperforms FLPP by 10%.) As a result, the number of multihop transmissions in the network is smaller, lowering the overall RP energy consumption. (Note that RPs often run out of energy before the SNs do.) The communication protocol, including signaling distribution for path redesign, is not within the scope of this work: we did not include the energy consumed by control signaling in the experiments.

Figure 14 shows the length of the MS traveling paths as the number of SNs increases: no paths are longer than the 550 m limit (MS speed × DL = 2 × 275 = 550).
FLPP, FLPP_RPP and WRP utilize the DL better than the others, as more time is required by the MS to traverse the paths yielded by FLPP, FLPP_RPP and WRP. This is because FLPP and FLPP_RPP use the proposed function (i.e., reducing path length from STPath to TxEPath) to design routes, while WRP iteratively verifies that all nodes are suitable for being RPs and that the path distance does not exceed the limit.

Figure 15 shows the number of RPs versus the number of SNs. FLPP and FLPP_RPP always deploy more RPs than DBRkM, WRP, and CB because FLPP and FLPP_RPP allow the MS to collect the data within the RPs' communication ranges. Consequently, the MS has more remaining energy to collect data from more RPs under the same DL limitation. (FLPP deploys 2.8%, 85%, 94% and 96% more RPs than FLPP_RPP, WRP, CB and DBRkM, respectively.)

C. PERFORMANCE EVALUATION OF FLPP AND FLPP_RPP IN A 500 M × 500 M AREA
We evaluated the performance of FLPP and FLPP_RPP in a large area in terms of the following aspects.

1) AVERAGE ENERGY CONSUMPTION
Figure 16 shows the average power consumption as the number of SNs increases. Similar to the results in the experiments with a small area, FLPP and FLPP_RPP always consume less power than DBRkM, WRP, and CB do due to the larger DL, which contributes to the lower energy consumption in all traveling rounds. FLPP consumes 17%, 26% and 53% less energy than DBRkM, WRP and CB, respectively, and FLPP_RPP outperforms FLPP by 3%.

2) NETWORK LIFETIME
Figure 17 illustrates the network lifetime as the number of SNs increases. FLPP and FLPP_RPP have a longer overall network life than DBRkM, WRP, and CB do due to the increase in the number of RPs and the larger DL. The network based on FLPP lives 48%, 36% and 52% longer than those based on DBRkM, WRP and CB, respectively, and the network lifetime of FLPP_RPP is 13% longer than that of FLPP.

Figure 18 illustrates the path lengths the MS travels to collect data versus the number of SNs. FLPP, FLPP_RPP and WRP have higher utilization of the DL than CB and DBRkM do (i.e., the paths obtained from FLPP, FLPP_RPP and WRP are longer but do not exceed the DL).

Figure 19 shows the number of RPs versus the number of SNs. FLPP and FLPP_RPP always deploy more RPs than DBRkM, WRP, and CB because FLPP and FLPP_RPP allow the MS to collect the data within the RPs' communication ranges. Consequently, the MS has more remaining energy to collect data from more RPs under the same DL limitation. (FLPP deploys 0.2%, 70%, 82% and 86% more RPs than FLPP_RPP, WRP, CB and DBRkM, respectively.)

D. PERFORMANCE EVALUATION OF FLPP AND FLPP_RPP IN A 1000 M × 1000 M AREA
We evaluated the performance of FLPP and FLPP_RPP in a very large area in terms of the following aspects.

1) AVERAGE ENERGY CONSUMPTION
Figure 20 shows the average power consumption as the number of SNs increases. Similar to the results in the experiments with the small area, FLPP and FLPP_RPP always consume less power than DBRkM, WRP, and CB due to the larger DL, which contributes to the lower energy consumption in all traveling rounds. FLPP consumes 34%, 50% and 65% less energy than DBRkM, WRP and CB, respectively, and FLPP_RPP outperforms FLPP by 1%.

2) NETWORK LIFETIME
Figure 21 illustrates the network lifetime as the number of SNs increases. FLPP and FLPP_RPP have a longer overall network life than DBRkM, WRP, and CB due to the increase in the number of RPs and the larger DL. The network based on FLPP lives 47%, 19% and 32% longer than those based on DBRkM, WRP and CB, respectively. The network lifetime of FLPP_RPP is 7% longer than that of FLPP.

Figure 22 illustrates the path lengths that the MS travels to collect data versus the number of SNs. FLPP, FLPP_RPP and WRP have higher utilization of the DL than CB and DBRkM do (i.e., the paths obtained from FLPP, FLPP_RPP and WRP are longer but do not exceed the DL).

Figure 23 shows the number of RPs versus the number of SNs. FLPP and FLPP_RPP always deploy more RPs than DBRkM, WRP, and CB because FLPP and FLPP_RPP allow the MS to collect the data within the RPs' communication ranges. Consequently, the MS has more remaining energy to collect data from more RPs under the same DL limitation. (FLPP deploys 0.7%, 87%, 89% and 92% more RPs than FLPP_RPP, WRP, CB and DBRkM, respectively.)

A. SIGNALING AND TASK ASSIGNMENT
The communication regarding task assignment for SNs (whether they are selected as RPs, assigned as designated nodes for other SNs, or which nodes they should send data to) is outside the scope of this paper. In practice, once the WSN starts, the sink node is in charge of task assignment and path design. Control signals might be distributed by the sink node by means of broadcasting, multicasting, or a gossip protocol.
After the MS completes each traveling round, it updates the sink node with all the collected information, including each SN's remaining energy (making battery recharge or replacement possible). The sink node may assign SNs new tasks and design a new path for the MS to travel to collect data while distributing the control signals. In particular, the MS transmits signals to each RP, which then distributes the signals further to its associated SNs.
In our proposed method, we also rotate SNs to be RPs to prolong the WSN's lifetime. However, path redesign and the assignment of new tasks to SNs should not be conducted every time the MS returns because they might impose more computational time and waste more energy for control signaling. For example, path rerouting can be conducted periodically (e.g., after the MS completes n rounds). In our work, based on Equation (6), we monitor the difference between the average energy and the minimum energy of all the SNs: the path is not changed until the difference is greater than a specified threshold.
In general, research works that require the MS to collect data are based on the centralized concept. All processes in our proposed method are executed by the centralized sink node (i.e., the base station), which has an unlimited power supply. The base station is required only when the MS returns from data gathering and needs to transfer the sensor data to the base station. Distributed route computation demands more computational energy from SNs, reducing the network lifetime and increasing expenses on SNs' battery replacements (especially if the SNs are deployed at locations that are very hard to access).
Furthermore, the centralized and distributed methods can be complementary to each other. In cases where the MS is down, these events are detected by the base station, and SNs close to the base station could be triggered to perform distributed routing, as introduced by other works [44]- [47].

B. UPON ENERGY DEPLETION
According to the experimental results, RPs generally run out of energy faster than SNs do. The RPs consume energy according to Equation (2). Specifically, the greater the number of SNs associated with an RP, the shorter the RP's longevity. We can address this issue by various means, such as supplying larger batteries to RPs or increasing the number of RPs to help balance the load. However, the former method requires static RP locations and has a higher cost. For the second method, a greater number of RPs results in a longer MS traveling time, which is limited by the DL constraint. Consequently, we proposed a method to reduce the path length (called RBR) such that the number of RPs can be increased while the path still satisfies the DL constraint.
The energy of RPs, designated nodes, or relay nodes might be depleted before the MS returns to the sink node to have a new path redesigned. In such cases, the data of the SNs associated with the RPs or the designated nodes are lost; the maximum duration of the loss is equal to DL. The data can be collected again once the path is rerouted by the sink node.

C. DESIGN OF THE FLS
SNs are selected as RPs according to their weight, based on NNR, AEC, RE, and D SN ,C . NNR and D SN ,C indicate whether the SN is located in the high-density area or in the center of other related SNs. RE and AEC help in rotating SNs (with remaining energy) to be RPs. The distance between a pair of RPs appears useful for distributing RPs across the WSN. However, when the distance is included in FLS, it makes our preliminary results worse, as it does not favor RPs in high-density areas.
To determine a designated node for each SN, a promising approach is to select the RP closest to the SN as the designated node. However, such selection might overload certain RPs (and shorten the RPs' lifetime), as they are surrounded by dense SNs. In addition, MST is a potential alternative for selecting designated nodes. MST can determine the shortest path for sensor-data transmission; however, the SNs close to RPs might have to forward considerable loads from other SNs to the RPs, causing premature energy exhaustion. Instead of considering only distance, we rely on FLS to calculate each candidate's weights. The weights are determined by taking into account each candidate's remaining energy, the number of SNs associated with the candidate, and the distance from the nodes to their RP. As a result, the load is distributed more widely among all candidates, reducing the energy consumption of both designated nodes and SNs.
Our fuzzy rules designed in this paper are nonadaptive. They are suitable for uniformly distributed node placements. In the situation where nodes are distributed extremely nonuniformly, the rules may not be efficient enough to obtain weights for selecting appropriate RPs. In the future, we plan to address such issues and design adaptive and more flexible fuzzy rules, by focusing on both node density and distances.

D. RUNNING TIME ANALYSIS
Let N and M be the numbers of SNs and RPs, respectively. According to Section V, the runtime of the overall method is O(N^2 + M^3 log M), while those of CB and WRP are O(N^2 log N) and O(N^3), respectively. Our proposed method outperforms CB and WRP in terms of running time in the case where N is much larger than M. Specifically, our method is efficient if 1) the density of SNs is high, i.e., one RP is the designated node of many SNs, and 2) the DL is low, causing the number of RPs (M) to be small. Although our method requires more running time in the situation of low density and high DL, the simulation results show that it takes less than 10 minutes to execute the whole process, which is satisfactory for applications that are not time sensitive.
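To make the comparison concrete, the three growth rates can be evaluated for sample values of N and M. The sketch below ignores all constant factors and lower-order terms, so the numbers are only proxies for the asymptotic behavior, not predicted running times; the function names are illustrative.

```python
import math

def cost_proposed(n, m):
    """Growth proxy for O(N^2 + M^3 log M); m must be at least 1."""
    return n ** 2 + m ** 3 * math.log2(m)

def cost_cb(n):
    """Growth proxy for O(N^2 log N)."""
    return n ** 2 * math.log2(n)

def cost_wrp(n):
    """Growth proxy for O(N^3)."""
    return n ** 3

# Dense network with few RPs: the M-dependent term is negligible.
dense = (cost_proposed(1000, 20), cost_cb(1000), cost_wrp(1000))

# As M approaches N, the M^3 log M term dominates instead.
sparse = (cost_proposed(1000, 1000), cost_cb(1000), cost_wrp(1000))
```

With N = 1000 and M = 20 the proxy for the proposed method is the smallest of the three, whereas with M = N = 1000 it becomes the largest, which is consistent with the two conditions stated above.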

E. MULTIPLE MSs
In this paper, we assume, without loss of generality, that the MS has enough power to travel a round trip under the DL constraint. However, in real WSNs, there might be some cases where the MS power storage is limited; in these scenarios the administrator can decrease the DL value and our method can adjust the path accordingly. In the future, we plan to study the MS's energy consumption model by determining the correlation between the MS's energy capacity and the round-trip distance (i.e., DL) to estimate the DL value based on the MS's energy capacity. In addition, if the capacity is relatively small so that the algorithm cannot find a single cyclic path that covers the whole area, we plan to study how to design multiple cyclic paths so that the MS can travel along them and return to the base station to recharge or swap the battery before traveling on the other paths to cover the whole area.
Furthermore, it is possible to deploy multiple MSs to collect data in a shorter time period. In addition, additional MSs allow more RPs under the same DL constraint, resulting in a lower load at each RP and a longer WSN lifetime. As future work, we will address how to coordinate multiple MSs and how to have the MSs efficiently recharge the SNs' energy.

VIII. CONCLUSION
This research presents a novel route-planning method. We attempt to determine efficient cycles for MSs under delay limitations while reducing the energy consumption of SNs. The method consists of 3 phases. The first step is to determine the number of RPs and create a cyclic path. In the second step, data-collection locations are specified so that the MSs can travel within the RPs' communication ranges and collect data effectively. Last, paths for relaying sensory data from SNs are calculated, allowing each SN to choose its best RP while efficiently reducing the overall energy consumption of RPs. The experiments showed that FLPP_RPP and FLPP outperform DBRkM, WRP and CB.
In the small-area experiments, FLPP consumes 14%, 54% and 64% less energy, on average, than DBRkM, WRP and CB do, respectively. Compared with FLPP, FLPP_RPP consumes 0.6% less energy on average. In terms of the network lifetime, the energy of the networks based on FLPP lasts 58%, 53% and 60% longer than that of DBRkM, WRP and CB, while FLPP_RPP outperforms FLPP by 10%. In the larger-area experiments, FLPP uses 17%, 26% and 53% less energy, on average, than DBRkM, WRP and CB do, respectively, and FLPP_RPP uses 3% less energy than FLPP. The average lifetime of the FLPP-based network is 48%, 36% and 52% longer than that of DBRkM, WRP and CB, respectively, and FLPP_RPP outperforms FLPP by 13%.
Our proposed scheme is centralized in that the sink node is in charge of grouping SNs and choosing a relaying path for each SN. Although just one MS is deployed in the WSN, the proposed algorithm could be adapted to support multiple MSs. Future research could focus on how to use MSs to charge SNs to increase the system uptime.