An Improved Equal Hierarchical Cluster-Based Routing Protocol for EH-WSNs to Enhance Balanced Utilization of Harvested Energy

Enhancing the utilization of harvested energy and balancing its consumption are the main challenges in designing routing protocols for energy harvesting wireless sensor networks (EH-WSNs). This paper considers an efficient hierarchical cluster-based routing method that enhances harvested energy utilization while ensuring balanced energy consumption, reliability and connectivity in EH-WSNs. To this end, a new hierarchical cluster-based routing protocol for EH-WSNs is proposed. The protocol contains two phases: cluster-route establishment and data transmission. The cluster-route establishment phase, which includes cluster head (CH) selection, next-hop CH selection and the joining of cluster members (CMs) to a proper CH, is performed using multiple criteria such as residual energy, distance to the base station (BS), node adjacency degree, link statistics and the energy harvesting rate of nodes. Moreover, a modified sensing radius adjustment scheme is introduced to further improve the energy consumption balance of each node starting from the sensing stage. In the data transmission phase, an adaptive transmission power adjustment scheme is proposed to use harvested energy more effectively by considering the current residual energy and the energy predicted to be harvested from the ambient environment. This phase further improves the energy consumption balance by using a mobile sink. Extensive simulations are conducted to demonstrate that the proposed protocol achieves better network performance than other existing protocols for EH-WSNs.


I. INTRODUCTION
Wireless sensor networks (WSNs) have many potential applications, including environmental monitoring, tracking, precision construction, military surveillance and smart homes [1]. By deploying a large number of low-power sensors, a WSN collects the desired information in a distributed and self-organized manner [2]. In WSN nodes, the battery is the primary source of energy, and the lifetime of a WSN is directly tied to its energy consumption [3]. It is well known that the lifetime of a traditional WSN is restricted by the limited battery capacity of its sensor nodes [4].
So, replenishing the nodes' batteries through energy harvesting is becoming popular for improving the lifetime of WSNs [4]. WSNs powered by energy harvesting devices are called energy harvesting wireless sensor networks (EH-WSNs). The sensor nodes in EH-WSNs are equipped with energy harvesting modules that can extract energy from external sources, such as solar, thermal, vibration, and RF energy [5].
The associate editor coordinating the review of this manuscript and approving it for publication was Marco Martalo.
In EH-WSNs, as long as energy consumption is less than the harvested energy, nodes do not break down for lack of energy [6]. However, although EH-WSNs are operated by energy harvesting sensor nodes and in theory do not suffer from a limited network lifetime, they still face several new design challenges because the amount of energy that can be harvested from the ambient environment is unstable and uncertain.
Compared to battery-powered traditional WSNs, EH-WSNs have unique characteristics such as the uncontrollable availability and dynamics of ambient energy, different energy harvesting efficiencies among nodes, and the limited capacity of rechargeable batteries; therefore, traditional WSN routing protocols cannot be applied to EH-WSNs as they are. Taking these characteristics into account, cluster-based routing protocols improved over those for traditional WSNs have recently been designed for EH-WSNs [5], [7]-[9]. Most of these protocols contain a multiple criteria-based clustering algorithm, which specifies the set of CHs for data aggregation and relay transmission. To do so, the algorithm uses local information of each node, including residual energy, distance to the BS, adjacency degree, link statistics and energy harvesting rate, as criteria for CH selection. The aim of cluster-based routing optimization in EH-WSNs is to maximize performance according to the amount of harvested energy while keeping network stability, reliability and connectivity [10]. However, the cluster-based routing protocols used in EH-WSNs perform clustering and routing in two separate phases, which incurs considerable message overhead. The transmission power adjustment schemes introduced for efficient utilization of harvested energy are not adequate. Moreover, sensing radius adjustment for load balance and complete network coverage is not considered.
VOLUME 10, 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
This work aims to maximize the possibility of a perpetual network lifetime by using the harvested energy of all nodes in the most balanced and efficient way at every stage of network operation after the nodes of an EH-WSN are deployed.
In this paper, we consider sensor nodes equipped with solar panels and propose an Improved Equal Hierarchical Cluster-based Routing (IEHCR) protocol that contains two phases: cluster-route establishment and data transmission. The proposed protocol reduces the overall message overhead by performing CH selection and routing-tree construction simultaneously in a way that ensures load balance. In the data transmission phase, the relay transmission distance is adjusted in step with the amount of harvested energy, and a mobile sink is used to further improve network performance. The protocol also prolongs the network lifetime by introducing an adaptive sensing radius adjustment scheme for load balance and reduced energy consumption. The main contributions of this study are as follows:
(1) In the cluster-route establishment phase, a novel scheme is proposed in which multiple criteria, namely residual energy, distance to the BS, node adjacency degree, signal-to-noise ratio (SNR) of a link and predicted harvest energy, are used for CH selection, next-hop CH selection for relaying data and the joining of CMs to a proper CH. In this phase, a modified version of the existing sensing radius adjustment scheme is introduced so that balanced consumption of harvested energy is realized from the sensing stage onward.
(2) In data transmission phase, an adaptive transmission power adjustment scheme, which adjusts relay transmission distance according to predicted harvest energy and current residual energy, is proposed.
(3) Extensive experiments were conducted, and the results show that the proposed protocol outperforms other existing ones.
The remainder of this paper is organized as follows: Section II gives a brief introduction to prior work. The related models, including the network model, energy consumption model and harvest energy model, are described in Section III. Section IV describes the proposed protocol in detail. Extensive experimental results for the proposed protocol and their analysis are presented in Section V. Finally, this work is concluded in Section VI.

II. RELATED WORK
Many researchers have demonstrated that cluster-based routing protocols are more energy efficient than non-cluster-based routing protocols such as flat or location-based routing protocols, so that they increase scalability and network lifetime [11]-[17]. LEACH [18], [19] is the first cluster-based distributed approach in WSNs. Afterwards, many cluster-based routing protocols have been proposed on the basis of LEACH. However, since they are designed for WSNs, using them directly in EH-WSNs would not deliver the expected network performance, because they do not utilize harvested energy efficiently. Meanwhile, network topology must be taken into account in energy optimization for clustered WSNs because of the energy hole problem [20]. The main objective of this paper is the efficient and balanced utilization of harvested energy in EH-WSNs, and therefore this problem is beyond the scope of this paper. However, since the proposed protocol aims at utilizing energy in an efficient and balanced way throughout the whole routing process, from sensing to data collection, it can be regarded as a contribution to solving the energy hole problem.
Recently, many cluster-based routing protocols have been developed for EH-WSNs [21]-[23]. In [10], the authors point out that typical WSN clustering and routing methods are inefficient for EH-WSNs because of their unique features and propose a novel hybrid methodology involving static and dynamic clustering operations. It uses a distributed-centralized approach and multi-hop routing and considers criteria such as the energy level, the amount of harvested energy and the number of neighbors in cluster formation.
DEHAR [24] by Jakobsen identifies the most energy-efficient route with the aid of real-time node-to-node energy status exchanges. Although this scheme is conceptually simple, it suffers from heavy message overhead due to extensive message exchanges and is inefficient in coping with the mismatch between the energy harvesting process and the real demand for energy. In this scheme, CH nodes transmitting packets to the remote sink consume more energy than CM nodes transmitting packets to adjacent CH nodes, and therefore energy shortages often occur at CH nodes.
Dong et al. [2] proposed a cluster-based routing protocol named DEARER to resolve an issue of DEHAR, i.e., the frequent energy shortage at CH nodes. This protocol selects nodes with high energy arrival rates or nodes closer to the sink to act as CH nodes. After the set-up phase, in the steady phase for data transmission, CH nodes use single-hop forwarding to send the collected data to the sink.
In [7], Sah and Amgoth propose an uneven hierarchical clustering routing algorithm. This algorithm, called NEHCP, is divided into three phases: the initial phase, the set-up phase and the data transmission phase. In the set-up phase, a CH node is elected based on two criteria: maximum residual energy and the energy harvesting rate for the current round. In the data transmission phase, CH nodes use CSMA and a fixed spreading code to send the collected data to the sink. E²-MACH in [8] is an energy efficient clustering scheme that selects CH nodes based on multiple attributes, such as current residual energy, energy harvesting rate, number of neighbors, and link quality. This scheme does not include a routing-tree construction phase for data transmission.
Bahbahani and Alsusa et al. [25] propose a novel clustering protocol (ECO-LEACH) that incorporates duty cycling and cooperative transmission for EH-WSNs. This protocol extends LEACH by replacing its CH selection process while introducing duty cycling and cooperative transmission. Since ECO-LEACH uses single-hop clustering, it has to be extended to a hierarchical, multi-hop clustering scheme in order to support large network deployments.
Zhang et al. in [26] proposed a clustering routing algorithm (CREW) that effectively utilizes the unstable and uneven harvested energy among sensor nodes. In cluster building phase, the CREW uses waiting time for CH selection competition to divide the network into uneven clusters and select CHs. In data transmission phase, it adopts an adaptive inter-cluster communication mechanism for controlling the transmission power to enhance the harvested energy utilization.
Ren and Yao et al. [5] also proposed an improved uneven cluster-based routing protocol (CPMHE) that enhances harvested energy utilization for solar-powered WSNs. It uses a novel waiting time that considers multiple criteria, such as residual energy, distance to the sink, node density, predicted harvest energy, and SNR, to select CH nodes. In the data transmission phase, it also introduces a dynamic transmission power adjustment scheme to enhance harvested energy utilization. However, in the data transmission phase both CREW and CPMHE use a traditional WSN multi-hop routing protocol, so harvested energy and SNR are not used in the routing process for next-hop CH selection. Furthermore, a transmission power adjustment that sends data directly to the sink only when the rechargeable battery is nearly fully charged leaves room for more efficient utilization of harvested energy.
The authors of [4] proposed the EHTEAR scheme, which considers the traffic heterogeneity factor, one of multiple heterogeneities, as well as the residual energy and energy harvesting rate of nodes for the multi-heterogeneous energy harvesting WSN scenario. Since the traffic heterogeneity factor is considered, nodes with higher traffic loads can be excluded from CH selection. This scheme also uses traditional WSN routing protocols such as LEACH unchanged for data transmission.
In [27], the authors proposed a method different from the competition-based CH selection algorithms mentioned above. This algorithm, called EECHS, designates one node in each cluster as the scheduling node (SN), which monitors and stores real-time residual energy information for all CMs and the CH in the same cluster. According to the monitored result, the SN selects a CM as the new CH in each round, which reduces the energy consumed by CH selection. Moreover, EECHS adjusts the transmission radius of a node only if its battery is fully charged. However, the nodes in a cluster, including the CH and the CMs, incur the overhead of sending many messages to the SN for residual energy confirmation. Some protocols studied for EH-WSNs are summarized in Table 1. Each protocol has its own advantages and disadvantages.
Most existing routing protocols designed for EH-WSNs are competition-based and use multiple criteria for CH selection.
To provide balanced energy utilization, factors such as residual energy, distance to the BS, adjacency degree and energy harvesting rate are considered, and some protocols even reflect link statistics, because choosing the better link reduces link loss and thereby ensures the reliability of data transmission. In particular, a transmission power adjustment scheme is introduced in the data transmission phase to use the harvested energy more efficiently. However, multiple criteria that reflect the properties of each node are not used across all operation stages, i.e., CH selection, next-hop CH selection and the joining of CMs to a CH. This results in unbalanced utilization of the harvested energy. Also, transmission power adjustment for lengthening the transmission distance is applied only when the rechargeable battery is nearly fully charged, which leaves room to increase the utilization of the harvested energy further. In addition, the energy consumed for sensing is ignored in designing routing protocols on the grounds that it is much less than the energy consumed for transmission and reception. To solve the above problems, this paper proposes a novel improved equal hierarchical cluster-based routing protocol that enhances balanced utilization of the harvested energy while ensuring the reliability of communication and network connectivity. To the best of our knowledge, IEHCR is the first work that tries to enhance the balanced utilization of harvested energy across the whole process, from sensing to data collection, of hierarchical cluster-based routing protocols for EH-WSNs.

III. SYSTEM MODEL
Symbols used in this paper are shown in Table 2.
A. NETWORK MODEL
The EH-WSN consists of sensor nodes deployed in a 2-D plane and a mobile sink. Each node has a capacity-limited rechargeable battery powered by a solar panel. The assumptions used in developing the proposed protocol are as follows:
(1) The network contains N stationary sensor nodes deployed in an L×L square area and a mobile sink.
(2) Each sensor node has a rechargeable battery of limited capacity $E_i^{cap}$ and initial energy $E_i^{ini}$.
(3) Each node has a unique ID and does not use any localization scheme; thus the nodes are location-unaware.
(4) Moving along a predefined track, the mobile sink collects data periodically. Here $d_{MS-start}$, the distance from the sojourn points to the begin point of the monitored area, stays constant.
(5) Each node is capable of adjusting its transmission power based on distance, and the residual energy threshold of a node is divided into four values: the minimum threshold $E_{min-thr}$, lower threshold $E_{lower-thr}$, medium threshold $E_{med-thr}$ and maximum threshold $E_{max-thr}$.

B. ENERGY CONSUMPTION MODEL
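We apply the "first-order radio model" [20] among several energy consumption models in this paper. The energy consumed by a node to send l bits of data over distance d is as follows:
$$E_{Tx}(l, d) = \begin{cases} l E_{elec} + l \varepsilon_{fs} d^{2}, & d < d_0 \\ l E_{elec} + l \varepsilon_{mpf} d^{4}, & d \ge d_0 \end{cases} \tag{1}$$
where $E_{elec}$ denotes the energy consumed for sending 1 bit of data, $\varepsilon_{fs}$ and $\varepsilon_{mpf}$ are propagation loss coefficients and d is the transmission distance. In formula (1), the power exponent of d is determined by the transmission distance and the predefined threshold $d_0 = \sqrt{\varepsilon_{fs}/\varepsilon_{mpf}} = 87.7$ m. The energy consumed by a node to receive l bits of data is
$$E_{Rx}(l) = l E_{elec} \tag{2}$$
The energy consumed for sensing during time t depends on the sensing radius r:
$$E_{sen}(r, t) = k r^{2} t \tag{3}$$
where k is a constant coefficient and $k r^{2}$ denotes the energy consumption rate [1], [28], [29]. In this study, relay nodes do not aggregate incoming packets and only CH nodes consume energy $E_{DA}$ for aggregating data. Therefore, the total energy consumption of a CH node that also senses data is the sum of its sensing, receiving, aggregation and transmission energy.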
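The radio and sensing energy terms of this subsection can be sketched in a few lines of Python. This is only an illustrative sketch: the numerical constants are assumptions (the values commonly paired with the first-order radio model, which happen to reproduce the 87.7 m threshold when the crossover distance is taken as the square root of the coefficient ratio), and the function names are hypothetical.

```python
import math

# Assumed constants, not taken from this section.
E_ELEC = 50e-9        # J/bit, electronics energy per bit (assumption)
EPS_FS = 10e-12       # J/bit/m^2, free-space coefficient (assumption)
EPS_MPF = 0.0013e-12  # J/bit/m^4, multipath coefficient (assumption)
D0 = math.sqrt(EPS_FS / EPS_MPF)  # crossover distance in metres


def e_tx(l_bits: int, d: float) -> float:
    """Energy in joules to transmit l_bits over distance d (metres)."""
    if d < D0:
        # Short links: free-space loss, exponent 2.
        return l_bits * (E_ELEC + EPS_FS * d ** 2)
    # Long links: multipath loss, exponent 4.
    return l_bits * (E_ELEC + EPS_MPF * d ** 4)


def e_rx(l_bits: int) -> float:
    """Energy in joules to receive l_bits."""
    return l_bits * E_ELEC


def e_sense(k: float, r: float, t: float) -> float:
    """Sensing energy for radius r over duration t, with rate k * r^2."""
    return k * r ** 2 * t
```

With these assumed constants, `D0` evaluates to about 87.7 m, and the two branches of `e_tx` meet at that distance, so the cost is continuous across the threshold.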

C. ENERGY HARVESTING MODEL
Several harvest energy models are available for EH-WSNs, including the LSTM neuron [6], EWMA [7], [30], accurate solar energy assignment (ASEA) [30] and profile energy prediction (Pro-Energy) [31]. In this study, ASEA is adopted to predict the harvested energy because of its adaptability to dynamic change, low computational complexity and small sample demand. In the ASEA model, the expected value of the energy harvested by sensor node $s_i$ during harvesting cycle (t + 1), $E_i^{har\_exp}(t + 1)$, is calculated as follows.
where $E_i^{har\_real}(t)$ denotes the real value of the energy harvested by node $s_i$ in time slot t and α is a weight parameter between 0 and 1. In this paper, $E_i^{har\_exp}(t + 1)$ is revised into $E_i^{har\_pre}(t + 1)$ to account, in a simple way, for the influence of weather conditions on the unstable and uncertain energy acquisition. This is expressed in formula (6).
where $\phi_i(t)$ is the revision factor of node $s_i$ in harvesting time t, which can be calculated using the following formula.
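As a rough illustration of how such a prediction could be computed, the sketch below assumes an exponentially weighted update for the expected value and a multiplicative revision by the factor ϕ. Both forms are assumptions made for illustration only; the exact expressions of formulas (5) to (7) are not reproduced here, and the function names are hypothetical.

```python
def expected_harvest(prev_expected: float, real_last: float, alpha: float) -> float:
    """Assumed EWMA-style update: blend the previous estimate with the last
    measured harvest, weighted by alpha in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    return alpha * prev_expected + (1.0 - alpha) * real_last


def predicted_harvest(expected_next: float, phi: float) -> float:
    """Assumed multiplicative revision: scale the expected value by the
    weather-related revision factor phi, supplied by the caller."""
    return phi * expected_next


# Example: previous estimate 10 J, last measured 20 J, alpha = 0.5,
# then a revision factor of 0.8 for a cloudier slot.
e_exp = expected_harvest(10.0, 20.0, 0.5)
e_pre = predicted_harvest(e_exp, 0.8)
```

The revision factor is passed in rather than computed, since its formula is not available in this section.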

IV. PROPOSED ALGORITHM
The working process of the proposed algorithm, called Improved Equal Hierarchical Cluster-based Routing (IEHCR), consists of two separate phases, the cluster-route establishment phase and the data transmission phase, as shown in Fig. 1. The main task of the cluster-route establishment phase is to select CH nodes by using local information of sensor nodes and related parameters and, at the same time, to construct the routing-tree for collecting data. As a result, the whole monitored area is divided into hierarchical balanced clusters. In this phase, CH selection is performed hierarchically, level by level rather than all at once, and next-hop CH selection for constructing the routing-tree accompanies it. Finally, CMs join a proper CH to complete the clusters. Meanwhile, routing information is accumulated so that all nodes come to know it. Moreover, a modified sensing radius adjustment scheme is introduced to further improve the energy consumption balance of each node starting from the sensing stage.
In the data transmission phase, a mobile sink with a predefined track is adopted to improve the consumption balance of harvested energy. Data sensed over the entire network area is transmitted to the mobile sink through the constructed routing-tree. In this phase, an adaptive transmission power adjustment scheme based on the charging degree of the rechargeable battery and the predicted harvest energy of each node is applied to use harvested energy more effectively.

A. CLUSTER-ROUTE ESTABLISHMENT PHASE
In this phase, the main principle of hierarchical clustering proposed in [32] is adopted to form balanced and hierarchical clusters.

1) CH SELECTION AND ROUTING-TREE CONSTRUCTION a: CH SELECTION
For simplicity of description, the distance from the mobile sink to the begin point of the monitored area is neglected; that is, the mobile sink is assumed to be near the monitored area.
At the beginning, the mobile sink (MS) at a sojourn point broadcasts MS_start_Msg() within competition range R and initiates clustering. The first-level nodes that receive MS_start_Msg() learn their own distance to the mobile sink and enter the CH competition. Each first-level node broadcasts Hello_Msg(i, $d_{i-MS}$, $E_i^{res}$, $E_i^{har\_pre}(t + 1)$) within radio distance R/2 to exchange local information for CH selection, where i is the ID of node $s_i$, $d_{i-MS}$ is the distance between node $s_i$ and the MS, $E_i^{res}$ is the residual energy of node $s_i$, and $E_i^{har\_pre}(t + 1)$ is the predicted energy that node $s_i$ can harvest during harvesting cycle (t + 1).
After getting local information of neighboring nodes, each node calculates its own election weight for CH selection by the following formula.
where $\delta_1$, $\delta_2$, $\delta_3$ are constant coefficients between 0 and 1 with $\delta_1 + \delta_2 + \delta_3 = 1$. There are several ways to reconcile the different dimensions of the factors, such as the min-max normalization in [32] and the method in [33], which normalizes by the maximum value. Note that only in (8) are min-max normalization and normalization by the maximum value used together. Formula (8) implies that a node with more residual energy, a shorter distance to the MS and higher predicted harvest energy obtains a larger election weight and is therefore a better candidate for CH. Once a CH is selected at each level of the entire network area, each CH updates its distance to the MS, $d_{CH-MS}$. Note that for first-level CH nodes $d_{CH-MS}$ is the Euclidean distance, that is, the direct distance to the MS, whereas for lower-level CH nodes it is the indirect distance, i.e., the sum of the distances between CH nodes along the path to the MS. By exchanging local information with its neighbors, each node is able to obtain its adjacency degree $n_i^{adj}$ and signal-to-noise ratio $SNR_i$. The adjacency degree of a node is the number of its neighbors, counted from the received Hello_Msg() messages. $SNR_i$ can be calculated as follows: where $P_i^{signal}$ and $P_i^{noise}$ denote the effective signal power and effective noise power, respectively.
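The election weight described above can be illustrated with the following sketch. The weighted-sum form, the choice of which factor uses which normalization, the default coefficients and the dictionary field names are all assumptions for illustration; only the qualitative behaviour (more residual energy, shorter distance to the MS and more predicted harvest energy give a larger weight) is taken from the text.

```python
def min_max(x: float, lo: float, hi: float) -> float:
    """Min-max normalization to [0, 1]; returns 0 when all values are equal."""
    return 0.0 if hi == lo else (x - lo) / (hi - lo)


def by_max(x: float, hi: float) -> float:
    """Normalization by the maximum value; returns 0 when the maximum is 0."""
    return 0.0 if hi == 0 else x / hi


def election_weight(node: dict, neighbours: list, deltas=(0.4, 0.3, 0.3)) -> float:
    """Assumed weighted sum over residual energy, closeness to the MS and
    predicted harvest energy, normalized within the local neighbourhood."""
    d1, d2, d3 = deltas
    group = [node] + list(neighbours)
    e_max = max(n["e_res"] for n in group)
    h_max = max(n["e_har_pre"] for n in group)
    dists = [n["d_ms"] for n in group]
    # A shorter distance to the MS should raise the weight, so invert it.
    closeness = 1.0 - min_max(node["d_ms"], min(dists), max(dists))
    return (d1 * by_max(node["e_res"], e_max)
            + d2 * closeness
            + d3 * by_max(node["e_har_pre"], h_max))


a = {"e_res": 2.0, "d_ms": 50.0, "e_har_pre": 1.0}
b = {"e_res": 1.0, "d_ms": 100.0, "e_har_pre": 0.5}
w_a = election_weight(a, [b])
w_b = election_weight(b, [a])
```

In this toy neighbourhood node `a` has more energy, is closer to the MS and expects a larger harvest, so it obtains the larger weight and would win the competition.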
A node selected as CH then broadcasts CH_Msg($I_{CH}$, $I_{relay-CM}$) within competition range R and announces the lower-level CH competition, where $I_{CH}$ is the information of the current and upper-level CH nodes and $I_{relay-CM}$ is the information of the current and upper-level relay CM nodes.
A relay CM node is the node with the highest residual energy among the CM nodes in the cluster of the corresponding CH node. The two parameters included in CH_Msg() contain the ID of a node, its distance to the MS, residual energy, adjacency degree, the predicted energy amount in the next harvesting cycle and the signal-to-noise ratio. Through this stepwise broadcasting of CH_Msg(), each CH node obtains the information series of the upper-level CHs, and each CM node learns the information series of the relay CMs through which it can transmit its own data. In this way, the related information of all upper-level CH nodes and relay CM nodes is forwarded step by step to the CH nodes and relay CM nodes elected at the lowest level, for use in adaptive transmission power adjustment in the data transmission phase.
Each node $s_i$ that participates in the CH competition has the following features. Namely, it has $CH_i$ as its upper-level CH, its distance to $CH_i$ is less than the competition radius and its residual energy is more than the predefined energy threshold. Also, it is not a member node of the upper-level CH node, and its distance to the MS is greater than that of the upper-level CH.
Each node that participates in the lower-level competition may receive CH_Msg() messages from several upper-level CH nodes. Considering this, the distance to the MS is defined as follows: Note that this distance is an indirect one, that is, not the Euclidean (direct) distance from node $s_i$ to the MS. Next, each node $s_i$ broadcasts Hello_Msg(i, $d_{i-MS}$, $E_i^{res}$, $E_i^{har\_pre}(t + 1)$) within radio range R/2 to exchange local information with its neighbors for CH selection. The lower-level CH is then selected by using formula (8) in the same way as at the first level. This hierarchical CH selection is repeated until the entire monitored area is covered.
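One plausible reading of the indirect distance is the shortest path length through any upper-level CH whose CH_Msg() was heard. The sketch below assumes this minimum-over-candidates form, which is not confirmed by the text; the function and argument names are hypothetical.

```python
def indirect_distance_to_ms(d_to_upper_ch: dict, upper_ch_d_ms: dict) -> float:
    """Assumed definition: the smallest value of (distance from this node to an
    upper-level CH) + (that CH's own accumulated distance to the MS)."""
    return min(d_to_upper_ch[ch] + upper_ch_d_ms[ch] for ch in d_to_upper_ch)


# A node hears CH 1 (30 m away, itself 50 m from the MS)
# and CH 2 (20 m away, itself 70 m from the MS).
d_i_ms = indirect_distance_to_ms({1: 30.0, 2: 20.0}, {1: 50.0, 2: 70.0})
```

Here the path through CH 1 is shorter overall even though CH 2 is the nearer neighbor, which is why the accumulated distance rather than the one-hop distance is compared.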

b: ROUTING TREE CONSTRUCTION
In the cluster-route establishment phase, the routing-tree for data transmission is constructed simultaneously with the CH selection described above. To construct the routing-tree, each current-level CH decides on its next hop according to the CH_Msg() messages received from upper-level CHs. Among those CHs, the upper-level CH with the lowest forwarding cost, calculated by the following formula, is selected as the next hop.
where $\gamma_1$, $\gamma_2$, $\gamma_3$, $\gamma_4$, $\gamma_5$ are constant coefficients between 0 and 1 with $\gamma_1 + \gamma_2 + \gamma_3 + \gamma_4 + \gamma_5 = 1$. Under formula (11), a CH node with more residual energy, higher predicted harvest energy, a shorter distance to the MS, a lower adjacency degree and a higher signal-to-noise ratio has a better chance of being selected as the next-hop CH. A CH with a lower adjacency degree is given higher priority among the candidate next-hop CHs in order to balance the load of CHs, since such a CH consumes less energy in communication between clusters.
Consequently, a CH that is located in a sparser area and has more residual energy, a shorter distance to the MS through the constructed routing-tree, more predicted harvest energy and better transmission quality is selected as the next hop. Through this process, the routing-tree is constructed step by step from upper-level CHs to lower-level CHs and finally encompasses all elected CHs. Each CH then has its own routing table from itself to the MS.
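The next-hop choice can be sketched as a minimum-cost selection over the candidate upper-level CHs. The linear cost below is an assumed form that only reproduces the stated preferences; it is not the expression of formula (11). All inputs are assumed to be pre-normalized to [0, 1], and the field names and equal coefficients are hypothetical.

```python
def forwarding_cost(ch: dict, gammas=(0.2, 0.2, 0.2, 0.2, 0.2)) -> float:
    """Assumed cost: lower for more residual energy, more predicted harvest,
    shorter distance to the MS, lower adjacency degree and higher SNR."""
    g1, g2, g3, g4, g5 = gammas
    return (g1 * (1.0 - ch["e_res"])
            + g2 * (1.0 - ch["e_har_pre"])
            + g3 * ch["d_ms"]
            + g4 * ch["n_adj"]
            + g5 * (1.0 - ch["snr"]))


def select_next_hop(candidates: list) -> int:
    """Return the ID of the upper-level CH with the lowest forwarding cost."""
    return min(candidates, key=forwarding_cost)["id"]


upper_chs = [
    {"id": 1, "e_res": 0.9, "e_har_pre": 0.8, "d_ms": 0.2, "n_adj": 0.3, "snr": 0.9},
    {"id": 2, "e_res": 0.4, "e_har_pre": 0.5, "d_ms": 0.3, "n_adj": 0.6, "snr": 0.7},
]
next_hop = select_next_hop(upper_chs)
```

Because each CH only compares the candidates it actually heard in CH_Msg(), the selection stays local and needs no extra message exchange beyond the clustering itself.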

2) JOINING CMS TO CH
After CH selection and routing-tree construction over those CHs have been carried out simultaneously, CMs join the proper CHs to form clusters, which completes the cluster-route establishment phase.
First, nodes that do not receive any messages from CHs elect themselves as CHs. Meanwhile, each CM selects the CH with the smallest join cost, calculated by the following formula, as its own CH and joins it.
The closer CHs are to the MS, the more data they forward. Therefore, they bear a heavier relay load between clusters. To overcome this, the closer a CH is to the MS, the fewer CMs should join it. That is why a CH further from the MS is given priority over other CHs. This gives CMs more opportunities to join CHs further from the MS. Eventually, the load can be balanced between CHs in both inter-cluster and intra-cluster communication.
Once each CM has selected its own CH in this way, it transmits Join_Msg(i, $i_{CH}$, $d_{i-CH_i}$, $E_i$, $SR_i$) to its CH and finishes the cluster formation, where $i_{CH}$ is the ID of its CH and $SR_i$ denotes the sampling rate. Since the amount of energy harvested by each CM is unstable and unequal, each CM should use a different collection rate for sensing data according to its residual energy, adopting the method proposed in [5] and [34], to keep operating continuously. This is why $SR_i$ should be included in Join_Msg().

3) SENSING RADIUS ADJUSTMENT
Once the clusters are completely formed and cover the entire network area, the sensing radius of each node is adjusted using a modified version of the scheme proposed in [1].

a: OVERVIEW OF SRA ALGORITHM
The sensing radius adjustment (SRA) algorithm is proposed to balance the lifetime of all sensor nodes and minimize the intersection area of any two neighbors. This algorithm consists of three stages: weighted Voronoi diagram construction (WVD-C), overlapping reduction (OR) and improvement on network lifetime (INL). In the WVD-C stage, each sensor node locally determines its own responsible sensing region according to its residual energy. This stage aims at achieving the lifetime balance of all nodes. Then, in the OR stage, each node adjusts its sensing radius to minimize the coverage redundancy and thereby minimizes the intersection area of any two neighbors. Finally, the INL stage adjusts the sensing radius of some sensor nodes to prolong the lifetime of the sensor with minimal residual energy.

b: MODIFIED SRA ALGORITHM
In EH-WSN, SRA is modified as follows.
If the residual energy of a sensor node is greater than the predefined minimal threshold energy, the node is responsible for the sensing area determined by the sensing radius calculated by the original SRA algorithm. Here, the minimal threshold is defined as the energy that the node furthest from the MS consumes when it sends data directly to the MS once. For example, if the size of the monitored area is 300 m × 300 m and the distance from the mobile sink to the begin point of the monitored area is 50 m, the minimal threshold energy $E_{min-thr}$ is equal to $E_{Tx}(80\ \text{bytes}, \sqrt{350^{2} + 150^{2}})$ for an 80-byte data packet. Sensor nodes with less residual energy than the minimal threshold energy are excluded from sensing. In other words, if the residual energy of node $s_i$ satisfies $E_i^{res} < E_{min-thr}$, the node is not responsible for covering any area. The necessary residual energy information of neighboring nodes is obtained when the nodes exchange local information with Hello_Msg() to form clusters hierarchically. If, after some time, the residual energy of a node exceeds $E_{min-thr}$, it becomes responsible for a sensing region again. The responsible area is again calculated by the original SRA mechanism. The operation of the modified sensing radius adjustment mechanism is shown in Fig. 2. In this figure, the two elements in parentheses denote the sensing radius and the lifetime of each sensor node.
As shown in Fig. 2(a), if the residual energy of node $s_a$ reaches the minimal threshold energy, i.e., its lifetime is 100, then in the original SRA it reduces its sensing radius and its neighbor ($s_e$) increases its sensing radius to prolong the lifetime of node $s_a$. In the modified SRA, however, node $s_a$ sets its sensing radius to 0 and is no longer responsible for coverage. The coverage hole created by node $s_a$ is covered by its neighbors, which expand their sensing radii according to their residual energy.
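The minimal threshold in the example above can be evaluated numerically with a short sketch. The radio constants are assumptions (commonly used values for the first-order radio model, not given in this section), the geometry is inferred from the numbers in the example (sink 50 m outside the middle of one side, furthest node at a far corner), and the function names are hypothetical.

```python
import math

# Assumed radio constants, not taken from this section.
E_ELEC = 50e-9        # J/bit (assumption)
EPS_FS = 10e-12       # J/bit/m^2 (assumption)
EPS_MPF = 0.0013e-12  # J/bit/m^4 (assumption)
D0 = math.sqrt(EPS_FS / EPS_MPF)


def e_tx(l_bits: int, d: float) -> float:
    """First-order radio model transmit energy in joules."""
    if d < D0:
        return l_bits * (E_ELEC + EPS_FS * d ** 2)
    return l_bits * (E_ELEC + EPS_MPF * d ** 4)


def min_threshold(side_m: float, d_ms_start_m: float, packet_bytes: int) -> float:
    """Energy for the furthest node to send one packet directly to the MS."""
    d_far = math.hypot(side_m + d_ms_start_m, side_m / 2.0)
    return e_tx(packet_bytes * 8, d_far)


def takes_part_in_sensing(e_res: float, e_min_thr: float) -> bool:
    """Modified SRA rule: a node at or below the minimal threshold sets its
    sensing radius to 0 and leaves coverage to its neighbours."""
    return e_res > e_min_thr


e_min_thr = min_threshold(300.0, 50.0, 80)
```

Under these assumed constants the furthest distance is about 380.8 m, well beyond the crossover distance, so the multipath term dominates and the threshold comes out at roughly 17.5 mJ per 80-byte packet.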
Pseudocode of the above cluster-route establishment phase is given in Algorithm 1.

Algorithm 1 Cluster-Route Establishment
MS broadcasts MS_start_Msg() in competition radius R
if node s_i has received MS_start_Msg() then
    Calculate d_{i-MS} according to RSSI_i;
    Broadcast Hello_Msg() in R/2 to exchange local information;
    Get its node adjacency degree n_i^{adj};
    Calculate its signal-to-noise ratio (SNR_i) by (9);

B. DATA TRANSMISSION PHASE
After the modified sensing radius adjustment mechanism has adjusted the sensing radius of each node so that the entire network area is fully covered and data has been sensed, this phase, data transmission, is performed.

1) DEFAULT DATA TRANSMISSION
In this phase, intra-cluster communication is performed, in which each CM in a cluster sends its sensed data to its CH. A data packet contains residual energy information as well as sensed data. To avoid data collisions, each CH broadcasts Schedule-Msg(. . . , i, t($SR_i$), . . .) to the member nodes of its cluster before data transmission, where i is the number of a CM node, reflecting the order in which the CH received Join_Msg() from the CMs, and t($SR_i$) is the time slot for data transmission allocated to each CM, calculated from the sampling rate in Join_Msg(). Once Schedule-Msg() is received, each CM in the cluster transmits its sensed data to its CH and goes into sleep mode to save energy. After intra-cluster communication, the CHs perform aggregation, such as removing data redundancy and compressing data. Then, inter-cluster communication is initiated via the constructed routing-tree. This transmission process is called data transmission via the default routing-tree, or default data transmission for short.
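The slot allocation carried by Schedule-Msg() can be pictured with the following sketch. The text only states that the slot is calculated from the sampling rate; the proportional mapping, the `base_slot` parameter and the tuple layout used here are hypothetical choices made for illustration.

```python
def build_schedule(join_order: list, base_slot: float = 0.01) -> list:
    """Build a collision-free slot list for one cluster.

    join_order: (cm_id, sampling_rate) pairs in the order in which the CH
    received Join_Msg(). Returns (sequence number, cm_id, start, duration).
    """
    schedule = []
    start = 0.0
    for seq, (cm_id, sampling_rate) in enumerate(join_order, start=1):
        # Hypothetical mapping: slot length grows linearly with sampling rate.
        duration = base_slot * sampling_rate
        schedule.append((seq, cm_id, start, duration))
        start += duration  # slots are back to back, so they never overlap
    return schedule


slots = build_schedule([(7, 2.0), (3, 1.0)], base_slot=0.5)
```

Each CM only needs its own entry to know when to wake up, transmit and return to sleep mode.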

2) DYNAMIC DATA TRANSMISSION BY ADAPTIVE TRANSMISSION POWER ADJUSTMENT
In the process for CH selection of cluster-route establishment phase, each CH and CM gets information of upper CHs and upper relay CMs to reach MS. Based on such consideration, we propose an adaptive transmission power adjustment scheme for efficient utilization of energy and to save network connectivity from early death of nodes with little residual energy. Dynamic data transmission by an adaptive transmission power adjustment is performed as follows: (1) CHs or CMs with residual energy of E min −thr < E res i ≤ E med−thr transmit data via the default routing-tree regardless of the predicted harvest energy during harvesting cycle (t +1).
(2) CHs or CMs with residual energy $E_{\text{med-thr}} < E_i^{res} \le E_{\text{max-thr}}$ determine the next hop by calculating a transmission distance based on the predicted harvest energy $E_i^{har\_pre}(t+1)$. When the energy $E_{Tx}(l, d_{ij})$ consumed by node $s_i$ to transmit $l$ bits of data to node $s_j$ does not exceed $E_i^{har\_pre}(t+1)/C$, the relay transmission distance $d_{ij}$ is calculated by formula (13), where C denotes the number of times data was transmitted in the last energy harvesting cycle. Hereafter, $E_{Tx}(l, d_{ij}) \le E_i^{har\_pre}(t+1)/C$ is referred to as the skip condition.
If a node satisfying $E_{\text{med-thr}} < E_i^{res} \le E_{\text{max-thr}}$ is a CH, the CH with the minimum forwarding cost calculated by formula (11) is elected as the next hop among the CHs located within the distance $d_{ij}$ determined by formula (13), so as to reduce delay. If it is a CM, the relay CM with the minimum forwarding cost calculated by formula (14) is elected as the next hop among the relay CMs located within the distance $d_{ij}$ of formula (13).
In formula (14), $\kappa_1$, $\kappa_2$, $\kappa_3$, $\kappa_4$ and $\kappa_5$ are constant coefficients between 0 and 1 with $\kappa_1 + \kappa_2 + \kappa_3 + \kappa_4 + \kappa_5 = 1$. Formula (14) leads the system to select, as the next hop, a relay CM with more residual energy, more predicted harvest energy during the next harvesting cycle, a shorter distance to the MS, a higher adjacency degree and a higher signal-to-noise ratio.
Here, unlike in next-hop CH selection, a CM with a higher adjacency degree is preferred, since such nodes consume less energy for sensing.
(3) CHs or CMs with residual energy $E_{\text{max-thr}} < E_i^{res}$ send data directly to the MS if they satisfy the skip condition.
(4) If the residual energy of a CM reaches the minimal threshold $E_{\text{min-thr}}$, or the residual energy of a relay CM or a CH reaches the lower threshold $E_{\text{lower-thr}}$ ($E_{\text{min-thr}} < E_{\text{lower-thr}} < E_{\text{med-thr}}$), the node performs the following operations.
• A CM transmits the request message SRA_Msg() to its neighbors, asking them to increase their sensing radius so that coverage holes are avoided.
• A CH adds to its data packets a flag indicating that its residual energy has reached the lower threshold for re-clustering, and sends them to the MS. If the number of such requests exceeds a value determined by experiments, the MS moves to the next sojourn point and initiates a new round.
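The relay CM election in step (2) can be illustrated with the sketch below. It is not the paper's formula (14), which is not reproduced here: the weighted-sum form of `forwarding_cost`, the assumption that every metric is already normalized to [0, 1], the equal default weights, and all names (`Relay`, `select_relay`) are hypothetical. The sketch only mirrors the stated behavior, namely that a lower cost should go to relays with more residual energy, more predicted harvest energy, a shorter distance to the MS, a higher adjacency degree and a higher SNR, and that candidates must lie within the distance $d_{ij}$.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Relay:
    node_id: str
    e_res: float          # residual energy, normalized to [0, 1]
    e_har_pre: float      # predicted harvest energy, normalized to [0, 1]
    d_ms: float           # distance to the MS, normalized to [0, 1]
    n_adj: float          # adjacency degree, normalized to [0, 1]
    snr: float            # signal-to-noise ratio, normalized to [0, 1]
    d_from_sender: float  # distance from the sending CM, in metres


def forwarding_cost(r: Relay, kappa: Tuple[float, ...]) -> float:
    """ASSUMED cost: lower is better; not the paper's formula (14)."""
    k1, k2, k3, k4, k5 = kappa
    return (k1 * (1 - r.e_res) + k2 * (1 - r.e_har_pre) + k3 * r.d_ms
            + k4 * (1 - r.n_adj) + k5 * (1 - r.snr))


def select_relay(candidates: Sequence[Relay], d_max: float,
                 kappa: Tuple[float, ...] = (0.2, 0.2, 0.2, 0.2, 0.2)) -> Optional[Relay]:
    """Pick the minimum-cost relay among candidates closer than d_max."""
    if abs(sum(kappa) - 1.0) > 1e-9:
        raise ValueError("the weights kappa must sum to 1")
    reachable = [r for r in candidates if r.d_from_sender < d_max]
    return min(reachable, key=lambda r: forwarding_cost(r, kappa), default=None)


# Hypothetical candidates: "c" has the best metrics but is out of range.
a = Relay("a", 0.9, 0.8, 0.2, 0.7, 0.9, d_from_sender=40.0)
b = Relay("b", 0.5, 0.4, 0.6, 0.3, 0.5, d_from_sender=30.0)
c = Relay("c", 1.0, 1.0, 0.0, 1.0, 1.0, d_from_sender=90.0)
best = select_relay([a, b, c], d_max=60.0)
```

When no candidate lies within `d_max`, the function returns `None`, which corresponds to falling back to the default routing tree.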
In steps (2) and (3), if the skip condition is not satisfied, a node sends data via the default routing tree. The data transmission phase including adaptive transmission power adjustment is depicted in Figure 3. Squares, circles and diamonds represent CHs, relay CMs and CMs with different residual energy, respectively. Red and black arrows correspond to the dynamic data transmission of CHs and of CMs (including relay CMs), respectively, when the skip condition is satisfied. Solid lines denote direct transmission to the MS, and dotted lines denote dynamic data transmission that skips multiple hops. Blue arrows correspond to default data transmission.
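The decision rule of steps (1) to (3), together with the fallback to the default routing tree, can be sketched as follows. The function names and the single `e_tx` argument are illustrative assumptions. In particular, `max_skip_distance` is not the paper's formula (13): it assumes a free-space first-order radio model, $E_{Tx}(l, d) = l\,E_{elec} + l\,\varepsilon_{fs}\,d^2$, with commonly used placeholder constants that are not taken from the paper's parameter table.

```python
import math
from typing import Tuple


def max_skip_distance(e_budget: float, l_bits: int,
                      e_elec: float = 50e-9, eps_fs: float = 10e-12) -> float:
    """Largest d with E_Tx(l, d) <= e_budget under the ASSUMED radio model."""
    surplus = e_budget - l_bits * e_elec
    if surplus <= 0:
        return 0.0  # the budget does not even cover the electronics energy
    return math.sqrt(surplus / (l_bits * eps_fs))


def transmission_mode(e_res: float, e_har_pre: float, c: int, e_tx: float,
                      thresholds: Tuple[float, float, float]) -> str:
    """Return 'default', 'skip' or 'direct' for a node.

    e_tx is the energy needed for the transmission under consideration:
    the longer hop in step (2), or the direct link to the MS in step (3).
    thresholds = (E_min-thr, E_med-thr, E_max-thr).
    """
    e_min, e_med, e_max = thresholds
    if c < 1:
        raise ValueError("C must be at least 1")
    if e_res <= e_min:
        raise ValueError("nodes at or below E_min-thr are handled by step (4)")
    skip_ok = e_tx <= e_har_pre / c  # the skip condition
    if e_res <= e_med:
        return "default"                          # step (1)
    if e_res <= e_max:
        return "skip" if skip_ok else "default"   # step (2)
    return "direct" if skip_ok else "default"     # step (3)


thr = (10.0, 40.0, 80.0)  # hypothetical thresholds in joules
mode = transmission_mode(60.0, e_har_pre=5.0, c=10, e_tx=0.4, thresholds=thr)
reach = max_skip_distance(e_budget=1e-3, l_bits=4000)
```

Here `mode` evaluates to `"skip"` because the node lies in the medium band and 0.4 J does not exceed the per-transmission budget of 0.5 J.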

3) MOVEMENT OF MS
In the data transmission phase, a mobile sink with limited energy is used. Its predefined track is shown in Figure 4.
The track has 4 sojourn points for data collection in total. At each point, the mobile sink broadcasts MS_start_Msg() within competition range R to initiate the proposed protocol for data collection.
A mobile sink with limited energy capacity $E_{MS}^{max}$ must return to its original position before it runs out of energy. The round time of the MS, including the sojourn periods at the 4 data collection points, is determined by the energy consumed at each point, the energy consumed for moving to the next point and the energy consumed for sending the initiation message for cluster formation at each point.
The energy consumed for collecting data at one point, $E_{MS}^{Rx}$, depends on the amount of collected data, which is determined by the number of CMs and CHs. The energy consumed for moving to the next point, $E_{MS}^{move}$, depends on the moving distance for a given moving speed of the MS.

$E_{MS}^{Tx}$, the energy consumed for sending an initiation message to form clusters at one point, is determined by the length of MS_start_Msg() and the competition range R. The round time with energy consumption rate µ is calculated by formula (15). The round time given by (15) should satisfy the following criterion.
Therefore, the holding time at one point is determined as follows: If the number of requests for re-clustering reaches the threshold before $T_{MS}^{stop}$ elapses, the MS moves to the next sojourn point. Otherwise, the MS moves when $T_{MS}^{stop}$ expires, regardless of the number of requests.
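The departure rule above can be expressed as a short sketch. The function name `departure_time` and its arguments are hypothetical, and it is an assumption that the MS leaves at the moment the threshold-th request arrives; the text only states that it moves once the threshold is reached before $T_{MS}^{stop}$.

```python
from typing import Sequence


def departure_time(request_times: Sequence[float], threshold: int,
                   t_stop: float) -> float:
    """Time, measured from arrival at a sojourn point, at which the MS leaves.

    request_times are the arrival times of re-clustering requests.
    ASSUMPTION: the MS leaves when the threshold-th request arrives,
    or at t_stop if that never happens before t_stop.
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    in_time = sorted(t for t in request_times if t < t_stop)
    if len(in_time) >= threshold:
        return in_time[threshold - 1]
    return t_stop


early = departure_time([5.0, 1.0, 9.0, 3.0], threshold=3, t_stop=10.0)
timeout = departure_time([5.0, 1.0], threshold=3, t_stop=10.0)
```

In the first call the third request arrives at time 5.0, so the MS leaves early; in the second call only two requests arrive, so the MS stays for the full holding time.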
After a mobile sink completes data collection in a round, it must be recharged before the next data collection, which requires several hours. For instance, a 3950 mAh battery takes over 1.5 h to charge [35]. Therefore, two or more MSs are used alternately, or several batteries are prepared, to ensure continuous movement of the MS.
The pseudocode of the data transmission phase is shown in Algorithm 2.
The overhead complexity of control messages in IEHCR is as follows. At each sojourn point, the MS broadcasts MS_start_Msg() and initiates protocol operation. Since there are N nodes in the network, the number of Hello_Msg() messages is N. When the number of clusters formed by IEHCR is x, the numbers of CH_Msg() and Join_Msg() messages are x and N − x, respectively. The number of Schedule-Msg() messages equals x, the number of CHs. Therefore, the total number of control messages is 1 + N + x + (N − x) + x = 2N + x + 1. Since x ≤ N, the overhead complexity of IEHCR is O(N).
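The message count can be checked with a few lines of code. The function name `control_messages` is illustrative; the per-message counts are those stated in the paragraph above. Since the number of clusters x cannot exceed N, the total is bounded by 3N + 1, which is consistent with the O(N) claim.

```python
def control_messages(n: int, x: int) -> int:
    """Total control messages per sojourn point for n nodes and x clusters."""
    if not 0 <= x <= n:
        raise ValueError("the number of clusters x must satisfy 0 <= x <= n")
    ms_start = 1      # MS_start_Msg() broadcast by the MS
    hello = n         # one Hello_Msg() per node
    ch_msg = x        # one CH_Msg() per cluster head
    join = n - x      # one Join_Msg() per cluster member
    schedule = x      # one Schedule-Msg() per cluster head
    return ms_start + hello + ch_msg + join + schedule


total = control_messages(100, 10)  # hypothetical network: 100 nodes, 10 clusters
```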

V. PERFORMANCE EVALUATION
A. SIMULATION SETUP
To evaluate the performance of IEHCR, we perform extensive experiments using MATLAB. We compare IEHCR with CPMHE [5], which is regarded as the best-performing of the recently developed cluster-based routing protocols for EH-WSNs in terms of metrics such as harvested energy utilization efficiency. As CREW [26] uses a waiting-time scheme for CH election similar to that of CPMHE, we also compare IEHCR with it. E²-MACH [8] is also chosen for comparison, as it adopts a clustering method similar to that of IEHCR. E²-MACH, as a pure clustering protocol, does not define a data transmission scheme. Thus, it is assumed that E²-MACH adopts a data transmission scheme similar to that of CPMHE or CREW. Simulation parameters are listed in Table 3.
Each sensor node is powered by a rechargeable battery with a capacity of 100 J. In the simulation, the solar data profiles in [36] are used as the solar power harvesting characteristics, and 20% of the sensor nodes are randomly deployed in shaded areas. The energy harvesting rate of these nodes is set to 30% of that of nodes in sunny areas. The following performance metrics are used to evaluate the performance of the compared protocols.
• Variance of residual energy: It is defined as the variance of the residual energy of all nodes and is used to evaluate the balance and fairness of energy consumption. The smaller it is, the better the balance and fairness of energy consumption. Dead nodes are excluded from the calculation.
• Ratio of packet loss: It is defined as the ratio of the number of packets that are sent by sensor nodes but not received by the MS to the number of packets sent by sensor nodes. It reflects the reliability of data transmission.
• Average delay of transmission: It is defined as the average time taken for sensed data packets to reach the MS. It reflects the timeliness of data transmission.
• Ratio of available energy utilization: It is defined as the ratio of the energy harvested by all nodes to the ambient energy harvestable by all nodes during each time quantum. It reflects how much of the ambient harvestable energy is actually gained.
• Harvested energy utilization efficiency: It is defined as the ratio of the number of packets received by the MS under one of the compared protocols to the number of packets received by the MS under LEACH. It evaluates how efficiently the harvested energy is used.
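The metrics above can be computed as in the following sketch. All function names are illustrative, and several choices are assumptions rather than details given in the paper: the population variance is used, a node counts as dead when its residual energy is not above `dead_threshold`, and packet loss is computed as the fraction of sent packets that never reach the MS. Average delay is omitted because it is a plain mean of per-packet delays.

```python
from statistics import pvariance
from typing import Sequence


def residual_energy_variance(energies: Sequence[float],
                             dead_threshold: float = 0.0) -> float:
    """Variance of residual energy over alive nodes (dead nodes excluded)."""
    alive = [e for e in energies if e > dead_threshold]
    return pvariance(alive) if alive else 0.0


def packet_loss_ratio(sent: int, received: int) -> float:
    """Fraction of sent packets that the MS did not receive (ASSUMED form)."""
    if sent <= 0:
        raise ValueError("at least one packet must have been sent")
    return (sent - received) / sent


def available_energy_utilization(harvested: Sequence[float],
                                 harvestable: Sequence[float]) -> float:
    """Energy harvested by all nodes over the energy they could have harvested."""
    return sum(harvested) / sum(harvestable)


def utilization_efficiency(received_protocol: int, received_leach: int) -> float:
    """Packets delivered by a protocol relative to the LEACH baseline."""
    return received_protocol / received_leach


# Hypothetical sample values, not simulation results from the paper.
var = residual_energy_variance([2.0, 4.0, 0.0])            # third node is dead
loss = packet_loss_ratio(sent=200, received=150)
util = available_energy_utilization([1.0, 3.0], [2.0, 6.0])
eff = utilization_efficiency(300, 200)
```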

B. EXPERIMENT RESULTS AND ANALYSIS
Experiment results for the variance of residual energy are shown in Fig. 5. The figure shows that the variance of residual energy of IEHCR is much lower than that of the other protocols. IEHCR aims to use the harvested energy of all nodes in a balanced and efficient way throughout the cluster-route establishment phase, which includes simultaneous CH selection and routing-tree construction, and throughout the data transmission phase, which includes default and dynamic data transmission. In contrast, CPMHE uses the method proposed in [37] to establish routes; that is, it does not consider the adjacency degree, link quality or predicted harvest energy when selecting the next-hop CH for data transmission to the MS. Likewise, as shown in Table 1, E²-MACH uses four factors for CH selection, and CREW uses only two factors, residual energy and harvested energy. In particular, neither of them uses the adjacency degree, link quality or predicted harvest energy for next-hop CH selection. Therefore, these protocols are considered less able to ensure the balance and fairness of energy consumption. The difference in the re-clustering stage also affects performance. In CPMHE, when the residual energy of a CH reaches the predefined threshold, re-clustering is performed within the already formed cluster using re-clustering request messages. In the proposed protocol, re-clustering requests are appended to data packets and aggregated at the MS before the residual energy of a CH reaches the threshold; if the number of requests exceeds the predefined threshold, the MS moves to the next sojourn point and re-clustering is performed over the whole network area rather than per cluster. Thus, the proposed protocol uses harvested energy in a more balanced and efficient way despite the overhead of control messages.
The comparison of the ratio of packet loss in Fig. 6 indicates that IEHCR provides a lower ratio of packet loss than the other protocols. In general, the ratio of packet loss is affected by two main factors: residual energy and transmission distance. CPMHE uses 5 factors (residual energy, distance to the MS, adjacency degree, link quality and predicted harvest energy), but only for CH selection. E²-MACH considers four factors, and CREW uses only two, residual energy and predicted harvest energy, for CH selection. These three protocols use only distance and residual energy for selecting the next hop. IEHCR uses only 3 factors (residual energy, distance to the MS and predicted harvest energy) for CH selection, but uses all 5 factors for next-hop CH selection and for joining CMs to a CH in the cluster-route establishment phase. This gives IEHCR better balanced and more efficient energy consumption. In CPMHE, CREW and IEHCR, nodes adjust their transmission power and send data directly to the sink when certain criteria are satisfied. E²-MACH, however, does not adopt a transmission power adjustment scheme, which means that long-distance transmission is not performed in E²-MACH. Meanwhile, in addition to the adjustment used in CPMHE, in which a node with more residual energy than the maximal threshold sends data directly to the MS, IEHCR introduces adaptive transmission power adjustment that lengthens the transmission distance according to residual energy even when it is below the maximal threshold. This could increase the ratio of packet loss. However, strict criteria on residual energy and the skip condition are applied, and all 5 factors are used for selecting the next-hop CH or relay CM, which results in high transmission reliability. The additional use of the SRA mechanism also affects this metric. In the other protocols, if little energy is harvested (for example, when cloudy weather continues), a node may consume all of its residual energy and stop working, which creates a coverage hole in the monitoring area. In IEHCR, neighboring nodes enlarge their sensing areas to cover such a hole, which reduces the ratio of packet loss.
We also compare average delay of packet transmission of these protocols and the results are depicted in Fig. 7.
It shows that IEHCR obtains better results than the other protocols. The number of hops and the ratio of packet loss are the main factors that affect this metric. As shown in Fig. 6, IEHCR has the best ratio of packet loss and CREW has the worst. The second factor, the number of hops, is determined by the transmission scheme. CPMHE, CREW and IEHCR allow nodes that satisfy certain criteria to send data directly to the sink, but the criteria differ between protocols. In CREW, a node can transmit data directly to the sink if the energy consumed for doing so is less than the harvested energy. In CPMHE, in addition to this condition, a node must have more residual energy than the maximal threshold to send data directly to the MS. Although CREW has more relaxed constraints on direct transmission, its higher packet loss causes more retransmissions, which increases its transmission delay. IEHCR allows even nodes with residual energy between the medium and maximal thresholds to skip hops according to the predicted harvest energy, provided that they satisfy the skip condition. In particular, CMs that satisfy the skip condition and the residual energy condition send data to relay CMs, which reduces the load on CHs. This scheme is possible because, in IEHCR, clusters are formed one by one in a hierarchical way and routing information is accumulated so that all nodes know it, whereas in CPMHE clustering is performed all at once. E²-MACH adopts only multi-hop routing, so it has the longest delay.
Experimental results of the ratio of available energy utilization are shown in Fig. 8.
IEHCR obtains the best result, CPMHE and CREW obtain similar results, and E²-MACH has the worst. E²-MACH adopts only multi-hop routing for data transmission, and when the batteries are fully charged, the harvestable energy is discarded. Thus, it has a lower ratio of available energy utilization than the other three protocols. CPMHE and CREW use a transmission power adjustment scheme: nodes that satisfy the skip condition and the residual energy condition send data directly to the MS, and can therefore harvest more energy from the ambient environment, since the energy they spend frees battery capacity. Beyond this, IEHCR allows a node to send data by dynamically adjusting its transmission distance to skip hops when its residual energy is between the medium and maximal thresholds and it satisfies the skip condition. This means that IEHCR can harvest more energy from the environment than the others.
The final results, shown in Fig. 9, indicate that IEHCR also provides the best harvested energy utilization efficiency. As mentioned above, IEHCR can harvest more energy than the other protocols during the same time quantum. IEHCR also sends data with a lower ratio of packet loss and lower delay, which shows that it can preserve more harvested energy. In particular, balancing energy consumption in sensing, using all 5 factors in each stage of the cluster-route establishment phase and introducing a mobile sink lead to more balanced and lower energy consumption. Between CPMHE and CREW, CPMHE shows a better ratio of packet loss and uses 5 factors in the clustering stage, whereas CREW uses only 2 factors. Therefore, the harvested energy utilization efficiency of CPMHE is better than that of CREW. E²-MACH does not adopt a transmission power adjustment scheme, so it has worse results than CPMHE or IEHCR. However, it has a better ratio of packet loss than CREW, as shown in Fig. 6, because it uses 4 factors for CH selection. As a result, E²-MACH and CREW have similar results.
In total, the above simulations show that IEHCR performs better than the other protocols.

VI. CONCLUSION
Enhancing the balanced utilization of harvested energy with a hierarchical clustering method is one of the most practical solutions for EH-WSNs. This paper proposed an energy-efficient hierarchical cluster-based routing protocol for energy-harvesting-aware WSNs, which provides balanced utilization of harvested energy in both the cluster-route establishment phase and the data transmission phase. Throughout the cluster-route establishment phase, in which clusters are formed together with the routing tree, factors related to balanced energy utilization, energy saving and signal transmission quality are used as criteria for CH selection, next-hop CH selection and joining CMs to a CH. In addition, a modified sensing radius adjustment scheme is introduced to further improve the balance of energy consumption. An adaptive transmission power adjustment scheme is designed to use harvested energy more effectively in the data transmission phase. The simulation results show that the proposed protocol outperforms the compared protocols in terms of balanced utilization of harvested energy.