Stable Matching Relay Selection (SMRS) for TWR D2D Network With RF/RE EH Capabilities

The green Internet of Things (IoT) has emerged as a promising paradigm to reduce the energy consumed by nodes in dense networks. To improve energy efficiency (EE), network devices are equipped with energy harvesting (EH) batteries that can further prolong the network lifetime. This study investigates a two-way relaying (TWR) device-to-device (D2D) model sharing the same resources with the underlying cellular network, where all devices can harvest renewable energy (RE) from the surrounding environment. Each relay can assist one D2D link and harvest part of the received signal using the power splitting (PS) protocol. The radio frequency (RF) harvested energy is modeled using a non-linear EH model to match the behavior of a practical energy harvester. The main objective is to maximize the data rate (DR) of D2D links while preserving the quality of service (QoS) constraints. Therefore, a joint optimization solution based on particle swarm optimization (PSO) for power allocation (PA) and one-to-one stable matching (SM) for best relay selection (RS) is used to solve the mixed integer non-linear programming (MINLP) problem. Simulation results illustrate the behavior of the proposed model under different parameters as well as its superiority over the most recent algorithm in terms of D2D link rate and EE.


I. INTRODUCTION
Fifth generation wireless cellular network (5G) as a key enabling technology for Internet of Things (IoT) promises a powerful combination of high speed, large bandwidth, low latency, ubiquitous coverage, increased power efficiency and more secure connectivity. Some promising technologies arise to fulfill 5G performance desires such as massive MIMO, Device-to-Device (D2D) communication, interference management, spectrum sharing, mm-wave communication, and cloud technologies [1]. IoT enables a wide variety of devices to communicate with each other sharing an enormous amount and variety of data generated by different applications to provide new services to citizens, companies, and public administrations [2]. Within IoT, there are two types of communications. The first is primary communication, which uses an access point or base station (BS) as a single hop, or multiple relays as a multi-hop. The second type is secondary communication, which does not require infrastructure, such as D2D communication [3].
The associate editor coordinating the review of this manuscript and approving it for publication was Francisco Rafael Marques Lima.
D2D communications as an underlay to the cellular network refer to direct transmission between nodes that bypasses the network infrastructure while sharing the same resources. To limit the interference experienced by the cellular user (CU) as a result of sharing the same channels with D2D links, evolved NodeBs (eNBs) put restrictions on the transmitting power of D2D devices [4]. Moreover, D2D devices can use fixed relay stations [5] or idle users as relays to assist data transmission, and thus increase the data transmission rate, reduce the transmission power, and mitigate the interference, especially under poor channel conditions [6].
Besides, the two-way relaying (TWR) strategy has the potential to enhance the spectrum efficiency (SE) and the system throughput since it requires two phases instead of four to exchange data between any two terminals [7]. In the first phase, the D2D pair of each link sends the desired data to the selected relay. In the second phase, each relay retransmits the signals back to the devices, which extract their data using self-interference cancellation [8].
One of the most important pillars in designing 5G wireless communication networks is energy efficiency (EE), as billions of devices are expected to be connected within the same architecture, causing more energy consumption [9]. In dense environments, small cell base stations such as micro-cells are utilized to improve the capacity, expand the coverage area, enhance the data rate, prolong the battery lifetime, and thus reduce the power consumption [9]. Low-cost wireless network devices are usually powered by energy-limited batteries, and replacing those batteries is either impractical or costly [10]. Therefore, to prolong the battery lifetime, communication devices are equipped with energy harvesting (EH) capabilities that enable them to harness energy from the surrounding environment. The harvested energy can come either from renewable energy (RE) sources such as wind, solar, vibration, or kinetic energy, or from radio frequency (RF) signals using simultaneous wireless information and power transfer (SWIPT) technology [11], [12]. Each relay has two energy harvesters, one for RF and one for RE, followed by a power management unit and a battery [11]. The sensitivity of the EH circuit is a critical parameter, as energy can only be harvested above a certain threshold that is required to activate the EH circuit [13]. Relays with EH capabilities can harvest energy using one of two relaying protocols: the time splitting (TS) protocol and the power splitting (PS) protocol. The TS protocol divides the time between the information decoding (ID) and EH modes. The PS protocol, on the other hand, uses the PS factor (ρ_RL) to divide the power of the received signal between ID (1 − ρ_RL) and EH (ρ_RL) [14].

A. RELATED WORK
In the context of stable matching (SM), a multi-tier heterogeneous network with solar EH relays is investigated in [15], where the authors propose a distributed resource allocation (RA) scheme using an SM algorithm that outperforms the centralized time-sharing strategy in dense scenarios. A joint power allocation (PA) and relay selection (RS) problem employing full-duplex relays in a mm-wave based 5G communication network is presented in [16]. The authors transform the complex problem into a one-to-one matching problem by applying a weighted bipartite graph. In [17], the authors introduce an SM model for RS that considers the influence of the social tie and the promotion status on the matching process. This model is consistent with socially aware networks as it balances the trade-off between the security of the physical layer and the system utility.
In the context of TWR, designing an efficient TWR D2D model requires addressing the RA, PA, and RS problems simultaneously. In [18], the authors proposed an uplink resource sharing TWR model with RF/RE EH capabilities using the particle swarm optimization (PSO) algorithm. In [19], a full-duplex TWR scheme using the amplify-and-forward (AF) technique along with SWIPT is presented to maximize the secrecy rate of the system using two EH protocols in the presence of one half-duplex eavesdropper. Further, the authors in [20] propose a two-phase network coding scheme for a TWR D2D network considering the intra-cell interference. They also derive approximate expressions for the system's outage probability, bit error probability, and average end-to-end throughput.

B. MOTIVATIONS AND CONTRIBUTIONS
This work is motivated by the fact that the modeling of TWR D2D communication with EH capabilities is still in its infancy and that, to the best of our knowledge, stable matching has not yet been used as an RS algorithm with this model. In this paper, we study a model where a set of TWR D2D links share the same resources with the traditional cellular devices. We consider the EH capabilities of all devices, which can harvest RE from the surrounding environment. Besides, we assume that several relays assist the D2D links and harvest RF energy with a PS factor while using the decode-and-forward (DF) protocol. Those two-way relays are motivated to collaborate in the transmission process by the energy they harvest.
In a nutshell, our objective is to maximize the data rate (DR) of the TWR D2D links while taking into consideration the quality of service (QoS) of all devices, the non-linear EH capabilities, the battery levels of the devices, and the interference constraints. Then, we compare our work with the rate and energy efficiency trade-off EH-based algorithm (REET) model proposed in [18], where two optimization problems were formulated to maximize the utility or the EE of the TWR D2D links depending on the IoT application. Unlike the work in [18], we use the SM algorithm in the RS sub-problem to enhance the system throughput while boosting the amount of energy harvested by the relays. Besides that, we consider a dense environment system model with overlapping D2D links instead of the sporadic D2D model studied in [18]. We also model the amount of RF harvested power using both linear and non-linear models. The main contributions of this work can be summarized as follows:
• A maximization problem is formulated as a non-convex mixed integer non-linear programming (MINLP) problem for overlapping D2D links sharing the same uplink resources with the traditional CU devices. Solving such a problem for a global solution is NP-hard and time-consuming. A common practice to solve this problem in sub-real time is to divide the optimization problem into sub-problems and solve each one separately for a sub-optimal solution [21].
• Aiming at maximizing the data rate of D2D links, an optimum RA, PA, and RS algorithm is performed taking into consideration the QoS constraints and the maximum practical power allowed for each sub-channel.
• A stable matching relay selection (SMRS) algorithm is introduced to solve the non-convex problem by splitting it into three sub-problems and solving each one separately. First, the best reuse partners are selected in a way that minimizes the interference between CU and D2D devices. Second, the optimum power is allocated for each device as well as the PS factor using PSO algorithm. Finally, the optimum relay is selected using the one-to-one SM algorithm.
• Simulation results illustrate the performance of the SMRS model under various network parameters. Besides, the proposed model is compared with the REET model, and the results show improved performance in the link data rate and the EE.

C. PAPER ORGANIZATION
The rest of this paper is arranged as follows. Section II introduces the system model, the transmission model, and the DR and energy analysis. Section III presents the formulation of the non-convex MINLP problem that maximizes the system throughput as well as the EE using the SMRS algorithm. The numerical results and the comparison between our proposed model and the most closely related model are presented in Section IV. Finally, the conclusion is stated in Section V.

II. SYSTEM MODEL
A macro-cell system model with a BS stationed at the center of the cell, CU devices, and bi-directional D2D devices is considered as shown in Fig. 1. Both CUs and D2D devices are distributed in a random fashion all over the cell. Also, several relays are assisting the data transmission between D2D devices. An uplink resource sharing scenario is adopted, noting that only one sub-channel is allowed to be shared between each CU and D2D link. All devices including D2D devices, relays, and CUs can harvest RE from the surrounding environment. Each relay is a bi-directional half-duplex relay with a limited battery that harvests RF and RE energy and utilizes the PS protocol.

A. TRANSMISSION MODEL
Herein, we adopt a time-slotted system in which each time slot of duration T is divided into two equal sub-slots (t1 and t2). During the first sub-slot (t1), data is transferred from the two terminals of the D2D link (D1 and D2) to the matched relay (MRL). During the second sub-slot (t2), the relay rebroadcasts the signals to the two terminals. Data is transferred from the CU to the BS during both sub-slots. We also assume that D1 and D2 are separated by a distance (D_D2D).
Channel state information (CSI) is critical for optimal power allocation. If the BS receives a tremendous amount of user feedback, the channel is said to have perfect CSI. As the quality of the communication channel in wireless networks varies significantly, imperfect CSI is prevalent in practice [22]. Dealing with imperfect CSI necessitates specialized channel estimation techniques [23], which is a challenging problem. Perfect CSI via prediction is considered as an upper bound for realistic scenarios with a proper perception of system behavior over time, as discussed in [8], [18], [24]. The communication channel is typically modeled as a Rician or Nakagami channel if the line-of-sight component is considered. For dense networks with poor channel conditions and large distances between users, the Rayleigh distribution is adopted. Thus, we model the channel as an independent and identically distributed (i.i.d.) Rayleigh fading channel, which excludes the impact of the direct link between D2D devices and represents the worst-case scenario for RF EH. We also assume that the channel gain is constant during each time slot. For any two nodes i and j, the channel gain (α_ij) is calculated from h_ij, the channel attenuation drawn from the Rayleigh distribution; d_ij, the distance between the two nodes; and γ_PL, the path loss exponent. All the symbols used throughout this paper are listed in Table 1.
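As a rough illustration of the channel model described above, the following Python sketch draws one Rayleigh-faded channel gain. It assumes the common form α_ij = |h_ij|² · d_ij^(−γ_PL) with unit-mean fading power; that exact form, the function name `channel_gain`, and the numeric values are assumptions made for illustration and are not taken from the paper.

```python
import random

def channel_gain(distance_m, path_loss_exp, rng):
    """One sample of alpha_ij = |h_ij|^2 * d_ij^(-gamma_PL) (assumed form)."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    # For Rayleigh fading with unit mean power, |h|^2 is exponential with mean 1.
    fading_power = rng.expovariate(1.0)
    return fading_power * distance_m ** (-path_loss_exp)

rng = random.Random(0)                 # fixed seed so the sketch is repeatable
near = channel_gain(10.0, 3.0, rng)    # short link
far = channel_gain(100.0, 3.0, rng)    # long link, independent fading draw
```

Because the gain is held constant within a time slot in the model above, one draw per link per slot would be enough in a simulation built on this sketch.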
The signal-to-interference-plus-noise ratio (SINR) at the BS ( BS ) during t1 and t2 is expressed as in [18], where σ² represents the variance of the additive white Gaussian noise (AWGN) added to the down-conversion noise and α_CUBS is the channel gain between the CU and the BS. Corresponding expressions give the SINR at the MRL from D1 and D2 during t1 and the received SINR at D1 and D2 during t2.
We also consider that each relay inside the relay selection circle (RSC) can harvest RE from the surrounding environment as well as RF energy from the received signals using the harvest-store-use protocol. The RSC is the circle whose circumference is defined by the two points D1 and D2 [11]. There are three types of relays inside each RSC: the MRL, free relays (FRLs), and joint relays (JRLs), as shown in Fig. 2. The MRL is the relay selected to assist the D2D link transmission. The FRLs and JRLs are idle relays, which can adjust their frequency to harvest energy during their downtime. Moreover, JRLs are relays shared by two or more RSCs; they have the option of selecting the D2D pair with the highest EH capability.
During t1, the MRL optimizes ρ_RL and harvests part of the signals received from the D2D pair and the allocated CU. Meanwhile, the FRLs inside each RSC adjust their operating frequency to match the frequency shared by the CU and the D2D pair so that they can harvest energy from the transmitted signals. Likewise, each JRL lying in two or more RSCs tunes to the frequency of the D2D pair that maximizes its harvested energy. Moreover, during t2, RF energy is transferred from the MRL to the FRLs inside each RSC as well as to the JRLs, which harvest with ρ_RL = 1, as illustrated in Fig. 3.
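The relay categories above can be illustrated with a small geometric sketch. It assumes that the RSC is the circle having the segment D1-D2 as its diameter, which is one reading of "the circle whose circumference is defined by the two points D1 and D2"; the function names and coordinates are hypothetical. The labels apply to idle relays before matching, since the MRL is chosen later among the relays inside the RSC.

```python
import math

def rsc_membership(relays, links):
    """For each relay, list the D2D links whose RSC contains it.

    relays: list of (x, y); links: list of ((x1, y1), (x2, y2)) for (D1, D2).
    Assumes the RSC is the circle with D1-D2 as its diameter.
    """
    membership = []
    for relay in relays:
        inside = []
        for k, (d1, d2) in enumerate(links):
            center = ((d1[0] + d2[0]) / 2, (d1[1] + d2[1]) / 2)
            radius = math.dist(d1, d2) / 2
            if math.dist(relay, center) <= radius:
                inside.append(k)
        membership.append(inside)
    return membership

def relay_role(inside_links):
    """Label an idle relay by the number of RSCs it lies in."""
    if not inside_links:
        return "outside"   # belongs to no RSC: can only harvest RE
    if len(inside_links) == 1:
        return "FRL"       # free relay of a single RSC
    return "JRL"           # joint relay shared by two or more RSCs

links = [((0.0, 0.0), (2.0, 0.0)), ((1.5, 0.0), (3.5, 0.0))]
relays = [(0.5, 0.0), (1.8, 0.0), (5.0, 5.0)]
roles = [relay_role(m) for m in rsc_membership(relays, links)]
```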

B. DATA RATE AND ENERGY ANALYSIS
The achievable DR of the TWR system that utilizes the DF protocol is expressed as in [25], where DR_ij signifies the data rate of any link as a function of the SINR between the two nodes i and j, DR_MA is the data rate of the multiple access link from D1 and D2 to the MRL, and DR_sum, given by equation (11), denotes the sum rate of the TWR network. The energy consumed by each device in the D2D link is expressed in terms of ζ_PA, the power amplifier efficiency, which lies in [0,1], and E_le, the battery leakage per time slot.
In the linear EH model, the conversion efficiency (ζ_RF) is assumed to be constant, so the harvested power is ζ_RF multiplied by Pr_i, the input power to the harvester. However, practical harvesters exhibit a non-linear input-output relationship. Thus, we adopt a simple non-linear EH model based on the inverse proportional function as in [26], [27], in which the output power depends on the constants a, b, and c obtained via standard curve fitting. The harvested power at each relay node is evaluated per transmitting device (i), and the total harvested energy for the matched and unmatched relays inside the RSC under the non-linear model is given by equations (16) and (17), respectively, where EH_RE is the RE harvested from the surrounding environment by each relay. It is worth noting that equation (17) applies to both the FRLs and the JRLs with ρ_RL = 1, whereas relays that do not belong to any RSC can only harvest RE. Furthermore, the residual energy in the matched relay after each time slot is evaluated by equation (18), whereas the accumulated energy in the FRLs and JRLs after each time slot is given by equation (19).
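To make the difference between the two EH models concrete, the sketch below compares a linear harvester with an inverse-proportional non-linear one. The non-linear form P_out = (a·P_in + b)/(P_in + c) − b/c is a commonly used inverse-proportional model and is assumed here rather than quoted from the paper; the default constants are the fitted values and the efficiency reported in Section IV, and the power unit is whatever unit the fit was performed in, which is left unspecified.

```python
def linear_eh(p_in, efficiency=0.7):
    """Linear model: output grows without bound at a constant conversion efficiency."""
    return efficiency * p_in

def nonlinear_eh(p_in, a=429.03, b=473.18, c=645.26):
    """Assumed inverse-proportional model: zero output at zero input, saturating output."""
    return (a * p_in + b) / (p_in + c) - b / c

def saturation_level(a=429.03, b=473.18, c=645.26):
    """Limit of nonlinear_eh as the input power grows large."""
    return a - b / c

# Compare the two models over a few input powers (same, unspecified unit).
comparison = [(p, linear_eh(p), nonlinear_eh(p)) for p in (0.0, 100.0, 1000.0, 10000.0)]
```

In this form the non-linear output levels off near `saturation_level()`, whereas the linear model keeps growing, which is the qualitative behavior that motivates using a non-linear model for practical harvesters.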

III. PROBLEM FORMULATION
The primary goal of this work is to maximize the utility (data rate) of the D2D links while satisfying the QoS constraints by designing an efficient RS mechanism. We consider a joint optimization problem that consists of selecting the best reuse partner among the CUs for each D2D link to share the same uplink sub-channel, allocating the optimum power to each D2D user and CU together with the optimum value of the PS factor, and subsequently selecting the optimal relay for each D2D link using the SM algorithm. Our optimization problem is formulated as a constrained objective function (OF) subject to constraints C1 to C6. Constraint C1 sets the limits of the PS factor for each relay. C2 serves the QoS requirements, under which all links must exceed a certain minimum limit. According to the standards, the transmission power of any link cannot exceed P_max, as stated in C3. Constraint C4 asserts that only one sub-channel can be shared between each CU and D2D link. Moreover, C5 states that the residual energy in the relay cannot exceed the maximum battery capacity, also known as the overflow limitation. Furthermore, the causality constraint is stated in C6 to enforce the harvest-store-use protocol for each relay. Since our optimization problem contains a non-linear objective function and constraints on both continuous and binary variables, it is an MINLP problem. Due to the complexity of the formulated non-convex problem, we decompose it into three sub-problems and solve each of them separately, as follows:

A. RESOURCE ALLOCATION STRATEGY
We propose a resource allocation strategy that maximizes the sum rate of the two-way relaying D2D links by pairing the CU devices with the D2D links. The licensed spectrum resources must be allocated wisely to mitigate the severe interference imposed on both links by the pairing. The selection is based on channel gain information: the lower the channel gain between the CU and the D2D link, the weaker the mutual interference and hence the higher the SINR on both links [28].
According to constraint C4, only one sub-channel is allocated to each CU and is allowed to be shared with one D2D link during each sub-slot. The best reuse partners are formed according to the minimum channel gain between the CUs and the MRL during t1 and between the CUs and (D1, D2) during t2, taking into account the interference limited area (ILA) concept introduced in [29]. The main idea behind the ILA is that D2D devices close to the BS cannot form a D2D connection and instead choose the cellular mode to maintain the QoS requirements of the CUs. Also, CUs inside the limited area cannot be allocated to D2D links and are excluded from the candidate set of CUs. The detailed RA is presented in Algorithm 1.
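The t2 pairing rule described above (each D2D link reuses the sub-channel of the CU with the minimum average channel gain to D1 and D2, and each CU is shared with at most one link) can be sketched as follows. This is a minimal illustration, not the paper's Algorithm 1: the greedy processing order of the links, the function name `select_reuse_partners`, and the gain values are assumptions, and the inputs are taken to contain only CUs and links outside the ILA.

```python
def select_reuse_partners(gain_cu_d1, gain_cu_d2):
    """Pair each D2D link with a distinct CU of minimum average channel gain.

    gain_cu_d1[k][c] / gain_cu_d2[k][c]: channel gain between CU c and
    terminal D1 / D2 of D2D link k. Returns {link index: CU index}.
    """
    n_links = len(gain_cu_d1)
    n_cus = len(gain_cu_d1[0]) if n_links else 0
    free_cus = set(range(n_cus))
    partners = {}
    for k in range(n_links):          # links are served in index order (assumption)
        if not free_cus:
            break                     # no CU left to share a sub-channel with
        best = min(sorted(free_cus),
                   key=lambda c: (gain_cu_d1[k][c] + gain_cu_d2[k][c]) / 2)
        partners[k] = best
        free_cus.remove(best)         # one sub-channel per CU and D2D link (C4)
    return partners

gain_d1 = [[0.9, 0.1, 0.5], [0.2, 0.05, 0.7]]
gain_d2 = [[0.7, 0.3, 0.5], [0.4, 0.15, 0.1]]
partners = select_reuse_partners(gain_d1, gain_d2)
```

The same routine could be reused for the t1 pairing by passing the CU-to-MRL gains for both arguments.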

B. POWER ALLOCATION STRATEGY
After reuse partner selection and resource block assignment, the next step is to allocate the optimum transmission power of all devices (CUs, D2D devices, and relays) so as to mitigate the interference. This is done jointly with allocating the optimum PS factor that balances the DR and the EH for each relay. As the primary links, the cellular links are prioritized over the D2D links; thus, we assign the maximum allowed power P_max to the CU links to guarantee their QoS requirements and formulate a model to optimize the transmission power of the D2D links. This model relies on the PSO algorithm to solve the PA problem.

Algorithm 1 (excerpt)
5: CU ( ) and RL are partners during (t1).
6: end for
7: Find CU ( ) with min. average channel gain (α_CUD1 + α_CUD2) / 2.
   CU ( ) and D2D link (ω) are partners during (t2).
10: end for

PSO is related to swarming theory, genetic algorithms, and evolutionary programming [30]. The main issue with PSO is that it can easily fall into a local optimum in high-dimensional problems and that it depends on the topology structure [31]. PSO can also be combined with artificial intelligence to address RA in heterogeneous networks [32]. The number of parameters to be optimized defines the dimension of the problem, while the population (swarm) size is the number of candidate solutions, called particles. Each dimension has a maximum and a minimum value that define the search area, while each particle has a position and a velocity and is evaluated using the objective function. Here, we transform the PA problem into a four-dimensional PSO problem, where the location of each particle (i = 1, 2, ..., N) is a four-element vector. As a search algorithm, PSO starts with random particles within the search space, initializes the personal best (pbest) value of each particle with its first location, and initializes the global best (gbest) with the best of the initial positions. Next, it updates the positions and velocities of the particles using equations (22) and (23), where w is the initial inertia weight, c1 and c2 are the acceleration coefficients, and r1 and r2 are random values following the uniform distribution U(0,1). Then, it evaluates the fitness value of each particle and updates the personal and global best values. This process is repeated for the maximum number of iterations (κ), and the best values found are returned. The details of the optimization process are shown in Algorithm 2.
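The PSO loop described above can be sketched in a few lines. The update rule used below is the textbook one (velocity = w·velocity + c1·r1·(pbest − position) + c2·r2·(gbest − position), followed by position = position + velocity); it is assumed to correspond to the paper's update equations but is not copied from them. The four decision variables are taken, as an assumption, to be normalized versions of the D1, D2, and relay transmit powers and the PS factor, and `toy_rate` is a hypothetical stand-in for the sum-rate objective.

```python
import random

def pso_maximize(objective, bounds, n_particles=30, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize objective(x) over the box `bounds` with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # personal best positions
    pbest_val = [objective(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]     # global best so far
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()  # r1, r2 ~ U(0, 1)
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep the particle inside the search area.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val > pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val > gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def toy_rate(x):
    """Hypothetical concave objective with its maximum at 0.3 in every dimension."""
    return -sum((xi - 0.3) ** 2 for xi in x)

bounds = [(0.0, 1.0)] * 4        # assumed: normalized P_D1, P_D2, P_RL, rho_RL
best, best_val = pso_maximize(toy_rate, bounds)
```

In a full implementation the objective would return a very low value (or apply a penalty) for particles that violate the QoS and causality constraints, so that infeasible solutions never become personal or global bests.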

C. RELAY SELECTION STRATEGY
In this subsection, the optimal RS strategy that maximizes both utility and EH is introduced. This strategy considers the overflow constraint C5 and ensures that only one relay can assist each D2D link. We also consider the RSC concept introduced in [11]. Herein, we propose an RS algorithm based on SM theory, which was first introduced by Gale and Shapley in [33]. Since we consider a dense environment with many overlapping D2D links, each of which aims to select a relay among a large number of relays, the matching between the two sides is an interesting problem. Hence, we formulate a one-to-one matching model between D2D links and relays based on their mutual preference lists. There are two alternatives, D2D-link proposing and relay proposing, and the resulting matching is optimal for the proposing side. Furthermore, we assume that the number of D2D links is smaller than the number of relays and that the D2D links are the proposers, since our main objective is to maximize the sum rate of the D2D links.

Algorithm 2 (excerpt)
... according to constraint C3.
4: for each D2D link (ω) do
5:   for each relay inside RSC (L_RSC) do
6:     Initialize the population positions with uniform random values between (0,1).
7:     Initialize the global best solution with the worst value for the optimization problem.
8:     for each iteration (κ) do
9:       Check the constraints (C2, C6) of upper and lower limits on each solution.
10:      Calculate the objective function (DR_sum) using (11).
11:      Determine the personal and global best solutions.
12:      Update the position and velocity of solutions using (22), (23).
13:    end for
14:    return with the best solution (optimum values).
15:  end for
16: end for
First, mutual preference lists for D2D links and relays are established as shown in Algorithm 3. Since each D2D pair seeks to maximize its DR with minimum transmission power, we formulate the D2D preference list PL_D2D based on the maximum utility achieved by the candidate relays. On the other hand, the relay preference list PL_RL is formulated based on the total energy harvested from each link, as each relay seeks to maximize its benefit from the cooperation. Afterward, we present the SMRS algorithm based on the established preference lists. The deferred acceptance algorithm in [34] is adopted to achieve a stable, unique, and optimal matching.

Algorithm 3 (excerpt)
Input: ... constants (a, b, c), ζ_PA, E_RLres
Output: D2D preference list (PL_D2D), Relay preference list (PL_RL)
1: Set D2D links as the proposers and relays as the acceptors.
2: for each D2D link (ω) do
3:   for each relay inside RSC (L_RSC) do
4:     Calculate the utility represented in (11) using the optimum power values.
5:     Calculate the RF energy harvested by each relay using (16).
6:   end for
7:   Obtain the preference matrix of D2D links (PL_D2D) by sorting the achievable utilities in descending order.
8: end for
9: for each relay (RL) do
10:  Obtain the preference matrix of relays (PL_RL) by sorting the energy harvested values in descending order.
11: end for
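The construction of the two preference lists amounts to two descending sorts. The sketch below assumes the utilities and harvested energies have already been computed for every (link, relay) pair; the function name `build_preference_lists` and the numbers are hypothetical.

```python
def build_preference_lists(utility, harvested):
    """Build PL_D2D and PL_RL by sorting in descending order.

    utility[d][r]:   sum rate D2D link d achieves when assisted by relay r.
    harvested[d][r]: energy relay r harvests when it assists link d.
    Returns (pl_d2d, pl_rl): pl_d2d[d] lists relays, pl_rl[r] lists links,
    most preferred first.
    """
    n_links = len(utility)
    n_relays = len(utility[0]) if n_links else 0
    pl_d2d = [sorted(range(n_relays), key=lambda r: utility[d][r], reverse=True)
              for d in range(n_links)]
    pl_rl = [sorted(range(n_links), key=lambda d: harvested[d][r], reverse=True)
             for r in range(n_relays)]
    return pl_d2d, pl_rl

utility = [[3.0, 5.0, 1.0], [2.0, 4.0, 6.0]]      # 2 links x 3 relays
harvested = [[0.2, 0.9, 0.4], [0.7, 0.1, 0.5]]
pl_d2d, pl_rl = build_preference_lists(utility, harvested)
```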
In the beginning, each D2D pair proposes to its most preferred relay according to PL_D2D. Then, each relay that receives one or more proposals accepts the most preferred D2D pair according to PL_RL and rejects the rest. The rejected D2D pairs propose again to their next preferred relays; each relay then compares the new proposals with its currently matched pair, if any, and keeps its most preferred pair.
This iterative algorithm continues until all D2D links are matched, as depicted in Algorithm 4. After the RS is accomplished, we update the residual energy of each MRL by considering the consumed energy and the total harvested energy. We also update the energy accumulated in the FRLs and JRLs by adding the total harvested energy to their batteries.

Algorithm 4 Relay Selection (RS) Using Stable Matching Algorithm
Input: Energy harvesting model, PL_D2D, PL_RL, E_RLmax, E_RLres, E_le, Non-linear EH constants (a, b, c), ζ_PA
Output: Matching matrix ( ), Utility matrix of relays (DR_sum), Residual energy (E_RLres), RF energy harvested (EH_RL)
1: Initialize all proposers and acceptors to free
2: while there are free proposers do
3:   for each free D2D link do
4:     the D2D link proposes to its most preferred relay that has not yet rejected it, according to (PL_D2D).
5:   end for
6:   for each relay do
7:     if the relay receives a proposal from a D2D link (i) better than its currently matched partner (j) then
8:       RL rejects (j) and chooses (i) to be its new matched partner
9:       Update the matching matrix (RL) = i
10:      Set (j) as a free proposer and (i) as a matched one.
11:      Remove D2D (j) from the preference list of that relay.
12:    else
13:      RL rejects (i) and continues with its matched partner.
14:      Update the matching matrix (RL) = j.
15:      Remove D2D (i) from the preference list of that relay.
16:    end if
17:  end for
18: end while
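The deferred acceptance procedure with D2D links as proposers can be written compactly. The sketch below is a generic Gale-Shapley implementation offered as an illustration of the matching step only; it omits the energy bookkeeping, and the function name and example preference lists are hypothetical.

```python
def deferred_acceptance(pl_d2d, pl_rl):
    """One-to-one stable matching with D2D links proposing.

    pl_d2d[d]: relays in decreasing order of preference for link d.
    pl_rl[r]:  links in decreasing order of preference for relay r.
    Returns {relay index: matched link index}.
    """
    # rank[r][d] = position of link d in relay r's list (lower is better).
    rank = [{d: i for i, d in enumerate(prefs)} for prefs in pl_rl]
    next_choice = [0] * len(pl_d2d)      # next relay each link will propose to
    match = {}                           # relay -> link
    free = list(range(len(pl_d2d)))      # all proposers start free
    while free:
        d = free.pop()
        if next_choice[d] >= len(pl_d2d[d]):
            continue                     # list exhausted: link stays unmatched
        r = pl_d2d[d][next_choice[d]]
        next_choice[d] += 1
        if d not in rank[r]:
            free.append(d)               # relay does not list this link
            continue
        current = match.get(r)
        if current is None:
            match[r] = d                 # free relay accepts provisionally
        elif rank[r][d] < rank[r][current]:
            match[r] = d                 # relay trades up and frees its old partner
            free.append(current)
        else:
            free.append(d)               # proposal rejected
    return match

pl_d2d = [[0, 1, 2], [0, 2, 1], [1, 0, 2]]
pl_rl = [[1, 0, 2], [0, 2, 1], [0, 1, 2]]
matching = deferred_acceptance(pl_d2d, pl_rl)
```

With complete preference lists and no more links than relays, every link ends up matched, and no link and relay both prefer each other to their assigned partners.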
The complete SMRS algorithm, which represents the joint optimization of the three sub-problems (RA, PA, and RS), is summarized in Algorithm 5. The complexity of the SMRS algorithm is studied using big O notation, as follows. In Algorithm 1, each D2D link outside the ILA (ω_ILA) pairs with a CU outside the ILA ( ILA ) during t1 and a relay (L_RSC) during t2. The overall complexity of the RA algorithm can be evaluated using the QuickSort technique as O|ω_ILA * ILA * L_RSC|. The complexity of the PSO algorithm is determined by the number of solutions (N), the maximum number of iterations (κ), and the number of decision variables (ϑ). As a result, the overall complexity of Algorithm 2 is O|ω_ILA * κ * N * (ϑ + N)|, and since the number of decision variables is always less than the number of solutions, the complexity can be expressed as O|ω_ILA * κ * N²|. Finally, the best relay is selected for each D2D link using a stable matching algorithm. First, the preference lists are established as in Algorithm 3. Sorting the preferences of ω_ILA D2D links and L_RSC relays in descending order results in an average complexity of O|ω_ILA * L_RSC * log(ω_ILA * L_RSC)|. Then, each D2D link proposes to the L_RSC relays according to its established preference list with a complexity of O|ω_ILA * L_RSC|, according to Algorithm 4.

Algorithm 5 (excerpt)
13: if RL ∈ then
14:   Calculate total harvested energy (EH_MRL) using (16).
15:   Update the residual energy (E_RLres) according to (18).
16: else
17:   Calculate the total harvested energy (EH_RL) using (17).
18:   Update the residual energy (E_RLres) according to (19).
19: end if
20: end for

VOLUME 10, 2022
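The battery update performed for each relay after matching can be illustrated as below. The sketch assumes a simple energy balance (stored energy minus consumption and leakage, plus harvested energy), capped at the battery capacity to respect the overflow constraint C5, and with spending limited to the energy already stored to respect the causality constraint C6. The balance form, the function name, and the numbers are assumptions and are not the paper's equations (18) and (19).

```python
def update_relay_battery(e_res, e_harvested, e_max, e_consumed=0.0, e_leak=0.0):
    """Return the residual energy of a relay after one time slot.

    For a matched relay pass its transmission energy as e_consumed;
    for an FRL or JRL leave e_consumed at 0 so that energy only accumulates.
    """
    # Causality (C6): under harvest-store-use, only stored energy can be spent.
    if e_consumed + e_leak > e_res:
        raise ValueError("causality violated: not enough stored energy")
    # Overflow (C5): the battery cannot hold more than its capacity.
    return min(e_max, e_res - e_consumed - e_leak + e_harvested)

matched = update_relay_battery(10.0, 3.0, 20.0, e_consumed=4.0, e_leak=1.0)
idle = update_relay_battery(18.0, 5.0, 20.0)   # capped at the battery capacity
```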

IV. NUMERICAL RESULTS AND EVALUATION
In this section, the performance of the proposed algorithm is analyzed and compared with the REET algorithm. The simulation parameters are listed in Table 2. The results are evaluated using MATLAB and averaged over multiple iterations. We investigate the average sum rate of overlapped D2D links under different simulation parameters, considering a very dense environment where there is no direct link between the D2D devices, as shown in Fig. 4. Since the RE harvested from the surrounding environment is naturally random and follows a stochastic process [34], we model the energy packet arrivals as a Poisson process with a rate of 3 packets/s, where each i.i.d. energy packet follows the uniform distribution U(0,100) mJ [35]. In addition, the parameters used in the non-linear RF EH model are a = 429.03, b = 473.18, and c = 645.26, obtained with the MATLAB curve fitting tool, while the conversion efficiency of the linear model is set to ζ_RF = 0.7. The comparison between the two EH models and the measured data from [36] is presented in Fig. 5.

Fig. 5. The comparison between linear EH model, non-linear EH model, and measured data from [36].
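The RE arrival model above (Poisson packet arrivals at 3 packets/s, each packet carrying an energy drawn from U(0,100) mJ) can be simulated with exponential inter-arrival times. The sketch below is an illustration in Python rather than the MATLAB code used for the reported results; the function name and the observation window are assumptions.

```python
import random

def harvested_re_energy(duration_s, rate_per_s=3.0, max_packet_mj=100.0, rng=None):
    """Total RE (in mJ) arriving within `duration_s` seconds.

    Packet arrivals form a Poisson process, simulated through exponential
    inter-arrival times; each packet energy is uniform on [0, max_packet_mj].
    """
    rng = rng or random.Random()
    total = 0.0
    t = rng.expovariate(rate_per_s)          # time of the first arrival
    while t < duration_s:
        total += rng.uniform(0.0, max_packet_mj)
        t += rng.expovariate(rate_per_s)     # time of the next arrival
    return total

rng = random.Random(42)
per_second = [harvested_re_energy(1.0, rng=rng) for _ in range(5)]
```

Under these parameters the expected harvest is rate × mean packet energy, i.e. 3 × 50 = 150 mJ per second of observation.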
The distance between D2D devices and its effect on the utility of D2D links is investigated in Fig. 6. The results show that as the distance between the devices increases, the path loss experienced by the D2D transmission signal also increases, so D2D devices are forced to increase their transmission power, and consequently, the average sum rate of the D2D links decreases.
Increasing the number of CUs ( ) inside the cell increases the probability that each D2D link finds a CU with minimum channel gain with which to share a sub-channel; consequently, the interference decreases and the average sum rate of the network increases, as shown in Fig. 7. The effect of the number of relays inside the RSC on the average sum rate of D2D links is plotted in Fig. 8. The more relays available to each D2D link, the more likely it is that the relay maximizing the utility of the link can be selected. Fig. 9 demonstrates the impact of the D2D distance on the transmission power of each device in the D2D link. As shown in the figure, the D2D devices (D1, D2) and the relay increase their transmission power as the distance between the D2D devices increases. Increasing the D2D distance causes more power degradation in the signals. To compensate for the power loss and maintain the QoS requirements, the transmission power must be increased.
The impact of the D2D distance on the transmission power and the RF EH of the matched relay for each link is shown in Fig. 10. As the D2D distance increases, the received power at the relay decreases, and so does the RF harvested power. Furthermore, because the RF harvested power is less than the matched relay's transmission power, RE harvested energy is needed to support the transmission process.
Furthermore, we compare the performance of the proposed SMRS algorithm with that of the REET algorithm in terms of the average sum rate of D2D links, the consumed power, and the total harvested power. Fig. 11 shows the impact of varying the distance between the D2D devices on the average sum rate of D2D links for both algorithms. As the D2D distance increases, the average utility decreases for both algorithms. However, the SMRS algorithm shows relatively higher utility values than the REET algorithm. Moreover, the D2D distance, as shown in the figure, is a critical parameter in determining the superiority of the SMRS algorithm over the REET algorithm. The difference between the two algorithms decreases slightly as the D2D distance exceeds the maximum limit. The total power harvested by the D2D link and the total power consumed by the D2D link devices (D1, D2, and RL) are compared for both algorithms in Fig. 12.
Total power consumption increases with the D2D distance, while the total RF/RE power harvested by the D2D link decreases. As a result of using the SM algorithm in the RS sub-problem, the SMRS algorithm performs better in EE as well as in utility. According to the figure, D2D links under the SMRS algorithm consume less power and harvest more power than under the REET algorithm. In addition, the residual energy after the transmission process is stored in the device's battery for future use.

V. CONCLUSION
In this paper, we addressed a joint optimization problem of RA, optimum PA, and RS in a TWR network while considering the non-linear RF/RE EH capabilities of the devices. The PSO and SM approaches are used to solve the PA and RS sub-problems, respectively. Furthermore, the proposed model was tested and compared with a related algorithm, and the numerical results showed that using SM in the RS sub-problem boosted the total utility of the system as well as the EH. For future work, scenarios in which multiple D2D pairs share the same sub-channel with CU devices, while the maximum number of matched resources for each user is optimized, should be considered. Furthermore, the proposed model can be modified by considering the intra-cell interference and imperfect CSI. Finally, the use of distributed artificial intelligence in the resource and power allocation sub-problems should be investigated and compared with the existing models.