Relay Probing for Millimeter Wave Multi-Hop D2D Networks

Communication in the millimeter wave (mmWave) band, e.g., WiGig (60 GHz), is considered one of the main components of 5G and beyond 5G (B5G) networks. However, it is characterized by short-range transmissions along with high susceptibility to path blockage, e.g., human shadowing. Thus, mmWave multi-hop relaying using device-to-device (D2D) connections turns out to be an efficient solution to extend the communication range and to route around blockages. In this context, relay probing is essential to discover the candidate multi-hop routes from source to destination and then select the best route among them. However, a trade-off exists between relay probing and the overhead required by mmWave beamforming training. In this paper, taking advantage of multi-band $\mu W$/mmWave relay nodes, an efficient multi-hop relay probing scheme is proposed for mmWave D2D routing. In this scheme, the $\mu W$ received signal strengths (RSSs) collected among the distributed relay nodes are used to estimate the probability distributions of their mmWave signal-to-noise power ratios (SNRs) while considering line-of-sight (LOS) and non-LOS (NLOS) path availabilities. Then, based on a proposed probabilistic metric, a hierarchical search algorithm is proposed to jointly discover the relay nodes along the candidate multi-hop routes and enumerate the routes expected to maximize the spectral efficiency of the whole path from source to destination. This is done in an offline phase, and only the relay nodes located within the pre-selected multi-hop routes are requested to do online relay probing. Numerical analysis confirms the superiority of the proposed mmWave multi-hop relay probing scheme over the candidate techniques.


I. INTRODUCTION
The associate editor coordinating the review of this manuscript and approving it for publication was Cunhua Pan.
The large swath of available spectrum in the millimeter wave (mmWave) band, 30 ∼ 300 GHz, attracts researchers from both academia and industry to use it for fifth generation (5G) and beyond (B5G) applications [1]-[5]. WiGig standards, e.g., IEEE 802.11ad and IEEE 802.11ay, defined the set of protocols required for communication in the unlicensed 60 GHz band [3], [4]. They standardized multi-band WiGig capable devices, i.e., devices operating in the 2.4, 5 and 60 GHz bands, for backward compatibility and for fast session transfer (FST) among the bands [3], [4]. However, mmWave suffers from high propagation losses, to the extent that even oxygen absorption affects the mmWave channel [5], [6]. Moreover, mmWave transmission is highly susceptible to path blockage, even that caused by human shadowing [5], [6]. All these channel impairments mainly come from its high operating frequency, as 28/21.6 dB of additional free-space attenuation is expected in the 60 GHz mmWave band compared to the legacy 2.4/5 GHz µW bands, respectively [7]. Antenna beamforming by means of structured codebooks is widely accepted as an essential enabler for mmWave communications. Owing to the use of large antenna arrays, mmWave beamforming can be done either using analog beamformers constructed from phase shifters, or using hybrid analog/digital beamformers [8], [9]. Consequently, a variety of beamforming training (BT) techniques were suggested in the literature to find the best mmWave transmit (TX)/receive (RX) beam pair. Yet, the mmWave BT process is highly time-consuming, typically lasting on the order of milliseconds [8], [9].
Device-to-device (D2D) communications are another important 5G and B5G strategy. In this mode, the requested transmissions are completed between the distributed devices without traversing the Macro-base station (BS) [10]-[12]. This relaxes the traffic demands on the Macro-BS while highly densifying the network, making it suitable for the emergent 5G and B5G use cases [10]. In this context, both in-band and out-band D2D network architectures were investigated in the literature [10]. Out-band D2D networks have the advantages over in-band networks of causing no interference with the Macro-BS links and of utilizing the high bandwidth of the unlicensed spectrum.
A win-win relationship exists between mmWave transmissions and D2D scenarios [13]-[21]. On one side, the use of mmWave enables the construction of low-interference and ultra-dense D2D networks thanks to its highly directional short-range transmissions. On the other side, the short-distance D2D mode is well suited to mmWave because of its high propagation losses and susceptibility to path blockage. It is stated in the literature that the probability of mmWave line-of-sight (LOS) availability decreases exponentially with the distance between a mmWave transmitter (TX) and receiver (RX) [22], [23]. Thus, to avoid LOS path blockage, short-range D2D transmission is the best fit for mmWave [10].
Relays have been extensively studied in the literature to extend the coverage and capacity of wireless communication networks [24]. In this context, different relaying schemes were suggested, such as analog repeater, amplify-and-forward (AF), decode-and-forward (DF), compress-and-forward (CF), and demodulate-and-forward (DmF) [24]. Moreover, both half duplex (HD) and full duplex (FD) relaying strategies were investigated [24]. The use of relays turns out to be an important tool for mmWave to overcome its limited coverage considering the large difference between LOS and non-LOS (NLOS) propagation characteristics [25]-[32]. Thus, mmWave D2D relays can be used to route around blockages and extend the mmWave link [25]-[32]. Recently, the performance of relay aided mmWave networks was studied under different scenarios to bound its coverage and spectral efficiency, and different relay selection schemes were suggested [25]. The use of BT makes relay probing, i.e., the process of exploring the good relays, a critical problem in mmWave relaying. A trade-off exists between mmWave relay probing and maximizing the overall throughput from source to destination. That is, probing more relays increases the probability of finding a good multi-hop route with high spectral efficiency, but at the expense of increasing the BT overhead. This results in decreasing the overall throughput, increasing the end-to-end delay and increasing the energy consumption of the mmWave D2D relaying. The problem of mmWave relay probing was first explored by the authors in [26] for two-hop mmWave D2D relay networks using DF and HD relaying strategies. The authors of that paper showed that the optimal relay probing is a pure-threshold policy. Based on optimal stopping theory, they proposed that relay probing is conducted one relay at a time until the predefined spectral efficiency threshold in bit/sec/Hz corresponding to the expected maximum throughput is reached. A fixed-point equation was formulated to bound the expected spectral efficiency threshold based on link availability and probing overhead. Despite the pioneering nature of this work, the suggested scheme has some critical limitations:
1) In order to find the optimal spectral efficiency threshold, the distribution of the spectral efficiency of the two-hop relay link should be known beforehand, which is impractical in real scenarios.
2) The extension of the scheme to the multi-hop scenario is also unrealistic, because it requires pre-knowing the distribution of the spectral efficiencies of the multi-hop routes with different hop counts in addition to their LOS blockage probabilities.
3) Only mmWave LOS scenarios were considered, and the NLOS paths are completely ignored.
Finally, the scheme is based on blindly exploring a fixed number of candidate relays corresponding to the target spectral efficiency. This is done while other non-probed relays may have better spectral efficiency than the pre-calculated optimal threshold.
In this paper, we propose an efficient yet practical multi-hop relay probing scheme for mmWave D2D networks. In this scheme, taking advantage of the integrated µW/mmWave devices [3], [4], the µW band is used to assist the relay probing process in the mmWave multi-hop D2D network. The proposed scheme mainly consists of two phases, the offline routes pre-selection phase and the online relay probing phase. In the first phase, the µW received signal strengths (RSSs) among the distributed relay nodes are collected at the Macro-BS. Then, based on these collected µW RSSs, the probability density functions (PDFs) of the mmWave signal-to-noise power ratios (SNRs) among the distributed relay nodes are calculated while estimating the LOS/NLOS probabilities. Based on the anticipated SNR PDFs, a probabilistic metric is proposed, and a hierarchical search is performed to enumerate the candidate multi-hop routes expected to maximize the overall spectral efficiency from source to destination. The online relay probing phase is conducted only by the relay nodes belonging to the pre-selected routes, in order to select the route with the maximum spectral efficiency for constructing the link. In this way, there is no need to pre-know the exact distribution of the overall spectral efficiency of the candidate routes or their LOS blockage probabilities, which are impractical assumptions in real scenarios. Also, both LOS and NLOS paths are considered, which makes the proposed scheme more realistic. Finally, the proposed scheme can be applied for any number of hops. The main contributions of this paper can be summarized as follows:
• The problem of multi-hop relay probing in mmWave D2D networks is formulated while highlighting the trade-off between increasing the number of probed routes and the required overhead.
• A heuristic solution based on the interworking between the µW and mmWave bands is introduced including the required multi-band management protocol. The proposed protocol organizes the operation of the whole mmWave D2D relaying network including the signaling between the Macro-BS and the relaying nodes using LTE, µW and mmWave interfaces. The offline phase of the multi-hop routes pre-selection and the online relay probing phase will be given as parts of the proposed heuristic solution.
• Numerical analysis is conducted to compare the performance of the proposed scheme with that obtained using optimal spectral efficiency threshold introduced in [26].
The rest of this paper is organized as follows: Section II gives the related works. Section III gives the system model and problem formulation of multi-hop routing in mmWave D2D networks. Section IV gives the proposed heuristic solution including the proposed multi-band management protocol, the offline routes pre-selection phase and the online relay probing phase. Section V presents the conducted numerical simulations. Finally, Section VI concludes this paper.
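Notations: For a random variable (r.v.) $X$, $f_X(x)$ and $F_X(x)$ denote its PDF and cumulative distribution function (CDF), respectively. The probability and expectation are represented by $P(\cdot)$ and $E[\cdot]$. A log-normal r.v. $X = e^{\mu + \delta Y}$, where $Y$ is a r.v. with standard normal distribution, i.e., $Y \sim \mathcal{N}(0, 1)$, is represented as $X \sim \mathcal{LN}(\mu, \delta^2)$, with a PDF as:
$$f_X(x) = \frac{1}{x\,\delta\sqrt{2\pi}} \exp\left(-\frac{(\ln x - \mu)^2}{2\delta^2}\right), \quad x > 0.$$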

II. RELATED WORKS
The application of D2D communications in the mmWave band has gained a lot of attention due to their inherent win-win relationship. The integration of mmWave communications into the 3GPP framework of D2D communications, i.e., ProSe, was investigated in [14]. In [15], coverage and rate analysis were performed using stochastic geometry for mmWave wearable D2D networks under a finite number of interferers. In [16], a tractable framework of mmWave D2D networks was established to investigate their performance using a bipolar model integrated with several features of mmWave transmissions. In [17], the performance of cluster based mmWave D2D networks was investigated under different user association schemes. In [18], multi-cast scheduling was proposed for mmWave D2D networks for energy efficient mmWave concurrent transmissions. In [19], a power control algorithm was proposed to manage the interference in highly dense mmWave D2D networks, combining both device association and beamwidth selection. Recently, in [20], a new centralized access control scheme was proposed for enabling concurrent transmissions in mmWave D2D networks. Also, in [21], the interference and outage analysis of random mmWave D2D networks was given.
Relaying was investigated in conjunction with mmWave transmissions in order to increase coverage and route around blockages, as given in [25]-[32]. In [25], stochastic geometry was used to study the enhancements in coverage and spectral efficiency of mmWave D2D relay networks. The relay probing scheme proposed in [26] was utilized by the authors in [27] when proposing concurrent backhaul link scheduling for mmWave cellular networks. The optimization of the full-duplex relays and power allocations in mmWave D2D networks was formulated as a multi-objective combinatorial optimization problem in [28]. To overcome mmWave path blockage, the authors in [29] proposed multi-hop relay path selection with the aim of minimizing the total transmission time. Coverage, capacity and error rate analysis of multi-hop mmWave using DF relaying was given in [30]. In that work, the performance of noise-limited and interference-limited scenarios was given while considering both LOS and NLOS links. In [31], stochastic geometry was used to mathematically characterize the enhancement in the ranging performance of the mmWave terminal and relay nodes, which is used in optimizing the placement of the relay nodes. In [32], FD relaying was applied for mmWave relaying, where an orthogonal matching pursuit precoder was used for self-interference cancellation. All previous research works on mmWave relaying [25]-[32], except that given in [26], relaxed the problem of relay probing, which is a critical point in mmWave relaying. Instead, they assumed that all mmWave channel information is exactly known beforehand, without considering the overhead of obtaining such information using BT, as their main objectives were related to radio resource management in mmWave relaying.
Taking advantage of the standardized multi-band mmWave nodes, the integration between the µW and mmWave bands was extensively investigated to address the challenges of mmWave communications and optimize its performance. In [1], the µW RSS radio map was used as an index to the mmWave best beams radio map for enabling mmWave concurrent transmissions in random access scenarios. In [2], a multi-band heterogeneous architecture was introduced for 5G networks based on the integration between LTE/Wi-Fi/mmWave bands, empowered by a new concept of 2C/U plane splitting. In [33], utilizing µW/mmWave dual band small cells, scheduling over the two different bands was jointly performed based on user applications. In [34], the FST proposed by IEEE 802.11ad was utilized to transfer the data transmission session from the congested µW band to the mmWave band while considering the effect of mmWave blockage. In [35], a paradigm of µW/mmWave D2D networks was given, in which the µW band was used to assist the construction of the mmWave D2D links. In [36], neighbor discovery (ND) in mmWave Ad Hoc networks was addressed using µW/mmWave interworking. Despite the effectiveness of these research works in exploiting µW/mmWave interworking to boost the performance of mmWave communications, they never touched the mmWave relaying problem. To the best of our knowledge, the work presented in this paper is the first that uses the assistance of the µW band in mmWave multi-hop relay probing while considering both LOS and NLOS states of the mmWave link.

III. SYSTEM MODEL
In this section, we will give the system model of the mmWave multi-hop D2D network in addition to the used µW and mmWave link models. Figure 1 shows the proposed system model of the out-band mmWave multi-hop D2D relaying, which avoids interference with the cellular users. In this network architecture, tri-band devices, i.e., containing LTE, µW and mmWave interfaces, are distributed inside the coverage area of the LTE Macro-BS. The large coverage LTE band will manage the overall process of multi-hop transmissions from the source device to the destination device including D2D scheduling, multi-hop route construction, and route release after completing D2D data transmissions using device to base (D2B) station control links. Both LOS and NLOS paths may be available between two mmWave devices based on the existence of blockages. In the proposed mmWave multi-hop relay probing, the relay probing process will be conducted among the relay devices on the candidate multi-hop routes for constructing the route from source to destination.

B. LINK PROPAGATION MODELS
In this section, we will introduce the used propagation models of the µW and mmWave links including the mmWave LOS blockage model.

1) µW LINK MODEL
For the µW link model, we use the link propagation model of the 5.25 GHz band introduced in [7], where the received power $P_r^{\mu W}$ [dBm] at a µW RX located at a distance $d$ from a µW TX is expressed as: where $P_t^{\mu W}$ represents the TX power in dBm of the µW module, and $\varepsilon_{\mu W} \sim \mathcal{N}(0, \delta_{\mu W})$ represents the dB shadowing term with standard deviation $\delta_{\mu W} = 6$ dB [7].

2) mmWAVE LINK MODEL
For the mmWave link model, we follow the link model given by [37]-[39], where a Bernoulli random variable is used to model the LOS blockage effect as follows: where $P_t^g$ and $P_r^g$ are the TX and RX mmWave powers in Watt, and $d$ is the separation distance in m between the mmWave TX and RX. $\eta(P_{LOS}(d))$ and $\chi(P_{NLOS}(d))$ are two Bernoulli random variables with parameters $P_{LOS}(d)$ and $P_{NLOS}(d)$ indicating the LOS and NLOS probabilities of the mmWave link. $L_g^{LOS}(d)$ and $L_g^{NLOS}(d)$ are the path losses corresponding to the LOS and NLOS paths, which can be expressed in dB as follows [39]:
$$L_g^{\nu}(d)\,[\mathrm{dB}] = \beta_g^{\nu} + 10\,\alpha_g^{\nu} \log_{10}(d) + \varepsilon_g^{\nu}, \tag{3}$$
where $\nu \in \{LOS, NLOS\}$, and $\beta_g^{\nu}$ is the path loss term associated with the reference distance $d_0 = 5$ m, which is equal to $\beta_g^{\nu} = 82.02 - 10\,\alpha_g^{\nu} \log_{10}(d_0)$ [35]. $\alpha_g^{\nu}$ is the path loss exponent and $\varepsilon_g^{\nu} \sim \mathcal{N}(0, \delta_g^{\nu})$ is the dB shadowing term with zero mean and standard deviation $\delta_g^{\nu}$. In (2), $G_{TX}(\theta)$ and $G_{RX}(\phi)$ represent the TX and RX beam gains, where $\theta$ and $\phi$ represent the angle of departure (AoD) and the angle of arrival (AoA). Using the 2D steerable antenna model with Gaussian main lobe profile given in [37], [38], $G_{TX}(\theta)$ can be represented as:
$$G_{TX}(\theta) = G_0 \exp\left(-4 \ln 2 \left(\frac{\theta - \theta_{TX}}{\theta_{-3dB}}\right)^2\right), \tag{4}$$
where $\theta_{TX}$ represents the boresight angle of the TX beam, $G_0$ is the maximum antenna gain, and $\theta_{-3dB}$ is the $-3$ dB beamwidth. The same expression applies to $G_{RX}(\phi)$, except that $\theta_{TX}$ is replaced by $\phi_{RX}$ and $\theta$ is replaced by $\phi$.
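The path loss and beam gain terms of this link model can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: it assumes the additive dB form $\beta + 10\alpha\log_{10}(d)$ for the mean path loss (shadowing omitted) and the commonly used Gaussian main-lobe expression for the beam gain. The function names, the path loss exponent of 2.0 and the 30 degree beamwidth used below are hypothetical choices for illustration only.

```python
import math


def reference_path_loss_db(alpha, d0=5.0):
    """beta = 82.02 - 10*alpha*log10(d0), as stated in the text (d0 = 5 m)."""
    return 82.02 - 10.0 * alpha * math.log10(d0)


def mean_path_loss_db(d, alpha, d0=5.0):
    """Mean path loss in dB at distance d (assumed form; shadowing term omitted)."""
    return reference_path_loss_db(alpha, d0) + 10.0 * alpha * math.log10(d)


def gaussian_beam_gain(theta, boresight, beamwidth_3db, g0=1.0):
    """Gaussian main-lobe gain (linear scale), assumed standard form:
    g0 * exp(-4 ln2 * ((theta - boresight) / beamwidth_3db)^2)."""
    x = (theta - boresight) / beamwidth_3db
    return g0 * math.exp(-4.0 * math.log(2.0) * x * x)


if __name__ == "__main__":
    # alpha = 2.0 is an illustrative value, not taken from the paper.
    print(mean_path_loss_db(5.0, alpha=2.0))
    print(gaussian_beam_gain(15.0, boresight=0.0, beamwidth_3db=30.0))
```

With this form, the mean path loss at $d = d_0$ equals 82.02 dB for any exponent, and the beam gain drops to half of its maximum at an offset of half the $-3$ dB beamwidth from boresight.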

3) mmWAVE LOS BLOCKAGE PROBABILITY
For modeling the mmWave blockage, the distance-dependent blockage model given in [22], [23] is utilized in this paper. This model was applied in a variety of mmWave network analyses and designs, such as mmWave Ad Hoc, D2D and NOMA networks [35], [36]. In this model, random shape theory was used to model the blockages intersecting a mmWave path from TX to RX. Blockages with random shapes, i.e., random heights, widths and orientations, are assumed to be drawn from a Poisson point process (PPP) within the mmWave link between TX and RX [22], [23]. Thus, the number of blockages ($B$) intersecting a mmWave link of distance $d$ is a Poisson random variable with an average value of $(\lambda d + \omega)$. $\lambda$ is a parameter related to the blockage features, i.e., shapes, density, etc., and $\omega$ is based on the settings where the mmWave TXs are located indoors. For large values of $d$, $\omega$ is negligibly small and can be ignored. Thus, the probability of LOS availability, $P_{LOS}(d)$, i.e., the probability that there are no blockages in the mmWave path, can be simply expressed as:
$$P_{LOS}(d) = e^{-\lambda d}, \qquad P_{NLOS}(d) = 1 - P_{LOS}(d). \tag{5}$$
Based on $P_{LOS}(d)$ and $P_{NLOS}(d)$ given in (5), $\eta(P_{LOS}(d))$ and $\chi(P_{NLOS}(d))$ given in (2) can be evaluated accordingly.
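As a small numerical illustration of this blockage model, the sketch below evaluates the LOS probability and draws a random link state. It assumes the exponential form $P_{LOS}(d) = e^{-\lambda d}$ that follows from a Poisson blockage count with mean $\lambda d$ (with $\omega$ ignored), and that a link is NLOS whenever it is not LOS. The function names and the value of $\lambda$ are hypothetical.

```python
import math
import random


def p_los(d, lam):
    """Probability of zero blockages on a link of length d (Poisson mean lam*d)."""
    return math.exp(-lam * d)


def sample_link_state(d, lam, rng):
    """Draw one Bernoulli realization of the link state: 'LOS' or 'NLOS'."""
    return "LOS" if rng.random() < p_los(d, lam) else "NLOS"


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only for reproducibility
    lam = 0.1               # illustrative blockage parameter, not from the paper
    for d in (5.0, 20.0, 50.0):
        print(d, round(p_los(d, lam), 4), sample_link_state(d, lam, rng))
```

The exponential decay is what makes short hops attractive: halving the hop distance replaces $P_{LOS}$ by its square root, which is always larger for probabilities below one.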

IV. PROBLEM FORMULATION OF MMWAVE MULTI-HOP RELAY PROBING
In this section, we will give the optimization problem of the mmWave multi-hop D2D relay probing in addition to the coverage probability of the mmWave multi-hop relaying.

A. OPTIMIZATION PROBLEM
Figure 2 shows an example of mmWave multi-hop D2D routing, which is used to extend the coverage range from the source node ($T_0$) to the destination node ($T_K$) or to route around blockages using $K-1$ relay nodes. The problem of mmWave multi-hop relay probing is to find the best multi-hop route $r_m$ maximizing the overall throughput from $T_0$ to $T_K$ over the path $T_0 - r_m - T_K$. This can be expressed mathematically as: where the objective is the overall throughput in bps from $T_0$ to $T_K$ using multi-hop route $r_m$, $1 \le m \le |\emptyset_P|$.
$|\emptyset_P|$ is the total number of probed routes, where $\emptyset_P$ is a subset of the whole available route space $\emptyset_R$, i.e., $\emptyset_P \subset \emptyset_R$ as given in (6). $\psi_{T_0 - r_m - T_K}$ represents the spectral efficiency in bit/sec/Hz corresponding to route $r_m$. $W$ indicates the used bandwidth, $T_D$ represents the data transmission time and $T_P^{|\emptyset_P|}$ indicates the relay probing time of the $|\emptyset_P|$ probed routes. Apparently, there is a trade-off between probing more routes, i.e., increasing $|\emptyset_P|$, and increasing the overall throughput. Although increasing $|\emptyset_P|$ increases the opportunity of obtaining a good route with a high value of $\psi_{T_0 - r_m - T_K}$, it dramatically increases the value of $T_P^{|\emptyset_P|}$, decreasing the overall throughput accordingly. Without loss of generality, HD with DF relaying is assumed in this paper. Also, we follow the same assumption given in [30] that co-transmissions can occur between alternately located relay nodes within the same route using the same time slot. This can be done thanks to the highly directional mmWave transmissions. Thus, $\psi_{T_0 - r_m - T_K}$ can be expressed as [30]: where $\gamma_{eq}^{(r_m)}$ represents the equivalent SNR of the multi-hop route $r_m$, $P_{r_j}^g$ is the received mmWave power at node $T_j$ from node $T_{j-1}$, and $N_0$ is the noise power at the receiving node $T_j$. In (7), only the noise-limited scenario is considered, as mmWave D2D relay probing is typically performed before data transmissions using dedicated orthogonal control channels [26], [27].
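The trade-off described above can be made concrete with a short sketch. It rests on assumptions that are not spelled out in the surviving text: the equivalent SNR of a DF route is taken as the minimum per-hop SNR, the spectral efficiency is taken as $\frac{1}{2}\log_2(1 + \gamma_{eq})$ (consistent with the threshold $2^{2\psi_{th}} - 1$ used in the coverage analysis), and the throughput is taken as the fraction of time spent on data multiplied by $W\psi$. All function names and numbers are hypothetical.

```python
import math


def equivalent_snr(hop_snrs):
    """Assumed DF behaviour: the weakest hop limits the route (linear SNRs)."""
    return min(hop_snrs)


def spectral_efficiency(hop_snrs):
    """Assumed form: 0.5 * log2(1 + gamma_eq), in bit/sec/Hz."""
    return 0.5 * math.log2(1.0 + equivalent_snr(hop_snrs))


def throughput_bps(hop_snrs, bandwidth_hz, t_data, t_probe):
    """Assumed form: data-time fraction times bandwidth times spectral efficiency."""
    data_fraction = t_data / (t_data + t_probe)
    return data_fraction * bandwidth_hz * spectral_efficiency(hop_snrs)


if __name__ == "__main__":
    route = [15.0, 3.0, 7.0]  # illustrative per-hop linear SNRs
    for t_probe in (0.0, 0.01, 0.05):
        print(t_probe, throughput_bps(route, 2.16e9, 0.1, t_probe))
```

Under these assumptions, a longer probing time can only pay off if it uncovers a route whose spectral efficiency gain outweighs the lost data-time fraction.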
The relay probing process starts from the source node $T_0$ to its next node, i.e., $T_1$, in route $r_m$. If the achievable SNR is larger than or equal to the minimum threshold for constructing the relay link, i.e., $\gamma_1^{(r_m)} \ge \gamma_{th}$, relay probing continues between $T_1$ and $T_2$ within the same route, and so on until probing the link between node $T_{K-1}$ and the destination node $T_K$. In this paper, $\gamma_{th} = P_{rth}^g / N_0$, where $P_{rth}^g = -78$ dBm corresponds to the mmWave receiver sensitivity as given by IEEE 802.11ad [3]. If the condition $\gamma_j^{(r_m)} \ge \gamma_{th}$ is not fulfilled for any $T_{j-1}$ and $T_j$ in route $r_m$, the relay probing process stops and another candidate route starts the relay probing process from $T_0$. Based on this multi-hop relay probing strategy, $T_P^{|\emptyset_P|}$ can be expressed as: where $\tau_g$ is the time duration of relay probing between any two relay nodes, and the remaining term is the number of relay links in route $r_m$ satisfying the condition $\gamma_j^{(r_m)} \ge \gamma_{th}$, which takes values from 1 to $K$.
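The hop-by-hop probing procedure described above can be sketched as a simple loop. This is an illustrative reading of the text rather than the authors' code: each route is represented by a hypothetical list of per-hop linear SNRs, every probed link is charged one probing duration $\tau_g$ (including the link that fails the threshold, which is an assumption about how the probing time is accounted), and probing of a route stops at the first failing link.

```python
def probe_routes(routes, snr_threshold, tau_g):
    """Probe candidate routes hop by hop.

    routes: list of routes, each a list of per-hop linear SNRs (hypothetical input).
    Returns (indices of fully probed routes, total probing time).
    """
    total_time = 0.0
    completed = []
    for index, hop_snrs in enumerate(routes):
        route_ok = True
        for snr in hop_snrs:
            total_time += tau_g  # assumption: every probed link costs one BT duration
            if snr < snr_threshold:
                route_ok = False  # link below threshold: abandon this route
                break
        if route_ok:
            completed.append(index)
    return completed, total_time


if __name__ == "__main__":
    candidate_routes = [[10.0, 2.0, 10.0], [10.0, 10.0, 10.0]]
    print(probe_routes(candidate_routes, snr_threshold=5.0, tau_g=1.0))
```

In the usage example, the first route is abandoned after its second link, so only two probing durations are spent on it before the second route is probed in full.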

B. COVERAGE ANALYSIS OF MMWAVE MULTI-HOP RELAYING
In this section, we give the mathematical equations of the mmWave multi-hop coverage probability with respect to SNR and spectral efficiency against the number of hop relays and LOS availability. Moreover, the relation between increasing the number of probed routes and the enhancements in mmWave spectral efficiency will be deduced.
30564 VOLUME 8, 2020
To calculate the coverage probability of both SNR and spectral efficiency, the PDF of γ = P g r (d)/N 0 , i.e., the SNR conditioned on the distance d, should be evaluated at first. For simplicity the conditional notation is omitted.
Using linear notations, L ν g (d) in (3) can be expressed as: where Z ν g,LN ∼ LN 0,δ ν 2 g,LN denotes a log-normal random variable with δ ν g,LN = 0.1δ ν g ln10 as deduced in [40]. By assuming perfect beam alignment occurs between a TX/RX communicating relay nodes, γ can be written using P g r in (2) as given in (11), as shown at the bottom of this page, where A = N 0 . To simplify the mathematical analysis, it is assumed that perfect beam alignment is achieved using the exhaustive search BT done between the TX/RX communicating relay nodes. The analysis related to beam misalignment is left for our future investigations. Using the scaling property of log-normal r.v., F LOS and F NLOS in (11) can be expressed as given in (12), as shown at the bottom of this page, Using Fenton-Wilkinson method [41], f γ (γ ) conditioned on d can be approximated by another log-normal r.v. with: where LN , and δ 2 NLOS = δ NLOS 2 g,LN . Thus, F γ (γ ) conditioned on d can be written as: where erfc is the complementary error function, and Q is the cumulative distribution function of the standard normal distribution.
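For readers unfamiliar with the Fenton-Wilkinson method, the sketch below shows its generic moment-matching step for a sum of independent log-normal terms, together with the dB-to-natural-log conversion $\delta_{LN} = 0.1\,\delta \ln 10$ quoted above. It does not reproduce how the paper weights the LOS and NLOS terms by their Bernoulli probabilities, since those equations are not available here; the function names and example parameters are hypothetical.

```python
import math


def db_to_ln_sigma(sigma_db):
    """Convert a dB standard deviation to natural-log units: 0.1 * sigma_db * ln(10)."""
    return 0.1 * sigma_db * math.log(10.0)


def fenton_wilkinson(components):
    """Match the mean and variance of a sum of independent log-normals.

    components: list of (mu, sigma) pairs in natural-log units.
    Returns (mu, sigma) of the single approximating log-normal.
    """
    mean = sum(math.exp(mu + 0.5 * s * s) for mu, s in components)
    var = sum((math.exp(s * s) - 1.0) * math.exp(2.0 * mu + s * s)
              for mu, s in components)
    sigma_sq = math.log(1.0 + var / (mean * mean))
    mu_sum = math.log(mean) - 0.5 * sigma_sq
    return mu_sum, math.sqrt(sigma_sq)


if __name__ == "__main__":
    # Two illustrative terms with different spreads (values are not from the paper).
    print(fenton_wilkinson([(0.0, db_to_ln_sigma(2.0)), (-1.0, db_to_ln_sigma(5.0))]))
```

A useful sanity check of the moment matching is that a single log-normal term is returned unchanged.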
The SNR coverage probability $P_{cov}^{SNR}$ of the mmWave multi-hop relaying is the probability that $\gamma_{eq}^{(r_m)}$ given in (8) is greater than or equal to a predefined threshold $\gamma_{eq}^{th}$, averaged over all $d_j$ values. Mathematically speaking, this can be expressed as: In (16), $r_m$ is omitted due to the averaging over all $d_j$ values. Also, (a) follows from the independent and identically distributed (i.i.d.) $\gamma_j$'s, (b) comes from substituting the value of $F_\gamma(\gamma)$ given in (15), and (c) comes from the assumption that the $K-1$ relay nodes are uniformly distributed between the source and the destination nodes. Thus, $S_K$ is the space between two adjacent relay nodes in the multi-hop route based on the number of relay links $K$. That is, as $K$ increases, $S_K$ shrinks, and vice versa, where $D_K$ is the radius of the space $S_K$. From (16), which can be solved using numerical integration, as the number of relay links $K$ increases, the SNR coverage of the multi-hop route also increases. Also, as $P_{LOS}(d)$ increases due to a decrease in $d$, $\mu_{LOS}$ increases, which increases the SNR coverage, and vice versa.
To calculate the spectral efficiency coverage, (16) can be used except that $\gamma_{eq}^{th}$ is replaced by $2^{2\psi_{th}} - 1$, where $\psi_{th}$ denotes the spectral efficiency threshold. Similarly, as the number of relay links increases, the spectral efficiency coverage of the multi-hop route also increases.
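The structure of the coverage calculation can be illustrated with a simplified sketch. It assumes that each hop SNR is log-normal with known parameters and that hops are independent, so the route coverage is the product of the per-hop tail probabilities. It does not perform the averaging over hop distances used in the paper's expression. The function names and the example parameters are hypothetical.

```python
import math


def lognormal_cdf(x, mu, sigma):
    """CDF of a log-normal r.v. with parameters (mu, sigma), written with erfc."""
    if x <= 0.0:
        return 0.0
    return 0.5 * math.erfc(-(math.log(x) - mu) / (sigma * math.sqrt(2.0)))


def multihop_snr_coverage(threshold, hop_params):
    """P(min_j gamma_j >= threshold) for independent log-normal hop SNRs."""
    coverage = 1.0
    for mu, sigma in hop_params:
        coverage *= 1.0 - lognormal_cdf(threshold, mu, sigma)
    return coverage


def snr_threshold_from_se(psi_th):
    """SNR threshold matching a spectral efficiency threshold: 2^(2*psi_th) - 1."""
    return 2.0 ** (2.0 * psi_th) - 1.0


if __name__ == "__main__":
    hops = [(2.0, 0.8), (2.5, 0.8), (1.8, 1.0)]  # illustrative (mu, sigma) per hop
    print(multihop_snr_coverage(snr_threshold_from_se(1.0), hops))
```

For fixed per-hop statistics the product shrinks as hops are added; the increase in coverage with $K$ stated in the text arises because shorter hops improve each hop's statistics, which this simplified sketch leaves to the caller through `hop_params`.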
To deduce the relationship between increasing $|\emptyset_P|$ and enhancing the obtained spectral efficiency, let: Thus, the spectral efficiency coverage, $P_{cov}^{\Upsilon}$, against $|\emptyset_P|$ can be expressed as: where $\Upsilon_{th}$ is the pre-defined threshold, and (a) follows from the i.i.d. $\psi_m$'s corresponding to the routes $r_m$. Also, (b) comes from substituting $F_\gamma(\gamma)$ while considering (7). It is clearly shown by (18) that as the value of $|\emptyset_P|$ increases, $P_{cov}^{\Upsilon}$ also increases. This means that higher spectral efficiency is obtained by increasing the number of probed routes. However, this dramatically affects the overall throughput as given in (6). As previously explained, the work given in [26] optimizes the value of $|\emptyset_P|$ for maximizing the overall throughput from $T_0$ to $T_K$ using only online relay probing. However, better throughput performance can be obtained if $\psi_{T_0 - r_m - T_K}$ can be anticipated for all routes in $\emptyset_R$ beforehand, so that only the routes expected to maximize $\psi_{T_0 - r_m - T_K}$ are used in the route probing process, as proposed in this paper.
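The effect of the number of probed routes on the coverage of the best route can be illustrated with a one-line calculation. The sketch assumes that the routes have i.i.d. spectral efficiencies, so the probability that the best of $n$ probed routes reaches the threshold is $1 - F^n$, where $F$ is the per-route probability of falling below the threshold. The function name and the value of $F$ are hypothetical.

```python
def best_route_coverage(single_route_cdf, num_probed):
    """P(max over num_probed i.i.d. routes >= threshold) = 1 - F^num_probed."""
    if not 0.0 <= single_route_cdf <= 1.0:
        raise ValueError("single_route_cdf must be a probability")
    if num_probed < 1:
        raise ValueError("num_probed must be at least 1")
    return 1.0 - single_route_cdf ** num_probed


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, best_route_coverage(0.5, n))
```

The gain from each additional route shrinks geometrically while the probing time grows with every probed route, which is the diminishing-returns behaviour behind the trade-off discussed above.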

V. PROPOSED MMWAVE MULTI-HOP RELAY PROBING
In this section, we give the proposed mmWave multi-hop relay probing scheme based on µW RSS, including its multi-band management protocol, the offline routes pre-selection phase and the online candidate routes probing phase. Figure 3 shows the proposed multi-band management protocol based on the mmWave D2D system model given in Figure 1. In Figure 1, the Macro-BS provides the global signalling control and orchestration over all distributed multi-band devices via the D2B control links. Thus, it retrieves essential measurement reports, such as device locations, traffic demands, mobility, QoS requirements, etc., as given in Figure 3. If a mmWave multi-hop route needs to be constructed between a source device $T_0$ and a destination device $T_K$, the Macro-BS triggers their surrounding candidate devices to turn ON their µW modules. Afterwards, µW RSS measurement request (RSS M. Req.) and measurement response (RSS M. Res.) frames are exchanged among the surrounding devices, including the source and the destination nodes, for measuring the µW RSS values among them. The Macro-BS collects these values using the D2B links, and then pre-selects the candidate mmWave multi-hop routes expected to maximize the overall spectral efficiency from source to destination, as will be given in Section V.B. Subsequently, the Macro-BS informs the devices located within these pre-selected routes to turn ON their mmWave modules for online mmWave relay probing, as will be explained in Section V.C. After finalizing the online relay probing process through the selected routes, the mmWave SNRs and the best TX/RX beams among the devices located within these routes are collected by the Macro-BS using the D2B links. Then, the best route among them, maximizing the total spectral efficiency as given in (17), is selected.
The Macro-BS will inform the relay devices located within this best route for constructing the mmWave link from source to destination using their best TX/RX beam pairs and start mmWave data transmissions.

B. OFFLINE PHASE OF ROUTES PRE-SELECTION
In this phase, based on the µW RSS among the surrounding devices collected at the Macro-BS, a probabilistic metric is proposed. This probabilistic metric is used to anticipate the SNR coverage of a mmWave link $j$ based on the µW RSS of this link. Based on this metric, a hierarchical search is performed to let the Macro-BS pre-select the mmWave multi-hop routes expected to maximize the spectral efficiency of the whole link from source to destination.
Based on the µW RSS received by node $T_j$ from node $T_{j-1}$, the Macro-BS can anticipate the PDF of the mmWave SNR at $T_j$ from $T_{j-1}$. This can be done by evaluating the expected value and the PDF of $\hat{d}_j$, which is the estimated separation distance between $T_{j-1}$ and $T_j$, based on the µW RSS. Then, the LOS probability and the PDF of the mmWave SNR between them can be anticipated. This can be done through the following two steps:
From (1), the expected value of the estimated separation distance $\hat{d}_j$ of link $j$ between nodes $T_{j-1}$ and $T_j$ can be expressed as: where $C_1 = P_t^{\mu W} - 47.7$, and $P_{r_j}^{\mu W}$ is the received µW power at device $T_j$ from device $T_{j-1}$. Also, using random variable transformation (RVT) [42], the PDF of $\hat{d}_j$, $\hat{d}_j \ge 0$, i.e., $f_{\hat{d}_j}(\hat{d}_j)$, can be written as (see Appendix A):
• CALCULATION OF $\hat{P}_{LOS}(E[\hat{d}_j])$ AND $f_{\hat{\gamma}_j}(\hat{\gamma}_j)$
Recalling (5), $E[\hat{d}_j]$ in (19) can be used to estimate the LOS probability of the mmWave link $j$ as follows: Using (2) along with $f_{\hat{d}_j}(\hat{d}_j)$ in (20) and $\hat{P}_{LOS}(E[\hat{d}_j])$ in (21), the PDF of the estimated mmWave SNR, i.e., $f_{\hat{\gamma}_j}(\hat{\gamma}_j)$, where $\hat{\gamma}_j$ is a r.v. indicating the estimated mmWave SNR at $T_j$ from $T_{j-1}$, can be expressed as: whereγ j =P

2) PROPOSED PROBABILISTIC METRIC BASED HIERARCHICAL SEARCH ALGORITHM
After evaluating $f_{\hat{\gamma}_j}(\hat{\gamma}_j)$, a link probabilistic metric is used to measure the expected SNR coverage of a link $j$ belonging to route $r_m$. This can be expressed mathematically as: where the link probabilistic metric measures the expected mmWave coverage of a relay link $j$ between node $T_{j-1}$ and node $T_j$ in route $r_m$, as shown in Figure 4, and $\gamma_{th}$ is given in Section IV.A. Based on the calculated link metrics, the Macro-BS conducts a hierarchical search through $\emptyset_R$ to enumerate the routes expected to maximize the overall spectral efficiency from $T_0$ to $T_K$. To that end, a path probabilistic metric $Q_j^{(r_m)}$ is defined to measure the expected SNR coverage for a route $r_m$ up to relay node $T_j$, which is defined as: If different routes pass through the same node $T_j$, only the route having the minimum $Q_j^{(r_m)}$ value survives, and the other routes are eliminated from further consideration. This can be expressed as: where $r_s$, $1 \le s \le |\emptyset_s|$, indicates a survival route and $\emptyset_s$ indicates the space of the survival routes, which is a subset of $\emptyset_R$, i.e., $\emptyset_s \subset \emptyset_R$. Then, the operation of eliminating unnecessary routes continues until the destination node is reached. Figure 4 shows an example of the proposed probabilistic metric based hierarchical search routine. This example shows 3 candidate routes using 3 relay links. In this example, the path metric of route 2 has the minimum value, so route 2 is considered the survival route and the other routes are eliminated from further consideration.
At the end of the hierarchical search over the whole available route space, the Macro-BS collects the survival routes. Based on the value of $Q_{(r_s)_K}$ calculated at the destination node for each survival route (see figure 4), the Macro-BS sorts the routes $r_s$ in descending order. Then, it selects the group of ''good'' routes with high values of $Q_{(r_s)_K}$ as follows: where $r_a$, $1 \le a \le |\emptyset_a|$, indicates the index of a good route, and $\emptyset_a$ is the space of ''good'' routes, which is a subset of $\emptyset_s$, i.e., $\emptyset_a \subset \emptyset_s$, as given in (26). Figure 5 summarizes the proposed probabilistic metric based hierarchical search algorithm.
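The survivor-elimination step described above can be sketched as follows. This is only an illustrative sketch, not the algorithm of figure 5: the link metric values are taken as given inputs, the path metric is assumed to accumulate additively over the links of a route, and pruning is applied at intermediate relay nodes only, so that one survival route per last relay reaches the destination. The names `survival_routes` and `link_metric` are hypothetical.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str]
Route = Tuple[float, List[str]]  # (path metric, ordered list of nodes)


def survival_routes(
    source: str,
    dest: str,
    relays: List[str],
    hops: int,
    link_metric: Dict[Link, float],
) -> List[Route]:
    """Enumerate survival routes made of exactly `hops` links.

    Assumption: the path metric is the sum of the link metrics along the
    route (the actual definition may differ). Following the text, when
    several routes meet at the same relay node, only the one with the
    minimum path metric is kept.
    """
    # One survivor per node reached so far: node -> (path metric, path).
    survivors: Dict[str, Route] = {source: (0.0, [source])}

    # Extend the survivors relay by relay, leaving the last hop for `dest`.
    for _ in range(hops - 1):
        extended: Dict[str, Route] = {}
        for node, (metric, path) in survivors.items():
            for relay in relays:
                # Skip loops and links with no available metric.
                if relay in path or (node, relay) not in link_metric:
                    continue
                candidate = metric + link_metric[(node, relay)]
                # Keep only the minimum-metric route entering this relay.
                if relay not in extended or candidate < extended[relay][0]:
                    extended[relay] = (candidate, path + [relay])
        survivors = extended

    # Final hop: every remaining survivor is completed to the destination.
    finished: List[Route] = []
    for node, (metric, path) in survivors.items():
        if (node, dest) in link_metric:
            finished.append((metric + link_metric[(node, dest)], path + [dest]))
    return finished
```

Keeping a single survivor per relay node is what keeps the search tractable: the number of routes carried from one stage to the next is bounded by the number of relay nodes instead of growing with every hop.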

C. ONLINE PHASE OF MULTI-HOP RELAY PROBING
After enumerating $r_a$, $1 \le a \le |\emptyset_a|$, the Macro-BS informs the nodes located within the selected routes to start online relay probing, beginning from the source node, using the relay probing process explained in Section IV.A. Then, the route that maximizes the overall spectral efficiency from source to destination among the probed routes is selected for constructing the link, which can be expressed as: From the above explanations, the overall throughput of the proposed scheme can be expressed as: where $T_P^{|\emptyset_a|}$ indicates the relay probing time of the $|\emptyset_a|$ selected routes, which can be evaluated using (9) with $|\emptyset_a|$ in place of $|\emptyset_P|$. The added term $T_{Off}$ accounts for the time consumed in the offline phase, which mainly comes from the exchange of µW RSS measurement frames and can be expressed as: where $N$ is the total number of distributed nodes, and $\tau_{\mu W}$ is the time duration of the µW RSS M. Req or µW RSS M. Res frames, which is typically on the order of microseconds. It is stated in [35] that typically $\tau_g \gg \tau_{\mu W}$. Thus, $T_{Off}$ is negligible compared to $T_P^{|\emptyset_a|}$, and $T_P^{|\emptyset_a|}$ has the dominant effect on the throughput of the path $T_0 - r_a - T_K$. The proposed scheme therefore not only identifies the good routes for the online relay probing process but also reduces the number of probed routes. This increases the overall throughput obtained by the proposed scheme while reducing its energy consumption. Furthermore, thanks to the tiny value of $T_{Off}$ and the low computational complexity of the proposed hierarchical search algorithm, the proposed multi-hop relay probing scheme is scalable and can be applied to dense mmWave D2D relay networks.
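As a rough illustration of the final selection step of the online phase, the sketch below picks, among the probed routes, the one with the largest spectral efficiency. It assumes that a route is limited by its weakest link and that its spectral efficiency is $\log_2(1+\gamma)$ of that bottleneck SNR; both are simplifying assumptions made for illustration. The function names and the SNR values are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def route_spectral_efficiency(link_snrs: List[float]) -> float:
    """Spectral efficiency (bps/Hz) of one route from its linear link SNRs."""
    # Assumption: the weakest link limits the whole multi-hop route.
    bottleneck = min(link_snrs)
    return math.log2(1.0 + bottleneck)


def select_best_route(probed: Dict[str, List[float]]) -> Tuple[str, float]:
    """Return the name and spectral efficiency of the best probed route."""
    best = max(probed, key=lambda name: route_spectral_efficiency(probed[name]))
    return best, route_spectral_efficiency(probed[best])


# Hypothetical linear-scale link SNRs measured during online probing.
probed_routes = {"r1": [10.0, 3.0, 8.0], "r2": [5.0, 7.0, 6.0]}
best_route, best_se = select_best_route(probed_routes)
```

Here `r1` contains the strongest single link, but its weakest link is worse than any link of `r2`, so `r2` is selected under the bottleneck assumption.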

VI. SIMULATION ANALYSIS
In this section, numerical simulations are conducted to verify the derived mathematical expressions of the SNR and rate (i.e., spectral efficiency) coverage, as well as the relationship between increasing the number of probed routes and enhancing the spectral efficiency of mmWave multi-hop routing. The mathematical results come from directly solving the aforementioned equations, while the simulation results come from extensive Monte Carlo simulations. Moreover, the effectiveness of the proposed multi-hop relay probing scheme over the relay probing strategy proposed in [26] is demonstrated via numerical simulations. In the simulation scenario, 20 users are uniformly distributed in an area of 200 m × 200 m. Different scenarios of mmWave multi-hop routing are considered throughout the simulations, including different numbers of relay links and different blockage densities. Table 1 gives the simulation parameters used.

A. SNR AND RATE COVERAGE OF mmWAVE MULTIHOP ROUTING
In this part of the simulation analysis, the coverage probabilities of the SNR in dB and the rate in bps/Hz of mmWave multi-hop routing are given under different scenarios. Figure 6 shows the coverage probability of the SNR given in (16) for different numbers of relay links $K$ at $\lambda = 0$. Moreover, figure 7 gives the coverage probability of the spectral efficiency for different numbers of relay links $K$ at $\lambda = 0$. As shown by these figures, the mathematical results ''Theory'' and the simulation results ''Sim'' match closely. Also, as the number of relay links increases, both the SNR coverage and the rate coverage increase. For example, at SNR = −5 dB, the SNR coverage probability is increased by 60% using $K = 6$ over using $K = 2$. Also, from figure 7, at a spectral efficiency of 1.75 bps/Hz, the rate coverage probability is increased by 60% using $K = 6$ over using $K = 2$. Figure 8 gives the average SNR against the blockage density $\lambda$. The mathematical results given in figure 8 come from solving the following equation using numerical integration: where $f_{\gamma_{eq}}(x)$ can be evaluated as $f_{\gamma_{eq}}(x) = K f_{\gamma}(x)[1 - F_{\gamma}(x)]^{K-1}$, with $F_{\gamma}(x)$ the CDF of a single link SNR, assuming independent and identically distributed (i.i.d.) $\gamma_j$'s as in (16), and $E[d_j]$ is evaluated in the same way as in (16). From figure 8, the results obtained by solving the mathematical equations closely match those obtained from Monte Carlo simulations. As shown in figure 8, as $\lambda$ increases, the average SNR decreases for all simulated numbers of relay links. This comes from the increase in $P_{NLOS}$, which decreases the received mmWave power at the relay nodes and, consequently, the SNR. This effect is also shown in figure 9, where the average spectral efficiency decreases as $\lambda$ increases for all simulated numbers of relay links. However, using a higher number of relay links, i.e., higher values of $K$, still yields a higher average SNR and spectral efficiency, even in a harsh blockage environment.
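The comparison between numerical integration and Monte Carlo simulation can be illustrated with a small sketch. It reads the density of the equivalent SNR as that of the smallest of $K$ i.i.d. link SNRs, $K f(x)[1 - F(x)]^{K-1}$, where $F$ is the CDF of a single link. The unit-mean exponential distribution used here is only a stand-in chosen because its minimum has a known mean of $1/K$; it is not the SNR model of this paper, and all function names are hypothetical.

```python
import math
import random
from typing import Callable


def min_pdf(x: float, k: int, pdf: Callable[[float], float],
            cdf: Callable[[float], float]) -> float:
    """Density of the minimum of k i.i.d. variables with the given pdf/cdf."""
    return k * pdf(x) * (1.0 - cdf(x)) ** (k - 1)


def mean_by_integration(density: Callable[[float], float],
                        upper: float, steps: int = 20000) -> float:
    """Trapezoidal estimate of the integral of x * density(x) over [0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        x = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x * density(x)
    return total * h


def mean_by_simulation(k: int, draw: Callable[[], float], trials: int) -> float:
    """Monte Carlo estimate of the mean of the minimum of k i.i.d. draws."""
    return sum(min(draw() for _ in range(k)) for _ in range(trials)) / trials


# Placeholder link distribution: unit-mean exponential (illustration only).
exp_pdf = lambda x: math.exp(-x)
exp_cdf = lambda x: 1.0 - math.exp(-x)

rng = random.Random(1)
theory = mean_by_integration(lambda x: min_pdf(x, 4, exp_pdf, exp_cdf), 30.0)
simulated = mean_by_simulation(4, lambda: rng.expovariate(1.0), 20000)
```

With this placeholder distribution, both estimates should lie close to $1/K = 0.25$ for $K = 4$, mirroring the kind of theory-versus-simulation agreement reported in figure 8.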

B. THE EFFECT OF INCREASING THE NUMBER OF PROBED ROUTES
In this part of the simulation analysis, we study the effect of increasing the number of probed routes $|\emptyset_P|$ on the spectral efficiency coverage and the average throughput of the whole link from source to destination under different scenarios. Figures 10 and 11 show the spectral efficiency coverage of mmWave multi-hop routing for different numbers of probed routes under $\lambda = 0$ and $K = 5$ and 6, respectively. In these figures, the mathematical results come from directly solving (18) under different values of $|\emptyset_P|$ using the specified values of $K$ and $\lambda$. As shown by these figures, the mathematical results closely match those obtained from numerical simulations. Moreover, as $|\emptyset_P|$ increases, the spectral efficiency coverage increases, as proved in (18). Furthermore, as the number of relay links increases, i.e., for higher values of $K$, more spectral efficiency coverage is obtained for the same values of $|\emptyset_P|$, as shown in figures 10 and 11. It is interesting to note that the relation between increasing $|\emptyset_P|$ and enhancing the spectral efficiency is nonlinear. For example, in figure 10, increasing $|\emptyset_P|$ from 20 to 60 increases the coverage probability at a spectral efficiency of 0.8 bps/Hz by 40%. However, increasing $|\emptyset_P|$ from 100 to 180 increases the coverage probability by only 10%. This means that beyond a certain value of spectral efficiency, a slight increase in spectral efficiency requires a large increase in $|\emptyset_P|$. Figures 12 and 13 show the trade-off, explained in Section IV.A, between increasing the target spectral efficiency of mmWave multi-hop routing and the obtained average throughput using $K = 5$ and 6, respectively. In these figures, the average throughput is evaluated under different blockage environments, i.e., using different values of $\lambda$. As the target spectral efficiency increases, a higher average throughput is obtained by using a higher number of probed routes.
However, beyond a certain value of spectral efficiency, a very high number of probed routes is needed to reach the target spectral efficiency, due to the aforementioned nonlinear behavior, which dramatically decreases the average throughput of the multi-hop link. Thus, for each value of $\lambda$, there is a maximum average throughput corresponding to a certain target spectral efficiency. This maximum is reached by probing only the corresponding number of routes. As $\lambda$ increases, the average throughput decreases sharply due to the strong increase in the blockage effect. However, increasing $K$ results in better average throughput, as shown by the results in figure 13 compared to those in figure 12, especially at low values of $\lambda$.
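The diminishing return of probing more routes can be illustrated with a toy model. Assuming, purely for illustration, that each probed route independently exceeds the target spectral efficiency with the same probability $p$, the probability that the best of $n$ probed routes exceeds it is $1 - (1 - p)^n$. This is not the expression in (18); the function name and the value of $p$ are hypothetical.

```python
def best_of_n_coverage(p_single: float, n_routes: int) -> float:
    """Probability that at least one of n i.i.d. routes meets the target."""
    # Assumption: routes are independent and share the same success probability.
    return 1.0 - (1.0 - p_single) ** n_routes


# Hypothetical per-route probability of meeting the target spectral efficiency.
p = 0.01

# Coverage gain from adding 40 routes early versus 80 routes late.
early_gain = best_of_n_coverage(p, 60) - best_of_n_coverage(p, 20)
late_gain = best_of_n_coverage(p, 180) - best_of_n_coverage(p, 100)
```

Under this toy model, adding 80 routes late yields a smaller coverage gain than adding 40 routes early, which is the same qualitative behavior as the nonlinear relation observed in figure 10. Since each additional probed route also adds probing time, this saturation is what produces a throughput peak at a finite number of probed routes.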

C. PERFORMANCE COMPARISONS
In this section, the proposed route probing scheme is compared with the relay probing scheme given in [26]. For a fair comparison, we use the same performance metrics as in [26], namely the average throughput and the average number of probed routes. We also measure the average energy consumption, which is a critical performance metric in mmWave D2D relay probing. As previously explained, the methodology given in [26] is based on evaluating the average number of optimum routes corresponding to the peak values of the average throughput given in figures 12 and 13. Then, only this optimum number of routes is used in the online route probing process. Figure 14 shows the average throughput comparisons using $K = 5$ and $K = 6$ under different values of $\lambda$. In addition to the average throughput of the proposed scheme and of the scheme given in [26], this figure also gives the performance of optimal route probing. Here, we suppose that there is an optimal route probing scheme that can select the route with the highest spectral efficiency, so that only this route is used for online probing. This corresponds to $|\emptyset_P| = 1$ and, consequently, $T_P^{|\emptyset_P|} = K\tau_g$. The optimal average throughput is given for the purpose of comparison. As shown in figure 14, the average throughput of the proposed scheme is much better than that of the scheme given in [26] at all $\lambda$ values for both $K = 5$ and $K = 6$. This comes from the proposed route probing strategy, in which all available routes are explored in the offline phase and only those expected to maximize the spectral efficiency of the source-to-destination link are used in the online probing phase. In the scheme given in [26], on the other hand, a fixed number of routes corresponding to the peaks of figures 12 and 13 is probed without exploring the other available routes.
At $\lambda = 0$, increases of about 49% and 62% in average throughput are obtained using the proposed scheme over that given in [26] for $K = 5$ and $K = 6$, respectively. These gains grow further as $\lambda$ increases, reaching around 300% at $\lambda = 0.01$. However, both schemes still perform worse than the optimal one in a harsh blockage environment, which leaves room for more sophisticated mmWave route probing. Figure 15 shows the average number of probed routes corresponding to the average throughput of the compared schemes given in figure 14. As shown in this figure, the proposed probing scheme uses a lower number of probed routes than the scheme given in [26] for all $\lambda$ values and for both $K = 5$ and $K = 6$. It is interesting to note that, although both schemes use almost the same number of probed routes at $\lambda = 0$, the proposed one achieves higher throughput. This is because the proposed scheme uses the expected good routes for the online probing phase, while the scheme given in [26] probes a fixed number of routes irrespective of whether they are good ones. Figure 16 shows the average energy consumption, in millijoules, of the compared schemes. The energy consumption of the scheme given in [26] is evaluated as: where $T_P^{|\emptyset_c|}$ is the route probing time taken by the scheme given in [26], and $|\emptyset_c|$ is its average number of probed routes. For the proposed scheme, the total energy consumption can be calculated as: where $T_P^{|\emptyset_a|}$ and $T_{Off}$ are given in (28) and (29). As the term $P_t^{\mu W} T_{Off}$ is negligibly small compared to the term $P_t^{g} T_P^{|\emptyset_a|}$ (see Table 1), $P_t^{g} T_P^{|\emptyset_a|}$ is the dominant term in (32). Because $|\emptyset_a|$ is smaller than $|\emptyset_c|$, a large reduction in energy consumption is obtained using the proposed scheme over that given in [26] at all values of $\lambda$ for both $K = 5$ and $K = 6$, as shown in figure 16.
Moreover, the reduction in energy consumption grows as $\lambda$ increases. For example, at $\lambda = 0$, an almost 16% decrease in average energy consumption is obtained using the proposed scheme for both values of $K$. At $\lambda = 0.01$, however, decreases of about 40% and 53% in average energy consumption are obtained using the proposed scheme for $K = 5$ and $K = 6$, respectively. This comes from the large reductions in the average number of probed routes achieved by the proposed scheme at high values of $\lambda$, as given in figure 15.
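The link between the number of probed routes and the energy consumption can be sketched with a simple model. It assumes that the probing energy is the mmWave transmit power times the probing time, plus a µW term for the offline phase, and that the probing time grows linearly as (number of routes) × $K$ × $\tau_g$, which is consistent with the single-route case $K\tau_g$ mentioned above but is otherwise an assumption. All parameter values and function names below are hypothetical and are not taken from Table 1.

```python
def probing_time(n_routes: float, k_links: int, tau_g: float) -> float:
    """Online probing time in seconds."""
    # Assumption: time scales linearly with the number of routes and links.
    return n_routes * k_links * tau_g


def energy_mj(n_routes: float, k_links: int, tau_g: float, p_mm_w: float,
              t_off: float = 0.0, p_uw_w: float = 0.0) -> float:
    """Energy in millijoules: mmWave probing term plus optional offline term."""
    joules = p_mm_w * probing_time(n_routes, k_links, tau_g) + p_uw_w * t_off
    return 1e3 * joules


def relative_saving(e_reference: float, e_new: float) -> float:
    """Fractional energy reduction of e_new with respect to e_reference."""
    return 1.0 - e_new / e_reference


# Hypothetical values: 10 probed routes for the reference scheme, 6 for the
# two-phase scheme, K = 5 links, 1 ms per link probing, 10 mW mmWave power,
# and a 1 microsecond offline phase at 100 mW.
e_reference = energy_mj(10, 5, 1e-3, 0.01)
e_two_phase = energy_mj(6, 5, 1e-3, 0.01, t_off=1e-6, p_uw_w=0.1)
saving = relative_saving(e_reference, e_two_phase)
```

With a negligible offline term, the saving is essentially one minus the ratio of the numbers of probed routes, which is why larger reductions in probed routes at high $\lambda$ translate directly into larger energy savings.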

VII. CONCLUSION
In this paper, the problem of relay probing in mmWave multi-hop routing was explored. A trade-off exists between probing more routes and improving the average throughput and energy consumption of the source-to-destination link. This optimization problem was formulated, and a heuristic scheme based on the interworking between the µW and mmWave bands was proposed to address it efficiently. The proposed route probing scheme consists of two phases, an offline phase and an online phase. A probabilistic metric based hierarchical search is conducted in the offline phase, over the whole available route space, to enumerate the good routes expected to maximize the spectral efficiency of the source-to-destination link. The probabilistic metric is calculated from the µW RSS of each relay link in each available route. Then, only the enumerated good routes are used in the online route probing phase. Mathematical analysis was performed to show the effectiveness of multi-hop routing in mmWave communications and the relation between increasing the number of probed routes and the resulting enhancement in spectral efficiency. Simulation analysis was conducted to verify the mathematical findings and to demonstrate the effectiveness of the proposed mmWave route probing scheme over the existing technique. As a use case, the proposed scheme can be used for unmanned aerial vehicle (UAV) networks. In these networks, multi-hop UAV-to-UAV routing with high throughput and low energy consumption is highly desirable due to the high mobility of the UAVs along with their very limited power capacity. Also, as a future extension, machine learning (ML) tools in the form of reinforcement learning (RL) will be investigated to further improve the mmWave multi-hop relay probing process.

APPENDIXES
APPENDIX A
CALCULATION OF
From (1) and using RVT [42], the PDF of the estimated distance, i.e., $f_{\hat{d}}(\hat{d})$, based on the received P Substituting (34) and (35) Therefore, for any relay link $j$, $f_{\hat{d}_j}(\hat{d}_j)$ can be written as given in (20). By considering $f_{\hat{d}}(\hat{d})$ given in (36), $L_{\nu}^{g}$ given by (10) becomes a function of two independent r.v.s, namely $\hat{d}$ and $Z_{\nu}^{g,LN}$. Let $T = \hat{d}^{\alpha_{\nu}^{g}}$; thus, $f_T(T)$ can be written using RVT [42] as: Using $f_{\hat{d}}(\hat{d})$, $f_T(T)$ can be written as: Thus, $f_{L_{\nu}^{g}}(L_{\nu}^{g})$ can be expressed using the multiplication of two random variables, i.e., $Z_{\nu}^{g,LN}$ and $T$, as [41]: By substituting the PDF of $Z_{\nu}^{g,LN}$ and (37) into (38), $f_{L_{\nu}^{g}}(L_{\nu}^{g})$ can finally be written as given in (40), which can be solved numerically.