TCP Performance Over Satellite-Based Hybrid FSO/RF Vehicular Networks: Modeling and Analysis

Recent years have witnessed a growing interest in Internet access from space assisted by low Earth orbit (LEO) satellite networks. In the domain of last-mile access for the Internet of Vehicles (IoV), hybrid free-space optical (FSO)/radio-frequency (RF) communication has recently attracted worldwide research efforts. While the transmission control protocol (TCP) is the most widely deployed transport protocol on the Internet, its performance in the error-prone environment of LEO satellite-assisted hybrid FSO/RF vehicular networks is not well understood. This paper develops a comprehensive analytical model of TCP performance based on the cross-layer approach, considering the FSO and RF satellite fading channels, modeled by the Gamma-Gamma and Nakagami-$m$ distributions, respectively. Error-control solutions, including the Reed-Solomon (RS) code and selective-repeat automatic repeat request (SR-ARQ), are also employed. Numerical results quantitatively demonstrate the impact of transmission errors at last-mile links and of different parameters/settings of the error-control solutions on the TCP performance. The paper also supports the selection of proper TCP variants for the considered networks.


I. INTRODUCTION
Thanks to rapid technological development, several well-known companies have been working for years on plans to provide high-speed Internet access anywhere on Earth, including SpaceX's Starlink, Amazon's Project Kuiper, the UK's OneWeb, and ViaSat. The typical approach is to use low Earth orbit (LEO) satellites in relative proximity to Earth's surface. The shorter data-transfer delay of LEO satellites compared with geostationary Earth orbit (GEO) satellites is one of the key reasons why the planned mega-constellations use them. Nevertheless, the challenge is that LEO satellites continuously orbit the Earth. As a result, they are accessible only for a short period from any one point on Earth. Therefore, a world-spanning network of thousands of such LEO satellites is supposed to enable quick data connections and the transfer of large quantities of data. These are substantial opening steps for the ''Internet from Space'' era with satellite Internet access provision. With the Internet from Space, the Internet of Vehicles (IoV), one of the most rapidly developing applications of the Internet of Things (IoT) with massive access, has become possible. Satellite-assisted IoV will soon emerge as a scenario for the fifth- and sixth-generation (5G and 6G) networks.
The associate editor coordinating the review of this manuscript and approving it for publication was Young Jin Chun.
In most existing satellite-assisted Internet networks, radio frequency (RF) bands are used for last-mile access. However, because of its limited data rates, RF may not be effective for high-speed satellite networks. Free-space optical (FSO) communication, with higher data rates, lower power, and higher directivity due to ultra-narrow beams [1], is therefore being considered for these networks. The challenge is that FSO is highly susceptible to atmospheric turbulence and heavily dependent on weather effects such as clouds [2]. On the other hand, traditional RF communication is more robust and less affected by atmospheric disturbances or weather conditions. Therefore, to take advantage of both the high data rates of FSO and the reliability of RF, the hybrid FSO/RF system emerges as a promising candidate for last-mile access in satellite-assisted Internet networks.
In the transport layer, the transmission control protocol (TCP) is the most popular protocol for Internet applications that require reliable transmission [3]. However, it is challenging for TCP, especially the standard variant using the Additive Increase and Multiplicative Decrease (AIMD) algorithm, to maintain its performance over the Long Fat Network (LFN) of satellite-based hybrid FSO/RF communications. First, TCP tends to perform poorly in wireless environments because it misinterprets losses due to bad channel conditions as an indication of network congestion. Second, the conventional TCP variants are practically ineffective in the LFN because of the slow linear growth of the congestion window.

A. LITERATURE REVIEW AND MOTIVATIONS
The existing studies on TCP performance over satellite networks mainly focus on RF-based last-mile access [4]-[11]. In particular, Zhu et al. studied the performance of TCP enhancements for a hybrid terrestrial-satellite network in [5]. In [7], the authors evaluated the quality of video streaming applications using TCP over satellite links. The interactions between TCP at the transport layer and random access schemes at the Media Access Control (MAC) layer for machine-to-machine services over satellite links were investigated in [8]. In [9], the performance of multi-path TCP was analyzed over a satellite network containing several WiFi access points connected to a server together with a convoy of drones for providing Internet access. In addition, a technique called Path-Based Network Coding (PBNC) was proposed in [10] to maximize the TCP goodput for Terrestrial-Satellite Mobile Systems. Most recently, the performance of TCP NewReno over satellite-based Digital Video Broadcasting-Return Channel via Satellite (DVB-RCS2) links was evaluated in [11].
There are also several works on TCP performance under the impact of FSO-based last-mile access [12]-[15]. In particular, in [12], [13], the authors analyzed the throughput performance of TCP-FSO over the satellite-to-fixed-ground-station channel. In these works, a test-bed system with an optical emulator was used to conduct the measurements and validate the analytical results. In [14], the authors demonstrated the impact of atmospheric turbulence on the TCP throughput and confirmed the effectiveness of TCP Cubic in FSO-based satellite communication systems. Most recently, in [15], the TCP performance with the Incremental Redundancy Hybrid ARQ (IR-HARQ) protocol was analytically investigated in FSO-based LEO satellite-to-vehicle networks. That work also showed that TCP Cubic outperforms TCP Hybla in low error-rate conditions, whereas TCP Hybla maintains a better performance than TCP Cubic when transmission errors happen more frequently.
While the conventional TCP variants using AIMD algorithms can still maintain a good performance in satellite-based RF networks [11], it is challenging for these variants to achieve high throughput over the enormous bandwidth of satellite-based FSO networks. For instance, the results in [14] illustrated that TCP variants using the AIMD algorithm, e.g., Reno, are not effective in the satellite FSO-based LFN. As a result, it is crucial to select the proper TCP variants for different network scenarios. To the best of our knowledge, at the time of writing, no work addresses the TCP performance issue in satellite-based hybrid FSO/RF networks, as summarized in Table 1. Furthermore, Monte-Carlo simulations are also performed to confirm the validity of the analytical results. It is noteworthy that an essential novelty of the work is the complete model that we set up by jointly considering the impact of both the physical and link layers for different TCP variants. The issues considered include the hybrid FSO/RF satellite channels and the presence of error-control methods at both the physical and link layers, i.e., the Reed-Solomon (RS) code and Selective Repeat ARQ (SR-ARQ). Each of them has been well studied separately. Still, to the best of our knowledge, no prior work puts all of these strongly interacting pieces together to understand the performance of TCP variants in satellite Internet access.
The rest of the paper is organized as follows. The network scenario is described in Section II. The hybrid FSO/RF switch-over system and fading channel modeling for FSO and RF are presented in Section III. The TCP performance analysis, including the TCP segment loss model, average RTT delay, and the comprehensive loss-based throughput, is shown in Section IV. Numerical results are given in Section V, and finally, Section VI concludes the paper.

II. REFERENCE NETWORK SCENARIO

A. NETWORK DESCRIPTIONS
As shown in Fig. 1, we consider an end-to-end network scenario in which a vehicle downloads data from a remote server (the TCP source) through LEO satellite Internet access. The satellite is connected to the TCP source, which is placed within the wired section of the Internet. The connection between the LEO satellite and the vehicle, where the hybrid FSO/RF link is used, is assumed to be the potential bottleneck of the network. On this connection, forward error correction (FEC) and SR-ARQ are used as the error-control protocols at the physical layer and the link layer, respectively.
A hybrid FSO/RF system comprising an FSO link and an RF link is employed at the physical layer. Many degrading factors, including interference, absorption, scattering, refraction, and turbulence caused by the atmosphere and weather conditions (fog, rain), harm the FSO and RF links in different ways [17]. For example, the main deteriorating factors of FSO links are fog and atmospheric turbulence. RF links, on the other hand, are not severely affected by fog and atmospheric turbulence but are susceptible to heavy rain. The integration of FSO and RF links with a switch-over system has therefore emerged as a promising solution for providing reliable and effective transport for critical real-time traffic in an outdoor wireless environment [18]. For both links, the popular Reed-Solomon (RS) code is employed to improve the error performance at the physical layer.
In the link layer, we assume that each TCP segment is divided and encapsulated into $n_f$ smaller packets, and then into data frames, before being transmitted over the last-mile link. Selective Repeat ARQ (SR-ARQ) is applied to both the RF and FSO links to ensure reliable data transmission by retransmitting erroneous frames. The persistent level of SR-ARQ is defined as $X$ (i.e., the maximum number of transmission attempts for a frame).
We consider several TCP variants, including Cubic, Hybla, NewReno, and HSTCP. The performance of TCP depends on the congestion window control algorithms, which are represented as functions of time, used to deal with loss events. We also assume that the loss events, indicated to the source by three duplicate ACKs or time-out events, have two main causes: transmission losses and congestion losses. Transmission losses are caused by errors at the physical layer, while congestion losses are caused by buffer overflow at routers on the Internet.

B. TCP VARIANTS
In this section, we briefly review TCP congestion control algorithms considered in the paper, including NewReno, Hybla, Cubic and HSTCP.

1) TCP NEWRENO
TCP NewReno, an upgraded version of TCP Reno, is the most widespread congestion control protocol in Internet applications. The congestion window size, denoted as $W_{\text{NewReno}}$, is updated upon reception of a non-duplicate ACK or a loss indication. The update follows the AIMD algorithm: the window size is increased by $\mathrm{MSS}/W_{\text{NewReno}}$ bytes for each ACK, i.e., by one segment for each round-trip time (RTT), where MSS is the maximum segment size. It is reduced to half of the current window size when a segment loss is detected by three duplicate ACKs, or to 1 if a time-out happens. The window growth function of TCP NewReno is expressed in terms of $x$, the last window reduction, $\tau$, the elapsed time since the last loss event, and RTT, the average round-trip time.
When the loss event is detected, TCP NewReno reduces its current window size by half.
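The AIMD behavior described above can be illustrated with a minimal per-RTT sketch. It is not the paper's window growth function: the function name `aimd_trace`, the round-based time axis, and the loss schedule are assumptions made only for illustration, with the window counted in segments.

```python
def aimd_trace(rtts, loss_rounds, w0=1.0, timeout_rounds=()):
    """Per-RTT congestion window (in segments) under AIMD in congestion avoidance.

    loss_rounds: rounds in which a loss is detected by three duplicate ACKs.
    timeout_rounds: rounds in which a time-out occurs (hypothetical schedule).
    """
    w = w0
    trace = []
    for r in range(rtts):
        if r in timeout_rounds:
            w = 1.0                 # time-out: window falls back to one segment
        elif r in loss_rounds:
            w = max(w / 2.0, 1.0)   # multiplicative decrease: halve the window
        else:
            w += 1.0                # additive increase: one segment per RTT
        trace.append(w)
    return trace


# Ten rounds with a single loss detected in round 5.
trace = aimd_trace(rtts=10, loss_rounds={5})
```

The linear growth between losses is what makes this scheme slow to fill a link with a large bandwidth-delay product.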

2) TCP HYBLA
TCP Hybla behaves similarly to TCP NewReno. The difference lies in the approach used by TCP Hybla to remedy the impact of high-latency terrestrial or satellite radio links. Hybla removes the effect of extended RTTs by using a normalized value, given as $\rho = \mathrm{RTT}/\mathrm{RTT}_0$, where $\mathrm{RTT}_0$ is the round-trip time of the reference connection to which we aim to equalize our performance. The congestion window size of TCP Hybla is then expressed in terms of $\rho$. Note that when $\mathrm{RTT} \le \mathrm{RTT}_0$, TCP Hybla behaves as the standard TCPs (e.g., TCP NewReno) [19].

3) HIGH-SPEED TCP
High-Speed TCP (HSTCP) is a congestion control mechanism for connections with very large bandwidth. Unlike standard TCP, HSTCP lets the congestion window increase by $a(w)$ segments per RTT in the absence of loss events and decrease to $W_{\text{HSTCP}}(1 - b(w))$ segments in response to an RTT with one or more loss events. The window growth function of HSTCP is expressed in terms of $a(w)$ and $b(w)$, the modified increase and decrease parameters of HSTCP as defined in RFC 3649 [20], and $\tau$, the elapsed time since the last loss event.
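As an illustration of the increase and decrease parameters, the following sketch evaluates $a(w)$ and $b(w)$ from the default settings suggested in RFC 3649 (Low_Window = 38, High_Window = 83000, Low_P = 10^-3, High_P = 10^-7, High_Decrease = 0.1). Whether the paper uses these defaults is an assumption; the function names are illustrative.

```python
import math

# Default parameters suggested in RFC 3649 (assumed here, not taken from the paper).
LOW_WINDOW = 38       # at or below this window HSTCP behaves like standard TCP
HIGH_WINDOW = 83000
LOW_P = 1e-3          # loss rate associated with LOW_WINDOW
HIGH_P = 1e-7         # loss rate associated with HIGH_WINDOW
HIGH_DECREASE = 0.1   # decrease factor at HIGH_WINDOW


def _log_fraction(w):
    """Position of w between LOW_WINDOW and HIGH_WINDOW on a log scale."""
    return (math.log(w) - math.log(LOW_WINDOW)) / (
        math.log(HIGH_WINDOW) - math.log(LOW_WINDOW))


def hstcp_b(w):
    """Decrease parameter b(w): the window becomes (1 - b(w)) * w after a loss."""
    if w <= LOW_WINDOW:
        return 0.5
    return (HIGH_DECREASE - 0.5) * _log_fraction(w) + 0.5


def hstcp_p(w):
    """Loss rate of the HSTCP response function at window w."""
    return math.exp(_log_fraction(w) * (math.log(HIGH_P) - math.log(LOW_P))
                    + math.log(LOW_P))


def hstcp_a(w):
    """Increase parameter a(w), in segments per RTT."""
    if w <= LOW_WINDOW:
        return 1.0
    b = hstcp_b(w)
    return w * w * hstcp_p(w) * 2.0 * b / (2.0 - b)
```

For small windows the pair reduces to the standard AIMD values (one segment per RTT, halving on loss), which is why HSTCP is compatible with standard TCP at low bandwidth.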

4) TCP CUBIC
The window size of TCP Cubic is a cubic function of the time since the last loss event. Let $W_{\text{Cubic}}(x, \tau)$ denote the window size as a function of the last window reduction $x$ and the elapsed time $\tau$. The window growth function of Cubic is expressed in terms of $C$ and $\beta$, the Cubic factor and the multiplicative decrease factor, respectively [16].
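For illustration, the sketch below evaluates the cubic growth function in the form used by the original CUBIC proposal, $W(\tau) = C(\tau - K)^3 + x$ with $K = \sqrt[3]{\beta x / C}$. It assumes that $x$ is the window just before the reduction and that $\beta$ is the fraction by which the window is reduced; both the convention and the default values $C = 0.4$, $\beta = 0.2$ are assumptions and may differ from the paper's settings.

```python
def cubic_k(x, C=0.4, beta=0.2):
    """Time (seconds) needed to return to the pre-loss window x."""
    return (beta * x / C) ** (1.0 / 3.0)


def cubic_window(x, tau, C=0.4, beta=0.2):
    """Cubic window (segments) tau seconds after a loss that occurred at window x."""
    k = cubic_k(x, C, beta)
    return C * (tau - k) ** 3 + x


# Immediately after the loss the window is (1 - beta) * x; it returns to x at tau = K.
w_after_loss = cubic_window(100.0, 0.0)
w_at_plateau = cubic_window(100.0, cubic_k(100.0))
```

Because the growth depends on elapsed time rather than on the number of received ACKs, the window recovery is independent of the RTT, which is the property that makes Cubic attractive on long-delay links.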

III. HYBRID FSO/RF LAST-MILE MODELING
In this Section, we first review the fading channel models of both FSO and RF links. Then, the design of link switching and channel-state for the considered hybrid FSO/RF last-mile system are presented.

A. FADING CHANNEL MODELS

1) FSO LINK
The atmospheric turbulence phenomenon causes the scintillation effect, which results in signal power fluctuations at the vehicle's detector. As reported in [21], the Gamma-Gamma (GG) model can describe a wide range of turbulence conditions for the downlink from satellite to vehicle. The probability density function (PDF) of the received signal-to-noise ratio (SNR), denoted by $\gamma_f$, is given in (5) [21, (1)], where $\bar{\gamma}_f$ is the average value of $\gamma_f$ and $K_{\alpha-\beta}(\cdot)$ is the modified Bessel function of the second kind of order $\alpha - \beta$. Here, $\alpha$ and $\beta$ are the effective numbers of large-scale and small-scale eddies of the scattering environment, which are written in terms of the Rytov variance $\sigma_R^2$ as in [22, (2)]. In the case of plane-wave propagation, $\sigma_R^2$ is given by [21, (2)], where $k = 2\pi/\lambda$ is the optical wave number, $\lambda$ is the optical wavelength, $\xi$ is the zenith angle, $H$ is the satellite altitude, $h_v$ is the vehicle height, and $\sec(\cdot)$ is the secant function. In addition, $C_n^2(h)$ represents the turbulence strength at altitude $h$ and is determined by the widely used Hufnagel-Valley model [23, (12.1)], where $\nu_{\text{wind}}$ is the root-mean-square (RMS) wind speed and $C_n^2(0)$ is the ground turbulence level, which varies from $10^{-17}\ \text{m}^{-2/3}$ to $10^{-13}\ \text{m}^{-2/3}$.
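The chain from the turbulence profile to the GG parameters can be sketched numerically as follows. The expressions are the textbook forms of the Hufnagel-Valley profile, the plane-wave Rytov variance along a slant path, and the plane-wave $\alpha$, $\beta$ formulas; they are assumed here to match the cited equations, and the default values (wind speed, ground turbulence level, integration step) are illustrative only.

```python
import math


def cn2_hufnagel_valley(h, v_wind=21.0, cn2_ground=1e-15):
    """Turbulence strength C_n^2(h) in m^(-2/3) at altitude h (metres)."""
    return (0.00594 * (v_wind / 27.0) ** 2 * (1e-5 * h) ** 10 * math.exp(-h / 1000.0)
            + 2.7e-16 * math.exp(-h / 1500.0)
            + cn2_ground * math.exp(-h / 100.0))


def rytov_variance(wavelength, zenith_deg, H, h_v,
                   v_wind=21.0, cn2_ground=1e-15, step=10.0):
    """Plane-wave Rytov variance for a downlink from altitude H to height h_v."""
    k = 2.0 * math.pi / wavelength                    # optical wave number
    sec = 1.0 / math.cos(math.radians(zenith_deg))    # secant of the zenith angle
    n = int((H - h_v) / step)
    total, prev = 0.0, 0.0   # integrand vanishes at h = h_v
    for i in range(1, n + 1):                         # trapezoidal rule
        h = h_v + i * step
        cur = cn2_hufnagel_valley(h, v_wind, cn2_ground) * (h - h_v) ** (5.0 / 6.0)
        total += 0.5 * (prev + cur) * step
        prev = cur
    return 2.25 * k ** (7.0 / 6.0) * sec ** (11.0 / 6.0) * total


def gg_parameters(sigma_r2):
    """Effective numbers of large-scale (alpha) and small-scale (beta) eddies."""
    s = sigma_r2 ** (6.0 / 5.0)                       # sigma_R^(12/5)
    alpha = 1.0 / (math.exp(0.49 * sigma_r2 / (1.0 + 1.11 * s) ** (7.0 / 6.0)) - 1.0)
    beta = 1.0 / (math.exp(0.51 * sigma_r2 / (1.0 + 0.69 * s) ** (5.0 / 6.0)) - 1.0)
    return alpha, beta


# Hypothetical setting: 1550 nm wavelength, 600 km altitude, 1.5 m vehicle height.
sigma2_zenith = rytov_variance(1550e-9, 0.0, 600e3, 1.5, step=50.0)
sigma2_slant = rytov_variance(1550e-9, 60.0, 600e3, 1.5, step=50.0)
alpha, beta = gg_parameters(sigma2_zenith)
```

A larger zenith angle lengthens the path through the atmosphere, so the Rytov variance grows with $\sec^{11/6}(\xi)$ and the turbulence becomes effectively stronger.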
Given the PDF of $\gamma_f$ in (5), its cumulative distribution function (CDF) is computed as in [24, (4)] in terms of Meijer's G-function. One of the essential parameters for understanding fading channels is the level crossing rate (LCR), defined as the average number of times per second that the received SNR crosses a certain threshold $\gamma_f^{\text{th}}$ (in either the positive or the negative direction). The LCR for the GG model defined in [25, (26)] is, however, not available in closed form. By applying the Gauss-Laguerre quadrature approximation [26, (25.4.45)], a closed-form expression for the LCR is obtained in (11), where $n$ is the Gauss-Laguerre approximation order, while $w_i$ and $x_i$ are respectively the weight factor and the $i$-th zero of the Laguerre polynomial $L_n(x)$ found in [26, Table 25.9]. In addition, $\sigma_s^2$ is the log-intensity variance found in [1], and $t_c^f$ is the FSO fading channel coherence time defined as $t_c^f = \sqrt{\lambda H \sec(\xi)}/\nu_{\text{wind}}$ [21, (7)].

2) RF LINK
The RF-based satellite fading channel can be well characterized by the Nakagami-$m$ model thanks to its capability of describing different fading scenarios, from line-of-sight (LOS) to non-line-of-sight (NLOS) [27]. Furthermore, it was verified in [28] that using the Nakagami-$m$ model simplifies the analysis while maintaining good accuracy. The PDF of the received SNR in RF fading channels, denoted by $\gamma_r$, is expressed as in [28, Table 3], where $\bar{\gamma}_r$ is the average value of $\gamma_r$ and $m$ represents the different fading scenarios (refer to [28] for more details). Given the PDF of $\gamma_r$, its CDF follows directly. The LCR of the Nakagami-$m$ fading model, denoted as $\text{LCR}(\gamma_r^{\text{th}})$, is given in (14) [29, (10)], where $f_d$ is the maximum Doppler frequency, which is due to the relative motion between the LEO satellite and the vehicle. Given the carrier frequency $f_c$, the Doppler frequency shift is written in (15), where $c$ is the speed of light in vacuum, $\theta_{\max}$ is the maximum elevation angle, $w_F$ is the relative angular velocity of the satellite with respect to the vehicle, $r_E$ is the radius of Earth, and $r$ is the distance between the satellite and the center of Earth [30, (5)]. In addition, to analyze the Doppler effect with respect to time, it is necessary to investigate the satellite visibility window duration. It is defined as the total duration, a small percentage of the satellite orbit period, during which the LEO satellite can communicate effectively with the vehicle, and it is expressed as in [30, (11)], where $\theta_{\min}$ and $\theta_{\max}$ are the minimum and maximum values of the elevation angle in a satellite pass. An example of the Doppler frequency shift over the visibility time is illustrated in Fig. 2. The satellite altitude is $H = 600$ km, $f_c = 2.4$ GHz, the vehicle's velocity is $v_c = 50$ km/h, and the minimum elevation angle is $\theta_{\min} = 10^\circ$.
Also, different maximum elevation angles $\theta_{\max}$, i.e., $20^\circ$, $30^\circ$, $60^\circ$, and $90^\circ$, are taken into account. As seen, the Doppler frequency shift ranges from approximately −50 kHz to 50 kHz.
The FSO link, with its higher bandwidth, is considered the primary link in this system, while the RF link acts as a backup when the FSO link is down. The feedback link is assumed to be error-free and to have no latency. Based on the latest channel state information (CSI) of the FSO link, which is the default link, the system decides the transmission link at the beginning of a transmission cycle. Initially, the system transmits data over the FSO link, which has a higher transmission rate. Whenever the environmental conditions worsen and the FSO channel quality decreases, the system switches to the RF backup link. When using the RF link, if the quality of the RF link is not good and the FSO CSI is not favorable either, the system is in the outage state. Whenever the FSO link CSI improves, the system switches back to the FSO link. The link selection for a transmission, denoted by $S$, is thus determined by $\gamma_f$ and $\gamma_r$, the instantaneous received SNRs of the FSO and RF links, respectively. The probability of link switching from FSO to RF is obtained from $F_{\gamma_f}$, the CDF of the FSO channel. At a high average received SNR of the FSO link, the link switching probability is very low, and it increases as the quality of the FSO link degrades. Atmospheric turbulence and the zenith angle also greatly affect the link switching probability. For example, comparing Fig. 4(a) and Fig. 4(b), keeping the link switching probability under 0.1 requires an average received SNR higher than 7 dB and 14 dB under the weak and strong turbulence conditions, respectively. The outage event happens when the system is operating on the RF link and $\gamma_r < \gamma_{\text{out}}$; the outage probability is thus determined from $F_{\gamma_f}$ and $F_{\gamma_r}$, the CDFs of the FSO and RF channels, respectively.
Fig. 5 depicts the outage probability as a function of both $\bar{\gamma}_f$ and $\bar{\gamma}_r$ for different turbulence conditions and Nakagami factors $m$. As is evident, the atmospheric turbulence has a significant impact on the outage probability. For example, to keep the outage probability under $10^{-2}$, the required SNRs are $\{\bar{\gamma}_f, \bar{\gamma}_r > 15\ \text{dB}\}$ and $\{\bar{\gamma}_f, \bar{\gamma}_r > 25\ \text{dB}\}$ under the weak and strong turbulence conditions, respectively.
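The switch-over rule described above can be sketched as a small decision function. The thresholds `gamma_thres` (FSO switching threshold) and `gamma_out` (RF outage threshold) follow the names used in the text; the product form of the outage probability assumes that the FSO and RF fading processes are independent, which is an assumption of this sketch rather than a statement from the paper.

```python
def select_link(gamma_f, gamma_r, gamma_thres, gamma_out):
    """Return the link used for the next transmission cycle.

    gamma_f, gamma_r: instantaneous received SNRs of the FSO and RF links.
    """
    if gamma_f >= gamma_thres:
        return "FSO"      # primary link is good enough
    if gamma_r >= gamma_out:
        return "RF"       # fall back to the backup link
    return "OUTAGE"       # neither link can carry the transmission


def switching_probability(cdf_fso_at_thres):
    """P(gamma_f < gamma_thres): probability of switching from FSO to RF."""
    return cdf_fso_at_thres


def outage_probability(cdf_fso_at_thres, cdf_rf_at_out):
    """P(gamma_f < gamma_thres and gamma_r < gamma_out), assuming independence."""
    return cdf_fso_at_thres * cdf_rf_at_out
```

For example, if the FSO CDF evaluated at the threshold is 0.1 and the RF CDF evaluated at the outage threshold is 0.05, the sketch gives a switching probability of 0.1 and an outage probability of 0.005.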

2) CHANNEL-STATE MODELING
To facilitate the frame transmissions, we now design a channel-state model in which each state is defined by a range of SNRs. The duration of a channel state is selected to be long enough for the transmission of multiple frames to be completed within the state, and it should be equal to or shorter than the channel fading coherence time.$^1$ Both the FSO and RF channels are organized into non-overlapping consecutive states. The channel is said to be in the $k$-th state if the received SNR falls into the interval $[\gamma_k, \gamma_{k+1})$. In addition, the duration of the $k$-th channel state, denoted by $\bar{\tau}_k$, depends on the statistical characteristics of the fading channel and is obtained as in [31, (27)], where $\text{LCR}(\gamma_k)$ is the LCR given in (11) for FSO and in (14) for RF. Additionally, $P_k$ is the probability of the $k$-th channel state, which is determined for the FSO and RF models from $f_\gamma(\cdot)$ and $F_\gamma(\cdot)$, the channel PDF and CDF, respectively. We now determine the SNR threshold levels $\gamma_k$ of the channel-state model for both FSO and RF. Given the fading channel coherence times $t_c^f$ (FSO) and $t_c^r$ (RF), the procedure for searching the thresholds of the FSO and RF models is summarized in Algorithm 1. After the search, all $M + 1$ and $N + 1$ threshold levels corresponding to the $M$ and $N$ channel states of the FSO and RF models, respectively, are determined. Here, we set $\gamma_{\text{thres}} = \gamma_2^f$ and $\gamma_{\text{out}} = \gamma_2^r$.
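A minimal sketch of the RF channel-state statistics is given below. It assumes the standard finite-state Markov channel relations $P_k = F_\gamma(\gamma_{k+1}) - F_\gamma(\gamma_k)$ and $\bar{\tau}_k = P_k / (\text{LCR}(\gamma_k) + \text{LCR}(\gamma_{k+1}))$, the usual Nakagami-$m$ SNR CDF (closed form for integer $m$), and the usual Nakagami-$m$ LCR; these are assumed to correspond to the cited expressions. The threshold values in the example are hypothetical and are not produced by Algorithm 1.

```python
import math


def nakagami_cdf(g, g_avg, m):
    """CDF of the received SNR for integer Nakagami factor m."""
    x = m * g / g_avg
    return 1.0 - math.exp(-x) * sum(x ** k / math.factorial(k) for k in range(m))


def nakagami_lcr(g, g_avg, m, f_d):
    """Level crossing rate (crossings per second) at SNR threshold g."""
    x = m * g / g_avg
    return math.sqrt(2.0 * math.pi) * f_d / math.gamma(m) * x ** (m - 0.5) * math.exp(-x)


def channel_states(thresholds, g_avg, m, f_d):
    """Return (P_k, tau_k) for each interval [gamma_k, gamma_{k+1}).

    thresholds: increasing SNR levels in linear scale; the last may be math.inf.
    """
    states = []
    for lo, hi in zip(thresholds[:-1], thresholds[1:]):
        cdf_hi = 1.0 if math.isinf(hi) else nakagami_cdf(hi, g_avg, m)
        lcr_hi = 0.0 if math.isinf(hi) else nakagami_lcr(hi, g_avg, m, f_d)
        p_k = cdf_hi - nakagami_cdf(lo, g_avg, m)
        crossings = nakagami_lcr(lo, g_avg, m, f_d) + lcr_hi
        tau_k = p_k / crossings if crossings > 0.0 else math.inf
        states.append((p_k, tau_k))
    return states


# Hypothetical example: three states, average SNR 5 (linear), m = 2, f_d = 100 Hz.
states = channel_states([0.0, 1.0, 5.0, math.inf], g_avg=5.0, m=2, f_d=100.0)
```

A threshold search in the spirit of Algorithm 1 would adjust the levels until each $\bar{\tau}_k$ satisfies the coherence-time constraint; that search is not reproduced here.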
Given $[\gamma_k, \gamma_{k+1})$, the average bit error rate (BER) at the $k$-th channel state is computed separately for the FSO and RF models. For the FSO model, $B(x, y) = \Gamma(x)\Gamma(y)/\Gamma(x+y)$ is the Beta function, $K$ and $L$ are expressed as in [31, (14)-(16)], $\Gamma(m, x) := \int_x^{\infty} t^{m-1} e^{-t}\,dt$ is the upper incomplete Gamma function, and $g_z$ and $a_i(x, y)$ are auxiliary terms of the expression. For the RF model, $b_n = \frac{m}{\bar{\gamma}_r} + \frac{3}{2(Z-1)}$, and $\text{BER}(\gamma)$ is the instantaneous BER, which is well approximated as in [32, (28)], where $Z$ is the signal constellation size and $Z = 2$ for BPSK modulation.
1 This guarantees that the channel remains invariant during the frame transmissions.
Algorithm 1 (steps 7 to 9): Search $\gamma_n^r$ that satisfies $\bar{\tau}_n^r = c_r T_r \le t_c^r$; $n = n + 1$; end while.

IV. TCP PERFORMANCE ANALYSIS
This section focuses on the throughput analysis of TCP variants, including TCP NewReno, Hybla, HSTCP, and Cubic with SR-ARQ and RS code over the hybrid FSO/RF fading channel. The TCP throughput depends on the behavior of the congestion window and the average round-trip delay (RTT), which will be described as follows.

A. TCP SEGMENT LOSS
The TCP segment loss is assumed to be caused by two main factors: transmission losses at the last-mile link and congestion losses on the Internet.

1) TRANSMISSION LOSS
The transmission loss is caused by errors on the last-mile link. Given $\text{BER}_k$, the RS-coded frame error rate for each channel state of the FSO and RF links is expressed as in [33, (16)], where $L_f$ is the frame size and $t_0$ is the maximum number of errors that can be corrected. Based on the channel-state modeling of the hybrid FSO/RF channel, the average frame error rate over the last-mile link is given as in [34, (28)], where $\text{FER}_m^f$ and $\text{FER}_n^r$ are the average frame error rates and $P_m^f$ and $P_n^r$ are the steady-state probabilities of the $m$-th state of the FSO link and the $n$-th state of the RF link, while $R_f$ and $R_r$ are the transmission rates of the FSO and RF links, respectively. Considering the operation of SR-ARQ, when a frame loss is detected, the sender keeps retransmitting the frame; if the frame does not get through after $X$ transmission attempts (i.e., no ACK is received), the corresponding TCP segment is assumed to be lost. Given the FER, the TCP transmission loss probability $P_t$ over the last-mile link is then obtained as a function of $X$, the persistent level of SR-ARQ, and $n_f$, the number of frames per segment.
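The two steps above, from BER to coded frame error rate and from frame error rate to segment loss, can be sketched as follows. Both formulas are assumptions made for illustration: the frame is taken to fail when more than $t_0$ of its $L_f$ units are in error, with independent errors, and a segment is taken to be lost when any of its $n_f$ frames fails all $X$ attempts. The paper's cited expressions may differ in detail.

```python
import math


def rs_frame_error_rate(ber, frame_len, t0):
    """Probability that more than t0 of frame_len units are in error (assumed model)."""
    ok = sum(math.comb(frame_len, i) * ber ** i * (1.0 - ber) ** (frame_len - i)
             for i in range(t0 + 1))
    return max(0.0, 1.0 - ok)   # clip tiny negative values from rounding


def segment_loss_probability(fer, X, n_f):
    """Probability that a TCP segment of n_f frames is lost under SR-ARQ.

    A frame is dropped only if all X transmission attempts fail (fer ** X).
    """
    return 1.0 - (1.0 - fer ** X) ** n_f


# Hypothetical values: FER = 0.1, X = 2 attempts, n_f = 5 frames per segment.
p_t = segment_loss_probability(0.1, X=2, n_f=5)
```

Each additional attempt multiplies the residual frame loss by the FER, which is why even a small persistent level sharply reduces the loss seen by TCP.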

2) CONGESTION LOSS
The congestion loss is caused by buffer overflow at routers in the Internet section. For simplicity, congestion is assumed to happen when the TCP window size reaches the maximum window size $W_{\max}$, which is approximately the sum of the buffer size $B$ of the Internet section and the average bandwidth-delay product (BDP) of the last-mile link [35, (2)]. Here, $L_{\text{TCP}}$ is the TCP segment size, while $\bar{R}$ and RTT are the average transmission rate and the TCP round-trip time of the last-mile link, which are discussed later.
In addition, since the Slow-Start (SS) period is very short compared with the Congestion Avoidance (CA) period, the congestion loss probability can be well approximated as in [36, (2)], based on the transmission cycle between two consecutive loss events. The total number of segments transmitted in a cycle, $N_c$ (see Fig. 6), is defined for each TCP variant in (33), where $W_{\max}$ is the maximum congestion window size, $C$ and $\beta$ are the Cubic and multiplicative decrease parameters, $W = W_{\max}(1 - b(W_{\max}))$, $a(w)$ and $b(w)$ are the modified increase and decrease parameters of HSTCP, respectively, and $\rho$ is the normalized Hybla factor.

3) UNIFIED LOSS MODEL
This section derives the unified loss probability (ULP) under the impact of both transmission and congestion losses. Fig. 6 shows the TCP congestion window evolution when only congestion loss is considered (left) and when both transmission and congestion losses (unified loss) are considered (right). Let $N_{\max}$ denote the maximum number of segments that can be sent when a congestion loss occurs in the unified loss model. Following an approach similar to [36] and approximating $N_{\max}$ by a constant equal to $N_c$, we obtain the expression of the ULP, $P_{\text{loss}}$, in (35), where $\bar{N}$ is the average number of segments that can be sent during a cycle between two consecutive losses in the unified loss model, while $P_t$ and $P_c$ are the transmission loss and congestion loss probabilities, respectively. The validity of (35) can also be verified. As all losses are treated equally by TCP, for simplicity we can re-design an equivalent model with congestion losses only, as in Fig. 7. While the actual network model is characterized by its unified loss $P_{\text{loss}}$, which includes both the transmission loss $P_t$ and the congestion loss $P_c$, the equivalent model is described by congestion events only, represented by $\bar{P}_c$. To equalize the two models, $\bar{P}_c$ is set as in (38). Based on the analysis above, (38) can be rewritten as (39), where $\bar{N}$ and $\bar{N}_c$ are the average numbers of segments that can be sent during a cycle between two consecutive losses in the actual and equivalent models, respectively. By substituting (35) into (39), $\bar{P}_c$ is expressed in terms of $P_t$ and $P_c$, the transmission and congestion loss probabilities of the actual network model. Since $N_c$ and $\bar{N}_c$ are functions of the maximum window sizes of the actual and equivalent models, respectively, as defined in (33), the maximum window size of the equivalent model can be expressed for each TCP variant (NewReno, Hybla, HSTCP, and Cubic), where $G = \frac{4-\beta}{4\,\mathrm{RTT}}\sqrt[3]{\frac{\beta W_{\max}^4}{C}}$ is the Cubic constant parameter, an approximated parameter is used for HSTCP, and $\rho$ is the normalized Hybla factor.

B. AVERAGE TCP ROUND-TRIP DELAY
The average end-to-end round-trip time for a TCP segment, denoted as RTT, is approximated in (42) in terms of $T_{\text{OD}}$ and $T_{\text{LD}}$, where $T_{\text{OD}}$ is a fixed, deterministic delay caused by other parts of the network and $T_{\text{LD}}$ is the average delay on the last-mile link, which depends on the data frame retransmissions at the link layer. Here, $T_{\text{prop}}$ is the propagation delay from the satellite to the vehicle, $L_f$ is the frame size, and $\bar{R}$ is the average transmission rate of the last-mile link, given as in [34, (31)], where $R_f$, $M$ and $R_r$, $N$ are the transmission rate and the number of channel states of the FSO and RF links, respectively. $L_f/\bar{R}$ is therefore the average transmission time for one frame. In addition, $\bar{X}$ is the average number of retransmissions for one frame, obtained as in [37, (22)], where $X$ is the persistent level of SR-ARQ and FER is the average frame loss probability of the last-mile link.
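To make the role of $\bar{X}$ concrete, the sketch below computes the expected number of transmission attempts per frame for a truncated ARQ with at most $X$ attempts and independent frame errors, which equals $(1 - \text{FER}^X)/(1 - \text{FER})$. Treating this quantity as the paper's $\bar{X}$ is an assumption; the cited expression may count retransmissions differently.

```python
def expected_transmissions(fer, X):
    """Expected number of attempts per frame when at most X attempts are allowed.

    Attempt i + 1 takes place only if the first i attempts all failed (fer ** i).
    """
    return sum(fer ** i for i in range(X))


def frame_time(frame_bits, rate_bps):
    """Average transmission time of one frame, L_f / R (seconds)."""
    return frame_bits / rate_bps


# Hypothetical values: FER = 0.2 and persistent level X = 4.
x_bar = expected_transmissions(0.2, 4)
```

Multiplying `x_bar` by the per-frame time gives a rough sense of how retransmissions inflate the last-mile delay as the FER grows.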

C. TCP THROUGHPUT ANALYSIS

1) LOSS-BASED TCP MODELLING
The TCP throughput performance can be evaluated by analyzing the TCP window variation between two adjacent losses. Specifically, the window behavior of TCP can be modeled as a Markov chain; by deriving the stationary distribution of the chain, the average throughput can then be obtained. In particular, the range of the TCP window size $(0, W_{\max}]$ is first separated into $(N + 1)$ intervals. The 0-th interval, $(0, \text{SST})$, where SST is the Slow-Start threshold, represents the Slow-Start phase, and the other $N$ intervals (1-st to $N$-th), which belong to the Congestion Avoidance phase, equally divide the range $(\text{SST}, W_{\max}]$, where $W_{\max}$ is the maximum TCP window size. Each interval corresponds to a Markov state. The TCP window is said to be in the $m$-th state if its value belongs to the corresponding interval. In that case, the window size is taken as the midpoint of the interval, denoted as $z_m$. When TCP is in the 0-th state, the TCP window size is SST. We observe the Markov chain at the time instants when a window reduction occurs, which is indicated by a Retransmission Time-out (RTO) or Triple Duplicate ACKs (TD ACKs). The transition between the $m$-th and $n$-th states ($n \neq 0$) corresponds to the state transition between two consecutive loss events, as illustrated in Fig. 8. When a window reduction event happens, the window size reduces to $\beta$ times its size just before the event ($z_m \to \beta z_m$). There are two possible cases. If the reduced window size is below the Slow-Start threshold ($\beta z_m < \text{SST}$), the TCP window grows from $\beta z_m$ to SST according to the Slow-Start growth function and then from SST to $z_n$ according to the Congestion Avoidance growth function, without encountering another loss event (as in Fig. 8(a)). Otherwise, if $\beta z_m \ge \text{SST}$, the window size grows from $\beta z_m$ to $z_n$ according to the Congestion Avoidance growth function (see Fig. 8).
Assuming that the time between two consecutive losses, $T_{\text{loss}}$, follows an exponential distribution with rate $\delta$ [38], [39], i.e., $T_{\text{loss}} \sim \text{Exp}(\delta)$, the transition probability between the $m$-th and $n$-th states ($n \neq 0$), denoted as $P_{m,n}$, is given in (45), where $z_m$ is the congestion window size of the $m$-th state and $T_{m,n}^{\max}$, $T_{m,n}^{\min}$ are the maximum and minimum durations between the $m$-th and $n$-th intervals, respectively. As shown in Fig. 8, the transition between the $m$-th and $n$-th states cannot occur if $z_n^{\max} < \beta z_m$, where $z_n^{\max}$ is the maximum window size of the $n$-th interval, which corresponds to the $n$-th state; in that case $P_{m,n} = 0$. $P_{\text{TO}}(w)$ is the probability that a loss leads to a time-out event, and it is approximated as in [40, (2)], where $w$ is the TCP window size at that time. If a loss leads to a time-out, the TCP congestion window size is reduced from $z_m$ to 1 MSS, which determines the transition probability between the $m$-th and 0-th states. On the other hand, as $T_{\text{loss}} \sim \text{Exp}(\delta)$, the value of $\delta$ is expressed for each TCP variant in terms of $\bar{T}_{\text{loss}}$, the average time between two consecutive losses. In the equivalent model, this is the duration in which the TCP window size increases from $\beta W_{\max}$ to $W_{\max}$ (as in Fig. 7). Here, $W_{\max}$ is the maximum window size of the equivalent model, while RTT is the average RTT of the last-mile link, defined in (42). $C$ and $\beta$ are the Cubic and multiplicative decrease parameters, $\rho$ is the normalized Hybla factor, and $a(w)$, $b(w)$ are the modified increase and decrease parameters of HSTCP with $W = (1 - b(W_{\max}))W_{\max}$. From (45), (46), and (48), with $(N + 1)$ being the total number of states, the complete transition probability matrix of the Markov chain, $P = [P_{m,n}]_{(N+1)\times(N+1)}$, is obtained. Denote $\pi = [\pi_0\ \pi_1\ \dots\ \pi_N]$ as the vector of steady-state probabilities, where $\pi_m$ is the probability of the $m$-th state. Following Markov chain theory, we derive $\pi_m$ by solving
$\pi = \pi P, \quad \sum_{m=0}^{N} \pi_m = 1.$ (50)
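The steady-state condition $\pi = \pi P$ with $\sum_m \pi_m = 1$ can be solved numerically in several ways. The sketch below uses plain power iteration, which is one possible choice and not necessarily the method used in the paper; it assumes an irreducible, aperiodic chain, and the two-state matrix in the example is hypothetical.

```python
def steady_state(P, tol=1e-12, max_iter=100000):
    """Stationary distribution of a row-stochastic matrix P by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n                        # start from the uniform distribution
    for _ in range(max_iter):
        nxt = [sum(pi[m] * P[m][k] for m in range(n)) for k in range(n)]
        total = sum(nxt)
        nxt = [v / total for v in nxt]        # renormalise against rounding drift
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Hypothetical two-state chain; its stationary distribution is [5/6, 1/6].
pi = steady_state([[0.9, 0.1],
                   [0.5, 0.5]])
```

For the $(N + 1)$-state window model, `P` would be filled with the transition probabilities $P_{m,n}$ described above.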

2) TCP THROUGHPUT
Given Z (t) is the total amount of data transmitted during time t, by applying the renewal reward theorem, the average TCP throughput can be expressed as  where P m,n , T m,n and A m,n are the transition probability, average time duration and total transmission data during the transition between m-th and n-th state, respectively. As shown in Fig. 8 and the analysis above, the average time duration between m-th and n-th state, T m,n , can be given as where RTT is the average RTT of the last-mile link, and ρ is the normalized Hybla factor. D CA (x, y) is the time duration when the window size grows from x to y in the Congestion Avoidance phase, and it is given as where C, β are the Cubic and multiplicative decrease parameters, a(w) is the modified increase parameter of HSTCP, and ρ is the normalized Hybla factor. Let D 1 = D CA (z m , z n ), D 2 = D SS (βz m , SST) and D 3 = D CA (SST, z n ), the total transmission data during the transition between m-th and n-th state, A m,n , can be expressed as where W (x, τ ) is the TCP window function, including W SS and W CA in the Slow-Start and Congestion Avoidance phases, respectively. For different TCP variants, A m,n is determined as If βz m ≥ SST: TCP NewReno, TCP Hybla, If βz m < SST: where L = 3 βz m C with C, β are the Cubic and multiplicative decrease parameters, a(w), b(w) are the modified increase and decrease parameters of HSTCP, respectively, and ρ is the normalized Hybla factor.
If a Time-out event occurs: $A_{m,n} = 0$.
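The renewal-reward computation described above can be sketched numerically as follows. The sketch assumes the standard ratio form for a Markov renewal-reward process, i.e., the expected data per transition divided by the expected time per transition, each weighted by $\pi_m P_{m,n}$. This form is an assumption here, since the throughput expression itself is not reproduced in the text, and the function name and the two-state inputs are hypothetical.

```python
def renewal_reward_throughput(pi, P, A, T):
    """Ratio of expected data to expected time per state transition.

    pi : steady-state probabilities pi[m]
    P  : transition probabilities P[m][n]
    A  : data sent during the transition m -> n, A[m][n]
    T  : average duration of the transition m -> n, T[m][n]
    """
    n = len(pi)
    data = sum(pi[m] * P[m][k] * A[m][k] for m in range(n) for k in range(n))
    time = sum(pi[m] * P[m][k] * T[m][k] for m in range(n) for k in range(n))
    return data / time


# Hypothetical two-state example (illustrative numbers only).
pi_demo = [0.5, 0.5]
P_demo = [[0.5, 0.5], [0.5, 0.5]]
A_demo = [[0.0, 10.0], [0.0, 20.0]]  # segments; zero on transitions into state 0
T_demo = [[1.0, 2.0], [1.0, 2.0]]    # seconds
print(renewal_reward_throughput(pi_demo, P_demo, A_demo, T_demo))
```

The result is in units of data per unit time (here, segments per second); multiplying by the segment size in bits would give a bit rate.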

V. NUMERICAL RESULTS AND DISCUSSIONS
In this section, we present numerical results and comparatively discuss the performance of TCP variants with different parameters/settings in the scenario of satellite-assisted hybrid FSO/RF vehicular networks. Monte Carlo simulations are also performed to validate the analytical model.  frames/slot for RF, $n_f = 5$ frames/segment. In addition, the parameters related to TCP variants are given in Table 2. First, we analyze the effectiveness of the link-layer error-control design, including SR-ARQ and error correction, in improving the TCP throughput in this network. Fig. 10 compares the TCP Cubic performance for different SR-ARQ persistence levels, with and without the RS code. A significant throughput improvement is observed when SR-ARQ is employed. For instance, in Fig. 10(b), when $\bar{\gamma}_f = 20$ dB and the persistence level is $X = 2$, the TCP Cubic throughput reaches 850 Mbps, whereas with $X = 1$ the achieved throughput is only 300 Mbps. When SR-ARQ is not employed, the throughput drops to almost zero. This emphasizes the importance of the SR-ARQ protocol in mitigating the impact of frame loss and improving the TCP performance. On the other hand, while a higher SR-ARQ persistence level may provide higher throughput, it may also increase the overall delay in the network, which is a critical concern in the design of LEO satellite-to-vehicle communications. Therefore, a reasonable persistence level $X$ must be selected. In addition, the RS code plays an important role in improving the effectiveness of this network. Fig. 10(a) and Fig. 10(b) confirm the effect of the RS code on the throughput performance. Given $\bar{\gamma}_f = 15$ dB and $X = 2$, the TCP throughput reaches 400 Mbps with the RS code, while it remains under 100 Mbps in the uncoded system.
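To give a rough intuition for why the persistence level has such a strong effect, the following sketch computes the residual loss probability under two simplifying assumptions that are not part of the paper's model: frame errors are independent with probability `p_frame`, and $X$ is interpreted as the maximum number of retransmissions (so a frame is lost only after $X + 1$ failed attempts). The value $n_f = 5$ frames/segment is taken from the settings above; all function names are hypothetical. The actual analysis uses correlated fading channels, so these numbers are only indicative.

```python
def residual_frame_loss(p_frame, X):
    """Frame loss probability after at most X retransmissions (X + 1 attempts).

    Assumes independent frame errors, which ignores channel correlation.
    """
    return p_frame ** (X + 1)


def segment_loss(p_frame, X, n_f=5):
    """A segment made of n_f frames is lost if any one of its frames is lost."""
    return 1.0 - (1.0 - residual_frame_loss(p_frame, X)) ** n_f


# Illustrative frame error probability of 0.1; X = 0 means no retransmission.
for X in (0, 1, 2):
    print(X, segment_loss(0.1, X))
```

Under these assumptions, each additional retransmission reduces the residual frame loss by a factor of `p_frame`, at the cost of additional delay for the frames that need it.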
Next, we investigate the impact of segment size on TCP performance over the last-mile access so that optimal segment sizes can be determined to maximize the system throughput under a given atmospheric turbulence condition. Fig. 11 depicts the relationship between segment size and the throughput of several TCP variants under weak, moderate, and strong turbulence conditions. As seen in Fig. 11(a), when the turbulence is weak, which corresponds to a low transmission loss probability in this network, larger segment sizes do not lead to a significant increase in the segment loss probability, and better throughput can therefore be obtained. For example, in Fig. 11(a), TCP Cubic achieves only 800 Mbps when the segment size is 500 bytes, whereas its throughput increases to 900 Mbps when the segment size is 1500 bytes. Nevertheless, when the turbulence gets stronger, a larger segment size no longer guarantees better throughput. In Fig. 11(b), a throughput improvement is seen only for TCP Cubic and HSTCP when the segment size increases from 500 to 1000 bytes, while the throughput degradation is severe for TCP Hybla and TCP NewReno. With TCP Hybla, the throughput is 500 Mbps when the segment size is 500 bytes but falls below 100 Mbps when the segment size is 1500 bytes.
The throughput deterioration of these TCP variants becomes more severe when the turbulence is strong, as shown in Fig. 11(c). This behavior is expected: as the turbulence becomes moderate or strong, the transmission loss probability increases, so a larger segment size tends to increase the segment loss probability. Based on this analysis, optimal segment sizes can be determined to achieve the maximum throughput under a particular turbulence condition. For example, the segment size should be set to 1000 bytes for TCP Cubic and HSTCP when the turbulence strength is moderate.
Another essential issue in this network is the impact of the zenith angle on the TCP throughput. Fig. 12 illustrates the relationship between the throughput of TCP Cubic and the zenith angle for different satellite altitudes and turbulence conditions. As is evident, the throughput is higher when the zenith angle is small because, in the LEO satellite-to-vehicle optical link, the turbulence strength is weaker at small zenith angles. In addition, an increase in satellite altitude causes a longer transmission duration, which exceeds the channel coherence time and leads to throughput degradation.
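The dependence on the zenith angle can be illustrated with a short sketch. It assumes the commonly used weak-turbulence expression for a downlink, in which the Rytov variance is proportional to $\sec^{11/6}(\zeta)$ for zenith angle $\zeta$. This scaling is an assumption brought in for illustration; this section does not restate the turbulence model, and the function name is hypothetical.

```python
import math


def zenith_scaling(zenith_deg):
    """Relative scintillation strength sec(zeta)^(11/6), normalized to zenith.

    Assumed weak-turbulence downlink scaling; it diverges as the angle
    approaches 90 degrees, so the input is restricted to [0, 90).
    """
    if not 0.0 <= zenith_deg < 90.0:
        raise ValueError("zenith angle must lie in [0, 90) degrees")
    return (1.0 / math.cos(math.radians(zenith_deg))) ** (11.0 / 6.0)


# Scaling factor relative to a satellite directly overhead (zenith angle 0).
factors = {z: zenith_scaling(z) for z in (0, 30, 60)}
print(factors)
```

Under this assumption, the factor grows monotonically with the zenith angle, which is consistent with the lower throughput observed at larger zenith angles.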
We now compare the performance of the TCP variants in the scenario of satellite-assisted hybrid FSO/RF vehicular networks. Fig. 13 compares the TCP variants in terms of average throughput over a range of average received SNR values of the FSO link. First, it is seen that, overall, TCP Cubic achieves the best throughput performance, while NewReno has the worst performance compared to HSTCP and TCP Hybla. The performance difference can be explained by the way the TCP variants react when loss events happen. For instance, unlike HSTCP, Hybla, and NewReno, the window growth function of TCP Cubic is defined in real time, independent of the RTT, which can be long in LEO satellite links. In addition, because Cubic increases its window size slowly when it is close to the previous saturation point (i.e., the window size at the last loss event), its performance becomes less efficient when losses occur frequently (i.e., at low SNR). The performance of HSTCP is better in this case because its modified congestion control mechanism flexibly adapts the increase and decrease parameters.
Finally, the performance of the TCP variants is analyzed under the impact of turbulence conditions on the LEO satellite-to-vehicle channel at the last-mile link. Fig. 14 depicts the average TCP throughput versus atmospheric turbulence strength, represented by the ground turbulence level $C_n^2(0)$. It is clear that strong turbulence significantly deteriorates the TCP performance, especially for Hybla and NewReno. For example, TCP Cubic can achieve the maximum throughput for the turbulence range from $C_n^2(0) = 5 \times 10^{-15}\ \text{m}^{-2/3}$ to $C_n^2(0) = 7 \times 10^{-14}\ \text{m}^{-2/3}$, while TCP Hybla does so only from $C_n^2(0) = 5 \times 10^{-15}\ \text{m}^{-2/3}$ to $C_n^2(0) = 10^{-14}\ \text{m}^{-2/3}$. In another example, when $C_n^2(0) = 3 \times 10^{-13}\ \text{m}^{-2/3}$, which is a strong turbulence condition, TCP Cubic can achieve 400 Mbps, whereas TCP Hybla achieves only 50 Mbps.
This throughput degradation arises because, as the turbulence strength increases, transmission losses occur more frequently and reduce the TCP sending rate.
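The RTT-independence of Cubic's window growth noted in the comparison of TCP variants can be made concrete with a small sketch. It uses the window function in the form standardized in RFC 8312, $W(t) = C(t - K)^3 + W_{\max}$ with $K = \sqrt[3]{W_{\max}(1 - \beta)/C}$, where the window after a loss is $\beta W_{\max}$. The default values $C = 0.4$ and $\beta = 0.7$ are the RFC defaults and are assumptions here, not necessarily the settings of Table 2. The NewReno recovery time assumes an increase of one segment per RTT after halving the window. All function names are hypothetical.

```python
def cubic_recovery_time(w_max, C=0.4, beta=0.7):
    """Time K (seconds) for Cubic to grow from beta * w_max back to w_max."""
    return (w_max * (1.0 - beta) / C) ** (1.0 / 3.0)


def cubic_window(t, w_max, C=0.4, beta=0.7):
    """Cubic window (segments) t seconds after a loss at window size w_max."""
    K = cubic_recovery_time(w_max, C, beta)
    return C * (t - K) ** 3 + w_max


def newreno_recovery_time(w_max, rtt, beta=0.5):
    """Time for NewReno to regain w_max when adding one segment per RTT."""
    return (1.0 - beta) * w_max * rtt


w_max = 100.0
print(cubic_recovery_time(w_max))            # does not depend on the RTT
for rtt in (0.05, 0.1):
    print(rtt, newreno_recovery_time(w_max, rtt))  # scales linearly with the RTT
```

The flat region of `cubic_window` around `t = K` also reflects the slow growth near the previous saturation point mentioned in the discussion of Fig. 13.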

VI. CONCLUSION
We presented a cross-layer analysis of TCP performance over LEO satellite-assisted hybrid FSO/RF vehicular networks. A truncated SR-ARQ and an RS code were employed to improve the link-layer and physical-layer performance, respectively. Using the proposed channel model and error-control method, we developed a comprehensive end-to-end throughput model for several TCP variants, including NewReno, Hybla, Cubic, and HSTCP, under the impact of both congestion losses in the Internet section and LEO satellite-based hybrid FSO/RF transmission errors at the last-mile link. We quantitatively illustrated the impact of transmission errors at last-mile links, comparatively discussed the throughput performance of the TCP variants, and supported the selection of a proper TCP variant for the considered networks. The results revealed that Cubic is the most suitable of the four TCP variants for the considered networks. HSTCP and Hybla could also be used; however, NewReno was inefficient in high-speed satellite networks, especially in strong turbulence conditions. Finally, Monte Carlo simulations were performed to validate the correctness of the analytical model.