Single- versus Multi-Carrier Terahertz-Band Communications: A Comparative Study

The prospects of utilizing single-carrier (SC) and multi-carrier (MC) waveforms in future terahertz (THz)-band communication systems remain unresolved. On the one hand, the limited multi-path components at high frequencies result in frequency-flat channels that favor low-complexity wideband SC systems. On the other hand, frequency-dependent molecular absorption and transceiver characteristics and the existence of multi-path components in indoor sub-THz systems can still result in frequency-selective channels, favoring off-the-shelf MC schemes such as orthogonal frequency-division multiplexing (OFDM). Variations of SC/MC designs result in different THz spectrum utilization, but spectral efficiency is not the primary concern with substantial available bandwidths; baseband complexity, power efficiency, and hardware impairment constraints are predominant. This paper presents a comprehensive study of SC/MC modulations for THz communications, utilizing an accurate wideband THz channel model and highlighting the various performance and complexity trade-offs of the candidate schemes. Simulations demonstrate that discrete-Fourier-transform spread orthogonal time-frequency space (DFT-s-OTFS) achieves a lower peak-to-average power ratio (PAPR) than OFDM and OTFS and enhances immunity to THz impairments and Doppler spreads, but at an increased complexity cost. Moreover, DFT-s-OFDM is a promising candidate that increases robustness to THz impairments and phase noise (PHN) at a low PAPR and overall complexity.


I. INTRODUCTION
THE successful deployment of millimeter-wave (mmWave) communications [1] has encouraged researchers to explore the last piece of available spectrum, the terahertz (THz) band over 0.3-10 THz, which promises to be an essential ingredient of future ultra-broadband wireless communications [2], [3]. Moving towards beyond-fifth generation (B5G) and sixth-generation (6G) wireless networks [4], [5], a plethora of services are expected to be supported [6], such as ultra-low latency communications, ubiquitous connectivity, and very high data rates (up to several terabits-per-second (Tbps)). Such features can be leveraged in novel use cases in fixed radio links, wireless local area networks, nano cells, or inter-chip communications. Furthermore, accurate localization, sensing, and imaging applications are promised in the THz band [7], [8]. However, researchers should first overcome several challenges in THz materials and technologies (photonic and electronic) and the corresponding system designs and hardware complexity [9], [10].
S. Tarboush is with Telecommunication Department, Higher Institute for Applied Sciences and Technology (HIAST), Damascus, Syria (e-mail: simon.w.tarboush@gmail.com). The rest of the authors are with the Department of Computer, Electrical and Mathematical Sciences and Engineering (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal, Makkah Province, Kingdom of Saudi Arabia, 23955-6900 (e-mail: hadi.sarieddeen@kaust.edu.sa; slim.alouini@kaust.edu.sa; tareq.alnaffouri@kaust.edu.sa).
This publication is based upon work supported by the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award No. ORA-CRG2021-4695.
The THz-band channel's peculiarities (frequency/distance dependency and sparsity) impose challenging constraints on the physical layer of future wireless standards. THz signals suffer from severe path loss, which limits the transmission distances to a few meters [11]. However, long-distance sub-THz communications (over hundreds of meters) are still feasible with high-gain antenna arrays [12]. The frequency-selective molecular absorption further results in distance-dependent spectrum fragmentation and shrinking (variable-bandwidth transmission windows) [13]. Hence, ultra-massive multiple-input multiple-output (UM-MIMO) antenna arrays and intelligent reflecting surfaces (IRSs) are essential for extending the THz communication range [13]-[15]. Furthermore, since the line-of-sight (LoS) path dominates THz-band signal propagation, THz channels tend to be flat-fading. However, a few multi-path components might persist, especially in indoor scenarios, resulting in frequency-selective channels (FSCs) with coherence bandwidths of hundreds of megahertz (MHz) over medium communication distances [16]. Therefore, THz multi-carrier (MC) schemes retain scenario-specific benefits.
Designing efficient THz-specific waveforms is crucial for unleashing the THz band's true capabilities. Because bandwidth and spectral efficiency (SE) are not yet a THz bottleneck, low complexity, robustness to hardware impairments and Doppler spreads, and high power efficiency are prioritized. The first sub-THz standard (IEEE 802.15.3d [17]) supports switched point-to-point connectivity with data rates exceeding 100 Gbps, offering two modes: (1) single-carrier (SC) modulation (long-range; high-rate) and (2) on-off keying (OOK) (low-complexity; short-range). OOK utilizes femtosecond-long pulses that could span an ultra-wideband THz spectrum [18]. However, temporal broadening [16] and the challenging synchronization procedure call the feasibility of pulse-based modulation into question. IEEE 802.15.3d-compliant waveforms are proposed in [19], where novel pulse-shaping designs reduce out-of-band (OOB) emissions. Several other projects revisit the physical layer for future B5G sub-THz systems. Most notably, the BRAVE project [20] advocates for modified SC schemes, such as continuous phase modulated single-carrier frequency-division multiple-access (CPM SC-FDMA), constrained envelope CPM-SC, differential modulation (like differential phase-shift keying), SC with optimized polar modulation (robust to phase noise (PHN)) [21], and variations of spatial and index modulation [22]-[24]. Block-based SC waveforms, such as discrete-Fourier-transform spread OFDM (DFT-s-OFDM) [25], can also be investigated.
A variety of THz MC schemes can be explored. In the simplest form, multiple (quasi-)orthogonal non-overlapping SC modulations can be combined with some form of carrier aggregation [26]. Cyclic-prefix orthogonal frequency-division multiplexing (CP-OFDM) is well investigated, but it is discouraged at THz [27], [28] due to its strong spectral leakage (high OOB emissions), unfavorable peak-to-average power ratio (PAPR) properties (limitations in state-of-the-art THz power amplifiers (PAs) [29]), strict synchronization procedures, and high sensitivity to Doppler spread. Other MC schemes, such as novel fifth-generation new-radio (5G-NR) filter-based candidates, have their prospects and challenges. Such filtering is applied on the whole band in filtered-OFDM (f-OFDM) [30], per subband (a set of contiguous subcarriers) in universal filtered multi-carrier (UFMC) [31], or per subcarrier in offset quadrature amplitude modulation-based filter-bank multi-carrier (OQAM/FBMC) [32] and generalized frequency-division multiplexing (GFDM) [33]. Although filter-based schemes overcome some CP-OFDM limitations, reducing OOB emissions and enhancing SE, their high PAPR characteristics and increased implementation complexity can be prohibitive in Tbps baseband systems. For example, the single-tap equalizer is no longer sufficient with CP-free OQAM/FBMC, which requires more complex equalization. Other works propose windowed overlap-and-add OFDM (WOLA-OFDM) [34], or combinations such as OQAM/GFDM [35].
THz-specific multiple-access techniques are also emerging, such as distance-adaptive MCs [27], hierarchical-bandwidth modulations [28], and distance-/frequency-dependent adaptive CP-OFDM [36], which optimize distance-dependent spectral window utilization. Other works, such as [37], develop a novel distance-adaptive absorption peak modulation, mainly for THz covert communications, by exploiting the unique properties of the THz spectrum (frequency-dependent molecular absorption) through dynamically modulating signals under the molecular absorption peaks. Moreover, the work in [38] focuses on multi-band-based spectrum allocation with adaptive subband bandwidth to improve the SE of MC-enabled multi-user THz communications, where subbands with unequal bandwidths can be assigned to the users. Spatial-spread orthogonal frequency-division multiple-access (SS-OFDMA) is another THz MC candidate that realizes frequency-based beam spreading by allocating subcarriers for users in different directions [39]. Similarly, beam-division multiple-access (BDMA) [40] schedules mutually non-overlapping beam subsets for users, followed by relaxed per-beam synchronization. Moreover, THz-band non-orthogonal multiple access (NOMA) techniques are argued to be feasible, despite the narrow beams that make user clustering difficult [41]. Other conventional techniques that improve SE at a reduced power cost, PAPR, and transceiver complexity are also being studied for THz communications, including spatial [15] and index modulation [42] paradigms.
Other novel waveforms target specific THz use cases and constraints. For instance, zero-crossing modulation [43] uses temporal oversampling and 1-bit quantization to relax hardware requirements, such as in the digital-to-analog converter (DAC) and analog-to-digital converter (ADC). Furthermore, the orthogonal time-frequency space (OTFS) waveform [44] is tailored for time-variant (TV) channels and high Doppler spreads, which arise in high-speed THz communication scenarios such as vehicle-to-everything (V2X), drone, and ultra-high-speed rail communications. OTFS is superior in block error rate performance to CP-OFDM when assuming mmWave LoS V2X channels [45], and also when accounting for oscillator PHN impairments [46]. To meet the THz integrated sensing and communication (ISAC) requirements, the utilization of DFT-s-OFDM, with some modifications, is discussed in [47]. Most recently, a novel scheme called DFT-s-OTFS is proposed [48], [49] to address the severe Doppler effects and PAPR challenges of THz ISAC.
Many performance metrics need to be considered when designing THz waveforms, such as bit error rate (BER), PAPR, and baseband computational complexity. Furthermore, hardware imperfections and radio frequency (RF) impairments critically impact THz waveform design, as candidate THz materials/hardware are still under development. Hardware imperfections include PA non-linearity, wideband in-phase/quadrature imbalance (IQI) [50], phase uncertainty in the phase-shifters (PSs) [51], and PHN (studied for SC schemes [52] and CP-OFDM [53] in sub-THz and THz [50] systems). THz channel-induced phenomena such as beam split and misalignment [54] are also critical, especially with UM-MIMO systems. Moreover, synchronization becomes more challenging with carrier frequency offset (CFO) and symbol-timing offset (STO) at THz frequencies. Subcarrier spacing (SCS), its impact on PHN, and the design of phase-tracking reference signals are studied in [55], [56] to assess whether CP-OFDM and DFT-s-OFDM can support mmWave and sub-THz communications. Moreover, a THz SC frequency-domain equalization technique (SC-FDE) is developed in [50], and a pilot design strategy based on index modulation is proposed in [57]. SC systems are found superior to CP-OFDM in mmWave systems [58] when taking into account the transmitter PA non-linearities. For indoor THz scenarios, SC-FDMA with linear equalization is shown to be superior to CP-OFDM and SC with linear/decision-feedback equalization [59].
The literature lacks a holistic and fair comparative study of THz-band SC/MC schemes, and this work attempts to fill this gap. The main aim is to analyze a plethora of candidate waveforms to draw recommendations on the suitable waveforms for specific THz use cases. The main contributions of this paper are summarized as follows:
• Studying the THz compatibility of multiple waveforms, namely, SC-FDE, CP-OFDM, DFT-s-OFDM, OTFS, DFT-s-OTFS, and OQAM/FBMC, adopting our newly developed accurate THz channel model/simulator (TeraMIMO [54]).
• Promoting DFT-s-OFDM and DFT-s-OTFS as promising schemes for future B5G/6G networks.
The remainder of this paper is organized as follows: Sec. II first introduces the system and channel models. Then, Sec. III presents several key performance indicators (KPIs) to compare different waveforms. Sec. IV details a general framework for analyzing the studied SC/MC waveforms. Afterward, extensive simulation results validate our analyses in Sec. V, where recommendations of suitable waveforms for specific scenarios are introduced. Sec. VI concludes the paper.
Regarding notation, non-bold lower case, bold lower case, and bold upper case letters correspond to scalars, vectors, and matrices, respectively: $a[n]$ denotes the $n$th element of $\mathbf{a}$, and $\mathbf{a}[n]$ and $a[m, n]$ denote the $n$th column and the $(m, n)$th element of $\mathbf{A}$, respectively. $\mathbf{I}_N$ is the identity matrix of size $N$, $\mathbf{0}_{N,M}$ is a zero matrix of size $N \times M$, and $\mathbf{a}_N$ is a vector of size $N$. The superscripts $(\cdot)^T$, $(\cdot)^*$, $(\cdot)^H$, $(\cdot)^{-1}$, and $(\cdot)^n$ stand for the transpose, conjugate, conjugate transpose, inverse, and $n$th-power functions, respectively. $|\cdot|$ is the absolute value (or set cardinality), $\mathrm{diag}(a_0, a_1, \ldots, a_{N-1})$ is an $N \times N$ diagonal matrix of diagonal entries $a_0, a_1, \ldots, a_{N-1}$, $\mathrm{vec}(\mathbf{A})$ is the vectorized matrix representation that stacks the columns of $\mathbf{A}$ in a single column, $\mathbb{E}(\cdot)$ is the expectation operator, and $\Pr(\cdot)$ is the probability density function. The notations $\otimes$, $[\cdot]_N$, $\langle \cdot, \cdot \rangle$, $\Re(\cdot)$, and $j = \sqrt{-1}$ denote the Kronecker product, remainder modulo $N$, inner product, real part, and imaginary unit, respectively. The superscripts (t) and (r) denote transmitter (Tx) and receiver (Rx) parameters, respectively. $\mathcal{N}(\mu, \sigma^2)$ is the distribution of a Gaussian random variable of mean $\mu$ and variance $\sigma^2$, and $\mathcal{CN}(\mathbf{a}, \mathbf{\Sigma})$ is the distribution of a complex Gaussian random vector of mean $\mathbf{a}$ and covariance matrix $\mathbf{\Sigma}$. The normalized $N$-point DFT and IDFT matrices are denoted by $\mathbf{F}_N$ and $\mathbf{F}_N^H$, respectively. The used acronyms are summarized in Tables VI and VII.

II. SYSTEM AND CHANNEL MODEL
The main aim of this work is to evaluate the performance of candidate SC/MC waveforms in realistic THz settings, including massive antenna dimensions and ultra-wide bandwidths. We adopt the array-of-subarrays (AoSA) architecture of TeraMIMO [54], in which each subarray (SA) is composed of many antenna elements (AEs), as depicted in Fig. 1. AoSAs can mitigate THz hardware constraints and combat the limited communication distance problem using low-complexity beamforming [10]. The model assumes $Q$ SAs arranged in a two-dimensional grid, and $\bar{Q}$ tightly packed directional AEs per SA, likewise arranged in a two-dimensional grid. Each AE is attached to a wideband THz analog PS of acceptable phase error, return loss, and insertion loss [51] (such PSs can be implemented using graphene transmission lines in plasmonic solutions [60]). The AoSAs are assumed to realize sub-connected hybrid beamforming, with analog beamforming over the AEs of each SA. Each RF chain thus drives one disjoint SA, reducing power consumption and complexity; the SAs provide the spatial diversity gain.
For SCs, this work considers SC-FDE, DFT-s-OFDM, and DFT-s-OTFS. For MCs, the work investigates CP-OFDM, OQAM/FBMC, and OTFS, assuming $M$ subcarriers. Assuming perfect time and frequency synchronization (no STO or CFO), the Rx signal at the $m$th subcarrier is obtained by processing the received signal using an RF combining matrix and a digital baseband combining matrix; the noise term is the additive white Gaussian noise (AWGN) vector of independently distributed $\mathcal{CN}(\mathbf{0}_{Q^{(r)}\bar{Q}^{(r)}}, \sigma_n^2 \mathbf{I}_{Q^{(r)}\bar{Q}^{(r)}})$ elements of noise power $\sigma_n^2$. Note that $N_{tot} = N_{st} \times N$, where $N_{st} \leq Q^{(t)}$ is the number of data streams ($Q^{(t)}$ is also the number of Tx RF chains), and $N$ is the number of MC symbols per frame.
The UM-MIMO channel matrix, $\mathbf{H}[m]$, represents the overall complex channel at the $m$th subcarrier. Assuming a time-invariant (TIV)-FSC, $\mathbf{H}[m]$ is built from blocks, each of which denotes the channel response between the $q^{(t)}$th Tx SA and the $q^{(r)}$th Rx SA. Further details on the channel model can be found in [54] and equations therein (Eqs. (15) and (16) in the delay and frequency domains, respectively). The discrete-time Tx complex baseband signal at the $m$th subcarrier is obtained by applying the digital baseband precoding matrix per subcarrier, $\mathbf{P}_{BB}[m] \in \mathbb{C}^{Q^{(t)} \times N_{tot}}$, followed by the analog RF beamforming matrix, to the information-bearing symbol vector $\mathbf{s}[m] = [s_1, s_2, \ldots, s_{N_{tot}}]^T \in \mathcal{X}^{N_{tot} \times 1}$, which consists of data symbols drawn from a quadrature amplitude modulation (QAM) constellation, $\mathcal{X}$. We assume normalized symbols with a covariance proportional to $\mathbf{I}_{N_{tot}}$, where $P_t$ is the average total Tx power over the $M$ subcarriers. We adopt this model for simulating THz-specific beam-split effects. For other scenarios, the system model reduces to a single-input single-output (SISO) model. We adapt the TeraMIMO THz channel simulator [54] to account for diverse scenarios.

III. KEY PERFORMANCE INDICATORS FOR SC/MC WAVEFORM PERFORMANCE EVALUATION
Choosing a suitable waveform is a challenging task that depends on several conflicting communication system performance requirements and design criteria. For fairness of comparison, we consider the transmission of $N \times M$ complex symbols of bandwidth $B = M\Delta f$, with SCS $\Delta f$ and frame duration $T_f = NT$, for both SC and MC schemes; the signal is defined over a time-frequency (TF) grid with steps $T$ and $\Delta f$. Similarly, the delay-Doppler (DD) plane is discretized into a grid with steps $\frac{1}{NT}$ and $\frac{1}{M\Delta f}$, which define the Doppler and delay domain resolutions, respectively. The maximum supported Doppler and delay spreads are $\nu_{max} = \frac{v f_c}{c} < 1/T$ and $\tau_{max} < 1/\Delta f$, respectively, where $v$ is the user velocity, $c$ is the speed of light, and $f_c$ is the carrier frequency.
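The grid quantities above can be checked numerically with a minimal sketch. The function names are hypothetical, and the sketch assumes a critically sampled grid with T = 1/Δf, which the text does not state explicitly.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def max_doppler_hz(velocity_mps, carrier_hz):
    """Maximum Doppler shift nu_max = v * f_c / c."""
    return velocity_mps * carrier_hz / SPEED_OF_LIGHT


def dd_resolutions(num_symbols, num_subcarriers, scs_hz):
    """Return (Doppler resolution in Hz, delay resolution in s) of the DD grid.

    Assumption: symbol duration T = 1 / scs_hz (critically sampled grid).
    """
    symbol_duration = 1.0 / scs_hz
    doppler_res = 1.0 / (num_symbols * symbol_duration)
    delay_res = 1.0 / (num_subcarriers * scs_hz)
    return doppler_res, delay_res


def doppler_supported(velocity_mps, carrier_hz, scs_hz):
    """Check nu_max < 1/T, which equals nu_max < scs_hz under the assumption above."""
    return max_doppler_hz(velocity_mps, carrier_hz) < scs_hz


if __name__ == "__main__":
    # Illustrative numbers only: 100 m/s user at a 0.3 THz carrier.
    print(max_doppler_hz(100.0, 0.3e12))
    print(dd_resolutions(num_symbols=16, num_subcarriers=64, scs_hz=1e6))
```

Even at high speeds, the resulting Doppler shift stays far below a megahertz-scale SCS, which is why the SCS choice is driven mostly by delay spread and hardware impairments.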
The discrete-time representation of (6) is obtained by Nyquist sampling at $F_s = \frac{1}{T_s} = B$ (limited by ADC/DAC specifications), over a finite set of sample indices starting at 0. The PAPR of a discrete-time signal $x[u]$ over a finite observation period $N_{per}$ is expressed as a random variable [61], namely, the ratio of the peak instantaneous power within the observation period to the average signal power, the statistical behavior of which can be estimated through numerical simulations. However, the PAPR of the discrete-time baseband signal, $x[u]$, is noticeably lower than the PAPR of the continuous-time baseband signal, $x(t)$. Thus, we perform interpolation (oversampling) by a factor of at least 4 to obtain a PAPR close to that of $x(t)$. We characterize the complementary cumulative distribution function (CCDF) of PAPR. In the remainder of this section, we detail various KPIs and introduce several schemes, namely, CP-OFDM, DFT-s-OFDM, SC-FDE, OQAM/FBMC, OTFS, and DFT-s-OTFS, as illustrated in Fig. 3.
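A minimal sketch of this numerical PAPR estimation follows. It assumes QPSK symbols, oversampling by zero-padding the spectrum before the IFFT, and a sample-mean estimate of the average power; the function names are hypothetical and do not correspond to the simulator used in this work.

```python
import numpy as np


def ofdm_symbol(num_subcarriers, oversampling, rng):
    """One oversampled OFDM symbol carrying unit-power QPSK symbols."""
    bits = rng.integers(0, 2, size=(num_subcarriers, 2))
    qpsk = ((2 * bits[:, 0] - 1) + 1j * (2 * bits[:, 1] - 1)) / np.sqrt(2)
    # Zero-padding the middle of the spectrum interpolates the time signal.
    padded = np.zeros(num_subcarriers * oversampling, dtype=complex)
    half = num_subcarriers // 2
    padded[:half] = qpsk[:half]
    padded[padded.size - (num_subcarriers - half):] = qpsk[half:]
    return np.fft.ifft(padded)


def papr_db(x):
    """Peak power over mean power of the observed samples, in dB."""
    power = np.abs(x) ** 2
    return 10 * np.log10(power.max() / power.mean())


def papr_ccdf(samples_db, thresholds_db):
    """Empirical Pr(PAPR > threshold) for each threshold."""
    samples_db = np.asarray(samples_db)
    return np.array([(samples_db > t).mean() for t in thresholds_db])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    paprs = [papr_db(ofdm_symbol(64, 4, rng)) for _ in range(2000)]
    print(papr_ccdf(paprs, thresholds_db=[6.0, 8.0, 10.0]))
```

Setting `oversampling=1` in the same script gives the Nyquist-rate estimate, which makes the gap between discrete-time and near-continuous-time PAPR visible.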

A. Spectral Efficiency and Transmit Time Interval Latency
The SE (bits/sec/Hz) is an essential indicator of throughput and achievable rate for a given bandwidth. Since the THz band promises huge available bandwidths, unlike below-6 GHz communications, SE is not a primary concern. However, SE is still important for data-demanding use cases, such as THz-enabled holographic video meetings, augmented reality (AR), and virtual reality (VR). Similarly, the transmit time interval (TTI) latency, defined as the minimum time to transmit each packet of data [62], is waveform-dependent (overlapping in OQAM/FBMC lengthens the frame duration, for example). Nevertheless, the ultra-broadband THz bandwidth ($B$) ensures a very small sampling period ($T_s$). Note that physical layer latency includes other delay components [62], such as the signal processing time of the equalizer and channel encoder/decoder; these are not included in our latency definition and computations. Signal processing latency is more critical at THz frequencies and depends on the used waveform.

B. Power Spectral Density and Out-of-Band Emissions
The power spectral density (PSD) and OOB emissions follow strict standard regulations to meet spectrum mask requirements. For example, the International Telecommunication Union (ITU) radio regulation 5.340 prohibits transmissions in ten passive bands over 100-252 GHz to protect deep space observatories and satellite sensors [63], resulting in a maximum available contiguous bandwidth of 23 GHz. OOB emissions are also critical in integrated space-air-ground THz networks. It is thus important to study OOB emissions-induced interference to neighboring systems and among multiple users, highlighting the role of carrier-aggregation techniques. The severity of OOB emissions is dictated by bandwidth, required SE, and neighboring co-operating systems. The waveform Tx spectra in the IEEE 802.15.3d sub-THz standard are described for different bandwidths in [17].

C. Transceiver Complexity
The computational complexity of the studied SC/MC transceivers is arguably the most important KPI to consider, given the limited processing capabilities at Tbps and the need for low-cost and low-power solutions. Without loss of generality, the computations only consider the number of real multiplications per unit of time in the modulation, demodulation, and equalization processes. The complexity of channel coding and decoding is important in its own right but is not included in our study.
The number of real multiplications in an $M$-point fast Fourier transform (FFT)/inverse FFT (IFFT) (split-radix algorithm) is [64]
$$C_{FFT}(M) = M\log_2(M) - 3M + 4.$$
As illustrated in Fig. 3(a), IFFT/FFT is followed by rectangular pulse-shaping in CP-OFDM, resulting in a complexity ($\bar{C}_{OFDM}$) and a number of multiplications per unit time ($C_{OFDM}$) that account for the IFFT/FFT pair and equalization. Furthermore, in the case of DFT-s-OFDM (Fig. 3(b)), the equalization complexity remains the same, while an additional precoding FFT/IFFT block in Tx/Rx increases the overall count. SC-FDE enjoys relatively low Tx complexity as symbols are directly transmitted after CP insertion (Fig. 3(c)). However, with FFT/IFFT at the Rx, the overall transceiver complexity is that of CP-OFDM (complexity shift from Tx to Rx); $\bar{C}_{OFDM} = \bar{C}_{SC\text{-}FDE}$. For OQAM/FBMC, we consider the direct form polyphase prototype filter realization, with a filter length, $L_p$, equal to the number of subcarriers multiplied by the pulse-shaping overlapping factor. In general, a multi-tap channel equalization per subcarrier with an equalizer of length $L_{eq}$ is used for this waveform. $\bar{C}_{FBMC}$ and $C_{FBMC}$ account for OQAM, phase offsets (for linear phase filters), IFFT, filtering, 50% overlapping, and equalization [32], where a first multiplication by a factor of 2 accounts for complex-valued QAM symbols that are separated into two real-valued symbols. The OQAM/FBMC complexity is slightly dependent on the overlapping factor. Note that we only assume a one-tap equalizer in simulations ($L_{eq} = 1$). OQAM/FBMC is clearly more complex than CP-OFDM. For OTFS¹, based on (42), the complexity ($\bar{C}_{OTFS}$) and the number of multiplications per unit time ($C_{OTFS}$) are functions of both frame dimensions, one of which enters quadratically. Moreover, in the case of DFT-s-OTFS (Fig. 3(f)), the complexity follows the same logic as that of DFT-s-OFDM. From (14) and (15), we note that the complexities of OTFS and DFT-s-OTFS are dominated by DD equalization.
¹ In this work, we use OTFS with rectangular Tx and Rx windowing and pulse-shaping, and consider one CP per frame ($N \times M$ symbols), which results in a low-complexity implementation [65]. This setting is different from the OTFS setting in [44], whose per-side (Tx/Rx) complexity combines FFT terms with a CP-dependent term, and from the OFDM-based OTFS setting in [46], which adds CPs on a per-block basis (each block is of length $M$). See Sec. IV-E for more details.
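To give a feel for the FFT term that appears in all of these counts, the following sketch tabulates the classical split-radix real-multiplication count, M log2(M) - 3M + 4. This is an assumption about the count cited from [64], and the function name is hypothetical; the waveform-specific totals of this section are not reproduced here.

```python
def split_radix_real_mults(n):
    """Real multiplications of a split-radix complex FFT of power-of-two size n.

    Assumption: classical count n*log2(n) - 3n + 4.
    """
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two and at least 2")
    log2_n = n.bit_length() - 1
    return n * log2_n - 3 * n + 4


if __name__ == "__main__":
    for size in (64, 256, 1024, 4096):
        count = split_radix_real_mults(size)
        print(f"{size:5d}-point FFT: {count:6d} real multiplications "
              f"({count / size:.2f} per sample)")
```

The per-sample cost grows only logarithmically with the FFT size, so the extra DFT precoding block of DFT-s-OFDM adds a modest overhead compared with equalizers whose cost grows faster with the frame dimensions.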

D. Peak to Average Power Ratio
PAPR is an essential KPI for sub-THz/THz communications as it dictates the Tx power efficiency, which affects energy efficiency, link budget, and coverage. Large amplitude fluctuations in high-PAPR signals lead to spectral regrowth and non-linear distortion; an output back-off is thus needed to stay within the linear PA region, reducing power efficiency. Processing ultra-wide bandwidth sub-THz/THz signals is also very power consuming. Moreover, the saturated output power ($P_{sat}$) recordings in state-of-the-art THz PAs [29] reveal limited achievable output power that decreases drastically with operating frequency (the trend lines for different technologies follow a steeper slope). For example, $P_{sat} \approx$ 20 dBm, 23 dBm, and 28 dBm at an operating frequency of 100 GHz for CMOS, SiGe BiCMOS, and InP technologies, respectively. Furthermore, high PAPR necessitates high dynamic-range THz ADCs of low signal-to-quantization-noise ratios, which are not cost- and power-efficient [67]. The ADC signal-to-noise and distortion ratio (SNDR) decreases with increasing Nyquist sampling rate. In addition, the energy per conversion step increases linearly with frequencies beyond 100 MHz [68]. For example, for an ADC with a sampling rate of 100 GHz, the power consumption and SNDR are approximately 0.3 Watt (very high) and 35 dB (very low), respectively. The PAPR CCDF of one CP-OFDM symbol ($N = 1$) is expressed as [61]
$$\Pr(\mathrm{PAPR} > \gamma_{th}) \approx 1 - \left(1 - e^{-\gamma_{th}}\right)^{M}$$
for a PAPR threshold $\gamma_{th}$. Furthermore, the closed-form approximation of the PAPR CCDF of OQAM/FBMC in [61] reveals higher PAPR values compared to CP-OFDM due to per-subcarrier filtering. In [69], the PAPR CCDF of discrete-time OTFS (no oversampling and rectangular pulse-shaping) is approximated for high values of $N$ as
$$\Pr(\mathrm{PAPR} > \gamma_{th}) \approx 1 - \left(1 - e^{-\gamma_{th}}\right)^{MN}.$$
The work in [69] shows that the PAPR CCDF of OTFS increases with $N$ as the probability of having large peaks increases. However, the maximum OTFS PAPR is upper-bounded by a linear function of $N$ [69], unlike TF MC waveforms, such as OFDM, where the PAPR grows linearly with the number of subcarriers $M$. Note that generalizing (16) over the entire frame approximates (17); OTFS provides significantly better PAPR than OFDM for $N < M$. Nevertheless, the OTFS PAPR is still not energy-efficient for THz system design. This problem is solved by using a DFT spreading block with OTFS in the uplink [48], where the PAPR upper bound grows linearly with the DFT spreading size. Since the DFT spreading size is less than the number of OTFS symbols in a frame ($N$) and the number of subcarriers ($M$) in a wideband THz channel, we expect that DFT-s-OTFS can achieve lower PAPR than both OTFS and OFDM. Thus, DFT-s-OTFS promises to be a more energy-efficient solution for future THz communications. Furthermore, other SCs inherently result in low PAPR, whether in SC-FDE or DFT-s-OFDM, due to DFT-precoding.
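The closed-form CCDF behavior discussed above can be explored with a short sketch. It assumes the standard Gaussian approximation 1 - (1 - exp(-γ))^K for K independent Nyquist-rate samples; using K = M for one OFDM symbol and K = M·N for a whole frame is an illustrative assumption, and the function name is hypothetical.

```python
import numpy as np


def papr_ccdf_closed_form(threshold_db, num_independent_samples):
    """Approximate Pr(PAPR > threshold) for complex Gaussian-like samples.

    Assumption: samples are independent with exponentially distributed power.
    """
    gamma = 10.0 ** (np.asarray(threshold_db, dtype=float) / 10.0)
    return 1.0 - (1.0 - np.exp(-gamma)) ** num_independent_samples


if __name__ == "__main__":
    thresholds = np.array([6.0, 8.0, 10.0, 12.0])
    print("M = 64      :", papr_ccdf_closed_form(thresholds, 64))
    print("M*N = 64*16 :", papr_ccdf_closed_form(thresholds, 64 * 16))
```

The curve shifts to the right as the number of samples grows, which matches the observation that large peaks become more probable for longer observation windows.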

E. Robustness to Hardware Impairments
THz-band transceivers are substantially more vulnerable to conventional RF impairments than microwave and mmWave transceivers. Therefore, the waveform's robustness to impairments is a critical KPI. We focus on two important hardware impairments.
1) Phase Noise: Due to time-domain instability, the local oscillator (LO) output can be a phase-modulated tone. PHN in THz devices (that are not yet mature) has more severe consequences than in microwave or mmWave devices. The motivation to use low-cost devices for THz communications is also limiting, as achieving low PHN requires advanced complex techniques such as phase-locked loops [70]. In particular, if the THz LO signal is generated using a low-cost low-frequency oscillator followed by frequency multipliers, the required multiplication factor is relatively high, and the PHN power increases with the square of this factor. Therefore, PHN increases by 6 dB for every doubling of the oscillation frequency [70]. Furthermore, PHN causes significant performance degradation and reduces the effective signal-to-interference-plus-noise ratio (SINR) at the Rx, limiting both data rate and BER. Unfortunately, increasing the signal-to-noise ratio (SNR) does not mitigate the PHN effects. Therefore, optimized SC schemes and non-coherent modulations that are inherently robust to PHN are argued to be good candidates for sub-THz communications [20].
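The 6 dB figure follows directly from the quadratic scaling; as a short worked check, with $\kappa$ used here as a hypothetical symbol for the frequency multiplication factor:

```latex
\Delta_{\mathrm{PHN}}\,[\mathrm{dB}] = 10\log_{10}\!\left(\kappa^{2}\right) = 20\log_{10}(\kappa),
\qquad
\kappa = 2 \;\Rightarrow\; \Delta_{\mathrm{PHN}} = 20\log_{10}(2) \approx 6.02\ \mathrm{dB}.
```

For instance, under the same scaling, multiplying a 10 GHz reference up to 320 GHz ($\kappa = 32$) would raise the PHN power by roughly $20\log_{10}(32) \approx 30.1$ dB.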
There are several approaches for modeling PHN, two of which are most prominent. The first is a correlated model that uses the superposition of Wiener (Gaussian random-walk) and Gaussian processes; the second is an uncorrelated model that considers only a Gaussian noise reflecting the white PHN floor. The appropriate choice of PHN models for the sub-THz band is addressed in [52], where it is argued that the uncorrelated Gaussian PHN model should be favored if the system bandwidth ($B$) is large enough compared to the oscillator corner frequency ($f_{cor}$). The Rx signal, at instant $n$, is obtained by rotating the Tx signal by the Tx PHN term $e^{j\phi^{(t)}[n]}$, linearly convolving (denoted by $*$) the result with the channel, and rotating the output by the Rx PHN term $e^{j\phi^{(r)}[n]}$, in the presence of additive noise, where $\phi^{(t)}[n]$ and $\phi^{(r)}[n]$ are discrete stochastic processes representing the Tx and Rx LO PHN, respectively. The correlated model is defined as
$$\phi[n] = \phi_w[n] + \phi_g[n],$$
where the Wiener and Gaussian PHN models are expressed, respectively, as
$$\phi_w[n] = \phi_w[n-1] + \delta_w[n], \quad \delta_w[n] \sim \mathcal{N}(0, \sigma_w^2), \qquad \phi_g[n] \sim \mathcal{N}(0, \sigma_g^2).$$
The uncorrelated PHN implies $\phi[n] = \phi_g[n]$. The variances are defined as $\sigma_w^2 = 4\pi^2 K_2 T$ and $\sigma_g^2 = K_0/T$, where $K_0$ and $K_2$ are the PHN levels that can be evaluated from the measured PHN PSD, the corner frequency is $f_{cor} = K_2/K_0$, $T = 1/B$ is the modulated signal duration, and $B$ is the system bandwidth [71]. Thus, we can note a strong dependence of system performance on bandwidth.
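The two PHN models can be sketched as follows, assuming the variance definitions 4π²·K2·T and K0/T with T = 1/B quoted above. The function names and the numerical values of K0 and K2 in the usage lines are hypothetical placeholders, not measured oscillator data.

```python
import numpy as np


def phn_variances(k0, k2, bandwidth_hz):
    """Return (Wiener increment variance, Gaussian variance) for T = 1/B."""
    t = 1.0 / bandwidth_hz
    return 4.0 * np.pi ** 2 * k2 * t, k0 / t


def correlated_phn(num_samples, k0, k2, bandwidth_hz, rng):
    """Wiener (random-walk) process plus white Gaussian PHN floor."""
    var_w, var_g = phn_variances(k0, k2, bandwidth_hz)
    wiener = np.cumsum(rng.normal(0.0, np.sqrt(var_w), num_samples))
    gaussian = rng.normal(0.0, np.sqrt(var_g), num_samples)
    return wiener + gaussian


def uncorrelated_phn(num_samples, k0, bandwidth_hz, rng):
    """White Gaussian PHN floor only."""
    return rng.normal(0.0, np.sqrt(k0 * bandwidth_hz), num_samples)


def apply_phn(signal, phn):
    """Rotate each sample by its phase-noise realization."""
    return signal * np.exp(1j * phn)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    k0, k2, bandwidth = 1e-11, 10.0, 1e9  # placeholder values for illustration
    phi = correlated_phn(1000, k0, k2, bandwidth, rng)
    distorted = apply_phn(np.ones(1000, dtype=complex), phi)
    print(phn_variances(k0, k2, bandwidth), np.angle(distorted[:3]))
```

Because the Gaussian variance grows with B while the Wiener increment variance shrinks with B, the white floor dominates at wide bandwidths, which is the intuition behind favoring the uncorrelated model.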
2) Wideband IQI: The frequency-dependent wideband IQI is another dominant hardware impairment in THz transceivers operating over ultra-wide bandwidths. Efficient signal processing techniques have been extensively studied for narrowband IQI at both Tx (via digital pre-distortion) and Rx. However, only a few works address the wideband IQI model in the THz band, such as [50] for SC-FDE. Furthermore, wideband PA non-linearity models are still lacking in the THz literature. Extensive research to study such impairments is crucial. However, the existing models for wideband systems operating at 60 GHz [72] can provide a preliminary analysis and evaluation of THz candidates.

F. Robustness to THz-specific Impairments
THz-specific channel-induced impairments should also be considered when studying candidate waveforms. For example, THz propagation suffers from misalignment between Tx and Rx, which is highly probable given the narrow nature of the THz beams [54]. Another THz channel characteristic is the spherical wave propagation model (SWM), which should be accounted for at relatively short communication distances [54]. More importantly, a beam split effect arises in wideband UM-MIMO beamforming. In particular, the difference between the carrier and center frequencies results in THz path components squinting into different spatial directions at different subcarriers, causing severe array gain loss [54]. Such beam split is mainly caused by frequency-independent delays in analog-beamforming PSs. Furthermore, large UM-MIMO THz arrays result in very narrow beamwidths that worsen this effect. Several beam-split mitigation methods are proposed in the literature, such as delay-phase precoding in [73], where CP-OFDM is assumed. However, the effect of beam split on other SC/MC schemes is not yet studied. This work only studies the impairment caused by beam split as it is more relevant to waveform design than misalignment and SWM.
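The array gain loss caused by beam split can be illustrated with a simplified sketch. It assumes a uniform linear array with half-wavelength spacing at the center frequency and frequency-independent phase shifters steered at the center frequency; this is a simplification rather than the AoSA model of [54], and the function name is hypothetical.

```python
import numpy as np


def normalized_array_gain(num_elements, steer_angle_rad, freq_hz, center_freq_hz):
    """Normalized gain toward the steered direction at a given frequency.

    The phase shifters cancel the propagation phase only at center_freq_hz,
    so a residual per-element phase remains at other frequencies.
    """
    n = np.arange(num_elements)
    residual = np.pi * n * (freq_hz / center_freq_hz - 1.0) * np.sin(steer_angle_rad)
    return np.abs(np.exp(1j * residual).sum()) / num_elements


if __name__ == "__main__":
    fc = 0.3e12
    for offset_ghz in (0.0, 2.5, 5.0, 10.0):
        gain = normalized_array_gain(256, np.pi / 3, fc + offset_ghz * 1e9, fc)
        print(f"offset {offset_ghz:4.1f} GHz: normalized gain {gain:.3f}")
```

In this simplified model, the loss grows with array size, steering angle, and frequency offset, so edge subcarriers of a wideband signal see a much weaker beam than the center ones.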

IV. CANDIDATE THZ-BAND SC/MC WAVEFORMS
In the upcoming subsections, we aim to mathematically describe the modulation and demodulation steps for each candidate SC/MC waveform, highlight the design procedure, and link it with THz band system parameters.

A. CP-OFDM
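The discrete-time Tx OFDM signal is derived from (8) ($N = 1$) using rectangular pulse-shaping, which amounts to applying the normalized IDFT, $\mathbf{F}_M^H$, to the data symbol vector. To combat inter-symbol interference (ISI) in a time-dispersive wireless channel of length $L_{ch} = \tau_{rms}/T_s$, where $\tau_{rms}$ is the root mean square (RMS) delay spread, a guard interval of $N_{CP} \geq L_{ch}$ samples is added to the Tx signal, which yields the CP-OFDM signal in (24). After CP removal at the Rx, and assuming perfect time and frequency synchronization, the received signal, $\mathbf{y}_{OFDM}$, is the product of $\mathbf{H}_s$, an $M \times M$ circular convolution matrix of band-diagonal structure built upon $\mathbf{h}_s$, and the Tx signal, plus the AWGN vector $\mathbf{n}_s \sim \mathcal{CN}(\mathbf{0}, \sigma_n^2 \mathbf{I}_M)$. Note that the actual transmission is expressed as $\bar{\mathbf{y}}_{OFDM} = \bar{\mathbf{H}}_s \bar{\mathbf{x}}_{OFDM} + \bar{\mathbf{n}}_s$, where $\bar{\mathbf{H}}_s \in \mathbb{C}^{(M_t + L_{ch} - 1) \times M_t}$ is derived from $\mathbf{H}_s$, and $\bar{\mathbf{n}}_s \sim \mathcal{CN}(\mathbf{0}, \sigma_n^2 \mathbf{I}_{M_t + L_{ch} - 1})$. The signal is then processed by a DFT block $\mathbf{F}_M$. Equalization can be performed using zero-forcing (ZF) or minimum mean-squared error (MMSE) criteria; the corresponding equalization matrix, $\mathbf{E}$, inverts the frequency-domain channel in the ZF case and regularizes this inversion by the noise-to-signal power ratio in the MMSE case, where $P_s$ is the signal power. The Tx symbol estimates are retrieved as $\hat{\mathbf{d}}_{TF} = \mathcal{D}(\mathbf{E}\mathbf{F}_M \mathbf{y}_{OFDM})$, where $\mathcal{D}(\cdot)$ maps an equalized symbol to the closest symbol in $\mathcal{X}$. Note that (24) can be generalized to express the Tx CP-OFDM frame ($N$ symbols). Designing a CP-OFDM system requires tuning many parameters such as the number of subcarriers ($M$), the CP duration ($T_{CP}$), and the SCS ($\Delta f$). Such parameters are chosen according to (29), which involves the coherence bandwidth, $B_{coh} = \frac{1}{5\tau_{rms}}$, and the coherence time, $T_{coh}$, which is determined by the maximum Doppler spread $\nu_{max}$. The SCS satisfies (29) to ensure orthogonality and maximize SE.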
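The modulation, CP insertion, and one-tap frequency-domain equalization steps described above can be sketched as follows. The sketch assumes a SISO link, a known channel impulse response, QPSK symbols, and a CP at least as long as the channel memory; the function names are hypothetical.

```python
import numpy as np


def cp_ofdm_modulate(symbols, cp_len):
    """Normalized IDFT followed by cyclic-prefix insertion."""
    x = np.fft.ifft(symbols, norm="ortho")
    return np.concatenate([x[x.size - cp_len:], x])


def channel(tx, taps, noise_std, rng):
    """Linear convolution with the channel taps plus complex AWGN."""
    rx = np.convolve(tx, taps)
    noise = rng.standard_normal(rx.size) + 1j * rng.standard_normal(rx.size)
    return rx + noise_std * noise / np.sqrt(2)


def cp_ofdm_demodulate(rx, num_subcarriers, cp_len, taps, noise_var,
                       signal_power=1.0, mmse=True):
    """CP removal, normalized DFT, and one-tap ZF or MMSE equalization."""
    y = np.fft.fft(rx[cp_len:cp_len + num_subcarriers], norm="ortho")
    h = np.fft.fft(taps, num_subcarriers)  # per-subcarrier channel gains
    if mmse:
        eq = np.conj(h) / (np.abs(h) ** 2 + noise_var / signal_power)
    else:
        eq = 1.0 / h
    return eq * y


def slice_qpsk(z):
    """Map each equalized sample to the closest unit-power QPSK point."""
    return (np.sign(z.real) + 1j * np.sign(z.imag)) / np.sqrt(2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, cp = 64, 4
    bits = rng.integers(0, 2, size=(m, 2))
    d = ((2 * bits[:, 0] - 1) + 1j * (2 * bits[:, 1] - 1)) / np.sqrt(2)
    taps = np.array([1.0, 0.5, 0.2j])
    rx = channel(cp_ofdm_modulate(d, cp), taps, noise_std=0.05, rng=rng)
    est = slice_qpsk(cp_ofdm_demodulate(rx, m, cp, taps, noise_var=0.05 ** 2))
    print("symbol errors:", int(np.sum(~np.isclose(est, d))))
```

The CP turns the linear convolution into a circular one over the retained samples, which is what allows a single complex coefficient per subcarrier to equalize a frequency-selective channel.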
The SCS choice also affects TTI latency, PAPR, complexity, and equalization performance. In particular, the SCS provides a trade-off between CP overhead, sensitivity to Doppler spread, and robustness to hardware imperfections. The CP length is also a critical design parameter: a larger CP relaxes the time synchronization constraints caused by STO, but at the expense of a larger CP overhead (decreased SE). Furthermore, the number of subcarriers ($M$) impacts the PAPR performance (16) and the FFT/IFFT complexity (Sec. III-C).
The transmission spectra and molecular absorption dictate the available bandwidth in THz LoS scenarios [54]. However, in indoor THz scenarios, the channel can be LoS-dominant and non-LoS (NLoS)-assisted, or only NLoS (multi-path). Based on the coherence bandwidth, for a given communication distance, we decide on the corresponding design parameters of a frequency-flat channel or FSC per subcarrier. For example, for a communication distance of 3 m, $B_{\mathrm{coh}} = 1$ GHz in the sub-THz band ($f_c = 0.3$ THz) and $B_{\mathrm{coh}} \approx 5$ GHz in the THz band ($f_c = 0.9$ THz) [16].
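As a small worked example of the coherence-bandwidth reasoning, the sketch below inverts $B_{\mathrm{coh}} = 1/(5\tau_{\mathrm{rms}})$ and checks whether a given number of subcarriers makes the channel approximately flat per subcarrier. The function names and the safety `margin` of 10 between the SCS and the coherence bandwidth are hypothetical choices for illustration, not values from the paper.

```python
def coherence_bandwidth(tau_rms_s):
    """Coherence bandwidth in Hz from the RMS delay spread in seconds."""
    return 1.0 / (5.0 * tau_rms_s)


def rms_delay_spread(b_coh_hz):
    """RMS delay spread in seconds implied by a coherence bandwidth in Hz."""
    return 1.0 / (5.0 * b_coh_hz)


def flat_fading_per_subcarrier(bandwidth_hz, num_subcarriers, b_coh_hz, margin=10.0):
    """True if the SCS is at least `margin` times smaller than the coherence bandwidth.

    `margin` is an assumed rule of thumb for this sketch.
    """
    scs_hz = bandwidth_hz / num_subcarriers
    return scs_hz * margin <= b_coh_hz


# Sub-THz example quoted in the text: B_coh = 1 GHz at a 3 m distance.
tau_sub_thz = rms_delay_spread(1e9)  # about 0.2 ns
```

With a 10 GHz system bandwidth and the 1 GHz coherence bandwidth above, 64 subcarriers do not satisfy the assumed margin, whereas 1024 subcarriers do.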
We list in Table I some of the expected CP-OFDM parameters for sub-THz/THz band communications, derived using (29) and based on $\tau_{\mathrm{rms}}$ values from [16], alongside parameters adopted in both 4G-LTE (below 6 GHz) and 5G-NR (below 6 GHz and mmWaves). In a nutshell, CP-OFDM enjoys a relatively low-complexity implementation (using FFT), is robust to multi-path fading, and uses a simple FDE method (single-tap equalizer for a broadband FSC). However, the resultant high PAPR is challenging for power-limited sub-THz/THz communications. Moreover, the CP-OFDM time tolerance for symbol synchronization is very low (order of nanoseconds) due to the expected small values of $T_{\mathrm{CP}}$ and $\tau_{\mathrm{rms}}$.

B. DFT-s-OFDM
DFT-s-OFDM, also known as precoded OFDM, is adopted in the 4G-LTE/5G-NR uplink and is a promising candidate for THz communications. The use of a DFT block at the Tx reduces the PAPR and retains all SC benefits, albeit at a marginal complexity cost. DFT-s-OFDM thus aims at reducing power consumption and PA costs at the user terminal. When data symbol blocks are assigned to different users, DFT-s-OFDM reduces to SC-FDMA in multi-user scenarios.
As illustrated in Fig. 3(b), data symbols are first spread in DFT-precoding; the outputs are the complex symbols that modulate the OFDM subcarriers. For a selection of $\bar{M} \leq M$ subcarriers to be modulated, the Tx signal is expressed as in (30), where $\mathbf{M}_{M,\bar{M}}$ is a mapping matrix between data symbols and the $\bar{M}$ active subcarriers (zero insertion at the $M - \bar{M}$ unused subcarriers). The mapping can be localized or distributed. In the localized mode, $\mathbf{M}_{M,\bar{M}} = [\mathbf{I}_{\bar{M}}, \mathbf{0}_{\bar{M}, M-\bar{M}}]^{\mathsf{T}}$, and the DFT outputs are directly mapped to a subset of consecutive subcarriers. In the distributed mode, the DFT outputs are assigned to non-contiguous subcarriers over the entire bandwidth. The additional need for signaling, pilots, and guard bands (in multiple access scenarios) in the distributed mode increases the system complexity, whereas the straightforward implementation of equal SCS in the localized mode is favorable.
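The localized DFT-s-OFDM mapping can be illustrated as follows: an $\bar{M}$-point DFT spreads the data, the localized mapping places the outputs on the first $\bar{M}$ of $M$ subcarriers and zero-fills the rest, and an $M$-point IDFT forms the time-domain block. This is a sketch under the assumption of unitary transforms and no CP; the function names are hypothetical.

```python
import cmath


def dft(x, inverse=False):
    """Unitary DFT (inverse=False) or IDFT (inverse=True) of a list."""
    n = len(x)
    sign = 1 if inverse else -1
    return [
        sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for k in range(n)) / n ** 0.5
        for m in range(n)
    ]


def dft_s_ofdm_modulate(data, m_total):
    """Spread len(data) symbols and map them onto the first subcarriers of m_total."""
    m_bar = len(data)
    if m_bar > m_total:
        raise ValueError("cannot modulate more symbols than subcarriers")
    spread = dft(data)                              # DFT precoding
    mapped = spread + [0j] * (m_total - m_bar)      # localized mapping [I; 0]
    return dft(mapped, inverse=True)                # OFDM modulation


def dft_s_ofdm_demodulate(samples, m_bar):
    """Invert dft_s_ofdm_modulate for an ideal (identity) channel."""
    freq = dft(samples)                             # OFDM demodulation
    return dft(freq[:m_bar], inverse=True)          # de-map and de-spread
```

When all subcarriers are used (`m_total == len(data)`), the precoding DFT and the modulation IDFT cancel and the transmitted block equals the data block, which is the single-carrier behavior behind the PAPR reduction.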

C. SC-FDE
A promising alternative to CP-OFDM is SC-FDE, which combines the benefits of CP and FDE, and has low PAPR due to low envelope variations. Unlike in CP-OFDM, where each data symbol is allocated a small bandwidth over a long symbol duration, in SC-FDE, data symbols are assigned to a single large bandwidth with short symbol durations. For the same CP-OFDM symbol duration, the SC-FDE Tx signal, containing $M$ symbols, is expressed in (31). The remaining transmission, equalization, and demodulation stages are similar to those of CP-OFDM, as shown in Fig. 3(c).
The SC-FDE synchronization algorithms are also very similar to those of CP-OFDM. Furthermore, the spectral shape of the SC-FDE waveform is determined by the Tx pulse-shaping, the DAC used, and the RF filtering stages. It is worth noting that the choice of pulse-shaping affects the PAPR, OOB emissions, complexity, and immunity to hardware impairments.

D. OQAM/FBMC
OQAM/FBMC is another promising waveform candidate, especially for cognitive radio (CR) and dynamic/intelligent spectrum sharing applications. OQAM/FBMC offers high SE (no need for CP), low OOB emission levels, and low sensitivity to CFO. Furthermore, by using a per-subcarrier well-localized pulse-shaping filter in both time and frequency (such as PHYDYAS [74]), OQAM/FBMC supports enhanced synchronization procedures. However, such benefits come at the cost of limited integration with MIMO systems (maintaining real orthogonality in OQAM complicates precoder design [75]), higher PAPR compared to OFDM (due to subcarrier filtering), and higher complexity (especially in the equalizer as there is no CP). Given the importance of such KPIs at high frequencies, OQAM/FBMC is not a good candidate for THz communications.
The direct form of an OQAM/FBMC system is illustrated in Fig. 3(d), consisting of OQAM pre-processing, a synthesis filter bank (SFB), an analysis filter bank (AFB), and OQAM post-processing. We assume a low-complexity implementation based on a polyphase filter structure (PHYDYAS with overlapping factor $K$) and FFT, as described in [74] (Figures (2-7) and (2-8)). OQAM/FBMC satisfies a real orthogonality condition, in which only the real part of the inner product between two distinct FBMC basis pulses vanishes, instead of complex orthogonality. Thus, the useful symbol time still satisfies $\Delta f = 1/T_u$, but the symbol duration is $T = T_u/2$. The Tx signal can be derived from (6) by adding a phase shift of $\frac{\pi}{2}(m+n)$ to the Tx basis pulse in (7), which yields (32). Such a phase shift transfers the induced interference between symbols to the imaginary domain [76]. The resultant basis pulse in (32) is a frequency- and time-shifted version of the prototype filter. Furthermore, the prototype filter is designed using the frequency-sampling technique, with $(2K-1)$ non-zero frequency-domain samples for an overlapping factor $K$; its impulse response follows from the filter length and the coefficients defined in [74]. The discrete-time Tx signal can then be expressed, over a time interval that covers the frame and the prototype-filter tails, through the Tx matrix $\mathbf{G}_{\mathrm{syn}}$ that contains the basis pulses. At the Rx, the analysis filter, $\mathbf{G}_{\mathrm{ana}} = \mathbf{G}_{\mathrm{syn}}^{\mathsf{H}}$, is used in matched-filter decoding.
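To make the frequency-sampling design concrete, the sketch below builds a PHYDYAS-type prototype filter for an overlapping factor of $K = 4$. The coefficient values are the commonly quoted PHYDYAS frequency samples and the closed-form impulse response is the usual frequency-sampling expression; both are stated here as assumptions, since the exact filter length and normalization used in the paper follow [74] and are not reproduced in this section.

```python
import math

# Commonly quoted PHYDYAS frequency-domain samples H_0..H_3 for K = 4 (assumed here).
PHYDYAS_K4 = (1.0, 0.97195983, 1.0 / math.sqrt(2.0), 0.23514695)


def phydyas_prototype(num_subcarriers, coeffs=PHYDYAS_K4):
    """Unnormalized prototype filter of length K*M built by frequency sampling.

    p[i] = H_0 + 2 * sum_{k=1}^{K-1} (-1)^k H_k cos(2*pi*k*i / (K*M)),
    so only 2K-1 frequency-domain samples are non-zero.
    """
    k_overlap = len(coeffs)
    length = k_overlap * num_subcarriers
    return [
        coeffs[0]
        + 2.0 * sum(
            (-1) ** k * coeffs[k] * math.cos(2.0 * math.pi * k * i / length)
            for k in range(1, k_overlap)
        )
        for i in range(length)
    ]
```

The resulting pulse starts near zero, peaks in the middle of its $K M$ samples, and is symmetric around that peak, which is what gives OQAM/FBMC its good time and frequency localization at the cost of a longer frame.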

E. OTFS
The recently proposed OTFS waveform [44] is tailored for high-Doppler doubly-selective channels, typically arising in V2X communications. Unlike the other waveforms that modulate data in the TF domain, OTFS modulates data in the DD domain, transforming the TV channel in TF into a 2D quasi-TIV channel in DD. The corresponding transmission frame symbols experience a nearly constant channel gain [77], making OTFS a promising solution in high-Doppler multi-path channels, exploiting the full diversity of TV-FSC and providing substantial delay and Doppler resilience [78]. OTFS is superior to CP-OFDM in this context.
OTFS modulation consists of two main blocks, the OTFS transform and the Heisenberg transform, as illustrated in Fig. 3(e). The OTFS transform involves two stages, the inverse symplectic finite Fourier transform (ISFFT) and windowing. The ISFFT in (37) maps the DD-domain data symbols (the elements of $\mathbf{D}_{\mathrm{DD}}$) to TF-domain samples (the elements of $\mathbf{D}_{\mathrm{TF}}$) [77]. A closer look into (37) reveals that the ISFFT of $\mathbf{D}_{\mathrm{DD}}$ is equivalent to an $M$-point DFT and an $N$-point IDFT of the columns and rows of $\mathbf{D}_{\mathrm{DD}}$, respectively. Subsequently, (37) can be expressed in matrix and vectorized forms, as in (38). The OTFS transform applies a Tx window to the TF signal in (37). Let $\mathbf{U}_{\mathrm{tx}} \in \mathbb{C}^{MN \times MN}$ be the diagonal matrix of Tx window samples, and assume rectangular windows for both Tx and Rx ($\mathbf{U}_{\mathrm{tx}} = \mathbf{U}_{\mathrm{rx}} = \mathbf{I}_{MN}$); the OTFS transform output is then given by (39). The Heisenberg transform then forms the time-domain Tx signal in (40) by combining (6), (7), and (39). The Tx and Rx pulses ideally satisfy the bi-orthogonality condition [44], although this is not practical. Let $\mathbf{G}_{\mathrm{tx}}$ be formed from samples of the Tx pulse, and let $\mathbf{G}_{\mathrm{rx}}$ be similarly defined (assuming rectangular pulse-shaping, $\mathbf{G}_{\mathrm{tx}} = \mathbf{G}_{\mathrm{rx}} = \mathbf{I}_M$ [77]; with the rectangular windows above, the OTFS transform output equals $\mathbf{D}_{\mathrm{TF}}$). We can restructure (40) in matrix and vectorized forms. If the Tx pulse is a rectangular pulse of duration $T$, (40) reduces to an IDFT, and for $N = 1$, the inner box of Fig. 3(e) is CP-OFDM. Therefore, one OTFS frame is effectively an ISFFT over $N$ consecutive independent OFDM symbols with $M$ subcarriers. As a spectrally efficient solution, we assume one CP for the entire OTFS frame, of the same duration $T_{\mathrm{CP}}$ as in previous waveforms. The Rx signal is expressed in (43) in terms of the delay and Doppler variables, $\tau$ and $\nu$, and the DD channel response, which is typically sparse [77] (a small number of reflectors with associated delays and Doppler shifts; limited number of multi-paths). This response is expressed in (44) as a sum over the channel paths [77], where $\delta(\cdot)$ is the Dirac delta function and each path has its own gain, delay, and Doppler shift; in (45), the path delays and Doppler shifts are expressed through integer indexes of the lattice in (5). Note that the assumptions in (45) can be further extended to involve fractional Doppler shifts, which result in additional inter-Doppler interference. The resultant performance degradation can be compensated in the equalizer, using the message-passing algorithm [77], for example. We can ignore fractional delays in a typical wideband THz system since the resolution is sufficient to approximate the path delay to the nearest point in the DD lattice [77]. The Rx signal, after discarding the CP, is sampled and expressed in vector form, where $\mathbf{y}_{\mathrm{OTFS}} \in \mathbb{C}^{MN \times 1}$, $\mathbf{n}_{\mathrm{OTFS}} \sim \mathcal{CN}(\mathbf{0}, \sigma_n^2 \mathbf{I}_{MN})$, and $\mathbf{H}_{\mathrm{DD}} \in \mathbb{C}^{MN \times MN}$ is the channel matrix. For each path, the delay matrix is a real $MN \times MN$ forward cyclic-shift permutation matrix raised to the integer delay index (it reduces to the identity for a zero delay), and the complex $MN \times MN$ Doppler shift matrix modulates the Tx signal with a carrier at the Doppler frequency of the path. The Rx signal is then transformed into TF using the Wigner transform in (50), which match-filters the Rx signal with an Rx pulse shape and samples it at the lattice points defined in (4). We can express (50) in matrix form (similar to (41)) after building $\mathbf{Y}_{\mathrm{OTFS}} \in \mathbb{C}^{M \times N}$ from $\mathbf{y}_{\mathrm{OTFS}}$ in (47), which yields $\mathbf{Y}_{\mathrm{TF}} \in \mathbb{C}^{M \times N}$. The Rx windowing operation is similar to (39). Thus, $\tilde{\mathbf{y}}_{\mathrm{TF}} = \mathrm{vec}(\tilde{\mathbf{Y}}_{\mathrm{TF}}) = \mathbf{U}_{\mathrm{rx}} \mathbf{y}_{\mathrm{TF}}$ (in our case $\mathbf{U}_{\mathrm{rx}} = \mathbf{I}_{MN}$). Then, the TF domain signal is mapped back to the DD domain using the symplectic finite Fourier transform (SFFT) in (53). After substituting (42) in (47), (53) can be written in the vectorized form (54), where $\mathbf{H}^{\mathrm{eff}}_{\mathrm{DD}}$ denotes the effective channel matrix in the DD domain, and $\tilde{\mathbf{n}}_{\mathrm{OTFS}}$ is the modified noise vector. OTFS equalization and detection can be applied directly on the vectorized form in (54), where message passing is shown to be efficient [77]. However, we only consider the linear equalizers ZF/MMSE [79] (not the low-complexity version in [79]) for a fair comparison with other candidate waveforms.
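The statement that the ISFFT amounts to an $M$-point DFT along the columns and an $N$-point IDFT along the rows of the DD data matrix can be checked with a short sketch. The code below assumes unitary transforms and an $M \times N$ matrix stored as a list of rows; the function names are hypothetical and the normalization may differ from the paper's (37).

```python
import cmath


def dft(x, inverse=False):
    """Unitary DFT (inverse=False) or IDFT (inverse=True) of a list."""
    n = len(x)
    sign = 1 if inverse else -1
    return [
        sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for k in range(n)) / n ** 0.5
        for m in range(n)
    ]


def isfft(d_dd):
    """DD -> TF: M-point DFT of every column, then N-point IDFT of every row."""
    m, n = len(d_dd), len(d_dd[0])
    col_dft = [dft([d_dd[r][c] for r in range(m)]) for c in range(n)]
    rows = [[col_dft[c][r] for c in range(n)] for r in range(m)]
    return [dft(row, inverse=True) for row in rows]


def sfft(d_tf):
    """TF -> DD: N-point DFT of every row, then M-point IDFT of every column."""
    m, n = len(d_tf), len(d_tf[0])
    rows = [dft(row) for row in d_tf]
    col_idft = [dft([rows[r][c] for r in range(m)], inverse=True) for c in range(n)]
    return [[col_idft[c][r] for c in range(n)] for r in range(m)]
```

Applying `sfft` after `isfft` returns the original matrix, and for a single-column input ($N = 1$) the ISFFT reduces to a plain $M$-point DFT of that column.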
Note that $M$ determines the delay resolution and the channel's maximum supported Doppler spread ($\nu_{\max}$) for a given bandwidth ($B$); $N$ dictates the Doppler resolution and latency (a frame lasts $NT$). OTFS system design parameters are thus related (only three of them can be chosen freely), where $\Delta f = B/M = 1/T$ is chosen according to (55). Using more subcarriers ($M$) results in a smaller SCS ($\Delta f$) for a fixed $B$, which in turn results in a longer slot duration ($T$).
The OTFS design should thus observe the maximum latency constraints of novel use cases. Furthermore, the OTFS PAPR, complexity, and decoding delay are proportional to $N$, so lower $N$ values are favored. However, increasing the frame size (large $N$) results in enhanced BER performance [78] (higher diversity). Therefore, a careful trade-off between latency, PAPR, complexity, and performance is crucial with OTFS.
Table II presents maximum Doppler spread ($\nu_{\max}$) values in different bands, from below 6 GHz to THz. The noted severe changes in $\nu_{\max}$ impose many challenges on both waveform design and receiver components (automatic frequency control range and synchronization). Frequency synchronization in the presence of large CFO (tens/hundreds of kHz) is challenging for CP-OFDM, even for low user mobility, where complex circuits are required at the Rx side. The resilience of OTFS to CFO and Doppler spreads reduces the need for complex wideband automatic frequency control range circuits, which is much needed in THz communications.
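The scale of the Doppler values discussed above can be reproduced with a one-line calculation. The sketch assumes the usual relation $\nu_{\max} = v f_c / c$ (the entries of Table II are not reproduced in this section, so the relation itself is an assumption), and the helper names are hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def max_doppler_shift(speed_kmh, carrier_hz):
    """Maximum Doppler shift in Hz, assuming nu_max = v * f_c / c."""
    return (speed_kmh / 3.6) * carrier_hz / SPEED_OF_LIGHT


def doppler_to_scs_ratio(speed_kmh, carrier_hz, bandwidth_hz, num_subcarriers):
    """Doppler shift normalized by the subcarrier spacing B / M."""
    return max_doppler_shift(speed_kmh, carrier_hz) / (bandwidth_hz / num_subcarriers)


# Example: 500 km/hr at a 0.5 THz carrier gives a shift in the hundreds of kHz.
nu_thz = max_doppler_shift(500.0, 0.5e12)
```

Because the shift scales linearly with the carrier frequency, the same terminal speed produces a Doppler shift at 0.5 THz that is roughly 167 times larger than at 3 GHz.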

F. DFT-s-OTFS
DFT-spread-OTFS [48], [49] was recently proposed for THz ISAC to improve the PAPR characteristics of OTFS and enhance the robustness to Doppler effects. Although this waveform seems to be a promising candidate for many THz applications, a detailed analysis of complexity, SE, TTI latency, and robustness to PHN and THz-specific impairments is still lacking. We conduct this analysis in this work to draw a fair conclusion.
The block diagram of DFT-s-OTFS is illustrated in Fig. 3(f). One DFT-s-OTFS data frame contains the same number of symbols as an OTFS frame ($MN$). However, for the case of one user in the uplink, only $\bar{N}$ data symbols ($\bar{N} \leq N$) are first spread using DFT-precoding, similar to DFT-s-OFDM, followed by a DD mapping. The mapping is expressed via $\mathbf{M}^{\mathrm{DD}}_{N,\bar{N}}$ of size $(MN) \times (M\bar{N})$, which forms the data frame by concatenating the DFT-spread data into $\bar{N}$ points and zero-padding the remaining points ($N - \bar{N}$) to form the DD lattice. Then, the same operations of the OTFS Tx are applied. Thus, the data matrix, $\mathbf{D}_{\mathrm{DD}}$ of (38), is expressed as in (56). We can first perform DD domain equalization for delay-Doppler domain signal estimation, using linear equalizers (MMSE or ZF), and then perform an $\bar{N}$-point IDFT to obtain the Tx symbols [48]. Other low-complexity solutions in [48], [49] do not guarantee a fair comparison with waveforms that use linear equalizers. As illustrated in Sec. III-D, the maximum PAPR of DFT-s-OTFS is limited by $\bar{N}$, which is an enhanced PAPR performance compared to both OTFS and CP-OFDM [48]. Moreover, due to the potential full TF channel diversity, OTFS and DFT-s-OTFS outperform reference MC schemes. All of this emphasizes the prospects of DFT-s-OTFS in emerging V2X use cases in THz-enabled B5G/6G. However, the price to pay is an increased Rx detection and DD channel estimation complexity, as illustrated in Sec. III-C, especially in the presence of fractional Doppler [80].

V. SIMULATION RESULTS AND DISCUSSION
This section presents the results of extensive simulations investigating relevant waveform KPIs under realistic THz conditions. The default simulation settings are listed in Table III (modifications are declared subsequently and channel parameters are taken from [54]). We list in Table V a summary of the waveform comparisons under the studied KPIs.
Table IV compares the normalized SE and TTI latency of the studied schemes, assuming a fixed modulation order of $\log_2(|\mathcal{X}|)$ and a frame of $N$ symbols; Fig. 4 further plots the normalized SE values versus the number of subcarriers ($M$). Assume a target of 10 Gbps at a system bandwidth of $B = 10$ GHz, with $N_{\mathrm{CP}} = 48$. OQAM/FBMC achieves high normalized SE for large $N$ (asymptotic normalized SE of 1 as $N$ goes to infinity; absence of CP) and low normalized SE for short frames (per-subcarrier pulse-shaping extends the frame duration by $K - 1/2$). OTFS achieves the best normalized SE performance, outperforming both CP-OFDM and SC-FDE (mainly due to CP overhead). An additional SE loss is introduced in DFT-s-OFDM, where only $\bar{M} \leq M$ symbols are allocated over $M$ subcarriers (30). Moreover, using only $\bar{N}M$ data symbols ($\bar{N} \leq N$), the DFT-s-OTFS waveform results in a similar SE loss (56). However, such loss is negligible for DFT-s-OFDM in multi-user scenarios, as vacant subcarriers can be allocated to other users, and for DFT-s-OTFS, as multiple users can be multiplexed along the Doppler axis (see Fig. 2). The normalized SE of CP-OFDM increases with $M$, where a CP-OFDM symbol transmits $M$ QAM symbols over $M$ subcarriers of duration $T = T_u + T_{\mathrm{CP}}$, repeated for $N$ symbols per frame. However, the normalized SE of CP-OFDM does not reach 1 bit/sec/Hz and is independent of $N$, which is an important feature for controlling the TTI latency. Note that SC-FDE has the same normalized SE as CP-OFDM; we thus exclude its results (one CP for every $M$ QAM symbols (31)).
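The qualitative SE ranking above can be illustrated with simple overhead ratios. The expressions below are assumptions inferred from the description in the text (one CP per OFDM symbol, one CP per OTFS frame, and a frame extension of $K - 1/2$ symbol durations for OQAM/FBMC); they are not copied from Table IV, and the function names are hypothetical.

```python
def se_cp_ofdm(m, n_cp):
    """Normalized SE with one CP per symbol of M subcarriers (independent of N)."""
    return m / (m + n_cp)


def se_dft_s_ofdm(m_bar, m, n_cp):
    """Single-user DFT-s-OFDM: only m_bar of the M subcarriers carry data."""
    return m_bar / (m + n_cp)


def se_otfs(m, n, n_cp):
    """OTFS with a single CP for the whole frame of M*N symbols."""
    return (m * n) / (m * n + n_cp)


def se_oqam_fbmc(n, k_overlap):
    """OQAM/FBMC: no CP, but the frame is extended by K - 1/2 symbol durations."""
    return n / (n + k_overlap - 0.5)
```

With $M = 64$, $N = 16$, $N_{\mathrm{CP}} = 48$, and $K = 4$, these ratios reproduce the ordering discussed in the text: OTFS is highest, OQAM/FBMC is penalized by the short frame, and CP-OFDM pays the CP overhead on every symbol.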
Regarding TTI latency, Fig. 5 illustrates that OTFS has lower latency than CP-OFDM. DFT-s-OTFS has the same latency as OTFS, while SC-FDE and DFT-s-OFDM have the same latency as CP-OFDM. OQAM/FBMC has lower latency for smaller frame dimensions, but higher latency for long frame durations. The OTFS advantages, in terms of normalized SE and TTI latency, are arguably due to the use of a single CP per frame of $N$ symbols, which can be achieved in other waveforms by considering a longer frame with the same total number of symbols and the same CP length. However, it is important to emphasize that the channel is imposed, and consequently, both $\tau_{\mathrm{rms}}$ and the coherence time control the maximum symbol and CP duration, as shown for CP-OFDM in (29).

B. PSD and OOB emissions
The OOB emissions and the impact of adjacent channel leakage are studied in Fig. 6 by comparing the PSD of two users utilizing various waveforms under the IEEE Tx spectral mask specifications (Sec. 13.1.3 of [17]). Each user occupies a bandwidth of $B = 2.16$ GHz, with $f_c = 305.64$ GHz for the first user (ID = 25 [17]), and the second user is assigned the next channel. The results confirm that OQAM/FBMC has the best frequency localization, thanks to its pulse-shaping filter on each subcarrier; other waveforms respect the specified mask. However, it is expected that a much lower spectral emission mask will be specified for B5G/6G networks. Thus, all waveforms other than OQAM/FBMC would show a high interference level. Note that, without additional pulse shaping, the OOB emission performances of OTFS, DFT-s-OTFS, and SC-FDE match those of CP-OFDM and DFT-s-OFDM; they are thus excluded from Fig. 6. To ensure a fair comparison with OTFS and DFT-s-OTFS, we concatenate $N = 16$ MC/SC signals in a frame for the previous waveforms. Although DFT-s-OFDM suffers from high OOB emissions, variants such as zero-tail DFT-s-OFDM [25] could overcome this limitation. Furthermore, the performance in the presence of a pulse-shaping filter (raised-cosine (RC), for example) depends on parameters such as the roll-off factor, oversampling ratio, and filter length, which is out of this work's scope. Therefore, including a guard band is crucial to achieving the required OOB emissions and interference levels; a careful trade-off between OOB emissions and SE needs to be maintained.

C. Complexity Analysis
A significant part of the complexity comes from the equalization process. It is clear from the analysis in Sec. III-C that the DD equalization complexity is several orders of magnitude greater than that of other SC and MC schemes, where the state-of-the-art OTFS equalizers are much more complex than the CP-OFDM ZF/MMSE equalizers. Thus, OTFS and DFT-s-OTFS have a much higher complexity than other SC/MC waveforms. We aim to compare the remaining two components of the complexity (modulation and demodulation). The computational complexities of different waveforms (Sec. III-C) are compared in Fig. 7, for a maximum sampling frequency of 1 GHz (due to hardware constraints) but without taking into account the equalization complexity. CP-OFDM has the same overall complexity as SC-FDE; CP-OFDM and SC-FDE enjoy lower complexities than both DFT-s-OFDM (due to additional DFT/IDFT precoding blocks in Tx/Rx) and OQAM/FBMC (due to pulse-shaping, overlapping, and OQAM processing that doubles the complexity). OQAM/FBMC is the most complex of these schemes in that sense. When neglecting the equalization complexity for both OTFS and DFT-s-OTFS, Fig. 7 shows these two waveforms to be the least complex (complexity that is a function of $N$ and $\bar{N}$, not $M$). Thus, a viable OTFS solution with low-complexity implementation and good performance, such as unitary approximate message passing (UAMP) or variational Bayes (VB) detection, is a must [66]. A low-complexity solution, compared to linear MMSE, is proposed in [49] for DFT-s-OTFS, based on the conjugate gradient method, which reduces the overall complexity order; the solution still results in high complexity compared to TF MC schemes.

D. CCDF of PAPR
A comparison among the waveforms using different settings and assuming Nyquist sampling is illustrated in Fig. 8. All schemes (except OQAM/FBMC) apply rectangular pulse-shaping. SC-FDE achieves the best performance (lowest PAPR), followed by DFT-s-OFDM and DFT-s-OTFS, which outperform OTFS, CP-OFDM, and OQAM/FBMC. In particular, the DFT-spread results correspond to DFT-s-OFDM with $\bar{M} = 8$ and DFT-s-OTFS with $\bar{N} = 8$. Note that, as expected from the analysis in Sec. III-D, OTFS shows good characteristics only for small $N$. OQAM/FBMC is the worst performing (0.9 dB worse than CP-OFDM at a CCDF of $10^{-3}$) due to inherent per-subcarrier pulse-shaping. The CP-OFDM simulations are in agreement with approximation (17). However, this is not the case for OTFS, as the theoretical bound in (17) is only valid for high $N$ values [69].
The results of detailed analyses of the waveforms' PAPR CCDFs are illustrated in Fig. 9 for different scenarios, assuming both Nyquist sampling and 4-times oversampling to provide accurate conclusions for the discrete-time and continuous-time signals. Figures 9a and 9b show the effect of changing $M$ and $N$ on PAPR, with $N = 4$, $M = \{32, 128, 256\}$ and $N = 32$, $M = \{128, 256\}$, respectively, for both Nyquist sampling and 4-times oversampling; we only simulate CP-OFDM and OTFS (CP-OFDM findings also apply to OQAM/FBMC). We note that increasing either $M$ or $N$ increases PAPR ((16) and (17)). Nevertheless, the maximum PAPR in OTFS grows linearly with $N$, and the CCDF is zero for PAPR thresholds greater than a value related to $N$ (for example, the maximum PAPR for $N = 4$ is $10\log_{10}(4) = 6.02$ dB). With Nyquist sampling, the PAPR gap is more than 6 dB. However, this gap decreases with continuous-time signals, where the OTFS PAPR gain is no more than 1.5 dB for small $N$ values, and is only 0.3 dB compared to CP-OFDM at a CCDF of $10^{-3}$ for large $N$ values. Thus, OTFS provides significantly better PAPR than CP-OFDM only for small $N$ values compared to $M$. However, OTFS still shares the high PAPR characteristics with TF MC signals. Furthermore, we demonstrate in Fig. 9b good agreement between simulated and analytical results of CP-OFDM for large $M$ and an acceptable bound for OTFS (no more than 0.5 dB difference).
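The claim that the maximum OTFS PAPR grows linearly with $N$ can be checked numerically at Nyquist sampling. The sketch assumes rectangular pulse-shaping, for which each time sample of the OTFS frame is an $N$-point IDFT across the Doppler axis of the DD data matrix, and unit-modulus QPSK symbols; the function names and the random seed are hypothetical choices for this example.

```python
import cmath
import math
import random


def otfs_time_signal(d_dd):
    """Time samples s[n*M + m] of an OTFS frame with rectangular pulses.

    d_dd is an M x N matrix (list of rows); each output sample is an N-point
    unitary IDFT over the Doppler index of row m.
    """
    m_rows, n_cols = len(d_dd), len(d_dd[0])
    samples = []
    for n in range(n_cols):
        for m in range(m_rows):
            acc = sum(
                d_dd[m][k] * cmath.exp(2j * cmath.pi * n * k / n_cols)
                for k in range(n_cols)
            )
            samples.append(acc / math.sqrt(n_cols))
    return samples


def papr_db(samples):
    """Peak-to-average power ratio of a block of samples, in dB."""
    power = [abs(v) ** 2 for v in samples]
    return 10.0 * math.log10(max(power) / (sum(power) / len(power)))


def random_qpsk_matrix(m_rows, n_cols, rng):
    """Unit-modulus QPSK symbols arranged as an M x N matrix."""
    return [
        [cmath.exp(1j * (math.pi / 4 + math.pi / 2 * rng.randrange(4))) for _ in range(n_cols)]
        for _ in range(m_rows)
    ]
```

For any such frame, the PAPR cannot exceed $10\log_{10}(N)$ dB, since each sample is a sum of only $N$ unit-modulus terms scaled by $1/\sqrt{N}$; this does not hold for CP-OFDM, whose samples combine all $M$ subcarriers.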
The effect of pulse-shaping on PAPR performance in SC-FDE, DFT-s-OTFS, and DFT-s-OFDM is illustrated in Fig. 9c by varying the roll-off factor of the RC filter over $\{0, 0.5, 1\}$ (an RC filter of 6 symbols; oversampling factor of 4; normalized to unit energy). We notice that increasing the roll-off factor significantly improves the PAPR performance in SC-FDE, but at the expense of excess bandwidth; DFT-s-OFDM is not as highly affected. Moreover, the PAPR variations with the roll-off factor are negligible for small $\bar{M}$ values. Furthermore, DFT-s-OTFS promises low PAPR when its parameters (such as $\bar{N}$) are carefully chosen.
In Fig. 9d, we show that the PAPR in DFT-s-OFDM and DFT-s-OTFS has almost the same value when the DFT precoding sizes are equal ($\bar{N} = \bar{M}$), for both Nyquist sampling and continuous-time signals (4-times oversampling). Moreover, the two waveforms secure approximately 3 dB PAPR reduction compared with both OTFS and CP-OFDM.

E. Phase Noise
We first verify our assumption of Gaussian PHN for THz communications. We incorporate the PHN measurement results in [81] for a 300 GHz signal source, a PHN floor level of $K_0 = -110$ dBc/Hz, and $K_2 = 10$ ($f_{\mathrm{cor}} = 1$ MHz); we set $B = 10$ GHz, which satisfies (18). We consider Tx PHN without loss of generality. For an AWGN channel plus Tx PHN (a unit channel gain for all samples in (19)), the CP-OFDM results in Fig. 10a illustrate that the models in (20) and (22) are equivalent. Hence, for large bandwidths, the uncorrelated model of (22) is sufficient. The reason behind this observation is that the Wiener model PSD decreases with frequency, resulting in PHN power levels lower than the white floor noise of the Gaussian model at frequencies higher than $f_{\mathrm{cor}}$. Note that the PHN power of the Gaussian model is constant; we add a reference AWGN lower bound. PHN leads to inter-carrier interference (ICI) and adjacent channel interference, which explains the resultant degradation. The comparison assuming Gaussian PHN in Fig. 10b illustrates that DFT-s-OFDM and DFT-s-OTFS are the most robust waveforms to PHN, and decreasing $\bar{M}$ and $\bar{N}$ enhances the performance. Surprisingly, we demonstrate that both DFT-s-OFDM and DFT-s-OTFS result in the same performance when the ratios $M/\bar{M}$ and $N/\bar{N}$ are equal. OQAM/FBMC outperforms other schemes because of its good time and frequency localization. Furthermore, CP-OFDM and OTFS are more robust than SC-FDE, which has the worst performance. We also analyze the effect of changing the noise variance ($\sigma_g^2$) assuming THz-band Gaussian PHN and an SNR of 10 dB in Fig. 10c. We vary $\sigma_g^2$ between low ($10^{-3}$), medium ($10^{-2}$), and strong ($10^{-1}$) values (as indicated in Table I in [52]), retaining a system bandwidth of $B = 10$ GHz; this changes the spectral density ($K_0$) of the white PHN floor. Increasing $\sigma_g^2$ increases the BER, where DFT-s-OFDM and DFT-s-OTFS are the best performing.
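A minimal sketch of the uncorrelated Gaussian PHN model is given below. It assumes that the PHN variance is the white floor spectral density integrated over the system bandwidth, $\sigma_g^2 = K_0 B$ (an assumption that is consistent with the quoted floor of $-110$ dBc/Hz and $B = 10$ GHz giving the "strong" value $10^{-1}$), and that PHN rotates each sample by an independent zero-mean Gaussian phase. The function names are hypothetical.

```python
import cmath
import math
import random


def gaussian_phn_variance(floor_dbc_hz, bandwidth_hz):
    """PHN variance under the assumed relation sigma_g^2 = K_0 * B."""
    return 10.0 ** (floor_dbc_hz / 10.0) * bandwidth_hz


def apply_gaussian_phn(samples, variance, rng):
    """Rotate each sample by an independent N(0, variance) phase (in radians)."""
    sigma = math.sqrt(variance)
    return [x * cmath.exp(1j * rng.gauss(0.0, sigma)) for x in samples]


# Floor level and bandwidth quoted in the text.
sigma_g2 = gaussian_phn_variance(-110.0, 10e9)
rng = random.Random(1)
noisy = apply_gaussian_phn([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j], sigma_g2, rng)
```

Since the rotation only changes the phase of each sample, the sample magnitudes are preserved; the degradation appears after demodulation, where the random phases spread energy across subcarriers or symbols depending on the waveform.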
In Fig. 10d, we study the effect of changing the SCS by changing $M = \{4096, 2048, 1024, 256, 64\}$ and fixing $B = 10.24$ GHz. Surprisingly, the waveform BERs are retained, which is an important feature that relaxes other design parameters. For example, we can use a small $M$ to ensure low PAPR and high PHN robustness concurrently. Such results are not observed below 6 GHz, where increasing the SCS ensures high robustness to PHN (different low-frequency models).

F. Beam Split
We compare all waveforms using a stochastic THz channel simulator, TeraMIMO [54], in Fig. 11, for $B = 50$ GHz and $f_c = 0.325$ THz. We consider a UM-MIMO system with beamforming, where both Tx and Rx have uniform linear arrays of 32 AEs; we set the communication distance to 1 m. The channel is LoS-dominant with a few multi-path components, and thus tends to be almost flat-fading. We first plot the BERs assuming the absence of beam split; all waveforms achieve similar performance except for OQAM/FBMC (we considered simple MMSE equalization). When adding beam split, OQAM/FBMC is shown to be less affected compared to CP-OFDM. DFT-s-OTFS, OTFS, SC-FDE, and DFT-s-OFDM (with large $\bar{M}$) have high robustness to THz-induced impairments, securing multiple-dB BER gains over both CP-OFDM and OQAM/FBMC. Furthermore, we study the effect of changing the system bandwidth $B$ for both CP-OFDM and OTFS. We keep the previous simulation settings and only change the system bandwidth as $B \in \{1, 30, 40\}$ GHz. Fig. 12 shows that increasing the system bandwidth $B$ results in severe performance degradation due to significant array gain loss. The THz path components squint into different spatial directions at different subcarriers, causing this loss. Moreover, the results confirm the superiority of OTFS compared to CP-OFDM in terms of beam split robustness.
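The array gain loss behind beam split can be approximated with a textbook far-field model, sketched below. It assumes a single half-wavelength-spaced uniform linear array whose phase shifters are matched to the center frequency, and evaluates the normalized gain at another frequency in the band; this is not the TeraMIMO channel model used in the simulations, and the function name and the 60-degree steering angle in the usage note are hypothetical.

```python
import cmath
import math


def normalized_array_gain(num_elements, f_hz, f_center_hz, angle_deg):
    """Normalized gain of a phase-shifter-steered ULA evaluated at frequency f.

    The phase shifters compensate the array response at f_center only, so each
    element is left with a residual phase of n * pi * (f / f_center - 1) * sin(angle).
    """
    psi = math.pi * (f_hz / f_center_hz - 1.0) * math.sin(math.radians(angle_deg))
    total = sum(cmath.exp(1j * n * psi) for n in range(num_elements))
    return abs(total) / num_elements
```

For 32 elements steered to 60 degrees at 0.325 THz, the gain at the edge of a 1 GHz band stays above 0.99, whereas at the edge of a 50 GHz band (0.35 THz) it collapses to a small fraction of the center-frequency gain, in line with the bandwidth-dependent degradation described for Fig. 12.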

G. Performance in Doubly-Selective Channels
Doppler spreads in THz channels are orders of magnitude larger than those in the conventional microwave and mmWave channels (Table II). In Fig. 13, we compare the BERs of the studied schemes in a doubly-selective THz channel. We consider $f_c = 0.5$ THz, $B = 0.25$ GHz, $M = 64$, $N = 16$ (for fairness between DFT-s-OTFS, OTFS, and other waveforms, we concatenate $N$ symbols per frame), and a user velocity of $v = 500$ km/hr. Note that the communication distance is 2 m, and cluster/ray parameters are taken from Table III (waveform parameters are derived following (55)). We consider MMSE equalization for all waveforms (other waveforms such as OQAM/FBMC show the same performance as CP-OFDM and are thus omitted). DFT-s-OTFS and OTFS are more robust than CP-OFDM and other waveforms in TV-FSC, even for larger user velocities ($v$), showing multiple-dB BER gains. Such advantages render DFT-s-OTFS and OTFS exceptionally suitable for high-mobility, high-carrier scenarios (THz V2X scenarios, for example). Note that we consider both integer and fractional Doppler shifts when simulating OTFS and DFT-s-OTFS. We notice that fractional Doppler causes performance degradation due to the inter-Doppler interference. However, OTFS and DFT-s-OTFS are still superior to other SC/MC waveforms in high-speed scenarios.

H. Recommendations
Table V presents a summary of the waveforms' performance under the adopted KPIs in this work. However, the importance of these KPIs varies from one application/use case to another in 6G networks. We first list some of the 6G use cases mentioned in [6] and match them to the appropriate KPIs. Typical applications in further-enhanced mobile broadband (FeMBB) scenarios are holographic MIMO, AR, and VR. The relevant KPIs include, but are not limited to, enhanced SE, UM-MIMO compatibility, and robustness to beam split (as it is related to the usage of wide bandwidths). Thus, we recommend DFT-s-OFDM as a viable solution, as illustrated in Table V.
Ultra-massive machine-type communications (UMMTC) use cases include several applications such as the Internet of everything and smart homes and cities. The major KPIs affecting UMMTC performance are low latency, robustness to hardware impairments (like PHN), and increased energy efficiency (lower PAPR and high SE lead to enhanced energy efficiency). Therefore, DFT-s-OFDM waveforms should be a priority based on our evaluation. Furthermore, for extremely low-power communications (ELPC) use cases, such as the Internet of bio-nano-things, DFT-s-OTFS seems to be the most promising candidate, as energy efficiency is the most important KPI. Extremely reliable and low-latency communications (ERLLC) scenarios involve fully automated driving and the industrial Internet. The robustness to doubly-selective channels, immunity to high Doppler spreads, and low latency are the determinant KPIs. Thus, we recommend DFT-s-OTFS and OTFS. Another use case is THz ISAC, where an energy-efficient waveform with high robustness to Doppler shifts and PHN is desired. We recommend DFT-s-OFDM and DFT-s-OTFS for THz ISAC. Other 6G verticals impose novel/specific requirements on localization. However, our analysis did not include any KPIs directly related to localization performance.

VI. CONCLUSION
In this paper, a comprehensive study of SC/MC waveforms for THz communications is conducted. The analysis and simulation results demonstrate that the candidate 5G waveforms (filter-based OFDM, such as OQAM/FBMC) are not suitable for future B5G/6G networks because of the increased PAPR and complexity. Furthermore, CP-OFDM and SC-FDE share similar characteristics: good SE, moderate TTI latency, high UM-MIMO compatibility, acceptable to high OOB emissions (without any additional pulse-shaping), and relatively low implementation complexity (especially with a single-tap ZF/MMSE equalizer). SC-FDE is shown to be less robust to uncorrelated Gaussian PHN, but it results in low PAPR and high robustness to THz beam split. DFT-s-OFDM is further shown to offer low PAPR and high robustness to both THz PHN and beam split. Finally, DFT-s-OTFS is illustrated to achieve high SE, low TTI latency, good PAPR characteristics, and high robustness to THz impairments. However, these advantages come at the price of increased equalization complexity, which opens important future research directions. Furthermore, DFT-s-OTFS and OTFS outperform all other waveforms in doubly-selective channels. In a nutshell, the findings of this work recommend the use of DFT-s-OFDM and DFT-s-OTFS in B5G/6G sub-THz/THz communications; CP-OFDM can still be used in sub-THz indoor scenarios (TIV-FSC). Other relevant performance metrics can be considered in future works. For instance, researchers should study the waveform robustness to asynchronous access, synchronization procedures in the presence of both STO and CFO, wideband IQI, PA non-linear distortion, multi-user scheduling, and flexible resource allocation.
denote by $\mathbf{D}_{\mathrm{TF}}$ and $\mathbf{D}_{\mathrm{DD}} \in \mathbb{C}^{M \times N}$ the data symbol matrices in the TF and DD domains, respectively. In vector form, $\mathbf{d}_{\mathrm{TF}} = \mathrm{vec}(\mathbf{D}_{\mathrm{TF}})$ and $\mathbf{d}_{\mathrm{DD}} = \mathrm{vec}(\mathbf{D}_{\mathrm{DD}})$. Furthermore, a column of $\mathbf{D}_{\mathrm{TF}}$ is a vector in $\mathbb{C}^{M \times 1}$. In the case of DFT-s-OFDM, the data symbol matrix is $\bar{\mathbf{D}}_{\mathrm{TF}} \in \mathbb{C}^{\bar{M} \times N}$, a sub-matrix of $\mathbf{D}_{\mathrm{TF}}$, where $\bar{M}$ represents the number of Tx symbols modulated over $M$ subcarriers. We also denote by a vector in $\mathbb{C}^{\bar{M} \times 1}$ a column of $\bar{\mathbf{D}}_{\mathrm{TF}}$. Moreover, for DFT-s-OTFS, the data matrix is in $\mathbb{C}^{M \times \bar{N}}$, where $\bar{N}$ represents the number of Tx symbols. Note that S = [s[0], s[1], . . ., s[ −1]] of (

Fig. 11: BER performance of various waveforms in the presence and the absence of beam split ($M = 1024$ and $N = 8$).

Fig. 13: BER performance of various schemes in a THz TV-FSC.
Providing a fair comparison of waveforms under THz-specific scenarios such as oscillator PHN (by studying a Gaussian uncorrelated PHN model), mobility, and beam split.

TABLE I: Key parameters for CP-OFDM at different bands

TABLE II: Maximum Doppler spread ($\nu_{\max}$) at different terminal speeds and frequency bands

TABLE III: Simulation parameters

TABLE IV: Normalized SE and TTI latency of SC/MC schemes

TABLE V: Performance evaluation metrics for different SC/MC waveforms

TABLE VI: Summary of Frequently-Used Acronyms

TABLE VII: Summary of Frequently-Used Acronyms