Precoded Filterbank Multicarrier Index Modulation With Non-Orthogonal Subcarrier Spacing

In this paper, we propose a novel precoded non-orthogonal frequency division multiplexing (NOFDM) with subcarrier index modulation (SIM) to increase spectral efficiency for multicarrier systems. The proposed NOFDM-SIM scheme is constituted by a filterbank multicarrier system under reduced subcarrier spacing, where precoded index modulated symbols are mapped onto the subcarriers. The detrimental effects of inter-carrier interference (ICI) resulting from the non-orthogonality between subcarriers are efficiently mitigated with the aid of eigenvalue-decomposition-based precoding. Our simulation results demonstrate that the proposed precoded NOFDM-SIM system exhibits a similar peak-to-average power ratio to OFDM and OFDM-SIM counterparts while achieving the bit error rate limit of zero-ICI OFDM in a channel-uncoded setting. We analyze the power spectral density of the proposed scheme to demonstrate that the proposed scheme is strictly bandlimited, unlike its orthogonal counterpart. Furthermore, we present detailed investigations of the effects on ICI in the proposed scheme with and without precoding as well as power allocation. Finally, we introduce a three-stage iterative decoding architecture for our proposed scheme and show near-capacity error performance.


I. INTRODUCTION
ORTHOGONAL frequency division multiplexing (OFDM) [1] is a multicarrier transmission technique widely employed in diverse industry standards, such as digital audio broadcasting, digital video broadcasting, asymmetric digital subscriber line, powerline communications, 4G/5G mobile communication, and the IEEE 802.11 series. A key advantage of OFDM driving its widespread use is its ability to convert a broadband dispersive channel into multiple narrowband frequency-flat subchannels. Appending a cyclic prefix (CP) ensures that the linear convolution between the symbol vector and the channel impulse response is converted to a circular convolution, which is equivalent to point-wise multiplication of the channel frequency response with each OFDM subcarrier. This enables single-tap frequency-domain equalization of OFDM symbols. Despite these benefits, OFDM has several limitations. For example, since OFDM employs a rectangular pulse in the time domain, it suffers from high out-of-band (OOB) emissions. To combat the limitations of OFDM, several non-orthogonal frequency division multiplexing (NOFDM) schemes, including filterbank multicarrier (FBMC) [2], generalized FDM (GFDM) [3], universal filtered multicarrier (UFMC) [4], and spectrally efficient FDM (SEFDM) [5], have been developed. In FBMC, OOB emission is reduced by filtering each subcarrier using a narrowband orthogonal shaping filter. Furthermore, owing to its lower sensitivity to carrier frequency offset relative to OFDM, the FBMC system typically performs better than OFDM in a high-mobility scenario. UFMC offers an intermediate choice between OFDM and FBMC: whereas FBMC filters each individual subcarrier and classic CP-OFDM applies no filtering, UFMC filters each clustered set of subcarriers, which allows a shorter filter length than in FBMC and thus makes it suitable for low-latency applications.
Specifically, adjusting the prototype filter and the length of each sub-group of symbols offers flexibility. In SEFDM, the subcarrier spacing in the frequency domain is set smaller than that of OFDM, similar to faster-than-Nyquist (FTN) signaling [6], which packs symbols more tightly than the Nyquist criterion allows in order to improve spectral efficiency. This benefit is achieved at the cost of detrimental inter-carrier interference (ICI). To combat this limitation, several precoding schemes, such as zero-forcing [7] and eigenvalue decomposition (EVD) [8], [9], were developed. Also, in [8] and [9], optimal power allocation employing amplification factors formulated as the reciprocals of the eigenvalues was presented. Moreover, EVD-based SEFDM was extended to support physical-layer security transmission [10]. Index modulation (IM) [11] is a modulation technique that conveys information by activating a subset of the available indices in the space [12], time [13], and frequency domains [14]. Time- and frequency-domain IM techniques convey information by activating indices of time slots [11] and frequency subcarriers [15], respectively. Specifically, some of the information bits are modulated using the activated index subset, and the rest are modulated onto conventional amplitude and phase-shift keying (APSK) symbols. Additionally, IM schemes supporting combinations of multiple resource domains, such as space-time domains [16] and space-time-frequency domains [17], have been developed. Moreover, IM has been introduced in beamspace dimensions [18], [19], filter shape dimensions [20], and distributed node-space dimensions [21]. More recently, in [22], SEFDM with subcarrier index modulation (SEFDM-SIM) was first studied without any precoding, and a minimum mean-square error (MMSE)-based linear receiver was developed. Also, a joint channel estimation and equalization technique was developed for SEFDM-SIM [23]. Furthermore, the time-domain counterpart of SEFDM-SIM was independently developed as FTN-IM [24], [25], [26]. To the best of our knowledge, the conventional SEFDM-SIM scheme has not been investigated with any sophisticated precoding or power allocation.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Against the above background, the novel contributions of this paper are as follows.$^1$
1) We propose a novel precoded NOFDM-SIM architecture to enhance spectral efficiency by exploiting the combined benefits of the reduced subcarrier spacing of filtered SEFDM and the gain from introducing SIM. More specifically, EVD-based precoding is employed to eliminate the effects of ICI imposed by the non-orthogonality and the filtering of subcarriers.
2) We investigate the proposed scheme by analyzing its power spectral density (PSD), bit-error rate (BER), and peak-to-average power ratio (PAPR) performance.
3) The ICI effects of NOFDM-SIM, resulting from the combination of symbol density (occurring due to NOFDM) and symbol sparsity (occurring due to SIM), are analyzed in detail. Furthermore, the effects of precoding and power allocation on ICI are also revealed.
4) We present a three-stage concatenated iterative decoder employing the turbo coding principle for the proposed scheme to demonstrate the near-capacity performance in a frequency-flat Rayleigh fading channel.
The rest of this paper is organized as follows. Section II describes our proposed scheme. Section III presents the BER performance in a channel-uncoded scenario to demonstrate that the proposed scheme does not have any BER penalty over its orthogonal counterpart. Section IV shows the PAPR evaluations to evince that PAPR is not affected by the introduction of non-orthogonality in the proposed method. In Section V, we investigate the bandlimitedness of our scheme resulting from its subcarrier-wise filtering. Section VI evaluates the resultant ICI in the proposed scheme arising from the amalgamation of the contrasting features of reduced subcarrier spacing and sparsely activated subcarriers. The BER performance in near-capacity turbo-encoded scenarios is presented in Section VII. In Section VIII, we discuss the implementation complexity of the proposed method. Finally, Section IX concludes the paper.
$^1$ The original concept of the proposed scheme was presented in our preliminary study [27], which did not cover the detailed performance analysis of PSD, ICI, and BER for a near-capacity turbo-coded scenario, as well as the theoretical BER bound for a channel-uncoded scenario. These comprehensive analyses are included in the present paper.
Notation: We use boldface uppercase letters to denote matrices, boldface lowercase letters for vectors, and lowercase letters with a suffix to denote elements of a vector. $\mathbb{C}^{j \times k}$ and $\mathbb{R}^{j \times k}$ denote the complex and real fields of dimensions $j \times k$, respectively. $P(\cdot)$ denotes the probability of an event, and $E[\cdot]$ denotes the expectation of a random variable. Transpose and Hermitian operations are denoted by $(\cdot)^T$ and $(\cdot)^H$, respectively.

II. SYSTEM MODEL OF PROPOSED SCHEME
Let us consider a precoded NOFDM-SIM scheme operating over a frequency-flat Rayleigh fading channel, where the channel coefficients are modeled by independent and identically distributed complex Gaussian random variables. A block diagram of our transceiver is shown in Fig. 2. In the following subsections, we expound on the features of each block.

A. Frequency-Domain Index Modulation
Information bits are modulated onto $N$ subcarriers, which are divided into $L$ clusters, each containing $M$ subcarriers. Thus, we have the relationship $N = LM$. We assume that $B$ information bits are modulated onto each cluster, giving a total of $LB$ information bits per frame. For the $l$th cluster, the $B$ information bits are divided into two groups of $B_1$ and $B_2$ bits as shown in Fig. 2, where $B = B_1 + B_2$. The first $B_1$ bits are used to select an activation pattern of subcarriers, which corresponds to SIM. For example, if $K$ out of $M$ subcarriers are activated in a given cluster, then $B_1 = \lfloor \log_2 \binom{M}{K} \rfloor$ bits are modulated using the SIM principle. Since $M - K$ subcarriers are deactivated, the symbols modulated on the active subcarriers are scaled by a factor of $M/K$ to maintain the same transmit power. The remaining $B_2$ bits are modulated based on $P$-ary APSK modulation, where $P = 2^{B_2/K}$. Thus, the proposed modulation scheme is characterized by the parameters $(M, K, P)$. The resultant spectral efficiency is formulated by [equation missing]. Moreover, the corresponding energy efficiency of the proposed scheme in bit per Joule per noise level [28] can be expressed as [equation missing], where $P_t$ is the transmit power and $N_0$ is the one-sided noise power spectral density. The $u$th frame of our SIM symbols in the frequency domain is represented by $\mathbf{s}_u = [\mathbf{s}_u^{(0)T}, \ldots, \mathbf{s}_u^{(L-1)T}]^T \in \mathbb{C}^{LM}$, where $\mathbf{s}_u^{(l)} = [s_{u,l}^{(0)}, \ldots, s_{u,l}^{(M-1)}]^T$ and $s_{u,l}^{(m)}$ is the symbol modulated on the $m$th subcarrier in the $l$th cluster. For ease of exposition, the frame structure employing the SIM parameters $M = 4$, $K = 1$ is exemplified in Fig. 3. The symbol block has to satisfy the energy constraint $E[\mathbf{s}_u^H \mathbf{s}_u] = LM\sigma_s^2$, where $\sigma_s^2$ is the average energy per symbol.
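The bit split and cluster mapping described above can be illustrated with a minimal sketch. It is not the authors' mapper: the lexicographic look-up table, the natural-binary labelling, the use of PSK as the APSK constellation, and the names `sim_bits` and `map_cluster` are all assumptions made for illustration. The $M/K$ scaling is interpreted here as a power scaling, so the amplitude of an active symbol is $\sqrt{M/K}$.

```python
from itertools import combinations
from math import comb, floor, log2, sqrt
import cmath


def sim_bits(M, K, P):
    """Return (B1, B2): index bits and constellation bits per cluster."""
    B1 = floor(log2(comb(M, K)))   # bits carried by the activation pattern
    B2 = K * int(log2(P))          # bits carried by the K active symbols
    return B1, B2


def map_cluster(bits, M=4, K=1, P=4):
    """Map B1 + B2 bits to one length-M cluster (P-PSK stands in for APSK, P >= 2)."""
    B1, B2 = sim_bits(M, K, P)
    assert len(bits) == B1 + B2
    # Assumed look-up table: K-subsets in lexicographic order, truncated to 2**B1.
    patterns = list(combinations(range(M), K))[: 2 ** B1]
    active = patterns[int("".join(map(str, bits[:B1])), 2)] if B1 else patterns[0]
    bps = B2 // K                  # bits per active subcarrier
    amp = sqrt(M / K)              # amplitude giving a power scaling of M/K
    cluster = [0j] * M
    for i, idx in enumerate(active):
        chunk = bits[B1 + i * bps: B1 + (i + 1) * bps]
        sym = int("".join(map(str, chunk)), 2)   # natural-binary label (assumption)
        cluster[idx] = amp * cmath.exp(2j * cmath.pi * sym / P)
    return cluster
```

For $(M, K, P) = (4, 1, 4)$, `sim_bits` gives two index bits and two constellation bits per cluster, and `map_cluster([1, 0, 1, 1])` places a single scaled symbol on subcarrier 2 while leaving the other three subcarriers empty, so the cluster energy stays equal to $M$.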

B. EVD-Based Precoding
In our scheme, the EVD-based precoding technique [9] is invoked to attain equivalent ICI-free parallel substreams at the receiver. Let us define a matrix $\mathbf{H} \in \mathbb{C}^{LM \times LM}$ that is composed of the ICI components originating from the non-orthogonality between subcarriers, which is introduced by reducing the subcarrier spacing in the proposed scheme. More specifically, the $k$th-row and $l$th-column entry of $\mathbf{H}$ is expressed as [equation missing], where $g(t)$ is the impulse response of a shaping filter and $F$ is the subcarrier spacing. For ease of exposition, the $\mathbf{H}$ matrix can be further represented as [equation missing], where $c_i$ ($0 \le i \le N-1$) denotes the index of the $i$th subcarrier, and $\phi(c_m, c_n)$ denotes the interference between the $m$th and the $n$th subcarriers. Note that we have the relationships of [equation missing]. Assuming the use of the same shaping filter for each subcarrier, the relationship [equation missing] holds. Hence, $\mathbf{H}$ is a Toeplitz matrix (or a diagonal-constant matrix). Furthermore, for a specific zero-ICI OFDM signaling, having $\tau = 1$ and the use of a sinc pulse, $\mathbf{H}$ becomes the identity matrix owing to the orthogonal relationship of $\phi(c_m, c_n) = 0$ for $m \neq n$. With a sufficiently high subcarrier packing ratio $\tau$, the $\mathbf{H}$ matrix tends to exhibit full rank, which can be verified from its non-zero determinant, thus satisfying positive definiteness. Hence, based on EVD, $\mathbf{H}$ is factorized as $\mathbf{H} = \mathbf{Q}\mathbf{\Lambda}\mathbf{Q}^H$, where $\mathbf{\Lambda} = \mathrm{diag}(\lambda_0, \ldots, \lambda_{N-1})$ contains the eigenvalues and $\mathbf{Q} \in \mathbb{R}^{LM \times LM}$ is an orthonormal matrix containing the $LM$ eigenvectors. At the transmitter, the frequency-domain modulated symbols $\mathbf{s}_u$ are precoded by $\mathbf{Q}\mathbf{P}$, where $\mathbf{P} \in \mathbb{R}^{LM \times LM}$ is a real-valued diagonal matrix, called the power-allocation (PA) matrix. Thus, the precoded symbols of the $u$th frame in the frequency domain are expressed as $\mathbf{x}_u = \mathbf{Q}\mathbf{P}\mathbf{s}_u$. Here, according to [9], the PA matrix $\mathbf{P}$ is given by [equation missing] (10). As observed from (10), the eigenvalues $\lambda_i$, $0 \le i < N$, have to be positive and sufficiently large to be tractable [9]. Also, low eigenvalues tend to increase the effects of inter-frame interference [29].
Hence, the τ value has to be set sufficiently high to avoid the limitations imposed by the numerically ill-conditioned H matrix.
The complex multiplication operations used in the linear precoding are dominant in the complexity of our proposed scheme. However, by approximating the EVD using a fast Fourier transform (FFT) according to [30], the multiplications of Q and Q H can be efficiently implemented [31], which has complexity order O(LM log LM ). For a predetermined number of subcarriers N , subcarrier packing ratio τ , and subcarrier filters employed at the transmitter, the FFT calculations can be carried out offline in advance of transmission since the H matrix is uniquely defined by these three parameters.
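The diagonalization and power allocation of this subsection can be checked numerically with a short sketch. Several pieces are assumptions rather than the paper's definitions: the matrix `H` below is a generic symmetric positive-definite Toeplitz stand-in (not the RRC-derived interference matrix), the PA matrix is taken as $\mathbf{P} = \mathrm{diag}(1/\sqrt{\lambda_i})$, which is one reading of the "reciprocals of the eigenvalues" amplification in [8], [9], the matched-filter noise covariance is taken as $N_0\mathbf{H}$, and the fading coefficient is set to one.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8
rho = 0.4  # assumed coupling parameter of the stand-in matrix

# Stand-in interference matrix (NOT the paper's H): symmetric Toeplitz, positive definite.
H = rho ** np.abs(np.subtract.outer(np.arange(N), np.arange(N)))

lam, Q = np.linalg.eigh(H)            # H = Q diag(lam) Q^T with real orthonormal Q
assert lam.min() > 0                  # positive definiteness is required by the PA
P = np.diag(1.0 / np.sqrt(lam))       # assumed PA: power amplification 1/lam_i

s = rng.choice([-1.0, 1.0], size=N)   # one frame of BPSK symbols
x = Q @ P @ s                         # EVD-based precoding

N0 = 0.1
# Correlated matched-filter noise with covariance N0 * H (real-valued for brevity).
eta = np.linalg.cholesky(N0 * H) @ rng.standard_normal(N)
r = H @ x + eta                       # received samples with unit fading coefficient

r_dag = Q.T @ r                       # ICI cancellation at the receiver
effective = Q.T @ H @ Q @ P           # equals diag(lam) @ P, i.e. diagonal
snr = np.diag(effective) ** 2 / (N0 * lam)   # per-substream SNR, unit-energy symbols
r_eq = r_dag / np.diag(effective)     # one-tap equalization
```

Under these assumptions every entry of `snr` equals $1/N_0$, which is consistent with the statement that PA yields the same received SNR on each subcarrier, and `x @ H @ x` equals `s @ s`, so the stand-in PA leaves the transmit energy unchanged.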

C. Non-Orthogonal FDM Signaling
Having attained the discrete precoded symbols $\mathbf{x}_u$ in the frequency domain, let us now introduce the transmitted signal in the continuous-time complex-valued baseband representation as follows: [equation missing] (11), where $g_{u,v}(t)$ is the basis pulse shape, formulated by [equation missing] (12). More specifically, (12) is obtained by appropriate time-frequency shifting of a prototype transmit filter $g(t)$ in the time-frequency lattice (Fig. 4). The subcarrier spacing is reduced below the orthogonality limit by applying $F = \tau F_0$ ($0 < \tau < 1$) in (12), where $F_0 = 1/T_0$ and $T_0$ denote the subcarrier spacing and the symbol duration of OFDM, respectively.

D. Detection Algorithm
At the receiver, the continuous-time signal at the output of the AWGN channel is expressed as $r(t) = a\,x(t) + n(t)$, where $a$ is the block-fading coefficient, and $n(t)$ is a complex-valued white Gaussian random process with zero mean and a one-sided spectral density of $N_0$. Based on the received signal $r(t)$, the receiver carries out matched filtering, sampling, ICI cancellation, and demodulation to recover the transmitted bits. These steps are described as follows.

1) Matched Filtering & Sampling:
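The received signal is projected onto the basis pulse $g_{u,v}(t)$ based on matched filtering to maximize the signal-to-noise ratio (SNR). Then, the matched-filtered signal is sampled. To be more specific, the sampled symbol on the $v$th subcarrier in the $u$th frame is represented by [equation missing] (14), where $a_u$ is the frequency-flat block-fading coefficient for the $u$th frame and the term $\eta_{u,v}$ corresponds to the additive noise component. From (14), the vectorial form of the sampled symbols is expressed as $\mathbf{r}_u = a_u\mathbf{H}\mathbf{x}_u + \boldsymbol{\eta}_u = a_u\mathbf{H}\mathbf{Q}\mathbf{P}\mathbf{s}_u + \boldsymbol{\eta}_u$ (16), where $\mathbf{r}_u = [r_{u,0}, \ldots, r_{u,N-1}]^T$ and $\boldsymbol{\eta}_u = [\eta_{u,0}, \ldots, \eta_{u,N-1}]^T$. The noise samples $\eta_{u,v}$ are correlated and have the covariance $E[\boldsymbol{\eta}_u\boldsymbol{\eta}_u^H] = N_0\mathbf{H}$, which is specific to SEFDM.

2) ICI Cancellation:
Next, the ICI-contaminated symbols of (16) are decomposed with the aid of EVD. By multiplying the received samples $\mathbf{r}_u$ by the orthonormal matrix $\mathbf{Q}^H$, we obtain $\mathbf{r}_u^{\dagger} = \mathbf{Q}^H\mathbf{r}_u = a_u\mathbf{\Lambda}\mathbf{P}\mathbf{s}_u + \mathbf{Q}^H\boldsymbol{\eta}_u$ (22). Since $\mathbf{\Lambda}$ and $\mathbf{P}$ are diagonal matrices, the ICI effects are eliminated in (22). Furthermore, the noise samples $\mathbf{Q}^H\boldsymbol{\eta}_u$ in (22) are uncorrelated, with the diagonalized covariance matrix $E[\mathbf{Q}^H\boldsymbol{\eta}_u\boldsymbol{\eta}_u^H\mathbf{Q}] = N_0\mathbf{\Lambda}$. As a consequence, the symbols $\mathbf{r}_u^{\dagger}$ correspond to $N$ independent substreams with a scaling term of $a_u\mathbf{\Lambda}\mathbf{P}$ and an uncorrelated additive noise term of $\mathbf{Q}^H\boldsymbol{\eta}_u$. Finally, $\mathbf{r}_u^{\dagger}$ are equalized with a one-tap equalizer to obtain $\mathbf{r}_{u,\mathrm{eq}}$ as follows: [equation missing].

3) Demodulation:
Next, we introduce two types of demodulators, namely the maximum likelihood (ML) detector and the log-likelihood ratio (LLR)-based low-complexity detector, for our proposed NOFDM-SIM scheme.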
3.1) ML Detection:
The ML detector attains optimal detection performance by performing an exhaustive joint search over the legitimate SIM and APSK constellation. The equalized symbols $\mathbf{r}_{u,\mathrm{eq}}$ are split into $L$ sub-blocks $\mathbf{r}_{u,\mathrm{eq}}^{(l)}$ ($0 \le l < L$), each containing $M$ symbols, similar to the transmitter. Then, the estimated information bits in the $l$th sub-block, $\hat{\mathbf{b}}$, are obtained as [equation missing]. Naturally, the complexity of ML detection is significantly high, especially when the number of subcarriers per cluster $M$ is high.
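The exhaustive search can be sketched for a single sub-block as follows. This is an illustration under stated assumptions, not the paper's detector: it covers only $K = 1$, uses PSK in place of APSK, and uses a plain Euclidean metric, which presumes the same noise variance on every equalized substream (otherwise each term would need to be weighted by its own noise variance). The helper names `codebook` and `ml_detect` are hypothetical.

```python
from itertools import product
from math import sqrt
import cmath


def codebook(M=4, K=1, P=4):
    """All legitimate clusters for K = 1, keyed by (active index, PSK label)."""
    assert K == 1, "sketch covers single-active-subcarrier clusters only"
    amp = sqrt(M / K)                      # amplitude giving a power scaling of M/K
    book = {}
    for idx, sym in product(range(M), range(P)):
        c = [0j] * M
        c[idx] = amp * cmath.exp(2j * cmath.pi * sym / P)
        book[(idx, sym)] = c
    return book


def ml_detect(r_eq, book):
    """Return the (index, label) pair whose cluster is closest to r_eq."""
    def dist(c):
        return sum(abs(a - b) ** 2 for a, b in zip(r_eq, c))
    return min(book, key=lambda key: dist(book[key]))
```

For $(M, K, P) = (4, 1, 4)$ the search visits $MP = 16$ candidates per sub-block. The candidate count grows as $\binom{M}{K}P^K$ in general, which is the complexity concern raised above.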

3.2) Low-Complexity LLR-Based Detection:
To avoid the high complexity imposed by an exhaustive ML search, reduced-complexity two-stage LLR-based detection is used to demodulate the symbols in $\mathbf{r}_{u,\mathrm{eq}}$, similar to that employed for OFDM-SIM [32]. More specifically, the LLR of the $m$th symbol $s_m^{(l)}$ in the $l$th sub-block is given by [equation missing] (27). At the first stage of LLR detection, the active indices are estimated as those exhibiting high LLR values. For example, in the case of IM parameters $(M, K, P) = (4, 1, 4)$, the index with the highest LLR value is detected as the activated index.
The symbol corresponding to the detected active index is then demapped according to the look-up table. At the second stage, the APSK symbol on the estimated active subcarrier is demodulated. The LLR expression in (27) is further simplified for ease of implementation. Firstly, note that [equations (28) and (29) missing]. Then, by applying Bayes' theorem to (27) and substituting (28) and (29) into (27), we arrive at [equation missing] (30). Furthermore, the last term of (30) is simplified with the $\max^*(\cdot)$ function [33], which is defined for BPSK modulation as follows: [equation missing]. In general, for $P$-ary APSK modulation, we obtain [33] [equation missing].
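As an illustration of the simplification step, the sketch below assumes that $\max^*(\cdot)$ is the usual Jacobian logarithm, $\max^*(a, b) = \ln(e^a + e^b) = \max(a, b) + \ln(1 + e^{-|a-b|})$, applied recursively when more than two terms are present. The function names are illustrative only.

```python
import math
from functools import reduce


def max_star(a, b):
    """Jacobian logarithm: ln(e^a + e^b), computed without overflow."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def max_star_n(values):
    """Fold max_star over a sequence, giving ln(sum(exp(v) for v in values))."""
    return reduce(max_star, values)
```

With two arguments this corresponds to the two-point (BPSK) case, and `max_star_n` extends it to the $P$ terms of a $P$-ary constellation. Dropping the `log1p` correction term gives the max-log approximation.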

III. BER ANALYSIS IN CHANNEL-UNCODED SCENARIO
In this section, we evaluate the BER performance of the precoded NOFDM-SIM scheme introduced in Section II, with and without PA, for a channel-uncoded scenario. Specifically, we show that the proposed scheme does not incur any BER penalty over its orthogonal counterpart, despite introducing non-orthogonality among subcarriers.
Figs. 5 and 6 show the BER results of the proposed scheme with and without PA, respectively. Two benchmark schemes, namely OFDM with BPSK and OFDM-SIM with QPSK, were considered. Without loss of generality, the root-raised cosine (RRC) shaping filter with a frequency-domain roll-off factor $\beta$ is assumed as the prototype transmit filter $g(t)$ in (11). The pulse is $T$-orthogonal and satisfies the energy constraint $\int_{-\infty}^{\infty} |g(t)|^2 \, dt = 1$. For OFDM, a rectangular pulse was employed. The SIM parameters $(M, K, P) = (4, 1, 4)$ were employed for both the OFDM-SIM and the proposed schemes. Furthermore, the number of subcarriers was set to $N = 10^3$ for both schemes. For simplicity, we considered only the interference occurring between subcarriers in the same frame, thus ignoring the effects of inter-frame interference, like most previous work. ML detection was employed at the receiver of each scheme. It can be observed in Fig. 5 that the introduction of PA in the proposed scheme allows us to accomplish perfect cancellation of the ICI effects, hence obtaining the same BER performance as that of the classic OFDM free from ICI. Recall from (22) that the EVD-based diagonalization, as well as PA, enabled the same received SNR at each subcarrier, which justifies the BER results of our proposed scheme observed in Fig. 5. It is also observable in Fig. 6 that the absence of PA imposed a BER performance penalty on the proposed scheme over the OFDM benchmark, which increased upon a decrease in $\tau$.
The theoretical upper bound on the average error probability, denoted by $\bar{P}_{\mathrm{err}}$, in the proposed scheme for a channel-uncoded scenario is formulated by [equation missing], where $\mathbf{s}$ and $\mathbf{s}'$ denote two distinct symbols composed of $B$ bits each, and $d(\mathbf{s}, \mathbf{s}')$ denotes their Hamming distance. $P(\mathbf{s}, \mathbf{s}')$ indicates the pairwise error probability, which is expressed for the AWGN channel as follows: [equation missing], where $Q[\cdot]$ denotes the tail probability of the standard normal distribution, defined as $Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-t^2/2} \, dt$. The theoretical BER bounds with and without PA are calculated using (35) and shown in Figs. 5 and 6 to further validate our system model.
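A bound of this kind can be evaluated numerically along the following lines. The sketch assumes the textbook union-bound form, in which the Hamming-distance-weighted pairwise error probabilities are averaged over all $2^B$ symbols and divided by $B$, together with the standard AWGN pairwise error probability $Q\big(\sqrt{\|\mathbf{s} - \mathbf{s}'\|^2 / (2N_0)}\big)$. Both forms are assumptions and may differ from the paper's expressions, in particular when PA is not applied and the substreams have unequal SNR.

```python
import math
from itertools import product


def q_func(x):
    """Tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def union_bound_ber(symbols, n0):
    """Union bound on BER; symbols maps bit tuples to lists of complex samples."""
    keys = list(symbols)
    B = len(keys[0])                       # bits per symbol
    total = 0.0
    for a, b in product(keys, repeat=2):
        if a == b:
            continue
        hamming = sum(x != y for x, y in zip(a, b))
        d2 = sum(abs(u - v) ** 2 for u, v in zip(symbols[a], symbols[b]))
        total += hamming * q_func(math.sqrt(d2 / (2.0 * n0)))
    return total / (B * len(keys))
```

For plain BPSK, `union_bound_ber({(0,): [1 + 0j], (1,): [-1 + 0j]}, n0)` reduces to $Q(\sqrt{2/N_0})$, the exact BPSK error rate, which is a convenient sanity check before applying the function to a SIM codebook.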

IV. ANALYSIS OF PEAK-TO-AVERAGE POWER RATIO
Having presented the insights on the BER of the proposed scheme in Section III, here we investigate another key performance metric, the peak-to-average power ratio (PAPR). The PAPR [36], [37] captures the amplitude fluctuations in the envelope of a multicarrier transmit signal. Since individually modulated subcarriers are added to generate a transmit signal, a high peak may stochastically occur in time. This is illustrated in Fig. 7, where four subcarriers are coherently added to generate overshoots in the envelope.
For our transmit signal expressed in (11), the PAPR is defined by $\mathrm{PAPR} = \max_t |x(t)|^2 / E[|x(t)|^2]$. The PAPR comparison of Fig. 8 employed the SIM parameters $(M, K) = (4, 1)$ and QPSK modulation. An RRC shaping filter was used for all cases except for OFDM. The subcarrier packing ratio $\tau$ and the roll-off factor $\beta$ of the RRC shaping pulse were chosen as $(\tau, \beta) = (1, 0.5)$, $(0.9, 0.5)$, $(0.8, 0.5)$, and $(0.7, 0.5)$ in order to satisfy the condition of positive definiteness of the $\mathbf{H}$ matrix. A rectangular pulse in the time domain was used for OFDM. The number of subcarriers was set to $10^3$, and the evaluation was done without channel coding. Observe in Fig. 8 that the proposed scheme exhibited a similar PAPR to the other benchmark schemes,$^4$ regardless of the system parameters employed. Since a sufficiently high subcarrier packing ratio $\tau$ was chosen to avoid any numerical ill-conditioning of the $\mathbf{H}$ matrix and the associated inaccuracy in eigenvalue computation, the employed PA did not affect the PAPR performance in our evaluations.
$^4$ PAPR reduction [36], [37], [38] has been a long-standing issue in multicarrier systems. The popular PAPR reduction techniques developed for OFDM have been studied in the context of OFDM-IM. For example, selective mapping (SLM) has been investigated for OFDM-IM in [39] and [40]. Active constellation extension (ACE) of OFDM has also been extended to OFDM-IM by exploiting inactive subcarriers in [41]. Furthermore, adding an optimal dither signal to the inactive subcarriers of OFDM-IM was introduced in [42], which was shown to outperform SLM and ACE. A multilevel dither signal was proposed in [43] to allow higher freedom of dithering in the inactive subcarriers. Additionally, partial transmit sequences were adopted to reduce the PAPR of OFDM-IM in [44]. Since the proposed NOFDM-SIM scheme tends to exhibit a high PAPR similar to OFDM-IM, it may be beneficial to incorporate the above-mentioned PAPR reduction techniques into the proposed NOFDM-SIM scheme. However, a detailed investigation is beyond the scope of this paper.
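To make the metric concrete, the following sketch computes the PAPR of a sampled signal using the standard ratio of peak to mean instantaneous power. The signal generator is a rectangular-pulse OFDM stand-in built from an oversampled IFFT. It does not reproduce the RRC-filtered, $\tau$-compressed NOFDM signal of (11), and the function names and oversampling factor are assumptions.

```python
import numpy as np


def papr_db(x):
    """PAPR of a sampled complex baseband signal, in dB."""
    power = np.abs(x) ** 2
    return 10.0 * np.log10(power.max() / power.mean())


def ofdm_time_signal(symbols, oversample=4):
    """Oversampled IFFT of one frame (rectangular-pulse OFDM stand-in)."""
    symbols = np.asarray(symbols, dtype=complex)
    n = len(symbols)
    padded = np.zeros(n * oversample, dtype=complex)
    padded[: n // 2] = symbols[: n // 2]            # non-negative frequencies
    padded[-(n - n // 2):] = symbols[n // 2:]       # negative frequencies
    return np.fft.ifft(padded)
```

A single active subcarrier yields a constant envelope and hence 0 dB, whereas 64 identical symbols add coherently and give $10\log_{10}64 \approx 18.1$ dB, the worst case for 64 subcarriers. Random QPSK frames fall between these extremes, which is why the PAPR is usually reported as a distribution over many frames.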

V. ANALYSIS OF BANDWIDTH LIMITATION
In this section, we present the PSD of the proposed scheme, which captures the advantage of subcarrier-wise filtering. More specifically, the proposed scheme's bandwidth limitation, i.e., the OOB emission, is compared with those of the conventional OFDM and OFDM-SIM schemes by analyzing the PSD, which characterizes the power distribution of a stochastic signal over a continuum of frequencies. For the transmit signal $x(t)$ of (11), its PSD is obtained as [45] [equation missing], where $A(\tau)$ is the autocorrelation function of $x(t)$ and $A(0)$ is the average power of $x(t)$. The periodogram technique [31] is used for calculating the PSD by considering only discrete fundamental harmonic frequencies instead of continuous frequency. Specifically, $x(t)$ of (11), obtained from bandlimited complex-valued symbols, is sampled to obtain a discrete time series. Then, the PSD is approximated by the periodogram, which is expressed as [equation missing]. In our simulations, a sampling interval of $10^{-2}$ s is used for discretizing the continuous-time signal. Figs. 9-11 show the PSD calculated within the time interval $[-100T_0, NT_0 + 100T_0]$. Fig. 9 shows the PSD comparisons of the classic OFDM-SIM scheme and the proposed precoded NOFDM-SIM scheme without PA. The number of subcarriers in OFDM was set to 20, and the SIM parameters $(M, K) = (4, 1)$ with QPSK were employed. For the proposed scheme, the subcarrier packing ratio was selected from $\tau = 0.83$, $0.71$, and $0.625$, which resulted in 24, 28, and 32 subcarriers, respectively, in the same bandwidth as OFDM. The proposed scheme is strictly bandlimited, unlike OFDM-SIM, as evident from Fig. 9. Specifically, the low sidelobe level of the proposed scheme is a consequence of the subcarrier filtering applied at the transmitter, as mentioned in Section II-C. Fig. 10 compares the PSD of the proposed scheme with PA and that of OFDM-SIM. The $\tau$ value did not affect the sidelobe level much since a mild compression was assumed.
The proposed scheme with a tighter subcarrier packing ratio of τ = 0.625 exhibited a marginally increased sidelobe level, but is still significantly lower than that of OFDM-SIM.
Finally, Fig. 11 compares the PSD of the proposed scheme with PA for varying roll-off factor β of the RRC shaping filter while maintaining a fixed subcarrier packing ratio of τ = 0.71. Observe that the sidelobe level goes higher as β is reduced. Our additional extensive simulations showed that there was no significant difference in the PSD sidelobe levels regardless of the modulation size and of the introduction of SIM under a transmission rate of 1 bit per subcarrier.
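The subcarrier counts quoted in this section for the 20-subcarrier OFDM bandwidth follow from the compressed spacing $F = \tau F_0$. A minimal check, assuming that edge effects of the shaping filter are neglected and that the count is rounded to the nearest integer (the helper name is illustrative):

```python
def subcarriers_in_band(n_ofdm, tau):
    """Subcarriers of spacing tau * F0 that fit the bandwidth of n_ofdm OFDM subcarriers."""
    return round(n_ofdm / tau)


counts = {tau: subcarriers_in_band(20, tau) for tau in (0.83, 0.71, 0.625)}
```

This gives 24, 28, and 32 subcarriers for $\tau = 0.83$, $0.71$, and $0.625$, respectively, matching the values used for Figs. 9 and 10.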

VI. ANALYSIS OF INTER-CARRIER INTERFERENCE
In this section, we examine the effects of ICI in the proposed scheme. In contrast to the conventional OFDM free from ICI, the proposed scheme suffers from two-fold ICI. First, each subcarrier is bandlimited by an RRC shaping filter, which introduces non-orthogonality among subcarriers, as detailed in Section II-C. Second, the spacing between subcarriers is reduced below the orthogonal limit to increase the spectral efficiency by applying $F = \tau F_0$, $\tau < 1$, in (12), hence introducing ICI.$^5$ While the reduction of subcarrier spacing decreases the minimum Euclidean distance, the SIM principle described in Section II-A deactivates a subset of the subcarriers, which introduces sparsity. Hence, the proposed scheme is capable of striking a tradeoff between the symbol density and the sparsity on overall ICI. Additionally, it is similarly beneficial to characterize the effects of precoding and power allocation on ICI.
ICI imposed on the $n$th subcarrier from all other subcarriers within a single block is characterized in the time-domain complex-valued baseband representation as follows: [equation missing] (40), which involves the complex baseband signal corresponding to the pulse-shaped $k$th subcarrier. Here, we numerically evaluated the ICI of (40)$^6$ for a single block transmission with $N = 8$ subcarriers to maintain a practical computational complexity. The $2^N$ legitimate symbol blocks were considered, and the ICI at each of the $N$ subcarriers was computed. Fig. 12 compares the cumulative distribution functions (CDF) of ICI for the proposed scheme with and without SIM with that of classic OFDM, each employing a transmission rate of 1 bit per subcarrier. The RRC filter with the roll-off factor $\beta = 0.5$ and subcarrier packing ratios $\tau = 0.9, 0.8, 0.7$ were considered. Since our proposed scheme introduces a controlled amount of ICI to improve spectral efficiency over OFDM, Fig. 12 enables us to quantify this deliberately introduced ICI. Observe in Fig. 12 that ICI increases upon reducing the subcarrier packing ratio $\tau$. Furthermore, the CDF curves for NOFDM and NOFDM-SIM are similar, with crossover points. However, as mentioned above, the proposed scheme with SIM has a lower number of active subcarriers than the non-SIM counterpart. Hence, it is expected to have a lower ICI. To show this, in Fig. 13, we compare ICI between the two schemes, where the scaling factor $M/K$ imposed on the transmit signal is deactivated, while it is not in Fig. 12.
It was found that the proposed scheme with SIM exhibited significantly lower ICI than the non-SIM case when the scaling of $M/K$ is deactivated, as expected. Hence, the similarity in ICI occurring in the proposed schemes with and without SIM may be caused by the power scaling factor. Furthermore, in Fig. 14, the effects of precoding and power allocation on ICI in the NOFDM scheme with SIM are depicted. Observe in the figure that the introduction of precoding renders an otherwise staircase CDF plot as a smoother curve. Moreover, employing EVD-based power allocation leads to a further increase in ICI.
$^5$ However, the ICI introduced in our proposed scheme is efficiently canceled out by precoding at the transmitter and weighting at the receiver.
$^6$ Note that (40) does not include the effects of ICI resulting from other frames; only the ICI occurring in a single frame is represented. This is in line with the assumption considered in most previous studies [22], [23], [34], [35], [46].
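The exhaustive enumeration behind such CDF plots can be sketched as follows. This is only a simplified illustration: the interference is evaluated in the matched-filter domain as $\sum_{k \neq n} h_{n,k} s_k$ rather than through the time-domain expression (40), the coupling matrix is a generic Toeplitz stand-in rather than the RRC-derived $\mathbf{H}$, and unprecoded BPSK blocks without SIM are assumed.

```python
from itertools import product
import numpy as np

N = 8
rho = 0.3  # assumed coupling between neighbouring subcarriers (stand-in value)

# Stand-in Toeplitz coupling matrix with its diagonal (desired-signal terms) removed.
H = rho ** np.abs(np.subtract.outer(np.arange(N), np.arange(N)))
H_off = H - np.diag(np.diag(H))

# All 2**N legitimate BPSK blocks, one per row.
blocks = np.array(list(product([-1.0, 1.0], repeat=N)))

# |ICI| on every subcarrier of every block, flattened into one sample set.
ici = np.abs(blocks @ H_off.T).ravel()

levels = np.sort(ici)
cdf = np.arange(1, levels.size + 1) / levels.size   # empirical CDF values
```

Because unprecoded BPSK blocks produce only a finite set of ICI levels, plotting `cdf` against `levels` gives a staircase. Precoding mixes the symbols across subcarriers, which is one way to understand the smoother curves reported for the precoded case.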

VII. NEAR-CAPACITY TURBO-ENCODED EVD-NOFDM-SIM WITH ITERATIVE DETECTION
Having characterized the fundamental properties of the proposed scheme, we now investigate its performance limits under the assumption of employing powerful channel coding and iterative decoding. More specifically, we show the performance of our scheme with three-stage serially concatenated turbo coding and iterative detection to demonstrate its near-capacity performance. In contrast to the BER evaluation of Section III, which considered an uncoded AWGN channel with ML detection, here symbol detection is performed by a max-log-MAP detector through the iterative exchange of soft extrinsic information. Throughout this section, we consider communication over a frequency-flat Rayleigh fading channel. Fig. 15 shows a block diagram of the channel-coded architecture considered. The information bit sequence is encoded by a recursive systematic convolutional (RSC) encoder, followed by interleaving using the first interleaver. The interleaved bits are further encoded by a unity rate code (URC). The output bits from the URC are interleaved by the second interleaver, and the interleaved bits are loaded into each cluster of $M$ subcarriers. More specifically, the first $B_1$ coded bits are used to determine which subcarriers in each cluster will be activated, and the remaining $B_2$ coded bits are mapped to suitable APSK symbols, which are loaded onto the active subcarriers, according to our NOFDM-SIM principle.
[Figure caption: CDF of ICI in precoded NOFDM-BPSK and precoded NOFDM-SIM without the symbol scaling imposed by the sub-block power constraint. The RRC filter with roll-off factor $\beta = 0.5$ was used in the precoded NOFDM schemes. Subcarrier packing ratios $\tau = 0.7, 0.8, 0.9$ were employed.]

A. Turbo-Coded Architecture
The receiver is constituted by three serially concatenated soft-output detectors: a NOFDM-SIM soft demapper, a URC decoder, and an RSC decoder employing the max-log-MAP algorithm. Extrinsic information is exchanged between the three soft-output decoder stages in an iterative manner. More specifically, the joint demapper block computes the LLRs from the post-diagonalization received symbols and the soft a-priori information from the URC decoder in each inner iteration using the max-log expression of [47] for $L_e(b_i)$ [equation missing], where $L_e(b_i)$ and $L_a(b_j)$ denote the extrinsic LLR and the a-priori LLR, respectively, corresponding to the $i$th and $j$th bits. Furthermore, $\Gamma_i(1)$ and $\Gamma_i(0)$ denote the sets of legitimate symbol points in the hybrid SIM-APSK constellation where the $i$th bit is 1 and 0, respectively. Also, $L_e(b_i)$ is exchanged between the soft demapper and the URC decoder in each inner decoding iteration and between the URC decoder and the RSC decoder in each outer decoding iteration. We denote the number of inner decoding iterations between the soft demapper and the URC decoder by $I_{\mathrm{in}}$ and the number of outer decoding iterations between the URC and RSC decoders by $I_{\mathrm{out}}$. Thus, each received block is detected by a total of $I_{\mathrm{out}} \times I_{\mathrm{in}}$ iterations.

B. BER Performance
Here, we provide numerical results to show the BER performance of the turbo-encoded NOFDM-SIM scheme, where both the activation and deactivation of power allocation were considered. Furthermore, the conventional OFDM-SIM system was selected as a benchmark 7 scheme. The basic system parameters used in our simulations are provided in Table II. We considered the total available system bandwidth to be the same for all the schemes. At the transmitter of the proposed scheme, N = 10 3 subcarriers were considered regardless of the subcarrier packing ratio. Correspondingly, the conventional OFDM scheme has a lower number of subcarriers in the same bandwidth than the proposed NOFDM scheme. 7 Other NOFDM schemes, such as GFDM and UFMC, are also potential benchmarks that can be studied with SIM and reduced subcarrier spacing. Such schemes have originated with the motivation of combating the OOB leakage of OFDM and improving the time-frequency localization by applying filtering, pulse shaping, and precoding. Detailed comparisons of their OOB emissions, BER, latency, carrier frequency offset and Doppler diversity can be found in literature [48], [49], [50], [51]. In this paper, we focus our attention on characterizing the fundamental performance metrics of a precoded FBMC-SIM with reduced subcarrier spacing and compare it with its most popular relative, OFDM-SIM.  Table III shows the representative numbers of subcarriers in the conventional OFDM scheme and the proposed precoded NOFDM-SIM scheme for different values of the subcarrier packing ratio used in our simulations. Since the classic OFDM scheme has a lower number of subcarriers than the NOFDM scheme, a higher-order modulation has to be adopted in OFDM to achieve the same transmission rate as the proposed NOFDM. The half-rate RSC code was used at the transmitter, and the second interleaver's length was set to 2 × 10 5 bits, which is sufficient for demonstrating the theoretical performance limit. 
Furthermore, the numbers of inner and outer iterations were set to $I_{\mathrm{in}} = 2$ and $I_{\mathrm{out}} = 40$, respectively. The target transmission rate was set to 1.5 bps/Hz. The roll-off factor $\beta$ of the RRC shaping pulse was set to either 0.25 or 0.5. For $\beta = 0.25$, the subcarrier packing ratio $\tau$ was chosen from the set $\{0.8, 0.9\}$, and for $\beta = 0.5$, it was chosen from the set $\{0.7, 0.8, 0.9\}$. For the benchmark OFDM scheme, an ideal rectangular filter was employed, while the subcarrier spacing was set to satisfy subcarrier orthogonality.

Fig. 16 shows the BER performance of the proposed NOFDM-SIM scheme with PA, employing an RRC shaping filter with roll-off factor $\beta = 0.5$. The BER performance of the classic OFDM-SIM scheme is also plotted. Observe that the proposed scheme outperforms the classic OFDM-SIM scheme by approximately 8 dB, 4 dB, and 1.8 dB at $\tau = 0.7$, 0.8, and 0.9, respectively, at a BER of $10^{-4}$. As seen in Fig. 16, the BER performance of the proposed scheme improves as the subcarrier packing ratio decreases, owing to the benefits of the reduced modulation order. Fig. 17 shows the BER performance of the proposed precoded NOFDM-SIM scheme without PA, employing an RRC shaping filter with roll-off factor $\beta = 0.5$. Observe that even without PA, the proposed scheme outperforms the classic OFDM-SIM scheme by 4.4 dB, 2 dB, and 0.8 dB at $\tau = 0.7$, 0.8, and 0.9, respectively, at a BER of $10^{-4}$.

Figs. 18 and 19 show the BER performance of the proposed scheme with and without PA, respectively, both employing an RRC shaping filter with roll-off factor $\beta = 0.25$. As shown in Fig. 18, compared to the classic OFDM-SIM scheme, gains of 4 dB and 1.8 dB are observed at $\tau = 0.8$ and 0.9, respectively, at a BER of $10^{-4}$. Furthermore, in Fig. 19, we observe gains of 2.6 dB and 1.2 dB at $\tau = 0.8$ and 0.9, respectively. Thus, the proposed precoded NOFDM-SIM scheme can perform better than the OFDM-SIM scheme even without PA. This makes the scheme without PA attractive, since it avoids the complications arising from small eigenvalues of the interference matrix.

C. Convergence of Turbo Detector
Here, we discuss the convergence of the serially concatenated RSC-URC code and its iterative detection in our proposed NOFDM-SIM scheme. The three-stage structure allows us to employ a simple RSC(2, 1, 2) code instead of a highly complex channel encoder while still achieving near-capacity performance. The hard-decision bits are generated after exchanging extrinsic information over $I_{\mathrm{in}}$ inner iterations and $I_{\mathrm{out}}$ outer iterations, as described in Section VII-A. The number of bit errors decreases as the number of iterations increases, and after several iterations, no further significant reduction in the BER is observed. Note that the maximum number of outer decoding iterations is set to 40 in the simulations of Figs. 16-19. However, our extensive simulations showed that the results nearly converged at around $I_{\mathrm{out}} = 8$ at a BER of $10^{-5}$. Fig. 20 shows the convergence behavior for outer iterations from $I_{\mathrm{out}} = 1$ to 20.
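Since the BER saturates after a limited number of outer iterations, a receiver could stop iterating early. The following is a minimal sketch of one possible stopping rule, assuming that convergence is declared when the hard decisions no longer change between consecutive outer iterations. The function name and the `max_flips` threshold are hypothetical, and this rule is not taken from the simulations reported here.

```python
import numpy as np

def outer_iterations_until_stable(hard_decisions, max_flips=0):
    """Return the 1-based outer iteration at which decisions stop changing.

    hard_decisions : iterable of equal-length 0/1 arrays, one per outer iteration
    max_flips      : number of changed bits still tolerated as "converged"
    Returns None if the decisions never stabilize within the given iterations.
    """
    previous = None
    for iteration, bits in enumerate(hard_decisions, start=1):
        bits = np.asarray(bits)
        if previous is not None:
            flips = np.count_nonzero(bits != previous)
            if flips <= max_flips:
                return iteration
        previous = bits
    return None
```

Such a rule trades a small risk of stopping too early for a reduction in the number of decoder activations, and the tolerance would need to be tuned for a given interleaver length.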

VIII. DISCUSSION ON IMPLEMENTATION COMPLEXITY
In this section, we discuss several aspects of the implementation complexity of our proposed scheme relative to classic OFDM.

A. Complexity of EVD
The proposed scheme performs the EVD of the matrix $\mathbf{H} \in \mathbb{C}^{LM \times LM}$, which has a complexity of $O((LM)^3)$ arithmetic operations. Recent work [52] has revealed that the complexity of the EVD operation could be reduced to $O((LM)^{\omega})$, where $\omega < 2.376$ is the exponent of matrix multiplication. Furthermore, we assume that $\mathbf{H}$, $\tau$, and $\beta$ are perfectly known in advance of data transmission. Hence, the EVD operation can be performed offline, and the resulting values can be stored in memory to avoid high real-time computational complexity. Also, as outlined in Section II-B, the EVD operation may be approximated by an FFT algorithm [30], [31], which has a complexity of $O(LM \log(LM))$.
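The offline EVD and the resulting diagonalization can be sketched as follows. This is a toy illustration under stated assumptions: the Hermitian Toeplitz matrix below is a hypothetical stand-in for the interference matrix $\mathbf{H}$ (which in the proposed scheme is determined by $\tau$ and $\beta$), and precoding with the eigenvector matrix followed by combining with its Hermitian transpose is assumed here as one common way of exploiting the EVD.

```python
import numpy as np

def evd_offline(H):
    """Eigendecomposition H = V diag(lam) V^H of a Hermitian matrix.

    Intended to run once, offline; lam and V can then be stored in memory.
    """
    lam, V = np.linalg.eigh(H)
    return lam, V

# Hypothetical interference matrix: real symmetric Toeplitz, positive definite.
n = 8
idx = np.arange(n)
H = 0.4 ** np.abs(idx[:, None] - idx[None, :])

lam, V = evd_offline(H)

# Illustrative symbols (QPSK-like, not the SIM-APSK mapping of the scheme).
rng = np.random.default_rng(0)
s = rng.choice([-1.0, 1.0], size=n) + 1j * rng.choice([-1.0, 1.0], size=n)

x = V @ s              # precoding with the eigenvectors
r = H @ x              # noiseless interference model
z = V.conj().T @ r     # combining: z[k] = lam[k] * s[k], no cross terms
```

After combining, each symbol sees a scalar gain equal to one eigenvalue, so small eigenvalues translate directly into weak sub-channels. This is the situation that power allocation has to deal with.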

B. Complexity of SIM Detection
The optimal ML detector in (26) may have a high complexity, which increases with the sub-block length $M$. This is because an exhaustive search is performed over all possible index activation patterns and APSK signal constellation points. Specifically, the ML detector requires $O(2^{B_1} P^K)$ complex multiplications [32] to demodulate each of the $L$ sub-blocks. Thus, it may be practically infeasible for high values of $B_1$ and $K$. However, the low-complexity LLR-based detector mentioned in Section II-D may require as few as $O(M)$ complex multiplications per subcarrier. Moreover, there are several low-complexity detection algorithms for SIM [53], [54], [55], [56], [57], [58], [59], [60], which may be applicable to our proposed scheme. Specifically, a low-complexity IM detector exploiting sequential Monte Carlo theory [54] has been shown to produce near-optimal error performance at a significantly lower complexity than the optimal detector. Compressed-sensing-based detectors for IM were also developed in [56], [57], [58], and [61] to reduce the complexity. Furthermore, in [55], two low-complexity detectors for IM using local ML-based MMSE and local ordered block-based MMSE principles were presented.
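The growth of the ML search space can be made concrete with a short calculation. This is a sketch under the usual SIM convention, assumed here, that $B_1 = \lfloor \log_2 \binom{M}{K} \rfloor$ index bits select $K$ active subcarriers out of $M$, and that $P$ is the APSK constellation size. The function name is hypothetical.

```python
from math import comb, floor, log2

def ml_candidates_per_subblock(M, K, P):
    """Return (B1, number of candidates) for exhaustive ML search.

    ASSUMPTION: B1 = floor(log2(C(M, K))) index bits per sub-block, and each
    of the K active subcarriers carries one of P constellation points.
    """
    b1 = floor(log2(comb(M, K)))
    return b1, (2 ** b1) * (P ** K)

if __name__ == "__main__":
    for M, K, P in [(4, 2, 4), (8, 4, 4), (16, 8, 4)]:
        b1, count = ml_candidates_per_subblock(M, K, P)
        print(f"M={M}, K={K}, P={P}: B1={b1}, candidates={count}")
```

The candidate count grows exponentially in both $B_1$ and $K$, which is why reduced-complexity detectors become attractive for longer sub-blocks.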

C. Complexity of Turbo-Coded Architecture and Three-Stage Iterative Detection
Assuming that perfect knowledge of channel state information is available, the complexity of our ideal RSC-URC coded architecture and the three-stage demapper/decoder system is expressed as in [62], where $C_{\mathrm{RSC}}$, $C_{\mathrm{URC}}$, and $C_{\mathrm{Demapper}}$ indicate the complexities of the RSC decoder, the URC decoder, and the soft-output demapper, respectively, shown in Fig. 15. Note that the turbo-coded architecture and the iterative detection are aimed at numerically demonstrating the maximum achievable limit of the proposed scheme. Thus, we employed the idealized parameters of a long interleaver of $2 \times 10^5$ bits and a total of $I_{\mathrm{in}} \times I_{\mathrm{out}} = 2 \times 40 = 80$ decoding iterations with the near-optimal max-log-MAP algorithm. However, in a realistic scenario, the optimal detector can be replaced by one of the low-complexity symbol detectors mentioned above. Also, convergence may be achieved within 8 to 10 outer iterations. Hence, complexity reduction is possible by fine-tuning the detector parameters based on the requirements of an application.
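The effect of the iteration counts on the decoding effort can be illustrated with a simple accounting. This sketch assumes one common schedule, in which every outer iteration consists of $I_{\mathrm{in}}$ demapper and URC activations followed by one RSC activation. The expression of [62] is not reproduced here and may weight the components differently. The function name and the unit costs are hypothetical.

```python
def three_stage_cost(i_in, i_out, c_rsc=1.0, c_urc=1.0, c_demapper=1.0):
    """Count component activations and a weighted total cost.

    ASSUMED schedule: per outer iteration, the demapper and URC decoder each
    run i_in times, and the RSC decoder runs once.
    """
    activations = {
        "demapper": i_in * i_out,
        "urc": i_in * i_out,
        "rsc": i_out,
    }
    total = (activations["demapper"] * c_demapper
             + activations["urc"] * c_urc
             + activations["rsc"] * c_rsc)
    return activations, total

if __name__ == "__main__":
    full, full_cost = three_stage_cost(2, 40)      # idealized setting
    reduced, reduced_cost = three_stage_cost(2, 8)  # near-converged setting
    print(full, reduced, reduced_cost / full_cost)
```

Under this accounting, stopping at 8 instead of 40 outer iterations scales every component's activation count by the same factor of 8/40, independently of the individual unit costs.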

IX. CONCLUSION
In this paper, we proposed a novel precoded NOFDM scheme with SIM to harness the combined benefits of NOFDM and SIM, aimed at improving spectral efficiency. With the aid of EVD-based precoding, the detrimental ICI effects induced by the non-orthogonality among subcarriers are removed. Our numerical results for the channel-uncoded scenario demonstrated that the proposed scheme is capable of achieving the ideal ICI-free BER performance while attaining a higher normalized throughput. We also demonstrated that, as a benefit of subcarrier-wise filtering, the proposed scheme is strictly bandlimited, unlike the classic OFDM-SIM scheme. Our PAPR analysis found no increase in the PAPR of the proposed scheme over that of classic OFDM-SIM. Our detailed investigations of ICI unveiled the tradeoff between the symbol density and the sparsity in the proposed scheme, as well as its overall impact on ICI. Furthermore, the BER of the proposed scheme with near-capacity turbo coding revealed a competitive advantage over the conventional orthogonal counterpart, which is achievable owing to a higher number of subcarriers and a lower modulation order.