Near-Instantaneously Adaptive Learning-Assisted and Compressed Sensing-Aided Joint Multi-Dimensional Index Modulation

Index Modulation (IM) is capable of striking an attractive performance, throughput and complexity trade-off. The concept of Multi-dimensional IM (MIM) combines the benefits of IM in multiple dimensions, including the space and frequency dimensions. On the other hand, IM has also been combined with compressed sensing (CS) for attaining an improved throughput. In this paper, we propose Joint MIM (JMIM) that can utilize the time-, space- and frequency-dimensions in order to increase the IM mapping design flexibility. Explicitly, this is the first paper developing a jointly designed MIM architecture combined with CS. Three different JMIM mapping methods are proposed for a space- and frequency-domain aided JMIM system, which can attain different throughput and diversity gains. Then, we extend the proposed JMIM design to three dimensions by combining it with the time domain. Additionally, to circumvent the high detection complexity of the proposed CS-aided JMIM design, we propose Deep Learning (DL) based detection. Both Hard-Decision (HD) and Soft-Decision (SD) detection are conceived. Additionally, we investigate the adaptive design of the proposed CS-aided JMIM system, where a learning-based adaptive modulation configuration method is applied. Our simulation results demonstrate that the proposed CS-aided JMIM (CS-JMIM) is capable of outperforming its CS-aided separate-domain MIM counterpart. Furthermore, the learning-aided adaptive scheme is capable of increasing the throughput while maintaining the required error probability target.

Frequency Division Multiplexing (OFDM) is referred to as Subcarrier-IM (SIM), where only a fraction of the subcarriers is activated for signal transmission and the index of the active subcarriers conveys extra information bits [7]. The effective signal power of the subcarriers activated in the Frequency Domain (FD) is amplified, without increasing the time domain signal power after the Inverse Fast Fourier Transform (IFFT). This results in a higher Signal-to-Noise Ratio (SNR) for the modulated symbols without requiring extra radiated power. Then, Tsonev et al. [8] proposed an enhanced SIM and Basar et al. [9] conceived a novel IM-aided OFDM (OFDM-IM) scheme for increasing the spectral efficiency. However, subcarrier-index modulated OFDM suffers from a significant throughput reduction compared to classic OFDM due to the deactivation of a number of subcarriers. Hence, Zhang et al. [10] proposed an improved SIM concept relying on Compressed Sensing (CS) [11], which benefits from the sparsity of symbols in the FD by compressing the sparse transmit vector [12].
To further increase the overall performance, Datta et al. proposed the concept of Generalized SIM (GSIM) and proved that Generalized Space-and-Frequency IM (GSFIM) achieves better performance than MIMO-OFDM. Their solution conveyed extra information in the SM part compared to GSIM [13]. However, the detection complexity of GSFIM escalates. Hence, Chakrapani et al. [14] proposed a message passing based low-complexity detection method for reducing the complexity of GSFIM detection. Furthermore, inspired by the SM and Quadrature SM (QSM) concepts [15], Quadrature Space-Frequency IM (QSF-IM) was proposed in [16], which applies a twin-antenna constellation for the in-phase and quadrature-phase transmission, in order to increase the throughput without extra energy consumption. Hence this solution struck a compelling Spectral Efficiency (SE), Energy Efficiency (EE) and Cost Efficiency (CE) trade-off. Furthermore, several researchers considered the design of Multi-Dimensional Index Modulation (MIM) relying on both the Spatial Domain (SpD) and the FD. For example, Space-Frequency Shift Keying (SFSK) [17] relies on an SFSK Dispersion Matrix (DM), which achieves beneficial transmit diversity in rapidly time-varying channels. Space-Time Shift Keying (STSK) constitutes another multi-functional MIMO technique in the family of MIM. It combines the Time Domain (TD) and the SpD and it is capable of striking a beneficial diversity versus multiplexing trade-off [18]. More specifically, in STSK, $Q$ DMs are designed for spreading the signal over $T$ Time Slots (TSs) and $M$ Transmit Antennas (TAs) in the TD and the SpD, respectively. Furthermore, the IM design activates one out of the $Q$ DMs for transmission, hence $\log_2 Q$ extra IM bits may be conveyed. By appropriately adjusting these parameters, improved Bit Error Ratio (BER), throughput and complexity trade-offs may be struck [19].
Additionally, the concept of MIM was proposed in [20], which is capable of improving the degrees of freedom, hence achieving all the benefits of the IM concept in several domains without introducing extra deployment costs, such as extra RF chains or transmission power. Furthermore, Lu et al. [21] proposed Compressed-Sensing-Aided Space-Time Frequency Index Modulation (CS-STFIM) to combine CS techniques with STSK and OFDM-IM, which is an MIM system concept that inherits the benefits of both STSK and OFDM-IM. As a further advance, SM was also integrated into this MIM scheme for TA selection in [22]. In [6], the concept of multi-functional layered SM was proposed, which offers flexible trade-offs in terms of performance, hardware cost and power dissipation.
However, in previous MIM schemes, the index selection was performed separately in each dimension. By contrast, in this paper, we extend this concept to a Joint MIM system, where we jointly design the IM in several dimensions. More specifically, the degrees of freedom of the IM mapping design are increased by harnessing multiple dimensions, which leads to a more flexible trade-off between the throughput, power efficiency, and cost. In this case, both SFSK and STSK can be considered as special cases of the proposed joint MIM (JMIM) family. JMIM may also be combined with CS techniques for increasing the spectral efficiency.
However, the joint detection of multiple dimensions leads to massive computational complexity at the receiver side.
More specifically, conventional Maximum Likelihood (ML) detection suffers from a rapidly escalating complexity upon increasing the number of dimensions [31]. Coherent detection also requires accurate knowledge of the Channel State Information (CSI) at the receiver side, which leads to a substantial pilot overhead [32] as well as to a high Channel Estimation (CE) complexity [33], [34]. In [22], CS-aided MIM (CS-MIM) was presented, where multiple detection stages were required for recovering the data from the constituent CS, STSK, OFDM-IM and SM schemes. As a result, near-capacity operation can only be achieved when Soft-Decision (SD) detection is used [35], but again, the complexity of MIM detection escalates with the number of IM dimensions.
Recently, learning-based detection has been used as an efficient tool for reducing the complexity of detection, while dispensing with the requirement of explicit CSI estimation [36]. In [37], a Deep Neural Network (DNN) based model is proposed for detecting the OFDM-IM signal, while the authors of [38] and [39] harnessed convolutional neural networks for IM detection, when the CSI is available at the input of the detector. By contrast, blind learning based detection was designed for Millimeter Wave (mmWave) IM in [28] and for multi-set STSK in [29]. However, the authors of [29] only investigated the combination of basic SD and Deep Learning (DL). In [36], both DNN-based Hard-Decision (HD) and iterative SD assisted blind detection have been proposed for CS-MIM.
Additionally, given the flexibility of our CS-aided JMIM (CS-JMIM) design, we can adapt the JMIM mapping to hostile time-varying channel environments to improve the attainable performance. Hence, the concept of adaptive modulation can be intrinsically amalgamated with CS-JMIM to improve the attainable throughput, while maintaining a specific target BER.
Yang et al. proposed machine learning aided adaptive SM [40], while Liu et al. [41] conceived learning-assisted IM for mmWave communications. In their follow-on research, they further developed the work by considering CE employing sparse Bayesian learning for accurate CSI estimation [42].
Table I boldly contrasts the novelty of this paper to the literature. More explicitly, the contributions of this paper can be further detailed as follows:
1) We propose the CS-JMIM system concept and present several JMIM mapping matrix designs. Then, we demonstrate that the proposed JMIM mapping design is capable of striking an attractive trade-off between diversity and throughput.
2) We propose a DL-based HD detection aided CS-JMIM system that can achieve near-ML performance, while imposing significantly reduced complexity. Furthermore, we propose a DNN-aided SD detector for the proposed CS-JMIM that is capable of achieving near-capacity performance.
3) Both a K-nearest neighbour (KNN) algorithm based and a DL-assisted adaptive modulation scheme are proposed for CS-JMIM. We demonstrate that the learning-assisted adaptive CS-JMIM scheme is capable of selecting a more appropriate CS-JMIM mapping design for transmission than its conventional threshold-based adaptive counterparts. Hence it can obtain a significant throughput gain over the conventional threshold-based adaptive method.
4) Our simulation results demonstrate that the proposed learning-based detector is capable of approaching the performance of the conventional coherent detection techniques at a reduced detection complexity. We also provide the associated capacity and throughput analysis for characterising the trade-off between each mapping matrix and the benefits of the learning-assisted adaptive method.
The rest of the paper is organized as follows. In Section II, the system model of CS-JMIM is presented. In Section III, we characterize both HD and SD based learning-aided detectors. Then, in Section IV we present our proposed adaptive system design. In Section V, we present our simulation results, while our conclusions are offered in Section VI.

II. SYSTEM MODEL
In this section, we introduce the transceiver model of the proposed CS-JMIM system employing $N_t$ TAs and $N_r$ Receive Antennas (RAs). Fig. 2 shows the block diagram of the CS-JMIM system considered, where $b$ bits are equally divided into $G$ groups. We consider OFDM having $N_c$ subcarriers, which are then split into $G$ groups and each group has $N_f$ subcarriers, while $N_{vt}$ TAs and $N_v$ subcarriers of each group are applied for the CS-JMIM system in the Virtual Domain (VD). To be more specific, in each subcarrier group, there are $N_v$ available subcarrier indices within the VD, where the dimension $N_v$ of the VD is larger than the dimension $N_f$ of the FD. Similarly, the number $N_{vt}$ of antennas in the VD is larger than the number $N_t$ of antennas in the SpD. In each group, the $b_g$ bits are split into two parts: one part is used for generating $K$ Phase Shift Keying/Quadrature Amplitude Modulation (PSK/QAM) symbols, while the remaining bits are mapped to the JMIM mapping matrix selector, which chooses a specific mapping matrix out of $Q$ JMIM matrices. Then, these $K$ PSK/QAM codewords and the selected JMIM mapping DM are combined to generate a Space-Time (ST) block $S$. Afterwards, the block creator of Fig. 2 collects all codewords from the $G$ groups for forming a frame, which is mapped to multiple index domains by the carrier index mapper, followed by the CS method and OFDM modulation, as shown in Fig. 2. Then, after transmission over the wireless channel, the receiver estimates the channel and detects the signal. At the receiver side, the signal is transformed back to the subcarrier symbols and each JMIM group signal is detected separately. In the following, we present the details of the processing stages at the transmitter and the receiver. We only focus our attention on a single group instead of $G$ groups, since the same procedure is applied to all groups, as shown in Fig. 2.
The transmitter model is introduced in Section II-A, followed by the receiver model in Section II-B.

In the following we explain in detail the Joint Index Mapping (JIM) part of the CS-JMIM transmitter of Fig. 2. The first part of the group's bits, namely $b_g^1$ bits, is used for selecting the active DM from the $Q$ DMs. The second part is used for determining the constellation symbol, which is employed for modulating the active DM. The classic constellation symbol is then selected from an $M$-ary PSK or QAM constellation $\chi$.

Let us denote the selected DM and the selected constellation symbol, respectively, by $D_i$, $i \in \{1, \cdots, Q\}$ and $x \in \chi$. Then the combined signal in group $g$ is obtained by modulating the selected DM $D_i$ with the symbol $x$. In the following, we introduce three designs of the DMs. Firstly, to leverage the multi-dimensionality of MIM systems, the design of IM encompasses all dimensions. Then, the activation of the corresponding indices is guided by the [...] smaller sub-group matrices, each adopting a general JIM. Furthermore, striking a trade-off between throughput and diversity involves choosing either the same or different DMs across groups. To elaborate further, applying the same DM across all groups results in multiple copies of the information bits, which produces a diversity gain. On the other hand, employing different DMs for each group improves the throughput.

This article has been accepted for publication in IEEE Open Journal of Vehicular Technology. This is the author's version which has not been fully edited and content may change prior to final publication. Citation information: DOI 10.1109/OJVT.2023.3328823. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/

For example, assuming $N_{vt} = 4$, $N_v = 4$ and $K = 2$ for each group's DM results in $D_q \in \mathbb{C}^{N_{vt} \times N_v}$. Then, we further split $D_q$ into four equal sub-matrices expressed as
$$D_q = \begin{bmatrix} D_q^1 & D_q^2 \\ D_q^3 & D_q^4 \end{bmatrix},$$
where we have $D_q^i \in \mathbb{C}^{N_{vt}/2 \times N_v/2}$, $i = 1, 2, 3, 4$. For each sub-matrix $D_q^{i,j}$, $i = 1, 2, \ldots, gsx$, $j = 1, 2, \ldots, gsy$, general JIM can be applied. Here, $gsx$ and $gsy$ represent the number of sub-groups in the FD and SpD, respectively. In the above example, we have a total of $gs = gsx \times gsy = 4$ sub-groups and $b_g^1 = \lfloor \log_2 C(4, 2) \rfloor = 2$ bits for each sub-group matrix. To maximize the throughput, four different sub-matrices can be aggregated into one DM $D_q$ to obtain 8 bits in total. Fig. 4 shows the block diagram of the grouped JIM, where we have four sub-groups, each being a smaller general JMIM matrix. For a small general JMIM matrix we can apply $Q = 4$ DMs in total, where we can assign $4 \times 2$ bits for all sub-groups.
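The bit count of the grouped JIM example can be checked numerically. The following is a minimal sketch, assuming that each sub-group independently selects `k_active` out of its available positions; the function names and the `distinct` flag are illustrative and not taken from the paper.

```python
from math import comb


def index_bits(n_positions: int, k_active: int) -> int:
    """Index bits of one (sub-)matrix: floor(log2(C(n_positions, k_active)))."""
    n_patterns = comb(n_positions, k_active)
    return n_patterns.bit_length() - 1  # exact integer floor(log2(.))


def grouped_index_bits(n_vt, n_v, gsx, gsy, k_active, distinct=True):
    """Total index bits of a grouped DM with gsx * gsy sub-groups.

    distinct=True  -> every sub-group carries its own bits (throughput).
    distinct=False -> the same sub-matrix is repeated (diversity), so only
                      one sub-group's worth of bits is conveyed.
    """
    positions = (n_vt // gsy) * (n_v // gsx)  # entries per sub-matrix
    per_subgroup = index_bits(positions, k_active)
    return per_subgroup * gsx * gsy if distinct else per_subgroup


# Example from the text: N_vt = N_v = 4, gsx = gsy = 2, K = 2.
print(index_bits(4, 2))                  # bits per 2 x 2 sub-matrix
print(grouped_index_bits(4, 4, 2, 2, 2)) # bits with four different sub-matrices
```

With the parameters of the example, each sub-matrix conveys 2 bits and four different sub-matrices convey 8 bits, in agreement with the text.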
On the other hand, if four repeated sub-matrices are used, we obtain a structure similar to that of the coded JMIM discussed below. Adjusting the index mapping of each sub-group thus offers either throughput or diversity gains. However, this leads to a substantial increase in detection complexity for conventional methods, such as the ML detector.

c) Coded Joint Index Mapping: Another way of further increasing the transmit diversity is to employ coded index mapping, where we use a circular shift based design of the DMs, which was proposed for SFSK in [17]. In this method, the number of active subcarriers in each column is $n_q$, with $N_q - n_q$ inactive subcarriers, where $N_q$ is the column length of $D_q$. Then, the second column is the circular down shift of the first column by one position. Similarly, the other columns can be obtained based on the previous column distribution.
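The circular shift construction described above can be sketched as follows. This is an illustration only: it generates the activation pattern (1 = active, 0 = inactive) rather than the actual DM entries, and placing the active entries at the top of the first column is an assumption, since the source does not specify the first column here.

```python
def circular_shift_pattern(n_rows, n_cols, n_active):
    """Activation pattern of a circular-shift based DM.

    Column 0 has n_active ones followed by zeros (assumed layout); every
    further column is the previous one circularly shifted down by one.
    Returns a list of rows.
    """
    column = [1] * n_active + [0] * (n_rows - n_active)
    columns = []
    for _ in range(n_cols):
        columns.append(column)
        column = [column[-1]] + column[:-1]  # circular down shift by one
    # Transpose the list of columns into a list of rows.
    return [[columns[c][r] for c in range(n_cols)] for r in range(n_rows)]


for row in circular_shift_pattern(4, 4, 2):
    print(row)
```

Every column of the resulting pattern contains exactly `n_active` active positions, so each transmitted column carries the same number of symbols while the active indices rotate.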
To elaborate a little further using a 'toy' example of the circular shifting based DM: given $b_g^1 = 2$ bits, $2^2 = 4$ DMs are selected for the CS-JMIM system.

We assume that $T_v$ TSs are applied in the VD and $T$ TSs are used in the TD, where we have $T_v > T$. Then, we can assign three-dimensional DMs $D_q \in \mathbb{C}^{N_v \times N_{vt} \times T_v}$. In this case, the above-mentioned three mapping techniques can be applied. Then, we can generate each TS index mapping with the aid of a single position shift.

2) Compressed Sensing and Block Assembly: In order to exploit the sparsity of the JIM DM, CS is applied to all the dimensions of the joint multi-dimensional matrix symbol created by the block assembler to increase the throughput. As shown in Fig. 7, the group's symbol matrix $S \in \mathbb{C}^{N_{vt} \times N_v}$ is transformed into the vector $s \in \mathbb{C}^{N_{vt} N_v \times 1}$.

The symbol vector $s$ is then compressed by a CS measurement matrix $A \in \mathbb{C}^{N_f N_t \times N_v N_{vt}}$ from the $N_v N_{vt}$-dimensional $s$ in the VD into the $N_f N_t$-dimensional form in the Real Domain (RD), denoted as $s^{(RD)}$, which can be written as $s^{(RD)} = A s$. Here, the RD is the joint dimension of the DM after the CS process. The RD vector $s^{(RD)}$ after CS is then transferred into a compressed joint multidimensional symbol matrix $S^{(RD)} \in \mathbb{C}^{N_t \times N_f}$. Then, the index carrier mapper maps the corresponding joint multidimensional symbol elements to the OFDM subcarriers and the TAs to form the SF symbols. Afterwards, $G$ groups of SF symbols $S$ are assembled by the OFDM creator into a long SF symbol frame, as shown in Fig. 2. The RD SF symbol can be separated into $N_t$ FD symbols, which means that $N_t$ FD symbols are transmitted by $N_t$ TAs. Similar to conventional OFDM, the FD symbols are transformed into TD symbols to be transmitted by their corresponding TAs and then a Cyclic Prefix (CP) is added. The $G$ groups of SF symbols $S$ are assembled by the block creator of Fig. 2 to form a long ST frame, which is processed by the ST mapper to output a symbol for transmission over multiple TAs and TSs. Equivalently, the ST symbols $S$ of each subcarrier group are mapped to $N_t$ TAs during $T$ TSs, which have $N_t$ symbol sequences $\{s_1, \ldots, s_{N_t}\}$ for transmission from the $N_t$ TAs during each TS.

For instance, the SF-based JMIM signal conveys more bits in the VD than in the RD. For the three-dimensional JMIM, utilizing the TSF dimensions, the TD is also compressed by CS for improving the throughput, where $T_v$ TSs are introduced in the VD for IM, complemented by $T$ TSs in the TD. Specifically, for the general JMIM scheme, the TD is introduced for increasing the sparsity and for incorporating extra embedded information bits. As shown in Fig. 8, we apply CS to the TSF JMIM, where all three dimensions are compressed for increasing the throughput. Specifically, a $(4 \times 4 \times 4)$-sized DM in the VD is compressed to a $(2 \times 2 \times 2)$-sized DM in the RD. For example, with $K = 1$, a given index-bit sequence activates the element at the fourth subcarrier, fourth TA and first TS, corresponding to the coordinate $(4, 4, 1)$.
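The vectorise, compress and reshape chain for the SF case can be sketched in a few lines. This is a minimal illustration under stated assumptions: the column-wise vectorisation order, the i.i.d. complex Gaussian measurement matrix and all function names are hypothetical choices, since the source only specifies the dimensions of $A$, $s$ and the compressed matrix.

```python
import random


def vectorize(S):
    """Stack the columns of matrix S (a list of rows) into one vector."""
    rows, cols = len(S), len(S[0])
    return [S[r][c] for c in range(cols) for r in range(rows)]


def unvectorize(v, rows, cols):
    """Inverse of vectorize: rebuild a rows x cols matrix column by column."""
    return [[v[c * rows + r] for c in range(cols)] for r in range(rows)]


def random_measurement_matrix(m, n, seed=0):
    """Hypothetical m x n complex Gaussian measurement matrix (assumption)."""
    rng = random.Random(seed)
    scale = (2 * m) ** -0.5
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) * scale
             for _ in range(n)] for _ in range(m)]


def compress(S, A, n_t, n_f):
    """Compute s_RD = A s and reshape it into an n_t x n_f matrix."""
    s = vectorize(S)
    s_rd = [sum(a * x for a, x in zip(row, s)) for row in A]
    return unvectorize(s_rd, n_t, n_f)


# VD of size 4 x 4 compressed to an RD of size 2 x 2, one active entry.
N_VT, N_V, N_T, N_F = 4, 4, 2, 2
S = [[0j] * N_V for _ in range(N_VT)]
S[3][3] = 1 + 1j
A = random_measurement_matrix(N_F * N_T, N_V * N_VT)
S_rd = compress(S, A, N_T, N_F)
print(len(S_rd), len(S_rd[0]))
```

Because the VD vector is sparse, the compressed signal is simply the scaled column of $A$ that corresponds to the active position, which is what a sparse-recovery or exhaustive-search receiver exploits.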
As for the coded JMIM scheme, additionally the TD is harnessed for further increasing the diversity gain, where CS is not considered for the TD. We assign either the same or different symbols in a sub-group matrix of the grouped JMIM scheme, which leads to a different CS approach. Given the different sub-group matrix symbols, the TD is exclusively harnessed for carrying extra copies of the symbol without CS.
The design objective of this scheme is to increase the diversity gain.

B. Receiver Processing
As shown in Fig. 9, a receiver having N r antennas is employed, where we assume that the transmitted signals are

As for the three-dimensional signal, the transmitted signal is mapped to ST symbols, which are also collected by the receiver and split into $G$ groups by the Block Splitter of Fig. 9. Afterwards, the symbols received in the three dimensions by each subcarrier group, $Y \in \mathbb{C}^{N_r \times M \times T}$, may be expressed as in (3). The received symbol of the $t$-th TS is indexed by $\alpha = 1, 2, \ldots, N_f$ and $t = 1, 2, \ldots, T$, characterizing the ST structure per group and the ST symbol received at the $\alpha$-th subcarrier of each subcarrier group, respectively. Since the index is jointly decided in the multi-dimensional space, we can transform the ST symbol into a vectorial form $y \in \mathbb{C}^{N_r N_f T \times 1}$.

Let the FD channel be $H_\alpha \in \mathbb{C}^{N_r \times T}$ for $\alpha = 1, \ldots, N_f$. Then the signal $Y_t[\alpha] \in \mathbb{C}^{N_r \times T}$ ($\alpha = 1, \ldots, N_f$) received during the $T$ TSs for each subcarrier group can be expressed as in (4) [22]. The received signal $y$ contains $N_f$ ST symbols at $N_f$ subcarriers in the FD of each subcarrier group. Then, we can rewrite $y$ with the aid of (4) in the form of (5), where $\bar{A}$ is the equivalent of the measurement matrix $A$ used for compressing the VD vectors $s$. In our three-dimensional CS-JMIM system, $\bar{A}$ also compresses the TD.

Conventional exhaustive search based ML detection can be applied at the receiver, albeit this may lead to excessive complexity [5]. Furthermore, in the soft detection scenario, the received signal is converted into probability values, which are referred to as Log Likelihood Ratios (LLRs) and are fed into the channel decoder for obtaining a near-capacity performance [43].
In the following section we present the conventional ML-based HD detector, followed by our proposed DNN aided HD detector, where the neural network replaces the exhaustive search by a learning-based classification model in order to significantly reduce the complexity. Afterwards, we discuss the SD detector, where we first present the conventional SD detectors followed by our learning-aided SD receiver.

A. Hard Decision Decoding
Again, we commence with the conventional ML-based detection of the CS-JMIM system, followed by the DNN-based detector.
1) Maximum Likelihood Detection: As shown in Fig. 9, we detect each group's signal separately. In the CS-JMIM detector, the receiver model of (5) is evaluated over all the legitimate JMIM DMs and all the possible realizations of the selected PSK/QAM symbol. The ML detector makes a joint decision concerning the JMIM DM and the PSK/QAM symbol with the aid of an exhaustive search, as formulated in (6), where $\hat{\gamma}$ and $\hat{\beta}$ represent the estimates of the selected DM and the corresponding PSK/QAM constellation symbol in each subcarrier group, respectively.

The excessively high search complexity of considering all possible candidates by the ML detector is given by $\mathcal{O}[N_{JMIM} (\mathcal{X})^K]$ per subcarrier group.
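To make the exhaustive search concrete, the sketch below performs a joint minimum Euclidean distance search over every (DM, symbol) pair for $K = 1$. It is an illustration under assumptions rather than the paper's detector: `H_eff` stands for a hypothetical equivalent matrix that already combines the channel and the measurement matrix, the DMs are represented as vectorised activation patterns, and all names are invented for this example.

```python
from itertools import product


def matvec(M, v):
    """Matrix-vector product for lists of complex numbers."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def ml_detect(y, H_eff, dms, constellation):
    """Return (dm_index, symbol_index) minimising ||y - H_eff (x * d)||^2.

    The loop visits len(dms) * len(constellation) candidates, which is the
    source of the complexity that grows with N_JMIM and the modulation order.
    """
    best, best_metric = None, float("inf")
    for (q, d), (m, x) in product(enumerate(dms), enumerate(constellation)):
        candidate = matvec(H_eff, [x * e for e in d])
        metric = sum(abs(a - b) ** 2 for a, b in zip(y, candidate))
        if metric < best_metric:
            best, best_metric = (q, m), metric
    return best


# Toy setup: 4 vectorised DMs (one active entry each) and a QPSK-like set.
DMS = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
QPSK = [1, 1j, -1, -1j]
H_EFF = [[1, 1j, 2, 1 + 1j], [0.5, -1, 1j, 2 - 1j]]  # hypothetical values

y_noiseless = matvec(H_EFF, [QPSK[3] * e for e in DMS[2]])
print(ml_detect(y_noiseless, H_EFF, DMS, QPSK))
```

In the noiseless toy case the search returns the transmitted pair of indices; with noise, the same loop returns the closest candidate.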

Detection may also be considered as a classification problem, where the corresponding bits of the harnessed CS-JMIM DM and PSK/QAM symbol constitute the DNN output. Under the assumption of perfect CSI at the receiver side, we use the received signal and the CSI as the input of the DNN model. The proposed DNN structure is shown in Fig. 10, where both the CSI $H$ at the receiver and the received symbols $Y$ constitute the inputs of the $L$-layer Fully-Connected (FC) network. Then, the output bits $\hat{u}$ can be modelled as
$$\hat{u} = f_{sigmoid}\big(W_L f_{Relu}(\cdots f_{Relu}(W_1 z + \theta_1) \cdots) + \theta_L\big), \tag{7}$$
where $z$ denotes the real-valued input vector, while $W_n$ and $\theta_n$, $n = 1, \cdots, L$ represent the weights and biases, respectively. In (7), the Rectified linear unit (Relu) function $f_{Relu}(s) = \max(0, s)$ is employed for activating the DNN during the training phase, while the sigmoid function $f_{sigmoid}(s) = \frac{1}{1 + e^{-s}}$ is used to obtain the detected bits $\hat{u}$. The raw input data represented in the complex-valued matrix form obtained from the received signal $Y$ is vectorized first and then we rearrange the complex values by separately extracting the real as well as the imaginary parts and then merging them into a real-valued vector.
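The preprocessing and the forward pass described above can be sketched with plain Python lists. This is a minimal sketch, assuming that the real parts are placed before the imaginary parts in the input vector and that the hidden layers use Relu while the output layer uses the sigmoid; the layer sizes and weights below are arbitrary placeholders, not trained values.

```python
import math


def to_real_vector(Y):
    """Vectorise a complex matrix and stack real parts, then imaginary parts
    (the ordering is an assumption)."""
    flat = [v for row in Y for v in row]
    return [v.real for v in flat] + [v.imag for v in flat]


def affine(W, theta, v):
    """Compute W v + theta for one fully-connected layer."""
    return [sum(w * a for w, a in zip(row, v)) + t for row, t in zip(W, theta)]


def relu(v):
    return [max(0.0, a) for a in v]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def dnn_forward(z, layers):
    """Forward pass: Relu on hidden layers, sigmoid on the output layer.

    layers is a list of (W, theta) pairs, one per layer.
    """
    h = z
    for W, theta in layers[:-1]:
        h = relu(affine(W, theta, h))
    W, theta = layers[-1]
    return sigmoid(affine(W, theta, h))


def hard_bits(probabilities):
    """Threshold the sigmoid outputs at 0.5 to obtain hard-decision bits."""
    return [1 if p > 0.5 else 0 for p in probabilities]


z = to_real_vector([[2 + 1j, -3 - 1j]])
layers = [([[1, 0, 0, 0], [0, 1, 0, 0]], [0, 0]),   # placeholder weights
          ([[1, -1], [-1, 1]], [0, 0])]
print(hard_bits(dnn_forward(z, layers)))
```

In a full detector the input would also contain the real-valued form of the CSI $H$, concatenated with that of $Y$.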
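The MSE loss function used for training can be written as
$$\mathcal{L}(\{W_n, \theta_n\}) = \frac{1}{B} \sum_{b=1}^{B} \| u_b - \hat{u}_b \|^2,$$
where $B$ is the sample size of the current iteration. A stopping criterion can be defined either by the number of iterations or by an MSE threshold. Then, the parameter sets $\{W_n, \theta_n\}$ can be updated in each training iteration based on our learning algorithm using gradient descent, which is formulated as
$$\{W_n, \theta_n\} \leftarrow \{W_n, \theta_n\} - \alpha \nabla \mathcal{L}(\{W_n, \theta_n\}),$$
where $\alpha > 0$ is the learning rate and $\nabla \mathcal{L}(\{W_n, \theta_n\})$ represents the gradient of $\mathcal{L}(\{W_n, \theta_n\})$. In our proposed network aided detection, we use $\alpha = 0.001$.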
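The batch MSE and the gradient descent update can be written compactly as below. This sketch assumes the usual batch-averaged squared error and a flat list of parameters; computing the gradient itself (by backpropagation) is outside the scope of the example, and the function names are illustrative.

```python
def mse_loss(targets, outputs):
    """Batch MSE: squared error summed per sample, averaged over B samples."""
    batch_size = len(targets)
    total = sum(sum((t - o) ** 2 for t, o in zip(tv, ov))
                for tv, ov in zip(targets, outputs))
    return total / batch_size


def gradient_step(params, grads, alpha=0.001):
    """One gradient descent update: params <- params - alpha * grads."""
    return [p - alpha * g for p, g in zip(params, grads)]


print(mse_loss([[1, 0]], [[0.5, 0.5]]))
print(gradient_step([1.0, -2.0], [10.0, -10.0]))
```

The default `alpha=0.001` mirrors the learning rate quoted in the text; training stops once the iteration budget is exhausted or the MSE falls below the chosen threshold.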
By the end of the training phase, the DNN has learnt the mapping from the received signal and stores both the weight as well as the bias information, which will be used for producing the desired outputs based on the input data in the testing phase. The statistical properties of the input/output data have to remain the same as those used during training.
The detection complexity of the learning algorithm is dominated by the calculation of the layer weights and biases, whose order is determined by the number of layers $L$ and by $n$, the number of neurons in each layer. Hence, the DNN complexity order is significantly lower than that of the ML detector.

B. Soft Decision Decoding
SD detection is employed for attaining near-capacity performance, when combined with channel coding. As the computational complexity of the maximum a posteriori probability calculation in the SD detector rapidly increases upon increasing the modulation order and the number of dimensions [44], the complexity of CS-JMIM rapidly becomes prohibitive, owing to the joint detection of the JMIM signal in multiple dimensions. In the following, we present the conventional SD detector of CS-JMIM, followed by the corresponding learning aided SD detector.

1) Conventional Soft Decision Detection:
A channel coded CS-JMIM scheme is shown in Fig. 11, which was derived from the CS-MIM model of [22], [36] for achieving near-capacity performance. A Recursive Systematic Convolutional (RSC) encoder encodes the information bit sequence $b$ followed by an interleaver, where the coded bit sequence $c$ is interleaved to generate the stream $u$ of Fig. 11. Then, the stream $u$ is modulated in the CS-JMIM modulator of Fig. 2.
At the receiver side of Fig. 11, the received signal $Y$ and the CSI $H$ are input to the soft CS-JMIM demodulator that outputs LLRs. The LLRs output from the demodulator are then passed to the de-interleaver and the RSC decoder performs soft decoding.

The LLR of a bit is defined as the logarithm of the ratio of the probabilities associated with the logical bits '1' and '0', which can be written as $L(b) = \log \frac{p(b=1)}{p(b=0)}$. The conditional probability $p(Y | X_{\gamma,\beta})$ of receiving the group signal $Y$ is given by (9) [45], where $X_{\gamma,\beta}$ represents the PSK/QAM symbol at the $\beta$-th CS-JMIM DM. Furthermore, $N_0$ is the noise power, where we have $\sigma_n^2 = N_0/2$ with $N_0/2$ representing the double-sided noise power spectral density.

Hence, we can formulate the LLR of bit $u_l$ as in (10), where $X_1^l$ and $X_0^l$ represent the subsets of the legitimate equivalent signal set $X$ corresponding to bit $u_l$, when $u_l = 1$ and $u_l = 0$, respectively, yielding $X_1^l \equiv \{X_{\gamma,\beta} \in X : u_l = 1\}$ and $X_0^l \equiv \{X_{\gamma,\beta} \in X : u_l = 0\}$.

Upon using (9) and (10) we obtain the LLR $L(b_i)$ of the bit sequence conveyed by the received signal $Y$. To simplify the calculation, the Approximate Log-MAP (Approx-Log-MAP) algorithm based on the Jacobian Maximum operation can be used, which is given in [46], [47], where $jac(\cdot)$ denotes the Jacobian maximum operation and $\lambda_{\gamma,\beta}$ is the intrinsic metric. At the receiver, the soft demodulator evaluates the probability of each bit being logical '1' and '0'. Then it applies the Approx-Log-MAP algorithm for obtaining the extrinsic LLR of the coded bits, which has a complexity order of $\mathcal{O}[2(c_g)(N_{JMIM} (\mathcal{X})^K)]$, where $c_g$ represents the number of coded bits after the RSC encoder and interleaver, and $N_{JMIM}$ represents the number of possible realizations of JMIM.
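The role of the Jacobian maximum operation in the LLR calculation can be illustrated as follows. This is a sketch under assumptions: the correction term is computed exactly rather than by the look-up table usually employed in Approx-Log-MAP, no a priori LLRs are included (equiprobable bits), and `metrics` is a hypothetical list of per-candidate intrinsic metrics such as a scaled negative Euclidean distance.

```python
import math


def jac(a, b):
    """Jacobian logarithm: log(exp(a) + exp(b)) = max(a, b) + correction."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def jac_all(values):
    """Apply jac(.) recursively over a non-empty list of metrics."""
    acc = values[0]
    for v in values[1:]:
        acc = jac(acc, v)
    return acc


def bit_llrs(metrics, bit_labels):
    """LLR of every bit from per-candidate metrics.

    metrics[c]    : intrinsic metric of candidate c (log-domain).
    bit_labels[c] : tuple of bits mapped to candidate c.
    For each bit position, candidates are split into the subsets with that
    bit equal to 1 and 0, and the LLR is the difference of their jac sums.
    """
    llrs = []
    for i in range(len(bit_labels[0])):
        ones = [m for m, bits in zip(metrics, bit_labels) if bits[i] == 1]
        zeros = [m for m, bits in zip(metrics, bit_labels) if bits[i] == 0]
        llrs.append(jac_all(ones) - jac_all(zeros))
    return llrs


labels = [(0, 0), (0, 1), (1, 0), (1, 1)]
metrics = [math.log(p) for p in (0.1, 0.2, 0.3, 0.4)]
print(bit_llrs(metrics, labels))
```

Replacing `math.log1p(...)` by a small table of correction values, or dropping it entirely (Max-Log-MAP), reduces the complexity at the cost of a less accurate LLR.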
The corresponding loss function $\mathcal{L}(\{W_n, \theta_n\})$ is minimized during training. We can also define a stopping criterion, which can be either the number of iterations or meeting a maximum MSE threshold. Then, the parameter sets $\{W_n, \theta_n\}$ can be updated in each training iteration based on the learning algorithm using gradient descent, which is formulated as
$$\{W_n, \theta_n\} \leftarrow \{W_n, \theta_n\} - \alpha \nabla \mathcal{L}(\{W_n, \theta_n\}),$$
where $\alpha > 0$ is the learning rate and $\nabla \mathcal{L}(\{W_n, \theta_n\})$ represents the gradient of $\mathcal{L}(\{W_n, \theta_n\})$.

Fig. 13: BER vs. SNR performance of the CS-JMIM system for different mapping modes shown in Table III.

1) Conventional Threshold-based Adaptive Design: In our adaptive scheme, we can adapt both the configuration of the JMIM DM and of the PSK/QAM mode. We can define the different configurations as Mode 1, Mode 2, Mode 3, ..., which can attain different BER performance and throughput. Based on the different modes, the parameters $N_v$, $N_t$, $T$ and $A$ of the JMIM DM can be selected according to the SNR calculated at the receiver, where the SNR threshold values are selected for the different modes to satisfy a specific target BER [41], [42].
In the following, we present the scenario, where the different adaptive modes $P$ refer to different configurations of the JMIM DM for characterising its design flexibility.
As an example, Fig. 13 shows the BER performance of three different CS-JMIM mapping modes. The corresponding parameters and data rates provided by these modes are shown in Table III. For a target BER of $10^{-3}$, as shown in Fig. 13, the SNR values of the mode transition points $P_1$ and $P_2$ can be selected as the thresholds for operating the appropriate modes. Specifically, Mode 1 is applied at low SNR values until the SNR reaches $P_1$. Then, the mode is changed to Mode 2 to provide a higher throughput, when the SNR spans from $P_1$ to $P_2$. Finally, Mode 3 is selected at SNRs higher than $P_2$, which has the highest throughput among the three modes.
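The threshold rule amounts to locating the instantaneous SNR among the sorted transition points. A minimal sketch follows; the threshold values of 8 dB and 14 dB are hypothetical placeholders for $P_1$ and $P_2$, which in practice would be read off the BER curves at the target BER.

```python
from bisect import bisect_right


def select_mode(snr_db, thresholds_db):
    """Return the 1-based mode index for a given SNR.

    thresholds_db must be sorted in ascending order: [P1, P2, ...].
    SNRs below P1 map to Mode 1, SNRs in [P1, P2) to Mode 2, and so on.
    """
    return bisect_right(thresholds_db, snr_db) + 1


THRESHOLDS_DB = [8.0, 14.0]  # hypothetical P1 and P2

for snr in (5.0, 8.0, 10.0, 20.0):
    print(snr, select_mode(snr, THRESHOLDS_DB))
```

The receiver would evaluate this rule on each SNR estimate and feed the resulting mode index back to the transmitter for the next frame.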
For adaptive modulation, the receiver has to confidently infer the choice of the most appropriate transmission mode by comparing the instantaneous SNR of the received symbol against the mode-switching threshold values. Then, the decision is fed back to the transmitter and applied for the next frame to be transmitted. Generally, with more available operation modes as well as faster and more accurate SNR feedback to the transmitter, we can obtain an increased throughput compared to non-adaptive designs. However, threshold-based adaptive modulation design ignores many of the hardware imperfections when deciding upon the threshold values, which results in sub-optimal performance of the adaptive system [41], [42]. Hence, in the next subsection, we propose the learning-based adaptive modulation scheme for our CS-JMIM system to further improve the adaptive system's performance.
Here, $\xi$ represents the SNR value of a symbol with a BER lower than the target BER value, $i = 1, 2, \cdots, I$ is the adaptive mode index and $N_p$ is the total number of instantaneous SNR values with a BER under the target. The training sets of all modes are then collected into the total training set $\mathcal{T}$. During runtime, for a given new data point, which corresponds to the instantaneous SNR $\xi$, the KNN model finds the $k$ nearest neighbours in the training set $\mathcal{T}$ using a distance metric $d(\cdot)$. Then, the mode is decided by the majority mode of the $k$ nearest neighbours of the input test point. If several modes are tied among the $k$ nearest neighbours, the mode with the highest throughput is selected.
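The KNN decision rule, including the tie-break in favour of the highest throughput, can be sketched as below. The absolute SNR difference is assumed as the distance metric, and the training pairs and throughput values are made-up placeholders rather than data from the paper.

```python
from collections import Counter


def knn_select_mode(snr, training_set, k, throughput):
    """Pick a mode by majority vote among the k nearest training points.

    training_set : list of (snr_value, mode_index) pairs.
    throughput   : dict mapping mode_index to its throughput, used to break
                   ties in favour of the highest-throughput mode.
    Distance metric (assumption): absolute difference of the SNR values.
    """
    nearest = sorted(training_set, key=lambda pair: abs(pair[0] - snr))[:k]
    votes = Counter(mode for _, mode in nearest)
    top = max(votes.values())
    tied = [mode for mode, count in votes.items() if count == top]
    return max(tied, key=lambda mode: throughput[mode])


# Hypothetical training pairs (SNR in dB, mode) and throughputs (bits).
TRAINING = [(2, 1), (3, 1), (9, 2), (10, 2), (11, 2), (16, 3), (17, 3)]
THROUGHPUT = {1: 1.0, 2: 2.0, 3: 3.0}

print(knn_select_mode(10, TRAINING, 3, THROUGHPUT))
print(knn_select_mode(6, TRAINING, 2, THROUGHPUT))
```

The sort over the whole training set makes the per-decision cost and the storage requirement explicit, both of which grow with the number of stored training points.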

The performance of KNN significantly depends on its parameters and on the value of $k$, where the best value of $k$ can be selected empirically. In this adaptive system, the best value of $k$ is determined by considering the trade-off between the BER and throughput. Furthermore, KNN results in a high computational complexity for the nearest neighbour search in addition to requiring a large memory for storing the training set. Hence, in the following we present a DNN based design alternative.

Similarly to KNN, we randomly generate the training data and then store the mode index and SNR value pairs, which have BERs lower than the target value. Then, the training set $\mathcal{T}$ consists of the estimated SNR $\xi$ of a symbol associated with a BER lower than the target BER. We use the DNN-based classification model, where the input corresponds to the instantaneous SNR and the output corresponds to the mode index of adaptive modulation.

The output mode index î of the DNN can be expressed as

where $W_n$ and $\theta_n$, $n = 1, \cdots, L$, represent the weights and biases, respectively. ReLU is also employed for activating the DNN during the training phase, and the softmax function is used to obtain the mode index î, which is

The number of training samples required is selected based on experimentation by gradually increasing the training size until acceptable MSE values are achieved. In this case, the MSE loss function of the DNN used for the training is

where B is the sample size of the current iteration.
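The classifier described above (SNR in, ReLU hidden layers, softmax output, mode index out, MSE loss over a batch of B samples) can be sketched in plain Python as follows. This is a sketch under stated assumptions: the network size, the hand-set weights in `EXAMPLE_LAYERS`, the one-hot target encoding and the normalisation of the loss by B are illustrative choices and not the configuration used in the paper.

```python
import math


def relu(v):
    return [max(0.0, x) for x in v]


def softmax(v):
    m = max(v)  # subtract the maximum for numerical stability
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]


def affine(W, theta, v):
    """Compute W v + theta for a weight matrix W and bias vector theta."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(W, theta)]


def forward(snr, layers):
    """Return the softmax probabilities over the modes for a scalar SNR."""
    a = [snr]
    for W, theta in layers[:-1]:
        a = relu(affine(W, theta, a))
    W, theta = layers[-1]
    return softmax(affine(W, theta, a))


def predict_mode(snr, layers):
    p = forward(snr, layers)
    return p.index(max(p)) + 1  # modes are indexed 1..I


def mse_loss(batch, layers, num_modes):
    """Squared error against one-hot targets, averaged over the B samples."""
    total = 0.0
    for snr, mode in batch:
        p = forward(snr, layers)
        target = [1.0 if i == mode - 1 else 0.0 for i in range(num_modes)]
        total += sum((t - q) ** 2 for t, q in zip(target, p))
    return total / len(batch)


# Hypothetical hand-set parameters (W_n, theta_n): mode 2 wins above about 5 dB.
EXAMPLE_LAYERS = [
    ([[1.0], [-1.0]], [0.0, 0.0]),
    ([[-1.0, 0.0], [1.0, 0.0]], [5.0, -5.0]),
]

if __name__ == "__main__":
    print(predict_mode(2.0, EXAMPLE_LAYERS), predict_mode(10.0, EXAMPLE_LAYERS))
    print(mse_loss([(2.0, 1), (10.0, 2)], EXAMPLE_LAYERS, num_modes=2))
```

In a trained system the parameters would be learned from the stored SNR and mode-index pairs instead of being set by hand.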
A stopping criterion can be defined either by the number of iterations or by the maximum tolerable MSE threshold. Then, the parameter sets $\{W_n, \theta_n\}$ can be updated in each training iteration based on our learning algorithm using gradient descent, which is formulated as

where $\alpha > 0$ is the learning rate and $\nabla L(\{W_n, \theta_n\})$

subcarriers per group, while considering 8 subcarriers per group in the VD and K = 1, 2 activated subcarriers.
2) Scheme 2: applies maximum likelihood hard decision detection for the CS-JMIM system in the SF domain along with 2 TAs, 2 RAs, and 2 subcarriers per group in the RD, while considering 4 antennas and 4 subcarriers per group in the VD. In this scheme, we consider the following mappings:
b) Grouped JMIM with $g_s = 4$ sub-groups, where each sub-group applies general JMIM in conjunction with K = 1. (In this case, both the FD and the SpD are split into two sub-groups, giving $g_{sx} = g_{sy} = 2$.)
3) Scheme 3: applies ML HD detection for the CS-GFIM-SM, which activates one antenna out of 4 TAs, 4 RAs, and 4 subcarriers per group, while considering 16 subcarriers per group in the VD and K = 1, 2, 3 activated subcarriers.
4) Scheme 4: applies maximum likelihood hard decision detection for the CS-JMIM system in the SF domain along with 4 TAs, 4 RAs, and 4 subcarriers per group in the RD, with 8 antennas and 8 subcarriers per group in the VD. In this scheme, we consider the following mappings:
a) General JMIM with K = 1, 2, 3.
c) Coded JMIM with $n_q = 4$.
5) Scheme 5: applies ML HD detection for the CS-MIM system in the TSF domain with
b) Grouped JMIM with $g_s = 8$, $g_{sx} = g_{sy} = g_{sz} = 2$ sub-groups, where each sub-group applies general JMIM along with K = 1. (In this case, we further split the TD into two parts, which have $g_{sz} = 2$.)
c) Coded JMIM with $n_q = 2$.
7) Scheme 7: applies DNN-based HD detection for the CS-JMIM system. Here, we consider 2 TAs, 2 RAs, 2 subcarriers per group, and 2 TSs in the RD, while using 4 antennas, 4 subcarriers per group and 4 TSs in the VD.
which applied the general JMIM method of Section II-A1a).
In this case, based on the transmission rate calculation formula $\frac{bG}{N_c + L_{CP}}$, we have the transmission rate of the CS-GFIM-IM associated with K = 1 in Scheme 1 as $R_t^{k=1} = 2.667$ bits/s/Hz. This is the same as that of the CS-JMIM associated with K = 1 in Scheme 2a) under an identical hardware configuration. However, the performance of Scheme 2a) is almost 10 dB worse than that of Scheme 1 at a BER of $10^{-5}$. Hence CS-JMIM is unattractive in this situation. For more activated index entities of both CS-JMIM and CS-GFIM-IM, the throughput of Scheme 1 is increased to $R_t^{1,k=2} = 4$ bits/s/Hz and Scheme 2a) has $R_t^{2,k=2} = 4.444$ bits/s/Hz. In this case, Scheme 2a) with K = 2 has a 3.6 dB better performance than Scheme 1 with K = 2 at a BER of $10^{-5}$.

Fig. 16 shows the performance of the proposed CS-JMIM Scheme 2 for different JMIM methods. Observe that for a small index space of $N_t = N_f = 2$, the detector cannot beneficially exploit the sparsity. The transmission rate of Scheme 2a) is either $R_t^{k=1} = 2.667$ bits/s/Hz or $R_t^{k=2} = 4.444$ bits/s/Hz, and we have $R_t^{b} = 7.111$ bits/s/Hz and $R_t^{c} = 1.778$ bits/s/Hz for Schemes 2b) and 2c), respectively. As shown in Fig. 16, Scheme 2a) associated with K = 1, 2 has a similar BER performance, while Scheme 2a) with K = 2 has a higher throughput. Additionally, Scheme 2b) has almost 4 times the transmission rate of Scheme 2c), but the latter has an increased diversity gain. Hence the BER performance of Scheme 2c) is 12 dB better than that of Scheme 2b).
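The rate formula quoted above can be checked numerically with a small helper. The sketch below reads b as the number of bits per subcarrier group, and it assumes $N_c = 128$ subcarriers, a cyclic prefix of $L_{CP} = 16$ samples and G = 64 groups (that is, $N_f = 2$). These values are illustrative assumptions chosen because they reproduce the quoted rates; they are not taken from the paper's simulation parameters.

```python
def transmission_rate(bits_per_group, num_groups, num_subcarriers, cp_length):
    """Rate in bits/s/Hz following b * G / (N_c + L_CP)."""
    return bits_per_group * num_groups / (num_subcarriers + cp_length)


if __name__ == "__main__":
    # Assumed configuration: N_c = 128, L_CP = 16, G = 128 / 2 = 64 groups.
    for bits in (6, 10):
        rate = transmission_rate(bits, 64, 128, 16)
        print(bits, "bits per group ->", round(rate, 3), "bits/s/Hz")
```

Under these assumptions, 6 and 10 bits per group correspond to the 2.667 and 4.444 bits/s/Hz figures quoted for K = 1 and K = 2, respectively.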
To further exploit the sparsity of CS-JMIM, we also consider larger SF dimensions applied to the JMIM method, as shown in Fig. 17. We assume that both schemes have the same number of TAs and subcarriers per group along with an adjustable number of VD subcarriers. For $N_t = 4$, $N_f = 4$, the CS-JMIM of Scheme 4a) achieves better performance than the separate MIM in Scheme 3 with the same K value. Specifically, both schemes have $R_t^{k=1} = 1.777$ bits/s/Hz and Scheme 3 associated with K = 1 obtains a 5 dB SNR gain over Scheme 4a) with K = 1 at a BER of $10^{-5}$. When relying on a higher K, CS-JMIM is capable of providing higher throughput as well as improved detection performance. With K = 2, 3, the throughput of Scheme 3 is $R^{k=2}$

$R_t = 3.555$ bits/s/Hz. Scheme 6c) achieves a BER of $10^{-5}$ at an SNR of 1.1 dB. When higher dimensions are introduced, both the general JMIM and grouped JMIM can provide a high throughput as well as a good BER performance, albeit at the cost of a huge detection complexity. In Fig. 20, Scheme 6b) represents the grouped JMIM associated with 8 sub-groups. When K = 1 and the general JMIM DM is applied, we have $R_t = 17.778$ bits/s/Hz. This scheme attains a BER of $10^{-5}$ at an SNR of 15.1 dB. Scheme 6a) with K = 3 has $R_t = 9.333$ bits/s/Hz and achieves a BER of $10^{-5}$ at an SNR of 11 dB. Hence, for higher dimensions, the grouped JMIM outperforms the other two JMIM methods.

However, the complexity of grouped JMIM is exponentially increasing. Specifically, the detection complexity order of the grouped JMIM can be expressed as $O[(N_{JMIM}(X^K))^{N_{sub}}]$ for the TSF domain CS-JMIM system. This can be simplified to $O[((N_v N_{vt} T_v / g_s)(M^K))^{N_{sub}}]$, where $N_{sub}$ represents the number of sub-groups. On the other hand, the detection complexity order of the general JMIM is $O[N_v N_{vt} K T_v M^K]$. Furthermore, the coded JMIM complexity order can be $O[(N_q - n_q) n_q M]$. Then we can formulate the computational complexity order of ML for Scheme 7a) as
For Scheme 7b), the sub-groups must be considered in each round, which have a complexity of

For Scheme 7c), we have a reduced complexity order of $O_{ML}[N_r N_f N_t T N_{vt} N_v T_v M N_f N_t T M (N_q - n_q) n_q M]$ due to having multiple bit copies. Then we can calculate the computational complexity based on Table IV, as shown in Table V.
Upon increasing the throughput, an excessive detection complexity is imposed by conventional ML detection. To reduce the detection complexity, we have to accept a performance vs. complexity trade-off. In this context, we compare our DNN-based detector of the TSF-based CS-JMIM system to conventional maximum likelihood detection by comparing Scheme 6 and Scheme 7 in Fig. 21. Observe that the DNN-assisted HD detector achieves a similar performance to the ML detector. Furthermore, the complexity of the NN is determined by that

Fig. 1: Milestones of the index modulation family from single dimensional index modulation to MIM.

As shown in Fig. 2, b bits are split into G groups, where the $b_g$ bits ($g = 1, 2, 3, \ldots, G$) of each group are split into two parts by the block splitter: $b_g^1$ bits are used for JMIM mapping matrix selection and $b_g^2$

1) Joint Index Mapping: As shown in Fig. 2, the $N_c$ subcarriers of the OFDM symbol are divided into G groups of size $N_f$, with $N_f = N_c/G$. For each $b_g$ group of bits, the first part $b_g^1$

Fig. 5 shows a block diagram of the coded JIM, where we can apply the first JIM DM for b1 = [0 0] based on the code book used. In this scenario, coded JIM offers the maximum diversity in the design of coded DMs, enabling reliable detection even in highly noisy environments.

Fig. 6(b) shows the structure of the grouped JIM applied in three dimensions. Similar to the SF matrix, the TSF matrix can be split into several equal sub-groups. As shown in Fig. 6(b), we assume $N_{vt} = N_v = T_v = 4$ and K = 1 for each group's DM, which results in $D_q \in \mathbb{C}^{N_v \times N_{vt} \times T_v}$. Then, we further split $D_q$ into 8 equal sub-matrices. Each sub-group DM can

Fig. 6: Illustration of the structure for JIM DM in time-space-frequency domain

Fig. 7: Illustration of the process for compressing the JMIM DM in the SF domain with $N_{vt} = N_v = 4$, K = 1. Note that this example applies the general JIM with $b_g^1 = [0100]$.

Fig. 8: Illustration of the process for compressing the JMIM DM in the TSF domain with $T = N_t = N_v = 4$, K = 1. Additionally, the example presented applies the general JIM for $b_g^1 = [000100]$.
conveyed over a frequency-selective Rayleigh fading channel and the CSI is perfectly acquired at the receiver side. The G groups of signals are received by the receiver over $N_r$ antennas and then the CP part of the received signals is removed. Finally, the processed signal is transformed into the FD by using the Fast Fourier Transform (FFT), as shown in Fig. 9. The channel model can be expressed as $h_\alpha \in \mathbb{C}^{N_r \times N_t}$, which represents the TD CSI between the $N_t$ TAs and the $N_r$ RAs. Then, the FD channel matrix can be expressed as $H_\alpha \in \mathbb{C}^{N_r \times N_t}$ for $\alpha = 1, \ldots, M$, which are then

Fig. 9: CS-JMIM system receiver block diagram

This article has been accepted for publication in IEEE Open Journal of Vehicular Technology. This is the author's version which has not been fully edited and content may change prior to final publication. Citation information: DOI 10.1109/OJVT.2023.3328823. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/

where $S_{RD}[\alpha] \in \mathbb{C}^{N_r \times T}$ denotes the ST symbols at the subcarrier α transmitted from the $N_t$ TAs in the RD. Furthermore, $W[\alpha] \in \mathbb{C}^{N_r \times T}$ represents the Additive White Gaussian Noise (AWGN) obeying the distribution of $\mathcal{CN}(0, \sigma_N^2)$, and $\sigma_N^2$ is the noise variance.

III. CS-JMIM SIGNAL DETECTION

Given the received signal model Y in (4), the receiver detects the information bits of the JMIM mapping matrix, which jointly conveys the index of the active subcarrier, the active TA and TS in the VD. Firstly, we reshape the received signal into a vectorial form y associated with $y \in \mathbb{C}^{N_r N_f T \times 1}$.
NvNvtTv×1 denotes the vector of DM combined with the PSK/QAM symbol. In this case, we could rewrite s in a matrix S associated with S = x D, where $D \in \mathbb{C}^{N_v \times N_{vt} \times T_v}$ denotes the realization of the JMIM DM in each subcarrier group.
), we have the modified joint JMIM and PSK/QAM symbol, which can be expressed as S = x D. Here D represents a specific realization of the selected JMIM DM and x represents K STSK PSK/QAM symbols. To detect the specific realization, we use $D(\beta)$ ($\beta = 1, 2, \ldots, N_{JMIM}$) to denote all the possible realizations of the JMIM DM. Furthermore, as there are

2) DNN-based Detection: To reduce the complexity of the ML detector, learning-based detection is considered in this section, where a DNN-based model is proposed for detecting the received CS-JMIM signal.

In the training phase, we employ randomly generated received signals, which are transmitted over a frequency-selective Rayleigh fading channel for CS-JMIM. Afterwards, both the CSI and the received symbols are employed as the input data of the DNN. The number of training samples required is selected based on experimentation by gradually increasing the training size until acceptable mean square error (MSE) values are achieved. In this case, the MSE loss function of the DNN used for the training is

In Fig. 11, $L(\cdot)$ represents the LLRs of the bit sequences, where $L_e(u)$ is the output extrinsic LLR after soft demodulation and $L_a(c)$ is the de-interleaved LLR sequence of $L_e(u)$.

2) DNN-based SD Detection: In this section, we propose a reduced-complexity SD detector using DNN, which considers a similar DNN architecture to that of [29]. Since the conventional SD detector obtains the LLRs of the received signal after the CS-MIM soft demodulator, we replace the detected

2) Learning aided adaptive modulation: The adaptive modulation can be modelled as a classification problem, which can be solved using learning-based methods. The SNR of the received signal, which is evaluated at the receiver side, can be fed back to the transmitter. Given the SNR information, which also corresponds to the current channel state information, the transmitter can select a specific mode from a range of candidates to achieve the highest throughput, while still maintaining the target BER. Therefore, for a given channel condition, adaptive modulation selects the most suitable mode to achieve the highest throughput, under the constraint of achieving the target BER. In this paper, both the KNN and DNN techniques are investigated in the context of adaptive modulation.

Before the training phase, the input data should be preprocessed to improve the learning efficiency. First, we randomly generate the training data of each mode under different instantaneous SNR values at the receiver. Then, the corresponding switching SNRs that can maintain a BER lower than the target BER are stored. Given these training data, we can use learning models to find the mode switching thresholds in the training phase. After training, the trained model becomes capable of predicting the next mode, given the knowledge of the SNR. In the following, we first employ KNN for our adaptive modulation scheme and then we propose a DNN-based adaptive model for further improving the performance.

a) KNN-based Adaptive Design: KNN is a popular classification technique relying on a low-complexity implementation and yet providing a good performance [48]. Yang et al. [40] developed KNN-assisted adaptive modulation schemes for SM, while Liu et al. [41] further developed DNN-aided adaptive modulation for millimeter wave communication. To elaborate briefly on the KNN process, we define the training sets as
b) DNN-aided Adaptive Design: In this section, we present the DNN-based adaptive modulation regime of Fig. 14.


and 4 TSs in the VD. In this scheme, we consider the following mappings:
a) General JMIM with K = 1, 2.

t = 2.667 bits/s/Hz and $R_t^{k=3} = 3.333$ bits/s/Hz, respectively, while Scheme 4a) could achieve $R_t^{k=2} = 3.111$ bits/s/Hz and $R_t^{k=3} = 4.667$ bits/s/Hz.

Fig. 22 and Scheme 6 of Fig. 21, the detection performance is 1 dB better for Scheme 8c) than for Scheme 6c) at the BER of $10^{-5}$. Furthermore, Scheme 8a) requires an SNR of 6.2 dB at BER = $10^{-5}$, while Scheme 6a) necessitates SNR = 1.6 dB. Scheme 8b) has the best performance, outperforming Scheme 6b) by about 8 dB at a BER of $10^{-5}$. Fig. 22 also shows the performance of DNN-based detection for TSF CS-JMIM, where Scheme 9a) and Scheme 9c) exhibit similar performance. Quantitatively, they require about 4 and 3.2 dB at a BER of $10^{-5}$. Scheme 9b) requires 3 dB higher SNR than the conventional SD detector, but it is still about 6 dB better than Scheme 7b). The proposed learning method has a complexity order of $O[O(n_i n_l) + O(n_l^2) + O(n_l n_o)]$ compared to $O[2^{c_g}(T_v N_t N_{vt}(QX)^K)]$ for the conventional scheme, where $c_g$ denotes the RSC-coded number of bits in a transmitted symbol.

Finally, we present the performance of Scheme 10 in

Fig. 23. For the sake of a fair comparison, we use the

TABLE I: Contrasting our contributions to the literature

TABLE III: Configuration of the modes presented in Fig. 13

TABLE IV: CS-MIM system simulation parameters.
6) Scheme 6: applies maximum likelihood hard decision

TABLE V: Simulation results and complexity analysis of each scheme.

TABLE VII
9) Scheme 9: applies DNN-based SD detection for the CS-

TABLE VIII: Configuration of the modes used in conventional adaptation with TSF-domain CS-JMIM

Table VIII presents the