A Novel User Grouping in Phase Rotation based Downlink NOMA

Non-orthogonal multiple access (NOMA) is a promising candidate for beyond-5G and 6G systems that improves spectral efficiency by letting multiple users share the same resource block. However, as the number of users in a single NOMA cluster increases, not only the spectral efficiency but also the interference increases. Therefore, to ensure a trade-off between capacity and error performance in NOMA systems, this study introduces a novel downlink NOMA scheme using phase rotation. In the system, four users form a single NOMA cluster and are grouped into two subgroups by exploiting conventional NOMA. Within each subgroup, the symbols of two users, a near user and a far user, are rotated and multiplexed in the power domain. Subsequently, the final transmitted signal is obtained by multiplexing the in-phase component of the first subgroup and the quadrature component of the second subgroup. By superposing signals from two independent subgroups, the number of successive interference cancellation (SIC) operations is reduced compared to the existing NOMA. Moreover, the analytical bit error rate (BER) for a four-user scenario is derived, and the optimal rotation angle and power allocation are proposed to minimize the BER. Numerical results demonstrate the improved performance of the proposed NOMA scheme over several existing schemes, such as conventional NOMA and orthogonal multiple access (OMA).


I. INTRODUCTION
FOLLOWING the trend of a new generation of mobile communication emerging every 10 years, the rollout of 6G will begin in the 2030s. Three generic services of 5G, namely enhanced mobile broadband (eMBB), massive machine-type communications (mMTC), and ultra-reliable and low-latency communications (URLLC), have been widely adopted and are being optimized as further requirements for 6G [1]. In particular, mMTC supports a large number of low-power and low-complexity devices while attaining high spectral efficiency. The mMTC refers to a typical Internet of Things (IoT) scenario, in which a large number of sensors are deployed and report sporadically to an application server in the cloud [2]. Ericsson predicts more than 3.5 million cellular IoT devices by 2023, which will be widely used in industries and societies, such as utilities, smart cities, smart buildings, transport and logistics, agriculture, and environments [3].
Over the past few decades, radio access technologies for cellular communications have relied mostly on multiple access schemes. Multiple access techniques can broadly be categorized into two different approaches: orthogonal multiple access (OMA) and non-orthogonal multiple access (NOMA). In OMA, each user exploits orthogonal communication resources in the time, frequency, or code domain. In particular, fourth generation (4G) communications utilize the multiple access technique based on orthogonal frequency division multiplexing (OFDM), which is not sufficient to support massive connectivity with the diverse quality of service requirements of 5G communication. In other words, the OMA schemes cause a bottleneck when mMTC devices access the network simultaneously [4].
In contrast to conventional OMA techniques, NOMA can support a higher number of users through non-orthogonal resource allocation [5]-[7]. Generally, NOMA can be categorized into two types: power-domain NOMA (PD-NOMA) and code-domain NOMA (CD-NOMA). This paper focuses on PD-NOMA (hereinafter referred to as NOMA). At the transmitter in the NOMA network, the transmitted signals are multiplexed on the same time/frequency resource with different power levels using superposed coding (SC). The power allocated to users depends on the channel gain or distance. A strong user, that is, a user with a good channel, is near the base station (BS), whereas a weak user, that is, a user with a poor channel, is far from the BS. Thus, higher and lower powers are allocated to the weak and strong users, respectively. The distinct power allocation enables each receiver to decode its own signal. At the receiver, successive interference cancellation (SIC) is performed to first subtract the stronger signal from the superposed signal and then decode the user's own data. However, as the number of users in a single cluster increases, the power difference among users becomes too small because the given total transmission power is shared. In other words, NOMA users with similar power allocations suffer from a degraded error probability owing to inter-user interference (IUI) [8]. Moreover, cancelling the signals of many users increases the receiver complexity.
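The SC and SIC steps described above can be illustrated with a minimal, noise-free sketch for one near user and one far user. The BPSK alphabet, the 0.2/0.8 power split, the unit channel gain, and all function names are illustrative assumptions, not part of the proposed scheme.

```python
import math

def superpose(x_near, x_far, p_near, p_far):
    """Superposed coding: scale each symbol by the square root of its power share."""
    return math.sqrt(p_near) * x_near + math.sqrt(p_far) * x_far

def detect_bpsk(y):
    """Hard decision for BPSK symbols in {-1, +1}."""
    return 1.0 if y >= 0 else -1.0

def far_user_receive(y):
    # The weak (far) user decodes directly and treats the near user's signal as interference.
    return detect_bpsk(y)

def near_user_receive(y, p_far):
    # The strong (near) user first detects the high-power far-user symbol,
    # subtracts it (SIC), and then decodes its own symbol from the residual.
    x_far_hat = detect_bpsk(y)
    residual = y - math.sqrt(p_far) * x_far_hat
    return detect_bpsk(residual)

P_NEAR, P_FAR = 0.2, 0.8  # assumed power split: more power to the far user

results = []
for x_near in (-1.0, 1.0):
    for x_far in (-1.0, 1.0):
        y = superpose(x_near, x_far, P_NEAR, P_FAR)  # unit channel, no noise (assumption)
        results.append((x_near, x_far, near_user_receive(y, P_FAR), far_user_receive(y)))

all_recovered = all(xn == xn_hat and xf == xf_hat for xn, xf, xn_hat, xf_hat in results)
print(all_recovered)
```

With a sufficiently large power gap, the sign of the superposed symbol is always set by the far user's symbol, which is why the far user can skip SIC while the near user cannot.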

A. RELATED WORKS
To overcome the high complexity and high error probability in NOMA systems, phase rotation (PR) based NOMA has been presented in many recent works [9]-[13]. These studies build on the concept of signal space diversity (SSD) [14]. The key point of SSD is to apply a certain constellation rotation so that no overlapping coordinates exist among all symbols of the superposed signal. The NOMA system employing SSD rotates the symbol of either one user [9], [10] or all users [11]-[13].
In [9], the authors proposed a two-user downlink NOMA with PR employing joint multiple user detection and SIC, where only the symbol of the near user is rotated. The optimal angle was obtained to maximize the minimum distance between the points of the superposed constellation. This approach was extended to both uplink and downlink phase-rotated NOMA schemes for two users in [10]. The first user used the original 4-quadrature amplitude modulation (QAM), while the second user used the rotated 4-QAM. A closed-form expression was derived for the optimal angle of rotation, which minimizes the highest error rate among the two users. However, in these studies, in which only a single user was rotated, the optimal angle depended on the transmitted power levels.
In the case of rotating all users' symbols, [11] presented a two-user downlink NOMA network in which both near and far user symbols are rotated, and then coordinate interleaving (CI) was exploited in the superposed signal. Thereafter, an upper bound on the symbol error rate (SER) for both users was derived for rotation angle optimization. The optimization of the rotation angle was performed for either the near user or the far user. However, the two-user scenario still has limitations for the extension to multiple user pairings. Recently, a quadrature NOMA scheme was introduced in [12], which utilized two quadrature carriers (cosine and sine). In this scheme, all users are rotated using π/4 rotated M-ary amplitude shift keying by applying the CI. The authors demonstrated improved bit error rate (BER) performance and a reduced number of SIC operations.
However, the phase-rotated NOMA schemes above focused on improving the error probability performance rather than the data rate. In [13], the author proposed a constellation-domain scheme for two-user downlink NOMA to improve both the error rate and the data rate. The rotated quadrature phase shift keying (QPSK) modulation is assigned to all users, and then the in-phase component of the first user's signal and the quadrature component of the second user's signal are superposed in the constellation domain instead of the power domain. This may lead to the removal of SIC operations in the NOMA receiver. The analytical expression for the symbol error rate was derived, and the optimum angle was obtained to minimize the SER.

B. MOTIVATION AND CONTRIBUTIONS
To the best of our knowledge, clusters of more than three users have not been well studied for phase-rotation-based NOMA. Moreover, no exact BER has been evaluated for phase-rotation-based NOMA. Therefore, in this study, we extend [13] to the four-user case to guarantee the trade-off between the sum rate and error performance. To utilize the concept of SSD and CI, the final transmitted signal is obtained by multiplexing the real component of the first subgroup and the imaginary component of the second subgroup. Therefore, the signals of the two subgroups can be treated as independent.
The main contributions of this paper can be summarized as follows:
• A novel phase-rotation-based downlink NOMA scheme is proposed for four users that form a single cluster to achieve a lower BER and higher sum rate.
• The optimal rotation angle is investigated to maximize the minimum distance between symbols. Moreover, the optimal power allocation is obtained by balancing the inter-cluster and intra-cluster distances in the superposed signal.
• No SIC is required between the multiplexed subgroups, which may lead to a reduction in the total number of SIC operations.
• The exact BER and union bound BER are derived for all users employing M-ary PSK/QAM over a Rayleigh fading channel.
Simulation results show that the proposed system provides a good trade-off between the achievable sum rate (ASR) and BER compared with those of the existing NOMA clustering.

C. PAPER ORGANIZATION
The remainder of this paper is organized as follows. First, the problem formulation and solution are described in terms of the user grouping, and the system and channel models are described in Section II. In Section III, the optimal rotation angle and power allocation are described based on the distance between the symbols in the constellation. The exact and union bound of the error probability expressions for each user, and the achievable sum rates are derived for different schemes in Section IV. The simulation results are presented in Section V. Finally, Section VI concludes the paper.

A. PROBLEM FORMULATION: USER GROUPING
This paper considers a downlink NOMA system that consists of one BS and four users. The channel gains are ordered as $|h_1|^2 > |h_2|^2 > |h_3|^2 > |h_4|^2$, which is related to the distance of each user from the BS. Based on the users' distances from the BS, UE 4 and UE 3 are considered the far user (FU) and the cell edge user (CEU), whereas UE 1 and UE 2 are considered the near user (NU) and the cell center user (CCU) in the NOMA cellular scenario. Fig. 1 illustrates the differences between the conventional NOMA schemes and the proposed system for four users. In standard NOMA, all users' signals are multiplexed as a single NOMA cluster over the same transmission bandwidth B using different power levels, as shown in Fig. 1.(a) [5]. In general, as the number of users in a single cluster increases under a given total power, the allocated power levels become too close to distinguish the desired signal among the users during the decoding process. Therefore, four-user pairing can suffer from higher IUI than two-user pairing. In addition, as the number of SIC operations increases, the computational complexity increases [15].
To easily allocate distinguishable power levels, a multiple cluster system for four users is represented in Fig. 1.(b) [16]. Each user pair is allocated half of the transmission bandwidth, B/2, whereas only two users in a subgroup need to share the power levels from the total transmitted power. Referring to [17] for the optimal four-user pairing in terms of data rate, the optimal pairing ensures that UE 1 and UE 4 are paired for one subchannel, while UE 2 and UE 3 are paired for the other subchannel. In conclusion, these pairing methods cannot allow all signals to share the same time-frequency resources, but they can perform better in terms of error probability than the original NOMA.

TABLE 1. Comparison of conventional and proposed NOMA systems in terms of capacity and IUI.
            NOMA (single cluster)   NOMA (multiple clusters)   Proposed NOMA
Capacity    Medium                  Low                        High
IUI         High                    Low                        Medium
In this paper, a novel user-clustering method based on subgroups is proposed that combines the advantages of the two methods presented above. In other words, the proposed system builds four users into a single cluster composed of two subgroups over the entire transmission bandwidth, as shown in Fig. 1.(c). Similar to the multiple clusters scheme, each subgroup consists of two users selected by the optimal user pairing in [17]. The two subgroups are kept independent by using the real value of one subgroup signal and the imaginary value of the other. Therefore, half of the total transmit power is allocated to each subgroup, which leads to a lower error probability compared to the conventional NOMA with a single cluster.
In summary, Table 1 compares the conventional NOMA and proposed NOMA systems in terms of the capacity and IUI. The proposed NOMA system can achieve a better trade-off between the capacity and error performance compared to conventional NOMA cluster schemes.

B. SYSTEM AND CHANNEL MODELS
Fig. 2 illustrates the proposed transceiver design, including one BS and four receiver devices. As shown in Fig. 2, at the transmitter side, the four user signals are divided into two subgroups and pass through the rotation function "ROT." To differentiate subgroup $G_1$ from subgroup $G_2$, the PR is applied to all the NUs, that is, UE 1 and UE 2. In addition, the FU signal in subgroup 2 is rotated by π/2 so that it can be used for the imaginary value of the superposed signal. After rotation, the NU and FU signals of each subgroup are multiplexed with different power levels using power allocation "PA" and superposed coding "SC." The superposed signals for subgroups $G_1$ and $G_2$ are defined as
$$G_1 = \sqrt{p_1}\, x_1 e^{j\theta_1} + \sqrt{p_4}\, x_4,$$
$$G_2 = \sqrt{p_2}\, x_2 e^{j\theta_2} + \sqrt{p_3}\, x_3 e^{j\theta_3},$$
where $x_i$ and $p_i$ denote the transmitted signal and power allocation coefficient of UE $i$, $i \in \{1, 2, \cdots, N\}$, respectively. The total transmission power is normalized such that $\sum_{i=1}^{N} p_i = 1$, and $\theta_i$ is the angle of the constellation rotation for the $i$-th user. As mentioned above, $\theta_3$ is fixed at $90^\circ$. The optimal angle for the NUs is discussed in detail in Section III. Finally, the real and imaginary components of the subgroups' superposed signals are extracted and multiplexed as the transmitted signal using superposition coding. The transmitted signal can be written as
$$X = \Re\{G_1\} + j\,\Im\{G_2\}.$$
A simple example of the proposed system in a constellation is shown in Fig. 3. For this example, we consider BPSK for the FUs and QPSK for the NUs. There are three SC operations in this scenario. As in general NOMA, the first and second SC operations form the two subgroups. Then, similar to [13], the last SC operation combines the real values of subgroup 1 and the imaginary values of subgroup 2 into the final superposed signal.
In Fig. 3, the constellation for subgroup 1's superposed signal consists of UE 1 and UE 4 data. The superposed signal $G_1$ combines a rotated QPSK and an original BPSK. The real parts of the 8 constellation points in $G_1$ are used for the in-phase value of the final superposed signal. Similarly, the constellation for subgroup 2's superposed signal consists of UE 2 and UE 3 data. The superposed signal $G_2$ combines a rotated QPSK and a rotated BPSK. To distinguish subgroups 1 and 2, the UE 3 symbol is rotated by $90^\circ$ so that it lies on the quadrature axis. The imaginary parts of the 8 constellation points in $G_2$ are used for the quadrature value of the final superposed signal. Note that the in-phase and quadrature components of the signal $X$ contain the information of subgroup 1 and subgroup 2, respectively.
In other words, the in-phase component of the final superposed signal takes 8 different amplitude states for subgroup 1, and the quadrature component takes 8 different states for subgroup 2. Therefore, even though each component has only 8 states, the same as the number of constellation points of each subgroup, a total of 6 bits can be transmitted at the same time.
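The construction of the transmitted signal described above can be checked with a short sketch that enumerates every symbol combination for the BPSK-QPSK example. The rotation angle of the NUs (taken here as arctan(1/2)), the power split inside each subgroup, and all names are assumptions made for illustration only.

```python
import cmath
import math
from itertools import product

THETA_NU = math.atan(0.5)   # assumed NU rotation angle (about 26.57 degrees)
THETA_3 = math.pi / 2       # fixed 90-degree rotation of UE 3, as stated in the text
P1, P4 = 0.15, 0.35         # hypothetical split of subgroup 1's half of the power
P2, P3 = 0.15, 0.35         # hypothetical split of subgroup 2's half of the power

BPSK = [1 + 0j, -1 + 0j]
QPSK = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)]

def subgroup_signal(x_nu, p_nu, theta_nu, x_fu, p_fu, theta_fu):
    """Power-domain superposition of a rotated NU symbol and a (possibly rotated) FU symbol."""
    nu = math.sqrt(p_nu) * x_nu * cmath.exp(1j * theta_nu)
    fu = math.sqrt(p_fu) * x_fu * cmath.exp(1j * theta_fu)
    return nu + fu

def transmit_signal(x1, x2, x3, x4):
    """In-phase part from subgroup 1 (UE 1, UE 4), quadrature part from subgroup 2 (UE 2, UE 3)."""
    g1 = subgroup_signal(x1, P1, THETA_NU, x4, P4, 0.0)
    g2 = subgroup_signal(x2, P2, THETA_NU, x3, P3, THETA_3)
    return complex(g1.real, g2.imag)

points = [transmit_signal(x1, x2, x3, x4)
          for x1, x2, x3, x4 in product(QPSK, QPSK, BPSK, BPSK)]

# Round before de-duplicating so floating-point noise does not create spurious levels.
i_levels = sorted({round(p.real, 9) for p in points})
q_levels = sorted({round(p.imag, 9) for p in points})
distinct_points = {(round(p.real, 9), round(p.imag, 9)) for p in points}

print(len(i_levels), len(q_levels), len(distinct_points))
```

Under these assumptions each axis carries 8 distinct levels, and the 64 distinct combinations of in-phase and quadrature levels correspond to the 6 bits carried per transmitted symbol.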
The received signal of the $i$-th user is given by
$$y_i = h_i X + n_i,$$
where $h_i$ and $n_i$ are the channel coefficient and additive white Gaussian noise (AWGN) for the $i$-th user, respectively. $h_i$ denotes the Rayleigh fading channel, which depends on $d$, the distance between the BS and the user, and $v$, the path loss exponent. $n_i$ has a mean of zero and a variance of $N_0$.
On the receiver side, the strong users, that is, UE 1 and UE 2, apply SIC to subtract the far user's signal from the received superposed signal. However, the weak users, that is, UE 3 and UE 4, decode their own signals without SIC by treating the NUs' signals as interference.
For subgroup 1, the real value of the superposed signal is considered for the detection process. Thus, the in-phase components of the received signals of UE 1 and UE 4 are expressed as
$$y_i^I = h_i\,\Re\{G_1\} + n_i^I, \quad i \in \{1, 4\}.$$
Each receiver can decode its own signal using maximum likelihood detection (MLD). As mentioned above, the strong user, UE 1, needs an additional process, that is, SIC, to extract its own signal before the MLD. However, the weak user, UE 4, can decode its own signal without SIC by treating the UE 1 signal as interference. The decoded signals can be written as
$$\hat{x}_4 = \arg\min_{x} \left| y_4^I - h_4 \sqrt{p_4}\,\Re\{x\} \right|^2, \qquad \hat{x}_1 = \arg\min_{x} \left| \tilde{y}_1^I - h_1 \sqrt{p_1}\,\Re\{x e^{j\theta_1}\} \right|^2,$$
where $\tilde{y}_1^I$ is the estimated received signal of UE 1 after subtracting the signal of UE 4.
In the case of subgroup 2, the quadrature component of subgroup 2's superposed signal $G_2$ is considered for the detection process. Therefore, the quadrature components of the received signals of UE 2 and UE 3 are expressed as
$$y_i^Q = h_i\,\Im\{G_2\} + n_i^Q, \quad i \in \{2, 3\}.$$
Similar to subgroup 1, UE 2 first subtracts the UE 3 signal by exploiting SIC and then decodes the desired signal from the superposed transmitted signal $G_2$ using the MLD. On the other hand, UE 3 decodes the desired signal directly using the MLD because the UE 2 signal is treated as interference. Thus, the decoded signals of $x_2$ and $x_3$ can be written as
$$\hat{x}_3 = \arg\min_{x} \left| y_3^Q - h_3 \sqrt{p_3}\,\Im\{x e^{j\theta_3}\} \right|^2, \qquad \hat{x}_2 = \arg\min_{x} \left| \tilde{y}_2^Q - h_2 \sqrt{p_2}\,\Im\{x e^{j\theta_2}\} \right|^2,$$
where $\tilde{y}_2^Q$ is the estimated received signal of UE 2 after subtracting the signal of UE 3.
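The detection procedure for subgroup 1 can be sketched as follows for the noise-free case: UE 4 applies the MLD directly to the in-phase component, whereas UE 1 applies SIC first. The real-valued channel gain, the power split, the rotation angle, and the helper names are hypothetical choices for illustration; subgroup 2 would follow the same pattern on the quadrature component.

```python
import cmath
import math
from itertools import product

THETA_1 = math.atan(0.5)   # assumed NU rotation angle
P1, P4 = 0.15, 0.35        # hypothetical power split inside subgroup 1
H = 0.8                    # hypothetical real channel gain; noise is omitted

BPSK = [1 + 0j, -1 + 0j]
QPSK = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)]

def inphase(symbol, power, theta=0.0):
    """In-phase coordinate of a power-scaled, rotated symbol."""
    return (math.sqrt(power) * symbol * cmath.exp(1j * theta)).real

def mld(y, candidates, power, theta, h):
    """Maximum likelihood detection on one real dimension: pick the nearest candidate."""
    return min(candidates, key=lambda x: abs(y - h * inphase(x, power, theta)) ** 2)

def detect_far(y_i, h):
    # UE 4: no SIC, the NU contribution is treated as interference.
    return mld(y_i, BPSK, P4, 0.0, h)

def detect_near(y_i, h):
    # UE 1: detect the UE 4 symbol, subtract it (SIC), then run the MLD on the residual.
    x4_hat = detect_far(y_i, h)
    y_sic = y_i - h * inphase(x4_hat, P4)
    return mld(y_sic, QPSK, P1, THETA_1, h)

decisions = []
for x1, x4 in product(QPSK, BPSK):
    y_i = H * (inphase(x1, P1, THETA_1) + inphase(x4, P4))
    decisions.append((x1, x4, detect_near(y_i, H), detect_far(y_i, H)))

all_correct = all(x1 == x1_hat and x4 == x4_hat
                  for x1, x4, x1_hat, x4_hat in decisions)
print(all_correct)
```

Because the rotation gives the four QPSK points distinct in-phase coordinates, both bits of UE 1 can be recovered from the in-phase component alone in this sketch.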

III. OPTIMAL ANGLE AND POWER ALLOCATION
A. OPTIMAL ANGLE
The optimal angle for the NU can be derived using the Euclidean distance between the constellation points. For example, in the case of an NU with QPSK/QAM, the rotated constellation can be represented as shown in Fig. 4. Following the SSD principle, an equal intra-cluster distance among constellation symbols leads to a low error probability. In other words, each symbol has the same error probability at the optimal rotation angle. In Fig. 4.(a), the x-coordinates of the constellation points of UE 1 are given by (15). When the distance between neighboring symbols is equal, the error probability distribution of each symbol is uniform. By substituting the values of (15), the condition $A'C' = B'C'$ yields an expression in which the values of $h_I$ and $p_1$ can be eliminated. When $n_I/h_I$ is negligible, the optimal angle simplifies to (17). By using (16) and (17), the optimal angle is independent of the power allocation coefficient and channel gain. The same approach is repeated for UE 2 using the y-coordinates of its constellation points in Fig. 4.(b). When $E'F' = F'H'$ or $F'H' = H'G'$, the optimal angle is obtained. Similar to UE 1, the optimal angle of UE 2 is exactly the same as that in (17).
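The equal-spacing criterion can be explored numerically with the sketch below. It assumes unit-energy QPSK points at angles π/4 + θ + kπ/2, ignores noise and channel gain, and searches θ in (0, π/4) for the angle at which the projections on the in-phase axis are equally spaced. The value it finds is an outcome of these assumptions, not a quotation of (17).

```python
import math

def inphase_levels(theta):
    """Sorted x-coordinates of QPSK points located at pi/4 + theta + k*pi/2."""
    return sorted(math.cos(math.pi / 4 + theta + k * math.pi / 2) for k in range(4))

def spacing_mismatch(theta):
    """Difference between the largest and smallest gap of neighbouring projections."""
    levels = inphase_levels(theta)
    gaps = [b - a for a, b in zip(levels, levels[1:])]
    return max(gaps) - min(gaps)

# Coarse grid search over the open interval (0, pi/4).
STEPS = 20000
grid = [i * (math.pi / 4) / STEPS for i in range(1, STEPS)]
theta_grid = min(grid, key=spacing_mismatch)

# Closed form under the same assumptions: the projections are
# {-sin(a), -cos(a), cos(a), sin(a)} with a = pi/4 + theta, so equal gaps
# require sin(a) - cos(a) = 2*cos(a), i.e. tan(a) = 3 and theta = atan(1/2).
theta_closed = math.atan(0.5)

print(math.degrees(theta_grid), math.degrees(theta_closed))
```

Since neither the power coefficient nor the channel gain enters the spacing condition, this sketch is consistent with the statement that the optimal angle is independent of both.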

B. OPTIMAL POWER ALLOCATION
The final superposed signal is shown in Fig. 5. For the optimal angle of the NU, only the intra-cluster distance is considered. In contrast, the final superposed signal needs to consider both the intra-cluster and inter-cluster distances. Thus, the optimal power allocation can be expressed as in (19), where $d_{intra}$ and $d_{inter}$ denote the intra-cluster and inter-cluster distances, respectively, among the symbols of the superposed signal $X$. Using the distances of subgroup 1 in (19), the optimal power allocation is derived by equating the intra-cluster distance to the inter-cluster distance, which gives the optimal power allocation ratio in (22). For subgroup 2, the optimal allocation ratio is equal to (22) because the same total power, 0.5, is allocated to each subgroup.
Fig. 6 depicts the received signal of subgroup 1. In the constellation, asterisks (*) mark the rotated symbols of the superposed signal, and black circles mark the non-rotated symbols. We assume that all $x_1$ and $x_4$ symbols are equiprobable, so each symbol of the superposed signal occurs with probability 1/8. An error occurs when the in-phase component of the AWGN, $n_I$, exceeds the symbol's component in the BPSK constellation [18]. For example, in Fig. 6, the error condition depends on the in-phase value of subgroup 1's signal $G_1$.
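As a numerical illustration of the distance-balancing rule for the power allocation, the sketch below finds the NU power in subgroup 1 at which the intra-cluster and inter-cluster spacings of the in-phase levels coincide. The definitions of the two distances, the rotation angle arctan(1/2), and the search interval are assumptions of this sketch; the resulting ratio is not claimed to reproduce (22).

```python
import math

THETA = math.atan(0.5)      # assumed NU rotation angle
SUBGROUP_POWER = 0.5        # half of the total power per subgroup, as in the text

def inphase_levels(p1, p4, theta=THETA):
    """Sorted in-phase levels of subgroup 1: BPSK (UE 4) plus rotated QPSK (UE 1)."""
    nu = [math.sqrt(p1) * math.cos(math.pi / 4 + theta + k * math.pi / 2) for k in range(4)]
    fu = [math.sqrt(p4), -math.sqrt(p4)]
    return sorted(f + n for f in fu for n in nu)

def distances(p1):
    """Assumed definitions: d_intra is a neighbour gap inside one BPSK cluster,
    d_inter is the gap between the innermost points of the two clusters."""
    levels = inphase_levels(p1, SUBGROUP_POWER - p1)
    d_intra = levels[-1] - levels[-2]
    d_inter = levels[4] - levels[3]
    return d_intra, d_inter

def balanced_p1(lo=0.01, hi=0.25, iterations=60):
    """Bisection on d_inter - d_intra, which shrinks as the NU power grows."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        d_intra, d_inter = distances(mid)
        if d_inter > d_intra:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

p1_star = balanced_p1()
p4_star = SUBGROUP_POWER - p1_star
levels = inphase_levels(p1_star, p4_star)
gaps = [b - a for a, b in zip(levels, levels[1:])]

print(p1_star, p4_star, p4_star / p1_star)
```

Under these assumptions the balanced point makes all eight in-phase levels equally spaced, so the in-phase component behaves like a uniform 8-level constellation.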

IV. PERFORMANCE ANALYSIS
A. EXACT BER ANALYSIS
By utilizing the MLD, the error probability of UE 4 (far user) in subgroup 1 can be expressed as in (23). To simplify (23), the Gaussian Q-function $Q(\cdot)$ is used, where
$$Q(x) = \frac{1}{\pi}\int_{0}^{\pi/2} \exp\left(-\frac{x^2}{2\sin^2\theta}\right) d\theta$$
[19, Eq.(9)]. Then, the error probability of UE 4 can be represented in terms of $\gamma_j$, the signal-to-noise ratio (SNR) for the different signal constellation points in Fig. 6. The average BER at UE 4 is then obtained by averaging over $f_{\gamma_j}(\gamma_j)$, the probability density function of the SNR in the Rayleigh fading channel.
In contrast, the error probability of UE 1 (NU) in subgroup 1 must account for errors in the SIC process. Thus, the total error of UE 1 is calculated as the sum of two cases, that is, when UE 4 (FU) is detected correctly or erroneously during the SIC process. The probability of error of the UE 1 symbols can be written as
$$P_1(e) = P_1(e \mid \mathrm{correct}_{UE_4}) + P_1(e \mid \mathrm{error}_{UE_4}),$$
where $P_1(e \mid \mathrm{correct}_{UE_4})$ is the error probability of UE 1 when UE 4 is decoded correctly, and $P_1(e \mid \mathrm{error}_{UE_4})$ is the error probability of UE 1 when UE 4 is decoded incorrectly.
In other words, the first case covers the event that UE 4 is decoded correctly, which is determined by the in-phase value of subgroup 1's signal. To rearrange the resulting expression in terms of $n_I$ or $n_Q$, modified I/Q values for the subgroup signal and the NU signal are defined, and (34) is then expressed using the Q-function.
For the second case of the error probability of UE 1, both UE 1 and UE 4 decode erroneously. If a decoding error occurs in UE 4, the QPSK decision boundary of UE 1 changes owing to the SIC process. In particular, because UE 4 uses BPSK, the error of UE 4 only affects the in-phase part of the error probability of UE 1, through terms such as $\left(\sqrt{p_4} - \sqrt{p_1}\cos\left(\frac{\pi}{4} + \theta\right)\right)h_1$ and $n_Q \le -\sqrt{p_1}\sin\left(\frac{\pi}{4} + \theta\right)h_1$. Thus, the error probability of UE 1 when UE 4 is erroneously decoded is given in (32). Similar to (34), (32) can be simplified by rearranging the expression for $n_I$ with an auxiliary variable $s$. Finally, the probability of error of the UE 1 symbols is obtained by combining the two cases.
Similarly, the error probability of subgroup 2 can be derived by considering the quadrature component of the superposed signal. Fig. 7 depicts the received signal of subgroup 2. Unlike subgroup 1, subgroup 2 is heavily influenced by the quadrature values because both the BPSK and QPSK constellations are rotated. Similar to (23), the error probability of UE 3 (far user) in subgroup 2 is obtained from the quadrature component. Owing to errors in the SIC process, the probability of error of the UE 2 symbols again consists of two cases: the error probabilities of UE 2 when UE 3 is correctly and incorrectly decoded are given in (43) and (44), respectively.
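The finite-range form of the Q-function used in the exact analysis can be verified numerically, and it also shows why this form is convenient for averaging over Rayleigh fading. The sketch below compares the integral against the erfc-based definition and then averages a single illustrative term $Q(\sqrt{2\gamma})$ over an exponentially distributed SNR; that single term, the mean SNR values, and the function names are assumptions and do not reproduce the paper's $\gamma_j$ expressions.

```python
import math

def q_erfc(x):
    """Gaussian Q-function through the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def q_finite_range(x, steps=20000):
    """Midpoint-rule evaluation of (1/pi) * int_0^{pi/2} exp(-x^2 / (2 sin^2 t)) dt, x >= 0."""
    width = (math.pi / 2) / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * width
        total += math.exp(-x * x / (2.0 * math.sin(t) ** 2))
    return total * width / math.pi

def avg_q_rayleigh(mean_snr, steps=20000):
    """Average of Q(sqrt(2*gamma)) for exponentially distributed gamma.
    Averaging exp(-gamma / sin^2 t) over gamma gives sin^2 t / (sin^2 t + mean_snr),
    so only a single finite integral remains."""
    width = (math.pi / 2) / steps
    total = 0.0
    for k in range(steps):
        s2 = math.sin((k + 0.5) * width) ** 2
        total += s2 / (s2 + mean_snr)
    return total * width / math.pi

def avg_q_rayleigh_closed(mean_snr):
    """Well-known closed form of the same average."""
    return 0.5 * (1.0 - math.sqrt(mean_snr / (1.0 + mean_snr)))

q_errors = [abs(q_finite_range(x) - q_erfc(x)) for x in (0.0, 0.5, 1.0, 2.0, 3.0)]
avg_errors = [abs(avg_q_rayleigh(g) - avg_q_rayleigh_closed(g)) for g in (0.1, 1.0, 10.0, 100.0)]

print(max(q_errors), max(avg_errors))
```

The exponent in the finite-range form is linear in the SNR, so the expectation over the fading distribution can be taken inside the integral; this is the design reason such a representation is commonly used for average BER derivations.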

B. UNION BOUND BER ANALYSIS
We analyze the union bound on the BER of the users using the pairwise error probability (PEP). By using [22], the conditional PEP of each user can be expressed as in (45), where $\omega_i = |h_i|$. The interference due to the SIC is considered for the NUs, whereas the signals of the NUs are treated as noise for the FUs. By averaging (45) over the probability density function (PDF) of the Rayleigh fading channel, $f_\omega(\omega_i)$, the average PEP can be derived in terms of the complementary error function $\mathrm{erfc}(x)$. The BER union bound can then be expressed as in (48), where $q(x_l \to \hat{x}_l)$ is the number of bit errors when $x_l$ is the transmitted symbol and $\hat{x}_l$ is the detected symbol.

C. ACHIEVABLE SUM RATE
In this section, we compare the achievable sum rate (ASR) for the four-user scenarios with different schemes, that is, OMA, NOMA with a single pair, NOMA with multiple pairs, and the proposed system.
For OMA, the data rate of the $i$-th user is expressed in terms of the transmit SNR $\gamma = P/\sigma^2$, the transmission bandwidth $B$, and the number of users $N$. For OMA, equal bandwidth and equal power are allocated to all users as $B/N$ and $P/N$, respectively.
According to the number of clusters, NOMA can be classified as single-cluster or multiple-cluster NOMA. First, the conventional NOMA places all users in a single cluster using SIC (see Fig. 1.(a)). In other words, all users are multiplexed with different power levels at the same frequency. In the data rate of each user for NOMA with a single cluster, $R_{NOMA1,4}$ denotes the data rate of UE 4. According to the channel gain, UE 4 can detect its own signal without interference from other users. However, the $m(= K - 1)$-th user needs to consider other users as interference.
In the case of NOMA with multiple clusters, both frequency and power allocation are considered for user pairing. In this paper, four users are separated into two groups, where the number of NOMA clusters is $N_c = 2$ (see Fig. 1.(b)). In the data rate of each user, the transmission bandwidth $B/N_c$ is assigned to each subgroup, and the power allocation is assigned as $p_1 + p_4 = 1$ and $p_2 + p_3 = 1$.
However, the proposed system can be regarded as a hybrid of the two conventional NOMA schemes. First, two users are grouped into each subgroup, and the real and imaginary values of the subgroups' superposed signals are multiplexed by utilizing the SSD principle (see Fig. 1.(c)). Thus, in the data rate of the proposed system, all signals are assigned the full bandwidth, and the power levels are allocated as $p_1 + p_4 = 0.5$ and $p_2 + p_3 = 0.5$, that is, half of the total power for each subgroup. For all schemes, the ASR is calculated as the sum of the data rates of all users.

D. COMPLEXITY
We compare the computational complexity of the receivers in terms of the number of SIC operations, assuming N users in the cellular network. The conventional NOMA with a single cluster requires N(N − 1)/2 sequential SIC operations. However, the proposed NOMA requires N/2 sequential SIC operations because only the NUs in each subgroup perform SIC, as shown in Fig. 2.

V. SIMULATION RESULTS
In the simulations, UE 1 and UE 2 are assumed to be CCUs, and UE 3 and UE 4 are assumed to be CEUs. Thus, the transmitted symbols for the CEUs and CCUs are modulated using binary-coded BPSK and QPSK constellations, respectively. The total transmit power is assumed to be P = 1 in all cases.
Fig. 8 shows the BER of the four-user scenario for the proposed system versus the transmit SNR with perfect SIC. The exact and union bound BER are computed from the expressions derived above and verified by Monte Carlo simulation. The optimal power allocation is chosen by using (22). The analytical BER matches the simulation results. As expected, the union bound BER is an upper bound on both the simulated and the exact BER.
Fig. 9 compares the average BER (ABER) over all users of the proposed system and two conventional NOMA schemes with a different number of clusters. In the proposed NOMA and the first conventional NOMA, all users form a single cluster, $N_c = 1$, while the other conventional NOMA divides the four users into two subchannels, $N_c = 2$, where $N_c$ denotes the number of clusters (see Fig. 1). For conventional NOMA with a single cluster, the optimal power allocation for four-user NOMA has been investigated in a few recent works [12], [23], [24]. Previous studies considered the power allocation for QPSK/QAM modulation for all users. However, we consider two BPSK-QPSK combinations in this paper. Thus, by modifying [24, Eq.(1)], the optimal power allocation for PD-NOMA is determined as $p_1 + 4p_2 + 8p_3 + 32p_4 = 1$. For the conventional NOMA with multiple clusters, two users form a single NOMA cluster. Hence, following many previous works, the fixed power allocation (0.2, 0.8) is assigned to both $(p_1, p_4)$ and $(p_2, p_3)$. As can be noted from the figure, the ABER performance of the proposed system is worse than that of the two-cluster scenario but better than that of the single-cluster scenario of the conventional NOMA.
Fig. 10 shows the impact of the power allocation on the BER at SNR = 20 dB. The power allocation of the NUs, $p_1$ and $p_2$, varies from 0.01 to 0.25; correspondingly, $p_4$ and $p_3$ decrease from 0.49 to 0.25. When $p_1$ and $p_2$ increase, the superposed symbols that represent binary 0 and binary 1 for the FU move closer to each other. Consequently, the BER of the FUs increases owing to the increased IUI. However, as $p_1$ and $p_2$ increase, the BER of the NUs decreases until a particular point, which gives the lowest BER. Here, the particular point is where the intra- and inter-cluster distances among the superposed symbols are equal, $d_{inter} = d_{intra}$. As $p_1$ and $p_2$ increase beyond this point, the symbol distances change to $d_{inter} < d_{intra}$, which increases the BER of the NUs.
Fig. 11 compares the achievable sum rate of the proposed NOMA, standard NOMA with a single cluster and with multiple clusters, and OMA. The ASR is the sum of the data rates in the different schemes, computed using (49), (50), (51), and (52). In the single-cluster case, the proposed scheme and conventional NOMA use the full transmission bandwidth B and share the transmission power P among all users. However, in the multiple-cluster case, conventional NOMA uses two subchannels, each with half of the bandwidth, B/2, and the transmission power P is allocated to each subchannel. In OMA, the bandwidth and power are divided equally among all users as B/4 and P/4. As a result, the proposed system can outperform the other schemes in terms of the ASR.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.

VI. CONCLUSION
In this paper, a novel multi-user downlink NOMA scheme is introduced to improve the error probability and the data rate performance simultaneously. By exploiting SSD and CI, the in-phase and quadrature components of each subgroup's signal are multiplexed into the final transmitted signal, so that two independent subgroups form a single cluster. Therefore, no SIC is required between the two superimposed subgroups, and the complexity of the proposed system in terms of required SIC operations can be lower than that of conventional NOMA networks with the same number of users in a NOMA cluster. Moreover, the optimal rotation angle and power allocation are obtained to minimize the BER by maximizing the Euclidean distance between symbols, that is, the intra- and inter-cluster distances in the received signal. Furthermore, the analytical BER was derived and matched the Monte Carlo simulation. Even though the proposed NOMA does not achieve the best BER performance, it can achieve a much higher ASR compared to conventional NOMA systems. In conclusion, the proposed NOMA scheme guarantees a good trade-off between BER and ASR.
Our future research will focus on evaluating performance with imperfect SIC and imperfect channel knowledge for a more realistic system. Moreover, the analysis will be extended to the case in which there are different power allocations for each subgroup.