Joint Symbol Level Precoding and Combining for MIMO-OFDM Transceiver Architectures Based on One-Bit DACs and ADCs

Herein, a precoding scheme is developed for orthogonal frequency division multiplexing (OFDM) transmission in multiple-input multiple-output (MIMO) systems that use one-bit digital-to-analog converters (DACs) and analog-to-digital converters (ADCs) at the transmitter and receiver, respectively, as a means to reduce the power consumption. Two different one-bit architectures are presented. In the first, a single user MIMO system is considered where the DACs and ADCs of the transmitter and the receiver are assumed to be one-bit and in the second, a network of analog phase shifters is added at the receiver as an additional analog-only processing step with the view to mitigate some of the effects of coarse quantization. The precoding design problem is formulated and then split into two NP-hard optimization problems, which are solved by an algorithmic solution based on the Cyclic Coordinate Descent (CCD) framework. The design of the analog post-coding matrix for the second architecture is decoupled from the precoding design and is solved by an algorithm based on the alternating direction method of multipliers (ADMM). Numerical results show that the proposed precoding scheme successfully mitigates the effects of coarse quantization and the proposed systems achieve a performance close to that of systems equipped with full resolution DACs/ADCs.


I. INTRODUCTION
LARGE-SCALE and massive multiple-input multiple-output (MIMO) systems significantly improve spectral efficiency and reliability compared to systems equipped with a small number of antenna elements, and are a key component for meeting the ever-growing demand for mobile services. Additionally, large-scale MIMO systems enable communications at mmWave frequencies [1]-[3], as the large number of elements enables very narrow beams and thus high antenna gains. This is crucial, as it can effectively mitigate the severe propagation loss and rain fading [3] experienced by signals in the mmWave band.
The large number of antennas at the transmitter as well as the receiver increases the degrees of freedom of the system, and precoding can be applied to exploit them and improve the system performance. In the case of multi-user or multi-stream communications over a single time and frequency resource, precoding becomes necessary in order to suppress the interference between the streams and achieve reliable communications. Precoding techniques use the channel state information (CSI) and/or the information symbols to create the transmit signal in a way that satisfies the design criteria. The large number of precoding techniques in the literature can be separated into two broad categories based on the information that they use. On the one hand, in block level precoding (BLP) the precoder is designed using only the knowledge of the CSI, and therefore its update rate depends on the channel coherence time. The transmitted symbols are produced by linearly applying the precoder to the information symbols, and for that reason BLP is commonly known as linear precoding. Linear precoding is a well-researched area with multiple related works in the literature [4]-[8]. On the other hand, symbol level precoding (SLP) techniques use both the CSI and the information symbols to produce the precoded symbols. SLP has certain advantages: destructive multi-user interference can be turned into constructive interference [9]-[13], and SLP can be used to enhance physical layer security [14], [15]. Additionally, SLP schemes have been used to mitigate the effects of using components with low power consumption such as phase shifters, non-linear power amplifiers (PAs) or low-resolution digital-to-analog converters (DACs). This is of particular importance in systems with a large number of antenna elements that require dedicated power-hungry components for each antenna.
In [12], [16]-[18] precoding with constant envelope signals was proposed for transmitters with power-efficient, non-linear PAs; in [19]-[21] constant envelope precoding solutions were proposed for transmission over frequency selective channels; and in [22]-[30] precoding schemes were proposed for the downlink of multi-user MIMO systems comprising a multiple-antenna base station with low-resolution DACs and users with full resolution ADCs. Furthermore, in [31] and [32] constant envelope precoding techniques were proposed for systems with low-resolution DACs and one-bit DACs, respectively. In [33] closed form expressions for the SQINR and SER of linear precoders were derived for MIMO systems with one-bit DACs. Additionally, in [34]-[37] linear and non-linear precoding schemes were proposed for MIMO systems with low-resolution DACs for orthogonal frequency division multiplexing (OFDM) transmission over frequency selective channels.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Other relevant literature has focused on reducing the power consumption of the multi-antenna receiver by employing low-resolution analog-to-digital converters (ADCs). There is special interest in one-bit ADCs, as they are the least power-consuming components able to convert the received analog signals into digital ones [38]. Additionally, since one-bit ADCs only need to distinguish the sign of the signal, they eliminate the need for complicated automatic gain control and can further reduce the complexity of the receiver's analog front end. In [39] the expression for the mutual information of MIMO channels with one-bit ADCs was derived, and it was found that the penalty of employing one-bit quantization was approximately 2 dB at low SNR. Several contributions studied the uplink, where multiple single-antenna users employing high resolution DACs transmit to a base station with a large number of receive antennas and one-bit ADCs. Research showed that in the case of frequency-flat channels, linear detectors combined with simple linear precoding can achieve high sum rates [40]-[45] when the number of antennas is large enough. Furthermore, the authors in [46]-[49] arrived at similar results for the case of transmission over a frequency-selective channel. However, these works did not consider the utilisation of one-bit DACs at the transmitters and the negative impact that this would have on the system performance.
In this article we study a point-to-point MIMO system which employs OFDM transmission to mitigate the effects of the channel's frequency selectivity. The considered system is equipped with both one-bit DACs at the transmitter and one-bit ADCs at the receiver. This is a departure from the systems studied so far in the literature, where one-bit DACs were considered in systems with full resolution ADCs, or one-bit ADCs were considered in systems with full resolution DACs. We design a novel precoder that takes into account the coarse quantization at both the transmitter and the receiver. This kind of transceiver architecture has multiple advantages from a power consumption perspective. It achieves significant power savings, which can be attributed to the use of one-bit DACs and ADCs as well as power-efficient non-linear PAs. In general, OFDM systems suffer from a high peak-to-average power ratio (PAPR) and require highly linear PAs, which significantly increase the cost and power consumption of the transmitter [50], [51]. However, in this system with one-bit DACs, the analog signal has a constant envelope across time and antennas, which makes it immune to non-linear amplitude and phase distortion, and therefore power-efficient non-linear PAs can be used instead.
Analytically, the contributions of the present work are the following.
• Two novel MIMO architectures for OFDM transmission with one-bit DACs and one-bit ADCs. The first proposed system is fully digital, while the second is equipped with a network of analog phase shifters at the receiver.
• A study of the power consumption of such one-bit DAC/ADC transceiver architectures, including the non-linear PA model.
• An SLP scheme for OFDM transmission in a large MIMO system with one-bit DACs and ADCs. The novel precoding design is formulated as a constrained least-squares problem and is then split into two similar mixed-discrete least-squares problems that are NP-hard. For the solution of the latter problems, an efficient algorithm based on Cyclic Coordinate Descent (CCD) is proposed.
• A novel design for the analog post-coding matrix, which is applied to the received signal by the network of analog phase shifters. The design of the analog post-coder is decoupled from the precoding design, as its update depends only on the CSI, and is formulated as a norm maximization problem with a unit-modulus constraint. A novel algorithmic solution is presented by applying the Alternating Direction Method of Multipliers (ADMM).
• Numerical simulations showing that the proposed schemes overcome the shortcomings of coarse quantization both at the transmitter and at the receiver. Furthermore, the proposed solutions achieve performance close to that of precoding schemes based on full resolution DACs and ADCs, and even outperform them when the non-linearities of the PAs are taken into account.
The article is organized as follows. In Section II, we describe the proposed architectures for a MIMO-OFDM system based on one-bit DACs and ADCs. In the same section, the SLP design problems for the described architectures are formulated, as well as the design problem of the analog post-coding matrix. In Section III, we present the solution to the SLP problems as well as to the post-coder design. In Section IV, we derive the power consumption of the two architectures. Simulation results are presented in Section V, followed by the conclusions in Section VI.

A. Transceiver Architecture Based on One-Bit DACs/ADCs
A MIMO communication system, as shown in Figure 1, is considered, where the transmitter is equipped with M antennas and the receiver is equipped with K antennas, so that K < M. There are two one-bit DACs per transmit antenna, one for the in-phase and one for the quadrature component, which convert the digital output of the precoder into an analog signal. In a similar manner, the received signal at each antenna of the receiver is quantized by two one-bit ADCs and the digital outputs of the 2K ADCs are then combined to provide an estimate of the information symbols that have been sent.
It is assumed that the channel experiences frequency selective fading due to multipath propagation. Consequently, the MIMO time domain channel with $L$ resolvable taps can be modelled as in (1), where $h_T(m, k, l)$ is the $l$-th channel tap between the $m$-th transmit and $k$-th receive antenna. The system employs OFDM transmission [52], a well-known scheme which is used to mitigate the adverse effects of the channel's frequency selectivity. In OFDM the total available channel bandwidth, $BW$, is divided into $N_{SC}$ orthogonal subcarriers, with spacing $\Delta f = BW/N_{SC}$ between them, which are modulated independently either with a conventional modulation or, in this case, with precoded symbols. This results in $N_{SC}$ narrowband channels, each with bandwidth smaller than the coherence bandwidth of the channel, and therefore the fading experienced by each one can be considered flat.
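As an illustration of how the $L$-tap time domain channel turns into $N_{SC}$ flat subchannels, the following minimal sketch computes the $K \times M$ frequency response of every subcarrier with a zero-padded FFT along the tap axis. The array layout `h_t[l, k, m]`, the function name `subcarrier_responses` and the random test channel are assumptions made for this example and are not taken from the article.

```python
import numpy as np

def subcarrier_responses(h_t, n_sc):
    """Return H_F[n] (K x M) for n = 0..n_sc-1 from the taps h_t[l, k, m]."""
    # Zero-padded FFT over the tap axis:
    # H_F[n] = sum_l h_t[l] * exp(-2j*pi*n*l/n_sc)
    return np.fft.fft(h_t, n=n_sc, axis=0)

# Hypothetical dimensions and a random Rayleigh-like channel (assumption)
rng = np.random.default_rng(0)
L, K, M, N_SC = 8, 4, 6, 32
h_t = (rng.standard_normal((L, K, M))
       + 1j * rng.standard_normal((L, K, M))) / np.sqrt(2 * L)

H_F = subcarrier_responses(h_t, N_SC)   # shape (N_SC, K, M)
```

Each slice `H_F[n]` then plays the role of the flat MIMO channel seen by the $n$-th subcarrier.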
The use of one-bit DACs at the transmitter means that the input vectors $x_T$ have entries that lie in the complex set $\mathcal{X} = \{1+j,\ 1-j,\ -1+j,\ -1-j\}$, where $j$ is the imaginary unit. Let $x_T(m, n) \in \mathcal{X}$ denote the signal transmitted by the $m$-th antenna at the $n$-th time index. Assuming perfect synchronization, the baseband received time domain signal in the sample domain is denoted by $y_T(k, n)$, where $k = 1, 2, \ldots, K$, $n = 1, 2, \ldots, N_{SC}$, and $z_T(k, n)$ is the corresponding sample of additive white Gaussian noise (AWGN). During the transmission of an OFDM symbol the channel is assumed to remain constant and to be perfectly known at the transmitter. The viability of the proposed architectures is heavily dependent on the existence of an effective CSI estimation solution. It is therefore instructive to point out that recent advancements in the signal processing community have shown that it is possible to estimate the channel almost perfectly even in the case of one-bit ADCs [53]. Another promising solution for channel estimation in systems with one-bit ADCs is presented in [54], where a small number of high resolution ADCs is connected via a switch to the receive antennas during the estimation phase, thus making the channel estimation problem more tractable and the perfect CSI assumption more plausible. It should also be mentioned that a cyclic prefix (CP) of length $\nu$ is prepended at the transmitter, which means that the last $\nu$ samples of $x_T(n)$ are added at the beginning of the symbol. The CP serves as a guard interval that protects the received symbol from delayed copies of the previous one and also enables the linear convolution of the signal with the channel to be modelled as a circular convolution. At the receiver, $y_T$ is quantized by one-bit ADCs and the output of each ADC is given by $r_T(k, n) = f_q\{y_T(k, n)\}$, where $f_q\{w\} = \mathrm{sgn}\{\mathrm{Re}\{w\}\} + j\,\mathrm{sgn}\{\mathrm{Im}\{w\}\}$, with $\mathrm{sgn}\{\cdot\}$ denoting the sign function. Consequently, the outputs of the ADCs lie in $\mathcal{X}$.
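The two operations described above, the one-bit quantizer $f_q$ and the cyclic prefix, can be sketched as follows for a single transmit/receive antenna pair. The helper names, the convention that a zero input is mapped to $+1$, and the toy dimensions are assumptions of this example. The last lines check numerically that, once the CP samples are dropped, the linear convolution with the channel coincides with a circular one.

```python
import numpy as np

def one_bit_quantize(w):
    """f_q{w} = sgn(Re{w}) + j*sgn(Im{w}); zeros are mapped to +1 (assumption)."""
    re = np.where(np.real(w) >= 0, 1.0, -1.0)
    im = np.where(np.imag(w) >= 0, 1.0, -1.0)
    return re + 1j * im

def add_cyclic_prefix(x, nu):
    """Prepend the last nu samples of x (time index on the last axis)."""
    return np.concatenate([x[..., -nu:], x], axis=-1)

rng = np.random.default_rng(1)
N_SC, L, nu = 16, 4, 6            # the CP must cover the channel memory: nu >= L - 1
x = one_bit_quantize(rng.standard_normal(N_SC) + 1j * rng.standard_normal(N_SC))
h = rng.standard_normal(L) + 1j * rng.standard_normal(L)

# Linear convolution of the CP-extended symbol, keeping only the N_SC useful samples
y_lin = np.convolve(add_cyclic_prefix(x, nu), h)[nu:nu + N_SC]
# Circular convolution computed in the frequency domain
y_circ = np.fft.ifft(np.fft.fft(h, N_SC) * np.fft.fft(x))
```

In this noiseless toy case `y_lin` and `y_circ` agree up to floating-point error, which is the property that allows per-subcarrier processing.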
After quantization, the first $\nu$ samples, which correspond to the CP and are corrupted by inter-symbol interference, are discarded and the FFT is computed. The output of the FFT is expressed in matrix form in terms of $\tilde{y}_F$ and $\tilde{z}_F$, the $KN_{SC} \times 1$ vectors produced by concatenating the frequency domain received signals and noise samples, respectively, across all the subcarriers and receive antennas. Additionally, $\tilde{W}_{N_{SC},K}$ and $\tilde{W}^H_{N_{SC},M}$ are reshaped DFT matrices that are used to perform the DFT and IDFT, respectively. They are defined in (6) through $W_{N_{SC}}$, the classic $N_{SC} \times N_{SC}$ DFT matrix, the Kronecker product $\otimes$ and the $N \times N$ identity matrix $I_N$. Using (6), the frequency domain input vector is obtained from the time domain one. Moreover, $\tilde{H}_F$ is the $KN_{SC} \times MN_{SC}$ block matrix in which $H_{F_n}$ is the $K \times M$ frequency response of the $n$-th subchannel. Furthermore, the receiver applies a combiner and uses $\hat{s}$ to take a decision on the symbols that have been sent. The use of the post-coding matrix is critical in this system, as it allows the outputs of the $2K$ one-bit ADCs to be combined into $R$ data streams, with $R < K$, of higher order modulations. The transmitter employs an SLP scheme that designs the transmit vector $x_T(n) \in \mathcal{X}^M$ in a way that minimizes the Euclidean distance between the $RN_{SC} \times 1$ vector of information symbols, which must be conveyed to the receiver, and $\hat{s}$. This objective is expressed by the function $F$ in (10). In this approximate expression the effect of the AWGN has been ignored in order to simplify the objective, an assumption which becomes more accurate as the SNR increases.
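To make the role of the reshaped DFT matrices concrete, the sketch below builds a matrix of the form $W_{N_{SC}} \otimes I_K$ and checks that applying it to a stacked vector is equivalent to taking an FFT over time separately for each antenna. The unitary normalization, the stacking order (antennas varying fastest within each time index) and the function name `reshaped_dft` are assumptions of this example; the article's exact definition may order the Kronecker factors differently.

```python
import numpy as np

def reshaped_dft(n_sc, n_ant):
    """Unitary DFT matrix W (n_sc x n_sc) lifted to W kron I_{n_ant}."""
    W = np.fft.fft(np.eye(n_sc)) / np.sqrt(n_sc)
    return np.kron(W, np.eye(n_ant))

rng = np.random.default_rng(2)
N_SC, K = 8, 3
# Y[n, k]: time domain sample n at receive antenna k (toy data)
Y = rng.standard_normal((N_SC, K)) + 1j * rng.standard_normal((N_SC, K))

y_stacked = Y.reshape(-1)                        # [y(0); y(1); ...], each block of length K
y_f = reshaped_dft(N_SC, K) @ y_stacked          # matrix form of the DFT
Y_f_ref = np.fft.fft(Y, axis=0) / np.sqrt(N_SC)  # per-antenna FFT over time
```

The matrix form is convenient for writing the optimization problems compactly; in practice the per-antenna FFT in the last line is the cheaper way to evaluate the same product.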
The precoder design can be expressed as an optimization problem, $(P_1)$, by adding to the objective function $F$ the constraint imposed on the input vector by the one-bit DACs, $\tilde{x}_T \in \mathcal{X}^{MN_{SC} \times 1}$, and by constraining the scalar $\beta$, which is a scaling of the original symbol vectors, to lie in $\mathbb{R}^+$. In this formulation the problem is intractable, and for this reason we propose the following splitting into two separate optimization problems. First, the vector of the ADCs' outputs, $\tilde{r}_T \in \mathcal{X}^{KN_{SC} \times 1}$, is designed in a way that minimizes the distance between it and the vector of information symbols; this is formulated as the optimization problem $(P_2)$. In $(P_2)$ the matrix $G$ is also unknown. However, a joint optimization is difficult and impractical, which is why we follow a decoupled approach for the design of the combiner matrix. This choice is further supported by the different rates at which $\tilde{x}_T$ and $G$ need to be updated: the precoding vector $\tilde{x}_T$ needs to be updated at the symbol rate, while $G$ needs to be updated only when the CSI changes. In Section III an approach based on the SVD is proposed for the design of $G$. Because $G$ is used for the design of the transmit signal at the transmitter and is also applied to combine the outputs of the ADCs at the receiver, it must be known at both sides. Since $G$ is designed using the SVD of the channel and no other information, and both sides have CSI knowledge, this assumption is reasonable.
It should be noted that while the value of $\beta_1$ is derived by solving $(P_2)$ at the transmitter, it is applied at the receiver side to scale the signal appropriately. However, there is no need to transmit the value of $\beta_1$, which would introduce significant communication overhead as it is updated at the symbol rate; rather, it can be blindly estimated at the receiver using an expression that involves the employed symbol constellation $\mathcal{Q}$ and its modulation order $M_{\mathcal{Q}}$. Once $(P_2)$ is solved, the vector of the desirable ADC outputs $\tilde{r}_T$ is used to formulate a second optimization problem, $(P_3)$, with the objective of selecting the input vector $\tilde{x}_T \in \mathcal{X}^{MN_{SC}}$ that minimizes the distance in the frequency domain between the noiseless received signal $\tilde{H}_F \tilde{W}_{N_{SC},M} \tilde{x}_T$ and the vector of optimal ADC outputs $\tilde{r}_F = \tilde{W}_{N_{SC},K} \tilde{r}_T$.
It should be highlighted here that the introduction of $\tilde{W}_{N_{SC},M}$ in problem $(P_3)$ means that the IFFT is integrated in the design of the transmit signal, so there is no need for a separate IFFT computation block at the transmitter as in conventional OFDM systems. One can observe that the optimization problems $(P_2)$ and $(P_3)$ are very similar, and an algorithmic solution that is developed for one can be easily applied to the other. The problems are NP-hard, and one solution could be an exhaustive search over all the possible vectors $\tilde{x}_T \in \mathcal{X}^{MN_{SC}}$. The complexity of this solution increases exponentially with the number of antennas and subcarriers, and it would therefore be enormous even for a system with few antennas and a short OFDM block. Instead, in Section III we propose an efficient solution for both problems based on a Cyclic Coordinate Descent (CCD) framework [55].

B. Transceiver Architecture Based on One-Bit DACs/ADCs and a Network of Analog Phase Shifters
In the previously discussed system, the large number of antennas mitigates the effects of coarse quantization both at the transmitter and at the receiver. While a large number of receive antennas is generally very beneficial in the case of one-bit receive quantization, here the simultaneous use of one-bit DACs at the transmitter and one-bit ADCs at the receiver creates a problem. As the number of receive antennas increases, the transmitter, which is itself restricted by one-bit DACs, finds it increasingly difficult to deliver the appropriate signal to each receive antenna and ADC. This inherent disadvantage means that an increase in the number of receive antennas $K$ will increase the value of the function in (10), as it decreases the number of degrees of freedom available to the transmitter. This deteriorates the system performance rather than improving it, as would happen in a classical MIMO system. This motivates us to investigate an alternative power-efficient architecture in which an increase in the number of receive antennas leads to an improved SNR without negatively affecting the available degrees of freedom of the system.
To this end, we propose the system architecture shown in Figure 2. This system architecture is similar to the one described above, but a network of analog phase shifters has been added before the one-bit ADCs. Furthermore, the number of ADCs is now reduced to $2N_s$ from $2K$, which means that the network of $KN_s$ phase shifters maps the received signal of the $K$ antennas to the $2N_s$ ADCs.
The network of phase shifters can be mathematically modeled as an $N_s \times K$ matrix $Q$ with unit-modulus entries, $|Q_{ij}| = 1, \forall i, j$, that is applied to the received signal in the RF domain, and therefore the output of the ADCs is modified accordingly. The precoder design is split into two problems that must be solved successively, as before. The first problem is identical to $(P_2)$, the only difference being that the dimension of $\tilde{r}_T$ changes to $N_sN_{SC} \times 1$ from $KN_{SC} \times 1$. The second problem is altered by the addition of $\tilde{Q} = I_{N_{SC}} \otimes Q$ to the cost function and becomes $(P_4)$. While addressing $(P_4)$, the matrix $Q$ was not considered an optimization variable. This is because the joint problem is difficult to address and is also impractical, as $Q$ is updated when the CSI changes rather than at the symbol rate. Thus, we opt to decouple the problem of designing $Q$ from the precoding design. The purpose of introducing $Q$ is to increase the SNR at the receiver without significantly increasing the hardware complexity or power consumption. This objective can be achieved if $Q$ is designed to maximize the Frobenius norm of the product $\tilde{Q}\tilde{H}_F$. Additionally, it is crucial for the system's performance to have as many degrees of freedom available as possible, which means that we also want to design $Q$ to maximize the rank of the product $\tilde{Q}\tilde{H}_F$. These two objectives can be pursued at the same time by maximizing the nuclear norm of the product, $\|\tilde{Q}\tilde{H}_F\|_*$. This is because the nuclear norm is related to the rank and the Frobenius norm of a matrix via the inequality $\|\tilde{Q}\tilde{H}_F\|_* \le \sqrt{\mathrm{rank}(\tilde{Q}\tilde{H}_F)}\,\|\tilde{Q}\tilde{H}_F\|_F$. This expression is derived by applying the Cauchy-Schwarz inequality to the nuclear norm and by using the definitions of the nuclear norm, $\|\cdot\|_* = \sum_i \sigma_i$, and of the Frobenius norm, $\|\cdot\|_F = \sqrt{\sum_i \sigma_i^2}$, with $\sigma_i$ denoting the $i$-th singular value of the matrix. As a result, the design of $Q$ can be expressed as the optimization problem $(P_5)$. It should be noted that $Q$ depends only on the channel and is therefore updated only when the channel matrix changes.
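The relation between the nuclear norm, the rank and the Frobenius norm can be checked numerically. The sketch below draws a hypothetical unit-modulus phase-shifter matrix and a random single-subcarrier channel (both assumptions of this example, standing in for the block matrices of the article) and evaluates the three quantities from the singular values of their product.

```python
import numpy as np

rng = np.random.default_rng(3)
N_s, K, M = 4, 16, 8

# Hypothetical phase-shifter matrix with unit-modulus entries and a random channel
Q = np.exp(1j * rng.uniform(0, 2 * np.pi, size=(N_s, K)))
H = (rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))) / np.sqrt(2)

A = Q @ H
sigma = np.linalg.svd(A, compute_uv=False)   # singular values of the product

nuclear = sigma.sum()                        # ||A||_*  = sum of singular values
frobenius = np.sqrt((sigma ** 2).sum())      # ||A||_F  = sqrt of sum of squares
rank = np.linalg.matrix_rank(A)
upper_bound = np.sqrt(rank) * frobenius      # Cauchy-Schwarz bound on ||A||_*
```

For any matrix, `frobenius <= nuclear <= upper_bound` holds, so a large nuclear norm requires both a large Frobenius norm and singular values spread over many dimensions.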
Furthermore, $Q$ is required both at the transmitter for the precoding design and at the receiver for appropriately combining the received signals. By assuming that the CSI is known at both the transmitter and the receiver, $Q$ can be computed at both sides. While the objective function of problem $(P_5)$ is convex, the unit-modulus constraint on the entries of the matrix is not, and the problem is therefore NP-hard.
In Section III the problem is reformulated and a new algorithm based on the ADMM framework is proposed for its solution.

A. Precoding Solution for System Based on One-Bit DACs/ADCs
In this section the solution to the precoding problems with one-bit DACs and ADCs, as formulated in Section II, is presented. By observing problems $(P_2)$ and $(P_3)$-$(P_4)$, which correspond to the two proposed one-bit DAC/ADC architectures, it is noted that they are very similar and the same algorithmic solution can be applied to all of them. The solution derived here for these problems is based on the Cyclic Coordinate Descent (CCD) framework [55]. In addition to the precoding vector, the algorithmic solution for designing the analog post-coder matrix $Q$ by applying the Alternating Direction Method of Multipliers (ADMM) [56] is also provided in this section. The algorithmic framework of CCD has been used in [37] to efficiently solve a similar precoding problem in a MIMO system with one-bit DACs and multiple single-antenna users with full-resolution ADCs.
In the case where there is no analog processing of the signal at the receiver, which corresponds to the system shown in Fig. 1, the design of the precoded symbols includes two steps. In the first step the information symbols drawn from a constellation are given as an input to $(P_2)$, which is solved, and in the second step the output is fed as an input to $(P_3)$. The solution of $(P_3)$ is the vector of the precoded OFDM symbol in the time domain, which is transmitted by the $M$ antennas of the transmitter. As mentioned, both problems are solved using the CCD method, which enables the minimization of a multivariate cost function by iterating through the different coordinate directions and minimizing the cost function over one direction at a time. Therefore, by applying the CCD method to $(P_2)$, the $k$-th component $\tilde{r}_T^{(i+1)}(k)$ is computed at the $(i+1)$-th iteration of the algorithm, and the dependent quantities are updated accordingly. To update the value of the variable $\beta_1$, a scalar optimization problem is solved, which yields a closed form solution involving the operator $\mathrm{Re}\{\cdot\}$, denoting the real part of a complex number.
The full description of the iterative solution can be seen in Algorithm 1. It is worth noting that the variable $t$ is introduced to perform efficiently the update of one coordinate at a time in CCD. The algorithm is terminated when the criteria in (18) are met. The same procedure is followed to derive the solution for problems $(P_3)$ and $(P_4)$ after performing the appropriate replacements in equations (14)-(17). The input vector $s$ is replaced by $\tilde{r}_F$, $\beta_1$ is replaced by $\beta$, $\tilde{r}_T$ is replaced by $\tilde{x}_T$ and, finally, $A$ is replaced by $A_1 = \tilde{H}_F\tilde{W}_{N_{SC},M}$ for $(P_3)$ and $A_2 = \tilde{Q}\tilde{H}_F\tilde{W}_{N_{SC},M}$ for $(P_4)$, respectively.
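The following is a minimal sketch in the spirit of the CCD procedure described above, not a reproduction of Algorithm 1. It assumes a generic cost of the form $\|\beta s - A r\|^2$ with $r \in \{\pm 1 \pm j\}^n$ and $\beta > 0$, updates $\beta$ once per sweep rather than per coordinate, and uses a random initial point; the function name, the stopping rule and all dimensions are hypothetical.

```python
import numpy as np

def sgn(v):
    return 1.0 if v >= 0 else -1.0

def ccd_one_bit_ls(A, s, n_sweeps=30, tol=1e-12, seed=0):
    """Heuristically minimise ||beta*s - A r||^2 over r in {+-1 +-j}^n, beta > 0."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    r = rng.choice([-1.0, 1.0], n) + 1j * rng.choice([-1.0, 1.0], n)
    s_energy = np.vdot(s, s).real
    Ar = A @ r
    # Closed-form scaling: beta = Re{s^H A r} / ||s||^2, kept strictly positive
    beta = max(np.vdot(s, Ar).real / s_energy, 1e-12)
    costs = [np.linalg.norm(beta * s - Ar) ** 2]
    for _ in range(n_sweeps):
        for k in range(n):
            a_k = A[:, k]
            t = beta * s - (Ar - a_k * r[k])      # target seen by coordinate k alone
            z = np.vdot(a_k, t)                   # a_k^H t
            c = sgn(z.real) + 1j * sgn(z.imag)    # best point of the one-bit alphabet
            Ar += a_k * (c - r[k])                # rank-one update of A r
            r[k] = c
        beta = max(np.vdot(s, Ar).real / s_energy, 1e-12)
        costs.append(np.linalg.norm(beta * s - Ar) ** 2)
        if costs[-2] - costs[-1] <= tol * max(costs[-2], 1.0):
            break
    return r, beta, np.array(costs)

# Toy instance (assumption): 6 targets, 40 one-bit variables, 16-QAM symbols
rng = np.random.default_rng(4)
A = (rng.standard_normal((6, 40)) + 1j * rng.standard_normal((6, 40))) / np.sqrt(2)
s = rng.choice([-3, -1, 1, 3], 6) + 1j * rng.choice([-3, -1, 1, 3], 6)
r, beta, costs = ccd_one_bit_ls(A, s)
```

Because every entry of the alphabet has the same modulus, the best value of a single coordinate depends only on the signs of the real and imaginary parts of $a_k^H t$, which is what makes each coordinate update cheap. The recorded `costs` never increase from one sweep to the next.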
So far there has not been a discussion regarding the derivation of the digital post-coding matrix $G$. For the first proposed architecture, $G$ is formed by placing in its columns the first $RN_{SC}$ left-singular vectors of $\tilde{H}_F$, which are derived by computing its singular value decomposition (SVD). For the second architecture, where the receiver also performs analog processing with its network of phase shifters, we again use the first $RN_{SC}$ left-singular vectors, but this time of the matrix $\tilde{Q}\tilde{H}_F$.
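The construction of the post-coder from dominant left-singular vectors can be sketched as follows. For brevity the example uses a single $K \times M$ channel matrix and $R$ streams instead of the block matrix and $RN_{SC}$ vectors of the article; the function name and the random channel are assumptions.

```python
import numpy as np

def svd_postcoder(H_eff, n_streams):
    """Columns of G are the n_streams dominant left-singular vectors of H_eff."""
    U, _, _ = np.linalg.svd(H_eff, full_matrices=False)
    return U[:, :n_streams]

rng = np.random.default_rng(5)
K, M, R = 12, 50, 2
H = (rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))) / np.sqrt(2)

G = svd_postcoder(H, R)   # K x R matrix with orthonormal columns
```

For the architecture with phase shifters, the same function would be called with the product of the phase-shifter matrix and the channel in place of `H`.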

B. Precoding Solution for System Based on One-Bit DACs/ADCs and a Network of Analog Phase Shifters
We now present the solution for the phase shifting matrix $Q$ that is applied at the receiver in the second proposed system architecture. It is worth noting that $Q$ depends only on the channel and not on the symbols, and therefore there is no need to calculate it at the symbol rate, but only when the channel changes. The problem $(P_5)$ must first be reformulated appropriately in order to be solved using ADMM [56]. To do this we utilize an alternative definition of the nuclear norm [57] as well as the indicator function of the set $\mathcal{U}$, which takes the value $0$ for $Q \in \mathcal{U}$ and $+\infty$ otherwise, where $\mathcal{U}$ is the set of $N_s \times K$ matrices with unit-modulus entries. The problem can now be cast in a separable form as

Its augmented Lagrangian function is given by
where $\Lambda \in \mathbb{C}^{K \times N_s}$ is a matrix of Lagrange multipliers and $\alpha$ is a scalar penalty parameter. The augmented Lagrangian (21) is minimized alternately with respect to the matrices $D$, $F$ and $Q$, followed by a steepest ascent step for the update of the variable $\Lambda$, where $n$ is the iteration index. Sub-problem $(P_{6a})$ admits a closed form solution [57], where $U_i$ and $V_i$ are the unitary matrices composed of the left-singular and right-singular vectors of $H_{F_i}^T F^T$, respectively. The optimization problem $(P_{6b})$ also admits a closed form solution, at which we arrive by setting the gradient of (21) equal to zero. Finally, the sub-problem $(P_{6c})$ is essentially the projection of $F - \Lambda/\alpha$ onto the set $\mathcal{U}$, where $\Pi_{\mathcal{U}}$ denotes the projection function onto the set $\mathcal{U}$, which is defined entry-wise for an arbitrary matrix $M$. The algorithm described above for deriving the phase shifting matrix $Q$ can be seen in Algorithm 2. The algorithm is terminated when the criteria in (26) are met. Finally, we discuss the computational complexity of the proposed algorithms. First, Algorithm 1 has a complexity per iteration of $O(K^2N_{SC}^2)$ for solving problem $(P_2)$ and $O(M^2N_{SC}^2)$ when solving $(P_3)$ and $(P_4)$. This is a significant reduction compared with the complexity that an exhaustive search would require for the same problems, which would be $O(KN_{SC}2^{KN_{SC}})$ for $(P_2)$ and $O(MN_{SC}2^{MN_{SC}})$ for $(P_3)$ and $(P_4)$, respectively. Furthermore, the complexity per iteration of Algorithm 2 for computing the phase shifting matrix $Q$ is dominated by the SVD operation and is $O(MN_sN_{SC})$.
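The projection step onto the set of unit-modulus matrices can be sketched as below. The entry-wise rule (keep the phase, set the modulus to one) is the standard Euclidean projection onto the unit circle; mapping zero entries to $1$, giving $F$ and $\Lambda$ the same shape, and the numerical values used are assumptions of this example.

```python
import numpy as np

def project_unit_modulus(M):
    """Entry-wise projection onto {z : |z| = 1}; zero entries map to 1 (assumed convention)."""
    M = np.asarray(M, dtype=complex)
    mag = np.abs(M)
    out = np.ones_like(M)
    nz = mag > 0
    out[nz] = M[nz] / mag[nz]     # keep the phase, normalise the modulus
    return out

# Toy ADMM iterate (assumption): project F - Lambda/alpha onto the unit-modulus set
rng = np.random.default_rng(6)
N_s, K, alpha = 4, 16, 2.0
F = rng.standard_normal((N_s, K)) + 1j * rng.standard_normal((N_s, K))
Lam = rng.standard_normal((N_s, K)) + 1j * rng.standard_normal((N_s, K))
X = F - Lam / alpha
P = project_unit_modulus(X)
```

Since the constraint acts on each entry independently, the projection costs only one pass over the matrix, so this step is cheap compared with the SVD that dominates each iteration.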

IV. POWER CONSUMPTION MODEL
Since the motivation for researching the proposed systems is the increase of power efficiency, it is essential to provide a model for the power consumption of such systems. Generally, the power consumption of a communication system is given by the sum of the power of the transmitted signal and the static power consumed by the components of the transceiver. Based on appropriate modelling and approximations [58], the power consumption of the transmitter can be expressed in terms of $P_{PA}$, $P_{DAC}(B, F_s)$, $P_{RF}$ and $P_{LO}$, which denote the power consumption of the power amplifiers (PAs), the DACs, the RF components (e.g., filters, mixers) and the local oscillator, respectively.
The power efficiency η of the employed PAs contributes significantly to the overall power consumption of the system and in this work, we assume that the transmitter is equipped with the widely used class B amplifiers, whose power consumption is given by, The power efficiency of class B amplifiers according to [59] is given by, where g(·) denotes the AM-AM conversion and A o,max is the maximum amplitude of the output signal given by A o,max = υA max where υ is the gain of the amplifier and A max is the input reference amplitude. It is assumed that the system uses TWT amplifiers whose AM-AM conversion is given by [59] and its AM-PM conversion by [59] Φ(A) = π 12 where A is the envelope of the input signal given by A = |x T |.
We now provide the power consumption model for a $B$-bit DAC, which, following the analysis in [58], is given by (32), where $V_{dd}$ denotes the power supply voltage, $I_0$ denotes the value of the current source which corresponds to the least significant bit, $C_p$ denotes the parasitic capacitance of the switches that select one of the possible states of the DAC, and $\alpha$ is a factor which is used to model some second order effects. Additionally, the sampling frequency is given by $f_s = 2(2f_b + f_{cor})$, where $f_b$ denotes the employed bandwidth and $f_{cor}$ the corner frequency of the $1/f$ noise [60]. The power consumption model for the multi-antenna receiver can be derived in a similar way. Using the results of [58], the consumed power of the $K$-antenna receiver in Fig. 1 and that of the receiver in Fig. 2 with the network of phase shifters are approximated in terms of $P_{LNA}$ and $P_{ps}$, which denote the power consumption of the low noise amplifiers (LNAs) and of the phase shifters, respectively, and of $P_{ADC}(B, F_s)$, the power consumption of a $B$-bit ADC with sampling frequency $f_s$, given in [58], where $L_{min}$ is the minimum channel length for the employed Complementary Metal Oxide Semiconductor (CMOS) technology. The total power consumption of the system is simply the sum of the power consumed by the transmitter and the receiver.
In this section, the performance of the proposed solutions is evaluated through extensive numerical simulations. Additionally, the monotonic convergence of the proposed algorithms is confirmed through the numerical results. Furthermore, the performance of the proposed systems is compared to that of systems using SVD precoding, which is known to achieve MIMO channel capacity [7], both with and without one-bit quantization.
For the numerical results, a system with $M = 50$ transmit antennas, a channel with $L = 8$ resolvable taps and a cyclic prefix with a length of 12 symbols were assumed. The SNR is defined as the average transmit power over the noise variance, $E\{\|\tilde{x}_T\|_2^2\}/\sigma_z = P/\sigma_z$, where $P$ is the total transmit power and $\sigma_z$ the noise variance. The equality above holds because $\tilde{x}_T \in \mathcal{X}^{MN_{SC} \times 1}$. Finally, the values for the termination criteria were chosen to be $\epsilon_1 = \epsilon_2 = \epsilon_3 = \epsilon_4 = 10^{-15}$.
Deriving theoretical results for the convergence of the proposed algorithmic solutions is a very challenging task due to the non-convexity of the addressed problems. Any such study would require an independent research work and is therefore beyond the scope of the present article. It is possible, however, to evaluate the convergence of the algorithms via numerical simulations.
First, Fig. 3 shows how the cost functions of the optimization problems (P 2 ), (P 3 ) and (P 4 ) are reduced at each iteration. It should be noted that, in this figure, one iteration corresponds to the update of every component of r T and x T . It is observed that in all cases there is monotonic convergence to the minima, which can be intuitively explained by how CCD works. CCD minimizes the cost function over one coordinate at a time, which guarantees that after each coordinate update the cost function has a value less than or equal to its previous one. Additionally, it is observed that, while problems (P 2 )-(P 4 ) are solved by the same algorithm, significantly more iterations are needed for convergence in problems (P 3 ) and (P 4 ). This is because of the different sizes of the optimization variables r T and x T , the first being a KN SC × 1 vector and the second an MN SC × 1 one. It should also be mentioned that problems (P 3 ) and (P 4 ) show exactly the same convergence behavior as they are identical, since Q in (P 4 ) is considered constant when solving the problem.
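The monotonic behavior of coordinate-wise minimization can be illustrated with a minimal sketch. The code below is not the algorithm of this article: it applies generic cyclic coordinate descent to a toy discrete least-squares problem, min ||y - Ax||², with each entry of x restricted to a four-point alphabet. The problem sizes, the alphabet and the function name `ccd_discrete_ls` are assumptions chosen for illustration.

```python
import numpy as np

# Assumed four-point alphabet for each coordinate (illustrative only).
ALPHABET = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

def ccd_discrete_ls(A, y, n_sweeps=5, seed=0):
    """Cyclic coordinate descent for min ||y - A x||^2 with x_i in ALPHABET."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    x = rng.choice(ALPHABET, size=n)
    residual = y - A @ x
    costs = [float(np.vdot(residual, residual).real)]
    for _ in range(n_sweeps):
        for i in range(n):
            # Remove the contribution of coordinate i from the residual.
            base = residual + A[:, i] * x[i]
            # Evaluate every alphabet point for this coordinate and keep the best.
            trial = [np.linalg.norm(base - A[:, i] * s) ** 2 for s in ALPHABET]
            x[i] = ALPHABET[int(np.argmin(trial))]
            residual = base - A[:, i] * x[i]
            costs.append(float(np.vdot(residual, residual).real))
    return x, costs

rng = np.random.default_rng(1)
A = (rng.standard_normal((12, 20)) + 1j * rng.standard_normal((12, 20))) / np.sqrt(2)
y = rng.standard_normal(12) + 1j * rng.standard_normal(12)
x_hat, costs = ccd_discrete_ls(A, y)
```

Since the current value of each coordinate is always among the candidates, the recorded cost cannot increase from one coordinate update to the next (up to floating-point round-off). The sketch records the cost after every coordinate update, whereas Fig. 3 counts one iteration per full sweep.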
Next, Fig. 4 shows how the value of the cost function of the maximization problem (P 5 ) increases with the number of iterations. Again, it is observed that the convergence is monotonic. For ADMM, convergence results have been derived for convex problems that involve two blocks of variables [61]. Here, however, the tackled problem is strongly non-convex with three blocks of variables, and deriving convergence results is a very difficult task that cannot be addressed in the context of the present article.
In Fig. 5, the BER performance of the proposed precoding scheme with one-bit DACs and ADCs is examined for systems with different numbers of receive antennas. In a MIMO system with high-resolution ADCs, one would expect that increasing the number of receive antennas increases the received SNR and therefore the system performance. However, as one can observe in this figure, this is not always the case with one-bit ADCs. For a small number of receive antennas, K = 6, an error floor appears in the BER curve. This is because there are not enough degrees of freedom to achieve a minimum for (P 2 ) that can guarantee error-free communication at high SNRs; that is, the number of one-bit ADCs is not enough to reduce the cost function of (P 2 ) sufficiently. As the number of receive antennas increases to K = 12, the error floor disappears and the BER performance improves significantly because there are more ADCs available that can reconstruct the induced information symbols without significant error. However, when the number of receive antennas increases further to K = 24, the BER performance deteriorates. While a high number of receive antennas provides degrees of freedom to problem (P 2 ), it takes them away from (P 3 ), where the transmitter, which is equipped with M = 50 antennas and one-bit DACs, struggles to find a good minimum, and this leads to an increased transmit power and to the observed performance degradation. Finally, Fig. 5 includes results that assume perfect knowledge of β, rather than estimating it blindly as in the rest of this section. The curves show that the blind estimation of β does not negatively affect the system performance, as the results are identical for the two scenarios. This is because the size of the block and the number of data streams are sufficient for a good blind estimation of β.
Fig. 6. Impact of the number of receive antennas on the performance of a system equipped with a phase shifting network at the receiver employing 16-QAM OFDM with M = 50 antennas, Ns = 12 ADCs, R = 2 data streams and N SC = 32 subcarriers.
Fig. 7. Impact of the number of OFDM subcarriers on the performance of two systems employing 16-QAM, M = 50 antennas and R = 2 data streams. The system that is equipped with the network of phase shifters employs K = 50 antennas and Ns = 12 ADCs while the other one employs K = 12 antennas.
The results in Fig. 5 underline the need for the second proposed system architecture that is shown in Fig. 2 where a phase shifting network maps the signal of the K receive antennas to N s one-bit ADCs. In Fig. 6 the impact of the different number of receive antennas is shown for a system which employs N s = 12 one-bit ADCs and a network of KN s phase shifters. It is observed that as the number of receive antennas almost doubles from 12 to 25 and from 25 to 50 we gain 1 dB and 2 dB in performance respectively. Therefore, we can overcome one of the disadvantages of one-bit ADCs by adding phase shifters at the receiver and using the proposed analog post-coder Q. It is also worth mentioning that the power consumption of phase shifters is significantly smaller than that of the ADCs.
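The receiver structure described above, in which K antenna signals are mapped to N s one-bit ADCs through unit-modulus weights, can be sketched as follows. This is only an illustration under stated assumptions: the phases are drawn at random rather than produced by the proposed post-coder design, the combiner is applied as the conjugate transpose of Q, and the received vector is a random placeholder. The function names `analog_combine` and `one_bit_adc` are hypothetical.

```python
import numpy as np

def one_bit_adc(v):
    """One-bit quantization applied separately to the real and imaginary parts."""
    return np.where(v.real >= 0, 1.0, -1.0) + 1j * np.where(v.imag >= 0, 1.0, -1.0)

def analog_combine(r, phases):
    """Map K antenna signals to N_s streams with unit-modulus weights exp(j*phase)."""
    Q = np.exp(1j * phases)      # K x N_s matrix, every entry has modulus one
    return Q.conj().T @ r        # N_s combined signals fed to the ADCs

rng = np.random.default_rng(0)
K, N_s = 50, 12
r = rng.standard_normal(K) + 1j * rng.standard_normal(K)   # placeholder received vector
phases = rng.uniform(0.0, 2.0 * np.pi, size=(K, N_s))      # random, not the optimized Q
y_q = one_bit_adc(analog_combine(r, phases))
```

The point of the sketch is structural: the number of quantizers is fixed at N s regardless of K, while each of the K·N s phase shifters only rotates the signal it carries.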
Next, in Fig. 7, the effect of the number of sub-carriers is evaluated for the two proposed system architectures. The employed modulation is 16-QAM and three different scenarios are simulated, with N SC = 16, 32 and 64 sub-carriers. It is observed that, for both systems, the performance is almost identical for the different numbers of sub-carriers. In general, real-world systems use OFDM blocks with a large number of sub-carriers and therefore the proposed schemes seem to be an appropriate solution, as the system performance is preserved.
In the next numerical simulation, in Fig. 8, the BER performance of the proposed schemes is compared to that of SVD precoding [7] when 4-QAM and 16-QAM modulation is employed. Additionally, the BER performance of a system that employs one-bit quantized SVD precoding is evaluated, where the transmitted and received signals are coarsely quantized by one-bit DACs and ADCs, respectively. First, when 4-QAM is employed, the gap between SVD precoding that uses full resolution DACs and ADCs and the two proposed one-bit SLP schemes is about 10 dB, while when 16-QAM is employed the respective gap reduces to about 3 dB. This shows that the proposed schemes are more appropriate for higher order modulations. In both cases, the power consumption savings of one-bit DACs and ADCs over full resolution DACs and ADCs are so significant that they more than make up for this performance gap. Furthermore, we observe that the performance gap between the two proposed one-bit SLP schemes remains constant at 2 dB for the different modulation orders. Finally, the BER curves of quantized SVD show an error floor even when 4-QAM is employed. This highlights the need for appropriate precoding schemes, such as the ones proposed here, in large MIMO systems that use one-bit DACs and ADCs.
In Fig. 9 we present a comparison of the proposed schemes with an SLP scheme that was designed in [37] for MIMO systems with one-bit DACs and full resolution ADCs. It is observed that, when the received signal of the scheme in [37] is not quantized by one-bit ADCs and 4-QAM signaling is used, it slightly outperforms the proposed SLP schemes for one-bit DACs and ADCs. However, when one-bit quantization is applied at the receiver, the proposed techniques significantly outperform the competing SLP technique, which is slowly driven to an error floor because of the high quantization error. This validates the need for a scheme like the one proposed in this work, where the problem of one-bit DACs and ADCs is jointly considered.
In Fig. 10, the effects of a PA that is modelled according to (30)-(31) on the BER performance are evaluated. The gain of the amplifier is set to υ = 4 and the input reference amplitude to A max = 1. The solid lines are used to plot the BER curves of systems that use the non-linear PAs, while the dashed lines correspond to systems that do not take into account the effects of the PAs and have been added for reference. By inspection, it is observed that the introduction of the PAs leads to a constant 2 dB performance degradation of the two proposed one-bit SLP schemes. The BER performance of SVD precoding, in contrast, degrades severely when the effects of non-linear PAs are considered. This is because the amplitude of the transmitted signal is not constant, as it is in the case of one-bit SLP, and this leads to significant amplitude and phase distortion. On the other hand, the advantage of the one-bit SLP schemes is that the amplitude of the transmitted symbols is constant across time and across the antennas, which leads to a uniform distortion of amplitude and phase.
Fig. 9. Comparison of the BER performance of the proposed SLP schemes with the SLP precoding in [37] for an OFDM system with M = 50, N SC = 32 sub-carriers and R = 2 data streams. The systems without phase shifters have K = 12 receive antennas and ADCs while the systems equipped with phase shifters have K = 50 receive antennas and Ns = 12 ADCs.
This can be better observed in Fig. 11, which shows the effect of non-linear amplification on one-bit SLP and SVD precoding and how this in turn affects the received constellation. In scatter-plots (a) and (c), the input signals to the PA are shown in black and the output signals from the PA in blue, while in scatter-plots (b) and (d) the black points correspond to the original constellation points and the blue points to the received symbols for SLP based on one-bit DACs/ADCs and for SVD precoding, respectively. The proposed SLP scheme produces signals with constant amplitude across all time instances and therefore the non-linear amplification introduces constant amplitude and phase distortion. On the other hand, the SVD-based system produces transmit signals with varying amplitude, and the non-linear amplification introduces phase and amplitude distortions that depend on the amplitude of the input signal to the PA. This explains why the received symbols of the SVD-based system are significantly more scattered than the symbols of the proposed one-bit SLP scheme.
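The difference between uniform and amplitude-dependent distortion can be reproduced with a small sketch. The PA model of (30)-(31) is not restated here, so the code substitutes a generic memoryless AM/AM and AM/PM characteristic of the Saleh form with arbitrary illustrative coefficients. The function name `saleh_pa`, its parameters and the two test signals are assumptions and do not correspond to the simulation setup of the article.

```python
import numpy as np

def saleh_pa(x, a_a=2.0, b_a=1.0, a_p=1.0, b_p=1.0):
    """Memoryless PA sketch: amplitude-dependent gain and phase rotation."""
    r = np.abs(x)
    gain = a_a / (1.0 + b_a * r**2)            # AM/AM: output amplitude is gain * r
    phase = a_p * r**2 / (1.0 + b_p * r**2)    # AM/PM: rotation grows with amplitude
    return gain * x * np.exp(1j * phase)

rng = np.random.default_rng(0)
n = 1000
# Constant-envelope input: unit-modulus samples with four possible phases.
const_env = np.exp(1j * (np.pi / 4 + np.pi / 2 * rng.integers(0, 4, n)))
# Varying-envelope input: complex Gaussian samples with unit average power.
varying = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)

# Complex gain experienced by each sample (output divided by input).
spread_const = float(np.std(saleh_pa(const_env) / const_env))
spread_var = float(np.std(saleh_pa(varying) / varying))
```

Under this model, every constant-envelope sample sees the same complex gain, so the distortion amounts to a common scaling and rotation, whereas the gain of the varying-envelope signal changes from sample to sample and scatters the constellation.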
The previous figures showed that the BER performance of the proposed schemes is a significant improvement over the BER of other quantized precoders. However, these figures do not provide a clear relationship between BER performance and power consumption. To this end, we introduce the metric of energy efficiency as defined in [62], where P e kn is the bit error probability per UT and per subcarrier, b is the number of bits per constellation symbol and P denotes the power that is consumed by the transceiver and is given by (36). In Fig. 12 it is observed that the proposed SLP scheme without a network of phase shifters at the receiver provides the best energy efficiency in the high SNR region, while in the low SNR region the quantized SVD precoder has a slight edge. On the other hand, the proposed SLP scheme with the phase shifters yields only average energy efficiency, below that of the competing schemes, because it has 50 receive antennas instead of 12 and a large network of phase shifters that consume power. However, this scheme is still very useful in scenarios where we want to improve performance without increasing the transmit power, as the BER performance gains come from the analog combining of the phase shifters. Finally, another interesting observation is that the SLP for one-bit DACs proposed in [37] can surpass the energy efficiency of the quantized SVD at high SNRs but not the energy efficiency of the proposed SLP for one-bit DACs and ADCs.
In order to show the complexity of the different methods, we present a table with the runtime of the different precoding schemes. This, in combination with the BER performance figures presented in the article, illustrates the trade-off between complexity and performance. From Table I it can be seen that the SLP schemes have similar runtimes, but the SVD, which is a block level precoder, is significantly faster, as expected. Therefore, it is clear that the BER performance gains of the proposed schemes come at the cost of computational complexity. It is also worth mentioning that the runtime is reduced when compared to the SLP technique for one-bit DACs, which shows that we have significantly optimized our algorithmic solution.
Finally, the numerical results are concluded with Figures 13 and 14, where the total power consumption of the two proposed system architectures is compared with that of a MIMO system based on 14-bit DACs and ADCs. The power consumption of these systems is calculated by plugging the parameters of Table II into equations (27)-(36). First, in Fig. 13 the power consumption of the systems in Figures 1 and 2 is plotted for a fixed number of transmit antennas in order to observe the effect of the number of receive antennas on the power consumption. The system with the network of phase shifters consumes about 1 dBW more than the system without it, but as the number of receive antennas increases for both systems the gap becomes even smaller. This is because the number of ADCs remains constant, N s = 12, for the system with the phase shifters. On the other hand, the same MIMO system based on 14-bit DACs and ADCs has an increased power consumption of more than 9 dBW when compared to the two proposed systems based on one-bit DACs and ADCs. In Fig. 14 the number of receive antennas remains fixed to K = 12 and the power consumption is plotted for different numbers of transmit antennas.
Again we see that the gap between the two proposed systems is less than 0.5 dBW and is due to the increased power consumption of the network of phase shifters. On the other hand, the gap between the proposed systems and the conventional system based on 14-bit DACs and ADCs is about 1 dBW for 1 transmit antenna but widens to 12 dBW when the number of transmit antennas increases to 100. These results show that the proposed system architectures are very suitable for large-scale MIMO systems as they manage to reduce the power consumption significantly.
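To put the logarithmic gaps quoted above into linear terms, the short sketch below converts a power gap in decibels into a power ratio. The helper names are hypothetical, and only the gap values stated in the text are used as inputs.

```python
import math

def db_to_ratio(gap_db):
    """Convert a power gap in decibels to a linear power ratio."""
    return 10.0 ** (gap_db / 10.0)

def watts_to_dbw(p_watts):
    """Express a power in watts on the dBW scale (0 dBW corresponds to 1 W)."""
    return 10.0 * math.log10(p_watts)

# Gaps between the conventional 14-bit system and the proposed systems (Fig. 14).
gaps_db = {"1 transmit antenna": 1.0, "100 transmit antennas": 12.0}
ratios = {label: db_to_ratio(g) for label, g in gaps_db.items()}
```

A gap of 12 on the logarithmic scale corresponds to a power ratio of roughly 16, whereas a gap of 1 corresponds to a ratio of about 1.26.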

VI. CONCLUSION
In this article, two MIMO-OFDM systems with one-bit DACs and ADCs were presented and an appropriate SLP solution was proposed. The precoding design was formulated as a non-linear least squares problem and, in order to be tackled efficiently, it was split into two NP-hard mixed discrete-continuous constrained problems. The problems were solved efficiently with an iterative algorithm based on the CCD framework. Additionally, the second proposed system architecture required the design of the analog post-coding matrix, which was formulated as a nuclear norm maximization problem with a unit-modulus constraint, and an efficient ADMM-based solution was proposed. Furthermore, an approximate power consumption model was derived for the two proposed transceivers. The numerical results showed the necessity of appropriate precoding schemes for transceivers with one-bit DACs and ADCs, as the optimal SVD-based precoding for full resolution DACs and ADCs could not guarantee error-free communication when quantized to the desired one-bit precision. In addition, the results showed that the proposed one-bit SLP schemes were significantly less affected by the non-linearities of the PAs, as the transmitted signals have a constant envelope across time and across the transmit antennas. Thus, the proposed approaches for large MIMO-OFDM systems look very appealing, as they lead to a significant reduction of power consumption without losing too much in performance when compared to conventional systems.