Photonics-Assisted Millimeter-Wave Communication System Based on Low-Bit Gaussian Mixture Model Adaptive Vector Quantization

Low-bit-resolution quantization can significantly reduce the cost and power consumption of analog-to-digital converters (ADCs), and will therefore play an important role in energy conservation and cost reduction for upcoming B5G millimeter-wave (MMW) communication systems. In this paper, we propose and experimentally demonstrate, for the first time, a low-bit Gaussian mixture model (GMM) based non-uniform adaptive vector quantization (AVQ) scheme for a low-cost intensity-modulated, envelope-detection photonics-assisted 28 GHz MMW communication system. The principles of GMM-based one-dimensional adaptive scalar quantization (ASQ) and multi-dimensional AVQ are first introduced and then used to realize low-bit non-uniform adaptive quantization, reducing the ADC bit resolution of the MMW receiver. Furthermore, the performance of traditional uniform quantization, the existing K-means-based scheme and the proposed GMM-based non-uniform ASQ/AVQ schemes is evaluated and compared in detail. With the proposed GMM-based AVQ scheme, the ADC quantization resolution in our MMW receiver can be reduced from the 5 bits required by traditional uniform quantization to as low as 2 bits, without noticeable performance penalty. Moreover, compared with the K-means-based quantization scheme, the MMW receiver enabled by the GMM-based ASQ/AVQ scheme saves about half of the quantization time at similar performance. This is mainly because probability-based clustering converges faster than Euclidean-distance-based clustering, which significantly reduces the number of iterations required. Therefore, the GMM-based AVQ scheme is a promising solution for realizing high-performance ADCs with low-bit resolution in future MMW-enabled optical fiber wireless access networks.


I. INTRODUCTION
WITH the explosive growth of emerging Internet services such as virtual reality, digital twins, holographic projection and the metaverse, the throughput and bandwidth of wireless communication networks are increasing dramatically [1]. The photonics-assisted millimeter-wave (MMW) communication system, which supports large-capacity communication and wide-area coverage of MMW signals by means of a radio-over-fiber architecture, has become a key technology for the upcoming beyond-5G (B5G) and 6G era [2], [3]. However, the ever-increasing throughput and bandwidth place higher requirements on the overall cost and power consumption of such MMW systems, which mainly depend on the modulation and detection methods as well as the signal quantization technique. In the photonics-assisted MMW communication system, simple intensity modulation with a high-order modulation format such as four-level pulse amplitude modulation (PAM4), together with low-cost envelope detection based on a Schottky barrier diode, is an effective strategy to reduce cost and power consumption [4]. Accordingly, the quantization technique has become a research hotspot and the next breakthrough point for energy conservation and cost reduction. In particular, under the scenario of massive machine-type communication in future B5G MMW systems, the cost and power consumption of the high-bit-resolution analog-to-digital converters (ADCs) used for high-speed signal quantization at the MMW receiver end is a vital challenge [5]. At a given sampling rate, the cost and power consumption of an ADC mainly depend on its quantization bit resolution. Therefore, realizing low-bit quantization without inducing an obvious performance penalty is the key to solving the above problem.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Uniform quantization is the simplest quantization method, and it usually requires a high bit resolution to ensure adequate performance. Clipping [6], [7] and peak-to-average power ratio (PAPR) mitigation [8] can be used to reduce the number of quantization bits needed. However, both approaches achieve only a limited reduction in quantization bit resolution because of the extra noise induced by the clipping or PAPR reduction. Non-uniform quantization is another effective method to reduce the required number of quantization bits. To date, several non-uniform quantization schemes with low-bit resolution have been presented, improving the quantization performance to some extent [9], [10], [11], [12], [13], [14], [15]. Specifically, these non-uniform quantization schemes mainly include A-law quantization [9], [10], Lloyd quantization [11], non-parametric histogram estimation (NPHE) [12], Gaussian-like estimation (GLE) [13], [14] and other novel quantization schemes [15]. Nevertheless, these schemes either have fixed quantization levels that cannot be adjusted adaptively according to the actual distribution of the input signal, or require an additional estimation of the probability density function (PDF) of the signal, which increases the complexity of the quantization operation. Fortunately, novel low-bit adaptive quantization schemes without PDF estimation have also been proposed and demonstrated [16], [17], [18], [19], [20], [21], [22], [23], [24]. Among them, K-means-based non-uniform quantization is a commendable scheme and has attracted extensive attention in recent years.
Although the K-means quantization method has been shown to provide excellent quantization performance [21], its hard decision based on Euclidean-distance clustering demands a large number of iterations, which brings high complexity [16].
In this paper, for the first time, we present and experimentally verify an adaptive vector quantization (AVQ) scheme based on the Gaussian mixture model (GMM) clustering algorithm for the photonics-assisted MMW communication system. The performance of different quantization schemes, including traditional uniform quantization, the existing K-means-based scheme and the proposed GMM-based one-dimensional adaptive scalar quantization (ASQ) and two-dimensional AVQ, is evaluated in detail. A detailed comparison between this work and previous reports is summarized in Table I. The key advantages of the proposed photonics-assisted 28 GHz MMW system based on the GMM AVQ scheme fall into three aspects:
(1) The scheme supports vector quantization, and the quantization levels are adjusted adaptively according to the actual distribution of the MMW signal without PDF estimation.
(2) A quantization bit resolution as low as 2 bits can be realized in the proposed photonics-assisted MMW communication system enabled by the GMM-based AVQ scheme, which significantly reduces the cost and power consumption of the ADCs at the MMW receiver end compared with the traditional uniform quantization scheme.
(3) The proposed GMM-based ASQ and AVQ schemes achieve performance similar to that of the K-means approach, while the running time is reduced by about half under the same convergence threshold.
This paper is organized in four sections. Section II introduces the principles of the K-means-based and GMM-based AVQ schemes. Section III presents the experimental verification and discusses the results. Section IV gives the main conclusions.

A. Principle of Vector Quantization
Suppose the target dataset contains a total of $N_0$ scalar samples to be quantized. Generally, scalar samples are quantized on a one-dimensional scale. However, to reduce the quantization distortion and improve the quantization performance, vector quantization is performed on a multi-dimensional scale. In this case, the $N_0$ scalar samples are first regrouped into $N$ vector samples. Here, $d$ denotes the dimension of vector quantization. For the ASQ scheme, the dataset is the original scalar set $\{x_1, x_2, \ldots, x_N\}$, where $N = N_0$. For the AVQ scheme with the same total number of samples, the vector dataset is $\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N\}$, where $N = \lceil N_0/d \rceil$ and $\lceil A \rceil$ denotes rounding $A$ up. The constructed $i$-th vector sample can be expressed as
$$\mathbf{x}_i = [x_{(i-1)d+1}, x_{(i-1)d+2}, \ldots, x_{id}].$$
Suppose the total number of quantization levels is $M$. Then $\{\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_M\}$ represents the codebook of vector quantization, and $\mathbf{c}_j$ is the $j$-th vector quantization level (i.e., codeword) in the codebook. Each vector sample $\mathbf{x}_i$ and each codeword $\mathbf{c}_j$ is composed of $d$ scalars. The process of vector quantization is to match each vector sample in the dataset $\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N\}$ to a codeword in the codebook $\{\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_M\}$.
Vector quantization is applicable to both real-valued and complex-valued signals. In theory, the original scalar samples can be taken from the real or imaginary part of the signal alone, or from a joint combination of the real and imaginary parts of a complex-valued signal. In practice, however, the former is the preferred solution, because the scalars in each vector sample $\mathbf{x}_i$ are then correlated owing to symbol up-sampling, whereas in the latter case they are not. Samples up-sampled from the same symbol are numerically close to each other. As a result, the vector samples constructed from these scalar samples are concentrated along the positive-proportional line (e.g., $y = x$ for two-dimensional vector quantization) [21].
Benefitting from this correlation, multi-dimensional vector quantization can further reduce the quantization distortion compared with the traditional uniform quantization scheme.
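The regrouping step described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `regroup_scalars` and the choice to pad an incomplete last vector by repeating the final sample are assumptions, since the text does not state how a remainder is handled when $N_0$ is not a multiple of $d$.

```python
import numpy as np

def regroup_scalars(samples, d):
    """Regroup N0 scalar samples into N = ceil(N0 / d) vectors of dimension d."""
    samples = np.asarray(samples, dtype=float)
    n_vectors = -(-samples.size // d)            # ceiling division: N = ceil(N0 / d)
    pad = n_vectors * d - samples.size           # scalars missing from the last vector
    # Assumption: fill an incomplete last vector by repeating the final sample.
    padded = np.pad(samples, (0, pad), mode="edge")
    # Row i holds d consecutive scalars, i.e. the i-th vector sample.
    return padded.reshape(n_vectors, d)

# Seven scalars with d = 2 give four vector samples; the last one is padded.
vectors = regroup_scalars([0.1, 0.2, 0.9, 1.0, -0.5, -0.4, 0.3], d=2)
print(vectors.shape)
```

Because consecutive scalars are kept together, up-sampled values from the same symbol land in the same vector, which is the correlation that vector quantization exploits.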
The cost and power consumption of the quantizer are directly related to the number of quantization bits per sample (QBs/Sa). Notably, $M$ quantization levels correspond to $\log_2 M$ quantization bits for both the ASQ and AVQ schemes. For the one-dimensional ASQ scheme, the QBs/Sa is therefore $\log_2 M$. For the $d$-dimensional AVQ scheme, however, each vector $\mathbf{x}_i$ contains $d$ scalar samples, so the QBs/Sa is $(\log_2 M)/d$. On the other hand, to achieve adaptive quantization, the quantization levels in the codebook should be adjusted adaptively according to the distribution of the input signal. In the following, AVQ schemes based on the K-means and GMM clustering algorithms are presented to update the codebook adaptively.
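As a small worked example of the QBs/Sa relation, the sketch below evaluates $(\log_2 M)/d$ and its inverse. The helper names `qbs_per_sample` and `levels_for_budget` are hypothetical and introduced only for illustration.

```python
import math

def qbs_per_sample(num_levels, d):
    """Quantization bits per sample for M levels and dimension d: log2(M) / d."""
    return math.log2(num_levels) / d

def levels_for_budget(qbs, d):
    """Codebook size M that corresponds to a given QBs/Sa and dimension d."""
    return round(2 ** (qbs * d))

print(qbs_per_sample(4, 1))     # one-dimensional ASQ with 4 levels -> 2.0
print(qbs_per_sample(16, 2))    # two-dimensional AVQ with 16 codewords -> 2.0
print(levels_for_budget(2, 2))  # 2 QBs/Sa with d = 2 -> 16 codewords
```

At the same QBs/Sa, the two-dimensional codebook therefore holds more codewords than the one-dimensional one (16 versus 4 at 2 bits per sample).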

B. Principle of K-means-Based AVQ Scheme
Based on the idea of hard decision, the K-means clustering algorithm uses the Euclidean distance to divide a dataset of $N$ samples into $M$ clusters. Firstly, the centroids $\{\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_M\}$ of the $M$ clusters are initialized randomly. Secondly, according to the nearest-neighbor principle, each sample of the input dataset $\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N\}$ is assigned in turn to the cluster with the closest centroid. Thirdly, the centroid of each cluster is updated by averaging its samples according to
$$\mathbf{c}_j = \frac{1}{|S_j|} \sum_{\mathbf{x}_i \in S_j} \mathbf{x}_i, \quad j = 1, 2, \ldots, M,$$
where $S_j$ denotes the set of samples assigned to the $j$-th cluster and $|S_j|$ is the number of samples in it. The quantization distortion is assessed through the sum of squared differences between each sample and its quantized codeword, which can be expressed as
$$D_P = \sum_{j=1}^{M} \sum_{\mathbf{x}_i \in S_j} \|\mathbf{x}_i - \mathbf{c}_j\|^2,$$
where $P$ represents the iteration number. A convergence threshold can be preset, and the iteration stops once the relative change between two successive quantization distortions, i.e., $|D_{P+1} - D_P| / D_P$, falls below this threshold. To a certain extent, multiple iterations can mitigate the problem of K-means clustering converging to a local optimum. With the above clustering algorithm, the codebook $\{\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_M\}$ of K-means non-uniform quantization is updated adaptively according to the distribution of the input signal, without calculating its PDF. The concept diagrams of traditional uniform quantization, K-means-based one-dimensional ($d = 1$) ASQ and two-dimensional ($d = 2$) AVQ are shown in Fig. 1. Unlike in traditional uniform quantization, the quantization levels of K-means-based ASQ are unevenly distributed on the one-dimensional scale, so that more quantization levels can be allocated to samples in dense areas, which lowers the quantization distortion. For the K-means-based AVQ scheme, each two-dimensional vector $\mathbf{x}_i$ falls into a Voronoi cell bounded by Voronoi boundaries and is quantized to the corresponding centroid $\mathbf{c}_j$.
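The three steps and the stopping rule can be illustrated with the following NumPy sketch. It is not the implementation used in the experiment: the function name `kmeans_codebook`, the initialization from randomly chosen samples, the handling of empty clusters and the synthetic PAM4-like test data are all assumptions made for illustration.

```python
import numpy as np

def kmeans_codebook(x, num_levels, threshold=1e-3, max_iter=100, seed=0):
    """Train a K-means codebook on samples x of shape (N, d)."""
    rng = np.random.default_rng(seed)
    # Assumption: random initialization by picking M distinct samples as centroids.
    codebook = x[rng.choice(len(x), size=num_levels, replace=False)].copy()
    prev_distortion = None
    for iteration in range(1, max_iter + 1):
        # Nearest-neighbor assignment (hard decision on squared Euclidean distance).
        dist2 = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        labels = dist2.argmin(axis=1)
        # Centroid update: mean of the samples assigned to each cluster.
        for j in range(num_levels):
            members = x[labels == j]
            if len(members) > 0:                 # assumption: empty clusters stay put
                codebook[j] = members.mean(axis=0)
        # Distortion: sum of squared differences between samples and codewords.
        distortion = ((x - codebook[labels]) ** 2).sum()
        # Stop once the relative change of the distortion is below the threshold.
        if prev_distortion is not None and \
                abs(prev_distortion - distortion) <= threshold * prev_distortion:
            break
        prev_distortion = distortion
    return codebook, labels, iteration

# Synthetic PAM4-like data: two correlated samples per symbol (d = 2).
rng = np.random.default_rng(1)
symbols = rng.choice([-3.0, -1.0, 1.0, 3.0], size=2000)
vectors = (np.repeat(symbols, 2) + 0.2 * rng.standard_normal(4000)).reshape(-1, 2)

codebook, labels, n_iter = kmeans_codebook(vectors, num_levels=16)
quantized = codebook[labels]                     # each vector replaced by its codeword
print(codebook.shape, n_iter)
```

With 16 codewords and $d = 2$, this toy configuration corresponds to 2 QBs/Sa.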

C. Principle of GMM-Based AVQ Scheme
Unlike the K-means clustering algorithm, which takes the Euclidean distance as the decision criterion and assigns each sample to exactly one cluster, the GMM models the target samples as a mixture of several sub-Gaussian distributions and gives the probability that each sample belongs to each of the $M$ clusters. As a result, every sample belongs to multiple clusters at the same time, and the final clustering decision is made according to the maximum probability. The overall PDF of the GMM, formed by the weighted sum of $M$ Gaussian densities, can be expressed as
$$p(x) = \sum_{j=1}^{M} \omega_j \, \varphi(x \mid \mu_j, \sigma_j^2),$$
where $\omega_j$ is the mixture weight of the $j$-th component, and $\varphi(x \mid \mu_j, \sigma_j^2)$ is the Gaussian density of the $j$-th component, which follows the Gaussian distribution $\mathcal{N}(\mu_j, \sigma_j^2)$ with mean $\mu_j$ and variance $\sigma_j^2$. The mean and variance describe the location and spread of the samples, respectively. The weights sum to 1, i.e., $\sum_{j=1}^{M} \omega_j = 1$. It should be noted that the overall PDF of the GMM is given here only to help explain the principle of the GMM; it does not need to be estimated in the actual GMM quantization process. The hidden parameters of the GMM are collectively denoted by $\lambda_j = \{\omega_j, \mu_j, \sigma_j^2\}$, $j = 1, 2, \ldots, M$. These parameters are preset with random initial values and then optimized by the expectation maximization (EM) algorithm [25], which contains two steps: step E and step M.
Firstly, the parameters $\lambda_j$ are initialized with $\omega_j = 1/M$, $\sigma_j^2 = 0$, and $\mu_j$ set to random quantization levels. Secondly, step E calculates the posterior probability $\gamma_{ij}$ that the $i$-th sample $x_i$ comes from the $j$-th sub-Gaussian component, which can be expressed as
$$\gamma_{ij} = \frac{\omega_j \, \varphi(x_i \mid \mu_j, \sigma_j^2)}{\sum_{k=1}^{M} \omega_k \, \varphi(x_i \mid \mu_k, \sigma_k^2)}.$$
Thirdly, step M updates the parameter set $\lambda_j$ ($j = 1, 2, \ldots, M$), written here for scalar samples, by
$$\mu_j = \frac{\sum_{i=1}^{N} \gamma_{ij} x_i}{\sum_{i=1}^{N} \gamma_{ij}}, \quad \sigma_j^2 = \frac{\sum_{i=1}^{N} \gamma_{ij} (x_i - \mu_j)^2}{\sum_{i=1}^{N} \gamma_{ij}}, \quad \omega_j = \frac{1}{N} \sum_{i=1}^{N} \gamma_{ij}.$$
Next, steps E and M are repeated until the change in the parameters $\lambda_j$ between two successive iterations falls below the convergence threshold. Finally, the quantization is performed according to the probability $\varphi(x \mid \mu_j, \sigma_j^2)$: each vector sample $x_i$ is matched to the mean $\mu_j$ of the sub-Gaussian model with the maximum probability. In theory, the GMM can fit any type of signal distribution, provided the number of sub-Gaussian components is not limited. Fig. 2 shows the PDF contour diagram for two-dimensional GMM-based AVQ with four sub-Gaussian models. The target samples are generated from four Gaussian distributions, shown in different colors. After GMM clustering, these samples are divided into four sub-Gaussian models, and the PDF contours of each sub-Gaussian distribution agree closely with the distribution of the target samples, which indicates that a good quantization result can be achieved.
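The EM procedure and the final maximum-probability decision can be sketched as follows for the one-dimensional (ASQ) case. This is an illustrative sketch rather than the authors' implementation. The function names are hypothetical, the convergence test on the largest change of the means is an assumption, and the variances are started at the sample variance (also an assumption) so that the Gaussian densities are well defined in the first E-step. The final decision uses the posterior probability $\gamma_{ij}$.

```python
import numpy as np

def _responsibilities(x, weights, means, variances):
    """Step E: posterior probability gamma[i, j] of component j for sample i."""
    diff = x[:, None] - means[None, :]
    dens = np.exp(-0.5 * diff ** 2 / variances) / np.sqrt(2.0 * np.pi * variances)
    weighted = weights * dens
    return weighted / np.maximum(weighted.sum(axis=1, keepdims=True), 1e-300)

def gmm_quantize(x, num_levels, threshold=1e-3, max_iter=200, seed=0):
    """Fit a 1-D GMM by EM and quantize each sample to its most probable mean."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    weights = np.full(num_levels, 1.0 / num_levels)        # omega_j = 1 / M
    means = rng.choice(x, size=num_levels, replace=False)  # random initial levels
    variances = np.full(num_levels, x.var())               # assumption: nonzero start
    for iteration in range(1, max_iter + 1):
        gamma = _responsibilities(x, weights, means, variances)
        # Step M: re-estimate means, variances and weights from the posteriors.
        nk = gamma.sum(axis=0) + 1e-12
        new_means = (gamma * x[:, None]).sum(axis=0) / nk
        variances = (gamma * (x[:, None] - new_means) ** 2).sum(axis=0) / nk + 1e-9
        weights = nk / nk.sum()
        shift = np.max(np.abs(new_means - means))
        means = new_means
        if shift < threshold:        # assumption: convergence judged on the means
            break
    # Final decision: the component with the maximum posterior probability.
    labels = _responsibilities(x, weights, means, variances).argmax(axis=1)
    return means, means[labels], iteration

# Synthetic noisy PAM4-like samples, quantized with M = 4 levels (2 QBs/Sa).
rng = np.random.default_rng(1)
samples = rng.choice([-3.0, -1.0, 1.0, 3.0], size=4000) + 0.2 * rng.standard_normal(4000)
levels, quantized, n_iter = gmm_quantize(samples, num_levels=4)
print(np.sort(levels), n_iter)
```

Like any EM fit, the result depends on the random initialization, so the learned levels are not guaranteed to coincide with the four PAM4 amplitudes.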

A. Experimental Setup
The experimental setup for 28 GHz photonics-assisted MMW communication system based on the low-bit non-uniform GMM AVQ scheme, is shown in Fig. 3. The intensity modulation with PAM4 signal and low-cost envelope detection are used to simplify the system structure. At the transmitting end, two external cavity lasers (ECLs) with the wavelengths of 1550.172 nm and 1550.396 nm are first coupled by an optical coupler (OC), and then are sent to an integrated Mach-Zehnder modulator (MZM) with built-in driving amplifier for electro-optic modulation. The PAM4 signal with the baud rate of 3 GBd is produced in the transmitting DSP and generated by an arbitrary waveform generator (AWG) with the sampling rate of 92 GSa/s. At the transmitting DSP, as shown in Fig. 3(a), the PAM4 symbols mapped from the pseudo-random binary sequence (PRBS) are  shaped by a root-raised-cosine (RRC) filter with a roll-off factor of 0.1 after 8 times up-sampling. This up-sampling operation guarantees the correlation between adjacent PAM4 samples, and thus can contribute to the performance improvement for multi-dimensional vector quantization. The modulated optical spectrum after MZM is shown in Fig. 3(b). The frequency interval between the two optical wavelengths is 28 GHz. Considering the photonics-assisted MMW indoor and outdoor continuous coverage scenario, a 5 km standard single-mode fiber (SSMF) is sufficient to provide the signal access from outside to indoor [26]. After fiber transmission, a variable optical attenuator (VOA) is applied to adjust the received optical power (ROP) for sensitivity measurement. At the optical wireless conversion side, whose photo is shown in Fig. 3(c), a single-ended PD with 3 dB bandwidth of 40 GHz is used to down-convert the optical signal into 28 GHz MMW through optical heterodyne beating. 
Afterwards, the generated 28 GHz MMW signal is launched into free space after being amplified by one low-noise amplifier (LNA) and one power amplifier (PA) with a total gain of about 43 dB. A pair of horn antennas (HAs) working in the Ka-band (26.5 to 40 GHz) is employed for 1.6-m wireless transmission. At the wireless receiver end, the received 28 GHz MMW signal is converted back to baseband by an envelope detector (ED) with a 3 dB bandwidth of over 500 MHz. Finally, the obtained baseband signal is amplified and then captured by a 64 GSa/s real-time digital storage oscilloscope (DSO) for offline signal processing. In the receiving DSP, as shown in Fig. 3(d), the captured signal is first quantized by a digitally emulated ADC. For comparison, different quantization schemes, including traditional uniform quantization, the existing K-means-based scheme and the proposed GMM-based ASQ and AVQ, are evaluated. Subsequently, resampling, synchronization, matched filtering, down-sampling, least mean square (LMS) equalization, PAM4 demapping and bit error rate (BER) calculation are performed in turn in the receiving DSP.
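As a quick consistency check on the setup, the beat frequency produced by the two ECL wavelengths can be computed from $f = c/\lambda$. The sketch below assumes that the quoted wavelengths are vacuum wavelengths. The function name `beat_frequency_ghz` is hypothetical.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def beat_frequency_ghz(lambda1_nm, lambda2_nm):
    """Frequency spacing (GHz) between two optical carriers given in nm."""
    f1 = C / (lambda1_nm * 1e-9)
    f2 = C / (lambda2_nm * 1e-9)
    return abs(f1 - f2) / 1e9

spacing_ghz = beat_frequency_ghz(1550.172, 1550.396)
print(f"{spacing_ghz:.2f} GHz")
```

The result is about 27.9 GHz, in line with the stated 28 GHz spacing and inside the 26.5 to 40 GHz Ka-band range of the horn antennas.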

B. Experimental Results
We first investigate and compare the performance of adaptive GMM quantization with the other quantization schemes at different QBs/Sa values under a fixed ROP. The results are shown in Fig. 4(a). For the traditional uniform quantization scheme, the BER drops sharply as the QBs/Sa increases, especially when the QBs/Sa is insufficient. Specifically, the BER improves by more than an order of magnitude when the QBs/Sa is increased from 2 bits to 3 bits. Moreover, limited by its rigid quantization levels, uniform quantization requires at least 5 bits to ensure good and stable BER performance. In contrast, benefitting from the flexible and adaptive quantization codebook, both the K-means-based and GMM-based ASQ/AVQ schemes significantly reduce the QBs/Sa requirement. For instance, the one-dimensional ASQ schemes reduce the required QBs/Sa from 5 bits to about 3 to 4 bits, and this value is further reduced to 2 bits with the two-dimensional AVQ scheme. The proposed GMM scheme achieves performance similar to that of the K-means scheme, regardless of whether one-dimensional ASQ or two-dimensional AVQ is used.
To further investigate the quantization performance of the different schemes, Fig. 4(b) shows the quantization noise (QN) improvement versus QBs/Sa for the K-means-based and GMM-based ASQ/AVQ schemes, relative to the uniform quantization scheme. The QN improvement is calculated from the quantization variance, which represents the level of quantization distortion. That is, $\Delta \mathrm{QN} = 10 \log_{10}(\mathrm{Var}_0) - 10 \log_{10}(\mathrm{Var}_x)$, where $\mathrm{Var}_0$ and $\mathrm{Var}_x$ are the quantization variances of the uniform quantization scheme and of the non-uniform quantization scheme (ASQ-K-means, ASQ-GMM, AVQ-K-means or AVQ-GMM), respectively. It can be seen from Fig. 4(b) that the QN improvement of ASQ and AVQ over uniform quantization increases as the QBs/Sa decreases. This is mainly because the smaller the QBs/Sa, the larger the difference in quantization variance between rigid uniform quantization and the ASQ/AVQ non-uniform quantization schemes. For the one-dimensional ASQ scheme, the QN improvement is between 2.5 dB and 4 dB. With the two-dimensional AVQ scheme, the improvement increases to more than 10 dB, which implies that better quantization performance can be achieved. Fig. 5 shows the time-domain waveform segments and recovered PAM4 level diagrams after quantization by the different schemes under a fixed QBs/Sa. In each time-domain waveform segment, the processed waveform (blue curve) is overlaid on the original waveform (red curve), so that the distortion of the quantized samples can be observed intuitively. For a large QBs/Sa of 5 bits, the quantization distortion is insignificant even with the uniform quantization scheme, and there is no significant difference among the level diagrams of the five quantization schemes.
However, after the QBs/Sa is reduced to 2 bits, the quantization distortion visible in the waveform diagrams increases successively from the two-dimensional AVQ schemes to the one-dimensional ASQ schemes to uniform quantization. In particular, the performance of uniform quantization degrades notably in this case, as can be observed from its cluttered level diagram. For the two-dimensional K-means-based and GMM-based AVQ schemes, in contrast, although there is a slightly observable rise in quantization distortion, the influence on the level distribution is basically negligible compared with the case of 5 QBs/Sa.
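The QN improvement defined above can be illustrated numerically. In the sketch below, the quantization variance is taken to be the mean squared quantization error, which is an assumption about the exact definition. The non-uniform quantizer is a simple one-dimensional Lloyd (K-means style) refinement started from the uniform levels, used here only as a stand-in for the adaptive schemes. The test signal is synthetic.

```python
import numpy as np

def uniform_levels(x, bits):
    """Mid-points of 2**bits equal-width cells spanning the range of x."""
    lo, hi = x.min(), x.max()
    step = (hi - lo) / 2 ** bits
    return lo + (np.arange(2 ** bits) + 0.5) * step

def quantize(x, levels):
    """Map each sample to its nearest quantization level."""
    return levels[np.abs(x[:, None] - levels[None, :]).argmin(axis=1)]

def lloyd_levels(x, bits, n_iter=20):
    """Stand-in adaptive quantizer: refine the uniform levels by Lloyd iterations."""
    levels = uniform_levels(x, bits)
    for _ in range(n_iter):
        idx = np.abs(x[:, None] - levels[None, :]).argmin(axis=1)
        for j in range(levels.size):
            if np.any(idx == j):                 # keep levels of empty cells unchanged
                levels[j] = x[idx == j].mean()
    return levels

def quantization_variance(x, xq):
    """Assumed definition: mean squared quantization error."""
    return np.mean((x - xq) ** 2)

def qn_improvement_db(var_uniform, var_scheme):
    """Delta QN = 10 log10(Var_0) - 10 log10(Var_x)."""
    return 10 * np.log10(var_uniform) - 10 * np.log10(var_scheme)

# Synthetic noisy PAM4-like samples quantized with 2 bits.
rng = np.random.default_rng(0)
x = rng.choice([-3.0, -1.0, 1.0, 3.0], size=4000) + 0.2 * rng.standard_normal(4000)
var_0 = quantization_variance(x, quantize(x, uniform_levels(x, 2)))
var_x = quantization_variance(x, quantize(x, lloyd_levels(x, 2)))
delta_qn = qn_improvement_db(var_0, var_x)
print(f"QN improvement: {delta_qn:.2f} dB")
```

Because the Lloyd refinement starts from the uniform levels and never increases the mean squared error, the improvement in this sketch is non-negative by construction. The figures it produces are for the synthetic signal only and are not comparable with the measured values in Fig. 4(b).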
We further evaluate the BER versus ROP curves for QBs/Sa = 3 bits and QBs/Sa = 2 bits. The results are shown in Fig. 6. To clearly reveal the impact of the quantization operation on the BER performance, the result without quantization is also given as a reference. As shown in Fig. 6(a) for a QBs/Sa of 3 bits, the BER performance of all schemes meets the threshold of 7% overhead hard-decision forward error correction (HD-FEC) (3.8E-3). At this 7% HD-FEC BER threshold, the ASQ and AVQ schemes improve the receiver sensitivity by about 0.3 dB and 0.4 dB, respectively, compared with the non-adaptive uniform quantization scheme. The BER performance of the proposed GMM-based ASQ and AVQ is basically similar to that of the K-means-based schemes. Meanwhile, compared with the reference result without quantization, the GMM-based and K-means-based AVQ schemes have almost no performance penalty. When the QBs/Sa is reduced to 2 bits, Fig. 6(b) shows that the BER of the non-adaptive uniform quantization scheme cannot meet the BER threshold of 20% overhead soft-decision forward error correction (SD-FEC) (2E-2). However, with the adaptive GMM-based and K-means-based ASQ schemes, the corresponding BERs just reach the 7% HD-FEC threshold at an ROP of -7 dBm. Furthermore, the adaptive two-dimensional GMM-based and K-means-based AVQ schemes further reduce the BER by more than an order of magnitude (below 2E-4) under the same ROP condition. In this case, the two AVQ schemes again show no performance penalty compared with the reference result, even though the QBs/Sa is as low as 2 bits. This means that the two-dimensional AVQ scheme can significantly reduce the cost and power consumption of receivers widely deployed in optical wireless access networks.
Next, we evaluate the influence of the signal baud rate on the receiving performance for the different quantization schemes. The results are shown in Fig. 7. The black curve is the reference result without quantization. Three phenomena can be observed from Fig. 7. Firstly, for baud rates from 1 GBd to 5 GBd, the proposed GMM quantization scheme achieves performance comparable to that of the K-means scheme, regardless of whether one-dimensional ASQ or two-dimensional AVQ is used. Secondly, as the baud rate increases, the SNR performance of the two-dimensional AVQ scheme gradually and slightly deviates from the reference curve, especially when the baud rate exceeds 4 GBd. This is because a larger signal baud rate places a higher requirement on the QBs/Sa. As a result, at high baud rates the QBs/Sa must be increased to maintain the same performance. Finally, as the signal baud rate increases, the performance advantage of the one-dimensional ASQ and two-dimensional AVQ schemes over uniform quantization becomes more obvious. In other words, adaptive non-uniform quantization, especially the multi-dimensional AVQ scheme, is more suitable for high-speed MMW communication scenarios.
Finally, we study the running time of the K-means-based and GMM-based adaptive quantization schemes. In theory, since the GMM clustering algorithm uses steps E and M to estimate the posterior probability and the sub-Gaussian model parameters, respectively, its complexity per iteration is higher than that of the K-means clustering algorithm. However, for the same convergence threshold, the time each clustering algorithm takes to reach the threshold is the more critical issue. We therefore measure the running times of K-means-based and GMM-based one- and two-dimensional adaptive quantization under different QBs/Sa conditions. The results are shown in Fig. 8(a). The quantization times of the one-dimensional ASQ and two-dimensional AVQ schemes both rise as the QBs/Sa increases. Meanwhile, the larger the QBs/Sa, the larger the time difference between the two-dimensional AVQ and one-dimensional ASQ schemes. The main reason is that, although two-dimensional quantization compresses the scalar samples into vectors, it has more quantization levels than one-dimensional quantization at the same QBs/Sa, so the quantization time also increases significantly. On the other hand, the proposed GMM scheme has a shorter running time (nearly half) than the K-means scheme, regardless of whether one-dimensional ASQ or two-dimensional AVQ is used. This can be mainly attributed to the faster convergence of the GMM clustering algorithm compared with K-means, which is confirmed by Fig. 8(b). With the same convergence threshold of 1e-3, the ASQ-K-means scheme requires 15 iterations to converge, while only 6 iterations are needed for the ASQ-GMM scheme. For the two-dimensional AVQ schemes, the K-means-based and GMM-based schemes require 38 and 9 iterations, respectively, which is also a multifold difference.
K-means adopts a hard decision based on the Euclidean distance for clustering, whereas the GMM approach uses a soft decision based on the distribution probability. As a result, the decision for samples at the edge of clusters is no longer rigid but more flexible with the GMM algorithm. This may promote faster convergence, especially for signals that follow a Gaussian distribution. In addition, it should be emphasized that, although the two schemes differ in performance during the iteration process, their final SNR is basically similar after convergence.

IV. CONCLUSION
In conclusion, a photonics-assisted 28 GHz MMW communication system enabled by the GMM adaptive non-uniform quantization technique is proposed and verified experimentally. The GMM-based two-dimensional AVQ scheme is presented to realize low-bit adaptive quantization of the ADC in the MMW receiver. We compare the performance of the proposed GMM quantization scheme with other quantization schemes (including traditional non-adaptive uniform quantization and the existing K-means-based adaptive non-uniform quantization) in the low-cost intensity-modulated, envelope-detection 28 GHz MMW communication system. The results show that the quantization bit resolution of the ADC in our MMW receiver can be reduced to as low as 2 bits by the GMM-based AVQ scheme without noticeable performance penalty. In addition, the proposed GMM scheme achieves quantization performance similar to that of the K-means scheme, while the quantization time is reduced by about half under the same convergence threshold. It should be emphasized that the proposed GMM-based multi-dimensional AVQ scheme is applicable to both purely electronics-enabled and photonics-assisted MMW communication systems. Nevertheless, the benefit of the GMM AVQ scheme is more pronounced in the latter case, which is characterized by high-speed and large-capacity scenarios. Given the above, we believe it can provide an efficient and economical technical solution for reducing the cost and power consumption of ADCs in the upcoming B5G and 6G MMW-enabled optical fiber access networks.