Joint CD, DGD, and PDL Estimation Enabled by FrFT Based Time-Frequency Reconstruction

A high-precision optical performance monitoring (OPM) scheme is proposed to simultaneously achieve chromatic dispersion estimation (CDE), differential group delay (DGD) estimation, and polarization-dependent loss (PDL) estimation for long-haul coherent optical fiber communication systems. A time-frequency analysis method based on the fractional Fourier transformation (FrFT) is applied to the optical signal under different channel impairments to reconstruct two-dimensional time-frequency distribution images. A multi-task convolutional neural network (MT-CNN) is then adopted to extract the corresponding features of the impairments and establish a solid relationship between the images and the impairment quantities, so that CD, DGD, and PDL can be estimated jointly. We validate the proposed estimation in simulation using 50 GBaud PDM-16 QAM signals. The mean absolute error (MAE) for CD, DGD, and PDL estimation is 114 ps/nm, 0.26 ps, and 0.072 dB, and the monitoring windows are 1600∼48000 ps/nm, 4∼100 ps, and 0∼2 dB, respectively. The simulations indicate that the proposed estimation method achieves high precision and is robust to ASE noise and to the fiber nonlinearity of self-phase modulation (SPM).


I. INTRODUCTION
WITH the rapid development of artificial intelligence, big data, the internet of things, and cloud computing, network traffic has shown explosive growth and is expected to keep a 25% compound annual growth rate (CAGR) over the next five years [1]. To satisfy the increasing demand for data traffic, the optical fiber communication system is developing towards ultra-large capacity while providing flexible and dynamic functionalities, and optical signals may therefore accumulate various fiber-based impairments during transmission. Reliable and precise identification of these impairments is essential for ensuring high quality of service (QoS) [2]. Chromatic dispersion (CD) is one of the most important parameters to be monitored in optical performance monitoring (OPM). Since future optical networks will become more elastic and flexible, the accumulated CD of optical signals may change dynamically. CD estimation (CDE) is therefore the prerequisite for effective dispersion compensation, and has to be conducted at the very beginning of the digital signal processing (DSP) unit at the receiver. However, CDE is sensitive to large DGD and tends to fail when the DGD exceeds 60 ps, so it is essential to eliminate the influence of large random DGD on the CDE algorithm. As the instantaneous value of polarization mode dispersion (PMD), DGD obeys a Maxwellian probability distribution and may become very large at a given moment, leading to transmission outages and increasing the complexity of adaptive equalizers (AEQs) [3]. Hence, DGD is also a key parameter to be monitored. PDL is another significant parameter, because many in-line optical components accumulate non-negligible PDL, which can produce more complex system-related effects than originally assumed [4]. Therefore, it is necessary to monitor CD, DGD, and PDL in coherent optical transmission systems.
Several schemes for jointly monitoring CD, DGD, and PDL using finite-impulse-response (FIR) filters have been proposed over the past years [5], [6]. In [5], the proposed DGD estimation method requires some adjustment of matrix elements and sinusoidal curve fitting. The method in [6] estimates PMD after equalization, which may lead to poor estimation results when the number of taps is inadequate. In addition, an OSNR monitor is needed prior to the PDL estimation, which adds complexity. To achieve high estimation accuracy and effectively resist interference from other impairments (such as fiber nonlinearity and ASE noise), it is essential to introduce machine learning into the estimation method.
In traditional methods, signals are processed in the time domain or the frequency domain separately. These methods inevitably ignore the joint distribution characteristics of the time-frequency domain, thus losing a lot of useful information. To observe and process signals more comprehensively, we adopt time-frequency signal processing based on the FrFT. Since the FrFT can be regarded as a decomposition onto time-varying chirped signals, we can treat the linear and nonlinear impairments experienced by signals in the fiber link as chirped responses in the time-frequency plane, so as to flexibly analyze and extract the linear impairment characteristics in the time-frequency domain and achieve accurate impairment estimation. To accurately estimate impairments of long-haul optical transmission systems, we have conducted a series of investigations on fractional Fourier transformation (FrFT) based DSP algorithms. A blind CDE method based on the rotation property of the FrFT was demonstrated in [7], which achieves high precision with low complexity. In [8], we further developed the CDE algorithm and used a back-propagation (BP) neural network to improve CDE accuracy under random or large DGD. Nevertheless, DGD itself is also a key impairment to identify, since it may cause unacceptable system outages and degrade system performance. Therefore, we developed a joint CD and DGD estimation algorithm using FrFT based time-frequency distribution reconstruction images and a multi-task deep neural network (MT-DNN) [9]. However, the MT-DNN is unable to effectively extract the two-dimensional information of the reconstructed time-frequency distribution images, and its robustness against other impairments has not been verified.
In this paper, we extend our previous work into a more sophisticated OPM scheme for PDM coherent optical communication systems by developing a joint CD, DGD, and PDL estimation algorithm using FrFT based time-frequency distribution reconstructed images and a multi-task convolutional neural network (MT-CNN). Our method realizes multi-task estimation within a single-stage algorithm, which improves both efficiency and accuracy. In our method, the received signal is first transformed into time-frequency distribution images by the FrFT; these images contain the characteristics of the various impairments. Then, we train the MT-CNN on the time-frequency distribution reconstruction images to extract the corresponding features of the impairments, thus jointly estimating CD, DGD, and PDL. By placing the x and y polarizations in the third (channel) dimension of the image, the proposed method can accurately estimate polarization-related impairments. To verify the validity of the proposed technique, numerical simulations are performed for 50 GBaud PDM-16 QAM optical signals in the following ranges: 1600∼48000 ps/nm for CD, 4∼100 ps for DGD, and 0∼2 dB for PDL. The mean absolute error (MAE) for CD, DGD, and PDL is 109 ps/nm, 0.26 ps, and 0.081 dB, respectively. Our algorithm maintains high precision even when the impairments are large and affect each other. Moreover, we verify the robustness of the algorithm against ASE noise and fiber nonlinearity. In the presence of fiber nonlinearity and ASE noise, the simulation results demonstrate that the MAE of CD, DGD, and PDL is 114 ps/nm, 0.26 ps, and 0.072 dB, respectively.
The paper is organized as follows. In Section II, we present the principle of the time-frequency distribution reconstruction method and the multi-task convolutional neural network. In Section III, simulations are conducted to show the performance of the method and to compare it with other NN models. The concatenated birefringent fiber model is also adopted to validate the feasibility of our method in real applications. Finally, in Section IV, we give the conclusions.

A. FrFT Training Sequences
The FrFT is a generalization of the traditional Fourier transformation (FT). The main difference is that it represents signals on an orthonormal basis formed by chirps [10]. The FrFT of a signal $f(t)$ with a rotation angle $\alpha$, denoted as $F_\alpha(u)$, is defined as

$$F_\alpha(u) = \int_{-\infty}^{\infty} f(t)\, K_\alpha(t, u)\, dt \tag{1}$$

where the transform kernel $K_\alpha(t, u)$ of the FrFT is given by

$$K_\alpha(t, u) = \begin{cases} \sqrt{\dfrac{1 - j\cot\alpha}{2\pi}}\, \exp\!\left( j\,\dfrac{t^2 + u^2}{2}\cot\alpha - j\,u t \csc\alpha \right), & \alpha \neq n\pi \\ \delta(t - u), & \alpha = 2n\pi \\ \delta(t + u), & \alpha = (2n + 1)\pi \end{cases} \tag{2}$$

Since the FrFT has the rotation property [11], it can be used to scan the time-frequency distribution in the angular direction. Thus, it is usually used to analyze nonstationary signals.
A direct current (DC) signal has a steady amplitude and phase over time. After a fixed-order FrFT, a unit-amplitude DC signal becomes

$$F_\alpha(u) = B_\alpha \exp\!\left( -j\,\dfrac{u^2}{2}\tan\alpha \right) \tag{3}$$

where $B_\alpha$ is a constant related to $\alpha$. This kind of signal is similar to a linear frequency modulation (LFM) signal with a quadratic phase, which can be analyzed in a specific fractional domain instead of in either the time domain or the frequency domain. In the following, we refer to the LFM signals derived from the DC signal as FrFT training sequences (TSs).
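As a numerical sanity check of the chirp-basis picture, the following Python sketch evaluates the FrFT integral by direct quadrature. It assumes the common kernel convention K_α(t, u) = sqrt((1 − j cot α)/(2π)) · exp(j(t² + u²) cot α / 2 − j u t csc α); the function names, grids, and this normalization are illustrative assumptions rather than details taken from the paper. Under this convention the FrFT of a unit DC signal is the chirp sqrt(1 + j tan α) · exp(−j u² tan α / 2), which is what `dc_training_sequence` returns.

```python
import numpy as np

def frft_quadrature(f, t, u, alpha):
    """Approximate the FrFT of samples f(t) at points u by a Riemann sum.

    Assumes a uniform grid t and a rotation angle alpha that is not a
    multiple of pi (the kernel degenerates to a delta function there).
    """
    dt = t[1] - t[0]
    cot, csc = 1.0 / np.tan(alpha), 1.0 / np.sin(alpha)
    amp = np.sqrt((1 - 1j * cot) / (2 * np.pi))
    tt, uu = t[None, :], u[:, None]
    kernel = amp * np.exp(1j * (tt**2 + uu**2) / 2 * cot - 1j * uu * tt * csc)
    return kernel @ f * dt

def dc_training_sequence(u, alpha):
    """Closed-form FrFT of a unit DC signal: a chirp with quadratic phase."""
    return np.sqrt(1 + 1j * np.tan(alpha)) * np.exp(-1j * u**2 / 2 * np.tan(alpha))

# FrFT order p = 0.1 corresponds to the rotation angle alpha = p * pi / 2.
alpha = 0.1 * np.pi / 2
t = np.linspace(-10, 10, 4001)
u = np.linspace(-3, 3, 61)

# The Gaussian exp(-t^2 / 2) is an FrFT eigenfunction with eigenvalue 1,
# which makes it a convenient check of the quadrature.
gauss_out = frft_quadrature(np.exp(-t**2 / 2), t, u, alpha)
ts = dc_training_sequence(u, alpha)
```

The DC signal itself is not integrated numerically here because the integrand does not decay; the closed form is used instead, and the Gaussian serves only to validate the kernel.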

B. Time-Frequency Distribution Reconstruction Based on FrFT
The FrFT TSs can be reconstructed in the time-frequency plane to better represent the different optical channel impairments the signal experiences. In [12], an FrFT based time-frequency distribution reconstruction using optical approaches was introduced to achieve optical phase retrieval. As introduced above, the FrFT has the rotation property, which can be used to scan the time-frequency distribution in the angular direction. Specifically, a set of received FrFT TSs transformed with different FrFT orders constitutes the time-frequency distribution image in polar coordinates. By using the inverse Radon transformation, these images are then transformed from polar coordinates to rectangular coordinates for subsequent analysis.
The FrFT TSs with a length of 100 are utilized to establish the time-frequency distribution images of the x and y polarization states, respectively. The FrFT is performed on the received FrFT TSs over the order range of −1∼1 with a scanning step of 0.05. The simulation results with different impairments are shown in Fig. 1. The characteristics reflected in the time-frequency distribution images are clearly sensitive to optical impairments.
From the above figures, we can observe that CD produces regular patterns on the time-frequency distribution images. As DGD increases, the time-frequency distribution images present a complicated structure and change randomly. PDL results in a corresponding change in intensity between the time-frequency images of the two polarization states. When ASE noise and fiber nonlinearity are taken into account, the time-frequency distribution images become more randomized and diffuse, so the presence of fiber nonlinearity and ASE noise has an impact on the joint estimation of CD, DGD, and PDL.
In summary, the time-frequency reconstruction based on the FrFT provides a distinctive and intuitive way to analyze the optical signal under different channel impairments. With a detailed time-frequency distribution, the impairment information of the signal can be obtained, giving an overall perspective for signal processing.
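The polar-to-rectangular step described in this subsection can be sketched in Python as follows. The sketch relies on the known result that the squared magnitude of the FrFT at a given angle is a projection of the Wigner distribution along that angle, and it replaces the inverse Radon transformation with a plain unfiltered back-projection to stay NumPy-only; a filtered implementation such as `skimage.transform.iradon` would be closer to what the text describes. The function name, the random placeholder projections, and the 100 × 100 output grid are assumptions made for illustration.

```python
import numpy as np

def backproject(sinogram, angles, size):
    """Unfiltered back-projection of sinogram[k] (taken at angles[k], in
    radians) onto a size x size Cartesian grid."""
    n_det = sinogram.shape[1]
    det = np.arange(n_det) - (n_det - 1) / 2.0    # detector coordinate
    axis = np.arange(size) - (size - 1) / 2.0     # pixel coordinate
    x, y = np.meshgrid(axis, axis)
    image = np.zeros((size, size))
    for proj, theta in zip(sinogram, angles):
        # Position at which each pixel projects onto the detector axis.
        s = x * np.cos(theta) + y * np.sin(theta)
        image += np.interp(s.ravel(), det, proj, left=0.0, right=0.0).reshape(size, size)
    return image / len(angles)

# FrFT orders from -1 to 1 in steps of 0.05, i.e. 41 projection angles.
orders = np.linspace(-1.0, 1.0, 41)
angles = orders * np.pi / 2

# Placeholder projections: in the scheme above, each row would hold the
# squared FrFT magnitude of the received 100-sample TS at one order.
# Random data is used here only to show the shapes involved.
rng = np.random.default_rng(0)
sinogram = rng.random((orders.size, 100))
image_x = backproject(sinogram, angles, size=100)  # one polarization state
```

Stacking the images of the x and y polarization states along a third axis, for example with `np.stack([image_x, image_y], axis=-1)`, would give the 100 × 100 × 2 input format used by the network.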

C. Multi-Task Convolutional Neural Network
CNN is a powerful neural network inspired by the natural visual perception mechanism of living creatures, which uses the convolution operation to deal with high-dimensional inputs [13]. CNN provides superior performance in image processing and classification tasks, and builds solid relations between input images and output labels by automatically extracting image features. Multi-task learning (MTL) is a learning paradigm that handles multiple tasks together by sharing representations between related tasks [14]. Therefore, we use the MT-CNN to learn from the time-frequency distribution images and extract the corresponding features of the impairments, thus jointly estimating CD, DGD, and PDL.
The detailed parameters of our designed MT-CNN architecture for CD, DGD, and PDL estimation are illustrated in Fig. 2. In our scheme, we design an eleven-layer CNN composed of a batch normalization (BN) layer, convolution layers, pooling layers, and dense layers. The input is the time-frequency reconstructed image with a size of 100 × 100 × 2, whose two channels represent the time-frequency information of the x and y polarization states, respectively. The input first goes through the batch normalization layer to prevent overfitting and speed up convergence. Then we employ convolutional layers with randomly initialized kernels to perform feature extraction and feature mapping on the time-frequency reconstruction images. Following the convolutional layers, the feature images go through max pooling layers, which downsample them while preserving features and reducing computational complexity. After feature extraction, the compressed feature images are fed into a flattening layer and transformed into one-dimensional vectors for subsequent processing. Finally, the feature vectors are processed by two fully-connected layers serving three regression tasks: CDE, DGD estimation, and PDL estimation.
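A minimal Keras sketch of such a multi-task CNN is given below. Only the 100 × 100 × 2 input, the BN, convolution, max pooling, flatten, and dense structure, the ReLU and linear activations, the MAE loss, and the Adam optimizer follow the text; the number of blocks, filter counts, kernel sizes, dense width, and loss weights are placeholders, since the actual values are given in Fig. 2 and are not reproduced here.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_mt_cnn(input_shape=(100, 100, 2)):
    """Illustrative multi-task CNN; all layer sizes are hypothetical."""
    inputs = keras.Input(shape=input_shape)       # x/y polarization as channels
    x = layers.BatchNormalization()(inputs)
    for filters in (16, 32, 64):                  # placeholder filter counts
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.MaxPooling2D(2)(x)             # downsample feature maps
    x = layers.Flatten()(x)
    x = layers.Dense(128, activation="relu")(x)   # shared representation
    # One linear regression head per impairment.
    outputs = [layers.Dense(1, activation="linear", name=name)(x)
               for name in ("cd", "dgd", "pdl")]
    return keras.Model(inputs, outputs)

model = build_mt_cnn()
model.compile(optimizer=keras.optimizers.Adam(),
              loss=["mae", "mae", "mae"],
              loss_weights=[1.0, 1.0, 1.0])       # placeholder task weights
```

Sharing the convolutional trunk and splitting only at the output heads is one common way to realize multi-task learning; the paper's exact branching point may differ.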
Since impairment estimation is a regression task, the output is a continuous value rather than a discrete vector. Therefore, the loss function of each task is the mean absolute error (MAE):

$$\mathrm{MAE} = \frac{1}{N} \sum_{n=1}^{N} \left| y_n - \hat{y}_n \right| \tag{4}$$

where $N$ is the number of samples, $y_n$ is the true value, and $\hat{y}_n$ is the estimated value. Consequently, the final loss function, which is used to evaluate the convergence of the MT-CNN during the training process, can be written as

$$L = \lambda_1\, \mathrm{MAE}_{\mathrm{CD}} + \lambda_2\, \mathrm{MAE}_{\mathrm{DGD}} + \lambda_3\, \mathrm{MAE}_{\mathrm{PDL}} \tag{5}$$

where the parameters $\lambda_k$ are used to adjust the importance of each task in the MT-CNN model to improve the overall performance. All layers use rectified linear unit (ReLU) activation functions except for the output layer, which uses a linear activation. The adaptive moment estimation (Adam) optimization algorithm is adopted to train the model and minimize the output errors [15]. The feature extraction layers, consisting of convolution and max pooling layers, automatically extract the most significant and distinctive image features for regression, which supports a more accurate regression of the impairments.
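The per-task MAE and the weighted total loss can be illustrated with a few lines of NumPy. The weighted-sum form, the dictionary layout, and the numerical weights below are assumptions made for this sketch; the text only states that λ adjusts the importance of each task.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error between two equally sized sequences."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def multitask_loss(targets, predictions, weights):
    """Weighted sum of per-task MAEs (assumed form of the total loss)."""
    return sum(weights[task] * mae(targets[task], predictions[task])
               for task in targets)

targets = {"cd": [1600.0, 3200.0], "dgd": [10.0, 20.0], "pdl": [0.5, 1.0]}
predictions = {"cd": [1700.0, 3100.0], "dgd": [10.5, 19.5], "pdl": [0.6, 0.9]}
weights = {"cd": 0.001, "dgd": 1.0, "pdl": 1.0}   # placeholder task weights

loss = multitask_loss(targets, predictions, weights)
```

Because CD is measured in thousands of ps/nm while PDL stays below 2 dB, unweighted MAEs would differ by orders of magnitude; task weights (or label normalization) are one way to keep a single task from dominating the gradient.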

A. Simulation Setup
To investigate the feasibility of the proposed algorithm, we built a simulation system based on MATLAB, the Keras library, and VPI Transmission Maker 9.1. The numerical simulation model of the long-distance polarization-division-multiplexed (PDM) coherent optical communication system is shown in Fig. 3, and the estimation range of the key parameters is summarized in Table I. In the transmitter, two independent random sequences generated in MATLAB are used to drive two IQ modulators at 100 GSa/s to form a 50 GBaud PDM-16 QAM signal superimposed with a 0.1-order TS. The modulated optical signal is amplified by an erbium-doped fiber amplifier (EDFA) and launched into a fiber link in which each span consists of 100 km of standard single-mode fiber (SSMF) and an EDFA. The CD parameter of the SSMF is 16 ps/(nm·km), so one loop accumulates 1600 ps/nm of CD. Different CD values are generated by setting different loop numbers. The noise figure of the EDFA in the loop is set to 0 dB, and the OSNR is controlled by adding ASE noise at the end of the fiber link. After the loop, a PMD emulator is used to simulate random DGD in the range of 4∼100 ps, which obeys a Maxwellian probability distribution. PDL in the range of 0∼2 dB is then co-simulated by MATLAB and VPI. A Hermitian matrix is used to characterize the loss difference between the orthogonal polarization states, and the PDL model can be expressed as (6) and (7). To simulate the fiber nonlinearity, the self-phase modulation (SPM) effect of the fiber is taken into consideration, and the nonlinear coefficient of the fiber is set to 2.6 W⁻¹·km⁻¹. At the receiver, the optical signals are detected by a coherent receiver and saved offline to be imported into MATLAB for DSP. After resampling, the received TSs are transformed into time-frequency distribution images, which are then fed into the MT-CNN to conduct CD, DGD, and PDL estimation.
$$R_{psp} = \begin{bmatrix} \cos\alpha & -e^{-j\gamma}\sin\alpha \\ e^{j\gamma}\sin\alpha & \cos\alpha \end{bmatrix} \tag{7}$$

Based on the above system, 60000 sets of data are collected and divided into three parts: 80% for the training set (48000 samples), 10% for the validation set (6000 samples), and 10% for the test set (6000 samples). Our MT-CNN is implemented in Keras, a Python deep learning library, with TensorFlow as the backend. The training and testing processes are performed on a personal computer with a graphics processing unit (GPU) accelerator.
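The bookkeeping of this setup can be made concrete with a short Python sketch. The CD grid and the dataset split follow the numbers stated above, and `r_psp` implements the rotation matrix of (7). The PDL element `pdl_matrix` is a hypothetical construction (one principal axis attenuated by the PDL value, rotated by R_psp) chosen because it yields a Hermitian matrix as the text describes; it is an assumption and not necessarily the form used in the paper.

```python
import numpy as np

# One loop is 100 km of fiber with 16 ps/(nm*km), i.e. 1600 ps/nm per loop.
CD_PER_LOOP = 16.0 * 100.0
cd_values = CD_PER_LOOP * np.arange(1, 31)    # 1 to 30 loops

def r_psp(alpha, gamma):
    """Unitary rotation matrix of Eq. (7)."""
    return np.array([[np.cos(alpha), -np.exp(-1j * gamma) * np.sin(alpha)],
                     [np.exp(1j * gamma) * np.sin(alpha), np.cos(alpha)]])

def pdl_matrix(pdl_db, alpha, gamma):
    """Hypothetical PDL element: attenuate one principal axis by pdl_db."""
    r = r_psp(alpha, gamma)
    loss = np.diag([1.0, 10.0 ** (-pdl_db / 20.0)])   # field attenuation
    return r.conj().T @ loss @ r                       # Hermitian by construction

def pdl_in_db(matrix):
    """PDL as the ratio of maximum to minimum power transmission, in dB."""
    s = np.linalg.svd(matrix, compute_uv=False)
    return 20.0 * np.log10(s.max() / s.min())

h = pdl_matrix(2.0, alpha=0.3, gamma=1.1)

# 80% / 10% / 10% split of the 60000 collected samples.
n_total = 60000
n_train = n_total * 8 // 10
n_val = n_total // 10
n_test = n_total - n_train - n_val
```

With 30 loops the CD grid spans exactly the 1600∼48000 ps/nm monitoring window, and the singular values of `h` recover the 2 dB PDL that was put in.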

B. Performance of the Estimation Algorithm
During the training process, we performed hyperparameter optimization to obtain the best model performance. The results shown in this paper are therefore based on the best combination of hyperparameters found.
Simulation results for CD, DGD, and PDL estimation are shown in Fig. 4(a)-(c). The blue dots represent the estimation results, the black line denotes the fitted curve, and the red line shows the segment-wise MAE over the different value intervals. These results are obtained at an OSNR of 15 dB and a fiber input power of 0 dBm. The verification of the algorithm's resistance to ASE noise and fiber nonlinearity is demonstrated in detail in the next section.
The specific numerical performance values are shown in Table I. The MAE for CD, DGD, and PDL is 109 ps/nm, 0.26 ps, and 0.081 dB, respectively. The MAE of CD is below the estimation tolerance of 350 ps/nm, so the residual CD can be effectively compensated by a 2 × 2 CMA equalizer with 11 taps [16].
The results in the above figure indicate that the proposed estimation method is feasible and sufficiently accurate.
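The segment-wise MAE plotted against the impairment intervals can be computed as in the following Python sketch. The interval edges and the sample values are made up for illustration; the paper does not state the interval widths used in Fig. 4.

```python
import numpy as np

def segment_mae(y_true, y_pred, edges):
    """MAE of the estimates whose true value falls in [edges[i], edges[i+1]).

    Returns NaN for intervals that contain no samples.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    idx = np.digitize(y_true, edges) - 1      # interval index of each sample
    errors = np.abs(y_true - y_pred)
    return np.array([errors[idx == i].mean() if np.any(idx == i) else np.nan
                     for i in range(len(edges) - 1)])

# Illustrative DGD values in ps (not simulation results).
dgd_true = np.array([5.0, 15.0, 25.0, 35.0, 45.0, 55.0])
dgd_pred = np.array([5.2, 14.9, 25.3, 35.0, 44.6, 55.1])
edges = np.array([0.0, 20.0, 40.0, 60.0])

per_segment = segment_mae(dgd_true, dgd_pred, edges)
overall = float(np.mean(np.abs(dgd_true - dgd_pred)))
```

Reporting the error per interval in addition to the overall MAE shows whether the accuracy degrades at the edges of the monitoring window.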

C. Robustness to ASE Noise and Fiber Nonlinearity
As noted in the earlier discussion of the time-frequency distribution images, fiber nonlinearity and ASE noise increase the distortion and diffusion of the images and thus affect the joint estimation of CD, DGD, and PDL. Therefore, effective resistance to ASE noise and fiber nonlinearity is important for the estimation algorithm.
To evaluate the method's tolerance to ASE noise, we verified its performance under different OSNR conditions. The results are shown in Fig. 5, and the specific numerical performance values are summarized in Table II. The abscissa represents the different impairment intervals, and the ordinate represents the segment-wise MAE of the corresponding intervals. The simulation results confirm that our method provides reliable and accurate estimation for OSNR values from 10 to 25 dB.
Moreover, we also investigated the impact of fiber nonlinearity on the proposed method. Considering that the nonlinear coefficient of the fiber is generally a fixed value, while the transmission distance affects CD and fiber nonlinearity at the same time, we scanned the fiber power from 0 to 6 dBm to increase the fiber nonlinearity. The results are shown in Fig. 6 and Table III. The estimation results remain accurate in the presence of fiber nonlinearity.
Finally, to verify the robustness of our method against various combinations of ASE noise and fiber nonlinearity, we scanned the OSNR from 10 to 25 dB and the input signal power from 0 to 6 dBm simultaneously. The specific numerical performance values are summarized in Table IV. These results confirm the robustness of our method to ASE noise and fiber nonlinearity.
TABLE III: ROBUSTNESS TO FIBER NONLINEARITY
TABLE IV: ROBUSTNESS TO ASE NOISE AND FIBER NONLINEARITY

D. Performance Comparison With Different Models
In order to further verify the superiority of MT-CNN, we make a comparison with other NNs based on the same dataset. The comparison between MT-CNN and MT-DNN is elaborated here.
The structure of the MT-DNN is illustrated in Fig. 7. The three-dimensional time-frequency distribution images are flattened into a one-dimensional vector. The input then goes through a batch normalization (BN) layer to speed up convergence and is processed successively by four dense layers to extract high-order features of the time-frequency image. Finally, three parallel dense layers, each with an output dimension of 1, are added to predict the impairments.
After training and testing the DNN, we summarize its performance in Table V and Fig. 8, which show that the CNN significantly outperforms the DNN on the time-frequency distribution images. Meanwhile, we calculate the network computation parameters. The superiority of the CNN on the time-frequency distribution images is interpretable. The DNN only processes one-dimensional information and consequently ignores the two-dimensional features of the time-frequency distribution images, which results in its low accuracy in estimating polarization impairments. Benefiting from local perception, weight sharing, and subsampling, the CNN can extract abstract information from high-dimensional inputs and achieve the best accuracy at a moderate computation cost. By placing the x and y polarizations in the third (channel) dimension of the image, the ability of the CNN to recognize polarization impairments is significantly enhanced.
We can conclude from the analysis above that CNN has the advantages of high accuracy and low complexity in processing time-frequency distribution images.
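The effect of weight sharing on model size can be illustrated by counting the trainable parameters of a first dense layer versus a first convolution layer on the same 100 × 100 × 2 input. The layer widths below (128 dense units, sixteen 3 × 3 filters) are hypothetical and serve only to show the order of magnitude; the total parameter counts of the two networks depend on their full architectures.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def conv2d_params(kernel, c_in, c_out):
    """Weights plus biases of a 2-D convolution with square kernels."""
    return kernel * kernel * c_in * c_out + c_out

flat_inputs = 100 * 100 * 2                    # flattened image length
first_dense = dense_params(flat_inputs, 128)   # hypothetical 128-unit layer
first_conv = conv2d_params(3, 2, 16)           # hypothetical 3x3, 16 filters
```

In this example the dense layer needs about 2.56 million parameters, whereas the convolution layer needs 304, because each convolution kernel is reused at every image position.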

E. Performance Under Cascaded Birefringent Fiber Model
In the previous sections, a PMD emulator was adopted to simulate DGD, and the feasibility of the proposed method was verified. In real systems, the whole fiber can be regarded as a concatenation of birefringent fiber segments, and each small section of fiber can be regarded as a phase retarder. Therefore, we simulated PMD in a cascaded way to verify the effectiveness of the method in a realistic scenario, while the other parts of the simulation remained unchanged. The PMD in the range of 0∼100 ps is generated by 20 DGD segments with randomly rotating Jones matrices. The Jones transmission matrix of the i-th fiber section is built from R_i, which represents the two orthogonal elliptical polarization states, and Λ_i(ω), which introduces a phase retardation of ωΔτ_i in the frequency domain between the two orthogonal PSPs. The Jones transmission matrix of the entire fiber is the concatenation of the matrices of all sections. The group delay of a small fiber segment, Δτ_i(l_i), is calculated from the PMD parameter D_PMD. The specific numerical performance is summarized in Table VII, and the simulation results for CD, DGD, and PDL estimation are shown in Fig. 9(a)-(c). The MAE for CD, DGD, and PDL is 185 ps/nm, 0.57 ps, and 0.18 dB, respectively. These results indicate that our method performs well with cascaded PMD and is applicable to practical scenarios.
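A cascaded birefringent fiber of this kind can be sketched in Python as follows. Several modeling choices here are assumptions rather than details confirmed by the text: each section is taken as T_i(ω) = R_i Λ_i(ω) with Λ_i(ω) = diag(exp(jωΔτ_i/2), exp(−jωΔτ_i/2)), the rotation R_i reuses the matrix form of (7) with random angles, the segment delay is taken as Δτ_i = D_PMD · sqrt(l_i), and the resulting DGD is read out with Jones matrix eigenanalysis. The values of `d_pmd` and `lengths` are placeholders.

```python
import numpy as np

def rotation(alpha, gamma):
    """Unitary rotation with the same form as Eq. (7)."""
    return np.array([[np.cos(alpha), -np.exp(-1j * gamma) * np.sin(alpha)],
                     [np.exp(1j * gamma) * np.sin(alpha), np.cos(alpha)]])

def retarder(omega, dtau):
    """Phase retardation of omega * dtau between the two axes."""
    return np.diag([np.exp(1j * omega * dtau / 2), np.exp(-1j * omega * dtau / 2)])

def fiber_jones(omega, dtaus, alphas, gammas):
    """Product of all section matrices (assumed form R_i @ Lambda_i)."""
    total = np.eye(2, dtype=complex)
    for dtau, a, g in zip(dtaus, alphas, gammas):
        total = rotation(a, g) @ retarder(omega, dtau) @ total
    return total

def dgd(omega, d_omega, dtaus, alphas, gammas):
    """DGD from the eigenvalues of T(omega + d_omega) @ inv(T(omega))."""
    t1 = fiber_jones(omega, dtaus, alphas, gammas)
    t2 = fiber_jones(omega + d_omega, dtaus, alphas, gammas)
    rho = np.linalg.eigvals(t2 @ np.linalg.inv(t1))
    return abs(np.angle(rho[0] / rho[1])) / d_omega

rng = np.random.default_rng(0)
n_seg = 20                                   # number of DGD segments
d_pmd = 1.0                                  # ps/sqrt(km), placeholder value
lengths = np.full(n_seg, 25.0)               # km per segment, placeholder
dtaus = d_pmd * np.sqrt(lengths)             # 5 ps per segment here
alphas = rng.uniform(0, 2 * np.pi, n_seg)
gammas = rng.uniform(0, 2 * np.pi, n_seg)

# Angular frequency in rad/ps so that the DGD comes out in ps.
total_dgd = dgd(0.0, 1e-3, dtaus, alphas, gammas)
```

Redrawing the random angles gives a different instantaneous DGD each time, bounded above by the sum of the segment delays, which is how a cascaded model produces a statistical spread of DGD values.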

IV. CONCLUSION
A joint CD, DGD, and PDL estimation method based on the FrFT and MT-CNN is proposed and evaluated numerically. We propose a time-frequency distribution reconstruction method based on the FrFT to visualize the transmission evolution pattern of optical communication signals in the form of two-dimensional images, in order to analyze the transmission performance of optical communication more comprehensively. Considering that different impairments produce corresponding features in the time-frequency distribution images, we train an MT-CNN on these images to extract the features of the impairments and jointly estimate CD, DGD, and PDL. Through simulations, the proposed estimation method is shown to be robust against ASE noise and fiber nonlinearity. Reliable CD, DGD, and PDL estimation is demonstrated for 50 GBaud PDM-16 QAM signals with MAEs of 114 ps/nm, 0.26 ps, and 0.072 dB, respectively. Therefore, with its reliable estimation results and strong robustness, this estimation scheme is promising for future flexible optical networks.