Enhancing the Credibility of the Optical Performance Monitor With Adversarial Training

Existing optical performance monitoring (OPM) schemes based on deep neural networks cannot select their input data. They accept and process every input, which may cause serious monitoring errors and reduce the credibility of the monitoring system, because the data transmitted in future heterogeneous fiber-optic networks are diverse and likely to exceed the scope of the monitoring system. We propose an unsupervised generative adversarial network (GAN) as the judgement module in a new OPM framework to select the legal data within the scope of the monitoring system. The generator consists of an encoder-decoder-encoder (EDE) sub-network that jointly learns the image and latent feature distributions of the legal data. The network in the newly added judgement module is trained on the same data as the OPM analyzer network, so no extra data need to be collected, which keeps the cost low. In the simulation, four modulation formats at two bit-rates are considered to verify the performance of the model in the judgement module. When the 60 Gbps 64QAM signal is selected as the illegal data, the maximum area under the curve (AUC) is 0.942. The judgement time for a single image is about 12 ms. Moreover, the influence of the task weights and the latent feature shape on the judgement performance is investigated. The newly added judgement module largely increases the credibility and safety of the existing OPM scheme.


I. INTRODUCTION
With the rapid development of cutting-edge services such as artificial intelligence (AI), fifth-generation (5G) and cloud computing, the data transmitted in optical fiber networks is increasing explosively. Moreover, to improve the quality-of-service (QoS) and meet the real-time needs of end-users, the optical network is becoming more heterogeneous and dynamic, and requires unified control and management of resources (e.g. bit-rate, modulation format, etc.) [1]. Elastic optical networks (EONs) together with software defined network (SDN) controllers can meet these demands. To ensure reasonable control and management, it is crucial to provide correct and accurate monitoring parameters (e.g. modulation format, optical signal-to-noise ratio (OSNR), bit-rate, etc.) to the SDN controller using OPM as well as bit-rate and modulation format identification (BR-MFI) technologies [2]. Optical performance monitors that implement the OPM and BR-MFI technologies are deployed at the intermediate nodes of the optical network.
The associate editor coordinating the review of this manuscript and approving it for publication was Rentao Gu.
Recently, AI has attracted the attention of researchers, and deep learning (DL) in particular has become a research hotspot in areas such as natural language processing (NLP), computer vision (CV) and automatic speech recognition (ASR) [3]-[5]. Compared with traditional machine learning (ML) methods, DL has the significant advantages of self-learning and automatic feature extraction [6]. Naturally, to improve the monitoring accuracy, more and more DL techniques are used in OPM [7] as well as BR-MFI [8], [9]. Moreover, some works even realize BR-MFI and OPM simultaneously. In [10], [11], a convolutional neural network (ConvNet) was proposed for BR-MFI and OPM using eye-diagram and constellation-diagram data. In our previous work [12], a multi-task learning (MTL) based ConvNet was proposed for OPM and BR-MFI using phase portrait images. Similarly, an MTL deep neural network (DNN) using asynchronous amplitude histograms (AAHs) was proposed for OPM and BR-MFI [13], [14]. In general, with the help of advanced DL technologies, the results of the monitoring tasks (OPM and BR-MFI) are becoming more and more accurate.
However, the existing OPM schemes have a serious vulnerability when the optical performance monitor is deployed in a real monitoring scenario. Specifically, the analysis module of the existing OPM schemes directly uses supervised learning to train the DL model that serves as the data analyzer. A particular dataset is collected to define the monitoring scope, and the DL model is then trained on this dataset to produce accurate monitoring results. For the trained DL model to work properly, an important premise is that the input data do not exceed the monitoring scope; otherwise the DL model gives a totally wrong result, because it can only give correct results within the monitoring scope. For example, if a QAM signal is input into an analysis module that is trained only to identify on-off keying (OOK) signals, the analysis module will mistake the QAM signal for an OOK signal. For the optical performance monitor, input data within the monitoring scope are defined as legal data, and all other input data as illegal data. Unfortunately, the existing OPM schemes do not select their input data, which means that they accept and process everything. Moreover, an optical performance monitor deployed in a heterogeneous optical network can very easily receive data that exceed the monitoring scope. Since the monitoring results are important for the SDN controller to manage and control the whole optical network, the optical performance monitor needs the ability to select its input data. In theory, the selection between legal and illegal data can be treated as a supervised learning problem: for example, we can put illegal data into the training dataset and train the DL model to recognize them. However, there are endless types of illegal data in a real monitoring scenario, so the training dataset would keep growing while the DL model still could not filter unknown illegal data. To eliminate this vulnerability and improve the credibility of the optical performance monitor, a more advanced technique and a new OPM framework are needed.
In this paper, we design a new OPM framework to improve the credibility of monitoring in practical scenarios. Unlike the old OPM framework, which directly accepts and processes all input data, the new OPM framework adds a judgement module that filters out the illegal data exceeding the monitoring scope. The core of the judgement module is an unsupervised GAN whose generator consists of an EDE sub-network. During training, the GAN model minimizes the distance between the input and reconstructed images and between their latent features for the legal data. A large distance metric from the trained GAN model therefore indicates illegal data. The asynchronous single channel sampling (ASCS) method is used to acquire the phase portrait images that serve as the input data. Four common modulation formats at two bit-rates, namely 60/100 Gbps quadrature phase-shift keying (QPSK), 60/100 Gbps 4-ary quadrature amplitude modulation (4QAM), 60/100 Gbps 16QAM and 60/100 Gbps 64QAM, are comprehensively investigated under various impairments such as OSNR, chromatic dispersion (CD) and differential group delay (DGD) to verify the performance of the judgement module. The good performance shows the effectiveness of the proposed OPM scheme.

A. MORE CREDIBLE OPM FRAMEWORK
Firstly, we propose the new OPM framework based on the real monitoring scenario in the optical network, as shown in Fig. 1. Future heterogeneous optical networks are designed to support various services (e.g. services a, b and c) with different parameters (e.g. OSNR, CD, DGD, modulation format, bit-rate, etc.). To better utilize the resources in the physical layer, optical performance monitors are needed in the intermediate nodes to provide monitoring information to the SDN controller. Based on this monitoring information, the SDN controller can formulate strategies to better control and manage resources. Thus, the optical performance monitors are required to provide information that is as correct as possible.
The old OPM framework simply consists of two modules: a data generation module and a data analysis module. The data generation module continuously transforms the network transmission signal into a data format (e.g. AAH, asynchronous delay-tap sampling (ADTS) images) suitable for processing by the analysis module. Here, the phase portrait image is generated by the ASCS. The analysis module, which is based on a neural network, analyzes the input data and then reports the results. The neural network in the analysis module is pre-trained, which means that the monitoring scope is fixed. Once data that exceed the monitoring scope are input into the analysis module, totally wrong monitoring results are obtained. However, as the heterogeneous optical network develops, there will be more and more new services that can easily exceed the monitoring scope of the existing optical performance monitor. Without the ability to filter illegal data, the monitoring information provided by the optical performance monitor will lead to network chaos.
To solve this problem, we design a more credible OPM framework by adding a new judgement module to the old OPM framework. The newly added judgement module, located between the data generation module and the analysis module, is used to filter the illegal data. Specifically, if the judgement module recognizes that the data produced by the data generation module are illegal, it sends out a warning and denies service. Otherwise, the legal data are sent to the analysis module to produce monitoring information. With the judgement module in the new OPM framework, the optical performance monitor becomes more credible, since it can filter illegal data and thus avoid totally wrong monitoring information. Moreover, since the judgement module and the analysis module are decoupled, the various monitoring algorithms developed in previous studies can be applied without any modification.
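The gating behaviour of the judgement module can be sketched as follows. This is a minimal illustration rather than the authors' implementation: `monitor`, `MonitorResult`, `judge_score` and `analyze` are hypothetical names, the two toy callables stand in for the trained judgement and analysis networks, and the threshold value is arbitrary.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

Image = Sequence[float]  # hypothetical flattened phase portrait


@dataclass
class MonitorResult:
    accepted: bool           # True when the input is judged legal
    score: float             # judgement score of the input
    report: Optional[dict]   # analysis output, None when service is denied


def monitor(image: Image,
            judge_score: Callable[[Image], float],
            analyze: Callable[[Image], dict],
            threshold: float) -> MonitorResult:
    """Run the judgement module first; only legal data reach the analyzer."""
    score = judge_score(image)
    if score > threshold:
        # Illegal data: warn and deny service instead of analyzing.
        return MonitorResult(False, score, None)
    return MonitorResult(True, score, analyze(image))


def toy_judge(image: Image) -> float:
    # Placeholder for the trained judgement network (assumption).
    return sum(image) / len(image)


def toy_analyze(image: Image) -> dict:
    # Placeholder for any existing OPM/BR-MFI analyzer (assumption).
    return {"modulation_format": "placeholder"}


legal = monitor([0.1, 0.2, 0.3], toy_judge, toy_analyze, threshold=0.4)
illegal = monitor([0.9, 0.8, 0.7], toy_judge, toy_analyze, threshold=0.4)
```

Because the analyzer is passed in as an ordinary callable, this sketch mirrors the decoupling noted above: the analysis algorithm can be swapped without touching the gate.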

B. ASYNCHRONOUS SINGLE CHANNEL SAMPLING
In the data generation module, we use the ASCS method to generate phase portrait images as the object of subsequent processing. The ASCS is a simple and low-cost method, since it requires only single-tap sampling without clock information [15], [16]. The principle of using the ASCS method to generate phase portraits is presented in Fig. 2. The optical signal transmitted in the network is converted into an electrical signal after direct detection by the photodetector (PD). Then, single-tap sampling at a low rate $1/T$ is used to obtain the original sample sequence $q_1, q_2, \dots, q_N$. A version of the original sequence shifted by $k$ samples is combined with the original sequence to produce the sample pairs $(q_i, q_{i+k})$. The collected sample pairs are displayed as phase portraits. The phase portraits of the different signals under diverse impairments are shown in Fig. 3. The phase portraits directly show the influence of the various monitoring parameters, which makes them suitable for processing by the judgement and analysis modules.
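The pairing step can be illustrated with a short sketch. It is only an assumption-laden illustration: the function names are hypothetical, the sine wave is a toy stand-in for the detected signal, and the grid holds raw counts in a single channel (the mapping to the 32×32×3 color images used in the paper is not shown).

```python
import math


def sample_pairs(q, k):
    """Pair each sample with the one k positions later: (q_i, q_{i+k})."""
    return [(q[i], q[i + k]) for i in range(len(q) - k)]


def phase_portrait(q, k, bins=32):
    """Accumulate the sample pairs into a bins x bins grid of counts."""
    lo, hi = min(q), max(q)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant signal
    grid = [[0] * bins for _ in range(bins)]
    for x, y in sample_pairs(q, k):
        col = min(int((x - lo) / span * bins), bins - 1)
        row = min(int((y - lo) / span * bins), bins - 1)
        grid[row][col] += 1
    return grid


# Toy signal standing in for the asynchronously sampled PD output.
signal = [math.sin(0.37 * i) for i in range(1000)]
portrait = phase_portrait(signal, k=3)
```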

C. ADVERSARIAL EDE CONVNET FOR DATA JUDGEMENT
After the data generation module, the phase portraits are sent to the judgement module, which is the focus of this paper. In the judgement module, the adversarial EDE ConvNet is proposed to filter the illegal data. The whole neural network model is built on the GAN framework invented by Goodfellow et al. [17]. As unsupervised models, GANs have been applied to various applications [18]-[23] because of their strong ability to learn data distributions. The basic idea of a GAN is that the generator network G and the discriminator network D compete against each other in the training phase. Specifically, the generator network tries to learn the input data distribution and produce an image, and the discriminator network then judges the authenticity (real or fake) of the generated image.
The overview of the adversarial EDE ConvNet is illustrated in Fig. 4. The generator is formed by an encoder-decoder-encoder sub-network. The generator learns the image and latent feature distributions by reconstructing the input image and the extracted latent feature, respectively. Taking a 32 × 32 × 3 color image $I$ as the input of the generator, the first encoder sub-network $G_{E1}$ downscales the input image to a feature $Z$ of shape 1 × 1 × 100. The feature $Z$, which contains the most comprehensive information about the input image in the smallest size, can be regarded as the latent feature of the input image. Then, the decoder sub-network $G_D$ reconstructs the input image $I$ as $\hat{I}$ by upscaling the feature $Z$. $G_{E1}$ consists of 4 layers, each of which consists of a convolutional operation, batch-norm and leaky ReLU() activation. Similarly, $G_D$ uses the convolutional transpose operation, ReLU() activation, batch-norm and the tanh() activation. Moreover, the second encoder sub-network $G_{E2}$, which has the same structure as $G_{E1}$ but different weight parameters, is used to extract the latent feature $\hat{Z}$ from the reconstructed image $\hat{I}$. The feature $\hat{Z}$ has the same shape as the feature $Z$. During the training phase, the input image $I$ and the reconstructed image $\hat{I}$ are identified by the discriminator network $D$ as real and fake, respectively. With the help of the GAN framework, the EDE network can better learn the representation of the legal data. The details of the configuration (the filter size, stride, padding and number of channels) of the basic blocks are displayed in Table 1. To avoid repetition, the configurations of the other blocks are omitted, because they reuse the basic blocks and therefore have the same structure and configuration.
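To make the shape bookkeeping concrete, the sketch below traces the spatial size through a four-layer encoder and its mirrored decoder using the standard convolution and transposed-convolution size formulas. The (kernel, stride, padding) values are assumptions chosen so that 32 × 32 shrinks to 1 × 1 in four layers; they are not taken from Table 1, whose contents are not reproduced here.

```python
def conv_out(size, kernel, stride, padding):
    """Output size of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def deconv_out(size, kernel, stride, padding):
    """Output size of a transposed convolution along one spatial dimension."""
    return (size - 1) * stride - 2 * padding + kernel


# ASSUMED (kernel, stride, padding) per layer; not the values from Table 1.
ENCODER_LAYERS = [(4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 1, 0)]
DECODER_LAYERS = list(reversed(ENCODER_LAYERS))


def trace(size, layers, step):
    """Return the spatial size before the first layer and after every layer."""
    sizes = [size]
    for kernel, stride, padding in layers:
        size = step(size, kernel, stride, padding)
        sizes.append(size)
    return sizes


encoder_sizes = trace(32, ENCODER_LAYERS, conv_out)    # image -> latent feature
decoder_sizes = trace(1, DECODER_LAYERS, deconv_out)   # latent feature -> image
```

Under these assumed settings the encoder sizes run 32, 16, 8, 4, 1 and the decoder reverses them, which matches the 1 × 1 × 100 latent shape stated above when the last layer has 100 output channels.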
To train the proposed model, a large training dataset is denoted as $X_{\mathrm{train}} = \{I_1, I_2, \dots, I_M\}$, where $M$ is the number of phase portraits and $I_i \in \mathbb{R}^{32 \times 32 \times 3}$. Note that the training dataset contains only legal data, since its phase portraits are collected from within the monitoring scope. Besides, a testing dataset of $N$ phase portraits collected from both inside and outside the monitoring scope is denoted as $X_{\mathrm{test}} = \{(I_1, y_1), (I_2, y_2), \dots, (I_N, y_N)\}$, where the image label $y_i \in \{0, 1\}$ (0: illegal data, 1: legal data) and $I_i \in \mathbb{R}^{32 \times 32 \times 3}$. Based on these two datasets, our model first learns the legal data distribution from the training dataset, and the trained model then identifies whether each sample in the testing dataset is legal or illegal. In the testing phase, a score $S_{I_i}$ indicating the probability that the input testing image is illegal is calculated from the $L_2$ distance between the latent features $Z_i$ and $\hat{Z}_i$ of the image $I_i$. The score can be expressed as
$$S_{I_i} = \| Z_i - \hat{Z}_i \|_2 \tag{1}$$
The scores of the whole testing dataset are normalized to [0, 1]. The testing image $I_i$ is regarded as illegal when its score $S_{I_i}$ exceeds a certain threshold.
VOLUME 8, 2020
During the training phase, the model is trained with a combination of three loss functions, each of which optimizes a different part of the model. The first loss function is the adversarial loss. The most common way to train a GAN is to update the generator $G$ based on the output of the discriminator $D$, but this approach is not stable. To alleviate the training instability, we use the feature matching method [24] to update $G$ based on an internal feature of $D$. Given an input image $I_i$ drawn from the distribution of the training dataset, the feature matching method calculates the $L_2$ distance between the internal features that $D$ produces for the original image $I_i$ and for the reconstructed image $\hat{I}_i$. The adversarial loss can be expressed as
$$\mathrm{loss}_{\mathrm{adv}} = \mathbb{E}_{I_i} \| f(I_i) - f(\hat{I}_i) \|_2 \tag{2}$$
where $f(\cdot)$ is the output of the third layer of $D$ and the expectation is taken over the training dataset. The second loss function is the reconstruction loss.
It is used to optimize $G_{E1}$ and $G_D$ by learning the content information of the input data. Since the $L_1$ distance produces less blurry images than the $L_2$ distance [19], we use the $L_1$ distance to define the reconstruction loss as
$$\mathrm{loss}_{\mathrm{rec}} = \mathbb{E}_{I_i} \| I_i - \hat{I}_i \|_1 \tag{3}$$
The third loss function is the latent feature loss. The above two loss functions force $G$ to learn the legal data distribution in image space; in addition, we add the latent feature loss to learn the distribution in latent feature space. The latent feature loss can be defined as
$$\mathrm{loss}_{\mathrm{lat}} = \mathbb{E}_{I_i} \| Z_i - \hat{Z}_i \|_2 \tag{4}$$
Based on the three loss functions, $G$ learns the distribution of the legal data in both image and feature space. When illegal data, whose distribution differs from that of the legal data, are input to the trained model, the distance between the latent features $Z$ and $\hat{Z}$ increases beyond the threshold, since the model is trained only on legal data. Finally, the overall loss function can be expressed as
$$\mathrm{loss}_{\mathrm{overall}} = \mathrm{loss}_{\mathrm{rec}} + \lambda_1 \mathrm{loss}_{\mathrm{lat}} + \lambda_2 \mathrm{loss}_{\mathrm{adv}} \tag{5}$$
where $\lambda_1$ and $\lambda_2$ are the task weights that balance the influence of the latent feature loss and the adversarial loss, respectively. Two models with different task weights are trained to compare their performance. The first model, named "Model 1", is trained with $\lambda_1 = 15$ and $\lambda_2 = 0$, which means that "Model 1" is trained without the GAN framework. The other model, named "Model 2", is trained with $\lambda_1 = 15$ and $\lambda_2 = 5$. The selection of the task weights is discussed in Section III-B.
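The weighted combination of the three losses can be sketched for a single sample as below. This is an illustration under stated assumptions, not the training code: the helper names are hypothetical, tensors are represented as flat lists, and the use of the L2 distance for the latent feature loss is an assumption made by analogy with the judgement score (the L1 and L2 choices for the reconstruction and adversarial terms follow the text).

```python
import math


def l1_distance(a, b):
    """Sum of absolute differences between two flattened tensors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def l2_distance(a, b):
    """Euclidean distance between two flattened tensors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def overall_loss(image, recon, z, z_hat, feat_real, feat_fake,
                 lambda_lat=15.0, lambda_adv=5.0):
    """loss_rec + lambda_1 * loss_lat + lambda_2 * loss_adv for one sample."""
    loss_rec = l1_distance(image, recon)          # image space
    loss_lat = l2_distance(z, z_hat)              # latent space (assumed L2)
    loss_adv = l2_distance(feat_real, feat_fake)  # feature matching on D
    return loss_rec + lambda_lat * loss_lat + lambda_adv * loss_adv


# Toy values only; they are not data from the paper.
sample = dict(image=[0.0, 1.0], recon=[0.5, 1.0],
              z=[3.0, 0.0], z_hat=[0.0, 4.0],
              feat_real=[0.0, 0.0], feat_fake=[0.0, 2.0])
model2_loss = overall_loss(**sample)                  # weights 15 and 5
model1_loss = overall_loss(**sample, lambda_adv=0.0)  # adversarial term off
```

Setting `lambda_adv` to zero removes the discriminator's contribution entirely, which is how the configuration without the GAN framework is expressed in this sketch.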

III. SYSTEM SETUP AND RESULTS
To collect data and build the neural network model, a simulation system is established with VPItransmissionMaker and the TensorFlow library, as shown in Fig. 5. Firstly, eight signals are generated in the transmitter from two bit-rates (60/100 Gbps) and four common modulation formats (4QAM, 16QAM, QPSK and 64QAM). To simulate the impairments in single-mode fiber (SMF) transmission, [...] Fig. 7, in which the images with red borders are the illegal data (60 Gbps 64QAM). Fig. 7(a) shows the input images. Fig. 7(b) and Fig. 7(c) show the corresponding images reconstructed from Fig. 7(a) by "Model 1" and "Model 2", respectively. The correlation between each reconstructed image and its input image is displayed at the top of the reconstructed image in Fig. 7(b) and 7(c). Since the reconstructed legal images have better image content and larger correlation values than the reconstructed illegal images, we can conclude that both models can effectively reconstruct the legal images but fail to reconstruct the illegal images. This is because the trained models have learned the legal data distribution, so legal images are easier to reconstruct than illegal images. The difference in reconstruction performance between the legal and illegal images is an intuitive reflection of the difference between the distributions of the legal and illegal data. For the legal images, "Model 2" reconstructs better than "Model 1", while for the illegal images, "Model 2" reconstructs worse than "Model 1". This means that "Model 2", which is trained on the GAN framework, is more powerful in identifying the illegal data.
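The text does not state which correlation measure is shown above the reconstructed images. As one plausible reading, the sketch below computes the Pearson correlation coefficient between two flattened images; the choice of Pearson correlation and the function name are assumptions.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equally sized flattened images."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # correlation is undefined for a constant image
    return cov / math.sqrt(var_a * var_b)


# Toy pixel vectors only: a faithful and an inverted "reconstruction".
original = [0.1, 0.4, 0.9, 0.3]
faithful = [0.2, 0.8, 1.8, 0.6]     # scaled copy of the original
inverted = [-v for v in original]
```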

B. LATENT VECTOR LENGTH AND TASK WEIGHT
The latent features $Z$ and $\hat{Z}$ are used to represent the data distribution, so the shape of the latent feature directly affects how well the data distribution is represented and, in turn, the performance of the model in identifying illegal data. The task weights also affect the model performance. Therefore, it is necessary to study how these hyper-parameters affect the model performance. Here, we take "Model 2" as the research object and change the shape of its latent feature. The AUC values for each latent feature shape under different monitoring scopes are illustrated in Fig. 8. When the shape of the latent feature is 1 × 1 × 100, the model achieves the best AUC for almost all monitoring scopes. More concretely, when the length of the third dimension is less than 100, the model performance improves as the third dimension grows. Nevertheless, once the length of the third dimension exceeds 100, the model performance begins to decline. This is because a small shape cannot contain all the useful features, while a large shape contains too many redundant features.
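For reference, the AUC used as the performance metric can be computed directly from scores and labels as the probability that a randomly chosen illegal sample receives a higher score than a randomly chosen legal one. The sketch below is a generic pairwise implementation, assuming the label convention stated earlier (0: illegal, 1: legal) and treating illegal data as the positive class; it is not the authors' evaluation code.

```python
def auc(scores, labels):
    """Pairwise AUC with illegal data (label 0) as the positive class."""
    illegal = [s for s, y in zip(scores, labels) if y == 0]
    legal = [s for s, y in zip(scores, labels) if y == 1]
    if not illegal or not legal:
        raise ValueError("both legal and illegal samples are required")
    wins = 0.0
    for s_illegal in illegal:
        for s_legal in legal:
            if s_illegal > s_legal:
                wins += 1.0   # correctly ranked pair
            elif s_illegal == s_legal:
                wins += 0.5   # ties count as half
    return wins / (len(illegal) * len(legal))


# Toy scores only; they are not results from the paper.
perfect = auc([0.1, 0.2, 0.8, 0.9], [1, 1, 0, 0])
chance = auc([0.1, 0.9, 0.2, 0.8], [1, 1, 0, 0])
```

The double loop is quadratic in the number of samples, which is acceptable for a sketch; a rank-based formulation would be preferable for large testing datasets.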
Next, the influence of the task weights on the model performance is studied with the shape of the latent feature fixed at 1 × 1 × 100 and 60 Gbps 64QAM selected as the illegal data, as shown in Fig. 9. The task weights are adjusted in the range [0, 30] with a step of 5. When the task weight $\lambda_2$ is in the range [0, 15] and the task weight $\lambda_1$ is in the range [0, 10], the model performance is poor (the AUC is about 0.5084 or less). When $\lambda_1$ is in the range [10, 20] and $\lambda_2$ is in the range [0, 10], the model achieves good performance. The optimal model performance (an AUC of 0.9420) is achieved when $\lambda_1$ equals 15 and $\lambda_2$ equals 5, which is exactly the task weight configuration of "Model 2". In general, selecting an appropriate latent feature shape and appropriate task weights is very important for good model performance. In the case of this paper, it is suitable to set the latent feature shape, $\lambda_1$ and $\lambda_2$ to 1 × 1 × 100, 15 and 5, respectively.
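The sweep over the two task weights amounts to a 7 × 7 grid search. A minimal sketch is given below, assuming a hypothetical `evaluate(lam1, lam2)` callable that would train a model with the given weights and return its AUC; the toy evaluator used here is a placeholder with an arbitrary peak and does not reproduce any result from the paper.

```python
from itertools import product

# Task weights from 0 to 30 in steps of 5, as in the sweep described above.
WEIGHTS = list(range(0, 31, 5))


def grid_search(evaluate):
    """Return (best_value, lam1, lam2) over all weight combinations."""
    best = None
    for lam1, lam2 in product(WEIGHTS, WEIGHTS):
        value = evaluate(lam1, lam2)
        if best is None or value > best[0]:
            best = (value, lam1, lam2)
    return best


def toy_evaluate(lam1, lam2):
    # Placeholder objective with an arbitrary peak at (10, 20); NOT paper data.
    return -(abs(lam1 - 10) + abs(lam2 - 20))


best_value, best_lam1, best_lam2 = grid_search(toy_evaluate)
```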

C. DISTRIBUTION OF THE JUDGEMENT SCORES AND FEATURES
The "Model 2" trained with the 60 Gbps 64QAM signal selected as the illegal data is used to evaluate the corresponding testing dataset. The histogram of the judgement score $S_{I_i}$ during the test phase is illustrated in Fig. 10. A separation score of around 0.4 effectively separates the testing data into legal data and illegal data. Although the legal and illegal data overlap, the overlapping proportion is very small and has a limited impact on the overall performance. Moreover, the t-SNE [26] visualization of the features extracted from the third layer ($f(\cdot)$) of the discriminator network $D$ is illustrated in Fig. 11. The shape of the feature produced by the third layer of $D$ is 4 × 4 × 256. The t-SNE is a non-linear dimensionality reduction algorithm that is commonly used for visualization. The legal data and illegal data can be roughly separated into two parts, which means that the discriminator network $D$ is able to identify whether the data are legal or not. The above results support the validity of the proposed model. On an Intel Core i7 CPU, the average time for the model to process each image in the testing dataset is around 12 ms, which can be shortened by using Graphics Processing Unit (GPU) devices. Compared with the old OPM framework, the newly added judgement module increases the data processing time (within an acceptable range), but it greatly enhances the credibility of the optical performance monitor, which is of great significance to the development of the optical network.
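The normalization and thresholding of the judgement scores can be sketched as follows. Min-max scaling is an assumption (the text only states that the scores are normalized to [0, 1]), the function names are hypothetical, and the default threshold of 0.4 is taken from the separation score read off the histogram.

```python
def normalize(scores):
    """Min-max scale raw scores of a whole testing set to [0, 1] (assumed)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)  # degenerate case: all scores identical
    return [(s - lo) / (hi - lo) for s in scores]


def flag_illegal(scores, threshold=0.4):
    """Mark a sample as illegal when its normalized score exceeds the threshold."""
    return [s > threshold for s in normalize(scores)]


# Toy raw latent distances only; they are not data from the paper.
raw_scores = [2.0, 4.0, 6.0]
flags = flag_illegal(raw_scores)
```

Because the scaling depends on the minimum and maximum of the whole testing set, a deployed monitor would need fixed reference values for these two quantities; how they are obtained is outside the scope of this sketch.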

IV. CONCLUSION
In conclusion, an adversarial EDE network is proposed as the newly added judgement module in a new OPM framework. The new OPM framework with the adversarial EDE network can filter out the data that exceed the monitoring scope of the optical performance monitor, so as to avoid totally wrong monitoring results. Compared with the EDE network without the GAN framework, the proposed adversarial EDE network achieves better performance. When the 60 Gbps 64QAM signal is selected as the illegal data, the maximum AUC is 0.942. Our model takes only around 12 ms to process a single input image, which is very efficient. The judgement module and the analysis module are trained on identical training data, so no extra data are needed for the newly added judgement module, which is convenient and low-cost. Moreover, the effects of the latent feature shape and the task weights on the model performance were studied in detail. The proposed method is of great significance for enhancing the credibility of the optical performance monitor and ensuring the efficient operation of the optical network.