Multi-Eigenvalue Demodulation Using Complex Moment-Based Eigensolver and Neural Network

Optical eigenvalues originating in optical solitons are the potential for becoming information carriers not affected by chromatic dispersion and nonlinear effects in optical fibers. They are obtained by attributing the associated eigenvalue equations deduced by solving the nonlinear Schrödinger equation with inverse scattering transform (IST) to the matrix eigenvalue problem, and maintaining constant values regardless of the transmission distance. The eigenvalue communication systems require to solve the eigenvalues in soliton-by-soliton. While effective eigenvalue solution methods have not been studied well in telecommunication systems, one of the most well-known eigenvalue solution methods is the QZ decomposition-based method. However, the QZ algorithm requires a large complexity. To reduce the complexity, a method to demodulate the optical eigenvalues using a complex-moment eigenvalue solver (CME) was investigated. CME is a parallelizable eigenvalue solver that can extract any eigenvalue. This paper proposes a novel optical eigenvalue demodulation method that combines CME and an artificial neural network (ANN) based on employing an on-off encoded discrete eigenvalue modulation scheme. The ANN is sensitive to the input order of the units; therefore, the eigenvalues must be sorted. A lightweight sorting algorithm is hence required. In addition to the proposed scheme, this study proposes partial sorting using CME and ANN. Here, 2000 km fiber-transmission experiments for an on-off-encoded four-discrete-eigenvalue were conducted. The experimental results indicated that the proposed demodulation method obtained bit error rate (BER) characteristics comparable to conventional methods by devising an extraction range of eigenvalues in CME.

In the digital domain, the optical eigenvalues are obtained by attributing the associated eigenvalue equations to a matrix eigenvalue problem and then solving the eigenvalue problem. A stable algorithm solves the eigenvalue problem based on the QZ decomposition technique; hereinafter this is called "QZ." After conducting QZ, the optical eigenvalues are sorted. The sorted eigenvalues are input into the artificial neural network (ANN) for bit decision. A method has been proposed and its feasibility has been evaluated [14], [15]. However, the previous method had the following issues. r QZ requires a large time complexity, O(n 3 ), where the matrix size is n × n.
r The basic ANN requires sorting the eigenvalues before inputting them into ANN units because it is sensitive to the input order. To reduce the computational complexity of the eigenvalues solution, a parallelizable eigenvalue solver based on CME methods [20], [21], which is characterized by its large scale parallel operation, hereinafter called "CME" was employed. CME methods draw multiple closed curves on a complex plane and extract the eigenvalues included in the curves. Furthermore, this paper proposes an optical eigenvalue demodulation method based on the combined use of CME and ANN. The contributions of this paper are summarized below.
r An error-free operation is achieved using CME and ANN for four eigenvalues extracted from the waveform with 2000 km fiber propagation.
r It is implied that the penalty for the optical signal and noise power ratio (OSNR) depends on the extraction range of This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ the continuous eigenvalues. 1 Since the modulated eigenvalues are only "discrete eigenvalues," there seems to be no need to extract "continuous eigenvalues." However, this paper shows that continuous eigenvalues also affect the BER characteristics. Therefore, we modify the range of eigenvalues extracted by CME (termed extraction range) to find the range of continuous eigenvalues that affect the BER characteristics.
r CME parameters are investigated to satisfy the accuracy required in optical eigenvalue communications.
r To reduce computational complexity, in addition to the proposed method, a partial sort method, that divides eigenvalues into small groups and employs sorting for each small group was proposed. In the following sections, Section II introduces related works on optical eigenvalue communications. Section III describes the proposed method based on combined CME and ANN. The feasibility of employing the CME method for single-eigenvalue communication is simulated as a preliminary study in the case of on-off encoding in Section IV. The optimal parameters of the CME method are also evaluated for eigenvalue communication. In Section V, 2000 km fiber-transmission experiments are conducted to evaluate the proposed scheme in the case of an on-off encoded four-eigenvalue communication. Section VI is the conclusion.

A. Eigenvalue Communication
Optical eigenvalues derived by IST 2 have been widely studied as ideal modulation carriers that are unaffected by chromatic dispersion and nonlinear optical effects during optical fiber propagation. In 1993, a modulation technique using eigenvalues was proposed by Hasegawa and Nyu [23]. The details of the eigenvalue modulation scheme with the IST are presented in Appendix A.
With the advent of digital coherent technologies, many studies using optical eigenvalues have been proposed to increase the transmission capacity in recent years, such as the on-off encoding scheme [1], [2], [9], [10], [11], [12], [13], [14], [15], where multiple eigenvalues are associated on a complex plane, or the modulated spectral amplitude of multisoliton pulses [16], [17], [18], [19]. In the former scheme, the bit sequence corresponds to the on-off state of the eigenvalues, as shown in Fig. 1. When multiple eigenvalues are used, multiple bits can be transmitted by using a single pulse. The details of the on-off encoded multiple-eigenvalue modulation scheme and its demodulation method are presented in Appendices B and C, respectively.
However, the variance of the eigenvalues caused by bandwidth limitation and amplified spontaneous emission (ASE) noise is 1 The continuous eigenvalues are defined as a part of the continuous spectrum. In the IST theory [22], they are also continuous real-valued eigenvalues. So, hereafter, we call them are also eigenvalues. Since the discrete Fourier transform (DFT) is used with a finite time window when the eigenvalues are calculated, the continuous spectrum was discretized on the real axis. 2 IST is also well known as nonlinear Fourier transformation (NFT), e.g. see [7]. not followed in a circularly symmetric complex Gaussian distribution [24]. In addition, the variance depends on the eigenvalue position and the combination of eigenvalues. Although eigenvalues were used in the same position, the variance is changed by eigenvalues with other positions. For example, the use of either eigenvalue set ∈ {i, −1 + i} or {i, 1 + i} is assumed for on-off encoded eigenvalue modulation. Eigenvalue {i} was assigned to all eigenvalue sets; however, the amount of variance of eigenvalue {i} was different between {i, −1 + i} and {i, 1 + i}. This situation makes it extremely difficult to optimize the boundary of the symbol decision using linear thresholds on the complex plane. A decision process with the Euclidean minimum distance was demonstrated to set an appropriate threshold [17]. Machine learning-based demodulation methods (in particular, ANN) [14], [15], [19] and equalization [18], [25] have also been proposed.
The ANN cases include two types of typical demodulation methods: demodulation in the time domain (TD) [12], [14], [19] and in the eigenvalue domain (ED) [14], [15]. Our previous works [12], [14] using TD demodulation indicated a drastic power margin of 11 dB compared with the IST and hard decision demodulation methods. Meanwhile, previous TD works required the preparation of a training model of ANNs for each transmission distance. It is important to demodulate in the ED to maximize the inherent advantages of eigenvalue demodulation. This is because demodulation in the eigenvalue plane enables a distance-independent ANN demodulation. Our group has demonstrated an ED demodulation method that exhibits high generalization performance over propagation distances up to 3000 km for signals with on-off encoded eigenvalues [14] using four eigenvalues as shown in Fig. 1. To improve transmission capacity, a multilabel learning method that can demodulate on-off encoded twelve-optical-eigenvalue signals [13] over propagation distances of up to 200 km is also demonstrated [15].
However, to demodulate the signal in the eigenvalue plane, the receiver must perform eigenvalue computation and sort symbol-by-symbol (soliton pulse by soliton pulse). This requires enormous computational complexity; therefore, employing the CME [20], [21] to reduce the computational complexity required to extract the eigenvalues is considered.

B. Eigensolver
Eigenvalues are generally obtained using QZ, which is a stable eigenvalue solving algorithm. QZ extends the QR algorithm for the standard eigenvalue problem (Az = ζz) to the generalized eigenvalue problem (Az = ζBz) [26]. When using unitary matrices Q and Z, square matrices A and B are transformed into upper triangular matrices QAZ and QBZ. However, QZ requires large computational complexity O(n 3 ) where the matrix size is n × n. In the eigenvalue communication, as discussed in the next section, n is the two-fold number of samples per soliton pulse. When solving for eigenvalues with QZ, eigenvalues are obtained from the diagonal elements of the upper triangular matrix as {ζ 1 , ζ 2 , . . . , ζ i , . . . , ζ n } = diag(QAZ) where i is the index of an element of the i-th row and column. The output is usually ordered by the magnitude of the absolute value |ζ 1 | ≥ |ζ 2 | ≥ · · · ≥ |ζ n |. This output order may cause bit errors when demodulating by combining it with an ANN. Details are provided in Section III.
The CME is considered to reduce the computational complexity when solving for the eigenvalues because QZ requires a large complexity. Fig. 2 shows an overview of the CME method [20], [21]. The CME sets arbitrary circular regions in a complex plane and solves the eigenvalues inside these circular regions. CME allows parallel computation for each circular region. In Fig. 2, three circles are drawn to extract five eigenvalues. To simulate the circle region in the computer, the circle is discretized and several quadrature points are prepared. Each quadrature point (eight points in Fig. 2) used in the numerical integration approximation in each circular region is parallelized, and each quadrature point is calculated in parallel. Thus, three parallel operations are performed. The mathematical details of CME are presented in Appendix D.
As a similar technique, a technique using contour integrals has been proposed [27]. This work can efficiently find discrete eigenvalues. The stable algorithm locating zero proposed by L. M. Delves and J. N. Lyness [28] is used in [27]. The related work [27] takes advantage that the poles of function a(ζ) described in NFT are discrete eigenvalues. CME also partially uses this algorithm [28]. As an advantage, CME can obtain the arbitrary eigenvalues in closed curves since it is a generic eigenvalue-seeking method. That is, CME can obtain continuous eigenvalues in addition to discrete ones. Our used ANN needs not only discrete eigenvalues but also continuous ones. Another vital advantage is the ability to perform parallelization massively.
Although the overall computational complexity of the CME is larger than that of the QZ, the computational complexity is O(Nn 3 /N p ) because of hierarchical parallelism. Largescale parallelization significantly reduces computing complexity compared to QZ O(n 3 ). Here, N denotes the number of quadrature points prepared for the numerical integration approximation on each closed circle, and N p is the number of parallelizations. The larger the value of N , the higher the computational performance and the smaller the eigenvalue error. However, because the size of N is directly related to the receiver scale, it is preferable to keep the size as small as possible when using this method in optical fiber communication systems. The next section describes the proposed method by combining CME with ANN, including the sorting algorithm, and comparing it with the QZ and ANN case. Fig. 3 shows an overview of the eigenvalue demodulation scheme. An analog optical signal is separated into IQ components by a coherent receiver and converted into a digital signal by an analog-to-digital converter (ADC). The obtained digital pulse train is divided into pulses, and the optical eigenvalues are extracted by IST. The pulse division must continue to be obtained with the same interval. That it, the periodic refresh operation is needed under the timing jitter. Note that this paper assumed the ideal condition for the jitter. Let the pulse division method under the jitter condition be a further issue. Optical eigenvalues are extracted using QZ. Let the sampling rate be S a samples/pulse and the number of extracted eigenvalues be 2S a . The typical number of S a is 64 or 128 [12], [13], [14], [15]. The number of samples is determined by the spectrum bandwidth. The modulated discrete eigenvalues mapped on large value in Fig. 4. Block diagram of eigenvalue demodulation scheme using CME method. Fig. 3.

Algorithm 1: Sort in
concatenate vectors the direction of real axis have the larger value of S a . That is, an increase in the number of modulated discrete eigenvalues brings that in the matrix size. Algorithm 1 shows the sorting algorithm for eigenvalues. The eigenvalue set ζ in ∈ C 2S a extracted using QZ is prepared. In general, the eigenvalue set ζ in comprises complex eigenvalues and their conjugates. That is, half of the eigenvalues in the set do not contribute to bit information. Thus, the conjugate elements are removed using Algorithm 1. 3 First, the algorithm sorts the eigenvalues in the set in the real part direction (Line 1). The complex conjugate eigenvalues are removed from the sorted optical eigenvalues (Line 2-5). Subsequently, the real and imaginary parts of the eigenvalues ζ tmp ∈ C S a are decomposed. The decomposed eigenvalues are connected to form a new eigenvalue set, ζ out ∈ R 2S a (Line 6). The eigenvalue set ζ out is used as the input to the ANN. In addition, when extracting optical eigenvalues, the QZ method requires a large computation time O(n 3 ), where the matrix size to solve for eigenvalues is n. Therefore it is necessary to reduce the computational complexity. An algorithm is proposed in [14]. The proposed algorithm is described in the following subsection. This paper compared the proposed method with this previous work as a baseline to determine its feasibility. Fig. 4 shows a block diagram of demodulation using a CME combined with an ANN. The process before inputting the signal 3 Note that the maximum and minimum elements of the real part of the eigenvalues in the set are not pair of eigenvalue and its conjugate. In other words, the number of elements of the pair is 2S a − 2. Algorithm 1 also removes either maximum or minimum elements of real part of eigenvalue in the set. This fact does not affect the performance of the algorithm. Fig. 4.

Algorithm 2: Sort in
concatenate vectors into the IST is the same as in Fig. 3. The parallelized CME extracts eigenvalues in each circle from the matrix generated by the IST. Because the CME prepares multiple overlapped circles for eigenvalue extraction, duplicated eigenvalues can be extracted multiple times. The algorithm calculates the Euclidean distance between all eigenvalues in the overlapped region to remove the duplicated eigenvalues. When the Euclidean distance is less than the circle's radius, these eigenvalues are regarded as duplicated and removed. Thus, the duplicated eigenvalues in the CME algorithm are removed by referring to the overlap of the circles, and the unduplicated eigenvalues are input into the sorting algorithm. Algorithm 2 shows the sorting algorithm for the eigenvalues extracted by CME. Eigenvalue set ζ in ∈ C N in where N in is the number of extracted eigenvalues from the CME, is sorted in the real direction (Line 1). Eigenvalues with negative imaginary parts are removed from the sorted eigenvalue set (Lines 2-6). The complex eigenvalues output by the IST appear as pairs of their complex conjugate elements. The conjugate eigenvalues have the same information as the original eigenvalues; therefore conjugate eigenvalues are deleted. CME extracts only the eigenvalues inside the drawn circles, which makes it difficult to extract the eigenvalues if they are outside the circles. Because the number of input units of the ANN is fixed, it is necessary to match the number of eigenvalues to this number of units. Therefore, if the CME results in a shortage of extracted eigenvalues in this algorithm, zero-padding is employed (Line 7). In addition, the real and imaginary parts of the eigenvalue set ζ tmp are decomposed. The decomposed eigenvalues are concatenated to construct the eigenvalue set ζ out (Line 7). The eigenvalue set ζ out is used as the input to the ANN.

C. CME With Partial Sort
As shown in Figs. 3 and 4, the eigenvalue set is sorted in the real part direction, and the eigenvalues are then divided into real and imaginary parts and input into the ANN. This sorting process requires O(n 2 ) of computational complexity using a simple algorithm. Even with a quicksort, which is well known as a fast sorting algorithm, the average time is O(n log n) and the worst-case time is O(n 2 ). Thus, the complexity of the sorting process must be reduced.
concatenate vectors This paper proposes a partial sort method that groups several circles of CME and employs the sort algorithm for each group. Fig. 5 shows a block diagram of the eigenvalue demodulation method using CME and partial sorting. The process before inputting the received signal into the IST is the same as in Fig. 3. Algorithm 3 presents an algorithm for partial sorting.
Eigenvalue set input into the algorithm denotes ζ in ∈ C c×r×N max where c, r ∈ Z denote the identifiers of each circle of the CME in the vertical and horizontal directions, respectively. A circle at the origin of the complex eigenvalue plane is set as (c, r) = (0, 0). The third dimension, ζ in represents the number of eigenvalues in each region. N max denotes the maximum number of eigenvalues. Nulls are stored when the number of eigenvalues in a circle is less than N max .
The eigenvalue set ζ in extracted by the CME is grouped and sorted in the direction of the real part for circles with the same real part identifier r (Lines 1-2). Next, the 3-dimensional eigenvalue set is reshaped to a 1-dimensional eigenvalue set (Line 3). Line 4 eliminates the null elements. The following operations are the same as those in Algorithm 3 (Lines 6-11). Depending on the number of circles, partial sorting reduces the complexity compared to sorting the entire eigenvalue set.

IV. INVESTIGATION OF CME AVAILABILITY
We firstly employed CME on eigenvalue communication with a simulation in the case of a single eigenvalue withon-off encoding ζ ∈ {0.5i}. The simulation investigated the required number of quadrature points N of the CME parameter in eigenvalue communication. In CME, while parallel operations are required for as many quadrature points as possible, N should be as small as possible to reduce the complexity; there is a trade-off in which a small N results in a large computational error. Therefore, the minimum N parameter was determined.
In the simulation setup, the bit sequence generated by the transmitter was encoded into a single eigenvalue sequence. The IST transforms the eigenvalue sequence into a pulse train. The pulse train was input into the additive white Gaussian noise (AWGN) channel. We added the AWGN using a preset signal-to-noise power ratio (SNR). For the definition of SNR, the soliton pulse is discretized 32 samples per pulse. The simulation combined the multiple pulses and generated the pulse train. The signal power is calculated from the pulse train. Signal power divided by SNR determines the noise power.
At the receiver side, eigenvalues were extracted from the divided pulses using the IST. Fig. 6(a) shows the constellation of back-to-back eigenvalues and designated regions for the CME. Fig. 6(b) shows the variance of the eigenvalues for the change in SNR. The variance was calculated between the ideal and received eigenvalue positions. The quadrature points were set as N = 2, 4, 8, and 16. The effect of changing N on variance is small. Fig. 6(c) shows the variance with N in the back-to-back simulation without noise. Compared to Fig. 6(b), the variance caused by the CME calculation error was negligible. addFor example, when N = 2, the eigenvalue has the about 10 −5 variance. This value is less than the variance caused by the noise. Althogh general CME applications focus on extracting exact eigenvalues, optical eigenvalue communication can use CME with a minimal number of N because the noise in the channel is dominant.
To investigate the effect of the number of N on the demodulation performance, the results of the bit error rate (BER) measurements are shown in Fig. 6(d). The simulation set the threshold for the hard decision to 0.25i. Note that this simulation did not employ ANN. Even when N was varied from 2 to 16, the BER results were the same as in the QZ case. In this simulation, the use of a single eigenvalue was assumed. When multiple eigenvalues trsare used, several eigenvalues may exist in a circle. In this case, the CME requires higher number of N . Therefore, depending on the number of eigenvalues used, the receiver must determine the number of N .

A. Setup
2000 km transmission experiments were conducted. Fig. 7 shows the experimental transmission system. The transmitter generates a random bit sequence and encodes every four bits into an eigenvalue pattern, as shown in Fig. 1. In total, 62250 pulses were generated. The pulse sequences were input into an arbitrary waveform generator (AWG) at 10 GSa/s (Sa/s means samples per second) with 32 samples/pulse and 5.3-GHz bandwidth limitation. The bit rate was 1.25 Gbps (= 0.315 Gpulse/s × 4 bit/pulse). The RF signal was modulated into an optical signal by an IQ modulator, which has 20-GHz bandwidth limit, and input into a 2000 km transmission system.
For the transmission length, the dispersion length L D (nonlinear length L NL ) should be sufficient longer than the distance of the amplifier spacing to suppress the fiber loss effect based on the guiding-center soliton system [29]. In the experiment, the dispersion length L D could be lengthen to 446 km by using a NZ-DSF with a dispersion parameter D of 4.4 ps/nm/km, which was sufficient longer than the distance of the amplifier spacing of 50 km. In addition, an inter-symbol interference is one of main impairments to limit the transmission distance in the eigenvalue transmission system [30]. This is the reason why many studies on experimental demonstrations of eigenvalue transmission and nonlinear frequency division multiplexed (NFDM) system employed a NZ-DSF. Based on these backgrounds, the transmission distance was set at 2000 km.
The initial transmission power is set by balancing dispersion length (L D = t 2 0 /|β 2 |) with nonlinear length (L NL = 1/γP 0 ) where t 0 is an input pulse width of fundamental soliton at ζ = 0.5i, β 2 is dispersion of the group velocity, γ is nonlinear coefficient, and P 0 is the peak power of fundamental soliton. That is, the peak power is expressed as, Considering the guiding-center theorem [29], then the input power P in is given by, where Γ = αt 2 0 /|β 2 |, α is the fiber loss, and Z a is the amplifier spacing. We multiplied P in by the normalized power |q| 2 described in Appendix A. Consequently, as a practical value, the input average power was set to −3.0 dBm taken into account the pattern sequence, the loop loss, connection loss, and amplified spontaneous emission (ASE) noise from EDFA. t 0 was set to 50 ps.
The optical signal was received by a coherent receiver of 20-GHz bandwidth limit, and the stored pulse sequences were resampled to 20 GSa/s using a 40-GSa/s and 50-GHz digital sampling oscilloscope (DSO), and then eigenvalue demodulation was performed.
1) QZ + Sort + ANN: The ANN used has a three-layer structure with one hidden layer consisting of 64 units for the input layer, 256 units for the hidden layer, and 16 (= 2 4 ) units for the output layer. The input unit layer has a 2-fold sampling rate to decompose the eigenvalues into real and imaginary components. For the activation function, a rectified linear unit (ReLU) function was applied to the hidden layer, and a softmax function was used for the output layer. The optimization method for learning the ANN was Adam [31], with an initial step size of α = 0.001 and exponential decay rates of β 1 = 0.9 and β 2 = 0.999, and a coefficient of = 10 −8 . MATLAB Deep Learning Toolbox released by Mathworks Inc. was used as the ANN framework. The BER measurement was performed assuming 10000 of the 62250 pulses were training data and the remaining 52250 pulses were test data.
2) CME + Sort + ANN: CME was implemented using z-Pares [32]. z-Pares is a package for solving parallel eigenvalues using the Sakurai-Sugiura method [20], [21]. Fig. 8 shows the designated region for eigenvalue solving using the CME method. There are two types of eigenvalues; discrete eigenvalues and  continuous eigenvalues. The example of the discrete eigenvalues appear in Fig. 1. Continuous eigenvalues have continuous spectrum [22]; however, when extracted from the sampled waveforms, they become discrete values and appear along the real axis. Fig. 9 shows the mapping of all 32 eigenvalues of 4 discrete and 28 continuous eigenvalues.
For ANN parameters, the input layer of ANN was 32 units, the hidden layer was 128 units, and the other parameters were the same as those in (1) QZ + Sort + ANN. Fig. 10 shows the constellation of eigenvalues using (a) QZ and (b) CME. Although QZ in Fig. 10(a) extracts all eigenvalues existing in the complex plane, Fig. 10(b) obtains eigenvalues based on the region, as shown in Fig. 8. In the CME case, the four on-off encoded discrete eigenvalues were perfectly extracted. Fig. 11 shows the BER results for back-to-back and 2000 km transmissions. At 7% overhead, hard decision forward error correction (HD-FEC) limit (= 3.8 × 10 −3 ) [33], the OSNR penalty between QZ and CME cases was approximately 1.6 dB for back-to-back and 2 dB after 2000 km transmission. The proposed scheme using CME and ANN was effective when these penalties were acceptable. Hypothetically, the OSNR penalty appeared because continuous eigenvalues other than the four discrete eigenvalues likely affected the demodulation performance. The following section investigats the effect of the continuous eigenvalues, for which only a part is obtained.

D. Impact of Continuous Eigenvalues on BER Characteristics
The previous section discussed the reason of OSNR penalty between employing QZ and CME methods. A demodulation scheme using continuous eigenvalues has been reported [34]. To our best knowledge, there is no report that continuous eigenvalues are essential in the case of the on-off encoding scheme. In this section, we varied the number of circles in the CME to expand the extraction region of the eigenvalues. In particular, to extract the continuous eigenvalues, we added the circles on the real part axis. Fig. 12 shows the expanded circular region of the CME. Fig. 13 shows the constellation of eigenvalues extracted from the updated circular region. The additional region as shown  in Fig. 12 provides a wide range of continuous eigenvalues along the real axis. Fig. 14 shows the BER results for back-to-back and 2000 km transmission. By extracting continuous eigenvalues along the real axis, the OSNR penalty in Fig. 11 disappeares, and the demodulation performance is comparable to that of QZ. Therefore, ANN is likely to use the variation of discrete and continuous eigenvalues for class classification. Depending on the positions of the discrete eigenvalues, the variance of the continuous eigenvalues caused by noise is different; therefore, continuous eigenvalues are also likely to become features for bit decisions in the ANN.
The above analysis extracted all 32 eigenvalues using multiple circles of the CME. In the CME case, we can select any number of circles we use. In addition, a smaller number of circles reduces the receiver's complexity. The next analysis investigated the minimum number of circles of the CME to minimize the OSNR penalty. For the circlar positions of the CME, it was assumed that the circles were positioned with imaginary axis symmetry, as in Fig. 12. Here, the center coordinates of the maximum and minimum circles along the real axis were defined as (c, r) = (0, ±r M ). For example, in Fig. 12, r M was set to ±2.0. If the circles are removed at r M = ±2.0; then, r M is reset to ±1.75. This procedure was used to adjust the number of circles. Fig. 15 shows the OSNR that satisfies the 7% HD-FEC limit by changing the r M . The wider the eigenvalue extraction range, the higher the noise tolerance. By setting the eigenvalue extraction range to |r M | < 1.75, the OSNR tolerance is close to that of the QZ case. In other words, when each discrete eigenvalue turned on or off according to the bit pattern, continuous eigenvalues in |r M | < 1.75 affected the bit decision characteristics in the ANN. Meanwhile, increasing the number of circles to extract a wide range of continuous eigenvalues increases receiver complexity.  Therefore, it is necessary to determine the extraction region from the perspective of the required OSNR and available complexity.

E. Impact of N on the Accuracy
The smaller number of quadrature points N in the CME is a critical factor that causes worse accuracy than that of QZ. In Section IV, better results were obtained with N = 2 in the case of a single eigenvalue communication; however, it is expected that the same results are not always obtained in the multiple eigenvalue cases. This subsection analyses the case with four eigenvalues. Fig. 16 shows the constellation maps on the complex eigenvalue plane with varying N : (a) N = 4, (b) N = 6, (c) N = 8, and (d) N = 16. In (a) N = 4, many outliers appear compared to (c) N = 8. When a CME is required to obtain several eigenvalues, accuracy depends on N . That is, the error in the numerical integration with the quadrature points corresponds to the accuracy of the eigenvalue extraction. When a larger error occurs, the eigenvalues move to random positions, Fig. 17. Impact of the number of quadrature points N on BER characteristics. and thus outliers appear. In (b) N = 6, whereas there is a slight random outliers, it was more suppressed than in (a) N = 4. Note that with an increase in noise or the number of eigenvalues in a circle, the outliers are likely to increase, even though N = 6. Fig. 17(a) shows the back-to-back BER measurement results for different values of N . For N = 4, an OSNR penalty of approximately 1.5 dB is observed at the 7% HD-FEC limit. The reason for the penalty is the outliers on the constellation map. The results for N = 6 were almost the same as those for N = 8. Outliers of the eigenvalues were observed for N = 6 as shown in Fig. 16(a). Meanwhile, there was a small number of pulses where outliers eigenvalue, that did not affect the BER performance. Fig. 17(b) shows the BER measurement results after a 2000 km transmission with varying N . When the number of quadrature points is N = 4, the error floor appears at approximately the 7% HD-FEC limit. The error floor is expected to be caused by the random variance of the eigenvalues, as in the back-to-back case. The OSNR penalty was approximately 0.2 dB, even at N = 6. This is because random variance occurred, albeit only slightly. The higher the value of N , the higher the accuracy of the eigenvalue decision. From these results, we expect that N needs to change depending on the number of eigenvalues within the closed curve. The fewer the eigenvalues in the closed curve, the smaller N can be taken. In other words, the larger the noise is, the more the eigenvalues fluctuate, so N must be larger. For four eigenvalues, it was possible to achieve the better BER results with N = 6 and 8. Although empirical, these results were a small value compared to the original CME application. Meanwhile, the larger the value of N , the larger the complexity. Therefore, it is necessary to set parameters in terms of accuracy for the required OSNR and complexity.

F. Employing Partial Sort
To further reduce computational complexity, a proof-ofprinciple experiment was conducted for partial sorting. Using the  measurement data obtained from Fig. 7, we use the algorithm for eigenvalue demodulation, as shown in Fig. 5 and Algorithm 3. The parameters were the same as those described in Section V-D. Fig. 18 shows the BER results when using the partial sort compared to the BER, as shown in Fig. 14. The BER characteristics were almost comparable for all measured points.
We confirmed the complexity reduction by employing the partial sort compared with the normal quick sort applied to all eigenvalues. Fig. 19 shows a comparison of sorting complexity. We varied the r M from ±0.75 to ±3. 25. The vertical axis indicates the estimated computational complexity by comparing only the sort process. Another complexity is the same compared between CME with and without the partial sort.
Let the number of eigenvalues denote N ζ , we estimated the average complexity as O(N ζ log 2 (N ζ )), and the worst complexity was estimated as O(N 2 ζ ). When using the normal quick sort without the partial sort, we referred the above estimation on the all extracted eigenvalues. That is, when N ζ = 32, the average and worst complexities were 32 × log 2 (32) = 160 and 32 2 = 1024. Here note that, when the complexity is O(f (n)), we refer to only the order f (n). In partial sort, we estimated the complexity for each eigenvalue group indicated by Algorithm 3, and their complexities were summarized. We determined the merged complexities as the total complexity and plotted on Fig. 19.
The computational complexity of using the partial sort is smaller when the specified region by CME is set to |r M | 1.75. For the average complexity at |r M | = 1.75, both sorting methods are almost same values. At |r M | = 3.25, the partial sort obtained 32.9% complexity reduction compared with normal quick sort. For the worst case at |r M | = 1.75 and |r M | = 3.25, the partial sort obtained 55.8% and 79.0% complexity reduction, respectively. In the worst complexity, the partial sort divided the sorting into smaller parts, so the complexity was suppressed even evaluated at O(N 2 ζ ). Therefore, as a result, we use multiple eigenvalues and increase the sampling rate, then the partial sort is in significant effect.

VI. CONCLUSION
This paper proposed joint demodulation of the CME and ANN for optical eigenvalue communication. In the discrete and continuous eigenvalues obtained by solving the IST, the CME extracted only the discrete and nearby continuous eigenvalues. After extracting the eigenvalues, they were input into the ANN for bit decision. As a result, a 2 dB OSNR penalty was obtained compared with the QZ case. This result indicates that the proposed method is practical for systems that can accept this penalty.
Furthermore, to investigate the reason for the 2 dB OSNR penalty, the CME expanded the circlar regions on the complex eigenvalue constellation map and increased the number of extracted continuous eigenvalues. The BER characteristics were comparable to those of QZ when the CME circle, which is the maximum distance between the center coordinate and the original point, was positioned at |r M | < 1.75. In other words, We revealed that the variation of the discrete and continuous eigenvalues in | [ζ]| < 1.875 affects the bit decision in the ANN.
We investigated the number of quadrature points N , an internal parameter of the CME algorithm, that is of concern as another factor affecting BER characteristics. When using a single eigenvalue, even the minimum number of quadrature points of two did not affect the BER characteristics; however approximately six quadrature points were required when using multiple eigenvalues.
A partial sorting algorithm was proposed and validated with experiments. The BER characteristics have no OSNR penalty compared with using the normal quick sort applied to all eigenvalues while reducing complexity.
In a further study, FPGA implementation of the proposed method and evaluation of the throughput and latency of largescale parallel operations are required. As a limitation of this study, the theoretical reason why continuous eigenvalues affect BER characteristics has not been revealed. Future work requires investigating the effect of continuous eigenvalues in the case of multiple discrete eigenvalue transmissions.

APPENDIX A ASSOCIATED EIGENVALUE EQUATION
The behavior of the complex envelope amplitude of a light wave propagating in an optical fiber is described by [35] i ∂E ∂z − β 2 2 , and α [1/m] are the propagation distance, time measured in a coordinate system moving at the group velocity, and the complex envelope amplitude of the electric field, dispersion of the group velocity, nonlinear coefficient, and fiber loss, respectively. (3) is conducted the following normalization, where T is a time normalized to the input pulse width t 0 .
Considering the normalization and guiding center theorem [29], the NLSE is given by, The associated eigenvalue equations used to solve the NLSE analytically with the IST are expressed as follows: where ζ and ψ l (Z, T ) (l = 1, 2) are complex eigenvalues and eigen functions, respectively. If q(Z, T ) satisfies (5) then, eigenvalues ζ of (6) are invariant with the distance Z. By mapping a bit sequence to the on-off states of the complex eigenvalues, multiple eigenvalues modulations are realized.

APPENDIX B MULTIPLE EIGENVALUE MODULATION
In contrast to general multi-level modulation schemes such as quadrature amplitude modulation (QAM), multiple eigenvalues are mapped to a single symbol (one pulse) in multiple eigenvalue modulation schemes. Fig. 1 represents the optical eigenvalues in the complex plane and is an example of modulation that maps a bit sequence to the on-off states of the four eigenvalues. Turning multiple eigenvalues on and off achieved a 16 (= 2 4 ) pattern of eigenvalue position [10]. This is called on-off encoding, showing that the number of bits directly corresponds to the number of multiple eigenvalues. Here, this pattern of eigenvalue placement is called the eigenvalue pattern.
After mapping the eigenvalues, they were converted into pulses using the IST described in APPENDIX A. The converted pulse is sent to a digital-to-analog conveter (DAC), and an IQ optical modulator generates an optical pulse.

APPENDIX C EIGENVALUE DEMODULATION
On the receiver side, the IQ component of each optical pulse is received by a coherent receiver. The IQ components were converted into digital signals using an analog-to-digital converter (ADC). This digital signal was divided into pulse waveforms using a window function, and the eigenvalues were extracted. To obtain the eigenvalues ζ of (6), (6) is Fourier-transformed into the integral equation shown below: whereq(Z, Ω) andψ l (Z, Ω) are the Fourier transforms of q(Z, T ) and ψ l (Z, T ), respectively. By discretizingq(Z, Ω) and ψ l (Z, Ω) in the frequency domain and converting convolution integral of the second term on the left-hand side of (7) to the numerical integration, (6) can be converted to a matrix eigenvalue problem, as in the following equation [6]: whereψ l is a column vector whose elements areψ l (Ω n )(n = 1, 2, . . . , N) and Ω n is the discretized frequency. B * is the Hermitian transpose matrix, B. A and B are square matrices of N × N , and their elements are as follows: where n jk = N/2 + j − k + 1 and ΔΩ(= Ω n+1 − Ω n ) is the frequency interval of the discretization.

APPENDIX D CME METHOD
To reduce the computational complexity of solving the eigenvalue problem, the complex moment-based eigenvalue solving method is applied to optical eigenvalue communication. The Sakurai-Sugiura method (SSM), one of the complex momentbased eigenvalue solvers, was proposed by Sakurai and Sugiura in 2003, characterized by high parallel processing performance [20].
Here, the detailed algorithm of SSM is introduced. SSM is a parallel computation algorithm that solves only the eigenvalues within a specified region. First, a closed curve was drawn on the complex plane. The center of the closed curve is γ, the radius is ρ, and the quadrature point z j = ρξ j + γ(j = 1, 2, . . ., N) is on the closed curve where ξ j = cos θ j + i sin θ j and θ j = (2π/N s )(j − 1/2). QZ extracts all eigenvalues in the complex plane, whereas SSM only finds the eigenvalues in the specified region. In addition, each specified region can be calculated in parallel, which is the first parallelization. The following equation is then computed.
The eigenvalues within each region were obtained as poles by the spectral decomposition of the matrix bundle in the above (11). (11) can be computed in parallel at each integration point: and is the second parallelization step. The simultaneous calculation of (11) can also be performed in parallel. This is the third parallelization mrthod. Where u and v are non-zero random matrices n × L, where I is the identity matrix. The A is the matrix of computed eigenvalues. L is a parameter determined from the number of eigenvalues in the region. The number of eigenvalues in the domain is calculated as follows: w j tr (z j I − A) −1 (12) where w j is the weight at each quadrature point in each region, and w j = (cos θ j + i sin θ j )/N s . To extract the poles, (11) is approximated by trapezoidal numerical integration of the contour integral along a closed curve.
where M is a parameter for M < N s and must satisfy the condition n < ML. n denotes the number of eigenvalues within a certain region. Quadrature points were used to approximate the numerical integration when performing contour integration along a closed curve. Finally, the eigenvalues can be computed from the Hankel matrix that integrates the computations at each integration point, and is represented by the following equation: where σ 1 , σ 2 , . . . , σ M are the diagonal elements of Σ, such that where σ m ≥ δ, σ m +1 < δ, and δ is an arbitrarily small value, and in this paper δ = 10 −12 . Also, from m , U 1 ∈ C M ×m , Σ 1 ∈ C m ×m , W 1 ∈ C M ×m . From this, the eigenvaluesζ i (i = 1, 2, . . . , m ) can be determined as the eigenvalues ofÃ = Σ −1 1 U T 1H (−1) M W 1 . In this case, eigenvalue ζ i of the matrix A are obtained as ζ i = ρζ i + γ.