SECTION I

The first generation of commercial coherent optical systems began at line rates of 40 Gb/s and 100 Gb/s [1], [2]. The formats of choice were polarization-multiplexed (PolMux) binary phase-shift keying (BPSK) and quaternary phase-shift keying (QPSK), allowing sufficient reach and fitting into the 50-GHz grid of typical systems. The signal was modulated in the time domain and transmitted using single-carrier (SC) or dual-carrier approaches [2]. Besides SC-based systems, research and development are focusing on orthogonal frequency-division multiplexing (OFDM), which is a contender for the second generation of coherent systems due to its stated higher spectral efficiency [3], [4], [5]. For second-generation systems, 400 Gb/s and 1 Tb/s are discussed as probable line rates. The new generation will likely require major changes, such as low-noise amplifiers, low-loss fibers, higher bandwidths, or optical-electrical regeneration. First concepts for 1 Tb/s lean toward wavelength-division multiplexing (WDM) solutions, using either OFDM [6] or SC channels as subcarriers [2].

Typical coherent optical receivers have used either blind adaptation, relying on the signal statistics only [7], or training sequences, as discussed in [1] and [8]. Training sequences inherently increase the signal bandwidth, giving blind receivers a slight advantage. A 112-Gb/s PolMux QPSK signal with a symbol rate of 28 GBd and a sampling rate of 56 Gsamples/s for twofold oversampling puts a stringent requirement on the design of analog-to-digital converters (ADCs). While faster ADCs will become available in the future, increasing the room for signal overhead, future coherent links are likely to be based on several wavelengths, leading to a modular design of the receiver with multiple lower rate subchannels. Therefore, overhead requirements are unlikely to be a major limiting factor. Future coherent systems will also require a high degree of scalability, so that the order of the underlying modulation format can be increased without a major redesign of the digital signal processing (DSP) algorithms. Data-aided receivers allow for a scalable and dynamic design of the receiver, while the performance of blind algorithms might suffer for higher order formats. Thus, there is a need for a reevaluation of blind and data-aided coherent receivers for future fiber-optic communication systems.

A general block diagram for the signal processing in SC coherent receivers is shown in Fig. 1. It comprises dispersion compensation using frequency-domain equalization (FDE) [9], timing recovery, multiple-input–multiple-output (MIMO) equalization, and carrier recovery, which can employ feedback and feedforward algorithms.

In this contribution, blind and data-aided receiver concepts will be discussed and compared for SC coherent receivers, focusing on MIMO equalization. Synchronization will be assumed ideal if not mentioned otherwise, and the interested reader is referred to [10], [11], [12] for general algorithms and to [7], [9], and [13] for their application in optical receivers.

The paper is structured as follows. Section 2 introduces the linear fiber optic channel model. The basics of channel estimation are explained in Section 3, followed by a discussion of time-domain equalization (TDE) and FDE algorithms in Sections 4 and 5, respectively. Data-aided and blind receivers are compared in Section 6 in terms of convergence speed, tracking performance, and receiver complexity, arriving at a recommendation for next-generation SC coherent optical receivers.

SECTION II

Coherent optical systems that employ polarization multiplexing to send information can be viewed as a special case of MIMO systems. In wireless communications, multiple antennas are used at the transmitter and receiver side for spatial multiplexing. In optics, two orthogonal polarizations are transmitted, resulting in a 2 × 2 MIMO system. Fig. 2 shows a typical equivalent baseband model for the linear fiber channel, where nonlinearities are neglected.

In the source, bits are mapped to an amplitude and phase of the transmitted signal vector ${\bf s}[k]$, followed by the pulse shaping filter $g_{s}(t)$ and the channel comprising $L$ spans, each with an impulse response ${\bf C}_{i}(t)$ and an additive white Gaussian noise (AWGN) source ${\bf n}_{i}(t)$. At the receiver, the signal passes a receiver filter $g_{r}(t)$ and is sampled with an oversampling rate of typically $1 \leq m \leq 2$. Finally, the signal is equalized in the time-discrete domain in the filter ${\bf W}[k]$. If not otherwise noted, synchronization is assumed ideal. The transmitted signal vector is defined as $${\bf s}^{T}[k] = \left[{\mmb s}^{T} [k], {\mmb s}^{T} [k - 1],\ldots, {\mmb s}^{T} [k - L_{h} - L_{w} + 2]\right]\eqno{\hbox{(1)}}$$ with $${\mmb s}^{T} [k] = \left[s_{1} [k], s_{2} [k],\ldots, s_{N_{t}} [k]\right]\eqno{\hbox{(2)}}$$ where $N_{t} = 2$ is the number of transmit channels for PolMux systems. Here, $L_{h}$ is the maximum length of the channel impulse response resulting from the transmitter and receiver filters $g_{s}(t)$, $g_{r}(t)$, as well as the total channel ${\bf C}(t)$. $L_{w}$ is the length of the equalizer and is chosen as $L_{w} \geq L_{h}$.

The fiber channel is modeled by a concatenation of linear filters, consisting of chromatic dispersion (CD), polarization-mode dispersion (PMD), polarization-dependent loss (PDL), and bandpass filters, as well as noise sources [14]. If nonlinearities in the fiber channel can be neglected, a lumped-noise model with a single noise source at the receiver can often be used. The receive filter $g_{r}(t)$ in optical systems consists of the superposition of optical and electrical filters. Since $g_{r}(t)$ is not a matched filter, it will be added to the channel for a total impulse response of ${\bf H}(t) = g_{r}(t) \ast {\bf C}(t) \ast g_{s}(t)$.

For convenience, the complete channel is described by a discrete model. The transmission over the frequency-selective link is then given by $${\bf r}[k] = {\bf Hs}[k] + {\bf n}[k]\eqno{\hbox{(3)}}$$ with the received vector and the noise vector defined as $$\eqalignno{{\bf r}^{T}[k] =&\, \left[{\mmb r}^{T}[k], {\mmb r}^{T}[k - 1], \ldots, {\mmb r}^{T} [k - L_{w} + 1]\right]&\hbox{(4)}\cr {\bf n}^{T}[k] =&\, \left[{\mmb n}^{T}[k], {\mmb n}^{T}[k - 1], \ldots, {\mmb n}^{T}[k - L_{w} + 1]\right]&\hbox{(5)}}$$ with $$\eqalignno{{\mmb r}^{T} [k] =&\, \left[r_{1} [k], r_{2}[k], \ldots, r_{N_{r}}[k]\right]&\hbox{(6)}\cr {\mmb n}^{T} [k] =&\, \left[n_{1} [k], n_{2}[k],\ldots, n_{N_{r}}[k]\right].&\hbox{(7)}}$$ Here, $N_{r} = 2$ is the number of receive channels in the polarization diversity receiver. Note that ${\bf n}[k]$ is in general colored due to preceding filtering. For baud-spaced sampling, the channel matrix is written in Toeplitz matrix form as $${\bf H} =\left[\matrix{{\bf H}_{0} & \cdots & {\bf H}_{L_{h} - 1} & {\bf 0} & \cdots & {\bf 0}\cr {\bf 0} & {\bf H}_{0} & \cdots & {\bf H}_{L_{h} - 1} & \cdots & {\bf 0} \cr \vdots & & \ddots & \cdots & \ddots\cr {\bf 0} & \cdots & {\bf 0} & {\bf H}_{0} & \cdots & {\bf H}_{L_{h} - 1}}\right] \in {\BBC}^{L_{w} N_{r} \times (L_{w} + L_{h} - 1) N_{t}}.\eqno{\hbox{(8)}}$$ A MIMO channel submatrix is defined as $${\bf H}_{i} = \left[\matrix{h_{1, 1} & \cdots & h_{1, N_{t}}\cr \vdots & \ddots & \vdots\cr h_{N_{r}, 1} & \cdots & h_{N_{r}, N_{t}}}\right].\eqno{\hbox{(9)}}$$
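The block structure of (8) can be made concrete with a short sketch. The function name `block_toeplitz`, the nested-list matrix representation, and the tap values are illustrative assumptions, not part of the original text; only the placement of the submatrices follows (8).

```python
def block_toeplitz(taps, L_w):
    """Build the baud-spaced channel matrix of (8) from the taps H_0..H_{L_h-1}.

    taps: list of L_h matrices, each N_r x N_t, as nested lists of complex numbers.
    Returns a (L_w * N_r) x ((L_w + L_h - 1) * N_t) nested list.
    """
    L_h = len(taps)
    N_r, N_t = len(taps[0]), len(taps[0][0])
    rows, cols = L_w * N_r, (L_w + L_h - 1) * N_t
    H = [[0j] * cols for _ in range(rows)]
    for b in range(L_w):                  # block row b is H_0..H_{L_h-1} shifted by b blocks
        for c, tap in enumerate(taps):
            for i in range(N_r):
                for j in range(N_t):
                    H[b * N_r + i][(b + c) * N_t + j] = tap[i][j]
    return H


# Hypothetical 2 x 2 channel with L_h = 2 taps and an equalizer length L_w = 3.
taps = [
    [[1 + 0j, 0.1j], [-0.1j, 1 + 0j]],     # H_0
    [[0.3 + 0j, 0j], [0j, 0.2 + 0j]],      # H_1
]
H = block_toeplitz(taps, L_w=3)
print(len(H), "x", len(H[0]))
```

For these values the matrix has $L_{w}N_{r} = 6$ rows and $(L_{w} + L_{h} - 1)N_{t} = 8$ columns, matching the dimensions stated in (8).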

The case of twofold oversampling with sampling diversity does not differ in principle from polarization diversity, which is generally assumed in coherent optical systems. The modification of the channel matrix ${\bf H}$ in (8) is similar to an extension from a single-input to a multiple-input channel. It is then rewritten as [15] $${\bf H} = \left[\matrix{{\bf H}_{0} & {\bf H}_{2} & \cdots & \cdots & {\bf H}_{L_{h} - 1} & {\bf 0} & \cdots & {\bf 0}\cr {\bf H}_{1} & {\bf H}_{3} & \cdots & {\bf H}_{L_{h} - 2} & {\bf 0} & & \cdots & {\bf 0}\cr {\bf 0} & {\bf H}_{0} & {\bf H}_{2} & \cdots & \cdots & {\bf H}_{L_{h} - 1} & & \vdots\cr & {\bf H}_{1} & {\bf H}_{3} & \cdots & {\bf H}_{L_{h} - 2} & & \cr \vdots & & \ddots & \ddots & \ddots & \ddots & & {\bf 0}\cr {\bf 0} & \cdots & {\bf 0} & {\bf H}_{0} & {\bf H}_{2} & \cdots & \cdots & {\bf H}_{L_{h} - 1}}\right] \in {\BBC}^{L_{w} N_{r}\times (L_{h} + L_{w}) N_{t}/2}.\eqno{\hbox{(10)}}$$

Here, the channel and filter lengths $L_{h}$ and $L_{w}$ describe the length of the fractionally spaced filters and are assumed odd in this formulation. At the receiver, the signal is processed in a linear equalizer, resulting in the output signal $${\bf z}[k] = {\bf WHs}[k] + {\bf Wn}[k] = \sum_{n = 0}^{L_{w} - 1} {\bf W}_{n} \sum_{c = 0}^{L_{h} - 1} {\bf H}_{c} {\mmb s}[k - c - n] + \sum_{n = 0}^{L_{w} - 1} {\bf W}_{n} {\mmb n}[k - n]\eqno{\hbox{(11)}}$$ with the equalizer matrix given by $${\bf W} = \left[{\bf W}_{0}, \ldots, {\bf W}_{L_{w} - 1} \right] \in {\BBC}^{N_{t} \times L_{w} N_{r}}\eqno{\hbox{(12)}}$$ and the MIMO equalizer submatrix is defined according to (9) as $${\bf W}_{i} = \left[\matrix{w_{1, 1} & \cdots & w_{1, N_{r}}\cr \vdots & \ddots & \vdots\cr w_{N_{t},1} & \cdots & w_{N_{t}, N_{r}}}\right].\eqno{\hbox{(13)}}$$

SECTION III

Equalization schemes often require the knowledge of the channel impulse response. Therefore, a training sequence with length $L_{c}$ is transmitted, e.g., in the preamble of a data block. Using the transmitted and received sequences, the channel can be estimated. For this purpose, the transmission equation in (3) is modified to the matrix notation $${\bf R}[k] = \bar{\bf H}{\bf S}^{T}[k] + {\bf N} [k]\eqno{\hbox{(14)}}$$ where the matrices are defined as follows: $$\eqalignno{{\bf R}[k] =&\, \left[{\mmb r}[k], \ldots, {\mmb r}[k + N_{t} L_{c} - 1]\right] \in {\BBC}^{N_{r}\times N_{t} L_{c}}&\hbox{(15)}\cr \bar{\bf H} =&\, [{\bf H}_{0}, \ldots, {\bf H}_{L_{c} - 1}] \in {\BBC}^{N_{r}\times N_{t} L_{c}}&\hbox{(16)}\cr {\bf S}^{T} [k] =& \, \left[\matrix{{\mmb s}[k] & \cdots & {\mmb s}[k + N_{t} L_{c} - 1]\cr \vdots & & \vdots\cr {\mmb s}[k - L_{c} + 1] & \cdots & {\mmb s} [k + (N_{t} - 1) L_{c}]}\right] \in {\BBC}^{N_{t} L_{c}\times N_{t} L_{c}}&\hbox{(17)}\cr{\bf N}[k] =&\, \left[{\mmb n}[k], \ldots, {\mmb n} [k + N_{t} L_{c} - 1]\right] \in {\BBC}^{N_{r} \times N_{t} L_{c}}. &\hbox{(18)}}$$ The least squares (LS) solution for the channel estimate is then given by [15], [16] $$\mathhat{\bar{\bf H}} = {1 \over N_{t} L_{c}} {\bf RS}^{-1} = {1 \over N_{t} L_{c}} {\bf R} ({\bf S}^{H} {\bf S})^{-1} {\bf S}^{H}.\eqno{\hbox{(19)}}$$ In the case of AWGN, the LS channel estimate is identical to the maximum-likelihood (ML) estimation [15].

Best estimation performance is achieved using noise-like sequences with good autocorrelation and cross-correlation properties. Several types of sequences meet these objectives, e.g., binary pseudorandom sequences [17], which offer ideal autocorrelation properties for infinite length. Another class of complex training sequences, called Constant-Amplitude Zero-Autocorrelation (CAZAC) sequences, was proposed in [18] among other publications and is also referred to as Frank–Zadoff or Zadoff–Chu sequences. They are defined as $$s[k] = \cases{e^{jK_{c}\pi k^{2}/L_{c}}, & $L_{c}$\ even\cr e^{jK_{c} \pi (k - 1)^{2}/L_{c}}, & $L_{c}$\ odd} \quad k = 0, 1, \ldots, L_{c} - 1.\eqno{\hbox{(20)}}$$ CAZAC sequences offer ideal cyclic autocorrelation properties, regardless of the length $L_{c}$, simplifying (19) to $\mathhat{\bar{\bf H}} = (1/N_{t}L_{c}){\bf RS}^{H}$.
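The constant-amplitude and zero-autocorrelation properties can be checked numerically. The following is a minimal sketch that covers only the even-length branch of (20); the choice $K_{c} = 1$, the length $L_{c} = 16$, and the helper names are assumptions made for illustration.

```python
import cmath


def cazac(L_c, K_c=1):
    """Even-length branch of (20): s[k] = exp(j * K_c * pi * k^2 / L_c)."""
    if L_c % 2:
        raise ValueError("this sketch covers only even L_c")
    return [cmath.exp(1j * K_c * cmath.pi * k * k / L_c) for k in range(L_c)]


def cyclic_autocorr(s, lag):
    """Cyclic autocorrelation sum_k s[k] * conj(s[(k - lag) mod L])."""
    L = len(s)
    return sum(s[k] * s[(k - lag) % L].conjugate() for k in range(L))


seq = cazac(16)
acf = [abs(cyclic_autocorr(seq, lag)) for lag in range(16)]
print("lag 0:", round(acf[0], 6), " largest sidelobe:", max(acf[1:]))
```

With these parameters every sample has unit modulus, the zero-lag value equals $L_{c}$, and all other cyclic lags vanish up to rounding error, which is the property that removes the matrix inversion from (19).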

Typical synchronization algorithms require a periodicity in the training sequence [12]. Therefore, a repetition of the training sequence will be assumed when treating channel estimation. One method to estimate the channel in a MIMO system is to transmit the same sequence one at a time from each antenna [19], as shown in Fig. 3. The length of an individual training sequence should be $L_{c}\ > \ L_{h}$. Prefixes can be inserted at the beginning and the end of each training sequence in order to improve the synchronization performance. Their length $L_{g}$ is half the length of the channel impulse response. This method is possibly less bandwidth efficient with respect to the required prefix length and does not use the array gain of spatial multiplexing. Moreover, the signal amplitude becomes zero in each channel, which is suboptimal in terms of the gain control at the transmitter and receiver.

Another training sequence design uses the principle of shift-orthogonal sequences [20]. For a MIMO system with $N_{t}$ transmission channels and a minimum training sequence length of $L_{h}$, a new training sequence with length $L_{c} = N_{t}L_{h}$ is created by cyclic shifting. Given a first sequence ${\bf s}_{1} = [s_{1}[0], \ldots, s_{1}[L_{c} - 1]]$ in a 2 × 2 MIMO channel, the second sequence can be written as ${\bf s}_{2} = [s_{1}[L_{c}/2], \ldots, s_{1}[L_{c} - 1], s_{1}[0], \ldots, s_{1}[L_{c}/2 - 1]]$, where $L_{c}$ is assumed even. In general, the shift should be greater than or equal to the channel impulse response length. The corresponding illustration of the training is given in Fig. 4.
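How the cyclic shift separates the two transmit channels can be illustrated with a small noise-free example. This is a sketch under simplifying assumptions that are not taken from the text: one receive polarization only, a cyclic channel (as provided by a repeated training sequence), made-up tap values `h11` and `h12`, and a CAZAC base sequence with $K_{c} = 1$.

```python
import cmath


def cazac(L_c):
    """Even-length CAZAC sequence of (20) with K_c = 1 (illustrative choice)."""
    return [cmath.exp(1j * cmath.pi * k * k / L_c) for k in range(L_c)]


def estimate_taps(r, s1, L_h):
    """Cyclically correlate one received polarization with s1.

    Lags 0..L_h-1 hold the taps from transmitter 1. Lags L_c/2..L_c/2+L_h-1
    hold the taps from transmitter 2, because s2 is s1 shifted by L_c/2.
    """
    L_c = len(s1)
    corr = [sum(r[k] * s1[(k - d) % L_c].conjugate() for k in range(L_c)) / L_c
            for d in range(L_c)]
    return corr[:L_h], corr[L_c // 2:L_c // 2 + L_h]


L_h = 3
L_c = 2 * L_h                                  # L_c = N_t * L_h with N_t = 2
s1 = cazac(L_c)
s2 = s1[L_c // 2:] + s1[:L_c // 2]             # cyclic shift by L_c / 2

h11 = [1.0 + 0j, 0.3j, -0.1 + 0j]              # hypothetical taps, transmitter 1
h12 = [0.2 + 0j, 0.5 - 0.2j, 0.05j]            # hypothetical taps, transmitter 2

# Noise-free cyclic convolution of both training sequences with their channels.
r1 = [sum(h11[c] * s1[(k - c) % L_c] + h12[c] * s2[(k - c) % L_c] for c in range(L_h))
      for k in range(L_c)]

est11, est12 = estimate_taps(r1, s1, L_h)
print([complex(round(v.real, 6), round(v.imag, 6)) for v in est11])
print([complex(round(v.real, 6), round(v.imag, 6)) for v in est12])
```

Because the shift $L_{c}/2$ is at least $L_{h}$, the two tap sets fall into disjoint lag windows of a single correlation, so both channels are recovered without switching off either transmitter.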

The minimum length of the training sequence is then $$L_{c, tot} = 2 L_{h} \cdot N_{t} + 2L_{g} = L_{h} \cdot (2 N_{t} + 1).\eqno{\hbox{(21)}}$$ For PolMux systems, the minimum total length is given by $L_{c, tot} = 4L_{h}$, if the prefixes are omitted.

SECTION IV

For limited channel distortion, equalization is typically performed using TDE. Linear filters are sufficient to fully compensate for the combined effects of CD and PMD. Nonlinear equalization methods, like decision-feedback equalization (DFE) or ML equalization, can only bring a slight benefit if PDL is dominant in the system and will not be covered in the following.

The task of the filter adaptation function is to compute the tap values of ${\bf W}$ to yield the output signal ${\bf z}[k]$ that is as close as possible to ${\mmb s}[k]$. Here, it is assumed that the filter input and the desired response are single realizations of jointly wide-sense stationary stochastic processes with zero mean [21]. The output error of the filter is given by $${\bf e}[k] = {\mmb s}[k] - {\bf z}[k] = {\mmb s}[k] - {\bf Wr}[k].\eqno{\hbox{(22)}}$$ The objective for the filter design is the minimization of the mean square error value of ${\bf e}[k]$, reducing the combined influence of signal distortion and noise. The resulting cost function is defined as $$J({\bf W}) = E\left\{{\bf e}^{H} [k] {\bf e}[k]\right\} = \sum_{i = 1}^{N_{t}}E\left\{\left\vert e_{i}[k]\right\vert^{2}\right\} = {\rm tr} \left(E\left\{{\bf e}[k]{\bf e}^{H}[k] \right\}\right)\eqno{\hbox{(23)}}$$ where $E\{.\}$ represents the expectation value, and ${\rm tr}(.)$ is the trace of the matrix. The cost function can then be rewritten as $$J({\bf W}) \!=\! E\left\{\left\Vert{\mmb s}[k] \!-\! {\bf Wr}[k]\right\Vert^{2}\right\} \!=\! E\left\{\left\Vert({\bf I}_{\delta + 1} \!-\! {\bf W}{\bf H}){\bf s}[k]\right\Vert^{2}\right\}\eqno{\hbox{(24)}}$$ with the selection matrix given by $${\bf I}_{\delta + 1} = \left[{\bf 0}_{N_{t} \times N_{t}\delta}, {\bf I}_{N_{t} \times N_{t}}, {\bf 0}_{N_{t} \times (L_{h} + L_{w} - \delta - 2)N_{t}}\right].\eqno{\hbox{(25)}}$$ Reformulating the equation leads to the familiar expression $$J({\bf W}) = E\left\{{\rm tr}\left(({\bf I}_{\delta + 1} - {\bf WH}) {\bf R}_{ss} ({\bf I}_{\delta + 1} - {\bf W} {\bf H})^{H} \right) + {\rm tr} ({\bf W}{\bf R}_{nn}{\bf W}^{H}) \right\}.\eqno{\hbox{(26)}}$$ Here, the autocorrelation matrices of the signal and the noise are written as ${\bf R}_{ss} = E\{{\bf s}[k]{\bf s}[k]^{H}\}$ and ${\bf R}_{nn} = E\{{\bf n}[k]{\bf n}[k]^{H}\}$, respectively.
Setting the derivative of the cost function to zero, the global minimum for the cost function can be obtained, resulting in the minimum mean square error (MMSE) solution for the equalizer given by $${\bf W} = {\bf I}_{\delta + 1} \underbrace{\left({\bf R}_{ss}^{-1} + {\bf H}^{H} {\bf R}_{nn}^{-1} {\bf H}\right)^{-1}}_{\rm Equalizer} \underbrace{{\bf H}^{H} {\bf R}_{nn}^{-1}}_{\rm Whitening\ MF}.\eqno{\hbox{(27)}}$$ If the input signal and the noise are uncorrelated, with ${\bf R}_{ss} = \sigma_{s}^{2}{\bf I}$ and ${\bf R}_{nn} = \sigma_{n}^{2}{\bf I}$, (27) can be formulated in more familiar form as $${\bf W} = {\bf I}_{\delta + 1} \left({\bf I} {\sigma_{n}^{2}\over \sigma_{s}^{2}} + {\bf H}^{H} {\bf H} \right)^{-1} {\bf H}^{H}.\eqno{\hbox{(28)}}$$ For best equalizer performance at the shortest possible filter length, the discrete delay $\delta$ in the selection matrix ${\bf I}_{\delta + 1}$ has to be optimized.

The MMSE equalizer solution can also be computed using the gradient least-mean-square (LMS) algorithm with low complexity of the filter update [22]. Here, no knowledge of the channel or the autocorrelation matrices from (27) is necessary, as only the instantaneous values are used in the stochastic approach. The LMS algorithm can be derived as $${\bf W} [k + 1] = {\bf W}[k] - \mu \widehat{\nabla_{{\bf W}^{\ast}} J({\bf W})} = {\bf W}[k]+ \mu\underbrace{\left({\mmb s}[k] - {\bf W}[k]{\bf r}[k]\right)}_{{\bf e}[k]}{\bf r}[k]^{H} = {\bf W}[k] + \mu {\bf e}[k]{\bf r}[k]^{H}\eqno{\hbox{(29)}}$$ where $\nabla_{{\bf W}^{\ast}}$ is the vector differential operator of the cost function with respect to the conjugate of ${\bf W}$. The complexity of the LMS filter update can be reduced at the expense of the convergence and tracking speed using signum-update algorithms. They are given by [17] $${\bf W}[k + 1] = {\bf W}[k] + \mu\cases{{\rm sgn}\left({\bf e}[k] \right){\bf r}[k]^{H}\cr {\bf e}[k] {\rm sgn}\left({\bf r}[k]^{H}\right)\cr{\rm sgn}\left({\bf e}[k]\right){\rm sgn} \left({\bf r}[k]^{H}\right).}\eqno{\hbox{(30)}}$$
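The plain LMS update of (29) can be sketched for the simplest PolMux case. Everything specific in this example is an assumption chosen for illustration: a single-tap 2 × 2 equalizer ($L_{w} = 1$), a noise-free channel consisting of a fixed polarization rotation by 0.6 rad, known QPSK training symbols, and a step size $\mu = 0.05$.

```python
import cmath
import math
import random


def lms_step(W, r, s, mu):
    """One update of (29) for a single-tap 2 x 2 equalizer: W <- W + mu * e * r^H."""
    z = [W[i][0] * r[0] + W[i][1] * r[1] for i in range(2)]
    e = [s[i] - z[i] for i in range(2)]
    W_new = [[W[i][j] + mu * e[i] * r[j].conjugate() for j in range(2)]
             for i in range(2)]
    return W_new, e


random.seed(0)
qpsk = [cmath.exp(1j * math.pi * (2 * n + 1) / 4) for n in range(4)]
theta = 0.6                                        # hypothetical rotation angle
H = [[math.cos(theta), math.sin(theta)],
     [-math.sin(theta), math.cos(theta)]]

W = [[1 + 0j, 0j], [0j, 1 + 0j]]                   # start from the identity
for _ in range(2000):
    s = [random.choice(qpsk), random.choice(qpsk)]             # training symbols
    r = [H[i][0] * s[0] + H[i][1] * s[1] for i in range(2)]    # received samples
    W, e = lms_step(W, r, s, mu=0.05)

final_error = max(abs(e[0]), abs(e[1]))
print("final error magnitude:", final_error)
```

In this idealized setting the taps approach the inverse rotation, so `W[0][1]` tends to $-\sin\theta$. With noise, longer filters, or a delayed update, a smaller $\mu$ and more symbols would be needed.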

Stringent requirements on the system bandwidth efficiency can lead to an omission of training sequences and a blind or non-data-aided adaptation of the receiver parameters. In this context, it is useful to distinguish between channel equalization or deconvolution and source separation of several independent transmitters. In blind equalization (BE) and blind source separation (BSS), only the received signal is known, while the input and the channel are unknown. Blind equalizer adaptation in general assumes independent and identically distributed (i.i.d.) symbols.

The most popular blind deconvolution algorithm is the constant-modulus algorithm (CMA), first introduced in [23]. The motive behind the algorithm was to find a criterion that is independent of the carrier phase and that can also acquire the channel, even if the signal is heavily distorted. The general cost function is given by $$J({\bf W}) = {1 \over 2p}E\left\{\sum_{i = 1}^{N_{t}} \left(\left\vert z_{i}[k]\right\vert^{p} - R_{p}\right)^{2}\right\}\eqno{\hbox{(31)}}$$ with $$R_{p} = {E\left\{\left\vert z_{i}[k]\right\vert^{2p}\right\}\over E\left\{\left\vert z_{i}[k]\right\vert^{p}\right\}}.\eqno{\hbox{(32)}}$$ For $p = 2$, the algorithm reduces to the well-known CMA. Differentiating the cost function with respect to the equalizer taps leads to the error $${\bf e}[k] = {\bf z}[k] \circ \left(R_{2} - \left\vert{\bf z}[k] \right\vert^{2}\right)\eqno{\hbox{(33)}}$$ where ${\bf a} \circ {\bf b}$ is the element-by-element multiplication of two vectors ${\bf a}$, ${\bf b}$. The CMA has been extensively used for PSK constellations in fiber optics but can also be applied to quadrature amplitude modulation (QAM). An extension to a multimodulus algorithm (MMA), where the constant $R_{2}$ is adaptively chosen based on the power of the equalized signal, leads to an improved tracking performance due to the lower steady-state error.
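The phase independence of the criterion can be seen in a deliberately small example. The sketch below applies the error term of (33) to a single complex tap; the channel gain `g`, the step size, and the QPSK source with $R_{2} = 1$ are illustrative assumptions, and the tap update uses the same form $w \leftarrow w + \mu\, e\, r^{\ast}$ as (29).

```python
import cmath


def cma_error(z, R2=1.0):
    """Error term of (33) for one equalizer output: z * (R2 - |z|^2)."""
    return z * (R2 - abs(z) ** 2)


g = 0.5 * cmath.exp(0.7j)                 # hypothetical gain and phase offset
qpsk = [cmath.exp(1j * cmath.pi * (2 * n + 1) / 4) for n in range(4)]

w, mu = 1 + 0j, 0.1
for k in range(1000):
    r = g * qpsk[k % 4]                   # received sample, no noise
    z = w * r                             # equalizer output
    w += mu * cma_error(z) * r.conjugate()

print("output modulus:", abs(w * g))
print("residual phase:", cmath.phase(w * g))
```

The tap restores the unit modulus of the output but leaves the 0.7 rad phase offset untouched, which is why a separate carrier recovery stage is still required after a CMA-adapted equalizer.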

Although the CMA was extended to a multichannel cost function in (31), a mere deconvolution is not sufficient for the separation of multiple channels. The cost function permits local minima, where any input channel can appear at more than one equalizer output. In the fiber-optic channel, the separation of polarizations becomes especially problematic if PDL is present.

The task of BSS spans many application fields such as acoustic processing, images, biometrics, or financial data analysis. In the literature, the terms BSS and independent component analysis (ICA) are often used as synonyms. BSS in fields other than communications often addresses different application scenarios: it focuses on analog signals without digital modulation, deals with nonstationary statistics, and is not necessarily real-time compliant. In fiber-optic communications, BSS should not compromise the equalizer performance, i.e., the steady-state and tracking performance should remain identical.

In 1995, Bell and Sejnowski derived a stochastic gradient algorithm for BSS based on the infomax principle [24], which was proven to be identical to ML estimation [25]. The update algorithm is given by $${\bf W}[k + 1] = {\bf W}[k] + \mu\left(\left({\bf W}[k]^{H} \right)^{-1} - \varphi\left({\bf z}[k]\right){\bf r}[k]^{H}\right).\eqno{\hbox{(34)}}$$ Here, $\varphi({\bf z}[k])$ is derived as [24] $$\varphi\left({\bf z}[k]\right) = {{\partial g\left({\bf z}[k]\right)\over \partial {\bf z}[k]}\over g\left({\bf z}[k]\right)}\eqno{\hbox{(35)}}$$ from the sigmoid function $g(.)$ that is defined for $g({\bf z}[k])$ to resemble the pdf $p({\bf s}[k])$ and assumed to be invertible. The choice of the sigmoid function $g(.)$ determines the robustness of the algorithm, although a certain flexibility is permitted.

In [26], Amari presented an improved version of the Bell and Sejnowski infomax algorithm, introducing the natural gradient algorithm. Here, the derivative of the cost function was normalized by ${\bf W}^{H}{\bf W}$, leading to a simpler and faster update rule that omits the matrix inversion and is given by $${\bf W}[k + 1] = {\bf W}[k] + \mu\left({\bf I}_{N_{t} \times N_{t}} - \varphi\left({\bf z}[k]\right){\bf z}[k]^{H}\right){\bf W}[k].\eqno{\hbox{(36)}}$$ The natural gradient was applied in fiber optics to a flat-fading or instantaneous mixtures channel in [27]. The sigmoid function $\varphi(z[k])$ must be chosen for complex signals with a sub-Gaussian distribution and was derived in [28]. The natural gradient algorithm then results in [29] $${\bf W}[k + 1] = {\bf W}[k] + \mu\left({\bf I}_{N_{t} \times N_{t}} + \left(\tanh \left({\Fraktur{Re}} \left({\bf z}[k] \right) \right) + j\tanh\left({\Fraktur{Im}} \left({\bf z}[k] \right) \right)\right){\bf z}^{H}[k] - {\bf z}[k]{\bf z}[k]^{H}\right){\bf W}[k].\eqno{\hbox{(37)}}$$

BSS and blind deconvolution are closely related. Whereas BSS ensures statistical independence of several sources, deconvolution eliminates the statistical dependence within a single channel between i.i.d. input symbols. Thus, the BSS problem can be extended from a flat-fading channel to a frequency-selective MIMO channel without changing the basic formulation. It is, however, beneficial to separate the steps of deconvolution and source separation in order to reduce the number of independent variables to be estimated. This is either done using a prewhitening step before the actual BSS [30] or by directly combining SISO deconvolution algorithms with BSS as shown in the next paragraph.

In [31] and [32], a BSS algorithm for convolutive mixtures was presented that works as an extension of existing deconvolution algorithms and will be used in the following in the fiber optic receiver design. Extending the CMA cost function by a cross-correlation term leads to $$J({\bf W}) = \underbrace{{1 \over 4}E \left\{\sum_{i = 1}^{N_{t}}\left(\left\vert z_{i}[k] \right\vert^{2} - R_{2}\right)^{2} \right\}}_{J_{CMA}({\bf W})} + \underbrace{\alpha \sum_{l, m = 1, l \neq m}^{2}\sum_{\xi = \xi_{1}}^{\xi_{2}} \left \vert\rho_{lm}[\xi]\right\vert^{2}}_{J_{BSS}({\bf W})}\eqno{\hbox{(38)}}$$ with $$\rho_{lm}[\xi] = E\left\{z_{l}[k]z_{m}^{\ast}[k - \xi]\right\} \eqno{\hbox{(39)}}$$ where $\rho_{lm}[\xi]$ is the cross-correlation function between polarizations $l$ and $m$, and $\xi_{1}$, $\xi_{2}$ are integers that depend on the channel delay spread. The derivative of the BSS cost function with respect to the filter taps yields $$\nabla_{{\bf w}_{lm}^{\ast}} J_{BSS}({\bf w}_{lm}) = \alpha \sum_{\xi = \xi_{1}}^{\xi_{2}} \rho_{lm}[\xi] z_{m}[k - \xi] {\bf r}_{m}^{\ast}.\eqno{\hbox{(40)}}$$ The instantaneous expectation value of the cross-correlation coefficient $\rho$ can, e.g., be computed by $$\rho_{lm}^{(k)}[\xi] = (1 - \epsilon) \cdot \rho^{(k - 1)}_{lm}[\xi] + \epsilon \cdot z_{l}[k] \cdot z_{m}^{\ast}[k - \xi]\eqno{\hbox{(41)}}$$ where $\epsilon$ is a forgetting factor, resulting in the error signal $$\eta_{l}^{(k)} = -\sum_{\xi = \xi_{1}}^{\xi_{2}} \rho^{(k)}_{lm}[\xi] \cdot z_{m}[k - \xi].\eqno{\hbox{(42)}}$$ The tap updates are given by $$w_{lm}^{(k)}[n] = w_{lm}^{(k - 1)}[n] +\mu \cdot \eta_{l}^{(k)} \cdot r_{m}^{\ast}[k - n].\eqno{\hbox{(43)}}$$ Although the considered MIMO system is dubbed blind, side information is ultimately required for proper equalization.
Not only do the signal statistics of the sources have to be known, but the inevitable permutation of the output channels also requires the use of higher layer framing bits for proper identification.
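The recursive estimate of (41) is what allows the cross-correlation term to detect the CMA singularity, in which the same source appears at both equalizer outputs. The sketch below evaluates only the lag $\xi = 0$; the forgetting factor, the sequence length, and the random QPSK streams are illustrative assumptions.

```python
import cmath
import random


def track_cross_correlation(z_l, z_m, eps=0.01):
    """Lag-0 recursion of (41): rho <- (1 - eps) * rho + eps * z_l[k] * conj(z_m[k])."""
    rho = 0j
    for a, b in zip(z_l, z_m):
        rho = (1 - eps) * rho + eps * a * b.conjugate()
    return rho


random.seed(1)
qpsk = [cmath.exp(1j * cmath.pi * (2 * n + 1) / 4) for n in range(4)]
x = [random.choice(qpsk) for _ in range(5000)]
y = [random.choice(qpsk) for _ in range(5000)]

rho_separated = track_cross_correlation(x, y)   # two independent polarizations
rho_singular = track_cross_correlation(x, x)    # same source at both outputs
print(abs(rho_separated), abs(rho_singular))
```

For independent outputs the estimate fluctuates around zero, so the error signal of (42) stays small. When both outputs carry the same source, the estimate approaches one and the update of (43) pushes the equalizer away from the singular solution.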

SECTION V

TDE requires a high computational effort due to the convolution operation. The equalization complexity of the standard convolution scales with $O(L_{w}^{2})$. If the operation is performed in the frequency domain, the complexity scales with $O(L_{w}\ \log\ L_{w})$ and can thus be significantly reduced [17]. FDE is one of the main advantages of OFDM [33]. Here, the equalization of frequency-selective channels simplifies to the equalization of several tightly spaced flat-fading channels. However, FDE is not limited to OFDM and was also successfully introduced to SC communication systems [34], [35]. Fig. 5 shows a comparison of the transmitter and receiver for OFDM and SC-FDE.

The main difference of equalization lies in the placement of the FFT/IFFT operators. The cyclic prefix can be omitted in SC-FDE, using overlap-add or overlap-save techniques [17].
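The overlap-save principle can be sketched in a few lines. This is a scalar, single-channel illustration in plain Python, not the receiver structure of the paper: the recursive radix-2 FFT, the block size $N = 8$, the three filter taps, and the test signal are all assumptions chosen to keep the example short.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT, unnormalized; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even, odd = fft(x[0::2], inverse), fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out


def overlap_save(x, w, N=8):
    """Filter x with taps w block by block in the frequency domain."""
    L_w = len(w)
    step = N - L_w + 1                        # new input samples per block
    W_f = fft(list(w) + [0j] * (N - L_w))     # filter transfer function
    padded = [0j] * (L_w - 1) + list(x)       # history for the first block
    y = []
    for start in range(0, len(x), step):
        block = padded[start:start + N]
        block += [0j] * (N - len(block))      # zero-fill the last block
        Y = [a * b for a, b in zip(fft(block), W_f)]
        y_block = [v / N for v in fft(Y, inverse=True)]
        y.extend(y_block[L_w - 1:])           # drop the L_w - 1 aliased samples
    return y[:len(x)]


def direct_convolution(x, w):
    """Time-domain reference: y[n] = sum_c w[c] * x[n - c]."""
    return [sum(w[c] * x[n - c] for c in range(len(w)) if n - c >= 0)
            for n in range(len(x))]


x = [complex(n % 5 - 2, (3 * n) % 7 - 3) for n in range(20)]
w = [0.5 + 0j, 0.3 - 0.2j, -0.1j]
max_diff = max(abs(a - b) for a, b in zip(overlap_save(x, w), direct_convolution(x, w)))
print("largest deviation from time-domain filtering:", max_diff)
```

Each block reuses the last $L_{w} - 1$ samples of the previous one, so the circular convolution implied by the FFT equals the linear convolution for the retained samples, and no cyclic prefix has to be transmitted.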

The basics of data-aided equalization of MIMO systems in frequency domain are similar to TDE. MMSE detection and the LMS algorithm can be reformulated and employed in the frequency domain.

The derivation of the MMSE equalizer solution for the FDE is similar to the time domain and is covered, e.g., in [36]. The equalizer solution for baud-spaced sampling can be written separately for each frequency tone, leading to $$\mathtilde{\hbox{W}}_{MMSE}[n] = \left({\bf I}_{N_{t}\times N_{t}} {\sigma_{n}^{2} \over \sigma_{s}^{2}} + \mathtilde{\hbox{H}}[n]^{H}\mathtilde{\hbox{H}}[n]\right)^{-1}\mathtilde{\hbox{H}}[n]^{H}\eqno{\hbox{(44)}}$$ where $\mathtilde{\hbox{H}}[n] \in \BBC^{N_{r} \times N_{t}}$ is the channel matrix at frequency tone $n$ and $\mathtilde{\hbox{W}}_{MMSE}[n] \in \BBC^{N_{t} \times N_{r}}$ is the corresponding equalizer matrix. The computation of the MMSE solution in the frequency domain thus requires only the inversion of several small matrices, which are of size $2 \times 2$ for PolMux systems.
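For the PolMux case, the per-tone solution of (44) needs nothing more than a closed-form 2 × 2 inverse. The sketch below assumes nested-list matrices and a hypothetical channel matrix at one tone; the argument `noise_to_signal` stands for $\sigma_{n}^{2}/\sigma_{s}^{2}$.

```python
def mmse_tone(H, noise_to_signal):
    """Per-tone 2 x 2 MMSE equalizer of (44): (I * sigma_n^2/sigma_s^2 + H^H H)^-1 H^H."""
    Hh = [[H[j][i].conjugate() for j in range(2)] for i in range(2)]  # Hermitian transpose
    A = [[sum(Hh[i][k] * H[k][j] for k in range(2))
          + (noise_to_signal if i == j else 0)
          for j in range(2)] for i in range(2)]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    A_inv = [[A[1][1] / det, -A[0][1] / det],
             [-A[1][0] / det, A[0][0] / det]]
    return [[sum(A_inv[i][k] * Hh[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Hypothetical channel matrix at one frequency tone.
H_n = [[1 + 0.5j, 0.2 + 0j], [-0.3j, 0.8 - 0.1j]]

W_zf = mmse_tone(H_n, 0.0)       # no noise: reduces to the channel inverse
W_mmse = mmse_tone(H_n, 0.1)     # sigma_n^2 / sigma_s^2 = 0.1

product = [[sum(W_zf[i][k] * H_n[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
print(product)
```

Without noise the product of equalizer and channel is the identity matrix. A nonzero noise-to-signal ratio regularizes the inverse and trades residual interference against noise enhancement.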

For systems with oversampling, the equalizer can be derived similarly, taking into account that several tones in the frequency spectrum bear the identical information, which is differently attenuated by the channel and the receiver filter. The T/2-spaced equalizer with the FFT size $N$ can be given as [36] $$\mathtilde{\hbox{W}}_{MMSE}[n] = \left({\bf I}_{N_{t}\times N_{t}} {\sigma_{n}^{2} \over \sigma_{s}^{2}} + \mathtilde{\hbox{H}}[n]^{H}\mathtilde{\hbox{H}} [n] + \left(\mathtilde{\hbox{H}}[n]^{H}\mathtilde{\hbox{H}}[n] \right)_{(n + N)_{\bmod N}}\right)^{-1}\mathtilde{\hbox{H}}[n]^{H}.\eqno{\hbox{(45)}}$$

The LMS algorithm can as well be formulated in the frequency domain with a block update of the equalizer transfer function. The derivation is described in [17], where the algorithm is referred to as fast LMS (FLMS). The block diagram for the adaptive equalizer is shown in Fig. 6.

The blind deconvolution and source separation algorithms of Section 4.2 can in principle also be applied to the frequency domain. Here, the problem of convolutive mixtures changes to a source separation problem of several instantaneous mixtures. In a PolMux 2 × 2 MIMO system, this means that the problem becomes similar to the flat-fading time-domain case. However, the convergence becomes more problematic, as the matrices at different frequency tones usually are arbitrarily permuted and scaled [37]. The solution of this problem adds to the complexity of the frequency-domain BSS, which will not be analyzed in the following.

However, BE in the frequency domain is a vital part of coherent receivers. In optically uncompensated links, equalization complexity can be reduced by precompensating for the scalar CD effect before the unmixing of the two polarizations. Thus, the training sequences and the MIMO equalizer length can be kept short. A blind estimation algorithm for dispersion was presented by the authors in [9] and will not be covered in this paper.

SECTION VI

The processing speed of integrated circuits is much lower than the typical sampling rates in optical high-speed receivers. Therefore, the signal processing has to be parallelized to a high degree. A block diagram of a TDE with three taps and an exemplary parallelization degree of $p = 4$ is shown in Fig. 7. Typically, a much higher parallelization degree is required.

As a consequence, the feedback filter update at processing instant $k = T$ cannot be performed at the subsequent instant $k = T + 1$ but at the earliest at $k = T + p$. In addition, most operations cannot be processed within a single clock cycle $D = 1$ but require more delay with $D\ > \ 1$. The feedback information from $k = T$ is thus only available at processing instant $k = T + pD$. This puts strict constraints on the implementation of feedback in optical receivers. The consequence for any feedback implementation is a required reduction of the loop gain in order to achieve stability, despite the high delay. For the LMS and the CMA, this leads to a lower update factor $\mu$, as introduced in (29). Moreover, the performance of the LMS can suffer due to the required parallelized feedback carrier recovery.
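The effect of the feedback delay on the permissible update factor can be reproduced with a toy model. The sketch below is a scalar, noise-free, training-based LMS over an ideal channel; the function name, the delay of 64 symbols (for example $p = 64$ with $D = 1$), and the two values of $\mu$ are assumptions chosen for illustration only.

```python
import cmath


def delayed_lms_error(mu, delay, steps=5000):
    """Scalar LMS whose update at instant k uses the error from instant k - delay.

    Returns the remaining tap error |1 - w| after the given number of steps.
    """
    qpsk = [cmath.exp(1j * cmath.pi * (2 * n + 1) / 4) for n in range(4)]
    w = 0j
    pending = []                       # (error, input) pairs still in the pipeline
    for k in range(steps):
        s = qpsk[k % 4]
        r = s                          # ideal channel, so the optimum tap is w = 1
        pending.append((s - w * r, r))
        if len(pending) > delay:
            e_old, r_old = pending.pop(0)
            w += mu * e_old * r_old.conjugate()
    return abs(1 - w)


print("no delay,  mu = 0.05:", delayed_lms_error(0.05, 0))
print("delay 64,  mu = 0.05:", delayed_lms_error(0.05, 64))
print("delay 64,  mu = 0.01:", delayed_lms_error(0.01, 64))
```

In this toy model the step size that converges quickly without delay makes the loop diverge once the update is 64 symbols late, whereas the smaller step size remains stable at the price of slower convergence.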

The covered adaptive filter algorithms have different acquisition behavior that is affected by the channel and the receiver implementation. An overview is given in Fig. 8. The MMSE equalizer solution is the fastest computation of the optimum Wiener solution. Here, a training sequence consisting of two CAZAC sequences was assumed for channel acquisition. Although it is possible to use very short training sequences and average the channel estimate over several subsequent headers, this constrains the channel tracking ability of the receiver. Therefore, it can be preferable to acquire the channel from each header independently. This makes it possible to track fast channel gradients [38], albeit at the cost of higher overhead.

The LMS algorithm has a slower convergence speed due to the stochastic gradient, although that is not quite as apparent in a serial implementation, where the assumption is that the processing speed of the receiver is higher than the sampling rate of the signal. If the processing is parallelized, a smaller update factor $\mu$ is required, leading to a consequently higher number of symbols for adaptation. Realistic implementations of training headers may also be limited in length; therefore, the LMS has to acquire the channel over the sequence of several headers, which further decreases the convergence speed, since the data between the headers cannot be used at first in decision-directed mode. In general, using the header alone for the filter update with LMS impedes the tracking ability of the receiver. The implementation therefore requires blind tracking in decision-directed mode once the equalizer taps have converged. Thus, for TDE with LMS, training is only relevant for channel acquisition.

The CMA naturally features the slowest convergence speed of all algorithms, which is even more impaired in parallelized receivers. Increasing the distortion in the channel leads to a disproportionate increase in the average acquisition duration of the equalizer. As a consequence, the CMA should primarily be used for limited distortion only. The update factor $\mu$ for the LMS and CMA was optimized to result in the fastest convergence. The final performance after convergence for all algorithms differs depending on $\mu$ but can be made equal if the update factor is chosen small enough.

Once the channel has been acquired, an optimum tradeoff has to be found between steady-state and tracking performance. For data-aided receivers, it will be assumed that the channel is estimated from a single header only, using the MMSE equalizer solution without averaging over subsequent headers. Thus, the channel can change arbitrarily from header to header. The header overhead depends on two factors: first, the minimum training sequence length, which is determined by the channel noise and the channel memory length; second, the repetition rate of the training sequence, which depends on the maximum tolerable channel gradient. Fig. 9 analyzes the resulting overhead for MMSE equalizer adaptation for maximum polarization rotation speeds of 10 kHz and 50 kHz on the Poincaré sphere.
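
The dependence of the overhead on these two factors can be written down directly. The helper below is a minimal sketch that assumes the overhead is defined as the fraction of transmitted symbols occupied by training; the source does not state its exact definition, and the function name and the numbers in the usage example are hypothetical.

```python
def training_overhead(n_train, n_period):
    """Fraction of transmitted symbols spent on training.

    n_train:  training symbols per header (set by noise and channel memory).
    n_period: symbols from the start of one header to the start of the next
              (set by the maximum tolerable channel gradient).
    """
    if not 0 < n_train <= n_period:
        raise ValueError("need 0 < n_train <= n_period")
    return n_train / n_period


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to illustrate the calculation.
    print(f"{training_overhead(64, 2048):.3%}")
```

A longer channel memory increases `n_train`, while a faster channel forces a smaller `n_period`; both raise the overhead.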

A distinction is made between the case where the CD is estimated from the training sequence and the receiver configuration with blind dispersion compensation up front. Here, a total of 50 000 ps/nm of CD was assumed. It is evident that blind CD compensation is required in order to achieve a reasonable overhead, which makes a coexistence of blind and data-aided algorithms in future receivers probable.

If the LMS is used in TDE data-aided systems, tracking should be done in decision-directed mode. TDE with LMS thus simplifies to a blind receiver in tracking mode with arbitrary initial adaptation that can be based on data-aided LMS or blind CMA. The tracking performance of the LMS is compared to the CMA in Fig. 10. In this evaluation, a 112-Gb/s PolMux-16QAM signal with a parallelization degree of $p = 64$ and a transmitter and receiver laser bandwidth of 100 kHz were assumed. It is evident that the CMA slightly degrades in tracking mode and is outperformed by the LMS. However, the performance of the LMS can be reached if the CMA is switched to the MMA after initial convergence. For higher laser phase noise, the LMS will deteriorate due to the additionally required feedback carrier recovery, which degrades the performance in the parallelized implementation. In this case, the feedback carrier recovery was performed using a second-order phase-locked loop (PLL). Due to the high degree of parallelization, the phase estimate of the PLL was only approximate, but sufficient for the decision-directed LMS. A precise carrier phase recovery was performed after the equalizer. Furthermore, it is evident that the LMS and MMA are more susceptible to noise in the low-SNR region.

A frequency-domain implementation of the LMS (FLMS) can in principle reduce the overhead of data-aided systems with an MMSE equalizer. After transmitting a short sequence, initial equalizer taps can be computed using MMSE, while the final convergence and tracking are performed using the FLMS. Although the FLMS is sometimes proposed in the fiber-optic literature, the implementation of the algorithm practically fails due to the required parallelization in the high-speed optical receiver. As shown in Fig. 6, the feedback loop of the FLMS includes two FFTs in addition to the required carrier phase recovery. An FFT introduces a high processing delay, which effectively prohibits fast equalizer tracking. Fig. 11 analyzes the tracking performance of the FLMS for an FFT size $N = 256$ for several degrees of parallelization.

The laser bandwidth is set to 100 kHz for both the transmitter and the receiver laser. It is apparent that the FLMS is not suited for fiber-optic receivers, as its performance degrades with the parallelization degree. MMSE is therefore the best solution for data-aided FDE.

Besides the speed of the adaptive algorithm, its complexity is the most important criterion in high-speed receivers. In the following, a distinction is made between the complexity of the update algorithm and the complexity of the equalization process itself. Fig. 12 compares the complexity of the filter update for all presented update algorithms. Where applicable, a parallelization degree of $p = 64$ is assumed. It is evident that most of the complexity in blind receivers comes from the BSS algorithm, where a simplified version of the Paulraj and Papadias algorithm, optimized for the 2 × 2 MIMO channel, was assumed. The update complexity of the CMA and LMS can be minimized using the signum-update algorithm. If the LMS and the CMA updates are used without the signum update, the complexity of the update is identical to the TDE filtering complexity.
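
The saving offered by a signum update can be seen by comparing one tap update with and without it. The sketch below assumes the sign-error variant, in which the real and imaginary parts of the error are each quantized to their sign; the source does not specify which signum variant is meant, and the function names and update convention $w \leftarrow w + \mu\, e\, x^{*}$ are assumptions.

```python
import numpy as np


def csign(z):
    """Component-wise signum of complex values: sign(Re) + 1j * sign(Im)."""
    z = np.asarray(z)
    return np.sign(z.real) + 1j * np.sign(z.imag)


def lms_update(w, x, e, mu):
    """Conventional LMS tap update: one full complex multiplication per tap."""
    return w + mu * e * np.conj(x)


def sign_error_lms_update(w, x, e, mu):
    """Sign-error LMS tap update.

    The quantized error has components in {-1, 0, +1}, so its product with
    conj(x) reduces to sign flips and additions in hardware.
    """
    return w + mu * csign(e) * np.conj(x)


if __name__ == "__main__":
    w = np.zeros(2, dtype=complex)
    x = np.array([1 + 1j, 2 - 1j])
    e = 0.3 - 0.7j
    print(lms_update(w, x, e, mu=0.5))
    print(sign_error_lms_update(w, x, e, mu=0.5))
```

If $\mu$ is additionally restricted to a power of two, the remaining scaling becomes a bit shift, which is a common design choice in fixed-point implementations.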

Data-aided TDE using signum-LMS requires only minimal update complexity compared to blind TDE. The MMSE equalizer solution in combination with TDE as shown in (27) is the most complex algorithm due to the required matrix inversion and not a viable solution for high-speed optical receivers. Using the MMSE can be advantageous if FDE is employed. Here, the required matrix inversion has low complexity as written in (45). Thus, data-aided TDE-LMS and FDE-MMSE are among the least complex solutions in terms of their filter update algorithms.
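
The reason the inversion is cheap in the frequency domain is that the channel decouples into one 2 × 2 matrix per frequency bin. The sketch below shows the textbook per-bin MMSE solution $W[k] = H[k]^{H}\,(H[k]H[k]^{H} + \sigma^{2} I)^{-1}$ under the assumption of unit signal power; it is meant as an illustration of the structure and should not be read as a reproduction of (45). The function name and array layout are assumptions.

```python
import numpy as np


def mmse_fde_taps(H, noise_var):
    """Per-bin MMSE equalizer taps for a 2x2 (dual-polarization) channel.

    H:         array of shape (n_bins, 2, 2), one channel matrix per bin.
    noise_var: noise variance relative to unit signal power (assumption).
    """
    H = np.asarray(H, dtype=complex)
    Hh = np.conj(np.swapaxes(H, -1, -2))      # Hermitian transpose per bin
    gram = H @ Hh + noise_var * np.eye(2)     # n_bins independent 2x2 matrices
    return Hh @ np.linalg.inv(gram)           # batched 2x2 inversions


if __name__ == "__main__":
    # Hypothetical channel: a polarization rotation with a bin-dependent phase.
    theta = np.linspace(0.0, 1.0, 8)
    rot = np.array([[[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]] for t in theta])
    H = rot * np.exp(1j * theta)[:, None, None]
    W = mmse_fde_taps(H, noise_var=0.0)
    print(np.allclose(W @ H, np.eye(2)))
```

Each bin requires only a 2 × 2 inversion, which has a closed form, so the cost grows linearly with the FFT size instead of with the cube of the filter length.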

Filtering itself can be performed either in the time domain or in the frequency domain. Here, the optimum solution depends on the filter length. For short filters, TDE has a lower complexity than FDE, whereas FDE clearly outperforms TDE for long filters [17], [39]. Fig. 13 shows a comparison of TDE and FDE for parallelized receivers. The minimum size of the FFT depends on the channel length, parallelization degree, and overlap in the processing. Therefore, the variable complexity of the TDE is compared with FDE with static FFT sizes. The superiority of FDE is apparent for any combination of parameters.
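
The crossover between TDE and FDE as a function of filter length can be estimated with a simple operation count. The sketch below is a rough model and not the complexity measure used in Fig. 13: it assumes a single complex FIR filter, overlap-save processing, and $(N/2)\log_2 N$ complex multiplications per radix-2 FFT, and it ignores the 2 × 2 polarization structure and parallelization constraints. The function names are hypothetical.

```python
def tde_mults_per_sample(n_taps):
    """Complex multiplications per output sample of a direct-form FIR filter."""
    return n_taps


def fde_mults_per_sample(fft_size, n_taps):
    """Complex multiplications per output sample for overlap-save FDE.

    Assumes radix-2 FFTs costing (N/2)*log2(N) multiplications each.
    """
    if fft_size <= 0 or fft_size & (fft_size - 1):
        raise ValueError("fft_size must be a power of two")
    if n_taps > fft_size:
        raise ValueError("filter must fit into one FFT block")
    log2_n = fft_size.bit_length() - 1
    fft_cost = (fft_size // 2) * log2_n
    block_cost = 2 * fft_cost + fft_size       # FFT + IFFT + per-bin multiply
    useful_outputs = fft_size - n_taps + 1     # samples kept per block
    return block_cost / useful_outputs


if __name__ == "__main__":
    for taps in (4, 16, 64, 128):
        print(taps, tde_mults_per_sample(taps), round(fde_mults_per_sample(256, taps), 2))
```

Under this simplified count with a 256-point FFT, a 4-tap filter is cheaper in the time domain, while a 128-tap filter is several times cheaper in the frequency domain.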

SECTION VII

The paper covers the most important data-aided equalization and BE algorithms for SC coherent optical systems. In general, blind algorithms exhibit slower adaptation, worse performance at low SNR, and more instability than data-aided methods. Blind receivers based on CMA usually have higher complexity due to the required separation of the two polarizations, which is especially challenging in the presence of high PDL. TDE with data-aided adaptation can be realized with low complexity based on the LMS algorithm. In terms of tracking, blind receivers perform approximately identically if the LMS or the MMA is used. However, the tracking performance will suffer once the order of the modulation format is further increased.

FDE in combination with LMS can be ruled out for implementation due to the high delays in the feedback path and the consequently worst tracking performance. On the other hand, FDE with MMSE equalization presents the fastest and one of the least complex solutions. The implementation of SC-FDE with MMSE does not require feedback loops like the TDE with LMS and is therefore more suitable for a feedforward implementation in high-speed optical receivers. Data-aided receivers with MMSE using TDE or FDE can be realized with overheads in the range of 3–5%, assuming prior CD compensation.

The authors would like to thank D. van den Borne, S. Jansen, G. Grosso, T. Wuth, O. Adamczyk, and S. Spälter for the valuable discussions and their tireless support.
