# Reduction of Loop Delay for Digital Symbol Timing Recovery Systems Using Asynchronous Equalization

## Abstract

Timing recovery loops with low loop delay are desirable. This paper presents receiver architectures that use asynchronous equalization to reduce the loop delay. We propose a new asynchronous delayed least-mean-square (AD-LMS) adaptation algorithm together with an interaction-free loop to eliminate the interaction between the timing and equalization loops. In addition, a timing recovery scheme that reduces the timing jitter is developed. The proposed architecture is applicable to 10GBASE-T systems. Simulation results show that the conventional approach suffers from loop interaction and that the proposed method eliminates this issue. Moreover, our approach exhibits high phase margin, low gain peaking, and low jitter.

## I. Introduction

All-digital approaches have drawn considerable attention in receiver design for high-speed communication systems [1]. In general, high-level modulation schemes are preferred for high-speed data transmission. Therefore, the loop delay of the timing loop must be reduced to meet the low-jitter requirement of a symbol timing recovery system.

Fig. 1 shows a simplified baseband block diagram of a conventional receiver. The received signal $r(t)$ is applied to an analog low-pass filter (LPF) to suppress out-of-band noise and limit the bandwidth of the signal of interest. The filtered signal is then sampled by an analog-to-digital converter (ADC) that operates at a free-running frequency $1/T_i$. In general, the sampling rate $1/T_i$ differs from the baud rate $1/T$. Therefore, a sampling-rate converter (SRC) is used to convert the asynchronous samples into synchronous ones. This can be done by an interpolator together with a symbol timing recovery (STR) loop [2]. The STR loop includes a timing error detector (TED), a loop filter (LF), and a number-controlled oscillator based (NCO-based) controller. According to the averaged timing error information provided by the TED, this controller determines when to interpolate a baud-spaced output sample so that the back-end digital signal processing (DSP) blocks can work properly. The DSP block, which includes a feed-forward equalizer (FFE), a feedback equalizer (FBE), and a slicer, equalizes the channel and makes decisions. In general, the FFE and FBE are adaptive filters so that they can track channel variation. One of the prevalent adaptation algorithms is the least-mean-square (LMS) algorithm [3].

However, this conventional receiver architecture has two design issues. The first issue is loop delay. Since the FFE is included in the timing loop, the delay of the timing loop is increased, which induces larger phase noise [4], [5]. The second issue is interaction. There are two coupled loops in the system model: the timing loop and the equalization loop. The adaptation of the FFE changes its phase response and therefore changes the equivalent sampling phase. This adaptation tries not only to equalize the channel but also to compensate the sampling phase error so that the signal-to-noise ratio (SNR) at the slicer input is maximized. At the same time, the NCO-based controller varies its control signals in the timing loop to eliminate the inter-symbol interference (ISI). Such interaction may cause an unpredictable result.

Fig. 1. A simplified baseband block diagram for digital receivers.

Bergmans et al. proposed a simple solution to reduce the loop delay in which the FFE is moved out of the STR loop and adapted asynchronously [6], [7]. However, they assumed that the SRC could obtain the timing error information perfectly and did not take the interaction issue into consideration. Daecke and Schenk proposed using the equalizer coefficients to estimate the phase offset and feeding this estimate to the timing loop to cancel its effects [8]. However, the accuracy of this equalization-based timing error estimation depends on the adaptation algorithm, and the optimal equalizer coefficients are assumed to be known in advance; otherwise, the timing error estimate is biased. Staszewski et al. proposed a constrained LMS algorithm that improves the stability of the STR loop [9]. However, the constraint compromises the convergence of the LMS algorithm. Gysel and Gilg proposed a timing recovery scheme in which the STR and equalization loops first converge independently and the FFE coefficients are then fixed, so that there is no interaction between the two loops [10].

In this paper, we propose a new STR architecture in which the loop delay of the timing loop is reduced and the FFE coefficients are updated asynchronously. We also apply this architecture to 10GBASE-T systems to demonstrate its feasibility.

## II. Proposed Reduced Loop Delay Approach

As shown in Fig. 2, we propose a new architecture with reduced loop delay. Note that the FFE is moved out of the timing loop, and we adopt a modified delayed LMS algorithm [11] together with an inverse SRC (ISRC) to continuously update the FFE coefficients. Moreover, we use an extra interaction-free path together with a timing recovery scheme to avoid interaction between the timing and equalization loops. The details are described as follows.

Fig. 2. The proposed architecture with reduced loop delay.

### A. The Design of SRC and ISRC

Fig. 3 shows the block diagram of the SRC and ISRC. For the SRC, the sample selector is controlled by the NCO-based controller through $m_n T_i$ with $m_n = \lfloor nT/T_i \rfloor$, where $\lfloor\cdot\rfloor$ denotes the floor function. The coefficients of the sinc interpolator are likewise determined by the NCO-based controller through $\phi_n T_i$ with $\phi_n = nT/T_i - m_n$, and can be expressed as

$$c_k^{\phi_n} = \left.\frac{\sin(\pi t/T_i)}{\pi t/T_i}\right|_{t=\phi_n T_i} \tag{1}$$

For the ISRC, the sample selector is controlled by $m_k T$ with $m_k = \lfloor kT_i/T \rfloor$. The coefficients of the sinc interpolator are determined by $\phi_k T$ with $\phi_k = kT_i/T - m_k$, and can be expressed as

$$c_n^{\phi_k} = \left.\frac{\sin(\pi t/T)}{\pi t/T}\right|_{t=\phi_k T} \tag{2}$$

Fortunately, since the structures of the SRC and ISRC are "symmetric", we can simply set $\phi_n T_i = \operatorname{mod}\{-\phi_k T, 1\}$, where $\operatorname{mod}\{\cdot, 1\}$ denotes the modulo-1 operation, and make the sample selector of the ISRC act opposite to that of the SRC. In other words, when the SRC interpolates one extra sample, the ISRC skips a sample at that time. Fig. 4 illustrates this for $T_i/T = 2/3$. The interpolator produces one extra sample when $\phi = 0$ and no sample when $\phi = 1$; otherwise, it produces one sample at a time. This implementation approach reduces the hardware cost. Throughout this paper, the time indices $n$ and $k$ denote signals sampled at equivalent rates of $1/T$ and $1/T_i$, respectively.
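The selector index and fractional phase defined above are simply the integer and fractional parts of the output instant measured on the input sampling grid. The following minimal sketch tabulates them for the ratio $T_i/T = 2/3$ assumed for Fig. 4. The function name `selector_and_phase` and the use of exact fractions are illustrative assumptions, not part of the original design, and the sketch covers only the index arithmetic, not the sinc interpolation itself.

```python
from fractions import Fraction
from math import floor


def selector_and_phase(index, ratio):
    """Return (m, phi) with m = floor(index * ratio) and phi = index * ratio - m.

    Hypothetical helper: `ratio` is T/T_i for the SRC and T_i/T for the ISRC.
    """
    position = index * ratio      # output instant measured in input samples
    m = floor(position)           # sample-selector index
    return m, position - m        # fractional phase in [0, 1)


ti_over_t = Fraction(2, 3)        # T_i / T, the value assumed for Fig. 4

# SRC: output index n (rate 1/T) located on the T_i-spaced input grid.
src = [selector_and_phase(n, 1 / ti_over_t) for n in range(4)]

# ISRC: output index k (rate 1/T_i) located on the T-spaced input grid.
isrc = [selector_and_phase(k, ti_over_t) for k in range(4)]

for n, (m, phi) in enumerate(src):
    print(f"SRC  n={n}: m_n={m}, phi_n={phi}")
for k, (m, phi) in enumerate(isrc):
    print(f"ISRC k={k}: m_k={m}, phi_k={phi}")
```

With this ratio the SRC phase alternates between 0 and 1/2, while the ISRC phase cycles through 0, 2/3, and 1/3.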

Fig. 3. The block diagram of the SRC and ISRC.
Fig. 4. Timing relation between the SRC and the ISRC.

### B. Design of Interaction-Free Loops

The key to solving the interaction issue is to decouple the timing loop from the equalization loop. The simplest way to do this is to use the signal before the FFE to estimate the timing error. However, the TED needs synchronous data to estimate the timing error. Therefore, extra delay-matching and SRC elements are required. We delay the input of the SRC by $D_F$ symbols ($T_i$-spaced), where $D_F$ is determined by the delay of the FFE. The delayed input is then applied to the extra SRC so that its outputs $x_T[n]$ are equivalent to the $T$-spaced symbols.

After decoupling the timing loop from the equalization loop, we can correctly train the coefficients of the FFE. Note that the tap inputs and error signals used in an adaptation algorithm must be aligned in sampling rate; otherwise, the algorithm fails. We adopt the ISRC to translate the $T$-spaced error signals $e_T[n]$ into the $T_i$-spaced version $e_{T_i}[k]$. In addition, we propose an asynchronous delayed LMS (AD-LMS) algorithm that takes into account the delays of the FFE ($D_F$), the SRC ($D_S$), and the ISRC ($D_I$). In general, the delays of the SRC and ISRC are not integers. Because both the SRC and the ISRC are controlled by the same parameter $\phi$ with opposite sign, their fractional delays cancel in the sum, and the delays can be expressed as

$$\begin{aligned} D_S &= D_{S_I} + D_{S_F} \\ D_I &= D_{S_I} - D_{S_F} \end{aligned} \tag{3}$$

where $D_{S_I}$ and $D_{S_F}$ denote the integer and fractional parts of the SRC delay, respectively. The AD-LMS algorithm can be expressed as

$$\begin{aligned} \mathbf{w}[k+1] &= \mathbf{w}[k] + \mu_1 e_{T_i}[k-D]\,\mathbf{x}[k-D] \\ \text{with}\quad e_{T_i}[k] &= \mathrm{ISRC}\{\hat{a}[n] - \tilde{a}[n]\} \\ \tilde{a}[n] &= \mathrm{SRC}\{\mathbf{x}^T[k]\,\mathbf{w}[k]\} - b[n - D_F - D_{S_I}], \end{aligned} \tag{4}$$

where $\mathbf{w}$ is the $N_F \times 1$ coefficient vector of the FFE, $\mathbf{x}$ is the $N_F \times 1$ tap-input vector of the FFE, $e_{T_i}$ is the asynchronous error signal, and $\mu_1$ is a step-size parameter; $\mathrm{SRC}\{\cdot\}$ and $\mathrm{ISRC}\{\cdot\}$ denote the SRC and ISRC operations, respectively; $D = D_F + D_S + D_I = D_F + 2D_{S_I}$ denotes the delay parameter; and $b[n]$ is the output of the FBE.
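The coefficient update in (4) has the form of a delayed LMS recursion: the error and the tap-input vector are both taken $D$ samples in the past. The sketch below shows only that recursion in a deliberately simplified setting, assumed for illustration: a single sampling rate (no SRC or ISRC), no FBE, an integer delay, and a noise-free system-identification task with a hypothetical 3-tap target `h`. It is not the AD-LMS algorithm itself, and all names and parameter values are assumptions.

```python
import random


def delayed_lms(x, d, num_taps, mu, delay):
    """Delayed LMS: w[k+1] = w[k] + mu * e[k-D] * x[k-D] (integer delay D)."""
    w = [0.0] * num_taps
    pending = []  # (error, tap-input vector) pairs waiting out the delay
    for k in range(num_taps - 1, len(x)):
        taps = [x[k - i] for i in range(num_taps)]       # tap-input vector x[k]
        y = sum(wi * xi for wi, xi in zip(w, taps))      # filter output
        pending.append((d[k] - y, taps))                 # store e[k] and x[k]
        if len(pending) > delay:
            e_old, taps_old = pending.pop(0)             # e[k-D], x[k-D]
            w = [wi + mu * e_old * xi for wi, xi in zip(w, taps_old)]
    return w


# Hypothetical test signal: binary symbols through a short FIR response h.
rng = random.Random(0)
h = [0.8, -0.3, 0.1]
x = [rng.choice((-1.0, 1.0)) for _ in range(6000)]
d = [sum(h[i] * x[k - i] for i in range(len(h)) if k - i >= 0)
     for k in range(len(x))]

w = delayed_lms(x, d, num_taps=3, mu=0.01, delay=8)
print([round(wi, 4) for wi in w])
```

Pairing the delayed error with the equally delayed tap inputs is what keeps the gradient estimate consistent; the step size generally has to shrink as the delay grows for the recursion to remain stable.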

The FBE coefficients are adaptively updated according to a conventional LMS algorithm, which can be expressed as

$$\begin{aligned} \mathbf{m}[n+1] &= \mathbf{m}[n] + \mu_2 e_T[n]\,\mathbf{y}[n] \\ \text{with}\quad e_T[n] &= \hat{a}[n] - \tilde{a}[n], \end{aligned} \tag{5}$$

where $\mathbf{m}$ is the $N_B \times 1$ coefficient vector of the FBE, $\mathbf{y}$ is the $N_B \times 1$ tap-input vector of the FBE, $e_T$ is the synchronous error signal, and $\mu_2$ is a step-size parameter.

### D. Timing Recovery Scheme

We extend the timing recovery scheme proposed in [10] by using two types of TED, the mMM-TED [12], [13] and the B-TED [14], in different timing phases.

In timing phase I, we exploit the extra interaction-free path so that the mMM-TED can estimate the timing error information $\chi$ by the following equation:

$$\begin{aligned} \chi[n] &= \tilde{a}[n]\,x_T[n-1] - \tilde{a}[n-1]\,x_T[n] \\ &\quad + \frac{1}{2}\left(\tilde{a}[n]\,x_T[n-2] - \tilde{a}[n-2]\,x_T[n]\right). \end{aligned} \tag{6}$$

At the end of this phase, the FFE is well trained.
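As a direct transcription of (6), the following sketch evaluates the phase I timing error for one symbol index. The function name `mmm_ted`, the list-based signal representation, and the sample values are assumptions made for illustration.

```python
def mmm_ted(a_tilde, x_t, n):
    """Timing error chi[n] of (6); needs samples n-2, n-1 and n (n >= 2)."""
    first = a_tilde[n] * x_t[n - 1] - a_tilde[n - 1] * x_t[n]
    second = a_tilde[n] * x_t[n - 2] - a_tilde[n - 2] * x_t[n]
    return first + 0.5 * second


# Hypothetical sample values, chosen only to exercise the formula.
a_tilde = [1.0, -1.0, 1.0]
x_t = [0.5, 0.2, -0.3]
chi = mmm_ted(a_tilde, x_t, 2)
print(chi)
```

Each bracketed term in (6) is antisymmetric in its two sequences, so the estimate is exactly zero when both inputs are the same sequence.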

In timing phase II, the B-TED exploits the detection output and postcursor ISI to estimate the timing error information $\chi$ by the following equation:

$$\begin{aligned} \chi[n] &= -e_T[n-1]\left(d[n] - d[n-2]\right) \\ \text{with}\quad d[n] &= b[n] + \hat{a}[n]. \end{aligned} \tag{7}$$

In this phase, the input signal of the TED is taken after the FFE to obtain a cleaner signal, and hence the timing jitter is reduced. Note that the FFE coefficients are fixed during this phase to avoid interaction.
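Equation (7) can be transcribed in the same way. In this sketch the function name `b_ted` and the sample values are assumptions; `e_t`, `b`, and `a_hat` stand for the synchronous error, the FBE output, and the slicer decisions, respectively.

```python
def b_ted(e_t, b, a_hat, n):
    """Timing error chi[n] of (7), with d[n] = b[n] + a_hat[n]; needs n >= 2."""
    d = [bi + ai for bi, ai in zip(b, a_hat)]
    return -e_t[n - 1] * (d[n] - d[n - 2])


# Hypothetical sample values, chosen only to exercise the formula.
e_t = [0.0, 0.1, 0.0]
b = [0.1, 0.0, -0.1]
a_hat = [1.0, -1.0, -1.0]
chi_b = b_ted(e_t, b, a_hat, 2)
print(chi_b)
```

The estimate vanishes whenever the previous error sample $e_T[n-1]$ is zero, regardless of the decisions.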

## III. Simulation Results With 10GBASE-T Application

We apply the proposed architecture with reduced loop delay to 10GBASE-T systems [15], which support a non-loop-timed STR mode. The channel model of the insertion loss (IL) is taken from [16]. The baud rate is 800 Mbaud. We assume that the sampling frequency offset (SFO) varies from 50 to 2000 parts per million (ppm) and that both the SRC and the ISRC are sinc interpolators with a tap length of 30. The received SNR is 33 dB, $N_F = 75$, and $N_B = 16$.

According to the Bode stability criterion ([17], chap. 4), a stable control loop requires a positive phase margin. Moreover, a low phase margin causes peaking in the closed-loop gain near the unity-gain frequency. This peaking amplifies the noise in that frequency range and thus increases the total output noise. The phase margin and gain peaking of the conventional and proposed approaches are compared in Fig. 5. Our approach has a higher phase margin and lower gain peaking than the conventional approach, particularly when the SFO is high.
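The link between loop delay and phase margin can be illustrated with a toy model that is not taken from the paper: a first-order discrete-time loop with open-loop response $L(z) = K z^{-(D+1)}/(1 - z^{-1})$, where $K$ is the loop gain and $D$ is the extra delay in samples. For this assumed model the unity-gain frequency is $\omega_c = 2\arcsin(K/2)$ and the phase margin is $\pi/2 - (D + 1/2)\,\omega_c$, so every additional sample of delay costs $\omega_c$ radians of margin. The gain and delay values below are hypothetical.

```python
import math


def phase_margin_deg(loop_gain, delay):
    """Phase margin (degrees) of L(z) = K * z**-(D+1) / (1 - z**-1).

    Assumed toy model of a first-order loop; requires 0 < K <= 2.
    """
    # |L(e^{jw})| = K / (2 sin(w/2)) = 1  ->  w_c = 2 asin(K/2)
    w_c = 2.0 * math.asin(loop_gain / 2.0)
    # arg L(e^{jw}) = -pi/2 - (D + 1/2) w, and PM = pi + arg L at w = w_c
    return math.degrees(math.pi / 2.0 - (delay + 0.5) * w_c)


for extra_delay in (0, 10, 40, 80):
    pm = phase_margin_deg(0.01, extra_delay)
    print(f"D = {extra_delay:3d} samples: phase margin = {pm:.2f} deg")
```

In this model the margin falls linearly with the delay, which is consistent with the motivation for moving a long FFE out of the timing loop.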

Fig. 5. The comparisons of (a) phase margin and (b) gain peaking.

The loop behavior during timing phase I is compared in Fig. 6. The interaction-free path successfully decouples the timing and equalization loops. In contrast, in the conventional approach the two loops interfere with each other, and neither loop converges.

Fig. 6. LF output comparisons during timing phase I. The timing loop starts at the beginning and the equalization loop starts at time = $0.4 \times 10^5$ (SFO = 500 ppm).

The LF output of the proposed architecture during timing phase I and timing phase II is compared in Fig. 7. As expected, the variation of the LF output during timing phase II is roughly 13.73 dB less than that during timing phase I. Therefore, the timing jitter is reduced. The LF output settles at approximately $-5 \times 10^{-4}$, i.e., a magnitude of 500 ppm, at the end of timing phase II, so the SFO is correctly compensated.

Fig. 7. LF output comparisons for the proposed architecture during different timing phases (SFO = 500 ppm).

## IV. Conclusions and Future Work

In this paper, we have presented an all-digital receiver architecture that reduces the timing loop delay, and we have proposed a corresponding timing recovery scheme that reduces the timing jitter. The price of the reduced loop delay is a more complicated adaptation algorithm, AD-LMS. The novelty of the proposed architecture is the extra interaction-free path, which prevents the timing loop and the equalization loop from interfering with each other. In addition, our timing recovery scheme further reduces the variation of the LF output by roughly 13.73 dB, at the cost of fixing the FFE coefficients. Future work will focus on the coordination between the timing and equalization loops so that the all-digital receiver can combat channel variation by adaptively updating the FFE coefficients.

### Acknowledgment

The authors would like to acknowledge the financial support provided by the National Science Council (NSC), R.O.C., under Grant NSC96-2220-E-002-008.

## Footnotes

Ying-Ren Chien, Chu-Yun Lin, and Hen-Wai Tsao are with the Integrated System Lab, Graduate Institute of Communication Engineering, National Taiwan University (e-mail: curtis.chien@gmail.com, nmmm233@gmail.com, tsaohw@cc.ee.ntu.edu.tw).

## References

1. J. T. Kim, "Efficient implementation of polynomial interpolation filters or full digital receivers," IEEE Trans. Consum. Electron., vol. 51, pp. 175–178, Feb. 2005.

2. F. M. Gardner, "Interpolation in digital modems—Part I: Fundamentals," IEEE Trans. Commun., vol. 41, pp. 501–507, Mar. 1993.

3. S. Haykin, Adaptive Filter Theory, 4th ed., Prentice-Hall, 2001.

4. A. Spalvieri and M. Magarini, "Wiener's analysis of the discrete-time phase-locked loop with loop delay," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 55, pp. 596–600, Jun. 2008.

5. J. W. M. Bergmans, "Effect of loop delay on phase margin of first-order and second-order control loops," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 52, no. 10, pp. 621–625, Oct. 2005.

6. J. Bergmans, H. Pozidis, and M. Lin, Euro. Trans. Telecomms, vol. 16, no. 6, pp. 545–556, 2005.

7. J. Bergmans, M. Y. Lin, D. Modrie, and R. Otte, Signal Processing, vol. 85, no. 7, pp. 1301–1313, 2005.

8. D. Daecke and H. Schenk, "Solving the interaction problem of timing synchronization and equalization," Int. Zurich Seminar on Communications (IZS), vol. 1, pp. 52–55, Mar. 2008.

9. R. B. Staszewski, K. Muhammad, and P. T. Balsara, "A constrained asymmetry LMS algorithm for PRML disk drive read channels," IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 48, no. 8, pp. 793–798, Aug. 2001.

10. P. Gysel and D. Gilg, "Timing recovery in high bit-rate transmission systems over copper pairs," IEEE Trans. Commun., vol. 46, no. 12, pp. 1583–1586, Dec. 1998.

11. G. Long, F. Ling, and J. G. Proakis, "The LMS algorithm with delayed coefficient adaptation," IEEE Trans. Acoust., Speech, Signal Process., vol. 37, no. 9, pp. 1397–1405, Sep. 1989.

12. K. H. Mueller and M. Müller, "Timing recovery in digital synchronous data receivers," IEEE Trans. Commun., vol. 24, no. 5, pp. 516–531, May 1976.

13. R. Huss, M. Mullen, C. T. Gray, R. Smith, M. Summers, J. Shafer, P. Heron, T. Sawinska, and J. Medero, "A DSP based 10BaseT/100BaseTX Ethernet transceiver in a 1.8 V, 0.18 µm CMOS technology," International Conference on Custom Integrated Circuits, pp. 135–138, 2001.

14. J. W. M. Bergmans and H. Wong-Lam, "A class of data-aided timing-recovery schemes," IEEE Trans. Commun., vol. 43, no. 2/3/4, pp. 1819–1827, Feb./Mar./Apr. 1995.

15. IEEE Std 802.3an, Clause 55, Annex 55A and Annex 55B, Physical Layer and Management Parameters for 10 Gb/s Operation, Type 10GBASE-T, 2006.

16. IEEE 802.3an 10GBASE-T Study Group. [Online]. Available: http://www.ieee802.org/3/an/public/material/index.html

17. F. M. Gardner, Phaselock Techniques, 3rd ed., Wiley-Interscience, 2005.

## Keywords

### INSPEC: Controlled Indexing

timing jitter, delays, digital communication, equalisers, least mean squares methods, synchronisation

This paper appears in: International Symposium on Circuits and Systems, 2009, pp. 193–196. Print ISBN: 978-1-4244-3827-3. INSPEC Accession Number: 10760419. Digital Object Identifier: 10.1109/ISCAS.2009.5117718. Date of Current Version: 26 June 2009.