Quadratic Auto-step Least Mean Square Equalization for High-data-rate IR-UWB Wireless Communication Systems

High-data-rate impulse radio ultra-wideband (IR-UWB) wireless communication systems suffer from serious intersymbol interference (ISI) in indoor multipath environments. This paper proposes an auto-step least mean square time-domain equalization algorithm based on a quadratic function (QA-LMS), which outperforms state-of-the-art adaptive step-size algorithms in overall convergence speed, steady-state error, signal-to-noise ratio (SNR) threshold, and robustness. The proposed algorithm does not need parameters preset according to channel conditions, and it converges rapidly in the mean square error (MSE) learning curve. To meet the bit error rate (BER) required by forward error correction (FEC) codes, the algorithm improves the SNR threshold of the traditional auto-step LMS (A-LMS) by 4.4 dB in CM1 and 3.6 dB in CM3 of the 802.15.3a channel model, respectively. The proposed algorithm is stable and robust in nonstationary environments thanks to a vectorial step-size based on the gradient of the estimated error and to high step detection.


I. INTRODUCTION
Impulse radio ultra-wideband (IR-UWB) systems require much lower power consumption than carrier-based communication systems, which is a great advantage for low-power sensors on body area networks (BAN) in medical information and communication technology (ICT) [1]. However, in high-data-rate communication systems, the serious intersymbol interference (ISI) issue is of great concern and needs to be mitigated by equalizers. Single-carrier frequency-domain equalization (SC-FDE) based on the minimum mean square error (MMSE) criterion is common [2]. However, time-domain equalization (TDE) is much more efficient in hardware complexity for short-range IR-UWB systems, where the channel delay is within tens of symbol intervals [3]. In the time domain, equalizers based on a training sequence can be classified into the Least Mean Square (LMS) family and the Recursive Least Squares (RLS) family. The Normalized LMS (NLMS) algorithm was proposed to remove the sensitivity of LMS to the input by normalizing the update by the power of the input signals [6]. The LMS family is more robust and less complex at the cost of slower convergence [5].
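As background for the comparisons that follow, the NLMS normalization mentioned above can be sketched in a few lines; the toy channel and constants here are illustrative, not taken from the paper.

```python
import numpy as np

def nlms_step(w, u, d, mu=0.5, eps=1e-8):
    """One NLMS tap update: the correction is normalized by the input
    power, removing LMS's sensitivity to the input signal scale."""
    e = d - np.dot(w, u)                      # estimation error
    w = w + (mu / (eps + np.dot(u, u))) * e * u
    return w, e

# Toy channel identification (illustrative values, not from the paper):
rng = np.random.default_rng(0)
h = np.array([1.0, 0.5, -0.2, 0.1])           # unknown system
w = np.zeros(4)
for _ in range(500):
    u = rng.standard_normal(4)
    w, e = nlms_step(w, u, np.dot(h, u))
```

In the noiseless case the taps converge to the unknown system; the `mu`/`eps` choices only trade convergence speed against gradient noise.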
Papers [11]-[13], [23], [24] proposed the set-membership adaptive filter (SMF), which uses an empirically defined or time-varying error bound to avoid useless updates. The SM-NLMS algorithm greatly improved the convergence speed with an empirically defined error bound in both the training stage and the decision-directed stage. The modified SM-NLMS algorithm (denoted SM-NLMS2) used a time-varying error bound based on the parameter-dependent bound (PIB), at the cost of estimating the noise power [24]. However, at low signal-to-noise ratio (SNR), the gradient of the cumulative error changes exponentially, which deteriorates the convergence speed of both SM-NLMS algorithms.
Papers [25], [26] proposed reduced-rank adaptive filters. The rank-reducing matrix in [25] was constructed from key features of the input data, such as the eigenvectors and eigenvalues of the input autocorrelation matrix. In [26], the SAABF-LMS and SAABF-RLS algorithms used a multibranch projection-vector structure to obtain the basis functions. This method reduces the number of adaptive parameters and improves the convergence speed. However, it is hard for a reduced-rank filter to achieve both fast convergence and low steady-state error at low SNR.
Papers [7]-[14] proposed variable step-size time-domain algorithms to address the slow convergence of the LMS algorithm. The modified VSS-LMS and VSS-NLMS algorithms reduced the steady-state error of VSS-LMS by modifying the linear relation between step-size and estimated error [7], [14], and the algorithm based on the hyperbolic tangent function (HTLMS) made full use of forward prediction and backward detection errors to update the step-size [8]. However, their convergence speed is too slow for high-data-rate applications (500 Mbps). The algorithm for underwater acoustic channels (UWAVLMS) improved the convergence speed with a new nonlinear relation between step-size and estimated error [9], [10], but its steady-state error is not low enough for this application. All of these algorithms use a scalar step-size and require parameter values to be specified manually in advance according to the channel conditions. The traditional auto-step LMS algorithm (A-LMS) solved this issue with normalized factors [15], [16]. As an incremental learning algorithm, A-LMS converges fast through its vectorial step-size and achieves excellent robustness through high step detection [16]. However, due to the gentle slope of the exponential function and the large step-size remaining after training, A-LMS is still sensitive to errors in the decision-directed stage, which causes a higher BER [15].
In this paper, a quadratic auto-step LMS equalization algorithm is proposed, which outperforms A-LMS and related algorithms in convergence speed, steady-state error, and robustness. The algorithm uses a quadratic function to update the step-size along the gradient direction of the estimated error, and it modifies the normalized factor to prevent the increment of the step-size from oscillating seriously. This paper is organized as follows. A review of the theory of the traditional A-LMS algorithm is presented in Section II. Section III elaborates the principle and derivation of QA-LMS. Section IV analyzes its stability. The implementation architecture of QA-LMS is presented in Section V. Section VI compares the convergence speed, steady-state error, and BER performance of the various algorithms. Finally, a summary is presented in Section VII.

II. AUTO-STEP LMS ALGORITHM
FIGURE 1: The system block diagram of the simulation.
The simulation system of the equalizer is shown in Fig. 1, in which second-order Gaussian pulses and on-off keying (OOK) modulation are adopted. The simulation system uses the 802.15.3a channel model to emulate the indoor multipath environment of IR-UWB applications [20]. A-LMS uses the normalized factor to avoid specifying the parameter manually in advance; it converges fast through a vectorial step-size based on the gradient of the estimated error and achieves better robustness through high step detection [16]. $\mathbf{u}(n)$ and $\hat{y}(n)$ are the input vector and the estimated output of the $n$th iteration. $d(n)$ is the expected response and $e(n)$ is the estimated error, calculated by
$$e(n) = d(n) - \hat{y}(n) = d(n) - \sum_{i=0}^{M-1}\hat{w}_i(n)\,u(n-i),$$
where $i$ ranges from 0 to $M-1$ and $M$ is the number of taps.
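As a concrete illustration of this front end, the following sketch generates an OOK-modulated train of second-order Gaussian pulses at the paper's 2 ns repetition period; the sampling rate and the shaping parameter `tau` are assumed values, not taken from the paper.

```python
import numpy as np

def gaussian_doublet(t, tau=0.5e-9):
    """Second derivative of a Gaussian (illustrative IR-UWB pulse shape;
    tau is an assumed shaping parameter, not a value from the paper)."""
    x = (t / tau) ** 2
    return (1.0 - 4.0 * np.pi * x) * np.exp(-2.0 * np.pi * x)

# OOK over a 2 ns pulse-repetition period (500 Mbps, as in the paper):
fs = 20e9                           # assumed sampling rate
Tp = 2e-9                           # pulse repetition period
spb = int(Tp * fs)                  # samples per bit
bits = np.array([1, 0, 1, 1, 0])
t = np.arange(spb) / fs - Tp / 2    # pulse centered in its slot
frame = np.concatenate([b * gaussian_doublet(t) for b in bits])
```

A '1' bit carries a pulse peaking at the slot center; a '0' bit transmits nothing, which is what makes OOK attractive for low-power IR-UWB transmitters.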
$\hat{w}_i(n)$ is the estimated tap coefficient, calculated by
$$\hat{w}_i(n+1) = \hat{w}_i(n) + \mu_i(n)\,e(n)\,u(n-i),$$
where $\boldsymbol{\mu}(n)$ is the vectorial step-size and $\mu_i(n)$ is the $i$th step-size, generated as $\mu_i(n) = e^{\beta_i(n)}$. The increment of $\beta_i$ uses the gradient of the estimated error to update the step-size along the gradient direction:
$$\beta_i(n+1) = \beta_i(n) + \alpha\,e(n)\,u(n-i)\,h_i(n), \quad (4)$$
where $\alpha$ is the meta-step-size [15]. Using the chain rule of calculus, the partial derivative of the squared error with respect to $\beta_i$ can be expanded through $\hat{w}_i$; taking $\hat{w}_i(n)$ as the center, giving a small increment change [15], and noting that the cross terms vanish when $i$ equals $j$, formula (4) is easy to derive [15]. The memory parameter $h_i$ records the recent changes of $\hat{w}_i$ [15]:
$$h_i(n+1) = h_i(n)\left[1 - \mu_i(n+1)\,u^2(n-i)\right]^{+} + \mu_i(n+1)\,e(n)\,u(n-i),$$
where $[x]^{+} = \max(x, 0)$. To reduce the sensitivity to the meta-step-size and avoid spurts of $e(n)u(n-i)h_i(n)$, the increment of $\beta_i$ is divided by the normalized factor [15]:
$$v_i(n+1) = \max\left\{\left|e(n)u(n-i)h_i(n)\right|,\; v_i(n) + \tfrac{1}{\tau}\,\mu_i(n)\,u^2(n-i)\left[\left|e(n)u(n-i)h_i(n)\right| - v_i(n)\right]\right\},$$
where $\tau$ is the forgetting factor. Finally, whenever an overshoot occurs in the sample cost function, the step-size is reduced so that it no longer overshoots: in high step detection, $\mu_i(n+1)$ is divided by [15]
$$M(n) = \max\left\{\sum_{i=0}^{M-1} \mu_i(n+1)\,u^2(n-i),\; 1\right\}.$$
The time-varying Wiener solution $\mathbf{w}_0$ with the minimum estimated error $e_0$ lies at the lowest position of the Mean Square Error (MSE) bowl shown in Fig. 2a. This MSE function is calculated from 3000 randomly generated inputs and outputs. The error function decreases fastest along the gradient direction at each time step [18]. The A-LMS algorithm uses the vectorial step-size to update the tap coefficients in each dimension of the gradient direction, which is a relatively superior strategy when the contour lines of the MSE surface are ellipses [15]. Especially when the number of taps is large, the convergence speed is much faster, as shown in Section VI. The increment of $\beta_i$ based on the gradient of the estimated error (formula (4)) helps the algorithm converge along the gradient direction instead of the spiral path shown in Fig. 2a [15]. High step detection further scales the step-size down so that the updated weight vector $\hat{\mathbf{w}}(n)$ does not overshoot the lowest position [16].
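The A-LMS update rules above can be sketched in a few lines. This is a reconstruction for illustration: the constants (`alpha`, `tau`, the initial step-size) and the toy identification setup are assumed values, not the paper's settings.

```python
import numpy as np

def alms_step(w, mu, h, v, u, d, alpha=0.01, tau=1e4):
    """One A-LMS (Autostep-style) iteration: per-tap step-sizes mu,
    memory trace h, running normalizer v, and high step detection."""
    e = d - np.dot(w, u)
    g = e * u                                   # per-tap error gradient
    gh = np.abs(g * h)
    # normalizer tracks the recent magnitude of the step-size increment
    v = np.maximum(gh, v + (1.0 / tau) * mu * u**2 * (gh - v))
    # exponential (multiplicative) step-size update along the gradient
    mu = mu * np.exp(alpha * g * h / np.where(v > 0.0, v, 1.0))
    # high step detection: rescale if the effective step would overshoot
    M = np.dot(mu, u**2)
    if M > 1.0:
        mu = mu / M
    w = w + mu * g                              # tap coefficient update
    h = h * np.clip(1.0 - mu * u**2, 0.0, None) + mu * g
    return w, mu, h, v, e

# Toy run: identify a 4-tap response from noiseless training data.
rng = np.random.default_rng(3)
target = np.array([0.9, -0.4, 0.3, 0.1])
w, mu = np.zeros(4), np.full(4, 0.01)
h, v = np.zeros(4), np.zeros(4)
for _ in range(3000):
    u = rng.standard_normal(4)
    w, mu, h, v, e = alms_step(w, mu, h, v, u, np.dot(target, u))
```

Note how the exponential keeps every per-tap step-size strictly positive, while high step detection caps the effective step so the weight vector cannot overshoot the minimum.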

III. QUADRATIC AUTO-STEP LMS ALGORITHM
Due to the quite gentle slope of the exponential function, the step-size is updated slowly by the feedback of estimated errors, so errors caused by abrupt channel changes are not processed in time. In addition, the step-size of A-LMS does not approach zero, because with the exponential function ($\mu = e^{\beta}$) this would require $\beta$ to approach $-\infty$ (see the curve of the exponential function in Fig. 2b). The large step-size of the exponential function after training leads to larger steady-state errors [15].
To solve this issue of A-LMS, this paper proposes an auto-step algorithm based on the quadratic function (QA-LMS). In Fig. 2b, the larger curvature of the quadratic function ensures a much faster update of the step-size in the training stage, and the step-size after training is much closer to zero, which makes QA-LMS more insensitive to errors caused by abrupt channel changes and the estimation of the taps much more stable in the steady state. An even-order power function ensures that the step-size is always positive.
However, there is no need to use power functions of order higher than two, such as a biquadratic function. After initializing the step-size to its upper limit of 0.1 [16], the slope of the quadratic function is larger at the initialization stage and smaller at the end of the training stage, according to the first derivatives of the quadratic, exponential, and biquadratic functions in formula (11). In addition, the slope of the square root function in formula (11) goes to infinity as its argument goes to zero, which would cause the step-size to oscillate.
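A small numerical illustration of this comparison (the sample points are arbitrary): the quadratic step vanishes as its argument shrinks toward zero, an exponential with a bounded argument stays bounded away from zero, and the slope of the square root blows up near zero.

```python
import numpy as np

# Candidate step-size generating functions f(x) as x shrinks after training:
x = np.array([0.3, 0.1, 0.01, 1e-3])
quad = x**2                    # quadratic: vanishes as x -> 0
expo = np.exp(x - 0.5)         # exponential with a bounded argument:
                               # stays bounded away from zero
sqrt_deriv = 0.5 / np.sqrt(x)  # slope of sqrt(x): blows up near zero,
                               # which would make the step-size oscillate
```

This mirrors the argument above: only the even-power (quadratic) choice both stays positive and drives the step-size toward zero in the steady state.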
From the analysis above, the quadratic function has advantages considering both performance and implementation complexity. The QA-LMS algorithm therefore replaces the exponential function with the quadratic function for better control of the step-size:
$$\mu_i(n) = \beta_i^2(n). \quad (12)$$
In this method, the high-slope range may cause oscillations at the beginning of the training stage, where the estimated error fluctuates wildly. The QA-LMS algorithm modifies the normalizing factor by taking the product of the normalized factor and the first-order power of the input vector (the input vector is used here for the dimensional relationship), so that the increment of the step-size after normalization does not oscillate seriously. Furthermore, it can be modified by adding a bias to the feedback error, like the bias compensation of the estimated error in [17], through which the feedback effect of the estimated error is enhanced (replace $e(n)$ in formulas (5) and (9) with the biased error in formula (14)).
For OOK modulation, a bias of 0.5 often achieves better performance: 0.5 equals the decision threshold, which enhances the sensitivity to the error on the premise that the output signal is below the decision threshold. The process of QA-LMS is shown in Algorithm 1.
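A sketch of one QA-LMS-style iteration, reconstructed from the description above rather than from the authors' reference code; the normalizer form and all constants here are assumptions.

```python
import numpy as np

def qalms_step(w, x, h, v, u, d, alpha=2**-6, tau=5e-3):
    """One QA-LMS-style iteration (a sketch, not reference code).
    The per-tap step-size is x_i**2, so it stays positive and
    approaches zero after training; x_i moves along the gradient
    of the squared error, as in A-LMS."""
    e = d - np.dot(w, u)
    g = e * u                                    # per-tap error gradient
    gh = np.abs(g * h)
    # normalizer (assumed form, following the A-LMS normalized factor)
    v = np.maximum(gh, v + tau * (gh - v))
    x = x + alpha * g * h / np.where(v > 0.0, v, 1.0)
    mu = x**2                                    # quadratic step-size
    # high step detection, as in A-LMS
    M = np.dot(mu, u**2)
    if M > 1.0:
        mu = mu / M
        x = np.sqrt(mu)        # keep mu = x**2 consistent (sketch simplification)
    w = w + mu * g
    h = h * np.clip(1.0 - mu * u**2, 0.0, None) + mu * g
    return w, x, h, v, e

# Toy run: identify a 4-tap response from noiseless training data.
rng = np.random.default_rng(4)
target = np.array([0.7, 0.2, -0.5, 0.1])
w, x = np.zeros(4), np.full(4, 0.1)              # initial step-size 0.1**2
h, v = np.zeros(4), np.zeros(4)
for _ in range(3000):
    u = rng.standard_normal(4)
    w, x, h, v, e = qalms_step(w, x, h, v, u, np.dot(target, u))
```

The OOK decision-threshold bias of 0.5 mentioned above is omitted here, since the paper's exact biased-error form (formula (14)) is not reproduced in this excerpt.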

Algorithm 1 Quadratic Auto-step LMS
Initialize:
The following part proves that the gradient of the estimated error is optimal for estimating the Wiener solution in the steady state, while that of SM-NLMS is not. The expectation of the error gradient vector equals zero in the steady state. Filter robustness in the steady state supposes that the sequence of MSEs $\{E[e^2(n)]\}_{n\in\mathbb{N}}$ is convergent [21]; according to formula (12), it follows that it is necessary and sufficient for the gradient vector to be zero in the steady state:
$$\lim_{n\to\infty} E\big[e(n)\,\mathbf{u}(n)\big] = \mathbf{0}. \quad (17)$$
Formula (17) indicates that the principle of orthogonality is a necessary and sufficient condition for the expectation of the error gradient to be zero in the steady state. In filter theory, the principle of orthogonality is a condition for the optimality of the estimated output.
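The orthogonality claim can be checked numerically: at the Wiener solution of a toy problem with white input, the sample mean of $e(n)\mathbf{u}(n)$ vanishes up to Monte Carlo noise. All setup values below are arbitrary.

```python
import numpy as np

# Monte Carlo check of the orthogonality principle: at the Wiener
# solution the residual error is uncorrelated with the input.
rng = np.random.default_rng(1)
h = np.array([0.8, -0.3, 0.5])                 # Wiener solution for white input
U = rng.standard_normal((200000, 3))
d = U @ h + 0.1 * rng.standard_normal(200000)  # desired response plus noise
e = d - U @ h                                  # error at the Wiener solution
grad = (e[:, None] * U).mean(axis=0)           # estimate of E[e(n) u(n)]
```

Each component of `grad` is a sample mean of a zero-mean quantity, so it shrinks toward zero as the number of samples grows.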
However, the error gradient vector of the SM-NLMS algorithm is calculated by
$$E\!\left[\mu(n)\,\frac{e(n)\,\mathbf{u}(n)}{\mathbf{u}^T(n)\mathbf{u}(n)+\delta}\right], \qquad \mu(n)=1-\frac{\gamma}{|e(n)|}, \quad (18)$$
where the step-size is updated only when the estimated error exceeds the default threshold $\gamma$, and $\delta$ is a constant close to zero. The fraction in (18) cannot converge to zero. In addition, the threshold-based update strategy results in exponential changes of the incremental gradient of the estimated error after a period without updates, which enlarges the BER of the output signals at that moment.
From the properties of the quadratic curve, the QA-LMS algorithm achieves a fast convergence speed and excellent robustness, and it tracks the Wiener solution in the steady state better than SM-NLMS. In addition, the quadratic function is simpler for hardware implementation.

IV. THE ANALYSIS OF STABILITY
First, the error vector of the estimated weights is defined as
$$\boldsymbol{\varepsilon}(n) = \hat{\mathbf{w}}(n) - \mathbf{w}_0,$$
where $\mathbf{w}_0$ is the Wiener solution. Then
$$\boldsymbol{\varepsilon}(n+1) = \boldsymbol{\varepsilon}(n) + \boldsymbol{\mu}(n) * e(n)\,\mathbf{u}(n).$$
Note that $*$ denotes element-wise multiplication of vectors, which still yields a vector. The relationship between the estimated output error and the weight error is
$$e(n) = e_0(n) - \boldsymbol{\varepsilon}^T(n)\,\mathbf{u}(n).$$
The mean square deviation (MSD) of the weight coefficients is defined as
$$D(n) = E\big[\|\boldsymbol{\varepsilon}(n)\|^2\big].$$
On the premise that the step-size has upper and lower bounds, the relation is defined as
$$\mu_{\min} \le \mu_i(n) \le \rho\,\mu_{\min},$$
where $\rho$ is a parameter preset according to simulation results. Since the step-size is always positive and the squared error is always positive, on the left side of formula (26) the sum of the terms multiplied by the positive step-size is still greater than zero. Each term is then multiplied by the lower limit of the positive step-size to obtain the relationship, and thus the recursion of the mean square deviation is obtained as (28). If the right side of formula (28) is less than zero, then under the step-size boundary condition the mean square deviation is monotonically decreasing. Therefore, the algorithm is stable in the mean-square sense; that is, the convergence process is monotone [4]. Further, the step-size boundaries can be derived from this condition.
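A quick numerical illustration of the conclusion, using a plain LMS update with a fixed step-size inside its stability bound (an assumed toy setup, not the paper's algorithm): the MSD decreases over the run.

```python
import numpy as np

# MSD of the weight error under a bounded step-size, noiseless case.
rng = np.random.default_rng(2)
w0 = rng.standard_normal(8)          # stands in for the Wiener solution
w = np.zeros(8)
mu = 0.02                            # within the bound for unit-power input
msd = []
for _ in range(400):
    u = rng.standard_normal(8)
    e = np.dot(w0 - w, u)            # noiseless estimation error
    w = w + mu * e * u
    msd.append(np.dot(w0 - w, w0 - w))
```

Individual samples can bump the MSD up briefly, but with the step-size inside its bound the deviation shrinks by orders of magnitude over the run, matching the monotone-in-expectation argument above.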

V. IMPLEMENTATION ARCHITECTURE AND COMPLEXITY COMPARISON
The implementation architecture of the decision feedback equalizer (DFE) consists of a feed-forward filter (FFF) and a feedback filter (FBF), as shown in Fig. 3a. The inputs of the FFF are noisy signals, while the inputs of the FBF are decided signals after noise reduction. The main control mechanism is the tap gain algorithm, and the secondary control mechanism is the step-size update algorithm. The FBF only handles the ISI of the post-cursor part, and the coverage of the FBF's taps should be 3 to 5 times the root mean square (RMS) delay spread. For a comparison of the various algorithms' complexity, their detailed implementation costs are presented in Table 1.
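The split between the two filters can be sketched as follows; the function name, tap values, and signal model are illustrative only.

```python
import numpy as np

def dfe_symbol(fff, fbf, rx_window, past_bits, threshold=0.5):
    """One DFE symbol decision: the feed-forward filter (FFF) works on
    noisy received samples, while the feedback filter (FBF) cancels
    post-cursor ISI using already-decided, noise-free symbols."""
    y = np.dot(fff, rx_window) - np.dot(fbf, past_bits)
    return y, 1.0 if y > threshold else 0.0     # OOK decision

# Example: one post-cursor echo of strength 0.4 cancelled by the FBF.
y, bit = dfe_symbol(np.array([1.0]), np.array([0.4]),
                    np.array([1.1]),            # current sample: 1 + echo residue
                    np.array([1.0]))            # previous decision was '1'
```

Because the FBF operates on hard decisions rather than noisy samples, it cancels post-cursor ISI without amplifying noise, which is the structural advantage of the DFE over a purely linear equalizer.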
$M$ is the number of taps and $N$ is the number of IFFT/FFT points. The complexity of the SC-FDE algorithm based on MMSE is obtained by analyzing the algorithm and its implementation architecture [22]: $N$ equals 128 or 256 when the effective data rate equals 200 Mbps or 400 Mbps [2], which is much more complex than QA-LMS. We set the parameters of the SAABF-LMS algorithm for the best trade-off between convergence speed and steady-state error, that is, M = 30, D = 4, q = 4, and C = 9.3. The SAABF-LMS algorithm then consumes 455.6 multiplications on average, while QA-LMS consumes only 330. However, the complexity of QA-LMS is still higher than that of other variable step-size LMS algorithms, such as HTLMS and UWAVLMS. The vectorial step-size of QA-LMS, which causes most of the additional complexity, is necessary in a nonstationary environment to achieve faster convergence and tracking performance, as shown in the following section.

VI. SIMULATION RESULTS AND DISCUSSION
The effective data rate of the system model in Fig. 1 is 500 Mbps with a pulse repetition period of 2 ns. Forward error correction (FEC) code requires a BER lower than $10^{-3}$. $E_b$ is the energy per bit in joules (J) and $N_0$ is the noise power spectral density in watts per hertz (W/Hz), respectively.
$E_b/N_0$ is a measure of the SNR. It is worth mentioning that the length of the training sequence should be as short as possible to reduce the frame overhead.
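For a simulation of this kind, $E_b/N_0$ in dB is typically mapped to an AWGN noise standard deviation as below; this assumes unit bit energy and one matched-filter sample per bit, which is a simplification of the paper's full chain.

```python
import numpy as np

def noise_sigma(ebn0_db, eb=1.0):
    """AWGN noise standard deviation for a given Eb/N0 in dB
    (illustrative mapping: unit bit energy, one sample per bit)."""
    n0 = eb / 10.0 ** (ebn0_db / 10.0)
    return np.sqrt(n0 / 2.0)
```

At $E_b/N_0 = 0$ dB this gives $\sigma = \sqrt{1/2}$, and the noise level falls monotonically as the ratio improves, so sweeping `ebn0_db` reproduces the x-axis of a BER curve.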

A. CONVERGENCE SPEED, STEADY-STATE ERROR AND BER
On the premise of adjusting the parameters of each algorithm to keep a uniform MSE in the steady state, this simulation initializes the length of the training sequence to 2000 bits to compare the convergence speed. HTLMS's parameters are set to 0.4, 0.4, 10, and 0.016 as in [8], and UWAVLMS's parameter is set to 0.98 [9]. The time-varying error threshold of SM-NLMS is 0.18 in the training stage and 0.40 in the decision-directed stage. The comparison of the MSE learning curves of the various algorithms at $E_b/N_0 = 30$ and 16 dB is shown in Fig. 4a. The curves are drawn by averaging the squared error over 100 separate Monte Carlo experiments. The convergence speed is compared by recording the number of iterations $K$ at which $[e_K^2 - e_{K-1}^2]/e_{K-1}^2 < 0.055$, which ensures a BER lower than $10^{-3}$ in the decision-directed stage.
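The convergence criterion above can be expressed directly in code (reconstructed from the text; the synthetic learning curve is illustrative):

```python
def convergence_iteration(mse, rel_tol=0.055):
    """Smallest iteration index K at which the relative change of the
    averaged MSE learning curve falls below rel_tol."""
    for k in range(1, len(mse)):
        if abs(mse[k] - mse[k - 1]) / mse[k - 1] < rel_tol:
            return k
    return None

# Example on a synthetic learning curve: fast decay, then a plateau.
K = convergence_iteration([1.0, 0.5, 0.2, 0.19, 0.189])
```

Here the relative changes are 0.5, 0.6, 0.05, 0.005, so the criterion first fires at the third iteration, where the curve flattens out.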
From Table 2, QA-LMS converges fastest among the algorithms above. The vectorial step-size of A-LMS and QA-LMS helps convergence, especially in CM3, where more taps are needed to cover the long RMS delay spread. The vectorial step-size based on the gradient of the estimated error helps the algorithms converge along the gradient direction instead of the spiral path in the training stage when the contour lines of the MSE surface are ellipses, as analyzed in Section II. In CM1 of the 802.15.3a channel, the reduced-rank algorithm was not chosen for comparison because it is not suitable for a very small number of taps, according to the complexity analysis. In CM3, at high SNR the convergence speed of SM-NLMS2 is faster and that of SAABF-LMS is slower than QA-LMS. As the SNR decreases, the convergence speed and BER performance of SM-NLMS2 and SAABF-LMS deteriorate sharply. As shown in Table 3, the SNR thresholds of SM-NLMS2 and SAABF-LMS are worse than that of QA-LMS in both CM1 and CM3. In terms of complexity, when the number of weight coefficients is less than 30, the advantage of reduced-rank filters is not particularly obvious; they are better suited to the suppression of narrowband interference (NBI) and multiuser interference (MUI), where a reduced-rank filter can greatly reduce the complexity and improve the convergence speed with a large number of weight coefficients.

Fig. 6a and Fig. 6b show the BER performance with long (500) and short (100) training sequences. The SC-FDE curve with the same effective data rate is from [2]. The minimum $E_b/N_0$ threshold for the FEC limit is shown in Table 3. Compared with A-LMS, the $E_b/N_0$ threshold of QA-LMS is improved by 4.4 dB in CM1 and 3.6 dB in CM3 with a training sequence length of 500. It is worth mentioning that HTLMS has an obvious BER advantage when the training length is long enough (>500).
The step-sizes of HTLMS and QA-LMS are extremely small after training, so neither algorithm is sensitive to errors caused by abrupt channel changes, as shown in Fig. 5a. However, HTLMS converges very slowly, as the previous analysis showed. In addition, when the training length is shorter than 100, its BER performance deteriorates drastically, while QA-LMS still achieves a 2.2 dB improvement over A-LMS.

B. VECTORIAL STEP-SIZE FACTOR
After switching channels at the 500th iteration (CM1 → CM2 and CM1 → CM3), the squared errors of HTLMS, A-LMS, and QA-LMS are shown in Fig. 7a and Fig. 7b. A-LMS and QA-LMS, with their vectorial step-sizes, can track the channel change in time. Tracking performance is an important requirement for IR-UWB wireless communication systems.

C. META-STEP-SIZE PARAMETER INSENSITIVITY
The experimental results in Fig. 7c show that a meta-step-size value of 0.01 achieves the best performance in each $E_b/N_0$ case (from 16 to 20 dB). To facilitate hardware implementation, QA-LMS recommends a 6-bit shift register to replace the multiplier unit, i.e., $\alpha = 0.015625$ ($= 2^{-6}$). In addition, as demonstrated in [15], QA-LMS's performance is not strongly dependent on the forgetting factor either; QA-LMS recommends a value of $5 \times 10^{-3}$ according to simulation tests.
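The shift-for-multiply substitution looks like this in fixed point (a generic illustration, not the paper's RTL):

```python
# In fixed-point hardware, multiplying by alpha = 2**-6 reduces to a
# 6-bit right shift; a quick Q16 fixed-point illustration:
x = int(0.8 * 2**16)          # ~0.8 represented in Q16 format
shifted = x >> 6              # replaces the multiplier unit
```

The shift result matches truncating multiplication by $2^{-6}$ in the same format, which is why choosing $\alpha$ as a power of two removes a multiplier from the datapath.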

VII. CONCLUSION
This paper proposes a QA-LMS adaptive equalization algorithm in the time domain, which outperforms A-LMS in all the mentioned aspects, including convergence speed, steady-state error, complexity, robustness, and BER performance. QA-LMS converges fast enough that the training length can be no more than 100 bits. To meet the BER limit of FEC, QA-LMS reduces the $E_b/N_0$ threshold of A-LMS by 4.4 dB in CM1 and 3.6 dB in CM3 of the 802.15.3a channel model, respectively. QA-LMS replaces the exponential functions in A-LMS with multiplications, which greatly reduces the complexity. Experimental results show that QA-LMS is insensitive to the meta-step-size parameter.

QIANYUN LIU received the B.S. degree in communication and information engineering from Shanghai University, Shanghai, China, in 2020. She is currently pursuing the M.S. degree with Shanghai University, Shanghai, China. Her recent research interests include ultra-wideband positioning algorithms in indoor environments.