Design and Implementation of a Digital Front-End With Digital Compensation for Low-Complexity 4G Radio Transceivers

A digital front-end with digital compensation is designed and implemented for low-complexity 4G radio transceivers targeted for wearable devices such as smart watches. The proposed digital front-end in the radio receiver consists of an anti-drooping filter, a decimation chain, a DC offset cancellation circuit, and an in-phase and quadrature estimation and compensation circuit whereas the digital front-end in the radio transmitter includes an anti-drooping filter, a root raised cosine filter, and an interpolation chain. The proposed DC offset cancellation circuit is based on both infinite-duration impulse response filter and moving average. The proposed in-phase and quadrature estimation and compensation circuit attains lower complexity with negligible performance loss, compared with an existing circuit. A systematic top-down strategy is taken to design and implement the proposed digital front-end from the algorithm level to the application-specific integrated circuit or ASIC hardware level. The inter-symbol interference in the transmitter and the receiver is analyzed and the unwanted emission in the transmitter is simulated as well. For all the seven bandwidths or modes in 3G and 4G, the digital front end receiver ASIC satisfies all the interference requirements, namely, in-band blocker, narrowband blocker, and adjacent channel selectivity requirements whereas the digital front end transmitter ASIC meets all the unwanted emission requirements, namely, spectrum emission mask, spurious emission, and adjacent channel leakage ratio requirements. The proposed multimode 4G digital front end receiver and transmitter ASICs exhibit a >40dB mean signal-to-noise ratio for all the seven modes and are implemented in a 180nm CMOS process technology.


I. INTRODUCTION
Modern radio transceiver chips tailored to various wireless standards such as WLAN and LTE often include digital circuits immediately after the analog-to-digital converter (ADC) in the receiver and immediately before the digital-to-analog converter (DAC) in the transmitter. A digital front-end receiver (DFE Rx) lies right after the ADC in order to filter unwanted blockers, select the desired signal channel, and lower the sample rate. A digital front-end transmitter (DFE Tx), which is a counterpart to the DFE Rx, lies right before the DAC in order to raise the sample rate and shape the signal pulse. The sample rate is converted in the DFE Rx and the DFE Tx in order to ease the processing in the digital baseband or modem chip. In particular, the DFE Rx needs to combat both noise and interference and it can include some digital compensation circuits to cancel out or correct radio frequency (RF) impairments such as DC offset and in-phase and quadrature (IQ) imbalance or mismatch. Placing the digital compensation circuits inside the radio transceiver chip is helpful in readily fixing RF analog impairments (such as DC offset and IQ imbalance) which are inherent in the same radio chip.
(The associate editor coordinating the review of this manuscript and approving it for publication was Prakasam Periasamy.)
The means to cope with RF and analog circuit impairments by digital parts can be subdivided into digital calibration and digital compensation. Digital calibration is a means based on feedback (and hence a closed-loop technique), where the digital part measures or senses the impairment under consideration and accordingly controls or tunes the RF and analog circuitry to reduce or mitigate that impairment. Digital compensation, on the other hand, is a feedforward means (and hence an open-loop scheme), where typically the digital part blindly (i.e., without measurement) cancels or minimizes the impairment of interest of its own accord (i.e., without controlling the RF and analog circuitry). In this work, RF and analog impairments including IQ imbalance, DC offset, and drooping are canceled by digital parts on their own without controlling the RF and analog circuitry and thus feedforward digital compensation techniques are employed to handle the impairments.
Digital parts coping with various impairments can be implemented in the DFE (which is often integrated into the radio transceiver) or in the modem (i.e., the digital baseband processor). For example, the carrier frequency offset and the sample clock offset are typically compensated for in the baseband modem (especially the modem for orthogonal frequency division multiplexing or OFDM) and the IQ imbalance should be compensated for before channel estimation in the modem. In the radio transceiver, time constants of baseband filters and ADC loop filters, output frequencies and/or amplitudes of voltage-controlled oscillators (VCOs), temperature drift in the VCO, the VCO gain, local oscillator (or LO) leakage, and the DC offset are all examples that can be digitally calibrated. Since the sampling rate is higher in the DFE than in the modem, more accurate and faster compensation is possible in the DFE. Especially, the DFE, if integrated into the same chip as the RF and analog circuitry, imposes little limitation on the No. of input and output pins, enabling more efficient and affordable digital compensation. On the other hand, compensation that utilizes the results from synchronization and channel estimation will be possible if compensation in the modem is conducted albeit at a lower sampling rate than in the DFE. In this work, DC offset and IQ imbalance compensation is realized in the DFE, which does not exclude the compensation in the modem. Joint compensation in both the DFE and the modem might be more effective.
Without digital compensation or calibration, the RF and analog circuitry will be more costly and more power hungry to achieve the same required performance as the one with digital compensation or calibration. For instance, to minimize IQ imbalance without digital compensation will lead to a much larger footprint of RF and analog circuitry which consumes a much larger current.
Multimode multistage decimation and interpolation chains are included in the DFE Rx and the DFE Tx, respectively. Multimodes include 7 channel BWs in 3G and 4G standards and the multistage approach is taken to save the area by reducing the total No. of filter taps. To reduce computation and storage, it is more efficient to adjust the sample rate in a series of decimation or interpolation filter stages than in a single-stage filter with a huge No. of filter taps [1], [2]. Formerly, we designed and implemented a DFE Rx with a cascaded integrator comb filter and a fractional sample rate converter [3] whereas in this work the fractional sample rate converter [2], [4], [5] is avoided to lower the complexity of the overall DFE.
A large body of literature addresses the DFE Rx, the digital compensation circuitry, and the DFE Tx. An analog-digital baseband signal chain is proposed in [6] as a software-defined radio receiver to realize flexible multimode operations. Direct-conversion or zero-intermediate-frequency receivers suffer from self-mixing of the local oscillator, which gives rise to the DC offset, as well as from IQ imbalance arising from the practical inaccuracy of the I and Q paths. The carrier frequency offset and phase noise compensation as well as symbol timing synchronization are typically done at the digital baseband in the inner receiver of the modem. In this work, RF imperfections including DC offset and IQ imbalance as well as RF drooping are removed or compensated for in the DFE Rx, which lies both before the modem and inside the radio receiver. In [4], a decimation chain and digital filters are designed for software radio receivers by using fractional sample rate conversion. In [7], sample rate conversion filters are designed for multi-standard software radios based on fractional sample rate conversion. In [8], nonlinearity impairment is digitally compensated for in software radios. Analog and mixed-signal radio receivers with few RF impairments incur high implementation costs and therefore compensation in the digital domain is much more desirable for low-cost receivers. Digital compensation attains lower cost, lower power consumption, and higher silicon fabrication yield with superior flexibility. A DFE Rx with polynomial interpolation filters [9]-[11] for fractional decimation is designed for flexible radios in [12].
Complete software implementation of the DFE Rx for IEEE 802.11ac receiver is realized by adopting parallel processing of multiple bands in [13]. A representative example of the DFE Rx for multimode cellular transceivers is presented in [14] with simulation results. Droop compensation is addressed in [4], [15]. DC offset cancellation circuits in the analog domain and in the digital domain are presented in [16] and [17], respectively.
In particular, a large number of works deal with the inevitable IQ estimation and compensation since the IQ imbalance physical impairment exists in every practical analog processing radio receiver [18], [19]. Direct-conversion radio receivers and OFDM systems are very sensitive to IQ imbalance. A blind IQ imbalance parameter estimation that is independent of pilots or preambles is performed in [20], where simulation results are given. IQ imbalance gives rise to insufficient image rejection [21] in the radio receiver and the resulting image appears as interference. IQ imbalance compensation is simulated in terms of interference cancellation by using least mean square and recursive least squares algorithms in [22]-[24]. A blind IQ parameter estimation and compensation algorithm with feedforward paths to guarantee stability is presented and simulated in [25]. An adaptive equalization algorithm [26] is presented and simulated to compensate for IQ imbalance under carrier frequency offset in an orthogonal frequency division multiplexing system in [27]. A pilot-based carrier frequency offset and IQ imbalance compensation algorithm is presented and simulated in [28], where a finite-duration impulse response (FIR) filter and a phase compensator are used for IQ imbalance compensation. A training-symbol-based IQ imbalance compensation algorithm in conjunction with phase noise compensation is presented and simulated in [29]. A joint channel and data symbol estimation algorithm is presented and simulated to compensate for IQ imbalance and phase noise in [30]. IQ imbalance is extensively analyzed and a hardware-efficient delay-based IQ compensation scheme is presented and designed in [31], [32].
VOLUME 9, 2021
A systematic top-down strategy is taken to design and implement the proposed DFE from the algorithm level to the ASIC hardware level, which is described in great detail in [3] that we formerly authored. The design methodology and principle used to implement the proposed DFE in this work are also described in [3]. We formerly modeled, designed, and implemented a DFE Rx with fractional sample rate conversion in an ASIC [3]. However, to reduce complexity, fractional sample rate conversion is avoided in this work. All the digital compensation circuits are newly designed in this work and included in the proposed DFE Rx. Specifically, the DFE Rx with digital compensation includes an anti-drooping filter, a DC offset cancellation (DCOC) circuit which features the adoption of both the infinite-duration impulse response (IIR) filter and the moving average (MA), and finally a new compact IQ estimation and compensation circuit which uses a coordinate rotation digital computer (CORDIC). A DFE Tx with an interpolation chain is also newly designed in this work. The inter-symbol interference (ISI) in the transmitter and the receiver is newly analyzed and the unwanted emission in the transmitter is newly simulated and measured as well. The proposed DFE Rx and Tx are targeted for low-complexity 3G and 4G radio transceivers in wearable devices such as smart watches.
The paper is organized as follows. Section II describes the algorithm design of the DFE Rx made up of an anti-drooping filter, a decimation chain, a DCOC circuit, and an IQ estimation and compensation circuit. Section III depicts the algorithm design of the DFE Tx including an anti-drooping filter, a root raised cosine (RRC) filter, and an interpolation chain. Section IV details the hardware design of the DFE Rx and the DFE Tx. The former part of Section V shows simulation results of the DFE Rx and the DFE Tx. The latter part of Section V exhibits ASIC implementation results of the DFE Rx and the DFE Tx, followed by the conclusion.

II. ALGORITHM DESIGN OF THE DFE RX
In this work, a DFE Rx and a DFE Tx are designed at the algorithm level and subsequently at the hardware level. In this section, the DFE Rx design at the algorithm level is explained. In Section III, the DFE Tx design at the algorithm level is delineated. The overall DFE Rx architecture for direct-conversion radio receivers is shown in Fig. 1, which consists of DCOC blocks for DC estimation from a pair of IQ ADC output signals, an anti-drooping filter to compensate for the attenuation in the passband, a decimation chain to lower the sampling frequency, and IQ estimation and compensation blocks to deal with IQ imbalance. The anti-drooping filter plays an important part in the 3.84MHz bandwidth (BW) 3G with no equalizer. The decimation chain also serves the purpose of removing the interference which enters the DFE Rx together with the input signal. Each of the building blocks of the DFE Rx can be enabled or disabled. Either the estimated DC value or an external value can be used by the DFE Rx, and the same holds for the estimated IQ value. Seven BWs are supported: 20MHz, 15MHz, 10MHz, 5MHz, 3MHz, and 1.4MHz for 4G and 3.84MHz for 3G. Word lengths or bit widths for the DFE Rx input and output are 14 bits and the bit width for the estimated IQ value is 27 bits.
Fig. 2 shows the timing of DC and IQ estimation and compensation. First, DC estimation runs on the basis of both the IIR filter and the moving average. Up to 10ms, the estimation goes on based on the IIR filter while the moving average continuously accumulates input values. At the 10ms time point, the moving average produces its output based on the accumulated values and the DC estimator switches the estimated value from the IIR filter output to the moving average output. Up to 10ms, DC compensation runs on the basis of the IIR filter while after 10ms, it runs by using the moving average output. At the 10ms time point, when the DC offset has lessened, IQ estimation starts.
IQ estimation takes 1ms for BW = 20MHz and hence its output is produced at time point of 11ms. Subsequently, IQ estimation for BW = 10MHz, 5MHz, 3MHz, 1.4MHz, and 3.84MHz produces outputs 2x, 3x, 7x, 15x, and 2x, ending at time point of 26ms in the worst case  (BW = 1.4MHz). At the instant the first IQ estimation output is produced (at time point of 11ms), IQ compensation sets in.
As stated above, the IIR-filter-based DC estimation interval is set to 10ms according to the given RF transceiver and modem design. If the interval is >10ms, a packet may drop in the early stage since the receive performance for the first few subframes is substantially poor. On the other hand, if the interval is set smaller than 10ms in order to enhance the receive performance of the first few subframes, the area and the power dissipation of the moving-average-based DCOC circuit become too large. Moreover, from the standpoint of the LTE system, whose radio frame has a 10ms duration, the synchronization channel (i.e., the primary synchronization signal and the secondary synchronization signal), which is receivable by all the terminals, can be employed to cancel the DC offset. Additionally, it can be safely assumed that once the carrier is set, the Tx drift is typically not too large and the residual Tx drift is as large as it can be resolved by the carrier frequency offset estimation and compensation circuit in the modem.

A. ANTI-DROOPING FILTER
Prior to the decimation filter, an anti-drooping filter is placed to compensate for the attenuation that manifests itself in the passband owing to decimation and drooping. For, say, BW = 3MHz in 4G, the passband frequency is 1.35MHz (single sideband, i.e., less than 0.5 × 3MHz) and the design requirement specification (DRS) for the attenuation that should be compensated for by the anti-drooping filter is 1.99dB. Fig. 3 displays how an anti-drooping filter is designed. The blue line shows the anti-drooping filter response for the drooping in the passband. However, the No. of filter coefficients needed to realize the response is too large. Therefore, in practice the X-shaped characteristic points, which are placed at the beginning, the end, and the two drastically varying points of the blue line, are used to design the anti-drooping filter. The red line results from the design of the anti-drooping filter, which is similar to the ideal blue line in the passband.
The decimation chain output before and after the compensation by the anti-drooping filter is shown in Fig. 4 where the dotted red line marks the passband. As shown, the droop (or attenuation) in the passband is compensated for by 1.99dB which is the requirement for the 3MHz BW case whose passband frequency is 1.35MHz. The No. of filter taps is 41 for BW = 3MHz. The No. of taps in an ordinary lowpass filter (LPF) is affected by the transition width and the ripple. In analogy, the No. of taps in the anti-drooping filter is affected by the passband width (or passband frequency) and the droop at the passband edge. The No. of taps in the anti-drooping filter is expressed as the droop at the passband edge (dB) divided by the passband width and multiplied by the sampling frequency. Thus more filter taps are needed as the passband width is smaller and the droop at the passband edge is larger. For different channel BWs and corresponding passband frequencies, different attenuation values are required. All the channel BW cases are designed and meet the specification.
In 4G, the passband flatness is partly secured by the equalizer in the modem while in 3G, the passband attenuation impacts the ISI, degrading the performance, and thus the anti-drooping filter is employed to reduce the ISI as well as to retain the passband flatness. The basic model for analyzing the ISI is drawn in Fig. 5, which is in turn converted to an equivalent model that combines the downsampler by 2 and the downsampler by 4 into a downsampler by 8 located at the modem Rx front-end. Thus LPF2 and the RX RRC filter will be seen as upsampled by 2. I_k is the information signal. In this equivalent model, the drooping, the anti-drooping, LPF1, LPF2, and RX RRC filters are seen as one RRC filter which finally forms a raised cosine (RC) filter in conjunction with TX RRC. The RC filter coefficients can be represented as in Fig. 6, by which I_k is pulse shaped. For an ideal RC filter, the ISI will be all zero except for the gain value but for a practical filter, the ISI will exist as in Fig. 7 where the red curves represent the ISI that will be sampled as I_k passes through the RC filter, yielding a large signal-to-noise ratio (SNR) degradation. The minimum (mean) SNR of the DFE without anti-drooping is 10.91dB (24.75dB) in theory and 11.06dB (21.31dB) by simulation whereas the minimum (mean) SNR with anti-drooping is 30.13dB (48.5dB) in theory and 30.19dB (44.80dB) by simulation. In short, with anti-drooping, the SNR rises by about 20dB.

B. DECIMATION CHAIN
The downsampling ratios that the DFE Rx should meet for channel BWs of 3.84MHz, 1.4MHz, 3MHz, 5MHz, 10MHz, 15MHz, and 20MHz are 2, 16, 8, 4, 4, 2, and 2, respectively, from which the DFE output frequencies, 15.36MHz, 1.92MHz, 3.84MHz, 7.68MHz, 15.36MHz, 30.72MHz, and 30.72MHz are obtained, respectively. The decimation chain includes an LPF, a downsampler by 2, a channel selection filter (CSF), and an RRC filter. The LPF followed by the downsampler removes the interference overlapping the signal. Fig. 8 shows the input (left) and the output (right) of the LPF followed by the downsampler. Blue lines represent the signal and red lines the adjacent channel selectivity (ACS) interference. Before passing the LPF and downsampler, the interference is larger than the signal by ∼20dB. After passing the LPF and downsampler, the signal has a width twice as large as the signal before passing while the interference overlapping the signal gets attenuated. The remaining un-attenuated interference adjacent to the signal will be removed by the CSF at the last stage of the decimation chain, getting much lower than the signal.
The lowpass CSF exhibits the largest size among the filters in the decimation chain. Fig. 9 shows the input (left) and the output (right) of the CSF. The signal is denoted in blue while the interference is in red. After passing the LPF and downsampler, the signal is still lower than the adjacent interference by about 20dB and the CSF gets rid of this remaining interference, as shown on the right of Fig. 9. Namely, while the LPF for downsampling reduces the interference only in the location to which the signal is downsampled, the CSF must reduce the interference in all locations and hence more taps are needed.
The RRC filter with a chip rate of 3.84MHz for 3G is illustrated in Fig. 10, whose roll-off factor is 0.22, which decides the filter shape. In 3G, the ISI as well as the interference should be removed since the ISI also affects the performance. This is done by the RRC filter in the DFE Rx, which is combined with the DFE Tx RRC filter, to construct the RC filter. In the ideal RC filter, no ISI will exist since all the sample points but for the gain point will be zero. However, the ideal RC filter will have infinite No. of taps, barring the circuit implementation. Therefore, only a finite No. of taps is used for implementation, as shown in Fig. 10 where the red portion is cut off and the black portion is only used.
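The effect of truncating the RRC filter can be illustrated with a minimal sketch. It assumes the textbook RRC impulse response, 4 samples per chip (15.36MHz / 3.84MHz), and a hypothetical span of 12 chips; the span and the function name `rrc_taps` are illustrative assumptions, not the tap count used in the ASIC. Convolving the truncated Tx and Rx RRC filters gives an approximate RC pulse whose chip-spaced samples away from the peak are the residual ISI.

```python
import numpy as np

def rrc_taps(beta, span_symbols, sps):
    """Truncated root-raised-cosine taps (symbol period T = 1), unit energy."""
    half = span_symbols * sps // 2
    t = np.arange(-half, half + 1) / sps
    h = np.empty(len(t))
    for i, ti in enumerate(t):
        if np.isclose(ti, 0.0):
            # Limit of the RRC expression at t = 0
            h[i] = 1.0 - beta + 4.0 * beta / np.pi
        elif np.isclose(abs(ti), 1.0 / (4.0 * beta)):
            # Limit at the removable singularity t = +/- 1/(4*beta)
            h[i] = (beta / np.sqrt(2.0)) * (
                (1.0 + 2.0 / np.pi) * np.sin(np.pi / (4.0 * beta))
                + (1.0 - 2.0 / np.pi) * np.cos(np.pi / (4.0 * beta)))
        else:
            num = (np.sin(np.pi * ti * (1.0 - beta))
                   + 4.0 * beta * ti * np.cos(np.pi * ti * (1.0 + beta)))
            den = np.pi * ti * (1.0 - (4.0 * beta * ti) ** 2)
            h[i] = num / den
    return h / np.sqrt(np.sum(h ** 2))

SPS = 4                       # 15.36MHz sample rate / 3.84MHz chip rate
h = rrc_taps(beta=0.22, span_symbols=12, sps=SPS)  # span is an assumption
rc = np.convolve(h, h)        # Tx RRC * Rx RRC ~ truncated RC pulse
center = len(rc) // 2
chip_spaced = rc[center % SPS::SPS]                 # samples at chip instants
main = rc[center]
isi_power = np.sum(chip_spaced ** 2) - main ** 2    # everything but the peak
isi_ratio_db = 10.0 * np.log10(main ** 2 / isi_power)
```

A longer span lowers the residual ISI at the cost of more taps, which is the trade-off behind cutting off the tails in Fig. 10.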

C. DC OFFSET CANCELLATION (DCOC)
The DCOC block diagram is shown in Fig. 11, which estimates the DC value from the DFE Rx input and removes the DC value. Otherwise, the DC value will affect the data samples around the DC, degrading the minimum SNR obtainable in the modem. In this work, both the IIR filter and the moving average are designed. The former operates from start-up to 10ms for DC estimation (est.) and in the meantime, the latter internally accumulates input values. After the 10ms time point, the moving average part takes over the DC estimation from the IIR filter. The IIR filter is designed by using Eq. (1), where α is 0.9999829. To estimate the DC in the signal, the passband is very narrow and hence the No. of taps will be considerable if the filter is implemented as an FIR filter. Thus, the IIR filter is selected for implementation. The corresponding filter structure is shown in Fig. 12. The coefficient values 0.0000171 and 0.9999829 correspond to (1-α) in the numerator of (1) and α in the denominator of (1), respectively. Fig. 13 shows the frequency response of the IIR filter, which is 0dB at the leftmost and the rightmost points that correspond to the DC and is -60 to -100dB in other regions. As the result of the DC estimation based on the IIR filter, the estimated output approaches the DC value in 10ms and the minimum SNR observed in the modem is about 20dB.
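The two-phase DC estimation can be sketched as follows. This is a minimal illustration under stated assumptions: Eq. (1) is taken to be the first-order form H(z) = (1-α)/(1-αz^-1), which matches the description of (1-α) in the numerator and α in the denominator; the 30.72MHz sample rate, the test signal, and the function names are hypothetical.

```python
import numpy as np

ALPHA = 0.9999829  # IIR coefficient quoted in the text

def iir_dc_estimate(x, alpha=ALPHA):
    """Assumed first-order IIR: y[n] = alpha*y[n-1] + (1-alpha)*x[n]."""
    y = np.empty(len(x))
    state = 0.0
    for n, sample in enumerate(x):
        state = alpha * state + (1.0 - alpha) * sample
        y[n] = state
    return y

def dcoc(x, fs_hz, t_switch_s=10e-3, alpha=ALPHA):
    """Subtract the IIR estimate before the switch time, the MA estimate after."""
    n_switch = int(round(fs_hz * t_switch_s))
    if len(x) < n_switch:
        raise ValueError("input shorter than the IIR estimation interval")
    iir_est = iir_dc_estimate(x[:n_switch], alpha)
    div = 1.0 / n_switch                       # inverse of the sample count
    ma_est = float(np.sum(x[:n_switch]) * div)  # accumulate, then scale
    estimate = np.concatenate([iir_est, np.full(len(x) - n_switch, ma_est)])
    return x - estimate, estimate

# Hypothetical test signal: DC offset of 0.05 buried in noise (one rail only)
rng = np.random.default_rng(0)
fs_hz = 30.72e6                                # assumed sample rate
dc_true = 0.05
n_total = int(round(fs_hz * 10e-3)) + 1000
x = dc_true + 0.1 * rng.standard_normal(n_total)
compensated, estimate = dcoc(x, fs_hz)
```

With this α, the IIR time constant is about 1/(1-α) ≈ 58500 samples, so at the assumed rate the IIR output has largely settled by 10ms, while the moving average uses every accumulated sample with equal weight.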
The incoming signal to the DFE Rx will swing around 0 if there is no DC offset but will swing around the DC offset if the offset exists. Thus the center of the swing is to be found by using many samples and this center is regarded as the DC offset. This takes much time since many samples are to be employed, during which the IIR filter operates first. Based on the 10ms duration, it is verified from simulation that the moving average outperforms the IIR filter. Therefore, while the IIR filter runs for the first 10ms to output every sample value, the moving average internally accumulates input values. At the 10ms time point, the moving average starts to produce estimated DC values based on the accumulated values. Fig. 14 shows the block diagram of the moving average whose input signals are accumulated for 10ms by means of the adder and the flip-flop (FF). After this, the signal passes through the multiplexer and is multiplied by the value, div, which is the inverse of the No. of samples accumulated for 10ms. In this way, the average signal or the estimated DC value is obtained.
Fig. 15 shows the IQ imbalance model used to design the IQ estimation and compensation circuit. The input y(t) is assumed as in Eq. (2), where s(t) is the information signal. A_I and A_Q are the gain imbalances, ω_c is the carrier angular frequency, and φ_I and φ_Q are the phase imbalances. I_imb[n] at the ADC output can be expressed as in Eq. (3). In a similar manner, Q_imb[n] is obtained as in Eq. (4). These two equations are expressed in matrix notation in Eq. (5). By left-multiplying both sides by the inverse of the 2 × 2 matrix, IQ imbalance is removed. This inverse of the 2 × 2 matrix is estimated by the IQ estimation circuit.
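The matrix view of IQ imbalance can be illustrated with a short sketch. Because Eqs. (2) to (5) are not reproduced here, the matrix below is an assumed common form, obtained by mixing y(t) = s_I(t)cos(ω_c t) - s_Q(t)sin(ω_c t) with the local oscillator signals 2A_I cos(ω_c t + φ_I) and -2A_Q sin(ω_c t + φ_Q) and lowpass filtering; the sign convention in the paper may differ. The imbalance values are hypothetical, and the true matrix is inverted directly, whereas the circuit estimates the inverse.

```python
import numpy as np

def imbalance_matrix(a_i, a_q, phi_i, phi_q):
    """Assumed 2x2 model: [I_imb, Q_imb]^T = M @ [s_I, s_Q]^T."""
    return np.array([
        [a_i * np.cos(phi_i), a_i * np.sin(phi_i)],
        [-a_q * np.sin(phi_q), a_q * np.cos(phi_q)],
    ])

rng = np.random.default_rng(1)
s = rng.standard_normal((2, 1000))   # rows: s_I[n] and s_Q[n]

# Hypothetical imbalance: +/-5% gain error, +3 and -2 degrees phase error
m = imbalance_matrix(1.05, 0.95, np.deg2rad(3.0), np.deg2rad(-2.0))
iq_imb = m @ s                       # imbalanced ADC outputs

# Compensation: left-multiply by the inverse of the 2x2 matrix
iq_comp = np.linalg.inv(m) @ iq_imb
```

Under this assumed form the determinant is A_I A_Q cos(φ_I - φ_Q), so the matrix stays invertible as long as the phase mismatch is below 90 degrees.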

D. IQ ESTIMATION AND COMPENSATION
The IQ estimation in [31], [32] obtained the compensated IQ from Eq. (6) where the outcome is calculated from the result of the IQ estimation. sgn(x) is the signum function defined as the output = 1 if x > 0 and output = -1 if x < 0. By substituting Eq. (5) into Eq. (6), the relationship between the IQ compensated signal and the information signal is obtained. After IQ estimation and compensation, it is found that the IQ compensated signal is basically equal to the information signal multiplied by a complex constant. (If there is gain imbalance or phase imbalance, then the IQ compensated signal will not be equal to the information signal multiplied by a complex constant.) Thus the difference between the compensated signal and the information signal can be corrected by means of the pilot signal and subsequently the information signal is recovered in the modem equalizer.
By comparison, the proposed IQ estimation is implemented with much lower complexity than the existing IQ estimation [31], [32]. The IQ compensated signal is obtained from Eq. (7). Similar to the existing IQ estimation, by substituting Eq. (5) into Eq. (7), the relationship between the IQ compensated signal and the information signal is obtained. In the proposed IQ estimation as well, the IQ compensated signal is equal to the information signal multiplied by a complex constant if A_I = A_Q (no gain imbalance) and φ_I = φ_Q (no phase imbalance). The complex constant scales the information signal by its gain or magnitude and rotates the information signal by its phase to yield the compensated signal. Since the compensated signal differs from the information signal by a constant scale factor and a constant phase, the information signal can be recovered in the modem by means of the pilot signal.
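How a pilot removes a constant complex factor can be shown with a minimal sketch. It is not the modem equalizer of this work: it assumes a single unknown complex constant c, known pilot symbols, and a least-squares estimate; the function name and the numerical values are hypothetical.

```python
import numpy as np

def estimate_complex_gain(rx_pilot, tx_pilot):
    """Least-squares estimate of c in rx = c * tx (np.vdot conjugates its first argument)."""
    return np.vdot(tx_pilot, rx_pilot) / np.vdot(tx_pilot, tx_pilot)

rng = np.random.default_rng(2)
tx_pilot = np.exp(1j * 2 * np.pi * rng.random(64))        # known pilot symbols
tx_data = rng.standard_normal(256) + 1j * rng.standard_normal(256)

c_true = 0.9 * np.exp(1j * 0.3)   # hypothetical constant gain and rotation
rx_pilot = c_true * tx_pilot
rx_data = c_true * tx_data

c_hat = estimate_complex_gain(rx_pilot, tx_pilot)
recovered = rx_data / c_hat       # undo the scale factor and the phase
```

In a real receiver the same division is absorbed into channel equalization, since the constant is indistinguishable from a flat channel coefficient.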
Fig. 16 compares the hardware complexity of the proposed IQ estimation with that of the existing IQ estimation.  In the IQ estimation circuit, the No. of multipliers and square root circuits is reduced and in the IQ compensation circuit, the No. of multipliers is reduced as well. The area of the IQ estimation and compensation circuit is shrunk by 55% while the theoretical performance is guaranteed. Fig. 17 shows the minimum SNR (left) and the mean SNR (right) as a function of the No. of symbols processed in the digital compensation circuitry when the channel BW is 5MHz. IQ old denotes the existing IQ estimation and IQ new denotes the proposed IQ estimation while MA denotes the moving average. In case of the min SNR, both the existing method and the proposed method satisfy >20dB with practically identical performances. In case of the mean SNR, the proposed method exhibits 1dB less SNR but still meets the >40dB requirement with much smaller hardware area.
IQ new+ denotes the proposed IQ estimation with applying different IQ estimation times for different channel BWs. The IQ estimation needs to process ∼30000 samples to produce a sufficient performance. When the channel BW is 20MHz, the sampling frequency is 30.72MHz and hence the 1ms interval will include 30720 samples. Thus ∼1ms is needed to conduct IQ estimation for BW = 20MHz. When the channel BW is 5MHz, the sampling frequency is 7.68MHz and hence the 1ms interval will include only 7680 samples. Thus the IQ estimation circuit needs 4 updated intervals to produce a sufficient performance. In case of the mean SNR, the performance gets better as the estimation is updated.
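The interval arithmetic above can be sketched for all seven modes. The DFE output rates are the ones listed for the decimation chain; the target of 30720 samples (one 1ms interval at 30.72MHz) and the 10ms start time of IQ estimation are taken from the text, while the assumption that every mode simply accumulates 1ms intervals until the target is reached is an illustrative simplification.

```python
# DFE Rx output sample rates in kHz for each channel BW
FS_OUT_KHZ = {
    "20MHz": 30720, "15MHz": 30720, "10MHz": 15360, "5MHz": 7680,
    "3MHz": 3840, "1.4MHz": 1920, "3.84MHz": 15360,
}
TARGET_SAMPLES = 30720   # assumed target: samples in 1ms at 30.72MHz
IQ_START_MS = 10         # IQ estimation starts once DC estimation switches

def iq_intervals(bw):
    """Number of 1ms intervals needed to collect TARGET_SAMPLES."""
    samples_per_ms = FS_OUT_KHZ[bw]          # kHz equals samples per ms
    return -(-TARGET_SAMPLES // samples_per_ms)  # integer ceiling division

def iq_done_ms(bw):
    """Time point at which the full-length IQ estimate is available."""
    return IQ_START_MS + iq_intervals(bw)

schedule = {bw: (iq_intervals(bw), iq_done_ms(bw)) for bw in FS_OUT_KHZ}
```

Under these assumptions the 5MHz mode needs 4 intervals and the 1.4MHz mode finishes at 26ms, in line with the figures quoted in the text.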
As is shown in Fig. 17, both the minimum SNR and the mean SNR are improved by >3dB relative to the old or existing IQ estimation. Also, without any DCOC and IQ estimation, the minimum SNR and the mean SNR would be -25dB and around 40dB, respectively, and hence a large enhancement can be achieved by digital compensation.

III. ALGORITHM DESIGN OF THE DFE TX
The DFE Tx structure is shown in Fig. 18. The red path is used for 4G while the green path is used for 3G. In 4G, the modem output signal is first passed on to the anti-drooping filter which is located early in the chain instead of late in the chain for the sake of ease in design. The sampling frequency grows high (122.88MHz for 1.4MHz, 3MHz, and 5MHz BWs and 245.76MHz for 10MHz, 15MHz, and 20MHz BWs) at the output of the interpolation chain. Subsequently, the signal passes through the interpolation chain which adjusts the  sampling frequency and rejects images coming from upsampling. In 3G (BW = 3.84MHz), the upsampler by 4 (L = 4) and the RRC filter for pulse shaping are added in the signal path. The input and output signals of the DFE Tx are both 14 bits wide. The RRC filter is already explained in Section II and hence only the anti-drooping filter and the interpolation chain will be explained below.

A. ANTI-DROOPING FILTER
Drooping occurs at the front stage for the DFE Rx but occurs at the rear stage for the DFE Tx. To ease the design, the anti-drooping filter is placed early in the chain, where the sampling frequency is lower, rather than late in the chain, where the sampling frequency is higher. The design method is identical to that of the anti-drooping filter in the DFE Rx but more characteristic points (denoted as X in Fig. 19 for a 15MHz BW) are adopted for a more accurate design. In Fig. 19, the red curve represents the response of the designed anti-drooping filter. Fig. 20 shows the result before anti-drooping (left) and after anti-drooping (right) for a 15MHz BW. If anti-drooping is not applied, the passband exhibits an attenuation of 3.13dB whereas if drooping is compensated for, the passband attenuation reduces to 0.17dB, meeting the <0.5dB passband flatness requirement. In this way, the filter is designed to satisfy the requirements of all the 7 BWs. The theoretical No. of filter taps was computed, as was done with the DFE Rx anti-drooping filter in Subsection A of Section II, to be 16 taps but a performance drop was shown in the actual measurement owing to some ripple. Thus the No. of taps for the 3.84MHz BW, the most critical BW to consider in the anti-drooping filter design, is set to 21 taps to meet the mean SNR of >40dB. This No. of taps satisfies the requirements of all the other BWs.
The ISI analysis for the DFE Tx model is similar to that for the DFE Rx model. In the DFE Tx, the building blocks are arranged in this order: the upsampler by 4, the anti-drooping filter, the Tx RRC filter, the upsampler by 2, LPF1, the upsampler by 2, LPF2, the upsampler by 2, and LPF3, followed by drooping at RF. This makes it difficult to analyze the ISI since many blocks have different sampling frequencies. Therefore, an equivalent model is adopted, where the four upsamplers in the original DFE Tx are integrated into one upsampler by 32 placed at the first stage. Now the anti-drooping filter and the Tx RRC filter are at eight times the sampling frequency, LPF1 at four times the sampling frequency, and LPF2 at twice the sampling frequency, as shown in Fig. 21 where I_k is the information signal and both the Rx RRC filter and the downsampler by 32 represent a simple DFE Rx model. The building blocks between the upsampler by 32 and the downsampler by 32 make up the RC filter. The ISI model in Fig. 21 is used to analyze the ISI and obtain the corresponding SNR. In the absence of anti-drooping, the minimum (mean) SNR from simulation is 13.53dB (20.07dB) and the minimum (mean) SNR in theory is 13.29dB (21.15dB). In the presence of anti-drooping, the minimum (mean) SNR from simulation is 32.40dB (45.31dB) and the minimum (mean) SNR in theory is 32.39dB (46.24dB). Thus by means of the anti-drooping filter, the minimum SNR and the mean SNR rise by ∼20dB and 25dB, respectively, showing that the anti-drooping filter is important in the DFE Tx as well as in the DFE Rx.

B. INTERPOLATION CHAIN
The interpolation chain should meet the unwanted emission standard as well as the sampling frequency adjustments. The unwanted emission standard is categorized into the spectrum emission mask (SEM), the spurious emission (SE), and the adjacent channel leakage ratio (ACLR).
The SEM is the power spectral density (PSD) immediately outside of the desired signal band, the SE is the PSD farther outside of the desired signal band, and the ACLR is the adjacent channel power relative to the desired channel power, expressed as a difference in dB. The 4G SEM and SE requirements in case of the 15MHz BW, for example, are illustrated in Fig. 22. Two measurement BWs, 30kHz and 1MHz, are specified for the PSD measurement. The SEM for 3G is somewhat different from that of 4G but the SE is identical. From the standard specification, the 3G SEM and SE requirements in case of the 3.84MHz BW, for example, are drawn in Fig. 23. The ACLR requirements in case of the 15MHz signal BW, for example, are as follows: The E-UTRA channel with a 13.5MHz measurement BW at an offset of 15MHz from the center of the desired channel should be >36dB lower than the desired signal power, where E-UTRA is short for the evolved universal mobile telecommunications system terrestrial radio access. Also, the two UTRA channels with 3.84MHz BWs at offsets of 10MHz and 15MHz should be >39dB and >42dB lower, respectively, than the desired signal power.
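The ACLR limits quoted above for the 15MHz BW can be expressed as a small lookup and check. The following is only an illustrative sketch: the limits are taken from the text, while the function names and the measured channel powers are hypothetical placeholders.

```python
# ACLR limits for the 15MHz signal BW, as quoted in the text:
# key = (adjacent channel type, offset in MHz), value = required ACLR in dB.
ACLR_LIMITS_15MHZ = {
    ("E-UTRA", 15.0): 36.0,   # 13.5MHz measurement BW
    ("UTRA", 10.0): 39.0,     # 3.84MHz measurement BW
    ("UTRA", 15.0): 42.0,     # 3.84MHz measurement BW
}

def aclr_db(desired_dbm, adjacent_dbm):
    """ACLR as the dB difference between desired and adjacent channel power."""
    return desired_dbm - adjacent_dbm

def aclr_check(desired_dbm, adjacent_dbm_by_channel, limits=ACLR_LIMITS_15MHZ):
    """Return {channel: (aclr_db, passes)} for every channel listed in limits."""
    result = {}
    for channel, limit in limits.items():
        value = aclr_db(desired_dbm, adjacent_dbm_by_channel[channel])
        result[channel] = (value, value > limit)
    return result

# Hypothetical channel powers in dBm, for illustration only.
example = aclr_check(
    desired_dbm=23.0,
    adjacent_dbm_by_channel={
        ("E-UTRA", 15.0): -20.0,
        ("UTRA", 10.0): -25.0,
        ("UTRA", 15.0): -18.0,
    },
)
```

With these made-up powers, the first two channels pass and the UTRA channel at the 15MHz offset fails, since 41dB does not exceed the 42dB limit.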
The interpolation chain raises the sampling frequency and rejects images arising from upsampling. For the 15MHz BW mode, for instance, the input sampling frequency is 30.72MHz and the output sampling frequency is 245.76MHz after upsampling by 8. In this case, upsampling by 8 all at once is less desirable than upsampling in multiple stages, e.g., by 2 three times in a row, since the multistage approach reduces the total No. of filter taps. Namely, upsampling by 8 is implemented with three pairs in cascade where a pair comprises an upsampler by 2 and an LPF. The No. of filter taps can be obtained theoretically from Bellanger's formula [3], [33] on the basis of the filter specification, shown in Eq. (8) where δ_plin and δ_slin are the passband attenuation and the stopband attenuation, respectively, in linear scale and Δf = (f_s − f_p)/f_samp is the transition BW normalized by the sampling frequency of the designed filter. (Here, f_s is the stopband frequency, f_p the passband frequency, and f_samp the sampling frequency.) Assuming the passband and stopband attenuations are -60dB (10^−3) and -80dB (10^−4), respectively, and f_samp = 245.76MHz, f_s = 232.2MHz, and f_p = 7.5MHz, then the upsampler by 8 followed by one LPF will have N = 42, namely 42 taps. By comparison, if an upsampler-by-2-and-LPF1 pair (with f_samp = 61.44MHz, f_s = 23.22MHz, and f_p = 7.5MHz) followed by an upsampler-by-2-and-LPF2 pair (with f_samp = 122.88MHz, f_s = 53.94MHz, and f_p = 7.5MHz) and then by an upsampler-by-2-and-LPF3 pair (with f_samp = 245.76MHz, f_s = 115.38MHz, and f_p = 7.5MHz) is implemented instead of the upsampler by 8 and one LPF, then N of each pair will be 15, 10, and 9, respectively. Thus the total No. of taps will be 34, smaller than 42. Therefore, the multistage implementation is better than the direct form implementation. Multistage upsampling in powers of 2 is known to lead to fewer taps.
Fig. 25 illustrates how the images are rejected as the signal passes through each stage of the chain in Fig. 24. After each upsampling, the LPF removes the image; otherwise, the interpolation chain cannot meet the unwanted emission specification. Fig. 26 shows the unwanted emission when the images are not removed (left) and when the images are rejected (right). If the images are not rejected, none of the SEM, SE, and ACLR standards is satisfied; once they are rejected, all the standards can be met. Two measurement BWs are used, 30kHz and 1MHz.
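The per-stage tap counts quoted above (15, 10, and 9) can be reproduced with a short calculation. The sketch below assumes one common form of Bellanger's estimate, N ≈ −2·log10(10·δ_p·δ_s)/(3·Δf) − 1, rounded up to the next integer. Both the "−1" term and the rounding rule are assumptions chosen because they match the quoted counts, not a transcription of Eq. (8). The single-stage count of 42 is taken from the text and is not recomputed here.

```python
import math

def bellanger_taps(delta_p, delta_s, f_pass, f_stop, f_samp):
    """Estimated FIR length (assumed form of Bellanger's formula, rounded up)."""
    delta_f = (f_stop - f_pass) / f_samp          # normalized transition BW
    n = -2.0 * math.log10(10.0 * delta_p * delta_s) / (3.0 * delta_f) - 1.0
    return math.ceil(n)

DELTA_P = 1e-3   # passband figure quoted in the text (-60dB)
DELTA_S = 1e-4   # stopband figure quoted in the text (-80dB)

# (f_p, f_s, f_samp) in MHz for the three upsampler-by-2-and-LPF pairs.
STAGES = [
    (7.5, 23.22, 61.44),
    (7.5, 53.94, 122.88),
    (7.5, 115.38, 245.76),
]

stage_taps = [bellanger_taps(DELTA_P, DELTA_S, fp, fs, fsamp)
              for fp, fs, fsamp in STAGES]
total_taps = sum(stage_taps)
```

The first stage dominates the count because its transition BW is the narrowest relative to its sampling frequency; the later stages run faster but need only a few taps each.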

IV. HARDWARE DESIGN OF THE DFE RX AND THE DFE TX
After the algorithm design of the DFE Rx and Tx, the hardware design of the DFE Rx and Tx follows.

A. DFE RX
The DFE Rx consists of the anti-drooping filter, the decimation chain, the DCOC circuit, and the IQ estimation and compensation circuits. The anti-drooping filter hardware is a 51-tap transposed FIR filter which by symmetry has 26 coefficients, as shown in Fig. 27. Filter coefficient values were obtained on the basis of the Parks-McClellan optimal FIR filter algorithm [34]. The filter is required to run with a clock cycle of <16ns. The red path in Fig. 27 is the critical path consisting of one multiplier and one adder, which has a delay of about 13.72ns in a 180nm CMOS ASIC. The sampling frequency at the DFE Rx input is 61.44MHz for channel BWs of 20MHz, 15MHz, and 10MHz while it is 30.72MHz for channel BWs of 5MHz, 3MHz, 1.4MHz, and 3.84MHz. The design principle, strategy, and procedure of the decimation chain and digital filters designed in this work are extensively explained in our earlier work [3]. The No. of bits in the integer part and the fractional part is determined methodically, which is also explained in [3].
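The way coefficient symmetry lets a 51-tap transposed FIR filter use only 26 multipliers can be illustrated with a behavioral model. This is a minimal sketch, not the fixed-point hardware of Fig. 27: the coefficient values and the input sequence are made-up placeholders, and floating-point arithmetic stands in for the actual word lengths.

```python
def transposed_symmetric_fir(x, half_coeffs):
    """Odd-length linear-phase FIR in transposed form.

    half_coeffs holds h[0] .. h[center]; each input sample is multiplied
    once per unique coefficient and the product is shared by the two
    mirrored tap positions.
    """
    n_taps = 2 * len(half_coeffs) - 1
    regs = [0.0] * (n_taps - 1)                     # adder-chain registers
    out = []
    for sample in x:
        prods = [sample * c for c in half_coeffs]   # one multiply per coefficient
        taps = [prods[min(k, n_taps - 1 - k)] for k in range(n_taps)]
        out.append(taps[0] + regs[0])               # one multiplier + one adder
        regs = [taps[k + 1] + regs[k + 1] for k in range(n_taps - 2)]
        regs.append(taps[n_taps - 1])
    return out

def fir_direct(x, h):
    """Reference direct-form convolution, truncated to len(x)."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

# Hypothetical coefficients: 26 unique values expand to 51 symmetric taps.
half = [(k + 1) / 100.0 for k in range(26)]
taps_51 = half + half[-2::-1]
x_in = [float((7 * n) % 13 - 6) for n in range(120)]

y_shared = transposed_symmetric_fir(x_in, half)
y_ref = fir_direct(x_in, taps_51)
```

In the transposed form, every output is produced by one product plus one registered partial sum, which is consistent with the critical path of one multiplier and one adder described above.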
The LPF, the CSF, and the RRC filter in the decimation chain all use the transposed symmetric FIR filter structure. Among them, the CSF hardware in the decimation chain is shown in Fig. 28. A programmable FIR filter is used to have different coefficients for different BWs. If the control input, hz, is 1, the No. of coefficients is even and if hz is 0, the No. of coefficients is odd. The total No. of taps is 89 and, by utilizing symmetry, only 45 coefficients are implemented. The No. of filter taps for each of the anti-drooping (anti-D) filter, the LPFs (FIR1 to FIR4) with downsamplers, and the CSF is listed in Fig. 29. The CSF in 3G (BW = 3.84MHz) is actually the RRC filter. The bypass in Fig. 29 means all the coefficients are unity. The maximum No. of taps is chosen for each filter to implement the hardware structure: 51 taps for the anti-D filter, 7 taps for FIR1, 8 taps for FIR2, 10 taps for FIR3, 63 taps for FIR4, and 89 taps for the CSF. Fig. 30 shows the simplified hardware structure of the anti-drooping filter and the decimation chain, with the No. of taps displayed for each filter. The downward arrow followed by 2 denotes downsampling by 2 preceded by lowpass filtering. FIR1 sits at the very front of the decimation chain, needs to run at a clock cycle of <16ns, and is implemented in 6000 gate equivalent (GE) (with reference to a two-input NAND gate) while meeting all the requirements over the 7 channel BWs. FIR2 also runs at the <16ns clock cycle and is implemented in 10000 GE, while meeting all the requirements of the 7 modes. FIR3 also operates at the <16ns clock period and is implemented in 12000 GE, while FIR4 has the largest No. of filter coefficients among the LPFs and is implemented in 60000 GE. The CSF has the largest area among the filters and operates at a different speed than the LPFs and the anti-drooping filter. The CSF only has to run at a <32ns clock cycle and occupies 80000 GE.
The hardware structure of the DCOC circuit is shown in Fig. 31. Both the IIR filter and the moving average are implemented with the 10ms counter and the multiplexer which chooses one of the two estimated DC values according to the counter value. Logic synthesis shows that the critical path is the one where the IIR filter coefficient is multiplied and then added, denoted in red in Fig. 31. The required delay in the DCOC circuit is 16ns and the delay of the synthesized circuit is smaller than 16ns for all the 7 channel BWs in which the circuit is to run at 61.44MHz or 30.72MHz.
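The idea of running two DC estimators in parallel and selecting one with a counter-driven multiplexer can be sketched behaviorally. The details below are assumptions made for illustration only: a first-order leaky-integrator IIR, a block-wise average as the moving average, the IIR estimate selected before the counter expires, and the parameter values alpha, window, and switch_count. The actual filter orders, selection order, and word lengths of Fig. 31 are not reproduced here.

```python
def dcoc(samples, alpha=1.0 / 64, window=64, switch_count=256):
    """Subtract an estimated DC value from each sample (behavioral sketch)."""
    iir_est = 0.0      # first-order IIR estimate (assumed structure)
    avg_est = 0.0      # latest block-average estimate (assumed structure)
    acc = 0.0
    out = []
    for n, x in enumerate(samples):
        iir_est += alpha * (x - iir_est)      # one multiply followed by one add
        acc += x
        if (n + 1) % window == 0:             # refresh the average every window
            avg_est = acc / window
            acc = 0.0
        # Multiplexer: the counter decides which estimate is used (assumed order).
        dc = iir_est if n < switch_count else avg_est
        out.append(x - dc)
    return out

# Hypothetical test signal: a +/-1 alternating tone riding on a 0.3 DC offset.
rx = [0.3 + (1.0 if n % 2 == 0 else -1.0) for n in range(1024)]
corrected = dcoc(rx)
residual_dc = sum(corrected[-64:]) / 64
```

After the switch, the residual DC in this toy example is at the level of floating-point rounding, because the averaging window spans an integer number of tone periods.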
The hardware structure of the existing IQ estimation circuit is shown in Fig. 32. Dividers and square root (SQRT) circuits implemented in CORDICs together with multipliers and shifting circuits are included in the structure. With a clock period of 100ns and no compile optimization, the critical path of the synthesized logic is from the shifting block to CORDIC SQRT, whose delay is 22.09ns. The four outputs from top to bottom correspond to the (1, 2) element (the 1st row and the 2nd column element), the (2, 2) element, the (1, 1) element, and the (2, 1) element in the 2 × 2 matrix in Eq. (6). The red shaded blocks are removed in the proposed IQ estimation circuit. Fig. 33 shows the hardware structure of the proposed IQ estimation circuit. The newly added squaring circuit is highlighted in blue. This circuit has a word length of (16, 25) whereas the squaring circuit in Fig. 32 has a word length of (2, 25), where (a, b) is defined as a = No. of integer bits and b = No. of fractional bits. The critical path in the proposed circuit is from the squaring circuit to CORDIC SQRT, whose delay is 29.99ns, larger than the delay in the existing circuit. However, the circuit meets the sampling frequency requirement (33ns clock cycle) and hence does not incur any problem. The two outputs (upper and lower) correspond to the (1, 1) element and the (2, 1) element in the 2 × 2 matrix in Eq. (7), which are fed to the IQ compensation block. The IQ estimation circuit operates 10ms after DC offset cancellation. For the first 1ms after the IQ estimation sets in, only the accumulator will run, and subsequently all the building blocks will run. Accumulation times will be different for different channel BWs since the sample rates differ. In the CORDIC divider, the denominator should be larger than the numerator; thus the denominator is left shifted by, say, 4 to make it larger, and the divider output is then left shifted by 4 to return to the original scale.
The CORDIC SQRT should have an input range from 0.5 to 2 and thus the shifting amounts change as the position where the most significant bit (MSB) 1 appears changes. In the CORDIC SQRT, if the input is right shifted by, say, 4, then the output is left shifted by 2 to return to the original scale.
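The shift-based scaling around the CORDIC divider and the CORDIC SQRT can be checked numerically. In the sketch below, ordinary floating-point division and `math.sqrt` stand in for the CORDIC cores, and the function names are hypothetical; only the pre-scaling and post-scaling steps described above are modeled.

```python
import math

def sqrt_input_shift(x):
    """Even right-shift amount that brings x > 0 into the range [0.5, 2)."""
    shift = 0
    while x / 2.0 ** shift >= 2.0:
        shift += 2
    while x / 2.0 ** shift < 0.5:
        shift -= 2
    return shift

def scaled_sqrt(x):
    """Right shift the input by s, take the root, left shift the result by s/2."""
    shift = sqrt_input_shift(x)
    core_in = x / 2.0 ** shift                 # now within [0.5, 2)
    core_out = math.sqrt(core_in)              # stand-in for the CORDIC SQRT
    return core_out * 2.0 ** (shift // 2)

def scaled_divide(num, den, shift=4):
    """Left shift the denominator, divide, then left shift the quotient back."""
    core_out = num / (den * 2.0 ** shift)      # stand-in for the CORDIC divider
    return core_out * 2.0 ** shift

root_24 = scaled_sqrt(24.0)         # input shifted right by 4, output left by 2
quotient = scaled_divide(3.0, 2.0)  # numerator larger than the raw denominator
```

Restricting the SQRT shift to even values is what allows the output correction to be a plain shift by half that amount.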
The IQ estimation and compensation circuit should run while meeting the specification of <32.55ns clock period (or >30.72MHz clock rate) over all the 7 channel BWs or modes. The circuit runs successfully from the clock period of 18ns to 10ns while consuming the area of 20000 to 24000 GE for IQ estimation and 7500 to 9500 GE for IQ compensation.

B. DFE TX
The DFE Tx mainly consists of the anti-drooping filter and the interpolation chain. The anti-drooping filter is a 21-tap transposed FIR filter and by symmetry has 11 coefficients, as shown in Fig. 34. The filter operates in front of the interpolation chain at a clock cycle of 30ns. The red path shows the critical path when the circuit is synthesized in a 180nm CMOS ASIC. The critical path has a multiplier and an adder. The required operating frequencies determined by the filter input frequencies at BWs of 20MHz, 15MHz, 10MHz, 5MHz, 3MHz, 1.4MHz, and 3.84MHz are 30.72MHz, 30.72MHz, 15.36MHz, 7.68MHz, 3.84MHz, 1.92MHz, and 15.36MHz, respectively. As the required critical path delay is tightened from 12ns down to 4ns, the circuit area varies little. However, as the required speed is further increased, the area rapidly increases since the anti-drooping filter with programmable coefficients necessitates considerable change to meet the required speed. Fig. 35 shows the No. of filter taps in the anti-drooping filter (anti-D), the Tx RRC filter, and the interpolation chain for the 7 BWs. The interpolation chain can have up to 6 FIR LPFs (FIR1 to FIR6). The maximum numbers of filter taps in anti-D, RRC, FIR1, FIR2, FIR3, FIR4, FIR5, and FIR6 are 21, 81, 36, 11, 7, 5, 4, and 4, respectively. Each filter hardware is implemented according to the maximum No. of taps and the resulting hardware complexity of the DFE Tx in terms of the No. of filter taps is shown in Fig. 36. The upward arrow followed by 2 denotes lowpass filtering preceded by upsampling by 2. The RRC filter occupies almost half of the overall hardware complexity.
The FIR filters in the interpolation chain have transposed forms, among which the FIR1 LPF that has the maximum No. of filter taps is shown in Fig. 37. The FIR1 filter is designed on the basis of coefficient symmetry. Among the 7 BWs, the 3MHz BW has the maximum No. of taps, 36, and hence FIR1 has 18 coefficients by symmetry. Each FIR filter in the interpolation chain has a register for upsampling. The input register, in_tmp, has a control input, valid, such that if valid = 1, the input is passed on to the output of in_tmp whereas if valid = 0, the output is 0 so that the register serves as an upsampler. A 2-to-1 multiplexer is used to select either an even No. of coefficients or an odd No. of coefficients. The red path represents the critical path when the circuit is synthesized in a 180nm CMOS ASIC. In the critical path lie a 19-bit multiplier and a 20-bit adder between two flip-flops. The critical path delay is 13.54ns, smaller than the required period, 16.27ns (61.44MHz when BWs are 20MHz and 15MHz). Programmable coefficients are used and the area changes little until the required clock cycle is reduced to 4ns, but the area rapidly increases as the required clock period drops below 4ns.
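The role of the in_tmp register as an upsampler can be illustrated with a behavioral model: when valid = 1 the sample passes, and when valid = 0 a zero is inserted, after which the FIR LPF fills in the missing values. This is a minimal sketch; the 3-tap linear-interpolation filter and the input samples are hypothetical and far shorter than the actual FIR1.

```python
def in_tmp(sample, valid):
    """Input register: pass the sample when valid is 1, output 0 otherwise."""
    return sample if valid else 0.0

def upsample_by_2(x):
    """Zero-stuffing upsampler built from alternating valid = 1 / valid = 0."""
    out = []
    for sample in x:
        out.append(in_tmp(sample, 1))
        out.append(in_tmp(sample, 0))
    return out

def fir(x, h):
    """Direct convolution truncated to len(x); stands in for the FIR LPF."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

LPF = [0.5, 1.0, 0.5]                 # hypothetical linear-interpolation filter
stuffed = upsample_by_2([1.0, 3.0, 5.0])
interpolated = fir(stuffed, LPF)
```

Here the filter output, after a one-sample delay, contains the original samples with their midpoints in between, which is the time-domain view of removing the image created by zero stuffing.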
The RRC filter and the other filters in the interpolation chain are synthesized and implemented in a similar fashion. The RRC filter is bypassed in 4G and is required to run at 15.36MHz (a clock cycle of about 60ns) in 3G. The RRC filter uses fixed coefficients and hence constant multipliers are employed. As a result, the area increases in proportion to the required speed. The FIR2 filter should operate at 122.88MHz (∼8ns) in case of the 20MHz and 15MHz BWs. FIR3 should run at 245.76MHz (∼4ns) for the 20MHz and 15MHz BWs. For FIR3, identical coefficients are used for BWs of 20MHz, 10MHz, and 5MHz and hence constant multipliers can be employed. FIR4 is bypassed when BWs are 20MHz and 15MHz and is required to operate at 245.76MHz in case the BW is 10MHz. The No. of taps in FIR4 is smaller than that in FIR1 or FIR2. FIR5 is used only when the BWs are 3MHz and 1.4MHz and is required to run at 122.88MHz for the 3MHz BW. FIR5 is bypassed when the BWs are 20MHz, 15MHz, 10MHz, 5MHz, and 3.84MHz and can be designed with constant multipliers. FIR6 is used only for the case when the BW is 1.4MHz and is bypassed for all the other BWs. FIR6 uses constant multipliers and is required to run at 122.88MHz. All the filters in the DFE Tx meet all of the requirements, as is the case with the DFE Rx filters.

V. SIMULATION & ASIC IMPLEMENTATION RESULTS OF THE OVERALL DFE RX AND DFE TX
In this section, the modem SNR performance is simulated for both the overall DFE Rx and the overall DFE Tx and the unwanted emission is simulated for the DFE Tx. Also, the area and the speed are provided for the overall DFE Rx and the overall DFE Tx ASIC hardware and the unwanted emission is provided for the DFE Tx fixed-point ASIC hardware. Fig. 38 shows the simulation model and environment for the DFE Rx, where the DCOC circuit, the anti-drooping filter, the decimation chain, and the IQ estimation (est.) and compensation (comp.) circuit are all included in the DFE Rx. The BS modem is the base station modem and the UE modem the user equipment modem. RF impairments such as IQ imbalance, DC offset, and drooping are also modeled and the interference or blockers such as adjacent channel selectivity (ACS), in-band blocker (IBB), and narrowband blocker (NBB) are taken into account. Fig. 39 shows the SNR at the modem output, which is obtained from the model in Fig. 38. The black line represents the simulated SNR when there is no interference; the solid lines denote the mean SNR and the dotted lines denote the minimum SNR. All the channel BWs and all the interference scenarios satisfy both the 40dB mean SNR requirement and the 20dB minimum SNR requirement of 3G and 4G. To obtain a receive performance at a certain level (e.g., packet error rate <10%) for each modulation scheme, the required SNR should be satisfied at the output of the 1-tap equalizer in the modem (not at the output of the DFE Rx). In this work, the modem SNR is measured under various interference conditions in Fig. 39. The required SNR for each modulation is modem implementation-dependent but generally >3dB, >11dB, and >18dB for QPSK, 16QAM, and 64QAM, respectively. In this work, we designed the DFE Rx and Tx such that the mean SNR and the minimum SNR for the user BW are >40dB and >20dB, respectively, in order to support modulation schemes up to 64QAM with a given modem algorithm.
Fig. 40 shows the simulation model and environment to verify the DFE Tx performance, where the anti-drooping filter, the RRC filter, and the interpolation chain made up of many LPFs are all included. The margin of the unwanted emission is measured after the UE modem signal passes through the DFE Tx and experiences RF drooping, and the overall SNR is measured after the signal also passes through the BS modem. The red path is for 4G while the green path is for 3G. Fig. 41 shows the unwanted emission when the 4G BW is 15MHz. To realize the 15MHz 4G mode, the anti-drooping filter together with only 3 LPFs in the interpolation chain is activated while the upsampler by 4 and the Tx RRC filter are dispensed with. The margin for both SEM and SE is >6dB, meeting the standard. Fig. 42 shows the unwanted emission for the 3G BW of 3.84MHz, which also satisfies the >6dB margin required in the standard regarding SEM and SE. In this mode, all the blocks are turned on except the last three LPFs in the interpolation chain. Fig. 43 shows the unwanted emission margin for each BW, which is measured according to the method used to obtain the results in Fig. 41 and Fig. 42. The SEM, SE, and ACLR requirements (the >6dB margin spec) of all the seven BWs or modes are satisfied by the designed DFE Tx. The SEM with the 1MHz measurement BW increases rapidly as the signal BW drops from 5MHz to 1.4MHz since the SEM PSD drops more rapidly as the signal BW drops in this region. Fig. 44 shows the modem SNR for each of the BWs, where the mean SNR and the minimum SNR specs are set identical to those of the DFE Rx. For the 4G BWs, the blue line denotes the case when the anti-drooping filter is used while the red line denotes the case when the anti-drooping filter is not employed. The solid line is for the mean SNR and the dotted line for the minimum SNR. For the 3G BW, the blue O and the red O denote the mean SNR cases when the anti-drooping filter is present and absent, respectively. The blue X and the red X denote the minimum SNR cases when the anti-drooping filter is present and absent, respectively. The 4G modes always meet the spec whether or not the drooping is compensated for, and the SNR values are similar whether or not the anti-drooping filter is used. However, the 3G mode cannot satisfy both the mean SNR and the minimum SNR specs in the absence of anti-drooping to compensate for the drooping. The DFE Tx should meet the error vector magnitude (EVM) of -40dB, corresponding to the mean SNR of 40dB.
Fig. 45 shows the SNR variation when the MATLAB model is converted to hardware in case of the 3MHz channel BW. The quantization error gives rise to some SNR degradation in the hardware implementation, compared with the MATLAB model. Specifically, the degradation of the mean SNR (left half of Fig. 45) is about 2dB at worst and the degradation of the minimum SNR (right half of Fig. 45) is about 0.3dB at worst. For all the channel BWs and interference scenarios, the implemented ASIC hardware meets the 40dB mean SNR requirement and the 20dB minimum SNR requirement.
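Why uncompensated drooping lowers the SNR of a pulse-shaped signal, and how an SNR figure maps to an EVM figure, can be illustrated with a toy calculation. The sketch below is not the model of Fig. 21 or Fig. 40: it assumes an ideal raised-cosine composite pulse with a roll-off of 0.22 at 8 samples per symbol, a hypothetical 3-tap lowpass as the droop, an ISI-only SNR defined as main-tap power over the power of the other symbol-spaced taps, and the usual relation EVM = 1/sqrt(SNR) when the error is the only impairment.

```python
import math

def raised_cosine(t, beta):
    """Raised-cosine pulse at time t (in symbol periods) with roll-off beta."""
    sinc = 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)
    denom = 1.0 - (2.0 * beta * t) ** 2
    if abs(denom) < 1e-12:                 # removable singularity
        return (math.pi / 4.0) * sinc
    return sinc * math.cos(math.pi * beta * t) / denom

def convolve(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def isi_snr_db(pulse, sps):
    """Main-tap power over the power of all other symbol-spaced taps, in dB."""
    peak = max(range(len(pulse)), key=lambda i: abs(pulse[i]))
    isi = sum(pulse[k] ** 2
              for k in range(peak % sps, len(pulse), sps) if k != peak)
    return math.inf if isi == 0.0 else 10.0 * math.log10(pulse[peak] ** 2 / isi)

def evm_percent_from_snr_db(snr_db):
    """EVM in percent, assuming EVM = 1/sqrt(SNR)."""
    return 100.0 * 10.0 ** (-snr_db / 20.0)

SPS, SPAN, BETA = 8, 6, 0.22               # assumed oversampling, span, roll-off
rc = [raised_cosine(n / SPS, BETA) for n in range(-SPAN * SPS, SPAN * SPS + 1)]
droop = [0.25, 0.5, 0.25]                  # hypothetical droop response

snr_ideal = isi_snr_db(rc, SPS)            # Nyquist pulse: essentially no ISI
snr_drooped = isi_snr_db(convolve(rc, droop), SPS)
evm_at_40db = evm_percent_from_snr_db(40.0)
```

Even this mild droop breaks the Nyquist zero crossings and caps the ISI-limited SNR, which is the mechanism the anti-drooping filter is meant to undo; a 40dB SNR corresponds to an EVM of about 1%, i.e., -40dB.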

B. ASIC IMPLEMENTATION RESULTS OF THE OVERALL DFE RX AND DFE TX
Since each of the building blocks in the DFE Rx operates at its own speed, logic synthesis is conducted individually for each building block. The building block having the largest area is the CSF owing to the large No. of taps and heavy use of multipliers and adders. The CSF uses different coefficients for different channel BWs and hence has little room to potentially reduce the area. Across all the scenarios, the operating speed of each block has a >2ns margin, as listed in Fig. 46. The area is denoted in the gate equivalent of two-input NAND gates. The slack is the difference between the spec delay and the actual delay. The area breakdown of the DFE Rx is drawn in Fig. 47. The CSF, the FIR4 filter, and the anti-drooping filter consume a large area. The circuit is implemented in a 180nm CMOS ASIC technology. For the given 0.18µm CMOS process technology, the two-input NAND gate has an area of 9.979µm².
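The gate-equivalent and slack figures above translate into physical area and timing margin with simple arithmetic. The sketch below uses the NAND gate area quoted in the text; the function names are hypothetical, and the example inputs (80000 GE for the CSF, a 16ns spec against a 13.72ns delay for the anti-drooping filter) are taken from Section IV.A.

```python
NAND2_AREA_UM2 = 9.979          # two-input NAND area in the 0.18um process

def ge_to_mm2(gate_equivalents):
    """Convert a gate-equivalent count to silicon area in mm^2."""
    return gate_equivalents * NAND2_AREA_UM2 * 1e-6

def slack_ns(spec_delay_ns, actual_delay_ns):
    """Slack is the spec delay minus the actual delay."""
    return spec_delay_ns - actual_delay_ns

csf_area_mm2 = ge_to_mm2(80000)             # CSF, 80000 GE
anti_d_slack = slack_ns(16.0, 13.72)        # anti-drooping filter
```

For these inputs the CSF occupies roughly 0.8mm² of logic area and the anti-drooping filter keeps a slack of about 2.3ns, consistent with the >2ns margin stated above.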
After the gate-level synthesis with a 180nm CMOS process technology, the leakage power is negligible and the overall dynamic power dissipation is as follows for each block. For the DFE Rx, the DCOC circuit consumes 4.32mW, the anti-drooping filter 12.9mW, FIR1 1.97mW, FIR2 3.6mW, FIR3 6.3mW, FIR4 22.6mW, the CSF 42.6mW, the IQ estimation circuit 132.9mW, and the IQ compensation circuit 9.8mW, for a total of 237mW. For the DFE Tx, the anti-drooping filter consumes 12.9mW, the RRC filter 39.2mW, FIR1 11.3mW, FIR2 3.7mW, FIR3 2.5mW, FIR4 1.92mW, FIR5 1.54mW, and FIR6 1.54mW, for a total of 75mW. Fig. 48 compares the floating point simulation result and the fixed point hardware result of the unwanted emission (4G, 15MHz BW). The two results are similar in the SEM margin (30kHz and 1MHz measurement BWs) while the SE margin in hardware degrades by ∼10dB owing to the quantization effect (14 bit output) but still meets the spec. Fig. 49 compares the floating point result and the fixed point result of the unwanted emission margin for each BW. The unwanted emission margin of the fixed point hardware is worse than that of the floating point simulation by ∼20dB for SE but still well above the standard (the >6dB margin spec). Fig. 50 compares the SNR from the floating point simulation and the SNR from the fixed point hardware for each BW. The solid blue line and the solid red line denote the mean SNR values from the floating point simulation and the fixed point hardware, respectively. The dotted blue line and the dotted red line denote the minimum SNR from the floating point simulation and the fixed point hardware, respectively. Owing to the quantization error in hardware, the mean SNR is degraded by 10dB to 22dB but still meets the 40dB requirement, and the minimum SNR is lowered by 20dB for the 1.4MHz BW but still meets the 20dB requirement.
The SNR in 3G is affected little by the quantization noise since it is heavily affected by the ISI. Fig. 51 shows the area and the delay along with the spec delay and the slack (margin) of each block in the DFE Tx. Since each block operates at a different speed, the blocks are synthesized separately. The most area-consuming block is the RRC filter owing to the large No. of taps with many multipliers and adders. FIR3 and FIR4 have small timing margins or slacks, which may be relaxed further if cascaded integrator comb filters are employed. Fig. 52 is the pie chart for the area breakdown of the DFE Tx. The anti-drooping filter, the RRC filter, and the FIR1 LPF together account for ∼70% of the overall DFE Tx area. The logic is implemented in a 180nm CMOS process technology.
Purely digital circuits in CMOS process technologies can be easily compared with one another since the dependence on technology is clear. Specifically, according to the feature length of the CMOS very-large-scale integrated (VLSI) system, gate count or area, speed, and power consumption can be normalized. Namely, speed is inversely proportional to the feature length, area is proportional to the square of the feature length, and power consumption is proportional to the feature length. Since performance, power, and area of the digital CMOS VLSI system scale very smoothly with the feature length, a digital circuit in a 180nm technology can be easily translated to the same digital circuit in a technology with another feature length for performance, power, and area prediction.
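The first-order scaling rules stated above can be applied directly to the reported 180nm figures. The sketch below is only an illustration of those three proportionalities; the 65nm target node is a hypothetical example, and real designs deviate from such ideal scaling.

```python
def scale_to_node(area, delay, power, node_from_nm=180.0, node_to_nm=65.0):
    """Scale area, delay, and power using the first-order rules in the text.

    Area scales with the square of the feature length; delay (the inverse
    of speed) and power scale linearly with the feature length.
    """
    ratio = node_to_nm / node_from_nm
    return area * ratio ** 2, delay * ratio, power * ratio

# DFE Rx CSF area (80000 GE x 9.979um^2), anti-drooping filter critical
# path delay (13.72ns), and total DFE Rx dynamic power (237mW) at 180nm.
area_um2, delay_ns, power_mw = scale_to_node(80000 * 9.979, 13.72, 237.0)

# Halving the feature length: a quarter of the area, half the delay and power.
half_node = scale_to_node(4.0, 2.0, 2.0, node_from_nm=180.0, node_to_nm=90.0)
```

Under these rules, the 237mW DFE Rx would be predicted to draw about 86mW at the hypothetical 65nm node.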

VI. CONCLUSION
A digital front-end receiver, comprising a DC offset cancellation circuit, an anti-drooping filter, a decimation chain, and an in-phase and quadrature estimation and compensation circuit, is modeled, designed, and implemented from algorithm to hardware, for the purpose of converting the sample rate and digitally compensating for RF impairments. Also a digital front-end counterpart in the transmitter, comprising an upsampler, an anti-drooping filter, a root raised cosine filter, and an interpolation chain, is modeled, designed, and implemented to raise the sample rate and reject the unwanted emission. The inter-symbol interference is modeled and analyzed in the transceiver to meet the SNR requirements and all the seven BWs or modes in 3G and 4G are supported with low complexity for wearable devices. The proposed ASICs are implemented in a 180nm CMOS process technology.