A Novel Parallel Digitizer With a Pulseless Mixing-Filtering-Processing Architecture and Its Implementation in a SiGe HBT Technology at 40GS/s

Mixing-Filtering-Processing (MFP) digitizers are a class of high-speed digitizers, employing mixers, filters and data converters to obtain high sampling frequencies and large bandwidths. We propose a variant of the Asynchronous Time Interleaving (ATI) architecture which employs synchronous frequencies for the mixers and ADCs, allowing simplified correction of aliasing mismatches via linear filters whose coefficients can be estimated via single-tone tests using well-known linear estimation algorithms. The architecture uses rectangular waves with 50% duty cycle to simplify the hardware implementation and maximize the signal-to-noise ratio of the front-end, thus obtaining a very simple structure requiring few high-frequency analog blocks to implement very fast digitizers: clock dividers, mixers, I/O buffers, and lowpass filters are all that is required to perform MFP digitization, besides the back-end ADCs and the (linear) signal processing for aliasing removal. The proposed architecture is also cascadable and allows the design of multi-channel (4, 8 or more channels) hierarchical MFP digitizers: using the same chip, multiple front-ends can be cascaded to obtain more channels with narrower bandwidth, which can finally be digitized by slower ADCs. The front-end of the two-channel digitizer has been designed in the STMicroelectronics SiGe BiCMOS55 technology, measured, and calibrated. Results prove that aliasing-correction filters can be synthesized and that overall accuracy, after the removal of aliasing terms, is limited by noise and distortions to about 5 equivalent bits from 0 to 20GHz, experimentally validating the calibration technique for mismatch errors.

of GHz, and high sampling speed, to comply with the Nyquist sampling condition.
Given the stringent bandwidth and sampling frequency requirements, some form of time-interleaving is used. Conventional time-interleaving (TI) architectures [3], [17], [18] use two or more analog-to-digital converters (ADCs) in parallel, driven by delayed clocks at lower rates, to reconstruct the input signal at a higher overall sampling frequency. However, each ADC operates on the full input signal, so that the ADCs must have an input bandwidth as large as the required system's bandwidth. The input bandwidth requirement of the ADC can be relaxed by using an input wideband track & hold [13].
All interleaved architectures, including DBI, FI and ATI, are affected by aliasing caused by channel mismatches [13], [22], [23], [24], which needs to be corrected in digital post-processing to maximize the system's SNDR. Conventional time-interleaving can remove aliasing via linear filtering, for instance via FIR filters, so that the parameter estimation and real-time correction problems are greatly simplified [17], [18]. The filter coefficients can be estimated with linear techniques that minimize the L1, L2, or L∞ norm of the error, which require solving linear systems and/or linear programming problems [29]. This is in general not true for all types of MFP digitizers: some require solving nonlinear nonconvex optimization problems [20], which may greatly complicate the estimation and correction of aliasing errors, while others use backpropagation to account for nonlinearities in the parameter space due to cascaded blocks [28], or iterative Gauss-Seidel approximations [26].
An MFP digitizer that allows simple signal reconstruction and calibration of mismatch errors, exploiting the synchronicity of the clock used in the mixers and the ADCs, has been proposed by some of the Authors in [23]. It is a variant of the ATI architecture that uses a single clock tree to drive the mixers and the ADCs, which operate at the same frequency and with a constant relative delay of half sampling period, as in the conventional TI architectures. In the absence of mismatches, no signal processing is required to reconstruct the output, other than upsampling and interleaving, whereas in the real case, where mismatches between channels are present, a FIR filter can be used to reduce aliasing via linear signal processing, as in conventional TI digitizers. The filter coefficients can be determined using low-cost and numerically stable convex linear optimization techniques, which are guaranteed to find the global optimum and use algorithms of polynomial complexity [29].
In this paper we present and validate an implementation of this synchronous MFP digitizer architecture, including its calibration procedure. In the proposed implementation, a clock divider is used to create a rectangular clock wave with 50% duty cycle, to be used in the mixers instead of the pulse train typically used in the ATI approaches. This simplifies the architecture and has advantages in terms of signal-to-noise ratio (SNR), at the cost of a slight gain loss in the second half of the Nyquist band of the digitizer, which can however be easily compensated in the digital domain with conventional FIR equalization. The resulting hardware structure is very simple, since a clock divider with limiters, two mixers, two lowpass filters and four output buffers are enough to compose the front-end, and no pulse generators are necessary to create something akin to a "Dirac" pulse train.
The use of a synchronous architecture also allows a modular design, because several stages can be cascaded to obtain multi-layered hierarchical [23] MFP architectures with 4, 8 or potentially more channels: the input clock is frequency-divided by two and used to drive the mixers, and is then buffered out of the chip to drive other MFP front-ends (which will further divide the clocks by two) or the final ADCs. Hence, the front-end chip is cascadable and allows creating a $2^L$-channel MFP digitizer with an output bandwidth of $f_S/2^{L+1}$, where $f_S$ is the input clock frequency. This allows a single chip to be used with different ADCs operating at a sampling frequency of $f_S/2$, $f_S/4$, ..., $f_S/2^L$, depending on the number of output channels and of course on the sampling frequency of the available ADCs. Modularity and cascadability are possible because the input signal is split into two output signals, and the input clock is likewise split into two output clocks: hence, the MFP front-end chip can drive either two other MFP chips operating at half frequency, or the final ADCs. At each layer, the number of channels doubles, and each channel has half the bandwidth and can thus be digitized at half the sampling rate.
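The scaling rule above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `hierarchical_plan` and its arguments are hypothetical, and the formulas are the ones stated in the text ($2^L$ channels, ADC rate $f_S/2^L$, channel bandwidth $f_S/2^{L+1}$).

```python
def hierarchical_plan(f_clock_hz, layers):
    """Channel count, per-channel bandwidth and ADC rate after `layers` cascaded front-ends.

    Hypothetical helper: it only evaluates the scaling rule quoted in the text.
    """
    if layers < 1:
        raise ValueError("at least one front-end layer is required")
    channels = 2 ** layers                          # the channel count doubles at each layer
    adc_rate_hz = f_clock_hz / 2 ** layers          # f_S / 2^L
    bandwidth_hz = f_clock_hz / 2 ** (layers + 1)   # f_S / 2^(L+1)
    return channels, bandwidth_hz, adc_rate_hz


if __name__ == "__main__":
    for layers in (1, 2, 3):
        n_ch, bw, rate = hierarchical_plan(40e9, layers)
        print(f"L={layers}: {n_ch} channels, {bw / 1e9:g} GHz each, {rate / 1e9:g} GS/s ADCs")
```

With a 40GHz input clock, one layer gives the two 10GHz channels of the chip presented here, and two layers give four 5GHz channels.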
To validate the architecture, we have designed a 40GS/s MFP digitizer front-end with an input bandwidth of 20GHz and two outputs with 10GHz bandwidth. The chip, implemented in the STMicroelectronics BiCMOS55 process using HBT devices [30], can be used in a hierarchical architecture [23] to further reduce the output bandwidth, obtaining for instance four 5GHz output signals, which can be digitized via commercial off-the-shelf (COTS) components [31], [32]. The front-end comprises a clock divider [33] and two mixers, and only the lowpass filters [34] and the ADCs need to be added at the outputs to complete the entire digitizer (which also requires real-time digital signal processing for aliasing correction and possibly equalization [17], [18], [22], [23], [24]).
The chip has been measured and the acquired data have been processed to assess noise, distortions, and aliasing before and after calibration. The calibration procedure proposed in [24] has been experimentally validated and allows improving the SNDR by up to 15dB at low frequencies.
The ensuing digitizer is dominated by noise and distortions, whereas aliasing becomes negligible after calibration, which is performed using FIR filters [35] estimated from the acquired data. Section II describes the proposed MFP architecture, highlighting the properties and advantages of the pulseless MFP architecture and the use of a single clock tree. Section III describes the aliasing removal and correction filter synthesis process. Section IV describes the chip and board design. Section V reports the results of chip measurements and subsequent signal processing, with and without calibration for aliasing removal. Section VI concludes.

II. THE PROPOSED MFP ARCHITECTURE
Figure 1 shows the architecture of a synchronous two-channel MFP digitizer. The proposed MFP architecture is an evolution of the ATI architecture [13], but it is simplified in terms of analog blocks and digital signal processing, and is cascadable to obtain multi-channel hierarchical MFPs [23] using the same hardware (except for the LPF cut-off frequency, which must be halved at each layer). The proposed architecture uses the same clock for the mixers and ADCs, so that no digital modulators are required for calibration, and fewer spurs are present at the output: aliasing removal only requires (cyclo-stationary) linear FIR filters to equalize the channels, and simple linear estimation techniques with single-tone sinusoidal input waveforms can be used for calibration [24].
The RF front-end is simple: one clock divider, two mixers, and two lowpass filters are all that is required to turn a 20GHz signal into two 10GHz signals, to be digitized by 20GS/s ADCs. Furthermore, the outputs of the lowpass filters and of the clock divider can be used to drive two identical MFP front-ends, obtaining four 5GHz signals, requiring 10GS/s ADCs. This process can be repeated multiple times, doubling the number of outputs and halving the required ADC sampling frequency and input bandwidth at each iteration [23]: the so-called "hierarchical" architecture would be composed of identical front-ends, except that the required lowpass filter cutoff frequency is halved at each iteration (10, 5, 2.5GHz, ...).
The input clock at 40GS/s is divided into two output clocks at 20GS/s, which directly feed the mixers and ADCs, or the subsequent front-end stage. The mixers operate with a 50%-duty cycle square wave, and the mixers and ADCs have the same clock frequency and relative delay difference. Driving the mixers with 50% duty-cycle waves, it is possible to maximize signal power (as explained in Section II-A) with respect to noise, at the cost of a gain loss at high frequency which can be easily equalized using the same linear filters employed in aliasing removal (as explained in Section II-B).
The input signal (with 20GHz bandwidth) is fed to two on/off mixers, operating in counterphase at 20GS/s. Because the mixers do not fulfil the Nyquist condition, aliasing occurs and the 0-10GHz input bandwidth is superposed to the 10-20GHz input bandwidth [22].
The mixer is not a conventional mixer, because its output is either equal to the input (except possibly for a gain factor) or zero, depending on the clock phase. Hence, it should be modeled as a multiplier of the input signal with a square wave that alternates between 0 and 1.
The lowpass filter (with cut-off frequency of 10GHz) removes the frequency content beyond 10GHz: the remaining two 0-10GHz signals at the filter outputs contain both the 0-10GHz and the 10-20GHz input signal bands, superposed. Because the mixers operate in phase opposition, the superposition of the two half bands occurs with different coefficients [22] and can thus be disentangled in the digital domain after the acquisition of the filter outputs.
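The behavior of the two on/off mixers can be illustrated numerically. The sketch below is not the authors' simulation: it assumes a unit-amplitude 13GHz test tone, a 320GS/s time grid and ideal 0/1 square waves, and it measures the mixer outputs before any lowpass filtering. It shows that each mixer output contains the input tone together with its image at 20GHz minus the input frequency (the component that survives a 10GHz lowpass filter), and that the two images have opposite signs in the two channels.

```python
import cmath
import math

FS = 320e9     # simulation rate (assumed for this sketch)
F_CLK = 20e9   # mixer clock after the divide-by-two
N = 320        # 1 GHz frequency resolution, integer number of periods of every tone


def tone_phasor(x, f_hz):
    """Complex amplitude of the component of x at f_hz (single-bin DFT)."""
    k = round(f_hz * N / FS)
    acc = sum(v * cmath.exp(-2j * math.pi * k * n / N) for n, v in enumerate(x))
    return 2 * acc / N


def mixer_outputs(f_in_hz):
    """On/off mixing of a unit cosine by two counterphase 50% duty-cycle square waves."""
    period = round(FS / F_CLK)                        # 16 samples per clock period
    x = [math.cos(2 * math.pi * f_in_hz * n / FS) for n in range(N)]
    p0 = [1.0 if (n % period) < period // 2 else 0.0 for n in range(N)]
    y0 = [a * b for a, b in zip(x, p0)]               # first mixer
    y1 = [a * (1.0 - b) for a, b in zip(x, p0)]       # second mixer, complementary clock
    return y0, y1


f_in = 13e9
y0, y1 = mixer_outputs(f_in)
direct0, direct1 = tone_phasor(y0, f_in), tone_phasor(y1, f_in)
image0, image1 = tone_phasor(y0, F_CLK - f_in), tone_phasor(y1, F_CLK - f_in)
print("direct tone, channels 0/1:", direct0, direct1)
print("image at 7 GHz, channels 0/1:", image0, image1)
```

In this idealized setting the direct components are identical in the two channels while the images are equal and opposite, which is what allows the two half-bands to be separated digitally.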
After the lowpass filters, two ADCs operating at 20GS/s in phase opposition digitize the signals. Hence, the ADCs operate in their first Nyquist band, as their input bandwidth is 0 to 10GHz.
The digital signal processing (DSP) required for calibration only requires upsampling by 2 and cyclo-stationary FIR filtering, as in conventional TI-ADCs, instead of digital lowpass filters, upsamplers, and mixers as in [14].
The proposed architecture is analyzed in Section II-A, where the implications of using a square wave clock are considered. The calibration of aliasing artifacts due to mismatches is reported in Section II-B. A comparison with the complexity of similar interleaved ADCs is reported in Section II-C.

A. THE PULSELESS ARCHITECTURE WITH SYNCHRONOUS FREQUENCIES
The mixer in the MFP architecture in Figure 1 operates as a track and hold whose output is either the input signal (when the clock is high) or zero (when it is low). Hence, it is equivalent to multiplying the input signal by a square wave, which we call p (t). Because it will be implemented as a (modified) Gilbert cell driven in saturation to maximize the gain, the mixer is either on or off. We assume that the duty cycle of the square wave is D ∈ [0, 1], shown in Figure 2 for the two cases of D = 0.1, 0.5. In the ideal MFP theory, the pulse train is composed of Dirac pulses. However, Dirac pulses do not exist, so that the square wave will alternate between 0 and 1, and its area will depend on D.
MFP theory shows that the Fourier series of the pulse shape influences the digitizer's frequency response, causing a gain variation in the first and second halves of the Nyquist band.
The first two coefficients of the Fourier series of a square wave of duty cycle D and period T are, in magnitude, $c_0 = D$ and $c_1 = \sin(D\pi)/\pi$. Hence, the gain in the first half of the Nyquist band will be proportional to $D$, and in the second half to $\sin(D\pi)/\pi$. If the duty cycle falls to zero, the first two Fourier terms will be identical, so that the gain of the MFP digitizer will be mostly flat. If the duty cycle is 50% (D = 0.5), however, the first term will be 1/2, and the second term 1/π.
The interesting point is that for D → 0 the gain will be flat but very low, because the pulse train will have very little energy and the output of the mixer will be 0 most of the time. In this case, the noise performance of the system will be inadequate. However, if D = 0.5, the mixer output power will be maximized, thus improving noise performance, but there will be a gain loss of 2/π between the first and second halves of the Nyquist band. Because of this, the architecture cannot provide flat gain, but this gain loss can be compensated with digital equalization.
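The trade-off between flatness and signal power can be made concrete by evaluating the two coefficients quoted above for a few duty cycles. The helper name `band_gains` and the chosen duty cycles are illustrative only.

```python
import math


def band_gains(duty):
    """First two Fourier coefficients (magnitudes) of a 0/1 rectangular wave."""
    g_first = duty                                   # DC term: gain in the first half-band
    g_second = math.sin(math.pi * duty) / math.pi    # fundamental: gain in the second half-band
    return g_first, g_second


if __name__ == "__main__":
    for duty in (0.01, 0.1, 0.5):
        g1, g2 = band_gains(duty)
        loss_db = 20 * math.log10(g2 / g1)
        print(f"D={duty:4.2f}  first={g1:.4f}  second={g2:.4f}  second/first={loss_db:+.2f} dB")
```

For D = 0.5 the ratio between the two gains is 2/π, i.e., a loss of slightly less than 4dB in the second half of the Nyquist band, whereas for small D the two gains coincide but are both close to zero.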
Generating a 50% duty-cycle square wave is easy, because only a divide-by-two clock divider is required; hence the circuitry is significantly simplified: a single clock divider will drive two mixers in saturation, and two lowpass filters can then be added at the output of the mixers.
The architecture only employs a single clock tree, starting with a 40GHz clock and producing two output clocks at half the input frequency. In a 2-channel architecture, the clocks will drive two 20GS/s ADCs: this architecture is thus synchronous in frequency. However, the relative delay $\tau$ between the mixer clocks and the ADC clocks will impact the frequency response of the system, because the lowpass filters with frequency response $L(f)$ will have an equivalent frequency response $L'(f) = L(f)\,e^{-j2\pi f\tau}$. Because of how the MFP system operates, the system's frequency response will depend on $L'(f)$ and its aliased counterpart, $L'(f_S/2 - f)$. The end result is not a mere delay in the frequency response, because of the aliased delay component, whose phase is no longer linear in frequency. The system's frequency response can be significantly affected by a delay, because the aliased delay can cause destructive interference and create a zero in the frequency response around the center of the Nyquist band. It is thus essential to take care of this delay by properly sizing the transmission lines in the actual system. Of course, short delays will have a negligible impact, but long cables can be an issue.
The architecture is cascadable and thus modular: a single front-end with 40GHz input clock can split a 20GHz signal into two 10GHz signals, and two additional front-ends will operate with the 20GHz output clocks and the two 10GHz output signals of the first stage, to produce four 5GHz signals and four 10GHz clocks.
These clocks can then be used to create a 4-channel hierarchical MFP, if terminated by four 5GHz lowpass filters and four 10GS/s ADCs, or the hierarchical architecture can be extended to an eight-channel architecture with 2.5GHz signals and requiring 5GS/s ADCs.

B. ALIASING REMOVAL VIA DIGITAL CALIBRATION
In the following, we use the term "calibration" for aliasing removal, and "equalization" for gain and phase error correction. Both operations are performed by linear FIR filtering in the digital domain. The theory of the two-channel MFP digitizer [22], [23], [24] reveals that the output, after the ADCs, can be reconstructed with linear signal processing, i.e., FIR filters, estimated by convex optimization methods which are numerically efficient and stable [29]. In the absence of mismatches, no aliasing occurs, and the input signal can be reconstructed by simple interleaving of the ADC outputs (after upsampling by 2 to 40GS/s, which requires no signal processing hardware). If mismatches are present, as is always the case in actual systems, a linear FIR filter operating on the second channel can be used to equalize the second channel with respect to the first, thus eliminating aliasing.
Figure 3 shows the DSP required by calibration and equalization; two possible implementations of the processing are considered. In Figure 3a, equalization is performed after calibration, because $H_1$ removes aliasing, and $H_{eq}$ removes the remaining linear errors. In Figure 3b, the filters $H'_0 = H_{eq}$ and $H'_1 = H_1 H_{eq}$ perform calibration and equalization at the same time. The two schemes are equivalent. The FIR filters, operating after upsampling by 2, can be implemented in polyphase form [35] because half the inputs are zero, thus halving the total computational cost. In this paper, we use the scheme in Figure 3a, first removing aliasing and then equalizing the system's frequency response. Because the two schemes are equivalent, once the filters are identified both can be implemented.
In the ideal case (i.e., in the absence of mismatches) the correction filters are just gain terms (or, equivalently, FIR filters of length 1, with an all-pass frequency response with flat gain and zero phase). Hence, if the mismatch errors are small the correction filters will be almost all-pass, whereas $H_{eq}$ will need to equalize the system's frequency response, including the impact of the 50%-duty cycle clock discussed in the previous sub-section. The DSP block reported in Figure 3 requires FIR filter coefficients, which need to be estimated from the data. In this sub-section we focus on aliasing removal, so that we assume $H_0 = 1$ and only synthesize $H_1$. We use the angular frequency $\omega$ to express the input frequencies $f$, with mapping $\omega = 2\pi f/f_S$, where $f_S$ = 40GS/s is the system sampling frequency: as the input frequency goes from 0 to 20GHz, the angular frequency goes from 0 to π.
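The saving offered by the polyphase form can be verified with a short sketch. It assumes, for simplicity, a causal FIR filter with an arbitrary coefficient list `h` (the correction filters discussed in this paper are centered on l = 0), and all function names are hypothetical. Filtering the zero-stuffed signal gives the same result as running the even- and odd-indexed coefficients on the original samples and interleaving the two outputs, so the multiplications by zero are never performed.

```python
def upsample2(x):
    """Insert a zero after every sample (upsampling by 2)."""
    out = []
    for v in x:
        out.extend((v, 0.0))
    return out


def fir(x, h):
    """Direct-form causal FIR filter, output truncated to len(x)."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def polyphase_fir_up2(x, h):
    """Same result as fir(upsample2(x), h), skipping the multiplications by zero."""
    h_even, h_odd = h[0::2], h[1::2]
    even_out = fir(x, h_even)   # output samples 0, 2, 4, ...
    odd_out = fir(x, h_odd)     # output samples 1, 3, 5, ...
    out = []
    for e, o in zip(even_out, odd_out):
        out.extend((e, o))
    return out


if __name__ == "__main__":
    x = [1.0, -2.0, 3.0, 0.5, -1.5]
    h = [0.1, 0.9, -0.2, 0.05, 0.3]
    print(fir(upsample2(x), h))
    print(polyphase_fir_up2(x, h))
```

Each output sample of the polyphase version uses only about half of the coefficients, which is the halving of the computational cost mentioned above.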
If an input signal at frequency $\omega_i \in [0, \pi]$ is fed to the MFP system, the ADC outputs will have two tones after upsampling [24], [35]. One will be at frequency $\omega_i$, the other at frequency $\pi - \omega_i$. If no mismatch is present, the main ($\omega_i$) and aliasing ($\omega_a = \pi - \omega_i$) signals are present at both ADCs with the same phase or in phase opposition, respectively. Hence, the sum of the ADC outputs will cancel the aliasing tone, and only the main tone will remain (as the main tones sum in phase). In the presence of mismatches, however, the aliasing terms will not cancel each other after summing, and cyclo-stationary filtering will be required for aliasing removal. In this case, we filter the output of the second channel $y_1[n]$ with a filter of impulse response $h_1[n]$, before summing it with the output of the first channel, $y_0[n]$ (see Figure 3):
$$z[n] = y_0[n] + \sum_{l} h_1[l]\, y_1[n-l].$$
The calibrated output $z[n]$ will in general have both the main tone at frequency $\omega_i$, and the aliasing term at frequency $\omega_a$. However, there will be an ideal filter $h_1[n]$ which will cancel the aliasing tone, and only leave the correct input tone, possibly with a gain and phase error (as a single filter cannot perform both aliasing cancellation and equalization [22]).
If we consider only the phasors, given an input frequency $\omega_i$, the output of the j-th ADC, with j = 0, 1, will have two tones, one at frequency $\omega_i$, and one at frequency $\omega_a$. Hence, the frequency response of the correction filter only matters at these two frequencies. Calling $Y_{ij0}$ the main output at input frequency $\omega_i$ and ADC j, and $Y_{ij1}$ the aliasing output for the same frequency and ADC, the main and aliasing phasors for the calibrated output z will be:
$$Z_{i0} = Y_{i00} + H_1(\omega_i)\, Y_{i10}, \qquad Z_{i1} = Y_{i01} + H_1(\omega_a)\, Y_{i11}.$$
For calibration, i.e., aliasing removal, we need the second term to be zero, hence we can compute the ideal frequency response at frequency $\omega_a$ given the input at frequency $\omega_i$, as:
$$H_1(\omega_a) = -\frac{Y_{i01}}{Y_{i11}}.$$
It is not possible to estimate the ideal frequency response at $\omega_i = \pi/2$, because the aliasing and main tones will be superposed, and the phasors cannot be estimated. In a real experimental setting, where distortions are present, the frequencies $\omega_i = \pi/4, 3\pi/4$ should also be avoided, because the third-order distortion terms will fall at the same frequencies as the main or aliasing tones.
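As a minimal numerical illustration of this step, the sketch below computes the filter response that nulls the aliasing phasor of the calibrated output from the aliasing phasors measured on the two channels. The names `y01` and `y11` (aliasing phasors of channels 0 and 1), the helper functions and the mismatch values (5% gain, 0.1rad phase) are assumptions made for this example, not measured data.

```python
import cmath
import math


def ideal_h1_at_alias(y01, y11):
    """Response of the channel-1 filter at the aliasing frequency that cancels the alias."""
    if abs(y11) == 0:
        raise ZeroDivisionError("channel 1 has no aliasing tone to scale")
    return -y01 / y11


def usable_test_frequency(omega, tol=1e-3):
    """Reject omega = pi/2 and, because of third-order distortion, pi/4 and 3*pi/4."""
    forbidden = (math.pi / 4, math.pi / 2, 3 * math.pi / 4)
    return all(abs(omega - w) > tol for w in forbidden)


# Hypothetical aliasing phasors: ideally y11 = -y01, here channel 1 has
# a 5% gain error and a 0.1 rad phase error.
y01 = 0.30 * cmath.exp(0.4j)
y11 = -1.05 * 0.30 * cmath.exp(1j * (0.4 + 0.1))

h1 = ideal_h1_at_alias(y01, y11)
residual = y01 + h1 * y11          # aliasing phasor of the calibrated output
print("H1 at the alias frequency:", h1, " residual alias:", abs(residual))
```

In this example the ideal response simply undoes the assumed mismatch (gain 1/1.05, phase of -0.1rad), and without mismatch it would reduce to 1, in agreement with the all-pass behavior expected for small errors.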
A FIR filter with coefficients $h_l$, $l = -L, -L+1, \ldots, 0, 1, \ldots, L$ has $2L+1$ free parameters to be estimated, and its frequency response will be:
$$H^{fir}_1(\omega) = \sum_{l=-L}^{L} h_l\, e^{-j\omega l}.$$
Such a filter can be used to approximate the required frequency response $H_1(\omega)$ with a given accuracy, which will mostly depend on the filter length L and the smoothness of $H_1(\omega)$. Hence, given $H_1(\omega)$ and L, the synthesized filter will have a given error. We need to choose the filter coefficients $h_l$ which minimize some norm of the error $E_1(\omega) = H^{fir}_1(\omega) - H_1(\omega)$. The norms which can be used are the L1 norm $\sum_n |E_1(\omega_n)|$, the L2 (Euclidean) norm $\sum_n |E_1(\omega_n)|^2$, or the L∞ norm $\max_n |E_1(\omega_n)|$. These optimization problems can be solved with convex programming algorithms [29], such as least squares or linear programming. The summations and the maximum are computed over the frequency points $\omega_n$ which have been used as input frequencies in the tests. With $2L+1$ free parameters, the number of input frequency points N must be sufficiently larger than the number of parameters to allow identification.
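The L2 case can be sketched with plain least squares. The example below is an illustration under stated assumptions, not the authors' code: it assumes real-valued coefficients, uses the normal equations with a small Gaussian-elimination solver, and fits a hypothetical target response generated from known coefficients so that the result can be checked. The L1 and L∞ cases would require a linear programming solver instead.

```python
import math


def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_fir_least_squares(omegas, targets, half_len):
    """Real coefficients h_l, l = -L..L, minimizing sum_n |H_fir(w_n) - H(w_n)|^2."""
    taps = range(-half_len, half_len + 1)
    rows, rhs = [], []
    for w, t in zip(omegas, targets):
        rows.append([math.cos(w * l) for l in taps])    # real part of exp(-j*w*l)
        rhs.append(t.real)
        rows.append([-math.sin(w * l) for l in taps])   # imaginary part of exp(-j*w*l)
        rhs.append(t.imag)
    n = len(taps)
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atb = [sum(r[i] * v for r, v in zip(rows, rhs)) for i in range(n)]
    return solve_linear(ata, atb)


# Hypothetical target: H(w) = 0.9 + 0.1*exp(-j*w), sampled at 24 test frequencies.
omegas = [math.pi * (k + 0.5) / 24 for k in range(24)]
targets = [complex(0.9 + 0.1 * math.cos(w), -0.1 * math.sin(w)) for w in omegas]
coeffs = fit_fir_least_squares(omegas, targets, half_len=1)
print(coeffs)   # coefficients for l = -1, 0, 1
```

Because the problem is linear in the coefficients, the solution is unique whenever the number of test frequencies is sufficiently larger than the number of coefficients, as required above.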
The architecture we propose allows calibration with simple sinusoidal test signals spanning the entire Nyquist band of the digitizer. Multiple acquisitions of the outputs allow computing the optimal correction filters and synthesizing them as FIR filters using linear convex techniques of quadratic or otherwise polynomial complexity, such as least squares or linear programming [29]. Convergence is guaranteed, as a single optimum exists and all these algorithms are well-known, numerically stable, and computationally inexpensive. Once the correction filters have been synthesized, each channel (but one, if equalization is not required) will require a FIR filter, which can be implemented in polyphase form to save computational resources, exploiting the fact that each ADC output is upsampled to the system's sampling frequency.
On the other hand, [20] uses a complex non-convex particle swarm algorithm for an FI architecture. The ATI architecture in [14] is presented with no details on calibration, but the presence of modulators in the signal processing backend makes it likely that linear algorithms cannot be applied. Furthermore, the presence of two asynchronous clock domains may cause interference spurs and create mismatch aliasing spurs at additional frequencies, and decimators and interpolators are required [22].
The DBI reported in [21] uses many relatively long (more than a hundred coefficients) filters and several FFT blocks, occupying more than 5,000 DSP units on an FPGA with 40 parallel channels (500MHz clock for 20GS/s system sampling frequency). The FI digitizers in [25], [26], and [27] employ Gauss-Seidel iterations and thus require multiple blocks with several FIR filters, with a cost of a few hundred multiplications per sample for correction alone. Real-time estimation [26], [28] appears too expensive to be realistically implemented, requiring more than a thousand products per sample [26], which implies tens of TFLOPS of computing power.
It will be shown in Section IV that our implementation requires a few tens of products per sample in real-time, which is feasible for top-end modern FPGAs. Furthermore, parameter estimation requires simple test hardware (a sinusoidal generator) and simple and robust estimation algorithms (such as least squares).
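The orders of magnitude quoted above follow from multiplying the number of products per sample by the sampling rate. The sketch below assumes, for illustration only, a 40GS/s system rate for both cases and 30 products per sample as a representative value of "a few tens"; the function name is hypothetical.

```python
def products_per_second(products_per_sample, sample_rate_hz=40e9):
    """Multiplications per second needed for real-time processing at the given rate."""
    return products_per_sample * sample_rate_hz


if __name__ == "__main__":
    for label, count in (("a few tens (assumed 30)", 30), ("more than a thousand", 1000)):
        rate = products_per_second(count)
        print(f"{label}: {rate / 1e12:.1f} x 10^12 products per second")
```

Under these assumptions, a thousand products per sample correspond to 4 x 10^13 multiplications per second, which is where the figure of tens of TFLOPS comes from, whereas a few tens of products per sample stay around 10^12.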

III. CHIP AND BOARD DESIGN
This Section describes the chip and board design of the 40GSps MFP system, implemented in the STMicroelectronics BiCMOS55 SiGe process [30]. The chip only employs HBT NPN devices, besides passive devices (resistors and capacitors). The board is composed of an alumina core and a Rogers board hosting the external connectors.

A. INTEGRATED CIRCUIT DESIGN
The integrated circuit includes the core components of the MFP front-end: with reference to the block scheme in Figure 4, the chip includes the frequency divider, to generate the clock signals with 50% duty cycle starting from the system clock at 40GHz, the mixers, and the output buffers for the mixer outputs and the divided clock. The 10GHz lowpass filters have been left outside the chip for more generality, but a possible implementation in the same technology has been designed and tested [34] with positive results.
The frequency divider exploits the static frequency divider (SFD) architecture and is implemented by a D-type flip-flop (DFF) in Current-Mode Logic (CML) style closed in negative feedback ($D = \overline{Q}$).
The DFF is based on a master-slave architecture, with two cascaded D-latches driven by opposite clock signals, as shown in Figure 5. Figure 6 shows the topology of the CML D-latch, which is easily derived from the XOR gate. The divider was designed to guarantee worst-case operation at 40GHz without exploiting inductive peaking, to minimize the area footprint. The detailed design, which involves the choice of a suitable output swing and of transistor sizes and bias currents, is described in [33], where the measured performance of the divider is also reported. The layout of the divider was also optimized, as discussed in [33], to maximize symmetry (so as to obtain an accurate duty cycle of 50%) and minimize the length of the interconnection lines driven by the collectors of transistors in the latches, whose parasitic capacitances affect the divider speed. The absence of peaking inductors allowed minimizing the area footprint of the latches (30 × 65 µm²), which were placed side by side to minimize interconnects, as shown in Figure 7.
The divider is followed by two limiting amplifier stages, used both to convert the sinusoidal output waveform into a square wave and for clock distribution inside the chip. The limiting amplifiers are implemented as simple cascoded differential pairs. The first stage is followed by an emitter follower, used for level shifting and to better drive the following stage; a separate second stage is used for each mixer and for the clock output, to minimize the length of the interconnects towards the following blocks. The design achieves 6.6ps rise times at the input of the mixer block.
The mixer block, shown in Figure 8, is based on the Gilbert multiplier topology. Unlike a standard mixer, where the input signal is multiplied by a square wave alternating between 1 and −1, the mixer block required in the MFP architecture is essentially a square wave sampler that lets the input signal pass toward the output only for half a period. A multiplication by a square wave alternating between 0 and 1 has therefore to be implemented, and this can be easily obtained starting from the Gilbert topology, exploiting the differential nature of the signals. With reference to Figure 8, the lower differential pair acts as a transconductor on the input signal, and a degeneration resistor is exploited to achieve good linearity and unity gain. The upper-level transistors act as switches controlled by the differential clock signal: when the clock signal is high, the output currents of the lower transconductor reach the load resistors through current buffers ($Q_{m3}$-$Q_{m4}$ and $Q_{m6}$-$Q_{m7}$), providing the required differential output voltage, which is approximately $(R_L/R_E)\,v_{id}$, where $R_L = R_{mL1} = R_{mL2}$ and $R_E = R_{mE1} = R_{mE2}$. During the negative half period of the clock, transistors $Q_{m5}$, $Q_{m6}$, $Q_{m9}$, $Q_{m10}$ are on, and their crossed connection provides zero differential voltage with a constant output common-mode voltage, apart from the effects of mismatches. The design of the mixer has been optimized to minimize distortions, achieving −53dB HD3 for a full-scale 800mVpp differential input.
Figure 9 shows the layout of the chip, which is pad-limited to allow easy testing and fine tuning of the different blocks; the core area, excluding the output buffers, occupies 530 × 220 µm², and differential transmission lines are used to connect the pads. The overall circuit power consumption is about 640mW from a 3V power supply.
Table 1 details the consumption of the different blocks; it has to be noted that a redundant and non-optimized bias network has been used to allow easy testing and tuning of the blocks.

B. BOARD DESIGN
A test board has been designed and fabricated on a 10mil low-loss Rogers 4350B substrate: a rectangular cut in the board hosts the SiGe die and an interface alumina board. In fact, the small pitch of the RF pads on the die would require long bonding wires for interconnection to the Rogers board pads, leading to performance degradation at the upper end of the operating bandwidth. An alumina interface board, whose pad pitch can be comparable to that of the die, has been designed in order to provide interconnection with shorter bonding wires. The interconnection between the alumina and the Rogers board is obtained by exploiting metal strips. Finally, in order to ensure a better grounding of the board, a 50µm thermally and electrically conductive adhesive film has been used to bond the Rogers board to a 1mm copper backplane, which provides mechanical support. The interface alumina board and the SiGe die have been glued to the copper

IV. MEASURED RESULTS
This Section describes the measured results before and after signal processing for aliasing cancellation and signal reconstruction.

A. DESCRIPTION OF THE EXPERIMENTAL SETUP
The MFP system comprises several elements. The clock divider [33], mixers and input/output buffers are integrated in the chip. The lowpass filters [34] have not been integrated to allow maximum flexibility in the design of the digitizer: for instance, two 10GHz lowpass filters are required in a 40GS/s 2-channel system, but four 5GHz filters would be required in a 4-channel system at the same sampling frequency [23]. The front-end may be capable of operating up to 60GS/s [33], but the lowpass filters would then need a bandwidth of 15GHz. Hence, integrating the filters reduces the flexibility in testing the system. Figure 11 shows the photograph of the test chip, and Figure 12 shows a block scheme of the measurement setup: high-quality signal generators have been used both for the 40GHz clock (Anritsu MG 3697C) and for the input data signal (Rohde & Schwarz SMR20), and wideband baluns have been used for single-ended to differential conversion. The differential outputs of the mixers are directly connected to the inputs of a Tektronix DSA8300 digital sampling oscilloscope.
Because the measured front-end lacks the lowpass filters, ADCs, and digital signal processing subsystems, they have been simulated using Matlab, based on the measured outputs of the front-end, which have been sampled at 320GS/s by the oscilloscope (in equivalent time sampling mode).
Sinusoidal input signals at frequencies from 1 to 19GHz have been applied, and 109 different input frequencies have been acquired to allow a fine-grained coverage of the whole spectrum. In [22], simulated results on a 2-channel MFP were also used to calibrate a wideband multi-tone signal. The theory of linear mismatches makes no distinction between signals, because a mismatched MFP system remains linear, though time-varying. The same holds for additive noise, but of course not for distortions. In our experimental setup we cannot provide a multi-tone signal from 0 to 19GHz, as only two sinusoidal generators (one for the clock, one for the signal) are available. Hence, we provide experimental data only with single-tone signals. However, as long as aliasing and additive noise are the main error sources, as in our case (see the experimental results below), the presence of multiple simultaneous signals, or of modulated signals, will not affect the behavior of the digitizer. The same would not hold, of course, for nonlinear distortions, which are however not the dominant error source in our experimental setup.
The lowpass filters are simulated as IIR filters operating at 320GS/s, with a frequency response similar to that of the original analog filters [34]. The ADCs operate at 20GS/s, i.e., at one sixteenth of the acquisition frequency of the oscilloscope, and are hence emulated by downsampling by 16. Because the ADCs operate in phase opposition, the two channels have a time offset of 8 samples (25ps, i.e., one period of the 40GHz clock). Finally, the digital signal processor performs calibration and analyzes the data to estimate gain, aliasing, distortions, and noise before and after calibration.
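The emulation of the two back-end ADCs by decimation can be sketched as follows. This is only a minimal illustration in Python, not the Matlab code used for the measurements: the function name `emulate_adcs` and the constants are hypothetical, and the two input records are assumed to be the lowpass-filtered channel outputs sampled at 320GS/s.

```python
OSC_RATE = 320e9   # equivalent-time sampling rate of the oscilloscope (S/s)
ADC_RATE = 20e9    # sampling rate of each emulated ADC (S/s)
DECIMATION = int(OSC_RATE / ADC_RATE)  # 16 oscilloscope samples per ADC sample
OFFSET = DECIMATION // 2               # 8 samples, i.e. phase opposition


def emulate_adcs(ch0, ch1):
    """Decimate two filtered 320GS/s records into two 20GS/s streams.

    The second stream starts OFFSET samples later, so the two emulated
    ADCs sample in phase opposition.
    """
    adc0 = ch0[0::DECIMATION]
    adc1 = ch1[OFFSET::DECIMATION]
    return adc0, adc1


# Time offset between the two emulated ADCs, in picoseconds (about 25ps).
offset_ps = OFFSET / OSC_RATE * 1e12
```

With sample indices as input, the first stream keeps indices 0, 16, 32, ... and the second keeps indices 8, 24, 40, ..., which makes the half-period offset between the channels explicit.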
Because linear calibration only corrects aliasing, it is expected that distortions and noise will not be affected by calibration, but aliasing will be significantly reduced. Hence, the signal-to-noise-and-distortion ratio (SNDR), initially dominated by mismatches, will gradually saturate toward the limit set by noise and distortions. Furthermore, because calibration does not perform equalization, the overall system will have a certain gain for every input frequency, which will not be affected by calibration, but will be corrected afterward by linear equalization with a FIR filter.

Figure 13 shows a typical spectrum of the MFP, for an input signal of 5.8GHz. The tones at 0 and 20GHz are due to offset and offset mismatch, respectively; they are of no concern for many applications, such as bandwidth monitoring, and can both be easily removed by forcing the mean value of both channels to zero. The tone at 14.2GHz is due to aliasing and dominates the spurious-free dynamic range (SFDR) of the digitizer. The aliasing-free dynamic range (AFDR) of the MFP system is only 24.4dB before calibration. The third-order harmonic distortion of the input signal falls at 17.4GHz and is −46.6dB, about 20dB below the aliasing tone. The potential for accuracy improvement via linear calibration is thus significant.
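The location of the tones discussed for the 5.8GHz input can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper names `fold` and `spur_map` are hypothetical, and it assumes an ideal 2-channel system at 40GS/s in which mismatch aliasing appears at f_S/2 − f_in and harmonics fold into the first Nyquist band.

```python
def fold(freq_ghz, fs_ghz=40.0):
    """Fold a frequency into the first Nyquist band [0, fs/2]."""
    f = freq_ghz % fs_ghz
    return fs_ghz - f if f > fs_ghz / 2 else f


def spur_map(f_in_ghz, fs_ghz=40.0):
    """Expected tone locations (GHz) for a 2-channel interleaved system."""
    return {
        "signal": fold(f_in_ghz, fs_ghz),
        "aliasing": fold(fs_ghz / 2 - f_in_ghz, fs_ghz),  # mismatch image
        "hd3": fold(3 * f_in_ghz, fs_ghz),                # third harmonic
        "offset": 0.0,                                    # common offset
        "offset_mismatch": fs_ghz / 2,                    # offset mismatch
    }


tones = spur_map(5.8)  # aliasing near 14.2GHz, HD3 near 17.4GHz
```

For input frequencies above about 6.7GHz the third harmonic exceeds 20GHz and folds back, e.g. `fold(3 * 8.0)` gives 16GHz.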

B. ALIASING ANALYSIS BEFORE CALIBRATION
There are tones around the main tone which cannot be explained in terms of offsets, aliasing, or distortions. They are probably due to some feedthrough in the setup, possibly some synchronization or digital signal. Because there is nothing in the MFP operating at those frequencies, or at that frequency difference from the main tone in case they are intermodulation products, these tones are not produced by the MFP front-end.
Finally, noise can be estimated after removal of the offset, signal, aliasing, and distortion components. The estimated SNR at this frequency is 36dB, including both the MFP noise and the setup noise. However, the latter has been found to be negligible: acquisitions taken after switching off the IC show about 10dB lower noise than acquisitions taken with the chip on. Hence, about 90% of the noise power comes from the chip, and the noise produced by the MFP is only about 0.5dB lower than the measured value. Some additional noise may, however, be contributed by the input source.
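The noise budget above follows from a simple power subtraction, sketched here under the assumption that chip noise and setup noise are uncorrelated, so that their powers add. The function name `chip_noise_share` is hypothetical.

```python
import math


def chip_noise_share(setup_below_total_db):
    """Split the measured noise power between chip and setup.

    setup_below_total_db: how many dB the noise measured with the chip
    off lies below the noise measured with the chip on.
    Returns (chip fraction of the total noise power, correction in dB).
    """
    setup_fraction = 10 ** (-setup_below_total_db / 10)
    chip_fraction = 1 - setup_fraction
    return chip_fraction, 10 * math.log10(chip_fraction)


fraction, correction_db = chip_noise_share(10.0)
# fraction is 0.9 and correction_db is about -0.46dB
```

A setup floor 10dB below the total therefore accounts for 10% of the noise power, and removing it lowers the noise estimate by less than half a dB.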
Similar results can be observed for different input signals, with noise typically dominating distortions, and aliasing dominating both at most frequencies before calibration. Figure 14 shows the frequency response in magnitude (first panel) and phase (second panel) and the impulse response (third panel) of the correction filter. The black line is the desired filter which would remove all aliasing from the data. The red line is the actual synthetized frequency response, with a filter of length 47 (requiring about 23 multipliers per sample, considering that half the input samples are zero owing to zero-padding). The synthesis error (the difference between the desired and synthetized frequency responses) is in the range of a few tens of dB, and a few degrees of phase. The impulse response of the filter is shown in the third panel, and is mostly a Dirac delta, with slight variations required to synthetize the desired frequency response: a Dirac delta, the ideal correction filter when no mismatches are present, would have zero phase and unity magnitude at all frequencies, and typical correction filters will just add some gain and phase distortion to correct for gain, delay and frequency response mismatches.

C. ESTIMATION AND SYNTHESIS OF THE CORRECTION FILTER
The input signal varies from 1 to 19GHz, so that it covers the entire first Nyquist band of the 40GS/s digitizer. Sinusoids are spaced by 200MHz, and care is taken to acquire the frequency f_S/2 − f_in for each input tone f_in, because both these frequencies (corresponding to the main and aliasing tone in a 2-channel time-interleaved system) are required to estimate the correction coefficients [24].
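The pairing requirement on the test tones can be verified mechanically. The sketch below is only an illustration with hypothetical helper names; it works in integer MHz to avoid rounding issues and assumes a nominal 200MHz grid from 1 to 19GHz, which need not coincide with the exact set of acquired frequencies.

```python
def tone_grid_mhz(start=1000, stop=19000, step=200):
    """Nominal grid of test tones, in MHz (assumed uniform spacing)."""
    return list(range(start, stop + 1, step))


def alias_partner_mhz(f_in, fs=40000):
    """Frequency of the aliasing tone f_S/2 - f_in, in MHz."""
    return fs // 2 - f_in


def unpaired_tones(grid, fs=40000):
    """Tones whose aliasing partner is missing from the grid."""
    available = set(grid)
    return [f for f in grid if alias_partner_mhz(f, fs) not in available]


grid = tone_grid_mhz()
missing = unpaired_tones(grid)  # empty: every tone has its partner
```

A grid that is symmetric around f_S/4 = 10GHz is closed under this pairing, whereas a set such as 1, 5.8 and 19GHz would leave the 5.8GHz tone without its 14.2GHz partner.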
The FIR filter has been synthetized using the L2 (Euclidean) norm, hence minimizing the quadratic error between the desired and synthetized frequency responses in the band 1-19GHz where the input signals lie. A length of 47 was sufficient to achieve very good agreement between the two frequency responses, although a shorter filter would also have sufficed, considering that once aliasing has been sufficiently reduced by linear filtering, accuracy is limited by noise and distortions. Shorter filters are tested in the following to reduce computational complexity: they require fewer computational resources, but have a higher synthesis error, so that they approximate less accurately the ideal frequency response of the aliasing-correction filter which eliminates aliasing spurs.

D. ALIASING ANALYSIS AFTER CALIBRATION
The results before (left) and after (right) calibration are shown in Figure 15 for an input frequency of 5.8GHz. Aliasing correction reduces the aliasing tone by about 24dB, so that the residual aliasing becomes negligible with respect to HD3 and the spurious tones. The AFDR improves from 24.5 to 49dB, whereas the HD3 remains at about 46.5dB, and the SFDR reaches about 42.5dB. Hence, an improvement of 18dB in SFDR is achieved by removal of the aliasing term. The spurious tones are mostly around the main tone, and are not generated by the IC, because no component works at those frequencies. They are most likely due to interference with the input generator, and are 45dB below the main tone, so that they do not impact the overall accuracy of the system, which is determined (after calibration) by noise.
The noise floor remains the most significant limitation of the SNDR because it dominates distortions: the SNR at this frequency is about 36dB, equivalent to an ENOB of about 6. Noise, like distortions and other spurs, cannot be corrected via linear signal processing, so that aliasing removal can improve SNDR only until aliasing becomes negligible.

Figure 16 shows the gain of the system (the power of the output tone) before and after calibration, without equalization. The setup shows a loss of about 7dB at 1GHz, which quickly increases to 13dB of attenuation in the second half of the Nyquist band. Though a loss of about 4dB at the Nyquist frequency is expected from MFP theory [24], the additional loss of about 2dB at 19GHz, which was not present in the simulations of the chip, is probably due to board losses at higher frequencies. Figure 16 also shows that the gain does not change with calibration: the frequency response is not equalized, and the impact of aliasing correction on gain is limited.

Figure 17 shows the gain before and after equalization with a 9-tap filter. The system's frequency response, mostly due to the 50% duty cycle of the pulse shape, can be easily equalized with short filters. Hence, most of the signal processing cost is due to the correction of aliasing mismatches, which requires filters 2-4 times longer than equalization does.

Figure 18 shows the SNR of the system. In the first half of the Nyquist band, its shape is very similar to that of the gain before equalization in Figure 16, because the SNR is dominated by the gain loss. The value at 19GHz (35dB) is consistent with the jitter performance of the input generator as specified by the manufacturer [36], which therefore sets the upper limit to the SNR around the Nyquist frequency. However, jitter cannot explain the shape of the SNR curve below the Nyquist frequency, as jitter-induced noise worsens with frequency. The minimum SNR is about 32dB, and the SNR is only slightly affected by calibration, as expected.
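The relation between the quoted SNR values, the equivalent number of bits, and sampling jitter can be made explicit with the standard textbook formulas, ENOB = (SNDR − 1.76)/6.02 and SNR = −20·log10(2π·f_in·σ_j) for a sinusoid affected by rms jitter σ_j. The sketch below is an illustration only: the function names are hypothetical, and the jitter value it yields is merely the one implied by these formulas, not the figure specified by the generator's manufacturer.

```python
import math


def enob(sndr_db):
    """Effective number of bits for a full-scale sinusoid."""
    return (sndr_db - 1.76) / 6.02


def jitter_limited_snr_db(f_in_hz, rms_jitter_s):
    """SNR ceiling set by rms sampling/source jitter on a sinusoid."""
    return -20 * math.log10(2 * math.pi * f_in_hz * rms_jitter_s)


def jitter_for_snr(f_in_hz, snr_db):
    """rms jitter that would by itself limit the SNR to snr_db."""
    return 10 ** (-snr_db / 20) / (2 * math.pi * f_in_hz)


bits_at_36db = enob(36.0)                    # roughly 5.7 bits
bits_at_31db = enob(31.0)                    # roughly 4.9 bits
implied_jitter = jitter_for_snr(19e9, 35.0)  # on the order of 150fs
```

Because the jitter-limited SNR falls by 6dB for every doubling of the input frequency, jitter alone would predict a much higher SNR at low frequencies, in line with the observation that it cannot explain the shape of the curve below the Nyquist frequency.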
Distortions before and after calibration are shown in Figure 19. As expected, calibration does not influence distortions, which are always better than 40dB up to 16GHz. Hence, distortions are not the main limitation to system performance, since aliasing (before calibration) and noise are larger. Figure 20 shows the SNDR of the MFP digitizer, which includes aliasing, noise and distortions. Before calibration, the SNDR is dominated by aliasing up to about 16GHz, and by noise above that frequency. Accordingly, calibration significantly improves accuracy up to this frequency, whereas at higher frequencies the SNDR is not significantly influenced by calibration: calibration can only correct the mismatches which cause aliasing, but cannot improve nonlinear distortions and noise. After calibration, Fig. 18 and Fig. 20 are very similar, because noise dominates over aliasing and nonlinear distortions. The decrease in SNR in Fig. 18 is similar to the reduction in gain before equalization in Fig. 16, so that it is mostly due to the gain loss.

So far, calibration has been performed with a 47-tap FIR filter. Without calibration, SNDR is 24dB at the worst frequency. After calibration, it increases by 7dB, up to 31dB, at the cost of additional power consumption for the FIR filters. A 47-tap FIR filter operating at 20GS/s (thanks to the polyphase architecture) would require about 1TFlops of computing power. This computing power can be reduced using shorter filters, at the expense of lower accuracy.
The trade-off between accuracy and computational complexity depends on many factors: if SNR and THD performance is good, longer filters allow reducing aliasing distortions and provide better SNDR; on the contrary, once the performance ceiling caused by SNR and THD is reached, longer filters provide no additional benefits, because aliasing is no longer the dominant limitation on SNDR.
The simplest form of correction is gain mismatch correction, which only requires a FIR filter with 1 coefficient: this, however, only yields 26.6dB of worst-case SNDR, about 2.5dB more than without calibration. Hence, longer filters are required to improve performance, up to the ceiling of about 31dB at the frequency for which SNDR is lowest. Figure 21 shows the worst-case (minimum from 1 to 19GHz) and mean (across all the frequency points between 1 and 19GHz) SNDR as a function of the FIR filter length. A cost of 0 is the case without calibration. A cost of 1 is the case of gain mismatch correction. Performance saturates after about 45 filter taps (about 900GFlops of real-time computing power), but good linearity can also be achieved with FIR filters of 25 taps (about 500GFlops) or less. The computational cost of calibration is given by the filter length multiplied by the system's sampling frequency, divided by 2: the ADCs operate at 20GS/s but the system operates at 40GS/s, so that half of the input samples are zero after upsampling to 40GS/s. Such cost is common to all time-interleaved systems, and the required signal processing is identical to that used in such systems.
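The cost figures quoted above follow directly from the stated rule (filter length times system sampling frequency, divided by 2). A minimal sketch, with a hypothetical function name and counting one operation per tap as the text does:

```python
def calibration_cost_gflops(taps, fs_gsps=40.0):
    """Real-time cost of the correction FIR filter, in GFlops.

    taps: FIR filter length.
    fs_gsps: system sampling frequency in GS/s.
    Half of the upsampled input samples are zero, hence the factor 2.
    """
    return taps * fs_gsps / 2


costs = {taps: calibration_cost_gflops(taps) for taps in (1, 9, 25, 45, 47)}
# 25 taps -> 500, 45 taps -> 900, 47 taps -> 940 (about 1TFlops)
```

The same rule gives 180GFlops for a 9-tap filter such as the one used for equalization, so the aliasing-correction filter dominates the processing budget.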

V. CONCLUSION
A modified MFP digitizer architecture has been proposed and experimentally validated. The architecture is composed of a simple wideband analog front-end comprising one clock divider, two on/off mixers, and two lowpass filters. Each front-end takes one input signal and one input clock and provides two half-bandwidth output signals and two half-frequency output clocks. The front-ends can be cascaded to obtain digitizers with 4, 8 or more channels. The back-end can be implemented via ADCs operating at low sampling frequency and with low bandwidth requirements (first Nyquist band at their sampling frequency), and DSP for aliasing removal and equalization.
The architecture can thus implement hierarchical MFP digitizers [23] and can be calibrated via simple linear convex least squares techniques using a set of single-tone test signals to estimate the coefficients of FIR filters [24]. Hence, the generation of test signals, the estimation of correction parameters, and the real-time correction of aliasing and frequency response errors are straightforward.
The proposed architecture avoids the use of analog and digital blocks such as frequency modulators to provide multiple clock frequencies in the analog or digital domains, pulse generators, and complex non-convex optimization techniques for calibration.
A 40GS/s front-end for a two-channel MFP system has been designed in a commercial SiGe BiCMOS technology to validate the architecture [22] and the calibration technique [24] proposed by some of the Authors in the past. The chip contains a clock divider with 40GHz input and 20GHz output, two mixers with 20GHz clock input and 0-20GHz input bandwidth, and input and output buffers for the signal and clock paths. The alumina substrate and Rogers board on which the chip is mounted have also been designed.
The chip has been tested using a sinusoidal signal generator from 1 to 19GHz, a sinusoidal clock generator at 40GHz, and a 4-channel oscilloscope with 320GS/s equivalent sampling frequency. The acquired data has been processed in Matlab to emulate the analog 10GHz lowpass filters of the MFP system, and the following two time-interleaved 20GS/s ADCs. Furthermore, single-tone data at multiple frequencies have been acquired and used to synthetize the correction FIR filter to minimize aliasing and maximize the SNDR of the digitizer, experimentally validating the calibration technique proposed in [24].
Results show that the digitizer has about 24dB of SNDR before calibration, due to aliasing caused by channel mismatches, and between 31 and 39dB of SNDR after calibration, which reduces aliasing via linear signal processing. Distortions are better than 40dB from 0 to 16GHz. At least 5 equivalent bits of resolution are obtained from 0 to 20GHz input.
The total power consumption of the front-end is 640mW from a 3V voltage supply, including the I/O buffers for the clock and data signals. About 86mW are to be added to implement the two 6th-order lowpass filters [34]. Of the total power consumption, 105mW are due to the biasing network, which has not been optimized and has large redundancy to maximize flexibility. Furthermore, 344mW of consumption are due to the output data and clock buffers, which would not be required in a fully integrated solution, as it would include the ADCs, thus further reducing the total power consumption of the front-end.
The chip validates the idea of 2-channel MFP digitizers [22] and proves that the architecture can be used in a BiCMOS technology to achieve at least 40GS/s sampling frequency with 5 bits of resolution across the entire Nyquist band, using linear signal processing and linear optimization techniques for identification similar to those used in conventional time-interleaving ADCs [17], [18], but greatly relaxing the input bandwidth requirements of the ADCs.
The chip can also be used in a hierarchical 4-channel architecture [23] to obtain a 40GS/s MFP digitizer with four 5GHz analog outputs, which can be digitized by four commercially available 10GS/s ADCs. Hence, the MFP front-end is a fundamental building block for high-speed digitizers, whose advantages are ease of design, ease of digital calibration, and scalability to even higher sampling frequencies and numbers of channels. One of the avenues of future research will be the implementation and validation of a 4-channel hierarchical MFP digitizer.