Assessment of the Bundle SNSPD Plus FPGA-Based TDC for High-Performance Time Measurements

Counting single photons and measuring their arrival time is of crucial importance for imaging and quantum applications that use single photons to outperform classical techniques. The investigation of the coincidence, i.e., correlation, between photons can be used to enhance the resolution of optical imaging techniques or to transmit information using quantum cryptography. State-of-the-art time measurements are performed using Superconducting Nanowire Single Photon Detectors (SNSPDs), the single-photon detectors with the lowest timing jitter, connected to digital oscilloscopes or digitizers. This method is not well suited to the ever-increasing and pressing requirement to perform measurements on a large number of channels at the same time. We focus on the high-performance measurement of the arrival time of photons and of their correlation by means of SNSPDs and a 16-channel Time-to-Digital Converter (TDC) fully implemented in a Field Programmable Gate Array (FPGA). In this approach, the photons' coincidence is analyzed in real time directly in the FPGA, resulting in a Coincidence Time Resolution (CTR) of 22.8 ps r.m.s. For the practical benefit of the scientific community, an extensive comparison with the strategies currently available in this field of application is also offered through a large number of references.


I. INTRODUCTION
Photon absorption enables very sensitive light detection by monitoring the transition of a segment of a current-biased superconducting nanowire from the superconducting to the conventional resistive state [1]. Superconducting Nanowire Single Photon Detectors (SNSPDs) are devices that exploit this working mode and are used in a variety of quantum information applications [2], [3], quantum computing [4], and quantum optics [5], [6], [7], because they are extremely well suited to detecting single photons [8] with very high probability over a wide spectral window from UV to infrared [9], [10]. SNSPDs have been used to monitor the emission of single photons from a variety of light sources, including carbon nanotube dopants [11], color centers of silicon carbide [12], and semiconductor quantum dots [13]. Thanks to the absence of a PN-junction, SNSPDs have several advantages over conventional single-photon-sensitive devices such as the Single-Photon Avalanche Diode (SPAD), including no afterpulsing (i.e., the re-triggering of a PN-junction-based detector after the main avalanche, due to trap levels that receive extra energy during the first avalanche), extremely low timing jitter, and a low Dark Count Rate (DCR) (i.e., the detection of thermal photons due to the thermal noise induced, in a PN-junction-based detector, by the generation-recombination processes within the semiconductor), according to [14]. They are therefore the detectors of choice for applications that require high time precision and excellent weak-signal detection, such as high-resolution light detection and ranging (LIDAR) and photon correlation [15], [16], [17], [18], singlet oxygen detection [19], and optical time-domain reflectometry for telecommunication networks [20]. This technology has also been demonstrated in deep-space optical communication [21], [22]. Furthermore, determining the coincidence (i.e., correlation) between photons is required when computing photon arrival times at high resolution [23]. In particular, the Coincidence Time Resolution (CTR) is one of the most important metrics in this regard [24].
The associate editor coordinating the review of this manuscript and approving it for publication was Remigiusz Wisniewski.
To assign a timestamp to the photons, we can use either a classic Voltage-Mode (VM) [25], [26] or a modern Time-Mode (TM) approach [27], [28]. In a VM method, a Digital Oscilloscope (DO) or a digitizer [29] is used to amplify and digitize the signals from the detectors. The timestamp is then recovered using suitable Digital Signal Processing (DSP) techniques, including oversampling and interpolation, as described in [30]. Further processing of the acquired waveforms also allows non-idealities, such as pile-up, to be compensated. If a TM technique is used instead, the signals at the detectors' output are discriminated using a Threshold Comparator (TC) or a Constant Fraction Discriminator (CFD) and then converted directly into a timestamp using a Time-to-Digital Converter (TDC) [31], [32], [33].
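As an illustration of the VM idea of recovering a timestamp from digitized samples, the following minimal Python sketch locates a rising threshold crossing by linear interpolation between adjacent samples. It is not the method of [30]: linear interpolation stands in here for the oversampling and interpolation mentioned above, and the function name, the 25 ps sample period, and the sample values are hypothetical choices made only for this example.

```python
def threshold_crossing_time(samples, sample_period_s, threshold_v):
    """Return the interpolated time (s) of the first rising crossing, or None.

    samples: voltages taken every sample_period_s, starting at t = 0.
    """
    for i in range(1, len(samples)):
        below, above = samples[i - 1], samples[i]
        if below < threshold_v <= above:
            # Fraction of the sample period at which the straight line
            # joining the two samples reaches the threshold.
            fraction = (threshold_v - below) / (above - below)
            return (i - 1 + fraction) * sample_period_s
    return None


# Hypothetical rising edge sampled every 25 ps (values in volts).
edge = [0.0, 0.1, 0.2, 0.3]
t_cross = threshold_crossing_time(edge, 25e-12, 0.15)
print(f"crossing at {t_cross * 1e12:.1f} ps")
```

The interpolated crossing falls between two samples, which is how a VM chain can reach a timing resolution finer than its sampling period.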
As a result, TM acquisition chains use less hardware than VM acquisition chains and are therefore preferable in multichannel applications. Because DSP algorithms cannot be run to compensate for non-idealities, TM systems usually achieve lower time precision than VM systems. However, by interposing appropriate filters between the detector and the discriminator, these non-idealities can be mitigated and the timing precision increased. The growing number of channels in modern applications therefore mandates TM approaches rather than VM approaches, which are now used only for preliminary testing.
To implement a TDC, we can use an Application Specific Integrated Circuit (ASIC) or a programmable logic device, specifically a Field Programmable Gate Array (FPGA) [31]. At the same technological node, an ASIC TDC surpasses an FPGA solution in terms of precision and resolution, but it loses flexibility. An FPGA-based approach allows the TDC's parameters (resolution, hardware occupancy, number of channels, and so on) to be fine-tuned so that the TDC is well suited to the application. Furthermore, all of the application's processing requirements (e.g., timestamp manipulation, correlation computation, etc.) may be fulfilled directly in the FPGA by employing an efficient, high-speed, and flexible parallel computing architecture [32], [33]. This is not achievable with ASIC architectures: to run the required algorithms, the gathered timestamps must be moved to a separate processing unit, such as a Personal Computer (PC), increasing the system complexity in terms of hardware and/or throughput. In this paper, we show that timing measurements on SNSPDs (the most time-resolved detectors) can be performed with a single-shot precision better than 26 ps r.m.s. at measurement rates up to 1 MHz, at the current state of the art. Furthermore, we show how to measure the CTR directly in the FPGA while also determining whether two photons detected on two different SNSPDs are in coincidence; with this technique, a precision better than 33.4 ps r.m.s. is guaranteed up to a 980 kHz coincidence rate, implying 7 Mcps per detector. All of the algorithms are parallelized and hosted in the FPGA, with the PC functioning only as a monitor. The paper is organized as follows. Section II presents the current state of the art of time measurement with SNSPDs. In Section III, the characteristics of the employed SNSPDs are discussed, and in Section IV, measurements using a VM technique are carried out and used as a reference. Sections V, VI, and VII contain the core and novelty of the paper, with the simulation of temporal jitter (Section VI) and the real-time CTR measurement with an FPGA-based TDC (Sections V and VII). Section VIII discusses the relationship between the measurements and possible future developments.

II. STATE-OF-THE-ART
The SNSPD is a grid of superconducting nanowires kept at a low temperature, with bias currents ($I_{BIAS}$) in the µA range provided by a "real" current generator with impedance $R_{BIAS}$. The low temperature ensures the presence of the so-called superconducting state, in which the material has infinite conductance and there is no voltage drop across the array. When a photon strikes a nanowire, it warms the area of impact, breaking the superconducting state, according to [1]. As a result, a resistance greater than zero ($R_{SNSPD}$) emerges for a few ps, generating a pulse of a few mV. The cooling system then restores superconductivity by extracting the heat produced by the detected photon. A graphical depiction of the detection mechanism is shown in Figure 1. Despite the low operating temperature, some thermal photons can be detected inadvertently by the SNSPD, resulting in a DCR, proportional to $I_{BIAS}$, of a few hundred cps. The likelihood of correctly detecting an optical photon is instead referred to as the System Detection Efficiency (SDE); the SDE also depends on $I_{BIAS}$, with a linear dependency in the so-called "linear region," where the SDE value is between 15% and 85% [10].
An amplification stage (Figure 2) is required to amplify the small generated pulse while also shaping it into an exponential waveform with an amplitude ($A$) of hundreds of mV, a decay time constant ($\tau$) of a few ns, and a rise time ($t_{RISE}$) as fast as possible, according to [34]. Using an appropriate time-interval meter (TIM), such as a DO and/or a TDC, it is possible to assign a timestamp to the detected photon by measuring the instant at which the exponential curve rises (Figure 3).
The jitter of the exponential curve with respect to the instant of photon detection is attributable, to a first approximation, to the intrinsic jitter of the SNSPD ($\sigma_{SNSPD}$) [35] and to the electronic jitter introduced by the amplifier noise ($\sigma_{AMPLI}$) [36]. However, the investigation in [37] suggests that a third jitter contribution, the jitter of the incident photon ($\sigma_{PHOTON}$), should also be considered. In fact, from a purely metrological standpoint, it is impossible to determine the instant of creation of the observed photon with infinite precision. In SNSPD characterization experiments, the incident photons are typically generated by a LASER; $\sigma_{PHOTON}$ is then the jitter of the emitted photon with respect to the ideal trigger. Finally, the measured timestamp is also affected by a fourth jitter component ($\sigma_{MEAS}$) introduced by the measurement process. The overall jitter ($\sigma_{TOT}$), according to Equation (1), is:
$$\sigma_{TOT}^2 = \sigma_{SNSPD}^2 + \sigma_{AMPLI}^2 + \sigma_{PHOTON}^2 + \sigma_{MEAS}^2 \qquad (1)$$
The measurement of the time difference between photons in coincidence on two SNSPDs (Figure 4) is another important application of SNSPDs [37]. In this situation, a source generates photons at the same moment, which are detected by two distinct SNSPDs. The spatial position where the photons were generated can be determined by measuring the difference between the detection times. The precision of this measurement is the so-called CTR, i.e., the standard deviation of the distribution of coincidence measurements ($\sigma_{CTR}$).
In the same way as $\sigma_{TOT}$, $\sigma_{CTR}$ depends on the measurement process ($\sigma_{MEAS}$) and on the intrinsic and electronic jitters of the two SNSPDs, that is, $\sigma_{SNSPD,1}$, $\sigma_{AMPLI,1}$, $\sigma_{SNSPD,2}$, and $\sigma_{AMPLI,2}$, respectively. In addition, $\sigma_{PHOTONS}$ (photons, not photon as in Equation (1)) denotes the jitter between the coincident photons detected by the SNSPDs that is due to the nature of the physical generation mechanism. This is summarized in the following equation:
$$\sigma_{CTR}^2 = \sigma_{SNSPD,1}^2 + \sigma_{AMPLI,1}^2 + \sigma_{SNSPD,2}^2 + \sigma_{AMPLI,2}^2 + \sigma_{PHOTONS}^2 + \sigma_{MEAS}^2 \qquad (2)$$
The symmetry between the two SNSPDs makes it feasible to set $\sigma_{SNSPD,1} = \sigma_{SNSPD,2} = \sigma_{SNSPD}$ and $\sigma_{AMPLI,1} = \sigma_{AMPLI,2} = \sigma_{AMPLI}$, reducing Equation (2) to
$$\sigma_{CTR}^2 = 2\sigma_{SNSPD}^2 + 2\sigma_{AMPLI}^2 + \sigma_{PHOTONS}^2 + \sigma_{MEAS}^2 \qquad (3)$$
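The jitter budget described in this section can be explored numerically. The sketch below assumes that the individual contributions listed above are statistically independent, so that they combine as a root sum of squares, and that the two SNSPD channels are identical. The function names and the numerical values in the usage lines are hypothetical and are not measurements from this work.

```python
import math


def sigma_tot(sigma_snspd, sigma_ampli, sigma_photon, sigma_meas):
    """Overall single-detector jitter (same unit as the inputs)."""
    return math.sqrt(
        sigma_snspd**2 + sigma_ampli**2 + sigma_photon**2 + sigma_meas**2
    )


def sigma_ctr(sigma_snspd, sigma_ampli, sigma_photons, sigma_meas):
    """Coincidence jitter for two identical SNSPD channels."""
    return math.sqrt(
        2 * sigma_snspd**2 + 2 * sigma_ampli**2
        + sigma_photons**2 + sigma_meas**2
    )


# Hypothetical contributions in ps r.m.s., chosen only for illustration.
print(f"sigma_TOT = {sigma_tot(10.0, 12.0, 6.0, 5.0):.1f} ps r.m.s.")
print(f"sigma_CTR = {sigma_ctr(10.0, 12.0, 0.0, 5.0):.1f} ps r.m.s.")
```

Because the terms add in quadrature, the largest contribution dominates the result, which is why a measurement jitter well below the detector and amplifier terms has little effect on the total.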

A. INTRINSIC JITTER
The bias current $I_{BIAS}$ [35], [36], the cooling temperature ($T_K$) [38], the active area, i.e., the geometry of the grid array, ($A$) [36], [39], and the wavelength ($\lambda$) of the detected photon all influence $\sigma_{SNSPD}$. Intuitively, the shorter $\lambda$ or the lower $T_K$, the faster the transition from the superconducting to the non-superconducting state, which lowers $\sigma_{SNSPD}$, i.e., $\sigma_{SNSPD} \propto \lambda$ and $\sigma_{SNSPD} \propto T_K$. Similarly, if $I_{BIAS}$ is increased or $A$ is decreased, the voltage pulse has a faster response, which reduces the intrinsic jitter, i.e., $\sigma_{SNSPD} \propto 1/I_{BIAS}$ and $\sigma_{SNSPD} \propto A$. There are, however, only limited options for tuning these four factors. First and foremost, an SNSPD has to detect photons with a $\lambda$ that normally cannot be modified and is determined by the nature of the processes being studied. Furthermore, a given application requires a minimum $A$, which determines the device's geometry (i.e., thickness, fill factor, width, and length). An upper limit on $I_{BIAS}$ is also set to keep the DCR low. Finally, there is a technological limit on the minimum $T_K$.
In practice, taking these factors into account, SNSPDs exhibit intrinsic jitter in the range of 6 ps to 20 ps FWHM [40].

B. ELECTRONIC JITTER
The electronic jitter $\sigma_{AMPLI}$ is highly dependent on the amplifier's Noise Figure ($NF$), Gain ($G$), and Bandwidth ($BW$). In fact, the amplifier noise results in a random dispersion of the exponential's rising edge with respect to the input voltage pulse [41]. As a result, identical input voltage pulses produce noticeably different exponential waveforms, increasing the uncertainty of the detection event. To reduce this dispersion, the output exponential must have a steep rising edge, i.e., a short $t_{RISE}$, so that $\sigma_{AMPLI} \propto t_{RISE}$, together with the highest possible Signal-to-Noise Ratio (SNR) [35], which is equivalent to minimizing $NF$, i.e., $\sigma_{AMPLI} \propto NF$, and maximizing $G$, i.e., $\sigma_{AMPLI} \propto 1/G$.
In general, a $G$ of tens of dB is needed to achieve the target output amplitude of hundreds of mV.
To avoid collecting noise, we must select an amplifier with a $BW$ that matches as closely as possible the spectrum of the input voltage pulse, which lies between a lower frequency bound ($F_L$), set by the 1/f noise corner frequency ($F_C$), and an upper frequency bound ($F_H$) (Figure 5). The resulting $t_{RISE}$ is inversely proportional to $F_H$, i.e., $t_{RISE} \sim 1/F_H$ [42]. Using ultra-fast RF amplifiers with $F_H$ of several GHz, a $t_{RISE}$ of hundreds of ps is ensured. The presence of 1/f noise with $F_C$ of hundreds of kHz imposes an $F_L$ of the same order of magnitude. In addition, even in the presence of pile-up, a minimal recovery time into the superconducting state is necessary for linearity in the count rate. In [34], an upper limit of tens of ns is set for the exponential decay time constant ($\tau$). To a first approximation, $\tau$ is inversely proportional to $F_L$, i.e., $\tau \sim 1/F_L$. Considering these two contributions, $\tau$ has an upper limit of tens of ns. For completeness, Figure 6 shows the graphical correspondence between the amplifier's output exponential and its Bode diagram. Table 1 summarizes the amplifier's parameters together with their constraints.
As a result, in real-world applications, the electronic jitter ranges from 14 to 45 ps FWHM [37].

C. MEASUREMENT JITTER
The contribution of $\sigma_{MEAS}$ depends on the device used for timestamping: the more precise the instrument, the lower its contribution to $\sigma_{TOT}$ and $\sigma_{CTR}$. The rule of thumb is to use a high-precision instrument, i.e., one with a low $\sigma_{MEAS}$, such that $\sigma_{MEAS}^2 \ll \sigma_{SNSPD}^2 + \sigma_{AMPLI}^2$. In real-world applications, this restricts the required precision to a few ps.
Oscilloscopes and time-interval meters are the two types of instruments that can be used. Given the success of digital electronics and digital signal processing, we consider the DO as the oscilloscope and the TDC as the time-interval meter.

1) DIGITAL OSCILLOSCOPE
In the case of the DO, the amplifier's exponential output is sampled as a classical analog signal with a sampling frequency $F_S$ chosen according to the Shannon theorem (i.e., $F_S \geq 2 \cdot F_H$). This involves working with a sampling rate of tens of GS/s, which translates to a timestamp resolution (LSB) of only tens of ps. In fact, the precision provided by a DO ($\sigma_{DO}$) is proportional, in the ideal case, to the LSB (i.e., $\sigma_{DO} = LSB/\sqrt{12}$), and in the absence of interpolation techniques $LSB = 1/F_S$. To lower $\sigma_{DO}$ to a few ps, digital oversampling of the collected waveform and subsequent interpolation (e.g., sinc) are required.
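The two relationships above, $LSB = 1/F_S$ and $\sigma_{DO} = LSB/\sqrt{12}$, can be evaluated directly. The following sketch does so for the 40 GS/s sampling rate quoted for the oscilloscope in Section IV; the function names are hypothetical, and the result applies to the ideal case without interpolation.

```python
import math


def lsb_without_interpolation(sampling_rate_hz):
    """Timestamp resolution (s) when no interpolation is applied."""
    return 1.0 / sampling_rate_hz


def quantization_sigma(lsb_s):
    """Ideal r.m.s. quantization error of a uniform quantizer with step lsb_s."""
    return lsb_s / math.sqrt(12.0)


fs = 40e9  # 40 GS/s, the DO sampling rate quoted in Section IV
lsb = lsb_without_interpolation(fs)
sigma_do = quantization_sigma(lsb)
print(f"LSB = {lsb * 1e12:.1f} ps, sigma_DO = {sigma_do * 1e12:.2f} ps r.m.s.")
```

At 40 GS/s this gives an LSB of 25 ps and a quantization term of about 7.2 ps r.m.s., which shows why oversampling and interpolation are needed to reach a few ps.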
The DO method allows the exponential shape to be examined in greater detail, but it has several limitations. To begin with, at least one byte per sample is required for the interpolation to be effective; this translates to tens of GByte per second of data that must be stored during the acquisition, which is thus limited in duration. Furthermore, DO technology has a strict limit on the number of parallel channels, making this approach practical solely for detector characterization. For the majority of applications, the TDC technique is therefore required.

2) TIME-TO-DIGITAL CONVERTER APPROACH
With the TDC method, the exponential output is directly translated into a 32- or 64-bit timestamp with an LSB of a few ps. No post-processing for interpolation is required, and data storage is reduced, allowing the system to run in real time. The drawback is that the shape information is completely lost.
The TDC assigns a timestamp to the input signal with reference to an internal clock (time-tagger) [43] or to an external event (start-stop), as described in [44]. The start-stop approach is commonly used in Time Correlated Single Photon Counting (TCSPC) applications; time-tagging solutions, on the other hand, can work in continuous mode. A time measurement is defined as the time elapsed between an absolute time reference, taken as "zero" on the time axis, and the occurrence of an event of interest. In this case, the measurement is a "Timestamp"; however, we are usually more interested in the relative "Time Distance" between two events, the first being the "START" signal and the second the "STOP" signal. A "Timestamp" is simply a particular case of time distance, between a chosen absolute time reference and the event under study. Conversely, the time distance between two events is the difference between their timestamps, each referred to a common absolute "zero" time origin. Figure 7 illustrates this concept.
The primary difference between TDCs, regardless of operating mode, lies in the circuit architecture and technology employed for time-to-digital conversion, which can be mixed-signal or all-digital. The Time-to-Amplitude Converter (TAC) [45] is the most popular and widely used mixed-signal TDC, in which the time interval is converted into a voltage level and then measured by an Analog-to-Digital Converter (ADC). The most widely used fully-digital TDCs are instead based on the Delay-Line (DL), in which the time interval is quantized using the propagation delay of the logic gates that compose the DL, which defines the LSB, regardless of the specific circuit [46].
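The relationship between timestamps and time distance, and the role of the LSB in a quantizing TDC, can be sketched as follows. This is an idealized model and not the architecture of any specific TDC: it assumes a uniform LSB and truncation to the lower code, and the function names and numerical values are hypothetical.

```python
import math


def timestamp_code(event_time_ps, lsb_ps):
    """Quantize an event time, measured from the common zero, to an integer code."""
    return math.floor(event_time_ps / lsb_ps)


def time_distance_ps(start_code, stop_code, lsb_ps):
    """Time distance between START and STOP as a difference of timestamps."""
    return (stop_code - start_code) * lsb_ps


LSB_PS = 2.0  # hypothetical resolution of the quantizer
start = timestamp_code(1001.0, LSB_PS)
stop = timestamp_code(1500.0, LSB_PS)
print(start, stop, time_distance_ps(start, stop, LSB_PS))
```

In this example the true distance is 499 ps while the quantized result is 500 ps: each of the two timestamps carries its own quantization error, so a time-distance measurement combines the uncertainty of two conversions.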
Mixed-signal and fully-digital TDCs can both be implemented as ASICs, while fully-digital TDCs can also be implemented in FPGA devices [47].
The designer of an ASIC solution can fine-tune all of the circuit's parameters with considerable freedom in order to obtain the best performance. Modern FPGA technologies now allow comparable performance with a considerably shorter time-to-market and a far lower overhead than an ASIC approach. The very high flexibility of FPGA-based systems makes them increasingly preferred at equal accuracy levels [48].

3) COMPARISON
Table 2, which refers to Paragraphs II-C1 and II-C2, summarizes the advantages and disadvantages of the DO and TDC techniques. Table 3 reports the precision and maximum number of parallel channels found in the SNSPD literature [49].

III. SNSPD OVERVIEW
We characterized a two-channel detection system provided by Single Quantum B.V. [58] using a LeCroy WaveRunner HRZ640i [59] as the DO (Figure 8), which is made up of two
The detector is a shaped NbTiN nanowire with a width of 100 nm and a fill factor of 50% (Figure 9) [62]. The nanowire structure has a diameter of 16 µm and is optically coupled to a single-mode optical fiber with a diameter of 12 µm (Figure 9). The superconductor is placed on top of a resonant cavity made of a 135 nm silicon-oxide layer and gold to increase the likelihood of absorbing photons at about 800 nm. This wavelength range is highly valuable for quantum research because photon down-conversion sources emit in it. The detector is kept at a $T_K$ of 2.5 K using a Gifford-McMahon closed-cycle cryo-cooler. The SNSPD, with its $I_{BIAS}$ current generator, is at the input of the amplification stage. When photon detection occurs, the SNSPD's load impedance (i.e., $R_{BIAS} \parallel R_{SNSPD} \sim \mathrm{k}\Omega$) is greater than the amplifier's input impedance (

A. SNSPD DESCRIPTION AND CHARACTERIZATION
The calibration curve, SDE vs. $I_{BIAS}$, is measured (using photons with a $\lambda$ of 750 nm) to determine the SNSPD's working point.
With an $I_{BIAS}$ of 17 µA, the SNSPD reaches saturation with an SDE of 80.7%; therefore, a "linear region" operating point at an $I_{BIAS}$ of 14 µA with an SDE of 70% is chosen. This SDE accounts for the SNSPD's detection efficiency and all the optical losses.

B. SNSPD SIMULINK SIMULATION
VOLUME 10, 2022
The SNSPD detection system is reproduced in the Simulink environment using the information obtained in Paragraph III-A. First, the LASER is simulated using a delta-comb generator with a period $T_{LASER} = 1/F_{LASER}$ of 1/(76 MHz) ≅ 13.158 ns and an active time ($T_{ON}$) of 6 ps. The SNSPD's detection efficiency is reproduced using a random generator that generates a number $n$ between 0 and 100 and suppresses the delta-comb pulse if $n$ is greater than the SDE ($n > SDE$). The SNSPD is modeled as a first-order system with a thermal $\tau$ of 100 ps, which reflects the time it takes to regain the superconducting state.
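The efficiency gate described above can be mimicked in a few lines. The following Python sketch is a stand-in for that part of the model only, not the Simulink implementation: it covers the pulse comb and the random suppression, and leaves out the first-order response and the amplifiers. The function name, the seed, and the number of pulses are hypothetical.

```python
import random


def simulate_detections(n_pulses, laser_rate_hz, sde_percent, rng):
    """Return the emission times (s) of the laser pulses that are detected."""
    period = 1.0 / laser_rate_hz
    detected = []
    for k in range(n_pulses):
        n = rng.uniform(0.0, 100.0)
        if n > sde_percent:
            continue  # pulse suppressed: the photon is not detected
        detected.append(k * period)
    return detected


rng = random.Random(1)  # fixed seed so that the run is repeatable
n_pulses = 100_000
hits = simulate_detections(n_pulses, 76e6, 70.0, rng)
print(f"laser period: {1e9 / 76e6:.3f} ns")
print(f"detected fraction: {len(hits) / n_pulses:.3f}")
```

With an SDE of 70%, about 70% of the pulses survive the gate, and the surviving events are randomly spaced by integer multiples of the laser period.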
The two amplifiers are simulated using the S-parameter models provided by Minicircuit (ZTL-100) and RF BAY Inc. (LNA-100). In this model, the LASER is represented by a digital trigger, and the SNSPD output is represented by an exponential shape. Figure 11 depicts the block architecture of the simulation model, while Figure 12 compares the emulated exponential output with the real one.

FIGURE 11. Simulink simulation model with the photon detection emulation, the two amplifiers, the TIM for timestamp measurement, and the scope for acquiring "analog" waveforms.

FIGURE 12. Comparison of the simulated exponential output with the real one.

IV. SNSPD TIMESTAMPING WITH OSCILLOSCOPE
As a first step, before using the fully-digital solution described in Section V, traditional timestamping is performed with a LeCroy WaveRunner 640i as the DO, operating at 40 GS/s with 8-bit resolution. The SNSPD versus LASER characterization is presented in Paragraph IV-A, and the CTR between two SNSPDs excited by the same LASER is evaluated in Paragraph IV-B.

A. SNSPD VS LASER
The jitter between the SNSPD and the LASER (Figure 3, where a LeCroy WaveRunner 640i scope is used as the TIM) yields a Gaussian distribution with a standard deviation of 19.84 ps r.m.s. (i.e., FWHM of 44.27 ps), which represents the $\sigma_{TOT}$ of Equation (1). To obtain it, the skew between the SNSPD's analog waveform and the LASER signal acquired with the DO was measured, and a sinc interpolation was used to provide the maximum possible precision. A statistically significant number of measurements was then used to build a histogram. In this way, we were able to reproduce the measurement result reported in Table 4 and in [56].
Furthermore, as illustrated in Figure 13, the pile-up of exponential pulses causes a baseline fluctuation. Superposition, and consequently distortion, occurs when the exponential curve of the $n$-th SNSPD pulse starts before the tail of the $(n-1)$-th curve has ended, which takes around $5\tau \cong 5 \cdot 14$ ns $= 70$ ns. In this case, as shown in Figure 14, generating a timestamp when the SNSPD waveform crosses a fixed threshold causes a walk error. Because the SNSPD triggers randomly on the LASER pulses, the baseline fluctuation is also a random variable, with a standard deviation of $\sigma_{BASE}$. As a result, the walk error is random and is characterized by the standard deviation $\sigma_{WALK}$, which depends on the slope ($Sl$) of the rising edge of the SNSPD's exponential shape (i.e., $Sl = A/t_{RISE}$) and, obviously, on $\sigma_{BASE}$, as shown by the relationship
$$\sigma_{WALK} = \frac{\sigma_{BASE}}{Sl} = \sigma_{BASE} \cdot \frac{t_{RISE}}{A} \qquad (4)$$
In this sense, the oscilloscope's $\sigma_{MEAS}$ is made up of two parts: $\sigma_{WALK}$, which is due to the baseline and is converted to time by the discrimination algorithm, and $\sigma_{DO}$, which is the quantization error of the DO; i.e.,
$$\sigma_{MEAS}^2 = \sigma_{WALK}^2 + \sigma_{DO}^2 \qquad (5)$$
Inserting Equations (4) and (5) in Equation (1), we get
$$\sigma_{TOT}^2 = \sigma_{SNSPD}^2 + \sigma_{AMPLI}^2 + \sigma_{PHOTON}^2 + \left(\frac{\sigma_{BASE}}{Sl}\right)^2 + \sigma_{DO}^2 \qquad (6)$$
In this approach, the baseline fluctuation is suppressed using a digital baseline restoration algorithm, which performs a correct discrimination that compensates $\sigma_{BASE}$ when detecting the timestamp (i.e., $\sigma_{WALK} \to 0$ ps r.m.s.), to achieve the highest possible precision. Furthermore, because of the higher probability of pile-up, the size of the baseline fluctuation is proportional to the count rate on the SNSPD ($R_{SNSPD}$). To confirm this, we measured $\sigma_{BASE}$ and $Sl$ as a function of $R_{SNSPD}$ using the DO in a range of 18 to 200 kHz, observing an increase in $\sigma_{BASE}$; the results are shown in Table 5.
(5) is used to reproduce the measurement results exposed by Single Quantum.
Due to some non-ideality in the estimation of $\sigma_{SNSPD}$, $\sigma_{AMPLI}$, $\sigma_{DO}$, and the suppression of $\sigma_{WALK}$, Table 4
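A first-order estimate of the walk error can be obtained by dividing the baseline fluctuation by the slope of the rising edge. The sketch below assumes this standard noise-to-time conversion, $\sigma_{WALK} = \sigma_{BASE}/Sl$ with $Sl = A/t_{RISE}$, which is an assumption of the example. The function name and the numerical values are hypothetical and are not taken from Table 5.

```python
def walk_sigma_s(sigma_base_v, amplitude_v, t_rise_s):
    """Convert a baseline fluctuation (V r.m.s.) into a timing walk (s r.m.s.)."""
    slope_v_per_s = amplitude_v / t_rise_s  # Sl = A / t_RISE
    return sigma_base_v / slope_v_per_s


# Hypothetical pulse: 300 mV amplitude, 300 ps rise time, 5 mV r.m.s. baseline.
sigma_walk = walk_sigma_s(5e-3, 0.3, 300e-12)
print(f"sigma_WALK = {sigma_walk * 1e12:.1f} ps r.m.s.")
```

For a given baseline fluctuation, a steeper edge gives a smaller walk, which is why a fast rise time and a large amplitude both help the timing precision.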

B. SNSPD VS SNSPD
We next measured the CTR between two SNSPDs triggered by the same LASER pulse using the LeCroy WaveRunner 640i as the DO (Figure 4), with the baseline restoration method turned on. Unfortunately, because of the random nature of SNSPD detections, real-time triggering is difficult. In fact, in order to consider only the events in which SNSPDs number 1 and 2 are in coincidence with the LASER, we must acquire all timestamps and perform post-processing, as shown in Figure 15. The results for CTR count rates ($R_{CTR}$) in the range of 1.2 cps to 720 cps are reported in Table 6. To guarantee this $R_{CTR}$, count rates between 6.8 kcps and 200 kcps are required on the two SNSPDs, number 1 and 2 ($R_{SNSPD,1}$ and $R_{SNSPD,2}$, respectively).
To compute the CTR in practice, we must collect statistics over a large number of skews, each computed between two coincident events detected on the SNSPDs, and extract the standard deviation ($\sigma$). To do so, an acquisition time ($T_{ACQ}$) of a few minutes was required in order to collect at least ~$10^4$ coincidence events ($N_{CTR}$) (i.e., $N_{CTR} = T_{ACQ} \cdot R_{CTR}$). Considering the $F_S$ of 40 GHz and the size of each sample (1 Byte), for SNSPD #1 and #2 we need to acquire a total of $N_{SNSPD,1} = F_S \cdot T_{ACQ}$ and $N_{SNSPD,2} = F_S \cdot T_{ACQ}$ samples, which corresponds to 4.68 TByte for each minute of acquisition (i.e., $(N_{SNSPD,1} + N_{SNSPD,2}) \cdot (1\ \mathrm{Byte})$ evaluated for $T_{ACQ} = 60$ s). As a result, this approach is incompatible with real-time applications. TDC timestamping is introduced to enable real-time measurement.
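The storage estimate above follows from simple arithmetic, which can be checked with the sketch below. It uses only quantities quoted in the text (40 GS/s, 1 byte per sample, two channels, 720 cps as the highest coincidence rate); the function names are hypothetical. The raw product is reported in bytes, since the TByte figure depends on the prefix convention adopted.

```python
def acquisition_bytes(sampling_rate_hz, bytes_per_sample, n_channels, duration_s):
    """Raw amount of data (bytes) produced by a continuous acquisition."""
    return sampling_rate_hz * bytes_per_sample * n_channels * duration_s


def time_to_collect_s(n_events, event_rate_cps):
    """Acquisition time (s) needed to collect n_events at a given event rate."""
    return n_events / event_rate_cps


per_minute = acquisition_bytes(40e9, 1, 2, 60)
print(f"{per_minute:.3e} bytes per minute of acquisition")
print(f"{time_to_collect_s(1e4, 720):.1f} s to collect 1e4 events at 720 cps")
```

Two channels at 40 GS/s produce several terabytes per minute, which makes continuous waveform storage impractical and motivates moving to timestamps generated by a TDC.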
To change $R_{SNSPD,1}$ and $R_{SNSPD,2}$ while keeping the $I_{BIAS}$ of the SNSPDs constant, a programmable optical attenuator with attenuation $A_{dB}$ is placed between the LASER and the SNSPDs, reducing the number of photons and, as a result, the count rates.
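Using Equations (3), (4), and (5) and the same method used in Paragraph IV-A, we can state that
$$\sigma_{CTR}^2 = 2\sigma_{SNSPD}^2 + 2\sigma_{AMPLI}^2 + \sigma_{PHOTONS}^2 + \left(\frac{\sigma_{BASE}}{Sl}\right)^2 + \sigma_{DO}^2 \qquad (7)$$
where $\sigma_{PHOTONS}$ is the jitter between the two correlated emitted photons (i.e., photons in coincidence) that are detected by the SNSPDs for the CTR estimation, as opposed to $\sigma_{PHOTON}$, which is the jitter between the LASER and its trigger (i.e., 6 ps r.m.s.). In our experimental setup, where the correlated photons are created via a beam splitter, we can assume $\sigma_{PHOTONS} \to 0$. In this case, Equation (7) becomes
$$\sigma_{CTR}^2 = 2\sigma_{SNSPD}^2 + 2\sigma_{AMPLI}^2 + \left(\frac{\sigma_{BASE}}{Sl}\right)^2 + \sigma_{DO}^2 \qquad (8)$$
Moreover, if the baseline fluctuation is made negligible,
$$\sigma_{CTR}^2 = 2\sigma_{SNSPD}^2 + 2\sigma_{AMPLI}^2 + \sigma_{DO}^2$$
We would like to point out that the measured $\sigma_{CTR}$ values (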

V. SNSPD TIMESTAMPING WITH TDC BUT NO BASELINE COMPENSATION
In this Section, we describe the real-time characterizations of LASER vs SNSPD and SNSPD vs SNSPD performed using the TDC introduced in [33] and applied in numerous scientific investigations [63], [64]. In these tests, we used the same configurations described in Section IV, replacing the DO with the TDC. No mechanism is in place to filter out the baseline fluctuation. Figure 16 shows the configuration used, which comprises a multi-channel Tapped Delay-Line based TDC (TDL-TDC) implemented in an FPGA device [31], [46]. The FPGA is a Xilinx 28-nm 7-Series Artix-7 200T [65], which is mounted on a Trenz Electronics [66] TE0712 System-on-Module (SoM) and connected to a custom carrier board through a large number of connectors. This configuration improves software and hardware re-configurability, allowing the user to change the FPGA device simply by replacing the TE0712 with another SoM from the same TE07xx family. In multi-channel mode, the TDC can run up to 16 parallel channels at high performance. As shown in Figure 16, the input receives an analog signal ranging from 0 to 3.3 V, which is converted into an LVDS digital pulse by a programmable TC. Differential traces calibrated at 100 Ω distribute the signals to the connectors and to the device where the TDC is located. The power stage and communication resources are housed on the carrier board. Table 7 highlights the main characteristics of the instrument employed. Regarding the $\sigma_{TDC}$ mentioned in the sections above, note that it is the combination of the precision of two channels (i.e., $\sigma_{CH} = 12$ ps r.m.s.), one for the START timestamps and one for the STOP timestamps that define the time interval under measurement. In this sense, $\sigma_{TDC}^2 = \sigma_{CH}^2 + \sigma_{CH}^2 = 2\sigma_{CH}^2 = (17\ \text{ps r.m.s.})^2$. The proposed instrument is entirely based on FPGA; both the TDC and the processing are located in the same programmable logic.
User-defined real-time parallel algorithms can be developed, or the collected timestamps can be forwarded to a PC for post-processing, as with the DO in Section IV. In this work, a coincidence engine for the CTR has been implemented. Furthermore, a hardware histogrammer has been developed to retrieve the statistics of SNSPD vs LASER ($\sigma_{TOT}$) and SNSPD vs SNSPD ($\sigma_{CTR}$) directly.

B. SNSPD VS LASER
The same measurement setup described in Paragraph IV-A and Figure 3 is used, replacing the DO with the TDC presented in Paragraph V-A. The TDC measures the time interval between LASER and SNSPD and computes the statistics, such as the histogram, in real time. As thresholds, we set voltage levels equal to exactly half of the amplitude of the LASER and SNSPD signals, respectively, so that the timestamp acquisition is properly triggered on the rising edges of LASER and SNSPD.
Different acquisitions at different $R_{SNSPD}$ values are reported in Table 8. Since there is no compensation mechanism for the baseline fluctuation, the measurements are less precise than those obtained with the DO. By substituting $\sigma_{DO}$ with $\sigma_{TDC}$ in Equation (5), we can easily adapt it to the case of the TDC; i.e.,
$$\sigma_{MEAS}^2 = \sigma_{WALK}^2 + \sigma_{TDC}^2$$
Considering the presence of a walk error distributed between 3.9 ps r.m.s. and 8.9 ps r.m.s., as Table 5 highlights, we can theoretically estimate a $\sigma_{TOT}$ in the range between 29.5 ps r.m.s. (i.e.,

C. SNSPD VS SNSPD
We employed the same measurement setup shown in Paragraph IV-B and Figure 4, using the TDC instead of the DO and computing the CTR in real time rather than in post-processing. In this way, we were able to measure the time difference between the two SNSPDs and compile its statistics using the TDC. As thresholds, we set a voltage level equal to exactly half the amplitude of the SNSPD signal in order to properly trigger the timestamp acquisition on the SNSPD rising edges. For SNSPD #1 ($R_{SNSPD,1}$) and #2 ($R_{SNSPD,2}$), several acquisitions were carried out at different count rates and are summarized in Table 9. In this situation as well, no baseline compensation method is used, resulting in a decrease in CTR precision as the rate increases.
Replacing σ DO with σ TDC in Equation (8), we can estimate σ CTR considering a walk-error between 3.8 ps r.m.s. and 8.9 ps r.m.s., obtaining a theoretical value between 28.5 ps r.m.s. and 29.7 ps r.m.s., which means a quadratic error with respect to the experimental value between 17 ps r.m.s (i.e., 28.5 2 − 18.5 2 = 18 ps r.m.s). Less data is produced than in Paragraph IV-B. The data rate is, in fact, proportional to R CTR , and each measurement is 4 bytes long, which translates to a global data rate of 4R CTR Byte/s.

In the proposed approach, the TDC generates the timestamps associated with the SNSPDs, the coincidence between timestamps is checked, and only the timestamps that pass the coincidence check (i.e., the difference between the timestamps coming from different SNSPDs is below a maximum value T ) are subtracted and put into the histogram. The entire algorithm runs in real time on the same FPGA that houses the TDC, with the PC serving solely as a read-out device. The pipeline for this elaboration process is shown in Figure 17.

FIGURE 17. Elaboration process for the measurement of the CTR. The coincidence checker sets SW to '1' if a coincidence between timestamps is detected. In a) SW is at '0' because no SNSPD #2 event follows the SNSPD #1 one; in b) SW is at '0' because the distance between the timestamps is larger than the maximum allowed (T ).

VOLUME 10, 2022

FIGURE. Block diagram of the entire simulation environment with the photon detection emulation, the two amplifiers, the HPF, the TC, the TDC for timestamp measurement, and the scope to see the "analog" waveforms.
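As an illustration of the coincidence check described in this subsection, the following Python sketch mimics the behaviour in software. It is not the FPGA firmware: the function names, the integer-picosecond timestamps, the bin width, and the rule that each SNSPD #2 event is consumed by at most one SNSPD #1 event are assumptions made only for this example.

```python
from collections import Counter


def coincidence_histogram(ts1_ps, ts2_ps, t_max_ps, bin_ps):
    """Histogram of (SNSPD #2 - SNSPD #1) differences that pass the check.

    ts1_ps, ts2_ps: sorted integer timestamps in picoseconds (assumed format).
    A pair is accepted (SW = '1') only if an SNSPD #2 event follows the
    SNSPD #1 event and their distance is below t_max_ps.
    """
    hist = Counter()
    j = 0
    for t1 in ts1_ps:
        # Skip SNSPD #2 events that precede the current SNSPD #1 event.
        while j < len(ts2_ps) and ts2_ps[j] < t1:
            j += 1
        if j == len(ts2_ps):
            break  # no SNSPD #2 event follows: SW stays at '0'
        diff = ts2_ps[j] - t1
        if diff < t_max_ps:
            hist[diff // bin_ps] += 1
            j += 1  # assumption: one SNSPD #2 event pairs with one SNSPD #1 event
        # otherwise the distance exceeds the maximum allowed: SW stays at '0'
    return hist


def data_rate_bytes_per_s(r_ctr_hz, bytes_per_measure=4):
    """Global data rate when each accepted measurement is 4 bytes long."""
    return bytes_per_measure * r_ctr_hz


ts1 = [0, 1000, 2000, 3000]
ts2 = [20, 1500, 2030]
print(dict(coincidence_histogram(ts1, ts2, t_max_ps=100, bin_ps=10)))
print(data_rate_bytes_per_s(450_000))
```

In this toy input, the second SNSPD #1 event is rejected because its partner is too far away, and the last one is rejected because no SNSPD #2 event follows it, which correspond to the two rejection cases of Figure 17.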

VI. BASELINE FILTERING
In this section, we show how the precision was enhanced by filtering out the baseline fluctuation, using the circuit introduced in Paragraph VI-A, for both LASER vs SNSPD and SNSPD vs SNSPD. In order to tune the proposed circuit appropriately, some simulations using the SNSPD model described in Paragraph III-B were performed before designing the hardware.

TABLE. The parameter σ TOT measured at different R SNSPD values using the TDC with the HPF at F P = 100 MHz (σ TOT (HPF)) and in DC (σ TOT (DC)), and estimation of the corresponding σ WALK .

A. BASELINE FILTERING CIRCUIT
To compensate for the baseline fluctuation, we assume that it is low-frequency noise caused by the superposition of exponential decays with a τ of a few tens of nanoseconds, i.e., F L = 10 MHz. In these terms, the simplest method is to employ a first-order High-Pass Filter (HPF) that shortens τ and hence the exponential decay (which is 5τ long). As a result, we replaced the 50 Ω DC termination with an AC one in the TDC board's input stage, as shown in Figure 18.
As can be seen, the HPF attenuates each harmonic below the pole frequency F P = 1/(2πRC) by 20 dB/dec, where C is the capacitance that creates the AC coupling and R is the equivalent resistance (100 Ω ∥ 100 Ω) that makes the 50 Ω termination. Different simulations are performed in Paragraph VI-B in order to find the best value of C.
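As a quick numerical check of the pole-frequency relation, the short Python sketch below evaluates F P = 1/(2πRC) for a 50 Ω equivalent resistance. The function names are hypothetical, and the resistance is assumed to be the parallel of two 100 Ω resistors.

```python
import math


def parallel(r1_ohm, r2_ohm):
    """Equivalent resistance of two resistors in parallel."""
    return r1_ohm * r2_ohm / (r1_ohm + r2_ohm)


def pole_frequency(r_ohm, c_farad):
    """Pole frequency of a first-order RC high-pass filter, F_P = 1/(2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)


def capacitance_for_pole(r_ohm, f_pole_hz):
    """Capacitance that places the pole at f_pole_hz for a given resistance."""
    return 1.0 / (2.0 * math.pi * r_ohm * f_pole_hz)


r = parallel(100.0, 100.0)  # assumed 100 ohm || 100 ohm = 50 ohm
for c in (320e-12, 32e-12):
    print(f"C = {c * 1e12:.0f} pF -> F_P = {pole_frequency(r, c) / 1e6:.1f} MHz")
print(f"C for 100 MHz: {capacitance_for_pole(r, 100e6) * 1e12:.2f} pF")
```

With these assumptions, 320 pF and 32 pF place the pole close to 10 MHz and 100 MHz, respectively, which matches the capacitance values quoted for the hardware tests.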

B. SIMULATION
To begin, the HPF, TC, and TDC models were added to the SNSPD simulation model described in Paragraph III-B. The LASER's digital output is directly connected to the TDC's START, whereas the exponential output of the SNSPD is filtered by the HPF to remove the baseline, as shown in Figure 19. The output of the HPF is then transformed into a digital trigger by the TC and connected to the TDC's STOP, just as in the experimental set-up. The TDC latches the simulation time when a trigger event occurs on the START, generating T START , and on the STOP, generating T STOP . The final measure T = T STOP − T START is then calculated, and the measurement is valid only if the STOP event occurs after the START within one laser period (i.e., T START ≤ T STOP < T START + T LASER ). To set the value of C, the standard deviation of T (σ T ) is used as the driving parameter. Table 10 shows several values of σ T as a function of the simulated R SNSPD , for various values of C, with the DC value (no HPF) taken as reference. We can see that the HPF fails to filter out the baseline for F P ≤ F L = 10 MHz, which makes it ineffective. We can therefore argue that increasing F P improves the precision by filtering out the walk-error. However, increasing F P also reduces the amplitude of the output signal, making discrimination more difficult, as Figure 20 illustrates. As a result, C = 32 pF was chosen for the hardware tests.
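The validity condition on START and STOP and the use of σ T as a figure of merit can be sketched in Python as follows. This is only an illustration, not the simulation model of the paper: the function name, the pairing of each STOP with the latest preceding START, and the use of the population standard deviation are assumptions of this example.

```python
from bisect import bisect_right
from statistics import pstdev


def valid_intervals(starts, stops, t_laser):
    """Return T = T_STOP - T_START for every STOP that satisfies
    T_START <= T_STOP < T_START + T_LASER.

    starts must be sorted; each STOP is paired with the latest START
    that does not follow it (assumed pairing rule).
    """
    intervals = []
    for t_stop in stops:
        i = bisect_right(starts, t_stop) - 1
        if i < 0:
            continue  # no START precedes this STOP
        t = t_stop - starts[i]
        if t < t_laser:
            intervals.append(t)
    return intervals


# Toy timestamps in picoseconds with a hypothetical 1000 ps laser period.
starts = [0, 1000, 2000]
stops = [105, 1095, 3500]
t_values = valid_intervals(starts, stops, t_laser=1000)
print(t_values)
print(pstdev(t_values))
```

The last STOP is discarded because it falls more than one laser period after the closest START; the remaining intervals are the samples from which σ T is computed.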

VII. SNSPD TIMESTAMPING WITH TDC WITH BASELINE FILTERING
We performed the experiment described in Section IV, but modified the TDC's input front-end as shown in Paragraph VI-A, and used a value of 32 pF for C (F P = 100 MHz), as determined in Paragraph VI-B.

A. SNSPD VS LASER
Table 11 reports the measurement precision of LASER vs SNSPD with baseline filtering (HPF present at F P = 100 MHz) and without it (DC), the latter used as reference. We can theoretically estimate σ TOT using Equation (10). Table 11, which uses the quadratic difference between the HPF and DC-coupled values of σ TOT , shows that the magnitude of the rejected σ WALK is in the range from 21.7 to 24.8 ps r.m.s. This must be compared with Table 5 (Paragraph IV-A), which shows a range of 3.9 ps r.m.s. to 8.9 ps r.m.s. This discrepancy is due solely to the fact that, in that case, σ WALK is extracted by measuring σ BASE , and the slope Sl is affected by a larger uncertainty than the direct measurement. In addition, we can see that when R SNSPD rises, the probability of the pile-up effect rises with it, worsening σ TOT . The histograms used to determine σ TOT at R SNSPD = 36 kHz, with a precision of 34.6 ps r.m.s. using the HPF at F P = 10 MHz and 26.0 ps r.m.s. using the HPF at F P = 100 MHz, are shown in Figure 21.

FIGURE 22.
The CTR measurement between SNSPDs at 450 kcps using the HPF at C = 320 pF (F P = 10 MHz) has a σ CTR of 38.8 ps r.m.s., according to the plotted histogram.
R CTR ) are reported in Table 12. We can observe that σ CTR becomes more precise as F P increases, which implies that the baseline fluctuation is reduced. Unfortunately, if F P is set too high, the HPF suppresses not only the baseline but also the harmonics that make up the SNSPD's short rise time, resulting in a reduction of the signal amplitude (A). In this way, we decrease not just σ BASE but also Sl, thereby raising σ WALK . Furthermore, if A is too low, the exponential signal becomes impossible to distinguish from the noise floor. We can also observe that the probability of the pile-up effect grows as the rates rise, so that, for the same F P , a better σ CTR is obtained at a lower rate. Figure 22 shows the histogram used to determine the CTR of 38.8 ps r.m.s. at R CTR = 450 kHz with an HPF at F P = 10 MHz. Figure 23 shows a comparison of the CTRs as a function of R CTR , taking into consideration the various methods discussed in Sections IV, V, and VII. The blue line with ''o'' markers represents the CTRs obtained with the DO from Table 6. The CTR is unaffected by the rate thanks to the DO's baseline filtering; nevertheless, the DO's memory and processing capabilities limit the maximum acquisition rate to 700 Hz. The orange line with ''+'' markers represents the CTRs acquired with the TDC without any filter, as reported in Table 9. Here the CTRs are computed directly in the FPGA, but their resolution is poor because of the baseline fluctuation.

VIII. COMPARISON AND FUTURE DEVELOPMENT
The CTRs acquired with the TDC with HPFs at F P = 100 kHz, F P = 1 MHz, F P = 10 MHz, F P = 100 MHz, and F P = 1 GHz are shown by the other five lines: yellow with ''x'' as markers, purple with '' '' as markers, green with ''♦'' as markers, light blue with ''$'' as markers, and magenta with ''0'' as markers, respectively. Table 9 has all of the values computed in real time directly in the FPGA. By looking at these lines, we can observe how the baseline fluctuations are filtered for R CTR less than F P . In fact, the lower F P of the yellow (F P = 100 kHz), purple (F P = 1 MHz), and green (F P = 10 MHz) lines does not guarantee a proper reduction of the baseline fluctuations. The light blue (F P = 100 MHz) and magenta (F P = 1 GHz) lines are the exceptions. The baseline filtering is effective at rates up to 100 kHz for the light blue line (F P = 100 MHz). The magenta line (F P = 1 GHz), instead, has a lower precision than the light blue line (F P = 100 MHz), which is due to the reduced amplitude of the filtered signal (see Figure 20).

Figure 12 shows a comparison of the time-walk estimates obtained using the DO (blue line with ''o'' as markers) and the TDC (orange line with ''+'' as markers) from Tables 5 and 11. In both cases, the time-walk, which is proportional to the baseline fluctuation, increases with the rate. The numerical discrepancy between the DO and TDC results is due to the different techniques used and to inherent measurement errors. In fact, we must keep in mind that the DO (blue line) time-walk estimate is obtained using a VM technique that extracts the baseline using the DSP algorithm and converts it to time-walk using Equation (4). Instead, for the TDC (orange line), the time-walk is determined as the quadratic difference between the precision achieved without the HPF (σ TOT (DC)) and with the HPF at F P = 100 MHz (σ TOT (HPF)), i.e., $\sqrt{\sigma_{TOT}^2(\mathrm{DC}) - \sigma_{TOT}^2(\mathrm{HPF})}$.
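The quadratic difference used for the TDC time-walk estimate can be written as a small Python helper. The function name is hypothetical and the numbers in the usage line are purely illustrative; they are not taken from the tables of this work.

```python
import math


def sigma_walk(sigma_tot_dc, sigma_tot_hpf):
    """Time-walk estimate as the quadratic difference
    sqrt(sigma_TOT(DC)^2 - sigma_TOT(HPF)^2), in the same unit as the inputs.
    """
    if sigma_tot_hpf > sigma_tot_dc:
        # The estimate is only defined when filtering improves the precision.
        raise ValueError("sigma_TOT(HPF) must not exceed sigma_TOT(DC)")
    return math.sqrt(sigma_tot_dc ** 2 - sigma_tot_hpf ** 2)


# Illustrative values only (ps r.m.s.).
print(sigma_walk(5.0, 3.0))
```

The subtraction in quadrature assumes that the walk-error is statistically independent of the other jitter contributions, so that their variances add.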
The CTR measurement is carried out between two SNSPDs to keep the budget under control, but the hardware supports 16 independent channels; the performance was verified by emulation using a function generator (rather than 16 SNSPDs). In the future, the experimental evaluation with 16 SNSPDs will be investigated, with the measurement set-up and firmware modified to manage 16 detectors in parallel. Specific attention will be given to cross-talk issues.

IX. CONCLUSION
We have shown how a time-mode architecture fully based on FPGA can perform, at the state of the art, timing and CTR measurements on SNSPDs (the most time-resolved detectors). Thanks to the versatility of programmable logic, both the TDC and the measurement techniques, such as coincidence detection and histogramming, are implemented inside the FPGA. In this scheme, the PC is only used as a monitor. A precision of less than 26 ps r.m.s. is attained with this technology in single-shot mode at measuring rates up to 1 MHz. Furthermore, thanks to the firmware flexibility, we can directly measure the CTR in the FPGA; this allows us to identify whether two photons detected by two distinct SNSPDs are in coincidence or not with a precision of always. To ensure this high precision by removing the intrinsic baseline fluctuation and decreasing the time-walk, a low-cost filtering of the SNSPD's output pulse is required. All processing takes place in parallel and efficiently inside the FPGA, making this technology ideal for modern multi-channel applications, and the number of input channels can be easily extended from a firmware standpoint. Finally, we have noted that particular attention will be paid to possible cross-talk effects.

ENRICO RONCONI (Member, IEEE) received the master's degree in electronic engineering from Politecnico di Milano, in 2020. His research interests include advanced programmable logic (PL) and software architectures for data processing and transfer in scientific equipment implemented in field programmable gate arrays (FPGA), and time-to-digital and digital-to-time converters (TDC and DTC).