A Low-Noise CMOS SPAD Pixel With 12.1 ps SPTR and 3 ns Dead Time

Single-photon avalanche diodes (SPADs) have become the sensor of choice in many applications that simultaneously require high sensitivity, low noise, and sharp timing performance. Recently, SPADs designed in CMOS technology have yielded moderately good performance in these parameters, but have never equaled their counterparts fabricated in highly customized, non-standard technologies. The arguments in favor of CMOS-compatible SPADs were miniaturization, cost, and scalability. In this paper, we present the first CMOS SPAD with performance comparable to or better than that of the best custom SPADs to date. The SPAD-based design, fully integrated in 180 nm CMOS technology, achieves a peak photon detection probability (PDP) of 55% at 480 nm, with a very broad spectrum spanning from near ultraviolet (NUV) to near infrared (NIR), and a normalized dark count rate (DCR) of 0.2 cps/μm², both at 6 V of excess bias. Thanks to a dedicated CMOS pixel circuit front-end, an afterpulsing probability of about 0.1% at a dead time of ∼3 ns was achieved. We designed three SPADs with diameters of 25, 50, and 100 μm to study the impact of size on the timing jitter and to create a scaling law for SPADs. For these SPADs, a single-photon time resolution (SPTR) of 12.1 ps, 16 ps, and 27 ps (FWHM), respectively, was achieved at 6 V of excess bias. The SPADs operate in a wide range of temperatures, from −65 °C to 40 °C, reaching a normalized DCR of 1.6 mcps/μm² at 6 V of excess bias for the 25 μm SPAD at −65 °C. The proposed SPADs are ideal for a wide range of applications, including (quantum) LiDAR, super-resolution microscopy, quantum random number generators, quantum key distribution, fluorescence lifetime imaging, and time-resolved Raman spectroscopy, to name a few. All these applications can take advantage of the vastly improved performance of our detectors, while enjoying the opportunities of megapixel resolutions promised by the economy of scale that is offered by CMOS technologies.


I. INTRODUCTION
Silicon-based single-photon avalanche diodes (SPADs) have attracted increasing interest in recent decades thanks to their performance [1], [2]. These devices have shown relatively low noise, high photon detection probability (PDP), and very good timing performance [3]-[5]. Initially, SPADs were fabricated in custom epitaxial technology [6]. The use of a custom technology guarantees freedom in design optimization, in order to obtain the best performance in terms of sensitivity and noise [7], [8]. Recently, important efforts have been devoted to implementing SPADs in commercial CMOS platforms [9]-[25]. Although CMOS SPADs generally perform worse in terms of PDP and noise, the benefit of integrating them side-by-side with electronic circuits is clear. Indeed, this approach leads to reduced parasitics and more sophisticated ancillary circuits, a larger application spectrum through more extensive functionality, and cost reduction thanks to mass production. Moreover, the circuits used for the SPAD front-end interface have been substantially improved over time, allowing the implementation of much more complex systems that fit a wider spectrum of applications [26]-[30]. In this context, very large array sizes, up to 1 Mpixel, have been achieved [31], [32].
In this work, we present three SPAD pixel detectors, based on high-performance SPAD pixels implemented in 180 nm CMOS technology. The devices are specifically designed to achieve high performance in terms of count rate, sensitivity, timing precision, noise, and power consumption. Low power consumption is particularly important if the same architecture is to be implemented in large arrays. The paper is organized as follows. After a description of the SPAD structure and corresponding TCAD simulations, along with the electronic front-end in Section II, the device characterization is presented, covering noise performance (Section III), sensitivity (Section IV), and timing (Section V-A). The measurement setup for each parameter is described alongside the results. Section VI presents a discussion of the results followed by perspectives opened up by this work. Section VII closes the paper.

II. SPAD DESIGN
We designed three devices in separate dies. Each device comprises four independent SPADs with a dedicated pixel circuit, placed at a distance of 250 μm. Fig. 1 shows the micrograph of the three implementations; the inset at the top right of each micrograph shows the corresponding SPAD diameter (25 μm, 50 μm, and 100 μm, respectively). To maximize the controllability and observability of the system, each chip has a large padring. Several pads were added in this prototyping phase to allow fine tuning of the control voltages of the pixel circuit, together with the high voltage to bias the SPAD, digital VDD, ground, and ESD supplies. The SPADs, implemented in 180 nm CMOS technology, rely on a p-i-n structure, similar to [21]. Fig. 2 shows the cross-section of the SPAD (top) and a TCAD simulation of the electric field in 2D, as well as a quantitative plot of the field along the vertical axis. The SPAD is a substrate-isolated type, where a p-well (PW) layer forms the anode of the SPAD and a buried n-well (BNW) layer creates the cathode contact. The latter is connected to the high voltage through a deep n-well (DNW). An epi layer between anode and cathode allows a fairly large high-field region (Fig. 2, bottom), thus achieving a large sensitivity spectrum. The simulation results correspond to the SPAD operation at an excess bias voltage of 6 V.
The SPAD breakdown voltage was measured to be about 22 V at room temperature. The corresponding I-V curve is shown in Fig. 3 under both dark and illuminated conditions.
The SPAD front-end circuit, shown in Fig. 4, is inspired by [33]; cascode transistor M1 is used as a resistive divider, along with M2, to enable high excess bias (up to 11 V) [34] in combination with thin-oxide MOS transistors in the remainder of the front-end.
The gate of M1 is fixed at V_CAS, supplied externally. When an avalanche is triggered in the SPAD, the voltage at the source of M1 rises, thus decreasing the transistor overdrive. When the voltage reaches V_CAS − V_th, M1 turns off, boosting the impedance seen at the SPAD's anode. Thanks to the body effect acting on these transistors, the overdrive of M1 is dynamically reduced, thus making it turn off faster. Both passive and active recharge strategies are available in the pixel and can be used independently. M5, controlled by V_pq, is used to disable the passive quenching/recharge branch, represented by M4. The active recharge branch is formed by M2 and M3, the latter being turned on by the feedback loop formed by the OR gate, Schmitt trigger, and tunable delay element. The loop acts as a programmable-length monostable. The delay element is implemented using a current-starved inverter (CSI) with a series voltage-controlled transistor in both the pMOS and nMOS branches (Fig. 4, right). Controlling this delay, and thus the hold-off time, is important to control afterpulsing, especially in relatively large SPADs. This mechanism determines both the pulse width at the output and, in large part, the dead time. To guarantee the stability of the monostable and to obtain sharp edges at the output, an inverting Schmitt trigger was added, while a current mirror was included to improve the linearity of the CSI controls [35]. The slew rate of the output was maximized, unlike in [34], using a custom buffering chain to the bonding pad. This solution ensured an output slew rate of approximately 1 V/ns.
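Since the programmable hold-off largely sets the dead time, it also bounds the achievable count rate. The following is a minimal sketch of that relationship, assuming the actively recharged pixel can be approximated by the textbook non-paralyzable dead-time model; the function names are illustrative and are not taken from the paper.

```python
def max_count_rate(dead_time_s: float) -> float:
    """Upper bound on the output count rate: at most one pulse per dead time."""
    if dead_time_s <= 0:
        raise ValueError("dead time must be positive")
    return 1.0 / dead_time_s


def measured_rate(true_rate_cps: float, dead_time_s: float) -> float:
    """Non-paralyzable dead-time model (an assumption): m = n / (1 + n * tau)."""
    return true_rate_cps / (1.0 + true_rate_cps * dead_time_s)


if __name__ == "__main__":
    for tau in (3e-9, 11e-9):  # dead times quoted elsewhere in this paper
        print(f"tau = {tau * 1e9:.0f} ns -> bound = {max_count_rate(tau) / 1e6:.0f} Mcps")
```

With a 3 ns dead time the bound is about 333 Mcps, of the same order as the 300 Mcps maximum count rate quoted in the conclusions.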

A. Dark Count Rate
Dark count rate (DCR) was measured at different excess bias voltages for all three SPAD structures (Fig. 5). The measurement was performed at room temperature using an oscilloscope (Teledyne LeCroy WaveMaster 813 Zi-B). To a first approximation, DCR is linear in the area of the active region. However, a superlinear behavior is generally observed in the normalized median DCR, owing to the higher probability of finding traps in larger SPADs, which cause trap-assisted dark counts. The results are shown in Fig. 5, where the median DCR is 0.2 cps/μm² at 6 V excess bias and room temperature for the 25 μm diameter SPAD. The DCR of the devices was also measured as a function of temperature in the −65 °C to 40 °C range using a climate chamber operated in a closed loop. Fig. 6 shows DCR in cps as a function of temperature for a range of excess bias voltages for the smallest (25 μm) and the largest (100 μm) SPADs. The figure also shows the breakdown voltage behavior over temperature for the same devices. These values were used to apply a precise excess bias. By decreasing the temperature, DCR decreases by about three orders of magnitude, reaching a value of 1.6 mcps/μm² at 6 V excess bias for a diameter of 25 μm, operating at −65 °C. The normalized DCR on the active area reaches a value of 4 mcps/μm² at 8 V excess bias at −65 °C.
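To relate the normalized figures above to absolute count rates, the sketch below converts between the two, assuming a circular active area whose diameter equals the quoted SPAD diameter (the paper does not state the exact active-area geometry, so this is an approximation). Function names are illustrative.

```python
import math


def active_area_um2(diameter_um: float) -> float:
    """Area of a circular active region (assumed geometry), in square micrometers."""
    return math.pi * (diameter_um / 2.0) ** 2


def normalized_dcr(dcr_cps: float, diameter_um: float) -> float:
    """DCR per unit active area, in cps/um^2."""
    return dcr_cps / active_area_um2(diameter_um)


def total_dcr(normalized_cps_per_um2: float, diameter_um: float) -> float:
    """Absolute DCR in cps implied by a normalized DCR and a diameter."""
    return normalized_cps_per_um2 * active_area_um2(diameter_um)


if __name__ == "__main__":
    # 25 um SPAD at 6 V excess bias: room temperature and -65 C values from the text.
    print(f"{total_dcr(0.2, 25.0):.1f} cps")     # about 98 cps
    print(f"{total_dcr(1.6e-3, 25.0):.2f} cps")  # about 0.79 cps
```

Under this assumption, 0.2 cps/μm² corresponds to roughly 98 cps for the 25 μm device, and 1.6 mcps/μm² to less than 1 cps.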

B. Afterpulsing
Afterpulsing probability is another very important parameter, especially when one wants to minimize dead time through active recharge, so as to increase the maximum count rate in SPADs. Afterpulsing arises because some carriers generated during the avalanche process may be captured by deep-level traps [36]-[38]. These carriers are then released after a statistical delay that depends on the lifetime of the traps [37], [38]. If a carrier is released in a region where the electric field is sufficiently high, it can ignite another avalanche. In general, the probability that this event occurs is higher with short dead times. Afterpulsing characterization for silicon SPADs is performed by histogramming the pulse inter-arrival time. This can be measured in the dark or under dim and uniform illumination. It can also be indirectly obtained by estimating the lifetime and density of traps using the time-correlated carrier counting (TCCC) technique [38], [39], which is typically more useful for III-V SPADs, where the afterpulsing probability is significantly higher. In the presented work, the afterpulsing probability was obtained through inter-arrival histogramming under controlled dim illumination. Fig. 7 shows the measured inter-arrival time between pulses generated by the 25 μm SPAD at 6 V excess bias. The SPAD dead time was set at about 11 ns using the integrated active recharge circuit described earlier.
Fig. 8 shows the measured afterpulsing probability of the same SPAD as a function of the pulse width. The afterpulsing probability remains as low as 0.1% for a pulse width of about 5 ns. With the current architecture, the minimum achievable SPAD dead time is 3 ns.
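The inter-arrival histogramming approach can be illustrated with a short sketch. It assumes that intervals longer than a chosen cut contain only uncorrelated (Poisson) events, fits their exponential rate, extrapolates that rate back to the dead time, and attributes any excess of short intervals to afterpulsing. This is one common way to post-process such data, not necessarily the procedure used for Figs. 7 and 8; the synthetic data generator and all parameter values are hypothetical.

```python
import math
import random


def afterpulsing_probability(intervals_ns, dead_time_ns, cut_ns):
    """Fraction of intervals in excess of the Poisson expectation (clipped at 0)."""
    tail = [t for t in intervals_ns if t > cut_ns]
    if not tail:
        raise ValueError("no intervals beyond the cut")
    # Maximum-likelihood rate of the exponential tail (memoryless beyond the cut).
    rate = len(tail) / sum(t - cut_ns for t in tail)
    # Fraction of uncorrelated intervals expected to exceed the cut.
    survival = math.exp(-rate * (cut_ns - dead_time_ns))
    expected_uncorrelated = len(tail) / survival
    excess = len(intervals_ns) - expected_uncorrelated
    return max(0.0, excess / len(intervals_ns))


def simulate_intervals(n, rate_per_ns, dead_time_ns, p_ap, tau_ap_ns, seed=0):
    """Toy data: each interval is an afterpulse with probability p_ap (hypothetical model)."""
    rng = random.Random(seed)
    intervals = []
    for _ in range(n):
        mean_rate = 1.0 / tau_ap_ns if rng.random() < p_ap else rate_per_ns
        intervals.append(dead_time_ns + rng.expovariate(mean_rate))
    return intervals


if __name__ == "__main__":
    data = simulate_intervals(200_000, 1e-3, 11.0, 0.01, 20.0, seed=1)
    print(f"estimated afterpulsing: {afterpulsing_probability(data, 11.0, 111.0):.4f}")
```

The cut should be several trap time constants beyond the dead time so that the tail is dominated by uncorrelated events; a cut that is too short biases the estimate low.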

A. PDP Setup
The most common method of measuring the PDP is to create an area with a uniform photon flux at a particular wavelength and to compare the response of the SPAD under test with that of a calibrated reference device (usually a photodiode). The setup used to measure PDP is based on the continuous light technique [40] and is shown schematically in Fig. 9. It comprises a wide-spectrum xenon lamp, a monochromator, an integrating sphere, a calibrated reference photodiode (PD) with a precision source and measurement unit (SMU) to measure the photocurrent generated by the PD, and a universal counter connected to the device under test (DUT). The integrating sphere and the DUT are enclosed in a light-tight box to eliminate any source of background noise that would affect the measurement. Custom software was developed to automate the scan at a very fine wavelength resolution. The DUT was placed at a distance L from the output window of the integrating sphere, so as to ensure a lower light level and high uniformity [41]. A lower light level is needed because the SPAD, being sensitive to single photons, can saturate when exposed to a high light level; the resulting pile-up distorts the SPAD's sensitivity curve and leads to an underestimation of the PDP. Moreover, a stronger light impinging on the reference photodiode can improve its SNR. A 45 s integration time was used for each step. For each value of excess bias, the DCR is measured before starting the acquisition under light. This value is then used to compute the PDP as shown in [40], where η is a light ratio, computed during the calibration phase by measuring the light power at the integrating sphere output port and at the location of the DUT with a calibrated reference photodiode; S is the number of pulses at the SPAD output when exposed to light; A_SPAD is the active area of the SPAD; and F_PD(λ) is the photon flux detected by the reference photodiode.
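A sketch of how the quantities defined above might combine into a PDP estimate is given below. The authoritative expression is the one in [40]; the form used here is an assumption, namely that S and the DCR are expressed as count rates, that F_PD(λ) is a photon flux per unit area at the reference photodiode, and that η is the ratio of the flux at the DUT to the flux at the reference. All names and numerical values are illustrative.

```python
def pdp(counts_cps, dcr_cps, flux_pd_per_um2_s, area_spad_um2, eta):
    """Assumed form: PDP = (S - DCR) / (eta * F_PD * A_SPAD)."""
    photon_rate_on_spad = eta * flux_pd_per_um2_s * area_spad_um2
    if photon_rate_on_spad <= 0:
        raise ValueError("photon rate on the SPAD must be positive")
    return (counts_cps - dcr_cps) / photon_rate_on_spad


if __name__ == "__main__":
    # Hypothetical numbers chosen only to exercise the function.
    value = pdp(counts_cps=5600.0, dcr_cps=100.0,
                flux_pd_per_um2_s=20.0, area_spad_um2=500.0, eta=1.0)
    print(f"PDP = {value:.2%}")
```

Subtracting the DCR before normalizing is what makes the separate dark measurement at each excess bias necessary; at high count rates a pile-up correction would also be needed, which this sketch omits.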

B. PDP Results
The PDP is plotted in Fig. 10 as a function of wavelength (top) and excess bias voltage (bottom). All the measurements were performed at room temperature. The wavelength scan was performed with a step of 10 nm. The sensitivity peak is 55% at 480 nm at 6 V excess bias. These results are consistent with [21], [34] for similar SPAD cross-sections. The relatively large sensitivity spectrum is also in line with the structure used (Fig. 2). Note the typical PDP saturation above 5 V excess bias. At and above this voltage, the PDP becomes increasingly insensitive to variations of breakdown voltage, which makes this SPAD amenable to integration in large arrays, where the breakdown voltage could vary significantly across the chip, thereby causing unwanted PDP variability.

A. Jitter Setup
The setup used to evaluate the timing jitter of the DUT, shown in Fig. 11, is based on [42]. The setup comprises a femtosecond laser (Amplitude Systèmes SA, S-Pulse HR SP), capable of generating 150 fs pulses at a wavelength of 1030 nm and 515 nm after second-harmonic generation (SHG). A fast photodiode (Newport InGaAs Photodetector, 45 GHz bandwidth) is used as a timing reference, while the upconverted beam is attenuated by a bank of neutral density filters (NDFs), so as to achieve the single-photon detection regime. The DUT has a high-impedance output and thus an active probe is used to capture the output. An oscilloscope (LeCroy WaveMaster 813 Zi-B) is used to capture both the waveform from the DUT and the reference PD.
Fig. 11. Optical setup used for the single-photon timing resolution measurement. A femtosecond laser generates a 150 fs pulse at 1030 nm, which is then upconverted to 515 nm after SHG. A fast PD is used as a reference to the oscilloscope, while the upconverted beam (515 nm) is filtered by neutral density filters (NDFs). The output of the DUT is sampled through a 4 GHz, 0.6 pF active probe by a 40 GS/s, 13 GHz oscilloscope to generate a histogram using time-correlated single-photon counting (TCSPC) acquisition.

B. Jitter Results
Timing jitter measurements for the three device sizes are shown in Fig. 12. The plot shows the histograms of the response of the SPADs when biased at an excess bias voltage of 6 V and at room temperature. The oscilloscope trigger threshold was set at 400 mV for the SPAD pulse and 300 mV for the PD. The laser repetition rate is 100 MHz and the light was attenuated so that a detection occurs for fewer than one in every 100 laser pulses. The jitter value (FWHM) of the response distribution was measured at 12.1 ps for a diameter of 25 μm, 16 ps for 50 μm, and 27.2 ps for 100 μm. To capture the diffusion tails, the full width at tenth of maximum (FWTM) was extracted as well; it is 55.7 ps for a diameter of 25 μm, 66.8 ps for 50 μm, and 91.7 ps for 100 μm. The exponential time constant of the diffusion tails was also extracted from the plot and is 31.5 ps, 40.7 ps, and 38 ps for the 25, 50, and 100 μm SPADs, respectively.
The plots in Fig. 13 show the response of a 100 μm SPAD at excess bias voltages of 6 V and 8 V, with an improvement of the jitter from 27.2 ps to 23.5 ps FWHM. In this case too, the exponential time constant of the diffusion tail was extracted; it is 38 ps at 6 V excess bias and 33.1 ps at 8 V excess bias.
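The FWHM and FWTM figures above are widths of the TCSPC histogram at 50% and 10% of its peak. A minimal sketch of how such widths can be extracted from binned data by linear interpolation is shown below; it is illustrative only, uses the first crossing on each side of the peak, and is exercised here on a synthetic Gaussian rather than on measured data.

```python
import math


def width_at_fraction(times_ps, counts, fraction):
    """Width of a single-peaked histogram at `fraction` of its maximum."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    peak = max(counts)
    if peak <= 0:
        raise ValueError("histogram is empty")
    level = fraction * peak
    i_peak = counts.index(peak)

    i = i_peak  # walk left until the histogram drops to the level
    while i > 0 and counts[i] > level:
        i -= 1
    j = i_peak  # walk right until the histogram drops to the level
    while j < len(counts) - 1 and counts[j] > level:
        j += 1
    if counts[i] > level or counts[j] > level:
        raise ValueError("histogram does not drop below the requested level")

    # Linear interpolation between the two bins that bracket each crossing.
    t_left = times_ps[i] + (level - counts[i]) * (times_ps[i + 1] - times_ps[i]) / (counts[i + 1] - counts[i])
    t_right = times_ps[j] - (level - counts[j]) * (times_ps[j] - times_ps[j - 1]) / (counts[j - 1] - counts[j])
    return t_right - t_left


def gaussian_histogram(sigma_ps, step_ps=0.5, half_span_ps=50.0):
    """Synthetic, noise-free Gaussian histogram used only for the demonstration."""
    n = int(round(2 * half_span_ps / step_ps)) + 1
    times = [-half_span_ps + k * step_ps for k in range(n)]
    counts = [1000.0 * math.exp(-t * t / (2.0 * sigma_ps ** 2)) for t in times]
    return times, counts


if __name__ == "__main__":
    t, c = gaussian_histogram(sigma_ps=5.0)
    print(width_at_fraction(t, c, 0.5), width_at_fraction(t, c, 0.1))
```

For a pure Gaussian the FWTM/FWHM ratio is about 1.82. The ratio reported here for the 25 μm SPAD (55.7 ps / 12.1 ps ≈ 4.6) is much larger, which is consistent with the presence of a diffusion tail.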
It is important to note that these results were obtained without the need for low-threshold comparators; thus a simplified circuit can be used in each pixel, thereby ensuring scalability to large arrays of pixels.
0.16 μm) exhibit the best sensitivity performance [16]-[19], [21], [24]. In more recent nodes, by contrast, where higher doping and shallower standard layers are used, peak PDP does not usually exceed 32% and noise is higher [10]-[13], [15], [20], [22]. In [43] it is shown how very high PDPs can be achieved in the red in a 180 nm CMOS node using custom layers. However, this SPAD is not isolated, and thus the integration of front-end circuits is not straightforward. In addition, the achieved timing jitter is high because of its large drift region.
The PDP performance of the devices presented in this work is among the best ever reported in the literature for substrate-isolated SPADs. The peak value of ∼54% at 480 nm with 5 V excess bias is quite close to that reported in [24], device (C), for the same bias. The noise performance reported in this work is also among the best shown in the literature (Fig. 14, right).
To the best of our knowledge, the timing jitter achieved in this work is superior to that of any other CMOS SPAD-based device reported in the literature, except for [5], which reports a peak PDP of 8% and a DCR of 2800 cps/μm², while our device achieves a peak PDP of 55% and a worst-case DCR of 0.23 cps/μm² (Fig. 14, left). Moreover, we have shown the performance of the SPAD with a low-power digital front-end and without the need for any circuit that could affect power consumption, such as a low-threshold comparator. Thus, we believe that the proposed SPAD is an example of a new generation of devices with similar or better performance than custom SPADs but allowing scalable architectures with little to no power budget restrictions. Finally, we believe that the reduction of the front-end circuit threshold could improve timing performance and power consumption even further.

VII. CONCLUSIONS
We report on the design and characterization of a new SPAD fabricated in 180 nm CMOS technology, exhibiting performance comparable to or better than that of the most advanced custom SPADs to date. The devices have a peak PDP of 55% at 480 nm and a DCR as low as 0.2 cps/μm² at room temperature, both at 6 V excess bias. The DCR is as low as 1.6 mcps/μm² at −65 °C, while the SPAD operated normally at 40 °C. The pixel circuit allows fine tuning of the SPAD dead time to control the maximum achievable count rate, up to 300 Mcps, while afterpulsing remains on the order of 0.1%.
Three SPAD families were designed, with diameters of 25, 50, and 100 μm. The SPTR reached 12.1 ps (FWHM) in the smallest SPAD and did not exceed 27 ps in the largest, all at 6 V of excess bias and room temperature. Low static power consumption is compatible with large arrays of SPADs, which makes this technology amenable to scalable Mpixel sensor architectures, suitable for a variety of applications demanding high sensitivity, low noise, and sharp timing performance.

ACKNOWLEDGMENT
The Authors would like to thank Simone Frasca and Olivier Bernard for the fruitful discussions and the technical support on the laser setup. EPFL also gratefully acknowledges the generous support of the Swiss .