Design of a Time-of-Flight Sensor With Standard Pinned-Photodiode Devices Toward 100-MHz Modulation Frequency

We present an indirect time-of-flight (ToF) sensor based on standard pinned-photodiode (PPD) devices, together with design guides that pave the way toward a ToF pixel operating at a 100-MHz modulation frequency. Standard PPDs are well established as the predominant devices for 2-D color imagers because of their low-noise characteristics, but the slow transfer of photo-generated electrons still prevents them from being employed in 3-D depth imagers. An optimized PPD structure that requires no process modifications is introduced to create a lateral electric field that enhances the charge transfer speed inside the PPD, and essential design parameters for achieving a high operating frequency, such as the epitaxial layer thickness, the pinning voltage, and the threshold voltage of the transfer gates, are discussed with TCAD simulation results. Prototype indirect ToF sensors with various structures and parameters were fabricated in a 0.11-$\mu \text{m}$ standard CIS process and fully characterized. We evaluated the demodulation contrast of each pixel at modulation frequencies from 10 to 75 MHz and identified suitable conditions for the PPD-based pixel. The best pixel operating at 50 MHz demonstrated a depth resolution of less than 13 mm and a linearity error of about 3.7% between 1 and 3 m with a zero-order calibration. We believe that further optimization of the PPD-based ToF pixel can improve the performance and push its operation toward a 100-MHz modulation frequency.


I. INTRODUCTION
Since face recognition systems such as the FaceID [1] were introduced, 3-D imaging technologies have drawn significant attention for user identification on mobile devices. They are also indispensable for next-generation user interfaces in smart personal devices, e.g., augmented and virtual reality. Among them, time-of-flight (ToF) techniques have an advantage in system size over traditional triangulation methods such as stereo cameras, which require a long baseline to improve the accuracy and range of the depth map, and are therefore well suited to smart devices. ToF imagers extract depth information either directly from the time delay or indirectly from the phase difference between the emitted and the reflected light, which are known as the direct ToF (D-ToF) and indirect ToF (I-ToF) methods, respectively [2]. D-ToF is typically adopted in light detection and ranging (LiDAR) applications with ranges of a few hundred meters under strong background light [3]-[6], while I-ToF is appropriate for user authentication and human-computer interaction over ranges of a few meters in indoor environments [7].
The associate editor coordinating the review of this manuscript and approving it for publication was Baile Chen.
Following the shrinking pixel size of 2-D color imagers, the pixel area of 3-D depth imagers, particularly I-ToF imagers, has been scaled from 1600 µm² down to 12.25 µm² over the past two decades. The spatial resolution of I-ToF sensors is also increasing and now approaches 1 Mpixel, as shown in Fig. 1, which plots data aggregated from prestigious journals and conferences on ToF imagers such as IEEE Transactions, the International Image Sensor Workshop (IISW), and the International Solid-State Circuits Conference (ISSCC) [8]-[23]. This trend exists because a small pixel with a high spatial resolution is desirable for enhancing gesture recognition and motion detection. However, pixel scaling does not always improve the depth accuracy, which is governed by the number of accumulated electrons and the modulation frequency [8]. The smaller the pixel, the lower the achievable signal-to-noise ratio (SNR), which deteriorates the depth accuracy. Considering the infrared reflectivity of the target object and its distance, the required dynamic range is also much higher than that of 2-D imagers, in which only the color of the object is taken into account. This is the main challenge in shrinking the I-ToF pixel, even though its structure is similar to the active pixel in 2-D imagers [24].
On the other hand, a small pixel can greatly reduce the time needed to transfer photo-charge to a designated location in the pixel, which allows the modulation frequency to be increased without degrading the demodulation contrast (C_demod), a measure of how effectively charge is separated for detecting the phase shift. The depth error is inversely proportional to the product of the modulation frequency and C_demod [8]. Therefore, it is important to maintain the C_demod of a small pixel operating at a high modulation frequency in order to acquire reliable depth accuracy. Electron movement in conventional photodiodes relies on diffusion, so the charge transfer is too slow to raise the modulation frequency to 100 MHz even when the pixel is small. The most promising way to accelerate charge transfer in photodiodes is to generate an electric field that drifts electrons toward the designated storage node. Various pixel structures that create a lateral drift field have been reported so far: a high-resistive photogate with two different applied potentials [25], interdigitated CCD gates with a static potential bias [10], a current-assisted photodiode [21], a photogate with quantum efficiency modulation [19], [22], a triangle-shaped pinned photodiode [12], a lateral gate electric field modulator [17], [20], and so on. Most of them are based on non-standard CMOS imaging process technology or need delicate modifications of the implant layers and photogates, which may not be cost-effective. The standard pinned photodiode (PPD) prevailing in 2-D imagers is an alternative for an affordable and reliable I-ToF pixel, but its depth acquisition performance is still inferior to that of other pixel structures.
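The scaling argument above can be illustrated with a minimal sketch. It assumes only the proportionality stated in the text (depth error proportional to 1 / (f_mod × C_demod)) and the standard unambiguous range of a continuous-wave ToF system, c / (2 f_mod). The function names and the example operating points are hypothetical and are not measured values from this work.

```python
# Illustrative sketch only: function names and example operating points
# are assumptions, not values taken from the measurements in this paper.

C_LIGHT = 299_792_458.0  # speed of light in m/s


def unambiguous_range_m(f_mod_hz: float) -> float:
    """Largest distance measurable without phase wrapping: c / (2 * f_mod)."""
    if f_mod_hz <= 0:
        raise ValueError("modulation frequency must be positive")
    return C_LIGHT / (2.0 * f_mod_hz)


def relative_depth_error(f_mod_hz: float, c_demod: float) -> float:
    """Depth error up to a constant factor, using error ~ 1 / (f_mod * C_demod)."""
    if f_mod_hz <= 0 or not 0.0 < c_demod <= 1.0:
        raise ValueError("need f_mod > 0 and 0 < C_demod <= 1")
    return 1.0 / (f_mod_hz * c_demod)


def error_ratio(f_a: float, c_a: float, f_b: float, c_b: float) -> float:
    """Depth error of operating point A divided by that of operating point B."""
    return relative_depth_error(f_a, c_a) / relative_depth_error(f_b, c_b)


if __name__ == "__main__":
    for f_mhz in (20, 50, 100):
        print(f_mhz, "MHz -> unambiguous range [m]:", unambiguous_range_m(f_mhz * 1e6))
    # Same contrast, five times the frequency: the error shrinks fivefold.
    print(error_ratio(100e6, 0.5, 20e6, 0.5))
    # If the contrast collapses at high frequency, the benefit is lost.
    print(error_ratio(100e6, 0.1, 20e6, 0.5))
```

The second comparison shows why maintaining C_demod matters: a fivefold increase in frequency brings no improvement if the contrast falls by the same factor.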
In this paper, we explore the potential of the PPD-based I-ToF pixel and suggest design optimizations to increase its modulation frequency toward 100 MHz. The proposed pixel adopts the split and binning architecture [16] and adjusts the shape and doping profiles within the ranges allowed by a standard process to generate a lateral drift field, which speeds up charge transfer inside the PPD. In addition, we tailor several process conditions, such as the thickness of the p-epitaxial layer (epi-layer), the threshold voltage of the transfer gates, and the pinning voltage, to find the optimum design of the PPD-based depth pixel. TCAD simulation and measurement results for various pixels are presented in the following sections.
Fig. 2 shows a block diagram of a demonstration sensor with various PPD-based I-ToF pixels for evaluation. It consists of a pixel array, two row drivers and a decoder, and column-parallel correlated double samplers (CDS) with an address decoder. The pixel includes eight sub-photodiodes with transfer gates, as shown in Fig. 3, and the pixel size is fixed at 14.4 × 14.4 µm². The split photodiode structure enables fast charge transfer even without an electric field for drift, and the electrons merged in the two floating diffusion nodes, FD_A and FD_B, improve the SNR in the charge domain. Previous work proposing the split and binning structure with the standard PPD reported a reliable I-ToF sensor operated at 20 MHz [16]. Fig. 3(a) shows the sub-photodiode reported in [16], which is called the conventional pixel in this paper. Fig. 3(b) shows the proposed pixel layout, drawn with only four important layers for simplicity. The pinning potential (V_pin), the fixed voltage of the depletion region in the PPD, depends on the width and dopant concentration of the arsenic implant layer when the other implant conditions are unchanged. Note that n- and n denote silicon lightly and heavily doped with arsenic, respectively.
The wider and more heavily doped the n layer is, the higher the resulting V_pin [26]. Hence, the conventional pixel has a single V_pin because the width and dopant concentration of its n-layer are constant, while the V_pin of the proposed pixel increases gradually from the n-region to the transfer gates, TX_A and TX_B. To obtain this gradient, the n-layer has to widen toward the transfer direction, which by itself would end up with an inefficient TX location, as in the previous triangular-shape approach [12]. The heavily doped region (n layer in Fig. 3) is designed so that its V_pin is higher than that of the lightly doped region despite its narrow width, which allows efficient charge transfer by the TX gates. The photo-generated electrons slide into the n layer along the potential gradient and finally move toward the activated TX, which is the same operating principle as in previous works [10], [15]. The funnel-type n layer structure enables the electrons to reach the dedicated FD node within a few nanoseconds although no electric field is applied externally. The shape and dopant levels are carefully trimmed to make the whole PPD region fully depleted, which is confirmed by TCAD simulation in SPECTRA [27]. Fig. 4 shows the simulated V_pin distribution in the proposed pixel, and a potential gradient is well defined as expected. The dotted line shows the path of an electron generated at coordinate (X = 0.5 µm, Y = 1.5 µm). The potentials of both FDs and TX_B are set to V_DD, whereas TX_A is turned off. To clarify the potential distribution further, 1-D potential profiles in the X and Y directions along the dashed lines are plotted in Fig. 5. Along X1-X1′ and X2-X2′, a smooth potential slope is produced with no active potential applied externally. A slight potential gradient is also generated along Y-Y′ by turning the TXs on and off, resulting in high-speed charge transfer. A drain gate, labeled DX in Fig. 3, drains unwanted electrons generated by background light during the readout phase.
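To give a feel for why a built-in potential gradient matters, the following sketch compares textbook order-of-magnitude estimates of the transit time across a photodiode by pure diffusion, t ≈ L² / (2D), and by drift in a uniform field, t ≈ L² / (µ ΔV). The electron mobility, the 5-µm transfer length, and the 0.2-V potential difference are assumed illustrative values, not parameters of the fabricated pixel, and the model ignores the 2-D geometry captured by the TCAD simulation.

```python
# Order-of-magnitude sketch. All numeric values are assumptions chosen for
# illustration; they are not extracted from the pixel described in the text.

THERMAL_VOLTAGE_V = 0.02585      # kT/q at about 300 K
MU_N_CM2_PER_VS = 1350.0         # assumed electron mobility in lightly doped Si


def diffusion_time_s(length_um: float) -> float:
    """1-D diffusion estimate t = L^2 / (2 D), with D = mu * kT/q (Einstein relation)."""
    length_cm = length_um * 1e-4
    diffusivity = MU_N_CM2_PER_VS * THERMAL_VOLTAGE_V  # cm^2/s
    return length_cm ** 2 / (2.0 * diffusivity)


def drift_time_s(length_um: float, delta_v: float) -> float:
    """Uniform-field drift estimate t = L / (mu * E) = L^2 / (mu * delta_V)."""
    if delta_v <= 0:
        raise ValueError("potential difference must be positive")
    length_cm = length_um * 1e-4
    return length_cm ** 2 / (MU_N_CM2_PER_VS * delta_v)


def half_period_s(f_mod_hz: float) -> float:
    """Time each tap is open under 50% duty-cycle modulation."""
    return 1.0 / (2.0 * f_mod_hz)


if __name__ == "__main__":
    length_um = 5.0   # hypothetical transfer length
    delta_v = 0.2     # hypothetical potential difference across that length
    print("diffusion time [s]:", diffusion_time_s(length_um))
    print("drift time [s]:    ", drift_time_s(length_um, delta_v))
    print("half period at 100 MHz [s]:", half_period_s(100e6))
```

Under these assumptions the drift estimate is shorter than the diffusion estimate by the factor 2(kT/q)/ΔV, so even a potential difference of a few tenths of a volt leaves noticeably more margin within the 5-ns half period at 100 MHz.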

II. PROPOSED PIXEL STRUCTURE AND OPTIMIZATION BASED ON TCAD SIMULATION
The other blocks in Fig. 2 are designed with schemes proven in our previous works [14], [16]. The two row drivers employ conventional clock-distributed inverter chains to reduce the clock skew and are placed on both sides of the pixel array to alleviate the large parasitic loading of the long TX lines, which is a common technique for achieving high-speed signal transmission up to 100 MHz. Each column has two CDS circuits to read the in-phase and out-of-phase images stored in the two FD nodes simultaneously while suppressing the fixed-pattern noise of the reset and source follower transistors. No analog-to-digital converters are included, for flexible operation.
To investigate the optimization of I-ToF pixels further, we conduct design splits with three process variations: the epi-layer thickness, the pinning potential, and the threshold voltage of the TX, represented by t, V_pin, and d, respectively, in the cross-section view in Fig. 6. First, the optimum epi-layer thickness should be determined. The epitaxial layer is grown on a bulk p-substrate wafer. Considering the penetration depth of infrared light in silicon, a thick p-epi can increase the quantum efficiency. However, it lengthens the vertical part of the electron trajectory, where there is no electric field, and thus deteriorates the charge transfer speed. This trade-off between the SNR and the modulation frequency is examined experimentally with three different epi-layer thicknesses. Next, two different pinning potentials are applied to the same pixel structure. A rigorous performance evaluation of pinning potential adjustment reported that a low V_pin could improve the C_demod at high modulation frequency [28]. Nevertheless, we surmise that a high V_pin may give a better result with a thick epi-layer because its depletion region should extend deeper, providing a vertical electric field farther into the epi-layer. Finally, the TX strength with variable threshold voltages is explored. The TX threshold voltage can be modified by the overlap distance of the p layer with the gate in Fig. 6. In the CIS process we used, the p layer is implanted before the polysilicon is deposited, so it determines the threshold voltage at the interface between the photodiode and the TX, similar to the channel doping used to adjust the threshold voltage in a typical CMOS process. The other implant layers of the PPD are formed after poly deposition for the self-aligned process.
VOLUME 7, 2019
Since the number of electrons generated is extremely small in a single pulse duration of less than tens of nanoseconds, the potential profile beneath the TX critically affects the transfer efficiency. We set three overlap distances d: 0.1 µm, 0, and −0.1 µm. A negative distance means that the p layer does not overlap the TX but is separated from it. Fig. 7 shows the simulation results for the three cases: the potential distribution in the cross-section and the 1-D profile of the maximum potential in the Z-direction. Interestingly, electrons cannot be transferred properly with overlap distances of 0.1 µm and −0.1 µm because of potential bumps at the interface. The negative overlap is even worse, creating bumps at both TXs. Special care must be taken in designing the threshold voltage adjustment layer to avoid any potential bump. Note that the specific overlap distance depends on the alignment margin and thermal process of the actual technology.
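The check for potential bumps described above can be sketched as a simple scan over a 1-D potential profile sampled along the transfer path. Because electrons move toward higher electrostatic potential, a profile that never falls below an earlier value presents no barrier, and any drop below the running maximum is a bump the electron has to overcome. The function names, the 1-mV tolerance, and the sample profiles are hypothetical; they are not the simulated data of Fig. 7.

```python
# Minimal sketch, assuming the profile lists the electrostatic potential (V)
# at successive points from the photodiode toward the floating diffusion.
# The sample profiles below are made up for illustration.

from typing import Sequence


def max_barrier_v(potential_v: Sequence[float]) -> float:
    """Largest drop below the running maximum of the potential along the path.

    Returns 0.0 for a monotonically non-decreasing profile (no bump).
    """
    if not potential_v:
        raise ValueError("profile must contain at least one sample")
    running_max = potential_v[0]
    worst_drop = 0.0
    for value in potential_v:
        running_max = max(running_max, value)
        worst_drop = max(worst_drop, running_max - value)
    return worst_drop


def has_bump(potential_v: Sequence[float], tolerance_v: float = 1e-3) -> bool:
    """True if the profile contains a barrier larger than the given tolerance."""
    return max_barrier_v(potential_v) > tolerance_v


if __name__ == "__main__":
    smooth = [0.5, 0.7, 0.9, 1.2, 2.0, 3.3]   # hypothetical bump-free profile
    bumpy = [0.5, 0.7, 0.9, 0.8, 2.0, 3.3]    # hypothetical bump at the TX edge
    print("smooth profile has bump:", has_bump(smooth))
    print("bumpy profile has bump: ", has_bump(bumpy))
    print("barrier height [V]:", max_barrier_v(bumpy))
```

A check of this kind could be run on each simulated overlap distance to flag profiles that need a layout adjustment; the acceptable tolerance would depend on the technology and operating temperature.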

III. EXPERIMENTAL RESULTS
Two types of prototype I-ToF sensors were fabricated in a 0.11-µm frontside-illumination (FSI) standard CIS process to verify the performance of various pixels. One of the prototype chips contains the conventional pixel array as a reference, and the other has the proposed pixel array with different TX threshold voltages. Fig. 8 shows a microphotograph of the chip that includes the 200 × 232 pixel array found to be most promising in SPECTRA simulation. A test pattern in the die includes other analog circuitry for testing the readout chain. A custom test system was also built with two printed circuit boards, one for the emitter and one for the sensor interface, which consists of two 12-b analog-to-digital converters, a complex programmable logic device (CPLD) as a digital controller, an oscillator, and regulators. Six laser diodes with an 855-nm wavelength produced by Lumentum Operations LLC are configured as the emitter with a current driving circuit, and optical diffusers are placed in front of them to spread the IR light. The f-number and the focal length of the lens, which is equipped with an IR pass filter, are 1.3 and 8 mm, respectively.
We first characterize the IR responsivity of the sensors with different epi-layer thicknesses. The thicknesses are 3, 6.5, and 13.5 µm, which are called thin, mid, and thick epi in the following sections. Fig. 9 shows the relative responsivity of both the conventional and proposed pixels under the same illumination and integration time. All responsivities are normalized to the value of the conventional pixel with the mid-epi thickness, which is 100% in Fig. 9. Since the fill factor of the proposed pixel is about 20% higher than that of the conventional one, its sensitivity is also about 20% higher irrespective of the epi thickness. The split structure in the proposed pixel makes it possible to incorporate a microlens array without any process modification for a large microlens, boosting the sensitivity further. As expected, the thick epi-layer shows the highest IR responsivity, although the responsivity is not linearly proportional to the thickness. This is attributed to the fact that the shallow junction and depletion region cannot collect all the electrons generated in the neutral region along the vertical direction.
To evaluate the performance of the I-ToF pixels with various parameters, we measure their C_demod as the modulation frequency changes. Assuming that the emitted light is modulated with a sinusoidal waveform, the C_demod is calculated by [8]

$$C_{\text{demod}} = \frac{\text{measured amplitude}}{\text{measured offset}} \tag{1}$$

where the amplitude and offset are obtained from A_0, A_90, A_180, and A_270, the intensity data of the four phases used for the demodulation. Since the proposed pixel has two storage nodes, two frames of in-phase (A_0 and A_180) and quadrature-phase (A_90 and A_270) data are required to calculate the phase shift and the C_demod. Fig. 10 shows the measurement results of the C_demod under four different conditions. Default design parameters are as follows.
• Pixel structure: The proposed pixel with lateral drift field
First, we compare the proposed pixel with the conventional one to verify the lateral drift field in the PPD. As shown in Fig. 10(a), the C_demod of the conventional pixel is drastically degraded at 25 MHz, which agrees well with the previous result [28], while that of the proposed pixel stays above 50% up to 50 MHz, which confirms that the lateral drift field is essential for operating at high frequency. Next, we check the performance of the proposed pixel with respect to the epi-layer thickness, as plotted in Fig. 10(b). The thinner the epi-layer, the higher the C_demod, especially above 50 MHz. Note that the n-dopant level for the thin epi wafer was intentionally different from the others. The average pinning potentials of the thin and mid epi wafers with low dose are approximately 0.7 and 0.5 V, respectively. The higher pinning potential in the thin epi reduces the strength of the lateral electric field, resulting in a lower C_demod than the mid epi at low frequencies. The thick epi has the worst performance, as expected. All the C_demod values decrease drastically at 75 MHz because of performance degradation in the charge transfer speed, the TX signal transmission, and the current driver for the emitter. The optical power of the emitter is reduced and sometimes drops to zero at 75 MHz. Additionally, since the current driver does not operate at 100 MHz, we extrapolate the C_demod, represented by the dotted line in Fig. 10(b), assuming no change in the slope. Based on these results, further optimization of the pixel and supporting circuitry can provide reliable depth imaging with a C_demod of more than 50%. The combination of the pinning potential with the epi thickness shows interesting results in Fig. 10(c). As we surmised, a high V_pin produces a slightly better C_demod only in the thick epi case. Finally, Fig. 10(d) plots the C_demod as the threshold voltage of the TX changes, which also agrees well with the simulation results. The high and low V_th devices suffer from the potential bumps at the interface between the TX and the PPD, which lowers their performance.
Fig. 11 plots the depth measured with the proposed pixel with the mid epi and low V_pin from 1 to 3 m after a simple zero-order calibration that compensates for the time delay between the emitter and the TX of the sensor. The modulation frequency and the integration time are 50 MHz and 18 ms, respectively. The maximum non-linearity error is approximately 3.7% at a distance of 2 m because of large distortion in the waveform of the emitter. Meanwhile, the depth resolution, defined as the standard deviation of 100 consecutive depth frames in a single pixel, is less than 12.5 mm from 1 to 3 m, as shown in Fig. 12. The thin epi has a worse depth resolution in spite of its higher C_demod, which is attributed to its lower sensitivity. Fig. 13 illustrates a depth image captured by the prototype sensor in an indoor environment. Several objects located at distances from 1 to 3 m are clearly recognized with the z-axis as well as a color map representation. Severe column fixed-pattern noise is observed because no special compensation circuits, which could be easily adopted, are included. A depth image obtained by averaging 100 frames successfully presents fine features of the Julien Plaster Top as a 3-D mesh in Fig. 14. The performance comparison with state-of-the-art I-ToF sensors is summarized in Table 1.
FIGURE 13. Captured depth image with several objects located from 1 to 3 m distance. The Z-axis represents the distance, while the X- and Y-axis show the spatial resolution.
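The four-phase quantities behind Eq. (1) and the depth resolution defined above can be sketched as follows. The sketch uses the common sinusoidal four-phase convention, in which the amplitude is half the magnitude of the (A_0 − A_180, A_90 − A_270) vector and the offset is the mean of the four samples; this convention, the sign of the phase, and all function names are assumptions and may differ from the exact expressions used for the measurements. The zero-order calibration is represented only as a constant offset subtracted from the depth.

```python
# Minimal sketch of four-phase demodulation under a sinusoidal assumption.
# Conventions and names are assumptions, not the authors' implementation.

import math
import statistics
from typing import Sequence

C_LIGHT = 299_792_458.0  # speed of light in m/s


def demodulate(a0: float, a90: float, a180: float, a270: float,
               f_mod_hz: float, depth_offset_m: float = 0.0) -> dict:
    """Return demodulation contrast, phase shift, and depth from four samples."""
    in_phase = a0 - a180
    quadrature = a90 - a270
    amplitude = math.hypot(in_phase, quadrature) / 2.0
    offset = (a0 + a90 + a180 + a270) / 4.0
    phase = math.atan2(quadrature, in_phase) % (2.0 * math.pi)
    # Round trip: phase = 2*pi*f_mod * (2*d/c), hence d = c*phase / (4*pi*f_mod).
    depth = C_LIGHT * phase / (4.0 * math.pi * f_mod_hz) - depth_offset_m
    return {"c_demod": amplitude / offset, "phase_rad": phase, "depth_m": depth}


def depth_resolution_m(depth_frames_m: Sequence[float]) -> float:
    """Sample standard deviation of repeated depth readings of one pixel."""
    return statistics.stdev(depth_frames_m)


if __name__ == "__main__":
    # Synthetic samples for a target at 1.5 m, 50 MHz, 50% contrast.
    f_mod = 50e6
    true_phase = 4.0 * math.pi * f_mod * 1.5 / C_LIGHT
    samples = [1000.0 + 500.0 * math.cos(true_phase - math.radians(deg))
               for deg in (0, 90, 180, 270)]
    print(demodulate(*samples, f_mod_hz=f_mod))
    print(depth_resolution_m([1.498, 1.503, 1.500, 1.495, 1.504]))
```

With a two-tap pixel such as the one described here, A_0 and A_180 would come from one frame and A_90 and A_270 from a second frame before the function is applied.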

IV. CONCLUSION
A PPD-based I-ToF pixel structure and design guides to improve the performance at high modulation frequency have been presented. It is essential to create a lateral drift field inside the PPD by modifying the shape of the implant layer and increasing the dopant level, both of which should be trimmed with TCAD simulation. The epi-layer thickness, the pinning potential, and the threshold voltage of the transfer gates should also be taken into account and chosen carefully. In this work, the best demodulation contrast at 75 MHz is obtained from the proposed pixel with thin epi, low V_pin, and mid V_th. Considering another important parameter, the sensitivity at the emitter wavelength, the pixel with mid epi operating at 50 MHz produces a better depth resolution. This study shows the potential of the standard PPD device to be incorporated into I-ToF pixels operating at even higher frequencies, toward 100 MHz.