Comparison of Two In-Pixel Source Follower Schemes for Deep Subelectron Noise CMOS Image Sensors

This paper compares two in-pixel source follower stage designs for low noise CMOS image sensors, both embedded on the same 5 mm by 5 mm chip fabricated in a 180 nm CIS process. The presented chip embeds two pixel variants, one based on a body-effect-canceled thin oxide PMOS and the other on a native thick oxide NMOS. Both variants share the same sense node, the same amplification circuit and the same 11-bit single slope analog-to-digital converter (SS-ADC). The imager characterization demonstrates a histogram peak noise of 0.34 e$^{-}_{\text{RMS}}$ with the PMOS SF pixel and 0.47 e$^{-}_{\text{RMS}}$ with the NMOS SF at maximum analog gain. This performance is obtained at room temperature and 119 frames per second. Both pixel variants demonstrate a full well capacity over 5600 electrons.


I. INTRODUCTION
The performance of CMOS image sensors based on pinned photodiodes (PPD), with their associated readout chains, has been continuously improving since their first development in the closing decades of the last century [1] and their introduction to the CMOS process [2]. Today, PPDs feature quantum efficiencies (QE) close to 90%, dark currents of less than a single electron per second per square micrometer, and fast readout capability. CMOS image sensor (CIS) processes today include back-side illumination, vertical stacking and micron-level pixel pitch. These improvements in miniaturization and integration found their way to the mainstream process thanks to the strong market demand initiated by the smartphone proliferation.
Noise is, today, one of the few performance metrics on which the fundamental limit of performance is not yet reached on mainstream CIS products. Whether for consumer applications or scientific imaging, noise is crucial for low light performance. Nevertheless, sub-electron noise CIS chips are hard to find on the market.
Recently, remarkably low noise pixels operating at room temperature have been presented [3], [4], [5], [6], [7], reaching noise levels below a single electron. These improvements have been followed by demonstrations of photoelectron counting capability with CMOS image sensors without any photo-electron multiplication process [4], [5]. These works combine correlated multiple sampling (CMS), analog gain, and pixel conversion gain (CG) enhancement through sense node (SN) capacitance reduction [4], [6], [7], [8], [9]. This SN capacitance reduction comes at the cost of either a low full well capacity, the need for high voltage operation, or increased design complexity and process refinement.
In this work, we address CIS read noise reduction from the angle of the in-pixel source follower (SF) stage design. We compare and discuss two deep sub-electron CIS array implementations where the full well capacity (FWC) is maintained higher than 5600 electrons and the pixel readout time is maintained under 35 μs. The proposed readout chains are based on two in-pixel SF designs, namely a body-effect-canceled thin oxide PMOS and a native NMOS SF. Additionally, they share the same conventional SN parasitic capacitance reduction optimization, the same amplification circuit and the same ADC. This comparison study aims at shedding light on an important parameter that analog readout chain designers can leverage in CIS readout chain design, namely the in-pixel SF transistor type.
Fig. 1 shows a conventional low noise CIS readout chain embedding a 4-transistor pixel together with its readout timing diagram. The pixel embeds a PPD integrating the photo-generated electrons and a transfer gate (TG) that transfers these integrated charges to the SN and isolates the latter from the PPD well capacitance. A reset gate (RG) sets the SN to a high voltage before each transfer. When the row select (RS) switch is closed, the SF transistor buffers the voltage level of the SN to the column to be processed by the rest of the readout chain. Conventional CIS embed an array of pixels and column-level readout circuits performing a rolling readout scheme. All the pixels of the same line are read out in parallel. The column-level circuitry embeds a correlated double sampling (CDS) scheme that takes a sample before and after the charge transfer from the PPD to the SN, an amplifier improving the signal-to-noise ratio (SNR) in low light conditions, and an analog-to-digital converter (ADC).
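The CDS step of the readout chain described above can be illustrated with a minimal numerical sketch. The reset voltage, the conversion gain value `Q_TO_VOLT` and the function name `cds_sample` are assumptions chosen for illustration and are not values from this work. The reset level of the SN changes from one readout to the next, but it appears in both samples and cancels in the difference.

```python
import random

Q_TO_VOLT = 200e-6  # assumed conversion gain in V per electron (illustrative)

def cds_sample(n_electrons, rng):
    """Return (reset_sample, signal_sample, cds_output) for one pixel readout."""
    # The reset level differs at every readout (offset plus reset noise).
    reset_level = 2.5 + rng.gauss(0.0, 1e-3)
    reset_sample = reset_level  # sampled before the charge transfer
    # Transferred electrons lower the sense node voltage.
    signal_sample = reset_level - n_electrons * Q_TO_VOLT  # sampled after
    return reset_sample, signal_sample, reset_sample - signal_sample

rng = random.Random(0)
for electrons in (0, 10, 100):
    _, _, out = cds_sample(electrons, rng)
    print(f"{electrons:4d} e-  ->  CDS output {out * 1e3:.3f} mV")
```

In this sketch the CDS output depends only on the number of transferred electrons, whatever the reset level drawn for that readout.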

II. IMPORTANCE OF THE SF STAGE FOR NOISE REDUCTION
The photo-electrons integrated in the PPD and transferred to the SN are converted to a voltage buffered by the in-pixel SF. The pixel conversion gain, CG, is the voltage difference that the SF creates at the column level for a single electron transferred to the SN. Increasing CG mitigates the impact of the noise generated by the column-level circuits, which is key in low light applications and for reaching deep sub-electron noise.
The pixel SN is an area in which every fraction of a fF counts. The different elements contributing to the SN parasitic capacitance are depicted in Fig. 2 and further detailed in [10]. The in-pixel SF is far from behaving as an ideal voltage follower. In other words, CG is not simply given by the inverse of the SN capacitance. Hence, a small signal analysis taking into account both the SN and SF parasitic capacitances is necessary to express CG. Using the small signal analysis detailed in [11], [12], the CG can be formulated as (1), where $C_e$ and $C_i$ are the SF extrinsic and intrinsic capacitance densities, $W$ and $L$ are the SF gate width and length, $C_{SN}$ is the total SN capacitance, including the junction capacitance, the overlap with the reset and transfer gates and the metal wire parasitic capacitances as illustrated in Fig. 2, and $n$ is the slope factor of the source follower transistor. In saturation, the value of $n$ ranges from 1.2 to 1.6 and slowly tends to 1 for high $V_G$ [13]. Equation (1) shows that CG depends on the SN capacitance, the SF parasitic capacitance and the SF body effect.
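As an illustration of how the SN capacitance, the SF parasitic capacitance and the slope factor enter the conversion gain, the sketch below uses a textbook first-order source follower model. This model is an assumption and may differ in detail from the expression derived in [11], [12]. It takes an SF voltage gain of $1/n$, a gate-to-drain capacitance $C_e W$ and a gate-to-source capacitance $C_e W + \frac{2}{3} C_i W L$ whose contribution is reduced by the bootstrapping of the source. The capacitance densities and the slope factor of 1.3 are placeholder values, not parameters extracted from the process used in this work.

```python
Q_E = 1.602176634e-19  # elementary charge in coulombs

def conversion_gain(c_sn, w, l, n, c_e, c_i):
    """First-order conversion gain in V per electron at the SF output.

    c_sn: sense node capacitance excluding the SF (F)
    w, l: SF gate width and length (m)
    n:    SF slope factor (n = 1 means no body effect)
    c_e:  extrinsic (overlap) capacitance per unit width (F/m), assumed
    c_i:  intrinsic gate capacitance per unit area (F/m^2), assumed
    """
    c_gd = c_e * w                               # gate-drain overlap
    c_gs = c_e * w + (2.0 / 3.0) * c_i * w * l   # gate-source, bootstrapped
    # With an SF gain of 1/n, C_gs is attenuated by (1 - 1/n) at the input.
    # Multiplying numerator and denominator by n gives the form below.
    return Q_E / (n * (c_sn + c_gd) + (n - 1.0) * c_gs)

C_SN = 0.6e-15  # F, value reported in the text
C_E = 0.45e-9   # F/m (0.45 fF/um), reused here as a placeholder
C_I = 5e-3      # F/m^2 (5 fF/um^2), placeholder

cg_with_body_effect = conversion_gain(C_SN, 0.4e-6, 0.6e-6, n=1.3, c_e=C_E, c_i=C_I)
cg_no_body_effect = conversion_gain(C_SN, 0.2e-6, 0.2e-6, n=1.0, c_e=C_E, c_i=C_I)
print(f"n = 1.3, 0.4 um x 0.6 um SF: {cg_with_body_effect * 1e6:.0f} uV/e-")
print(f"n = 1.0, 0.2 um x 0.2 um SF: {cg_no_body_effect * 1e6:.0f} uV/e-")
```

Under these assumptions, a smaller SF with a slope factor close to 1 gives a visibly higher conversion gain for the same $C_{SN}$.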
Using a small signal analysis of a readout chain composed of the pixel, an operational transconductance amplifier (OTA) and an ideal CDS, the input referred 1/f noise can be expressed as in [11], [12], where $K_F$ is the process and temperature dependent 1/f noise parameter, $C_{ox}$ is the oxide capacitance density and $\alpha_{1/f}$ is a design dependent parameter resulting from the CDS effect on the 1/f noise [14]. This expression is valid under the assumption that the pixel SF dominates the other 1/f noise sources. Indeed, the column-level amplifier can be designed with much larger transistors than the in-pixel SF, making its contribution to the 1/f noise negligible. The amplifier gain mitigates the noise contribution of the next stages. The input referred thermal noise can be expressed as in [11], [12], where $k$ is the Boltzmann constant, $T$ is the temperature, $A_{col}$ is the closed loop gain of the column-level capacitive OTA amplifier, $C_{col}$ is the equivalent capacitance at the SF output, which can be approximated by $C_L + \frac{C_{in}}{A_{col}+1}$, with $C_{in}$ and $C_L$ being the input and load capacitances of the column-level OTA amplifier, respectively, $\gamma_{SF}$ is the thermal noise excess factor of the SF stage and $G_{m,SF}$ its transconductance, while $\gamma_A$ and $G_{m,A}$ are those of the OTA amplifier. From (1), (2) and (3), the input-referred noise in a standard CIS readout chain can be reduced at column level (outside the pixel) thanks to amplification, bandwidth control and CDS. At the pixel level, both thermal and 1/f noise can be reduced by mitigating the SN parasitic capacitance $C_{SN}$ and optimizing the device choice and sizing of the in-pixel SF transistor. The latter has a direct impact on the $K_F$ noise factor and on the minimum achievable gate and parasitic capacitance contribution of the SF to the SN. Hence, the SF stage transistor choice is crucial in a low noise CIS readout chain design.

III. DEEP SUB-ELECTRON NOISE READOUT CHAIN DESIGN
In this work, we compare two pixel source follower variants paired with the same low noise readout chain, namely a thick oxide NMOS based SF and a thin oxide PMOS with its source connected to its bulk. The thin oxide transistor allows reaching smaller gate and parasitic capacitances than the thick oxide one. In addition, shorting the source to the bulk suppresses the body effect, which further boosts the conversion gain.

A. GLOBAL ARCHITECTURE
The global architecture of the imager is shown in Fig. 3. The proposed imager features a conventional rolling shutter architecture. It embeds two arrays of 4-transistor PPD based pixels. One array implements thick oxide native NMOS SFs while the second array uses thin oxide PMOS SFs. The second stage consists of parallel column-level switched capacitor variable gain amplifiers followed by parallel column-level single slope ADCs (SS-ADC). The CDS necessary for low noise performance in CIS is performed at the input of the ADC thanks to a switched capacitor scheme.

B. PIXEL SN OPTIMIZATION
As shown by (1), the CG can be enhanced by optimizing the $C_{SN}$ term on one side, and by optimizing the SF size, slope factor and parasitic capacitance on the other side. $C_{SN}$ is the sum of the parasitic capacitance of the metal wiring connected to the SN, the junction capacitance of the SN and the overlap capacitances between the SN and the transfer and reset gates. The last term dominates $C_{SN}$ due to the large transfer gate needed for an efficient transfer and the relatively high oxide capacitance density. For instance, in the 180 nm process used in this work, the overlap capacitance per unit width is about 0.45 fF/μm. This value is likely to be even higher for more advanced technology nodes. Hence, the first optimization focuses on the reduction of the overlap capacitance between the SN and the transfer and reset gates. A technique similar to lightly doped drains (LDD) [15] is proposed by the foundry and used to mitigate the overlap capacitance. Instead of being uniformly doped, the SN is doped with a gradually increasing concentration as shown in Fig. 4. The SN area overlapping with the transfer gate is doped with a concentration $n_1$ one order of magnitude lower than that of the SN area where the metal contact is placed, $n_2$. In this way, the overlap capacitance caused by the high oxide capacitance density is mitigated. In the same way, the doping concentration $n_3$ underneath the reset gate overlap with the SN area corresponds to the concentration used for the LDD area in standard NMOS transistors. The lower doping concentrations $n_1$ and $n_3$ reduce the overlap capacitances of the transfer and reset gates and hence reduce the total SN capacitance $C_{SN}$. By adopting this sense node optimization proposed by the foundry, the term $C_{SN}$ is reduced to 0.6 fF based on results extracted from measurements. Fig. 5 shows the schematic of the SN optimized NMOS SF pixel. The SN optimization has no impact on the operation scheme of the pixel.
On the other hand, the layout requires additional implants in order to implement the gradual doping. In this pixel variant, a thick oxide NMOS is used as a SF. It is optimally sized to the minimum gate width of 0.4 μm and an optimal length of 0.6 μm following the analysis detailed in [12], [16]. The device used in this pixel SF is a native transistor (slightly negative threshold) optimized by the foundry for linearity and voltage swing.

D. THIN OXIDE PMOS SF PIXEL
After reducing the $C_{SN}$ term, the contribution of the SF parasitic capacitance in (1) can no longer be neglected. Thus, the second layer of improvement consists in optimizing the SF. The optimal SF sizing for a low input-referred 1/f and thermal noise is close to minimum sizing [16]. Due to the foundry design rule constraints, the thick oxide NMOS SF size cannot be further reduced. Hence, a way around this limitation is to use the thin oxide transistors that are available in the same design kit. Thin oxide transistors are 1.8 V transistors featuring a higher oxide capacitance density than the thick oxide ones used by default in pixel design. Even though these transistors feature a higher oxide capacitance per unit area, they allow smaller gate widths and lengths, which consequently reduces the parasitic capacitance. PMOS transistors come with a separate n-well with a bulk connection. Connecting the bulk to the source also mitigates the body effect, which brings the slope factor $n$ in (1) close to 1 and leads to a higher CG. Fig. 6 shows the schematic of the pixel implementing a thin oxide PMOS SF. As for the previous optimization, this pixel scheme has no impact on the timing diagram, but it requires an additional voltage reference connection shifting the SF drain up to 1.5 V in order to accommodate the 1.8 V transistor in the 3.3 V environment. On the layout side, the introduction of a separate n-well for the PMOS SF comes with more challenges imposed by the design rule constraints. Indeed, a minimum spacing needs to be respected between the PPD well and the PMOS n-well. On the other hand, the SF gate width and length can be reduced to a value as low as 0.2 μm.

E. COLUMN-LEVEL AMPLIFICATION
The schematic of the column-level adjustable gain amplifier is shown in Fig. 7. The closed-loop gain is set by the ratio between the input and the feedback capacitors. Seven independent feedback capacitors are implemented, corresponding to gains of 1, 2, 4, 8, 16, 32 and 64. One additional gain for ultra-low light performance can also be selected by disconnecting all the feedback capacitors and relying only on the parasitic capacitance between the OTA input and output. To ensure precise closed-loop operation, particularly at larger gains, the OTA must provide a large open loop gain. At the same time, the OTA has to be extremely low-noise and operate with a reduced power budget for better system integration. In this regard, a single-ended cascoded OTA is used. Such a configuration intrinsically achieves very large open loop gains thanks to its large output resistance, with a negligible noise penalty from the cascode common-gate transistor. Moreover, it involves half the number of noisy transistors compared to a differential one. The OTA achieves an open loop gain of more than 90 dB at 12 μA DC current consumption. As far as the noise is concerned, to make the 1/f noise contribution of the column-level amplifier negligible compared to the one originating from the pixel, the transistors of the OTA have gate areas more than 10 times larger than that of the SF. Regarding the thermal noise, the common-source NMOS produces a transconductance more than 4 times larger than that of the biasing PMOS, meaning that the thermal noise of the latter is negligible with respect to the former.
Fig. 8 shows the schematic of the single-slope ADC, and the corresponding timing diagram is shown in Fig. 9. It is similar to the topology in [17]. The comparators are based on amplifiers with auto-zero offset cancellation [18]. One potential problem of this topology is the possible gate capacitance change of $M_0$ when its operating region changes between weak and strong inversion. On one hand, to maximize the transconductance and hence minimize the input-referred noise for a given bias current, $M_0$ needs to operate in weak inversion. On the other hand, when charge transfer occurs in the pixel and $V_{in}$ rises, $M_0$ may enter strong inversion if the $V_{in}$ change is sufficiently large. This gate capacitance change can cause signal-dependent nonlinearity. To overcome this problem, a capacitor $C_{boost}$ is added and the signal $V_{jump}$ is used to ensure that $M_0$ is always in strong inversion over the entire range of $V_{in}$ during pixel charge transfer.
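For the column-level amplifier of this section, the relation between the capacitor ratio, the open loop gain and the resulting closed-loop gain can be sketched as follows. The input capacitor value `C_IN`, the helper names and the omission of the OTA input parasitic capacitance are assumptions made for illustration. Only the gain settings, the measured high gain of 157 and the 90 dB open loop gain come from the text.

```python
def feedback_capacitance(c_in, gain):
    """Feedback capacitor giving an ideal closed-loop gain of c_in / c_fb."""
    return c_in / gain

def closed_loop_gain(c_in, c_fb, open_loop_gain_db):
    """Gain magnitude of a capacitive feedback amplifier with finite OTA gain.

    Ignores the OTA input parasitic capacitance (simplifying assumption).
    """
    a0 = 10.0 ** (open_loop_gain_db / 20.0)
    ideal = c_in / c_fb
    # Charge conservation at the virtual ground node with v_x = -v_out / a0.
    return ideal / (1.0 + (1.0 + ideal) / a0)

C_IN = 640e-15  # F, hypothetical input capacitor

for gain in (1, 2, 4, 8, 16, 32, 64, 157):
    c_fb = feedback_capacitance(C_IN, gain)
    actual = closed_loop_gain(C_IN, c_fb, open_loop_gain_db=90.0)
    error_pct = 100.0 * (gain - actual) / gain
    print(f"gain {gain:3d}: C_fb = {c_fb * 1e15:6.1f} fF, gain error = {error_pct:.3f} %")
```

With this simplified model, the relative gain error grows with the closed-loop gain, which is why a large open loop gain matters most for the highest settings.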

F. SSADC WITH INPUT CDS
One other potential problem faced with this topology is the charge injection after the auto-zero of the first comparator. Indeed, this charge injection shifts the voltage at the gate of $M_0$ in the direction opposite to that of the amplifier output after the charge transfer. As a result, a range of low input values is converted to 0. The $V_{jump}$ signal compensates the effect of this charge injection by introducing a positive offset. The CDS is performed at the input of the ADC. The first sample (reset sample) is taken after the auto-zero while the second is taken when $SW_1$ is opened. In this way, the CDS time is independent of the signal level as it is the case in other SS-ADC topologies such as [19].
The digital counting starts once the 'enable' signal is high, and the converted digital value is stored in latches once the comparator output becomes high. The stored values are transferred to shift-registers at the 'load' pulse, and the values are shifted out when 'shift' is high.
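The conversion sequence described above can be sketched behaviorally. The ramp range, the function names `single_slope_convert` and `convert_with_input_cds`, and the example voltages are assumptions. The sketch only illustrates a counter that runs while a ramp rises and whose value is latched when the comparator toggles, with the CDS difference formed in the analog domain before the conversion. It does not model the auto-zero, the charge injection or the $V_{jump}$ offset.

```python
def single_slope_convert(v_in, v_ramp_start=0.0, v_ramp_stop=1.0, n_bits=11):
    """Count clock cycles until a linear ramp crosses v_in, then latch."""
    n_codes = 2 ** n_bits
    lsb = (v_ramp_stop - v_ramp_start) / n_codes
    for count in range(n_codes):
        v_ramp = v_ramp_start + count * lsb
        if v_ramp >= v_in:   # comparator output goes high
            return count     # counter value stored in the latches
    return n_codes - 1       # input above the ramp range: clip to full scale

def convert_with_input_cds(v_reset, v_signal, **kwargs):
    """Form the CDS difference before the ADC, then convert once."""
    return single_slope_convert(v_reset - v_signal, **kwargs)

code = convert_with_input_cds(v_reset=1.5, v_signal=1.25)
print(f"CDS difference of 0.25 V converts to code {code}")
```

Inputs at or below the ramp start all map to code 0 in this model, which mirrors the range of low input values converted to 0 that the text mentions.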

IV. TEST AND CHARACTERIZATION

A. PHYSICAL IMPLEMENTATION
The presented image sensor is fabricated in a 180 nm CIS process with 4 metal layers. The chip (Fig. 10b) measures 5 mm by 5 mm. The imager is directly wire-bonded to the test PCB on which an optical objective is directly mounted on a fixed barrel as shown in Fig. 10a. In order to perform pixel characterization, the optical objective is replaced by a light source to expose the pixels to a controlled light intensity.

B. PHOTON TRANSFER CURVES AND CONVERSION GAIN MEASUREMENTS
In order to measure the conversion gain, the photon transfer curve (PTC) method [20] is used. This method exploits the proportionality between the shot noise variance and the average signal. Indeed, the pixel output variance plot as a function of the mean must feature a linear trend if the readout chain is shot noise limited. In that case, the slope of the linear trend corresponds to the conversion gain. This technique is used to prove the shot noise limited performance obtained with all the pixels presented in this work and at the same time gives the evaluation of each pixel conversion gain.
To obtain the PTC including the complete readout chain, the mean and variance are calculated from the ADC outputs over 200 measurements performed at 20 different light levels. In each measurement, every single pixel of the imager is read out, which allows characterizing the spatial variations of the conversion gain across the complete area of the imager. The PTC is recorded for all gain settings, namely gains 1 to 64 and the gain obtained using only the parasitic capacitance for the amplifier feedback, which is measured to be 157 for the PMOS array and 138 for the NMOS array. This gain is called high gain throughout the discussion. An LED connected to a current supply and uniformly illuminating the imager is used as the light source. In Fig. 11, the PTCs of one row of the imager with gain 2 are shown for the PMOS and NMOS pixel types, respectively. A spread of the PTC curves between pixels can be seen, but all curves follow the same linear trend, proving the shot noise limited performance for all gain levels. To extract the conversion gain in counts per electron, the slope of each PTC curve is extracted. In Fig. 12, the histograms of the extracted conversion gains of the PMOS and NMOS pixel types are shown for gain 2. For the PMOS, this corresponds to 18200 pixels and for the NMOS to 6500 pixels. The difference in pixel number is due to the different size of the two pixel types on the imager array. Table 1 presents a summary of the mean measured conversion gains for each pixel variant and gain configuration.
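The PTC slope extraction described in this section can be sketched on simulated data. Everything in this example is an assumption made for illustration: the true gain of 0.5 counts per electron, the read noise of 0.4 electrons, the light levels, the use of 200 frames at each light level and the helper names. The sketch draws Poisson distributed photo-electrons for one pixel, computes the mean and variance of the output at each light level, and fits the slope of variance versus mean.

```python
import math
import random

def poisson(lam, rng):
    """Draw one Poisson sample (Knuth's method, adequate for small means)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def simulate_pixel(mean_electrons, gain_dn_per_e, read_noise_e, n_frames, rng):
    """Simulated outputs (in counts) of one pixel over n_frames."""
    return [
        gain_dn_per_e * (poisson(mean_electrons, rng) + rng.gauss(0.0, read_noise_e))
        for _ in range(n_frames)
    ]

def ptc_slope(means, variances):
    """Least-squares slope of variance versus mean, in counts per electron."""
    m_bar = sum(means) / len(means)
    v_bar = sum(variances) / len(variances)
    num = sum((m - m_bar) * (v - v_bar) for m, v in zip(means, variances))
    den = sum((m - m_bar) ** 2 for m in means)
    return num / den

def estimate_conversion_gain(true_gain, light_levels, n_frames=200, seed=0):
    """Simulate a PTC for one pixel and return the fitted conversion gain."""
    rng = random.Random(seed)
    means, variances = [], []
    for level in light_levels:
        samples = simulate_pixel(level, true_gain, 0.4, n_frames, rng)
        mean = sum(samples) / n_frames
        var = sum((s - mean) ** 2 for s in samples) / (n_frames - 1)
        means.append(mean)
        variances.append(var)
    return ptc_slope(means, variances)

levels = [10 * i for i in range(1, 21)]  # 20 light levels, in electrons
estimate = estimate_conversion_gain(true_gain=0.5, light_levels=levels)
print(f"estimated conversion gain: {estimate:.3f} counts per electron")
```

The read noise only adds a constant offset to the variance, so it shifts the intercept of the fitted line and leaves the slope, and hence the extracted conversion gain, unchanged on average.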

C. INPUT REFERRED NOISE HISTOGRAMS
To evaluate the input-referred read noise, the transfer gate is turned off and the output noise, including the 11-bit single slope ADC, is measured in digital counts. The noise is calculated from 200 measurements for each gain setting (this measurement is not possible for gain 1, as the ADC quantization noise dominates in this configuration). The noise measured at the output of the chain is then referred to the input using the conversion gains obtained earlier. Taking advantage of the fact that the CG is recorded for all pixels of the imager, two approaches to refer the noise to the input are compared. In the first approach, the output noise is divided by the averaged conversion gain value, and in the second, the output noise of each pixel is divided by the conversion gain extracted from that same pixel. This operation is applied to 18200 pixels in the case of the PMOS and to 6500 in the case of the NMOS. The two evaluation methods are compared in Fig. 13 and Fig. 14 for the PMOS and NMOS, respectively. In (a), the noise measurements are referred to the averaged CG and in (b), the noise is referred to the actual conversion gain of each pixel. The histogram peaks are at almost identical values for the two methods: 0.327 e$^{-}_{\text{RMS}}$ (average CG) and 0.345 e$^{-}_{\text{RMS}}$ (per-pixel CG) for the PMOS, and 0.489 e$^{-}_{\text{RMS}}$ (average CG) and 0.468 e$^{-}_{\text{RMS}}$ (per-pixel CG) for the NMOS. However, a clearly larger, though still moderate, spread of the noise values is observed when the CG of each pixel is used, since the spread of the conversion gain over the imager then adds to the spread of the measured noise. This approach gives a more realistic picture of the noise distribution over the imager and also shows that the noise of the golden pixels is even lower. Table 1 shows the noise measured at the peak of the histograms for all gain configurations (except gain 1), referred to the input using the CG of each individual pixel.
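The two ways of referring the output noise to the input can be compared on a toy data set. The five output noise values and conversion gains below are hypothetical and chosen so that the effect is easy to see; the function name `refer_to_input` is also an assumption. All pixels are given the same output noise, so any spread in the per-pixel result comes from the spread in CG alone.

```python
import statistics

def refer_to_input(output_noise_dn, gains_dn_per_e):
    """Return (noise using the array-average CG, noise using per-pixel CG)."""
    mean_gain = statistics.mean(gains_dn_per_e)
    with_average_cg = [n / mean_gain for n in output_noise_dn]
    with_pixel_cg = [n / g for n, g in zip(output_noise_dn, gains_dn_per_e)]
    return with_average_cg, with_pixel_cg

# Hypothetical values for five pixels: identical output noise, spread in CG.
output_noise = [3.4, 3.4, 3.4, 3.4, 3.4]  # counts rms
gains = [9.0, 9.5, 10.0, 10.5, 11.0]      # counts per electron

avg_ref, pix_ref = refer_to_input(output_noise, gains)
print("average CG  :", [round(x, 3) for x in avg_ref])
print("per-pixel CG:", [round(x, 3) for x in pix_ref])
print("spread with average CG  :", statistics.pstdev(avg_ref))
print("spread with per-pixel CG:", statistics.pstdev(pix_ref))
```

In this toy case the average-CG method hides the pixel-to-pixel gain variation entirely, while the per-pixel method exposes it as a wider noise distribution.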

D. LOW LIGHT IMAGES
The imager is designed and optimized for ultra low light levels but can also be used in moderate light conditions using the lowest gains. To validate this, images are taken under different light conditions with an integration time of 38 ms. Due to the experimental nature of the setup and the instruments used to operate the imager, an image is recorded every 3 seconds. With an optimized setup, frame rates up to 119 fps can be reached. In Fig. 15, images taken at 3 different light levels are shown on the left for the PMOS pixel type, while a comparison between the PMOS and NMOS pixel types at similar light levels is presented on the right. The images are taken using gain 2 with an average of (a) 940 photo-generated electrons and (b) 581 (PMOS) and 656 (NMOS) photo-generated electrons, and using high gain with an average of (c), (d) 10 (PMOS) and 13 (NMOS) photo-generated electrons, (e) 0.8 photo-generated electrons and (f) 0.4 (PMOS) and 1 (NMOS) photo-generated electrons. The corresponding number of photo-generated electrons is calculated using the measured overall conversion gain of the readout chain reported in Table 1. Thanks to the very low noise and the high conversion gain of the high gain mode, features can still be distinguished at light levels as low as an average of 0.4 photo-generated electrons. The advantage in sensitivity of the PMOS pixel type over the NMOS pixel type can be clearly seen in the first two comparison images (Fig. 15b, Fig. 15d). This difference is no longer prominent at very low light levels, as shown in Fig. 15f.

E. SUMMARY AND DISCUSSION
The characterization presented above shows that both pixel variants of the presented imager achieve deep sub-electron noise levels (a minimum of 0.34 e$^{-}_{\text{RMS}}$ for the PMOS and 0.47 e$^{-}_{\text{RMS}}$ for the NMOS). This performance is measured directly at the digital output of the imager without any post processing or off-chip instrumentation. The imager performance is demonstrated with ultra low light images showing the efficiency of the high gain modes for this application. Indeed, the maximum gain of 157 allows capturing an image with an average of less than a single photo-electron per pixel. Table 1 summarizes the noise measurements (histogram peak) for both presented sub-imagers for all available gain settings. Fig. 16 shows a plot of the input referred noise as a function of the overall conversion gain for both variants of the imager. Here, a difference between the trends of the NMOS and PMOS pixel types can be observed. The NMOS pixel noise follows a linear dependence on the overall conversion gain (column-level amplification) up to gain 64. This suggests that thermal noise dominates down to 0.7 e$^{-}_{\text{RMS}}$. For the highest gain, the measured noise departs from the linear trend, which suggests that the low frequency noise originating from the NMOS SF dominates. Indeed, as suggested by (2) and (3), the input referred thermal noise is inversely proportional to the column-level gain, whereas the input referred 1/f noise is independent of the gain. In contrast, the noise of the PMOS SF based array decreases as $1/A_{col}$ up to the highest gain configuration. This suggests that the low frequency noise is still not dominating and that there is even more room for thermal noise reduction. Table 2 summarizes the performance of the two image sensor variants presented in this work and compares it to recent state of the art works reporting a read noise level below 0.5 e$^{-}_{\text{RMS}}$.
Both imager variants presented in this work offer the advantage of a fast and simple readout requiring no external instrumentation or multiple sampling. Moreover, the full well capacity of the proposed pixels remains suitable for wide dynamic range imaging. Hence, the proposed imager variants can cover a wide range of scientific imaging applications. This is not the case for most state of the art imagers presented in Table 2, where the sub 0.5 e$^{-}_{\text{RMS}}$ performance is reached at the cost of lower dynamic range and readout speed.
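The gain dependence of the noise discussed in this section can be sketched with a two-term model in which the thermal contribution scales as $1/A_{col}$, as stated in the text, and a gain-independent floor stands for the 1/f noise, the two adding in quadrature. The coefficients below are placeholders, not values fitted to the measurements of Table 1, and the functional form is a simplification rather than a restatement of (2) and (3).

```python
import math

def input_referred_noise(a_col, thermal_at_unity, flicker_floor):
    """Total input-referred noise (e- rms) versus column-level gain.

    thermal_at_unity: thermal contribution at a_col = 1 (e- rms), assumed
    flicker_floor:    gain-independent 1/f contribution (e- rms), assumed
    """
    thermal = thermal_at_unity / a_col
    return math.sqrt(thermal ** 2 + flicker_floor ** 2)

GAINS = (2, 4, 8, 16, 32, 64, 138)

for a_col in GAINS:
    high_floor = input_referred_noise(a_col, 40.0, 0.45)
    low_floor = input_referred_noise(a_col, 40.0, 0.05)
    print(f"A_col = {a_col:3d}: {high_floor:6.2f} e- (high 1/f floor), "
          f"{low_floor:6.2f} e- (low 1/f floor)")
```

With a high floor the curve flattens at the largest gains, while with a low floor it keeps following the $1/A_{col}$ trend, which is the qualitative difference reported between the NMOS and PMOS arrays.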

V. CONCLUSION
This work demonstrates deep sub-electron noise performance at room temperature, in a standard CIS process and with a full imager array, at a relatively short pixel/line readout time of 35 μs and an FWC over 5600 electrons. This is achieved thanks to optimal SN doping, optimal in-pixel SF sizing and a low noise readout chain composed of a low noise column-level amplifier and an SS-ADC embedding the CDS function at its input.
This noise reduction strategy is applied to two image sensor sub-arrays, one based on thin oxide PMOS in-pixel SFs and the other on thick oxide NMOS SFs. As expected from the analysis, the thin oxide PMOS pixel features a higher conversion gain thanks to its smaller sizing, which results in smaller intrinsic and extrinsic capacitances. The PMOS based pixel features a lower histogram peak input referred noise, down to 0.34 e$^{-}_{\text{RMS}}$. On the other hand, the pixel FWC is reduced due to the higher conversion gain and the lower voltage swing of the PMOS SF stage. This result suggests that further noise reduction can be obtained thanks to technology scaling with a more advanced node.
The measured dependence of the input-referred noise on the column-level amplification shows that the NMOS pixel reaches a limit, while the PMOS pixel noise can still be reduced by means of the column-level gain. This is most probably due to a higher low frequency noise of the NMOS pixel. It also suggests that a higher gain can help improve the reported results in future work.