Histogram-Less LiDAR Through SPAD Response Linearization

We present a new method to acquire 3D information from a Single-Photon Avalanche Diode (SPAD)-based direct Time-of-Flight (d-ToF) imaging system that does not require the construction of a histogram of timestamps and can withstand the high-flux operation regime. The proposed acquisition scheme emulates the behavior of a SPAD detector with no distortion due to dead time, and extracts the ToF information by a simple average operation on the photon timestamps, ensuring ease of integration in a dedicated sensor and scalability to large arrays. The method is validated through a comprehensive mathematical analysis, whose predictions are in agreement with a numerical Monte Carlo model of the problem. Finally, we show the validity of the predictions in a real d-ToF measurement setup, up to a distance of 3.8 m, under challenging background conditions well beyond the typical pile-up limit of 5% detection rate.


Introduction and related work
Spatial perception enabled by 3D imaging techniques is constantly gaining interest for industrial [1], automotive [2], space [3] and consumer applications. As an example, the level of self-awareness required of autonomous driving vehicles demands 3D imaging systems with high resolution and high frame rate. Unfortunately, these requirements are in conflict, constraining engineers to performance-limiting tradeoffs. In this paper we focus on Light Detection And Ranging (LiDAR) / direct Time-of-Flight (d-ToF) measurement based on Single-Photon Avalanche Diodes (SPADs), which is one of the most promising active techniques [4]. In a SPAD-based d-ToF measurement, the distance is extracted by measuring the travel time of a pulse of light projected from the source and reflected back by the target to a detector that consists of a SPAD operating as a photon-to-edge converter, coupled to a photon timestamping circuit (usually a Time-to-Digital Converter (TDC) or a Time-to-Amplitude Converter (TAC)) [5-7]. Due to hardware limitations, such as the detector dead-time, and to the statistical nature of photons, together with the presence of uncorrelated background light, a number of observations are usually accumulated into a histogram memory to enhance the signal-to-noise ratio and extract the target distance by means of signal processing techniques [8,9].
The increased interest in Advanced Driver Assistance Systems (ADAS), where 3D vision is a pillar, is focusing the attention of researchers on developing techniques to increase the robustness of such systems against the effect of background light and against possible interference from similar devices. Concerning the problem of background light, one of the most effective and widespread techniques is known as photon coincidence, which exploits the temporal proximity of photons belonging to the reflected laser pulse to filter out unwanted background photons, which are more likely to be temporally sparse [10,11]. With a different method, based on a smart accumulation technique by Yoshioka et al. [12], the signal-to-noise ratio (SNR) is increased by merging the information from pixels observing similar regions of the scene. A different approach has been recently proposed by Manuzzato et al. [13], where a per-pixel circuit automatically decreases the SPAD sensitivity, reducing the probability of saturation in case of high background intensities and favoring the detection of laser photons. Yet another technique is known as time-gating, where, by means of a search procedure, several sub-ranges of the scene are measured, increasing the SNR at the expense of an increased acquisition time [14]. Regarding the problem of mutual interference, Ximenes et al. [15] propose a spread-spectrum based technique, where the laser emission time is randomized from device to device, spreading any other interference below the level of the signal of interest. Another solution is based on the emission of two laser pulses whose temporal relationship is different from device to device and is used to actively discard unwanted interference [16,17].
These techniques, however, cannot cope with very high fluxes because of two fundamental problems. First, the histogram of timestamps appears distorted due to the dead-time of SPAD detectors and timestamping circuits, which translates into a non-linear response of the system to the incident flux of photons over time. It is a widely held belief that the upper photon flux limit that still results in a negligibly distorted histogram is given by 5% of detected photons per laser cycle [18]. Second, the amount of data generated by the sensor is too large to scale to a large number of pixels. Rapp et al. [19] overcome the former problem by linearizing the histogram through a Markov chain model of the photon detection times. By using this method, the system can cope with up to 5 photoelectrons per illumination cycle. Gyongy et al. [20] mitigate the second problem by upscaling a low-resolution depth image based on a high-resolution intensity image. A similar approach is used by Ruget et al. [21], where the native resolution of a depth image from a SPAD camera is increased by means of a deep neural network. Still, the bottleneck is represented by the necessity to build and transfer a histogram of timestamps for each pixel in the image from which the ToF is extracted.
One strategy to reduce the bandwidth requirement on the amount of data which is transferred from the chip to the controller (usually an FPGA or µC) is to integrate the histogram, or part of it, directly on chip. Several solutions are proposed in the literature to integrate histogramming capability on-chip [22-31], but despite the advantage in bandwidth performance compared to other solutions where the histogram is built off-chip [10,11,13-15], many limitations are still present. In general, the on-chip realization of either a partial or full histogram requires additional area, which can be obtained by either reducing the fill factor or by using expensive 3D-stacked solutions.
With the so-called partial approach [16,24,26,27,29-31], a reduced histogram memory is available on-chip, therefore requiring a search procedure to identify the location of the ensemble of histogram bins containing the laser peak. In the literature, two techniques have been described to implement a partial histogram behavior. With the so-called zooming technique [16,24,26,30], at the beginning of the measurement the reduced set of histogram bins is spread across the entire distance range. By counting the number of photons detected in each bin, the reduced set of histogram bins is concentrated over several iterations on a shorter range, thus achieving the desired resolution on the estimated target distance. With the other technique, called sliding, the subset of histogram bins is already set to the desired resolution, thus covering only a small portion of the range. Again by means of several iterations, the subset of histogram bins slides across the entire range, and the number of photons at each iteration is used to estimate the target distance. Despite the intrinsic differences between the two methods, for both of them, as outlined by Taneski et al. [32], a laser power penalty occurs as more laser shots are required to find the laser peak location.
A full histogram approach is possible with resource sharing, e.g., by reallocating the same histogram circuitry to several pixels, as described by Kumagai et al. [28]. However, resource sharing resulted in only ≈ 27% of the chip area dedicated to the SPAD array, requiring a very high clock frequency of 500 MHz and the design of a 3D stacked solution with a complex scanning illumination approach.
In this paper we propose for the first time a robust method, supported by a rigorous mathematical model, to extract the 3D information from a set of acquired timestamps without the need to build a histogram, which can also sustain high photon fluxes, enabling the possibility to operate beyond the standard limit of 5% detection rate [18] for pile-up distortion. The method can be implemented using only two registers and one accumulator for each pixel. With such a low amount of resources, the per-pixel memory requirement is reduced by more than 3 orders of magnitude compared to standard d-ToF architectures (off-chip histogram) [14,15], and by a factor of ≃ 5 compared to architectures with on-chip, full histogramming capability [33]. We reach the goal in two steps. First, we propose and evaluate an algorithm to efficiently extract the target distance from a set of timestamps based on a simple on-the-fly average operation, which does not require the allocation of a histogram memory. Then, since the proposed algorithm works on the assumption that the detector response is linear, we present two acquisition schemes that can be easily implemented on chip and emulate the behavior of a single photon detector with no dead-time, providing the desired linear response to the input flux of photons. The proposed method is supported by analytical and numerical (Monte Carlo) models and has been validated experimentally up to a distance of 3.8 meters (mainly limited by the sensor used for characterization [34]) under a background light equivalent to 85 kilolux and beyond the standard 5% rule for pile-up distortion [18].
The paper is organized as follows. In Section 2, a typical d-ToF acquisition system with the most important parameters of concern is described and preliminary considerations on a histogram-less acquisition approach are provided with a first validation by means of a Monte Carlo model. In Section 3, we provide the analytical proof of the proposed acquisition method, while in Section 4 we explain in detail two acquisition schemes which are needed to emulate the response of a linear detector, together with a comparison against state-of-the-art sensors in terms of memory requirement, scalability and tolerance to high background flux. In Section 5, we provide measurement results from an existing SPAD-based d-ToF sensor, showing that the proposed acquisition and extraction schemes are capable of successfully computing the ToF without the need for a histogram of timestamps. Finally, we discuss future perspectives to advance the results found in this work in Section 6.

Preliminary validation
In this section, we present the principle of operation of a SPAD-based d-ToF system with preliminary considerations and Monte Carlo simulations on the histogram-less approach which will be further developed in the paper.

Typical d-ToF operation
A typical d-ToF image acquisition requires a pulsed laser and a time-resolved, single photon image sensor with photon timestamping capabilities. It works by sending periodic laser pulses and then measuring the arrival time, or timestamp, of the first detected photons reflected by the target following each pulse. Due to space and bandwidth limitations, the number of photon timestamps generated per laser pulse is typically limited to one.
In principle, a single laser pulse, and thus a single timestamp, would be sufficient to estimate the time of flight. However, due to the presence of uncorrelated background events (from both external light sources and the internal SPAD Dark Count Rate (DCR)) and of shot noise, the first detected photon may not be from the laser pulse, so that several repetitions are needed to discriminate the different contributions. To do so, the timestamps measured during the acquisition process are collected in a histogram memory that records how many times each timestamp has been observed. This provides a convenient representation of the temporal distribution of the arrival times, as shown in the example of Figure 1. In a system capable of acquiring only one photon (the first), the distribution of the arrival times is a piece-wise exponential curve, where each segment is described by:

$$ f(t) = A\,e^{-\lambda t} \tag{1} $$

whose intensity (rate) $\lambda$ depends on the intensity of background light, dark counts and laser echo. Table 1 summarizes the most important parameters of the detection process. The ToF is typically estimated from the histogram by finding the location of its peak or of a sharp rising edge, which likely belongs to the reflected laser pulse. The histogram of timestamps contains all the relevant information to properly estimate the time of flight, and represents the gold standard processing technique in the field of SPAD-based d-ToF systems. Unfortunately, the histogram requires a considerable amount of resources in terms of memory, bandwidth and power, as it requires the readout of every timestamp from the sensor for processing by an external controller (FPGA or µC). Even with the latest implementations where the histogram is available on-chip, the required amount of resources is considerable. As an example, a 10-m-range 128 × 128 LiDAR system with a 100 ps time resolution and 8 bit histogram depth requires approximately 10 MB of memory.

Figure 2. In (a) and (b), the distributions of background-only events (with rate $\lambda_B$) and laser-only events (with rate $\lambda_L$), respectively, are shown. In (c), the distribution of the combination of background and laser events is reported, graphically showing that the superposition property does not hold due to the non-linear behavior of the detection process. In particular, the portion of background events after the laser peak is underestimated, as only one photon per acquisition can be detected. For each contribution, the amplitude terms of the exponential ($A_0$, $A_1$, $A_2$ and $A_3$) are reported, with $A_3 < A_0$ due to the SPAD non-linearity.
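The memory figure quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming one full histogram per pixel and a bin count given by the round-trip time divided by the time resolution; the function name and its parameters are illustrative and not taken from any sensor datasheet.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def histogram_memory_bytes(range_m, bin_s, bits_per_bin, width, height):
    """Memory needed to hold one full histogram for every pixel of the array."""
    round_trip_s = 2.0 * range_m / C             # light travels to the target and back
    bins = math.ceil(round_trip_s / bin_s)       # number of histogram bins per pixel
    return width * height * bins * bits_per_bin / 8

mem = histogram_memory_bytes(range_m=10.0, bin_s=100e-12, bits_per_bin=8,
                             width=128, height=128)
print(f"{mem / 1e6:.1f} MB")  # on the order of 10 MB
```

With these inputs the round trip lasts about 67 ns, i.e. several hundred bins per pixel, which is where the roughly 10 MB total comes from.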

Histogram-less approach
Intuitively, if no background events are present, and neglecting the width of the laser pulse, we could estimate the time of flight without the need to build a histogram. This can be achieved by simply calculating the average of the continuous stream of laser-only timestamps. To extend the above method to scenarios where background events are also present, we need to eliminate their contribution to the average. Again, intuitively, this can be accomplished by dividing the measurement into two acquisitions. The first is performed with the laser turned off, and is used to estimate the contribution of the background light only, by computing the average $\bar{t}_B$ of the recorded timestamps. The same operation is repeated in the second acquisition with the joint contribution of background and laser timestamps, resulting in a total average time $\bar{t}_{TOT}$. In principle, the time of flight should be proportional to the difference between the two averages, with the contribution of the background canceling out:

$$ \mathrm{ToF} \propto \bar{t}_{TOT} - \bar{t}_B \tag{2} $$

This approach, however, relies on the superposition property, which does not hold, as SPADs are non-linear detectors. More specifically, the problem lies in the dead time of the detection process, which depends on the SPAD dead-time itself, on the limited bandwidth of the timestamping circuit and also on the limited memory available, blinding the measurement channel for some time after each detection. With this limitation, the detection of a photon belonging to the laser echo prevents a later photon from being detected, resulting in a distortion of the statistics. In particular, the amount of background photons which contribute to $\bar{t}_{TOT}$ in the second acquisition (with the laser turned on) is underestimated. This behavior can be observed in the histograms of Figure 2, where the distributions of timestamps in different scenarios are compared, emphasizing the estimation error. Furthermore, the greater the laser echo intensity, the higher the number of background photons that are underestimated. This can also be seen analytically by computing explicitly the cumulative distribution function $F_T$ of the random variable $T$ associated to the first photon detection time, defined as $F_T(t) := P(T \in [0, t])$, for a laser echo of constant rate $\lambda_L$ and duration $T_L$. If $t \in [0, \mathrm{ToF}]$ then the incoming photon necessarily belongs to the background, hence

$$ F_T(t) = 1 - e^{-\lambda_B t}. $$

If $t \in (\mathrm{ToF}, \mathrm{ToF} + T_L]$, both background and laser photons can be detected, so that

$$ F_T(t) = 1 - e^{-\lambda_B t - \lambda_L (t - \mathrm{ToF})}. $$

Finally, for $t > \mathrm{ToF} + T_L$ we have:

$$ F_T(t) = 1 - e^{-\lambda_B t - \lambda_L T_L}. $$

The probability density $f_T$ of the distribution of $T$, where $F_T(t) = \int_{-\infty}^{t} f_T(u)\,du$, can be easily computed by means of the formula $f_T(t) = F_T'(t)$, yielding:

$$ f_T(t) = \begin{cases} \lambda_B\, e^{-\lambda_B t}, & 0 \le t \le \mathrm{ToF} \\ (\lambda_B + \lambda_L)\, e^{-\lambda_B t - \lambda_L (t - \mathrm{ToF})}, & \mathrm{ToF} < t \le \mathrm{ToF} + T_L \\ \lambda_B\, e^{-\lambda_B t - \lambda_L T_L}, & t > \mathrm{ToF} + T_L \end{cases} $$

while the average first arrival time is given by

$$ E[T] = \frac{1 - e^{-\lambda_B \mathrm{ToF}}}{\lambda_B} + e^{-\lambda_B \mathrm{ToF}}\, \frac{1 - e^{-(\lambda_B + \lambda_L) T_L}}{\lambda_B + \lambda_L} + \frac{e^{-\lambda_B (\mathrm{ToF} + T_L) - \lambda_L T_L}}{\lambda_B} \tag{3} $$

As shown in Equation (3), the average detection time depends non-linearly on two parameters ($\lambda_L$ and ToF). This confirms that it is not possible to uniquely extract the ToF from the aforementioned acquisition method, since the average of the timestamps acquired in the second acquisition, $\bar{t}_{TOT}$, which is governed by the probability of detection of laser photons, also depends on the $\lambda_L$ parameter, which depends not only on the ToF, but also on the target reflectivity. One could try to compensate for the error; however, this also requires measuring the intensity of the received laser light (which affects the error), introducing an extra variable which is hard to estimate and invalidating the procedure. Conversely, a linear detector, with no dead-time, is able to timestamp every photon which falls within the acquisition window. In this case, the histograms shown in Figure 2 become linear in time, as shown in Figure 3.
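The dependence of the first-photon average on the laser intensity can be illustrated numerically. The sketch below is not the Monte Carlo simulator used in this work: it assumes a rectangular laser echo of width `T_L` = 1 ns (a hypothetical value), a homogeneous background, and a detector that keeps only the first photon of each cycle. All function and variable names are illustrative.

```python
import numpy as np

def first_photon_times(rng, n_cycles, lam_b, lam_l, tof, t_l, t_w):
    """Timestamps of a detector that records only the first photon per cycle."""
    # First background photon: exponential waiting time with rate lam_b.
    t_bg = rng.exponential(1.0 / lam_b, n_cycles)
    # First laser photon: present only if it falls inside the echo [tof, tof + t_l].
    if lam_l > 0.0:
        wait = rng.exponential(1.0 / lam_l, n_cycles)
        t_laser = np.where(wait < t_l, tof + wait, np.inf)
    else:
        t_laser = np.full(n_cycles, np.inf)
    first = np.minimum(t_bg, t_laser)
    return first[first < t_w]  # cycles with no photon in the window give no timestamp

rng = np.random.default_rng(0)
T_W, TOF, T_L, LAM_B = 100e-9, 25e-9, 1e-9, 1e7  # T_L is an assumed pulse width
N_CYCLES = 200_000

t_b = first_photon_times(rng, N_CYCLES, LAM_B, 0.0, TOF, T_L, T_W).mean()
diff = {}
for lam_l in (1e8, 1e9):
    t_tot = first_photon_times(rng, N_CYCLES, LAM_B, lam_l, TOF, T_L, T_W).mean()
    diff[lam_l] = t_tot - t_b
    print(f"lam_L = {lam_l:.0e} ph/s: t_TOT - t_B = {diff[lam_l] * 1e9:+.2f} ns")
```

Although the ToF is the same in both runs, the difference of the two averages changes by several nanoseconds with the echo intensity, which is the ambiguity described in the text.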
Figure 3. Distribution of timestamps obtained with a linear detection process, i.e., with no dead-time limitation. The distributions are uniform, since we are now considering the absolute arrival time of detected photons with respect to the beginning of the acquisition window. With this approach, it is possible to safely subtract the background contribution in (a) from the combined measurement in (c). As the superposition property holds in this case, there is no longer an under-weighting condition of background counts after the laser pulse peak in the histogram (c).

One can then extract the ToF with the proposed two-step procedure. In the first step, we measure the total number of background events, $N_B$, and their average absolute arrival time, $\bar{t}_B$. In the second step, with the combination of both background and laser events, we measure the total number of events and their average absolute arrival time, denoted $N_{TOT}$ and $\bar{t}_{TOT}$. Because the superposition property holds, the difference $N_{TOT} - N_B$ is equal to the amount of photons from the reflected laser source. We can then extract the ToF by properly weighting each average timestamp measurement with the relative photon count contribution:

$$ \mathrm{ToF} = \frac{N_{TOT}\,\bar{t}_{TOT} - N_B\,\bar{t}_B}{N_{TOT} - N_B} - \bar{t}_L \tag{4} $$

where $\bar{t}_L$ is the average arrival time of the laser photons referred to the laser emission time in the absence of background light (i.e., $\lambda_B = 0$ and ToF = 0). The value $\bar{t}_L$ is a characteristic parameter of the laser source, which can be experimentally estimated by means of an initial calibration. The proposed extraction method does not require the allocation of histogram memory, and needs only two counters to store $N_B$ and $N_{TOT}$, and two accumulators to compute $\bar{t}_B$ and $\bar{t}_{TOT}$, reducing the memory requirements by more than three orders of magnitude compared to recent long-range high-resolution d-ToF sensors [14,15]. We can further reduce the amount of resources down to a single accumulator, needed to store $\bar{t}_{TOT}$, and two counters for $N_B$ and $N_{TOT}$, because a constant background throughout the acquisition window leads to an average background time $\bar{t}_B$ of $T_W/2$. In this case, Equation (4) turns into the simpler form:

$$ \mathrm{ToF} = \frac{N_{TOT}\,\bar{t}_{TOT} - N_B\,\frac{T_W}{2}}{N_{TOT} - N_B} - \bar{t}_L \tag{5} $$

which yields:

$$ \mathrm{ToF} = \frac{\bar{t}_{TOT} - \frac{N_B}{N_{TOT}}\,\frac{T_W}{2}}{1 - \frac{N_B}{N_{TOT}}} - \bar{t}_L \tag{6} $$

We have simulated this extraction method with a Monte Carlo simulator [35] by sweeping the parameters $\lambda_B$ and $\lambda_L$ in the range $[10^6 - 10^8]$ and $[10^5 - 10^9]$, respectively. For each pair of $\lambda_B$ and $\lambda_L$ values, $10^4$ measurements have been acquired with the ToF value set to 25 ns. The resulting ToF, obtained from Equation (4), is shown in Figure 4 with the correct estimation over a wide range of $\lambda_B$, $\lambda_L$ pairs, failing only when the $\lambda_L / \lambda_B$ ratio is too low even for a classic histogram-based approach.
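To make the two-counter, one-accumulator extraction concrete, the sketch below simulates an ideal detector with no dead time and applies the simplified estimator in which the background average is replaced by half the acquisition window. It is only an illustration under stated assumptions: a rectangular echo of assumed width 1 ns (so the calibration constant is half that width), a single pair of flux values, and illustrative names throughout. It is not the simulator of [35].

```python
import numpy as np

def linear_timestamps(rng, n_windows, lam_b, lam_l, tof, t_l, t_w):
    """All timestamps from an ideal detector with no dead time (n_windows repetitions)."""
    n_bg = rng.poisson(lam_b * t_w * n_windows)
    n_las = rng.poisson(lam_l * t_l * n_windows)
    background = rng.uniform(0.0, t_w, n_bg)       # uniform over the window
    laser = tof + rng.uniform(0.0, t_l, n_las)     # rectangular echo (assumption)
    return np.concatenate([background, laser])

def estimate_tof(n_b, n_tot, t_sum_tot, t_w, t_l_mean):
    """ToF from two counters and one accumulator, with the background mean set to T_W / 2."""
    return (t_sum_tot - n_b * t_w / 2.0) / (n_tot - n_b) - t_l_mean

rng = np.random.default_rng(1)
T_W, TOF, T_L = 100e-9, 25e-9, 1e-9
LAM_B, LAM_L, N_WINDOWS = 1e6, 1e9, 10_000

# Step 1 (laser off): only the background counter is needed.
n_b = linear_timestamps(rng, N_WINDOWS, LAM_B, 0.0, TOF, T_L, T_W).size
# Step 2 (laser on): total counter and running sum of timestamps.
stamps = linear_timestamps(rng, N_WINDOWS, LAM_B, LAM_L, TOF, T_L, T_W)
tof_hat = estimate_tof(n_b, stamps.size, stamps.sum(), T_W, T_L / 2.0)
print(f"estimated ToF = {tof_hat * 1e9:.2f} ns")
```

The accumulator holds the sum of the timestamps, which equals the total count times the total average, so no division is needed until the final extraction step.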
In the next section, we provide a rigorous mathematical analysis which proves the results briefly introduced with Equation (6) from the underlying statistical distribution of photons.

Mathematical analysis
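This section proves the validity of the method described above analytically. In the following, we shall denote the duration of the acquisition window by $T_W$, assume the laser echo to entirely occur within it, i.e., $T_W \gg \mathrm{ToF} + T_L$, and denote the time-dependent intensity of the laser pulse by the function $\lambda_L : [0, T_W] \to \mathbb{R}$. The flux of photons can be modeled by a counting process $(N_t)_{t \in [0, T_W]}$ obtained as the sum of two independent Poisson processes: $(N^B_t)_{t \in [0, T_W]}$, with intensity $\lambda_B$, describing the background flux of photons, and $(N^L_t)_{t \in [\mathrm{ToF}, \mathrm{ToF} + T_L]}$, describing the signal. In particular, the process $(N^L_t)_{t \in [\mathrm{ToF}, \mathrm{ToF} + T_L]}$ is modeled through an inhomogeneous Poisson process with intensity $\lambda_L(t)$, so that $(N_t)_{t \in [0, T_W]}$ is itself an inhomogeneous Poisson process with intensity $\lambda(t) := \lambda_B + \lambda_L(t)$. Conditionally on the total number $n$ of photons arriving in $[0, T_W]$, their arrival times are distributed according to the probability density

$$ f(t) = \frac{\lambda_B + \lambda_L(t)}{\lambda_B T_W + \int_0^{T_W} \lambda_L(u)\,du}, \qquad t \in [0, T_W]. \tag{7} $$

Indeed, by considering a partition $\{I_j\}_{j=1,\dots,m}$ of the interval $I = [0, T_W]$, and denoting by $N(J)$ the number of photons arriving in an interval $J$, the independence of the increments of the inhomogeneous Poisson process $(N_t)_{t \in [0, T_W]}$ yields

$$ P\big(N(I_1) = n_1, \dots, N(I_m) = n_m \,\big|\, N(I) = n\big) = \frac{\prod_{j=1}^{m} e^{-\Lambda_{I_j}}\, \Lambda_{I_j}^{n_j} / n_j!}{e^{-\Lambda_I}\, \Lambda_I^{n} / n!} \tag{8} $$

where, for an interval $J = [a, b] \subset \mathbb{R}$, we adopted the notation $\Lambda_J := \int_J \lambda(u)\,du$ and we assumed $\sum_{j=1}^{m} n_j = n$. By a straightforward computation, the right-hand side of (8) can be written as

$$ \frac{n!}{n_1! \cdots n_m!} \prod_{j=1}^{m} \left( \int_{I_j} f(t)\,dt \right)^{n_j} $$

the latter being equivalently obtained in terms of $n$ independent and identically distributed continuous random variables $T_1, \dots, T_n$ with density $f$ given by (7). In other words, the arrival times $\{t_i\}_{i=1,\dots,n}$ of the $n$ photons provide a statistical sample for the distribution (7). The distribution (7) has a mean $\mu$ given by

$$ \mu = \int_0^{T_W} t\, f(t)\,dt = \frac{\lambda_B \frac{T_W^2}{2} + (\mathrm{ToF} + \bar{t}_L)\,\Lambda_L}{\lambda_B T_W + \Lambda_L}, \qquad \Lambda_L := \int_0^{T_W} \lambda_L(u)\,du. \tag{9} $$

Clearly, since $\mu$ is a linear function of ToF, it can easily be inverted. By denoting with $\alpha$ the ratio

$$ \alpha := \frac{\lambda_B T_W}{\lambda_B T_W + \Lambda_L} \tag{10} $$

we get an equation providing the time of flight as a function of the other characteristic parameters of the process:

$$ \mathrm{ToF} = \frac{\mu - \alpha\,\frac{T_W}{2}}{1 - \alpha} - \bar{t}_L \tag{11} $$

where

$$ \bar{t}_L := \frac{1}{\Lambda_L} \int_0^{T_W} (t - \mathrm{ToF})\, \lambda_L(t)\,dt $$

is the average arrival time of the laser photons referred to the laser emission time in the absence of background light (i.e., $\lambda_B = 0$ and ToF = 0).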
If $n$ photons are detected within the interval $[0, T_W]$, the sample mean $\bar{t}$ of their detection times provides an unbiased estimator for $\mu$. The main issue with this approach is the estimation of the parameter $\alpha$, which depends on $\lambda_B$ and $\lambda_L$, the latter being affected by a high level of uncertainty since it is related to the intensity of the laser echo. Because the total number of photons detected in the time interval $[0, T_W]$ is a Poisson random variable with average $\Lambda_I$ and, analogously, the number of background photons detected in the time interval $[0, T_W]$ is a Poisson random variable with average $\lambda_B T_W$, we can estimate both parameters by observing a realization of both processes. More precisely, let us first switch off the laser source and collect the number $N_B$ of photons arriving during the interval $[0, T_W]$, then let us switch on the laser source and collect the total number $N_{TOT}$ of photons arriving during the interval $[0, T_W]$. The observed values $N_B$ and $N_{TOT}$ are respectively an estimate for $\lambda_B T_W$ and $\lambda_B T_W + \int_0^{T_W} \lambda_L(u)\,du$, while their ratio is an estimate $\hat{\alpha} := N_B / N_{TOT}$ for the parameter $\alpha$ defined in (10). By replacing the parameters $\mu$ and $\alpha$ with their estimates $\bar{t}$ and $\hat{\alpha}$ in formula (11), where $\bar{t}$ is the average $\bar{t}_{TOT}$ measured with the laser switched on, we obtain the following estimator for ToF:

$$ \widehat{\mathrm{ToF}} = \frac{\bar{t} - \frac{N_B}{N_{TOT}}\,\frac{T_W}{2}}{1 - \frac{N_B}{N_{TOT}}} - \bar{t}_L \tag{12} $$

which coincides with (6).
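The linear dependence of the mean arrival time on the ToF does not rely on a particular pulse shape. As a numerical illustration, the sketch below uses a hypothetical triangular echo (2 ns wide, peak rate 10^9 ph/s, both assumed values), computes the mean of the normalized arrival-time density by direct integration, and then inverts it using the background fraction and a calibration constant obtained with the echo placed at ToF = 0. Names and grid size are illustrative.

```python
import numpy as np

T_W, TOF, LAM_B = 100e-9, 25e-9, 1e7
t = np.linspace(0.0, T_W, 200_001)

def integrate(y):
    """Trapezoidal rule on the fixed grid t."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

def echo(t_rel, width=2e-9, peak=1e9):
    """Hypothetical triangular laser echo starting at t_rel = 0."""
    x = t_rel / width
    return np.where((x >= 0.0) & (x <= 1.0), peak * (1.0 - np.abs(2.0 * x - 1.0)), 0.0)

# Calibration with ToF = 0 and no background: mean laser arrival time.
t_l_mean = integrate(t * echo(t)) / integrate(echo(t))

lam = LAM_B + echo(t - TOF)             # total intensity over the window
density = lam / integrate(lam)          # normalized arrival-time density
mu = integrate(t * density)             # mean arrival time
alpha = LAM_B * T_W / integrate(lam)    # background fraction of the photons

tof_rec = (mu - alpha * T_W / 2.0) / (1.0 - alpha) - t_l_mean
print(f"recovered ToF = {tof_rec * 1e9:.4f} ns")
```

In a measurement, `mu` and `alpha` are replaced by the sample mean of the timestamps and by the ratio of the two photon counts, as described above.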

Acquisition schemes
The simulation results obtained in Section 2.2 are based on the assumption that the photon detection process is ideal, i.e., with no dead-time and with a linear response over the incoming flux of photons. In a real-world scenario, however, detectors are limited by the dead-time between subsequent detections, resulting in a non-linear response. To implement the proposed extraction method, we propose a novel SPAD acquisition scheme which emulates the behavior of a linear detector. More in detail, we propose two ways to obtain a linearized SPAD response from a real SPAD. Both methods are based on the assumption that the underlying statistical processes are stationary and ergodic. In particular, we assume that there are no major fluctuations of the characteristic parameters of the process during the acquisition time. Similarly to an equivalent-time sampling oscilloscope, both methods rely on repeating the observation multiple times to emulate the response of a SPAD detector with no dead time. In Sections 4.1 and 4.2, we describe the working principle of each method and propose a possible implementation. Then, in Section 4.3, we provide the mathematical proof that both acquisition methods are capable of correctly sampling the distribution of photon arrival times.

Acquisition scheme #1: Acquire or discard
The first acquisition scheme relies on a simple (albeit inefficient) mechanism which requires no additional resources in terms of the SPAD driving circuit. The acquisition works over multiple runs, each requiring multiple observations. The first timestamp of every run is considered valid, memorized, and used to increment either $N_B$ or $N_{TOT}$, depending on the current phase of the acquisition, and update $\bar{t}_{TOT}$. Then, in the next observations, timestamps are considered valid, and used to update the algorithm parameters, only if they are larger than the largest previous timestamp; otherwise they are discarded. This procedure is repeated until the end of the acquisition window $T_W$ is reached (no photon detected), thus covering a complete acquisition, and concluding the run. The process is then repeated multiple times to increase statistics. Figure 5 shows an example of a run, including all the discarded events, and a possible implementation. While the implementation is straightforward, the method is inefficient because the majority of the detected photons may end up being discarded.
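The run described above can be sketched in software as follows. This is a behavioral model only, not the circuit of Figure 5: it assumes a homogeneous photon flux (background only), a first-photon detector, and illustrative names and parameter values.

```python
import numpy as np

def acquire_or_discard_run(rng, flux, t_w):
    """One run: keep a first-photon timestamp only if it exceeds the largest kept so far."""
    kept, observations, largest = [], 0, 0.0
    while True:
        observations += 1
        t = rng.exponential(1.0 / flux)   # first photon of a fresh observation
        if t >= t_w:                      # no photon in the window: the run is complete
            return kept, observations
        if t > largest:                   # valid: later than every kept timestamp
            kept.append(t)
            largest = t
        # otherwise the timestamp is discarded

rng = np.random.default_rng(2)
FLUX, T_W, N_RUNS = 2e7, 100e-9, 5_000
runs = [acquire_or_discard_run(rng, FLUX, T_W) for _ in range(N_RUNS)]
kept_all = np.concatenate([np.asarray(k) for k, _ in runs])
mean_kept = kept_all.size / N_RUNS
mean_obs = sum(n for _, n in runs) / N_RUNS
print(f"kept per run: {mean_kept:.2f}, observations per run: {mean_obs:.2f}")
print(f"mean kept timestamp: {kept_all.mean() * 1e9:.1f} ns")
```

With a flux of 2·10^7 ph/s and a 100 ns window, the kept timestamps average about two per run and are spread uniformly over the window, as expected from a linear detector, while several observations per run are spent on discarded events.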

Acquisition scheme #2: Time-gated
The time-gated acquisition scheme works by delaying the activation of the SPAD to start from the previously acquired timestamp, until the end of the acquisition window is reached. With this approach, there is no need to discard timestamps, allowing for a faster acquisition. This, however, comes at the expense of a more complex hardware implementation, which needs a time-activated gating scheme, for instance using a programmable delay line. An example of acquisitions is shown in Figure 6, together with a possible implementation.
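A behavioral sketch of the time-gated run is given below. It assumes a homogeneous flux, so that the first photon after the gate opens is an exponential waiting time added to the gate position; the delay line and gating circuit are not modeled, and all names and values are illustrative.

```python
import numpy as np

def time_gated_run(rng, flux, t_w):
    """One run: each observation starts where the previous timestamp was acquired."""
    kept, observations, gate = [], 0, 0.0
    while True:
        observations += 1
        t = gate + rng.exponential(1.0 / flux)  # first photon after the gate opens
        if t >= t_w:                            # window exhausted: the run is complete
            return kept, observations
        kept.append(t)                          # every detected photon is used
        gate = t                                # next observation is gated from here

rng = np.random.default_rng(3)
FLUX, T_W, N_RUNS = 2e7, 100e-9, 5_000
runs = [time_gated_run(rng, FLUX, T_W) for _ in range(N_RUNS)]
kept_all = np.concatenate([np.asarray(k) for k, _ in runs])
mean_kept = kept_all.size / N_RUNS
mean_obs = sum(n for _, n in runs) / N_RUNS
print(f"kept per run: {mean_kept:.2f}, observations per run: {mean_obs:.2f}")
```

Here the number of observations per run is always the number of kept timestamps plus one, since only the last, empty observation produces no timestamp.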

Mathematical description
Figure 6. Time-gated acquisition scheme. With this approach, no photon timestamp is discarded thanks to the delayed activation of the SPAD for each timing measurement. During the charge delay phase, the SPAD front-end is forced OFF, thus photons cannot be detected. For each measurement, the first timestamp is detected and used to increment either $N_B$ or $N_{TOT}$ and update the average time $\bar{t}_{TOT}$. At the same time, the charge delay phase value is updated accordingly for the next measurement. As opposed to the acquire or discard method, more hardware resources are needed to build the delay element which controls the activation of the SPAD.

While providing the same result, it is clear that the implementation cost and the performance of the two acquisition schemes are different. With the acquire or discard scheme, almost no hardware modification is required to an already existing SPAD sensor. However, because of the decimation process, the efficiency of the acquisition could be very low. This also depends on the intensity of the incoming flux of photons: the higher the intensity, the higher the probability to have smaller timestamps which block the detection process. On the other hand, the time-gated scheme requires a delay line and the SPAD gating, but the efficiency is much higher since no decimation process occurs. To show the difference in terms of efficiency of the two proposed acquisition schemes, we ran a Monte Carlo simulation with a background light flux of up to $10^8$ photons/s and an acquisition window $T_W$ of 100 ns. As shown in Figure 7, at the highest flux of $10^8$ photons/s, the amount of timestamps to be acquired to cover the acquisition window $T_W$ for the acquire or discard scheme is more than 3 orders of magnitude higher compared to the more efficient time-gated scheme. From the simulation, we can also identify the maximum number of measurements which can be executed by the two acquisition schemes to sustain an operation frame rate of 30 Frames Per Second (FPS). The time-gated scheme can average the linearized SPAD response up to $N = 3 \cdot 10^4$ times over the whole range of background light flux. On the other hand, with the same number $N$ of acquisitions, the acquire or discard scheme can only support up to ≈ $2.4 \cdot 10^7$ ph/s of background flux.
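The orders of magnitude quoted above can be cross-checked with a simple analytical model. The sketch below rests on assumptions that are ours, not stated in the text: a homogeneous flux, one full window of 100 ns spent per observation, and a single acquisition phase per frame. Under these assumptions a time-gated run needs on average one observation per photon plus one, while an acquire or discard run ends only when an observation finds an empty window, which gives a geometric number of observations.

```python
import math

T_W = 100e-9            # acquisition window in seconds
FRAME_TIME = 1.0 / 30   # time budget of one frame at 30 FPS

def observations_per_run(flux, scheme):
    """Expected number of observations needed to cover one acquisition window."""
    mean_photons = flux * T_W
    if scheme == "time_gated":
        return mean_photons + 1.0        # one per photon, plus the final empty one
    if scheme == "acquire_or_discard":
        return math.exp(mean_photons)    # geometric: 1 / P(no photon in the window)
    raise ValueError(f"unknown scheme: {scheme}")

def max_runs_per_frame(flux, scheme):
    """Number of complete runs that fit in one frame (assumed timing budget)."""
    return FRAME_TIME / (observations_per_run(flux, scheme) * T_W)

ratio = observations_per_run(1e8, "acquire_or_discard") / observations_per_run(1e8, "time_gated")
print(f"observation ratio at 1e8 ph/s: {ratio:.0f}")
print(f"time-gated runs per frame at 1e8 ph/s: {max_runs_per_frame(1e8, 'time_gated'):.0f}")
print(f"acquire-or-discard runs per frame at 2.4e7 ph/s: "
      f"{max_runs_per_frame(2.4e7, 'acquire_or_discard'):.0f}")
```

Under these assumptions the model lands close to the figures reported from the simulation: a ratio of a few thousand at the highest flux, and about 3·10^4 runs per frame for both operating points.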
From a mathematical point of view, both acquisition methods allow sampling the correct distribution of the photon arrival times $(T_n)_{n \ge 1}$. Let $T_n$ denote the occurrence time of the $n$-th event, i.e., the arrival of the $n$-th photon. This can be defined as the infimum of the set of times $t$ such that the number of arrivals $N_t$ in the interval $[0, t]$ is greater than or equal to $n$:

$$ T_n := \inf\{\, t \ge 0 : N_t \ge n \,\}. $$

By definition, $N_{T_n} = n$. We can therefore derive an equivalent representation of the random variables $T_n$. Indeed, for $n \ge 2$, the time $T_n$ of occurrence of the $n$-th event can be obtained as the infimum of the set of times $t$ greater than $T_{n-1}$ (the time of occurrence of the $(n-1)$-th event) such that the increment $N_t - N_{T_{n-1}}$, the number of events occurring in the interval $(T_{n-1}, t]$, is greater than or equal to 1. Therefore

$$ T_n = \inf\{\, t > T_{n-1} : N_t - N_{T_{n-1}} \ge 1 \,\}, $$

which corresponds to the results of the acquisition schemes described previously.

Comparison with state-of-the-art
In this section, we compare our histogram-less acquisition method with state-of-the-art SPAD-based LiDAR sensors in terms of memory requirement, scalability and tolerance to high background light flux. For all comparisons in this section, we consider for our method a counter depth of 16 bits (i.e., up to 65535 counts for each of $N_B$ and $N_{TOT}$) plus 3 times the number of TDC bits (1× the TDC bits for the TDC word itself, and 2× the TDC bits to properly size the accumulator memory). First, we compare against standard sensors, i.e., sensors that require the raw timestamps to be read out to build the necessary histogram of timestamps off-chip. To provide a fair comparison, we do not consider the sensor resolution, which changes from chip to chip, but only the amount of memory required to build the histogram for one pixel.
In the works we consider for our comparison [2,10,13-15], we extrapolate the total amount of per-pixel memory based on the number of reported TDC bits and on an 8-bit histogram depth for all of them. We then compare our solution to sensors that offer full on-chip histogram capability [33,36,37]. Also in this case, we consider the amount of memory reported in each work necessary to build the histogram of timestamps for one pixel. Results are reported in Figure 8, where an average memory reduction factor of ≈ 2129 and ≈ 136 for standard and full on-chip histogram sensors, respectively, is obtained. More comparison details, including minimum and maximum memory reduction factors, are reported in Table 2. Similarly to our method, partial histogram approaches [24-26,31] are also quite effective in reducing the memory requirements. Nevertheless, our approach not only outperforms them with a memory reduction that ranges from 67% [25] to 3% [31], but also performs better in many other important aspects. In fact, unlike previous work, our approach does not have any of the following needs:
• The need to find the laser peak in time using a zooming or a sliding search procedure, which is at the basis of every partial histogram approach [32].
• The need to share hardware resources (TDCs, memory) among pixels in the same column [24] or in the same cluster [31], to reduce the area usage.
• The need for area consuming processors to manage the algorithm underlying the partial histogram technique and that can only be implemented using advanced 3D integrated technologies with single-pixel access [25].
All of these translate into a longer measurement time and a laser power penalty, as more acquisitions are needed than with a standard full-histogram approach [32], and into higher costs. Our method, given the very limited amount of required memory resources, is also advantageous in terms of scalability to higher sensor resolutions, and also in terms of range extension. As an example, a standard histogram-based sensor with a 15-bit TDC requires memory to store up to 32768 histogram bins per pixel. If the measurement range is doubled, the additional TDC bit results in an increase of 100% in the memory requirement. Conversely, with our approach, the memory increase needed to double the range is limited to only ≈ 3.9%.
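The scaling argument can be reproduced with the sizing rules stated at the beginning of this section: an 8-bit deep histogram with one bin per TDC code for the standard approach, and two 16-bit counters plus three times the TDC word length for the histogram-less approach. The sketch below is plain arithmetic on those rules; the function names are illustrative.

```python
def histogram_bits(tdc_bits, depth_bits=8):
    """Per-pixel memory of a full histogram: one counter per TDC code."""
    return (2 ** tdc_bits) * depth_bits

def histogram_less_bits(tdc_bits, counter_bits=16):
    """Two event counters, one TDC word and an accumulator twice as wide as the TDC word."""
    return 2 * counter_bits + 3 * tdc_bits

def increase_percent(bits_fn, tdc_bits):
    """Relative memory growth when one TDC bit is added (range doubled)."""
    return 100.0 * (bits_fn(tdc_bits + 1) / bits_fn(tdc_bits) - 1.0)

for name, fn in (("full histogram", histogram_bits), ("histogram-less", histogram_less_bits)):
    print(f"{name}: {fn(15)} bits at 15-bit TDC, +{increase_percent(fn, 15):.1f}% for 16 bits")
```

For a 15-bit TDC this gives 77 bits per pixel for the histogram-less approach, growing to 80 bits when the range is doubled, which is the ≈ 3.9% increase quoted above.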
Concerning the tolerance to high background light flux, both detection processes can sustain very high flux regimes, with a limit determined by the finite resolution of the timestamping circuit. This limit translates to the requirement that the mean time between photon events, i.e., the inverse of the total (background plus laser) event rate, be much larger than the timestamping resolution, i.e., having a low probability that more than one photon falls into the same time bin. By considering a threshold on this probability, we can extract the maximum photon flux that can be sustained by our detection process. Assuming Poisson photon statistics and denoting by $\mu$ the mean number of photons per time bin (photon flux times timestamping resolution), the probability to have more than one photon per time bin is expressed as
$$P(k > 1) = 1 - e^{-\mu}\,(1 + \mu).$$
By setting a threshold of less than 1%, and considering a timestamping resolution of 100 ps, the maximum photon flux that can be sustained is ≃ 1.48 · 10^9 ph/s. Compared to the maximum flux allowed in a standard system, which must comply with the 5% rule, and with the hypothesis of an acquisition window of 100 ns, our detection process can sustain a photon flux ≃ 3000 times higher.
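The figures quoted above can be checked numerically. The sketch below assumes Poisson-distributed photon counts per time bin and interprets the 5% rule as an average of 0.05 detections per acquisition window; both are assumptions made for illustration, and the function names are hypothetical.

```python
import math

def prob_multiple_photons(mean_per_bin: float) -> float:
    """P(k > 1) for a Poisson photon count with the given mean per time bin."""
    return 1.0 - math.exp(-mean_per_bin) * (1.0 + mean_per_bin)

def max_mean_per_bin(threshold: float, tol: float = 1e-12) -> float:
    """Largest mean count per bin that keeps P(k > 1) below threshold (bisection)."""
    lo, hi = 0.0, 10.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if prob_multiple_photons(mid) < threshold:
            lo = mid   # still below the threshold: the flux can grow
        else:
            hi = mid   # threshold exceeded: shrink the interval
    return lo

bin_width_s = 100e-12   # timestamping resolution quoted in the text
window_s = 100e-9       # acquisition window quoted in the text

max_flux = max_mean_per_bin(0.01) / bin_width_s   # photons per second
pile_up_flux = 0.05 / window_s                    # assumed reading of the 5% rule
ratio = max_flux / pile_up_flux

print(f"maximum flux: {max_flux:.3g} ph/s, ratio to the 5% rule: {ratio:.0f}")
```

Under these assumptions the computed flux and ratio land close to the ≃ 1.48 · 10^9 ph/s and ≃ 3000 values reported in the text.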

Table 2. Memory reduction factors (Min., Avg., Max.) with respect to standard sensors and full on-chip histogram sensors.

Measurement results
The proposed acquisition scheme has been validated with measurements using real data from an existing single-point SPAD-based d-ToF sensor, with an architecture similar to the one from Perenzoni et al. [34], which in addition offers on-chip histogramming capability. After a user-selectable number of acquisitions, the histogram is read out and unpacked. Then, the unpacked data is shuffled to recover a realization vector of the arrival times of the detected photons.
Background events were generated by means of a ≈180 W fiber-coupled halogen illuminator pointed directly toward the sensor, while a black matte panel with low (≈10%) reflectivity was selected as target, with a distance range from 1 m up to 3.8 m. A picture of the setup is shown in Figure 9, with indications of the main components.
First, we focus on the validation of the linearization behavior of the proposed acquisition scheme by considering only background light. Then, we consider the combination of background and laser together, as in a real scenario, and we compute the ToF with the proposed histogram-less acquisition scheme.

Preliminary considerations
As we base our measurements on the re-engineering of an existing d-ToF sensor, preliminary considerations are needed before providing further details on measurement results. The sensor measures the arrival time of the first detected photon for each laser pulse, as described in Section 2, which is stored in an on-chip histogram memory. Since the sensor measures the arrival time of the first photon, the statistical distribution is exponential, thus we are considering relative arrival times. A statistically valid realization of the incoming timestamps is obtained by unpacking and randomly shuffling the content of the histogram memory. The obtained realization is a vector of relative arrival times, which is the starting point of our measurement analysis.
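The unpack-and-shuffle step can be sketched as follows. This is a minimal illustration using NumPy; the function name is hypothetical, and placing each timestamp at the left edge of its bin is an assumption, not a detail taken from the sensor.

```python
import numpy as np

def histogram_to_timestamps(counts, bin_width_s, rng=None):
    """Unpack histogram counts into a randomly ordered vector of timestamps."""
    rng = np.random.default_rng() if rng is None else rng
    counts = np.asarray(counts, dtype=np.int64)
    # Repeat each bin index as many times as the number of counts in that bin.
    bin_indices = np.repeat(np.arange(counts.size), counts)
    # Assumption: every timestamp is placed at the left edge of its bin.
    timestamps = bin_indices * bin_width_s
    # The histogram keeps no ordering information, so any random order is
    # an equally valid realization of the acquisition sequence.
    rng.shuffle(timestamps)
    return timestamps

example = histogram_to_timestamps([2, 0, 1, 3], 100e-12, np.random.default_rng(0))
print(example.size)
```

Re-binning the returned vector with the same bin width gives back the original histogram, which is a quick sanity check for the unpacking.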
In Section 4, we described two possible acquisition schemes. The acquire or discard scheme, even though it is intrinsically inefficient, can be used directly with our dataset, as it requires no hardware modification of the existing SPAD-based d-ToF system. The time-gated scheme, while more efficient, requires a time-gating circuit which is not implemented in our sensor.
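As a rough sketch, the acquire or discard rule illustrated in Figure 5 can be applied in software to a stream of first-photon timestamps: a timestamp is recorded only if it is later than the last recorded one, otherwise it is discarded. The function name, the `t_stop` parameter and the rule used to close a run (a candidate timestamp at or beyond `t_stop` ends the run and is not recorded) are assumptions made for illustration.

```python
def acquire_or_discard(first_photon_times, t_stop):
    """Return the completed runs of record-setting timestamps below t_stop."""
    runs = []
    current = []
    last = float("-inf")
    for t in first_photon_times:
        if t <= last:
            continue                 # earlier than the last recorded photon: discard
        if t >= t_stop:
            runs.append(current)     # assumed stop rule: close the current run
            current = []
            last = float("-inf")
            continue
        current.append(t)            # later than the last recorded photon: acquire
        last = t
    return runs                      # a trailing, unfinished run is dropped

# Same ordering as in Figure 5: photons #1, #3, #7 and #8 are kept in the first run.
stream = [5, 3, 8, 6, 4, 7, 12, 20, 150, 30, 10, 40, 120]
print(acquire_or_discard(stream, t_stop=100))
```

Only a comparison against the last recorded value is needed per timestamp, which mirrors the single comparator and register mentioned for this scheme. The number of discarded entries in the example also shows why the scheme is inefficient.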
The first set of measurements focuses on background events only. In this case, there is a single source of events, so we can emulate the time-gated scheme by computing the cumulative sum of the timestamps to obtain absolute arrival times from relative ones. On the other hand, when events from background and laser are combined, as in a real measurement scenario, it is not possible to mimic the behavior of the time-gated scheme by means of the cumulative sum operation. In that case, we rely on the acquire or discard scheme.
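For the background-only case, the cumulative sum operation can be sketched on synthetic data. The example below assumes ideal exponentially distributed relative arrival times and folds the resulting absolute times back into consecutive acquisition windows; the folding step, the function name and the synthetic rate are assumptions for illustration.

```python
import numpy as np

def linearize_background(relative_times_s, window_s):
    """Turn relative arrival times into absolute ones and fold them into windows."""
    absolute = np.cumsum(relative_times_s)            # absolute arrival times
    window_index = np.floor_divide(absolute, window_s).astype(np.int64)
    time_in_window = np.mod(absolute, window_s)       # position inside each window
    return window_index, time_in_window

rng = np.random.default_rng(1)
true_rate = 100e6          # synthetic background rate in events/s (assumed)
window_s = 100e-9          # acquisition window quoted in the text

relative = rng.exponential(1.0 / true_rate, size=200_000)
window_index, t_lin = linearize_background(relative, window_s)

# With a linear response the flux is the event count divided by the elapsed time.
n_windows = int(window_index[-1]) + 1
estimated_rate = t_lin.size / (n_windows * window_s)
counts, _ = np.histogram(t_lin, bins=10, range=(0.0, window_s))
print(f"estimated rate: {estimated_rate:.3g} events/s")
```

The histogram of the folded timestamps is flat instead of exponentially decaying, which is the signature of the linearized response.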

Measurements with background light only
We set the intensity of background light from a minimum of ≈ 6.5 · 10^6 up to ≈ 133 · 10^6 events/s. This is the rate of events at the output of the SPAD, which therefore takes into account all the relevant physical parameters of a typical d-ToF system [35]. Considering an acquisition window of 100 ns, specific to the sensor [34], the equivalent average number of detections within the acquisition window equals ≈ 0.65 and ≈ 13.3 for the minimum and maximum background light intensity, respectively. In both cases, this is much higher than the conventional limit of 5% [18] (13 and 266 times higher, respectively), showing the high resistance of our method against pile-up distortion. By considering the equation which links the intensity of background events with the physical parameters of the system [35], it is possible to derive the equivalent background illumination level, up to a maximum of ≈ 85 kilolux. Measurement results are shown in Figure 10: the relative deviation from the reference background intensity, extracted from the exponential fit of the original histogram, is less than ±0.5 % over the whole range of values.
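The arithmetic behind these figures is short enough to check directly. The snippet below only multiplies the quoted event rates by the 100 ns acquisition window and compares the result with the 5% limit; the function name is hypothetical.

```python
def mean_detections_per_window(event_rate_hz, window_s):
    """Average number of detections expected within one acquisition window."""
    return event_rate_hz * window_s

WINDOW_S = 100e-9        # acquisition window quoted in the text
PILE_UP_LIMIT = 0.05     # conventional 5% limit

for rate in (6.5e6, 133e6):
    mean = mean_detections_per_window(rate, WINDOW_S)
    print(f"{rate:.3g} events/s: {mean:.3g} detections per window, "
          f"{mean / PILE_UP_LIMIT:.0f} times the 5% limit")
```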

Measurements with background and laser light and extraction of the ToF
Our first goal is to show that the underestimation of background counts which occurs in a standard d-ToF system can be completely recovered with our acquisition scheme. This is demonstrated in the first measurement, displayed in Figure 11, which compares a traditional acquisition with the acquire or discard scheme, qualitatively showing the linearization process by means of the linearized histogram of timestamps.
We then quantitatively evaluated the linearization process by estimating the intensity of background light from both portions of the histogram, i.e., before and after the laser peak. For this characterization, we used the ≈10% reflectivity target (black matte panel) at 2.5 m distance from the sensor. The results are depicted in Figure 12, showing a relative deviation from the ground truth (estimated from an exponential fit on the original dataset) below ±4 %.
In a different measurement, we verify the resistance of the proposed SPAD linearization method against pile-up distortion. To do so, we acquire several timestamps from the reflected laser pulse with a detection rate of 90%, which is 18 times higher than the conventional limit of 5%. The results, shown in Figure 13, prove the efficacy of our linearization method in challenging pile-up conditions where a standard sensor would fail. A measurement acquired with a conventional Time-Correlated Single Photon Counting (TCSPC) setup is shown as reference.
The last set of measurements shows the extracted ToF without the need to build a histogram of timestamps. For each measured distance, we run the linearization algorithm 250 times to have sufficient statistics to compute accuracy and precision. For each run of the algorithm, we average the results from N = 1.5 · 10^4 vectors of linearized SPAD timestamps, to emulate an equivalent 30 FPS operation rate, as outlined in Section 4.3 and in Figure 7. For all measurements, the same ≃ 10% reflectivity target (black matte panel) was used, in the range from 1 m to 3.8 m, to emulate a challenging scenario for a typical SPAD-based d-ToF system. First, we evaluate the behavior of the ToF extraction process without background light. The results are depicted in Figure 14.
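Accuracy and precision can be computed from the repeated runs as sketched below. The definitions used here (relative accuracy as the mean error over the ground truth, relative precision as the sample standard deviation over the ground truth) are assumptions, since the exact definitions are not stated in this section, and the input values are synthetic.

```python
import numpy as np

def relative_accuracy_and_precision(estimates, ground_truth):
    """Relative accuracy and precision of repeated ToF (or distance) estimates."""
    estimates = np.asarray(estimates, dtype=float)
    # Assumed definition: signed mean error, normalized to the ground truth.
    accuracy = (estimates.mean() - ground_truth) / ground_truth
    # Assumed definition: sample standard deviation, normalized to the ground truth.
    precision = estimates.std(ddof=1) / ground_truth
    return accuracy, precision

# Synthetic example: 250 runs scattered around a 2.5 m target (values are made up).
rng = np.random.default_rng(2)
runs = 2.5 + rng.normal(0.0, 0.005, size=250)
acc, prec = relative_accuracy_and_precision(runs, ground_truth=2.5)
print(f"relative accuracy: {100 * acc:+.2f} %, relative precision: {100 * prec:.2f} %")
```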

Conclusion
In this work, we demonstrate how to extract the time-of-flight information in a SPAD-based direct time-of-flight system without the need to build a resource- and bandwidth-hungry histogram of timestamps. Moreover, the proposed method is resistant against high photon fluxes and can withstand detection rates three orders of magnitude higher than the conventionally recognized limit of 5%. The acquisition method, which is based on the linearization of the SPAD response, is suitable for integration in CMOS technology using few resources and is therefore scalable to large arrays, since it can be easily integrated per-pixel. The proposed extraction method has been completely characterized, first with Monte Carlo numerical simulations. The method is also mathematically justified, and we demonstrated its validity with real measurements, by repurposing an existing d-ToF sensor and using real data to extract the ToF. The proposed extraction method can be implemented in at least two ways, by means of the acquire or discard or the time-gated detection scheme. While the acquire or discard scheme allows for the least usage of resources, it suffers from long integration times, especially when the flux of photons is too high. On the other hand, the time-gated scheme can guarantee a more efficient acquisition at the expense of a per-pixel controllable delay element. Concerning the ToF extraction method, we demonstrated its validity by using an extremely low amount of resources, as only two counters and one accumulator are required.

Figure 1. Simulated distribution of timestamps in a typical d-ToF system able to record 1 photon per acquisition, with a ToF of 20 ns and a laser pulse duration of 4 ns. The histogram is composed of 10^5 timestamps, with a bin size of 100 ps. Superimposed on the Monte Carlo simulation, we also show the analytical exponential distribution.

Figure 2. Simulated distribution of timestamps in a typical d-ToF acquisition. In (a) and (b), the distributions of background-only events and laser-only events, respectively, are shown. In (c), the distribution of the combination of background and laser events is reported, graphically showing that the superposition property does not hold due to the non-linear behavior of the detection process. In particular, the portion of background events after the laser peak is underestimated, as only one photon per acquisition can be detected. For each contribution, the four amplitude terms of the exponential (indexed 0 to 3) are reported; the term with index 3 is smaller than the term with index 0 due to the SPAD non-linearity.

Figure 4. Preliminary Monte Carlo simulation results showing the ToF computed with the proposed acquisition method with the hypothesis of an ideal linear detector.The ToF can be properly estimated over a wide range of   ,   pairs.

Figure 5. Example of SPAD response linearization with the acquire or discard acquisition method. In (a), each photon arrival time is represented by a red arrow and the order of arrival is indicated. The first run starts with the acquisition of photon #1, resulting in the first timestamp. Photon #2 is discarded, since its arrival time is earlier than that of photon #1. The next recorded information comes instead from photon #3, which is later than photon #1, and sets the new minimum time. The run proceeds with the same criteria, resulting in a stream of four photon arrival times from photons #1, #3, #7 and #8, which is a single realization of the emulated response of the linearized SPAD detector. On the right, a principle schematic is proposed, showing the lightweight usage of resources, with only one comparator and one register required on top of the processing circuit. The acquire or discard acquisition method is simple but inefficient, as most photon arrival times are discarded, resulting in longer acquisition times.

Figure 6. Example of SPAD response linearization with the time-gated acquisition method. With this approach, no photon timestamp is discarded thanks to the delayed activation of the SPAD for each timing measurement. During the charge delay phase, the SPAD front-end is forced OFF, thus photons cannot be detected. For each measurement, the first timestamp is detected and used to increment one of the two counters and to update the average time. At the same time, the charge delay phase value is updated accordingly for the next measurement. As opposed to the acquire or discard method, more hardware resources are needed to build the delay element which controls the activation of the SPAD.

Figure 7. Result of Monte Carlo simulation comparing the two proposed acquisition schemes in terms of efficiency for increasing values of background light flux in the range [10^6, 10^8] ph/s. On the left axis, we show the number of timestamps required to linearize the SPAD response over an acquisition window of 100 ns. On the right axis, we show the total time required for the two methods to collect N = 3 · 10^4 measurements, in order to average the linearized response of the SPAD N times. The horizontal line indicates a limit of ≈ 33.3 ms, for an equivalent operation frame rate of 30 FPS. With the time-gated scheme, the required frame rate can be guaranteed over the entire range of background light flux, while with the acquire or discard scheme the maximum sustainable flux is limited to ≈ 2.4 · 10^7 ph/s.

Figure 8. Comparison of the amount of per-pixel memory required by our histogram-less acquisition method against histogram-based d-ToF sensors. In (a), we consider standard sensors where every timestamp is read out and the histogram is built off-chip [2,10,13,14,15]. In (b), we consider sensors with full on-chip histogram capability [33,36,37].

Figure 9. Measurement setup with the FPGA control board, d-ToF system and halogen illuminator for the generation of background light pointed directly toward the sensor.

Figure 10. Linearization of the SPAD response with background events only. For each value of background flux, 8 · 10^6 timestamps are acquired from the sensor. In (a), an example of linearized histogram is shown together with the original one (exponentially distributed) for a background flux of ≈ 100 · 10^6 events/s. In (b), we show the flux of background events estimated from the linearized histogram of timestamps against the flux estimated from an exponential fit on the original histogram of timestamps. In (c), for each value of background flux, the entire dataset was split into 200 subsets to analyze the homogeneity of the linearization process, while in (d), the relative deviation from the background flux measured from the original histograms, used as a reference, is shown, demonstrating a relative deviation below ±0.5 % over all data subsets.

The results depicted in Figure 14 show good agreement between the extracted ToF and the ground truth. Then, we repeat the measurements with the inclusion of background light by setting the halogen illuminator to generate a background light flux of 7.7 · 10^6 events/s and 120 · 10^6 events/s. The values of background light flux are considered at the output of the SPADs of the sensor, and they correspond to an illumination level of ≃ 15 kilolux and ≃ 75 kilolux, respectively. Results are shown in Figures 15 and 16, demonstrating the validity of the proposed histogram-less ToF estimation in a real setup.

Figure 11. Qualitative measurement showing the linearization process of the proposed acquisition scheme. In (a), the original histogram of timestamps is shown in logarithmic scale, where the drop of counts which occurs after the laser peak is clearly visible. In (b), the histogram obtained with the acquire or discard scheme proves the efficacy of the linearization process, which fully compensates for the non-linearity of the detector. The length of the linearized histogram is shorter than the original dataset, as we decided to stop the linearization earlier to reduce the data loss which naturally occurs with the acquire or discard scheme. Due to the intrinsic inefficiency of this scheme, the histogram peak in (b) is attenuated by ≈ 34 dB with respect to the original dataset in (a). In (c), the two histograms are shown together after normalization.

Figure 12. Quantitative characterization of the linearization process considering four different values of background flux, from ≈ 27.6 · 10^6 events/s up to ≈ 133 · 10^6 events/s, with a target distance of 2.5 m. For each value of background flux, 2.5 · 10^6 timestamps are acquired from the sensor. In (a), the relationship between the background fluxes computed before the laser peak is shown, where the reference flux comes from an exponential fit on the original histogram, while the estimated flux comes from the linearized histogram. In (b), the same relationship is shown but considering the portion of background events after the histogram peak. For each portion, the relative deviation of the flux extracted from the linearized histogram of timestamps is shown, demonstrating an estimation error below ±4 % over the whole range. The application of the acquire or discard acquisition scheme results in a data reduction factor of ≈ 7.5 and ≈ 165 for the minimum and maximum background light flux, respectively.

Figure 13. Characterization of the behavior of the proposed SPAD linearization method under strong pile-up conditions. The histogram obtained from the linearized vector of timestamps is compared against the original histogram (built from the detection of the first arrival time) and against a reference measurement obtained with a conventional TCSPC setup. In the histograms obtained from our sensor timestamps, the bin width is 100 ps, while the reference measurement from the TCSPC setup has 4 ps timing resolution. The proposed SPAD linearization method allows us to recover the full shape of the laser envelope even if the detection rate is 18 times higher than the conventional limit of 5%.

Figure 14. Measurement results with no background light, showing the ToF extracted without the need to build a histogram of timestamps. The relative accuracy is below ±0.5%, while the relative precision is below 0.25% for all measurements.

Figure 15. Measurement results with low background light flux (7.7 · 10^6 events/s), showing the extracted ToF without the need to build a histogram of timestamps. The relative accuracy is in the range [-0.2, 2] %, while the worst relative precision is 6% at the highest distance of 3.8 m.

Table 1. List of parameters for a typical d-ToF acquisition.