0.5 Billion Counts per Second Enable High Speed and Penetration in Time-Domain Diffuse Optics

We present the application to time-domain diffuse optics of a high-speed 8 × 256 array of single-photon avalanche diodes with 256 integrated parallel time-to-digital converters. Thanks to the high light-harvesting capability granted by the overall 0.85 mm² active area, combined with a high throughput (i.e., a saturated photon counting and timing rate of 512 million counts per second), it has been possible for the first time to reconstruct histograms of photon time-of-flight in diffusive media using pulsed illumination at a photon counting rate of about 450 million counts per second, even at a source-detector distance of 2 cm. This has been achieved both on tissue-mimicking phantoms and in vivo, permitting high accuracy with extremely low acquisition times (down to 5 ms). The approach has been systematically validated on phantoms using established performance assessment protocols in the field of diffuse optics, covering both homogeneous (demonstrating high linearity in the recovery of the absorption coefficient) and heterogeneous (demonstrating high penetration inside scattering media) paradigms. Two preliminary in-vivo proof-of-concept applications on a healthy volunteer are shown: the detection of the heartbeat pattern in the brachioradialis muscle during an arterial cuff occlusion, and of the same pattern acquired on the forehead during resting state.


Laura Di Sieno, Tuomo Talala, Elisabetta Avanzi, Ilkka Nissinen, Member, IEEE, Jan Nissinen, and Alberto Dalla Mora

(Invited Paper)
Index Terms-Optical imaging, single-photon avalanche diode, time-correlated single-photon counting, time-domain diffuse optics.

I. INTRODUCTION
TIME-DOMAIN diffuse optics is widely recognized as a highly informative approach to the investigation of turbid media [1] and it finds application in several biomedical fields (e.g., prevention, diagnosis, and monitoring in neurology, oncology, and other medical branches) as well as in non-medical areas (e.g., food, wood, and pharmaceuticals characterization) [2]. The technique exploits the injection of visible/near-infrared subnanosecond light pulses [1]. Photons propagate inside the medium undergoing several scattering events, and those not absorbed are eventually re-emitted from different locations on the sample's surface. A time-resolved single-photon detector is used to collect the photons emitted from a given point and to reconstruct the photon Distribution of Times Of Flight (DTOF) using the Time-Correlated Single-Photon Counting (TCSPC) technique [3].
This work involved human subjects or animals in its research. Approval of all ethical and experimental procedures and protocols was granted by the Ethical Committee of Politecnico di Milano and performed in line with the Declaration of Helsinki.
Color versions of one or more figures in this article are available at https://doi.org/10.1109/JSTQE.2023.3298132.
Digital Object Identifier 10.1109/JSTQE.2023.3298132
Measurements performed with a single source-detector pair in the so-called "reflectance geometry" [1] (i.e., when source and detector are placed side by side on the same surface of the medium) enable both [2]: i) independent retrieval of the absorption ($\mu_a$) and reduced scattering ($\mu_s'$) coefficients, as their contributions affect the DTOF shape differently; ii) discrimination of information coming from different depths in the medium by considering photons recorded at different arrival times.
Unfortunately, time-domain diffuse optics is impaired by a low signal-to-noise ratio due to the need for reconstructing the DTOF one photon at a time, typically preventing the possibility to perform fast measurements [4].
Recently, initial studies at high sampling rates (i.e., 20-100 Hz) have become possible thanks to: i) device miniaturization, permitting sources/detectors to be hosted directly in the probe without optical fibers, thus minimizing losses (e.g., [5], [6]); ii) the development of large-area microelectronic detectors, increasing the overall amount of collected photons (e.g., [7]); and iii) high-throughput TCSPC devices, maximizing the amount of information per second that can be processed and transferred to a personal computer (PC) (e.g., [8]). This has enabled, also in the time domain, the first clear detection of the heartbeat in functional near-infrared spectroscopy studies [9], [10], [11], [12].
Let us now consider the most recent studies with cutting-edge performance. Reference [11], using traditional fiber-based detection/illumination and a high-throughput TCSPC system, demonstrated an acquisition rate of 20 Hz. The system operated at 2 wavelengths (i.e., 690 and 829 nm), exhibiting at 690 nm its best diffuse optical responsivity (i.e., a figure of merit quantifying the detector light-harvesting capability in diffuse optics, defined as in [13]): 2.8 × 10⁻⁸ m² sr. A high laser pulsing rate of 80 MHz was also adopted, thus limiting the effect of pile-up distortions in TCSPC acquisitions [3]. Overall, the system demonstrated the possibility to achieve a Photon Counting Rate (PCR) as high as ∼10 million counts per second (cps) on a single detector, adopting a source-detector separation of 3 cm on an optical phantom with standard properties ($\mu_a$ ≈ 0.1 cm⁻¹, $\mu_s'$ ≈ 10 cm⁻¹).
Reference [12] instead adopted fiberless and extremely high-throughput TCSPC technologies to demonstrate an acquisition rate of 100 Hz, operating at 2 wavelengths (i.e., 690 and 850 nm) and exhibiting at 690 nm its best diffuse optical responsivity: 7.2 × 10⁻⁹ m² sr. The laser pulsing rate was in this case 20 MHz. The detection chain is composed of six independent detectors and can reach a remarkable maximum PCR of ∼1.5 billion cps per detector. However, in-vivo measurements performed at 2 cm (3 cm) source-detector separation demonstrated a median PCR (across different detectors) of <20 million (<4 million) cps at 690 nm, most probably due to the lower responsivity as compared to [11]. A median PCR of ∼650 million cps is instead achieved at a 1 cm source-detector separation, where unfortunately the slow tail in the instrument response function (i.e., decay time constant of ∼300 ps) can limit the possibility to extract information from deep layers of the medium, as demonstrated in [14].
Generally speaking, the capability to follow deep and fast tissue dynamics requires combining a high diffuse optical responsivity with a high throughput, thus obtaining a high signal-to-noise ratio not only at short source-detector separations. Towards this objective, in this work we present the first application in time-domain diffuse optics of a fully integrated detection chain (i.e., detector array and TCSPC electronics) initially designed for Raman spectroscopy [15]. Thanks to its superior diffuse optical responsivity (i.e., 2.25 × 10⁻⁷ m² sr at 670 nm) and high throughput (i.e., saturated PCR = 512 million cps), we demonstrate the possibility to perform both phantom and in-vivo measurements reaching a PCR of ∼435 million cps at 2 cm source-detector separation. Reliable measurements with integration times down to 5 ms are demonstrated, provided that pile-up correction strategies are adopted. This result has been obtained despite using a laser pulse rate of just 2 MHz, a limit imposed by the integrated circuit, which was not custom designed for time-domain diffuse optics. This suggests that in the near future the PCR could be increased up to several Gcps with a redesigned TCSPC detector array and devices optimized for a more traditional operation at tens of MHz. The device is validated on phantoms exploiting figures of merit defined in rigorous standardized protocols for the performance assessment of diffuse optical instruments (namely, BIP [13], MEDPHOT [16], and nEUROPt [17]). Proof-of-concept in-vivo measurements are also reported, demonstrating the possibility to clearly identify the heartbeat pattern on muscles and on the forehead of a healthy volunteer.
It is worth noting that this work has two relevant limitations. The first is that a single wavelength is used, again because of hardware limitations (i.e., the temporal dynamic range of our TDCs is limited to <8 ns). In this embodiment, this prevents hosting another DTOF corresponding to a different wavelength (e.g., 850 nm) in the same histogram, since it would have to be delayed by more than 8 ns so as not to overlap with the information related to the DTOF at 670 nm. Therefore, independent patterns of oxy- and deoxy-hemoglobin cannot be distinguished, as was instead done in [11], [12]. The second limitation is the dead time between two subsequent acquisitions, which is 6 ms. As a matter of fact, 5 ms acquisitions result in a true measurement rate of ∼91 Hz instead of the theoretically achievable 200 Hz. However, this limit is due to an FPGA circuit design that was not customized for this application (it was initially designed for Raman spectroscopy). A null dead time can be easily achieved with a tailored FPGA circuit design (e.g., by utilizing unused memory banks available in the current FPGA circuit, thus storing photon counts so as to permit parallel acquisition and data download to a PC [18]).
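The effect of the dead time on the true measurement rate can be checked with a few lines of Python. This is only an illustrative sketch: the function name `measurement_rate_hz` is ours and is not part of the device software.

```python
def measurement_rate_hz(integration_s: float, dead_time_s: float) -> float:
    """Acquisitions per second when each one is followed by a fixed dead time."""
    return 1.0 / (integration_s + dead_time_s)

# Values quoted in the text: 5 ms integration, 6 ms dead time between acquisitions.
rate_with_dead_time = measurement_rate_hz(5e-3, 6e-3)   # about 91 Hz
rate_without_dead_time = measurement_rate_hz(5e-3, 0.0)  # 200 Hz with a null dead time
print(f"{rate_with_dead_time:.1f} Hz with dead time, "
      f"{rate_without_dead_time:.0f} Hz without")
```

With these numbers the dead time costs more than half of the theoretically achievable rate, which is why removing it in the FPGA design matters.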

II. INSTRUMENTAL SETUP AND DATA ANALYSIS
Fig. 1 shows a schematic representation of the instrumental setup used in this work. The detector chip with integrated TCSPC electronics ("DET" in Fig. 1) is a 110-nm CMOS technology device composed of an 8 × 256 array of Single-Photon Avalanche Diodes (SPADs). Each SPAD has an active area of 415 μm², resulting in an overall sensitive area of 0.85 mm². 256 parallel Time-to-Digital Converters (TDCs) are hosted on the same chip, each one independently connected to 8 SPADs. Each TDC time scale is composed of 128 time bins with adjustable time width, which has been set to ∼60 ps for this work, thus resulting in a temporal dynamic range of ∼7.7 ns. The detector is hosted on a printed circuit board and is surrounded by a neoprene layer to prevent optical short circuits (i.e., direct light leakage) between source and detector points. A 5 V and a 21.35 V ("HV" in Fig. 1) voltage bias are applied to the board using two external power supply units, respectively for powering the board electronics and for biasing all the SPADs beyond the breakdown voltage (i.e., ∼18.3 V). The maximum laser pulse rate that can be used with the device, applied to the trigger input ("Trig. IN" in Fig. 1) of the board, is 2 MHz. DTOFs can be acquired for a user-selectable integration time by providing to the board, through a USB connection to a PC, the equivalent number of laser trigger pulses to be acquired. Therefore, at a 2 MHz laser rate, the integration time can be set from an extremely low value of 500 ns (i.e., 1 laser pulse) up to more ordinary values like 1 s (i.e., 2M laser pulses) or more if needed. Independently of the integration time, a 6 ms dead time is present between two subsequent acquisitions, thus limiting the maximum feasible measurement rate to ∼167 Hz. At a 2 MHz laser pulse rate, the maximum saturated PCR is 512 Mcps, resulting from 256 TDC conversions (i.e., 1 for each TDC) every 500 ns. Further details on the device can be found in [15].
The 2 MHz laser pulses are provided by a commercially available laser source composed of a laser control unit (PDL 828 Sepia II, Picoquant GmbH, Germany) connected to a 670 nm laser head (LDH-P-C-670M). The synchronization signal is provided by the laser trigger output ("Trig. OUT" in Fig. 1) through a radiofrequency pulse inverter ("INV" in Fig. 1). The laser head is coupled to a 600-μm core diameter step-index optical fiber that delivers the light to the sample. The fiber has a pigtailed fiber-to-fiber u-bench hosting a variable optical attenuator (i.e., a circular, continuously variable, reflective neutral density filter) used to set the proper PCR. Indeed, by changing the position of the incident beam on the disk, it is possible to modify the attenuation experienced by the photons, thus setting the amount of optical power delivered to the sample.
The detector, instead, is placed directly on the surface of the diffusive sample (i.e., tissue-mimicking phantom or volunteer's arm/forehead) to maximize light harvesting. The laser tune parameter is set to 100%, thus resulting in an overall average power of 2.4 mW at the sample.
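The headline figures of the detection chain follow directly from the numbers given in this section. The sketch below simply redoes that arithmetic; all names are illustrative and do not come from the device firmware or software.

```python
# Figures quoted in the text; the names are illustrative only.
N_SPADS = 8 * 256        # SPADs in the array
SPAD_AREA_UM2 = 415.0    # active area of one SPAD, in square micrometres
N_TDCS = 256             # parallel time-to-digital converters
N_BINS = 128             # time bins per TDC
BIN_WIDTH_PS = 60.0      # approximate bin width used in this work
LASER_RATE_HZ = 2e6      # maximum laser pulse rate accepted by the board

active_area_mm2 = N_SPADS * SPAD_AREA_UM2 * 1e-6   # um^2 -> mm^2, about 0.85
time_range_ns = N_BINS * BIN_WIDTH_PS * 1e-3       # ps -> ns, about 7.7
saturated_pcr_cps = N_TDCS * LASER_RATE_HZ         # one conversion per TDC per pulse

def integration_time_s(n_pulses: int, laser_rate_hz: float = LASER_RATE_HZ) -> float:
    """Integration time corresponding to a given number of laser trigger pulses."""
    return n_pulses / laser_rate_hz

print(f"active area   : {active_area_mm2:.2f} mm^2")
print(f"time range    : {time_range_ns:.2f} ns")
print(f"saturated PCR : {saturated_pcr_cps / 1e6:.0f} Mcps")
print(f"1 pulse       : {integration_time_s(1) * 1e9:.0f} ns")
```

The saturated PCR scales linearly with the laser rate, which is the basis for the expectation that a device accepting tens of MHz could reach several Gcps.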
Once the DTOFs are acquired, some post-processing steps are applied to all measurements: pile-up correction, Differential Non-Linearity (DNL) correction, and background subtraction. To correct the effect of pile-up distortion, Coates' algorithm has been applied [19]: it is a simple (i.e., it does not require any a-priori knowledge of the DTOF shape) yet effective data correction method in the field of time-domain diffuse optics, as demonstrated in other works such as [20] and [21]. Then the DNL correction has been applied and the background noise (taken as the average level before the arrival of the first photons re-emitted from the medium) has been subtracted. All these post-processing steps have been applied to each histogram acquired by each TDC.
Finally, all the DTOFs acquired by the TDCs have been summed up together in order to have a single histogram for each acquisition time.
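The processing chain described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the pile-up step uses the standard form of Coates' correction for a single-hit converter (at most one event per TDC per laser period), the DNL step assumes that a relative bin-width table (mean value 1) is available for each TDC, and all function names are hypothetical.

```python
import math

def coates_correction(counts, n_cycles):
    """Coates' pile-up correction for a single-hit TDC histogram.

    counts[i] is the number of events recorded in bin i over n_cycles laser
    periods. Returns the estimated photons per bin in the absence of pile-up.
    """
    corrected = []
    earlier = 0.0                           # events recorded in all previous bins
    for n_i in counts:
        still_armed = n_cycles - earlier    # periods in which bin i could still fire
        q_i = n_i / still_armed if still_armed > 0 else 0.0
        # Poisson statistics: P(at least one photon) = 1 - exp(-lambda)
        lam = -math.log(1.0 - q_i) if q_i < 1.0 else float("inf")
        corrected.append(n_cycles * lam)
        earlier += n_i
    return corrected

def dnl_correction(counts, relative_bin_widths):
    """Divide each bin by its relative width (assumed known, mean value 1)."""
    return [c / w for c, w in zip(counts, relative_bin_widths)]

def subtract_background(counts, n_background_bins):
    """Subtract the mean level of the first bins, before the first photons arrive."""
    background = sum(counts[:n_background_bins]) / n_background_bins
    return [c - background for c in counts]

def process_and_sum(histograms, n_cycles, bin_width_tables, n_background_bins):
    """Correct each TDC histogram independently, then sum over all TDCs."""
    total = [0.0] * len(histograms[0])
    for counts, widths in zip(histograms, bin_width_tables):
        h = coates_correction(counts, n_cycles)
        h = dnl_correction(h, widths)
        h = subtract_background(h, n_background_bins)
        total = [t + x for t, x in zip(total, h)]
    return total
```

Applying the corrections per TDC before summing matters because pile-up acts on each converter separately: the same total count rate distorts one TDC far more than it distorts 256 of them sharing the load.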

III. SYSTEM PERFORMANCES ASSESSMENT
In this section, the performance of the proposed device is assessed using figures of merit from internationally shared protocols, namely BIP, MEDPHOT and nEUROPt [13], [16], [17].

A. Basic Instrumental Performances (BIP Protocol)
As a first step, we computed the diffuse optical responsivity. It depends on several factors, such as the numerical aperture, the quantum efficiency, and the fill factor of the detector. To compute it, we made use of a solid phantom with calibrated photon transmittance. Simply put, the responsivity is the ratio between the signal measured on the phantom and the injected photon radiance. The calculation has been done according to the definition reported in [13], and the diffuse optical responsivity was computed to be 2.25 × 10⁻⁷ m² sr. This value is the highest reported for a system working at short acquisition times. Indeed, compared to the systems reported in [11] and [12] at the same wavelength, it shows a diffuse optical responsivity that is higher by about a factor of 10 and 30, respectively. On the other hand, when comparing the present device with the cutting-edge microelectronic technologies exhibiting the largest active areas in the field of time-domain diffuse optics, its diffuse optical responsivity is about 550, 80 and 30 times lower [21], [22], [23]. This difference can be mainly ascribed to the smaller active area of this device (0.85 mm²) as compared to the other 3 devices (7.38, 32 and 92.1 mm² active area, respectively).
In the following, all the acquisitions have been performed in 3 different conditions:
- "High PCR (all TDCs)": obtained by setting a PCR per TDC of ∼1.7 Mcps and summing up all TDCs, thus resulting in an overall PCR of ∼435 Mcps;
- "Low PCR (all TDCs)": obtained by setting a PCR per TDC of ∼60 kcps, corresponding to 3% of the laser pulse rate at each TDC (i.e., a commonly adopted limit to avoid pile-up effects), and summing up all TDCs, thus resulting in an overall PCR of ∼15 Mcps;
- "Low PCR (20 TDCs)": obtained by setting a PCR per TDC of ∼60 kcps, corresponding to 3% of the laser pulse rate at each TDC, and summing up only 20 TDCs, thus resulting in an overall PCR of ∼1.2 Mcps.
Those 3 cases mimic different operating conditions. In the first case ("High PCR (all TDCs)") the goal is to make the best use of the proposed device to achieve the highest possible PCR (without reaching saturation, i.e., 512 Mcps, where the pile-up correction algorithm would be less effective). The second case ("Low PCR (all TDCs)") has been devised to explore the performance of the proposed device in the standard single-photon statistics condition (PCR ≤ 3% of the laser pulse rate) while exploiting the high throughput given by the high parallelization (sum of all 256 TDCs) of the sensor. The third case ("Low PCR (20 TDCs)") has instead been chosen to compare the previous conditions with the performance normally achievable with standard detection chains. Indeed, given the resulting 1.2 Mcps PCR, this condition is similar to the case of a single detector coupled to a single TCSPC chain operating at 3% of a 40 MHz laser repetition rate [21].
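The overall count rates of the operating conditions are simply the per-TDC rate multiplied by the number of TDCs that are summed. The short sketch below reproduces that arithmetic; the per-TDC value for the high PCR condition (1.7 Mcps) is the one quoted later in the text, and the names are illustrative.

```python
LASER_RATE_HZ = 2e6
N_TDCS = 256

def overall_pcr(per_tdc_cps: float, n_tdcs: int) -> float:
    """Total photon counting rate obtained by summing n_tdcs converters."""
    return per_tdc_cps * n_tdcs

low_per_tdc = 0.03 * LASER_RATE_HZ   # 3% of the laser pulse rate, i.e. about 60 kcps

conditions = {
    "High PCR (all TDCs)": overall_pcr(1.7e6, N_TDCS),
    "Low PCR (all TDCs)": overall_pcr(low_per_tdc, N_TDCS),
    "Low PCR (20 TDCs)": overall_pcr(low_per_tdc, 20),
}
# Reference case: one detector and one TCSPC chain at 3% of a 40 MHz laser.
single_chain_40mhz = 0.03 * 40e6

for name, pcr in conditions.items():
    print(f"{name}: {pcr / 1e6:.2f} Mcps")
print(f"Single chain at 40 MHz: {single_chain_40mhz / 1e6:.2f} Mcps")
```

Twenty TDCs at 2 MHz give the same 3% budget as one chain at 40 MHz, which is why the third condition serves as a stand-in for a conventional system.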
Fig. 2 reports the Instrument Response Functions (IRFs) recorded in the 3 above-mentioned operating conditions. To acquire the IRF, we put a thin layer of Teflon on the laser fiber tip to homogeneously illuminate the whole active area of the detector, keeping the fiber tip about 6.5 cm above the detector's surface. The width of the IRF (quoted as Full-Width at Half Maximum, FWHM) is around 850 ps and it is dominated by the broad laser pulse shape resulting from operation at 100% tune, needed to reach the high PCR condition. On the other hand, a slow tail with a time constant of around 570 ps sets in less than one decade below the peak. Such a long tail causes early photons to contaminate the region of the curve where the contribution of late photons is expected, and prevents the use of short source-detector separations [14], [24]. It is worth noting that only minor improvements are expected from reducing the IRF width since, as demonstrated in [24], the limiting factor is the tail. It is also possible to see an overshoot at about 7.5 ns (corresponding to bin 123). This peak is caused by the structure of the TDC, which has one part (from bin 1 to bin 122) where all TDCs are connected in parallel to keep the temporal differences between them as small as possible, and another part (bins 123-127) where parallel connections are not made for reasons related to power net wiring in the layout. This causes an instability in the bin width that cannot be perfectly corrected with the DNL correction. For this reason, in all analyses we exclude the last bins, starting from bin 123.

B. Capability to Retrieve Optical Properties of Homogeneous Media (MEDPHOT Protocol)
To test the capability of the proposed device to properly retrieve the absorption coefficient of homogeneous media, we used some tests taken from the MEDPHOT protocol [16]: the noise, the linearity in the retrieval of $\mu_a$, and the coupling between the recovered $\mu_a$ and the $\mu_s'$ of the phantom.
The implementation of the MEDPHOT protocol is possible thanks to a phantom kit featuring different values of reduced scattering (from 3.9 to 15.6 cm⁻¹ at 670 nm). For each value of $\mu_s'$, 8 phantoms with absorption ranging from 0.002 to 0.386 cm⁻¹ (in steps of 0.055 cm⁻¹) at 670 nm are available. All measurements reported in the following were done in reflectance geometry with a source-detector separation of 2 cm (computed as the distance between the fiber tip and the middle of the sensor).
The first test performed is the noise test. The same phantom ($\mu_a$ = 0.057 cm⁻¹ and $\mu_s'$ = 7.73 cm⁻¹, as done in [25]) was measured in all the 3 operating conditions with an integration time ranging from 20 (10 μs) to 2621440 (1.3 s) laser cycles. All intermediate integration times are obtained by multiplying the previous integration time by a factor of 2. For each integration time, 100 repetitions were acquired. To recover the optical properties (both $\mu_a$ and $\mu_s'$) from each single DTOF, we made use of an in-house developed fitting tool that fits the experimental DTOF to the analytical model of photon transport in a diffusive semi-infinite medium under the diffusion approximation [26]. To take into account the non-idealities of the system, the software convolves the IRF shape (acquired in the same operating condition as the measurements) with the analytical model. The curves were fitted in the range spanning from 20% of the peak on the rising edge down to 1% of the peak on the tail of the DTOF.
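The selection of the fitting range lends itself to a short example. The sketch below only illustrates how the 20% / 1% thresholds could be turned into bin indices on a background-subtracted DTOF; it is not the in-house fitting tool, it does not reproduce the diffusion model of [26], and the function name `fit_window` is hypothetical.

```python
def fit_window(dtof, rise_fraction=0.20, tail_fraction=0.01):
    """Return (first, last) bin indices of the fitting range, both inclusive.

    The range starts at the first bin on the rising edge at or above
    rise_fraction of the peak, and ends at the last bin on the tail that is
    still at or above tail_fraction of the peak.
    """
    peak_idx = max(range(len(dtof)), key=lambda i: dtof[i])
    peak = dtof[peak_idx]

    # Rising edge: scan from the start of the histogram up to the peak.
    first = next(i for i in range(peak_idx + 1) if dtof[i] >= rise_fraction * peak)

    # Tail: scan from the peak onwards and stop at the first bin below threshold.
    last = peak_idx
    for i in range(peak_idx, len(dtof)):
        if dtof[i] < tail_fraction * peak:
            break
        last = i
    return first, last

# Toy DTOF (arbitrary units), peak of 100 counts at bin 3.
example = [0, 1, 30, 100, 60, 20, 5, 1.5, 0.5, 0.2]
print(fit_window(example))
```

A wider dynamic range pushes the 1% threshold further down the tail, so more late bins enter the fit; this is one reason why high count rates help the retrieval of the absorption coefficient.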
Once the optical properties were recovered, we computed for each operating condition and integration time the Coefficient of Variation (CV) of the absorption coefficient, as reported in [16] and in (1):
$$\mathrm{CV} = \frac{\sigma(\mu_a)}{\langle \mu_a \rangle} \tag{1}$$
where $\sigma(\mu_a)$ and $\langle \mu_a \rangle$ are the standard deviation and the mean value of the absorption coefficient over the 100 repetitions, respectively. Fig. 3 reports the CV at the different integration times (expressed as laser cycles accumulated) for the 3 operating conditions. It is clear that, with the same integration time, the high PCR measurements show a much lower CV with respect to the low PCR ones.
In this latter case, despite the 2 MHz laser pulse rate, exploiting the parallelization allowed by the proposed sensor significantly decreases the noise in the retrieval of the absorption coefficient compared to a system working at 40 MHz but with a single detector (equivalent to our low PCR condition with 20 TDCs). Moreover, it has to be noted that for the low PCR (20 TDCs) condition, the number of photons collected in 20, 40 and 80 laser cycles is so low (<100) that only the background is fitted, thus leading to a relatively stable solution that produces a flat CV. When working in the high PCR condition, a CV lower than 1% (usually considered as a threshold since it is of the same order as biological variations) is achieved when accumulating photons for 10240 laser cycles (i.e., 5.12 ms).
In this condition the recovered CV is 0.97%, thus the minimum integration time for spectroscopic measurements can be set to 5 ms. It is worth noting that with the same number of laser cycles in the low PCR condition a CV of 3.04% or 11.1% is obtained if all TDCs or just 20 of them are used, respectively. Moreover, some fluctuations in the CV at high integration times in the high PCR condition are to be ascribed to an increase in the standard deviation of $\mu_a$, since the recovered $\mu_a$ is very stable. These fluctuations, despite being very small, can be ascribed to a normal drift/fluctuation of the system (e.g., laser or detector drift).
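As an illustration of the noise test, the CV over a set of repetitions and the series of integration times (in laser cycles, doubling at each step) can be computed as follows. This is a sketch with hypothetical function names; it assumes the sample standard deviation (n − 1 in the denominator), which the text does not specify.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, returned as a fraction."""
    return stdev(values) / mean(values)

def laser_cycles_series(first=20, last=2_621_440):
    """Integration times in laser cycles, each twice the previous one."""
    cycles = []
    n = first
    while n <= last:
        cycles.append(n)
        n *= 2
    return cycles

series = laser_cycles_series()
laser_period_s = 1.0 / 2e6   # 500 ns at a 2 MHz laser pulse rate
for n_cycles in (series[0], 10240, series[-1]):
    print(f"{n_cycles} cycles -> {n_cycles * laser_period_s * 1e3:.3f} ms")
```

The 10240-cycle point, where the high PCR condition first drops below a CV of 1%, is the tenth element of this doubling series.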
Once the minimum integration time (5 ms) had been determined, we measured the MEDPHOT phantom kit in this condition. Only phantoms reaching the high PCR (i.e., 1.7 Mcps average count-rate per TDC) were measured, corresponding to: i) all those from the series featuring $\mu_s'$ = 3.87 cm⁻¹; ii) those with $\mu_s'$ = 7.73 cm⁻¹ and $\mu_a$ ≤ 0.221 cm⁻¹; iii) those with $\mu_s'$ = 11.6 cm⁻¹ and $\mu_a$ ≤ 0.111 cm⁻¹. All the above-mentioned phantoms were measured in reflectance geometry (source-detector separation = 2 cm) in all the 3 operating conditions, acquiring 100 repetitions for each sample to enable the computation of the standard deviation. The optical properties were recovered using the same procedure explained for the noise measurements.
In this context, we are interested in the capability of the system to properly retrieve $\mu_a$, since in most applications (e.g., brain imaging) $\mu_s'$ is assumed to stay constant throughout the whole activity. To characterize the capability of the proposed device to correctly retrieve linear variations of $\mu_a$ within a homogeneous medium, we made use of the linearity test dictated by the MEDPHOT protocol [16].
Fig. 4 reports the results of the linearity tests, where the center of each bubble represents the average value of the measured $\mu_a$ while its area is proportional to the standard deviation among repetitions.
The measured $\mu_a$ values are plotted against their conventionally true values for the different $\mu_s'$ phantom series (different plots in the first row) and the different operating conditions (colors).
It is possible to notice that for phantoms with a conventionally true $\mu_a$ lower than or equal to 0.22 cm⁻¹, there are no significant differences among the 3 operating conditions in the average value of the measured absorption. For increasing values of $\mu_a$ (only for $\mu_s'$ = 3.87 cm⁻¹), the high PCR operating condition ensures a more linear retrieval of the average $\mu_a$ (the obtained values are closer to the black dotted line), while the low PCR approach (both with all and with 20 TDCs) shows an underestimation of the recovered $\mu_a$, probably linked to the reduced dynamic range of the DTOF. Moreover, the use of the high PCR operating condition reduces the standard deviation, in particular for highly absorbing phantoms, where a high dynamic range of the DTOF is fundamental to precisely retrieve the optical properties. Conversely, the coupling between the $\mu_s'$ of the phantom and the measured average $\mu_a$ (graphs in the second row of Fig. 4) is basically negligible and does not depend significantly on the operating conditions (different graphs).

C. Capability to Detect an Absorption Perturbation Within a Homogeneous Medium (nEUROPt Protocol)
As the last step in the characterization of the proposed device on phantoms, we tested its capability to detect an optical perturbation (e.g., a change in the $\mu_a$ of the cortex due to brain activation, or the presence of a tumor in the breast) buried within a homogeneous medium. To this end, we made use of two figures of merit defined in the nEUROPt protocol: the contrast (C) and the contrast-to-noise ratio (CNR) [17]. The former represents the relative change in the number of photons due to the presence of an optical perturbation, while the latter assesses the robustness of the contrast with respect to the intrinsic noise of the measurements. The calculation of C and CNR is done as reported in (2) and (3), respectively:
$$C = \frac{N_0 - N}{N_0} \tag{2}$$
$$\mathrm{CNR} = \frac{N_0 - N}{\sigma(N_0)} \tag{3}$$
where $N_0$ and $N$ correspond to the number of counts in the homogeneous case (i.e., no perturbation present in the medium) and in the perturbed case, respectively, whereas $\sigma(N_0)$ is the standard deviation of the number of counts in the homogeneous situation. The contrast and CNR can also be calculated on time-gates (i.e., a given portion of the DTOF), in which case $N_0$ and $N$ are the numbers of counts in the selected time-gate. The contrast and CNR tests are implemented using a liquid phantom (made of calibrated quantities of water, ink and Intralipid to have $\mu_a$ = 0.1 cm⁻¹ and $\mu_s'$ = 10 cm⁻¹ [27]) and a totally absorbing inclusion (a black PVC cylinder of 100 mm³ volume with height = diameter = 5 mm, whose effect is equivalent to a $\Delta\mu_a$ = 0.17 cm⁻¹ over 1 cm³ [28]). The phantom was contained inside a black tank with a lateral Mylar window to allow optical access for the probe. To test the penetration depth of the system, the totally absorbing inclusion was placed in contact with the Mylar foil (in this condition its center is at 2.5 mm depth) and then moved in depth by means of a motorized stage up to 50 mm in steps of 2.5 mm (positions referred to the center of the cylinder). The homogeneous state was simulated by moving the perturbation to a depth of 60 mm, so that its effect can be considered negligible.
Fig. 4. In the first row, for the three $\mu_s'$ phantom series (different plots), the measured $\mu_a$ values are plotted against the conventionally true values for the 3 different operating conditions (colors). The dotted black line (bisector) represents the expected value. In the second row, for each operating condition (different plots), the coupling between the measured $\mu_a$ and the conventionally true $\mu_s'$ is displayed (colors encode the conventionally true $\mu_a$ of the phantom). The center of each bubble represents the average value of the measured $\mu_a$, while its area is proportional to the standard deviation among repetitions.
For each position of the inclusion, 100 repetitions of 10 ms (i.e., 20000 laser cycles) were acquired to enable the computation of the CNR as shown in (3). All the settings were kept identical for the 3 operating conditions. Fig. 5 reports the curves recorded in the homogeneous case for the 3 operating conditions (different colors), after averaging the 100 repetitions. The colored rectangles represent the time-gates used for analysis (width of about 1 ns). The first gate starts where the first photons in the high PCR condition are detected, at about 250 ps before the temporal position of the IRF's center of gravity. As discussed in Section III-A, the latest region of the DTOF, starting from the overshoot occurring at about 7.5 ns (i.e., bin 123), has been discarded from the analysis.
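The time-gated contrast and CNR can be sketched as below. The example assumes the usual nEUROPt definitions, C = (N0 − N)/N0 and CNR = (N0 − N)/σ(N0), with N0 and N taken as averages over the repetitions and σ(N0) as the sample standard deviation of the homogeneous counts. Function names and the toy data are hypothetical.

```python
from statistics import mean, stdev

def gate_counts(dtof, first_bin, last_bin):
    """Total counts of one DTOF inside a time-gate (bin indices inclusive)."""
    return sum(dtof[first_bin:last_bin + 1])

def contrast_and_cnr(homogeneous_reps, perturbed_reps, first_bin, last_bin):
    """Contrast and CNR in one time-gate from repeated DTOF acquisitions."""
    n0 = [gate_counts(d, first_bin, last_bin) for d in homogeneous_reps]
    n = [gate_counts(d, first_bin, last_bin) for d in perturbed_reps]
    n0_mean, n_mean = mean(n0), mean(n)
    contrast = (n0_mean - n_mean) / n0_mean
    cnr = (n0_mean - n_mean) / stdev(n0)
    return contrast, cnr

# Toy data: two repetitions per state, gate covering bins 1 and 2.
homogeneous = [[0, 100, 98], [0, 100, 102]]
perturbed = [[0, 90, 90], [0, 90, 90]]
c, cnr = contrast_and_cnr(homogeneous, perturbed, 1, 2)
print(f"contrast = {c:.3f}, CNR = {cnr:.2f}")
```

Note that the noise term depends only on the homogeneous state, so the CNR can be limiting even where the contrast curve itself looks smooth.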
Fig. 6 reports the contrast (left) and CNR (right) computed for different time-gates (symbols and colors, the same as those of the rectangles in Fig. 5) for all the 3 operating conditions (rows). Independently of the PCR and of the number of TDCs used, it is clearly noticeable that later time-gates make it possible to distinguish deeper perturbations, as predicted by theory [2]. However, the maximum penetration depth strongly depends on the operating condition. Indeed, assuming as a threshold for detectability a contrast larger than 1% (usually considered as the contrast induced by physiological variations in a living tissue) and a CNR larger than 1 (i.e., the fluctuation of the measurement is lower than the change produced by the inhomogeneity to be detected), despite the short acquisition time, a remarkable penetration depth of 35 mm can be reached when working in the high PCR condition, while 30 and 25 mm can be achieved in the low PCR condition (with all or 20 TDCs, respectively). It is worth noting that for the high PCR and low PCR (all TDCs) approaches, the penetration depth is limited by the CNR, since the contrast curves are quite smooth (i.e., not noisy) and larger than 1% up to more than 45 mm and 42 mm, respectively. It has to be noted that in our case the integration time was set to 10 ms, while the nEUROPt protocol dictates repetitions of 1 s (100 times more): such a reduction of the integration time determines a decrease of the CNR by a factor of √100 = 10 with respect to the standard protocol condition. This implies that, using an integration time of 1 s, a boost of a decade in the CNR can be achieved, thus potentially reaching a penetration depth of 42.5 mm and around 40 mm (the estimation is less precise since the contrast curves are noisier at these depths) for the high and low (all TDCs) PCR conditions, respectively. To the best of our knowledge, this represents a remarkable value, exceeding the best results achieved in the literature with cutting-edge systems [22], [23], [24], [25], [26], [27], [28], [29].
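The factor of 10 quoted for the CNR follows from a short scaling argument, sketched here under the assumption of purely Poissonian (shot) noise and a constant count rate, so that the counts scale with the integration time $T$ and their standard deviation with $\sqrt{T}$:

```latex
\mathrm{CNR}(T) = \frac{N_0(T) - N(T)}{\sigma\big(N_0(T)\big)}
\;\propto\; \frac{T}{\sqrt{T}} = \sqrt{T},
\qquad
\frac{\mathrm{CNR}(1\ \mathrm{s})}{\mathrm{CNR}(10\ \mathrm{ms})}
= \sqrt{\frac{1\ \mathrm{s}}{10\ \mathrm{ms}}} = \sqrt{100} = 10 .
```

Any additional noise source that does not average down with time (e.g., system drift) would make the actual gain smaller than this estimate.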

IV. IN-VIVO PROOF OF PRINCIPLE MEASUREMENTS
Once the suitability of the proposed device to recover variations in absorption, with a penetration depth up to 35 mm (thus reaching the brain cortex) in the high PCR operating condition, had been demonstrated, we tested it in in-vivo proof-of-principle measurements. To exploit the high throughput and sampling rate of the proposed system, we measured one subject both in resting state and during an arterial occlusion, with the aim of recording the heartbeat. The acquisitions were approved by the Ethical Committee of Politecnico di Milano and were conducted in compliance with the Declaration of Helsinki. Moreover, written informed consent was given by the subject.
For both measurements, the device was placed in contact with the tissue (left forehead for the resting-state measurement and right arm, on the brachioradialis muscle, for the occlusion one) and the source-detector separation was set to about 2 cm. For the resting-state acquisition, the probe was pressed on the forehead to minimize the heartbeat effect in the scalp [30]. The integration time was set to 10 ms and the acquisitions were performed at an average PCR of 1.7 Mcps per TDC (i.e., the high PCR operating condition). For the data analysis, we summed all TDC channels (up to bin 122) and then removed possible movement artifacts or trends by subtracting the moving average (70 samples) of the recorded signal. This operation highlights the fast oscillations possibly due to the heartbeat. Finally, a moving average over 10 samples was applied to clean up the signal. In the following, we will refer to this procedure as "band-pass filtering". To highlight the spectral components of the signal, we developed custom code based on the Fast Fourier Transform (FFT) algorithm (MATLAB 2022a, The MathWorks Inc., USA). Fig. 7(a) reports the signal obtained, after the above-mentioned band-pass filtering, for a resting-state measurement on the forehead. A periodic component is evident, whose periodicity is compatible with the heartbeat. Indeed, the FFT of the signal, reported in Fig. 7(b), clearly indicates a spectral component at about 1.3 Hz (corresponding to around 78 beats per minute).
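The "band-pass filtering" and the spectral analysis described above can be sketched as follows. This is a minimal Python/NumPy illustration and not the authors' MATLAB code: the edge handling of the moving average, the function names, and the synthetic trace (a 1.3 Hz oscillation with a slow drift and Gaussian noise, sampled every 10 ms) are all assumptions made for the example.

```python
import numpy as np

def moving_average(x, n):
    """Centered moving average over n samples. The edges are padded with the
    first/last value (an assumption) so the output keeps the length of x."""
    left = n // 2
    right = n - 1 - left
    padded = np.pad(np.asarray(x, dtype=float), (left, right), mode="edge")
    return np.convolve(padded, np.ones(n) / n, mode="valid")

def band_pass(x, long_win=70, short_win=10):
    """Subtract the 70-sample moving average (removes trends and slow
    artifacts), then smooth with a 10-sample moving average."""
    detrended = np.asarray(x, dtype=float) - moving_average(x, long_win)
    return moving_average(detrended, short_win)

def dominant_frequency(x, fs):
    """Frequency (Hz) of the largest non-DC component of the FFT magnitude."""
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    k = int(np.argmax(spectrum[1:])) + 1   # skip the DC bin
    return float(freqs[k])

# Synthetic, hypothetical trace: one sample every 10 ms for 30 s.
fs = 100.0
t = np.arange(3000) / fs
rng = np.random.default_rng(0)
mean_counts = 4.35e6                                          # counts per sample
heartbeat = 0.002 * mean_counts * np.sin(2 * np.pi * 1.3 * t)
drift = 1.0e4 * t / t[-1]                                     # slow trend
noise = rng.normal(0.0, np.sqrt(mean_counts), t.size)
signal = mean_counts + heartbeat + drift + noise

filtered = band_pass(signal)
f_peak = dominant_frequency(filtered, fs)   # expected to fall near 1.3 Hz
bpm = 60.0 * f_peak                         # about 78 beats per minute
```

With a 10 ms integration time the sampling rate is 100 Hz, so the 70-sample window spans 0.7 s and the 10-sample window spans 0.1 s; a 1.3 Hz oscillation passes through both stages with only modest attenuation.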
Finally, we performed an arterial occlusion on the same subject with the following task: 30 s of rest, 120 s of occlusion, and then 120 s of recovery. The block of the arterial (and venous) flow was induced using the cuff of a sphygmomanometer inflated to 250 mmHg. As can be seen in Fig. 8(a), the contrast (computed on the signal after applying a moving average over 70 samples, equivalent to a low-pass filtering, and taking as the homogeneous reference the average of the counts in the first 30 s) increases during the occlusion phase. This is due to the decrease in the number of counts resulting from the conversion of oxy- to deoxy-hemoglobin (the total hemoglobin remains constant). Indeed, due to the use of a single wavelength, it is not possible to better distinguish those two components, even though the chosen wavelength (670 nm) is more sensitive to de-oxygenated hemoglobin.
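The contrast trace described above can be sketched in a few lines of Python. The sign convention (N0 − N)/N0, the use of the unfiltered counts for the 30 s baseline, the shrinking window at the edges, and the synthetic count values are assumptions made for illustration; only the 70-sample window, the 10 ms sampling, and the 30/120/120 s task come from the text.

```python
def moving_average(values, window):
    """Centered moving average; the window shrinks at the edges (assumption)."""
    half = window // 2
    csum = [0.0]
    for v in values:
        csum.append(csum[-1] + v)
    n = len(values)
    out = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i - half + window)
        out.append((csum[hi] - csum[lo]) / (hi - lo))
    return out

def contrast_trace(counts, dt_s, baseline_s=30.0, window=70):
    """Contrast of the low-pass filtered counts with respect to the average
    of the counts recorded during the initial rest period."""
    n_base = int(round(baseline_s / dt_s))
    n0 = sum(counts[:n_base]) / n_base
    smoothed = moving_average(counts, window)
    return [(n0 - n) / n0 for n in smoothed]

# Hypothetical counts: 30 s rest, 120 s occlusion with a slow 10% drop,
# 120 s recovery at the initial level, one sample every 10 ms.
dt = 0.010
rest = [1000.0] * 3000
occlusion = [1000.0 - 100.0 * k / 12000 for k in range(12000)]
recovery = [1000.0] * 12000
counts = rest + occlusion + recovery

trace = contrast_trace(counts, dt)
```

In this toy trace the contrast stays at zero during the rest phase and grows as the counts decrease during the occlusion, mirroring the qualitative behavior described for Fig. 8(a).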
Fig. 8(b) reports the band-pass filtered signal. During the resting and the recovery phases, fast oscillations (compatible with the heartbeat) are present. Then, for each phase, a region of nearly 25 s was taken to compute the FFT (colored regions in Fig. 8(b)). Fig. 8(c), (d), and (e) report the spectral components computed during the rest, occlusion, and recovery phases, respectively. It is evident that a spectral component at about 1.3 Hz is present in both the rest and recovery phases, while it disappears during the occlusion phase. This is expected since, during the occlusion, the blood circulation is blocked, thus preventing the observation of any periodic signal due to the pumping of the blood. Moreover, in the region between 150 and 160 s (corresponding to the release of the pressure induced by the cuff), a non-periodic behavior can be noticed. This can be due to the release of the occlusion, which first restores the arterial circulation and only afterwards the venous one, thus causing a fast wash-in of oxyhemoglobin without the corresponding outflow of de-oxyhemoglobin.

V. CONCLUSION AND FUTURE PERSPECTIVES
In this work we present a device featuring 8 × 256 SPADs with 256 TDCs, thus simultaneously providing high light harvesting efficiency and data throughput in time-domain diffuse optics applications. We characterize it using internationally shared protocols for the performance assessment of instruments for diffuse optics (BIP, MEDPHOT and nEUROPt). As for the light harvesting capability, this device exceeds by almost one order of magnitude other systems operating with short acquisition times (such as [11] and [12]), thus possibly improving the achievable performance in terms of, for example, depth sensitivity. Indeed, considering the penetration depth assessed using the nEUROPt protocol with 10 ms integration time, the proposed device working in the high PCR regime (i.e., 1.7 Mcps on average for each TDC, thus overall about 435 million cps with a laser pulse rate limited to 2 MHz) can reach the noticeable depth of 35 mm. Moreover, with a standard integration time (1 s), a penetration depth close to 45 mm could potentially be achieved, which probably represents the highest value reported for time-domain diffuse optics instruments. The device has also been characterized using the MEDPHOT protocol. The minimum integration time needed to achieve a CV of 1% is 5 ms in the "high PCR" operating condition. In this condition, the linearity in the retrieval of μa is in line with state-of-the-art instruments despite the short acquisition time. Further, the coupling between the measured μa and the conventionally true μs of the phantom is found to be negligible.
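The throughput figures quoted for the high PCR regime follow from simple arithmetic, sketched below in Python. The variable names are hypothetical; the saturated rate assumes that each TDC can register at most one photon per laser pulse, which is consistent with the 512 Mcps quoted in the abstract but is stated here as an assumption.

```python
N_TDC = 256                  # parallel time-to-digital converters
LASER_RATE_HZ = 2.0e6        # laser pulse rate (currently limited to 2 MHz)
PCR_PER_TDC_CPS = 1.7e6      # average photon counting rate per TDC
INTEGRATION_TIME_S = 0.010   # 10 ms acquisition

def overall_pcr(n_tdc, pcr_per_tdc):
    """Overall counting rate obtained by summing all TDC channels."""
    return n_tdc * pcr_per_tdc

# Overall PCR in the high PCR regime: 256 x 1.7 Mcps = 435.2 Mcps.
total_pcr = overall_pcr(N_TDC, PCR_PER_TDC_CPS)

# Fraction of laser pulses yielding a detected photon at each TDC: 0.85.
fraction_of_laser_rate = PCR_PER_TDC_CPS / LASER_RATE_HZ

# Assumed saturation limit (one photon per pulse per TDC): 512 Mcps.
saturated_pcr = overall_pcr(N_TDC, LASER_RATE_HZ)

# Photons collected in a single 10 ms acquisition: about 4.35 million.
counts_per_acquisition = total_pcr * INTEGRATION_TIME_S
```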
It is worth noting that such results (obtained with short integration times) in terms of achievable penetration depth and linearity in the retrieval of μa are in line with (or even better than) those achievable with cutting-edge systems (obtained with standard integration times) featuring a much larger diffuse optical responsivity (e.g., [21], [22], [23]). This has been possible thanks to the high PCR condition, which enhanced the achievable performance despite the relatively low diffuse optical responsivity of the proposed system.
The reported results, obtained with the protocols at low integration times (down to 5 ms), open the way to several possible applications, such as shortening the duration of spectroscopy measurements (e.g., optical mammography [31]) or following fast dynamics. As a first proof-of-principle demonstration, we record the signal in the resting state on the forehead and then on the arm during an arterial occlusion. In the former case, we are able to retrieve, after proper band-pass filtering, a periodic signal compatible with the heartbeat. In the latter, the same periodic signal can be retrieved during the rest and recovery phases, while it disappears during the occlusion, when the blood supply is prevented by the block of both venous and arterial perfusion.
In the future, the limitation on the laser pulse rate (currently at 2 MHz) could be overcome, thus reaching a value of tens of MHz, with an improvement of more than a factor of 10 in the achievable throughput (i.e., ∼5 Gcps). Further, direct gated acquisition acting on the SPAD voltage may also become possible with a dedicated chip redesign. However, this would be beneficial only when the ungated detector is saturated at full laser power, which is not the case for the present work. Moreover, the use of SPADs with a shorter IRF tail may open the way to the application of new approaches, such as the short source-detector separation one, thus improving the achievable performance in terms of contrast and spatial resolution [2]. On the other hand, a possible increase of the detector area will enable an increased source-detector separation, thus possibly also improving the achievable performance in terms of depth sensitivity.

She is currently an Assistant Professor with the Department of Physics, Politecnico di Milano. She has authored more than 80 papers in international peer-reviewed journals and conference proceedings. Her activity is mainly focused on the study and application of a new approach and instrumentation for time-domain optical spectroscopy of highly scattering media using single-photon detectors.

Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.

Manuscript received 27 January 2023; revised 11 July 2023; accepted 17 July 2023. Date of publication 24 July 2023; date of current version 15 December 2023. This work was supported in part by the Laserlab-Europe EU-H2020 Project funded by the EU under Grant 871124 and in part by the Academy of Finland, under Grants 323719 and 330121. (Corresponding author: Laura Di Sieno.)

"High PCR (all TDCs)": obtained by setting a PCR per TDC of ∼1.7 Mcps, corresponding to ∼85% of the laser pulse rate at each TDC, and summing up all TDCs, thus resulting in an overall PCR of ∼435 Mcps;

Fig. 3. CV recovered for increasing number of laser cycles in the 3 operating conditions (colors and symbols).

Fig. 5. Representation of the average DTOF acquired in the homogeneous condition for all the 3 operative conditions (colors). The colored rectangles represent the time-gates used for contrast and CNR calculation.

Fig. 6. Contrast (left) and CNR (right) graphs obtained for all the 3 operating conditions (rows) computed for different time-gates (colors and symbols). The gray regions highlight conditions where, conventionally, the detectability is lost. Time-gates where CNR was lower than 1 for all the depths are not displayed.

Fig. 7. Measurement of resting-state with the probe placed on the forehead: signal obtained after band-pass filtering the recorded data (a) and its spectral components recovered with FFT analysis (b).

Fig. 8. Arterial occlusion measurement. Contrast computed on the low-pass filtered data (a); signal obtained after band-pass filtering the recorded data (b) and its spectral components computed with FFT analysis on about 25 s taken in three different phases of the measurement (colored rectangles in graph (b)): during the resting phase (c), occlusion (d), and recovery (e).

Laura Di Sieno was born in Varese, Italy, in 1987. She received the master's degree in electronics engineering and the Ph.D. degree in physics from the Politecnico di Milano, Milan, Italy, in 2011 and 2015, respectively.

Tuomo Talala received the M.Sc. degree in mathematics and the M.Sc. degree in electrical engineering in 2010 and 2018, respectively, from the University of Oulu, Oulu, Finland, where he is currently working toward the Ph.D. degree in electrical engineering with the Circuits and Systems Research Unit. His research interests include the development of integrated sensors and data post-processing techniques for time-resolved optical measurements.

Elisabetta Avanzi was born in Segrate, Italy, in 1996. She received the M.S. degree in physics engineering in 2020 from the Politecnico di Milano, Milan, Italy, where she is currently working toward the Ph.D. degree in physics. Her research interests include the design, validation, and application of time-resolved diffuse optical spectroscopy components and systems.

Ilkka Nissinen (Member, IEEE) received the M.Sc. (Eng.) and Dr.Tech. degrees in electrical engineering from the University of Oulu, Oulu, Finland, in 2002 and 2011, respectively. Since 2018, he has been an Associate Professor of analog and mixed-signal microelectronic circuit design with the Circuits and Systems Research Unit, University of Oulu. His research interests include the design of time interval measurement architecture for the integrated sensors of pulsed time-of-flight laser technologies. Since 2019, he has been a Member of the Technical Program Committee of the IEEE Nordic Circuits and Systems Conference.