Onsite Non-Line-of-Sight Imaging via Online Calibration

There has been increasing interest in deploying non-line-of-sight (NLOS) imaging systems for recovering objects hidden behind corners. Existing solutions need to calibrate the imaging system using auxiliary apparatus and additional detectors. We present an online calibration technique that directly decouples the transients, which are acquired by onsite scanning on a relay surface, into line-of-sight (LOS) and hidden components. We use the former to directly (re-)calibrate the system upon changes of scene-surface configurations, scannable regions, and sampling patterns, and the latter for hidden object recovery via spatial-, frequency-, or learning-based techniques. We also calculate a Gamma map from the LOS component to preview calibration effects for accurate transient measurements. The entire process of our calibration for 64 scanning points takes no more than 14 seconds on an Intel i7-6600H CPU. In particular, our technique avoids using auxiliary calibration tools such as mirrors or checkerboards and supports both uniform and non-uniform sampling in an onsite NLOS imaging system. Comprehensive experiments via calibration evaluation and NLOS reconstruction demonstrate the efficiency and effectiveness of our solution. We have also made our data and code open-source on GitHub for the research community.


I. INTRODUCTION
Non-line-of-sight (NLOS) imaging aims at recovering objects outside the direct line of sight of a sensor [1], [2]. Most active NLOS imaging systems exploit an ultra-fast pulsed laser beam that can be controlled to direct light toward a relay surface (e.g., a wall). A companion time-resolved detector then collects the arrival time and number of photons that return after the first and third of three bounces: off the relay surface, off the hidden objects, and back off the relay surface. Fig. 1 illustrates a top view of a non-confocal (or conventional) NLOS imaging scenario. The first bounce corresponds to direct reflection in the line-of-sight (LOS) scene. By removing or gating the photons from the first bounce, photons from the third bounce are employed to reconstruct [3], [4], [5], [6], [7], [8], [9], [10] or to localize [5], [6], [10], [11], [12], [13] the hidden objects. Potential applications are numerous, including autonomous driving, remote sensing, and biomedical imaging.
To achieve accurate measurements in accordance with the physics-based imaging model, an onsite setup of an NLOS imaging system requires laborious calibration. Streak camera-based systems provide temporal resolution down to 2 picoseconds (ps), or 0.6 mm, but are difficult to calibrate under nonlinear temporal-spatial transforms [3], [9], [14], [15]. In recent years, single photon avalanche diodes (SPADs) have served as an affordable and convenient alternative [4], [5], [6], [7], [10], [16], [17]. A single-pixel SPAD, which is coupled with a time-correlated single photon counting (TCSPC) device, produces a transient, i.e., a histogram of photon counts versus time bins of 4 ps, at a detection point.
Nearly all existing NLOS calibration schemes require auxiliary apparatus. Buttafava et al. [4] and Ahn et al. [18] (Ahn [18]) exploit a digital camera and a checkerboard (or a regular grid) to estimate camera parameters, the relay surface, and 3D coordinates of detection points based on computer vision (CV) techniques. Klein et al. [19] use mirrors to establish correspondences of the laser spots and the detection points for simultaneously estimating the mirror plane and the relay surface. When the configuration of the imaging system is readjusted, these methods repeat the whole calibration process, which is both labor-intensive and time-consuming for onsite deployment.
In this paper, we present an online calibration technique for a SPAD-based NLOS imaging system. Our technique needs no additional tools such as mirrors or a checkerboard. We decouple an acquired transient into LOS and hidden components. We then exploit the LOS component to calibrate the relay surface and calculate a Gamma map. By introducing the Gamma map, our calibration technique can preview the calibration effects when readjusting the hardware devices. Many fast Fourier transform (FFT)-based algorithms, e.g., LCT [5], f-k [6], and PF [7], require transients measured in a regular grid as input, whereas learning-based methods, such as neural transient fields (NeTF) [20], allow flexible sampling patterns. Isogawa et al. [8] present a circular sampling pattern, and Jiang et al. [21] recently propose a ring and radius sampling pattern for the Rayleigh-Sommerfeld diffraction algorithm. These patterns need fewer measurements than a regular grid and thus enable memory-efficient NLOS reconstruction. Our online calibration technique supports arbitrary sampling in our NLOS imaging system. Thus, we can sample detection points uniformly, e.g., in an evenly spaced regular grid or a ring and radius pattern (i.e., a concentric-circle pattern), or non-uniformly on the relay surface.
In summary, our major contributions are as follows:
• We present a novel calibration technique for NLOS imaging systems. Nearly all existing calibration schemes require known targets, e.g., mirrors and a checkerboard, and an additional detector. In contrast, our technique exploits the LOS components of transients to calibrate the relay surface and introduces a Gamma map to preview calibration effects, requiring no auxiliary apparatus. Our entire calibration process for 64 scanning points takes approximately 20 seconds (s), which includes 6 s to calibrate the galvanometer, on a laptop personal computer (PC) with an Intel i7-6600H CPU. The calibration time is markedly reduced in comparison with roughly 15 minutes using the methods with auxiliaries.
• We provide detailed analysis of our calibration procedures. In particular, the galvanometer calibration, which needs to be performed only once for the entire system, establishes the correspondences between the galvanometer voltages and the coordinates of scanning points on the relay surface. We achieve a galvanometer calibration accuracy of 0.002° (or 0.001 V), i.e., the difference between the theoretical and measured optical scanning angles. With accurate calibration of the galvanometer and the relay wall, our calibration procedures enable high-accuracy galvanometer-wall mapping for both uniform and non-uniform sampling in the NLOS imaging system.
• We show that our calibration technique is effective and that the transient measurements from our calibrated system are sufficient for NLOS reconstruction. Our calibration accuracy is an average of 3.5 mm in terms of root-mean-square error (RMSE) for detection point coordinates with respect to the 4 mm laser spot on the relay surface, which is emitted 1500 mm away. The NLOS reconstruction results using state-of-the-art (SOTA) methods for uniform and non-uniform sampling demonstrate the efficacy of our solution.
II. RELATED WORK
Substantial efforts have been made to improve NLOS imaging systems. For high-accuracy measurements, the entire setup of the system needs to be calibrated such that the geometric deployment of devices and hidden objects is known, which makes downstream tasks such as reconstruction and tracking of the hidden objects reliable. Here, we briefly review the most relevant prior work and refer readers to recent surveys [1], [2], [22] for a comprehensive overview.
SPAD-based NLOS imaging systems. Buttafava et al. [4] build the first SPAD-based NLOS imaging system with a femtosecond (fs) laser. The main factors of a SPAD and a companion laser define the signal-to-noise ratio and the temporal resolution of the system. Table I lists important parameters of several single-pixel SPAD-based NLOS imaging systems in the literature. Two major types of affordable SPADs differ in their material properties: silicon-based SPADs cover the visible spectrum, e.g., 400-800 nm, with full width at half maximum (FWHM) of tens of ps [5], [6], [7], whereas InGaAs/InP-based SPADs cover the infrared spectrum with FWHM of approximately 200 ps [10], [11], [19], [23]. Wu et al. [10] exploit InGaAs/InP SPADs to construct a long-range NLOS imaging system, over 1.43 km.
We opt for a silicon-based single-pixel SPAD with a fast-gating mode, which can switch off the direct light paths between the relay surface and the SPAD. A 2D SPAD array simultaneously records transients of many pixels, but is often composed of single SPADs that have a small active area (of 6.95 μm × 6.95 μm) and low temporal resolution (hundreds of ps) due to the complicated fabrication [11], [16], [24]. Nam et al. [25] and Peng et al. [26] tailor a 16 × 1 SPAD array with fast-gated SPADs, which is not commercially available. The Stanford NLOS imaging system [5], [27] is improved with a fast-gated SPAD and a stronger laser [6]. Their systems, as well as ours, are built under a confocal setup by using a beam splitter that locates the laser and the SPAD co-axially. Liu et al. [7] exploit similar hardware devices to scan 180 × 130 laser spots while pointing the SPAD at one fixed point on the relay surface. Ahn [18] and Xin et al. [28] use two sets of 2D galvanometers to separately control illumination and detection points on the relay surface. Their system can perform both confocal and non-confocal scanning.
Calibration schemes for NLOS imaging systems. Klein et al. [19] employ a 32 × 32 InGaAs/InP SPAD array to construct an NLOS imaging system under the non-confocal setting. Analogous to the classical calibration of a digital camera in the CV community, they propose a calibration scheme with mirrors as known targets placed at different positions in an NLOS scene.
Using an extra digital camera, Buttafava et al. [4] take pictures of laser spots in a regular grid with known spacing on the relay surface while Ahn [18] take pictures of a checkerboard with known size. From these pictures, the position and orientation of the relay surface and the 3D coordinates of the laser spots can then be estimated using CV techniques. Ahn [18] also calibrate the galvanometers by learning wall-to-camera mapping and support non-uniform sampling, but do not provide a detailed description or experiments. In contrast, we need no mirrors, checkerboard, or additional camera for the whole calibration and offer detailed analysis of calibration procedures, including the galvanometer calibration. Ou et al. [29] present a computational adaptive optical method to correct aberrations in a terahertz-band NLOS imaging system. Their method is applicable to data undistortion after NLOS measurements. As well as planar surfaces, Lindell et al. [6] consider a non-planar relay surface by pre-processing the recovery of its depth variation for a virtual planar surface. Manna et al. [30] account for a non-planar and dynamic relay surface by using a second free-running SPAD to calculate the illumination positions from LOS measurements. Similar to Lindell et al. [6], our technique is theoretically extendable to tackle a non-planar relay surface by estimating 3D coordinates of scanning points and tailoring an algorithm to estimate a virtual planar surface. Learning-based algorithms, e.g., [20], [31], [32], [33], implicitly optimize NLOS reconstruction with minor calibration error, but produce poor results from calibration-free measurements.

A. NLOS Imaging Models
An NLOS imaging formulation models the physics of light traveling from a laser to a detector via LOS and hidden objects. As shown in Fig. 1, the pulsed laser beam $o$ illuminates a spot $l$ on a relay surface $W$. After the light scatters off the spot, some photons bounce off a point $p$ on the hidden object $P$ and travel back onto a patch $s$ on the relay surface. The detector $d$ collects a number of photons from the patch at a time instant $t$. Based on the physics of light transport [34], we define the image formation model in (1), where $\tau(t; o, l, s, d)$ records a 5D transient, which is a histogram of the number of photons that travel back from the hidden object at $t$. The integral over $P$ represents a summation of the photons that travel back from a small area $A_p$ centered at the hidden point $p$. The Dirac delta function $\delta(\cdot)$ relates the time $t$ to the travel times $t_{o\to l}$, $t_{l\to p}$, $t_{p\to s}$, and $t_{s\to d}$. The function $f(p; \omega_{l\to p}, \omega_{p\to s})$ describes the bidirectional reflectance distribution function (BRDF) of a point $p$ with the incident and exit directions $\omega_{l\to p}$ and $\omega_{p\to s}$. The unit vector $\omega_{a\to b} = \frac{b-a}{|b-a|}$ denotes the direction from the input argument $a$ to $b$. The function $g(p; l, s)$ is an attenuation term dependent on the distance, shading, and visibility effects due to the surface normals of $p$, $l$, and $s$. For simplicity, (1) neglects the wave properties of light, e.g., diffraction and interference. We provide its complete derivation in Supplementary Material. $\Gamma(s)$, defined in (2), models the intensity variation of the light after scattering off the relay surface, and restricts the field of view (FOV) of the detector. In (2), $N_o$ is the number of photons emitted in one pulse of light from $o$. $\rho$ denotes the albedo of the relay surface. The coefficients $A_d$ and $A_s$ are the active area of the detector and the area mapped on the relay surface, respectively. $n_s$ represents the surface normal vector at $s$.
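A form of (1) that is consistent with the symbol definitions in this paragraph can be sketched as follows. It is a reconstruction under assumptions rather than a verbatim statement of the model: the placement of $\Gamma(s)$ as a factor outside the integral and the sign convention inside $\delta(\cdot)$ are assumed here, and $\Gamma(s)$ itself is left as defined in (2).

```latex
\tau(t; o, l, s, d) = \Gamma(s) \int_{P}
  f(p; \omega_{l\to p}, \omega_{p\to s})\, g(p; l, s)\,
  \delta\!\left(t - t_{o\to l} - t_{l\to p} - t_{p\to s} - t_{s\to d}\right)
  \mathrm{d}A_p
```

Each travel time is a path length divided by the speed of light $c$, e.g., $t_{l\to p} = |p - l| / c$.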
Under the non-confocal setting, illumination and detection points are considered to be the foci of an ellipsoid, where either both points vary, or one point remains static and the other varies. Calibration procedures are complicated because the travel times $t_{o\to l}$ and $t_{s\to d}$ are difficult to separate from the measurements alone. When illumination and detection points coincide, i.e., $l = s$ and $o = d$, the recorded transient constitutes a 3D subset of $\tau(t; o, l, s, d)$. The confocal imaging model, first proposed by O'Toole et al. [5], is thus simplified as in (3). The confocal imaging model has advantages in terms of system calibration: the travel time between $s$ and $o$ and the coordinates of $s$ are easily determined. $\tau(t; d, s)$ contains two components: $\tau_{\mathrm{LOS}}(t; d, s)$ along direct light paths between the relay surface and the detector, and $\tau_{\mathrm{hidden}}(t; s)$ along indirect light paths between the relay surface and the hidden object. These two components are convolved and can be separated from $\tau(t; d, s)$, resulting in (4) and (5). In practice, the LOS component is often gated or removed to obtain the hidden component $\tau_{\mathrm{hidden}}$ for NLOS reconstruction by assuming a virtual light source at $l$ and a virtual detector at $s$. $\tau_{\mathrm{LOS}}$ contains information, e.g., depth and reflectance, on the relay surface and enables NLOS imaging system calibration.
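The idea of splitting a confocal transient into its two components can be illustrated with a simple time-gating sketch. This is not the decoupling procedure of this paper, only a hypothetical illustration: it assumes that the first-bounce return is the strongest peak of the histogram and that a fixed window of `gate_bins` bins around that peak (an assumed parameter) fully contains the LOS component. The function name `split_transient` is likewise illustrative.

```python
import numpy as np

def split_transient(tau, gate_bins=50):
    """Split a 1D photon-count histogram into LOS and hidden parts by time gating."""
    tau = np.asarray(tau, dtype=float)
    peak = int(np.argmax(tau))                  # assumption: first bounce is the global maximum
    start = max(peak - gate_bins, 0)
    stop = min(peak + gate_bins + 1, tau.size)
    los = np.zeros_like(tau)
    los[start:stop] = tau[start:stop]           # bins inside the gate form the LOS part
    hidden = tau - los                          # all remaining bins form the hidden part
    return los, hidden, peak
```

Shifting the hidden part by `peak` bins moves its time origin to the relay surface, which corresponds to the virtual source and virtual detector assumption mentioned above.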

B. SPAD-Based NLOS Imaging System
We construct an NLOS imaging system in accordance with the confocal imaging model. Fig. 2 shows the overview of our imaging system with the opto-electric design of the hardware devices (a), which contains: a pulsed laser, a single-pixel SPAD, a beam splitter, a 2D galvanometer, and accessories.
The SPAD model. The real transients captured from the NLOS imaging system are influenced by the properties of the SPAD, including photon detection efficiency, afterpulsing, and pile-up, and by the temporal jitter of the laser and the SPAD. The temporal jitter models uncertainty in the time-resolving mechanism. While Hernandez et al. [35] introduce a computational model for a SPAD, we opt for the approximation model in [36] to describe the probability of detecting individual photon events in a histogram bin as a Poisson distribution $\mathrm{Pois}$. The bias $b$ is usually considered to be independent of time at a detection point, and is due to the ambient light and the dark count of the SPAD. With these factors, we formulate the transient recorded with a SPAD coupled with a TCSPC as in (6), where $j$ represents the temporal jitter of the entire system. Following [37], the temporal jitter typically yields a curve having two parts, a Gaussian peak and an exponential tail, as in (7), where $\mu$, $\sigma$, $\kappa_0$, and $\kappa_1$ are the coefficients of the temporal jitter, and $\gamma$ is the weight of the exponential term (see Supplementary Material for details). The LOS transient $\tau^{\mathrm{SPAD}}_{\mathrm{LOS}}$ is exploited to calculate the temporal jitter $j$ of our system from the cross-entropy loss $L_{\mathrm{SPAD}}$ in (8). We compute the temporal jitter at several detection points and consider the average value as the temporal jitter of our capture system.

System prototype. Fig. 3 shows a photo of our imaging system prototype. We employ a fast-gated SPAD from MPD, which provides 4 ps temporal resolution with a PicoHarp 300. An achromatic lens (Canon EF 50 mm f/1.8) is exploited to focus the returning light onto the SPAD. A laser (SuperK EXTREME FIU-15) with a wavelength-tunable filter (SuperK VARIA) emits collimated light through a polarized beam splitter (Thorlabs VA5-PBS251). The light is then guided by a 2D galvanometer (Thorlabs GVS012). The SPAD records the indirect light from a hidden object by gating the direct light using a delayer (PicoQuant PSD-065-A-MOD) with gate width 12 ns. The temporal jitter of our entire system is approximately 200 ps. Our hardware devices are located 1.5 m away from the relay wall, a melamine white panel whose position and orientation can be adjusted in the laboratory.

Fig. 3. Photo of our NLOS imaging system prototype. Our system consists of a fast-gated SPAD, an ultra-fast pulsed laser with a wavelength-tunable filter, a 2D galvanometer, a beam splitter, and a relay wall.
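The SPAD measurement model described above (a transient blurred by the system jitter, offset by a constant bias, and sampled with Poisson statistics) can be simulated with a short sketch. The jitter kernel below is only an assumed parameterization of "a Gaussian peak plus an exponential tail"; the exact expression and its coefficients are those of [37] and the Supplementary Material and are not reproduced here. The names `jitter_kernel` and `simulate_spad` and all default values are illustrative.

```python
import numpy as np

def jitter_kernel(n_bins=401, bin_ps=4.0, sigma_ps=30.0, tail_ps=100.0, gamma=0.1):
    """Assumed jitter shape: Gaussian peak plus one-sided exponential tail, unit sum."""
    t = (np.arange(n_bins) - n_bins // 2) * bin_ps           # time axis centered at 0 ps
    gauss = np.exp(-0.5 * (t / sigma_ps) ** 2)
    tail = np.where(t >= 0, np.exp(-np.clip(t, 0, None) / tail_ps), 0.0)
    j = (1.0 - gamma) * gauss / gauss.sum() + gamma * tail / tail.sum()
    return j / j.sum()

def simulate_spad(tau, bias=0.05, rng=None, **kernel_args):
    """Draw a noisy histogram from Poisson(tau convolved with jitter, plus bias)."""
    rng = np.random.default_rng() if rng is None else rng
    j = jitter_kernel(**kernel_args)
    rate = np.convolve(np.asarray(tau, dtype=float), j, mode="same") + bias
    return rng.poisson(rate)
```

Because the kernel sums to one, the expected total photon count is preserved up to the bias, which is the property the Gamma map relies on.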

IV. ONLINE CALIBRATION
We focus on calibration of an NLOS imaging system for accurate measurements by minimizing the difference between the transients that are captured from the setup and those that are theoretically computed based on the imaging model. Our online calibration comprises calibration procedures for the galvanometer, the relay wall, and the NLOS bounding box of the system, and employs Gamma maps to preview the calibration effects, as shown in Fig. 4.

A. Gamma Map
In (2), we observe that, physically, $\Gamma(s)$ is related to a laser with $N_o$, a detector with $A_d$, and a relay wall with $\rho/\pi$ at the detection points $s$. It also relates the detector to the relay wall via the mapping from $A_d$ to $A_s$, and via the distance $|s - d|$. When the detector $d$ is fixed, we can define $\Gamma(s)$ by determining the coordinates of detection points $s$ on the relay wall. Based on the confocal imaging model, we calculate $\Gamma(s)$ as the summation of the LOS component $\tau_{\mathrm{LOS}}(t; d, s)$ in (5) at each detection point because the integral of $\delta$ is one. $\Gamma(s)$ for all detection points scanned on the relay wall, or a Gamma map, is calculated as in (9). As shown in (9), the Gamma map can also be considered as a steady-state image of the relay wall with the laser as a light source. In (6) for a SPAD-based capture system, similar to the delta function $\delta$ in (5), the integral of the temporal jitter $j$ is one, and the Gamma map is thus identical to that calculated from (5). Even in the infrared spectrum, we can also calculate Gamma maps from LOS transients measured with SPADs since the Poisson distribution is additive and the bias $b$ is small with respect to direct light. We employ the Gamma map for our online calibration to preview scannable regions, the FOVs of scanning patterns, and the NLOS bounding box, and to normalize the transients $\tau(t; d, s)$ for high-quality reconstruction.
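Since the Gamma map is the per-point sum of the LOS component over time, it can be computed directly from a stack of LOS histograms. The sketch below assumes a hypothetical array `tau_los` of shape `(num_points, num_bins)` and an optional constant bias per bin. Dividing each transient by its Gamma value is one plausible reading of "normalize the transients" and is an assumption, not a statement of the normalization actually used.

```python
import numpy as np

def gamma_map(tau_los, bias=0.0):
    """Sum each LOS histogram over time bins after removing a constant bias."""
    tau_los = np.asarray(tau_los, dtype=float)
    return np.clip(tau_los - bias, 0.0, None).sum(axis=1)

def normalize_transients(tau, gamma, eps=1e-12):
    """Divide each transient by its Gamma value (assumed normalization)."""
    tau = np.asarray(tau, dtype=float)
    return tau / np.maximum(np.asarray(gamma, dtype=float), eps)[:, None]
```

Reshaping the returned vector to the scan layout, e.g., `gamma_map(tau_los).reshape(8, 8)` for an 8 × 8 grid, gives an image that can be inspected like a steady-state photograph of the relay wall.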

B. Calibration Procedures
Hardware alignment. The laser and the SPAD are first aligned by observing the speckle of the laser beam that illuminates a surface, e.g., a sheet of white paper. We then situate the beam splitter, the galvanometer, and the relay wall such that the speckle is clear at its focus on the relay wall. Because light from a high-power laser that is directly reflected on the relay wall may overwhelm the SPAD, we slightly misalign the positions of the laser and the SPAD to reduce the measured intensity of the direct reflection.
Galvanometer calibration. A dual-axis galvanometer supports optical scanning angles of about ±40°, depending on several factors such as the laser beam diameter and the input voltage. In general, the galvanometer is coupled with two servo motors, which offer the feedback angles of the mirrors while scanning. In the NLOS literature, few prior studies have mentioned galvanometer calibration. We exploit the feedback between input voltages and angles to calibrate the galvanometer and to further determine the positions of scanning points. Fig. 2(b) illustrates the coordinate systems and scanning regions. We use two Cartesian coordinate systems: $XYZ$, with its origin $o$ where the laser beam is emitted into free space toward the relay wall, and $xyz$, with its origin at the center of the detection area on the relay wall, where $z = 0$ when the relay wall is planar.
To calibrate the galvanometer, we assume that the scanning system is linear, and formulate the relationship between the optical scanning angles $\theta_X$ and $\theta_Y$ and the input voltages $V_X$ and $V_Y$ as in (10), where the initial angles $\epsilon_X$ and $\epsilon_Y$ are determined by the offset arrangement of the two mirrors prior to voltages input to the galvanometer, and $\beta$ represents the coefficients of the angle pairs with respect to the input voltages. For clarity, we rewrite (10) into $\theta = \epsilon + \beta V$. The initial angles $\epsilon$ and the coefficients $\beta$ may be offered by the manufacturers, whereas for precision, we collect $N$ groups of optical scanning angles $\theta_n$ along with input voltages $V_n$ to obtain $\epsilon$ and $\beta$ in our system. The coefficients $\beta$ are first optimized using the multiple linear regression algorithm with a loss function $L_{\mathrm{Galvo}}$, given in (11). Using the calculated $\beta$, we consider $\epsilon$ to be the average error between the theoretical and measured optical scanning angles. We then achieve the correspondences between the optical scanning angles and the input voltages, which enable us to determine the voltages input to the galvanometer for desired scanning angles. Thus, we can define the scannable region of the galvanometer, which is shown as the yellow area in Fig. 2(b) but is not necessarily rectangular.

Relay wall calibration. The relay wall plays a critical role for an NLOS capture system. The LOS component of a transient, as in (5), can be exploited to recover the albedo and the orientation of the relay wall. Specifically, we extract the peak of the histogram $\tau_{\mathrm{LOS}}(t; s)$ at each detection point and the corresponding $t$, and calculate the depth of the point on the relay wall. We then employ the optical scanning angles $\theta_X$ and $\theta_Y$ of the galvanometer and the depth to estimate 3D coordinates of the detection points in $XYZ$, as in (12). The relay wall $W(W_X, W_Y, W_Z)$ is considered to be planar and is formulated as in (13). We notice that the albedo and the orientation of the relay wall are involved in the Gamma map.
Our calibration technique therefore does not require any additional devices or textured targets by introducing the $\Gamma(s)$ map. We thus recover the relay wall when the loss function $L_W(W_X, W_Y, W_Z)$ in (14) is minimal, which is the RMSE of distances from $N$ points to the plane.

NLOS bounding box. The geometric setting of the hardware devices and the detection region on the relay wall greatly affect measurement efficiency and reconstruction accuracy. We define a bounding box in the NLOS imaging system to preview where an object may be situated in the hidden scene.
Ahn [18] have mentioned that the hidden volume should be within the orthogonal projection of the scanning region on the relay wall. In the laboratory, we make a free space to allow for large hidden objects by exploiting a relay wall whose orientation and position are adjustable. This setting is equivalent to adjusting the position of the entire setting of hardware devices when using our imaging system in the wild. The orthogonal projection of the scanning region thus restricts a measurable bounding box of the NLOS scene with the maximal width and height of the scanning region. The minimal depth of the bounding box is approximated as in (15), where $c$ is the speed of light and $t_{\mathrm{delay}}$ represents the delay we set on the delayer. Since the distance between the hidden object and the relay wall plays a significant role in the attenuation of photons, we consider it to be the maximal depth $z_{\max}$ of the bounding box, as in (16), where the minimal $\Gamma(s)$ limits the scanning region and the volume of the bounding box. We offer the derivation of $z_{\max}$ in Supplementary Material. The bounding box allows us to preview detectable sizes and positions of a hidden scene.

Sampling patterns. The scanning process relates the hardware devices to the relay wall and the NLOS bounding box, and can be used to double-check the effects of the entire system calibration. In the NLOS literature, illumination and detection points are usually distributed in a regular grid with evenly spaced points on the relay wall. In contrast, our imaging system can address uniform and non-uniform sampling by leveraging the correspondences between input voltages to the galvanometer and the coordinates of each detection point on the relay wall. Specifically, we first re-parameterize (13) with $w(w_X, w_Y, w_Z)$, as in (17). New sets of orthonormal bases $\hat{x}, \hat{y}, \hat{z}$ and $\hat{X}, \hat{Y}, \hat{Z}$ are constructed to normalize the surface normal of each detection point $(x, y, z)$, as in (18), where $\hat{z}$ is the unit vector of the surface normal.
Note that $z = 0$ for a planar relay wall, and the coordinates of a point $p$ on the hidden object $P$ can therefore be denoted with a value of $z$ such that the two coordinate systems for a detection point are transformed as in (19). With (10), (12), (18), and (19), we connect the input voltages of the galvanometer and the scanning angles with the coordinates of any detection point on the relay wall. The middle area (in pink) shown in Fig. 2(b) is an effective scanning region restricted by the occlusions between the galvanometer and the relay wall. We preview this area by calculating the Gamma map, $\Gamma_C(s)$, and select a region with the origin $o_s$ at its center on the relay wall, e.g., the inner area (in red), to define sampling patterns. We then calculate the Gamma map, $\Gamma_S(s)$, of the selected region. Both $\Gamma_C(s)$ and $\Gamma_S(s)$ are calculated using (9). Fig. 5 shows two examples of $\Gamma_C(s)$ (left) and $\Gamma_S(s)$ (right). For $\Gamma_C(s)$, we scan a small number of detection points to preview the scannable region and readjust where the relay wall is situated with respect to the galvanometer. For $\Gamma_S(s)$, we scan a larger number of detection points to verify the FOV determined by the sampling pattern for effective measurements and to normalize transients for accurate reconstruction.
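The chain from input voltages to wall coordinates described in this subsection can be sketched in a few functions. This is a minimal illustration under stated assumptions, not the implementation used here: the galvanometer fit estimates the offset and slope jointly by ordinary least squares (for a linear model the fitted offset equals the mean residual of the slope-only prediction); both mirrors are assumed to pivot at the origin $o$, so a ray direction is proportional to $(\tan\theta_X, \tan\theta_Y, 1)$; the peak time is assumed to be a round-trip time; and the plane is fitted by singular value decomposition, which minimizes the RMS point-to-plane distance. All function names are hypothetical.

```python
import numpy as np

C_MM_PER_PS = 0.299792458  # speed of light in mm per picosecond

def fit_galvo(volts, angles):
    """Least-squares fit of angles = eps + volts @ beta (angles in degrees)."""
    volts = np.asarray(volts, dtype=float)
    angles = np.asarray(angles, dtype=float)
    design = np.hstack([np.ones((len(volts), 1)), volts])   # columns: 1, V_X, V_Y
    coef, *_ = np.linalg.lstsq(design, angles, rcond=None)
    return coef[0], coef[1:]                                 # eps (2,), beta (2, 2)

def angles_to_points(angles_deg, peak_time_ps):
    """3D points in XYZ from scanning angles and round-trip peak times (assumed geometry)."""
    a = np.radians(np.asarray(angles_deg, dtype=float))
    dirs = np.column_stack([np.tan(a[:, 0]), np.tan(a[:, 1]), np.ones(len(a))])
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    depth = 0.5 * C_MM_PER_PS * np.asarray(peak_time_ps, dtype=float)
    return dirs * depth[:, None]

def fit_plane(points):
    """Plane through the centroid that minimizes the RMS point-to-plane distance."""
    points = np.asarray(points, dtype=float)
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]                                          # direction of least variance
    rmse = float(np.sqrt(np.mean(((points - centroid) @ normal) ** 2)))
    return normal, centroid, rmse

def point_to_volts(point, eps, beta):
    """Invert the chain: wall point in XYZ -> scanning angles -> input voltages."""
    x, y, z = point
    theta = np.degrees([np.arctan2(x, z), np.arctan2(y, z)])
    return np.linalg.solve(beta.T, theta - eps)
```

With these pieces, a desired detection point chosen on the fitted plane can be converted to the voltage pair that steers the galvanometer to it, which is what allows arbitrary (uniform or non-uniform) sampling patterns.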

V. EXPERIMENTS
We have conducted extensive experiments to qualitatively and quantitatively evaluate our online calibration technique and to validate the transients measured from our calibrated NLOS imaging system by reconstructing the hidden objects using SOTA methods. We have also carried out ablation studies to validate calibration effects of the relay wall.

A. Calibration Evaluation
Qualitative evaluation. We evaluate the performance of our online calibration technique by testing three sampling patterns: a non-uniform pattern and two uniform patterns in a regular grid and in concentric circles.
For the non-uniform sampling pattern, we randomly detect several points in different areas, and estimate the coordinates of the detection points $s$ and the equation of the relay wall. Newly computed coordinates $s_W$ of these points, as well as their corresponding input voltages, are then calculated on the estimated relay wall. Using the input voltages, the galvanometer is controlled to re-scan on the relay wall. Fig. 6(a) demonstrates the two groups of detection points: the randomly detected points $s$ are shown as blue reference crosses, and the re-scanned points $s_W$ are shown as red dots.
We also raster-scan detection points in a regular grid pattern with our system. $N \times N$ points are scanned in a region of $L \times L$, and the coordinates of each detection point $s(i, j)$ are represented as in (20). Fig. 6(b) shows a uniform sampling pattern with evenly spaced points (in red) on the relay wall. Note that the reference lines in blue for the regular grid are post-processed to identify the equidistant spaces between the scanned points.

We further raster-scan detection points in concentric circles. In a scanning region with the radius $R$, we define $N_r$ concentric circles, and $N_\varphi$ points on each circle. The coordinates of the detection points $s(i, j)$ are then determined as in (21). The spaces between the circles and between the points on each circle may be equal or unequal. Specifically, as shown in Fig. 6(c), we scan 32 detection points (in red) on 4 circles, i.e., 8 points on each circle, and display them on the post-processed reference lines (in blue) for the concentric circles.

From Fig. 6(a)-(c), the qualitative results of the three sampling patterns demonstrate that the detection points scanned using our system are in good agreement with the desired positions in either uniform or non-uniform patterns, and either evenly or unevenly spaced. The RMSEs of calibration for 64 detection points in the three patterns are 2.57 mm, 1.55 mm, and 1.58 mm, which account for the slight mismatch between the centers of the reference crosses and the red dots.
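The two uniform patterns can be generated with a few lines of code. The sketch below is illustrative and rests on assumptions: the grid is centered at the origin of $xyz$ with its outermost points on the boundary of the $L \times L$ region, and the circles use equal radial spacing $R / N_r$ and equal angular spacing $2\pi / N_\varphi$ (the text notes that unequal spacing is also possible). The function names are hypothetical.

```python
import numpy as np

def grid_pattern(n, length):
    """n x n points evenly spaced over a length x length square centered at the origin."""
    axis = np.linspace(-length / 2, length / 2, n)
    x, y = np.meshgrid(axis, axis, indexing="xy")
    return np.column_stack([x.ravel(), y.ravel()])

def circle_pattern(n_r, n_phi, radius):
    """n_phi points on each of n_r concentric circles (equal spacing assumed)."""
    r = radius * np.arange(1, n_r + 1) / n_r
    phi = 2 * np.pi * np.arange(n_phi) / n_phi
    rr, pp = np.meshgrid(r, phi, indexing="ij")
    return np.column_stack([(rr * np.cos(pp)).ravel(), (rr * np.sin(pp)).ravel()])
```

For example, `circle_pattern(4, 8, radius)` yields 32 points on 4 circles with 8 points each, the layout described for Fig. 6(c).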
Quantitative evaluation. In addition, we conduct quantitative evaluation of our calibration technique in terms of scanning points and input voltage ranges. Table II shows the RMSEs between the predicted and measured coordinates and the RMSEs between the predicted and measured voltages. First, we select five combinations of input voltage ranges and numbers of scanning points (e.g., 3 × 3; see Table II). From the input voltages, we calculate 3D coordinates of the scanning points and estimate the relay wall using (14). We then randomly select new scanning points to calculate RMSEs of the 3D coordinates of the scanned and predicted detection points on the relay wall. The coordinate RMSE of the detection points is stable, with an average of 3.5 mm, which is sufficiently small with respect to the 4 mm speckle diameter of the laser emitted at 1500 mm from the system origin to the relay wall. Second, we calculate the input voltages of the predicted 3D coordinates using (10) and (12). We then evaluate the RMSEs of input and estimated voltages for the detection points. The average voltage RMSE is 25.1 mV. In general, the galvanometer produces slightly different angles $\theta_X$ and $\theta_Y$ even with the same input voltages because of the mechanical properties of the servo motors. We have compared the theoretical and measured angles $\theta_X$ and $\theta_Y$ and obtain a difference of 0.002°, which is sufficiently small that we can simply leverage the feedback $\theta_X$ and $\theta_Y$ with input voltages.
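The two error measures used in this paragraph can be made concrete with a small sketch. The RMSE below is taken over Euclidean distances between matched 3D points, which is an assumed definition, and the angular conversion assumes normal incidence on the wall; both function names are hypothetical.

```python
import numpy as np

def coordinate_rmse(predicted, measured):
    """RMSE of Euclidean distances between matched 3D points (assumed definition)."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(measured, dtype=float)
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))

def angular_error_to_offset(angle_deg, distance_mm):
    """Lateral offset on a wall at distance_mm caused by a small angular error."""
    return float(distance_mm * np.tan(np.radians(angle_deg)))
```

For the values reported here, `angular_error_to_offset(0.002, 1500)` is roughly 0.05 mm, far below the 4 mm spot diameter, which supports relying on the angle feedback directly.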
Comparison evaluation. For comparison, we implement the calibration scheme in Ahn [18], which additionally uses a digital camera and a checkerboard. Following their supplementary material, we assume that they calibrate the galvanometer in a similar way to ours. We exploit the checkerboard as a target and take its pictures by holding the checkerboard in front of the surface of the relay wall while rotating it in different views. Using these pictures, we estimate intrinsic and extrinsic camera parameters and 3D coordinates of detection points on the relay wall based on the CV techniques in Matlab. We then optimize the relay wall using the loss between predicted and captured 3D coordinates of the detection points and learn camera-wall mapping.
For fair evaluation, we scan the detection points with input voltages in the same way as ours in Table II, and estimate the coordinates of the detection points on the relay wall. Fig. 7 shows one group of calibration results for 5 × 5 detection points with the input voltage range of [−3, 3] using the two methods: (a) from ours, (b) from Ahn [18]. In Fig. 7, the re-scanned points are shown as red dots and the reference points as blue crosses. The RMSEs of calibration from our method and from Ahn [18] are 3.3 mm and 28.1 mm, respectively. Furthermore, we calculate coordinate RMSEs for five groups of predicted and scanned detection points on the relay wall using the method in Ahn [18]. Table II shows the results and the comparison with results from our method. The coordinate RMSE from Ahn [18] is stable with an average of 27.8 mm, but markedly larger than our 3.5 mm. We also calculate the RMSEs for five groups of input and estimated voltages for the detection points. Their average voltage RMSE is 73.6 mV, much larger than our 25.1 mV. The results in Table II demonstrate that our technique is superior to Ahn [18].

Fig. 8. Reconstruction evaluation using transients measured from our calibrated NLOS imaging system. From top to bottom: S-shape, Reso.board, Checkerboard, and Mannequin reconstructed using SOTA methods. From left to right: Reconstruction results using LCT [5], f-k [6], PF [7], and NeTF [20]. We exploit the transients measured only in a regular grid for FFT-based LCT, f-k, and PF, and the transients measured in two uniform patterns, i.e., a regular grid and concentric circles, and in a non-uniform pattern for NeTF.
From Figs. 6 and 7, and Table II, our online calibration can precisely estimate 3D coordinates of the detection points, the input voltages, and the relay wall by incorporating the LOS components of transients, and can accurately build the mapping between the input voltages and 3D coordinates of scanning points, i.e., galvanometer-wall mapping. In contrast, the existing schemes using CV techniques, including Ahn [18], need to construct camera-wall mapping and wall-galvanometer mapping, and they accumulate errors from estimation procedures of camera parameters, the relay wall, the 3D coordinates of detection points, and the galvanometer. These errors can significantly affect the entire system calibration.
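The RMSE figures quoted in this comparison can be illustrated with a minimal sketch. It assumes that the coordinate RMSE is the root of the mean squared Euclidean distance between matched predicted and scanned points; the exact definition is not restated in this section, and the function name `point_rmse` is hypothetical. The same function applies to voltage pairs when each row holds the input voltages of one detection point.

```python
import numpy as np


def point_rmse(predicted, measured):
    """RMSE over matched points (assumed definition, see lead-in).

    predicted, measured: arrays of shape (N, D), one row per detection point.
    Returns the root of the mean squared Euclidean distance, in input units.
    """
    predicted = np.asarray(predicted, dtype=float)
    measured = np.asarray(measured, dtype=float)
    if predicted.shape != measured.shape:
        raise ValueError("predicted and measured must have the same shape")
    squared_distances = np.sum((predicted - measured) ** 2, axis=1)
    return float(np.sqrt(np.mean(squared_distances)))


# Illustrative data only: a 5 x 5 grid of wall points (in mm) and a copy
# shifted by a constant offset of (3, 4, 0) mm.
axis = np.linspace(-500.0, 500.0, 5)
xx, yy = np.meshgrid(axis, axis)
reference = np.stack([xx.ravel(), yy.ravel(), np.zeros(xx.size)], axis=1)
rescanned = reference + np.array([3.0, 4.0, 0.0])
rmse_mm = point_rmse(reference, rescanned)
```

With a constant offset, every point is displaced by the same distance, so the RMSE equals the length of the offset vector.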

B. NLOS Reconstruction Validation
We further validate the transients measured from our NLOS imaging system, which is calibrated with our online calibration technique. Using the measured transients, we reconstruct the hidden objects using SOTA methods, including LCT [5], f-k [6], PF [7], and NeTF [20].
Implementation. The hidden objects we have captured include an S-shape (0.6 m × 0.6 m), a Checkerboard (0.8 m × 0.8 m), a Mannequin (0.3 m × 0.7 m), and a Reso.board (0.8 m × 0.8 m) that is designed with stripes of different widths and lengths. All the objects are placed 0.5 m away from the relay wall and are diffuse, made of paper, cotton, or wood. The transients are captured with an exposure time of 2 s per detection point. For each object, we capture transients in three sampling patterns: one non-uniform pattern and two uniform patterns, i.e., a regular grid and concentric circles, by scanning 64 × 64 detection points over an FOV of 1.0 m × 1.0 m on the relay wall.
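The three sampling patterns can be sketched as follows. This is an illustrative construction only: the function names are hypothetical, the ring and angle spacing of the concentric circles is an assumption, and drawing the non-uniform pattern from a uniform random distribution is an assumption, since the exact layouts are not specified in this section. Each function returns 64 × 64 = 4096 target positions (in meters) centered on a 1.0 m × 1.0 m FOV.

```python
import numpy as np


def regular_grid(n=64, fov=1.0):
    """n x n points on an evenly spaced grid covering the FOV."""
    axis = np.linspace(-fov / 2, fov / 2, n)
    xx, yy = np.meshgrid(axis, axis)
    return np.stack([xx.ravel(), yy.ravel()], axis=1)


def concentric_circles(n_rings=64, n_per_ring=64, fov=1.0):
    """n_rings circles with n_per_ring points each (assumed spacing)."""
    radii = np.linspace(fov / (2 * n_rings), fov / 2, n_rings)
    angles = np.linspace(0.0, 2.0 * np.pi, n_per_ring, endpoint=False)
    rr, aa = np.meshgrid(radii, angles, indexing="ij")
    return np.stack([(rr * np.cos(aa)).ravel(), (rr * np.sin(aa)).ravel()], axis=1)


def non_uniform(n_points=64 * 64, fov=1.0, seed=0):
    """Random positions inside the FOV (assumed stand-in for the real pattern)."""
    rng = np.random.default_rng(seed)
    return rng.uniform(-fov / 2, fov / 2, size=(n_points, 2))


patterns = {
    "regular grid": regular_grid(),
    "concentric circles": concentric_circles(),
    "non-uniform": non_uniform(),
}
```

In an actual capture, each target position would then be converted to galvanometer input voltages through the calibrated galvanometer-wall mapping, which this sketch does not model.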
We use the publicly available source codes of LCT [5], f-k [6], and PF [7] and run these codes on a PC with an Intel i7-6600H CPU (2.6 GHz), 8 GB RAM, and an Intel HD Graphics 520. For these FFT-based methods, we reconstruct the hidden objects using the transients measured only in a regular grid. For NeTF [20], we carry out the experiments of each hidden object using transients measured in three sampling patterns on two NVIDIA 3090 GPU cards with 24 GB RAM. The parameters, e.g., the learning rate and batch sizes, are similar to those reported in the paper [20].
NLOS reconstruction results. Fig. 8 shows the reconstruction results of the hidden objects using SOTA methods. From the transients measured in a regular grid, the textures of the three planar objects, i.e., S-shape, Reso.board, and Checkerboard, are well reconstructed using LCT [5], f-k [6], PF [7], and NeTF [20]. The albedo (or NLOS volume) of the 3D Mannequin is also sufficiently recovered using the four methods. The results from NeTF have higher quality, e.g., less noise, than those of the other three methods, although NeTF takes longer (e.g., several hours) to train the neural networks.
In addition to a regular grid, NeTF accepts transients measured in a concentric-circle pattern and in a non-uniform pattern as input, and we show the corresponding results in Fig. 8. In the last three columns, the hidden objects are reconstructed in good agreement with the photos and reflect the features of the sampling patterns. For instance, the reconstructions of the Mannequin in the concentric-circle and non-uniform patterns show strong intensity at the center (the torso). We offer additional reconstruction results in Supplementary Material using a baseline optimization method, which show more noticeable features of the different sampling patterns but are blurrier than those from the learning-based NeTF. The experimental results in Fig. 8 demonstrate the effectiveness of transients measured from our calibrated system in both uniform and non-uniform sampling patterns.

C. Ablation Studies
Calibration errors accumulate across the calibration procedures of the entire capture system, such as that of the relay wall, and affect the reconstruction quality of hidden objects. We conduct ablation studies to show how calibration errors of the relay wall may affect NLOS reconstruction of hidden objects, because the relay wall links the calibration of the galvanometer to the 3D coordinates of the detection points.
First, we develop an algorithm to synthesize transients of a hidden Bunny in an NLOS scene based on the confocal imaging model in (4). The details of the algorithm are provided in Supplementary Material. The relay wall is accurately situated at 0.0°, which implies that the wall is parallel to the measurement surface of the hidden objects and is inclined by 48° toward the galvanometer. We assume that the relay wall is mis-calibrated with errors from ±2.5° to ±10.0° at intervals of 2.5°, where clockwise rotation relative to the accurate position is positive. Similarly, we assume that the relay wall is mis-calibrated forward and backward with errors at the same intervals, where forward is positive. For all these assumptions, we render the transients of the hidden Bunny with 4 ps temporal resolution and 64 × 64 spatial resolution. Since the measured transients include noise, we further add noise to the synthesized transients based on (6). The experiments on simulated transients are conducted with and without noise.
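The mis-calibration sweep can be sketched as a rigid rotation of the assumed wall points. This is a simplified illustration, not the rendering algorithm of the Supplementary Material: the function names are hypothetical, the wall is assumed to rotate about an axis through the origin of its own frame (the y-axis for left-right rotation, the x-axis for forward-backward tilt), and positive angles follow the right-hand rule, so mapping the sign to the "clockwise is positive" convention depends on the viewing direction.

```python
import numpy as np


def miscalibration_angles(max_deg=10.0, step_deg=2.5):
    """Angles from -max_deg to +max_deg in steps of step_deg, including 0."""
    n_steps = int(round(max_deg / step_deg))
    return step_deg * np.arange(-n_steps, n_steps + 1)


def rotation_matrix(angle_deg, axis="y"):
    """Right-handed rotation about the x- or y-axis (assumed wall frame)."""
    theta = np.deg2rad(angle_deg)
    c, s = np.cos(theta), np.sin(theta)
    if axis == "y":
        return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    if axis == "x":
        return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])
    raise ValueError("axis must be 'x' or 'y'")


def miscalibrate_wall(points, angle_deg, axis="y"):
    """Rotate (N, 3) wall points (row vectors) by angle_deg about the axis."""
    points = np.asarray(points, dtype=float)
    return points @ rotation_matrix(angle_deg, axis).T


# Displacement of a point at the edge of a 1.0 m wide wall for each error.
edge_point = np.array([[0.5, 0.0, 0.0]])
edge_shift = {
    float(angle): float(
        np.linalg.norm(miscalibrate_wall(edge_point, angle) - edge_point)
    )
    for angle in miscalibration_angles()
}
```

Under these assumptions, a point at distance r from the rotation axis moves by 2r sin(θ/2), which gives a rough sense of how far the assumed detection points drift from their true positions as the angular error grows.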
In addition, we collect the real transients of a checkerboard from our calibrated NLOS imaging system. Similar to the simulation, we consider that the relay wall is accurately calibrated at 0.0°, and is mis-calibrated with errors from ±2.5° to ±10.0° at intervals of 2.5°. The transients are captured for each mis-calibration error while rotating the relay wall positively or negatively, and tilting it forward due to the onsite deployment.
We carry out the ablation studies on the simulated and measured transients using PF [7], which rapidly offers results comparable to those of NeTF. Fig. 9 shows the major reconstruction results of the simulated Bunny (with noise) and the measured checkerboard, with the relay wall accurately calibrated at 0.0° and mis-calibrated with errors of ±5.0° and ±10.0°. The accurately calibrated relay walls are shown as blue solid lines and the mis-calibrated walls as red dotted lines. We also calculate the multiscale structural similarity (MS-SSIM) between the reconstruction and the ground truth of the simulated Bunny. The MS-SSIMs decrease more rapidly in the negative directions than in the positive directions because the relay wall at 0.0° is inclined by 48.0° toward the galvanometer. Additional results are provided in Supplementary Material. We observe that calibration errors of the relay wall result in optical distortions, e.g., radial and perspective distortions, in the NLOS reconstructions of the simulated Bunny and the measured checkerboard. The reconstruction quality with calibration errors of the relay wall within ±2.5° remains similar to that of the accurately calibrated wall and markedly degrades with increasing calibration errors.

Fig. 9. Ablation studies for calibration errors of the relay wall. Top: Illustrations of the accurately calibrated relay wall at 0.0° as blue solid lines and of mis-calibrated relay walls at ±5.0° and ±10.0° as red dotted lines. Middle: NLOS reconstruction of a simulated Bunny with MS-SSIM. Bottom: NLOS reconstruction of a measured Checkerboard. These results are achieved using PF [7]. Note that the relay wall at 0.0° is actually parallel to the measurement surface of the hidden objects and is inclined by approximately 48° toward the galvanometer.

VI. CONCLUSION
In this work, we have presented an online calibration technique for a SPAD-based NLOS imaging system. Our technique does not require auxiliary apparatus or additional detectors. Only LOS components of transients are exploited to accurately calibrate the system, including the galvanometer and the relay wall, and to calculate Gamma maps for previewing the effects of the calibration process. With accurate galvanometer-wall mapping, our scheme can support both uniform and non-uniform sampling of hidden objects in an NLOS imaging system. This makes our solution applicable to a wide variety of NLOS reconstruction algorithms.
Our online calibration can be extended in future work. The NLOS reconstructions in our experimental results tend to be bright at the bottom right due to attenuation and defocus when scanning the far side of the relay wall. Varifocal lenses may help to focus detection points at different areas of the relay wall. Under non-confocal setups, our technique can calibrate the system by selecting more than three pairs of co-axial illumination and detection points along with known positions of two galvanometers, which separately control laser spots and detection points. Non-Lambertian relay surfaces remain a substantial problem: unless we assume a known BRDF for the surface, it would be very difficult to factor out the effect of non-uniform reflectance in the measurements. Choosing a sampling pattern is analogous to arranging cameras and light sources in a light field imaging system, in that it determines how much information about the NLOS scene is captured from specific views. Adaptive sampling has the potential to find optimal numbers and positions of the illumination and detection points, resulting in minimal acquisition time and high-quality reconstruction. We believe that our work is valuable for the NLOS research community; further efforts are needed for high-resolution NLOS imaging and high-quality reconstruction.