Differentiable Uncalibrated Imaging

We propose a differentiable imaging framework to address uncertainty in measurement coordinates such as sensor locations and projection angles. We formulate the problem as measurement interpolation at unknown nodes supervised through the forward operator. To solve it we apply implicit neural networks, also known as neural fields, which are naturally differentiable with respect to the input coordinates. We also develop differentiable spline interpolators which perform as well as neural networks, require less time to optimize and have well-understood properties. Differentiability is key as it allows us to jointly fit a measurement representation, optimize over the uncertain measurement coordinates, and perform image reconstruction which in turn ensures consistent calibration. We apply our approach to 2D and 3D computed tomography, and show that it produces improved reconstructions compared to baselines that do not account for the lack of calibration. The flexibility of the proposed framework makes it easy to extend to almost arbitrary imaging problems.


I. INTRODUCTION
In computational imaging a physical process, $\mathcal{A}$, such as 2D computed tomography (CT) relates the object we want to image, $\mathsf{x}$, to an observable field, $\mathsf{y}$. Both $\mathsf{x}$ and $\mathsf{y}$ are naturally functions of continuous coordinates: $\mathsf{x}$ could be a density over 2D spatial coordinates (e.g., $[0, 1]^2$), and $\mathsf{y}$ a sinogram over angles and 1D projection coordinates (e.g., $[0, \pi) \times [-1, 1]$).
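In practice, however, we have finite sensors. Denoting the space of continuous measurement coordinates by $\Omega$, the sensors sample the field $\mathsf{y}$ at $\tilde{\mu} = (\tilde{\mu}_1, \ldots, \tilde{\mu}_M) \in \Omega^M$, such that the observed measurements $y \in \mathbb{R}^M$ follow

$$y = A_{\tilde{\mu}} \mathsf{x} + \eta, \tag{1}$$

where

$$(A_{\tilde{\mu}} \mathsf{x})_m = (\mathcal{A} \mathsf{x})(\tilde{\mu}_m), \quad m = 1, \ldots, M, \tag{2}$$

and $\eta \in \mathbb{R}^M$ is the measurement noise. The discrete forward operator $A_{\tilde{\mu}}$ parameterized by $\tilde{\mu}$ samples the output of $\mathcal{A}$.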
In computational imaging we work with a discretization or some other finite-dimensional approximation of $\mathsf{x}$ denoted by $x \in \mathbb{R}^N$. Hereafter we let $A_{\tilde{\mu}}$ act on $x$ rather than on $\mathsf{x}$.
In this paper we address the scenario where the true measurement coordinates $\tilde{\mu}$ are only approximately known: the imaging system is out of calibration. Consequently, the true operator $A_{\tilde{\mu}}$ is unknown, and we work with an operator $A_{\mu}$ for measurement coordinates $\mu = (\mu_1, \ldots, \mu_M)$, which would be correct if the system were calibrated. The assumed measurement coordinates $\mu$ are related to the unknown true measurement coordinates $\tilde{\mu}$ by small perturbations.
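The effect of working with $A_{\mu}$ when the data were generated by $A_{\tilde{\mu}}$ can be illustrated with a small numerical sketch. It is not a CT operator: we assume a hypothetical continuous operator that maps coefficients $x$ to the field $\sum_n x_n \cos(n\omega)$, and all names (`forward_operator`, `mu_assumed`, `mu_true`) and the perturbation and noise levels are illustrative choices rather than settings from the paper.

```python
import numpy as np

def forward_operator(coords, n_unknowns):
    """Toy discrete operator: row m samples sum_n x_n cos(n * omega) at coords[m]."""
    orders = np.arange(n_unknowns)
    return np.cos(np.outer(coords, orders))

rng = np.random.default_rng(0)
N, M = 8, 32
x = rng.standard_normal(N)                               # discretized object
mu_assumed = np.linspace(0.0, np.pi, M, endpoint=False)  # calibrated design
mu_true = mu_assumed + 0.05 * rng.standard_normal(M)     # unknown in practice
eta = 1e-3 * rng.standard_normal(M)                      # measurement noise

# Observed measurements follow the true (unknown) coordinates.
y = forward_operator(mu_true, N) @ x + eta

def least_squares_recon(coords):
    """Reconstruct x by least squares with the operator built from coords."""
    A = forward_operator(coords, N)
    return np.linalg.lstsq(A, y, rcond=None)[0]

err_assumed = np.linalg.norm(least_squares_recon(mu_assumed) - x)
err_true = np.linalg.norm(least_squares_recon(mu_true) - x)
print(f"error with assumed coordinates: {err_assumed:.3e}")
print(f"error with true coordinates:    {err_true:.3e}")
```

In this toy setting the reconstruction error is dominated by the coordinate mismatch rather than by the noise, which is the regime the paper targets.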
Not accounting for the mismatch between $\tilde{\mu}$ and $\mu$ can lead to a poor reconstruction. The gist of our method is to learn a representation of the measurement space that can be evaluated and differentiated at arbitrary measurement coordinates $\mu$. This gives us measurements $y(\mu)$ that are then well-suited for a reconstruction method that uses $\mu$. Our proposed framework enables us to
1) Jointly reconstruct the image $x$ and learn the true unknown measurement coordinates $\tilde{\mu}$ by using gradient-based optimization.
2) Learn continuous measurement representations that take measurement coordinates as input. These representations can be evaluated at $\mu$ and are in essence interpolations of the discrete measurements with unknown interpolation knots.
3) Leverage standard differentiable image reconstruction methods that exist for the assumed $\mu$ even though the observations correspond to unknown $\tilde{\mu}$. This helps learn consistent measurement representations.
A strength of our approach is that we can use any continuous interpolation method that admits backpropagation to input coordinates. To showcase this, we use implicit neural networks (neural fields) but also develop fully-differentiable variable-knot splines which allow us to optimize spline control points and control point weights. We show that splines perform as well as implicit neural networks while being faster to fit and simpler to interpret. A key property of both these representation types is that we can perform automatic differentiation with respect to their input coordinates and recover $\tilde{\mu}$ by gradient-based optimization.

A. Example: Computed tomography
We illustrate our framework with 2D CT, where we measure parallel beam projections of an image at different view angles in a detector plane. Let $\Omega = [0, \pi) \times \mathbb{R}$, $\mathsf{x} \in L^2([-1, 1]^2)$, and $\mathsf{y} \in L^2(\Omega)$. The measured projections $y$ are obtained from a finite set of discrete view angles $\tilde{\Theta} = (\tilde{\theta}_1, \ldots, \tilde{\theta}_J)$, and are sampled at a finite set of discrete detector locations $\tilde{T} = (\tilde{t}_1, \ldots, \tilde{t}_K)$. The true set of unknown measurement coordinates is then $(\tilde{\mu}_m)_{m=1}^{M} = \tilde{\Theta} \times \tilde{T}$ with $M = J \cdot K$. Furthermore, $A_{\tilde{\mu}}$ is given by the discrete Radon transform for view angles $\tilde{\Theta}$ and detector locations $\tilde{T}$. Many other computational imaging applications such as electron cryotomography (CryoET), magnetic resonance imaging and optical microscopy can be parameterized similarly.

Fig. 2. 2D CT reconstructions with 90 and 120 view angles. The SNRs (in dB) are written in the bottom-left corners. When there is parameter mismatch because the true and assumed view angles differ, the reconstruction method performs poorly and there is a clear degradation.
The parameter mismatch between $\tilde{\mu}$ and $\mu$ and its severe impact is illustrated in Fig. 1. We use a state-of-the-art reconstruction method that computes a filtered backprojection estimate with $\mu$ and feeds it into a UNet deep neural network to obtain a reconstruction of $x$ [1], [2]. The neural network is trained in a supervised manner using training data that is also generated using $\mu$. When there is parameter mismatch, the true view angles differ from the assumed view angles and therefore $\tilde{\mu} \neq \mu$. Fig. 1 shows that in this case, the reconstruction method produces a degraded reconstruction from both a visual and SNR evaluation. Fig. 2 shows this for another ground truth image and number of view angles. The last column of Fig. 2 also shows that when there is no parameter mismatch ($\tilde{\mu} = \mu$), the same reconstruction method performs strongly.
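To make the measurement coordinates of the 2D CT example concrete, the following sketch evaluates a sinogram on the Cartesian product of view angles and detector locations. It is an illustration under stated assumptions: instead of a discrete Radon transform it uses the closed-form parallel-beam projection of a uniform disk, and the function name `disk_sinogram`, the disk parameters and the one-degree angle perturbation are hypothetical.

```python
import numpy as np

def disk_sinogram(thetas, ts, center=(0.3, -0.2), radius=0.4):
    """Parallel-beam projections of a uniform disk on the grid thetas x ts."""
    theta_grid, t_grid = np.meshgrid(thetas, ts, indexing="ij")
    # Position of the projected disk center on the detector for each angle.
    t0 = center[0] * np.cos(theta_grid) + center[1] * np.sin(theta_grid)
    # Chord length of the ray at signed distance (t - t0) from the center.
    half_chord_sq = np.clip(radius**2 - (t_grid - t0) ** 2, 0.0, None)
    return 2.0 * np.sqrt(half_chord_sq)

J, K = 90, 64
theta_assumed = np.linspace(0.0, np.pi, J, endpoint=False)
t_assumed = np.linspace(-1.0, 1.0, K)

rng = np.random.default_rng(0)
theta_true = theta_assumed + np.deg2rad(1.0) * rng.standard_normal(J)

y_true = disk_sinogram(theta_true, t_assumed)        # what is actually measured
y_assumed = disk_sinogram(theta_assumed, t_assumed)  # what the model expects
M = y_true.size                                      # M = J * K measurements

mismatch = np.linalg.norm(y_true - y_assumed) / np.linalg.norm(y_assumed)
print(f"M = {M}, relative sinogram mismatch = {mismatch:.3f}")
```

A reconstruction method that pairs `y_true` with `theta_assumed` is fed rows that belong to slightly different angles, which is the parameter mismatch discussed above.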

B. Related work
Many of the state-of-the-art methods for solving imaging inverse problems are based on deep learning [3], [4]. Popular supervised approaches use the forward operator to obtain an initial estimate which is then enhanced by a neural network (reconstruction method in Fig. 2 for example) [2], [5], [6], [7]. Unrolling iterative methods or replacing optimization components with deep networks is another established approach [8], [9], [10], [11], [12], [13]. A different direction involves requiring the inverse problem solution to lie in the range of a generative neural network [14], [15], [16]. In general all these methods utilize a forward operator in some way but do not account for measurement coordinate uncertainty. This can severely impact the reconstruction as shown in Fig. 2. In this paper we propose to address this problem by reevaluating the measurements at the measurement coordinates that deep neural network reconstruction methods are designed for.
While much less common, there are some recent deep learning approaches which address measurement coordinate uncertainty. Gilton et al. explored fine-tuning a neural network that was trained with measurements sampled at $\mu$ to work well with test measurements sampled at unknown $\tilde{\mu}$ [17]. Our method is different: we reevaluate the measurements via a measurement representation for a single example so we do not need to fine-tune. Another appealing approach is to train a neural network on a family of operators with different parameterizations [18]. These methods always induce a tradeoff between the reconstruction quality and the variety of forward maps they are trained on, and they only work well for the distribution of perturbations seen at training time. There is also the challenge of dataset generation, especially in compute-intensive problems. We mitigate this tradeoff by using consistency to identify the true measurement coordinates.
Continuous representations have been used to solve a variety of science and engineering problems. To the best of our knowledge no prior work has used them for imaging with measurement coordinate uncertainty. Splines have long been used to model surfaces and curves in geometry [19], [20], [21], [22]. Moreover, differentiable splines with parameters that can be fitted using automatic differentiation have also been developed [23]. Recently deep learning methods called neural fields or implicit neural networks have proven to be extremely good and efficient continuous representations [24]. They have been employed in a broad spectrum of applications ranging from representing geometry via signed distance functions [25], [26] to solving partial differential equations [27]. Implicit neural networks have also been used to solve tomographic imaging inverse problems [28], [29]. In particular, Sun et al. also used them to represent 2D CT measurements [30]. Rather than for calibration, they use implicit networks to upsample measurements for downstream reconstruction by a deep neural network. While upsampling may also be performed by traditional tools such as splines, our framework relies on the differentiability of the representations used, both neural and spline-based, with respect to the input coordinates.
In this paper, we learn forward model parameters and continuous measurement representations through a differentiable imaging framework [31]. We use automatic differentiation to optimize the parameters and inputs of implicit neural network and differentiable spline representations. Prior differentiable imaging works for microscopy [32], holography, ptychography and ptychographic tomography [33] also learn forward model parameters via automatic differentiation. However, these works do not learn a continuous measurement representation, and so do not reevaluate measurements for use with state-of-the-art reconstruction methods that assume different parameters.
There are alternative approaches for handling uncertainty in measurement coordinates. Bundle adjustment is an example from computer vision where scene coordinates, camera coordinates and system coordinates are jointly optimized [34]. Similar to our framework, bundle adjustment uses a good starting estimate to optimize the coordinates. However, a key difference is that we obtain a final solution by reevaluating measurements at the assumed measurement coordinates, and use a reconstruction method designed for the assumed measurement coordinates. Imaging approaches based on alternating minimization [35], and on designing handcrafted regularizers to manage ill-posedness [36], [37], have also been employed when there is miscalibration. In this work we instead minimize our objective function jointly over all variables. Furthermore, we use a consistency loss which can be interpreted as a regularizer when learning the measurement representation.
Our work is related to the broader theme of inverse problems with "noisy" forward operators. On one end of the spectrum there are inverse problems where the operator (and $\tilde{\mu}$) is perfectly known, and at the other extreme there are blind inverse problems where the operator and $\tilde{\mu}$ are completely unknown. Related blind imaging problems are tomography with unknown view angles [38], [39] and cryo-electron microscopy with unknown projection angles [40]. In between the extremes, there are semi-blind inverse problems where the operator and $\tilde{\mu}$ are approximately known. Total least squares approaches that perturb an assumed operator [41], [42], [43] and our proposed measurement coordinate-based framework fall under this category.

C. Paper organization
In Section II we present the optimization problem which models joint calibration and image reconstruction. We first model measurements as continuous functions and then show how to learn these measurement representations. We show how implicit neural networks and splines can both be used as representations. Section III numerically verifies that when there is measurement coordinate uncertainty, our proposed method yields significantly improved reconstructions. The simulated experiments are performed on 2D CT in the main paper and 3D CT in Appendix B. We conclude the paper and motivate future work directions in Section IV.

II. DIFFERENTIABLE FRAMEWORK
We wish to learn a continuous representation of the measurement space so that we can sample it at specific measurement coordinates and obtain the corresponding measurements. We model the measurement representations as continuous functions $r_\varphi(\cdot)$ that take measurement coordinates as input and map them to the corresponding sampled measurements. The learnable parameters, $\varphi \in \Phi$ where $\Phi$ is the space of feasible parameters, are optimized so that

$$r_\varphi(\omega) \approx \mathsf{y}(\omega), \tag{3}$$

where $\omega \in \Omega$ is a measurement coordinate and $\mathsf{y}(\omega) \in \mathbb{R}$ is the sample from the measurement space at measurement coordinate $\omega$. For convenience we also denote a batch evaluation of $r_\varphi(\cdot)$ as

$$R_\varphi(\boldsymbol{\omega}) = [r_\varphi(\omega_1), \ldots, r_\varphi(\omega_Q)]^\top, \tag{4}$$

where $\boldsymbol{\omega} = (\omega_1, \ldots, \omega_Q) \in \Omega^Q$ and $R_\varphi(\boldsymbol{\omega}) \in \mathbb{R}^Q$. We also require $r_\varphi(\cdot)$ to be differentiable with respect to $\varphi$ and its input so that we can use gradient-based optimization to estimate $\varphi$ and the unknown measurement coordinates.

A. Joint optimization objective
We want $r_\varphi(\cdot)$ to accurately produce samples from the space of measurements. Since we only observe measurements $y$ at measurement coordinates $\tilde{\mu}$ in (1), we require

$$R_\varphi(\tilde{\mu}) \approx y. \tag{5}$$

However, since $\tilde{\mu}$ is unknown we cannot use it to verify the accuracy of $r_\varphi(\cdot)$. This motivates us to jointly learn the representation parameters $\varphi$ and the unknown measurement coordinates $\tilde{\mu}$ by minimizing a measurement fitting loss,

$$\mathcal{L}_{\text{fitting}}(\nu, \varphi) = \| R_\varphi(\nu) - y \|_2^2, \tag{6}$$

with respect to the input $\nu \in \Omega^M$ and $\varphi$. Recall $y \in \mathbb{R}^M$ is defined by (1). Learning $r_\varphi(\cdot)$ by minimizing only $\mathcal{L}_{\text{fitting}}(\nu, \varphi)$ with respect to both the representation's parameters and input would not in general result in an accurate measurement representation because there are too many degrees of freedom. Therefore we regularize and control $\varphi$ by enforcing $r_\varphi(\cdot)$ to be consistent with reconstructions that could be obtained by using its output. This is done by minimizing a consistency loss with respect to $\varphi$,

$$\mathcal{L}_{\text{consistency}}(\varphi) = \| A_\mu G_\mu(R_\varphi(\mu)) - R_\varphi(\mu) \|_2^2, \tag{7}$$

where $G_\mu : \mathbb{R}^M \to \mathbb{R}^N$ is a differentiable reconstruction method that was designed using measurement coordinates $\mu$.
Putting everything together, our complete joint optimization objective is

$$(\hat{\mu}, \hat{\varphi}) = \arg\min_{\nu \in \Omega^M,\, \varphi \in \Phi} \; \mathcal{L}_{\text{fitting}}(\nu, \varphi) + \lambda \, \mathcal{L}_{\text{consistency}}(\varphi), \tag{8}$$

where $\lambda \in \mathbb{R}_{\geq 0}$ is a tunable weight that controls the relative importance of the consistency and fitting losses. As the assumed coordinates $\mu$ are close to the true unknown coordinates $\tilde{\mu}$, we initialize $\nu$ to $\mu$. After completing the optimization (8), the learned coordinates $\hat{\mu}$ are close to $\tilde{\mu}$ (verified in Section III), and the final reconstruction and estimate of $x \in \mathbb{R}^N$ is given by

$$\hat{x} = G_\mu(R_{\hat{\varphi}}(\mu)). \tag{9}$$

One of the advantages of the objective (8) is that the same operator, $A_\mu$ in (7), is used in each optimization iteration. A formulation that uses the continually updating learned measurement coordinates $\nu$ in (7) would require the operator to be rebuilt after each optimization iteration. If there is no efficient implementation to change the measurement coordinates due to the complexity of the forward process, and if the operator is large, this can be severely time consuming and resource intensive.
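The structure of the joint objective can be sketched on a toy problem. The sketch below makes several assumptions that are not part of the paper: the forward operator samples a cosine series, the measurement representation is a cosine series with more terms (so it is linear in $\varphi$), the reconstruction method $G_\mu$ is the pseudoinverse of $A_\mu$, and the gradients are derived by hand and applied with a simple backtracking gradient descent instead of automatic differentiation. It only shows how a fitting term over $(\nu, \varphi)$ and a consistency term over $\varphi$ with a fixed $A_\mu$ are combined, and it does not claim that the coordinates are identifiable in this toy setting.

```python
import numpy as np

def cos_basis(coords, order):
    """Matrix with entries cos(k * coords[m]) for k = 0, ..., order - 1."""
    return np.cos(np.outer(coords, np.arange(order)))

def cos_basis_deriv(coords, order):
    """Derivative of cos_basis with respect to each coordinate."""
    k = np.arange(order)
    return -k * np.sin(np.outer(coords, k))

rng = np.random.default_rng(0)
N, P, M, lam = 6, 12, 40, 1.0
mu = np.linspace(0.0, np.pi, M, endpoint=False)   # assumed coordinates
mu_true = mu + 0.02 * rng.standard_normal(M)      # unknown in practice
x = rng.standard_normal(N)
y = cos_basis(mu_true, N) @ x + 1e-3 * rng.standard_normal(M)

A_mu = cos_basis(mu, N)                 # fixed operator for assumed coordinates
G_mu = np.linalg.pinv(A_mu)             # stand-in reconstruction method
B_mu = cos_basis(mu, P)                 # R_phi(mu) = B_mu @ phi
C = (A_mu @ G_mu - np.eye(M)) @ B_mu    # consistency residual is C @ phi

def objective(nu, phi):
    fit = cos_basis(nu, P) @ phi - y
    return fit @ fit + lam * np.sum((C @ phi) ** 2)

def gradients(nu, phi):
    fit = cos_basis(nu, P) @ phi - y
    grad_nu = 2.0 * fit * (cos_basis_deriv(nu, P) @ phi)
    grad_phi = 2.0 * cos_basis(nu, P).T @ fit + 2.0 * lam * C.T @ (C @ phi)
    return grad_nu, grad_phi

nu = mu.copy()                                    # initialize nu at mu
phi = np.linalg.lstsq(B_mu, y, rcond=None)[0]     # initial representation fit
initial = objective(nu, phi)
for _ in range(500):
    g_nu, g_phi = gradients(nu, phi)
    current, t = objective(nu, phi), 1e-2
    # Backtracking: halve the step until the objective decreases.
    while t > 1e-12 and objective(nu - t * g_nu, phi - t * g_phi) >= current:
        t *= 0.5
    if t <= 1e-12:
        break
    nu, phi = nu - t * g_nu, phi - t * g_phi
final = objective(nu, phi)

x_hat = G_mu @ (B_mu @ phi)   # reconstruct from the representation evaluated at mu
print(f"objective: {initial:.3e} -> {final:.3e}")
```

Note that `A_mu` and `G_mu` are built once and reused in every iteration, which mirrors the advantage of the objective discussed above.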

B. Leveraging reconstruction methods
A key aspect of our framework is that consistency and the final reconstruction are obtained by using a reconstruction method $G_\mu(\cdot)$ that was designed using measurement coordinates $\mu$ even though the observations in (1) are sampled at measurement coordinates $\tilde{\mu}$. This provides significant flexibility and allows us to incorporate a variety of reconstruction methods. For example, the reconstruction method may be a relatively straightforward adjoint or pseudoinverse operation. Alternatively, it can be a more complex neural network that provides state-of-the-art reconstructions with measurements from $\mu$ (the reconstruction method in Fig. 2 is one example). Our framework is particularly advantageous in this case because it may be cumbersome to retrain a neural network for different measurement coordinates.

C. Measurement representations
In order to better understand how to use the framework to solve real imaging problems, we now pick implicit neural networks and splines and explain how they are suitable measurement representations. While we consider these, we emphasize that our framework is general and is not restricted to these representation types.
1) Implicit neural representations: Implicit neural representations are deep feedforward neural networks that represent discrete signals as continuous functions. When used as the measurement representation $r_\varphi(\cdot)$ in (3), $\varphi$ are the trainable network parameters. The inputs to the network are measurement coordinates and the outputs are the corresponding measurements. Implicit neural networks have previously been used to represent measurements when there is no measurement uncertainty [30]. In this case the network input was not optimized and consistency (7) was not enforced.
As implicit neural representations are neural networks, we can use automatic differentiation to calculate their gradients with respect to $\varphi$ and their input coordinates to solve (8). In this paper we use an architecture comprising a Fourier feature mapping layer followed by standard fully-connected layers [26], [30].
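A minimal sketch of such a representation and of its derivative with respect to the input coordinate is given below. The layer sizes, the random frequency matrix `F`, the tanh activation and the single hidden layer are assumptions made for illustration and are not the settings used in our experiments. The input gradient is written out by hand here and checked against finite differences; in practice it would be obtained by automatic differentiation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, n_freq, width = 2, 16, 32
F = 3.0 * rng.standard_normal((n_freq, d_in))   # fixed Fourier frequencies
W1 = rng.standard_normal((width, 2 * n_freq)) / np.sqrt(2 * n_freq)
b1 = np.zeros(width)
w2 = rng.standard_normal(width) / np.sqrt(width)
b2 = 0.0

def represent(omega):
    """Scalar measurement predicted at one measurement coordinate omega."""
    z = F @ omega
    feats = np.concatenate([np.sin(z), np.cos(z)])   # Fourier feature mapping
    hidden = np.tanh(W1 @ feats + b1)                # fully-connected layer
    return w2 @ hidden + b2

def grad_wrt_input(omega):
    """Gradient of represent(omega) with respect to omega (manual backprop)."""
    z = F @ omega
    feats = np.concatenate([np.sin(z), np.cos(z)])
    hidden = np.tanh(W1 @ feats + b1)
    d_pre = w2 * (1.0 - hidden**2)     # through the output layer and tanh
    d_feats = W1.T @ d_pre             # through the first linear layer
    d_z = d_feats[:n_freq] * np.cos(z) - d_feats[n_freq:] * np.sin(z)
    return F.T @ d_z                   # through the Fourier feature mapping

omega = np.array([0.3, -0.7])          # e.g. (view angle, detector location)
eps = 1e-6
finite_diff = np.array([
    (represent(omega + eps * e) - represent(omega - eps * e)) / (2 * eps)
    for e in np.eye(d_in)
])
print(grad_wrt_input(omega), finite_diff)
```

The same chain rule is what allows the measurement coordinates to be updated by gradient descent alongside the network weights.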
2) Differentiable splines: Splines use locally supported basis functions to represent signals as a continuous surface. In this paper we focus on Non-uniform Rational Basis Splines (NURBS) because of their ability to model complex surfaces [21], [44], [45]. To aid understanding, we explain NURBS using the 2D CT imaging example that was introduced in Section I-A. The measurement coordinates $\mu$ are the Cartesian product of assumed view angles and assumed detector locations, $\Theta \times T$ where $\Theta = (\theta_1, \ldots, \theta_J)$ and $T = (t_1, \ldots, t_K)$. The NURBS surface $s_\varphi(\cdot)$ with parameters $\varphi$ evaluated at measurement coordinate $\mu' \in \Omega$ is

$$s_\varphi(\mu') = \sum_{j=1}^{J} \sum_{k=1}^{K} b_{j,k}(\mu') \, p_{j,k}, \tag{10}$$

where $b_{j,k}(\cdot) \in \mathbb{R}$ are scalar-valued rational basis functions with local support and $p_{j,k} \in \mathbb{R}^3$ are control point vectors. The NURBS are parameterized by the weight parameters $w_{j,k} \in \mathbb{R}$ of the rational basis functions and by the control point vectors. This gives $\varphi = \{w_{j,k}, p_{j,k}\}_{j=1,k=1}^{J,K}$. Note that the control point vectors are three-dimensional because for each two-dimensional measurement coordinate $\mu'$, there is a corresponding scalar measurement. Consequently, according to (10), $s_\varphi(\mu')$ is also three-dimensional. We then establish the following relationship between NURBS surfaces and our measurement representations (3) to get a spline measurement representation,

$$r_\varphi(\mu') = s^{\diamond}_\varphi(\mu'), \tag{11}$$

where $s^{\diamond}_\varphi(\cdot)$ denotes the value of the measurement dimension of $s_\varphi(\cdot)$.
It has recently been shown that automatic differentiation can be used to learn spline parameters [23]. Hence, we develop differentiable spline representations and use gradient-based optimization to learn the NURBS parameters $\varphi$ and their input measurement coordinates. These differentiable splines fit into our framework straightforwardly as measurement representations (11), and we can use them to solve (8).
In our numerical experiments, we carefully initialize the spline parameters $\varphi$ and then learn them: $w_{j,k}$ is initialized to one and the control point vectors $p_{j,k}$ are initialized using the assumed measurement coordinates and their corresponding observed measurements,

$$p_{j,k} = [\theta_j, \; t_k, \; y_{j,k}]^\top,$$

where $y_{j,k}$ denotes the observed measurement for the $j$th view angle and $k$th detector location. Additionally, in our simulated experiments we also extend (10) to higher dimensional measurement coordinates (see Appendix B). Further details on NURBS and their implementation are provided in Appendices C and D.
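The NURBS evaluation and the initialization described above can be sketched as follows. This is an illustration rather than our implementation (see Appendices C and D for that): it assumes cubic basis functions, clamped uniform knot vectors, a surface parameterized by $(u, v) \in [0, 1]^2$, and random numbers standing in for observed measurements. The helper names `clamped_knots`, `bspline_basis` and `nurbs_surface` are hypothetical.

```python
import numpy as np

def clamped_knots(n_ctrl, degree):
    """Clamped uniform knot vector of length n_ctrl + degree + 1 on [0, 1]."""
    interior = np.linspace(0.0, 1.0, n_ctrl - degree + 1)
    return np.concatenate([np.zeros(degree), interior, np.ones(degree)])

def bspline_basis(u, knots, degree):
    """All B-spline basis functions at u via the Cox-de Boor recursion."""
    N = ((knots[:-1] <= u) & (u < knots[1:])).astype(float)
    if u >= knots[-1]:  # assign the right end point to the last non-empty span
        N[np.nonzero(np.diff(knots) > 0)[0][-1]] = 1.0
    for d in range(1, degree + 1):
        N_new = np.zeros(len(knots) - d - 1)
        for i in range(len(N_new)):
            left_den = knots[i + d] - knots[i]
            right_den = knots[i + d + 1] - knots[i + 1]
            if left_den > 0:
                N_new[i] += (u - knots[i]) / left_den * N[i]
            if right_den > 0:
                N_new[i] += (knots[i + d + 1] - u) / right_den * N[i + 1]
        N = N_new
    return N

def nurbs_surface(u, v, ctrl, weights, knots_u, knots_v, degree):
    """Point on the surface: sum of rational basis values times control points."""
    B = np.outer(bspline_basis(u, knots_u, degree),
                 bspline_basis(v, knots_v, degree)) * weights
    B = B / B.sum()                      # rational basis functions b_{j,k}
    return np.tensordot(B, ctrl, axes=([0, 1], [0, 1]))

J, K, degree = 6, 8, 3
thetas = np.linspace(0.0, np.pi, J)      # assumed view angles
ts = np.linspace(0.0, 1.0, K)            # assumed detector locations
y = np.random.default_rng(0).standard_normal((J, K))  # stand-in measurements

theta_grid, t_grid = np.meshgrid(thetas, ts, indexing="ij")
ctrl = np.stack([theta_grid, t_grid, y], axis=-1)  # p_{j,k} = (theta_j, t_k, y_{j,k})
weights = np.ones((J, K))                          # w_{j,k} initialized to one
knots_u, knots_v = clamped_knots(J, degree), clamped_knots(K, degree)

point = nurbs_surface(0.4, 0.6, ctrl, weights, knots_u, knots_v, degree)
measurement = point[2]                   # value of the measurement dimension
print(point)
```

Because every operation is a smooth function of the weights, the control points and the evaluation location, all of them can be updated with gradient-based optimization once the code is expressed in an automatic differentiation framework.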

III. NUMERICAL VERIFICATION WITH 2D CT IMAGING
We experimentally verify our framework by solving 2D CT imaging problems in the main paper. Further simulation results for 3D CT imaging are provided in Appendix B. The imaging forward operators are implemented using the Operator Discretization Library (ODL) [46]. These simulated experiments demonstrate that our framework can be used to solve imaging problems with different measurement coordinate dimensions. Furthermore, we exhibit the flexibility of our method by using implicit neural networks and splines as measurement representations.

We use SNR (in dB) to quantify the measurement noise and measurement coordinate uncertainty. SNR is calculated by

$$\mathrm{SNR}(a, b) = 20 \log_{10} \left( \frac{\|b\|_2}{\|a - b\|_2} \right).$$

If we let $y \in \mathbb{R}^M$ denote the observed measurements as in (1) and let $\bar{y}$ denote the unobserved noiseless measurements, the measurement noise level is $\mathrm{SNR}(y, \bar{y})$. The measurement noise is simulated with zero-mean iid Gaussian noise with variance adjusted to achieve a target SNR level. Due to their state-of-the-art performance when there is no measurement coordinate uncertainty, we use deep neural networks for the reconstruction method, $G_\mu(\cdot)$, in (8). For 2D CT we use a 2D Unet [2], and for 3D CT in Appendix B, we use a 3D Unet [47]. Following standard practice, a preprocessing step applies the pseudoinverse of the imaging operator to the measurements to produce an initial image estimate [2]. These networks are then trained in a supervised manner to map the initial estimates to ground truth images. The training data generation and preprocessing step are done with the assumed measurement coordinates $\mu$. The measurements in the training data are noisy and in each experiment we use a Unet whose training measurement noise level matches the measurement noise level of the obtained measurements.
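As a small illustration of how noise can be scaled to a target SNR, the sketch below assumes the common definition of SNR in dB as $20 \log_{10}$ of the ratio between the norm of the reference and the norm of the error. The function names `snr_db` and `add_gaussian_noise` are hypothetical, and the sketch fixes the realized SNR exactly by rescaling one noise draw, which is one possible way of adjusting the variance.

```python
import numpy as np

def snr_db(estimate, reference):
    """SNR in dB of `estimate` relative to `reference` (assumed definition)."""
    error = np.linalg.norm(estimate - reference)
    return 20.0 * np.log10(np.linalg.norm(reference) / error)

def add_gaussian_noise(clean, target_snr_db, rng):
    """Add zero-mean iid Gaussian noise rescaled to hit the target SNR."""
    noise = rng.standard_normal(clean.shape)
    scale = np.linalg.norm(clean) / (
        np.linalg.norm(noise) * 10.0 ** (target_snr_db / 20.0)
    )
    return clean + scale * noise

rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0.0, 4.0 * np.pi, 500))   # stand-in noiseless data
noisy = add_gaussian_noise(clean, target_snr_db=35.0, rng=rng)
print(f"realized SNR: {snr_db(noisy, clean):.2f} dB")
```

The same helper can be applied to coordinates instead of measurements, for example to perturb a vector of assumed view angles to a chosen uncertainty level.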
As mentioned in Section II, the solution, $\hat{x}$, is given by (9). We use SNR to evaluate the solution quality, $\mathrm{SNR}(\hat{x}, x)$. We compare $\hat{x}$ against baseline reconstructions that are obtained by directly using the obtained measurements with the reconstruction Unets,

$$x_{\text{baseline}} = G_\mu(y).$$

Recall, the obtained measurements correspond to measurement coordinates $\tilde{\mu}$. Appendix D contains further hyperparameter and implementation details.

In 2D CT imaging, one-dimensional projections of a two-dimensional object at different view angles are collected by an array of detectors. Our goal is to reconstruct an image of the object from these projections when there is uncertainty in only the view angles, only the detector locations, or in both the view angles and the detector locations at the same time.
For these simulated experiments, the assumed view angles are uniformly spaced on the interval $[0, \pi]$, and the true view angles can have an unknown perturbation from these assumed view angles to simulate experimental uncertainty. We keep the view angle measurement coordinate uncertainty level the same for each view angle: each true view angle is independently perturbed from the assumed view angle by zero-mean Gaussian noise with variance adjusted to obtain a target SNR level. If we let $\tilde{\mu}_m \in \mathbb{R}$ and $\mu_m \in \mathbb{R}$ denote the $m$th true and assumed view angles, the view angle measurement coordinate uncertainty level is $\mathrm{SNR}(\tilde{\mu}_m, \mu_m)$.
Similarly, the assumed detector locations are uniformly spaced on the normalized interval $[0, 1]$ and the true detector locations can have an unknown perturbation from the assumed detector locations. To simulate detector uncertainty, we first independently perturb the first and last detectors from their assumed locations with a uniform perturbation scaled so that their location uncertainty meets a target SNR. The remaining detectors are then perturbed so that the final true unknown detector array is uniformly spaced between the perturbed first and last detectors. Unlike the view angle measurement coordinate uncertainty model, this detector location coordinate uncertainty model is not iid. This enables us to evaluate the performance of our framework when there are non-iid measurement coordinate uncertainties.
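The two uncertainty models can be sketched as follows. The perturbation magnitudes `sigma` and `max_shift` are hypothetical parameters used in place of the SNR-based scaling described above, and the function names are illustrative.

```python
import numpy as np

def perturb_view_angles(assumed, sigma, rng):
    """iid model: each angle gets an independent zero-mean Gaussian shift."""
    return assumed + sigma * rng.standard_normal(assumed.shape)

def perturb_detectors(assumed, max_shift, rng):
    """Non-iid model: shift the two end detectors, then respace uniformly."""
    first = assumed[0] + rng.uniform(-max_shift, max_shift)
    last = assumed[-1] + rng.uniform(-max_shift, max_shift)
    return np.linspace(first, last, assumed.size)

rng = np.random.default_rng(0)
theta_assumed = np.linspace(0.0, np.pi, 90)
t_assumed = np.linspace(0.0, 1.0, 128)

theta_true = perturb_view_angles(theta_assumed, sigma=np.deg2rad(0.5), rng=rng)
t_true = perturb_detectors(t_assumed, max_shift=0.02, rng=rng)

spacing = np.diff(t_true)
print(f"detector spacing: assumed {t_assumed[1] - t_assumed[0]:.5f}, "
      f"true {spacing[0]:.5f}")
```

In the detector model every location error is determined by the two end-point shifts, so the errors are strongly correlated across detectors, unlike the view angle errors.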
Combining these spaces gives the measurement coordinates space as $\Omega = [0, \pi] \times [0, 1]$. We explore the performance of our method compared to the baseline which does not account for measurement coordinate uncertainty. We use images from the LoDoPaB-CT tomography dataset resized to $128 \times 128$ [48]. From this dataset, 35,000 samples were used to train the image reconstruction 2D Unet $G_\mu(\cdot)$. Test images from this dataset are used in this section to verify our framework.

A. View angle uncertainty
In the first group of experiments we consider the case where there is only view angle uncertainty and no uncertainty in the detector locations. As there is only uncertainty in the view angle dimension and not in the detector location dimension, we only optimize the view angle dimension of the measurement coordinates in (8).
1) Combinations of measurement noise and view angle error: To determine how our framework performs under different settings, we consider different combinations of measurement noise and view angle measurement coordinate uncertainty. We do trials over 25 different test images, and in each trial, different measurement noise and view angle perturbations are used.
We evaluate the performance of our framework relative to the baseline by calculating the average reconstruction SNR improvement over the baseline. Fig. 3a shows the performance for 90 view angles. The performance trends are similar for both implicit neural and spline measurement representations. For a given measurement noise SNR, our method shows increasing improvements as the view angle SNR decreases (measurement coordinate uncertainty increases). This shows that our method handles measurement coordinate uncertainty well, especially when the uncertainties begin to dominate over measurement noise. We can also see that for a fixed angle SNR, the performance gains decrease as measurement SNR decreases and measurement errors become more dominant. Fig. 3b shows that the same trends hold when there are 120 view angles. The slight SNR drops seen with splines in Fig. 3a, when the measurement SNR is 30 dB and 35 dB, do not appear in Fig. 3b. This is because there are more view angles, which makes the imaging problem less ill-posed. In Fig. 4, we show some randomly chosen example ground truth, pseudoinverse filtered backprojection (FBP) and baseline reconstructions with their SNRs when there are 90 view angles. The reconstructions using our framework with implicit neural and spline measurement representations are also shown. Compared to the baseline, solutions obtained using our framework have fewer artifacts. Note that Fig. 4 shows specific examples, and that the average performance for different combinations of measurement noise and view angle uncertainty is shown in Fig. 3a. Fig. 11 in the Appendix shows these same reconstructions when there are 120 view angles instead.
2) Learned view angles accuracy: In the next numerical experiment we verify that the learned measurement coordinates, $\hat{\mu}$ in (8), are close to the true unknown measurement coordinates $\tilde{\mu}$. As the detector locations have no error, we verify the learned view angles only. We denote the set of true unknown view angles and learned view angles as $\tilde{\Theta}$ and $\hat{\Theta}$. We quantitatively measure the average angle error in degrees as

$$\text{Average angle error} = \frac{1}{J} \sum_{j=1}^{J} |\tilde{\theta}_j - \hat{\theta}_j|,$$

where $J$ is the number of view angles. Fig. 5 shows how the average angle error changes as the optimization iterations of (8) progress. This is shown for one of the test images with different combinations of measurement noise and view angle uncertainty when there are 90 view angles. The solid lines are for implicit neural representations and the dashed lines are for spline representations. For both representation types, the angle error reduces as (8) is solved, which confirms that our framework learns measurement coordinates that are more accurate than the assumed measurement coordinates with which they were initialized. The average angle error when using the assumed measurement coordinates for reconstruction, as is done in the baseline, is the initial point on the plots. In some instances, the average angle error may increase slightly as the optimization of (8) progresses. This behavior can be explained as overfitting of the measurement representations. Fig. 12 in the Appendix shows that the same trends hold when there are 120 view angles.
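For reference, the average angle error can be computed as in the short sketch below. It assumes the view angles are stored in radians and converted to degrees before averaging; the function name and the example values are illustrative.

```python
import numpy as np

def average_angle_error_deg(theta_true, theta_learned):
    """Mean absolute difference between two sets of angles, in degrees."""
    diff_deg = np.rad2deg(theta_true) - np.rad2deg(theta_learned)
    return np.mean(np.abs(diff_deg))

theta_true = np.linspace(0.0, np.pi, 90)
theta_assumed = theta_true + np.deg2rad(1.0)    # every angle off by one degree
theta_learned = theta_true + np.deg2rad(0.25)   # hypothetical learned angles

error_before = average_angle_error_deg(theta_true, theta_assumed)
error_after = average_angle_error_deg(theta_true, theta_learned)
print(f"average angle error: {error_before:.2f} -> {error_after:.2f} degrees")
```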
3) Reconstruction with more measurements: Next we investigate a variant of the main problem considered in this paper: in addition to the $M$ true measurement coordinates being unknown, the reconstruction method $G_\mu(\cdot)$ is now designed for $M'$ measurements where $M' \geq M$. In this case $\tilde{\mu} = (\tilde{\mu}_1, \ldots, \tilde{\mu}_M)$ as before and now $\mu = (\mu_1, \ldots, \mu_{M'})$. As the measurement representation $r_\varphi(\cdot)$ can be evaluated at any measurement coordinate, we can evaluate (7) at $M'$ measurement coordinates.
In this simulated experiment we obtain measurements from 90 view angles. As in the previous experiments, the true view angles are unknown and we initialize the angles of the measurement coordinates $\nu$ in (6) to be 90 uniformly spaced view angles in the interval $[0, \pi]$. The detector locations have no uncertainty. We consider different values of $M'$ by varying the number of view angles $J'$. The number of detectors $K$ is not varied, which gives $M' = J' \cdot K$. Again, to obtain the strongest performance, we use state-of-the-art reconstruction Unets. We try two different reconstruction Unets: 1) $G_\mu^{J'}(\cdot)$, which was trained with training data having $J'$ view angles uniformly spaced on $[0, \pi]$, and 2) $G_\mu^{90}(\cdot)$, which was trained with training data having 90 view angles uniformly spaced on $[0, \pi]$. (The number of Unet training view angles and evaluated view angles $J'$ can be different. This is because the Unet input is computed by a filtered backprojection for $J'$ view angles, which always results in a Unet input with the dimensions of the image being reconstructed.)

Table I shows the average reconstruction SNR over 25 test images. The measurement noise level is 35 dB and the unknown true view angles are perturbed by 35 dB from 90 uniformly spaced view angles. With Unet $G_\mu^{J'}(\cdot)$, the performance fluctuates. The performance is stable with Unet $G_\mu^{90}(\cdot)$. With both reconstruction methods, the best performance is seen when $J' = 90$ ($M' = M$) and evaluating more measurements does not help. This is because the fitting loss (6) ensures that $r_\varphi(\cdot)$ accurately represents the $M$ obtained measurements which were sampled from the measurement space at coordinates $\tilde{\mu}$.
Then when $M' = M$, because the measurement coordinates used for reconstruction, $\mu$, are a small perturbation away from $\tilde{\mu}$, they are also represented accurately, which results in good reconstructions. Furthermore, when $M' = M$, there are enough measurement coordinates that densely cover the measurement space. The accuracy of the measurement coordinates for the extra measurements when $M' > M$ is not enforced by the fitting loss (6).
It has been shown that reconstruction with more measurements can help when there is no measurement coordinate uncertainty, and when the input to the reconstruction neural network combines the observed measurements with the measurement representation output [30]. When there is

B. Detector location uncertainty
The previous experiments demonstrate that our framework improves over the baseline when there is uncertainty in the view angles. Next we consider the case where there is only uncertainty in the detector location and no uncertainty in the view angles. In this set of experiments there are always 90 uniformly spaced view angles. As there is only uncertainty in detector locations, we only optimize the detector location dimension of the measurement coordinates in (8).
1) Combinations of measurement noise and detector location error: Similar to Section III-A1, we consider different combinations of measurement noise and detector location uncertainty. We do trials over the same 25 test images. In each trial, different measurement noise and detector location perturbations are used.
The average reconstruction SNR improvement over the baseline is shown in Fig. 6. For a given measurement noise SNR, our method shows increasing improvements as the detector location SNR decreases (measurement coordinate uncertainty increases). This is consistent with the previous results for view angle measurement coordinate uncertainty shown in Fig. 3. Our method handles measurement coordinate uncertainty well, especially when the uncertainties begin to dominate over measurement noise. Note that our framework does not assume a specific uncertainty model: we have used different uncertainty simulation models for view angle uncertainty in Fig. 3 and detector location uncertainty in Fig. 6. Compared to Fig. 3, the curves in Fig. 6 for high measurement SNR and measurement coordinate uncertainty SNR almost overlap at times because the baseline SNR is higher and so the average SNR improvement is lower. In Fig. 7, we show randomly chosen example reconstructions with their SNRs.
2) Learned detector locations accuracy: Similar to Section III-A2, in the next simulation we verify that the learned detector locations are close to the true unknown detector locations. We denote the sets of true unknown detector locations and learned detector locations as $T$ and $\hat{T}$. We measure the average detector location error between $T$ and $\hat{T}$, averaged over the $K$ detectors. Fig. 8 shows how the average detector error changes as the optimization progresses for different combinations of measurement noise and detector location uncertainty. Again, the solid lines are for implicit neural representations and the dashed lines are for spline representations. Compared to the assumed detector locations, for both representation types, the final learned detector locations are closer to the true detector locations.

C. View angle and detector location uncertainty
The numerical experiments in Sections III-A and III-B demonstrate that our proposed framework performs well when there is view angle or detector location measurement coordinate uncertainty. We now perform simulations when there is view angle and detector location uncertainty at the same time. There are 90 view angles in these experiments. As there is uncertainty in both the view angles and the detector locations, we optimize both the view angle and detector location dimensions of the measurement coordinates in (8).
1) Combinations of measurement noise and measurement coordinate error: Following the experiments of Figs. 3 and 6, we consider different combinations of measurement noise and measurement coordinate uncertainty. The average reconstruction SNR improvement over the baseline is shown in Fig. 9. Consistent with previous experiments, Fig. 9 shows that we are able to improve upon the baseline when there is measurement coordinate uncertainty. In Fig. 10, we also show randomly chosen example reconstructions with their SNRs when there is measurement coordinate uncertainty in both the view angles and the detector locations.

IV. CONCLUSION
We presented a differentiable imaging inverse problem framework to jointly reconstruct the unknown image and learn unknown measurement coordinates when they are approximately known. There are two major elements in our proposed method. First, we learn continuous representations of the measurements whose inputs are measurement coordinates and whose outputs are the corresponding measurements. By optimizing with respect to their parameters and their inputs, we jointly learn the measurement representation parameters and the unknown measurement coordinates. Second, because these representations can be evaluated at any input coordinate, we can leverage reconstruction methods that are designed for measurement coordinates that differ from those of the observations. Our 2D and 3D CT imaging simulations show that the benefit of using our framework increases with the level of measurement coordinate uncertainty.
As our framework does not assume a particular measurement representation, we use both implicit neural networks and splines to represent measurements. Splines are generally viewed as interpolation tools; however, our work demonstrates that they can also be learnable differentiable representations that perform comparably to implicit neural representations. Differentiable splines may provide a viable alternative in current research directions that have focused on implicit neural networks, which can have significantly more parameters and complexity [24].
A strength of our framework is that no extra training data is required to learn the measurement representations. However, a drawback is that it can be time consuming if there are multiple test images because we have to learn separate measurement representations and measurement coordinates for each new set of observations. Therefore, extending our framework to jointly recover a batch of images and the shared unknown measurement coordinates is an important step towards helping practitioners adopt our framework. Batch imaging also introduces robustness which can help learn more accurate measurement representations. Another important endeavor is to adapt our framework to account for operator uncertainties due to reasons other than the measurement coordinate uncertainty studied in this paper. For example, there may be approximation uncertainties if the true operator is approximated to enable faster computations and facilitate analysis. Not accounting for the approximation can lead to degraded solutions [49]. Another source of operator uncertainty arises when the object being imaged is altered in an unknown manner during the measurement acquisition process. For example, in CryoET imaging, the sample can translate and deform during imaging, which needs to be taken into account [50], [51], [52].

Besides the learnable weights and learnable control point vectors explained in Section II-C2, there is a knot vector for each dimension of the measurement coordinates that is not learnable. The NURBS surface also has degrees $d_\Theta, d_T \in \mathbb{Z}^+$ for the view angle and detector location measurement coordinate dimensions. The knot vector for the view angle dimension has $(J + d_\Theta + 1)$ elements which are arranged in ascending order. We design its $u$th element, $k_u$, to be zero when $0 \le u < d_\Theta + 1$, uniformly spaced between zero and one when $d_\Theta + 1 \le u \le J$, and one when $J < u \le J + d_\Theta$ [23]. The knot vector for the detector location dimension has $(J + d_T + 1)$ elements and is made in a similar manner by using its degree $d_T$.
The rational basis functions defined in (10) are given in (16), where the function $Q_{u,d}(\cdot)$ is the $u$th B-spline basis function of degree $d$. If $k_u$ denotes the $u$th element of a knot vector, each basis function can be obtained using the Cox-de Boor recursion,

$$Q_{u,0}(s) = \begin{cases} 1 & \text{if } k_u \le s < k_{u+1}, \\ 0 & \text{otherwise,} \end{cases} \tag{17}$$

$$Q_{u,d}(s) = \frac{s - k_u}{k_{u+d} - k_u}\, Q_{u,d-1}(s) + \frac{k_{u+d+1} - s}{k_{u+d+1} - k_{u+1}}\, Q_{u+1,d-1}(s). \tag{18}$$

From (17), we can see that a degree zero NURBS is constructed from piecewise constant basis functions. The recursion (18) is then used to create higher degree basis functions with larger support. If the denominator in any term of (18) is zero, that term is taken to be zero. The NURBS surface for two-dimensional measurement coordinates in (10) is the tensor product of two one-dimensional NURBS curves as shown in (16). To create NURBS surfaces for $D$-dimensional measurement coordinates, we take the tensor product of $D$ one-dimensional NURBS curves. This is done for $D = 3$ in Appendix B.
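To make the knot vector design and the Cox-de Boor recursion concrete, the following is a minimal pure-Python sketch. It is not the NURBS-Diff implementation; the function names `clamped_knots` and `bspline_basis` are hypothetical, and the knot layout assumes that the uniformly spaced interior knots reach one at the last interior index, which is one reading of the design described above.

```python
def clamped_knots(n_ctrl, degree):
    """Clamped knot vector with n_ctrl + degree + 1 elements in [0, 1].

    Assumption: element u is zero for u <= degree, rises uniformly,
    and is one for u >= n_ctrl.
    """
    if n_ctrl <= degree:
        raise ValueError("need more control points than the degree")
    span = n_ctrl - degree
    return [min(max((u - degree) / span, 0.0), 1.0)
            for u in range(n_ctrl + degree + 1)]


def bspline_basis(u, d, s, knots):
    """Evaluate the u-th B-spline basis function of degree d at s."""
    if d == 0:
        # Degree zero: piecewise constant on the half-open knot span.
        return 1.0 if knots[u] <= s < knots[u + 1] else 0.0
    value = 0.0
    left_den = knots[u + d] - knots[u]
    if left_den > 0.0:  # a term with zero denominator is taken to be zero
        value += (s - knots[u]) / left_den * bspline_basis(u, d - 1, s, knots)
    right_den = knots[u + d + 1] - knots[u + 1]
    if right_den > 0.0:
        value += ((knots[u + d + 1] - s) / right_den
                  * bspline_basis(u + 1, d - 1, s, knots))
    return value


# Example: six control points, degree two, evaluated at s = 0.4.
knots = clamped_knots(n_ctrl=6, degree=2)
values = [bspline_basis(u, 2, 0.4, knots) for u in range(6)]
```

For any `s` in [0, 1) the basis values are non-negative and sum to one, which is a quick sanity check on an implementation. Because the degree zero functions use half-open intervals, `s = 1.0` needs separate handling in practice.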

A. Framework optimization
In this section we provide implementation details for solving our optimization problem (8). All parameters were tuned on a held-out set of images for three randomly chosen measurement and operator error combinations.
To implement NURBS, we modified and extended the PyTorch source code released by the NURBS-Diff module authors [23]. The implicit neural representation and the neural network for $G_\mu(\cdot)$ in (8) are also implemented in PyTorch. This enables us to conveniently optimize the objective function (8) using automatic differentiation and the Adam optimizer.
When implicit neural networks are used, the optimization is run for at least 8,000 iterations and at most 20,000 iterations. The optimization is terminated when the change in the loss value between successive iterations is below $1 \times 10^{-10}$ for 2D CT imaging and $1 \times 10^{-11}$ for 3D CT imaging. When splines are used and there is any view angle or tilt angle uncertainty, the optimization runs for at least 2,000 iterations and at most 5,000 iterations. If there is only detector location uncertainty, the optimization runs for at least 6,000 iterations and at most 15,000 iterations. Additionally, when splines are used, the optimization for all imaging problems is terminated when the change in the loss value between successive iterations is below $1 \times 10^{-11}$.
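The iteration limits and loss-based termination described above can be sketched as a small driver loop. This is an illustration under assumptions, not the authors' code: the helper name `run_with_stopping` is hypothetical, the termination test is assumed to apply only after the minimum number of iterations, and `step` stands in for one optimizer update that returns the current loss.

```python
def run_with_stopping(step, min_iters, max_iters, tol):
    """Call step() until the loss change is below tol or max_iters is hit.

    The convergence test is only honored once min_iters iterations
    have been run (an assumption about how the two rules combine).
    Returns the number of iterations run and the final loss.
    """
    if max_iters < 1 or min_iters > max_iters:
        raise ValueError("require 1 <= min_iters <= max_iters")
    previous = None
    loss = None
    for iteration in range(1, max_iters + 1):
        loss = step()
        converged = previous is not None and abs(previous - loss) < tol
        if iteration >= min_iters and converged:
            return iteration, loss
        previous = loss
    return max_iters, loss


# Toy usage: each step halves x, and the loss is x squared.
state = {"x": 1.0}


def toy_step():
    state["x"] *= 0.5
    return state["x"] ** 2


iters, final_loss = run_with_stopping(toy_step, min_iters=5,
                                      max_iters=100, tol=1e-10)
```

With the settings reported above for implicit neural networks in 2D CT, the call would use `min_iters=8000`, `max_iters=20000` and `tol=1e-10`.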
1) 2D CT imaging: When implicit neural representations are used, $\lambda = 0.1$ in (8). The learning rate for both the neural network parameters and the input coordinates is $5 \times 10^{-4}$. When splines are used, $\lambda = 0.025$ if there is only view angle uncertainty, and $\lambda = 0.25$ if there is any detector location uncertainty. The learning rate for the spline parameters is $5 \times 10^{-2}$ and for the input coordinates is $2 \times 10^{-4}$.
For the implicit neural representation, we use the cosine of the view angle rather than the angle when creating the measurement coordinate. This encodes the circular nature of angular data and improves performance.

B. Reconstruction Unet training
In this section we describe the implementation details for the Unet neural networks used for the reconstruction method $G_\mu(\cdot)$ in (8).
A mean-squared error loss function is minimized using Adam during training. We train the Unets for 100 epochs, where one epoch is a full pass through the training dataset. For the 2D Unet, a batch size of 128 and a learning rate of $1 \times 10^{-3}$ are used. For the 3D Unet, a batch size of 16 and a learning rate of $1 \times 10^{-3}$ are used.
Publicly available Unet architectures were downloaded and trained. Unless mentioned here, the default parameters from the download sources were used. The 2D Unet model is from https://github.com/mateuszbuda/brain-segmentation-pytorch [56]. We used one input channel, one output channel, and 16 features in the first layer. The 3D Unet model is from https://github.com/ELEKTRONN/elektronn3. We used one input channel, one output channel, 16 features in the first layer and a depth of four blocks.
Due to the limited size of the 3D volume training dataset for 3D CT imaging, we use a data augmentation strategy. We perform random horizontal flips, random vertical flips, and random rotations by 90, 180 or 270 degrees.
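The augmentation strategy above can be illustrated with a short pure-Python sketch. Several details are assumptions rather than facts from the text: the volume is stored as a list of 2D slices, the same in-plane transform is applied to every slice, each flip is applied with probability 0.5, and the rotation is drawn uniformly from zero to three quarter turns. The helper names are hypothetical.

```python
import random


def flip_horizontal(image):
    """Reverse the column order of a 2D list."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Reverse the row order of a 2D list."""
    return [list(row) for row in image[::-1]]


def rotate_90(image):
    """Rotate a 2D list by 90 degrees counterclockwise."""
    return [list(row) for row in zip(*image)][::-1]


def augment_volume(volume, rng):
    """Apply one random flip/rotation combination to every slice."""
    do_hflip = rng.random() < 0.5
    do_vflip = rng.random() < 0.5
    quarter_turns = rng.choice([0, 1, 2, 3])  # 0, 90, 180 or 270 degrees
    augmented = []
    for image in volume:
        image = [list(row) for row in image]  # copy, leave input untouched
        if do_hflip:
            image = flip_horizontal(image)
        if do_vflip:
            image = flip_vertical(image)
        for _ in range(quarter_turns):
            image = rotate_90(image)
        augmented.append(image)
    return augmented


# Example on a tiny volume with two 2 x 2 slices.
volume = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
augmented = augment_volume(volume, random.Random(0))
```

Square slices keep their shape under every quarter turn, which is convenient when the augmented volumes are batched for training.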

C. NURBS
1) 2D CT imaging: For 2D CT imaging, the NURBS degree along the measurement coordinate dimension being learned is 18. It is two in the measurement coordinate dimension that is not being learned. Furthermore, where there is view angle uncertainty, we create additional control points to ensure the spline measurement representations satisfy the Radon transform measurement consistency conditions [57] $r_\phi([\theta + \pi, t]^T) = r_\phi([\theta, -t]^T)$, where a detector location of $-t$ refers to the $t$th last detector. This gives the locally supported NURBS basis functions (16) a sufficient number of control points around 0 and $\pi$ radians and ensures the NURBS is accurate in the interval $[0, \pi]$ radians. Specifically, if $d_\Theta$ is the degree of the spline in the view angle dimension, we use the consistency condition to create new control points for (10) for $1 \le j \le d_\Theta$ and for $J - d_\Theta < j \le J$, where $J$ is the number of view angles in the observed measurements. Essentially, if there are $K$ detectors, we create $2 d_\Theta K$ additional control points.
2) 3D CT imaging: For 3D CT, the NURBS degree along the tilt angle measurement coordinate dimension is 15. It is two in the two detector location dimensions. Additionally, for this imaging technique, we fix the basis function weights to one and do not optimize them. This makes the NURBS surface a B-spline surface.
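For the 2D CT case with view angle uncertainty, one way to read the control point construction is as a wrap-around padding of the control grid along the view angle dimension with the detector order reversed. The sketch below follows that reading. It is an assumption about the layout rather than the authors' implementation, and the function name `pad_with_consistency` is hypothetical. The grid is a list of $J$ rows (view angles), each with $K$ detector entries.

```python
def pad_with_consistency(control, degree):
    """Extend a J x K control grid with 2 * degree extra rows.

    Rows for the last `degree` angles are copied, detector-reversed,
    to angles shifted by -pi. Rows for the first `degree` angles are
    copied, detector-reversed, to angles shifted by +pi. This mirrors
    r(theta + pi, t) = r(theta, -t), where -t is the t-th last detector.
    """
    n_angles = len(control)
    if degree > n_angles:
        raise ValueError("degree cannot exceed the number of view angles")
    flipped = [row[::-1] for row in control]
    before = flipped[n_angles - degree:]  # placed before angle zero
    after = flipped[:degree]              # placed after angle pi
    return before + [list(row) for row in control] + after


# Example: J = 4 view angles, K = 3 detectors, degree 2.
control = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
padded = pad_with_consistency(control, degree=2)
extra_points = (len(padded) - len(control)) * len(control[0])
```

In this example `extra_points` equals $2 d_\Theta K = 2 \cdot 2 \cdot 3 = 12$, matching the count stated above.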

Fig. 1. Visualization of the parameter mismatch between µ and µ in 2D CT imaging that is described in Section I-A. A mismatch in the view angles between µ and μ can produce a significant drop in reconstruction quality.

Fig. 3. SNR improvement (dB) when solving (8) for 2D CT imaging. There are different combinations of measurement noise SNR and view angle uncertainty SNR. We consider 90 and 120 view angles.

Fig. 4. Example reconstructions for different measurement noise SNR and view angle uncertainty SNR combinations for 2D CT imaging. There are 90 view angles. The reconstruction SNRs are shown for each reconstruction.

Fig. 5. Average angle error for one test image with different combinations of measurement noise and view angle uncertainty when there are 90 2D CT view angles. The solid lines are for implicit neural representations and the dashed lines are for spline representations.

Fig. 6. Average SNR improvement (dB) when solving (8) for 2D CT imaging. There are different combinations of measurement noise SNR and detector location uncertainty SNR. There are 90 view angles.

Fig. 7. Example reconstructions for 2D CT imaging. There are different measurement noise SNR and detector location uncertainty SNR combinations and 90 view angles. The reconstruction SNRs are shown for each reconstruction.

Fig. 8. Average detector error for one test image with different combinations of measurement noise and measurement detector location uncertainty when performing 2D CT imaging. The solid lines are for implicit neural representations and the dashed lines are for spline representations.

Fig. 9. Average SNR improvement (dB) for 2D CT imaging when solving (8) for different combinations of measurement noise SNR and measurement coordinate uncertainty. There is uncertainty in both the view angles and detector locations.

Fig. 10. Example reconstructions for 2D CT imaging. There are different measurement noise SNR and measurement coordinate uncertainty SNR combinations. There is uncertainty in both the view angles and detector locations. The reconstruction SNRs are shown for each reconstruction.

Fig. 11. Example reconstructions for different measurement noise and 2D CT view angle uncertainty. There are 120 view angles. The SNRs are shown for each reconstruction.

Fig. 12. Average angle error for one test image with different measurement noise and measurement coordinate uncertainty combinations when there are 120 2D CT view angles. The solid lines are for implicit neural representations and the dashed lines are for spline representations.

Fig. 14. Example reconstructions of slices of the 3D test volumes. Reconstructions for different measurement noise SNR and measurement coordinate uncertainty SNR are shown. The reconstruction SNRs for each volume are stated.

Fig. 15. Example reconstructions for two of the orthogonal central slices of each of the 3D test volumes in Fig. 14 for 3D CT imaging. The SNRs for the entire volume are stated.

Fig. 16. Average tilt angle error for one 3D CT test volume with different measurement noise and measurement coordinate uncertainty combinations. The solid lines are for implicit neural representations and the dashed lines are for spline representations.

TABLE I
AVERAGE 2D CT RECONSTRUCTION SNR WHEN RECONSTRUCTING WITH MORE MEASUREMENTS.
When there is measurement coordinate uncertainty, using the observed measurements in the input to the reconstruction neural network can reduce performance, as shown by Fig. 2 and the baseline reconstructions in Fig. 4.