A Forward Model Incorporating Elevation-Focused Transducer Properties for 3-D Full-Waveform Inversion in Ultrasound Computed Tomography

Ultrasound computed tomography (USCT) is an emerging medical imaging modality that holds great promise for improving human health. Full-waveform inversion (FWI)-based image reconstruction methods account for the relevant wave physics to produce high spatial resolution images of the acoustic properties of the breast tissues. A practical USCT design employs a circular ring-array comprised of elevation-focused ultrasonic transducers, and volumetric imaging is achieved by translating the ring-array orthogonally to the imaging plane. In commonly deployed slice-by-slice (SBS) reconstruction approaches, the 3-D volume is reconstructed by stacking together 2-D images reconstructed for each position of the ring-array. A limitation of the SBS reconstruction approach is that it does not account for 3-D wave propagation physics and the focusing properties of the transducers, which can result in significant image artifacts and inaccuracies. To perform 3-D image reconstruction when elevation-focused transducers are employed, a numerical description of the focusing properties of the transducers should be included in the forward model. To address this, a 3-D computational model of an elevation-focused transducer is developed to enable 3-D FWI-based reconstruction methods to be deployed in ring-array-based USCT. The focusing is achieved by applying a spatially varying temporal delay to the ultrasound pulse (emitter mode) and recorded signal (receiver mode). The proposed numerical transducer model is quantitatively validated and employed in computer simulation studies that demonstrate its use in image reconstruction for ring-array USCT.


I. INTRODUCTION
U LTRASOUND computed tomography (USCT) is an emerging medical imaging technology that is being developed for a variety of clinical applications [1]- [6].For example, USCT methods for cancer detection and treatment response monitoring have been reported [1], [5], [7]- [11].Images that represent accurate estimates of the speed of sound (SOS), density, and acoustic attenuation (AA) of tissue can be produced by use of USCT image reconstruction methods [12]- [16].To account for the relevant wave physics and thereby achieve high spatial resolution images, full-waveform inversion (FWI) reconstruction methods [3], [13], [17], [18] are being developed.Such advanced reconstruction methods can circumvent the limitations of simplified physics methods [19], [20].
A few examples that employ this configuration include the SoftVue system (Delphinus Medical Technologies, Inc, Novi, MI, USA), approved by the Food and Drug Administration for the screening of women with dense breast tissue and diagnostic use for all women [36], the UltraLucid system developed by Song et al. [25], [33] and a brain imaging system developed by Guasch et al. [3], [29], [30].The ring-array system enables acquisition of ultrasound measurements at multiple vertical positions by translating the ring of transducers vertically.This design allows for slice-by-slice (SBS) image reconstruction, where a 3D volume is obtained by stacking together twodimensional (2D) slices reconstructed from USCT measurements acquired at each vertical position of the ring-array [13], [14], [37].
While the SBS reconstruction approach for ring-array-based systems alleviates the computational burden associated with directly reconstructing a 3D volume, it possesses significant limitations.First and foremost, artifacts may appear in the reconstructed images due to scattering and diffraction from out-of-plane structures.Such effects are not accounted for in the 2D imaging model [10], [37], where point-like transducers are typically assumed and the spatial impulse responses (SIRs) of transducers are not modeled [38], [39].Moreover, there are also significant differences in the way acoustic waves propagate in 2D compared to 3D.To reduce artifact levels and improve image quality in ring-array-based USCT, there remains an important need to develop reconstruction methods based on 3D wave physics models that can incorporate the elevational focusing properties of the transducers.
In this study, a 3D USCT forward model that accounts for the transducer's elevational focusing characteristics is developed to enable improved FWI-based image reconstruction in ring-array-based USCT.In this way, the out-of-plane acoustic scattering and the focusing properties (i.e., the SIRs) of the transducers, in both transmit and receive modes, are accounted for in the forward and associated adjoint models that are utilized by image reconstruction methods.Specifically, a rigid flat transducer element coupled with a concave acoustic lens is considered.The incorporation of focusing effects induced by an acoustic-lens in a numerical wavesolver-based USCT forward model is the main contribution of this work.This differs from related works [40], [41] that are mostly concerned with accurate modeling of transducers with finite curved apertures that may not conform to the computational grid.
The proposed forward model is systematically validated by the use of an analytic solution.Additionally, the developed forward and adjoint models are utilized in a numerical case study of ring-array-based USCT that employ realistic 3D numerical breast phantoms (NBPs).This demonstrates, for the first time, the feasibility of incorporating transducer elevational focusing effects in a 3D time-domain FWI reconstruction method for ring-array-based USCT and reveals improvements in image quality over the traditional SBS reconstruction approach.To enable related investigations by other researchers, the 3D NBPs and corresponding 3D USCT measurement data used in the case study have been made publicly available under CC-0 licensing [42].
The remainder of this article is organized as follows.In Section II, a brief review of the canonical USCT forward model and the time-domain FWI method is provided, as well as background on ring-array-based USCT and lens-focused transducer modeling.Section III introduces the proposed USCT forward model that incorporates elevational focusing effects.The forward model is validated and characterized as reported in Section IV.A case study is presented in Section V to demonstrate the utility of the proposed forward and adjoint models for 3D FWI.Section VI provides discussions of the numerical results presented in the case study, as well as of the limitations and future developments of the proposed approach.Finally, a summary of the work are provided in Section VII.

II. BACKGROUND A. Canonical forward models for USCT
USCT imaging models in their continuous-to-continuous (C-C) form and discrete-to-discrete (D-D) forms are reviewed below.In USCT, a sequence of short acoustic pulses are transmitted from transducer emitters and subsequently propagate through an object.The acoustic pulse is denoted as x(t) ∈ L 2 ([0, T ]) where t ∈ [0, T ] is the time coordinate, and T is a fixed finite time interval.The spatiotemporal source Here, ) is the mapping operator, r ∈ R 3 denotes the spatial coordinate, χ i (r) is the indicator function describing the support of the active area of the i-th transducer surface, and M is the total number of emitting transducers.
When the i-th source pulse s i (r, t) propagates through the object, it generates a pressure wavefield distribution denoted by p i (r, t) ∈ H 1 (R 3 × [0, T ]) 1 .The wave propagation in heterogeneous media can be described by the following lossy second-order wave equation [45]: where p i = p i (r, t), s i = s i (r, t), and c = c(r), α = α(r), ρ = ρ(r) denote the heterogeneous SOS, AA, and density distributions, respectively.The pseudo-differential operator L models power-law frequency-dependent acoustic absorption and dispersion using fractional Laplacians.In particular, the first term stems from the Chen and Holm's reformulation [46] of Szabo's causal convolution operator [47] and the second term is a dispersion correction derived from the Kramers-Kronig relations [48].The quantities µ and η denote absorption and dispersion proportionality coefficients, that are defined as µ(r) = −2α(r)c(r) y−1 and η(r) = 2α(r)c(r) y tan( πy 2 ), where y = y(r) is power-law exponent.Utilizing the fractional derivative enables the employment of non-integer power-law order y and the fractional Laplacian operator enables efficient implementation through time-domain pseudospectral methods [45].
Equation ( 1) can be expressed in operator form as: where During data acquisition, p i (r, t) is only recorded at a limited number of transducer receivers.These measurement data will be referred to as g ij (t) ∈ L 2 ([0, T ]), with the subscript i denoting that the data were produced by the ith source, also referred to as the i-th data acquisition, and the subscript j denoting the data recorded at the j-th transducer for j = 0, 1, . . ., M − 1.In the case where discrete sampling effects and transducer focusing properties are not considered, the idealized C-C forward model can therefore be described as: for all emitters i = 0, 1, . . ., M − 1 and all receivers j = 0, 1, . . ., M − 1.Here, extracts the measurements corresponding to the j-th transducer location on the measurement aperture; specifically, Note that, because of acoustic reciprocity, the mapping operator E i of the i-th transducer is the adjoint of the sampling operator M i , i.e.E i = M † i . 1 Here, H 1 (R 3 × [0, T ]) denotes the space of square integrable functions with square integrable spatiotemporal gradients.Regularity of weak solutions of the wave equation is studied e.g. in [43,Ch 3].Specifically, a weak solution of the wave equation is such that p ∈ L 2 (0, T ; H 1 (R 3 )) and ∂p ∂t ∈ L 2 (0, T ; L 2 (R 3 )), where L 2 (0, T ; H 1 (R 3 ) and L 2 (0, T ; L 2 (R 3 ) are the Bochner spaces [44].It can be shown that if p ∈ H 1 (R 3 × [0, T ]) then p and ∂p ∂t belongs to the Bochner spaces above.
The description of a digital imaging system is typically approximated in practice by a D-D imaging model.To establish this, u(r), x(t), s i (r, t), p i (r, t), g ij (t) are sampled on a Cartesian grid and at a temporal interval △t to obtain the finite-dimensional representations as u ∈ R 4K , x ∈ R L , s i ∈ R KL , p i ∈ R KL , and g ij ∈ R L .Here, K and L denote the number of spatial and temporal samples, respectively.Let [z] k denote the k-th element of a vector z.These discretized parameters can be defined as: for k = 0, 1, . . ., K − 1, l = 0, 1, . . ., L − 1, where r k denotes the k-th spatial grid point and t l = l △t denotes the l-th time sample.Given these discretized quantities, a D-D version of the idealized imaging model in (3) can be expressed as: for all emitters i = 0, 1, . . ., M − 1 and all receivers j = 0, 1, . . ., M − 1. Above, the matrix H u ∈ R KL×KL represents a discrete approximation of the wave propagation operator H u(r) , whose action is implemented by use of a numerical wave solver method.The matrix M j ∈ R L×KL denotes the discretization of the sampling operators M j corresponding to the j-th transducer, and the superscript • ⊺ denotes the transpose operation.Specifically, M j = Mj ⊗ I L , where Mj ∈ R K×1 stems from the discretization of the indicator function χ j (r) (i.e.[ Mj ] k = χ j (r k ) for k = 0, 1, . . ., K − 1), I L ∈ R L×L is the L-dimensional identity matrix, and ⊗ denotes the Kronecker product.In the special case where the j-th transducer is point-like and located at the position r k , the entries of the matrix Mj are all zero but the k-row entry that is equal to 1.
Finally, by collecting in a single vector g i ∈ R M L the measurement data g ij recorded by all receivers when the ith emitter is excited, the discrete imaging model in (4) can be rewritten in a more compact form as where Here, the semicolon notation denotes row-wise concatenation.

B. Time-domain waveform inversion with source encoding (WISE) in its discrete form
The USCT reconstruction problem is to estimate the tissue acoustic properties u, or a subset of them, from a collection of (noisy) measurement data g i ∈ R M L , for i = 0, 1, . . ., M − 1.The underlined notation denotes that g i is measured data as opposed to the simulated data g i produced by use of the forward model.This problem can be solved by use of the WISE method [13], [49], [50].The WISE method is a FWI method that circumvents the large computational burden of conventional FWI [13], [18] by leveraging the superposition of acoustic waves corresponding to the excitation of multiple emitters.
When an ℓ 2 -norm is employed as the misfit functional, the WISE method can be formulated as a stochastic optimization problem as [13]: where w ∈ R M is a random encoding vector with zero mean and identity covariance matrix, E w denotes the expectation w.r.t w, λ is a regularization parameter and R(u) is a regularization penalty.The quantities x are the encoded measurement data and the encoded source, respectively.In previous studies, the encoding vector w has been chosen according to a Rademacher distribution [13], [51].

C. Modeling of lens-focused transducers based on geometrical acoustics approximation
In ring-array USCT systems, an elevational focusing effect can be achieved by use of an acoustic lens attached to the front face of a flat transducer element [32], [52], [53].The variation in the thickness of a concave lens introduces a spatially dependent time-delay in the wavefield upon propagation through the lens.
One approach to modeling a flat transducer is based on the Rayleigh-Sommerfeld integral [54], [55].In transmit mode, the generated acoustic pressure can be described as the superimposition of the contributions of point sources that span the transducer aperture Ω ⊂ R 3 , which describes the active area of the transducer element.An acoustic source on an aperture may be modeled as a velocity source or a pressure source.When an aperture surface Ω of an otherwise perfectly rigid boundary vibrates with a normal velocity v ⊥ (r, t), the acoustic pressure p(r, t) radiated from Ω, without a lens present, into a homogeneous non-lossy medium with constant SOS c 0 and density ρ 0 can be analytically described as: (7)   Equation ( 7) can be generalized to the case where a lens is attached to the transducer by incorporating a time delay into the normal velocity term.One way to accomplish this is to invoke a geometrical acoustics approximation to describe the interaction of the pressure wavefield with the lens, and neglect any effects associated with mode conversion into shear waves and refraction at the lens surface [56]- [61].The impacts associated with attenuation of the lens and impedance mismatches between the lens and propagation medium can be modeled by introducing an apodization weights function, which is discussed in Appendix B but omitted in this section.
In this case, the transducer properties introduced by the curvature of the lens can be modeled implicitly by defining the normal component of the velocity as Here, the quantity v(t) denotes the normal velocity of the transducer element front face that is constant in space over the aperture Ω, and τ (r ′ ) is a spatially varying time shift.
The time shift is determined by the thickness profile d(r ′ ) and SOS, c lens , of the concave lens as [58]- [60]: In receive mode, the recorded signal is affected by the finite detecting aperture, again referred to as Ω, and the time-delays introduced by the lens mounted on it [62].In this case, the measured signal g Ω (t) can be analytically described by an integral of the incident pressure wavefield evaluated over Ω [63], [64] with time delays defined in (9): In practical applications of USCT, the acoustic medium is spatially heterogeneous and analytic expressions for p(r, t) are generally not available.Simulation of the transmitted or received measurement signal therefore requires use of a numerical wavesolver method as described below.

III. USCT FORWARD MODEL FOR USE WITH ELEVATION-FOCUSED TRANSDUCERS
In this section, a new forward model for ring-array USCT that incorporates the elevational focusing properties of the transducers is formulated.The forward model is presented first in a C-C form then in its D-D form.The adjoint model for use in FWI is also provided in a D-D form.

A. The C-C forward model incorporating focusing effects
Here, a lens-focused transducer model will be combined with a wave propagation model to establish an overall forward model for lens-focused USCT.To accomplish this, the relationship between the source pulse s i (r, t) in (1) and the normal velocity v ⊥ (r, t) in the transducer model ( 7) is described and a new sampling operator and its adjoint are introduced.
For the rigid baffle transducer model employed in Section II, the forcing term s i (r, t) can be modeled as a mass source in (1) [65].Specifically, the forcing term has the form where ρ0 is the ambient density, S m (r, t) = 2 ρ0 v ⊥ (r, t) is a mass source, v ⊥ (r, t) and χ i (r) are the normal component of the velocity and the indicator function describing the support of the i-th transducer as introduced above [65], [66].By use of ( 8), ( 11) can be re-expressed as: where τ i (r) is the time delay function associated with the i-th transducer.Equation ( 12) defines the excitation source in the lens-focused USCT forward model provided below.This source can be further characterized in terms of a mapping operator In terms of these quantities, the source and operator (M τ i ) † are related as: The measurement data g ij (t) produced by the i-th source and measured by the j-th transducer can be described by use of the sampling operator (M τ j ) as: Finally, the idealized C-C forward model in (3) can be generalized to the case of elevational-focused USCT as:

B. The D-D imaging model incorporating focusing effects
Here, a D-D version of ( 15) is provided.First, the discrete counterpart of the time delay operator D τ , denoted as D τ ∈ R L×L , is defined as: where ϕ ∈ R L denotes a 1D discrete temporal signal, F and F −1 denote the 1D discrete Fourier transform and its inverse, ⊙ represents element-wise multiplication, and k is a set temporal frequencies defined as: As defined in Section II-A, L denotes the number of temporal samples of the signal that are spaced by ∆t.Note that ( 16) allows for arbitrary time delays to be implemented that are not necessarily exact multiples of the sampling interval ∆t.
Next, the transducer aperture Ω is discretized to conform to the assumed computational grid.Due to the high aspect ratio of the transducers often employed in ring-array USCT [31], [32], the transducer width is neglected and the aperture is described approximately by a line that is parallel to the vertical axis and whose length corresponds to the height of the transducer.The line-aperture is divided into N consecutive line-segments, with the length of each segment corresponding to height of a voxel in the computational grid.An example of a lens-focused transducer and such segmented line-aperture is depicted in Fig. 1.When the line-aperture does not evenly bisect the voxels, a nearest-neighbor interpolation method is employed where each segment is assigned to the nearest grid point.Hereafter, the set K Ωi ⊂ {0, 1, . . ., K − 1} will denote the collection of indexes k corresponding to the grid points associated with the i-th transducer aperture (r k ∈ Ω i ).The cardinality of K Ωi is the number N of consecutive line-segments that form the line-aperture.
Next, as a prerequisite for establishing the D-D forward model, discrete versions of ( 13) and ( 14) are established.Firstly, a sampling matrix M τ j corresponding to the j-th receiver is introduced as the discrete counterpart of M τ j .It is defined as , where τ j (r k ) is the time delay of the j-th transducer at a position r k , and such that the k-th element is equal to 1 and other elements are all zero, where δ ik is a Kronecker delta function.Similarly, the mapping matrix for the i-th transmitter, which is the transpose of the sampling matrix, is defined as . Note that the time shift operator is self-adjoint and the Kronecker product is commutative.
To establish the discrete counterpart of ( 13), the spatialtemporal source term is specified by use of the mapping matrix (M τ i ) ⊺ .Specifically, for the i-th transmitter, the source term is defined as: for k = 0, 1, . . ., K − 1 and l = 0, 1, . . ., L − 1, where x ∈ R L is a 1D discretized signal temporally sampled from the continuous velocity source as [x] l := 2ρ 0 ∂ ∂t v(t l ).In (17), the source s i can be interpreted as a superposition of temporally-delayed contributions produced by the N linesegments belonging to the i-th transducer Ω i .
To establish the discrete counterpart of ( 14), the pressure data recorded by receivers is specified by use of the sampling matrix M τ j .Specifically, the discrete pressure data g ij that are produced by the i-th source and recorded at the j-th transducer are described as where p i (r k ) denotes 1D temporal discrete pressure data at the grid point r k generated by source s i , and the summation is over the j-th receiver aperture Ω j .In (18), the received data are formed as a superposition of discrete pressure signals corresponding to different locations on the transducer aperture, each with appropriately defined time shifts.Finally, a D-D version of the C-C forward model in ( 15) is obtained as where . This equation represents a generalization of the idealized D-D forward model in (5) to the case of elevational-focused USCT.

C. Formulation of the adjoint equation
Similar to the formulation reviewed in Section II-B and reusing the same notation, the reconstruction problem in 3D ring-array based USCT can be defined as: To enable reconstruction of the SOS distribution, the gradient of data fidelity term in (20) with respect to c, denoted by ∇ c J w , can be computed via an adjoint state method [67], [68].The discretized expression for the gradient is given by [13], [18], [69]: where k, l are the indices of grid points and time steps, respectively.The quantity p w ∈ R KL denotes the wavefield data generated by the encoded source as p w = H u s w and q w ∈ R KL denotes the adjoint wavefield data that are computed as: Here, η w ∈ R M L is the time-reversed data residual and j is the index of the transducer.Equation ( 21) holds when non-lossy media is considered.In attenuating media, ( 21) is an approximate expression for the gradient with respect to the SOS, as additional terms coming from AA modeling are neglected [69].

IV. VALIDATION AND CHARACTERIZATION OF THE IMAGING MODELS
In this section, the proposed forward model is validated and characterized.First, the forward model is validated by use of a semi-analytical approximation of the Rayleigh-Sommerfield diffraction integral solution for the case of a lossless homogeneous medium.Next, a sensitivity map is computed to describe the region of a 3D object that is effectively "seen" by a specified focused emitter-receiver pair in a ring-array system.In the following studies, all emitters and receivers were specified using the same elevation-focused transducer model.Here, p 0 (r 0 ) and p 1 (r 0 ) denote the pressure measured at a specific location r 0 (x = 71 mm, y = 1 mm, z = 0.1 mm), while p 2 (r 0 ) represents the response of the lens-focused transducer when a point-like source is at location r 0 .The source pulse depicted in Fig. 2(a) was employed to compute all pressure fields p 0 , p 1 , and p 2 .Specifically, a line transducer with a height of 18 mm and a concave lens with a parabolic curvature was considered.The SOS in the lens was set as c lens = 4500 m/s.The shape of the excitation source is shown in Fig. 2(a), and the profile of the lens is shown in Fig. 2(b).The source pulse had a central frequency of 2 MHz and a maximum frequency of 3.5 MHz.Additional information regarding the employed source pulse, acoustic lens curvature, and associated time delays can be found elsewhere [38].

A. Validation of the forward model
In the first validation study, the pressure wavefield produced by use of the numerical transducer model and measured by use of an ideal point transducer was compared to the reference solution described in ( 7)-( 9).This is equivalent to validating the source s i = (M τ i ) ⊺ x and operator H u s i .The reference solution was computed in a semi-analytical way, where the integrals were computed numerically by the distributed point source method [70].The action of the operator H u was implemented by use the k-space pseudospectral method [66], [71].The computational domain was a uniform 3D Cartesian grid with 1280 × 1280 × 144 cubic voxels.Each voxel had a dimension of 0.2 mm.Accordingly, the transducer was divided into a 1D array of 90 voxels, corresponding to a height of 18 mm.The discretized source term s i with varying time delays was assigned to the 90 voxels as multiple mass sources.To ensure accurate simulation using pseudospectral methods, the spatial discretization of the pressure wavefield was set to 3 points per wavelength [71].To ensure numerical stability, the Courant-Friedrichs-Lewy (CFL) number was set to 0.3, which yielded a time step size of 0.04 µs.The medium was assumed to be lossless and homogeneous with a SOS of 1500 m/s and a density of 1000 kg/m 3 .
Figure 3(a) and 3(b) show spatial maps of the recorded maximum pressure amplitude in a lateral plane, corresponding to the numerical model (p 1 ) and reference method (p 0 ).Also, the spatial relative difference map of the two solutions is shown in Fig. 3(c), where is relative error is less than 0.1% over all the spatial domain.Plots of the time-varying pressure signal at a fixed measurement location r 0 are compared in Fig. 3(d).The pressure signals produced by use of the numerical (p 1 (r 0 )) and reference methods (p 0 (r 0 )) are nearly identical.These results serve as a validation of the operator H u s i , which is a special case of the forward model when an idealized point receiver is assumed.
To validate the overall forward operator M τ j H u s i for use with lens-focused receiving transducers, a numerical experiment was designed in which the validated operator H u s i and the principle of acoustic reciprocity were involved.Traditionally, the reciprocity principle is stated in terms of a monopole point source and a point receiver; however, the same principle also holds for transducers with an extended aperture and lens curvature [72].Here, the validation study exploited the fact that the pressure signal measured by the lensfocused transducer in response to a point source is equivalent to the pressure signal produced by the lens-focused transducer and observed at the point source location, assuming the same excitation source pulse is used.A simulation was performed that employed an idealized point transducer emitter located at r 0 .The focused emitter used for the validation of H u s i was now employed as a receiver to record the signal produced by the point emitter, which is denoted as p 2 (r 0 ) in Fig. 3(d).The close agreement between p 0 (r 0 ) and p 2 (r 0 ) in Fig. 3(d) serves as a validation of the overall transducer model.

B. Characterization via the sensitivity map
To characterize the combined focusing effect of a given emitter-receiver transducer pair, a sensitivity map was computed.This served to visualize the region of the 3D object that can significantly influence the data recorded by that pair.The sensitivity map S(r) is defined as: where p 0 (t) is the recorded pressure signal corresponding to the homogeneous medium and p r (t) is the recorded pressure signal when the point target is inserted at a location r.As such, the value at each location in the sensitivity map describes how the measured pressure data corresponding to a homogeneous medium would be perturbed by a point target at that location.In this study, the homogeneous medium was defined by the following acoustic parameters: c 0 = 1500 m/s, ρ 0 = 1000 kg/m 3 , α = 0.The inserted point target was modeled as a 3D Gaussian perturbation in the SOS.After the insertion of a point target at location r, the SOS distribution was specified as c r (r ′ ) = c 0 + c 1 e − (r ′ −r) 2 2σ 2 , where c 1 = 70 m/s and σ = 0.8 mm. Figure 4 shows the relative location of a diametrically opposed emitter-receiver separated by a distance of 220 mm and an illustration of the inserted point target.The sensitivity map was computed by varying r within the field of view.Figure 5 shows the computed sensitivity map.Point targets were uniformly distributed between -10.8 mm and 10.8 mm in the vertical direction with a spatial sampling interval of 0.4 mm, and between -104 mm and 104 mm in the radial direction with a spatial sampling interval of 3.2 mm.The map depicts a thin region (about 6 mm) of high sensitivity to acoustic heterogeneities near the elevation focus.
By enabling visualization of the effective focusing effect achieved by a given emitter-receiver pair, the sensitivity map can provide insights into the vertical resolution of ring-array USCT systems.Additionally, it can guide the choice of parameters used in the reconstruction method, such as the height of the required computational domain.If measurements acquired at multiple ring-array locations are to be utilized together for image reconstruction, the sensitivity map can also guide the design of the step size by which ring-array should be vertically translated during data-acquisition.

V. CASE STUDY
To demonstrate the utility of the proposed forward models for ring-array-based USCT, a virtual imaging study that em-ployed realistic 3D NBPs was conducted.This study sought to understand whether 3D image reconstruction of a slab-shaped volume could yield improved image quality as compared to the conventional 2D SBS approach.The impact of transducer modeling errors on reconstructed 3D images was also investigated.

A. Virtual imaging system
A virtual ring-array-based USCT system with elevational focusing prescribed by the proposed focused transducer model was assumed.The imaging system comprised 128 elevationfocused emitters and 1024 elevation-focused receivers that were evenly arranged in a ring array of a radius of 110 mm [32], [34].The transducer parameters assumed in Section IV were employed.Specifically, a lens with parabolic curvature a = 0.014 mm −1 (cf.Fig. 2(b) ) and SOS c lens = 4500 m/s was assumed.During data-acquisition, measurement data at three different ring-array locations, referred to as ring -1, 0, +1, were acquired by translating the ring-array vertically and stopping at three equispaced locations.The distance between two consecutive scanned imaging planes was 1.8 mm, corresponding to one-tenth of the transducer height.In the subsequent reconstruction studies, the multi-ring data were only utilized to generate a good initial guess for 3D studies.Only single-ring data from ring 0 were utilized in the final reconstruction.
Thin slabs extracted from two 3D anatomically realistic NBPs [73] corresponding to a BI-RADS breast density category B (scattered areas of fibroglandular density) and C (heterogeneously dense) were virtually imaged.Heterogeneous SOS, AA and density distributions were considered, with realistic values of the tissue properties assigned [73].Figures 6(b-c) shows crossectional maps of the SOS distribution of the type B and type C NBPs in the transverse plane, respectively.Note that the field of view for the breast type C was chosen near the nipple, resulting in a pronounced vertical curvature.
Further implementation details regarding the wave simulations and measurements data acquisitions are presented in Appendix C.

B. Image reconstruction studies
The goal of the studies was to investigate the robustness of 3D FWI to transducer modeling errors, as it can be  challenging to accurately represent the focusing properties of actual transducers due to their complex design, so that a small modeling mismatch may exist between the actual transducer and the computational model.To this end, slab-shaped SOS volumes were reconstructed using 3D FWI and the proposed forward model, with the exact numerical transducer model used in the forward data generation.An additional 3D reconstruction study was conducted using a transducer model with distorted lens curvature.The transducer model overestimated the curvature of the lens by 10% (a = 0.0154 mm −1 ), which can lead to inaccurate time delays and focusing effects.As a reference, SOS images were reconstructed by 2D FWI using measurement data acquired at ring 0. The transducers were modeled as point-like sources/receivers in the imaging plane.To compare the estimated SOS from the 2D and 3D models, the 2D FWI-reconstructed image was compared to 2D slices of the 3D FWI-reconstructed images at the imaging plane of ring 0.
In these reconstruction studies, a two-step coarse-to-fine grid method for time-domain FWI was used to address the issue of constructing a good initial SOS model to avoid cycle-skipping and reduce the computational burden.In the first step, a homogeneous SOS medium (c 0 = 1500 m/s) was used as the starting model and FWI was performed on a coarser grid using lower frequency measurements, to estimate a good initial guess for finer grid reconstruction.In the second step, the estimated SOS map from the first step was refined on a finer grid using higher frequency data.Specifically, in all 3D reconstructions, initial guesses were obtained by combining three coarse grid SOS estimates using measurements acquired from rings -1, 0, +1.This approach, which uses multi-ring data, allows for a more accurate SOS representation of the object at different elevation positions, particularly in the high sensitivity region shown in Fig. 5.However, only measurements at ring 0 were used in the final finer grid FWI.In all reconstructions, a heuristic approach to compensate for the heterogeneity of AA and density media was used.Specifically, a two-region piece-wise constant model of AA/density media was used that assumes that the breast boundary is known (reflectivity imaging could be possibly used to estimate it) and assigns a constant AA/density value to the background and another to the breast region, which corresponds to a weighted average of the mean values of AA/density in fatty and glandular tissues [73].The heuristic approach is designed as a fair approach for comparing 2D and 3D methods.The accurate reconstruction of SOS with AA/density mismatches and the impact of such mismatches are not the primary focus of this paper [74].Additionally, measurements were corrupted with Gaussian independent and identically distributed (i.i.d.) noise.The implementation details of the two-step reconstruction method, the approach of the initial guess formation from three coarse grid reconstructions, the use of perturbed time delays, and the noise model are provided in Appendix C.

C. Results
The reconstructed SOS maps at ring 0 are shown in Fig. 7 for breast phantom type B, and Fig. 8 for breast phantom type C. Line profiles corresponding to the presented 2D SOS maps (at the thin yellow lines) are shown in Fig. 9.The green shaded region indicates the vertical variance of true SOS in a high sensitivity region (from -4 mm to 4 mm), where the upper and bottom boundaries are the 90% percentile and 10% percentile of the SOS distribution of each z-column.The shaded region reveals the large SOS heterogeneity of the employed phantoms along the vertical direction.

VI. DISCUSSION
The results of the case study show that modeling wave propagation in 3D and accounting for elevation focusing can overcome the limitation of the 2D approach and enable high-resolution estimates of SOS in the imaging plane, even when data from only a single elevation of the ring-array are used.Particularly, the 2D reconstruction presents significant artifacts and incorrect tissue structures due to the out-ofplane scattering and 2D/3D model mismatch.Notably, these image artifacts are similar to those observed in USCT images obtained from actual experimental data by the 2D method [10].The object estimates produced by 3D FWI have a more accurate estimation of tissue structures, SOS values, and lower artifacts level, even when the transducer curvature (and thus its focusing properties) are not known exactly.More importantly, as shown in Fig. 9, the profiles of 3D FWI results mostly reside within the shaded region, whereas the profiles of 2D FWI do not.This show that 3D FWI results can provide SOS estimates that, despite the limited spatial resolution in the vertical direction, are consistent with SOS values of the 3D object within the high-sensitivity region of the ring-array.The results demonstrated that such a 3D reconstruction approach can improve the accuracy of SOS reconstruction, and it is robust with respect to uncertainty in the transducer model parameters.
However, mismatches clearly exist between the true object and the object estimated by 3D FWI.Also, the 3D FWI results appear smoother compared to the 2D FWI.These are expected since only single-ring measurements were used in the finer grid reconstruction, which results in an underdetermined 3D reconstruction problem.This is the cause of the limited spatial resolution of the images.On the other hand, the single-ring measurement data are sufficient for 2D image reconstruction; however substantial modeling errors arise by applying a 2D imaging model to reconstruct data generated via 3D wave propagation physics and lens-focusing.As such, the images reconstructed by use of the 2D method appear sharp but contain conspicuous artifacts due to the modeling error involved.This motivates the development of new algorithms incorporating USCT data corresponding to multi-position of the ring-array to further improve the spatial resolution of 3D reconstruction [75].
The proposed model possesses a few limitations.First, the in-plane focusing (directivity) of the transducers is not modeled due to the high aspect ratio of the transducers employed in ring-array USCT.Additional discussions about modeling the directivity in ring-array USCT, especially when the in-plane aperture is not significantly larger than the pixel size, can be found in the literature [41], [76].Second, the transducer model is Cartesian grid-based and requires the use of nearest-neighbor interpolation when the transducer location does not coincide with the voxel size.Non-Cartesian grids can also be explored as potential alternatives.One such approach is the isogeometric finite element method [77], which allows for an unstructured grid that can conform to the curved shape of the transducer [78], [79].Nevertheless, it has challenges including computational cost and implementation difficulties, particularly when it comes to efficient hardware parallelizations.To reduce the modeling errors due to the irregular spacing in the non-Cartesian grid inside the transducer, one potential solution is applying varying weights scaling factors to different grid points.Third, the effects of the electrical impulse response (EIR) are not considered.However, the EIR can be readily incorporated as described elsewhere [64].Fourth, in the employed lens-focused transducer model, the attenuation of the lens and the acoustic impedance mismatches at the lens surface are not taken into account.Related discussions are presented in Appendix B.
It is also worth noting that, in practical applications, lens parameters should be carefully designed for the desired focusing performance.An important consideration is the tradeoff between the desired focusing performance and the loss induced by the lens (due to AA and acoustic impedence changes as described in Appendix B) as this could reduce the signal-to-noise ratio.In addition, The SOS value and curvature profile of the lens in simulation studies may not be optimal.Nevertheless, these are parameters of the proposed transducer model that can be adjusted as needed.
However, accurate transducer modeling for real experimental data remains a challenging task.To reduce transducer modeling errors for practical use, one approach involves calibration, which treats the excitation source pulse, transducer aperture, and time delays as a surrogate model.Further discussions on this topic can be found in the literature [30], [80].
Finally, in this study, only in-plane data acquisitions were employed to emulate the existing ring-array USCT system.However, the proposed model can be extended for more flexible cases.For instance, it can be adapted to model the case where receivers are positioned at different heights or where a vertically rotated ring-array configuration is utilized.Thus, the proposed model can be also used to explore new system geometries that could potentially offer improved vertical solutions.

VII. SUMMARY
In this study, a 3D forward model on a regular Cartesian grid accounting for the elevation focusing effects in ring-array USCT systems was presented to enable accurate ultrasound simulation of the USCT data acquisition process.The focusing effects are modeled by decomposing each transducer into a discrete collection of source/receiver patches and applying a spatially varying time delay to the ultrasound pulse (emitter mode) and recorded signal (receiver mode).The proposed numerical transducer model was quantitatively validated by the use of a semi-analytical approximation of the Rayleigh-Sommerfeld integral solution and of the acoustic reciprocity principle.The case study demonstrated that, by accounting for transducers' focusing properties and 3D wave propagation effects, the proposed 3D imaging model can lead to more accurate estimates of SOS in the imaging plane and fewer artifacts compared to estimates obtained using a 2D wave propagation model, such as that used in the SBS reconstruction approach.Furthermore, the case study demonstrated the robustness of 3D FWI-based image reconstruction using the proposed imaging model with respect to uncertainty in the focusing properties of the transducer, which may be difficult to accurately characterize experimentally.
This work is a key step towards enabling improved volumetric image reconstruction in elevation-focused ring-array USCT.It also provides a foundation for conducting 3D image reconstruction studies that employ data acquired at multiple ring-array positions.

APPENDIX A PUBLICLY RELEASED DATASETS
The NBPs and computer-simulated pressure traces used in the case study have been released under Public Domain CC-0 dedication and are available for download from [42].

APPENDIX B MODELING OF ACOUSTIC IMPEDANCE DISCONTINUITY AND ATTENUATION BY APODIZATION WEIGHT FUNCTION
In a practical lens-focused transducer, discontinuities in acoustic impedance often occur at the interface between the lens and the propagation medium.Additionally, AA is often present in the lens materials.In this section, the modeling of a lens-focused transducer taking into account the acoustic impedance and AA is presented by introducing an apodization weight function.Additional simulation studies are conducted to demonstrate the use of the apodization weights together with the proposed transducer model in a scenario of practical relevance.For the transducer in emitter mode, the apodization weight function is applied to the normal velocity, such that (8) is updated as [60], [61]: where w e (r ′ ) denotes the apodization weight function at the emitter.It accounts for the attenuation loss due to the AA within the lens and reflection loss due to impedance mismatch as waves travel from the lens to a homogeneous medium.Specifically w e (r ′ ) = a(r ′ )z e (r ′ ), where a(r ′ ) is the attenuation factor, which can be modeled as a function of lens thickness d(r ′ ) and the AA value of the lens using a raybased approximation, and z e (r ′ ) is the transmission coefficient accounting for the reflection loss.Assuming the lens is a fluid medium, the transmission coefficient z e (r ′ ) for the transition from the lens to a water medium can be derived based on the laws of reflection and refraction [61], [81]: z e (r ′ ) = 2c 0 ρ 0 cos θ lens c 0 ρ 0 cos θ lens + c lens ρ lens cos θ 0 ; with sin θ lens c lens = sin θ 0 c 0 , where ρ lens and ρ 0 represent the density values in the lens and homogeneous medium, respectively.The variables θ lens and θ 0 are functions of r ′ , denoting the incident angle and refraction angle, as shown in Fig 10.
Similarly, for the transducer in receiver mode, ( 10) is updated as: where w r (r ′ ) is the apodization weight function at the receiver.
It is modeled as w r (r ′ ) = a(r ′ )z r (r ′ ), where z r (r ′ ) = 2 − z e (r ′ ) accounts for the loss when the wave travels from the propagation medium to the lens.To validate the proposed apodization weight function, and in particular, the acoustic impedance correction z e (r ′ ), an additional simulation study was conducted.A non-attenuating lens was assumed in this study.A high-resolution 2D axisymmetric simulation [82] is utilized as a reference.The simulation medium is shown in Fig. 11, where the lens is explicitly constructed within the propagation medium by assigning different SOS values to the lens region.In this case, an acoustic impedance discontinuity is present between the lens and propagation medium, where c lens ρ lens = 3c 0 ρ 0 .A non-delayed excitation source pulse is assigned to each pixel on the backplane.The phase changes are achieved by the higher SOS in the lens region.The generated wavefield data is then compared with the semi-analytical integral solution incorporating the apodization weight function (with transmission coefficient only), based on (7)(9)(24) (25).
The proposed 3D transducer model exhibits cylindrical symmetry and monopolar characteristics (i.e., no directivity) in the imaging plane.As a result, simulation of such transducer model can be efficiently represented by a 2D axisymmetric simulation, which can largely reduce computational costs and allows for simulations on a finer grid.The computational simulation domain was a high-resolution 2D Cartesian grid with 500 × 5600 pixels.Each pixel had a size of 0.04 mm.Accordingly, the transducer was divided into a 1D array of 450 pixels.To ensure numerical stability, the Courant-Friedrichs-Lewy (CFL) number was set to 0.2, resulting in a time step size of 2 ns.An apodization layer (cf.Appendix C-A) was introduced between the lens and the perfectly matched layer (PML) [83] to mitigate strong reflections at the back, top, and bottom edges of the lens due to acoustic impedance mismatches.Figure 12 shows the comparison of the generated pressure data by two methods.A reasonably close agreement between the two solutions was observed, demonstrating the effectiveness of the apodization weight function in mitigating the impedance mismatch.However, it is important to clarify that we did not anticipate a perfect match, as the 2D simulation is a discretized model with small stair-casing errors when representing the continuous curvature of the lens on a discrete grid.In addition, due to the complicated design of transducers, accurate modeling of lens-focused transducers in real applications remains a challenging task.The proposed lens model is still a simplified computational model and does not account for the presence of one or more matching layers typically used in piezo-electric transducers to reduce losses induced by acoustic impedance mismatch.To reduce the modeling errors, the apodization weight function can be treated as an unknown surrogate variable and calibrated using experimental data [30], [80].

APPENDIX C IMPLEMENTATION DETAILS OF THE CASE STUDY A. Generation of the measurement data
The forward simulation of the USCT measurement data was conducted using k-Wave [66], an open-source toolbox for time-domain ultrasound simulations based on the k-space pseudospectral method.To avoid strong wave reflections and scattering that are due to the use of a truncated thin slab phantom and that are not present in a clinical setting, the computational domain is embedded in an extended domain that includes an apodization layer and PML.In the apodization layer, heterogeneous SOS and density maps of the tobe-imaged object are smoothly apodized to match those of the homogeneous coupling medium (water).This apodization layer is introduced because PML can effectively suppress wave reflection at the boundaries of the computational domain [66], [83] only when acoustic heterogeneities are bounded away from the boundary of the domain.In the case study, the computational domain has a height of 27.2 mm and it is embedded in an extended domain with height 43.2 mm by the addition of the apodization layers (4 mm each) and PMLs (4 mm each).The discretization parameters of the forward wave simulation are reported in Table I.For all reconstruction studies, measurement data were corrupted with additive Gaussian noise ϵ with components that are assumed to be independent and identically distributed (i.i.d.) with a zero mean and a standard deviation of 3.25e-4.This results in a signal-to-noise ratio (SNR) of 22 dB, which is the logarithmic ratio between the power of the noiseless signal and the power of the noise.An example of time trace of noisy pressure data is shown in Fig. 13(a).

B. Multi-grid approach for FWI reconstruction in 3D
Details regarding the implementation of the coarse-to-fine grid FWI image reconstruction approach are described below.This multi-grid approach employs a lower-resolution lowerfrequency FWI reconstruction to provide an initial guess for the higher-resolution higher-frequency reconstruction.Doing so, not only circumvent the cycle-skipping phenomenon but also leads to significant computational savings since most FWI iterations are performed on the coarse grid.In fact, when a coarse grid with voxel size three times larger than the fine grid is used, each coarse grid FWI iteration is approximately 81 times less computationally expensive than a fine grid iteration.Notably, this multi-grid approach also allows for the use of a larger field of view for the coarse grid reconstruction.As described in (21), the computation of the gradient requires access to the forward and adjoint pressure variables at every voxel in the field of view and at every time step.Therefore, computing the gradient for the entire breast region is unfeasible on the fine grid.As a result, while it is feasible to store all pressure variables within the computational domain for the coarse grid reconstruction, in the fine grid reconstruction, only pressure variables within a thin field of view (less than 1 cm of vertical thickness) close to the imaging plane can be stored for gradient computation.
1) Coarse grid FWI reconstruction: In the first step of the reconstruction, the computational grid used for wave simulations is [480,480,108], with a voxel size of 0.6 mm.The transducer is modeled as 30 voxels in height, and the temporal step size is 0.12 µs.A low-pass Hann window filter with a cutoff frequency of 0.45 MHz was applied to the excitation pulse and measured pressure traces to prevent cycleskipping.The height of the field of view included the entire computational domain and apodization layers (35.4 mm).
2) Construction of the initial guess for fine grid reconstruction: Three image reconstructions using measurements acquired from three distinct vertical positions of the ring array where performed on the coarse grid to define the initial guess for the fine grid reconstruction.Specifically, for i ∈ {−1, 0, 1} the vertical position of the ring array is denoted by z i = i∆z, where ∆z = 1.8 mm denotes the distance between consecutive vertical positions of the ring array.The coarse grid SOS estimate using data from the ith ring location is denoted by c i (x, y, z).The initial guess for the fine grid reconstruction was then defined as a weighted sum c(x, y, z) = i∈{−1,0,+1} c i (x, y, z)β i (z) of the coarse grid SOS estimates c i (x, y, z).The weight functions β i (z), those profiles are shown in Fig. 13(b), form a partition of unity (β i (z) ≥ 0 and i∈{−1,0,1} β i (z) = 1) and are such that β i (z j ) = δ ij , where z j denotes the elevation of the j-th ring (j ∈ {−1, 0, 1}) and δ ij is a Kronecker delta function.
3) Fine grid FWI reconstruction: In the second step, the computational grid size is [1280,1280,180].The voxel size (0.2 mm) and temporal step size (0.04 µs) is the same as the one in the forward simulation.A low-pass filter with a cutoff frequency of 1.2 MHz is applied to the excitation pulse and measurement data.The height of the field of view is 8 mm that corresponds to a thin slab of high sensitivity to acoustic heterogeneity centered at the imaging plane (cf.Fig. 5).
4) Stopping criteria: The stopping criteria for this multigrid strategy are defined as follows: The iteration with minimum root-mean-square errors (RMSE) with the true object was selected in both the coarse and fine grid reconstructions.
It is important to note that stopping rules based on the minimum RMSE are not suitable for practical applications since the true object is unknown.However, the primary focus of this work is forward (transducer) modeling and not the development of a practical image reconstruction method.This simple stopping criteria was adopted for the purpose of eliminating the choice of more sophisticated stopping rules for the 2D and 3D studies as a potential confounding factor.By minimizing the RMSE, the comparison of 2D and 3D methods could be performed in a fair way in our computer-simulation studies where the true objects were known.The exploration of alternative stopping criteria or various optimization methods that may improve image quality is left for future research.

C. Details on the reference 2D FWI reconstruction
The reference SOS image was reconstructed by 2D FWI using measurement data acquired at ring 0. The transducers were modeled as point-like sources/receivers in the imaging plane.To mitigate the discrepancy between wave propagation in 2D and 3D, an effective excitation pulse was estimated by minimizing the Euclidean norm of the difference between pressure traces measured by a receiver diametrically opposed to the source when 2D and 3D wave propagation through a homogeneous medium (water).The 3D excitation pulse and the estimated effective 2D excitation pulse are shown in Fig. 13(c).
The reference SOS image was obtained by adapting the above FWI multi-grid approach to the 2D case.First, an initial guess is estimated by FWI on a 2D coarser grid with 480×480 pixels, 0.6 mm pixel size, 0.12 µs temporal step size.A low-pass Hann window filter with a cutoff frequency of 0.45 MHz was applied to the excitation pulse and measured pressure traces.The final reconstruction was then performed on a finer grid with 1280×1280 pixels, 0.2 mm pixel size, 0.04 µs temporal step size.A low-pass filter with a cutoff frequency of 1.2 MHz is applied to the excitation pulse and measured pressure traces.The same stopping criteria employed in the 3D method is used in 2D.

D. Computational time
The majority of the reconstruction time is dedicated to the fine-grid reconstruction process.In the case of 3D reconstruction, it takes approximately 2.3 days using a single NVIDIA A100 (40GB memory) GPU on the Delta supercomputer at the University of Illinois Urbana-Champaign.Each iteration of the 3D fine grid FWI requires around 34 minutes, while each iteration of the 3D coarse grid FWI takes about 50 seconds.
For the 2D reconstruction, the process takes approximately 4.5 hours using a single NVIDIA TITAN X Pascal (12 GB memory) GPU.Each iteration of the 2D fine grid FWI takes 120 seconds.On the coarser grid, each iteration takes 16 seconds.
describes the action of the wave equation and has an explicit dependence on the distribution of acoustic properties u(r) = [c(r), α(r), ρ(r), y(r)].

Fig. 1 .
Fig.1.Schematic of a discretized focused transducer.The transducer aperture is approximately a line segment due to the high aspect ratio.The vertical, radial and tangent directions are referred as the Z-, X-and Y-axis, respectively.The radial direction is toward the center of the imaging system.

Fig. 2 .
Fig. 2. (a) The excitation source pulse employed in the validation studies.(b) The profile of the lens thickness d(z) = a 2 z 2 + d 0 with curvature a = 0.014 mm −1 and offset d 0 = 0.45 mm.Note that plot is not to scale and the horizontal axis representing the thickness of the lens has been stretched for visualization purposes.

Fig. 3 .
Fig.3.The maximum pressure amplitude maps in a transverse plane are displayed, using (a) the numerical transducer model and (b) the semi-analytical reference solution.These maps are presented on a decibel scale and computed using the formula 20 log 10max t |p i | max r,t |p i | (i = 1, 2), where p 0 = p 0 (r, t) and p 1 = p 1 (r, t) represent the pressure fields computed by the semi-analytic solution and the proposed method, respectively.The relative difference map of the two solutions is shown in (c).The map is normalized by the maximum value of the on-axis (z = 0 mm) semi-analytic solution (| max t p 0 −max t p 1 | | max t p 0 (r 1 ,t)| ), where r 1 = (x, 0, 0)).The focal region is situated at a distance ranging from 60 to 140 mm from the transducer.(d) Validation of the operator H u s i (p 0 (r 0 ) vs p 1 (r 0 )) and the overall forward operator M τ j H u s i is conducted using acoustic reciprocity (p 0 (r 0 ) vs p 2 (r 0 )).Here, p 0 (r 0 ) and p 1 (r 0 ) denote the pressure measured at a specific location r 0 (x = 71 mm, y = 1 mm, z = 0.1 mm), while p 2 (r 0 ) represents the response of the lens-focused transducer when a point-like source is at location r 0 .The source pulse depicted in Fig.2(a) was employed to compute all pressure fields p 0 , p 1 , and p 2 .

Fig. 4 .
Fig. 4. Schematic of the transducer locations and the point target that was employed in the sensitivity map calculation.The transducer emitter-receiver pair is shown as blue rectangles.The point target: a blurred object with a peak SOS value of 1570 m/s.

Fig. 5 .
Fig.5.A transverse sensitivity map computed by use of (23) for a diametrically opposed emitter-receiver pair.This map depicts the elevationfocusing achieved for the emitter-receiver pair.

Fig. 6 .
Fig. 6.Schematic of the virtual imaging system used in the case study: (a) A 3D rendering of the ring-array USCT system and the object; (b) A transverse plane view of the thin-slab SOS breast phantom of breast type B; and (c) the thin-slab SOS phantom of breast type C.

Fig. 7 .
Fig. 7. Breast type B: Cross-section view of reconstructed SOS distribution by 3D and 2D time-domain FWI at the imaging plane of ring 0. The colorbar is displayed in units of m/s.The rel-RMSE evaluates the relative error between the slice at imaging plane and the true 2D slice and rel-RMSE(c)= ||c−c true || 2 ||c 0 −c true || 2 .The quantities Cover, Cacc denote the object reconstructed by 3D FWI using an overestimated and an accurate transducer model, respectively; C 2d denotes the SOS image estimated by 2D FWI.The yellow bounding box in True object indicates the selected region shown of the bottom row images.The vertical yellow dash line indicates the location of the line profiles presented in Fig. 9.

Fig. 8 .
Fig. 8. Breast type C: Cross-section view of the reconstructed SOS distribution by 3D and 2D time-domain FWI at the imaging plane of ring 0. The colorbar is displayed in units of m/s.

Fig. 9 .
Fig. 9. Line profiles at y = 0 mm for comparing the estimated SOS using different transducer models.(a) Breast phantom type B; (b) type C.

Fig. 11 .
Fig. 11.Illustration of a 2D axisymmetric wave simulation medium.Here, the impedance of the lens is 3 times greater than the impedance of the water medium.

Fig. 12 .
Fig. 12. Maximum pressure amplitude profiles for 2d axisymmetric simulation and semi-analytical integral solution with apodization weight function.Lines along the radial direction were plotted (at z = 0 mm and z = 3 mm).The excitation pulse used in the computation is shown in Fig 2(a).Pressure amplitude profiles are reported in the same unit as the those used in Fig. 3(d).

Fig. 13 .
Fig. 13.(a): Profiles of the clean and noisy data emitted by the 1-th emitter and received by the 512-th receiver, for SNR dB = 22 dB.(b): The profile of weights β i (z) for combining coarser grid reconstructions from three different ring measurements.(c): The 3D excitation pulse and the estimated effective 2D excitation pulse.