Characterization of direct localization algorithms for ultrasound super-resolution imaging in a multi-bubble environment: A numerical and experimental study

Localization plays a significant role in the production of ultrasound localization microscopy images. For instance, detecting more microbubbles reduces the time of acquisition, while localizing them more accurately improves the resolution of the images. Previous approaches to compare the multiple localization algorithms rely on numerical simulation of a single steady microbubble, with or without modeling its nonlinear response. In real-life situations, vessels have a nonconstant velocity profile, which creates relative movement, producing dynamically overlapped microbubbles even at low concentrations. These complexities deteriorate the behavior of the localization algorithms. To incorporate these effects on the characterization of the localization methods, we designed a virtual medium containing four microtubes of different inner diameters, where single-pixel microbubbles were allowed to flow within each microtube with a parabolic velocity profile. A finite difference method was used to simulate the propagation of ultrasound waves to obtain B-mode images that fed four direct microbubbles localization algorithms (i.e., weighted centroid, 2D-spline interpolation, parabolic fitting, and onset detection). The performance of these methods was quantified using the number of microbubbles detected, the microbubbles distribution, the full width at half maximum, the maximum velocity, and the computational time as metrics. Our simulation results suggest that 2D-spline and paraboloid fitting were the best methods, detecting 100% of the microbubbles with an error in their distribution of 249 and 244 microbubbles, respectively. Both methods with a computational time cost of 18% and 7% lower than weighted centroid, respectively. We also present an experimental comparison of these localization methods, finding results similar to the numerical ones.


I. INTRODUCTION
Ultrasound Localization Microscopy (ULM) has been identified as a powerful technique to visualize microvessels and detect blood flow in a noninvasive manner. ULM relies on the localization of individual ultrasound contrast agents or microbubbles (MBs) injected into the bloodstream [1]. This imaging modality has been proposed to help in the diagnosis of health issues [2], such as, cognitive decline related to age [3], early cancer diagnose [4], and early liver fibrosis [5], by analyzing the vascular anatomy, blood velocity, and vessel tortuosity with resolution in the order of ten microns. For this reason, this technique is also known as ultrasound superresolution imaging. Although ULM has been successful, this technique is not mature enough since there are some challenges to overcome. One of these challenges is to select the best method to localize the contrast agents. Previous studies have proposed different localization methods on the radiofrequency data (e.g., centroid, and local maximum) and on the beamformed data (e.g., weighted centroid, onset detection, Gaussian fit, 50% peak detection, and interpolationbased methods) [6]- [9]. To the best of our knowledge, and even considering the variety of available localization methods, no study compares their performance in realistic multi-bubbles scenarios.
Several steps are necessary to produce a ULM image. First, MBs of gas-encapsulated by a lipid shell with a diameter of 1-5 µm [10] are injected into the bloodstream. An ultrasound transducer sonicates the region of interest to produce thousands of ultrasound images. These images hold two sources of information. First, a slowly-changing signal containing mostly the background tissue, and second, a more rapidlychanging signal coming from the moving MBs. Therefore, these two signals are separated to allow proper detection of the MBs, which is usually achieved by employing a Singular-Value-Decomposition (SVD) filter [11]. Subsequently, the MBs are localized, and their positions are recorded. Finally, the MBs' positions are accumulated to draw the vascular network of the region of interest [12] overcoming the diffraction limit.
Localization of MBs is a crucial step for producing ULM images because, along with the resolution grid, they determine the final resolution of the image [1], [13]. Therefore, the localization method is decisive for resolving the vascular map. This crucial stage can be performed in both domains, the radio-frequency or the beamformed image. In the radiofrequency domain, the MBs' echoes appear as hyperbolas, while in the beamformed domain they appear as the point spread function of the imaging system.
In the literature, we distinguish at least two families of localization methods: the first is based on direct localization algorithms and the second is based on inverse problem algorithms. The main difference between them is that the former does not need a model to perform the localization. However, an advantage of the algorithms based on inverse problems is that they allow the localization of MBs with higher concentration in the bloodstream. Therefore, these algorithms relax the constraint of separability between contrast agents, minimizing the number of images needed to reconstruct the whole vascular network of the region of interest [14]- [17].
Among the direct localization methods, the weighted centroid applied over the beamformed data appears as the most popular one due to its simplicity and low computational cost [6], [18], [19]. Other localization methods found in the literature are based on the curve fitting around local maximums in the beamformed data [7]. MBs can also be localized in the radio-frequency domain by fitting a hyperbola [20], [21], which offers the possibility to exclude outliers and phase jumps.
Previous studies compared six localization methods: onset, 50% point, peak detection, centroiding, and two different Gaussian fit [8]. Onset detection has been found to provide the most accurate detection of the MBs' position and the best contrast to noise ratio. Additionally, the authors explained that onset detection is less sensitive to nonlinear ultrasonic response because it detects the beginning of the MB, where harmonics have less influence when compared to the center of the acoustical MB response. However, this comparison was made by using only one microbubble model, which lacks some key features of realistic microbubbles-tissue-wave interaction, such as MBs' superposition, the relative velocity between MBs and reverberations, among others.
In this work, we compared the performance of four localization methods: weighted centroid, 2D spline interpolation, paraboloid fitting, and onset detection. The comparison was performed by using numerical simulations (FullWave solver [22]) over a virtual medium consisting of four vessels with different diameters. To incorporate relative velocity between MBs, a Poiseuille flow was imposed within the vessel. The metrics evaluated were the number of MBs detected, the MBs' distribution, the Full Width at Half Maximum, the computational time, and the maximum velocity of the MBs. This analysis was contrasted with a phantom experiment.

A. SIMULATIONS WITH FULLWAVE SOLVER
A digital 2D heterogeneous sound-speed map was created to resemble a set of four microvessels embedded in soft tissue, with a random distribution of 12 scatters per resolution cell to fully develop acoustic speckles at the background [23] (Fig. 1A). The microvessels were positioned angled 10 • with respect to the horizontal plane and they were set to have the following inner diameters: 105 µm, 276 µm, 381 µm, 574 µm and an the outer diameters of 208 µm, 611 µm, 1079 µm, 936 µm, respectively, from top to bottom. Microubbles (MBs) were allowed to flow within the vessels ( Fig. 1B and the supplementary movie 1). Initially, the MBs were uniformly distributed, with a concentration of 0.5 MBs per resolution cell, on a medium much larger than the region of interest, which was subsequently cropped. This procedure was done to have all the MBs moving independently to each other and to avoid using periodic boundary conditions on the flow. The sound-speeds assigned to each tissue/material are organized in Table 1. This digital 2D map was used as a goldstandard medium to evaluate the localization methods and the data obtained by the gold-standard medium will be referred to as the ground truth.
Numerical simulations were performed by using the Full-Wave solver, a plane wave imaging sequence was propagated into the virtual medium with a central frequency of 7.813 MHz, which is equivalent to a wavelength (λ) of 197 µm. This numerical tool utilizes finite differences in the time domain to solve the wave equation in a heterogeneous and attenuating medium [22]. The FullWave simulation captures wave physics such as refraction, reflection, clutter, diffraction, and multiple scattering. This software has been validated by the biomedical imaging community in many peer review studies such as [23]- [25]. The flow inside a vessel of a living organism is not constant across its transversal section [26]. This fact produces that MBs move with a relative velocity between them within the same vessel [27]. To incorporate this phenomenon, a parabolic velocity profile was imposed into the microtubes obeying the classical fluid dynamics for a tube with constant cross-section [28]. Thus, the local velocity inside the vessels (v(r)) as a function of the distance from their center (r) was where v max is the fluid maximum velocity at the center of the vessels. The velocity of the MBs was imposed by creating 1000 sound-speed maps in which each MB was displaced from frame to frame following this expression. The maximum velocity at each vessel was remained constant at 10 pixels/frame. The grid size of the virtual medium was set to λ/15 in both directions, axially and laterally. A FullWave simulation was prepared to propagate a single plane-wave at a center frequency of 7.813 MHz with a Courant-Friedrichs-Lewy condition (CFL) of 0.4 [29] on each of the 1,000 sound-speed maps. After the radio-frequency data were obtained, conventional delay-andsum beamforming was applied to reconstruct 1,000 B-mode images with a 0.2 λ lateral grid size and 0.066 λ axial grid size (see Fig. 1C).

B. DETECTION AND LOCALIZATION OF MICROBUBBLES
A Singular-Value-Decomposition (SVD) filter [30], [31] was applied over the B-mode movie to separate MBs from background tissue. The cut-off of the filter was determined empirically, eliminating the first ten and the last ten singular values (see Fig. 1D and the supplementary movie 2). To localize the MBs, four methods were tested, and applied over the enveloped-detected and beamformed data after the SVD filter was performed. Before localizing the MBs, we removed the local peaks that have a smaller width compared to a wavelength. The peak removal was done by utilizing a convolution with a λ × λ window. Additionally, a threshold was applied to remove the background noise. To produce the best result for each localization method, the threshold was optimized based on the predicted and the tabulated diameter of the microtube, independently for each localization algorithm. We clarify here that the localization methods found an optimal threshold that depends on the size of the tube. Therefore, the threshold selected was computed as the average of the optimal threshold for each tube. After performing localization, the positions of all MBs in all frames were accumulated to produce a single super-resolution image for each localization method. The localization algorithms tested in this study are explained as follows: Weighted Centroid (WC) is the simplest and the most popular method [6], [18], [19]. After the threshold was applied, the image was binarized, producing clusters of pixels that are probable MBs. Then, the center pixel for each group of eight or more pixels was localized. Finally, we calculated the centroid based on the neighbors' intensities ( Fig. 2A).
In 2D Spline (2DS) interpolation, the local maximum in-VOLUME , tensity pixel was localized with diffraction-limited resolution (yellow X mark on Fig. 2B). Then, by using the equation of a 2D paraboloid, given by f (x, y) = ax 2 + by 2 + cx + dy + f , the constants a, b, c, d and f were exactly determined for each MB using the maximum intensity pixel in addition to the two vertical and the two lateral closest neighbors (red X mark on Fig. 2B). Many recent studies have used spline interpolation [3], [13], [32], showing that it offered a lower Root-Mean-Squared-Error (RMSE) and better precision when compared to the weighted average, bi-linear interpolation, cubic interpolation, and Lanczos interpolation. However, the 2DS showed poorer RMSE and precision when compared to the radial symmetry method [9]. In Paraboloid Fitting (PF), and similar to the previous method, the maximum intensity pixel in the diffractionlimited grid (yellow X mark on Fig. 2C) along with all its eight neighbors (red X marks on Fig. 2C) were used to fit a paraboloid, given by f (x, y) = ax 2 +by 2 +cx+dy+exy+f . In contrast to the previous method, an exact determination of the constants was not performed. In this case, a fit was done by using the minimum squares method. Thus, the constants obtained a, b, c, d, e, and f were the ones that best represent the pixel around the maximum of each MB.
By using the Onset Detection (OD) method, the pixel with the highest intensity in the diffraction-limited grid (yellow X mark on Fig. 2D) was found first. Then, cubic interpolation was used to find the coordinate of the first value that matched 30% of the maximum intensity in the central line. This process was repeated to find the same value on each adjacent column. Lately, a parabola was fit to compute its central position. This algorithm was adapted from the onset detection method described in [8]. In this study, instead of using 30% of the maximum intensity peak, they used three times the standard deviation above the noise level. The aim was to have a more accurate MB localization due to variable MB signal appearances caused by different sizes, MB resonances, and partial volume effects. Conversely, we have found that three times the standard deviation was outside the image in most cases, so we used 30% of the peak intensity to define the onset.

C. TRACKING AND VELOCITY ESTIMATION OF MBS
After localizing the MBs in all frames, their coordinates were compared to find the closest neighbor between consecutive frames [12]. This tracking allows the estimation of the velocity of each MB in pixel/frame. The velocity data were converted into a 2D image by averaging the velocity of all the MBs at every frame. Then, the velocity profile across the microtubes was computed by averaging along the corresponding microtube.

D. EXPERIMENTAL SET-UP
A syringe pump (Pump 11 Elite, Harvard Apparatus) was used to set a flow rate of 1.5 mL/min through a microtube (Intramedic T M PE Tubing, Fisher Scientific) with 280 µm of inner diameter. In-lab-made lipid-encapsulated MB contrast agents [33] were diluted to 90% in Phosphate-Buffered Solution (PBS) and infused through the microtube (see the setup in Fig. 3A). Ultrasound images were acquired by using a Verasonics Vantage 128 scanner using a transducer Philips ATL 12-5 narrow, operating at a center frequency of 7.8 MHz. The scanner was programmed to emit a single plane wave sequence and to acquire 1,500 frames at 100 frames per second. The beamformed image was produced, as one can see in Fig. 3B. Note that only the microtubes' walls are visible, with an exaggerated size compared to the true size of the tube. The SVD filter was applied over the B-mode images to remove the stationary part of the signal and detect the MBs. This configuration allowed the detection of five MBs per frame, approximately (see Fig. 3C).

E. METRICS
Five metrics were introduced to compare the localization methods with the ground truth: number of MBs detected, MBs' distribution error, error in the Full Width at Half Maximum (FWHM), maximum fluid velocity, and computational time.
The average number of MBs detected per frame was chosen as a metric. The more MBs detected, the fewer frames needed to reconstruct the vascular network of the region of interest. However, this metric is not sufficient to quantify the performance of a localization method, which can present two different sources of error, one related to false-positive cases and one related to false negative cases. Therefore, in addition to the average number of detected MBs, the MBs' distribution error was used as a metric to compare with the goldstandard distribution. The MBs' distribution was computed by summing all MBs in the super-resolution image along the tube direction, and it will depict the probability distribution of the number of MBs detected across the width of the tube. A RMSE was computed by comparing the probability distribution that resulted from the super-resolution images produced by each localization method with the one produced by the ground truth.
The error on the FWHM was computed, comparing each method to the ground truth. Experimentally, it compares the FWHM with the tabulated size of the tube. Note that four microtubes were presented in the simulation. Therefore, the average FWHM error reported corresponds to the average of these four microtubes. However, only one microtube was present in the experimental phantom. The velocity profile was calculated, as commented in section C, and the maximum velocity of each microtube was used as a metric. Finally, the computational time spent per frame to localize the MBs was measured for each method. All the processing presented was done using a LENOVO ThinkPad workstation having an Intel Core I7 with 32 GB of RAM. Fig. 4A shows super-resolution images created from the accumulation of the centers of detected MBs for the Weighted Centroid (WC), 2D Spline (2DS), Paraboloid Fitting (PF), Onset Detection(OD), and the ground truth, respectively. The color bar indicates the number of MBs detected in each pixel. WC method detects significantly fewer MBs when compared to the other methods, as visible from a much darker image. The super-resolution images obtained from 2DS and PF methods look very similar to each other, and they compare well to the ground truth. OD detects fewer MBs than PF and 2DS but more than WC.

A. NUMERICAL RESULTS
To analyze this information quantitatively, the microbubbles (MBs) were summed along the vessel direction, as indicated by the yellow arrow in Fig. 4A, to produce the distribution of MBs across the cross-section of the tube (Fig.  4B). Surprisingly, this result indicates that the wider the microtube is, the fewer MBs are detected for the WC method. VOLUME , This effect can be seen in the blue curve decreasing in amplitude for wider microtubes. On the other hand, the 2DS and PF produced a similar distribution of MBs compared to the ground truth. Finally, OD makes evident substantial differences in the MB distributions since it detects more MBs at the upper edge of the microtube than at the bottom wall. The fluctuations in the MB distribution of the ground truth are produced by the low number of MBs initially placed at the first frame of the simulation (about 188 in the smallest tube). In subsequent frames, the localized MBs are the same ones as in the first frame but displaced. This fact produces a nonindependent distribution of MBs between the first frame and the others, propagating the fluctuation to the total distribution of MBs shown in Fig. 4B. Fig. 4C depicts the velocity maps calculated by using the localizations obtained by each method. The color bar indicates the velocity in pixel/frame. Note that, as imposed, the speed at the center was detected larger than at the edges of the tube. As expected, these images show a similar microtube diameter compared to the ones in Fig. 4A, which is corroborated by the velocity profiles shown in Fig. 4D for all microtubes. Interestingly, the velocity profiles show a great similarity among all the methods, especially considering the three smallest microtubes, detecting a maximum velocity of about 9 pixels/frame. In the biggest microtube, however, the OD underestimates the velocity even more than in the other ones, with a maximum velocity of fewer than 8 pixels/frame. Even though the methods have similar results in the velocity estimation, they still have a difference compared to the ground truth, mainly due to overlapped MBs. This finding is discussed in the next section. Although the WC was the method that detected the fewest MBs, it produced a good This poor image quality comes from the fact that WC neglects slightly overlapped MBs. In contrast, the 2DS produces a much better superresolution image, which depicts microtubes with a RMSE in the distribution of 249 MBs, and a FWHM error of 12%. Similarly, PF depicts the microtubes slightly clearer with a MBs' distribution error of 244 MBs and a FWHM error of 11%. OD shows a RMSE in the distribution of 390 MBs. This high error occurs because OD tends to distort the shape of the distribution of MBs, populating the upper edge of the tube more and producing a higher FWHM error of 27%. Surprisingly, 2DS was found to be the fastest method (8.0 ms/frame) regarding the computational time cost, being even faster than WC (9.8 ms/frame). On the contrary, OD was the slowest method (17.1 ms/frame), approximately twice slower than 2DS. Regarding PF, this method presents a time cost of 9.1 ms/frame. Finally, the maximum velocity of the ground truth was set to 10 pixels/frame. However, the WC, 2DS, PF, and OD detected 9.0, 9.0, 9.0, and 8.5 pixels/frame, respectively.

B. EXPERIMENTAL RESULTS
To validate our simulation results, a microtube phantom experiment was performed. Fig 6 shows the super-resolution image of the microtube phantom made with each method. Contrary to the numerical case, in the experiment there is no access to the real MBs' distribution and the real number of MBs inside the tube. In order to produce a fair comparison between the localization methods, the initial threshold for each method was optimized to match the FWHM of the tube with the tabulated inner diameter of the microtube of 280 µm.
Each method found 4.0, 5.4, 4.7, and 5.0 number of detected MBs per frame for WC, 2DS, PF and OD, respectively (Fig. 7A), which agrees with the simulation result (Fig.  5A) where the WC detected the lowest number of MBs. Note that there is a factor in the order of 50 between the detected MBs in the simulation and in the experiment, which is explained by the presence of only one microtube in the experiment and by the difference in the MBs' concentration. The FWHM error was 7.9% for the WC, 2.9% for 2DS, 1.1% for PF, and 3.9% for OD (Fig. 7B), which agrees with the numerical result again (Fig. 5C), where WC and OD present the highest FWHM errors. A difference between simulation and experimental results is that PF shows a smaller error when compared to 2DS for the experimental part. In general, the experimental part produces an error on the FWHM significantly smaller when compared to the numerical analogous because only one microtube is used in the experimental phase. We found the maximum velocity of the MBs to be 343 mm/s, 346 mm/s, 333 mm/s, and 375 mm/s for WC, 2DS, PF, and OD, respectively while the ground truth was 405 mm/s (Fig. 7C). This result is comparable to the one obtained in the numerical scenario (Fig. 5D), and in both cases the estimated velocity is similar among the tested localization algorithms. Finally, the computational time cost was comparable with the localization methods: 5.0 ms for WC, 4.2 ms for 2DS and PF, and 4.3 ms for OD. Fig. 7E shows the MBs' distribution across the width of the tube for each method. Note that 2DS, PF, and OD have almost the same behavior, while WC detected fewer MBs in the center of the tube.
Onset detection was previously mentioned as one of the most accurate methods to localize MBs because it reduces the influence of the nonlinear response of MBs [8]. This conclusion was reached by considering the behavior of a single steady MB. However, the results presented here showed that onset detection localized the MBs' positions accurately only when the correct threshold to remove the background noise was given. Furthermore, the election of this threshold is not trivial, and for the case of onset detection and Weighted Centroid it depends on the size of the microtube of interest, making these two methods less robust. In contrast, 2D Spline interpolation and Paraboloid Fitting work well with a threshold that is similar for all the microtubes considered in the simulation. Fig. 8A shows some common localization mistakes regarding Weighted Centroid (i.e., blue square) and Onset Detection (i.e., magenta circle). A total of nine MBs in this image were found. They were localized with a yellow dashed circle to fa-  Fig. 4B. It is worth noting that 2DS and PF detected eight of the nine MBs in the correct position, showing only one problem when the overlap is near to 50%. When there is no overlap, such as in the right part of the image, no problems were found for any of the localization methods. However, this phenomenon occurs only when the MB concentration is very low, eventually slowing down the total time of image acquisition.
Regarding the MBs' tracking and its velocity estimation, the four methods measured the imposed parabolic flow profile of the microtubes (Fig. 4D) using the nearest neighbor approach. Although all the localization methods produce an error of at least 10% in the peak velocity, Onset Detection mistook the peak velocity even more, mainly when using the largest microtube. This underestimation occurs due to overlapped MBs and it is consistent with the distortion of the MBs' distribution profile produced by Onset Detection, making this method more susceptible to velocity errors.
In this context, we have decreased the concentration of MBs by half and obtained an improvement in the maximum velocity error of 5% with WC, 6% with 2DS, 6% with PF, and 12% with OD. Furthermore, when doubling the concentration, a deterioration in the velocity result of 8% VOLUME , with WC, 2% with 2DS, 2% with PF, and 3% with OD (see Fig. 8B) was found. To validate the correct nearest neighbor implementation, a single MB simulation was performed, imposing a velocity of 10 pixels/frame. As seen in Fig. 8C, all the methods measured the correct maximum velocity.
In conclusion, 2D Spline and Paraboloid Fitting were the most robust methods regarding a medium with vessels of different diameters and with relative motion between MBs. Both methods were able to detect 100% of the MBs with an error in the MBs' distribution of 249 and 244 MBs for 2DS and PF, respectively. For these two methods, the error in the prediction of the size of the tube was in the order of 10% with a computational time that is 18% lower for the case of 2DS and 7% lower for the PF case when compared to the WC method. Although parameters such as MBs' concentration of the experiments and simulation did not match, both showed consistent observations regarding the localization methods. A major drawback in previous studies is the lack of a goldstandard to compare their proposed methods. Here, we have presented a methodology to use FullWave solver to simulate the interaction of ultrasound waves with MBs in a relatively realistic medium. This research potentially allows to study or to optimize other stages in the production of ultrasound super-resolution images. In further work, we aim to increase the complexity of the medium, with different MBs' concentrations and several vessels that have different diameters and bifurcations.
DR. ALINE XAVIER completed her Bachelor's degree in biomedical engineering (2013) and her Master in energy and nuclear technologies (2015), both at the Universidade Federal de Pernambuco, Brasil. She received the Ph.D. degree in engineering (2020) from the Pontificia Universidad Catolica de Chile, Chile. Currently, she is a Post-Doc researcher at the Instituto de Ciencias de la Ingeniería at the Universidad de O'Higgins, Chile. Her research interest are medical physics, medical imaging, magnetic resonance imaging/spectroscopy and mainly, superresolution imaging using ultrasound.
DR. HÉCTOR ALARCÓN obtained his Ph.D. in experimental physics at Universidad de Santiago de Chile in 2012, where he studied the mechanics behind fractures of cohesive granular material. After that he pursued post-doctoral research at Université Paris Sud XI in Paris, France, studying the nonlinear amplification of friction in paper assemblies, then he came back to work at Universidad de Chile in Santiago, Chile, where he studied the current formation in vibrated fluids. He is now currently conducting a postdoctoral research at Universidad de O'Higgins in Rancagua, Chile, performing ultrasound measurements with medical applications.
DR. DAVID ESPÍNDOLA was born in Chillán, Chile, in 1986. He received the Ph.D. degree in physics from the Universidad de Santiago de Chile, Santiago, Chile, in 2012. As part of his Ph.D. dissertation, he studied the interaction waveparticle in granular materials. He pursued postdoctoral research at the Institut d'Alembert, Sorbonne Universités, Paris, France, where he started conducting research on medical ultrasound. He also held a post-doctoral position with The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA, where he was a Research Assistant Professor until 2019. He is currently an Associate Professor at the Instituto de Ciencias de la Ingeniería at the Universidad de O'Higgins, Rancagua, Chile. His current research interests are the linear and nonlinear shear wave propagation in soft tissue, the design of novel ultrasound sequences for the development of quantitative images, and the improvement of vascular imaging using ultrasound super-resolution techniques. VOLUME ,