Combined Ultrasound and Photoacoustic Image Guidance of Spinal Pedicle Cannulation Demonstrated With Intact Ex Vivo Specimens

Objective: Spinal fusion surgeries require accurate placement of pedicle screws in anatomic corridors without breaching bone boundaries. We are developing a combined ultrasound and photoacoustic image guidance system to avoid pedicle screw misplacement and accidental bone breaches, which can lead to nerve damage. Methods: Pedicle cannulation was performed on a human cadaver, with co-registered photoacoustic and ultrasound images acquired at various time points during the procedure. Bony landmarks obtained from coherence-based ultrasound images of lumbar vertebrae were registered to post-operative CT images. Registration methods were additionally tested on an ex vivo caprine vertebra. Results: Locally weighted short-lag spatial coherence (LW-SLSC) ultrasound imaging enhanced the visualization of bony structures with generalized contrast-to-noise ratios (gCNRs) of 0.99 and 0.98–1.00 in the caprine and human vertebrae, respectively. Short-lag spatial coherence (SLSC) and amplitude-based delay-and-sum (DAS) ultrasound imaging generally produced lower gCNRs of 0.98 and 0.84, respectively, in the caprine vertebra and 0.84–0.93 and 0.34–0.99, respectively, in the human vertebrae. The mean ± standard deviation of the area of −6 dB contours created from DAS photoacoustic images acquired with an optical fiber inserted in prepared pedicle holes (i.e., fiber surrounded by cancellous bone) and holes created after intentional breaches (i.e., fiber exposed to cortical bone) was 10.06 ± 5.22 mm² and 2.47 ± 0.96 mm², respectively (p < 0.01). Conclusions: Coherence-based LW-SLSC and SLSC beamforming improved visualization of bony anatomical landmarks for ultrasound-to-CT registration, while amplitude-based DAS beamforming successfully distinguished photoacoustic signals within the pedicle from less desirable signals characteristic of impending bone breaches.
Significance: These results are promising to improve visual registration of ultrasound and photoacoustic images with CT images, as well as to assist surgeons with identifying and avoiding impending bone breaches during pedicle cannulation in spinal fusion surgeries.

photoacoustic image. Applications of photoacoustic imaging to surgical guidance include visualization of tool tips such as a neurosurgical drill tip [22], a needle tip [23]-[26], or a cardiac catheter tip [27], visualization of underlying structures such as blood vessels [28], and photoacoustic-based guidance during a range of surgeries, such as fetal surgeries [29], endonasal surgeries [30]-[32], hysterectomy procedures [33], [34], and prostate surgeries [35]. The incorporation of robotics with teleoperated surgery [33] and robotic visual servoing [36], [37] has also been demonstrated. Applications related to the spine include stem cell injection guidance [38] and discrimination of cortical bone from cancellous bone to identify optimal insertion points prior to initiating pedicle screw placement [21]. Despite these remarkable advances, no previous studies have investigated the accuracy of photoacoustic signal visualization and localization within the pedicle of a vertebra.

This paper investigates two hypotheses. First, based on previous studies that visualized photoacoustic signals from the surface of human vertebrae [21], we hypothesize that similar visibility can be achieved beneath the bony structure in a more realistic setup that is closer to the surgical environment of a spinal fusion surgery. Second, we hypothesize that improvements to 2D ultrasound imaging would reduce the computational burden associated with requiring 3D ultrasound images to complete the segmentation task for ultrasound-to-CT registration. To address the poor 2D ultrasound segmentation that otherwise compromises the performance of ultrasound-to-CT registration, we propose a novel coherence-based beamforming technique named locally weighted short-lag spatial coherence (LW-SLSC) beamforming. LW-SLSC beamforming is a regularized version of short-lag spatial coherence (SLSC) beamforming [39], designed to minimize the trade-off between contrast and spatial resolution. Therefore, LW-SLSC has the potential to enhance the vertebral boundaries adjacent to soft tissue when compared to conventional delay-and-sum (DAS) beamforming, as previously demonstrated in an ex vivo caprine vertebra [40].
Our hypotheses were tested with ex vivo caprine and human vertebrae. First, the segmentation enhancement achieved with LW-SLSC beamforming was compared to that obtained from SLSC beamforming and conventional DAS beamforming in an ex vivo caprine vertebra. Then, we demonstrated the visualization of photoacoustic signals originating from inside the lumbar vertebrae located inside a human cadaver during pedicle hole creation, using the same methods implemented during spinal fusion surgeries. Validation of the photoacoustic signal locations was based on manual registration of postoperative CT volumes to co-registered ultrasound and photoacoustic images. This registration relied on identified landmarks within segmented ultrasound images that were enhanced with LW-SLSC beamforming. Finally, we successfully differentiated photoacoustic signals originating from cancellous and cortical bone inside the human cadaver by measuring the areas of −6 dB contours of DAS photoacoustic images. This paper is organized as follows. Section II details our acquisition, beamforming, segmentation, and registration methods. Section III presents our experimental results. Section IV discusses insights from the experimental results. Section V summarizes our conclusions.

1) Short-Lag Spatial Coherence:
Unlike the conventional amplitude-based DAS beamformer, SLSC beamforming [39] displays the similarity of received signals in the aperture domain as a function of element separation $m$. A received time-delayed sample is represented as $s_i(n)$, where $i$ is the channel index and $n$ is the depth index in a zero-mean radio frequency signal $s_i$. First, the coherence function $R(m)$ is calculated using an axial kernel as follows:

$$R(m) = \frac{1}{N-m} \sum_{i=1}^{N-m} \frac{\sum_{n=n_1}^{n_2} s_i(n)\, s_{i+m}(n)}{\sqrt{\sum_{n=n_1}^{n_2} s_i^2(n) \sum_{n=n_1}^{n_2} s_{i+m}^2(n)}} \quad (1)$$

where $N$ is the number of elements in the aperture, and $n_1$ and $n_2$ are the limits of the axial kernel $k$ in units of sample number. Then, an SLSC image is generated as the integral of the spatial coherence function over the first $M$ lags:

$$\mathrm{SLSC} = \int_{1}^{M} R(m)\, dm \approx \sum_{m=1}^{M} R(m) \quad (2)$$

2) Locally Weighted Short-Lag Spatial Coherence:
Enhancement of bone boundaries can be achieved by implementing a regularized version of the SLSC beamformer [40]. Instead of averaging the cumulative sum up to a lag value $M$ (out of a preselected total of $N_L$ lags, where $M < N_L$), LW-SLSC beamforming computes the weighted coefficients for $N_L$ lags by minimizing the total variation (TV) of the weighted sum within a moving kernel $R_i \in \mathbb{R}^{k_z \times k_x \times N_L}$ obtained from the correlation matrix $R \in \mathbb{R}^{N_z \times N_x \times N_L}$. In order to preserve the high resolution information available at higher lags (i.e., $M > 15$), this adaptive solution was regularized using the L2-norm with a gradient operator. Then, the TV minimization was defined as:

$$\text{subject to: } \|w_i\|_1 = 1, \quad 0 \le w_i \le 1 \quad (3)$$

where TV is the 2D total variation with the L2-norm applied to the cost function $f$, $R_i$ is the kernel $i$ of the correlation matrix $R$, and $w_i \in \mathbb{R}^{1 \times N_L}$ is the optimized weight vector for the calculated summed lags of $R_i$. The weighted sum kernels $w_i R_i$ were stacked into multiple layers and positioned relative to the center of each $R_i$. The LW-SLSC image was the median of the stacked kernels $w_i R_i$.
The main advantage of LW-SLSC lies in the adaptive selection of lower lags in kernels surrounding isoechoic regions, which enhances contrast, and higher lags otherwise, which enhances resolution. The selective combination of higher and lower lags is known to reduce the noise commonly observed in SLSC images created with higher lags [41].
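As a rough illustration of the SLSC computation described in subsection 1), the following sketch evaluates the normalized coherence $R(m)$ for one axial kernel and sums it over the first $M$ lags. It assumes the standard form of the SLSC coherence function from [39]; the function names, the toy channel data, and the handling of all-zero kernels are illustrative choices and not part of the original implementation.

```python
import math

def coherence(channels, m, n1, n2):
    """R(m): mean normalized correlation of channel pairs separated by m elements,
    computed over the axial kernel of samples n1..n2 (inclusive)."""
    n_elements = len(channels)
    total = 0.0
    for i in range(n_elements - m):
        a = channels[i][n1:n2 + 1]
        b = channels[i + m][n1:n2 + 1]
        numerator = sum(x * y for x, y in zip(a, b))
        denominator = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        # Returning 0 for an all-zero kernel is an assumption of this sketch.
        total += numerator / denominator if denominator > 0 else 0.0
    return total / (n_elements - m)

def slsc_pixel(channels, max_lag, n1, n2):
    """Discrete approximation of the integral of R(m) over lags 1..max_lag."""
    return sum(coherence(channels, m, n1, n2) for m in range(1, max_lag + 1))

# Toy delayed channel data: 6 elements, 4 axial samples each (hypothetical values).
signal = [1.0, -2.0, 0.5, 3.0]
coherent = [list(signal) for _ in range(6)]                        # identical on every element
alternating = [[(-1) ** i * s for s in signal] for i in range(6)]  # sign flips per element

print(slsc_pixel(coherent, 3, 0, 3))    # fully coherent: each lag contributes 1
print(coherence(alternating, 1, 0, 3))  # neighbors are anti-correlated at lag 1
```

With perfectly coherent channel data every lag contributes a coherence of 1, so the short-lag sum grows with $M$, whereas incoherent or anti-correlated data contribute small or negative values.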
The original formulation in (3) can be simplified using the framework of Barbero et al. [42] for total variation alternatives. However, these simplifications only hold when computing TV with the L2-norm. The gradient operator ∇ (i.e., 1D TV operator) used in the penalty term is simplified to: Similarly, the two dimensional TV operator used in the fidelity term is reduced to: Reshaping R into the form $R_i \in \mathbb{R}^{k_z k_x \times N_L}$ and using (4) and (5) in (3) results in the following expression:

$$w = \arg\min_{w} \|B R_i w\|_2^2 + \alpha^2 \|D w\|_2^2 \quad (6)$$

$$\phantom{w} = \arg\min_{w} w^T H w, \quad H = (B R_i)^T (B R_i) + \alpha^2 D^T D \quad (7)$$

The reduction presented in (7) has several advantages over (3). First, by assuming the kernel size is constant during the LW-SLSC computation, the term $\alpha^2 D^T D$ is independent of the kernel $R_i$ and thus can be computed only once. Second, matrix operations in $(B R_i)^T (B R_i)$ can be parallelized using built-in libraries for computational speedup, where the matrix $B$ is precomputed. Finally, the Hessian $H$ allows quadratic programming using Newton step optimizers instead of the conventional gradient descent, featuring faster convergence rates.
In this study, the primal-dual interior point method [43] is used for estimating the solution of (7).
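The equivalence between the sum of squared norms in (6) and the quadratic form in (7) can be checked numerically. The sketch below is only a consistency check under stated assumptions: $D$ is taken to be a forward first-difference matrix, and a first-difference matrix is also used as a stand-in for $B$, because the exact operators are not reproduced here. All names and toy values are hypothetical, and no constrained solver (such as the primal-dual interior point method) is implemented.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def first_difference(n):
    """(n-1) x n forward-difference matrix (assumed form of the operators)."""
    return [[-1.0 if j == i else 1.0 if j == i + 1 else 0.0 for j in range(n)]
            for i in range(n - 1)]

def hessian(B, R_i, D, alpha):
    """H = (B R_i)^T (B R_i) + alpha^2 D^T D."""
    BR = matmul(B, R_i)
    fidelity = matmul(transpose(BR), BR)
    penalty = matmul(transpose(D), D)  # independent of R_i, so it can be cached
    n = len(penalty)
    return [[fidelity[i][j] + alpha ** 2 * penalty[i][j] for j in range(n)]
            for i in range(n)]

def quadratic_form(H, w):
    return sum(w[i] * H[i][j] * w[j] for i in range(len(w)) for j in range(len(w)))

def squared_norm(A, w):
    return sum(sum(a * x for a, x in zip(row, w)) ** 2 for row in A)

# Toy reshaped kernel: 4 pixels (k_z * k_x) by 3 lags (N_L); values are made up.
R_i = [[1.0, 0.8, 0.5], [0.9, 0.7, 0.4], [0.2, 0.1, 0.0], [0.3, 0.2, 0.1]]
B = first_difference(4)   # stand-in for the fidelity operator
D = first_difference(3)   # assumed penalty operator
alpha = 1.0
w = [0.5, 0.3, 0.2]       # satisfies the constraints in (3): sums to 1, entries in [0, 1]

H = hessian(B, R_i, D, alpha)
lhs = squared_norm(matmul(B, R_i), w) + alpha ** 2 * squared_norm(D, w)
print(lhs, quadratic_form(H, w))  # the two values agree up to rounding
```

Because $H$ is symmetric and positive semidefinite by construction, the constrained problem is a convex quadratic program, which is what makes Newton-type solvers applicable.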

B. Segmentation of an Ex Vivo Caprine Vertebra
The segmentation accuracy of bony structures achieved with DAS, SLSC, and LW-SLSC was tested on an ex vivo caprine thoracic vertebra (with surrounding tissue intact). This vertebra was imaged with an L3-8 linear array ultrasound probe connected to an Alpinion E-CUBE 12R ultrasound system (Alpinion, Seoul, South Korea), as shown in Fig Bone boundaries from DAS, SLSC, and LW-SLSC images were computed by applying a binary threshold of 50% of the maximum pixel amplitude and selecting the closest contour to the vertebral foramen. These boundaries were registered with manually selected horizontal slices from volumetric 3D CT data. The registration used Mattes Mutual Information as the similarity metric [44], with One Plus One evolutionary optimization as the heuristic optimizer [45].

C. Vertebral Imaging of a Human Cadaver

1) Specimen and Surgery Details:
An adult male human cadaver was placed in the prone position, and dissection was carried out along the cranio-caudal axis with the aid of a Cobb elevator to reveal the spinous process, lamina, and facet joints at each level from L1 to S1. The specimen had no reports of spine pathologies, malformations, or previous spinal surgeries, which was also confirmed with pre-operative CT imaging. The pedicles were cannulated bilaterally from L2 through L4 along anatomic trajectories using a standard freehand technique with a pedicle probe. Intentional medial and lateral breaches were made in some of the pedicle cannulation attempts. The total depth of the pedicle tracts from the bone surface ranged from 14 mm to 25 mm, as measured with the ruler on the pedicle probe.

2) Data Acquisition:
Fig. 2 shows the acquisition setup for ultrasound and photoacoustic data from the human lumbar vertebrae. A 1-mm diameter optical fiber was inserted to touch the bottom of the pedicle hole. The optical fiber was used to transmit 750 nm wavelength laser light from a Phocus Mobile laser (Opotek Inc., Carlsbad, CA, USA) with an energy of 13.4 mJ at the fiber tip. Photoacoustic signals were received by an SC1-6 convex array ultrasound probe connected to an Alpinion E-CUBE 12R ultrasound system. The probe was positioned in an oblique axis across several lumbar laminae. Enhanced real-time visualization of photoacoustic signals was achieved with a GPU implementation of SLSC [37], [46], [47] for a convex array. This photoacoustic beamforming method was chosen because it was the best real-time imaging option available to assist the surgeon with fiber tip localization during the surgery.

3) Ultrasound and Photoacoustic Imaging:
Ultrasound and photoacoustic radiofrequency data were acquired up to a depth of 70 mm, with a focal depth of 25 mm for the ultrasound data. No frame averaging was applied in order to avoid the blurring artifacts that would hinder the performance of ultrasound-to-CT registration. SLSC ultrasound images were computed with $M = 5$ and 1λ axial kernel length, whereas LW-SLSC ultrasound images were computed with $N_L = 15$, a 2.0 mm (lateral) × 3.1 mm (axial) kernel, 60% overlap, and α = 1. Similarly, SLSC photoacoustic images were computed with $M = 15$ and 1λ axial kernel length, whereas LW-SLSC photoacoustic images were computed with $N_L = 25$, a 2.0 mm (lateral) × 3.1 mm (axial) kernel, 60% overlap, and α = 1.

4) Ultrasound and Photoacoustic Segmentation:
The segmentation of bony structures and their respective centers of mass were measured from DAS, SLSC, and LW-SLSC ultrasound images, whereas the segmentation of the tip of the optical fiber and its respective center of mass was measured from DAS, SLSC, and LW-SLSC photoacoustic images. Note that the fiber tip was in contact with bone during each image acquisition, thus the fiber tip segmentation was considered to be representative of a bony landmark within the created hole. To achieve the ultrasound and photoacoustic segmentations, binary masks were computed with 30% maximum pixel amplitude threshold. Then, the removal of isolated pixels was achieved with morphological opening with a structuring element size of 0.38 mm × 0.38 mm, whereas small holes in the bony masks were filled with morphological closing with a structuring element size of 0.63 mm × 0.63 mm. Ultrasound and photoacoustic images were filtered with the computed mask and further segmented into separated bony structures through a connected component routine. For each component, the center of mass was calculated based on the position of pixels and the amplitude of the ultrasound or photoacoustic image, which was normalized over the maximum amplitude of each component.
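The thresholding, connected-component, and center-of-mass steps described above can be sketched as follows. This is a minimal pure-Python illustration, not the authors' implementation: it assumes a non-negative envelope image, uses 4-connectivity (the connectivity is not stated in the text), and omits the morphological opening and closing steps for brevity. The function name and the toy image are hypothetical.

```python
from collections import deque

def segment_centers(image, fraction=0.30):
    """Threshold at a fraction of the maximum amplitude, label 4-connected
    components, and return the amplitude-weighted center of mass (row, col)
    of each component in raster-scan order."""
    peak = max(max(row) for row in image)
    if peak <= 0:
        return []  # nothing to segment in an empty image
    rows, cols = len(image), len(image[0])
    mask = [[image[r][c] >= fraction * peak for c in range(cols)] for r in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    centers = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first flood fill collects one connected component.
            queue, pixels = deque([(r0, c0)]), []
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                pixels.append((r, c))
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and mask[rr][cc] and not seen[rr][cc]:
                        seen[rr][cc] = True
                        queue.append((rr, cc))
            # Normalizing by the component maximum would not change the centroid,
            # so raw amplitudes are used as weights.
            weight = sum(image[r][c] for r, c in pixels)
            centers.append((sum(r * image[r][c] for r, c in pixels) / weight,
                            sum(c * image[r][c] for r, c in pixels) / weight))
    return centers

# Toy envelope image with two separate bright structures (made-up values).
image = [
    [0, 0, 0, 0, 0],
    [0, 10, 10, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 4, 8],
]
centers = segment_centers(image)
print(centers)
```

Each returned center of mass plays the role of one candidate landmark; the second component in the toy image is pulled toward its brighter pixel because the centroid is amplitude-weighted.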

5) Landmark Registration:
Pre-operative and post-operative CT volumes (512 × 512 × 192 samples) of the human cadaver were acquired with an O-arm O2 (Medtronic, Minnesota, USA) using 140 kV-peak and 0.78 × 0.78 × 0.83 mm³ voxel resolution. The CT volumes were optimized for bone visualization by adjusting the window level to 2000 Hounsfield units (HU) and the window width to 2000 HU. Centers of mass calculated from both ultrasound and photoacoustic images were used as fiducial markers for landmark registration, which was conducted with 3D Slicer [48]. The corresponding fiducial markers in the CT volume were manually placed to match bony contours in the registered CT slices to those in the ultrasound images. The registered CT volume was displayed in X-Z and Y-Z views, where X, Y, and Z represent the lateral, elevation, and axial dimensions of the ultrasound probe.
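To illustrate what a landmark-based registration computes from paired fiducial markers, the sketch below fits a 2D rigid transform (rotation plus translation) in the least-squares sense. This is a simplified stand-in and not the 3D Slicer workflow used in the study, which involved manually placed fiducials in a 3D volume; the function names and point coordinates are hypothetical.

```python
import math

def fit_rigid_2d(source, target):
    """Least-squares rotation angle and translation mapping source onto target."""
    n = len(source)
    sx = sum(p[0] for p in source) / n
    sy = sum(p[1] for p in source) / n
    tx = sum(q[0] for q in target) / n
    ty = sum(q[1] for q in target) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(source, target):
        ax, ay, bx, by = px - sx, py - sy, qx - tx, qy - ty
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)  # optimal rotation for centered point sets
    c, s = math.cos(theta), math.sin(theta)
    shift = (tx - (c * sx - s * sy), ty - (s * sx + c * sy))
    return theta, shift

def apply_rigid(theta, shift, point):
    c, s = math.cos(theta), math.sin(theta)
    return (c * point[0] - s * point[1] + shift[0],
            s * point[0] + c * point[1] + shift[1])

def fiducial_rmse(theta, shift, source, target):
    """Root-mean-square distance between transformed source and target points."""
    total = 0.0
    for p, q in zip(source, target):
        x, y = apply_rigid(theta, shift, p)
        total += (x - q[0]) ** 2 + (y - q[1]) ** 2
    return math.sqrt(total / len(source))

# Hypothetical ultrasound landmarks (mm) and the same landmarks in CT coordinates.
us_points = [(0.0, 0.0), (10.0, 0.0), (10.0, 5.0), (2.0, 8.0)]
true_theta, true_shift = math.radians(30.0), (5.0, -2.0)
ct_points = [apply_rigid(true_theta, true_shift, p) for p in us_points]

theta, shift = fit_rigid_2d(us_points, ct_points)
print(math.degrees(theta), shift, fiducial_rmse(theta, shift, us_points, ct_points))
```

With noise-free correspondences the fit recovers the transform exactly; with real landmarks, the residual error gives a simple measure of how well the fiducials agree.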

6) Cancellous Vs. Cortical Bone Differentiation:
Photoacoustic imaging was used to differentiate signals originating from cortical and cancellous bone. Photoacoustic signals were acquired when the tip of the optical fiber was touching either cancellous bone, after being placed within a correctly created pedicle hole, or the cortical bone walls surrounding the pedicle, after an intentional medial or lateral breach was created. Medial and lateral breaches in the cortical bone were confirmed with the CT volume described in Section II-C5.
SNR was calculated to determine which signals would be included in the analysis of bone differentiation, using the equation:

$$\mathrm{SNR} = \frac{\mu_t}{\sigma_b}$$

where $\mu_t$ and $\sigma_b$ are the mean and standard deviation of signals within photoacoustic target and background regions of interest (ROIs), respectively, prior to log-compression. To identify appropriate target ROIs, LW-SLSC images were used to estimate the center of the photoacoustic targets (which was challenging with DAS photoacoustic images because of the diffuse patterns observed in some cases [21]). Then, a 10 mm × 10 mm ROI was centered on the photoacoustic target and a background ROI was placed 25 mm above the center of the target.
As demonstrated in the Appendix, photoacoustic acquisitions that yielded an SNR value of 3 or less were considered to be out-of-plane signals and were discarded from additional analysis. We reasoned that signals with SNR > 3 were more likely to be associated with a photoacoustic signal from the fiber tip, while SNR values below this threshold produced images that mostly contained noise. These noisy images were suspected to result from signal sources located outside of the imaging plane.
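The SNR-based exclusion rule can be sketched in a few lines. The sketch assumes SNR is the ratio of the target ROI mean to the background ROI standard deviation, as defined in the text, and uses the population standard deviation, which is an assumption. ROI samples are passed as flat lists, and the acquisition labels and values are made up for illustration.

```python
import statistics

def snr(target_roi, background_roi):
    """Mean of the target ROI divided by the standard deviation of the background ROI
    (population standard deviation assumed; inputs are pre-log-compression samples)."""
    return statistics.mean(target_roi) / statistics.pstdev(background_roi)

def keep_in_plane(acquisitions, threshold=3.0):
    """Return labels of acquisitions whose SNR exceeds the threshold."""
    return [label for label, target, background in acquisitions
            if snr(target, background) > threshold]

# Hypothetical acquisitions: (label, target ROI samples, background ROI samples).
acquisitions = [
    ("frame_a", [4.0, 6.0, 5.0, 5.0], [1.0, -1.0, 1.0, -1.0]),  # SNR = 5, kept
    ("frame_b", [1.0, 2.0, 3.0], [2.0, -2.0]),                  # SNR = 1, discarded
]
kept = keep_in_plane(acquisitions)
print(kept)
```

An acquisition with SNR exactly equal to 3 is discarded, matching the "3 or less" rule stated above.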
After removing the out-of-plane signal cases, 6 cases of cancellous bone and 5 cases of cortical bone were analyzed. For each case, DAS, SLSC, and LW-SLSC photoacoustic images were processed with the same parameters as described in Section II-C3. Then, contours of −6 dB were computed around a 10 mm × 10 mm ROI that was centered on the photoacoustic target. This process was repeated for 10 acquired frames from each cortical and cancellous bone case. Finally, a t-test was used to evaluate the statistical significance (p < 0.01) of the difference in areas generated from the contours measured when the optical fiber was touching either cancellous or cortical bone. This statistical analysis was repeated for each beamformer.
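A simple way to approximate the area enclosed by a −6 dB contour is to count the ROI pixels whose amplitude is within 6 dB of the ROI peak and multiply by the pixel area. The sketch below takes that approach; pixel counting is an approximation of the contour-based measurement described above, and the pixel spacing and sample values are hypothetical.

```python
def area_above_db(roi, level_db, dx_mm, dz_mm):
    """Area (mm^2) of ROI pixels at or above level_db relative to the ROI peak.
    Amplitudes are linear (not log-compressed), so -6 dB is roughly half the peak."""
    peak = max(max(row) for row in roi)
    threshold = peak * 10 ** (level_db / 20.0)
    count = sum(1 for row in roi for value in row if value >= threshold)
    return count * dx_mm * dz_mm

# Toy ROI of linear amplitudes; spacing of 0.5 mm x 0.2 mm is made up.
roi = [
    [1.00, 0.60, 0.20],
    [0.50, 0.51, 0.10],
]
area = area_above_db(roi, -6.0, dx_mm=0.5, dz_mm=0.2)
print(area)
```

Note that −6 dB corresponds to an amplitude ratio of about 0.501, so the pixel at exactly 0.50 of the peak falls just below the threshold in this toy example.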

D. Image Quality Assessments and Data Representation
The generalized contrast-to-noise ratio (gCNR) [49], [50] was used to assess the separability of bone structures and surrounding soft tissue in ultrasound images, defined as:

$$\mathrm{gCNR} = 1 - \int_{-\infty}^{\infty} \min\{p_i(x),\, p_o(x)\}\, dx$$

where $p_i$ and $p_o$ are the probability density functions of signal amplitudes within regions of interest (ROIs) inside and outside of the lamina, respectively. The probability density functions were calculated from histograms computed with 256 bins. Similarly, the contrast-to-noise ratio (CNR) was measured and compared, defined as:

$$\mathrm{CNR} = \frac{|S_i - S_o|}{\sqrt{\sigma_i^2 + \sigma_o^2}}$$

where $S_i$ and $\sigma_i$ are the mean and standard deviation, respectively, within a ROI inside of the target prior to log-compression and $S_o$ and $\sigma_o$ are the mean and standard deviation, respectively, of a ROI outside of the target prior to log-compression.
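The two image quality metrics can be sketched as follows. The gCNR is computed as one minus the overlap of the two normalized histograms, using 256 bins over the joint amplitude range as described above. The CNR is assumed here to be the absolute difference of the ROI means divided by the square root of the summed variances, with population variances. Function names and sample values are illustrative.

```python
import math
import statistics

def normalized_histogram(values, lo, hi, bins):
    """Histogram over [lo, hi] with equal-width bins, normalized to sum to 1."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        k = min(int((v - lo) / width), bins - 1)  # the maximum falls in the last bin
        counts[k] += 1
    return [c / len(values) for c in counts]

def gcnr(inside, outside, bins=256):
    """1 minus the overlap between the inside and outside amplitude distributions."""
    lo = min(min(inside), min(outside))
    hi = max(max(inside), max(outside))
    if hi == lo:
        return 0.0  # identical constant regions cannot be separated
    p_i = normalized_histogram(inside, lo, hi, bins)
    p_o = normalized_histogram(outside, lo, hi, bins)
    return 1.0 - sum(min(a, b) for a, b in zip(p_i, p_o))

def cnr(inside, outside):
    """|mean_i - mean_o| / sqrt(var_i + var_o), population variances assumed."""
    numerator = abs(statistics.mean(inside) - statistics.mean(outside))
    return numerator / math.sqrt(statistics.pvariance(inside) + statistics.pvariance(outside))

# Toy ROI samples (linear amplitudes, made up): bone is bright, soft tissue is dark.
bone = [10.0, 11.0, 12.0]
tissue = [0.0, 1.0, 2.0]
print(gcnr(bone, tissue), gcnr(bone, bone), cnr(bone, tissue))
```

A gCNR of 1 indicates fully separable amplitude distributions and 0 indicates complete overlap, which is why values near 1 in the results correspond to bone that is clearly distinguishable from soft tissue.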
Results from measurements of the thickness of segmented lines (Section II-B) and from areas of photoacoustic signal originating from cancellous and cortical bone (Section II-C6) are both presented as box-and-whisker plots in Section III. In these plots, the horizontal lines represent the median, the upper and lower edges of each box represent the upper and lower quartiles of each data set, the top and bottom lines extending from the boxes indicate the maximum and minimum of each data set, and the crosses indicate outliers (defined as any value larger than 1.5 times the interquartile range).

The gCNR of the DAS image was 0.67. In addition to improving gCNR, SLSC and LW-SLSC imaging improved the boundary between soft tissue and the spinous, lamina, and transverse processes of the vertebra, when compared to DAS imaging. CNR was also enhanced in the SLSC and LW-SLSC images (2.13 and 4.59, respectively), when compared to that of the DAS image, which was 0.55.

Fig. 4(a) shows the registration of vertebral boundaries segmented from CT and ultrasound images. While the segmented boundaries successfully converged in the final ultrasound-to-CT registration, a notable difference was observed with DAS when compared to SLSC and LW-SLSC boundaries. Specifically, a fuzzier segmentation was produced from the DAS image, while the coherence-based methods reduced outliers and produced finer contours. An additional reduction of pixel outliers is observed for the LW-SLSC image result, which more closely follows the CT contour when compared to the SLSC image result. Fig. 4(b) shows the corresponding thickness difference for the lateral and axial dimensions of the segmented boundaries. To quantitatively compare the thickness of the segmented boundaries, the integration of the segmented regions was calculated in the axial and lateral dimensions for each boundary. The differences between these integrated segmentation thicknesses at each lateral or axial position were computed to compare the obtained CT boundary with each of the ultrasound boundaries. The overall thickness of the CT contour in each dimension (axial: 1.84 mm, lateral: 1.79 mm) was closer to that obtained from the LW- Table I.

B. Vertebral Imaging of a Human Cadaver
The photoacoustic signals in Fig. 5 are shown registered to the ultrasound images, with a magnified view shown as a figure inset. These photoacoustic signals arise from the tip of the optical fiber that was inserted into the prepared pedicle hole. Coherence-based images were qualitatively observed to produce more focused photoacoustic signals when compared to DAS photoacoustic images, which is expected to enhance the estimation accuracy of the fiber tip location. Quantitatively, the distance between the center of mass and the brightest pixel of each photoacoustic image created with DAS, SLSC, and LW-SLSC beamforming was 0.26 mm, 0.21 mm, and 0.18 mm, respectively, where a shorter distance represents a more compact and less diffuse photoacoustic signal.
The bottom row of Fig. 5 shows the segmented ultrasound and photoacoustic masks for the three beamformers. The green triangles and magenta circles represent the centers of mass of the isolated components from ultrasound and photoacoustic masks, respectively. The segmented masks from the DAS ultrasound image include undesirable soft tissue and a single bony structure, while coherence methods identify at least 3 bony structures. Similarly, SLSC images created with greater M values have an increased number of outliers (i.e., pixels with coherence values that differ significantly from their surroundings and from their values at other lags [41]) and decreased SNR and CNR [39], which caused some otherwise continuous bony structures to appear disconnected, affecting the estimation of center of mass and resulting in redundant landmarks. This effect is mitigated with LW-SLSC.

Fig. 7 shows the X-Z and Y-Z views of the registered CT volume and the fiber tip fiducial marker segmented from the LW-SLSC photoacoustic image. To assess the proximity of the registered fiducial marker to the bottom of the pedicle hole, five manual markers were selected around the border of the pedicle hole for each X-Z (Fig. 7(a)) and Y-Z view (Fig. 7(b)). The positions of the manual markers represent the potential positions of the optical fiber tip when it was inserted in the pedicle hole. Euclidean distances between the fiducial marker and each of the manual markers are reported in Table II. The minimum distances are shown in bold, indicating the marker associated with the location of the bone surface that the tip of the optical fiber was most likely touching when inserted in the pedicle hole.

Fig. 8 shows examples of co-registered LW-SLSC ultrasound images and DAS photoacoustic images when the tip of the optical fiber was placed in holes corresponding to a medial breach (Fig. 8(a)), a lateral breach (Fig. 8(b)), and the cancellous core of the pedicle (Fig. 8(c)).
The corresponding CT slices were chosen to optimize visual confirmation of the fiber placement description, and therefore they are not registered to the photoacoustic and ultrasound images. It was not possible to perform ultrasound-to-CT registration for these figures, because of the absence of clear anatomical landmarks in the ultrasound image of the lumbar vertebrae. Our primary goal was instead to obtain ground truth images while touching the tip of the hole identified by post-operative CT images, without regard to the presence of suitable bony landmarks in the ultrasound images. Axial slices of the CT volume are shown in Fig. 8 in order to clearly visualize the pedicle hole and intentional lateral and medial breaches.
In particular, the medial breach in the CT image of Fig. 8(a) shows the tip of the hole coinciding with high density bone (i.e., the cortical bone) where the tip of the optical fiber was placed. Similarly, the tip of the fiber is in close proximity to the outer cortical wall of the pedicle in Fig. 8(b). In contrast, the tip of the hole in Fig. 8(c) is surrounded by low density bone (i.e., cancellous bone). Qualitatively, DAS photoacoustic images show distinct pattern differences when the optical fiber was touching either cancellous or cortical bone. Specifically, DAS photoacoustic signals from the cancellous core produced signals with greater area coverage than that present with lateral and medial breaches (i.e., fiber touching cortical bone) when images were displayed with the same dynamic range of 25 dB. Because coherence-based methods reduced the appearance of incoherent signals, the area of photoacoustic signals originating from cancellous bone (see Fig. 5) was reduced when compared to the same signals in DAS photoacoustic images, resulting in reduced differentiation between these signal origins with the coherence-based images.

Fig. 9 shows quantitative comparisons of the differences observed in Fig. 8, as measured by the enclosed area of the −6 dB contours generated from DAS photoacoustic images. These results are grouped by the expected location of the optical fiber tip, touching either cortical or cancellous bone, based on the corresponding CT images. The total mean area measured within the −6 dB contours was 7.59 mm² greater when touching cancellous bone compared to cortical bone (p < 0.01). In addition, greater standard deviations in these measurements were observed for cancellous bone (5.22 mm²) when compared to cortical bone (0.96 mm²).

Fig. 10 compares areas of the −6 dB contours obtained from DAS, SLSC, and LW-SLSC images of the optical fiber touching either cortical or cancellous bone. The mean ± one standard deviation of measurements from DAS images was 10.06 ± 5.22 mm² for cancellous bone and 2.47 ± 0.96 mm² for cortical bone. In comparison, the mean ± one standard deviation of measurements from SLSC images was 1.64 ± 0.88 mm² and 1.06 ± 0.59 mm² for cancellous and cortical bone, respectively. The mean ± one standard deviation of measurements from LW-SLSC images was 2.60 ± 2.25 mm² and 1.51 ± 0.77 mm² for cancellous and cortical bone, respectively. While the three beamformers showed statistically significant differences between the mean of measured areas from cortical and cancellous bone (p < 0.01), DAS images offered the greatest distinction.

IV. Discussion
We successfully demonstrated that combined ultrasound and photoacoustic imaging has the potential to improve pedicle screw placement during posterior spinal fusion surgeries. Coherence-based beamforming plays an important role in both ultrasound and photoacoustic image formation for this task. Specifically, coherence-based ultrasound imaging improves the visualization of bone structures (Figs. 3 and 5), which enables individual landmarks for each independent bone structure during the registration of ultrasound to CT images (Figs. 5 and 6). As a complement to this information, coherence-based photoacoustic imaging enables localization of fiber tips (Fig. 5).
On the other hand, amplitude-based methods such as DAS photoacoustic imaging of signals inside the lumbar vertebrae allowed differentiation between cortical and cancellous bone. As observed in Fig. 8, DAS photoacoustic images show a diffuse pattern when the optical fiber was inside the pedicle, where its core is composed of cancellous bone. This pattern is understandable, as reflections within the porous, blood-rich structure of the cancellous bone are expected to compromise the alignment of the delayed signals during the beamforming process. In contrast, a well-defined, compact signal was observed for the medial and lateral breaches, which can be explained by the wall surrounding the pedicle being composed of cortical bone, which is denser than cancellous bone [51] and is expected to produce fewer signal reflections. Similar signal appearance differences were previously obtained prior to the removal of any bone, presenting photoacoustic imaging as a potential option to find the ideal starting points for pedicle screw insertion [21]. The new contributions of this work demonstrate that these same differences in bone appearance can be used to determine if the pedicle hole is being created with the correct trajectory to avoid impending bone breaches. As out-of-plane signals need to be identified and excluded for successful implementation of this concept, the use of a 2D ultrasound array to identify the out-of-plane photoacoustic signals is a promising alternative to our empirical SNR > 3 threshold.
We additionally note that coherence-based beamforming was not sufficient to visualize or quantify differentiation between cortical and cancellous bone (Fig. 5). These coherence-based beamformers reduced the incoherent signals associated with the cancellous bone, which is a necessary feature of bone differentiation that is emphasized with amplitude-based beamforming methods. However, the added value of coherence-based beamforming is its ability to localize the coherent signal source with more clarity for photoacoustic signal tracking during pedicle hole creation. Thus, we conclude that amplitude- and coherence-based photoacoustic beamformers are mutually beneficial for the clinical task of guiding spinal fusion surgeries. Specifically, SLSC and LW-SLSC beamformers have the potential to improve target localization that is otherwise difficult in the presence of noise [37] or diffuse patterns from the cancellous core of the pedicle [21], while DAS beamforming can assist with determining proximity to cortical bone based on the shape of the amplitude-based signal.
In a previous study, a single vertebra with tissue attachments removed was submerged in a water tank [52], and the presence of reverberations required the introduction of some assumptions about fiber tip positions in order to estimate true locations within pre-drilled pedicle holes. However, the human cadaver study presented in this manuscript did not require these additional assumptions. As observed in Figs. 5 and 8, photoacoustic signals from the optical fiber tip did not produce additional artifacts that would otherwise negatively impact tip position estimates (compared with Fig. 2 in [52]). While the previous study differed from the cadaver study by using a custom drill bit that surrounded the optical fiber, we hypothesize that reverberations in [52] were primarily generated by the absence of muscle, nerves, fat, and blood vessels. These additional artifacts were substantially reduced in the human cadaver experiments because of sound attenuation in the surrounding soft tissue, which emphasizes the importance of conducting cadaver studies on the path to clinical translation of this photoacoustic-guided surgery concept, as noted in [53].
Regarding real-time capabilities, DAS and SLSC or LW-SLSC photoacoustic images can be interleaved during surgeries. Previous work describing a real-time GPU implementation of the SLSC beamformer on a research ultrasound system indicates that this is a viable possibility [37]. We demonstrated that photoacoustic SLSC images can be displayed in high-noise-level environments generated with <200 μJ laser energies at 41 frames per second [37]. Given that LW-SLSC operates on independent kernels $R_i$ as described in Section II-A2, real-time imaging can be similarly achieved by concurrent processing of each $R_i$ in a separate thread inside the GPU. The complexity of the operations per thread is further reduced by pre-computing matrix $B$ and $\alpha^2 D^T D$, which are defined in Section II-A2. With a GeForce GTX Titan X graphics card, we estimated a computation time of 60 ms based on the number of cores of the GPU (i.e., 3072 cores) and the computation time when executed in MATLAB (i.e., approximately 3 minutes). This estimation does not consider memory transfer and precomputation times. In addition, we previously developed a deep neural network architecture (i.e., CohereNet) to estimate spatial coherence functions [54], which are foundational to LW-SLSC imaging. This deep learning approach achieved real-time computational processing times and can potentially be adapted to include the additional regularization steps needed for LW-SLSC imaging.
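The 60 ms figure above is consistent with a simple back-of-the-envelope calculation in which the serial MATLAB time is divided evenly across the GPU cores. The sketch below reproduces that arithmetic; the assumption of ideal linear scaling with one kernel per core is our reading of the estimate, not a stated detail, and it ignores memory transfer and precomputation as noted in the text.

```python
def ideal_parallel_time_ms(serial_seconds, n_cores):
    """Serial run time divided evenly across cores, in milliseconds.
    Assumes perfect linear scaling, which real GPU kernels rarely achieve."""
    return 1000.0 * serial_seconds / n_cores

estimate_ms = ideal_parallel_time_ms(serial_seconds=3 * 60, n_cores=3072)
print(round(estimate_ms, 1))  # 58.6, i.e., roughly 60 ms
```

At roughly 60 ms per frame, this idealized estimate corresponds to about 17 frames per second.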
We envision several implementation possibilities to achieve the stated benefits of combined amplitude- and coherence-based ultrasound and photoacoustic images. First, as the fiber tips are ultimately envisioned to be inserted into the hollow core of custom drill bits [21], [36], [52], [55], the observed benefits of coherence-based photoacoustic images can potentially be extended to tracking the tips of common surgical tools used during spinal fusion surgeries (e.g., drill tips, pedicle probe tips). The feasibility of this concept was demonstrated for drill bits in a previous publication from our group [55]. As observed in Fig. 2 of [55], a stationary optical fiber was connected to the laser source, and the opposite end of the fiber was inserted into a stationary interface. The other end of this stationary interface accommodated a rotating drill bit, which was custom-fabricated with holes on both ends to house a rigidly attached optical fiber that rotated with the drill bit. The stationary and rotating optical fibers were air coupled to each other to permit light transmission from the stationary laser to the tip of the rotating drill bit. If attachment to tool tips is not possible, a surgeon may periodically check trajectories by removing the pedicle probe (or any other surgical instrument used to create pedicle holes) and replacing the instrument with an optical fiber, as implemented for the human cadaver study described in this paper.

V. Conclusion
This paper presents the first known combined ultrasound and photoacoustic image guidance system with software capabilities that are optimized for pedicle cannulation in posterior spinal fusion surgery, demonstrating that both amplitude- and coherence-based beamforming methods are mutually beneficial for this task. Specifically, coherence-based beamforming of ultrasound images improved the visualization of bone for ultrasound-to-CT registration, while coherence-based beamforming of photoacoustic images has the potential to improve target localization and tracking during pedicle hole creation. Amplitude-based photoacoustic beamforming has the potential to provide complementary quantitative information regarding proximity to the cortical bone surrounding the desired pedicle hole trajectory. Overall, this proposed combination of imaging modalities and beamforming methods is promising for assisting surgeons with identifying and avoiding impending bone breaches during spinal fusion surgeries. These new findings are complementary to previous work demonstrating that photoacoustic imaging is useful to determine optimal entry points into the pedicle [21]. Together with these previous findings, we have successfully demonstrated a complete system that has the potential to significantly impact the standard of image guidance methods for spinal fusion surgery.
plane signals. Using an empirically defined SNR threshold of 3, out-of-plane signals were discarded from the area analysis.
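The SNR-based exclusion and the subsequent −6 dB area measurement can be sketched as follows. This is a minimal illustration, not the analysis code used in the study. Several details are assumptions because they are not specified in this section: SNR is taken as the peak amplitude in a signal ROI divided by the standard deviation of a noise ROI, the −6 dB region is taken as the 4-connected set of pixels around the image peak, and all function names, masks, and the pixel area are hypothetical.

```python
from collections import deque

import numpy as np

SNR_THRESHOLD = 3.0  # empirical threshold quoted in the text


def snr(image, signal_mask, noise_mask):
    """Assumed SNR definition: signal-ROI peak over noise-ROI standard deviation."""
    return float(image[signal_mask].max() / image[noise_mask].std())


def area_minus6db(image, pixel_area_mm2):
    """Area of the 4-connected region around the peak within 6 dB of the peak."""
    cutoff = image.max() * 10 ** (-6 / 20)  # -6 dB in amplitude, about 0.5 x peak
    above = image >= cutoff
    start = np.unravel_index(np.argmax(image), image.shape)
    seen = np.zeros(image.shape, dtype=bool)
    seen[start] = True
    queue = deque([start])
    count = 0
    while queue:  # breadth-first flood fill starting at the peak pixel
        r, c = queue.popleft()
        count += 1
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            inside = 0 <= rr < image.shape[0] and 0 <= cc < image.shape[1]
            if inside and above[rr, cc] and not seen[rr, cc]:
                seen[rr, cc] = True
                queue.append((rr, cc))
    return count * pixel_area_mm2


def areas_for_valid_frames(frames, signal_mask, noise_mask, pixel_area_mm2):
    """Keep only frames with SNR above the threshold, then measure their areas."""
    return [
        area_minus6db(frame, pixel_area_mm2)
        for frame in frames
        if snr(frame, signal_mask, noise_mask) > SNR_THRESHOLD
    ]
```

Here `frames` would be a sequence of 2-D envelope-detected photoacoustic images on a linear amplitude scale. Restricting the area to the region connected to the peak prevents isolated bright pixels elsewhere in the image from inflating the estimate.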
To facilitate comparisons between the DAS images and each SLSC and LW-SLSC image, ROIs are not shown in Fig. 5. However, to provide a reference point, Fig. 12 shows the ROIs used for quantitative assessment of ultrasound image quality reported in Table I.
The triangles and circles represent the center of mass of isolated components from ultrasound and photoacoustic images, respectively, which are later combined and used as landmarks for CT registration. The insets show magnified views of the photoacoustic signal originating from the fiber tip.
Co-registered ultrasound (color) and CT (grayscale) images using ultrasound and photoacoustic landmarks (magenta) from segmented LW-SLSC images. (a) X-Z and (b) Y-Z planes of the CT volume registered to the ultrasound and photoacoustic images. The yellow marker represents the centroid of the photoacoustic signal reconstructed with the LW-SLSC image, which was used as a fiducial marker for landmark registration. The blue markers show the outline of the pedicle hole.
Areas of −6 dB contours around the center of photoacoustic targets from cortical bone and cancellous core using DAS beamforming. Each boxplot shows the median, interquartile range, maximum and minimum values of the estimated areas over 10 frames for cancellous (left) and cortical (right) bone.
Comparison of −6 dB contours from photoacoustic targets inside cortical and cancellous bone in human cadaver vertebrae using DAS, SLSC and LW-SLSC beamforming. Each boxplot shows the median, interquartile range, maximum and minimum values of the estimated areas over 60 frames for cancellous and 50 frames for cortical bone.
Qualitative and quantitative assessment of photoacoustic images originating from out-of-plane signals. (a) Examples of cancellous, cortical, and out-of-plane DAS photoacoustic images. (b) SNR assessment measured from photoacoustic signals associated with the cancellous core, cortical bone, and characteristic out-of-plane signals. The shaded area represents signals that did not achieve the SNR > 3 threshold and were therefore not included in the area results of Figs. 9 and 10.
Examples of ultrasound and co-registered photoacoustic images from an oblique sagittal view of L3-L5 vertebrae reconstructed with DAS, SLSC and LW-SLSC. S_1, S_2, S_3, and B denote the selected regions for quantitative assessments.