Refined Locally Linear Transform-Based Spectral-Domain Gradient Sparsity and Its Applications in Spectral CT Reconstruction

Spectral computed tomography (CT) is extension of the conventional single spectral CT (SSCT) along the energy dimension, which achieves superior energy resolution and material distinguishability. However, for the state-of-the-art photon counting detector (PCD) based spectral CT, because the emitted photons with a fixed total number for each X-ray beam are divided into several energy bins, the noise level is increased in each reconstructed channel image, and it further leads to an inaccurate material decomposition. To improve the reconstructed image quality and decomposition accuracy, in this work, we first employ a refined locally linear transform to convert the structural similarity among two-dimensional (2D) spectral CT images to a spectral-dimension gradient sparsity. By combining the gradient sparsity in the spatial domain, a global three-dimensional (3D) gradient sparsity is constructed, then measured with L1-, L0- and trace-norm, respectively. For each sparsity measurement, we propose the corresponding optimization model, develop the iterative algorithm, and verify the effectiveness and superiority with real datasets.


I. INTRODUCTION
X-ray computed tomography (CT), as a nondestructive inspection technique, has been widely used in many fields, such as medicine, industry, biology, exploration, security, and so on [1]- [3]. Although the specific equipment design, scan protocol, and utilization purpose are greatly diverse, the fundamental image-forming principle is consistently based on the Beer-Lambert law, which describes the relation between X-ray and the object as an exponential attenuation model, I E = I 0 E exp −P μ E, x , (1) where µ(E, x) is the linear attenuation coefficient for energy E at the position x [4], P ⋅ a line integral operator (the ray transform and especially the Radon transform for 2D cases [5]), I 0 (E) the emitted photons, and I(E) the collected photons. Actually, the validity of (1) is confined to a set of ideal imaging conditions, such as the X-ray energy is monochromatic, there is no scattering in the imaging process, and so on. However, in practical applications, the energy E obeys an emitted spectral distribution called S(E), i.e., polychromatic. Thus, (1) should be changed to After the logarithmic transformation, we can obtain the commonly known projection as follow, . (2) The problem of CT reconstruction is to determine an approximate distribution of µ from a set of (2).
Over the last 40 years, 5 major innovations were made in the development of CT technology, which contribute to improve the scan efficiency and practicability, i.e., spatial-domain modification, but never touch the spectral scope. In other words, these generations just develop different patterns to construct P ⋅ mappings. Recently, spectral CT is proposed by extending the conventional single-spectral computed tomography (SSCT) along the energy dimension [6]. And the state-of-the-art implementation technique is employing a photoncounting detector (PCD), a kind of energy-selective detector [7], [8], which can divide the X-ray photons into different energy channels with appropriate post-processing steps and then obtain multiple energy-dependent projection sets [9]. Thus, compared with the SSCT, spectral CT is superior in energy resolution and material distinguishability. It has great potential in both medical and industrial applications [10]. Mathematically, spectral CT can be described by introducing a window function to (2), i.e.,

Author Manuscript
Author Manuscript

Author Manuscript
Author Manuscript is a window function of the channel c, 1 ≤ c ≤ C, ω c indicates the energy interval, and C is the channel number.
By denoting the inverse operator of P ⋅ as P −1 ⋅ , we can obtain the channel-wise image as follow, Denote the size of F c (1 ≤ c ≤ C) as M × N. By stacking up all the channel images along the spectral dimension, we can obtain a volumetric spectral image F with the size of M × N × C.
Same strategy can be used for channel-wise projection data to achieve the corresponding volumetric spectral projection P.
Apparently, (2) can be treated as a special case of (3) when the only energy window is extended to fully cover the whole spectrum. And (3) represents the local performance of (2) with a truncated spectrum. Meanwhile, it is obvious that the emitted photons for a channel c are just a fraction of the originally emitted photons in an X-ray beam. Thus, in practical applications, the decreased channel dose inevitably increases the noise level of the corresponding projection. Thus, the fundamental problem of CT, i.e., how to reconstruct high-quality images from noisy projections, will be more challenging for spectral CT. Moreover, it may further have adverse consequences on the decomposition accuracy and damage the native material distinguishability.
To overcome the ill-posedness in spectral CT reconstruction, prior knowledge needs to be greatly concerned and effectively incorporated. The features of spectral CT images can be classified into two categories, which lie in the spatial and the spectral domains, respectively. The spatial feature can be ascribed to a sparsity in the spatial domain itself [11], an appropriate transform domain [12], [13], or a high-dimensional space [14]- [16]. The spectral feature is a correlation among channel images, more specifically, a structural similarity [17]- [20]. Most existing methods for spectral CT employ different measurements to directly describe both or either the aforementioned features.
Different from directly employing the correlation among channel images, in our previous study [21], we developed a locally linear transform based gradient L 0 -norm minimization method for spectral CT reconstruction. As a natural continuation and deeper investigation, in this work, we innovatively refine the proposed locally linear transform with a Gaussian kernel to improve the edge preservation, when converting the spectrum-related structural similarity to a gradient sparsity. Then, we propose a general optimization framework with a global three-dimensional (3D) sparsity constraint. We also concretize the 3D constraint with gradient L 1 -and L 0 -norm, and spatial total-variation with spectral trace-norm (TVLR) [14], respectively. Moreover, we perform experiments to verify the effectiveness and superiority of the proposed methods comparing with the previous versions (2D L 1 -and L 0 -norm minimization and TVLR method).
The remainder of this paper is organized as follows. In section II, we briefly review the locally linear transform based 3D gradient L 0 -norm minimization method for spectral CT. In section III, we present the refining strategy for the locally linear transform, establish a general optimization framework with 3D sparsity constraint, and develop concritized optimization models with gradient L 1 -and L 0 -norm, and TVLR. We perform real experiments to verify the effectiveness of the proposed method in section IV. In last section, we conclude this work.

II. THEORY
In this section, we review how to construct a 3D gradient sparsity by employing the locally linear transform, and how to incorporate this sparse constraint into an optimization model.

A. LOCALLY LINEAR TRANSFORM BASED 3D GRADIENT SPARSE CONSTRAINT
Comparing all the reconstructed channel images, the structural similarity and quantitative difference are obvious. By employing locally linear transform, we convert this feature to a gradient sparsity along the spectral direction. Assuming the current target channel is c, we fix F c as the filtered input image, choose F i (1 ≤ i ≤ C) as a reference image, and obtain a filtered output F i c satisfying where x and k indicate pixel positions, Ω 2 (k) represents an image patch with a center position k. a i c k , b i c k is a pair of constant coefficients for the patch Ω 2 (k), which are determined by a quadratic optimization model as follow [22], Equation (5) also suggests F i c can be viewed as a copy of F c . Considering each pixel x is covered by several patches (∀x ∈ Ω 2 (k)), we adopt an averaging strategy for a i i.e., a i c x , b i c x is a pair of averaged coefficients in all the patches covering the pixel x.
Thus, (4) is converted to By performing the patch-wise parameter average, the locally linear transform in (4) is converted to the point-point linear transform in (6).
When we employ all the channel images as references, we can repeatedly perform (5)

Author Manuscript
Author Manuscript

Author Manuscript
Author Manuscript operation can be performed to a i c and b i c 1 ≤ i ≤ C , and we can also obtain the volume-based a c and b c . Thus, we further represent (6) in a volume version as follow, Here Ω 3 indicates the 3D spatial domain. The filtering input volume is represented as V c , which is a simply duplicate extension of F c in the spectral dimension.

B. OPTIMIZATION MODEL AND ITERATIVE ALGORITHM
To measure the 3D gradient sparsity of F c by L 0 -norm, we employ a counting function C ⋅ as follow, Incorporating the data constraint and considering the relationship between F c and F, we propose the following optimization model, where λ>0 is a parameter to control the importance of the regularization term. By relaxing the constraint, Eq. (7) is converted to the following unconstrained model, where τ > 0 is a parameter controlling the relaxation degree. Then, for each target channel c (1 ≤ c ≤ C), we split (8) to the following sub-problems, Equation (9a) is a quadratic optimization problem, which can be iteratively solved by using the POCS scheme [23]. Equation (9b) can be viewed as a 3D generalization of the 2D gradient L 0 -norm minimization, and the solution approach can be achieved by extending the 2D method in [24]. Finally, by averaging F c along the spectral dimension, we can obtain the searched-for channel image. Furthermore, the decomposition method [25] can be employed to obtain material percentage images.

III. METHOD
In this section, we first refine the construction of the 3D gradient sparsity, i.e., refined locally linear transform. Then, we propose a general optimization framework with 3D gradient sparsity, and concrete it with three different regularizers (gradient L 1 -and L 0 -norm, and TVLR).

A. REFINED SPECTRAL-DOMAIN GRADIENT SPARSITY CONSTRUCTION METHOD
The correlations among channel images are conspicuous. For one fact, all the slices contain the same structures and textures, i.e., structural similarity. For the other fact, the gray value and contrast vary a lot, i.e., quantitative diversity. When employing the locally linear transform to establish the 3D gradient sparsity, we perform a patch-wise average operation for the coefficient pair a i c k , b i c k , which meanwhile may cause edge deformation. To overcome this drawback, we introduce a Gaussian kernel to refine the transform, i.e., replacing the average weight with a radial basis function. Thus, we revise (6) as where the tilde symbol indicates a weighted average operation. Be giving the central pixel more weight, and the edge pixel less weight, the edge distortion can be effectively suppressed.
When the reference image traverses all the energy channels, we can stack up the corresponding filtering outputs along the spectral direction to form a volume F c , of which the i-th channel image is F i c 1 ≤ i ≤ C . Thus, we further represent (10) in a volume version as follow, The corresponding filtering input volume is represented as V c , which is a duplicate extension of F c along the spectral dimension. It is emphasized while F c represents a 2D channel image, F c is the corresponding 3D extension along the spectral dimension. It is worth noting that the channel images of F c successfully overcome the shortcoming of quantitative diversity, and well maintain the structural similarity at the same time. Thus, its 3D gradient volume is globally sparse, i.e., 2D spatial sparsity and 1D spectral sparsity.

B. REFINED LOCALLY LINEAR TRANSFORM BASED GENERAL OPTIMIZATION FRAMEWORK
Considering the gradient sparsity of F c (1 ≤ c ≤ C), we employ it as a constraint by performing a general measurement noted as Φ ⋅ . Combining the data fidelity term, we propose the following optimization framework,  (11) Here F is the spectral CT reconstruction volume by stacking up all the channel images F c (1 ≤ c ≤ C) along the spectral dimension. F c is a duplication volume of the searched-for c-th channel image along the spectral dimension. a c x and b c x are determined by the following quadratic optimization model, Similar to II-B, (11) can be relaxed and splitted into two sub-problems. The corresponding pseudo-codes are summarized in Algorithm 1.

C. CONCRETIZED 3D REGULARIZERS
Many measurement methods can be used to concretize the general regularizer Φ ⋅ , such as L 1 -, L 0 -norm. However, the commonly employed version is 2D, which should be modified to a 3D extension to meet the sparsity feature in this problem.
For the TVLR method [14], the trace-norm measurement is performed on a directly unfolded channel image according to the spectral domain. Theoretically, the trace-norm measurement describes a low-rank feature, which fits sparsity better than similarity. Thus, we modify the TVLR method by performing the trace-norm measurement on the unfolded F c instead of F.

•
Modified trace-norm: where σ γ (F 3 c ) is the γ -th largest singular value, For each concretized Φ ⋅ , we develop the corresponding optimization model and perform experiments to verify the effectiveness.

IV. RESULTS
We perform two real experiments to verify the effectiveness of algorithm 1 concretized with the L 1 -and L 0 -norm and TVLR, respectively. We visually compare reconstructed channel images and decomposed material images among conventional filtered backprojection (FBP) method, TV-class methods, L 0 -class methods, and TVLR-class methods. The TV-and L 0class methods include 2D version, 3D version, locally linear transform (LLT-) based version and refined locally linear transform (ELLT-) based version. The TVLR-class methods include TVLR, locally linear transform (LLT-) based version and refined locally linear transform (ELLT-) based version. For all the aforementioned methods, we consistently fixed the iteration number to 20 for fair comparisons. We used an image-domain material decomposition method for all the experiments [26]. To quantitatively compare the decomposition accuracy, for each experiment, we calculated the mean value and standard deviation of the decomposed solid water for all the comparison methods.
The experiments are with a same X-ray source (YXLON 225 kV micro-focus tube) operated at a tube voltage of 140 kV and a tube current of 100 µA. The detector is a 4-channel PILATUS3 PCDs by DECTRIS. The source-object distance is 35.27 cm and the sourcedetector distance is 43.58 cm. 720 views are collected by the detector consisting of 515 cells with 0.15 mm length. The reconstructed and decomposed results are with 512×512 pixels. For each pixel, the physical dimension is 0.122 mm × 0.122 mm.
In the first real experiment, we perform a one-time scan with an equal photon ratio setting.
To reduce the scattering influence, we just employ 3 energy bins with higher energies. The examined specimen, shown in Fig. 1 Table 1, we quantitatively compared the decomposed solid water with mean and standard deviation measurements.
Comparing with channel images, the FBP-based results suffers from serious noise influence. The 2D based methods (2D TV and 2D L 0 ) perform inconsistently between different channels. Some are still noisy ( perform well in noise suppression, the ELLT-based ones are more desirable in fine structure protection (comparing the edges marked by red arrows in Fig. 5). Another characteristic is the TV-and TVLR-class methods are good at smoothing the images because they penalize the gradient magnitude. However, L 0 -class works more stiff, because it penalize the gradient existence. Thus the edges are sharper than the TV-and TVLR-classes. The numerical comparison in Table 1 shows superior decomposition accuracy of the LLT-involved methods, where the mean value is very close to the ground truth and the standard deviation is very small (below 0.06).
In the second experiment, we scan twice with different energy thresholds. A 8 channel projection dataset is obtained. And for each channel, the initially emitted photon number is roughly the same. We employ 6 energy bins with higher energy to weaken the scattering influence. The examined specimen, shown in Fig. 1 (lower row), is consist of bone, muscle, fat and solid water. The channel reconstructions and zoomed-in patches are shown in Figs. 6-7 and 8-11, respectively. We choose bone, muscle and solid water as three basis materials to perform the material decomposition. The results and the magnified details are illustrated in Figs. 12 and 13-14, respectively. The numerical comparison of decomposition accuracy for solid water is summarized in Table 2.
Comparing the bone material in Figs. 8-9 and 13, the TVLR, 2D and 3D TV methods bring blurring effects to the fine bone structures. Although the FBP, 2D and 3D L 0 methods well preserve the edges, they fail to effectively remove the noise in soft tissue. Both the LLT-and ELLT-based methods work well for the dual tasks and perform consistent among all the channel images. However, the ELLT-based methods are superior in fine structure preservation (see the yellow and red arrows in Fig. 14). Comparing the numerical results in Table 2, the LLT-involved methods consistently achieve high mean value (larger than 0.98) and low standard deviation (smaller than 0.04). For 2D and 3D L 0 methods, the decomposed solid water had obvious bias with the ground truth, and even the standard variation is greater than 0.2. However, by introducing LLT, both of the evaluation metrics are dramatically improved, where the mean value is enhanced to 0.99 from 0.82 and the standard deviation drops to 0.04 from 0.20.

V. CONCLUSION
In this work, we mainly investigate a method to effectively establish a sparsity feature in the spectral domain, and correspondingly develop the potential applications, such as 3D gradient L 1 -and L 0 -norm minimization and modified TVLR method for spectral CT reconstruction.
Comparing with the previous work, we refine the sparsity construction method to improve the edge preservation accuracy, propose a general optimization framework, and develop three specific minimization models. Real experiments are performed, and the results confirm the effectiveness and superiority of our proposed approaches for both image quality and decomposition accuracy. Quantitative comparison of decomposition accuracy for solid water in real experiment 1 (Soft Tissue Group of Fig. 4).
Mean ± Std.Dev.  Quantitative comparison of decomposition accuracy for solid water in real experiment 2 (Water Group of Fig.  12).