Plug-and-Play ADMM for MRI Reconstruction with Convex Nonconvex Sparse Regularization

The traditional ℓ1-regularized compressed sensing magnetic resonance imaging (CS-MRI) model tends to underestimate the fine textures and edges of an MR image, which play important roles in clinical diagnosis. In contrast, the convex nonconvex (CNC) strategy allows the use of nonconvex regularization while maintaining the convexity of the total objective function. The plug-and-play (PnP) framework is a powerful approach to sparse regularization problems: it plugs an advanced denoiser into a traditional proximal algorithm. In this paper, we propose a PnP-ADMM algorithm for CS-MRI reconstruction with CNC sparse regularization. We first derive the proximal operator for CNC sparse regularization. We then obtain the PnP-ADMM algorithm by replacing the proximal operators of ADMM with properly pre-trained denoisers. Furthermore, we conduct comparison experiments with various denoisers, sampling templates, and images. The experimental results verify the effectiveness of the proposed algorithm in terms of both numerical criteria and visual quality.


I. INTRODUCTION
MRI is one of the most powerful clinical diagnostic tools due to its non-invasive and non-ionizing characteristics. However, one of the primary limiting factors of MRI is its relatively long image acquisition time. To reduce the scanning time, MRI data are usually undersampled. How to reconstruct MR images from undersampled measurements without degrading quality has been a long-standing concern [1]-[4].
CS-MRI is one of the most important breakthroughs in fast MR imaging [5]. Mathematically, for a single-coil CS-MRI model, the acquired k-space data y ∈ Cᵐ is given by

y = Ax + η,   (1)

where x ∈ Cⁿ is the ground truth MR image, η ∈ Cᵐ denotes measurement noise or disturbance, and A ∈ Cᵐˣⁿ (m < n) represents the CS sampling matrix. Usually we have A = PF, where P ∈ Rᵐˣⁿ is an undersampling matrix and F ∈ Cⁿˣⁿ is the Fourier transform. The ground truth MR image x can be estimated by solving the regularized inverse problem

min_x (1/2)‖Ax − y‖₂² + λφ(x),   (2)

where the nonsmooth regularization term φ(x) encodes prior knowledge of x and λ is the regularization parameter. The most widely used prior for MR images is sparsity in the wavelet domain, expressed by the regularization term φ(x) = ‖Ψx‖₁. Here, Ψ is a discrete wavelet transform and the ℓ1 norm promotes sparsity. The convexity of the ℓ1 norm ensures that the objective function (2) can be solved efficiently. However, ℓ1 regularization is a biased estimator that tends to underestimate the high-frequency part of x in the wavelet domain. The high-frequency part of the wavelet domain corresponds to image details, such as fine textures and edges, which are very important in clinical diagnosis. In contrast, existing nonconvex regularizers (such as the ℓq norm φ(x) = ‖Ψx‖_q with 0 < q < 1, Capped-ℓ1 [6], the Smoothly Clipped Absolute Deviation (SCAD) [7], and the Minimax Concave Penalty (MCP) [8]) can yield a more accurate estimate of high-amplitude components. However, a nonconvex regularizer makes the objective function of problem (2) nonconvex and can produce suboptimal local solutions [8]-[12]. Recently, Selesnick et al. proposed a new CNC sparse regularizer formed by subtracting from the ℓ1 norm its own Moreau envelope [13]-[17]. The CNC sparse regularizer is nonconvex and can reduce the estimation bias; meanwhile, the convexity of the whole objective function (2) can be maintained by a suitable parameter setting.
The CNC strategy is widely used in image reconstruction [18], [19], image denoising [13], [14], [20], and fault detection [21]- [23].
How to solve the sparse regularization problem (2) efficiently is a challenging topic. The PnP framework is a powerful approach to sparse regularization problems: it plugs an advanced denoiser into a traditional proximal algorithm. Ahmad et al. proposed PnP-ADMM for MRI reconstruction with ℓ1 regularization [3]. In this paper, we propose a novel PnP-ADMM algorithm to solve problem (2) with CNC sparse regularization. The contributions of this paper are summarized as follows. Firstly, we derive the proximal operator for the CNC sparse regularizer in iterative form. This iterative proximal operator can also be used for other generalized CNC regularizers whose closed-form solutions are difficult to obtain. Secondly, by replacing the proximal operators with properly pre-trained denoisers, we obtain the PnP-ADMM algorithm for problem (2). Finally, we apply the proposed algorithm to MRI reconstruction. Experimental results demonstrate the superiority of the proposed algorithm.
The rest of this paper is organized as follows. In the next section, we present some basic concepts underlying the proposed method. In Section III, we introduce the CNC sparse regularizer and give its proximal operator in iterative form. ADMM and PnP-ADMM algorithms for the CNC sparse regularization model are presented in Section IV. Experimental results are reported in Section V. Finally, the conclusion is presented in Section VI.

A. RELATED WORK
MRI reconstruction algorithms can be roughly divided into three categories: optimization-based algorithms, neural network-based algorithms, and algorithms combining optimization with neural networks.
Traditional first-order iterative optimization algorithms, e.g., the iterative shrinkage/thresholding algorithm (ISTA) [24], the alternating direction method of multipliers (ADMM) [25], the primal-dual hybrid gradient (PDHG) [26], and proximal gradient descent (PGD) [27], [28], have achieved great success in the past. Proximal operators are often used to handle the nonsmooth regularization terms. Recent works have shown the effectiveness of the above algorithms when a simple closed-form solution of the proximal operator can be computed easily [29], [30]. However, these iterative algorithms may be hard to build when the proximal operator is not easy to evaluate. Furthermore, iterative algorithms often need hundreds of iterations to converge to an optimal solution.
In recent years, with the rapid rise of deep learning, neural networks have also been used in MRI reconstruction. Neural networks such as U-Net [31], [32] and the generative adversarial network (GAN) [33] are used to learn the mapping between the k-space data and the ground truth MR image, with the network parameters learned end-to-end. The main drawbacks of these methods are their huge number of network parameters and lack of interpretability.
Incorporating deep learning techniques into traditional optimization algorithms has attracted increasing attention. Roughly speaking, these learning-based or data-driven algorithms are implemented in two ways. The first approach is algorithm unrolling (or unfolding), which unrolls an iterative algorithm into a deep neural network [34], [35]. The network architecture is constructed by mapping one iteration to one network layer and stacking all layers together; the network parameters are then learned from real-world training data sets through end-to-end training. Existing unrolled algorithms include Learned ISTA (LISTA) [36], ADMM-Net [37], PGD-Net [38], and Primal-Dual-Net (PD-Net) [39].
The second approach is PnP, which plugs an advanced denoiser into a traditional proximal algorithm. Existing PnP algorithms, including PnP-ADMM, PnP-FBS, and PnP-ISTA, have shown their superiority in image reconstruction, image restoration, and other inverse imaging problems [3], [40]-[46]. Compared with algorithm unrolling, PnP has several advantages. First of all, when there is not enough data for end-to-end training, any traditional or pre-trained denoiser can be used as a modular component under the PnP framework. Commonly used denoisers include traditional hand-designed denoisers, such as the soft threshold, Total Variation (TV) [47], and Block-Matching 3D (BM3D) [48], as well as deep neural network denoisers [49]-[52]. Secondly, the convergence of PnP algorithms can be guaranteed theoretically under some general assumptions. For example, [40] analyzed convergence under a bounded-denoiser assumption, [41] studied the convergence of some PnP frameworks under a specific Lipschitz condition on the denoisers, and [42] proved the convergence of PnP algorithms under the assumption that the denoiser is nonexpansive.

II. PRELIMINARIES
In this section, we recall some basic concepts of proximal operator and monotone operator theory, which will be helpful in the rest of the paper. These concepts can be found in classical convex optimization literature [53], [54].
For a proper, closed, convex function g(x), its proximal operator is defined as

prox_{λg}(y) = arg min_x { (1/2)‖x − y‖₂² + λg(x) }.   (3)

As a special case, if g(x) = ‖x‖₁, then its proximal operator prox_{λ‖·‖₁} is the element-wise soft threshold

[soft(y; λ)]_i = sign(y_i) · max(|y_i| − λ, 0),   (4)

where y_i is the i-th element of y. The soft threshold can be viewed as a denoiser since it shrinks the elements of y towards zero.
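For concreteness, the element-wise soft threshold can be implemented in a few lines of NumPy (a sketch; the name `soft_threshold` and its signature are ours, not from the paper):

```python
import numpy as np

def soft_threshold(y, lam):
    """Element-wise soft threshold: the proximal operator of lam * ||.||_1.

    Shrinks each entry of y toward zero by lam and zeroes out
    entries whose magnitude is below lam.
    """
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)
```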
For a general regularized model

min_x f(x) + g(x),   (5)

where f(x) is a convex and differentiable data-fidelity term and g(x) is a convex but nonsmooth regularization term, the PGD iteration for (5) using the proximal operator prox_{αg} is given by

x^{k+1} = prox_{αg}( x^k − α∇f(x^k) ),   (6)

where ∇f(x) is the gradient of f(x) and α is the step size.
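A minimal PGD sketch in NumPy, applied to a toy denoising problem whose solution is known in closed form (the helper names `pgd` and `prox_l1` are ours):

```python
import numpy as np

def pgd(grad_f, prox_g, x0, alpha, n_iter=200):
    """Proximal gradient descent for min_x f(x) + g(x):
    x_{k+1} = prox_{alpha*g}(x_k - alpha * grad_f(x_k))."""
    x = x0.copy()
    for _ in range(n_iter):
        x = prox_g(x - alpha * grad_f(x), alpha)
    return x

# toy problem: min 0.5*||x - y||^2 + ||x||_1, whose solution is soft(y; 1)
y = np.array([3.0, -0.2, 1.5])
prox_l1 = lambda v, a: np.sign(v) * np.maximum(np.abs(v) - a, 0.0)
x_hat = pgd(lambda x: x - y, prox_l1, np.zeros_like(y), alpha=0.5)
```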
The main iteration steps of ADMM for problem (5) can be summarized as

x^{k+1} = arg min_x { f(x) + (β/2)‖x − z^k + u^k‖₂² },   (7)
z^{k+1} = prox_{(1/β)g}( x^{k+1} + u^k ),   (8)
u^{k+1} = u^k + x^{k+1} − z^{k+1}.   (9)

Since f(x) is convex and differentiable, subproblem (7) has an analytic solution. If the proximal operator prox_{(1/β)g} can be expressed in a simple closed form, ADMM can solve (5) efficiently.
The key idea behind PnP-ADMM is to plug a powerful pre-trained denoiser in place of the proximal operator prox_{(1/β)g}. In this case, subproblem (8) becomes

z^{k+1} = H_σ( x^{k+1} + u^k ),   (10)

where H_σ denotes a pre-trained denoiser with denoising parameter σ. Figure 1 provides an intuitive illustration of the ADMM and PnP-ADMM frameworks.
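The resulting scheme can be sketched in a few lines (our naming; `denoiser` stands for any plug-in H_σ). The sanity check uses a soft threshold as the "denoiser", in which case PnP-ADMM reduces to classical ADMM for a lasso-type denoising problem:

```python
import numpy as np

def pnp_admm(prox_f, denoiser, x0, n_iter=50):
    """Generic PnP-ADMM: the z-update prox_{(1/beta)g} of subproblem (8)
    is replaced by an off-the-shelf denoiser H_sigma."""
    x = x0.copy()
    z = x0.copy()
    u = np.zeros_like(x0)
    for _ in range(n_iter):
        x = prox_f(z - u)     # data-fidelity step, subproblem (7)
        z = denoiser(x + u)   # denoising step in place of subproblem (8)
        u = u + x - z         # dual update
    return x

# toy check: with beta = 1, f(x) = 0.5*||x - y||^2 and a soft-threshold
# "denoiser", PnP-ADMM coincides with ADMM for
# min 0.5*||x - y||^2 + ||x||_1, whose solution is soft(y; 1).
y = np.array([3.0, -0.2, 1.5])
soft = lambda v: np.sign(v) * np.maximum(np.abs(v) - 1.0, 0.0)
prox_f = lambda v: (v + y) / 2.0   # prox of 0.5*||x - y||^2 with beta = 1
x_hat = pnp_admm(prox_f, soft, np.zeros_like(y))
```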

III. CNC SPARSE REGULARIZATION AND ITS PROXIMAL OPERATOR
Recently, Selesnick et al. proposed a new CNC sparse regularizer formed by subtracting from the ℓ1 norm its own Moreau envelope [13]-[17]. To be specific, the CNC sparse regularizer is

φ_b(x) = ‖x‖₁ − S_b(x),  S_b(x) = min_v { ‖v‖₁ + (b²/2)‖x − v‖₂² },   (11)

where S_b(x) is the Moreau envelope of the ℓ1 norm. φ_b(x) is a nonconvex regularizer that can reduce the estimation bias; meanwhile, the convexity of the whole objective function (2) can be maintained by a suitable setting of the parameter b. By the definition of the proximal operator, the proximal operator of the CNC sparse regularizer is

prox_{λφ_b}(y) = arg min_x { (1/2)‖x − y‖₂² + λφ_b(x) }.   (12)

The proximal operator of φ_b(x) may not have a simple explicit formula since φ_b(x) is nonconvex. However, we can still obtain a PGD iteration by rearranging the objective function.
The objective function of equation (12) can be expressed as

(1/2)‖x − y‖₂² + λ‖x‖₁ − λS_b(x) = h(x) + λ‖x‖₁,  h(x) := (1/2)‖x − y‖₂² − λS_b(x).   (13)

As noted by Theorem 6.60 in [53], S_b(x) is differentiable and its gradient is given by

∇S_b(x) = b²( x − prox_{(1/b²)‖·‖₁}(x) ).   (14)

It follows that h(x) is also differentiable, and its gradient is

∇h(x) = (x − y) − λb²( x − prox_{(1/b²)‖·‖₁}(x) ).   (15)

Furthermore, the proximal operator of g(x) = λ‖x‖₁ is the soft threshold, so the PGD iteration for (13) can be expressed as

x^{(k+1)} = soft( x^{(k)} − α∇h(x^{(k)}); αλ ).   (16)

Finally, we get the iterative expression of prox_{λφ_b} as

prox_{λφ_b}(y) = lim_{k→∞} x^{(k)}, with x^{(k+1)} given by (16).   (17)

We also give a numerical example to verify the validity of the iterative expression of prox_{λφ_b}. When the variables in (12) reduce to scalars (one-dimensional) and b² ≤ 1/λ, prox_{λφ_b} can be expressed as the firm threshold function [11], [15], i.e.,

prox_{λφ_b}(y) = firm(y; λ, μ) with μ = 1/b²,   (18)

where

firm(y; λ, μ) = 0 if |y| ≤ λ;  sign(y)·μ(|y| − λ)/(μ − λ) if λ < |y| ≤ μ;  y if |y| > μ.   (19)

The firm threshold is a generalization of the hard threshold and the soft threshold: as μ → λ and μ → ∞, the firm threshold function approaches the hard threshold function and the soft threshold function, respectively.
We set α = 1, λ = 1, and b = 1/√2, and plot the firm threshold function (18) and the iterative solutions (17) with different initial values x^(0) in Figure 2. As shown in Figure 2, although the initial values differ, the iterative solutions converge to the firm threshold after a few iterations. This shows that the iterative expression of the proximal operator remains valid when a closed-form expression is difficult to obtain.
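This numerical check can be reproduced with a short NumPy sketch (helper names ours). It assumes λ = 1 and b = 1/√2, so that under the derivation above the scalar prox equals the firm threshold with μ = 1/b² = 2:

```python
import numpy as np

def soft(x, t):
    """Element-wise soft threshold with threshold t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def firm(y, lam, mu):
    """Firm threshold: zero below lam, identity beyond mu, linear between."""
    mid = np.sign(y) * mu * (np.abs(y) - lam) / (mu - lam)
    out = np.where(np.abs(y) <= lam, 0.0, mid)
    return np.where(np.abs(y) > mu, y, out)

def prox_cnc(y, lam=1.0, b2=0.5, alpha=1.0, n_iter=100):
    """Iterative evaluation of prox_{lam*phi_b} via the PGD iteration (17).

    The gradient of h(x) = 0.5*||x - y||^2 - lam*S_b(x) is
    (x - y) - lam*b2*(x - soft(x, 1/b2)).
    """
    x = np.zeros_like(y)
    for _ in range(n_iter):
        grad_h = (x - y) - lam * b2 * (x - soft(x, 1.0 / b2))
        x = soft(x - alpha * grad_h, alpha * lam)
    return x

y = np.linspace(-4.0, 4.0, 81)
x_iter = prox_cnc(y)           # iterative proximal operator
x_firm = firm(y, 1.0, 2.0)     # scalar closed form with mu = 1/b^2 = 2
```

On this grid the two curves agree to numerical precision, matching the behavior shown in Figure 2.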

IV. PNP-ADMM ALGORITHM
A. ADMM ITERATION FOR CNC SPARSE REGULARIZATION
This subsection gives the ADMM algorithm for the sparse regularization model (2) with the CNC regularizer. In this case, problem (2) can be expressed as

min_x (1/2)‖Ax − y‖₂² + λφ_b(Ψx),   (20)

where φ_b(·) is defined as in (11). Setting z = Ψx, the augmented Lagrangian form of (20) in scaled form is given by

L_ρ(x, z, u) = (1/2)‖Ax − y‖₂² + λφ_b(z) + (ρ/2)‖Ψx − z + u‖₂²,   (21)

where u is the scaled dual variable and ρ > 0 is the penalty parameter. According to the ADMM iteration, a minimizer of (21) can be found by solving the following subproblems.
Step 1. Update x^{k+1} with z^k and u^k fixed:

x^{k+1} = arg min_x { (1/2)‖Ax − y‖₂² + (ρ/2)‖Ψx − z^k + u^k‖₂² }.   (22)

Step 2. Update z^{k+1} with x^{k+1} and u^k fixed:

z^{k+1} = arg min_z { λφ_b(z) + (ρ/2)‖Ψx^{k+1} − z + u^k‖₂² } = prox_{(λ/ρ)φ_b}( Ψx^{k+1} + u^k ),   (23)

where the last equality follows from the definition of prox_{λφ_b}(·) in (17).

Step 3. Update the scaled dual variable: u^{k+1} = u^k + Ψx^{k+1} − z^{k+1}.
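As a concrete illustration of the x-update in Step 1: when Ψ is orthonormal and A = M∘F (a binary k-space mask applied after a unitary 2-D Fourier transform), the subproblem has a closed-form solution that is diagonal in k-space. The following NumPy sketch assumes this setting (function and variable names are ours, not from the paper):

```python
import numpy as np

def x_update(y_k, mask, w, rho):
    """Closed-form x-update for A = M*F (binary mask times unitary 2-D FFT)
    and orthonormal Psi: solves
        min_x 0.5*||M F x - y||^2 + (rho/2)*||x - w||^2,
    where w = Psi^T (z - u) has been pulled back to the image domain.
    Since F is unitary and M is binary (M^H M = M), the normal equations
    are diagonal in k-space:
        (M + rho) X = M*y_k + rho*F w,   x = F^{-1} X.
    """
    num = mask * y_k + rho * np.fft.fft2(w, norm="ortho")
    return np.fft.ifft2(num / (mask + rho), norm="ortho")
```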

B. PNP-ADMM ITERATION FOR CNC SPARSE REGULARIZATION
As mentioned in Section IV-A, the proximal operators prox_{(1/b²)‖·‖₁} and prox_{αρλ‖·‖₁} in (23) are soft thresholds with different parameters 1/b² and αρλ. Furthermore, they can be viewed as denoisers with different denoising strengths.
The basic idea behind the original PnP framework is to plug powerful denoisers in place of the proximal operators. We therefore replace the proximal operators prox_{(1/b²)‖·‖₁} and prox_{αρλ‖·‖₁} with two appropriate denoisers H_{σ1} and H_{σ2}, where σ1 and σ2 are denoising parameters; the larger σ1 and σ2 are, the stronger the denoising. Then (23) can be rewritten with H_{σ1} and H_{σ2} in place of the two soft thresholds. Finally, we obtain the plug-and-play ADMM algorithm for CNC sparse regularization (PnP-ADMM-CNC) as Algorithm 2. Figure 3 provides an intuitive illustration of the ADMM-CNC and PnP-ADMM-CNC frameworks.
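The overall loop can be sketched as follows (a simplified illustration with Ψ = I; all names are ours, and the denoisers H1 and H2 stand in for H_{σ1} and H_{σ2}). The sanity check plugs soft thresholds back in as the two "denoisers", which recovers ADMM-CNC, and verifies the result against the firm threshold on a denoising data term:

```python
import numpy as np

def soft(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def pnp_admm_cnc(x_update, H1, H2, z0, lam=1.0, b2=0.5, alpha=1.0,
                 rho=1.0, n_outer=100, n_inner=100):
    """Sketch of PnP-ADMM-CNC with Psi = I for brevity.

    x_update(w, rho): closed-form solve of the data-fidelity subproblem.
    H1: denoiser standing in for prox_{(1/b^2)||.||_1} (inside grad S_b).
    H2: denoiser standing in for the outer soft threshold of the inner PGD.
    """
    z = z0.copy()
    u = np.zeros_like(z0)
    lam_r = lam / rho  # the z-update evaluates prox_{(lam/rho) phi_b}
    for _ in range(n_outer):
        x = x_update(z - u, rho)
        # inner PGD for the CNC proximal operator, cf. (17)
        w = x + u
        v = w.copy()
        for _ in range(n_inner):
            grad_h = (v - w) - lam_r * b2 * (v - H1(v))
            v = H2(v - alpha * grad_h)
        z = v
        u = u + x - z
    return x

# sanity check: with soft thresholds as the two "denoisers", the scheme
# reduces to ADMM-CNC; for the denoising term f(x) = 0.5*||x - y||^2 the
# output should match the firm threshold firm(y; lam, 1/b^2).
y = np.linspace(-4.0, 4.0, 81)
x_update = lambda w, rho: (y + rho * w) / (1.0 + rho)
H1 = lambda v: soft(v, 2.0)   # threshold 1/b^2 = 2
H2 = lambda v: soft(v, 1.0)   # threshold alpha*lam/rho = 1 in this sketch
x_hat = pnp_admm_cnc(x_update, H1, H2, np.zeros_like(y))
```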

V. NUMERICAL EXPERIMENT
A. IMPLEMENTATION DETAILS
In this section, we empirically validate the proposed PnP-ADMM-CNC algorithm in the context of CS-MRI reconstruction. We consider 15 widely used MR images and three sampling templates, as shown in Figure 4 and Figure 5. The grey value range of all MR images is [0, 1], the image size is 256 × 256, the undersampling rate is 30%, and the measurement noise is η ∼ N(0, σI) with noise level σ. We compared the proposed PnP-ADMM-CNC algorithm with different kinds of MRI reconstruction methods: (1) Zero-Filling, which simply applies the inverse Fourier transform to the undersampled k-space data with the unsampled locations filled with zeros.
It has been shown that the choice of denoiser affects the performance of PnP algorithms [44], [45]. For PnP-ADMM-ℓ1 and PnP-ADMM-CNC, we consider several state-of-the-art denoisers under the PnP framework, including the model-based denoiser BM3D [48] and five neural network-based denoisers (DnCNN [49], FDnCNN [51], IRCNN [50], FFDNet [51], and DRUNet [52]). Table 1 briefly lists the characteristics of these denoisers. As shown in [45], DRUNet achieves superior performance against other deep neural network denoisers. In particular, ADMM-CNC and ADMM-ℓ1 can also be considered special PnP algorithms if we view the soft threshold function as a special denoiser. As mentioned in [41] and [56], the PnP method has high generalizability due to its plug-and-play nature; it does not even require problem-specific training. Neural network denoisers learned from natural images can also obtain satisfactory results in MRI reconstruction and seismic interpolation. In our experiments, we likewise do not train the denoisers on an MRI image dataset. The five neural network-based denoisers used in the experiments are downloaded from https://github.com/cszn/KAIR; they are trained on natural image datasets, e.g., the BSD [57], DIV2K [58], and Flick2K [59] datasets. The source code is available at https://github.com/zj15001/PNP_ADMM_CNC_MRI.
Furthermore, in this paper we replace the two proximal operators with the same kind of denoiser. As mentioned in [45], DnCNN needs to learn a separate model for each noise level, while the other denoisers can handle various noise levels with a single model. Therefore, we set the denoising parameters σ1 = 5σ/3 = 25/255 and σ2 = σ = 15/255 for DnCNN, and σ1 = σ2 = σ = 15/255 for the other four neural network-based denoisers. Other algorithmic hyperparameters, including the step size α, the regularization parameter λ, and the nonconvexity control parameter b, are tuned for the best reconstruction performance with respect to the ground truth MR image.

B. EXPERIMENTAL EVALUATION CRITERIA
To evaluate the reconstruction quality, the relative error (RE), peak signal-to-noise ratio (PSNR), and structural similarity (SSIM) are used.
RE is defined as

RE = ‖x − x₀‖₂ / ‖x₀‖₂,

where x and x₀ are the reconstructed and the ground truth images, respectively. PSNR is defined as

PSNR = 10 log₁₀( MAX²_{x₀} / MSE ),

where MSE = (1/mn)‖x − x₀‖₂² and MAX_{x₀} is the maximum value of the ground truth MR image data.
RE and PSNR sometimes mismatch perceived visual quality. In contrast, SSIM measures the structural similarity of two images, which is defined as

SSIM(x, y) = (2μ_x μ_y + c₁)(2σ_xy + c₂) / ( (μ_x² + μ_y² + c₁)(σ_x² + σ_y² + c₂) ),

where μ_x and μ_y are the means of x and y, σ_x² and σ_y² are the variances of x and y, σ_xy is the covariance of x and y, and c₁ and c₂ are constants that maintain stability. Lower RE and higher PSNR and SSIM mean better quantitative performance.
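RE and PSNR translate directly into NumPy (helper names are ours; SSIM is usually computed with a library routine such as scikit-image's `structural_similarity`, so it is omitted here):

```python
import numpy as np

def relative_error(x, x0):
    """RE = ||x - x0||_2 / ||x0||_2."""
    return np.linalg.norm(x - x0) / np.linalg.norm(x0)

def psnr(x, x0, max_val=1.0):
    """PSNR = 10*log10(MAX^2 / MSE) for images with grey range [0, max_val]."""
    mse = np.mean((x - x0) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)
```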

C. IMPLEMENTATION RESULTS
We first compare different algorithms on 15 MR images with three sampling templates, as shown in Figures 4 and 5. The average values of RE, PSNR, and SSIM are summarized in Table 2. As one can see, the algorithms with CNC regularization outperform the algorithms with ℓ1 regularization, since CNC regularization can reduce the reconstruction bias. On the other hand, PnP-ADMM significantly outperforms traditional ADMM, since more powerful denoisers are embedded into the PnP framework. In addition, DRUNet achieves the best performance in most of the evaluation criteria among all the advanced denoisers. Specifically, compared with Zero-Filling, the average PSNR and SSIM of PnP-ADMM-CNC with the DRUNet denoiser are increased by 9.37 dB and 0.32, and the average RE is reduced by 0.12. U-Net and PD-Net perform well under Cartesian sampling, but less well under the other sampling templates since they are trained with Cartesian sampling schemes.
To get a more intuitive understanding of the reconstruction performance, we also present two visual comparisons of the different algorithms: the Cervical spine image (Figure 4(e)) with the random sampling template (Figure 5(a)), and the Brain image (Figure 4(b)) with the radial sampling template (Figure 5(b)). The reconstruction results are shown in Figures 6 and 7. To show the differences more clearly, we enlarge two local regions, display the PSNR values on the reconstructed images, and set the grey value range of the difference images to [0, 0.2]. From Figures 6 and 7, it can be seen that PnP-ADMM-CNC with the DRUNet denoiser reconstructs more fine details and contrasts, which leads to a lower error level around edges and high-frequency areas.

VI. CONCLUSION
In this work, we propose a novel PnP algorithm for CNC sparse regularization. CNC sparse regularization can reduce the reconstruction bias while ensuring the global convexity of the objective function. The PnP algorithm is clearly superior to the traditional optimization-based algorithms, which mainly benefits from the powerful pre-trained denoisers. Experimental results have demonstrated that the proposed PnP-ADMM-CNC algorithm with a powerful deep denoiser can improve the reconstruction results.
In this work, we only consider a single-coil CS-MRI model with simulated MR data. As parallel MRI can lead to higher image quality, it is important to apply the proposed algorithm to the multi-coil CS-MRI model with acquired MR data; we will employ the proposed algorithm in practical MR applications in our future work. In addition, we would like to extend the proposed algorithm to other tasks, such as seismic interpolation [56] and electrical impedance tomography [60].

TABLE 1. Brief description of the denoisers used under the PnP framework.

Denoiser | Brief description
-------- | -----------------
BM3D     | Sparse 3-D transform-domain collaborative filtering; filters by finding similar blocks in the image.
DnCNN    | 17-layer convolutional neural network that learns the residual mapping; uses Batch Normalization (BN) modules.
FDnCNN   | Compared with DnCNN, it lacks the residual learning and BN modules.
IRCNN    | 7-layer denoising network with dilated convolutions, introduced within the HQS framework.
FFDNet   | 15-layer convolutional neural network that takes an adjustable noise level map as input.
DRUNet   | Combination of U-Net and ResNet; the network structure is symmetric.