A Probabilistic Model-Based Method With Nonlocal Filtering for Robust Magnetic Resonance Imaging Reconstruction

Existing model-based and data-driven methods have achieved high-quality reconstruction in compressive sensing magnetic resonance imaging (CS-MRI). However, most methods are designed for a specific type of sampling mask or sampling rate while ignoring the existence of external noise, resulting in poor robustness. In this work, we propose a probabilistic model-based method built on Laplacian scale mixture (LSM) modeling and the denoising-based approximate message passing (D-AMP) algorithm to address this issue. The sparse coefficients of packed similar patches are modeled with an LSM distribution to exploit the nonlocal self-similarity prior of the MR image, and a maximum a posteriori estimation problem for sparse coding is formulated. It is shown that both the hidden scale parameters, i.e., the variances of the sparse coefficients, and the location parameters can be jointly estimated along with the unknown sparse coefficients via alternating optimization. Moreover, the noise variance is also iteratively updated based on maximum likelihood estimation. With the plug-and-play prior method, the above structured sparse coding procedure can be regarded as a nonlocal filtering operation and incorporated into D-AMP for MR image reconstruction. Owing to the power of our nonlocal filtering, which takes both signal and noise estimation into account, the proposed method not only outperforms many state-of-the-art methods in most observation settings, but also delivers the best qualitative reconstruction results, with finer details and fewer artifacts, in our experiments.


I. INTRODUCTION
Magnetic resonance imaging (MRI) [1] is an essential medical diagnostic tool used to observe tissue changes in patients in the absence of ionizing radiation. It provides high-resolution images that contain important anatomical information. To obtain an excellent depiction of soft tissues, the imaging speed of MRI, including both scanning speed and reconstruction speed, is usually slow. However, long scanning times and slow reconstruction may result in blurred images due to local motion such as heart beating and breathing. Any technology enabling faster scanning and more accurate reconstruction could therefore be valuable in MRI. Compressive sensing (CS) [2], [3] performs signal acquisition and processing using far fewer samples than required by the Nyquist rate, and has drawn a great amount of attention as a novel digital signal sampling theory for signals that are sparse in some domain. Most MR images satisfy this sparsity assumption, so the combination of MRI with CS is quite natural. In fact, compressive sensing magnetic resonance imaging (CS-MRI) [4], [5] has become one of the highest-impact applications of CS theory. In CS-MRI, one can achieve accurate reconstruction of an MR image from only a small number of measurements, leading to a significant reduction of the acquisition time of MRI scanning.
The associate editor coordinating the review of this manuscript and approving it for publication was Qingchun Chen.
CS sampling techniques produce the measurement vector of the signal by a linear system where the original signal is multiplied by a known measurement matrix, which is usually constructed from a Gaussian or Bernoulli random operator.
In CS-MRI, an under-sampled Fourier matrix is employed as the measurement matrix, implying that one can obtain only a small number of Fourier coefficients of the MR image; the linear system is thus underdetermined. In order to reconstruct an MR image that is sparse under some proper basis from those Fourier coefficients, CS-MRI usually searches for the sparsest solution of the underdetermined system using reconstruction methods, including greedy algorithms [6] such as matching pursuit (MP) and orthogonal matching pursuit (OMP), iterative shrinkage-thresholding algorithms [7] and their variants such as the fast iterative shrinkage-thresholding algorithm (FISTA) and NESTA (a shorthand for Nesterov's algorithm), and Bayesian compressive sensing or sparse Bayesian learning (SBL) algorithms [8].
The MR image reconstruction problem in CS-MRI is often modeled as a linear combination of least-squares fitting and sparse regularization. Therefore, sparse modeling, which refers to finding a sparse representation of the MR image, becomes critical to the success of CS-MRI. Numerous regularization terms have been developed to express sparsity, ranging from the well-known total variation (TV) regularization [9] and the sparsity-based regularizations with transforms (e.g., DCT and wavelets) [10], to the patch-based sparse representation models [11]. CS-MRI reconstruction methods based on the TV model exploit the sparsity of the MR image in the gradient domain, but the TV model favors piecewise constant structures and is unable to express more complex image edges and textures. Sparsity-based techniques with transforms rest on the assumption that the transform coefficients of real MR images are sparse; however, the global sparsity utilized by these methods cannot recover local image details well. By contrast, patch-based sparse representation methods are more effective in representing local image structures with a few elemental structures from a redundant dictionary. Unfortunately, they still ignore the relationships among sparse coefficients.
Modeling statistical dependencies among sparse coefficients can achieve more accurate reconstruction due to the reduction of uncertainty of the underdetermined system. These works explore the use of more complicated, hand-designed signal models, such as tree-structured wavelet sparsity [12], [13], group sparsity [14], the Gaussian scale mixture (GSM) model [15] and nonlocal sparsity [16], [17]. Wavelet coefficients of real MR images tend to be quadtree structured, which is described by a regularization term over the wavelet tree in [12] and a hidden Markov tree (HMT) in [13]. Components within the same group tend to be zero or nonzero together, which corresponds to the group sparsity in [14]. It has been proved that fewer measurements are required for structured sparsity recovery, or that a more precise solution can be obtained with the same number of measurements. On the other hand, data-driven methods such as dictionary learning [18] and deep learning [19] also lead to improvements in performance. Instead of extracting information from the input data itself, data-driven methods explore the power of external data to take advantage of the immense amount of information provided by external datasets. The K-singular value decomposition (K-SVD) algorithm, which learns an adaptive dictionary for image patches, has been used for MR image reconstruction [20]. More recently, deep neural networks that learn mapping functions from the measurements or the zero-filled image to the high-resolution MR image have achieved exciting success for accelerated MRI. Deep learning has been successfully applied to CS-MRI by training a CNN on down-sampled (i.e., zero-filled) reconstruction images to learn the fully sampled reconstruction in [19].
Either pure convolutional layers (DeepInverse [21]) or a combination of convolutional and fully connected layers (DR²-Net [22] and ReconNet [23]) is used to build deep learning frameworks capable of solving the CS image reconstruction problem. Meanwhile, deep neural network architectures that combine model-based and data-driven methods have been developed in [24], [25]. ADMM-Net [24] is designed for CS-MRI by unrolling the procedure of the alternating direction method of multipliers (ADMM) into an iterative network. The parameters of the regularization term in the model are learned by this network, which relieves the parameter selection procedure. Similarly, Learned D-AMP [25] explicitly unrolls the D-AMP algorithm and turns it into a neural network. Unfortunately, these methods are held back by the fact that there exists almost no theory governing their performance, and they must be pre-trained at the cost of much time and vast amounts of data. In practice, large MRI datasets comprising hundreds of subjects scanned under a common protocol are rare.
By contrast, nonlocal sparsity-based methods, which have explicit models and do not require pre-training, have proven very beneficial to CS-MRI image reconstruction. Nonlocal sparsity refers to the fact that a patch often has many nonlocal patches similar to it across the image, and it is often captured by regularization terms, filtering-based models and composite sparse models. In order to obtain an adaptive sparsity regularization for the CS image reconstruction process, a nonlocal total variation regularization term using the self-similarity property in the gradient domain, a local piecewise autoregressive model exploiting the stationarity of local patches, and a nonlocal low-rank regularization characterizing the low-rank property of the two-dimensional data matrix grouped from similar patches of the MR image have been respectively designed in [26]-[28]. Probabilistic graphical models related to the D-AMP algorithm have also been proposed in [29], where collaborative filtering [30] is adopted to promote sparsity of packed patches. Considering the fact that irregular image structures are more difficult to recover, composite sparse models have been developed to soften and complement nonlocal sparsity. A new framework for compressive sensing MR image reconstruction via group sparse representation modeling has been proposed in [31], which enforces local sparsity and self-similarity simultaneously under a unified framework in an adaptive group domain. In [32], two sets of complementary ideas for regularizing the image reconstruction process have been proposed: sparsity regularization parameters are locally estimated for each coefficient and updated along with adaptive learning of PCA-based dictionaries. Moreover, CS-MRI methods exploiting both low-rank structure and local or TV sparsity have been developed and shown to provide enhanced performance [33], [34]. Despite the steady progress, nonlocal sparsity-based methods still have weaknesses.
First, the regularization parameters are often set empirically, which may not be optimal. Second, noise corruption is unavoidable in practice, resulting in the introduction of artificial textures. Last, weak generalization ability leads to degraded visual quality for various types of sampling patterns or sampling rates.
In this work, we propose a robust magnetic resonance imaging reconstruction method with a novel probabilistic model. More specifically, we use Laplacian scale mixture (LSM) [35] models to represent the sparse coefficients of packed similar patches of the MR image. Using the Bayesian formula, we derive the maximum a posteriori (MAP) estimation with regard to the sparse coefficients and hidden variables. It is shown that the regularization parameters, which are connected with the variances of the sparse coefficients and the noise, can be estimated via alternating optimization. For each subproblem, a closed-form solution is derived, admitting computationally efficient implementations. Using the plug-and-play prior method, our simultaneous sparse coding procedure is then incorporated into the D-AMP framework for MR image reconstruction. In particular, our method employs two nonlocal filters: a nonlocal mean filter used to estimate the location parameters of the LSM model, and a nonlocal sparse coding procedure for scale parameter estimation. Owing to the joint optimization of signal components and noise, our method is capable of better spatial adaptation than other competing approaches, and obtains more accurate reconstructions in most experimental situations.
We now describe the relationship between our work and other competing approaches. First, the proposed work is related to LSM-based image restoration methods. The Laplacian scale mixture distribution has been widely utilized in modeling sparse signal coefficients, such as the coefficients of simulated sparse data or radar signals [36], the sparse coefficients of local patches of natural images [35], the sparse impulse noise in [37], and the tensor coefficients of multi-frame images in [38]. In this work, we extend the LSM model to represent the nonlocal sparsity of MR images. We note that two papers, [39] (an image super-resolution method) and [40] (a denoising method), also constructed LSM models to represent the nonlocal sparsity of images, although not for CS-MRI. However, they are learning-based methods, meaning that they exploit information from external images. Moreover, neither of them explores noise estimation, which is the key to improving robustness to noise in CS-MRI, nor is incorporated into the D-AMP method. Second, the proposed work differs from other nonlocal sparsity-based CS-MRI methods. Low-rank regularization-based methods [26]-[28] focus on building a low-rank regularization to characterize the low-rank property of the two-dimensional data matrix grouped from similar patches of the MR image, but their regularization parameters are set empirically. Our work uses structured sparse coding that takes both signal and noise estimation into account, improving performance especially in the presence of measurement noise. The BM3D-AMP method [29], which promotes sparsity of packed patches with collaborative filtering, is most closely related to our work. Instead of using BM3D, we incorporate our sparse coding procedure into the D-AMP framework as a denoiser via the plug-and-play prior method, leading to a more effective approach.

II. BACKGROUND
A. COMPRESSIVE SENSING MAGNETIC RESONANCE IMAGING
Compressive sensing (CS) is a signal acquisition technique that enables accurate reconstruction of signals from far fewer linear-projection measurements than the number of unknowns, provided the signals are sparse. The linear system can be expressed as y = Ax + w, where x ∈ R^n is the original signal, y ∈ R^m is the measurement vector, A ∈ R^{m×n} is the measurement matrix, and w denotes the additive noise. Generally, MR images are sparse in some domain, such as the wavelet or gradient domain, and thus can be well reconstructed at sub-Nyquist sampling ratios based on CS theory. The image reconstruction problem in CS can be formulated as the linear combination of a least-squares fitting term, representing the closeness of the solution to the measurements, and a sparsity regularization term, representing a priori sparse information about the original signal:

x̂ = arg min_x (1/2)||y - Ax||_2^2 + λΨ(x),    (1)

where λ is a regularization parameter that balances the contribution of the data fidelity term and the regularization term Ψ(x) in (1), and A is the sensing or measurement matrix of the application. In a single-pixel camera, A is a sequence of ones and zeros representing the modulation of a micromirror array, while in CS-MRI, A consists of samples of an n×n Fourier matrix, i.e., a partial Fourier transform. MRI was first modeled as a CS problem in SparseMRI [4]:

x̂ = arg min_x (1/2)||y - Ax||_2^2 + ρ TV(x) + λ||Φx||_1.    (2)

In (2), the sparsity regularization is the linear combination of total variation and wavelet sparsity regularization, Φ denotes the wavelet transform, and ρ and λ are two regularization parameters. In CS-MRI, the measurement vector y is the undersampled k-space data. SparseMRI transforms the nonsmooth term, i.e., the l1 norm of the wavelet sparsity, into a smooth one by introducing an approximated problem, which is then solved by the classical conjugate gradient (CG) method.
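As a concrete illustration of the regularized problem in Eq. (1), the sketch below solves the l1-regularized least-squares problem with a plain iterative shrinkage-thresholding scheme (ISTA) on a synthetic sparse signal. The sensing matrix, signal and parameter values are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def soft(x, t):
    """Soft thresholding, the proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(y, A, lam, n_iter=1000):
    """ISTA for min_x 0.5*||y - Ax||_2^2 + lam*||x||_1."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft(x + step * A.T @ (y - A @ x), lam * step)
    return x

# Toy demo: recover a 3-sparse signal from 32 random Gaussian measurements.
rng = np.random.default_rng(0)
n, m = 64, 32
x_true = np.zeros(n)
x_true[[5, 20, 40]] = [1.0, -2.0, 1.5]
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x_true                                # noiseless measurements
x_hat = ista(y, A, lam=0.01)
```

In CS-MRI the Gaussian matrix would be replaced by a partial Fourier operator, but the shrinkage structure of the solver is the same.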
Various forms of Ψ(x) beyond the total variation and wavelet sparsity regularizations have been elaborately designed for the underdetermined reconstruction problem of CS-MRI, such as sparsity on learned dictionaries, filtering-based regularization and structured sparse regularization. Meanwhile, these models can be efficiently solved by numerous algorithms, including greedy algorithms, which greedily trace the sparsest solution; iterative thresholding algorithms, which iteratively denoise the intermediate noisy signal; and sparse Bayesian learning algorithms based on Bayesian inference.

B. CS-MRI VIA NONLOCAL LOW-RANK REGULARIZATION
To exploit the nonlocal sparsity of MR images, the CS via nonlocal low-rank regularization (NLR-CS) method [28] regularizes CS-MRI recovery by patch grouping and low-rank approximation. As shown in Figure 1, MR images have abundant self-repeating patterns which can be used to form low-rank matrices. For each exemplar image patch, a set of similar image patches is grouped to form a data matrix X using block matching based on Euclidean distance. Since the patches contain similar structures, the rank of this data matrix X is low. This property can be utilized to construct a regularization term representing an image prior. NLR-CS uses the following objective function to reflect the group sparsity of similar patches

(x̂, {L̂_i}) = arg min_{x, L_i} (1/2)||y - Ax||_2^2 + Σ_{i=1}^{G} ( ρ||X_i - L_i||_F^2 + λ||L_i||_* ),    (3)

where X_i = [x_1, x_2, . . . , x_k] denotes the matrix formed by the set of similar patches for the exemplar patch x_i, ρ and λ are two regularization parameters, G is the total number of similar-patch groups, and L_i is the low-rank data matrix to be estimated, which approximates the matrix X_i. ||L_i||_* denotes the nuclear norm of L_i, i.e., the sum of its singular values: letting l_i = [l_1, l_2, . . . , l_k] be the singular value vector of L_i, we have ||L_i||_* = Σ_k |l_k|.
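The nuclear-norm term in the NLR-CS objective is typically handled by singular value thresholding, the proximal operator of the nuclear norm. A minimal sketch, with an illustrative rank-2 "patch matrix" standing in for a real similar-patch group:

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: the proximal operator of tau*||.||_*,
    which yields a low-rank surrogate L_i of a patch matrix X_i."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(1)
# A rank-2 "patch matrix" plus small noise, mimicking a group of similar patches.
B = rng.standard_normal((36, 2)) @ rng.standard_normal((2, 36))
X = B + 0.01 * rng.standard_normal((36, 36))
L = svt(X, tau=1.0)  # small (noise) singular values are zeroed out
```

Shrinking the singular values suppresses the noise subspace while keeping the dominant structure shared by the similar patches.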

C. DENOISING BASED APPROXIMATE MESSAGE PASSING
The alternating direction method of multipliers (ADMM) [41] is adopted by NLR-CS to solve Eq. (3), which can in fact also be solved by other iterative optimization algorithms, such as iterative shrinkage-thresholding methods [7] or Bregman iterative algorithms [42]. In this paper, we focus on one iterative algorithm, D-AMP, due to its promising performance and efficiency. The D-AMP method is derived from the AMP algorithm, which was introduced by Donoho et al. [43] and is based on the theory of belief propagation in graphical models. After quadratic approximation, the final alternating expressions of the AMP algorithm for solving Eq. (1) with Ψ(x) = ||x||_1 are

x^(t+1) = η( x^(t) + A* z^(t) ),    (4)
z^(t) = y - A x^(t) + (1/m) z^(t-1) ⟨ η′( x^(t-1) + A* z^(t-1) ) ⟩,    (5)

where x^(t) and z^(t) are the estimate of x and the residual at iteration t, and A* is the conjugate transpose of A. The iteration starts from x^(0) = 0 and z^(0) = y. The functions η(·) and η′(·) are the wavelet threshold function and its first derivative respectively, and ⟨·⟩ denotes the sum over components. The last term in Eq. (5) is called the ''Onsager reaction term'' in statistical physics. This Onsager reaction term helps improve the phase transition of the reconstruction process over existing iterative shrinkage-thresholding algorithms [7].
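The AMP iteration above can be sketched for the soft-thresholding case. The threshold policy (set to the estimated noise level ||z||_2/√m) and the toy problem sizes are illustrative assumptions:

```python
import numpy as np

def soft(x, t):
    """Soft thresholding, used as eta(.) here."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def amp(y, A, n_iter=50):
    """AMP for the l1 prior: x-update by thresholding, z-update with the
    Onsager reaction term (number of nonzeros / m) * z_prev."""
    m, n = A.shape
    x, z = np.zeros(n), y.copy()
    for _ in range(n_iter):
        sigma = np.linalg.norm(z) / np.sqrt(m)   # effective noise level
        x_new = soft(x + A.T @ z, sigma)         # Eq. (4)-style update
        z = y - A @ x_new + z * np.count_nonzero(x_new) / m  # Eq. (5)-style
        x = x_new
    return x

rng = np.random.default_rng(2)
n, m = 200, 100
x_true = np.zeros(n)
x_true[rng.choice(n, 10, replace=False)] = rng.standard_normal(10)
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x_true                                   # noiseless measurements
x_hat = amp(y, A)
```

For soft thresholding, the sum of the derivatives η′ is simply the number of surviving nonzeros, which is what the Onsager term uses.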
Instead of adopting wavelet thresholding, D-AMP explores the use of a wide variety of denoisers to generalize the AMP algorithm as follows:

x^(t+1) = D_{σ̂^(t)}( x^(t) + A* z^(t) ),    (6)
z^(t) = y - A x^(t) + (1/m) z^(t-1) ⟨ D′_{σ̂^(t-1)}( x^(t-1) + A* z^(t-1) ) ⟩,    (7)

where D_σ(·) denotes the denoising operator and D′_σ(·) is its derivative. For most denoisers, the denoising operator has no explicit input-output relation, so it is not easy to compute D′_σ(·). In [29], a Monte Carlo (MC) method is utilized to approximate D′_σ(·) with random numbers. Instead of assuming any knowledge of the signal information, the D-AMP algorithm relies on the denoising algorithm to achieve its goal.
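The Monte Carlo approximation of the denoiser derivative can be illustrated with a one-probe divergence estimate, b^T(D(x + εb) − D(x))/ε with b drawn from N(0, I). The linear test denoiser below is only a sanity check with a known divergence, not part of the paper's method:

```python
import numpy as np

def mc_divergence(denoiser, x, eps=1e-4, seed=0):
    """One-probe Monte Carlo estimate of div D(x) = sum_i dD_i(x)/dx_i,
    the quantity D-AMP needs for the Onsager term when D has no
    explicit derivative."""
    rng = np.random.default_rng(seed)
    b = rng.standard_normal(x.shape)
    return b @ (denoiser(x + eps * b) - denoiser(x)) / eps

# Sanity check against a denoiser with a known divergence: scaling by 0.5
# in R^1000 has divergence exactly 500.
x = np.ones(1000)
est = mc_divergence(lambda v: 0.5 * v, x)
```

A single random probe already gives an unbiased estimate; its variance shrinks as the signal dimension grows, which is why one probe suffices for images.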

III. CS-MRI VIA NONLOCAL FILTERING AND D-AMP
The abundance of self-repeating patterns in MR images, which implies a useful prior, has been widely exploited in patch-based CS-MRI reconstruction methods. Despite this great success, most existing works have poor robustness when external noise is involved or various types of sampling situations must be considered. Moreover, a challenging problem encountered by these methods is how to set the parameters in a spatially adaptive way. In this paper, we propose modeling the sparse coefficients of similar patches of the MR image with an LSM distribution to construct a nonlocal filter, and then incorporate it into the D-AMP method as a black-box denoiser for MR image reconstruction.
Owing to the joint estimation of both signal and noise, the proposed method achieves promising robustness and a favorable solution to parameter tuning.

A. LSM PRIOR MODELING FOR EXPLOITING NONLOCAL SPARSITY
The Laplacian scale mixture distribution has been widely utilized in modeling sparse signal coefficients, such as the sparse coefficients of local patches of natural images [35], the sparse impulse noise in [37], and the tensor coefficients of multi-frame images in [38]. In this work, we extend the LSM model to represent the nonlocal sparsity of MR images. A random variable α_i = θ_i β_i is a Laplacian scale mixture if β_i has a Laplacian distribution with scale 1, i.e., p(β_i) = exp(-|β_i|)/2, and the multiplier θ_i is a positive random variable with probability p(θ_i). Supposing that β_i and θ_i are independent, conditioned on the parameter θ_i the coefficient α_i has a Laplacian distribution with scale θ_i:

p(α_i) = ∫ p(α_i | θ_i) p(θ_i) dθ_i = ∫ (1/(2θ_i)) exp(-|α_i|/θ_i) p(θ_i) dθ_i.    (8)

The distribution over α_i is therefore a continuous mixture of Laplacian distributions with different scales. The distribution in Eq. (8) is defined as a Laplacian scale mixture.
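A quick numerical illustration of the LSM construction: drawing θ_i from an exponential distribution (one possible choice of p(θ_i), assumed here purely for illustration) and multiplying by a unit-scale Laplacian sample yields a marginal that is markedly heavier-tailed than the Laplacian itself, which is what makes the mixture attractive for sparse coefficients.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000
beta = rng.laplace(loc=0.0, scale=1.0, size=n)    # Laplacian with scale 1
theta = rng.gamma(shape=1.0, scale=1.0, size=n)   # illustrative positive prior p(theta)
alpha = theta * beta                              # LSM sample: alpha = theta * beta

def excess_kurtosis(v):
    """Sample excess kurtosis; 0 for a Gaussian, 3 for a pure Laplacian."""
    v = v - v.mean()
    return (v ** 4).mean() / (v ** 2).mean() ** 2 - 3.0
```

The mixture's excess kurtosis clearly exceeds the Laplacian's value of 3, confirming the heavier tails.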
For most choices of p(θ_i), we do not have an analytical expression for p(α_i), so it is difficult to compute the MAP estimates of the sparse coefficients directly. However, we can jointly estimate (α_i, θ_i) via alternating optimization to obtain approximate solutions. We consider the sparse coding problem h = Dα + v, where h is the noisy observation, D is a sparse dictionary, α denotes the sparse coefficient vector, and v ∼ N(0, σ²) denotes the additive Gaussian noise. Using the Bayesian formula, we can derive the following MAP estimation problem

(α̂, θ̂) = arg min_{α,θ} - log p(h|α) - log p(α, θ).    (9)

According to Eq. (9), we need to define the likelihood term p(h|α) and the prior distribution p(α, θ). First, the additive noise v is assumed to be white Gaussian with variance σ². Thus, we have the following likelihood

p(h|α) = (2πσ²)^{-n/2} exp( -||h - Dα||_2^2 / (2σ²) ).    (10)

Second, the prior term can be expressed as

p(α, θ) = p(α|θ) p(θ) = Π_i (1/(2θ_i)) exp( -|α_i - µ_i| / θ_i ) p(θ_i).    (11)

Note that the mean µ_i of the sparse coefficient related to an image patch is not assumed to be zero in Eq. (11). At last, the noninformative prior is adopted for the scale parameter θ:

p(θ_i) ∝ 1/θ_i.    (12)

The hierarchical model of sparse coding is summarized in Figure 2. According to Eqs. (10), (11) and (12), we can translate the MAP estimation problem of Eq. (9) into the log domain:

(α̂, θ̂, µ̂) = arg min_{α,θ,µ} (1/(2σ²)) ||h - Dα||_2^2 + Σ_i |α_i - µ_i| / θ_i + 2 Σ_i log θ_i.    (13)

In order to further simplify the above optimization problem, a diagonal matrix Λ = diag(θ_i) is introduced to characterize the variances for an image patch. Then we have α = Λβ and µ = Λτ. Finally, Eq. (13) can be translated from the (α, θ, µ) domain to the (β, θ, τ) domain:

(β̂, θ̂, τ̂) = arg min_{β,θ,τ} (1/(2σ²)) ||h - DΛβ||_2^2 + Σ_i |β_i - τ_i| + 2 Σ_i log θ_i.    (14)

To exploit the nonlocal sparsity of a collection of similar patches, we extend the sparse coding problem of Eq. (14) to a structured sparse coding problem. For a collection of similar patches, the corresponding sparse coefficients should be characterized by the same prior, i.e., probability density functions with the same θ and µ; thus, the prior becomes

p(α_1, . . . , α_k, θ) = Π_{j=1}^{k} Π_i (1/(2θ_i)) exp( -|α_{j,i} - µ_i| / θ_i ) p(θ_i),    (15)

where α_j denotes the sparse coefficient vector of the j-th similar patch and µ the shared mean vector.
Using the above inference process, we have the following optimization problem in matrix form:

(B̂, θ̂, T̂) = arg min_{B,θ,T} (1/(2σ²)) ||H - DΛB||_F^2 + ||B - T||_1 + 2 Σ_i log(θ_i + ε),    (16)

where B = [β_1, . . . , β_k] and T = [τ_1, . . . , τ_k] collect the coefficient and location vectors of the k similar patches, and H is the collection of similar patches of the noisy observation. ε is a small positive number introduced for numerical stability, since log θ is unstable as θ → 0. The noise variance σ² in Eq. (16) is assumed to be known here; however, it can also be estimated when the procedure is combined with the D-AMP method. Its estimation will be elaborated later.

B. SOLVING STRUCTURED SPARSE CODING VIA ALTERNATING MINIMIZATION
Using alternating optimization, we can simultaneously learn the parameters, including the scale parameter θ and the location parameter µ, and perform inference (i.e., estimate the sparse coefficients α). Note that α and µ are associated with β and τ respectively in Eq. (16).
First, we compute T for a fixed (B, θ). Given a collection of similar patches, the nonlocal mean filter is adopted to estimate the location parameter µ of the LSM model:

µ = Σ_j ω_j α_j,    (17)

where the weight is set to ω_j = exp(-||h - h_j||_2^2 / δ), implying that we perform Gaussian filtering on the given similar patches based on patch similarity. Once we have the location parameter µ, the vector τ, directly related to T, is computed by

τ = Λ^{-1} µ.    (18)

Note that we do not have α and Λ at the beginning of the iteration. To obtain the initial estimates of the sparse coefficients α^(0) and the scale parameters θ^(0), we first compute the principal component analysis (PCA) basis D_i for each group of similar image patches H_i. Then the initial estimate of the sparse coefficients is given by α^(0) = D_i^{-1} H_i. Finally, the initial estimate of the scale parameter is computed with the one-sample maximum likelihood (ML) estimation

θ_i^(0) = |α_i^(0)|.    (19)

Second, for a fixed (B, T), one can update θ by solving

θ̂ = arg min_θ (1/(2σ²)) ||H - DΛB||_F^2 + 2 Σ_i log(θ_i + ε).    (20)

Since the dictionary D is orthogonal, Eq. (20) can be rewritten as

θ̂ = arg min_θ ||Q - ΛB||_F^2 + 4σ² Σ_i log(θ_i + ε),    (21)

where H = DQ. By letting a_i = ||φ_i||_2^2, where φ_i is the i-th row of B, b_i = -2φ_i (α_i)^T and c = 4σ² (the superscript T denotes transposition), we obtain the following equation, consistent with Eq. (15) in [38] (where, however, θ_i is the scale parameter of the tensor coefficients of multi-frame images), by rewriting Eq. (21):

θ̂_i = arg min_{θ_i} a_i θ_i^2 + b_i θ_i + c log(θ_i + ε).    (22)

Taking the derivative with respect to θ_i and setting it equal to 0 gives

2 a_i θ_i^2 + (2 a_i ε + b_i) θ_i + (b_i ε + c) = 0.    (23)

The final solution to Eq. (22) is summarized as

θ_i = max( ( -(2 a_i ε + b_i) + sqrt( (2 a_i ε + b_i)^2 - 8 a_i (b_i ε + c) ) ) / (4 a_i), 0 ),    (24)

with θ_i = 0 when the discriminant is negative. At last, we compute B for a fixed (θ, T). The subproblem simply becomes

B̂ = arg min_B (1/(2σ²)) ||Q - ΛB||_F^2 + ||B - T||_1,    (25)

and can be represented in vectorized form as

arg min_{β̃} (1/(2σ²)) ||α̃ - Λ̃β̃||_2^2 + ||β̃ - τ̃||_1,    (26)

where α̃ is the vectorization of the sparse coefficient matrix. Eq. (26) admits the following closed-form solution:

β̂_i = τ_i + soft( α_i / θ_i - τ_i, γ_i ),    (27)

where soft(x, γ) = sign(x) max(|x| - γ, 0) and γ_i = 2σ²/θ_i^2 is a threshold. The final output of the structured sparse coding for a given similar-patch group is D_i Λ_i B_i; the denoised MR image can then be recovered by averaging all reconstructed patches at the same locations.
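The closed-form B-update described above is a translated soft thresholding applied row by row. A minimal sketch, assuming the threshold γ_i = 2σ²/θ_i² quoted in the text and orthonormal-dictionary coefficients Q; all array shapes and values are illustrative:

```python
import numpy as np

def soft(x, t):
    """Elementwise soft thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def b_step(Q, theta, T, sigma):
    """Closed-form B-update in the (beta, theta, tau) domain: shrink toward
    the nonlocal-mean offset T with per-row threshold gamma_i = 2*sigma^2 /
    theta_i^2 (the threshold quoted in the text)."""
    gamma = 2.0 * sigma ** 2 / theta[:, None] ** 2
    return T + soft(Q / theta[:, None] - T, gamma)

# One-row example: with tau = 0 this reduces to plain soft thresholding.
Q = np.array([[3.0, 0.1, -2.0]])
theta = np.array([1.0])
T = np.zeros((1, 3))
B = b_step(Q, theta, T, sigma=0.5)  # gamma = 0.5 here
```

Rows with a large scale θ_i get a small threshold (their coefficients are trusted), while rows with a small θ_i are shrunk aggressively, which is how the learned scales act as spatially adaptive regularization weights.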

C. D-AMP FOR CS-MRI RECONSTRUCTION
In order to apply the above structured sparse coding to CS-MRI, we use the plug-and-play prior method within the D-AMP framework and regard the procedure of structured sparse coding as a denoising process called nonlocal filtering. Note that two nonlocal filters are utilized in our method: a nonlocal mean filter used to estimate the location parameters, and a nonlocal sparse coding procedure for the estimation of the scale parameters.
The plug-and-play prior framework [44], [45] allows simple integration between inverse problems and priors by applying a denoising algorithm that corresponds to the prior used. With the plug-and-play prior method, the prior does not have to be explicitly formulated as a penalty expression. Consequently, we can treat the above structured sparse coding as a ''black box'' which takes the noisy observation h as input and produces the denoised MR image as output. According to Eq. (6) and Eq. (7), we have the alternating expressions of our CS-MRI method

x^(t+1) = F_{σ̂^(t)}( x^(t) + A* z^(t) ),    (28)
z^(t) = y - A x^(t) + (1/m) z^(t-1) ⟨ F′_{σ̂^(t-1)}( x^(t-1) + A* z^(t-1) ) ⟩,    (29)

where F_σ(·) denotes the proposed nonlocal filtering. Finally, we obtain the following global objective functional for CS-MRI recovery

x̂ = arg min_x (1/2) ||y - Ax||_2^2 + g(x),    (30)

where g(x) is an implicit prior term directly associated with the function F_σ(·), which denotes the proximal map defined as follows:

F_σ(v) = arg min_x (1/(2σ²)) ||x - v||_2^2 + g(x).    (31)

The variance of the effective noise at iteration t is estimated as

(σ̂^(t))² = ||z^(t)||_2^2 / m.    (32)

Eq. (32) shows that the estimated noise variance decreases along with the iterations. Note that for noisy measurements y = Ax + w, where w is the external additive noise with variance σ_w², our method is robust to external noise, as the estimation of σ_w² is also subsumed in Eq. (32). The proposed algorithm is summarized in Algorithm 1 and named the D-AMP algorithm with nonlocal filtering (NLF-DAMP).
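Putting the pieces together, a D-AMP loop with a plug-in denoiser, Monte Carlo divergence and the iterative noise estimate ||z||_2/√m can be sketched as follows. Plain soft thresholding stands in for the proposed nonlocal filter, and the problem sizes are illustrative assumptions:

```python
import numpy as np

def soft(v, t):
    """Soft thresholding, standing in here for the nonlocal filter F(.)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def d_amp(y, A, denoise, n_iter=30, eps=1e-3, seed=0):
    """D-AMP skeleton in the style of Eqs. (6)-(7): plug-in denoiser,
    Monte Carlo divergence for the Onsager term, and per-iteration noise
    estimate sigma = ||z||_2 / sqrt(m) as in Eq. (32)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x, z = np.zeros(n), y.copy()
    for _ in range(n_iter):
        sigma = np.linalg.norm(z) / np.sqrt(m)    # iterative noise estimate
        r = x + A.T @ z                           # pseudo-data to denoise
        x_new = denoise(r, sigma)
        b = rng.standard_normal(n)                # Monte Carlo divergence probe
        div = b @ (denoise(r + eps * b, sigma) - x_new) / eps
        z = y - A @ x_new + z * div / m           # Onsager correction
        x = x_new
    return x

rng = np.random.default_rng(4)
n, m = 200, 100
x_true = np.zeros(n)
x_true[rng.choice(n, 8, replace=False)] = 2.0 * rng.standard_normal(8)
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x_true
x_hat = d_amp(y, A, soft)
```

Swapping the `denoise` argument is all that changes between BM3D-AMP and a structured-sparse-coding filter, which is the sense in which the denoiser is a plug-and-play "black box".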

IV. EXPERIMENTS
In order to verify the outstanding robustness of the proposed NLF-DAMP for reconstructing images with external noise and various types of sampling masks and sampling rates, we compare our method with four image reconstruction algorithms: two nonlocal sparsity-based algorithms, NLR-CS [28] and BM3D-AMP [29], and two deep learning-based algorithms, ADMM-Net [24] and IFR-Net [46]. Nine standard test MR images, each of size 256 × 256, are employed as the experimental images in this paper, as shown in Figure 3. Both pseudo-radial sampling and 2D random sampling schemes are adopted in our experiments, as shown in Figure 4 and Figure 5 respectively. In the 2D random sampling scheme, five sampling rates (10%, 12%, 16%, 20% and 25%) are chosen for comparison, while in the pseudo-radial sampling scheme, the numbers of sampling lines directly related to the sampling rates of the masks are 15, 20, 30, 45 and 60 respectively. The experimental results, including objective quality, subjective quality and runtime, are presented.
The peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM) index are used to quantitatively evaluate the quality of the reconstruction results.
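For reference, PSNR is computed directly from the mean squared error. This small helper assumes 8-bit images with a peak value of 255:

```python
import numpy as np

def psnr(ref, img, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10*log10(peak^2 / MSE)."""
    mse = np.mean((ref.astype(float) - img.astype(float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

# Every pixel off by exactly 1 gray level -> MSE = 1 -> PSNR = 20*log10(255) dB.
ref = np.zeros((8, 8))
noisy = ref + 1.0
value = psnr(ref, noisy)
```

SSIM is more involved (local luminance, contrast and structure comparisons over sliding windows) and is usually taken from a standard implementation rather than re-derived.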
We generate the CS measurements by randomly sampling the Fourier transform coefficients of the test images, and then reconstruct the MR images with the five algorithms under comparison. The main parameters of the proposed algorithm lie in the step of grouping similar patches by block matching: the 36 most similar patches of size 6 × 6 are chosen for each exemplar patch to construct the matrix of similar patches. We extract an exemplar image patch every 5 pixels along both the horizontal and vertical directions to reduce the computational complexity. All of these experiments are implemented in the MATLAB language. For the other reconstruction algorithms, the codes are downloaded from the authors' websites; for fair comparison, the default settings in their codes are adopted. We first present the experimental results for noiseless CS measurements and then report the results for noisy CS measurements. The noisy CS measurements are corrupted with white Gaussian noise with standard deviations of 5, 10, 15, 20 and 25.
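The block-matching step used to form each similar-patch group can be sketched as an exhaustive Euclidean-distance search over a strided grid. The parameters below (6×6 patches, 36 matches, stride 5) follow the text, while the test image is random:

```python
import numpy as np

def block_match(img, ref_yx, patch=6, k=36, stride=5):
    """Group the k patches most similar (in Euclidean distance) to the
    reference patch; the columns of the returned matrix form the data
    matrix of one similar-patch group."""
    H, W = img.shape
    ry, rx = ref_yx
    ref = img[ry:ry + patch, rx:rx + patch].ravel()
    cands = []
    for y in range(0, H - patch + 1, stride):
        for x in range(0, W - patch + 1, stride):
            p = img[y:y + patch, x:x + patch].ravel()
            cands.append((np.sum((p - ref) ** 2), p))
    cands.sort(key=lambda c: c[0])                  # nearest patches first
    return np.stack([p for _, p in cands[:k]], axis=1)

rng = np.random.default_rng(5)
img = rng.standard_normal((64, 64))
X = block_match(img, (10, 10))                      # 36 columns of 36 pixels
```

In practice the search is restricted to a local window around the exemplar patch for speed; the exhaustive scan here is for clarity only.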

A. EXPERIMENTS ON NOISELESS MEASUREMENTS
In this section, numerous experiments are conducted to show the superiority of the proposed CS-MRI recovery method NLF-DAMP in dealing with different sampling masks and sampling rates, in comparison with the four reconstruction algorithms NLR-CS, BM3D-AMP, ADMM-Net and IFR-Net. A complete comparison is presented in Table 1, Table 2, Table 3 and Table 4. As is clear from these tables, our method NLF-DAMP outperforms all the other algorithms in a large majority of the tests. In the first experiment, we generate CS measurements by pseudo-radial subsampling of the Fourier transform coefficients of the test MR images. Five radial subsampling masks associated with five sampling rates are shown in Figure 4. The PSNR results of the images recovered by all methods are shown in Table 1. From Table 1, one can see that, on average, the proposed NLF-DAMP achieves higher PSNR values than the other competing methods. In fact, the average gain of NLF-DAMP over ADMM-Net, IFR-Net, NLR-CS and BM3D-AMP is as much as 1.50 dB, 4.04 dB, 0.47 dB and 0.59 dB respectively. The SSIM results are shown in Table 2; the data in Table 2 supports the statement that NLF-DAMP is the best performer.
Next, we generate CS measurements by randomly sampling the Fourier transform coefficients of input images.
A visualization of the five 2D random subsampling masks is depicted in Figure 5. The quantitative PSNR and SSIM results are tabulated in Table 3 and Table 4, from which we can see that 1) NLF-DAMP also achieves the best average PSNR and SSIM results when the 2D random subsampling scheme is employed; 2) the two deep learning-based algorithms ADMM-Net and IFR-Net perform better at the 10% and 25% sampling rates respectively than at the other sampling rates, implying a comparatively poor generalization ability of deep learning-based algorithms with respect to reconstruction at different sampling rates. To facilitate the evaluation of subjective quality, Figure 6 and Figure 8 show visual comparisons of the results reconstructed by the different methods on the Brain2 and Head1 images with 30 sampling lines and a 16% sampling rate, while the corresponding iterative curves (except for the two deep learning-based methods) are given in Figure 7 and Figure 9. From these figures, we can clearly see that the proposed NLF-DAMP algorithm performs better than the others and enjoys great advantages in producing clearer images, e.g., at the edges of the organs and blood vessels, owing to its capability of better spatial adaptation. It can not only faithfully reconstruct large-scale sharp edges but also recover small-scale fine structures well. The visual results recovered by the NLR-CS and BM3D-AMP algorithms are inferior to those of NLF-DAMP, with lower PSNR and SSIM values and fewer details. Visual artifacts can still be clearly observed in the result produced by IFR-Net. Although the image yielded by ADMM-Net is clear, a small amount of artificial texture introduced by the deep learning method can be found. These results demonstrate the superiority of the proposed algorithm in visual quality.
The CPU time and PSNR are traced at each iteration for each reconstruction method. Figure 7 (a) (or Figure 9 (a)) and Figure 7 (b) (or Figure 9 (b)) present the iteration number vs. PSNR curves and the CPU time vs. PSNR curves, respectively. Since NLR-CS runs for more iterations, its results are plotted every four iterations in our figures. Our method achieves the best performance in terms of PSNR and CPU time after about 30 iterations in Figure 7 (a) and 35 iterations in Figure 9 (a), and after about 350 seconds in Figure 9 (b) and Figure 10 (b). These curves demonstrate that NLF-DAMP can converge to a good reconstruction in a reasonable amount of time. The deep learning methods ADMM-Net and IFR-Net are fast, taking only a few seconds to reconstruct an image, but training a deep learning network requires a great deal of time. Unlike BM3D-AMP, which uses a denoiser written in C, our pure MATLAB implementation of the NLF-DAMP algorithm (without any C-coded optimization) still runs reasonably fast, and is faster than NLR-CS, which is also implemented in MATLAB.
As is well known, the streaking artifacts produced by the pseudo-radial sampling scheme are more difficult to remove than the incoherent aliasing artifacts produced by the 2D random sampling scheme, leading to lower PSNR and SSIM results for the images reconstructed by all compared algorithms. Clearly, the results indicate that the proposed method achieves the best performance over various sampling patterns and sampling rates in noiseless reconstruction.

B. EXPERIMENTS ON NOISY MEASUREMENTS
In this section, we demonstrate that the NLF-DAMP algorithm performs far better than competing methods in the presence of measurement noise. We generate noisy CS measurements by adding Gaussian noise w with variance σ_w^2 to the under-sampled Fourier transform coefficients of the input images, i.e., y = Ax + w. The noise standard deviation σ_w is set to 10, 15, 20 and 25 for both the pseudo-radial sampling (with 45 sampling lines) and 2D random sampling (with a 20% sampling rate) schemes.
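A minimal sketch of this noisy acquisition model (assuming, for illustration, that the noise is added as complex Gaussian on the sampled k-space locations only; the function name is hypothetical):

```python
import numpy as np

def noisy_measurements(image, mask, sigma_w, seed=None):
    """Form noisy CS measurements y = Ax + w, where A is the
    under-sampled 2D Fourier operator defined by `mask` and w is
    Gaussian noise with standard deviation sigma_w."""
    rng = np.random.default_rng(seed)
    kspace = np.fft.fft2(image) * mask               # clean measurements Ax
    w = sigma_w * (rng.standard_normal(image.shape)
                   + 1j * rng.standard_normal(image.shape))
    return kspace + w * mask                         # noise only where sampled
```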
The PSNR and SSIM results of NLF-DAMP under varying amounts of measurement noise, compared with the other competing methods, are shown in Table 5, Table 6, Table 7 and Table 8. Table 5 and Table 6 show the SSIM and PSNR results with the pseudo-radial sampling scheme, respectively, while Table 7 and Table 8 present those with the 2D random sampling scheme. From these tables, we can conclude that 1) the quality of the images reconstructed by all CS-MRI methods degrades severely as the amount of measurement noise increases; 2) the proposed NLF-DAMP method outperforms the other methods on most images, with only a few exceptions. The average PSNR improvements over ADMM-Net, IFR-Net, NLR-CS and BM3D-AMP are about 3.19 dB, 5.63 dB, 4.93 dB and 0.33 dB with the pseudo-radial sampling scheme, and 5.67 dB, 7.86 dB, 5.25 dB and 0.57 dB with the 2D random sampling scheme, respectively.
Visual comparisons between the reconstructions by different methods under measurement noise are provided in Figure 8 and Figure 10, while the corresponding iteration number vs. PSNR curves and CPU time vs. PSNR curves are presented in Figure 9 and Figure 11, respectively. It can be observed from Figure 8 and Figure 10 that BM3D-AMP and NLF-DAMP deliver better visual quality than the others with both sampling schemes; however, the MR images reconstructed by BM3D-AMP are over-smoothed compared to those by NLF-DAMP, resulting in a loss of image details. By contrast, the images reconstructed by ADMM-Net, IFR-Net and NLR-CS all suffer from noticeable noise spots. The iteration curves clearly show that NLR-CS is sensitive to severe noise. The NLF-DAMP method is found to be robust to noise with various sampling patterns in these experiments, owing to its joint estimation of signal and noise.

V. CONCLUSION
In this paper, we have proposed a robust nonlocal filtering approach within the framework of D-AMP for compressive sensing magnetic resonance imaging. To fully exploit spatial dependency, we design nonlocal filters and apply them to grouped similar patches of a noisy input. To make structural sparse coding more robust, we propose a new probabilistic model for the sparse coefficients based on the Laplacian scale mixture (LSM) model. LSM modeling translates structural sparse coding into an optimization problem over the sparse coefficients and hidden variables. We have adopted the method of alternating optimization and shown that the resulting subproblems can be solved in closed form. Experimental results have shown that the proposed algorithm outperforms existing CS-MRI techniques over various sampling patterns and rates, especially in the presence of measurement noise.