MRI Restoration Using Edge-Guided Adversarial Learning

Magnetic resonance imaging (MRI) images acquired as multislice two-dimensional (2D) images present challenges when reformatted in orthogonal planes due to sparser sampling in the through-plane direction. Restoring the "missing" through-plane slices, or regions of an MRI image damaged by acquisition artifacts, can be modeled as an image imputation task. In this work, we consider the damaged image data or missing through-plane slices as image masks and propose an edge-guided generative adversarial network to restore brain MRI images. Inspired by the procedure of image inpainting, our proposed method decouples image repair into two stages, edge connection and contrast completion, both of which use generative adversarial networks (GANs). We trained and tested our method for thick-slice imputation on a dataset from the Human Connectome Project, and evaluated artifact correction on clinical and simulated datasets. Our edge-guided GAN achieved superior PSNR, SSIM, conspicuity, and signal texture compared with traditional imputation tools, the context encoder, and the Densely Connected Super Resolution Network with GAN (DCSRN-GAN). The proposed network may improve utilization of clinical 2D scans for 3D atlas generation and big-data comparative studies of brain morphometry.


I. INTRODUCTION
Magnetic resonance imaging (MRI), as an indispensable tool for medical diagnosis and imaging research, offers detailed visualization of the human torso, extremities, and brain. However, artifacts often occur, reducing image quality, diagnostic utility, and scientific relevance [1]. While a plethora of two-dimensional (2D) MRI scans are acquired in hospitals, retrieving information missing due to image artifacts or large slice thickness is of great importance, especially for downstream meta-analyses.
Types and manifestations of artifacts were reviewed by Stadler and Ba-ssalamah [2] and Zhuo and Gullapalli [3]; some are obvious, and some are subtle, leading to misinterpretation or misdiagnosis. The most common one is the motion artifact due to respiration or other movement of the imaging subject [4], [5]. It appears as blurring or coherent ghosting, and in more severe cases it smears the image. Other frequently encountered artifacts are equipment-related, such as spike (herringbone) artifacts, appearing as dark stripes overlaid on the image, or zipper artifacts, exhibiting increased noise that extends throughout the image slices [6], [7]. Finally, many artifacts manifest as voids in the images. All of these often lead to the discarding of the affected slice, and hence a loss of potentially crucial information, especially in cases of pathologic conditions. Therefore, correcting artifact-affected slices is of great importance for both clinical and research work.
(The associate editor coordinating the review of this manuscript and approving it for publication was Junxiu Liu.)
Another important image restoration application lies in retrieving anatomic information coded in the through-plane direction of 2D images, which can be formulated as missing slices. This is particularly imperative for T2-weighted imaging, due to its long signal recovery time and vulnerability to motion artifacts [4], [8], [9]. Specifically, inter-slice spacing is three to six times larger than the in-plane resolution of individual slices. This results in a resolution that is much higher in-plane than in the through-plane direction, as shown in Fig. 1(a).

FIGURE 1. (a) The resolution in the through-plane direction (coronal and sagittal) is usually much lower than that in the in-plane (axial) direction. (b) The slices to be restored are modeled as 1-valued masks in the through-plane direction, and appear as masked rows in the other two planes. Note that the masked region shown in the coronal and sagittal views in (b) is one slice out of every three, different from the mask size implemented in this paper (see Section III.A); it is a schematic illustration of how missing slices are represented in the two orthogonal planes.
Both multi-slice 2D acquisitions, with their large inter-slice spacings, and artifact-corrupted images, are poorly suited for downstream segmentation and shape analyses, such as skull stripping, deformable registration, and surface or shape construction [10]- [12]. Therefore, retrieving missing slices to achieve isotropic-resolution, and correcting artifact affected regions are crucial steps in obtaining as much relevant information as possible out of the images.
Prior MRI restoration methods proposed to go from anisotropic 2D MRI images to isotropic ones can be grouped under two main categories: model-based and data-driven. The earliest methods include piecewise interpolation such as nearest neighbor, linear, polynomial, and spline interpolation [13]. Mahmoudzadeh and Kashou [14] registered three 2D orthogonal scanning planes to a high-resolution grid and combined the three interpolated volumes to achieve the high-resolution image. On the other hand, using data-driven methods, Greenspan et al. [15] and Yang et al. [16] extended the iterative back projection (IBP) method proposed by Irani and Peleg [17], and modeled sparse representation and over-complete dictionary learning to restore missing slices. Dalca et al. [18] employed an expectation-maximization algorithm to train a Gaussian mixture model, imputing the missing structures by learning solely from the available collection of sparsely sampled images. The work also investigated the effects of various slice thicknesses on performance [18].
Mathematically, image restoration can be modeled as an ill-posed problem: finding $f^{-1}(\cdot)$, the inverse of the image-degrading mapping, and minimizing the difference between the estimated results $\hat{X}$ and the desired but unknown images $X$ in the forward model $Y = f(X)$, where $Y$ represents the observed images and $f$ is the image-degrading mapping. Efforts have been made using IBP [17], non-local means, and matrix completion algorithms [19]-[21], in which reconstruction and correction were iterated in a multi-scale manner. More recently, deep learning approaches have enjoyed explosive popularity and a powerful capability to improve reconstruction results. Convolutional neural networks (CNNs) [22] have achieved satisfactory results compared with previously applied methods [23]-[25]. However, the widely used optimization methods of CNNs minimize the voxel-wise error between estimated and ground-truth images without regard for the underlying structure. This leads to overall blurring and lower perceptual image quality [26], suggesting that CNN-based methods struggle to retain high-frequency information. Evidence from other imaging applications suggests that generative adversarial networks (GANs) better preserve the edges and image texture essential to perceptual quality [27]-[29], but suffer from poorer voxel-wise performance because of their emphasis on learned patterns [24].

FIGURE 2. Framework of the proposed method. The disconnected edges and their corresponding mask patterns are used to train the edge generator. The edges extracted from the original image are used as edge references. The contrast generator, trained on paired masked images and ground truth, uses the completed edges generated by the edge generator as constraints. Note that the ground-truth image feeds the edge discriminator as prior information, and the contrast discriminator to be differentiated from the recovered image.
In this work, we propose a new method to improve the voxel-wise performance of GANs by cascading two networks focusing on specific tasks. We explicitly concentrate on restoring missing slices due to image artifacts or 2D scanning schemes, to generate anatomically plausible and consistent 3D volumes by imputing the missing slices. Our framework was inspired by image inpainting by artists, with the goal of ''reconstituting the missing or damaged portions of the work, in order to make it more legible and to restore its unity'' [30]. In image inpainting, an artist composes a drawing by initially delineating the spaces and shapes, using a ''lines first, color second'' principle [31]. Indeed, in both de-novo creation of an artistic painting or during image restoration, a completed sketch or edge recovery plays a vital role and comes before paint is applied to the canvas [32]. Attempts have been made to develop image inpainting using natural images [33]- [35]. Our proposed edge-guided image restoration network decouples the recovery of high and low-frequency components of the missing information, to generate coherent anatomical details from adjacent slices. We first apply a GAN to recover edge information based upon existing image context. A contrast completion GAN subsequently combines the ''sketched'' edges from the CNN to fill in appropriate image contrasts.

II. METHODS
A. DATA REPRESENTATION AND PROPOSED FRAMEWORK
In this work, we model the damaged or missing through-plane slices (axial slices, illustrated in Fig. 1(a)) as binarized masks, where the masked regions are set to value 1. These in turn appear as masked rows in the other two planes, as presented in Fig. 1(b); retrieving missing slices is achieved by estimating the masked rows in the two orthogonal planes. To mimic image inpainting, the first step is to connect the broken edges in the masked rows. Then a contrast completion network utilizes the connected edges from the first step and estimates the voxel intensities in the missing rows. This approach is realized by an edge-guided GAN (EG-GAN) to produce perceptually consistent results.
Our proposed method consists of two steps, edge connection and contrast completion; both steps follow an adversarial model, consisting of a generator and a discriminator (Fig. 2). EG-GAN first connects the missing edges of the artifact-affected or low-resolution images with the edge generator, taking as input the 2D scans and the masks generated from the missing through-plane slices, supervised by the edges generated from the original images. As input and ground truth, the edges of the 2D images and of the original isotropic-resolution images, respectively, are extracted by a Canny edge detector [36]. In the second step of our method, a contrast generator fills in the intensities based on the original contrast of the 2D images, guided by the edges generated in the first step and supervised by the original images.
Our framework was inspired by Nazeri et al. [37], which achieved impressive results in image restoration for natural images. We started building our network from the context encoder, an image semantic inpainting network implemented by Pathak et al. [38]. The design of the two networks is presented in detail in the following sections.
83860 VOLUME 8, 2020

B. DESIGN OF LOSS FUNCTIONS FOR EG-GAN
1) LOSS FUNCTION IN EDGE CONNECTION
The loss function of edge connection is extended from [39] and designed as:

$\mathcal{L}_{edge} = \lambda_{GAN_1}\, loss_{GAN_1} + \lambda_f\, loss_f + \lambda_{DSC}\, loss_{DSC}$  (1)

where $loss_{GAN_1}$ is the adversarial loss, and $loss_f$ is the feature-matching loss [27], used to stabilize the network. To enhance the voxel-wise precision of the generated edge, the Dice similarity coefficient (DSC) loss [40], $loss_{DSC}$, is included. Finally, $\lambda_{GAN_1}$, $\lambda_f$, and $\lambda_{DSC}$ are regularization parameters.
We denote $I_{GT}$ as the original image (ground truth), and $I^M_{GT} = I_{GT} \odot (1 - M) + M$ as the masked image, where $M$ is the mask designed to mimic the thick slices (masked voxels set to 1). By the same token, $C_{GT}$ is the image contour and $C^M_{GT}$ is its masked or degraded edge map. The edge mapping performed by the generator $G_e$ can be represented as:

$C_{pred} = G_e\!\left(I^M_{GT}, C^M_{GT}, M\right)$  (2)

where $C_{pred}$ is the predicted image contour. $C_{pred}$ and $C_{GT}$ are designed as paired inputs of the discriminator to distinguish whether $C_{pred}$ is real. The adversarial loss for the edge connection network can be written as:

$loss_{GAN_1} = \mathbb{E}\!\left[\log D_e(C_{GT}, I_{GT})\right] + \mathbb{E}\!\left[\log\!\left(1 - D_e(C_{pred}, I_{GT})\right)\right]$  (3)

where $I_{GT}$ and $C_{GT}$ are the ground-truth images and their contours.
The feature-matching loss, $loss_f$, calculating the difference of the activation maps generated by the hidden layers of the discriminator, is defined as:

$loss_f = \mathbb{E}\!\left[\sum_{i=1}^{L} \frac{1}{N_i} \left\| D_e^{(i)}(C_{GT}) - D_e^{(i)}(C_{pred}) \right\|_1\right]$  (4)

where $L$ is the number of hidden layers in the discriminator, $N_i$ is the number of elements in the activation map of the $i$-th layer, and $D_e^{(i)}$ represents the $i$-th activation of the discriminator. The Dice similarity coefficient calculates a spatial overlap index that ranges between 0 and 1 [41] and has been widely used for evaluating two sets of binary segmentation results. In the edge generator, the DSC loss, $loss_{DSC}$, is designed to restrict the bias towards the background during learning [40], [42]:

$loss_{DSC} = 1 - \frac{2\sum_{i=1}^{N} c^i_{GT}\, c^i_{pred}}{\sum_{i=1}^{N} c^i_{GT} + \sum_{i=1}^{N} c^i_{pred}}$  (5)

where $c^i_{GT} \in C_{GT}$ and $c^i_{pred} \in C_{pred}$, and the summation runs over the $N$ voxels.
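As a concrete illustration, the DSC loss can be sketched in NumPy as follows (a minimal sketch; the function name and the smoothing term `eps` are our own additions, not taken from the paper):

```python
import numpy as np

def dsc_loss(c_pred, c_gt, eps=1e-7):
    """Soft Dice loss between predicted and ground-truth edge maps.

    c_pred, c_gt: arrays with values in [0, 1] of identical shape.
    eps guards against division by zero when both maps are empty.
    """
    intersection = np.sum(c_pred * c_gt)
    return 1.0 - (2.0 * intersection + eps) / (np.sum(c_pred) + np.sum(c_gt) + eps)
```

Perfect overlap drives the loss to 0, while fully disjoint edge maps drive it to 1, which is why the term counteracts the dominance of background voxels.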

2) LOSS FUNCTION IN CONTRAST COMPLETION
The contrast completion network takes the incomplete image $I^M_{GT}$ as input, enhanced by the connected contour from the previous step. The contrast mapping $G_c$ can be represented as:

$I_{pred} = G_c\!\left(I^M_{GT}, C_{pred}\right)$  (6)

where $I_{pred}$ is the prediction of the restored image. Similar to (3), the adversarial loss of the contrast completion network is defined as:

$loss_{GAN_2} = \mathbb{E}\!\left[\log D_c(I_{GT})\right] + \mathbb{E}\!\left[\log\!\left(1 - D_c(I_{pred})\right)\right]$  (7)

where $D_c$ is the corresponding discriminator. To ensure that the reconstructed image has both high voxel-wise accuracy and good perceptual quality, the network is trained with a combined loss function, including the $\ell_2$ loss, the perceptual loss [43] $loss_p$, and the style loss $loss_s$ [44]. The $\ell_2$ loss is the most common reconstruction loss; the perceptual loss refines results that are not perceptually similar to the ground truth; the style loss [45] is chosen to ameliorate "checkerboard" artifacts due to transposed convolutional layers [26]. Taken together, the overall loss function of the contrast completion is:

$\mathcal{L}_{contrast} = \lambda_{GAN_2}\, loss_{GAN_2} + \lambda_{\ell_2}\, loss_{\ell_2} + \lambda_p\, loss_p + \lambda_s\, loss_s$  (8)

C. NETWORK ARCHITECTURE
The architecture of both networks follows [43], which was demonstrated to perform well in image-to-image translation and super-resolution [44], [46], [47]. The generators of both networks include encoders, eight residual blocks, and decoders. Both discriminators follow a PatchGAN architecture [46], [48].
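The style loss described above compares feature statistics via Gram matrices of activation maps. A minimal NumPy sketch follows (the function names and the $\ell_1$ reduction are illustrative assumptions; in practice these losses are computed on activations of a pretrained network):

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of a (channels, height, width) activation map,
    normalized by the number of elements per channel."""
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return f @ f.T / (h * w)

def style_loss(feat_pred, feat_gt):
    """Mean absolute difference between Gram matrices of two
    activation maps (Gatys-style texture comparison)."""
    return np.abs(gram_matrix(feat_pred) - gram_matrix(feat_gt)).mean()
```

Because the Gram matrix captures channel-to-channel correlations rather than spatial positions, penalizing its difference targets texture statistics, which is why it helps suppress checkerboard patterns.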
To further stabilize the networks, spectral normalization, which scales weight matrices by their respective largest singular values, is applied in both the generator and the discriminator [49]. Note that the edge generator uses spectral normalization and instance normalization across all layers [49], [50], whereas the contrast generator uses only instance normalization: learning high-frequency information such as edges requires more restrictions to maintain network stability [50], while for low-frequency contrast information spectral normalization is not necessary and might slow training. Therefore, spectral normalization is removed from the contrast generator.

III. EXPERIMENTS
In this section, we introduce our experimental datasets, training and testing schemes, two state-of-the-art methods for similar applications used for comparison, and our evaluation methods.

A. DATA DESCRIPTION
To demonstrate the generalization of EG-GAN, we used a large publicly available T1-weighted brain image dataset, the Human Connectome Project (HCP) S1200 collection [51]. We downloaded images that had undergone the preprocessing pipelines, including distortion correction and brain extraction, and randomly chose 600 subjects for our experiments. The images have a high isotropic resolution of 0.7 mm; we removed all-zero boundary slices and fit the volumes to 256 × 256 × 256. The whole dataset was split into 50% training, 25% validation, and 25% testing, without overlapping subjects. Only the predicted results from the testing set were used in our final performance evaluation and comparison.
The original images were used as ground-truth images $I_{GT}$, and their counterpart thick-slice images, $I^M_{GT}$, were artificially generated to mimic damaged slices or 2D MRI scans. To simulate typical clinical 2D scans to the greatest extent, the slice thickness was four times larger in the through-plane direction, i.e., three slices were masked (filled with the value 1, as illustrated in Fig. 1(b)) for every four slices in the through-plane direction. Therefore, the thick-slice images have the same size as the reference images (256 × 256 × 256).
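The thick-slice mask model above can be sketched in NumPy (a hedged illustration; the function names and the axis convention are our own):

```python
import numpy as np

def thick_slice_mask(shape, keep_every=4, axis=2):
    """Binary mask for the thick-slice model: of every `keep_every`
    through-plane slices, one is kept (0) and the rest are masked (1)."""
    mask = np.ones(shape, dtype=np.float32)
    idx = [slice(None)] * len(shape)
    idx[axis] = slice(0, shape[axis], keep_every)
    mask[tuple(idx)] = 0.0  # retained (acquired) slices
    return mask

def apply_mask(volume, mask):
    """Masked regions are filled with the value 1, as in the paper's model."""
    return volume * (1 - mask) + mask
```

With `keep_every=4`, 75% of the through-plane slices are masked, matching the four-fold slice-thickness scenario described above.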
In addition, to investigate the image quality difference of the imputed image using k-space versus image-space resampling, we generated low-resolution images in a way similar to Chen et al. [52]: we applied the FFT to convert the original image into k-space, truncated the outer part of the 3D k-space data, filled the truncated data with zeros, and finally applied the inverse FFT to convert back to image space. High-frequency truncation and zero padding in k-space effect low-pass filtering with interpolation back to the original image size. The interpolated slices (representing three out of every four slices) were masked in the through-plane direction to mimic 2D MRI scenarios.
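The k-space resampling steps above (FFT, outer truncation, zero-fill, inverse FFT) can be sketched in NumPy. This simplified sketch truncates along a single axis, whereas the paper truncates the outer part of the full 3D k-space; the function name and parameters are our own:

```python
import numpy as np

def kspace_downsample(volume, factor=4, axis=2):
    """Low-pass filter along `axis` by truncating outer k-space and
    zero-filling back to the original size (interpolates in image space)."""
    k = np.fft.fftshift(np.fft.fftn(volume), axes=axis)
    n = volume.shape[axis]
    keep = n // factor
    lo = (n - keep) // 2
    sl = [slice(None)] * volume.ndim
    sl[axis] = slice(lo, lo + keep)
    truncated = np.zeros_like(k)
    truncated[tuple(sl)] = k[tuple(sl)]  # keep only central (low) frequencies
    return np.abs(np.fft.ifftn(np.fft.ifftshift(truncated, axes=axis)))
```

Because the output grid is unchanged, the result is a blurred volume of the original size, mimicking the through-plane blurring of thick-slice acquisitions.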
To evaluate the intra-subject generalization of our network and slice consistency, we chose the axial plane as the thick slice direction to conduct the following experiments.

B. TRAINING PROCEDURE
The models were implemented in PyTorch on High-Performance Computing clusters with NVIDIA Tesla P100 GPUs. The learning rate was set to 10^{-4} for both generators and discriminators, and the models were optimized with the ADAM optimizer. Our training image size was 256 × 256, with a batch size of eight; the Canny edge detection threshold was set to 0.5, and the standard deviation of the Gaussian filter used in the Canny detector was one. We set the maximum number of training iterations to 20,000 for the edge connection stage and 40,000 for the image completion stage, as no significant improvement was observed afterward.

FIGURE 4. MRI through-plane imputation results: representative coronal and sagittal slices of one subject from the HCP dataset. The original T1-weighted image (reference) is down-sampled in the through-plane direction, and missing axial slices are restored by different methods: nearest neighbor, cubic, context encoder, and EG-GAN. Our method provides more visually plausible results which recover more brain anatomy without stairway effects or broken white matter tracts. Compared with the context encoder, EG-GAN mitigates the stripe-shaped artifact caused by the mask model.
Our training time was about 230 ms/iteration, summing to less than two hours for the edge connection stage and less than four hours for the image completion stage. Prediction takes about two minutes per volume.

C. COMPARISON METHODS
To evaluate our proposed method, we first established a performance baseline using the simplest and fastest interpolation methods: nearest neighbor (NN) and cubic interpolation. To evaluate the improvement contributed by the edge guidance, we then compared a variant of the context encoder [38] (the contrast completion part of our proposed network) with EG-GAN.
Due to the very limited work on image imputation, we compared with a similar application, the GAN-based 3D MRI reconstruction in [53], coined DCSRN-GAN, which outperformed previously proposed methods in this task [23], [52]. We used the same data preparation steps as [52] and followed the same hyperparameters to build mDCSRN-GAN, as proposed by the authors [53]. Note that we did not implement our proposed method in 3D as in [52] and [53], due to memory constraints and because patch-based implementations degrade the performance of the edge connection stage, which is essential for our method. To evaluate the performance of our method for super-resolution tasks, we inverse-Fourier transformed the images and performed linear down-sampling in Fourier space (by zero padding), serving as a linear up-sampling method. Similarly, we used pseudo-k-space down-sampled data as "low-resolution" input images when comparing DCSRN-GAN, the context encoder, and EG-GAN.
We also tested our model on two kinds of artifact-affected images, with spike artifacts and zipper artifacts, which are the most relevant extensions of our proposed mask model. To simulate the spike artifact, we randomly added spike gradients at different frequencies and angles in the pseudo-k-space of the HCP data. For the zipper-artifact removal experiment, we visually screened a large multi-center pediatric clinical dataset [54], including T1-weighted axial readouts (256 × 256 × 22, 0.86 mm × 0.86 mm × 6 mm) and T1-weighted sagittal readouts (24 × 256 × 256, 6 mm × 0.86 mm × 0.86 mm), under the supervision of a certified on-site radiologist. Two subjects (ages 9.5 and 13) were identified with images containing zipper artifacts only and were tested using our model trained on the HCP dataset.

D. EVALUATION
To evaluate the similarity between the original images and our results, we compared the voxel-wise intensity accuracy using the peak signal-to-noise ratio (PSNR). Furthermore, we measured the structural similarity index (SSIM) [55], which is considered to reflect perceptual image quality. Both metrics were used to evaluate each comparison independently.

FIGURE 5. MRI super-resolution results: representative coronal and sagittal slices of one subject from the HCP dataset. Low-resolution coronal and sagittal planes are generated by down-sampling the original T1-weighted scan (reference) in pseudo 3D k-space, and 3D high-resolution scans are reconstructed from multiple 2D axial slices by different methods: linear, DCSRN-GAN, context encoder, and EG-GAN. Our approach reconstructs images with more anatomically plausible details and more distinct edges.

IV. RESULTS

A. IMPUTATION
Fig. 4 demonstrates the restored images in sagittal and coronal views from a random subject using the four methods. The top panel shows the results from the basic interpolation methods, NN and cubic. Whether in the whole-brain view or the regional zoomed-in one, we observe the expected stairway effects in the NN results in Fig. 4(a). Cubic interpolation yields more aesthetically pleasing results, but at the cost of some edge blurring (Fig. 4(b)). White matter disconnection can be seen in the second zoomed image, because the interpolated values lie closer to the neighboring grey matter voxels, when compared with the reference image in Fig. 4(e). The bottom panel compares the results of the context encoder and our proposed method, EG-GAN, in Fig. 4(c) and Fig. 4(d), respectively. Neither the context encoder nor EG-GAN exhibits the stairway effects or broken white matter tracts of the interpolation methods, and both recover many of the fine details of the brain anatomy. However, the result of EG-GAN appears much more visually plausible, and the method is capable of restoring fine details of small vessels (the second magnified region in Fig. 4(d)) as well as distinguishable and smooth boundaries of the cortex. Note that the result of the context encoder shows faint traces of the masked rows (missing slices in the axial direction), observable in all three magnified figures in Fig. 4(c). To demonstrate inter-slice consistency, the axial slices filled in the through-plane direction by EG-GAN are presented in Supplemental Fig. 1.
Quantitative results are summarized in Table 1. Consistent with Fig. 4, the context encoder, which is widely used for image inpainting, achieved higher PSNR and SSIM than NN or cubic interpolation. However, EG-GAN outperformed the traditional methods by roughly 10% in PSNR and 5% in SSIM. Note that the standard deviation of the PSNR for EG-GAN is only a third of those of the other methods.
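The PSNR metric used in these comparisons can be sketched as follows (a standard definition in dB; the `data_range` parameter encodes our assumption that images are scaled to a known intensity range):

```python
import numpy as np

def psnr(pred, ref, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to data_range."""
    mse = np.mean((pred.astype(np.float64) - ref.astype(np.float64)) ** 2)
    if mse == 0:
        return np.inf  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)
```

Because PSNR is a log-transformed mean squared error, a roughly 10% PSNR gain reflects a substantially smaller voxel-wise error.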

B. SUPER-RESOLUTION
Fig. 5 exhibits the reconstructed images in sagittal and coronal views of another random subject using pseudo-k-space down-sampled data. We compared the basic linear interpolation method, a state-of-the-art super-resolution method, DCSRN-GAN, and two image inpainting-based methods, the context encoder and our EG-GAN, in Fig. 5(a)-(d), respectively. Both linear interpolation and DCSRN-GAN avoid stairway effects but show overall blurring when comparing the three magnified regions with the reference image in Fig. 5(e). We observed areas that DCSRN-GAN failed to restore, as seen at the top of the second magnified image in Fig. 5(b). Comparing the results of the context encoder and EG-GAN, the context encoder is not able to restore regions where the folds of gyri and sulci are more complex, nor the blood vessel (the second magnified image in Fig. 5(c)), whereas EG-GAN can (bright voxels between sulcal folds in the coronal view of Fig. 5(e), and the second magnified image). The reconstructed axial slices are shown in Supplemental Fig. 2. Table 2 shows the quantitative evaluation results of the linear, DCSRN-GAN, context encoder, and EG-GAN methods. Note that DCSRN-GAN and the context encoder achieved performance comparable to EG-GAN in both PSNR and SSIM. However, when comparing super-resolution with image imputation (Table 1), both PSNR and SSIM are higher in the image imputation task.

TABLE 2. Evaluation results of four methods for super-resolution using pseudo-k-space down-sampled data, presented as mean ± standard deviation. PSNR: peak signal-to-noise ratio; SSIM: structural similarity index.

C. ARTIFACT CORRECTION
Fig. 6(a) and (d) present axial and coronal views of images with mild zipper artifacts from two subjects. The mask that we used in the training model can perfectly cover the artifact-corrupted rows (Fig. 6(b) and (e)). Fig. 6(c) and (f) demonstrate that our EG-GAN model can recover the zipper-artifact-corrupted rows. The magnified images present the detail of the recovered region, showing that the recovered rows removed the artifact while preserving the anatomical contrast both in the ventricular area (the magnified image of Fig. 6(c)) and the cortex (the magnified image of Fig. 6(f)).
Our simulated spike artifacts on a random subject from the HCP dataset are displayed in Fig. 7(a) and (d). Due to the different frequencies and angles in pseudo-k-space, they manifest different directions and brightnesses. The artifact-corrected images shown in Fig. 7(b) and (e) restore the affected lines in (a) and (d).
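The spike simulation, adding a spurious point in pseudo-k-space so that a stripe pattern appears in image space, can be sketched in 2D as follows (a minimal illustration; the offset and amplitude conventions are our own, not the paper's exact settings):

```python
import numpy as np

def add_spike(image, offset=(5, 3), amplitude=None):
    """Insert a spurious spike at a k-space location offset from the
    centre. The offset sets the stripe frequency and angle; the
    amplitude sets its brightness (here defined relative to the DC peak)."""
    k = np.fft.fftshift(np.fft.fft2(image))
    if amplitude is None:
        amplitude = 0.5 * np.abs(k).max()
    cy, cx = np.array(k.shape) // 2
    k[cy + offset[0], cx + offset[1]] += amplitude
    return np.abs(np.fft.ifft2(np.fft.ifftshift(k)))
```

Varying `offset` changes the stripe direction and spacing, which reproduces the differently oriented artifacts seen across simulated slices.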

V. DISCUSSION AND CONCLUSION
While true 3D images have many advantages, such as higher SNR and ease of image reformatting and registration, the lengthy acquisitions required for 3D imaging are vulnerable to subject motion. Hence, multiplanar 2D acquisitions are the norm in pediatric imaging, particularly for T2-weighted images. In this work, the process of restoring missing slices due to image artifacts or 2D scanning schemes was first modeled as an image imputation problem. With this novel data representation, we employed the context encoder [38], initially designed to solve the image inpainting problem, for our specific MRI restoration tasks (imputation, super-resolution, and artifact removal). However, the original context encoder is not capable of retrieving the missing slices due to a lack of constraints. Therefore, we introduced edges as a constraint to boost network performance. We hypothesized that an intact edge map could enhance overall system performance due to its intrinsic high-frequency information and its ability to constrain contrast matching. The performance of our proposed edge-guided image restoration network supports this hypothesis, demonstrating higher PSNR and SSIM than the widely used context encoder, which was initially designed for generating the contents of an arbitrary image region conditioned on its surroundings [27], [38], [56].
One of the differences between our work and Nazeri's method [37] is that our inpainting model is rooted in the context encoder, and we predict only the masked region, whereas the network in [37] predicts the whole image, modifying physically acquired data. This difference is also embedded in the loss functions: the reconstruction loss in [37] is only normalized by the mask size, while the context encoder, as well as our work, directly calculates the reconstruction loss within the masked region. Despite excluding the primarily acquired data from image optimization, we did not observe any boundary artifacts.
In image imputation, the image restored using NN showed stairway effects due to duplication of neighboring slices (Fig. 4(a)). Although cubic interpolation outperformed NN (in PSNR, SSIM, and aesthetics), the smoothness of the interpolated regions led to both underestimation and overestimation of the target voxels. The context encoder fitted the regional voxel distributions in an image, producing better PSNR and SSIM than either NN or cubic interpolation. However, due to the lack of structural constraints, the masked region could not be fully recovered, and images were left with abnormal texture and boundary effects at the mask sizes used in our study. By conditioning on edges, EG-GAN improved contrast generalization and alleviated mode collapse near the boundaries [57], providing quantitatively and aesthetically superior results.
While EG-GAN outperformed all methods in both the imputation (Fig. 4 and Table 1) and super-resolution (Fig. 5 and Table 2) tasks, EG-GAN exhibited half the standard deviation in imputation compared with super-resolution.
In imputation, uncorrupted adjacent slices were used to train the network, while in super-resolution adjacent slices suffer from through-plane blurring that may compromise edge reconstruction. In addition, both the context encoder and DCSRN-GAN [38], [53] enforce voxel-wise intensity similarity between synthesized and real images (DCSRN-GAN yields the same SSIM as EG-GAN in Table 2); however, the structure of the image content, such as complex anatomical details, is not emphasized. This is partially because the context encoder aims at reducing the reconstruction loss ($\ell_2$ loss) and lacks constraints to prevent model collapse in prediction. In contrast, mDCSRN combines a voxel-wise reconstruction loss with an adversarial loss to stabilize the network and further improve structural similarity. However, an adversarial loss matching the generated and real distributions may produce artificial structures [58], which causes the higher standard deviation in SSIM compared with EG-GAN. Close inspection of Fig. 5 demonstrates superior image sharpness and structural texture in EG-GAN compared with DCSRN-GAN. However, comparing Fig. 5(d) and (e), our method does not correctly retrieve the shape of the small vessel in the second zoom-in figure (the location is highlighted in the coronal view of (a)).
This is because most of the vessel falls within the masked region, resulting in a failure to connect the correct contour of the vessel. Note that none of the other compared methods restores the exact shape of the vessel either, due to their intrinsic properties; yet this defect in restoring small anatomical structures could be alleviated by adjusting the mask size or randomly shuffling the starting masked slice.
Other than task type, mask size also plays a critical role in network performance. Both SSIM and PSNR decrease by up to 15% when the mask size increases from 20% to 50% [59]. Our EG-GAN method collapsed when the mask size increased from 75% to 80%, primarily because edge continuity could not be unambiguously restored. We postulate that exploiting the information provided by two orthogonal 2D scans (commonly acquired in clinical practice) could further improve the performance of EG-GAN. There are ongoing efforts to combine orthogonal 2D images using Gaussian mixture models or various priors [60], [61]; those methods could potentially be embedded in EG-GAN when combining two orthogonal 2D scans in future work. For example, slice-to-volume image registration usually suffers from a lack of geometric information and may require an image fusion process when handling multiple slices [62]. Additionally, the performance of directly registering 2D images to a 3D volume might be affected by the viewpoint angle: although registration can be performed, differing 2D feature points might cause inaccurate results [63]. Therefore, building a superior isotropic 3D volume could benefit downstream registration and multi-site image group analyses.
Image artifacts such as zipper and gradient spike artifacts are disruptive to the clinical workflow because they are sporadic and do not always resolve when a sequence is repeated. In this paper, we demonstrate that EG-GAN offers a simple method for restoring images from these artifacts that could be easily implemented on clinical workstations. As zipper artifacts usually affect only a few lines or columns of the image, our method can correct them with high fidelity. Spike artifacts, whose mask pattern can be irregular, are more challenging because they affect the entire image (Fig. 7) and the mask percentage is very close to the upper limit of restoration (75%). While the spike artifact exhibited in Fig. 7 could be fully restored, satisfactory results cannot be guaranteed when the spike artifact affects a larger fraction of the image. The same applies to zipper artifacts if the "zipper" affects more than 75% of the slices, the maximum masked area that our model can restore. Note that this mask size should be counted locally rather than globally: even if only five or six rows/columns of the entire image are corrupted, the masked ratio could be 5/6 or 6/7 locally, prohibiting full image recovery. This inference is rooted in the model we presented: at least a quarter of the adjacent slices are needed to recover the missing edges, providing a complete and effective constraint for the contrast-matching network to complete the image with enhanced consistency.
In addition, artifact correction requires mask generation, which may be challenging for artifacts such as motion and cardiac tagging, whose masks can be highly irregular. An automatic and accurate method may therefore be needed to extract the mask of the areas to be restored. Another limitation of artifact correction with our method is that lesions lying exclusively within the masked area would be painted over rather than recovered.
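One conceivable form such automatic mask extraction could take, sketched here only for the simple column-band (zipper-like) case: flag columns whose mean intensity is a robust outlier relative to the rest of the slice. The robust-z detector, its threshold, and the simulated artifact are all our assumptions, not part of the presented method:

```python
import numpy as np

def zipper_mask(image: np.ndarray, z_thresh: float = 4.0) -> np.ndarray:
    """Hypothetical automatic mask extraction for zipper-like artifacts.

    Flags columns whose mean intensity is a robust-z outlier relative to the
    other columns; irregular masks (motion, spikes) would need a more
    sophisticated detector.
    """
    col_means = image.mean(axis=0)
    med = np.median(col_means)
    mad = np.median(np.abs(col_means - med)) + 1e-12  # avoid division by zero
    z = 0.6745 * (col_means - med) / mad  # robust z-score
    mask = np.zeros_like(image, dtype=bool)
    mask[:, np.abs(z) > z_thresh] = True
    return mask

rng = np.random.default_rng(0)
img = rng.normal(0.5, 0.01, size=(128, 128))
img[:, 60:63] += 0.5  # simulated bright zipper band
print(zipper_mask(img)[:, 60:63].all())  # the simulated band is flagged
```

A detector of this kind would feed its mask directly into the edge-connection stage; validating it on real artifact data remains future work.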
As discussed in several places, edge completion plays a critical role in improving the image restoration results. However, it prevents our network from being trained in a fully end-to-end fashion: we had to visually verify that the edge-connection stage was well trained before starting contrast completion. In practice, using task arrays on GPUs could make the proposed network trainable end-to-end, but an intermediate check is still recommended. Secondly, we used a fairly old and primitive technique for constructing edge maps (Canny edge detection) [36]. Learning-based methods, such as holistically nested edge detection, could be combined with the Canny detector in future work [64], [65] to promote more efficient learning. In addition, more robust edge detection and completion could potentially overcome the limitations encountered with the 3D patch implementation.
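For orientation, the gradient-magnitude computation at the heart of Canny edge detection can be sketched in a few lines of NumPy. A full Canny detector additionally applies Gaussian smoothing, non-maximum suppression, and hysteresis thresholding; the single threshold and toy image below are arbitrary choices for illustration:

```python
import numpy as np

def edge_map(slice_2d: np.ndarray, thresh: float) -> np.ndarray:
    """Sobel gradient-magnitude edge map: the core step Canny builds on."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    pad = np.pad(slice_2d.astype(float), 1, mode="edge")
    gx = np.zeros_like(slice_2d, dtype=float)
    gy = np.zeros_like(slice_2d, dtype=float)
    for i in range(3):  # accumulate the 3x3 cross-correlation
        for j in range(3):
            win = pad[i:i + slice_2d.shape[0], j:j + slice_2d.shape[1]]
            gx += kx[i, j] * win
            gy += ky[i, j] * win
    return np.hypot(gx, gy) > thresh

# A vertical step edge is detected along the columns flanking the step.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
em = edge_map(img, thresh=1.0)
print(em[:, 3].all() and em[:, 4].all())  # True
```

Learning-based detectors such as holistically nested edge detection replace this fixed gradient operator with learned multi-scale filters, which is what motivates combining them with Canny in future work.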
In summary, our proposed edge-guided image restoration network decouples the recovery of the high- and low-frequency components of the missing information to generate coherent anatomical details from adjacent slices. The network effectively restores image detail missing either due to 2D scanning schemes or due to image artifacts. We propose that EG-GAN could improve the utilization of clinical 2D scans for 3D atlas generation and big-data comparative studies of brain morphometry.

He is currently a Research Assistant with the Saban Research Institute, Children's Hospital Los Angeles, Los Angeles. His research interests include machine learning, signal and image processing, and magnetic resonance imaging physics.

VOLUME 8, 2020

KANGNING ZHANG received the B.S. degree in electrical engineering from Beijing Jiaotong University, Beijing, China, in 2015, and the M.S. degree in electrical engineering from the University of Southern California, Los Angeles, CA, USA, in 2018. He is currently pursuing the Ph.D. degree in electrical and computer engineering with the University of California, Davis. His research interests include image signal processing and compressive sensing.
NATASHA LEPORE received the B.Sc. degree in physics and mathematics from the University of Montreal, the master's degree in applied mathematics and general relativity from the University of Cambridge, and the Ph.D. degree in theoretical physics from Harvard University.
She was a Postdoctoral Fellow in medical imaging (with Prof. Paul Thompson) at the Laboratory of Neuroimaging, UCLA. Since 2009, she has been a Faculty Member in radiology and biomedical engineering with Children's Hospital Los Angeles and the University of Southern California. She is currently the Director of the Computational Imaging of Brain Organization Research Group (CIBORG) and specializes in mathematical and numerical methods to study brain anatomy and function through magnetic resonance imaging. These methods are applied to furthering the understanding of different neurological disorders and of normal and abnormal brain development.

Dr. Wood performed his Residency and Fellowship in pediatric cardiology at Yale. He joined Children's Hospital Los Angeles/USC Keck School of Medicine in 1999, studying wavelet-packet denoising applications in MRI. He then studied the cardiovascular consequences of hemoglobinopathies for almost a decade. He was the Principal Investigator of the NIH-sponsored Early Detection of Iron Cardiomyopathy Trial, whose goal is to identify earlier markers of cardiac dysfunction. He is currently the Director of Cardiovascular MRI and specializes in the MRI assessment of congenital heart disease and the noninvasive assessment of iron burden with MRI. He is one of the pioneers of MRI-based cardiac and liver iron measurements and also studies oral chelation strategies in animals and humans. He has funded projects examining pancreatic and pituitary iron burden with MRI and their functional correlates. He received an ARRA Challenge Grant to study the role of iron overload and other factors in sickle cell vasculopathy, and he is exploring the links between abnormal red cell mechanics and vascular dynamics in the hemoglobinopathies. His research interests include the relationship between cerebrovascular reserve, anemia, and white matter loss in chronic anemia patients at particularly high risk for silent stroke.
He also explores white matter quantification and super-resolution imaging using machine learning methods.
Dr. Wood's memberships include the American Medical Association, the American Academy of Pediatrics, the Society for Cardiovascular Magnetic Resonance, and the International Society for Magnetic Resonance in Medicine. His awards and honors include Tau Beta Pi (National Engineering Honor Society) in 1982, Alpha Omega Alpha in 1993, the Alfred F. Towsley Award for Pediatrics in 1994, RSNA Scholar in 2001, and the Russell Smith Award for Innovation in Pediatric Research in 2009.