Stenosis Detection From Time-of-Flight Magnetic Resonance Angiography via Deep Learning 3D Squeeze and Excitation Residual Networks

Intracranial artery stenosis is an important public health concern internationally because it is one of the major causes of ischemic stroke. In this study, we aim to provide a computer-aided diagnosis algorithm that automatically distinguishes Internal Carotid Artery (ICA) stenosis from normal arteries, reducing the labor involved in stenosis detection. We propose a novel deep learning detection model for Time-of-Flight Magnetic Resonance Angiography (TOF-MRA) based on 3D Squeeze and Excitation Residual Networks (SE-ResNet). Pre-processing of TOF-MRA, data augmentation, training of the 3D SE-ResNet, and testing using patch-based and patient-based methods with cross-validation are described. The proposed network was evaluated on a database consisting of 50 normal cases (ICA-N) and 41 stenosis cases (ICA-S) with a stenosis grade above 30%. All 41 ICA-S cases were categorized by expert radiologists according to the diameter (D_stenosis) of the artery at the site of the most severe stenosis, whereas percent stenosis was measured by the Warfarin-Aspirin Symptomatic Intracranial Disease (WASID) method. The proposed 3D SE-ResNet was further compared with more conventional networks, including 3D ResNet and 3D VGG. The proposed network detected stenosis with an overall Area Under the Curve (AUC) and accuracy of 0.947 and 91.0% for patch-based testing and 0.884 and 81.0% for patient-based testing, respectively. In addition, the proposed 3D SE-ResNet outperformed the conventional 3D ResNet and 3D VGG, improving the AUC by 0.053 and 0.095 for patch-based testing and by 0.053 and 0.065 for patient-based testing, respectively.


I. INTRODUCTION
Intracranial atherosclerotic disease (ICAD) is one of the most common fatal diseases worldwide. It occurs when brain arteries become blocked by deposits such as cholesterol and fat, leading to reduced blood flow in the brain. ICAD is closely related to the causes and risk factors of ischemic stroke [1], [2]. Moreover, several studies have demonstrated that internal carotid artery (ICA) stenosis with a grade above 70% caused by atherosclerosis can lead to ischemic cerebrovascular events [3], [4]; thus, prevention of major vascular events in patients with symptomatic intracranial stenosis is important [5]. Time-of-flight magnetic resonance angiography (TOF-MRA) is a non-invasive imaging modality used to screen the network of arteries in the brain, and hence to diagnose stenosis [6]-[8]. However, diagnosing stenosis from TOF-MRA is challenging for radiologists because examining large numbers of brain images is time-consuming and error-prone. An automated diagnostic tool for stenosis identification from TOF-MRA is therefore needed to support radiologists, either directly or as a second clinical assessment.
The associate editor coordinating the review of this manuscript and approving it for publication was Gina Tourassi.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/
In previous studies, there have been many attempts to detect stenosis in the coronary or cerebral arteries using automated or semi-automated methods. Kang et al. developed an unsupervised computer method for coronary artery stenosis detection using three-dimensional coronary computed tomographic angiography (CTA) [9]. They achieved a sensitivity, specificity, and area under the curve (AUC) of 93%, 80%, and 0.87, respectively, for stenosis detection (grade ≥ 25%). In [10], a multidetector method was used to detect stenosis using 64-row coronary CTA. Stenosis detection showed a sensitivity and specificity of 84% and 92% on a per-vessel basis and 94% and 83% on a per-patient basis, respectively (grade ≥ 70%). In 2018, Zreik et al. proposed a multi-scale deep learning convolutional neural network to detect stenosis at the left ventricle myocardium in rest coronary CTA [11]. Detection performance showed an accuracy of 71% and an AUC of 0.74 on a per-patient basis (grade ≥ 50%). However, few studies have addressed cerebral artery stenosis detection. For example, Bucek et al. [12] proposed a method for the quantification of ICA stenosis using CTA in a retrospective pilot trial. These studies were done using CTA images. In clinical practice, the Warfarin-Aspirin Symptomatic Intracranial Disease (WASID) method [13], in which percent stenosis is calculated manually, has been used since 2000.
Deep learning (DL) has been used to solve various types of imaging problems. Notably, various DL algorithms have been widely used for image recognition [14]. Recently, DL has been applied to several medical imaging tasks, including detection of malignant pulmonary nodules [15], detection of cerebral aneurysms [16], segmentation of skin lesions [17], MR artifact denoising [18], and MR image reconstruction [19]. A survey on DL in medical image analysis has also been published [20]. In addition, several studies have adapted 2D DL networks to 3D medical image analysis [21] and detection [22]. However, to the best of our knowledge, a diagnostic tool for cerebral stenosis detection via DL using TOF-MRA has not been studied.
Here, we developed a 3D DL algorithm for detecting ICA stenosis from TOF-MRA. ResNets (Residual Networks) have been used for many medical imaging applications [15], [20]. We propose a 3D Squeeze and Excitation Residual Network (SE-ResNet), which is inspired by the 2D SE-ResNet [21]. The 2D SE-ResNet, which combines residual networks with the Squeeze and Excitation Block (SE Block), is one of the state-of-the-art networks for object recognition [21]. It has been shown to improve the representational power of a network by modeling the interdependencies between the channels of its convolutional features. In this study, we trained the 3D SE-ResNet with patched sub-volumes of the single-channel TOF-MRA (from 160 × 160 × 96 × 1 to 64 × 64 × 64 × 1) and tested it via patch-based and patient-based methods. In addition, we compared the performance of our 3D SE-ResNet against well-known methods (ResNet [22] and VGGNet [23]) using the same dataset and under the same experimental conditions. The remainder of this paper is organized as follows. First, we introduce the dataset and its pre-processing. Second, the details of the proposed 3D SE-ResNet method are described. Then, we present and evaluate the results of the proposed model for patch-based and patient-based tests in comparison with recent state-of-the-art approaches. Finally, we present the conclusions of our study.

II. METHODS
A. OVERVIEW OF THE PROPOSED NETWORK
The overall schematic diagram of the proposed deep learning 3D SE-ResNet for stenosis detection is shown in Fig. 1. The proposed work consists of four main stages: pre-processing, data augmentation, training of 3D SE-ResNet, and testing using patch-based and patient-based methods.

B. TOF-MRA DATASET
In this study, the TOF-MRA dataset of 91 patients was acquired at Seoul National University Hospital, Republic of Korea, using a 3.0-T MRI unit (MAGNETOM Verio, Siemens Healthcare, Erlangen, Germany) with multi-channel receivers and the following imaging parameters: matrix size, 512 × 416; field of view, 178 × 220 mm; pixel spacing, 0.43 mm; slice thickness, 0.6 mm; number of slices, 124; repetition time, 23 msec; and echo time, 4.2 msec. The dataset characteristics are shown in Table 1; the dataset comprises 50 Internal Carotid Artery Normal (ICA-N) cases and 41 Internal Carotid Artery Stenosis (ICA-S) cases with a grade ≥ 30%. This grade threshold was determined with reference to the risk of stroke in patients with asymptomatic internal carotid artery stenosis [2]. The characteristics of the individual stenoses are shown in Table 2, which includes a total of 51 stenoses from all patients. The stenoses were categorized according to the diameter (D_stenosis) of the artery at the site of the most severe stenosis [13], giving 7 cases with ≤ 1.5 mm, 18 cases in the range 1.5-1.9 mm, 19 cases in the range 2.0-2.4 mm, and 7 cases with ≥ 2.5 mm. This categorization was done to show the capability of the proposed 3D SE-ResNet to distinguish stenoses of various diameters from normal patches. To diagnose intracranial artery stenosis in the TOF images, an expert radiologist (with over 10 years of experience) evaluated the images using the following steps: 1) transform the TOF images into a Maximum Intensity Projection (MIP) image; 2) confirm in the TOF images each stenosis found in the MIP image, to distinguish it from an irregularity or artifact; 3) use the Warfarin-Aspirin Symptomatic Intracranial Disease (WASID) method [13] to calculate the percent stenosis. The assessment was then re-evaluated by another radiologist.
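As an illustration of step 3, the WASID measurement [13] expresses percent stenosis as the reduction of the arterial diameter at the stenotic site relative to a normal reference diameter. The sketch below is a minimal example under that definition. The function name and the argument `d_normal` (diameter of the proximal normal artery segment) are our own notation for illustration, and the diameters in the usage line are hypothetical values, not measurements from the dataset.

```python
def wasid_percent_stenosis(d_stenosis, d_normal):
    """Percent stenosis = (1 - D_stenosis / D_normal) * 100.

    d_stenosis: diameter (mm) at the site of the most severe stenosis.
    d_normal:   diameter (mm) of the normal reference segment
                (hypothetical argument name used for this sketch).
    """
    if d_normal <= 0 or d_stenosis < 0:
        raise ValueError("d_normal must be positive and d_stenosis non-negative")
    return (1.0 - d_stenosis / d_normal) * 100.0


# Hypothetical diameters, chosen only for illustration.
print(wasid_percent_stenosis(1.5, 3.0))  # 50.0
```

With this definition, a stenosis diameter equal to the reference diameter gives 0%, and a fully occluded artery gives 100%.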

C. DATA PREPROCESSING
Because imaging conditions differed between subjects (e.g., discrepancies in pixel intensity and in image location along the z-axis), pre-processing was used to align the images in the database, similar to the method introduced in a previous study [24]. The following pre-processing steps were performed. First, the 3D TOF images were interpolated to 0.5 mm isotropic resolution, and signal intensities were standardized (mean of zero and standard deviation of one). This step aligned the data by bringing the voxel sizes and intensities of all subjects to a common scale. Second, a slab of 48 mm thickness was extracted, extending from 38 mm inferior to 10 mm superior to the middle cerebral artery (MCA), which served as the reference point. This region was selected because most TOF exams include it, which provides robustness to different scanning conditions. Third, the outer regions were cropped to minimize the effects of non-ICA structures, such as high signals from the skull or eye regions. After this pre-processing, the final data had a standardized matrix size of 160 × 160 × 96, with a pixel resolution of 0.5 mm and a slice thickness of 0.5 mm.
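The intensity standardization and slab extraction steps can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the volume has already been interpolated to 0.5 mm isotropic voxels, that axis 2 of the array is the z-axis with the slice index increasing from inferior to superior, and that the MCA reference slice index (`mca_slice`, a hypothetical input) is known. The function names are ours.

```python
import numpy as np

SPACING_MM = 0.5  # isotropic voxel size after interpolation (from the text)


def standardize(volume):
    """Rescale intensities to zero mean and unit standard deviation."""
    volume = volume.astype(np.float64)
    return (volume - volume.mean()) / volume.std()


def extract_slab(volume, mca_slice, inferior_mm=38.0, superior_mm=10.0):
    """Keep the slices from 38 mm below to 10 mm above the MCA reference slice.

    Assumes axis 2 is z and that the index grows from inferior to superior.
    """
    start = mca_slice - int(round(inferior_mm / SPACING_MM))
    stop = mca_slice + int(round(superior_mm / SPACING_MM))
    if start < 0 or stop > volume.shape[2]:
        raise ValueError("slab extends beyond the volume")
    return volume[:, :, start:stop]


# Synthetic stand-in for an interpolated TOF volume (not real data).
rng = np.random.default_rng(0)
vol = rng.normal(100.0, 20.0, size=(32, 32, 120))
slab = extract_slab(standardize(vol), mca_slice=80)
print(slab.shape)  # (32, 32, 96)
```

The 48 mm slab at 0.5 mm slice thickness yields the 96 slices of the final 160 × 160 × 96 matrix.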

D. PATCHED DATASET AND AUGMENTATION
K-fold cross-validation (K = 4) was used to better estimate the generalization error given our finite dataset [21]. In this strategy, the data are divided into K groups (four in our case), and the evaluation is carried out once per group. In each round, one group (i.e., 25% of the whole data) is used for testing, while the remaining groups (i.e., 75%) are used for training. Training was therefore performed four times, and every sample appears in a test set exactly once. Table 3 shows the original training and testing patches from our dataset [25]. Data augmentation was performed to increase the training data using the following steps.
Each subject's TOF-MRA volume was divided into 32 smaller patches of 64 × 64 × 64 voxels with a stride of 32. This patch size was chosen so that each patch is small yet can fully contain a stenosis region. For patch-based training and testing, only the patches that contained stenosis in their arteries, as labeled by radiologists, were manually selected from the stenosis subjects. By contrast, all patches from the TOF-MRA volume were used for the patient-based test.
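The count of 32 patches follows directly from the volume size, patch size, and stride: a 160 × 160 × 96 volume admits 4 × 4 × 2 patch positions. The short sketch below enumerates the patch origins under the assumption of a regular sliding-window grid without padding; the function name is ours.

```python
from itertools import product


def patch_origins(volume_shape=(160, 160, 96), patch=64, stride=32):
    """Return the corner index of every patch on a regular sliding-window grid."""
    axes = [range(0, dim - patch + 1, stride) for dim in volume_shape]
    return list(product(*axes))


origins = patch_origins()
print(len(origins))  # 32
print(origins[0], origins[-1])  # (0, 0, 0) (96, 96, 32)
```

Because the stride is half the patch size, neighboring patches overlap by 32 voxels, so a stenosis near a patch border is still covered by an adjacent patch.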
In addition, we increased the number of stenosis patches for patch-based training. Because there were fewer stenosis patches than normal patches, the stenosis patches were augmented to 1,184 (or 1,248) patches, matching the number of normal patches in the corresponding fold shown in Table 3. The 3D patches were augmented by shifting along the x- and y-axes and by flipping about the z-axis (right to left). The shift (within ±6.4 mm) and the flip were drawn as uniformly distributed random variables within the defined intervals [16], [26]. Because the stenoses were small, augmentation operations that involve interpolation were not performed.
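A minimal sketch of this interpolation-free augmentation is given below. Several details are assumptions made for illustration rather than taken from the text: shifts are restricted to whole voxels (±6.4 mm at 0.5 mm spacing gives at most 12 voxels), vacated border voxels are filled with zeros, array axis 0 is taken to be the left-right direction, and the flip is applied with probability 0.5. The function names are ours.

```python
import numpy as np


def shift_xy(patch, dx, dy):
    """Shift by whole voxels along axis 0 (x) and axis 1 (y), zero-filling the border."""
    nx, ny = patch.shape[0], patch.shape[1]
    if abs(dx) >= nx or abs(dy) >= ny:
        raise ValueError("shift must be smaller than the patch size")
    out = np.zeros_like(patch)
    src_x = slice(max(0, -dx), min(nx, nx - dx))
    dst_x = slice(max(0, dx), min(nx, nx + dx))
    src_y = slice(max(0, -dy), min(ny, ny - dy))
    dst_y = slice(max(0, dy), min(ny, ny + dy))
    out[dst_x, dst_y] = patch[src_x, src_y]
    return out


def augment(patch, rng, max_shift_mm=6.4, spacing_mm=0.5):
    """Random integer-voxel shift in x and y, then a random left-right flip."""
    max_vox = int(max_shift_mm / spacing_mm)  # 12 voxels for the values in the text
    dx, dy = (int(v) for v in rng.integers(-max_vox, max_vox + 1, size=2))
    out = shift_xy(patch, dx, dy)
    if rng.random() < 0.5:  # assumed flip probability
        out = out[::-1, :, :]  # assumes axis 0 is the left-right direction
    return out


rng = np.random.default_rng(0)
augmented = augment(np.ones((64, 64, 64)), rng)
print(augmented.shape)  # (64, 64, 64)
```

Since both operations only move or mirror existing voxels, no voxel intensity is resampled, which is consistent with avoiding interpolation for small stenoses.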
Finally, a uniformly distributed (class-balanced) dataset was generated for each fold using the above augmentation methods to train the proposed stenosis detection network. Moreover, all training patches were randomly shuffled to avoid detection bias.
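The four-fold cross-validation described at the start of this subsection can be sketched as follows. This is an illustrative split only: grouping by patient identifier, the shuffling seed, and the absence of class stratification are our assumptions, since the text does not specify how the folds were drawn.

```python
import random


def four_fold_splits(patient_ids, k=4, seed=0):
    """Yield (train, test) lists so that each patient is tested exactly once."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    folds = [ids[i::k] for i in range(k)]  # k nearly equal groups
    for i in range(k):
        test = folds[i]
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        yield train, test


# 91 patients, identified here by hypothetical integer ids 0..90.
for fold, (train, test) in enumerate(four_fold_splits(range(91)), start=1):
    print(fold, len(train), len(test))
```

With 91 patients, three folds hold 23 test patients and one holds 22, which is close to the 25%/75% split stated above.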

E. PROPOSED DEEP LEARNING DETECTION METHOD
We expanded the 2D SE-ResNet [21] to our customized 3D SE-ResNet, as shown in Fig. 2. The SE Block [21] was proposed to increase the sensitivity to informative features that can be exploited by subsequent transformations and to suppress less useful ones. Moreover, we tuned the SE-ResNet parameters (e.g., number of layers, filter sizes, etc.) for suitable performance in 3D. Our proposed 3D SE-ResNet scans the ICA and determines the probability of stenosis using the sigmoid function. The 3D SE-ResNet is the main module of the architecture, and its submodules (Plain Block, SE Block, and Residual Blocks) are described below.

1) PLAIN BLOCK
As shown in Fig. 2(b), the Plain Block was built upon a convolution operation using 3 × 3 × 3 filters. The outputs of the convolution operator, $O = [o_1, o_2, \ldots, o_C]$, can be expressed as
$$o_c = \delta\left(\mathrm{BN}\left(k_c * I\right)\right)$$
Here, $*$ denotes 3D convolution, and the ReLU operator ($\delta$) and batch normalization (BN) were used [27]. $I = [i_1, i_2, \ldots, i_C]$ refers to the input of the Plain Block module and $K = [k_1, k_2, \ldots, k_C]$ is the learned set of filter kernels, where $k_c$ denotes the parameters of the c-th filter.

2) SQUEEZE-AND-EXCITATION (SE) BLOCK
In our proposed model, the SE Block (Fig. 2(c)), which consists of two operations, Squeeze and Excitation, was designed to recalibrate contextual information [21]. The Squeeze operation uses global average pooling to squeeze global spatial information into a channel descriptor [21]. A statistic $S = [s_1, s_2, \ldots, s_C] \in \mathbb{R}^{C}$ is generated by shrinking $O$ through its spatial dimensions $X \times Y \times Z$, where the c-th element of $S$ is calculated by
$$s_c = \frac{1}{X \times Y \times Z} \sum_{x=1}^{X} \sum_{y=1}^{Y} \sum_{z=1}^{Z} o_c(x, y, z)$$
The Excitation operation aims to fully capture channel-wise dependencies using the information aggregated in the Squeeze operation. A fully connected neural network with two hidden layers was used to meet two criteria: learning a nonlinear interaction between channels and learning a non-mutually-exclusive relationship between channels. The output of this fully connected neural network is $\tilde{S}$, where
$$\tilde{S} = \sigma\left(W_2\, \delta\left(W_1 S\right)\right)$$
Here, $\sigma$ refers to the sigmoid function, $\delta$ is the ReLU operator, and $W_1 \in \mathbb{R}^{(C/r) \times C}$ and $W_2 \in \mathbb{R}^{C \times (C/r)}$ are the weights of the two fully connected layers with a reduction ratio $r$ (set to 16 by default [21]). Finally, the output $\tilde{O} = [\tilde{o}_1, \tilde{o}_2, \ldots, \tilde{o}_C]$ is obtained by channel-wise multiplication between the scalar $\tilde{s}_c$ and the feature map $o_c$ as follows:
$$\tilde{o}_c = \tilde{s}_c \cdot o_c$$
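The Squeeze, Excitation, and rescaling steps can be traced numerically with the sketch below. It is a NumPy illustration of the operations for a single feature volume, not the Keras implementation used in this work. It assumes a channels-last layout (X, Y, Z, C) and omits bias terms, and the random weights stand in for learned parameters.

```python
import numpy as np


def se_block(o, w1, w2):
    """Apply Squeeze and Excitation to features o of shape (X, Y, Z, C).

    w1 has shape (C // r, C) and w2 has shape (C, C // r); biases are omitted.
    """
    s = o.mean(axis=(0, 1, 2))                       # squeeze: global average pooling
    hidden = np.maximum(w1 @ s, 0.0)                 # first FC layer followed by ReLU
    s_tilde = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # second FC layer followed by sigmoid
    return o * s_tilde, s_tilde                      # rescale every channel by its scalar


rng = np.random.default_rng(0)
C, r = 32, 16                                        # reduction ratio r = 16
o = rng.random((8, 8, 8, C))                         # synthetic feature maps
w1 = rng.normal(scale=0.1, size=(C // r, C))         # placeholder weights
w2 = rng.normal(scale=0.1, size=(C, C // r))
o_tilde, s_tilde = se_block(o, w1, w2)
print(o_tilde.shape, s_tilde.shape)  # (8, 8, 8, 32) (32,)
```

Each channel is multiplied by one scalar between 0 and 1, so the block can only attenuate or preserve a feature map; the spatial pattern within a channel is unchanged.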

3) RESIDUAL BLOCK
Residual Blocks [22] (Fig. 2(d)) were added consecutively to provide shortcut connections for efficient gradient propagation when training the deep convolutional neural network. The two blocks generally used in Residual Blocks were implemented (i.e., the Identity Block and the Projection Block). First, the Identity Block was used when the input and output are of the same dimensions, and its output can be written as
$$O_{\mathrm{id}} = \mathcal{F}(I_0) + I_0$$
where $I_0$ is the input of the Residual Block module and $\mathcal{F}(\cdot)$ denotes the residual mapping computed by the layers of the block. Second, the Projection Block [22] was used to match the input and output dimensions when the dimensions decrease. The shortcuts of the Projection Block go across feature maps with a stride of 2 in each Plain Block. Using a linear projection $W_s$, the Projection Block output can be represented as
$$O_{\mathrm{proj}} = \mathcal{F}(I_0) + W_s I_0$$

4) 3D SE-ResNet
Based on the residual network [22], the Plain Block (Fig. 2(b)) reduces the internal covariate shift in the batch normalization phase, and the SE Block (Fig. 2(c)) recalibrates contextual information in the squeeze and excitation phases. All global spatial information that has passed through the Plain Blocks, Identity Blocks, and Projection Blocks is squeezed into a channel descriptor by global average pooling. Finally, the sigmoid function determines whether the input is ICA stenosis or ICA normal. Because of the complexity of 3D data, a deeper network may overfit. We therefore tuned the depth of the proposed 3D SE-ResNet experimentally according to the AUC.

F. TRAINING
The proposed 3D SE-ResNet was trained and tested using stenosis patches of different shapes and normal patches. Network training was performed using only the labeled patch-based data. In this study, we randomly divided all TOF-MRA 3D patches into training and test sets with a ratio of 80% to 20%, respectively. Of the training set, we randomly selected 10% of the data as a validation set in order to optimize the proposed deep learning network. Data augmentation, including the flipping and shifting processes, was applied only to the training data. As mentioned, we performed k-fold cross-validation (k = 4) for network optimization and overall evaluation. Optimization was performed using the Adam optimizer. The initial learning rate was set to 0.001 and divided by a factor of 10 every 20 epochs. Our proposed 3D SE-ResNet was trained from scratch with patched 3D TOF-MRA data for 50 epochs, with the network parameters updated by backpropagation during training.
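The step decay of the learning rate can be written as a small function. This is a sketch of the schedule as described, assuming zero-based epoch numbering; the function and parameter names are ours.

```python
def learning_rate(epoch, initial_lr=1e-3, drop_factor=10.0, epochs_per_drop=20):
    """Learning rate for a zero-based epoch: divided by 10 every 20 epochs."""
    return initial_lr / (drop_factor ** (epoch // epochs_per_drop))


# Learning rate at the start of each 20-epoch stage of a 50-epoch run.
for epoch in (0, 20, 40):
    print(epoch, learning_rate(epoch))
```

Under these assumptions, epochs 0-19 use 0.001, epochs 20-39 use 0.0001, and epochs 40-49 use 0.00001. In Keras, a function of this form can be supplied to the `LearningRateScheduler` callback.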
Each training fold took approximately 30 hours for 50 epochs on a single NVIDIA GeForce GTX 1080 Ti GPU. This work was implemented using the Keras framework with a TensorFlow backend, CUDA 8, and cuDNN 5.1 on Ubuntu 14.04. The code is available at https://github.com/hjdata11/3D-SEResNet.

G. TESTING
In this study, we performed two kinds of test evaluations: a patch-based test and a patient-based test (Table 3). First, in the patch-based test, the prediction of the proposed network for each individual patch was compared against the ground-truth diagnosis by the radiologists. This determines the ability of the network to detect stenosis in local regions. A threshold probability value was selected to distinguish between stenosis and normal. Second, we performed a patient-based test, in which the performance was evaluated on a per-subject basis. Here, the 15 highest patch probabilities of a subject were averaged, and a threshold value was selected to determine whether the subject had stenosis or not. The number 15 was chosen experimentally by investigating the AUC as a function of the number of patches (Fig. 3). As seen in Fig. 3, the AUC plateaued at approximately 15 patches, and further increasing the number of patches did not improve the AUC. As in the patch-based test, a threshold value for the averaged probability was selected to distinguish between stenosis and normal.
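The patient-based decision rule can be sketched as follows. The default threshold of 0.05 is the value reported for the patient-based test in the results; treating a score equal to the threshold as positive is our assumption, and the patch probabilities in the usage lines are made-up values for illustration.

```python
def patient_probability(patch_probs, top_k=15):
    """Average the top_k highest patch probabilities of one subject."""
    if not patch_probs:
        raise ValueError("at least one patch probability is required")
    top = sorted(patch_probs, reverse=True)[:top_k]
    return sum(top) / len(top)


def has_stenosis(patch_probs, threshold=0.05, top_k=15):
    """Label the subject as stenosis when the averaged probability reaches the threshold."""
    return patient_probability(patch_probs, top_k) >= threshold


# Hypothetical subjects with 32 patch probabilities each.
suspect = [0.9] * 3 + [0.02] * 29
normal = [0.01] * 32
print(has_stenosis(suspect), has_stenosis(normal))  # True False
```

Averaging only the highest-scoring patches keeps a focal stenosis from being diluted by the many normal patches of the same subject.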

H. EVALUATION METRICS
For objective evaluation, the confusion matrix and the receiver operating characteristic (ROC) curve were used [28]. Sensitivity and specificity were determined from the confusion matrix, and the AUC of the ROC was determined to measure the overall performance. Sensitivity and specificity are defined as
$$\mathrm{Sensitivity} = \frac{TP}{TP + FN}, \qquad \mathrm{Specificity} = \frac{TN}{TN + FP}$$
where TP, FN, TN, and FP denote the numbers of true positives, false negatives, true negatives, and false positives, respectively.
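As a worked example of these definitions, the sketch below computes the metrics from confusion-matrix counts. The counts used are those reported for the patient-based test (34 of 41 stenosis patients and 40 of 50 normal subjects correctly detected); accuracy is computed with the standard definition (TP + TN) / (TP + TN + FP + FN), and the function name is ours.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Patient-based counts: 34/41 stenosis and 40/50 normal correctly detected.
sens, spec, acc = confusion_metrics(tp=34, fn=7, tn=40, fp=10)
print(round(sens, 3), round(spec, 3), round(acc, 3))  # 0.829 0.8 0.813
```

The resulting accuracy of about 81% agrees with the overall patient-based accuracy reported in this paper.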

III. RESULTS
A. STENOSIS PROBABILITY
This section presents the stenosis detection performance of our proposed 3D SE-ResNet. Fig. 4 shows examples of cases for stenosis positive subjects. Fig. 4(a)

B. RESULTS OF PATCH-BASED TEST
Results of the patch-based test for all four folds are shown in Table 4 in terms of sensitivity, specificity, AUC, and accuracy. For each fold, a test dataset containing stenosis and normal cases was used to evaluate our proposed 3D SE-ResNet (refer to Table 3 for the detailed numbers). Overall, 1,741 patches were used for testing, comprising 141 stenosis and 1,600 normal patches. The probability threshold to distinguish between stenosis and normal was determined to be 0.015. The results show the robustness of our 3D SE-ResNet in detecting stenosis, with an overall average accuracy of 91.0% across the four folds in the patch-based test. The stenosis cases were correctly detected with an average sensitivity of 81.0%, whereas the normal cases were accurately detected with

C. RESULTS OF PATIENT-BASED TEST
Results for the patient-based test are shown in Table 5. The overall test dataset contained 91 patients, comprising 41 stenosis and 50 normal cases. The threshold on the averaged probability for differentiating stenosis from normal was experimentally determined to be 0.05. The results show the robustness of the ensemble approach in detecting stenosis, with an overall average accuracy of 81% across the four folds in the patient-based test. Of the 41 stenosis patients, 34 were correctly detected, whereas 40 of the 50 normal subjects were correctly identified.
To evaluate each model under the same conditions, similar hyper-parameter optimization was performed for all networks (i.e., 3D VGG (15 convolutional, 4 max pooling, 1 fully connected), 3D ResNet (17 convolutional, 1 max pooling, 1 average pooling, 1 fully connected), and the proposed 3D SE-ResNet (17 convolutional, 1 max pooling, 1 average pooling, 15 fully connected)). All networks had approximately 800,000 trainable parameters. The motivation for adding the SE Block was to improve the diagnostic performance compared to the conventional 3D VGG and 3D ResNet. Our detection model achieved an overall AUC of 0.947 in the patch-based test, an improvement of 0.053 and 0.095 over the 3D ResNet and 3D VGG methods, respectively. Our 3D SE-ResNet also outperformed the other networks in the patient-based test, with AUC improvements of 0.053 and 0.065 over the 3D ResNet and 3D VGG methods, respectively. These results indicate the capability and potential of the proposed method.

IV. DISCUSSION
Our results show the potential of the proposed 3D SE-ResNet for detecting stenosis with high accuracy and AUC. The 3D SE-ResNet adaptively emphasizes globally informative features and suppresses less useful representations by recalibrating channel-wise information through the convolutional layers. It can also be trained with small patches. In this study, the dataset was made uniformly distributed by applying data augmentation, and data augmentation was applied again during training. These augmentation approaches improved the learning of stenosis feature representations by the DL network, consistent with the conclusions in [29], [30].
The detection method seems feasible as an aid for clinical use, since less than 1 second of processing time was required to detect potential stenosis in a TOF-MRA volume. The training time per epoch and the test time per single patch of a TOF-MRA volume are shown in Table 6. The proposed 3D SE-ResNet required almost the same computation time as the 3D ResNet and 3D VGG methods, despite the presence of the Squeeze and Excitation Blocks, which contain additional fully connected layers.
Our proposed 3D SE-ResNet outperformed two of the most prominent deep learning approaches (i.e., 3D ResNet and 3D VGG) in discriminating between stenosis and normal. To show the feasibility and usefulness of our model, we conducted a patch-based test with small patches and a patient-based test with the ensemble approach. In a previous study, discrimination methods for stenosis [31] were non-automatic, whereas our proposed network can make an automatic, comprehensive diagnosis of ICA stenosis through the patch-based and patient-based tests. We showed that the patch-based probability can be estimated from a small matrix size (64 × 64 × 64). Furthermore, the patient-based probability can be calculated for each subject, and stenosis locations can be extracted from the full TOF-MRA volume. Therefore, this comprehensive judgment based on patch-based and patient-based evaluation may be useful to radiologists as a second-party evaluator.
To better understand the network's behavior, Grad-CAM (Gradient-weighted Class Activation Mapping) [32] was used, which provides visual explanations for decisions and makes them more transparent. Fig. 6 shows several Grad-CAM maps for stenosis (Fig. 6(a)) and normal (Fig. 6(b)) cases. In the stenosis case, the Grad-CAM map corresponds to the stenosis region. In the normal case, the Grad-CAM map is more dispersed than in the stenosis case. These maps suggest that the network is adequately trained and that the 3D SE-ResNet attends to the regions of importance.
The limitations of this work are as follows. Although the proposed 3D SE-ResNet detection method outperformed other deep learning approaches, it still needs improvement in reducing false-negative and false-positive cases. Fig. 7(a, b) shows false-negative cases in the patch-based test, whereas Fig. 7(A, B) shows false-negative cases in the patient-based test. Fig. 8 shows cases from the patch-based test (a, b) and the patient-based test (A, B) that were falsely predicted as positive. Distinguishing between stenosis and normal cases is very challenging because of their high similarity and irregularity, and more data are needed for improvement. Although conventional X-ray digital subtraction angiography is the gold-standard imaging technique, it cannot be used routinely because of its risk, inconvenience, and cost [33], [34]. In contrast, in daily clinical practice, TOF-MRA is a well-established technique for detecting stenosis of the intracranial arteries [6]-[8]. However, it is prone to artifacts from signal saturation and off-resonance near the skull base.
In the future, it would be worth studying the ability to detect small stenoses in the middle cerebral artery (MCA) and/or anterior cerebral artery (ACA) regions. Moreover, as a different input modality, Maximum Intensity Projection (MIP) images may increase the accuracy of stenosis detection by showing the vessels from various angles and by generating a large amount of training data. Therefore, MIP images could be used in conjunction with TOF-MRA to improve stenosis detection performance.

V. CONCLUSION
In this study, we presented a deep learning algorithm for stenosis detection via the 3D SE-ResNet network. The proposed 3D SE-ResNet utilized the Squeeze and Excitation Block to compress the important information and rescale features according to the importance, while Residual Block prevents gradient vanishing problem by skip connection. As a result, our model outperformed recent 3D deep learning VGG and ResNet approaches in detecting stenosis of patch-based and patient-based tests.