Volumetric Segmentation of Brain Regions From MRI Scans Using 3D Convolutional Neural Networks

Automated brain segmentation is an active research domain because various neurological disorders are associated with different regions of the brain, and accurate segmentation helps medical professionals in prognosis and diagnosis. Traditional techniques such as atlas-based and pattern recognition-based methods led to the development of various tools for automated brain segmentation. Recently, deep learning techniques have begun to outperform classical state-of-the-art methods and are gradually becoming more mature. Consequently, deep learning has been extensively employed for precise segmentation of brain regions because of its capability to learn the intricate features of high-dimensional data. In this work, a network for the segmentation of multiple brain regions is proposed that is based on 3D convolutional neural networks and utilizes residual learning and dilated convolution operations to efficiently learn the end-to-end mapping from MRI volumes to voxel-level brain segments. This research focuses on the segmentation of up to nine brain regions, including cerebrospinal fluid, white matter and gray matter as well as their sub-regions. Mean dice scores of 0.879 and 0.914 have been achieved for three and nine brain regions, respectively, using data from three different sources. Comparative analysis shows that our network gives better dice scores for most of the brain regions than state-of-the-art work. Moreover, the mean dice score of 0.903, obtained for the segmentation of eight brain regions with the MRBrainS18 dataset, is better than the 0.876 achieved in previous work.


I. INTRODUCTION
The domain of medical imaging analysis encompasses a variety of tasks ranging from tumor detection and tumor segmentation to organ and multi-organ segmentation. Among these, segmentation has gained a reputation as one of the leading problems in this area, as it helps in the detection, analysis and treatment of organ or tissue-related problems [1]. Brain segmentation is important for the analysis of different brain regions, as their volume, surface area and morphology have been found to be linked with various neurological disorders such as Parkinson's and Alzheimer's diseases [2]-[4]. The precise segmentation of different brain regions and tissues is usually a prerequisite for the detection and diagnosis of various neurological disorders. The importance of brain segmentation is reflected in the fact that conferences such as MICCAI [5] have held challenges on it for many years. Over the past years, many techniques have been tried in search of the most accurate brain segmentation results, and even today it remains a hard task.
The associate editor coordinating the review of this manuscript and approving it for publication was Jinjia Zhou.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Manual segmentations are very time consuming and susceptible to variability, so researchers have moved towards the development of automated techniques. Initially, atlas-based and pattern recognition-based methods were extensively used for segmentation problems, with atlas-based methods being the more common [6]-[15]. In atlas-based methods, the image at hand is mapped onto a pre-defined atlas for that specific task.
Although atlas-based methods obtain good segmentation results and are generally robust to certain anomalies, their dependence on population-specific atlases might limit their applicability to datasets that are not well represented by the atlas. Due to this, it becomes difficult to segment brain tissue/region types accurately. Moreover, atlas-based methods are suboptimal if the patient population in the dataset is significantly different from the atlas. In this case, these approaches are unable to perform well due to variability in brain morphology among patients. To overcome these limitations, pattern recognition approaches were proposed that use spatial, intensity or other information in atlas space as features for the segmentation of different regions [13], [16]-[18]. These aforementioned methods require explicit information or features for the task at hand.
In view of the limitations of manual, atlas-based and pattern recognition-based segmentation methods, researchers have moved towards using deep learning architectures. Deep learning based segmentation methods are capable of self-learning and can generalize well over large amounts of data [19], [20]. Deep learning architectures are gradually becoming more mature and are outperforming classical state-of-the-art methods. The performance of deep learning methods depends on the availability of training datasets to achieve generalization.
Convolutional Neural Networks (CNNs) are known to be useful in various computer-vision tasks including, but not limited to, image recognition, object detection, classification and segmentation. CNNs learn features in a hierarchical manner through multiple convolutions across multiple layers and do not need any predefined features or spatial information. The convolutional layers learn spatial relationships, and their ability to generalize, through training [21]-[23]. CNNs have therefore been used by various researchers for the segmentation of brain regions, with the architecture varied according to the task. Using deep learning methods, the brain has been successfully segmented into three major regions: Cerebrospinal Fluid (CSF), Gray Matter (GM) and White Matter (WM). Larger numbers of sub-regions (e.g., 8, 25 and 134) have also been segmented using variations of CNNs [24]-[26].
Deep learning has most prevalently been used for two-dimensional images, but as medical images are acquired as three-dimensional volumes, 3D CNNs have become the latest technique for segmentation problems in the medical imaging domain [25], [27]. 3D images incorporate spatial information that is lost if 2D patches of those volumes are used. Hence, this study presents a deep learning technique based on a 3D CNN for the segmentation of three and nine regions of the brain, including WM, GM, CSF and their sub-regions.

II. RELATED WORK
Initially, research was focused on the segmentation of the three major regions of the brain, i.e. GM, WM and CSF. Tools like FreeSurfer [28] and FSL [29] have been used for the segmentation of these regions using atlas-based methods. However, because these tools are time-consuming, researchers have moved towards deep learning. Even within deep learning, two-dimensional architectures have been used extensively. For brain segmentation, research therefore started with 2D CNNs and later progressed to 3D architectures.
In this regard, a hybrid CNN architecture was proposed by de Brebisson et al. [26], which used three 2D patches and one 3D patch to segment 134 anatomical regions of the brain. 3D spatial information around each voxel, as well as 2D information in each plane (axial, coronal and sagittal), was utilized. Convolutional layers along with max pooling layers were used to extract and downscale features that were eventually combined through fully-connected and softmax layers. Also, to ensure refined segmentation, centroid distances between regions and voxels were used. The initial segmentation of the image was learned through the first network; centroid distances were then computed, the image along with the centroids was given as input to the full network, and the centroids were re-calculated using the refined segmentation. Improved performance was achieved with the help of this additional centroid input. That work achieved a dice score of 0.725 on the MICCAI multi-atlas labeling challenge dataset. 2D CNNs were used by Zhang et al. [30] to segment CSF, GM and WM in MRI scans of infants, since these regions are not as distinct in infants as in adults. In research presented by Moeskops et al. [31], volumes from three different regions of the body were segmented using a single network. An average dice score of around 0.80 was achieved by segmenting 6 tissues from the MICCAI 2012 multi-atlas labeling challenge. In another study, 2D CNNs were utilized by Moeskops et al. [24] to segment 8 regions of the brain. Five different datasets with volumes of infants, young adults and aging adults were used. Their system was able to classify eight sub-regions and three major regions of the brain better than any previous work.
3D CNNs have become prevalent in brain segmentation tasks only in recent years. Chen et al. [27] extended the concept of residual learning to 3D networks, introducing 3D residual neural networks. Residual learning strengthens the feature representation by using skip connections that add the output of a preceding layer to its succeeding layer to enhance performance. This network produced a better dice score than the top ten teams on the multi-modality volumes of the MRBrainS 2013 challenge. 2D, 2.5D and 3D patches of volumes were utilized by Milletari et al. [32] for segmentation. A feature vector was produced by the fully-connected layer for the patch of each dimensionality. Each patch, together with the distance from the voxel where the patch was collected to the corresponding centroid in the volume, was stored in a database. To segment a new instance, the fully-connected features were used to identify the K-nearest neighbors in the database based on the feature vectors. After identifying the neighbors, the distances stored in the database were used to perform localized segmentations. Better results than V-Net were achieved for the brain segmentation task.
A 3D CNN was employed by Wachinger et al. [25] to perform segmentation of structural MRI scans, using multi-task learning and Conditional Random Fields (CRF) to enhance performance. Better results than FreeSurfer and FSL were achieved for the segmentation of 25 regions, with a mean dice score of 0.897. Another 3D CNN based network architecture was presented by Wong et al. [33], which used an exponential logarithmic loss function to segment 19 brain regions. Skip connections and deep supervision were employed in the network to improve the efficiency of 3D segmentation. The exponential logarithmic loss function helped the network to learn brain regions that differ in size, morphology and complexity.
AssemblyNet was developed by Coupé et al. [34] for the segmentation of 132 brain regions. This network was made of two assemblies of U-Nets that shared knowledge among neighboring U-Nets. The features learned by the first assembly were refined by the second assembly. A majority voting scheme was used to obtain the final decision. The model was evaluated on MRI images from three different datasets and a dice score of 0.733 was obtained. Similarly, a patch-based 3D CNN was employed by Ganaye et al. [35] to segment eight regions of the brain. The network consisted of encoding and decoding layers for feature extraction and label reconstruction, respectively. To learn more robust features, transition layers and batch-normalization were used in every convolution layer. T1, T2-FLAIR and T1-IR scans were used to train and evaluate the model. Their model ranked 1st in the MRBrainS18 challenge at MICCAI 2018. The literature on the segmentation of brain regions using deep learning techniques is summarized in Table 1.
Besides brain tissue and region segmentation, there are other studies on deep learning-based semantic segmentation for medical images. In this regard, Yang et al. [49] presented a method for the segmentation of the left atrium (LA) and pulmonary veins using Late Gadolinium-Enhanced Cardiac MRI (LGE-CMRI). The method was based on deep learning, utilizing convolutional long short-term memory (convLSTM) based sequential learning and dilated residual learning to segment both heart sub-structures from LGE-CMRI images, and obtained a dice score of 0.897±0.053. Similarly, a recent study [50] addressed the automatic segmentation of high-intensity scar tissue and left atrium anatomy in 3D LGE-CMRI images of atrial fibrillation patients. For the segmentation of both scar and left atrium, a method based on a multi-view-two-task (MVTT) recursive attention model was proposed, which consists of three subnetworks incorporating multi-view learning, convLSTM and an attention mechanism. Mean dice scores of 93% and 87% were obtained for LA anatomy and scar, respectively.
Computer-aided diagnosis of brain tumors is also an active field of study in medical imaging. Over the years, research has been carried out not only on the classification and detection [51]-[53] of brain tumor types but also on their segmentation [54], [55] using deep learning methods. An automatic method for brain tumor segmentation from 3D MRI images was proposed by Dong et al. [56]. The method was based on a U-Net based deep CNN and was evaluated on the Multimodal Brain Tumor Image Segmentation (BRATS 2015) datasets, achieving promising results. Prostate segmentation is another research area in medical imaging based semantic segmentation. Liu et al. [57] presented a deep learning based algorithm for automatic prostate zonal segmentation. The algorithm was based on a CNN and was able to segment the peripheral zone (PZ) and prostatic transition zone (TZ) on T2-weighted MRI images. The model was developed using MRI scans obtained from 250 patients and tested on 63 patients. The dice similarity coefficients were 0.74±0.08 and 0.86±0.07 for PZ and TZ, respectively. The results were comparable to other methods for prostate zone segmentation.
Because medical images are acquired as three-dimensional volumes, 3D CNNs have been widely employed for automatic segmentation tasks. 2D CNNs fail to extract volume and context information from adjacent slices, which may be useful for precise segmentation. Some studies [50], [58] utilized the attention mechanism to learn intra- and inter-slice features and context information from 2D slices to perform medical image segmentation tasks. On the other hand, the use of 3D CNNs for three-dimensional images incorporates spatial information that is lost if 2D patches of those volumes are used. 3D CNNs work by using three-dimensional convolutional kernels to make predictions from a volumetric patch of a scan. Their ability to extract inter-slice information has led to improved performance in various studies [59]-[61]. Similarly, dilated convolutions [62] and residual learning [63] have been utilized to perform segmentation tasks with improved performance. Dilated convolutions aggregate multi-scale contextual information without losing resolution. Considering the advantages of these techniques, our aim in this study is to combine dilation and residual learning mechanisms in a 3D CNN to perform volumetric segmentation.
Besides reducing time consumption, deep learning has made it feasible to segment more than a hundred anatomical regions of the brain, as these methods follow hierarchical feature learning. Nevertheless, even though multi-region segmentation of the brain has become possible, precision is still a problem even for the three basic regions, as the dice score is usually near 0.90 but not above it. This paper therefore focuses on the segmentation of three and nine regions of the brain. Considering the previous research and the amount of post-processing or parallel processing that has to be employed for refined segmentation, we intend to demonstrate that better results can be achieved with 3D CNNs, dilated convolutions and residual learning.

III. METHODOLOGY
Two segmentation tasks were performed in this work. The first task was the segmentation of the brain into three regions: GM, WM and CSF. The second task was to segment the brain into nine regions, namely WM, CSF, Cortical Gray Matter (cGM), Brain Stem (BS), Cerebellum (CB), Basal Ganglia (BG), White Matter Lesions (WML), Ventricles (VT) and Infarction (INF). This section discusses the data sources and the deep learning architecture used for this study.
A. DATASET
MRI brain scans were acquired from various databases and were preprocessed for use in the segmentation tasks. The dataset contained a total of 90 MRI scans: 34 scans from the Alzheimer's Disease Neuroimaging Initiative (ADNI) [64], [65], 21 scans from the MRBrainS18 challenge [66] and 35 scans from the MICCAI 2012 challenge [67].
For three brain regions, a segmentation dataset of 76 MRI scans was prepared that contained images from ADNI, the MICCAI 2012 challenge and T1-weighted images from the MRBrainS18 challenge. The MRI scans provided by the MRBrainS18 challenge already contained segmented images, whereas the remaining dataset had to be segmented to obtain labels for the three brain segments. For this purpose, the FSL toolbox [29] was used and the segmented data was generated in two steps. Firstly, automated brain extraction was performed using FSL-BET [68], [69] to extract brain voxels and remove non-brain regions/tissues such as skull and neck tissues from the images. Secondly, the brain scans were segmented into 3 regions (CSF, WM and GM) using the FSL-FAST tool [70]. FSL-FAST uses an algorithm based on expectation-maximization and a hidden Markov random field to produce three segmentations of the whole brain while correcting non-homogeneities in intensity values. After this, the segmentations generated by FSL-FAST were manually corrected using the ITK-SNAP tool [71]. The details of the dataset for 3 brain segments are presented in Table 2.
For nine brain regions, the dataset of the MRBrainS18 challenge was used, as it provided MRI scans from 7 subjects in 5 different modalities (FLAIR, IR, registered IR, T1 weighted and registered T1 weighted). However, data from 3 modalities (T1 weighted, registered T1 weighted and FLAIR) was used in this study. There were 21 images in total along with their segmentation (cortical gray matter, cerebrospinal fluid in the extra-cerebral space, white matter, basal ganglia, ventricles, brain stem, white matter lesions, cerebellum and infarction). For the segmentation task of nine brain regions, registered T1 weighted and FLAIR images were used. Data augmentation was also employed, which included rotation by −10° to 10°, translation by 0.9 to 1.1 and scaling by −10% to 10%. Data augmentation was performed using the SimpleITK [72] Python library.
The details of the dataset for 9 regions of the brain have been given in Table 3.
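The two-step FSL labelling described above can be sketched as a small command builder. This is only an illustration under stated assumptions: the function names, file naming scheme and output directory are hypothetical, and the exact BET/FAST options used in the study are not reported. The sketch relies only on the basic `bet <input> <output>` invocation and on FAST's `-t` (image type, 1 = T1), `-n` (number of tissue classes) and `-o` (output basename) options. Manual correction in ITK-SNAP is a separate, interactive step and is not shown.

```python
import shlex
import subprocess
from pathlib import PurePosixPath


def build_fsl_commands(t1_path, out_dir):
    """Return the BET and FAST command lines for one T1-weighted scan.

    File naming is a hypothetical convention chosen for this sketch.
    """
    t1 = PurePosixPath(t1_path)
    out = PurePosixPath(out_dir)
    stem = t1.name.replace(".nii.gz", "").replace(".nii", "")
    brain = out / f"{stem}_brain.nii.gz"

    # Step 1: brain extraction (removes skull, neck and other non-brain tissue).
    bet_cmd = ["bet", str(t1), str(brain)]

    # Step 2: three-class tissue segmentation (CSF, GM, WM) of the extracted brain.
    # -t 1 marks the input as T1-weighted, -n 3 requests three tissue classes.
    fast_cmd = ["fast", "-t", "1", "-n", "3",
                "-o", str(out / f"{stem}_fast"), str(brain)]
    return [bet_cmd, fast_cmd]


def run_fsl_pipeline(t1_path, out_dir, dry_run=True):
    """Print the commands (dry run) or execute them; execution requires FSL."""
    commands = build_fsl_commands(t1_path, out_dir)
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return commands


commands = build_fsl_commands("data/sub01_T1.nii.gz", "out")
```

Calling `run_fsl_pipeline("data/sub01_T1.nii.gz", "out")` with the default `dry_run=True` only prints the two command lines, which makes it easy to inspect them before running FSL on a full dataset.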

B. PROPOSED DEEP LEARNING ARCHITECTURE: HIGH3DSEGNET
In this work, a method is presented for the volumetric image segmentation of cortical and subcortical regions of the brain. Our method is based on 3D convolutional neural networks and utilizes the concepts of skip connections, residual learning and dilated convolutions for efficiently learning the end-to-end mapping from MRI volumes to voxel-level brain segments. The architecture of the proposed network is shown in Figure 1. The network contains 19 convolutional layers. In each convolutional layer except the last, 3 × 3 × 3-voxel convolutions are applied. Due to the small 3D convolutional kernels, the network has relatively few parameters. In the first five convolutional layers, 16 convolutional kernels are employed to learn low-level features from images. To learn features at multiple scales, dilated convolutions [73] are applied, with the dilation factor gradually increasing as the layers go deeper.
Unlike previous volumetric segmentation networks such as 3D U-Net [59], which use down- and up-sampling to learn hierarchical image features at increased computational cost, this study uses dilated convolutions to compute features at high spatial resolution. For an input feature map I with N channels and 3 × 3 × 3 convolutional kernels up-sampled by the dilation factor d, the output feature map O is generated as:

$$O_{x,y,z} = \sum_{n=0}^{N-1} \sum_{i=0}^{2} \sum_{j=0}^{2} \sum_{k=0}^{2} W_{i,j,k,n} \, I_{(x+id),(y+jd),(z+kd),n}$$

where x, y, z denote spatial locations in the volumes and W denotes the kernels. The dilated convolution conserves the spatial resolution of the images and creates a receptive field of (2d+1)^3 voxels. Therefore, dilated convolutions are applied in the later layers: the kernels are dilated by a factor of 2 in layers six to nine, by a factor of 4 in layers ten to thirteen and by a factor of 8 in layers fourteen to seventeen. The convolutional layers that employ dilated kernels learn middle- and high-level features from images.
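The dilated convolution described above can be illustrated with a minimal pure-Python sketch. It is not the network's implementation: it handles a single channel, uses zero padding and centres the kernel taps on each voxel (an indexing choice made for this sketch so that the output has the same size as the input). The helper `stacked_receptive_field` and the dilation schedule are assumptions derived from the layer description in the text (layers one to five and the second-last layer undilated, the final 1 × 1 × 1 layer adding nothing to the receptive field).

```python
def dilated_conv3d(volume, kernel, d):
    """Single-channel 3x3x3 convolution with dilation factor d and zero padding."""
    X, Y, Z = len(volume), len(volume[0]), len(volume[0][0])
    out = [[[0.0] * Z for _ in range(Y)] for _ in range(X)]
    for x in range(X):
        for y in range(Y):
            for z in range(Z):
                acc = 0.0
                for i in range(3):
                    for j in range(3):
                        for k in range(3):
                            # Kernel taps are spaced d voxels apart around (x, y, z).
                            xi = x + (i - 1) * d
                            yj = y + (j - 1) * d
                            zk = z + (k - 1) * d
                            if 0 <= xi < X and 0 <= yj < Y and 0 <= zk < Z:
                                acc += kernel[i][j][k] * volume[xi][yj][zk]
                out[x][y][z] = acc
    return out


def stacked_receptive_field(dilations):
    """Side length (voxels) seen by a stack of 3x3x3 layers with these dilations."""
    # Each 3x3x3 layer with dilation d widens the field by 2 * d voxels.
    return 1 + 2 * sum(dilations)


# A unit impulse at the centre of a 9x9x9 volume, convolved with an all-ones kernel.
size = 9
volume = [[[0.0] * size for _ in range(size)] for _ in range(size)]
volume[4][4][4] = 1.0
ones = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]
response = dilated_conv3d(volume, ones, d=2)

# Planes along x that received a contribution: they span 2 * d + 1 = 5 voxels.
touched = sorted({x for x in range(size) for y in range(size)
                  for z in range(size) if response[x][y][z] != 0.0})
print(touched)  # [2, 4, 6]

# Assumed schedule: layers 1-5 (d=1), 6-9 (d=2), 10-13 (d=4), 14-17 (d=8), 18 (d=1).
schedule = [1] * 5 + [2] * 4 + [4] * 4 + [8] * 4 + [1]
print(stacked_receptive_field(schedule))  # 125
```

The impulse response shows that the output keeps the input resolution while one dilated kernel reaches 2d+1 voxels along each axis. Under the assumed schedule, the stacked layers would see a cube of side 125 voxels without any down-sampling.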
Residual connections [63], [74] have been applied between two consecutive convolutional layers throughout the network to allow the fusion of features from different scales and to improve information propagation. If the input from layer L to the residual block is x_L, then the output x_{L+1} of the residual block is:

$$x_{L+1} = x_L + f(x_L, w_L)$$

where f(x_L, w_L) is the non-linear function. By stacking residual blocks, the output x_N from the last layer can be denoted by:

$$x_N = x_L + \sum_{i=L}^{N-1} f(x_i, w_i)$$

Within each residual block, each convolutional layer is associated with a batch-normalization layer [75] and a Rectified Linear Unit (ReLU) layer, arranged in pre-activation order [63]. The second-last layer is a convolutional layer without any dilation and employs 64 kernels. The last layer in the network applies 1 × 1 × 1 convolutions. The learned features from the last layer are passed to the softmax layer to output predicted probabilities over all labels.
The mean dice coefficient [61] was used as the loss function for the volumetric image segmentation task. Let the image volume be denoted by {x_v}_{v=1}^{V} and the L-label segmentation map by {y_v}_{v=1}^{V}, where V represents the number of voxels and y_v ∈ {1, 2, 3, ..., L}. The mean dice coefficient can then be expressed as:

$$D = \frac{1}{L} \sum_{l=1}^{L} \frac{2 \sum_{v=1}^{V} \delta(y_v = l) \, \mathrm{Softmax}_l(x_v)}{\sum_{v=1}^{V} \delta(y_v = l)^2 + \sum_{v=1}^{V} \mathrm{Softmax}_l(x_v)^2}$$

where δ denotes the Dirac delta function and Softmax_l(x_v) represents the softmax classification score of x_v for the l-th class. During training, the mean dice coefficient is maximized.
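The mean dice objective described above can be sketched in a few lines of pure Python. This is an illustrative sketch rather than the training code: the function names are hypothetical, the squared terms in the denominator follow the common soft-dice formulation and are an assumption here, and the small constant `eps` is added only to avoid division by zero. Labels run from 1 to L as in the text.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one voxel's class scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mean_soft_dice(logits_per_voxel, labels, num_labels, eps=1e-7):
    """Mean soft dice over labels 1..num_labels (higher is better)."""
    probs = [softmax(scores) for scores in logits_per_voxel]
    total = 0.0
    for l in range(1, num_labels + 1):
        # Indicator of the true label, i.e. delta(y_v = l).
        onehot = [1.0 if y == l else 0.0 for y in labels]
        # Softmax score of every voxel for label l.
        p = [pv[l - 1] for pv in probs]
        inter = sum(o * q for o, q in zip(onehot, p))
        denom = sum(o * o for o in onehot) + sum(q * q for q in p)
        total += 2.0 * inter / (denom + eps)
    return total / num_labels


labels = [1, 2, 3]

# Confident, correct predictions: the coefficient approaches 1.
confident = [[20.0, 0.0, 0.0], [0.0, 20.0, 0.0], [0.0, 0.0, 20.0]]
dice_confident = mean_soft_dice(confident, labels, num_labels=3)

# Uninformative predictions (uniform softmax): the coefficient drops to 0.5.
uniform = [[0.0, 0.0, 0.0]] * 3
dice_uniform = mean_soft_dice(uniform, labels, num_labels=3)

# A loss to minimise can be taken as 1 - D, which is equivalent to maximising D.
loss_uniform = 1.0 - dice_uniform
```

Because the coefficient is averaged over labels rather than over voxels, each brain structure contributes equally to the objective regardless of its volume.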

IV. RESULTS AND DISCUSSION
Our network was trained and evaluated on a combination of three datasets, i.e. the MICCAI 2012 challenge, ADNI and the MICCAI MRBrainS18 challenge. Two segmentation tasks were performed, i.e. segmentation of 3 brain regions and segmentation of 9 brain regions. The network architecture for volumetric segmentation of MRI scans was designed as shown in Figure 1. The implementation was done in Python using the TensorFlow and NiftyNet [76] libraries. All the experiments were conducted on an Intel(R) Core i5 system with 16GB RAM and a ZOTAC 11GB GPU.

A. SEGMENTATION RESULTS
For the segmentation task of three brain regions, there was a total of 76 images from three different sources, as shown in Table 2. For the segmentation task of nine brain regions, there was a total of 56 images, which included augmented images and images of 2 different modalities from the MICCAI MRBrainS18 challenge, as illustrated in Table 3.
In the training process of the network, all the weights were randomly initialized from a normal distribution with mean 0 and standard deviation 1. The learning rate was initialized to 1e-03 and the Adam optimizer was used. The dataset was split for training, validation and evaluation with 70%, 20% and 10% ratios, respectively. The network showed good convergence and fast learning. However, since the network did not down-sample the inputs and the number of kernels was increased by a certain factor, its space complexity was higher. In the experiments, a patch size of 88 × 88 × 88 was used. The quantitative results in terms of Dice Score (DS), Jaccard Score (JC), Symmetric Volume Difference (SVD) and Volumetric Overlap Error (VOE) of each brain structure obtained for both segmentation tasks are shown in Table 4.
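As a reference for how overlap metrics of this kind are usually computed per structure, the following minimal sketch evaluates the dice score, the Jaccard score and the volumetric overlap error (taken here as one minus the Jaccard score) from two flattened label maps. The function name and the toy label maps are hypothetical; the symmetric volume difference is left out because its definition is not given in the text.

```python
def overlap_metrics(pred, truth, label):
    """Dice, Jaccard and VOE for one label from flattened hard label maps."""
    p = {i for i, v in enumerate(pred) if v == label}
    t = {i for i, v in enumerate(truth) if v == label}
    union = len(p | t)
    if union == 0:
        return None  # label absent from both maps: overlap is undefined
    inter = len(p & t)
    dice = 2.0 * inter / (len(p) + len(t))
    jaccard = inter / union
    return {"dice": dice, "jaccard": jaccard, "voe": 1.0 - jaccard}


# Toy flattened label maps with three labels (values are illustrative only).
predicted = [1, 1, 2, 2, 3, 3]
reference = [1, 1, 1, 2, 3, 3]

m1 = overlap_metrics(predicted, reference, 1)
m2 = overlap_metrics(predicted, reference, 2)
m3 = overlap_metrics(predicted, reference, 3)

# The two overlap scores are tied by J = D / (2 - D).
jaccard_from_dice = m1["dice"] / (2.0 - m1["dice"])
```

For label 1 in this toy example the dice score is 0.8 and the Jaccard score is 2/3, which matches the relation J = D / (2 - D) between the two measures.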
The proposed method was evaluated on MRI images from different sources. Since the training data was limited, there was a risk of over-fitting. Therefore, suitable hyper-parameters of the network were first searched in various experiments. After parameter tuning, the network was trained using the dataset and the optimal hyper-parameters. Training was performed until the dice coefficient stopped improving and the validation loss stopped decreasing. After this, the testing data was used for evaluation and prediction. The accuracy of segmentation was evaluated using the mean dice score: the dice score for each sample in the testing dataset was calculated and an average was computed across all the samples in the testing dataset. The box plots in Figure 2 and Figure 3 represent the distribution of dice scores of each brain structure over the validation dataset for the three and nine regions segmentation tasks, respectively. For the quantitative evaluation of segmentation results, the box plots were inspected to identify the outliers and the distribution of prediction accuracies. For the segmentation task of three brain regions, the dice scores were relatively better and without any outliers. This shows that the accuracies were close to each other and that our model performed considerably well on all the testing samples. For nine brain regions, the results were quite different. Cerebellum, brain stem and ventricles were segmented well across all the testing images. These three regions achieved high dice scores with a few outliers. Infarction was the region obtaining the highest dice score without any outlier. Although white matter lesions, cerebrospinal fluid, cortical gray matter, white matter and basal ganglia were the brain structures with the most outliers, their dice scores were relatively better.
The WML structure was the most challenging to learn among all the regions, since these are lesions or abnormalities in WM and they do not have a predefined size, shape or location. The WML class has the largest number of outliers; however, the average dice score of this structure is better. Our method worked best on larger regions such as the cerebellum and ventricles and achieved relatively low scores on irregular and smaller regions such as WML. However, our segmentation accuracy even for smaller regions was encouraging, so the overall dice score across all the testing samples and all the brain structures was 0.914. If the traditional categorical cross-entropy measure had been used, it would have given more importance to large brain structures, which would have resulted in under- or over-segmentation. The use of the dice coefficient helped to avoid this problem. That is why our proposed method was able to perform better segmentation of WML even though it had the lowest dice score among all the brain regions. The brain images and their corresponding regions segmented by our method are illustrated in Figure 4 and Figure 5. For each input image, the ground truth and the predicted segmentation are shown. From the input data and ground truths, it was observed that the brain regions have very complex morphology and great variations across slices. These morphological variations are more evident in the brain images of different subjects. Brain regions not only have diverse appearances but also vary in size. A small tissue or region that appears in fewer slices is difficult to locate and segment automatically. Moreover, the intensity values of voxels also overlap significantly across brain tissues. This diversity of brain regions across slices and subjects makes the segmentation task very challenging.
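The argument about cross-entropy favouring large structures can be made concrete with a toy example. The numbers below are invented purely for illustration and do not come from the experiments: a volume with 990 voxels of a large structure and 10 lesion voxels, and a hypothetical model that ignores the lesion entirely. Labels are 0-indexed here only to simplify indexing.

```python
import math


def mean_cross_entropy(probs, labels):
    """Average of -log p(true label) over voxels; every voxel weighs the same."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def per_label_dice(probs, labels, label):
    """Soft dice for one label, with squared terms in the denominator (assumed)."""
    hits = [1.0 if y == label else 0.0 for y in labels]
    scores = [p[label] for p in probs]
    inter = sum(h * s for h, s in zip(hits, scores))
    denom = sum(h * h for h in hits) + sum(s * s for s in scores)
    return 2.0 * inter / denom


# Toy volume: 990 voxels of a large structure (label 0), 10 lesion voxels (label 1).
labels = [0] * 990 + [1] * 10

# Hypothetical model that assigns 0.99 to the large structure at every voxel.
probs = [(0.99, 0.01)] * 1000

ce = mean_cross_entropy(probs, labels)
dice_large = per_label_dice(probs, labels, 0)
dice_lesion = per_label_dice(probs, labels, 1)
mean_dice = (dice_large + dice_lesion) / 2.0

print(f"cross-entropy: {ce:.3f}")
print(f"dice (large structure): {dice_large:.3f}")
print(f"dice (lesion): {dice_lesion:.3f}")
print(f"mean dice: {mean_dice:.3f}")
```

In this toy case the voxel-averaged cross-entropy stays small because the 990 correctly labelled voxels dominate the average, while the label-averaged dice is pulled down to roughly one half by the missed lesion. A dice-based objective therefore penalises such a model far more strongly.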
Leveraging the advantages of 3D CNNs combined with dilated convolutions and residual learning, our method uses the spatial information of voxels and effectively segments the brain regions regardless of their size. In the segmentation of three brain regions, it was noticed that our method accurately detected the borders of the brain tissues and accurately classified the voxels into GM, CSF and WM. Similarly, in the segmentation of nine brain regions, the brain regions were effectively segmented. However, there were a few false positives and false negatives in the results. For example, a few voxels belonging to some regions, particularly WM, BG and VT, differ from the ground truth, whereas the CSF, cGM and WML tissues were learned quite accurately.
Note that the dice scores of some regions such as WML were relatively low compared to larger regions, although these regions were predicted quite well by our model. The reason for the lower scores is their irregular morphology, which affects learning. Due to this, the dice scores vary significantly across brain scans, resulting in low overall dice scores. However, due to the small size of these brain structures, only a few voxels were predicted incorrectly, and hence there were fewer false positives and false negatives. In contrast, the predicted results for large regions such as WM and VT contain more misclassified voxels and appear slightly different from their ground truths. Although their dice scores are better with fewer outliers, their large sizes require accurate prediction of a greater number of voxels than the smaller brain structures. cGM is the largest brain structure that has been predicted accurately.
These discrepancies in the results arise from the overlapping and similar nature of voxel intensities. The voxels in adjacent brain tissues have nearly similar coordinates. Some of the brain structures do not have predefined shapes or locations, or both. Moreover, the data comes from different sources and subjects. The images were acquired with different scanners and from subjects with different brain disorders. For example, the dataset acquired from ADNI contained scans of Alzheimer's patients. The scans from MRBrainS18 were acquired from multiple patients with different health conditions such as dementia, Alzheimer's and diabetes, as well as from healthy subjects. These images contain varying degrees of white matter lesions and atrophy. The age of the subjects also differed, as patients with neurodegenerative disorders such as dementia and Alzheimer's are mostly older than 70. Moreover, neurodegenerative disorders greatly affect brain morphology and the shapes of brain structures. The model had to learn not only the complex brain structures but also the variations associated with the brains of healthy and diseased subjects. Despite a few incorrect results, our method performed quite well for most of the brain regions, and small brain regions were learned accurately.

B. COMPARATIVE ANALYSIS
The achieved results were compared with other methods proposed in previous studies. Although there are many tools and methods for the segmentation of GM, WM and CSF, most of them use atlas-based methods. For the comparison of results, we therefore identified studies that used deep learning techniques and segmented the same brain regions that are under consideration in this study.
For the segmentation task of three brain regions, our results were compared with three recent studies. Zhang et al. [30] presented a method based on 2D CNNs to segment infant brain scans into three tissue types using multi-modality MR images. The proposed network architecture employed convolutions, pooling and other operations in several layers to learn low- and high-level features. Another method was presented by Nie et al. [77] to segment infant brain scans into three tissue types. The method employed fully convolutional networks (FCN) for the segmentation of three modalities of MR images: T1, T2 and fractional anisotropy. Initially, the network was trained with images of each modality separately. The features learned from each modality were then fused together to obtain the final segmentation maps.
Nguyen et al. [78] used 3D CNNs and Gaussian mixture models (GMM) to segment GM, WM and CSF. The GMM first identified voxels that were easy to classify based on their intensity values, while the CNN classified the voxels that had overlapping intensities. The method was evaluated on the IBSR 18 dataset; better results were obtained for GM and WM, but the dice score for CSF was significantly lower. Although the aforementioned studies obtained better results, they used small datasets from healthy subjects for evaluation, which might limit the applicability of these methods in clinical practice for disease diagnosis and prognosis.
For this study, a relatively large dataset from three different sources was obtained from healthy subjects as well as patients with dementia, Alzheimer's and diabetes. Dice scores of 0.872, 0.872 and 0.896 were obtained for WM, GM and CSF, respectively, by using our method. As compared to previous studies, our method did not achieve the best dice scores for GM and WM. Alzheimer's and dementia patients are characterized by WM microstructural and ischemic changes as well as cortical atrophy [79], [80], [81]. As a result, our dataset contained MRI scans with varying levels of WM lesion load and GM atrophy, making precise segmentation of these regions a challenging task. Our method achieved reasonable dice scores for these two regions and can help to segment brain tissues of patients affected by neurodegenerative diseases. The diversity in the dataset makes the model more generalized and suitable for clinical applications related to neurological disorders. Compared with the other methods, our method achieved the best dice score for CSF and an overall dice score of 0.879. The detailed results and comparative analysis for the segmentation of three brain structures in terms of dice scores are presented in Table 5.
To perform a comparative analysis for the segmentation task of nine brain regions, previous studies were identified. It was observed that most of the work was on the segmentation of three brain tissues, and only a few studies have subdivided the three brain regions into sub-regions. We considered for comparison those studies that worked on brain regions similar to ours. One such study was conducted by Moeskops et al., who presented a method for the automatic segmentation of brain tissues. The method was based on multi-scale CNNs and used T1, T2-FLAIR and T1-IR images as input. The dataset from the MRBrainS13 challenge was used for performing brain segmentation. Although the data provided by the challenge contain three brain segments, for their study these segments were subdivided into 8 regions: WM, cGM, CB, BS, WMH, Lateral Ventricular Cerebrospinal Fluid (lvCSF), Basal Ganglia and Thalami (BGT) and Peripheral Cerebrospinal Fluid (pCSF). Their method achieved an average dice score of 0.67. Moreover, dice scores of 0.85 for cGM, 0.87 for WM, 0.93 for CB, 0.92 for BS, 0.82 for BGT, 0.93 for lvCSF and 0.76 for pCSF were achieved. For comparison, we considered the dice scores of cGM, WM, BS and CB only.
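All of the comparisons in this section are expressed as per-region dice scores, i.e. twice the overlap between the predicted and reference masks of a region divided by the sum of their sizes. A minimal sketch of this computation on integer label volumes is given below. The function name `dice_per_label`, the toy volumes, and the convention that a label absent from both volumes scores 1.0 are assumptions for illustration and do not describe the evaluation code of any of the compared studies.

```python
import numpy as np

def dice_per_label(pred, truth, labels):
    """Return {label: dice} for two integer label volumes of the same shape."""
    pred = np.asarray(pred)
    truth = np.asarray(truth)
    scores = {}
    for label in labels:
        p = pred == label     # predicted mask for this region
        t = truth == label    # reference mask for this region
        denom = p.sum() + t.sum()
        # Assumed convention: a label missing from both volumes counts as 1.0.
        if denom == 0:
            scores[label] = 1.0
        else:
            scores[label] = 2.0 * np.logical_and(p, t).sum() / denom
    return scores

# Toy 2 x 2 x 2 volumes with hypothetical labels 0 (background), 1 and 2.
truth = np.array([[[0, 1], [1, 1]], [[2, 2], [0, 0]]])
pred = np.array([[[0, 1], [1, 2]], [[2, 2], [0, 1]]])

scores = dice_per_label(pred, truth, labels=[1, 2])
mean_dice = sum(scores.values()) / len(scores)
```

For these toy volumes, label 1 overlaps in two of three voxels on each side (dice 4/6) and label 2 in two voxels against a prediction of three (dice 4/5); the mean over regions is the unweighted average of the per-region values.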
Another method was proposed by Luna et al. for the automatic segmentation of eight brain regions from MRI images using patch-based 3D CNNs. The network consisted of encoding and decoding layers for feature extraction and label reconstruction, respectively. To learn more robust features, transition layers and batch normalization were used in every convolution layer. T1, T2-FLAIR and T1-IR scans were used to train and evaluate the model using the dataset from the MICCAI 2018 challenge. Similarly, Ahn et al. [47] worked on the segmentation task of eight brain regions. Their method combined an attention module with a CNN-based approach. The network consisted of compression and intensity modules to improve feature representation and spatial attention of pixels. The proposed method was evaluated using the MRBrainS18 dataset.
Luna et al. achieved an average dice score of 0.855 and Ahn et al. achieved 0.876. Since both studies considered eight brain tissue classes, our results for eight brain regions were used to make the comparison fair. The average dice score of eight brain regions with our method was 0.903, which was higher than in both studies. Moreover, dice scores of 0.879 for cGM, 0.839 for BG, 0.864 for WM, 0.844 for WML, 0.906 for CSF, 0.937 for VT, 0.973 for CB, 0.982 for BS and 0.999 for INF were obtained by using our method. Our method obtained the best dice scores for cGM, BG, CSF, VT, CB and BS. The results for the other brain structures, WM and WML, were also encouraging, making the average dice score over all the brain structures higher than previously achieved with any method. The analysis of results for the segmentation task of nine brain regions is presented in Table 6. Our results were compared with the brain regions common to both studies. The comparative analysis is illustrated in Figure 6.
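The reported averages can be reproduced from the per-region values quoted above with a short arithmetic check. The sketch below assumes that the eight-region figure is the unweighted mean over all regions except INF, which is consistent with the quoted numbers but is not stated explicitly in the text.

```python
# Per-region dice scores for the nine-region task, as quoted in the text.
dice = {
    "cGM": 0.879, "BG": 0.839, "WM": 0.864, "WML": 0.844, "CSF": 0.906,
    "VT": 0.937, "CB": 0.973, "BS": 0.982, "INF": 0.999,
}

# Unweighted mean over all nine regions.
mean_nine = sum(dice.values()) / len(dice)

# Assumption: the eight-region mean drops INF, the class that the
# compared studies did not segment.
eight = {name: score for name, score in dice.items() if name != "INF"}
mean_eight = sum(eight.values()) / len(eight)

print(f"nine regions: {mean_nine:.3f}, eight regions: {mean_eight:.3f}")
```

Rounded to three decimals, the two means are 0.914 and 0.903, matching the averages reported for nine and eight regions.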
In this study, a method has been proposed for the automatic segmentation of various brain structures using 3D CNNs. The model has been evaluated for the segmentation task of three and nine brain regions. In contrast to previous studies that segmented eight brain structures, an additional structure, infarction (INF), was also considered for segmentation. It was shown that our brain structure segmentation approach can be extended to include INF as an additional segmentation class. Unlike in other methods, the inclusion of INF in our evaluation did not decrease the performance; this structure was learned quite effectively and obtained the highest dice score of all the brain structures. Moreover, WML, another brain region that is challenging to segment, has also been included in the evaluation. The method thus performs segmentation of brain tissues, structures and abnormalities at the same time.
In this study, images from three different modalities (T1 weighted, registered T1 weighted and FLAIR) were used. It was shown that our method is not limited to a particular input modality of brain scans and can effectively work with other MR modalities. In terms of abnormality detection, it was observed that our model automatically detected lesions and infarction regardless of whether the image volume belonged to a healthy or a diseased patient. In clinical applications, a method that can automatically detect and segment brain structures and lesions would be beneficial to increase the sensitivity and precision of diagnosis. Additionally, data generated by such an automatic method can be used to further improve the segmentation method.
TABLE 6. Comparison of results for 9 brain regions segmentation.
The data used in this study for the segmentation task of nine brain regions (Table 3) were anisotropic, with scans having a voxel size of 0.958 mm × 0.958 mm × 3.0 mm. 3D CNNs learn features most effectively from isotropic images. Because of the lower through-plane resolution of anisotropic images, it is difficult for the network to learn features representing the dataset. Despite the anisotropic nature of the scans used in this study, our method performed well for most of the brain regions. Similarly, motion and other artefacts were present in the dataset, especially in ADNI scans. The brain scans with artefacts were removed from the study in order to limit the influence of these artefacts on the performance of our model.
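The degree of anisotropy implied by the reported voxel size can be quantified with a short calculation. The sketch below computes the ratio between the coarsest and finest spacing, and the grid size that would result if a volume were resampled to isotropic voxels. The text does not state that any resampling was performed; the helper `isotropic_shape` and the 240 × 240 × 48 grid are hypothetical and serve only as an illustration.

```python
def isotropic_shape(shape, spacing_mm, target_mm=None):
    """Grid size after resampling to isotropic voxels (default: finest spacing)."""
    if target_mm is None:
        target_mm = min(spacing_mm)
    # Physical extent along each axis divided by the target voxel size.
    return tuple(max(1, round(n * s / target_mm)) for n, s in zip(shape, spacing_mm))

spacing = (0.958, 0.958, 3.0)   # voxel size reported in the text, in mm
shape = (240, 240, 48)          # hypothetical grid size, for illustration only

# Ratio of slice thickness to in-plane resolution.
anisotropy = max(spacing) / min(spacing)

# Grid size if the volume were resampled to 0.958 mm isotropic voxels.
new_shape = isotropic_shape(shape, spacing)

print(f"anisotropy ratio: {anisotropy:.2f}, isotropic grid: {new_shape}")
```

With the reported spacing, each slice is roughly three times thicker than the in-plane voxel size, so a 3D kernel covers about three times more physical distance along the slice axis than within a slice.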
The evaluation of the model was performed on MRI images from different healthy and diseased subjects, including patients with memory impairments. The segmentation results revealed that our approach can accurately predict brain structures across varying degrees of abnormality, different morphologies and different scanners. The proposed model obtained the highest dice scores for 6 out of 8 brain regions. The average dice score for the eight brain regions obtained with our model was higher than in the previous work on the MRBrainS18 dataset. Moreover, the development of an automatic segmentation method for brain tissues as well as abnormalities can facilitate the diagnosis of brain disorders, especially in the aging population, and can help to identify disease-related biomarkers for prevention and treatment.

V. CONCLUSION
This research targeted medical image segmentation problems, specifically the segmentation of the brain into various regions including cerebrospinal fluid, white matter and gray matter. Unlike traditional segmentation methods in medical imaging that are based on graph theory or atlas-based techniques, this work was directed towards the use of a 3D deep learning segmentation algorithm. In this regard, a 3D CNN based network was employed for two segmentation tasks, i.e. 3 and 9 brain regions. For the 3 brain regions segmentation task, dice scores of 0.872, 0.872 and 0.896 were obtained for WM, GM and CSF, respectively. For the segmentation task of 9 brain regions, dice scores of 0.879, 0.839, 0.864, 0.844, 0.906, 0.937, 0.973, 0.982 and 0.999 were achieved for cGM, BG, WM, WML, CSF, VT, CB, BS and INF, respectively. Our network obtained mean dice scores of 0.879 and 0.914 for three and nine brain regions, respectively. Moreover, the mean dice score of 0.903 obtained for eight brain regions was better than that of the previous research. This indicates the promise of developing medical image segmentation algorithms using 3D deep learning techniques. In the future, we intend to improve our network to achieve better results for all the brain regions under consideration.

ACKNOWLEDGMENT
Alzheimer's Disease Neuroimaging Initiative (ADNI) is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie Inc.; the Alzheimer's Association; the Alzheimer's Drug Discovery Foundation; Araclon Biotech; BioClinica Inc.; Biogen;

ETHICAL APPROVAL
We declare that all human and animal studies have been approved by the Medical University of South Carolina Institutional Review Board and have therefore been performed under the ethical standards laid down in the 1964 Declaration of Helsinki and its later amendments.