A New Convolutional Neural Network Architecture for Automatic Detection of Brain Tumors in Magnetic Resonance Imaging Images

Brain diseases are mainly caused by abnormal growth of brain cells that may damage the brain structure and eventually lead to malignant brain cancer. Early diagnosis with a Computer-Aided Diagnosis (CAD) system, which enables decisive treatment, faces major challenges, especially the accurate detection of different diseases in magnetic resonance imaging (MRI) images. In this paper, a three-step preprocessing approach is proposed to enhance the quality of MRI images, along with a new Deep Convolutional Neural Network (DCNN) architecture for effective diagnosis of glioma, meningioma, and pituitary tumors. The architecture uses batch normalization for fast training with a higher learning rate and easier initialization of the layer weights. The proposed architecture is a computationally lightweight model with a small number of convolutional layers, max-pooling layers, and training iterations. A comparison between the proposed architecture and the other models discussed in this paper is conducted. A competitive accuracy of 98.22% overall is achieved, with 99% in detecting glioma, 99.13% in detecting meningioma, 97.3% in detecting pituitary and 97.14% in detecting normal images when tested on a dataset with 3394 MRI images. Experimental results demonstrate the robustness of the proposed architecture, which has increased the detection accuracy of a variety of brain diseases in a short time.


I. INTRODUCTION
The human brain is the most important part of the body because it controls most human functions, such as memory, speech, thought, and the movement of the legs and arms [1]. Brain diseases are mostly caused by the abnormal growth of brain cells, which directly damages the brain structure and leads to brain cancer [2]. According to World Health Organization (WHO) records, about 9.6 million people worldwide died from cancer in 2018 [3]. Brain cancer is dangerous, fast-growing, and deadly. Moreover, the complexity of the brain's structure is a major challenge, so timely and accurate diagnosis is necessary. MRI images can provide better visualization of contrast and spatial definition [4]. Detecting brain abnormalities, that is, determining whether abnormalities exist in MRI images, is therefore an important task. Researchers use deep learning widely across many medical science fields [5]. Since 2012, researchers have made extensive use of deep convolutional neural networks (DCNNs), which have achieved great success in image classification [6]. Lately, DCNNs have also achieved promising results in medical image classification [7]-[11].
The associate editor coordinating the review of this manuscript and approving it for publication was Yiming Tang.
This paper proposes an efficient deep diagnosis system, see figure 1. The contributions of this paper are as follows:
• A three-step pre-processing method is proposed as an initial step. The pre-processing method enhances the quality of the MRI images, stretches their histogram and improves their contrast.
• The quality of the output image is measured with the blind/referenceless image spatial quality evaluator (BRISQUE) [15] as an assessment of the pre-processing phase.
• A diagnosis architecture that uses a DCNN to classify MRI images as glioma, meningioma, pituitary, or normal.
• The batch normalization technique [22] is applied to train the model faster, allow a higher learning rate, and make initializing the layer weights easier. The MRI image dataset is passed through the proposed pre-processing steps and then into the proposed architecture, in both the training phase and the testing phase, to classify the images as glioma, meningioma, pituitary, or normal.
• A detailed analytical comparison of glioma, meningioma, and pituitary detection is conducted between the proposed architecture and well-known approaches (VGG16 [19] and VGG19 [20]) as well as recent approaches such as CNN-SVM [21].
The rest of the paper is organized as follows: Section II discusses the related work. Section III describes the methodologies applied in this paper. Section IV discusses the experimental results. Section V presents a performance comparison between well-known DCNNs and the proposed model. Finally, section VI presents the concluding remarks.

II. RELATED WORK
Recently, there have been many studies on detecting brain tumors in MRI images. In this section, several reliable works are explored. Varthanana et al. [23] presented a method for brain tumor detection using a novel self-organizing map (SOM) and fuzzy k-means (FKM). Their segmentation results were validated by experienced radiologists. However, their approach is complex and time-consuming in real practical applications. Dhanachdra et al. [24] proposed a technique to improve MRI image quality. Their technique computes the initial values of the cluster centers with the help of a subjective algorithm. They used a contrast stretching algorithm to enhance the input image quality and the K-means algorithm in their classification process, but the classification accuracy was still lacking. Varana et al. [25] used a discrete wavelet transform (DWT) based on the abnormal brain region and explored a probabilistic neural network (PNN) to detect brain tumors in MRI images. Sachdera et al. [26] proposed a principal component analysis-artificial neural network (PCA-ANN) for multiclass brain tumor classification. They obtained a number of regions of interest (ROIs) using the content-based active contour (CBAC). Their experimental results showed a notable improvement in accuracy, from 77% without PCA to 91% with PCA. Bahadure et al. [1] used a Support Vector Machine (SVM) for the classification process. They also explored the Berkeley wavelet transform (BWT) for brain tumor detection, extracting the relevant features and then feeding them to the SVM. Corso et al. [27] proposed an automatic segmentation approach that combines a generative model-based technique and a graph-based affinities method. Their model was integrated into multi-level segmentation using a weighted aggregation algorithm. Dong et al. [28] proposed an approach to detect brain tumors using a U-Net-based deep convolutional neural network. Their method consists of encoding and decoding stages, which allowed them to train their model efficiently by performing a set of data augmentation approaches. Soltaninejad et al. [29] proposed a super-pixel method for the segmentation process and classified tumors using SVM. Soltaninejad et al. [30] proposed an approach for the automated detection and segmentation of brain tumors based on minimum redundancy maximum relevance (MRMR), extremely randomized trees (ERT), and SVM. Remeseiro et al. [11] presented a survey of recent and efficient feature extraction methods used in such medical problems.
Zahraa et al. [31] proposed a hybrid approach based on multiple eigenvalues selection (MES) to automate the detection of brain tumors in MRI images. Their approach achieved an accuracy of 91.02%. Khairandish et al. [21] proposed a hybrid model based on a convolutional neural network (CNN) and SVM to detect brain tumors in MRI images. They also applied a pre-processing approach to the MRI images, which greatly improved their accuracy. However, their evaluation was insufficient because they used only 100 cases for training and 220 cases for testing.
In this paper, we propose a three-step pre-processing approach to enhance MRI image quality and a reliable deep convolutional neural network to accurately detect brain tumors.

III. THE METHODOLOGIES
A. THE PROPOSED PRE-PROCESSING APPROACH
In the challenge of classifying brain tumors in MRI images, identifying the correct pattern is the key to the classification process. MRI images present many issues to classification models, which can cause mislearning and degrade the classification accuracy. We therefore propose a three-step pre-processing approach.

1) REMOVING THE CONFUSING OBJECTS
Confusing objects such as text and black areas in the right and left corners are removed by cropping 100 pixels from each side of the image to isolate the brain, as shown in figure 2.
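The cropping step can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: it assumes the image is already loaded as an array and that "each side" means all four borders. The function name `crop_sides` and the 512 x 512 input size are hypothetical.

```python
import numpy as np

def crop_sides(image, margin=100):
    """Remove `margin` pixels from the top, bottom, left, and right borders."""
    height, width = image.shape[:2]
    if height <= 2 * margin or width <= 2 * margin:
        raise ValueError("image is too small for the requested margin")
    return image[margin:height - margin, margin:width - margin]

# Hypothetical 512 x 512 grayscale scan (all zeros, for shape checking only).
scan = np.zeros((512, 512), dtype=np.uint8)
cropped = crop_sides(scan)
print(cropped.shape)
```

A fixed margin only works when every scan is larger than twice the margin in both dimensions, which is why the sketch raises an error for smaller inputs.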

2) DENOISING THE MRI IMAGES
Figure 2. (A) An example of an MRI image before the cropping process. (B) The same image after the cropping process.
The non-local means (NLM) algorithm [12] deals with noise in MRI images efficiently. The noise in these images leads to learning undesirable patterns and, consequently, degrades the classification accuracy. The NLM algorithm greatly enhances the quality of the MRI images compared with the Gaussian [13] and median [14] algorithms according to the blind/referenceless image spatial quality evaluator (BRISQUE) [15]; see table 1.
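To make the idea behind non-local means concrete, the following is a deliberately naive NumPy sketch: each pixel is replaced by a weighted average of the pixels in a search window, with weights that decay with the mean squared difference between the surrounding patches. The patch size, search size, and smoothing parameter `h` are illustrative assumptions, not values from this paper, and the function name `nlm_denoise` is hypothetical. Optimized library routines such as OpenCV's `fastNlMeansDenoising` would normally be preferred in practice.

```python
import numpy as np

def nlm_denoise(img, patch=3, search=7, h=10.0):
    """Naive non-local means for a 2-D grayscale image (returns float64)."""
    img = np.asarray(img, dtype=np.float64)
    pr, sr = patch // 2, search // 2
    pad = pr + sr
    padded = np.pad(img, pad, mode="reflect")
    rows, cols = img.shape
    out = np.zeros_like(img)
    weight_sum = np.zeros_like(img)
    # Reference image padded by the patch radius only.
    ref = padded[sr:sr + rows + 2 * pr, sr:sr + cols + 2 * pr]
    for dy in range(-sr, sr + 1):
        for dx in range(-sr, sr + 1):
            # Same region shifted by (dy, dx) inside the search window.
            shifted = padded[sr + dy:sr + dy + rows + 2 * pr,
                             sr + dx:sr + dx + cols + 2 * pr]
            sq = (ref - shifted) ** 2
            # Mean squared difference between the two patches at every pixel.
            dist = np.zeros((rows, cols))
            for py in range(patch):
                for px in range(patch):
                    dist += sq[py:py + rows, px:px + cols]
            dist /= patch * patch
            weight = np.exp(-dist / (h * h))
            out += weight * shifted[pr:pr + rows, pr:pr + cols]
            weight_sum += weight
    return out / weight_sum

# Synthetic check: a flat image corrupted by Gaussian noise.
rng = np.random.default_rng(0)
clean = np.full((32, 32), 100.0)
noisy = clean + rng.normal(0.0, 10.0, clean.shape)
denoised = nlm_denoise(noisy)
```

The zero-shift term always has weight 1, so `weight_sum` is never zero and the final division is safe.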

3) HISTOGRAM EQUALIZATION
Histogram equalization [32] greatly enhances the contrast of the MRI images. Moreover, it allows small details to be detected by giving low-contrast regions an appropriate contrast. It accomplishes this by spreading out the most frequent intensity values. It also clears up the interference of the most frequent patterns in the MRI images, as shown in figure 3-E.
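A minimal sketch of global histogram equalization for an 8-bit grayscale image is given below. It maps each intensity through the normalized cumulative histogram so that frequent intensity values are spread over the full 0 to 255 range. The function name `equalize_histogram` and the synthetic test image are assumptions for illustration; the text does not state which implementation was used.

```python
import numpy as np

def equalize_histogram(img):
    """Global histogram equalization for an 8-bit grayscale image."""
    img = np.asarray(img, dtype=np.uint8)
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # CDF value at the darkest intensity present
    total = img.size
    if total == cdf_min:               # constant image: nothing to stretch
        return img.copy()
    # Look-up table that rescales the CDF to the range [0, 255].
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[img]

# Synthetic low-contrast image using only the intensities 100..119.
low_contrast = np.tile(np.arange(100, 120, dtype=np.uint8), (20, 1))
stretched = equalize_histogram(low_contrast)
print(stretched.min(), stretched.max())
```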

B. DATASET
The dataset used in the experiments and tests was formed from the Sartaj brain MRI images dataset [16] and the Navoneel brain tumor dataset [17]. The dataset contains two types of MRI [18] images: T1-weighted and T2-weighted. T1-weighted images are produced using short time to echo (TE) and repetition time (RT) settings, which are 14 and 500 milliseconds, respectively. T2-weighted images are produced using longer TE and RT settings, which are 90 and 4000 milliseconds, respectively. The dataset has been divided into three folders

C. TRAINING STRATEGY
In our training strategy, we trained our model from scratch, so our model can be considered problem-based. We used the image data generator [33] to generate a sufficient number of MRI images for the training process. The generation process produced data from the same domain as the used dataset, so the models learned only the desirable features. The training process has 60 epochs with 385 steps per epoch and a batch size of 16. A batch size of 16 means that 16 samples are passed to the model at a time until all of the training data has been passed, which completes a single epoch. This value is suitable for Google Colaboratory, where the RAM is limited to 13 GB; increasing the batch size in our case causes out-of-memory crashes during training. We saved the weights of each model after the training process, so the training does not need to be repeated to detect abnormalities in a given MRI image. The average training time per epoch is 253, 268, 233, and 196 seconds for VGG16, VGG19, CNN-SVM, and the proposed model, respectively. We ran the training process with 13 GB of RAM and the Tesla P100 GPU provided by Google Colaboratory notebooks. We applied our training strategy to all of the explored models; see figure 4.
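The training budget implied by these figures can be worked out directly. The short calculation below uses only the numbers quoted above; the total times assume that every model runs all 60 epochs at its average per-epoch time, which the text does not state explicitly.

```python
EPOCHS = 60
STEPS_PER_EPOCH = 385
BATCH_SIZE = 16
SECONDS_PER_EPOCH = {"VGG16": 253, "VGG19": 268, "CNN-SVM": 233, "Proposed": 196}

# Number of (generated) samples seen by a model in one epoch.
samples_per_epoch = STEPS_PER_EPOCH * BATCH_SIZE

# Estimated total training time in hours, assuming all 60 epochs are run.
total_hours = {name: EPOCHS * sec / 3600 for name, sec in SECONDS_PER_EPOCH.items()}

print(samples_per_epoch)
for name, hours in total_hours.items():
    print(f"{name}: {hours:.2f} h")
```

Under that assumption, each epoch covers 6160 samples, and the proposed model needs roughly 3.3 hours in total compared with about 4.2 hours for VGG16.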

D. THE PROPOSED MODEL
This paper proposes a deep convolutional neural network (DCNN) model. Thanks to the batch normalization operation, the proposed model addresses several issues, such as overfitting [43], slow learning rates, and a lack of training accuracy; see subsubsection III-D2. The proposed model consists of a convolutional part and a classifier part. The convolutional part has ten convolutional layers, five batch normalization layers and four max-pooling layers. The classifier part has three dense layers and two dropout layers, as shown in figure 5.

1) THE CONVOLUTIONAL OPERATION
Figure 5. The structure of the proposed architecture. The structure consists of five blocks: the first four form the convolutional part and the last one is the classifier part. The structure has ten convolutional layers, six BatchNormalization layers, and four max-pooling layers.
The convolutional operation is an important part of the proposed model because it is responsible for extracting features from the MRI image. The extracted features are expected to be good enough to support a reliable training process. For the first convolutional layer, the input vector consists of the input image; for the other convolutional layers, the input vector consists of the feature maps of the previous layer. We perform the convolutional operation using equations 1 and 2.
In equation 1, RELU is the rectified linear unit activation function and x is the input to RELU.
In equation 2, N represents the number of feature maps in the input vector; r and y are the feature map indices of the current layer and the previous layer, respectively; T is the layer index; $P_y^{0}$ initially represents the input image vector and $P_y^{T-1}$ represents the feature maps vector of the layer preceding layer T; K is the kernel matrix; u and v are the indices of the kernel values; and X and B are the size and bias of the filter, respectively.
The convolutional operation results in a distortion in the output values. This distortion causes overfitting [43], which reduces the learning rates. The overfitting issue is handled using the batch normalization operation, as shown in the next section.
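For reference, equations 1 and 2 can be written in the following standard form, which is consistent with the symbol definitions given in this subsection. This is a sketch rather than the exact typeset equations: the spatial output indices $i$ and $j$, the layer and map indices attached to $K$ and $B$, and the zero-based summation limits are assumptions that the text does not state.

```latex
\begin{align}
\mathrm{RELU}(x) &= \max(0,\, x) \tag{1} \\
P_r^{T}(i, j) &= \mathrm{RELU}\!\left( \sum_{y=1}^{N} \sum_{u=0}^{X-1} \sum_{v=0}^{X-1}
  K_{r,y}^{T}(u, v)\, P_y^{T-1}(i + u,\; j + v) + B_r^{T} \right) \tag{2}
\end{align}
```

Read this way, each output feature map $P_r^{T}$ sums the responses of an $X \times X$ kernel over all $N$ input feature maps, adds the filter bias, and passes the result through RELU.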

2) THE KEY ROLE OF BATCH NORMALIZATION OPERATION
During the training of a DCNN model, the distribution of the input values of a specific layer depends on the previous layers of that model. This variability causes overfitting [43] and reduces the learning rates. In this paper, batch normalization [22] is used to speed up the training process and reduce the overfitting [43] issue by standardizing the input vector in a way that eliminates the noisy features, which stabilizes the training process; see figure 6. Batch normalization allows lower dropout [44] rates to be used because it acts as a regularizer. The input to this process is a vector. The batch normalization process is performed through equations 3, 4, and 5.
In equation 3, $M_{BN}$ is the mean of the input X, and N is the number of elements in the input vector X.
In equation 4, $\sigma_{BN}^{2}$ is the variance of the input X.
In equation 5, $Y_i$ is the output of the batch normalization operation.
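The three quantities defined above can be illustrated with a short NumPy sketch of the standard batch normalization computation: the mean M_BN = (1/N) Σ X_i, the variance σ²_BN = (1/N) Σ (X_i − M_BN)², and the output Y_i = (X_i − M_BN) / sqrt(σ²_BN + ε). The small constant `eps` is an assumption added for numerical stability, and the learnable scale and shift parameters of the original batch normalization method [22] are omitted because the text does not mention them. The function name `batch_normalize` is hypothetical.

```python
import numpy as np

def batch_normalize(x, eps=1e-5):
    """Standardize a 1-D input vector; returns (output, mean, variance)."""
    x = np.asarray(x, dtype=np.float64)
    n = x.size
    mean = x.sum() / n                       # M_BN
    var = ((x - mean) ** 2).sum() / n        # sigma^2_BN
    y = (x - mean) / np.sqrt(var + eps)      # Y_i
    return y, mean, var

y, m, v = batch_normalize([2.0, 4.0, 6.0, 8.0])
print(m, v)
print(y)
```

For this input the mean and variance are both 5, and the output has approximately zero mean and unit standard deviation.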

IV. THE EXPERIMENTAL RESULTS
The explored models were implemented in Python using the Keras library [37] on TensorFlow [38] and Google Colaboratory notebooks [39], along with GitHub, where the used dataset is uploaded, and the Sklearn packages [40] of Python version 3.8.3 [41], released on 13 May 2020, to perform all statistical and computational tasks. We used the training part of the dataset in the training process. Figure 6 shows the training accuracy and the training loss of the proposed model along with the discussed models. It also shows the stable training process, with reduced overfitting and a high learning rate, of the proposed model in contrast to the discussed models. We also used the test part of the dataset to assess the results of the explored models (VGG16, VGG19, CNN-SVM, and the proposed model).
We present the confusion matrices for the explored models after using our pre-processing approach in table 3. In this paper, the NoTumor, GLIOMA, MENINGIOMA and PITUITARY classes are denoted as negative (shown by the - sign in table 3), positive (shown by the + sign in table 3), double-positive (shown by ++ in table 3) and triple-positive (shown by +++ in table 3), respectively. We present the confusion matrices for the explored models before using our pre-processing approach in table 4. We report the accuracy, specificity, sensitivity, and F1-score metrics for all of the explored models in table 5. We computed the specificity, sensitivity, accuracy, and F1-score on the test part of the used dataset. The Receiver Operating Characteristic curve (ROC curve) and the Precision-Recall curve (PR curve) are reliable performance evaluators [42]. We calculate the points of the ROC curves and PR curves for a specific class by comparing that class with the other classes.
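As an illustration of how per-class metrics follow from a multi-class confusion matrix, the sketch below treats each class in turn as the positive class and all other classes as negative (one-vs-rest) and applies the standard definitions: sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), accuracy = (TP + TN) / total, and F1 = 2TP / (2TP + FP + FN). The confusion matrix is made up for demonstration and is not taken from tables 3 or 4; the row/column convention (rows are true classes, columns are predictions) and the function name are assumptions.

```python
import numpy as np

def one_vs_rest_metrics(cm):
    """Per-class metrics from a square confusion matrix (rows: true, cols: predicted)."""
    cm = np.asarray(cm, dtype=np.float64)
    total = cm.sum()
    results = []
    for k in range(cm.shape[0]):
        tp = cm[k, k]
        fn = cm[k, :].sum() - tp   # class k samples predicted as another class
        fp = cm[:, k].sum() - tp   # other classes predicted as class k
        tn = total - tp - fn - fp
        # Assumes every class has at least one true and one predicted sample.
        results.append({
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "accuracy": (tp + tn) / total,
            "f1": 2 * tp / (2 * tp + fp + fn),
        })
    return results

# Made-up 4-class confusion matrix (illustrative values only).
example_cm = [
    [18, 1, 1, 0],
    [2, 16, 1, 1],
    [0, 1, 19, 0],
    [1, 0, 0, 19],
]
metrics = one_vs_rest_metrics(example_cm)
overall_accuracy = np.trace(np.asarray(example_cm)) / np.sum(example_cm)
```

The same one-vs-rest split is what allows ROC and PR curves to be drawn per class in a four-class problem.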

V. DISCUSSION
In this paper, an analytical comparison between the proposed model and several explored models has been carried out. The proposed model is compared with other well-known deep convolutional neural network (DCNN) models such as VGG16 [19], VGG19 [20] and the hybrid CNN-SVM [21]. These explored models were pre-trained with the standard ImageNet dataset [34], which has 1.2 million color images, to obtain initial weights. This process is commonly used to deal with the problem of having a small number of images in the investigated dataset. Therefore, the models can extract basic patterns such as points, edges and lines. The explored models have two main parts in their structures: the convolutional part and the classifier part. The convolutional part extracts the features of the input image, and the classifier part classifies these features into one of the intended classes of the discussed problem. These explored deep models have a very large number of parameters to be trained. They also require a large number of computations and a large memory footprint. In our case, ''Brain Tumor'', we have four classes, so we need to adapt the classifier part of these models. We therefore applied the transfer learning technique [35] by adding a dense layer with four output units and the ''Softmax'' activation function [36] to all of the explored models, since softmax is an activation function used for multi-class classification problems. Table 6 shows the input parameters of the explored models and the proposed model in this work. On the other hand, we used our proposed model as a problem-based DCNN model; see figure 5. The proposed model has several important differences compared with the explored models, as follows:
• Max-pooling layers can accelerate the diagnosis of MRI images because they decrease the size of the output of the convolutional layer. However, these layers can cause some features of the MRI images to be lost. The well-known DCNN models such as VGG16 [19] and VGG19 [20] each use 6 max-pooling layers. In contrast, our proposed model uses only 4 max-pooling layers, which is a balanced number according to our results.
• The explored models [19]- [21] were designed to deal with small-sized images, so they configured their convolutional filter with 3 × 3 to be able to find small patterns. However, the brain tumor patterns are relatively large, so using a large filter size in our convolutional operations will be a better choice. This is what we do in our proposed model, where the filter size becomes 7×7.

VI. CONCLUSION
A deep convolutional neural network architecture is proposed for the detection of glioma, meningioma and pituitary brain diseases, with the objective of high classification accuracy within a short time. First, a proper brain tumor dataset was formed for efficiently performing the training and testing processes. Second, a three-step pre-processing approach removes the confusing objects, denoises the MRI images, and enhances the contrast of these images. This approach had a direct positive effect on all of the explored models. Third, a training strategy trains our model on the desirable patterns from scratch. Fourth, we used our model to extract the features of the MRI images and classify them efficiently. We evaluated the proposed model on a dataset with 394 MRI images. The proposed model achieved an accuracy of 97.72% overall, 99% in detecting glioma, 98.26% in detecting meningioma, 95.95% in detecting pituitary and 97.14% in detecting normal images. In real practice, the proposed model can be considered an automated computer-aided detection tool for the timely detection of brain abnormalities in MRI images with high accuracy.

VII. FUTURE WORK
In the future, we are going to increase the number of MRI images in the used dataset to improve the accuracy of the proposed model. Moreover, applying the proposed approach to other types of medical images, such as X-ray, computed tomography (CT), and ultrasound, may form a basis for future studies.