Deep Learning Assisted Efficient AdaBoost Algorithm for Breast Cancer Detection and Early Diagnosis

Breast cancer is one of the most dangerous diseases and the second largest cause of cancer death among women. It starts when malignant lumps begin to grow from breast cells. Self-examination and periodic clinical checks support early diagnosis and thereby significantly improve the chances of survival. Breast cancer classification remains a major challenge for researchers and scientists. Neural networks have recently become a popular tool for cancer data classification. In this paper, a Deep Learning assisted Efficient AdaBoost Algorithm (DLA-EABA) for breast cancer detection is proposed and formulated mathematically using advanced computational techniques. In addition to traditional computer vision approaches, tumor classification methods based on transfer learning with deep convolutional neural networks (CNNs) are being actively developed. This study starts by examining CNN-based transfer learning to characterize breast masses for different diagnostic, predictive, or prognostic tasks across several imaging modalities, such as Magnetic Resonance Imaging (MRI), Ultrasound (US), digital breast tomosynthesis, and mammography. The deep learning framework contains several convolutional layers, LSTM layers, and max-pooling layers. Classification and error estimation are carried out in a fully connected layer and a softmax layer. This paper focuses on combining these machine learning approaches with feature selection and feature extraction methods, and on evaluating their output using classification and segmentation techniques to find the most appropriate approach. The experimental results show that the proposed method achieves a high accuracy of 97.2%, a sensitivity of 98.3%, and a specificity of 96.5% compared with other existing systems.

used to look inside the human body as a non-invasive method that helps doctors diagnose and treat disease [3], [4]. Early breast cancer diagnosis is possible with any of the available imaging methods; however, malignancy cannot be confirmed from these images alone [5]. There is a high risk of cancer cells being displaced into the interstitial tissue, veins, or fluid before the microscopic examination of tissue that confirms malignancy begins [6]. Cells may also be dragged along an operative incision or needle track, so a biopsy can increase the spread of cancer [7]. In mammogram-based diagnosis of breast cancer, the breast image is typically preprocessed to remove the pectoral muscle and thereby confine the detection process. The search for abnormalities can, therefore, be restricted to the breast profile area by eliminating the pectoral muscle and background areas from the mammogram [8], [9].
Cancer tissues with higher pixel intensities are detected more easily than the rest of the breast region. In dense breasts, however, normal tissue has intensities similar to those of cancer regions, and the tumor regions must still be identified successfully [10], [11]. Classifying breast tumor tissue as benign or malignant is a difficult task [12]. Feature extraction is an important step in the study of a mammogram. Conventional methods use handcrafted features to represent the content of images [13]. As an alternative approach, neural networks have emerged to extract the best features automatically [14]. Deep learning is an emerging field in which machine learning and AI use multiple nonlinear processing layers to learn features directly from the data [15]. Deep learning models can achieve high precision in image recognition, comparable to human performance. Figure 1 shows the survival rate for breast cancer by stage [16].
FIGURE 1. Survival rate for breast cancer by stage [22].
The purpose of this study is to improve the prediction of tumor prognosis and to obtain a more refined classification of results [17], [18]. In comparison with previous classifier approaches, the method in this paper combines supervised classifier learning with an unsupervised feature learning process [19], [20]. A fully connected convolutional layer is used for feature selection, feature extraction, detection, segmentation, and classification in order to evaluate different stages of breast cancer [21].
In this case, the CNN learns at the full resolution of every pixel of the actual input data to achieve more precise pixel-to-pixel segmentation, especially at the edges of objects [23], [24]. To this end, the sub-sampling and pooling layers of the network can be removed so that the convolutional layers extract and learn the full spatial characteristics of the input signal [25], [26]. Figure 2 shows the difference between traditional machine learning and deep learning.
The main contributions of the paper are as follows:
• To propose the deep learning assisted efficient AdaBoost algorithm for breast cancer detection and early diagnosis.
• To build an ensemble classifier with a boosting algorithm to robustly detect different metastases in breast tumors.
• To demonstrate the experimental results with the help of the dataset at https://wiki.cancerimagingarchive.net/.
The rest of the paper is organized as follows: sections 1 and 2 discuss the background and significance of detecting breast cancer. In section 3, the Deep Learning assisted Efficient AdaBoost Algorithm (DLA-EABA) for breast cancer detection is proposed. In section 4, the experimental results are illustrated. Finally, section 5 concludes the research article.

II. RELATED SURVEY AND ITS IMPORTANCE IN THE CURRENT AREA OF RESEARCH
Qi et al. [26] proposed the deep active learning framework (DALF) for the classification of breast cancer. The procedure identifies the most informative unlabeled samples, which are annotated and inserted into the training set. The model is then updated with the growing number of training samples. The proposed deep active learning system uses two selection strategies: an entropy-based strategy and a confidence-building strategy. The approach was validated on a publicly available histopathological image data collection in which every image patch is labeled as malignant or benign. Active learning is used to choose unlabeled samples for annotation and to update the growing training set iteratively. The major objective of the work is to ease the annotation burden of large-scale image classification.
Saha and Chakraborty [27] suggested Her2Net for the classification and segmentation of cell membranes and nuclei in breast cancer estimation. The most critical instances were images with Her2Net monoclonal antibodies colored cytoplasm. The output metrics were highly correlated between the training and testing data sets, so the data can be used in these cases. The proposed Her2Net shows an extremely low false-positive rate, and its performance improved as the number of training image patches increased. Her2Net is used in patching, classifying, and rating applications. The authors used a deep neural network with convolutional and deconvolutional components for the task of cell membrane and nucleus segmentation. An advantage of Her2Net is that it can be incorporated into other code systems for segmentation, grading, and scoring.
Samala et al. [28] introduced Multi-Stage Transfer Learning for Digital Breast Tomosynthesis using deep neural networks (MSTL-DNN). Knowledge learned from ImageNet was first transferred to mammography data and then optimized for digital breast tomosynthesis (DBT) data in a multi-stage transfer process, so that the network was fine-tuned first on mammography data and then on DBT data. In the second stage, freezing most of the convolutional neural network structure was compared with freezing only the first convolutional layer across two transfer networks.
Al-Antari et al. [29] proposed the Full resolution convolutional network (FrCN) with a CAD framework for X-ray mammograms to detect masses and classify them as malignant or benign. The publicly accessible and annotated INbreast database was used to evaluate the suggested integrated CAD framework in terms of the accuracy of classification, identification, and segmentation. The results of evaluating the proposed CAD framework with four-fold cross-validation tests show a mass detection accuracy of 98.96%, a Matthews correlation coefficient of 97.62%, and an F1 score of 99.24% on the breast data set. Pixel-to-pixel mass segmentation could be key to reducing the false-negative and false-positive pixel rates and increasing the overall performance of the proposed CAD system. Compared with the latest methods on the subject, the results of the segmentation-based Convolutional Neural Network classification show the viability and efficiency of the suggested CAD system.
Shamy and Dheeba [30] introduced the K-means Gaussian Mixture Model and Convolutional Neural Network (GMM-CNN) for the detection and classification of breast cancer. The first stage identifies a region of interest (ROI). The second stage performs ROI texture extraction and feature optimization with an optimized feature selection algorithm. The third stage classifies predicted anomalies as malignant or benign with a CNN. The neural network approach gave the learning algorithm good accuracy. The model was used to automate the classification otherwise performed by an expert and to improve the identification of different types of breast cancer. The outcome analysis showed that the suggested model significantly decreases the processing time and improves the quality of the solutions.
To overcome these issues, this paper proposes the Deep Learning assisted Efficient AdaBoost Algorithm (DLA-EABA) for breast cancer detection. A deep CNN is used to classify masses as either malignant or benign, and manually predicted masses are fed directly into a deep convolutional neural network to produce integrated high-level deep image features. This contrasts with machine-learning strategies that are based on handcrafted features. Automated mass detection is still a challenge, although multiple studies have discussed the need to identify breast anomalies automatically. Section 3 discusses the learning algorithm as an effective solution for breast cancer detection.

III. MATERIALS AND METHODS
In this paper, the Deep Learning assisted Efficient AdaBoost Algorithm (DLA-EABA) for breast cancer detection is proposed. To achieve high precision, a CNN requires extensive training data. Because large datasets are rarely available, training and evaluation were conducted on the most readily available data on the Internet. The data set used for this analysis is https://wiki.cancerimagingarchive.net/. Figure 3 shows breast cancer detection and classification using the proposed DLA-EABA method.

A. CASE 1: AUTO ENCODER AND DECODER ANALYSIS FOR CLASSIFICATION
In the proposed feature learning method, a stacked autoencoder is used to build a deep convolutional neural network by stacking many autoencoders hierarchically. Non-linear transformations (NLTs) are applied to the combined representation of the actual data $\tilde{Y}$ as input, and the encoder and decoder sections of the autoencoder each contain multiple NLTs. Here m indicates the number of layers and ρ indicates the activation function, while $\omega^{(i)}$, $g^{(i)}$, and $a^{(i)}$ represent the weight matrix, hidden vector, and bias vector in the i-th layer, respectively.
A. Sparse autoencoder (SAE): Adding a regularizer to the cost function makes the autoencoder sparse. This regularizer is based on the average output activation value of a neuron, which is measured for each neuron j.
B. Reconstruction layer: The autoencoder minimizes the distance between the input $\tilde{Y}$ and the reconstructed output $g^{(m)}$. Because the number of network parameters is very large and the availability of training samples is very limited, training a deep neural network carries a risk of overfitting. To reduce this problem, the hidden layers are penalized by sparsity penalties, which gives the reconstruction loss in equation (4), where $g^{(m)}$ denotes the output of the reconstruction and η is the hyperparameter that balances the contributions of the different terms. In this method, the network is configured heuristically with an input layer, four hidden layers, and an output layer.
The activation function in the autoencoder: The exponential linear unit (ELU) is employed as the activation function ρ to speed up training of the deep convolutional neural network and to obtain greater classification precision. Its parameter β controls the value to which an ELU saturates for negative net inputs.
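The layer-wise transformation, the average activation, the penalized reconstruction loss, and the ELU described above can be written compactly as follows. This is a sketch in the standard form used for stacked sparse autoencoders, not a transcription of the paper's numbered equations; the sample index k, the sample count n, the symbol $\bar{g}_j$, and the generic penalty term Ω are assumptions introduced here for illustration.

```latex
% Layer-wise non-linear transformation, i = 1, ..., m, with g^{(0)} = \tilde{Y}
g^{(i)} = \rho\left(\omega^{(i)} g^{(i-1)} + a^{(i)}\right)

% Average output activation of hidden neuron j over n training inputs (assumed notation)
\bar{g}_j = \frac{1}{n} \sum_{k=1}^{n} g_j\left(\tilde{Y}_k\right)

% Reconstruction loss with a sparsity penalty \Omega weighted by \eta (generic form)
\mathcal{L} = \left\lVert \tilde{Y} - g^{(m)} \right\rVert^{2} + \eta \, \Omega\left(\bar{g}\right)

% Exponential linear unit with saturation parameter \beta
\rho(z) =
\begin{cases}
z, & z > 0 \\
\beta \left(e^{z} - 1\right), & z \le 0
\end{cases}
```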
The biases are initialized to 0, and the weight matrix $\omega^{(i)}$ of each layer is initialized during training of the neural network from a uniform distribution, as shown in equation (7), where V[−b, b] is the uniform distribution over the interval [−b, b] and l is the hidden layer size.
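The activation and initialization steps described above can be illustrated with a minimal sketch in plain Python. It assumes one dense layer per non-linear transformation; the function names, the layer sizes, and the bound b = 0.5 are hypothetical placeholders, since the text does not give the expression for b.

```python
import math
import random


def elu(z, beta=1.0):
    """Exponential linear unit: z for z > 0, beta * (exp(z) - 1) otherwise."""
    return z if z > 0 else beta * (math.exp(z) - 1.0)


def init_layer(n_in, n_out, b, rng):
    """Draw weights uniformly from [-b, b]; initialize all biases to 0."""
    weights = [[rng.uniform(-b, b) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def layer_forward(weights, biases, g_prev, beta=1.0):
    """One non-linear transformation: g = elu(W g_prev + a)."""
    return [
        elu(sum(w * x for w, x in zip(row, g_prev)) + a, beta)
        for row, a in zip(weights, biases)
    ]


def mean_activation(hidden_vectors):
    """Average output of each hidden neuron over a batch of hidden vectors."""
    n = len(hidden_vectors)
    return [sum(column) / n for column in zip(*hidden_vectors)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical sizes: 4 inputs, 3 hidden units; b = 0.5 is a placeholder.
    W, a = init_layer(n_in=4, n_out=3, b=0.5, rng=rng)
    batch = [[0.2, -0.1, 0.4, 0.0], [1.0, 0.5, -0.3, 0.7]]
    hidden = [layer_forward(W, a, y) for y in batch]
    print(mean_activation(hidden))
```

The mean activation computed at the end is the quantity a sparsity regularizer would penalize; the penalty itself is left out because its exact form is not stated in the text.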

B. CASE 2: CLASSIFIER LEARNING ADABOOST ALGORITHM
The AdaBoost algorithm is used in the classifier learning stage of the proposed approach to train classifiers with a supervised learning algorithm, yielding a binary classification that best separates positive and negative cases. Consider the set of training examples $\{(y_j, x_j)\}_{j=1}^{n}$, where $y_j$ indicates a training sample and $x_j$ is a Boolean value assigned from the clinical data of cancer patients during the dataset preprocessing step. AdaBoost is an efficient method that improves the classification accuracy of a simple learning algorithm by combining a set of weak classifiers $\{h_i(y')\}$ into a strong classifier $h(y')$. In this study, a decision stump is used as the weak classifier learning algorithm. The output of $h(y')$ is 1 if $y'$ is classified as a positive example and 0 otherwise.
This variant restricts each weak classifier to a single feature. As a result, every weak classifier consists of a single feature $f_i$, a threshold $\theta_i$, and a parity $q_i$ that is either −1 or 1 and denotes the direction of the inequality. The boosting algorithm estimates suitable values of $\theta_i$ and $q_i$ for every weak classifier $h_i(y')$. To do so, it reviews all potential combinations of $q_i$ and $\theta_i$, whose number is limited because the number of training examples is finite. The resulting parameters are given in Algorithm 1. To detect the medical outcome for cancer patients, a series of classifier learning labels is included with the features produced by the proposed unsupervised, two-phase feature learning method. In this paper, a variant of the AdaBoost algorithm, which shows high efficiency in classification, is adopted as the learning method for the classifier.
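The boosting procedure described above can be sketched as follows. This is a minimal illustration that assumes the common decision-stump form of AdaBoost with labels in {0, 1}, in which a stump outputs 1 when q·f(y) < q·θ; the weight update, the vote weights log(1/β), and all function names are assumptions of this sketch rather than the paper's Algorithm 1.

```python
import math


def stump_predict(sample, feature, theta, parity):
    """Weak classifier: 1 if parity * f(y) < parity * theta, else 0."""
    return 1 if parity * sample[feature] < parity * theta else 0


def best_stump(samples, labels, weights):
    """Search every feature, threshold, and parity for the lowest weighted error."""
    best = None
    for feature in range(len(samples[0])):
        for theta in sorted({s[feature] for s in samples}):
            for parity in (-1, 1):
                err = sum(
                    w
                    for s, x, w in zip(samples, labels, weights)
                    if stump_predict(s, feature, theta, parity) != x
                )
                if best is None or err < best[0]:
                    best = (err, feature, theta, parity)
    return best


def adaboost_train(samples, labels, rounds):
    """Return a list of (vote weight, feature, theta, parity) tuples."""
    n = len(samples)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        total = sum(weights)
        weights = [w / total for w in weights]  # normalize to a distribution
        err, feature, theta, parity = best_stump(samples, labels, weights)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # keep beta finite and positive
        beta = err / (1.0 - err)
        # Correctly classified examples (e_j = 0) are down-weighted by beta.
        weights = [
            w * (beta if stump_predict(s, feature, theta, parity) == x else 1.0)
            for s, x, w in zip(samples, labels, weights)
        ]
        ensemble.append((math.log(1.0 / beta), feature, theta, parity))
    return ensemble


def adaboost_predict(ensemble, sample):
    """Strong classifier: weighted vote compared with half the total vote weight."""
    score = sum(a * stump_predict(sample, f, t, p) for a, f, t, p in ensemble)
    return 1 if score >= 0.5 * sum(a for a, _, _, _ in ensemble) else 0


if __name__ == "__main__":
    # Hypothetical toy data: two features per sample, Boolean outcome labels.
    X = [[5.0, 1.0], [1.0, 2.0], [4.0, 3.0], [2.0, 4.0]]
    labels = [0, 0, 1, 1]
    model = adaboost_train(X, labels, rounds=3)
    print([adaboost_predict(model, s) for s in X])
```

Because thresholds are taken from the observed feature values, the search over (q, θ) is finite, which is the property the text relies on.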
In every test fold, the predicted regions are checked for correctness. The proposed deep learning assisted efficient AdaBoost algorithm can predict masses or tumors even if they lie inside dense tissue or over the pectoral muscle. Figure 4 shows examples of breast cancer detection using the proposed DLA-EABA on test images.

In Algorithm 1, $e_j = 0$ if instance $y_j$ is correctly classified by $h_r$, and $e_j = 1$ otherwise.
The network layer used to analyze the image patches is discussed as follows.

C. CASE 3: DEEP CONVOLUTIONAL NEURAL NETWORK LAYER
The convolution layer convolves kernels with the image patches. A kernel/filter bank is defined as shown in equation (11), where $F^{m}_{l}$, $l \in \{1, 2, 3, \ldots, e^{m}_{g}\}$, is a linear filter of size n × n and $e^{m}_{g}$ denotes the number of kernels. The filter $F^{m}_{l}$ moves over the input patch $J_q$ to execute the local convolution operation. An input patch $J^{m-1}_{q}$ of size ω × ω is convolved with an n × n local receptive field in the image $J^{m}$. The convolution output is passed through the activation function, as given in equation (12).
The Long Short-Term Memory (LSTM) unit contains an input, a memory cell, an output, and three gates (an input gate, an output gate, and a forget gate). These gates use logistic functions to compute the LSTM activations, and they are treated as conventional artificial neurons in our method. Each gate also has its own weight and bias values, independent of the LSTM output of the subsequent layers. The input gate controls the extent to which a value is transferred into the memory. The forget gate controls whether a memory value is retained, and finally the output gate controls how much of the stored value contributes to the LSTM activation. The gates and activations are determined for h = 1, 2, 3, ...
In these equations, S, ξ, and tanh denote the weight, the activation function, and the non-linear function, respectively. $Y_g$ is the input, $D_h$ is the memory cell, $G_h$ is the output, $O_g$ is the output gate, $J_h$ is the input gate, and $F_h$ is the forget gate.
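The gate logic described above can be illustrated with a single time step of a one-unit LSTM. This sketch assumes the standard LSTM formulation with a logistic function for the three gates and tanh for the candidate value and the output; the recurrent weights, the bias terms, and the parameter layout are assumptions of this sketch, since the paper's own gate equations are not reproduced here.

```python
import math


def sigmoid(z):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(y, g_prev, d_prev, params):
    """One time step of a single-unit LSTM.

    y: current input, g_prev: previous output, d_prev: previous memory cell.
    params[name] = (input weight, recurrent weight, bias) for each of
    "input", "forget", "output", and "candidate" (a hypothetical layout).
    """
    def pre_activation(name):
        s_y, s_g, bias = params[name]
        return s_y * y + s_g * g_prev + bias

    j = sigmoid(pre_activation("input"))    # input gate
    f = sigmoid(pre_activation("forget"))   # forget gate
    o = sigmoid(pre_activation("output"))   # output gate
    candidate = math.tanh(pre_activation("candidate"))
    d = f * d_prev + j * candidate           # memory cell update
    g = o * math.tanh(d)                     # unit output
    return g, d


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to make the example run.
    params = {
        "input": (0.5, 0.1, 0.0),
        "forget": (0.3, 0.2, 1.0),
        "output": (0.4, 0.1, 0.0),
        "candidate": (0.9, 0.0, 0.0),
    }
    g, d = 0.0, 0.0
    for y in [0.2, -0.4, 1.0]:
        g, d = lstm_step(y, g, d, params)
    print(g, d)
```

With a strongly positive forget-gate bias and a strongly negative input-gate bias, the memory cell is carried over almost unchanged, which matches the role the text assigns to the forget gate.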

D. CASE 4: SOFTMAX REGRESSION
This is a classification process that generalizes logistic regression to multinomial problems. Softmax regression takes the raw class scores produced by a linear model and generates a class probability distribution. The softmax function is used, where $x$ denotes the actual class and $\hat{x}$ denotes the predicted class, and the loss is calculated using the cross-entropy function. Based on the above discussion, the experimental results show that the proposed method achieves a high accuracy of 97.2%, a sensitivity of 98.3%, and a specificity of 96.5% compared with other existing systems.
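The mapping from raw class scores to probabilities, and the cross-entropy loss on the true class, can be sketched as follows. The function names and the example scores are hypothetical; subtracting the maximum score before exponentiation is a standard numerical-stability step that does not change the result.

```python
import math


def softmax(scores):
    """Convert raw class scores into a probability distribution."""
    m = max(scores)  # shift by the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, true_class):
    """Negative log-probability assigned to the true class index."""
    return -math.log(probs[true_class])


if __name__ == "__main__":
    # Hypothetical raw scores for the classes (benign, malignant).
    probs = softmax([1.2, 2.9])
    print(probs, cross_entropy(probs, true_class=1))
```

The loss is small when the predicted probability of the actual class is close to 1 and grows without bound as that probability approaches 0.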

IV. NUMERICAL RESULTS AND DISCUSSION BASED ON THREE POINTS LOGIC
Images were obtained at three time points: before the start of treatment (t1), after the start of treatment (t2), and either after the end of treatment or after completion of all treatments (until surgery) (t3). Images are thus available at three stages. Positron Emission Tomography (PET) and computed tomography (CT) images were obtained using an in-house support system so that the images can be registered with the MRI data. These series aim to provide clinical imaging data in the context of early treatment for breast cancer for the development and evaluation of quantitative imaging methods. The acquisition parameters for the CT scan are as follows: the tube current is 80 mA for a 70 kg patient and the scaling is up to 120 kVp for all patients; the pitch is 1675/1; and the tube voltage is 120 kVp. The administered FDG activity is about 370 MBq (10 mCi) for a 70 kg patient and is scaled by weight. FDG is administered intravenously through an antecubital vein contralateral to the affected breast. Emission data are collected in 3D mode for 2 minutes per bed position after 60 min. The emission scan is acquired from the skull to the mid-femur, at first only in a prone position over the breast. Figure 5(a) shows the area affected by breast cancer cells and figure 5(b) shows the image interpretation of tissue density.

A. MATTHEWS CORRELATION COEFFICIENT INDEX
The numbers of patients with good and poor outcomes in the cancer datasets are severely imbalanced; for example, the data set contains 154 patients with good outcomes but only 28 patients with poor outcomes. The Matthews correlation coefficient (MCC) index, expressed in equation (20), is reported to be an efficient evaluation criterion when the distribution of the datasets is extremely unbalanced with respect to the two major estimation parameters. In equation (20), $t^{+}$ denotes the true positives, $t^{-}$ denotes the true negatives, $f^{+}$ denotes the false positives, and $f^{-}$ denotes the false negatives. Figure 6 and Table 1 show the Matthews correlation coefficient index of the proposed DLA-EABA method. In the dataset, there is an overlap called distance: as false negatives or false positives increase (declining overlap), the likelihood of a differing correlation rises. The proposed DLA-EABA method has a high correlation coefficient index when compared to other existing methods.
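The Matthews correlation coefficient can be computed directly from the four confusion-matrix counts. The sketch below uses the standard definition, MCC = (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)); returning 0 when the denominator vanishes is a common convention assumed here, and the example classifier is hypothetical.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


def accuracy(tp, tn, fp, fn):
    """Fraction of all cases that are classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


if __name__ == "__main__":
    # Hypothetical classifier that labels every patient as a good outcome,
    # applied to the 154 / 28 class split mentioned in the text.
    tp, fn, fp, tn = 154, 0, 28, 0
    print(accuracy(tp, tn, fp, fn), matthews_corrcoef(tp, tn, fp, fn))
```

In this hypothetical case the accuracy looks high because of the class imbalance while the MCC is 0, which illustrates why MCC is preferred for unbalanced data.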

B. PERCENTAGE SURVIVAL RATE
Detecting breast cancer early improves the survival rate and decreases mortality. Breast cancer is divided into different stages according to the size of the tumors, lymph gland involvement, and metastasis. The survival rate of breast cancer is the percentage of persons who survive for a certain period after diagnosis. Hence, the one-, two-, three-, four-, five-, seven-, and ten-year survival rates are considered, together with the significance of the association between the breast cancer survival rate and variables including side of involvement, size of the tumor, extent of tumor malignancy, number of lymph nodes involved, estrogen receptor status, progesterone receptor status, age, stage of the disease, state of metastasis, form of pathology, and type of surgery. Figure 7 shows the percentage survival rate of the proposed DLA-EABA system.

C. SEGMENTATION ACCURACY ANALYSIS
Features extracted from anomalous hotspots may separate malignant and benign tumors more accurately than features extracted from the whole breast. Further, it is important to diagnose the tumor and to locate it spatially in the thermal image. For an accurate diagnosis of malignancy from thermograms, it is therefore essential to segment the possible tumor areas with epoch loss. Equation (21) gives the segmentation accuracy of the suggested approach.

D. DICE COEFFICIENT RATIO
DICE, also known as the overlap index, is the metric used most frequently for validating medical volume segmentation. Besides directly comparing the automatic segmentation with the ground truth, DICE is often used to evaluate reproducibility (repeatability). As a reproducibility measure, DICE serves as a statistical validation of manual annotation: segmenters repeatedly annotate the same MRI image, and the pairwise overlap of the repeated segmentations is then calculated using DICE, based on a precision versus recall time analysis. Figure 9 (a & b) shows the Dice coefficient of the proposed DLA-EABA system. Table 2 shows the Dice coefficient ratio of the suggested DLA-EABA approach. One important observation is that the correlation between Dice and the distance-based measurements decreases as the overlap decreases, i.e. when false positives and false negatives increase. This is intuitive because, in contrast to distance-based metrics, Dice does not consider the positions of voxels that are not in the overlapping region (false negatives and false positives), so they contribute the same value regardless of the distance between voxels.
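The overlap computation behind the Dice coefficient can be sketched for two binary masks. The sketch uses the standard definition Dice = 2|A ∩ B| / (|A| + |B|); the masks are hypothetical nested lists of 0/1 values, and returning 1.0 for two empty masks is a convention assumed here.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice = 2 * |A intersect B| / (|A| + |B|) for two binary masks."""
    pred = [p for row in pred_mask for p in row]
    true = [t for row in true_mask for t in row]
    if len(pred) != len(true):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, true) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in true if t)
    if size_sum == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return 2.0 * intersection / size_sum


if __name__ == "__main__":
    # Hypothetical 3 x 3 masks: automatic segmentation vs. ground truth.
    predicted = [[0, 1, 1], [0, 1, 0], [0, 0, 0]]
    reference = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
    print(dice_coefficient(predicted, reference))
```

Only the counts of overlapping and non-overlapping pixels enter the formula, so a false positive far from the tumor lowers the score exactly as much as one adjacent to it.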

E. THE AREA UNDER THE CURVE
In the test folds from the prediction and segmentation phases, all tumors segmented by the suggested DLA-EABA are passed sequentially to the classification step. The efficiency of classification is calculated in terms of DC, MCC, segmentation accuracy, and AUC. The method achieves high AUC scores and accuracy (ACC) rates and shows better detection of both positive and negative instances based on Receiver Operating Characteristic (ROC) analysis. The proposed ensemble classifier achieves a better AUC on most datasets and fewer false negatives than other existing methods. Figure 10 (a & b) shows the AUC curve with segmentation results. This paper focuses on combining these machine learning approaches with feature selection and feature extraction methods, and on evaluating their output using classification and segmentation techniques to find the most appropriate approach. The experimental results show a high accuracy of 97.2%, a sensitivity of 98.3%, and a specificity of 96.5% when compared to other existing systems.
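The evaluation quantities named above can be computed as in the following sketch. Sensitivity and specificity use their standard definitions; the AUC is computed through its rank interpretation, i.e. the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function names and example scores are hypothetical.

```python
def sensitivity_specificity(tp, tn, fp, fn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for s, x in zip(scores, labels) if x == 1]
    neg = [s for s, x in zip(scores, labels) if x == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical classifier scores and true labels (1 = malignant).
    scores = [0.1, 0.4, 0.35, 0.8]
    labels = [0, 0, 1, 1]
    print(roc_auc(scores, labels))
    print(sensitivity_specificity(tp=8, tn=9, fp=1, fn=2))
```

The pairwise form is quadratic in the number of cases, which is acceptable for small test folds; a sort-based rank computation gives the same value for larger sets.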

V. OUTCOMES AND THEIR DISCUSSION
In this paper, the deep learning assisted efficient AdaBoost algorithm has been proposed for breast cancer detection and early diagnosis. The AdaBoost algorithm is used for the final prediction function to build an ensemble classifier. As the results of the estimation test show, the suggested approach has a greater capacity to predict, and the deep-learning classifier is better than the other classifiers. Our discussion and analysis have shown excellent potential for fast generalization and directly enhanced prediction efficiency using features that are derived automatically by the neural network. Taking advantage of the high-level features learned by the Convolutional Neural Network deep learning model, the proposed DLA-EABA helped improve the system performance. Deep learning approaches adapt to the specific characteristics of a dataset, as they are based on machine learning and a separate model is created for each data set. The proposed DLA-EABA method has high accuracy in detecting breast cancer masses and increases the patient survival rate. The performance of the suggested approach is very high when compared to other existing methods.
ZHONGJUN GAO received the master's degree from the University of Electronic Science and Technology. He has been engaged in national, medical, and health informatization for a long time. He is a Senior Engineer of the Medical and Big Data Institute, University of Electronic Science and Technology. He has mainly carried out national health information standards and standards compliance testing research, and participated in five national health industry standard formulation projects. He won one Third Prize of the Sichuan Science and Technology Progress Award, and one First Prize of the Science and Technology Award and the honorary title of Outstanding Science and Technology Worker from the Sichuan Institute of Electronics. He has five standard revision items, four of which have been officially released and used.
SHUANG WANG received the master's degree. She is the Deputy Director of the project department with the Shenzhen Center for Health Information, has participated in a number of health informatization projects, topics, and standards in Shenzhen, and has accumulated rich project management experience. She is a member of health information and health care big data with the Institute of Health Archives and the Regional Health Informatization Professional Committee, China.
MINGJIE HE received the B.E. degree from the South China University of Technology and the M.E. degree from the University of Electronic Science and Technology of China. He has 15 years of medical industry experience, participating in medical informatization and solution development, and specializing in software technology development, project management, and productization, especially PACS technology architecture and design. He is a Deputy Chief Architect and a Senior Project Manager of Jinpan Company. He has led a number of core technology development works, and led and participated in a number of national, provincial, and key research and development projects.
JIPENG FAN received the master's degree in cognitive linguistics from Southwest University for Nationalities. He is the Department Manager of Jinpan Company. He is SNOMED's official contact for international clinical medical terminology, and participated in the translation and publication of the Chinese version of the International DICOM Standard (2015C version) and other works. He has also participated in the formulation of one national medical imaging industry standard. He received the Third Prize of the Sichuan Science and Technology Progress Award.