Automated Breast Cancer Detection and Classification in Full Field Digital Mammograms Using Two Full and Cropped Detection Paths Approach

Breast cancer is one of the most severe diseases threatening women's lives and, as confirmed by the World Health Organization, it contributes to an annually increasing death rate. Early detection is one of the main factors in reducing the severity of the disease. However, with the huge number of mammograms taken daily, the checking process conducted by radiologists becomes lengthy, tiring, and prone to errors. Hence, given the tremendous success of CNNs in bioinformatics, Computer-Aided Detection (CAD) systems have proved necessary for handling the challenging cases missed by ordinary checking, which decreases the false positive and false negative rates. In this paper, we present a YOLO-V4 based CAD system that localizes lesions in full and cropped mammograms and then classifies them to obtain their pathology type. The proposed method consists of three phases applied to the full-field digital mammograms of the INbreast dataset. First, the mammograms are preprocessed to remove any extra artifacts and then cropped into small, overlapped slices. Second, masses are localized through two paths, full-mammogram detection and cropped-slice detection, after configuring the YOLO-V4 model. Third, other feature extractors such as ResNet, VGG, and Inception are used to classify the localized lesions and to compare their performance against YOLO. The experimental results demonstrate the impact of using YOLO-V4 as a detector with the two detection paths, the full mammogram and the cropped slices, in an attempt to avoid the data loss caused by resizing large mammograms. Our system detects mass locations with an overall accuracy of ≈98%, which is higher than that of recently introduced breast cancer detection methods. Moreover, it distinguishes between benign and malignant tumors with an accuracy of ≈95%.


I. INTRODUCTION
Cancer is one of the most dangerous genetic diseases threatening people's lives, especially because it is often discovered late. Cancer has more than 100 kinds, and one of the most common is breast cancer. According to the World Health Organization (WHO), breast cancer is considered the most common and dangerous cancer type worldwide and, besides lung cancer, leads annually to a larger death rate. In the United States alone, WHO expects that about 43,600 women will die from breast cancer in 2021.
The associate editor coordinating the review of this manuscript and approving it for publication was Xi Peng.
Breast cancer is categorized under carcinomas, which form in the cells that cover most of the outer and inner surfaces of the body. Each cancer type presents as either benign or malignant tumors [1]. Once breast cancer appears in a specific part of the breast, it starts invading the surrounding tissues very quickly, especially in the case of malignant tumors. Benign tumors are sometimes larger than malignant ones; however, they do not spread as widely, and once they are removed, they do not grow back. Many studies have confirmed that the noticeable increase in the death rate caused by breast cancer is due to late detection through screening, which in almost all cases results in larger tumors that are consequently more difficult to treat [2].
Also, with the huge number of mammograms screened daily, the process becomes hard, lengthy, complex, and time-consuming for radiologists, and consequently, in many cases, prone to errors [3]. Radiologists still miss between 10% and 30% of cancers, and errors take two forms: judging a case as benign despite its malignancy (false negative), or requiring additional screening because a tumor is suspected to be malignant although it is benign (false positive) [4], [5].
This is where Computer-Aided Detection (CAD) systems come in: they act as a second reader alongside the radiologists' decisions to decrease both false negatives and false positives. The development and use of CAD systems has increased since 2001, and especially in the period from 2004 to 2008, when it increased by 91% in both private labs and hospitals [6]. CAD has been shown to improve cancer detection at earlier stages by decreasing the false positive and false negative rates [7]-[10]. In addition, it reduces the time radiologists need to check the screened mammograms [11]. CADs are usually developed to localize the lesions in the screened breast mammogram [12], [13].
Mammography is one of the most common ways to detect breast cancer at an early stage. The main reason mammography is the preferred choice for women is that it needs only a very low dose of X-rays, which makes it relatively safe, unlike Magnetic Resonance Imaging (MRI), which requires strong magnetic fields and radio waves. Hence, our proposed approach works with mammographic datasets to detect existing tumors.
Although many developments have been made to enhance existing CADs for better detection accuracy, many challenges still act as a barrier. These challenges lie in either the developed CADs or the mammogram characteristics. The main problem with existing CADs is that they cannot be applied in the real-life systems used by hospitals or labs because detection takes too long. Some CADs perform detection in real time but with lower accuracy. Hence, there is a trade-off between good detection performance and fast execution. On the other hand, screened mammograms usually have a large size and pixel depth, so almost all published work resizes them before training a specific model. Resizing loses some of the important information that may exist in the screened mammograms, and consequently it may affect the obtained accuracy [14].
Nowadays, the Convolutional Neural Network (CNN) has proved its powerful capability to achieve a noticeable improvement in results compared with traditional methods [15]. The huge advances in deep learning models have led to replacing handcrafted features with deep learning-based models that automatically learn the essential features, which in our case are those most relevant to the tumors we need to detect. Another important advantage of deep learning models over conventional ones is their ability to learn through more than one level of representation. Deep models can thus learn different forms of data representation, beginning from the raw data and moving through higher representation levels until reaching the highest level that completes the learning process [16]-[18].
In this paper, we propose a You Only Look Once (YOLO) based real-time CAD system that uses a full-mammogram cropping and merging algorithm to localize masses with high accuracy in a few seconds. The proposed system is composed of two paths. The first path passes the full mammogram to our system after resizing it to localize masses. The second path crops the full mammogram into small slices without resizing the original mammogram, detects any existing masses in the slices, and finally merges the slices to recover any lesions missed when using the full mammogram in the first path. The novelty of this work lies in proposing a full detection and classification model based on YOLO-V4 that detects tumors in the breast using two approaches, the full and cropped detection paths, and then replaces YOLO's classification layers with other feature extractors to classify the localized lesions. Thanks to the cropping idea, the proposed architecture addresses the shortage of publicly available Full Field Digital Mammographic (FFDM) datasets. The proposed CAD system begins with an augmentation step that differs from the commonly known augmentation methods: it replaces the small number of FFDMs with a large number of small cropped slices representing the full mammograms. Moreover, the proposed cropping idea preserves the high-resolution property of the mammograms by cropping each full mammogram into small slices that fit the YOLO model, in an attempt to avoid the data loss caused by resizing the full mammograms. The overlapped slices also catch some lesions that are missed in full-mammogram detection. In addition, the cropping path acts as a third reader, besides the full-mammogram detection decision and the radiologists' decisions, to confirm the localized lesions.
To classify the detected tumors in the best manner, most of the feature extraction classifiers used in recently published work are applied and compared against YOLO, in order to identify the classification model we can best depend on for obtaining the pathology type of the localized lesions.
The remainder of the paper is structured as follows. Section II presents related work on CADs developed for breast cancer detection and classification. Section III describes the architecture of the proposed approach, with details of the two-path detection of abnormal lesions in the breast and the classification phase for the localized masses. Section IV presents all the conducted experiments to show the performance of the proposed method and compare it with other recent state-of-the-art work in breast cancer localization. Finally, Section V concludes the work.

II. RELATED WORK
Recently, obvious progress has been achieved in deep learning, especially in combining various image processing techniques with biological concepts to resolve critical issues such as breast cancer detection. Numerous valuable attempts have been made to obtain a successful computer-aided detection model that localizes existing lesions quickly and accurately [19], [20].
Doi [21] demonstrated the critical need for developing different computer-aided detection systems and their impact on the early diagnosis of breast cancer, which has made this one of the major research fields in diagnostic radiology and medical imaging. In addition, Jiang et al. [22] and Chan et al. [23] confirmed that CADs can improve the performance of mass detection by radiologists, which presents the CAD as a second eye for doctors checking for breast cancer.
Jiang et al. integrated a new dataset of breast mammograms named Film Mammography dataset number 3 (BCDR-F03). They then classified the segmented tumors in breast mammograms using GoogLeNet and AlexNet, achieving an Area Under Curve (AUC) of 0.88 for GoogLeNet and 0.83 for AlexNet [24]. Lotter et al. proposed a scanning model based on a CNN patch-based classifier to classify mammograms, achieving an AUC of 0.92 [25]. [28]. Rodriguez-Ruiz et al. showed experimentally that the breast cancer detection performance of their proposed stand-alone Artificial Intelligence (AI) model is very close to the radiologists' evaluation. Their system obtained an AUC of 0.840, while the human (radiologist) evaluation reached an average AUC of 0.814 [29]. Moreover, other CAD schemes covered in recent surveys [30], [31] showed the great impact of AI on the early detection of breast cancer.

III. PROPOSED METHOD
In this section, we discuss the phases of the proposed architecture for detecting and classifying lesions in mammograms, together with the evaluation method applied to check for cancerous tumors in new mammograms. As shown in Fig. 1, which illustrates the overall architecture of the YOLO-V4 based detection model, the proposed architecture is composed of three phases. The first is the preprocessing phase, which prepares the Digital Imaging and Communications in Medicine (DICOM) mammograms in the required format without any extra artifacts. The second is the detection phase, which configures YOLO-V4 to localize masses in the abnormal mammograms through two detection paths. The third is the classification phase, which replaces the YOLO-V4 classification layers with other feature extractors to distinguish between benign and malignant detected lesions.

A. PHASE 1: PREPROCESSING PHASE
Mammograms are usually stored in the Digital Imaging and Communications in Medicine (DICOM) format, with dimensions larger than those that fit any deep learning method. In addition, their ground truth annotations are not stored in a directly usable format. In this phase, therefore, the preprocessing steps shown in Fig. 2 are applied to prepare the mammographic datasets in the format required for checking masses.
Preprocessing is an essential phase for both types of mammograms, scanned mammograms and FFDMs. As is clear from Fig. 3, the FFDM has better quality than the scanned mammogram since it is generated directly by the scanner in digital format, which avoids the noise introduced by the scanning process that converts a film mammogram to a scanned one.
Scanned mammograms therefore need the preprocessing steps more than FFDMs do. CADs obtain better results when applied to full-field digital mammograms than to scanned film ones, but in general more accurate results are obtained when the preprocessing steps are applied to either type. Even FFDMs sometimes have labels at the top, and even when the mammogram is free of extra artifacts, some noise appears in the FFDM because of dust on the breast screening tool itself or unnecessary motion.
The first step, as shown in Fig. 2, prepares the mammograms and their ground truth in the required format. To begin, all DICOM files are converted into readable images in the Tagged Image File Format (TIFF) and Portable Network Graphics (PNG) formats. Since mammogram pixel values usually have a contrast resolution of 14 to 16 bits, the TIFF format is used in our work to preserve the original pixel values, which in most datasets span more than 256 levels. However, TIFF consumes a large amount of space to store the mammograms, so PNG is also used, with the 14-bit or 16-bit resolution scaled down to 8 bits, in order to compare its behavior against that of the TIFF mammograms. Then, as shown in Fig. 4(a), any artifacts such as extra labels indicating the date, the mammogram view, etc. are removed by extracting the largest component (the breast) to obtain a noise-free mammogram like the one shown in subfigure (b).
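The bit-depth reduction used for the PNG copies can be illustrated with a short sketch. The paper does not state how the scaling is performed, so the linear mapping over the full input range, the rounding rule, and the function name `scale_to_8bit` are assumptions made only for illustration.

```python
def scale_to_8bit(pixels, bit_depth=14):
    """Linearly map integer pixel values from [0, 2**bit_depth - 1] to [0, 255].

    pixels is a 2D list of rows. The linear mapping is an assumption;
    the original work does not specify the scaling method.
    """
    max_in = (1 << bit_depth) - 1
    # Adding max_in // 2 before the integer division rounds to nearest.
    return [[(v * 255 + max_in // 2) // max_in for v in row] for row in pixels]


# A single-row 14-bit "image" with the darkest, a middle, and the brightest value.
print(scale_to_8bit([[0, 8191, 16383]], bit_depth=14))
```

Because many 14-bit or 16-bit levels collapse onto each 8-bit level, this conversion is lossy, which is why the TIFF copies are kept for comparison.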
Moreover, some mammograms contain a large black area, as shown in Fig. 4(c), which is obtained from (b). To reduce this useless space and leave only the breast, and in an attempt to shrink the mammogram naturally without losing any data, continuous black regions are removed so that the original size is reduced to fit the breast only; the lesion coordinates are then updated relative to the new mammogram. All coordinates are extracted from the XML files attached to each case and then normalized to [0, 1] so that they fit any size to which the original mammogram is resized. The full preprocessing steps are shown in Fig. 2.
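The black-margin removal and the coordinate update described above can be sketched as follows. This is a minimal illustration under stated assumptions: the image is a 2D list of rows, "black" means a pixel value not above a hypothetical `threshold`, the crop is the bounding rectangle of all non-black pixels, and the box is given in pixels as (x_min, y_min, x_max, y_max). None of these details are taken from the paper.

```python
def crop_to_breast(image, box, threshold=0):
    """Crop away all-black border rows/columns and re-express a lesion box.

    Returns the cropped image and the box normalized to [0, 1] relative to
    the cropped image, so it stays valid after any later resizing.
    """
    rows = [i for i, row in enumerate(image) if any(v > threshold for v in row)]
    cols = [j for j in range(len(image[0]))
            if any(row[j] > threshold for row in image)]
    if not rows or not cols:
        raise ValueError("image contains no non-black pixels")

    top, bottom = rows[0], rows[-1] + 1
    left, right = cols[0], cols[-1] + 1
    cropped = [row[left:right] for row in image[top:bottom]]

    height, width = bottom - top, right - left
    x_min, y_min, x_max, y_max = box
    # Shift the box into the cropped frame, then normalize by the new size.
    norm_box = ((x_min - left) / width, (y_min - top) / height,
                (x_max - left) / width, (y_max - top) / height)
    return cropped, norm_box


image = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 5, 7, 9, 4],
    [0, 0, 6, 8, 3, 2],
    [0, 0, 0, 0, 0, 0],
]
cropped, norm_box = crop_to_breast(image, box=(3, 1, 5, 3))
print(cropped, norm_box)
```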

B. PHASE 2: MASSES DETECTION PHASE
Most recent work on breast cancer detection resizes the full mammograms, because of their large size, to small dimensions that fit one of the data-hungry deep learning methods [32]-[37]. Resizing reduces image quality and loses some significant information that may characterize existing masses, especially when the resizing ratio is large. For this reason, this paper introduces two paths for detecting tumors: full mammogram detection and cropped slices detection. YOLO has four versions, and its latest version (YOLO-V4) is the one used here to learn how to localize tumors in mammograms. The first three versions were previously compared on the INbreast dataset in [37], and based on that comparative analysis, YOLO-V4 is expected to perform better, as the experimental results will show. YOLO-V4 has therefore been configured for both detection paths as follows:

1) PATH 1 OF FULL MAMMOGRAMS DETECTION (FMD-PATH): MASSES LOCALIZATION IN FULL MAMMOGRAMS
As shown in Fig. 5, two strategies are applied and compared, and the better one is used to continue the localization process.
In the first strategy, the mammograms are labeled as benign, malignant, and normal, i.e., 2 classes (benign and malignant) besides the negative samples of the normal mammograms. In the second strategy, the mammograms are labeled as normal and abnormal, i.e., 1 class (mass, whatever its pathology type) besides the normal ones. In both strategies, the dataset is divided into 80% for training and 20% for testing.

2) PATH 2 OF CROPPED MAMMOGRAMS DETECTION (CMD-PATH)
The idea behind this path is to analyze a given case further to find any masses missed by detection path 1. The second detection path includes three main steps. First, each large full mammogram is cropped into small slices. Second, these small slices are prepared as input to YOLO-V4, both for training, so that the model can check for lesions in newly taken mammograms, and for evaluating the detection model. Detection is performed on these slices as if they were new small mammograms that either contain lesions (abnormal) or do not (normal). Third, each mammogram's slices are post-processed by merging them together to obtain the bounding boxes of the existing masses on the concatenated full mammogram. In this path, either a mass is found in the abnormal cases without knowing its malignancy type, or nothing is detected in the case of normal mammograms. The full cropping algorithm is detailed in Fig. 6.
This phase begins with a different kind of data augmentation: each full mammogram is cropped into several small slices with a specific overlapping ratio. The idea behind the cropping step is not only to replace full-mammogram resizing with small focused slices but also to overcome the small number of existing FFDM samples by creating new samples from the slices of each mammogram, which yields larger training and testing datasets through an unconventional data augmentation method.
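The cropping and merging steps of this path can be sketched as follows. This is a minimal illustration and not the algorithm of Fig. 6: the 50% overlap ratio, the function names, and the rule of placing the last slice flush with the image border are assumptions, and boxes are taken to be pixel tuples (x_min, y_min, x_max, y_max).

```python
def slice_origins(length, slice_size, overlap):
    """Start offsets along one axis so that overlapped slices cover the axis."""
    if length <= slice_size:
        return [0]
    stride = max(1, int(slice_size * (1 - overlap)))
    origins = list(range(0, length - slice_size, stride))
    origins.append(length - slice_size)  # last slice ends exactly at the border
    return origins


def slice_grid(width, height, slice_size=1024, overlap=0.5):
    """Top-left (x, y) corner of every slice of a width x height mammogram."""
    return [(x, y)
            for y in slice_origins(height, slice_size, overlap)
            for x in slice_origins(width, slice_size, overlap)]


def to_full_coords(box, origin):
    """Map a box detected inside a slice back onto the full mammogram."""
    x0, y0 = origin
    x_min, y_min, x_max, y_max = box
    return (x_min + x0, y_min + y0, x_max + x0, y_max + y0)


grid = slice_grid(2560, 3328)
print(len(grid), grid[:3])
print(to_full_coords((10, 20, 110, 220), origin=(512, 1024)))
```

Because the slices overlap, the same mass may be reported from several neighbouring slices, so a real merging step would also need to reconcile such duplicates after mapping the boxes back.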

3) DETECTION INTERSECTION RULES OF FMD-PATH AND CMD-PATH
Both detection paths are used together to check for any existing lesions in the evaluated mammogram. The common approach to localizing lesions in a mammogram is to resize it to a smaller size and then apply a specific algorithm or model to find masses in abnormal cases. In this paper, this approach is applied through the FMD-Path, but because lesions are missed in some cases, the CMD-Path is proposed; it enhances detection performance since it finds most of the lesions detected by the FMD-Path in addition to others that the FMD-Path misses. Moreover, CADs have so far acted as a second reader besides the radiologists' decisions, so the proposed method introduces a third supporter of the final decision, the CMD-Path, whose ability to act as a third reader for a specific mammographic case is shown by the experimental results. Based on the obtained results, the FMD-Path is usually the main reference for our decision since it looks at the mammogram in one shot, with all the properties and features it contains. The CMD-Path becomes necessary for dense mammograms or for mammograms whose tumors are not obvious because their features and intensities are similar to those of the breast tissue. The results of both detection paths are therefore combined as follows to obtain the suspected areas in mammograms:
1) FMD-Path: The full mammogram is given as input to our CAD to get the ''FMD-Path Detections Results'' using the best weights obtained from experiment 2.
2) CMD-Path: The full mammogram is first preprocessed and cropped into slices of size 1024 × 1024; the slices are then input to the CAD system for lesion localization using the best weights to get the ''CMD-Path Detections Results''; then the confidence score (CS) of the obtained masses is checked to keep only those with CS ≥ 0.5.
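One possible way to combine the two result sets is sketched below. Only the CS ≥ 0.5 rule comes from the text; applying it to both paths, treating the FMD-Path detections as the reference, and discarding a CMD-Path box when it overlaps an already kept box with IOU of at least a hypothetical `same_lesion_iou` are assumptions made for illustration, not the authors' intersection rules.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def combine_paths(fmd, cmd, min_score=0.5, same_lesion_iou=0.5):
    """Merge detections given as lists of (box, confidence_score).

    FMD-Path detections are kept as the main reference; CMD-Path detections
    are added only when they do not duplicate an already kept box.
    """
    kept = [d for d in fmd if d[1] >= min_score]
    for box, score in cmd:
        if score < min_score:
            continue  # low-confidence slice detection is dropped
        if all(iou(box, k[0]) < same_lesion_iou for k in kept):
            kept.append((box, score))
    return kept


fmd = [((100, 100, 200, 200), 0.9)]
cmd = [((110, 110, 210, 210), 0.8),   # same lesion as the FMD-Path box
       ((500, 500, 600, 600), 0.7),   # lesion found only by the CMD-Path
       ((700, 700, 800, 800), 0.3)]   # below the confidence threshold
print(combine_paths(fmd, cmd))
```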

C. PHASE 3: MASSES CLASSIFICATION PHASE
In both the second strategy of the first detection path (FMD-Path) and the second detection path (CMD-Path), the output for abnormal mammograms is the bounding box of the tumor, without differentiating whether it is benign or malignant. To propose a complete detection and classification model, and to be able to compare the performance of the proposed work with others, the detected tumors must be classified as benign or malignant. The classification phase shown in Fig. 7 therefore classifies the detected masses obtained by applying either strategy 2 of the first detection path or the second path. In this phase, the coordinate vector representing the bounding box of each mass is used to crop the mass area into a separate image. Then, YOLO-V4's role of classifying objects after localizing them is taken over by other feature extractors. Multiple classifiers are used to distinguish between benign and malignant masses in order to determine which achieves the best classification result.
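The cropping of a detected mass into a separate image can be sketched as below. The sketch assumes that the bounding box is expressed in the normalized YOLO label convention (center x, center y, width, height, all in [0, 1]) and that the image is a 2D list of rows; the function name `crop_mass` is hypothetical.

```python
def crop_mass(image, yolo_box):
    """Cut the region of a normalized (cx, cy, w, h) box out of an image."""
    height, width = len(image), len(image[0])
    cx, cy, w, h = yolo_box
    # Convert the normalized box to pixel edges and clamp them to the image.
    left = max(0, int(round((cx - w / 2) * width)))
    top = max(0, int(round((cy - h / 2) * height)))
    right = min(width, int(round((cx + w / 2) * width)))
    bottom = min(height, int(round((cy + h / 2) * height)))
    return [row[left:right] for row in image[top:bottom]]


# 4 x 8 toy image whose pixel value encodes its position (row * 8 + column).
image = [[r * 8 + c for c in range(8)] for r in range(4)]
print(crop_mass(image, (0.5, 0.5, 0.5, 0.5)))
```

The cropped patch would then be resized to the input size expected by whichever classifier is being compared.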

A. EVALUATION METRICS
The proposed model is mainly composed of two phases: mass localization followed by classification of the detected masses. Accordingly, some evaluation metrics are used to evaluate the accuracy of lesion detection and others to evaluate the mass classification performance.

1) DETECTION EVALUATION METRICS
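We used the Intersection Over Union (IOU) metric to evaluate how accurately the model localizes tumors in breast mammograms, as follows:
\[ \mathrm{IOU} = \frac{\text{Area of Overlap}}{\text{Area of Union}} \]
where the overlap and the union are taken between the predicted bounding box and the ground truth bounding box. In our work, a mass is considered correctly detected if the IOU is greater than or equal to 0.5; otherwise, the detection is neglected.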
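The IOU criterion can be illustrated with a small sketch. Boxes are assumed to be axis-aligned pixel tuples (x_min, y_min, x_max, y_max), and the function names are chosen here for illustration only.

```python
def iou(pred_box, truth_box):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    px1, py1, px2, py2 = pred_box
    tx1, ty1, tx2, ty2 = truth_box
    # Width and height of the overlap rectangle (zero when boxes are disjoint).
    inter_w = max(0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    return inter / union if union > 0 else 0.0


def is_correct_detection(pred_box, truth_box, threshold=0.5):
    """A detection counts as correct when IOU >= threshold (0.5 in this work)."""
    return iou(pred_box, truth_box) >= threshold


truth = (0, 0, 10, 10)
print(iou((5, 0, 15, 10), truth))                   # half-shifted box
print(is_correct_detection((0, 0, 10, 5), truth))   # box covering half the truth
```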

2) CLASSIFICATION EVALUATION METRICS
Each localized mass is passed through a specific classifier that outputs its malignancy type, which is then compared against the given ground truth to get one of four outcomes. The first is a benign mass classified as benign (True Negative), the second is a benign mass classified as malignant (False Positive), the third is a malignant mass classified as malignant (True Positive), and the last is a malignant mass classified as benign (False Negative). These four outcomes are the main components of the confusion matrix, one of the significant tools for evaluating classification accuracy. We measured True Negative (TN), True Positive (TP), False Negative (FN), and False Positive (FP) and used them to calculate the following:
• Precision: \( \mathrm{Precision} = \frac{TP}{TP + FP} \)
• Overall accuracy: \( \mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \)
• Mean Average Precision (mAP): The average precision (AP) is the area under the precision-recall curve shown in Fig. 8, where \( \mathrm{Recall} = \frac{TP}{TP + FN} \). The mAP is the mean of the AP values computed for the benign and the malignant classes.
3328×4084 or 2560×3328. The full mammograms assigned to the testing set are 22 normal and 22 abnormal mammograms, which represent 20% of the overall samples. The abnormal cases include 24 lesions since some cases contain more than one mass. INbreast is considered the only public dataset that contains full-field digital mammograms. It is also characterized by accurate lesion annotations that represent the masses' ground truth, since they were validated and confirmed by two specialists. There are other public mammographic datasets; the most popular and commonly used are the Mammographic Image Analysis Society Digital Mammogram Database (MIAS), the Digital Database for Screening Mammography (DDSM), and the Curated Breast Imaging Subset of DDSM (CBIS-DDSM).
MIAS is the oldest dataset; it contains digitized mammograms of lower quality because of the scanning process and the digitizing tools used to prepare them, as is evident in Fig. 9(a).
The DDSM dataset has a large set of mammograms; however, as can be seen from the given ground truth and as confirmed in [40], the lesion annotations are not accurate, as shown in Fig. 9(b). Even if the DDSM is used in the training process, it should not be used to evaluate detection and classification because of its low precision. The CBIS-DDSM is a modified and updated version of the DDSM and is currently considered the most recent mammographic dataset with correct ground truth annotations for scanned mammograms, as in the sample given in Fig. 9(c). Fig. 9 shows a sample from each dataset; as shown, the FFDM mammogram of INbreast in the last subfigure (d) has the best quality and the most accurate ground truth, which fits the actual mass, because no scanning or digitizing process is involved.
Since the CBIS-DDSM is considered the most accurate dataset of scanned mammograms, we conducted some experiments to compare the performance of the introduced CAD system on it against the INbreast FFDM mammograms. For example, when the FMD-Path using strategy 2 is applied to the CBIS-DDSM scanned mammograms, it achieves an mAP of 64.67%, compared with 97.86% for the FFDM mammograms using the same settings and parameters. After examining the failed cases, we found many FP cancerous cases. In most cases, this is due to the lower quality of the mammograms caused by digitizing the scanned version, compared with the FFDM, as shown by the examples in Fig. 9. Also, in most FP scanned mammograms, the pixel values are concentrated at very large values, which results in falsely localized lesions. These are the main reasons that led us to complete our experiments on FFDMs; another essential reason is that labs and hospitals nowadays depend on FFDMs rather than scanned mammograms for diagnosis, thanks to the great development in medical equipment.

C. IMPLEMENTATION DETAILS
We implemented the mass classification phase by configuring some of the popular and commonly used classification models using Keras with TensorFlow as a backend. To preprocess INbreast and extract the ground truth annotations, and to post-process the mammogram slices to get the full detections, we used MATLAB and Python 3.7 as our development environments. In addition, we used C++ to update and compile YOLO on the Ubuntu 14.04 operating system. The proposed YOLO-V4 based CAD system is implemented on an Intel Core (TM) i7-9700K desktop processor (8 cores, up to 4.9 GHz Turbo, 300 series) with 16 GB RAM and a GIGABYTE GeForce RTX 2080 Ti overclocked 11G graphics card.
YOLO-V4 has some essential parameters that should be changed depending on the problem and the type of dataset used. These parameters were set to specific values based on several experiments carried out with various values in an attempt to understand how they affect the training process and the detection performance. Accordingly, one set of parameters has fixed values in all experiments, since their significance for lesion detection has been established, while others take more than one value. The parameters with fixed values are:
• Training iterations number = 4000. It is mentioned in [41], and confirmed by our results as well, that 2000 iterations are sufficient to train one class. Since the maximum number of classes we have is 2, the number of iterations is set to 4000.
• Learning Rate (LR): The values 0.001 and 0.0001 were tried, and the experimental results show that starting with 0.001 is better than starting with 0.0001.
The parameters that vary with each experiment are:
• The number of classes (C): It is either two, i.e., benign and malignant versus normal mammograms, or one, i.e., abnormal versus normal mammograms.
• The model input size: The network size can be set to any multiple of 32, such as 448 × 448, 608 × 608, or 832 × 832. As the input mammogram size increases, the training process improves and the detection accuracy is enhanced. However, as the input mammogram size increases, the number of subdivisions within a batch also increases, and consequently the training process becomes slower. In the main experiments, we therefore usually resized the mammograms to 448 × 448 to run fast experiments, and the experiment that obtained the best results was then repeated with its mammograms resized to larger dimensions.
• Steps: This parameter specifies the iterations at which the LR value changes automatically during training to avoid overfitting. For example, if it is set to 4000, 5000, the LR changes for the first time at iteration 4000 and changes again at iteration 5000. A large LR allows the model to learn the general features of the localized masses in the training set faster. However, as the neural network sees more data, the LR needs to decrease over time so that the weights change less aggressively.
• Scales: The factors applied at each LR change, e.g., 0.1 and then 0.1. This means that the LR is multiplied by 0.1 at iteration 4000 and again by 0.1 at iteration 5000.
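The interaction between the steps and scales parameters, and the iteration rule of 2000 per class, can be sketched as follows. This is an illustration of the schedule as described in the list above, not the training framework's own code; the function names are hypothetical and any warm-up period is ignored.

```python
def max_iterations(num_classes, per_class=2000):
    """Training length rule described in the text: 2000 iterations per class."""
    return per_class * num_classes


def learning_rate(iteration, base_lr=0.001, steps=(4000, 5000), scales=(0.1, 0.1)):
    """LR at a given iteration under a step policy.

    Each time training reaches one of the `steps`, the current LR is
    multiplied by the matching entry of `scales`.
    """
    lr = base_lr
    for step, scale in zip(steps, scales):
        if iteration >= step:
            lr *= scale
    return lr


print(max_iterations(2))
for it in (0, 3999, 4000, 5000):
    print(it, learning_rate(it))
```

A scale larger than 1, such as the 10.0 entry that appears in one of the scale options used later, raises the LR again at the corresponding step.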

D. EXPERIMENTS
1) EXPERIMENT 1: FMD-PATH USING STRATEGY 1
In this experiment, all mammograms were resized to smaller sizes to fit YOLO. Two trials are conducted in this experiment to obtain the results in Table. The first full detection trial (F1) excludes normal mammograms from the training set and the evaluation test, i.e., it uses only abnormal cases with benign and malignant lesions, while the second full detection trial (F2) includes normal mammograms.
In F2, the model is not able to distinguish between the normal cases and the abnormal cases, which results in missing more biopsies than in F1, i.e., some abnormal cases are treated as normal because of the similarity between some benign cases and the normal ones.

2) EXPERIMENT 2: FMD-PATH USING STRATEGY 2
The goal of this experiment is to evaluate the proposed YOLO-V4 detection model when all lesions are grouped into only one class (abnormal case) versus the normal (mass-free) cases. All masses, whatever their types, are therefore annotated as abnormal cases. The training and testing sets are balanced here by adding the same number of normal mammograms as abnormal ones. In all trials, YOLO-V4 is configured using these values:
• The number of classes (C): 1 (Mass) with negative samples representing the normal cases.
• Scales: the SC1 option (0.1, 0.1, 10.0, 0.1), which is used with the S1 steps, and the SC2 option (0.1, 0.1), which is used with the S2 steps.
As shown in Table 2, full detection trial 4 (F4) is much better than full detection trial 3 (F3), and this is due to the way we changed the learning rate. Since the model was pre-trained on images of general categories, it has no knowledge of breast cancer and its features. For this reason, the learning rate should not be decreased or changed in the early steps of training, since we start with zero information; the learning rate needs to be high at first and then be decreased at later steps, as achieved by the S2 steps with the SC2 scales.
On the other hand, when we compared the trials executed to detect lesions using strategy 1 (F1 & F2) with the trials using strategy 2 (F3 & F4), we found that specializing the detector to localize cancer in the breast mammogram, whatever its pathology type, is better by nearly 12% than training it to localize benign and malignant lesions. So, by applying strategy 2 to train YOLO-V4 with its CSPDarknet53 backbone, 97.8% of the lesions in the testing set are detected with an IOU greater than or equal to 0.5.
The best results among all the conducted trials are obtained from trial F4, which uses full mammograms and two values for the LR scale at late steps, such that 22 lesions out of 24 in 22 mammograms are localized completely correctly, with a mAP of 97.8%.
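The IOU criterion used above can be sketched as follows. This is a minimal illustration, assuming boxes are given as (x1, y1, x2, y2) corner coordinates; the helper names are hypothetical and not taken from the paper's implementation.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_detected(ground_truth, predictions, threshold=0.5):
    """A lesion counts as detected if any predicted box reaches the IOU threshold."""
    return any(iou(ground_truth, p) >= threshold for p in predictions)
```

A ground-truth lesion is thus counted as localized only when at least one predicted box overlaps it with an IOU of 0.5 or more.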

3) EXPERIMENT 3: CMD-PATH
The objective of this experiment is to evaluate the effect of the mammogram cropping idea on mass detection performance in the full mammograms. In this experiment, the slices with masses are treated as abnormal cases, acting like a zoomed area of the mammogram, and the slices without masses are treated as normal cases. However, since the mass region is very small relative to the whole mammogram, the number of slices with masses (abnormal area) is very small relative to the number of slices without masses (normal area). As shown in Table 3, this phase plays an important role in supporting the proposed method with a different kind of data augmentation that increases the number of samples in either the training or the testing set by slicing each mammogram into a different number of overlapped slices. Moreover, to preserve data balance during training, we randomly select slices without masses and slices of the normal mammograms in the same number as the slices with masses, since the slices with masses form the smallest group. For example, in the case of 224 × 224, we randomly selected 1840 samples from the set of slices without masses generated from the abnormal cases and 1840 samples from the cropped slices obtained from the normal cases, in addition to the 1840 slices with masses. Regarding the testing set, in real scenarios we do not know which slice contains a mass, so all slices must be checked. Hence, in the evaluation phase, we took all the cropped slices of the mammograms, from both the normal and the abnormal mammograms.
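The slicing and balancing steps described above can be sketched as follows. This is an illustrative sketch only: the stride (and hence the overlap), the handling of the image border, and the function names are assumptions, not the paper's implementation.

```python
import random


def slice_origins(height, width, size, stride):
    """Top-left (y, x) corners of size x size slices taken every `stride` pixels.
    The last row/column is shifted back so the border is covered (an assumption)."""
    def axis(length):
        if length <= size:
            return [0]
        starts = list(range(0, length - size + 1, stride))
        if starts[-1] != length - size:
            starts.append(length - size)
        return starts
    return [(y, x) for y in axis(height) for x in axis(width)]


def balanced_training_set(mass_slices, no_mass_slices, normal_slices, seed=0):
    """Keep all slices with masses and draw the same number at random from the
    slices without masses and from the slices of normal mammograms.
    Assumes the mass group is the smallest one, as stated in the text."""
    rng = random.Random(seed)
    n = len(mass_slices)
    return (list(mass_slices)
            + rng.sample(no_mass_slices, n)
            + rng.sample(normal_slices, n))
```

For the 224 × 224 case in the text, such a selection would yield 3 × 1840 training slices, while the testing set keeps every slice.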
In this experiment, the same 80% and 20% of the mammograms, representing the training set and the testing set respectively, are cropped into small slices using different values for the following parameters:
• Neglection ratio: a slice is excluded from the training and the evaluation sets if either the mass is not fully included in the slice or the ratio of the non-zero pixels to the overall slice area is smaller than 25%.
• Scales: 0.1 & 0.1.
As proved by the results in Table 4 and shown in the example in Fig. 10, as the slice size increases, the number of normal cases detected as positive (normal FP) decreases. Also, the number of abnormal cases correctly detected increases with larger slice sizes.
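The neglection rule listed among the parameters above can be sketched as a simple filter. This is a minimal sketch that assumes a slice is a 2-D list of pixel intensities and that boxes are (x1, y1, x2, y2) tuples in mammogram coordinates; the function names are hypothetical.

```python
def nonzero_ratio(slice_pixels):
    """Fraction of pixels in the slice whose intensity is not zero."""
    total = sum(len(row) for row in slice_pixels)
    nonzero = sum(1 for row in slice_pixels for value in row if value != 0)
    return nonzero / total if total else 0.0


def box_relation(slice_box, mass_box):
    """Return 'inside', 'outside' or 'partial' for a mass box relative to a slice."""
    sx1, sy1, sx2, sy2 = slice_box
    mx1, my1, mx2, my2 = mass_box
    if mx1 >= sx1 and my1 >= sy1 and mx2 <= sx2 and my2 <= sy2:
        return "inside"
    if mx2 <= sx1 or mx1 >= sx2 or my2 <= sy1 or my1 >= sy2:
        return "outside"
    return "partial"


def keep_slice(slice_pixels, slice_box, mass_boxes, min_ratio=0.25):
    """Drop slices that are mostly background or that cut through a mass."""
    if nonzero_ratio(slice_pixels) < min_ratio:
        return False
    return all(box_relation(slice_box, m) != "partial" for m in mass_boxes)
```

Under this reading, a slice that contains only part of a mass is discarded rather than labeled as either class.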
The interpretation of these results is that as the slice size increases, almost the whole bounding box covering the mass appears in the cropped slice, i.e. it is completely included. So, the model is trained more accurately on nearly complete features of the mass, such as its full shape, color, and size. Consequently, the TP of the abnormal cases increases, and the number of normal cases detected as abnormal decreases as well. On the other hand, when the slice size decreases, the number of normal regions detected as positive (FP) increases because, in most cases, the mass is divided over more than one slice. Consequently, the model is trained on the colors more than on the full shape and characteristics of the mass. Hence, with small slices, the FP increases since any bright region will be predicted as a mass whether or not it is actually a tumor region.

4) EXPERIMENT 4: FMD-PATH USING STRATEGY 2 SUPPORTED WITH CMD-PATH
In this experiment, the two detection paths evaluated in experiments 2 and 3 are combined using the specified set of detection rules to check for any existing lesions in the same evaluated mammogram. As shown in Fig. 11, applying both paths together on the same cases using the combination detection rules helps obtain a higher number of true abnormal mammograms (TP mammograms) and a lower number of both missed abnormal cases (FN mammograms) and false abnormal regions (FP mammograms). The output of some test cases is shown in Fig. 12.
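One way such a combination might look in code is sketched below. The combination detection rules themselves are not reproduced in this section, so the rule used here (keep every box from either path and flag the boxes on which both paths agree) is purely a hypothetical stand-in. The sketch also assumes that FMD-Path boxes have already been rescaled to the original mammogram coordinates and that each slice's (y, x) origin is known.

```python
def to_full_coords(box, origin):
    """Shift a slice-local (x1, y1, x2, y2) box by the slice's (y, x) origin."""
    oy, ox = origin
    x1, y1, x2, y2 = box
    return (x1 + ox, y1 + oy, x2 + ox, y2 + oy)


def overlaps(a, b):
    """True if two (x1, y1, x2, y2) boxes share a non-empty area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def merge_boxes(boxes):
    """Repeatedly replace any two overlapping boxes by their enclosing box."""
    merged = list(boxes)
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if overlaps(merged[i], merged[j]):
                    a, b = merged[i], merged.pop(j)
                    merged[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                 max(a[2], b[2]), max(a[3], b[3]))
                    changed = True
                    break
            if changed:
                break
    return merged


def combine_paths(fmd_boxes, cmd_boxes):
    """Hypothetical rule: return (box, confirmed_by_both_paths) pairs."""
    results = [(f, any(overlaps(f, c) for c in cmd_boxes)) for f in fmd_boxes]
    for c in cmd_boxes:
        if not any(overlaps(c, f) for f in fmd_boxes):
            results.append((c, False))
    return results
```

In this sketch, a box found only by the CMD-Path is still reported, which mirrors the stated aim of recovering masses missed by the FMD-Path.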

5) EXPERIMENT 5: MASSES CLASSIFICATION PHASE
In this experiment, the feature extractor layers of YOLO are replaced with other feature extractors to distinguish between benign and malignant masses. Each pretrained classifier is fine-tuned on the masses of the training mammograms and then evaluated on the 22 localized lesions extracted from experiment 4, as shown in Table 5.
In the case of YOLO-V4, every sample of the validation set must first be detected and then classified. So, due to the small number of available samples, we used the testing set as the validation set so that training could use more samples. For the remaining classifiers, we were able to divide the data into training and validation sets, since the training set is already augmented in the detection phase using the cropping idea based on the two-layer detection approach. All masses are then cropped from the training set used in the detection, which allowed us to divide them into 60% and 20% for the training and validation sets respectively, leaving the testing set as it is to provide a fair comparison between the used classifiers and the results obtained from YOLO-V4.
In experiment 1 of the FMD-Path using strategy 1, in which the detection and the classification tasks are merged, a larger number of FNs is obtained than in experiment 2 of the FMD-Path using strategy 2, which is based on two separate detection and classification phases. Experiment 1 of the FMD-Path using strategy 1 achieves 85.5% detection accuracy and 90.0% accuracy for classifying the correctly detected lesions using the merged detection and classification phases, while experiment 2 of the FMD-Path using strategy 2 achieves 97.8% detection accuracy, and the detected lesions are classified later with 91.0% accuracy when YOLO's classification layers are replaced by other classifiers. The primary task in breast cancer screening is to correctly detect any cancerous cases that exist. So, we completed the classification process on the localized lesions resulting from experiment 2 of the FMD-Path using strategy 2 because it obtains a larger number of TPs and a smaller number of FNs than merging both processes. The main reason is that as the number of correctly detected lesions (abnormal cases) and the number of correctly identified normal cases increase, cancer detection at earlier stages improves. Consequently, the number of correctly detected cancerous cases increases, which leads to earlier treatment and reduces the death rate as well. Moreover, a larger number of TP and confirmed localized cases will be classified to obtain their pathology type.
As shown in Table 5, Inception V3 is the classifier that achieves results close to or better than those obtained from YOLO-V4 when it is used to distinguish benign from malignant lesions. This is due to the updates introduced in Inception V3, such as the label smoothing added to the loss function to avoid the overfitting caused when the network becomes too confident about a specific class. Also, the batch normalization added to the auxiliary classifiers of Inception V3 enhances the feature learning process, resulting in better overall accuracy compared with the other classifiers. The training and validation accuracy of each classifier is shown in Fig. 13.
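The label smoothing mentioned above can be illustrated with a short sketch. It follows the usual formulation in which the one-hot target is mixed with a uniform distribution over the classes; the smoothing factor of 0.1 is the value reported for the original Inception V3 training, and whether the same value was used for the benign/malignant fine-tuning here is an assumption.

```python
import math


def smooth_labels(true_class, num_classes, eps=0.1):
    """Mix the one-hot target with a uniform distribution over the classes."""
    uniform = eps / num_classes
    return [(1.0 - eps) + uniform if k == true_class else uniform
            for k in range(num_classes)]


def cross_entropy(target, logits):
    """Cross-entropy between a target distribution and softmax(logits)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return -sum(t * (z - log_z) for t, z in zip(target, logits))


if __name__ == "__main__":
    logits = [-10.0, 10.0]                      # a very confident prediction
    print(cross_entropy([0.0, 1.0], logits))    # hard target
    print(cross_entropy(smooth_labels(1, 2), logits))  # smoothed target
```

With smoothed targets, an extremely confident prediction is still penalized, which discourages the over-confidence described in the paragraph above.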

E. COMPARATIVE ANALYSIS
The proposed approach is compared against other recently proposed detection and classification methods in Table 6.
After the comparative analysis and the conducted experiments, the proposed system proved its effectiveness against the others, achieving high accuracy at high speed when checking for breast cancer. The proposed model contributes the following:
1) An automated, full computer-aided detection and classification model that succeeds in localizing lesions in abnormal mammograms with an accuracy reaching 98%.
2) The proposed model is trained on normal and abnormal mammograms such that the normal cases are detected as normal and the abnormal cases are detected with bounding boxes representing the lesions that may exist in the breast.
3) Replacing YOLO-V4's feature extraction layers with other feature extraction classifiers, where the best result for distinguishing between benign and malignant tumors, nearly 95%, is achieved by the Inception V3 model.
4) Proposing a two-path detection system, consisting of the FMD-Path and the CMD-Path, which results in a robust model since it succeeds in reducing both the false positive and the false negative decisions taken by radiologists by acting as a second reader. Moreover, with the proposed 2-paths approach, it acts as both a second and a third mammogram reader after the radiologists to confirm any existing lesions.
5) The CMD-Path succeeds in finding masses missed by the FMD-Path, especially the small-sized masses, because this path is applied on the mammograms while preserving their original features and resolution. It serves as a confirming radiologist that checks any lesions localized by the FMD-Path using the combination detection rules.
6) The CMD-Path is used here as a novel augmentation method that increases the number of either the training or the testing samples, and it can be used in any other proposed model instead of ordinary augmentation methods such as rotation, flipping, etc. It augments the input data by cropping the original image into small overlapped slices, which brings two advantages: increasing the number of training samples and creating new meaningful samples from the existing ones.
7) The proposed 2-paths detection model is a real-time detection and classification system that checks for any existing masses in the screened mammograms by generating the bounding boxes representing tumors and their pathology types in less than 10 seconds.

V. CONCLUSION
In this paper, we have proposed a YOLO-V4 based CAD system that localizes any suspected cancerous areas in the breast and, if any exist, classifies them as benign or malignant with high accuracy. The proposed model overcomes the issue of resizing the breast mammograms into smaller sizes to fit CNNs for cancer localization by proposing a mammogram cropping idea that acts as a third reader. Nowadays, due to the huge number of mammograms taken daily and the necessity of discovering breast cancer early to reduce the death rate, CADs play an essential role as a second reader of the screened mammograms that supports the radiologists' decision. Here, we proposed a 2-paths system that can confirm the detected cases twice without consuming the radiologists' time. The two paths are applied together to avoid any dependency that may slow the diagnosis process. The first path takes the full mammogram after resizing it to a smaller size and checks it for any existing masses. The second path performs the same detection process in a different manner: it crops the full mammogram, at its original size, into small overlapped slices, and each slice is then treated as a separate case in which the model searches for tumors. This path detects lesions in the cropped slices of the full mammogram, and the obtained slices are then merged to get the complete bounding box of the existing cancer. The second path in the proposed architecture is especially successful in localizing the small tumors missed by the first detection path. The proposed model is a real-time detector that can detect and classify any existing masses within a few seconds at most, and it also improves on the detection accuracy of recently proposed CADs by reaching 98% for detection. Moreover, the experiments show that it is better to use YOLO-V4 for detecting only the cancerous areas, whatever their types.
For this reason, the feature extraction role of YOLO-V4 is replaced by several classifiers that are fine-tuned on breast cancer images. Among them, Inception-V3 is selected as the best classifier since it obtains the best results among the other 6 classifiers we used to classify the lesions detected by the combined proposed 2 paths.