A State-of-the-Art Survey on Deep Learning Methods for Detection of Architectural Distortion From Digital Mammography

Breast cancer has risen to be the second leading cause of cancer death among women. Classification of breast tissue as normal, benign, or malignant depends on the presence of abnormalities such as microcalcifications, masses, architectural distortions, and asymmetries. Architectural distortion (AD) is subtle and difficult to detect: it is not associated with a visible mass but shows an abnormal arrangement of tissue strands, often in a radial, spiculated, or random pattern. It is widely rated as the third most common sign of breast cancer and is the most commonly missed abnormality. Most computational approaches to characterizing abnormalities in breast images concentrate on the detection of microcalcifications and masses, with architectural distortion appearing as a secondary finding. The subtle nature and low occurrence of architectural distortion complicate computational approaches to its detection, and as a result little research interest has been recorded in this area. It is widely reported that some recent cases of breast cancer were wrongly diagnosed because architectural distortion was not detected at the early stage of the disease. However, we found that most computational solutions for early detection of breast cancer focus mainly on other abnormalities such as masses and microcalcifications, which are evidence of a more advanced stage of the disease. To emphasise how little effort has been channeled towards the detection of AD compared with other abnormalities, this article reviews such studies from the last decade. To the best of our knowledge, this is the first review that focuses on the detection of AD from mammographic images. Furthermore, this article presents a comprehensive review of approaches, advances, and challenges in computational methods for detecting AD, with the aim of advancing the use of deep learning models for this task.
Moreover, we compare the performance reported by the surveyed articles. Our findings reveal that about 70% of the existing literature adopted Gabor filters, while less than 10% leveraged the state-of-the-art performance achieved in computer vision and deep learning to build computational models for the detection of AD. The study also found that a deep learning approach, such as the convolutional neural network (CNN), can yield a significant increase in performance for the detection of architectural distortion. This assertion is based on results reported in the literature, where a CNN achieved an accuracy of 99.4% compared with 95% for Gabor filter methods.


I. INTRODUCTION
Cancer is the uncontrolled growth and spread of cells. Breast cancer has risen to be the second leading cause of cancer death among women. A fundamental characteristic of all forms of cancer is that the earlier they are detected and treated, the more easily they can be cured. In other words, if cancer is detected early, within a comprehensive cancer control plan, a significant number of cancer patients can be cured or have their lives prolonged significantly. This is because the growth rate of the affected cells can be exponential [1]. The Cancer Health Center (CHC) noted that most cases of cancer are detected and diagnosed after a tumor can be felt or when other symptoms have developed [2]. Breast cancer has the second highest mortality rate in women, next to lung cancer, and is the most common type of cancer in 140 of the 182 countries evaluated [3]. US projections for 2019 indicated that about 268,600 new cases of invasive breast cancer and 62,930 new cases of carcinoma in situ would be diagnosed, and that 41,760 women would die from breast cancer [4]. Although these figures may appear to mirror what is obtainable in most developed economies, research has also shown that almost 50% of breast cancer cases and 58% of deaths occur in less developed countries. The increasing mortality rate arising from breast cancer is mostly due to the lack of early detection of the disease, as the 30-49 and 30-59 age groups account for over 33% and 81% of the incidences, respectively [5], [6], [9].
The associate editor coordinating the review of this manuscript and approving it for publication was Donato Impedovo.
Breast images are acquired through mammography, magnetic resonance imaging (MRI), ultrasound (US), tomosynthesis (3D mammography), xeromammography (no longer in use), and galactography. These images often reveal abnormalities such as architectural distortions, microcalcifications, asymmetries, and solid masses. Some of these imaging modalities serve as a means of breast cancer screening, in an attempt to diagnose the disease before symptoms begin to appear. Mammography is a low-dose x-ray imaging technique widely adopted for screening breast cancer in women. It allows for early detection of breast cancer while it is in the impalpable or preclinical phase. The use of mammography has reduced the rate at which breast cancer is diagnosed at an advanced stage. In addition, early detection has made it possible for treatment to be effective and localized to the region where the tumor is found, which has in turn reduced the mortality rate resulting from breast cancer. Mammography, especially 3D mammography, can detect architectural distortions in breast images effectively.
Architectural distortion (AD) is the third most common appearance of non-palpable breast cancer. It is subtle, with a variable presentation, and has no association with visible masses, but shows an abnormal arrangement of tissue strands, often in a radial, spiculated, or random pattern. It can be caused by benign lesions, such as post-surgical scars and radial scars, as well as by malignant lesions, such as invasive carcinoma. AD associated with invasive carcinoma accounts for an estimated 12%-45% of breast cancers missed in mammography, and it often presents as a secondary finding associated with a primary finding such as a mass or asymmetry [8]. AD is usually discovered in retrospect and accounts for only about 6% of the cancers detected at screening. This is often the basis for its neglect in computational studies on the characterization of abnormalities in breast images. However, the detection of AD is important for ruling out potentially malignant lesions in the breast. Still, owing to its subtlety, it is often missed on screening mammography [9]. AD is also a mammographic finding associated with a high positive predictive value for malignancy in both screening and diagnostic mammography, between 10%-67% and 60%-83%, respectively [10], [11].
Computational solutions such as computer-aided detection (CAD) systems have proven relevant in reducing the observational oversights and false positive rates that result from wrong interpretation of medical images [12]. CAD systems have been applied specifically to the characterization of abnormalities (e.g., architectural distortions) in breast images. For instance, approaches such as Gabor filters, mathematical models, fuzzy logic, the 2D Fourier transform, and deep learning have been applied to this task. However, we found that deep learning models, which are advancing most rapidly in terms of classification accuracy, have yet to gain the attention of researchers for the classification of AD. Deep learning models have achieved interesting results through state-of-the-art implementations aimed at detecting microcalcifications and solid masses. These models are built from a deep stack of layers that extract features from images at multiple levels of abstraction [13]. One example of a deep learning model is the convolutional neural network (CNN), whose layers are the convolutional layer, the pooling layer, and the fully connected (FC) layer [14]. Varying the hyperparameters of such models (e.g., the depth of the model) has produced different CNN architectures for image detection, including CiFarNet [15], AlexNet [16], GoogLeNet or Inception v1 [17], Inception v3 [18], Inception v4 [19], Xception [20], ResNeXt-50 [21], ResNet [22], VGG [23], and LeNet [24]. The CNN is widely deployed for the characterization of abnormalities in breast images, although it is scantily applied to the detection of architectural distortions. Nevertheless, some of these architectures have achieved excellent performance in the detection of architectural distortions [25], and the approach continues to attract research interest [26]-[28].
Because our study promotes the use of deep learning in advancing early diagnosis of breast cancer through the detection of AD, we devoted Section 3.3 to presenting an overview of CNN.
The objective of this article is to present computational studies on the detection of architectural distortion within the last decade, from 2009 to 2020. Even though our primary focus is on existing studies related to AD, we also highlight some studies that focused on other abnormalities such as masses, microcalcifications, and asymmetries. Through this multi-dimensional perspective, we believe the current study will provide researchers with a one-stop point for comparing trends in computational approaches and research efforts aimed at the detection of breast cancer. In addition, we discuss the image preprocessing techniques that usually accompany feature detection and classification techniques for architectural distortion. This will allow interested readers to observe the complete pipeline (from preprocessing to detection and then classification) for processing images containing architectural distortion.
VOLUME 8, 2020
Similar survey works [122] that review computer-aided detection or diagnosis (CAD) techniques for breast cancer do exist in the literature. However, they cover the detection of calcifications, masses, architectural distortion, and bilateral asymmetry together. As far as we know, no study has focused solely on reviewing studies on the detection of architectural distortion. The current research was carried out through an exhaustive search of online digital archives of academic publications, comprising conference papers, journal articles, and books. The technical contributions of this article are as follows:
• A state-of-the-art review of major studies on digital breast image preprocessing techniques. This is in addition to a review of studies on the characterization of masses, microcalcification, asymmetries, and architectural distortion.
• A decade-long systematic review of the techniques used in studies on the detection of architectural distortions using computational methods.
• A presentation of significant findings from the surveyed studies, to highlight for researchers the opportunities available in this area of research.
• An outline of advances and critical challenges in the detection of AD.
The remainder of this article is organized as follows: Section 2 is focused on reviewing related literature; Section 3 provides the reader with an overview of basic concepts concerning digital breast image preprocessing techniques and other focused study areas; Section 4 presents the datasets and approaches of the studies reviewed; Section 5 outlines the metrics, results of computational experiments, and comparison of performances of reviewed papers; Section 6 discusses our findings; and finally, we conclude the study in Section 7.

II. RELATED WORKS
This section presents a review on some related works, preprocessing of medical images and on models applied to the task of characterization of abnormalities in digital images.

A. IMAGE PREPROCESSING: SEGMENTATION AND CROPPING
In this sub-section, we review recent literature on image cropping, particularly the extraction of regions of interest (ROIs) from whole-size digital mammography. Our interest in reviewing this technique is motivated by the fact that the characterization of abnormalities in down-sampled high-resolution images is unlikely to be successful for mammography [29]. Although the manual cropping method has been widely adopted [21], this study focuses on automated methods.
The image cropping operation aims to improve the quality of an image by removing distracting content and improving its aesthetics. Different approaches exist for achieving this task, and they can be broadly categorized into aesthetic-based, ranking-based, and attention-based approaches. These approaches often apply techniques such as machine learning, deep learning, visual composition and boundary simplicity, Gabor filters, segmentation, sparse coding, and saliency detection. The use of deep learning models has outperformed other recent state-of-the-art methods. One notable example is the work in [30], which leveraged the geometrical properties of edge features, based on an energy model, to extract the distorted structures associated with architectural distortion in suspicious regions. In addition to geometrical properties, contours obtained from a mammogram filtered with a modified Smallest Univalue Segment Assimilating Nucleus (SUSAN) filter were also used in the extraction procedure. The literature is replete with ROI extraction techniques that are not based on deep learning [31]-[34]. Similarly, Xiang, et al. [35] reported that they were able to automate the extraction of ROIs even from noisy medical images using an extraction algorithm based on statistical moments. The approach automatically estimated an optimal threshold value from statistical moments through a histogram decomposition technique. The results showed that the proposed algorithm outperformed similar techniques while demonstrating robustness.
Renukalatha and Suresh [36] framed image cropping as a regression problem over bounding boxes with associated visual quality scores. They then applied a CNN model, capable of accepting whole images of different sizes, to predict bounding boxes and associated scores from full images. Experiments with their framework showed an improvement of about 10% over contemporary related works. Similarly, Rahman, et al. [37] combined a deep learning model with a Gaussian filter and an image scaling and cropping method to preserve the presentation of the visual object when extracting ROIs. The approach was optimized for image quality and low computational complexity. They trained on a large dataset of images to obtain a saliency map from the input image, using graph-based segmentation and gray-level adjustment to enhance and extract a more accurate and clearer saliency map. Their framework extracted the optimum rectangle for identifying ROIs by using the saliency map with minimum and maximum rectangular windows. Experiments carried out with Matlab and the Caffe framework showed that the framework is not only fast but also produces better image crops.
In contrast to the use of saliency maps for extracting ROIs in [37], the research in [38] showed that the saliency map approach is limited by false foreground objects. The authors proposed using common object discovery (COD) algorithms to mine the underlying canonical query objects from the resultant image collection, and adopted the detected object regions of interest (ROIs) as a guide for image cropping. The use of COD was further enhanced through text-based search rankings. They concluded by presenting experimental results showing that COD combined with text-based search rankings outperformed down-sampling and saliency-based methods in both object localization accuracy and general thumbnail quality.
The use of Gabor filters variants for image analysis such as feature extraction, image segmentation and others is widespread. This is evidenced by the application of curvature Gabor filter in human authentication [39]; extraction of features of masses from mammography using Gabor filter and Cuckoo search algorithm [40]; extraction of ROIs in palmprint recognition task using the combination of Gabor filters and texture-based features [41], [42]; and other related uses of Gabor filter in [43]. However, in this category of Gabor filter application, the work which is of much interest to us is that which is proposed by Banik, et al. [44]. The authors in [44] investigated the detection of architectural distortions in mammography of interval cancer cases taken prior to the diagnosis of breast cancer. To identify and extract the ROIs, they combined Gabor filter with phase portrait analysis, fractal dimension, and texture analysis. A total of 4212 regions of interest (ROIs) were automatically obtained from 106 prior mammography images of 56 interval cancer cases. This includes 262 ROIs related to architectural distortions, and 52 prior mammography images of 13 normal cases. The result of experimentation showed that AUC was 0.75 with the Bayesian classifier, 0.71 with Fisher linear discriminant analysis, and 0.76 with an artificial neural network (ANN) based on radial basis functions (RBF), and attained a sensitivity of 0.80 at 10.5 false positives per image.
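The role of an oriented Gabor filter bank in this family of methods can be illustrated with a minimal sketch. It is not the implementation used in [44]; the kernel size, wavelength, sigma, and number of orientations are assumed values chosen only for illustration, and the function names are hypothetical. Each kernel is a cosine carrier under a Gaussian envelope, and the orientation whose kernel responds most strongly to a patch is taken as the dominant direction of intensity variation in that patch.

```python
import math

def gabor_kernel(size, theta, wavelength, sigma, gamma=1.0):
    """Real (cosine) Gabor kernel as a size x size list of lists; size must be odd.

    theta is the direction along which the carrier oscillates, i.e. the
    normal to the stripes the kernel responds to.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier wave runs along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
            carrier = math.cos(2.0 * math.pi * xr / wavelength)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

def filter_response(patch, kernel):
    """Inner product of a patch with a kernel of the same shape."""
    return sum(p * k for prow, krow in zip(patch, kernel) for p, k in zip(prow, krow))

def dominant_orientation(patch, n_orientations=8, wavelength=4.0, sigma=2.0):
    """Return the filter angle (radians) with the largest absolute response."""
    size = len(patch)
    best_theta, best_mag = 0.0, -1.0
    for i in range(n_orientations):
        theta = math.pi * i / n_orientations
        kernel = gabor_kernel(size, theta, wavelength, sigma)
        mag = abs(filter_response(patch, kernel))
        if mag > best_mag:
            best_theta, best_mag = theta, mag
    return best_theta
```

In a full pipeline the per-pixel dominant orientations would form an orientation field, which later stages (such as phase portrait analysis) examine for radiating patterns.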
Plotting and combining histograms of blocks in regions of an image is another effective technique employed in image processing for the identification of ROIs. In [45], Boss, et al. proposed a histogram-based 8-neighborhood connected component labeling method for breast region extraction and removal of the pectoral muscle, and they were able to identify the breast region more accurately. Similarly, Agwu and Ohagwu [46] used a histogram-based approach for the extraction of ROIs from CT images. Apart from its application to medical image processing, the histogram technique in its various forms has been used for the extraction of ROIs, as in [47]-[49].
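As a rough illustration of 8-neighborhood connected component labeling, the sketch below finds the largest connected foreground region in a binary mask and returns its bounding box. It is a generic breadth-first labeling under assumed inputs (a mask already binarized by some histogram-derived threshold), not the specific method of [45]; the function name is hypothetical.

```python
from collections import deque

def largest_component_bbox(mask):
    """Bounding box (top, left, bottom, right), inclusive, of the largest
    8-connected foreground region in a binary mask (list of lists of 0/1).
    Returns None when the mask has no foreground pixel."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best_size, best_box = 0, None
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first search over one connected component.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            size = 0
            top, left, bottom, right = r0, c0, r0, c0
            while queue:
                r, c = queue.popleft()
                size += 1
                top, bottom = min(top, r), max(bottom, r)
                left, right = min(left, c), max(right, c)
                # Visit all 8 neighbors (diagonals included).
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and mask[nr][nc] and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            if size > best_size:
                best_size, best_box = size, (top, left, bottom, right)
    return best_box
```

Keeping only the largest component is one simple way to separate the breast region from small bright artifacts such as labels; diagonal connectivity matters because thin structures are often joined only at pixel corners.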
Another technique for the extraction of ROIs from images is thresholding, which has also gained research attention, as in [50]-[55]. Of particular interest, in both performance and approach, are the works of Ragab, et al. [56] and Sheba and Gladston Raj [57]. Ragab, et al. [56] first applied a manual method and thereafter used threshold- and region-based techniques to extract ROIs. The method first determined the tumor region using a threshold value. After some trials, the authors fixed the threshold at a specific value (76) for all images, regardless of the size of the tumor. The biggest area within this threshold across the image was then determined, and the tumor was cropped automatically. Meanwhile, in Sheba and Gladston Raj [57], the regions of interest were automatically detected and segmented from mammography using global thresholding, Otsu's method, and morphological operations. The extracted ROIs were then used for classification, based on feed-forward artificial neural networks trained with backpropagation, to distinguish between healthy, benign, and malignant breast parenchyma in digital mammography. Also, Pandey, et al. [58] applied a thresholding technique and a convolution method for the extraction of ROIs from magnetic resonance images (MRI).
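The two thresholding strategies mentioned here, a fixed global value and Otsu's method, can be contrasted with a short sketch. The fixed value 76 is the one reported for [56]; everything else (8-bit gray levels, a flat list of pixel values, the function names) is an assumption made for illustration. Otsu's method picks the threshold that maximizes the between-class variance of the gray-level histogram.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance,
    where class 0 is pixels <= t and class 1 is pixels > t."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w0 = 0      # number of pixels in class 0 so far
    sum0 = 0    # sum of gray levels in class 0 so far
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (constant factor dropped).
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def apply_threshold(pixels, t):
    """Binary mask: 1 where the pixel is brighter than t, else 0."""
    return [1 if p > t else 0 for p in pixels]
```

A fixed threshold is simple and reproducible but depends on consistent image intensity across the dataset, whereas Otsu's method adapts to each image's histogram at the cost of assuming a roughly bimodal distribution.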
Segmentation of the breast region is the first and one of the most important preprocessing steps in mammogram analysis. This allows focusing on ROI in cancer images by detecting architectural distortion at the border. In [124], the authors applied a texture filter method to the problem of segmenting the breast region. Lastly, for this section, we present the work of de Vos, et al. [59], which employed deep learning for the localization and extraction of ROIs from images. A convolutional neural network (ConvNet) was trained to detect the presence of the anatomical structure of interest in axial, coronal, and sagittal slices extracted from a 3D image. The approach created 3D bounding boxes by combining the output of the ConvNet over all slices. The localization method was compared with a manual method using the distances between the centroids and walls of the automatically and manually defined reference bounding boxes. Several other works have adapted deep learning for ROI extraction [60]-[64], while others have used a high-pass isotropic filter [147] or other approaches [146], [124].
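One way to read the slice-combination step is sketched below. It assumes, purely for illustration, that the ConvNet output has already been reduced to one binary presence flag per slice, and that axial, coronal, and sagittal slices index the z, y, and x axes respectively; the function names are hypothetical and the details of [59] may differ. The 3D box is then the span of flagged slices along each axis, and its centroid can be compared against a reference box.

```python
def extent(flags):
    """First and last index of the slices flagged positive, or None if none are."""
    hits = [i for i, flag in enumerate(flags) if flag]
    return (hits[0], hits[-1]) if hits else None

def bounding_box_3d(axial, coronal, sagittal):
    """Combine per-slice presence flags along three axes into a 3D box.

    Returns ((z0, z1), (y0, y1), (x0, x1)) with inclusive indices,
    or None when the structure is not detected along some axis.
    """
    spans = [extent(axial), extent(coronal), extent(sagittal)]
    if any(span is None for span in spans):
        return None
    return tuple(spans)

def centroid(box):
    """Center of a box returned by bounding_box_3d, as (z, y, x)."""
    return tuple((lo + hi) / 2.0 for lo, hi in box)
```

For example, flags of `[0, 0, 1, 1, 1, 0]`, `[0, 1, 1, 0]`, and `[1, 1, 0, 0, 0]` along the three axes give the box `((2, 4), (1, 2), (0, 1))`.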

B. CHARACTERIZATION OF ABNORMALITIES IN DIGITAL BREAST IMAGES
Abnormalities in breast tissues are mostly of four types, namely: malignant masses, calcification, architectural distortions, and asymmetries. Although this study is aimed at reviewing studies focused on detection of the architectural distortions, we felt it necessary to present review studies aimed at other related abnormalities. Although different approaches have been adopted for the detection of breast cancer using patient records [80]- [82], this section focuses on studies detecting abnormalities in breast images.

1) MALIGNANT MASSES
We first present literature which does not use deep learning models for the task of detecting abnormalities in mammography. The work in [3], for instance, differs from the popular opinion of using deep learning models. In the research, feature extraction was based on multi-resolution wavelets while classification was performed by using SVM and ELM networks with modified kernels. By using multi-resolution, the authors were able to increase the texture and shape features extracted to improve the task of detection and classification. Experimentation of the approach was carried out using 355 images of fatty breast tissue of IRMA database, with 233 normal instances (no lesion), 72 benign, and 83 malignant cases, and attained an accuracy of 94.11%.
In [29], the authors extracted features from ROIs using speeded-up robust features (SURF) and local binary pattern variance (LBPV) descriptors. The features were then represented as deep invariant features (DIFs), which were constructed in supervised and unsupervised fashion through a multilayer deep-learning architecture. The authors experimented with a dataset of 600 region-of-interest (ROI) masses, comprising 300 benign and 300 malignant masses, obtained from two publicly available data sources. Their system, DeepCAD, obtained a sensitivity of 92%, a specificity of 84.2%, an accuracy of 91.5%, and an AUC of 0.91.
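Since sensitivity, specificity, and accuracy recur throughout the studies surveyed, a small sketch of how they follow from confusion-matrix counts may be useful. The counts in the usage note are made up for illustration and are not taken from any cited study.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn count malignant cases classified correctly/incorrectly;
    tn/fp count benign or normal cases classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)            # true positive rate (recall)
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy
```

With hypothetical counts of 90 true positives, 10 false negatives, 80 true negatives, and 20 false positives, the function returns a sensitivity of 0.9, a specificity of 0.8, and an accuracy of 0.85.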
Using an already trained or fine-tuned architecture helps to fast-track the process of adapting the architecture to different problems. This was demonstrated in [56], which adapted AlexNet to segment whole images and then classify the extracted ROI. The authors modified AlexNet to classify two classes instead of 1,000 classes, by introducing an SVM classifier at the last fully connected layer only. The approach used a segmentation technique (threshold- and region-based) to automate the extraction of ROIs. Classification was based on the SVM classifier and on mammography images from the Digital Database for Screening Mammography (DDSM) and the Curated Breast Imaging Subset of DDSM (CBIS-DDSM). The study applied the trained DCNN to manually cropped inputs and classified benign and malignant mass tumors with an accuracy of 71.01%. However, when the ROIs were automatically extracted using segmentation techniques, the model achieved an area under the curve (AUC) of 0.88 (88%) on the DDSM dataset. Moreover, when using the samples obtained from CBIS-DDSM, the accuracy of the DCNN increased to 73.6%. Consequently, the SVM accuracy became 87.2% with an AUC of 0.94 (94%).
Similarly, Levy and Jain [66] investigated the performance of three architectures: AlexNet, GoogLeNet, and a shallow CNN. The three models were used to classify images as malignant or benign. To circumvent the challenge of overfitting, they used transfer learning, batch normalization, careful preprocessing, and data augmentation. For both AlexNet and GoogLeNet, the researchers used the same base architecture as the original works but replaced the last fully connected (FC) layer so that it outputs the two classes. The proposed shallow CNN takes a 224 × 224 × 3 image as input and consists of 3 convolutional blocks composed of 3 × 3 convolutions, three fully connected layers, and a soft-max layer. Furthermore, they employed ReLU activation functions, Xavier weight initialization, and the Adam [15] update rule with a base learning rate of 10^-3 and a batch size of 64. The best model achieved a recall of 0.934 at a precision of 0.924.
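For readers unfamiliar with the training ingredients named here, the sketch below shows a single Adam update for one scalar parameter and the bound used by Xavier (Glorot) uniform initialization. Only the learning rate of 10^-3 comes from the description above; the beta and epsilon values are the commonly used Adam defaults and are an assumption, as are the function names.

```python
import math

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter.

    m and v are the running first and second moment estimates;
    t is the 1-based step count used for bias correction.
    Returns the updated (param, m, v).
    """
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)   # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)   # bias-corrected second moment
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

def xavier_uniform_bound(fan_in, fan_out):
    """Weights are drawn uniformly from [-bound, bound] under Xavier initialization."""
    return math.sqrt(6.0 / (fan_in + fan_out))
```

On the very first step the bias-corrected moments equal the gradient and its square, so the parameter moves by approximately the learning rate in the direction opposite to the gradient, regardless of the gradient's magnitude.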
Jung, et al. [67] proposed the use of RetinaNet for the detection of masses in mammography by adopting pre-trained weights (i.e., using weights pre-trained on GURO, then training and testing on INbreast). This, they claimed, demonstrates that using weights pre-trained on a dataset achieves performance similar to using that dataset directly in the training phase. Experiments on the public dataset INbreast and the in-house dataset GURO showed that their model obtained an average number of false positives of 0.34 and 0.03, at a confidence score of 0.95, on INbreast and GURO respectively. Likewise, Agarwal, et al. [68] employed transfer learning to propose a patch-based CNN method for automated mass detection in full-field digital mammography (FFDM). They also investigated the performance of the VGG16, ResNet50, and InceptionV3 architectures on the same dataset, applying transfer learning to uncover the benefit of domain adaptation between the CBIS-DDSM (digitized) and INbreast (digital) datasets using the InceptionV3 CNN. Their experiments showed that InceptionV3 obtained the best performance in classifying mass and non-mass breast regions for CBIS-DDSM. Transfer learning from CBIS-DDSM obtained substantially higher performance, with the best true positive rate (TPR) of 0.98 at 1.67 false positives per image (FPI), compared with transfer learning from ImageNet, with a TPR of 0.91 at 2.1 FPI.
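Results such as "a TPR of 0.98 at 1.67 FPI" describe one operating point of a detector. The sketch below shows how such a point could be computed from a pooled list of detections at a chosen confidence threshold. It assumes each detection has already been matched against the ground truth and that each lesion is matched by at most one true-positive detection; the data layout and function name are hypothetical.

```python
def froc_point(detections, n_lesions, n_images, threshold):
    """Return (true positive rate, false positives per image) at a threshold.

    detections: list of (confidence, is_true_positive) pairs pooled over
    all images, where is_true_positive marks a detection that hits a lesion.
    """
    kept = [hit for score, hit in detections if score >= threshold]
    tp = sum(1 for hit in kept if hit)
    fp = len(kept) - tp
    return tp / n_lesions, fp / n_images
```

Sweeping the threshold from high to low traces out the full free-response ROC curve, trading a higher TPR for more false positives per image.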
Another work worth considering is that of Arevalo, et al. [69], which demonstrated the potential superiority of a deep learning based classifier in distinguishing malignant from benign breast masses without segmenting the lesions or extracting pre-defined image features. The learning model in [70] attained an area under the ROC curve of 86%. Other related studies can be found in [172]-[174].
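The area under the ROC curve reported by many of these studies can be understood as the probability that a randomly chosen malignant case receives a higher score than a randomly chosen benign case. A minimal sketch of this rank-based computation follows; the scores in the usage note are invented for illustration.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    case scores higher; tied pairs count as half.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

For positive scores `[0.9, 0.8, 0.4]` and negative scores `[0.7, 0.3]`, five of the six pairs are ordered correctly, giving an AUC of 5/6.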

2) CALCIFICATION
The work in [71] combined the CC and MLO mammography views to differentiate between malignant and benign tumors. The authors implemented a deep-learning classification method based on two view-level decisions, made by two neural networks, followed by a single-neuron layer that combines the view-level decisions into a global decision that mimics the biopsy results. The model exploited the detection of features of clustered breast microcalcifications to classify tumors into benign and malignant categories. In related work, Sert, et al. [72] adapted a CNN model to classify breast tumors as benign or malignant based on the detection of features of microcalcifications. The approach investigated the benefit of combining the CNN model with various preprocessing methods such as contrast scaling, dilation, and cropping, and with decision fusion using an ensemble of networks. The results showed that preprocessing has a great influence on classification performance, and the approach obtained 94.0% recall and 95.0% precision.
In most of the learning models reviewed so far, we observed that patches (of dynamic or fixed size) from whole images served as inputs. Xi, et al. [65] trained their model on patches and then adapted it to whole images. The models investigated were VGGNet and ResNet, with the latter demonstrating the best classification accuracy, at 92.53%. Meanwhile, Murali and Dinesh [73] employed a deep convolutional neural network (CNN) and a random forest classifier for the classification of ROIs with malignant masses and microcalcifications. The AUC of the CNN was 0.87, which was higher than the mean AUC of the radiologists (0.84), though the difference was not significant. On the other hand, [74] and [26] circumvented the use of deep learning by adopting wavelet decomposition. Although our research is focused on CNN models, their work is worth mentioning and may interest others.

3) ARCHITECTURAL DISTORTION
In [75], the authors approached feature extraction on inputs with architectural distortions and spiculated masses using Gabor filters and PPlanes. SVM and MLP classifiers were then employed for classification using the Mini-MIAS and DDSM datasets. They achieved 90% sensitivity and 86% specificity in distinguishing AD from normal breast tissue, and 93% sensitivity and 88% specificity in classifying spiculated masses; in addition, the SVM classifiers achieved 96% sensitivity with 9.6 false positives per image in the detection of spiculated masses and 97% sensitivity with 6.6 false positives per image in the detection of architectural distortions. In related work, Rangayyan, et al. [76] demonstrated methods for the detection of architectural distortions in prior mammography of interval cancer cases, based on analysis of the orientation of breast tissue patterns. This was achieved by applying Gabor filters and phase portraits to detect node-like sites of radiating or intersecting tissue patterns. The method achieved a sensitivity of 80% at about five false positives per patient.
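The idea behind phase portrait analysis can be sketched as follows. In the standard formulation, the local orientation field is modeled as a linear system v(x, y) = A [x, y]^T + b, and the eigenvalues of the 2 × 2 matrix A determine the portrait type: real eigenvalues of the same sign give a node, real eigenvalues of opposite sign give a saddle, and complex eigenvalues give a spiral. The code below classifies a given A; fitting A to a Gabor-derived orientation field is omitted, and the function name is hypothetical rather than taken from [75] or [76].

```python
def classify_phase_portrait(a11, a12, a21, a22):
    """Classify the phase portrait of v = A p + b from A = [[a11, a12], [a21, a22]].

    Returns 'node', 'saddle', 'spiral', or 'degenerate' (singular A).
    """
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    # Eigenvalues are (trace +/- sqrt(disc)) / 2, and their product is det.
    disc = trace * trace - 4.0 * det
    if det == 0:
        return "degenerate"
    if disc < 0:
        return "spiral"     # complex-conjugate eigenvalues
    if det > 0:
        return "node"       # real eigenvalues with the same sign
    return "saddle"         # real eigenvalues with opposite signs
```

Node-like sites are the ones of interest for architectural distortion, since tissue strands radiating from or converging on a point produce an orientation field of that type.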
Others have leveraged the benefits of R-CNN, as in [9], which introduced the detection of architectural distortions using a supervised pre-trained region-based network (R-CNN). Experiments were based on the DDSM dataset, and the results showed over 80% sensitivity and specificity, with 0.46 false positives per image at an 83% true-positive rate. Similarly, Bakalo, et al. [77] demonstrated a novel network that combined two learning branches, for region-level classification and region ranking, in weakly and semi-supervised settings. Their results for weakly supervised learning showed an improvement of 4% in AUC, 10-17% in partial AUC, and 8-15% in specificity at 0.85 sensitivity. Hang, et al. [78] applied GlimpseNet to autonomously extract ROIs and then classify them, gaining a 4.1% increase in accuracy.
Recently, there has been a surge in the use of basic CNN models for the characterization of architectural distortions from mammography. Qiu, et al. [79] proposed a framework based on a deep convolutional neural network (CNN). The model is an 8-layer deep learning network comprising 3 pairs of convolution and max-pooling layers for automatic feature extraction and a multilayer perceptron (MLP) classifier for feature categorization to process ROIs. The convolution layers contained 20, 10, and 5 feature maps. The MLP classifier is composed of one hidden layer and one logistic regression layer. Their experiments achieved an AUC of 0.696±0.044, 0.802±0.037, 0.836±0.036, and 0.822±0.035 for the fold 1 to fold 4 testing datasets respectively, with an overall AUC of 0.790±0.019 for the entire dataset.
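To make the layer structure concrete, the sketch below traces how the spatial size of an ROI shrinks through three convolution and max-pooling pairs with 20, 10, and 5 feature maps. Only the number of pairs and the feature-map counts come from the description above; the 64 × 64 input size, the 3 × 3 unpadded convolutions, and the 2 × 2 non-overlapping pooling are assumptions chosen for illustration and may not match [79].

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width of a convolution or pooling window along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

def trace_shapes(input_size, feature_maps=(20, 10, 5), kernel=3, pool=2):
    """Return (per-layer shapes, flattened feature count) for a square input.

    Each entry of the shape list is (layer type, channels, height, width).
    """
    shapes = []
    size = input_size
    for n_maps in feature_maps:
        size = conv_out(size, kernel)               # unpadded convolution
        shapes.append(("conv", n_maps, size, size))
        size = conv_out(size, pool, stride=pool)    # non-overlapping max pooling
        shapes.append(("pool", n_maps, size, size))
    flattened = feature_maps[-1] * size * size      # input width of the MLP
    return shapes, flattened
```

Under these assumptions a 64 × 64 ROI ends as 5 maps of 6 × 6, so the MLP classifier would receive 180 features.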
Similarly, Jiao, et al. [80] proposed a deep-feature-based framework combining intensity information for breast mass classification. In related work, Bakkouri and Afdel [81] proposed a novel discriminative objective for a supervised deep feature learning approach focused on classifying tumors in mammography as malignant or benign, using a Softmax layer as the classifier. The proposed network was enhanced with a scaling process based on Gaussian pyramids to obtain regions of interest of normalized size. The DDSM and BCDR datasets were used together with data augmentation by geometric transformation. Their experiments yielded an accuracy of 97.28%. Dubrovina, et al. [82] used another deep learning model, a novel supervised framework for classifying regions into semantically coherent tissues. To make the most of the available training data, they trained the CNN in an overlapping patch-wise manner so that it learned discriminative features automatically. They obtained an average Dice coefficient of 0.71.
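The Dice coefficient mentioned above measures the overlap between a predicted region and the ground-truth region as 2|A ∩ B| / (|A| + |B|). A minimal sketch on two small binary masks follows; the masks are hypothetical, and the convention of returning 1.0 when both masks are empty is our assumption.

```python
def dice_coefficient(pred, truth):
    """Dice overlap of two binary masks (nested lists of 0/1, same shape)."""
    intersection = sum(p and t
                       for row_p, row_t in zip(pred, truth)
                       for p, t in zip(row_p, row_t))
    size = sum(map(sum, pred)) + sum(map(sum, truth))
    return 1.0 if size == 0 else 2.0 * intersection / size

# Hypothetical 2 x 3 masks: three pixels each, two of them shared.
predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]
print(round(dice_coefficient(predicted, reference), 3))
```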
Samala, et al. [83] proposed a multi-task transfer learning DCNN to translate knowledge from non-medical images to medical diagnostic tasks through supervised multi-task transfer learning. Digitized screen-film mammograms (SFMs) and digital mammograms (DMs) were used to train the DCNN, which was then tested on SFMs. With Institutional Review Board (IRB) approval, SFMs and DMs were collected from patient files, and additional SFMs were obtained from the Digital Database for Screening Mammography. The data set consisted of 2242 views with 2454 masses (1057 malignant, 1397 benign).
CNN-based models for mammograms include the work of Antropova, et al. [84], which exploited pre-trained convolutional neural networks (CNNs) in combination with pre-existing handcrafted features. These handcrafted features were combined with low- to mid-level features extracted using a pretrained CNN. Chougrad, et al. [85] proposed another CNN model, aided by an end-to-end learning process, for classifying breast mass lesions.
In [86], the authors applied convolutional neural network-discrete wavelet (CNN-DW) and convolutional neural network-curvelet transform (CNN-CT) models to detect architectural distortion. The study also filtered the input using contrast limited adaptive histogram equalization (CLAHE) and then compared Softmax and support vector machine (SVM) classifiers. CNN-DW and CNN-CT achieved accuracy rates of 81.83% and 83.74%, respectively. Jiang, et al. [87] explored combining transfer learning with GoogLeNet and AlexNet pre-trained on a large-scale visual database. Their results showed that GoogLeNet reached an AUC of 0.88, outperforming AlexNet, which reached an AUC of 0.8.
Finally, other related works adopted similar techniques. Sharma and Preet [88] applied a convolutional neural network as a classifier on mammogram images to improve the accuracy of CAD. The performance of the different classifiers was measured using the receiver operating characteristic. The model attained an accuracy of 73%, with 71.5% sensitivity and 73.5% specificity, for dense tissue, and an accuracy of 79.23%, with 73.25% sensitivity and 74.5% specificity, for fatty tissue. Similarly, Teare, et al. [89] presented two novel techniques (genetic search of image enhancement methods with CLAHE and DCNN) to address inherent challenges in applying machine learning to mammography. The research also used dual deep convolutional neural networks at different scales to classify full mammogram images and derivative patches, combined with a random forest gating. They obtained a sensitivity of 0.91 and a specificity of 0.80. Teare, et al.'s [89] study was based on a wavelet convolutional neural network for detecting spiculated findings, such as architectural distortions and spiculated masses, in low-contrast noisy mammography. Experiments on the CBIS-DDSM dataset reached an accuracy of over 85% for architectural distortions and 88% for spiculated masses. In [27], the authors proposed a scheme for detecting masses and architectural distortions in DBT datasets that is composed of two separate channels, each dedicated to detecting one of the target radiological signs. Lastly, Kamra, et al. [28] used texture models with a support vector machine (SVM) classifier for texture classification of architectural distortions. The databases used were the IRMA version of the Digital Database for Screening Mammography (DDSM) and the Mammographic Image Analysis Society (MIAS) database. Results showed an accuracy of 92.94% using DDSM for fixed-size ROIs and 95.34% for the MIAS dataset.
The study in [121] aimed at detecting spiculated lesions and architectural distortions in digital breast tomosynthesis using a fast method and a contrario modeling. The fast algorithm was implemented to significantly reduce the computational cost. When applied to 38 breasts (10 containing a lesion), the method showed a sensitivity of 0.8 at 1.68 false positives per breast. The authors in [123] applied the discrete wavelet transform (DWT) to a dataset of 19 mammograms with architectural distortion and 19 normal mammograms to detect breast architectural distortion. Using an SVM classifier, the method yielded an accuracy of 92.1%, a sensitivity of 89.5%, and a specificity of 94.7%.
Samulski and Karssemeijer [125] proposed a multiview (e.g. MLO and CC views) CAD system based on a case-based reasoning or learning method. They improved an existing single-view lesion detection system and applied a correspondence classifier to detect malignant masses and architectural distortions. The method was applied to 454 mammograms consisting of four views with a malignant region. Mean sensitivity increased by 4.7% in the range of 0.01-0.5 false positives per image.
Banik, et al. [127] adopted Gabor filters, phase portrait analysis, angular spread of power, fractal analysis, Laws' texture energy measures and Haralick's texture features. The study reported that 4224 ROIs, of which 301 were true positives, were automatically obtained from 106 prior mammograms of 56 interval-cancer cases. These methods were applied to the images to detect architectural distortion in prior mammograms of interval cancer. The authors reported AUCs of 0.76 with the Bayesian classifier, 0.75 with Fisher linear discriminant analysis, and 0.78 with a single-layer feed-forward neural network, as well as sensitivities of 0.80 and 0.90 at 5.8 and 8.1 false positives per image, respectively. In another study [131], the authors applied a similar approach. They, however, modeled the input as the Rényi entropy of angular histograms composed with the Gabor magnitude response, angle, coherence, orientation strength, and the angular spread of power in the Fourier spectrum. In [26], Biswas and Mukherjee used a two-layer generative model to extract distinctive textures for recognizing architectural distortion. The model was applied to the MIAS and DDSM datasets and was reported to perform well.
The authors of [128] presented methods for detecting oriented features and applied them to exploit the presence of oriented features in mammography. Their approach consisted of low- and high-level analysis. The low-level analysis covers the detection of oriented features in images, while the high-level analysis concerns the discovery of patterns in the orientation field. The presence of oriented features in images often conveys important information about the scene or the objects contained therein. The study applied the phase portrait method to the detection of architectural distortion in mammograms. Torabi, et al. [129] applied wavelet packet analysis to the two-dimensional histogram matrices of mammograms to generate filter banks for extracting two statistical features, skewness and kurtosis. Using a 5-fold cross-validation protocol, the authors claimed that their method improved the detection accuracy of architectural distortion. Gabor filters are widely used for the detection of architectural distortion, as also seen in [132]. The authors adopted the method to detect the orientation of the breast tissue at each pixel, the breast boundary, the nipple, and the pectoral muscle. Furthermore, they used a measure of coherence to find the angular deviation of the oriented structures in order to detect AD. Their technique yielded a sensitivity of 80% at 10.3 false positives per image.
In a related study, the same authors in [133] relied on Gabor filters and phase portrait analysis, together with measures of spicularity and angular dispersion of the patterns in automatically detected ROIs. To detect the presence of AD, the authors included an index of convergence of spicules, which is computed from the Gabor magnitude and coherence using the Gabor angle response. After that, they measured radially weighted difference and angle-weighted difference measures of the intensity, Gabor magnitude, and Gabor angle response. In addition, they computed the angle-weighted difference in entropy of spicules computed from the intensity, Gabor magnitude, and Gabor angle response. Using pattern classification, they obtained an AUC of 0.76 with an ANN based on radial basis functions and a sensitivity of 0.90 at 6.3 false positives per patient. Deviating from the conventional use of Gabor filters, the authors in [134] used a difference of Gaussians (DoG)-based filter in conjunction with a thresholding technique. Their method was able to detect AD effectively and also reduce the number of false positives.
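Several of the studies above estimate the local orientation of breast tissue by filtering with a bank of Gabor kernels and keeping, at each pixel, the angle with the strongest response. The sketch below illustrates the idea with a real-valued (cosine) kernel and four angles; the kernel size, sigma, wavelength, aspect ratio, and function names are illustrative assumptions rather than the settings of any cited work.

```python
import math

def gabor_kernel(theta, half=7, sigma=3.0, wavelength=6.0, gamma=0.5):
    """Real Gabor kernel; theta is the direction of intensity variation,
    i.e. perpendicular to the line structures the kernel responds to."""
    c, s = math.cos(theta), math.sin(theta)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * c + y * s        # coordinate across the line structure
            yr = -x * s + y * c       # coordinate along the line structure
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel

def dominant_orientation(image, cy, cx, angles_deg=(0, 45, 90, 135), half=7):
    """Angle (degrees) with the largest absolute Gabor response at (cy, cx)."""
    best_angle, best_mag = None, -1.0
    for angle in angles_deg:
        k = gabor_kernel(math.radians(angle), half=half)
        response = sum(k[dy + half][dx + half] * image[cy + dy][cx + dx]
                       for dy in range(-half, half + 1)
                       for dx in range(-half, half + 1))
        if abs(response) > best_mag:
            best_angle, best_mag = angle, abs(response)
    return best_angle

# Synthetic test pattern: vertical stripes, so intensity varies along x.
stripes = [[math.cos(2 * math.pi * x / 6.0) for x in range(25)] for _ in range(25)]
print(dominant_orientation(stripes, 12, 12))
```

A real-valued kernel is sensitive to the phase of the pattern under it, which is one reason practical systems usually work with the magnitude of a complex Gabor response and a finer set of angles.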
Chang, et al., [135] proposed an image enhancement method and used the Laplacian of Gaussian (LoG) filter for feature extraction in digital images. Meanwhile, they examined the correlation between histological grade and stellate feature on 3D ultrasound imaging. Chakraborty, et al. [136] proposed the use of Gabor filters and statistical measures of the orientation for the detection of architectural distortion. They further applied two types of co-occurrence matrices and computed Haralick's 14 texture features to estimate the joint occurrence of the angles of oriented structures. Detection of ROIs was carried out by using Gabor filters and phase portrait analysis. By using an artificial neural network for classification, and the leave-one-image-out approach for cross-validation, their approach yielded AUC of 0.77, a sensitivity of 80% at 5.4 false positives per image. Similarly, in [137] the authors also used statistical measures of oriented patterns in conjunction with Gabor filters and phase portrait analysis. The result obtained revealed an AUC of 0.76, a sensitivity of 80% at 4.2 false positives per patient. Using a different approach, the study in [138] applied Monogenic Binary Coding (MBC) for features extraction by the analysis of oriented textures. They then adopted the Nearest Neighbor classifier to obtain 91.25% in terms of the average accuracy.
On the other hand, the authors in [139] used the Otsu technique for segmentation and then applied the contourlet transform and the phase portrait method for feature extraction. In addition to image preprocessing, top-hat processing and the concentration of white spaces in the sliding window were also applied. Similar to the work in [123], the study presented in [140] applied the wavelet transform, phase portrait analysis, and angular spread of power analysis to improve the accuracy of detection of AD. Morphological filtering and the Otsu threshold method were also applied for image preprocessing. In addition to using Gabor filters and phase portrait analysis, the study in [158] applied Multiple Twin Bound Support Vector Machines Recursive Feature Elimination (MTWSVM-RFE) and Twin Bounded Support Vector Machine (TWSVM) for classification purposes. Similarly, [164] applied Gabor filters and phase portrait analysis, together with Multiple Twin Bound Support Vector Machines Recursive Feature Elimination (MTBSVM-RFE) and Twin Bounded Support Vector Machine (TBSVM).
While adopting the popular combination of Gabor filters and phase portrait analysis, the authors in [141] also analyzed the angular spread of power and performed fractal analysis. Applying the neural network classifier, Bayesian classifier, and Fisher linear discriminant analysis, the study yielded AUCs of 0.76, 0.77 and 0.76, respectively. They also obtained sensitivities of 0.80 and 0.90 at 5.7 and 8.8 false positives (FPs) per image. Similarly, [151] used an adaptive Gabor filter to detect mammary gland structure in order to detect AD. In [144], the authors used a measure of divergence of oriented patterns in conjunction with the Gabor angle response, together with radially weighted difference and angle-weighted difference (AWD) measures of the intensity, Gabor magnitude, and Gabor angle response. The proposed method obtained a sensitivity of 0.80 at 5.3 false positives (FPs) per patient. The study in [142] used an approach similar to [123] and [140], although it also compared the performance of SVM and relevance vector machine (RVM) classifiers. In a distinct approach, the authors in [143] proposed morphological processing for detecting AD. The authors in [148] used a Gaussian mixture to model features extracted from Curvelet coefficients and Spiculated Lesion Filters in order to detect AD. By applying their approach to the DDSM and MIAS databases, they obtained an accuracy of 92.78%. The study in [149] applied a straight-line approximation of the pectoral muscle in addition to optimum thresholding to obtain an accuracy of 86.67%.
In another work, the researcher exploited diffusion tensor imaging (DTI), which yields pixel-by-pixel vector maps, to track the mammary architectural elements [150]. The study in [152] proposed a novel method using direction analysis of linear structures to detect AD. The determined directions (0°, 22.5°, 45°, 67.5°, 90°, 112.5°, 135°, and 157.5°) were then used to calculate the isotropic indexes to extract suspicious areas. In [153], Bandelets were explored for the analysis of mammograms to detect the geometric flow, which indicates directions in which the image gray levels have regular variations. The authors claimed that their approach outperformed Wavelet and Curvelet methods using a Support Vector Machine (SVM). Using a different approach, the study in [154] showed that multiscale statistical texture analysis allows for distinguishing between textural patterns of architectural distortion and normal breast parenchyma. In addition, the authors applied data-driven two-dimensional intrinsic mode function (IMF) layers to extract high-frequency oscillations of the input. By using a nonlinear support vector machine classifier, the approach obtained an AUC of 0.88.
In [155], the authors proposed the extraction of features using a sub-classes clustering based multi-task learning method (SMTL). After that, a sparse representation based classifier was used for the classification of AD or non-AD [156]. Using textural patterns of AD surrounding tissue in detecting the presence of AD, the authors first used BEMD algorithm to extract ROIs. After that, statistical signatures of IMF layers were computed further to identify the presence of AD in the ROIs. In [157] the authors combined transfer learning with automatic architectural distortion detection method. The study in [159] applied the concept of graph theory to denote the linear saliency in mammography, that is, before applying eigenvectors from the adjacency matrix to extract discriminant coefficients that represent graph nodes. Using Support Vector Machine (SVM) classifier and mini-MIAS and DDSM databases, their approach yielded AUC 0.93, accuracy rate of 89 %, sensitivity 95 %, and a specificity of 93 %. In [160], the performances of Local Mapped Pattern (LMP) were compared with Local Binary Pattern (LBP) in conjunction with a multilayer perceptron neural network to obtain an accuracy of 83%.
Also, the study in [161] compared the performance of multiscale fractal dimension (FD) measurements with that of a single FD measurement. Their report showed that the former outperformed the latter. The study detected AD by applying a two-dimensional empirical mode decomposition (2D-EMD) algorithm to generate a multiscale representation of the mammograms. After that, they measured the fractal dimension from the multiresolution representation of the mammogram. The authors in [162] combined a dense convolutional neural network (DenseNet) with the Squeeze-and-Excitation (SE) block to obtain SE-DenseNet. On the BCDR dataset, the approach yielded an AUC of 0.984 and an accuracy of 0.982. Using a deep learning approach, the authors in [170] adapted a VGG-16 network pre-trained on ImageNet images, in combination with transfer learning, to obtain AUC = 0.89. Similarly, the study in [171] developed a deep learning model for the detection of AD through detection (Gabor filters) and aggregation (Faster-RCNN) in 2D and 3D, respectively. Their approach outperformed other similar models by obtaining a mean true positive fraction (MTPF) of 0.50 ± 0.04.
Other approaches to the detection of AD include the work in [163], whose authors exploited the fractal dimension and lacunarity.
The study in [165] attempted to optimize the segmentation technique so as to reduce the number of inputs sent to the classifier when detecting AD. Further approaches include the use of the three-dimensionality of the imaging modality [166], the application of Markov random fields along with the watershed transform and mathematical morphological operators [167], a context-based ensemble classifier approach [168], and the combination of Local Binary Pattern (LBP) and SVM [169], in which the SVM classifier classifies the images as malignant or benign. Other approaches proposed for the detection of architectural distortions and abnormal structures in mammographic images analyze bilateral and temporal cases using image registration methods ranging from global and rigid transformation to local deformable paradigms [125], [130].

III. PRELIMINARIES
In this section, brief preliminaries of basic concepts related to the survey in this article are presented.

A. ARCHITECTURAL DISTORTION
Mammography is the most widely used method to screen for breast cancer. The X-ray film produced by mammography can be converted and stored as full-field digital mammography (FFDM). FFDM images are digital signals representing the actual X-ray film and are used in CAD operations such as deep learning models. Mammography images are typically acquired in two to four views, namely: craniocaudal (CC), medio-lateral (ML), latero-medial (LM), and mediolateral-oblique (MLO) views. The CC view is taken from above, the ML view is taken from the center of the chest outward, the LM view is taken from the outer side of the breast towards the center of the chest, and the MLO view presents breast images from the upper-outer quadrant [90]. Figures 1-4 present these four views of mammography images. Research has shown that combining more than one view improves the performance of deep learning models. This proposal will attempt to gather and synthesize images from at least two views: CC and MLO views.
Radiologists and CAD systems usually look out for abnormalities in both the X-ray film and the FFDM. These abnormalities may be classified as benign (tumors not considered cancerous) and malignant (tumors are cancerous), and are differentiable from normal mammography as shown in Figures 5-7. The benign tumors can have round or oval shapes, while malignant tumors have a partially rounded shape with an irregular outline. Besides, the malignant masses will appear whiter than any tissue surrounding it [56]. The four abnormalities that are presented with digital mammography, namely: masses, calcifications, asymmetries, and architectural distortions are shown in Figures 8-11.
Architectural distortions usually present subtly in the mammogram. They are distortions of the normal breast architecture with no definite visible mass. They can be detected from the appearance of spiculations radiating from a point, focal retraction, or straightening at the edges of the parenchyma [10]. In a related study [98], the authors observed the difference between architectural distortions caused by benign and malignant tissues: benign causes of architectural distortions include radial scars; complex sclerosing lesions; sclerosing adenosis; fat necrosis; postprocedural change; and rare spiculated benign lesions, such as granular cell tumor and breast fibromatosis. Malignant causes include breast cancer and ductal carcinoma in situ. These distortions are characterized by the abnormalities in Figure 12a-c. Architectural distortions on mammography present themselves mostly in invasive breast cancers, namely invasive lobular carcinoma (ILC) and invasive ductal carcinoma (IDC), which represent about 5-10% and 70-90% of invasive breast malignancies, respectively [98].

B. IMAGE CROPPING
Image cropping procedures are used by deep learning models to extract regions of interest (ROIs) and also to remove noisy regions from an image. Some studies have used segmented ROIs to reduce the computation of the CNNs by removing image regions that may not benefit the desired task. While several studies have cropped images manually using ground truth data [21], other approaches have been either completely automated or semi-automated. Image cropping is also needed when preparing images before they are served as input to learning models. Image data may also be centred by subtracting the per-channel mean pixel values calculated on the training dataset. The manual method is less effective in supporting the classification procedures of learning models because the techniques applied are prone to human error. For instance, in [54] the tumors in the DDSM dataset are labelled with a red contour, and these contours are determined manually by examining the pixel values of the tumor and using them to extract the region. In addition, manual methods of cropping images are very subjective and lead to misinterpretation if the region of interest (ROI) is not extracted accurately. Similarly, there exist some semi-automatic segmentation algorithms with appreciable segmentation accuracy in the literature. However, these techniques are computationally expensive and involve human intervention. Automatic image cropping is of great importance for improving the visual quality of images [33], with existing methods typically repurposing classifiers to perform cropping.
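The centring step mentioned above, subtracting per-channel mean pixel values computed on the training set, can be sketched as follows. Images are represented here as nested lists of shape height × width × channels; the tiny two-channel images and the function names are hypothetical and serve only to show the arithmetic.

```python
def channel_means(images):
    """Mean pixel value of each channel over all training images."""
    n_channels = len(images[0][0][0])
    totals = [0.0] * n_channels
    count = 0
    for image in images:
        for row in image:
            for pixel in row:
                for c, value in enumerate(pixel):
                    totals[c] += value
                count += 1
    return [t / count for t in totals]

def center(image, means):
    """Subtract the training-set channel means from every pixel."""
    return [[[v - m for v, m in zip(pixel, means)] for pixel in row]
            for row in image]

# Two hypothetical 1 x 2 training images with two channels each.
train = [[[[10, 100], [20, 200]]],
         [[[30, 300], [40, 400]]]]
means = channel_means(train)
centered = center(train[0], means)
print(means, centered)
```

The means are computed on the training set only and then reused unchanged for validation and test images, so that no information from held-out data leaks into preprocessing.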
In digital mammography, finding an accurate, robust and efficient breast region segmentation technique remains a challenging problem [45]. Since digital mammograms are large, they usually need to be cropped to regular sizes before they can be used as input to deep learning models [21]. Hence the need for algorithms that identify the meaningful parts of images and discard unwanted areas so as to keep the focus on the important content [100]. Some of the most used cropping approaches are the thresholding method, region-based segmentation methods, region growing, region splitting and merging, contour-based methods, clustering-based methods and model-based methods.
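As an illustration of the simplest of these approaches, the thresholding method, the sketch below crops a grayscale image to the bounding box of all pixels brighter than a threshold. The fixed threshold, the toy image, and the function name are assumptions made for illustration; a practical pipeline would more likely choose the threshold automatically (for example with Otsu's method) and clean the mask before cropping.

```python
def crop_to_foreground(image, threshold):
    """Crop a 2D grayscale image (nested lists) to the bounding box of
    the pixels whose intensity exceeds the threshold."""
    rows = [r for r, row in enumerate(image) if any(v > threshold for v in row)]
    cols = [c for c in range(len(image[0]))
            if any(row[c] > threshold for row in image)]
    if not rows:
        return []                      # no pixel is above the threshold
    return [row[cols[0]:cols[-1] + 1] for row in image[rows[0]:rows[-1] + 1]]

# Hypothetical 4 x 5 image with a dark background and a small bright region.
image = [[0, 0,   0,   0, 0],
         [0, 0,  90, 120, 0],
         [0, 80, 200,  0, 0],
         [0, 0,   0,   0, 0]]
cropped = crop_to_foreground(image, threshold=50)
print(cropped)
```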

C. ARCHITECTURES OF SOME CONVOLUTIONAL NEURAL NETWORKS
A convolutional neural network (CNN) is a deep learning model, similar to the basic neural network, that is heavily used in computer vision for detection and classification tasks. CNNs have been widely used for face recognition, object detection, image classification, and general pattern recognition. During training, data traverse the network in the forward direction (forward propagation), while the opposite pass, backward propagation, is used to adjust the parameters of the model. A CNN model has two categories of values: hyperparameters and parameters. Hyperparameters are variables that are selected and tuned manually before training the model, while parameters are optimized automatically during the training process. The effectiveness of a CNN model largely depends on how well its parameters have been optimized. Table 1 lists the parameters and hyperparameters used in CNN.
There are six different layers in CNN namely: input layer, convolutional layer (usually a combination of convolutional and activation function, e.g. ReLU), pooling layer, fully connected (FC) layer, softmax/logistic layer (classification layer), and output layer. Figure 13 illustrates a typical architecture and the usage of some parameters, and captures most layers of a deep learning model (CNN).
Input layer: This layer prepares the input data, in the form of images modeled as a three-dimensional matrix, for the learning model. The three dimensions are the width (W), the height (H), and the depth (D) or number of channels, denoted by W × H × D. For example, an image of size 32 × 32 with a depth of 1 is flattened into a vector of 1024 values, and it is this vector that serves as input (representing the image) to the model. If, for instance, we want to train our model with 500 images, the dimension of our input will then be (1024, 500).
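The flattening described above can be sketched in a few lines. The all-zero image is a placeholder and the function name is our own; the sketch only shows how the sizes work out.

```python
def flatten(image):
    """Turn a 2D image (list of rows) into a single flat vector."""
    return [value for row in image for value in row]

image = [[0] * 32 for _ in range(32)]          # placeholder 32 x 32 image, depth 1
vector = flatten(image)                         # one vector of 32 * 32 values
batch = [flatten(image) for _ in range(500)]    # one flattened vector per image
print(len(vector), len(batch), len(batch[0]))
```

Stored one vector per row, as here, the 500 images form a 500 × 1024 array; stacking the vectors as columns instead gives the transposed 1024 × 500 layout.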
Convolutional layer: This layer performs the computational task known as the convolution operation and is usually followed by a ReLU activation. The operation extracts features from the images passed in as input. It computes the dot product between the filter weights and a local patch of the image, and repeats this as the filter slides over the entire image. The output will be the input of another convolutional layer. Convolving each image with the filters produces feature maps of the form W × H × D. For instance, the size of the feature map for layer 2 can be computed as W2 = H2 = (W1 - F + 2P)/S + 1, where F is the filter size, S is the stride, and P is the amount of zero padding; the depth D2 equals the number of filters. For each feature map, a non-linear activation function is applied (e.g. Sigmoid, ReLU). A non-linear activation function leaves the size of the volume unchanged (W2 × H2 × D2) [14].
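The output-size relation for a convolutional layer is easy to check numerically. In the sketch below, f is the filter size, p the zero padding, s the stride, and k the number of filters (which sets the output depth D2); the function name and the example settings are illustrative assumptions.

```python
def conv_output_shape(w1, h1, f, p, s, k):
    """Output (width, height, depth) of a convolutional layer using
    W2 = (W1 - F + 2P) / S + 1, and likewise for the height."""
    if (w1 - f + 2 * p) % s or (h1 - f + 2 * p) % s:
        raise ValueError("filter does not tile the input with this stride and padding")
    w2 = (w1 - f + 2 * p) // s + 1
    h2 = (h1 - f + 2 * p) // s + 1
    return w2, h2, k

print(conv_output_shape(32, 32, f=5, p=0, s=1, k=20))   # no padding shrinks the map
print(conv_output_shape(32, 32, f=3, p=1, s=1, k=8))    # padding of 1 keeps 32 x 32
```

Raising an error when the division is not exact is a design choice of this sketch; some libraries instead round down and silently ignore the pixels left over at the border.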
Pooling layer: This layer is positioned just after the convolutional layer. It is used for scaling down the output from the convolutional layer. Usually, the maximum or average pooling operation is applied here.
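A minimal sketch of max pooling on a single feature map is given below, assuming a 2 × 2 window with stride 2, which is a common but not universal choice; the feature-map values and the function name are invented for illustration.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Downsample a 2D feature map by keeping the maximum of each window."""
    height, width = len(feature_map), len(feature_map[0])
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, width - size + 1, stride)]
            for r in range(0, height - size + 1, stride)]

# Hypothetical 4 x 4 feature map, reduced to 2 x 2.
fmap = [[1, 3, 2, 0],
        [4, 6, 5, 1],
        [7, 2, 9, 8],
        [0, 1, 3, 4]]
pooled = max_pool2d(fmap)
print(pooled)
```

Average pooling follows the same pattern with the maximum replaced by the mean of each window.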
Fully connected layer involves weights, biases, and neurons. It connects neurons in one layer to neurons in another layer. It is used to classify images between different categories by training.
Loss function (Softmax or Logistic layer), also referred to as the classification layer, is the last layer of the CNN and is placed immediately after the fully connected layer. Either the softmax operation, for multi-class classification, or the logistic operation, for binary classification, may be used.
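The two operations can be sketched as follows. The softmax maps a vector of raw scores to class probabilities that sum to one, while the logistic (sigmoid) function maps a single score to the probability of the positive class; the scores below are hypothetical outputs of a fully connected layer.

```python
import math

def softmax(logits):
    """Class probabilities from raw scores (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    """Probability of the positive class in binary classification."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical scores for the classes normal, benign, and malignant.
probs = softmax([2.0, 1.0, 0.1])
print([round(p, 3) for p in probs], round(sigmoid(1.5), 3))
```

For two classes the operations coincide: the softmax of the scores (z, 0) assigns the first class the same probability as the sigmoid of z.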
Output layer contains the label in one-hot encoded form. The output may be a probability for each class that best describes the image, or a single class such as normal, benign, or malignant.
Both pre-trained deep learning models (whose pre-trained weights are usually shared through deep learning libraries such as TensorFlow, Keras and PyTorch) and untrained ones exist for application to different tasks. These networks can be adapted and even fine-tuned based on the need at hand. Such existing models may be obtained from community repositories (popularly called model zoos) that store models and parameters in an adaptable format [103]. Meanwhile, the dataset used with these models is usually partitioned into three: training data, validation data, and testing data. The model is trained on the training set, the validation set is used to validate the model after successive rounds of training, and the test set is used to evaluate the trained and validated model. The reported results and performance of the trained model are generated using the test dataset. When the network is being trained, loss values are calculated via forward propagation, and learnable parameters are updated through the backpropagation procedure [101]. The strength of a deep learning model (trained or untrained) mostly depends on the depth (the number of layers) of its architecture. In a trained model, the choice of optimal parameters and hyperparameters also influences the computational power of the architecture.
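The three-way partition described above can be sketched as a shuffled split. The 70/15/15 proportions, the fixed seed, and the function name are assumptions chosen for illustration; the text does not prescribe particular ratios.

```python
import random

def split_dataset(items, train=0.7, val=0.15, seed=0):
    """Shuffle the items and split them into training, validation, and test sets."""
    items = list(items)
    random.Random(seed).shuffle(items)          # seeded so the split is reproducible
    n_train = round(len(items) * train)
    n_val = round(len(items) * val)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])            # the remainder forms the test set

# Hypothetical dataset of 100 image identifiers.
train_set, val_set, test_set = split_dataset(range(100))
print(len(train_set), len(val_set), len(test_set))
```

When a dataset contains several views or examinations of the same patient, it is generally safer to split by patient identifier rather than by image, so that images of one patient do not appear in both the training and the test set.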
In some cases requiring a CNN model, it is advisable to use a pre-trained model rather than training from scratch [104]. However, if a new model must be built from scratch, one can still learn from the design of a related model. The commonly used pre-trained CNN architectures for mammography are AlexNet, VGG16, ResNet50 and GoogLeNet [99]. We shall, therefore, review the peculiarities of these architectures, highlighting their pros and cons. Table 2 summarizes these details.
Most of these deep CNN architectures were designed for a 1000 class classification task. However, new models are usually designed after them by merely adapting them to the required task (e.g. classification) and modifying the last three layers.

IV. DATASETS AND APPROACHES
In this section, we present a review of the computational approaches and methods applied in the studies surveyed. Meanwhile, we first discuss publicly available medical image datasets (mammography) that are available for research in the area considered by the literature reviewed. Also, a summary of popular data/image preprocessing techniques is discussed.

A. APPROACHES FOR SELECTION OF STUDIES
Publications were searched in various online archives such as https://arxiv.org/, https://ieeexplore.ieee.org, PubMed, and https://hal.inria.fr/. For instance, we obtained 39 publications from the Scopus database and 44 publications from the IEEE Xplore database. Publications from the Scopus database consisted of 18 Conference Papers, 19 Journal publications and 1 Book Chapter, while those downloaded from IEEE Xplore consisted of 34 Conference Papers, 7 Journal publications and 3 Book Chapters. We then combined all publications retrieved from the databases listed above and eliminated duplicates. The harmonized set of publications formed the basis for the findings and discussions in this study. Our approach for the selection of papers reviewed is listed in Table 3.
Studies focused on detection or characterization of architectural distortion and preprocessing of digital mammography within the period under review revealed an interesting trend. We discovered that most of the publications were made in Conferences, followed by those published as a full article in Journals, and only a few appear as Book Chapters. For instance, we looked into the publications on the IEEE Xplore database, and we found that out of the 44 documents relating to the focus of this study, 34 were published as Conference papers, seven as Articles, while three appear as Book Chapters. Figures 15a-b and 16a-b present charts illustrating the distribution of these publications across the period considered: 2009-2020.
Furthermore, an attempt was made to uncover interesting bibliometric analysis of the publications relating to the detection or characterization of architectural distortions in mammography. This led to the use of ScientoPy -a Python script for scientometrics literature review (ScientoPy).
Our analysis of the content of publications related to the detection and/or classification of breast cancer through the identification of architectural distortion is summarized in the following graphs. About 38 publications were retained for charting after merging and removal of duplicates. First, we observed a trend in the choice of journals/publishers by the authors, as shown in Figure 17. Prior to 2018, a good number of the publications appeared in Progress in Biomedical Optics and Imaging, Journal of Computer Assisted Radiology and Lecture Notes in Computer Science. However, from 2018 to 2019, we discovered that only two journals published documents relating to the detection/classification of architectural distortion: Biomedical Signal Processing and Control and the Journal of Medical Systems. This shift in the place of publication could indicate the attention such journals give to the area of concern.
A careful analysis of the published documents within the period under review revealed that not much interest had been directed towards the application of deep learning models to the task of detecting AD. Figure 18 shows that the Gabor filter dominated the methods used by researchers from 2009 through 2018. The word cloud analysis in Figure 19 supports the same observation. This further confirms that new methods and approaches need to be considered for the task of detecting architectural distortion in mammography.
Our bibliometric analysis also revealed the level of attention given by researchers across countries to improving computational approaches for the detection of architectural distortion. We discovered that India and China top the list of countries whose researchers have invested effort in this area. Notably, there has been a steady stream of publications from India between 2011 and 2020 and from China between 2009 and 2018. Although a large volume of research came from Canada, it was short-lived, as seen from its distribution from 2009 to 2014. Meanwhile, recent publications show that researchers from the United States are now gaining interest in the domain, while India still maintains the lead. We also discovered that only a few authors appear to have recognized the relevance of applying computational methods to the detection of architectural distortion in mammography. Figures 20 and 21 illustrate these observations.
In Figure 22 below, we present the graph showing the total number of papers/studies reviewed according to their year of publication. This represents a combined list of publications across the online databases visited after elimination of duplicated studies.

B. MAMMOGRAPHIC DATASETS
In mammography, the most widely available public databases are the mammographic image analysis society (MIAS) database [108], the digital database for screening mammography (DDSM) [65], the INbreast database, the breast cancer digital repository (BCDR), and image retrieval in medical applications (IRMA). We present a summary of the necessary information about each of these datasets in Table 4.

C. DATA/IMAGE PREPROCESSING AND AUGMENTATION TECHNIQUES
Due to the noise and lack of sharpness in some images, image preprocessing practices are employed. Methods such as contrast enhancement and breast image segmentation are used for this purpose. Segmentation helps to remove the irrelevant background area, labels, artefacts, and pectoral muscles. A variant of adaptive histogram equalization (AHE), called contrast limited adaptive histogram equalization (CLAHE), is commonly employed to improve the contrast in images. Other methods are the use of a median filter for de-noising and unsharp masking to sharpen the images. Data augmentation, in turn, is achieved by image transformations such as vertical and horizontal flips, rotation at different degrees, and addition of Gaussian noise.
We observed variations among the papers surveyed in the manner in which images are passed as input into their respective models. Some passed whole images into their model, while others used patches. Patches or regions of interest (ROIs) are extracted from whole images using either a manual or an automated approach. The manual approach simply relies on the annotations accompanying the image datasets, whereas automatic extraction and segmentation of ROIs is done using threshold- and region-based techniques. Image segmentation is used to divide an image into parts having similar features and properties [56]. In addition to the techniques used in automated ROI extraction, there are deep learning models like R-CNN, Faster R-CNN and their variants that serve this purpose. Variants of R-CNN have gained attention among research works that focus on the characterization of masses and calcification. The ROIs extracted are of either variable or fixed size. Standard sizes are 64 × 64, 128 × 128, 299 × 299, and 512 × 512 pixels. Whole images are usually of size 1024 × 1024. The benefit of extracting ROIs is to centre the abnormality area in patches and to limit the search for abnormalities without any undue influence from the background or unwanted regions [73]. Meanwhile, we found of particular interest an image/data preprocessing approach designed wholly for supporting models that detect architectural distortions. The approach first detects the mammographic region of interest by removing the pectoral muscle, and then applies some preprocessing operations, including Otsu's thresholding [109].
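The threshold-based ROI extraction described above can be illustrated with a minimal sketch. It is not the pipeline of [109] or of any surveyed study: the function names `otsu_threshold`, `foreground_mask` and `extract_patch` are hypothetical, images are assumed to be 8-bit grayscale and stored as plain lists of rows, and the patch is assumed to be no larger than the image. Otsu's method is implemented in its standard form (choosing the threshold that maximizes the between-class variance of the intensity histogram).

```python
def otsu_threshold(img, levels=256):
    """Return the intensity t that maximizes between-class variance.

    Pixels <= t are treated as background, pixels > t as foreground.
    `img` is a list of rows of integer intensities in [0, levels).
    """
    hist = [0] * levels
    total = 0
    for row in img:
        for p in row:
            hist[p] += 1
            total += 1
    sum_all = sum(i * h for i, h in enumerate(hist))

    w_bg, sum_bg = 0, 0.0
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        w_fg = total - w_bg
        if w_bg == 0:
            continue          # no background pixels yet
        if w_fg == 0:
            break             # every pixel is already background
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def foreground_mask(img, t):
    """Binary mask: 1 where the pixel is brighter than the threshold."""
    return [[1 if p > t else 0 for p in row] for row in img]


def extract_patch(img, cy, cx, size):
    """Cut a size x size patch centred (as far as possible) on (cy, cx).

    The window is clamped to the image borders, so patches near an edge
    are shifted inwards rather than padded (an illustrative choice).
    """
    h, w = len(img), len(img[0])
    half = size // 2
    top = min(max(cy - half, 0), h - size)
    left = min(max(cx - half, 0), w - size)
    return [row[left:left + size] for row in img[top:top + size]]
```

In practice, the centre passed to `extract_patch` would come either from the dataset annotations (the manual approach) or from an automated detector, and `size` would be one of the standard patch sizes listed above.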
Data augmentation is often employed to enhance the performance of deep learning models. Both standard augmentation and generative adversarial network (GAN) approaches have been used to synthesize data to augment available datasets that may not otherwise meet the requirements for applying detection models. For instance, Ben-Ari et al. [9] improved their training set by image augmentation conducted on positive samples with five random shifts, three rotations and two flips (a total of 10 augmentations).
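The 5 + 3 + 2 augmentation scheme mentioned above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of [9]: the source does not give the rotation angles or shift ranges, so the sketch assumes rotations by multiples of 90 degrees and integer shifts of at most `max_shift` pixels with zero fill. All function names are hypothetical, and images are plain lists of rows.

```python
import random


def flip_horizontal(img):
    """Mirror the image left-to-right."""
    return [row[::-1] for row in img]


def flip_vertical(img):
    """Mirror the image top-to-bottom."""
    return [row[:] for row in img[::-1]]


def rotate90(img, k=1):
    """Rotate counter-clockwise by k * 90 degrees."""
    out = [row[:] for row in img]
    for _ in range(k % 4):
        # Transposing and then reversing the row order rotates by 90 degrees.
        out = [list(col) for col in zip(*out)][::-1]
    return out


def shift(img, dy, dx, fill=0):
    """Translate the image by (dy, dx) pixels, filling exposed pixels."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = img[y][x]
    return out


def augment(img, max_shift=2, seed=0):
    """Return 10 augmented copies: 5 random shifts, 3 rotations, 2 flips."""
    rng = random.Random(seed)  # seeded so the augmentation is reproducible
    shifts = [
        shift(img,
              rng.randint(-max_shift, max_shift),
              rng.randint(-max_shift, max_shift))
        for _ in range(5)
    ]
    rotations = [rotate90(img, k) for k in (1, 2, 3)]
    flips = [flip_horizontal(img), flip_vertical(img)]
    return shifts + rotations + flips
```

Applying `augment` only to positive samples, as in the example cited above, multiplies the number of positive training images while leaving the negative class unchanged, which also helps with class imbalance.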

D. MODELS APPLIED TO DETECTION OF ARCHITECTURAL DISTORTION
Deep learning models are often designed and presented as CNNs. The design of such a network architecture largely depends on the skillful selection of model parameters/hyperparameters (decay and learning rate), the number of epochs, the size of the dataset, and the depth of the architecture (the more layers, the greater the tendency to extract more features, thereby increasing accuracy). For instance, [73] observed that using a step decay schedule, which reduces the learning rate by some percentage after a set number of training epochs, increased the performance of characterization of abnormalities in mammography. In practice, a typical network may consist of five convolution layers, three fully connected layers and a SoftMax layer (for classification). Another practice among deep learning researchers characterizing abnormalities in breast images is the use of transfer learning, a technique that allows the training parameters obtained in one model to be imported into a similar model. We observed from the literature that classification techniques like SVM, KNN, ANN, and Softmax are predominantly used for both binary and multiclass classification.
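A step decay schedule of the kind described above can be written in a few lines. The sketch below is illustrative only: the function name `step_decay` and the example values (initial rate 0.01, halved every 10 epochs) are assumptions and are not taken from [73].

```python
def step_decay(initial_lr, drop, epochs_per_drop, epoch):
    """Learning rate at `epoch` under a step decay schedule.

    The rate is multiplied by `drop` (e.g. 0.5 for a 50% reduction)
    once every `epochs_per_drop` epochs and is constant in between.
    """
    return initial_lr * (drop ** (epoch // epochs_per_drop))


if __name__ == "__main__":
    # Hypothetical settings: start at 0.01 and halve every 10 epochs.
    for epoch in (0, 9, 10, 25):
        print(epoch, step_decay(0.01, 0.5, 10, epoch))
```

Most deep learning frameworks accept such a function as a per-epoch learning rate scheduler, so the schedule can be tuned independently of the network architecture.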
The Gabor filter is a sinusoidally modulated Gaussian function and can detect local line components and angle components in the image. The Gabor function is the product of a 2D Gaussian and a complex exponential function, as shown below:

$$g_{\theta,\lambda,\sigma_1,\sigma_2}(x,y) = \exp\left\{-\frac{1}{2}\begin{pmatrix} x & y \end{pmatrix} M \begin{pmatrix} x & y \end{pmatrix}^{T}\right\}\exp\left\{\frac{j\pi}{\lambda}\left(x\cos\theta + y\sin\theta\right)\right\}$$

The Gabor function g_θ(x, y) has an anisotropic shape, and the width of the filter function in the short axis direction can be adjusted by changing the standard deviation σ, the spatial aspect ratio γ, and the wavelength λ. Detecting architectural distortions with Gabor filters requires applying several of the filters in a layered manner. For instance, a study that used an adaptive Gabor filter for analyzing the mammary gland structure was able to detect the distorted region of the mammary gland as an initial candidate using a concentration index followed by binarization and labeling. A similar application of the Gabor filter to the detection of architectural distortions identified, pixel by pixel, the filter that best matched the mammary gland structure from a system of three Gabor filters. After the mammary gland is detected, enhancement of the concentrated region and false positive reduction are performed.

Several metrics are used to evaluate such models. Recall, also referred to as True Positive Rate (TPR), measures the percentage of the positive group that was correctly predicted to be positive by the model. The F-measure (or F1-score) is a combination of precision and recall, in which β is used to adjust the importance of precision versus recall. Accuracy alone is usually insufficient to demonstrate the advancement attained by a model or classifier and is sometimes used together with the error rate to evaluate classification results. The Matthews correlation coefficient (MCC) provides a more relevant assessment than accuracy. Selectivity, also called specificity or True Negative Rate (TNR), measures the percentage of the negative group that was correctly predicted to be negative.
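As an illustration of the Gabor function defined earlier in this subsection, the sketch below samples a complex Gabor kernel on a square grid. It rests on assumptions that the text does not state: the matrix M is taken to be a rotated diagonal matrix with entries 1/σ1² and 1/σ2², so that the Gaussian envelope is aligned with the orientation θ, and the kernel size is assumed to be odd. The function name `gabor_kernel` is hypothetical, and the carrier uses the factor π/λ exactly as it appears in the expression above.

```python
import cmath
import math


def gabor_kernel(size, theta, lam, sigma1, sigma2):
    """Sample a complex Gabor function on a size x size grid (size odd).

    Assumption: the Gaussian envelope has standard deviation sigma1 along
    the orientation theta and sigma2 across it, i.e. M is a rotated
    diag(1 / sigma1**2, 1 / sigma2**2).
    """
    half = size // 2
    c, s = math.cos(theta), math.sin(theta)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            u = x * c + y * s     # coordinate along the orientation theta
            v = -x * s + y * c    # coordinate across it
            envelope = math.exp(-0.5 * (u * u / sigma1 ** 2
                                        + v * v / sigma2 ** 2))
            carrier = cmath.exp(1j * math.pi / lam * u)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


if __name__ == "__main__":
    # A small bank of three orientations; all parameter values are illustrative.
    bank = [gabor_kernel(9, k * math.pi / 3, 4.0, 3.0, 1.5) for k in range(3)]
    print(len(bank), len(bank[0]), len(bank[0][0]))
```

Convolving an image with each kernel in such a bank and keeping, at every pixel, the orientation with the strongest response is one common way to obtain the line-orientation map on which Gabor-based AD detectors operate.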
Precision measures the percentage of the positively labeled samples that are actually positive. Recall does not consider the number of negative samples that are misclassified as positive, which can be problematic for imbalanced data with many negative samples. Conversely, precision provides no insight into the number of samples from the positive group that were mislabeled as negative. MCC takes into account cases in which the number of negative samples detected is unbalanced relative to the number of positive samples detected. Finally, the positive predictive value (PPV) is the number of correctly detected positive cases over all detected positive cases, while the negative predictive value (NPV) is the number of true negative cases over all detected negative cases.
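The metrics discussed above can all be computed from the four confusion-matrix counts. The following sketch uses the standard definitions; the function name `classification_metrics` and the example counts are hypothetical and do not come from any surveyed study.

```python
import math


def _ratio(numerator, denominator):
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, tn, fn, beta=1.0):
    """Compute common binary classification metrics from raw counts."""
    precision = _ratio(tp, tp + fp)      # also the PPV
    recall = _ratio(tp, tp + fn)         # sensitivity / TPR
    specificity = _ratio(tn, tn + fp)    # selectivity / TNR
    npv = _ratio(tn, tn + fn)
    fpr = _ratio(fp, fp + tn)
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)

    # F-beta: beta > 1 weights recall more heavily, beta < 1 precision.
    b2 = beta * beta
    f_beta = _ratio((1 + b2) * precision * recall, b2 * precision + recall)

    # Matthews correlation coefficient, in [-1, 1].
    mcc = _ratio(tp * tn - fp * fn,
                 math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))

    return {
        "precision": precision, "recall": recall,
        "specificity": specificity, "npv": npv, "fpr": fpr,
        "accuracy": accuracy, "f_beta": f_beta, "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts for a detector evaluated on 100 ROIs.
    for name, value in classification_metrics(40, 10, 45, 5).items():
        print(f"{name}: {value:.4f}")
```

Reporting MCC or the F-measure alongside accuracy is useful here because AD-positive cases are rare, so a model that labels every image as negative can still reach high accuracy.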

V. COMPARISON OF COMPUTATIONAL EXPERIMENTS AND RESULTS
There were sixty-two (62) eligible studies based on the literature search strategy described in Table 3.

A. EXPERIMENTATION ENVIRONMENT AND PERFORMANCE METRICS OF REVIEWED STUDIES
Experiments carried out using the Gabor filter technique required fewer computational resources than experiments done with deep learning techniques. For instance, a study [106] applying the Gabor filter technique to 158 prior mammography images took about 6 minutes per image on a Dell Precision PWS 490 workstation with Quad Intel Xeon processors operating at 3.0 GHz and 12 GB of RAM, whereas similar experiments using deep learning techniques place a much higher demand on CPU and GPU resources. One of the main contributors to the steep rise of deep learning has been the use of GPUs and computing libraries like TensorFlow, Keras, CUDA, and OpenCL. GPUs are highly parallel computing engines, which have an order of magnitude more execution threads than central processing units (CPUs). With current hardware, deep learning on GPUs is typically 10 to 30 times faster than on CPUs. For instance, the study in [9], which used a deep learning approach, was carried out on an Intel i7 CPU with 64 GB of RAM and a TitanX GPU. This demonstrates that deep learning benefits from the availability of advanced hardware and software for training large-scale models.
Performance measurements for the models described above are mostly compared using metrics such as sensitivity, specificity, accuracy, area under the curve (AUC) and false positive rate (FPR). Our review of different studies revealed that FPR is the most widely used. The dominant use of FPR as a metric reflects the human limitations usually encountered in the interpretation of mammography and highlights the subtlety of detecting architectural distortions. The use of accuracy, sensitivity, and specificity as performance metrics is necessitated by the need for researchers to compare the performance of their approaches/models with similar works. However, we observed that what is important is a research breakthrough that can sufficiently characterize architectural distortions in any imaging modality: mammography, tomosynthesis and ultrasound. This, we believe, will widely promote the adoption of computer vision in detecting abnormalities in medical images.

B. EVALUATION OF THE PERFORMANCES OF REVIEWED STUDIES
In this section, we evaluate and compare the performances of the techniques/approaches adopted by the reviewed studies. One of the objectives of this study is to review papers detecting the presence of architectural distortion in mammography using deep learning models. However, we decided to include similar studies that approached the problem using techniques other than deep learning. The summary of our findings is detailed in Table 5.
We observed that the Gabor filter approach tends to yield high false positive rates (FPR) rather than lowering them. Unfortunately, the approach has dominated studies aimed at detecting architectural distortions in mammography despite the advances made in deep learning models since 2012. Could the computational resources or algorithm design be the reason for this? Recent studies focusing on the use of CNNs contradict this school of thought. Another method used in detecting AD is the 2D Fourier transform, which appears to be the least effective and least used; fuzzy-based detection models are likewise rarely used. However, the approach in [110] seems to have performed well, with 91.67% accuracy. Similarly, a study with a close but better performance of 93% accuracy used the window-based approach, which is rarely used. Studies that adopted deep learning techniques for detecting architectural distortions appear very promising, especially when used alongside performance improvement measures like data augmentation. The research in [21], which was based on deep learning (CNN plus data augmentation), yielded the best performance in detecting architectural distortions, at an accuracy of 99.4%. This reveals that much can be achieved in the characterization of an abnormality (architectural distortions) in medical images when further research and effort are channeled into the application of deep learning models. We also observed that deep learning studies carried out without performance enhancement practices like normalization, image preprocessing, and data augmentation performed worse than similar models that leveraged such practices.
We also discovered that the use of a large number of inputs in the extraction of feature sets is more pronounced in deep learning models compared to Gabor Filter and 2D Fourier Transform techniques.
A summary of the comparison of the performances of the studies reviewed is presented in Figures 23 and 24. In Figure 23, we graphed a breakdown of studies on architectural distortions included in this survey, grouped into various techniques ranging from 2009-2020. On the other hand, Figure 24 presents a comparison of the use of classifiers, image preprocessing method, and heterogeneity of datasets in studies on architectural distortions included in this survey ranging from 2009-2020. Finally, Table 6 outlines the comparison of widely used mammogram databases in studies on architectural distortions reviewed in this survey ranging from 2009-2020.

VI. DISCUSSION
In this section, we shall focus our discussion on the findings and challenges discovered in the studies reviewed in search of advances made in detecting architectural distortions in breast images.

A. OUR FINDINGS
Research has shown that the subtle appearance of architectural distortions in breast tissue accounts for 12%-45% of breast cancers missed in screening mammography [120]. The abnormality is often overlooked or misinterpreted in screening mammography because of human limitations. Our survey revealed that several computational approaches have attempted to overcome these limitations, yet with minimal achievement, which restricts their clinical use. We therefore seek to point out some practices that characterized the approaches and studies reviewed.
The DDSM and MIAS databases were observed to have gained extensive usage in most studies we surveyed. These are digitized images from mammography and do not include other forms of medical imaging like MRI, tomosynthesis, and ultrasound, which may also help to hasten the discovery of architectural distortion in breast tissues. We therefore advocate the development of approaches, deep learning models in particular, that can efficiently extract features of architectural distortions from all forms of medical images used in breast imaging. Secondly, we investigated the input pattern used in those techniques and found that the use of ROI-based input rather than whole mammography images was a widely adopted practice. This is necessary to reduce the search area of the detection algorithm. The use of ROIs is also widely reported to have enhanced the performance of feature detection and image classification procedures. We therefore suggest the use of patches, since architectural distortion is more easily detected when ROIs are used rather than whole images [110]. Meanwhile, the advantages of feeding whole images into detection models may also be considered. Although manual cropping of images seems to dominate most studies, deep learning models supporting automatic cropping of regions of interest and segmentation are pervasive.
We also found that most studies embrace approaches like filter response, search for linear structures, or texture features, all of which are known to yield high false positive rates. This strengthens the case for research into deep learning models, which have demonstrated very positive performance compared to the traditional approaches compared in Table 2. In addition to the adoption of deep learning models for feature extraction, we observed that the SVM classifier (for binary classification: having architectural distortions or not) dominated most of the studies considered. However, some multi-class classifiers and even quadratic discriminant analyses were used. Although SVM has performed very well in binary classification, we advocate the use of multi-class classifiers to include other significant findings (like microcalcification, asymmetries and masses) in the characterization of abnormalities in breast tissues. We conclude this section by noting that very few studies have taken advantage of advances in deep learning to advance the detection of architectural distortions in mammography.

B. CHALLENGES OF APPROACHES USED IN DETECTING ARCHITECTURAL DISTORTIONS
Architectural distortions are defined as distorted parenchyma with no definite mass, including thin straight lines or spiculations radiating from a point, and focal retraction, distortions, or straightening at the anterior or posterior edge of the parenchyma [119]. The detection of architectural distortion is challenging to computational models and much more to Radiologists because such images present a low visual signature and ambiguous boundaries. Therefore, there is a need for an improved computational solution to embrace approaches and innovation capable of tackling this challenge.
The following are other challenges that, if addressed, may promote advances in the task of detecting architectural distortions from medical images. The unavailability of large datasets of architectural distortion is a significant setback to deep learning models designed to detect the abnormality. This, however, may not be a serious problem considering the wide acceptance of data augmentation and the performance enhancement it provides. Another challenge is the lack of labeled and annotated data, which has become a limiting factor in many medical applications. For instance, the small number of inputs used in Gabor filter methods indicates that these methods need to be tested with larger datasets [76] to ensure that high false positive rates are reduced. Mammography is a widely used technique to diagnose breast cancer. Nevertheless, due to the nature of these images, superimposition of tissues may lead to obscured lesions or false alarms. Digital breast tomosynthesis (DBT) has the potential to overcome this limitation [27], but we discovered that no serious deep learning model had been proposed to tackle this problem. Furthermore, recent deep learning models have not been able to address the issue of accurately classifying images of breast biopsy tissue stained with hematoxylin and eosin into different histological grades.

VII. CONCLUSION
The current study focused on reviewing and discovering the advances, practices and challenges of techniques and approaches adopted for the task of characterizing architectural distortion from mammography. We pursued this by surveying studies from the last decade, between the years 2009 and 2020. We discovered that approaches like the Gabor filter, window-based methods, fuzzy logic, deep learning models, mathematical models and the 2D Fourier transform in polar coordinates had been widely used. We also discovered that minimal effort had been channeled towards applying deep learning techniques to the task of detecting architectural distortion. In contrast, several studies have successfully adapted deep learning models for characterizing microcalcifications, masses, and asymmetries. Our investigation suggests that the subtle nature and rare occurrence of architectural distortion could account for this. We note that since early diagnosis of breast cancer is made possible through the detection of architectural distortions, this should be sufficient motivation to increase research in this direction. This research, therefore, seeks to advance the adoption of computer vision in isolating cases of architectural distortion in digital medical images.