Accurate Detection of Distorted Pectoral Muscle in Mammograms Using Specific Patterned Isocontours

Automatic detection of the pectoral muscle in mammograms is widely used in computer-aided diagnostic (CAD) systems for breast cancer. The pectoral muscle region has some prominent features, such as its upper-corner position, high density, and triangular shape. However, these features may be distorted by masses, artifacts, skin folds, overlapping tissues, and other factors. Despite recent developments in CAD technology, accurate detection of distorted pectoral muscles remains a challenging task. In this study, we propose an automatic method that uses a divided topographic representation to detect distorted pectoral muscle boundaries. After the preprocessing stage, an isocontour map is generated and then divided into horizontal blocks. The contours of the pectoral muscle boundary in the blocks often reveal specific patterns in terms of location, geometric, and topological features. We developed a new segmentation algorithm, rule-based contour detection (RBCD), to detect these specific patterned isocontours. The method was applied to two datasets consisting of 84 and 201 mammogram images from the MIAS and Inbreast databases, respectively. In addition, some distorted pectoral muscle samples selected from these datasets were used to further analyze the performance of the proposed method. The mean False-Positive and the mean False-Negative rates of the proposed method for MIAS and Inbreast datasets were 0.92%, 1.26%, and 2.34%, 1.15%, respectively. The quantitative and qualitative results for the distorted pectoral muscle samples show that the proposed method outperformed the compared methods.


I. INTRODUCTION
Approximately 2.1 million cases of breast cancer in women were diagnosed worldwide in 2018 [1]. In the USA, breast cancer ranks second among the causes of cancer death in women [2]. Mammography is frequently used to diagnose breast cancer at an early stage. It uses low-energy X-rays to create a 2D image, called a mammogram, of the 3D breast. Since this technique mainly aims to allow clear visualization of most of the breast tissue, a mammogram is typically taken from different views, such as mediolateral oblique (MLO) and craniocaudal (CC). The MLO view shows the maximum amount of breast and pectoral muscle tissue. Because the breast and the pectoral muscle are adjacent to each other, the percentages of their appearance on the MLO mammogram are also proportional to each other. The appearance of the pectoral muscle is one of the most effective criteria for showing correct positioning [3].
The associate editor coordinating the review of this manuscript and approving it for publication was Kumaradevan Punithakumar.
Many computer-aided diagnosis (CAD) systems for breast cancer have been developed in the literature. Most of these CAD systems need the breast and pectoral muscle regions to be segmented from mammograms before the classification of breast cancer [4]. Therefore, detection of the pectoral muscle is a preprocessing step commonly used in these systems [5]. The pectoral muscle in the MLO mammogram is the main landmark, with distinctive features such as its upper-corner position, high density, and triangular shape. This region is therefore usually used as a reference in registration, 3-D reconstruction, and pair comparison processes [6]. Since the intensity and texture of the pectoral muscle resemble those of suspect regions, most CAD systems remove this region [7]. Its presence also adversely affects breast tissue density quantification methods [7]. As a result, the accurate detection of the pectoral muscle is a crucial step in improving the performance of CAD systems. For this purpose, many automatic pectoral muscle detection methods have been proposed in the literature.
Some pectoral muscle features such as size, contrast, shape, and texture may vary depending on the anatomy of the muscle, the patient's position during image acquisition, artifacts, skin folds, overlapping tissues, and others. Although the features of the pectoral muscle are usually prominent, they can sometimes be unusually distorted for the reasons mentioned. Fig. 1 shows some examples of the mammogram images with distorted pectoral muscles taken from the MIAS Database and Inbreast Database [8]. Specific names of these images in the databases are given in the legend of the figure, respectively. The most common types of distorted pectoral muscles are summarized below and are also labeled for the mammograms shown in Fig. 1.
A few review studies [5], [7], [9], [10] on the automatic detection of the pectoral muscle provide a systematic and comprehensive overview of many methods in this field and review their advantages and disadvantages. Although many methods have been developed in this area, most of them have not addressed the challenging task of accurately segmenting the distorted pectoral muscle. In this study, we aim to develop an automatic and robust method to overcome this challenge.
Many studies on CAD of breast cancer first detected the pectoral muscle and then removed it to reduce the FP ratio of their methods. Most of these studies [10]- [13] did not evaluate the performance of their methods on distorted pectoral muscle images. Some masses can be seen on the pectoral muscle boundary; as far as we know, no study takes such masses into account when removing the pectoral muscle region. Consequently, these masses can be overlooked during pectoral muscle removal.
Some studies in this field mainly consist of two consecutive processes: a rough initial estimation followed by detection of the pectoral muscle [14]- [18]. A straight line was mostly used for the estimation process. Kwok et al. [14] estimated the straight line by the Hough transform and then refined it by the cliff detection algorithm. Kinoshita et al. [15] applied the Hough transform on the Probable Texture Gradient (PTG) map of the image instead of the intensity map used by Kwok et al. [14], followed by block averaging with the aid of the approximated line. Kinoshita et al. [16] used Radon-domain information to detect straight-line candidates with a high gradient and then selected the longest straight-line candidate to detect the pectoral muscle edge. Bora et al. [17] used features of the pectoral muscle based on average gradient, position, and shape for the straight-line estimation process. For the detection process, they modified the cliff detection algorithm proposed by Kwok et al. [14]. In order to roughly estimate the pectoral muscle region, Yin et al. [18], Maitra et al. [19], and Asgari Taghanaki et al. [20] defined a rectangle and some specific shapes based on geometric rules, and used an iterative threshold method for the detection process. These studies used some constraints based on geometric and anatomical features of the image in the estimation process. To the best of our knowledge, the initial estimation of the distorted pectoral muscle boundary remains a challenging task. Furthermore, incorrect estimation adversely affects the subsequent steps.
Ferrari et al. [21] proposed two methods based on the Hough transform and Gabor wavelets. The first method used the straight-line hypothesis, based on geometric and anatomical constraints, for the initial estimation process. The second method used a Gabor wavelet filter bank to overcome the limitation of the straight-line hypothesis. The salience of the pectoral muscle boundary was enhanced by using some specially designed Gabor filters. They computed the magnitude and phase of the image using a vector summation procedure and propagated the magnitude value of each pixel in the direction of the phase. They detected the genuine pectoral muscle edge using the resulting images. To measure the performance of these methods, they selected a dataset consisting of 84 MLO mammograms from the MIAS (Mammographic Image Analysis Society, London, U.K.) database. In that study, two radiologists detected the pectoral muscle boundary using some image enhancement processes and further reading techniques. The percentages of false-positive (FP) and false-negative (FN) pixels were calculated using the ground truth detected by the radiologists.
The mean FP and FN rates of these methods were 0.58%, 5.77%, and 1.98%, 25.19%, respectively. Most studies used this dataset and ground truth in the quantitative evaluation and comparison process.
A few studies in this field modified some region growing algorithms to detect the pectoral muscle. An initial seed, a size restriction, and some constraints were mostly used for the stopping criteria. However, these methods may yield erroneous results when the intensities of the pectoral muscle and the surrounding tissue overlap. Ma et al. [22] detected the pectoral muscle based on adaptive pyramids (AP) and minimum spanning trees (MST). These methods yielded erroneous results for multi-layered and small-sized pectoral muscles. Camilus et al. [23] applied watershed segmentation to detect pectoral muscle region candidates and used these candidates in their proposed merging algorithm. However, their results were sensitive to both over-segmented region candidates and the merging criterion. Li et al. [24] employed the homogeneous texture and high-intensity deviation features of the image in the estimation process to identify the initial pectoral muscle edge. They used the Kalman filter to refine the ragged initial edge. Chen et al. [25] first used a shape-based enhancement filter and then selected candidate seed points for the initial estimation. However, the selection of these points is a laborious task for the distorted pectoral muscle.
Wei et al. [26] mainly aimed to detect the boundary of obscure pectoral muscle in MLO mammograms. They first partitioned the pectoral muscle and then used different threshold values for each partition. They used the Hough transform to refine these tentative boundaries belonging to each partition. However, the determination of the threshold values is a challenging task for distorted pectoral muscle examples. Therefore, the error in the first step may directly affect the other steps. This study presented some performance results on the distorted pectoral muscle examples that were only obscured. However, they did not deal with different types of distorted pectoral muscle examples.
Hong et al. [27] proposed a method based on a topographic representation called an isocontour map generated by the multi-scale approach in order to delineate salient regions such as the pectoral muscle, the breast boundary, nipple, and suspicious tissues. The isocontours of a salient region generally form a dense quasi-concentric pattern of contours. Therefore, the boundary of the salient region can be easily represented by a last outer contour from the isocontour map. Hong et al. [27] used only the nesting depth of the consecutive contours to measure the salience of the suspicious region. They did not comprehensively deal with the detection of the pectoral muscle boundary. However, they claimed that the pectoral muscle boundaries could be easily detected with this method since their anatomic properties were distinctive.
In our study, a new segmentation algorithm called the rule-based contour detection (RBCD) algorithm is proposed for the detection of the pectoral muscle boundary. This algorithm uses the isocontour map generated by the multi-scale approach to effectively reveal the distinguishing features of the pectoral muscle boundary. The consecutive contours of the pectoral muscle usually form specific patterned contours with distinguishing features based on location, geometry, and topographic information. However, these features can sometimes be unstable and inefficient due to the distortion effects mentioned above. In this study, we aimed to detect the distorted pectoral muscle boundary with high accuracy. Therefore, we divided the isocontour map into small horizontal blocks to eliminate the distortion effects. Consequently, distorted parts and robust parts are separated into different blocks, so the features of the contours of the robust parts become more stable and efficient. Even if the boundary is a curve, the contours of the robust parts can be represented by straight-line segments. In addition, our method does not need any predefined region, straight-line prediction, restrictions, or suppositions of the kind used in some previous studies. To the best of our knowledge, there is no other study in the literature that has these advantages for the distorted pectoral muscle.
The rest of this paper is organized as follows. Section II presents a block diagram of the proposed method. Each stage of the block diagram is explained in its subsections. The experimental results and the discussions are presented in Sections III and IV. Finally, Section V concludes the study and outlines directions for further studies on this topic.

II. PROPOSED METHOD
The block diagram of the proposed method is presented in Fig. 2. As seen from this figure, it consists of six stages. In this block diagram, the outputs of all stages for the sample mammogram image are shown next to the corresponding block, respectively. Each stage is described in detail below.

A. PREPROCESSING
VOLUME 8, 2020
Some common databases such as the MIAS, Inbreast, DDSM, and IRMA have often been used to evaluate the performance of CAD systems in the literature. The sizes of the images in these databases vary because of the different brands and specifications of the medical imaging devices that acquire them. In the proposed method, since the image is divided horizontally into blocks, images of different heights would make both the block height and the values of the extracted features variable. Therefore, at this stage, we first performed a size normalization using a scale factor. The height of the images in the MIAS database is 1024 pixels, which is the smallest compared to the height of images in the other databases. Therefore, we chose this value as the reference height for the size normalization process. The scale factor is calculated by dividing the reference height by the height of the image.
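The scale factor described above can be illustrated with a minimal sketch. The function names are hypothetical, and applying the same factor to the width (to preserve the aspect ratio) is an assumption that the text does not state explicitly.

```python
from typing import Tuple

REFERENCE_HEIGHT = 1024  # height of MIAS images in pixels, used as the reference


def scale_factor(image_height: int, reference_height: int = REFERENCE_HEIGHT) -> float:
    """Scale factor = reference height divided by the height of the image."""
    if image_height <= 0:
        raise ValueError("image_height must be positive")
    return reference_height / image_height


def normalized_size(height: int, width: int) -> Tuple[int, int]:
    """Target (height, width); the width is scaled by the same factor (assumption)."""
    s = scale_factor(height)
    return round(height * s), round(width * s)
```

For a hypothetical 4096 x 3328 image, the factor is 0.25 and the normalized size is 1024 x 832.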
A typical mammogram consists mainly of two distinct regions: the exposed breast region and the unexposed air background (non-breast) region. It can also contain objects such as a black band region, labels, opaque markers, and artifacts. One of the aims of this stage is to determine the exposed breast region by deleting these objects. In this method, a coordinate system is used for the location features of the pectoral muscle boundary. Mammogram images face the right or the left side depending on whether they show the left or the right breast, which changes the direction of the coordinate system. In this study, the left MLO mammogram image was rotated to the right side so that the same standard coordinate system is used in both cases. For the purposes outlined above, we used five consecutive steps, which were detailed in our previous study [28]: size normalization, thresholding, morphological operations, cropping, and rotation.
It is necessary to suppress insignificant details in the isocontour map so that the contours become continuous and smooth. Some studies used the median filtering process [29]- [31] to preserve the sharpness of the edges while suppressing these details. For these purposes, we used a 5×5 median filter. The image (mdb 40) and the preprocessed image are shown in Fig. 3 (a) and (b), respectively.
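As an illustration of the 5×5 median filtering step, the following is a minimal NumPy sketch. The edge-replication padding at the image border and the function name are assumptions; the text does not specify how borders are handled.

```python
import numpy as np


def median_filter(image: np.ndarray, size: int = 5) -> np.ndarray:
    """Apply a size x size median filter to a 2D image.

    Borders are handled by replicating edge pixels (an assumption).
    """
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    # windows has shape (H, W, size, size): one neighbourhood per pixel.
    windows = np.lib.stride_tricks.sliding_window_view(padded, (size, size))
    return np.median(windows, axis=(-2, -1)).astype(image.dtype)
```

An isolated bright pixel is removed by this filter because it never forms the majority of any 25-pixel window, while step edges are largely preserved.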

B. GENERATION OF THE ISOCONTOUR MAP
The multi-scale approach is mostly used for the detection of salient tissues in medical images. This approach generates an isocontour map, also called a topographic representation. This map consists of contours formed by points that share the same intensity value. Compared with conventional approaches, the multi-scale approach has the following advantages:
1) The intensity variance of a medical image, which may vary with imaging conditions, becomes invariant in terms of the isocontour map.
2) The isocontour map usually leads to continuous morphological edges.
3) A salient region on the medical image appearing distinctive against the surrounding background usually includes a dense quasi-concentric pattern of isocontours.
4) The relationship between sequential contours can be comprehensively examined on the isocontour map.
The proposed method in this study was inspired by the method of Hong et al. [27]. They used the relationship between consecutive contours of salient regions such as masses, the nipple, and the pectoral muscle. As a result of the anatomical structure of the pectoral muscle, its isocontours are usually distinguished from the isocontours of other tissues by their position, salience, and geometric features. Therefore, in this study, we used an isocontour map of the mammogram to detect the pectoral muscle boundary. A compact and connected region in which the intensity is higher than a given intensity $t$ in the image is denoted by $R(t)$. The isocontour at a given level $t$, which is a simple curve, is defined by the boundary of the region $R(t)$. The connected regions $R(I)$ and the isocontour map $M(I)$ for the image $I$, with $N$ denoting the number of quantization levels, are the collections of the regions $R(t_i)$ and of their isocontours over the $N$ levels $t_i$. An isocontour map formed by the multi-scale approach thus consists of the isocontours at the intensities $t_i$, which lie in the range between the minimum intensity $t_{min}$ and the maximum intensity $t_{max}$ according to $N$. Because the main purpose of this study is accurate segmentation of distorted pectoral muscle regions, a fine-scale isocontour map, which consists of the isocontours at all intensities $t_i$ between $t_{min}$ and $t_{max}$, has been used. The number of quantization levels is described below; The isocontour map of the image named ''mdb 40'' is shown in Fig. 4 (a).
In these figures, intermediate-scale isocontour maps were preferred so that the relationship between the contours can be seen more clearly. These contours usually have some prominent features depending on location, geometric, and topographic information. These features may not always be stable and efficient due to the distortion effects mentioned above. As can be seen from Fig. 4, the boundary cannot be represented by a single continuous contour. In other words, there may not be patterned contours that point to the entire pectoral muscle boundary. Therefore, accurate detection of the pectoral muscle boundary is a hard task for distorted pectoral muscle examples.
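The construction of an isocontour map from thresholded regions can be sketched as follows. This is a simplified pixel-level approximation under stated assumptions: the region at level t is taken as all pixels brighter than t without separating connected components, the isocontour is approximated by the region pixels that have a 4-neighbour outside the region (so pixels on the image border count as boundary), and the levels are equally spaced. The function names are hypothetical.

```python
import numpy as np


def region(image: np.ndarray, t: float) -> np.ndarray:
    """R(t): mask of pixels whose intensity is higher than level t."""
    return image > t


def boundary(mask: np.ndarray) -> np.ndarray:
    """Pixels of the mask with at least one 4-neighbour outside the mask."""
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1]
                & padded[1:-1, :-2] & padded[1:-1, 2:])
    return mask & ~interior


def isocontour_map(image: np.ndarray, n_levels: int) -> dict:
    """Map each of n_levels equally spaced levels to its boundary mask."""
    t_min, t_max = float(image.min()), float(image.max())
    levels = np.linspace(t_min, t_max, n_levels, endpoint=False)
    return {float(t): boundary(region(image, t)) for t in levels}
```

In practice, a contour-tracing routine that returns ordered point lists would be needed for the later feature extraction; the mask form above only shows how the levels and regions relate.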

C. DIVISION OF THE ISOCONTOUR MAP INTO BLOCKS
Although the pectoral muscle contours in the isocontour map usually have distinctive features associated with their locations and their geometric and topographic structures, these features may not always be stable and efficient for a distorted pectoral muscle because of the distortion effects mentioned above. As can be seen from the isocontour map shown in Fig. 4 (a), there may not be any pattern indicating the pectoral muscle contours, so the boundary cannot be represented by a single continuous contour. Although the pectoral muscle boundary usually takes the form of a straight line, it can also consist of a combination of concave and convex segments. For these reasons, accurate detection of the pectoral muscle boundary is a hard task for distorted muscles. To overcome this difficulty, we propose dividing the isocontour map into horizontal blocks. This approach has two essential objectives: distorted contour parts are separated from the robust contour parts, and, even if the whole boundary contour is a curved line, the robust contour parts become consecutive straight-line segments in the blocks.
In this study, we conducted a series of experiments to examine the relationship between the number of blocks and the straight-line similarity. For this purpose, segmentation with a straight-line representation was performed for 1, 2, 4, 8, 16, and 32 blocks, and the results were analyzed. We aimed to determine the block number that provides the best trade-off between processing load and segmentation error. In these experiments, we used two datasets including 84 and 201 images from the MIAS and Inbreast databases. These datasets contain the pectoral muscle boundaries drawn by expert radiologists for every image. The error analyses were made assuming that these boundaries are the ground truth. To quantitatively evaluate the errors, we used the FP and FN rates. The FP rate is the ratio of the number of detected pixels lying outside the ground-truth region to the number of pixels in the ground-truth region. Similarly, the FN rate is the ratio of the number of ground-truth pixels that were not detected to the number of pixels in the ground-truth region. The pixels in the detected region and the pixels in the ground-truth region are denoted by $D$ and $R$, respectively. These metrics are given by
$$FP = \frac{|D \setminus R|}{|R|}, \qquad FN = \frac{|R \setminus D|}{|R|}.$$
For the first experiment, we selected five images from the dataset that have particularly curved pectoral muscle boundaries. First, the number of blocks was set to 1.
In other words, the division process was not performed. These curved boundaries were represented by a single straight line for the segmentation process, and the resulting TP, FP, and FN regions are shown in red, blue, and green in Fig. 5, respectively. As can be seen from this figure, if the division process is not performed, the segmentation error for a curved boundary is high when it is represented by a straight line.
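The FP and FN rates used in these experiments follow directly from their verbal definitions. A minimal sketch, assuming the detected region D and the ground-truth region R are available as binary masks of the same shape (the function name is hypothetical):

```python
import numpy as np


def fp_fn_rates(detected, truth):
    """Return (FP, FN) rates relative to the ground-truth area.

    FP: detected pixels outside the ground truth / ground-truth pixels.
    FN: ground-truth pixels not detected / ground-truth pixels.
    """
    detected = np.asarray(detected, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    n_truth = int(truth.sum())
    if n_truth == 0:
        raise ValueError("ground-truth region is empty")
    fp = int((detected & ~truth).sum()) / n_truth
    fn = int((~detected & truth).sum()) / n_truth
    return fp, fn
```

Because both rates are normalized by the ground-truth area, the FP rate can exceed 100% when the detected region is much larger than the true muscle.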
In the second experiment, the division process was performed for 2, 4, 8, 16, and 32 blocks on the same images used in the previous experiment. The pectoral boundary parts located in each of the blocks were represented by a straight line for the segmentation process. Then, the FP and FN values were calculated for these block numbers for each image and are shown by the blue and green bars in Fig. 6 (a-e), respectively. As can be seen from these plots, although the total error rates of the images for one block are very high, the total error rates decrease exponentially as the number of blocks increases, even though these boundaries are curve-shaped. In addition, the FP or the FN values for 16 or more blocks fall below 1%.
In the third experiment, the previous experiment was repeated for every image in the two datasets. The $FP_m$ and $FN_m$ values for 1, 2, 4, 8, 16, and 32 blocks were calculated and are shown in Fig. 6 (f). As can be seen from this plot, $FP_m$ and $FN_m$ decrease below 1% for 16 or more blocks. We also observed in this experiment that the evaluations obtained in the previous experiment were valid for each image in the dataset. The results of these experiments revealed the relationship between the number of blocks and the segmentation errors for the straight-line representation. In this respect, it is appropriate to choose at least 16 blocks. It is worth mentioning that using more than 16 blocks increases the processing load in the next stages without contributing much to accuracy.
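The trend observed in these experiments can be reproduced qualitatively on synthetic data. The sketch below fits one least-squares line per block to a hypothetical curved boundary x(y) and reports the mean absolute deviation in pixels. The curve, the error measure (boundary deviation rather than FP and FN rates), and the function name are all assumptions made for illustration; this is not the experiment reported above.

```python
import numpy as np


def piecewise_line_error(y: np.ndarray, x: np.ndarray, n_blocks: int) -> float:
    """Mean absolute deviation (pixels) of per-block least-squares lines."""
    residuals = []
    for yb, xb in zip(np.array_split(y, n_blocks), np.array_split(x, n_blocks)):
        slope, intercept = np.polyfit(yb, xb, 1)  # line x = slope * y + intercept
        residuals.append(np.abs(xb - (slope * yb + intercept)))
    return float(np.concatenate(residuals).mean())


# Hypothetical curved boundary: x decreases with the row index y.
y = np.arange(512, dtype=float)
x = 400.0 - 0.3 * y - 0.001 * y ** 2

errors = {n: piecewise_line_error(y, x, n) for n in (1, 2, 4, 8, 16)}
```

For a smooth curve, halving the block height reduces the straight-line deviation sharply, which is consistent with the rapid drop in error reported for increasing block counts.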
The pectoral muscle always intersects with the vertical edge of the mammogram image. The height of the pectoral muscle is the vertical distance between the intersection and the starting point. Since this height is unknown, the division process is performed vertically not only on the pectoral muscle but on the entire image. As a result, as the ratio between the height of the pectoral muscle and the height of the image increases, the number of blocks covering the pectoral muscle increases. We decided on the number of blocks according to the distribution of the ratio of the pectoral muscle height to the image height. For this purpose, in the third experiment, the distribution analysis shown in Fig. 7 was performed for each image in the two datasets. Although the height of the pectoral muscle varies from patient to patient, this distribution, which is similar to a Gaussian distribution, shows that most pectoral muscles are medium-sized, while a few are very large or very small. As a result, if the number of blocks is selected as 16, the number of pectoral muscle blocks becomes about 8 for most of them.
In addition, we observed that the distorted portion did not exceed one-quarter of the pectoral muscle region. As a result, if the number of blocks is selected as 16, the distorted parts and the robust parts can be divided into different blocks, even in small-sized pectoral muscles. In the preprocessing stage, the image height was resized to 1024 pixels for all databases. As a result, the block height is 64 pixels for the division process with 16 blocks. However, we decided to use a block height of 50 pixels to provide at least 16 blocks for small-sized breasts. The division process with blocks of 50 pixels in height was performed on the isocontour map shown in Fig. 4 (a), and the corresponding blocks are shown in Fig. 4 (b). As can be seen from this figure, the distorted and robust parts were mainly separated into different blocks. As a result of the division process, some specific patterned contours are revealed for the robust parts. These contours show features related to their axis intersection points, straight-line similarity, and topology. These features are explained in detail in the next step.
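The block arithmetic above can be checked with a short sketch. A 1024-pixel image divided into 16 blocks gives 1024 / 16 = 64 pixels per block, while a 50-pixel block height gives 21 blocks. Keeping the shorter last block (24 rows) is an assumption, as is the function name.

```python
import math
from typing import List, Tuple


def block_ranges(image_height: int = 1024, block_height: int = 50) -> List[Tuple[int, int]]:
    """Return (start_row, end_row) pairs, end exclusive, for horizontal blocks.

    The last block may be shorter than block_height (assumption).
    """
    n_blocks = math.ceil(image_height / block_height)
    return [(m * block_height, min((m + 1) * block_height, image_height))
            for m in range(n_blocks)]
```

With the default values, the first block covers rows 0 to 49 and the last one rows 1000 to 1023.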

D. DETECTION OF CANDIDATE BOUNDARY SEGMENTS
In this study, a Rule Based Contour Detection (RBCD) algorithm is proposed to detect specific patterned contours. This algorithm, which uses the distinctive features described below, is performed separately for each block, so distorted pectoral parts do not adversely affect robust parts. However, some tissues resembling pectoral boundary segments such as skin folds, artifacts, and breast border show specific patterned contours. Therefore, this algorithm can sometimes detect false boundary segments and also miss true boundary segments. In addition to this algorithm, we also proposed other algorithms to eliminate false boundary segments. These algorithms will be explained in the next step.
The first block for the mdb 40 mammogram is shown in Fig. 8 (a); some specific patterned contours belonging to the skin fold, breast boundary, and pectoral muscle can be seen in this figure. As a result, there can be one or more candidate boundary segments in a block. The last outer contour of the specific patterned contours that belong to a salient tissue represents the boundary of this tissue. The RBCD algorithm is based on a decision tree classification method using the location, geometry, and topographic features of these contours. These features are described in three sub-stages, respectively.

1) LOCATION FEATURES
As a result of the anatomical structure of the breast, the pectoral muscle starts from the upper right of the preprocessed mammogram image. Therefore, after the division process, the contours of the pectoral muscle parts show some location-based features. We defined these features with axis intersection points. To extract them, a coordinate system is first formed for a block, and then the contours are represented by an ordered sequence relative to these points. A contour ($C^m_n$) of the pectoral muscle part for the mdb 40 mammogram image is shown in Fig. 9 (a), where $m$ and $n$ denote the block index and the contour index, respectively. The height and width of the image, the top diagonal extreme point, and the bottom diagonal extreme point are denoted by $h$, $w$, $tdp^m$, and $bdp^m$, respectively, as shown in Fig. 9 (b). As can be seen in this figure, $trp^m_n$ is on the upper side of the block and $brp^m_n$ is on the bottom side of the block, except in the last block, where $brp^m_n$ appears on the left side of the block.
As a result of correct positioning during acquisition of the MLO mammogram, the pectoral muscle boundary never reaches the diagonal of the preprocessed image. The extreme points of the contours are limited by the y-axis coordinates of the points where the diagonal intersects the top and bottom sides of the blocks. These points are calculated by; An ordered sequence denoted by $C^m_{1,N_m}$ contains the contours that have sequential extreme points limited by the equations below.

2) GEOMETRIC FEATURES
As can be seen from the results of the experiments in the division process, the straight-line similarity of the pectoral muscle parts increases with this process. Therefore, in this study, the degree of similarity to a straight line was used to determine the contours of the pectoral muscle parts. To measure straight-line similarity, we used linear least-squares fitting, the simplest and most commonly applied form of linear regression. Root Mean Square Error (RMSE) is one of the most frequently used statistics for evaluating goodness of fit. A low RMSE value for a contour part indicates that it has a high similarity to a straight line. In this sub-stage, the axis intersection angles are also calculated by representing each contour with a straight line. As a result of the anatomical structure of the pectoral muscle parts, these angle values can never rise above a right angle. This straight-line representation and an intersection angle for one of the sequential contours are shown in Fig. 9 (a). The intersection angles and RMSE values of the contours in the sequence are denoted by $\theta^m_{1,N_m}$ and $RMSE^m_{1,N_m}$, respectively.
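The two geometric features can be illustrated with a minimal sketch. It assumes that a contour part is given as point coordinates inside one block, that the line is fitted as x = a·y + b (the contour runs from the top to the bottom of the block), and that the angle is measured between the fitted line and the horizontal block edge. The angle convention and the function name are assumptions, since the exact definitions are shown only in Fig. 9.

```python
import math
import numpy as np


def line_features(xs, ys):
    """Return (rmse_pixels, angle_degrees) of a least-squares line x = a*y + b."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    slope, intercept = np.polyfit(ys, xs, 1)
    residuals = xs - (slope * ys + intercept)
    rmse = float(np.sqrt(np.mean(residuals ** 2)))
    # Direction vector of the line is (dx, dy) = (slope, 1); angle in (0, 180).
    angle = math.degrees(math.atan2(1.0, slope))
    return rmse, angle
```

A perfectly straight contour part yields an RMSE close to zero, and the RMSE grows with the curvature or raggedness of the part.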

3) TOPOGRAPHIC FEATURE
In this study, a topographic representation is used in order to delineate the pectoral muscle boundary. The isocontours of a salient region such as the pectoral muscle, the breast boundary, the nipple, and suspicious tissues generally form a dense quasi-concentric pattern of contours. Therefore, the area between sequential contours is considerably smaller for a salient region than for other regions. For this reason, we analyzed how the area between sequential contours changes along the sequence. This area, denoted by $A^m_{1,N_m}$, is calculated for the contours in the sequence.
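One simple way to obtain the area between two sequential contours inside a block is the shoelace formula applied to the polygon formed by the first contour followed by the second contour in reverse order. This is a sketch under the assumptions that both contours are ordered from the top to the bottom of the block and do not cross each other; it is not necessarily the formula used by the authors, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def polygon_area(points: Sequence[Point]) -> float:
    """Area of a simple polygon by the shoelace formula."""
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def area_between(contour_a: Sequence[Point], contour_b: Sequence[Point]) -> float:
    """Area (pixel^2) enclosed between two contours ordered top to bottom."""
    polygon: List[Point] = list(contour_a) + list(reversed(contour_b))
    return polygon_area(polygon)
```

Two vertical contours one pixel apart in a 50-pixel-high block enclose 50 pixel², so densely nested contours produce small areas.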

4) RULE-BASED CONTOUR DETECTION
The RBCD algorithm consists of some decision tree rules using the features defined above. As a result of the experimental studies, the threshold values for the decision tree rules that determine the boundaries of the pectoral muscle are listed below.

$$RMSE^m_{1,N_m} < 4\ \text{pixels} \quad \text{and} \quad \theta^m_{1,N_m} < 90^\circ \quad \text{and} \quad A^m_{1,N_m} < 50\ \text{pixel}^2 \tag{18}$$
The specific patterned contours of the different tissues in the ordered sequence are grouped separately for each tissue. The salience of any tissue can be measured by the number of specific patterned contours that belong to it: this number is large for a salient tissue and small for a subtle one. Our experimental studies showed that there are at least 5 specific patterned contours in a pectoral muscle segment. Therefore, the minimum number of specific patterned contours was chosen experimentally as 5. The end contour of a group of specific patterned contours is considered a candidate boundary. There may be one or more candidate boundaries in a block. The RBCD algorithm that finds the candidate boundaries for block $m$ is defined below: This algorithm is executed separately for all blocks so that all candidate boundary regions are found. The specific patterned contours of the skin fold and pectoral muscle are shown in Fig. 8 (b). The candidate boundaries are shown in Fig. 8 (c).
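The rules and the grouping step described above can be sketched for a single block as follows. This is an illustration under assumptions, not the authors' algorithm listing: contours are assumed to be already ordered by their location features, groups are taken as runs of consecutive contours that satisfy rule (18), and the area stored with each contour is assumed to be the area to the next contour in the sequence. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List

RMSE_MAX = 4.0    # pixels, from rule (18)
ANGLE_MAX = 90.0  # degrees, from rule (18)
AREA_MAX = 50.0   # pixel^2, from rule (18)
MIN_RUN = 5       # minimum number of specific patterned contours


@dataclass
class ContourFeatures:
    rmse: float   # straight-line fit error in pixels
    angle: float  # axis intersection angle in degrees
    area: float   # area to the next contour in the sequence (assumption)


def is_patterned(c: ContourFeatures) -> bool:
    """Check the three thresholds of rule (18)."""
    return c.rmse < RMSE_MAX and c.angle < ANGLE_MAX and c.area < AREA_MAX


def candidate_boundaries(sequence: List[ContourFeatures]) -> List[int]:
    """Indices of the end contours of runs with at least MIN_RUN patterned contours."""
    candidates, run_length = [], 0
    for index, contour in enumerate(sequence):
        if is_patterned(contour):
            run_length += 1
            continue
        if run_length >= MIN_RUN:
            candidates.append(index - 1)
        run_length = 0
    if run_length >= MIN_RUN:
        candidates.append(len(sequence) - 1)
    return candidates
```

A block can yield several candidates, for example one from a skin fold and one from the pectoral muscle, which is why a later elimination stage is needed.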

E. ELIMINATION OF FALSE BOUNDARY SEGMENTS
The candidate boundary segments may belong to the pectoral muscle, the breast boundary, skinfolds, or other similar tissues. Furthermore, some of the false boundary segments may be located in pectoral muscle blocks, while others may be located in breast blocks. In addition, the RBCD algorithm may miss some boundary segments because of distortion effects. As a result, the outputs of the RBCD algorithm can be summarized in the six cases given below.
1) There may be only one true boundary segment in a pectoral block.
2) There may be one or more false boundary segments and one true boundary segment in a pectoral block.
3) There may be one or more false boundary segments and a missed boundary segment in a pectoral block.
4) There may not be any boundary segment in a pectoral block.
5) There may not be any boundary segment in a breast block.
6) There may be one or more false boundary segments in a breast block.
The boundary segments for mdb40 and their cases are shown in Fig. 10 (a). As can be seen from this figure, all cases occur in this sample. In this stage, we proposed three algorithms to eliminate false boundary segments. The algorithms are mainly based on the following knowledge:
• The pectoral muscle boundary may be represented by a third-degree polynomial function because it can consist of concave, convex, and straight-line segments.
• It begins at the upper edge and ends at the left edge of the image.
• It decreases gradually from top to bottom.
As a result, we decided to represent the pectoral muscle border with a cubic polynomial.
As seen from the Case 6 results in Fig. 10 (a), false-positive boundary segments appear not only in pectoral blocks but also in breast blocks. Therefore, we proposed an algorithm that eliminates the candidate boundary segments of Case 6 by detecting the last pectoral muscle block. This algorithm is given below:

When a block contains more than one candidate boundary segment, as in Case 2, an optimum boundary segment must be selected and the others eliminated, because only one of them can be true. We proposed the algorithm given below for this task. First, all alternative boundary paths are created using all combinations of the candidate boundary segments. The RMSE values of these boundary paths are then compared with each other to determine the optimum boundary path.
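The exhaustive selection described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' listing: each segment is assumed to be an array of (row, column) points, the RMSE of a path is assumed to be the residual of a least-squares cubic column = p(row) fitted with NumPy's `polyfit`, blocks without candidates are skipped, and the function names are hypothetical.

```python
from itertools import product

import numpy as np


def cubic_rmse(points: np.ndarray) -> float:
    """RMSE of a least-squares cubic fitted to (row, column) points.

    Assumes at least four distinct rows so that the cubic is well determined.
    """
    rows, cols = points[:, 0], points[:, 1]
    coeffs = np.polyfit(rows, cols, deg=3)
    residuals = cols - np.polyval(coeffs, rows)
    return float(np.sqrt(np.mean(residuals ** 2)))


def optimum_boundary_path(blocks):
    """Pick one candidate segment per block so that the combined path has
    the smallest cubic-fit RMSE.

    blocks: list with one entry per block; each entry is a list of candidate
            segments, and each segment is an (n, 2) array of (row, column).
    Returns (best_path, best_rmse).
    """
    non_empty = [candidates for candidates in blocks if candidates]
    best_path, best_rmse = None, float("inf")
    # product(...) enumerates every combination with one segment per block.
    for path in product(*non_empty):
        rmse = cubic_rmse(np.vstack(path))
        if rmse < best_rmse:
            best_path, best_rmse = list(path), rmse
    return best_path, best_rmse
```

The number of combinations is the product of the candidate counts per block, so the exhaustive search stays cheap only while each block holds a few candidates.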

Algorithm 4 Refining the Optimum Boundary Path
Input: The optimum boundary segments
Output: True-positive boundary segments
1: Fit a third-degree polynomial to the optimum boundary path.
2: If the RMSE of the third-degree polynomial is less than 6 pixels, go to Step 9.
3: Create all sub-paths by eliminating each boundary segment, one at a time, from the optimum path.
4: Fit a third-degree polynomial to each of the sub-paths.
5: Calculate the RMSE value of each sub-path.
6: Select the sub-path having the minimum RMSE.
7: Consider this sub-path as the optimum boundary path.
8: Go to Step 1.
9: Accept the optimum boundary path.
As a result of Algorithm 3, the false candidate boundary segments are eliminated, with the exception of a single candidate boundary segment remaining in the pectoral muscle block. However, it is not yet known whether the remaining boundary segment is true or false. We developed Algorithm 4 to eliminate the remaining false boundary segments on the optimum boundary path. A maximum RMSE value of 6 pixels, selected experimentally, is used as the accuracy condition. The elimination and curve-fitting operations are repeated until this condition is met. Consequently, this algorithm eliminates the remaining false boundary segments. Fig. 10 (a)-(d) show the results of Algorithms 1-4 for mdb40, respectively. As can be seen from this figure, candidate boundary segments are first detected by Algorithm 1, and then all FP boundary segments are eliminated by Algorithms 2-4.
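The refinement loop of Algorithm 4 can be sketched in Python as shown next. The 6-pixel threshold and the leave-one-out step come from the text; the data layout (each segment an array of (row, column) points), the use of NumPy's `polyfit` for the cubic fit, the function names, and the extra stop condition when a single segment remains are assumptions of this sketch.

```python
import numpy as np

MAX_RMSE_PX = 6.0  # experimentally selected threshold reported in the text


def cubic_rmse(segments) -> float:
    """RMSE of a least-squares cubic column = p(row) over all segment points."""
    points = np.vstack(segments)
    rows, cols = points[:, 0], points[:, 1]
    coeffs = np.polyfit(rows, cols, deg=3)
    return float(np.sqrt(np.mean((cols - np.polyval(coeffs, rows)) ** 2)))


def refine_path(segments, max_rmse: float = MAX_RMSE_PX):
    """Drop one segment at a time until the cubic fit meets the RMSE limit.

    Stopping when only one segment is left is an added safeguard of this
    sketch so that the loop always terminates.
    """
    path = list(segments)
    # Steps 1-2: fit the cubic and test the accuracy condition.
    while len(path) > 1 and cubic_rmse(path) >= max_rmse:
        # Step 3: all sub-paths obtained by removing exactly one segment.
        sub_paths = [path[:i] + path[i + 1:] for i in range(len(path))]
        # Steps 4-7: keep the sub-path whose cubic fit has the minimum RMSE.
        path = min(sub_paths, key=cubic_rmse)
    # Step 9: accept the current path.
    return path
```

Each pass removes the segment whose absence improves the fit the most, which matches the intuition that a false segment lies far from the smooth cubic traced by the true ones.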

F. DETERMINATION OF MISSED BOUNDARY SEGMENTS
As shown in Fig. 10 (a), there may be missed boundary segments (Case 4) that cannot be detected because of distortion effects. Furthermore, the boundary segments are generally rough and discontinuous. In the final stage, a curve-fitting operation with a third-order polynomial is used both to recover the missed boundary segments and to smooth the true boundary segments. This stage produces the final output of the proposed method. The pectoral muscle boundary detected by the proposed method for mdb40 is shown in Fig. 10 (e).
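A short Python sketch of this final step is given below. It fits one cubic to the points of the retained segments and evaluates it on every image row, so rows belonging to missed blocks receive an interpolated boundary position. The (row, column) data layout, the use of NumPy's `polyfit` and `polyval`, and the function name are assumptions made for the example.

```python
import numpy as np


def fit_pectoral_boundary(true_segments, rows):
    """Fit a cubic column = p(row) to the retained segments and evaluate it
    on all requested rows, including rows of blocks with missed segments.

    true_segments: list of (n, 2) arrays of (row, column) points.
    rows: iterable of row indices where the boundary is wanted.
    """
    points = np.vstack(true_segments)
    coeffs = np.polyfit(points[:, 0], points[:, 1], deg=3)
    return np.polyval(coeffs, np.asarray(rows, dtype=float))
```

Because a single polynomial covers the whole row range, the same call both fills the gaps and replaces the rough detected segments with a smooth curve.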

III. EXPERIMENTAL RESULTS
To test and compare the performance of the method, we searched for widely used databases containing expert radiologist information. Ferrari et al. [21] used a dataset of 84 MLO mammograms from the MIAS database. The radiologist drawings and the pectoral muscle boundary results for this dataset are available from Ferrari et al. [21]. Therefore, many studies have used this dataset to compare their results with state-of-the-art methods reported in the literature under the same conditions. This dataset also contains a considerable number of distorted pectoral muscle images.
The Inbreast database contains expert radiological information for the pectoral muscle boundaries. Some studies [29], [32], [33] carried out their experiments using all 201 MLO images in this database. Because this database contains the radiological information, we chose it as the second dataset in our experimental studies.
The main purpose of our study is to develop a method that is also successful on distorted pectoral muscle boundaries. Consequently, we created a third dataset consisting only of distorted pectoral muscle images. This dataset contains 12 images from the MIAS database, shown in Fig. 1 (a-l), and 4 images from the Inbreast database, shown in Fig. 1 (m-p). To the best of our knowledge, no study in the literature provides a comparison of quantitative results based on a distorted pectoral muscle dataset. Some of the studies given in Table 1 made qualitative performance evaluations for some distorted images, but none of them made quantitative evaluations for these images one by one. The proposed method was applied to the MIAS and Inbreast datasets and, image by image, to the distorted pectoral muscle images. The experimental results, evaluated qualitatively and quantitatively, are given below.

A. QUALITATIVE EVALUATION
In the first experiment, we applied our method to the distorted pectoral muscle images in the third dataset for qualitative evaluation. The results of our method for each of these images are shown in Fig. 11 (a-p) with yellow lines. For an effective qualitative evaluation, the results of the Hough and Gabor methods and the radiologist drawings are shown on the same images with blue, green, and red lines, respectively. Since the results of the Hough and Gabor methods are not available for the images selected from the Inbreast database, they could not be shown in Fig. 11 (m-p).

B. QUANTITATIVE EVALUATION
We performed several experiments on the three datasets for quantitative evaluation. Metrics such as FP, FN, FP_m, and FN_m were used for this purpose. These metric values were calculated according to Equations (5)-(8).
In the second experiment, we computed the FP_m and FN_m values of our method for the images in the first dataset. The results of the proposed method and the compared studies are shown in Table 1. These studies also used additional performance criteria based on the ranges of the FP and FN metrics. These criteria take into account the number of images that fall within the predefined error ranges, whose FP and FN percentages are given in rows 3 to 6 of column 1 in Table 1. According to these criteria, the performance of our method and that of the others are given in Table 1. In addition, these values are shown as a vertical bar graph in Fig. 12 to compare the performance of these studies effectively in terms of these criteria. The results of each method are represented by a different color.

FIGURE 11. The qualitative results of the proposed method, Hough method, and Gabor method with radiologist drawings for the distorted pectoral muscle images, shown in yellow, blue, green, and red, respectively.

TABLE 2. The results of the proposed method, Hough method, and Gabor method for the images shown in Fig. 1 (a-l).

FIGURE 12. The number of images versus error ranges for the data given in Table 1.
In the third experiment, we quantitatively evaluated the performance of our method on the distorted images shown in Fig. 1 (a-l). The FP and FN values of our method were computed for each of these images. The results of our method and of the Hough and Gabor methods are shown in Table 2. The FP_m and FN_m values of these results and their performance according to the error ranges are given in Table 3. In addition, the number of images versus the error ranges for the data given in Table 3 is shown in Fig. 13, with a different color for each method.
In the last experiment, we computed the FP_m and FN_m values for the images in the third database. The results of our study and the results of the compared studies are shown in Table 4.
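For readers who want to reproduce this kind of evaluation, the following Python sketch computes per-image FP and FN percentages from two binary masks and averages them over a dataset. Equations (5)-(8) of the paper are authoritative; this sketch only assumes a common convention in which both counts are normalized by the number of pixels in the radiologist's reference region, and the function names are hypothetical.

```python
import numpy as np


def fp_fn_percent(detected, reference):
    """Per-image FP and FN percentages for binary pectoral muscle masks.

    Assumption of this sketch: both rates are normalized by the area of the
    reference (radiologist) region.
    """
    detected = np.asarray(detected, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    ref_area = int(reference.sum())
    if ref_area == 0:
        raise ValueError("reference mask is empty")
    fp = np.logical_and(detected, ~reference).sum() / ref_area * 100.0
    fn = np.logical_and(~detected, reference).sum() / ref_area * 100.0
    return float(fp), float(fn)


def mean_rates(mask_pairs):
    """Mean FP and FN percentages over (detected, reference) mask pairs."""
    rates = [fp_fn_percent(d, r) for d, r in mask_pairs]
    fp_m = sum(fp for fp, _ in rates) / len(rates)
    fn_m = sum(fn for _, fn in rates) / len(rates)
    return fp_m, fn_m
```

With this convention an over-segmented result raises FP only, an under-segmented result raises FN only, and the dataset-level figures are plain averages of the per-image values.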

IV. DISCUSSIONS
The results of the qualitative and quantitative experiments on the images in the three datasets were given in the previous section. In total, 84 + 201 images were analyzed. The results are discussed in the following paragraphs.
The results on a set of 16 images in which the pectoral muscle is distorted by masses, artifacts, skinfolds, overlapping tissues, or other effects are shown in Fig. 11. As can be seen from this figure, our method yields satisfactory results for the distorted pectoral muscle examples. The boundaries obtained with our method are visually much closer to the pectoral muscle edges drawn by the radiologist. The Hough method yields unsatisfactory results on all of the images. The Gabor method yields satisfactory results only on some of these images. Consequently, our method outperformed both the Hough and Gabor methods on the distorted pectoral muscle boundaries.
Table 1 shows a performance comparison of our method and the state-of-the-art methods for the images in the MIAS database. We obtained these results from the second experiment described above. All FP_m rates are between 0.58% and 3.71%. Although the FP_m rate of our method ranks third with 0.92%, it is close to the result of the Gabor method, which gives the best rate. All FN_m rates are between 2.34% and 25.19%. The FN_m rates of the Hough method and the MST method are among the worst. Apart from these, all FN_m rates are very close to each other. Nevertheless, our method achieves the best FN_m rate, 2.34%.
As can be seen from Fig. 12, when the performance of the methods in Table 1 is evaluated according to the smallest error range, all results are very close to each other except the worst and the best ones. Our method showed the best performance, with about 74% of the images falling in "error range 1". The Hough method showed the worst performance, with about 78% of the images falling in the highest error range. In addition, the Gabor, AP, and MST methods had 17, 5, and 3 images in "error range 6", respectively, whereas our method did not exceed "error range 3" for any image. According to these results, the proposed method outperformed the methods given in Table 1.
In Table 2, the quantitative evaluation results of our method for the distorted pectoral muscle images are given separately for each image, along with the results of the Hough and Gabor methods. We took the quantitative results of these two methods for the distorted pectoral muscle images from Ferrari et al.; the literature does not present such results for the other methods given in Table 1. As can be seen from Table 2, the FP rate of the Gabor and Hough methods was 100% for some distorted images. In terms of FN, the Hough method yielded about 3.5% on mdb109 and between 23% and 93% on the other images, whereas the Gabor method yielded between 0% and 42%. Our method yielded FP and FN values of less than 4.3% and 8.6%, respectively. Overall, our method performed considerably better than the Gabor and Hough methods, although the Gabor method achieved slightly better FP values for some images. These results show that our method has superior performance on the distorted images.
As seen in Table 3, the FP_m and FN_m rates of the Hough and Gabor methods on the distorted images are more than 10%, which is unsatisfactory. In contrast, the proposed method yielded an FP_m rate of 2.01% and an FN_m rate of 3.44%. As can be seen from Fig. 13, for our method the number of images in the smallest error range is ten, which corresponds to about 83%. In addition, all images are in the first two error ranges. For the Hough method, the images are in the last two error ranges, with a maximum of 9 images (75%) in the fifth range. As for the Gabor method, the number of images in "error range 3-6" is ten, which corresponds to about 83%. As a result, when evaluated according to the error ranges for the distorted pectoral muscle images, our method performed better than both the Hough and Gabor methods.
Table 4 shows the quantitative comparison of the state-of-the-art methods for the Inbreast database. As can be seen from this table, all FP_m values are between 0.3% and 2.4%, which is very small compared to the results obtained from the MIAS database in Table 1. Rampun et al. [33] achieved the best FP_m value. The FP_m rate of our method ranks second with 1.26%. The FN_m values are between 1.15% and 13.6%. Shi et al. [29] obtained the largest FN_m value. Our method achieved the best FN_m value. Since the quantitative performance results of these methods for the distorted pectoral muscle images in this database are not available in the literature, per-image quantitative evaluation results such as those shown in Table 2 could not be presented for the Inbreast database.

V. CONCLUSION
In this study, we proposed an automatic method that uses a divided topographic representation to detect distorted pectoral muscle boundaries. A robust pectoral muscle boundary can be represented by a single continuous contour in the isocontour map. However, a distorted pectoral muscle boundary cannot be represented by a single continuous contour because some parts of it can be fuzzy or invisible. In this method, after the division of the isocontour map into blocks, the distorted and robust pectoral muscle parts are separated from each other and localized in different blocks. Even if the whole boundary contour is a curved line, the robust contour parts become straight-line segments in the blocks. The location, geometry, and topographic features of the pectoral muscle boundary become more effective and stable. As a result, the specific patterned contours having these features are revealed for the robust pectoral muscle boundary parts. However, some of the pectoral muscle blocks do not have specific patterned contours due to the distorting effects.
Since the proposed method is performed separately for each block, a distorted pectoral part does not adversely affect the robust parts. The candidate pectoral muscle boundary parts are detected from the specific patterned contours using the RBCD algorithm. Some of them may be false positives because tissues such as the breast border, skinfolds, and other tissues may show characteristics similar to those of the pectoral muscle boundary. The optimum boundary detection algorithms developed in this study eliminate false-positive boundary segments. Finally, the pectoral muscle boundary segments are combined and refined by a third-degree polynomial.
Our method was tested on two datasets consisting of 84 and 201 mammogram images from the MIAS and Inbreast databases, respectively. The method was also tested on the distorted images selected from these databases. The quantitative and qualitative results for the distorted images show that the proposed method outperformed the other compared methods.
In the future, the proposed method will be tested on distorted pectoral muscle images in other databases to further assess its validity. In addition, the division process will be added to the state-of-the-art methods in this field, and their performance on distorted images will be examined. Due to the overlapping tissue problem in radiology, some masses in the breast, lungs, liver, and brain can sometimes be partially blurred or invisible. In these cases, the proposed method can be adapted and used for the segmentation of these masses.
HAYATI TURE received the bachelor's and master's degrees from the Department of Electrical and Electronics Engineering, Karadeniz Technical University, in 1999 and 2003, respectively. He is currently pursuing the Ph.D. degree with the Department of Electrical and Electronics Engineering. He is also a Network Manager with the IT Department in the Computer Center, Karadeniz Technical University. His research interests include pattern recognition and biomedical image processing.
TEMEL KAYIKCIOGLU received the bachelor's, master's, and Ph.D. degrees from the Department of Electrical and Electronics Engineering, Karadeniz Technical University and Texas Tech University, in 1984, 1987, and 1993, respectively. He is currently a Professor with Karadeniz Technical University. His main research interests include biomedical signal and image processing, computational neuroscience, and brain-computer interfaces. He has supervised more than 35 graduate theses and has published more than 20 journal articles and a book chapter in these areas. He serves on the Editor Committee for the Turkish Journal of Electrical Engineering and Computer Science. He is the Head of IEEE 22nd Signal Processing and Communication Applications (SIU) Conference and National Biomedical Meeting (TIPTEKNO 2017).