Diagnosis Support Model of Cardiomegaly Based on CNN Using ResNet and Explainable Feature Map

Medical expert support systems using medical data contribute significantly to medicine by applying intelligent image analysis technology. Such systems must be validated and must have a transparent internal structure, so techniques for analyzing their results are an important factor. This study proposes a diagnosis support model of cardiomegaly based on CNN using ResNet and an explainable feature map. First, a cardiomegaly diagnosis model is configured using ResNet and trained on a chest X-ray data set. Then, an explainable feature map is implemented by detecting how the model's output changes in response to changes in its input. The input changes are made by reversing each pixel of the target image in sequence, and the resulting changes in the neural network output layer are saved on the feature map. As a result, the configured analysis method achieves an accuracy close to 80% and provides information about the inside of the neural network through a visually expressed feature map. In particular, this interpretation method can be extended to a more general form in medical support systems that analyze patterns.


I. INTRODUCTION
Artificial intelligence now plays a key role in diverse fields [1]. In the past, a simple neural network system was not difficult to interpret. However, recent algorithms such as DNNs operate on large matrices, and image analysis systems consist of hundreds of layers and associated variables. Therefore, most artificial intelligence algorithms being commercialized recently are regarded as black box models [2], [3]. As the use of such black box models gradually increases, the demand for transparency is increasing among diverse AI-related interest groups [4]. In fields such as finance and security, where important decisions must be made, providing reliable data about results is therefore becoming increasingly important. In particular, medical diagnosis support systems that assist medical experts must deliver diverse information relating to result interpretation [5], [6]. The inner-body environment of a patient with a particular disease varies widely, and a medical expert determines the importance of each factor according to the situation. Therefore, when a machine learning model determines the characteristics of a particular patient, it must present the most important grounds for its judgment. Only after such a process can a medical expert determine the plan to respond to each factor and the reliability of the model's results. Thus, understanding the model's grounds for prediction is very important in determining whether or not the model's results are reliable.
The associate editor coordinating the review of this manuscript and approving it for publication was Mehdi Hosseinzadeh.
This study constructs a medical diagnosis support model centering on cardiomegaly. To configure a diagnosis support model using an explainable feature map, X-ray images for cardiomegaly judgment are used as the baseline data. Cardiomegaly is a disease that leads to enlargement of the heart in general as well as to an abnormally heavy heart by weight, and it is associated with diverse cardiac diseases, physiological statuses, and genetic factors [7]. Therefore, it may serve as a model index for discovering symptoms relating to diverse cardiac diseases. Cardiomegaly is diagnosed using diverse imaging equipment, and a case may be determined as having the disease when an increase in the ventricular wall thickness is indicated. Currently, the diagnosis is conducted through techniques such as chest X-ray, electrocardiography, echocardiography, ventricular angiography, and magnetic resonance imaging. Among these, the method using X-ray images can predict the risk of cardiac hypertrophy and the probability of hospitalization using the cardiothoracic ratio [8]. In addition, various studies on diagnosing heart and lung diseases using artificial neural networks are being conducted. These studies use various types of CNN models based on chest X-ray image data to detect or locate abnormal patterns in the images. These models show excellent performance using data from various medical institutions. In addition, they provide various visualization information so that the cause of the actual disease can be inferred. In this study, based on the existing research, we restrict the range of diseases to cardiomegaly and present an improved visualization model for use in medical support systems.
In particular, this method uses X-ray imaging, which is universally available because it is one of the most basic test items in medical examination practice, and which can be selected at relatively low cost in countries with high medical costs.
This study proposes a diagnosis support model of cardiomegaly based on CNN (Convolutional Neural Network) using ResNet (Residual Network) and an explainable feature map. This method classifies diseases with an effective CNN algorithm using only X-ray images as input and explains the reason for classification using an explainable feature map. To do this, an artificial neural network is first constructed with ResNet and trained. The ResNet used consists of 7 layers of BasicBlock layers, and each layer consists of 2 residual blocks [9]. Internally, configurations using SGD, AdaGrad, RMSProp, and Adam are compared for accuracy. Then, the results of the neural network output layer for the produced judgment are fixed, and each pixel of the target image is reversed in sequence. The positive and negative changes in the result value of the neural network output layer resulting from the input data changes are saved on a feature map known as an explainable feature map. The two produced explainable feature maps can be visually expressed. As a result, the operating characteristics inside the artificial neural network are expressed according to the shape of the resulting image. Since cardiomegaly is diagnosed using only an X-ray image as input, medical costs are greatly reduced. In addition, because the explainable feature map can be expressed visually, it can contribute to the internal interpretation of the neural network and to information delivery for non-experts in artificial neural networks, such as medical experts. In this study, the CNN-based image analysis and the cause and judgment of cardiac hypertrophy are described in Chapter 2. The proposed diagnosis support model of cardiomegaly based on CNN using ResNet and explainable feature map is described in Chapter 3. The results and performance evaluation are described in Chapter 4, and the conclusion is described in Chapter 5.

II. RELATED WORK
A. CNN-BASED IMAGE ANALYSIS AND XAI
The accuracy of machine learning models such as CNNs (Convolutional Neural Networks) is gradually increasing. However, their structure is becoming more complicated and their results more difficult to interpret [10], [11]. Due to such characteristics, machine learning models are often described as black boxes, and explaining the prediction results of a machine learning model is very difficult [12]. Nevertheless, it is important to understand why such models arrive at their results. In particular, machine learning is based on past learning, so there is always a probability that a model will malfunction when given highly specific data.
Therefore, such imperfect judgment systems have more value as support systems, and support systems having an influence on medical treatments require a significant explanation of the generation process. Based on different perspectives, it can be said that XAI (eXplainable Artificial Intelligence) was studied as the initial artificial intelligence study was conducted, but it was actually appropriately defined by Fisher, Rudin, Dominic in 2004 [13]. Recent XAI is defined as methods and techniques in the application of artificial intelligence technology such that the results of the solution can be understood, and is considered, along with AI, an important part of the future [14], [15]. In the United States of America, a related state-led project is performed through DARPA(Defense Advanced Research Projects Agency) [16]. This program includes 11 detailed projects and will be continued until 2021. In Europe, the investments being made by European Commission are expected to reach a total of 10 billion euros by 2020 [15], [17].
According to a study by DARPA, XAI is focused on adding explanation and HCI (Human Computer Interaction) functions to the basic model, and on improving the AI situation. In other words, through the use of XAI, the user can ensure fairness in the support system's decision-making process, interpret results more conveniently, anticipate unstable results, and confirm the causal relationship of the model's reasoning [18]. XAI techniques include Feature Importance (also called Permutation Importance), PDP (Partial Dependence Plots), surrogate analysis, and others. Recently, black box explanations through transparent approximations (BETA) [19], local interpretable model-agnostic explanations (LIME) [20], generalized additive models (GAMs) [21], etc. are being studied.
First, Feature Importance, also called Permutation Importance [22], is a technique used to analyze which data features influence the classification results. A particular feature value is replaced with a random value, and the resulting increase in error is measured. However, this technique gives no negative/positive direction, and it is difficult to examine the influence according to scale or the dependency between features. PDP (Partial Dependence Plots) is a technique where the influence on the overall model is evaluated as a particular feature value is linearly changed [22]. This technique can be used to visually express the influence according to the feature changes. However, although such a method supplements the weaknesses of the feature importance technique, its calculation process takes a long time. Surrogate analysis builds a number of surrogate models that approximate the function of the original model in order to determine the operating characteristics of the original model. Such a surrogate is also known as an approximation model, response surface model, or emulator. This method allows learning without any knowledge of the model's internals, and is divided into global surrogate analysis, which uses the overall learning data, and local surrogate analysis, which interprets a single data instance. It is applicable regardless of the type of machine learning, makes it easy to implement a surrogate model, and makes the explanation simple. However, since it is not a direct interpretation of the model, the resulting interpretation may differ slightly.
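As a rough illustration of the permutation importance idea described above, the following minimal Python sketch shuffles one feature column and measures how much the error grows. The toy model, the data, and the function names are hypothetical and are not taken from the cited work; shuffling the column is one common way of replacing a feature with random values.

```python
import random


def mean_squared_error(model, rows, targets):
    """Average squared difference between model predictions and targets."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, feature_index, seed=0):
    """Increase in error after shuffling the values of one feature column."""
    rng = random.Random(seed)
    baseline = mean_squared_error(model, rows, targets)
    column = [r[feature_index] for r in rows]
    rng.shuffle(column)
    shuffled = [
        r[:feature_index] + [v] + r[feature_index + 1:]
        for r, v in zip(rows, column)
    ]
    return mean_squared_error(model, shuffled, targets) - baseline


def toy_model(row):
    """Hypothetical model whose output depends only on feature 0."""
    return 3.0 * row[0]


rows = [[float(i), float(i % 3)] for i in range(20)]
targets = [toy_model(r) for r in rows]

imp0 = permutation_importance(toy_model, rows, targets, 0)
imp1 = permutation_importance(toy_model, rows, targets, 1)
print(imp0, imp1)
```

The score for feature 1 is zero because the toy model ignores it, while feature 0 receives a positive score. As noted in the text, the score carries no sign, so it does not show whether a feature pushes the prediction up or down.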
LIME (Local Interpretable Model-agnostic Explanations) is a representative local surrogate analysis method, and a number of studies on it have been conducted recently. LIME is a technique used to analyze how an artificial intelligence model interprets one individual data instance [23]. In the case of an image, the pixel that most influences the result is selected, and this pixel is designated as the super pixel. Then, the areas surrounding the super pixel are enlarged and the areas sharing the same influence are selected. LIME's operating procedure is as follows. First, the original data is fed into the model and the prediction value is extracted. Then, fabricated data similar to the original data is fed into the black box model, and the fabricated data is gradually changed. In this process, a linear model is fitted to the black box's predictions, weighted by the similarity between the fabricated data and the original data. As a result, it is possible to build comparatively simple linear models that approximate the black box model. It therefore becomes possible to schematize the importance of the external input data that influences the results, and to explain how the prediction changes as the original data changes. As with surrogate analysis in general, LIME is applicable regardless of the type of intelligent model and interprets the results through changes in the model's external input, so it is intuitive and can be easily understood by humans.
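The local surrogate procedure described above can be sketched as follows. This is a simplified illustration on two numeric features rather than image super pixels, and it is not the LIME library's implementation: the black box function, the Gaussian perturbation, the exponential similarity kernel, and all parameter values are assumptions chosen for the example.

```python
import math
import random


def black_box(x):
    """Hypothetical nonlinear model standing in for the black box."""
    return math.tanh(2.0 * x[0]) + 0.1 * x[1] ** 2


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def local_surrogate(model, x0, n_samples=500, sigma=0.3, kernel_width=0.5, seed=0):
    """Fit a similarity-weighted linear model around the instance x0."""
    rng = random.Random(seed)
    d = len(x0)
    # Weighted normal equations for the basis [1, x_1 - x0_1, ..., x_d - x0_d].
    a = [[0.0] * (d + 1) for _ in range(d + 1)]
    b = [0.0] * (d + 1)
    for _ in range(n_samples):
        x = [v + rng.gauss(0.0, sigma) for v in x0]      # fabricated sample
        dist2 = sum((xi - vi) ** 2 for xi, vi in zip(x, x0))
        w = math.exp(-dist2 / kernel_width ** 2)         # similarity weight
        phi = [1.0] + [xi - vi for xi, vi in zip(x, x0)]
        y = model(x)                                     # black box prediction
        for i in range(d + 1):
            b[i] += w * phi[i] * y
            for j in range(d + 1):
                a[i][j] += w * phi[i] * phi[j]
    coef = solve(a, b)
    return coef[0], coef[1:]


intercept, weights = local_surrogate(black_box, [0.0, 0.0])
print(intercept, weights)
```

Around the chosen instance, the fitted weight of the first feature is much larger than that of the second, which is the kind of local importance ranking the surrogate is meant to expose.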
In addition, as a similar concept to LIME, there is SHAP (Shapley Additive exPlanations) that is based on the coalitional game theory. SHAP configures the linear model of the results that change depending on the status of certain features during dataset configuration, and analyzes and interprets the coefficient(weight value). However, such a method requires a lot of calculations and tends to ignore the dependence between features. Filter visualization analyzes the influence on the results by blurring certain areas of the hidden layer during RNN-like image deep learning computation. LRP (Layerwise Relevance Propagation) back-traces the model's results and enters a heat map shape on the input image. This method back-traces from the output terminal to the input terminal, and uses a method of evaluating the certain relevance of each neuron.
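The coalitional game idea behind SHAP mentioned above can be illustrated with an exact Shapley value computation on a very small example. This is a sketch of the underlying definition only, not the SHAP library's algorithm; the value function `toy_value` and its numbers are hypothetical.

```python
from itertools import permutations


def shapley_values(value, n_features):
    """Exact Shapley values: average marginal contribution over all orderings."""
    players = list(range(n_features))
    totals = [0.0] * n_features
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return [t / len(orders) for t in totals]


def toy_value(coalition):
    """Hypothetical model output when only the given features are present."""
    v = 0.0
    if 0 in coalition:
        v += 2.0
    if 1 in coalition:
        v += 1.0
    if 0 in coalition and 1 in coalition:
        v += 1.0  # interaction shared between features 0 and 1
    return v


phi = shapley_values(toy_value, 3)
print(phi)
```

The number of orderings grows factorially with the number of features, which is consistent with the remark above that such methods require a lot of calculation.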
As described, recently, diverse XAI models are being studied. XAI is a must to understand the internal structure of a deep learning model. Therefore, XAI is very useful to not only deep learning researchers, but also the end users of medical diagnosis systems such as medical support systems.

B. CAUSE AND READING OF CARDIAC HYPERTROPHY
The heart responds to diverse physical, hemodynamic, hormonal, and pathological stimuli. A hypertrophic response is therefore induced by diverse causes, and the heart adjusts to cardiac load requirements by increasing the cardiac muscle mass. As a result, cardiomegaly occurs, which thickens the ventricular wall and increases the cardiac muscle weight. In general, it occurs when the heart is continuously overloaded by a cause that intensifies the cardiac impulse, and it is associated with diverse diseases, physiological states, and genetic factors. It may occur due to physiological states such as the exercise training of athletes or manual workers [7]. In addition, it is influenced by almost all cardiac diseases and related conditions, such as hypertension, myocardial infarction, arrhythmia, hormonal disorder, and genetic mutation. Continued hypertrophy may lead to cardiac failure, sudden death, cardiomyopathy, etc., and is a serious risk factor relating to cardiac diseases and the cardiac death rate. In the U.S., approximately 500 thousand people are diagnosed with cardiac failure on an annual basis, and the death rate is almost 50% [24].
Cardiac hypertrophy arising from cardiac diseases may occur due to diseases such as hypertension, aortic stenosis, hypertrophic cardiomyopathy, invasive heart muscle disease, metabolic disorder, etc., and the cardiac hypertrophic response consists of a compensation mechanism that increases cardiac output in the early stage. The symptoms of this disease are difficulty with breathing, palpitation, chest tightness, and chest pain. In particular, such symptoms may appear or be intensified during sudden moderate-intensity physical activity. On the other hand, in the so-called physiological hypertrophy occurring in elite athletes, physiological adaptation to periodic intensive physical training influences the mass increase, ventricular enlargement, and ventricular wall thickening, and is related to the proportionally increased length and width of cardiomyocytes [25]. In such a case, since an incorrect diagnosis has an extensive influence on the sports organization and society, the ability to reliably distinguish normal training effects is important [26]. Since cardiac hypertrophy is associated with such diverse diseases and physiological states, analyzing a patient with cardiac hypertrophy is a serious problem in most cases, and the pathology of diseases such as left-ventricular dysfunction and ischemia is complicated [27].
55804 VOLUME 9, 2021
The diagnosis of cardiac hypertrophy is conducted using diverse imaging equipment, and a case may be determined as having the disease when an increase in the ventricular wall thickness is indicated. Depending on the type of disease, the findings may also include myocardial fibrosis, morphological abnormalities in the heart, abnormal circulation of the coronary arteries and other structures, and an abnormal electrocardiogram. Cardiac hypertrophy may be suspected in the early stage due to heart murmur, decreased cardiac output, family medical history, new symptoms, left ventricular hypertrophy, and an abnormal electrocardiogram pattern. Currently, the diagnosis of cardiac hypertrophy is conducted through diverse imaging techniques such as chest X-ray, electrocardiography, echocardiography, left ventricular angiography, magnetic resonance imaging, etc.
Although the magnetic resonance imaging of the heart is rarely used to distinguish the cause of cardiac hypertrophy, it contributes to diagnosis data as it proposes the distribution and severity of myocardial morphology [26]. Then, in general, an echocardiography examination is conducted [27]. MRI is an important imaging test that plays an important role in the modern diagnosis of patients with cardiac hypertrophy, and provides the accurate definition of full left ventricular wall reconstruction and of the distribution and pattern of cardiac hypertrophy. It is particularly useful to patients whose echocardiography shows no clear anatomic features of the left ventricular wall [27]. Cardiac magnetic resonance uses a method that detects the phase shift of signals generated from the protons of an atomic nucleus, and is capable of quantifying the structural characteristics and blood flow status. CMR is a method capable of identifying the left ventricular hypertrophy areas not easily recognized through the echocardiography examination, and is the most powerful support imaging test method that provides clear diagnostic advantages to the selected patient with cardiac hypertrophy [28].
Echocardiography is useful as an early-stage examination method, since it can be used to evaluate hemodynamic abnormalities in the cardiovascular system. It can be used to confirm the tumor's shape, position, size, relationship with surrounding organs, and fluidity. In addition, since it is used to determine whether or not the tumor invaded the pericardium and myocardium and whether or not cardiac dysfunction is hemodynamically caused, it is possible to conduct a detailed follow-up observation. Therefore, echocardiography is useful for determining the future treatment plan for cardiac hypertrophy and conducting the post-treatment follow-up examination [29]. However, although echocardiography having high sensitivity and specificity can be used to visualize the left ventricular wall, it cannot be used to distinguish between the conditions based on muscle cell hypertrophy and the conditions that thicken the left ventricular mass and cardiac wall thickness due to the infiltration or accumulation in the metabolic organ cells [27]. Patients who seem to be influenced by cardiac hypertrophy cannot be clinically examined through echocardiography due to technical or other reasons [30].
As described above, the modernized imaging equipment and diagnosis techniques provide diverse methods that can be used to diagnose cardiac hypertrophy. However, the burden of expenses serves as a big problem and most of these examinations can only be executed at medical institutions equipped with high-priced diagnosis equipment. On the other hand, chest X-ray is a relatively low-priced examination that has been used traditionally and can be executed universally, and is capable of diagnosing cardiac hypertrophy by measuring the CTR(Cardio-Thoracic ratio) [31]. CTR is a method used to diagnose cardiac hypertrophy through chest X-ray imaging, and is measured using the following formula: CTR = (MR + ML)/TD. Figure 1 shows an X-ray image that measures the CTR. MR (Maximal Right) is the maximal diameter of the right heart, and ML (Maximal Left) is the maximal diameter of the left heart. TD (Thoracic Diameter) is the maximal diameter of the thorax, and CTR is normally considered abnormal when it is above 50% [32], [33].
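The CTR formula above is simple enough to express directly in code. The following Python sketch computes CTR = (MR + ML)/TD and applies the 50% threshold mentioned in the text; the function names and the example measurements are hypothetical and serve only to illustrate the arithmetic.

```python
def cardiothoracic_ratio(mr, ml, td):
    """CTR = (MR + ML) / TD, with all three lengths in the same unit."""
    if td <= 0:
        raise ValueError("thoracic diameter must be positive")
    return (mr + ml) / td


def is_abnormal(ctr, threshold=0.5):
    """Flag a CTR above the 50% threshold described in the text."""
    return ctr > threshold


# Hypothetical measurements in centimeters, for illustration only.
ctr = cardiothoracic_ratio(mr=5.0, ml=9.5, td=28.0)
flag = is_abnormal(ctr)
print(round(ctr, 3), flag)
```

Because the ratio is dimensionless, the measurements can be taken in pixels or in physical units as long as all three use the same unit.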
Since a higher cardiothoracic ratio measured on low-priced, highly accessible chest X-ray imaging indicates a higher probability that the patient will become an inpatient, chest X-ray diagnosis is highly reliable in predicting cardiac hypertrophy, even compared with high-priced cardiac MR [8].
Recently, molecular genetic study results are bringing changes to the basic understanding of cardiac hypertrophy, and a clinical agreement is being reached [30]. For the case of patients diagnosed with cardiac hypertrophy, the gene tests are showing relevance. Therefore, genetic studies have the potential to enhance the reliability of the diagnosis of cardiac hypertrophy, and can play an important role in bringing solutions to ambiguous diagnoses in the future [27]. Currently, genetic studies remain at a research level due to time-consuming family medical history studies, but more universal and diversified gene diagnosis techniques are expected to become available within 5 years [30].
As described, cardiac hypertrophy is associated with diverse cardiac diseases, physiological statuses, and genetic factors. Also, long-term cardiac hypertrophy leads to cardiac failure and may lead to death due to cardiac diseases [25]. Diverse techniques such as MRI, CMR, and echocardiography are used to distinguish cardiac hypertrophy. On the other hand, the interpretation of X-ray imagery, which uses X-rays, a type of electromagnetic wave, is not commonly used to diagnose cardiac diseases. However, since X-ray images can be easily collected as additional data relating to lung diseases and the results can be confirmed within a short amount of time, attempts to use them are being made in developing countries and in places having no access to universal medical services.

III. DIAGNOSIS SUPPORT MODEL OF CARDIOMEGALY BASED ON CNN USING RESNET AND EXPLAINABLE FEATURE MAP
To configure the diagnosis support model of cardiomegaly, a three-step procedure is required.
Step 1, it is necessary to configure a ResNet for X-ray image recognition. Data collection is prepared and pre-processed for the ResNet's learning, and the learning process is conducted.
Step 2, the target images classified as cardiovascular patients and softmax values of output layers are obtained in the neural network where learning has been completed. It also stores these values separately. In addition, a feature map is prepared. Through the partial reversal operation of the image, the influence of each part on the output layer is compared.
Step 3, the imaging process is conducted in a way allowing visual expressions. Figure 2 shows the progress of the diagnosis support model of cardiomegaly based on CNN using ResNet and explainable feature map. This 3-step mechanism allows the data flow within the neural network to be examined without changing the shape of the original artificial neural network.
A. DATA COLLECTION AND X-RAY IMAGE DATA PRE-PROCESSING
X-ray images are used as the baseline data for cardiac hypertrophy reading. To intelligently read the X-ray image data and to explain the reason for that reading, an algorithm analyzing the chest X-ray images of no-findings and of patients with cardiac hypertrophy is required. Such an analysis algorithm should be able to separate the characteristics of the relative size and shape of the heart, and should identify whether or not the relatedness to cardiac hypertrophy is significant.
A CNN-based ResNet is used for cardiac hypertrophy distinction. The discriminating process learns images classified as cardiovascular and normal through the CNN, and the trained classification system distinguishes patients by classifying new X-ray images. The chest X-ray data set provided by the NIH (National Institutes of Health) is used as the baseline data for learning [34], [35]. The data consist of images in PNG format, each with a resolution of 1024 x 1024. A total of 112,121 images are classified per disease, anonymized, and provided.
Initially, the data for learning and assessment are preprocessed. The initial 112,121 images are classified into 14 diseases and 1 'No Finding' group. Of the classified data, 20,797 data having an 'Overlapping disease' were excluded for appropriate learning data sorting. Of the remaining 91,324 data, 1,000 out of a total of 1,093 data classified as patients definitely diagnosed with cardiomegaly are selected. Then, the 'Undersampling Technique' is used to solve the class imbalance problem. To do so, of the 60,361 data classified as no-findings, 1,000 data are selected on a random basis. The finally selected data are as shown in Table 1. In Table 1, 'count' is the total number of data classified per disease, 'for learning' is the data used for learning, and 'for evaluation' is the data used for evaluation. Of the data classified as patients definitely diagnosed with cardiomegaly, 900 data are used as the data for learning and 100 data are used as the data for evaluation. This applies the same to the data classified as 'No Finding'.
A total of 1,800 data are used as the data for learning. This size satisfies the minimum size required for learning. However, since certain images are blurry or contain partially tilted data or blank spaces, there is a possibility that such factors may have an influence on accuracy. To mutually compare this study with other studies, the data are used without making any corrections.
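The undersampling and the learning/evaluation split described above can be sketched as follows. The class sizes (1,093 and 60,361), the 1,000 selected images per class, and the 900/100 split come from the text; the image identifiers, the function name, and the random seed are hypothetical placeholders rather than the authors' actual procedure.

```python
import random


def undersample_and_split(ids_by_class, per_class=1000, n_eval=100, seed=0):
    """Pick per_class items from each class, then hold out n_eval for evaluation."""
    rng = random.Random(seed)
    train, evaluation = [], []
    for label, ids in ids_by_class.items():
        chosen = rng.sample(ids, per_class)  # random selection without replacement
        evaluation += [(image_id, label) for image_id in chosen[:n_eval]]
        train += [(image_id, label) for image_id in chosen[n_eval:]]
    return train, evaluation


# Placeholder identifiers standing in for the image file names.
ids_by_class = {
    "Cardiomegaly": [f"cardio_{i}" for i in range(1093)],
    "No Finding": [f"normal_{i}" for i in range(60361)],
}

train_set, eval_set = undersample_and_split(ids_by_class)
print(len(train_set), len(eval_set))
```

Selecting the same number of images from both classes is what removes the class imbalance between the 1,093 cardiomegaly images and the 60,361 no-finding images.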

B. IMAGE CLASSIFICATION USING CNN-BASED ResNet
It is necessary to evaluate the size and shape of the heart for cardiomegaly judgment, and an intelligent technique capable of learning and dividing the differences between non-patients and patients is required. Since the X-ray images of a number of patients and non-patients are collected as big data, it is possible to apply machine learning for big data processing. In particular, as far as medical data is concerned, the analysis method should vary depending on the disease and image form(X-ray, CT, MRI, etc.). In particular, since the image characteristics vary depending on the type of disease, an algorithm suitable for each disease must be used. In this study, a CNN is used for cardiomegaly judgment. A CNN utilizes an idea known as feature representation learning, and the intelligent system itself divides particular patterns. CNN is an algorithm demonstrating very excellent performance in the fields of image classification and voice recognition, and is useful for data clustering and classification [36], [37]. Therefore, it is a suitable algorithm for analyzing particular shapes such as cardiomegaly [38].
To implement a CNN algorithm, it is necessary to learn from the previously collected and classified X-ray images, and to create a high-accuracy neural network based on such learning. The pre-processed X-ray images have a resolution of 1024 x 1024. Although such a resolution is quite small for practical X-ray use, it is comparatively large for neural network learning. Therefore, the width of the input layer becomes large and bulky. In addition, a deeper structure is required to reach practically applicable accuracy. The problem with such a structure is that it creates the 'Vanishing Gradients' problem, where the performance deteriorates as the neural network gets deeper [39], [40]. To solve this problem, the ResNet structure is expanded and used. The ResNet structure complements the output by letting the input data skip several layers, thereby increasing performance by overlapping multiple neural networks.
A ResNet is most distinctive in that it divides the neural network into small blocks and adds the input value to the output of each block. This block is known as a 'Residual Block'. Such a method allows the neural network model to be significantly deepened while avoiding the 'Vanishing Gradients' problem. In this study, this method is used to configure blocks, and the residual blocks are stacked on one another. In addition, batch normalization is performed per block, which normalizes the input data within the neural network by mean and variance. A drop-out-like effect is obtained through this process. The residual block and the overall ResNet structure used are shown in Figure 3.
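The central idea of the residual block, adding the block's input to its output, can be shown with a minimal Python sketch. To stay short and self-contained it makes several simplifying assumptions: small dense layers stand in for the convolutions, the normalization is applied over a single vector rather than over a batch, and all names and weights are hypothetical. It is not the network used in this study.

```python
import math


def relu(v):
    """Element-wise ReLU activation."""
    return [max(0.0, x) for x in v]


def linear(v, weights, bias):
    """Dense layer used here as a stand-in for a convolution."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]


def normalize(v, eps=1e-5):
    """Normalize by mean and variance (simplified stand-in for batch normalization)."""
    mean = sum(v) / len(v)
    var = sum((x - mean) ** 2 for x in v) / len(v)
    return [(x - mean) / math.sqrt(var + eps) for x in v]


def residual_block(x, w1, b1, w2, b2):
    """out = ReLU(F(x) + x), where F is two layers with normalization."""
    out = relu(normalize(linear(x, w1, b1)))
    out = normalize(linear(out, w2, b2))
    return relu([o + xi for o, xi in zip(out, x)])  # skip connection


# With all-zero weights F(x) is zero, so the block reduces to ReLU(x).
zeros = [[0.0] * 3 for _ in range(3)]
x = [1.0, -2.0, 3.0]
y = residual_block(x, zeros, [0.0] * 3, zeros, [0.0] * 3)
print(y)
```

The zero-weight case shows why the skip connection helps: even when the learned layers contribute nothing, the input still passes through, so stacking more blocks does not have to degrade the signal.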
The configured system consists of 7 layers of BasicBlock layers. Each layer has 2 residual blocks. The initial input image goes through the convolution function. Then, the batch normalization and activation function processes are conducted. The input value is added to the output, and the data finally passes through the ReLU function. Such a process tied together as a module is known as a 'ResNet Layer'. The data repeatedly pass through the neural network inside the connected ResNet Layers and are eventually output as predictions. The internal shape varies depending on the depth of the layer. The image on the very first layer is divided and configured as a 16-stage channel, and then a block containing tensors having sizes of (8 x 512 x 512), (...
For a neural network to operate, it is necessary to implement error backpropagation within the neural network. Error backpropagation is a method used to adjust the weight value of each node to minimize the errors in the output value of each node. It proceeds in the reverse order of forward propagation, moving from the final output node to the input in series. Therefore, an algorithm and weight values that minimize the error between each node's result value and the target value within the configured neural network system are required. The neural network learning optimization function calculates the error as a gradient and continuously moves towards the side having a smaller gradient. Therefore, the node's weight value is updated in the direction having a smaller error [41], [42]. The X-ray-based classification used in this study is distinctive in that the shape and content of its baseline data differ from those of general image classification. For diseases such as pulmonary fibrosis and pulmonary tuberculosis, the specific organ position and image texture are more significant than the lung shape. Diseases such as cardiomegaly, mass, and atelectasis are determined mainly based on the organ shape and relative size. Therefore, it is necessary to select a suitable optimization function. Representative functions include SGD, AdaGrad, RMSProp and Adam [43], and the performance varies depending on the learning contents and characteristics. SGD is used in general studies, and, recently, AdaGrad or Adam is used to achieve improved performance. First, in the case of SGD [43], the update of weight parameter x is as shown in (1). ∂f /∂x is the gradient of the loss function, that is, the change in result value f resulting from the change in parameter x. η is the learning rate.
AdaGrad [43] is a method in which the learning rate is reduced as learning progresses. AdaGrad adjusts the learning rate individually for each weight. x is the weight parameter, and ∂f/∂x is the gradient of the loss function. h is the term that adjusts the learning rate. The update of parameter x is as shown in (2).
h is updated based on the following formula. ⊙ is the element-wise multiplication.
AdaGrad in this form relies on simple gradient accumulation and works well on functions with a simple shape. However, its performance tends to degrade on functions consisting of a complex curve. RMSProp [43] follows the AdaGrad form and additionally applies an exponentially weighted moving average to the gradient, so that the latest gradient is weighted more heavily. The update of parameter x is as shown in (4).
h is the term that adjusts the learning rate, and ρ is a hyperparameter; reducing ρ makes the latest gradient count more heavily. Adam (Adaptive Moment Estimation) [43] is an algorithm that combines momentum with the concept of RMSProp by applying decaying averages of the gradients. The formulas for calculating Ah, the adaptive term, and Mh, the momentum term, are as shown in (5).
α is the adaptive term decay rate, and β is the momentum decay rate. In this study, the SGD, AdaGrad, RMSProp, and Adam optimizers described above are each applied, and their accuracies are compared with one another.
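The four update rules compared in this study can be sketched in a few lines of Python. The sketch below uses the commonly cited textbook forms of each rule, which may differ in notation or detail from (1) to (5). The function names, the toy loss f(x) = x², the hyperparameter values, and the bias correction in the Adam step are all assumptions made for illustration, not the configuration used in the experiments.

```python
import math


def sgd_step(x, grad, state, lr=0.1):
    """SGD: x <- x - lr * grad."""
    return x - lr * grad


def adagrad_step(x, grad, state, lr=0.1, eps=1e-8):
    """AdaGrad: h accumulates squared gradients, so the step shrinks over time."""
    state["h"] = state.get("h", 0.0) + grad * grad
    return x - lr * grad / (math.sqrt(state["h"]) + eps)


def rmsprop_step(x, grad, state, lr=0.1, rho=0.9, eps=1e-8):
    """RMSProp: h is an exponentially weighted moving average of squared gradients."""
    state["h"] = rho * state.get("h", 0.0) + (1.0 - rho) * grad * grad
    return x - lr * grad / (math.sqrt(state["h"]) + eps)


def adam_step(x, grad, state, lr=0.1, alpha=0.999, beta=0.9, eps=1e-8):
    """Adam: momentum term Mh and adaptive term Ah, with bias correction (assumed)."""
    t = state["t"] = state.get("t", 0) + 1
    state["Mh"] = beta * state.get("Mh", 0.0) + (1.0 - beta) * grad
    state["Ah"] = alpha * state.get("Ah", 0.0) + (1.0 - alpha) * grad * grad
    m_hat = state["Mh"] / (1.0 - beta ** t)
    a_hat = state["Ah"] / (1.0 - alpha ** t)
    return x - lr * m_hat / (math.sqrt(a_hat) + eps)


OPTIMIZERS = {
    "SGD": sgd_step,
    "AdaGrad": adagrad_step,
    "RMSProp": rmsprop_step,
    "Adam": adam_step,
}


def minimize(step, x0=5.0, steps=200):
    """Run one optimizer on the toy loss f(x) = x**2, whose gradient is 2x."""
    x, state = x0, {}
    for _ in range(steps):
        x = step(x, 2.0 * x, state)
    return x


if __name__ == "__main__":
    for name, step in OPTIMIZERS.items():
        print(f"{name:8s} final x = {minimize(step):.6f}")
```

On this toy loss, AdaGrad approaches the minimum more slowly than the others because its accumulated h only grows. This is the behavior described above for a learning rate that is reduced as learning progresses.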

C. CONFIGURATION OF EXPLAINABLE FEATURE MAP THROUGH IMAGE CONVERSION
A medical professional must be able to give sound reasons for a final medical judgment. Explanations of an artificial neural network's judgment take diverse forms, such as text explanation, visual explanation, partial explanation, explanation by example, explanation by simplification, and function-related explanation [18]. Among these, visual explanation is the most suitable for conveying the complicated interactions within the model to users who are unfamiliar with neural networks. XAI methods such as LIME, which are frequently used for visual explanation, mainly display object shapes on a CNN. Such results are effective for classifying objects, but are of limited benefit to a medical system that analyzes particular patterns. In this study, LIME's basic idea is supplemented and its functional form is expanded. The basic concept is to reverse the input value of the X-ray image position by position and to detect the resulting changes in the neural network's output nodes. The detection results are delivered to the expert in image form as a visual explanation, which conveys the grounds for the judgment to the medical expert.
To do so, the ResNet structure is first prepared and trained. An X-ray image requiring judgment is then fed into the trained neural network, and MaxN, the value of the node having the greatest value among the final classification output nodes, is recorded. Next, each pixel of the image is reversed in sequence, and MaxM, the maximum value of the output nodes after the reversal, is compared with MaxN. The formula for calculating the PF (Positive Factor) value of pixel P is as shown in (6).
MaxN is the value of the node having the maximum value before the reversal, and MaxM is the maximum value after reversing the pixel values of the input image. PF is the degree of positive influence that a pixel has on the disease judgment. ReLU (Rectified Linear Unit) outputs 0 when the input value is smaller than 0 and outputs the input unchanged when it is greater than 0. Through this formula, it becomes possible to mark on the image the positions that serve as the reason for diagnosing the disease, so that the medical expert can visually understand the shapes associated with disease occurrence. Meanwhile, the formula for NF (Negative Factor), the degree of influence of a pixel having a negative influence, is as shown in (7).
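One form of (6) and (7) that is consistent with the definitions above is given here as an assumption, not as the authors' exact notation. If reversing pixel P lowers the maximum output, the pixel supported the judgment and contributes to PF. If the reversal raises the maximum output, the pixel worked against the judgment and contributes to NF.

```latex
\begin{align}
PF(P) &= \mathrm{ReLU}\left(Max_N - Max_M\right) \\
NF(P) &= \mathrm{ReLU}\left(Max_M - Max_N\right)
\end{align}
```

Under this reading, at most one of PF(P) and NF(P) is nonzero for any pixel, since the two ReLU arguments have opposite signs.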
The PF or NF results are saved on the feature map at the position of the changed pixel. By converting the feature map's result values into image form, the influence that changes in the data at particular X-ray points have on the result values can be confirmed. To display the final results as an image, the PF and NF values are each normalized separately and converted to brightness levels. The pixel expression formula for this process is as shown in (8).
Pix is the previous brightness level of the feature map, and Pix_new is the new brightness. Pix_max is the maximum brightness value of the previous feature map, and Pix_min is its minimum brightness value. α is the brightness level, and β is the basic background brightness. The brightness can be adjusted to suit the service environment, and this formula makes the feature map results easier to read. With this method, the importance of each input position is expressed by the size of the change in the model's prediction value caused by changing the input data at that position. Because the result can be expressed visually, it contributes significantly to interpreting the intelligent image analysis results. In a disease diagnosis process, the explainable feature map image of the PF results serves as the reason for disease judgment. In addition, for a disease where the external shape is important, the NF result takes a high value at points where the disease could improve, so it can be used for purposes such as confirming the status of recovery from the disease.
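The brightness conversion described above can be sketched as a min-max rescaling. The sketch assumes that (8) has the form Pix_new = α (Pix − Pix_min) / (Pix_max − Pix_min) + β, which matches the roles given to α and β but is not confirmed by the text. The function name `rescale_brightness` and the default values of α and β are hypothetical.

```python
def rescale_brightness(feature_map, alpha=200.0, beta=55.0):
    """Map feature-map values to display brightness (assumed form of (8)).

    feature_map: 2D list of PF or NF values.
    alpha: brightness range added on top of the background (assumed default).
    beta: basic background brightness (assumed default).
    """
    values = [v for row in feature_map for v in row]
    pix_min, pix_max = min(values), max(values)
    span = pix_max - pix_min
    if span == 0:
        # A flat map carries no contrast, so show only the background level.
        return [[beta for _ in row] for row in feature_map]
    return [[alpha * (v - pix_min) / span + beta for v in row]
            for row in feature_map]


if __name__ == "__main__":
    pf_map = [[0.0, 0.5], [1.0, 0.25]]
    for row in rescale_brightness(pf_map):
        print(row)
```

With α = 200 and β = 55, the smallest value in the map is shown at brightness 55 and the largest at 255, so the PF and NF maps can be normalized separately and still share one display range.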

D. EXPLAINABLE FEATURE MAP's IMAGE EXPRESSION
Python (Ver 3.8.3) is used to implement the ResNet, as it is a programming language in which diverse types of neural network structures can be conveniently implemented. The main libraries are TensorFlow (Ver 2.3.0) and PyTorch (Ver 1.6.0), both open-source platforms for machine learning. TensorFlow uses nodes and edges to implement data calculation. PyTorch, developed by the Facebook AI Research Group, has a simple structure and allows rapid implementation. In addition, the Matplotlib package was used for image visualization, the NumPy package for numerical calculation, and the pickle package for saving the neural network.
The 1,800 pre-processed training images are fed into the ResNet prepared as designed. Random padding is applied to the training images so that their positions are used more diversely and flexibly, and a 3 × 3 filter is used with the stride set to 1. The images are processed in batches of 16, and the accuracy is checked repeatedly 300 times. The CNN has a multi-step configuration of ResNet layers, and batch normalization is used between the layers in which a residual block is used. Meanwhile, it is necessary to compare the changes in accuracy resulting from the selection of the neural network learning optimization function within the configured ResNet structure. To do so, the optimization function is configured separately as SGD, AdaGrad, RMSProp, and Adam. The overall evaluation process is repeated for each of them, and the accuracies are compared with one another.
Once the ResNet is prepared, it is necessary to implement a judgment support function for experts. To do so, an X-ray image in which disease prevalence has been confirmed is first selected. The final output node value of the selected image is saved as the standard value, and an explainable feature map having the same size as the image is prepared. Then, the selected image is reversed in sequence in a pixel unit, and the changes in the standard value are compared. The progress is as shown in Figure 4. The explainable feature map, which stores the change in the standard value at each position, can then be rendered as an image by mapping the stored values to brightness. However, when the reversal is performed in a pixel unit, the pattern characteristics generated on layers close to the input layer appear strongly in the results, and the influence on the output node is not clearly expressed [44]. Although such results contribute to understanding the characteristics of the neural network, they may convey unnecessary meanings to a medical expert. Therefore, to provide smoother results, the size of the reversed pixel block is adjusted, and block sizes of 1 × 1 and 2 × 2 are compared.
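The procedure above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: "reversing" a pixel is taken to mean replacing an intensity v in [0, 1] with 1 − v, the network is represented by any callable that returns a list of output node values, and the names `explainable_feature_maps`, `toy_model`, and `WEIGHTS` are hypothetical. The toy model stands in for the trained ResNet only so that the sketch is self-contained.

```python
def explainable_feature_maps(image, model, patch=1):
    """Build PF and NF maps by reversing one patch of pixels at a time.

    image: 2D list of intensities in [0, 1] (assumed scaling).
    model: callable taking an image and returning a list of output node values.
    patch: side length of the reversed block (1 for 1 x 1, 2 for 2 x 2).
    """
    height, width = len(image), len(image[0])
    max_n = max(model(image))  # standard value before any reversal
    pf_map = [[0.0] * width for _ in range(height)]
    nf_map = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            perturbed = [row[:] for row in image]
            for r in rows:
                for c in cols:
                    perturbed[r][c] = 1.0 - image[r][c]  # assumed reversal
            max_m = max(model(perturbed))
            delta = max_n - max_m
            for r in rows:
                for c in cols:
                    pf_map[r][c] = max(delta, 0.0)   # output dropped
                    nf_map[r][c] = max(-delta, 0.0)  # output rose
    return pf_map, nf_map


# Hypothetical stand-in for a trained network: two output nodes, the first
# a weighted sum of the pixels and the second fixed at zero.
WEIGHTS = [[0.5, 0.25], [0.0, 0.0]]


def toy_model(image):
    score = sum(w * p for w_row, p_row in zip(WEIGHTS, image)
                for w, p in zip(w_row, p_row))
    return [score, 0.0]


if __name__ == "__main__":
    xray = [[1.0, 0.0], [1.0, 0.0]]
    pf, nf = explainable_feature_maps(xray, toy_model, patch=1)
    print("PF:", pf)
    print("NF:", nf)
```

The model is evaluated once per patch, so a 2 × 2 patch needs roughly a quarter of the forward passes of the 1 × 1 setting on the same image, in addition to giving a coarser map.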

IV. RESULT AND PERFORMANCE EVALUATION
In general, an artificial neural network shows diverse evaluation results depending on the shape of the configured layers, the selection of specific functions, the learning rate, etc. Therefore, it is necessary to vary the internal configuration and to model, evaluate, and test each variant [45]. Through such a process, it is possible to configure the model demonstrating the highest accuracy. In addition, the operation quantity of a CNN-based image analysis model increases rapidly with the number of nodes used in the design. It is therefore also important to complete the learning process within a practical learning time, and a configuration with an adequate ratio of accuracy to operation quantity must be chosen.

A. ACCURACY EVALUATION
The accuracy of the configured neural network is evaluated. To do so, hardware consisting of an i5-9500 3.00 GHz CPU (Intel®, Santa Clara, California, USA) and 16 GB of memory is used. The SGD, AdaGrad, RMSProp, and Adam functions are applied, and 'Accuracy' is used as the evaluation measure. Accuracy refers to the correct answer rate over all test results, and it is the most commonly used measure because it is intuitive. The accuracy is calculated as shown in (9). TP is a case that is actually positive and is predicted as positive, and FP is a case that is actually negative but is predicted as positive. FN is a case that is actually positive but is predicted as negative, and TN is a case that is actually negative and is predicted as negative.

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{9}$$

The accuracy is evaluated by applying SGD, AdaGrad, RMSProp, and Adam. Training is run for 160 epochs, and the accuracy and the occurrence of overfitting are checked. The evaluation results for SGD and Adam are as shown in Figure 5. SGD and Adam both reach a result close to 80% after 50 epochs, and no further improvement is seen after 100 epochs. In the case of Adam in particular, overfitting appears to occur. The evaluation results for AdaGrad and RMSProp are as shown in Figure 6. AdaGrad and RMSProp both show a result of approximately 75% after 100 epochs, with no further improvement afterwards. Because the overall curve of RMSProp is unstable, putting it to practical use would require caution. The highest final accuracy was achieved with Adam and SGD, and an accuracy above 75% was reached after 60-80 epochs of training. These results show that not only the required amount of learning but also the accuracy differs according to the selected function. Selecting a more recent optimization function such as RMSProp does not always lead to good results. Therefore, when configuring a neural network in the future, such factors must be taken into consideration according to the data type.
As a result, SGD, with an accuracy of approximately 80%, appears usable to a certain extent in an actual situation.
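The accuracy measure in (9) can be computed directly from the four counts. The sketch below is a minimal illustration; the function names and the example labels are hypothetical and are not taken from the experiments reported here.

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP, FN for binary labels (True means disease present)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    tn = sum(1 for a, p in zip(actual, predicted) if not a and not p)
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / (TP + TN + FP + FN), as in (9)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one test case is required")
    return (tp + tn) / total


if __name__ == "__main__":
    # Illustrative labels only.
    actual = [True, True, False, False, True]
    predicted = [True, False, False, True, True]
    print(accuracy(*confusion_counts(actual, predicted)))
```

Because accuracy weights all four outcomes equally, it can look favorable on an imbalanced test set even when FN is high, so the individual counts are worth reporting alongside it.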

B. INTERNAL INTERPRETATION USING EXPLAINABLE FEATURE MAP
Due to the particularities of medical data judgment, a slightly incorrect result may lead to a major follow-up problem when applied to actual individual patients. For this reason, a system of this type is in practice used mainly as an expert assistance system, and the neural network must be interpreted internally and the interpretation combined with its results. An explainable feature map can be used to visually express the part that has an influence on disease judgment. The part that has a positive influence on disease judgment is classified as a positive factor, and has a positive value on the feature map. The part that has a negative influence is classified as a negative factor, and has a negative value. The image implemented using such factors is as shown in Figure 7. In the upper image of Figure 7, the overall contour and heart appearance serve as important factors for cardiomegaly judgment, and they become clearer as the learning quantity increases. This is consistent with the way cardiomegaly [31] is actually diagnosed, namely by measuring the cardio-thoracic ratio, and it suggests that the model operates normally. The lower image shows the part that has a negative influence on disease judgment. Depending on the type of disease, this factor image can be used as the standard for determining whether or not the patient has recovered.
When the positive factor image is provided to a medical expert, smoother results can be obtained by changing the size of the reversed pixels. Figure 8 shows the results for the two input types, 1 × 1 and 2 × 2. As shown in Figure 8, when input pixels set as 1 × 1 are used, the pattern shape is projected on the image and the results are neither smooth nor clear. When input pixels set as 2 × 2 are used, the same image becomes smooth and clear. The finally printed pixel pattern may vary depending on the shape of the neural network, so the size is chosen empirically. In general, input pixels having a size of 2 × 2 produced a smooth shape. Figure 9 shows the finally configured cardiomegaly diagnosis impact analysis result, obtained by combining all the explainable feature map factors that influence the ResNet results on a red image. It shows that the ResNet's learning is focused on the heart. Such a result supports the medical expert by directing attention mainly to the areas where red colors are concentrated. Through the image shape, the medical expert can understand the process and meaning of the disease judgment made by the neural network. In addition, information relating to disease position, progress status, and severity status can be supplemented.

V. CONCLUSION
This study proposed a mutually complementary system that combines a ResNet-based deep learning algorithm with a supporting algorithm, the explainable feature map. Through the proposed model, the accuracy of cardiomegaly judgment can be increased and the neural network's internal information can be provided. Through such a process, high accuracy and transparency are provided to disease judgment. In particular, by using chest X-ray imaging data, it becomes possible to use a low-priced, universally executable test to judge cardiomegaly and deliver the reason for judgment.
To implement a diagnosis support model of cardiomegaly, a large number of X-ray images are used to learn cardiomegaly and judge the status of newly added patients. To do so, a CNN-based ResNet is used, and, by comparing the accuracy results achieved with different neural network learning optimization functions, an effective algorithm for cardiomegaly judgment was identified. In addition, by using a separate input data conversion technique, the explainable feature map functions were implemented and combined. To evaluate the performance of the proposed model, the accuracy results achieved with the different optimization functions were compared. The comparison confirmed that SGD showed an accuracy close to 80% after 60-80 epochs of training, so that a significant accuracy value was secured. In addition, the explainable feature map image obtained by partially converting the input data makes it possible to confirm whether or not the direction of learning within the actual neural network is correct. Through such a process, it was possible to provide convincing grounds for the judgment results.
As described, the diagnosis of cardiomegaly using a CNN and chest X-ray imaging is low-priced, is highly accessible, and provides highly accurate results through AI algorithms. In addition, the results show that a very effective support system can be constructed when the explainable feature map confirms that the judgment standards applicable to diseases such as cardiomegaly are consistent with the organ position and shape shown on the X-ray image. By presenting visual grounds that explain the decision process, the model allows a medical expert to have confidence in it and to obtain highly satisfying results. Such an analysis process and method supplement the insufficient accuracy of judgment results that depend solely on a CNN. In particular, since the input data conversion technique is applicable to diverse algorithms regardless of the type of artificial intelligence and is visually expressible, it is useful for similar diseases that are judged based on appearance. In addition, it may help researchers analyze the internal structure of a neural network, and it serves as a basis for improving that structure. However, to implement a more realistic model, the resolution and quality of the learning data must be improved, and various types of learning data according to the race, age, and sex of cardiac hypertrophy patients, learning at the scale of big data, and the related evaluations are required. The learning model also needs to be extended to a deeper neural network design, and further study on the adjustment of detailed hyperparameters is needed. Combining an intelligence system requiring an explanation with a medical support system is expected to become an essential process in the future development of medical intelligence systems.