Image Pattern Recognition Combined With Data Mining for Diagnosis and Detection of Myocardial Infarction

How to apply data analysis algorithms in China's primary hospitals remains an open problem. To explore the pathogenesis of myocardial infarction, this study collected a large amount of real data through a data survey as the basis for analysis, improved traditional cluster analysis and data mining methods, and proposed data mining methods suited to myocardial infarction. In addition, this study analyzes the data sources by implementing clustering algorithms and combines them with data mining algorithms to provide decision support information for disease research. Finally, experiments combining image data analysis with data mining are carried out and the results are recorded. The research shows that the proposed algorithm is feasible and can provide a theoretical reference for subsequent related research.


I. INTRODUCTION
Cardiovascular disease seriously harms human health worldwide. Recent studies have shown that its morbidity and mortality are increasing in developing countries. It is estimated that by 2020, more than 80% of global cardiovascular disease will occur in developing countries. Due to accelerating urbanization, China and India will bear the brunt [1].
Medical data mining started early abroad, especially in the United States, which has long held a leading position, and many of the systems developed there already serve medical care well. For example, American scientists have developed a system called REMIND, which is designed to quickly identify, from a vast number of cases, patients who need a pacemaker implanted as soon as possible. Previously, doctors could only check case records or physical examination reports manually to determine whether a patient needed a pacemaker [2], which was time-consuming and laborious and could easily delay treatment. The REMIND system was deployed at a heart center in South Carolina with the aim of finding people at risk of sudden cardiac events among 61,027 patients. REMIND and medical experts searched and analyzed the patient records separately, and the results were very encouraging: the agreement between REMIND and the experts was 94%, the sensitivity was 99%, the specificity was 90%, and the efficiency was very high.
The associate editor coordinating the review of this manuscript and approving it for publication was Zhihan Lv.
In the past, doctors spent only a short time viewing a patient's electrocardiogram and did not compare it with previous data, so they lacked predictive judgment about the recurrence of heart disease. Now, through machine learning and data mining, models can be built from accumulated data to find high-risk indicators [3]. In addition, other reports show that introducing medical big data analysis would generate $300 billion in value for the United States, about 8% of US national health care spending [4].
In China, medical data mining started relatively late, but it has developed rapidly and achieved good results. For example, in 2014, Zheng Guang, He Xiaoying, and others at the Institute of Basic Research in Traditional Chinese Medicine of the China Academy of Chinese Medical Sciences disclosed a method and system for mining "disease-syndrome-symptom-Chinese medicine-western medicine" relationships in the Chinese biomedical literature database. The method comprises: (1) constructing a structured sensitive keyword database; (2) downloading unstructured topic data, converting it, and storing it in a local structured document database; (3) mining and labeling texts related to the sensitive keywords in the database; and (4) removing noise from the mined data and revising the mining results accordingly to obtain the final data mining results. It solves the technical problem that, in the prior art, data mining of diseases, syndromes, symptoms, traditional Chinese medicines, and western medicines could not be carried out on the Chinese biomedical literature database [5].
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
In 2015, Sun Changkai, Zhou Xuezhong, and others from Dalian Medical University announced a system and method for targeted combination therapy analysis of complex brain diseases. The method combines modern data mining, complex network analysis, cloud computing, and big data technology to dynamically query, analyze, and calculate potential drug targets for different brain diseases. It addresses the long data processing times, long research cycles, lack of systematization, and single research targets that are common in medical research. At the same time, the method draws on authoritative foreign medical databases to systematize, streamline, and standardize research on disease treatment methods. In addition, it helps reduce the cost of treatment analysis research and expands the application of advanced computer technology in the medical field [6].
Some studies have shown that left ventricular dysfunction detected by echocardiography shortly after myocardial infarction is an important indicator for evaluating prognosis. Non-invasive assessment of left ventricular function has therefore become an important means of risk stratification after acute myocardial infarction. However, little attention has been paid to right ventricular function after acute myocardial infarction and its clinical significance. Studies have confirmed that, regardless of the degree of left ventricular dysfunction, right heart failure after acute myocardial infarction is closely related to poor prognosis [7]. The right ventricle has an irregular, asymmetrical structure, roughly crescent-shaped, which makes it difficult to measure its volume and function with a fixed model or formula. For this reason, two-dimensional and M-mode ultrasound are rarely used for this measurement in clinical practice [8]. Real-time three-dimensional echocardiography (RT-3DE), by contrast, does not rely on geometric assumptions and shows the true shape and size of the ventricle while measuring cardiac function. It provides more valuable quantitative information than two-dimensional ultrasound for the diagnosis and treatment of heart disease [9] and can accurately quantify cardiac indicators in three-dimensional space. One study that used RT-3DE to measure right ventricular volume found that the results correlated well with the volume parameters measured by MRI and were more accurate than those of two-dimensional ultrasound [10]. Therefore, this study applies real-time three-dimensional echocardiography.
Although China has made great progress in medical data mining, doctors in most grassroots hospitals still lag behind in data analysis because the topic receives limited attention and economic resources are constrained. How to apply data analysis algorithms in China's primary hospitals therefore remains a problem to be solved. Based on this, this study applies big data mining methods to myocardial infarction data and combines them with image analysis data.

II. RESEARCH METHOD
A. ALGORITHM RESEARCH
The Elman network is a dynamic local feedback network. A feedback network differs from a feedforward network in that its signal is not transmitted in one direction only: the output signal can be fed back to the input through a feedback connection. The Elman neural network is not a feedback network in the full sense. After passing through the hidden layer, its signal is not transmitted back to the input but is passed to an additional unit, which we call the "connection layer". The Elman neural network therefore has a connection layer in addition to the input layer, the hidden layer, and the output layer. The connection layer memorizes the output value of the hidden layer units at the previous time step and can be regarded as a delay operator, which gives the network a dynamic memory function. The main structure of the network is shown in Fig. 1 [11], where I denotes the input layer, J the hidden layer, L the connection layer, and O the output layer. $X$ is the r-dimensional input vector, $y$ is the m-dimensional output vector, $Y_J$ is the n-dimensional hidden layer output vector, and $Y_L$ is the n-dimensional connection layer output vector. $w^{y}_{oj}$, $w^{x}_{ji}$, and $w^{c}_{jl}$ are the weights from the hidden layer to the output layer, from the input layer to the hidden layer, and from the connection layer to the hidden layer, respectively; $g(\cdot)$ is the activation function of the output neurons, $f(\cdot)$ is the activation function of the hidden layer neurons, and $k$ is the number of iterations. The calculation formulas of the Elman neural network model follow [12]. The network error is used to judge whether the network iteration ends, with the error function defined as in [13], and the weight-update rules are derived as in [14]. In these update rules, $\eta_1$, $\eta_2$, $\eta_3$ are the learning steps of $w^{c}_{jl}$, $w^{x}_{ji}$, $w^{y}_{oj}$, respectively, with $i = 1, 2, \cdots, m$; $j = 1, 2, \cdots, n$; $q = 1, 2, \cdots, r$.
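To illustrate the structure described above, the following is a minimal sketch of the forward pass of an Elman network in Python with NumPy. It assumes a commonly used form of the Elman state equations, $Y_L(k) = Y_J(k-1)$, $Y_J(k) = f(w^{x} X(k) + w^{c} Y_L(k))$, and $y(k) = g(w^{y} Y_J(k))$, with $f = \tanh$, $g$ the identity, and a sum-of-squares error. The layer sizes, random initialization, and function names are hypothetical and are not taken from this study.

```python
import numpy as np


def elman_forward(X_seq, w_x, w_c, w_y):
    """Run an Elman network over a sequence and return the outputs.

    X_seq: array of shape (T, r), one r-dimensional input per time step.
    w_x:   (n, r) input-to-hidden weights.
    w_c:   (n, n) connection-layer-to-hidden weights.
    w_y:   (m, n) hidden-to-output weights.
    """
    n = w_x.shape[0]
    y_l = np.zeros(n)  # connection layer holds zeros before the first step
    outputs = []
    for x in X_seq:
        y_j = np.tanh(w_x @ x + w_c @ y_l)  # hidden layer, f = tanh (assumed)
        outputs.append(w_y @ y_j)           # output layer, g = identity (assumed)
        y_l = y_j                           # remember Y_J for the next time step
    return np.array(outputs)


def sum_squared_error(y_pred, y_true):
    """Error E = 1/2 * sum((d - y)^2), a common choice (assumed here)."""
    return 0.5 * float(np.sum((y_true - y_pred) ** 2))


rng = np.random.default_rng(0)
r, n, m, T = 3, 4, 2, 5  # hypothetical layer sizes and sequence length
w_x = rng.normal(scale=0.1, size=(n, r))
w_c = rng.normal(scale=0.1, size=(n, n))
w_y = rng.normal(scale=0.1, size=(m, n))
X_seq = rng.normal(size=(T, r))
y_seq = elman_forward(X_seq, w_x, w_c, w_y)
```

Because the connection layer simply copies the previous hidden output, the output at step k depends on all earlier inputs, which is the dynamic memory property described in the text.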
In addition, this paper analyzes the data source by implementing a cluster analysis algorithm, divides the sample data according to rules based on age, gender, laboratory indices, life history, and so on, and defines high-risk groups. Finally, the paper provides decision support information for disease research through two methods, association rule mining and cluster analysis [15].
(1) The Apriori algorithm is used to mine association rules among multiple physiological indicators of patients with acute myocardial infarction, such as age, gender, blood pressure, heart rate, adverse life history, and accompanying diseases. The set of records for each patient is treated as one transaction, and each record is an item of that transaction. An item is counted once for each transaction in which it appears, so the problem is one of Boolean association rules. The Apriori algorithm is therefore the first choice, and multi-angle analysis is performed by varying the condition patterns [16].
(2) The K-means algorithm is used to analyze multiple indicators of patients with myocardial infarction, such as age, gender, life history, and disease history. The data are partitioned by age, the number of patients in each resulting interval is counted, and patients who meet the conditions of the identified interval are regarded as the high-risk group [17].
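The Boolean association mining in step (1) can be sketched as follows. This is a minimal, unoptimized Apriori implementation in pure Python, not the implementation used in this study; the patient attributes, the example transactions, and the support threshold are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset (frozenset): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count single items, once per transaction in which they appear.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)
    k = 2
    while frequent:
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        prev = list(frequent)
        candidates = {a | b for a, b in combinations(prev, 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical patient transactions: each item is a Boolean attribute.
patients = [
    {"age>=60", "hypertension", "smoker"},
    {"age>=60", "hypertension"},
    {"hypertension", "smoker"},
    {"age>=60", "hypertension", "diabetes"},
]
itemsets = apriori(patients, min_support=3)

# Confidence of the rule "age>=60 -> hypertension".
pair = frozenset({"age>=60", "hypertension"})
confidence = itemsets[pair] / itemsets[frozenset({"age>=60"})]
```

The confidence of a rule is the support of the full itemset divided by the support of its antecedent, so rules can be ranked once the frequent itemsets are known.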
Because this paper analyzes a sample of the data rather than the full data set, erroneous records are few and can be corrected manually by comparison. The extracted information is compared with the blood pressure and heart rhythm diagnoses taken from the HIS system, and any inconsistent content is corrected. The purpose of this step is to eliminate data noise and improve the accuracy of the data mining results. In addition, if body temperature, heart rate, or respiratory signs show abnormal values that are not indicated in the diagnosis, the diagnosis should be supplemented as required. For patients with normal blood pressure, it is necessary to determine from the actual situation whether this value reflects a drop in blood pressure at onset in a patient with hypertension [18].
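The consistency check described above can be sketched in a few lines of Python. The record layout, field names, and patient identifiers below are hypothetical; the sketch only flags mismatches for manual review and does not correct them.

```python
def find_inconsistencies(extracted, his, fields=("blood_pressure", "heart_rhythm")):
    """Return (patient_id, field, extracted_value, his_value) for every mismatch."""
    issues = []
    for pid, record in extracted.items():
        reference = his.get(pid, {})
        for field in fields:
            # Only compare fields that the HIS record actually contains.
            if field in reference and record.get(field) != reference[field]:
                issues.append((pid, field, record.get(field), reference[field]))
    return issues


# Hypothetical values extracted from case text and from the HIS system.
extracted = {
    "P001": {"blood_pressure": "hypertension", "heart_rhythm": "sinus rhythm"},
    "P002": {"blood_pressure": "normal", "heart_rhythm": "atrial fibrillation"},
}
his = {
    "P001": {"blood_pressure": "hypertension", "heart_rhythm": "sinus rhythm"},
    "P002": {"blood_pressure": "hypertension", "heart_rhythm": "atrial fibrillation"},
}
issues = find_inconsistencies(extracted, his)
```

Each flagged tuple identifies one record for a clinician to review, which matches the manual correction step described in the text.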

III. RESULTS
A GE Vivid E9 color Doppler ultrasound system was used with an M5S-D two-dimensional probe and a 4V-D three-dimensional cardiac volume probe. The subject lay in the left lateral position, breathed calmly, and was connected to a synchronous electrocardiogram lead. After a conventional two-dimensional examination with the two-dimensional probe, the standard apical four-chamber view was switched to tissue Doppler imaging (TDI) spectral Doppler mode. The sampling line was passed through the junctions of the tricuspid annulus with the right ventricular free wall and with the septum, and through the basal, middle, and apical segments of the right ventricular free wall. The five motion velocity curves at these sites were then recorded at a sweep speed of 50 to 100 mm/s. Next, the 4D probe was used to obtain a standard apical four-chamber view, and 4D full-volume imaging was started in multi-cardiac-cycle mode, ensuring that the walls of the right heart chambers were fully contained in the full-volume image. Finally, settings such as the sector angle were adjusted so that the frame rate was at least 40% of the patient's heart rate, and once no stitching misalignment was present, real-time three-dimensional full-volume images of 4 to 6 cardiac cycles were acquired continuously. All cine loops were stored on the instrument's hard drive and saved to DVD after analysis (see Fig. 2). The right ventricular Tei index is calculated by the formula Tei index = (a − b) / b (see Fig. 3). Furthermore, the TDI spectral Doppler method was used to measure the early diastolic peak velocity (Em), the atrial systolic peak velocity (Am), and the systolic peak velocity (Sm) in the basal, middle, and apical segments of the right ventricular free wall.
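As a worked illustration of the Tei index formula above, the following Python sketch computes the index from two time intervals. The intervals a and b are those defined in Fig. 3. The comments assume the usual tissue Doppler convention, in which a is the interval from the end of the Am wave to the onset of the next Em wave and b is the duration of the Sm wave. The numeric values are hypothetical.

```python
def tei_index(a_ms, b_ms):
    """Tei index = (a - b) / b, with both intervals in milliseconds.

    a_ms: interval a (assumed: end of Am to onset of the next Em).
    b_ms: interval b (assumed: duration of Sm, i.e. ejection time).
    """
    if b_ms <= 0:
        raise ValueError("interval b must be positive")
    if a_ms < b_ms:
        raise ValueError("interval a must not be shorter than interval b")
    return (a_ms - b_ms) / b_ms


# Hypothetical measurements: a = 400 ms, b = 280 ms.
example_tei = tei_index(400.0, 280.0)
```

Because a contains b plus the two isovolumic intervals, the index is the ratio of total isovolumic time to ejection time, so it rises when either systolic or diastolic function deteriorates.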
In the instrument's built-in real-time three-dimensional ventricular volume measurement system, the midpoint of the tricuspid annulus and the endocardial surface of the right ventricular apex are clicked manually on the four-chamber view at end-diastole and at end-systole, and the analysis software then tracks the endocardium and epicardium automatically frame by frame. The automatically delineated endocardium and epicardium can be adjusted manually if necessary. The software automatically calculates right ventricular end-diastolic volume (RVEDV), right ventricular end-systolic volume (RVESV), and right ventricular ejection fraction (RVEF). All of the above work was performed by two senior cardiac sonographers who were blinded to the coronary angiography results, and the data were averaged over three consecutive cardiac cycles [19].
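The relationship between the three reported quantities can be sketched as follows. The sketch assumes the standard definition RVEF = (RVEDV − RVESV) / RVEDV × 100% and averages the per-cycle values over three cardiac cycles. The function names and the volumes are hypothetical, and the vendor software may compute or average these values differently.

```python
from statistics import mean


def ejection_fraction(edv_ml, esv_ml):
    """Ejection fraction in percent from end-diastolic and end-systolic volumes."""
    if edv_ml <= 0 or not 0 <= esv_ml <= edv_ml:
        raise ValueError("volumes must satisfy 0 <= ESV <= EDV and EDV > 0")
    return (edv_ml - esv_ml) / edv_ml * 100.0


def mean_rvef(cycles):
    """Average the per-cycle RVEF over a list of (RVEDV, RVESV) pairs."""
    return mean(ejection_fraction(edv, esv) for edv, esv in cycles)


# Hypothetical (RVEDV, RVESV) pairs in mL for three consecutive cardiac cycles.
cycles = [(100.0, 50.0), (110.0, 55.0), (90.0, 45.0)]
rvef = mean_rvef(cycles)
```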
The images are then transformed by image processing. Fig. 2 and Fig. 3 show the conversion process, which can be summarized in the form shown in Fig. 4.
After that, the global radial, circumferential, and longitudinal strain of the left ventricle are calculated automatically by the software's built-in strain analysis function. Left ventricular function analysis uses CVI Short 3D. The end-systolic and end-diastolic endocardium and epicardium are delineated manually on the short-axis views of the left ventricle, and the software automatically derives cardiac function parameters such as ejection fraction and wall thickening rate. Infarct volume in patients with myocardial infarction was calculated using CVI Tissue Char. The left ventricular endocardium, epicardium, infarcted area, and normal myocardium were delineated manually, and the infarct volume was calculated by the software [20].
Based on the age attribute of the data objects, the patients were first grouped by age in ten-year intervals. The results are shown in Tab. 1 [21]. Tab. 1 shows that equal-width grouping by age does not reflect the age distribution of the study population well: the differences between groups are not obvious, and the accuracy of any conclusions drawn is greatly reduced. The same data source was therefore re-divided using the k-means algorithm. The results are shown in Tab. 2 [22]. Tab. 2 shows that patients under the age of 43 and those over 68 make up the smallest proportions. According to previous studies on the causes of myocardial infarction, the disease is often related to bad habits (a diet high in salt, sugar, and oil), and physical exertion is often a trigger. Patients between the ages of 40 and 60 have better physical strength than those over the age of 70 and are therefore more likely to engage in activity that triggers an attack.
The diagnosis results for blood pressure, blood lipids, and blood sugar were then introduced and clustered, with Z denoting blood lipids, Y blood pressure, and T blood sugar [23]. Tab. 3 shows that the number of patients between the ages of 50 and 70 is the highest and that blood pressure and blood lipids have the greatest impact on the disease. From the living environment and living habits of this group, whose diet is generally high in salt and oil, we can infer that people in the 50-70 age group had a low quality of life when they were young, ate too few vegetables and too little protein, and relied heavily on pickled foods. After they entered middle age, the quality of life rose rapidly, the diet became more abundant, compensatory overconsumption of meat was common, and the high-salt diet had already become a habit. This group therefore has the largest number of patients, and its main indicators are blood pressure and blood lipids.
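The difference between equal-width age groups and k-means age groups can be illustrated with a small one-dimensional example. The following Python sketch implements Lloyd's k-means iteration on ages with NumPy. The ages, the number of clusters, and the quantile-based initialization are assumptions made for illustration and do not reproduce the data or the results in Tab. 1 and Tab. 2.

```python
import numpy as np


def kmeans_1d(values, k, n_iter=100):
    """Cluster scalar values with Lloyd's algorithm; return (labels, centers)."""
    x = np.asarray(values, dtype=float)
    # Deterministic start: centers at evenly spaced quantiles (an assumption).
    centers = np.quantile(x, (np.arange(k) + 0.5) / k)
    for _ in range(n_iter):
        # Assignment step: each value joins its nearest center.
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        # Update step: each center moves to the mean of its members.
        new_centers = np.array([
            x[labels == j].mean() if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Hypothetical patient ages.
ages = [35, 38, 41, 52, 54, 55, 57, 58, 60, 61, 63, 72, 75, 78]
labels, centers = kmeans_1d(ages, k=3)
for j in range(3):
    group = [a for a, lab in zip(ages, labels) if lab == j]
    print(f"cluster {j}: ages {min(group)}-{max(group)}, n={len(group)}")
```

Unlike fixed ten-year bins, the cluster boundaries here follow the gaps in the data, so a dense age range is kept together as one group instead of being split at an arbitrary decade boundary.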
In addition, this clustering approach also provides some outlier analysis. As the results show, the patients aged 31 to 47 span a wide age range but are few in number. According to the pathogenesis of acute myocardial infarction, however, the disease originates in long-term coronary arteriosclerosis, which is a chronic condition. In this clustering, young patients are therefore "outliers". We can analyze the information of this group to see whether their disease has other causes, such as "coronary myocardial bridge (a congenital coronary artery malformation)" [24]. The results of the white blood cell test were then introduced and clustered, and the results are shown in Tab. 4 [25]. In the leukocyte results, only patients under the age of 52 have leukocytosis, while the larger population does not, which differs from what is recorded in the literature. Taking such abnormal cases as a research target offers doctors some ideas for further study [26].

IV. DISCUSSION AND ANALYSIS
Flowing blood is indispensable to the tissues and organs of the body, including the heart. Acute myocardial infarction (AMI) occurs on the basis of coronary atherosclerosis. Triggers such as vasospasm, emotional agitation, heavy physical labor, and surgery cause the plaque in the lumen to rupture and detach, which severely narrows the lumen. When an effective collateral circulation has not been fully established, this results in severe and prolonged myocardial ischemia (>1 hour), with persistent retrosternal chest pain, fever, increased white blood cell count and serum myocardial necrosis markers, and progressive changes in the electrocardiogram. It can also induce arrhythmia, shock, and heart failure, and is a serious type of coronary heart disease. In recent years, with the improvement of living standards and under the influence of factors such as a high-fat diet, irregular work schedules, and lack of physical exercise, its incidence has been rising and the age of onset has been falling, which seriously affects human health [27].
Ventricular remodeling is an important pathological process after myocardial infarction. Cardiomyocytes develop extensive edema within a few hours of acute myocardial ischemia, and this injury is reversible. As ischemia persists, coagulative necrosis, hemorrhage, interstitial edema, and microcirculatory disorders occur in the cardiomyocytes. As the disease progresses, collagen deposition in the infarct core region and the formation and organization of granulation tissue develop further into fibrosis and scar tissue. In addition, the non-infarcted area shows myocardial cell hyperemia, hypertrophy, degeneration, and apoptosis, interstitial angiogenesis, and collagen fibrosis, and the original myocardial fibers and their arrangement become disordered. These pathological changes lead to abnormalities in the shape and structure of the heart. The necrosis of myocardial cells and the increase of intraventricular pressure in diastole lead to thinning of the local wall and to reduced or even paradoxical wall motion. In the non-infarcted areas, by contrast, the wall thickens in compensation and myocardial contraction increases. Because myocardial contraction is no longer uniform after infarction, wall motion is impaired and myocardial mechanics change, resulting in changes in systolic and diastolic myocardial strain [28].
Starting from practical clinical problems, this paper examines the principles for constructing multi-dimensional data sets for acute myocardial infarction and focuses on the text information extraction technology used during construction to preprocess the information. Moreover, this paper combines image processing with data analysis and analyzes acute myocardial infarction cases through data mining algorithms.
(1) This paper first reviews the concepts, application examples, key technical tools, and future development directions of data warehousing and data mining.
(2) This paper builds a data warehouse on the theme of acute myocardial infarction and implements data integration with the Kettle tool. Taking into account how acute myocardial infarction cases are written, it uses Chinese text segmentation and text information extraction techniques to convert the natural-language case text into structured form. Once built, the data warehouse can be used for complex queries, data mining, and similar tasks. This paper focuses on data mining.
(3) This paper uses data mining algorithms to cluster and analyze the data. Repeated runs demonstrate the feasibility and efficiency of data mining technology in case analysis. The results of this paper provide data-level support for clinicians' research tasks. With the analysis algorithms, doctors no longer have to look through medical records and excerpt them manually when collecting data, and data mining methods can yield comprehensive and accurate results during data analysis. Through data mining, we can find "interesting" knowledge hidden in the data that has not yet been studied, which provides direction for research. Data mining can also be used to verify the accuracy of test data and to assist doctors in scientific research efficiently and accurately. Moreover, the auxiliary tools used in this paper are mostly free and open source, which makes them suitable for the economic conditions of ordinary grassroots hospitals, so the entire data mining process is both accurate in its results and economical to implement.

V. CONCLUSION
How to apply data analysis algorithms in China's primary hospitals is still a problem to be solved. Based on this, this study applies big data mining methods to myocardial infarction data and combines them with image analysis data. The disease studied is acute myocardial infarction, and the patients' symptoms, physiological indicators, and accompanying disease information stored in the data warehouse are used as data sources. This paper explores the high-risk factors and other information related to acute myocardial infarction by implementing an association rule algorithm. It also analyzes the data source by implementing a cluster analysis algorithm, divides the sample data according to rules based on age, gender, laboratory indices, and life history, and delimits the high-risk groups. Together, these two methods provide decision support information for disease research. Starting from practical clinical problems, this paper re-examines the principles for constructing multi-dimensional data sets for acute myocardial infarction and focuses on the text information extraction technology used during construction to preprocess the information. At the same time, it combines image processing with data analysis and analyzes acute myocardial infarction cases through data mining algorithms. The research results show that the algorithm of this study is feasible and that in-depth analysis of myocardial infarction can be carried out through data mining.
XIAOQIANG TANG received the bachelor's degree from Nanjing Medical University, in 2006, and the master's degree from Nanjing Medical University, in 2017. Since 2018, he has served as secretary of the Medical Imaging Department of Changzhou Second People's Hospital affiliated to Nanjing Medical University. His research interests include the diagnosis of heart-related diseases, especially radiomics and AI-assisted diagnosis of coronary heart disease, and he has participated in a number of multi-center research projects.
MING ZHANG graduated from Nanjing Medical University, in 2013, with a master's degree from the seven-year clinical medicine program. Since 2013, he has been with Changzhou Second People's Hospital affiliated to Nanjing Medical University. His research interests include the diagnosis and differential diagnosis of chest diseases. He is proficient in various CT post-processing reconstruction techniques and has introduced one new technology at the hospital level.
HAIFENG SHI received the bachelor's degree from Southeast University Medical School, in 2004, the master's degree in imaging medicine and nuclear medicine from Southeast University, in 2009, and the Ph.D. degree in imaging medicine and nuclear medicine from Fudan University, in 2014. He is the Deputy Director of the Medical Imaging Department of Changzhou Second People's Hospital, a Jiangsu Province top health talent, a Jiangsu Province 333 high-level talent, a Jiangsu Province Six Talent Peaks training candidate, and a postgraduate tutor. His research interests include functional imaging of gynecologic tumors.
CHANGJIE PAN received the bachelor's degree from Soochow University Medical School, in 2003, and the master's degree from Nanjing Medical University, in 2009. He is currently pursuing the Ph.D. degree in imaging medicine and nuclear medicine at Soochow University Clinical Medical School. Since 2014, he has served as Director of the Medical Imaging Department of Changzhou Second People's Hospital affiliated to Nanjing Medical University. His research interests include the CT and MRI diagnosis and differential diagnosis of heart disease, especially ischemic cardiomyopathy, and of lung disease, especially lung cancer. He has won the first and second prizes of the New Technology Introduction Award of the Jiangsu Health Commission and the second prize of the Changzhou Science and Technology Progress Award.