Implementation and Use of Disease Diagnosis Systems for Electronic Medical Records Based on Machine Learning: A Complete Review

Electronic health records allow patient information to be retrieved instantly and remotely, which helps to keep track of patients’ due dates for checkups and immunizations and to monitor their health. The Health Insurance Portability and Accountability Act (HIPAA) in the USA protects the confidentiality of patient data, but the data can be used if it is de-identified using the ‘HIPAA Safe Harbor’ technique. Usually, this de-identification is performed manually, which is laborious and time-consuming. Various techniques have been proposed for the automatic extraction of useful information and the accurate diagnosis of diseases. Most of these methods are based on Machine Learning and Deep Learning, while auxiliary diagnosis is performed using Rule-based methods. This review focuses on recently published papers, which are categorized into Rule-Based Methods, Machine Learning (ML) Methods, and Deep Learning (DL) Methods. In particular, ML methods are further categorized into Support Vector Machine (SVM) Methods, Bayes Methods, and Decision Tree (DT) Methods. DL methods are divided into Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Deep Belief Network (DBN), and Autoencoder (AE) methods. The objective of this survey is to highlight both the strong and weak points of the techniques proposed for disease diagnosis. Moreover, we present the advantages, disadvantages, targeted disease, dataset employed, and publication year for each category.


I. INTRODUCTION
The term diagnosis refers to identifying the symptoms of a disease or examining a patient to determine their health condition. Diagnosis is usually performed by examining the physical condition of the patient, exploring the patient's history, or running diagnostic tests that are analysed by healthcare professionals such as a dentist, physician, chiropractor, physical therapist, physician assistant, or compounder [1]. The patient's history is frequently saved in the form of a prescription to record necessary medications, streamline workflow, and keep track of the patient's progress. Initially, the prescription was saved as a paper chart containing the type of disease, suggested medicines, vaccination dates, treatment plans, and test results such as X-rays from specific hospitals. However, in the modern age of the computer, the prescription is saved in a digital format known as an electronic medical record (EMR) or electronic health record (EHR). These electronic records help physicians to access patients' records instantly, to keep track of patients' due dates for checkups and immunisations, to monitor patients' health, and to make decisions accordingly [2]. Although the two terms (EMR and EHR) are often used interchangeably, the 'Office of the National Coordinator of Health Information Technology (ONC)' treats them as distinct [3]. The EMR is the digital form of the prescription, which contains the patient information collected in a provider's office for healthcare professionals. EMR data can be either human generated or machine generated [4].
The associate editor coordinating the review of this manuscript and approving it for publication was Xiong Luo.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
EMRs have multiple advantages over paper prescriptions, including instant access, tracking of patients' information, fewer patient visits, patient screening, and improved healthcare quality [5]. The scope of EHRs goes beyond that of EMRs, as an EHR contains information from all the medical investigators involved in a patient's health records. The patient's information is also shared with other clinicians and medical researchers in various hospitals to study and better understand the root causes of the disease. EMRs contain temporal and heterogeneous doctor order information, which may be used as an input for treatment pattern discovery [6].
EHRs also allow patients to see their records and follow their progress, which motivates them in many (though not all) cases [7], [8]. Although EMRs and EHRs provide many benefits to users and practitioners, their implementation faces many challenges, including computer downtime, insufficiently skilled computer staff, lack of communication among users, and security threats such as confidentiality leakage [9]-[11]. These records are a decent source of data in various hospitals and assist clinicians in the manual extraction of relevant information. Still, extracting specific information from the bulk of data is tedious and time-consuming. There is a need for automatic systems for disease diagnosis from electronic medical records. Currently, EHR systems have been widely adopted by healthcare professionals and institutions to provide fast, efficient, and real-time healthcare services economically [12], [13].
Various healthcare systems have been developed to overcome the challenges of electronic records, including the volume, incompleteness, and inconsistency of EHR data. NLP-based computational phenotyping has various applications, such as new phenotype identification, diagnosis classification, and clinical trial screening, and is implemented with different methods, i.e. rule-based methods, machine learning methods, and deep learning methods [14]. These methods are employed to transform raw text into useful information, also known as case-base information in 'Case-Based Reasoning (CBR) Systems' [15].
In this paper, we survey the techniques employed for automatic disease diagnosis from electronic records. The main aims of this survey are to provide a platform for researchers to review different disease diagnosis systems; to describe their work in detail; to highlight the advantages and disadvantages of the various approaches; to discuss present trends and future directions; and to give a comparative analysis of the techniques. A detailed outline of this review is given in Fig. 1. The number of articles reviewed with respect to publication year and concerned disease is given in Fig. 2 and Fig. 3, respectively.
The paper is structured into five sections. Existing work on disease diagnosis approaches is discussed in Section 2. In Section 3, the challenges associated with these approaches are addressed. The discussion is given in Section 4. Finally, the conclusion of the survey and future trends are presented in Section 5.

II. DATA-DRIVEN MODELS FOR DISEASE DIAGNOSIS
Researchers have designed multiple techniques for the computer-aided diagnosis of diseases. These techniques have assisted clinical and biomedical research through applications such as phenotype discovery, diagnosis classification, and patient screening. We have categorised the disease diagnosis methods by methodology: rule-based methods, support vector machine methods, neural network methods, Bayesian methods, and deep learning methods.

A. RULE-BASED METHODS
Rule-based methods are used in computer science to learn and identify rules and to apply them to extract the specific patterns associated with each rule [16], [17]. These methods identify and use a set of relational rules to capture knowledge, unlike machine learning systems, in which a single model is applied to the prediction task. Rule-based methods include association rule mining [18], learning classifier systems [19], artificial immune systems [20], cloud-based models using fuzzy logic concepts [21], and other rule-based techniques. These methods are widely adopted by researchers for identifying patient phenotype cohorts. The review of Shivade et al. [22] contains 24 rule-based methods out of 97 methods.
In [23], a rule-based method was used to identify Cardiac Resynchronization Therapy from clinical records. A dataset from the Clinical Translation Science Institute of the University of Minnesota, consisting of 6174 reports (3700 for training and 2474 for testing), was matched against New York Heart Association custom diagnosis codes designed locally by the hospital. The proposed method was compared with an SVM method. The results showed that the rule-based method performed similarly to machine learning methods. The authors achieved state-of-the-art results, with precision, recall, and F-measure scores of 94.99%, 92.13%, and 93.37%, respectively. This work would benefit from further validation of the data and results.
A rule-based method using forward chaining on electronic health records was proposed in [24]. The authors used International Classification of Diseases (ICD-10) codes for patients' haemogram reports. The forward chaining (FC) rule-based method was used to extract healthcare indicators from the dataset. The proposed method extracted information from patient records based on ICD-10 codes, like the method proposed in [25], which successfully extracted information with an average F-measure of 96.83%.
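Forward chaining can be illustrated with a minimal sketch: starting from known facts, every rule whose premises are all satisfied fires and adds its conclusion, until nothing new can be derived. The rule contents, fact names, and the code label below are hypothetical placeholders chosen for illustration; they are not the rules used in [24] and are not clinical guidance.

```python
def forward_chain(facts, rules):
    """Derive all facts reachable from `facts` by repeatedly firing rules.

    Each rule is a (premises, conclusion) pair; a rule fires when every
    premise is already in the set of derived facts.
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in derived and premises <= derived:
                derived.add(conclusion)
                changed = True
    return derived


# Hypothetical rules over haemogram-style findings (illustrative only).
RULES = [
    (frozenset({"low_hemoglobin"}), "anemia_suspected"),
    (frozenset({"anemia_suspected", "low_mcv"}), "microcytic_anemia_suspected"),
    (frozenset({"microcytic_anemia_suspected"}), "candidate_code_D50"),
]

result = forward_chain({"low_hemoglobin", "low_mcv"}, RULES)
print(sorted(result))
```

Because the loop repeats until no rule adds a new fact, the order in which rules are listed does not change the final set of derived facts.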
Jorge et al. [26] employed rule-based methods to identify lupus patients from an EHR dataset of 400 records (200 each for validation and training). Narrative and codified information was extracted from the training set using NLP. The authors designed algorithms using 'penalised logistic regression' to categorise definite Systemic Lupus Erythematosus (SLE) and definite/probable SLE. The top-performing rule-based algorithm (ICD-9 code) showed specificities of 86% and 84% and sensitivities of 60% and 69% for definite SLE and definite/probable SLE, respectively.
A rule-based approach was employed in [27] for the detection of adverse drug events (ADE). The dataset was composed of 115447 records from Danish, French, and Bulgarian hospitals. Association rules and decision tree methods were used to discover ADE detection rules subject to time constraints. Field experts manually validated, filtered, and reorganised the rules, which were then placed into a rules repository. In total, 236 validated ADE detection rules were discovered, which were able to detect 27 different outcomes. The rules associated with anticoagulant drugs (35%), hyperkalemia (27%), and pharmacokinetic drug (25%) were discovered automatically in this study.
Detailed medical information and relationships between text instances were extracted through co-reference resolution from patient record summaries in [28]. Three NLP systems were employed to resolve phrases that refer to the same entity, i.e. a rule-based model, a maximum entropy model, and a Markov logic network (MLN) model. The results of this study showed that the proposed model achieved 4.3% and 5.7% F-scores on the Beth and Partners datasets, respectively. The three models were integrated into an ensemble system that produced state-of-the-art results compared with baseline systems, with a performance measure of 87.21%, which is 4.5% higher than the i2b2 Track 1C average.
In [29], temporal trends were discovered from the Nationwide Inpatient Sample (NIS) dataset. A combined method of rule mining and model-based recursive partitioning was used to discover temporal trends both overall and for significant subgroups (based on patient age, gender, and disease). The results indicated that discovering subgroup trends based on the age-sex relationship is not possible using trend-tracking methods. The dataset covers only 20% of discharged patients in the USA; the significance of this research could be improved by adding data from more patients.
Petersen et al. [30] identified patients with coronary artery bypass or cardiac catheterisation using ICD-9. A dataset of 5151 patient records from the Veterans Health Administration was used in this study. The rule-based method for diagnosing artery disease, childhood obesity, and peripheral artery disease achieved sensitivity and specificity scores above 95%. The positive predictive value for acute myocardial infarction was better than that of existing methods.
In [31], McCoy et al. performed a comparative analysis of association rule mining and crowdsourcing techniques for information extraction from electronic health records. These methods were evaluated on a publicly available EHR, in which the rule mining and crowdsourcing approaches recognised 19586 and 31440 pairs, respectively. The authors compared only 500 pairs from each approach, of which 186 pairs overlapped. The results showed that the crowdsourcing approach identified the most common relationships, unlike association rule mining, which recognized only rare relationships. Better results may be obtained if the two methods are used in combination.
A rule-based system was designed by Wiley et al. for the detection of statin-induced myotoxicity [32]. Manual annotation was performed on the records of 300 allergy patients, and keywords were defined for this dataset. A set of rules was used to detect contextual information specific to the identified keywords. They achieved a positive predictive value (PPV) of 86% and a negative predictive value (NPV) of 91%.
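The evaluation measures reported throughout this section (sensitivity or recall, specificity, PPV or precision, NPV, and F-measure) all derive from the four cells of a confusion matrix. The sketch below shows the standard definitions; the counts are hypothetical and are not taken from any of the cited studies.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)           # also called precision
    npv = tn / (tn + fn)
    f_measure = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f_measure": f_measure,
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that PPV and NPV depend on the prevalence of the condition in the evaluated sample, so values reported on different cohorts are not directly comparable.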
In [33], the authors developed a list of concepts and secondary concepts found in the same sentence. The secondary concepts mainly contained medications. After the concepts were initialized, a set of rules was defined for the identification of phenotypes. The proposed method achieved a kappa value of 92% against the original annotations. A similar concept-based technique was implemented in [34], which achieved T, N, and M staging accuracies of 0.72, 0.78, and 0.94, respectively.
A rule-based heuristic approach was used in [35] for the assertion of colorectal cancer. The colorectal cancer concepts were detected using the MedLee approach [36]. The concept contexts were then searched by applying a predefined set of rules. The proposed method achieved a 99.6% F-measure for document-level concept identification.
In [37], a rule-based approach was developed by Li et al. to detect adverse drug events (ADE) and medical errors, as in [27]. The authors detected ADEs from patient health records, clinical tests, and medications. The performance of the proposed method was evaluated against a trigger tool [38], and 100% agreement was reached. Triggers are generally combinations of keywords used to extract specific information about the underlying disease.
Alodadi [39] proposed a system to extract radiology notes from electronic health records and discover association rules that identify interrelationships between medical entities. A textual dataset was used to generate a transactional record for each radiology report. A Bag-of-Concepts representation was used to formalise the raw clinical text, and each concept's weight was measured using text mining weighting methods, i.e. TF-IDF (Term Frequency-Inverse Document Frequency). The weight of every item was introduced using Concept Unique Identifiers (CUIs). The proposed method showed significant results on three main measures, i.e. confidence, lift, and weighted support.
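The rule measures named above can be made concrete with a small sketch. It computes plain (unweighted) support, confidence, and lift for a rule "antecedent implies consequent" over transactional records, where each transaction is the set of concepts found in one report. The weighted support used in [39] additionally incorporates concept weights and is not reproduced here; the concept names are hypothetical.

```python
def rule_measures(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    n_ante = sum(1 for t in transactions if antecedent <= t)
    n_cons = sum(1 for t in transactions if consequent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    if n_ante == 0 or n_cons == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    support = n_both / n                 # fraction of reports containing both sides
    confidence = n_both / n_ante         # estimate of P(consequent | antecedent)
    lift = confidence / (n_cons / n)     # > 1 suggests positive association
    return support, confidence, lift


# Hypothetical transactional records, one set of concepts per report.
reports = [
    {"effusion", "cardiomegaly"},
    {"effusion", "cardiomegaly", "edema"},
    {"effusion"},
    {"nodule"},
]
support, confidence, lift = rule_measures(reports, {"cardiomegaly"}, {"effusion"})
print(support, confidence, round(lift, 3))
```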
Szenasi et al. [40] developed a rule-based method for concept extraction from unstructured medical records. NLP techniques were employed to process the medical text. The text was decomposed into sentences, in which tokens were identified. The Systematized Nomenclature of Medicine - Clinical Terms (SNOMED-CT) was queried to identify concepts. Three filters, i.e. a hierarchy-based filter, a POS filter, and a context similarity filter, were applied in sequence to improve the results and reduce computational complexity. A Medline dataset was used to evaluate the method, which achieved 88.77% recall and 89.69% precision.
In [41], a rule-based grammar approach was used to extract textual information from the records of mamma carcinoma patients. The therapy suggestion was derived by defining seven significant variables from the extracted textual features. The proposed system was evaluated on the mammography use case. An integration technique based on rule-based decision support, semantic modelling, and information extraction was used to extract textual features with an accuracy of 0.69, and achieved 0.90 for lymph node status.
A novel algorithm was developed to identify suicide or suicidal ideation in electronic health records [42]. ICD-9 was applied to collect patient records from 2004 to 2010, in collaboration with the Food and Drug Administration. A training set of 50 records, classified as positive or negative by experts, was used. The study showed that ICD-9 alone had a positive predictive value (PPV) of 0.55, whereas ICD-9 in combination with NLP had a PPV of 0.97.
Sauer et al. used a rule-based approach on structured and semi-structured Veterans Affairs data to extract pulmonary function test (PFT) records [43]. They used an NLP tool to retrieve spirometric values and responses to bronchodilator challenge. A random set of 1001 documents was used to evaluate the model, which achieved precision, recall, and F-measure of 98.9%, 98.8%, and 98.3%, respectively.

B. MACHINE LEARNING METHODS
Machine learning (ML) methods are most commonly used to construct medical database systems from the electronic health records of patients who have undergone health examination [44]. These ML methods are subdivided into support vector machine methods, Bayesian methods, and decision tree methods.

1) SUPPORT VECTOR MACHINE METHODS
Support Vector Machine (SVM) is one of the algorithms most commonly adopted by researchers for supervised classification. Vapnik developed SVM in the 1990s; it learns from labelled data [45]. When both inputs and outputs are given, the labelled inputs are used as a training set to learn how to classify new data into the output classes.
In [46], cancer diagnosis was performed on medical records extracted from EHRs using an SVM model. The proposed model was evaluated on 100 and 400 medical records for 10 and 3 different types of cancer and achieved prediction accuracies of 86.2% and 97.33%, respectively. The Radial Basis Function (RBF) kernel was used along with K-fold cross-validation to estimate the performance of the model. Higher prediction accuracy may be achieved by evaluating the proposed model on a larger dataset.
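The two ingredients mentioned here can be sketched independently of any SVM library. The RBF kernel measures similarity as K(x, y) = exp(-gamma * ||x - y||^2), and K-fold cross-validation partitions the samples into K folds, each used once as the test set. The function names and the choice of contiguous, unshuffled folds are assumptions made for illustration; the settings used in [46] are not specified in this review.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel value exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    # The first (n_samples % k) folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print(len(train), test)
print(rbf_kernel([1.0, 0.0], [0.0, 0.0], gamma=1.0))
```

In practice the samples are usually shuffled (and often stratified by class) before folding; that step is omitted here to keep the sketch short.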
Brisimi et al. [47] focused on various diseases, e.g. chronic diseases, heart disease, and diabetes, in electronic health records and treated the task as a binary classification problem. They employed various machine learning methods, including Support Vector Machines (sparse and kernel), logistic regression (LR), and random forests (RF). These methods were evaluated on a dataset from the Boston Medical Center, USA. Their major contribution was two novel methods, K-LRT (likelihood ratio) and JCC (joint clustering and classification). JCC was applied to discover hidden clusters in the positive samples and to identify a sparse classifier for each cluster that separates positive from negative samples (hospitalized vs non-hospitalized).
In [48], SVM was applied in a hybrid model that combines the features of weighted LDA and a word vector model for liver disease classification. The hybrid model was used to obtain standard features, which were then used for classification to obtain optimal results. The proposed method was evaluated on a cooperative hospital dataset (Sogou, Baidu, and Tencent) and achieved 99.1% accuracy over 100 dimensions. The combination of weighted LDA and the word vector model offers improved prediction accuracy, diagnostic reliability, and fuller use of the text information in EHR data.
Alemzadeh and Devarakonda [49] used SVM, NB, and DT along with the ensemble methods AdaBoost and Random Forests for automatic identification of disease control status from EHR data. The dataset comprised 55,000 clinical notes over a period of seven years (an average of 133 notes per patient). The system was evaluated on 5035 candidate snippets (2086 labelled, 2949 unlabelled) and achieved an F-measure of 86% in identifying disease status and an accuracy of 77% in classifying whether the status is controlled or not.
In [50], SVM was applied to naive and expert-defined collections of EHR features to detect Rheumatoid Arthritis (RA) cases with the help of medication exposures, billing codes, and NLP concepts. The SVM was trained on both naive and expert-defined data and achieved precision and recall of 0.94 and 0.87, respectively, compared with 0.75 and 0.51 for the deterministic algorithms. A dataset of 10,000 patients was used, which was further classified as possible RA, definite RA, and not RA based on the test results. The validity of this method could be verified on other diseases.
A component-based NLP system, HITEx, was developed by Zeng et al. [51] to extract findings for disease research. Records of 50 patients were used in the HITEx system to extract smoking status, co-morbidity, and principal diagnosis. An SVM classifier was used to identify the smoking status of patients. It was trained and tested on 8500 smoking-related sentences with 10-fold cross-validation using the Weka tool. HITEx achieved accuracies of 0.82, 0.87, and 0.90 on principal diagnosis, co-morbidity, and smoking status extraction, respectively (excluding cases with insufficient data), outperforming ICD-9. Better sensitivity, specificity, and accuracy may be obtained by combining the proposed method with ICD-9.
An NLP system was designed in [52], [53] for pneumonia identification from narrative patient reports. The task was treated as a binary classification problem: predicting whether a patient was positive for pneumonia or not. A dataset of 426 patients (unrestricted), collected during their stay in the ICU, was employed. Various features were considered, i.e. Unified Medical Language System (UMLS) concepts, word n-grams, and the assertion value for pneumonia. The performance of the system was improved by a feature selection approach based on statistical testing. A comparative analysis was performed on a restricted dataset of 236 patients. The proposed method showed significant F-measure results of (0.8571 and 0.8176) and (0.5070 and 0.4910) on the restricted and unrestricted datasets, respectively.
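Word n-gram features of the kind mentioned for narrative reports can be extracted as shown in the following sketch. The whitespace tokenisation and lowercasing are simplifying assumptions; the cited systems may tokenise and normalise text differently, and the example sentence is invented.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the list of word n-grams in `text` (lowercased, whitespace tokens)."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(text, max_n=2):
    """Count all n-grams for n = 1..max_n as a sparse feature dictionary."""
    features = Counter()
    for n in range(1, max_n + 1):
        features.update(word_ngrams(text, n))
    return features


bigrams = word_ngrams("No evidence of pneumonia", 2)
features = ngram_features("No evidence of pneumonia")
print(bigrams)
print(features["pneumonia"], features["of pneumonia"])
```

Bigrams such as "no evidence" illustrate why n-grams are often combined with an explicit assertion feature: a unigram model alone would see the word "pneumonia" without its negation.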
Garla et al. [54] compared Laplacian SVM and linear SVM for a clinical text classification task and determined the effect of semi-supervised learning on Laplacian SVM performance. The SVM and Laplacian SVM were trained on 820 ultrasound, MRI, and abdominal CT reports labelled for malignant liver lesions (77% positive class presence). Furthermore, they used 19845 randomly selected unlabelled samples along with gold standard references for the Laplacian SVM. A test set of 520 labelled reports was used for both methods. The Laplacian SVM (labelled and unlabelled data) outperformed the supervised SVM on sensitivity (0.943 vs 0.911) and F1-score (0.773 vs 0.741), with a comparable PPV (0.877 vs 0.883). Semi-supervised methods such as Laplacian SVM can be applied to unlabelled EHR data to improve clinical text classification.
An SVM-based method was employed to identify contralateral breast cancer from narrative text and pathology reports [55]. Medical concepts and their combinations were used to identify contralateral events in the notes. Moreover, the laterality of breast pathology reports (left or right) was used as an additional feature. SVM with the derived features was used to detect contralateral cancer. They achieved an area under the curve of 0.93 and 0.89 on the validation and test sets, respectively. Because feature generation is simple, this method could be used to detect other diseases, and it could also be applied to other breast cancer events such as distant or local recurrence.
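The area under the ROC curve reported here and in several later studies has a simple probabilistic reading: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. A minimal sketch of this pairwise definition follows; the scores and labels are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores and true labels.
auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
print(auc)
```

The pairwise form is quadratic in the number of samples, which is fine for illustration; rank-based formulas give the same value more efficiently on large datasets.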
A novel method was proposed in [57] that combines machine learning, NLP, and ontology methods for patient identification. A dataset of Mayo Clinic electronic records from 2007 was used to evaluate this method. The SVM algorithm was applied to SNOMED concepts extracted from T2DM (type 2 diabetes mellitus) clinical notes. The proposed method was evaluated on precision, recall, and F-measure, and achieved an F-measure of 95%. The evaluation was limited to T2DM; the proposed method could be applied to other diseases to assess its broader performance.
Sparse discriminant analysis (SDA) was used to predict heart disease patients from electronic health records [58]. SDA was evaluated on 280 instances of a heart disease dataset and achieved 96% prediction accuracy. The proposed method was preferred over linear discriminant analysis (LDA) because of linear inseparability and the poor estimation of LDA parameters.
Chen et al. [56] used active learning (AL) with a support vector machine (SVM) based method for electronic health data. The performance of the proposed method was evaluated on three disease cohorts, i.e. colorectal cancer (CRC), venous thromboembolism (VTE), and rheumatoid arthritis (RA), using two feature sets, i.e. unrefined features (all clinical concepts and billing codes) and refined features (selected by domain experts). AL was compared with passive learning (PL) based on random sampling. AL outperformed PL on all three phenotyping tasks. With unrefined features, AL reduced the number of annotated samples by 68% and 23% for RA and CRC, respectively, and achieved an area under the curve (AUC) of 0.95. Furthermore, with refined features, AL reduced the annotated samples for VTE by 68% with an AUC of 0.70. The performance of the phenotype classifiers improved with refined features. An efficient phenotyping method can be developed by combining AL with feature engineering guided by domain knowledge.
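The difference between active and passive learning lies in how the next samples to annotate are chosen. One common active learning strategy, uncertainty sampling, queries the unlabelled samples whose classifier decision scores are closest to the decision boundary, whereas passive learning draws them at random. The sketch below assumes this strategy for illustration; this review does not state which query strategy was used in [56], and the scores are hypothetical.

```python
import random


def select_uncertain(decision_scores, batch_size):
    """Active learning: indices of samples closest to the decision boundary."""
    ranked = sorted(range(len(decision_scores)), key=lambda i: abs(decision_scores[i]))
    return ranked[:batch_size]


def select_random(n_samples, batch_size, seed=0):
    """Passive learning baseline: indices chosen uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(range(n_samples), batch_size)


# Hypothetical signed distances of unlabelled samples from an SVM hyperplane.
scores = [2.1, -0.1, 0.8, -1.7, 0.05]
to_annotate = select_uncertain(scores, batch_size=2)
baseline = select_random(len(scores), batch_size=2)
print(to_annotate, baseline)
```

After each batch is annotated, the classifier is retrained and the remaining pool is rescored, so the loop concentrates annotation effort on the cases the current model finds hardest.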

2) BAYESIAN METHODS
The Bayesian Network (BN) and Naive Bayes (NB) are probabilistic algorithms, and both work well with a large number of features [51]. Unlike a Bayesian network, an NB classifier does not need a dependency network and works better with high-dimensional features. In Naive Bayes, patterns are learned by examining a set of categorized documents. It is a probabilistic classifier that represents each document as a bag of words. Its main aim is to assign documents to a class or category. It simplifies learning by assuming that the features are independent of one another given the class. Naive Bayes accuracy is reported to be largely independent of the feature dependencies in the class [59].
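The bag-of-words Naive Bayes classifier described above can be written in a few lines. The sketch uses the multinomial variant with add-one (Laplace) smoothing, which is one common formulation and an assumption here; the toy documents and class names are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(documents):
    """Fit a multinomial Naive Bayes model from (tokens, label) pairs."""
    class_counts = Counter(label for _, label in documents)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for tokens, label in documents:
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary


def predict_nb(model, tokens):
    """Return the class with the highest log posterior for `tokens`."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in tokens:
            if token in vocabulary:  # words never seen in training are ignored
                count = word_counts[label][token]
                # Add-one smoothing avoids zero probabilities.
                score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy corpus for illustration only.
corpus = [
    ("chest pain radiating arm".split(), "cardiac"),
    ("elevated troponin chest pain".split(), "cardiac"),
    ("productive cough fever".split(), "respiratory"),
    ("fever cough wheeze".split(), "respiratory"),
]
model = train_nb(corpus)
print(predict_nb(model, "chest pain".split()))
print(predict_nb(model, "cough fever".split()))
```

Each word contributes its log probability independently of the others, which is exactly the conditional independence assumption that gives the method its name.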
A Naive Bayes classifier was used to predict heart disease from electronic health records in [60]. The proposed system used a history-based heart disease dataset to extract hidden knowledge about heart disease. The system used continuous rather than categorical data and achieved significant results. Future work includes increasing the data volume and focusing on other diseases, such as liver disease and cancer.
Al-Aidaroos et al. [61] presented a Naive Bayes method for medical data classification. Various diseases, such as breast cancer, liver disorders, lung cancer, and primary tumour, were classified and evaluated on accuracy and area under the ROC curve. The proposed method was compared with five other methods on 15 datasets. The results showed that NB outperformed the others in medical classification. The proposed method followed by deep learning concepts can be employed to obtain better segmentation results. Future work includes hybridization of NB with other approaches.
Bayesian and decision tree methods for medical diagnosis were applied in [62]. These methods were employed on diagnostic problems such as the diagnosis of thyroid diseases, primary tumour localization, rheumatology, and the prediction of breast cancer recurrence. NB showed promising accuracy. Redundant knowledge can be used with NB for better performance and to deal with missing values.
Wang et al. [63] employed a Bayesian approach to predict the occurrence of brain metastasis from lung cancer. A global dataset comprising 50 thousand cancer records obtained in Taiwan over the period 1996 to 2010 was used. The proposed method showed state-of-the-art sensitivity compared with SVM and LR, evaluated on accuracy, sensitivity, and specificity.
A Bayesian classifier was used to assess the progression or relapse of cancer in [64]. A brain tumour dataset of 142 patients collected from 2000 to 2005 was used. A total of 96 binary attributes were selected for training. The proposed model calculated the probability of a patient being assigned to the relapse or no-relapse risk group. The proposed method achieved accuracy, specificity, and sensitivity of 0.84, 0.87, and 0.80, respectively.
Sakai et al. [65] employed a Bayesian network for the diagnostic prediction of appendicitis. A dataset of 169 patients suspected of acute appendicitis was considered. The proposed model was compared with logistic regression and a neural network model using error rate and area under the ROC curve. The model detected that 86 of the 169 patients were suffering from appendicitis and achieved the lowest error rate and reliable ROC results.
A Naive Bayes method was used to predict heart disease in [66]. A clinical dataset of 500 patients obtained from a diabetic research institute in Chennai was used. Attributes such as age, sex, diagnosis, and slope were selected for the proposed model. The method was evaluated using performance metrics such as precision, recall, and F-measure, and achieved values of 0.71, 0.74, and 0.712, respectively.
In [67], genotype data was used to predict Alzheimer's disease using a Bayesian network. Genome data of 1411 patients were collected for this study (860 with Late-Onset Alzheimer Disease (LOAD) and 550 without LOAD). Three naive Bayes variants were compared, i.e. model-averaged (MANB), with feature selection (FSNB), and without feature selection (NB), in terms of running time, AUC, and calibration. MANB outperformed the others on these performance metrics, with a training time of 16 s and an AUC of 0.72 for predicting LOAD patients.

3) DECISION TREE METHODS
A decision tree is one of the most commonly used algorithms for data analysis [68]. It contains terminal and non-terminal nodes. Each non-terminal node represents a condition or test on a data item. The technique is generally used for the classification of records and is helpful for both association and regression tasks. Instances are sorted down the tree from non-terminal to terminal nodes [69]. A decision tree is easy to visualize, which makes the advantages and disadvantages of the data easy to recognize [70]. A lot of work has been done on disease diagnosis using decision trees.
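The structure described above, with non-terminal nodes that test a data item and terminal nodes that hold a class, can be sketched as a small data structure plus a traversal loop. The feature names, thresholds, and labels are hypothetical and carry no clinical meaning.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    """Terminal node: holds the predicted class label."""
    label: str


@dataclass
class Node:
    """Non-terminal node: tests one feature against a threshold."""
    feature: str
    threshold: float
    low: Union["Node", Leaf]    # branch taken when value <= threshold
    high: Union["Node", Leaf]   # branch taken when value > threshold


def classify(tree, record):
    """Sort a record down the tree until a terminal node is reached."""
    while isinstance(tree, Node):
        tree = tree.low if record[tree.feature] <= tree.threshold else tree.high
    return tree.label


# Hypothetical two-level tree (illustrative thresholds only).
tree = Node(
    feature="platelets",
    threshold=150.0,
    low=Node(feature="wbc", threshold=4.0, low=Leaf("positive"), high=Leaf("uncertain")),
    high=Leaf("negative"),
)
print(classify(tree, {"platelets": 120.0, "wbc": 3.2}))
print(classify(tree, {"platelets": 200.0, "wbc": 3.2}))
```

The path taken for a record doubles as an explanation of the prediction, which is one reason decision trees are considered easy to visualize and interpret.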
In [71], the C4.5 decision tree classifier was used for the diagnostic prediction of dengue fever in the primary stage of infection. A total of 1200 patients (1012 from Singapore and 188 from Vietnam) were screened during the first 3 days of illness and followed up for one month. A total of 364 patients were found to be suffering from dengue (171 dengue hemorrhagic fever, 173 dengue fever, and 20 dengue shock syndrome). The proposed method achieved accuracy, sensitivity, and specificity of 0.847, 0.782, and 0.802, respectively. This study shows that the diagnosis of dengue can be predicted using simple hematological and clinical parameters.
A heart disease prediction system based on a decision tree was proposed in [72]. Structural information such as age, chest pain, gender, and heart rate was extracted using machine learning tools. The J48 decision tree classifier was used for feature selection and evaluated on a dataset of 240 instances (120 each for healthy people and people affected by heart disease). The proposed method achieved an accuracy of 0.85 in predicting heart disease conditions.
In [73], Kuo et al. used decision trees to diagnose breast tumours from ultrasonic images. The proposed model was trained on features extracted from the region of interest (ROI). A 24-D texture feature set was used to classify each tumour as malignant or benign. A dataset of 243 breast tumour images was collected from 1997 to 1998. The empirical results of this study showed 95.5% accuracy.
In [74], a decision tree was suggested for asthma diagnosis, and a fuzzy system was used to measure the asthma control level. Asthma was diagnosed from symptoms such as a sore throat, dry cough, and sneezing, while the control level was estimated from factors such as shortness of breath, daytime symptoms, and action limits. The data were collected from the patients in the form of questionnaires. The decision tree classifier was used to diagnose asthmatic patients and achieved an accuracy of 0.90 and a kappa coefficient of 0.783.
Shouman et al. [75] proposed a decision tree algorithm with additional parameters such as voting, equal-frequency discretization, and gain ratio for diagnosing heart disease patients. The proposed method was evaluated on the Cleveland Clinic Heart Disease dataset (composed of 76 raw attributes, of which 13 were selected for this study). The results were compared with commonly used decision tree classifiers such as J4.8 and the Bagging algorithm. The proposed method achieved a sensitivity, specificity, and accuracy of 0.779, 0.852, and 0.841, respectively.
The C4.5 decision tree classifier was suggested for the diagnosis of hepatitis in [76]. A total of 155 patients (both healthy and liver-affected patients) were selected for this study. Nineteen attributes were used to construct the model, including age, sex, steroid, fatigue, and varices. The proposed method achieved an accuracy of 0.8581 for hepatitis diagnosis.
Pulmonary hypertension (PH) was diagnosed from right heart catheterization (RHC) and magnetic resonance imaging (MRI) using a decision tree algorithm [77]. A total of 72 patients with suspected PH underwent MRI and RHC, of whom 57 had PH and 15 were diagnosed with no PH. The proposed model correctly classified 92% of the PH patients, while the remainder were misclassified. The results of this study suggest that the approach can decrease the need for RHC in patients with suspected PH.
In [78], five machine learning methods were employed to diagnose diabetic retinopathy (DR) from EHR data. A large retinal dataset comprising 5057 records was collected from 301 hospitals in China. Preprocessing steps such as label binarization, standard scaling, and normalization of values were performed to improve DR diagnosis accuracy. SVM, NB, DT, LR, and RF models were used for the classification task. The DT model achieved a high diagnostic accuracy of 86.82%. Compared with other DR diagnostic methods, the proposed method offers low cost, a low threshold, and high diagnostic accuracy.
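The preprocessing steps named above can be illustrated with their common textbook definitions. The sketch below is an assumption about what such steps typically look like and does not reproduce the exact pipeline of [78]; the function names and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def binarize_labels(labels, positive):
    """Map class labels to 1 (positive class) or 0 (any other class)."""
    return [1 if label == positive else 0 for label in labels]


def standard_scale(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def min_max_normalize(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical lab values and labels, for illustration only.
labels = binarize_labels(["DR", "no DR", "DR"], positive="DR")
scaled = standard_scale([4.0, 6.0, 8.0])
normalized = min_max_normalize([4.0, 6.0, 8.0])
print(labels, scaled, normalized)
```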

C. DEEP LEARNING METHODS
Deep learning (DL), also referred to as deep machine learning or hierarchical learning, models high-level abstractions in data by applying multiple processing layers. Over the last few years, disease diagnosis has been performed using traditional methods, including rule-based methods, statistical learning methods, and machine learning methods such as SVM, NB, and RF [79]. With the advancement of DL approaches, researchers across the globe have widely adopted them for the identification of various diseases. DL has shown significant performance in many domains by capturing long-range dependencies and constructing dense hierarchical features efficiently [80]-[83]. To build an effective disease diagnosis system, advances in both data representation and the development of ML architectures are imperative [84], [85]. Deep learning is based on deep neural networks (the term deep usually refers to networks with more than one hidden layer), which include Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Autoencoders (AE), Deep Belief Networks (DBN), and more. We study the various deep learning techniques employed on EHR data for clinical tasks and discuss their practical advantages and future directions.

1) CONVOLUTIONAL NEURAL NETWORKS (CNN)
CNNs have become very popular in medical diagnosis over the last few years. They are widely used on medical imaging data, including lab reports and pathological reports. A CNN imposes local connectivity on unstructured data, and its architecture is built from convolution layers and pooling layers. CNNs are used to extract highly informative features from EHRs [95], [96].
A CNN-based method was used for the prognosis of diabetic retinopathy [87]. The CNN was restructured into a new BNCNN model by adding a BN layer to the traditional LeNet model. The dataset, containing 500 records of DR and no DR, was gathered from 301 hospitals in China and abroad during the period from 2009 to 2013. The proposed model achieved a training accuracy of 99.85% and a testing accuracy of 97.56%, which is 2% higher than the LR method. The BNCNN model is effective at preventing gradient diffusion and at improving the training speed and accuracy of the model. A model-based reasoning (MBR) algorithm for disease diagnosis based on EMR and natural language processing showed 95.86% accuracy [97].
Mehrabi et al. [93] employed deep learning techniques for temporal pattern discovery. The research used the Rochester Epidemiology dataset from Olmsted County. They modelled each patient's data as a temporal matrix in which the rows contain diagnosis codes from ICD-9 and the Healthcare Cost and Utilization Project Clinical Classification Software (HCUP-CCS) and the columns contain diagnosis years; patients no older than 18 were selected. They constructed a Boltzmann machine network with three hidden layers and used the temporal matrix as the visible nodes. The results showed relationships between diagnosis codes, e.g., blindness and eye disorder. The network weights were explored as features of the patient data. Additional patterns marked by medical experts can be used to obtain better results.
A novel approach to disease inference was proposed in [94]: health questions are collected, possible diseases are selected from the reported symptoms, and a sparsely connected deep learning method is then applied to infer possible diseases from the patients' questions. The approach has two main components. First, discriminant features (signatures) are mined from raw medical features. Second, the raw features and signatures are treated as the nodes of the first layer and the hidden nodes of the second layer, respectively. Unlike traditional deep learning methods, which are densely connected and whose node numbers must be tediously adjusted, the proposed method is sparsely connected and adjusts the number of nodes automatically. The method is not suitable for identifying discriminant features for each disease, which can be addressed in future studies to widen its applicability.
Deep Patient, an unsupervised deep feature learning method, was introduced by Miotto et al. in [89]. They employed the method for predictive modelling from clinical notes by deriving a patient representation. A Mount Sinai dataset of 7,000 patients was used to capture hierarchical regularities and dependencies in the clinical notes. The evaluation was performed on 76214 patients comprising 78 diseases. The results outperformed those achieved using representations based on raw health records. The method achieved top performance in predicting various diseases, including cancer, diabetes, and schizophrenia. DL methods can improve clinical predictions when they are combined with derived patient representations. Laboratory results could be included in this approach to enhance the performance of the representation.
Choi et al. proposed a novel method for representing diverse medical concepts as real-valued vectors, and a deep learning method was used to construct the patient representation [90]. The trained medical concept vectors were used for heart failure (HF) prediction. The models were trained on 3884 HF cases and 28903 controls. The classification methods (LR, SVM, NN, and KNN) achieved a 23% AUC improvement using the proposed representation. More effective results may be obtained by adding lab reports and patient demographic information to this representation.
A CNN model was applied to analyze patient information from EHRs [91]. The EHR of each patient was represented as a temporal matrix, and the CNN model was built from four layers: an input layer (containing the EHR matrices), a convolution layer (extracting phenotypes from the input layer), a max pooling layer (introducing sparsity on the identified phenotypes), and a fully connected layer (softmax prediction). Moreover, a temporal fusion procedure was examined for the smoothness of patient EHRs. The proposed model was evaluated qualitatively and quantitatively on an EHR dataset collected over 4 years. In future work, over-fitting of the CNN model can be avoided by reducing the number of parameters. The proposed method can also be used for other tasks.
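The four-layer arrangement described above can be sketched as a plain forward pass. The code below is a simplified illustration and not the implementation of [91]: it assumes that each filter spans all event rows and slides along the time axis only, uses a ReLU activation, and takes hand-picked (untrained) weights; all names and values are hypothetical.

```python
import math


def conv_over_time(matrix, kernel):
    """Slide a kernel (events x width) along the time axis, then apply ReLU."""
    n_events, n_steps = len(matrix), len(matrix[0])
    width = len(kernel[0])
    feature_map = []
    for t in range(n_steps - width + 1):
        s = sum(matrix[e][t + j] * kernel[e][j]
                for e in range(n_events) for j in range(width))
        feature_map.append(max(0.0, s))
    return feature_map


def softmax(scores):
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(matrix, kernels, weights, biases):
    """Input matrix -> convolution -> max pooling over time -> dense softmax."""
    feature_maps = [conv_over_time(matrix, k) for k in kernels]
    pooled = [max(fm) for fm in feature_maps]
    scores = [sum(w * p for w, p in zip(row, pooled)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)


# Hypothetical patient: 2 event types (rows) observed over 4 time steps (columns).
ehr = [[1.0, 0.0, 1.0, 1.0],
       [0.0, 1.0, 1.0, 0.0]]
kernels = [[[1.0, 1.0], [0.0, 0.0]],   # fires when event 0 repeats
           [[0.0, 0.0], [1.0, 1.0]]]   # fires when event 1 repeats
weights = [[1.0, 0.0], [0.0, 0.5]]     # illustrative, untrained weights
biases = [0.0, 0.0]

probs = forward(ehr, kernels, weights, biases)
print(probs)
```

Max pooling keeps only the strongest response of each filter over time, which is one way to read the sparsity on the identified phenotypes mentioned above.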
Zhu et al. [92] measured patient similarities using a DL model along with medical concepts. The similarities were evaluated based on the temporal matching of longitudinal patient clinical notes. Both supervised and unsupervised methods, which preserve the temporal properties of EHRs, were employed for this task. A CNN model was used as the supervised learning method and learned representations of clinical records with medical event embeddings from word2vec. The results outperformed the baseline methods, with RI, Purity, and NMI values of 0.9887, 0.9882, and 0.9516, respectively. Future work includes adding time interval information to resolve data regulatory issues and applying the proposed method in other domains such as health visualization.
Convolutional Attention for Multi-Label classification (CAML), which is based on a CNN, was employed to predict medical codes from electronic health records [88]. CAML uses an attention mechanism to pool the convolution output for each label. The proposed method showed strong improvements over the baseline methods on the ICD-9 code prediction task. CAML was designed to be adaptable to ICD-10 codes. The discharge summaries in the MIMIC-III dataset (an open repository of ICU medical records) can be used in the future to handle non-standard writing and out-of-vocabulary tokens.
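Pooling the convolution output separately for each label can be sketched as follows. This is a minimal illustration of per-label attention under simplifying assumptions (dot-product scores, a softmax over positions, and a sigmoid output per label); it is not the implementation of [88], and all vectors and parameter names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_attention(conv_out, label_query):
    """Pool convolution outputs for one label.

    conv_out: one feature vector per text position.
    label_query: a vector for this label (learned in a real model).
    Returns the attention weights and the attended document vector.
    """
    scores = [sum(h * u for h, u in zip(pos, label_query)) for pos in conv_out]
    alpha = softmax(scores)
    dim = len(label_query)
    pooled = [sum(a * pos[k] for a, pos in zip(alpha, conv_out))
              for k in range(dim)]
    return alpha, pooled


def label_probability(conv_out, label_query, beta, bias):
    """Sigmoid output for one label from its own attended vector."""
    _, pooled = label_attention(conv_out, label_query)
    z = sum(b * v for b, v in zip(beta, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical convolution output: 3 positions, 2 features each.
conv_out = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
alpha, pooled = label_attention(conv_out, label_query=[2.0, 0.0])
prob = label_probability(conv_out, [2.0, 0.0], beta=[1.0, -1.0], bias=0.0)
print(alpha, pooled, prob)
```

Because each label has its own query vector, different labels can attend to different parts of the same document.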
Four multi-label classification approaches were employed to assign ICD codes to discharge summaries from the MIMIC-II and MIMIC-III clinical datasets [98]. The models were evaluated on all ICD-9 codes to reflect real-world ICD-9 tagging. The experiments covered an SVM model, a continuous bag-of-words (CBOW) model, a CNN model, and HA-GRU (a 'bidirectional Gated Recurrent Unit model with a Hierarchical Attention Mechanism'). HA-GRU achieved promising results by applying tokenization and hierarchical segmentation.
Brown et al. [99] used deep convolutional neural networks to diagnose retinopathy of prematurity (ROP) from retinal images. A dataset of 5511 retinal images was used with 5-fold cross-validation to evaluate the method. A reference standard diagnosis (RSD) was assigned to each image based on image grading (3 experts) and clinical diagnosis (1 expert), and each image was classified as normal, pre-plus, or plus disease. The method achieved an AUC of 0.94 and 0.99 for normal and plus disease, respectively. Furthermore, the algorithm achieved an accuracy of 0.91 on a test set of 100 images, outperforming 6 of 8 ROP experts.
A disease prediction model based on electronic medical records was developed to assess multiple diseases [86]. The proposed method uses a convolutional neural network for the prediction of multiple diseases. It was evaluated on data from 4298 patients with cerebral infection, pulmonary infection, and coronary heart disease. The CNN algorithm achieved its highest accuracy of 96.5% and F1-measure of 96.6% on the D1 dataset for cerebral infection.

2) RECURRENT NEURAL NETWORKS (RNN)
Recurrent neural networks are a class of artificial neural networks. Unlike in feedforward neural networks, the connections between their nodes form a directed graph along a temporal sequence, which allows them to exhibit temporal dynamic behaviour. Accurately mining EHR records is very challenging because of factors including incomplete prescriptions, free-text writing, and disease variability.
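The recurrence that gives these networks their temporal behaviour can be sketched with a simple Elman-style update, in which the hidden state at each step depends on the current input and on the previous hidden state. The sketch below is a generic illustration with hand-picked (untrained) weights; it is not the LSTM or GRU architecture used in the studies reviewed in this section, and all names and values are hypothetical.

```python
import math


def rnn_step(x, h_prev, w_xh, w_hh, b_h):
    """One recurrent step: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h)."""
    h = []
    for i in range(len(b_h)):
        z = b_h[i]
        z += sum(w_xh[i][j] * x[j] for j in range(len(x)))
        z += sum(w_hh[i][j] * h_prev[j] for j in range(len(h_prev)))
        h.append(math.tanh(z))
    return h


def run_sequence(xs, w_xh, w_hh, b_h):
    """Feed a sequence of input vectors through the recurrence."""
    h = [0.0] * len(b_h)
    for x in xs:
        h = rnn_step(x, h, w_xh, w_hh, b_h)
    return h


# Illustrative, untrained weights for 2 inputs and 2 hidden units.
w_xh = [[0.5, -0.3], [0.2, 0.8]]
w_hh = [[0.1, 0.4], [-0.2, 0.3]]
b_h = [0.0, 0.0]

# The same two visits in a different order lead to different final states.
h_a = run_sequence([[1.0, 0.0], [0.0, 1.0]], w_xh, w_hh, b_h)
h_b = run_sequence([[0.0, 1.0], [1.0, 0.0]], w_xh, w_hh, b_h)
print(h_a, h_b)
```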
An RNN-based long short-term memory (LSTM) architecture was proposed for predicting the risk of heart failure from EHRs [100]. A real-world combined dataset of 20,000 patients with congestive heart disease was used to evaluate the performance of the LSTM method. LSTM outperformed the LR, RF, and AdaBoost algorithms under 5-fold cross-validation.
Deep-Diagnosis [101] was introduced by Shi et al. to address the challenges of mining EHR records. The proposed method is built on an RNN-based diagnosis prediction algorithm that mines pediatric EHR data. They preprocessed a dataset of unstructured Chinese EHRs and used NLP concepts to create feature vectors by converting the values into sentence vectors. The patients' symptoms and their relationships were identified using the proposed bidirectional recurrent neural network. A pediatric dataset of 81476 records was used to train and test the model, which achieved a precision of 80.912. As future work, the method can be cross-evaluated on various EHR datasets, and other NLP measures can be used to extract specific information from EHR data.
Luo [105] used an RNN-based long short-term memory (LSTM) model to classify relations in EHR data. The model was evaluated on the i2b2/VA relation classification challenge dataset. The proposed model achieved F-measures of 0.61, 0.80, and 0.68 for classifying medical problem-treatment, medical problem-test, and medical problem-medical problem relations, respectively. Segment-level and sentence-level LSTMs were compared to demonstrate the difference between context and concept text. The impact of word embeddings on LSTM performance was also evaluated, and the proposed method was shown to perform comparably with existing methods.
The diagnosis and probability of clinical visits were predicted using demographic information and an RNN method (RNN-INFO) [102]. Data dimensionality was reduced by using a neural network hidden layer to extract a hidden representation of the clinical notes. A dataset of 289 thousand medical records and 28 thousand patients was used to evaluate the proposed model. The study showed better results than non-temporal models on both diagnosis and probability prediction. Furthermore, RNN-INFO (the model with demographic information) performed better than the plain RNN (the model without demographic information), with a significant improvement in probability prediction.
Wu et al. [103] introduced event sequences and their properties (evenness, synchronicity, and co-cardinality) for the classification of pediatric asthma (a chronic disease). They also determined how the inverses of these properties, i.e., unevenness, asynchronicity, and multi-cardinality, can assist in accounting for relative time. The results showed that embedding time stamps in the RNN model classifies patients without asthma better than patients with asthma.
A bidirectional recurrent neural network (BiRNN) was proposed to automatically extract medical information such as diseases and treatments from Chinese EHR data [106]. The proposed model was constructed in two steps. The first step involved training a shallow BiRNN model. In the second step, medical information was transferred from the general domain to train the BiRNN for automatic recognition of concepts. Transfer-BiRNN showed better results than the baseline methods and achieved accuracies of 88.7% and 88.49% for disease and symptom prediction, respectively.
A method based on an RNN that combines static and dynamic information was proposed for the prediction of clinical events [107]. A database of kidney transplant patients collected at the Charité Hospital in Berlin was used to evaluate the method. The database has three endpoints: kidney rejection, patient death, and kidney loss. The proposed model was used to predict these endpoints for each patient within 6-12 months after a clinic visit. The RNN with Gated Recurrent Units (RNN-GRU) outperformed the other methods on this task. Furthermore, the best results were obtained with binary-encoded inputs rather than normalized input data. Other information related to kidney loss, such as biopsies and their results, can be added to the model to obtain better results in the future.
The challenge of adverse medical event (AME) detection was addressed using a recurrent neural network based on context-aware attention embedding (CA-RNN) [104]. The salient words concerning the target were located using the proposed context-aware attention technique. A deep learning approach was then combined with this technique to boost the performance of adverse medical event detection. The proposed method was evaluated on 8845 cardiovascular medical records and achieved a recall of 0.929 and a precision of 0.561 for the AME of ischemia. Future work may include a larger dataset and the evaluation of the method on other diseases.

3) DEEP BELIEF NETWORK (DBN)
A deep belief network is one of the representative models of deep learning [112]. It has been widely adopted for tasks such as pattern recognition, handwriting recognition, speech recognition, and many others [113], [114]. It is also used for classification [115], soft sensors [116], and image-based monitoring [117]. A DBN is an unsupervised method that works on unlabeled data. It is mostly applied to construct hierarchical structures and uses unsupervised learning for feature representation. It is restricted in acquiring features because there is no direct connection between the neurons of the same layer [118].
In [108], a new multimodal disease recurrent convolutional neural network (MD-RCNN) was proposed for disease risk prediction. The model extracts fine-grained features from both structured and unstructured data. A Deep Belief Network (DBN) was used to obtain a non-linear connection between the unstructured and structured data by fusing their features. The model was evaluated on two Chinese datasets from 2013 to 2015 and achieved a state-of-the-art accuracy of 96%. Future directions of this work include the prediction of other diseases and of people's behaviour.
A deep belief network (DBN) was used in [111] to diagnose Parkinson's disease (PD) from speech signals obtained from the UCI repository. The proposed method was trained on the voices of healthy subjects and patients, and the extracted features were used as input to the DBN. The DBN used to categorize PD is composed of two stacked restricted Boltzmann machines (RBMs) and one output layer. The RBMs were trained with unsupervised learning to overcome the problem of initial weights, and backpropagation was used as supervised learning for fine-tuning. The proposed method achieved 94% accuracy for the diagnosis of Parkinson's disease.
In [109], a DBN was employed for the diagnosis of attention deficit hyperactivity disorder (ADHD), one of the most common disorders. The proposed method used a greedy approach for the construction and training of the network. The two training and testing datasets (New York University and Neuroimaging) were provided by the ADHD-200 Global Competition. The New York University (NYU) dataset contained 222 training and 41 testing samples, and the Neuroimaging (NI) dataset contained 48 training and 25 testing samples. The method achieved state-of-the-art accuracies of 0.6368 and 0.6983 on the NYU and NI datasets, respectively.
Faturrahman et al. [110] proposed a DBN method for the classification of Alzheimer's disease (AD) based on structural modalities (MRI data). The results of the model were compared with a support vector machine (SVM). Using features obtained from mean and standard deviation calculation (MSD) with fine-tuned hyper-parameters (25 hidden nodes, 250 epochs, and momentum from 0.5 to 0.9), the proposed method achieved 73.6%, 71.2%, and 76% accuracy, sensitivity, and specificity, respectively. Using features based on voxel values (VV) with 100 epochs and a momentum of 0, it achieved 91.7%, 90.5%, and 92.9% accuracy, sensitivity, and specificity, respectively. AD detection can be improved by applying feature selection and additional modalities such as PET and CSF.

4) AUTOENCODER (AE)
An autoencoder is a type of artificial neural network that uses unsupervised learning to learn data encodings. It is generally used for dimensionality reduction by learning an encoding for a set of data. AE concepts are now also employed to learn generative models of data [119], [120]. Modern AI approaches include AEs stacked in deep neural networks [121].
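Dimensionality reduction with an autoencoder can be illustrated with a deliberately small example. The sketch below assumes a linear autoencoder with tied weights that compresses 2-D points into a 1-D code and is trained by plain gradient descent on the reconstruction error; it is a simplification for illustration, not the stacked, non-linear autoencoders used in the studies reviewed in this section, and the data and hyper-parameters are hypothetical.

```python
def reconstruct(w, x):
    """Encode x into a 1-D code z = w . x, then decode it as x_hat = w * z."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return z, [wi * z for wi in w]


def reconstruction_loss(w, data):
    """Mean squared reconstruction error over the dataset."""
    total = 0.0
    for x in data:
        _, x_hat = reconstruct(w, x)
        total += sum((a - b) ** 2 for a, b in zip(x_hat, x))
    return total / len(data)


def train(w, data, lr=0.1, steps=200):
    """Gradient descent on the reconstruction loss (tied encoder/decoder)."""
    w = list(w)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x in data:
            z, x_hat = reconstruct(w, x)
            r = [a - b for a, b in zip(x_hat, x)]          # residual
            rw = sum(ri * wi for ri, wi in zip(r, w))
            for k in range(len(w)):
                # d/dw ||w (w.x) - x||^2 = 2 * (z * r + (r.w) * x)
                grad[k] += 2.0 * (z * r[k] + rw * x[k]) / len(data)
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w


# Hypothetical 2-D data lying on a line, so one code dimension suffices.
data = [[0.2, 0.4], [0.4, 0.8], [-0.2, -0.4], [-0.4, -0.8]]
w0 = [0.5, 0.5]
w_trained = train(w0, data)
print(reconstruction_loss(w0, data), reconstruction_loss(w_trained, data))
```

After training, the 1-D code `z` can be passed to a downstream classifier in place of the original features, which is the role the autoencoder plays in a two-step pipeline.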
Stacked autoencoder and softmax classification methods were used for the diagnosis and classification of cervical cancer [122], [123]. A UCI dataset composed of 668 samples, 4 targets, and 30 attributes was used to train and test the method. The methodology was divided into two steps: in the first step, an autoencoder was applied to the raw data to obtain a dataset of reduced dimensionality, and in the second step, a softmax layer was applied to classify the reduced dataset. The dataset was divided into a training set (70%) and a test set (30%). The proposed model was applied to four target variables (Citology, Schiller, Hinselmann, and Biopsy), and their classification performance was compared; the model achieved a correct classification rate of 0.978. Training the model is time-consuming because of the dimensionality reduction of the samples. Therefore, future work can focus on reducing the training time of the model.
In [124], medical procedures were encoded automatically using autoencoders. The medical procedures were classified by query matching with convolutional neural networks. The pipeline was built using CNN and AE with logistic regression (LR). Micro-F1 scores of 0.702 and 0.608 were achieved with autoencoders and convolutional neural networks, respectively, in determining the relevance between query text and category text. The proposed method was compared with various existing methods in terms of its suitability for automatic encoding. The Bayes algorithm can be adopted as a guiding procedure for knowledge reasoning. The current dataset (only 24,092 pairs) is small for fine-tuning and can be expanded in future work.
Hwang et al. [125] compared the disease prediction performance of generative adversarial networks (GANs) and conventional networks in combination with missing value prediction techniques. The best results were achieved using a stacked autoencoder (for missing value prediction) together with auxiliary classifier GANs (AC-GANs, for disease prediction), with an accuracy, sensitivity, and specificity of 0.98, 0.95, and 0.99, respectively. In this study, the GAN is used only as a generic model and the AE is used to fill in missing values. A future direction of this work is to use the GAN to fill in the missing values.
The automatic diagnosis of prostate cancer (PCa) from MRI images was performed using random forest and autoencoder methods [126]. An imbalanced dataset composed of diffusion-weighted, B-value, apparent diffusion coefficient, and transaxial T2-weighted images was collected from the PROSTATEx 2016 challenge. A single-level Sparse Autoencoder (SAE) was applied to extract high-level features. The class imbalance problem was addressed using Weka resampling, SMOTE ('Synthetic Minority Oversampling Technique'), and ADASYN ('Adaptive Synthetic'). The proposed method was compared with other baseline methods, and ADASYN followed by a random forest classifier was observed to achieve the best results, with an area under the ROC curve of 0.979, an accuracy of 0.936, and an F-measure of 0.94 for the diagnosis of PCa. A fully automated diagnosis system for prostate cancer can be built in the future.
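The core idea behind SMOTE, creating synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours, can be sketched as follows. This is a minimal illustration of the general technique, not the tooling used in [126]; ADASYN, which additionally adapts how many samples are generated per point, is not shown, and the function name, parameters, and data are hypothetical.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic samples from a list of minority-class points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample (squared distance)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position on the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class feature vectors.
minority = [[1.0, 1.0], [2.0, 1.0], [1.0, 2.0], [2.0, 2.0]]
new_points = smote_sketch(minority, n_new=5)
print(new_points)
```

Each synthetic point lies on a line segment between two existing minority samples, so the minority class is enlarged without duplicating records exactly.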

DL FRAMEWORKS FOR DISEASE DIAGNOSIS SYSTEMS
The Internet produces a bulk of raw data that can be used effectively to train DL models. Given the limitations of traditional ANN and ML techniques, DL is leveraged to cope with larger datasets and to support intelligent decision making and disease diagnosis. Recent advances in key enablers have influenced DL-based disease diagnosis, such as efficient regularizers (L1/L2, Dropout, etc.), optimization algorithms (ADAM, SGD, RMSProp, and so on), and commodity hardware (e.g., GPUs and I/O bus speed). Moreover, GPU-based systems have improved considerably in executing DL models for the diagnosis of various diseases. Technology giants such as Google, Microsoft, Apple, and Facebook are investing heavily in GPU-accelerated DL platforms. The DL models in these libraries are trained with the help of data and model parallelism. The details of prominent platforms used for disease diagnosis are as follows:

D. OPEN PROBLEMS
Although electronic health records help both patients and clinicians to keep track of patient progress and provide remote and faster access, many challenges are still associated with EHRs, such as data volume, staff training, data alteration, vendor selection, insufficient support material, implementation delays, scarce information, data security, and other Medicare technical issues. Different methods have been proposed to overcome these challenges. Though these methods are not intelligent enough to tackle all the problems of EHRs, several successes have been reported by employing machine learning and deep learning methods.
Disease diagnosis using rule-based methods is among the techniques most commonly practised by medical specialists. Rule-based methods are grounded in production rules (condition-action pairs). Constructing complex rules for highly dependent, high-volume data is a laborious and time-consuming effort. Another challenge associated with rule-based methods is scalability: once the rules have been written, adding anything beyond them may lead to errors. It is also ambiguous which rule to choose when two different rules are created for the same problem.
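Production rules and two of the difficulties noted above (conflicting rules and cases that fall outside the rule list) can be illustrated with a small sketch. The rules, fact names, and actions below are hypothetical and serve only to show the condition-action structure; they are not clinical guidance and are not taken from any cited system.

```python
def make_rule(name, condition, action):
    """A production rule is a condition-action pair."""
    return {"name": name, "condition": condition, "action": action}


# Hypothetical rules, for illustration only.
rules = [
    make_rule("fever_and_rash",
              lambda facts: bool(facts.get("fever") and facts.get("rash")),
              "order dengue test"),
    make_rule("fever_only",
              lambda facts: bool(facts.get("fever")),
              "monitor temperature"),
]


def fire(rules, facts):
    """Return the actions of every rule whose condition matches the facts."""
    return [rule["action"] for rule in rules if rule["condition"](facts)]


# Two rules match the same case, so a conflict-resolution policy is needed.
print(fire(rules, {"fever": True, "rash": True}))
# A case outside the rule list yields no action at all.
print(fire(rules, {"cough": True}))
```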
Diagnosis using machine learning methods works when a condition can be reduced to a classification task on physiological data, in areas where a clinician is able to identify patterns indicating the presence or absence of that condition. ML methods do not work properly when the problem cannot be framed as classification. Training usually requires a large amount of data; a model cannot be trained reliably with a small amount of data. Data preprocessing is another major challenge, because dealing with missing values and improving image quality (to obtain minute details and to differentiate between the background and the object) is very strenuous. Designing an ML model with labelled data is often useful, but developing an unsupervised ML model is tedious and lengthy. Other challenges include data accessibility, accurate interpretation of results, pertinent use of image structure in model design, and the application of results in clinical practice. The results of disease diagnosis using ML methods can be improved by dealing with these challenges.
Deep learning methods have achieved high performance in medical image analysis tasks. These methods are easily scalable, do not depend on deep domain knowledge, and can be developed rapidly. One of the major obstacles for deep learning approaches is the lack of training data. Acquiring pertinent annotations and labels is also a major problem. Labelling larger datasets (often annotated by domain experts) for tasks such as vessel segmentation and 3-D radiology segmentation is problematic because it takes a significant amount of time. Label noise and class imbalance remain open challenges for researchers.

III. DISCUSSION
Physicians or medical experts diagnose a specific disease by observing the patient's physical condition, exploring symptoms, or through laboratory tests and pathological reports. The patient's history is usually stored as a prescription for necessary medication and treatment plans and to track disease progression. Initially, this was done on paper, which was inefficient for many reasons, including the lack of remote access and the difficulty of issuing reminders for vaccination dates, streamlining workflow, and keeping track of the patient's performance. With the advancement of technology, prescriptions are saved in an electronic (digital) format known as EHR or EMR. These electronic records help physicians or medical experts to access patient records instantly, to keep track of patients' due dates for checkups and immunizations, and to monitor the patient's health performance.
Initially, medical expert systems were designed for the automatic diagnosis of diseases. They were generally query-based, with the list of diseases stored in a database against their symptoms. Medical expert systems are generally concerned with writing computer programs that perform disease diagnosis. Although many applications have been developed, these systems are still very expensive and suffer from technical and usability issues. Various issues related to the acceptance of these medical systems are discussed in [88], [92], [129]. These issues generally concern the acceptance rate of these systems, their time consumption, and their manufacturing cost. Fuzzy expert systems are commonly used in both experimental and applied medicine. Pabbi [130] employed an expert system to predict dengue fever and achieved 95% accuracy, using a sample-based dataset along with lab reports and symptoms to evaluate the proposed method. Main and Oldham [131] identified 17 rules for designing expert systems. They further highlighted the importance of these rules and explored how people's behaviour changes over time. Knowledge-Based (KB) systems are widely adopted for knowledge-intensive tasks and are used to handle problems related to the laboratory and pathology. A survey of KB systems in the laboratory carried out by Spackman and Connelly [132] suggested that these systems have increasing value in the laboratory and pathology.
An EHR contains various kinds of information, including X-ray, CT, pathology, and retinal images, laboratory results, progress notes, and medical histories. This information has been used extensively for the automatic detection and classification of diseases. Although this information is not complete enough to diagnose a disease accurately, measures such as preprocessing, feature extraction, and model implementation are used to achieve optimum diagnostic accuracy. Textual information from patient progress notes and medical histories has been extracted for disease diagnosis [25], [133], [134]. Retinal images are used for various retinal diseases, such as glaucoma [135], diabetic retinopathy [136], [137], hemorrhage [138], hard exudate [139], retinopathy of prematurity [140], and age-related macular degeneration [141]. Pathology images are used as input for the detection of colorectal cancer [142], prostate cancer [143], breast cancer [144], [145], malaria [146], and metastases [147], and for the classification of various diseases, including colon cancer [148], glioma [149], cancer tissue [150], and thyroid cytopathology segmentation [151]. Likewise, multiple diseases have been diagnosed over the years from X-ray and CT images, e.g., thoracic disease [152], dental disease [153], [154], Alzheimer's [155], pulmonary embolism [156], lung disease [157], and liver image classification [158].
Rule-based systems are used to extract medical information from electronic health records [25], [26]. Their rules are conditional statements that match the relevant cases and extract the attributes related to a disease. The human-specified extraction rules are based on a pattern-matching scheme. Many rule-based systems have been developed to process clinical reports, including MedLEE [36], MetaMap [159], and SymTex [160]. MedEx, a rule-based system developed by Xu et al., extracts medication information and achieved an F-measure of 0.90 on extracting drug names, routes, dosages, and drug frequencies from clinical notes [161]. Although many rule-based systems have shown state-of-the-art results, with above 80% accuracy in diagnosing a particular disease, writing these rules is a very tedious job. Moreover, these systems are not intelligent enough to make a decision if some symptoms are coherent or fall outside the rule list.
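A human-specified extraction rule of the pattern-matching kind described above can be sketched with a regular expression. This is a minimal illustration only and is not how MedEx or any other cited system is implemented; the `MEDICATION_RULE` pattern, the abbreviation lists, and the `extract_medications` function are assumptions made for the example.

```python
import re

# Hypothetical hand-written rule: drug name, dose, optional route, optional frequency.
# The unit, route, and frequency vocabularies are small illustrative lists.
MEDICATION_RULE = re.compile(
    r"(?P<drug>[A-Za-z]+)\s+"
    r"(?P<dose>\d+(?:\.\d+)?\s?(?:mg|mcg|g|ml))"
    r"(?:\s+(?P<route>po|iv|im|sc))?"
    r"(?:\s+(?P<frequency>daily|bid|tid|qid|q\d+h))?",
    re.IGNORECASE,
)


def extract_medications(note):
    """Return one dict per rule match; attributes the rule did not find are None."""
    return [match.groupdict() for match in MEDICATION_RULE.finditer(note)]


if __name__ == "__main__":
    note = "Started metformin 500 mg po bid. Continue lisinopril 10mg daily."
    for medication in extract_medications(note):
        print(medication)
```

The rule only fires on text that fits its pattern. A note such as "metformin twice a day" yields no match, which illustrates why every new phrasing requires another hand-written rule.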
With the advancement of AI, machine learning methods have evolved very rapidly and have been used for medical diagnosis over the last few years. ML techniques are used to develop automatic, sophisticated, and objective methods for the analysis of multidimensional biomedical data [162]. Various supervised and unsupervised machine learning methods, including SVM, NB, DT, and Random Forests, are used in medical diagnosis [47]-[52], [54]-[57]. A review of malaria diagnosis using ML methods was performed by Sajana and Narasingarao [163]. Malaria has been identified through segmentation of the erythrocytes, estimation of the life stages and density of parasites, and the application of multiple techniques. These methods have shown promising results even on severe diseases such as various types of cancer, diabetic retinopathy, and diabetes. There is still a considerable research gap for researchers to improve prediction results by introducing novel techniques or by integrating multidimensional heterogeneous data with multiple approaches to feature selection and classification.
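As one concrete instance of the supervised methods listed above, the following is a minimal sketch of a Bernoulli Naive Bayes classifier with add-one smoothing, written in plain Python. The toy records, the feature names, and the functions `train_naive_bayes` and `predict` are hypothetical and serve only to illustrate the technique; they do not reproduce any of the reviewed methods.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(records):
    """records: list of (set_of_findings, label). Returns the counts needed for prediction."""
    label_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocabulary = set()
    for findings, label in records:
        label_counts[label] += 1
        for finding in findings:
            feature_counts[label][finding] += 1
            vocabulary.add(finding)
    return label_counts, feature_counts, sorted(vocabulary)


def predict(model, findings):
    """Return the label with the highest log posterior under the Bernoulli model."""
    label_counts, feature_counts, vocabulary = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        for finding in vocabulary:
            # Add-one (Laplace) smoothed probability that the finding is present.
            p = (feature_counts[label][finding] + 1) / (count + 2)
            score += math.log(p if finding in findings else 1 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    # Toy, made-up training records for illustration only.
    toy_records = [
        ({"polyuria", "high_glucose"}, "diabetes"),
        ({"high_glucose", "fatigue"}, "diabetes"),
        ({"fever", "chills"}, "malaria"),
        ({"fever", "fatigue"}, "malaria"),
    ]
    model = train_naive_bayes(toy_records)
    print(predict(model, {"high_glucose"}))
```

Smoothing keeps a finding that was never seen with a label from forcing that label's probability to zero, which matters for EHR data with missing values.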
DL methods model high-level abstractions in data by applying multiple layers. Diagnosis through deep learning works when it is limited to the classification task on EHR data. Various deep learning models such as CNN, RNN, RBM, and AE have been used for medical diagnosis from EHR data [88], [91], [92], [98], [99]. These methods have gained worldwide acceptance for disease diagnosis from discharge summaries because of their layered structure, which includes convolution, pooling, normalization, and fully connected layers. DL is considered the best choice for medical image analysis, including object detection [164], [165], image classification [166]-[168], registration [169]-[171], and segmentation [172]-[175].
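The layer types named above can be illustrated with a forward pass over a one-dimensional signal. This is a sketch under simplifying assumptions: the weights are fixed by hand rather than learned, the normalization layer is omitted, and the function names and input values are hypothetical.

```python
def conv1d(signal, kernel):
    """Valid cross-correlation, the operation usually called convolution in DL."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


def dense(values, weights, bias):
    """Fully connected layer with a single output unit."""
    return sum(v * w for v, w in zip(values, weights)) + bias


if __name__ == "__main__":
    signal = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 2.0, 0.0]  # made-up input
    kernel = [-1.0, 2.0, -1.0]                          # hand-picked peak detector
    features = relu(conv1d(signal, kernel))
    pooled = max_pool1d(features, 2)
    print(pooled, dense(pooled, [0.5, 0.5, 0.5], -1.0))
```

In a trained network the kernel and dense weights would be learned from labelled data; here they are chosen by hand so that each layer's effect can be followed step by step.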
In this review, we examined the various approaches used for disease diagnosis from electronic medical records. We categorized these approaches into rule-based methods, machine learning methods, and deep learning methods. We covered the major categories of machine learning methods, including support vector machine, Naive Bayes, decision tree, and random forest methods. Deep learning methods are sub-divided into convolutional neural network methods, recurrent neural network methods, autoencoder methods, concept representation methods, and restricted Boltzmann methods. We covered various aspects of the research in tabular format and also presented a graphical representation of publications with respect to year and disease.
We have tried to cover the latest and existing techniques used for disease diagnosis using EHR. The advantages and drawbacks of the various proposed algorithms in each category are presented in this paper. The yearly breakdown of papers by category is also given in graphical format. We aim to give professionals a structured view of the current research and to familiarize readers with the various disease diagnosis approaches.

IV. CONCLUSION
A great deal of work has been done on the automatic extraction of useful information from electronic health records, clinical notes, and discharge summaries. Physicians use the extracted features as input for the automatic diagnosis of disease. Initially, this extraction was performed with knowledge bases. In the present era, various rule-based learning, machine learning, and deep learning concepts are used for extraction and for the diagnosis of diseases. Several challenges are associated with this automatic extraction, including missing values, incomplete information, and data abundance. We have reviewed recent research on the automatic diagnosis of various diseases from electronic medical records. We categorized the work into three classes: 1) Rule-Based Methods, 2) Machine Learning Methods, and 3) Deep Learning Methods. These categories are further divided into sub-categories based on the proposed algorithm. In this review, we tried to cover almost all the latest and existing research on automatic diagnosis from electronic records. We presented the benefits, limitations, and future directions of various data-driven methods, along with the datasets employed and the diseases in focus. Moreover, we tried to establish a professional structure to familiarize readers with up-to-date automatic disease diagnosis techniques.
He has a range of publications in these fields in the conferences and journals of repute. He is a Co-Chair, a TPC Member, and a Reviewer for prestigious international conferences and journals, including Smarttech 2020, CSAE 2018, 2019, and 2020, and IEEE ACCESS.
AZHAR IMRAN (Graduate Student Member, IEEE) received the B.S. degree in software engineering and the M.S. degree in computer science from the University of Sargodha, Pakistan, in 2012 and 2016, respectively. He is currently pursuing the Ph.D. degree with the Beijing University of Technology, Beijing, China. From 2012 to 2017, he was a Lecturer with the Department of Computer Science, University of Sargodha. He has published many articles in reputed journals and conferences. His research interests include machine learning, image processing, medical imaging, and data mining. His awards and honors include the China Scholarship Council (CSC), China, and the Star Contribution Award for best researcher 2018 from the Beijing University of Technology.
ANAS BILAL received the B.S. degree in telecommunication and networks from Iqra University, Pakistan, in 2013, and the M.S. degree in electrical and electronic systems from the University of Lahore, Pakistan, in 2016. He is currently pursuing the Ph.D. degree with the School of Information Technology, Beijing University of Technology. His research interests include neural networks, medical image analysis, machine learning, sentiment analysis, pattern recognition, signal and image processing, geometrical channel modeling, channel estimation, and mobile communications.