An Electrocardiographic System With Anthropometrics via Machine Learning to Screen Left Ventricular Hypertrophy among Young Adults

The prevalence of physiological and pathological left ventricular hypertrophy (LVH) among young adults is about 5%. A use of electrocardiographic (ECG) voltage criteria and machine learning for the ECG parameters to identify the presence of LVH is estimated only 20-30% in the general population. The aim of this study is to develop an ECG system with anthropometric data using machine learning to increase the accuracy and sensitivity for a screen of LVH. In a large sample of 2,196 males, aged 17–45 years, the support vector machine (SVM) classifier is used as the machine learning method for 31 characteristics including age, body height and body weight in addition to 28 ECG parameters such as axes, intervals and voltages to link the output of LVH. The diagnosis of LVH is based on the echocardiographic criteria for young males to be 116 gram/meter2 (left ventricular mass (LVM)/body surface area) or 49 gram/meter2.7 (LVM/body height2.7). On the purpose of increasing sensitivity, the specificity is adjusted around 70-75% and all data tested in proposed model reveal high sensitivity to 86.7%. The area under curve (AUC) of the Precision-Recall (PR) curve is 0.308 in the proposed model which is better than 0.109 and 0.077 using Cornell and Sokolow-Lyon voltage criteria for LVH, respectively. Our system provides a novel screening tool using age, body height, body weight and ECG data to identify most of the LVH among young adults. It provides a fast, accurate and practical diagnosis tool to identify LVH.


I. INTRODUCTION
Artificial intelligence (AI) grows fast with the improvement of technology and the availability of various kinds of big data. Machine learning, an AI of the computational statistics, has been introduced in clinical medicine which could provide accurate diagnosis of disease and prediction of the risk [1]- [12]. For example, [12] utilizes the random survival forest technique identifying the top-20 risk factors of cardiovascular events and the performance is superior to the traditional risk calculators. In the modern era, using machine learning techniques has become an efficient and reliable tool for clinical practice by physicians globally.
Left ventricular hypertrophy (LVH), which is clinically considered as a sign of end-organ damage related to longterm hypertension, has been associated with heart failure and cardiovascular disease events among middle and old-aged individuals [13], [14]. In contrast, the prevalence of LVH in young adults is low, accounting for approximately 5% [15], and the phenotypes are usually caused by physiologic adaptions to intense physical training [16] and congenital hypertrophic cardiomyopathy [17]. A prior population research also shows that the presence of LVH at young ages is associated with higher risk of incident cardiovascular disease events [18]. The 12-lead surface electrocardiography (ECG) is the currently most used tool for screening the presence of LVH in the general population [19]. Several ECG-based criteria such as the Cornell and Sokolow-Lyon formulas have been proposed for more than 30 years [20], [21]; however, the performance of the ECG-based criteria for LVH consistently yields high specificity (>95%) but low sensitivity (20%-30%). Over the past 5 years, a few population studies were presented by machine learning and deep learning for the VOLUME 8, 2020 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ ECG characteristics to detect presence of LVH [1]- [3]. For hypertrophic cardiomyopathy (HCM), one of the most common pathological LVH, Rahman et al. firstly used the random forest and the support vector machine (SVM) techniques, and 5-fold cross validations, for hundreds of ECG characteristics training, where showed excellent results regarding the sensitivity, specificity and precision up to 90% in a hospitalbased population study [2]. Subsequently, Tison et al. used the deep learning of convolutional neural network for numerous ECG parameters to identify HCM, consistently showing excellent results in a hospital-based population study [3]. However, the sensitivity using the specific ECG criteria for HCM has approached up to 90% [22], [23]. By contrast, a community-based population study using machine learning for the ECG to screen any unspecific LVH was proposed by Sparapani et al. [1]. Despite this study utilized the treebased, Bayesian nonparametric machine learning technique for a number of ECG characteristics alone, the sensitivity for detecting any unspecific LVH phenotype in a general population of middle and old aged individuals is increased up to 29.0%, merely a little improvement compared with the other ECG criteria for LVH [1]. Accordingly, computerized training of the ECG alone might not be adequate in screening for unspecific LVH in the general population level.
In this paper, we aim to develop a clinically accessible ECG-based system which uses a large sample of the military young personnel taking age, anthropometric data and several ECG characteristics into account for machine learning by the method of SVM to predict the presence of unspecific LVH as shown in Fig. 1. The rest of this paper is organized as follows. The materials are presented in Section II. In Section III, the proposed algorithm regarding the ECG system for unspecific LVH detection is described in detail. Section IV displays the experimental results. Section V concludes this paper.

II. DATA COLLECTION AND FEATURES SELECTION A. DATA COLLECTION
This study uses a population of 2,196 military males aged 17-45 years from the ancillary cardiorespiratory fitness and hospitalization events in armed forces (CHIEF) substudy performed in the Hualien Armed Forces General Hospital in Hualien city, Taiwan, R.O.C. Each participant received an ECG and an echocardiography at the same clinic visit in a health examination prior to the annual exercise tests for the military rank promotions and awards. The study design has been described in detail previously [24]- [33]. The raw data of 12-lead ECG parameters are interpreted by the software products of two ECG manufacturers: one is the CAR-DIOVIT MS-2015 (Schiller AG, Baar, Switzerland), and another one is the TC70 CARDIOGRAPH (Philips, Amsterdam, Netherlands). The transthoracic echocardiography is performed via the IE33 (Philips, Amsterdam, Netherlands). All the echocardiography and ECG procedures are implemented by the same technician who has been certificated with plenty of experiences for longer than 20 years. The 28 ECG characteristics adopted in the proposed method include heart rate, the durations of P wave, PR interval, QRS interval, QT interval and QTc interval in Lead II, and the axes of P, QRS, and T waves in Lead II, and the voltages of R waves in all Limb Leads I, II, III, aVR, aVL, aVF and S wave in Lead aVL, and the voltages of R and S waves in all precordial Leads V1-V6, where the voltage of 1 mV indicates 10 mm. In addition, a population of 203 military females aged 17-42 from the ancillary CHIEF substudy is utilized as an additional test set using the male model of machine learning by age, anthropometric data and ECG parameters. The comparison methods are the Sokolow-Lyon voltage criterion for LVH [20] and the Cornell voltage criteria for males and females [21], which are revealed in Table 1, respectively.  The diagnosis of LVH is based on the recommendations of the American Society of Echocardiography [34]. Quantification of left ventricular internal dimension (LVIDd) and left ventricular wall thickness including interventricular septum (IVSd) and left ventricular posterior wall (LVPWd) is measured by M-mode and 2-dimensional methods at the mitral valve tips and at the onset of the QRS complex in ECG of end diastole in echocardiographic parasternal long axis view. Left ventricular mass (LVM) is calculated on the basis of the corrected echocardiographic formula proposed by Devereux et al. [35] as shown in (1).
LVM is respectively indexed for body surface area (BSA) and for height 2.7 based on the Dubois and Dubois [36] and de Simone et al. formula [37], respectively. Echocardiographic LVH for young males is defined to be the 95 th percentile of the military males and according to the finding of a prior Coronary Artery Risk Development in Young Adults study (CARDIA) [18]. In addition, echocardiographic LVH for young females is defined according to the 95 th percentile of the military females in the CHIEF study and the results of a prior study for Southeastern Asia young females [38]. The sex-specific echocardiographic criteria for LVH are listed in Table 1. To develop the proposed machine learning method, the data are partitioned into 80% for cross validation and 20% for test for the male samples. The study protocol was approved by the Institutional Review Broad of Mennonite Christian Hospital (No. 16-05-008) in Hualien City, Taiwan.

B. PRE-TEST FOR INPUT FEATURES
To select the proper features, at initial stage, we stepwise add several biological parameters on the 28 ECG parameters, as input features for SVM machine learning to determine the most clinically efficient system. These biological parameters include age, body height, body weight, body mass index (BMI), waist circumference, systolic blood pressure (SBP), diastolic blood pressure (DBP) and body surface area (BSA). The preliminary performances of additional biological parameters and adopted 28 ECG parameters are listed in Table 2. For the stepwise pre-test of input features, we only take training set and test set without cross validation for SVM model to compare the results of various ECG-based combinations. As shown in Table 2, when more input parameters are trained, there are larger area under curves (AUCs) of the Receiver Operating Characteristic (ROC) curves and Precision-Recall (PR) curves in the test set. A significant improvement in AUCs of the ROC and PR curves is observed when using age, body height, body weight with the 28 ECG parameters as inputs to relate to the output of LVH. Additional inputs of BMI, waist circumference, BSA, SBP and DBP are neutral or merely increase a little in performance. Thus, the 31 features including age, body height, body weight, and the 28 ECG parameters are determined as the input features of our machine learning model. The average values in each parameter of the participants are revealed in Table 3. The label of LVH is by the echocardiographic LVM/BSA ≥ 116 gram/meter 2 or LVM/height 2.7 ≥ 49 gram/meter 2.7 for young males. As shown in Table 3, the characteristics in those with and those without LVH are continuous data which are expressed as mean ± standard deviation and compared by two samples t-test. A p-value < 0.05 is considered significant. Notably, older age, lower body height and greater body weight are observed in those with echocardiographic LVH.

III. PROPOSED METHOD
According to the preliminary pre-test outcomes, the 31 input factors for machine learning are age, body height, body weight and the adopted 28 ECG parameters. This paper uses the SVM for predicting the presence of LVH among young adults. The reason for selecting SVM as the model is due to the advantages of SVM classifier which are effective in high dimensional spaces and memory efficient, and could provide successful discriminative models in many fields [2], [39]- [41]. In addition, the training time and running time of SVM are extremely short. Therefore, we utilize the SVM machine learning technique which can be feasible in an ECG equipment to achieve practical application. The flowchart of the proposed method is illustrated in Fig.2.

A. DATA NORMALIZATION
Firstly, we use the normalization of Min-Max scaling [42], [43] to individually normalize the original data of 31 input features into the interval between 0∼1 for solving the problem of different dynamic ranges of various input features. Min-Max normalization performs a linear transformation on the original data. Each of the actual data d of feature f is mapped to a normalized value which lies in the range of 0 to 1. The Min-Max normalization is calculated by using (2).
where d indicates the original data of feature f among the 31 input features, min(f ) and max(f ) represent the minimum and maximum values of the input feature f , respectively. d denotes the normalized data.

B. CROSS VALIDATION
The data of 2,196 military males are segmented into a total training and validation set and a test set with 4:1 ratio. The total training and validation set is partitioned into four equal size groups. Among the four groups, one group is treated as the validation set for validating the model, and the remaining three groups are taken as the training set. Each of the four groups is used once as the validation set. The proportions of non-LVH and LVH cases are similar across each group. The cross validation process is then repeated four times. Four AUCs of PR curves from the four folds are averaged as a single performance of the results. By using a 4-fold cross validation, a better generalization assessment of the performance for training can be obtained.

C. APPLICATION OF SMOTE
The data numbers illustrated by four folds are described in detail in Table 4. Our datasets are predominately composed of non-LVH cases with only a small percentage of LVH cases since the prevalence of LVH in young adults is approximately 5%. For example, in Table 4, the 1st cross validation, the numbers of the training set and validation set are 1,317 (Non-LVH: 1,225, LVH: 92) and 439 (Non-LVH: 408, LVH: 31), respectively. This imbalance in sample sizes between the Non-LVH and LVH cases is obvious. The solution for this issue is to increase LVH cases in pre-processing by applying the synthetic minority over-sampling technique (SMOTE) [44]. The main idea of SMOTE is to create new minority class samples by choosing a near minority class neighbor randomly and interpolating as described as follows. Firstly, for each minority class sample S i , its k nearest neighbors from other minority class samples are taken. Secondly, minority class sample S j among the k neighbors is randomly selected. VOLUME 8, 2020 Finally, the S New is generated as the synthetic sample by interpolating between S i and S j as (3).
where rand (0, 1) stands for a random number between 0 and 1. S j ∈{k neighbors of S i }. The process of applying SMOTE can be treated as interpolating between two LVH samples in the viewpoint of geometry. The decision space for the LVH samples is expanded. Thus, it allows the SVM method to have a higher prediction performance on unknown LVH samples. The SMOTE is used in the process of 4-fold cross validation. The training data of the LVH group are preprocessed and augmented by SMOTE to be the same numbers with those of the Non-LVH group as 1,225, 1,230, 1,229 and 1,215, respectively, for the four folds as shown in Table 4.
We also compare the performances of the test set for the proposed model with and without using SMOTE.

D. MACHINE LEARNING MODEL
Our proposed method utilizes SVM [45]- [48] as a binary classifier for machine learning. In our method, a 31-dimensional vector represents a data point and we employ the linear kernel (Linear SVM) to separate such points by a 30-dimesional hyperplane. Maximum-margin is constructed by SVM so that the distance from the hyperplane to the nearest subset of the training data points (support vectors) of Non-LVH or LVH class is maximized. The softmargin SVM is adopted in our method. Soft-margin SVM allows the wide decision margin and some outliers are inside or on the wrong side of the margin. Let x i ∈ R 31 denote a 31-dimensional training vector with associated label y i ∈ {1, −1}. x i also includes the synthetic samples of LVH group applied by SMOTE and all the data of x i are processed by Min-Max normalization. n indicates the number of training vectors. The weight vector w, which is related to the construction of hyperplane for SVM, is obtained by solving the objective function as shown in (4) [41], [47]. The second term in (4) is the squared hinge loss (L2 loss) function for the soft-margin SVM evaluated on the training data and weighted by hyperparameter h. The soft-margin formulation can help in avoiding over-fitting.
where h is a hyperparameter which decides the trade-off between maximizing the margin and minimizing the training error. When h is large, avoiding misclassification is emphasized at the expense of maintaining the margin small, whereas when h is small, classification errors are presented less importance and focus is more on maximizing the margin. The optimized hyperparameter h is chosen by grid search according to the average AUC of the PR curves of the cross validation in our algorithm. As demonstrated in Fig.2, the hyperparameter h is initialized to 0.02. The training processes with the increment 0.001 of h for grid search is iterated until h reaches to 1. The optimized hyperparameter is chosen based on the highest AUC of the PR curves among the candidates of h.
After selecting the optimized hyperparameter, the SVM training model will be determined by the data in the total training and validation set. As shown in Table 5, the data of total training and validation set for the LVH group are preprocessed by SMOTE, and the number is increased to 1,633.

IV. RESULTS AND DISCUSSION
Our proposed method is implemented using the software scikit learn v0.20.2 with Python programming language [49]. In addition, the optimal weight vector w for hyperplane is obtained by LIBLINEAR (A Library for Large Linear Classification) [47], an open source library for large linear classification. The optimized hyperparameter h 0.322 is chosen when the highest AUC of the PR curve averaged from the 4-fold cross validation is found from the values of 981 trials.

A. PERFORMANCE MEASUREMENT
The specificity 70-75% is the criterion to decide the most appropriate test cut-off probability [50] for our SVM method. The performance is assessed by several standard measurements including accuracy, specificity, sensitivity (recall), precision, F 1 score, the AUC of the ROC curve and the AUC of the PR curve [51], [52].
The definitions of accuracy, specificity, sensitivity and precision are calculated by true positive (TP), true negative (TN), false positive (FP), and false negative (FN) as denoted in (5) - (8). The F 1 score, which is the harmonic average of the precision and recall, is described in (9).

B. RESULTS
The results of the 4-fold cross validation for the validation set with the optimized hyperparameter are shown in Table 6. The prevalence of LVH in the validation set is range from 4.8%  to 8.2% as shown in Table 6. Average accuracy, specificity, sensitivity, precision and F 1 score are 73.3%, 72.9%, 78.8%, 18.3% and 29.3%, respectively. The ROC and PR curves for the four folds are compared in Fig. 3. The average AUC of the ROC curve is 0.828 and the average AUC of the PR curve is 0.289. Table 7 shows the prediction results of the total training and validation set, test set and total data. In the total training and validation set, the SMOTE is applied for the LVH group to increase the prevalence rate to 50%. Thus, the precision, F 1 score and AUC of PR curves are much better than the other two datasets. In the test set and total data, the prevalence of LVH is generally distributed around 6-7% in the population of young adults. Our proposed SVM machine learning method is also compared with the Sokolow-Lyon voltage and Cornell voltage criteria for LVH as shown in Table 8. All data of the 2,196 military males are tested in the model. With the specificity of 73.3%, intended to be set between 70-75%, our SVM technique provides much better sensitivity 86.7% compared to 3.3% and 52.7% regarding the Cornell and Sokolow-Lyon voltage criteria, respectively. The ROC and PR curves for the three approaches are shown in Fig. 5. It is obvious that the proposed method is much superior to the other two traditional ECG voltage criteria.
In addition, we also test the CHIEF military female subcohort data with the label of echocardiographic LVH by the definitions of LVM/BSA ≥ 88 gram/meter 2 or LVM/height 2.7 ≥ 41 gram/meter 2.7 for young females using the proposed SVM VOLUME 8, 2020   model trained by the military young males. The baseline data with an average for each adopted biological and ECG features of the female participants with and without echocardiographic LVH are revealed in Table 9. It is contrary to the findings of the male participants that there are no significant differences in the adopted biological parameters including age, body height and body weight, and there are only two significant differences in the ECG characteristics including the amplitudes of R waves in chest Leads V1 and V3. In addition, the results of the female test set with regard to the accuracy, specificity, sensitivity, precision and F 1 score are 76.4%, 76.3%, 76.9%, 18.2% and 29.4%, respectively, and shown in detail in Table 10. Compared to the conventional Sokolow-Lyon voltage and Cornell voltage criteria specifically for females [21], the proposed SVM method can provide superior performance evaluated by F 1 score, and the AUCs of the ROC and PR curves. The ROC and PR curves obtained from the female's test data are shown in Fig. 6.   the disadvantages are that most of the studies include a small sample size of participants [53]- [55], or the output is aimed merely for HCM [2], [3] but not for all kinds of LVH phenotypes in the general population. To our knowledge, the Multi-Ethnic Study of Atherosclerosis (MESA) study might be the only one research in screening for the presence of any unspecific LVH based on the definition of cardiac magnetic resonance imaging in a large sample size of middle and old aged general population [1]. As compared with MESA, our study has superior results as both the ECG characteristics and simple biological features are trained by the SVM. On the basis of feature importance analysis of the proposed SVM model, it is obvious that body height and body weight are the strongest predictors of LVH among young adults. We also notice that an addition of systolic and diastolic blood pressures on the currently used SVM model can improve merely a little or is similar in the detection of LVH. It is reasonable that elevated levels of blood pressure are highly correlated with greater body mass index among young adults [56] and the effect time on cardiac remodeling is relatively short. Therefore, blood pressure may not play a critical role on the development of LVH in young adults.

V. CONCLUSION
This study develops a clinically effective ECG-based system with age and simple anthropometric data through the SVM machine learning technique in screening for unspecific LVH among young adults, which improves much in the sum of sensitivity and specificity as compared with the traditional ECG criteria for LVH or using the ECG parameters alone for the machine learning. The sensitivity of our proposed method achieves up to 92.6%. In addition, since the test performances regarding accuracy, specificity, sensitivity, precision, F 1 score, and the AUCs of the ROC and PR curves for the female samples are not optimal by adopting the SVM model trained by male samples, future studies should be done to clarify the validity of our system operated specifically for young females.