Heartbeat Classification by Random Forest With a Novel Context Feature: A Segment Label

Objective: Physicians use electrocardiograms (ECG) to diagnose cardiac abnormalities. Sometimes they need to take a deeper look at abnormal heartbeats to diagnose patients more precisely. The objective of this research is to design a more accurate heartbeat classification algorithm to assist physicians in identifying specific types of heartbeats. Methods and procedures: In this paper, we propose a novel feature, called a segment label, to improve the performance of a heartbeat classifier. This feature, provided by a Convolutional Neural Network, encodes the information surrounding the particular heartbeat. A random forest classifier is trained on this new feature together with other traditional features to classify the heartbeats. Results: We validate our method on the MIT-BIH Arrhythmia dataset following the inter-patient evaluation paradigm. The proposed method is competitive with other similar works. It achieves an accuracy of 0.96, and F1-scores for normal beats, ventricular ectopic beats, and Supra-Ventricular Ectopic Beats (SVEB) of 0.98, 0.93, and 0.74, respectively. The precision and sensitivity for SVEB are 0.76 and 0.78, which outperform the state-of-the-art methods. Conclusion: This study demonstrates that the segment label can contribute to precisely classifying heartbeats, especially those that require rhythm information as context information (e.g. SVEB). Clinical impact: Using a medical device embedding our algorithm could ease physicians’ process of diagnosing cardiovascular diseases, especially for SVEB, in clinical implementation.


I. INTRODUCTION
Cardiovascular disorders cause 30% of the deaths worldwide [1]. The electrocardiogram (ECG) is an important tool for cardiologists to diagnose cardiovascular diseases [2], which can often be regarded as an ECG classification task. Meanwhile, single heartbeat classification based on ECG is also of great importance. Firstly, because ECG consists of heartbeat signals, heartbeat classification can be a foundation of ECG classification. Secondly, a good heartbeat classifier is a better tool to work with than an ECG recording classifier for the following reasons: 1) It can help cardiologists make more sophisticated diagnoses based on certain heartbeats. 2) It is more transparent than the ECG classifier since physicians can verify the results and understand the inference more easily, because it focuses on a smaller segment of ECGs.
In the last decade, many studies utilized machine learning to classify heartbeats. Those machine-learning-based classifiers are mostly trained with features from the following two categories: 1) medical features [3], [4], such as pre_rr, post_rr, local_rr, and global_rr; and 2) statistical features from the field of signal processing [3], [4], [5], such as higher-order statistics, wavelet transform coefficients, entropy, and energy density. Besides those two categories, some studies proposed novel features of their own, such as Sparse Representation [6].
In recent years, some researchers have also used deep learning to classify heartbeats. Among them, the one-dimensional Convolutional Neural Network (CNN) [14], [15], [16] is more widely used than the Deep Neural Network (DNN) [17]. To take advantage of the context information around the particular heartbeat, some studies use the Recurrent Neural Network (RNN) [18], and some use the Echo State Network (ESN), which is less likely to overfit [19].
However, most state-of-the-art methods do not show very satisfactory performance under the inter-patient evaluation paradigm [20], especially for the heartbeat classes that require context information around the heartbeat, e.g. supraventricular ectopic beat (SVEB). This work focuses on solving this problem.
Cardiological studies suggest that it is impossible to correctly classify some heartbeats based only on the signal of that particular heartbeat itself [2]. Hence some works use an RNN to convey temporal context information. However, because the RNN is likely to overfit when data is scarce, we choose a traditional machine learning model, the random forest, instead of a deep learning model like the RNN. To convey context information without applying an RNN, we propose a novel context feature, called a segment label, for each heartbeat.
A segment label is provided by a pre-trained ECG recording classifier. The classifier predicts the ECG label based on the ECG segment including that particular heartbeat and its surrounding heartbeats. The label summarizes a longer period of the ECG characteristics and implies a cross-heartbeat feature. Therefore, it could provide context information around this particular heartbeat, and serve as a feature for further heartbeat classification. To our knowledge, this is the first work using a deep learning model to generate context features for the training of a heartbeat classifier.
The objective of this work is to design a better heartbeat classification algorithm, especially better on SVEB, with a novel feature: a segment label.

II. MATERIAL
We used three datasets to build and validate our methods. Among them, dataset A is used to train the heartbeat classifier and to validate the heartbeat classification performance. Datasets B and C are each used to train one of two ECG recording classifiers. The ECG recording classifier is employed as an important part of the whole algorithm. We describe its integration into the method in Section III.

A. DATASET A
We use a widely used heartbeat classification dataset, MIT-BIH Arrhythmia [21], to validate our methods. This dataset contains 48 half-hour excerpts of two-channel ambulatory ECG recordings. In total, there are 109,449 heartbeats in 15 sub-classes. According to the ANSI/AAMI EC57:1993/(R)2008 recommendation [22], the 15 sub-classes are grouped into 5 classes: normal sinus node (N), ventricular ectopic beat (VEB), supra-ventricular ectopic beat (SVEB), fusion heartbeats (F), and unknown beat type (Q). As recommended in [23], four recordings with pacemakers are dropped, and the rest are separated in advance into 22 training recordings (DS1) and 22 test recordings (DS2), as shown in Table 1.

B. DATASET B
This will be further employed as a pre-training dataset. It is based on the PhysioNet/CinC Challenge 2017 training dataset [24] containing 8,528 ECG recordings. They are processed by a zero-mean unit-variance filter. Then the dataset is augmented by 2,000 additional 10-second ECG segments, as well as 2,000 noisy ECG segments. Those noisy ECGs are made by time-reversing the existing 284 noisy ones in the dataset. See [25] for more details about the generation of those data. Their ECG labels are shown in Table 2.

C. DATASET C
This will also be further employed as a pre-training dataset. The dataset consists of 6,877 ECGs from the PhysioNet/CinC Challenge 2020 training dataset released at a competition event [26]. Their ECG labels are shown in Table 2.

III. METHODOLOGY
The overall process of the algorithm is shown in Fig. 1. First, the preprocessed ECG is cut into consecutive heartbeats and ECG segments in parallel. Then, the heartbeats are used to extract features (Features I), while the ECG segments are used to infer the segment labels (Features II) with an ECG recording classifier; the label of a segment is assigned to each heartbeat within that segment as a new feature of those heartbeats. Finally, Features II, combined with the subset of Features I selected by Mutual Information score, are used to train a heartbeat classifier.
Therefore, two kinds of classifiers are involved in the process. One is the heartbeat classifier, which is our target. The other is an ECG recording classifier, as shown at the bottom of Fig. 1. The ECG recording classifier is a long-term signal classifier, which takes a period/clip of ECG as input rather than a single heartbeat. Thus, training an ECG recording classifier requires a different type of annotated dataset than training a heartbeat classifier. Specifically, the labels annotate objects at different levels in heartbeat classification and ECG recording classification, as shown in Fig. 2. In our case, dataset A is employed as a heartbeat-level dataset to train the heartbeat classifier, and datasets B and C are employed as recording-level datasets to train the ECG recording classifiers.
The method is described in detail in the following.

A. PREPROCESSING
The signal is pre-processed by resampling it to 150 Hz. We also need some fiducial points of the ECG to calculate some features. R peaks are directly provided by the dataset. Other fiducial points, such as the P peak, the start of the QRS complex, the end of the QRS complex, the R peak, and the S peak, are calculated by the same heuristic mathematical method as in [8]. Due to space limitations, no detailed introduction is provided here.

B. FEATURE EXTRACTION
The features consist of two parts. One part is the conventional features, which are human-designed features of the heartbeat, similar to [8]. The other is the segment label.

1) (Part I) CONVENTIONAL FEATURES
In the following, the signal of each heartbeat is defined as the ECG signal on the first lead between -250 ms and 250 ms centered on its R peak:
• RR interval features (6 features): rr, the current RR interval, i.e. the number of samples between the R peak of this heartbeat and that of the previous heartbeat; pre_rr, the previous RR interval; and post_rr, the post RR interval (these three features are widely used for machine-learning-based heartbeat classification); the ratio between pre_rr and rr, as well as between post_rr and rr; and t_rr, the t-statistic of rr, with the standard error calculated from the last 32 rr values, same as [8].
• Medical morphology features (12 features): the amplitudes of the P wave, Q wave, R wave, and S wave, the differences between the amplitudes of the P and Q waves, Q and R waves, and R and S waves, and the distance between the P peak and the start of the QRS complex (these 8 features are from [27]); the distance between the Q and S peaks; and the width of the QRS complex, together with its width at half (QRSw2) and quarter level (QRSw4), as defined in [8] and shown in Fig. 3. Additionally, locally normalized [8] versions of those 12 medical morphology features are also added to the feature set. They are named ''..._norm''.
• Mathematical signal morphology features (48 features): kurtosis and skewness of five equal-length parts of the heartbeat signal as in [3]; the same Discrete Wavelet Transform (DWT) coefficients of the heartbeat signal as in [3]; and the same Hermite Polynomials Transform (HPT) coefficients of the signal as described in [28] (named hbf_i for the i-th coefficient).
Following previous work [8], we use the Mutual Information (MI) score as the feature selection method in our research. Mutual information is a score that measures the information dependence between a feature and the labels, which indicates the importance of each feature. We rank the features by their MI scores from high to low, then select only the top-n features. We use the same number of features n_f as proposed in [8], which is 6. The selected features are QRSw2_norm, QRSw4_norm, rr_norm, post_rr/rr, hbf_6, and QRSw2.
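The MI-based ranking can be illustrated with a minimal sketch. It assumes a simple equal-width-binning estimate of mutual information, since the text does not state which estimator was used; the helper names `binned_mi` and `select_top_n` and the toy data are hypothetical.

```python
import math
from collections import Counter

def binned_mi(values, labels, n_bins=4):
    """Estimate MI (in nats) between one continuous feature and class labels.

    Assumption: the feature is discretized into equal-width bins; this is
    only one of several possible MI estimators.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for constant features
    bins = [min(int((v - lo) / width), n_bins - 1) for v in values]
    n = len(values)
    joint = Counter(zip(bins, labels))
    bin_counts = Counter(bins)
    label_counts = Counter(labels)
    mi = 0.0
    for (b, y), c in joint.items():
        # p(b, y) * log( p(b, y) / (p(b) * p(y)) )
        mi += (c / n) * math.log(c * n / (bin_counts[b] * label_counts[y]))
    return mi

def select_top_n(features, labels, n):
    """Rank features by MI score (high to low) and keep the top n names."""
    scores = {name: binned_mi(col, labels) for name, col in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n], scores

# Toy data (hypothetical): one feature separates the classes, one does not.
toy_labels = ["N", "N", "N", "V", "V", "V"]
toy_features = {
    "informative": [0.10, 0.20, 0.15, 0.90, 0.95, 0.85],
    "noise": [0.5, 0.9, 0.1, 0.5, 0.9, 0.1],
}
top_features, mi_scores = select_top_n(toy_features, toy_labels, n=1)
```

In this toy case the feature that separates the two classes receives the higher score and is the one retained; with real features, n would be set to the value of n_f stated above.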

2) (Part II) SEGMENT LABEL
A segment label is a novel feature in the form of a discrete multi-class label, i.e. a categorical variable whose value is the name of an ECG class. It is acquired from a pre-trained ECG classifier and used as an additional context feature of the heartbeat. More specifically, the direct output of the pre-trained ECG classifier, which we call the ECG recording classifier, is a probability distribution over several possible ECG classes. We then take the most likely ECG class, namely the ECG class with the highest probability, as the segment label.
To get the segment label, the ECG of each patient is first divided into several consecutive segments, as shown on the left side of Fig. 4. The length of each ECG segment is determined by the pre-trained ECG classifier. For comparison, we also tried a different division strategy in which consecutive segments overlap by half, shown on the right side of Fig. 4. The comparison results are reported in Section IV-C.
The pre-trained ECG classification model then predicts the label of each ECG segment, after the segment has been preprocessed in the way the ECG classifier requires. The label is assigned to each heartbeat in the segment as its segment label.
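The segmentation and label assignment described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in this work: the function names are hypothetical, heartbeats are located by their R-peak sample index, and a trailing partial segment is assumed to be zero-padded before classification.

```python
def segment_starts(n_samples, seg_len, overlap_half=False):
    """Start indices of fixed-length segments.

    The step is seg_len (no overlap) or seg_len // 2 (half overlap).
    Assumption: a final partial segment is kept and padded later.
    """
    step = seg_len // 2 if overlap_half else seg_len
    return list(range(0, n_samples, step))

def argmax_label(probabilities):
    """Segment label = the class with the highest predicted probability."""
    return max(probabilities, key=probabilities.get)

def assign_segment_labels(r_peaks, starts, seg_len, seg_labels):
    """For each heartbeat, collect the labels of all segments containing its R peak."""
    per_beat = []
    for r in r_peaks:
        labels = [lab for s, lab in zip(starts, seg_labels) if s <= r < s + seg_len]
        per_beat.append(labels)
    return per_beat

# Toy example: class names and probabilities are placeholders, not model output.
starts = segment_starts(n_samples=180, seg_len=60)
seg_labels = [
    argmax_label({"Normal": 0.8, "AF": 0.1, "Other": 0.1}),
    argmax_label({"Normal": 0.2, "AF": 0.7, "Other": 0.1}),
    argmax_label({"Normal": 0.6, "AF": 0.3, "Other": 0.1}),
]
beat_labels = assign_segment_labels([10, 70, 130], starts, 60, seg_labels)
```

Without overlap each heartbeat receives exactly one label; with half overlap a heartbeat can fall into two segments and therefore receive two labels.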
To use the segment label as a feature of heartbeats for heartbeat classification, we first need to encode it. We use two encodings. If there is no overlap, we use the label ID as the new feature. If there is overlap, the segment labels are transformed into a binary string whose length is the number of distinct possible predictions of the ECG classification model. Each bit of this binary string indicates whether this heartbeat has the corresponding label as a segment label.
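A short sketch of the two encodings follows. The class list is a placeholder chosen for illustration, and the function names are hypothetical.

```python
# Placeholder class list; the real list is defined by the ECG recording classifier.
CLASSES = ["Normal", "AF", "Other", "Noisy"]

def encode_label_id(label, classes=CLASSES):
    """No overlap: a heartbeat has one segment label, encoded as its integer ID."""
    return classes.index(label)

def encode_binary_string(labels, classes=CLASSES):
    """Overlap: one bit per possible class, set if the heartbeat carries that label."""
    present = set(labels)
    return [1 if c in present else 0 for c in classes]

single = encode_label_id("AF")                   # heartbeat inside one segment
multi = encode_binary_string(["Normal", "AF"])   # heartbeat inside two overlapping segments
```

The binary form keeps a fixed feature length regardless of how many overlapping segments contain the heartbeat.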
As the Convolutional Neural Network (CNN) is widely used as an ECG classifier [14], [15], [16], we tested two kinds of CNN to assign the segment label to each heartbeat in our research. One is a novel CNN framework designed and trained by us, named ecgclf_c. The other is a ResNet framework from [25], named ecgclf_b, which serves as a comparison with our novel framework. The following introduces the details of those two segment label extraction models:
2a) ecgclf_b [25]: It is a ResNet model consisting of 16 one-dimensional ResNet blocks and a final softmax layer. The loss function is categorical cross-entropy. It is trained on dataset B with the Adam optimizer. We refer to [25] for additional details.
2b) ecgclf_c: It is a CNN framework designed by us. It has 7 ConvUnits and 3 fully connected layers, as shown in Fig. 5. Each ConvUnit has a convolutional layer with 128 filters of size 3, a max-pooling layer with a pooling size of 2, and a dropout layer with a dropout rate of 0.5. Each fully connected layer is followed by batch normalization [29]. Finally, the model ends with a sigmoid layer. We did not choose softmax as the last layer because some ECGs in the dataset carry multiple labels. The loss function is binary cross-entropy, and the model is optimized by the Adam optimizer with a learning rate of 0.001.
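The choice of a sigmoid output with binary cross-entropy for multi-label targets can be illustrated numerically. The sketch below is plain Python rather than a neural network; the logits and the target vector are made-up values, and the function names are hypothetical.

```python
import math

def sigmoid(z):
    """Independent per-class probability; outputs need not sum to 1."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    """Mutually exclusive class probabilities; outputs always sum to 1."""
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def binary_cross_entropy(targets, probs):
    """Mean over classes of -(t*log(p) + (1-t)*log(1-p))."""
    terms = [-(t * math.log(p) + (1 - t) * math.log(1 - p))
             for t, p in zip(targets, probs)]
    return sum(terms) / len(terms)

logits = [2.0, 1.5, -3.0]   # made-up network outputs for three classes
target = [1, 1, 0]          # an ECG annotated with two labels at once

sig_probs = [sigmoid(z) for z in logits]
soft_probs = softmax(logits)
loss = binary_cross_entropy(target, sig_probs)
predicted_index = max(range(len(sig_probs)), key=sig_probs.__getitem__)
```

With the sigmoid, both annotated classes can receive a probability above 0.5 at the same time, whereas softmax forces the classes to compete for a total probability of 1. Taking the index of the highest score still yields a single label per segment.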
FIGURE 5. Model architecture of ecgclf_c, which is the ECG recording classifier in Fig. 1 serving as a segment label extractor.
This segment label extraction model is trained on dataset C. We applied two steps of data preprocessing to the ECG signals from dataset C. One is using two consecutive median filters of 200 ms and 600 ms to remove the baseline of the ECG signal, similar to [3]. The other is normalizing the signal strength within each record. Besides that, we also truncated the signal to 60 seconds if it was longer than 60 seconds, or padded it to 60 seconds with zeros if it was shorter. We only took lead II of the ECG, and the signal is resampled to 360 Hz; hence the input has 21600 samples and only one channel ($x_{seg} \in \mathbb{R}^{21600 \times 1}$). In our case, we used 80% of the data to train the model, 10% to validate it, and 10% to measure the performance of this ECG classification model. We trained the model for 35 epochs. The learning rate and the number of epochs were tuned on the validation dataset. We choose the label with the highest prediction score as the model's predicted label.
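The preprocessing steps can be sketched in plain Python as below. Several details are assumptions made for illustration: the median windows are rounded to odd sample counts, window edges are truncated, and the normalization is taken to be zero-mean unit-variance scaling, which the text does not specify. All function names are hypothetical.

```python
from statistics import median

FS = 360                 # sampling rate in Hz, as stated in the text
TARGET_LEN = 60 * FS     # 60 seconds -> 21600 samples

def median_filter(x, width):
    """Sliding median with an odd window; edges use a truncated window (assumption)."""
    half = width // 2
    return [median(x[max(0, i - half): i + half + 1]) for i in range(len(x))]

def remove_baseline(x, fs=FS):
    """Estimate the baseline with two consecutive median filters (200 ms, then 600 ms)
    and subtract it from the signal."""
    w1 = int(0.2 * fs) | 1   # force an odd window length
    w2 = int(0.6 * fs) | 1
    baseline = median_filter(median_filter(x, w1), w2)
    return [v - b for v, b in zip(x, baseline)]

def normalize(x):
    """Zero-mean, unit-variance scaling within one record (assumed normalization)."""
    mean = sum(x) / len(x)
    std = (sum((v - mean) ** 2 for v in x) / len(x)) ** 0.5
    if std == 0:
        return [0.0 for _ in x]
    return [(v - mean) / std for v in x]

def fix_length(x, n=TARGET_LEN):
    """Truncate to n samples, or pad with zeros up to n samples."""
    return x[:n] + [0.0] * max(0, n - len(x))

def preprocess(x, fs=FS, n=TARGET_LEN):
    """Baseline removal, normalization, then length fixing."""
    return fix_length(normalize(remove_baseline(x, fs)), n)
```

The pure-Python sliding median is slow on full-length recordings; it is written this way only to keep the sketch free of external dependencies.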

C. HEARTBEAT CLASSIFICATION
A random forest is used to classify the heartbeats with $m$ features, comprising the features retained by feature selection and the segment label.
Random forest is a bagging method over decision trees. $n_d$ decision trees are each trained on a bootstrap training set of the same size. At each split step of training a decision tree, a random subset of $m'$ features out of the $m$ features is used to choose the best split variable. The prediction of the random forest model is an aggregation of the predictions of all decision trees by vote counting. Namely, the label with the most votes is the prediction of the random forest.
In our case, we set $n_d = 200$, so the random forest has 200 decision trees. We set $m' = \sqrt{m}$. The commonly used Gini function is applied to measure the quality of a split when training each decision tree. The Gini function is defined as follows, where $p_i$ stands for the sample probability of class i in each group after the split:
$$\mathrm{Gini} = 1 - \sum_{i} p_i^2 \quad (1)$$
The random forest is trained with balanced sample weights. The balanced sample weights are inversely proportional to the class frequencies of the input data.
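The Gini criterion and the balanced weighting can be made concrete with a small sketch. The weight formula n_samples / (n_classes * class_count) is the common "balanced" heuristic and is assumed here; the text only states that the weights are inversely proportional to class frequencies. Function names and toy labels are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of one group: 1 - sum_i p_i^2."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def split_quality(left, right):
    """Size-weighted Gini impurity of a candidate split (lower is better)."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

def balanced_weights(labels):
    """Per-class weights inversely proportional to class frequency (assumed formula)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}

mixed = gini(["N", "N", "V", "V"])                 # maximally mixed two-class group
pure_split = split_quality(["N", "N"], ["V", "V"])  # a split that separates the classes
weights = balanced_weights(["N"] * 8 + ["S"] * 2)   # rare class gets the larger weight
```

A split that separates the classes perfectly has zero weighted impurity, and the minority class receives a proportionally larger weight during training.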

D. PERFORMANCE METRICS
To get a more balanced measurement of each targeted heartbeat class, we use the macro-averaged F1-score as the key performance metric of heartbeat classifiers.
The macro-averaged F1-score is defined by the following formulas, where $f_1^i$ is the F1-score of class i and $N$ is the number of classes:
$$F_1^{\mathrm{macro}} = \frac{1}{N} \sum_{i=1}^{N} f_1^i \quad (2)$$
$$f_1^i = \frac{2\, p_i\, r_i}{p_i + r_i} \quad (3)$$
In (3), $p_i$ (precision) and $r_i$ (recall) are calculated as true positive (TP)/predicted positive (P) and true positive (TP)/(true positive (TP) + false negative (FN)) for class i, respectively.
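The metric can be computed directly from a confusion matrix, as in the following sketch. The matrix layout (rows are true classes, columns are predicted classes), the function names, and the toy counts are assumptions made for illustration.

```python
def per_class_f1(conf):
    """conf[i][j] = number of beats of true class i predicted as class j."""
    n = len(conf)
    scores = []
    for i in range(n):
        tp = conf[i][i]
        predicted = sum(conf[r][i] for r in range(n))  # TP + FP (column sum)
        actual = sum(conf[i])                          # TP + FN (row sum)
        p = tp / predicted if predicted else 0.0       # precision
        r = tp / actual if actual else 0.0             # recall
        scores.append(2 * p * r / (p + r) if p + r else 0.0)
    return scores

def macro_f1(conf):
    """Unweighted mean of the per-class F1-scores."""
    scores = per_class_f1(conf)
    return sum(scores) / len(scores)

toy_conf = [[8, 2],
            [1, 9]]
toy_scores = per_class_f1(toy_conf)
toy_macro = macro_f1(toy_conf)
```

Because every class contributes equally to the mean, a rare class with a low F1-score lowers the macro average as much as a frequent class would.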

IV. RESULTS
A. SEGMENT LABEL QUALITY
The quality of the segment label affects the performance of the heartbeat classifier. As the performance of the ECG recording classifier determines the quality of the segment label, we regard the macro F1-score of the ECG classifier as the segment label quality. Accordingly, the segment label qualities from ecgclf_b and ecgclf_c are 0.72 and 0.70, respectively. The detailed performance of ecgclf_b is shown in [25], and the detailed performance of ecgclf_c is shown in Appendix Table 7.

B. HYPERPARAMETER TUNING
We tune the hyperparameters by ''leave-one-patient-out'' cross-validation as in [30], in which the training heartbeats from one specific recording are held out as the validation dataset in each round. Fig. 6 shows the cross-validation hyperparameter tuning for the number of decision trees $n_d$ based on the macro-averaged F1-score.
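The fold construction for this cross-validation scheme can be sketched as follows. The function name and the recording identifiers are hypothetical; each heartbeat is assumed to carry the identifier of the recording it came from.

```python
def leave_one_patient_out(record_ids):
    """Yield (held_out_record, train_indices, val_indices) for each distinct recording."""
    for rec in sorted(set(record_ids)):
        train = [i for i, r in enumerate(record_ids) if r != rec]
        val = [i for i, r in enumerate(record_ids) if r == rec]
        yield rec, train, val

# Toy example: six heartbeats drawn from three recordings (IDs are placeholders).
beat_record_ids = [101, 101, 106, 106, 106, 108]
folds = list(leave_one_patient_out(beat_record_ids))
```

For each candidate hyperparameter value, a model would be trained on the training indices of every fold and scored on the held-out recording, and the scores would then be aggregated across folds to choose the value.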

C. THE PROPOSED METHOD PERFORMANCE
We call the model trained with the segment label given by ecgclf_b RFS_b, meaning Random Forest with Segment label extracted from ecgclf_b, and the model trained with the segment label provided by ecgclf_c RFS_c.
The models are implemented with the scikit-learn framework [31]. The training times of the two proposed models RFS_b and RFS_c, on a machine with an Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz, 32GB RAM, and a single GeForce RTX 2080Ti GPU, are 56.1 seconds and 55.8 seconds, respectively. These training times do not include the time for training the ECG classification models that extract the segment labels. Table 3 and Table 4 show the confusion matrices of the proposed models RFS_b and RFS_c, respectively. It can be observed that the performance on the F and Q classes is still not optimal because of the serious data imbalance problem.
To compare the effect of different segment label extraction methods, different combinations of ECG classifiers and ECG segmentation methods are tested. As shown in Table 5, the overlap mechanism does not improve the performance.
It can be observed in Table 5 that RFS_b's performance on SVEB is better than that of the others. Meanwhile, RFS_c's performance on VEB is better than that of the others. RFS_c has the best overall performance, achieving the highest macro-F1-score. This is probably because the performance on the VEB label, which has more samples, contributes more to the overall performance.
We therefore propose RFS_b and RFS_c as our two new heartbeat classification methods. Table 6 shows the performance comparison between our models and models from other works following the same inter-patient paradigm and the AAMI guidelines. Our models perform the best for SVEB at little cost to the performance on VEB, and our models have the same overall accuracy and F1-score on N as the best existing models. Comparing within our models, RFS_b is better on SVEB while RFS_c is better on VEB. To the best of our knowledge, no work achieves a higher F1-score on SVEB than the proposed RFS_b model. Compared to the work of Saenz-Cogollo [8], which could be considered the basis of our work, our models achieve better SVEB performance without a drop in performance for other labels. This indicates that the segment label information can help the machine learning model make better decisions for the heartbeat labels that require the context information of the ECG.

V. DISCUSSION
Unlike other end-to-end heartbeat classifiers, our work not only uses the heartbeat-annotated dataset but also transfers knowledge from another dataset. Unlike regular transfer learning, we train an ECG classifier on an ECG-annotated dataset and then transfer this knowledge from the ECG classifier to a heartbeat classifier via the segment label. That is also the main novelty of this paper.
The proposed method could also be applied to ECGs recorded in real time. The ECG recording classifier and the heartbeat classifier would work simultaneously to label the heartbeats automatically every 60 seconds. The advantage of our method is that it performs better in detecting SVEB, which is also a challenging task for physicians, without sacrificing overall performance.
The success of this approach opens a gate for heartbeat classification to use the knowledge of ECG recording classification. Much more annotated data for ECG recording classification is publicly available. Those data are also a treasure for learning about heartbeat classification, so the question of ''how to use this information'' deserves more research. Furthermore, the result of this work might not be bound to the field of ECG analysis. An ECG is similar to a periodic signal, and each heartbeat is roughly one replicated period of this signal. This work indicates that the class of the whole signal can be a good hint for identifying the correct class of the periods that compose it.
Clearly, the correctness of the ECG classifier is important for the proposed method. In this work, we only tested one novel but simple CNN framework, and a published ResNet model as a comparison. This is just a starting point for applying segment labels. Building a better single-lead ECG classifier could be future work. More advanced architectures such as EfficientNet [33], SE-Net [34], bi-directional RNN [35], or Transformer [36] could be used. Besides that, we simply assign the same segment label to each heartbeat within the ECG segment in this work, whereas in theory each heartbeat contributes differently to the ECG class. If we could use some kind of attention mechanism to assign this context feature differently, the benefits of segment labels might be greater. Furthermore, the current SVEB detection in this work is based on a single lead, whereas algorithms based on 12 leads might show larger improvement.

VI. CONCLUSION
In this research, we propose a random forest method to classify heartbeats. This random forest uses features composed of two parts. One part consists of features proposed in other works, as introduced in Section III-B, selected on the basis of mutual information scores. The second part, called the segment label, which is new in this field, is provided by a pre-trained CNN ECG classifier. The proposed methods use this segment label to convey to the heartbeat classifier the context information it should know about each heartbeat. It works as intended and achieves good results.
With this method, we obtained two heartbeat classification tools that are better at supra-ventricular ectopic beat (SVEB) detection and also perform well in other classes. Although we are in the early stages of using this segment label feature, we have already achieved notable improvements. This suggests that it is promising to continue research on how to utilize ECG classification to improve heartbeat classification.

ACKNOWLEDGMENT
This work received language support from Franziska Schuessler. All the data used in this research are open-source and can be easily accessed from the cited sources.