Introduction
Arrhythmia is an abnormality of the heart's rhythm in which the heart beats too slowly, too rapidly, or irregularly. Some arrhythmias carry a risk of dangerous complications such as stroke, heart failure, and cardiac arrest [1], [2]. Because these arrhythmias can appear intermittently, they are difficult to detect during a hospital visit for an electrocardiogram (ECG) measurement [3]. Therefore, to diagnose an arrhythmia, the ECG signal is often measured using a Holter monitor for a period of 24 or 48 hours. However, previous studies have shown that 24- or 48-hour Holter monitoring is not effective in diagnosing some clinically important asymptomatic arrhythmias, such as episodes of atrial fibrillation or transient bradyarrhythmia [4], [5]. In addition, these Holter monitors, or memory recorders, can be inconvenient for patients because of their large size or complicated structure. Recently, patch-type single-lead electrocardiographs have been introduced to reduce patient inconvenience, such as the Zio Patch (iRhythm, United States), Ezypro (SIGKNOW, Taiwan), and MEMO Patch (HUINNO, Korea). These patch-type electrocardiographs improve patient convenience in daily life because they are lightweight and compact.
The patch-type electrocardiograph is designed for low power consumption and can record ECG signals for up to 14 days. Previous studies have shown that most symptomatic arrhythmias can be detected when the ECG is recorded for about 14 days [6]. Therefore, when a patient uses a patch-type long-term electrocardiograph, arrhythmia can be diagnosed more accurately than with 24-hour or 48-hour Holter monitoring, which helps prevent dangerous complications such as stroke, heart failure, and cardiac arrest. However, since these patch-type electrocardiographs record 7 to 14 times more data than Holter monitors, clinical technicians must spend more time and effort analyzing the ECG signals. Patch-type long-term electrocardiographs therefore need stronger software support for automated ECG analysis and arrhythmia classification than Holter monitors do.
With recent advances in artificial intelligence (AI) technology and improvements in hardware performance, AI models have been reported to surpass cardiologists in diagnosing arrhythmias. In the 2017 study by Ng, the proposed artificial neural network model achieved a sensitivity of 0.827 and an F1 score of 0.809, compared with 0.744 and 0.751, respectively, for six trained cardiologists [7]. The signal from a patch-type electrocardiograph is nevertheless more labor-intensive to analyze than that from a Holter monitor, because the absolute amount of noise in the signal increases with the longer recording time. Moreover, some noise signals resemble arrhythmia signals in shape, which makes it difficult for machine learning models or algorithms to distinguish noise from arrhythmia. Previous studies proposed deep learning models to separate noise signals from ECG signals, but these models used only ECG data from ICUs or Holter monitors [8], [9]. The performance of these models on ECG signals from wearable electrocardiographs has therefore not yet been evaluated. Additionally, some studies did not include arrhythmia signals in their data, which makes their models difficult to use for arrhythmia diagnosis [10].
In this study, we propose an SE-ResNet-ViT hybrid classification model to distinguish noise from normal or arrhythmia ECG. We obtained data by recording ECGs, including arrhythmia signals and noise signals, for 14 days with HUINNO's MEMO Patch in patients with a history of diagnosed arrhythmia or with symptoms suggestive of arrhythmia. The proposed model is thus trained on real-world ECG and noise data. Finally, we evaluated the model's performance in classifying signals as noise or non-noise.
Methods
A. Data Acquisition
We collected data through a multi-center clinical trial conducted at Korea University Hospital and Seoul National University Bundang Hospital, which was approved by the Institutional Review Board (IRB) of each hospital. The IRB numbers for the clinical trial are 2021AN0247 (Korea University Hospital) and B-2105/686-002 (Seoul National University Bundang Hospital). Patients who required ambulatory ECG monitoring were screened for eligibility if they had been diagnosed with stroke or transient ischemic attack without an identified cause, or if they had symptoms including palpitation, dizziness, or syncope. Patients were invited to participate in the study if they were between 19 and 80 years old, capable of providing voluntary informed consent, and able to adhere to the study protocol of wearing a MEMO Patch for 14 days of monitoring. A total of 149 people participated in the clinical trial, and the data from 70 of these people were randomly selected and analyzed in this study.
B. Device and Software
Fig. 1 shows the MEMO Patch used to record ECG from patients who participated in the clinical trial. The MEMO Patch is a single-lead adhesive patch-type ambulatory electrocardiograph, approved by the Ministry of Food and Drug Safety (MFDS) in the Republic of Korea. The device can operate for up to 14 days and records ECG at a 250 Hz sampling rate with 12-bit resolution. Patients visit the hospital, attach the device to their bodies, and then record their ECG signals while going about their daily lives. After 14 days, patients visit the hospital again and return the device. The technician then downloads the ECG data stored in the device's memory. The ECG data is pre-annotated by MEMO Care, a machine learning model for arrhythmia classification provided by HUINNO, the manufacturer of the device. All the data used in this paper was then reviewed by clinical technicians.
C. Pre-processing and ECG Datasets
Some noise in the ECG signal can be removed with a simple digital filter. For example, baseline wander is low-frequency noise caused by breathing, movement, or electrically charged electrodes [11]. This type of noise can be removed by applying a high-pass filter with a cut-off frequency below 1 Hz. The ECG signal can also be contaminated by the high-frequency EMG signal when the patient is moving [12], which can be removed by applying a low-pass filter. Increasing the order of the filters and narrowing the pass band removes more of this noise from the signal. However, doing so can also distort the ECG signal, leading to a decrease in arrhythmia classification performance. In this paper, a second-order Butterworth band-pass filter with a 0.5-50 Hz pass band is applied to each 10-second ECG signal to remove baseline drift and high-frequency noise. The signals are then normalized to the range 0 to 1 by min-max scaling.
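The pre-processing described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes SciPy's `butter` and `sosfiltfilt`, and the choice of zero-phase (forward-backward) filtering, the handling of flat segments, and the function name `preprocess_segment` are assumptions not stated in the text. The input here is a synthetic signal, not patient data.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 250          # sampling rate of the MEMO Patch in Hz (from the text)
SEGMENT_SEC = 10  # segment length in seconds (from the text)


def preprocess_segment(ecg, fs=FS, low_hz=0.5, high_hz=50.0, order=2):
    """Band-pass filter one ECG segment and rescale it to [0, 1]."""
    ecg = np.asarray(ecg, dtype=float)
    # Butterworth band-pass design, expressed as second-order sections
    # for numerical stability.
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    # Assumption: zero-phase filtering. The paper does not state whether
    # the filter was applied causally or forward-backward.
    filtered = sosfiltfilt(sos, ecg)
    # Min-max scaling. A flat segment is mapped to zeros to avoid
    # division by zero (an assumption for this sketch).
    span = filtered.max() - filtered.min()
    if span == 0:
        return np.zeros_like(filtered)
    return (filtered - filtered.min()) / span


# Synthetic stand-in for a 10 s recording: a 5 Hz component,
# a slow 0.05 Hz baseline drift, and a constant offset.
t = np.arange(FS * SEGMENT_SEC) / FS
raw = np.sin(2 * np.pi * 5 * t) + 2.0 * np.sin(2 * np.pi * 0.05 * t) + 3.0
scaled = preprocess_segment(raw)
```

Because min-max scaling is applied per segment, a single large-amplitude artifact or ectopic beat determines the scale of the whole 10-second window, so amplitude information across segments is not preserved.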
The ECG signals were labeled through the following process. First, 117,000 noisy ECG signals were manually reviewed and selected by non-clinical experts. Then, 2,084 noise signals, 7,552 normal sinus rhythm (NSR) signals, and 8,086 arrhythmia signals were reviewed by clinical technicians and subsequently inspected by a cardiologist. The noise class includes not only pure noise but also signals in which noise is mixed with ECG. The arrhythmia signals consist of atrial premature contractions (APC), ventricular premature contractions (VPC), atrial fibrillation (AF), supraventricular tachycardia (SVT), atrioventricular block (AVB), and other arrhythmias. The detailed classes and quantities of arrhythmia are shown in Table I. To collect more noise signals, we also gathered data from 21 healthy people wearing the MEMO Patch in daily life. The data was split by patient into a training dataset and a test dataset at a 7:3 ratio within each of the noise, NSR, and arrhythmia classes, so that there is no patient overlap between the two sets. In addition, the training dataset was split at an 8:2 ratio into training and validation subsets. The overall process of data collection is shown in Fig. 2.
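A patient-level split of this kind can be sketched as below. This is an illustrative example under stated assumptions, not the authors' procedure: the function names `split_patients` and `assign_segments`, the random seed, and the patient IDs are hypothetical, and the sketch splits one pool of patients, whereas the text applies the 7:3 ratio within each class.

```python
import random


def split_patients(patient_ids, test_fraction=0.3, seed=0):
    """Split unique patient IDs into disjoint train and test sets."""
    unique_ids = sorted(set(patient_ids))
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(unique_ids)
    n_test = int(round(len(unique_ids) * test_fraction))
    test_ids = set(unique_ids[:n_test])
    train_ids = set(unique_ids[n_test:])
    return train_ids, test_ids


def assign_segments(segments, train_ids, test_ids):
    """Route each (patient_id, signal) pair to the set its patient belongs to."""
    train = [s for s in segments if s[0] in train_ids]
    test = [s for s in segments if s[0] in test_ids]
    return train, test


# Hypothetical example: 10 patients with three segments each.
segments = [(f"P{p:02d}", f"segment-{i}") for p in range(10) for i in range(3)]
train_ids, test_ids = split_patients([pid for pid, _ in segments])
train_segments, test_segments = assign_segments(segments, train_ids, test_ids)
```

Splitting on patient IDs rather than on individual segments keeps all segments of one patient in the same set, which is what prevents the patient overlap mentioned above.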
D. Architecture of Classification Model
To separate noise signals from ECG signals, we propose an architecture based on an SE-ResNet-ViT hybrid model. The overall architecture of the model is shown in Fig. 3. Since Dosovitskiy's 2020 Vision Transformer paper introduced a hybrid model combining a convolutional neural network (CNN) with ViT, many studies have used this hybrid architecture [13]. In the hybrid model, the patch embedding projection is applied to feature maps extracted by the CNN. The hybrid model has been shown to outperform the ResNet model in image classification. In addition, the hybrid model performs better than ViT at small model sizes.
Previous studies have reported high classification performance when using ResNet-based deep learning models for ECG signal classification [14]. SE-ResNet is known to achieve high classification performance among CNN-based models, and our previous study confirmed that it outperforms the ResNet model [15]. We therefore combined SE-ResNet with ViT. The hybrid model projects the CNN's output feature map into the Transformer dimension with a 1x1 patch size. The difference between our previous study and the current one is that the sampling frequency has been changed from 200 Hz to 250 Hz. As a result, the input shape is now batch size × 2500 × 1. The stem layer consists of a convolution layer with a kernel size of 7 and a stride of 2, followed by a max-pooling layer with a window size of 3, a stride of 2, and a padding of 1. The composition of the layer denoted as the stride block is the same as that of SE-ResNet, except that the stride is changed to 2 when performing convolution in the ResNet block.
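To make the effect of the stem layer on the sequence length concrete, the following sketch applies the standard output-length formula for 1-D convolution and pooling layers. The kernel sizes, strides, and pooling padding are taken from the text. The padding of 3 for the stem convolution is an assumption, since the text does not state it, and the helper name `conv_output_length` is hypothetical.

```python
def conv_output_length(length, kernel_size, stride, padding=0):
    """Output length of a 1-D convolution or pooling layer (dilation 1)."""
    return (length + 2 * padding - kernel_size) // stride + 1


INPUT_LENGTH = 2500  # 10 s at 250 Hz (from the text)

# Stem convolution: kernel 7, stride 2 (from the text).
# Assumption: padding 3; the paper does not specify it.
after_conv = conv_output_length(INPUT_LENGTH, kernel_size=7, stride=2, padding=3)

# Stem max pooling: window 3, stride 2, padding 1 (from the text).
after_pool = conv_output_length(after_conv, kernel_size=3, stride=2, padding=1)

print(after_conv, after_pool)  # 1250 625
```

Under this assumed padding, the stem reduces a 2500-sample input to 625 time steps before the residual blocks, and each later stride-2 convolution roughly halves the length again.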
Results
To optimize the model, we used the Adam optimizer with an initial learning rate of 0.0005. We used cross entropy as the loss function and trained for up to 40 epochs with a batch size of 512. All experiments were implemented in PyTorch and run on an RTX 2080 Ti GPU. Model performance was evaluated on the test dataset, which was organized into a noise class and a non-noise class. Table II reports the precision, recall, and F1 score. The average score over the two classes was calculated as a weighted average to account for the difference in the number of signals in each class. The confusion matrix is shown in Fig. 4. The weighted averages of the F1 score, precision, and recall are all 0.964. However, the precision for noise, at 0.932, is lower than the other scores. This is likely because the test dataset contains far fewer noise signals than non-noise signals, so even a small share of non-noise signals predicted as noise noticeably lowers the noise precision. The detailed classes of the ECG signals labeled as non-noise are shown in Table III. A review of the misclassified signals showed that VPC signals were the class most often misclassified as noise. Comparing the misclassified VPC signals with noise signals, we observed that they have similar shapes after min-max scaling, as shown in Fig. 5. Both waveforms appear to have wide QRS complexes and abnormal shapes.
Fig. 5. An example of VPC and noise waveforms that become similar to each other after min-max scaling.
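The per-class and weighted-average scores used in this section can be computed from a confusion matrix as sketched below. The counts are hypothetical and chosen only for illustration; they are not the results of this paper, and the function names are assumptions of the sketch.

```python
def per_class_metrics(confusion, class_index):
    """Precision, recall, F1, and support for one class.

    `confusion[i][j]` is the number of signals of true class i
    that were predicted as class j.
    """
    tp = confusion[class_index][class_index]
    predicted = sum(row[class_index] for row in confusion)  # column sum
    actual = sum(confusion[class_index])                    # row sum (support)
    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1, actual


def weighted_average(confusion):
    """Support-weighted average of precision, recall, and F1 over classes."""
    total = sum(sum(row) for row in confusion)
    sums = [0.0, 0.0, 0.0]
    for c in range(len(confusion)):
        precision, recall, f1, support = per_class_metrics(confusion, c)
        for i, value in enumerate((precision, recall, f1)):
            sums[i] += value * support / total
    return tuple(sums)


# Hypothetical counts, NOT the paper's results.
# Class 0 = noise, class 1 = non-noise; rows are true, columns predicted.
confusion = [[90, 10],
             [5, 195]]
noise_precision, noise_recall, noise_f1, noise_support = per_class_metrics(confusion, 0)
avg_precision, avg_recall, avg_f1 = weighted_average(confusion)
```

In this toy example the noise precision is 90 / 95, because five non-noise signals were predicted as noise. With a minority noise class, a small number of such false positives weighs heavily on the noise precision. The support-weighted recall also equals the overall accuracy, here 285 / 300.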
Conclusion
Analysis of the signal from a long-term wearable electrocardiograph can be more labor-intensive because the absolute amount of noise in the signal increases with the extended recording time. To reduce this time-consuming work, we suggest a machine-learning-based noise detection solution. In this study, we proposed an SE-ResNet-ViT hybrid model for classifying noise signals from wearable patch-type long-term ECG devices. To train and evaluate the model, we collected ECG data from clinical trial participants with a diagnosis or suspicion of arrhythmia. The collected data was reviewed and labeled as noise or non-noise by clinical experts. The weighted average F1 score of our model was 0.964, which indicates that noise signals measured by patch-type wearable ECG devices can be classified accurately. However, some VPC signals, which are similar in shape to noise signals, are sometimes misclassified as noise. In the future, we plan to reduce the misclassification caused by the similarity in shape between VPC signals and some noise signals. We expect that the proposed noise classification method can help to detect arrhythmias more accurately.
ACKNOWLEDGMENT
This work was supported by the Korea Medical Device Development Fund grant funded by the Korea government (the Ministry of Science and ICT, the Ministry of Trade, Industry and Energy, the Ministry of Health & Welfare, the Ministry of Food and Drug Safety) (Project Number: 1711138361, KMDF_PR_20200901_0174 and 1711139106, KMDF_PR_20210527_0004).