Improved Electrode Motion Artefact Denoising in ECG using Convolutional Neural Networks and a Custom Loss Function

Heart disease is the leading cause of mortality worldwide, and it is of utmost importance that clinicians and researchers understand the dynamics of the heart. As an electrical measure of the heart’s activity, the electrocardiogram, or ECG, is the gold standard for recording the cardiac state, whether monitoring the structure of the traces that make up the ECG or indicating key metrics such as heart rate variability. Long-term monitoring of ECG is often required to identify cardiovascular issues but proves impractical in a clinical setting; therefore, patients often collect their data remotely. However, ECG signals can become contaminated with various noise sources during data collection. This paper proposes a custom loss function capable of denoising electrode motion artefact in ECG data to a higher standard than other, more common loss functions. We implement our custom loss function with a convolutional neural network to return high-quality ECG, suitable for calculating the aforementioned key metrics from a previously unobtainable state. The proposed model improves the overall signal-to-noise ratio of the ECG signal and preserves the R-wave structure. The model outperforms a standard mean squared error loss function, with an improvement of 0.5 dB in signal-to-noise ratio and a 25% improvement in heart rate estimation.


I. INTRODUCTION
The electrocardiogram (ECG) is a non-invasive method to measure the heart's electrical activity and diagnose heart disease. According to the World Health Organisation (WHO), chronic heart disease was the number one cause of death from 2000-2019 [1]. Heart disease has also shown the most significant increase in deaths during this period. Long-term ECG monitoring is currently the gold standard for diagnosing cardiovascular diseases (CVDs). Unfortunately, obtaining reliable, long-term measurements of the cardiac state is a logistical challenge faced by health care professionals due to the time and resources involved. As a result, patients are frequently required to collect their ECG data remotely on a wearable device, indirectly leading to noise manifesting in the ECG signals. Contaminated signals can suppress the essential pathological biomarkers and, in some cases, will render the ECG completely unusable.
A clean ECG signal of a typical, healthy patient is shown in Figure 1. The ECG signal contains essential information about the cardiac pathologies affecting the heart, characterised by five peaks known as fiducial points, represented by the P-wave, QRS complex, and T-wave. ECG signals can be contaminated by many types of noise, such as baseline wander, powerline interference, electromyographic (EMG) noise, and electrode motion artefact noise [2]. Baseline wander is a low-frequency artefact in electrocardiogram signal recordings that arises from breathing, electrically charged electrodes, or subject movement [3]. Muscle artefacts are generated by skeletal muscle activity [4], and electrode motion artefact is caused by changes in electrode-skin impedance and changes in skin potential [5].
In the remote setting, where clinicians cannot readily inspect ECG traces for artefacts, noise can become heavily involved in the signal, leading to a poor signal-to-noise ratio (SNR). This may conceal features that are important for diagnosis. For example, the interbeat interval (IBI), which is calculated from the R-wave and is valuable for heart rate variability (HRV) measurements, can become difficult to estimate if the introduced noise heavily disfigures the R-wave. Therefore, denoising becomes fundamental to downstream ECG signal processing tasks.
Identifying and removing artefacts in the ECG signal can help improve patient diagnosis and treatment [6]. As more recordings are conducted in the remote setting, more artefacts can be expected to manifest in the ECG. Artefact detection approaches have been implemented in the literature to detect disturbances and help better estimate the quality of the recorded ECG signals [7]-[9]. Rather than taking an artefact detection approach, we seek to automatically suppress the artefact and improve the ECG signal structure essential for HRV analysis.
Conventional noise reduction methods focus on overall improvement in signal SNR but ignore the preservation of essential peaks. These peaks are necessary for heart rate, IBI, and HRV measurements for monitoring exercise, stress, and cardiovascular disease [10], [11]. In this paper, we demonstrate a Convolutional Neural Network and custom loss function for ECG signal denoising and R-wave preservation. This novel approach yields improved SNR and cardiovascular measures while allowing for less complex deep-learning models, thus advancing the development of simpler, more reliable remote patient monitoring devices.

II. RELATED WORKS
In the early stage of ECG denoising research, low-pass filters [12], adaptive filters [13] and filter banks [14] were utilised. An and Stylios [15] provide a review of the aforementioned filtering methods for motion artefact reduction in ECG and find that adaptive filtering performs better than the other reviewed denoising methods. Recently, there has been a move towards data-driven approaches for ECG signal denoising that are more suited to non-linear and non-stationary time series signals. Jin et al. [16] propose an ECG denoising framework that combines low-pass filtering and sparsity recovery and overcomes issues present in statistical approaches such as the L1-norm. Chatterjee et al. provide a review of techniques for noise removal in ECG signals [2]. The authors review six methods of ECG signal denoising, namely empirical mode decomposition (EMD), wavelet-based models, sparsity-based models, Bayesian-filter-based models, hybrid models and deep-learning models based on autoencoders. In this paper, we opt for a deep-learning-based approach to denoising our ECG signals.
Corneliu et al. review deep-learning-based models for removal of noise in ECG signals [17]. The authors mainly focus on Long Short-Term Memory networks (LSTMs) [18] and Convolutional Neural Networks (CNNs). They find that CNNs outperform LSTMs among the deep-learning models reviewed.
Convolutional Neural Networks have contributed tremendously to the success of machine learning since their introduction in the 1990s. They are an example of neuroscientific principles influencing deep learning [19], in that they are designed to mimic the processing of images in the visual cortex of the human brain [20]. Fully automatic learning allows a CNN to extract features that are salient in the input data across different layers. Given the correct training, a CNN allows for the implementation of high-accuracy classifiers without the need for signal processing or feature extraction knowledge. This has contributed to their success in practical applications, particularly in image classification. Here, we implement a CNN-based architecture with a custom loss function for denoising our ECG signals.
Loss functions are used in statistical models to define an objective function that evaluates the model's performance and enables the model to learn its parameters by minimising that loss function. The Mean Squared Error (MSE) is among the most popular loss functions used in machine learning problems. It is calculated as the average of the squared differences between the predicted and actual values. Barton et al. introduce a non-standard loss function for Raman spectra denoising by adding another term to the MSE loss to balance overall signal denoising against excessive smoothing of spectral peaks [21]. The authors identified that traditional denoising algorithms, including CNNs with standard loss functions, successfully remove noise at the expense of smoothing or blurring the sharp spectral peaks, the heights of which are important in the context of Raman-based diagnostics. Here, we identify the same problem for ECG signals, whereby traditional denoising methods can adversely distort the underlying heart signal. We extend the work of Barton et al. through the addition of multiple terms to the standard MSE loss, which helps improve signal denoising while maintaining the QRS complex structure and, in turn, the overall signal-to-noise ratio of the ECG.

A. COMPUTING PLATFORM
The experiments for this project were run on an Nvidia Titan Xp with PyTorch and Google Colaboratory in the interest of making the project readily deployable. The code for these experiments is available online¹.

B. DATASETS
We use two datasets in this work; both datasets are open source and freely available on PhysioNet [22]. The first and primary dataset we use is the MIT-BIH Arrhythmia Database. This dataset contains 48 half-hour recordings of two-channel ambulatory ECG, including less common but clinically significant arrhythmias. The ECG recordings were digitised at 360 Hz with an analogue-to-digital converter (ADC) gain of 200 [23].
The secondary dataset used in this work is the MIT-BIH Noise Stress Test Database that includes 12 half-hour ECG recordings and three half-hour recordings of noise typical in ambulatory ECG recordings [24]. We are only concerned with using the noise recordings from this dataset. The noise recordings were made using physically active volunteers and standard ECG recorders, leads, and electrodes; the electrodes were placed on the limbs in positions where the subjects' ECGs were not visible. We select the electrode motion artefact and artificially add it to the MIT-BIH Arrhythmia data. Electrode motion artefact is generally considered the most troublesome since it can mimic the appearance of ectopic beats and cannot be removed easily by simple filters, as can noise of other types [24]. We visually present some examples of electrode motion artefact in Figure 2.

1) Preprocessing
For each of the 48 signals in the arrhythmia dataset, the electrode motion noise signal is mixed linearly with the clean ECG record as defined by equation (1).
$$X_n = X + \lambda \cdot n \qquad (1)$$
Where X is the original ECG signal, n is the electrode motion noise artefact and λ is the hyperparameter that controls the SNR of the noisy signal $X_n$, and is further defined in equation (2).
Again, X is the clean ECG signal, n is the electrode motion artefact noise and a is the desired SNR value (in dB). The returned value, λ, can then be multiplied by the noise, n, and linearly added to the original ECG signal, X, to return a noisy ECG signal, $X_n$, with a specified SNR value.
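The mixing step can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names and the `db_factor` argument are hypothetical. The default `db_factor=10` follows the SNR definition as it is written in the Evaluation section (ten times the base-10 logarithm of the RMS ratio); setting it to 20 gives the more common amplitude convention.

```python
import numpy as np

def rms(x):
    """Root mean square of a 1-D signal."""
    x = np.asarray(x, dtype=float)
    return np.sqrt(np.mean(x ** 2))

def noise_scale(clean, noise, snr_db, db_factor=10.0):
    """Scale lambda so that db_factor * log10(RMS(clean) / RMS(lambda * noise)) == snr_db.

    db_factor is an assumption exposed as a parameter (10 or 20).
    """
    return rms(clean) / (rms(noise) * 10.0 ** (snr_db / db_factor))

def mix_at_snr(clean, noise, snr_db, db_factor=10.0):
    """Return the noisy signal X + lambda * n together with lambda."""
    lam = noise_scale(clean, noise, snr_db, db_factor)
    noisy = np.asarray(clean, dtype=float) + lam * np.asarray(noise, dtype=float)
    return noisy, lam

# Synthetic stand-ins for an ECG segment and a noise segment (3 s at 360 Hz).
rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 1.2 * np.arange(1080) / 360.0)
noise = rng.standard_normal(1080)

noisy, lam = mix_at_snr(clean, noise, snr_db=12.0)
achieved = 10.0 * np.log10(rms(clean) / rms(lam * noise))
print(f"lambda = {lam:.4f}, achieved SNR = {achieved:.2f} dB")
```

Because λ is computed per segment from the RMS values, every noisy segment lands on the requested SNR regardless of the amplitude of the underlying record.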
In addition, the dataset is divided into approximately 80% of the data for training and 20% for testing. We choose the first 38 records for the training set and the last ten records for testing, adopting a 'leave-n-subjects-out' validation approach. Following this, we apply a sliding window of 3 seconds over the data with an overlap of 0.5 seconds.
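A rough sketch of the record-wise split and the windowing follows. It assumes that "overlap of 0.5 seconds" means consecutive 3-second windows share 0.5 seconds of signal (a step of 2.5 seconds); the function name and the placeholder record identifiers are illustrative and do not come from the original code.

```python
import numpy as np

FS = 360  # sampling rate of the MIT-BIH Arrhythmia recordings, in Hz

def sliding_windows(signal, fs=FS, window_s=3.0, overlap_s=0.5):
    """Cut a 1-D record into fixed-length windows that overlap by overlap_s seconds."""
    signal = np.asarray(signal)
    win = int(round(window_s * fs))            # 1080 samples
    step = win - int(round(overlap_s * fs))    # 900 samples (assumed interpretation)
    if len(signal) < win:
        return np.empty((0, win), dtype=signal.dtype)
    starts = np.arange(0, len(signal) - win + 1, step)
    return np.stack([signal[s:s + win] for s in starts])

# Record-wise split: first 38 records for training, last 10 for testing.
record_ids = list(range(48))                   # placeholder identifiers
train_ids, test_ids = record_ids[:38], record_ids[38:]

# Ten seconds of dummy data in place of a real record.
record = np.arange(10 * FS)
segments = sliding_windows(record)
print(segments.shape, len(train_ids), len(test_ids))
```

Splitting by record before windowing keeps every segment from a given recording on one side of the split, which is what a leave-subjects-out evaluation requires.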

C. MODEL
We designed a four-layer 1-D convolutional network with batch normalisation and ReLU (Rectified Linear Units) followed by a fully connected layer. The model architecture can be seen in Figure 3. We chose our CNN model as it has successfully estimated heart rate in previous works [25], [26]. In addition, CNNs are pervasive in embedded devices and have achieved high performance in many real-world problems. However, their implementation often requires high-performance hardware [27], [28]. Therefore, designing a CNN model also allows us to demonstrate the benefits of our custom loss function, which can reduce the complexity of such systems. See Section V for further details on the computational complexity of our proposed algorithm. The input size of the network is 1080 samples (a 3-second signal). The noisy data is input to the model, and the denoised data is the output.

D. CUSTOM LOSS FUNCTION
We design a custom loss function to prioritise overall signal improvement and preservation of important signal peaks. The loss function plays a critical role in training deep-learning models. The Mean Square Error (MSE) is commonly implemented as the loss function in signal denoising tasks. However, one problem with it is that regions of limited signal information receive the same priority as regions containing important features. While removing noise throughout the signal is essential, it is important that signal features are not mistaken for noise and smoothed. Hence, we design a loss function with two MSE components: the first is the global MSE, which is the typical loss function, and the second is the MSE pertaining only to regions where the QRS wave features exist. In addition, a weighting parameter, α, is applied to the second MSE term, enabling us to control the weighting given to important signal features as opposed to 'flat' regions in the signal. Unlike previous studies, our loss function considers all QRS complexes in the signal; see equation (3). The average QRS complex duration is 100ms in healthy individuals; therefore, we select 100ms on either side of the R-peak, which is easily identifiable by an automated routine based on an intense local maximum. This ensures we capture both the regular QRS complexes and the complexes with clinically significant arrhythmias present. The value α is a hyperparameter that determines the level of importance placed on the QRS complexes by the loss function relative to the ECG signal as a whole. The QRS locations are required only during training. When denoising the noisy test data, we use no labels, only the noisy ECG time-series signal. The intention is that, during training, the loss function drives the model parameters to emphasise the QRS part of the signal.
Where, again, $X_n$ represents the noisy signal and X is the clean ground-truth signal. n is the total number of QRS complexes in the 3-second signal segment, $R_{X_i}$ is the i-th QRS complex in signal X and $R_{X_{n_i}}$ is the i-th QRS complex in signal $X_n$.
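The structure of this loss (a global MSE plus an α-weighted MSE restricted to windows around each R-peak) can be illustrated as follows. This is a NumPy sketch under stated assumptions, not the authors' PyTorch implementation: the function name is hypothetical, `denoised` stands for the model output being compared with the clean target, and the QRS term is averaged over the complexes in the segment, which is one plausible reading of the description since the exact normalisation is not spelled out here.

```python
import numpy as np

def custom_loss(denoised, clean, r_peaks, alpha=2.0, fs=360, half_width_s=0.1):
    """Global MSE plus alpha times the mean per-complex MSE around each R-peak.

    r_peaks holds R-peak sample indices in the clean segment; each QRS window
    spans half_width_s seconds on either side of a peak (100 ms in the text).
    """
    denoised = np.asarray(denoised, dtype=float)
    clean = np.asarray(clean, dtype=float)
    sq_err = (denoised - clean) ** 2
    global_mse = sq_err.mean()

    half = int(round(half_width_s * fs))        # 36 samples at 360 Hz
    per_complex = []
    for r in r_peaks:
        lo, hi = max(0, r - half), min(len(clean), r + half + 1)
        per_complex.append(sq_err[lo:hi].mean())
    # Averaging over complexes is an assumption; a plain sum is also conceivable.
    qrs_mse = float(np.mean(per_complex)) if per_complex else 0.0
    return global_mse + alpha * qrs_mse

# An error confined to one QRS window is penalised far more than its
# contribution to the global MSE alone.
clean = np.zeros(1080)
denoised = clean.copy()
denoised[500 - 36:500 + 37] = 1.0
loss_qrs = custom_loss(denoised, clean, r_peaks=[500], alpha=2.0)
loss_mse_only = custom_loss(denoised, clean, r_peaks=[500], alpha=0.0)
print(loss_qrs, loss_mse_only)
```

With α = 0 the expression reduces to the standard MSE, so α directly controls how strongly errors inside the QRS windows are weighted relative to the rest of the segment. The same arithmetic carries over to tensor operations in a deep-learning framework.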

E. EVALUATION
To quantitatively evaluate our denoised data, we look at the SNR improvement in the ECG signal, the heart rate prediction error, and the IBI and HRV of the denoised vs noisy ECG signals. We also qualitatively evaluate our results through a visual inspection of the ECG in the time-series domain.
SNR is defined as the ratio of signal power to noise power, often expressed in decibels. A ratio higher than 1:1 (greater than 0 dB) indicates more signal than noise. For example, an SNR of 12dB indicates a more heavily corrupted signal than an SNR of 24dB, as proportionally more signal power is present in the 24dB signal. The SNR of the contaminated ECG segment can be adjusted by changing the parameter λ, as in equation (4), where X is the ECG signal of interest, n is the artefact, and λ is the hyperparameter that controls the SNR. Furthermore, it should be noted that λ is calculated from the RMS of the input signal X and the RMS of the input motion artefact signal n.
$$\mathrm{SNR}_{X_n} = 10 \log_{10} \frac{\mathrm{RMS}(X)}{\mathrm{RMS}(\lambda \cdot n)} \qquad (4)$$
The Root Mean Square (RMS) of a signal is given in equation (5), where N is the number of samples in the ECG signal segment (N = 1080) and $a_i$ denotes the i-th sample in the ECG signal.
$$\mathrm{RMS} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} a_i^2} \qquad (5)$$
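As a worked check, the scaling factor for a target SNR of a dB can be obtained by rearranging the SNR definition in the form given in this section (ten times the base-10 logarithm of the ratio of RMS values), using RMS(λ · n) = λ · RMS(n) for λ > 0. This is a derivation from the definition as written here, not a reproduction of the authors' equation (2); under the more common amplitude convention the exponent would be a/20 rather than a/10.

```latex
\begin{align}
a = 10 \log_{10} \frac{\mathrm{RMS}(X)}{\lambda \, \mathrm{RMS}(n)}
\quad &\Longrightarrow \quad
\lambda = \frac{\mathrm{RMS}(X)}{\mathrm{RMS}(n) \cdot 10^{a/10}} \\
a = 12\ \mathrm{dB}: \quad
\lambda \, \mathrm{RMS}(n) &= \frac{\mathrm{RMS}(X)}{10^{1.2}} \approx 0.063 \, \mathrm{RMS}(X)
\end{align}
```

Under this definition, a 12dB target therefore scales the artefact so that its RMS is roughly 6% of the RMS of the clean segment.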

1) Interbeat interval (IBI)
Heart rate in physiological studies is mostly derived from measurements taken from the electrocardiogram. First, the number of R-waves per unit time, or the time between these waves (the interbeat interval), is measured. This time can be translated to the rate of the heart for any collection of beats. Such detailed measurements permit analysis of how the heart reacts beat by beat to environmental and physiological stimuli. Unfortunately, while the interbeat interval is essential for clinical diagnosis, it is easily corrupted by noise. For the IBI, we calculate the location of the R-peaks using scipy.signal.find_peaks [29] and return the R-peak differences in milliseconds.
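A minimal sketch of this step is shown below. `scipy.signal.find_peaks` is the routine named in the text, but the height threshold, the minimum peak spacing and the function name are illustrative assumptions rather than the settings used in the experiments.

```python
import numpy as np
from scipy.signal import find_peaks

def interbeat_intervals_ms(ecg, fs=360, min_rr_s=0.3, rel_height=0.5):
    """Locate R-peaks and return (IBIs in milliseconds, peak indices).

    rel_height and min_rr_s are hypothetical settings chosen for illustration.
    """
    ecg = np.asarray(ecg, dtype=float)
    peaks, _ = find_peaks(
        ecg,
        height=rel_height * ecg.max(),         # ignore small deflections
        distance=int(round(min_rr_s * fs)),    # refractory gap between beats
    )
    return np.diff(peaks) / fs * 1000.0, peaks

# Synthetic 3-second segment with idealised R-peaks every 300 samples.
ecg = np.zeros(1080)
ecg[[100, 400, 700, 1000]] = 1.0

ibi_ms, peaks = interbeat_intervals_ms(ecg)
heart_rate_bpm = 60000.0 / ibi_ms.mean()
print(peaks, ibi_ms, heart_rate_bpm)
```

On real ECG, the threshold and spacing usually need tuning per lead, since tall T-waves or residual artefact can otherwise be picked up as beats.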

2) Heart rate variability (HRV)
Informative cardiac metrics rely not just on the heart rate but also on how the heart rate varies. Thus, another vital feature of measuring the cardiac state is heart rate variability. HRV is the temporal variation between consecutive heartbeats (RR intervals). A higher heart rate variability is associated with good health. On the other hand, a low HRV is associated with ill health and is a significant predictor of mortality from several diseases [30]. In this experiment, we use the R-peaks to calculate the HRV. To reliably measure HRV and low-frequency cardiac components, long-term ECG records of at least 24 hours are necessary. However, short recordings can effectively capture the higher-frequency cardiac components. Recordings as short as 5 minutes are adequate for HRV but limit the sources of variability [31], [32]. Therefore, we analyse the ECG signals over 1 hour. The HRV analysis is computed using the neurokit2 package [33].
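For orientation, the two simplest time-domain HRV summaries, the mean of the R-R intervals and their standard deviation (commonly called SDNN), can be computed directly from the interval series. The sketch below uses plain NumPy instead of the neurokit2 package used for the actual analysis; the function name is hypothetical and the choice of the sample standard deviation (`ddof=1`) is an assumption.

```python
import numpy as np

def hrv_mean_sdnn(rr_ms, ddof=1):
    """Return (mean R-R interval, SDNN) in milliseconds.

    ddof=1 (sample standard deviation) is an assumed convention.
    """
    rr = np.asarray(rr_ms, dtype=float)
    return rr.mean(), rr.std(ddof=ddof)

# Illustrative R-R series in milliseconds (made-up values, not study data).
rr = [800.0, 820.0, 780.0, 810.0, 790.0]
mean_rr, sdnn = hrv_mean_sdnn(rr)
print(f"mean RR = {mean_rr:.1f} ms, SDNN = {sdnn:.2f} ms")
```

Spurious or missed R-peaks in a noisy recording inflate SDNN quickly, which is why these summaries are sensitive to the quality of the denoising.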

IV. RESULTS
The following section details the denoising results for the electrode motion artefact on the ECG dataset at differing noise and α levels. Figures 4 to 6 illustrate the clean, ground-truth signal, the noisy ECG signal corrupted with electrode motion artefact and the denoised ECG signal. We show qualitative results for the same noisy input signal at 6dB, 12dB and 24dB SNR levels. It is readily apparent that the model denoises the heavily corrupted ECG signals.
FIGURE 4. ECG signals before and after denoising, with an artificial offset present in the noisy signal for visualisation purposes. The initial SNR of the noisy signal is 6dB. The heavily corrupted 6dB signal shows no distinct QRS structure. The denoised signal contains a more recognisable QRS complex structure.
FIGURE 5. ECG signals before and after denoising, with an artificial offset present in the noisy signal for visualisation purposes. The initial SNR of the noisy signal is 12dB. Although the signal is not as heavily corrupted as the 6dB signal, the QRS complexes are difficult to discern visually. The denoised signal has distinct QRS complexes similar to the clean signal.
We demonstrate our metrics for the 24, 12 and 6dB noisy input signal at varying α levels to evaluate the model and custom loss function quantitatively. Firstly, we compare the SNR at the input to SNR at the output and compare the input and output signals' heart rate error (HRE). The HRE is defined here as the absolute difference between the estimated heart rate for a given ECG sample and the heart rate calculated from its corresponding ground-truth ECG sample. Finally, we present the Pearson Correlation Coefficient (PCC) as a quantitative metric to measure the linear relationship between our signals. The results for these metrics can be found in Table 1.
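The two comparison metrics defined above can be sketched as follows. The text does not state how heart rate is estimated within a segment, so deriving it from the mean R-R spacing of detected peaks is an assumption here, as are the function names.

```python
import numpy as np

def heart_rate_bpm(r_peaks, fs=360):
    """Heart rate from the mean spacing of R-peak sample indices (assumed estimator)."""
    r_peaks = np.asarray(r_peaks)
    if len(r_peaks) < 2:
        return float("nan")
    return 60.0 * fs / np.mean(np.diff(r_peaks))

def heart_rate_error(est_peaks, ref_peaks, fs=360):
    """Absolute difference between estimated and ground-truth heart rate, in bpm."""
    return abs(heart_rate_bpm(est_peaks, fs) - heart_rate_bpm(ref_peaks, fs))

def pearson_cc(x, y):
    """Pearson correlation coefficient between two equal-length signals."""
    return float(np.corrcoef(x, y)[0, 1])

# Ground-truth peaks every 300 samples (72 bpm) vs estimated peaks every 360 (60 bpm).
hre = heart_rate_error([100, 460, 820], [100, 400, 700, 1000])

t = np.arange(1080) / 360.0
clean = np.sin(2 * np.pi * 1.2 * t)
pcc_scaled = pearson_cc(clean, 2.0 * clean + 1.0)
print(hre, pcc_scaled)
```

Note that the PCC is insensitive to gain and offset, so it complements the SNR, which penalises any amplitude mismatch between the denoised and clean signals.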
As can be seen in the SNR, HRE and PCC metrics in Table 1, the custom loss function outperforms the standard MSE loss function for medium-to-high noise levels. With increasing noise levels, the higher α values perform better.
FIGURE 6. ECG signals before and after denoising, with an artificial offset present in the noisy signal for visualisation purposes. The initial SNR of the noisy signal is 24dB. The mildly distorted 24dB signal contains clear QRS locations with some noisy elements. The denoised signal retains these QRS complexes and shows a reduction in the overall noise present in the signal.
Furthermore, for the 12dB signal, the custom loss function improves the SNR by 0.5 dB and the HRE by eight beats per minute, or 25%, and improves the PCC from 0.71 to 0.8 versus the standard MSE loss function. However, we consider this a good but not comprehensive evaluation of the models, as the purpose of the custom loss function is to preserve the QRS complexes. Therefore, we further analyse the IBI and HRV of the denoised ECG signals below, as they depend on the R-peaks of the signals. These results are shown in Figures 7 to 9.
For the remainder of this section, we show results for the 12dB signals at α = 2. Figure 7 illustrates the R-R intervals in the ground-truth signal, Figure 8 the R-R intervals for the denoised signals and Figure 9 those for the noisy signal. The R-R intervals are calculated over the whole test dataset, which contains around 1500 ECG segments, each 3 seconds long. The denoised version shows an R-R variation similar to the clean ground truth, whereas the noisy ECG signals lie far outside the ground-truth range. To be more explicit, the distributions of the R-R intervals cover a range of 450-750ms for the ground-truth ECG vs a 500-700ms range for the denoised ECG and a 300-1000ms range for the noisy ECG. The HRV values for the ECG signals are presented in Table 2. HRV Mean is the mean of the R-R intervals, and HRV SDNN is the standard deviation of the R-R intervals. The outliers in Figure 8 are also present in Figure 7 for the ground-truth signal. These outliers can manifest themselves in many ways and are a testament to why this task is not trivial to solve with standard methods. From the denoised HRV, we can see that many outliers have been removed. A clean, distinct HRV can provide a considerable reference to practitioners for the diagnosis of patients.
Overall, a more accurate picture of cardiac health can be deduced from the cleaner HRV/IBI information following the denoising of the ECG signals.
FIGURES 7-9. Distribution of R-R intervals.

V. DISCUSSION
We have found that the proposed loss function outperforms the standard mean squared error loss function based on the experimental results shown in Table 1, where the output SNR improves by 0.5 dB. More noticeably, the heart rate error improves using the custom loss across most α and input SNR values. Specifically, the QRS complexes are better preserved by the custom loss function, leading to the preservation of important, relevant ECG information. This allows us to calculate heart rate more accurately, as the loss function now places more importance on the QRS complexes.
We also completed a short experiment to demonstrate that the proposed loss function further allows for less complex and faster models for denoising the ECG signals. Figures 10 and 11 present qualitative evidence of the denoising capabilities of a simple CNN model without and with the custom loss function, respectively. The less complex CNN model is made of four convolutional layers and one output layer. The original model presented in Figure 3 contains 2,322,180 trainable parameters, whereas the reduced-complexity model discussed here contains only 1,562 trainable parameters, a reduction by a factor of almost 1,500.
The reduced-complexity CNN model using the standard MSE loss function does not converge or learn any characteristics of the ECG. In contrast, the same model with the custom loss function learns to denoise the ECG adequately and achieves better performance with the same volume of training data and number of training epochs.
FIGURE 10. ECG signals at 24dB input with simple CNN and MSE loss function.
FIGURE 11. ECG signals at 24dB input with simple CNN and custom loss function: ECG signals before and after denoising with a less complex CNN and custom loss function. An artificial offset is present in the noisy signal for visualisation purposes. The initial SNR of the noisy signal is 24dB. The reduced-complexity CNN model using the custom loss function learns to denoise the ECG adequately.

VI. CONCLUSION
This paper proposes a convolutional network with a novel loss function to preserve the QRS complex structure and denoise noisy ECG signals. The custom loss function computes a weighted combination of global and local Mean Square Errors and, together with the proposed model, improves the denoising performance on the ECG in terms of SNR and heart rate. This demonstrates the capability of the algorithm to balance effectively between denoising the signal and preserving the peaks. Furthermore, the HRV of the denoised ECG signals corresponds closely to that of the ground-truth ECG, which significantly benefits HRV analysis (see Table 2). With high noise reduction and low signal distortion, the proposed method is practical and better suited to clinical prognosis than the standard MSE baseline. Our additional experiment shows that the custom loss function can reduce the computational cost associated with CNNs. Frameworks such as TensorFlow Lite and PyTorch Mobile now allow deep-learning models to run on-device. As such, this custom loss function demonstrates that accurate physiological signal processing can be run more sustainably, in the interest of green AI.

ACKNOWLEDGMENT
We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp used for this research.