Generation of Seismocardiography Heartbeats Using a Wasserstein Generative Adversarial Network With Feature Control

A Generative Adversarial network generated synthetic seismocardiography heartbeats from the latent space and conditional subject identifier. This resulted in realistic he...

Impact Statement:A GAN for SCG enables the generation of diverse, realistic cardiac signals which offer a cost-effective and scalable way to improve diagnostic and predictive algorithms.

Abstract:

Goal: Seismocardiography (SCG) offers critical insights into cardiac performance, but its analysis often faces challenges due to the limited availability of data. This study aims to generate synthetic SCG heartbeats which can augment existing datasets to enable more research avenues. Methods: We trained a Wasserstein generative adversarial network (GAN) with gradient penalty on authentic SCG heartbeats. It was conditioned with embedded subject-specific identifiers to create individualized heartbeats. We employed linear permutations in the latent and conditional spaces to control signal features, and a convolutional network to classify lung volume states from real and synthetic data separately. Results: The model effectively replicated SCG signal morphology, while maintaining a level of variance which matches the variability of cardiac activity. Comparisons with real SCG waveforms yielded Pearson's r-squared correlation of 0.62 for average heartbeats. Linear manipulations were successful in controlling simple features although they were limited in more complex characteristics. Additionally, the model demonstrated strong performance in practical applications, with the synthetic data achieving an accuracy of 88% in lung volume classification as compared to 89% achieved with real data. Augmenting real data with additional synthetic data improved performance by 3%. Conclusions: GANs for artificial SCG heartbeat generation produce realistic and diverse results that have the potential to overcome data limitations, thereby enhancing SCG-based research.

Published in: IEEE Open Journal of Engineering in Medicine and Biology ( Volume: 6)

Page(s): 119 - 126

Date of Publication: 23 October 2024

Electronic ISSN: 2644-1276

DOI: 10.1109/OJEMB.2024.3485535

SECTION I.

Introduction

Seismocardiography (SCG) is an innovative cardiovascular monitoring method using wearable technology to measure chest wall vibrations caused by the mechanical activity of the heart. It captures key vibrations, especially those linked to valvular movements, and provides crucial physiological metrics, including heart rate [1], pre-ejection period [2], left-ventricular ejection time [3], and respiration [4]. These features can empower clinicians to assess cardiac function continuously and beyond the clinic [5]. Compared to technologies like electrocardiography (ECG), SCG benefits from its compact form and the ability to integrate multiple physiological metrics into a single, unobtrusive sensor, simplifying monitoring and offering unique insights into cardiac mechanics. With the growing use of machine learning in its processing, SCG can be used to analyze nuanced cardiac conditions and parameters such as heart failure [6], valvular disorders [7], stroke volume [8], and blood pressure [9]. However, effective model training requires large datasets, which are scarce due to SCG's emerging status [10]. Advancing SCG processing requires researchers to collect their own data, a task full of difficulties. With few commercially available options, researchers must often build custom solutions [11], [12], design clinically validated trials, and navigate logistical hurdles such as time, cost, and subject recruitment. These obstacles hinder the broader use of SCG in research and clinical settings, emphasizing the need for more accessible, comprehensive datasets to fully realize the technology's potential.

Synthetic data generation offers an alternative to address these challenges, filling gaps where real data is limited or unavailable. It can train algorithms, augment datasets to improve performance, or correct class imbalance, like underrepresented cardiac conditions or demographics [13], [14], enhancing model robustness [15], [16]. Moreover, synthetic data can mitigate privacy concerns by replicating patient data characteristics without personal information, aiding data sharing while protecting privacy [17], [18].

The generative adversarial network (GAN), originally an image generation technique [19], is also effective for time-series generation [20]. In biomedical applications, GANs have generated signals such as ECG [16], [21], photoplethysmography [13], and encephalography [22]. However, SCG generation remains underexplored. Current efforts use cycle-GANs to generate SCG from ECG [23], or transformer networks to generate SCG from a predefined embedding signal [24]. Both treat SCG generation as a sequence-to-sequence translation problem, which requires a predefined signal to be supplied to the generator.

In this work, we propose a GAN to generate SCG heartbeats directly from the latent space, independent of input signals like ECG. This enhances creativity and flexibility, allowing unique outputs not limited by direct input mappings. It also improves generalization since abstract latent representations enable models to handle variations not explicitly seen during training and provides control over generated features. This work improves upon the GAN concept presented in [25]. Generated heartbeats are analyzed for structural similarity to real heartbeats, and we demonstrate how subject-specific features can be tuned. Finally, the model is applied to a real-world problem to validate the utility of these generated heartbeats.

SECTION II.

Materials and Methods

A. Dataset

The research was carried out at McGill University and involved 62 participants (27 females, 35 males) with (mean ± standard deviation) age: 24.6 ± 4.5 years, height: 172 ± 10.4 cm, and weight: 70.2 ± 16.3 kg. Only healthy subjects with no known cardiovascular or respiratory ailments were considered. Data was collected in three scenarios to capture a broad spectrum of SCG patterns influenced by varying lung volumes, which are known to significantly modulate the SCG signal [26]. Subjects were recorded at rest, during breath-holding post-inhalation, and during breath-holding post-exhalation. Data collection and pre-processing details are described in the Supplementary Materials. The resulting dataset comprises 39764 normalized SCG heartbeats, providing a high-quality input for developing accurate and realistic synthetic heartbeats using generative models.

B. Generative Framework

To generate personalized, subject-specific SCG data, we employed a conditional GAN [27]. For background on GANs, including the adversarial process between generator and discriminator and the subject-specific conditioning, refer to the Supplementary Materials. The model architecture was built using a similar structure to the deep convolutional GAN (DCGAN) framework [28], tailored to our data, and trained using the Wasserstein GAN with gradient penalty (WGAN-GP) [29]. The WGAN-GP approach is particularly effective for generating SCG heartbeats because it provides a more stable training process, reducing the risk of mode collapse, which would limit the variety of generated heartbeats. In this paradigm, rather than utilizing the binary cross-entropy loss for training, a Wasserstein distance is adopted as the objective function [30]. A distinctive feature of this method is the introduction of a gradient penalty that keeps the norm of the discriminator's gradients close to one [29]. This gradient penalty improves upon the weight clipping technique introduced in the foundational Wasserstein GAN (WGAN) [30], a method that occasionally led to suboptimal outcomes and constrained the network's potential [29].

The model architecture can be seen in Fig. 1. The generator was supplied with two inputs: the random latent space and the unique subject identifier. An embedding layer processed the labels, producing an output dimension of 50. Subsequently, a dense layer containing 32 nodes was applied. Separately, the latent space, a random vector of 100 elements, was fed to a fully connected layer with 32 nodes and a leaky ReLU activation. Both inputs were then concatenated. The main structure of the generator consisted of four one-dimensional transposed convolutional layers, successively containing 256, 128, 64, and 1 filters. Each layer had a kernel size of 4, a stride of 2, and ‘same’ padding. Following each layer, batch normalization and a leaky ReLU activation were implemented, except for the final layer, which employed just a hyperbolic tangent activation.
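For reference, the discriminator (critic) objective minimized in WGAN-GP [29] is commonly written as follows. The notation is the conventional one from the WGAN-GP literature and is not defined in this paper: $D$ is the discriminator, $x$ a real heartbeat, $\tilde{x}$ a generated heartbeat, $\hat{x}$ a random interpolate between the two, and $\lambda$ the penalty weight (the value used in this work is not stated here). The subject-label input to $D$ is omitted for brevity.

$$ L_D = \mathbb{E}_{\tilde{x}}\left[ D(\tilde{x}) \right] - \mathbb{E}_{x}\left[ D(x) \right] + \lambda \, \mathbb{E}_{\hat{x}}\left[ \left( \left\lVert \nabla_{\hat{x}} D(\hat{x}) \right\rVert_2 - 1 \right)^2 \right], \qquad \hat{x} = t\,x + (1 - t)\,\tilde{x}, \quad t \sim U[0, 1] $$

The first two terms estimate the Wasserstein distance between the real and generated distributions, and the last term is the gradient penalty that replaces weight clipping.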

Fig. 1.

Architecture of the proposed generative adversarial network. The generator (left) receives inputs from subject labels and a random latent space and outputs generated heartbeats. On the other hand, the discriminator (right) takes inputs from subject labels and either real or generated heartbeats and subsequently evaluates the authenticity of the heartbeat data.

For the discriminator model, two inputs were designated: the subject labels and the heartbeats, which could either be authentic or synthetic. The subject label embedding mirrored the procedure in the generator. Heartbeats were supplied directly alongside the processed subject identifiers. The discriminator was composed of three one-dimensional strided convolutional layers, sequentially equipped with 64, 128, and 256 filters. These layers maintained a kernel size of 4, a stride of 2, and ‘same’ padding, and were each followed by a leaky ReLU activation. The convolutional layers were followed by a flatten layer and a concluding single-node dense layer.
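The layer descriptions above imply how the signal length changes through each network. The following is a minimal sketch of that arithmetic, assuming the common deep-learning convention that a stride-2 layer with ‘same’ padding halves the length (convolution, rounded up) or doubles it (transposed convolution). The filter counts come from the text; the seed length of 32 and the resulting heartbeat length of 512 are hypothetical values chosen only for illustration, since the actual lengths are not given here.

```python
import math

GENERATOR_FILTERS = [256, 128, 64, 1]   # transposed convolutions, from the text
DISCRIMINATOR_FILTERS = [64, 128, 256]  # strided convolutions, from the text
STRIDE = 2                              # from the text


def conv_same_length(length, stride):
    """Output length of a strided 1-D convolution with 'same' padding."""
    return math.ceil(length / stride)


def transposed_conv_same_length(length, stride):
    """Output length of a strided 1-D transposed convolution with 'same' padding."""
    return length * stride


def trace_generator(seed_length):
    """Return (length, channels) after each transposed convolution."""
    shapes = []
    length = seed_length
    for filters in GENERATOR_FILTERS:
        length = transposed_conv_same_length(length, STRIDE)
        shapes.append((length, filters))
    return shapes


def trace_discriminator(heartbeat_length):
    """Return (length, channels) after each strided convolution."""
    shapes = []
    length = heartbeat_length
    for filters in DISCRIMINATOR_FILTERS:
        length = conv_same_length(length, STRIDE)
        shapes.append((length, filters))
    return shapes


# Hypothetical seed length; the paper does not state the reshaped seed size.
gen_shapes = trace_generator(seed_length=32)
disc_shapes = trace_discriminator(heartbeat_length=gen_shapes[-1][0])
# Number of values entering the final single-node dense layer after flattening.
flattened_size = disc_shapes[-1][0] * disc_shapes[-1][1]
```

Under these assumptions, four stride-2 transposed convolutions upsample the seed by a factor of 16, and three stride-2 convolutions downsample the heartbeat by a factor of 8 before flattening.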

C. Training and Evaluation

The GAN structure was trained to minimize the Wasserstein distance at the discriminator output between the real and synthetic SCG heartbeats. The main goal is to generate high-quality, realistic SCG heartbeats, though assessing the quality of generated samples is an ongoing challenge in generative models [31]. For SCG, there are currently no standard references for quality assessment. However, the pseudo-periodicity of the SCG waveform [12] enables performance evaluation. Despite variability between subjects, patterns in SCG morphology are relatively stable within individuals, making template matching a common method for tracking cardiac time intervals [32]. Therefore, we can compare the structural similarity of the generated and real samples with respect to their templates using Pearson's squared correlation coefficient (r²) and root-mean-squared error (RMSE), which together provide a comprehensive view of accuracy and consistency.

Details on the evaluation can be found in the Supplementary Materials. In summary, we evaluated four scenarios: First, metrics were calculated between real SCG samples and their real template to establish a baseline for natural SCG variability. Second, metrics between synthetic SCG samples and their synthetic template tested the similarity of synthetic samples, ideally matching the baseline. Third, metrics between synthetic samples and real templates evaluated how well synthetic samples mimic real ones. Finally, synthetic and real templates were compared for an average similarity per subject. Our proposed model was compared to DCGAN [25], [28] and WGAN [30] implementations, using the same convolutional structure.
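The four comparisons can be sketched as follows. This is an illustrative outline only, assuming the template is the point-wise ensemble average of equal-length heartbeats and that per-sample metrics are averaged within a subject; the exact procedure is in the Supplementary Materials. The function names and the tiny four-sample "heartbeats" are hypothetical.

```python
import math


def ensemble_template(beats):
    """Point-wise mean of a list of equal-length heartbeats."""
    n = len(beats)
    return [sum(samples) / n for samples in zip(*beats)]


def pearson_r2(x, y):
    """Squared Pearson correlation; undefined for constant signals."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov * cov / (var_x * var_y)


def rmse(x, y):
    """Root-mean-squared error between two equal-length signals."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def mean_similarity(samples, template):
    """Average (r2, RMSE) of each sample against one template."""
    r2_values = [pearson_r2(s, template) for s in samples]
    errors = [rmse(s, template) for s in samples]
    return sum(r2_values) / len(r2_values), sum(errors) / len(errors)


# Hypothetical toy data for one subject (real heartbeats are much longer).
real_beats = [[0.0, 1.0, 0.2, -0.8], [0.1, 0.9, 0.0, -1.0], [0.0, 1.1, 0.1, -0.9]]
synthetic_beats = [[0.0, 0.9, 0.1, -0.9], [0.1, 1.0, 0.2, -0.8], [0.0, 1.0, 0.0, -1.0]]

real_template = ensemble_template(real_beats)
synthetic_template = ensemble_template(synthetic_beats)

scenarios = {
    "real samples vs real template (baseline)": mean_similarity(real_beats, real_template),
    "synthetic samples vs synthetic template": mean_similarity(synthetic_beats, synthetic_template),
    "synthetic samples vs real template": mean_similarity(synthetic_beats, real_template),
    "synthetic template vs real template": (
        pearson_r2(synthetic_template, real_template),
        rmse(synthetic_template, real_template),
    ),
}
```

Each entry of `scenarios` holds an (r², RMSE) pair, so the first three can be compared against the baseline and the last read as an overall similarity score.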

D. Feature Control

After training, we investigated methods to control features in the generated signals. Tuning waveform features can enhance realism, provide diverse samples, or offer customizations for specific research needs. Feature control was demonstrated by manipulating the generator's input via the latent and conditional spaces.

1) Latent Space Manipulations

The generator constructs SCG heartbeats from latent space inputs, which act as seeds for the generated outputs. By interpreting and decoding the latent space into features, we can manipulate it to create tunable heartbeats. While many works have explored the latent space [33], [34], it has been shown that even linear combinations can produce meaningful tuning of parameters [35]. First, the trained model generated 1000 heartbeats for each subject. We then extracted the following features from the heartbeats: maximum and minimum amplitude, amplitudes of the first vibrational pulse (V1) and the second vibrational pulse (V2) [36], average frequency, energy, and timing. Then, as shown in (1), a linear regression model for each subject was used to predict each feature, $f$, from the latent vectors, $Z$, with coefficients, $w$.

$$ f = w \cdot Z \tag{1} $$

Then, as shown in (2), using the regression coefficients, $w$, and a step size, $\epsilon$, we can make small perturbations to each latent vector, $Z_0$, to create a new latent vector, $\hat{Z}$, which manipulates the desired feature in each heartbeat.

$$ \hat{Z} = Z_0 + \epsilon \cdot w \tag{2} $$
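To see why stepping along $w$ moves the feature, substitute (2) into (1): under the linear model, $w \cdot \hat{Z} = w \cdot Z_0 + \epsilon \lVert w \rVert^2$, so the predicted feature changes in proportion to $\epsilon$. The sketch below checks this numerically. It is illustrative only: the coefficients and latent vector are hypothetical three-element values (the actual latent vector has 100 elements), the regression fit is omitted, and the feature is taken from the linear model rather than extracted from a generated heartbeat as in the study.

```python
def predict_feature(w, z):
    """Equation (1): linear feature estimate f = w . Z."""
    return sum(wi * zi for wi, zi in zip(w, z))


def shift_latent(z0, w, epsilon):
    """Equation (2): Z_hat = Z0 + epsilon * w."""
    return [zi + epsilon * wi for zi, wi in zip(z0, w)]


# Hypothetical regression coefficients and starting latent vector.
w = [0.5, -1.0, 2.0]
z0 = [0.1, 0.2, -0.3]
steps = [-0.2, -0.1, 0.0, 0.1, 0.2]  # five discrete values of epsilon

base_feature = predict_feature(w, z0)
w_norm_sq = sum(wi * wi for wi in w)

# Feature predicted by the linear model at each step along w.
features = [predict_feature(w, shift_latent(z0, w, eps)) for eps in steps]

# The closed form derived above: f(Z_hat) = f(Z0) + epsilon * ||w||^2.
expected = [base_feature + eps * w_norm_sq for eps in steps]
```

In practice the generator is nonlinear, so the feature measured on the generated heartbeat only follows this trend approximately, which is consistent with simple features being easier to control than complex ones.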

2) Conditional Vector Manipulations

The conditional label enabled the generator to produce subject-specific SCG features. Although this was useful for generating realistic heartbeats from subjects within the dataset, it restricted the generator's ability to fabricate heartbeats from subjects beyond the original dataset. To circumvent this dilemma, we harnessed the potential of linear combinations applied to the conditional vectors, fostering the creation of novel SCG patterns. The embedding layer of the model translated an integer-based subject identifier into a distinct 50-dimensional vector. By strategically manipulating this vector, new, unique combinations can be synthesized [37]. For instance, by selecting vectors corresponding to two distinct subjects, we can interpolate between them, generating a composite subject that exhibits characteristics derived from both precursors. We therefore employed a linear combination, as shown in (3), to define a new subject vector, denoted as $\hat{Y}$ .

$$ \hat{Y} = \alpha_s \cdot Y_1 + \left( 1 - \alpha_s \right) \cdot Y_2 \tag{3} $$

This is achieved by applying a specific ratio, ${{\alpha }_s}$ , between two pre-existing subjects, ${{Y}_1}$ and ${{Y}_2}$ . This ratio provides the means to finely tune the influence of each subject's features in the composite heartbeat pattern of the newly generated subject.
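Equation (3) amounts to an element-wise weighted average of two embedding vectors. A minimal sketch follows, using hypothetical three-element vectors in place of the 50-dimensional subject embeddings; the function name is illustrative.

```python
def interpolate_subjects(y1, y2, alpha_s):
    """Equation (3): Y_hat = alpha_s * Y1 + (1 - alpha_s) * Y2."""
    if len(y1) != len(y2):
        raise ValueError("subject vectors must have the same length")
    return [alpha_s * a + (1.0 - alpha_s) * b for a, b in zip(y1, y2)]


# Hypothetical embeddings for two subjects.
subject_a = [1.0, 0.0, 2.0]
subject_b = [0.0, 1.0, 4.0]

# 25% of subject A and 75% of subject B.
blended = interpolate_subjects(subject_a, subject_b, alpha_s=0.25)
```

Setting `alpha_s` to 1 or 0 recovers the first or second subject exactly, and intermediate values blend the two. The blended vector would then be passed to the generator in place of the embedding layer's output.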

E. Application of Synthetic Heartbeats for Lung Volume Classification

After exploring tuning waveform features, we assessed the practical utility of synthetic SCG signals. Although the generator produces synthetic SCG heartbeats that structurally and statistically resemble the real heartbeats, their utility in practical applications remains to be demonstrated. To test if synthetic heartbeats can replace authentic signals, we utilized them in a real-world scenario: classifying lung volume states with a convolutional neural network. We adapted a previously demonstrated model [38] to classify high lung volume and low lung volume states during breath-holding. We built and trained a convolutional neural network with the same configuration as described in the paper. The subjects were split into a training set and a test set using an 80/20 train-test split. Using just the training subjects, we trained separate GANs on high and low lung volume heartbeats, to generate targeted synthetic data. The classification network was trained separately on authentic and synthetic datasets, using equal data amounts. Both training instances were evaluated on the same real test data to determine if synthetic data can replace real data in training. Finally, we augmented the complete real training dataset with a varying amount of synthetic data to test if it could improve performance. We evaluated the classification performance on the same held out set of real, unseen subjects to gauge the effectiveness of synthetic data in this application.
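The key point of this protocol is that the split is made by subject, not by heartbeat, so no test subject contributes data to either the GANs or the classifier during training. The sketch below illustrates a subject-wise split and the augmentation step. The function names, the random seed, and the rounding rule for the test-set size are assumptions made for illustration, not details taken from the study.

```python
import random


def split_subjects(subject_ids, test_fraction=0.2, seed=0):
    """Shuffle subject IDs and hold out a fraction of subjects for testing."""
    ids = sorted(subject_ids)
    random.Random(seed).shuffle(ids)
    n_test = max(1, round(len(ids) * test_fraction))
    return ids[n_test:], ids[:n_test]  # (train subjects, test subjects)


def augment(real_beats, synthetic_beats, extra_fraction):
    """Append synthetic beats equal to a fraction of the real dataset size."""
    n_extra = int(len(real_beats) * extra_fraction)
    return real_beats + synthetic_beats[:n_extra]


# 62 participants, as in the dataset; IDs 0..61 are placeholders.
train_subjects, test_subjects = split_subjects(range(62))

# Placeholder beat lists standing in for real and generated training data.
real_train = ["real"] * 10
synthetic_train = ["synthetic"] * 20
augmented_train = augment(real_train, synthetic_train, extra_fraction=0.5)
```

With 62 subjects and this rounding rule, 12 subjects are held out and 50 are used for training; `extra_fraction=0.5` corresponds to adding synthetic data equal to 50% of the real training set.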

SECTION III.

Results

A. Generated Heartbeats

SCG heartbeats were generated by the conditional WGAN-GP model explained in Section II-B. Fig. 2 compares authentic heartbeats (top row) and synthetic heartbeats (middle row) for three subjects, with the heartbeats in both rows randomly selected. The model produced synthetic heartbeats that visually resemble real heartbeats while maintaining their variability [12]. Generally, the first vibrational pulse (V1) appears clearer than the second (V2) [36] in both real and synthetic heartbeats due to the variability of the systolic period. The third row of Fig. 2 shows average waveforms from 50 randomly sampled real and synthetic heartbeats from the same subjects, demonstrating alignment of generated templates with real ones and reflecting the intra-subject stability.

Fig. 2.

(a)–(c) Example of 10 randomly selected real heartbeats from three subjects, and (d)–(f) 10 randomly generated synthetic heartbeats from the same three subjects. (g)–(i) Ensemble averaged templates of each subject shown for real samples (black) and synthetic samples (red).

The performance and structural similarity of the generated results for three model architectures (DCGAN, WGAN, and WGAN-GP) were quantified with r² and RMSE. Fig. 3 shows a comparison of the three models. As a baseline, real samples compared to their respective templates produced an r² of 0.32 and an RMSE of 0.084 m/s². The generated heartbeats were compared to the templates from both the real samples and the synthetic samples. Ideally, if the model captured realistic SCG heartbeats with the same level of variability, both metrics would be on par with the baseline. For the comparison of synthetic samples to their real templates, all models had an r² much lower than the baseline. However, for the RMSE, the DCGAN and WGAN-GP approaches were roughly equal to the baseline scenario, and the WGAN model had a much lower RMSE. Next, comparing synthetic samples to their respective synthetic templates, DCGAN and WGAN had a much higher r², whereas WGAN-GP was on par with the baseline. For RMSE, DCGAN and WGAN were much lower than the baseline, whereas WGAN-GP was again on par. This demonstrates that while DCGAN and WGAN produced realistic-looking results, they lacked the natural variability observed in SCG. Additionally, when examined visually, we observed that DCGAN suffered from mode collapse within each subject. Conversely, WGAN failed to learn the class conditions. Only WGAN-GP produced both realistic and diverse results, with errors closest to the baseline. The final evaluation metric compared the synthetic templates to their respective real templates, where the ideal case would maximize the r² and minimize the RMSE. As with the previous metrics, the WGAN-GP method had the best results, with an r² of 0.60 and an RMSE of 0.035 m/s².

Fig. 3.

Structural similarity metrics of (a) Pearson's correlation coefficient (r²), and (b) root-mean-squared-error (RMSE). DCGAN (blue), WGAN (purple), and WGAN-GP (red) were evaluated in three scenarios: comparing synthetic samples to their real templates, synthetic samples to their synthetic templates, and synthetic templates to their real templates. The baseline metric which compares the real samples to their real templates is shown by the black dashed line.

B. Feature Tuning

1) Latent Space

Tunable features were controlled using the linear regression from the latent space to the extracted feature values. We evaluated each subject using 50 generated heartbeats. The average correlation coefficient between 10 discrete linear steps, $\epsilon$, and the extracted feature, $f$, is shown in Table I. The results show that amplitude features can be tuned using linear latent space perturbations. More complex features, such as mean frequency or signal energy, had a poor correlation. Additionally, the results show that cardio-specific features, such as the amplitudes of V1 or V2, can also be adaptively tuned. Fig. 4(a) shows tuning of the maximum amplitude of V1, in which a range of amplitudes from 0.24 to 0.38 is shown for five $\epsilon$ values. Note that the rest of the heartbeat remained relatively unchanged. Then in Fig. 4(b), the same heartbeat was manipulated to tune the maximum amplitude of V2, demonstrating an amplitude range of 0.14 to 0.25. Again, the rest of the heartbeat remained relatively unchanged. These examples highlight the ability to generate heartbeats with tunable features without having to retrain the entire generator.

TABLE I Correlation Coefficient of Extracted Features During Latent Space Manipulation

Fig. 4.

Tunable features from manipulating the latent space of a single heartbeat for (a) max amplitude of the first vibrational pulse (V1) and (b) max amplitude of the second vibrational pulse (V2), for five discrete linear steps, $\epsilon$ , in the latent space. (c) Average waveforms during linear interpolation between the conditional vector of two subjects with the percentage of the first subject shown in the legend.

2) Conditional Vector

Novel subject combinations were created by interpolating between two existing conditional vectors. This resulted in heartbeats that exhibited features partially from each subject. Fig. 4(c) shows average waveforms of generated heartbeats after interpolation, where the model generated heartbeats between two different subjects using several splitting ratios. These results indicate that novel templates can be created to produce subjects outside of the original dataset.

C. Lung Volume Classification

To evaluate the effectiveness of synthetic SCG heartbeats, we compared real and synthetic datasets for training a lung volume classifier. The classification model was trained on real data from 80% of the subjects and tested on the remaining 20% of unseen subjects. This achieved an accuracy of 89% with a 95% confidence interval of ±0.89%. When the same model was trained on synthetic data but tested on the same real test dataset, it demonstrated an accuracy of 88% with a 95% confidence interval of ±0.93%. This minor reduction in performance suggests that while the synthetic heartbeats slightly underperform, they retain most of the structural and functional fidelity. The synthetic dataset's ability to train a model with comparable accuracy highlights its potential utility in scenarios where real data might be scarce or impractical to gather, such as in preliminary research settings or in developing diagnostic tools for rare cardiac conditions. Additionally, the model was trained on real data and augmented with synthetic data. The entire training set was used with a varying amount of additional synthetic data added into the dataset. As shown in Fig. 5, augmenting the real dataset with 50% more synthetic data improved accuracy by 3% to 92%, with a 95% confidence interval of ±0.77%. This increase in performance shows how synthetic data can be used to supplement existing datasets, to improve the training of machine learning models.

Fig. 5.

Lung volume state classification accuracy with the entire training dataset augmented by increasing amounts of additional synthetic data, shown as a percentage of the size of the real dataset. Dashed line shows the baseline accuracy where no synthetic data is added.

SECTION IV.

Discussion

Our conditional WGAN-GP model outperformed DCGAN and WGAN by producing realistic and diverse samples while avoiding mode and condition collapse, which had limited previous models [25]. It demonstrated structural similarity between synthetic and real heartbeats. Validation on the lung volume classification task demonstrated the utility of the synthetic SCG heartbeats in a practical application. The results demonstrated that synthetic heartbeats could supplement existing datasets to improve performance or be used on their own to train new models. However, further validation is needed to confirm whether the synthetic beats represent other useful cardiac information across different scenarios. Additionally, while this work focused on SCG signals, the framework could be extended to other physiological signals, and future work should evaluate multi-signal generation for a more comprehensive cardiovascular assessment.

The feature control results demonstrated the ability to manipulate the latent space to produce controllable features. Overcoming the natural variability in SCG signals is a difficult problem when designing SCG-related algorithms [39]. This technique can generate heartbeats conforming to specific patterns for various applications, improving algorithm robustness and performance. Tuning features also aids machine learning models by providing insights into underlying mechanisms, enhancing the reliability and interpretability of results. Subject interpolation further allowed generation of subjects outside the original dataset, which could improve algorithm training and support the exploration of morphology differences.

This work was limited to linear combinations in the latent space, restricting the generation of complex features. Linear manipulations often fail to capture the nonlinear dynamics and intricate relationships between different physiological and pathological features inherent in cardiac function. Further work should incorporate non-linear combinations such as transfer functions, kernels, or deep learning to manipulate the latent space [35], as well as networks designed for intricate feature manipulations [40]. Incorporating additional parameters into the conditional architecture could also tailor the generation to specific conditions.

This work was confined to healthy, young subjects, limiting the generalizability and applicability. To enhance the robustness and representation of the general population, future work should incorporate a diverse cohort with a wider range of ages, demographics, and cardiac conditions, such as arrhythmias and heart failure. Another key limitation is that SCG was recorded in resting states, providing a controlled environment to capture clear signals of the heart. However, extending SCG to ambulatory or daily-life settings introduces challenges such as movement artifacts, and additional noise that can obscure cardiac signals, requiring advanced processing to isolate heartbeats [41], [42]. Future work should analyze SCG in more realistic scenarios. Finally, this work only examined producing single heartbeats at a time. Although there is variance in each heartbeat, a realistic system should be expanded to produce longer segments that incorporate temporal relationships to examine effects such as heart rate variability.

SECTION V.

Conclusion

Our approach addresses the challenges associated with the collection of SCG data for cardiovascular research and machine learning applications. By leveraging a conditional WGAN-GP architecture, the model successfully generated synthetic heartbeats that closely resemble real SCG signals. The similarity between generated heartbeats and actual SCG data was confirmed by quantitative structural analysis using r² and RMSE. Notably, the model captures subject-specific SCG features, providing a valuable tool for creating diverse datasets and augmenting existing ones. Furthermore, as determined from the lung volume classification test, models trained with synthetic SCG heartbeats achieve performance comparable to those trained with real data, which supports the use of synthetic data in training real-world models. This capability is particularly significant as it can aid in training and validating models without the challenges associated with large-scale data collection. Linear combinations in the latent space or conditional space enabled controlled generation of SCG features. Further refinement and validation of the model's capabilities in diverse clinical scenarios will be crucial for realizing its full potential in overcoming data limitations and improving cardiovascular healthcare and diagnosis.

Supplementary Materials

The supplementary materials include detailed descriptions of the data collection and preprocessing, background on the GAN process, and evaluation metric descriptions.

Ethics Statement

This research was approved by the McGill Review Ethics Board (Approval Number: 6-0619), approved August 12th, 2019. Prior written consent was obtained from all participants before involvement in the study.

Author Contributions

J.S., Y.D., and D.P. contributed to the study design. J.S. and Y.D. contributed to data collection, methodology, results, and editing the manuscript. J.S. contributed to implementation, evaluation, data analysis, and drafting the manuscript. D.P. contributed to project administration, supervision, and resources. All authors reviewed the manuscript.

Conflict of Interest

The authors declare no conflict of interest.
