Introduction
Seismocardiography (SCG) is an innovative cardiovascular monitoring method using wearable technology to measure chest wall vibrations caused by the mechanical activity of the heart. It captures key vibrations, especially those linked to valvular movements, and provides crucial physiological metrics, including heart rate [1], pre-ejection period [2], left-ventricular ejection time [3], and respiration [4]. These features can empower clinicians to assess cardiac function continuously and beyond the clinic [5]. Compared to technologies like electrocardiography (ECG), SCG benefits from its compact form and the ability to integrate multiple physiological metrics into a single, unobtrusive sensor, simplifying monitoring and offering unique insights into cardiac mechanics. With the growing use of machine learning in its processing, SCG can be used to analyze nuanced cardiac conditions and parameters such as heart failure [6], valvular disorders [7], stroke volume [8], and blood pressure [9]. However, effective model training requires large datasets, which are scarce due to SCG's emerging status [10]. Advancing SCG processing requires researchers to collect their own data, a task full of difficulties. With few commercially available options, researchers must often build custom solutions [11], [12], design clinically validated trials, and navigate logistical hurdles such as time, cost, and subject recruitment. These obstacles hinder the broader use of SCG in research and clinical settings, emphasizing the need for more accessible, comprehensive datasets to fully realize the technology's potential.
Synthetic data generation offers an alternative to address these challenges, filling gaps where real data is limited or unavailable. It can train algorithms, augment datasets to improve performance, or correct class imbalance, like underrepresented cardiac conditions or demographics [13], [14], enhancing model robustness [15], [16]. Moreover, synthetic data can mitigate privacy concerns by replicating patient data characteristics without personal information, aiding data sharing while protecting privacy [17], [18].
The Generative Adversarial Network (GAN), originally an image generation technique [19], is also effective for time-series generation [20]. In biomedical applications, GANs have generated signals such as ECG [16], [21], photoplethysmography [13], and encephalography [22]. However, SCG generation remains underexplored. Current efforts use cycle-GANs to generate SCG from ECG [23], or transformer networks to generate SCG from a predefined embedding signal [24]. Both treat SCG generation as a sequence-to-sequence translation problem, which requires a predefined signal to be input to the generator.
In this work, we propose a GAN to generate SCG heartbeats directly from the latent space, independent of input signals like ECG. This enhances creativity and flexibility, allowing unique outputs not limited by direct input mappings. It also improves generalization since abstract latent representations enable models to handle variations not explicitly seen during training and provides control over generated features. This work improves upon the GAN concept presented in [25]. Generated heartbeats are analyzed for structural similarity to real heartbeats, and we demonstrate how subject-specific features can be tuned. Finally, the model is applied to a real-world problem to validate the utility of these generated heartbeats.
Materials and Methods
A. Dataset
The research was carried out at McGill University and involved 62 participants (27 females, 35 males) with (mean ± standard deviation) age: 24.6 ± 4.5 years, height: 172 ± 10.4 cm, and weight: 70.2 ± 16.3 kg. Only healthy subjects with no known cardiovascular or respiratory ailments were considered. Data was collected in three scenarios to capture a broad spectrum of SCG patterns influenced by varying lung volumes, which are known to significantly modulate the SCG signal [26]. Subjects were recorded at rest, during breath-holding post-inhalation, and during breath-holding post-exhalation. Data collection and pre-processing details are described in the Supplementary Materials. The resulting dataset comprised 39764 normalized SCG heartbeats, providing a high-quality input for developing accurate and realistic synthetic heartbeats using generative models.
B. Generative Framework
To generate personalized, subject-specific SCG data, we employed a conditional GAN [27]. For details on the background of GANs, including the adversarial process between generator and discriminator and the subject-specific conditioning, refer to the Supplementary Materials. The model architecture was built using a similar structure to the deep convolutional GAN (DCGAN) framework [28], tailored to our data, and trained using the Wasserstein GAN with gradient penalty (WGAN-GP) [29]. The WGAN-GP approach is particularly effective for generating SCG heartbeats because it provides a more stable training process, reducing the risk of mode collapse, which can limit the variety of generated heartbeats. In this paradigm, rather than utilizing the binary cross entropy loss for training, a Wasserstein distance is adopted as the objective function [30]. A distinctive feature of this method is a gradient penalty that encourages the norm of the discriminator's gradient with respect to its input to stay close to one [29]. This gradient penalty improves upon the weight clipping technique introduced in the foundational Wasserstein GAN (WGAN) [30], a method that occasionally led to suboptimal outcomes and constrained the network's potential [29]. The model architecture can be seen in Fig. 1. The generator was supplied with two inputs: the random latent space and the unique subject identifier. An embedding layer processed the labels, producing an output dimension of 50. Subsequently, a dense layer containing 32 nodes was applied. Separately, the latent space, a random vector of 100 elements, was fed to a fully connected layer with 32 nodes and a leaky ReLU activation. Both inputs were then concatenated. The main structure of the generator consisted of four one-dimensional, transposed convolutional layers, successively containing 256, 128, 64, and 1 filters. Each layer had a kernel size of 4, stride of 2, and ‘same’ padding. Following each layer, batch normalization and a leaky ReLU activation were implemented, except for the final layer, which employed just a hyperbolic tangent activation.
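For reference, the discriminator (critic) objective of WGAN-GP, as it is commonly written following [29], is reproduced below. Here $D$ is the discriminator, $x$ a real heartbeat, $\tilde{x}$ a generated heartbeat, $\hat{x}$ a random interpolation between a real and a generated heartbeat, and $\lambda$ the penalty weight. The conditioning on the subject label is omitted for brevity, and the value of $\lambda$ used in this work is not stated in this section.

```latex
\begin{equation*}
L_D = \mathbb{E}_{\tilde{x}}\left[ D(\tilde{x}) \right] - \mathbb{E}_{x}\left[ D(x) \right] + \lambda \, \mathbb{E}_{\hat{x}}\left[ \left( \lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1 \right)^2 \right]
\end{equation*}
```

The first two terms estimate the Wasserstein distance between the real and generated distributions, and the last term is the gradient penalty that replaces the weight clipping of the original WGAN.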
Architecture of the proposed generative adversarial network. The generator (left) receives inputs from subject labels and a random latent space and outputs generated heartbeats. On the other hand, the discriminator (right) takes inputs from subject labels and either real or generated heartbeats and subsequently evaluates the authenticity of the heartbeat data.
For the discriminator model, two inputs were designated: the subject labels and the heartbeats, which could either be authentic or synthetic. The subject label embedding mirrored the procedure in the generator. Heartbeats were inputted directly with the processed subject identifiers. The discriminator was composed of three one-dimensional strided convolutional layers, sequentially equipped with 64, 128, and 256 filters. These layers maintained a kernel size of 4, a stride of 2, ‘same’ padding, and were each followed by a leaky ReLu activation. Following the convolutional layers was a flatten layer and a concluding single-node dense layer.
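To make the layer stacks described above easier to follow, the sketch below tracks how the signal length changes through the four transposed convolutions of the generator and the three strided convolutions of the discriminator. It assumes the Keras-style convention for ‘same’ padding (a stride-2 transposed convolution doubles the length, and a stride-2 convolution yields the ceiling of half the length). The seed length of 32 is a hypothetical value chosen for illustration, since the heartbeat length is not given in this section.

```python
import math

# Filter counts taken from the architecture description.
GENERATOR_FILTERS = [256, 128, 64, 1]
DISCRIMINATOR_FILTERS = [64, 128, 256]
STRIDE = 2


def transposed_conv_length(length: int, stride: int = STRIDE) -> int:
    """Output length of a 1-D transposed convolution with 'same' padding."""
    return length * stride


def strided_conv_length(length: int, stride: int = STRIDE) -> int:
    """Output length of a 1-D strided convolution with 'same' padding."""
    return math.ceil(length / stride)


def generator_shapes(seed_length: int) -> list:
    """(length, channels) after each transposed convolutional layer."""
    shapes, length = [], seed_length
    for filters in GENERATOR_FILTERS:
        length = transposed_conv_length(length)
        shapes.append((length, filters))
    return shapes


def discriminator_shapes(beat_length: int) -> list:
    """(length, channels) after each strided convolutional layer."""
    shapes, length = [], beat_length
    for filters in DISCRIMINATOR_FILTERS:
        length = strided_conv_length(length)
        shapes.append((length, filters))
    return shapes


# Hypothetical seed length (assumption, not taken from the paper).
SEED_LENGTH = 32
gen_shapes = generator_shapes(SEED_LENGTH)
disc_shapes = discriminator_shapes(gen_shapes[-1][0])

for name, shapes in (("generator", gen_shapes), ("discriminator", disc_shapes)):
    for layer_index, (length, channels) in enumerate(shapes, start=1):
        print(f"{name} layer {layer_index}: length={length}, channels={channels}")
```

Under these assumptions, the generator upsamples its seed by a factor of 16 overall, and the discriminator reduces the heartbeat length by a factor of 8 before the flatten layer.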
C. Training and Evaluation
The GAN structure was trained to minimize the Wasserstein distance at the discriminator output between the real and synthetic SCG heartbeats. The main goal is generating high-quality and realistic SCG heartbeats, though assessing the quality of generated samples is an ongoing challenge in generative models [31]. For SCG, there are currently no standard references for quality assessment. However, the pseudo-periodicity of the SCG waveform [12] enables performance evaluation. Despite variability between subjects, patterns in SCG morphology are relatively stable within individuals, making template matching a common method for tracking cardiac time intervals [32]. Therefore, we can compare the structural similarity of the generated and real samples with respect to their templates using Pearson's squared correlation coefficient (r2) and root-mean-squared error (RMSE), which together provide a comprehensive view of accuracy and consistency.
Details on the evaluation can be found in the Supplementary Materials. In summary, we evaluated four scenarios: First, metrics were calculated between real SCG samples and their real template to establish a baseline for natural SCG variability. Second, metrics between synthetic SCG samples and their synthetic template tested the similarity of synthetic samples, ideally matching the baseline. Third, metrics between synthetic samples and real templates evaluated how well synthetic samples mimic real ones. Finally, synthetic and real templates were compared for an average similarity per subject. Our proposed model was compared to DCGAN [25], [28] and WGAN [30] implementations, using the same convolutional structure.
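The four comparisons above can be sketched with a few lines of NumPy. This is a minimal illustration, assuming that a template is the ensemble average of a subject's heartbeats and that each metric is averaged over heartbeats; the exact procedure is described in the Supplementary Materials and may differ. The function names and the toy waveforms are hypothetical and are not the study's data.

```python
import numpy as np


def ensemble_template(beats: np.ndarray) -> np.ndarray:
    """Ensemble-average template of a (n_beats, n_samples) array."""
    return beats.mean(axis=0)


def r_squared(beats: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Squared Pearson correlation between each beat (row) and a reference."""
    centered_beats = beats - beats.mean(axis=1, keepdims=True)
    centered_ref = reference - reference.mean()
    r = (centered_beats @ centered_ref) / (
        np.linalg.norm(centered_beats, axis=1) * np.linalg.norm(centered_ref)
    )
    return r ** 2


def rmse(beats: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Root-mean-squared error between each beat (row) and a reference."""
    return np.sqrt(np.mean((beats - reference) ** 2, axis=1))


def similarity_report(real: np.ndarray, synthetic: np.ndarray) -> dict:
    """Mean (r2, RMSE) for the four evaluation scenarios of one subject."""
    real_template = ensemble_template(real)
    synthetic_template = ensemble_template(synthetic)
    scenarios = {
        "real_vs_real_template": (real, real_template),
        "synthetic_vs_synthetic_template": (synthetic, synthetic_template),
        "synthetic_vs_real_template": (synthetic, real_template),
        "synthetic_template_vs_real_template": (
            synthetic_template[np.newaxis, :],
            real_template,
        ),
    }
    return {
        name: (float(r_squared(b, t).mean()), float(rmse(b, t).mean()))
        for name, (b, t) in scenarios.items()
    }


# Toy waveforms standing in for one subject's heartbeats (not real SCG data).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 256)
clean_beat = np.sin(2 * np.pi * 5 * t) * np.exp(-4 * t)
real_beats = clean_beat + 0.3 * rng.normal(size=(40, t.size))
synthetic_beats = clean_beat + 0.3 * rng.normal(size=(40, t.size))

report = similarity_report(real_beats, synthetic_beats)
for scenario, (r2_value, rmse_value) in report.items():
    print(f"{scenario}: r2={r2_value:.2f}, RMSE={rmse_value:.3f}")
```

In this toy setting, the template-to-template comparison scores higher than the sample-to-template comparisons because averaging removes most of the beat-to-beat variability, which is why the sample-level baseline is needed to judge whether synthetic variability is realistic.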
D. Feature Control
After training, we investigated methods to control features in the generated signals. Tuning waveform features can enhance realism, provide diverse samples, or offer customizations for specific research needs. Feature control was demonstrated by manipulating the generator's input via the latent and conditional spaces.
1) Latent Space Manipulations
The generator constructs SCG heartbeats from latent space inputs, which act as seeds for the generated outputs. By interpreting and decoding the latent space into features, we can manipulate it to create tunable heartbeats. While many works have explored the latent space [33], [34], it has been shown that even linear combinations can produce meaningful tuning of parameters [35]. First, the trained model generated 1000 heartbeats for each subject. We then extracted the following features from the heartbeats: maximum and minimum amplitude, amplitudes of the first vibrational pulse (V1) and the second vibrational pulse (V2) [36], average frequency, energy, and timing. Then, as shown in (1), a linear regression model for each subject was used to map the latent vectors, $Z$, to each feature, $f$, through a vector of regression coefficients, $w$,
\begin{equation*}
f = w \cdot Z \tag{1}
\end{equation*}
Then, as shown in (2), using the regression coefficients, the latent vector of a given heartbeat, $Z_0$, was shifted along $w$ by a step size, $\epsilon$, to produce a tuned latent vector, $\hat{Z}$,
\begin{equation*}
\hat{Z} = Z_0 + \epsilon \cdot w \tag{2}
\end{equation*}
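The two steps in (1) and (2) can be illustrated with a small NumPy sketch. Because the trained generator is not available here, a hypothetical linear stand-in (`extract_feature`, driven by a hidden direction in latent space) replaces the chain of generating a heartbeat and measuring a feature; the step sizes are also assumptions. The sketch only shows the mechanics of fitting $w$ by least squares and shifting a latent vector along it.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM = 100   # latent vector size stated in the text
N_BEATS = 1000     # heartbeats generated per subject, as stated in the text

# Hypothetical stand-in for "generate a heartbeat, then measure one feature".
hidden_direction = rng.normal(size=LATENT_DIM)


def extract_feature(latents: np.ndarray) -> np.ndarray:
    """Toy feature: a noisy linear function of each latent vector (row)."""
    noise = 0.1 * rng.normal(size=latents.shape[0])
    return latents @ hidden_direction + noise


# Step 1, Eq. (1): fit regression coefficients w so that f is approximately Z w.
Z = rng.normal(size=(N_BEATS, LATENT_DIM))
f = extract_feature(Z)
w, *_ = np.linalg.lstsq(Z, f, rcond=None)


# Step 2, Eq. (2): shift a starting latent vector along w by a step size.
def tune_latent(z0: np.ndarray, coefficients: np.ndarray, epsilon: float) -> np.ndarray:
    """Return z0 + epsilon * w."""
    return z0 + epsilon * coefficients


z0 = rng.normal(size=LATENT_DIM)
steps = np.linspace(-2.0, 2.0, 5)  # hypothetical step sizes
tuned_latents = np.stack([tune_latent(z0, w, eps) for eps in steps])

# Noise-free feature value of each tuned latent vector in the toy model.
tuned_features = tuned_latents @ hidden_direction
print(np.round(tuned_features, 2))
```

In the toy model the feature value rises monotonically with the step size, which is the behaviour one would hope to see when the tuned latent vectors are passed through the real generator.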
2) Conditional Vector Manipulations
The conditional label enabled the generator to produce subject-specific SCG features. Although this was useful for generating realistic heartbeats from subjects within the dataset, it restricted the generator's ability to produce heartbeats from subjects beyond the original dataset. To overcome this limitation, we applied linear combinations to the conditional vectors to create novel SCG patterns. The embedding layer of the model translated an integer-based subject identifier into a distinct 50-dimensional vector. By manipulating this vector, new, unique combinations can be synthesized [37]. For instance, by selecting vectors corresponding to two distinct subjects, we can interpolate between them, generating a composite subject that exhibits characteristics derived from both. We therefore employed a linear combination, as shown in (3), to define a new subject vector, denoted as $\hat{Y}$,
\begin{equation*}
\hat{Y} = \alpha_s \cdot Y_1 + \left( 1 - \alpha_s \right) \cdot Y_2 \tag{3}
\end{equation*}
This is achieved by applying a specific ratio, $\alpha_s$, to the embedding vectors of the two subjects, $Y_1$ and $Y_2$.
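As a minimal sketch of the interpolation in (3), the snippet below blends two 50-dimensional subject vectors at several ratios. The embedding table is filled with random numbers purely for illustration (the learned embeddings are not available here), and the subject indices and ratios are hypothetical.

```python
import numpy as np

EMBEDDING_DIM = 50   # embedding size stated in the text
N_SUBJECTS = 62      # number of participants stated in the text

# Random stand-in for the learned embedding table (assumption).
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(N_SUBJECTS, EMBEDDING_DIM))


def interpolate_subjects(y1: np.ndarray, y2: np.ndarray, alpha_s: float) -> np.ndarray:
    """Eq. (3): alpha_s * Y1 + (1 - alpha_s) * Y2."""
    if not 0.0 <= alpha_s <= 1.0:
        raise ValueError("alpha_s must lie in [0, 1] for interpolation")
    return alpha_s * y1 + (1.0 - alpha_s) * y2


# Two hypothetical subjects and a set of hypothetical splitting ratios.
y1, y2 = embedding_table[3], embedding_table[17]
ratios = [0.0, 0.25, 0.5, 0.75, 1.0]
blended = np.stack([interpolate_subjects(y1, y2, a) for a in ratios])
print(blended.shape)
```

Each row of `blended` would replace the output of the embedding layer when conditioning the generator, so that a ratio of 0 reproduces the second subject, a ratio of 1 reproduces the first, and intermediate ratios define a composite subject.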
E. Application of Synthetic Heartbeats for Lung Volume Classification
After exploring tuning waveform features, we assessed the practical utility of synthetic SCG signals. Although the generator produces synthetic SCG heartbeats that structurally and statistically resemble the real heartbeats, their utility in practical applications remains to be demonstrated. To test if synthetic heartbeats can replace authentic signals, we utilized them in a real-world scenario: classifying lung volume states with a convolutional neural network. We adapted a previously demonstrated model [38] to classify high lung volume and low lung volume states during breath-holding. We built and trained a convolutional neural network with the same configuration as described in the paper. The subjects were split into a training set and a test set using an 80/20 train-test split. Using just the training subjects, we trained separate GANs on high and low lung volume heartbeats, to generate targeted synthetic data. The classification network was trained separately on authentic and synthetic datasets, using equal data amounts. Both training instances were evaluated on the same real test data to determine if synthetic data can replace real data in training. Finally, we augmented the complete real training dataset with a varying amount of synthetic data to test if it could improve performance. We evaluated the classification performance on the same held out set of real, unseen subjects to gauge the effectiveness of synthetic data in this application.
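The key design choice in this experiment is that the 80/20 split is made over subjects rather than over heartbeats, so no subject contributes data to both training and testing. A minimal sketch of such a split is shown below; the rounding rule, the random seed, and the helper name `split_subjects` are assumptions, since the exact split procedure is not described here.

```python
import random


def split_subjects(subject_ids, test_fraction=0.2, seed=0):
    """Shuffle subject identifiers and return (train_ids, test_ids)."""
    ids = sorted(subject_ids)
    random.Random(seed).shuffle(ids)          # reproducible shuffle
    n_test = round(len(ids) * test_fraction)  # rounding rule is an assumption
    return ids[n_test:], ids[:n_test]


def select_beats(beat_subject_ids, allowed_subjects):
    """Indices of heartbeats whose subject belongs to the allowed set."""
    allowed = set(allowed_subjects)
    return [i for i, subject in enumerate(beat_subject_ids) if subject in allowed]


# 62 subjects, as in the dataset description.
train_subjects, test_subjects = split_subjects(range(62))

# Hypothetical per-heartbeat subject labels: three beats per subject.
beat_subject_ids = [subject for subject in range(62) for _ in range(3)]
train_indices = select_beats(beat_subject_ids, train_subjects)
test_indices = select_beats(beat_subject_ids, test_subjects)
print(len(train_subjects), len(test_subjects))
```

Under this scheme the GANs and the classifier only ever see heartbeats indexed by `train_indices`, while `test_indices` is reserved for the final evaluation on unseen subjects.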
Results
A. Generated Heartbeats
SCG heartbeats were generated by the conditional WGAN-GP model explained in Section II-C. Fig. 2 compares authentic heartbeats (top row) and synthetic heartbeats (middle row) for three randomly selected subjects, with the displayed heartbeats also chosen at random. The model produced synthetic heartbeats that visually resemble real heartbeats while maintaining their variability [12]. Generally, the first vibrational pulse (V1) appears clearer than the second (V2) [36] in both real and synthetic heartbeats due to the variability of the systolic period. The third row of Fig. 2 shows average waveforms from 50 randomly sampled real and synthetic heartbeats from the same subjects, demonstrating that the generated templates align with the real ones and reflect the intra-subject stability.
(a)–(c) Example of 10 randomly selected real heartbeats from three subjects, and (d)–(f) 10 randomly generated synthetic heartbeats from the same three subjects. (g)–(i) Ensemble averaged templates of each subject shown for real samples (black) and synthetic samples (red).
The performance and structural similarity of the generated results for three model architectures (DCGAN, WGAN, and WGAN-GP) were quantified with r2 and RMSE. Fig. 3 shows a comparison of the three models. As a baseline, real samples compared to their respective templates produced an r2 of 0.32 and an RMSE of 0.084 m/s2. The generated heartbeats were compared to the templates from both the real samples and the synthetic samples. Ideally, if the model were capturing realistic SCG heartbeats with the same level of variability, both metrics should be on par with the baseline. For the comparison of synthetic samples to their real templates, all models had an r2 much lower than the baseline. However, for the RMSE, the DCGAN and WGAN-GP approaches were roughly equal to the baseline scenario, and the WGAN model had a much lower RMSE. Next, comparing synthetic samples to their respective synthetic templates, DCGAN and WGAN had a much higher r2, whereas WGAN-GP was on par with the baseline. For RMSE, DCGAN and WGAN were much lower than the baseline, whereas WGAN-GP was again on par. This demonstrates that while DCGAN and WGAN produced realistic-looking results, they lacked the natural variability observed in SCG. Additionally, when examined visually, we observed that DCGAN suffered from mode collapse within each subject. Conversely, WGAN failed to learn the class conditions. Only WGAN-GP produced both realistic and diverse results, with errors closest to the baseline. The final evaluation metric compared the synthetic templates to their respective real templates, where the ideal case would maximize the r2 and minimize the RMSE. Similar to the previous metrics, the WGAN-GP method had the best results, with an r2 of 0.60 and an RMSE of 0.035 m/s2.
Structural similarity metrics of (a) Pearson's squared correlation coefficient (r2), and (b) root-mean-squared error (RMSE). DCGAN (blue), WGAN (purple), and WGAN-GP (red) were evaluated in three scenarios: comparing synthetic samples to their real templates, synthetic samples to their synthetic templates, and synthetic templates to their real templates. The baseline metric, which compares the real samples to their real templates, is shown by the black dashed line.
B. Feature Tuning
1) Latent Space
Tunable features were controlled with a linear regression from the latent space to extract feature values. We evaluated each subject using 50 generated heartbeats. The average correlation coefficient between 10 discrete linear steps,
Tunable features from manipulating the latent space of a single heartbeat for (a) max amplitude of the first vibrational pulse (V1) and (b) max amplitude of the second vibrational pulse (V2), for five discrete linear steps,
2) Conditional Vector
Novel subject combinations were created by interpolating between two existing conditional vectors. This resulted in heartbeats that exhibited features partially from each subject. Fig. 4(c) shows average waveforms of generated heartbeats after interpolation, where the model generated heartbeats between two different subjects using several splitting ratios. These results indicate that novel templates can be created to produce subjects outside of the original dataset.
C. Lung Volume Classification
To evaluate the effectiveness of synthetic SCG heartbeats, we compared real and synthetic datasets for training a lung volume classifier. The classification model was trained on real data from 80% of the subjects and tested on the remaining 20% of unseen subjects. This achieved an accuracy of 89% with a 95% confidence interval of ±0.89%. When the same model was trained on synthetic data but tested on the same real test dataset, it demonstrated an accuracy of 88% with a 95% confidence interval of ±0.93%. This minor reduction in performance suggests that while the synthetic heartbeats slightly underperform, they retain most of the structural and functional fidelity. The synthetic dataset's ability to train a model with comparable accuracy highlights its potential utility in scenarios where real data might be scarce or impractical to gather, such as in preliminary research settings or in developing diagnostic tools for rare cardiac conditions. Additionally, the model was trained on real data augmented with synthetic data. The entire real training set was used, with a varying amount of additional synthetic data added to it. As shown in Fig. 5, augmenting the real dataset with 50% more synthetic data improved accuracy by 3 percentage points to 92%, with a 95% confidence interval of ±0.77%. This increase in performance shows how synthetic data can be used to supplement existing datasets to improve the training of machine learning models.
Lung volume state classification accuracy with the entire training dataset augmented by increasing amounts of additional synthetic data, shown as a percentage of the size of the real dataset. Dashed line shows the baseline accuracy where no synthetic data is added.
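The accuracies above are reported with 95% confidence intervals, but the method used to compute them is not stated in this section. One common choice is the normal approximation to the binomial proportion, sketched below. Both the use of this method and the test-set size `N_TEST` are assumptions made for illustration; the actual number of test heartbeats is not reported here.

```python
import math


def accuracy_confidence_interval(accuracy: float, n_samples: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation confidence interval for an accuracy."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return z * math.sqrt(accuracy * (1.0 - accuracy) / n_samples)


# Hypothetical number of test heartbeats (assumption, not reported in the text).
N_TEST = 4750

for label, accuracy in (("real", 0.89), ("synthetic", 0.88), ("augmented", 0.92)):
    half_width = accuracy_confidence_interval(accuracy, N_TEST)
    print(f"{label}: {accuracy:.0%} +/- {100 * half_width:.2f}%")
```

With a test set of a few thousand heartbeats, this approximation gives half-widths just under one percentage point, the same order as the intervals reported above. The interval narrows as accuracy moves away from 50% and as the test set grows.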
Discussion
Our conditional WGAN-GP model outperformed DCGAN and WGAN by producing realistic and diverse samples while avoiding mode and condition collapse, which had limited previous models [25]. It demonstrated structural similarity between synthetic and real heartbeats. Validation on the lung volume classification task demonstrated the utility of synthetic SCG heartbeats in a practical application. The results showed that synthetic heartbeats could supplement existing datasets to improve performance or be used on their own to train new models. However, further validation is needed to confirm whether the synthetic beats represent other useful cardiac information across different scenarios. Additionally, while this work focused on SCG signals, the framework could be extended to other physiological signals, and future work should evaluate multi-signal generation for a more comprehensive cardiovascular assessment.
The feature control results demonstrated the ability to manipulate the latent space to produce controllable features. Overcoming the natural variability in SCG signals is a difficult problem when designing SCG-related algorithms [39]. This technique can generate heartbeats conforming to specific patterns for various applications, improving algorithm robustness and performance. Tuning features also aids machine learning models by providing insights into underlying mechanisms, enhancing the reliability and interpretability of results. Subject interpolation further allowed generation of subjects outside the original dataset, which could improve algorithm training and help explore morphology differences.
This work was limited to linear combinations in the latent space, restricting the generation of complex features. Linear manipulations often fail to capture the nonlinear dynamics and intricate relationships between different physiological and pathological features inherent in cardiac function. Further work should incorporate non-linear combinations such as transfer functions, kernels, or deep learning to manipulate the latent space [35], as well as networks designed for intricate feature manipulations [40]. Incorporating additional parameters into the conditional architecture could also tailor the generation to specific conditions.
This work was confined to healthy, young subjects, limiting the generalizability and applicability. To enhance the robustness and representation of the general population, future work should incorporate a diverse cohort with a wider range of ages, demographics, and cardiac conditions, such as arrhythmias and heart failure. Another key limitation is that SCG was recorded in resting states, providing a controlled environment to capture clear signals of the heart. However, extending SCG to ambulatory or daily-life settings introduces challenges such as movement artifacts, and additional noise that can obscure cardiac signals, requiring advanced processing to isolate heartbeats [41], [42]. Future work should analyze SCG in more realistic scenarios. Finally, this work only examined producing single heartbeats at a time. Although there is variance in each heartbeat, a realistic system should be expanded to produce longer segments that incorporate temporal relationships to examine effects such as heart rate variability.
Conclusion
Our approach addresses the challenges associated with the collection of SCG data for cardiovascular research and machine learning applications. By leveraging a conditional WGAN-GP architecture, the model successfully generated synthetic heartbeats that closely resemble real SCG signals. The similarity between generated heartbeats and actual SCG data was confirmed by quantitative structural analysis through r2 and RMSE. Notably, the model captures subject-specific SCG features, providing a valuable tool for creating diverse datasets and augmenting existing ones. Furthermore, as determined from the lung volume classification test, models trained with synthetic SCG heartbeats achieve performance comparable to those trained with real data, which validates the use of synthetic data in training real-world models. This capability is particularly significant as it can aid in training and validating models without the challenges associated with large-scale data collection. Linear combinations in the latent space or conditional space enabled controlled generation of SCG features. Further refinement and validation of the model's capabilities in diverse clinical scenarios will be crucial for realizing its full potential in overcoming data limitations and improving cardiovascular healthcare and diagnosis.
Supplementary Materials
The supplementary materials include detailed descriptions of the data collection and preprocessing, background on the GAN process, and evaluation metric descriptions.
Ethics Statement
This research was approved by the McGill Review Ethics Board (Approval Number: 6-0619), approved August 12th, 2019. Prior written consent was obtained from all participants before involvement in the study.
Author Contributions
J.S., Y.D., and D.P. contributed to the study design. J.S. and Y.D. contributed to data collection, methodology, results, and editing the manuscript. J.S. contributed to implementation, evaluation, data analysis, and drafting the manuscript. D.P. contributed to project administration, supervision, and resources. All authors reviewed the manuscript.
Conflict of Interest
The authors declare no conflicts of interest.