Self-Supervised Learning-Based General Laboratory Progress Pretrained Model for Cardiovascular Event Detection

Objective: Leveraging patient data through machine learning techniques in disease care offers a multitude of substantial benefits. Nonetheless, the inherent nature of patient data poses several challenges. Prevalent cases amass substantial longitudinal data owing to their patient volume and consistent follow-ups; however, longitudinal laboratory data are renowned for their irregularity, temporality, absenteeism, and sparsity. In contrast, recruitment for rare or specific cases is often constrained by their limited patient size and episodic observations. This study employed self-supervised learning (SSL) to pretrain a generalized laboratory progress (GLP) model that captures the overall progression of six common laboratory markers in prevalent cardiovascular cases, with the intention of transferring this knowledge to aid in the detection of a specific cardiovascular event. Methods and procedures: GLP implemented a two-stage training approach, leveraging the information embedded within interpolated data to amplify the performance of SSL. After pretraining, GLP was transferred for target vessel revascularization (TVR) detection. Results: The proposed two-stage training improved the performance of pure SSL, and the transferability of GLP exhibited distinctiveness. After GLP processing, the classification exhibited a notable enhancement, with averaged accuracy rising from 0.63 to 0.90. All evaluated metrics demonstrated substantial superiority ($p < 0.01$) compared to performance prior to GLP processing. Conclusion: Our study effectively engages in translational engineering by transferring patient progression of cardiovascular laboratory parameters from one patient group to another, transcending the limitations of data availability. The transferability of disease progression optimizes examination and treatment strategies and improves patient prognosis while using commonly available laboratory parameters. The potential for expanding this approach to encompass other diseases holds great promise. Clinical impact: Our study effectively transposes patient progression from one cohort to another, surpassing the constraints of episodic observation. The transferability of disease progression contributed to cardiovascular event assessment.


I. INTRODUCTION
REGULAR surveillance stands as an imperative facet within the management of cardiovascular disorders [1]. Laboratory analysis constitutes a vital component, involving multifarious chemical tests that scrutinize blood, urine, or body tissue specimens. These tests gauge the body's response to food intake, medication, and treatment, thus providing crucial insights into disease progression and signaling the need for medication or dietary modifications. For chronic diseases, laboratory results are more meaningful when observed longitudinally rather than episodically. A vast repository of longitudinal data has been accumulated for prevalent diseases such as hypertension (HTN) and diabetes mellitus (DM). Nonetheless, the inherent nature of these data, characterized by irregularity, absenteeism, and sparsity, presents challenges in leveraging their full potential for machine learning applications.
Conversely, rare or specific cases are frequently associated with a limited patient population and episodic observations, also impeding the integration of machine learning technology in their progression assessment. This study employed self-supervised learning (SSL) to pretrain a generalized laboratory progress (GLP) model. GLP captures the overall progression of common laboratory markers in prevalent cardiovascular cases, with the intention of transferring this knowledge to aid in the detection of target vessel revascularization (TVR) occurrences in patients undergoing percutaneous coronary intervention (PCI).

Challenging nature of laboratory data
Diverse methodological approaches can be adopted depending on the distinct characteristics of the population under study. Cross-sectional studies, which capture the patient status at a single time point, provide only a temporary snapshot and preliminary glimpses into future disease progression. In contrast, cohort studies, with their longitudinal observations, offer a more comprehensive understanding of disease development [2]. However, the collection of such data over an extended period can be intricate, time-consuming, and costly, frequently suffering from patient dropouts and incomplete data [3,4].
Consequently, rare or specific diseases frequently resort to cross-sectional studies due to limited patient size, yielding episodic observations. In the context of prevalent cases, longitudinal data are more readily accessible owing to continuous follow-ups. Nonetheless, these observations heavily rely on patient adherence, insurance regulations, clinical guidelines, and the clinical judgment of physicians. Any disruption to these factors can lead to irregularity and sparsity, where observations may be skipped or sampled irregularly over a prolonged period [5,6]. Similar to electronic health records (EHRs), laboratory test records encompass a wealth of abundant and longitudinal patient information, yet remain notable for their irregularity, temporality, and sparsity, often accompanied by noisy outliers and missing values [3,4,6].

Related works
Machine learning techniques have undergone extensive investigation, facilitating diverse applications that contribute to clinical care through the utilization of EHRs [7,8,9]. SSL has recently gained attention due to its ability to derive labels for training data directly from the data itself [10,11,12,13,14]. This offers a unique opportunity to leverage the vast amounts of data available without relying on quality annotations [15,16,13]. Generally, SSL can be categorized into generative and contrastive learning approaches [17,18]. Generative models possess the capability to generate new samples from the underlying distribution or recover the original data distribution [13,19]. For instance, Simone et al. [19] showcased the utilization of a generative adversarial network (GAN) for the synthesis of electrocardiography (ECG) data. Their study achieved remarkable results by generating a wide spectrum of ECG patterns that preserved synchronization and abnormalities. Yoon et al. [14] leveraged SSL strategies to impute corrupted values and train on unlabeled data in the domain of genomics and clinical data. Furthermore, Lee et al. [20] employed GPT-4 to summarize physician-patient conversations and generate clinical notes.
Meanwhile, contrastive learning aims to capture the relationship between input data and prediction targets, thereby generating a global contextual representation that is shared among samples [13,16]. For instance, Kiyasseh et al. [16] devised a temporal and spatial discriminative approach for ECG analysis, extracting patient-specific representations by leveraging contrastive loss. Zhang et al. [21] employed contrastive learning to capture the distances between temporal and frequency components, and applied the pretrained representations to various time-series databases, including ECG, human activity recognition, and physical status monitoring. Wickstrøm et al. [22] proposed a contrastive framework based on mix-up augmentation for uni- and multivariate time-series data, which was then transferred to ECG classification. Furthermore, Ouyang et al. [23] trained an encoder on unlabeled retinal images through contrastive learning, and subsequently fine-tuned the encoder for the classification of reference and non-reference cases.
Many SSL applications focus on learning from a pretext task and transferring the learned representations to a different domain. This entails transfer learning, a concept in which a classifier is improved in one domain (the pretext task) using more readily obtainable data, and the acquired knowledge is then applied to another domain (the downstream task) [24]. For instance, Tang et al. [25] utilized a teacher-student self-training model to capture information from a large-scale unlabeled dataset of wearable and mobile sensing data, which was subsequently transferred to seven different datasets with varying sensor types, populations, and protocols. Similarly, Spathis et al. [26] used activity accelerometer sensor data as input to forecast heart rate and transferred the learned representations to capture physiologically meaningful and personalized information using linear classifiers.
Nonetheless, the aforementioned works assume that signals and information were collected on a regular basis and do not address the issues of irregularity, absenteeism, and sparsity commonly encountered in EHRs. Tipirneni et al. [27] bypassed the irregular, absent, and sparse nature of EHRs by training the model in an SSL manner and masking out unobserved forecasts in the loss function during training. Furthermore, SSL applications have been recognized as challenging in terms of finding effective pretext tasks [17]. Most studies have relied on an empirical trial-and-error approach to identify the most suitable pretext tasks. The transferability of pretrained models has shown mixed results when applied to more specific domains such as medicine [28]. For instance, Liu et al. [29] were unable to successfully transfer ImageNet for detecting lymph node metastasis in pathology images, citing significant domain differences between natural scenes and pathology images as the reason for the transfer failure. Another example is the discussion surrounding language representation models for the biomedical domain [30]. Gu et al. [31] argued that models trained on domain-specific vocabulary outperform those trained on general corpora due to differences in word distributions between general and biomedical corpora.

Cardiovascular diseases and associated risk factors
Cardiovascular disease stands as one of the leading causes of global mortality. Established traditional risk factors incorporate HTN, DM, and smoking [32]. These risk factors can contribute to endothelial injury, plaque formation, and coronary thrombus formation [33], consequently driving the progression of cardiovascular disease. PCI has emerged as a widely employed treatment modality for cardiovascular disease [34].
Although the implementation of drug-eluting stents (DES) has significantly reduced the incidence of TVR in recent years [35,36], the occurrence of TVR after DES implantation, with an incidence ranging from 3% to 20%, remains a prevalent clinical concern [37,38,39]. Hence, the prevention of TVR and the reduction of re-admission rates continue to be significant clinical challenges in the field of cardiovascular medicine following PCI. With respect to the timing of cardiac events, patients undergoing PCI face a risk of subsequent adverse events, including TVR [40,41]. However, achieving a consensus on accurate preprocedural risk stratification and prognosis assessment to identify high-risk patients prior to PCI remains an ongoing pursuit.
TVR is associated with complex pathophysiological mechanisms involving lipid metabolic disorders [42] and inflammatory processes [43]. Several previous studies have explored potential predictive factors linked to a high incidence of TVR based on patient- and procedure-related variables [44,45,46]. However, dedicated applications for TVR prediction are yet to be developed. Most studies have focused on identifying comprehensive predictors for TVR or developing prediction models without specifically targeting individualized risks [45,47,48]. The collection of data for comprehensive predictors may introduce burdens and complexities when integrating such applications into routine clinical practice and thus warrants careful consideration.

Laboratory markers of cardiovascular diseases
Earlier studies have indicated that preprocedural parameters are associated with cardiovascular disease. Total cholesterol levels (Chol) and low-density lipoprotein cholesterol (LDL-c) are strongly linked to cardiometabolic diseases and widely accepted in diagnostic practices. Conversely, the plasma level of high-density lipoprotein cholesterol (HDL-c) exhibits an inverse relationship with the risk of cardiovascular diseases [49]. Clinical studies have highlighted the connection between circulating white blood cells (WBCs) and cardiovascular outcomes, demonstrating that an elevated WBC count increases the short- and long-term risk in patients with acute coronary syndromes (ACSs) [50]. Hage et al. [51] reported that baseline fasting blood glucose (glucose AC) predicts restenosis, suggesting that focusing on glucose reduction rather than solely normalizing glucose levels is more beneficial [51]. Additionally, serum uric acid (UA) has been identified as a prognostic cardiovascular biomarker, predicting total and cardiovascular mortality in the context of secondary prevention of coronary artery disease, as demonstrated by the Verona Heart Study [52]. Furthermore, the National Cholesterol Education Program III (NCEP III) recommends the use of Chol or LDL-c in conjunction with HDL-c (Chol/HDL-c, LDL-c/HDL-c) as markers for screening and treating patients with cardiovascular disease, along with the utilization of the 10-year risk Framingham scoring assessment [53].

II. METHODOLOGY
To address the challenges previously mentioned, we have devised the following propositions for our work: (1) To address the challenges stemming from irregularity, absenteeism, and sparsity within longitudinal observations, we deployed interpolation and SSL techniques to infer absent data. (2) For patients with limited numbers and episodic observations, we developed a pretrained model specifically tailored to capture the temporal latent representation of prevalent cases and transfer disease progress knowledge to these smaller cohorts. (3) Our work is based on commonly available resources, avoiding the need to initiate new trials for extensive patient data collection. We focused on six laboratory parameters: the Chol/HDL-c ratio, LDL-c, the LDL-c/HDL-c ratio, glucose AC, WBC, and UA. (4) Our work leverages the intercorrelation among cardiometabolic diseases as an indication for selecting pretext and downstream tasks.
Our objective is to construct a two-stage pretraining model that captures the laboratory progress of general cases and utilizes this information to predict cardiac events in another patient group. Prior research has indicated that a two-step training approach, involving pretraining the model on a domain-general dataset followed by training on domain-specific datasets, yields enhancements in transitioning representations to the downstream task [12,30,23]. Hence, we propose a two-stage training process: Stage 1 involves learning general laboratory progress information based on interpolated data, followed by Stage 2, where SSL is employed to refine the model's progression representation using non-interpolated data. Subsequently, the GLP model is fine-tuned to classify the occurrence of TVR.
The following sections outline the detailed methodology of GLP. First, we introduce the interpolation and framing methods for longitudinal data. Next, we describe the design of the GLP model, along with the training algorithm and the design of the downstream classifier. We also elucidate the validation methods employed for both the pretext and downstream tasks. Lastly, we provide details regarding the patient recruitment process and the datasets utilized for both tasks.

Interpolation methods
Interpolation serves the purpose of inferring values that lie between two known observations. It aims to approximate the values of $f(x)$ that fulfill the interpolation conditions $f(x_j) = y_j$ for $j = 0, 1, \ldots, n$. This study encompassed the evaluation of three interpolation techniques: linear interpolation, piecewise cubic Hermite interpolating polynomial (PCHIP), and barycentric interpolation. The values in linear interpolation were derived by considering the gradients of the known observations, denoted as
$$\hat{y}_j = y_i + (t_j - t_i)\,\frac{y_k - y_i}{t_k - t_i},$$
where $t$ signifies the time of estimation, $\hat{y}_j$ denotes the estimated value at $t_j$, and $i < j < k$. Meanwhile, PCHIP interpolation [54] defines $d_j = (y_{j+1} - y_j)/(x_{j+1} - x_j)$ as the slopes at $x_j$. If the signs of $d_j$ and $d_{j-1}$ differ or either of them equals zero, the derivative at $\hat{y}_j$ is set to 0. Otherwise, it is determined using the weighted harmonic mean, expressed as
$$\frac{w_1 + w_2}{\hat{y}'_j} = \frac{w_1}{d_{j-1}} + \frac{w_2}{d_j},$$
where $w_1 = 2h_j + h_{j-1}$, $w_2 = h_j + 2h_{j-1}$, and $h_j = x_{j+1} - x_j$. Finally, for barycentric interpolation [55,56], a given set of nodes $x_0, x_1, \ldots, x_n$ and masses $w_0, w_1, \ldots, w_n$ are utilized to determine the weight functions $w_0(x), w_1(x), \ldots, w_n(x)$ satisfying
$$x = \sum_{i=0}^{n} w_i(x)\, x_i, \qquad \sum_{i=0}^{n} w_i(x) = 1.$$
Here, $x$ represents the barycenter of the nodes, which can be employed for interpolation using
$$\hat{y}(x) = \sum_{i=0}^{n} w_i(x)\, y_i,$$
where the weights reproduce $b_i$ exactly when $b_i$ corresponds to a linear function. Fig. 1 illustrates a segmented period of glucose AC, employing the different interpolation methods.
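For concreteness, the following minimal Python sketch contrasts the three interpolation methods on a toy, irregularly sampled series. It relies on standard NumPy/SciPy routines rather than the authors' implementation, and the variable names (months, values, query) are illustrative.

```python
import numpy as np
from scipy.interpolate import PchipInterpolator, BarycentricInterpolator

months = np.array([0.0, 3.0, 7.0, 12.0, 18.0])          # irregular sampling times
values = np.array([102.0, 111.0, 98.0, 120.0, 109.0])   # e.g., glucose AC

query = np.arange(0.0, 19.0)                            # monthly grid to fill in

linear = np.interp(query, months, values)               # piecewise linear
pchip = PchipInterpolator(months, values)(query)        # shape-preserving cubic
bary = BarycentricInterpolator(months, values)(query)   # polynomial; may oscillate

for name, est in (("linear", linear), ("PCHIP", pchip), ("barycentric", bary)):
    print(name, "month 5:", round(float(est[5]), 1))
```

The oscillatory tendency of the polynomial barycentric interpolant on irregular nodes is one plausible reason for its weaker downstream results reported later in this paper.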

Longitudinal data framing
By defining the laboratory observations for each patient during the study period as $y_{t_0}, y_{t_1}, \ldots, y_{t_n}$, where $t_i$ represents the months in the timeline and $t_i \in \mathbb{R}$, we designate $y_{t_0}$ and $y_{t_n}$ as the actual observed values. Let $y_{t_m}$ denote the second-to-last observed value of a patient. Interpolation takes place between $y_{t_0}$ and $y_{t_m}$ when $y_{t_i}$ is missing. The observations were organized into frames with a designated time interval $r$. Therefore, the longitudinal data are framed as $y_{t_i} : y_{t_{i+r}}$, with subsequent frames incrementing by one step ($i+1$) while $i + r \le m - 1$. If $\|t_i - t_{m-1}\| \le r$, the frame is omitted. These segmented frames, denoted as the interpolated data, serve as the input of Stage 1. Their corresponding prediction target is $y_{t_{i+r+1}}$, where $i + r + 1 \le m$.
From the previous stage, the frame $y_{t_{m-r}} : y_{t_m}$ and the last observed value $y_{t_n}$ are isolated. They constitute the non-interpolated data in Stage 2, with $y_{t_{m-r}} : y_{t_m}$ serving as the input and $y_{t_n}$ as the prediction target. Considering that coronary stent trials primarily focus on target vessel/lesion-related clinical outcomes within the shorter term, particularly the first 12 months post-PCI [57], we set $r$ to 12 months. The framing process is visually depicted in Figure 2.
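The Stage 1 framing logic can be summarized in a short sketch. The exact index convention (whether a frame spans $r$ or $r+1$ points) is our assumption, and the function and variable names are illustrative.

```python
import numpy as np

def make_frames(series: np.ndarray, r: int = 12):
    """Slide a width-r window over the interpolated monthly series.

    Frame i covers series[i : i + r]; its target is the value one month
    after the frame, mirroring Stage 1 where the time gap g = 0.
    """
    frames, targets = [], []
    for i in range(len(series) - r):
        frames.append(series[i : i + r])
        targets.append(series[i + r])
    return np.stack(frames), np.array(targets)

monthly = np.linspace(100.0, 120.0, 30)   # toy interpolated series, 30 months
X, y = make_frames(monthly, r=12)
print(X.shape, y.shape)                   # (18, 12) (18,)
```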

GLP model design
GLP was designed to monitor the laboratory progress of patients and forecast their observations for the subsequent month. Fig. 3 presents the model architecture of GLP, which consists of two main components: the longitudinal iterative block (LIBC) and a regressor. The LIBC comprises a Bidirectional Long Short-Term Memory (BiLSTM) layer and a Fully Connected (FC) condensing layer. A Rectified Linear Unit (ReLU) activation function is applied after each layer to enhance the non-linearity of the model. The BiLSTM processes the input data in both the forward and backward directions [58], enabling the capture of contextual information across the entire frame. The number of hidden nodes for the BiLSTM layer was set to 5. The FC condensing layer is utilized to compress the output of the BiLSTM layer back to the original input dimension, facilitating an autoregressive flow.
On the other hand, the regressor consists of two FC layers with a ReLU activation function in between. The numbers of hidden nodes for the regressor are set to 5, 2, 2, and 1, respectively. The model uses the Mean Squared Error (MSE) as the loss function, which measures the average squared difference between the predicted values and the actual values.
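A minimal PyTorch sketch of this architecture, under our reading of the text, is given below. The exact layer wiring is an assumption, for instance interpreting "5, 2, 2, and 1" as two FC layers of sizes 5 to 2 and 2 to 1; all names are illustrative rather than the authors' implementation.

```python
import torch
import torch.nn as nn

class LIBC(nn.Module):
    """Longitudinal iterative block: BiLSTM + FC condensing layer."""
    def __init__(self, n_features: int = 5, hidden: int = 5):
        super().__init__()
        self.bilstm = nn.LSTM(n_features, hidden, batch_first=True,
                              bidirectional=True)
        self.condense = nn.Linear(2 * hidden, n_features)  # back to input size
        self.relu = nn.ReLU()

    def forward(self, x):                     # x: (batch, time, n_features)
        h, _ = self.bilstm(x)
        return self.relu(self.condense(self.relu(h)))

class GLP(nn.Module):
    def __init__(self, n_features: int = 5):
        super().__init__()
        self.libc = LIBC(n_features)
        self.regressor = nn.Sequential(nn.Linear(n_features, 2), nn.ReLU(),
                                       nn.Linear(2, 1))

    def forward(self, x):
        emb = self.libc(x)                    # Progress_emb analogue
        return self.regressor(emb[:, -1])     # next-month value (Progress_out)

model = GLP()
x = torch.randn(8, 12, 5)   # 8 patients, 12-month frames, 5 features
loss = nn.MSELoss()(model(x).squeeze(-1), torch.randn(8))
```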

Two-stage training algorithm
During Stage 1 training, the interpolated data were utilized in supervised learning. An additional parameter called the certainty mask (certain) was introduced in Stage 1. It denotes the required number of real observations within a frame, indicating how many real observations the models require to generate reliable predictions. It also serves as a means to address uncertainty within the training data. Adhering to insurance regulations, patients were scheduled for cardiovascular disease examinations every three months, resulting in a maximum of four actual observations within a 12-month timeframe. Consequently, the range of 0 to 5 was explored for certain, and the optimal value was determined as a parameter setting for GLP.
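A minimal sketch of this certainty-mask filtering follows, assuming a boolean mask that marks which monthly slots in a frame were actually observed; all names are illustrative.

```python
import numpy as np

def filter_frames(frames: np.ndarray, observed: np.ndarray, certain: int):
    """Keep only frames containing at least `certain` real observations.

    frames:   (n_frames, r) interpolated values
    observed: (n_frames, r) boolean mask of actually measured slots
    """
    keep = observed.sum(axis=1) >= certain
    return frames[keep]

rng = np.random.default_rng(0)
frames = rng.normal(size=(10, 12))
observed = rng.random((10, 12)) < 0.25   # roughly 3 real samples per year
print(filter_frames(frames, observed, certain=2).shape)
```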
In Stage 2 training, we employed autoregressive-based SSL, in which inputs from a time series are regressed on previous inputs from the same series. The probability of each input is conditioned on the preceding inputs and can be formulated as
$$\max_{\theta}\, p_{\theta}(x) = \prod_{t=1}^{T} p_{\theta}(x_t \mid x_1, \ldots, x_{t-1}),$$
where $x_t$ represents the input at time $t$, $p_{\theta}$ denotes the probability, and $\max_{\theta} p_{\theta}$ signifies the maximized likelihood [13]. Starting from the frame $y_{t_{m-r}} : y_{t_m}$, the model utilizes the parameters $\theta$ obtained from Stage 1 and predicts $y_{t_{m+1}}$. The subsequent input frame becomes $y_{t_{m-r+1}} : y_{t_{m+1}}$, with the prediction target located at $y_{t_{m+2}}$. This process continues until the prediction target reaches $y_{t_n}$. Each predicted value then becomes part of the input for the next frame, allowing the model to learn from its own generation.
To summarize, in Stage 1 there is no time gap ($g = 0$) between the input frame and the prediction target: the input passes through the LIBC once and then enters the regressor. In Stage 2, $g = n - m - 1$ with $g > 0$, and the process iterates until the prediction target is reached. The training algorithms for Stages 1 and 2 are explicitly outlined in Algorithms 1 and 2. Each laboratory parameter was trained individually and optimized to achieve the best performance.
The six pretrained GLP models were subsequently concatenated in a multimodal fashion and utilized for domain transfer to perform TVR occurrence classification, as illustrated in Figure 4.
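The Stage 2 rollout described above can be sketched as follows. This is a simplified, hedged rendering rather than the authors' Algorithm 2: the `model` is assumed to follow the GLP sketch above, the lab value is assumed to occupy the last feature channel, and all names are illustrative.

```python
import torch

def stage2_rollout(model, frame: torch.Tensor, g: int) -> torch.Tensor:
    """Autoregressive Stage 2 rollout.

    frame: (1, r, n_features) last fully observed window
    g:     number of months until the final observation y_tn
    """
    preds = []
    for _ in range(g):
        y_hat = model(frame)                     # predict next month, (1, 1)
        preds.append(y_hat)
        step = frame[:, -1:, :].clone()          # copy last step's context
        step[..., -1] = y_hat                    # overwrite lab-value channel
        frame = torch.cat([frame[:, 1:, :], step], dim=1)  # slide window by one
    return torch.cat(preds, dim=0)

# Training would compare the final prediction with the observed y_tn and
# backpropagate through the rollout, starting from the Stage 1 parameters.
```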

Input vector and normalization
The input of GLP consists of five features: age, gender, certain, discrete value encoding, and normalized laboratory values. Numeric values, such as age and laboratory values, are normalized using the natural logarithm of one plus the input (log1p). This transformation ensures that the values are projected into a vector space above zero, preventing potential errors that could arise from maldistribution between positive and negative values. Additionally, log1p remains accurate even for values of $x$ so small that $1 + x$ would round to 1 in floating-point arithmetic, without significantly altering the original value [59]. The use of log1p also prevents information leakage [60], unlike scaling, which necessitates knowledge of the maximum and minimum values. Regardless of the employed training approach, the data were randomly divided into training and testing datasets in an 80:20 ratio. All training processes utilized the 5-fold cross-validation technique, and the reported results represent the mean value obtained from five repetitions of the training process [61]. Ablation studies were conducted to analyze the performance of different training processes combined with various interpolation methods. The outcome of the model was assessed using the R-squared ($R^2$) metric, which indicates the proportion of variance in the dependent variable that can be predicted from the independent variables in the model. $R^2$ values range from zero to one, with a value of one representing a perfect fit to the data and a value of zero indicating a poor fit; $R^2 < 0$ suggests that the model performs worse than a horizontal line passing through the mean value of the data. Statistical significance of differences in model performance was assessed using an independent t-test, with $p < 0.05$ indicating statistical significance.
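A brief sketch of this normalization and validation setup is shown below, using standard NumPy/scikit-learn utilities; the data and names are placeholders rather than the study's pipeline.

```python
import numpy as np
from sklearn.model_selection import KFold, train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
age = rng.uniform(40, 80, 100)
lab = rng.uniform(60, 250, 100)                       # e.g., glucose AC values
X = np.column_stack([np.log1p(age), np.log1p(lab)])   # strictly positive inputs
y = rng.normal(size=100)                              # placeholder targets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=0)
for tr_idx, va_idx in KFold(n_splits=5, shuffle=True,
                            random_state=0).split(X_train):
    pass  # fit the model on tr_idx, validate on va_idx

y_pred = np.full_like(y_test, y_train.mean())    # horizontal mean-line baseline
print("baseline R^2:", r2_score(y_test, y_pred)) # close to zero by definition
```

Because log1p is applied element-wise with no dataset-wide statistics, it cannot leak test-set information, unlike min-max scaling.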

Downstream classifier design and validation
The collection of the six laboratory values occurred at the time of performing PCI. Subsequently, $g$ (month-based) was computed based on the temporal disparity between the PCI and TVR dates. Patient information, including gender, age, and the six laboratory parameters, was collected and normalized following the aforementioned procedures. Due to the imbalanced distribution of patients between those with TVR occurrence (42) and those without (441), TVR-negative cases were randomly downsampled. As a result, only 84 patients entered the training process. Patient data are processed by the frozen GLPs, which are concatenated in a multimodal fashion (as shown in Figure 4).
Owing to the limited patient volume, non-neural-network algorithms were chosen for training the downstream classifier. The included methods are Light Gradient Boosting Machine (LightGBM), Support Vector Machine (SVM), Logistic Regression (LR), and K-Nearest Neighbors (KNN). These approaches were selected for their diverse mechanisms. Cohen's Kappa was calculated to assess the agreement between the classifiers, and the mean value of Cohen's Kappa represented the overall agreement among the pairwise classifiers. A Kappa value ≤ 0 indicates no agreement, while an increase in Kappa signifies an increase in agreement, with a value of 1 indicating perfect agreement.
Additionally, we compared the latent progress representations produced by GLP by extracting the outputs of the LIBC (Progress_emb) and the regressor (Progress_out), as depicted in Figure 3. The transferred results based on the original data (normalized but not processed with GLP), Progress_emb, and Progress_out were compared. The reported performance represents the mean value obtained after executing the training process five times. The downstream classifier is a binary classifier that distinguishes TVR occurrence (positive/negative). The evaluation metrics for TVR incidence include the area under the receiver operating characteristic curve (AUROC), accuracy, sensitivity, specificity, precision, and F1 score. To determine the statistical significance of the differences in evaluation metrics between the original data and the extracted representations, an independent t-test was also conducted.
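An illustrative evaluation sketch for the downstream classifiers and the pairwise Cohen's Kappa computation follows. The features and labels are placeholders, and the lightgbm/scikit-learn calls are standard APIs rather than the authors' code.

```python
import numpy as np
from itertools import combinations
from lightgbm import LGBMClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import roc_auc_score, accuracy_score, cohen_kappa_score
from sklearn.model_selection import train_test_split

X = np.random.randn(84, 6)          # placeholder for the 84 downsampled patients
y = np.random.randint(0, 2, 84)     # placeholder TVR labels
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y,
                                          random_state=0)

clfs = {"LGBM": LGBMClassifier(), "SVM": SVC(probability=True),
        "LR": LogisticRegression(max_iter=1000), "KNN": KNeighborsClassifier()}
preds = {}
for name, clf in clfs.items():
    clf.fit(X_tr, y_tr)
    preds[name] = clf.predict(X_te)
    auroc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
    print(name, "AUROC:", round(auroc, 2),
          "ACC:", round(accuracy_score(y_te, preds[name]), 2))

# Mean pairwise Cohen's Kappa as the overall agreement among classifiers.
kappas = [cohen_kappa_score(preds[a], preds[b])
          for a, b in combinations(preds, 2)]
print("mean kappa:", np.mean(kappas))
```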
Furthermore, we plotted the distribution of the original data, Progress_out at $g/2$, and Progress_out at $g$ to visually depict the contribution of GLP throughout the process. Here, $g/2$ signifies the iteration halfway before reaching the intended event.

Patient recruitment and datasets
Two datasets were obtained from two diverse medical institutes. The pretext dataset was obtained from the Chang Gung Research Database [62], a multi-institutional electronic medical records database comprising original medical records of seven medical institutes in Taiwan. We included patients who were diagnosed with HTN prior to DM. Patients diagnosed with hypertension before the age of 40, those with any oncology visits, or individuals with observations spanning less than a year were excluded. The date of diagnosis was determined based on the International Classification of Diseases (ICD) encoding or the date of medication prescription. The ICD codes and medications used for the indications can be found in Appendix A. When a patient had been coded as having HTN or DM twice or more during a year, the onset date of the disease was defined as the first coded date. If the date of the first medication prescription preceded the ICD-coded date, then the earlier date was designated as the onset date of the disease. The time interval between the HTN and DM diagnoses was set to be ≥ 3 months.
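The onset-date rule can be sketched as follows, assuming tidy pandas series of ICD-coded dates and prescription dates per patient. The column and function names are illustrative, and reading "twice or more during a year" as two codes within a rolling 365-day window is our assumption.

```python
import pandas as pd

def onset_date(icd_dates: pd.Series, rx_dates: pd.Series) -> pd.Timestamp:
    icd = icd_dates.sort_values().reset_index(drop=True)
    # require a second code within 365 days of an earlier one
    gap_ok = (icd.shift(-1) - icd) <= pd.Timedelta(days=365)
    if not gap_ok.any():
        return pd.NaT
    onset = icd[gap_ok].iloc[0]            # first coded date that qualifies
    if len(rx_dates) and rx_dates.min() < onset:
        onset = rx_dates.min()             # earlier prescription takes precedence
    return onset

icd = pd.Series(pd.to_datetime(["2005-01-10", "2005-06-02", "2008-03-01"]))
rx = pd.Series(pd.to_datetime(["2004-12-20"]))
print(onset_date(icd, rx))   # 2004-12-20 00:00:00
```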
We gathered data on patients between the onset of HTN and DM; that is, the patient had been diagnosed with HTN but had yet to be diagnosed with DM. Demographic information and laboratory data of the enrolled patients were collected, including age, sex, Chol/HDL-c, LDL-c, LDL-c/HDL-c, glucose AC, WBC, and UA. Erroneous values such as "NA" or "." were excluded. A total of 9,720 patients were included, and laboratory data were collected between January 2001 and December 2019.
The downstream dataset was acquired from the Taipei Veterans General Hospital, a tertiary hospital situated in northern Taiwan. We recruited 891 patients with noninvasive evidence of myocardial ischemia who underwent PCI between January 2005 and January 2022. Patients with ACS, acute decompensated congestive heart failure, acute or chronic infections, autoimmune diseases, malignancies with a prognosis of less than one year, unstable hemodynamic status, or those unable to receive dual antiplatelet therapy were excluded. Angiographically successful coronary intervention was defined as residual stenosis of less than 30%, with coronary thrombolysis in myocardial infarction grade 3 flow achieved at the conclusion of the procedure without any significant complications. All patients were followed up to monitor the occurrence of TVR.
Patients necessitating TVR were labeled as positive, and the corresponding dates were recorded. Patients without TVR until the end of January 2022 were classified as negative, with the date of January 31, 2022 recorded as the endpoint. Within this dataset, the available information encompassed age, sex, PCI date, TVR date, and the aforementioned six laboratory values (collected during the PCI procedure). The time intervals between the PCI and TVR dates were calculated. Patients with incomplete data were excluded from the analysis, as imputing missing values could have introduced deviations into this study. Consequently, a total of 483 patients were included in the subsequent analysis, comprising 42 TVR and 441 non-TVR cases.
The pretext dataset consists of longitudinal observations with multiple events per patient, whereas the downstream dataset consists of episodic records with a single observation event per patient. All patient data were de-identified prior to analysis. This study was approved by the Institutional Review Boards of the Chang Gung Medical Foundation (No. 202000376B0) and Taipei Veterans General Hospital (No. 2019-12-012CC).

III. RESULTS
Table I illustrates the demographic information of the patients enlisted from the two datasets. It is noteworthy that patients in the pretext dataset (HTN-to-DM patients) exhibit a relatively younger age compared to those in the downstream dataset (PCI patients). It is observed that TVR typically occurs within a period of 2.32 ± 2.64 years. Fig. 5 depicts the overall performance of GLP, assessed by averaging the $R^2$ results across certain values ranging from 0 to 5. Fig. 5(a) provides an overview of the general disparities among different training approaches, incorporating all interpolation methods. It reveals that SSL and two-stage training exhibit similar performances, both achieving a mean $R^2$ value of 0.46 when rounded to the second decimal place. On the other hand, supervised training (Stage 1 training) and hybrid training achieved lower means with larger variations. Fig. 5(b) to Fig. 5(d) delve into two-stage training, supervised training, and hybrid training, respectively, and all inspect the effect of different interpolation methods. The two-stage training approach (Fig. 5(b)) is the only approach that surpasses SSL (the baseline). Linear ($R^2 = 0.49$, $p = 0.508$) and PCHIP ($R^2 = 0.48$, $p = 0.603$) interpolation both yield better results than SSL, although the differences did not reach statistical significance. However, when compared to barycentric interpolation, both linear ($p = 0.031$) and PCHIP ($p = 0.046$) show significantly higher performance. The other training approaches, including supervised training depicted in Fig. 5(c) and hybrid training in Fig. 5(d), demonstrate weaker performance than the horizontal mean line ($R^2 < 0$), with larger variations.
Table II provides a summary of the optimal certain settings for each GLP, which were found to be inconsistent across different parameters. It is noteworthy that, for most parameters, the two-stage training process consistently outperformed SSL, although the degree of improvement remained marginal ($p > 0.05$). While specifying the certain value enhanced GLP performance (evidenced by an increase from 0.49 to 0.57 for linear interpolation), the correlation between $R^2$ and certain was determined to be weak based on Pearson correlation analysis (linear: -0.002, PCHIP: 0.026, and barycentric: -0.040). This result indicates that $R^2$ and certain are not linearly correlated. The detailed performance for each certain setting can be found in Appendix C. Based on the best-performing linear interpolation approach, we trained the GLP pretrained models by optimizing the certain value for each parameter individually.
Table III presents the results of the downstream task. After processing by GLP, Progress_out exhibited a significant enhancement in classification performance. On average, the measures attained an AUROC of 0.91, accuracy of 0.90, sensitivity of 0.80, specificity of 0.98, and F1 score of 0.86. Progress_out displayed substantial superiority over the performance of the original data ($p < 0.01$) and Progress_emb ($p < 0.01$). The key distinction between Progress_emb and Progress_out lies in the fact that Progress_out represents a condensed version that solely indicates the laboratory values for the subsequent month, whereas Progress_emb still retains information on age, gender, certain, discrete value range encoding, and laboratory values. This condensed version has a distillation effect, whereby the regressor was trained to provide a more valuable indication of future trends based on this information. Notably, LGBM emerged as the best-performing algorithm for both Progress_emb and Progress_out, while SVM demonstrated the highest performance when utilizing the original data. The Kappa score demonstrates that the agreement among classifiers increased from 0.37 to 0.94, indicating that GLP alters the distribution of the original data and simplifies the classification task, irrespective of the algorithmic mechanism employed.
Fig. 6 visually illustrates the changes in data distribution, using glucose AC and LDL-c as exemplars. The figure depicts that prior to GLP processing (Fig. 6(a)), distinguishing between TVR and non-TVR cases was challenging. However, after GLP processing, the non-TVR cases gradually converged towards a singular point, while the TVR cases displayed a more scattered distribution (as depicted in Fig. 6(b) to (c)).

IV. DISCUSSION
We have successfully engaged in translational engineering by transferring the progression of cardiovascular laboratory parameters from one patient group to another, without being confined to episodic observations. To the best of our knowledge, this is the first study to apply the transfer of laboratory progression between patient groups, demonstrating that disease progression can be effectively transferred through deep neural network processing. The generality of our findings is supported by the data sources: the datasets were obtained from two distinct medical institutions, signifying divergent patient populations, personnel, operational protocols, and measurement equipment. This finding opens up opportunities to leverage the trends observed in general cases for developing data-driven applications targeting patient groups with limited data availability.
Pretrained models capture the temporal dynamics and generate latent representations that enhance predictive capabilities for other tasks. With the widespread adoption of EHRs in modern hospitals, a large amount of data has been collected from prevalent cases, such as patients with HTN and DM. Conversely, specific cases, such as TVR occurrence after PCI, remain limited in number. Given that HTN and DM are known risk factors for TVR [32], we successfully transferred the trend of laboratory progress observed in HTN patients (who had yet to be diagnosed as diabetic) to predict the progression of PCI patients.
Prior investigations [63] have highlighted that training deep neural network models directly through gradient descent can yield randomly initialized models that are less optimized. In contrast, commencing the training process with a pretrained model enables the preservation and utilization of previously acquired knowledge, thereby transforming our randomly initialized models into exceptional pretrained feature extractors [13]. The practice of acquiring knowledge from a more generalized domain and subsequently fine-tuning it to a specific domain during pretraining has previously exhibited efficacy [30]. It has been observed that representations acquired from supervised objectives tend to be more domain-specific and possess limited transferability to out-of-distribution domains [11,13,19]. Thus, modifying the model based on SSL confers enhanced flexibility and facilitates the extrapolation of knowledge across domains. This transformation enhanced our ability to infer patient progress. Despite the heterogeneous characteristics exhibited by the two patient cohorts, there existed intercorrelations among cardiometabolic diseases.
SSL proves viable in bridging the gap caused by irregularity, absenteeism, and sparsity in prolonged monitoring, enabling adequate prediction. Learning from interpolated data in advance exhibits the capacity to enhance prediction performance; however, the effects vary across interpolation methods. Linear and PCHIP methodologies are designed to acquire a continuous function, while barycentric interpolation is tailored to leverage the center of mass for interpolation. The distribution pattern of laboratory progress resembles more closely a continuous curve that extends over time, rather than being clustered in discrete bundles. Our findings reveal that linear and PCHIP interpolation yield more informative estimations, enhancing the performance of SSL. In the case of LDL-c and LDL-c/HDL-c examinations, interpolation expands the tolerance for periods of absence. Thus, our work implies that with sustainable estimation, patient risks can be monitored with less frequent returns and examinations (beyond the conventional 3-month interval), alleviating the burdens of travel and medical expenses for patients.
The results indicate that without the support of GLP, the original data lacked sufficient distinctiveness to enable substantial TVR predictions. Upon processing the data with GLP, the classifier successfully achieved a satisfactory separation. Non-TVR cases gradually converged towards a single point, while TVR cases exhibited a more scattered pattern. These findings align with clinical observations, which suggest that stable patients exhibit less variation, whereas those with scattered observations face increased risks. The distinctiveness is not constrained by algorithmic mechanisms; however, it can be observed that the data were transformed from a hyperplane cluster distribution (where SVM demonstrated the highest performance) to one more readily separated by other mechanisms (where LGBM performed best). The user scenario of GLP is depicted in Fig. 7. Following PCI treatment, patients are required to undergo follow-up visits in the outpatient setting on a 3-month basis. Clinicians assess patient risk to prevent TVR occurrence. GLP identifies patients at higher risk, prompting the initiation of more advanced examinations, such as treadmill tests, thallium scans, or coronary computerized tomography (CT). Conversely, patients with a relatively lower restenosis risk can also be separated out, serving as a screening tool for more precise event detection and examination resource allocation. In general, GLP optimizes treatment strategies and improves patient prognosis while maintaining simplicity and user-friendliness, without relying on the accumulation of comprehensive TVR predictors.
To broaden the application to other diseases, an effective approach for identifying appropriate pretext and downstream tasks becomes indispensable. Our findings suggest that leveraging the intercorrelations among diseases, such as comorbidity or shared risk factors, represents a more proficient approach for selecting pretext and downstream tasks. The intercorrelations imply underlying similarities and suggest the possibility of translating disease progression between different patient groups. This approach offers a more precise alternative to empirical trial-and-error methods.
This study is subject to certain limitations. While other approaches, such as GANs, have demonstrated effectiveness in recovering the distribution of absent observations, we opted for interpolation as a simpler and less computationally expensive method that still achieves excellent transfer performance; further exploration is required to determine the necessity or indispensable advantages of integrating a more advanced distribution generator. Our work is also constrained by the unavailability of genuine patient observations at the target event, which prevented us from precisely reversing the exact output. Additionally, it is important to acknowledge that, at present, the output of GLP corresponds to a representation that cannot be straightforwardly reconstructed as real-world values through simple inverse normalization; to achieve accurate inverse mapping, training an additional network or a decoder might be necessary to effectively map the latent output to real-world values. Furthermore, this study exclusively focused on the analysis of numeric laboratory results, so its applicability is limited to other numeric examinations. To make GLP a more comprehensive laboratory information extractor, it is necessary to incorporate different types of laboratory analyses, including categorical variables.
SSL also inherits certain limitations, such as the need for a larger amount of training data, more training iterations, and greater computational expense compared to supervised training [15]. Additionally, deep neural networks employ a substantial number of trainable parameters and layers, making the model challenging to interpret [64,19]. Thereby, SSL sacrifices interpretability for accuracy. This consideration should be taken seriously when utilizing the technology, as previous research has indicated that the level of trust in AI systems impacts the outcome of system utilization [65,66]. Apart from interpretability, adopting AI systems into clinical workflows involves multidisciplinary integration, encompassing areas such as user interface design (research on Human-Computer Interaction) [67,65] and the factors that influence the acceptance of technology [66]. Ensuring system usability and reliability while resolving the disparities between proofs-of-concept and real-life environments is crucial for the successful adoption of AI systems into daily practice. This endeavor necessitates further collaboration to facilitate the reform of healthcare.

V. CONCLUSION
Our research successfully translates the progression trends of cardiovascular laboratory parameters between patient groups by capitalizing on the advantages of SSL and pretrained models. This discovery paves the way for wider data-driven applications in healthcare and also functions as a screening tool for more precise event detection and judicious allocation of examination resources. Additionally, our study suggests that patients can benefit from a diminished frequency of visits and onerous examinations through the implementation of sustainable estimation. To accomplish healthcare reform through the utilization of AI systems, the key to success and optimal system utilization lies in the multifaceted aspects involved in multidisciplinary collaboration.

Fig. 1. Interpolation results based on different methods, showcasing a segmented period of glucose AC. The x-axis corresponds to the timeline, while the y-axis represents the laboratory values.

Fig. 5. $R^2$ values for different training approaches. SSL (baseline) is compared with (a) averaged $R^2$ across different training approaches; (b) two-stage training employing different interpolation methods; (c) supervised training employing different interpolation methods; and (d) hybrid training employing different interpolation methods. The mean value is represented by the green triangle, and its corresponding numerical figure is provided in the text below. The presented outcomes are an average of five repetitions of training and prediction, determined by aggregating the $R^2$ outcomes across certain values ranging from 0 to 5. SSL: self-supervised learning.

Fig. 6. Alterations in distribution prior to and subsequent to GLP processing. The values were inverse-transformed from normalization. g: time gap between the input frame and prediction target.

Fig. 7. The user scenario of GLP adoption. Advanced examinations indicate treadmill tests, thallium scans, or coronary computerized tomography.
Note to Table I: Duration between HTN and DM signifies the cumulative observed timeframe for each individual. Occurrence duration delineates the temporal interval between PCI and TVR, and non-occurrence duration characterizes the observed time span for patients who did not experience TVR.

TABLE II
$R^2$ of GLP based on optimized certainty mask configurations. WBC: white blood cells; UA: uric acid. The differences did not reach statistical significance ($p > 0.05$). The month indicated in brackets signifies the necessary follow-up visits for patients, based on the requisite count of actual observations within a 12-month period.