Atrial Periodic Source Spectrum From Preoperative Body Surface Potentials Predicts Long-Term Recurrence of Atrial Fibrillation

Objective: About half of patients experience recurrence of atrial fibrillation (AF) within three to five years after a single catheter ablation procedure. The suboptimality of the long-term outcomes likely results from the inter-patient variability of AF mechanisms, which can be remedied by improved patient screening. We aim to improve the interpretation of body surface potentials (BSPs), such as 12-lead electrocardiograms and 252-lead BSP maps, to aid preoperative patient screening. Methods: We developed the Atrial Periodic Source Spectrum (APSS), a novel patient-specific representation based on atrial periodic content, computed on the f-wave segments of patient BSPs, using a second-order blind source separation and a Gaussian Process for regression. With follow-up data, Cox's proportional hazard model was used to select the most relevant feature from preoperative APSSs responsible for AF recurrence. Results: Over 138 persistent AF patients, the presence of highly periodic content with cycle lengths between 220–230 ms or 350–400 ms indicates higher risks of 4-year post-ablation AF recurrence (log-rank test, p-value $< 0.001$). Conclusion and Significance: Preoperative BSPs demonstrate effective prediction in the long-term outcomes, highlighting their potential for patient screening in AF ablation therapy.


I. INTRODUCTION
A TRIAL fibrillation (AF) is the most common cardiac arrhythmia, where the atria beat irregularly and rapidly. Catheter ablation is the most effective treatment for AF and improves quality of life [1]. Adoption of well-studied, standard ablation protocols can improve efficiency and unify the best practice to different centers. However, only 54.1% of paroxysmal AF patients and 41.8% of non-paroxysmal AF patients remained free from AF after three to five years by standard single-procedure ablation protocol [2], and no significant difference was found between standard protocols in terms of long-term AF freedom [3], likely due to inter-patient variability. Thus, there is a strong need to improve preoperative patient screening for AF ablation.
Body surface potentials (BSPs), such as 12-lead electrocardiograms (ECGs) and 252-lead body surface potential maps (BSPMs), are ideal tools for non-invasive mapping. Nonetheless, existing methods to analyze the BSPs have drawbacks of limited accuracy [4], as the problem of reconstructing the electrical waves in the atria using the torso potentials is mathematically ill-posed. Furthermore, as the number of patients enrolled in clinical trials is usually limited, and the simultaneous mapping of the atrial and torso potentials during AF is rarely done in clinics, it has been challenging to apply supervised learning techniques on patient data. Considering this challenge, prediction of mechanisms targeted by ablation protocols was done by either utilizing synthetic data for the training process of supervised learning [5], [6], or by performing "virtual ablation" on the synthetic atria built on patient-specific atrial meshes and biophysical properties of the atrial tissue [7], [8]. These approaches, albeit promising, rely on a strong assumption that computer simulations faithfully represent patient AF, which still needs further clinical validation.
In this study, we aimed to develop a prediction model which directly correlates patient BSPs with their ablation outcomes, and which could be trained directly on existing patient data. A key intuition in our proposed method is that the periodic content on the atria is physiologically important, and it can be extracted from BSPs using second-order blind source separation (SO-BSS) [5], [9]. The periodic content is associated with different phenomena such as focal sources from pulmonary veins [10] and , and the 10 strongest SO-BSS sources (S). Signals shown are for a synthetic AF maintained by focal source and its induced rotors. Each signal is colored by the maximal auto-correlation (MaxAC). Note that only 10 sample signals of V and E are shown, where signals of V were randomly sampled, and signals of E were sampled to reflect the full range of MaxAC variability. Signals of S and E were sorted according to MaxAC for comparison. The white and gray nodes denote observed and unobserved variables, respectively, and green nodes signify SO-BSS source variables. A and B denote two linear mixing matrices in (2) and (1). (Step 1) An example non-invasive APSS from BSPM (estimated with l = 10 with a Gaussian Process Regression (GPR) model), with estimated mean (signal-level APSS) and standard deviation of the GPR predictive distribution over cycle length (CL). (Step 2) Extraction of APSS features (red bars) using max aggregation (black plot) and downsampling from all APSSs of different recordings (top colored plots) of a patient. (Step 3) Example log hazard ratios (HRs) with 95% confidence interval (CI) (x-axis), showing the top-3 APSS features with the highest hazard ratios, with HRs of all the other features being 0, derived from Cox's survival regression model using patient-level APSS features from BSPM with l = 10 for GPR. HR > 1 means a higher value of the factor is likely to result in a decrease of post-ablation AF freedom. In this case, 220-230 ms is the only feature with the 95% CI of HR > 1. (Step 4) Post-ablation outcomes of two patient groups with low/high hazard, i.e. low/high value in the top-1 APSS feature, by cross-validation over 138 patients.
other areas [11], as well as reentrant sources [12], such as rotors and macro-reentries. The dominant frequency (DF) presents intra-patient regional stability in persistent AF patients [13], and also exhibits correlation between torso regions and their nearest atrial regions [14]. This study presents the development of a representation of atrial periodic content extracted from a single beat of BSP from AF patients, which we term atrial periodic source spectrum (APSS), and apply it to screen preoperative persistent AF patients for eventual AF recurrence.

II. METHODS
An overview of our pipeline to use APSS for AF patient screening is shown in Fig. 1. We first present our methods to non-invasively compute APSS, which is a vector representation to summarize the periodic sources of the atria from one beat (the f-wave segment in an R-R interval) of BSP, as shown in Fig. 1 (Step 1). We then show that the aggregation of APSSs from all beats of the same patient exhibit one or more patient-specific patterns ( Fig. 1 (Step 2)). Survival regression was applied to the resultant patient-level APSS in order to identify APSS features that indicate superior or inferior post-ablation maintenance of AF freedom ( Fig. 1 (Step 3-4)).

1) Equivalent Periodic Sources From BSPs:
Given that BSPs of length T are collected using N sensors, we represent the system by an N × T matrix, E. Assuming the mechanical movement of the atria to be minimal relative to the ventricular movement, the BSPs, E, can be represented as a linear mixture of the transmembrane voltages, V.
In our previous work [5], we showed that equivalent atrial periodic sources, S, can be extracted from the BSPs, E, using a second-order blind source separation (SO-BSS) that follows the SOBI algorithm [9]. A consideration to use SO-BSS rather than other methods such as Fast Fourier Transform (FFT) to extract the periodic content, is that FFT is usually performed on longer signals over multiple heartbeats for a satisfactory resolution of power over the frequency domain, such as 4 seconds with a very high sampling frequency (2048 Hz) in Guillem et al. [14]. SO-BSS, on the contrary, can capture the beat-to-beat variability of AF electrical patterns over signals as short as one beat. Following SO-BSS, the BSPs can also be assumed as a linear mixture of K equivalent atrial periodic sources, S, stored in a K × T matrix (K ≤ N ), computed using a linear matrix B derived from performing joint diagonalization over a large number of autocovariance matrices with different time lags.
Thus, the mapping between E, S and V are all linear, as shown in Fig. 1 and described by where A and B are two matrices. Fig. 1 also shows the waveforms of selected transmembrane voltages, BSPMs, and the top ten sources extracted by SO-BSS of an AF episode driven by both rotors and a focal source [5]. It can be seen that the highly periodic components on the atria are underestimated on the torso space, but they are reconstructed by SO-BSS. As previous work [5], the periodicity of an extracted source over any time lag can be measured by auto-correlation function (ACF) [15]. We extracted K = 10 sources from BSPs to achieve a good balance between source estimation accuracy and time efficiency. We represented each periodic source via the cycle length (CL) and the maximal autocorrelation (MaxAC) pairs. On nearly periodic signals, CL marks the length of the time to complete one full cycle, and MaxAC denotes the periodicity of the signal: where the set T AP SS = {100, 101, . . . , 400}ms contains candidate CLs of periodic sources. We used our previously published software [16] to compute these CL-MaxAC pairs.

Regression (GPR): APSS is defined as the following function
The above section estimated a set of CL-MaxAC pairs, which are effectively samples of the peaks of APSS over the included CLs. In order to estimate MaxAC values at other CLs, we adopted a zero-mean Gaussian Process for regression (GPR) function [17], which is a Bayesian method to estimate an unknown function in an infinite domain given a finite number of its evaluations. We used it to represent that MaxAC is continuous over CL, and decreasing towards zero from all CL-MaxAC pairs. The inputs to the GPR estimation are (X, y) = {(CL i , MaxAC i )|i = 1, 2, . . . , K} which are derived from SO-BSS. A moving-max filter of size 10 ms was applied to keep only the dominant CL-MaxAC pairs before GPR estimation. We were interested in the MaxAC values of a limited set of CLs, X * = T AP SS , where the corresponding MaxAC values y * are estimated as: where N is a normal distribution with mean m(y * ) and covariance cov(y * ).
Let κ * = κ(X * , X) or κ(X, X * ), κ = κ(X, X) and κ * , * = κ(X * , X * ), the mean and variance can now be expressed as We chose a Matérn kernel [17] with ν = 3/2 as it is a commonly used isotropic kernel that is a product of an exponential and a polynomial of one degree. The value of the kernel function only depends on the distance r = |x − x | between input pairs: The choice of the length-scale hyperparameter l changes the mean and covariance of GPR-estimated APSS, as shown in (4) and (5). We set the hyperparameter l ∈ {1, 5, 10, 20, 30} as candidates. The effects of choosing different l in estimating APSS can be visualized in the second row of Fig. 2. A small value of l exhibits a sharp peak, whereas a larger value of l leads to a slower descent and a wider spread of estimated MaxAC. We used the scikit-learn [18] Python package (version 0.22.1) to compute the means and covariances of GPR predictive distributions. We show an example APSS estimated by feeding the GPR using CL-MaxAC pairs of S as inputs in Fig. 1 (Step 1). In Fig. 2, we show the periodic content of three AF simulations via their CL-MaxAC pairs computed from the atria and the torso, with and without extraction of equivalent sources using SO-BSS. These AF simulations are typical examples of the three types of AF developed from high-resolution patient atrial meshes, computed using monodomain equations from our previous work [5]: AF driven by a focal source ( Fig. 2(a), see Fig. 3(b) and S2 Video in Feng et al. [5]), AF maintained by additional reentrant sources induced by a focal source ( Fig. 2(b) and Fig. 1, see Fig. 3(a) and S1 Video in Feng et al. [5]), and AF maintained by sustained rotors (Fig. 2(c), see Fig. S2(a) in Feng et al. [5]). These three AF simulations were selected to represent patient AF from the perspective of their periodic components. In this figure, we evaluate the periodic source representation with SO-BSS (from S) and without SO-BSS (from E) by comparing the magnitude and location of the peaks with their atrial counterparts (from V). For atria with a single periodic component, such as the AF episode driven by a focal source in Fig. 2(a), the application of SO-BSS in estimating the periodic content does not make a difference. However, it becomes essential in revealing the periodic components when the atria encompass a mixture of multiple periodic components. This is shown in Fig. 2(b) where a focal source and reentries coexist in the atria. SO-BSS managed to extract highly periodic content at the focal CL of 180 ms, and also other periodic content. In comparison, CL-MaxAC pairs extracted from the signal space did not reveal this periodic component at 180 ms. Likewise, for AF driven by rotors in Fig. 2(c), three periodic sources were present at the atrial space, with CLs around 200 ms, 260 ms, and 385 ms. All these three components were captured by SO-BSS, but the CL-MaxAC pairs calculated directly from BSPM without SO-BSS resulted in multiple false periodic sources with CLs of 200-260 ms. A quantitative comparison with and without SO-BSS over a larger dataset is shown in S1 Appendix.  protocol number 2015-A00401-48, approved on 29/04/2015, clinical trial number 02647749). Written consent was previously obtained from all patients. A total number of 138 persistent AF patients going through catheter ablation at the Haut-Lévêque Cardiology Hospital (Bordeaux, France) were included in the study. The majority of the patients (n = 126) had not received any catheter ablation procedure before. The population characteristics of the included AF patients is shown in Table S3. All anti-arrhythmic medications were stopped 48 hours before the procedure, and diltiazem was given to the patients to suppress the atrioventricular conduction and facilitate f-wave measurement. The BSPMs were acquired at a sampling frequency of 1 kHz, from which 12-lead ECGs were computed. As in our previous work [5], we filtered the entire signal using a standard 2-30 Hz second-order Butterworth filter, and extracted f-waves from patient preoperative BSPMs by removing QRS intervals, which were detected using the Pan-Tompkins algorithm [19], as well as T-waves, which were detected based on the variance between signal channels. Only f-wave segments with length ≥ 800 ms were preserved, which accounted for 17 ± 11 segments for each patient. The ablation protocol was described in Haïssaguerre et al. [20], where focal and reentrant drivers were first mapped by ECG imaging using the same set of surface recordings, and were subsequently ablated until a slowing of local atrial activity. If AF was not terminated, left atrial roof and mitral isthmus lines were ablated. At the end of the procedure, AF was terminated in all patients. We collected their follow-up data up to 4 years post-ablation and counted months from the ablation to the earliest date of AF recurrence, if any. A three-month blanking period after the ablation for any recurrence was applied.

1) Preprocessing of Patient Signals
2) Survival Analysis Using the Selected Features: We used Fig. 1 (Step 1-2) to demonstrate how we represent all periodic sources in a patient by computing features from APSS for each patient. In Step 1, the mean estimated function (black plot) computed from CL-MaxAC pairs (colored dots) of an f-wave segment using GPR was taken as the signal-level APSS. In Step 2, we computed the maximum of the signal-level APSSs (colored plots) over each CL as the patient-level APSS (black plot), and downsampled it every 10 ms to derive 30 patient-level APSS features (red bars), corresponding to smoothed maximum of the MaxAC values on the CL intervals of {100-110, 110-120,..., 390-400}ms of each patient.
To estimate how each of 30 features contributed to the AF recurrence, we adopted Cox's proportional hazards model [21]. The Cox model estimates the hazard ratio for each covariate by assuming that the total hazard of a population, responsible for the AF recurrence of all included patients, is contributed proportionally by each time-independent covariate (or feature). We added an L1 penalty term to Cox's model, to shrink coefficients of less importance to the regression target. The top three features with the highest estimated hazard ratios predicted by Cox's model, using APSS features estimated with l = 10 in GPR from BSPM, are shown in Fig. 1 (Step 3), whereas the hazard ratios for all the other features were 0. We used the lifelines Python toolbox (version 0.26.0) [22], [23] for computing and plotting of the Cox model and the Kaplan-Meier curves.
For survival likelihood estimation, cross-validation has been proven to reduce bias in accuracy estimation [24]. Therefore, we used 10-fold cross-validation to train and test the performance.
In each fold, about 90% of patients were used as a training dataset to select the most relevant feature with the largest hazard ratio, denoted as CL best , and its MaxAC value, MaxAC best . For example, in Fig. 1 (Step 3), the interval of 220-230 ms had the highest hazard ratio, so it was selected as CL best . The average of MaxAC best over all patients, θ, was taken as the cut-off MaxAC for dividing the remaining 10% test patients into lower (MaxAC best ≤ θ) and higher (MaxAC best > θ) hazard groups. The lower and higher hazard groups among the testing patients of all 10 folds were then concatenated to form the total lower and higher hazard groups, for which Kaplan-Meier curves were plotted, as shown in Fig. 1 (Step 4).

A. SO-BSS Reconstructs Characteristic Periodic Components
In Fig. S2, we compared the highest MaxAC calculated using different extraction methods (with SO-BSS or without) and signal types (BSPMs or ECGs), on all f-wave segments of the patients (2368 segments in total). The highest MaxAC value is higher in settings with SO-BSS than in settings without SO-BSS, as expected, showing that SO-BSS was able to unmix BSPs to obtain the underlying highly periodic sources. Similarly, it was easier to reconstruct the periodic sources using the BSPMs than the ECGs, as there are additional channels of BSPs.
We show the APSSs calculated using different extraction methods and signal types from the two patients in Figs S3 to S6. These plots show that SO-BSS is also better at capturing less outstanding periodic sources. For example, in both plots at the column of l = 30, some periodic components with smaller MaxAC values, such as those with CLs between 200-250 ms for the first patient and below 150 ms for the second patient, were captured by SO-BSS on both BSPMs and ECGs, but they were missed by analyzing BSPs directly.

B. Patient-Specific APSS Patterns and Effects of GPR Hyperparameter Selection
To show the existence of consistent patterns across different AF episodes of the same patient, and these patterns may vary across different patients, we plotted APSSs and corresponding features of two patients in Fig. 3, where the BSPM waveforms and the extracted SO-BSS sources of two representative signals of these patients, as well as the extracted CL-MaxAC pairs, can be found in Figs S7 and S8, respectively. Two distinctive patient-specific patterns are revealed in the plots. Fig. 3(a) reveals a common peak between CLs of 120-200 ms across different f-waves. In comparison, the APSSs of another patient in Fig. 3(b) shows a common platform-shaped pattern on CLs of 220-270 ms, reflecting two periodic sources with similar CLs. In addition, adopting a high value of the hyperparameter l helps to reveal patient-specific patterns, shown by a smoother mean APSS curve and a narrower standard deviation band over area with high mean MaxAC values in Fig. 3.
The AF dynamics of some patients varied substantially between different recordings, resulting in diverse APSS patterns. In addition to the post-ablation outcome prediction pipeline, we can interpret common APSS patterns shared by several recordings within such a patient using K-means clustering [25], with the Euclidean distance as the similarity metric. The number of clusters was selected by the highest silhouette score [26], commonly used for evaluating the assignment of clustering labels by how close a data point is to the other data points of the same cluster, compared to the data points of different clusters. An example of such a way to find several patient-specific patterns from one patient is demonstrated in Fig. S9, where clustering into six clusters had the highest silhouette score in Fig. S9(a). The APSS curves of different clusters in Fig. S9(b) demonstrate high similarity globally (blue, green, and brown clusters) and over certain CL ranges (orange, red, and purple clusters) within the same cluster.

C. Post-Ablation AF Recurrence Prediction Using Non-Invasive APSS
To demonstrate the prognostic value of APSS, we show the Kaplan-Meier curves [27] of lower and higher hazards on all 10-fold cross-validation splits and on the final assembled test set using l = 10 for GPR in Fig. 4 and Fig. S10, in a style similar to Fig. 3 in Simon et al. [24]. Fig. 4 demonstrates that the highhazard group had a significantly higher AF recurrence than the low-hazard group on a hazard of 220-230 ms selected using SO-BSS and GPR, with a p-value < 0.001 under a logrank test [28]. This demonstrates that we can effectively screen patients likely to receive long-term benefits from the current procedure, based on their preoperative BSPs.
Furthermore, we tested the discriminative value of the computed feature (CL best ) by testing the statistical difference on the post-ablation AF freedom between low-hazard and high-hazard groups using this feature, with different settings (i.e. with or without SO-BSS, choice of l, type of BSPs) in Table I. With SO-BSS and the settings adopting l ∈ {5, 10, 20}, features of 220-230 ms (with notation 1 ) or 350-400 ms (with notation 2 ) best distinguished the two groups across all splits, and were associated with the best distinguishing effects by receiving the smallest p-values, for both BSPMs and ECGs. With a large value of l, features obtained from downsampling the maximum of all APSS estimations (red dots in Fig. 3) also showed less variation between different CL intervals, which made it more difficult to perform survival regression using these features (Table I). Meanwhile, it was harder to use APSS calculated without SO-BSS to distinguish two patient groups shown by larger p-values, and l ∈ {20, 30} produced the best results. The inflation of optimal l range means that reconstructed periodic sources were less concentrated, suggesting that the SO-BSS helped to reconstruct periodic sources. In all but one cases with p-value< 0.05, consistent features were selected across all splits, acting as an important indicator for prediction confidence. Surprisingly, ECGs distinguished two patient groups slightly better than BSPMs (smaller p-values in Table I), albeit the reduced sharpness of APSS curves compared to BSPMs (the highest MaxAC values are lower in ECGs than in BSPMs in Fig. S2).

IV. DISCUSSION
In this study, we presented a novel metric of AF named APSS, extracted with SO-BSS from either BSPMs or ECGs. On BSPs of multiple f-wave segments, the representation of APSS enabled us to preoperatively select patients that are likely to have better long-term ablation outcomes. This advances non-invasive patient screening for mechanism-directed treatment, which is still lacking to date [29].
APSS reconstructs multiple coexisting equivalent periodic atrial sources from patient BSPs, which reflect the underlying periodic electrical wave patterns, such as focal sources, reentries, and other repetitive electrical activation. Compared to FFT analysis, this approach is applicable to the f-wave as short as a single beat, and has a high time resolution equivalent to the sampling frequency, which is advantageous in capturing AF dynamics with beat-to-beat variation. This is especially beneficial for f-wave analysis since the length of the f-wave segment between adjacent ventricular beats is limited. Another key advantage of patient-level APSS is its ability to reveal patient-specific patterns that are consistent across different beats ( Fig. 3 and Fig. S9). Considering this advantage, the patient-specific APSS pattern was used for explaining the inter-patient difference in treatment outcomes.

A. Technical Considerations of Non-Invasive APSS
Taking into account that the AF mechanisms of persistent AF patients are more elusive, we used patient recordings as opposed to synthetic signals in our previous work [5] to train survival prediction. To cope with the limited number of patient data in clinical studies, we used a uni-variate feature for the prediction. The simple APSS representation also provides a better interpretation of patient physiological condition, and an easier adaptation to data obtained from different centers.
The way we used GPR estimation is different from the common applications of GPR, as the input data only contained samples of the peaks of the ground-truth APSS function. Thus, we selected the covariance function of GPR in a way that can aid clinical decision making. The common patterns from different episodes of a patient showed evidence that patientspecific patterns of APSS can be extracted from preoperative f-waves. In addition, we also proposed a clustering method to discover several common patterns within a patient (Fig. S9). In general, we found that using SO-BSS and GPR with l ∈ {5, 10} achieved a good balance of revealing realistic patient-specific APSS patterns (favoring a larger l) and preserving diversity between APSS features (favoring a small l), while being useful for the survival prediction for both BSPM and ECG. As both signals were equally effective in patient screening, our algorithm can be easily applied in any catheter lab that records ECGs or BSPMs prior to and during the procedure.
Although one may consider using dimension reduction techniques such as Principal Component Analysis [30] to reduce the dimension of features, we deliberately chose the simplest classifier with a univariate feature for prediction, for the clarity of its physiological meaning. We focused on markers extracted solely from BSPs, but combining other patient biomarkers would likely improve the prediction.

B. Mechanistic Insights From Survival Analysis With Non-Invasive APSS
MaxAC values on the CL intervals of 220-230 ms and 350-400 ms were consistently identified as two simple univariate predictive metrics for the long-term ablation outcome of persistent AF patients. As none of CL intervals below 200 ms showed a distinguishing value from our analysis, the selected CLs are likely the shortest CLs responsible for different patient outcomes. AF components in these frequency bands are likely slow reentries with large conduction blocks, or macro-reentries, as the frequency of rotors was typically ≥ 5 Hz in human AF [14]. There are several plausible explanations for why patients with high MaxACs at these bands did not receive the same treatment outcomes as the other patients in the referred ablation strategies.
Firstly, the identification of reentrant circuits is challenging and typically requires entrainment mapping [31], so they could be missed during mapping. Secondly, as a long AF CL of 190 ms already indicates a higher degree of organization in the atria [32], and sometimes used as a criterion of ablation termination, the remaining periodic sources with CL above 200 ms may be mistaken as satisfactory immediate ablation outcomes. Furthermore, since only the highest frequency sites were ablated in DF-targeted ablation strategies [33], coexisting periodic sources which were not the highest DF sites may be missed. These hypotheses, however, warrant future verification studies that compare simultaneous BSPs and intra-cardiac signals of AF patients.

C. Limitation
We computed the APSS using isotropic and stationary kernels in GPR. The isotropic kernel assumes that the diminishing speeds of the MaxAC by increasing and decreasing CL are the same, whereas the stationary kernel assumes that the diminishing speeds are the same for all CLs. This is likely an over-simplification, and more realistic kernel functions should be considered in the future. The clinical follow-ups contain missing entries, and were retrospectively interpreted.

V. CONCLUSION
We presented APSS, a novel non-invasive personalized representation of patient AF conditions, based on equivalent periodic sources derivable from BSPs within a heartbeat. Inference from such short signals is particularly useful for describing AF, given its beat-to-beat variation. In a retrospective analysis of 138 persistent AF patients, APSS demonstrated effective prediction in the long-term outcomes of catheter ablation for persistent AF patients. This exhibits the prognostic value in preoperative BSPs of AF patients, which has the potential to improve patient screening for mechanism-directed treatment.