Detecting Atrial Fibrillation and Atrial Flutter in Daily Life Using Photoplethysmography Data

Objective: Photoplethysmography (PPG) enables unobtrusive heart rate monitoring, which can be used in wrist-worn applications. Its potential for detecting atrial fibrillation (AF) has been recently presented. Besides AF, another cardiac arrhythmia increasing stroke risk and requiring treatment is atrial flutter (AFL). Currently, the knowledge about AFL detection with PPG is limited. The objective of our study was to develop a model that classifies AF, AFL, and sinus rhythm with or without premature beats from PPG and acceleration data measured at the wrist in daily life. Methods: A dataset of 40 patients was collected by measuring PPG and accelerometer data, as well as electrocardiogram as a reference, during 24-hour monitoring. The dataset was split into 75%–25% for training and testing a Random Forest (RF) model, which combines features from PPG, inter-pulse intervals (IPI), and accelerometer data, to classify AF, AFL, and other rhythms. The performance was compared to an AF detection algorithm combining traditional IPI features for determining the robustness of the accuracy in presence of AFL. Results: The RF model classified AF/AFL/other with sensitivity and specificity of 97.6/84.5/98.1% and 98.2/99.7/92.8%, respectively. The results with the IPI-based AF classifier showed that the majority of false detections were caused by AFL. Conclusion: The PPG signal contains information to classify AFL in the presence of AF, sinus rhythm, or sinus rhythm with premature contractions. Significance: PPG could indicate presence of AFL, not only AF.


I. INTRODUCTION
Atrial fibrillation (AF) is a cardiac arrhythmia that has been estimated to affect approximately 3% of the adult population, with the prevalence increasing at older age [1], [2]. The arrhythmia is associated with increased morbidity, such as stroke and heart failure [3], [4]. Therefore, timely diagnosis and start of treatment of AF are essential, and new solutions for unobtrusive, low-cost, and possibly prolonged monitoring are increasingly studied. In particular, solutions for long-term monitoring that work in daily life are needed for detecting intermittent episodes of AF that may be missed if the monitoring period is short.
Another cardiac arrhythmia that causes a stroke risk similar to that of AF, but is less common, is atrial flutter (AFL) [5], [6]. In AFL the atrial rhythm is regular, and the ventricular rate depends on atrioventricular conduction and on whether the flutter is typical or atypical. The guidelines for anticoagulation and the aims of AFL management are similar to those for AF [4]. In addition, many patients with AFL later develop AF [7], or both arrhythmias may coexist [6]. Although the aims in management are similar, the treatment strategies for the two arrhythmias differ. AF is more often treated with medication, whereas cardiac ablation is more common in treating AFL [4], [5], the success rate of ablations for a specific type of AFL being 90-95% [4]. Rate control in AFL is often more difficult to achieve than in AF [4], and antiarrhythmic therapy of AF may also cause AFL [5]. Knowing the type of arrhythmia causing the stroke risk is therefore important, as antiarrhythmic therapies differ.
Photoplethysmography (PPG) is an optical measurement modality that can be used in physiological measurements, such as heart rate monitoring [8], [9]. Reflective PPG is often used for wearable solutions, e.g. wristband devices. The potential of using PPG measured at the wrist to detect AF has been investigated in several studies with promising results [10]-[23]. Most of the approaches have focused on discriminating AF from normal sinus rhythm (NSR) [10]-[12] or, more generally, from non-AF rhythms without further dividing the rhythms into different classes [13]-[21]. For classification of multiple rhythms, Corino et al. [22] proposed a method to classify the rhythms into AF, NSR, and other arrhythmias, whereas the approach of Fallet et al. [23] focused on classifying AF, NSR, and ventricular arrhythmias. Furthermore, the potential of using PPG signals to classify multiple cardiac rhythms has been studied with PPG measured with smartphones [24]-[26].
While AF detection with PPG has been widely studied, the literature about using PPG signals to classify AFL is limited. In the study of Corino et al. [22], the class for other arrhythmias included 9 subjects having either AFL, ventricular premature beats (VPB), atrial tachycardia, or variable conduction. The sensitivity and specificity for this class were 75.8% and 76.8%, respectively. In other studies in which AFL is mentioned to appear in the dataset, the data has often been either excluded [16], [17], [21], [27] or included in the group of non-AF rhythms [20]. Kashiwa et al. [21] excluded continuous AFL, but did not distinguish short AFL episodes among AF episodes. The approaches that included AFL in a different class than AF were based on the information obtained from the inter-pulse intervals (IPIs) [20], [22]. From electrocardiography (ECG) studies we know that rhythm irregularity is not specific to AF, but also occurs during AFL [28]. Therefore, analysis of IPI irregularity patterns may not be sufficient for an accurate classification of these two arrhythmias. AFL can also manifest as a very regular rhythm, which can be a challenge when distinguishing it from sinus rhythm. Adding further information, e.g. from the PPG waveform, may be helpful in detecting different rhythms.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
The objective of our study was to develop a classifier for classification of AF, AFL, and other rhythms, using PPG and acceleration data measured at the wrist in daily life. The category other rhythms included NSR and sinus rhythm accompanied with premature beats originating either from the atria or the ventricles. First, we developed an AF classifier based on rhythm irregularity, derived from the IPIs with commonly used features for AF detection, to benchmark the classification performance. This was done in three different ways: by considering AFL as a non-AF rhythm together with sinus rhythm and premature beats, considering AFL with AF because of the similar stroke risk, and by excluding AFL completely. Second, we added new features, such as features from the PPG waveform, to improve the classification performance and provide sufficient information for classifying multiple rhythms, i.e. AF, AFL, and other.

II. METHODS
A. Data
The dataset for this study consisted of simultaneous ECG, PPG, and accelerometry measurements in 40 patients undergoing a 24-hour Holter measurement as part of routine clinical care. The patients were contacted by a cardiologist and given at least one week to consider their participation in the study. Before the start of the measurements, the participants gave written informed consent. The study (NL53827.100.15) was approved by the medical ethical committee MEC-U (Medical Research Ethics Committees United) in the Netherlands, and the data was collected in the Catharina Hospital, Eindhoven, the Netherlands.
The ECG was measured with a 12-lead Holter monitor (H12+, Mortara, Milwaukee, WI, USA). The PPG and 3-axis accelerometer measurements at the non-dominant wrist were made with a data logging device equipped with the Philips Cardio and Motion Monitoring Module (CM3 Generation-3, Wearable Sensing Technologies, Philips, Eindhoven, the Netherlands). The PPG sensor was based on reflective mode using two green LEDs. The sampling frequency of both PPG and accelerometry was 128 Hz and the dynamic range of the accelerometer was ±8 g. The accuracy of heart rate measurement when using the same PPG-sensor has been previously reported in [29].
The ECG data were visually analyzed by a clinical expert using an automated rhythm detection software (Veritas, Mortara, Milwaukee, WI, USA). The software extracted beat times from the ECG and identified every beat either as normal, supraventricular premature beat (SVPB), VPB, AF, paced, artifact, or unknown. The rhythm was then confirmed or corrected by the expert. The software also labeled atrial flutter as AF; these labels were corrected by the expert after the visual inspection.
Out of the 40 patients, 14 had continuous AF during the recording period, 20 had normal sinus rhythm with premature contractions, 4 had continuous atrial flutter, 1 had continuous atrial flutter with atrial tachycardia, and one patient had a very noisy ECG reference. The patient with the very noisy ECG reference was excluded from the analysis, because no classification could be made based on the reference data.
For developing the classification models, the dataset was divided into two parts: a training set and a test set. The training set was used for training of the model and the test set was kept as unseen data to test the classification performance. The split was made by assigning 75% of the patients to the training set and 25% to the test set based on the rhythm, with the aim to have similar rhythm distributions in both datasets. These are presented in Table I based on the percentage of beats in every class and the number of patients having that rhythm. The atrial tachycardia was included in the class of AFL. The patient characteristics in both datasets are presented in Table II.

B. Preprocessing and Pulse Detection
The raw PPG data was uploaded from the data logging device and processed offline. As a preprocessing step prior to pulse detection, the PPG signal was downsampled from 128 Hz to 64 Hz and bandpass filtered to the range of 0.3 to 5 Hz. The pulses were detected by identifying the fiducial points, i.e. the troughs, in the waveform by detecting the local minima. The time difference between two consecutive fiducial points was calculated to obtain the IPIs. The IPI time series were used to match the pulses to the labeled ECG beat times. The method for detecting the pulses and synchronizing the PPG and ECG beat information is described in more detail in [17]. Examples of 30-second segments of the PPG waveforms and the corresponding IPIs for different rhythm types are presented in Fig. 1 and Fig. 2. Fig. 2 shows differences in the rhythm characteristics between different AFL subjects.
Table note: The values are presented as mean ± standard deviation over subjects. AF = atrial fibrillation, AFL = atrial flutter, HR = heart rate.
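As an illustration of the trough-based pulse detection and IPI computation described above, the following minimal sketch may be helpful. It is not the detection method of [17]: the function names, the simple neighbour comparison, and the 0.2 s minimum trough spacing are assumptions made for this example, and the input is assumed to be already downsampled to 64 Hz and bandpass filtered.

```python
import math

def detect_troughs(signal, fs, min_distance_s=0.2):
    """Return sample indices of local minima (troughs) in `signal`.

    A sample counts as a trough when it is lower than its left neighbour
    and not higher than its right neighbour. Troughs closer than
    `min_distance_s` to the previously accepted one are skipped
    (hypothetical refractory rule, not taken from the paper).
    """
    min_gap = int(min_distance_s * fs)
    troughs = []
    for i in range(1, len(signal) - 1):
        if signal[i] < signal[i - 1] and signal[i] <= signal[i + 1]:
            if not troughs or i - troughs[-1] >= min_gap:
                troughs.append(i)
    return troughs

def inter_pulse_intervals(troughs, fs):
    """Time difference between consecutive troughs, in milliseconds."""
    return [(b - a) / fs * 1000.0 for a, b in zip(troughs, troughs[1:])]

# Synthetic, noise-free "pulse wave" at 1.25 Hz (75 bpm), 10 s at 64 Hz.
fs = 64
ppg = [math.cos(2 * math.pi * 1.25 * n / fs) for n in range(10 * fs)]
troughs = detect_troughs(ppg, fs)
ipis = inter_pulse_intervals(troughs, fs)
```

On this synthetic signal the IPIs cluster around 800 ms, with small deviations caused by the 64 Hz sampling grid. Real PPG would additionally need the motion handling described in [17].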

C. Modeling Architecture
In this study, two models were developed: an AF classification model based on the traditionally used IPI variability patterns (benchmark model), and a multi-rhythm model to classify AF, AFL, and other rhythms. Fig. 3 shows the block diagram of the two models. The benchmark model takes as input only features computed from IPI series, whereas the multi-rhythm model uses the IPI series, PPG waveform, and accelerometer data as input for the feature computation.
The PPG, accelerometer, and IPI time series data were segmented in 30-second non-overlapping windows for computing the features and every window was labeled based on the rhythm. The beat labels from ECG were used as the ground truth. When the majority of beats were AF, the window was labeled as AF. This was the same for AFL. When there were any SVPBs in the window, and no AF or AFL, the window was labeled as SVPB, and the same for VPB. The windows were labeled as sinus rhythm when they did not contain any of the previously mentioned rhythms. If more than half of the beats were labeled as artifact, the window was discarded from the analysis.
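The window labeling rules above can be sketched as follows. This is an illustrative reading of the rules, not the authors' implementation: the label strings are hypothetical, SVPB is given precedence over VPB when both occur (the text does not specify this), and windows with a minority of AF or AFL beats are left unlabeled because the text does not describe that case.

```python
from collections import Counter

def label_window(beat_labels):
    """Label one 30-second window from its per-beat ECG labels.

    Returns "AF", "AFL", "SVPB", "VPB", or "NSR", or None when the
    window is discarded or the case is not covered by the stated rules.
    """
    n = len(beat_labels)
    if n == 0:
        return None
    counts = Counter(beat_labels)
    if counts["artifact"] > n / 2:   # more than half artifact: discard
        return None
    if counts["AF"] > n / 2:         # majority of beats AF
        return "AF"
    if counts["AFL"] > n / 2:        # majority of beats AFL
        return "AFL"
    if counts["AF"] == 0 and counts["AFL"] == 0:
        if counts["SVPB"] > 0:       # assumption: SVPB checked before VPB
            return "SVPB"
        if counts["VPB"] > 0:
            return "VPB"
        return "NSR"
    return None                      # minority AF/AFL: not specified in the text

examples = {
    "af": label_window(["AF"] * 30 + ["normal"] * 5),
    "svpb": label_window(["normal"] * 30 + ["SVPB"]),
    "nsr": label_window(["normal"] * 10),
    "noisy": label_window(["artifact"] * 6 + ["normal"] * 4),
}
```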
For binary classification, which was done with the benchmark model ((1) in Fig. 3), the classification was performed in three different ways. First, all the windows labeled as AF were considered as one class, and AFL, NSR, SVPB, and VPB together as another class, i.e. non-AF rhythms. Second, AFL was considered as the same class as AF, and finally, AFL was completely removed from the analysis. For the multi-rhythm classification ((2) in Fig. 3), AFL was separated as its own class, and NSR, SVPB, and VPB were treated as one class, to which we refer as 'other'. Premature beats were not separated as their own class because there were too few examples compared to AF, AFL, and NSR. In addition, they usually are not considered to require medical treatment.

D. Features
The features to be used for the classification were calculated for every 30-second window in two stages: first for the benchmark AF classifier using only IPI-features, and second for the multi-rhythm classifier with the aim of improving overall accuracy and classifying AF and AFL separately. All the features used in the analysis are summarized in Table III. Prior to the feature computation, IPIs that were shorter than 200 ms or longer than 2200 ms were excluded from the analysis. This was done because pulses could sometimes be falsely detected or missed, leading to incorrect IPIs. For more robust feature calculation, the features were calculated only if the window contained 20 or more IPIs.
For the benchmark AF classifier, six features characterizing the variability or entropy of the IPI series, which have been used for AF classification in the literature [22], [24], [30], [31], were computed:
• Shannon Entropy (ShEn)
• Normalized Root Mean Square of Successive Differences (nRMSSD)
• pNN40 and pNN70
• Sample Entropy (sampEn)
• Coefficient of Sample Entropy (CoSEn)
We have previously studied the discriminative power of these features individually for AF detection with PPG and compared that to ECG [17]. In the current paper, these features were used to build the first model to classify AF either with or without AFL.
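To make the IPI exclusion rule and some of the variability features concrete, a minimal sketch is given below. The formulas are common textbook formulations and may differ in detail from those used in the study; in particular, the 16-bin histogram and the normalization of the Shannon entropy are assumptions, and the sample entropy features are omitted for brevity.

```python
import math

def clean_ipis(ipis_ms, lo=200.0, hi=2200.0):
    """Drop IPIs shorter than `lo` or longer than `hi` (milliseconds)."""
    return [x for x in ipis_ms if lo <= x <= hi]

def nrmssd(ipis):
    """RMS of successive differences, normalized by the mean IPI."""
    diffs = [b - a for a, b in zip(ipis, ipis[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return rmssd / (sum(ipis) / len(ipis))

def pnn(ipis, threshold_ms):
    """Fraction of successive IPI differences exceeding `threshold_ms`."""
    diffs = [abs(b - a) for a, b in zip(ipis, ipis[1:])]
    return sum(d > threshold_ms for d in diffs) / len(diffs)

def shannon_entropy(ipis, n_bins=16):
    """Histogram-based Shannon entropy, scaled to [0, 1] (assumed form)."""
    lo, hi = min(ipis), max(ipis)
    if hi == lo:
        return 0.0
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in ipis:
        counts[min(int((x - lo) / width), n_bins - 1)] += 1
    n = len(ipis)
    return -sum(c / n * math.log(c / n) for c in counts if c) / math.log(n_bins)

def window_features(ipis_ms, min_ipis=20):
    """Return a feature dict, or None if fewer than `min_ipis` valid IPIs."""
    ipis = clean_ipis(ipis_ms)
    if len(ipis) < min_ipis:
        return None
    return {
        "ShEn": shannon_entropy(ipis),
        "nRMSSD": nrmssd(ipis),
        "pNN40": pnn(ipis, 40.0),
        "pNN70": pnn(ipis, 70.0),
    }

regular = window_features([800.0] * 30)
alternating = window_features([600.0, 900.0] * 15)
too_short = window_features([800.0] * 10)
```

A perfectly regular series gives zero for all four features, whereas the alternating series gives pNN70 = 1 even though its pattern is fully predictable. This is one reason why irregularity features alone may confuse patterned rhythms with AF.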
For the multi-rhythm classifier, additional features were calculated in order to improve the detection accuracy and enable separate AFL classification. The features were derived either directly from the PPG waveform, from the IPI series, or from the accelerometer data. The features of the PPG waveform were computed from the signal segments after subtracting the mean value and dividing by its standard deviation. The new feature set consisted in total of 16 features, which will be described here.
One category consists of features that have been considered to measure the signal quality of PPG. A feature that has been used for motion artifact detection is kurtosis [32]. It is a statistical measure describing the tails of the distribution defined as
$$\mathrm{kurtosis}(x) = \frac{E\left((x - \mu)^4\right)}{\sigma^4},$$
where μ and σ are the mean and standard deviation of x, respectively, and E(n) is the expected value of the quantity n.
In the PPG analysis domain, Hjorth descriptors called mobility and complexity [33], $H_1$ and $H_2$, have previously been used for analyzing the quality of the signal [34]. The descriptors have initially been developed for electroencephalogram (EEG) analysis [33] and represent the mean frequency and half the bandwidth, respectively. The descriptors are calculated from spectral moments of the signal, the nth order spectral moment being defined as
$$\bar{\omega}_n = \int_{-\pi}^{\pi} \omega^n S(e^{j\omega})\, d\omega,$$
where $S(e^{j\omega})$ is the power spectrum. From the moments with different orders, $H_1(n)$ is defined as
$$H_1(n) = \sqrt{\frac{\bar{\omega}_2(n)}{\bar{\omega}_0(n)}}$$
and $H_2(n)$ as
$$H_2(n) = \sqrt{\frac{\bar{\omega}_4(n)}{\bar{\omega}_2(n)} - \frac{\bar{\omega}_2(n)}{\bar{\omega}_0(n)}}.$$
The spectral moments were implemented in this work in the time domain according to [35]. A similar feature to $H_2(n)$, which also has its origin in EEG analysis, is Spectral Purity Index (SPI) [35], [36]:
$$\mathrm{SPI}(n) = \frac{\bar{\omega}_2^2(n)}{\bar{\omega}_0(n)\,\bar{\omega}_4(n)}.$$
Later SPI has been used also for ECG analysis in detecting false alarms of ventricular tachycardia and fibrillation/flutter [37], [38]. Recently, the method was also applied to PPG signals in order to distinguish ventricular arrhythmias from sinus rhythm and AF [23]. In addition, it has been considered as a signal quality metric for PPG [14].
Shannon Entropy in the time domain was included as a feature in the first stage to study the variability of the IPI series, but it can be also extended to the frequency domain. Spectral Entropy (SE) [39] measures the spectral complexity of the time series and can be calculated as
$$\mathrm{SE} = -\sum_{f} p_f \log p_f,$$
where $p_f$ is the power spectral density normalized with the total spectral power. Previously, Fallet et al. [23] have studied SE for discrimination of AF and ventricular arrhythmias from PPG.
In addition to the features from the frequency domain, additional IPI features were introduced. Modeling RR intervals as a Markov process has been used for AF detection initially by Moody and Mark [40]. In the model, each RR interval was considered to be either short, regular, or long. When every RR interval was assigned to one of these states, transitions between states and transition probability matrices for different rhythms could be calculated. From the transition probabilities, a score that represents the likelihood of the rhythm can be derived. With a similar approach having more states, a good performance in detecting AF from the IPI series has been recently shown [16]. In the current paper, the Markov model was used in order to distinguish between AF and AFL.
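For the waveform features introduced earlier in this subsection (kurtosis, the Hjorth descriptors, and SPI), a small time-domain sketch is given below. It assumes that the spectral moments of order 0, 2, and 4 can be approximated by the mean squares of the standardized signal and of its first and second differences; this is a common approximation and is not necessarily identical to the implementation of [35]. All function names are hypothetical.

```python
import math

def standardize(x):
    """Subtract the mean and divide by the standard deviation."""
    m = sum(x) / len(x)
    sd = math.sqrt(sum((v - m) ** 2 for v in x) / len(x))
    return [(v - m) / sd for v in x]  # fails for a constant signal (sd == 0)

def kurtosis(x):
    """Fourth central moment divided by the squared variance."""
    m = sum(x) / len(x)
    var = sum((v - m) ** 2 for v in x) / len(x)
    return sum((v - m) ** 4 for v in x) / len(x) / var ** 2

def mean_square(x):
    return sum(v * v for v in x) / len(x)

def hjorth_and_spi(x):
    """Return (H1, H2, SPI) from time-domain moment approximations."""
    d1 = [b - a for a, b in zip(x, x[1:])]    # first difference
    d2 = [b - a for a, b in zip(d1, d1[1:])]  # second difference
    w0, w2, w4 = mean_square(x), mean_square(d1), mean_square(d2)
    h1 = math.sqrt(w2 / w0)
    # Clamp at zero: finite-window effects can make the difference
    # marginally negative for almost pure tones.
    h2 = math.sqrt(max(w4 / w2 - w2 / w0, 0.0))
    spi = w2 * w2 / (w0 * w4)
    return h1, h2, spi

# 30 s of a pure 1 Hz tone at 64 Hz, as an idealized clean pulse wave.
fs = 64
tone = standardize([math.sin(2 * math.pi * n / fs) for n in range(30 * fs)])
h1, h2, spi = hjorth_and_spi(tone)
tone_kurtosis = kurtosis(tone)
```

For the pure tone, SPI is close to 1 and H2 is close to 0, which matches the interpretation of SPI as a measure of how well the signal is described by a single frequency. H1 is in radians per sample here.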
The transition probability matrices to model the Markov process were calculated for AF and AFL from the IPI series. These gave the score S, which indicates whether the rhythm is more likely to be AF than AFL. The score is computed from $p^{\mathrm{AFL}}_{ij}$, the probability that an interval belonging to state i is followed by an interval belonging to state j during AFL, and $p^{\mathrm{AF}}_{ij}$, the corresponding probability during AF. In addition, as a new feature, the same procedure was used to calculate a score when using pulse-to-pulse-to-pulse intervals (PPPI) instead of IPIs. PPPIs were calculated as the time difference between two fiducial points by skipping one pulse in between, the length of the interval corresponding to the sum of two consecutive IPIs. The patterns formed by PPPIs are different from those formed by IPIs, and this can be especially helpful for distinguishing when irregularity is due to an alternating IPI-pattern, such as during AFL, instead of due to AF. Both IPI and PPPI series were normalized and the scores filtered according to [40], with the coefficient k = 0.25. The number of states for the used model was 12. Because a score is produced for every interval, a mean of the scores in the window was taken.
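The PPPI construction and a transition-based score can be sketched as follows. Several parts are assumptions made only for illustration: the score is taken to be the logarithm of the ratio of the AFL and AF transition probabilities (the exact expression and its sign convention are not reproduced here), three states are used instead of 12, the state sequences are given directly rather than derived from normalized intervals, additive smoothing is applied to avoid zero probabilities, and the score filtering of [40] is omitted.

```python
import math

def pppi(ipis):
    """Pulse-to-pulse-to-pulse intervals: sums of two consecutive IPIs."""
    return [a + b for a, b in zip(ipis, ipis[1:])]

def transition_matrix(state_sequences, n_states, smoothing=1.0):
    """Estimate row-normalized transition probabilities from sequences."""
    counts = [[smoothing] * n_states for _ in range(n_states)]
    for seq in state_sequences:
        for i, j in zip(seq, seq[1:]):
            counts[i][j] += 1
    return [[c / sum(row) for c in row] for row in counts]

def mean_score(states, p_afl, p_af):
    """Mean log-ratio score over all transitions in one window.

    Assumed convention: positive values favour AFL, negative values AF.
    """
    scores = [math.log(p_afl[i][j] / p_af[i][j])
              for i, j in zip(states, states[1:])]
    return sum(scores) / len(scores)

# Hypothetical training sequences; states: 0 = short, 1 = regular, 2 = long.
afl_train = [[0, 2] * 8]
af_train = [[0, 1, 2, 2, 1, 0, 0, 2, 1, 1, 0, 1, 2, 0, 2, 1]]
p_afl = transition_matrix(afl_train, 3)
p_af = transition_matrix(af_train, 3)

score_alternating = mean_score([2, 0, 2, 0, 2, 0], p_afl, p_af)
score_irregular = mean_score([0, 1, 1, 2, 1, 0], p_afl, p_af)
pppi_alternating = pppi([800, 400, 800, 400])
```

The last line shows why PPPIs can help: an alternating 800/400 ms IPI pattern collapses into a constant 1200 ms PPPI series, whereas truly irregular intervals remain irregular after summation.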
To include information reflecting the heart rate, the maximum, minimum, median, and standard deviation of the IPIs were included as features. In addition, from the PPG waveform, maximum, minimum, mean, and standard deviation of the pulse amplitude, i.e. the difference between the peak and the onset of the pulse, were calculated.
Finally, from the accelerometer data the norm of the accelerations on the three axes was calculated. From the norm, the standard deviation and the maximum absolute value in each window were included in the feature set.
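The two accelerometer features can be illustrated with the short sketch below. The function and key names are hypothetical, the population standard deviation is assumed, and no gravity removal or filtering is applied because the text does not describe any.

```python
import math

def accel_features(ax, ay, az):
    """Standard deviation and maximum absolute value of the 3-axis norm."""
    norm = [math.sqrt(x * x + y * y + z * z) for x, y, z in zip(ax, ay, az)]
    mean = sum(norm) / len(norm)
    std = math.sqrt(sum((v - mean) ** 2 for v in norm) / len(norm))
    return {"acc_std": std, "acc_max_abs": max(abs(v) for v in norm)}

# Three hypothetical samples with norms 1, 5, and 1 (in g).
features = accel_features([0.0, 3.0, 0.0], [0.0, 4.0, 0.0], [1.0, 0.0, 1.0])
```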

E. Feature Selection
Feature selection was employed to select the optimal set of IPI variability features for the benchmark AF classification models. Based on our previous work [17], all the six features reflecting variability or entropy of the IPI sequence have individually a strong discriminative power in AF classification. Moreover, all these features try to capture relatively similar information and could be redundant. Therefore, the Minimal-Redundancy-Maximal-Relevance (mRMR) criterion [41] was selected as the method to rank the features.
The mRMR method tries to find features that maximize the mean value of mutual information between all individual features and the target class, i.e. the maximal relevance. However, it is likely that these features have a large dependency on each other. Therefore, the minimal redundancy criterion is added, and it is based on the mutual information between the individual features. The balance between the two criteria is optimized by finding the set that maximizes the difference between the maximal relevance and minimal redundancy.
The calculations were made with the implementation provided by the authors of [41] on [42]. The method requires discretization of the features and that was made by having three states when using thresholds at mean ± standard deviation [42].
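The ranking procedure can be illustrated with a compact greedy sketch of the difference form of mRMR, together with the three-state discretization at mean ± standard deviation. This is not the implementation of [42]; the function names and the toy features are hypothetical, and ties are resolved by input order.

```python
import math
from collections import Counter

def discretize(values):
    """Map values to -1, 0, or 1 using thresholds at mean -/+ one SD."""
    m = sum(values) / len(values)
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return [-1 if v < m - sd else 1 if v > m + sd else 0 for v in values]

def mutual_information(a, b):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(c / n * math.log((c / n) / ((pa[x] / n) * (pb[y] / n)))
               for (x, y), c in pab.items())

def mrmr_rank(features, target):
    """Greedy ranking: relevance minus mean redundancy with selected ones.

    `features` maps a feature name to its discretized values.
    """
    remaining, selected = list(features), []
    while remaining:
        def score(name):
            relevance = mutual_information(features[name], target)
            if not selected:
                return relevance
            redundancy = sum(mutual_information(features[name], features[s])
                             for s in selected) / len(selected)
            return relevance - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy example: "b" duplicates "a", while "c" carries independent information.
toy = {
    "a": [0, 0, 0, 0, 1, 1, 1, 1],
    "b": [0, 0, 0, 0, 1, 1, 1, 1],
    "c": [0, 0, 1, 1, 0, 0, 1, 1],
}
toy_target = [0, 0, 1, 1, 2, 2, 3, 3]
ranking = mrmr_rank(toy, toy_target)
states = discretize([0, 5, 5, 5, 10])
```

In the toy example all three features are equally relevant on their own, but the duplicate feature is ranked last because it adds nothing once its copy has been selected.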

F. Classifiers
The benchmark AF classifier (Fig. 3 (1)) for the binary classification using the selected IPI-features as input was a generalized logistic regression model. The probability for the 30-second window to contain AF was given by the function
$$p(t) = \frac{1}{1 + e^{-b^{\mathsf{T}} X(t)}},$$
where t is the index of the window, X(t) a vector containing the feature values for the window at time t, and b a vector of the model coefficients.
The threshold for the probability was selected as the one that maximizes Youden index J [43] defined as
$$J = \frac{TP}{TP + FN} + \frac{TN}{TN + FP} - 1,$$
where TP are true positives, FN false negatives, TN true negatives, and FP false positives.
The multi-rhythm classifier (Fig. 3 (2)) to perform the classification into AF, AFL, and other, was a Random Forest (RF) model [44] taking as input the selected IPI-features and the additional 16 PPG, IPI, and accelerometry features. RFs are ensembles of decision trees that are grown in parallel by selecting a random subset of features to grow each tree. The final classification is based on a majority vote given by the classifications of each individual tree. RFs have a few beneficial characteristics, such as that they are relatively robust to noise and outliers, and can give useful internal estimates about the error and importance of the variables. The latter is an advantage with the small dataset because performing a feature selection for the multi-class problem is more challenging than for the binary case.
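The threshold selection for the benchmark classifier can be sketched as below. The logistic function is written without an explicit intercept, which is an assumption (an intercept can be included by appending a constant 1 to the feature vector), and the probabilities and labels are hypothetical values chosen only to illustrate the search. Both classes must be present in the labels.

```python
import math

def logistic(x, b):
    """Logistic function of the inner product of features x and weights b."""
    return 1.0 / (1.0 + math.exp(-sum(xi * bi for xi, bi in zip(x, b))))

def youden_threshold(probs, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A window is classified as positive when its probability is greater
    than or equal to the threshold. Labels are 1 (AF) or 0 (non-AF).
    """
    best_thr, best_j = None, -1.0
    for thr in sorted(set(probs)):
        tp = sum(p >= thr and y == 1 for p, y in zip(probs, labels))
        fn = sum(p < thr and y == 1 for p, y in zip(probs, labels))
        tn = sum(p < thr and y == 0 for p, y in zip(probs, labels))
        fp = sum(p >= thr and y == 0 for p, y in zip(probs, labels))
        j = tp / (tp + fn) + tn / (tn + fp) - 1.0
        if j > best_j:
            best_thr, best_j = thr, j
    return best_thr, best_j

# Hypothetical window probabilities and reference labels.
probs = [0.1, 0.2, 0.35, 0.4, 0.6, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1]
threshold, j_max = youden_threshold(probs, labels)
```

With these values the selected threshold is 0.4, where all positive windows are detected and three of the four negative windows are rejected.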

G. Cross-Validation
For selecting the number of features for the benchmark AF classifier, leave-one-subject-out cross-validation with the training set was used. The performance was calculated by using the data of one subject for testing and the data of the remaining subjects for training. The number of features went from one to six by adding the features in the order given by the mRMR. The number of features reaching the highest accuracy was selected.
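A leave-one-subject-out split can be generated with a few lines of code. The sketch below is illustrative only; the function name and the subject identifiers are hypothetical. Its key property is that all windows of the held-out subject are removed from the training indices.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) per subject.

    `subject_ids` holds one subject identifier per 30-second window.
    """
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test

# Six hypothetical windows from three subjects.
window_subjects = ["p1", "p1", "p2", "p3", "p3", "p3"]
folds = list(leave_one_subject_out(window_subjects))
```

Each fold would then be used to train a model with the first k ranked features and to evaluate it on the held-out subject, for k from one to six.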
The multi-rhythm classifier was also evaluated with the training set by using 10-fold cross-validation. In order to maintain an equal distribution of the three classes in every fold, the dataset was divided in 10 sets of equal size that were stratified by the classes but not by patients because of the number of AFL patients. The training and testing was performed 10 times with each of the 10 sets serving as a test set once.

III. RESULTS
A. Training Set
The mRMR selection method on IPI features was used for each of the three class divisions. For AF vs. non-AF classification, the method gave the following ranking, regardless of whether AFL was included in the non-AF rhythms or completely excluded from the analysis: pNN70, sampEn, ShEn, CoSEn, pNN40, nRMSSD, starting from the most relevant one. When AF and AFL vs. other classification was considered, only pNN70 and pNN40 switched places with each other in the aforementioned ranking. Based on the leave-one-subject-out cross-validation, the best accuracy was obtained by combining the first three features. The Receiver Operating Characteristic (ROC) curves of the models combining these three features are presented in Fig. 4 for the cases in which AFL is included in the non-AF rhythms, grouped with AF, or completely excluded from the analysis. The results are calculated with 43.6% of the data, after excluding the windows that did not contain a sufficient number of IPIs or for which the reference was labeled as artifact. The median (lower-upper quartiles) coverage per patient was 47.0% (29.6-56.2%). Fig. 4 shows the operating points for the three binary AF classifiers selected by maximizing J. The AF vs. AFL and other model had a sensitivity of 93.6% and a specificity of 88.2%. Misclassification of AF occurred primarily in the presence of SVPB and AFL, i.e. other arrhythmias originating from the atria, as shown in Fig. 5 on the left. The presence of VPB caused relatively few false detections of AF. The model that classified AF and AFL vs. other had better sensitivity and specificity (95.2% and 90.4%). With this model the false negative classifications were mainly due to AFL, as shown on the right of Fig. 5. The best performance for binary classification, with a sensitivity of 98.2% and a specificity of 90.9%, was obtained when AFL was not present in the data.
The multi-rhythm classifier was an RF model consisting of 100 trees and classified the 30-second windows into AF, AFL, or other rhythm. The results calculated with cross-validation are listed in Tables IV and V. Table IV is the confusion matrix of the classifications and in Table V the results are presented as one class vs. all the rest in terms of sensitivity, specificity, positive predictive value (PPV), and accuracy.
TABLE IV. Confusion matrix of the classifications with the multi-rhythm classifier of the training set.
TABLE V. Classification performance of the multi-rhythm classifier with the training set.
TABLE VI. Confusion matrix of the classifications with the multi-rhythm classifier of the test set.
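The one-class-vs-rest metrics reported for the multi-rhythm classifier follow directly from a confusion matrix, as the sketch below shows. The counts are hypothetical and are not the values of Tables IV to VII; rows are assumed to be the reference classes and columns the predicted classes.

```python
def one_vs_rest_metrics(confusion, classes):
    """Sensitivity, specificity, PPV, and accuracy per class vs. the rest."""
    total = sum(sum(row) for row in confusion)
    metrics = {}
    for k, name in enumerate(classes):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                # missed windows of class k
        fp = sum(row[k] for row in confusion) - tp  # other windows called k
        tn = total - tp - fn - fp
        metrics[name] = {
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "ppv": tp / (tp + fp),
            "accuracy": (tp + tn) / total,
        }
    return metrics

# Hypothetical window counts (rows: reference, columns: prediction).
classes = ["AF", "AFL", "other"]
confusion = [
    [90, 5, 5],
    [10, 80, 10],
    [2, 3, 195],
]
metrics = one_vs_rest_metrics(confusion, classes)
```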

B. Test Set
The results of the test set were first calculated with the benchmark AF models. The test set had a coverage of 41.6%, with the remaining windows excluded due to insufficient IPIs or the ECG reference being labeled as artifact. The median (lower-upper quartiles) proportion of windows included for each subject was 42.2% (31.1-50.1%).
For AF vs. AFL and other classification, the sensitivity and specificity were 96.1% and 96.2%, respectively. When AF and AFL were included in the same class against the rest, the sensitivity decreased to 58.9% and specificity to 92.5%. The best classification performance was again obtained when AFL was completely excluded from the analysis, the sensitivity being 99.1% and specificity 95.4%.
The results of the multi-rhythm model are presented in Tables VI and VII. The sensitivity and specificity for detecting

IV. DISCUSSION
This is the first study showing that detection of both AF and AFL is possible from PPG data in daily life. We demonstrated that many of the false positive classifications of a benchmark AF classification model were due to instances of AFL. When AFL was grouped with AF in the same class, it was often missed by the benchmark model. The presented multi-rhythm classification algorithm showed much improved performance, particularly by making fewer false positive AF detections, when trained to classify both AF and AFL. As the results with the benchmark AF models show, the rhythm characteristics of AFL, as described by the IPI-features, did not belong clearly to either of the binary classes, i.e. AF and other, which decreased the classification performance. Therefore, considering AFL as a separate class can be beneficial also in terms of improving AF detection and not only for classifying the rhythm type itself.
Combining information from the PPG signal, IPIs, and accelerometer improved the classification accuracy and enabled discrimination of AFL from AF and the other rhythm types. Adding features from the PPG waveform helped in detecting AFL compared to using only IPI-information. In previous work, a comparison of PPG pulses gave different results in a patient suffering from a regular form of typical AFL than in patients with AF or other rhythms [45]. Moreover, some of the features derived from the PPG waveform have been used to discriminate ventricular arrhythmias [23]. The PPG waveform characteristics have been also studied in the context of force-interval relationship during AF [46] and mechanical alternans [47]. This could indicate that the PPG waveform itself contains information about different rhythms and cardiac function.
Discriminating AFL from AF has been also possible based on RR interval series when a multilevel model of the atrioventricular node was used [48]. In the current paper, we included the Markov model approach to process the IPI and PPPI series in order to classify these two rhythms. Thus, the RR intervals or IPIs also contain valuable information when processed in an adequate manner.
The validity of the windows to be analyzed was judged based on the number of pulses detected in that window, but no further signal quality analysis was made. The method detecting the pulses already considers the body acceleration and therefore the number of pulses indirectly already reflects this. In addition, some of the features included in the RF model have been developed or used previously for PPG signal quality assessment, and information of the body acceleration was also given as an input. This may have improved the classification of the more noisy segments.
The study suffers from some limitations. In the dataset, all the patients who suffered from AF had it continuously. For datasets with intrapatient rhythm variability, data measured before and after electrical cardioversion have been used in different studies to investigate AF detection with PPG. This reflects a hospital setting but not ambulatory monitoring, which is where the wrist-worn applications can really add value. Studies that show AF detection with PPG in an ambulatory setting are very few. Shen et al. [49] had measurements from 3 to 8 hours and mention subjects with rhythms that change over time. However, these include AF and eight other rhythms, and the proportion of paroxysmal AF remains unclear. Sološenko et al. [20] presented in their study a dataset measured for approximately 22 hours per subject in cardiac rehabilitation. Similarly to our study, all the AF subjects had continuous AF.
The number of AFL subjects in the dataset was smaller than in the other groups. AFL can have different rhythm characteristics depending on the type and can therefore vary between subjects. The results between the training set and test set in AFL classification differ slightly due to the different types of AFL cases, e.g. very regular AFL being misclassified as other rhythm in the test set. Yet, the results remain relatively similar when compared to the benchmark AF models.
Premature beats were not separately classified in this work because of the small number compared to other beat types nor were they suppressed in order to reduce false positive detections. Furthermore, the classification was done in windows which makes identifying individual beats difficult. As Fig. 5 shows, premature beats had an impact on false positives when only IPI-based features were used in the classification. Adding PPG-waveform and accelerometry features helped in improving specificity by also reducing false positives caused by premature beats and not only by AFL. Understanding the effect on the classification accuracy caused by higher burden of premature beats and variability in their beat patterns between patients remains for future research.
The dataset was divided into training and test set in order to leave some of the recordings untouched while developing the models. The split was done by patients and no data from a patient assigned to the training set ended up in the test set. Therefore, it was not possible to match the rhythm class distributions completely between the two sets. The characteristics of the sets may, therefore, not be entirely comparable, which is reflected especially in the results of the benchmark AF model.
The choice to use RF was made because of the class distribution and the ability of RF to give information about the feature importance. The model has many advantages, but one drawback is poor interpretability. For clinical applications, more transparent options of models may be more suitable when larger datasets are available.
The selected approach was based on calculating the features in 30-second non-overlapping windows. The features based on the IPIs require a sufficient amount of data for the calculations, and missed pulses caused the data to be discarded, thereby reducing coverage to approximately 45%. The coverage of our approach could be improved by reducing the number of IPIs required per window or by using overlapping windows. However, the effects on the classification performance should be studied. Sološenko et al. [20] used an approach to classify every beat separately, which resulted in a higher coverage (89.2%) during 24-hour measurements. However, the sensitivity for AF with this coverage was only 72.0%. When 50% of the data was judged as analyzable, the sensitivity increased to 97.2% and specificity was 99.6%.

V. CONCLUSION
In this study, we demonstrated that PPG and acceleration measurements at the wrist can be used to discriminate between AF, AFL, and other rhythms in daily life. We showed that with an AF vs. non-AF model and AF and AFL vs. other rhythms model that used only information derived from inter-pulse intervals, the false detections were for a large part caused by AFL. The multi-rhythm model included more information from the wrist measurement, such as features from the PPG waveform and accelerometer data. This model was not only able to improve the overall performance of AF detection, but could also classify AFL with high accuracy. The results of this study indicate that the PPG signal contains sufficient information, derived both from the waveform and IPIs, to accurately classify between AF, AFL, and other rhythms. Thus, PPG could provide promising means to detect AFL along with AF.