Standardized Gaussian Dictionary for ECG Analysis a Metrological Approach

An approach based on dictionary-based Gaussian decomposition of electrocardiogram (ECG) traces is presented and characterized, and its performance potential is demonstrated using traces from the MIT-BIH Arrythmia Database. A Gaussian model is employed to describe ECG morphology. Parameters are estimated using a dictionary-based approach, that is purposely designed to obtain accurate representations with limited complexity and ensure comparability among different traces and subjects. The standardized Gaussian dictionary allows compact representations, enhances comparability and provides the support for machine learning-based diagnostics of ECG traces. Data-oriented large-scale medical analyses of ECG data are made possible, allowing the investigation of elusive cardiac phenomena and personalized diagnostics.


I. INTRODUCTION
E LECTROCARDIOGRAM (ECG) recordings are among the most frequently employed diagnostic tools in medicine, as people of all ages routinely get ECG health checks [1]. Many individuals can thus have a large number of ECG traces, of varying quality and length, associated with them over their lifetimes. The availability of such extensive information raises the possibility of systematic investigation into ECG patterns, their variability, relevant features and trends for possible pathology precursors, that could help develop diagnostics.
Objective comparison of individual ECG traces by means of a standard set of reference features is a key requirement for this kind of big-data inspired study. Prerequisites for a quantitatively oriented analysis are the essential metrological requirements of repeatability and comparability of the analyzed feature set. Consideration must also be given to the nature of features. Bioelectrical signals like ECG traces, electromyograms, electroencephalograms, etc. are often analyzed by considering, for instance, energy in a certain bandwidth, interquantile range, time interval between fiducial points [2], [3], [4]. These features summarize essential characteristics of a waveform for a certain purpose, consequently they may lack generality.
Reconstruction of the original waveform might not be possible, which hardly suits the requirement of general comparability among traces.
In this work we propose a morphological feature set that can be introduced as an explicit intermediate step between a plain ECG trace and its annotations produced by clinicians. The approach is based on Gaussian modelling, that has been shown in the literature to achieve accurate ECG trace representations in a wide variety of cases [5], [6]. A well-designed and wellfitting model can reproduce the morphology of a cardiac period by a limited number of suitable Gaussian components, uniquely identified by a small number of parameters [7]. Therefore, a whole trace can be associated to a very compact parameter set, representing a unique "ECG fingerprint" [8]. As with actual fingerprints, this parameter set can be standardized and, to this aim, we provide criteria for the definition of an objective reference grid of parameter values, exclusively based on the analysis of uncertainty for ECG traces.
The conceptual novelty is the "application independent" nature of the ECG feature set, that satisfies the metrological requirements of repeatability and comparability, and can provide a basis for reliably sharing diagnostic-related medical data. It could be argued that the proposed set does not include any of the specific features in use for ECG trace analyses. However, since an "ECG fingerprint" preserves measurement information, a suitable explicit mapping, or possibly some form of machine learning training can be expected to create links to other trace features of common interest and actually help improve the quality of examination.
In the next Section Gaussian modelling for ECG traces is introduced and relevant literature is discussed. The design of the Gaussian dictionary and criteria to define the associated parameter grid are presented in Section II-B. Section III presents results obtained by the use of the "ECG fingerprint", with examples of monitoring ECG traces taken from a wellknown reference data set.

A. ECG SIGNAL MODEL
A general framework for morphological modelling of heart beats by ECG signal decomposition is presented in [9]. Using Gaussian functions, a single heart beat can be represented by the sum: where α i is the magnitude of the i-th component, τ i is its time position relative to a reference point and σ i is a shape-related (dispersion) parameter. At least one Gaussian component is needed for each elementary ECG wave (P, Q, R, S and T), but two more are usually added to help describe asymmetries in P and T waves, making seven in all [7], [10], [11]. In practice, shapes of actual ECG recordings may not exactly follow this model. Our analysis of acquired waveforms consistently showed that rigid allocation to specific elementary waves must be avoided, in which case a pool of seven components can still be employed, resulting in a compact representation of the ECG trace. The use of a Gaussian model was proposed in [12] for the generation of realistic synthetic ECG traces. ECG compression and classification were discussed in [7], whereas [13] aimed at identification and removal of contaminants in the cardiac trace. Algorithms of significant computational complexity are often involved, since signal model (1) is nonlinear in the parameters τ i , σ i . Approaches proposed in the literature include gradient descent [7], non-linear Kalman filtering [11], [14], extended Kalman filtering/smoothing [15].
More recently, dictionary-based signal decomposition has been considered for ECG signals. A dictionary is obtained by selecting a finite set of K functions from a family D = {φ γ }, where γ is a generic parameter set. A single dictionary element (atom) can be associated to an index k and indicated as φ γ k (t), where γ k denotes specific values of γ . Since k univocally identifies γ k we use notation φ k (t) for simplicity. The model for a single heart beat takes the general form: where each atom is associated with a non-zero magnitude value a i and I(x) is a signal-specific index subset, whose cardinality |I(x)| is assumed to be small for any signal of interest [16]. It should be remembered that ECG trace analysis starts with segmentation into single heart beats, after which signal model (2) is fitted to each segment. Dictionary-based decomposition yields the index set I(x) and the set of amplitudes A dictionary may be constructed from ECG-specific waveform templates, as reported for instance in [17]. In this case features are specific to a given subject or to the particular set of traces forming the training set. For Gaussian dictionarybased ECG analysis, promising results are shown in [5] and the approach has been successfully employed for compressive sensing of fetal ECG signals [18]. Even these works, however, do not specifically address the aspect of generality, which is the main focus of this work.
Our preference for a pre-defined Gaussian dictionary as a standard general representation tool is summarized by the following reasons: 1) the suitability of Gaussian kernels for the morphological description of ECG waveforms is demonstrated in several works, for instance, [10], [14], [15]; 2) model (1) is adaptable and can be applied to nonpathological, as well as to pathological traces. Although some loss of accuracy might be incurred with the latter, a Gaussian representation can be quite general; 3) computational costs of feature extraction algorithms are acceptable for a range of medical and healthcare equipment, including comparatively unsophisticated devices. These characteristics allow to meet challenges involved in defining a common feature set for waveforms acquired by various devices under different conditions, in particular: • achieve robustness against noise and artifacts in recorded traces, and • provide compact representations of trace data, that can be useful for archiving, transmission and efficient analysis.

B. GAUSSIAN DICTIONARY DESIGN
Functions in a Gaussian dictionary have the general expression: where u ∈ R is a location parameter and s ∈ R + is a scale parameter.
In practice a segment would be a finite-length sequence of samples represented by vector x = [x(n 1 T s ), . . . , x(n 2 T s )] T , where T s is the sampling interval, n 1 and n 2 are the initial and final segment indices, N = n 2 − n 1 + 1 and superscript 'T' denotes transposition. In this case a dictionary is a matrix D with size N ×K and its elements are the column vectors of N sampled values of (3), characterized by distinct parameter couples: Since only finite discrete sets of parameter values can be considered in a dictionary, a suitable two-dimensional grid needs to be defined for γ = {s, u}. For the purpose of discussing design criteria, we keep using continuous-time functions. In this case dictionary elements are vectors in a Hilbert space, where the vector product between signal x(t) and one of the functions φ k (t) is defined by the integral: The function φ s,u (t), as defined in (3), is normalized so that: When index set I(x) is given, signal model (2) is linear with regards to parameters a i , whereas the dependence on non-linear parameters u and s is accounted for in the selected set of functions φ k (t). The decomposition can be written as: where r x (t) is the waveform estimation residual. Accuracy is determined by two aspects: 1) the quality of the dictionary, that should offer an appropriate variety of functions to choose from. In this regard, the design problem translates into the determination of two finite sets of values for u and s that enable sufficiently accurate representations for ECG trace segments; 2) the effectiveness of the selection process leading to an index set I(x) with any segment. To ensure generality in our study, we assume that any ECG trace segment can be described by model (1), whose effectiveness is well proven. We disregard superpositions for simplicity, so that each elementary Gaussian wave in (1) can be considered individually when the projection < x(t), φ {s,u} (t) > is analyzed.
The effect of a discrete parameter grid on estimation accuracy can be analyzed as a quantization problem. We introduce a standard criterion for defining parameter quantization steps and ranges, based on the analysis of the effect of gridding on reconstructed waveform accuracy. Let s and u be the grid steps, respectively for s and u, with s , u > 0. For a Gaussian wave with parameters σ , τ the two conditions |s − σ | ≤ s 2 and |u − τ | ≤ u 2 will be satisfied.
Step sizes determine resolutions for the estimates of parameters σ and τ , lower bounding measurement uncertainties accordingly.
Dropping subscripts for simplicity, the projection on φ {s,u} (t) of a Gaussian elementary wave having amplitude α and parameters σ , τ is given by the closed-form expression: where: Because of the normalization in (3), the amplitude estimate is: and its relative deviation has the general expression: Plots of (10) are shown in Fig. 1, where the abscissa is the time location difference u − τ normalized by the wave parameter σ , and plots are parameterized by the ratio s/σ . Obviously, zero relative deviation is achieved when s = σ and u = τ . Since discrete parameter values are considered in the dictionary, quantized values must be assumed instead. Given step sizes s and u it is: therefore it suffices to analyze the behaviour of (10) in the neighbourhood of s σ = 1 and |u−τ | σ = 0. Horizontal lines in Fig. 1 illustrate the boundaries corresponding to ±5% relative deviation. It can be seen that too narrow dictionary elements (that is, s σ < 1) overestimate amplitude, whereas wider ones ( s σ > 1) lead to underestimation. When u = τ , that means the time location of a Gaussian elementary wave is exactly matched by a dictionary element, relative amplitude deviation remains within the ±5% boundaries with scale parameter variations up to ±10%.
Time location resolution is important for the accurate determination of ECG fiducial point position, therefore we refer to the ECG sampling interval, setting a uniform quantization step for u: By this choice gridding effects become mostly negligible for time location.
For the scale parameter s the dependence of relative amplitude deviation on the ratio s σ suggests the adoption of a logarithmic grid. This can be built according to the rule: where s min ≤ σ ≤ s MAX is assumed to hold for all possible values of σ and s MAX = s min · (1 + δ s ) H−1 . Setting δ s determines the relative resolution for the estimate of σ . From a review of physiological values we set s MAX ∼ = 90 ms as the upper end for the range of dispersion values s. An objective criterion to determine the lower end may be derived by considering that any measured ECG trace is band-limited by the data acquisition system bandwidth B. Therefore, dictionary elements are not expected to match any elementary wave with a broader bandwidth. Considering the relationship between the −3 dB bandwidth of a Gaussian function of time and its dispersion parameter, a lower bound is then provided by: We chose s min = 2 ms, corresponding to slightly wider than B = 55 Hz, a common bandwidth value for ambulatory electrocardiographic systems complying to IEC 60601-2-47:2012. The range of values should be enough to account for possible morphological variations in the ECG signal and can be covered in 10% increment steps by setting H = 41.
It must be emphasized that the analysis in this Section does not rely on any specific or individual ECG feature. Achieving a satisfactory trade-off between dictionary size and waveform estimation accuracy is far from trivial in practice. Sufficiently fine parameter granularity is needed to preserve the quality of clinical information conveyed by an ECG trace. Using finer steps may reduce approximation distortion, but only up to the intrinsic limits of model (1). Conversely, too fine a granularity would result in an unmanageably large dictionary, making the approach impractical and, possibly, numerically unstable.
Criteria outlined here lead to a dictionary with K = N · H elements, that is a N × N · H dictionary matrix D, admittedly a rather large size. However, the result is a standardized dictionary, that can be pre-computed and stored in any device. Therefore, it only introduces a memory size requirement that can be easily met with low-cost storage devices. On the other hand, ECG analysis based on a standard dictionary can ensure broad comparability among traces acquired by different devices.

C. TRACE DECOMPOSITION ALGORITHM
Determination of the most representative index sub-set I(x) is crucial to the accuracy of dictionary-based signal decomposition. Results may partly depend on the algorithm employed for this purpose, therefore a brief outline is necessary before progressing further in the analysis.
We estimate parameters in (2) by orthogonal matching pursuit (OMP) [19], a recursive "greedy" algorithm that iteratively selects model components. Trace analysis actually begins with a pre-processing stage where low frequency noise and baseline wander are removed. Their contribution is extracted by a cascade of two median filters (window lengths 200 ms and 600 ms, respectively), then subtracted from the original ECG signal [20]. A peak detector next determines the positions of R-wave peaks, creating a sequence of time values T = {T l }, where l is the heart beat index. Taking these positions as reference points, a corresponding sequence of sample vectors x l is produced. Each segment is decomposed by OMP, that yields the two sets A l and I l (x). The most compact representation of a single heart beat is the set l = {T l , I l (x), A l }, since corresponding values of u i and s i are determined by the dictionary when I(x) is given. The sequence = { l } provides the "ECG fingerprint" for a full trace.
Dropping the subscript l for simplicity, OMP can be described as the iterative application of a number of steps, starting from the discrete vector x. Initialization sets the estimated signal tox = 0 and the set of selected dictionary indices to I(x) = ∅. The algorithm performs the following operations: 1) compute r = x −x, then find the dictionary index k * = arg max k |d T k r| 2

D. RESIDUAL PEAKS
OMP performance is critically dependent on the effectiveness of the search for k * in the first step of each iteration. Index k * indicates the dictionary element on which the projection of r is largest. However, discretization of parameters s and u implies that in general an exact match with σ and τ is not possible.
Mismatch can cause spurious peaks to appear in subsequent matching pursuit iterations, preventing correct component detection. To assess this effect we consider the difference ψ(t) = φ s,u (t) − φ σ,τ (t) and analyze the projection < ψ(t), φ s,u (t) >, that can still be described by a closed-form expression.
This provides the residual peak amplitudes that are plotted in Fig. 2, in percent of the original peak amplitude, as functions of the time offset from the original peak position, normalized with respect to σ . Plot parameters are referred to the dictionary element selected in association with the original peak: • Fig. 2(a) -normalized position mismatch u−τ σ ; • Fig. 2(b) -dispersion mismatch, expressed by the ratio s σ . Fig. 2(a) shows, for instance, that a position mismatch not greater than 0.1·σ is needed to keep the residual peaks below 5%. Dispersion mismatch in Fig. 2(b) appears to be less critical. In general, the design criteria discussed in Section II-B for the choice of grid steps appear to be adequate to also address the problem of residual spurious peaks.

E. NOISE SENSITIVITY
The smaller values of s are well-suited for detection of ECG elementary waves in the shorter-duration QRS complex, where R peaks are usually well above the background noise threshold. Smaller signal components can often be buried in noise so that detection, in particular for P waves, becomes harder. Experience showed that narrower Gaussian functions are more liable to produce spurious peaks caused by the presence of noise. For this reason, we need to set additional limits to prevent the algorithm from interpreting noise bursts as spurious Gaussian components.
Let us assume that, in some part of the ECG segment, residual signal components have become comparable with Gaussian white noise, possibly after some iterations. To analyse the problem, we consider the projection: where the random process n(t) has zero mean and finite variance σ 2 n . We are interested in the properties of r n,φ (s, u), which is also a Gaussian random process. Its variance σ 2 r , that is independent of u because of the stationarity of n(t), allows an assessment of noise-related artifacts in the first step of OMP iterations and the consequent probability that a noise-related peak exceeds signal-related peaks. From (15) it follows: where s,u (f ) is the Fourier transform of φ s,u (t) and the final equality holds because of normalization (5). A false detection occurs when peaks due to noise exceed signal-related ones, therefore a reasonable criterion to avoid misdetection is to simply set a threshold equal to κ · σ n . For the algorithm described in Section II-C, as σ n is not known, we consider an estimate obtained from the residual: σ n = 1 N r T r. Indicating byα k the amplitude estimate for a candidate Gaussian dictionary component of index k, this is accepted into the set I(x) only if: Therefore, OMP iterations are stopped when one of the following conditions is met: • a maximum allocated number of Gaussian kernels has been reached; • percent relative deviation (PRD) ofx from the analyzed segment x drops to 1% or lower; • peaks found in the first step of the OMP iteration can no longer be reliably discerned from noise.

III. RESULTS
The aim of our study is to prove the value of a suitably designed Gaussian dictionary as a metrological tool for comparability of ECG recordings in a population. For this reason, only objective criteria referring to quantization, approximation and the robustness to noise and artifacts have been considered. Any input from training with experimental data sets has been intentionally avoided, since generality is emphasized. In this way, any recorded trace is compared against the same standard framework. Results are presented for the ECG traces in the MIT-BIH Arrhythmia Database [21], available at https://physionet.org [22]. This very well-known database contains a set of two-channel ambulatory ECG recordings lasting about 30 minutes each, which means a total of over 100,000 heart beats. These traces have been extensively employed for research and represent a thoroughly studied set of measured data.
Importantly for our work, all traces have been carefully annotated with beat labels, indicating either normal beats or anomalies of different kinds, positioned to coincide with R-wave peaks. This enables us to rely on time positions given in trace annotation files for segmentation, thereby avoiding the possible influence on dictionary-based decomposition of a specific peak detection algorithm. Annotations then allow to tell whether specific patterns in the estimated parameters can be related to information of medical relevance.
We emphasize that dictionary-based analysis should be seen as an intermediate step between trace acquisition and medical interpretation for diagnostic purposes. Annotated traces help validate the concept and, for this reason, our study is strictly limited to general data analysis tools, such as diagrams and histograms. We selected a single ECG trace from each record, indicated as modified limb lead II (MLII), that is obtained by electrodes placed on the chest. This yielded the greatest amount of data, since a trace for this lead was present in 46 records, of which 21 (numbered in the 100s) had been randomly chosen from a set of over 4000 long-term Holter recordings, while 25 (numbered in the 200s) were selected from the same set to include a variety of rare but clinically important phenomena.
The kind of systematic study envisaged in the opening discussion would compare full-trace sets among different subjects, or among recordings referring to the same subject under different conditions, or at different times, e.g., after some months, or years.
Traces in the database under study did not provide this kind of variety. However, as a preliminary observation, we present in Fig. 3 a set of graphs comprising nine different subjects, four male and five female aged between 23 and 83, drawn from among the 47 individuals making up the full set. For each segment, that corresponds to a single heart cycle, estimated amplitudesα i are plotted versus location parameter u i and all segments within a trace are superposed in the plot. Time reference t = 0 corresponds to the R-peak location, accordingly the location parameter value u on the abscissa is given as "delay from reference". Only normal beats are considered.
As discussed in [21], all traces were produced by a set of recording equipment of the same model using standard electrode placements, therefore variations can be related to individual subject features. The purpose of Fig. 3 is to evidence the variety of characteristic individual patterns, that in several cases can make a subject recognizable [23].
A cardiac anomaly is expected to cause variations in the pattern of Gaussian coefficients when anomalous heart beats occur. This kind of study requires comparison among segments within a trace (for instance, l with m when l = m) to evidence local differences, allowing the separation of anomalies from normal beats.
Coefficient clustering is indeed feasible to highlight unusual conditions in any of the ECG traces we considered. Given the variety of subjects and cardiac impairments represented in the MIT-BIH Arrythmia Database, systematic evaluation would by necessity rely on some specific data analysis approach, its performance affecting results. To keep the paper focus on standardization, repeatability and comparability we avoid this, and prefer to discuss in detail three selected cases. Each refers to a different anomaly and we show that relevant clusters can be evidenced by simple and intuitive means, like parameter plots and histograms. This is illustrated by the graphs of Fig. 4, where both the amplitude estimateα i and the dispersion estimateŝ i are considered. The trace for Subject 100 is characterized by premature atrial contraction (PAC) events, occurring in 1.5% of the heart beats. Plots ofα i versus u i andŝ i versus u i in Fig. 4(a) show that most parameter values fall in the same areas for both normal (blue dots) and anomalous (red

FIGURE 5. Subject 212 -right bundle block beats (RBBB) occur in more of the 60% of the heart beats. a) amplitude-versus-delay (top) and dispersion-versus-delay (bottom) coefficient plots for normal (blue dots) and PAC (red dots) beats; b) amplitude-versus-delay plot of T-wave coefficients; the normalized distributions of amplitude values for T-wave coefficients are reported in the inset box; c) ECG trace (top) and amplitude of each T wave versus time position (bottom), the amplitude of a normal T wave (black) is lower than that of a T wave with RBBB beats (magenta).
dots) beats. Indeed, blue dots in the top plot are the same already shown in the plot at the top left corner of Fig. 3. A significant difference is found at around u = −0.2. This is magnified and better evidenced in Fig. 4(b), whereα i is plotted versus u for the ECG P-wave alone. This plot is obtained by simply restricting the range of position values to −0.22 ≤ u ≤ −0.1 s. The graph shows Gaussian parameters for normal beats are closely clustered around mean valueŝ α i ∼ = 0.09 mV andû i ∼ = −170 ms. Not surprisingly, the two histograms for the distance from reference of the P wave, shown in the inset, are well separated, with PAC events occurring earlier. They clearly suggest a threshold value, that could be employed to detect and locate PAC events along the trace as shown in Fig. 4(c).
A conceptually similar situation is presented in Fig. 5, that refers to a subject affected by right bundle branch block (RBBB), a form of irregular heart beat caused by asymmetry in the contraction of ventricles. In this subject (No. 212) RBBB occurs for two-thirds of the time, when the heart rate exceeds approximately 90 beats per minute.
The ECG T-wave is most affected in this case, and the graph in Fig. 5(b) shows that clustering in the range of position values 0.15 ≤ u ≤ 0.35 s can be effective. A bimodal histogram is found for amplitudes, clearly showing a suitable threshold value for detecting and locating RBBB events along the trace.
It should be remarked that comparability among traces is harder to achieve for the amplitude parameter. In this case, several factors can affect the calibration of the signal acquisition channel, same of which cannot be considered under total control. Whereas the calibration state of the electronic recording equipment may be assumed, other factors that can affect waveform measurement, such as the use of dry or wet electrodes, conducting gels, the repeatability of electrode positioning for a given assumed lead, are far less controllable.
Consequently, as far as amplitudes are concerned we consider advisable to restrict the analysis to relative values. The point is illustrated in Fig. 6, where Subjects 119 and 228 are compared. Both are affected by premature ventricular contraction (PVC) in between 20% and 30% of the recorded heart beats. It can be seen that ECG traces, as plotted in Figs. 6(c) and 6(f) are significantly different for the two subjects, although in both cases direct observation of the trace suggests that variations in the amplitude of the R-wave peak value are associated with PVC. Again, this is clearly evidenced by clustering of the appropriate parameter values, as shown in Figs. 6(b) and 6(e) and it should be noticed in this case that two different amplitude thresholds, respectively, 2 mV and 1 mV can be determined. To reduce the dependence on amplitude calibration it is therefore advisable to consider amplitude in relative terms.
When needed, accurate reconstruction is possible, since waveform morphology information is preserved. As an index of reconstructed waveform fidelity Percent Root mean square Difference (PRD) is defined as: where x l is the l-th cardiac cycle in a reference trace x(t) comprising L cycles andx l is its reconstructed estimate. It should be reminded that in general the differencex i − x i includes the contribution of ECG modelling uncertainty. As illustrated in Table 1 the PRD value in the MIT-BIH database is 2.3 ± 2.1%, highlighting very accurate reconstruction performance. 1.5 ± 3.4% of cardiac cycles has a PRD greater than 10%. This occurs especially on segments affected by artifacts, suggesting the good capacity of the dictionary elements to represent the morphology of ECG trace.

IV. CONCLUSION
Electrocardiograms are widely employed in medical practice, sophisticated signal analysis techniques can be applied to interpret them, and medical protocols have been defined to enhance the reproducibility of clinical tests. The approach proposed in this paper introduces a significantly different perspective, where metrological criteria are employed to create a common basis for the objective comparability of recorded ECG traces.
We showed that the dictionary approach we propose is an effective feature extraction scheme, enabling to highlight useful evidence about some kinds of cardiac anomaly. Generality is a built-in characteristic and allows to consider the extracted parameter sets not only as an essential support for extensive data-based medical analyses, but also as the starting point to implement effective machine learning approaches for ECG diagnosis.
Future work will be pointing to developments in the latter area, that shows good potential for innovative instrumentation.